Knowledge-stack

March 20th, 2017

DRAFT

Figure 1 illustrates ongoing and past case studies by the CSIRO coastal modelling team (https://research.csiro.au/cem/). Each of these case studies is typically based on several years of research by a team of experts. The knowledge gained through this research is eventually encapsulated into reports and scientific papers, and also into numerical models which simulate physical and biogeochemical processes in these regions. Some of these models continue to run in real time beyond the lifetime of a particular project, producing large volumes of data stored on digital media. Despite the valuable information contained in these data, a number of issues undermine their utility to practice. For example, extracting knowledge from such data (even when they are freely available online) is not a trivial task. The delivery of the established knowledge to end-users is often poorly structured and inefficient. There is little to no interoperability across analogous digital products produced by other modelling teams in other areas and other subject domains. A successful business model for maintaining these simulations through time is not always available. Legal, financial and technical services are often missing or poorly implemented.

 

Fig 1. Locations of coastal modelling studies

 

Exponential growth of Internet of Things (IoT) devices in recent years has led to the development and proliferation of online platforms facilitating storage, processing and utilisation of real-time data streams (e.g. Amazon AWS IoT, Google Cloud IoT, Microsoft Azure IoT Suite, and IBM Watson IoT, to mention just a few of the big players). IoT web-platforms are typically geared towards embedded “smart” sensors, enabling online registration and two-way communication between the end-user and the sensing device(s) over a wireless network.

Analogous to “smart” sensors, numerical models can “track” the evolution of complex environmental systems by simulating these systems in real time. Unlike sensors, such models also have the capacity to predict the future behaviour of these environments. Managers, researchers, and students can all benefit from using these data. Despite their high potential value, however, the utility of these data to practice is often limited by the fragmented nature of the infrastructure underpinning their storage, processing, and delivery to end-users.

Establishing a scalable solution to accommodate and process real-time digital products from various online resources (including models and sensors) would greatly facilitate the uptake of knowledge from these data and enhance our capacity to understand and predict complex environmental systems.

The rest of this document outlines an incomplete and very preliminary vision of such a scalable solution, to be updated later through discussions at the CooP meeting. The key purpose of presenting it here is to provide an inception point for these subsequent discussions.

 

Knowledge-stack

A preliminary vision of the Digital Coast is a network of self-sustaining web-platforms. Each platform hosts a particular set of data and is operated by a community of experts. Each platform is structured into a number of layers, each layer representing a particular stage on the way from raw data towards the knowledge associated with these data. The first four layers of this structure (data layer, processing layer, packaging & delivery layer, and integration/networking layer) comprise a knowledge-stack (k-stack). An individual web-platform is an instance of the knowledge-stack.

 

  1. The data layer handles acquisition and management of the raw data produced by models and sensors.
  2. The processing layer builds on top of the data layer and handles extraction of knowledge from this data.
  3. The third layer (packaging & delivery layer) deals with packaging and delivery of the knowledge established in the previous layer. This knowledge can be expressed as a collection of articles, reports, web-pages etc., or it can be integrated into a structured educational course about specific subjects related to a particular set of data. For example, water-quality experts could establish an online course about biogeochemical processes in the GBR region; students taking this course would be issued a certificate acknowledging its successful completion.
  4. The integration/networking layer generalises local data and knowledge produced by a particular community across space, time and disciplines. This generalisation is achieved by enabling a hierarchical networked structure to be built out of individual knowledge-stacks. According to this vision, an expert in a particular discipline must be able to automatically fire up a new knowledge-stack (via the registrar) and then populate it with data by connecting to established stacks (each considered a new source of data).
  5. Finally, the registrar handles registration, instantiation and monitoring of new instances of knowledge-stacks.
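The layered structure above can be sketched in code. The following is a minimal, purely illustrative Python sketch: all class and method names (KnowledgeStack, Registrar, fire_up, etc.) are hypothetical, invented here to make the layering concrete, and the "processing" step is reduced to a trivial summary statistic.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the four k-stack layers plus the registrar.
# Names and behaviour are illustrative only, not an actual API.

@dataclass
class DataLayer:
    """Layer 1: acquisition and management of raw model/sensor data."""
    records: list = field(default_factory=list)

    def ingest(self, record):
        self.records.append(record)

@dataclass
class ProcessingLayer:
    """Layer 2: extraction of knowledge (here: a simple summary)."""
    def summarise(self, records):
        values = [r["value"] for r in records]
        return {"count": len(values), "mean": sum(values) / len(values)}

@dataclass
class PackagingLayer:
    """Layer 3: packaging knowledge as reports, pages, or course material."""
    def package(self, summary):
        return f"Report: {summary['count']} observations, mean={summary['mean']:.2f}"

@dataclass
class KnowledgeStack:
    name: str
    data: DataLayer = field(default_factory=DataLayer)
    processing: ProcessingLayer = field(default_factory=ProcessingLayer)
    packaging: PackagingLayer = field(default_factory=PackagingLayer)
    peers: list = field(default_factory=list)  # layer 4: integration/networking

    def connect(self, other):
        """Use an established stack as a new source of data."""
        self.peers.append(other)

class Registrar:
    """Registration, instantiation and monitoring of k-stack instances."""
    def __init__(self):
        self.instances = {}

    def fire_up(self, name):
        stack = KnowledgeStack(name)
        self.instances[name] = stack
        return stack

registrar = Registrar()
gbr = registrar.fire_up("GBR-biogeochemistry")
gbr.data.ingest({"value": 3.0})
gbr.data.ingest({"value": 5.0})
summary = gbr.processing.summarise(gbr.data.records)
print(gbr.packaging.package(summary))  # Report: 2 observations, mean=4.00
```

In this sketch a new community would call fire_up once, then populate the data layer from its own models or from peer stacks via connect, mirroring the registry-driven workflow described above.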

 

A number of features differentiate our platform from other established platforms:

  1. Distributed nature. Unlike many established IoT platforms, each instance of the k-stack is hosted and maintained (and hence owned) by the respective modelling team. This distributed architecture makes the platform easy to scale up towards very large and heterogeneous data.
  2. Educational infrastructure. Processing of large volumes of data (“Big Data”) is nowadays typically carried out locally, by moving compute resources to the data rather than moving data across the network. The knowledge established through such processing is then disseminated to end-users, often in an ad-hoc, unstructured way. To facilitate uptake of this knowledge, our web-platform offers an infrastructure for online education embedded into every individual instance of the knowledge-stack. According to this vision, every modelling team is considered an educational facility (a small “school” or “university”) where experts teach students about particular modelling products. Both experts and students have direct access to these products and can conduct experiments with real data when needed.
  3. Networking capacity. Every individual “school” teaches local knowledge, in the sense that this knowledge is based on a particular set of data from a particular model (or sensors). These isolated blocks of knowledge are integrated across space, time, and disciplines via web APIs.
  4. Payment interfaces. To facilitate sustainable production and delivery of knowledge, optional online payment interfaces are attached to every individual k-stack. Through these interfaces, digital products created at every layer of the knowledge-stack are delivered to end-users either for free or for a fee.
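The networking capacity in particular can be made concrete with a small sketch. In the illustration below, each k-stack instance is assumed to expose its local knowledge as JSON over a web API; the endpoint URLs, the /api/summary path, and the payload fields are all invented for illustration, and the HTTP call is stubbed out with canned responses so the sketch runs standalone.

```python
import json

# Hypothetical illustration of integrating isolated blocks of knowledge
# from several k-stack "schools" via web APIs. Endpoints and payload
# shapes are assumptions, not a real interface.

def fetch_summary(endpoint, canned_responses):
    """Stand-in for an HTTP GET; a real client would use urllib or requests."""
    return json.loads(canned_responses[endpoint])

def integrate(endpoints, canned_responses):
    """Merge per-site summaries into a single cross-site view."""
    merged = {}
    for url in endpoints:
        summary = fetch_summary(url, canned_responses)
        merged[summary["site"]] = summary["mean_chlorophyll"]
    return merged

# Simulated JSON responses from two k-stack instances.
responses = {
    "https://gbr.example/api/summary": json.dumps(
        {"site": "GBR", "mean_chlorophyll": 0.45}),
    "https://derwent.example/api/summary": json.dumps(
        {"site": "Derwent", "mean_chlorophyll": 0.80}),
}

view = integrate(list(responses), responses)
print(view)  # {'GBR': 0.45, 'Derwent': 0.8}
```

The point of the sketch is the shape of the interaction: each stack publishes a machine-readable summary at a well-known endpoint, and any consumer (another stack, a dashboard, a course) can aggregate across sites without access to the underlying raw data.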

 

 

Team

The team is structured around the layers comprising the knowledge-stack and the registrar. The list below is a preliminary and very tentative list of team members (to be confirmed).

 

To be filled in later

 

Customers/users

  1. Modellers.
  2. Students.
  3. Managers.
  4. Researchers.
  5. General public.

 

January 2017

Nugzar Margvelashvili