
Data staging: Conflux

Project vision: to transform stored and real-time raw data into product-ready information for agriculture and land management applications

There is tremendous growth in the data we can access from many sources including satellites, machinery, networked in-field sensors, Internet of Things technologies, and mobile phone apps. These data are heterogeneous in their meaning, their format and the communication systems that carry them. Collection of the data is often distributed across sizeable areas, involves third parties and multi-tiered device capabilities, and carries an ongoing risk of hardware failures.

If these rich monitoring data are going to inform decisions, they must be captured, organised, and transformed using coherent and reliable methods, in ways that ensure they can be trusted. To reach the Digiscape goal of efficiently building a variety of agricultural and land management decision support tools, methods are needed to combine these monitoring data with model-based forecasts and other analyses, and then to provide the results to decision-makers.

We are building Conflux, a data staging service that is specifically designed for land sector applications. Conflux transforms raw data into “product-ready” information suitable for Digiscape’s services and applications, and allows stored or real-time sensor data to be combined with the predictive models required by each of the Digiscape applications. Conflux is being constructed using CSIRO’s existing Senaps technology.

Conflux will:

  • ensure that the knowledge and information derived from the staged data are trusted and traceable, through quality assurance and provenance services
  • guarantee the integrity and security of the data, and the privacy of their sources
  • embed existing predictive models (e.g. APSIM) into larger decision support workflows that depend on sensor data
  • give third-party services and applications easier access to richer data, and improve efficiency by re-using common building blocks across application domains (i.e. “develop once, use many times”)
  • unite disparate environmental information into common data models
  • be implemented to high standards of software quality.
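
As a rough illustration of what staging into a common data model with quality assurance and provenance could look like, here is a minimal Python sketch. It is not Conflux or Senaps code: the `Observation` class, the `stage` function, the field names, the quality labels and the simple range check are all assumptions made up for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Observation:
    """Hypothetical common data model for one staged sensor reading."""
    sensor_id: str
    quantity: str            # e.g. "soil_moisture"
    value: Optional[float]   # None when the raw record carried no value
    unit: str
    observed_at: datetime    # always stored as UTC
    quality: str             # "good", "suspect" or "missing" (assumed labels)
    provenance: tuple        # ordered description of the steps applied


def stage(raw: dict, valid_range: tuple) -> Observation:
    """Normalise a raw record, attach a quality flag and record provenance."""
    steps = [f"ingested from {raw['source']}"]

    value = raw.get("value")
    if value is None:
        quality = "missing"
    else:
        value = float(value)  # raw feeds often deliver numbers as strings
        low, high = valid_range
        quality = "good" if low <= value <= high else "suspect"
    steps.append(f"range check {valid_range} -> {quality}")

    return Observation(
        sensor_id=raw["sensor_id"],
        quantity=raw["quantity"],
        value=value,
        unit=raw["unit"],
        observed_at=datetime.fromtimestamp(raw["timestamp"], tz=timezone.utc),
        quality=quality,
        provenance=tuple(steps),
    )


# Example raw record from an imaginary in-field gateway.
reading = stage(
    {
        "source": "field-gateway-1",
        "sensor_id": "probe-7",
        "quantity": "soil_moisture",
        "value": "0.31",
        "unit": "m3/m3",
        "timestamp": 1700000000,
    },
    valid_range=(0.0, 0.6),
)
```

In this sketch every staged reading carries its own quality flag and a record of how it was produced, so a downstream tool could decide whether to trust it without going back to the raw feed.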

The science challenges / questions we’re addressing:

  • How can sensor networks convey more resilient information despite environmental effects (e.g. storms) and system limitations (e.g. transient access failure, battery life)?
  • How can the components of these data staging services be optimally deployed in a distributed manner across the sizeable sets of third parties and multi-tiered devices involved?
  • How can dynamic mechanisms be designed so that these components automatically adjust to changes in requirements or contexts (e.g. hardware failure, environment-induced faults)?

The project is led by Dr Thierry Rakotoarivelo.