Data staging

Extracting the maximum information from sensor networks

What is the data ‘problem’?

There is tremendous growth in the data we can access from many sources including satellites, machinery, networked in-field sensors, Internet of Things technologies, and mobile phone apps. These data are heterogeneous in their meaning, their format and the communication systems that carry them. Collection of the data is often distributed across sizeable areas, involves third parties and multi-tiered device capabilities, and carries an ongoing risk of hardware failures.

What science challenges / questions are we addressing?

This research is helping tackle the following science challenges:

How can sensor networks convey more resilient information despite environmental effects (e.g. storms) and system limitations (e.g. transient access failure, battery life)?
How can the components of these data staging services be optimally deployed, in a distributed manner, across the sizeable sets of third parties and multi-tiered devices involved?
How can dynamic mechanisms be designed so that these components automatically adjust to changes in requirements or contexts (e.g. hardware failure, environment-induced faults)?

So far we’ve produced several novel methods and algorithms, which together extend the limits of what wireless networked sensors can achieve while collecting and transmitting field data in land sector applications.

Figure: Graphs showing sensor data

Take for example a scenario where sensors have fixed battery life. One of our novel algorithms computes the best coding rate for transmitting data to minimise the overall energy usage by the entire set of sensors. This in turn allows the entire set of sensors to function for a longer period of time before going flat.
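As a rough illustration of the idea, and not the project's published algorithm, the sketch below picks coding rates for two correlated sensors so that the total transmission energy is as small as possible. It assumes the classical Slepian-Wolf rate region for lossless distributed compression and a hypothetical linear energy cost per transmitted bit; the function names, the joint distribution and the cost values are all invented for the example.

```python
from math import log2


def entropy(probs):
    """Shannon entropy in bits of a list of probabilities."""
    return -sum(p * log2(p) for p in probs if p > 0)


def min_energy_rates(joint, cost1, cost2):
    """Pick (rate1, rate2) in bits per sample for two correlated sensors.

    joint[i][j] is the assumed probability that sensor 1 reads i and
    sensor 2 reads j. cost1 and cost2 are hypothetical positive energy
    costs per transmitted bit. With a linear cost, the cheapest point of
    the Slepian-Wolf region is one of its two corner points, so only
    those two candidates need to be compared.
    """
    p1 = [sum(row) for row in joint]        # marginal of sensor 1
    p2 = [sum(col) for col in zip(*joint)]  # marginal of sensor 2
    h1, h2 = entropy(p1), entropy(p2)
    h12 = entropy([p for row in joint for p in row])  # joint entropy
    corners = [
        (h1, h12 - h1),  # sensor 1 sends in full, sensor 2 sends the remainder
        (h12 - h2, h2),  # sensor 2 sends in full, sensor 1 sends the remainder
    ]
    return min(corners, key=lambda r: cost1 * r[0] + cost2 * r[1])


# Two sensors whose binary readings agree 90% of the time;
# sensor 2 is assumed to spend three times more energy per bit.
joint = [[0.45, 0.05],
         [0.05, 0.45]]
rate1, rate2 = min_energy_rates(joint, cost1=1.0, cost2=3.0)
print(round(rate1, 3), round(rate2, 3))
```

In this toy setting the sensor that is cheaper to run carries most of the load, while the more expensive sensor only sends what cannot be inferred from its neighbour's data.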

Another of our novel methods finds the best coding rates for all sensors to achieve the same knowledge about their environment with minimal transmissions, again allowing all sensors to last longer in the field. In addition, this state of shared common knowledge between sensors allows them to dynamically adapt their behaviour. For example, when measuring chemicals in an estuary, downstream sensors can increase their sampling rate if they know that upstream sensors have detected large fluctuations in their readings, and so capture more information about a potential ongoing input event in the estuary.
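The adaptive behaviour in the estuary example can be sketched in a few lines. This is a minimal illustration only: the function name, the sampling intervals and the fluctuation threshold are hypothetical, and the standard deviation of recent upstream readings is just one simple way a downstream sensor might judge that readings are fluctuating strongly.

```python
from statistics import pstdev


def choose_sampling_interval(upstream_readings,
                             base_interval_s=600,
                             fast_interval_s=60,
                             fluctuation_threshold=0.5):
    """Return the sampling interval (in seconds) for a downstream sensor.

    upstream_readings holds the recent values shared by upstream sensors.
    All default values are illustrative assumptions, not project settings.
    """
    if len(upstream_readings) < 2:
        # Not enough shared knowledge yet: keep the energy-saving default.
        return base_interval_s
    if pstdev(upstream_readings) > fluctuation_threshold:
        # Strong upstream fluctuation: sample faster to capture the event.
        return fast_interval_s
    return base_interval_s


calm = choose_sampling_interval([7.0, 7.1, 6.9, 7.0])
event = choose_sampling_interval([7.0, 9.5, 6.2, 10.1])
print(calm, event)
```

Sampling faster only while an event appears to be under way keeps the extra energy cost limited to the periods when the additional data are most informative.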

The research contributions of this project have been published in highly ranked, peer-reviewed scientific conferences and journals. We have successfully completed some initial trials of one of these algorithms on real-world canopy temperature measurements from tomato fields, and are looking for further applications within the land sector.

Where can I find out more about data staging?

Read the publications:

Ding N, Sadeghi P, Smith D, Rakotoarivelo T (2018) Distributed data compression in sensor clusters: a maximum independent flow approach. pp. 2221-2225 in ‘IEEE International Symposium on Information Theory (ISIT 2018)’, Vail, Colorado, 17-22 June 2018. doi:10.1109/ISIT.2018.8437754

Ding N, Smith D, Sadeghi P, Rakotoarivelo T (2018) Fairness in multiterminal data compression: a splitting method for the egalitarian solution. pp. 6383-6387 in ‘IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2018)’, Calgary, 15 April 2018

Ding N, Smith D, Sadeghi P, Rakotoarivelo T (2018) Fairness in multiterminal data compression: decomposition of Shapley value. pp. 886-890 in ‘IEEE International Symposium on Information Theory (ISIT 2018)’, Vail, Colorado, 17-22 June 2018. doi:10.1109/ISIT.2018.8437475

Ding N, Sadeghi P, Rakotoarivelo T (2018) Improving computational efficiency of communication for omniscience and successive omniscience. ‘Allerton Conference on Communication, Control, and Computing’, Monticello, Illinois, 2-5 October 2018

Ding N, Sadeghi P (2019) A submodularity-based clustering algorithm for the information bottleneck and privacy funnel. pp. 1-5 in ‘IEEE Information Theory Workshop (ITW 2019)’, Visby, Sweden, 2019. doi:10.1109/ITW44776.2019.8989355

Sawyer N, Naderi Soorki M, Saad W, Smith D, Ding N (2019) Evolutionary games for correlation-aware clustering in massive machine-to-machine networks. pp. 6527-6543 in ‘IEEE Transactions on Communications’, 67, 9. doi:10.1109/TCOMM.2019.2917437

Liu Y, Ding N, Sadeghi P, Rakotoarivelo T (2020) Privacy-utility tradeoff in a guessing framework inspired by index coding. ‘IEEE International Symposium on Information Theory (ISIT 2020)’, Los Angeles, California, 21-26 June 2020

Contact the project leads:

Chris Sharman

Primary Email: chris.sharman@csiro.au

Dr Thierry Rakotoarivelo

Thierry's research focuses on data privacy and information security, and their application to federated systems. Thierry led Digiscape's data staging research project.