Remotely Sensed Data

Remote sensing data is typically both spatially and temporally irregular: image footprints vary between repeat observations, and successful acquisitions do not occur at regular intervals.

The data is “clumped” both spatially and temporally and is therefore poorly suited to the monolithic array approaches traditionally employed, which would contain a large volume of no-data pixels.

The Datacube arranges the data spatially and temporally to allow efficient large-scale analysis.

The “Dice’N’Stack” method is used to subdivide the data into spatially regular, time-stamped, band-aggregated tiles, which can be traversed as a dense temporal stack.
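
To make the idea concrete, here is a minimal sketch of dicing and stacking in plain Python. It is an illustration only, not the Datacube implementation: the tile size, the pixel-grid coordinates, the function names and the sample scenes are all assumptions made for the example, and band aggregation is left out.

```python
from collections import defaultdict

NODATA = None  # marks pixels that a scene does not cover
TILE = 4       # hypothetical tile edge length, in pixels


def dice(pixels, x0, y0, tile=TILE):
    """Cut one scene (a list of rows) whose top-left pixel sits at grid
    position (x0, y0) into fixed-size tiles keyed by tile index."""
    tiles = {}
    for r, row in enumerate(pixels):
        for c, value in enumerate(row):
            gx, gy = x0 + c, y0 + r
            key = (gx // tile, gy // tile)
            if key not in tiles:
                # Start each tile fully empty; only covered pixels get filled.
                tiles[key] = [[NODATA] * tile for _ in range(tile)]
            tiles[key][gy % tile][gx % tile] = value
    return tiles


def dice_and_stack(scenes, tile=TILE):
    """scenes: iterable of (timestamp, x0, y0, pixels).
    Returns {tile_key: [(timestamp, tile_pixels), ...]} sorted by time."""
    stacks = defaultdict(list)
    for timestamp, x0, y0, pixels in scenes:
        for key, tile_pixels in dice(pixels, x0, y0, tile).items():
            stacks[key].append((timestamp, tile_pixels))
    for layers in stacks.values():
        layers.sort(key=lambda layer: layer[0])
    return dict(stacks)


# Two made-up scenes with different footprints and out-of-order dates.
scenes = [
    ("2001-02-01", 2, 0, [[1, 2, 3], [4, 5, 6]]),
    ("2001-01-16", 0, 0, [[7, 8], [9, 10]]),
]
stacks = dice_and_stack(scenes)
```

Each tile key now holds only the dates on which that tile was actually observed, so an analysis can walk one tile's time stack without touching large empty regions.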

‘Dice and Stack’

Observations calibrated to surface reflectance

Spatial alignment and consistent calibration make analysis much simpler

Every unique observation is kept and included for analysis, creating very dense time series

And here is a bit more detail on the architecture. I won’t go into too much detail here, but can elaborate later if needed. Importantly, the end users of this new infrastructure can develop their own applications, tools and software that sit on top of the basic data and computing platform.

A flexible architecture that supports an unlimited range of user applications, increasing and diverse datasets, local or cloud-based deployment, and automated ingestion of new datasets.

Open Source (Apache v2.0) software to allow free and open access, Application Programming Interface (API) access, future data and capability growth, and commercial opportunities.

When you look at how DataCubes are set up, there are many ways that private industry can also benefit from such platforms.

Vision of a Global Network of Interoperable DataCubes

Different national/regional/global applications;

BUT consistent standards, formats and interoperability.

It supports addressing global challenges – Sustainable Development Goals

This is a zoomed-in area showing how the final product looks. The different colours here indicate the different times when this particular river system has water. The water detection is based on decision trees and logistic regression.

• But the Data Cube paradigm has not only enabled us to undertake this analysis at a regional scale for the first time; it has also enabled us to scale it up to produce a comprehensive national product

• We can now analyse 15 years of data, across the country, representing every <n> square metres … in under 1 day.

• Not only does this mean we can now for the first time produce this analysis once …

• It means that we can produce it on an ongoing basis, enabling the production of updated information as new data becomes available, and enabling researchers to interact with the data iteratively to improve quality and try new ideas

• And it means that, in future, industry and others can build value-adding products on top of this data by leveraging the services-based approach adopted

• A water detection algorithm based on a decision tree classifier was used, together with a comparison methodology based on logistic regression. This approach provided an understanding of the confidence in the water observations. The results were used to map the presence of surface water across the entire continent from every observation in 27 years of satellite imagery.
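
As a rough illustration of how per-observation water flags could be summarised over a time stack, the sketch below computes, for each pixel, the fraction of clear observations that were flagged as water. This is an assumed summary chosen for the example, not the classifier or the confidence method described in the talk; the function name and the tiny input stack are made up.

```python
def water_frequency(stack):
    """stack: list of per-date 2D grids holding True (water), False (dry)
    or None (no clear observation). Returns a grid with the fraction of
    clear observations flagged as water, or None where never observed."""
    rows, cols = len(stack[0]), len(stack[0][0])
    freq = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            clear = [layer[r][c] for layer in stack if layer[r][c] is not None]
            if clear:
                # True counts as 1 and False as 0 when summed.
                freq[r][c] = sum(clear) / len(clear)
    return freq


# Three made-up dates over a 2 x 2 pixel tile.
stack = [
    [[True, False], [None, None]],
    [[True, None], [False, None]],
    [[False, False], [True, None]],
]
freq = water_frequency(stack)
```

Dividing by the number of clear observations rather than the number of dates keeps cloudy or missing acquisitions from being counted as dry.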

When the available 11-year time series is analyzed, this is the resulting product.

Limitations of Landsat: 30 m spatial resolution, limited signal-to-noise ratio (SNR)

Exploitation of the Australian Geospatial Data Cube (1987 – present)

But insufficient spectral resolution – limited to turbidity only

Figure 8: A shows the TSS time series for Lake Cargelligo (CA), the lake with the highest mean TSS (74 mg l-1). B shows the TSS time series for Lake Wallace (WA), the lake with the lowest mean TSS (6 mg l-1) in the study area. The black line represents the running mean of four values.
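
For reference, a running mean of four values can be sketched as follows. The trailing window alignment is an assumption (the caption does not say how the window is positioned), and the sample numbers are made up, not values from the study.

```python
def running_mean(values, window=4):
    """Trailing running mean: each output is the average of the current
    value and the (window - 1) values before it."""
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        sum(values[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Hypothetical TSS-like series, for illustration only.
sample = [10, 20, 30, 40, 50, 60]
smoothed = running_mean(sample)
```

The output is shorter than the input by three values, because the first full window only becomes available at the fourth observation.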