Causal inference in complex multiscale systems

October 14th, 2022

The Causal inference and prediction in high dimensional multi-scale systems project seeks to identify robust relationships between climate and socio-economic impacts.

Turning petabytes into predictions

There are petabytes of short- and long-term weather, economic and geo-political data generated globally. Imagine if we could combine that data to predict socio-economic impacts of climate change.

For example, one of the driving forces behind the Arab Spring was high food costs, caused by a severe drought that hit wheat production.

The difficulty lies in sifting through all the noisy data to identify features that evolve in time and are causally linked to the climate change signal. These changes can unfold surprisingly fast, so the AI must not only recognise the changing features but also identify the factor, or combination of factors, causing them.

It’s a massive undertaking requiring expertise from across CSIRO.

A single climate model projection may require weeks to run and analyse, but with this AI platform users can specify many bespoke CO2 emissions pathways, testing hundreds of models in minutes.

The goal is to produce robust estimates of climate-induced risks to the social and economic structures that underpin our nation’s security and enhance our ability to successfully navigate oncoming impacts.

Dr Terry O’Kane: Project leader – Causal inference in complex multiscale systems

Aims and impacts

We are now in a period of history that is known as the Anthropocene. It is the geological epoch dating from the commencement of significant human impact on Earth’s geology and ecosystems, and one increasingly associated with enhanced climate variability and change due to CO2 emissions.

Climate risks are the impacts that changes in the frequency and intensity of extreme events (e.g., heat waves, drought and tropical cyclones) have on our socio-economic, health and geo-political structures. These extreme events are under the combined influence of larger-scale natural variations in radiative forcing and the human-induced global warming trend. Anthropogenic change increases vulnerability by rapidly modifying environmental factors that multiply the impact of threats from disease and economic shocks. This project aspires to deliver a world-first ML software platform for the effective estimation of climate risk. The platform will combine early detection and prediction of systematic changes to the physical system with socio-economic, eHealth and financial databases to deliver impactful climate services.

Damages

The banking and financial sectors have an ever-increasing need to assess chronic (economy-wide) and acute (local hazard) climate risks for a wide range of economic scenarios. For example, the Network for Greening the Financial System (NGFS, https://www.ngfs.net/en) is a group of central banks undertaking an international effort to promote the assessment of climate risk in the global banking and financial sectors. Integrated assessment models (IAMs) are the primary tool used to evaluate the technological and economic feasibility of climate goals such as the Paris Agreement’s long-term temperature goal to hold global warming well below 2°C. Of particular concern is that the NGFS emissions pathways, which are inferred from the economic activity simulated in the current generation of IAMs, are not the emissions used to force climate model projections of future climatic states. The modelled climate change response is therefore inconsistent with the associated economic scenarios. This matters because it is the climate response that dictates the climate-induced damages to socio-economic development and human health. This project will deliver the capability to produce climate data for user-defined emissions scenarios, making it possible for the first time to produce consistent climate and economic projections for bespoke future emissions pathways.
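To make the idea of a climate response tied to a user-defined emissions pathway concrete, here is a deliberately simple sketch. It is not the project's platform: it swaps in the well-known near-linear relation between cumulative CO2 emissions and warming (the transient climate response to cumulative emissions, TCRE), and the value of 0.45 °C per 1000 GtCO2 is an illustrative assumption close to the IPCC AR6 best estimate. The pathway names and numbers are hypothetical.

```python
import numpy as np

# Assumed illustrative TCRE value (degrees C per 1000 GtCO2); not a project parameter.
TCRE_C_PER_1000_GT = 0.45

def warming_from_pathway(annual_emissions_gt):
    """Additional warming (degrees C) implied by a pathway of annual emissions in GtCO2."""
    cumulative_gt = np.cumsum(np.asarray(annual_emissions_gt, dtype=float))
    return TCRE_C_PER_1000_GT * cumulative_gt / 1000.0

# Two hypothetical bespoke pathways over 2025-2050 (26 years).
years = np.arange(2025, 2051)
pathways = {
    "flat": np.full(years.size, 40.0),                 # constant 40 GtCO2 per year
    "linear_decline": np.linspace(40.0, 0.0, years.size),  # linear ramp to zero
}

# Additional warming reached by the final year of each pathway.
added_warming = {name: warming_from_pathway(e)[-1] for name, e in pathways.items()}
for name, delta in added_warming.items():
    print(f"{name}: {delta:.3f} degrees C of additional warming by {years[-1]}")
```

The point of the sketch is only that the same emissions series drives both the economic scenario and the climate response, which is the consistency the paragraph above calls for. A real emulator would also need to represent variability and extremes, not just the mean warming.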

Climate prediction

Under the auspices of the World Meteorological Organization, forecasts of the near-term climate (1-5 years lead time) are now routinely produced by several international operational centres. CSIRO is a leader in this area, having generated a massive data resource of the climate over the past six decades. It comprises close to one hundred replicate “digital” Earths [9,10], generated through assimilation of a comprehensive set of atmospheric, ocean and sea-ice observations into a state-of-the-art global climate model (https://registry.opendata.aws/csiro-cafe60/). Importantly, prediction of near-term changes in the climate is now recognised as being of equal importance to weather prediction and as a primary tool for infrastructure planning.

Climate impacts

Climate has known impacts on agricultural and food production [13], commodity prices [12], the frequency of disease outbreaks and the potential for conflict. For example, the political upheavals of the so-called “Arab Spring” are now recognised to have been exacerbated in part by a prolonged drought in Ukraine that raised global wheat prices, pushing up bread prices across the Middle East. The recent COVID-19 pandemic caused massive disruption to global supply chains and to the energy and transport sectors. The associated severe downturn in production also resulted in decreased CO2 and aerosol emissions, reminding us of just how sensitive our interconnected world is.

The challenge

Estimating the risk of societal impacts from systematic changes in the Earth system, which arise from the combined influences of internal climate variability and anthropogenic change, begins with quantifying our ability to model the changing dynamics of the physical and socio-economic worlds. Coordinated international efforts have produced enormous volumes of data spanning spatial and temporal scales, from weather prediction over a few days to projections of future climate decades from now. However, even identifying climatic regimes within these vast data sets, and attributing the factors that drive transitions between periods of drought or increased rain associated with persistent synoptic patterns, represents a huge challenge. Some of the opportunities and challenges in this space are illustrated in figure 2. Extending the problem to infer and predict impacts, feedbacks and risk multipliers in the wider socio-economic sphere is of the utmost importance. However, current approaches to characterising climate risk are largely empirical, relying on simple relationships between only a very few variables established by correlation.
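Why correlation alone is a weak basis for characterising risk can be shown with a small synthetic example. In the sketch below, two hypothetical impact series are both driven by a common factor and never influence each other, yet they correlate strongly. The correlation vanishes once the common driver is regressed out. All series and coefficients are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Hypothetical common driver (think of a large-scale climate index).
driver = rng.standard_normal(n)
# Two hypothetical impact series that each respond to the driver, not to each other.
series_a = 0.9 * driver + 0.5 * rng.standard_normal(n)
series_b = 0.9 * driver + 0.5 * rng.standard_normal(n)

def residual(values, regressor):
    """Part of `values` left over after a least-squares fit on `regressor` plus an intercept."""
    design = np.column_stack([np.ones_like(regressor), regressor])
    coef, *_ = np.linalg.lstsq(design, values, rcond=None)
    return values - design @ coef

# Plain correlation suggests a strong link between the two series.
raw_corr = np.corrcoef(series_a, series_b)[0, 1]
# Partial correlation, conditioning on the driver, removes the shared influence.
partial_corr = np.corrcoef(residual(series_a, driver), residual(series_b, driver))[0, 1]

print(f"raw correlation: {raw_corr:.2f}")
print(f"partial correlation given the driver: {partial_corr:.2f}")
```

In real data the common driver is rarely known in advance, which is part of what makes causal inference in high-dimensional systems hard.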

A new approach

This project brings together experts from CSIRO Data61, Oceans & Atmosphere, Health & Biosecurity and the MLAI FSP to deliver to the Anti-Microbial Resistance (AMR), Infectious Disease Resilience (IDR) and Climate Resilient Enterprises (CRE) missions via Artificial Intelligence for Missions (AI4M). The approach uniquely combines new machine learning methods with applied mathematics approaches from dynamical systems theory to formulate generic methods of causal inference for problems that are not currently tractable using standard machine learning methods. Such problems arise where the available data, although huge in volume and diverse in type, is insufficient for training ML models. In this category, we tackle the problem of combining key socio-economic and eHealth data archives with observed and simulated data of the physical climate system to detect and attribute causal relationships between climate variability and change and disease occurrence (pandemics), finance and economics, and conflict.

Clim-RISK

The question here is how to operationalise climate data so that it is relevant across the financial services, insurance, agricultural, mining, health and government planning sectors, delivering positive impacts to the Australian economy and society. This requires an integrated platform to approximate climate variability for multiple scenarios at relatively low computational cost. Delivering improved models of the expected climate response under a range of emission pathways will further enhance our understanding of risk associated with changes in the physical processes underpinning future climate variability.

A major focus is on delivery to the Climate Resilience Enterprise (CRE). The CRE will transform the way in which CSIRO solves complex, multidimensional climate and risk problems. Mission leader Juliet Bell has noted: “Our success and position in the market will be dependent on the AI questions posed by Terry and colleagues in the application above. … The Australian private sector – particularly banks, also have an interest in this work and there is a pipeline of private sector funding for the kind of AI activity being proposing (e.g., understanding how to couple natural perils with climate models etc).”

Dr Vassili Kitsios: WP2 Stream leader – Climate Resilience Enterprise

Disease risk in a warming world

In order to predict risks associated with the observed expansion of the tropics, we will be working with the team of Teresa Wozniak and the Infectious Disease Resilience (IDR) and Anti-Microbial Resistance (AMR) missions. Specifically, the work will identify the climate variables and indices most likely to be relevant for assessing the impact of climatic processes on soilborne bacteria and infectious disease outbreaks. This work spans a huge range of spatial scales (national, state, ABS statistical area levels, etc.) and temporal scales (daily, monthly, seasonal, annual) to determine relationships between specific climatic events and both emerging antimicrobial resistance and infectious disease outbreaks.

Methods

Novel methodologies for feature selection, causal inference and prediction, combining optimization methods with Bayesian machine learning in computationally efficient frameworks, have been developed and tested on climate data. Some of the underpinning work, including advances in variational network inference and deep convolutional Gaussian processes, has been peer reviewed and published in leading journals of machine learning [1,4,7,8,11], earth system modelling [2,5,6] and climate science [9,10]. Much of this work is at TRL3, i.e., critical function and proof of concept have been established. Methods currently being deployed include:

  • (TRL3) FEM-BV-VARX: Finite element bounded variation vector autoregressive with external factors [3-6, 9, 10]
  • (TRL3) eSPA+: entropy-optimal scalable probabilistic approximation [3,4,11]
  • (TRL3) DBNs: Dynamic Bayesian Networks [2]
  • (TRL2) NODEs: Neural ordinary differential equations [8]
  • (TRL1) Free energy Gaussian state space models
  • (TRL2) Variational network inference [1]
  • (TRL2) Deep convolutional Gaussian processes [7]
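Several of the methods above build on vector autoregressive models and Granger causality (see [2]). As a rough illustration of that underlying idea only, the sketch below runs a textbook Granger test with ordinary least squares on synthetic data. It is a stationary simplification, not FEM-BV-VARX, the DBN approach or any other method listed, and the series names `climate_index` and `price` are hypothetical.

```python
import numpy as np

def lagged_design(series_list, lags):
    """Intercept plus lags 1..lags of every series, one row per target time step."""
    n = len(series_list[0])
    columns = [np.ones(n - lags)]
    for series in series_list:
        for k in range(1, lags + 1):
            columns.append(series[lags - k : n - k])  # value at time t - k
    return np.column_stack(columns)

def residual_sum_of_squares(design, target):
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return float(np.sum((target - design @ coef) ** 2))

def granger_f_stat(cause, effect, lags=2):
    """F statistic testing whether lags of `cause` improve prediction of `effect`."""
    target = effect[lags:]
    restricted = lagged_design([effect], lags)      # effect's own history only
    full = lagged_design([effect, cause], lags)     # plus the candidate cause
    rss_restricted = residual_sum_of_squares(restricted, target)
    rss_full = residual_sum_of_squares(full, target)
    dof = len(target) - full.shape[1]
    return ((rss_restricted - rss_full) / lags) / (rss_full / dof)

# Synthetic data: the climate index drives the price with a one-step delay.
rng = np.random.default_rng(42)
n = 500
climate_index = np.zeros(n)
price = np.zeros(n)
for t in range(1, n):
    climate_index[t] = 0.7 * climate_index[t - 1] + rng.standard_normal()
    price[t] = 0.5 * price[t - 1] + 0.8 * climate_index[t - 1] + rng.standard_normal()

f_forward = granger_f_stat(climate_index, price)   # expected to be large
f_reverse = granger_f_stat(price, climate_index)   # expected to be small
print(f"climate -> price F = {f_forward:.1f}, price -> climate F = {f_reverse:.1f}")
```

A large F statistic in one direction and a small one in the other is consistent with the climate index helping to predict the price but not the reverse. Real climate and socio-economic data are non-stationary and high-dimensional, which is the gap the listed methods aim to address.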

Team structure

Project leaders: Terry O’Kane (O&A) & Edwin Bonilla Pabon (Data61)

Stream leaders: Vassili Kitsios (O&A), Teresa Wozniak (H&B)

Team members: David Newth (O&A), Max Ott (Data61), Tim Erwin (ESS), Mark Collier (O&A), He Zhou (Data61), Dan Chian (Microsoft)

Collaborators: Dylan Harries (SAHMRI), Xuhui Fan (U. Newcastle), Scott Sisson (UNSW), Illia Horenko (Technical University of Kaiserslautern), Theo Damoulas (U. Warwick), Harita Dellaporta (U. Warwick), Jana De Wiljes (University of Potsdam), Courtney Quinn (UTAS), Virginia Aglietti (DeepMind)

References

[1] Dezfouli, A., Bonilla, E. V., & Nock, R. (2018). Variational Network Inference: Strong and Stable with Concrete Support. Proceedings of the 35th International Conference on Machine Learning, in Proceedings of Machine Learning Research.
[2] Harries, D., & O’Kane, T. J. (2021). Dynamic Bayesian networks for evaluation of Granger causal relationships in climate reanalyses. Journal of Advances in Modeling Earth Systems, 13, e2020MS002442. https://doi.org/10.1029/2020MS002442
[3] I. Horenko, D. Rodrigues, T.J. O’Kane and K. Everschor-Sitte (2021) Scalable detection of latent relations and their applications to magnetic imaging, Commun. Appl. Math. Comput. Sci., 16(2), 267–297, https://doi.org/10.2140/camcos.2021.16.267
[4] I. Horenko (2020) On a scalable entropic breaching of the overfitting barrier for small data problems in machine learning, Neural Computation 32(8), 1563-1579
[5] T.J. O’Kane, J.S. Risbey, C. Franzke, I. Horenko & D. Monselesan (2013) Changes in the meta-stability of the mid-latitude Southern Hemisphere circulation and the utility of non-stationary cluster analysis and split flow blocking indices as diagnostic tools, J. Atmos. Sci., 70(3), 824-842
[6] C. Quinn, T.J. O’Kane & D. Harries (2021) Systematic calculation of finite-time mixed singular vectors and characterization of error growth for persistent coherent atmospheric disturbances over Eurasia (submitted CHAOS)
[7] Tran, G., Bonilla, E.V., Cunningham, J., Michiardi, P. & Filippone, M. (2019). Calibrating Deep Convolutional Gaussian Processes. Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research.
[8] Zhi W., Lai T., Ott L., Bonilla E. V., & Ramos F. (2021). Learning ODEs via Diffeomorphisms for Fast and Robust Integration. arXiv:2107.01650.
[9] T. J. O’Kane, P. A. Sandery, V. Kitsios, P. Sakov, M. A. Chamberlain, M. A. Collier, R. Fiedler, T. S. Moore, C. C. Chapman, B. M. Sloyan, and R. J. Matear (2021) CAFE60v1: A 60-year large ensemble climate reanalysis. Part I: System design, model configuration and data assimilation. J. Climate 34, 5153–5169, DOI: https://doi.org/10.1175/JCLI-D-20-0974.1
[10] T. J. O’Kane, P. A. Sandery, V. Kitsios, P. Sakov, M. A. Chamberlain, D. T. Squire, M. A. Collier, C. C. Chapman, R. Fiedler, D. Harries, T. S. Moore, D. Richardson, J. S. Risbey, B. J. E. Schroeter, S. Schroeter, B. M. Sloyan, C. Tozer, I. G. Watterson, A. Black, C. Quinn, and R. J. Matear (2021) CAFE60v1: A 60-year large ensemble climate reanalysis. Part II: Evaluation. J. Climate 34, 5171–5194, DOI: https://doi.org/10.1175/JCLI-D-20-0518.1
[11] E. Vecchi, L. Pospisil, S. Albrecht, T.J. O’Kane, I. Horenko (2022) eSPA+: Scalable entropy-optimal machine learning classification for small data problems, Neural Computation 34(5), 1220–1255.
[12] Kitsios, V., De Mello, L., and Matear, R. (2022). Forecasting commodity returns by exploiting climate model forecasts of the El Niño Southern Oscillation. Environmental Data Science, Vol. 1, e7, pp 1-16. https://doi.org/10.1017/eds.2022.6
[13] Cai, Y., Bandara, J. S., and Newth, D. (2016). A framework for integrated assessment of food production economics in South Asia under climate change. Environmental Modelling & Software, 75:459–497.