PhD Project: Statistical analysis and visualisation of spatially distributed big time series electricity usage data
Electricity smart meters are increasingly being deployed in residential and commercial buildings. The technology allows energy usage data to be collected at much finer temporal scales than was previously possible. Smart meters record energy usage at half-hourly intervals, resulting in over 35 billion half-hourly observations per year across all households in the state. Combining the usage data with the geographical locations of the households also makes it possible to identify spatial and spatio-temporal patterns. This project will research spatial and spatio-temporal visualisation and inferential methods for cognostics of spatially distributed big time series electricity usage data. The research activity has three main aims:
- Develop cognostics methods for electricity usage time series data with spatial and spatio-temporal structure. The aim is to test existing metrics for time series data (e.g. Fulcher, Little, and Jones, 2013) to ascertain whether they provide useful diagnostic statistics once the spatial dimension of the data is accounted for. If they do not, new metrics that provide spatially useful diagnostic statistics will be proposed. In either case, the methods will then be extended to provide useful spatio-temporal diagnostic statistics. The methods will need to account for spatially distributed explanatory variables such as demographics, building size and material, and household behavioural patterns. The methods developed will rely on parallel processing using multiple multi-core computers and platforms such as Hadoop, Spark or Tessera.
- Develop visualisation methods for spatial and spatio-temporal cognostics for spatially distributed big time series. The goal is to provide visualisation methods that support the analysis: finding anomalies (possible errors in the data or unusual usage), exploring patterns such as clusters of behaviour, and summarising that behaviour (e.g. Javed et al., 2010). Cognostics provide efficient numerical summaries, which are better understood alongside plots of the time series in the spatial context where the data arise. The new challenge for visualisation is handling the volume of data supplied by smart meters, especially when providing interactive graphics (e.g. Cheng et al., 2016). In addition, visual explanations help decision makers to understand patterns and changes in patterns. There is a need for visualisation tools that can describe associations between different variables, and clustering by spatial location, demography, behaviour or building type, all of which can change over time (e.g. Wickham et al., 2012). New methods are needed to visualise changing spatio-temporal patterns in an intuitive and interactive way.
- Develop inferential methods for spatially distributed large time series electricity usage data. It is important to determine whether any identified clusters or patterns are statistically meaningful. That is, what confidence do we have that the identified patterns actually exist and are not a random artefact? The electricity usage data will likely exhibit complex multi-seasonality, serial correlation and spatial correlation, so comparisons against permuted observations that are assumed to be independent will not be appropriate. The permutations used for comparison will therefore need to respect the multi-seasonality, building on the seasonal block bootstrap (Dudek et al., 2014), as well as the serial correlation (Kreiss and Lahiri, 2012) and the spatial correlation (García-Soidán, Menezes and Rubinos, 2014).
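
To make the first aim concrete, the sketch below computes one simple cognostic per meter (the autocorrelation at a lag of one day, i.e. 48 half-hours) and then asks whether that cognostic is spatially clustered, using Moran's I with a distance-band weight matrix. Everything here is an illustrative assumption rather than a method chosen by the project: the data are synthetic, the function names `daily_autocorrelation` and `morans_i` are hypothetical, and the lag-48 autocorrelation is only a stand-in for the much richer metric sets in the cited literature.

```python
import numpy as np


def daily_autocorrelation(series, lag=48):
    """Sample autocorrelation at `lag` (48 half-hours = one day)."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    if denom == 0.0:
        return 0.0  # constant series: no measurable seasonality
    return float(np.dot(x[:-lag], x[lag:]) / denom)


def morans_i(values, coords, bandwidth):
    """Moran's I with binary weights: 1 if two sites lie within `bandwidth`."""
    z = np.asarray(values, dtype=float)
    z = z - z.mean()
    xy = np.asarray(coords, dtype=float)
    dist = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    # dist > 0 drops self-pairs (and any exactly coincident sites).
    w = ((dist > 0) & (dist <= bandwidth)).astype(float)
    return float(len(z) / w.sum() * (z @ w @ z) / (z @ z))


# Synthetic example: meters on a 6 x 6 grid, two weeks of half-hourly data.
rng = np.random.default_rng(0)
n_side, n_days = 6, 14
gx, gy = np.meshgrid(np.arange(n_side), np.arange(n_side))
coords = np.column_stack([gx.ravel(), gy.ravel()])

t = np.arange(48 * n_days)
daily_cycle = np.sin(2 * np.pi * t / 48)
# Assumed spatial structure: the daily cycle strengthens from west to east.
amplitude = coords[:, 0] / (n_side - 1)
noise = rng.normal(0.0, 0.5, size=(len(coords), len(t)))
usage = amplitude[:, None] * daily_cycle + noise

cognostic = np.array([daily_autocorrelation(u) for u in usage])
i_stat = morans_i(cognostic, coords, bandwidth=1.0)
print(f"Moran's I of the daily-autocorrelation cognostic: {i_stat:.3f}")
```

A clearly positive Moran's I indicates that neighbouring meters have similar cognostic values. At the scale described above, the per-meter step is independent across meters and is the part that would be distributed across a parallel platform.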
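
For the third aim, the following sketch illustrates why the choice of permutation scheme matters. It compares a naive permutation, which treats observations as independent, with a simplified resampling of whole days, which keeps every observation at its original half-hour of the day. This is a deliberately reduced variant written for illustration, not the seasonal block bootstrap of Dudek et al. (2014) itself; the function names `day_block_resample` and `daily_profile` and the synthetic series are assumptions.

```python
import numpy as np


def day_block_resample(series, period=48, rng=None):
    """Resample whole blocks of length `period` with replacement.

    Each block keeps its internal order, so an observation never moves to a
    different position within the seasonal cycle.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(series, dtype=float)
    n_blocks = len(x) // period
    blocks = x[: n_blocks * period].reshape(n_blocks, period)
    picks = rng.integers(0, n_blocks, size=n_blocks)
    return blocks[picks].ravel()


def daily_profile(series, period=48):
    """Mean usage at each position within the seasonal cycle."""
    x = np.asarray(series, dtype=float)
    n_blocks = len(x) // period
    return x[: n_blocks * period].reshape(n_blocks, period).mean(axis=0)


# Synthetic half-hourly usage for one meter over four weeks.
rng = np.random.default_rng(1)
t = np.arange(48 * 28)
usage = 1.0 + np.sin(2 * np.pi * t / 48) + rng.normal(0.0, 0.2, t.size)

block_sample = day_block_resample(usage, rng=rng)
naive_sample = rng.permutation(usage)  # assumes independent observations

original = daily_profile(usage)
block_corr = np.corrcoef(original, daily_profile(block_sample))[0, 1]
naive_corr = np.corrcoef(original, daily_profile(naive_sample))[0, 1]
print(f"daily profile correlation, day blocks: {block_corr:.3f}")
print(f"daily profile correlation, naive:      {naive_corr:.3f}")
```

The day-block sample reproduces the daily usage profile, whereas the naive permutation destroys it, so a pattern judged against naive permutations could look significant merely because it is seasonal. This sketch handles a single seasonal period only. Setting `period=336` would preserve a weekly cycle instead, and one possible way to retain spatial correlation would be to apply the same `picks` to every meter, although neither extension is claimed here to be the approach the project will adopt.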