Advanced Data Analytics in Transport – Machine Learning Perspective

February 10th, 2015

Transport data analysis and modelling are being transformed by machine learning techniques and Big Data platforms. Machine learning techniques make it possible to derive patterns and models from large-volume, high-dimensional data. Big Data platforms use distributed file systems and parallel computing to process data quickly. Combined, the two can make sense of large real-time traffic data streams and support large-scale traffic simulation.

For business aspects of applying machine learning in transport, please see the companion page.

1. Predicting Near Future Traffic Jams and Hot Spots of Congestion
When an incident or congestion occurs on a major road, traffic in the surrounding area is likely to be affected. Early prediction of which nearby roads will become congested is important for choosing routes that let drivers avoid the area and, especially, that let emergency vehicles reach the site. Figure 1 illustrates a frequent congestion propagation pattern around Olympic Park in Melbourne.

Figure 1: Congestion propagation pattern around Olympic Park, Melbourne

We have introduced an algorithm that combines association rule mining with a dynamic Bayesian network to construct causality trees from congestion events and to estimate their propagation probabilities from temporal and spatial information. Frequent sub-structures of these causality trees reveal not only recurring interactions among spatio-temporal congestion events, but also potential bottlenecks or flaws in the design of existing traffic networks.
This algorithm was then used to discover real-time congestion ‘hot spots’ in the Sydney CBD, Australia. Figure 2 shows the traffic conditions for a typical morning in the Sydney CBD.

Figure 2: Sydney CBD’s morning traffic pattern
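
As a rough illustration of the propagation probabilities described above, the Python sketch below estimates how often congestion on one road segment is followed by congestion on a neighbouring segment within a short time window. It is a simplified frequency count in the spirit of association rule confidence, not the dynamic Bayesian network algorithm itself. The event log, the adjacency map and the names `events`, `neighbours` and `propagation_confidence` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical congestion events: (road_segment, time_slot). Illustrative values only.
events = [
    ("A", 1), ("B", 2), ("C", 3),
    ("A", 5), ("B", 6),
    ("A", 9), ("D", 10),
    ("B", 14),
]

# Hypothetical road adjacency: congestion on the key may spread to the listed segments.
neighbours = {"A": ["B", "D"], "B": ["C"], "C": [], "D": []}


def propagation_confidence(events, neighbours, max_lag=1):
    """Estimate P(neighbour congested within max_lag slots | source congested)."""
    # Collect the time slots in which each segment was congested.
    times = defaultdict(set)
    for segment, slot in events:
        times[segment].add(slot)

    confidence = {}
    for source, targets in neighbours.items():
        source_slots = times[source]
        if not source_slots:
            continue  # No congestion observed on this segment.
        for target in targets:
            # Count source events followed by congestion on the target segment.
            followed = sum(
                1
                for t in source_slots
                if any((t + lag) in times[target] for lag in range(1, max_lag + 1))
            )
            confidence[(source, target)] = followed / len(source_slots)
    return confidence


conf = propagation_confidence(events, neighbours)
for (source, target), value in sorted(conf.items()):
    print(f"{source} -> {target}: {value:.2f}")
```

Chaining these pairwise estimates along a path (for example A to B to C) gives a crude stand-in for one branch of a causality tree.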

2. Predicting Secondary Incident Location (Warning and Planning)
Under special conditions such as night time or heavy rain, the occurrence of one incident can be used to predict a secondary incident nearby. We have developed an algorithm based on frequent itemset and association rule mining to discover the likelihood of incident propagation from both spatial and temporal perspectives. It can:

Determine the main cause (incident) of the dangerous conditions.
Predict incident-prone conditions on sections of road once a first incident has happened.
Improve conditions through actionable suggestions that prevent secondary incidents, e.g. applying traffic management measures.
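
To make the idea of frequent itemset and association rule mining concrete, here is a minimal Python sketch that scores rules of the form "conditions imply secondary incident" by support and confidence. The records, the condition labels and the function names `support` and `rules_for` are invented for illustration. The sketch enumerates small itemsets by brute force rather than using an optimised miner such as Apriori.

```python
from itertools import combinations

# Hypothetical incident records; each is a set of observed conditions.
records = [
    {"night", "rain", "primary_incident", "secondary_incident"},
    {"night", "rain", "primary_incident", "secondary_incident"},
    {"night", "primary_incident"},
    {"rain", "primary_incident", "secondary_incident"},
    {"day", "primary_incident"},
    {"day", "rain", "primary_incident"},
]


def support(itemset, records):
    """Fraction of records that contain every item in itemset."""
    return sum(1 for r in records if itemset <= r) / len(records)


def rules_for(target, records, min_support=0.3, max_size=2):
    """Return (antecedent, support, confidence) for rules antecedent -> target."""
    items = sorted(set().union(*records) - {target})
    rules = []
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            antecedent = frozenset(combo)
            joint = support(antecedent | {target}, records)
            if joint < min_support:
                continue  # Itemset is not frequent enough to keep.
            confidence = joint / support(antecedent, records)
            rules.append((set(antecedent), joint, confidence))
    # Most confident rules first.
    return sorted(rules, key=lambda rule: -rule[2])


rules = rules_for("secondary_incident", records)
for antecedent, sup, confidence in rules:
    print(sorted(antecedent), f"support={sup:.2f}", f"confidence={confidence:.2f}")
```

A high-confidence rule whose antecedent combines several conditions (here, night and rain together) is the kind of pattern that could trigger a warning or a traffic management measure.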

3. Incident Duration Prediction
The research project “Decision Support for Incident Management” (also known as Machine Learning Assessment of Road Incidents), carried out with the NSW Transportation Management Center, focused mainly on machine learning methods for incident duration prediction and outlier detection. The best performance was achieved with Gradient Boosted Trees combined with advanced sampling algorithms.

Duration is the most important factor in optimising incident management. Given the available resources and traffic conditions, the system should be able to estimate the total duration from response activation to clearance. The management process relies on these per-incident predictions to optimise clearance time, whether for a specific significant incident or for overall network coverage.
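
The following Python sketch shows the general idea behind gradient boosted trees for duration regression: each round fits a one-split tree (a stump) to the current residuals and adds a damped copy of it to the model. It is a toy, standard-library-only illustration rather than the project's model. The features, the durations and all function names are hypothetical, and the sampling algorithms mentioned above are not covered.

```python
# Hypothetical training data (illustrative values only):
# (lanes_blocked, is_peak_hour, heavy_vehicle_involved) -> duration in minutes.
X = [
    (1, 0, 0), (1, 1, 0), (2, 0, 0), (2, 1, 0),
    (1, 0, 1), (2, 1, 1), (3, 1, 1), (3, 0, 0),
]
y = [20.0, 30.0, 35.0, 50.0, 45.0, 80.0, 110.0, 60.0]


def fit_stump(X, residuals):
    """Find the single-feature threshold split that minimises squared error."""
    best = None
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[feature] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[feature] > threshold]
            if not left or not right:
                continue  # Not a real split.
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            error = sum((r - left_mean) ** 2 for r in left)
            error += sum((r - right_mean) ** 2 for r in right)
            if best is None or error < best[0]:
                best = (error, feature, threshold, left_mean, right_mean)
    return best[1:]


def predict_stump(stump, row):
    feature, threshold, left_mean, right_mean = stump
    return left_mean if row[feature] <= threshold else right_mean


def fit_boosted(X, y, n_rounds=50, learning_rate=0.1):
    """Gradient boosting for squared error: each stump fits the residuals."""
    base = sum(y) / len(y)  # Start from the mean duration.
    predictions = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [target - pred for target, pred in zip(y, predictions)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        predictions = [
            pred + learning_rate * predict_stump(stump, row)
            for pred, row in zip(predictions, X)
        ]
    return base, stumps, learning_rate


def predict(model, row):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, row) for s in stumps)


model = fit_boosted(X, y)
print(f"Predicted duration: {predict(model, (2, 1, 0)):.1f} minutes")
```

In practice a library implementation with deeper trees, held-out evaluation and richer incident features would be the natural choice; the sketch only shows how residual fitting builds up the prediction.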

Figure 3: Sydney area incident map (estimated mapping)

4. Traffic Watch
For incident management purposes, we have developed a live system that displays the current location and status of all incidents from multiple sources, including social media. Figure 4 presents the web service and interface for the real-time view of current incidents in Victoria. Heat maps and cluster maps are also generated.

This system uses advanced web technologies and state-of-the-art machine learning (ML) algorithms, such as Support Vector Machines and Conditional Random Fields, to identify tweets relevant to traffic incidents.
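
As a hedged sketch of how a Support Vector Machine can separate incident-related tweets from unrelated ones, the Python example below trains a linear SVM on bag-of-words features using subgradient descent on the hinge loss. The tweets, labels, hyperparameters and function names are invented for illustration, the bias term is omitted for brevity, and Conditional Random Fields are not covered.

```python
# Hypothetical labelled tweets: +1 = traffic incident, -1 = unrelated.
training = [
    ("accident on m4 two lanes blocked", 1),
    ("crash near george st heavy traffic", 1),
    ("breakdown on harbour bridge lanes closed", 1),
    ("traffic jam after crash on m2", 1),
    ("great coffee this morning in sydney", -1),
    ("watching the cricket with friends", -1),
    ("new phone arrived today", -1),
    ("lovely weather for a walk", -1),
]


def featurise(text):
    """Binary bag-of-words: the set of lower-cased tokens in the tweet."""
    return set(text.lower().split())


def train_linear_svm(samples, epochs=20, learning_rate=0.1, reg=0.01):
    """Linear SVM trained by subgradient descent on the regularised hinge loss."""
    weights = {}
    for _ in range(epochs):
        for text, label in samples:
            words = featurise(text)
            margin = label * sum(weights.get(w, 0.0) for w in words)
            # L2 regularisation shrinks every weight slightly.
            for w in weights:
                weights[w] *= 1.0 - learning_rate * reg
            # Only examples inside the margin contribute a hinge subgradient.
            if margin < 1.0:
                for w in words:
                    weights[w] = weights.get(w, 0.0) + learning_rate * label
    return weights


def classify(weights, text):
    score = sum(weights.get(w, 0.0) for w in featurise(text))
    return 1 if score > 0 else -1


weights = train_linear_svm(training)
for tweet in ["crash on m4 lanes blocked", "coffee with friends today"]:
    label = "incident" if classify(weights, tweet) == 1 else "unrelated"
    print(f"{tweet!r}: {label}")
```

A tweet made up entirely of unseen words scores zero and falls into the unrelated class, which is one reason a real system would need a much larger vocabulary and training set.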

Figure 4: Incident detection and visualisation

5. Sydney CBD Mobility Modelling
This NSW Roads and Maritime Services (RMS) project evaluates the impact of both planned and unplanned closures of George St on traffic volume, congestion and travel time at intersections and road links in and around the Sydney CBD (Figure 5).

To quantify the impact of light-rail construction and operation in the Sydney CBD on traffic flows, we have developed a state-of-the-art transport zoning method and an origin-destination demand modelling tool, which are integrated with existing traffic assignment models.

Figure 5: A snapshot of Sydney CBD traffic based on degree of saturation

6. RMS Key Road Performance Report
This project fuses RMS data from multiple sources to produce meaningful content and support an interactive user interface. The back-end system fuses SCATS and GPS data to produce a travel time distribution for each of the 127 key roads at 15-minute intervals, 24/7. The advanced reporting architecture will be able to generate offline and online reports for road performance evaluation. Figure 6 presents the online reporting interface for the RMS Key Roads Performance.
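
The per-road, per-interval travel time distribution can be pictured with the short Python sketch below, which groups travel time observations into 15-minute buckets and summarises each bucket. The observations, the road identifiers and the function names are hypothetical, and the fusion of SCATS and GPS data that would produce such observations is not shown.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median

# Hypothetical observations: (road_id, timestamp, travel_time_seconds).
observations = [
    ("road_001", datetime(2015, 2, 10, 8, 2), 310),
    ("road_001", datetime(2015, 2, 10, 8, 9), 350),
    ("road_001", datetime(2015, 2, 10, 8, 14), 330),
    ("road_001", datetime(2015, 2, 10, 8, 16), 420),
    ("road_001", datetime(2015, 2, 10, 8, 29), 460),
    ("road_002", datetime(2015, 2, 10, 8, 5), 120),
]


def interval_start(ts, minutes=15):
    """Round a timestamp down to the start of its interval."""
    return ts.replace(minute=ts.minute - ts.minute % minutes, second=0, microsecond=0)


def travel_time_summary(observations):
    """Summarise travel times per (road, 15-minute interval)."""
    buckets = defaultdict(list)
    for road_id, ts, seconds in observations:
        buckets[(road_id, interval_start(ts))].append(seconds)
    return {
        key: {
            "count": len(values),
            "min": min(values),
            "median": median(values),
            "max": max(values),
        }
        for key, values in buckets.items()
    }


summary = travel_time_summary(observations)
for (road_id, start), stats in sorted(summary.items()):
    print(road_id, start.strftime("%H:%M"), stats)
```

A fuller distribution (for example, percentiles or a histogram per bucket) would follow the same grouping step.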

Figure 6: Online report for Key Road Performance in NSW

7. Smart Motorway
The Advanced Data Analytics in Transport (ADAIT) team developed Data61’s in-house smart motorway control system, and evaluated the costs and benefits of such a system for Sydney's M4 as well as Western Australia's Kwinana Freeway. Microscopic simulation demonstrated that travel time savings can reach 40% during peak hours, with a capacity gain equivalent to adding an extra lane. This means a saving of $22 million per annum on part of the M4 alone, with the potential for more than $500 million per annum if applied to the whole motorway system in Sydney.

Figure 7: Modelled freeway section on Sydney's M4, with the M4/M7 intersection in the background; this is one of the key congestion areas for westbound traffic during peak hours

8. Road Congestion in Urban Areas
Road congestion in urban areas is largely determined by the equilibrium between demand and supply capacity. Transport planning usually deals with supply capacity, whereas congestion management may look to short-term demand incentives. The challenge is to evaluate the impact of transport planning and congestion management schemes on the entire urban traffic system with higher accuracy, improved efficiency and visual demonstration. The ADAIT team aims to develop a traffic simulation platform that helps transport planners and traffic engineers evaluate the holistic impact of their own proposals, or of solutions borrowed from other cities, on the entire urban traffic system in Sydney.

Figure 8: Demo of our platform
