Traffic Watch

Posted by: data61

February 11, 2015

Real-time Traffic Incident Detection and Monitoring Using Social Media



Social networks have become a valuable source of real-time information. The Transport Management Centre (TMC) in the Australian state of New South Wales has collaborated with Data61 to develop TrafficWatch, a system that uses Twitter as a channel for transport network monitoring and for incident and event management. The system combines advanced web technologies with state-of-the-art machine learning (ML) algorithms. The live web interface is built on Data61 Subspace and a 3D Cesium Bing map, and provides a spatial and temporal display of tweets that are potentially related to transport issues. The crawled tweets are first filtered to keep only incidents in Australia, and are then divided into groups by online clustering and classification algorithms.

One issue identified was that only 3% of tweets contained geo-location data, which made it difficult to determine where reported issues had occurred. TrafficWatch was able to geo-locate an additional 20% of tweets by using locations extracted by its text analysis module.
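
The fallback described above (use the tweet's own geotag when present, otherwise look for a location mentioned in the text) can be illustrated with a minimal sketch. It is not TrafficWatch's text analysis module: the `GAZETTEER` table, its approximate coordinates, the tweet dictionary layout and the `geolocate` function are all assumptions made for illustration, and a plain substring lookup stands in for whatever location extraction the real system performs.

```python
# Hypothetical gazetteer: place name -> approximate (latitude, longitude).
# Names and coordinates are illustrative assumptions, not TrafficWatch data.
GAZETTEER = {
    "harbour bridge": (-33.852, 151.211),
    "parramatta": (-33.815, 151.001),
}


def geolocate(tweet):
    """Return (coordinates, source) for a tweet given as a dict.

    The tweet is assumed to have a "text" key and an optional
    "coordinates" key holding a (lat, lon) tuple or None.
    """
    # Prefer the geotag attached to the tweet, when there is one.
    if tweet.get("coordinates") is not None:
        return tweet["coordinates"], "geotag"

    # Otherwise fall back to place names mentioned in the text.
    # Longer names are tried first so that a specific name wins
    # over a shorter name it happens to contain.
    text = tweet["text"].lower()
    for name in sorted(GAZETTEER, key=len, reverse=True):
        if name in text:
            return GAZETTEER[name], "text"

    return None, "unknown"
```

Recording the source of each location ("geotag", "text" or "unknown") alongside the coordinates makes it possible to count how many tweets were located by each route.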

Findings from the use of TrafficWatch at TMC demonstrated that it has the potential to report incidents earlier than other data sources and to identify unreported incidents. Monitoring social media also shows promise for improving TMC’s network monitoring capabilities, in particular for assessing the network impacts of incidents and events.

System Architecture

  • Cutting-edge NLP techniques
  • State-of-the-art ML algorithms for text clustering, classification and named-entity recognition
  • Real-time NSW Traffic Watch web service with a record-and-playback timeline on a 3D Bing map


Figure: System architecture

Traffic Entities Auto-annotation

  • The tagged entities are essential input for the system that ranks incidents by significance level
  • The same incident details can be described by different key words (e.g. break down/stall/stationary). Generalising them into categories and entities helps the model learn these variations more efficiently
  • For management purposes, indexing tweets by entity rather than by ordinary key words allows rapid retrieval of the incidents that satisfy given criteria
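
The two ideas in the list above, generalising keyword variants into one entity and indexing tweets by entity, can be sketched as follows. This is an illustrative stand-in only: the source describes ML-based named-entity recognition, whereas the sketch uses a hand-written lookup table. The `KEYWORD_TO_ENTITY` mapping, the entity labels and both function names are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical mapping from surface key words to entity categories.
# The real system learns its entities; this table is an assumption.
KEYWORD_TO_ENTITY = {
    "break down": "BREAKDOWN",
    "broken down": "BREAKDOWN",
    "stall": "BREAKDOWN",
    "stalled": "BREAKDOWN",
    "stationary": "BREAKDOWN",
    "crash": "CRASH",
    "accident": "CRASH",
}


def tag_entities(text):
    """Return the set of entity categories whose key words occur in text."""
    lowered = text.lower()
    found = set()
    for keyword, entity in KEYWORD_TO_ENTITY.items():
        # Word boundaries stop a key word matching inside a longer word.
        if re.search(r"\b" + re.escape(keyword) + r"\b", lowered):
            found.add(entity)
    return found


def build_index(tweets):
    """Build an inverted index: entity category -> list of tweet ids.

    tweets is assumed to be a dict mapping tweet id -> tweet text.
    """
    index = defaultdict(list)
    for tweet_id, text in tweets.items():
        for entity in tag_entities(text):
            index[entity].append(tweet_id)
    return index
```

With such an index, a query like "all breakdown reports" becomes a single lookup on `index["BREAKDOWN"]`, regardless of whether the tweet said "stalled", "stationary" or "break down".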


Figure: Traffic entities annotation schema and examples of annotated tweets

Aggregating Tweets into Meaningful Clusters

  • Gives users a summary of the popular incident types in the tweets as they emerge over time
  • An online algorithm incrementally clusters tweets from the live stream using cosine similarity, and gradually improves its centroids as more data becomes available
  • The unsupervised clustering algorithm provides a more general view of the tweet clusters, based on the common key words used
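
A minimal sketch of incremental clustering with cosine similarity is given below. It illustrates the general technique rather than the TrafficWatch algorithm itself: the bag-of-words vectors, the running-mean centroid update, the `threshold` value of 0.3 and the `OnlineClusterer` class are all assumptions made for this example.

```python
import math
from collections import Counter


def vectorise(text):
    """Turn a tweet into a bag-of-words term-frequency vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class OnlineClusterer:
    """Assign each incoming tweet to the most similar cluster, or open a new one."""

    def __init__(self, threshold=0.3):
        self.threshold = threshold  # hypothetical similarity cut-off
        self.centroids = []         # one mean term vector per cluster
        self.sizes = []             # number of tweets in each cluster

    def add(self, text):
        """Cluster one tweet and return the index of its cluster."""
        vec = vectorise(text)
        best, best_sim = None, 0.0
        for i, centroid in enumerate(self.centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best, best_sim = i, sim

        # No cluster is similar enough: start a new one from this tweet.
        if best is None or best_sim < self.threshold:
            self.centroids.append(dict(vec))
            self.sizes.append(1)
            return len(self.centroids) - 1

        # Otherwise update the centroid as a running mean of its members.
        n = self.sizes[best]
        centroid = self.centroids[best]
        for term in set(centroid) | set(vec):
            centroid[term] = (centroid.get(term, 0.0) * n + vec.get(term, 0)) / (n + 1)
        self.sizes[best] = n + 1
        return best
```

Each tweet is processed once and only the centroids are kept in memory, which is what makes this style of algorithm suitable for a live stream. Terms with the highest weight in a centroid give a rough summary of the key words that define its cluster.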


Figure: Example of an incident cluster detected by TrafficWatch on 13 Jan, 2015

People: Fang Chen (technical contact), Hoang Nguyen, Chen Cai, Ronnie Taib, Paul Rivera, Kin-Hon Chan.