Sentiment Analysis Tools

February 12th, 2015

There is a tsunami of information and opinion posted online on news sites, blogs and Twitter. Data61 is developing tools such as EventWatch and OpinionWatch to help people and organisations understand public sentiment and reactions to events. These tools have wide-ranging applications across a number of industries, including communications, media, politics and retail.




Twitter generates a large volume of rich data (200 billion tweets are posted annually). The EventWatch platform aims to extract more useful information from this data by aggregating tweets into meaningful clusters, which offers a real-time summary of the events or topics being discussed. EventWatch uses the following technologies to cluster and analyse tweets:

  • Keyphrase Extraction – finding words that frequently occur together as phrases in a text corpus (e.g. ‘opening keynote’, ‘live demo’, ‘stunning animations’). Keyphrases are generally more useful than single words because they are more specific.
  • Named Entity Recognition – finding the names of people, organisations and locations in a text corpus.
  • Sentiment Analysis – estimating whether a document is positive or negative towards its subject.
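
To make the keyphrase idea concrete, here is a minimal sketch that counts adjacent word pairs across a handful of tweets and keeps the pairs that recur. It is an illustration only, not EventWatch's actual method: the function names, the stopword list, the `min_count` threshold and the sample tweets are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical stopword list for illustration; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "was", "is", "and", "of", "to", "in", "at", "with"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def extract_keyphrases(documents, min_count=2):
    """Return (phrase, count) pairs for two-word phrases seen at least min_count times."""
    counts = Counter()
    for doc in documents:
        tokens = tokenize(doc)
        for first, second in zip(tokens, tokens[1:]):
            # Skip pairs containing a stopword: they rarely make useful labels.
            if first in STOPWORDS or second in STOPWORDS:
                continue
            counts[f"{first} {second}"] += 1
    return [(phrase, n) for phrase, n in counts.most_common() if n >= min_count]


# Made-up sample tweets, reusing the example phrases from the list above.
tweets = [
    "The opening keynote was great",
    "Loved the live demo at the opening keynote",
    "That live demo had stunning animations",
]
print(extract_keyphrases(tweets))
```

In this sample, 'opening keynote' and 'live demo' each appear twice and are kept, while 'stunning animations' appears once and falls below the threshold. Raw counts are the simplest possible score; a production system would more likely weight pairs by how much more often they co-occur than chance would predict.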

These technologies help formulate more coherent Tweet clusters as well as extract more informative labels for each one, offering users:

  • an idea about the topics and events around a particular organisation, person, place, or product and how they emerged over time
  • the sentiments associated with the topic or event
  • an opportunity to react accordingly since the platform runs on live Twitter data.



OpinionWatch is a visual document explorer designed to help users understand the major topics in their document set, drill down within chosen topic areas and identify possible trends through our visualisations. The key technologies used to create these visualisations are:


Topic Modelling – using machine learning to automatically extract topics from a document set and generating a graph using these topics and their proportions in each document. This gives users a quick overview of the general topics being discussed and the concentration of documents for each one. Users can then drill down into specific topics.

Sentiment Analysis – estimating document sentiment using a sentiment lexicon, processing of word negations, and windowing based on search terms. Document sentiments are then visualised using green for positive and red for negative sentiment.
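
The three ingredients named above (a sentiment lexicon, negation handling, and windowing around a search term) can be sketched as follows. This is a simplified illustration under stated assumptions, not OpinionWatch's implementation: the tiny lexicon, the negation list, the one-word negation rule, the default window size and the "neutral" label for a zero score are all hypothetical choices made for this example.

```python
import re

# Tiny hypothetical lexicon; real sentiment lexicons contain thousands of scored words.
LEXICON = {"great": 1, "love": 1, "stunning": 1, "bad": -1, "slow": -1, "boring": -1}
NEGATIONS = {"not", "no", "never", "isn't", "wasn't", "don't"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment_score(text, search_term=None, window=5):
    """Sum lexicon scores, optionally only within `window` tokens of the search term."""
    tokens = tokenize(text)
    if search_term is None:
        keep = set(range(len(tokens)))
    else:
        # Windowing: only tokens close to an occurrence of the search term count.
        term = search_term.lower()
        keep = set()
        for pos, tok in enumerate(tokens):
            if tok == term:
                keep.update(range(max(0, pos - window), min(len(tokens), pos + window + 1)))
    score = 0
    for i, tok in enumerate(tokens):
        if i not in keep or tok not in LEXICON:
            continue
        value = LEXICON[tok]
        # Negation: flip the polarity when the preceding token is a negation word.
        if i > 0 and tokens[i - 1] in NEGATIONS:
            value = -value
        score += value
    return score


def sentiment_colour(score):
    """Map a score to a display colour: green for positive, red for negative."""
    if score > 0:
        return "green"
    if score < 0:
        return "red"
    return "neutral"  # assumption: the source does not say how neutral documents are shown
```

For example, `sentiment_score("The battery is bad but the screen is stunning", search_term="screen", window=2)` only considers the words near "screen", so the negative word about the battery does not affect the result. Windowing in this way lets the same document receive different scores for different search terms.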

Key Phrase Extraction – presenting the published documents in a timeline where key phrases are extracted daily to offer a better idea of the events that have transpired.
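
A daily key phrase timeline of this kind can be sketched by grouping documents by publication date and picking the most frequent two-word phrase for each day. The function name, the date strings and the sample documents below are invented for illustration, and counting raw word pairs is a stand-in for whatever extraction method the tool actually uses.

```python
import re
from collections import Counter, defaultdict


def top_phrase_per_day(dated_documents):
    """Map each date to the most frequent two-word phrase published on that date."""
    by_day = defaultdict(Counter)
    for date, text in dated_documents:
        tokens = re.findall(r"[a-z']+", text.lower())
        # Count every adjacent word pair in the document under its date.
        by_day[date].update(" ".join(pair) for pair in zip(tokens, tokens[1:]))
    # Sort by date so the result reads as a timeline; skip days with no word pairs.
    return {
        day: counts.most_common(1)[0][0]
        for day, counts in sorted(by_day.items())
        if counts
    }


# Hypothetical documents with made-up dates.
documents = [
    ("2015-02-10", "product launch delayed"),
    ("2015-02-10", "big product launch"),
    ("2015-02-11", "live demo impressed everyone"),
    ("2015-02-11", "great live demo"),
]
timeline = top_phrase_per_day(documents)
```

Here the timeline maps the first day to "product launch" and the second to "live demo", giving a one-line summary of what was discussed on each date.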

People: Paul Rivera