
Dinusha Vatsalan

 



Research Scientist,
Networks Group
CyberPhysical Systems Research Program
Data61-CSIRO


Contact

email: first{dot}last{@}data61{dot}csiro{dot}au
phone: +61 2 9490 5734





Short Bio

Dr. Dinusha Vatsalan is a Research Scientist at Data61-CSIRO, Australia, and an Honorary Lecturer in the Research School of Computer Science at the Australian National University. Her research interests are mainly in privacy-preserving techniques, privacy in data matching and mining, privacy in social media, privacy risk evaluation and prediction, health informatics, and population informatics. She is currently working on privacy in web search and social media as well as privacy-preserving data matching.

Education:

Work Experience:

  • Research Scientist – Networks Group, CyberPhysical Systems Research Program, Data61-CSIRO, Sydney, Australia, May 2017 – Present.
  • Honorary Lecturer – Research School of Computer Science, Australian National University, Canberra, Australia, May 2017 – Present.
  • Research Fellow (Level B) – Research School of Computer Science, Australian National University, Canberra, Australia, Oct 2014 – Apr 2017.
  • Research Assistant – Research School of Computer Science, Australian National University, Canberra, Australia, Feb 2012 – May 2012; and Sep 2013 – Nov 2013.
  • Tutor – Research School of Computer Science, Australian National University, Canberra, Australia, Feb 2013 – Jun 2013.
  • Instructor – School of Computing, University of Colombo, Sri Lanka, Nov 2009 – Nov 2010.
  • Trainee Software Engineer – Aeturnum (Pvt) Ltd, Sri Lanka, Feb 2008 – Aug 2008.

Awards & Grants:

  • Early Career Researcher Travel Award – Funded by the Australian National University, 2016.
  • Data Linkage and Anonymisation Programme participant – Funded by the Isaac Newton Institute for Mathematical Sciences, Cambridge, UK, 2016.
  • Investigator and mentor for the project Privacy-preserving medical data linkage – Awarded by Google Summer of Code 2016.
  • Co-investigator for the project Advancing data integration: Privacy and semantics for record linkage – Awarded by the Australia-Germany Joint Research Cooperation Scheme, 2015-2016.
  • Endeavour Postgraduate Research Award – Awarded by the Australian Government, DEEWR, 2011.
  • Recognition of Women in Data Mining Citation – Awarded by AusDM 2011 Conference, 2011.
  • NBQSA Silver Award – National Best Quality Software Awards, ViduSuwa eHealth Project, eHealth Track, 2010.
  • Manthan Award South Asia – South Asia’s Best e-Content Award, ViduSuwa eHealth Project, eHealth Track, 2009.
  • eSwabhimani Award – National Best e-Content Award, ViduSuwa eHealth Project, eHealth and Environment Category, 2009.
  • eIndia 2009 Speaker Award – Awarded by the eIndia Conference, eHealth Track, 2009.

Professional Services:

  • PhD thesis examiner – Data mining and privacy, Charles Sturt University, 2017
  • Lecturer – ANU online course: Master of Applied Data Analytics – Data Wrangling (COMP8430), 2017
  • PhD Co-supervisor – Privacy preserving record linkage, Thilina Ranbaduge, ANU, 2014-present; Temporal record linkage, Yichen Hu, ANU, 2016-present
  • Masters Co-supervisor – Predicting number of patients in emergency department, Yuebin (Alex) Zhao, 2016-present; Error repairing in entity resolution, Chong Feng, ANU, 2016; Household linkage, Narayan Mani, ANU, 2015
  • Reviewer – Journal of Data and Information Quality (JDIQ) 2017; Journal of Information Systems (JIS) 2016, 2017; Transactions on Knowledge and Data Engineering (TKDE) 2015, 2016; Journal of Knowledge and Information Systems 2015, 2016; Journal of BioMed Central (BMC) Medical Research Methodology 2015; Journal of Algorithms 2015; IEEE International Conference on Data Mining Workshop on Data Integration and Applications (DINA) 2014 – 2016; and AusDM Conferences 2014 – 2016.
  • External Reviewer – IEEE ICDM 2015; Springer PAKDD 2014, 2015; ACM Knowledge Discovery and Data mining (KDD) 2015; and Journal of Privacy and Confidentiality (CMU) 2015.
  • Co-organizer – First International Workshop on Population Informatics for Big Data (PopInfo) 2015.
  • Organizing Committee Member – AusDM 2013, Canberra; ICTer 2010 conferences
  • Program Committee Member – Asia-Pacific Symposium on Intelligent and Evolutionary Systems (IES 2016); IEEE International Conference on Data Mining (ICDM) Workshop on Data Integration and Applications (DINA) 2014 – 2016; and AusDM Conferences 2014 – 2017.

Projects

  • Privacy-preserving Medical data Linkage (PriMedLink)
    • Involved as an investigator and mentor for the PriMedLink project that aims to research and develop novel algorithms for privacy-preserving linking of medical data for health analytics (such as similar patient matching, clinical trials, and customized treatment)
    • Open source software development funded by Google Summer of Code 2016
  • Multi-Party Privacy-Preserving Record Linkage (MP-PPRL)
    • Developed several novel software algorithms for privacy-preserving record linkage of multiple large databases using data mining techniques and privacy-preserving techniques
    • Funded by the ARC Discovery Project grant DP130101801
  • Privacy-Preserving Record Linkage (PPRL)
    • Conducted extensive research on PPRL techniques and developed novel software algorithms addressing the current gaps in PPRL
    • Funded by the Australian government (DEEWR) – Endeavour postgraduate research award
  • A flexible and extensible personal data generator and corruptor
    • Involved in the development, testing, and manual writing of a synthetic personal data and temporal data generator for Fujitsu Laboratories, Japan using the Python programming language; and a web-based GUI GeCo using Python, HTML, PHP, JS, AJAX, and JSON
  • ViduSuwa: A Mobile Telemedicine Solution for Patients in Emerging Countries
    • Involved in the eHealth research project, ViduSuwa, on mobile technologies for enhancing eHealth solutions using Electronic Medical Record (EMR) and M-Communication systems implemented using J2EE and J2ME
    • Conducted research on mobile technologies for enhancing eHealth solutions and published research outcomes in conference proceedings and a journal article
    • Funded by the Information and Communication Technology Agency (ICTA) Sri Lanka

Current Activities

  • Program Committee Member – AusDM conference 2017

Publications

Journal articles:

  • Automatic Discovery of Abnormal Values in Large Textual Databases. Peter Christen, Ross Gayler, Khoi-Nguyen Tran, Jeffrey Fisher and Dinusha Vatsalan. Journal of Data and Information Quality (ACM), volume 7, issue 1-2, April 2016. Article available online at dx.doi.org/10.1145/2889311
  • Privacy-preserving matching of similar patients. Dinusha Vatsalan and Peter Christen. Journal of Biomedical Informatics (Elsevier), volume 59, February 2016, Pages 285-298. Article available online at doi:10.1016/j.jbi.2015.12.004.
  • Evaluation of advanced techniques for multi-party privacy-preserving record linkage on real-world health databases. Thilina Ranbaduge, Dinusha Vatsalan, Sean Randall, and Peter Christen. Proceedings of the International Population Data Linkage Conference, Swansea, Wales, August 2016. Abstract available online at http://www.ipdlnconference2016.org/Programme/Abstract/89
  • A taxonomy of privacy-preserving record linkage techniques. Dinusha Vatsalan, Peter Christen, and Vassilios S. Verykios. In Journal of Information Systems (Elsevier), volume 38, issue 6, September 2013, Pages 946-969. Article available online at http://dx.doi.org/10.1016/j.is.2012.11.005. (One of the most cited Information Systems articles – http://www.journals.elsevier.com/information-systems/most-cited-articles)
  • eClinics Integration Techniques for Clinical Information Systems Moving in to a National Network. Dinusha Vatsalan, Shiromi Arunatilake, Keith Chapman, Saatviga Sudhahar, Chamal Abeywardhana. In Sri Lanka Journal of Bio-Medical Informatics, volume 2, issue 4, June 2012, Pages 130-143. Article available online at http://dx.doi.org/10.4038/sljbmi.v2i4.2257.

Book chapters:

  • Privacy-Preserving Record Linkage for Big Data: Current Approaches and Research Challenges. Dinusha Vatsalan, Ziad Sehili, Peter Christen, and Erhard Rahm. Book chapter in Big Data Handbook, Springer, 2016.
  • Advanced Record Linkage Methods and Privacy Aspects for Population Reconstruction – A Survey and Case Studies. Peter Christen, Dinusha Vatsalan, and Zhichun Fu. Invited book chapter in Population Reconstruction. Gerrit Bloothooft, Peter Christen, Kees Mandemakers, and Marijn Schraagen (editors). Springer, August 2015.

Conference proceedings:

  • Efficient Cryptanalysis of Bloom Filters for Privacy-Preserving Record Linkage. Peter Christen, Rainer Schnell, Dinusha Vatsalan, and Thilina Ranbaduge. Proceedings of Springer PAKDD, Jeju Island, South Korea, May 2017.
  • Improving Temporal Record Linkage using Regression Classification. Yichen Hu, Qing Wang, Dinusha Vatsalan, and Peter Christen. Proceedings of Springer PAKDD, Jeju Island, South Korea, May 2017.
  • Scalable privacy-preserving linking of multiple databases using counting Bloom filters. Dinusha Vatsalan, Peter Christen, and Erhard Rahm. Proceedings of the ICDMW on Privacy and Discrimination in Data Mining (PDDM), Barcelona, Spain, December 2016. An extended version of the article is available in arXiv proceedings.
  • Regression classification for improved temporal record linkage. Yichen Hu, Qing Wang, Dinusha Vatsalan, and Peter Christen. Proceedings of the AusDM, Canberra, December 2016.
  • Scalable Block Scheduling for Efficient Multi-Database Record Linkage. Thilina Ranbaduge, Dinusha Vatsalan, and Peter Christen. Proceedings of the IEEE International Conference on Data Mining (ICDM’16), Barcelona, Spain, December 2016.
  • Efficient Record Linkage Using a Compact Hamming Space. Dimitrios Karapiperis, Dinusha Vatsalan, Vassilios Verykios, and Peter Christen. Proceedings of the 19th International Conference on Extending Database Technology (EDBT’16), Bordeaux, France, March 2016. Paper (pdf, 1.9MB) available online in Open Proceedings.
  • Hashing-based Distributed Multi-party Blocking for Privacy-preserving Record Linkage. Thilina Ranbaduge, Dinusha Vatsalan, Peter Christen, and Vassilios Verykios. Proceedings of the 20th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD’16), Auckland, New Zealand, April 2016. Paper (pdf, 532 KB) available online from Springer Link.
  • MERLIN – A Tool for Multi-party Privacy-preserving Record Linkage. Thilina Ranbaduge, Dinusha Vatsalan, and Peter Christen. Proceedings of the IEEE International Conference on Data Mining (ICDM’15), Atlantic City, November 2015 (Demo paper).
  • Efficient Entity Resolution with Adaptive and Interactive Training Data Selection. Peter Christen, Dinusha Vatsalan, and Qing Wang. Proceedings of the IEEE International Conference on Data Mining (ICDM’15), Atlantic City, November 2015 (Short paper).
  • Efficient Interactive Training Selection for Large-scale Entity Resolution. Qing Wang, Dinusha Vatsalan, and Peter Christen. Proceedings of the 19th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD’15), Ho Chi Minh City, Vietnam, May 2015 (Full paper). Paper (pdf, 471 KB) available online from Springer Link.
  • Clustering-based Scalable Indexing for Multi-party Privacy-preserving Record Linkage. Thilina Ranbaduge, Dinusha Vatsalan, and Peter Christen. Proceedings of the 19th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD’15), Ho Chi Minh City, Vietnam, May 2015 (Full paper). Paper (pdf, 382 KB) available online from Springer Link.
  • Large-Scale Multi-party Counting Set Intersection Using a Space Efficient Global Synopsis. Dimitrios Karapiperis, Dinusha Vatsalan, Vassilios S. Verykios, and Peter Christen. Proceedings of the 20th International Conference on Database Systems for Advanced Applications (DASFAA’15), Hanoi, Vietnam, April 2015 (Full Paper). Paper available online from Springer Link.
  • Scalable Privacy-Preserving Record Linkage for Multiple Databases. Dinusha Vatsalan and Peter Christen. Proceedings of the 23rd ACM International Conference on Information and Knowledge Management (CIKM’14), Shanghai, China, November 2014 (Poster paper). Paper (pdf, 460 KB) available online from ACM Digital Library Link. An extended version of the article is available in arXiv proceedings.
  • Tree Based Scalable Indexing for Multi-Party Privacy-Preserving Record Linkage. Thilina Ranbaduge, Peter Christen, Dinusha Vatsalan. Proceedings of the 12th Australasian Data Mining Conference, Brisbane, Australia, November 2014 (Full paper). Paper (pdf, 754 KB) available online.
  • An evaluation framework for privacy-preserving record linkage. Dinusha Vatsalan, Peter Christen, Christine M. O’Keefe, and Vassilios S. Verykios. In Journal of Privacy and Confidentiality (CMU), volume 6, issue 1, 2014, Pages 35-75. Article available online at http://repository.cmu.edu/jpc/vol6/iss1/3.
  • Challenges for privacy preservation in data integration. Peter Christen, Dinusha Vatsalan, and Vassilios S. Verykios. Journal of Data and Information Quality (ACM), volume 5, issue 1-2, September 2014. Article available online at http://dl.acm.org/citation.cfm?id=2629604.
  • Efficient two-party private blocking based on sorted nearest neighborhood clustering. Dinusha Vatsalan, Peter Christen, and Vassilios S. Verykios. Proceedings of the 22nd ACM International Conference on Information and Knowledge Management (CIKM’13), San Francisco, United States, October 2013 (Full paper). Paper (pdf, 5.5 MB) available online from ACM Digital Library Link.
  • Flexible and extensible generation and corruption of personal data. Peter Christen and Dinusha Vatsalan. Proceedings of the 22nd ACM International Conference on Information and Knowledge Management (CIKM’13), San Francisco, United States, October 2013 (Poster paper). Paper (pdf, 394 KB) available online from ACM Digital Library Link.
  • GeCo: an online personal data generator and corruptor. Khoi-Nguyen Tran, Dinusha Vatsalan, and Peter Christen. Proceedings of the 22nd ACM International Conference on Information and Knowledge Management (CIKM’13), San Francisco, United States, October 2013 (Demo paper). Paper (pdf, 592 KB) available online from ACM Digital Library Link.
  • Sorted Nearest Neighborhood Clustering for Efficient Private Blocking. Dinusha Vatsalan and Peter Christen. Proceedings of the 17th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD’13), Gold Coast, Australia, April 2013 (Full paper). Paper (pdf, 456 KB) available online from Springer Link.
  • An Iterative Two-Party Protocol for Scalable Privacy-Preserving Record Linkage. Dinusha Vatsalan and Peter Christen. Proceedings of the 10th Australasian Data Mining Conference (AusDM’12), Sydney, December 2012 (Full paper). Paper (pdf, 682 KB) available online from Conferences in Research and Practice in Information Technology (CRPIT), vol. 134.
  • An Efficient Two-Party Protocol for Approximate Matching in Private Record Linkage. Dinusha Vatsalan, Peter Christen and Vassilios Verykios. Proceedings of the 9th Australasian Data Mining Conference (AusDM’11), Ballarat, December 2011 (Full paper). Paper (pdf, 880 KB) available online from Conferences in Research and Practice in Information Technology (CRPIT), vol. 121.
  • Mobile technologies for enhancing eHealth solutions in developing countries. Dinusha Vatsalan, Shiromi Arunatileka, Keith Chapman, Gihan Senaviratne, Saatviga Sudahar, Dulindra Wijetileka, Yvonne Wickramasinghe. Proceedings of the 2nd International Conference on eHealth, Telemedicine and Social Medicine (eTelemed 2010), Sint Maarten, Netherlands Antilles (Full paper). Paper (pdf, 571 KB) available online from IEEE Xplore.
  • Enhancing e-Health using m-Communication in Developing Countries. Dinusha Vatsalan, Shiromi Arunatileka, Keith Chapman, Gihan Senaviratne, Saatviga Sudahar, Dulindra Wijetileka, Yvonne Wickramasinghe. Proceedings of the 5th eIndia Conference 2009, Hyderabad, India (Full paper). Paper available online from elets online.

Other:

  • Scalable and Approximate Privacy-Preserving Record Linkage. Dinusha Vatsalan. PhD Thesis, Research School of Computer Science, College of Engineering and Computer Science, The Australian National University, October 2014. Thesis available online at ANU Digital Theses.
  • A flexible data generator for privacy-preserving data mining and record linkage. Peter Christen and Dinusha Vatsalan. User Manual, Fujitsu Laboratories Collaboration, Research School of Computer Science, College of Engineering and Computer Science, The Australian National University, 2012
  • Enhancing e-Health using m-Communication in Developing Countries. Dinusha Vatsalan, Shiromi Arunatileka, Keith Chapman, Gihan Senaviratne, Saatviga Sudahar, Dulindra Wijetileka, Yvonne Wickramasinghe. Magazine article, 2009.

Invited talks and seminars:

  • Privacy preserving techniques for data matching. Dinusha Vatsalan. Seminar to the Networks group, Data61-CSIRO, 2017.
  • Advanced Techniques for Privacy-Preserving Linking of Multiple Large Databases. Dinusha Vatsalan. Presentation at the Data Linkage and Anonymisation Programme by the Isaac Newton Institute for Mathematical Sciences, University of Cambridge, September 2016. Slides (pdf, 2.7MB).
  • A Tutorial on Population Informatics using Big Data. Peter Christen, Hye-Chung Kum, Qing Wang, and Dinusha Vatsalan. Tutorial at the Pacific Asia Conference on Knowledge Discovery and Data Mining (PAKDD), Auckland, New Zealand, April 2016. Slides (pdf, 15MB).
  • Techniques for Scalable Privacy-preserving Record Linkage. Peter Christen, Vassilios S. Verykios, and Dinusha Vatsalan. Tutorial at the 22nd ACM International Conference on Information and Knowledge Management (CIKM’13), San Francisco, United States, October 2013. Slides (pdf, 2.6 MB).
  • Scalable Privacy-preserving Record Linkage. Dinusha Vatsalan and Peter Christen. Invited presentation at the Australian Bureau of Statistics, Canberra, Australia, June 2013. Slides (pdf, 3 MB).
  • An Iterative Two-Party Protocol for Privacy-Preserving Record Linkage using Bloom Filters. Dinusha Vatsalan and Peter Christen. Invited presentation at the SAX Institute, Sydney, Australia and NICTA, Sydney, Australia, December 2012. Slides (pdf, 2.4 MB).