May 2020
Publications
- Jagriti Jalal, Mayank Singh, Arindam Pal, Lipika Dey and Animesh Mukherjee, ‘Identification, Tracking and Impact: Understanding the trade secret of catchphrases’, ACM/IEEE Joint Conference on Digital Libraries (JCDL 2020)
- Rizka Purwanto, Arindam Pal, Alan Blair, Sanjay Jha, ‘PhishZip: A New Compression-based Algorithm for Detecting Phishing Websites’, IEEE Conference on Communications and Network Security (CNS 2020)
- Paheli Bhattacharya, Kripabandhu Ghosh, Arindam Pal and Saptarshi Ghosh, ‘Hier-SPCNet: A Legal Statute Hierarchy-based Heterogeneous Network for Computing Legal Document Similarity’, International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2020) (a CORE A* conference and the top conference in Information Retrieval)
- Yang Liu, Yilong Yang, Zhuo Ma, Ximeng Liu, Zhuzhu Wang, Siqi Ma, ‘PE-HEALTH: Enabling Fully Encrypted CNN for Health Monitor with Optimized Communication’, accepted by IEEE/ACM International Symposium on Quality of Service (IWQoS) 2020.
- Phan, Raphael; Abe, Masayuki; Batten, Lynn; Cheon, Jung; Dawson, Ed; Galbraith, Steven; Guo, Jian; Hui, Lucas; Kim, Kwangjo; Lai, Xuejia; Lee, Dong Hoon; Matsui, Mitsuru; Matsumoto, Tsutomu; Moriai, Shiho; Nguyen, Phong; Pei, Dingyi; Phan, Duong Hieu; Pieprzyk, Josef; Wang, Huaxiong; Wolfe, Hank; Wong, Duncan; Wu, Tzong-Chen; Yang, Bo-Yin; Yu, Yu; Yiu, Siu-Ming; Zhou, Jianying, ‘Advances in Security Research in the Asiacrypt Region’, Communications of the ACM
- We had two publications accepted at SIGIR 2020, a CORE A* conference and the top conference in Information Retrieval.
Dr Arindam Pal co-authored a paper called ‘Hier-SPCNet: A Legal Statute Hierarchy-based Heterogeneous Network for Computing Legal Document Similarity’ with Paheli Bhattacharya, Kripabandhu Ghosh and Saptarshi Ghosh.
Computing similarity between two legal case documents is an important and challenging task in Legal Information Retrieval, for which text-based and network-based measures have been proposed in literature. All prior network-based similarity methods considered a precedent citation network among case documents only (PCNet). However, this approach misses an important source of legal knowledge – the hierarchy of legal statutes that are applicable in a given legal jurisdiction (e.g., country). We propose to augment the PCNet with the hierarchy of legal statutes, to form a heterogeneous network Hier-SPCNet, having citation links between case documents and statutes, as well as citation and hierarchy links among the statutes. Experiments over a set of Indian Supreme Court case documents show that our proposed heterogeneous network enables significantly better document similarity estimation, as compared to existing approaches using PCNet. We also show that the proposed network-based method can complement text-based measures for better estimation of legal document similarity.
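To make the idea concrete, here is a minimal sketch of why adding statute links can matter. It is not the method from the paper: the case and statute names are made up, and the similarity measure (Jaccard overlap of two-hop neighbourhoods) is a simple stand-in chosen only for illustration. The two hypothetical cases share no precedents, so a precedent-only network sees nothing in common, while the augmented network connects them through sibling sections of the same Act.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return dict(adj)


def neighbourhood(adj, node, hops=2):
    """Nodes reachable from `node` within `hops` steps, excluding `node`."""
    seen = {node}
    frontier = {node}
    for _ in range(hops):
        frontier = {n for f in frontier for n in adj.get(f, ())} - seen
        seen |= frontier
    return seen - {node}


def jaccard(adj, a, b, hops=2):
    """Stand-in similarity: overlap of the two nodes' neighbourhoods."""
    na, nb = neighbourhood(adj, a, hops), neighbourhood(adj, b, hops)
    union = na | nb
    return len(na & nb) / len(union) if union else 0.0


# Hypothetical toy data (all names are illustrative assumptions).
precedent_edges = [("case_A", "case_C"), ("case_B", "case_D")]      # case cites case
case_statute_edges = [("case_A", "Act_X/s1"), ("case_B", "Act_X/s2")]  # case cites statute
hierarchy_edges = [("Act_X/s1", "Act_X"), ("Act_X/s2", "Act_X")]    # section belongs to Act

pcnet = build_graph(precedent_edges)
hier_spcnet = build_graph(precedent_edges + case_statute_edges + hierarchy_edges)

sim_pcnet = jaccard(pcnet, "case_A", "case_B")
sim_hier = jaccard(hier_spcnet, "case_A", "case_B")
print(sim_pcnet, sim_hier)
```

In this toy setting the precedent-only graph gives the two cases zero overlap, whereas the augmented graph links them through their shared parent Act.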
Dr Wei Kang co-authored a paper entitled “Evidence Weighted Tree Ensembles for Text Classification”.
Abstract: In text analysis, documents are often mapped to vectors of binary values where 1 indicates the presence of a word and 0 the absence. The vectors are then used to train predictive models. In methods like random forests, predictions from some decision trees may be made purely from absent words. This type of prediction should be trusted less, as absent words can be interpreted in multiple ways. In this work, we propose to improve the comprehensibility and accuracy of ensemble models by distinguishing word presence and absence. The presented method weights predictions based on word presence in tree-based ensembles. Experimental results on 35 real text datasets indicate that our method, combined with tree-based ensembles, outperforms or is competitive with the state-of-the-art ensemble methods on various text classification tasks.
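As a rough illustration of the idea in the abstract, the sketch below down-weights votes from trees whose decision path was satisfied only by absent words. The tree encoding, the example words and labels, and the `absent_only_weight` value are all assumptions made for this example; the paper's actual weighting scheme may differ.

```python
from collections import defaultdict

# A tree is ("leaf", label) or ("split", word, subtree_if_absent, subtree_if_present).


def predict_with_evidence(tree, doc_words):
    """Return (label, number of tests on the path passed because a word was present)."""
    present_tests = 0
    while tree[0] == "split":
        _, word, if_absent, if_present = tree
        if word in doc_words:
            present_tests += 1
            tree = if_present
        else:
            tree = if_absent
    return tree[1], present_tests


def weighted_vote(trees, doc_words, absent_only_weight=0.4):
    """Sum tree votes, trusting predictions based purely on absent words less.

    The weight value is an illustrative assumption, not taken from the paper.
    """
    scores = defaultdict(float)
    for tree in trees:
        label, present_tests = predict_with_evidence(tree, doc_words)
        scores[label] += 1.0 if present_tests > 0 else absent_only_weight
    return max(scores, key=scores.get)


# Hypothetical hand-built ensemble of three one-split trees.
trees = [
    ("split", "refund", ("leaf", "other"), ("leaf", "complaint")),
    ("split", "great", ("leaf", "other"), ("leaf", "praise")),
    ("split", "thanks", ("leaf", "other"), ("leaf", "praise")),
]
doc = {"refund", "order"}

plain_majority = weighted_vote(trees, doc, absent_only_weight=1.0)
evidence_weighted = weighted_vote(trees, doc)
print(plain_majority, evidence_weighted)
```

With equal weights, the two trees that only saw absent words outvote the one tree that matched a present word. Once absence-only predictions are discounted, the prediction backed by the word actually in the document wins.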
Projects
- One of our project teams, led by Surya Nepal and Seyit Camtepe, has just successfully completed the ‘Assessing the Security and Compliancy of ROS-M Applications’ project in collaboration with the US Army. The project extends the US Army static code analysis framework with methods that orchestrate the existing static analysis tools and CSIRO Data61’s novel ML-based solutions to check whether bugs resemble known vulnerability classes. It delivers a software platform, named buGFinder, which consists of structured methods for orchestrating these solutions and tools to assure that implementations of ROS-M applications conform to the security requirements and contain no bugs resembling known software vulnerabilities. Our modular platform approach, built on state-of-the-art technologies such as Docker images and containers, makes it possible to integrate a tool once and run it automatically over a series of ROS-M applications pulled from the registry. The aggregated output of the applicable tools can give much stronger insights and can aid human experts by effectively eliminating many false results.
- Spotlight on the Adelaide Security Data Science team in Data61’s April newsletter, InsideData61. Data61 formally welcomes Adelaide’s new Security Data Science team, set to boost DSS capability in national and cyber security research, with Andrew Feutrill as team leader.
Students
- Dimaz Wijaya
Dimaz Ankaa Wijaya (PhD at Monash University) submitted his thesis, entitled “Anonymity in Cryptocurrency”, in April 2020. Congratulations, Dimaz, on this great milestone. He worked under Dr Dongxi Liu’s supervision for three years, supported by a Data61 Top-up Scholarship. He enjoyed his time at Data61, with Dr Marthie Grobler as his line manager. He also participated in the CSIRO mentoring program with Dr Gary Delaney as his mentor. The programs and activities at Data61 polished his organisational and communication skills, which will be useful in his future career.
Anonymity in Cryptocurrency: A cryptocurrency is a decentralised digital currency that utilises blockchain technology to remove the role of a central authority. Monero is one of the cryptocurrencies that improves its anonymity by employing privacy-preserving cryptographic techniques, such as linkable ring signatures. In this thesis, we explore three areas of the Monero system that can cause anonymity problems: the Monero transaction creation protocol, Monero protocol updates, and Monero third-party services. We identify attack schemes that reduce honest users’ transaction anonymity. We then investigate the impact of Monero protocol updates on transaction anonymity. Lastly, we study wallet service providers that can trace Monero transactions and mining pools that leak information.
- Hagen Lauer (Monash University)
Hagen has just started a postdoc position in Germany after submitting his thesis in January 2020, supervised by Carsten Rudolph (Monash) and Surya Nepal (Data61). All the best, Hagen!
“My thesis, entitled ‘Security and Trust in Virtual Environments’, tackles the design and verification of a Virtual Trusted Platform for applications such as Cloud Computing. Virtualization is a core concept in modern cloud systems and clients place a vast amount of trust in the virtualization system to provide essential security guarantees such as data confidentiality and software integrity. A virtualization system’s unlimited access to software and data in virtual environments presents a genuine scientific challenge. The Trusted Platform Module (TPM) as part of a trusted platform can be used to establish trust in a computer, and my thesis discusses challenges and presents solutions related to establishing trust in a virtual environment. With the help of my supervisors, I was able to make great progress and secure Monash University’s Postgraduate Publications Award for my thesis. With significant help from Data61, I was able to present our research at excellent venues, which resulted in great feedback and many new research contacts across the globe. I am particularly grateful for Data61’s support in opportunities such as attending the 50th Turing Awards Conference, DSTG’s Hacking for Defense, Data61 Summer Schools, and several Trusted Computing Group events. Shortly after submitting my thesis, I joined the Fraunhofer Institute for Secure Information Technology in Darmstadt, Germany, as a research associate, where I conduct my research and coordinate R&D projects in the Digital Energy space. I felt well prepared for this role because of the experience and influence from my colleagues and supervisors at Data61, and I certainly hope to collaborate with them again in the future.”
- Let’s meet two of our new students: Dennis Liu and Maisie Badami
Dennis is a PhD student at the University of Adelaide under the supervision of Joshua Ross and Lewis Mitchell in the School of Mathematical Sciences, and is interested in data science, systems design, meaningful insights and learning algorithms. Currently, he is working on developing models of disease outbreaks based on digital data, and understanding the inherent biases in digital data. Alongside this work, he is also investigating different statistical & machine learning algorithms to understand their mathematical foundations and their appropriate use.
Dennis is also working on SA traffic data to evaluate social distancing measures and their effect on COVID transmission. This work could also incorporate Google mobility indices, Facebook colocation data and public transport data.
‘A major highlight was assisting the Australian Government response and preparedness for COVID-19 through epidemiological modelling, and being asked by The Conversation to help inform the greater public on the mathematics behind why we should flatten the curve,’ says Dennis.
Maisie is a PhD student from UNSW, researching automation techniques to support conducting empirical studies (such as systematic reviews) across different research disciplines. Maisie’s research is focused on adopting machine learning (especially NLP) and crowdsourcing methods to mitigate significant challenges that researchers are facing when conducting these types of studies and dealing with the huge amount of unstructured data.
Workshops/conferences
- Two of our workshop proposals (Arindam Pal) have been accepted at the IEEE International Conference on Big Data (BigData 2020) (http://bigdataieee.org/BigData2020/), to be held from December 10 to 13, 2020 in Atlanta, Georgia, USA:
- IEEE International Workshop on Fair and Interpretable Learning Algorithms (FILA 2020): https://fila-workshop.github.io/
- IEEE International Workshop on Data Analytics for Smart Health (DASH 2020): https://sites.google.com/view/ieee-dash-2020
- Sushmita Ruj has been invited to the Program Committee of IACR ACNS 2021.
- Marthie Grobler and Mohan Baruwal Chhetri are organizing a Workshop on Human Centric Software Engineering and Cyber Security in conjunction with the 35th IEEE/ACM International Conference on Automated Software Engineering (https://conf.researchr.org/home/ase-2020), to be held in Melbourne from 21 to 25 September 2020.