Privacy-Preserving Analytics for the EdTech Industry
The Challenge
Education is Australia’s largest services export, valued at $30bn per annum, and increasingly relies on education technology (EdTech), itself a major growth opportunity for Australia (projected by Austrade to be worth $1.7bn per annum by 2022). The ability to personalise learning and improve outcomes based on insights gained from student data is a major driver of the $250bn global EdTech industry. Artificial Intelligence (AI) is increasingly used to deliver these insights, a practice known as Learning Analytics (LA). The data that enable LA are personal, and the insights derived from them can be even more sensitive than the source data (e.g. a prediction that a student is at risk of failing).
The world’s largest economies are passing legislation that requires tight controls on the privacy of personal data. EdTech systems will need to protect the personal and confidential data processed by LA, and to prove that those data remain private, especially in the common situation where multiple systems are involved and some or all of them operate “in the cloud”. However, current approaches to data privacy protection discard valuable information, significantly reducing the data’s utility for LA and reporting. Worse still, data de-identified in this way can often be re-identified using common techniques such as inference and linking attacks, in which new information is inferred from the de-identified data, or the data are linked with another data source to re-identify a subject. These shortcomings severely limit educators’ options to store their data safely and efficiently, or to use them to innovate and improve.
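To make the linking attack concrete, the sketch below shows how a “de-identified” dataset that retains quasi-identifiers (postcode, birth year, gender) can be joined with a public auxiliary dataset to re-attach names to sensitive LA insights. All data and column names here are hypothetical, chosen purely for illustration.

```python
# Illustrative sketch of a linking attack on de-identified learning data.
# Direct identifiers have been removed, but quasi-identifiers remain, and a
# simple join with an auxiliary dataset re-identifies unique individuals.
import pandas as pd

# "De-identified" LA extract: names removed, sensitive insight retained.
deidentified = pd.DataFrame({
    "postcode":   ["2000", "2000", "3052", "4067"],
    "birth_year": [2001,   2003,   2002,   2001],
    "gender":     ["F",    "M",    "F",    "M"],
    "at_risk":    [True,   False,  False,  True],   # sensitive LA prediction
})

# Public auxiliary data (e.g. a club membership list) with names attached.
auxiliary = pd.DataFrame({
    "name":       ["Alice Nguyen", "Chloe Smith", "Ben Carter"],
    "postcode":   ["2000", "3052", "4067"],
    "birth_year": [2001,   2002,   2001],
    "gender":     ["F",    "F",    "M"],
})

# The linking attack is a plain join on the shared quasi-identifiers.
linked = deidentified.merge(auxiliary, on=["postcode", "birth_year", "gender"])
print(linked[["name", "at_risk"]])
# Every unique quasi-identifier combination re-attaches a name to the
# sensitive "at risk of failing" prediction.
```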
The Project
This project will provide a platform for educators and EdTech providers to provably and irreversibly de-identify learning data whilst preserving its usefulness for LA. It will also provide a mechanism to consistently measure the privacy risk of a dataset (the risk that a subject could be re-identified) before and after de-identification is applied. This risk measure will allow educators, for the first time, to set measurable standards for acceptable privacy risk. With this solution in place, educators will be able to make fuller use of their student data for innovation and competition while measurably improving its safety.
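The project’s risk model itself is not detailed here (the publications below describe a Markov-model-based quantification). Purely as an illustration of measuring privacy risk before and after de-identification, the sketch below uses a simple k-anonymity-style metric: the worst-case re-identification probability is 1 divided by the size of the smallest group of records sharing the same quasi-identifier values. All names and data are hypothetical.

```python
# Minimal sketch of a before/after re-identification risk measure.
# This is a simple k-anonymity-style metric, not the project's own model.
from collections import Counter

def reidentification_risk(records, quasi_identifiers):
    """Worst-case probability of re-identifying some individual, taken as
    1 / size of the smallest group sharing the same quasi-identifier values."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return 1.0 / min(groups.values())

records = [
    {"postcode": "2000", "birth_year": 2001, "grade": 72},
    {"postcode": "2000", "birth_year": 2003, "grade": 55},
    {"postcode": "3052", "birth_year": 2002, "grade": 81},
    {"postcode": "3052", "birth_year": 2002, "grade": 64},
]
qids = ["postcode", "birth_year"]

print(reidentification_risk(records, qids))      # 1.0: unique records exist

# De-identify by generalisation: coarsen birth_year to a 5-year band.
generalised = [dict(r, birth_year=r["birth_year"] // 5 * 5) for r in records]
print(reidentification_risk(generalised, qids))  # 0.5: every group has >= 2 records
```

Comparing the two printed values shows the before/after idea: generalisation lowers the measured risk, at the cost of some information, and a threshold on this measure could serve as an acceptable-risk standard.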
Related Publications
- D. Vatsalan, T. Rakotoarivelo, R. Bhaskar, P. Tyler, D. Ladjal, “Privacy risk quantification in education data using Markov model”, British Journal of Educational Technology 53 (4), 804-821, 2022.
- D. Ladjal, S. Joksimović, T. Rakotoarivelo, C. Zhan, “Technological frameworks on ethical and trustworthy learning analytics”, British Journal of Educational Technology 53 (4), 733-736, 2022.
- S. Joksimović, D. Ladjal, C. Zhan, T. Rakotoarivelo, A. Li, “Towards Trusted Learning Analytics”, International Conference on Learning Analytics & Knowledge (LAK22), 2022.
- S. Joksimović, R. Marshall, T. Rakotoarivelo, D. Ladjal, C. Zhan, A. Pardo, “Privacy-driven learning analytics”, in Manage your own learning analytics: Implement a Rasch modelling approach, Springer, 2021.
- R. Marshall, A. Pardo, D. Smith, T. Watson, “Implementing next generation privacy and ethics research in education technology”, British Journal of Educational Technology 53 (4), 737-755, 2022.