Distributed machine learning: privacy, security and implementation

Duration: Ongoing Research Activity

Distributed machine learning techniques, including federated learning and split neural networks (or split learning), enable machine learning without direct access to raw data, which is often personal and sensitive and is held by clients such as hospitals or by end devices such as those in the Internet of Things (IoT). In general, a distributed machine learning technique enables a cloud server to obtain a joint model across a number of participants, each of which trains on its own data locally without disclosing that data to other parties. However, deploying distributed learning techniques still raises challenges in privacy, security, and implementation overhead. First, attacks such as data inversion, membership inference, and property inference threaten the privacy of the raw data even without direct access to it. Second, distributed learning techniques are inherently vulnerable to poisoning and backdoor attacks when some participants are malicious, because the cloud server has no right to examine their local data. Third, although such techniques have great potential for distributed system applications in IoT-Edge-Cloud scenarios, there is a significant gap in empirical end-to-end evaluations of these techniques in IoT-enabled distributed systems, which are needed to gain insight into and a deeper understanding of their practicality. This project aims to investigate and address these challenges.
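
As an illustration of how a server can obtain a joint model without seeing raw data, the following is a minimal sketch of federated averaging on a toy one-dimensional linear model. It is not the project's implementation: the function names (local_update, federated_average, train), the learning rate, the number of rounds, and the two client datasets are all assumptions chosen for illustration. Only model weights travel between the clients and the server.

```python
from typing import List, Tuple

Weights = List[float]                # [slope, intercept] of a toy 1-D linear model
Dataset = List[Tuple[float, float]]  # (x, y) pairs that stay on the client


def local_update(weights: Weights, data: Dataset,
                 lr: float = 0.05, epochs: int = 5) -> Weights:
    """Run a few epochs of gradient descent on one client's private data."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]


def federated_average(updates: List[Weights], sizes: List[int]) -> Weights:
    """Server step: average client weights, weighted by local sample count."""
    total = sum(sizes)
    return [
        sum(u[i] * s for u, s in zip(updates, sizes)) / total
        for i in range(len(updates[0]))
    ]


def train(clients: List[Dataset], rounds: int = 200) -> Weights:
    """Alternate local training and server-side averaging."""
    global_weights = [0.0, 0.0]
    for _ in range(rounds):
        # Each client starts from the current global model and returns
        # only its updated weights, never its data.
        updates = [local_update(global_weights, data) for data in clients]
        global_weights = federated_average(updates, [len(d) for d in clients])
    return global_weights


# Hypothetical clients, each holding noiseless samples of y = 2x + 1.
clients = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)],
]
model = train(clients)
print(model)  # slope and intercept of the joint model
```

The sketch also makes the second challenge concrete: the server accepts each client's weights as they are, so a client that returns manipulated weights would shift the average without the server being able to inspect the data behind them.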

Presentation LINK