Context

MLAI models with design constraints, e.g. scalability, uncertainty propagation, and privacy.

For ML applications to thrive outside the lab, they must account for the constraints inherent to the real world and have an efficient interface to input data.

Constraints are often the result of domain-specific requirements interacting with the wider world. These include privacy, scalability, robustness, fairness, security and verifiability. This activity will develop and deploy, across domains, ML models that accommodate such constraints.

To reduce dependence on human curation of data as input for models, the activity will also implement a data description framework that supports discovery and interpretation of data, together with the context in which it was generated.

Challenges

  • Learning securely from federated/distributed data. Implementation of numerical linear algebra libraries that directly include secure multiparty computation, building a solid foundation for the development of provably secure MLAI algorithms. This includes the study of trade-offs in quantisation, as well as information geometry.
  • Preserving privacy of sensitive data. Many domains require assurance of privacy preservation. This research will embed privacy-preserving techniques in AI services in a modular way to “sanitise” platforms.
  • Robustness against adversarial attacks. Assurance of robustness against adversarial attacks is necessary for deployment of ML solutions in targeted applications. That assurance is challenging because of the way ML generalises from training data to previously unseen situations. It is both infeasible and undesirable to specify the desired output for every possible situation, but we still need to provide certain assurances about how the behaviour of the machine learning algorithms generalises.
  • Integration and learning from diverse data. Integration between and discovery across domains is challenging. Monolithic data systems can address some of these difficulties, but do not account for diversity across domains. This activity will take a hybrid approach: by describing the processes through which data is generated and recording the relationships between entities in those processes, data will be made accessible to models along with the context that makes it interpretable.
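
As an illustration of the first challenge above, the following is a minimal sketch of how secure multiparty computation and quantisation interact in a single linear-algebra primitive: a sum over privately held values. It assumes additive secret sharing over a ring of integers with fixed-point encoding. This is one common construction and not necessarily the one the activity will adopt. The names MODULUS, SCALE, encode, decode, share and secure_sum are hypothetical, chosen for this example only, and the sketch simulates all parties in one process rather than over a network.

```python
import secrets

# Hypothetical parameters, chosen for illustration only.
MODULUS = 2**61 - 1   # size of the integer ring the shares live in
SCALE = 2**16         # fixed-point scale: larger means finer quantisation


def encode(x: float) -> int:
    """Quantise a real number to a fixed-point element of the ring."""
    return round(x * SCALE) % MODULUS


def decode(v: int) -> float:
    """Map a ring element back to a real number.

    Elements in the upper half of the ring are read as negative values.
    """
    if v > MODULUS // 2:
        v -= MODULUS
    return v / SCALE


def share(value: int, n_parties: int) -> list[int]:
    """Split a ring element into n additive shares.

    The first n - 1 shares are uniformly random; the last is chosen so
    that all shares sum to the value modulo MODULUS.
    """
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def secure_sum(private_inputs: list[float]) -> float:
    """Simulate a secure sum in which party i holds private_inputs[i]."""
    n = len(private_inputs)
    # all_shares[i][j] is the share that party i sends to party j.
    all_shares = [share(encode(x), n) for x in private_inputs]
    # Each party j adds up the shares it received and publishes the result.
    partial = [
        sum(all_shares[i][j] for i in range(n)) % MODULUS
        for j in range(n)
    ]
    # Combining the published partial sums reveals only the total.
    return decode(sum(partial) % MODULUS)


if __name__ == "__main__":
    values = [0.1, 0.2, 0.3]
    total = secure_sum(values)
    # Each input is rounded to the nearest 1 / SCALE, so the total is off
    # by at most len(values) / (2 * SCALE) from the exact sum.
    print(total, abs(total - sum(values)))
```

The trade-off mentioned in the challenge shows up in SCALE: a larger scale reduces the rounding error per input but leaves less headroom in the ring before sums wrap around. A real deployment would also need authenticated channels and a threat model, which this sketch omits.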