Encrypted-Data-Based Trainer

Summary: An encrypted-data-based trainer trains AI models directly on encrypted data, without decrypting it, by using homomorphic encryption techniques.

Type of pattern: Product pattern

Type of objective: Trustworthiness

Target users: Data scientists

Impacted stakeholders: RAI governors, AI users, AI consumers

Lifecycle stages: Design

Relevant AI ethics principles: Privacy protection and security

Context: Training data often raises privacy concerns because it is sensitive and should not be revealed. For example, a medical model for disease prediction is trained on patients’ private medical records, which must be protected and not disclosed to unauthorized parties.

Problem: How can we preserve the privacy of training data during the training process?

Solution: Homomorphic encryption is a privacy-preserving technique that enables computations to be performed directly on ciphertexts, producing the same outcome as if they had been performed on the original plaintext data. For example, when the numbers 1 and 2 are encrypted and their ciphertexts are combined using a homomorphic encryption algorithm, decrypting the result yields 3. This technique can be used to encrypt training data, allowing AI models to learn from the encrypted data without the need for decryption.
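To make the 1 + 2 = 3 example concrete, the sketch below implements the additively homomorphic Paillier scheme in Python with tiny, insecure, hard-coded primes purely for illustration; it is not the pattern's prescribed algorithm, and real systems use vetted libraries with much larger parameters. In Paillier, multiplying two ciphertexts corresponds to adding the underlying plaintexts.

    # Toy Paillier encryption with tiny primes (illustration only, not secure).
    import math
    import random

    def generate_keys():
        p, q = 11, 13                                        # real deployments use ~2048-bit primes
        n = p * q                                            # public modulus
        g = n + 1                                            # standard generator choice
        lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)    # Carmichael's lambda(n)
        # mu = (L(g^lam mod n^2))^-1 mod n, where L(x) = (x - 1) // n
        mu = pow((pow(g, lam, n * n) - 1) // n, -1, n)
        return (n, g), (lam, mu)

    def encrypt(pub, m):
        n, g = pub
        r = random.randrange(1, n)
        while math.gcd(r, n) != 1:                           # r must be coprime with n
            r = random.randrange(1, n)
        return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)

    def decrypt(pub, priv, c):
        n, _ = pub
        lam, mu = priv
        return ((pow(c, lam, n * n) - 1) // n * mu) % n

    pub, priv = generate_keys()
    c1, c2 = encrypt(pub, 1), encrypt(pub, 2)
    c_sum = (c1 * c2) % (pub[0] ** 2)                        # ciphertext "addition"
    print(decrypt(pub, priv, c_sum))                         # prints 3

Fully homomorphic schemes such as BFV or CKKS extend this idea to support both addition and multiplication on ciphertexts, which is what training over encrypted data requires.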

Benefits:

  • Data privacy: The training data remains encrypted throughout training, which reduces the risk of compromise. 
  • Data usability: There is no need to modify the features of the training data to preserve privacy. 

Drawbacks:

  • Training inefficiency: Training neural networks on data encrypted with homomorphic encryption algorithms can be challenging, as the homomorphic operations add substantial computational overhead and slow training down considerably.  
  • Lack of visibility: Once the training data is encrypted, it can be hard for data scientists to monitor the training process, for example to identify and correct mislabeled data. 

Related patterns:

  • Secure aggregator: The secure multi-party computation techniques used in secure aggregation are a form of homomorphic encryption applied in a multi-party setting.

Known uses:

  • IBM Homomorphic Encryption Services enables AI models to be trained on sensitive data without exposing that data to the AI model training environment.
  • Microsoft SEAL is a homomorphic encryption library that allows computations, including model training, to be performed on encrypted data (see the sketch after this list).
  • Google’s Fully Homomorphic Encryption toolkit enables secure computation on encrypted data without requiring the data to be decrypted or shared in order to perform operations on it.
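As a hedged illustration of the kind of encrypted computation such libraries support, the sketch below uses TenSEAL, an open-source Python wrapper around Microsoft SEAL; the specific parameters, feature values, and weights are assumptions chosen for the example rather than part of the pattern. It computes a single linear-layer score entirely on ciphertexts, the kind of operation that training over encrypted data repeats many times.

    import tenseal as ts

    # CKKS context for approximate arithmetic over encrypted real numbers.
    context = ts.context(
        ts.SCHEME_TYPE.CKKS,
        poly_modulus_degree=8192,
        coeff_mod_bit_sizes=[60, 40, 40, 60],
    )
    context.global_scale = 2 ** 40
    context.generate_galois_keys()                 # needed for the rotations behind dot products

    # An encrypted training example and (hypothetical) encrypted model weights.
    enc_features = ts.ckks_vector(context, [1.0, 2.0, 3.0])
    enc_weights = ts.ckks_vector(context, [0.5, -1.2, 0.3])

    # One linear-layer score, computed entirely on ciphertexts.
    enc_score = enc_features.dot(enc_weights)
    print(enc_score.decrypt())                     # approximately [-1.0]

Even this single dot product is far slower than its plaintext counterpart, which illustrates the training-inefficiency drawback noted above.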