Encrypting On-Chain Data

Summary

Ensure the confidentiality of data stored on a blockchain by encrypting it.

Context

For some applications on a blockchain, there might be business-sensitive data that should be accessible only to the involved participants. An example would be a special discount price offered by a service provider to a subset of new users. Such information should not be accessible to the other users who do not get the discount.

Problem

Data privacy is one of the main limitations of blockchain. All the information on a blockchain is available to the participants of the blockchain network. There is no privileged user within the blockchain network, regardless of whether the blockchain is public, consortium, or private. On a public blockchain, new participants can join the blockchain network freely and access all the information recorded on the blockchain. How can the confidentiality of the data stored on a blockchain be ensured?

Forces

  • Transparency – Every participant within a blockchain network is able to access all the historical transactions on the blockchain, which is required to enable them to validate previous transactions. The transactions on a public blockchain are also accessible to anyone with Internet access, for example through a blockchain explorer such as Etherscan.
  • Lack of confidentiality – Because all the information on a blockchain is publicly available to everyone in the network, business-sensitive data meant to be kept confidential should not be stored on the blockchain, at least not in plain-text form.

Solution

To preserve the privacy of the involved participants, symmetric or asymmetric encryption can be used to encrypt data before it is inserted into the blockchain as transactions. One possible design for sharing encrypted data among multiple participants is as follows. First, one of the involved participants creates a secret key for encrypting data and distributes it off-chain during an initial key exchange. Neither the key nor the seed used to generate it should be shared on the blockchain. When one of the participants needs to add a new data item to the blockchain, they first symmetrically encrypt it using the secret key. Then the transaction with the encrypted data is submitted to the blockchain. Consequently, only the participants with access to the secret key can decrypt the data embedded in the transaction.
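
The design above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the cipher is a toy encrypt-then-MAC construction built from HMAC-SHA256 so that the example runs with the standard library alone, and it stands in for a vetted authenticated cipher such as AES-GCM from a maintained library. The function names, the record fields, and the idea that `tx_data` is what a transaction would carry are all assumptions made for the example.

```python
import hashlib
import hmac
import json
import secrets

# ILLUSTRATION ONLY: a toy encrypt-then-MAC cipher built from HMAC-SHA256.
# A real system should use a vetted AEAD cipher from a maintained library.

def _subkeys(secret_key: bytes):
    """Derive separate encryption and MAC keys from the shared secret."""
    enc_key = hmac.new(secret_key, b"enc", hashlib.sha256).digest()
    mac_key = hmac.new(secret_key, b"mac", hashlib.sha256).digest()
    return enc_key, mac_key

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Generate `length` pseudo-random bytes by running HMAC over a counter."""
    blocks = [
        hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        for counter in range((length + 31) // 32)
    ]
    return b"".join(blocks)[:length]

def encrypt(secret_key: bytes, plaintext: bytes) -> bytes:
    """Return nonce || ciphertext || tag, the payload to embed in a transaction."""
    enc_key, mac_key = _subkeys(secret_key)
    nonce = secrets.token_bytes(16)  # fresh per message
    stream = _keystream(enc_key, nonce, len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag

def decrypt(secret_key: bytes, payload: bytes) -> bytes:
    """Check the tag, then recover the plaintext; raise ValueError on mismatch."""
    enc_key, mac_key = _subkeys(secret_key)
    nonce, ciphertext, tag = payload[:16], payload[16:-32], payload[-32:]
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or tampered payload")
    stream = _keystream(enc_key, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))

# Step 1: one participant creates the secret key and distributes it off-chain.
secret_key = secrets.token_bytes(32)

# Step 2: a participant encrypts a data item before submitting the transaction.
item = json.dumps({"customer": "alice", "discount_price": 79}).encode()
tx_data = encrypt(secret_key, item)  # hypothetical on-chain payload

# Step 3: any holder of the secret key can decrypt what was read from the chain.
recovered = json.loads(decrypt(secret_key, tx_data))
```

A participant without `secret_key` sees only the opaque bytes in `tx_data`; calling `decrypt` with a different key raises `ValueError` instead of returning garbage.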

A graphical representation of encrypting on-chain data pattern

Benefits

  • Confidentiality – The publicly accessible information on the blockchain is encrypted, which prevents anyone without the secret key from interpreting it.

Drawbacks

  • Compromised key – Both symmetric and asymmetric encryption require off-chain key management. If keys are not properly managed, private or secret keys can be compromised and disclosed. Once the required private key or secret key is compromised, the encryption mechanism fails.
  • Access revocation – As data on a blockchain are immutable, read access to data cannot be revoked once the transaction is confirmed. Thus, as long as a participant keeps the secret key used to encrypt the data, that participant retains access to the encrypted data.
  • Immutable data – Even if stored in encrypted form, the critical data will remain in the blockchain forever. In addition to the risk of key compromise, the encrypted data may be subject to brute force decryption attacks in the future, or breakthroughs in technology like quantum computing might render current encryption technologies ineffective. Thus, even if the data are considered to be secure with a given key size at the time of storing in the blockchain, it may no longer be the case in the future.
  • Key sharing – The encryption key needs to be shared secretly off-chain before any relevant transaction is submitted to the blockchain. Although blockchain can be used as a software connector to communicate data, secret keys cannot be shared through blockchain because a key communicated that way would be publicly accessible.
  • Limits utility of smart contracts – Smart contracts cannot interpret the data without the secret key, which is kept off-chain. Hence, this limits the amount of on-chain computation on data. While homomorphic encryption enables certain computations to be performed on encrypted data, adding such computations to smart contracts increases their cost, e.g., complex smart contracts consume more gas on Ethereum.
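
To make the remark about homomorphic encryption concrete, the following sketch shows an additively homomorphic scheme in miniature. It assumes the Paillier cryptosystem, chosen here only as a well-known example (the pattern does not prescribe a scheme), and uses tiny hard-coded primes so the numbers stay readable, which makes it completely insecure. It runs as ordinary off-chain Python (3.8 or later); it is not smart contract code, and all names are hypothetical.

```python
import math
import secrets

# Toy Paillier key with tiny primes: readable, but trivially breakable.
p, q = 293, 433
n = p * q
n_sq = n * n
lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
mu = pow(lam, -1, n)  # modular inverse; valid for the generator g = n + 1

def encrypt(m: int) -> int:
    """Encrypt an integer 0 <= m < n under the public key n."""
    while True:
        r = secrets.randbelow(n - 1) + 1  # random value in 1..n-1
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n_sq) * pow(r, n, n_sq)) % n_sq

def decrypt(c: int) -> int:
    """Decrypt with the private values lam and mu."""
    return ((pow(c, lam, n_sq) - 1) // n) * mu % n

def add_encrypted(c1: int, c2: int) -> int:
    """Combine two ciphertexts so the result decrypts to the sum (mod n)."""
    return (c1 * c2) % n_sq

# Whoever runs add_encrypted never sees 20, 22, or the private key.
total = decrypt(add_encrypted(encrypt(20), encrypt(22)))
```

Here `total` equals 42 even though the addition was carried out on ciphertexts. Each such addition is a modular multiplication of large numbers, and a realistic key size makes those numbers thousands of bits long, which gives a sense of why performing this work inside a smart contract is costly.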

Related patterns

  • The blockchain anchor pattern can be used to store data off-chain and submit only a cryptographic representation of data (in the form of a hash) to the blockchain.
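
As a rough sketch of that alternative, the snippet below keeps a record in a hypothetical off-chain store and produces only its SHA-256 digest for submission to the blockchain. The in-memory dictionary, the function names, and the JSON serialisation are assumptions made for illustration.

```python
import hashlib
import json

# Hypothetical stand-in for an off-chain database.
off_chain_store = {}

def anchor(record: dict) -> str:
    """Store the record off-chain and return the digest to put on-chain."""
    raw = json.dumps(record, sort_keys=True).encode()
    digest = hashlib.sha256(raw).hexdigest()
    off_chain_store[digest] = raw
    return digest

def verify(digest: str) -> bool:
    """Check that the off-chain copy still matches the on-chain digest."""
    raw = off_chain_store.get(digest)
    return raw is not None and hashlib.sha256(raw).hexdigest() == digest

on_chain_value = anchor({"customer": "alice", "discount_price": 79})
```

Only `on_chain_value` (64 hexadecimal characters) would appear in a transaction, while `verify` lets a participant with access to the off-chain copy confirm that it has not been altered.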

Known uses

  • Encrypted queries from Oraclize (now known as Provable). Oraclize is a smart contract running on the Ethereum public blockchain that provides a service for accessing state from the external world. Oraclize allows smart contract developers to encrypt the parameters of their queries locally by using a public key before passing them to a smart contract. Only Oraclize, which holds the paired private key, can decrypt the call parameters.
  • MLGBlockchain's crypto digital signature encrypts data and shares the data between the parties who interact and transmit data through the blockchain.
  • Hawk is a smart contract system that stores transactions as encrypted data on the blockchain. The Hawk compiler can automatically generate a cryptographic protocol for a smart contract. The involved participants interact with the blockchain following the cryptographic protocol.