Data Migration Patterns

Disclaimer: This is a summary of patterns we have observed during our research and should not be considered any form of technical or investment advice. The “known examples” given for a pattern are not necessarily its best implementations, nor are they in any way superior to implementations that are not listed.

With the rapid evolution of the technological, economic, and regulatory landscapes, contemporary blockchain platforms are all but certain to undergo significant changes. Applications that rely on them will therefore eventually need to migrate from one blockchain instance to another to remain competitive and secure. Data migration may also be required to improve business processes, performance, cost efficiency, privacy, and regulatory compliance. However, differences in data and smart contract representations, modes of hosting, and transaction fees, together with the need to preserve consistency, immutability, and data provenance, introduce challenges beyond those of database migration. The following collection presents a set of migration patterns that address these scenarios and data management challenges.

Figure: Overview of blockchain data migration patterns.

We explain the patterns in the context of the data migration architecture illustrated in the figure below. As with database migration, we envision that the migration team will use a tool (either developed in-house or off-the-shelf) to simplify the migration process. The migration tool could follow the Extract, Transform, and Load (ETL) process to copy data from the source blockchain and recreate it on the target blockchain. Because the data representations of the source and target blockchains may be incompatible, and because migration involves creating new accounts and smart contracts and replaying transactions, changes may be needed at the Blockchain Access/API Layer (BAL).

Like the data access layer in databases, the BAL abstracts the connectivity to the blockchain. It may also map application-level references to blockchain identifiers (IDs), because the two differ substantially. For example, a username used by a DApp needs to be mapped to the user’s address or public key on the blockchain. This mapping from application-level references to blockchain IDs is usually maintained in a protected database within the BAL, which we refer to as the ID database. When the application holds the user’s private key (e.g., in a custodial wallet), the keys may also be maintained in this database.

Therefore, in addition to updating the BAL to integrate the target blockchain, the ID database within the BAL needs to be updated during the migration to reflect the new account and smart contract addresses, keys, and transaction IDs. Moreover, the ID database can be used to identify which accounts, states, transactions, and smart contracts to migrate, because blockchains, in order to preserve anonymity, do not keep track of applications and their users. The BAL and its ID database are thus likely to be an integral part of the migration architecture. The dotted lines in the figure show the flow of account (accID), smart contract (scID), and transaction (txID) identifiers from/to the ID database.
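To make the role of the ID database concrete, the following is a minimal Python sketch of one ETL pass that is driven by the ID database and rewrites its mappings as records are recreated on the target. Everything in it is an assumption made for illustration: the `IdDatabase` class, the `migrate` function, the in-memory dictionaries standing in for the source and target blockchains, and the `tgt-...` ID scheme are hypothetical and do not correspond to any particular blockchain API or migration tool.

```python
from dataclasses import dataclass, field


@dataclass
class IdDatabase:
    """Hypothetical ID database kept inside the BAL.

    Each table maps an application-level reference to a blockchain ID.
    """
    accounts: dict = field(default_factory=dict)      # username -> accID
    contracts: dict = field(default_factory=dict)     # contract name -> scID
    transactions: dict = field(default_factory=dict)  # app tx reference -> txID

    def tables(self):
        # Order matters only for this sketch: accounts, then contracts,
        # then transactions.
        return {
            "accID": self.accounts,
            "scID": self.contracts,
            "txID": self.transactions,
        }


def migrate(id_db, source, target, transform):
    """Run one ETL pass and return a mapping of old ID -> new ID.

    `source` and `target` are plain dicts (ID -> record) standing in for the
    two blockchains; a real tool would call each chain's client API instead.
    """
    remapped = {}
    for kind, table in id_db.tables().items():
        for app_ref, old_id in list(table.items()):
            record = source[old_id]                # Extract from the source
            new_record = transform(kind, record)   # Transform the representation
            # Load: the target assigns a new ID (made-up scheme for this sketch).
            new_id = f"tgt-{kind}-{len(target)}"
            target[new_id] = new_record
            table[app_ref] = new_id                # Update the ID database
            remapped[old_id] = new_id
    return remapped


# Example run with made-up identifiers and records.
id_db = IdDatabase(
    accounts={"alice": "0xA1"},
    contracts={"escrow": "0xC1"},
    transactions={"order-17": "0xT1"},
)
source_chain = {
    "0xA1": {"balance": 10},
    "0xC1": {"code": "v1"},
    "0xT1": {"amount": 3},
    "0xZZ": {"balance": 99},  # not referenced by the application
}
target_chain = {}
old_to_new = migrate(
    id_db, source_chain, target_chain,
    lambda kind, rec: dict(rec, migrated=True),
)
```

Only the three records referenced in the ID database are copied; the unreferenced `0xZZ` entry stays behind, which mirrors how the ID database selects what to migrate. The returned old-to-new mapping is one way a tool could keep a trace of which source ID each target ID came from.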

Figure: Blockchain data migration architecture.

Pattern Collection