Background and Motivation
Before we get into the improvements to MPT, let’s talk about the background of this research.
I am a PhD student and designing a public chain. In addition to the core consensus upgrade: the useful proof of work, another important task of this public chain is to implement a smart contract system compatible with ETH. We have currently implemented a virtual machine based on Python bytecode and integrated it into a blockchain contract system, and surprisingly achieved Ethereum RPC compatibility. We used Python to build a smart contract that complies with the ERC20 standard for testing. Naturally, we need to implement a KV data structure that can support tens of millions of users at the lower level of contract execution (similar to Mapping in Solidity, used to store the number of tokens under each account, and there may be hundreds of millions of such accounts).
Next, the work of public chain design naturally touched upon concepts such as MPT and trie. We referred to the design of Ethereum, three trees, trie, MPT, etc.
Finding bottlenecks
Fortunately, before we were ready to implement the smart contract calling MPT, we suddenly had the idea to run a simple MPT tree benchmark program on the AWS server. When we tried to insert 100 million KV data using Trie and MPT, we were very surprised to get the result: the performance of the MPT tree was so low.
Running the following code, we observed that inserting 10 million KV key-value pairs into Rocksdb takes several minutes with Trie and several hours with MPT. When we increase the test data size to 100 million, the MPT insertion program takes several days to run. This means that when blockchain uses the MPT data structure to run smart contracts, the speed will also be greatly limited.
Experiments have shown that the overhead of using the MPT data structure will not only slow down the execution of smart contracts, but also reduce the synchronization speed of blockchain nodes (even when bandwidth is sufficient, it takes several days to download data from other nodes and rebuild nodes). However, since Ethereum's data structure design is difficult to modify, even if we rewrite or optimize it with a "faster" programming language, it will be difficult for the blockchain to break through performance limitations without updating the underlying data structure.
test_mpt_write.py
test_mpt_write.py
I still remember when I was just learning to write code, teachers told us a basic principle, program = algorithm + data structure. If we limit the data structure, then even if we try our best to optimize the algorithm (which often requires years of academic efforts, and in many cases it is almost impossible to improve it), it is difficult to break through the performance limitations of the current data structure. Then the very academic improvement plan, looking for a better-performing MPT to replace the algorithm, may take years of research. As a predecessor working in this direction, Verkle Tree uses polynomials to optimize the blockchain data structure, and we will try some more unique and simple ideas.
If we adopt the first principles thinking method, can we do without MPT? If we can bypass MPT and implement smart contracts directly on Trie, there will be no overhead brought by MPT (Ma Yilong's first principles, a non-existent thing will not have performance overhead).
As a price, our newly designed solution may not be directly compatible with the existing MPT, but since the Ethereum RPC interface is not modified, a lot of existing Ethereum infrastructure can be reused: such as Etherjs and MetaMask. Bypassing MPT is an internal data structure optimization, which is a free exploration of blockchain performance.
Prerequisites
Rocksdb and Trie
Trie is an advanced tree data structure mentioned in university textbooks. In Mr. Xiao's video, MPT is built based on Trie. We can think that the implementation environment of Trie is in memory.
However, in our engineering, we cannot directly use Trie to implement the product, because we also need to make the data persistent on the hard disk. This step requires a lot of engineering optimization, so we generally implement the MPT tree based on Rocksdb.
Rocksdb is an open source fork of FB based on leveldb, and uses LSMT at the bottom layer (see the paper The log-structured merge-tree). From an abstract perspective, we can think of Rocksdb as an optimized, persistent dictionary tree.
The basic use of Rocksdb is very simple. The three basic operations used are Get put and Prefix Iterate query.
state machine
The state machine is a tool often used to model blockchains. It is very simple: add an input to an existing state to get a new state.
The global state of Ethereum can be understood as the state of the state machine. For example, in block 1, the KV value of all smart contracts is a global state. All transactions in a block are executed by EVM and can be understood as the input of the state machine. For example, a Mint contract call makes a user's Balance and the contract's Total variable become 1000 in block 2.
A state machine concept that is easily overlooked is called a state transition function, which defines the rules for input and rejects input information when the input is unreasonable. For example, when we try to transfer a negative amount to someone else, such a transaction will not be executed (of course, a smart contract with a bug can accept such a transaction. In ETH, the state transition function can be customized through a Turing-complete language).
MPT Introduction
You may have heard of the three Ethereum trees, namely the transaction tree, the state tree, and the receipt tree. They are all MPT trees, the full name is Merkle Patricia Tries. Among them, the state tree may be the most suitable use case for the MPT data structure. For details, please refer to Mr. Xiao’s video explanation.
The MPT tree is based on the Trie data structure and can calculate all state data into a root hash like the Merkle tree and put it in the Ethereum block header. The Ethereum block header contains three root hashes of the MPT tree, which we usually call three trees.
Is it possible to not put the root of the global state in the block header? Yes, Bitcoin blocks contain transactions, and the Merkle root of the transaction is placed in the block header (using the Merkle tree but not MPT). However, Bitcoin does not have smart contracts like Ethereum, nor does it have the concept of global state. When Ethereum first designed a blockchain with smart contracts, it abandoned Bitcoin's 1M block design and UTXO, chose the MPT data structure and account model, and used Gas to solve the shutdown problem, meeting the demand for global state in the operation of smart contracts.
So, what is the goal of using MPT in Ethereum?
Use the Merkle root in the block header to find the corresponding state of the blockchain.
Save space occupied by duplicate data.
The correct state can be verified by the root hash.
The first point is a rigid demand. When executing EVM, various states need to be read. If a design similar to Bitcoin UTXO is used, many blocks need to be backtracked to obtain the account balance status of a certain user. Using MPT can meet this demand very well.
Second, MPT saves space. The blockchain state can be regarded as a huge dictionary. In each block, only a small number of keys are updated, and most keys retain their original values. Using the MPT tree, only the key values that need to be updated can be updated, without the need to copy the unchanged state in each block.
The third point is that the advantage of MPT is that the state key value accessed by the root hash has completed the verification of the global state. This is also the main reason why it is time-consuming to write to the MPT tree. If MPT is not used, the node can still verify the consistency of the data during the synchronization block.
By not using MPT to complete points 1 and 2, we can bypass the MPT overhead. So we need to design a solution based on Trie (which can be understood as Rocksdb) to store the state data of the blockchain.
design
We mainly design a Trie-based solution to store on-chain status. According to the first principle, if there is no MPT data structure, there is no system overhead required to run MPT. As long as the Trie-based smart contract system can be implemented and run correctly, it means that the optimization has been completed. In practice, we can regard Rocksdb as a Trie, or more simply, a high-performance KV database.
Use [globalstate prefix_contract address_variable name_block height_block hash] as the key value of the KV database, such as the following format. 0x1 is the contract address, Total is the variable name, 1 is the block height, and ABC1234 is the block hash value (omitted later)
globalstate_0x1_total_1_abc1234
In Ethereum Solidity, variables can be defined as Memory, Storage, and Calldata. We focus on the Storage type because it will be stored in the global state. In addition to simple variables, we also consider Mapping, because the number of users on the blockchain is global, so there will be a very large KV mapping. In addition, we will treat data types such as Array as simple data structures and store them in Rocksdb directly after Json. When we are coding, we should naturally estimate the order of magnitude of Value data in the KV data structure to choose an appropriate storage method.
Contract storage state variables
How can we achieve the first and second design goals of MPT without using it?
Suppose we have a variable Total in the ERC20 contract. This variable will only change when the total number of tokens increases (Mint) or decreases (Burn), but the value of Total remains unchanged when user A transfers money to user B.
For simplicity, assume that the contract address is 0x1. When the block height is 1, the owner of the contract performs a Mint 2000. At block heights 2-8, the user performs multiple transfers. Now the block height is 9.
At this point, we only need to query the database for the key prefixed with [globalstate_0x1_total_]. Although our current latest block is 9, since the Total variable has not changed in blocks 2-8, the first result of the above query will be the value of [globalstate_0x1_total_1], so we have found the latest value of the Total variable, the Total variable that changed in block 1.
At this time, new blocks are generated. Suppose the user performs two more Mints at blocks 9 and 11. Here we find a feature of Rocksdb. If we have the following Key
globalstate_0x1_total_1 : 2000
globalstate_0x1_total_9 : 4000
globalstate_0x1_total_11 : 6000
Then the first result of the search is always block 1, not the latest block 11. Because we have not found a parameter in the Rocksdb documentation to change the order of search results, we used a little trick to use a larger number such as 100 to subtract the block height (it will actually be much larger) and fill it with zeros, so the latest block will be at the top of the search results.
globalstate_0x1_total_089 : 6000
globalstate_0x1_total_091 : 4000
globalstate_0x1_total_099 : 2000
Storage Mapping Type
As we mentioned earlier, the number of keys in the Mapping data structure may be huge, so we cannot Json the dictionary of tens of thousands of keys in the Mapping into a string, otherwise the cost of adding or deleting keys, or modifying values will be very terrible.
Assuming the user's address is 0x2, we slightly expand the previous storage Key format:
[globalstate_0x1_[balance_0x2]_2] : 2000
The previous [variable name] here is expanded to [variable name + Mapping dictionary Key]
The above data indicates that the Balance of user 0x2 at block height 2 is 2000. If there is no updated data to cover the current balance, the user's data in block 2 represents the latest status.
globalstate_0x1_balance_0x2_098 : 2000
globalstate_0x1_total_0x1_099 : 2000
Verify the block
Like the Merkle root, the MPT root also represents a verification of the global state. As long as we can ensure the consistency of block data, whether to use MPT and write the root into the block header can be a design choice.
We have made some improvements in the block design so that the block header contains two bodies, one is the transaction data block and the other is the status update data block.
The transaction data block contains the hash of all transactions. Miners or nodes can start from the global state of a certain block height, execute all transactions in the order given in the transaction data block, and obtain the global state of the next block. If the transaction volume is large, it is unlikely to be executed in parallel (because it will disrupt the transaction order), so it is time-consuming to require all nodes in the world to execute all transactions.
To this end, we designed a state update data block, which includes the updated global state key value after running all transactions. If it is a node that lags behind by many blocks, in addition to choosing to run the virtual machine to execute all transactions, it can also directly insert the key value into the database based on the content of the state update body to save computing power and shorten synchronization time.
Of course, the key values used to execute all transaction updates and the Diff contained in the status update block must be exactly the same, otherwise the new block is illegal.
Assuming there are 10,000 nodes in the world, when we roll the dice and set a 1% probability to execute a transaction, about 100 nodes will execute the block transaction, and the other nodes will trust the content of the status update data block, then many nodes in this blockchain system will be able to save a lot of repeated operations.
accomplish
After completing the previous design, we immediately set about realizing this idea.
First, we integrated Python VM into the blockchain system. This is a Python virtual machine implemented by Python, and it is currently compatible with PY 3.10 bytecode. We hope that smart contracts can use some of Python's syntax. For example, we only need functions and do not want developers to use classes, so we do not fully support all PY bytecodes at present.
Regarding compilers, Solidity needs to develop a dedicated compiler to convert source code into EVM bytecode. Using Python, with the help of this excellent infrastructure language with a history of 30 years, Python source code can be easily converted into PY bytecode, and users can hardly feel the compilation time.
Our VM code repository is as follows. We welcome your feedback and suggestions to fix potential security issues:
Link: https://github.com/EcoPoW/python_stf_vm In the second step, we developed a Python version of ERC20, and the code has been updated. Unlike Solidity, we did not use keywords to identify how variables are stored. All local variables are in memory, and we use _get and _put global functions to read and write states.Link: https://github.com/EcoPoW/BitPoW/blob/master/contract_erc20.py
https://github.com/EcoPoW/BitPoW/blob/master/state.py
Implementation and improvement
It took us about two months from posing the problem to design and coding implementation.
After a whole summer of design + actual coding, the new smart contract system, which only uses the dictionary tree Trie without relying on the MPT data structure, can already be put into operation (you can try to run the test node locally through Github clone). Bypassing the MPT-based smart contract storage means that under the condition of sufficient bandwidth, the synchronization time of a node with a size of 100 million KV can be reduced from several days to a few minutes. The execution efficiency of smart contracts will also become faster, and the amount of computing consumed by the CPU will be more used to execute bytecode rather than the hash operations required to build the Merkle tree. We are very lucky that when implementing smart contracts in engineering, we did not directly follow the "industrial standard" design, but first tried optimization and subtraction to obtain a smart contract blockchain with the correct "data structure". Because it is different from Web2 products, we compare it to changing tires while driving a car, and Web3 blockchain is more like a rocket launch. It is difficult to make major structural improvements to a running blockchain, so we can only improve the "next" system and make full preparations before going online.
At present, the Testnet2 test network of the BitPoW blockchain we designed has been deployed. Everyone can connect to this blockchain through RPC using MetaMask for testing and try ERC20 transfers based on Python VM. At the same time, we still have a lot of work to do, and we still need to rely on the community to promote this new blockchain infrastructure.
We welcome everyone to learn about and actually test the technical implementations driven by these new concepts, including performance testing after removing MPT and security testing of new smart contracts and new chains.
Summarize
So far, we have bypassed MPT and designed a smart contract storage solution based directly on Trie. At present, we have implemented this improvement on a blockchain based on Python vm as a proof of feasibility. It took us about three months from discovering the problem to proposing a solution to implementing it in code, but the more important thing about this research is that before this, like many ordinary people, we learned MPT knowledge from the classroom and did not think further about improving it. The solution is not complicated, but it requires practice and action. The solution is not complicated, but it requires practice and action. Only by actively thinking in practice can we find inspiration and further innovation. Thank you very much for LXDAO's support for this research, and I hope to meet more friends in the community who are interested in the underlying blockchain. This industry is still in its early stages and the infrastructure is far from mature. We hope to promote the popularization of the underlying knowledge of blockchain so that more people can participate in the forefront of technological innovation.