Written by: Miranda Christ, Joseph Bonneau
Compiled by: TechFlow
As blockchains support more users and more frequent transactions, the amount of information (i.e., “state”) that validators store to verify transactions also increases. For example, in Bitcoin, the state consists of a set of unspent transaction outputs (UTXOs). In Ethereum, the state consists of the account balance of each account and the code and storage of each smart contract.
As a blockchain grows to enough accounts or UTXOs to support real daily transactions for a significant portion of the population, this storage burden will become unmanageable, making it difficult to become a validator and posing a threat to decentralization. The tempting solution is to turn to cryptography, where tools such as Merkle trees and zero-knowledge proofs help us achieve things that were previously unimaginable.
This is exactly what “stateless blockchains” aim to achieve. But despite much research into them, they are still far from practical. But it turns out that this lag in progress is inherent — the gap between these constructions and practicality will never be bridged. Our recent work shows that without additional measures to manage state, any stateless blockchain scheme, no matter how clever, is not feasible. However, as we show at the end of this article, this impossibility result should not discourage people.
no status
Today, the state is large but manageable. For example, Bitcoin nodes store about 7 GB of data and Ethereum nodes store about 650 GB of data. However, the storage burden of a full node grows roughly linearly with the throughput of the chain (transactions per second, or TPS), and today's throughput is still unacceptable. With current designs, the state required to truly support everyday transactions (tens to hundreds of thousands of transactions per second) would become unmanageable, requiring gigabytes or even petabytes of storage.
This has prompted people to look for technical ways to significantly reduce the amount of state required by validators. It is crucial to implement a stateless blockchain, which would require validators to only store a constant size of state, regardless of transaction throughput (actually, this term is a misnomer: there is still state, just small enough, so as to be practical at any future throughput - usually constant size). This lightweight storage requirement will make running a validator node much easier; optimistically, everyone will be able to run a node on their phone. Since increasing the number of validators increases the security of the chain, it is important to lower the barrier to entry for validators.
Despite a lot of research on stateless blockchains (e.g., by Todd, Buterin, Boneh et al., and Srinivasan et al.), they are still far from being practical, and to our knowledge, none have been deployed. The fundamental problem with all known stateless blockchains is that they require users to store additional data called a witness to help validators verify transactions involving their accounts. For example, this witness might be a Merkle inclusion proof showing that the user's account and balance are included in the global state commitment. When users make a transaction, they submit this witness to the validator to show that their account has a sufficient balance.
Unlike stored private keys, which never need to change, these witnesses change frequently, even for users who are not actively transacting, placing an impractical burden on users. Similarly, imagine if you were constantly monitoring all other credit card transactions around the world and updating some local data accordingly for use of your own credit card. For blockchains to be practical, users must be able to go offline and interact with the blockchain only when submitting transactions. In many cases, such as hardware wallets, updating witnesses is not just inconvenient, it is impossible.
This leads us to a natural research question: Can we build a stateless blockchain that does not require frequent witness updates (or a blockchain that only requires infrequent witness updates)? To answer this question, we develop a novel theoretical framework (revocable proof systems) that generalizes stateless blockchains. Using this framework, we prove an impossibility result: the tradeoff between a succinct global state and frequent witness updates is intrinsically difficult to reconcile. Our proof technique is information-theoretic, which means that future computers will not be powerful enough to solve this problem: the gap between stateless blockchain construction and practicality will never be bridged.
Background of our research
To help understand our impossibility results, we first describe a natural but inefficient way to construct a stateless blockchain using Merkle trees. Our goal is to enable validators to determine whether a transaction submitted by a user is valid — for example, whether the user has sufficient account balance to make the transaction. In a stateless blockchain scheme, validators store a constant-sized state. When a user makes a transaction, they must include a witness with the transaction. The validator can use the current state and the (transaction, witness) pair submitted by the user to verify whether the user has sufficient account balance to make the transaction.
We first construct a Merkle tree where each (account ID, balance) pair (a, b) is included as a leaf node. The constant-size state V stored by the validator is the root of this tree, which serves as a commitment to a set of account balance pairs. Each user maintains its witness as a Merkle inclusion proof. The Merkle inclusion proof of a leaf (a, b) consists of the partner nodes (v1, …, vk) along the path from that leaf to the root of the tree. Given a transaction made by account a and a claimed balance b, the validator can verify that b is indeed the balance of account a by checking the proofs (v1, …, vk) of (a, b) against its current state V. If so, the validator executes the transaction and must update the account's balance accordingly. A convenient property of Merkle trees is that given a Merkle inclusion proof for a leaf, it is easy to compute the resulting root when that leaf changes. In other words, the validator can easily compute an updated state V' that captures the new balance of account a after the transaction is executed.
This Merkle tree scheme has two major drawbacks. First, user witnesses are relatively large, growing logarithmically with the total number of accounts in the system. Ideally, they should be of constant size, which we can achieve using schemes such as RSA accumulators.
The second drawback is more difficult to avoid: the proof of the account balance pair changes every time another user makes a transaction. Recall that the proof of a leaf node consists of the partner nodes on the path from that leaf node to the root of the tree. If the other leaf nodes change, one of them will change, which poses a practical problem. Most blockchain users want to passively keep their coins in their wallets and only go online when they want to make a transaction. However, in such a stateless blockchain, users must constantly monitor the transactions of others to keep their witness information up to date. Although third parties can do this monitoring on behalf of users, this deviates from the standard stateless blockchain model. In practice, this is an insurmountable challenge for stateless blockchains and places a heavy burden on users.
Our conclusion: statelessness is impossible
This phenomenon does not only apply to this Merkle tree structure - all known stateless blockchain schemes require users to frequently update their witnesses, as we demonstrate here. More precisely, we show that the number of users who must update their witnesses grows roughly linearly with the total number of transactions made by all users.
This means that even if user Alice does not make any transactions, her witness may need to change as other users' transactions occur. As long as the succinct state stored by validators is too small to capture the complete state (i.e., the set of all account balances), increasing the size of the succinct state will not help much. We plot this relationship shown below, along with the number of witness changes required per day for blockchains with different throughputs, based on our theorem. The graphs show the number of witness changes required for an optimal stateless blockchain. Here, the data universe refers to the total number of accounts in the account model or the total number of UTXOs in the UTXO model.
At the heart of our proof is an information-theoretic argument. A core principle of information theory, formalized by Claude Shannon, is that if Alice randomly chooses an object from a set of size 2n, and wishes to tell Bob which object she chose, she must send him at least n bits. If there exists a stateless blockchain scheme where users rarely update their witnesses, Alice can tell Bob which object she chose in less than n bits, which is impossible, as Shannon proved. Therefore, such a stateless blockchain does not exist.
For simplicity, we describe here a proof of a slightly weaker statement: there is no stateless blockchain where users never need to update their witness information. The key is that Alice uses a stateless blockchain scheme to encode her message to send to Bob. Initially, Alice and Bob both know the complete set of account balance pairs for all n users. Assume that each account has at least one coin. Alice and Bob also both know the concise state V of the stateless blockchain and the witnesses wi for all account balance pairs (ai, bi). Alice and Bob also agree on a mapping between messages and sets of accounts. Alice will choose a set of accounts A that corresponds to her message, and she will then spend coins from these accounts. She will use the stateless blockchain to communicate her chosen set to Bob, and he can learn her message from this set.
Encoding: Alice spends one coin from each account in A. Using the stateless blockchain scheme, Alice computes the updated state V' and sends it to Bob.
Decoding: For each i, Bob checks whether Verify(wi, (ai, bi)) is true. Bob outputs the set of accounts B such that Verify(wi, (ai, bi)) = false.
Bob successfully outputs the same set that Alice chose: B = A. First, observe that if Alice spends a coin from account ai, the witness of its old balance should no longer be accepted - otherwise, Alice would be able to double-spend. Therefore, for every account ai in A, Verify(wi, (ai, bi)) = false, and Bob will include this account in B. On the other hand, Bob will never include accounts from which Alice has not spent coins in B, because the balances of these accounts remain unchanged, and (recall the relaxed statement we are going to prove) their witnesses never change. Therefore, B is exactly equal to A.
Finally, we resolve our contradiction by calculating the number of bits Alice should have sent to Bob. There are 2^n subsets she could have chosen, so according to Shannon's law, she should have sent at least n bits to Bob. However, she only sent a constant-sized state V', which is much shorter than n bits.
Although we use a stateless blockchain when describing our proofs, Alice and Bob can perform similarly efficient communication using a variety of other authenticated data structures, including accumulators, vector commitments, and authenticated dictionaries. We formalize this class of data structures using a new abstraction we call revocable proof systems.
Impact of the results
Our results show that you can’t “cryptographically eliminate state” and that there is no silver bullet commitment scheme that allows us to build a stateless blockchain where users never have to update their witnesses. State doesn’t disappear, it’s transferred out of the hands of validators and pushed to users in the form of frequent witness updates.
There are other promising solutions that deviate from the strict stateless blockchain model. A natural relaxation of this model is to allow a third party (neither a user nor a validator) to be responsible for storing the complete state. This third party, called an attestation service node, uses the complete state to generate up-to-date witness information for the user. Users can then use this witness information to make transactions just as they would in a regular stateless blockchain, where the validator still only stores the succinct state. The incentive mechanism for this system, in particular how users compensate the attestation service node, is an interesting open research direction.
While our discussion so far has focused on L1 blockchains, our results also have implications for L2 systems such as Rollup servers. Rollups (whether Optimistic or ZK) typically use a small value to store a commitment to a large state on L1. This state includes the accounts of each user on L2. We want these users to be able to withdraw funds directly on L1 (without the cooperation of the L2 server) by publishing witnesses to their current account balances. This setting is also an instance of a revocable proof system in our model. In fact, it can be said that stateless blockchains have already been implemented in practice, in the form of L2 Rollups.
Unfortunately, however, this means that our impossibility result applies directly. A user’s Rollup withdrawal witness must change frequently, otherwise almost the entire L2 state must be written to L1. As a result, today’s Rollups typically assume the existence of a Data Availability Committee (sometimes called a “validium”), similar to a “proof service node”, that helps users compute new witness information when they prepare to withdraw. Our results show that the Ethereum documentation’s warning to users: “Without transaction data, users cannot compute Merkle proofs to prove ownership of funds and perform withdrawals.” will always apply.