Original title: Possible futures of the Ethereum protocol, part 1: The Merge
By: Vitalik Buterin
Compiled by: Tia, Techub News
Originally, “the Merge” referred to the transition from proof of work to proof of stake. Ethereum has been running stably as a proof of stake system for nearly two years now, and proof of stake has performed very well in terms of stability, performance, and avoiding centralization risks. However, there are still some areas where proof of stake needs improvement.
In my 2023 roadmap, "the Merge" covers two kinds of work: technical improvements to proof of stake, such as better stability, performance, and accessibility for smaller validators, and economic changes to address centralization risks.
This post will focus on “the Merge”: how can the technical design of Proof of Stake be improved, and what are the ways to achieve these improvements?
Please note that this is a list of ideas and not an exhaustive list of things that need to be done for Proof of Stake.
Single slot finality and democratizing staking
What problem are we solving?
Currently, it takes 2-3 epochs (about 15 minutes) to finalize a block, and 32 ETH is required to become a staker. This is a compromise made to balance three goals:
Maximize the number of validators participating in staking (which directly means minimizing the minimum amount of ETH required to stake)
Minimize finalization time
Minimize the overhead of running a node
These three goals are in conflict with each other: in order to achieve economic finality (i.e. a situation where an attacker would need to destroy a large amount of ETH to revert a finalized block), every validator needs to sign two messages each time finality is achieved. So if you have many validators, either it takes a long time to process all the signatures, or you need very powerful nodes to process all the signatures at the same time.
Ultimately, all of this serves one goal: making a successful attack extremely costly. This is what the term "economic finality" means. If we did not care about this goal, we could simply have a randomly selected committee finalize each slot (as Algorand does). But the problem with that approach is that if an attacker controls 51% of the validators, they can attack (revert finalized blocks, censor, or delay finality) at very low cost: only the portion of their nodes that happen to sit on the committee can be detected as participating in the attack and punished, whether through slashing or a minority soft fork. This means the attacker could attack the chain over and over again. So if we want economic finality, a naive committee-based design will not work.
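To make the cost asymmetry concrete, here is a toy comparison of the slashable stake in the two designs (all numbers are illustrative, not protocol values):

```python
# Toy comparison of slashable attack cost: full-validator finality vs. a
# randomly sampled committee. All numbers are illustrative, not protocol values.

def full_finality_attack_cost(total_stake_eth: float) -> float:
    """Reverting finality requires conflicting votes from >=2/3 of stake,
    so at least 1/3 of ALL stake is provably slashable."""
    return total_stake_eth / 3

def committee_attack_cost(committee_stake_eth: float) -> float:
    """With a sampled committee, only the attacker's validators that happen
    to sit on the committee can be punished."""
    return committee_stake_eth / 3

total = 33_000_000       # hypothetical total ETH staked
committee = 8_192 * 32   # hypothetical committee: 8192 validators at 32 ETH

print(full_finality_attack_cost(total))      # 11,000,000 ETH at risk
print(committee_attack_cost(committee))      # ~87,000 ETH at risk per attack
```

The second number is what makes repeated attacks affordable for a 51% attacker under a naive committee design.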
At first glance, we do need all validators to participate.
But ideally, we can still achieve economic finality while improving the following two parts:
Finalize blocks within a single slot (ideally keeping, or even shortening, the current 12-second slot length), rather than in 15 minutes
Allow validators to stake with 1 ETH (down from today's 32 ETH minimum)
These goals can both be thought of as "bringing Ethereum's performance closer to that of (more centralized) performance-focused L1s."
But Ethereum would still use a higher-security finalization mechanism to protect all of its users. Today, most users do not enjoy that level of security because they are unwilling to wait 15 minutes; with single-slot finality, transactions would be finalized almost as soon as they are confirmed. Second, if users and applications no longer have to worry about chain rollbacks, the protocol and its surrounding infrastructure can be simplified, since there are fewer cases to account for.
The second goal is to support solo stakers (users who stake independently rather than relying on institutions). The main factor preventing more people from solo staking is the 32 ETH minimum. Lowering the minimum to 1 ETH will solve this problem to the point where other issues become the main factor limiting solo staking.
But there is a challenge: faster finality and more democratized staking both conflict with the goal of minimizing overhead. This is why we did not adopt single-slot finality from the start. However, recent research has suggested some possible solutions to this problem.
What is it and how does it work?
Single-slot finality refers to a consensus algorithm that finalizes a block within a single slot. This is not a difficult goal in itself: many algorithms (such as Tendermint consensus) already achieve this with optimal properties. However, Tendermint does not have the "inactivity leaks" property that is unique to Ethereum. This property allows Ethereum to continue to operate and eventually recover even if more than 1/3 of the validators are offline. Fortunately, there are already solutions to achieve this property: there are proposals to modify Tendermint-style consensus to accommodate inactivity leaks.
Single slot finality proposal
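The inactivity leak mentioned above can be illustrated with a toy simulation. The leak rate below is invented for illustration (the real mechanism penalizes offline validators on a quadratically growing schedule); the point is only that the offline share shrinks until the online validators regain a 2/3 supermajority:

```python
# Toy inactivity leak: offline validators' stake decays each epoch until the
# online set again controls >= 2/3 of remaining stake and can finalize.
# The 1%-per-epoch leak rate is invented; the real penalty schedule differs.

online, offline = 60.0, 40.0   # stake, in arbitrary units; >1/3 is offline
epochs = 0
while online / (online + offline) < 2 / 3:
    offline *= 0.99   # illustrative leak rate per epoch
    epochs += 1

print(epochs)  # number of epochs until finality can resume
```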
The hardest part is making single-slot finality work with a very large validator count without imposing extremely high overhead on node operators. There are currently several proposed solutions:
Option 1: Brute Force — Work towards a better signature aggregation protocol, perhaps using ZK-SNARKs, which would essentially allow us to process signatures from millions of validators per slot.
Horn, one of the protocols designed to optimize aggregation
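To give intuition for why aggregation makes millions of signatures tractable, here is an insecure toy model of the homomorphic structure involved. Real BLS aggregation works over elliptic-curve pairings; the integers-mod-a-prime arithmetic below is purely illustrative and has no security, but it shows how N signatures collapse into one verification:

```python
# Insecure toy of BLS-style signature aggregation over integers mod a prime.
# Signatures add together, so ONE aggregate check stands in for N checks.

P = 2**127 - 1   # toy field modulus (a Mersenne prime)
G = 7            # toy "generator"

def keypair(sk: int):
    return sk, (sk * G) % P          # (secret key, public key)

def sign(sk: int, h: int) -> int:
    return (sk * h) % P              # h stands in for H(message)

def aggregate(sigs) -> int:
    return sum(sigs) % P             # signatures simply add

def verify_agg(agg_sig: int, pks, h: int) -> bool:
    # Toy analogue of the pairing check e(agg_sig, G) == e(sum(pks), H(m))
    return (agg_sig * G) % P == ((sum(pks) % P) * h) % P

h = 123456789
keys = [keypair(sk) for sk in (11, 22, 33)]
sigs = [sign(sk, h) for sk, _ in keys]
print(verify_agg(aggregate(sigs), [pk for _, pk in keys], h))  # True
```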
Option 2: Orbit committees, a new mechanism that randomly selects medium-sized committees to finalize the chain, while still preserving a high economic cost of attack.
One way to think about Orbit SSF is that it opens up a middle ground: it does not insist on the full economic finality of the all-validator design, nor does it settle for the weak guarantees of an Algorand-style committee; it retains a large enough economic cost of attack that Ethereum still has sufficient economic finality for strong security, while also gaining the efficiency of a single-slot design.
Orbit exploits pre-existing heterogeneity in validator deposit sizes to obtain as much economic finality as possible, while still giving small validators a role to participate. Additionally, Orbit uses a slow committee rotation mechanism to ensure a high overlap between adjacent quorums, ensuring that its economic finality still applies across committee rotations.
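A heavily simplified sketch of the sampling idea (this is NOT the actual Orbit mechanism, and the parameters are invented): each validator joins the committee with probability proportional to its balance, capped at 1, so large validators are essentially always present, preserving most of the economic weight, while small validators still participate a fraction of the time.

```python
import random

# Simplified, hypothetical stake-weighted committee sampling in the spirit
# of Orbit: inclusion probability scales with balance and caps at 1.

def sample_committee(balances, base_rate, rng):
    committee = []
    for i, bal in enumerate(balances):
        p = min(1.0, (bal / 32) * base_rate)  # a 32-ETH validator joins w.p. base_rate
        if rng.random() < p:
            committee.append(i)
    return committee

rng = random.Random(42)
balances = [32] * 900 + [2048] * 10   # many small validators, a few large
committee = sample_committee(balances, base_rate=0.25, rng=rng)

committee_stake = sum(balances[i] for i in committee)
total_stake = sum(balances)
# The committee's share of stake far exceeds its share of the validator count,
# which is what lets a medium-sized committee carry high economic weight.
```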
Option 3: Two-tier staking, which divides stakers into two categories, one with a higher deposit requirement and one with a lower deposit requirement. Only stakers in the higher deposit requirement tier would have direct economic finality. There have been some proposals (e.g., see the Rainbow staking post) to specify what rights and responsibilities stakers in the lower deposit tier would need. Common ideas include:
Delegating attestation rights to higher-tier stakers
Randomly sampling lower-tier stakers to attest to and finalize each block
The right to produce inclusion lists
What are the connections with existing research?
Path to Single Slot Finality (2022): https://notes.ethereum.org/@vbuterin/single_slot_finality
Specific proposal for Ethereum Single Slot Finality Protocol (2023): https://eprint.iacr.org/2023/280
Orbit SSF: https://ethresear.ch/t/orbit-ssf-solo-staking-friendly-validator-set-management-for-ssf/19928
Further analysis of Orbit style mechanisms: https://notes.ethereum.org/@anderselowsson/Vorbit_SSF
Horn, Signature Aggregation Protocol (2022): https://ethresear.ch/t/horn-collecting-signatures-for-faster-finality/14219
Signature Merging for Large-Scale Consensus (2023): https://ethresear.ch/t/signature-merging-for-large-scale-consensus/17386?u=asn
Signature aggregation protocol proposed by Khovratovich et al.: https://hackmd.io/@7dpNYqjKQGeYC7wMlPxHtQ/BykM3ggu0#/
Signature aggregation based on STARK (2022): https://hackmd.io/@vbuterin/stark_aggregation
Rainbow Staking: https://ethresear.ch/t/unbundling-staking-towards-rainbow-staking/18683
What else needs to be done? What are the trade-offs?
There are four paths to choose from (we can also take a hybrid path):
Maintaining the status quo
Orbit SSF
Brute force SSF
SSF with two-tier staking mechanism
(1) means doing nothing and leaving things as they are, but this would make Ethereum’s security experience and staking centralization properties worse than they otherwise would be.
(2) avoids "high tech" and solves the problem by cleverly rethinking the protocol's assumptions: we relax the "economic finality" requirement so that attacks are still required to be expensive, but we accept an attack cost perhaps 10x lower than today (e.g., $2.5 billion instead of $25 billion). It is widely believed that Ethereum's economic finality today is far higher than it needs to be, and that its main security risks lie elsewhere, so this is arguably an acceptable sacrifice.
The main work is to verify that the Orbit mechanism is secure and has the properties we want, and then to fully formalize and implement it. In addition, EIP-7251 (increase the maximum effective balance) lets validators voluntarily merge their balances, which immediately reduces the chain's verification overhead and serves as an effective initial stage for launching Orbit.
(3) brute-forces the problem with heavy technology. Doing so requires collecting a very large number of signatures (1 million+) within a very short window (5-10 seconds).
(4) creates a two-tier staking system without requiring a rethinking of the mechanism or heavy technology, but it still carries centralization risks. The risk depends largely on the specific rights granted to the lower staking tier. For example:
If lower-level stakers need to delegate their attestation power to higher-level stakers, delegation may become centralized and we will end up with two highly concentrated staking tiers.
If a random sampling of lower layers is required to approve each block, then an attacker can prevent finality by spending only a tiny amount of ETH.
If lower-tier stakers can only produce inclusion lists, then the attestation layer may remain centralized, in which case a 51% attack on the attestation layer can censor the inclusion lists themselves.
It is possible to combine multiple strategies, for example:
(1 + 2): Add Orbit, but do not enforce single-slot finality
(1 + 3): Use brute-force techniques to reduce the minimum deposit without single-slot finality. The amount of aggregation required is 64x less than in the pure (3) case, so the problem becomes easier.
(2 + 3): Implement Orbit SSF with conservative parameters (e.g. a 128k-validator committee instead of 8k or 32k) and use brute-force techniques to make it highly efficient.
(1 + 4): Add rainbow staking, but do not enforce single-slot finality
How does it interact with the rest of the roadmap?
Among other benefits, single-slot finality reduces the risk of certain types of multi-block MEV attacks. Additionally, in a single-slot finality world, attester-proposer separation (APS) designs and other in-protocol block production mechanisms need to be designed differently.
A weakness of the brute-force approach is that it makes reducing the slot time much more difficult.
Single secret leader election
What problem are we solving?
Today, it is known in advance which validator will propose the next block. This creates a security vulnerability: an attacker can monitor the network, determine which validators correspond to which IP addresses, and launch a DoS attack on the validator when it is about to propose a block.
What is it and how does it work?
The best way to solve the DoS problem is to hide, at least until the block is actually published, which validator will produce it. If we ignore the "single" requirement (that exactly one party produces the next block), one solution is to let anyone create the next block, but require the randao reveal to be less than 2^256 / N. On average only one validator meets this requirement (but sometimes two or more do, and sometimes none). Combining the "secrecy" requirement with the "single" requirement has therefore long been a hard problem.
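The threshold rule can be sketched as a toy (the real protocol derives randomness from RANDAO reveals, which are BLS signatures; the SHA-256 hash of an index here is purely illustrative):

```python
import hashlib

# Toy model of threshold-based proposer eligibility: a validator qualifies
# if a hash of its reveal, read as an integer, falls below 2**256 / N.
# On average exactly one of N validators qualifies, but the count varies.

N = 1000
THRESHOLD = 2**256 // N

def is_eligible(reveal: bytes) -> bool:
    digest = hashlib.sha256(reveal).digest()
    return int.from_bytes(digest, "big") < THRESHOLD

# Stand-in "reveals": one per validator index (illustrative only).
eligible = [i for i in range(N) if is_eligible(i.to_bytes(8, "big"))]
print(len(eligible))  # count of eligible proposers; expected ~1, sometimes 0 or 2+
```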
The single secret leader election protocol solves this problem by using some cryptography to create a "blind" validator ID for each validator, and then giving many proposers the opportunity to shuffle and re-blind the pool of blind IDs (similar to how mixnets work). Within each slot, a random blind ID is chosen. Only the owner of that blind ID can generate a valid proof to propose a block, but no one knows which validator that blind ID corresponds to.
Whisk SSLE protocol
What are the connections with existing research?
Dan Boneh’s paper (2020): https://eprint.iacr.org/2020/025.pdf
Whisk (Ethereum specific proposal, 2022): https://ethresear.ch/t/whisk-a-practical-shuffle-based-ssle-protocol-for-ethereum/11763
Single secret leader election tag on ethresear.ch: https://ethresear.ch/tag/single-secret-leader-election
Simplified SSLE using ring signatures: https://ethresear.ch/t/simplified-ssle/12315
What's left to do? What are the trade-offs?
Really, all that remains is to find and implement a protocol that is simple enough that we can easily implement it on mainnet. We value the simplicity of Ethereum very much, and we don’t want complexity to grow further. The SSLE implementations we’ve seen add hundreds of lines of code to the specification and introduce new assumptions in complex cryptography. Finding a sufficiently efficient quantum-resistant SSLE implementation is also an open problem.
It may eventually come to the point where the “marginal additional complexity” of SSLE will only drop low enough if we take the plunge and introduce a mechanism for performing general zero-knowledge proofs in the Ethereum protocol at L1 for other reasons (e.g. state tries, ZK-EVM).
Another option is to not bother with SSLE at all, and instead use extra-protocol mitigations (e.g. at the p2p layer) to address the DoS problem.
How does it interact with the rest of the roadmap?
If we add an attester-proposer separation (APS) mechanism, such as execution tickets, then execution blocks (i.e. blocks containing Ethereum transactions) will not need SSLE, since we can rely on specialized block builders. However, for consensus blocks (i.e. blocks containing protocol messages such as attestations, and perhaps portions of inclusion lists), we would still benefit from SSLE.
Faster transaction confirmation
What problem are we solving?
There is value in further reducing Ethereum's transaction confirmation time from 12 seconds to 4 seconds. Doing so would significantly improve the user experience of both L1 and based rollups, while making DeFi protocols more efficient. It would also make it easier for L2s to decentralize, since it would allow a large class of L2 applications to run on based rollups, reducing the need for L2s to build their own committee-based decentralized sequencing.
What is it and how does it work?
There are roughly two techniques here:
Reduce the slot time, for example to 8 seconds or 4 seconds. This does not necessarily mean 4-second finality: finality itself requires three rounds of communication, so we can make each round a separate block, which will get at least tentative confirmation after 4 seconds.
Allow proposers to publish preconfirmations during a slot. In the extreme case, a proposer could include transactions in its block in real time as it sees them, immediately publishing a preconfirmation message for each one ("My first transaction is 0x1234...", "My second transaction is 0x5678..."). The case where a proposer publishes two conflicting preconfirmations can be handled in two ways: (i) slash the proposer, or (ii) use attesters to vote on which one came first.
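Option (i) relies on a pair of conflicting preconfirmations forming a self-contained fraud proof. A minimal sketch with a hypothetical message format (a real design would carry signatures and richer metadata):

```python
from dataclasses import dataclass

# Detecting proposer equivocation on preconfirmations: two messages claiming
# different transactions at the same position are, together, slashable
# evidence. The Preconf message format here is hypothetical.

@dataclass(frozen=True)
class Preconf:
    proposer: str
    slot: int
    index: int      # position within the slot's block
    tx_hash: str

def is_equivocation(a: Preconf, b: Preconf) -> bool:
    """True iff the two preconfs conflict: same proposer, slot, and index,
    but different transactions."""
    return (a.proposer == b.proposer and a.slot == b.slot
            and a.index == b.index and a.tx_hash != b.tx_hash)

m1 = Preconf("val_7", slot=100, index=0, tx_hash="0x1234")
m2 = Preconf("val_7", slot=100, index=0, tx_hash="0x5678")
print(is_equivocation(m1, m2))  # True -> slashable
```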
What are the connections with existing research?
Based on preconfirmations: https://ethresear.ch/t/based-preconfirmations/17353
Protocol Enforced Proposer Commitments (PEPC): https://ethresear.ch/t/unbundling-pbs-towards-protocol-enforced-proposer-commitments-pepc/13879
Staggered periods on parachains (2018 idea for low latency): https://ethresear.ch/t/staggered-periods/1793
What's left to do? What are the trade-offs?
It is unclear how feasible it is to reduce slot times. Even today, stakers in many parts of the world struggle to get their attestations included fast enough. Attempting 4-second slot times risks centralizing the validator set and, due to latency, making it impractical to run a validator outside of a few privileged regions.
The weakness of the proposer preconfirmation approach is that it greatly improves average-case inclusion time, but not the worst-case: if the current proposer is performing well, your transaction will be preconfirmed in 0.5 seconds instead of being included in (on average) 6 seconds, but if the current proposer is offline or performing poorly, you will still have to wait a full 12 seconds before the next slot can start and a new proposer is available.
Additionally, there is an open question of how to incentivize preconfirmations. Proposers have an incentive to preserve their optionality for as long as possible. If attesters sign off on the timeliness of preconfirmations, then transaction senders could make part of their fee conditional on an immediate preconfirmation, but this places an additional burden on attesters and may make it harder for them to remain neutral "dumb pipes."
On the other hand, if we don’t try to do this, and keep finality times at 12 seconds (or longer), the ecosystem will place more emphasis on L2 pre-confirmation mechanisms, and interactions across L2 will take longer.
How does it interact with the rest of the roadmap?
Proposer preconfirmations realistically depend on an attester-proposer separation (APS) mechanism such as execution tickets. Otherwise, the pressure to provide real-time preconfirmations may be too concentrated for ordinary validators to handle.
Other research areas
51% Attack Recovery
It is often assumed that if a 51% attack were to occur (including attacks that cannot be cryptographically proven, such as censorship), the community would come together to implement a minority soft fork, ensuring that the honest side wins and the attackers are penalized through inactivity leaks or slashing. However, this level of reliance on the social layer is arguably unhealthy. We can try to reduce that reliance by making the recovery process as automated as possible.
Full automation is impossible, since that would be equivalent to a >50% fault-tolerant consensus algorithm, and we already know the (very strict) mathematically provable limitations of such algorithms. But we can automate some of this: for example, a client could automatically refuse to accept a chain as final, or even as the head of a fork choice, if it censors transactions that the client has seen for a long time. A key goal is to ensure that an attacker cannot at least achieve a quick and complete victory.
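A toy version of such a client-side rule, with an invented timeout and data structures (a real client would also have to handle transaction validity, fee markets, and re-orgs):

```python
# Toy client-side censorship heuristic: refuse to treat a chain as final if
# it excludes a valid transaction the client has observed for "too long".
# The timeout value is invented for illustration.

CENSORSHIP_TIMEOUT = 600  # seconds a pending tx may wait before we object

def accept_as_final(chain_txs: set, mempool_first_seen: dict, now: float) -> bool:
    for tx, first_seen in mempool_first_seen.items():
        if tx not in chain_txs and now - first_seen > CENSORSHIP_TIMEOUT:
            return False  # long-censored tx: withhold recognition of finality
    return True

seen = {"0xaaaa": 0.0, "0xbbbb": 900.0}
print(accept_as_final({"0xbbbb"}, seen, now=1000.0))            # False: 0xaaaa censored
print(accept_as_final({"0xaaaa", "0xbbbb"}, seen, now=1000.0))  # True
```

A rule like this cannot distinguish censorship from congestion on its own, which is one reason full automation is out of reach; the goal is only to deny an attacker a quick, clean victory.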
Raising the quorum threshold
Today, a block is finalized as long as 67% of stake supports it. Some argue that this is too aggressive. In Ethereum's entire history, there has been only one (very brief) failure of finality. If the threshold were raised to 80%, the number of additional non-finality periods would be relatively low, but Ethereum would gain security: in particular, many contentious situations would result in a temporary halt to finality. This seems healthier than an immediate victory for the "wrong side," whether that wrong side is an attacker or a buggy client.
This also answers the question "what is the point of solo stakers?". Today, with most stake already in staking pools, it seems very unlikely that solo stakers could ever reach 51% of staked ETH. However, getting solo stakers to a quorum-blocking minority seems achievable if we work at it, especially with an 80% quorum (where a quorum-blocking minority needs only 21%). As long as solo stakers do not join a 51% attack (whether a finality reversal or censorship), such an attack cannot achieve a "clean win," and solo stakers would be motivated to help organize the minority soft fork.
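The quorum arithmetic above is simple (nothing protocol-specific here): finality needs a fraction q of stake in favor, so withholding just over 1 - q of stake halts finality.

```python
# A quorum of q means finality needs fraction q of stake in favor, so any
# coalition holding just over 1 - q of stake can block finality.

def blocking_minority(quorum: float) -> float:
    return 1.0 - quorum

print(f"{blocking_minority(2/3):.0%}")   # 33%: today's quorum-blocking minority
print(f"{blocking_minority(0.80):.0%}")  # 20%: with an 80% quorum
```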
Resistant to quantum attacks
Metaculus forecasters currently expect that quantum computers could begin to break cryptography sometime in the 2030s, albeit with wide error bars.
Quantum computing experts, such as Scott Aaronson, have also recently started to think more seriously about the possibility of quantum computers actually working in the medium term. This will have implications for the entire Ethereum roadmap: it means that every part of the Ethereum protocol that currently relies on elliptic curves will need some kind of hash-based or other quantum-resistant alternative. This specifically means that we cannot assume that we will be able to forever rely on the superior performance of BLS aggregation to process signatures from large validator sets. This justifies conservatism in performance assumptions for proof-of-stake designs, and is a reason to more aggressively develop quantum-resistant alternatives.