Possible futures for the Ethereum protocol, part 2: The Surge

Author: Vitalik Buterin

Translation: Karen, Foresight News

Special thanks to Justin Drake, Francesco, Hsiao-wei Wang, @antonttc and Georgios Konstantopoulos.

Originally, there were two scaling strategies in the Ethereum roadmap. One (see an early paper from 2015) was "sharding": each node only needs to verify and store a small portion of transactions, rather than verifying and storing everything in the chain. Any other peer-to-peer network (such as BitTorrent) works this way, so of course we could make blockchains work the same way. The other was Layer 2 protocols: networks that sit on top of Ethereum, fully benefiting from its security while keeping most data and computation off the main chain. "Layer 2 protocols" meant state channels in 2015, Plasma in 2017, and then Rollups in 2019. Rollups are more powerful than state channels or Plasma, but they require a lot of on-chain data bandwidth. Fortunately, by 2019, sharding research had solved the problem of verifying "data availability" at scale. As a result, the two paths merged, and we got the Rollup-centric roadmap that remains Ethereum's scaling strategy today.

The Surge, 2023 Roadmap Edition

The Rollup-centric roadmap proposes a simple division of labor: Ethereum L1 focuses on becoming a powerful and decentralized base layer, while L2 takes on the task of helping the ecosystem scale. This pattern is ubiquitous in society: the court system (L1) exists not to be super fast and efficient but to protect contracts and property rights, and entrepreneurs (L2) build on this solid foundation to lead humanity to (literally and metaphorically) Mars.

This year, the Rollup-centric roadmap has achieved important results: with the launch of EIP-4844 blobs, the data bandwidth of Ethereum L1 has increased significantly, and multiple Ethereum Virtual Machine (EVM) Rollups have reached Stage 1. Each L2 exists as a "shard" with its own internal rules and logic, and diversity in how shards are implemented is now a reality. But as we have seen, this path brings its own unique challenges. Our task now is to complete the Rollup-centric roadmap and solve these problems, while preserving the robustness and decentralization that distinguish Ethereum L1.

The Surge: Key Goals

1. In the future, Ethereum can reach more than 100,000 TPS through L2;

2. Maintain the decentralization and robustness of L1;

3. At least some L2s fully inherit the core properties of Ethereum (trustlessness, openness, and anti-censorship);

4. Ethereum should feel like a unified ecosystem, not 34 different blockchains.

In this chapter

1. The scalability trilemma

2. Further progress in data availability sampling

3. Data compression

4. Generalized Plasma

5. Mature L2 proof system

6. Cross-L2 interoperability improvements

7. Scaling execution on L1

The Scalability Trilemma

The scalability trilemma is an idea proposed in 2017, which posits a tension between three desirable properties of a blockchain: decentralization (more specifically: low cost of running a node), scalability (a high number of transactions processed), and security (an attacker needing to compromise a large fraction of the network's nodes to make even a single transaction fail).

It is worth noting that the trilemma is not a theorem, and the post introducing it does not come with a mathematical proof. It does give a heuristic mathematical argument: if a decentralization-friendly node (such as a consumer laptop) can verify N transactions per second, and you have a chain that processes k*N transactions per second, then either (i) each transaction is only seen by 1/k of the nodes, meaning an attacker only needs to compromise a few nodes to push through a malicious transaction, or (ii) your nodes will become powerful and your chain will not be decentralized. The purpose of the post was never to prove that breaking the trilemma is impossible; rather, it was to show that breaking the trilemma is hard, and requires in some way thinking outside the box that the argument implies.

For years, some high-performance chains have claimed that they solved the trilemma without fundamentally changing their architecture, usually by applying software engineering tricks to optimize nodes. This has always been misleading: running a node on these chains is far more difficult than running a node on Ethereum. This post explores why that is, and why L1 client software engineering on its own cannot scale Ethereum.

However, data availability sampling combined with SNARKs does solve the trilemma: it allows clients to verify that a certain amount of data is available and a certain number of computational steps were executed correctly, while only downloading a small amount of data and performing a very small amount of computation. SNARKs are trustless. Data availability sampling has a subtle few-of-N trust model, but it retains the fundamental property of non-scalable chains, namely that even a 51% attack cannot force bad blocks to be accepted by the network.

Another approach to solving the trilemma is the Plasma architecture, which uses clever techniques to push the responsibility of monitoring data availability onto users in an incentive-compatible way. Back in 2017-2019, when we only had fraud proofs as a means to scale computational power, Plasma was very limited in terms of secure execution, but with the popularity of SNARKs (zero-knowledge succinct non-interactive arguments), the Plasma architecture has become more viable for a wider range of use cases than ever before.

Further Progress on Data Availability Sampling

What Problem Are We Solving?

On March 13, 2024, the Dencun upgrade went live, giving the Ethereum blockchain three ~125 kB blobs per 12-second slot, or ~375 kB of data availability bandwidth per slot. Assuming transaction data is published directly on-chain, an ERC20 transfer is about 180 bytes, so the maximum TPS for Rollups on Ethereum is: 375,000 / 12 / 180 = 173.6 TPS

If we add Ethereum's calldata (theoretical maximum: 30 million gas per slot / 16 gas per byte = 1,875,000 bytes per slot), this becomes 607 TPS. With PeerDAS, the blob count could increase to 8-16, which would give 463-926 TPS for blobs.

This is a significant improvement over Ethereum L1, but not enough. We want more scalability. Our medium-term goal is 16 MB per slot, which, when combined with improvements in Rollup data compression, would bring ~58,000 TPS.
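The arithmetic behind these figures is simple enough to sanity-check directly; here is a quick sketch (the slot time, blob size, and transfer size are the assumptions used in this section, not protocol constants):

```python
SLOT_SECONDS = 12           # one Ethereum slot
ERC20_TRANSFER_BYTES = 180  # rough on-chain footprint of an ERC20 transfer
BLOB_BYTES = 125_000        # ~125 kB per blob

def tps(bytes_per_slot: float) -> float:
    """Max transactions per second for a given per-slot data budget."""
    return bytes_per_slot / SLOT_SECONDS / ERC20_TRANSFER_BYTES

print(round(tps(3 * BLOB_BYTES), 1))   # Dencun, 3 blobs   -> 173.6
print(round(tps(8 * BLOB_BYTES)))      # PeerDAS, 8 blobs  -> 463
print(round(tps(16 * BLOB_BYTES)))     # PeerDAS, 16 blobs -> 926
print(round(tps(16_000_000)))          # 16 MB/slot target -> 7407
```

The 16 MB figure is the pre-compression ceiling; the ~58,000 TPS number comes from combining it with the data compression techniques discussed later.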

What is it? How does it work?

PeerDAS is a relatively simple implementation of "1D sampling". In Ethereum, each blob is a degree-4096 polynomial over a 253-bit prime field. We broadcast shares of the polynomial, where each share consists of 16 evaluations at 16 adjacent coordinates out of a total of 8192 coordinates. Any 4096 of the 8192 evaluations (per the currently proposed parameters: any 64 of the 128 possible samples) can recover the blob.
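To see why any half of the evaluations suffice, here is a toy version of the same erasure-coding idea at miniature scale (k=8 evaluations instead of 4096, and a small prime field instead of the 253-bit one):

```python
import random

P = 65537  # small prime field, stand-in for the real 253-bit field

def eval_poly(coeffs, x):
    """Evaluate a polynomial (coefficient form) at x, mod P."""
    y = 0
    for c in reversed(coeffs):
        y = (y * x + c) % P
    return y

def lagrange_interpolate(points, x):
    """Evaluate the unique degree-<k polynomial through `points` at x."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

k, n = 8, 16
data = [random.randrange(P) for _ in range(k)]    # toy "blob" contents
evals = [eval_poly(data, x) for x in range(n)]    # extended evaluations

# Lose all but an arbitrary k of the n samples...
kept = random.sample(list(enumerate(evals)), k)
# ...and reconstruct every evaluation from what is left.
recovered = [lagrange_interpolate(kept, x) for x in range(n)]
assert recovered == evals
print("recovered blob from", k, "of", n, "samples")
```

Because a degree-&lt;k polynomial is fully determined by any k evaluations, a node that samples random positions and finds them all available gains high confidence that the whole blob is recoverable.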

PeerDAS works by having each client listen to a small number of subnets, where the i-th subnet broadcasts the i-th sample of any blob, and requests the blobs it needs on other subnets by asking peers in the global p2p network (who will listen to different subnets). A more conservative version, SubnetDAS, uses only the subnet mechanism without the additional layer of asking peers. The current proposal is for nodes participating in proof of stake to use SubnetDAS, while other nodes (i.e. clients) use PeerDAS.

In theory, we can scale 1D sampling quite a bit: if we increase the maximum number of blobs to 256 (with a target of 128), then we can hit our 16MB target, with 16 samples per node in data availability sampling * 128 blobs * 512 bytes per sample per blob = 1MB of data bandwidth per slot. This is just barely within our tolerance: it's doable, but it means bandwidth-constrained clients can't sample. We can optimize this somewhat by reducing the number of blobs and increasing blob size, but this makes reconstruction more expensive.

So we ultimately want to go a step further and do 2D sampling, which randomly samples not only within blobs, but also between blobs. Using the linearity property of KZG commitments, the set of blobs in a block is extended with a list of new "virtual blobs" that redundantly encode the same information.
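The linearity being relied on can be illustrated with a toy stand-in for KZG: any linear commitment scheme commutes with the blob extension, so commitments to the virtual blobs can be derived from the existing commitments alone. (The dot-product "commitment" below is purely illustrative; real KZG commitments are elliptic-curve points.)

```python
import random

P = 65537
random.seed(1)

SECRET = [random.randrange(P) for _ in range(4)]  # stand-in for a trusted setup

def commit(blob):
    # Toy LINEAR commitment: a dot product mod P. Like KZG, it is
    # homomorphic: commit(a*x + b*y) == a*commit(x) + b*commit(y).
    return sum(s * v for s, v in zip(SECRET, blob)) % P

def lagrange_coeffs(xs, x):
    """Coefficients c_i with f(x) = sum c_i * f(xs[i]) for deg-<len(xs) f."""
    coeffs = []
    for i, xi in enumerate(xs):
        num = den = 1
        for j, xj in enumerate(xs):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        coeffs.append(num * pow(den, -1, P) % P)
    return coeffs

blobs = [[random.randrange(P) for _ in range(4)] for _ in range(3)]
cs = lagrange_coeffs([0, 1, 2], x=3)  # extension coefficients for one virtual blob

# The virtual blob is a fixed linear combination applied coordinate-wise...
virtual = [sum(c * b[j] for c, b in zip(cs, blobs)) % P for j in range(4)]

# ...so its commitment is computable from the commitments ALONE:
assert commit(virtual) == sum(c * commit(b) for c, b in zip(cs, blobs)) % P
print("virtual-blob commitment derived without the blob data")
```

This commutativity is exactly what lets block builders publish commitments to the extended data without holding every blob themselves.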


2D Sampling. Source: a16z crypto

Crucially, computing the extended commitments does not require having the blobs, so the scheme is fundamentally friendly to distributed block building. The nodes that actually build blocks only need to hold the blob KZG commitments, and can rely on data availability sampling (DAS) to verify the availability of the blobs. 1D DAS is also inherently friendly to distributed block building.

What are the links to existing research?

1. Original post introducing data availability (2018): https://github.com/ethereum/research/wiki/A-note-on-data-availability-and-erasure-coding

2. Follow-up paper: https://arxiv.org/abs/1809.09044

3. Explanation paper on DAS, paradigm: https://www.paradigm.xyz/2022/08/das

4. 2D availability with KZG commitments: https://ethresear.ch/t/2d-data-availability-with-kate-commitments/8081

5. ethresear.ch PeerDAS: https://ethresear.ch/t/peerdas-a-simpler-das-approach-using-battle-tested-p2p-components/16541 and paper: https://eprint.iacr.org/2024/1362

6. EIP-7594: https://eips.ethereum.org/EIPS/eip-7594

7. SubnetDAS on ethresear.ch: https://ethresear.ch/t/subnetdas-an-intermediate-das-approach/17169

8. Nuances of recoverability in 2D sampling: https://ethresear.ch/t/nuances-of-data-recoverability-in-data-availability-sampling/16256

What needs to be done? What are the trade-offs?

Next up is completing the implementation and rollout of PeerDAS. After that, it will be a gradual process of increasing the number of blobs on PeerDAS while carefully watching the network and improving the software to ensure security. In the meantime, we expect more academic work to formalize PeerDAS and other versions of DAS and their interactions with issues like fork choice rule security.

Further down the road, more work will be needed to determine the ideal version of 2D DAS and prove its security properties. We also hope to eventually move away from KZG to an alternative that is quantum-safe and does not require a trusted setup. It is not yet clear which candidates are friendly to distributed block building. Even the expensive "brute force" technique of using recursive STARKs to generate validity proofs for reconstructing rows and columns is not sufficient: although technically a STARK is O(log(n) * log(log(n))) hashes in size (using STIR), in practice a STARK is almost as large as an entire blob.

The realistic long-term options I see are:

1. Implement an ideal 2D DAS;

2. Stick with 1D DAS, sacrificing sampling bandwidth efficiency and accepting a lower data cap for the sake of simplicity and robustness;

3. (Hard pivot) abandon DA and fully embrace Plasma as the primary L2 architecture we focus on.

Note that this choice exists even if we decide to scale execution directly on L1. This is because if L1 is to handle a large volume of TPS, L1 blocks will become very large, and clients will want an efficient way to verify their correctness, so we would have to use the same techniques that power Rollups (such as ZK-EVM and DAS) on L1 itself.

How does it interact with the rest of the roadmap?

If data compression is implemented, the need for 2D DAS will be reduced, or at least delayed, and if Plasma is widely used, the need will be further reduced. DAS also poses challenges to distributed block construction protocols and mechanisms: while DAS is theoretically friendly to distributed reconstruction, this in practice needs to be combined with the inclusion list proposal and the fork choice mechanism around it.

Data Compression

What Problem Are We Solving?

Each transaction in a Rollup takes up a significant amount of on-chain data space: an ERC20 transfer takes about 180 bytes. This limits the scalability of Layer 2 protocols even with ideal data availability sampling. At 16 MB per slot, we get:

16000000 / 12 / 180 = 7407 TPS

What if, instead of only increasing the numerator, we could also shrink the denominator, so that each transaction in a Rollup takes up fewer bytes on-chain?

What is it and how does it work?

In my opinion, the best explanation is this picture from two years ago:

In zero-byte compression, each long sequence of zero bytes is replaced with two bytes indicating how many zero bytes there are. Going one step further, we take advantage of specific properties of transactions:

Signature aggregation: We switch from ECDSA signatures to BLS signatures, which have the property that many signatures can be combined into a single signature proving the validity of all the originals. On L1, BLS signatures are not considered because verification is computationally expensive even with aggregation. But in a data-scarce environment like L2, using BLS signatures makes sense. The aggregation feature of ERC-4337 provides one path to implementing this.

Replace addresses with pointers: If an address has been used before, we can replace the 20-byte address with a 4-byte pointer to a location in history.

Custom serialization of transaction values - Most transaction values have very few bits, for example, 0.25 ETH is represented as 250,000,000,000,000,000 wei. The same goes for the maximum base fee and priority fee. Therefore, we can use a custom decimal floating point format to represent most monetary values.
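Two of these tricks, zero-byte run-length encoding and a decimal floating-point value encoding, can be sketched in a few lines (the byte formats here are made up for illustration; real Rollup serializations differ):

```python
def zero_rle(data: bytes) -> bytes:
    """Replace each run of zero bytes with 0x00 + run length (runs <= 255)."""
    out, i = bytearray(), 0
    while i < len(data):
        if data[i] == 0:
            j = i
            while j < len(data) and data[j] == 0 and j - i < 255:
                j += 1
            out += bytes([0, j - i])
            i = j
        else:
            out.append(data[i])
            i += 1
    return bytes(out)

def encode_value(wei: int) -> bytes:
    """Custom decimal float: strip trailing decimal zeros and store
    (exponent, mantissa) instead of a 32-byte big-endian integer."""
    exp = 0
    while wei > 0 and wei % 10 == 0:
        wei //= 10
        exp += 1
    mantissa = wei.to_bytes((wei.bit_length() + 7) // 8 or 1, "big")
    return bytes([exp]) + mantissa

# 0.25 ETH = 250_000_000_000_000_000 wei = 25 * 10^16:
# a 32-byte uint becomes just 2 bytes (exponent 16, mantissa 25).
print(len(encode_value(250_000_000_000_000_000)))  # -> 2

padded = (250_000_000_000_000_000).to_bytes(32, "big")
print(len(zero_rle(padded)))  # the 24 leading zero bytes collapse to 2
```

Both decoders are straightforward to invert, which matters: the compressed form must remain fully auditable from on-chain data alone.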

What are the links to existing research?

1. Exploration from sequence.xyz: https://sequence.xyz/blog/compressing-calldata

2. L2 Calldata Optimization Contract: https://github.com/ScopeLift/l2-optimizoooors

3. Rollups based on proof of validity (aka ZK rollups) publish state differences instead of transactions: https://ethresear.ch/t/rollup-diff-compression-application-level-compression-strategies-to-reduce-the-l2-data-footprint-on-l1/9975

4. BLS Wallet - BLS aggregation via ERC-4337: https://github.com/getwax/bls-wallet

What else needs to be done, and what are the tradeoffs?

The main thing to do next is to actually implement the above scheme. The main tradeoffs include:

1. Switching to BLS signatures requires a lot of effort and reduces compatibility with trusted hardware chips that can enhance security. ZK-SNARK wrappers of other signature schemes can be used instead.

2. Dynamic compression (e.g., replacing addresses with pointers) complicates client code.

3. Publishing state differences instead of transactions on-chain reduces auditability and breaks much existing software (e.g., block explorers).

How does it interact with other parts of the roadmap?

Adopting ERC-4337 and eventually incorporating parts of it into the L2 EVM can greatly speed up the deployment of aggregation technology. Putting parts of ERC-4337 on L1 can speed up its deployment on L2.

Generalized Plasma

What problem are we solving?

Even with 16MB blobs and data compression, 58,000 TPS may not be enough to fully meet the needs of consumer payments, decentralized social, or other high-bandwidth areas, especially when we start to consider privacy factors, which may reduce scalability by 3-8 times. For high-transaction volume, low-value use cases, one current option is to use Validium, which stores data off-chain and adopts an interesting security model: operators cannot steal users' funds, but they may temporarily or permanently freeze all users' funds. But we can do better.

What is it and how does it work?

Plasma is a scaling solution that involves an operator publishing blocks off-chain and putting the Merkle roots of those blocks on-chain (unlike Rollup, which puts the full blocks on-chain). For each block, the operator sends each user a Merkle branch to prove what has, or has not, changed to that user's assets. Users can withdraw their assets by providing a Merkle branch. Importantly, this branch does not have to be rooted at the latest state. Therefore, even if there is a problem with data availability, users can still recover their assets by extracting the latest state available to them. If a user submits an invalid branch (for example, withdrawing an asset they have already sent to someone else, or the operator creates an asset out of thin air), the legal ownership of the asset can be determined through an on-chain challenge mechanism.
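The Merkle-branch mechanics at the heart of this design can be sketched as follows (a toy tree; real Plasma constructions add exit games and challenge periods on top):

```python
import hashlib

def h(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()

def merkle_root(leaves):
    layer = [h(l) for l in leaves]
    while len(layer) > 1:
        layer = [h(layer[i] + layer[i + 1]) for i in range(0, len(layer), 2)]
    return layer[0]

def merkle_branch(leaves, index):
    """Sibling hashes from leaf `index` up to the root."""
    layer, branch = [h(l) for l in leaves], []
    while len(layer) > 1:
        branch.append(layer[index ^ 1])
        layer = [h(layer[i] + layer[i + 1]) for i in range(0, len(layer), 2)]
        index //= 2
    return branch

def verify_branch(leaf, index, branch, root) -> bool:
    node = h(leaf)
    for sib in branch:
        node = h(sib + node) if index % 2 else h(node + sib)
        index //= 2
    return node == root

# Plasma Cash style: the transaction spending coin i sits at position i.
coins = [f"owner-of-coin-{i}".encode() for i in range(8)]
root = merkle_root(coins)           # only this root goes on-chain
proof = merkle_branch(coins, 5)     # the operator sends this to coin 5's owner
print(verify_branch(coins[5], 5, proof, root))  # -> True
```

The key property: a user needs only the root (on-chain) and their own branch (sent off-chain by the operator) to later prove ownership and exit, even if the rest of the block's data disappears.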

Plasma Cash chain diagram. A transaction that spends coin i is placed at the i-th position in the tree. In this example, assuming all previous trees are valid, we know that Eve currently owns token 1, David owns token 4, and George owns token 6.

Early versions of Plasma were only able to handle the payments use case and could not be effectively generalized further. However, if we require that every root be verified with a SNARK, then Plasma becomes much more powerful. Each challenge game can be greatly simplified because we rule out most of the possible paths for the operator to cheat. At the same time, new paths are opened up that allow Plasma technology to scale to a wider range of asset classes. Finally, in the case where the operator does not cheat, users can withdraw their funds immediately without having to wait for a week-long challenge period.

One way to make an EVM Plasma chain (not the only way): use ZK-SNARK to build a parallel UTXO tree that reflects the balance changes made by the EVM and defines a unique mapping of the "same token" at different points in history. Then you can build a Plasma structure on top of it.

A key insight is that Plasma systems do not need to be perfect. Even if you can only protect a subset of assets (for example, just tokens that have not moved in the past week), you have greatly improved on the status quo of today's hyper-scalable EVM (i.e., Validium).

Another class of structures is a hybrid Plasma/Rollup, such as Intmax. These constructions put very small amounts of data per user on-chain (e.g., 5 bytes), and in doing so achieve properties somewhere in between Plasma and Rollup: in the case of Intmax, you get very high scalability and privacy, though even at 16MB you’re theoretically limited to about 16,000,000 / 12 / 5 = 266,667 TPS.

What are some links to existing research?

1. Original Plasma paper: https://plasma.io/plasma-deprecated.pdf

2. Plasma Cash: https://ethresear.ch/t/plasma-cash-plasma-with-much-less-per-user-data-checking/1298

3. Plasma Cashflow: https://hackmd.io/DgzmJIRjSzCYvl4lUjZXNQ?view#-Exit

4. Intmax (2023): https://eprint.iacr.org/2023/1082

What else needs to be done? What are the trade-offs?

The main task remaining is to put Plasma systems into actual production applications. As mentioned above, "Plasma vs. Validium" is not an either-or choice: any Validium can improve its security properties at least to some extent by incorporating Plasma features in its exit mechanism. Research focuses on obtaining the best properties for the EVM (in terms of trust requirements, worst-case L1 gas costs, and the ability to resist DoS attacks), as well as alternative application-specific structures. In addition, Plasma has a higher conceptual complexity relative to Rollup, which needs to be directly addressed by researching and building better general frameworks.

The main trade-off of Plasma designs is that they depend more on operators and are harder to make "based", although hybrid Plasma/Rollup designs can often avoid this weakness.

How does it interact with other parts of the roadmap?

The more effective the Plasma solution, the less pressure there is on L1 to have high-performance data availability functions. Moving activity to L2 can also reduce MEV pressure on L1.

Mature L2 Proof Systems

What Problem Are We Solving?

Currently, most Rollups are not actually trustless. There is a security committee that has the ability to overturn the behavior of the (optimistic or validity) proof system. In some cases, the proof system does not even run at all, or if it does, it has only an "advisory" function. The most advanced Rollups include: (i) some trustless application-specific Rollups, such as Fuel; (ii) as of this writing, Optimism and Arbitrum, two full-EVM Rollups that have achieved the partial-trustlessness milestone known as "Stage 1". The reason Rollups have not progressed further is fear of bugs in the code. We need trustless Rollups, so we must face and solve this problem.

What is it and how does it work?

First, let’s recap the “stage” system originally introduced in this article.

Stage 0: Users must be able to run a node and sync the chain. It doesn’t matter if validation is fully trusted/centralized.

Stage 1: There must be a (trustless) proof system that ensures only valid transactions are accepted. A security committee that can overturn the proof system is allowed, but only with a 75% threshold vote. In addition, the quorum-blocking portion of the committee (i.e. 26%+) must be outside the main company building the Rollup. A less powerful upgrade mechanism (such as a DAO) is allowed, but it must have a long enough delay that if it approves a malicious upgrade, users can withdraw their funds before the upgrade goes live.

Stage 2: There must be a (trustless) proof system that ensures only valid transactions are accepted. The safety committee is only allowed to intervene if there is a provable bug in the code, e.g. if two redundant proof systems disagree with each other, or if a proof system accepts two different post-state roots for the same block (or does not accept anything for a long enough period of time, e.g. a week). Upgrade mechanisms are allowed, but must have very long delays.

Our goal is to reach stage 2. The main challenge in reaching stage 2 is to gain enough confidence that the proof system is actually trustworthy enough. There are two main ways to do this:

1. Formal Verification: We can use modern mathematical and computational techniques to prove (optimistic and validity) that the proof system only accepts blocks that pass the EVM specification. These techniques have been around for decades, but recent advances (like Lean 4) have made them more practical, and advances in AI-assisted proofs may further accelerate this trend.

2. Multi-provers: Build multiple proof systems and combine them with a security committee (or another gadget with trust assumptions, such as a TEE). If the proof systems agree, the committee has no power; if they disagree, the committee can only choose between their answers, and cannot unilaterally impose its own.

Stylized diagram of a multi-prover, combining an optimistic proof system, a validity proof system, and a security committee.
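The arbitration rule described above is simple enough to state in a few lines of code (a sketch; the function and argument names are hypothetical):

```python
def settle(optimistic_root: str, validity_root: str, committee_choice=None) -> str:
    """Multi-prover arbitration: the committee only matters on disagreement,
    and even then may only pick one of the two proposed answers."""
    if optimistic_root == validity_root:
        return optimistic_root  # agreement: the committee has no say
    # Disagreement: the committee may select one of the proposed roots...
    if committee_choice in (optimistic_root, validity_root):
        return committee_choice
    # ...but can never impose a third answer of its own.
    raise ValueError("committee cannot impose its own answer")

assert settle("0xaa", "0xaa", committee_choice="0xbb") == "0xaa"
assert settle("0xaa", "0xcc", committee_choice="0xcc") == "0xcc"
```

The trusted surface is exactly this arbitration logic, which is why keeping it tiny (e.g., a simple multisig-style contract) is so valuable.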

What are some links to existing research?

1. EVM K Semantics (formal verification work from 2017): https://github.com/runtimeverification/evm-semantics

2. Talk on the idea of multi-proofs (2022): https://www.youtube.com/watch?v=6hfVzCWT6YI

3. Taiko's plans to use multi-proofs: https://docs.taiko.xyz/core-concepts/multi-proofs/

What else needs to be done? What are the trade-offs?

For formal verification, the workload is large. We need to create a formally verified version of the entire SNARK prover for the EVM. This is an extremely complex project, although we have already started. There is a trick that greatly simplifies this task: we can create a formally verified SNARK prover for a minimal VM (such as RISC-V or Cairo), and then implement the EVM in that minimal VM (and formally prove its equivalence to other Ethereum VM specifications).

There are two major parts of multi-proving that are not yet done. First, we need enough confidence in at least two different proof systems: both that each is reasonably secure on its own, and that if they break, they break for different and unrelated reasons (so they don't break at the same time). Second, we need very high trust in the underlying logic that combines the proof systems. This piece of code is much smaller. There are ways to make it very small, e.g., store the funds in a Safe multisig contract whose signers are the contracts representing the individual proof systems, but this increases on-chain gas costs. We need to find some balance between efficiency and security.

How does this interact with the rest of the roadmap?

Moving activity to L2 reduces MEV pressure on L1.

Cross-L2 interoperability improvements

What problem are we solving?

A major challenge facing the L2 ecosystem today is that it is difficult for users to navigate. Additionally, the easiest approaches often reintroduce trust assumptions: centralized bridges, RPC clients, and so on. We need to make using the L2 ecosystem feel like using a unified Ethereum ecosystem.

What is it? How does it work?

There are many categories of cross-L2 interoperability improvements. In theory, a Rollup-centric Ethereum is the same thing as an execution-sharded L1. In practice, the current Ethereum L2 ecosystem falls far short of that ideal:

1. Chain-specific addresses: The address should contain chain information (L1, Optimism, Arbitrum...). Once this is achieved, the cross-L2 send process can be implemented by simply putting the address into the "Send" field, and the wallet can handle how to send it in the background (including using cross-chain protocols).

2. Chain-specific payment requests: It should be easy and standardized to create messages of the form "Send me X tokens of type Y on chain Z". This has two main application scenarios: (i) payments, whether person-to-person or person-to-merchant; (ii) DApps requesting funds.

3. Cross-chain exchange and gas payment: There should be a standardized open protocol to express cross-chain operations, such as "I will send 1 ether (on Optimism) to whoever sent me 0.9999 ether on Arbitrum", and "I will send 0.0001 ether (on Optimism) to whoever includes this transaction on Arbitrum". ERC-7683 is an attempt at the former, and RIP-7755 is an attempt at the latter, although both have wider application than these specific use cases.

4. Light client: Users should be able to actually verify the chain they are interacting with, rather than just trusting the RPC provider. a16z crypto's Helios can do this (for Ethereum itself), but we need to extend this trustlessness to L2. ERC-3668 (CCIP-read) is a strategy to achieve this goal.

How a light client updates its view of the Ethereum header chain. Once you have the header chain, you can use Merkle proofs to verify any state object. Once you have the correct L1 state object, you can use Merkle proofs (and signatures if you want to check pre-confirmations) to verify any state object on L2. Helios already does the former. Expanding to the latter is a standardization challenge.

5. Keystore wallets: Today, if you want to update the key that controls your smart contract wallet, you have to update it on all N chains where that wallet exists. Keystore wallets are a technology that lets the key live in only one place (either on L1, or potentially on an L2 in the future), from which any L2 holding a copy of the wallet can read it. This means updates only need to happen once. To make this efficient, keystore wallets require a standardized way for L2s to read L1 state cheaply; there are two proposals for this, namely L1SLOAD and REMOTESTATICCALL.

Keystore Wallet Working Principle

6. A more radical "shared token bridge" concept: Imagine a world where all L2s are validity-proof Rollups that commit to Ethereum every slot. Even in such a world, transferring assets natively from one L2 to another still requires withdrawals and deposits, which cost a lot of L1 gas. One way to solve this is to create a shared minimalist Rollup whose only function is to track which L2 owns how much of each token type, and to allow these balances to be updated in batches through a series of cross-L2 send operations initiated by any L2. This would allow cross-L2 transfers without paying L1 gas for each transfer, and without liquidity-provider-based techniques such as ERC-7683.

7. Synchronous composability: allowing synchronous calls between a specific L2 and L1, or between multiple L2s. This could improve the financial efficiency of DeFi protocols. The former can be done without any cross-L2 coordination; the latter requires shared sequencing. Based Rollups are automatically friendly to all of these techniques.
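As a small illustration of the chain-specific addresses in item 1, a wallet-side parser might look like this (the short-name registry and the exact format are stand-ins in the spirit of ERC-3770, not its normative definition):

```python
# Hypothetical short-name -> chain-id registry, for illustration only.
CHAIN_REGISTRY = {"eth": 1, "oeth": 10, "arb1": 42161}

def parse_chain_address(s: str) -> tuple[int, str]:
    """Split 'arb1:0xabc...' into (chain_id, address); reject unknown chains
    so the wallet never silently sends to the wrong network."""
    short_name, _, address = s.partition(":")
    if not address or short_name not in CHAIN_REGISTRY:
        raise ValueError(f"unrecognized chain-specific address: {s!r}")
    if not (address.startswith("0x") and len(address) == 42):
        raise ValueError(f"malformed address: {address!r}")
    return CHAIN_REGISTRY[short_name], address

chain_id, addr = parse_chain_address("arb1:0x" + "ab" * 20)
print(chain_id)  # -> 42161
```

Once the chain is parsed out, the wallet can route the transfer (including through a cross-chain protocol) without the user ever thinking about which L2 they are on.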

What are the links to existing research?

1. Chain-specific address: ERC-3770: https://eips.ethereum.org/EIPS/eip-3770

2. ERC-7683: https://eips.ethereum.org/EIPS/eip-7683

3. RIP-7755: https://github.com/wilsoncusack/RIPs/blob/cross-l2-call-standard/RIPS/rip-7755.md

4. Scroll keystore wallet design: https://hackmd.io/@haichen/keystore

5. Helios: https://github.com/a16z/helios

6. ERC-3668 (sometimes called CCIP Read): https://eips.ethereum.org/EIPS/eip-3668

7. Justin Drake's proposal for "based (shared) preconfirmations": https://ethresear.ch/t/based-preconfirmations/17353

8. L1SLOAD (RIP-7728): https://ethereum-magicians.org/t/rip-7728-l1sload-precompile/20388

9. REMOTESTATICCALL in Optimism: https://github.com/ethereum-optimism/ecosystem-contributions/issues/76

10. AggLayer, which includes the idea of a shared token bridge: https://github.com/AggLayer

What else needs to be done? What are the tradeoffs?

Many of the examples above face the standards dilemma of when and at which layer to standardize. If standardization happens too early, a poor solution may become entrenched. If it happens too late, unnecessary fragmentation may result. In some cases, there is both a short-term solution with weaker properties that is easier to implement, and a long-term solution that is "eventually correct" but takes years to arrive.

These tasks are not just technical problems, they are also (perhaps even primarily) social problems that require L2 and wallets to cooperate as well as L1.

How do they interact with the rest of the roadmap?

Most of these proposals are "higher layer" constructs, and therefore have little impact on L1 considerations. One exception is shared ordering, which has a significant impact on maximum extractable value (MEV).

Scaling execution on L1

What problem are we solving?

If L2 becomes very scalable and successful, but L1 can still only handle a very small amount of transactions, then Ethereum may face many risks:

1. The economics of the ETH asset become less stable, which in turn affects the long-term security of the network.

2. Many L2s benefit from close ties with the highly developed financial ecosystem on L1. If this ecosystem is greatly weakened, then the incentive to become L2 (rather than becoming an independent L1) will be weakened.

3. It will take a long time for L2 to achieve exactly the same security guarantees as L1.

4. If L2 fails (for example, due to malicious behavior or disappearance of the operator), users will still need to go through L1 to recover their assets. Therefore, L1 needs to be powerful enough to actually handle the highly complex and messy final work of L2 at least occasionally.

For these reasons, it is very valuable to continue to expand L1 itself and ensure that it can continue to accommodate more and more use cases.

What is it? How does it work?

The simplest way to scale is to directly increase the gas limit. However, this risks centralizing L1 and thereby weakening the very property that makes Ethereum L1 so powerful: its credibility as a robust base layer. How far a simple gas limit increase can sustainably go is still debated, and the answer depends on which other techniques are implemented to make larger blocks easier to verify (e.g., history expiration, statelessness, L1 EVM validity proofs). Another thing that needs continuous improvement is the efficiency of Ethereum client software, which is already far more efficient today than it was five years ago. An effective L1 gas limit increase strategy will involve accelerating the development of these verification techniques.

1. EOF: a new EVM bytecode format that is more friendly to static analysis and allows for faster implementations. Given these efficiency gains, EOF bytecode could be priced with lower gas costs.

2. Multi-dimensional gas pricing: setting separate base fees and limits for compute, data, and storage can increase the average capacity of Ethereum L1 without increasing its maximum capacity (thereby avoiding new security risks).

3. Reduce gas costs for specific opcodes and precompiles: historically, we have repeatedly increased the gas costs of certain underpriced operations to avoid denial-of-service attacks. What has been done far less is reducing the gas costs of overpriced operations. For example, addition is much cheaper than multiplication, yet the ADD and MUL opcodes currently cost the same. We could make ADD cheaper, and even simpler opcodes such as PUSH cheaper still. EOF as a whole is more optimized in this regard.

4. EVM-MAX and SIMD: EVM-MAX ("modular arithmetic extensions") is a proposal to allow more efficient native modular math on large numbers as a separate module of the EVM. Values computed by EVM-MAX can only be accessed by other EVM-MAX opcodes unless deliberately exported; this leaves more room to store these values in optimized formats. SIMD ("single instruction, multiple data") is a proposal to allow the same instruction to be executed efficiently over an array of values. Together, the two can create a powerful co-processor alongside the EVM that can be used to implement cryptographic operations much more efficiently. This is particularly useful for privacy protocols and for L2 proof systems, so it would help both L1 and L2 scaling.
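As an illustration of the multi-dimensional pricing idea in item 2, here is a minimal sketch (not EIP-7706's actual specification) that simply applies the familiar EIP-1559 base-fee update rule to each resource independently; the resource names and target numbers are made-up example values:

```python
# Sketch of multi-dimensional gas pricing: each resource gets its own
# EIP-1559-style base fee that adjusts toward a per-resource target,
# instead of one shared gas price covering everything.

BASE_FEE_MAX_CHANGE_DENOMINATOR = 8  # same damping factor EIP-1559 uses

def update_base_fee(prev_fee: int, used: int, target: int) -> int:
    """Standard EIP-1559 update rule applied to a single resource."""
    if used == target:
        return prev_fee
    delta = prev_fee * abs(used - target) // target // BASE_FEE_MAX_CHANGE_DENOMINATOR
    if used > target:
        return prev_fee + max(delta, 1)
    return max(prev_fee - delta, 0)

def update_all(prev_fees: dict, usage: dict, targets: dict) -> dict:
    # Each resource is priced independently: a block stuffed with calldata
    # raises only the data fee, leaving the compute fee untouched.
    return {r: update_base_fee(prev_fees[r], usage[r], targets[r]) for r in prev_fees}

fees = {"compute": 1000, "data": 1000, "storage": 1000}
usage = {"compute": 15_000_000, "data": 800_000, "storage": 100}
targets = {"compute": 15_000_000, "data": 400_000, "storage": 200}
new_fees = update_all(fees, usage, targets)
# compute hit its target exactly, so its fee is unchanged (1000);
# data was over target, so its fee rises (1125);
# storage was under target, so its fee falls (938).
```

The key property this sketch demonstrates is the one claimed above: average capacity can rise (each resource is priced to its own demand) without raising the worst-case block, because each resource also keeps its own hard limit.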

These improvements will be discussed in more detail in future Splurge articles.
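To make the EVM-MAX/SIMD idea above concrete, here is a toy model in Python. All class and method names are hypothetical, not taken from any EIP; the point is just the two properties described above: values live in a separate register space until explicitly exported, and a single operation applies element-wise across a whole array.

```python
# Toy model of the EVM-MAX + SIMD idea. Values sit in a register space
# opaque to ordinary EVM code (so the implementation is free to store them
# in an optimized internal form), and each operation is SIMD-style:
# one call performs element-wise modular arithmetic over an array.

class ModMaxRegisters:
    def __init__(self, modulus: int):
        self.modulus = modulus
        self._regs = {}  # hidden from "ordinary EVM" code in this model

    def load(self, name: str, values: list) -> None:
        self._regs[name] = [v % self.modulus for v in values]

    def add(self, dst: str, a: str, b: str) -> None:
        # One "instruction": element-wise modular addition over the array.
        m = self.modulus
        self._regs[dst] = [(x + y) % m for x, y in zip(self._regs[a], self._regs[b])]

    def mul(self, dst: str, a: str, b: str) -> None:
        m = self.modulus
        self._regs[dst] = [(x * y) % m for x, y in zip(self._regs[a], self._regs[b])]

    def export(self, name: str) -> list:
        # The only way values become visible outside the module.
        return list(self._regs[name])

# Example: element-wise math modulo a small prime.
regs = ModMaxRegisters(modulus=97)
regs.load("a", [10, 20, 30])
regs.load("b", [90, 80, 70])
regs.add("c", "a", "b")   # each lane computes (x + y) % 97 = 100 % 97 = 3
```

A real design would use a cryptographically relevant modulus (e.g. a BLS12-381 field prime) and Montgomery-form storage internally; the toy keeps the numbers small so the behavior is easy to follow.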

Finally, the third strategy is native Rollups (or enshrined rollups): essentially, creating many copies of the EVM running in parallel, resulting in a model equivalent to what Rollup can provide, but more natively integrated into the protocol.

What are the links to existing research?

1. Polynya's Ethereum L1 expansion roadmap: https://polynya.mirror.xyz/epju72rsymfB-JK52_uYI7HuhJ-W_zM735NdP7alkAQ

2. Multi-dimensional Gas Pricing: https://vitalik.eth.limo/general/2024/05/09/multidim.html

3. EIP-7706: https://eips.ethereum.org/EIPS/eip-7706

4. EOF: https://evmobjectformat.org/

5. EVM-MAX: https://ethereum-magicians.org/t/eip-6601-evm-modular-arithmetic-extensions-evmmax/13168

6. SIMD: https://eips.ethereum.org/EIPS/eip-616

7. Native rollups: https://mirror.xyz/ohotties.eth/P1qSCcwj2FZ9cqo3_6kYI4S2chW5K5tmEgogk6io1GE

8. Max Resnick on the value of extending L1 in an interview: https://x.com/BanklessHQ/status/1831319419739361321

9. Justin Drake on using SNARKs and native Rollups to scale: https://www.reddit.com/r/ethereum/comments/1f81ntr/comment/llmfi28/

What else needs to be done, and what are the tradeoffs?

There are three strategies for L1 scaling, which can be pursued separately or in parallel:

1. Improve technology (e.g. client code, stateless clients, history expiration) to make L1 easier to verify, and then increase the gas limit.

2. Reduce the cost of specific operations, increasing average capacity without increasing worst-case risk.

3. Native Rollups (i.e., creating N parallel copies of the EVM).

Each of these techniques comes with different trade-offs. For example, native Rollups share many of the composability weaknesses of ordinary Rollups: you cannot send a single transaction that synchronously executes operations across multiple Rollups, as you can across contracts on the same L1 (or L2). Raising the gas limit would undermine other benefits achievable by making L1 easier to verify, such as increasing the percentage of users running verifying nodes and increasing the number of solo stakers. And depending on the implementation, making specific operations in the EVM (Ethereum Virtual Machine) cheaper could increase the overall complexity of the EVM.

A big question that any L1 scaling roadmap needs to answer is: what is the ultimate vision for L1 and L2? Obviously, it would be ridiculous to put everything on L1: potential use cases could involve hundreds of thousands of transactions per second, which would make L1 completely unverifiable (unless we go the native Rollup route). But we do need some guiding principles to ensure that we don't get into a situation where a 10x increase in the gas limit severely harms the decentralization of Ethereum L1.

A view of the division of labor between L1 and L2

How does this interact with the rest of the roadmap?

Bringing more users to L1 means improving not just scale but other aspects of L1 as well. It means more MEV will remain on L1 (rather than being only an L2 issue), making the need to handle MEV explicitly all the more urgent. It greatly increases the value of fast slot times on L1. And it relies heavily on verification of L1 (the Verge) going smoothly.
