Introduction

Zero-knowledge proofs, especially zk-SNARKs (Zero-Knowledge Succinct Non-Interactive Arguments of Knowledge), are among the most important cutting-edge technologies in Web 3. Currently, most of the media and investment attention in this subfield is focused on zk-Rollups, a scaling solution for L1 blockchains such as Ethereum. However, zk-Rollups are far from the only use of zk-SNARKs. This article provides an in-depth analysis of zero-knowledge assembly code (zk-ASM), evaluates its use cases in zk-Rollups and beyond, and explores, at a theoretical level, its potential to reinvent the Internet.

Technical principles

As the name suggests, zk-ASM consists of two technical components: ZK and ASM. ZK refers to zk-SNARKs, which are succinct non-interactive arguments of knowledge, and ASM refers to assembly code. To understand the potential of zk-ASM, we must first understand the theoretical basis of these two seemingly obscure concepts.

zk-SNARKs

zk-SNARKs are the crown jewel of zero-knowledge proofs: they are a succinct way to prove that a statement is true without revealing any information about the data behind it. For example, suppose someone declares "I know an m such that C(m) = 0," where m is a gigabyte of information and C is a function. The zk-SNARK is a short proof (far smaller than m itself) that can be verified quickly and convinces the verifier that the prover knows such an m, without exposing any information about m (other than public information).

What exactly is this "C(m)", and what is it for? The function is an arithmetic circuit: a directed acyclic graph (DAG) representation of the specific function we want to execute, as shown in the figure. Essentially, "m" is the input to the circuit, and the individual "nodes" in the circuit are logic gates or arithmetic operations. For example, "2" and "3" can be fed into a "+" node, which then outputs "5" to the next operator. In this way, any arithmetic or logical operation can be encoded in an "arithmetic circuit".
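To make the idea of a circuit as a DAG more concrete, here is a minimal Python sketch. The `Gate` class, the `evaluate` function, and the example circuit C(a, b, c) = (a + b) * c - 15 are illustrative assumptions, not part of any zk-SNARK library. The sketch only evaluates the circuit; it produces no proof.

```python
# Toy arithmetic circuit: a DAG whose nodes are inputs or gates.
# All names here (Gate, evaluate, CIRCUIT) are illustrative assumptions.
from dataclasses import dataclass
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


@dataclass(frozen=True)
class Gate:
    op: str     # one of "+", "-", "*"
    left: str   # name of the node feeding the left wire
    right: str  # name of the node feeding the right wire


def evaluate(circuit, inputs, output):
    """Evaluate the node `output`, computing the nodes it depends on first."""
    values = dict(inputs)  # wire name -> value

    def value_of(name):
        if name not in values:
            gate = circuit[name]
            values[name] = OPS[gate.op](value_of(gate.left), value_of(gate.right))
        return values[name]

    return value_of(output)


# C(a, b, c) = (a + b) * c - 15; the constant 15 is supplied as wire "k".
CIRCUIT = {
    "sum": Gate("+", "a", "b"),
    "prod": Gate("*", "sum", "c"),
    "out": Gate("-", "prod", "k"),
}

result = evaluate(CIRCUIT, {"a": 2, "b": 3, "c": 3, "k": 15}, "out")
print(result)  # 0, so (2, 3, 3) satisfies C(m) = 0
```

Here "2" and "3" enter the "+" node, the resulting "5" flows into the "*" node, and the final output is 0 exactly when the input satisfies the statement being proved.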

Once the code we want to prove is represented as an arithmetic circuit, we can start to build the zk-SNARK. Fundamentally, the feasibility of zk-SNARKs rests on a basic algebraic fact, referred to here as the "Fundamental Theorem of Algebra": a nonzero polynomial of degree "d" has at most "d" roots. As a consequence, two different polynomials of degree at most "d" can agree at no more than "d" points, so comparing them at a randomly chosen point exposes a false claim with high probability. The mathematical trick is a two-step process: (1) convert the function "C(m)" that needs to be proved into polynomials and work with those polynomials from then on, and (2) use the root bound above to check the polynomials and produce a concise proof. In technical terms, the first part is called the "Polynomial Commitment Scheme" (PCS) and the second part is called the "Polynomial Interactive Oracle Proof" (PIOP).

The composition of an efficient universal circuit SNARK. Source: https://cs251.stanford.edu/lectures/lecture15.pdf
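The root bound can be checked numerically. The following Python sketch is an illustration only: the polynomials, the small prime 101, and the large prime 2^61 - 1 are arbitrary choices made for this example, and a real SNARK would use a polynomial commitment instead of evaluating the polynomials in the clear.

```python
import random


def poly_eval(coeffs, x, p):
    """Horner evaluation of sum(coeffs[i] * x**i) modulo the prime p."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc


def agreement_points(f, g, p):
    """All x in the field of size p where f and g take the same value."""
    return [x for x in range(p) if poly_eval(f, x, p) == poly_eval(g, x, p)]


def probably_equal(f, g, p, rng):
    """Compare f and g at a single random point of the field."""
    r = rng.randrange(p)
    return poly_eval(f, r, p) == poly_eval(g, r, p)


P_SMALL = 101          # small prime, so every point can be enumerated
P_BIG = 2**61 - 1      # large prime, so a random collision is very unlikely

f = [1, 2, 3, 4]       # 1 + 2x + 3x^2 + 4x^3
g = [-5, 13, -3, 5]    # f(x) + (x - 1)(x - 2)(x - 3)

points = agreement_points(f, g, P_SMALL)
print(points)  # [1, 2, 3]: two distinct cubics agree at no more than 3 points

# Over the large field, a single random evaluation separates f from g
# unless the random point happens to be one of the (at most 3) agreement points.
print(probably_equal(f, g, P_BIG, random.Random(0)))
```

Because the two cubics agree at no more than three of the roughly 2^61 field elements, one random evaluation is enough to tell them apart with overwhelming probability. This is the kind of check the second step relies on.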

The specific implementation of PCS and PIOP is beyond the scope of this article, but this gives us a rough sketch of the core steps of zk-SNARK:

  1. Select a function (code function, mathematical equation, etc.) that you wish to run zk-SNARK on

  2. Encode this function into an arithmetic circuit C(m)

  3. Run PCS to get the polynomial representation of the arithmetic circuit

  4. Run PIOP and get a concise proof of log(m) size

The result is a custom zk-SNARK that can prove someone knows a certain piece of information without revealing its content.

Assembly code

The second piece of the zk-ASM puzzle is assembly code. Assembly is a family of very low-level languages whose instructions are easy for machines to read but difficult for humans to decipher. Unlike high-level languages such as Python, Java, or even C, assembly language offers only very primitive operations at the processor and register level, such as MOV (move), CMP (compare), ADD (addition), and JMP (jump). For example, a Python program that prints the digits 1 to 9 on the screen (`123456789`) takes only a line or two.
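A minimal sketch of such a program is shown here. This is one possible way to write it, and the function name `digits_line` is chosen only for illustration.

```python
def digits_line():
    """Build the string containing the digits 1 through 9."""
    return "".join(str(n) for n in range(1, 10))


print(digits_line())  # prints 123456789
```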

That’s easy to understand, right? Let’s take a look at its x86 assembly version:

The assembly version is far more laborious, and this is only a very simple operation. So why use assembly language at all? As mentioned above, although these instructions are not easily readable by humans, a program called an assembler can easily "assemble" them into machine code (bit patterns such as `110011001`) that the machine can read and execute. In comparison, although high-level languages such as Python and Java are more readable, the processor cannot directly execute programs written in them. We need a "compiler" (or, for languages such as Python and Java, an interpreter or virtual machine that is itself compiled) to translate the code we write into low-level instructions like the assembly code above, which the machine then assembles and executes. The same piece of Python or Java code runs smoothly on different processors and operating systems because this toolchain does the heavy lifting and produces instructions for the specific processor or operating system.

Because programs in every language ultimately execute as low-level instructions (assembly code, which is itself assembled into executable binaries), assembly is essentially the "mother of all languages." Now suppose that we can convert every instruction of an assembly language (such as x86 or RISC-V) into an arithmetic circuit representation. We can then provide zk-SNARK proofs for any sequence of instructions in that assembly language. In theory, this means that we can provide zk-SNARKs for any program written in a high-level language that is ultimately translated to that assembly language, such as Python or Java. This is why zk-ASM deserves careful study.
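The idea of turning each instruction into an arithmetic constraint can be sketched in Python with a toy register machine. Everything here is an assumption made for illustration: the two-instruction set, the register names, the prime modulus, and the function names. It is not how any production zk-ASM encodes its instructions, and it only checks the constraints rather than producing a succinct proof.

```python
P = 2**61 - 1  # prime modulus of the toy field (an arbitrary choice)


def step(regs, instr):
    """Execute one instruction and return the next register state."""
    op, dst, a, b = instr
    nxt = dict(regs)
    if op == "ADD":
        nxt[dst] = (regs[a] + regs[b]) % P
    elif op == "MUL":
        nxt[dst] = (regs[a] * regs[b]) % P
    else:
        raise ValueError(f"unknown instruction {op}")
    return nxt


def run(regs, program):
    """Return the execution trace: the register state before and after each step."""
    trace = [dict(regs)]
    for instr in program:
        trace.append(step(trace[-1], instr))
    return trace


def residuals(cur, nxt, instr):
    """Arithmetic constraints for one step; all are 0 iff the step is valid."""
    op, dst, a, b = instr
    if op == "ADD":
        expected = cur[a] + cur[b]
    elif op == "MUL":
        expected = cur[a] * cur[b]
    else:
        raise ValueError(f"unknown instruction {op}")
    out = [(nxt[dst] - expected) % P]                        # destination is correct
    out += [(nxt[r] - cur[r]) % P for r in cur if r != dst]  # other registers unchanged
    return out


def trace_is_valid(trace, program):
    """Check every consecutive pair of states against its instruction."""
    return all(
        r == 0
        for cur, nxt, instr in zip(trace, trace[1:], program)
        for r in residuals(cur, nxt, instr)
    )


PROGRAM = [("ADD", "r0", "r0", "r1"), ("MUL", "r0", "r0", "r2")]
trace = run({"r0": 2, "r1": 3, "r2": 4}, PROGRAM)
print(trace[-1]["r0"])                 # 20, since (2 + 3) * 4 = 20
print(trace_is_valid(trace, PROGRAM))  # True

forged = [dict(state) for state in trace]
forged[-1]["r0"] = 21                  # claim a wrong final result
print(trace_is_valid(forged, PROGRAM)) # False
```

In a real system, these per-step constraints would be compiled into a circuit and proved succinctly, so that a verifier never needs to see or replay the trace.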

Practical application

zk-EVM Rollups: Polygon zk-ASM

One of the most important applications of zk-ASM is the creation of zk-Rollups compatible with the Ethereum Virtual Machine, or zk-EVMs. A zk-EVM is very important for blockchain scalability because it allows programmers to deploy on zk-Rollup-based L2 chains with few (or no) modifications to their code. In this regard, Polygon's zk-EVM is a typical example of how zk-ASM can be used to achieve this goal.

Developers on the Ethereum L1 public chain usually use Solidity, a high-level programming language with C-like syntax. Before Solidity code runs on the L1 blockchain, it is first compiled into a series of EVM opcodes, such as ADD, SLOAD, and EQ. By default, this process does not create any zero-knowledge proof. Polygon's ingenuity was to create a way to translate every EVM opcode into its custom-written zk-ASM, which is very zk-SNARK friendly. Its L2 zk-EVM then executes the zk-ASM while building the corresponding zk-SNARK circuit in order to create a zk-SNARK proof. For example, the EVM's ADD opcode is translated into a corresponding sequence of zk-ASM instructions.
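The general pattern of lowering an opcode into simpler steps can be illustrated with a Python sketch. This is not Polygon's zk-ASM. The micro-instruction names (`POP`, `PUSH`, `ADDMOD`, `ISEQ`), the registers `A` and `B`, and the `LOWERING` table are hypothetical and exist only in this example. The one fact taken from the EVM itself is that words are 256 bits wide, so ADD wraps modulo 2^256.

```python
WORD = 2**256  # EVM words are 256 bits wide, so ADD wraps modulo 2**256

# Hypothetical lowering table: each EVM opcode becomes a fixed sequence of
# simpler register-level steps. This is NOT Polygon's actual zk-ASM.
LOWERING = {
    "ADD": [("POP", "A"), ("POP", "B"), ("ADDMOD", "A", "B"), ("PUSH", "A")],
    "EQ": [("POP", "A"), ("POP", "B"), ("ISEQ", "A", "B"), ("PUSH", "A")],
}


def run_micro(stack, micro_ops):
    """Execute hypothetical micro-instructions; the top of the stack is the list end."""
    regs = {"A": 0, "B": 0}
    stack = list(stack)
    for op, *args in micro_ops:
        if op == "POP":
            regs[args[0]] = stack.pop()
        elif op == "PUSH":
            stack.append(regs[args[0]])
        elif op == "ADDMOD":
            regs[args[0]] = (regs[args[0]] + regs[args[1]]) % WORD
        elif op == "ISEQ":
            regs[args[0]] = int(regs[args[0]] == regs[args[1]])
        else:
            raise ValueError(f"unknown micro-instruction {op}")
    return stack


def run_evm(stack, opcodes):
    """Run EVM-style opcodes by executing their lowered micro-instructions."""
    for opcode in opcodes:
        stack = run_micro(stack, LOWERING[opcode])
    return stack


print(run_evm([2, 3], ["ADD"]))            # [5]
print(run_evm([WORD - 1, 1], ["ADD"]))     # [0], the sum wraps around
print(run_evm([5, 2, 3], ["ADD", "EQ"]))   # [1], because 2 + 3 == 5
```

The point of such a design is that only the small set of low-level steps needs a circuit representation; every opcode then inherits provability from the steps it is lowered to.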

Since Polygon zk-EVM's magic happens at the assembly level, it sits two levels "below" the layer that most Ethereum developers work with, the Solidity layer. Because of this, most developers can port the EVM code they built for the Ethereum mainnet directly to Polygon zk-EVM. At the same time, Polygon zk-EVM retains the Ethereum technology stack all the way down to the opcode level, so all debugging infrastructure that relies on analyzing compiled opcodes can still be used intact. This differs from zk-EVM designs that do not provide zero-knowledge proofs at the opcode level, such as zkSync. Therefore, although Polygon designed its own assembly language and proves the execution of that language, as Buterin said, "it can still verify EVM code, it just uses different internal logic."

Beyond Rollups: zk-WASM

zk-EVM is by no means the only use case for zk-ASM. As mentioned above, assembly language is essentially the "mother of all languages", and creating zk-ASM unlocks zero-knowledge proofs for general-purpose programs written in any language that compiles to that assembly language. WebAssembly (WASM) is one of the most important emerging assembly languages. It first shipped in major browsers in 2017 and aims to improve the execution speed of web applications and to complement JavaScript (the main programming language behind the Web).

Essentially, as the Web has evolved over the years, Web applications have grown in size and complexity. This means that browsers have to compile everything written in JavaScript, often slowly, and have to repeat a complex "compile-optimize-reload" process. WebAssembly reduces the dependence on complex browser execution engines by providing an assembly language that is portable, modular, and easy to execute. Additionally, as an assembly language, WASM allows programmers to write pieces of code in languages such as C, C++, Rust, Java, or Ruby that can run directly in the browser. WASM has therefore become the technology of choice for "providing distributed serverless functions".

What role can zk-SNARKs play here? WASM is unusual in that it is a client-side technology that interacts directly with user-entered data. This often includes sensitive data such as passwords and personal information, so we need a technology that (1) ensures that the program is executed correctly, and (2) ensures that sensitive information is not leaked. zk-SNARKs address both problems and are therefore an important piece of the puzzle for WASM.

Work on zk-WASM is still in its early stages, but several projects have recently released prototype zk-SNARK circuits for WebAssembly. For example, Delphinus Lab's "ZAWA" zk-SNARK emulator provides a way to encode the instructions and semantics of the WASM virtual machine into arithmetic circuits, allowing it to produce zk-SNARK proofs. As zk-WASM circuits continue to be optimized, programs written in general-purpose languages such as C, C++, Rust, and Ruby will be able to adopt the zero-knowledge proof paradigm.

Conclusion

This article explored the theoretical basis of zk-ASM and examined two paradigmatic use cases: Polygon's use of zk-ASM to create a zk-EVM at the opcode level, and the application of zk-SNARKs to WebAssembly to create zk-WASM. Ultimately, zk-ASM promises to combine the interoperability and scale of Web 2 with the trustlessness and security of Web 3.

On the one hand, blockchains increasingly seek to break through their current throughput bottlenecks; on the other hand, Web 2 approaches are increasingly criticized for failing to adequately protect user data and privacy. As programmers become able to use Web 3 design paradigms in Web 2 code and to bring Web 2 languages and code to the blockchain, universal zk-ASM is expected to become a meeting point between the Web 2 and Web 3 worlds. In this sense, zk-ASM may allow us to reimagine a more secure and trustless Internet.