On October 31, 2023, TON (formerly Telegram Open Network) set a new world record in its first public live performance test, reaching a peak of 104,715 transactions per second and completing a total of 107,652,545 transactions in 25 minutes. According to these results, which were verified and confirmed by CertiK, TON is the fastest and most scalable blockchain in the world, exceeding the processing speed of all L1 blockchains and of well-known centralized payment networks such as PayPal, Visa and Mastercard.

TON is undoubtedly an eye-catching project. This article will analyze the TON white paper in depth, revealing its unique technical features and innovations, and why TON can become the fastest blockchain in the world.

Scaling Problems

Scalability has always been a major problem in the development of blockchain technology. Scaling solutions mainly aim to raise system throughput and lower transaction fees, so that a blockchain network can handle more transactions and support large-scale applications. Although public chains keep experimenting with new consensus mechanisms and architectures, the results so far remain unsatisfactory. This has become a bottleneck on blockchain's path to large-scale adoption, and it makes it hard to carry the vision of serving a billion Telegram users. The current mainstream scaling solutions fall into the following categories:

Sharding: Split the network into multiple smaller parts (shards). Each shard processes transactions and smart contracts in parallel, which significantly improves network throughput. However, sharding raises potential security issues, because each shard may be less secure than the network as a whole, and cross-shard communication is a technical challenge. Representative examples: the original Ethereum 2.0 design and NEAR's Nightshade protocol.

Sidechains: A sidechain is a blockchain that runs independently of the main chain and can have its own consensus mechanism and block parameters. Through a sidechain, users can transfer assets between the two chains, offloading work from the main chain. Representative example: Polygon.

Layer 2 solutions: By building another layer on top of the main chain, an L2 can provide faster transaction confirmation and lower transaction fees. The best-known L2s, Optimism and Arbitrum, are both scaling solutions designed specifically for Ethereum, so part of their architecture lives on Layer 1. With Ethereum's upgrades, their upper limit has risen from the original 2,000-4,000 transactions per second (TPS) to about 20,000 TPS.

zkSync 2.0: Compared with zkSync 1.0, which was limited to a few hundred TPS, zkSync 2.0 brings significant improvements. The zkSync team claims that version 2.0 can reach an upper limit of 100,000 TPS, but most institutions estimate that its true upper limit is closer to 10,000-20,000 TPS.

Starknet: Since completing the Quantum Leap upgrade in June, its throughput has exceeded 100 TPS.

Solana: Solana uses an innovative consensus algorithm called Proof of History (PoH) as the core of its scaling approach. Although Solana claims a throughput of 65,000 TPS, most of that capacity is in fact consumed by communication between nodes, and the actual transaction volume may be only 6,000-8,000 TPS. Moreover, because of its centralized consensus design, Solana has suffered multiple outages under heavy request loads, such as during NFT mints. In addition, Solana has not yet successfully implemented rotation of its central nodes.

The TON blockchain was designed by the founder and core team of Telegram. Telegram is one of the most popular social platforms in the world, with nearly 900 million monthly active users. It combines a high degree of security and privacy with a stable, smooth user experience, and tens of billions of messages are transmitted through the app every day. Although the concept of web3 is relatively well known, native crypto users are still a minority, and most people rely on centralized exchanges to access tokens. MetaMask, the world's most popular decentralized crypto wallet, currently has only 30 million monthly active users. From the beginning, TON has been designed to serve billions of users, not just a few web3 geeks.

Unlimited Sharding Paradigm

Sharding is a concept from database design. It refers to splitting a large logical data set across multiple databases that share nothing with one another and that can be distributed over multiple servers. In simple terms, sharding provides the ability to scale horizontally, allowing data to be broken down into independent parts that can be processed in parallel.

TON is not the first project to bring sharding to blockchain. For example, Ethereum 2.0 once announced a fixed set of 64 shards but abandoned the plan because of its difficulty. NEAR's Nightshade protocol currently has 4 shards and plans to reach 100 shards next year.

Unlike traditional sharding approaches, TON adopts a strategy of unlimited sharding.

TON's approach is considered advanced not because it has more shards, but because of two unique features:

  • The number of shards is not fixed: TON supports adding shards based on business needs, up to a maximum of 2^32 workchains, each of which can be split into as many as 2^60 shard chains, which is a nearly unlimited number.

  • The number of shards is elastic: TON can automatically split shards when the system load is high and merge them when the load decreases. This is a very effective strategy to cope with dynamic expansion needs.

Currently, TON consists of two chains: the masterchain, used for synchronization and governance, and a workchain, used for smart contracts. Beneath the workchain sit the shard chains and, at the lowest level, the virtual account chains.

A workchain can be divided into N shards (from 1 to 256). Each shard has its own validator group, which is responsible for executing the transactions in its own shard while continuously downloading blocks from all the other shards of its workchain. In general, a blockchain is a series of blocks that record changes to its state. In a PoS blockchain, validators first agree on how to change the blockchain state by assembling a block that contains a list of changes. They then vote on this block and, if enough votes are collected, apply the block to the blockchain state and move on to the next block.

The throughput of a single block thread is very limited, because validators must check every transaction in a block before agreeing to accept it. TON therefore has many threads, which can be thought of as mini-blockchains. They run in parallel, and each has its own set of validators.

Masterchain

The masterchain is the main block thread in TON. It is used to synchronize all the other block threads and to recalculate the validator set. When the validators of a thread agree on a new block, they sign it and register it in the masterchain. The masterchain validators do not verify the validity of that block; they only check that it is signed by the appropriate validators. This is why many threads can coexist in parallel. Contracts in different threads communicate with each other by sending messages.

Workchain

Workchains are independent address spaces that can run according to their own rules. For example, they may use different virtual machines, or longer block times with higher gas limits. Most importantly, all workchains must use the same message queue format so that they can exchange messages. Because those messages carry the network token, all workchains must also offer roughly the same security guarantees. Two workchains are currently active: the masterchain and the first processing workchain. The workchain an account belongs to is determined by its address prefix:

  • -1:ax...1s2 is the address of an account in the masterchain; -1 is the masterchain prefix.

  • 0:zx...123 is the address of an account in the first workchain; 0 is the prefix of the first processing workchain.
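The prefix rule can be sketched in a few lines of Python. This is only an illustration: the helper names split_raw_address and chain_name are hypothetical and do not come from any TON library, and the account part is treated as an opaque string.

```python
from typing import NamedTuple

MASTERCHAIN_ID = -1      # prefix of the masterchain, as stated in the text
FIRST_WORKCHAIN_ID = 0   # prefix of the first processing workchain


class RawAddress(NamedTuple):
    workchain: int
    account: str


def split_raw_address(address: str) -> RawAddress:
    """Split 'workchain:account' at the first colon (hypothetical helper)."""
    prefix, sep, account = address.partition(":")
    if not sep or not account:
        raise ValueError(f"expected 'workchain:account', got {address!r}")
    return RawAddress(int(prefix), account)


def chain_name(address: str) -> str:
    """Name the chain an address belongs to, based only on its prefix."""
    workchain = split_raw_address(address).workchain
    if workchain == MASTERCHAIN_ID:
        return "masterchain"
    if workchain == FIRST_WORKCHAIN_ID:
        return "first workchain"
    return f"workchain {workchain}"
```

Calling chain_name on an address that begins with -1: reports the masterchain, and one that begins with 0: reports the first workchain; nothing after the colon is inspected.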

Shard Chain

A processing thread, or shard chain, is an independent thread that processes blocks within a workchain. By default, workchain 0 has a single thread, that is, a single shard chain. The validators of this thread accept external messages and process internal messages sent from within the thread or from other workchains. If a thread has been overloaded for the last N blocks, it is split: one thread becomes two, and their transactions are processed in parallel.

After such a split, accounts whose addresses start with 0:00.. through 0:7F.. are in thread 1, and accounts whose addresses start with 0:80.. through 0:FF.. are in thread 2. Since all smart contracts communicate with each other asynchronously, nothing breaks, and throughput doubles. When the load drops, the threads merge back after a while. If the load keeps increasing, each of the two threads can split again, and so on. The masterchain always has only one thread.
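Splitting and merging can be sketched as follows. The sketch assumes that a thread is identified by a binary prefix of the account identifier, so that a split sends accounts whose next bit is 0 to one thread and accounts whose next bit is 1 to the other. All function names are hypothetical, and the load measurement that triggers a split or merge is left out.

```python
from typing import List, Tuple


def account_bits(account_hex: str) -> str:
    """Binary expansion of a hex account id, 4 bits per hex digit."""
    return "".join(f"{int(digit, 16):04b}" for digit in account_hex)


def split(prefix: str) -> Tuple[str, str]:
    """Split one thread into two by extending its prefix with one bit."""
    return prefix + "0", prefix + "1"


def merge(left: str, right: str) -> str:
    """Merge two sibling threads back into their parent."""
    siblings = (
        len(left) == len(right) > 0
        and left[:-1] == right[:-1]
        and left[-1] != right[-1]
    )
    if not siblings:
        raise ValueError("only sibling threads can be merged")
    return left[:-1]


def shard_for(account_hex: str, shards: List[str]) -> str:
    """Return the thread whose prefix matches the account's leading bits."""
    bits = account_bits(account_hex)
    for prefix in shards:
        if bits.startswith(prefix):
            return prefix
    raise LookupError("no thread covers this account")


# One thread (empty prefix) covers every account until it is split.
shards = list(split(""))            # ["0", "1"]
low, high = split(shards.pop())     # split thread "1" again under load
shards += [low, high]               # ["0", "10", "11"]
```

Because each contract lives in exactly one thread at any moment, a split only changes which validators process it; the contract's own state is never divided.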

A block in TON is not just a list of transactions that need to be completed to effect a state change. Instead, a block is:

  • A list of messages that trigger transactions and are thereby removed from the incoming queue.

  • New messages that enter the outgoing queue as a result of message processing.

  • The changes in smart contract state caused by message processing.

That is, for a validator of shard X to keep track of the current state of shard Y, it does not need to execute all the transactions in shard Y's block. It simply downloads that block and applies the summarized changes to the message queues and smart contract state.
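A rough sketch of this idea, under heavy simplification: a block is modeled as a record of queue and state changes, and a validator of another shard replays those recorded effects instead of re-executing transactions. The Block and ShardView types and their field names are illustrative assumptions, not TON's actual block format.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Block:
    consumed: List[str]             # messages removed from the incoming queue
    produced: List[str]             # new messages added to the outgoing queue
    state_changes: Dict[str, int]   # contract address -> new state value


@dataclass
class ShardView:
    """What a validator of shard X keeps about another shard Y."""
    in_queue: List[str] = field(default_factory=list)
    out_queue: List[str] = field(default_factory=list)
    state: Dict[str, int] = field(default_factory=dict)

    def apply(self, block: Block) -> None:
        """Replay the block's recorded effects; no transaction is executed."""
        for message in block.consumed:
            self.in_queue.remove(message)
        self.out_queue.extend(block.produced)
        self.state.update(block.state_changes)


view = ShardView(in_queue=["m1", "m2"], state={"contract_a": 1})
view.apply(Block(consumed=["m1"], produced=["m3"], state_changes={"contract_a": 2}))
```

After the call, the view holds the remaining incoming message, the new outgoing message, and the updated contract state, all without running any contract code.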

Fundamentally changing the blockchain world cannot be done without a cost. To take advantage of this radical approach, TON smart contract developers must design their contracts differently. The basic atomic unit of the TON blockchain is the smart contract. A smart contract has an address, code, and data unit (persistent state). This unit is called an atomic unit because a smart contract always has atomic, synchronized access to all of its persistent state.

Hypercube Network Routing

TON has designed a unique intelligent routing mechanism to ensure that transactions between any two blockchains are always processed quickly, no matter how large the system becomes. The time needed to send a message between TON blockchains grows only logarithmically with the number of chains, so even a system that has expanded to millions of chains can still communicate quickly.

In the TON blockchain, Instant Hypercube Routing and Slow Routing are two routing mechanisms used to process cross-chain transactions.

Instant Hypercube Routing: TON proposes a way to speed up message routing so that cross-chain transactions complete in a very short time. In ordinary (slow) hypercube routing, a message travels from its source shard chain along the hypercube network to the destination shard chain. While the message is still in transit, however, the validators of the destination shard chain can choose to process it in advance and include it in a block. They then provide a Merkle proof (a receipt), which is sent back to destroy the copy of the message that is still in transit. The hypercube itself is a high-dimensional cube in which each chain is mapped to a vertex and the distance between two chains is the number of hops between their vertices, so messages can be routed along the shortest path. With instant routing, cross-chain transactions can complete within seconds without waiting for block confirmation.

Slow Routing: Slow routing is the more traditional way of processing cross-chain transactions: the transaction is passed step by step from the source chain to the target chain. It is first packaged into a block on the source chain and then transferred to the target chain through a relayer. The validators of the target chain verify the transaction and package it into a block of the target chain. Compared with instant routing, slow routing offers higher security and decentralization, because every cross-chain transaction goes through the complete block confirmation process. Much as a TCP/IP network addresses a packet by its destination IP address, the message is addressed by its destination, which ensures that it is propagated to the destination chain reliably and in order. For a shard chain hypercube network of size N, the number of intermediate shard chains a message must pass through is log16(N) - 1. Since log16(1,000,000) is just under 5, only about 4 routing nodes (intermediate shard chains) are needed to support a million shard chains.
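The log16(N) - 1 estimate is easy to check numerically. The sketch below assumes that every shard chain gets an identifier with just enough hexadecimal digits to tell N chains apart, and it rounds up to whole digits with integer arithmetic; the function names are illustrative.

```python
def hex_digits_needed(num_shards: int, base: int = 16) -> int:
    """Smallest d such that base**d >= num_shards (integer arithmetic only)."""
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    digits, capacity = 0, 1
    while capacity < num_shards:
        capacity *= base
        digits += 1
    return digits


def intermediate_hops(num_shards: int) -> int:
    """Worst-case number of intermediate shard chains: log16(N) - 1, rounded up."""
    return max(0, hex_digits_needed(num_shards) - 1)


for n in (16, 256, 65_536, 1_000_000):
    print(n, intermediate_hops(n))
```

For one million shard chains this gives 4 intermediate chains, since 16^5 = 1,048,576 is the first power of 16 that reaches a million.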

Why is it designed like this?

A distributed system needs validator nodes. If the system is very large, with tens of thousands of nodes, it becomes too heavy to scale. After sharding, each shard (shard0, shard1, and so on) has its own set of validators, and the shards must communicate with one another. Messages go from one shard to another, so there must be a routing mechanism between shards. The connections between shards form a route that hops through intermediate shards, and each hop adds roughly one block time to the delivery time.

If every shard had to pass messages directly to every other shard, the required computing power and network bandwidth would grow with the total number of shard chains and limit the scalability of the system. Therefore, messages are not passed directly from any shard to all other shards. Instead, each shard is "connected" only to the shards whose (w, s) shard identifier differs from its own in exactly one hexadecimal digit. In this way, all shard chains form a "hypercube" graph, and messages are passed along the edges of this hypercube.

If a message is addressed to a shard other than the current one, one hexadecimal digit of the current shard's identifier (chosen deterministically) is replaced by the corresponding digit of the target shard's identifier. The resulting identifier becomes the approximate target, that is, the shard to which the message is forwarded next.
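The forwarding rule can be sketched like this. Two simplifications are assumed: shard identifiers are plain hexadecimal strings of equal length (rather than full (w, s) pairs), and the "deterministic" choice is taken to be the leftmost digit that differs. Neither detail is specified by the text above, and the function names are hypothetical.

```python
from typing import List


def next_hop(current: str, target: str) -> str:
    """Replace the leftmost differing hex digit of current with target's digit."""
    if len(current) != len(target):
        raise ValueError("identifiers must have the same length")
    for i, (c, t) in enumerate(zip(current, target)):
        if c != t:
            return current[:i] + t + current[i + 1:]
    return current  # already at the destination


def route(source: str, target: str) -> List[str]:
    """Full path of shard identifiers from source to target, inclusive."""
    path = [source]
    while path[-1] != target:
        path.append(next_hop(path[-1], target))
    return path


path = route("0a3f", "1a30")   # the two identifiers differ in two digits
```

Each hop fixes exactly one digit, so the path length is bounded by the number of digits in the identifier, which is where the logarithmic growth comes from.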

The main advantage of hypercube routing lies in its block validity condition: validators who create shard chain blocks must collect and process the messages in the output queues of "neighboring" shard chains, or they lose their stake. As a result, any message can be expected to reach its final destination sooner or later; messages can neither be lost in transit nor delivered twice.

Hypercube routing introduces some additional latency and cost, because messages must be forwarded through several intermediate shard chains. However, the number of intermediate shard chains grows very slowly: it is proportional to log N, the logarithm of the total number of shard chains N.

Asynchronous Communication

Smart contracts on TON communicate asynchronously and can be compared to Internet microservices. Each microservice has atomic, synchronous access only to its own local data, and communication between two microservices involves sending asynchronous messages over the network.

In system architecture, larger systems often need to be built as microservices. This distributed approach involves some trade-offs, but it can bring user experience benefits. Modern systems management relies on orchestrators such as Kubernetes, which take a set of containerized microservices, automatically start new instances on demand (autoscaling), and distribute them efficiently across machines.

In terms of the Kubernetes analogy, this is exactly what TON does. As the load on a particular shard chain increases, the chain is split in two. Since smart contracts are atomic, they can never be split in half. This means that smart contracts that were once on the same shard chain may one day find themselves on different shard chains.

In effect, TON's virtual machine (TVM) applies the concept of distributed microservices to the overall smart contract architecture familiar from Ethereum's EVM.
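The microservice analogy can be made concrete with a toy message loop. In this sketch each contract touches only its own state while handling a message, and any reply is queued for later delivery instead of being executed as a direct call. PingPongContract, Message, and run are illustrative names, not TVM or TON SDK constructs.

```python
from collections import deque
from typing import Deque, Dict, List


class Message:
    def __init__(self, src: str, dest: str, body: str) -> None:
        self.src, self.dest, self.body = src, dest, body


class PingPongContract:
    """Hypothetical contract that only ever modifies its own local state."""

    def __init__(self, address: str) -> None:
        self.address = address
        self.received: List[str] = []   # persistent state local to this contract

    def handle(self, msg: Message) -> List[Message]:
        self.received.append(msg.body)  # atomic, synchronous local update
        if msg.body == "ping":
            # The reply is a new asynchronous message, not a function call.
            return [Message(self.address, msg.src, "pong")]
        return []


def run(contracts: Dict[str, PingPongContract], queue: Deque[Message]) -> int:
    """Deliver queued messages one at a time; return the number delivered."""
    delivered = 0
    while queue:
        msg = queue.popleft()
        queue.extend(contracts[msg.dest].handle(msg))
        delivered += 1
    return delivered


a, b = PingPongContract("A"), PingPongContract("B")
steps = run({"A": a, "B": b}, deque([Message("A", "B", "ping")]))
```

Because neither contract reads or writes the other's state directly, the two could sit on different shard chains without any change to their code; only the delivery delay would differ.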

State Decentralization

State sharding is the most complex and challenging form of sharding. The entire database is divided and placed on different shards, and each shard stores only its own data rather than the state of the entire blockchain.

In TON's sharding design, all services are implemented as smart contracts, and the state data of each smart contract is stored only in the shard it belongs to, which realizes state sharding.

Moreover, TON contracts follow an implementation path that is unique in the industry: each user manages token state in his or her own contract, which truly decentralizes the blockchain state. The example that follows illustrates the principle behind this design.

First, you need to understand the Wallet contract and the Jetton wallet contract. The Wallet contract is a user-specific smart contract that manages the user's tokens on the TON blockchain; these tokens can be used to pay network fees and execute smart contracts. The Jetton (from the Russian word for "token") wallet contract is a special wallet contract that manages the user's Jetton tokens. Each user has his or her own Wallet contract and Jetton wallet contract, which act as the user's digital wallet for storing and managing tokens. These contracts can also interact with other users' contracts to carry out decentralized asset transfers and transactions.

Now suppose that user A and user B each have their own Wallet contract, and user A wants to transfer a certain amount of tokens to user B. User A's Wallet contract interacts with user B's Wallet contract to transfer the tokens. The entire process does not rely on a centralized contract; it is carried out by two decentralized contracts.
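The transfer described above can be sketched with two independent wallet objects that exchange a message. This is a deliberately simplified model: WalletContract and TransferMessage are hypothetical names, and real wallet contracts perform checks (signatures, fees, sender validation) that are omitted here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransferMessage:
    sender: str
    recipient: str
    amount: int


class WalletContract:
    """Hypothetical per-user contract that holds only its owner's balance."""

    def __init__(self, owner: str, balance: int = 0) -> None:
        self.owner = owner
        self.balance = balance

    def send(self, recipient: str, amount: int) -> TransferMessage:
        """Debit this wallet and emit a message for the recipient's wallet."""
        if amount <= 0 or amount > self.balance:
            raise ValueError("invalid transfer amount")
        self.balance -= amount
        return TransferMessage(self.owner, recipient, amount)

    def receive(self, msg: TransferMessage) -> None:
        """Credit this wallet when a transfer message addressed to it arrives."""
        if msg.recipient != self.owner:
            raise ValueError("message addressed to another wallet")
        self.balance += msg.amount


wallet_a = WalletContract("A", balance=100)
wallet_b = WalletContract("B")
message = wallet_a.send("B", 30)   # user A's contract debits itself
wallet_b.receive(message)          # user B's contract credits itself
```

No shared ledger object appears anywhere in the sketch: the only state that exists is the balance inside each user's own contract, and the total is preserved by the message passed between them.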

Because each user of the TON blockchain has their own contract to manage the state of their assets, no single centralized contract bears the risk of managing all assets. This improves the decentralization of the system and reduces the risk of a single point of failure: each user's asset state is managed by a dedicated contract, so an attacker cannot affect the entire system by attacking one centralized contract. Asset transactions between users are executed automatically by smart contracts, which avoids the risk of manual operation. Users can also customize their own Wallet contract and Jetton wallet contract to support more functions and application scenarios, which gives them greater flexibility and autonomy. Finally, scalability improves: as the number of users grows, the number of contracts grows with it, but this does not put much pressure on the system as a whole, because each contract runs independently.

The above is my analysis of the scalability of the TON blockchain and the technical architecture of the white paper. I would like to thank Dr. Awesome Doge for editing the first draft. I would like to thank the Russian and Ukrainian development teams for their perseverance and hard work. Finally, I would like to thank Mr. Nikolai Durov, the founder of Telegram, for his great design many years ago. All of this is for the glory of the human mind.