Merkle trees and roots

Content
What is a Merkle tree?
How are Merkle trees constructed?
Why are Merkle roots used in Bitcoin?
Mining
Verification
Summary

What is a Merkle tree?The Merkle tree concept was proposed in the early 1980s by Ralph Merkle, a computer scientist known for his work in public key cryptography.
A Merkle tree is a structure for verifying data in a set. It is widely used in the field of peer-to-peer networks, where participants need to exchange information and subject it to independent verification.
The Merkle tree structure is based on hash functions, so we recommend that you first read the article “What is hashing?” and then return to this topic.

How are Merkle trees constructed?Let's say you download a large file. When using an open source program, you need to check if the hash of the downloaded file matches the published hash of the developers. If it matches, then the file on your computer is exactly the same as their file.
If the hashes are different, then either you downloaded a malicious file masquerading as a program, or it was downloaded incorrectly and will not work. Loading issues can also be a hassle, especially if it takes a long time. In this case, you will need to download the file again and hope that everything goes well this time.
You're probably thinking, "Is it really that complicated?" Fortunately, this is where Merkle trees come in handy, allowing us to split the file into parts. For example, a 50 GB file can be split into 100 0.5 GB chunks. In this case, it will be downloaded in parts, similar to how files are downloaded through a torrent.
The main goal of the process is to obtain a single hash called a Merkle root that represents each piece of data in a file. By using the Merkle root, we can greatly simplify data validation.
As an example, let's take an 8 GB file divided into 8 parts. Each fragment is given a name from A to H and then passed through a hash function to produce 8 different hashes.

Each of the eight fragments is passed through a hash function to generate its hash.

So, we got that out of the way. We have received a hash of all fragments, which means we can compare it with the original one and find out which one is faulty, right? It's possible, but it would be extremely ineffective. There are only eight fragments in our file, but if there are thousands of them, would you hash them all and compare the results?
Hardly. Instead, you need to take each pair of hashes, concatenate them, and hash them together. So we hash hA + hB, hC + hD, hE + hF and hG + hH and get four hashes. Then we carry out another round of hashing so that there are two hashes. Finally, we hash the remaining pair and get the main hash - the Merkle root (or root hash).

The structure resembles an upside down tree. In the bottom row there are “leaves” that go into nodes, which go into the root.

So we have a Merkle root representing the downloaded file. Now we can compare the root hash with the original creator hash. If they match, then everything is great! If the hashes are different, it means that the data has been changed, that is, one or more fragments have created a different hash. Thus, any modification of the data will produce a completely different Merkle root.
Fortunately, we can easily find the incorrect fragment. Let's say this is hE. First, query the last two hashes that created the Merkle root (hABCD and hEFGH). Your hABCD will be the same as the original because there are no errors in that segment, but your hEFGH will be different and should be checked. Next, we request hEF and hGH and compare them with ours. Since hGH will match, we need hEF. Finally, we compare the hashes hE and hF. So we found out that the incorrect fragment is hE, which means we need to re-download it.
To summarize, a Merkle tree is created by dividing data into many parts, which are then hashed repeatedly to form a Merkle root. This system makes it easy to check that every piece of data is ok. In the next section we will look at other possible uses.

Are you wondering how to get started with cryptocurrencies? Buy Bitcoin on Binance!

Why are Merkle roots used in Bitcoin?Merkle trees have many uses, but for now we are interested in their application in blockchain. Merkle trees are necessary in working with Bitcoin and many other cryptocurrencies; they are an integral part of each block and are located in the block headers. To obtain the leaves of the tree, we use the hash of each transaction (TXID) included in the block.
In this case, the Merkle root performs several tasks. Next, we will look at their use in cryptocurrency mining and transaction verification.

MiningA Bitcoin block consists of two parts. The first part is the block header, a fixed-size segment containing metadata for the block. The second part is a list of transactions, the size of which is usually much larger than the header, but can vary.
Miners need to hash data multiple times to get a result that meets certain conditions and mine a valid block. Finding it can take trillions of tries because miners need to change the random number in the block header (nonce) to get a new result, but most of the block will remain the same. There can be thousands of transactions in a block, and all of them will have to be hashed each time.
The Merkle root greatly simplifies this process. During mining, all necessary transactions are lined up in a Merkle tree. The root hash (32 bytes) is placed in the block header, after which only the block header is hashed, not the entire block.
This method is tamper-proof and effectively summarizes all transactions in a block in a compact format. However, it is impossible to find a valid block header and then change the list of transactions, since this will change the Merkle root. When a block is sent to other nodes, they calculate the root from the list of transactions. If it does not match the root in the header, the block is rejected.

VerificationLet's look at another useful property of Merkle roots, which concerns simplified nodes (which do not contain a complete copy of the blockchain). If you are running a node on a device with limited resources, then you do not necessarily need to download and hash all the transactions of a block. Instead, you can simply ask the full node for a Merkle proof—proof that your transaction is in a specific block. This method was described in detail by Satoshi Nakamoto in the Bitcoin whitepaper and is often called simplified payment verification (SPV).

To check hD, only red hashes are needed.

Let's say we need information about a transaction whose TXID is hD. Given hC, we can calculate hCD. We then need hAB to calculate hABCD. Finally, hEFGH can be used to check whether the resulting Merkle root matches the root in the block header. If yes, then this proves that the transaction was included in the block, since it is almost impossible to create the same hash with different data.
In the example above, we only hashed three times, whereas without the Merkle proof it would have to be done seven times. Since blocks can contain thousands of transactions, using Merkle proofs can save a lot of time and computing resources.

SummaryMerkle trees have proven their effectiveness in computer technology. They are incredibly useful in blockchains and allow information to be easily verified across distributed systems without overloading the network with unnecessary data.
Without Merkle trees (and Merkle roots), the blocks of Bitcoin and other cryptocurrencies would be very bulky. While lightweight nodes may have privacy and security concerns, Merkle proofs can cost-effectively tell whether transactions were included in a block.
Merkle trees and roots

What is a Merkle tree?

How are Merkle trees constructed?

Why are Merkle roots used in Bitcoin?

Mining

Verification

Summary

Latest News