Author: PermaDAO

 

Decentralized storage is a method of storing data that does not rely on a single central point of control. This contrasts with traditional centralized storage (such as cloud storage services like Amazon S3 or Google Cloud), which is usually managed by a single company or organization.

Mainstream decentralized storage

The mainstream decentralized storage networks currently on the market include Arweave, Filecoin, and Storj. Each has its own characteristics and design philosophy:

  • Arweave focuses on long-term or permanent data storage.

  • Filecoin provides a decentralized market similar to traditional cloud storage, supporting flexible storage needs.

  • Storj focuses on providing secure and privacy-preserving decentralized cloud storage services.

All three platforms use blockchain technology, but their application scenarios, technical implementations, and payment models are different, and each is suitable for different types of storage needs:

  1. Arweave

    • Goal: Provide a long-term, permanent data storage solution. Arweave aims to store data "forever" and is mainly used for long-term data preservation.

    • Technology: Uses a unique blockchain structure called Blockweave. Unlike a traditional blockchain, each new block in a Blockweave also references a randomly chosen earlier block, which gives miners an incentive to keep historical data over the long term.

    • Payment model: Users pay a one-time fee for data storage, and the data can theoretically be accessed permanently after being stored.

  2. Filecoin

    • Goal: Aims to create a decentralized storage market, similar to traditional cloud storage services.

    • Technology: Filecoin is the incentive layer of IPFS (InterPlanetary File System). It uses "Proof of Replication" and "Proof of Spacetime" to ensure that data is stored correctly.

    • Payment model: Users pay storage providers based on the amount of data stored and the storage duration. This is a more traditional rental model in which users can increase or decrease storage as needed and pay accordingly.

  3. Storj

    • Goal: To provide users with a decentralized cloud storage solution with a focus on security and privacy protection.

    • Technology: Storj uses encryption and sharding technology to protect the security and privacy of data. Data is encrypted and split into multiple small blocks on the client before uploading, and then distributed and stored on nodes around the world.

    • Payment model: Storj's payment model is similar to traditional cloud storage, with charges based on storage space and bandwidth used.

In contrast, Arweave is unique in that it emphasizes permanent storage and pays more attention to the censorship resistance and durability of data. Filecoin and Storj both take a storage-market approach and focus on using blockchain technology to rebuild the storage market.

Business architecture analysis

The theoretical basis for Arweave's permanent data storage is similar to "Moore's Law". According to statistics on data storage costs from 1980 to the present, the cost of storage has been decreasing at a rate of about 20% per year. If this trend continues, the yearly cost shrinks geometrically, so the cumulative cost of storing data converges to a finite constant even over an infinite number of years. Arweave's permanence is based on this: it calculates the storage cost of data for 200 years, and users pay this fee once when storing data.
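The convergence argument can be checked with a short calculation. The sketch below assumes a hypothetical first-year cost of 1.0 (arbitrary units) together with the 20% annual decline quoted above; the function names are illustrative, and this is not Arweave's actual pricing formula.

```python
def cumulative_cost(first_year_cost: float, annual_decline: float, years: int) -> float:
    """Total cost of storing data for `years` years when the yearly cost
    shrinks by `annual_decline` (for example 0.20) every year."""
    ratio = 1.0 - annual_decline
    return sum(first_year_cost * ratio ** year for year in range(years))


def perpetual_cost(first_year_cost: float, annual_decline: float) -> float:
    """Limit of the geometric series above as the number of years grows."""
    return first_year_cost / annual_decline


if __name__ == "__main__":
    # The running total approaches the limit quickly and then stops growing.
    for years in (10, 50, 200):
        print(years, cumulative_cost(1.0, 0.20, years))
    print("limit", perpetual_cost(1.0, 0.20))
```

Under these assumptions the 200-year total is practically indistinguishable from the infinite-horizon limit, which is why a one-time fee can stand in for "forever".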

At the same time, Arweave has designed a very elegant and concise data mining mechanism. We can name it "effective data mining".

Here, "valid data" refers to data that has already been stored in the Arweave network and for which users have paid 200 years of storage fees. The other role in the network, miners, use valid data for mining and provide read access to it. Unlike other storage blockchains, Arweave does not force miners to store data. Instead, it establishes incentive rules that encourage each miner to store as much "valid data" as possible. In the Arweave network, the more "valid data" a miner stores, the greater its mining "computing power".

Suppose there is 100 TB of valid data in the Arweave network. Miners are not required to store all 100 TB. A miner can mine while storing only 100 MB of data, but its computing power will be very small. If the miner chooses to store all 100 TB, its computing power reaches the maximum value.

In the "effective data mining" mechanism, the Arweave network incentivizes miners to store as much data as possible, but does not force them to store all data. So under this incentive model, is there a possibility of data loss? The following is a simulation of data loss:

The 0.5 in the first and second rows means that a single node stores 50% of the data. Assume that the blockchain network has 200,000 blocks and 200 nodes, and that each node randomly stores 100,000 blocks (50% of the block data). A given block is inaccessible only if none of the 200 nodes stores it, so the probability of a single block being inaccessible is 0.5^200 ≈ 6.223 × 10^-61. The data reliability advertised by cloud services is 99.99999999%, which corresponds to a loss probability of about 10^-10. The Arweave figure above reaches an astonishing 10^-61.
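The figure can be reproduced in a few lines. This is a minimal sketch that assumes, as the simulation does, that every node picks its blocks independently and uniformly at random; the function name is illustrative.

```python
def block_loss_probability(stored_fraction: float, node_count: int) -> float:
    """Probability that none of `node_count` independent nodes holds a given
    block, when each node stores `stored_fraction` of all blocks at random."""
    return (1.0 - stored_fraction) ** node_count


if __name__ == "__main__":
    p = block_loss_probability(0.5, 200)
    print(f"{p:.3e}")  # prints 6.223e-61
```

The result depends only on the stored fraction and the number of nodes, not on the 200,000 total blocks.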

Both Filecoin and Storj use blockchain technology to build a data storage market. Storj's main improvement is data privacy. This section mainly explains the principles of Filecoin.

Similar to a traditional order book, Filecoin users first place bids in the storage market, specifying the storage duration and the number of replicas of the data. Miners then accept the orders that are profitable for them. To keep the market fair, Filecoin has established a complex economic model with multiple rules such as fines and small installment payments. Its core technologies are Proof of Replication and Proof of Spacetime.

Proof of Replication: Miners prove to users that data has been stored on dedicated physical devices. Each time a miner proves that the user's data is stored, the network will pay the miner a fee.

Proof of Spacetime: Proof of Replication alone cannot guarantee that your data is stored the whole time, because a miner could hold the data only at the moment it submits a proof. For this reason, Filecoin adds Proof of Spacetime, which is intended to make miners store the data continuously.
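Why a one-off proof is not enough can be illustrated with a toy challenge-response loop. This is not Filecoin's actual construction: real proofs are succinct and do not require the verifier to hold the data. `ToyMiner`, `audit`, and the hash-based check are hypothetical names and a deliberately simplified stand-in.

```python
import hashlib
import os
from typing import List, Optional


class ToyMiner:
    """Hypothetical miner that may discard the data after a given round."""

    def __init__(self, data: bytes, drop_after_round: Optional[int] = None):
        self.data: Optional[bytes] = data
        self.drop_after_round = drop_after_round

    def prove(self, round_index: int, nonce: bytes) -> Optional[str]:
        # A dishonest miner throws the data away once the chosen round has passed.
        if self.drop_after_round is not None and round_index > self.drop_after_round:
            self.data = None
        if self.data is None:
            return None
        # Answering requires the full data at the time of the challenge.
        return hashlib.sha256(nonce + self.data).hexdigest()


def audit(miner: ToyMiner, reference: bytes, rounds: int) -> List[bool]:
    """Challenge the miner once per round with a fresh random nonce."""
    results = []
    for round_index in range(rounds):
        nonce = os.urandom(16)
        expected = hashlib.sha256(nonce + reference).hexdigest()
        results.append(miner.prove(round_index, nonce) == expected)
    return results


if __name__ == "__main__":
    data = b"user file contents"
    print(audit(ToyMiner(data), data, 3))
    print(audit(ToyMiner(data, drop_after_round=0), data, 3))
```

A miner that keeps the data passes every round, while one that deletes it after the first proof fails all later challenges. Repeated challenges over time are what catch the second case.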

To summarize, the basis and implementation of Arweave’s permanence are:

  • The cost of perpetuity is decreasing year by year

  • Incentivize miners through "effective data mining" to achieve data permanence

Filecoin and Storj are decentralized storage markets built with blockchain technology. Their models resemble the order books of traditional trading markets: users place orders stating their storage demand, and miners accept those orders and guarantee that the data is stored. The core technical points of Filecoin are Proof of Replication and Proof of Spacetime.

Storage Practice

There are two ways to store data on Arweave. The first is to send data directly to an Arweave node and pay in AR. The second is to use the ANS-104 bundled data protocol to package data in batches and write it to Arweave.

Storing data directly in Arweave

The user only needs to prepare a wallet holding AR to complete the action. Use the following code to store a file named file.pdf to Arweave:

For more documentation, please refer to https://github.com/ArweaveTeam/arweave-js.

Using ANS-104 to store data in Arweave (recommended)

Arweave produces blocks slowly, about one every 2 minutes, and a single block can process only 1,000 transactions. This greatly limits the number of transactions that can be stored on Arweave, even though the storage capacity of an individual Arweave transaction is unlimited: users can store 100 MB or even 10 GB of data directly on Arweave in one transaction. ANS-104 was created to solve the problem of scaling the number of transactions.
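The limit follows from simple arithmetic on the figures above. In the sketch below, `items_per_tx` is a hypothetical parameter modelling how many data items one transaction carries (1 for a plain transaction); the function name is illustrative.

```python
def daily_capacity(block_interval_min: float = 2.0,
                   tx_per_block: int = 1_000,
                   items_per_tx: int = 1) -> int:
    """Number of data items the network can accept per day."""
    blocks_per_day = 24 * 60 / block_interval_min
    return int(blocks_per_day * tx_per_block * items_per_tx)


if __name__ == "__main__":
    per_day = daily_capacity()
    print(per_day)                   # base-layer transactions per day
    print(per_day / (24 * 60 * 60))  # the same limit expressed per second
```

With one item per transaction the network tops out at 720,000 items per day, which is the ceiling that bundling many items into one transaction is meant to lift.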

ANS-104 is a multi-transaction bundling technology that can bundle tens of thousands of separate data items into a single Arweave transaction. It can be compared to Layer 2 rollup solutions on Ethereum. The difference is that ANS-104 does not sacrifice data security, and the bundled data is stored on Arweave in full.

The code for storing data using ANS-104 is as follows:

The code uses the arseeding light node as a data bundling service. The arseeding light node is a fully open source Arweave data node that supports all native Arweave node interfaces and extends them with the ANS-104 interface. arseeding also integrates the cross-chain payment protocol everPay, so in addition to paying storage fees in AR, users and developers can pay for permanent storage with various assets such as ETH, BNB, USDT and USDC.

For more documentation, please refer to https://web3infra.dev/docs/Arseeding/guide/quickStart .

Storage Fees

At the time of writing, it costs $7.5 to store 1 GB of data on Arweave. For the latest storage fees, see https://ar-fees.arweave.dev/.

Retrieving and downloading Arweave data

Arweave has a standardized GraphQL service interface. Any individual or organization can implement Arweave indexes according to the standard. The following are two typical and easy-to-use index gateways:

  • ArweaveNet gateway, the most comprehensive index. https://arweave.net/graphql

  • KNN3 gateway, real-time retrieval of arseeding node data, fast speed. https://knn3-gateway.knn3.xyz/arseeding/graphql
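A query against such a gateway can be sketched with the Python standard library alone. The field names used here (`transactions`, `tags`, `edges`, `node`, `id`) follow the schema commonly exposed by Arweave gateways and should be treated as assumptions to verify against the gateway in use; `build_request` and `search_transaction_ids` are illustrative helper names.

```python
import json
import urllib.request
from typing import List

GATEWAY_GRAPHQL_URL = "https://arweave.net/graphql"

# Assumed schema: transactions can be filtered by tag name and values.
QUERY = """
query ($tagName: String!, $tagValues: [String!]!, $first: Int!) {
  transactions(tags: [{name: $tagName, values: $tagValues}], first: $first) {
    edges { node { id tags { name value } } }
  }
}
"""


def build_request(tag_name: str, tag_values: List[str], first: int = 5,
                  url: str = GATEWAY_GRAPHQL_URL) -> urllib.request.Request:
    """Build the HTTP POST request that carries the GraphQL query."""
    payload = {
        "query": QUERY,
        "variables": {"tagName": tag_name, "tagValues": tag_values, "first": first},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search_transaction_ids(tag_name: str, tag_values: List[str], first: int = 5) -> List[str]:
    """Send the query and return the matching IDs (requires network access)."""
    with urllib.request.urlopen(build_request(tag_name, tag_values, first), timeout=30) as resp:
        body = json.load(resp)
    return [edge["node"]["id"] for edge in body["data"]["transactions"]["edges"]]
```

For example, `search_transaction_ids("Content-Type", ["application/pdf"])` would ask the gateway for a handful of transactions tagged as PDF files.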

To download Arweave data, you only need to know the ARID or ItemID of the data. Code example:
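A minimal sketch of such a download, using only the Python standard library. It assumes the gateway serves the raw data at `<gateway>/<id>`, which is how arweave.net behaves; whether a particular arseeding node exposes ItemIDs under the same path is an assumption to check against its documentation. `data_url` and `download` are illustrative names.

```python
import shutil
import urllib.request
from urllib.parse import quote


def data_url(data_id: str, gateway: str = "https://arweave.net") -> str:
    """Build the URL under which a gateway is assumed to serve the data."""
    return f"{gateway.rstrip('/')}/{quote(data_id, safe='')}"


def download(data_id: str, dest_path: str, gateway: str = "https://arweave.net") -> None:
    """Stream the data for `data_id` into a local file (requires network access)."""
    with urllib.request.urlopen(data_url(data_id, gateway), timeout=60) as resp, \
            open(dest_path, "wb") as out:
        shutil.copyfileobj(resp, out)
```

Calling `download(some_id, "file.pdf")` with a real ID would write the stored bytes to `file.pdf`; pass a different `gateway` to read from another node.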

Filecoin’s storage method

Unfortunately, Filecoin does not provide storage tools for ordinary users and developers. For ordinary developers, Filecoin is effectively unusable. Scattered technical documents describe ways to store data on Filecoin through third-party service providers, but a careful reading of those providers' documentation shows that most of them offer only IPFS storage and may not actually store the data on Filecoin. Perhaps because of the author's limited knowledge, I could not find a better way to store data on Filecoin, nor a corresponding interface for retrieving data directly from Filecoin.

Storj’s storage approach

Storj's storage method is the same as in Web2. Developers need to register on the official website and obtain an API key. Storj's storage is compatible with the AWS S3 interface, so I won't go into details here. Storj's storage fee is very low: storing 1 GB for 1 month costs only $0.004. However, extrapolated to 200 years, the fee comes to $0.004 × 12 months × 200 years = $9.6, slightly higher than Arweave.
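The comparison is a flat extrapolation, which the following sketch makes explicit. It assumes the monthly price stays constant for the whole period, and it takes the $7.5 per GB one-time Arweave fee from the Storage Fees section; the function name is illustrative.

```python
def rental_cost(price_per_gb_month: float, gigabytes: float, years: int) -> float:
    """Total rent for `gigabytes` over `years`, assuming a constant monthly price."""
    return price_per_gb_month * gigabytes * 12 * years


if __name__ == "__main__":
    storj_200_years = rental_cost(0.004, 1, 200)
    arweave_one_time = 7.5  # one-time fee per GB quoted in this article
    print(storj_200_years, arweave_one_time)
```

Changing the price or the horizon shows how sensitive the comparison is to both inputs.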

From the actual storage operations, we can see that Arweave's transaction processing model is consistent with Bitcoin, Ethereum and other blockchains. Filecoin does not provide a usable SDK or interface. It is a pity that the so-called storage leader is unavailable to developers. Storj's storage method is exactly the same as in Web2.

It is worth noting that Arweave is blockchain-native storage. Once data is sent to Arweave, it cannot be deleted or tampered with. Filecoin and Storj follow a rental model, in which the service provider can stop the storage rental service at any time. Under this model, the data does not have the characteristics of blockchain data. Its characteristics are the same as those of data stored in centralized cloud services.

To distinguish more clearly between data stored on Arweave and data stored on Filecoin, we can call the data on Arweave "consensus data". Data on BTC or Ethereum is also consensus data, and such data is tamper-proof and traceable. The data stored in the Filecoin storage rental market cannot be called consensus data.

Prospects

Decentralized storage has two completely different business lines. The line represented by Arweave is centered on consensus data, emphasizing data decentralization, censorship resistance, and traceability. The line represented by Filecoin is centered on the decentralized market, emphasizing the allocation of storage resources and proof that storage succeeded.

An analogy can be drawn with the development of DeFi. The early IDEX used blockchain technology to create an order book market. The order book is a very traditional business model that matches trades through posted limit orders. The DeFi boom, however, came from liquidity mining, made possible by Uniswap's AMM trading model. AMM fully automates order execution and makes liquidity composable, which eventually ushered in DeFi Summer. In the current decentralized storage track, Filecoin likewise represents the use of blockchain technology to create an order book market, while Arweave uses a unified model similar to AMM to manage data supply and demand. Arweave's unified model makes data pricing and processing more convenient, and Arweave makes it easier to turn ordinary data into consensus data. Consensus data of this kind may usher in a boom in "data composability".

At the same time, we have to mention the SCP theory (storage-based consensus paradigm), the core idea of which is that as long as there is consensus on data storage, the applications built from that data can also reach consensus. SCP emphasizes off-chain computing. Data can be stored on various chains such as BTC and Ethereum, and a unique state is formed by aggregating the data on the blockchain. Since these states produce the same results when computed on any computing unit, why do we still need to compute them on-chain and waste so many computing resources?

The currently popular BRC20 and Bitcoin inscriptions all rely on off-chain computing for consensus. The storage consensus emphasized by the BRC20 protocol is consistent with that of Arweave SCP. Both use the blockchain as a data layer that provides immutable and traceable transaction data, while state computation is performed entirely off-chain. With the storage capacity of Arweave, SCP theory gains a more powerful consensus data set. Arweave SCP theory has developed into a complete engineering solution, Permaweb, which is equivalent to the ultimate version of a Bitcoin indexer. Permaweb can process not only assets, but also text, pictures and even videos. Imagine that in the near future, super powerful indexers can play streaming media and create a completely decentralized TikTok.

Currently, the Permaweb solution supports a wide range of application types. Cloud drives, content co-creation platforms, and games can all be developed easily with this architecture. The data of different Permaweb applications can be combined. For example, a writer uploads a text and its copyright to Arweave through a content co-creation application. In a separate game, the developer can directly reference the writer's content and have players pay the author for the copyright.

The biggest difficulty DePIN faces is blockchain performance. DePIN devices will enter thousands of households, but no blockchain can carry such a huge volume of user interaction. Most DePIN projects still process data in a centralized way, which causes DePIN to lose its decentralized character. Consensus data can empower DePIN much more strongly. Once DePIN data is permanent, it also becomes composable. For example, a green energy certificate can offset the energy consumed by blockchain PoW computation, can serve as a credential in content creation, and can also become a badge in games. Data and value will flow everywhere.

Consensus data is also applicable to the field of AI. Human knowledge and history should be preserved forever, and consensus data can ensure that AI cannot pollute or tamper with them. Similarly, consensus data can serve as the best raw material for AI, allowing it to learn from and process all kinds of valid information.