1. Why do we need a decentralized database?
Web2 applications store data in two basic ways: file systems and databases. Because Web3 lacks mature database products, most Dapps still keep structured data in centralized databases and store only a small amount of important data in smart contracts, where storage is expensive. As decentralized file systems such as IPFS have come into use for storing NFT data, Web3 has come to recognize and accept them. Decentralized database technology has also gone through a round of iteration and produced a variety of new products.
Decentralized databases have unique advantages over traditional centralized databases: they reduce the risk of a single point of failure in Web3 projects and allow a Dapp to be fully decentralized.
Decentralized databases are suitable for storing frequently accessed hot data and non-financial data of Dapps, such as:
NFT metadata
DAO voting data
DEX order books
Decentralized social data, blog data, and email
Complex relational data required by Dapps
2. What types of decentralized database storage systems are there?
In the past two years, many decentralized database projects have emerged, and some innovative projects have attracted widespread attention.
Ceramic: Started in 2019. Data is stored and managed as streams, to which formatted event logs are appended; the logs are written to files and uploaded to IPFS. Queries go through a GraphQL API. Like IPFS, Ceramic has no incentive model. It supports creating, reading, and updating data (CRU).
OrbitDB: Predates Ceramic and also uses IPFS for file storage. It supports NoSQL databases and file storage.
Tableland: Launched in 2022 and currently in public beta; the production version is scheduled for release in 2023. Storing data goes through smart contracts, which define the SQL statements and set usage permissions. Reads happen off-chain and require no payment. The contracts are currently deployed on Ethereum and on L2s such as Optimism.
Polybase: Currently running on a test network. It is a NoSQL database that supports CRUD operations, each of which requires a fee. Polybase can store database files on a range of file systems, including local disk, IPFS, Filecoin, Polystore, and even AWS S3. It uses payment channels for query payments, which reduces the frequency of on-chain transactions and avoids query delays caused by payment.
Web3Q: Launched in 2022, with a test network already live. It proposes a new URL scheme, the web3:// access protocol, for accessing data. Its pricing model is unusual in that deleting data is refundable.
Kwil: An Arweave-based SQL database system that uses smart contracts for payment.
KYVE: An Arweave-based database system.
Technically, both SQL and NoSQL can serve as the database model. SQL is more mature and efficient; NoSQL is richer and more flexible.
SQL requires a highly consistent data structure and offers stronger join query capabilities. NoSQL's key-value form is closer to Ethereum's design pattern, supports a variety of data types, and is flexible and easy to extend.
Functionally, full CRUD support is ideal, but supporting update and delete adds complexity to the system. If the system uses local storage, queries on historical values may not be supported. If IPFS or Arweave is used, the database needs to be append-only, because on immutable storage every update writes a new version of the data, so a record ends up stored multiple times and the storage cost can double.
There are two options for the underlying file system: database files can be stored in a decentralized file system such as IPFS or Arweave, or stored locally on the nodes or in S3 cloud storage. Local storage is more flexible: the retrieval logic can be customized, which is more efficient, and it avoids the unreliability and complexity that come with a decentralized file system such as Arweave. For example, if users pay database miners in TokenA and the miners must in turn pay Arweave tokens to store the data, the two stacked networks add complexity.
As with decentralized storage, three factors determine whether a protocol will be widely used: the retrieval speed of stored data, the incentive model and token economics, and the algorithm that guarantees data availability. A good incentive and token model not only motivates nodes to participate but also rewards them for doing the right thing, for example providing effective retrieval rather than merely storing data to collect storage rewards. The data availability guarantee algorithm checks each node's stored data at intervals, and the node must provide a data availability certificate. This certificate works together with the node's incentives to prevent data loss.
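The interval check described above can be pictured as a challenge-response over a Merkle tree. The sketch below is only an illustration under stated assumptions: none of the projects listed is claimed to work this way, and the names merkle_levels, prove, verify, and pick_challenge are hypothetical. The checker keeps only the Merkle root of the data chunks, asks the node for a randomly chosen chunk, and accepts the chunk plus its sibling hashes as the availability certificate.

```python
import hashlib
import secrets


def h(data: bytes) -> bytes:
    """SHA-256 digest used for both leaves and inner nodes (sketch only)."""
    return hashlib.sha256(data).digest()


def merkle_levels(chunks: list[bytes]) -> list[list[bytes]]:
    """Return every tree level, leaves first; the last level holds the root.

    An unpaired node at the end of a level is hashed with itself.
    """
    level = [h(c) for c in chunks]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]  # pad a copy, stored level is untouched
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def prove(chunks: list[bytes], index: int) -> tuple[bytes, list[bytes]]:
    """Node side: return the challenged chunk and its sibling hashes."""
    chunk = chunks[index]
    proof = []
    for level in merkle_levels(chunks)[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # unpaired node was hashed with itself
        proof.append(level[sibling])
        index //= 2
    return chunk, proof


def verify(root: bytes, index: int, chunk: bytes, proof: list[bytes]) -> bool:
    """Checker side: recompute the root from the chunk and the proof."""
    node = h(chunk)
    for sibling in proof:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root


def pick_challenge(num_chunks: int) -> int:
    """Checker side: choose an unpredictable chunk index for this interval."""
    return secrets.randbelow(num_chunks)
```

In use, the checker would store root = merkle_levels(chunks)[-1][0] when the data is first written, call pick_challenge at each interval, and pay the storage reward only if verify accepts the node's response. A production design would also need domain separation between leaf and inner hashes and a way to stop nodes from fetching the chunk from a peer on demand, both of which this sketch leaves out.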
Data retrieval speed affects the user experience and is crucial to how easy and smooth a Dapp is to use.
Summary
The decentralized database space draws considerable attention and meets an urgent need, but no product is yet widely accepted and used. Decentralized database technology is less mature than decentralized file storage, because it is built on top of distributed file systems. Many projects were launched in 2022. Retrieval speed, incentive models and token economics, and data availability guarantee algorithms are the key factors that determine whether a protocol will be widely used. Protocols will focus on reducing retrieval time, which is crucial for the ease of use and smoothness of Dapps.
