0. Intro
Web3 has reshaped the value of data, but a distributed blockchain is a closed, deterministic system. Smart contracts cannot make external API calls themselves, so the oracle mechanism was created to help them obtain external data.
Uploading off-chain data to the chain is not difficult. The difficulty lies in creating trust through technology and mechanism design. Solving the oracle problem means solving the trust problem at every step, from the data source through processing to the final price feed.
A basic requirement for a widely trusted oracle is decentralization: it must avoid single points of failure and allow the data to be verified. A common way to achieve this is to have multiple data nodes form a decentralized oracle network. Each node collects data, and the result is written to the smart contract on the blockchain once the nodes reach consensus.
Chainlink Architecture
The main use of oracles at present is to provide price feeds for DeFi, updating the prices of underlying assets safely, promptly, and accurately. According to DefiLlama data, Chainlink is one of the largest oracle solutions on the market, with a total value secured (TVS) of approximately $11B at the time of writing, accounting for 46% of the entire market.
Oracle Market Data
As blockchains develop, the demand for off-chain data keeps growing, and price feeds for DeFi alone can no longer meet the needs of developers. Most real-world and Web2 data is not publicly accessible, yet it is needed to build innovative Web3 application scenarios (credit lending, social, DID, KYC/AML, etc.). The new generation of oracles therefore needs to let smart contracts support complex use cases involving sensitive data in a privacy-preserving manner.
DECO is Chainlink's solution in this direction. It uses zero-knowledge proofs to let users generate proofs about private off-chain data for smart contracts without revealing the data to the public or to the oracle node itself. DECO can access existing APIs, and even when end-user authentication is required (for example, logging in to obtain a bank account balance), API data providers do not need to make any changes. It has now reached the alpha stage and is being tested as a proof of concept with multiple partners.
1. Background
This section provides the necessary background on TLS and ZKP, the two protocols on which DECO is built.
1.1 TLS
TLS (Transport Layer Security) is a powerful and widely deployed security protocol. Its predecessor is SSL. It is designed to promote privacy and data security in Internet communications. It is located between the application protocol layer and the TCP/IP layer. The main use case is to encrypt communications between web applications and servers.
All communication over plain HTTP is sent in clear text, which makes it vulnerable to eavesdropping, tampering, and impersonation. With TLS, the HTTP data that users send to websites (clicks, form submissions, etc.) and the HTTP data that websites send back are encrypted, and the recipient must use a key to decrypt them. HTTPS is HTTP layered on top of TLS and is standard practice for websites. Websites need to install TLS certificates on their origin servers, and browsers mark non-HTTPS websites as not secure.
Non-HTTPS Websites
The basic idea of TLS is to use public key encryption. The TLS/SSL certificate that the website shares publicly contains the public key, while the private key is installed on the origin server and held by the website. The client first requests the server's certificate, which contains the public key, and then encrypts information with that public key. After the server receives the ciphertext, it decrypts it with its own private key.
There is a problem here: public key encryption is computationally expensive. To reduce the time a session takes, the client and server generate a "session key" for each session and use it to encrypt the traffic. Because the session key is used with a symmetric cipher, encryption is very fast, and the server's public key is only used to protect the session key itself, which keeps the cost of the asymmetric operations low.
The TLS protocol can therefore be divided into two layers:
Handshake protocol, for authentication and key negotiation: starts in plain text, has the parties confirm and verify each other through asymmetric cryptography, agrees on the algorithms to be used, and derives a shared session key for the symmetric encryption of the record protocol
Record protocol, for symmetrically encrypted transmission: the main body of the protocol, which protects the confidentiality and integrity of the transmitted data
TLS protocol stack
The TLS CipherSuite is a combination of 4 algorithms:
Authentication: determines the authenticity of the peer's identity. The mainstream algorithms are RSA/DSA/ECDSA
Key exchange: the two communicating parties negotiate the key used for encryption. The mainstream algorithm is ECDHE
Encryption: symmetric encryption of the communication. The trend is to use GCM mode
MAC (Message Authentication Code): verifies data integrity, that is, whether the data has been tampered with. The mainstream hash functions are SHA256/SHA384/SHA1, etc.
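As a small illustration of the four components, the sketch below splits a TLS 1.2 style cipher suite name into its parts. The helper `parse_cipher_suite` and the `CipherSuite` tuple are hypothetical names introduced here, and the parser assumes names of the form `TLS_<key exchange>_<authentication>_WITH_<encryption>_<hash>`; it is not a complete parser for every registered suite.

```python
from typing import NamedTuple


class CipherSuite(NamedTuple):
    key_exchange: str
    authentication: str
    encryption: str
    mac: str


def parse_cipher_suite(name: str) -> CipherSuite:
    """Split a name shaped like TLS_<KX>_<AUTH>_WITH_<ENC>_<HASH> (assumed format)."""
    if not name.startswith("TLS_") or "_WITH_" not in name:
        raise ValueError(f"unsupported cipher suite name: {name}")
    left, right = name[len("TLS_"):].split("_WITH_", 1)
    parts = left.split("_")
    if len(parts) == 1:
        # e.g. TLS_RSA_WITH_...: RSA is used for both key exchange and authentication
        key_exchange = authentication = parts[0]
    else:
        key_exchange, authentication = parts[0], parts[-1]
    # The last token names the hash; for GCM suites it is used for key
    # derivation, because the AEAD cipher provides integrity by itself.
    encryption, _, mac = right.rpartition("_")
    return CipherSuite(key_exchange, authentication, encryption, mac)


suite = parse_cipher_suite("TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256")
print(suite)
```

For the example name, the four fields come out as ECDHE, RSA, AES_128_GCM, and SHA256, matching the four roles listed above.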
TLS is very powerful, but it has a limitation: it does not let users prove to third parties that the data they accessed really came from a specific website. The record layer authenticates data with symmetric keys that both the user and the server hold, so either side could have produced any given record, and nothing in the session acts as a signature by the server. An intuitive example: many websites store Alice's identity information on their servers and can easily verify that Alice is over 18 years old, but it is difficult for Alice to prove this to Bob. Alice can take screenshots of the website, but screenshots are easy to forge, and even if a screenshot could be proven authentic, it would leak information: Alice's exact date of birth, not just the fact that she is over 18.
An oracle must prove the provenance of private off-chain data in a decentralized way (without depending on a single point such as the website's server) and make that data usable by smart contracts without leaking private information. Zero-knowledge proofs can help achieve this.
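The limitation can be made concrete with a minimal sketch using Python's standard `hmac` module. It assumes a simplified session in which records are authenticated with HMAC-SHA256 under one shared key; the record contents and variable names are made up for illustration.

```python
import hashlib
import hmac
import os

# After the handshake, client and server BOTH hold the same MAC key.
session_mac_key = os.urandom(32)


def tag(key: bytes, record: bytes) -> bytes:
    """Authenticate a record the way a symmetric MAC does."""
    return hmac.new(key, record, hashlib.sha256).digest()


# Record actually sent by the server, with the server's tag.
genuine = b'{"name": "Alice", "birth_year": 1990}'
genuine_tag = tag(session_mac_key, genuine)

# Record invented by Alice, tagged with the very same key.
forged = b'{"name": "Alice", "birth_year": 1960}'
forged_tag = tag(session_mac_key, forged)


def third_party_check(key: bytes, record: bytes, t: bytes) -> bool:
    """What Bob could check if Alice handed him the session key."""
    return hmac.compare_digest(tag(key, record), t)


genuine_ok = third_party_check(session_mac_key, genuine, genuine_tag)
forged_ok = third_party_check(session_mac_key, forged, forged_tag)
print(genuine_ok, forged_ok)
```

Both checks succeed, so a valid tag tells Bob only that someone holding the session key produced the record, and Alice is one of the key holders.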
1.2 ZKP
Zero-knowledge proofs (ZKPs) have received widespread attention in the blockchain space. Their main applications are ZK-Rollups (which make many compromises in algorithm design to improve scaling efficiency and are validity proofs rather than truly zero-knowledge) and privacy technology (true zero-knowledge). A zero-knowledge proof allows the Prover to convince the Verifier that it knows a solution (Witness) to a certain computational problem (Statement) without revealing any additional information about that solution.
A typical ZK system can be divided into a front-end and a back-end.
Front-end: a compiler. The statement to be verified is written in a domain-specific language (DSL) and then compiled into a ZK-friendly format, such as an arithmetic circuit;
Back-end: a proof system, that is, an interactive argument system that checks the correctness of the circuit, such as Marlin, Plonky2, or Halo2;
ZK System
Running an interactive protocol on an open system such as a blockchain is complicated, and proofs need to be verifiable by anyone at any time. ZK systems used in blockchain applications are therefore usually non-interactive, and an interactive system can be converted into a non-interactive one with the Fiat–Shamir heuristic.
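To show what the Fiat–Shamir heuristic does, here is a toy sketch of a Schnorr proof of knowledge of a discrete logarithm, where the Verifier's random challenge is replaced by a hash of the transcript. The group parameters (p = 23, q = 11, g = 2) are assumptions chosen only so the numbers stay readable; they offer no security, and the function names are illustrative.

```python
import hashlib
import secrets

# Toy group: G generates a subgroup of prime order Q modulo P (2**11 % 23 == 1).
P, Q, G = 23, 11, 2


def challenge(y: int, t: int) -> int:
    """Fiat-Shamir: derive the challenge by hashing the public transcript."""
    data = f"{G}|{y}|{t}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def prove(x: int) -> tuple[int, int, int]:
    """Prove knowledge of the witness x behind the statement y = G**x mod P."""
    y = pow(G, x, P)
    r = secrets.randbelow(Q)   # fresh randomness hides x
    t = pow(G, r, P)           # Prover's commitment
    c = challenge(y, t)        # no Verifier interaction needed
    s = (r + c * x) % Q        # response
    return y, t, s


def verify(y: int, t: int, s: int) -> bool:
    """Anyone can recompute the challenge and check G**s == t * y**c mod P."""
    c = challenge(y, t)
    return pow(G, s, P) == (t * pow(y, c, P)) % P


y, t, s = prove(7)
print(verify(y, t, s))
```

Because the challenge is recomputed from public values, the triple `(y, t, s)` can be checked by anyone at any later time, which is the property an on-chain verifier needs.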
2. How DECO works
DECO extends the HTTPS/TLS protocol in a way that requires no modification on the server side.
The core idea of DECO is to build a novel three-party handshake protocol between Prover (user or Dapp running DECO Prover), Verifier (Chainlink oracle running DECO Verifier), and Server (data provider).
Provenance: When the Prover queries the Web Server for information, the Verifier witnesses the interaction and receives a Commitment that the Prover creates over the TLS session data, which enables the Verifier to verify the true source of the information.
Privacy: If the data does not need to be private, the Prover can directly give the Verifier the key that decrypts the data, so the developer can add the data to the Dapp. If privacy is required, the Prover uses a ZKP to generate a proof that does not disclose the data, which the developer adds to the Dapp instead.
DECO Example
Specifically, the DECO protocol consists of three phases:
Three-party handshake: Prover, Verifier and Server establish a session key in a special format to ensure that data cannot be forged;
Query execution: Prover uses Query with its private parameters θs (such as account password, API key) to query data from Server;
Proof generation: Prover proves that the response meets the required conditions.
DECO Architecture
2.1 Three-party handshake
Note: The following description is based on the AES-CBC-HMAC cipher suite. TLS 1.3 retains only the more secure AEAD ciphers, which use a single key for both encryption and authentication and need no separate MAC key. However, thanks to the key independence property of TLS 1.3, a three-party handshake protocol of similar complexity can also be constructed.
Prover P must not learn the MAC key before making its commitment; otherwise it could forge or tamper with the data. The idea of the three-party handshake is therefore to have Prover P and Verifier V jointly act as the TLS client and establish a shared MAC key with TLS server S. The MAC key k is split on the client side: Prover holds kp, Verifier holds kv, and k=kp+kv. P additionally holds the encryption key k^{Enc} used for symmetric encryption. As long as the Verifier behaves honestly, the three-party handshake ensures that the data cannot be forged.
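The relation k=kp+kv is an instance of additive secret sharing, which can be sketched as follows. The sketch assumes 256-bit keys combined by addition modulo 2^256, and it uses a hypothetical dealer function `split_key` purely for illustration; in DECO the shares are meant to come out of the handshake itself, so that no client-side party ever holds k in full.

```python
import secrets

KEY_BITS = 256
MODULUS = 1 << KEY_BITS  # assumption: shares are added modulo 2**256


def split_key(k: int) -> tuple[int, int]:
    """Split k into two shares; each share alone is uniformly random."""
    kp = secrets.randbelow(MODULUS)   # Prover's share
    kv = (k - kp) % MODULUS           # Verifier's share
    return kp, kv


def combine(kp: int, kv: int) -> int:
    """Recover k = kp + kv once both shares are brought together."""
    return (kp + kv) % MODULUS


k = secrets.randbelow(MODULUS)
kp, kv = split_key(k)
print(combine(kp, kv) == k)
```

Since kp is drawn uniformly at random and kv is k shifted by it, neither share on its own says anything about k, which is why the Prover cannot compute valid MAC tags before the Verifier hands over kv.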
2.2 Query execution
After the handshake, because the MAC key is secret-shared, P and V run an interactive protocol (secure two-party computation) to construct the encrypted TLS query message Q from the private parameters θs. P then sends Q to S as a standard TLS client. Throughout this process only P communicates with S, and the content of the queries it sends is not revealed to V.
After receiving the response R from S, P commits to the session by sending the ciphertext R̂ to V. Only then does P receive kv from V, which lets it reconstruct the MAC key and verify the authenticity of the response R.
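The ordering constraint in this phase (commit first, learn kv afterwards) can be sketched as below. This is a simplified model under stated assumptions: the key shares are dealt directly instead of coming from a real handshake, the response is left unencrypted so that only the MAC logic is visible, and the `Verifier` class with its method names is hypothetical.

```python
import hashlib
import hmac
import secrets

MODULUS = 1 << 256  # assumption: 256-bit key shares added modulo 2**256


def mac(key: int, data: bytes) -> bytes:
    return hmac.new(key.to_bytes(32, "big"), data, hashlib.sha256).digest()


class Verifier:
    """Holds kv and releases it only after the Prover has committed."""

    def __init__(self, kv: int) -> None:
        self._kv = kv
        self.commitment = None

    def receive_commitment(self, r_hat: bytes) -> None:
        if self.commitment is not None:
            raise RuntimeError("session already committed")
        self.commitment = r_hat

    def release_share(self) -> int:
        if self.commitment is None:
            raise RuntimeError("Prover must commit before learning kv")
        return self._kv


# Stand-in for the handshake: the server knows k, the client side holds shares.
k = secrets.randbelow(MODULUS)
kp = secrets.randbelow(MODULUS)
kv = (k - kp) % MODULUS
verifier = Verifier(kv)

# Server answers the query and tags the response with the full key k.
response = b'{"balance": 1500}'
r_hat = response + mac(k, response)   # stand-in for the encrypted record

# Prover commits, then obtains kv, rebuilds k and checks the tag.
verifier.receive_commitment(r_hat)
k_rebuilt = (kp + verifier.release_share()) % MODULUS
authentic = hmac.compare_digest(mac(k_rebuilt, r_hat[:-32]), r_hat[-32:])
print(authentic)
```

Because the Verifier stores `r_hat` before releasing its share, anything the Prover later opens or proves can be checked against data that was fixed while the Prover was still unable to compute MAC tags.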
2.3 Proof generation
Next, P needs to prove that the plaintext R corresponding to the ciphertext R̂ satisfies certain properties. If privacy is not required, the encryption key k^{Enc} can be revealed directly. If privacy is required, a zero-knowledge proof is needed.
If the plaintext consists of several data blocks R=(B1,...,Bn), DECO uses Selective Opening to generate the zero-knowledge proof:
Reveal only certain blocks: without revealing the other blocks, prove that the i-th block in R is Bi
Hide blocks containing private data: prove that R_{-i} is equal to R, except that Bi is deleted
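The idea of opening a single block while keeping the others hidden can be illustrated with salted hash commitments. This is a deliberately simpler stand-in and not DECO's construction: DECO proves statements in zero knowledge about the committed TLS ciphertext, whereas the sketch below commits to each block separately. All function names and the sample data are hypothetical.

```python
import hashlib
import secrets


def commit_blocks(blocks: list[bytes]) -> tuple[list[bytes], list[bytes]]:
    """Commit to every block with its own random salt."""
    salts = [secrets.token_bytes(16) for _ in blocks]
    commitments = [hashlib.sha256(s + b).digest() for s, b in zip(salts, blocks)]
    return salts, commitments


def open_block(blocks: list[bytes], salts: list[bytes], i: int) -> tuple[bytes, bytes]:
    """Prover reveals block i and its salt, and nothing else."""
    return blocks[i], salts[i]


def verify_opening(commitments: list[bytes], i: int, block: bytes, salt: bytes) -> bool:
    """Verifier checks the revealed block against the i-th commitment."""
    return hashlib.sha256(salt + block).digest() == commitments[i]


# R = (B1, B2, B3); only the balance block is revealed.
R = [b'"name": "Alice"', b'"balance": 1500', b'"iban": "XX00 0000"']
salts, commitments = commit_blocks(R)     # commitments go to the Verifier first
block, salt = open_block(R, salts, 1)
print(verify_opening(commitments, 1, block, salt))
```

The random salts prevent the Verifier from guessing the unopened blocks by hashing candidate values, which matters for low-entropy fields such as names.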
However, in many cases the Verifier needs to check that the revealed substring appears in the correct context, and the methods above do not provide this context integrity. To compensate, DECO uses a technique called zero-knowledge two-stage parsing: the Prover parses its session data locally, determines the minimum substring that can convince the Verifier, and then sends only that substring to the Verifier. This preserves privacy.
Succinct non-interactive zero-knowledge proofs (NIZKs) usually impose high computation and memory overhead on the Prover. Because the Verifier of DECO's ZKPs is designated (Chainlink's oracle), more efficient interactive zero-knowledge proofs can be used instead, which offer lower memory usage, no trusted setup, and cheaper computation.
In the current alpha test, the Dapp still acts as the Prover. In future iterations, the plan is for the Prover to be deployed locally by the end user (for example, on a mobile phone) or in a trusted execution environment (TEE).
3. Application
DECO can verify the validity of the user's off-chain identity information while ensuring data privacy, thereby unlocking many innovative Web3 application scenarios, from economic to social.
Self-custody social recovery/legal identity proof (who you are): With DECO, institutional sites that already have mature identity verification mechanisms (banks, social media) can act as one of the guardians in social recovery.
Credit Lending/Proof of Funds (How Much Money Do You Have): Teller is a DeFi credit lending protocol that uses the DECO protocol to prove that the user’s asset balance in an off-chain bank account exceeds the dynamic minimum threshold required for the loan.
Proof of Followers/Proof of Interaction (who you’ve interacted with): Clique is a social oracle that is developing a solution that provides deep analytics on off-chain user influence, loyalty, and contributions across various social media platforms (e.g. using the Twitter API).
Digital identity/social identity proof (you have an online account): PhotoChromic is a digital identity solution that uses DECO to bind Web3 users to their Twitter or Discord social accounts without exposing the underlying personal identity data in the process, allowing applications to filter out real users.
Sybil resistance for DAOs, SBTs, KYC/AML, etc.
4. Other Players
Axiom built a ZK oracle for Uniswap TWAP whose verifiable data comes entirely from the chain, which makes it closer to indexing (e.g. Hyper Oracle). Its relationship with DECO is complementary rather than competitive: as more economic activity takes place on-chain, pure on-chain oracles are one direction; as more off-chain data needs to be brought on-chain, off-chain privacy oracles are another.
Empiric Network uses zk computation to put the entire oracle on-chain, with no off-chain infrastructure that data must flow through, so it pursues a different direction from DECO.
5. Conclusion
Chainlink is the clear leader among current oracles. With the DECO oracle, large amounts of private off-chain data can be consumed by on-chain smart contracts while preserving privacy, unlocking many application scenarios from finance to identity to social networking. The potential risks are the Prover's proof generation speed and the centralization of the Verifier.