Author: Pika, Sui public chain ambassador, DePIN researcher
Editor: Faust, geek web3
Introduction: Although the DePIN track is very popular at the moment, there are still technical obstacles for DePIN-related IoT devices to be connected to the blockchain on a large scale. Generally speaking, if you want to connect IoT hardware to the blockchain, you have to go through the following three key stages:
1. Trustworthy operation of hardware devices;
2. Collect, verify and provide data;
3. Distribute data to different applications.
There are different attack scenarios and countermeasures in each of these three stages, and various mechanism designs need to be introduced. From the perspective of project workflow and protocol design, this article reviews and analyzes the entire process: how IoT devices generate data in a trustworthy manner, how that data is verified and stored, how proofs are generated through computation, and how data is rolled up to the blockchain. If you are an entrepreneur in the DePIN track, I hope this article can help your project development in terms of methodology and technical design.
In the following, we take the scenario of air quality detection as an example, and analyze the three DePIN infrastructures of IoTeX, DePHY, and peaq to explain to you how the DePIN infrastructure works. This type of infrastructure platform can connect IoT devices with blockchain/Web3 facilities, helping project parties to quickly launch DePIN application projects.
Trusted operation of hardware devices
The trustworthiness of hardware devices includes trust in the device identity and trust in verifiable tamper-free program execution.
Basic working mode of DePIN
In most DePIN project incentive schemes, hardware device operators will provide external services as a bargaining chip to ask for rewards from the incentive system. For example, in Helium, network hotspot devices can obtain HNT rewards by providing signal coverage. However, before obtaining incentives from the system, DePIN devices need to show evidence to prove that they have indeed made certain "efforts" as required.
This type of proof used to prove that one has provided certain services or performed certain activities in the real world is called Proof of Physical Work (PoPW). In the protocol design of the DePIN project, Proof of Physical Work plays a pivotal role, and there are also various attack scenarios and corresponding countermeasures.
DePIN projects rely on the blockchain to complete incentive distribution and token allocation. Similar to the public-private key system in traditional public chains, the identity verification process of DePIN devices also requires public-private key pairs. The private key is used to generate and sign the Proof of Physical Work, and the public key is used by the outside world to verify that proof, or serves as the identity tag (Device ID) of the hardware device.
In addition, it is not convenient for a device to receive token incentives directly at its own on-chain address, so DePIN projects often deploy a smart contract on the chain that records the on-chain account address of each device's holder, similar to a one-to-one or many-to-one relationship in a database. In this way, the token rewards earned by an off-chain physical device can be deposited directly into the on-chain account of the device holder.
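The binding described above can be sketched as a small in-memory registry. This is only an illustration of the many-to-one mapping, not any project's actual contract; the class name, method names, and addresses are hypothetical.

```python
from collections import defaultdict


class DeviceRegistry:
    """In-memory stand-in for an on-chain binding contract (hypothetical)."""

    def __init__(self):
        self.owner_of = {}                # device ID -> holder's on-chain address
        self.balances = defaultdict(int)  # holder address -> accumulated rewards

    def bind(self, device_id, owner):
        # Each device maps to exactly one holder; a holder may own many devices.
        if device_id in self.owner_of:
            raise ValueError("device already bound")
        self.owner_of[device_id] = owner

    def credit(self, device_id, amount):
        # Rewards earned by the device are deposited to its holder's account.
        owner = self.owner_of[device_id]
        self.balances[owner] += amount
        return owner


registry = DeviceRegistry()
registry.bind("device-A", "0xAlice")
registry.bind("device-B", "0xAlice")  # many-to-one: two devices, one holder
registry.bind("device-C", "0xBob")
registry.credit("device-A", 5)
registry.credit("device-B", 7)
registry.credit("device-C", 3)
```

Here both of Alice's devices pay into the same account, which is the many-to-one case mentioned above.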
Sybil Attack
Most platforms that provide incentive mechanisms will encounter "Sybil attacks", in which someone manipulates a large number of accounts or devices, or generates many different identity credentials, to pose as multiple people and collect multiple rewards. Take the air quality detection we mentioned earlier as an example. The more devices that provide this service, the more rewards the system distributes. Someone could use technical means to quickly generate many copies of air detection data with corresponding device signatures, creating a large number of Proofs of Physical Work to make a profit. This would push the DePIN project's token into high inflation, so this kind of cheating must be stopped.
For Sybil resistance, if KYC and other privacy-destroying methods are ruled out, the most common measures are PoW and PoS. In the Bitcoin protocol, miners have to spend a large amount of computing resources to earn mining rewards, while PoS public chains require network participants to stake a large amount of assets.
In the field of DePIN, Sybil resistance boils down to "raising the cost of generating Proof of Physical Work". Since generating a Proof of Physical Work depends on valid device identity information (a private key), raising the cost of obtaining identity information is enough to prevent cheating that generates large numbers of proofs at low cost.
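The cost argument can be made concrete with a little arithmetic. The function and all numbers below are hypothetical and only illustrate why raising the per-identity cost removes the incentive.

```python
def sybil_profit(identities, reward_per_identity, cost_per_identity):
    """Net gain of an attacker who controls `identities` fake devices.

    All inputs are illustrative; real reward schedules are more complex.
    """
    return identities * (reward_per_identity - cost_per_identity)


# Identities are free to mint: every fake device is pure profit.
free_identities = sybil_profit(1000, reward_per_identity=10, cost_per_identity=0)

# Each identity requires buying real hardware that costs more than it earns.
costly_identities = sybil_profit(1000, reward_per_identity=10, cost_per_identity=50)
```

With free identities the attack scales linearly in profit; once an identity costs more than the reward it can extract, the same attack scales linearly in losses.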
A relatively effective way to achieve this is to let DePIN device manufacturers monopolize the right to generate identity information, customize the devices, and write a unique identity tag into each device. This is like the public security bureau uniformly recording the identity information of all citizens, so that only those who can be found in the bureau's database are eligible to receive government subsidies.
In the production process, DePIN device manufacturers use a program to generate root keys over a sufficiently long period, then randomly select root keys and write them into chips using eFuse technology. Here, eFuse (programmable electronic fuse) is a technology for storing information in integrated circuits. Information written this way usually cannot be tampered with or erased, which provides strong security protection.
In this production process, neither the device owner nor the manufacturer can learn the device's private key or root key. The hardware device derives and uses working keys from the root key inside the isolated environment of a TEE, including the private key for signing information and the public key that lets the outside world verify the device's identity. People or programs outside the TEE cannot observe the key material.
In the above model, anyone who wants to earn token incentives must purchase devices from the exclusive manufacturer. If a Sybil attacker wants to bypass the manufacturer and generate a large number of Proofs of Physical Work at low cost, they would need to break into the manufacturer's security system and register public keys they generated themselves as network-licensed devices. It is therefore difficult for a Sybil attacker to attack at low cost, unless the manufacturer itself is complicit.
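As a rough sketch of deriving working keys from a root key, the snippet below uses HMAC-SHA256 with a purpose label. This is an assumption made for illustration: the article does not say which derivation function real devices use, the label prefix is invented, and on real hardware this step would run inside the TEE, with the derived seed fed into an asymmetric signature scheme.

```python
import hashlib
import hmac


def derive_working_key(root_key: bytes, label: bytes) -> bytes:
    """Derive a purpose-specific 32-byte key from the root key.

    HMAC-SHA256 with a label is an illustrative choice, not a documented
    DePIN mechanism; the "depin-working-key:" prefix is hypothetical.
    """
    return hmac.new(root_key, b"depin-working-key:" + label, hashlib.sha256).digest()


# Placeholder root key; a real one is burned into eFuse and never leaves the chip.
root_key = bytes(32)

signing_seed = derive_working_key(root_key, b"signing")
identity_seed = derive_working_key(root_key, b"identity")
```

The derivation is deterministic, so the device can recompute its working keys at every boot without storing them, while different labels yield unrelated keys.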
Once people find suspicious signs of equipment manufacturers doing evil, they can expose the DePIN equipment manufacturers through social consensus, which often causes harm to the DePIN project itself. But in most cases, equipment manufacturers, as the core beneficiaries of the DePIN network protocol, have no motive to do evil, because if the network protocol runs in an orderly manner, the money earned from selling mining machines will be more than the money earned from DePIN mining, so they are more inclined not to do evil.
If the hardware devices are not uniformly supplied by centralized manufacturers, then whenever a device connects to the DePIN network, the system must first confirm that the device has the characteristics required by the protocol. For example, the system will check whether newly added devices contain specific hardware modules, and devices without such modules usually cannot pass certification. Equipping a device with these hardware modules costs money, which raises the cost of Sybil attacks and thus achieves Sybil resistance. In this case, it is wiser and safer to operate the device normally than to mount a Sybil attack.
Data tampering attack
Let's think about it. Suppose the system considers more volatile air quality data more valuable and pays more rewards for it. Then every device has sufficient motivation to falsify data and deliberately show high volatility. Even devices whose identity is authenticated by a centralized manufacturer could slip in their own modifications during data processing and rewrite the collected raw data.
How can we ensure that a DePIN device is honest and trustworthy and does not arbitrarily modify the collected data? This requires trusted firmware technology, of which the better-known examples are TEE (Trusted Execution Environment) and SPE (Secure Processing Environment). These hardware-level technologies can ensure that data is processed on the device according to pre-verified programs, and that nothing is slipped in during the computation.
Here is a brief introduction. A TEE (Trusted Execution Environment) is usually implemented in the processor or processor core to protect sensitive data and perform sensitive operations. It provides an execution environment where code and data are protected at the hardware level against malware, malicious attacks, and unauthorized access. For example, hardware wallets such as Ledger and Keystone use TEE technology.
Most modern chips support TEE, especially those for mobile devices, IoT devices, and cloud services. Typically, high-performance processors, security chips, smartphone SoCs (system-on-chips), and cloud server chips integrate TEE technology because the application scenarios involved in these hardware often have a high demand for security.
However, not all hardware supports trusted firmware. Some lower-end microcontrollers, sensor chips, and custom embedded chips may lack support for TEE. For these low-cost chips, probe attacks and other means can be used to extract the identity information stored in the chip, making it possible to forge the device's identity and behavior. For example, the attacker obtains the private key stored on the chip, and then uses it to sign tampered or forged data, disguising it as data generated by the device's own operation.
However, probe attacks rely on specialized equipment and precise operations and data analysis processes, and the cost of attacks is too high, far higher than the cost of directly acquiring such low-cost chips from the market. Compared to profiting by cracking and forging the identity information of low-end devices through probe attacks and other means, attackers would be more willing to directly purchase more low-cost devices.
Data source attack scenario
The TEE mentioned above can ensure that the hardware device generates data results truthfully, but it can only prove that the data has not been maliciously processed after being input into the device. It cannot ensure that the input source is credible before the data is processed. This is similar to the problem faced by oracle protocols.
For example, if an air quality detector is placed near a factory that emits exhaust gas, but someone covers the detector with a sealed glass jar at night, then the data it collects will not reflect the true conditions. However, such attacks are often unprofitable, and attackers rarely bother with them because they are a thankless task. For the DePIN network protocol, as long as the device follows an honest and credible computing process and performs the workload required by the incentive protocol, it should in theory be rewarded.
Solution overview
IoTeX
IoTeX provides the W3bStream development tool to connect IoT devices to blockchain and Web3. The W3bStream IoT SDK includes basic components such as communication and messaging, identity and credential services, and cryptography services.
W3bStream's IoT SDK offers a fairly complete set of cryptographic functions, covering modules such as the PSA Crypto API, cryptographic primitives, cryptographic services, HAL, tooling, and Root of Trust.
With these modules, data generated by a device can be signed on a wide range of hardware, with stronger or weaker security depending on the hardware, and passed over the network to the subsequent data layer for verification.
DePHY
DePHY provides DID (Device ID) authentication services on the IoT side. DID is created by the manufacturer, and each device has only one corresponding DID. The metadata of DID can be customized and can include device serial number, model, warranty information, etc.
For hardware devices that support TEE, the manufacturer initially generates a key pair and uses eFuse to write the key into the chip. DePHY's DID service can help manufacturers generate DIDs based on the device's public key. Apart from the copy written into the IoT device, the private key generated by the manufacturer is held only by the manufacturer.
Since trusted firmware provides secure and reliable message signing and keeps the private key confidential on the hardware side, if cheating through forged device private keys is discovered in the network, it can basically be concluded that the device manufacturer is doing evil, and the source can be traced back to the corresponding manufacturer, achieving trust traceability.
After purchasing a device, DePHY users can obtain the device's activation information, call the activation contract on the chain, associate and bind the hardware device's DID with their own on-chain address, and then access the DePHY network protocol. After the IoT device goes through the DID setting process, two-way data flow between users and devices can be achieved.
When a user sends a control instruction to a device through an on-chain account, the process is as follows:
1. Confirm that the user has access control permissions. Since the device's access control permissions are written on the DID in the form of metadata, permissions can be confirmed by checking the DID;
2. Allow the user and the device to open a private channel and establish a connection so that the user can control the device. In addition to Nostr relays, DePHY relayers also include peer-to-peer network nodes, which support point-to-point channels while other nodes in the network help relay traffic. This lets users control devices in real time off-chain.
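The permission check in step 1 can be sketched as a lookup in the DID's metadata. The document layout and the "controllers" field below are assumptions for illustration, not DePHY's actual DID schema.

```python
from dataclasses import dataclass, field


@dataclass
class DeviceDID:
    """Minimal stand-in for a device DID document (hypothetical layout)."""
    did: str
    owner: str                                     # on-chain address bound at activation
    metadata: dict = field(default_factory=dict)   # e.g. serial number, permissions


def can_control(did_doc: DeviceDID, account: str) -> bool:
    # The owner may always control the device; other accounts must be listed
    # in the (assumed) "controllers" metadata entry.
    controllers = did_doc.metadata.get("controllers", [])
    return account == did_doc.owner or account in controllers


doc = DeviceDID(
    did="did:example:device-1",
    owner="0xAlice",
    metadata={"serial": "SN-0001", "controllers": ["0xCarol"]},
)
```

Only after this check passes would the relayer proceed to step 2 and open the private channel.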
When an IoT device sends data to the blockchain, the subsequent data layer will read the device's permission status from the DID. Only devices that have been registered and permitted can upload data, such as devices that have been registered by the manufacturer.
Another interesting feature of this DID service is that it provides functional characteristic (trait) certification for IoT devices. This certification identifies whether an IoT hardware device has certain specific functions and therefore qualifies to participate in the incentive activities of a specific blockchain network. For example, a wireless transmitter that is identified as having the LoRaWAN trait can be considered capable of providing wireless network connectivity, and can therefore also participate in the Helium network. Similarly, there are GPS traits, TEE traits, and so on.
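Trait certification amounts to a subset check: a device qualifies for a network if its certified traits cover everything that network requires. The network names and requirement sets below are invented for illustration.

```python
# Hypothetical requirement table: network name -> traits a device must have.
NETWORK_REQUIREMENTS = {
    "helium-like-network": {"LoRaWAN"},
    "air-quality-network": {"TEE", "GPS"},
}


def eligible_networks(device_traits):
    """Return the networks whose required traits the device fully covers."""
    traits = set(device_traits)
    return sorted(
        name for name, required in NETWORK_REQUIREMENTS.items()
        if required <= traits  # subset test: every required trait is present
    )


lora_only = eligible_networks({"LoRaWAN", "GPS"})
full_device = eligible_networks({"TEE", "GPS", "LoRaWAN"})
```

A device with more traits can join more networks, which is what makes a single piece of hardware reusable across several incentive programs.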
In terms of expanding services, DePHY's DID also supports participation in staking, linking programmable wallets, etc., making it easier to participate in on-chain activities.
peaq
Peaq's solution is quite unique. It is divided into three levels: device-based authentication, pattern recognition verification, and oracle-based authentication.
1. Authentication from the device. Peaq also provides functions such as generating key pairs, signing information with private keys on the device, and binding the device address peaq ID to the user address. However, the function implementation of the trusted firmware cannot be found in their open source code. Peaq's simple authentication method of signing device information with a private key cannot guarantee the integrity of the device and that the data has not been tampered with. Peaq is more like an optimistic Rollup, which assumes that the device will not do evil by default, and then verifies the trustworthiness of the data in the subsequent stage.
2. Pattern recognition verification. The second solution is to combine machine learning and pattern recognition. The model is obtained by learning previous data. When new data is input, it is compared with the previous model to determine whether it is credible. However, the statistical model can only identify abnormal data and cannot determine whether the IoT device is operating honestly.
For example, an air quality detector in city A is placed in the basement, and the data collected is different from other air quality detectors, but this does not mean that the data is forged, and the device is still operating honestly. On the other hand, as long as the benefits are large enough, hackers are also willing to use methods such as GAN to generate data that is difficult for machine learning to identify, especially when the discriminant model is publicly shared.
3. Oracle-based authentication. The third solution is that they will select some more trusted data sources as oracles and compare and verify them with the data collected by other DePIN devices. For example, the project party deployed an accurate air quality detector in City A. If the data collected by other air quality detectors deviates too much, it will be considered unreliable.
On the one hand, this approach introduces authority into the blockchain and makes it dependent on authority. On the other hand, it may also cause deviations in the sampling of the entire network data due to sampling deviations in the oracle data source.
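The second and third checks can be sketched with simple statistics. The z-score rule, the thresholds, and the sample readings are assumptions chosen for illustration; they are not taken from peaq's documentation.

```python
from statistics import mean, pstdev


def zscore_outliers(history, new_readings, threshold=3.0):
    """Flag readings that lie far from the historical distribution."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        return [r for r in new_readings if r != mu]
    return [r for r in new_readings if abs(r - mu) / sigma > threshold]


def deviates_from_oracle(reading, oracle_reading, tolerance=0.2):
    """True if the reading differs from the trusted reference by more than
    `tolerance` (as a fraction of the reference value)."""
    return abs(reading - oracle_reading) > tolerance * abs(oracle_reading)


history = [40, 42, 41, 39, 43, 40, 41]   # past PM2.5 readings (made up)
flagged = zscore_outliers(history, [41, 95])
```

Both checks only say that a reading is unusual. As with the basement detector, an honest device in an atypical location would be flagged as well, which is exactly the limitation described above.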
Based on the current information, peaq's infrastructure cannot guarantee the trustworthiness of devices and data on the IoT side. (Note: The author consulted peaq's official website, development documents, Github repository, and the only draft of the 2018 white paper. Even after sending an email to the development team, no additional information was obtained before publication.)
Data Generation and Distribution (DA)
The second stage of the DePIN workflow is to collect and verify the data transmitted by IoT devices, save it and provide data to subsequent stages. It is necessary to ensure that the data can be sent to the specific recipient in a complete and correct manner and can be restored. This is called the Data Availability Layer (DA Layer).
IoT devices often broadcast data and signature authentication information through protocols such as HTTP and MQTT. When the data layer of the DePIN infrastructure receives information from the device, it needs to verify the credibility of the data and collect and store the verified data.
As a brief introduction, MQTT (MQ Telemetry Transport) is a lightweight, open, publish/subscribe-based messaging protocol designed for devices with limited connectivity, such as sensors and embedded systems, communicating over low-bandwidth and unstable networks, which makes it well suited to Internet of Things (IoT) applications.
The process of verifying IoT device messages includes device trusted execution authentication and message authentication.
Device trusted execution authentication can be combined with TEE. TEE ensures secure data collection by isolating the data collection code in a protected area of the device.
Another approach is zero-knowledge proof, which enables devices to prove the accuracy of their data collection without revealing the details of the underlying data. This scheme varies from device to device, and for powerful devices, ZKP can be generated locally, while for restricted devices, it can be generated remotely.
After authenticating the trust of the device, use DID to verify the message signature to confirm that the message was generated by the device.
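The message authentication step can be sketched as follows. Python's standard library has no public-key signatures, so this sketch substitutes HMAC-SHA256 with a per-device key as a stand-in; a real deployment would verify an asymmetric signature against the public key recorded in the DID. The registry contents and field names are hypothetical.

```python
import hashlib
import hmac
import json

# Hypothetical registry: DID -> verification key. With real signatures this
# would hold the device's public key instead of a shared secret.
REGISTERED = {"did:example:device-1": b"demo-key-1"}


def sign_message(key: bytes, payload: dict) -> str:
    # Canonicalize the payload so signer and verifier hash identical bytes.
    body = json.dumps(payload, sort_keys=True).encode()
    return hmac.new(key, body, hashlib.sha256).hexdigest()


def verify_message(did: str, payload: dict, signature: str) -> bool:
    key = REGISTERED.get(did)
    if key is None:  # unregistered devices are rejected outright
        return False
    expected = sign_message(key, payload)
    return hmac.compare_digest(expected, signature)


reading = {"pm25": 12.5, "ts": 1700000000}
sig = sign_message(b"demo-key-1", reading)
```

A payload altered after signing, or a message from an unknown DID, fails verification and would be dropped before storage.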
Solution overview
IoTeX
W3bStream divides this stage into three parts: trusted data collection and verification, data cleaning, and data storage.
The collection and verification of trusted data uses TEE and zero-knowledge proof methods.
Data cleaning refers to unifying and standardizing the formats of data uploaded from different types of devices for easy storage and processing.
The data storage link allows different application projects to select different storage systems by configuring storage adapters.
In the current W3bStream implementation, IoT devices can either send data directly to the W3bStream server endpoint, or have a server collect the data first and then forward it to the W3bStream server endpoint.
When receiving incoming data, W3bStream will distribute the incoming data to different programs for processing, just like a central distribution scheduler. The DePIN project in the W3bStream ecosystem will apply for registration on W3bStream and define the event triggering logic (Event Strategy) and processing program (Applet).
Each IoT device has a device account, which belongs to one and only one project on W3bStream. Therefore, when the message of the IoT device is transmitted to the W3bStream service port, it can be redirected to a project based on the registration binding information and the credibility of the data can be verified.
The event trigger logic mentioned above defines which kinds of events can act as triggers (Event triggers), such as data received from an HTTP API endpoint or an MQTT topic, event records detected on the blockchain, or the blockchain height, and binds the corresponding handlers to them.
One or more execution functions are defined in the processing program (Applet) and compiled into WASM format. Data cleaning and formatting can be performed by Applet. The processed data is stored in the key-value database defined by the project.
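The routing described above (device to project, event type to handler, cleaned result to a key-value store) can be sketched in plain Python. This does not use the W3bStream SDK, and real Applets are compiled to WASM; all class, function, and event names here are hypothetical.

```python
class Project:
    def __init__(self, name):
        self.name = name
        self.handlers = {}  # event type -> handler (the "event strategy")
        self.store = {}     # project-defined key-value storage

    def on(self, event_type, handler):
        self.handlers[event_type] = handler


class Dispatcher:
    def __init__(self):
        self.device_project = {}  # each device belongs to exactly one project

    def register_device(self, device_id, project):
        self.device_project[device_id] = project

    def dispatch(self, device_id, event_type, payload):
        project = self.device_project.get(device_id)
        if project is None:
            return False  # unknown device: drop the message
        handler = project.handlers.get(event_type)
        if handler is None:
            return False  # no strategy bound to this event type
        key, value = handler(payload)
        project.store[key] = value
        return True


def clean_air_reading(payload):
    # Data cleaning: normalize the reading to a float with a unified field name.
    key = f'{payload["device_id"]}:{payload["ts"]}'
    return key, {"pm25_ugm3": round(float(payload["pm25"]), 1)}


project = Project("air-quality")
project.on("mqtt:air", clean_air_reading)
dispatcher = Dispatcher()
dispatcher.register_device("sensor-1", project)
ok = dispatcher.dispatch(
    "sensor-1", "mqtt:air",
    {"device_id": "sensor-1", "ts": 1700000000, "pm25": "12.54"},
)
```

Because a device is bound to a single project, the dispatcher never needs to guess where a message belongs.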
DePHY
The DePHY project uses a more decentralized approach to process and provide data, which they call the DePHY Message network.
The DePHY message network consists of permissionless DePHY relay nodes (relayers). IoT devices can submit data through the RPC port of any DePHY relay node. Incoming data first passes through middleware, which uses the DID to verify that the data is trustworthy.
Data that has passed trust verification needs to be synchronized between relay nodes to reach consensus. The DePHY message network uses the Nostr protocol to achieve this. Nostr was originally designed for building decentralized social media. Remember when people started using Nostr as a Twitter replacement and it became popular? It also turns out to be well suited to synchronizing DePIN data.
In the DePHY network, the data items from an IoT device can be organized into a Merkle tree, and nodes synchronize the Merkle root and the tree hash of the entire tree. Once a relayer obtains the Merkle root and tree hash, it can quickly locate which data is missing and fetch it from other relayers. This method achieves consensus confirmation (finalization) very efficiently.
The nodes of the DePHY message network run permissionlessly, and anyone can stake assets and run a DePHY network node. The more nodes there are, the more secure and accessible the network is. DePHY nodes can receive rewards through zk contingent payments (Zero-Knowledge Contingent Payments). In other words, when an application that needs data indexing requests data from a DePHY relay node, it decides how much to pay the relay node based on whether the ZK proof of the data can be retrieved.
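A minimal sketch of the Merkle-based synchronization idea follows. It compares roots to detect divergence and then compares leaf hashes to find what is missing. A real relayer would descend the tree instead of exchanging every leaf hash, and the odd-level padding rule here is one common convention, not necessarily DePHY's.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves):
    """Compute a Merkle root, duplicating the last node on odd-sized levels."""
    if not leaves:
        return h(b"")
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def missing_leaf_hashes(local_messages, remote_leaf_hashes):
    """Leaf hashes the remote relayer has that the local relayer lacks."""
    have = {h(m) for m in local_messages}
    return [x for x in remote_leaf_hashes if x not in have]


relayer_a = [b"m1", b"m2", b"m3", b"m4"]
relayer_b = [b"m1", b"m2", b"m4"]  # missed message m3

roots_differ = merkle_root(relayer_a) != merkle_root(relayer_b)
to_fetch = missing_leaf_hashes(relayer_b, [h(m) for m in relayer_a])
```

Once relayer B fetches the message behind the missing hash and inserts it in order, both roots match again and the batch can be treated as finalized.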
At the same time, anyone can access the DePHY network to monitor and read data. Nodes operated by the project party can set filtering rules to only store DePHY device data related to their own projects. Since the original data is deposited, the DePHY message network can be used as a data availability layer for subsequent other tasks.
The DePHY protocol requires relay nodes to store received data locally for at least a period of time during operation, and then transfer cold data to a permanent storage platform such as Arweave. If all data is treated as hot data, it will eventually increase the storage cost of the node, thereby raising the threshold for running a full node, making it difficult for ordinary people to run a full node.
Through the design of hot and cold data classification processing, DePHY can greatly reduce the operating cost of all nodes in the message network and can better cope with massive IoT data.
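The hot and cold split can be sketched as a simple age-based partition. The retention window and record layout are assumptions; the article does not state DePHY's actual retention period, and archiving to Arweave is outside the scope of this sketch.

```python
def split_hot_cold(records, now, hot_window_seconds):
    """Partition records by age: recent ones stay local, older ones are archived."""
    hot, cold = [], []
    for record in records:
        if now - record["ts"] <= hot_window_seconds:
            hot.append(record)
        else:
            cold.append(record)  # candidate for permanent storage
    return hot, cold


records = [
    {"id": "a", "ts": 100},
    {"id": "b", "ts": 900},
    {"id": "c", "ts": 1000},
]
hot, cold = split_hot_cold(records, now=1000, hot_window_seconds=200)
```

A relay node would keep serving the hot list from local disk and drop the cold list once it has been written to permanent storage.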
peaq
The previous two solutions both collect and store data off-chain and then roll it up to the blockchain. This is because the amount of data generated by IoT applications is huge, and there is also a requirement for communication delay. If DePIN transactions are directly executed on the blockchain, the data processing capacity is limited and the storage cost is very high.
Waiting for node consensus alone would bring unbearable delays. However, peaq has taken a different approach and built its own public chain, developed on Substrate, to carry and execute these computations and transactions directly. Once the mainnet actually launches and the number of DePIN devices it carries grows, peaq will eventually be unable to handle such a large volume of computation and transaction requests because of performance bottlenecks.
Since peaq lacks trusted firmware functionality, it essentially cannot verify the credibility of data effectively. As for data storage, peaq's development documentation directly describes how to connect a Substrate-based blockchain to IPFS distributed storage.
Distribute data to different applications
The third stage in the DePIN workflow is to extract data from the data availability layer according to the needs of blockchain applications, and efficiently synchronize the execution results to the blockchain by executing operations or zero-knowledge proofs.
Solution overview
IoTeX
W3bStream calls this stage Data Proof Aggregation. This part of the network consists of many Aggregator Nodes forming a computing resource pool, which is shared by all DePIN projects.
Each aggregator node will record its working status on the blockchain, whether it is busy or idle. When there is a computing demand for the DePIN project, an idle aggregator node will be selected to handle it according to the status monitoring (monitor) on the chain.
The selected aggregator node will first retrieve the required data from the storage layer; then perform operations on the data according to the requirements of the DePIN project and generate proof of the operation results; finally, the proof results will be sent to the blockchain for verification by the smart contract. After completing the workflow, the aggregator node returns to an idle state.
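The aggregator lifecycle (pick an idle node, fetch, compute, prove, return to idle) can be sketched as below. The status table stands in for the on-chain record, and the "proof" is just a SHA-256 digest used as a placeholder; it is not a zero-knowledge proof. All names are hypothetical.

```python
import hashlib


class AggregatorPool:
    def __init__(self, node_ids):
        # Stand-in for the on-chain status record of each aggregator node.
        self.status = {node_id: "idle" for node_id in node_ids}

    def acquire(self):
        for node_id, state in self.status.items():
            if state == "idle":
                self.status[node_id] = "busy"
                return node_id
        return None

    def release(self, node_id):
        self.status[node_id] = "idle"


def run_task(pool, fetch, compute):
    node = pool.acquire()
    if node is None:
        raise RuntimeError("no idle aggregator node available")
    try:
        data = fetch()            # 1. retrieve data from the storage layer
        result = compute(data)    # 2. run the project's computation
        # 3. placeholder commitment to the result (NOT a real ZK proof)
        proof = hashlib.sha256(repr(result).encode()).hexdigest()
        return node, result, proof
    finally:
        pool.release(node)        # 4. return to the idle state


pool = AggregatorPool(["agg-1", "agg-2"])
node, result, proof = run_task(pool, fetch=lambda: [1, 2, 3], compute=sum)
```

Releasing the node in a `finally` clause keeps the pool consistent even if the computation fails partway through.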
When generating proofs, aggregator nodes use a layered aggregation circuit. The layered aggregation circuit consists of four parts:
Data compression circuit: Similar to a Merkle tree, verifies that all collected data comes from a specific Merkle tree root.
Signature batch verification circuit: Batch verify the validity of data from devices, each data is associated with a signature.
DePIN computational circuit: Prove that the DePIN device correctly executed some instructions according to a specific computational logic, such as verifying the number of steps in a healthcare project or verifying the energy generated in a solar power plant.
Proof aggregation circuit: Aggregates all proofs into a single proof for final verification by the Layer 1 smart contract.
Data proof aggregation is critical to ensuring the integrity and verifiability of computations in the DePIN project, providing a reliable and efficient method for verifying off-chain computations and data processing.
IoTeX's revenue is also mainly generated at this stage. Users can stake IOTX tokens and run aggregator nodes. The more aggregators participate, the more computing power can be brought, forming a computing layer with sufficient computing power.
DePHY
At the data distribution level, DePHY provides a coprocessor that monitors the finalized messages of the DePHY message network and, after performing the state transition, packages and compresses the data and submits it to the blockchain.
The state transition is a smart contract function for processing messages, customized by each DePIN project party; it also covers data processing schemes that package computation using a zkVM or TEE. For this part, the DePHY team provides DePIN project parties with a project scaffold for development and deployment, which leaves a high degree of freedom.
In addition to the coprocessor provided by DePHY, DePIN project parties can also use the project scaffold to connect DA layer data to the computing layer of other infrastructure and bring the data on-chain that way.
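A state transition of this kind can be sketched as a pure function folded over finalized messages, followed by a compact commitment to the resulting state. The message layout, the accumulation rule, and the hash commitment are assumptions for illustration, not DePHY's scaffold.

```python
import hashlib
import json


def apply_message(state: dict, msg: dict) -> dict:
    """Project-defined transition: here, accumulate reported values per device."""
    new_state = dict(state)  # keep the transition pure; never mutate the input
    new_state[msg["device"]] = new_state.get(msg["device"], 0) + msg["value"]
    return new_state


def process_batch(state: dict, messages: list):
    for msg in messages:
        state = apply_message(state, msg)
    # Compact commitment that could be submitted on-chain in place of raw data.
    encoded = json.dumps(state, sort_keys=True).encode()
    return state, hashlib.sha256(encoded).hexdigest()


genesis = {}
finalized = [
    {"device": "a", "value": 1},
    {"device": "b", "value": 5},
    {"device": "a", "value": 2},
]
state, commitment = process_batch(genesis, finalized)
```

Because the transition is deterministic, anyone replaying the same finalized messages arrives at the same commitment, which is what makes the on-chain submission checkable.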
Comprehensive analysis
Although the DePIN track is hot, there are still technical barriers for IoT devices to be connected to the blockchain on a large scale. From the perspective of technical implementation, this article reviews and analyzes the entire process of IoT devices from generating data in a trustworthy manner, verifying and storing data, generating proofs through calculations, and rolling up data to the blockchain, thereby supporting the integration of IoT devices into Web3 applications. If you are an entrepreneur in the DePIN track, I hope this article can help the development of your project in terms of methodology and technical design.
Among the three DePIN infrastructures selected for analysis, peaq is still just hype, just as the online comments said six years ago. DePHY and IoTeX both choose to collect IoT device data off-chain and then roll it up to the chain, which lets them connect IoT device data to the blockchain with low latency while ensuring the credibility of device data.
DePHY and IoTeX each have their own focus. DePHY's DID includes hardware functional trait verification, bidirectional data transmission, and other features. The DePHY message network focuses more on being a decentralized data availability layer, and integrates with DePIN projects as a loosely coupled functional module. IoTeX offers a more complete development workflow. It focuses more on binding handlers to different events and leans towards the computing layer. DePIN project parties can choose and combine different technical solutions according to their actual needs.
References
https://www.trustedfirmware.org/
https://www.digikey.com/en/blog/three-features-every-secure-microcontroller-needs
https://medium.com/@colbyserpa/nostr-2-0-layer-2-off-chain-data-storage-b7d299078c60
https://transparency.dev/
https://github.com/Sovereign-Labs/sovereign-sdk
https://github.com/nostr-protocol/nips
https://www.youtube.com/watch?v=W9YMtTWHAdk
https://www.youtube.com/watch?v=JKKqIYNAuec
https://iotex.io/blog/w3bstream/
https://w3bstream.com/#sdks
https://docs.w3bstream.com/sending-data-to-w3bstream/introduction-1/technical-framework
https://dephy.io/
https://docs.peaq.network/
https://docs.peaq.network/docs/learn/dePIN-functions/machine-data-verification/machine-data-verification-intro
https://www.reddit.com/r/Iota/comments/8ddjxq/peaq_white_paper_draft_is_here/
https://depinhub.io/
https://tehranipoor.ece.ufl.edu/wp-content/uploads/2021/07/2017-DT-Probe.pdf
https://multicoin.capital/2022/04/05/proof-of-physical-work/