Original author: Adrian Chow
With contributions from Jonathan Yuen and Wintersoldier
Summary
Oracles are essential for securing the locked value of DeFi protocols. Of DeFi’s total locked value of $50 billion, $33 billion is secured by oracles.
However, the inherent time delay in oracle price feed updates creates a subtype of value extraction beyond classic Maximal Extractable Value (MEV), known as Oracle Extractable Value (OEV), which encompasses oracle front-running, arbitrage, and inefficient liquidations.
A growing number of design implementations can prevent or mitigate the negative externalities of OEV, each with its own trade-offs. This article discusses the existing design options and their trade-offs, and proposes two new concepts, along with their value propositions, unresolved issues, and development bottlenecks.
Introduction
Oracles are arguably among the most important pieces of infrastructure in DeFi today. They are integral to most DeFi protocols, which rely on price feeds to settle derivatives contracts, close undercollateralized positions, and more. Currently, oracles secure $33 billion in value, at least two-thirds of the total $50 billion locked on-chain 1. For application developers, however, adding an oracle brings real design trade-offs and problems, which stem from value lost to front-running, arbitrage, and inefficient liquidations. This article classifies this value loss as Oracle Extractable Value (OEV), outlines the key issues from the application's perspective, and attempts to lay out, based on industry research, the key considerations for integrating oracles into DeFi protocols safely and reliably.
Oracle Extractable Value (OEV)
This section assumes the reader has a basic understanding of how oracles work and the difference between push-based and pull-based oracles; implementations vary across individual providers. See the Appendix for an overview, classification, and definitions.
Most applications that use oracle price feeds only need to read prices. Decentralized exchanges that run their own pricing models use oracle feeds as reference prices, and depositing collateral into an overcollateralized position only requires a price read to set initial parameters such as the loan-to-value ratio and the liquidation price. Excluding extreme cases such as long-tail assets whose feeds update too infrequently, the latency of price feed updates barely matters for these designs. The most important considerations for such applications are therefore the accuracy of the price contributors and the decentralization of the oracle provider.
However, if latency in price feed updates does matter, more attention must be paid to how the oracle interacts with the application, because that latency creates value-extraction opportunities: front-running, arbitrage, and liquidations. This subtype of MEV is referred to as OEV 2. Before discussing the various implementations and their trade-offs, we will outline the different flavors of OEV.
Arbitrage
Oracle front-running and arbitrage are colloquially referred to as “toxic flow” in derivatives protocols because these trades are conducted with asymmetric information, often extracting risk-free profits at the expense of liquidity providers. OG DeFi protocols such as Synthetix have been dealing with this issue since 2018 and have tried various solutions over time to mitigate these negative externalities.
Let’s take a simple example: perpetual futures DEX XYZ uses Chainlink’s ETH/USD price feed for its ETH/USD market:
Figure 1: Example of arbitrage using Chainlink Oracles
While the above example is oversimplified (it ignores slippage, fees, and funding), it illustrates how the deviation threshold leads to insufficient price granularity and the opportunities that creates. Searchers can monitor spot markets for moves that Chainlink's on-chain stored price has not yet reflected and extract risk-free value from liquidity providers (LPs).
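To make the deviation-threshold mechanics concrete, here is a minimal Python sketch of the stale-price window a searcher exploits. The threshold, prices, and function names are hypothetical illustrations, not any oracle's actual implementation:

```python
# Sketch: how a deviation-threshold push oracle creates risk-free edge.
DEVIATION_THRESHOLD = 0.005  # hypothetical: update only on a >=0.5% move

def oracle_update_needed(onchain_price: float, spot_price: float) -> bool:
    """A threshold-based push oracle only writes a new price on-chain
    once spot has drifted past the deviation threshold."""
    return abs(spot_price - onchain_price) / onchain_price >= DEVIATION_THRESHOLD

def searcher_edge(onchain_price: float, spot_price: float) -> float:
    """Risk-free edge (as a fraction) for a searcher who trades against
    LPs at the stale on-chain price while hedging at the spot price."""
    if oracle_update_needed(onchain_price, spot_price):
        return 0.0  # an update will fire; no stale price left to exploit
    return abs(spot_price - onchain_price) / onchain_price

# Spot drifts +0.4%: below the threshold, so the feed stays stale and a
# searcher can buy at the on-chain price and hedge at spot for ~0.4%.
assert not oracle_update_needed(2000.0, 2008.0)
print(f"edge: {searcher_edge(2000.0, 2008.0):.4f}")
```

Any drift smaller than the threshold is pure, hedgeable edge at the LPs' expense, which is exactly why coarse granularity is toxic.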
Front-running
Front-running is another form of value extraction similar to arbitrage: searchers monitor the mempool for pending oracle updates and trade ahead of the new price before it is committed on-chain. This gives searchers a window to enter positions at a price they already know will move in their favor once the update lands.
Perpetual futures DEXs such as GMX have been victims of toxic front-running; before GMX coordinated its oracle updates through KeeperDAO, roughly 10% of the protocol's profits were lost to front-runners 4.
What if we only adopt a pull model?
One of Pyth’s value propositions is that using the Solana-based Pythnet, publishers can maintain low-latency price feeds by pushing price updates to the network every 300 milliseconds5. Therefore, when an application queries a price through Pyth’s API, it can retrieve the latest price, update it to the target chain’s on-chain storage, and perform any downstream actions in the application logic in a single transaction.
If an application can query Pythnet's latest price, update on-chain storage, and complete all downstream logic in a single transaction, doesn't that effectively solve front-running and arbitrage?
Not quite. Pyth's pull-based updates give users the ability to choose which signed price to submit with a transaction, which opens the door to adversarial selection (another way of saying toxic flow). Prices written on-chain must move forward in time, but users can still choose any price satisfying that constraint, meaning arbitrage still exists: a searcher can observe a newer price while submitting an older one. Pyth's documentation 6 suggests a simple defense against this attack vector: a staleness check ensuring the price is recent enough. However, the window must leave enough buffer for the update transaction to be included in a following block, so how do we determine the best threshold?
Let’s return to perpetual futures DEX XYZ for analysis. This time it uses the Pyth ETH/USD price feed with a 20-second staleness check, meaning the Pyth price's timestamp must be within 20 seconds of the block timestamp of the downstream transaction:
Figure 2: Example flow of front-running using Pyth
An intuitive fix is to simply tighten the staleness threshold, but too low a threshold risks transactions failing unpredictably when block times or network latency fluctuate, which hurts user experience. Since Pyth's price feed relies on bridging, sufficient buffer is required to a) give Wormhole guardians time to attest prices, and b) allow the target chain to process the transaction and include it in a block. The next sections explore these trade-offs in detail.
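The adversarial-selection problem described above can be sketched in a few lines. The 20-second window and the prices are hypothetical; the point is that every signed update inside the staleness window is equally "valid" on-chain, so the submitter picks the most favorable one:

```python
import time

MAX_STALENESS = 20  # seconds; hypothetical staleness window from the example

def check_staleness(price_timestamp: int, block_timestamp: int) -> bool:
    """Staleness check: the signed price must be recent enough relative
    to the block that consumes it."""
    return block_timestamp - price_timestamp <= MAX_STALENESS

def best_adversarial_price(signed_prices: list[tuple[int, float]],
                           block_timestamp: int,
                           going_long: bool) -> float:
    """A searcher holding several valid signed updates can submit whichever
    is most favorable: the lowest price to open a long, the highest to open
    a short. This is the adversarial-selection problem."""
    valid = [p for ts, p in signed_prices if check_staleness(ts, block_timestamp)]
    return min(valid) if going_long else max(valid)

now = int(time.time())
prices = [(now - 18, 1995.0), (now - 10, 2001.0), (now - 2, 2004.0)]
# All three pass the 20s staleness check, so a long can still be opened
# at 1995 even though the freshest observable price is 2004.
assert best_adversarial_price(prices, now, going_long=True) == 1995.0
```

Shrinking `MAX_STALENESS` shrinks the exploitable spread but, as noted above, also shrinks the inclusion buffer for honest transactions.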
Liquidations
Liquidation is a core part of any protocol involving leverage, and the granularity of price feed updates plays a crucial role in determining liquidation efficiency.
In the case of a threshold-based push oracle, when the spot price crosses a position's liquidation price but the feed's preset trigger parameters have not been met, the coarse granularity of price updates causes liquidation opportunities to be missed. This creates negative externalities in the form of market inefficiency.
When liquidations do occur, applications typically pay out a portion of the liquidated collateral, sometimes as a reward to the user who triggered the liquidation. In 2022, Aave paid out $37.9 million in liquidation rewards on mainnet alone 7. This arguably overcompensates third parties at borrowers' expense. Additionally, when there is extractable value on the table, the resulting gas wars drain value from applications into the MEV supply chain.
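The missed-liquidation scenario can be made concrete with a small sketch. The 1% threshold and all prices are hypothetical, chosen to mirror a long-tail feed:

```python
# Sketch of a liquidation missed due to coarse price granularity.
DEVIATION_THRESHOLD = 0.01  # hypothetical 1% threshold on a long-tail feed

def is_liquidatable(collateral_price: float, liq_price: float) -> bool:
    """A position is liquidatable once the collateral price the contract
    sees falls to or below the liquidation price."""
    return collateral_price <= liq_price

onchain = 1.00     # last pushed on-chain price
spot = 0.995       # real market price, -0.5% from on-chain
liq_price = 0.998  # position becomes undercollateralized below this

# Spot has crossed the liquidation price...
assert is_liquidatable(spot, liq_price)
# ...but the move is below the 1% deviation threshold, so no update fires
# and the on-chain contract still sees a healthy position.
assert abs(spot - onchain) / onchain < DEVIATION_THRESHOLD
assert not is_liquidatable(onchain, liq_price)
```

Until either the threshold or the heartbeat fires, the protocol carries bad debt risk it cannot even observe.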
Design Space and Considerations
With the above issues in mind, the following sections discuss various implementations based on push, pull, and alternative designs, their effectiveness in solving these problems, and the trade-offs involved, whether in the form of additional centralization and trust assumptions or degraded user experience.
Oracle-specific Order Flow Auctions (OFA)
Order flow auctions (OFAs) have emerged as a solution to the negative externalities of MEV. Broadly speaking, an OFA is a general third-party auction service to which users send orders (transactions or intents), and searchers bid for the exclusive right to run their extraction strategies against that order flow. A large share of the winning bid is returned to users as compensation for the value their orders created. OFA adoption has surged recently, with more than 10% of Ethereum transactions now sent through private channels (private RPCs/OFAs) (Figure 3), a trend that seems likely to continue.
Figure 3: Combined daily private Ethereum transaction count. Source: Blocknative
The problem with routing oracle updates through a generic OFA is that the oracle cannot know in advance whether a standard rule-based update will generate any OEV; if it does not, sending the transaction to an auction merely adds latency. On the other hand, the simplest way to streamline OEV capture and minimize latency is to grant all oracle order flow to a single dominant searcher, but this carries obvious centralization risk, may encourage rent-seeking and censorship, and leads to a poor user experience.
Figure 4: General OFA and Oracle-specific OFA
An oracle-specific OFA leaves existing rule-based price updates untouched; they still go through the public mempool. Instead, it auctions the right to trigger additional updates, keeping the oracle's price updates, and any extractable value they unlock, within the application layer. As a by-product, it also increases feed granularity, since searchers can request updates on demand without the oracle nodes bearing the cost of more frequent pushes.
Oracle-specific OFAs are well suited to liquidations: they yield finer-grained price updates, maximize the capital returned to liquidated borrowers, reduce the rewards protocols pay to liquidators, and keep the value extracted from bidders inside the protocol for redistribution to users. They also address front-running and arbitrage to some extent, though not completely. Under perfect competition in a first-price sealed-bid auction, winning bids should approach the full value of the execution opportunity net of blockspace costs 8, so value that would have been front-run from the feed is recaptured, and arbitrage opportunities shrink as the feed's update granularity increases.
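The first-price sealed-bid mechanics can be sketched as follows. The 90/10 rebate split and the bid amounts are hypothetical parameters, not any live OFA's actual terms:

```python
def run_sealed_bid_auction(bids: dict[str, float], rebate_share: float = 0.9):
    """First-price sealed-bid auction for the right to trigger an oracle
    update. rebate_share of the winning bid flows back to the application;
    the remainder stays with the auction operator (hypothetical split)."""
    winner = max(bids, key=bids.get)          # highest bid wins
    winning_bid = bids[winner]
    to_protocol = winning_bid * rebate_share  # recaptured OEV
    to_operator = winning_bid * (1 - rebate_share)
    return winner, to_protocol, to_operator

# Under competition, bids approach the full OEV of the update, so most of
# the extractable value flows back to the protocol instead of to searchers.
winner, to_protocol, to_operator = run_sealed_bid_auction(
    {"searcher_a": 90.0, "searcher_b": 97.0, "searcher_c": 95.0})
assert winner == "searcher_b"
assert abs(to_protocol - 97.0 * 0.9) < 1e-9
```

The design choice to seal bids matters: with open bids, searchers could shade down to just above the second price rather than competing toward the opportunity's full value.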
Currently, implementing an oracle-specific OFA requires either joining a third-party auction service (such as OEV-Share) or building an auction service into the application. Inspired by Flashbots, API3 uses an OEV relay (Figure 5), an API that provides DoS protection for the auction by design. The relay collects meta-transactions from oracles, collates and aggregates bids from searchers, and redistributes proceeds trustlessly without taking custody of bids. When a searcher wins, they can only update the data feed by transferring the bid amount to a proxy contract owned by the protocol, which then updates the price feed using the signed data provided by the relay.
Figure 5: API3's OEV relay
Alternatively, protocols can forgo the middleman and build their own auction service to capture all of the extracted OEV. BBOX is an upcoming protocol that plans to embed an auction into its liquidation mechanism to capture OEV and return it to applications and their users 9.
Running a Centralized Keeper Network
An early idea from the first wave of perpetual futures DEXs to combat OEV was to run a centralized keeper network that aggregated prices from third-party sources (such as centralized exchanges), with data feeds like Chainlink as a fallback and circuit breaker. This model was popularized by GMX v1 10 and its many forks, with the main value proposition being that, because the keeper network is run by a single operator, there are no pending updates in a public mempool to front-run.
While this solves many of the issues listed above, it raises obvious centralization concerns: a centralized keeper can determine execution prices without any verifiable pricing source or aggregation method. In GMX v1's case, the keeper is not an on-chain or otherwise transparent mechanism, but a program signed by a team address running on a centralized server. Its core role is not just to execute orders but to "determine" the execution price according to its own preset definitions, with no way for users to verify the authenticity or provenance of the price used.
Automated Keeper Network and Chainlink Data Streams
The answer to the centralization risk of a single-operator keeper network is a more decentralized automation network run by third-party service providers. Chainlink Automation is one such product, offered alongside Chainlink Data Streams, a new pull-based, low-latency oracle. The product is recent and still in closed beta, but GMX v2 11 already uses it and can serve as a reference for systems adopting this design.
At a high level, Chainlink Data Streams consists of three main components: the Data DON (decentralized oracle network), the Automation DON, and on-chain verification contracts 12. The Data DON is an off-chain data network whose architecture resembles the way Pythnet maintains and aggregates data. The Automation DON is a keeper network, secured by the same node operators as the Data DON, that pulls prices from the Data DON and posts them on-chain. Finally, the verifier contract checks on-chain that the off-chain signatures are correct.
Figure 6: Chainlink Data Streams architecture
The diagram above shows the transaction flow for calling an open-trade function, with the Automation DON responsible for fetching prices from the Data DON and updating on-chain storage. Currently, direct query access to the Data DON is limited to whitelisted users, so a protocol can either offload keeper maintenance to the Automation DON or run its own keeper. As the product matures through its development lifecycle, this is expected to shift toward a permissionless structure.
On the security front, relying on the Automation DON carries the same trust assumptions as using the Data DON alone, a significant improvement over the single-keeper design. However, if the right to update the price feed is granted to the Automation DON, any value-extraction opportunity is left to the nodes of that keeper network. This means the protocol trusts Chainlink's node operators (mostly institutions) to protect their social reputations and not front-run users, much as one trusts Lido node operators not to abuse their large market share to monopolize block space.
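The verify-then-use pattern at the heart of this architecture can be sketched in Python. HMAC stands in for the DON's threshold signatures purely for illustration; this is not Chainlink's actual signing scheme, and all names here are invented:

```python
import hashlib
import hmac

# Stand-in for the DON signer set's shared signing material (illustrative).
DON_KEY = b"don-shared-secret"

def sign_report(price_report: bytes) -> bytes:
    """Off-chain: the DON signs the price report it produced."""
    return hmac.new(DON_KEY, price_report, hashlib.sha256).digest()

def verify_and_consume(price_report: bytes, signature: bytes) -> bytes:
    """Mimics the on-chain verifier contract: reject any report whose
    signature does not check out before downstream logic may use it."""
    if not hmac.compare_digest(sign_report(price_report), signature):
        raise ValueError("invalid report signature")
    return price_report

report = b"ETH/USD:2000.12"
assert verify_and_consume(report, sign_report(report)) == report
```

The key property is that whoever delivers the report (a keeper, a searcher, the user) cannot alter it: only reports the signer set actually produced pass verification.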
Pull: Delayed Settlement
One of the biggest changes in Synthetix perps v2 is the introduction of Pyth price feeds for perpetual contract settlement 13. This allows orders to be settled at either Chainlink or Pyth prices, provided that the deviation between them does not exceed a predefined threshold and the timestamp passes a staleness check. However, as discussed above, simply switching to a pull-based oracle does not solve every OEV-related issue for every protocol. To address front-running, a "next price" style mechanism can be introduced in the form of delayed orders. In practice, this splits a user's market order into two transactions:
Transaction #1: Submit an "intent" to open a market order on-chain, providing standard order parameters such as size, leverage, collateral, and slippage tolerance. An additional Keeper fee is also paid, which rewards the Keeper for executing Transaction #2.
Transaction #2: A keeper picks up the order submitted in Transaction #1, requests the latest Pyth price feed, and calls the Synthetix execution contract, all in one transaction. The contract checks predefined parameters such as staleness and slippage, and if they pass, the order is executed, the on-chain price storage is updated, and the position is opened. The keeper's fee compensates for the gas spent and for maintaining the keeper network.
This implementation gives users no opportunity to adversarially select the price submitted on-chain, effectively eliminating the protocol's front-running and arbitrage exposure. The trade-off is user experience: executing a market order takes two transactions, and users must compensate the keeper's gas while sharing the cost of updating the oracle's on-chain storage. The fee was previously fixed at 2 sUSD but was recently changed to a dynamic fee based on the Optimism gas oracle plus a premium, which varies with Layer 2 network activity. Either way, this can be seen as trading away some of the trader's experience to improve LP profitability.
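The two-transaction flow can be sketched as follows. The staleness window, fee, and parameter names are illustrative stand-ins, not the real Synthetix contract's values:

```python
import time
from dataclasses import dataclass

MAX_STALENESS = 20  # seconds; hypothetical staleness window
KEEPER_FEE = 2.0    # flat keeper fee in sUSD, per the older fixed-fee model

@dataclass
class PendingOrder:
    """Tx #1: the user's on-chain 'intent', with no price attached."""
    size: float
    max_slippage: float     # e.g. 0.003 = 0.3%
    reference_price: float  # price the user saw when submitting

def settle(order: PendingOrder, oracle_price: float, oracle_ts: int,
           block_ts: int) -> float:
    """Tx #2: a keeper supplies the latest signed price; the contract
    enforces staleness and slippage before filling at a price the user
    never got to choose."""
    if block_ts - oracle_ts > MAX_STALENESS:
        raise ValueError("stale price")
    slippage = abs(oracle_price - order.reference_price) / order.reference_price
    if slippage > order.max_slippage:
        raise ValueError("slippage exceeded")
    return oracle_price

now = int(time.time())
order = PendingOrder(size=1.0, max_slippage=0.003, reference_price=2000.0)
assert settle(order, 2003.0, now - 5, now) == 2003.0  # 0.15% move: fills
```

Because the fill price arrives only in Transaction #2, the user cannot shop among signed prices, which is exactly what removes the adversarial-selection channel.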
Pull: Optimistic Settlement
Since delayed orders impose additional network fees on users (proportional to Layer 2 DA fees), we brainstormed an alternative order settlement model, optimistic settlement, that could reduce costs for users while preserving the protocol's decentralization and security. As the name implies, the mechanism lets traders execute market orders atomically, with all prices optimistically accepted and a window during which searchers can submit proofs against toxic orders. This section outlines the different versions of this idea, our thought process, and the unresolved issues.
Our initial idea was a mechanism where users submit a price via parsePriceFeedUpdates when opening a market order, and the user or any third party later submits a settlement transaction that completes the trade at the settlement price. At settlement, any adverse difference between the two prices is booked against the user's PnL as slippage. The advantages are lower costs for users and reduced front-running risk: users no longer pay a premium to reward the keeper, and front-running remains manageable because the settlement price is unknown at order submission. However, this still imposes a two-step settlement process, the very shortcoming we identified in Synthetix's delayed settlement model. In most cases, if volatility between placement and settlement never exceeds the threshold at which front-running becomes profitable, the extra settlement transaction is redundant.
Another way around this problem is to let the system optimistically accept orders, then open a permissionless challenge period during which a proof can be submitted showing that the price deviation between the price timestamp and the block timestamp allowed a profitable front-run.
The specific operations are as follows:
Users create orders at the current market price, passing the price, along with the embedded Pyth price update bytes, into the order-creation transaction.
The smart contract optimistically verifies and stores this information.
After the order is confirmed on-chain, a challenge period begins during which searchers can submit an adverse-selection proof demonstrating that the trader used a stale price feed with the intent to arbitrage the system. If the proof is accepted, the difference is applied to the trader's execution price as slippage, and the excess value is awarded to the keeper as a reward.
After the challenge period ends, the system considers all prices valid.
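The four steps above can be sketched as a minimal state machine. The 60-second challenge period and 0.2% profitability threshold are hypothetical design parameters, and the whole sketch is an illustration of the proposal, not a settled design:

```python
CHALLENGE_PERIOD = 60  # seconds; hypothetical challenge window

class OptimisticOrder:
    """An order accepted immediately at the submitted price, but
    repriceable during the challenge period."""

    def __init__(self, exec_price: float, created_at: int):
        self.exec_price = exec_price
        self.created_at = created_at
        self.finalized = False

    def challenge(self, fair_price: float, now: int,
                  threshold: float = 0.002) -> float:
        """A searcher proves adverse selection: the submitted price deviated
        from the fair price by more than the profitable-frontrun threshold.
        On success the order is repriced; the difference is the trader's
        slippage and funds the challenger's reward."""
        if now > self.created_at + CHALLENGE_PERIOD or self.finalized:
            raise ValueError("challenge window closed")
        deviation = abs(fair_price - self.exec_price) / self.exec_price
        if deviation <= threshold:
            raise ValueError("no adverse selection: proof rejected")
        self.exec_price = fair_price
        return deviation

    def finalize(self, now: int) -> None:
        """Step 4: after the challenge period, all prices are final."""
        if now <= self.created_at + CHALLENGE_PERIOD:
            raise ValueError("challenge period still open")
        self.finalized = True

order = OptimisticOrder(exec_price=2000.0, created_at=0)
assert order.challenge(fair_price=2010.0, now=30) == 0.005  # 0.5% > 0.2%
order.finalize(now=CHALLENGE_PERIOD + 1)
assert order.finalized
```

Note that an honest order whose deviation stays under the threshold simply survives the window untouched, which is what makes the happy path single-transaction.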
This model has two advantages. It reduces the cost burden on users, who pay gas only for order creation and the oracle update in a single transaction, with no additional settlement transaction. It also deters front-running and protects the integrity of the liquidity pool, since a healthy keeper network is financially incentivized to submit proofs against front-running orders.
However, there are still some issues to be resolved before this idea can be put into practice:
Defining "adverse selection": how does the system distinguish users who submit slightly stale prices because of network latency from users deliberately seeking arbitrage? A preliminary idea is to measure volatility during the staleness window (e.g. 15 seconds) and flag the order as a potential exploit if that volatility exceeds the net execution fee.
Setting an appropriate challenge period: given that a toxic order may only be identifiable for a short time, what is the right window for keepers to challenge a price? Batching proof submissions may be more cost-effective, but because order flow arrives unpredictably over time, it is hard to time batches so that every price is either proven toxic or given enough time to be challenged.
Economic incentives for keepers: for proof submission to be rational for a financially motivated keeper, the reward for a winning proof must exceed the gas cost of submitting it, an assumption that varying order sizes may break.
Do we need to have a similar mechanism for closing orders? If so, how would that degrade the user experience?
Ensuring "unreasonable" slippage does not fall on users: in flash-crash conditions, very large price differences can arise between order creation and on-chain confirmation. Some fallback or circuit breaker may be needed; Pyth's EMA price could be used as a sanity check on the stability of the feed price before it is accepted.
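The preliminary heuristic from the first open question, flagging orders by realized volatility versus the net execution fee, can be sketched as follows. Everything here is a hypothetical illustration of the idea, not a vetted detection rule:

```python
def flag_adverse_selection(prices_in_window: list[float],
                           net_execution_fee: float) -> bool:
    """prices_in_window: prices observed during the staleness window
    (e.g. 15 seconds). If the largest move across the window exceeds what
    the trader paid in net fees, front-running it would have been
    profitable, so the order is flagged as a potential exploit."""
    realized_move = ((max(prices_in_window) - min(prices_in_window))
                     / min(prices_in_window))
    return realized_move > net_execution_fee

# A 0.8% swing against a 0.1% net fee: profitable to front-run, so flag.
assert flag_adverse_selection([2000.0, 2016.0, 2010.0], net_execution_fee=0.001)
# A 0.05% drift sits inside the fee: benign latency, not adverse selection.
assert not flag_adverse_selection([2000.0, 2001.0], net_execution_fee=0.001)
```

A real rule would also need direction-awareness (the move must favor the trader's side) and robustness to single-print outliers, which this sketch deliberately omits.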
ZK Co-processors - Another form of data consumption
Another direction worth exploring is ZK coprocessors, which take on-chain state, perform complex computations over it off-chain, and supply proofs, in a permissionless way, that the computation was performed correctly. Projects such as Axiom enable contracts to query historical blockchain data, compute over it off-chain, and submit ZK proofs that the results were correctly derived from valid on-chain data. Coprocessors open up the possibility of building custom, manipulation-resistant TWAP oracles from historical prices across multiple DeFi-native liquidity sources (such as Uniswap and Curve).
Compared with traditional oracles, which typically expose only the latest asset price, ZK coprocessors expand the range of data available to dApps in a secure manner (though Pyth does provide EMA prices that developers can use as a reference check against the latest price). This allows applications to build more business logic on historical blockchain data to improve protocol security or enhance user experience.
However, ZK coprocessors are still in the early stages of development, with bottlenecks such as:
Fetching and computing over large amounts of blockchain data inside a coprocessor can require long proving times
They serve only blockchain data, and cannot address the need for secure communication with non-Web3 applications
Oracle-less Solutions – The Future of DeFi?
Another approach to this problem is to design primitives from scratch that eliminate the need for external price feeds, removing DeFi's reliance on oracles altogether. The latest development in this direction uses AMM LP tokens as a pricing instrument. The core idea is that a constant-function market maker's LP position is a token representing a preset weighting of two assets, with an automatic pricing formula between them (i.e. x*y = k). By leveraging LP tokens (as collateral, as the unit a loan is denominated in, or, in recent designs, by moving v3 LP positions across ticks), a protocol can derive the information it would normally need an oracle for. The result is a new wave of oracle-free designs that sidestep the challenges described above. Examples of applications built in this direction include:
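The "self-pricing" property of a constant-product LP token can be shown in a few lines: for x*y = k, the pool's composition at any price is fully determined, so a protocol holding the LP token can value it with no external feed. The numbers below are arbitrary illustrations:

```python
import math

def cfmm_composition(k: float, p: float) -> tuple[float, float]:
    """Reserves (x, y) of an x*y = k pool when the marginal price is p
    (quote tokens y per base token x): x = sqrt(k/p), y = sqrt(k*p)."""
    return math.sqrt(k / p), math.sqrt(k * p)

def lp_value_in_quote(k: float, p: float) -> float:
    """Value of the whole pool in quote terms; equals 2*sqrt(k*p)."""
    x, y = cfmm_composition(k, p)
    return x * p + y

x, y = cfmm_composition(k=4_000_000.0, p=2000.0)
assert abs(x * y - 4_000_000.0) < 1e-6   # the invariant holds
assert abs(y / x - 2000.0) < 1e-9        # composition implies the price
assert abs(lp_value_in_quote(4_000_000.0, 2000.0)
           - 2 * math.sqrt(4_000_000.0 * 2000.0)) < 1e-6
```

Because the price can be read back out of the reserves (p = y/x), a protocol denominating loans or collateral in LP tokens never needs an oracle to tell it what the pool is worth.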
Panoptic is building a perpetual, oracle-free options protocol on Uniswap v3 concentrated liquidity positions. Because a concentrated liquidity position converts entirely into one of the underlying assets when the spot price crosses the upper bound of the LP range, liquidity provider returns closely resemble those of put option sellers. The options market therefore operates with liquidity providers depositing assets or LP positions, while option buyers and sellers borrow that liquidity and move it into or out of range, producing dynamic option payoffs. Since loans are denominated in LP positions, no oracle is required for settlement.
Infinity Pools leverages Uniswap v3 concentrated liquidity positions to build a margin trading platform with no liquidations and no oracles. Uniswap v3 liquidity providers lend out their LP tokens; traders deposit collateral, borrow LP tokens, and redeem the underlying assets for their directional trades. The loan at redemption is denominated in either the base or the quote asset, depending on the price at redemption, which can be computed directly from the LP composition on Uniswap, eliminating any reliance on oracles.
Timeswap is building a fixed-term, no-liquidation, no-oracle lending platform. It is a three-party market consisting of lenders, borrowers, and liquidity providers. Unlike traditional lending markets, it uses "time-based" liquidation instead of "price-based" liquidation. In decentralized exchanges, liquidity providers are automatically set to always buy from sellers and sell to buyers; in Timeswap, liquidity providers always lend to borrowers and borrow from lenders, playing a similar role in the market. They are also responsible for loan defaults and have priority to receive confiscated collateral as compensation.
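The Uniswap v3 range-position math that Panoptic and Infinity Pools rely on can be sketched directly: given the current price, a position's composition is computable with no oracle. These are the standard v3 liquidity formulas; the variable names and numbers are mine:

```python
import math

def v3_position_amounts(L: float, p: float, pa: float, pb: float):
    """Token amounts held by a position with liquidity L over the price
    range [pa, pb] when the pool trades at price p (token1 per token0).
    Standard Uniswap v3 formulas:
      p <= pa: all token0;  p >= pb: all token1;  otherwise mixed."""
    if p <= pa:
        return L * (1 / math.sqrt(pa) - 1 / math.sqrt(pb)), 0.0
    if p >= pb:
        return 0.0, L * (math.sqrt(pb) - math.sqrt(pa))
    return (L * (1 / math.sqrt(p) - 1 / math.sqrt(pb)),
            L * (math.sqrt(p) - math.sqrt(pa)))

# Above the upper bound the position is 100% token1: the payoff Panoptic
# maps to a sold option, and the property that lets Infinity Pools
# denominate repayment without any price feed.
amt0, amt1 = v3_position_amounts(L=1000.0, p=2600.0, pa=2000.0, pb=2500.0)
assert amt0 == 0.0
assert abs(amt1 - 1000.0 * (math.sqrt(2500.0) - math.sqrt(2000.0))) < 1e-9
```

In other words, "check the LP composition on Uniswap" in the descriptions above is just evaluating these piecewise formulas against the pool's current price.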
Conclusion
Pricing data remains an important part of many decentralized applications, and the total value secured by oracles continues to grow over time, further confirming their product-market fit. This article aims to give readers an overview of the challenges OEV currently poses, as well as the design space across push-based, pull-based, and alternative designs using AMM liquidity positions or off-chain coprocessors.
We love seeing the vibrant community of developers looking to solve these tough design challenges. If you’re working on a disruptive project in this space, we’d love to hear from you!
References and Acknowledgements
Thanks to Jonathan Yuen and Wintersoldier for their contributions and conversations that greatly contributed to this article.
Thanks to Erik Lie, Richard Yuen (Hailstone), Marc, Mario Bernardi, Anirudh Suresh (Pyth), Ugur Mersin (API3 DAO), and Mimi (Timeswap) for their valuable comments, feedback, and reviews.
1. https://defillama.com/oracles (14 Nov)
2. OEV Litepaper: https://drive.google.com/file/d/1wuSWSI8WY9ChChu2hvRgByJSyQlv_8SO/edit
3. Frontrunning on Synthetix: A History, by Kain Warwick: https://blog.synthetix.io/frontrunning-synthetix-a-history/
4. https://snapshot.org/#/rook.eth/proposal/0x523ea386c3e42c71e18e1f4a143533201083655dc04e6f1a99f1f0b340523c58
5. https://docs.pyth.network/documentation/pythnet-price-feeds/on-demand
6. https://docs.pyth.network/documentation/solana-price-feeds/best-practices#latency
7. Aave liquidation figures: https://dune.com/queries/3247324
8. OEV Litepaper: https://drive.google.com/file/d/1wuSWSI8WY9ChChu2hvRgByJSyQlv_8SO/edit
9. https://twitter.com/bboexchange/status/1726801832784318563
10. https://gmx-io.notion.site/gmx-io/GMX-Technical-Overview-47fc5ed832e243afb9e97e8a4a036353
11. https://gmxio.substack.com/p/gmx-v2-powered-by-chainlink-data
12. https://docs.chain.link/data-streams
13. https://sips.synthetix.io/sips/sip-281/
Appendix
Definition: Push vs. Pull Oracles
Push-based oracles maintain prices off-chain in a P2P network and write updates on-chain when predefined trigger conditions are met. Taking Chainlink as an example, price updates are driven by two trigger parameters: a deviation threshold and a heartbeat. The Ethereum ETH/USD price feed, for instance, updates whenever the off-chain price deviates from the latest on-chain price by 0.5%, or when the 1-hour heartbeat timer reaches zero.
In this model, oracle operators must pay transaction fees for every price update, creating a trade-off between cost and scalability: more price feeds, more supported blockchains, or more frequent updates all mean additional transaction costs. As a result, long-tail assets inevitably get less responsive feeds with higher trigger parameters. Take CRV/USD as an example: with a 1% deviation threshold and a 24-hour heartbeat, if the price never deviates by more than 1%, the feed updates only once every 24 hours. Intuitively, this lack of granularity for long-tail assets introduces extra risk factors that applications must consider when creating markets for these assets, which helps explain why the vast majority of DeFi activity still revolves around the most liquid, largest-cap tokens.
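The two trigger conditions described above can be sketched directly, using the ETH/USD parameters from the example (0.5% deviation, 1-hour heartbeat); the function and timestamps are illustrative:

```python
DEVIATION = 0.005  # 0.5% deviation threshold (ETH/USD example)
HEARTBEAT = 3600   # 1-hour heartbeat, in seconds

def should_push_update(last_price: float, last_update_ts: int,
                       offchain_price: float, now: int) -> bool:
    """A push oracle writes a new price when EITHER trigger fires:
    the off-chain price has deviated past the threshold, or the
    heartbeat interval has elapsed since the last on-chain update."""
    deviated = abs(offchain_price - last_price) / last_price >= DEVIATION
    heartbeat_due = now - last_update_ts >= HEARTBEAT
    return deviated or heartbeat_due

# 0.6% move: the deviation trigger fires even though the heartbeat hasn't.
assert should_push_update(2000.0, 0, 2012.0, 600)
# 0.1% move but a stale feed: the heartbeat trigger fires instead.
assert should_push_update(2000.0, 0, 2002.0, 3700)
# Fresh feed, small move: no update, and this gap is the cost trade-off.
assert not should_push_update(2000.0, 0, 2002.0, 600)
```

Raising either parameter (as on long-tail feeds like the CRV/USD example) directly widens the window in which the on-chain price can be stale.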
In contrast, pull-based oracles let prices be pulled on-chain on demand. Pyth, the most prominent example today, streams price updates off-chain, signs each update so anyone can verify its authenticity, and maintains aggregate prices on Pythnet, an application-specific blockchain based on Solana's codebase. When an update is needed, the aggregate price is attested through Wormhole and can then be pulled on-chain permissionlessly.
The diagram above describes the architecture of the Pyth price feed: when an on-chain price needs updating, a user requests an update through the Pyth API; the verified price on Pythnet is sent to the Wormhole contract, which observes it and creates a signed VAA (Verified Action Approval) that can be verified on any blockchain where the Pyth contract is deployed.
Disclaimer: This article does not constitute investment advice. Users should consider whether any opinions, views or conclusions in this article are suitable for their specific circumstances and comply with the relevant laws and regulations of the country and region where they are located.