Pyth Network
A brief primer on oracles, a look into the LUNA crash, and an introduction to Pyth
1 Why Oracles?
Blockchains are very good at storing data across multiple systems, akin to a decentralized ledger. However, blockchains are gated (imagine a computer without internet access) and thus unable to communicate with each other or with the outside world [1]. This default limitation reduces the basic use case of a blockchain to a store of value, so it is no surprise that the narrative around BTC is digital gold.
BTC proved that being an immutable store of value has product-market fit. But to enable other blockchain use cases, the dApps built on top of a layer one must be able to interact with off-chain data sources. Oracles act as this gateway, trustlessly piping data from the outside world into the blockchain and powering most DeFi and NFT dApps.
A few examples of oracles in action are lending and borrowing protocols like Aave and Compound, which pull updated crypto prices for collateral calculations. When the supplied collateral falls in value below the acceptable threshold for the borrowed assets, the collateral is liquidated to ensure the protocol carries no bad debt. Trading protocols like dYdX and Mango also pull in index prices to help generate funding rates. This article lists many more potential oracle use cases.
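As a rough illustration of the collateral check described above, here is a minimal Python sketch. The function name, the 80% liquidation threshold, and the sample prices are hypothetical and are not taken from Aave or Compound; real protocols use per-asset parameters and more involved accounting.

```python
def is_liquidatable(collateral_amount, oracle_price, debt_usd, liquidation_threshold):
    """Return True when the debt exceeds what the collateral is allowed to back.

    Every parameter here is an illustrative assumption, not a real protocol API.
    """
    collateral_usd = collateral_amount * oracle_price
    return debt_usd > collateral_usd * liquidation_threshold


# Hypothetical position: 10 units of collateral backing 12,000 USD of debt.
healthy = is_liquidatable(10, 2_000.0, 12_000.0, 0.80)     # capacity is 16,000 USD
after_drop = is_liquidatable(10, 1_400.0, 12_000.0, 0.80)  # capacity is 11,200 USD
```

The only input that changes between the two calls is the oracle price, which is why a stale or wrong feed translates directly into wrongful liquidations or bad debt.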
Oracles are an integral part of web3, and a quick measure of their popularity is total value secured (TVS). Below we can see the value secured by oracles slowly increasing until the LUNA/UST debacle (red line) and the subsequent market-wide downturn.
2 Oracle Dangers
Because oracles are the backbone of DeFi, it is of the utmost importance that they relay accurate data swiftly. There have been a few cases where inaccurate oracle prices were quickly followed by catastrophe. A few months ago, Pyth reported a $5,000 BTC/USD mark when the actual trading price was over $40,000, and some traders on Bonfida's perp exchange were liquidated [2][3].
2.1 Luna
Just recently, LUNA spiraled from around $100 to 0, wiping out nearly $60 billion of wealth ($40B from LUNA and $20B from UST) in a few days. The contagion from this massive unwind is still being felt as leverage is unwound and firms and lending desks become insolvent. This large one-directional price move triggered a (minimum value) circuit breaker for the Chainlink LUNA/USD price feed, which was locked at $0.107 [4].
In this scenario, the actual traded price of LUNA kept dropping while Chainlink still reported $0.107, which created a cascade of losses for protocols using the Chainlink price feed. Two dApps, Blizz Finance and Venus Protocol, lost $10m and $13.5m respectively.
As LUNA went to 0, attackers deposited the nearly worthless LUNA into Blizz/Venus and slowly siphoned out the other assets. Below is the three-step attack process; we will name our perpetrator Jack:
1. Jack deposits 412k LUNA into Venus, which is counted as 41k USD because of the Chainlink price.
2. Jack then withdraws 24k BUSD, because LUNA has a 60% LTV factor.
3. Jack swaps 10k BUSD for 434k LUNA and ends up with 22k more LUNA and 14k more BUSD. Rinse and repeat.
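The arithmetic behind one loop of the attack can be checked with a short Python sketch. It uses only the rounded figures quoted in the steps above. The variable names are illustrative, and the two implied prices are back-calculated here from those rounded figures rather than taken from on-chain data.

```python
# Figures quoted in the three steps above (rounded, as in the article).
luna_deposited = 412_000   # LUNA deposited as collateral
credited_usd = 41_000      # USD value assigned to it via the stale feed
busd_borrowed = 24_000     # BUSD withdrawn against that collateral
busd_swapped = 10_000      # BUSD spent buying LUNA on the open market
luna_bought = 434_000      # LUNA received for that BUSD

# Implied prices (derived, not reported): stale oracle price vs. market price.
oracle_price = credited_usd / luna_deposited
market_price = busd_swapped / luna_bought

# Share of the credited value that was borrowed (close to the 60% LTV).
borrow_ratio = busd_borrowed / credited_usd

# Net position after one loop; the deposited LUNA is simply abandoned.
net_luna = luna_bought - luna_deposited
net_busd = busd_borrowed - busd_swapped

print(f"implied oracle price: ${oracle_price:.3f}, implied market price: ${market_price:.3f}")
print(f"borrowed {borrow_ratio:.0%} of the credited value")
print(f"net gain per loop: {net_luna:,} LUNA and {net_busd:,} BUSD")
```

With these figures the oracle valued LUNA at more than four times the price implied by the swap, which is the gap Jack extracts on every loop.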
When the circuit breaker was triggered, the Chainlink team alerted the impacted applications, but many were unable to pause in time. Below is the statement from the team regarding this unfortunate situation [5].
2.2 No oracles
On the exact opposite side is Kava, an L1 built on Cosmos that had no oracle for UST and simply hardcoded it to 1 dollar. A mechanism allowed UST to be used as collateral to mint USDX, the native stablecoin. This bug/feature was abused once UST started to depeg: arbitrageurs minted USDX against the depegging UST, swapped it for other assets, and bridged those assets off the chain. This attack on LPs in Kava can be seen in the USDX price (minted against worthless UST) and in the TVL of Kava. Right when UST started to depeg around May 8th, both charts started crashing, and they only recovered a little after UST was removed as a collateral option for USDX. There was a TVL decline of over $250m on the Kava chain, and while some of that is due to declining asset prices, we can safely assume that a good chunk of it is due to the hard-coding failure.
3 What is Pyth?
The calamities above make clear how important clean oracle data is. The Pyth team has created a unique and robust model to bring reliable market data to web3. After examining existing oracle solutions, the team built Pyth from first principles to minimize the issues of those solutions while maximizing the amount of accurate new data. There are three types of participants in the Pyth network: the consumers who use the data, the publishers who submit the data, and the delegators who validate and back the data [6].
These participants primarily interact via four on-chain mechanisms, which are designed to be extensible and robust against malicious actors.
Price aggregation combines the reported prices and confidence intervals of individual publishers into a single price feed and confidence interval feed for a specific product (e.g. BTC/USD feed) [7].
Data staking allows delegators to stake tokens to earn data fees [8].
Reward distribution determines the share of the reward pool earned by each publisher. Each product has a reward pool that delegators can stake into [9].
Governance uses a coin-voting system that will help determine the high-level parameters of the three mechanisms above [10].
Here is a microscopic look into the ecosystem: Solend is a borrowing and lending dApp that needs accurate price feeds and takes the role of the consumer. Various publishers submit asset prices through Pyth, and delegators stake their tokens to specific products and publishers. All the publishers' data [11] is aggregated into one price, and Solend pays the publishers and delegators for this price information. In the event that the published information is incorrect, the delegators get slashed and Solend receives the slashed tokens, similar to an insurance policy. This works like a free market: the more Solend and other protocols pay for accurate prices, the more delegators will stake (higher rewards and a larger insurance policy) and the more likely publishers are to submit accurate data.
The following subsections explore the mechanisms of Pyth, how they interact, and how they differ from existing solutions.
3.1 Price Aggregation
For Pyth, publishers submit both a price and a confidence feed (25% and 75% confidence intervals). This is very different from other oracles, which accept only a single price point, and it allows Pyth to create more informative prices. Pyth then aggregates these values into a single price and confidence feed that is robust to manipulation. Different weights are assigned to each publisher based on their stakes. The price aggregation algorithm is a variant of the weighted median and is outside the scope of this article, but below are examples of the aggregation procedure. In each of the subplots, the red line is the aggregated price and the lines below are the various publishers' prices.
In subplot (a) we see that an outlier does not impact the aggregated price. Subplot (b) shows that a tighter confidence interval signals a more confident publisher, and the model brings the aggregated price closer to the publishers with tighter confidence intervals. Subplots (c) and (d) show that the aggregation model can adequately reflect the variation in publisher points even when there is disagreement. In each scenario, incorporating the confidence interval helps create a more accurate aggregate price. Pyth is the first oracle to use the confidence interval in this way to get a more accurate representation of the current price.
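To make the outlier behaviour in subplot (a) concrete, here is a minimal Python sketch of a plain stake-weighted median. This is not Pyth's actual aggregation algorithm, which is a variant that also uses the confidence intervals. The function name, the publisher prices, and the stakes are made up for illustration.

```python
def weighted_median(prices, weights):
    """Return the lowest price at which the cumulative weight reaches half the total."""
    if not prices or len(prices) != len(weights):
        raise ValueError("prices and weights must be non-empty and of equal length")
    if any(w <= 0 for w in weights):
        raise ValueError("weights must be positive")
    half = sum(weights) / 2
    cumulative = 0.0
    # Walk through the publishers from the lowest price to the highest.
    for price, weight in sorted(zip(prices, weights)):
        cumulative += weight
        if cumulative >= half:
            return price
    # Not reached for positive weights; kept only as a safeguard.
    return max(prices)


# Hypothetical BTC/USD submissions; the last publisher misplaces a decimal point.
prices = [40_100.0, 40_050.0, 39_950.0, 40_000.0, 5_000.0]
stakes = [10, 20, 30, 25, 15]

aggregate = weighted_median(prices, stakes)
simple_mean = sum(prices) / len(prices)
print(aggregate, simple_mean)
```

The weighted median stays with the cluster of honest prices, whereas the simple mean is dragged far below it by the single outlier. An outlier can only move the median if it controls roughly half of the total stake.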
3.2 Claims
If anyone believes that inaccurate price information was submitted, they may file a claim to determine whether a delegator should be slashed to pay out the consumer. To avoid the oracle errors described in the section above, it is important to have an impartial claims process. Pyth's process relies on the HUMAN protocol to collect price information for a product from the Pyth feed and from reference exchanges.
An oracle error is defined as the case where both the highest and the lowest data points from the reference exchanges fall outside 3 confidence intervals on either side of the Pyth feed. If this is the case, the at-fault publishers have their stake slashed, and the slashed stake is given to the consumers of this feed in proportion to the fees they paid. The wiki gives a specific example of a sample claim.
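The claim rule above can be sketched in a few lines of Python. This is an interpretation of the definition as worded in this article, not Pyth's on-chain logic; the function names, the sample prices, and the consumer names are hypothetical.

```python
def is_oracle_error(pyth_price, pyth_conf, reference_prices, num_intervals=3):
    """True when both the highest and lowest reference prices lie outside the band."""
    lower = pyth_price - num_intervals * pyth_conf
    upper = pyth_price + num_intervals * pyth_conf

    def outside(price):
        return price < lower or price > upper

    return outside(max(reference_prices)) and outside(min(reference_prices))


def split_slashed_stake(slashed_amount, fees_paid):
    """Share the slashed stake among consumers in proportion to the fees they paid."""
    total_fees = sum(fees_paid.values())
    if total_fees <= 0:
        raise ValueError("at least one consumer must have paid fees")
    return {name: slashed_amount * fee / total_fees for name, fee in fees_paid.items()}


# Hypothetical feed: price 100 with confidence 0.5, so the band is [98.5, 101.5].
error = is_oracle_error(100.0, 0.5, [103.0, 104.0, 102.5])  # every reference is outside
no_error = is_oracle_error(100.0, 0.5, [100.2, 104.0])      # lowest reference is inside
payout = split_slashed_stake(1_000.0, {"consumer_a": 30.0, "consumer_b": 10.0})
```

Note that a wide confidence interval makes the band wide as well, so a publisher that honestly reports high uncertainty is less likely to be found at fault.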
3.3 Rewards
The main issue with existing oracle data is that the rewards are designed for multiple counterparties publishing public information, so these publishers will all submit the same price. However, it is often the case that a few counterparties have private information that could lead to more accurate price predictions. They are not rewarded for this extra information, which creates a few unintended negative consequences. Say one publisher has private information signaling that the price of an asset at the next timestamp should be $5, but the aggregate price at the previous timestamp was $4, and the price is not likely to move 25% without knowledge of the private data. This publisher knows the aggregate price will be around $4 and so also submits something close to $4, which means the private information never reaches the feed.
Pyth solves this scenario with a quality score that rewards publishers who share new information by checking how well a price series predicts future aggregate price changes. This allows new pertinent information to be included in the price series and also incorporates the confidence feed as a metric of price accuracy. This quality score [12] works alongside a few other scores (like publisher staked percent) to return a final rewards score.
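To show the idea of rewarding predictive information, here is a deliberately simplified Python sketch built on the $4 and $5 example. It is not Pyth's quality score, which is computed with an online regression model. It swaps in a much cruder measure: whether a publisher's price at one timestamp was closer to the next aggregate than the current aggregate was. The function name and the price series are hypothetical.

```python
def predictive_skill(publisher_prices, aggregate_prices):
    """Compare the publisher's price with the current aggregate as predictors
    of the next aggregate, using summed squared errors.

    Returns 1.0 for a perfect predictor, 0.0 for a publisher that only echoes
    the aggregate, and a negative value for one that does worse than echoing.
    """
    if len(publisher_prices) != len(aggregate_prices) or len(aggregate_prices) < 2:
        raise ValueError("need two series of equal length with at least two points")
    publisher_error = 0.0
    baseline_error = 0.0
    for t in range(len(aggregate_prices) - 1):
        next_aggregate = aggregate_prices[t + 1]
        publisher_error += (publisher_prices[t] - next_aggregate) ** 2
        baseline_error += (aggregate_prices[t] - next_aggregate) ** 2
    if baseline_error == 0:
        return 0.0  # the aggregate never moved, so there was nothing to predict
    return 1.0 - publisher_error / baseline_error


aggregate = [4.0, 5.0, 5.0]   # the aggregate moves from $4 to $5
informed = [5.0, 5.0, 5.0]    # publishes $5 early, based on private information
follower = [4.0, 5.0, 5.0]    # simply echoes the aggregate

informed_skill = predictive_skill(informed, aggregate)
follower_skill = predictive_skill(follower, aggregate)
```

Under a measure of this kind, the publisher from the example is better off submitting $5 than hiding behind the previous aggregate of $4.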
3.4 Data Moat
In a web3 world where everything is forkable, moats are increasingly hard to come by. Pyth has one of the strongest moats because of its data providers. As of July 2022, there are over 60 data providers, including traditional HFT firms, crypto-native trading firms, and various centralized and decentralized exchanges, along with traditional equities exchanges. It is important to note how historic this is: competing market participants are actively collaborating. This data cannot be forked easily, and it is precisely the high data quality, along with an innovative aggregation method and staking method, that creates such a strong product. The Pyth network creates a powerful economic design that is positive-sum for the entire ecosystem.
3.5 Performance During Luna Crash
During bouts of high volatility, it is imperative that oracles are reliable, both in availability and in accuracy of information. Here we examine in depth the Luna crisis, when LUNA crashed over 99.99% in a matter of days. Between 5/7 and 5/13, the Pyth LUNA/USD [13] price feed remained available even as multiple centralized exchanges delisted LUNA and the price nosedived to fractions of a penny. Below is a plot with the percentage of updates and how frequently they happened. We can see that over 62% of updates occurred within 2 seconds and 88% happened within 20 seconds.
As multiple centralized exchanges experienced deposit and withdrawal delays, reducing the arbitrage opportunities, Pyth was able to relay accurate aggregate price information. Below is a snapshot of the 5 centralized exchanges (CEXs) with the highest volume compared to Pyth on 5/11. We see that the Pyth price feed is a good approximation of these top CEXs.
Afterwards, when the various CEXs started delisting LUNA, the Pyth data providers shifted to pull from the remaining DeFi sources and the remaining centralized exchanges. On 5/13, LUNA had cratered to fractions of a penny and only FTX still had it listed, and we can see the price was still in line. Then, when FTX delisted it as well, the Pyth providers easily switched to the decentralized and other smaller exchanges, creating accurate and reliable price feeds to send to various dApps.
4 Pyth Growth Metrics
Pyth has been running in beta since April 2021 on devnet and August 2021 on mainnet, and slowly but surely it has become the dominant oracle for Solana. Looking at analytics, we see explosive month-over-month growth, and there are now 80 different price feeds and over 60 active publishers (almost a new publisher each week!). The blue line displays the total number of feeds while the red line shows all of the providers.
Furthermore, over 60 Solana dApps now use Pyth on a regular basis, and these dApps constantly ping the price oracles. In fact, there are between 1.5m and 2.5m single price calls from dApps daily. These metrics show a vibrant and expanding ecosystem on the Pyth Network. Pyth is also deployed on BNB chain testnet and devnet, and will eventually be a true multichain oracle.
5 Conclusion
Oracles have very strong product-market fit, and given Pyth's innovative and economically sound mechanisms, I think it has a chance to take market share from the existing oracles. Even without the $PYTH token (soon?), the Pyth network has become an integral part of the Solana ecosystem, growing rapidly in important KPIs month over month.
This article is an overview that introduces the basic concepts of oracles and focuses on the current state of the Pyth network. A follow-up article will be written after the token launches and we have more data points on the interaction between the publishers, consumers, and delegators. It would be interesting to see how many publishers are also delegators and how often slashing occurs, and to compare the price feed with other protocols' feeds.
It is important to reiterate that Pyth’s economic design is built specifically to maintain maximum uptime and price accuracy during periods of uncertainty. While other oracles may pause or slow down updates during rapid price changes, Pyth believes these practices are suboptimal and impractical, especially in an asset class as volatile as crypto.
Note: Nothing in the article constitutes professional and/or financial advice. Ape if you want to ape (when token comes out).
[1] This is known as the oracle problem.
[2] Two of the publishers messed up a decimal point.
[3] It is worth noting that the confidence interval during this time accurately reflected the erroneous price and was over 8x the reported price. Protocols should adopt better practices and incorporate both the confidence interval and the aggregated price. More information about the confidence interval is in the Pyth section.
[4] This feature was coded into the smart contract to prevent flash crashes and other forms of market manipulation.
[5] As a third-party observer, I understand why the feed was paused, but I believe that uptime is very important for an oracle, and crypto is 24/7, so it's hard for the protocol devs to constantly monitor messages.
[6] A single actor may perform multiple roles as consumer, publisher, and/or delegator.
[7] This mechanism is designed to produce robust price feeds whose prices cannot be significantly influenced by small groups of publishers.
[8] The delegators in aggregate also determine the level of influence (stake-weight) that each publisher has on the aggregate price. In addition, this mechanism determines whether delegators’ stakes are slashed. Finally, the mechanism collects data fees from consumers and distributes a share to delegators (initially set at 80%). The remainder (20%) goes into a reward pool that is distributed among publishers.
[9] The reward distribution mechanism preferentially rewards publishers with higher quality price feeds and reduces the likelihood that uninformed publishers will earn rewards.
[10] Parameters include what types of tokens may be used for data fees; which products are listed on Pyth; the share of data fees allocated to publishers, delegators, and other uses; the number of PYTH tokens that publishers must stake or enable claims to be filed against a product, and more.
[11] There are many safeguards designed to prevent malicious publishers.
[12] Calculated by training an online regression model that predicts the future price from several features of a publisher's price series. A detailed formula can be found in the white paper.
[13] Data/charts in this section were provided by the Pyth team. Historical data cannot currently be pulled publicly.