Paradigm: A detailed explanation of Ethereum's history growth problem and its solutions
Original authors: Storm Slivkoff, Georgios Konstantopoulos
Original compilation: Luffy, Foresight News
History growth is currently the biggest bottleneck to scaling Ethereum. Surprisingly, history growth has become a bigger problem than state growth. Within a few years, historical data will exceed the storage capacity of many Ethereum nodes.
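Here's the good news: history growth is not a difficult problem to solve, and a clear solution already exists in EIP-4444.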
In this post, we continue the examination of Ethereum's scaling problems that began in Part 1, turning our attention from state growth to history growth. Using granular datasets, our goals are to 1) develop a technical understanding of Ethereum's scaling bottlenecks, and 2) help open the discussion around the optimal approach to Ethereum's gas limit.
What is history growth?
History is the collection of all blocks and transactions that Ethereum has executed over its lifetime: all data from the genesis block to the current block. History growth is the accumulation of new blocks and transactions over time.
Figure 1 shows how history growth relates to various protocol metrics and Ethereum node hardware constraints. Compared to state growth, history growth is limited by a different set of hardware constraints. History growth puts pressure on network IO, because new blocks and transactions must be transmitted throughout the network. It also puts pressure on node storage, because each Ethereum node stores a complete copy of history. If history grows fast enough to exceed these hardware limits, nodes will no longer be able to reach stable consensus with their peers. For an overview of state growth and other scaling bottlenecks, see Part 1 of this series.
Figure 1: Ethereum scaling bottlenecks
Until recently, most of each node's network throughput was used to transmit history (that is, new blocks and transactions). This changed with the introduction of blobs in the Dencun hard fork. Blobs now account for a large portion of node network activity. However, blobs are not considered part of history because 1) nodes store them for only 2 weeks and then discard them, and 2) they are not needed to replay the chain from genesis. Because of (1), blobs do not significantly increase the storage burden of each Ethereum node. We discuss blobs later in this article.
In this article, we focus on history growth and discuss the relationship between history and state. Because state growth and history growth share some hardware constraints, they are related problems, and solving one can help solve the other.
How fast is history growing?
Figure 2 shows the rate of history growth since the creation of Ethereum. Each vertical bar represents one month of growth. The y-axis shows the number of gigabytes of history added in that month. Transactions are categorized by their destination address and sized by their RLP-encoded bytes. Contracts that cannot be easily identified are classified as "unknown". The "other" category includes a range of smaller categories such as infrastructure and games.
Figure 2: Ethereum history growth rate over time
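The aggregation behind a chart like Figure 2 can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the `LABELS` map, the placeholder addresses, and the input tuple layout are all assumptions, and the transaction sizes are taken as already measured in RLP bytes.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical label map (destination address -> category).
# Real labels would come from a curated contract dataset.
LABELS = {
    "0xaaa": "rollup",
    "0xbbb": "dex",
}


def monthly_history_bytes(txs, labels=LABELS):
    """Sum transaction sizes per (month, category).

    txs: iterable of (unix_timestamp, to_address, rlp_size_bytes).
    Addresses missing from `labels` are counted as "unknown".
    """
    totals = defaultdict(int)
    for ts, to_addr, size in txs:
        month = datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m")
        category = labels.get(to_addr, "unknown")
        totals[(month, category)] += size
    return dict(totals)
```

Dividing each total by 10^9 would give the gigabytes-per-month values used on the chart's y-axis.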
A few key takeaways from the chart above:
Who is the biggest contributor to Ethereum's history growth?
The amount of history generated by each contract category reveals how Ethereum's usage patterns have evolved over time. Figure 3 shows the relative contributions of the various contract categories. This is the same data as in Figure 2, normalized.
Figure 3: Contribution of different contract categories to history growth
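The normalization that turns absolute monthly sizes into relative contributions can be sketched like this. The function name and the input layout (a dict keyed by month and category) are assumptions made for illustration.

```python
def category_shares(totals):
    """Convert absolute sizes into each category's share of its month.

    totals: {(month, category): bytes}
    returns: {(month, category): fraction of that month's total}
    """
    month_sums = {}
    for (month, _category), size in totals.items():
        month_sums[month] = month_sums.get(month, 0) + size
    return {
        key: size / month_sums[key[0]]
        for key, size in totals.items()
        if month_sums[key[0]] > 0  # skip months with no data
    }
```

Within any month that has data, the shares sum to 1, which is what lets the chart compare eras with very different absolute volumes.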
This data reveals four different periods of Ethereum usage patterns:
Each era represents a more complex pattern of Ethereum usage than the one before it. Over time, this growing complexity can be seen as a form of Ethereum scaling, one that cannot be measured by simple metrics such as transactions per second.
In the most recent month of data (April 2024), rollups no longer generate the majority of history. It is unclear whether future history will come mainly from DEXs and DeFi, or whether some new usage pattern will emerge.
What about blobs?
The Dencun hard fork dramatically changed the dynamics of history growth by introducing blobs, which allow rollups to publish data using cheap blobs instead of history. Figure 4 zooms in on history growth before and after the Dencun upgrade. The chart is similar to Figure 2, except that each vertical bar represents a day instead of a month.
Figure 4: Impact of Dencun on history growth
From this chart, we can draw several key conclusions:
Although blobs have reduced the rate of history growth, they are still a new feature of Ethereum. It is unclear at what level history growth will stabilize in the presence of blobs.
How much history growth is acceptable?
Increasing the gas limit will increase the rate of history growth. Therefore, proposals to raise the gas limit, such as Pump the Gas, must take into account the relationship between history growth and each node's hardware bottlenecks.
To determine an acceptable rate of history growth, we must first understand how long current node hardware can keep up, in terms of both networking and storage. Network hardware may be able to sustain the status quo indefinitely, since history growth is unlikely to return to its pre-Dencun peak until the gas limit is raised. However, the storage burden of history increases over time. Under the current storage strategy, it is inevitable that each node's disk will eventually fill up with history.
Figure 5 shows the storage burden of an Ethereum node over time and projects how that burden will grow over the next 3 years. The projection is based on the growth rate in April 2024. This rate may increase or decrease as usage patterns or gas limits change.
Figure 5: Size of history, state, and full node storage burden
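A constant-rate projection of the kind described for Figure 5 can be sketched as below. The function names are hypothetical, and any numbers passed in are placeholders chosen by the caller rather than measurements from this article.

```python
import math


def project_storage(current_gb, monthly_growth_gb, months):
    """Storage burden at the end of each month, assuming a constant growth rate."""
    return [current_gb + monthly_growth_gb * m for m in range(months + 1)]


def months_until_full(current_gb, monthly_growth_gb, disk_gb):
    """Whole months until the burden reaches disk capacity.

    Returns 0 if the disk is already full, or None if the burden never grows.
    """
    if current_gb >= disk_gb:
        return 0
    if monthly_growth_gb <= 0:
        return None
    return math.ceil((disk_gb - current_gb) / monthly_growth_gb)
```

Because the rate is held fixed, this is only a baseline: a gas limit increase or a shift in usage patterns would change the slope.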
From this graph, we can draw several key conclusions:
Unlike state data, historical data is append-only and accessed much less frequently. It is therefore theoretically possible to store historical data separately from state data, on a cheaper storage medium. This can already be done with some clients, such as Geth.
In addition to storage capacity, network IO is the other major constraint on history growth. Unlike storage capacity, network IO limits will not cause problems for nodes in the short term, but they will become important for future gas limit increases.
To understand how much history growth the network capacity of a typical Ethereum node can support, it is important to know how history growth relates to various network health metrics, such as reorg rate, missed slots, missed finality, missed attestations, missed sync committees, and block submission latency. Analysis of these metrics is beyond the scope of this article, but more information can be found in previous studies of consensus layer health. In addition, the Ethereum Foundation's Xatu project has been building public datasets to speed up this kind of analysis.
How can the history growth problem be solved?
History growth is a much easier problem to solve than state growth. It can be addressed almost entirely by a candidate proposal, EIP-4444. This EIP changes each node from keeping the entire history of Ethereum to keeping only one year's worth of history. After EIP-4444 is implemented, data storage will no longer be a bottleneck to scaling Ethereum, and in the long run it will no longer constrain gas limit increases. EIP-4444 is necessary for the long-term sustainability of the network; without it, history will keep growing quickly and node hardware will need to be upgraded regularly.
Figure 6 shows the impact of EIP-4444 on the storage burden of each node over the next 3 years. This is the same as Figure 5, but with the addition of lighter lines indicating the storage burden after EIP-4444 is implemented.
Figure 6: Impact of EIP-4444 on Ethereum node storage burden
Some key conclusions can be seen from this graph:
After EIP-4444 is implemented, history growth will still impose some storage burden, because nodes will store a year's worth of history. However, even if Ethereum reaches global scale, this burden will not be difficult to handle. Once history preservation methods prove to be reliable, the one-year expiry period of EIP-4444 may be shortened to months, weeks, or even less.
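The difference between keeping all history and keeping a rolling window can be sketched as follows. This is a simplified model, not the EIP's specification: it works in whole months, and `monthly_gb` is a hypothetical list of history added per month.

```python
def stored_history(monthly_gb, retention_months=None):
    """History held on disk at the end of each month.

    monthly_gb: GB of history added in each month, oldest first.
    retention_months: None keeps everything (current behavior);
        an integer keeps only the most recent N months, as a rough
        model of EIP-4444-style expiry.
    """
    stored = []
    for i in range(len(monthly_gb)):
        if retention_months is None:
            start = 0
        else:
            start = max(0, i + 1 - retention_months)
        stored.append(sum(monthly_gb[start:i + 1]))
    return stored
```

With `retention_months=12`, the stored amount stops accumulating and instead tracks the most recent year of growth, which is why the burden flattens out once expiry takes effect.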
How should Ethereum history be preserved?
EIP-4444 raises a question: if history is not preserved by Ethereum nodes themselves, how should it be preserved? History plays a central role in Ethereum's verification, accounting, and analysis, so preserving it is crucial. Fortunately, history preservation is a simple problem that requires only 1 of n data providers to be honest. This is in contrast to state consensus, which requires 1/3 to 2/3 of participants to be honest. Node operators can verify the authenticity of a historical dataset by 1) replaying all transactions since the genesis block and 2) checking that these transactions reproduce the same state root as the current chain tip.
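The 1-of-n property can be illustrated with a toy model. This sketch is an assumption-heavy stand-in: real verification executes transactions in the EVM and compares Ethereum state roots, whereas here "replay" simply folds each transaction into a SHA-256 hash chain. The provider callables and function names are hypothetical.

```python
import hashlib

# Toy stand-in for the genesis state root (not a real Ethereum value).
GENESIS_ROOT = hashlib.sha256(b"genesis").digest()


def replay(txs, root=GENESIS_ROOT):
    """Toy replay: fold each transaction into a running hash.

    Stands in for executing transactions and computing a state root.
    """
    for tx in txs:
        root = hashlib.sha256(root + tx).digest()
    return root


def first_valid_history(providers, trusted_root):
    """Return the first provider's history that replays to the trusted root.

    providers: callables that each return a list of transaction bytes.
    A single honest provider is enough; dishonest or failing ones are skipped.
    """
    for fetch in providers:
        try:
            txs = fetch()
        except Exception:
            continue  # unavailable provider: try the next one
        if replay(txs) == trusted_root:
            return txs
    return None
```

The point of the design is that the node never has to trust a provider: any tampered dataset fails the replay check, so additional dishonest providers cost time but cannot corrupt the result.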
There are many ways to preserve history.
The remaining implementation challenges are more social than technical. The Ethereum community needs to coordinate on specific implementation details so that they can be integrated directly into each node client. In particular, performing a full sync from the genesis block (as opposed to a snap sync) will require retrieving history from a history provider instead of from Ethereum nodes. These changes do not technically require a hard fork, so they can be implemented before Ethereum's next hard fork, Pectra.
All of these history preservation methods can also be used by L2s to preserve the blob data they publish to mainnet. Compared to history preservation, blob preservation is 1) more difficult, because the total amount of data is larger, and 2) less important, because blobs are not needed to replay mainnet history. However, blob preservation is still necessary for each L2 to replay its own history, so some form of blob preservation is important for the entire Ethereum ecosystem. In addition, if L2s develop robust blob storage infrastructure, they may also be able to easily store historical L1 data.
It can be helpful to directly compare the datasets stored by various node configurations before and after EIP-4444. Figure 7 shows the storage burden for different Ethereum node types. State data is accounts and contracts, historical data is blocks and transactions, and archive data is an optional set of data indexes. The byte counts in this table are based on a recent reth snapshot, but the numbers for other node clients should be roughly the same.
Figure 7: Storage burden for different Ethereum node types
Other proposals
Finally, there are additional EIPs that can limit the rate of history growth, rather than just accommodating the current rate. This helps stay within network IO constraints in the short term and storage constraints in the long term. Although EIP-4444 is still necessary for the long-term sustainability of the network, these other EIPs will help Ethereum scale more efficiently in the future:
These EIPs are easier to implement than EIP-4444, so they may serve as short-term options before EIP-4444 goes into production.
Conclusion
The purpose of this article is to use data to understand 1) how history growth works and 2) how the problem can be solved. Much of the data in this article is difficult to obtain through conventional means, so we wanted to make it public and offer some new insights into the history growth problem.
History growth has not received enough attention as a bottleneck to scaling Ethereum. Even without increasing the gas limit, Ethereum's current history retention practices will force many nodes to upgrade their hardware within a few years. Fortunately, this is not a difficult problem to solve. There is already a clear solution in EIP-4444. We believe that implementation of this EIP should be accelerated to make room for future gas limit increases.