put() method with the BLOB hash and a fee in ETH. The fee will be gradually distributed to storage providers upon submitting a valid proof of storage of off-chain BLOBs over time. The EthStorage testnet is running on Ethereum Sepolia testnet with multiple community participants successfully proving their local storage.
Acknowledgement: Many thanks to Piper Merriam from EF, Karthik Raju from Polychain, and Qiang from EthStorage for providing feedback on the article.
On Oct 22, 2023, Péter Szilágyi, the renowned Go-Ethereum (Geth) dev lead, expressed his deep concerns on Twitter. He pointed out that while Geth clients preserve all historical data, other Ethereum clients like Nethermind and Besu can be configured to operate without certain historical Ethereum data, such as historical block bodies and headers. This makes clients behave inconsistently and is unfair to Geth. His comments sparked intense discussion and debate about the Ethereum storage problem within the Ethereum roadmap.
Why do Nethermind and Besu opt to cease storing historical data? What issues underlie this decision? From my perspective, there are two primary root causes:
The first reason stems from the escalating storage demands of running an Ethereum client. To delve into the specific requirements, the following pie chart illustrates the distribution of storage costs for a fresh Geth node, as of block 18,779,761 on December 13, 2023.
As the picture shows:
The second reason is the absence of in-protocol incentives or penalties for storing historical blocks. While the protocol requires nodes to store all historical data, it provides no mechanism to encourage storage or penalize non-compliance. Storing and sharing historical data becomes purely altruistic, and a node is free to prune all historical data without facing any adverse consequences. In contrast, validators must maintain the latest full state to avoid proposing or voting for an invalid block, which would cost them their rewards in either case.
Consequently, when the storage cost becomes a substantial burden for a node, it’s not surprising that some node operators choose to prune historical data. Opting to run without historical data can result in significant storage cost savings, reducing it from approximately 1TB to around 300GB.
Illustration: The Nethermind configuration to run a node without historical block bodies, saving ~460GB of storage at the time of writing.
The challenge of storage is expected to intensify with the upcoming Ethereum Data Availability (DA) upgrade. The path towards fully scaling Ethereum DA commences with EIP-4844 in DenCun, introducing a fixed-sized Binary large object (BLOB) accompanied by an independent fee model known as blobGasPrice. Each BLOB is set at 128KB, and EIP-4844 permits each block to contain up to 6 BLOBs. To enhance data scalability, the plan involves implementing 1D Reed-Solomon code, allowing for 32 BLOBs per block initially and eventually reaching 256 BLOBs per block at full scaling.
With the Ethereum DA operating at full data capacity with 256 BLOBs per block, one year of Ethereum DA network is projected to accept approximately 80 TB of data, surpassing the storage capacities of most Ethereum nodes.
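The 80 TB figure can be checked with a back-of-the-envelope calculation. The sketch below uses the 128KB BLOB size and the per-block BLOB counts given above; the 12-second block interval and the premise that every block is completely full of BLOBs are assumptions added for illustration, so the result is an upper bound rather than a forecast.

```python
# Rough upper bound on BLOB data accepted per year.
# From the article: 128KB per BLOB; 6, 32, or 256 BLOBs per block.
# Assumptions (not from the article): one block every 12 seconds,
# every block full, 365-day year.

BLOB_SIZE_BYTES = 128 * 1024
SLOT_SECONDS = 12
SECONDS_PER_YEAR = 365 * 24 * 60 * 60
TIB = 1024 ** 4


def yearly_blob_bytes(blobs_per_block: int) -> int:
    """Bytes of BLOB data per year if every block carries blobs_per_block BLOBs."""
    blocks_per_year = SECONDS_PER_YEAR // SLOT_SECONDS
    return blobs_per_block * BLOB_SIZE_BYTES * blocks_per_year


if __name__ == "__main__":
    for blobs in (6, 32, 256):
        print(f"{blobs:>3} BLOBs/block: {yearly_blob_bytes(blobs) / TIB:.1f} TiB/year")
```

Under these assumptions, the 256-BLOB case comes out at roughly 80 TiB per year, in line with the estimate above, while the initial 6-BLOB limit of EIP-4844 stays below 2 TiB per year.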
Illustration: Vitalik’s tweet on the Ethereum roadmap, in which the Purge mainly deals with storage.
The escalating storage costs have garnered attention from researchers within the Ethereum ecosystem. To address this and ensure alignment across all clients, several proposals are in development to explicitly prune storage. The two main proposals are:
What is the consequence of pruning historical data from all clients? The main one is that a fresh node cannot synchronize to the latest state via “full sync” - a synchronization to replay the transactions from the genesis block to the latest block. Instead, we have to resort to a “snap sync” or “state sync” to synchronize the latest state from Ethereum peers. This approach is already implemented in Geth and runs as the default sync.
The same consequence applies to all L2s: a fresh L2 node can no longer derive the latest L2 state by replaying L2 blocks from the L2 genesis using data stored on Ethereum. Further, since the L1 nodes do not maintain the L2 state, a “snap sync” approach for L2 cannot obtain the latest L2 state from L1 either, which breaks an important L2 assumption of inheriting Ethereum's security guarantees. The likely workaround is to rely on third-party services such as Infura, Etherscan, or the L2 projects themselves to store a copy of historical L2 data or state, which is centralized and backed only by indirect, out-of-protocol incentives.
The core questions we are asking are
The Ethereum Portal network serves as a lightweight, decentralized access network to the Ethereum protocol. It offers the Ethereum JSON-RPC interface, including methods such as eth_call and eth_getBlockByNumber, and translates JSON-RPC requests into P2P requests to a distributed hash table, similar to the IPFS network. Unlike IPFS, which permits the storage of any data type and is susceptible to spam, the Portal P2P network exclusively hosts Ethereum data, such as historical headers and bodies. This is achieved through a built-in light-client verification technique within the Portal network.
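To make the JSON-RPC side concrete, here is a minimal sketch of the request body a caller would send for eth_getBlockByNumber. It only builds the JSON payload according to the standard Ethereum JSON-RPC format; it does not contact any node, and the helper name build_request is hypothetical. How a Portal client maps such a request onto DHT lookups is internal to that client and is not shown here.

```python
import json


def build_request(method: str, params: list, request_id: int = 1) -> str:
    """Serialize a JSON-RPC 2.0 request body (hypothetical helper)."""
    return json.dumps(
        {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    )


# eth_getBlockByNumber takes a hex-encoded block number (or a tag such as
# "latest") and a boolean selecting full transaction objects versus hashes.
# Block 18,779,761 is the block number mentioned earlier in the article.
payload = build_request("eth_getBlockByNumber", [hex(18_779_761), False])

if __name__ == "__main__":
    print(payload)
```

The same payload works against any endpoint that implements the Ethereum JSON-RPC interface, which is the point of the Portal design: the caller does not need to know whether a full node or a lightweight P2P client answers it.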
A significant feature of the Portal network is its design for lightweight operation and compatibility with resource-constrained devices. It can run on top of a node with a few megabytes of storage and low memory, promoting decentralization. Even a cellphone or a Raspberry Pi device can potentially join the network and contribute to the availability of Ethereum data.
The development of the Portal network aligns with the Ethereum client diversity philosophy, with clients written in Rust, JavaScript, Nim, and Go. The beacon network and history network are ready for use, while the state network is actively under development. Notably, the Portal network does not provide direct incentives for data storage—all nodes in the network operate altruistically.
Illustration: Running a Portal network client (Trin) with a 100MB storage limit.
The EthStorage network is a decentralized, incentivized storage network specifically designed to store EIP-4844 BLOBs, supported by a grant from the Ethereum Foundation's Ecosystem Support Program (ESP).
From a blockchain modularity perspective, EthStorage functions as an Ethereum Layer 2, but collects storage fees instead of transaction fees. By indexing BLOB hashes on-chain, EthStorage acts as a modular storage layer for Ethereum with significant storage scalability and cost savings, targeting about 1000x.
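The article does not spell out which hash is indexed on-chain. Assuming it is the versioned hash that EIP-4844 defines for each BLOB, the sketch below shows how that 32-byte value is derived from a 48-byte KZG commitment: the SHA-256 digest of the commitment with its first byte replaced by the version byte 0x01. The commitment used here is an all-zero placeholder for illustration only, not a valid KZG commitment.

```python
import hashlib

# Version byte for KZG-based versioned hashes, as defined in EIP-4844.
VERSIONED_HASH_VERSION_KZG = b"\x01"


def kzg_to_versioned_hash(commitment: bytes) -> bytes:
    """Derive the 32-byte versioned hash from a 48-byte KZG commitment."""
    if len(commitment) != 48:
        raise ValueError("a KZG commitment is 48 bytes long")
    return VERSIONED_HASH_VERSION_KZG + hashlib.sha256(commitment).digest()[1:]


# Placeholder bytes only (assumption for illustration); a real commitment
# would be computed from the BLOB contents by a KZG library.
placeholder_commitment = bytes(48)
blob_hash = kzg_to_versioned_hash(placeholder_commitment)

if __name__ == "__main__":
    print(blob_hash.hex())
```

Because the hash is only 32 bytes while the BLOB itself is 128KB, keeping just the hash on-chain and the data off-chain is what allows the storage cost on L1 to stay small.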
In terms of development, EthStorage is already integrated with EIP-4844 on the Ethereum Sepolia testnet. A stress test on EthStorage and the Sepolia testnet has been conducted, writing hundreds of gigabytes of BLOBs to EthStorage. More than 50 community participants joined the network and successfully proved their local storage.
The EthStorage network’s primary advantage lies in providing a decentralized, direct incentive on top of Ethereum, which is a pioneering feature to the best of our knowledge. However, a limitation of the network is that it is specifically tailored for fixed-size BLOBs.
The dashboard of EthStorage on Ethereum Devnet
Ethereum storage, though less spotlighted, holds significant importance within the Ethereum ecosystem. As the Ethereum network is experiencing rapid growth, the storage and accessibility of Ethereum data emerge as critical challenges. While the Portal network and EthStorage network are in their early stages, we envision several intriguing directions for the long term:
We hope that these efforts will collectively contribute to the Ethereum roadmap, laying the groundwork for future decentralized storage solutions within the Ethereum ecosystem.
This article is reproduced from [tech flow deep tide]. The original title is “Ethereum Storage Roadmap: Challenges and Opportunities”, and the copyright belongs to the original author [EthStorage]. If you have any objection to the reprint, please contact the Gate Learn Team, and the team will handle it as soon as possible according to relevant procedures.
Disclaimer: The views and opinions expressed in this article represent only the author’s personal views and do not constitute any investment advice.
Other language versions of the article are translated by the Gate Learn team. Unless Gate.io is credited, the translated article may not be reproduced, distributed, or plagiarized.