As the Ethereum network continues to evolve and mature, the concept of different types of nodes becomes increasingly important to understand. However, the reality is that most users are not willing to put in the effort to run a node, despite the hardware requirements being achievable for many. In the “endgame” of Ethereum’s development, it’s crucial that users can verify state integrity and data availability without requiring extensive technical knowledge or resources. A blockchain without verifiability is, after all, just an inefficient database.
In this article, we will go through the three key types of nodes that will shape the future of the Ethereum network: stateless nodes, stateful nodes, and full/archive nodes. We will examine how stateless nodes can enable trustless verification of new blocks using zero-knowledge proofs, how stateful nodes can provide quick and trustless access to the current state of Ethereum, and how full/archive nodes can store the entire chain history back to genesis. By understanding the roles and trade-offs of each node type, we can work towards a more decentralized, secure, and scalable Ethereum ecosystem.
As we already see today, most users are not willing to put much effort into running any type of node, even though for both Bitcoin and Ethereum the hardware requirements are within reach of most heavy users of those chains. A “heavy user” here means someone with a sizable amount of assets on the chain: think of any user for whom the cost of running a node is not the blocker.
The main reason is probably a combination of factors: the vast majority of users do not care to do so, are not willing to spend a few hundred dollars on the hardware, or do not have the technical knowledge to run it. Even though both Bitcoin and Ethereum have made great strides in making it easier, it is still a fairly complex task for a non-technical user.
A Vision for a Stateless Ethereum
I am of the opinion that in the “Endgame” of every blockchain, users will have to verify state integrity and data availability without necessarily even knowing what either of those things is. The good news is that this vision is totally achievable with enough engineering (zero-knowledge technology and a little bit of data availability sampling).
In this endgame, basically all wallets worth using will include a stateless node. For every new block added to the chain, that node can query any full node on the p2p layer for the latest block header and a zk-proof that the state changes from the previous block header were executed correctly, request some random data samples from a few peers to get close to 100% confidence that all the data (blobs and execution block data) has been published, and fetch a zk-proof that the network has come to consensus and finalized the block.
The bandwidth and computation needed to do this are very small, and it can totally be done on a phone (or even a smartwatch, as @drakefjustin loves to mention). This type of node would be classified as a “stateless” node, since it can verify new blocks without holding the current state locally, relying instead on different types of proofs.
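To put a rough number on the “close to 100% confidence” that random sampling gives, here is a minimal sketch. It assumes the block data is erasure coded so that an unrecoverable block has at most half of its chunks available, and that samples are independent and uniformly random. The 0.5 threshold and both function names are illustrative assumptions, not taken from any client or specification.

```python
def catch_probability(num_samples: int, max_available_fraction: float = 0.5) -> float:
    """Chance that at least one of `num_samples` samples fails
    when the block data is NOT recoverable.

    Assumption: an unrecoverable block has at most `max_available_fraction`
    of its chunks available, and samples are independent and uniform.
    """
    if num_samples < 0:
        raise ValueError("num_samples must be non-negative")
    if not 0.0 <= max_available_fraction < 1.0:
        raise ValueError("max_available_fraction must be in [0, 1)")
    # All samples succeed with probability at most fraction ** num_samples.
    return 1.0 - max_available_fraction ** num_samples


def samples_needed(target: float, max_available_fraction: float = 0.5) -> int:
    """Smallest number of samples whose catch probability reaches `target`."""
    if not 0.0 <= target < 1.0:
        raise ValueError("target must be in [0, 1)")
    n = 0
    while catch_probability(n, max_available_fraction) < target:
        n += 1
    return n


if __name__ == "__main__":
    for k in (1, 10, 20, 30):
        print(k, catch_probability(k))
    print("samples for 99.9999%:", samples_needed(0.999999))
```

Under these assumptions, 20 successful samples already leave less than a one-in-a-million chance that withheld data went unnoticed, which is why the check fits comfortably on a phone.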
These proofs do not have to be zk-proofs. We will have stateless validation of execution well before we can do what I described above with zk-proofs for execution. In fact, stateless execution can be done today, but it is VERY inefficient with the current Merkle-Patricia-Tree structure: the witness proofs are way too big to be practical (see @peter_szilagyi’s tweet).
See the “witness” size here. This is the main issue stateless execution runs into with the current Merkle-Patricia-Tree: many of the blocks in this screenshot are well under 100 kB, and the proof required for stateless verification is often more than 50 times larger than the block itself.
Ethereum’s MPT structure
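One way to see why witnesses for a 16-ary tree get so large: in a plain Merkle tree with branching factor k, proving a single leaf requires the k − 1 sibling hashes at every level of the path. The sketch below is a back-of-the-envelope calculation only. It assumes a perfectly balanced tree, 32-byte hashes, no node sharing between proofs and a hypothetical leaf count, and it ignores the real trie's RLP encoding, extension nodes and contract code.

```python
HASH_SIZE = 32  # bytes per hash (assumption: 256-bit hash function)


def tree_depth(num_leaves: int, branching: int) -> int:
    """Number of levels needed for a balanced tree to hold `num_leaves`."""
    if num_leaves < 1 or branching < 2:
        raise ValueError("need num_leaves >= 1 and branching >= 2")
    depth, capacity = 0, 1
    while capacity < num_leaves:
        capacity *= branching
        depth += 1
    return depth


def merkle_proof_bytes(num_leaves: int, branching: int) -> int:
    """Size of a single-leaf proof: (branching - 1) sibling hashes per level."""
    return tree_depth(num_leaves, branching) * (branching - 1) * HASH_SIZE


if __name__ == "__main__":
    leaves = 2 ** 28  # hypothetical number of state entries
    for k in (16, 2):
        print(f"branching {k:2d}: depth {tree_depth(leaves, k):2d}, "
              f"proof {merkle_proof_bytes(leaves, k)} bytes")
```

With these numbers the 16-ary tree is shallower (7 levels instead of 28), but each level costs 15 hashes instead of 1, so the proof per accessed leaf is several times larger. A block touches many leaves, and the witness grows accordingly.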
However, Ethereum will upgrade its state tree structure to something other than the current Merkle-Patricia-Tree in the future. Many of you may have heard about Verkle trees, which have been on the roadmap for years (if not, read our article, Verkle Trees For The Rest Of Us: Part 1). They would make practical stateless clients possible, since the Verkle tree structure allows for very small witnesses/proofs.
Merkle tree vs. Verkle trees
One major problem with Verkle trees is that they are not quantum secure. This means they will at best be a temporary solution until a permanent solution for the state tree structure is mature and/or efficient enough. The endgame solution will probably be a STARK-proven binary hash tree, and it is very possible that Verkle trees are skipped in favor of some flavor of it (relevant meme from @VitalikButerin).
One very interesting option for a stateless node is to not be entirely stateless. For example, it would be possible to locally store the state that you find relevant for your use case (assuming your client supports such a feature).
Say your holdings are spread across a few addresses, assets and DeFi protocols. In that case you could keep the state of everything relevant to your use case written to disk locally while using only a trivially small amount of disk space. Even keeping track of the entire state of multiple large DeFi protocols would take only a few gigabytes, and considering that basically all newer phones ship with 128 GB or more of storage, it is not only possible but potentially practical for a user to keep all the state they find useful on the flash storage of their mobile phone.
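As a rough illustration of such a “mostly stateless” client, the sketch below keeps only the state entries a user has chosen to watch and discards everything else from each block's state changes. The class name, the flat key-to-value state model and the idea that the diff arrives already verified against the block's proof are all assumptions made for the example, not features of an existing client.

```python
from dataclasses import dataclass, field


@dataclass
class PartialStateStore:
    """Keeps state only for keys the user cares about (hypothetical design)."""

    watched: set[str]
    state: dict[str, int] = field(default_factory=dict)

    def apply_block_diff(self, diff: dict[str, int]) -> int:
        """Store the watched entries from an already verified state diff.

        Returns how many entries were kept; everything else is dropped.
        """
        kept = 0
        for key, value in diff.items():
            if key in self.watched:
                self.state[key] = value
                kept += 1
        return kept


if __name__ == "__main__":
    store = PartialStateStore(watched={"0xalice", "0xpool"})
    # Assumed input: state changes of one block, keyed by address.
    store.apply_block_diff({"0xalice": 5, "0xbob": 7, "0xpool": 1_000})
    print(store.state)
```

The disk footprint scales with the size of the watch list rather than with the full state, which is what makes the phone scenario plausible.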
(Quick note on light clients: in a world where stateless clients can efficiently verify state transitions and consensus, I feel like there won’t really be a use case for traditional light clients that rely on an honest-majority assumption.)
Stateful nodes hold only the current and very recent state; they prune everything older than a certain age (see the EIP-4444 proposal). The current state is required to build blocks locally, and local block building is something stateless nodes are incapable of doing.
Stateful nodes should not be confused with “full” nodes: a stateful node will not hold the complete chain history, because that will get really data intensive in the future. A stateful node is useful for any user who wants quick and trustless access to the current state of Ethereum, whether for querying data from the state, building blocks or staking.
Preserving the ability to run stateful nodes on consumer hardware is a very important goal that I think we in the Ethereum community must hold on to, even when stateless nodes are very light and mature. One of the main reasons is that all stateless nodes rely on stateful nodes to create the witnesses needed for stateless validation of new blocks.
Access to the current state is also required to know whether a transaction in the mempool is valid. It is therefore very important that we have a very decentralized set of stateful nodes on the network that can ensure very strong censorship-resistance guarantees with some form of inclusion list design.
The good news is that state expiry can make it significantly easier to run a stateful node, since state that no one has interacted with in a while can be pruned from the node’s disk. Anyone who wants to interact with expired state will have to bring a witness (essentially a Merkle proof) to bring it back into the current state. Anyone with access to the chain history can construct these proofs in a trustless way. As of this writing the Ethereum state is closing in on 300 GB, and until some form of state expiry is implemented the state size will continue to grow in a more or less up-only trend.
(Here is a great article from @paradigm that goes deeper into the topic of state growth and state expiry.)
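The witness-based revival described above can be sketched with an ordinary binary Merkle tree. This is a toy model under stated assumptions: SHA-256 as the hash, a power-of-two number of leaves, and a node that keeps only the root of the expired state. The names `ExpiringState` and `revive` are hypothetical, and actual state expiry proposals for Ethereum differ in many details.

```python
import hashlib

Proof = list[tuple[bytes, bool]]  # (sibling hash, sibling is on the left)


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root_and_proof(leaves: list[bytes], index: int) -> tuple[bytes, Proof]:
    """Build the tree over `leaves` and return its root plus a proof for leaves[index]."""
    n = len(leaves)
    if n == 0 or n & (n - 1) != 0:
        raise ValueError("number of leaves must be a power of two")
    level = [h(leaf) for leaf in leaves]
    proof: Proof = []
    while len(level) > 1:
        sibling = index ^ 1  # neighbour within the same pair
        proof.append((level[sibling], sibling < index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify_proof(root: bytes, leaf: bytes, proof: Proof) -> bool:
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


class ExpiringState:
    """Node that pruned old state and kept only its Merkle root."""

    def __init__(self, expired_root: bytes):
        self.expired_root = expired_root
        self.active: set[bytes] = set()

    def revive(self, leaf: bytes, proof: Proof) -> bool:
        """Bring an expired entry back only if its witness checks out."""
        if not verify_proof(self.expired_root, leaf, proof):
            return False
        self.active.add(leaf)
        return True


if __name__ == "__main__":
    # Whoever holds the history rebuilds the tree and produces the witness.
    history = [b"acct0", b"acct1", b"acct2", b"acct3"]
    root, witness = merkle_root_and_proof(history, 2)
    node = ExpiringState(root)
    print(node.revive(b"acct2", witness))   # valid witness
    print(node.revive(b"forged", witness))  # rejected
```

The node never needs the pruned data itself: the 32-byte root is enough to check any witness, and the party holding the chain history does not need to be trusted.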
For the purposes of this article I am going to group full and archive nodes together, since a normal full node can, from the information it has written to disk locally, compute all the data that an archive node has written to disk. The difference is that a full node prunes state that is no longer the latest/recent state. You cannot, for example, query “what was the ETH balance of account X at block Y around 5 years ago” from a normal full node, while an archive node would answer that query in a millisecond.
Easy Guide on Ethereum Full Node Vs Archive Node by @0xZeeve
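The difference between the two can be sketched with a toy chain in which blocks contain only plain transfers. Everything here is a simplifying assumption (no gas, no contracts, hypothetical function names): a node holding only the history has to replay every block up to Y to answer the balance query, while an archive-style node stores a snapshot per block and simply looks the answer up.

```python
from collections import defaultdict

Transfer = tuple[str, str, int]  # (sender, receiver, amount)
Block = list[Transfer]


def replay_balance(history: list[Block], genesis: dict[str, int],
                   account: str, block_number: int) -> int:
    """Recompute a historical balance by re-executing blocks 0..block_number."""
    balances: defaultdict[str, int] = defaultdict(int, genesis)
    for block in history[: block_number + 1]:
        for sender, receiver, amount in block:
            balances[sender] -= amount
            balances[receiver] += amount
    return balances[account]


def build_archive(history: list[Block], genesis: dict[str, int]) -> list[dict[str, int]]:
    """Store the full balance table after every block, trading disk for speed."""
    balances: defaultdict[str, int] = defaultdict(int, genesis)
    snapshots = []
    for block in history:
        for sender, receiver, amount in block:
            balances[sender] -= amount
            balances[receiver] += amount
        snapshots.append(dict(balances))
    return snapshots


if __name__ == "__main__":
    genesis = {"alice": 100}
    history = [[("alice", "bob", 30)], [("bob", "carol", 10)], [("alice", "carol", 20)]]
    archive = build_archive(history, genesis)
    print(replay_balance(history, genesis, "bob", 1))  # replayed from history
    print(archive[1]["bob"])                           # direct lookup
```

Both paths give the same answer. The replay costs time proportional to the length of the chain, and the archive costs disk proportional to the number of snapshots, which is the trade-off between the two node types.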
That said, it is theoretically possible to compute the answer to this query from the data a full node has written to disk (the entire chain history), but not many execution clients support this feature. I think it is unreasonable to expect many users, even sophisticated ones, to run a full/archive node in, say, 10 years. For this to be a reasonable option we would have to constrain L1 throughput to levels that are completely unreasonable when we can get way more throughput on L1 with minimal tradeoffs. When most users can trivially verify new blocks with a zk-proof, I think it is a tradeoff worth pursuing, since the benefits are so large.
Perhaps we can get execution clients that are able to run efficiently on HDDs and make it practical to store even hundreds of terabytes of archived state relatively cheaply. That would allow users who, for whatever reason, want to archive all of Ethereum to do so. I know that one of Erigon’s goals is to allow a full archive node to be run on an HDD.
In the end, the future of Ethereum will be shaped by the nodes that comprise its network. By embracing stateless nodes as the most realistic option for most users, while staying pragmatic about the value of a strong presence of stateful and full/archive nodes on the network, we can strike a balance between decentralization, security, and scalability that benefits all users.