Forward the Original Title‘Epochs and slots all the way down: ways to give Ethereum users faster transaction confirmation times’
One of the important properties of a good blockchain user experience is fast transaction confirmation times. Today, Ethereum has already improved a lot compared to five years ago. Thanks to the combination of EIP-1559 and steady block times after the Merge, transactions sent by users on L1 reliably confirm within 5-20 seconds. This is roughly competitive with the experience of paying with a credit card. However, there is value in improving user experience further, and there are some applications that outright require latencies on the order of hundreds of milliseconds or even less. This post will go over some of the practical options that Ethereum has.
Today, Ethereum’s Gasper consensus uses a slot and epoch architecture. Every 12-second slot, a subset of validators publish a vote on the head of the chain, and over the course of 32 slots (6.4 min), all validators get a chance to vote once. These votes are then re-interpreted as being messages in a vaguely PBFT-like consensus algorithm, which after two epochs (12.8 min) gives a very hard economic assurance called finality.
Over the last couple of years, we’ve become more and more uncomfortable with the current approach. The key reasons are that (i) it’s complicated and there are many interaction bugs between the slot-by-slot voting mechanism and the epoch-by-epoch finality mechanism, and (ii) 12.8 minutes is way too long and nobody cares to wait that long.
Single-slot finality replaces this architecture by a mechanism much more similar to Tendermint consensus, in which block N is finalized before block N+1 is made. The main deviation from Tendermint is that we keep the “inactivity leak“ mechanism, which allows the chain to keep going and recover if more than 1/3 of validators go offline.
A diagram of the leading proposed single-slot finality design_
The main challenge with SSF is that naively, it seems to imply that every single Ethereum staker would need to publish two messages every 12 seconds, which would be a lot of load for the chain to handle. There are clever ideas for how to mitigate this, including the very recent Orbit SSF proposal. But even still, while this improves UX significantly by making “finality” come faster, it doesn’t change the fact that users need to wait 5-20 seconds.
For the last few years, Ethereum has been following a rollup-centric roadmap, designing the Ethereum base layer (the “L1”) around supporting data availability and other functionalities that can then be used by layer 2 protocols like rollups (but also validiums and plasmas) that can give users the same level of security as Ethereum, but at much higher scale.
This creates a separation-of-concerns within the Ethereum ecosystem: the Ethereum L1 can focus on being censorship resistant, dependable, stable, and maintaining and improving a certain base-level core of functionality, and L2s can focus on more directly reaching out to users - both through different cultural and technological tradeoffs. But if you go down this path, one inevitable issue comes up: L2s want to serve users who want confirmations much faster than 5-20 seconds.
So far, at least in the rhetoric, it has been L2s’ responsibility to create their own “decentralized sequencing” networks. A smaller group of validators would sign off on blocks, perhaps once every few hundred milliseconds, and they would put their “stake” behind those blocks. Eventually, headers of these L2 blocks get published to L1.
L2 validator sets could cheat: they could first sign block B1, and then later sign a conflicting block B2 and commit it onto the chain before B1. But if they do this, they would get caught and lose their deposits. In practice, we have seen centralized versions of this, but rollups have been slow to develop decentralized sequencing networks. And you can argue that demanding L2s all do decentralized sequencing is an unfair deal: we’re asking rollups to basically do most of the same work as creating an entire new L1. For this reason and others, Justin Drake has been promoting a way to give all L2s (as well as L1) access to a shared Ethereum-wide preconfirmation mechanism: based preconfirmations.
The based preconfirmation approach assumes that Ethereum proposers will become highly sophisticated actors for MEV-related reasons (see here for my explanation of MEV, and see also the execution tickets line of proposals). The based preconfirmation approach takes advantage of this sophistication by incentivizing these sophisticated proposers to accept the responsibility of offering preconfirmations-as-a-service.
The basic idea is to create a standardized protocol by which a user can offer an additional fee in exchange for an immediate guarantee that the transaction will be included in the next block, along with possibly a claim about the results of executing that transaction. If the proposer violates any promise that they make to any user, they can get slashed.
As described, based preconfirmations provide guarantees to L1 transactions. If rollups are “based”, then all L2 blocks are L1 transactions, and so the same mechanism can be used to provide preconfirmations for any L2.
Suppose that we implement single slot finality. We use Orbit-like techniques to reduce the number of validators signing per slot, but not too much, so that we can also make progress on the key goal of reducing the 32 ETH staking minimum. As a result, perhaps the slot time creeps upward, to 16 sec. We then use either rollup preconfirmations, or based preconfirmations, to give users faster assurances. What do we have now? An epoch-and-slot architecture.
The “they’re the same picture” meme is getting quite overused at this point, so I’ll just put an old diagram I drew years ago to describe Gasper’s slot-and-epoch architecture and a diagram of L2 preconfirmations beside each other, and hopefully that will get the point across.
There is a deep philosophical reason why epoch-and-slot architectures seem to be so hard to avoid: it inherently takes less time to come to approximate agreement on something, than to come to maximally-hardened “economic finality” agreement on it.
One simple reason why is number of nodes. While the old linear @VitalikButerin/parametrizing-casper-the-decentralization-finality-time-overhead-tradeoff-3f2011672735">decentralization / finality time / overhead tradeoff is looking milder now due to hyper-optimized BLS aggregation and in the near future ZK-STARKs, it’s still fundamentally true that:
In Ethereum today, a 12-second slot is divided into three sub-slots, for (i) block publication and distribution, (ii) attestation, and (iii) attestation aggregation. If the attester count was much lower, we could drop to two sub-slots and have an 8-second slot time. Another, and realistically bigger, factor, is “quality” of nodes. If we could also rely on a professionalized subset of nodes to do approximate agreements (and still use the full validators set for finality), we could plausibly drop that to ~2 seconds.
Hence, it feels to me that (i) slot-and-epoch architectures are obviously correct, but also (ii) not all slot-and-epoch architectures are created equal, and there’s value in more fully exploring the design space. In particular, it’s worth exploring options that are not tightly interwoven like Gasper, and where instead there’s stronger separation of concerns between the two mechanisms.
In my view, there are three reasonable strategies for L2s to take at the moment:
For some applications, (eg. ENS, keystores), some payments), a 12-second block time is enough. For those applications that are not, the only solution is a slot-and-epoch architecture. In all three cases, the “epochs” are Ethereum’s SSF (perhaps we can retcon that acronym into meaning something other than “single slot”, eg. it could be “Secure Speedy Finality”). But the “slots” are something different in each of the above three cases:
A key question is, how good can we make something in category (1)? In particular, if it gets really good, then it feels like category (3) ceases to have as much meaning. Category (2) will always exist, at the very least because anything “based” doesn’t work for off-chain-data L2s such as plasmas and validiums. But if an Ethereum-native slot-and-epoch architecture can get down to 1-second “slot” (ie. pre-confirmation) times, then the space for category (3) becomes quite a bit smaller.
Today, we’re far from having final answers to these questions. A key question - just how sophisticated are block proposers going to become - remains an area where there is quite a bit of uncertainty. Designs like Orbit SSF are very recent, suggesting that the design space of slot-and-epoch designs where something like Orbit SSF is the epoch is still quite under-explored. The more options we have, the better we can do for users both on L1 and on L2s, and the more we can simplify the job of L2 developers.
Forward the Original Title‘Epochs and slots all the way down: ways to give Ethereum users faster transaction confirmation times’
One of the important properties of a good blockchain user experience is fast transaction confirmation times. Today, Ethereum has already improved a lot compared to five years ago. Thanks to the combination of EIP-1559 and steady block times after the Merge, transactions sent by users on L1 reliably confirm within 5-20 seconds. This is roughly competitive with the experience of paying with a credit card. However, there is value in improving user experience further, and there are some applications that outright require latencies on the order of hundreds of milliseconds or even less. This post will go over some of the practical options that Ethereum has.
Today, Ethereum’s Gasper consensus uses a slot and epoch architecture. Every 12-second slot, a subset of validators publish a vote on the head of the chain, and over the course of 32 slots (6.4 min), all validators get a chance to vote once. These votes are then re-interpreted as being messages in a vaguely PBFT-like consensus algorithm, which after two epochs (12.8 min) gives a very hard economic assurance called finality.
Over the last couple of years, we’ve become more and more uncomfortable with the current approach. The key reasons are that (i) it’s complicated and there are many interaction bugs between the slot-by-slot voting mechanism and the epoch-by-epoch finality mechanism, and (ii) 12.8 minutes is way too long and nobody cares to wait that long.
Single-slot finality replaces this architecture by a mechanism much more similar to Tendermint consensus, in which block N is finalized before block N+1 is made. The main deviation from Tendermint is that we keep the “inactivity leak“ mechanism, which allows the chain to keep going and recover if more than 1/3 of validators go offline.
A diagram of the leading proposed single-slot finality design_
The main challenge with SSF is that naively, it seems to imply that every single Ethereum staker would need to publish two messages every 12 seconds, which would be a lot of load for the chain to handle. There are clever ideas for how to mitigate this, including the very recent Orbit SSF proposal. But even still, while this improves UX significantly by making “finality” come faster, it doesn’t change the fact that users need to wait 5-20 seconds.
For the last few years, Ethereum has been following a rollup-centric roadmap, designing the Ethereum base layer (the “L1”) around supporting data availability and other functionalities that can then be used by layer 2 protocols like rollups (but also validiums and plasmas) that can give users the same level of security as Ethereum, but at much higher scale.
This creates a separation-of-concerns within the Ethereum ecosystem: the Ethereum L1 can focus on being censorship resistant, dependable, stable, and maintaining and improving a certain base-level core of functionality, and L2s can focus on more directly reaching out to users - both through different cultural and technological tradeoffs. But if you go down this path, one inevitable issue comes up: L2s want to serve users who want confirmations much faster than 5-20 seconds.
So far, at least in the rhetoric, it has been L2s’ responsibility to create their own “decentralized sequencing” networks. A smaller group of validators would sign off on blocks, perhaps once every few hundred milliseconds, and they would put their “stake” behind those blocks. Eventually, headers of these L2 blocks get published to L1.
L2 validator sets could cheat: they could first sign block B1, and then later sign a conflicting block B2 and commit it onto the chain before B1. But if they do this, they would get caught and lose their deposits. In practice, we have seen centralized versions of this, but rollups have been slow to develop decentralized sequencing networks. And you can argue that demanding L2s all do decentralized sequencing is an unfair deal: we’re asking rollups to basically do most of the same work as creating an entire new L1. For this reason and others, Justin Drake has been promoting a way to give all L2s (as well as L1) access to a shared Ethereum-wide preconfirmation mechanism: based preconfirmations.
The based preconfirmation approach assumes that Ethereum proposers will become highly sophisticated actors for MEV-related reasons (see here for my explanation of MEV, and see also the execution tickets line of proposals). The based preconfirmation approach takes advantage of this sophistication by incentivizing these sophisticated proposers to accept the responsibility of offering preconfirmations-as-a-service.
The basic idea is to create a standardized protocol by which a user can offer an additional fee in exchange for an immediate guarantee that the transaction will be included in the next block, along with possibly a claim about the results of executing that transaction. If the proposer violates any promise that they make to any user, they can get slashed.
As described, based preconfirmations provide guarantees to L1 transactions. If rollups are “based”, then all L2 blocks are L1 transactions, and so the same mechanism can be used to provide preconfirmations for any L2.
Suppose that we implement single slot finality. We use Orbit-like techniques to reduce the number of validators signing per slot, but not too much, so that we can also make progress on the key goal of reducing the 32 ETH staking minimum. As a result, perhaps the slot time creeps upward, to 16 sec. We then use either rollup preconfirmations, or based preconfirmations, to give users faster assurances. What do we have now? An epoch-and-slot architecture.
The “they’re the same picture” meme is getting quite overused at this point, so I’ll just put an old diagram I drew years ago to describe Gasper’s slot-and-epoch architecture and a diagram of L2 preconfirmations beside each other, and hopefully that will get the point across.
There is a deep philosophical reason why epoch-and-slot architectures seem to be so hard to avoid: it inherently takes less time to come to approximate agreement on something, than to come to maximally-hardened “economic finality” agreement on it.
One simple reason why is number of nodes. While the old linear @VitalikButerin/parametrizing-casper-the-decentralization-finality-time-overhead-tradeoff-3f2011672735">decentralization / finality time / overhead tradeoff is looking milder now due to hyper-optimized BLS aggregation and in the near future ZK-STARKs, it’s still fundamentally true that:
In Ethereum today, a 12-second slot is divided into three sub-slots, for (i) block publication and distribution, (ii) attestation, and (iii) attestation aggregation. If the attester count was much lower, we could drop to two sub-slots and have an 8-second slot time. Another, and realistically bigger, factor, is “quality” of nodes. If we could also rely on a professionalized subset of nodes to do approximate agreements (and still use the full validators set for finality), we could plausibly drop that to ~2 seconds.
Hence, it feels to me that (i) slot-and-epoch architectures are obviously correct, but also (ii) not all slot-and-epoch architectures are created equal, and there’s value in more fully exploring the design space. In particular, it’s worth exploring options that are not tightly interwoven like Gasper, and where instead there’s stronger separation of concerns between the two mechanisms.
In my view, there are three reasonable strategies for L2s to take at the moment:
For some applications, (eg. ENS, keystores), some payments), a 12-second block time is enough. For those applications that are not, the only solution is a slot-and-epoch architecture. In all three cases, the “epochs” are Ethereum’s SSF (perhaps we can retcon that acronym into meaning something other than “single slot”, eg. it could be “Secure Speedy Finality”). But the “slots” are something different in each of the above three cases:
A key question is, how good can we make something in category (1)? In particular, if it gets really good, then it feels like category (3) ceases to have as much meaning. Category (2) will always exist, at the very least because anything “based” doesn’t work for off-chain-data L2s such as plasmas and validiums. But if an Ethereum-native slot-and-epoch architecture can get down to 1-second “slot” (ie. pre-confirmation) times, then the space for category (3) becomes quite a bit smaller.
Today, we’re far from having final answers to these questions. A key question - just how sophisticated are block proposers going to become - remains an area where there is quite a bit of uncertainty. Designs like Orbit SSF are very recent, suggesting that the design space of slot-and-epoch designs where something like Orbit SSF is the epoch is still quite under-explored. The more options we have, the better we can do for users both on L1 and on L2s, and the more we can simplify the job of L2 developers.