Vitalik proposes an epoch-and-slot scheme: faster transaction confirmation times for Ethereum and a better end-user experience

One of the important attributes of a good blockchain user experience is fast transaction confirmation time. Today, Ethereum has improved significantly compared to five years ago. Thanks to EIP-1559 and the stable block times that followed the transition to PoS (The Merge), transactions sent on L1 are usually confirmed within 5-20 seconds, which is roughly comparable to paying with a credit card. However, improving the user experience further is valuable, and some applications require latencies of hundreds of milliseconds or less. This article explores some practical options Ethereum has for improving transaction confirmation time.

Overview of Existing Ideas and Technologies

Single-Slot Finality

Currently, Ethereum's Gasper consensus uses a slot-and-epoch architecture. Every 12 seconds there is a slot, in which a subset of validators votes on the head of the chain. Over 32 slots (6.4 minutes), every validator gets the opportunity to vote once. These votes are then reinterpreted as messages in a consensus algorithm loosely similar to PBFT, and after two epochs (12.8 minutes) they provide a very strong economic guarantee called finality.
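The timing figures above follow directly from the slot and epoch parameters. A minimal sketch of the arithmetic, with constant and function names chosen here purely for illustration:

```python
# Timing parameters taken from the paragraph above.
SECONDS_PER_SLOT = 12
SLOTS_PER_EPOCH = 32
EPOCHS_TO_FINALITY = 2


def epoch_minutes(seconds_per_slot: int = SECONDS_PER_SLOT,
                  slots_per_epoch: int = SLOTS_PER_EPOCH) -> float:
    """Length of one epoch in minutes."""
    return seconds_per_slot * slots_per_epoch / 60


def finality_minutes(epochs: int = EPOCHS_TO_FINALITY) -> float:
    """Approximate time to finality, counted in whole epochs."""
    return epochs * epoch_minutes()


print(epoch_minutes())     # 6.4
print(finality_minutes())  # 12.8
```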

Over the past few years, we have become increasingly dissatisfied with the current approach. There are two main reasons: first, it is very complex, with many interaction bugs between the slot-by-slot voting mechanism and the epoch-by-epoch finality mechanism; second, 12.8 minutes is too long, and no one wants to wait that long.

Single-slot finality (SSF) replaces this architecture with a mechanism similar to Tendermint consensus, in which block N is finalized before block N+1 is produced. The main difference from Tendermint is that we retain the 'inactivity leak' mechanism, which allows the chain to keep running and recover when more than 1/3 of the validators are offline.

The main challenge of single-slot finality is that, implemented naively, every Ethereum staker would need to publish two messages every 12 seconds, which is a heavy load for the chain. There are some clever ideas for mitigating this, including the recent Orbit SSF proposal. However, although SSF significantly speeds up finality and thereby improves the user experience, it does not change the fact that users still need to wait 5-20 seconds for a confirmation.
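To get a feel for the load, the sketch below compares message rates under the two schemes. The validator count is a round hypothetical figure, not a measured one, and the model ignores signature aggregation and committee sampling, both of which reduce the real load considerably.

```python
def messages_per_second(validators: int, messages_per_validator: int,
                        window_seconds: int) -> float:
    """Average network-wide message rate over one voting window."""
    return validators * messages_per_validator / window_seconds


VALIDATORS = 1_000_000  # hypothetical round number, for illustration only

# Gasper: each validator votes once per epoch (32 slots of 12 seconds).
gasper_rate = messages_per_second(VALIDATORS, 1, 32 * 12)

# Naive SSF: each validator publishes two messages per 12-second slot.
naive_ssf_rate = messages_per_second(VALIDATORS, 2, 12)

print(round(gasper_rate))                   # 2604
print(round(naive_ssf_rate))                # 166667
print(round(naive_ssf_rate / gasper_rate))  # 64
```

Under these assumptions the naive design carries roughly 64 times as many messages per second, which is why proposals such as Orbit SSF look for ways to have fewer validators sign in each slot.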

Rollup Pre-confirmation

Over the past few years, Ethereum has been following a roadmap centered around rollups, designing the Ethereum base layer (L1) to support data availability and other features, which can then be used by L2 protocols (such as rollups, validiums, and plasmas) to provide users with the same level of security as Ethereum on a larger scale.

This has led to a separation of concerns within the Ethereum ecosystem: Ethereum L1 focuses on censorship resistance, reliability, stability, and maintaining and improving a core set of base-layer functionality, while L2s focus on reaching users more directly through different cultures and technologies. However, if we continue down this path, an issue inevitably arises: L2s want to give users confirmations much faster than 5-20 seconds.

So far, at least in theory, it is the responsibility of L2 to create its own 'decentralized sequencer' network. A small group of validators may sign blocks every few hundred milliseconds and stake their assets behind these blocks. Eventually, the headers of these L2 blocks will be published to L1.

But the L2 validator set can cheat: its members can sign block B1, then later sign a conflicting block B2 and get B2 committed on-chain before B1. If they do, however, they will be caught and lose their stake. In practice, we have seen centralized versions of this, but rollups have been slow to develop decentralized sequencing networks. You could argue that it is unfair to require every L2 to run decentralized sequencing: we are asking rollups to do almost the same work as creating a brand-new L1. As a result, Justin Drake has been promoting an approach that would allow all L2s (and L1) to share a single Ethereum-wide pre-confirmation mechanism: based pre-confirmations.
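The cheating scenario above is an equivocation: the same signer endorses two different blocks at the same height. Below is a minimal sketch of how such an offence could be detected and penalized. The `SignedHeader` type, the function names, and the stake figures are hypothetical, and real systems would verify cryptographic signatures rather than trust a `signer` field.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SignedHeader:
    signer: str      # hypothetical validator identifier
    height: int      # L2 block height the header claims
    block_hash: str  # hash of the block being endorsed


def find_equivocators(headers):
    """Return signers who endorsed two different blocks at one height."""
    seen = defaultdict(set)  # (signer, height) -> set of block hashes
    for header in headers:
        seen[(header.signer, header.height)].add(header.block_hash)
    return {signer for (signer, _), hashes in seen.items() if len(hashes) > 1}


def slash(stakes, offenders):
    """Zero out the stake of every offender (full slashing is an assumption)."""
    return {v: (0 if v in offenders else s) for v, s in stakes.items()}


headers = [
    SignedHeader("alice", 7, "B1"),
    SignedHeader("bob", 7, "B1"),
    SignedHeader("alice", 7, "B2"),  # conflicts with alice's earlier signature
]
offenders = find_equivocators(headers)
print(sorted(offenders))                          # ['alice']
print(slash({"alice": 32, "bob": 32}, offenders)) # {'alice': 0, 'bob': 32}
```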

Based Pre-confirmation

The based pre-confirmation approach assumes that Ethereum proposers will become highly sophisticated actors for MEV-related reasons. It takes advantage of this sophistication by giving these proposers an incentive to take on the responsibility of providing pre-confirmation as a service.

The basic idea is to create a standardized protocol through which a user can offer an additional fee in exchange for an immediate guarantee that their transaction will be included in the next block, possibly together with a claim about the result of executing it. If the proposer violates any commitment it has made to any user, it can be slashed.
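The commitment and its slashing condition can be sketched as follows. This is an illustrative model only: the `Preconfirmation` and `Block` types, their fields, and `is_slashable` are hypothetical and do not correspond to any deployed protocol, and the sketch does not model the case where the proposer misses its slot entirely.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Preconfirmation:
    proposer: str        # who made the promise
    tx_hash: str         # transaction the promise covers
    target_slot: int     # slot in which inclusion is promised
    claimed_result: str  # promised outcome of executing the transaction
    fee_gwei: int        # extra fee the user pays for the promise


@dataclass
class Block:
    slot: int
    proposer: str
    # Maps each included transaction hash to its execution result.
    results: dict = field(default_factory=dict)


def is_slashable(promise: Preconfirmation, block: Block) -> bool:
    """True if `block` proves that the proposer broke `promise`."""
    if block.slot != promise.target_slot or block.proposer != promise.proposer:
        return False  # this block says nothing about the promise
    actual = block.results.get(promise.tx_hash)  # None if not included
    return actual != promise.claimed_result


promise = Preconfirmation("proposer-1", "0xabc", 100, "success", 5)
honest = Block(100, "proposer-1", {"0xabc": "success"})
omitted = Block(100, "proposer-1", {})
print(is_slashable(promise, honest))   # False
print(is_slashable(promise, omitted))  # True
```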

As described, based pre-confirmations provide guarantees for L1 transactions. If rollups are "based", then all L2 blocks are L1 transactions, so the same mechanism can be used to provide pre-confirmations for any L2.

What are we actually looking at?

Suppose we implement single-slot finality. We use Orbit-like techniques to reduce the number of validators signing in each slot, but not by too much, so that we can also make progress on reducing the 32 ETH minimum stake. The slot time may increase as a result, perhaps to 16 seconds. We then use either rollup pre-confirmations or based pre-confirmations to give users faster confirmations. What do we end up with? An epoch-and-slot architecture.

There is a deep philosophical reason why epoch-and-slot architectures seem so hard to avoid: it inherently takes less time to reach rough agreement on something than to reach maximally firm 'economic finality' agreement on it.

One simple reason is the number of nodes. Although the old linear trade-off between decentralization, finality time, and overhead now looks milder thanks to hyper-optimized BLS aggregation and upcoming ZK-STARKs, two things remain fundamentally true:

'Approximate consensus' requires only a few nodes, while economic finality requires the majority of nodes.
Once the number of nodes exceeds a certain threshold, you need to spend more time collecting signatures.

In today's Ethereum, the 12-second slot is divided into three sub-slots: block publication and distribution, attestation, and attestation aggregation. If the number of validators were significantly reduced, we could cut this down to two sub-slots and use an 8-second slot time. Another, more practical and larger factor is the 'quality' of the nodes. If we can also rely on a specialized subset of nodes to reach approximate consensus (while still using the full validator set to determine finality), we can bring the slot time down to about 2 seconds.
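The slot-time figures in this paragraph can be checked with a few lines of arithmetic. The sketch assumes that every sub-slot has the same length, which is a simplification, and the per-phase budget computed for a 2-second slot further assumes that such a slot would still have two phases; neither the function names nor that second assumption come from the text.

```python
def slot_seconds(phases: int, seconds_per_phase: float) -> float:
    """Slot length when every phase (sub-slot) takes the same time."""
    return phases * seconds_per_phase


def phase_budget(target_slot_seconds: float, phases: int) -> float:
    """Time available per phase for a given target slot length."""
    return target_slot_seconds / phases


# Today: 12-second slot split into three sub-slots.
current_phase = phase_budget(12, 3)
print(current_phase)                   # 4.0

# Dropping one sub-slot at the same per-phase length gives 8 seconds.
print(slot_seconds(2, current_phase))  # 8.0

# Assumption: a ~2-second slot with two phases leaves ~1 second per phase.
print(phase_budget(2, 2))              # 1.0
```

A budget of about one second per phase is tight for a globally distributed network, which is consistent with the text's point that only a specialized, high-quality subset of nodes could sustain it.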

Therefore, in my opinion, epoch-and-slot architectures are obviously correct, but not all epoch-and-slot architectures are created equal, and it is valuable to explore the design space more fully. A direction worth further research is designs that are not tightly interwoven like Gasper, and that instead have a stronger separation of concerns between the two mechanisms.

How should L2 be done?

In my opinion, L2 currently has three reasonable strategies:

1. Be 'based', both technically and in spirit. That is, optimize for the technical properties and values of the Ethereum base layer (high decentralization, censorship resistance, and so on). In their simplest form, you can think of these rollups as 'branded shards', but they can also be much more ambitious, experimenting heavily with new virtual machine designs and other technical improvements.
2. Be a server with blockchain scaffolding, and make the most of it. If you start from a server and then add STARK validity proofs to ensure that the server follows the rules, guaranteed rights for users to exit or force transactions, and possibly freedom of collective choice, either through coordinated mass exit or through the ability to vote to change the sequencer, then you have gained most of the benefits of being on-chain while retaining most of the efficiency of a server.
3. The compromise: a fast chain with a hundred nodes, with Ethereum providing additional interoperability and security. This is the de facto current roadmap of many L2 projects.

For some applications (such as ENS, keystores, and certain payment protocols), a 12-second block time is sufficient. For applications where it is not, the only solution is an epoch-and-slot architecture. In all three cases, the 'epoch' is Ethereum's SSF, but the 'slot' is something different in each:

1. An Ethereum-native epoch-and-slot architecture
2. Server pre-confirmation
3. Committee pre-confirmation

A key question is: how good can we make category 1? In particular, if it turns out to be very good, then category 3 stops making as much sense. Category 2 will always exist, if only because no 'based' scheme works for off-chain-data L2s such as plasmas and validiums. But if an Ethereum-native epoch-and-slot architecture can get down to 1-second slot times, then the room left for category 3 becomes much smaller.

Today, we are still far from final answers to these questions. One key question, how sophisticated block proposers will become, remains an area of considerable uncertainty. Designs like Orbit SSF are very new, so the space of epoch-and-slot designs in which something like Orbit SSF serves as the epoch is still well worth exploring. The more options we have, the better we can do for L1 and L2 users, and the easier we can make things for L2 developers.