How to scale app rollups

Intermediate | 1/3/2024, 7:58:07 AM
This article explores how to scale app rollups to accommodate hundreds of thousands of simultaneous participants, including by modifying the rollup execution environment. It discusses the types of applications/games each method is suitable for and the challenges each one faces.

App rollups are emerging as the clear winner in scaling a specific set of Ethereum’s applications. These applications benefit from permissionless and strong ownership guarantees but don’t require simultaneous interactions between all the application users. Fully on-chain games (FOGs) are the best example that fits this description. FOGs benefit from strong in-game asset ownership, permissionless game participation, and permissionless game modding. Still, most games don’t require that all players interact with each other at the same moment. Other applications that can benefit from app rollup scaling strategies include NFT marketplaces, perpetual exchanges, and on-chain AI inference.

App rollups are already the go-to implementation for many of these use cases. However, the standard rollup implementations, i.e., EVM rollups, still have important scalability limits. They can probably achieve throughputs of around 100 transactions per second. Such throughput can be sufficient for some on-chain games, depending on the game type. However, most games need a much higher throughput to support a large number (> 1000) of concurrent players. This article focuses on approaches to scaling app rollups to reach hundreds of thousands of concurrent participants. For each approach, I discuss the suitable types of applications/games and the challenges it faces.

Builders who are using app rollups or those who are building the infrastructure to scale app rollups are encouraged to reach out to me and apply to Alliance. I look forward to working with founders building in these areas.

Horizontal Scaling

Horizontal scalability is the simplest approach to scaling app rollups. However, the simplicity comes at the expense of composability, which makes this approach suitable only for a small set of applications such as single-player games.

Horizontal scalability means simply deploying multiple app rollups (OP or ZK) with the same smart contract deployed to all of them. The application’s front end seamlessly directs the user to one of the rollups depending on capacity, location, or specific application options. AltLayer recently demonstrated this concept by launching a scalable, fully on-chain version of 2048. In the game’s front end, the user can select which rollup to join based on their geographical location. Because of this simplicity, and because rollup-as-a-service (RaaS) providers such as Caldera handle all the infrastructure work of spinning up and managing these rollups, game developers can adopt this approach easily.
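The routing step described above can be sketched in a few lines. This is only a minimal illustration, assuming the front end knows each instance's region and current load; the `RollupInstance` fields, the instance names, and the "prefer the player's region, then the emptiest instance" policy are all hypothetical and are not taken from any specific RaaS provider.

```python
from dataclasses import dataclass


@dataclass
class RollupInstance:
    name: str                # hypothetical instance label
    region: str              # hypothetical region tag, e.g. "eu"
    capacity: int            # concurrent players the instance is provisioned for
    active_players: int = 0

    @property
    def free_slots(self) -> int:
        return self.capacity - self.active_players


def pick_rollup(instances, player_region):
    """Return the instance a new player should join, or None if all are full."""
    open_instances = [r for r in instances if r.free_slots > 0]
    if not open_instances:
        return None
    # Prefer instances in the player's region; otherwise fall back to any open one.
    local = [r for r in open_instances if r.region == player_region]
    candidates = local or open_instances
    # Among the candidates, pick the one with the most free slots.
    return max(candidates, key=lambda r: r.free_slots)


instances = [
    RollupInstance("rollup-eu-1", "eu", capacity=1000, active_players=1000),
    RollupInstance("rollup-eu-2", "eu", capacity=1000, active_players=400),
    RollupInstance("rollup-us-1", "us", capacity=1000, active_players=100),
]
print(pick_rollup(instances, "eu").name)
```

Since every instance runs the same contract, the router only has to decide where a player lands. It never has to reconcile state across instances, which is exactly why composability is lost.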

Despite the simplicity, there are a few issues with the multi-rollup scaling approach. The first is rollup network switching. Current wallets, e.g., MetaMask, require manual approval to connect to a new network, i.e., a new rollup instance. This leads to a difficult and confusing user experience for players who need to manually connect to multiple “networks” to play the same game. Fortunately, it’s possible to abstract away this complexity with account abstraction (AA) solutions such as EIP-4337 and with embedded wallets such as Privy and 0xPass.

Another challenge is managing the players’ state during a transition between rollups. In some instances, e.g., when demand drops, the application may need to consolidate multiple rollup instances into a single instance to save resources. In such a case, the state of every active player needs to be migrated to the new instance. Current bridging solutions, specifically zk bridges, can be critical in solving this issue. Using these solutions, it is possible to bridge a player’s game state to a new rollup instance while maintaining a proof of the validity of this state. However, the latency of existing bridging solutions may be too high for gaming use cases.

ZK State Channels

Another app rollup scaling approach, one that is more suitable for multiplayer games such as poker, is zk state channels. In these games, the interactions happen among a small number of players, e.g., 2–10. The gameplay among these players only matters while the game is in progress. The end result of the game, however, matters more because it affects each player’s asset balance. Hence, it’s important to store the result in a shared persistent layer.

In this case, the app rollup represents the shared information layer where game results are stored and where the game assets exist. For each game on the rollup, a ZK state channel can be initiated to serve this game. During the gameplay, each player generates transactions and creates a ZKP proving that they have followed the rules of the game. Each subsequent proof aggregates the previous proofs using proof recursion. When the game ends, the final ZKP is submitted to the app rollup to prove the validity of the gameplay and of the final result. The resulting state change is then applied to the player states on the app rollup.
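The data flow of such a channel can be sketched as follows. This is a toy model under loud assumptions: it contains no zero-knowledge proof at all. A SHA-256 hash chain stands in for recursive proof aggregation purely to show that each step folds in the previous one and that only the final state and one final artifact leave the channel. The betting rule and all function names are hypothetical.

```python
import hashlib
import json


def apply_move(state, move):
    """Toy game rule (hypothetical): a player may bet at most their balance."""
    player, bet = move["player"], move["bet"]
    if bet < 0 or bet > state["balances"][player]:
        raise ValueError("move breaks the game rules")
    balances = dict(state["balances"])
    balances[player] -= bet
    return {"balances": balances, "pot": state["pot"] + bet}


def fold_commitment(prev_commitment, state, move):
    """Stand-in for recursion: bind the previous commitment to the new step.

    A real channel would produce a ZKP that verifies the previous proof and
    the rule check; a plain hash gives neither soundness nor privacy.
    """
    payload = json.dumps(
        {"prev": prev_commitment, "state": state, "move": move}, sort_keys=True
    )
    return hashlib.sha256(payload.encode()).hexdigest()


def run_channel(initial_state, moves):
    """Play the whole game off-chain and return what would be settled."""
    state, commitment = initial_state, "genesis"
    for move in moves:
        state = apply_move(state, move)
        commitment = fold_commitment(commitment, state, move)
    return state, commitment


initial = {"balances": {"alice": 100, "bob": 100}, "pot": 0}
moves = [
    {"player": "alice", "bet": 10},
    {"player": "bob", "bet": 20},
    {"player": "alice", "bet": 30},
]
final_state, final_commitment = run_channel(initial, moves)
print(final_state)
```

However many moves are played, the rollup would only see `final_state` and one final artifact, which is the property the paragraph above relies on.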

ZK state channels move the game interactions off-chain. Hence, the in-game activities and transactions don’t count towards the throughput of the app rollup. Using this approach, app rollups can massively scale to support tens or hundreds of thousands of concurrent players. The only app rollup transactions are the verification of the generated ZKPs and the state updates, resulting in a scaling factor of 100–1000x. Multiple teams, including Ontropy, have been focusing on developing this technology.
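A back-of-the-envelope calculation shows where numbers of this size can come from. The inputs below are illustrative assumptions, not measurements: two settlement transactions per game (one proof verification, one state update), 200 to 2,000 in-game actions per game, a rollup sustaining the roughly 100 TPS mentioned earlier, and ten-minute games with six players each.

```python
def scaling_factor(actions_per_game, settlement_txs_per_game=2):
    """Off-chain actions replaced by each on-chain settlement transaction."""
    return actions_per_game / settlement_txs_per_game


def max_concurrent_players(rollup_tps, game_seconds, players_per_game,
                           settlement_txs_per_game=2):
    """Players the rollup can serve if it only processes settlements."""
    games_settled_per_second = rollup_tps / settlement_txs_per_game
    # In steady state, games in flight = settlement rate * game duration.
    concurrent_games = games_settled_per_second * game_seconds
    return int(concurrent_games * players_per_game)


print(scaling_factor(200))                    # 100.0
print(scaling_factor(2000))                   # 1000.0
print(max_concurrent_players(100, 600, 6))    # 180000
```

Under these assumptions the 100–1000x range corresponds to games with a few hundred to a few thousand actions each. The result depends heavily on how expensive on-chain proof verification is compared with an ordinary transaction, which this sketch ignores.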

A downside of this approach is that it requires players to run the game logic and generate the ZKPs on their own devices. These proofs are often lightweight, and with state-of-the-art proving systems like Halo2, proving can take less than a few seconds. However, this may still degrade the UX for players with resource-limited devices.

A modification to this approach that can alleviate this issue is assigning one of the zk state channel participants as a temporary sequencer. This sequencer will receive each player’s transactions and generate the corresponding ZKPs and share the ZKP with all the channel participants. This modification can be thought of as ephemeral ZK L3s that settle to the app rollup. The Cartridge team has been implementing this architecture by designing a specialized sequencer called Katana.

The zk state channel approach has a lot of potential. However, there are several open questions related to the execution environment inside the zk state channel and how to optimize it for proof recursion. Current zkEVM environments are not very efficient and most of them currently don’t support proof recursion. Alternatives include lightweight zkVMs or even using specialized zk circuits for player interactions if the number of actions possible for the player is limited.

Changing the Execution Environment

A third approach to app rollup scalability is to change the execution environment of the rollup. Despite the maturity and abundance of EVM dev tools, the EVM is not well suited to high-performance applications such as games. Further, the EVM’s single-threaded execution and storage model limits throughput in ways that can be improved on.

The main advantage of this approach is that improving the rollup throughput doesn’t require sacrificing composability or restricting the number of use cases. This approach can work for any Web3 application as long as the execution environment can achieve the throughput required by the application. This makes it the only viable solution for applications that require access to a shared state, like AMMs, lending protocols, and other DeFi applications.

Expanding the EVM functionality via precompiles

The first approach is for the rollup to remain EVM compatible and address some of the throughput limitations via precompiles. The idea here is simple. A precompile moves computationally expensive EVM operations down to the node level. An operation that would require hundreds or thousands of EVM opcodes and consume 100k+ gas can be simplified into a single operation with 100x lower gas costs. Expanding the rollup environment with precompiles is often called EVM+. Examples of this approach include supporting on-chain privacy and supporting more efficient signature schemes, e.g., BLS signatures. For instance, the zkHoldem poker game uses specialized FHE and zk operations to achieve private poker card dealing and revealing. The development of these specialized precompiles is often a shared effort between the app rollup developer and the RaaS providers who manage the deployment and maintenance of the app rollup’s infrastructure.
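To make "moving an operation down to the node level" concrete, here is a toy model of how a node might dispatch a call to a precompile address. It is a sketch rather than real client code: the `call` function and the dispatch table are hypothetical. The one grounded detail is Ethereum's existing SHA-256 precompile at address 0x02, which charges 60 gas plus 12 gas per 32-byte word of input.

```python
import hashlib


def sha256_precompile(data: bytes):
    """Native implementation with a fixed gas schedule (60 + 12 per word)."""
    words = (len(data) + 31) // 32          # input length in 32-byte words
    gas = 60 + 12 * words
    return hashlib.sha256(data).digest(), gas


# Address -> native handler. An EVM+ rollup would register extra entries here
# (for example a BLS verification routine) at addresses of its own choosing.
PRECOMPILES = {0x02: sha256_precompile}


def call(address: int, data: bytes, gas_limit: int):
    """Toy dispatcher: run native code instead of interpreting opcodes."""
    handler = PRECOMPILES.get(address)
    if handler is None:
        raise NotImplementedError("toy model: only precompile addresses are handled")
    output, gas_used = handler(data)
    if gas_used > gas_limit:
        raise RuntimeError("out of gas")
    return output, gas_used


digest, gas_used = call(0x02, b"a" * 64, gas_limit=1_000)
print(gas_used)  # 84
```

The caller pays a small, predictable amount of gas because the node runs optimized native code, whereas the same hash implemented in contract bytecode would execute and pay for many individual opcodes.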

Using a non-EVM execution environment

Another approach to improving the rollup execution environment is to break free from the EVM. This approach is becoming more popular among developers who are new to the Ethereum ecosystem and devs who believe that Solidity is not the best language for developing complex applications.

Today we have rollup applications running on WASM, SVM, Cairo, and even Linux runtimes. Most of these approaches allow developers to write their smart contracts in high-level languages such as Rust or C. The downside is that interoperability with existing Solidity contracts is often lost. However, it’s still possible to create compatibility with the EVM. For instance, Arbitrum’s Stylus employs a co-processor to make Stylus contracts compatible with the EVM. This design brings Stylus closer to an EVM+ architecture than to a non-EVM one.

Hybrid execution environments

A third approach, particularly popular within FOGs, combines the best features of the two previous approaches. It pairs EVM compatibility with a specialized non-EVM execution environment. The non-EVM environment focuses on high-performance execution of the core game primitives. In-game asset management, e.g., trading in-game NFTs, can be handled by standard Solidity contracts.

The advantage of this approach is that EVM compatibility ensures alignment with a larger ecosystem of devs and existing products. It also allows for permissionless composability. Developers can mod and extend the game logic by adding EVM/Solidity smart contracts. Meanwhile, the specialized non-EVM game engine achieves the high throughput that the EVM cannot deliver.

Examples of this approach are World Engine from Argus and Keystone from Curio. The World Engine separates the execution of the game logic into a separate layer called Game Shard that runs on top of the EVM-compatible layer. The Game Shard is also designed to allow horizontal scaling to adjust the total rollup throughput based on demand. Similarly, Curio’s Keystone architecture bundles a high-throughput game engine with the EVM as the rollup execution environment. The challenge here is to achieve seamless interoperability between the EVM engine and the game engine.

Data Availability Considerations

The previous discussion focused on the main aspect of scaling app rollups, which is increasing the rollup transaction throughput. Increased throughput raises other related topics, such as data availability (DA), sequencer decentralization, and settlement speed. Data availability is the most pressing of these issues for high-throughput app rollups.

A single app rollup can potentially achieve throughputs exceeding 10k TPS. Using Ethereum as a DA layer for these transactions is not possible. First, the average cost of publishing the data of a simple L2 ETH transfer on L1 can exceed $0.10. These costs are too high for most app rollups. More importantly, Ethereum’s L1 currently cannot support more than roughly 8k TPS [1] in total across rollups that use the L1 for DA.
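The roughly 8k TPS ceiling can be reproduced from the assumptions in footnote [1]. Two inputs below are not stated in the article and are assumed here: a block gas limit of 30 million gas, and a calldata price of 16 gas per non-zero byte.

```python
BLOCK_GAS_LIMIT = 30_000_000   # assumption: 30M gas per L1 block
CALLDATA_SHARE = 0.5           # footnote [1]: 50% of the block used for data
GAS_PER_BYTE = 16              # assumption: price of a non-zero calldata byte
BYTES_PER_TX = 10              # footnote [1]: average compressed L2 tx size
BLOCK_TIME_SECONDS = 12        # footnote [1]: 12-second block times

data_gas_per_block = BLOCK_GAS_LIMIT * CALLDATA_SHARE
bytes_per_block = data_gas_per_block / GAS_PER_BYTE
txs_per_block = bytes_per_block / BYTES_PER_TX
max_tps = txs_per_block / BLOCK_TIME_SECONDS

print(bytes_per_block)  # 937500.0
print(max_tps)          # 7812.5
```

This works out to about 0.94 MB of calldata per block and about 7.8k TPS shared by every rollup posting to L1, so a single app rollup targeting 10k TPS would not fit even if it had the chain to itself.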

App rollups will primarily depend on external DA solutions. Celestia and EigenDA are currently positioned as the most viable options for app rollups. For instance, Eclipse plans to use Celestia for its high-throughput SVM-based rollup. Argus and other high-throughput game engines also plan to use Celestia initially. Similarly, EigenDA, which promises a data throughput of up to 10 MB/s, can be a viable solution for multiple app rollups.

Integrating Celestia or EigenDA, however, has the main disadvantage of economic value leakage. The app rollup has to pay fees to the DA layer in addition to the settlement fees on the Ethereum L1. The settlement fees are critical for the app rollup because they align the rollup’s security with Ethereum’s security. DA guarantees are less important, especially in the context of FOGs, where the transaction values are much smaller. Furthermore, Celestia and EigenDA promise low fees because these networks are new and will initially have low utilization. When these DA networks achieve high utilization, the DA fees can also become excessive. In my opinion, app rollups should instead use a simple Data Availability Committee (DAC) to attest to the availability of the rollup data [3].
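The core check behind a DAC can be sketched as a k-of-n threshold over member attestations. This is an illustrative model only: HMAC tags with per-member keys stand in for real digital signatures (a production DAC would use public-key signatures so that verifiers hold no secrets), and the member names, keys, and threshold are hypothetical.

```python
import hashlib
import hmac


def attest(member_key: bytes, data_root: bytes) -> bytes:
    """A member's attestation that it stores the data behind `data_root`."""
    return hmac.new(member_key, data_root, hashlib.sha256).digest()


def is_available(data_root, attestations, member_keys, threshold):
    """Accept the batch if at least `threshold` known members attested to it."""
    valid = 0
    for member, tag in attestations.items():
        key = member_keys.get(member)
        if key is not None and hmac.compare_digest(tag, attest(key, data_root)):
            valid += 1
    return valid >= threshold


member_keys = {"m1": b"key-1", "m2": b"key-2", "m3": b"key-3"}  # hypothetical
data_root = hashlib.sha256(b"rollup batch 42").digest()

attestations = {
    "m1": attest(member_keys["m1"], data_root),
    "m2": attest(member_keys["m2"], data_root),
}
print(is_available(data_root, attestations, member_keys, threshold=2))  # True
```

The trade-off is that availability now rests on the honesty of the committee rather than on a separate DA network, which the paragraph above argues is acceptable when transaction values are small.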

In conclusion, I believe that app rollups are the best existing solution for scaling high-throughput applications in general and fully on-chain games in particular. Scaling these app rollups is the key to achieving mainstream adoption that goes beyond crypto-native users. At Alliance, we want to bring this vision to reality by supporting founders who are building this

I would like to thank Matt Katz, Kevin Zhang, Tarrence van As, and Larry Liu for their valuable feedback on this article.

[1] Assumes that 50% of Ethereum’s block gas limit is used only to store data as calldata, an average transaction size of 10 bytes, and 12-second block times.

Disclaimer:

  1. This article is reprinted from [Alliance]. All copyrights belong to the original author [Mohamed Fouda]. If there are objections to this reprint, please contact the Gate Learn team, and they will handle it promptly.
  2. Liability Disclaimer: The views and opinions expressed in this article are solely those of the author and do not constitute any investment advice.
  3. Translations of the article into other languages are done by the Gate Learn team. Unless mentioned, copying, distributing, or plagiarizing the translated articles is prohibited.
