Understanding Rollup Bottlenecks and Optimization Methods From the Perspective of the Performance Difference between opBNB and Ethereum Layer2

Intermediate · 2/27/2024, 3:01:17 AM
This article aims to provide a brief summary of the working principles and commercial significance of opBNB, outlining an important step taken by the BSC public chain in the era of modular blockchain.

The Road of Big Blocks on BNB Chain

Like Solana, Heco, and other exchange-backed public chains, BNB Chain’s BNB Smart Chain (BSC) has long pursued high performance. When it launched in 2020, BSC set the gas limit of each block to 30 million, with a stable block interval of 3 seconds. With these parameters, BSC reached a maximum TPS (measured with various transaction types mixed together) of over 100. In June 2021, the block gas limit was raised to 60 million. In July of the same year, however, a blockchain game called CryptoBlades took off on BSC, pushing daily transaction counts above 8 million and sending fees soaring. Evidently, BSC’s efficiency bottleneck was still quite pronounced at the time.

(Source: BscScan)

To address these network performance issues, BSC raised the per-block gas limit again, and it stayed at around 80-85 million for a long time. In September 2022, the limit was increased to 120 million, and by the end of that year to 140 million, nearly five times the 2020 value. BSC had previously planned to raise the block gas limit to 300 million, but the proposal for such super-large blocks has not been implemented, perhaps out of concern for the heavy burden on validator nodes.


source: YCHARTS

Later on, BNB Chain seemed to focus more on the modular/Layer2 track rather than persisting in Layer1 expansion. This intention became increasingly evident from the launch of zkBNB in the second half of last year to GreenField at the beginning of this year. Out of a strong interest in modular blockchain/Layer2, the author of this article will use opBNB as a research object to reveal the performance bottlenecks of Rollup by comparing it with Ethereum Layer2.

How BSC’s High Throughput Boosts opBNB’s DA Layer

As is well known, Celestia has summarized four key components according to the workflow of a modular blockchain:

Execution Layer: executes contract code and completes state transitions.
Settlement Layer: handles fraud proofs/validity proofs and addresses bridging between L2 and L1.
Consensus Layer: reaches consensus on transaction ordering.
Data Availability (DA) Layer: publishes blockchain ledger data so that validators can download it.


Among them, the DA layer is often coupled with the consensus layer. For example, the data of Optimistic Rollup’s DA contains a batch of transaction sequences in L2 blocks. When L2 full nodes obtain DA data, they know the order of each transaction in this batch. (For this reason, the Ethereum community believes that the DA layer and the consensus layer are related when layering Rollup.)

However, for Ethereum Layer2, the data throughput of the DA layer (Ethereum itself) has become the biggest bottleneck restricting Rollup performance. Ethereum’s current data throughput is so low that Rollups must hold their TPS down to avoid generating more data than the Ethereum mainnet can absorb. The low throughput also leaves a large number of transactions on the Ethereum network pending, which pushes gas fees to extremely high levels and further raises the cost of data publication for Layer2. As a result, many Layer2 networks have turned to DA layers outside Ethereum, such as Celestia. opBNB, which sits right next to a high-throughput chain, simply uses BSC for DA to solve the data publication bottleneck.

For ease of understanding, let’s look at how a Rollup publishes DA data. Taking Arbitrum as an example, an EOA address on Ethereum controlled by the Layer2 sequencer periodically sends transactions to a designated contract. The packaged L2 transaction data is written into the calldata input of each such transaction, and the corresponding on-chain events are emitted, leaving a permanent record in the contract logs.


In this way, Layer2 transaction data is stored in Ethereum blocks for the long term. Anyone able to run an L2 node can download the records and parse the data, but Ethereum nodes themselves do not execute these L2 transactions. In other words, L2 only stores transaction data in Ethereum blocks and pays for that storage, while the computational cost of executing the transactions is borne by the L2 nodes.

That is how Arbitrum implements DA. Optimism instead has a sequencer-controlled EOA address send a transfer to another designated EOA address, carrying the new batch of Layer2 transaction data in the transaction’s data field. opBNB, which is built on the OP Stack, publishes DA data in essentially the same way as Optimism.
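To make the "storage cost only" point concrete, here is a minimal sketch of how the L1 gas for posting a batch as calldata can be estimated. It assumes the standard Ethereum calldata pricing (16 gas per non-zero byte, 4 gas per zero byte) plus the 21,000 gas base transaction cost; the payload is a made-up placeholder, and any gas spent inside the receiving contract is ignored.

```python
# Calldata pricing on Ethereum: 16 gas per non-zero byte, 4 gas per zero byte.
GAS_PER_NONZERO_BYTE = 16
GAS_PER_ZERO_BYTE = 4
BASE_TX_GAS = 21_000  # fixed cost of any L1 transaction, independent of calldata


def calldata_gas(data: bytes) -> int:
    """Gas charged for carrying `data` as transaction calldata."""
    zero_bytes = data.count(0)
    nonzero_bytes = len(data) - zero_bytes
    return zero_bytes * GAS_PER_ZERO_BYTE + nonzero_bytes * GAS_PER_NONZERO_BYTE


def batch_posting_gas(batch: bytes) -> int:
    """Rough L1 gas for one batch: base cost plus calldata (contract logic ignored)."""
    return BASE_TX_GAS + calldata_gas(batch)


# Hypothetical 1,000-byte compressed batch: 900 non-zero bytes, 100 zero bytes.
example_batch = bytes([0x01] * 900 + [0x00] * 100)
print(batch_posting_gas(example_batch))
```

Compressed batches contain few zero bytes, which is why the 16 gas per byte figure is the one that matters in practice.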


Clearly, the throughput of the DA layer limits how much data a Rollup can publish per unit of time, and therefore limits TPS. Since EIP-1559, the gas limit of each Ethereum block has stood at 30 million, and the block time after the Merge is about 12 seconds, so Ethereum can process at most 2.5 million gas per second. L2 transaction data in calldata usually costs 16 gas per byte, so Ethereum can handle at most about 150 KB of calldata per second. By contrast, BSC can process about 2910 KB of calldata per second at most, which is 18.6 times Ethereum’s figure. The difference between the two as DA layers is obvious.
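The arithmetic behind these figures can be reproduced in a few lines. This is a back-of-the-envelope sketch that assumes every byte costs 16 gas and that whole blocks are devoted to calldata; the function name is ours, not from any library. The unrounded results are slightly higher than the article’s rounded 150 KB and 2910 KB.

```python
def calldata_bytes_per_second(block_gas_limit: int, block_time_s: float,
                              gas_per_byte: int = 16) -> float:
    """Upper bound on calldata bytes a chain can include per second."""
    gas_per_second = block_gas_limit / block_time_s
    return gas_per_second / gas_per_byte


ethereum = calldata_bytes_per_second(30_000_000, 12)   # 30M gas every 12 s
bsc = calldata_bytes_per_second(140_000_000, 3)        # 140M gas every 3 s

print(f"Ethereum: {ethereum / 1000:.1f} KB/s")
print(f"BSC:      {bsc / 1000:.1f} KB/s")
print(f"Ratio:    {bsc / ethereum:.2f}x")
```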

To summarize, Ethereum can carry about 150 KB of L2 transaction data per second. Even after EIP-4844 launches, this number will not change much; only the DA fee will fall. So how many transactions fit into about 150 KB per second? To answer that, we need to look at the data compression rate of Rollups.

In 2021, Vitalik was overly optimistic, estimating that an Optimistic Rollup could compress transaction data to 11% of its original size. For example, a basic ETH transfer that originally occupies 112 bytes of calldata could be compressed to 12 bytes, an ERC-20 transfer to 16 bytes, and a Uniswap swap to 14 bytes. By his estimate, Ethereum could record about 10,000 L2 transactions per second (with various types mixed together). However, according to data disclosed by the Optimism team in 2022, the best compression rate achieved in practice is only about 37%, roughly 3.5 times worse than Vitalik’s estimate.


(Vitalik’s estimation of Rollup scalability effect deviates significantly from actual conditions)

(Actual compression rates achieved by various compression algorithms disclosed by Optimism)

So let’s give a reasonable number: even if Ethereum reaches its throughput limit, the maximum TPS of all Optimistic Rollups combined is only slightly over 2000. In other words, if Ethereum blocks were used entirely to carry data published by Optimistic Rollups, shared among Arbitrum, Optimism, Base, Boba, and others, their combined TPS would not even reach 3000, even with the most efficient compression algorithms. We must also consider that after EIP-1559, each block’s gas usage averages only 50% of the maximum, so the number above should be halved. After EIP-4844 launches, the fees for publishing data will fall significantly, but Ethereum’s maximum block size will not change much (too large a change would affect the security of the ETH main chain), so the estimate above will not improve much.
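The relationship between DA bandwidth, compression rate, and the TPS ceiling can be sketched as follows. The example uses only the 112-byte ETH transfer quoted earlier, so it gives an upper bound for the cheapest transaction type. The article’s mixed-workload figures are lower because typical transactions are larger, and the function and constant names are illustrative assumptions.

```python
DA_BYTES_PER_SECOND = 150_000   # the article's rounded Ethereum figure
ETH_TRANSFER_BYTES = 112        # uncompressed calldata size of a basic transfer


def tps_ceiling(da_bytes_per_second: float, raw_tx_bytes: float,
                compression_ratio: float, block_fill: float = 1.0) -> float:
    """Transactions per second that fit in the DA bandwidth.

    compression_ratio: compressed size / original size (0.11 means 11%).
    block_fill: share of the block gas limit actually used on average.
    """
    compressed_bytes = raw_tx_bytes * compression_ratio
    return da_bytes_per_second * block_fill / compressed_bytes


optimistic = tps_ceiling(DA_BYTES_PER_SECOND, ETH_TRANSFER_BYTES, 0.11)
realistic = tps_ceiling(DA_BYTES_PER_SECOND, ETH_TRANSFER_BYTES, 0.37)
half_full = tps_ceiling(DA_BYTES_PER_SECOND, ETH_TRANSFER_BYTES, 0.37, block_fill=0.5)

print(f"11% compression:             {optimistic:,.0f} transfers/s")
print(f"37% compression:             {realistic:,.0f} transfers/s")
print(f"37% and half-full L1 blocks: {half_full:,.0f} transfers/s")
```

Whatever transaction mix is assumed, moving from 11% to 37% compression divides the ceiling by about 3.4, and the 50% average block fill halves it again.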


According to data from Arbiscan and Etherscan, one Arbitrum transaction batch contains 1115 transactions and consumes 1.81 million gas on Ethereum. By extrapolation, if every block of the DA layer were filled with batches, Arbitrum’s theoretical TPS limit would be approximately 1500. Of course, because of L1 block reorganizations, Arbitrum cannot publish a batch in every Ethereum block, so this number is purely theoretical for now.

In addition, as smart wallets based on EIP-4337 become widespread, the DA problem will grow more severe. With EIP-4337, the way users verify their identity can be customized, for example by uploading binary fingerprint or iris data, which further increases the data size of ordinary transactions. Ethereum’s low data throughput is therefore the biggest bottleneck limiting Rollup efficiency, and the problem may not be properly resolved for a long time.

On BNB Chain’s BSC, by contrast, the maximum calldata size processed per second is approximately 2910 KB, 18.6 times that of Ethereum. In other words, as long as the execution layer can keep up, the theoretical TPS upper limit of Layer2 within the BNB Chain ecosystem can reach approximately 18 times that of ARB or OP. This number is calculated from BNB Chain’s current maximum block gas limit of 140 million and a block time of 3 seconds.
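The Arbitrum extrapolation and the roughly 18x scaling on BSC can be checked with the sketch below. It assumes that a batch of the same size consumes the same gas on both chains and that every L1 block is completely filled with batches; neither holds in practice, so these are ceilings rather than forecasts.

```python
def batch_tps(txs_per_batch: int, gas_per_batch: int,
              block_gas_limit: int, block_time_s: float) -> float:
    """Theoretical L2 TPS if every L1 block is filled with identical batches."""
    batches_per_block = block_gas_limit / gas_per_batch
    return batches_per_block * txs_per_batch / block_time_s


# Batch figures quoted from Arbiscan/Etherscan in the text.
TXS_PER_BATCH = 1115
GAS_PER_BATCH = 1_810_000

on_ethereum = batch_tps(TXS_PER_BATCH, GAS_PER_BATCH, 30_000_000, 12)
on_bsc = batch_tps(TXS_PER_BATCH, GAS_PER_BATCH, 140_000_000, 3)

print(f"Ethereum as DA layer: {on_ethereum:,.0f} TPS")
print(f"BSC as DA layer:      {on_bsc:,.0f} TPS")
print(f"Ratio:                {on_bsc / on_ethereum:.2f}x")
```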

In other words, the current aggregate TPS limit of all Rollups within the BNB Chain ecosystem is 18.6 times that of Ethereum (even when ZK Rollups are taken into account). From this perspective, it is easy to understand why so many Layer2 projects publish their data to DA layers off the Ethereum chain: the difference is quite evident.

However, the issue is not so simple. Besides data throughput, the stability of Layer1 itself also affects Layer2. For example, most Rollups wait several minutes between publishing transaction batches to Ethereum because of the possibility of Layer1 block reorganization. If a Layer1 block is reorganized, the Layer2 ledger is affected as well. So after each L2 batch is released, the sequencer waits for several new Layer1 blocks to be produced, which greatly reduces the probability of a rollback, before publishing the next batch. This delays the final confirmation of L2 blocks and slows the confirmation of large transactions (which need irreversible results to be safe).

In short, a transaction on L2 becomes irreversible only after it has been published in a DA layer block and the DA layer has then produced a certain number of new blocks. This is an important factor limiting Rollup performance. Ethereum produces blocks slowly, one every 12 seconds. Assuming a Rollup publishes a batch of L2 transactions every 15 blocks, there is a 3-minute interval between batches, and after each batch is published it must still wait for multiple Layer1 blocks before becoming irreversible (assuming it is not challenged). The time from initiation to irreversibility for a transaction on Ethereum Layer2 is therefore quite long, and settlement is slow. BNB Chain, by contrast, produces a block every 3 seconds, and blocks become irreversible in just 45 seconds (the time it takes to produce 15 new blocks).

Based on current parameters, assuming the same number of L2 transactions and taking L1 block irreversibility into account, opBNB can publish transaction data up to 8.53 times as often as Arbitrum per unit of time (once every 45 seconds for the former, once every 6.4 minutes for the latter). Clearly, large transactions settle much faster on opBNB than on Ethereum Layer2. In addition, the maximum data size opBNB can publish each time is 4.66 times that of Ethereum Layer2 (the former’s L1 block gas limit is 140 million, the latter’s 30 million). 8.53 × 4.66 ≈ 39.7. This is the gap between opBNB and Arbitrum in practical TPS limit (for security reasons ARB currently seems to hold its TPS down deliberately, but even if it were raised, it would in theory still be many times lower than opBNB’s).
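The two ratios and their product follow directly from the publishing intervals and block gas limits quoted above. A minimal sketch, with the intervals expressed in seconds:

```python
ARBITRUM_BATCH_INTERVAL_S = 384   # 6.4 minutes between batches
OPBNB_BATCH_INTERVAL_S = 45       # fastest observed opBNB interval
ETH_BLOCK_GAS_LIMIT = 30_000_000
BSC_BLOCK_GAS_LIMIT = 140_000_000

# How many times more often opBNB can publish a batch.
frequency_ratio = ARBITRUM_BATCH_INTERVAL_S / OPBNB_BATCH_INTERVAL_S
# How much more data fits into one L1 block.
size_ratio = BSC_BLOCK_GAS_LIMIT / ETH_BLOCK_GAS_LIMIT
# Combined gap in practical TPS ceiling.
gap = frequency_ratio * size_ratio

print(f"frequency ratio: {frequency_ratio:.2f}")
print(f"size ratio:      {size_ratio:.2f}")
print(f"combined gap:    {gap:.1f}")
```

Without intermediate rounding, the product comes out slightly above the article’s figure, at about 39.8.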


(Arbitrum’s sequencer publishes a transaction batch every 6-7 minutes)


(opBNB’s sequencer publishes a transaction batch every 1-2 minutes, with the fastest taking only 45 seconds)

Of course, there is another crucial issue to consider: gas fees on the DA layer. Each time L2 publishes a transaction batch, it pays a fixed cost of 21,000 gas that is unrelated to the calldata size. If gas fees on the DA layer/L1 are high, this fixed cost stays high and the sequencer will publish batches less frequently. In addition, among the components of L2 fees, the execution layer’s cost is very low and can usually be ignored, so only the impact of DA costs on transaction fees needs to be considered.

In summary, publishing calldata of the same size consumes the same amount of gas on Ethereum and BNB Chain, but the gas price on Ethereum is about ten to several dozen times that on BNB Chain. Translated into L2 transaction fees, users on Ethereum Layer2 currently pay about ten to several dozen times more than users on opBNB. Overall, the differences between opBNB and Optimistic Rollups on Ethereum are quite apparent.
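The fixed-cost argument can be illustrated by spreading the 21,000 gas over the transactions in a batch. The gas prices and batch sizes below are hypothetical values chosen only to show the shape of the trade-off, not measurements.

```python
BASE_TX_GAS = 21_000  # fixed gas per published batch, regardless of calldata size


def fixed_overhead_per_tx(gas_price_gwei: float, txs_per_batch: int) -> float:
    """Share of the fixed posting cost (in gwei) carried by each L2 transaction."""
    return BASE_TX_GAS * gas_price_gwei / txs_per_batch


# Hypothetical L1 gas prices: 30 gwei (expensive chain) vs 3 gwei (cheap chain).
for gas_price in (30, 3):
    for batch_size in (100, 1000):
        overhead = fixed_overhead_per_tx(gas_price, batch_size)
        print(f"{gas_price:>2} gwei, {batch_size:>4} txs/batch -> {overhead:,.0f} gwei per tx")
```

At a high L1 gas price, the sequencer has to wait for roughly ten times as many transactions per batch to reach the same per-transaction overhead, which is why expensive DA leads to less frequent publishing.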

(A transaction consuming 150,000 gas on Optimism costs $0.21)


(A transaction consuming 130,000 gas on opBNB costs $0.004)

However, while increasing the data throughput of the DA layer raises the aggregate throughput of the whole Layer2 system, it does comparatively little for the performance of an individual Rollup, because the execution layer often cannot process transactions fast enough. Even if the limitations of the DA layer could be ignored, the execution layer would become the next bottleneck. If a Layer2’s execution layer is slow, the overflow of transaction demand spills over to other Layer2s, ultimately fragmenting liquidity. Improving execution layer performance is therefore also crucial: it is the next hurdle after the DA layer.

opBNB’s Boost in the Execution Layer: Cache Optimization

When most people discuss the performance bottlenecks of blockchain execution layers, they inevitably mention two important bottlenecks: the single-threaded serial execution of the EVM that fails to fully utilize the CPU, and the inefficient data lookup of the Merkle Patricia Trie adopted by Ethereum. In essence, the scaling strategies for the execution layer revolve around making more efficient use of CPU resources and ensuring that the CPU can access data as quickly as possible.

Optimization solutions for serial EVM execution and Merkle Patricia Trie are often complex and challenging to implement, while the more cost-effective efforts tend to focus on cache optimization. In fact, cache optimization brings us back to points frequently discussed in traditional Web2 and even textbook contexts.

Typically, the CPU can retrieve data from memory hundreds of times faster than it can from disk. For illustration, if fetching a piece of data from memory took 0.1 seconds, fetching the same data from disk might take 10 seconds. Reducing the overhead of disk reads and writes, i.e., cache optimization, is therefore an essential part of optimizing the blockchain execution layer.

In Ethereum and most other public chains, the database that records on-chain address states is stored entirely on disk, while the so-called World State trie is merely an index of this database, or a directory used for data lookup. Every time the EVM executes a contract, it needs to access relevant address states. Fetching data from the disk-based database one by one would significantly slow down transaction execution. Therefore, setting up a cache outside the database/disk is a necessary means of speeding up.

opBNB directly adopts the cache optimization solution used by BNB Chain. According to information disclosed by opBNB’s partner NodeReal, BSC originally placed three layers of cache between the EVM and the LevelDB database that stores the state. The design is similar to a traditional three-level cache: data that is accessed more frequently is kept in the cache, so the CPU looks there first. If the cache hit rate is high enough, the CPU rarely needs to go to disk for data, and overall execution speed improves significantly.
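A generic layered cache in front of a slow store can be sketched as below. This is not BSC’s or NodeReal’s implementation; the class names, the LRU eviction policy, and the layer capacities are all assumptions made for illustration, and a plain dictionary stands in for the on-disk database.

```python
from collections import OrderedDict


class LRULayer:
    """One cache layer with least-recently-used eviction."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)          # mark as recently used
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)   # evict the least recently used entry


class CachedStateReader:
    """Looks up state in small fast layers first, then falls back to disk."""

    def __init__(self, disk_db: dict, capacities=(2, 8, 32)):
        self.layers = [LRULayer(c) for c in capacities]
        self.disk_db = disk_db               # stand-in for the on-disk database
        self.disk_reads = 0

    def get(self, key):
        for depth, layer in enumerate(self.layers):
            value = layer.get(key)
            if value is not None:            # assumes stored values are never None
                for faster in self.layers[:depth]:
                    faster.put(key, value)   # promote hot data to faster layers
                return value
        self.disk_reads += 1                 # cache miss: pay the slow disk read
        value = self.disk_db[key]
        for layer in self.layers:
            layer.put(key, value)
        return value


disk = {"0xabc": 100, "0xdef": 7}
reader = CachedStateReader(disk)
reader.get("0xabc")
reader.get("0xabc")                          # second read is served from cache
print(reader.disk_reads)
```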

Later, NodeReal added a feature on top of this, which leverages the unused CPU cores to preemptively read the data that the EVM will need to process in the future from the database and store it in the cache. This feature is called “state preloading.”

The principle of state preloading is simple: blockchain nodes’ CPUs are multi-core, while the EVM operates in a single-threaded serial execution mode, utilizing only one CPU core, leaving other CPU cores underutilized. To address this, the CPU cores that are not used by the EVM can assist in tasks by predicting the data that the EVM will need from the yet-to-be-processed transaction sequence. These CPU cores outside the EVM will then retrieve the data that the EVM will need from the database, helping the EVM reduce the overhead of data retrieval and thus speeding up execution.
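The idea can be sketched with a thread pool that warms a cache before the single-threaded executor runs. Everything here is a simplified assumption: `touched_keys` is a hypothetical predictor that only considers the sender and recipient, dictionaries stand in for the database and cache, and preloading finishes before execution starts, whereas a real node would overlap the two.

```python
from concurrent.futures import ThreadPoolExecutor


def touched_keys(tx: dict) -> list:
    """Hypothetical predictor: state keys a transaction is expected to read."""
    return [tx["from"], tx["to"]]


def preload(pending_txs: list, disk_db: dict, cache: dict, workers: int = 4) -> None:
    """Use otherwise idle workers to copy soon-needed state from disk into cache."""
    keys = {key for tx in pending_txs for key in touched_keys(tx)}

    def load(key):
        return key, disk_db.get(key, 0)      # stands in for a slow disk read

    with ThreadPoolExecutor(max_workers=workers) as pool:
        for key, value in pool.map(load, keys):
            cache[key] = value               # cache is written from the main thread only


def execute_serially(pending_txs: list, disk_db: dict, cache: dict) -> int:
    """Single-threaded execution; returns how many reads had to go to disk."""
    disk_reads = 0
    for tx in pending_txs:
        for key in (tx["from"], tx["to"]):
            if key not in cache:
                disk_reads += 1
                cache[key] = disk_db.get(key, 0)
        cache[tx["from"]] -= tx["value"]
        cache[tx["to"]] += tx["value"]
    return disk_reads


disk = {"alice": 10, "bob": 0}
txs = [{"from": "alice", "to": "bob", "value": 3}]
cache = {}
preload(txs, disk, cache)
misses = execute_serially(txs, disk, cache)
print(misses, cache)
```

With preloading, the serial executor finds every key already in the cache; without it, each first access would stall on a disk read.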

With cache optimization and sufficient hardware configurations, opBNB has effectively pushed the performance of its node’s execution layer close to the limit of the EVM, processing up to 100 million gas per second. This 100 million gas is essentially the performance ceiling of the unmodified EVM, as demonstrated by experimental testing data from a prominent public chain.

Put simply, based on transaction data observed on blockchain explorers, opBNB can process up to 4,761 simple transfers per second, 1,500 to 3,000 ERC-20 token transfers per second, and roughly 500 to 1,000 swap operations per second. At current parameters, opBNB’s TPS limit is 40 times that of Ethereum, more than 2 times that of BNB Chain, and more than 6 times that of Optimism.
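The simple-transfer figure follows from dividing the 100 million gas per second execution ceiling by the 21,000 gas that a basic transfer consumes. The sketch below reproduces it and, running the same division in reverse, shows the per-transaction gas implied by the 1,500 to 3,000 and 500 to 1,000 TPS ranges. Those implied values are derived here for illustration and are not stated in the article.

```python
EXECUTION_GAS_PER_SECOND = 100_000_000  # opBNB execution ceiling quoted above
SIMPLE_TRANSFER_GAS = 21_000            # gas used by a basic transfer


def tps_for(gas_per_tx: int) -> int:
    """Whole transactions per second at the execution ceiling."""
    return EXECUTION_GAS_PER_SECOND // gas_per_tx


def implied_gas_per_tx(tps: int) -> int:
    """Gas per transaction implied by a given TPS at the same ceiling."""
    return EXECUTION_GAS_PER_SECOND // tps


print("simple transfers/s:", tps_for(SIMPLE_TRANSFER_GAS))
print("ERC-20 transfer gas range:", implied_gas_per_tx(3000), "to", implied_gas_per_tx(1500))
print("swap gas range:", implied_gas_per_tx(1000), "to", implied_gas_per_tx(500))
```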

Of course, for Ethereum Layer2 solutions the DA layer itself is a severe constraint, so once DA layer block time and stability are taken into account, real performance falls well short of what the execution layer alone could deliver.

For opBNB, which has a high-throughput DA layer in BNB Chain, this multiplied scaling effect is valuable, especially since BNB Chain can host multiple such scaling projects. It is foreseeable that BNB Chain has already incorporated the Layer2 solutions led by opBNB into its strategic plans and will continue to onboard more modular blockchain projects, including introducing ZK proofs into opBNB and offering a high-availability DA layer through complementary infrastructure such as GreenField, in an attempt to compete or cooperate with the Ethereum Layer2 ecosystem.

In the era where layered scaling has become the trend, whether other public chains will also rush to support their own Layer2 projects remains to be seen, but undoubtedly, the paradigm shift towards modular blockchain infrastructure is already happening.

Disclaimer:

  1. This article is reprinted from [极客 Web3]. All copyrights belong to the original author [Faust, 极客web3]. If there are objections to this reprint, please contact the Gate Learn team, and they will handle it promptly.
  2. Liability Disclaimer: The views and opinions expressed in this article are solely those of the author and do not constitute any investment advice.
  3. Translations of the article into other languages are done by the Gate Learn team. Unless mentioned, copying, distributing, or plagiarizing the translated articles is prohibited.

Understanding Rollup Bottlenecks and Optimization Methods From the Perspective of the Performance Difference between opBNB and Ethereum Layer2

Intermediate2/27/2024, 3:01:17 AM
This article aims to provide a brief summary of the working principles and commercial significance of opBNB, outlining an important step taken by the BSC public chain in the era of modular blockchain.

BNB Chain’s road to big blocks

The Road of Big Blocks on BNB Chain

Similar to Solana, Heco, and other public chains supported by exchanges, BNB Chain’s public chain BNB Smart Chain (BSC) has long pursued high performance. Since its launch in 2020, BSC has set the gas capacity limit for each block to 30 million, with a stable block interval of 3 seconds. With such parameters, BSC achieved a maximum TPS (TPS with various transactions mixed together) of over 100. In June 2021, the block gas limit of BSC was increased to 60 million. However, in July of the same year, a chain game called CryptoBlades exploded on BSC, causing daily transaction volumes to exceed 8 million and resulting in skyrocketing fees. It turned out that the efficiency bottleneck of BSC was still quite obvious at that time.

(Source: BscScan)

To address network performance issues, BSC once again raised the gas limit for each block, which remained stable at around 80-85 million for a long time. In September 2022, the gas limit per block of BSC Chain was increased to 120 million, and by the end of the year, it was raised to 140 million, nearly five times that of 2020. Previously, BSC had planned to increase the block gas capacity limit to 300 million, but perhaps considering the heavy burden on Validator nodes, the proposal for such super-large blocks has not been implemented.


source: YCHARTS

Later on, BNB Chain seemed to focus more on the modular/Layer2 track rather than persisting in Layer1 expansion. This intention became increasingly evident from the launch of zkBNB in the second half of last year to GreenField at the beginning of this year. Out of a strong interest in modular blockchain/Layer2, the author of this article will use opBNB as a research object to reveal the performance bottlenecks of Rollup by comparing it with Ethereum Layer2.

The Boost of BSC’s high throughput to opBNB’s DA layer

As we all know, Celestia has summarized four key components according to the workflow of modular blockchain: Execution Layer: Executes contract code and completes state transitions; Settlement Layer: Handles fraud proofs/validity proofs and addresses bridging issues between L2 and L1. Consensus Layer: Reaches consensus on transaction ordering. Data Availability Layer (DA): Publishes blockchain ledger-related data, allowing validators to download this data.


Among them, the DA layer is often coupled with the consensus layer. For example, the data of Optimistic Rollup’s DA contains a batch of transaction sequences in L2 blocks. When L2 full nodes obtain DA data, they know the order of each transaction in this batch. (For this reason, the Ethereum community believes that the DA layer and the consensus layer are related when layering Rollup.)

However, for Ethereum Layer2, the data throughput of the DA layer (Ethereum) has become the biggest bottleneck restricting Rollup performance. This is because the current data throughput of Ethereum is too low, forcing Rollup to suppress its TPS as much as possible to prevent the Ethereum mainnet from being unable to bear the data generated by L2. At the same time, the low data throughput causes a large number of transaction instructions within the Ethereum network to be in a pending state, leading to gas fees being pushed to extremely high levels and further increasing the cost of data publication for Layer2. Finally, many Layer2 networks have to adopt DA layers outside of Ethereum, such as Celestia, and opBNB, which is close to the water, has chosen to directly use the high throughput of BSC to implement DA to solve the bottleneck problem of data publication. For ease of understanding, let’s introduce the method of DA data publication for Rollup. Taking Arbitrum as an example, the Ethereum chain controlled by the Layer2 sequencer’s EOA address will periodically send Transactions to the specified contract. In the input parameters calldata of this instruction, the packaged transaction data is written, and the corresponding on-chain events are triggered, leaving a permanent record in the contract logs.


In this way, the transaction data of Layer2 is stored in Ethereum blocks for a long time. People who are capable of running L2 nodes can download the corresponding records and parse the corresponding data, but Ethereum nodes themselves do not execute these L2 transactions. It is easy to see that L2 only stores transaction data in Ethereum blocks, incurring storage costs, while the computational costs of executing transactions are borne by L2 nodes themselves. The aforementioned is the implementation method of Arbitrum’s DA, while Optimism uses an EOA address controlled by the sequencer to transfer to another specified EOA address, carrying a new batch of Layer2 transaction data in the additional data. As for opBNB, which uses the OP Stack, its DA data publishing method is basically the same as that of Optimism.


It is obvious that the throughput of the DA layer will limit the size of data that Rollup can publish in a unit of time, thereby limiting TPS. Considering that after EIP1559, the gas capacity of each ETH block stabilizes at 30 million, and the block time after the merge is about 12 seconds, Ethereum can handle a maximum of only 2.5 million gas per second. Most of the time, the gas consumed by accommodating L2 transaction data per byte in calldata is 16, so Ethereum can handle a maximum calldata size of only 150 KB per second. In contrast, BSC’s maximum average calldata size processed per second is about 2910 KB, which is 18.6 times that of Ethereum. The difference between the two as DA layers is obvious.

To summarize, Ethereum can carry about 150 KB of L2 transaction data per second. Even after the launch of EIP 4844, this number will not change much, only reducing the DA fee. So how many transactions’ data can be included in about 150KB per second? Here we need to explain the data compression rate of Rollup. Vitalik was overly optimistic in 2021, estimating that Optimistic Rollup can compress transaction data size to 11% of the original size. For example, a basic ETH transfer, originally occupying a calldata size of 112 bytes, can be compressed to 12 bytes by Optimistic Rollup, ERC-20 transfers can be compressed to 16 bytes, and Swap transactions on Uniswap can be compressed to 14 bytes. According to his estimation, Ethereum can record about 10,000 L2 transactions per second (with various types mixed together). However, according to the data disclosed by the Optimism team in 2022, the actual data compression rate can reach a maximum of only about 37%, which is 3.5 times lower than Vitalik’s estimate.


(Vitalik’s estimation of Rollup scalability effect deviates significantly from actual conditions)

(Actual compression rates achieved by various compression algorithms disclosed by Optimism)

So let’s give a reasonable number: even if Ethereum reaches its throughput limit, the maximum TPS of all Optimistic Rollups combined is only slightly over 2000. In other words, if Ethereum blocks were entirely used to carry the data published by Optimistic Rollups, such as those distributed among Arbitrum, Optimism, Base, and Boba, the combined TPS of these Optimistic Rollups would not even reach 3000, even under the most efficient compression algorithms. Additionally, we must consider that after EIP1559, each block’s gas capacity averages only 50% of the maximum value, so the above number should be halved. After the launch of EIP4844, although the transaction fees for publishing data will be significantly reduced, Ethereum’s maximum block size will not change much (as too much change would affect the security of the ETH main chain), so the estimated value above will not progress much.


According to data from Arbiscan and Etherscan, a batch of transactions on Arbitrum contains 1115 transactions, consuming 1.81 million gas on Ethereum. By extrapolation, if the DA layer is filled in every block, Arbitrum’s theoretical TPS limit is approximately 1500. Of course, considering the issue of L1 block reorganization, Arbitrum cannot publish transaction batches on every Ethereum block, so the above numbers are currently only theoretical. Additionally, with the widespread adoption of smart wallets related to EIP 4337, the DA issue will become even more severe. Because with support for EIP 4337, the way users verify their identity can be customized, such as uploading binary data of fingerprints or irises, which will further increase the data size occupied by regular transactions. Therefore, Ethereum’s low data throughput is the biggest bottleneck limiting Rollup efficiency, and this problem may not be properly resolved for a long time. On the other hand, in the BNB Chain of the BSC public chain, the maximum average calldata size processed per second is approximately 2910 KB, which is 18.6 times that of Ethereum. In other words, as long as the execution layer can keep up, the theoretical TPS upper limit of Layer2 within the BNB Chain ecosystem can reach approximately 18 times that of ARB or OP. This number is calculated based on the current BNB Chain’s maximum block gas capacity of 140 million, with a block time of 3 seconds.

In other words, the current aggregate TPS limit of all Rollups within the BNB Chain ecosystem is 18.6 times that of Ethereum (even when considering ZKRollup). From this perspective, it’s easy to understand why so many Layer2 projects use the DA layer under the Ethereum chain to publish data, as the difference is quite evident. However, the issue is not so simple. Besides the problem of data throughput, the stability of Layer1 itself can also affect Layer2. For example, most Rollups often wait for several minutes before publishing a batch of transactions to Ethereum, considering the possibility of Layer1 block reorganization. If a Layer1 block is reorganized, it would affect the blockchain ledger of Layer2. Therefore, the sequencer will wait for several new Layer1 blocks to be published after each release of an L2 transaction batch, significantly reducing the probability of block rollback, before publishing the next L2 transaction batch. This actually delays the time when L2 blocks are finally confirmed, reducing the confirmation speed of large transactions (large transactions require irreversible results to ensure security). In summary, transactions that occur in L2 only become irreversible after being published in the DA layer blocks and after the DA layer has generated a certain number of new blocks. This is an important reason limiting Rollup performance. However, Ethereum has a slow block generation speed, taking 12 seconds to produce a block. Assuming Rollup publishes a batch of L2 transactions every 15 blocks, there will be a 3-minute interval between different batches, and after each batch is published, it still needs to wait for multiple Layer1 blocks to be generated before becoming irreversible (assuming they are not challenged). 
Obviously, the time from initiation to irreversibility of transactions on Ethereum’s Layer2 is quite long, resulting in slow settlement speed; whereas, BNB Chain only takes 3 seconds to produce a block, and the blocks become irreversible in just 45 seconds (the time it takes to produce 15 new blocks). Based on the current parameters, assuming the same number of L2 transactions and considering the irreversibility of L1 blocks, the number of times opBNB can publish transaction data in a unit of time can reach up to 8.53 times that of Arbitrum (once every 45 seconds for the former, and once every 6.4 minutes for the latter). Clearly, the settlement speed of large transactions on opBNB is much faster than on Ethereum’s Layer2. Additionally, the maximum data size published by opBNB each time can reach 4.66 times that of Ethereum’s Layer2 (the former’s L1 block gas limit is 140 million, while the latter’s is 30 million). 8.53 * 4.66 = 39.74. This represents the gap between opBNB and Arbitrum in terms of TPS limit in practical implementation (currently, for security reasons, ARB seems to actively reduce TPS, but theoretically, if TPS were to be increased, it would still be many times lower compared to opBNB).


(Arbitrum’s sequencer publishes a transaction batch every 6-7 minutes)


(opBNB’s sequencer publishes a transaction batch every 1-2 minutes, with the fastest taking only 45 seconds)

Of course, there is another crucial issue to consider: gas fees on the DA layer. Each time L2 publishes a transaction batch, it pays a fixed cost of 21,000 gas regardless of the calldata size. If gas fees on the DA layer/L1 are high, the fixed cost of publishing each batch stays high, and the sequencer will publish batches less often. In addition, within L2 fees the execution layer’s cost is so low that it can usually be ignored, leaving the DA cost as the factor that matters for transaction fees.

In summary, publishing calldata of the same size consumes the same amount of gas on Ethereum and on BNB Chain, but the gas price on Ethereum is roughly ten to several dozen times higher than on BNB Chain. Translated into L2 transaction fees, users on Ethereum Layer2 currently also pay roughly ten to several dozen times more than users on opBNB. Overall, the differences between opBNB and Optimistic Rollups on Ethereum are quite apparent.
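
To make the fixed-cost argument concrete, here is a minimal sketch of how the 21,000-gas overhead is shared across a batch and how it adds up over time. The function names and the gas prices in the usage lines (30 gwei and 3 gwei) are hypothetical values chosen for illustration, not measurements; the batch intervals are the ones quoted in this article.

```python
BASE_TX_GAS = 21_000  # fixed gas per L1 transaction, independent of calldata size


def fixed_gas_per_l2_tx(txs_per_batch: int) -> float:
    """Share of the fixed 21,000 gas carried by each L2 transaction in a batch."""
    if txs_per_batch <= 0:
        raise ValueError("a batch must contain at least one transaction")
    return BASE_TX_GAS / txs_per_batch


def fixed_cost_per_hour(batch_interval_s: float, gas_price_gwei: float) -> float:
    """Fixed overhead in gwei paid per hour when publishing at the given interval."""
    batches_per_hour = 3600 / batch_interval_s
    return batches_per_hour * BASE_TX_GAS * gas_price_gwei


# Larger batches dilute the fixed cost per L2 transaction.
for size in (10, 100, 1000):
    print(size, fixed_gas_per_l2_tx(size))

# Hypothetical gas prices: 30 gwei on an expensive L1, 3 gwei on a cheap one.
print("expensive L1, batch every 384 s:", fixed_cost_per_hour(384, 30))
print("cheap L1, batch every 45 s:", fixed_cost_per_hour(45, 3))
```

Under these assumed prices, the cheaper chain can publish more than eight times as often and still pay a lower fixed overhead per hour, which is the trade-off the sequencer is balancing.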

(A transaction consuming 150,000 gas on Optimism costs $0.21)


(A transaction consuming 130,000 gas on opBNB costs $0.004)

However, although increasing the data throughput of the DA layer raises the overall throughput of the Layer2 system, it does little for the performance of an individual Rollup, because the execution layer often cannot process transactions fast enough. Even if the limits of the DA layer could be ignored, the execution layer becomes the next bottleneck for Rollup performance. If a Layer2’s execution layer is slow, excess transaction demand spills over to other Layer2s and ultimately fragments liquidity. Improving the performance of the execution layer is therefore also crucial; it is the next hurdle after the DA layer.

opBNB’s Boost in the Execution Layer: Cache Optimization

When most people discuss the performance bottlenecks of blockchain execution layers, they inevitably mention two important bottlenecks: the single-threaded serial execution of the EVM that fails to fully utilize the CPU, and the inefficient data lookup of the Merkle Patricia Trie adopted by Ethereum. In essence, the scaling strategies for the execution layer revolve around making more efficient use of CPU resources and ensuring that the CPU can access data as quickly as possible.

Optimization solutions for serial EVM execution and Merkle Patricia Trie are often complex and challenging to implement, while the more cost-effective efforts tend to focus on cache optimization. In fact, cache optimization brings us back to points frequently discussed in traditional Web2 and even textbook contexts.

Typically, the CPU retrieves data from memory hundreds of times faster than from disk. As a purely illustrative ratio, if fetching a piece of data from memory took 0.1 seconds, fetching it from disk would take 10 seconds (real latencies are far shorter, but the relative gap is what matters). Therefore, reducing the overhead of disk reads and writes, i.e., cache optimization, is an essential part of optimizing the blockchain execution layer.

In Ethereum and most other public chains, the database that records on-chain address states is stored entirely on disk, while the so-called World State trie is merely an index of this database, or a directory used for data lookup. Every time the EVM executes a contract, it needs to access relevant address states. Fetching data from the disk-based database one by one would significantly slow down transaction execution. Therefore, setting up a cache outside the database/disk is a necessary means of speeding up.
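
A rough way to see what such a cache buys is to compute the average access time for a given hit rate. The sketch below uses the article’s illustrative figures of 0.1 and 10 for memory and disk; the function name and the hit rates tried are assumptions made for the example.

```python
def average_access_time(hit_rate: float, cache_time: float, disk_time: float) -> float:
    """Expected time per state read when a fraction `hit_rate` is served from cache."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * cache_time + (1.0 - hit_rate) * disk_time


# Illustrative figures from the text: 0.1 for memory, 10 for disk.
for hit_rate in (0.0, 0.5, 0.9, 0.99):
    print(f"hit rate {hit_rate:.2f}: {average_access_time(hit_rate, 0.1, 10):.3f}")
```

With these numbers, a 90% hit rate cuts the average read from 10 to about 1.09, and a 99% hit rate brings it down to about 0.2, which is why the hit rate dominates execution speed.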

opBNB directly adopts the cache optimization solution used by BNB Chain. According to information disclosed by opBNB’s partner, NodeReal, the earliest BSC chain set up three layers of cache between the EVM and the LevelDB database storing the state. The design concept is similar to that of traditional three-level caches, where data with higher access frequency is stored in the cache. This allows the CPU to first search for the required data in the cache. If the cache hit rate is high enough, the CPU does not need to overly rely on the disk to fetch data, resulting in a significant improvement in the overall execution speed.
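
The lookup order described above can be sketched as a small tiered cache. This is not BSC’s actual implementation: the class names, the LRU eviction policy, the layer sizes, and the dictionary standing in for LevelDB are all assumptions made for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Small least-recently-used cache standing in for one cache layer."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)          # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)   # evict the least recently used entry


class TieredStateReader:
    """Checks each cache layer in order and falls back to disk only on a full miss."""

    def __init__(self, capacities, load_from_disk):
        self.layers = [LRUCache(c) for c in capacities]
        self.load_from_disk = load_from_disk
        self.disk_reads = 0

    def read(self, key):
        for depth, layer in enumerate(self.layers):
            value = layer.get(key)
            if value is not None:
                for faster in self.layers[:depth]:   # promote to the faster layers
                    faster.put(key, value)
                return value
        self.disk_reads += 1
        value = self.load_from_disk(key)
        for layer in self.layers:
            layer.put(key, value)
        return value


# A plain dict stands in for the on-disk state database (hypothetical data).
disk = {"0xabc": b"balance=1", "0xdef": b"balance=2"}
reader = TieredStateReader([2, 4, 8], disk.__getitem__)
reader.read("0xabc")
reader.read("0xabc")
print("disk reads:", reader.disk_reads)
```

The second read of the same address is served from the first layer, so only one disk read is counted.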

Later, NodeReal added a feature on top of this, which leverages the unused CPU cores to preemptively read the data that the EVM will need to process in the future from the database and store it in the cache. This feature is called “state preloading.”

The principle of state preloading is simple: blockchain nodes’ CPUs are multi-core, while the EVM operates in a single-threaded serial execution mode, utilizing only one CPU core, leaving other CPU cores underutilized. To address this, the CPU cores that are not used by the EVM can assist in tasks by predicting the data that the EVM will need from the yet-to-be-processed transaction sequence. These CPU cores outside the EVM will then retrieve the data that the EVM will need from the database, helping the EVM reduce the overhead of data retrieval and thus speeding up execution.
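
The idea can be sketched with a thread pool that warms a cache before the serial executor runs. This is a simplified illustration, not NodeReal’s code: the function names, the heuristic that a transaction touches only its sender and recipient, and the dictionary standing in for the database are all assumptions, and the prefetch here completes before execution starts rather than overlapping with it as it would in a real node.

```python
from concurrent.futures import ThreadPoolExecutor


def touched_addresses(tx):
    """Guess which accounts a transaction will read (hypothetical heuristic)."""
    return [tx["from"], tx["to"]]


def preload_state(pending_txs, read_from_db, cache, workers=4):
    """Use spare workers to fetch state the serial executor is likely to need."""
    addresses = {addr for tx in pending_txs for addr in touched_addresses(tx)}
    missing = [addr for addr in addresses if addr not in cache]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for addr, state in zip(missing, pool.map(read_from_db, missing)):
            cache[addr] = state


def execute_serially(pending_txs, read_from_db, cache):
    """Single-threaded loop; returns how many reads had to go to the database."""
    db_reads = 0
    for tx in pending_txs:
        for addr in touched_addresses(tx):
            if addr not in cache:
                cache[addr] = read_from_db(addr)
                db_reads += 1
        # ... the transaction itself would be executed here ...
    return db_reads


# A plain dict stands in for the state database (hypothetical data).
db = {"alice": 10, "bob": 5, "carol": 7}
txs = [{"from": "alice", "to": "bob"}, {"from": "bob", "to": "carol"}]

cold_reads = execute_serially(txs, db.__getitem__, {})
warm_cache = {}
preload_state(txs, db.__getitem__, warm_cache)
warm_reads = execute_serially(txs, db.__getitem__, warm_cache)
print("database reads without preloading:", cold_reads)
print("database reads with preloading:", warm_reads)
```

With preloading, the serial loop finds every account already in the cache, so the database reads move off the critical path of execution.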

With cache optimization and sufficient hardware configurations, opBNB has effectively pushed the performance of its node’s execution layer close to the limit of the EVM, processing up to 100 million gas per second. This 100 million gas is essentially the performance ceiling of the unmodified EVM, as demonstrated by experimental testing data from a prominent public chain.

To put it succinctly, opBNB can process up to 4761 simple transfers per second, 1500-3000 ERC20 token transfers per second, and approximately 500-1000 SWAP operations per second, based on transaction data observed on blockchain explorers. Comparing the current parameters, opBNB’s TPS limit is 40 times that of Ethereum, more than 2 times that of BNB Chain, and more than 6 times that of Optimism.
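
These per-second figures follow from dividing the gas ceiling by the gas each operation consumes. In the sketch below, 100 million gas per second is the ceiling quoted in this article and 21,000 gas is the fixed cost of a simple transfer; the gas ranges for ERC20 transfers and swaps are assumed values chosen to roughly match the quoted TPS ranges, not measured costs.

```python
GAS_PER_SECOND = 100_000_000  # execution-layer ceiling quoted in the article

# Gas per operation as (low, high). The ERC20 and swap ranges are assumptions.
GAS_PER_OPERATION = {
    "simple transfer": (21_000, 21_000),
    "ERC20 transfer": (33_000, 66_000),
    "swap": (100_000, 200_000),
}


def tps_range(gas_low, gas_high):
    """Return (min TPS, max TPS) for an operation costing between gas_low and gas_high."""
    return GAS_PER_SECOND // gas_high, GAS_PER_SECOND // gas_low


for name, (low, high) in GAS_PER_OPERATION.items():
    tps_min, tps_max = tps_range(low, high)
    print(f"{name}: {tps_min} to {tps_max} per second")
```

The simple-transfer case reproduces the 4761 figure exactly, since 100,000,000 divided by 21,000 is just under 4762.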

Of course, for Ethereum Layer2 solutions, the DA layer itself is so constrained that, once factors such as DA layer block time and stability are taken into account, actual performance falls well short of what the execution layer alone could deliver.

For opBNB, which sits on a high-throughput DA layer like BNB Chain, this more-than-twofold scaling effect is valuable, especially since BNB Chain can host multiple such scaling projects. It is foreseeable that BNB Chain has already made opBNB-led Layer2 solutions part of its strategic plans and will continue to onboard more modular blockchain projects, including introducing ZK proofs into opBNB and providing a high-availability DA layer through complementary infrastructure such as GreenField, in an attempt to compete or cooperate with the Ethereum Layer2 ecosystem.

In the era where layered scaling has become the trend, whether other public chains will also rush to support their own Layer2 projects remains to be seen, but undoubtedly, the paradigm shift towards modular blockchain infrastructure is already happening.

Disclaimer:

  1. This article is reprinted from [极客 Web3]. All copyrights belong to the original author [Faust, 极客web3]. If there are objections to this reprint, please contact the Gate Learn team, and they will handle it promptly.
  2. Liability Disclaimer: The views and opinions expressed in this article are solely those of the author and do not constitute any investment advice.
  3. Translations of the article into other languages are done by the Gate Learn team. Unless mentioned, copying, distributing, or plagiarizing the translated articles is prohibited.