The State Of Fraud Proofs In Ethereum L2s

This article provides a comprehensive analysis of the economic design of fraud proof systems in Ethereum rollups, using Arbitrum, Optimism, Cartesi, and Kroma as case studies, and highlights their importance to security.

1. Introduction

1.1. Where Do Optimistic Rollups Go?

In September 2024, Vitalik made a strong statement about raising the standards for rollups:

I take this seriously. Starting next year, I plan to only publicly mention (in blogs, talks, etc) L2s that are stage 1+, with maybe a short grace period for new genuinely interesting projects.

It doesn’t matter if I invested, or if you’re my friend; stage 1 or bust.

The Stage system for rollups is a framework used to roughly evaluate the security level of rollups, ranging from Stage 0 to Stage 2. Among the major rollups today, only Arbitrum and Optimism have reached Stage 1. Most other optimistic rollups are currently at Stage 0.

Given this situation, some questions arise:

  • It’s been 3 years since optimistic rollups were introduced, so why has no one reached the highest level, Stage 2?
  • When can we expect optimistic rollups to reach Stage 2?

This article aims to answer these questions by analyzing the fraud proof and challenge mechanisms of optimistic rollups and how each is working toward achieving Stage 2. Additionally, it explores the future of optimistic rollups and fraud proof systems.

1.2. Optimistic Rollup vs. ZK Rollup

Everyone knows that Ethereum is slow and expensive. The Ethereum community’s researchers and developers have worked continuously to address this problem. After exploring various solutions such as sharding and Plasma, the community settled on rollups as the primary path to scalability. As a result, many rollups like Arbitrum, Optimism, and zkSync have emerged. According to L2Beat, there are currently about 40 rollups, along with roughly 41 additional solutions such as Validiums and Optimiums that use alt-DA for higher scalability. Furthermore, approximately 80 new rollup chains are expected to launch.

(Current Landscape of L2s | Source: L2Beat)

The core concept of rollup is to execute transactions off-chain and then submit only the transaction data and the result state root to Ethereum, thereby achieving scalability. Users can deposit funds into specific bridge contracts on Ethereum, move the funds into the rollup, and make transactions within the rollup. Since the transaction data is submitted to Ethereum, and once finalized, it cannot be altered without compromising Ethereum’s security, rollups are said to “inherit Ethereum’s security.”

But is this really true? What if the proposer processing transactions within the rollup is malicious? A malicious proposer could manipulate Alice’s balance, transfer it to their own account, and then withdraw it to Ethereum, effectively stealing Alice’s funds.

To prevent this, an additional security mechanism is needed when withdrawing from the rollup to Ethereum. By providing proof to the Ethereum bridge contract that the withdrawal transaction was correctly processed and included in the L2 chain, the withdrawal can be completed.

One of the simplest methods, adopted by virtually every rollup, is to verify the hash of the withdrawal transaction against the rollup’s state root, proving that the withdrawal is correctly included in the rollup’s state. This requires both the withdrawal transaction and the state root to be submitted to Ethereum’s bridge contract: users submit their withdrawal transactions, while a validator calculates and submits the state root.
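To make the idea concrete, here is a minimal Python sketch of such an inclusion check over a toy binary Merkle tree. Real rollups verify Merkle-Patricia or storage proofs against the posted state root, so the hashing scheme and helper names below are purely illustrative.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def verify_inclusion(withdrawal: bytes, proof: list, root: bytes) -> bool:
    """proof: list of (sibling_hash, sibling_is_left) pairs from leaf to root."""
    node = h(withdrawal)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root

# Toy tree with two leaves: the withdrawal and one other transaction.
withdrawal = b"withdraw 1 ETH to alice"
other_leaf = h(b"some other tx")
state_root = h(h(withdrawal) + other_leaf)

print(verify_inclusion(withdrawal, [(other_leaf, False)], state_root))  # True
```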

However, if the validator submitting the state root acts maliciously and submits an incorrect root, it could compromise the security of user funds. To mitigate this risk, two main mechanisms have been proposed, resulting in the differentiation between Optimistic Rollup and ZK Rollup.

  1. ZK Rollup
    ZK Rollup not only requires the validator to submit the state root but also to provide a ZK proof verifying the correctness of the state root calculation. If the validator submits an incorrect state root, the ZK proof would fail validation by the L1 Verifier contract, preventing the submission of the malicious state root.
  2. Optimistic Rollup
    Optimistic Rollup allows the designated validator to submit the state root without additional safeguards, relying on the assumption that the submission is honest. However, if the submitted root is incorrect, anyone can challenge it and prevent it from being used in the withdrawal process. The challenger must submit proof to Ethereum demonstrating that the root is incorrect, known as a Fraud Proof.
    To ensure that a challenge can be resolved safely even under attacks such as L1 censorship, Optimistic Rollups enforce a withdrawal delay of about a week.

1.3. Why do we need Fraud Proof?

Unlike ZK Rollups, Optimistic Rollups operate under conditions where validators can submit incorrect state roots and attempt to manipulate withdrawals. Fraud proofs effectively prevent this, ensuring the safety of funds in the bridge contract.

Without a robust fraud proof mechanism, Optimistic Rollups would not fully inherit Ethereum’s security. For example, in Arbitrum’s current permissioned validator system, if all validators collude, they could potentially steal all funds in the bridge contract. Similarly, in OP Stack rollups like Base that have not yet deployed permissionless fault proofs on mainnet, a single malicious validator could steal funds.

Thus, Fraud Proofs play a crucial role in the security of Optimistic Rollups, and any system lacking a well-implemented Fraud Proof mechanism poses a risk to user assets.

This article evaluates the risks associated with various Optimistic Rollups and examines the implementation, strengths, and weaknesses of their fraud proof mechanisms.

1.4. Toward Stage 2: Removing Training Wheels

Fraud proof systems play a pivotal role in helping optimistic rollups achieve ‘Stage 2.’ The Stage framework, proposed by Vitalik and currently operated by L2Beat, is used to evaluate the security level of rollups.

In the Ethereum ecosystem, this Stage framework is often likened to learning how to ride a bicycle. A Stage 0 rollup, which relies on the most trust assumptions, is compared to a tricycle with training wheels, while a Stage 2 rollup, which fully inherits Ethereum’s security, is compared to a two-wheeled bicycle with the training wheels removed.

Here are more detailed criteria for each stage from Stage 0 to Stage 2:

As highlighted above, implementing a proper fraud proof and challenge mechanism is crucial for optimistic rollups to achieve Stage 1 or 2. Considering the criteria, a fraud proof system that meets the Stage 2 standards would have the following characteristics:

  • It should be complete and well-functioning, with no known defects, and provide the 1-of-N property (a single honest participant is enough to keep the system safe).
  • It must be a permissionless system, where anyone can submit proofs.
  • If there is a bug in the proof system, it should be provable on-chain.

In the latter part of the article, we will explore how various protocols are attempting to implement these features.

2. Fraud Proof - Concept and Misconception

2.1. How are Fraud Proofs implemented?

Fraud proofs provide onchain verifiable evidence that a submitted state root is incorrect, indicating that a specific state transition function within the L2 was improperly executed. A straightforward method involves generating proofs for executing all L2 blocks from the last confirmed state root to the current state root, demonstrating the incorrect state root. However, this approach is costly and time-consuming.

Thus, effective fraud proof generation narrows down the specific incorrect state transition before generating proofs for that segment. Most fraud proof protocols follow this approach.

Fraud proof and challenge protocols typically follow these steps (a simplified code sketch follows the list):

  1. Validators (asserters) periodically submit an output (or claim) containing L2 state root to Ethereum.
  2. If a validator (challenger) disagrees with an output, they initiate a challenge.
  3. The asserter and challenger identify the disputed segment through a process known as Bisection or Dissection. This narrows the disagreement down to a single instruction, or to a single block when ZK proofs are used.
  4. The challenger submits a fraud proof onchain to demonstrate the incorrect segment. Generally, protocols like Arbitrum and Optimism execute the suspicious instruction onchain for verification.
  5. If the fraud proof is validated, the incorrect output is removed or replaced. Depending on the challenge protocol, the asserter is slashed, and the challenger is rewarded.
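As a rough illustration of steps 3 to 5, here is a simplified Python sketch of a bisection-style dispute over a toy execution trace. The “VM,” the trace length, and all function names are invented for illustration and do not correspond to any production protocol.

```python
def run_step(state, step):
    """Toy state transition function: one 'instruction' of an imaginary L2 VM."""
    return state + step

def trace(start, n_steps, cheat_at=None):
    """Full execution trace (state after every step). `cheat_at` injects a fault."""
    states = [start]
    for i in range(n_steps):
        nxt = run_step(states[-1], i)
        if cheat_at is not None and i == cheat_at:
            nxt += 1_000  # malicious asserter corrupts the state here
        states.append(nxt)
    return states

def bisect(honest, claimed):
    """Narrow the disagreement down to a single step index (the dispute game)."""
    lo, hi = 0, len(honest) - 1          # lo: last agreed state, hi: disputed state
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if honest[mid] == claimed[mid]:
            lo = mid                      # both agree up to mid, dispute is later
        else:
            hi = mid                      # dispute is at or before mid
    return lo                             # first step whose result is disputed

def one_step_proof(agreed_state, step, claimed_next):
    """'On-chain' re-execution of the single disputed instruction."""
    return run_step(agreed_state, step) != claimed_next  # True => fraud proven

honest_trace  = trace(0, 64)
claimed_trace = trace(0, 64, cheat_at=37)   # asserter posts a bad state root
step = bisect(honest_trace, claimed_trace)
fraud = one_step_proof(honest_trace[step], step, claimed_trace[step + 1])
print(step, fraud)   # 37 True: challenger wins, the bad output is removed
```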

2.2. Common Misconception: Fraud Proof and Challenge do not rollback the chain

One important point is that even if a fraud proof and a challenge occur, the chain is not rolled back. What the fraud proof guarantees is that ‘funds within the deposited bridge contract cannot be maliciously withdrawn,’ and there is no rollback of the incorrect state transition.

The main reason for not rolling back is that there is no need for it. Fundamentally, when an incorrect state transition occurs within the rollup, the problem is that malicious actors could steal users’ funds from the bridge. To prevent this, it is sufficient to ensure that the state root posted to L1 remains correct. This has nothing to do with chain rollback, and the fraud proof and challenge mechanism are sufficient as long as they prevent the finalization of a malicious state root.

Moreover, if the proposer who posts the state root and the sequencer who generates blocks for the L2 chain are different entities, then there is no need for a rollback mechanism.

Therefore, even in a situation where a challenge is successfully resolved, the L2 chain is not rolled back; only the state root (output or claim) submitted to L1 is either deleted or replaced. If the fraud proof and challenge mechanisms work correctly, it ensures that users’ funds within the bridge are safe.

2.3. Real Example: Challenge at Kroma in April 2024

An actual challenge case shows that the rollup chain itself is not rolled back; only the output root is replaced or deleted. The only successful challenge known on mainnet so far occurred in April 2024 on Kroma, a hybrid rollup based on the OP Stack that uses a ZK fault proof.

Kroma is an OP Stack based rollup with its own ZK fault proof and permissionless validator system. On April 1, 2024, a problem occurred with the L1 origin of the Kroma sequencer, causing the sequencer to generate incorrect blocks. Additionally, the validators observing this chain submitted an incorrect output root. Immediately after the submission of the output root, a total of 12 challengers created challenges against it.

One of the challengers succeeded in calling the proveFault function, deleting the wrong output.

(Challenger successfully executed proveFault function | Source: etherscan)

This is the first successful challenge case in the history of Ethereum rollups on the mainnet. It is also the first successful verification and challenge of a fault proof in a mainnet environment approximately three years after the first Optimistic Rollup, Arbitrum, was launched in May 2021. The detailed overview of this challenge can be found in the article written by Kroma.

In this case, the Kroma chain did not undergo a rollback, but only the incorrect output root was deleted.

Disclaimer: Is it Fraud Proof or Fault Proof?

Fraud proof is also referred to as fault proof. Particularly in the Optimism and OP Stack chains, the term fault proof is used, while in Arbitrum, Cartesi, L2Beat, etc., the term fraud proof is used.

Considering the Kroma challenge case above, it can be inferred that challenges often arise from ‘mistakes’ rather than malicious attempts to manipulate withdrawals. In the above case, the main cause was an anomaly in the L1 client observed by the Kroma validators. In other words, challenges can arise simply due to validator errors or incorrect patches. In such cases, the term Fault Proof may be more appropriate.

However, the term that better reflects the purpose itself is fraud proof. All the mechanisms introduced so far, and those to be introduced in the future, aim to verify ‘fraudulent actions’ attempting to steal funds within the bridge through malicious outputs.

The point is that the purpose is to prevent fraud, but in practice challenges may be triggered by honest mistakes. In this article, I will use the term fraud proof, which is more widely used in the ecosystem.

3. Hack it! - Exploiting Fraud Proof Mechanisms

3.1. Designing Economic Dispute Protocol

Optimistic rollups have each implemented their own fraud proof and challenge mechanisms to protect user funds. What these mechanisms commonly aim for is the property that “as long as there is at least one honest participant, the protocol remains secure.” A fraud proof verifies whether the predetermined state transition function was executed correctly, so the verification process inevitably resolves in favor of the honest participant.

However, this does not always hold true, and in reality, there can be situations where the protocol is in danger even with the presence of an honest participant. For example, unexpected bugs may occur due to the complexity of fraud proof, and malicious participants may find themselves economically advantaged over honest participants due to misaligned incentives, leading to situations where user withdrawals are significantly delayed or funds are stolen.

For these reasons, designing fraud proof and challenge mechanisms is a very difficult task. Particularly, to become a Stage 2 rollup, the challenge mechanism must be perfect, and countermeasures against various attack vectors and loopholes must be in place.

In other words, each fraud proof and challenge mechanism embodies decisions about how to respond to attack vectors. Without understanding those attack vectors, it is hard to see why each protocol must be designed the way it is.

Thus, in this section, we will first examine the following attack vectors and explore how each protocol responds to them.

  • Attack vectors arising from vulnerabilities in the Dispute Game:
    • Delay attacks that delay user withdrawals for more than 7 days.
    • Sybil attacks that deplete the funds and resources of honest participants.
  • Attacks caused by censorship of L1 validators.
  • Attacks exploiting bugs within the Fraud Proof VM.

Note: The attack vectors discussed below are all publicly known and do not affect the security of any mainnets.

The protocols and their respective characteristics that will be examined in the following sections are as follows:

3.2. Attack Vector #1: Exploiting Economic Dispute Game

Most optimistic rollups that have implemented fraud proof mechanisms rely on bisection to find the first point of disagreement. It is important for the protocol to provide incentives that encourage participants to act honestly.

One of the easiest ways to achieve this is to have participants stake a certain amount of funds (bond) when taking actions and slash the bond if they are deemed to have acted maliciously.

Considering game theory, the protocol must ensure that the funds consumed by malicious participants to attack are greater than the funds consumed by honest participants to defend. However, this is very difficult to achieve.

The key reason is that, in a game context, it is impossible to know in advance who the malicious participant is without running the challenge to completion. In other words, the asserter who submitted the output may be malicious, or the challenger who challenged the output may be malicious. Therefore, the protocol must be designed under the assumption that either side could be malicious. Moreover, since there can be various attack vectors, designing the protocol becomes an exceedingly complex task.

Also, because each protocol adopts different mechanisms, the attack vectors corresponding to each method and the attacker’s incentive model must be defined. Additionally, an economically secure model must be designed to remain safe even when these are combined.

This remains a topic of ongoing discussion. In this section, we will analyze attack vectors that could generally occur and the incentives of participants within those scenarios. Additionally, we will explore how each protocol responds to these and how effectively they limit such incentives.

3.2.1. Attack Vector #1-1: Delay Attack

A delay attack refers to an attack where a malicious entity does not aim to steal rollup funds but rather to prevent or delay the output from being confirmed on L1. This attack can occur in most current optimistic rollups, adding extra delay to withdrawals so that it takes users more than a week to withdraw their funds to L1.

This is slightly different from attacks caused by the censorship of L1 validators, which will be discussed later. Censorship prevents honest participants from taking any action on Ethereum, allowing a malicious state root to be finalized. On the other hand, a delay attack can delay the finalization of the state root even when honest participants are actively engaged. In such cases, not only can user withdrawals be delayed, but if the attacker has more funds than the defender, the malicious state root may be finalized, leading to the theft of user funds.

One of the simplest ways to prevent delay attacks is to require participants in the challenge system to stake a certain amount of funds or bond, which can be slashed if they are deemed to be causing delays.

However, there are considerations to take into account. What if the attacker is willing to have their funds slashed and still attempts a delay attack?

This attack vector is quite tricky to handle. This is also why Arbitrum’s fraud proof system currently operates in a permissioned structure.

The fraud proof mechanism currently deployed on Arbitrum One, known as Arbitrum Classic, uses a branching model. Rather than simply allowing participants to challenge incorrect claims, each participant submits what they believe to be the correct claim along with a certain amount of funds, treating these as “forks of the chain.” Claims can also be thought of as checkpoints on the chain’s state.

(Branching model of Arbitrum)

In Arbitrum Classic, participants will submit claims and chain branches they believe are correct, and through challenges, incorrect chain branches are gradually removed, eventually confirming the correct claim.

However, a single challenge cannot determine who is correct. Two malicious participants may proceed with bisection in the wrong way, defining an unrelated point as the disagreement point, and eliminating the correct claim. Therefore, Arbitrum ensures that challenges are continuously carried out until no participants have funds staked on a particular claim, guaranteeing that the challenge is resolved successfully if there is at least one honest participant.

This can be exploited for delay attacks. Suppose that, alongside the honest participants, N-1 malicious attackers also stake funds on the correct claim, while one more attacker stakes funds on an incorrect claim. If the attackers can always include their transactions before the honest participants, they get to run the challenges first. In the worst case, they carry out the bisection incorrectly, bisecting a portion they agree on instead of the portion where they disagree, and present a fraud proof on that irrelevant part. Naturally, this proof passes, and the attacker who staked on the correct claim deliberately loses.

Since each challenge can take up to 7 days, the attackers can delay the protocol by up to 7 * (N-1) days.

(Delay attack at Arbitrum Classic | Source: L2Beat Medium)

The issue with this mechanism is that the cost of delay attacks scales linearly with the time the protocol is delayed. If an attacker finds the attack profitable, they will want to delay the protocol as long as possible, and the total delay time will be proportional to the attacker’s total amount of funds, potentially causing very long delays in user withdrawals.
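As a back-of-the-envelope illustration of this linear scaling, the snippet below computes the delay and the attacker’s outlay for a few values of N, assuming a hypothetical bond of 1 ETH per stake (the bond size is my assumption, not Arbitrum Classic’s actual parameter).

```python
CHALLENGE_PERIOD_DAYS = 7

def classic_delay(num_malicious_stakers, bond_eth):
    """Delay and attacker cost when each malicious staker loses one sequential challenge."""
    delay_days = CHALLENGE_PERIOD_DAYS * num_malicious_stakers
    attacker_cost_eth = num_malicious_stakers * bond_eth   # bonds burned one by one
    return delay_days, attacker_cost_eth

for n in (1, 5, 20):
    print(n, classic_delay(n, bond_eth=1.0))
# 1 (7, 1.0)   5 (35, 5.0)   20 (140, 20.0): delay grows 1:1 with funds spent
```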

In conclusion, a fraud proof protocol that can effectively defend against delay attacks must be designed such that the maximum delay time is bounded to a certain amount, or the cost of conducting the delay increases exponentially over time, making the cost of executing the attack greater than the incentive to do so.

3.2.2. Attack Vector #1-2: Sybil Attack (Exhaustion Attack)

Another attack vector is the Sybil Attack (Exhaustion Attack, Proof of Whale Attack). This can occur when an attacker has more funds or computing resources than the defender. The attacker can continuously submit incorrect output roots or create meaningless challenges, exhausting the defender’s funds or computing resources. At some point, the defender will run out of funds or idle computing resources, making them unable to defend, and the attacker will finalize the incorrect output root and steal the funds.

Typically, the above attack vector can occur in a permissionless system in the following two ways:

  1. Continuously submitting incorrect outputs.
    Suppose attacker Bob has more money than honest participants (defenders) Alice, Charlie, and David combined. In this case, Bob continuously submits incorrect output roots. Honest participants Alice, Charlie, and David will respond by paying gas fees and bonds, and at a certain threshold, the honest participants’ funds will run out before Bob’s. At this point, Bob submits another incorrect output, and since there are no longer any honest participants with remaining funds in the network, the output will finalize without challenge. In this way, Bob can steal funds from an optimistic rollup.
  2. Submitting multiple challenges to an honest output.
    Conversely, a malicious participant may attack honest participants by submitting multiple challenges. Similarly, the attack will continue until the honest participants exhaust all their funds on gas fees and bonds, and the malicious attacker will then submit an incorrect output and steal the users’ funds from the bridge.

To prevent such attacks, the defender’s advantage over the attacker must be designed carefully; in all situations, the defender must hold a sufficient advantage. One way to do so is to set the bond carefully: since Sybil attacks depend on the total amount of funds available to each participant, a properly sized bond makes it possible to establish that “the system is safe from Sybil attacks unless the attacker’s total funds are N times greater than the defender’s total funds.”
The other known way to prevent Sybil attacks is to implement a Sybil-resistant dispute protocol. This will be explained further in the following section on Cartesi’s Dave.

Let’s take a look at how each protocol responds to these delay and sybil attacks through their respective designs.

3.3. Solution #1: Economically Sound Dispute Game

1) Arbitrum BoLD

BoLD, building on the branching model of the original Arbitrum Classic, introduces the following three elements to prevent the vulnerabilities of delay attacks:

  • All-vs-All challenge mechanism.
    In BoLD, challenges are no longer conducted 1-on-1, but take the form of a concurrent All-vs-All system where all participants can stake their bonds on the branch they agree with. This prevents the delay attack vector that arose from the previous challenge mechanism where 1-on-1 challenges were conducted sequentially, and ensures that multiple, separate challenges for the same dispute cannot occur.
  • Prevention of malicious bisection through proof of correct state computation (history commitment).
    The issue in Arbitrum Classic was that malicious participants could intentionally cause delays by bisectioning in a way that marked non-controversial sections as disputed points. To prevent this, BoLD requires the submission of proof, along with the state root, to verify that the state root was correctly computed during the bisection process, ensuring that no malicious bisection has taken place.
    In BoLD, participants must submit proof along with the state root during the bisection process. This proof verifies that the current state root was correctly computed based on the state root submitted in the previous claim. If a malicious participant attempts to submit an arbitrary root unrelated to the previously submitted state root during bisection, the proof verification will fail, causing the bisection transaction to fail as well. This effectively ensures that only one type of bisection is possible for each claim.
    Therefore, if an attacker wants to carry out multiple bisections against an honest claim in BoLD, they must submit multiple claims.
    However, generating this proof requires validators to use quite a bit of computing resources. Internally, creating this proof requires generating hashes for all states within the bisection, which is typically estimated to be around 2^70 (approximately 1.18 x 10^21) hashes in Arbitrum. To address this, BoLD splits the challenge into three levels, reducing the number of hashes that need to be computed to 2^26 (about 6.71 x 10^7).

(This figure assumes a total of 2^69 instructions; actual figures may vary)

  • Limiting the challenge period through the chess clock mechanism.
    In the previous Arbitrum Classic, there was no time limit on how long a challenge could proceed, allowing malicious participants to delay the protocol indefinitely as long as they had enough funds. BoLD introduces a chess clock mechanism to effectively limit the duration of a challenge.
    Let’s assume there are two participants who submitted different claims. Each is given a timer (chess clock) with 6.4 days of time. This timer begins to count down when it is a participant’s turn to submit a bisection or proof and stops once the participant completes their task.
    Since each participant is given 6.4 days of time, the maximum amount of time one participant can delay the process is 6.4 days. Therefore, in BoLD, challenges can last a maximum of 12.8 days (with an additional 2 days under certain circumstances when the Security Council intervenes).

Through these mechanisms, Arbitrum BoLD effectively limits delays caused by challenges. The maximum duration of a challenge is two weeks, and the maximum additional delay users may experience is approximately one week.
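The chess-clock accounting above can be sketched as follows. The 6.4-day per-participant budget and the roughly 12.8-day worst case are the article’s figures, while the data structures and turn sequence are invented for illustration.

```python
DAY = 24 * 60 * 60
BUDGET = 6.4 * DAY                        # per-participant chess clock in BoLD

def challenge_duration(turn_times):
    """turn_times: list of (party, seconds_spent) in move order."""
    spent = {"asserter": 0.0, "challenger": 0.0}
    for party, seconds in turn_times:
        spent[party] += seconds           # a clock only runs on its owner's turn
        if spent[party] > BUDGET:
            return f"{party} timed out and loses the challenge"
    return f"resolved in {sum(s for _, s in turn_times) / DAY:.1f} days"

# Worst case: both sides burn (almost) their entire budget -> ~12.8 days total.
print(challenge_duration([("asserter", BUDGET), ("challenger", BUDGET)]))
```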

However, this can be exploited for delay attacks. A malicious participant can create a challenge and collude with L1 validators to censor the honest validator on Arbitrum, delaying Arbitrum users’ withdrawals by up to one week. In this scenario, users who request withdrawals within this timeframe may experience opportunity costs due to having their funds tied up for an additional week. Although this is not an attack where the attacker directly profits from the funds, it should still be prevented since it imposes opportunity costs on users. Arbitrum BoLD is addressing this issue by setting the bond required for creating a challenge high enough to deter such attacks.

Arbitrum calculates this amount in the Economics document of BoLD. The main reason for delay in the protocol is the censorship of L1 validators. In the case of a delay attack, the scenario would unfold as follows:

  1. The attacker submits a claim N’ that disagrees with an existing claim N before it finalizes on Arbitrum.
  2. The defender tries to send bisection txs, but it fails since L1 validators are censoring the challenge transactions from the defender.
  3. Since BoLD has an assumption that censorship cannot last over 7 days, this can delay the finalization of claim N for up to a week.

The attacker’s profit comes from the opportunity cost incurred by users who have requested withdrawals from the challenged output. The worst-case scenario is when all funds in Arbitrum are requested for withdrawal in one output, and in this case, the opportunity cost incurred by users is calculated as follows, assuming Arbitrum One has a TVL of $15.4B and an APY of 5%.

cost_opp = 15,400,000,000 x (1.05^(1/52) - 1) ≈ $14,722,400

Because submitting an incorrect claim can impose such a high opportunity cost, claim submitters in BoLD are required to submit a bond of a similar magnitude. Currently, the bond required for claim submission in BoLD is set at 3,600 ETH, which amounts to approximately $9.4M.

This is to preemptively prevent the attacker from causing significant losses to the system through delays. Since the attacker will lose their bond in a challenge, they can cause up to $14.7M in opportunity costs but will forfeit about $9.4M in funds. Thus, BoLD disincentivizes delay attacks by requiring bonds comparable to the worst-case opportunity cost.
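This calculation can be reproduced with a few lines of code. The compounding convention and the roughly $2,600/ETH price used to convert the 3,600 ETH bond are my assumptions, so the opportunity-cost figure comes out slightly below the article’s quoted ~$14.7M.

```python
TVL_USD = 15_400_000_000      # Arbitrum One TVL assumed in the article
APY = 0.05                    # yield users forgo while funds are stuck
WEEKS_DELAYED = 1

opportunity_cost = TVL_USD * ((1 + APY) ** (WEEKS_DELAYED / 52) - 1)
bond_usd = 3_600 * 2_600      # 3,600 ETH bond at an assumed ~$2,600/ETH

print(f"worst-case opportunity cost ≈ ${opportunity_cost:,.0f}")  # ≈ $14.5M
print(f"bond forfeited by attacker  ≈ ${bond_usd:,.0f}")          # ≈ $9.4M
```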

However, the 3,600 ETH bond size is not set solely because of delay attacks. To defend against Sybil attacks, Arbitrum BoLD is designed so that the system remains safe as long as the attacker’s total funds are no more than 6.5 times the defender’s total funds, and this is how the bond amount of 3,600 ETH was determined.

From the perspective of a sybil attack, the following attack scenario could occur in Arbitrum BoLD. BoLD’s challenge system consists of three levels, and users must lock funds to submit the claim they believe to be correct.

Let’s assume that honest participant Alice submits a valid claim with X ETH. Malicious participant Bob, who has 3,600 ETH, could create multiple malicious claims. Alice would then need to lock Y ETH for each claim at a lower level to counter them.

In Arbitrum’s branching model, locking funds implies agreement with the chain state from the genesis to the claim. This feature allows participants to move their staked funds from claim A to its children, A’ and A’’. Thus, Alice would move her initially staked X ETH to lower levels and lock Y ETH for each of Bob’s malicious claims.

What happens if Bob has significantly more money than Alice? Bob can generate countless malicious claims until Alice runs out of funds to lock. At this point, Alice can no longer proceed with the bisection, allowing Bob to confirm an incorrect claim.

Ultimately, this issue boils down to how much more advantageous the defender should be compared to the attacker in the game.

Arbitrum expresses this metric as the resource ratio. It indicates how much more advantageous the honest participant is compared to the malicious participant, and it is expressed in terms of the gas fees (G) and staking amounts (S) that each side must spend, as the ratio of the attacker’s total cost to the defender’s total cost.

BoLD’s challenge system is divided into three levels, and by maintaining this resource ratio at every level, it guarantees that the defender consistently has N times the advantage over the attacker across the entire system. Arbitrum has calculated the required bond size at the top level based on this resource ratio and created a graph.

(Top-Level Dispute Bond Cost vs. Resource Ratio at Arbitrum BoLD | Source: Desmos)

According to this graph, when the resource ratio is 100x, the required bond at the top level exceeds 1 million ETH (over $2.6 billion). While a higher resource ratio makes the system more secure against Sybil attacks, the bond becomes so large that hardly anyone can participate in the system, making it no different from a centralized system where only one validator submits claims.

Therefore, in BoLD, the resource ratio is set to 6.5x, making the bond at the top level 3,600 ETH, and the bonds at level 1 and level 2 are set to 555 ETH and 79 ETH, respectively.

In summary, BoLD defends against sybil attacks by calculating the resource ratio and setting the bond amount such that the defender has a 6.5x advantage over the attacker.

2) Cartesi Dave

Cartesi’s Dave was first proposed in a paper titled Permissionless Refereed Tournaments published in December 2022, before the first whitepaper of BoLD. It aims to ensure that the honest participant keeps an advantage over the attacker in both computing resources and funds. Dave is structured similarly to BoLD and has two key features:

  • Prevention of malicious bisection through proof of correct state computation (history commitment).
    Like BoLD, Dave requires participants to generate proof during bisection to show that they performed the computation correctly, preventing malicious forms of bisection. Accordingly, Dave’s challenge system is also divided into multiple levels to save validators’ resources.
  • 1-vs-1 sequential challenge mechanism in a tournament structure.
    Dave’s challenges are not conducted all at once but rather proceed in a tournament format, as shown in the figure below.

The above figure shows how a challenge proceeds when a malicious attacker submits seven incorrect claims against the network. Due to the nature of the history commitment, honest participants who agree with the correct claim, shown in green, are grouped together as a team. In Dave, they are grouped into a tournament format and placed as shown in the figure, with each participant engaging in 1-vs-1 challenges. Challenges at the same stage are conducted concurrently, and after one week, when the challenge is completed, the winners move on to the next stage. In the figure, the team of honest participants must undergo three rounds of challenges to win the tournament.

This feature is highly effective in preventing sybil attacks. First, the attacker must create multiple claims to execute the sybil attack, and each consumes the attacker’s computing resources and funds in a significant way.

Cartesi’s paper proves that defenders always maintain an exponential advantage over attackers in any situation. In other words, Dave guarantees that sybil attacks can be defended against with logarithmic resources relative to the number of attackers. This makes it very difficult to execute a sybil attack in Dave, and as a result, the bond size in Dave is set to a minimal 1 ETH, much smaller than in BoLD.

However, Dave is vulnerable to delay attacks. Each stage of the tournament consumes one unit of challenge time (one week), so the more malicious claims there are, the longer the protocol delay will be. The time it takes to fully resolve a challenge in Dave can be expressed by the following formula,

T_d = 7 x log2(1 + N_A) days

where N_A represents the number of malicious claims. However, Dave’s challenges can be composed of multiple levels to generate history commitments efficiently. Here, malicious participants can generate N_A malicious claims at each level of the challenge, which increases the total delay time as follows:

T_d = 7 x [log2(1 + N_A)]^L days

where L represents the number of levels in each challenge. If, as in the figure above, there are seven malicious claims and L is 2, the full resolution of the challenge could take up to 9 weeks, and users would experience an additional withdrawal delay of about 2 months. If the number of levels increases or the number of malicious claims grows, the delay could extend to several months.
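Plugging numbers into this formula reproduces the example above:

```python
import math

def dave_delay_days(n_malicious, levels):
    """T_d = 7 x (log2(1 + N_A))^L, the delay formula quoted above."""
    return 7 * math.log2(1 + n_malicious) ** levels

print(dave_delay_days(7, 1))    # 21.0 days: three one-week tournament rounds
print(dave_delay_days(7, 2))    # 63.0 days: ~9 weeks, the example in the text
print(dave_delay_days(15, 2))   # 112.0 days: delays grow with claims and levels
```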

Cartesi aims to solve this issue using ZK, which will be discussed in detail in section 4. Possible Improvement.

3) Optimism Fault Proof (OPFP)

OPFP is a permissionless challenge protocol currently applied on the OP Mainnet and has the following characteristics:

  • All-vs-All concurrent challenge mechanism using a Game Tree
    OPFP allows anyone to submit an output (root claim) at any time. Validators who disagree with the submitted output can initiate a bisection process by challenging it.

(Architecture of OPFP Game Tree and Bisection Process | Source: Optimism docs)

Bisection is conducted concurrently on a Game Tree structured as shown in the figure above. The leaves of the tree represent L2 states, and every node corresponds to an L2 state, namely the rightmost leaf of its subtree, with the rightmost leaf of the whole tree representing the latest L2 state. For example, submitting a claim at Node 1 is equivalent to claiming the state at Node 31.
This structure allows for the representation of bisection. For instance, if a validator disagrees with the Root claim (Node 1), they would submit a claim at Node 2, which corresponds to Node 23 in the tree, as it is the midpoint between Nodes 16 and 31. The submitter of Node 1 would then check the L2 state at Node 23 and either submit Node 6 (Node 27) if they agree or Node 4 (Node 19) if they disagree, continuing this process until the disagreement is found.
Even if multiple directions of bisection exist within one game, they can all proceed simultaneously, and anyone, not just the output submitter, can participate in the bisection process.

(Full Architecture of OPFP Game Tree | Source: Optimism docs)

The Game Tree used in OPFP is nested, with the upper tree handling bisection at the block level and the subtree below handling bisection at the instruction level.
Unlike BoLD or Dave, OPFP does not enforce correct bisection through history commitment, as the off-chain/on-chain costs of generating and submitting such commitments would be high.
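The index arithmetic in the example above can be sketched with a small helper. The function below assumes the depth-4 example tree with leaves 16 through 31 and is a simplification for illustration, not OPFP’s actual position encoding.

```python
MAX_DEPTH = 4                        # leaves of the example tree are nodes 16..31

def depth(node):
    return node.bit_length() - 1

def trace_leaf(node):
    """Rightmost leaf under `node`, i.e. the L2 state its claim refers to."""
    remaining = MAX_DEPTH - depth(node)
    return (node + 1) * 2 ** remaining - 1

print(trace_leaf(1))   # 31 -> root claim commits to the latest L2 state
print(trace_leaf(2))   # 23 -> the challenger's first counter-claim
print(trace_leaf(6))   # 27 -> next move if the original submitter agrees at Node 2
print(trace_leaf(4))   # 19 -> next move if the original submitter disagrees
```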

  • Customizable dispute games based on modularity
    Currently, there are only two types of dispute games (Permissionless / Permissioned) live on the OP Mainnet. Optimism aims to eventually introduce various types of dispute games and has implemented the minimum interface to support this. By adhering to the specified function names and arguments, one can create a custom dispute game.
  • Challenge time limitation through a chess clock
    In OPFP, when a challenge occurs, both the asserter and the challenger are given a clock with time allocated for bisection. Each time a claim is made, the clock starts running for the opposing party. Optimism refers to this as “inheriting the clock of the grandparent.”
    Interestingly, each participant is given 3.5 days, not 7 days, which means that if no one challenges an output, it will be finalized within 3.5 days.
    However, this does not allow for immediate withdrawals. After an output is finalized, OPFP has a 3.5-day guardian period during which the Security Council can intervene to invalidate an incorrect output if necessary.

(User Withdrawal Process in Happy Path | Source: OP Labs Blog)

Based on these mechanisms, OPFP, like other optimistic rollups, guarantees that withdrawals can be made at least 7 days after submission. However, if a challenge occurs, it may take more than 7 days for users to withdraw through that output. OPFP’s chess clock model limits the time each participant can spend on bisection, but it does not strictly limit the total time until the challenge is resolved.

This raises the question: Could a user’s withdrawal be delayed for more than a week if a challenge occurs on OPFP, similar to BoLD? The answer is “yes.” Unlike BoLD or Dave, Optimism provides options for users to handle situations where a challenge occurs, based on the unique characteristics of the protocol.

OPFP operates on the assumption that “a participant who submits an incorrect claim loses their bond.” However, there is one edge case in OPFP where this assumption is broken, known as the “freeloader claim.” This can occur in the following scenario:

  1. Alice submits a claim with a correct state root.
  2. Bob submits a counterclaim, and Alice makes a move to defend her original claim.
  3. Bob waits until his clock has almost run out (3.5 days), then challenges his own claim.

At this point, Alice should respond and claim Bob’s bond, but she inherits the time left on Bob’s clock, which may be insufficient for her to counter his claim. Thus, Bob may avoid losing his bond by submitting a “freeloader claim.”

(Freeloader Claim at Optimism Fault Proof | source: L2Beat)

While this doesn’t prevent the proper resolution of a challenge, it does represent a case where “an incorrect claim is submitted without bond being slashed,” which should be prevented from an incentive perspective.

Therefore, OPFP addresses this by resetting the clock to 3 hours whenever the asserter’s or the challenger’s remaining time falls below 3 hours, ensuring there is enough time to counter freeloader claims. However, if a participant then lets those 3 hours pass without making their next move, their clock expires and the challenge ends.

We can imagine a scenario where this mechanism is exploited for delay attacks. Suppose honest participant Alice submits a correct output, and from the moment Alice submits, time starts running on the challenger’s clock. Malicious participant Bob waits until 1 second before the challenger’s clock expires and then submits an incorrect output. The rules of OPFP then extend Bob’s time to 3 hours. Alice will respond, and Bob will continue using the extra 3 hours provided for each bisection.

This could delay the resolution of the challenge. The maximum time Bob can add is 3.5 days plus 3 hours for each of his bisection moves. OPFP’s MAX_GAME_DEPTH is 73, so with roughly 36 such moves the longest Bob could delay the process is 3.5 days + 3 hours x 36 ≈ 8 days. If Alice were to act similarly to delay the challenge, the bisection process could take 16 days.
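The arithmetic behind these worst-case figures is straightforward; the assumption that roughly half of the 73 moves (about 36) belong to one party follows the text above.

```python
CLOCK_DAYS = 3.5            # per-participant chess clock in OPFP
EXTENSION_HOURS = 3         # clock reset granted near expiry
ATTACKER_MOVES = 36         # roughly half of MAX_GAME_DEPTH = 73

one_sided_delay = CLOCK_DAYS + EXTENSION_HOURS * ATTACKER_MOVES / 24
print(one_sided_delay)        # 8.0 days if only one side stalls every move
print(one_sided_delay * 2)    # 16.0 days if both sides stall every move
```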

Does this mean users wouldn’t be able to withdraw for 16 days? In practice, no, due to Optimism’s withdrawal logic.

Unlike Arbitrum, where withdrawals must prove inclusion in a specific L2 block, OP Stack uses a storage proof mechanism, where the withdrawal request is recorded in the L2ToL1MessagePasser contract on L2. This means that even if a long challenge occurs for a specific output, users can wait for the next output to finalize and withdraw based on the contract storage root included in that output. Therefore, users are not forced to experience long delays even if the block they requested withdrawal from is challenged, as they can use the next output.

However, this only holds true if users act quickly. In most cases, users may still experience several days of delay. This can be attributed to the withdrawal process in OP Stack, which involves the following three steps:

  1. Initiating the withdrawal (initiateWithdrawal) on L2.
  2. Proving the withdrawal (proveWithdrawalTransaction) on L1 for the output that includes the withdrawal.
  3. Waiting one week of proof maturity delay before finalizing the withdrawal (finalizeWithdrawalTransaction).

The key point is that users must wait one week between proving the withdrawal and finalizing it. If Alice proves her withdrawal on output B and a challenge occurs, she can send another proof for output C and finalize the withdrawal after a week. In this case, Alice will only experience the delay between outputs B and C.

Therefore, users who are unaware of the challenge creation or respond late may experience up to 9 days of additional withdrawal delays.

Furthermore, there is an additional delay attack vector in OPFP, where every output is challenged consecutively. In this case, users cannot bypass the delay by proving on the next output, causing the entire protocol to be delayed. OPFP counters this by requiring participants to stake bonds at every bisection level, with the bond amount increasing exponentially as shown in the diagram below.

(Amount of OPFP bond | Source: Optimism docs)

In other words, the longer an attacker tries to delay the challenge resolution in OPFP, the greater the cost due to the exponential increase in bond requirements, reducing the incentive for delay attacks over time. Additionally, since outputs can be submitted at any time in OPFP, it is difficult for the attacker to estimate the resources required to conduct a delay attack. The initial bond is set to 0.08 ETH, and the total bond that must be posted over a full-depth challenge is up to ~700 ETH.
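As a hedged sketch of what such an exponential schedule looks like, the snippet below solves for a geometric growth factor consistent with the two figures quoted above (0.08 ETH for the first move, roughly 700 ETH in total over a full-depth game). This is only an illustration of exponential scaling, not Optimism’s actual bond curve.

```python
MAX_GAME_DEPTH = 73
FIRST_BOND = 0.08          # ETH required for the first counter-claim
TARGET_TOTAL = 700.0       # ETH, approximate total over a full challenge

def total_bond(growth):
    """Sum of bonds over all depths under a geometric growth assumption."""
    return sum(FIRST_BOND * growth ** d for d in range(MAX_GAME_DEPTH))

# Binary-search the per-depth growth factor that reaches the quoted total.
lo, hi = 1.0, 2.0
for _ in range(60):
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if total_bond(mid) < TARGET_TOTAL else (lo, mid)

growth = (lo + hi) / 2
print(f"growth per depth  ≈ {growth:.3f}x")
print(f"bond at max depth ≈ {FIRST_BOND * growth ** (MAX_GAME_DEPTH - 1):.1f} ETH")
```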

In summary, OPFP leaves the length of delay up to the user’s response in the event of a single challenge, and exponential bond requirements are used to counteract delay attacks caused by consecutive challenges.

However, OPFP is vulnerable to sybil attacks. In OPFP, if the attacker has more funds than the defender, a sybil attack is possible.

The following sybil attack vectors are possible in OPFP, both of which could lead to the theft of user funds:

  1. The attacker creates multiple challenges, causing the defender to use all their funds on bonds and gas fees.
  2. The attacker continuously submits incorrect outputs, forcing the defender to respond until they deplete their funds on bonds and gas fees.

This is possible in OPFP because the total bond amount required by both the attacker and defender throughout the challenge process is nearly the same, and the defender does not use significantly fewer resources (e.g., gas fees or computing power) than the attacker.

However, this does not mean that user funds on the current OP Mainnet are at risk. OPFP is still in Stage 1, and the Security Council has the authority to correct any improper outcomes. Therefore, even if such attacks occur, the Security Council can intervene to protect user funds on the OP Mainnet bridge.

To move OPFP to Stage 2, however, Optimism must modify the mechanism to ensure that the defender has a greater advantage than the attacker. Optimism is preparing Dispute Game V2 to address this, and more details will be explained in section 4. Possible Improvement.

4) Kroma ZK Fault Proof (Kroma ZKFP)

Kroma is an L2 based on the OP Stack, and before OPFP was introduced, it launched a permissionless ZK Fault Proof system on its mainnet in September 2023. Kroma ZKFP has similar characteristics to OPFP but stands out in that it generates block-level proofs using ZK and utilizes dissection instead of bisection, significantly reducing the number of interactions required in the challenge process. The key features of Kroma ZKFP are summarized as follows:

  • Reduction of interactions through ZK and dissection
    Kroma ZKFP allows participants to find points of disagreement within four interactions. When a challenge is initiated, it covers the roughly 1,800 blocks spanning from the previous output to the challenged output. Instead of bisection, where the range is split in half, the asserter and challenger divide the range into N parts using dissection. The process works as follows:

After each participant submits two transactions, they will have identified the blocks they disagree on, and the challenger can generate a ZK fault proof to demonstrate that the asserter’s claim was incorrect.
In Kroma ZKFP, the bisection timeout is set to 1 hour, and the ZK proof generation has an 8 hour timeout.
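As a rough sanity check on these parameters, if we assume each of the four dissection transactions splits the remaining range into N equal segments (an equal-split assumption of mine, not Kroma’s exact parameterization), narrowing 1,800 blocks down to a single disputed block requires N^4 ≥ 1,800:

```python
import math

SPAN_BLOCKS = 1_800
DISSECTION_TXS = 4            # two dissection transactions per participant

min_parts = math.ceil(SPAN_BLOCKS ** (1 / DISSECTION_TXS))
print(min_parts, min_parts ** DISSECTION_TXS)   # 7 2401 -> 7 parts per round suffice
```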

  • Decentralization of validators through an incentive mechanism
    Both BoLD and OPFP provide incentives for challenge winners but do not provide specific incentives for output submitters, and essentially anyone who wants to withdraw can submit an output and become a validator. However, it is impractical for users who wish to withdraw to operate a validator client themselves, and someone must regularly submit outputs to maintain liveness. Since this is a resource-consuming task that requires gas fees for output submission and validator client operation costs, without proper incentives, only a few people might participate as validators, which could lead to centralization and inadequate responses in failure scenarios.

To prevent this, Kroma has modified the OP Stack to distribute half of the gas fees generated by the chain to validators who submit outputs. Furthermore, Kroma plans to transition this reward mechanism to its native token, KRO, after the TGE, and it aims to introduce a DPoS-like validator system to allow regular users to contribute to the security and liveness of the chain without running their own clients.
The bond amount in Kroma is currently set at 0.2 ETH, ensuring that it is greater than the cost of generating the ZK proof and conducting the bisection. This bond will also transition to being staked in KRO within the future validator system.

  • Concurrent 1-vs-1 challenge system
    To ensure a fair and consistent distribution of incentives, Kroma has fixed the output submission interval to 1 hour, and validators are randomly selected from a pre-registered set to act as the asserter. This prevents excessive competition that could lead to wasted gas fees and avoids situations where block builders with transaction ordering rights monopolize rewards.
    Due to this mechanism, Kroma ZKFP operates a concurrent 1-vs-1 challenge system. When the randomly selected validator submits an output, anyone can initiate a challenge, and the bisection is conducted solely between the output submitter and the challenger. Multiple challenges can be conducted simultaneously, and the first challenger to submit a valid ZK proof wins the challenge.

Strictly set timeouts mean that even a malicious challenger attempting a delay attack must complete all bisections and proof generation within 10 hours. Additionally, since challengers are forced to complete all actions within 6 days (excluding the 1-day guardian period), it is impossible to conduct a typical delay attack in Kroma.

However, Kroma ZKFP may still be vulnerable to sybil attacks, similar to OPFP, if the attacker’s funds exceed the defender’s. A sybil attack scenario in Kroma ZKFP might look like this:

  • The attacker continuously creates challenges against a valid output until the output submitter’s funds are exhausted, at which point the attacker submits a ZK proof to win the challenge.

Like OPFP, Kroma ZKFP operates under a model where a successful challenge results in the deletion of the corresponding output. Therefore, if such an attack occurs, the output could be deleted, delaying user withdrawals for 1 hour. If the attack persists, all honest validators could run out of funds, leading to the finalization of an incorrect output, allowing the attacker to steal users’ funds.

Additionally, Kroma ZKFP is still in Stage 0, as its proof system is not yet perfect for the following reasons:

  1. The starting point for dissection is based on the last submitted output, not the last finalized output.
    In OPFP, the starting point for bisection is typically the last finalized output from about a week ago. However, in Kroma ZKFP, the starting point is the last submitted output, which was submitted about 1 hour earlier, and the dissection process is conducted over 1,800 blocks.
    This could allow a malicious challenger to win a challenge when the previous output has been deleted due to an earlier challenge. In that case, the dissection proceeds from previous-output information supplied by the challenger, and by manipulating that information the challenger could win.
  2. There is no verification that each validator is conducting the challenge based on correct batch data.
    While Kroma ZKFP’s use of ZK ensures that it is impossible for an incorrect state transition to be finalized if the ZK circuit has no bugs, Kroma ZKFP does not verify whether the ZK proof generation is based on correct batch data. This means that it is possible for a ZK proof to pass verification even if certain transactions were excluded or incorrect transactions were included in the batch.
    Therefore, it would be possible to win a challenge by using ZK proofs based on incorrect data, and if a user’s withdrawal transaction is excluded from the batch, their withdrawal could be delayed.

In practice, however, the Security Council can intervene to roll back the result of an incorrect challenge or delete an invalid output, so these attack vectors do not affect the funds of Kroma Mainnet users. However, to reach Stage 2, Kroma ZKFP must implement defense mechanisms against these vulnerabilities. Kroma has already proposed improvements for these issues, which will be explained in detail in section 4. Possible Improvement.

3.4. Attack Vector #2: L1 Censorship

Previously, we mentioned that rollups inherit Ethereum’s safety. This means that if Ethereum’s safety is compromised, the rollup will also be affected.

There are two scenarios where Ethereum’s situation could compromise the safety of a rollup:

  1. Censorship of rollup fraud proof transactions by Ethereum validators
    If Ethereum validators or builders collude and submit a malicious output root in an optimistic rollup while censoring all transactions related to fraud proof, what would happen? The challenge could not be resolved within the designated period, the output would be finalized, and users’ funds could be at risk.
    This is referred to as weak censorship. In the case of optimistic rollups, if this censorship lasts beyond the defined period, typically 7 days, users’ funds may be at risk.
  2. Ethereum undergoing a 51% attack, leading to censorship of all fraud proof transactions
    This scenario involves two potential attack paths:
    • First, an entity could acquire over 2/3 of Ethereum’s total stake, allowing them to finalize blocks as they wish. In this case, the attacker could censor fraud proof transactions or even generate them at will.
    • The second method involves a participant who has acquired over 1/3 of Ethereum’s total stake carrying out a “stealth” attack. This is described in the research: Non-attributable Censorship Attack on Fraud Proof-Based Layer2 Protocols. In this case, an attacker with 1/3 of Ethereum’s stake could prevent the finalization of any blocks they dislike. If the attacker continues to vote on regular blocks while withholding votes on blocks containing fraud proof, they could finalize a malicious output root and steal funds from the optimistic rollup. This is called a Non-attributable Censorship Attack on fraud proof-based L2s. It is harder to detect than simply acquiring over 2/3 of the stake and controlling all Ethereum blocks.

These censorship-based attacks are difficult to counter at the rollup level because they occur at the Ethereum protocol layer and would require improvements to Ethereum itself. However, there are strategies that rollups can adopt in the meantime.

3.5. Solution #2: 7 Days of Withdrawal Delay and Semi-Automated 51% Attack Recovery

To address these attack vectors, optimistic rollups currently implement a 7-day withdrawal delay. The 7-day period was first proposed by Vitalik and is based on the idea that 7 days will be ‘enough’ for reacting to censorship attacks.

Let’s examine whether the 7-day challenge period in Optimistic Rollups is sufficient to resist censorship attacks by considering two types of censorship: weak and strong censorship attacks.

For the first, weak censorship, we can use probability calculations to see if the 7-day period gives Optimistic Rollups resistance to censorship attacks. This involves calculating the probability of successfully challenging a fraud when some validators are censoring the rollup’s challenge transactions.

Here, two considerations must be made:

  1. Multiple transactions must succeed for a challenge to be successful within 7 days.
    In most protocols, the challenge won’t succeed if only one transaction from an honest participant is included in the week. Therefore, we need to calculate the probability of including all necessary transactions to submit a fraud proof within the 7-day period.
  2. A realistic assumption must be made about what percentage of validators are involved in censorship.
    Currently, most Ethereum block builders, despite being known to be centralized, are not censoring, and given the share of solo stakers on Ethereum, the chance that an overwhelming majority (e.g., 99.9%) of validators would collude to censor is close to zero.

(Censorship of major Ethereum block builders | source: Tweet of Justin Drake)

Taking these two points into account, even if we assume that 99.5% of validators are censoring (still an extreme assumption), the probability that an honest participant succeeds in including the 30 to 40 transactions required by challenge protocols like BoLD or OPFP within 7 days approaches 100% in all cases. Additionally, resistance to censorship could be improved further with future solutions like inclusion lists or multiple concurrent proposers (e.g., BRAID, APS + FOCIL), reducing the risk of optimistic rollups losing user funds due to weak censorship.
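A rough version of this estimate, under stated assumptions (one proposer per 12-second slot, 99.5% of proposers censoring independently, and 40 sequential transactions needed within the week, one per non-censoring slot), looks like this:

```python
import math

SLOTS_PER_WEEK = 7 * 24 * 60 * 60 // 12     # 50,400 twelve-second slots
P_HONEST_SLOT = 0.005                       # 99.5% of proposers censor
TXS_NEEDED = 40

def log_binom_pmf(k, n, p):
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))

# Failure = fewer than 40 non-censoring slots occur during the whole week.
p_fail = sum(math.exp(log_binom_pmf(k, SLOTS_PER_WEEK, P_HONEST_SLOT))
             for k in range(TXS_NEEDED))

print(f"P(challenge succeeds) ≈ {1 - p_fail:.10f}")   # effectively 1.0
```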

Then will 7 days be enough in the situation of strong censorship? The 51% attack mentioned earlier can only be resolved through a social fork. The Non-attributable Censorship Attack is particularly challenging to detect and cannot be prevented using solutions designed for weak censorship, such as inclusion lists.

There is a proposal to develop a semi-automated 51% attack recovery tool in client software, based on a structure proposed by Vitalik. This censorship detection solution has been further developed by Ethereum researchers and consists of two steps:

  1. Light clients monitor the mempool and detect when certain transactions have not been included in blocks for an extended period.
  2. If specific transactions remain in the mempool for a day without being included in a block, a “Do you agree with a social fork?” button is triggered, allowing the community to initiate a hard fork based on this consensus (a simplified sketch of this detection loop follows).
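Below is a minimal, purely illustrative sketch of such a watchdog loop. The helper names (get_pending_ages, prompt_social_fork) and the one-day threshold are assumptions made for illustration; they do not correspond to any real client API.

```python
import time

PENDING_TIMEOUT = 24 * 60 * 60  # step 2 threshold: one day, in seconds

def get_pending_ages() -> dict[str, float]:
    """Hypothetical helper: {tx_hash: seconds spent in the mempool} as seen by a light client."""
    return {}  # a real client would query its node's transaction pool here

def prompt_social_fork(tx_hash: str) -> None:
    """Hypothetical helper: surface the 'Do you agree with a social fork?' prompt."""
    print(f"tx {tx_hash} has been pending for over a day; asking the operator about a social fork")

def watchdog_loop(poll_interval_seconds: float = 60.0) -> None:
    while True:
        for tx_hash, age in get_pending_ages().items():   # step 1: monitor the mempool
            if age > PENDING_TIMEOUT:                      # step 2: a long-censored transaction
                prompt_social_fork(tx_hash)                # hand the fork decision to humans
        time.sleep(poll_interval_seconds)
```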

Let’s say this tool detects a 51% attack. The next step would be to move to a new chain through a social fork that invalidates the attacker’s funds.

In such a case, it’s crucial that the funds affected by the 51% attack remain locked until the social fork is executed. A similar situation occurred during The DAO hard fork, where the hacker’s funds were locked in a child DAO for 27 days before they could be withdrawn. The Ethereum community was able to conduct a hard fork within that period, preventing the hacker from cashing out the funds (see Vitalik’s Reddit post for more details).

In other words, even in the event of a 51% attack, funds need to remain locked until a social fork can be conducted. In this context, the 7-day withdrawal period in optimistic rollups serves as a buffer. If a social fork doesn’t occur within the week, user funds in optimistic rollups may be stolen, cashed out on centralized exchanges, or mixed via Tornado Cash, making it nearly impossible to return the funds to users even with a social fork.

To summarize, while the 7-day withdrawal period in optimistic rollups was originally proposed to account for weak censorship, in reality, weak censorship is unlikely to occur, and the 7-day period serves as a buffer in the event of strong censorship that requires a social fork.

From this perspective, there has been criticism that OPFP’s reduction of this period to 3.5 days makes it more vulnerable to attacks involving strong censorship. However, this criticism is unfounded. Since Optimism is still in Stage 1, guardians have a buffer to verify the state root’s correctness, and withdrawals can only occur after the additional 3.5-day Guardian Period has passed. Therefore, even if a strong censorship attack occurs, the attacker would still need to wait 7 days to withdraw. Additionally, the attacker would have to censor all challenge-related transactions for the entire week to succeed, as the guardians would also need to be censored to prevent them from halting the confirmation of a malicious output.

However, the key point remains that Ethereum must ensure it can process social forks within the 7-day period. This means that tools to detect 51% attacks must be ready and that there is sufficient research and simulation to determine whether a social fork can be implemented within 7 days. Only then can the 7-day withdrawal delay in optimistic rollups be considered an effective safeguard.

3.6. Attack Vector #3: Exploiting a Bug in the Fraud Proof System

Most challenge protocols work by having participants find a specific point (instruction or block) where they disagree and then generate proof showing that the other participant’s claim is incorrect. The virtual machine used to generate this proof is called the Fraud Proof VM, and the software used for proof generation on top of the VM is called a program. Each protocol uses different Fraud Proof VMs and programs, as shown below:

The goal of each Fraud Proof System is to prove on-chain that a specific EVM execution result is correct. But what happens if there is a bug in this system, either in the VM or in the program?

This question can be explored through the attack vector Yoav Weiss discovered in OVM. The attack was possible due to a vulnerability in OVM’s rollback feature, but the premise of creating a “fraudulent transaction” was crucial for the attack to be carried out. A fraudulent transaction is one that executes normally when processed on the rollup but produces a different result when executed in the challenge process using the Fraud Proof VM and program. Since the Fraud Proof System is supposed to generate the same result as the EVM, the ability to create a fraudulent transaction implies that there is a bug in the Fraud Proof System.

Yoav discovered several bugs in OVM’s Fraud Proof System and was able to simulate this attack by generating fraudulent transactions.

One simple example of the attacks he discovered was as follows: In OVM’s StateManager, the gas cost for the opcodes SSTORE and SLOAD (which store and read state) was incorrectly recorded. This meant that any transaction that stored or read a value in a contract (almost every transaction except simple ETH transfers) would be identified as a fraudulent transaction during the challenge process, even though it had executed correctly on the rollup.
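The toy sketch below (entirely my own illustration with made-up gas numbers, not OVM’s actual code) shows why such a metering mismatch is fatal: the same instruction trace, metered once with the gas schedule the rollup used and once with a fraud proof VM whose SLOAD cost is recorded incorrectly, produces different results, so a perfectly honest execution looks fraudulent during the challenge.

```python
# Hypothetical gas schedules: the fraud proof VM's SLOAD cost is deliberately wrong.
EVM_GAS = {"SLOAD": 2_100, "ADD": 3, "SSTORE": 20_000}
BUGGY_FPVM_GAS = {"SLOAD": 200, "ADD": 3, "SSTORE": 20_000}

def gas_used(trace: list[str], schedule: dict[str, int]) -> int:
    return sum(schedule[op] for op in trace)

trace = ["SLOAD", "ADD", "SSTORE"]              # an ordinary storage-touching transaction
onchain = gas_used(trace, EVM_GAS)              # what the rollup actually charged
replayed = gas_used(trace, BUGGY_FPVM_GAS)      # what the fraud proof VM recomputes

# The mismatch alone is enough to make the correct execution lose a challenge.
print(onchain, replayed, onchain != replayed)   # 22103 20203 True
```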

In short, if there is a bug in the system, a state change that was correctly executed could be incorrectly flagged as invalid during a challenge, causing the output submitted by an honest participant to be marked as wrong.

This was one of the reasons OP Mainnet recently transitioned its fault proof system from a permissionless model to one where only authorized participants could join. After OPFP was applied to the mainnet, a security audit revealed several bugs in the Fraud Proof System (Cannon and op-program) and the Dispute Game challenge protocol. To prevent the system from being exploited, Optimism announced on August 17th that it would switch to a permissioned system.

Of course, exploiting a VM bug may not have a significant impact on rollups in Stage 0 or Stage 1, because the Security Council can intervene at any time to correct the outcome of a challenge.

This was a point previously argued by OP Labs. In fact, OP Labs shared its audit framework in the Optimism Forum, outlining its criteria for when external audits are necessary.

(OP Labs Audit Framework | Source: Optimism Forum)

In this framework, situations like the recent one fall into the fourth quadrant: “Fault Proofs with training wheels.” While these situations are chain-safety-related, they do not directly impact user funds and, therefore, are not included in the audit scope. This means that even if bugs are exploited, the Security Council can correct the results.

However, since vulnerabilities have been identified, they need to be addressed. Optimism fixed these issues in its Granite network upgrade, allowing OP Mainnet to return to Stage 1.

On the other hand, bugs in the system could be critical in Stage 2 rollups. In Stage 2, the Security Council can only intervene in the case of bugs that are provable onchain. Since proving that “the challenge result is wrong due to a system bug” on-chain is nearly impossible, if a bug occurs in a Stage 2 rollup, users’ funds could be at risk.

3.7. Solution #3: Multi Proofs

To prevent such issues, it is essential to conduct thorough audits before the code reaches production. However, Fraud Proof VMs and programs are complex software systems, and the more complex the system, the more likely bugs are to occur. Therefore, even with rigorous audits, bugs can still arise. We need to explore additional strategies beyond audits.

One approach is to use multiple proof systems within the same system. Instead of generating fraud proofs using a single system during a challenge, the system could simultaneously generate multiple fraud proofs using different VMs and programs, then compare the results. This would create a system that remains secure even in the event of a bug.

For example, imagine a multi-proof system using both Optimism’s Cannon and the Asterisc ZK Fault Proof VM (which utilizes RISC-V). In the case of a challenge, the following would happen:

  1. If an incorrect output is detected, the challenger generates a challenge and initiates a bisection.
  2. Once a block of disagreement is found via bisection, two subgames occur simultaneously:
    • A subgame using the traditional OPFP method with Cannon.
    • A subgame using Asterisc to generate a ZK Fault Proof.
  3. After both games are completed, the two different fraud proofs are verified.

If both proofs pass verification, the challenger wins; if both fail, the challenger loses. However, if one passes and the other fails, this indicates that an unexpected bug occurred in one of the VMs or programs during proof generation.

In such cases, entities like the Security Council would intervene to adjust the challenge result. This ensures that the system can remain free from bugs without violating the condition that “the Security Council can only intervene in cases of bugs that are provable onchain.”
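The resolution rule can be summarized in a few lines. This is only a schematic sketch of the logic described above, not Optimism’s actual contract code:

```python
from enum import Enum

class Outcome(Enum):
    CHALLENGER_WINS = "both fraud proofs verified: the disputed output is rejected"
    CHALLENGER_LOSES = "both fraud proofs failed: the disputed output stands"
    ESCALATE = "proofs disagree: a VM or program bug is now evident on-chain"

def resolve(cannon_proof_valid: bool, asterisc_proof_valid: bool) -> Outcome:
    if cannon_proof_valid and asterisc_proof_valid:
        return Outcome.CHALLENGER_WINS
    if not cannon_proof_valid and not asterisc_proof_valid:
        return Outcome.CHALLENGER_LOSES
    return Outcome.ESCALATE  # disagreement between the two systems triggers Security Council review
```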

This is one of the ongoing efforts for Optimism to reach Stage 2. To support this, OPFP’s Dispute Game is designed modularly, allowing multiple fraud proof systems to be implemented freely, with a minimal interface defined to support this.

4. Possible Improvements

In previous sections, we explored the design of optimistic rollup protocols and the vulnerabilities that could arise in their challenge and fraud proof verification processes. This section discusses the issues and solutions for each protocol, along with future prospects for fraud proof systems and optimistic rollups.

4.1. Room for Improvement for Each Protocol

1) Arbitrum BoLD

BoLD has a sound economic challenge protocol because it limits the maximum protocol delay to one week and ensures protection from sybil attacks unless the attacker has more than 6.5 times the funds of the defender. However, BoLD presents two notable issues:

  1. The resource ratio of 6.5x gives too little advantage to the defender.
  2. The bond for submitting a root claim is 3,600 ETH, which is excessively large.

The first issue can be addressed with ZK technology. BoLD splits challenges into multiple levels to reduce the resources required for history commitment computation. Using ZK, this could be reduced to a single level.

This concept is similar to the suggestion of BoLD++ from Gabriel at Cartesi. When challenges are multi-leveled, increasing the resource ratio results in an exponential rise in bond size at the top level. However, when using a single level, the resource ratio can be increased more easily, making the protocol more resistant to sybil attacks.

The second issue, the 3,600 ETH bond, is more difficult to solve. BoLD’s bond size was set not only to address sybil attacks but also to deter delay attacks. The bond size is a function of TVL, and even with ZK, it cannot be reduced significantly. To mitigate this, BoLD is implementing a pooled bonding mechanism, allowing multiple participants to contribute to the bond.
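As a rough illustration of the pooled bonding idea (my own sketch; Arbitrum’s actual implementation will differ), a pool only needs to track contributions toward the 3,600 ETH assertion bond and split any returned bond or reward pro rata:

```python
ASSERTION_BOND_ETH = 3_600  # the root-claim bond size discussed above

class BondPool:
    """Toy bond pool: many participants fund one assertion bond together."""

    def __init__(self, target_eth: float = ASSERTION_BOND_ETH):
        self.target_eth = target_eth
        self.contributions: dict[str, float] = {}

    def deposit(self, contributor: str, amount_eth: float) -> None:
        self.contributions[contributor] = self.contributions.get(contributor, 0.0) + amount_eth

    def can_post_assertion(self) -> bool:
        return sum(self.contributions.values()) >= self.target_eth

    def payouts(self, returned_eth: float) -> dict[str, float]:
        """Split the returned bond (plus any reward) pro rata among contributors."""
        total = sum(self.contributions.values())
        return {who: returned_eth * amt / total for who, amt in self.contributions.items()}
```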

2) Cartesi Dave

Dave effectively addresses sybil attacks with its tournament structure, but as mentioned earlier, it is vulnerable to delay attacks. The maximum delay time, as a function of the number of malicious claims N_A and the number of challenge levels L, is:

T_d = 7 × (log₂(1 + N_A))^L days

If N_A = 7 and L = 3, the protocol can experience delays of up to four months, causing significant inconvenience and loss to users as withdrawals are delayed.

ZK can help mitigate this. By fixing the number of levels L to 1 (as in BoLD++), the maximum delay time can be reduced to:

T_d = 7 × log₂(1 + N_A) days

Cartesi is reportedly working on this improvement using RISC Zero’s ZK technology. However, there are still concerns about whether this will be enough to prevent delay attacks entirely. If N_A = 7, the protocol could still face up to 2 weeks of additional delay, and the attacker’s cost would be just 7 ETH in bonds, along with gas fees and off-chain history commitment costs. For chains with high TVL, this penalty might not be enough to deter delay attacks.

(Dave with BoLD style sub-challenges | Source: L2Beat Medium)

There is a suggestion for Dave to adopt BoLD-style challenges with 8 participants in each round instead of conducting 1-on-1 matches, similar to a traditional tournament. In this case, the delay time would be calculated as follows:

T_d = 7 × log₈(1 + N_A) days

Under this structure, an attacker would need to post at least 64 bonds to delay the challenge beyond two weeks, equating to a total bond requirement of 64 ETH, along with substantial onchain and offchain costs.
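A quick numeric check of these single-level bounds, using the formulas quoted above (the script and its parameter choices are mine):

```python
import math

BASE_DAYS = 7  # duration of one tournament round, as assumed throughout this section

def delay_binary(n_a: int) -> float:
    """T_d = 7 * log2(1 + N_A) days for 1-vs-1 (binary) tournament rounds."""
    return BASE_DAYS * math.log2(1 + n_a)

def delay_8way(n_a: int) -> float:
    """T_d = 7 * log8(1 + N_A) days for BoLD-style rounds with 8 participants."""
    return BASE_DAYS * math.log(1 + n_a, 8)

print(delay_binary(7))   # ~21 days total, i.e. about 2 weeks beyond the usual week
print(delay_8way(7))     # ~7 days: 8 claims are resolved within a single round
print(delay_8way(63))    # ~14 days with 63 claims; 64 or more are needed to exceed two weeks
```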

However, this approach has the downside of weakening the defender’s advantage against sybil attacks: BoLD gives the defender only a constant-factor (N times) resource advantage over the attacker, whereas Dave’s 1-on-1 tournament gives the defender an exponentially greater advantage.

In summary, Dave can effectively limit delay attack vectors through the use of ZK Fraud Proofs. While applying a structure like BoLD can improve resistance to delay attacks, it might lead to a trade-off where the defender’s advantage in the face of sybil attacks is reduced.

3) Optimism Fault Proof (OPFP)

OPFP had the drawback of being vulnerable to sybil attacks because the attacker and defender incurred equal costs. OP Labs proposed a solution to this problem in Dispute Game V2.

Unlike the original OPFP, where bonds were submitted with each bisection, Dispute Game V2 requires participants to post bonds only at the start of the bisection. Additionally, Dispute Game V2 introduces dissection, allowing participants to submit multiple claims simultaneously at branch points, reducing the number of interactions in most cases.

(Branch Claim at Dispute Game V2 | Source: Optimism Specs GitHub)

In the previous OPFP, the sybil attack vectors were:

  1. Creating numerous challenges to exhaust the defender’s funds on bonds and gas fees.
  2. Continuously submitting fraudulent outputs, forcing the defender to respond and drain their resources.

The introduction of branch claims addresses both vectors. First, the honest participant doesn’t need to post additional bonds during dissection, while malicious challengers must do so for every new challenge they create. This makes mass challenge creation unsustainable for attackers if bond amounts are appropriately set.

Second, as bonds are larger at higher levels in Dispute Game V2, continuously submitting fraudulent outputs becomes costlier for attackers than for defenders.

Thus, OPFP can effectively counter sybil attacks with the branch claims introduced in Dispute Game V2.

4) Kroma ZK Fault Proof (Kroma ZKFP)

Kroma ZKFP faces the dual challenges of vulnerability to sybil attacks and an imperfect proof system. The following two issues must be resolved for Kroma ZKFP to progress to Stage 1:

  1. The dissection starting point is based on the last submitted output, not the last finalized output.
  2. Validators do not verify whether challenges are based on correct batch data.

Kroma plans to switch from Scroll’s Halo2 zkEVM to Succinct SP1 zkVM, addressing these two issues and advancing to Stage 1.

Kroma is expected to modify its challenge process to align with Optimism’s Dispute Game interface. This adjustment is detailed in Kroma’s spec, and it will allow the dissection starting point to move to the last finalized output from one week ago, resolving the first issue.

For the second issue, Kroma will use ZK-based trustless derivation. Here’s how it works:

(Trustless Derivation using ZK | Source: Lightscale Notion)

Imagine that we want to prove that a specific L2 block T was correctly executed. Before generating a ZK proof, we must verify that the transaction data for block T was properly constructed based on the L1 batch data.

Here, Kroma intends to verify via ZK that the batch data was correctly retrieved from L1. If the data were simply fetched through a trusted RPC outside of the ZK program, there would be no way to confirm that the batch data had not been tampered with. Instead, the program generates a ZK proof of the connectivity of block hashes from block O (the L1 origin of L2 block T) to block C (the L1 block at the time the challenge was created), showing that it accessed the correct L1 block and fetched the batch data from it. If a challenger constructed L2 block T from incorrect batch data, the hash of the L1 block they claim the batch came from would differ from the hash of the L1 block that actually contains the batch for T, and it would not be connected to block C. Therefore, as long as there is no hash collision, verifying the connectivity of L1 block hashes inside ZK proves that the challenger constructed the L2 block from the correct batch data.
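The core check is just an ancestry walk over L1 block hashes, which in Kroma’s design is carried out inside the ZK program. Here is a stripped-down sketch of the same check outside of ZK; the function name and the dictionary-based chain view are my own assumptions for illustration:

```python
def is_ancestor(origin_hash: bytes, tip_hash: bytes, parent_of: dict[bytes, bytes]) -> bool:
    """Walk parent hashes from block C (tip) back toward block O (origin).

    parent_of maps an L1 block hash to its parent's hash. If the origin block a
    challenger claims to have read the batch from is not on the chain leading to C,
    the walk never reaches it and the check fails.
    """
    current = tip_hash
    while current is not None:
        if current == origin_hash:
            return True                       # O is an ancestor of C: the batch data is canonical
        current = parent_of.get(current)      # a fabricated or tampered block has no link here
    return False
```

In the real system this traversal is proven inside the zkVM, so the on-chain verifier only checks a succinct proof instead of trusting an RPC response.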

With these improvements, Kroma ZKFP can possibly move to Stage 1. However, to reach Stage 2, Kroma will need additional solutions to protect against sybil attacks, including changing the challenge protocol to All-vs-All and redesigning the bond mechanism.

4.2. Summary

5. Future of Fraud Proof

5.1. Stage 2 Rollup - Your Funds are SAFU

As described above, Optimistic Rollups are moving towards Stage 2. Arbitrum is aiming to achieve Stage 2 based on BoLD. The implementation of BoLD has already been posted on the Governance Forum and has garnered significant support, with its implementation currently deployed on the testnet. If no major security issues are found, Arbitrum is likely to achieve Stage 2 through BoLD by the end of this year.

Optimism is also working hard to achieve Stage 2. For OP Mainnet to reach Stage 2, Dispute Game V2 must be completed, and there need to be multiple proof mechanisms for multi-proof. Although the specification is still in progress, Dispute Game V2 effectively addresses the weaknesses of the existing OPFP by providing strong protection against sybil attacks, bringing it closer to Stage 2. Additionally, multiple proofs are being actively developed, with various teams including OP Labs, Succinct, Kroma, and Kakarot dedicating significant R&D resources to create diverse ways to prove OP Stack. Therefore, Optimism is also expected to aim for Stage 2 by the first half of next year, barring any major issues.

The transition of these two rollups to Stage 2 could significantly impact the rollup ecosystem. Both Arbitrum and Optimism have their own rollup frameworks, Arbitrum Orbit and OP Stack, respectively. Their transition to Stage 2 means that all rollups using these frameworks could also transition to Stage 2.

Thus, starting from the end of this year to next year, major rollups with large user bases, such as Arbitrum, OP Mainnet, and Base, are expected to transition to Stage 2, inheriting the full security of Ethereum. This will likely silence criticisms such as “Rollups are just multisigs” or “Rollups can take your funds anytime.”

5.2. ZK Fraud Proof is the Future

Most of the protocols discussed would benefit from implementing ZK Fraud Proof. For example, applying ZK to Arbitrum BoLD could increase the resource ratio, making it more resilient to sybil attacks, and Cartesi Dave could reduce its vulnerability to delay attacks. OPFP is also investing R&D into ZK for multi-proof systems, which could reduce bond amounts and improve protocol security.

It’s important to note that ZK Fraud Proof does more than just reduce the number of interactions between validators. Fewer interactions mean significantly fewer resources for validators, which allows bond amounts to be reduced, enabling more participants to join the protocol. Additionally, this reduces the maximum possible delay, improving overall protocol security.

In this way, ZK Fraud Proof plays a critical role in both the security and decentralization of optimistic rollups.

5.3. How about ZK Rollup? Will Fraud Proof Diminish?

At this point, some readers might ask:

If fraud proof and challenge mechanisms are so complex, wouldn’t ZK Rollups be a better option?

To a certain extent, this is true. In ZK Rollups, achieving Stage 2 doesn’t require complex economic considerations, users’ funds aren’t at risk of being stolen in the event of L1 censorship, and users can withdraw funds within a matter of hours.

The transition from optimistic rollups to ZK rollups might happen sooner than expected. This is because the main drawbacks of ZK rollups—high proof generation costs and time—are rapidly improving. Recently, Succinct Labs introduced OP Succinct, a ZK version of OP Stack, offering a framework to easily launch ZK rollups based on the OP Stack.

(Introducing OP Succinct | Source: Succinct Blog)

However, there are still a few considerations. The first is cost. The cost for generating a block proof in OP Succinct is known to be around $0.005-$0.01, and the monthly cost of running a prover is estimated to be between $6,480 and $12,960. If the chain has a high TPS, these costs could increase further.

(Benchmark of proving cost in various networks | Source: Succinct Blog)

For example, the average proof generation cost per block for Base with OP Succinct is about $0.62, which works out to roughly $803,520 per month. This is an additional cost that does not arise with optimistic rollups, and even as ZK costs decrease, the operational costs of ZK rollups will always be higher than those of optimistic rollups.
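The monthly figures follow directly from the per-block costs if we assume a 2-second L2 block time (an assumption on my part, chosen because it reproduces the numbers quoted above):

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60
BLOCK_TIME_SECONDS = 2
blocks_per_month = SECONDS_PER_MONTH // BLOCK_TIME_SECONDS  # 1,296,000 blocks

for per_block_cost in (0.005, 0.01, 0.62):
    print(f"${per_block_cost}/block -> ${per_block_cost * blocks_per_month:,.0f}/month")
# $0.005/block -> $6,480/month
# $0.01/block -> $12,960/month
# $0.62/block -> $803,520/month
```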

The second consideration is how it affects decentralization. Validators in ZK rollups need to run provers, which is more difficult and expensive than running fraud proof programs in optimistic rollups. Also, due to the slower proof generation times in ZK systems, users can’t verify transactions in real-time. While higher hardware specs can improve proof generation speeds to match transaction execution, this means that running a prover requires high-spec computing environments. Ideally, anyone should be able to run a node to ensure the chain’s safety, but ZK hasn’t reached that level yet.

Lastly, ZK rollups are based on highly complex mathematics and cryptography, and this complexity surpasses that of fraud proof and challenge protocols. Thus, ZK systems require extensive testing before they can be safely used in production.

Arbitrum is pursuing a hybrid protocol that combines ZK and optimistic methods as its endgame. The protocol would primarily operate as an optimistic rollup, generating ZK proofs only when fast withdrawals are needed. This would be useful for scenarios requiring rapid fund rebalancing between chains, such as exchanges or bridges, or for enabling interoperability between chains.

In conclusion, the optimistic rollup approach appears to remain valid for now, with ZK adopted in a hybrid form. But as ZK proof generation costs and speeds continue to improve, more optimistic rollups might seriously consider transitioning to ZK in the future.

5.4. Are Fraud Proofs Only for Rollups?

We have looked into Ethereum’s optimistic rollups and their fraud proof mechanisms. What are some other use cases for this fraud proof?

  1. Restaking Protocols

Fraud proofs can be actively utilized in restaking protocols. Let’s explore this with the example of Eigenlayer, a representative restaking service on Ethereum.

Eigenlayer is a service that allows Ethereum’s security to be rented out through restaking. Operators in Eigenlayer can receive ETH or LSTs delegated by users through Eigenlayer’s delegation contracts and participate in validation by opting into multiple AVSs (Actively Validated Services). Through Eigenlayer, protocols can easily build AVSs and reduce the cost of bootstrapping an initial validator set.

Like any other blockchain, AVSs reward operators for successful validation and must slash them when they act maliciously. This is where fraud proofs can be used in the slashing process.

(Slashing Example of an AVS | Source: Eigenlayer GitHub)

For example, consider a bridge AVS. The premise of the bridge AVS is that it must properly transfer users’ funds to the target chain, and any operator who maliciously manipulates transactions should be slashed. If such manipulation occurs, a challenger who discovers the misconduct can create a challenge with a fraud proof in the Dispute Resolution contract, asserting that the operator has incorrectly performed the bridging. If the fraud proof is deemed valid, the AVS can call the slasher contract in Eigenlayer to halt any rewards for the operator.
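In code, the flow is a simple challenge-then-slash pipeline. The sketch below is purely illustrative; the class and function names (DisputeResolution, Slasher, verify_fraud_proof) are hypothetical and do not correspond to Eigenlayer’s actual contracts:

```python
class Slasher:
    """Stand-in for the restaking layer's slashing entry point."""

    def freeze(self, operator: str) -> None:
        print(f"rewards halted and stake frozen for operator {operator}")

class DisputeResolution:
    """Stand-in for an AVS's dispute resolution contract."""

    def __init__(self, slasher: Slasher):
        self.slasher = slasher

    def verify_fraud_proof(self, proof: bytes) -> bool:
        # Placeholder: real logic would check that the proof shows an
        # incorrectly relayed bridge message by the accused operator.
        return len(proof) > 0

    def challenge(self, operator: str, proof: bytes) -> None:
        if self.verify_fraud_proof(proof):   # a single honest challenger is enough
            self.slasher.freeze(operator)    # the AVS calls into the restaking layer's slasher
```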

Although this slashing feature has not yet been implemented in Eigenlayer, they recently announced the Shared Security Model, which includes slashing in the next release. This will enable the use of fraud proofs for slashing.

  2. Data Availability Layer

Fraud proof can also be used in the Data Availability (DA) layer. A representative example of this is the fraud proof proposed and implemented by Celestia. Celestia has a technology that allows light nodes to verify whether data is stored correctly based on Data Availability Sampling. Let’s take a closer look at this.

A light client should be able to verify whether a block has been agreed upon by a majority (more than 67%) of the validators without downloading all the data of the blockchain. However, it is difficult for light clients to verify all the signatures of validators for each block, and as the number of validators increases, this becomes almost impossible.

This is where Celestia presents an interesting concept: even if the majority of validators are malicious, a single honest full node can tell light clients to reject a faulty block. Light clients do not need to blindly trust this full node, because its warning comes with a fraud proof that they can verify themselves.

There are two types of fraud proofs in Celestia:

  • Fraud proofs for data
  • Fraud proofs for state transition

First, fraud proofs for data work as follows: Celestia allows light nodes to verify that validators are holding the correct data without directly downloading all the data within a block. To achieve this, Celestia uses a technology called Data Availability Sampling (DAS).

(Data Availability Sampling at Celestia | Source: Celestia Docs)

Celestia’s validators arrange transaction data into a k × k matrix and then extend it to a 2k × 2k matrix using a technique called 2D Reed-Solomon encoding. They then calculate a total of 4k Merkle roots, one for each row and each column, and a further hash over these Merkle roots is included in the block header and propagated.

With just the Merkle root information in the block header, light nodes can verify that Celestia’s validators are holding the correct data. Light nodes request data from random points in the 2k x 2k matrix along with the Merkle roots for the corresponding rows and columns from validators. If this data can be verified against the values in the block header, the validators can be trusted to be holding the correct data.
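The sampling check itself boils down to verifying a Merkle proof for a randomly chosen cell against the committed row (or column) root. The toy sketch below is my own simplification and does not reflect Celestia’s actual share or proof formats:

```python
import hashlib
import random

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def verify_merkle_proof(leaf: bytes, proof: list[tuple[bytes, str]], root: bytes) -> bool:
    """proof is a list of (sibling_hash, side) pairs from the leaf up to the root."""
    node = h(leaf)
    for sibling, side in proof:
        node = h(sibling + node) if side == "L" else h(node + sibling)
    return node == root

def sample_once(extended_size: int, request_share, row_roots: list[bytes]) -> bool:
    """One DAS round: fetch a random share from a full node and check it against its row root."""
    row, col = random.randrange(extended_size), random.randrange(extended_size)
    share, proof = request_share(row, col)  # callback to whichever full node serves the sample
    return verify_merkle_proof(share, proof, row_roots[row])
```

Repeating sample_once a few dozen times drives the probability of accepting a block with withheld data down exponentially, which is what lets light nodes trust data availability without downloading the block.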

However, one important consideration arises: What if a validator maliciously performs Reed-Solomon encoding? Celestia addresses this issue by implementing something called a “bad-encoding fraud proof.”

If a Celestia full node discovers during block recovery that encoding has been done incorrectly, it generates a fraud proof containing the block height, the incorrectly encoded section, and proof of the mistake, which is then propagated to light nodes. The light nodes verify the proof to confirm that the data was indeed encoded incorrectly, allowing them to stop using the faulty data.

In addition, Celestia also proposes a fraud proof mechanism for state transitions.

(Architecture of a block in Celestia | Source: Contribution DAO Blog)

Celestia’s blocks are structured to include trace data for transactions at various intervals. This allows full nodes to easily build fraud proofs, and light nodes can detect incorrect state transitions without executing the entire block. However, due to complexity issues, this mechanism has not yet been implemented on the Celestia mainnet.

In summary, fraud proof in the DA layer can play a role in filtering out incorrect data and state transitions without relying on consensus.

  3. Machine Learning

AI and blockchain were hot topics in 2024, and much R&D has been conducted in this area. One of the most notable aspects is the combination of blockchain and machine learning.

The primary reasons for applying machine learning to blockchain are as follows:

  • Data reliability: Blockchain manages data in a decentralized manner, with all transactions recorded openly and transparently. If a machine learning model learns from blockchain data, the data is from a reliable source, reducing the possibility of tampering.
  • Transparency and verifiability of models: When a machine learning model is executed on a blockchain, the model’s updates and results are recorded on-chain, making them verifiable. This prevents manipulation or bias in results that could occur in centralized environments.

The critical factor here is verifying that the machine learning model has been correctly trained. However, machine learning computations are highly intensive, making it nearly impossible to execute them entirely within blockchain runtimes. Therefore, frameworks like opML and zkML have emerged to efficiently verify machine learning model training in a blockchain environment. opML adopts an optimistic approach to model training, recording the results on the blockchain and correcting errors through a challenge mechanism.

Let’s take a closer look at the approach proposed by ORA, a project providing AI infrastructure on the blockchain. The opML challenge process is very similar to rollup challenges and is composed of the following three key components:

  • Fraud Proof VM: This VM executes machine learning inference and functions similarly to Arbitrum’s WAVM or Optimism’s Cannon.
  • opML smart contract: This contract verifies fraud proofs, playing a role similar to Optimism’s MIPS.sol contract.
  • Verification game: The verifier who issued the challenge interacts with the server through bisection to identify the single incorrect step within the VM, then generates a fraud proof for that step and submits it to the opML contract.

(Verification Game on ORA opML | Source: ORA Docs)
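As in rollup challenge protocols, the heart of the verification game is a binary search over two execution traces for the first step where the parties disagree. A minimal sketch of that search (my own illustration, not ORA’s implementation; it assumes deterministic execution, so traces that diverge never re-converge):

```python
def find_first_disagreement(challenger_trace: list[bytes], server_trace: list[bytes]) -> int:
    """Return the index of the first step whose resulting state hashes differ."""
    assert len(challenger_trace) == len(server_trace)
    assert challenger_trace[0] == server_trace[0]        # both sides agree on the starting state
    lo, hi = 0, len(challenger_trace) - 1                # invariant: agree at lo, disagree at hi
    assert challenger_trace[hi] != server_trace[hi]
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if challenger_trace[mid] == server_trace[mid]:
            lo = mid                                     # still in agreement, search the right half
        else:
            hi = mid                                     # already diverged, search the left half
    return hi                                            # the single step to prove on-chain
```

Only this one identified step then needs a fraud proof submitted to the opML contract.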

Through this fraud proof mechanism, opML leverages the security and trustworthiness of blockchain while providing a cost-effective environment for machine learning model training and verification.

6. Conclusion

Optimistic rollups are investing significant effort into improving fraud proofs and challenge protocols to inherit more of Ethereum’s security and create a more trust-minimized chain. Arbitrum is expected to reach Stage 2 by the end of this year through BoLD, and Optimism is also working towards Stage 2, relying on Dispute Game V2 and multi-proof mechanisms. By next year, users of optimistic rollups will be able to interact with the network with greater security, without worrying that “the rollup could take their funds.” Additionally, the number of Stage 1+ rollups that Vitalik could mention in his blog is expected to increase.

However, there is still room for improvement in each protocol, and much of it can be enhanced through ZK Fraud Proofs. Kroma is already advancing its protocol based on this, and other protocols such as Arbitrum, Optimism, and Cartesi can maintain a safer and more decentralized approach with ZK Fraud Proofs.

Fraud proofs are an area where not only rollups but also other protocols are investing substantial R&D resources. Based on the premise that “only one honest participant is needed,” fraud proofs, together with ZK, can contribute to building a trust-minimized architecture across the entire blockchain, and their impact will gradually become something we can experience firsthand.

7. Reference

L2Beat

Fraud Proof Wars | Luca Donnoh at L2Beat

Arbitrum Docs

Optimism Docs

Optimism Specs

Permissionless Refereed Tournaments | Cartesi

Kroma Specs

BoLD: Fast and Cheap Dispute Resolution

Economics of BoLD

Why is the Optimistic Rollup challenge period 7 days? | Kelvin Fichter at OP Labs

Fraud Proofs Are Broken | Gabriel Coutinho de Paula at Cartesi

Optimistic Time Travel | Yoav Weiss

About the First Successful Challenge on Kroma Mainnet

Unpacking progress in baseline decentralization | OP Labs

Non-attributable censorship attack on fraud-proof-based Layer2 protocols

OP Labs Audit Framework

Trustless Derivation | Kroma

Introducing OP Succinct: Full Validity Proving on the OP Stack | Succinct

Eigenlayer GitHub

Celestia Docs

Contribution DAO Blog

ORA Docs

Disclaimer:

  1. This article is reprinted from [research.2077], All copyrights belong to the original author [sm-stack and BTC Penguin]. If there are objections to this reprint, please contact the Gate Learn team, and they will handle it promptly.
  2. Liability Disclaimer: The views and opinions expressed in this article are solely those of the author and do not constitute any investment advice.
  3. Translations of the article into other languages are done by the Gate Learn team. Unless mentioned, copying, distributing, or plagiarizing the translated articles is prohibited.

However, if the validator submitting the state root acts maliciously and submits an incorrect root, it could compromise the security of user funds. To mitigate this risk, two main mechanisms have been proposed, resulting in the differentiation between Optimistic Rollup and ZK Rollup.

  1. ZK Rollup
    ZK Rollup not only requires the validator to submit the state root but also to provide a ZK proof verifying the correctness of the state root calculation. If the validator submits an incorrect state root, the ZK proof would fail validation by the L1 Verifier contract, preventing the submission of the malicious state root.
  2. Optimistic Rollup
    Optimistic Rollup allows the designated validator to submit the state root without additional safeguards, relying on the assumption that the submission is honest. However, if the submitted root is incorrect, anyone can challenge it and make people not able to use it in the withdrawal process. The challenger must submit proof to Ethereum demonstrating that the root is incorrect, known as a Fraud Proof. \
    To ensure a safe resolution of a challenge from attacks like L1 censorship, there is a withdrawal delay of about a week in Optimistic Rollups.

1.3. Why do we need Fraud Proof?

Unlike ZK Rollups, Optimistic Rollups operate under conditions where validators can submit incorrect state roots and attempt to manipulate withdrawals. Fraud proofs effectively prevent this, ensuring the safety of funds in the bridge contract.

Without a robust fraud proof mechanism, Optimistic Rollups would not fully inherit Ethereum’s security. For example, in the current Arbitrum system of permissioned validator system, if all validators collude, they could potentially steal all funds in the bridge contract. Similarly, in OP Stack rollups like Base which still don’t implement permissionless fault proofs on mainnet, the single malicious validator could steal funds.

Thus, Fraud Proofs play a crucial role in the security of Optimistic Rollups, and any system lacking a well-implemented Fraud Proof mechanism poses a risk to user assets.

This article evaluates the risks associated with various Optimistic Rollups and examines the implementation, strengths, and weaknesses of their fraud proof mechanisms.

1.4. Toward Stage 2: Removing Training Wheels

Fraud proof systems play a pivotal role in helping optimistic rollups achieve ‘Stage 2.’ The Stage framework, proposed by Vitalik and currently operated by L2Beat, is used to evaluate the security level of rollups.

In the Ethereum ecosystem, this Stage framework is often likened to learning how to ride a bicycle. A Stage 0 rollup, which relies on the most trust assumptions, is compared to a tricycle with training wheels, while a Stage 2 rollup, which fully inherits Ethereum’s security, is compared to a two-wheeled bicycle with the training wheels removed.

Here are more detailed criteria for each stage from Stage 0 to Stage 2:

As highlighted above, implementing a proper fraud proof and challenge mechanism is crucial for optimistic rollups to achieve Stage 1 or 2. Considering the criteria, a fraud proof system that meets the Stage 2 standards would have the following characteristics:

  • It should be well-functioning with no known defects, with the 1-of-N characteristic.
  • It must be a permissionless system, where anyone can submit proofs.
  • If there is a bug in the proof system, it should be provable on-chain.

In the latter part of the article, we will explore how various protocols are attempting to implement these features.

2. Fraud Proof - Concept and Misconception

2.1. How are Fraud Proofs implemented?

Fraud proofs provide onchain verifiable evidence that a submitted state root is incorrect, indicating that a specific state transition function within the L2 was improperly executed. A straightforward method involves generating proofs for executing all L2 blocks from the last confirmed state root to the current state root, demonstrating the incorrect state root. However, this approach is costly and time-consuming.

Thus, effective fraud proof generation narrows down the specific incorrect state transition before generating proofs for that segment. Most fraud proof protocols follow this approach.

Fraud proof and challenge protocol typically follow these steps:

  1. Validators (asserters) periodically submit an output (or claim) containing L2 state root to Ethereum.
  2. If a validator (challenger) disagrees with an output, they initiate a challenge.
  3. The asserter and challenger identify the disagreeing segment through a process known as Bisection or Dissection. It narrows down the segment into an instruction level or a block level (with ZK).
  4. The challenger submits a fraud proof onchain to demonstrate the incorrect segment. Generally, protocols like Arbitrum and Optimism execute the suspicious instruction onchain for verification.
  5. If the fraud proof is validated, the incorrect output is removed or replaced. Depending on the challenge protocol, the asserter is slashed, and the challenger is rewarded.

2.2. Common Misconception: Fraud Proof and Challenge do not rollback the chain

One important point is that even if a fraud proof and a challenge occur, the chain is not rolled back. What the fraud proof guarantees is that ‘funds within the deposited bridge contract cannot be maliciously withdrawn,’ and there is no rollback of the incorrect state transition.

The main reason for not rolling back is that there is no need for it. Fundamentally, when an incorrect state transition occurs within the rollup, the problem is that malicious actors could steal users’ funds from the bridge. To prevent this, it is sufficient to ensure that the state root posted to L1 remains correct. This has nothing to do with chain rollback, and the fraud proof and challenge mechanism are sufficient as long as they prevent the finalization of a malicious state root.

Moreover, if the proposer who posts the state root and the sequencer who generates blocks for the L2 chain are different entities, then there is no need for a rollback mechanism.

Therefore, even in a situation where a challenge is successfully resolved, the L2 chain is not rolled back; only the state root (output or claim) submitted to L1 is either deleted or replaced. If the fraud proof and challenge mechanisms work correctly, it ensures that users’ funds within the bridge are safe.

2.3. Real Example: Challenge at Kroma in April 2024

Through the actual challenge case, you will be able to see that the rollback is not performed on the entire rollup chain, but only the output root is replaced or deleted. The only successful challenge case known on the mainnet so far is the challenge that occurred in Kroma, a hybrid rollup based on the OP Stack using ZK fault proof, in April 2024.

Kroma is an OP Stack based rollup with its own ZK fault proof and permissionless validator system. On April 1, 2024, a problem occurred with the L1 origin of the Kroma sequencer, causing the sequencer to generate incorrect blocks. Additionally, an incorrect output root was submitted by the validators observing this. Immediately after the submission of the output root, a total of 12 challengers created challenges against the output.

One of the challengers succeeded in calling the proveFault function, deleting the wrong output.

(Challenger successfully executed proveFault function | Source: etherscan)

This is the first successful challenge case in the history of Ethereum rollups on the mainnet. It is also the first successful verification and challenge of a fault proof in a mainnet environment approximately three years after the first Optimistic Rollup, Arbitrum, was launched in May 2021. The detailed overview of this challenge can be found in the article written by Kroma.

In this case, the Kroma chain did not undergo a rollback, but only the incorrect output root was deleted.

Disclaimer: Is it Fraud Proof or Fault Proof?

Fraud proof is also referred to as fault proof. Particularly in the Optimism and OP Stack chains, the term fault proof is used, while in Arbitrum, Cartesi, L2Beat, etc., the term fraud proof is used.

Considering the Kroma challenge case above, it can be inferred that challenges often arise from ‘mistakes’ rather than malicious attempts to manipulate withdrawals. In the above case, the main cause was an anomaly in the L1 client observed by the Kroma validators. In other words, challenges can arise simply due to validator errors or incorrect patches. In such cases, the term Fault Proof may be more appropriate.

However, the term that better reflects the purpose itself is fraud proof. All the mechanisms introduced so far, and those to be introduced in the future, aim to verify ‘fraudulent actions’ attempting to steal funds within the bridge through malicious outputs.

The point is, the purpose is to prevent fraud, but it can actually occur due to mistakes. In this article, I will use the term fraud proof, which is more widely used in the ecosystem.

3. Hack it! - Exploiting Fraud Proof Mechanisms

3.1. Designing Economic Dispute Protocol

Optimistic rollups have each implemented their own fraud proof and challenge mechanisms to protect user funds. What these mechanisms commonly aim for is the idea that “as long as there is at least one honest participant, the protocol can remain secure.” Fraud proofs are proofs that describe that a predetermined state transition function has been executed correctly, and through the verification process, it inevitably leads to a result where the honest participant wins.

However, this does not always hold true, and in reality, there can be situations where the protocol is in danger even with the presence of an honest participant. For example, unexpected bugs may occur due to the complexity of fraud proof, and malicious participants may find themselves economically advantaged over honest participants due to misaligned incentives, leading to situations where user withdrawals are significantly delayed or funds are stolen.

For these reasons, designing fraud proof and challenge mechanisms is a very difficult task. Particularly, to become a Stage 2 rollup, the challenge mechanism must be perfect, and countermeasures against various attack vectors and loopholes must be in place.

In other words, each fraud proof and challenge mechanism contains considerations on how to respond to attack vectors. If you do not understand each attack vector, you will not be able to understand why their protocol must be designed in such a way.

Thus, in this section, we will first examine the following attack vectors and explore how each protocol responds to them.

  • Attack vectors arising from vulnerabilities in the Dispute Game:
    • Delay attacks that delay user withdrawals for more than 7 days.
    • sybil attacks that deplete the funds and resources of honest participants.
  • Attacks caused by censorship of L1 validators.
  • Attacks exploiting bugs within the Fraud Proof VM.

Note: The attack vectors discussed below are all publicly known and do not affect the security of any mainnets.

The protocols and their respective characteristics that will be examined in the following sections are as follows:

3.2. Attack Vector #1: Exploiting Economic Dispute Game

Most optimistic rollups that have implemented fraud proof mechanisms all require bisection to find out the first disagreement point. It’s important for the protocol to provide incentives that encourage participants to act honestly.

One of the easiest ways to achieve this is to have participants stake a certain amount of funds (bond) when taking actions and slash the bond if they are deemed to have acted maliciously.

Considering game theory, the protocol must ensure that the funds consumed by malicious participants to attack are greater than the funds consumed by honest participants to defend. However, this is very difficult to achieve.

The key reason here is because, in a game context, it is impossible to know in advance who the malicious participant is without running the challenge to completion. In other words, the asserter who submitted the output may be malicious, or the challenger who challenged the output may be malicious. Therefore, the protocol must be designed under the assumption that either side could be malicious. Moreover, since there can be various attack vectors, designing the protocol becomes an exceedingly complex task.

Also, because each protocol adopts different mechanisms, the attack vectors corresponding to each method and the attacker’s incentive model must be defined. Additionally, an economically secure model must be designed to remain safe even when these are combined.

This remains a topic of ongoing discussion. In this section, we will analyze attack vectors that could generally occur and the incentives of participants within those scenarios. Additionally, we will explore how each protocol responds to these and how effectively they limit such incentives.

3.2.1. Attack Vector #1-1: Delay Attack

A delay attack refers to an attack where a malicious entity does not aim to steal rollup funds but rather prevents or delays the output from being confirmed on L1. This is an attack that can occur in most current optimistic rollups, adding additional delay to withdrawals, making it take more than a week for users to withdraw funds from L1.

This is slightly different from attacks caused by the censorship of L1 validators, which will be discussed later. Censorship prevents honest participants from taking any action on Ethereum, allowing a malicious state root to be finalized. On the other hand, a delay attack can delay the finalization of the state root even when honest participants are actively engaged. In such cases, not only can user withdrawals be delayed, but if the attacker has more funds than the defender, the malicious state root may be finalized, leading to the theft of user funds.

One of the simplest ways to prevent delay attacks is to require participants in the challenge system to stake a certain amount of funds or bond, which can be slashed if they are deemed to be causing delays.

However, there are considerations to take into account. What if the attacker is willing to have their funds slashed and still attempts a delay attack?

This attack vector is quite tricky to handle. This is also why Arbitrum’s fraud proof system currently operates in a permissioned structure.

The fraud proof mechanism applied to Arbitrum One, Arbitrum Classic, utilizes a branching model. Rather than simply allowing participants to challenge incorrect claims, each participant submits what they believe to be the correct claim along with a certain amount of funds, treating these as “forks of the chain.” Claims can also be thought of as checkpoints on the chain’s state.

(Branching model of Arbitrum)

In Arbitrum Classic, participants will submit claims and chain branches they believe are correct, and through challenges, incorrect chain branches are gradually removed, eventually confirming the correct claim.

However, a single challenge cannot determine who is correct. Two malicious participants may proceed with bisection in the wrong way, defining an unrelated point as the disagreement point, and eliminating the correct claim. Therefore, Arbitrum ensures that challenges are continuously carried out until no participants have funds staked on a particular claim, guaranteeing that the challenge is resolved successfully if there is at least one honest participant.

This can be exploited for delay attacks. Suppose there are honest participants and N-1 malicious attackers who stake funds on the correct claim, while one attacker stakes funds on an incorrect claim. If the attackers can always include their transaction before the honest participants, they can proceed with the challenge first. In the worst case, if they carry out the bisection incorrectly, bisecting the portion that they agree on instead of the portion where they disagree, they can present a fraud proof on the wrong part. Naturally, this will pass, causing the side with funds staked on the correct claim to lose.

Since each challenge can take up to 7 days, the attackers can delay the protocol by up to 7 * (N-1) days.

(Delay attack at Arbitrum Classic | Source: L2Beat Medium)

The issue with this mechanism is that the cost of delay attacks scales linearly with the time the protocol is delayed. If an attacker finds the attack profitable, they will want to delay the protocol as long as possible, and the total delay time will be proportional to the attacker’s total amount of funds, potentially causing very long delays in user withdrawals.

In conclusion, a fraud proof protocol that can effectively defend against delay attacks must be designed such that the maximum delay time is bounded to a certain amount, or the cost of conducting the delay increases exponentially over time, making the cost of executing the attack greater than the incentive to do so.

3.2.2. Attack Vector #1-2: Sybil Attack (Exhaustion Attack)

Another attack vector is the Sybil Attack (Exhaustion Attack, Proof of Whale Attack). This can occur when an attacker has more funds or computing resources than the defender. The attacker can continuously submit incorrect output roots or create meaningless challenges, exhausting the defender’s funds or computing resources. At some point, the defender will run out of funds or idle computing resources, making them unable to defend, and the attacker will finalize the incorrect output root and steal the funds.

Typically, the above attack vector can occur in a permissionless system in the following two ways:

  1. Continuously submitting incorrect outputs.
    Suppose attacker Bob has more money than honest participants (defenders) Alice, Charlie, and David combined. In this case, Bob continuously submits incorrect output roots. Honest participants Alice, Charlie, and David will respond by paying gas fees and bonds, and at a certain threshold, the honest participants’ funds will run out before Bob’s. At this point, Bob submits another incorrect output, and since there are no longer any honest participants with remaining funds in the network, the output will finalize without challenge. In this way, Bob can steal funds from an optimistic rollup.
  2. Submitting multiple challenges to an honest output.
    Conversely, a malicious participant may attack honest participants by submitting multiple challenges. Similarly, the attack will continue until the honest participants exhaust all their funds on gas fees and bonds, and the malicious attacker will then submit an incorrect output and steal the users’ funds from the bridge.

To prevent such attacks, the advantage of defender over attacker must be properly designed. In all situations, the defender must be at a sufficient advantage over the attacker. One way to do so is designing the bond carefully; since sybil attacks are related to the total amount of funds available to each participant, if the bond is properly set, it should be possible to establish that “the system is safe from sybil attacks unless the attacker’s total funds are N times greater than the defender’s total funds.”
The other known way to prevent sybil attacks is implementing a sybil-resistant dispute protocol. This will be explained further in the following section demonstrating Cartesi Dave.

Let’s take a look at how each protocol responds to these delay and sybil attacks through their respective designs.

3.3. Solution #1: Economically Sound Dispute Game

1) Arbitrum BoLD

BoLD, building on the branching model of the original Arbitrum Classic, introduces the following three elements to prevent the vulnerabilities of delay attacks:

  • All-vs-All challenge mechanism.
    In BoLD, challenges are no longer conducted 1-on-1, but take the form of a concurrent All-vs-All system where all participants can stake their bonds on the branch they agree with. This prevents the delay attack vector that arose from the previous challenge mechanism where 1-on-1 challenges were conducted sequentially, and ensures that multiple, separate challenges for the same dispute cannot occur.
  • Prevention of malicious bisection through proof of correct state computation (history commitment).
    The issue in Arbitrum Classic was that malicious participants could intentionally cause delays by bisecting in a way that marked non-controversial sections as disputed points. To prevent this, BoLD requires the submission of proof, along with the state root, to verify that the state root was correctly computed during the bisection process, ensuring that no malicious bisection has taken place.
    In BoLD, participants must submit proof along with the state root during the bisection process. This proof verifies that the current state root was correctly computed based on the state root submitted in the previous claim. If a malicious participant attempts to submit an arbitrary root unrelated to the previously submitted state root during bisection, the proof verification will fail, causing the bisection transaction to fail as well. This effectively ensures that only one type of bisection is possible for each claim.
    Therefore, if an attacker wants to carry out multiple bisections against an honest claim in BoLD, they must submit multiple claims.
    However, generating this proof requires validators to use quite a bit of computing resources. Internally, creating this proof requires generating hashes for all states within the bisection, which is typically estimated to be around 2^70 (approximately 1.18 x 10^21) hashes in Arbitrum. To address this, BoLD splits the challenge into three levels, reducing the number of hashes that need to be computed to 2^26 (about 6.71 x 10^7); see the sketch after this list.

(This figure assumes a total of 2^69 instructions; actual figures may vary)

  • Limiting the challenge period through the chess clock mechanism.
    In the previous Arbitrum Classic, there was no time limit on how long a challenge could proceed, allowing malicious participants to delay the protocol indefinitely as long as they had enough funds. BoLD introduces a chess clock mechanism to effectively limit the duration of a challenge.
    Let’s assume there are two participants who submitted different claims. Each is given a timer (chess clock) with 6.4 days of time. This timer begins to count down when it is a participant’s turn to submit a bisection or proof and stops once the participant completes their task.
    Since each participant is given 6.4 days of time, the maximum amount of time one participant can delay the process is 6.4 days. Therefore, in BoLD, challenges can last a maximum of 12.8 days (with an additional 2 days under certain circumstances when the Security Council intervenes).
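As referenced in the history-commitment bullet above, here is a back-of-the-envelope sketch of why splitting the challenge into levels shrinks the per-commitment work. Only the 2^70 total and the 2^26 per-level figure come from the text; the individual per-level spans below are illustrative assumptions chosen so that their product equals 2^70.

```python
import math

# Back-of-the-envelope sketch: why level-splitting shrinks the work needed
# to build one history commitment.
TOTAL_STEPS = 2 ** 70                       # total disputed instructions (from the text)
LEVEL_SPANS = [2 ** 26, 2 ** 22, 2 ** 22]   # hypothetical per-level spans; 26 + 22 + 22 = 70
assert math.prod(LEVEL_SPANS) == TOTAL_STEPS

# Single-level protocol: one commitment must cover every step.
single_level_hashes = TOTAL_STEPS
# Multi-level protocol: a commitment only covers its own level,
# so the worst-case work per commitment is the largest level span.
multi_level_hashes = max(LEVEL_SPANS)

print(f"single level : ~{single_level_hashes:.3e} hashes per commitment")  # ~1.181e+21
print(f"three levels : ~{multi_level_hashes:.3e} hashes per commitment")   # ~6.711e+07
```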

Through these mechanisms, Arbitrum BoLD effectively limits delays caused by challenges. The maximum duration of a challenge is two weeks, and the maximum additional delay users may experience is approximately one week.

However, this can be exploited for delay attacks. A malicious participant can create a challenge and collude with L1 validators to censor the honest validator on Arbitrum, delaying Arbitrum users’ withdrawals by up to one week. In this scenario, users who request withdrawals within this timeframe may experience opportunity costs due to having their funds tied up for an additional week. Although this is not an attack where the attacker directly profits from the funds, it should still be prevented since it imposes opportunity costs on users. Arbitrum BoLD is addressing this issue by setting the bond required for creating a challenge high enough to deter such attacks.

Arbitrum calculates this amount in the Economics document of BoLD. The main reason for delay in the protocol is the censorship of L1 validators. In the case of a delay attack, the scenario would unfold as follows:

  1. The attacker submits a claim N' that disagrees with an existing claim N before it finalizes on Arbitrum.
  2. The defender tries to send bisection transactions, but they fail because L1 validators are censoring the defender's challenge transactions.
  3. Since BoLD assumes that censorship cannot last longer than 7 days, this can delay the finalization of claim N by up to a week.

The attacker’s profit comes from the opportunity cost incurred by users who have requested withdrawals from the challenged output. The worst-case scenario is when all funds in Arbitrum are requested for withdrawal in one output, and in this case, the opportunity cost incurred by users is calculated as follows, assuming Arbitrum One has a TVL of $15.4B and an APY of 5%.

cost_opp = 15,400,000,000 x (1.05^(1/52) - 1) ≈ $14,722,400

Because submitting an incorrect claim can impose such a high opportunity cost, claim submitters in BoLD are required to submit a bond of a similar magnitude. Currently, the bond required for claim submission in BoLD is set at 3,600 ETH, which amounts to approximately $9.4M.

This is to preemptively prevent the attacker from causing significant losses to the system through delays. Since the attacker will lose their bond in a challenge, they can cause up to $14.7M in opportunity costs but will forfeit about $9.4M in funds. Thus, BoLD disincentivizes delay attacks by requiring bonds comparable to the worst-case opportunity cost.
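For reference, the arithmetic behind this figure can be reproduced as follows (a sketch; the TVL, APY, and one-week horizon come from the text, and Arbitrum's own document may use a slightly different day-count convention, which is why the result differs marginally from the ~$14.7M cited above):

```python
# Worst-case opportunity cost of a one-week delay on Arbitrum One,
# per the assumptions above: $15.4B TVL, 5% APY, all funds stuck for one week.
TVL_USD = 15_400_000_000
APY = 0.05
WEEKS_PER_YEAR = 52

weekly_yield = (1 + APY) ** (1 / WEEKS_PER_YEAR) - 1   # compounded weekly rate
opportunity_cost = TVL_USD * weekly_yield

print(f"weekly yield     : {weekly_yield:.6%}")
print(f"opportunity cost : ${opportunity_cost:,.0f}")   # on the order of $14-15M
```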

However, the 3,600 ETH bond size is not set solely due to delay attacks. To defend against sybil attacks, Arbitrum BoLD is designed to ensure the system remains safe unless the attacker's total funds are more than 6.5 times the defender's total funds, and this is how the bond amount of 3,600 ETH was determined.

From the perspective of a sybil attack, the following attack scenario could occur in Arbitrum BoLD. BoLD’s challenge system consists of three levels, and users must lock funds to submit the claim they believe to be correct.

Let’s assume that honest participant Alice submits a valid claim with X ETH. Malicious participant Bob, who has 3,600 ETH, could create multiple malicious claims. Alice would then need to lock Y ETH for each claim at a lower level to counter them.

In Arbitrum’s branching model, locking funds implies agreement with the chain state from the genesis to the claim. This feature allows participants to move their staked funds from claim A to its children, A’ and A’’. Thus, Alice would move her initially staked X ETH to lower levels and lock Y ETH for each of Bob’s malicious claims.

What happens if Bob has significantly more money than Alice? Bob can generate countless malicious claims until Alice runs out of funds to lock. At this point, Alice can no longer proceed with the bisection, allowing Bob to confirm an incorrect claim.

Ultimately, this issue boils down to how much more advantageous the defender should be compared to the attacker in the game.

Arbitrum expresses this metric as the resource ratio. It indicates how much more advantageous the honest participant is compared to the malicious participant, and is defined as the ratio of the gas fees (G) and staking amounts (S) that the attacker must spend to those that the defender must spend.

BoLD’s challenge system is divided into three levels, and by maintaining this resource ratio at every level, it guarantees that the defender consistently has N times the advantage over the attacker across the entire system. Arbitrum has calculated the required bond size at the top level based on this resource ratio and created a graph.

(Top-Level Dispute Bond Cost vs. Resource Ratio at Arbitrum BoLD | Source: Desmos)

According to this graph, when the resource ratio is 100x, the required bond at the top level exceeds 1 million ETH (over $2.6 billion). While a higher resource ratio makes the system more secure from sybil attacks, the bond amount becomes so large that hardly anyone can participate in the system, making it no different from a centralized system where only one validator submits claims.

Therefore, in BoLD, the resource ratio is set to 6.5x, making the bond at the top level 3,600 ETH, and the bonds at level 1 and level 2 are set to 555 ETH and 79 ETH, respectively.

In summary, BoLD defends against sybil attacks by calculating the resource ratio and setting the bond amount such that the defender has a 6.5x advantage over the attacker.

2) Cartesi Dave

Cartesi's Dave was first proposed in a paper titled Permissionless Refereed Tournaments, published in December 2022, before BoLD's first whitepaper. It aims to keep the honest participant at an advantage over the attacker in both computing resources and funds. Dave is structured similarly to BoLD and has two key features:

  • Prevention of malicious bisection through proof of correct state computation (history commitment).
    Like BoLD, Dave requires participants to generate proof during bisection to show that they performed the computation correctly, preventing malicious forms of bisection. Accordingly, Dave’s challenge system is also divided into multiple levels to save validators’ resources.
  • 1-vs-1 sequential challenge mechanism in a tournament structure.
    Dave’s challenges are not conducted all at once but rather proceed in a tournament format, as shown in the figure below.

The above figure shows how a challenge proceeds when a malicious attacker submits seven incorrect claims against the network. Due to the nature of the history commitment, honest participants who agree with the correct claim, shown in green, are grouped together as a team. In Dave, they are grouped into a tournament format and placed as shown in the figure, with each participant engaging in 1-vs-1 challenges. Challenges at the same stage are conducted concurrently, and after one week, when the challenge is completed, the winners move on to the next stage. In the figure, the team of honest participants must undergo three rounds of challenges to win the tournament.

This feature is highly effective in preventing sybil attacks. First, the attacker must create multiple claims to execute the sybil attack, and each consumes the attacker’s computing resources and funds in a significant way.

Cartesi’s paper proves that defenders always maintain an exponential advantage over attackers in any situation. In other words, Dave guarantees that sybil attacks can be defended against with logarithmic resources relative to the number of attackers. This makes it very difficult to execute a sybil attack in Dave, and as a result, the bond size in Dave is set to a minimal 1 ETH, much smaller than in BoLD.

However, Dave is vulnerable to delay attacks. Each stage of the tournament consumes one unit of challenge time (one week), so the more malicious claims there are, the longer the protocol delay will be. The time it takes to fully resolve a challenge in Dave can be expressed by the following formula,

T_d = 7 x log_2(1 + N_A) (days)

where N_A represents the number of malicious claims. However, Dave's challenges can be composed of multiple levels to efficiently generate history commitments. Here, malicious participants can generate N_A malicious claims at each level of the challenge, which increases the total delay time as follows:

T_d = 7 x [log_2(1 + N_A)]^L (days)

where L represents the number of levels in each challenge. If, as in the figure above, there are seven malicious claims and L is 2, the full resolution of the challenge could take up to 9 weeks, and users would experience an additional withdrawal delay of about 2 months. If the number of levels increases or the number of malicious claims grows, the delay could extend to several months.
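A small sketch of this delay formula (the 7-day round time and the formula are from the text; rounding the number of rounds up to a whole number is our interpretation, since rounds are discrete):

```python
import math

def dave_delay_days(num_malicious_claims: int, levels: int, round_days: float = 7.0) -> float:
    """Worst-case challenge resolution time in Dave:
    T_d = round_days * ceil(log2(1 + N_A)) ** L."""
    rounds_per_level = math.ceil(math.log2(1 + num_malicious_claims))
    return round_days * rounds_per_level ** levels

print(dave_delay_days(7, 2))   # 63.0 days  -> 9 weeks, the example above
print(dave_delay_days(7, 3))   # 189.0 days -> 27 weeks with three levels
```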

Cartesi aims to solve this issue using ZK, which will be discussed in detail in section 4. Possible Improvement.

3) Optimism Fault Proof (OPFP)

OPFP is a permissionless challenge protocol currently applied on the OP Mainnet and has the following characteristics:

  • All-vs-All concurrent challenge mechanism using a Game Tree
    OPFP allows anyone to submit an output (root claim) at any time. Validators who disagree with the submitted output can initiate a bisection process by challenging it.

(Architecture of OPFP Game Tree and Bisection Process | Source: Optimism docs)

Bisection is conducted concurrently on a Game Tree structured as shown in the figure above. The leaves of the tree represent L2 states, with the rightmost leaf representing the latest L2 state, and each internal node corresponds to the state at the rightmost leaf of its subtree. For example, submitting a claim at Node 1 is the same as submitting the state at Node 31.
This structure allows for the representation of bisection. For instance, if a validator disagrees with the Root claim (Node 1), they would submit a claim at Node 2, which corresponds to Node 23 in the tree, as it is the midpoint between Nodes 16 and 31. The submitter of Node 1 would then check the L2 state at Node 23 and either submit Node 6 (Node 27) if they agree or Node 4 (Node 19) if they disagree, continuing this process until the disagreement is found.
Even if multiple directions of bisection exist within one game, they can all proceed simultaneously, and anyone, not just the output submitter, can participate in the bisection process.
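A minimal sketch of the heap-style indexing implied by the figure: a claim at a given node commits to the L2 state at the rightmost leaf of that node's subtree. This is a simplification of OPFP's actual Position encoding, written here only to reproduce the node numbers used above.

```python
def depth(node: int) -> int:
    """Depth of a node in 1-indexed heap numbering (root = 1 at depth 0)."""
    return node.bit_length() - 1

def committed_leaf(node: int, max_depth: int) -> int:
    """Leaf (i.e. L2 state) that a claim at `node` commits to:
    the rightmost leaf of the node's subtree."""
    while depth(node) < max_depth:
        node = 2 * node + 1          # keep descending to the right child
    return node

MAX_DEPTH = 4                         # leaves are nodes 16..31 in the figure
print(committed_leaf(1, MAX_DEPTH))   # 31 -> root claim commits to the latest state
print(committed_leaf(2, MAX_DEPTH))   # 23 -> counter-claim at node 2
print(committed_leaf(6, MAX_DEPTH))   # 27 -> "agree" move
print(committed_leaf(4, MAX_DEPTH))   # 19 -> "disagree" move
```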

(Full Architecture of OPFP Game Tree | Source: Optimism docs)

The Game Tree used in OPFP is nested, with the upper tree handling bisection at the block level and the subtree below handling bisection at the instruction level.
Unlike BoLD or Dave, OPFP does not enforce correct bisection through history commitment, as the off-chain/on-chain costs of generating and submitting such commitments would be high.

  • Customizable dispute games based on modularity
    Currently, there are only two types of dispute games (Permissionless / Permissioned) live on the OP Mainnet. Optimism aims to eventually introduce various types of dispute games and has implemented the minimum interface to support this. By adhering to the specified function names and arguments, one can create a custom dispute game.
  • Challenge time limitation through a chess clock
    In OPFP, when a challenge occurs, both the asserter and the challenger are given a clock with time allocated for bisection. Each time a claim is made, the clock starts running for the opposing party. Optimism refers to this as “inheriting the clock of the grandparent.”
    Interestingly, each participant is given 3.5 days, not 7 days, which means that if no one challenges an output, it will be finalized within 3.5 days.
    However, this does not allow for immediate withdrawals. After an output is finalized, OPFP has a 3.5-day guardian period during which the Security Council can intervene to invalidate an incorrect output if necessary.

(User Withdrawal Process in Happy Path | Source: OP Labs Blog)

Based on these mechanisms, OPFP, like other optimistic rollups, guarantees that withdrawals can be made at least 7 days after submission. However, if a challenge occurs, it may take more than 7 days for users to withdraw through that output. OPFP’s chess clock model limits the time each participant can spend on bisection, but it does not strictly limit the total time until the challenge is resolved.

This raises the question: could a user's withdrawal be delayed for more than a week if a challenge occurs on OPFP, as in BoLD? The answer is "yes." However, unlike BoLD or Dave, Optimism gives users options for handling a challenge, based on unique characteristics of the protocol.

OPFP operates on the assumption that “a participant who submits an incorrect claim loses their bond.” However, there is one edge case in OPFP where this assumption is broken, known as the “freeloader claim.” This can occur in the following scenario:

  1. Alice submits a claim with a correct state root.
  2. Bob submits a counterclaim, and Alice makes a move to defend her original claim.
  3. Bob waits until his clock has almost run out (3.5 days), then challenges his own claim.

At this point, Alice should respond and claim Bob’s bond, but she inherits the time left on Bob’s clock, which may be insufficient for her to counter his claim. Thus, Bob may avoid losing his bond by submitting a “freeloader claim.”

(Freeloader Claim at Optimism Fault Proof | source: L2Beat)

While this doesn’t prevent the proper resolution of a challenge, it does represent a case where “an incorrect claim is submitted without bond being slashed,” which should be prevented from an incentive perspective.

Therefore, OPFP addresses this by resetting the clock to 3 hours whenever the asserter's or challenger's remaining time falls below 3 hours, ensuring there is enough time to counter freeloader claims. However, if no action is taken within those 3 hours, the clock expires and the challenge is resolved.

We can imagine a scenario where this mechanism is exploited for delay attacks. Suppose honest participant Alice submits a correct output, and from the moment Alice submits, time starts running on the challenger’s clock. Malicious participant Bob waits until 1 second before the challenger’s clock expires and then submits an incorrect output. The rules of OPFP then extend Bob’s time to 3 hours. Alice will respond, and Bob will continue using the extra 3 hours provided for each bisection.

This could delay the resolution of the challenge. The maximum time Bob can delay is 3.5 days + 3 hours x the maximum number of bisections he makes. OPFP's MAX_GAME_DEPTH is 73, so Bob makes roughly 36 of the moves, meaning the longest Bob could delay the process is 3.5 days + 3 hours x 36 = 8 days. If Alice were to act similarly to delay the challenge, the bisection process could take 16 days.
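The arithmetic behind these bounds, as a quick sketch (the 3.5-day clock, 3-hour extension, and MAX_GAME_DEPTH of 73 are from the text; the assumption that each side makes roughly half of the 73 moves is ours):

```python
# Worst-case delay from the clock-extension rule in OPFP.
CLOCK_DAYS = 3.5                       # initial chess-clock time per side
EXTENSION_HOURS = 3                    # clock is topped up to 3 hours when nearly expired
MAX_GAME_DEPTH = 73                    # maximum number of bisection moves in a game
MOVES_PER_SIDE = MAX_GAME_DEPTH // 2   # ~36 moves for the delaying party (assumption)

one_side_delay_days = CLOCK_DAYS + MOVES_PER_SIDE * EXTENSION_HOURS / 24
both_sides_delay_days = 2 * one_side_delay_days

print(one_side_delay_days)    # 8.0 days if only Bob stalls
print(both_sides_delay_days)  # 16.0 days if both sides stall
```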

Does this mean users wouldn’t be able to withdraw for 16 days? In practice, no, due to Optimism’s withdrawal logic.

Unlike Arbitrum, where withdrawals must prove inclusion in a specific L2 block, OP Stack uses a storage proof mechanism, where the withdrawal request is recorded in the L2ToL1MessagePasser contract on L2. This means that even if a long challenge occurs for a specific output, users can wait for the next output to finalize and withdraw based on the contract storage root included in that output. Therefore, users are not forced to experience long delays even if the block they requested withdrawal from is challenged, as they can use the next output.

However, this only holds true if users act quickly. In most cases, users may still experience several days of delay. This can be attributed to the withdrawal process in OP Stack, which involves the following three steps:

  1. Initiating the withdrawal (initiateWithdrawal) on L2.
  2. Proving the withdrawal (proveWithdrawalTransaction) on L1 for the output that includes the withdrawal.
  3. Waiting one week of proof maturity delay before finalizing the withdrawal (finalizeWithdrawalTransaction).

The key point is that users must wait one week between proving the withdrawal and finalizing it. If Alice proves her withdrawal on output B and a challenge occurs, she can send another proof for output C and finalize the withdrawal after a week. In this case, Alice will only experience the delay between outputs B and C.

Therefore, users who are unaware of the challenge creation or respond late may experience up to 9 days of additional withdrawal delays.

Furthermore, there is an additional delay attack vector in OPFP, where every output is challenged consecutively. In this case, users cannot bypass the delay by proving on the next output, causing the entire protocol to be delayed. OPFP counters this by requiring participants to stake bonds at every bisection level, with the bond amount increasing exponentially as shown in the diagram below.

(Amount of OPFP bond | Source: Optimism docs)

In other words, the longer an attacker tries to delay challenge resolution in OPFP, the greater the cost, due to the exponential increase in bond requirements, reducing the incentive for delay attacks over time. Additionally, since outputs can be submitted at any time in OPFP, it is difficult for an attacker to estimate the resources required to conduct a delay attack. The initial bond is set to 0.08 ETH, and the total bond that must be posted over a full challenge is up to ~700 ETH.
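As a rough illustration only (Optimism's actual bond schedule is derived differently), here is a toy geometric schedule calibrated so that bonds start at 0.08 ETH and sum to roughly 700 ETH over 73 depths, just to show how quickly the per-move cost grows for an attacker who keeps extending a challenge:

```python
# Toy calibration of an exponentially growing bond schedule.
# Only the 0.08 ETH initial bond, ~700 ETH total, and depth 73 come from the text;
# the geometric form and growth factor below are illustrative assumptions.
INITIAL_BOND_ETH = 0.08
TARGET_TOTAL_ETH = 700.0
DEPTH = 73

# Find a growth factor r such that 0.08 * (r**73 - 1) / (r - 1) ~= 700 (bisection search).
lo, hi = 1.0001, 2.0
for _ in range(100):
    r = (lo + hi) / 2
    total = INITIAL_BOND_ETH * (r ** DEPTH - 1) / (r - 1)
    lo, hi = (r, hi) if total < TARGET_TOTAL_ETH else (lo, r)

print(f"growth factor per depth : ~{r:.3f}")
print(f"bond at the final depth : ~{INITIAL_BOND_ETH * r ** (DEPTH - 1):.1f} ETH")
```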

In summary, OPFP leaves the length of delay up to the user’s response in the event of a single challenge, and exponential bond requirements are used to counteract delay attacks caused by consecutive challenges.

However, OPFP is vulnerable to sybil attacks. In OPFP, if the attacker has more funds than the defender, a sybil attack is possible.

The following sybil attack vectors are possible in OPFP, both of which could lead to the theft of user funds:

  1. The attacker creates multiple challenges, causing the defender to use all their funds on bonds and gas fees.
  2. The attacker continuously submits incorrect outputs, forcing the defender to respond until they deplete their funds on bonds and gas fees.

This is possible in OPFP because the total bond amount required by both the attacker and defender throughout the challenge process is nearly the same, and the defender does not use significantly fewer resources (e.g., gas fees or computing power) than the attacker.

However, this does not mean that user funds on the current OP Mainnet are at risk. OPFP is still in Stage 1, and the Security Council has the authority to correct any improper outcomes. Therefore, even if such attacks occur, the Security Council can intervene to protect user funds on the OP Mainnet bridge.

To move OPFP to Stage 2, however, Optimism must modify the mechanism to ensure that the defender has a greater advantage than the attacker. Optimism is preparing Dispute Game V2 to address this, and more details will be explained in section 4. Possible Improvement.

4) Kroma ZK Fault Proof (Kroma ZKFP)

Kroma is an L2 based on the OP Stack, and before OPFP was introduced, it launched a permissionless ZK Fault Proof system on its mainnet in September 2023. Kroma ZKFP has similar characteristics to OPFP but stands out in that it generates block-level proofs using ZK and utilizes dissection instead of bisection, significantly reducing the number of interactions required in the challenge process. The key features of Kroma ZKFP are summarized as follows:

  • Reduction of interactions through ZK and dissection
    Kroma ZKFP allows participants to find points of disagreement within four interactions. When a challenge is initiated, Kroma ZKFP processes the challenge over 1,800 blocks, starting from the previous output to the current output. Instead of bisection, where the range is split in half, the asserter and challenger divide the range into N parts using dissection. The process works as follows:

After each participant submits two transactions, they will have identified the blocks they disagree on, and the challenger can generate a ZK fault proof to demonstrate that the asserter’s claim was incorrect.
In Kroma ZKFP, the dissection timeout is set to 1 hour, and ZK proof generation has an 8-hour timeout.
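To see why dissection needs so few interactions, compare the number of narrowing steps for different split widths over the 1,800-block range (a sketch; the split width N actually used by Kroma is not specified above, so the values of N below are illustrative):

```python
import math

def interactions_needed(block_range: int, splits_per_step: int) -> int:
    """Narrowing steps needed to isolate one block when each step
    divides the current range into `splits_per_step` segments."""
    return math.ceil(math.log(block_range, splits_per_step))

BLOCK_RANGE = 1_800   # blocks between the previous and current output (from the text)

print(interactions_needed(BLOCK_RANGE, 2))    # 11 -> plain bisection
for n in (7, 16, 43):                         # illustrative dissection widths
    print(n, interactions_needed(BLOCK_RANGE, n))
# With a split width in the mid single digits or above, the disagreeing block
# is found within ~4 steps, matching the interaction count described above.
```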

  • Decentralization of validators through an incentive mechanism
    Both BoLD and OPFP provide incentives for challenge winners but do not provide specific incentives for output submitters, and essentially anyone who wants to withdraw can submit an output and become a validator. However, it is impractical for users who wish to withdraw to operate a validator client themselves, and someone must regularly submit outputs to maintain liveness. Since this is a resource-consuming task that requires gas fees for output submission and validator client operation costs, without proper incentives, only a few people might participate as validators, which could lead to centralization and inadequate responses in failure scenarios.

To prevent this, Kroma has modified the OP Stack to distribute half of the gas fees generated by the chain to validators who submit outputs. Furthermore, Kroma plans to transition this reward mechanism to its native token, KRO, after the TGE, and it aims to introduce a DPoS-like validator system to allow regular users to contribute to the security and liveness of the chain without running their own clients.
The bond amount in Kroma is currently set at 0.2 ETH, ensuring that it is greater than the cost of generating the ZK proof and conducting the bisection. This bond will also transition to being staked in KRO within the future validator system.

  • Concurrent 1-vs-1 challenge system
    To ensure a fair and consistent distribution of incentives, Kroma has fixed the output submission interval to 1 hour, and validators are randomly selected from a pre-registered set to act as the asserter. This prevents excessive competition that could lead to wasted gas fees and avoids situations where block builders with transaction ordering rights monopolize rewards.
    Due to this mechanism, Kroma ZKFP operates a concurrent 1-vs-1 challenge system. When the randomly selected validator submits an output, anyone can initiate a challenge, and the bisection is conducted solely between the output submitter and the challenger. Multiple challenges can be conducted simultaneously, and the first challenger to submit a valid ZK proof wins the challenge.

Strictly set timeouts mean that even a malicious challenger attempting a delay attack must complete all bisections and proof generation within 10 hours. Additionally, since challengers are forced to complete all actions within 6 days (excluding the 1-day guardian period), it is impossible to conduct a typical delay attack in Kroma.

However, Kroma ZKFP may still be vulnerable to sybil attacks, similar to OPFP, if the attacker’s funds exceed the defender’s. A sybil attack scenario in Kroma ZKFP might look like this:

  • The attacker continuously creates challenges against a valid output until the output submitter’s funds are exhausted, at which point the attacker submits a ZK proof to win the challenge.

Like OPFP, Kroma ZKFP operates under a model where a successful challenge results in the deletion of the corresponding output. Therefore, if such an attack occurs, the output could be deleted, delaying user withdrawals for 1 hour. If the attack persists, all honest validators could run out of funds, leading to the finalization of an incorrect output, allowing the attacker to steal users’ funds.

Additionally, Kroma ZKFP is still in Stage 0, as its proof system is not yet perfect for the following reasons:

  1. The starting point for dissection is based on the last submitted output, not the last finalized output.
    In OPFP, the starting point for bisection is typically the last finalized output from about a week ago. However, in Kroma ZKFP, the starting point is the last submitted output, which was submitted about 1 hour earlier, and the dissection process is conducted over 1,800 blocks.
    This could allow a malicious challenger to win if a previous output has been deleted due to a challenge: the dissection would then proceed based on previous-output information supplied by the challenger, and by manipulating that information the challenger could win the challenge.
  2. There is no verification that each validator is conducting the challenge based on correct batch data.
    While Kroma ZKFP’s use of ZK ensures that it is impossible for an incorrect state transition to be finalized if the ZK circuit has no bugs, Kroma ZKFP does not verify whether the ZK proof generation is based on correct batch data. This means that it is possible for a ZK proof to pass verification even if certain transactions were excluded or incorrect transactions were included in the batch.
    Therefore, it would be possible to win a challenge by using ZK proofs based on incorrect data, and if a user’s withdrawal transaction is excluded from the batch, their withdrawal could be delayed.

In practice, however, the Security Council can intervene to roll back the result of an incorrect challenge or delete an invalid output, so these attack vectors do not affect the funds of Kroma Mainnet users. However, to reach Stage 2, Kroma ZKFP must implement defense mechanisms against these vulnerabilities. Kroma has already proposed improvements for these issues, which will be explained in detail in section 4. Possible Improvement.

3.4. Attack Vector #2: L1 Censorship

Previously, we mentioned that rollups inherit Ethereum’s safety. This means that if Ethereum’s safety is compromised, the rollup will also be affected.

There are two scenarios where Ethereum’s situation could compromise the safety of a rollup:

  1. Censorship of rollup fraud proof transactions by Ethereum validators
    If Ethereum validators or builders collude and submit a malicious output root in an optimistic rollup while censoring all transactions related to fraud proof, what would happen? The challenge could not be resolved within the designated period, the output would be finalized, and users’ funds could be at risk.
    This is referred to as weak censorship. In the case of optimistic rollups, if this censorship lasts beyond the defined period, typically 7 days, users’ funds may be at risk.
  2. Ethereum undergoing a 51% attack, leading to censorship of all fraud proof transactions
    This scenario involves two potential attack paths:
    • First, an entity could acquire over 2/3 of Ethereum’s total stake, allowing them to finalize blocks as they wish. In this case, the attacker could censor fraud proof transactions or even generate them at will.
    • The second method involves a participant who has acquired over 1/3 of Ethereum’s total stake carrying out a “stealth” attack. This is described in the research: Non-attributable Censorship Attack on Fraud Proof-Based Layer2 Protocols. In this case, an attacker with 1/3 of Ethereum’s stake could prevent the finalization of any blocks they dislike. If the attacker continues to vote on regular blocks while withholding votes on blocks containing fraud proof, they could finalize a malicious output root and steal funds from the optimistic rollup. This is called a Non-attributable Censorship Attack on fraud proof-based L2s. It is harder to detect than simply acquiring over 2/3 of the stake and controlling all Ethereum blocks.

These censorship-based attacks are difficult to counter at the rollup level because they occur at the Ethereum protocol layer and would require improvements to Ethereum itself. However, there are strategies that rollups can adopt in the meantime.

3.5. Solution #2: 7 Days of Withdrawal Delay and Semi-Automated 51% Attack Recovery

To address these attack vectors, optimistic rollups currently implement a 7-day withdrawal delay. The 7-day period was first proposed by Vitalik and is based on the idea that 7 days will be ‘enough’ for reacting to censorship attacks.

Let’s examine whether the 7-day challenge period in Optimistic Rollups is sufficient to resist censorship attacks by considering two types of censorship: weak and strong censorship attacks.

For the first, weak censorship, we can use probability calculations to see if the 7-day period gives Optimistic Rollups resistance to censorship attacks. This involves calculating the probability of successfully challenging a fraud when some validators are censoring the rollup’s challenge transactions.

Here, two considerations must be made:

  1. Multiple transactions must succeed for a challenge to be successful within 7 days.
    In most protocols, the challenge won’t succeed if only one transaction from an honest participant is included in the week. Therefore, we need to calculate the probability of including all necessary transactions to submit a fraud proof within the 7-day period.
  2. A realistic assumption must be made about what percentage of validators are involved in censorship.
    Currently, most Ethereum block builders, known to be centralized, are not censoring, and given the percentage of solo stakers on Ethereum, the chance that a majority (e.g., 99.9%) of validators will collude to perform censorship is close to zero.

(Censorship of major Ethereum block builders | source: Tweet of Justin Drake)

Taking these two points into account, if we assume that 99.5% of validators are engaging in censorship (still an extreme assumption) and calculate the probability of an honest participant successfully sending the 30 to 40 transactions required by challenge protocols like BoLD or OPFP, the probability of success approaches 100% in all cases. Additionally, resistance to censorship could be improved further with future solutions like inclusion lists or multiple concurrent proposers (e.g., BRAID, APS + FOCIL), reducing the risk of optimistic rollups losing user funds due to weak censorship.
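A rough model of this calculation (a sketch under simplifying assumptions: ~12-second slots, each non-censoring slot can include one pending challenge transaction, and censorship participation is independent per slot):

```python
import math

SLOTS_PER_WEEK = 7 * 24 * 3600 // 12   # ~50,400 Ethereum slots in the 7-day window
P_HONEST_SLOT = 0.005                   # assume 99.5% of proposers censor
TXS_NEEDED = 40                         # upper end of the 30-40 txs cited above

# The attack succeeds only if fewer than TXS_NEEDED slots in the whole week are
# proposed by non-censoring validators. Binomial tail, computed in log space.
def log_binom_pmf(k: int, n: int, p: float) -> float:
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))

log_terms = [log_binom_pmf(k, SLOTS_PER_WEEK, P_HONEST_SLOT) for k in range(TXS_NEEDED)]
m = max(log_terms)
log_p_fail = m + math.log(sum(math.exp(t - m) for t in log_terms))

print(f"expected non-censoring slots per week : {SLOTS_PER_WEEK * P_HONEST_SLOT:.0f}")
print(f"log10 P(challenge fails)              : {log_p_fail / math.log(10):.1f}")
# => roughly -62, i.e. the failure probability is astronomically small.
```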

Then will 7 days be enough in the situation of strong censorship? The 51% attack mentioned earlier can only be resolved through a social fork. The Non-attributable Censorship Attack is particularly challenging to detect and cannot be prevented using solutions designed for weak censorship, such as inclusion lists.

There is a proposal to develop a semi-automated 51% attack recovery tool in client software, based on a structure proposed by Vitalik. This censorship detection solution has been further developed by Ethereum researchers and consists of two steps:

  1. Light clients monitor the mempool and detect when certain transactions are not included in blocks for an extended period.
  2. If specific transactions remain in the mempool for a day without being included in a block, a “Do you agree with a social fork?” button is triggered, allowing the community to initiate a hard fork based on this consensus.

Let’s say this tool detects a 51% attack. The next step would be to move to a new chain through a social fork that invalidates the attacker’s funds.

In such a case, it’s crucial that the funds affected by the 51% attack remain locked until the social fork is executed. A similar situation occurred during The DAO hard fork, where the hacker’s funds were locked in a child DAO for 27 days before they could be withdrawn. The Ethereum community was able to conduct a hard fork within that period, preventing the hacker from cashing out the funds (see Vitalik’s Reddit post for more details).

In other words, even in the event of a 51% attack, funds need to remain locked until a social fork can be conducted. In this context, the 7-day withdrawal period in optimistic rollups serves as a buffer. If a social fork doesn’t occur within the week, user funds in optimistic rollups may be stolen, cashed out on centralized exchanges, or mixed via Tornado Cash, making it nearly impossible to return the funds to users even with a social fork.

To summarize, while the 7-day withdrawal period in optimistic rollups was originally proposed to account for weak censorship, in reality, weak censorship is unlikely to occur, and the 7-day period serves as a buffer in the event of strong censorship that requires a social fork.

From this perspective, there has been criticism that OPFP’s reduction of this period to 3.5 days makes it more vulnerable to attacks involving strong censorship. However, this criticism is unfounded. Since Optimism is still in Stage 1, guardians have a buffer to verify the state root’s correctness, and withdrawals can only occur after the additional 3.5-day Guardian Period has passed. Therefore, even if a strong censorship attack occurs, the attacker would still need to wait 7 days to withdraw. Additionally, the attacker would have to censor all challenge-related transactions for the entire week to succeed, as the guardians would also need to be censored to prevent them from halting the confirmation of a malicious output.

However, the key point remains that Ethereum must ensure it can process social forks within the 7-day period. This means that tools to detect 51% attacks must be ready and that there is sufficient research and simulation to determine whether a social fork can be implemented within 7 days. Only then can the 7-day withdrawal delay in optimistic rollups be considered an effective safeguard.

3.6. Attack Vector #3: Exploiting a Bug in the Fraud Proof System

Most challenge protocols work by having participants find a specific point (instruction or block) where they disagree and then generate proof showing that the other participant’s claim is incorrect. The virtual machine used to generate this proof is called the Fraud Proof VM, and the software used for proof generation on top of the VM is called a program. Each protocol uses different Fraud Proof VMs and programs, as shown below:

The goal of each Fraud Proof System is to prove that a specific execution result in the EVM was correct on-chain. But what happens if there’s a bug in this system, either in the VM or the program?

This question can be explored through the attack vector Yoav Weiss discovered in OVM. The attack was possible due to a vulnerability in OVM’s rollback feature, but the premise of creating a “fraudulent transaction” was crucial for the attack to be carried out. A fraudulent transaction is one that executes normally when processed on the rollup but produces a different result when executed in the challenge process using the Fraud Proof VM and program. Since the Fraud Proof System is supposed to generate the same result as the EVM, the ability to create a fraudulent transaction implies that there is a bug in the Fraud Proof System.

Yoav discovered several bugs in OVM’s Fraud Proof System and was able to simulate this attack by generating fraudulent transactions.

One simple example of the attacks he discovered was as follows: In OVM’s StateManager, the gas cost for the opcodes SSTORE and SLOAD (which store and read state) was incorrectly recorded. This meant that any transaction that stored or read a value in a contract (almost every transaction except simple ETH transfers) would be identified as a fraudulent transaction during the challenge process, even though it had executed correctly on the rollup.

In short, if there is a bug in the system, a state change that was correctly executed could be incorrectly flagged as invalid during a challenge, causing the output submitted by an honest participant to be marked as wrong.

This was one of the reasons OP Mainnet recently transitioned its fault proof system from a permissionless model to one where only authorized participants could join. After OPFP was applied to the mainnet, a security audit revealed several bugs in the Fraud Proof System (Cannon and op-program) and the Dispute Game challenge protocol. To prevent the system from being exploited, Optimism announced on August 17th that it would switch to a permissioned system.

Of course, exploiting a VM bug may not have a significant impact on rollups in Stage 0 or Stage 1, because the Security Council can intervene at any time to correct the outcome of a challenge.

This was a point previously argued by OP Labs. In fact, OP Labs shared its audit framework in the Optimism Forum, outlining its criteria for when external audits are necessary.

(OP Labs Audit Framework | Source: Optimism Forum)

In this framework, situations like the recent one fall into the fourth quadrant: “Fault Proofs with training wheels.” While these situations are chain-safety-related, they do not directly impact user funds and, therefore, are not included in the audit scope. This means that even if bugs are exploited, the Security Council can correct the results.

However, since vulnerabilities have been identified, they need to be addressed. Optimism fixed these issues in its Granite network upgrade, allowing OP Mainnet to return to Stage 1.

On the other hand, bugs in the system could be critical in Stage 2 rollups. In Stage 2, the Security Council can only intervene in the case of bugs that are provable onchain. Since proving that “the challenge result is wrong due to a system bug” on-chain is nearly impossible, if a bug occurs in a Stage 2 rollup, users’ funds could be at risk.

3.7. Solution #3: Multi Proofs

To prevent such issues, it is essential to conduct thorough audits before the code reaches production. However, Fraud Proof VMs and programs are complex software systems, and the more complex the system, the more likely bugs are to occur. Therefore, even with rigorous audits, bugs can still arise. We need to explore additional strategies beyond audits.

One approach is to use multiple proof systems within the same system. Instead of generating fraud proofs using a single system during a challenge, the system could simultaneously generate multiple fraud proofs using different VMs and programs, then compare the results. This would create a system that remains secure even in the event of a bug.

For example, imagine a multi-proof system using both Optimism’s Cannon and the asterisc ZK Fault Proof VM (which utilizes Risc-V). In the case of a challenge, the following would happen:

  1. If an incorrect output is detected, the challenger generates a challenge and initiates a bisection.
  2. Once a block of disagreement is found via bisection, two subgames occur simultaneously:
    • The subgame using the traditional OPFP method of Cannon.
    • A subgame using asterisc to generate a ZK Fault Proof.
  3. After both games are completed, the two different fraud proofs are verified.

If both proofs pass verification, the challenger wins; if both fail, the challenger loses. However, if one passes and the other fails, this indicates that an unexpected bug occurred in one of the VMs or programs during proof generation.

In such cases, entities like the Security Council would intervene to adjust the challenge result. This ensures that the system can remain free from bugs without violating the condition that “the Security Council can only intervene in cases of bugs that are provable onchain.”
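The resolution logic of such a multi-proof subgame could be expressed roughly like this (a sketch; the names and enum below are illustrative, not Optimism's actual interface):

```python
from enum import Enum

class Resolution(Enum):
    CHALLENGER_WINS = "challenger wins"
    CHALLENGER_LOSES = "challenger loses"
    ESCALATE = "proof systems disagree: escalate to the Security Council"

def resolve_multiproof(cannon_proof_valid: bool, asterisc_proof_valid: bool) -> Resolution:
    """Combine the outcomes of two independent fault proof systems.
    Agreement resolves the game; disagreement signals a bug in one system."""
    if cannon_proof_valid and asterisc_proof_valid:
        return Resolution.CHALLENGER_WINS
    if not cannon_proof_valid and not asterisc_proof_valid:
        return Resolution.CHALLENGER_LOSES
    return Resolution.ESCALATE

print(resolve_multiproof(True, True))    # Resolution.CHALLENGER_WINS
print(resolve_multiproof(True, False))   # Resolution.ESCALATE
```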

This is one of the ongoing efforts for Optimism to reach Stage 2. To support this, OPFP’s Dispute Game is designed modularly, allowing multiple fraud proof systems to be implemented freely, with a minimal interface defined to support this.

4. Possible Improvements

In previous sections, we explored the design of optimistic rollup protocols and the vulnerabilities that could arise in their challenge and fraud proof verification processes. This section discusses the issues and solutions for each protocol, along with future prospects for fraud proof systems and optimistic rollups.

4.1. Rooms for Improvement for Each Protocol

1) Arbitrum BoLD

BoLD has a sound economic challenge protocol because it limits the maximum protocol delay to one week and ensures protection from sybil attacks unless the attacker has more than 6.5 times the funds of the defender. However, BoLD presents two notable issues:

  1. The resource ratio of 6.5x gives too little advantage to the defender.
  2. The bond for submitting a root claim is 3,600 ETH, which is excessively large.

The first issue can be addressed with ZK technology. BoLD splits challenges into multiple levels to reduce the resources required for history commitment computation. Using ZK, this could be reduced to a single level.

This concept is similar to the suggestion of BoLD++ from Gabriel at Cartesi. When challenges are multi-leveled, increasing the resource ratio results in an exponential rise in bond size at the top level. However, when using a single level, the resource ratio can be increased more easily, making the protocol more resistant to sybil attacks.

The second issue, the 3,600 ETH bond, is more difficult to solve. BoLD’s bond size was set not only to address sybil attacks but also to deter delay attacks. The bond size is a function of TVL, and even with ZK, it cannot be reduced significantly. To mitigate this, BoLD is implementing a pooled bonding mechanism, allowing multiple participants to contribute to the bond.

2) Cartesi Dave

Dave effectively addresses sybil attacks with its tournament structure, but as mentioned earlier, it is vulnerable to delay attacks. The maximum delay time as a function of the number of malicious claims N_A and the number of challenge levels L is:

T_d = 7 x [log_2(1 + N_A)]^L (days)

If N_A = 7 and L = 3, the protocol can experience delays of up to 27 weeks (roughly six months), causing significant inconvenience and loss to users as withdrawals are delayed.

ZK can help mitigate this. By fixing the number of levels L to 1 (as in BoLD++), the maximum delay time can be reduced to:

T_d = 7 x log_2(1 + N_A) (days)

Cartesi is reportedly working on this improvement using RISC Zero's ZK technology. However, there are still concerns about whether this will be enough to prevent delay attacks entirely. If N_A = 7, the protocol could still face up to 2 weeks of additional delays, and the attacker's costs would be just 7 ETH in bonds, along with gas fees and off-chain history commitment costs. For chains with high TVL, this penalty might not be enough to deter delay attacks.

(Dave with BoLD style sub-challenges | Source: L2Beat Medium)

There is a suggestion for Dave to adopt BoLD-style challenges with 8 participants instead of conducting 1-on-1 matches in each round, similar to a traditional tournament. In this case, the delay time would be calculated as follows:

T_d = 7 x log_8(1 + N_A) (days)

Under this structure, an attacker would need to post at least 64 bonds to delay the challenge beyond two weeks, equating to a total bond requirement of 64 ETH, along with substantial onchain and offchain costs.
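The same back-of-the-envelope check for the 8-way variant (a sketch; only the 7-day round time and the 1 ETH bond per claim come from the text):

```python
def tournament_delay_days(num_malicious_claims: int, bracket_size: int = 8,
                          round_days: float = 7.0) -> float:
    """Delay when each tournament round groups `bracket_size` claims,
    computed with exact integer arithmetic."""
    participants = 1 + num_malicious_claims   # the honest claim plus malicious ones
    rounds, capacity = 0, 1
    while capacity < participants:
        capacity *= bracket_size
        rounds += 1
    return round_days * rounds

# Pushing the delay past two weeks (3+ rounds) requires at least 64 malicious
# claims, i.e. >= 64 ETH in bonds at 1 ETH per claim.
print(tournament_delay_days(63))   # 14.0 -> 64 participants still fit in two rounds
print(tournament_delay_days(64))   # 21.0 -> 65 participants force a third round
```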

However, this approach has the downside of weakening the defender's advantage in the case of a sybil attack: where the original Dave gives the defender an exponentially growing advantage over the attacker, a BoLD-style structure only guarantees that the defender is N times more advantaged.

In summary, Dave can effectively limit delay attack vectors through the use of ZK Fraud Proofs. While applying a structure like BoLD can improve resistance to delay attacks, it might lead to a trade-off where the defender’s advantage in the face of sybil attacks is reduced.

3) Optimism Fault Proof (OPFP)

OPFP had the drawback of being vulnerable to sybil attacks because the attacker and defender incurred equal costs. OP Labs proposed a solution to this problem in Dispute Game V2.

Unlike the original OPFP, where bonds were submitted with each bisection, Dispute Game V2 requires participants to post bonds only at the start of the bisection. Additionally, Dispute Game V2 introduces dissection, allowing participants to submit multiple claims simultaneously at branch points, reducing the number of interactions in most cases.

(Branch Claim at Dispute Game V2 | Source: Optimism Specs GitHub)

In the previous OPFP, the sybil attack vectors were:

  1. Creating numerous challenges to exhaust the defender’s funds on bonds and gas fees.
  2. Continuously submitting fraudulent outputs, forcing the defender to respond and drain their resources.

The introduction of branch claims addresses both vectors. First, the honest participant doesn’t need to post additional bonds during dissection, while malicious challengers must do so for every new challenge they create. This makes mass challenge creation unsustainable for attackers if bond amounts are appropriately set.

Second, as bonds are larger at higher levels in Dispute Game V2, continuously submitting fraudulent outputs becomes costlier for attackers than for defenders.

Thus, OPFP can effectively counter sybil attacks with the branch claims introduced in Dispute Game V2.

4) Kroma ZK Fault Proof (Kroma ZKFP)

Kroma ZKFP faces the dual challenges of vulnerability to sybil attacks and an imperfect proof system. The following two issues must be resolved for Kroma ZKFP to progress to Stage 1:

  1. The dissection starting point is based on the last submitted output, not the last finalized output.
  2. Validators do not verify whether challenges are based on correct batch data.

Kroma plans to switch from Scroll’s Halo2 zkEVM to Succinct SP1 zkVM, addressing these two issues and advancing to Stage 1.

Kroma is expected to modify its challenge process to align with Optimism’s Dispute Game interface. This adjustment is detailed in Kroma’s spec, and it will allow the dissection starting point to move to the last finalized output from one week ago, resolving the first issue.

For the second issue, Kroma will use ZK based trustless derivation. Here’s how it works:

(Trustless Derivation using ZK | Source: Lightscale Notion)

Imagine that we want to prove that a specific L2 block T was correctly executed. Before generating a ZK proof, we must verify that the transaction data for block T was properly constructed based on the L1 batch data.

Here, Kroma intends to verify via ZK that the batch data was correctly retrieved from L1. If the data is simply fetched through a trusted RPC outside of the ZK program, there is no way to confirm whether the batch data has been tampered with. Instead, the program generates a ZK proof of the connectivity of block hashes from block O (the L1 origin of L2 block T) to block C (the L1 block at the time the challenge was created), proving that it accessed the correct block and fetched the batch data from it. If a challenger constructed L2 block T based on incorrect batch data, the hash of the L1 block from which they retrieved the batch would differ from the hash of the L1 block that actually contains the batch for T, and it would not be connected to block C. Therefore, as long as there is no hash collision, verifying the connectivity of L1 blocks through ZK proves that the challenger constructed the L2 block from the correct batch data.
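The core of the connectivity check, stripped of ZK and RLP details, looks roughly like this (a sketch; the real check runs inside a zkVM over actual Ethereum headers, and the header structure and hash function below are simplified stand-ins):

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class Header:
    parent_hash: bytes
    number: int
    payload: bytes                  # stand-in for the remaining header fields

    def hash(self) -> bytes:
        # Simplified: real Ethereum headers are keccak256 over the RLP encoding.
        return hashlib.sha256(
            self.parent_hash + self.number.to_bytes(8, "big") + self.payload
        ).digest()

def is_connected(headers: list[Header], trusted_tip_hash: bytes) -> bool:
    """Check that headers form an unbroken parent-hash chain ending at a trusted
    block hash (block C). If the chain from block O (the claimed L1 origin of the
    batch) to C verifies, the batch was read from the canonical L1 chain,
    absent hash collisions."""
    for parent, child in zip(headers, headers[1:]):
        if child.parent_hash != parent.hash():
            return False
    return headers[-1].hash() == trusted_tip_hash

# Toy chain: block O -> ... -> block C
o = Header(parent_hash=b"\x00" * 32, number=100, payload=b"batch for block T")
mid = Header(parent_hash=o.hash(), number=101, payload=b"")
c = Header(parent_hash=mid.hash(), number=102, payload=b"")
print(is_connected([o, mid, c], trusted_tip_hash=c.hash()))   # True
```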

With these improvements, Kroma ZKFP can possibly move to Stage 1. However, to reach Stage 2, Kroma will need additional solutions to protect against sybil attacks, including changing the challenge protocol to All-vs-All and redesigning the bond mechanism.

4.2. Summary

5. Future of Fraud Proof

5.1. Stage 2 Rollup - Your Funds are SAFU

As described above, Optimistic Rollups are moving towards Stage 2. Arbitrum is aiming to achieve Stage 2 based on BoLD. The implementation of BoLD has already been posted on the Governance Forum and has garnered significant support, with its implementation currently deployed on the testnet. If no major security issues are found, Arbitrum is likely to achieve Stage 2 through BoLD by the end of this year.

Optimism is also working hard to achieve Stage 2. For OP Mainnet to reach Stage 2, Dispute Game V2 must be completed, and there need to be multiple proof mechanisms for multi-proof. Although the specification is still in progress, Dispute Game V2 effectively addresses the weaknesses of the existing OPFP by providing strong protection against sybil attacks, bringing it closer to Stage 2. Additionally, multiple proofs are being actively developed, with various teams including OP Labs, Succinct, Kroma, and Kakarot dedicating significant R&D resources to create diverse ways to prove OP Stack. Therefore, Optimism is also expected to aim for Stage 2 by the first half of next year, barring any major issues.

The transition of these two rollups to Stage 2 could significantly impact the rollup ecosystem. Both Arbitrum and Optimism have their own rollup frameworks, Arbitrum Orbit and OP Stack, respectively. Their transition to Stage 2 means that all rollups using these frameworks could also transition to Stage 2.

Thus, starting from the end of this year to next year, major rollups with large user bases, such as Arbitrum, OP Mainnet, and Base, are expected to transition to Stage 2, inheriting the full security of Ethereum. This will likely silence criticisms such as “Rollups are just multisigs” or “Rollups can take your funds anytime.”

5.2. ZK Fraud Proof is the Future

Most of the protocols discussed would benefit from implementing ZK Fraud Proof. For example, applying ZK to Arbitrum BoLD could increase the resource ratio, making it more resilient to sybil attacks, and Cartesi Dave could reduce its vulnerability to delay attacks. OPFP is also investing R&D into ZK for multi-proof systems, which could reduce bond amounts and improve protocol security.

It’s important to note that ZK Fraud Proof does more than just reduce the number of interactions between validators. Fewer interactions mean significantly fewer resources for validators, which allows bond amounts to be reduced, enabling more participants to join the protocol. Additionally, this reduces the maximum possible delay, improving overall protocol security.

In this way, ZK Fraud Proof plays a critical role in both the security and decentralization of optimistic rollups.

5.3. How about ZK Rollup? Will Fraud Proof Diminish?

At this point, some readers might ask:

If fraud proof and challenge mechanisms are so complex, wouldn’t ZK Rollups be a better option?

To a certain extent, this is true. In ZK Rollups, achieving Stage 2 doesn’t require complex economic considerations, users’ funds aren’t at risk of being stolen in the event of L1 censorship, and users can withdraw funds within a matter of hours.

The transition from optimistic rollups to ZK rollups might happen sooner than expected. This is because the main drawbacks of ZK rollups—high proof generation costs and time—are rapidly improving. Recently, Succinct Labs introduced OP Succinct, a ZK version of OP Stack, offering a framework to easily launch ZK rollups based on the OP Stack.

(Introducing OP Succinct | Source: Succinct Blog)

However, there are still a few considerations. The first is cost. The cost for generating a block proof in OP Succinct is known to be around $0.005-$0.01, and the monthly cost of running a prover is estimated to be between $6,480 and $12,960. If the chain has a high TPS, these costs could increase further.

(Benchmark of proving cost in various networks | Source: Succinct Blog)

For example, the average proof generation cost per block on Base with OP Succinct is about $0.62, which works out to roughly $803,520 per month. This is an additional cost that did not arise with optimistic rollups, and even if ZK costs decrease, the operational costs of ZK rollups will always be higher than those of optimistic rollups.
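The monthly figure follows from Base's ~2-second block time (a quick sketch of the arithmetic; only the $0.62 per-block cost comes from the benchmark cited above):

```python
# Monthly proving cost for a Base-like chain at the benchmarked per-block cost.
COST_PER_BLOCK_USD = 0.62
BLOCK_TIME_SECONDS = 2                 # Base / OP Stack block time
SECONDS_PER_MONTH = 30 * 24 * 3600

blocks_per_month = SECONDS_PER_MONTH // BLOCK_TIME_SECONDS
monthly_cost = blocks_per_month * COST_PER_BLOCK_USD

print(blocks_per_month)                # 1,296,000 blocks per month
print(f"${monthly_cost:,.0f}")         # $803,520
```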

The second consideration is how it affects decentralization. Validators in ZK rollups need to run provers, which is more difficult and expensive than running fraud proof programs in optimistic rollups. Also, due to the slower proof generation times in ZK systems, users can’t verify transactions in real-time. While higher hardware specs can improve proof generation speeds to match transaction execution, this means that running a prover requires high-spec computing environments. Ideally, anyone should be able to run a node to ensure the chain’s safety, but ZK hasn’t reached that level yet.

Lastly, ZK rollups are based on highly complex mathematics and cryptography, and this complexity surpasses that of fraud proof and challenge protocols. Thus, ZK systems require extensive testing before they can be safely used in production.

Arbitrum is pursuing a hybrid protocol that combines ZK and optimistic methods as its endgame. The protocol would primarily operate as an optimistic rollup, generating ZK proofs only when fast withdrawals are needed. This would be useful for scenarios requiring rapid fund rebalancing between chains, such as exchanges or bridges, or for enabling interoperability between chains.

In conclusion, the optimistic rollup approach appears to remain valid for now, with ZK adopted as part of a hybrid approach. But as ZK proof generation costs and speeds continue to improve, more optimistic rollups might seriously consider transitioning to ZK in the future.

5.4. Are Fraud Proofs Only for Rollups?

We have looked into Ethereum’s optimistic rollups and their fraud proof mechanisms. What are some other use cases for this fraud proof?

  1. Restaking Protocols

Fraud proofs can be actively utilized in restaking protocols. Let’s explore this with the example of Eigenlayer, a representative restaking service on Ethereum.

Eigenlayer is a service that allows Ethereum’s security to be rented out through restaking. Operators in Eigenlayer can deposit ETH or LST from users based on a delegation contract within Eigenlayer, and participate in validation by opting into multiple AVSs (Actively Validated Services). Through Eigenlayer, protocols can easily build AVSs and reduce the cost of bootstrapping initial validators.

Like any other blockchain, AVSs reward operators for successful validation and must slash them when they act maliciously. This is where fraud proofs can be used in the slashing process.

(Slashing Example of an AVS | Source: Eigenlayer GitHub)

For example, consider a bridge AVS. The premise of the bridge AVS is that it must properly transfer users’ funds to the target chain, and any operator who maliciously manipulates transactions should be slashed. If such manipulation occurs, a challenger who discovers the misconduct can create a challenge with a fraud proof in the Dispute Resolution contract, asserting that the operator has incorrectly performed the bridging. If the fraud proof is deemed valid, the AVS can call the slasher contract in Eigenlayer to halt any rewards for the operator.
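Conceptually, the flow looks like the sketch below. This is a simplified Python illustration, not Eigenlayer's actual contract interfaces: the names DisputeResolver, verify_fraud_proof, and freeze_operator are hypothetical, and as noted next, slashing itself was not yet live at the time of writing.

```python
# Conceptual sketch (Python, not Solidity) of how a bridge AVS could wire fraud
# proofs into slashing. All names here are hypothetical; Eigenlayer's actual
# slashing interfaces may differ.

from dataclasses import dataclass

@dataclass
class FraudProof:
    operator: str            # operator accused of mis-bridging
    claimed_transfer: bytes  # what the operator attested to
    actual_transfer: bytes   # what the source-chain data implies

class Slasher:
    """Stand-in for an Eigenlayer-style slasher contract the AVS can call into."""
    def __init__(self):
        self.frozen = set()

    def freeze_operator(self, operator: str) -> None:
        # Freezing halts reward accrual and enables later slashing of the stake.
        self.frozen.add(operator)

class DisputeResolver:
    """AVS-side dispute logic: accept a challenge and slash if the proof holds."""
    def __init__(self, slasher: Slasher):
        self.slasher = slasher

    def verify_fraud_proof(self, proof: FraudProof) -> bool:
        # Real logic would re-derive the correct transfer from source-chain data
        # and compare it to what the operator signed off on.
        return proof.claimed_transfer != proof.actual_transfer

    def challenge(self, proof: FraudProof) -> bool:
        if self.verify_fraud_proof(proof):
            self.slasher.freeze_operator(proof.operator)
            return True
        return False

# Usage: a challenger who spotted a manipulated bridge transfer opens a dispute.
slasher = Slasher()
resolver = DisputeResolver(slasher)
proof = FraudProof("operator-1",
                   claimed_transfer=b"send 10 ETH to 0xEvil",
                   actual_transfer=b"send 10 ETH to 0xAlice")
assert resolver.challenge(proof) and "operator-1" in slasher.frozen
```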

Although this slashing feature has not yet been implemented in Eigenlayer, they recently announced the Shared Security Model, with slashing included in the next release. This will enable the use of fraud proofs for slashing.

  2. Data Availability Layer

Fraud proof can also be used in the Data Availability (DA) layer. A representative example of this is the fraud proof proposed and implemented by Celestia. Celestia has a technology that allows light nodes to verify whether data is stored correctly based on Data Availability Sampling. Let’s take a closer look at this.

A light client should be able to verify that a block has been agreed upon by a supermajority (more than two-thirds) of the validators without downloading all of the blockchain’s data. However, it is difficult for light clients to verify every validator’s signature for each block, and as the number of validators grows, this becomes practically impossible.

This is where Celestia presents an interesting concept. Even if the majority of validators are malicious, Celestia proposes a method by which a single honest full node can tell light clients to reject a faulty block. Light clients do not need to blindly trust that full node: the fraud proof it produces is what guarantees the claim is honest.

There are two types of fraud proofs in Celestia:

  • Fraud proofs for data
  • Fraud proofs for state transition

First, fraud proofs for data work as follows: Celestia allows light nodes to verify that validators are holding the correct data without directly downloading all the data within a block. To achieve this, Celestia uses a technology called Data Availability Sampling (DAS).

(Data Availability Sampling at Celestia | Source: Celestia Docs)

Celestia’s validators arrange transaction data into a k x k matrix and then extend it to a 2k x 2k matrix using 2D Reed-Solomon encoding. They then compute 4k Merkle roots in total, one for each row and each column, and a further hash over these Merkle roots is included in the block header and propagated.

With just the Merkle root information in the block header, light nodes can verify that Celestia’s validators are holding the correct data. Light nodes request the data at random points in the 2k x 2k matrix from validators, along with Merkle proofs against the corresponding row and column roots. If these samples verify against the commitments in the block header, the validators can be trusted to be holding the correct data.
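The sampling check itself is mechanically simple. The sketch below illustrates it in Python under heavy simplifications: the Reed-Solomon extension is replaced by pre-filled cells, and the commitments are plain SHA-256 Merkle trees rather than Celestia's namespaced Merkle trees, so this only shows the shape of the flow.

```python
# Simplified sketch of Celestia-style data availability sampling (DAS).
# Not Celestia's actual data structures: plain SHA-256 Merkle trees, no
# erasure coding, no namespaces.

import hashlib, random

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves: list[bytes]) -> bytes:
    nodes = [h(leaf) for leaf in leaves]
    while len(nodes) > 1:
        if len(nodes) % 2:
            nodes.append(nodes[-1])
        nodes = [h(nodes[i] + nodes[i + 1]) for i in range(0, len(nodes), 2)]
    return nodes[0]

def merkle_proof(leaves: list[bytes], index: int) -> list[tuple[bytes, bool]]:
    """Sibling hashes from leaf to root; bool = sibling sits on the right."""
    nodes = [h(leaf) for leaf in leaves]
    proof, i = [], index
    while len(nodes) > 1:
        if len(nodes) % 2:
            nodes.append(nodes[-1])
        sib = i + 1 if i % 2 == 0 else i - 1
        proof.append((nodes[sib], sib > i))
        nodes = [h(nodes[j] + nodes[j + 1]) for j in range(0, len(nodes), 2)]
        i //= 2
    return proof

def verify_proof(leaf: bytes, proof, root: bytes) -> bool:
    acc = h(leaf)
    for sibling, is_right in proof:
        acc = h(acc + sibling) if is_right else h(sibling + acc)
    return acc == root

# --- Block producer side ----------------------------------------------------
k = 4
matrix = [[f"cell-{r}-{c}".encode() for c in range(2 * k)] for r in range(2 * k)]
row_roots = [merkle_root(row) for row in matrix]
col_roots = [merkle_root([matrix[r][c] for r in range(2 * k)]) for c in range(2 * k)]
header_commitment = merkle_root(row_roots + col_roots)  # committed in the header

# --- Light node side --------------------------------------------------------
r, c = random.randrange(2 * k), random.randrange(2 * k)
cell = matrix[r][c]                                     # returned by a full node
row_proof = merkle_proof(matrix[r], c)
col_proof = merkle_proof([matrix[i][c] for i in range(2 * k)], r)

assert verify_proof(cell, row_proof, row_roots[r])
assert verify_proof(cell, col_proof, col_roots[c])
print("sampled cell is consistent with the committed row and column roots")
```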

However, one important consideration arises: what if a validator performs the Reed-Solomon encoding incorrectly or maliciously? Celestia addresses this issue with a “bad-encoding fraud proof.”

If a Celestia full node discovers during block recovery that encoding has been done incorrectly, it generates a fraud proof containing the block height, the incorrectly encoded section, and proof of the mistake, which is then propagated to light nodes. The light nodes verify the proof to confirm that the data was indeed encoded incorrectly, allowing them to stop using the faulty data.
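The shape of such a proof and the light node's check can be sketched as follows. The field names and the reencode hook are illustrative, not Celestia's actual wire format, and the commitment is a plain hash for brevity.

```python
# Sketch of a bad-encoding fraud proof and the light-node check.
# Field names and the `reencode` hook are illustrative, not Celestia's format.

import hashlib
from dataclasses import dataclass
from typing import Callable

@dataclass
class BadEncodingFraudProof:
    block_height: int
    axis: str            # "row" or "column" that was mis-encoded
    index: int           # which row/column in the extended matrix
    shares: list[bytes]  # enough original shares to re-run the encoding

def commit(shares: list[bytes]) -> bytes:
    # Stand-in for the row/column Merkle root committed in the block header.
    return hashlib.sha256(b"".join(shares)).digest()

def light_node_check(proof: BadEncodingFraudProof,
                     committed_root: bytes,
                     reencode: Callable[[list[bytes]], list[bytes]]) -> bool:
    """Re-encode the shares from the proof and compare with the header commitment.

    If the recomputed commitment differs, the validator really did encode the
    data incorrectly, so the light node should reject the block.
    """
    return commit(reencode(proof.shares)) != committed_root
```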

In addition, Celestia also proposes a fraud proof mechanism for state transitions.

(Architecture of a block in Celestia | Source: Contribution DAO Blog)

Celestia’s blocks are structured to include trace data for transactions at various intervals. This allows full nodes to easily build fraud proofs, and light nodes can detect incorrect state transitions without executing the entire block. However, due to complexity issues, this mechanism has not yet been implemented on the Celestia mainnet.

In summary, fraud proof in the DA layer can play a role in filtering out incorrect data and state transitions without relying on consensus.

  3. Machine Learning

AI and blockchain were hot topics in 2024, and much R&D has been conducted in this area. One of the most notable aspects is the combination of blockchain and machine learning.

The primary reasons for applying machine learning to blockchain are as follows:

  • Data reliability: Blockchain manages data in a decentralized manner, with all transactions recorded openly and transparently. If a machine learning model learns from blockchain data, the data is from a reliable source, reducing the possibility of tampering.
  • Transparency and verifiability of models: When a machine learning model is executed on a blockchain, the model’s updates and results are recorded on-chain, making them verifiable. This prevents manipulation or bias in results that could occur in centralized environments.

The critical factor here is verifying that the machine learning computation was performed correctly. However, machine learning workloads are highly intensive, making it nearly impossible to execute them entirely within blockchain runtimes. Therefore, frameworks like opML and zkML have emerged to verify machine learning computation efficiently in a blockchain environment. opML takes an optimistic approach: the results of model training or inference are recorded on the blockchain, and errors are corrected through a challenge mechanism.

Let’s take a closer look at the approach proposed by ORA, a project providing AI infrastructure on the blockchain. The opML challenge process is very similar to rollup challenges and is composed of the following three key components:

  • Fraud Proof VM: This VM executes machine learning inference and functions similarly to Arbitrum’s WAVM or Optimism’s Cannon.
  • opML smart contract: This contract verifies fraud proofs, playing a role similar to Optimism’s MIPS.sol contract.
  • Verification game: The verifier who issued the challenge interacts with the server through bisection to identify the single incorrect step within the VM, then generates a fraud proof for that step and submits it to the opML contract (a minimal sketch of this bisection follows the figure below).

(Verification Game on ORA opML | Source: ORA Docs)
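To make the verification game concrete, here is a minimal sketch of the bisection, assuming both parties commit to a hash of the VM state after every inference step. It is a generic illustration of the search for the first disputed step, not ORA's exact on-chain protocol.

```python
# Minimal sketch of the bisection ("verification game") used to pin down a
# single disputed step. Both sides hold hashes of the VM state after every
# step; binary search finds the first index where the honest verifier's hash
# diverges from the server's claim.

def bisect_first_divergence(server_hashes: list[bytes],
                            verifier_hashes: list[bytes]) -> int:
    """Return the index of the first step whose post-state hashes differ.

    Assumes both traces start from the same agreed pre-state (index 0) and
    disagree on the final state, so a first divergence must exist.
    """
    lo, hi = 0, len(server_hashes) - 1          # agree at lo, disagree at hi
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if server_hashes[mid] == verifier_hashes[mid]:
            lo = mid                             # still agree up to mid
        else:
            hi = mid                             # divergence is at or before mid
    return hi                                    # the single step the parties dispute

# The verifier then builds a fraud proof for just this one step (pre-state,
# instruction, claimed post-state) and submits it to the opML contract, which
# re-executes that single step to settle the dispute.
```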

Through this fraud proof mechanism, opML leverages the security and trustworthiness of blockchain while providing a cost-effective environment for machine learning model training and verification.

6. Conclusion

Optimistic rollups are investing significant effort into improving fraud proofs and challenge protocols to inherit more of Ethereum’s security and create a more trust-minimized chain. Arbitrum is expected to reach Stage 2 by the end of this year through BoLD, and Optimism is also working towards Stage 2, relying on Dispute Game V2 and multi-proof mechanisms. By next year, users of optimistic rollups will be able to interact with the network with greater security, without worrying that “the rollup could take their funds.” Additionally, the number of Stage 1+ rollups that Vitalik could mention in his blog is expected to increase.

However, there is still room for improvement in each protocol, and much of it can be enhanced through ZK Fraud Proofs. Kroma is already advancing its protocol based on this, and other protocols such as Arbitrum, Optimism, and Cartesi can maintain a safer and more decentralized approach with ZK Fraud Proofs.

Fraud proofs are an area where not only rollups but also other protocols are investing substantial R&D resources. Based on the premise that “only one honest participant is needed,” fraud proofs, together with ZK, can contribute to building a trust-minimized architecture across the entire blockchain, and their impact will gradually become something we can experience firsthand.

7. References

L2Beat

Fraud Proof Wars | Luca Donnoh at L2Beat

Arbitrum Docs

Optimism Docs

Optimism Specs

Permissionless Refereed Tournaments | Cartesi

Kroma Specs

BoLD: Fast and Cheap Dispute Resolution

Economics of BoLD

Why is the Optimistic Rollup challenge period 7 days? | Kelvin Fichter at OP Labs

Fraud Proofs Are Broken | Gabriel Coutinho de Paula at Cartesi

Optimistic Time Travel | Yoav Weiss

About the First Successful Challenge on Kroma Mainnet

Unpacking progress in baseline decentralization | OP Labs

Non-attributable censorship attack on fraud-proof-based Layer2 protocols

OP Labs Audit Framework

Trustless Derivation | Kroma

Introducing OP Succinct: Full Validity Proving on the OP Stack | Succinct

Eigenlayer GitHub

Celestia Docs

Contribution DAO Blog

ORA Docs
