Analysis of Bitcoin Layer 2 Scaling Technology: Validity Proof and Fraud Proof

Advanced28.13
Get an in-depth understanding of the Layer 2 expansion plan in the Bitcoin network, especially the validity proof and fraud proof technology. This article analyzes how to achieve Layer 2 expansion through technological innovation under the strict restrictions of Bitcoin, including Bit Commitment, Taproot, and Connector Output. and contracts, etc.
Analysis of Bitcoin Layer 2 Scaling Technology: Validity Proof and Fraud Proof

1 Introduction

For an algorithm f, two mutually distrustful participants, Alice and Bob, can establish trust in the following ways:

  1. Alice inputs x, runs algorithm f, and obtains result y. Bob also runs algorithm f with the same input x, resulting in y′. If y = y′, then Bob acknowledges Alice’s provided input x and output y. This is a special validity proof mechanism commonly used in blockchain consensus, where Alice is the node packaging the block and Bob is the node participating in the consensus.
  2. Alice inputs x, runs the zk.prove program on algorithm f, and obtains result y and proof proof. Bob runs the zk.verify program based on f, y, and proof. If the result is true, then Bob acknowledges Alice’s result y; if the result is false, then Bob does not acknowledge Alice’s result y. This is a validity proof, where Bob can be an on-chain contract.
  3. Alice inputs x, runs algorithm f, and obtains result y. Bob also runs algorithm f with the same input x, resulting in y′. If y = y′, then nothing is done; if y ≠ y′, then Bob challenges Alice, with the challenged program being f. The number of interactions between Alice and Bob can be one or multiple. Arbitration is achieved through the challenge-response process. This is a fraud proof, where Bob is the challenger, listening off-chain and challenging on-chain.
  4. Alice inputs x, runs the zk.prove program on algorithm f, and obtains result y and proof proof. Bob runs the zk.verify program based on f, y, and proof. If the result is true, then nothing is done; if the result is false, then Bob challenges Alice, with the challenged program being zk.verify. The number of interactions between Alice and Bob can be one or multiple. Arbitration is achieved through the challenge-response process. This is a fraud proof, where Bob is the challenger, listening off-chain and challenging on-chain.

For the above 2, 3, and 4, let x be Layer2 transactions and the initial state, f be the Layer2 consensus program, and y be the transaction end state, corresponding to the blockchain Layer2 scaling solution. Among them:

  • Validity Proof: Based on a pessimistic assumption, it must be proven valid before acceptance, and it takes effect immediately. In a validity proof, evidence of correct L2 state transitions must be provided, reflecting a pessimistic view of the world—only when a state is proven correct will it be accepted.
  • Fraud Proof: Based on an optimistic assumption, it is accepted by default unless someone proves it to be incorrect. It has a challenge window period, which only takes effect after the challenge window period. In a fraud proof, evidence of incorrect L2 state transitions must be provided, reflecting an optimistic view of the world—a state transition is assumed to be correct unless proven otherwise.


Table 1: Ways to Establish Trust

Additionally, it is important to note:

  • The key to distinguishing between fraud proofs and validity proofs is not whether ZK proof systems like SNARK/STARK are used. The ZK proof system expresses the method of proof used, while fraud or validity represents the content being proven. This is why scenario 1 in Table 1 is said to represent a validity proof.
  • Validity proofs have better timeliness, but the on-chain verification complexity is relatively high; fraud proofs have lower on-chain verification complexity, but their timeliness is relatively poor.
  • For cases 2 and 4 in Table 1, using ZK recursion and combination techniques, multiple f can be computed and compressed, significantly reducing the verification cost of off-chain computation on-chain.

Currently, benefiting from the Turing-complete smart contracts of Solidity, fraud proofs and validity proof technologies are widely used in Ethereum Layer2 scaling. However, under the Bitcoin paradigm, limited by Bitcoin’s limited opcode functionality, 1000 stack elements, and other restrictions, the application of these technologies is still in the exploratory stage. This article summarizes the limitations and breakthrough technologies under the Bitcoin paradigm in the context of Bitcoin Layer2 scaling, studies validity proof and fraud proof technologies, and outlines the unique script segmentation technology under the Bitcoin paradigm.

2 Limitations and Breakthroughs under the Bitcoin Paradigm

There are many limitations under the Bitcoin paradigm, but various clever methods or techniques can be used to overcome these limitations. For example, bit commitment can break through the UTXO stateless limitation, taproot can break through script space limitations, connector output can break through UTXO spending method limitations, and contracts can break through pre-signature limitations.

2.1 UTXO Model and Script Limitations

Bitcoin adopts the UTXO model, where each UTXO is locked in a locking script that defines the conditions that must be met to spend that UTXO. Bitcoin scripts have the following limitations:

  1. Bitcoin scripts are stateless;
  2. For P2TR output types, the maximum number of opcodes that can be accommodated in a single transaction is about 4 million, filling the entire block, while for other output types, there are only 10,000 opcodes;
  3. The spending methods of a single UTXO are limited, lacking exploration of combined spending methods;
  4. Flexible contract functions are not supported;
  5. The stack size is limited to a maximum of 1000 elements (altstack + stack), and the maximum size of a single element is 520 bytes;
  6. Arithmetic operations (such as addition and subtraction) are limited to 4-byte elements. They cannot be modified to long elements, such as 20 bytes or larger, which are necessary for cryptographic operations;
  7. Opcodes like OPMUL and OPCAT have been disabled, and simulating them with existing opcodes incurs extremely high costs, such as simulating a one-round BLAKE3 hash, with a script size of about 75K.

2.2 Bit Commitment: Breaking Through the UTXO Stateless Limitation

Currently, Bitcoin scripts are completely stateless. When executing a Bitcoin script, its execution environment is reset after each script. This leads to the inability of Bitcoin scripts to natively support constraint scripts 1 and 2 having the same x value. However, this issue can be circumvented through some methods, with the core idea being to sign a value in some way. If a signature can be created for a value, it is possible to achieve stateful Bitcoin scripts. That is, by checking the signature of the x value in scripts 1 and 2, it can be enforced that the same x value is used in both scripts. If there is a conflicting signature, meaning two different values are signed for the same variable x, a penalty can be applied. This solution is known as bit commitment.

The principle of bit commitment is relatively simple. A bit refers to setting two different hash values, hash0 and hash1, for each bit in the message to be signed. If the bit value to be signed is 0, the preimage0 of hash0 is revealed; if the bit value to be signed is 1, the preimage1 of hash1 is revealed.

Taking a single bit message m ∈{0,1} as an example, the corresponding bit commitment unlock script is just some preimages: if the bit value is 0, the corresponding unlock script is preimage0—“0xfa7fa5b1dea37d71a0b841967f6a3b119dbea140”; if the bit value is 1, the corresponding unlock script is preimage1—“0x47c31e611a3bd2f3a7a42207613046703fa27496”. Therefore, with bit commitment, it is possible to break through the UTXO stateless limitation and achieve stateful Bitcoin scripts, making various interesting new features possible.

OP_HASH160

OP_DUP

0xf592e757267b7f307324f1e78b34472f8b6f46f3> // This is hash1

OP_EQUAL

OP_DUP

OP_ROT

0x100b9f19ebd537fdc371fa1367d7ccc802dc2524> // This is hash0

OP_EQUAL

OP_BOOLOR

OP_VERIFY

// Now the value of the bit commitment is on the stack. Either “0” or “1”.

Currently, there are 2 implementations of bit commitment:

  • Lamport One-Time Signature: The principle is relatively simple and only requires the use of hash functions, making it Bitcoin-friendly. For each bit in the message, two hash values need to be committed, resulting in relatively large signature data. In other words, for a message of length v bits, the public key length is 2v bits, and the signature length is v bits.
  • Winternitz One-Time Signature: Compared to Lamport signatures, it significantly reduces the lengths of signatures and public keys but increases the complexity of signing and verification. This scheme allows for flexible setting of different hash chain lengths d, thus balancing length and complexity. For example, setting d = 15 results in both the public key length and signature length being about 4 times shorter, but the verification complexity will increase by 4 times. This is essentially a trade-off between Bitcoin’s stack space and script size. Lamport signatures can be seen as a special case of Winternitz signatures when d = 1.

Currently, the BitVM2 library implements Winternitz signatures based on the Blake3 hash function. The signature length corresponding to a single bit is about 26 bytes. Therefore, it can be seen that introducing state through bit commitment is costly. Thus, in the BitVM2 implementation, the message is first hashed to obtain a 256-bit hash value, and then bit commitment is performed on the hash value to save overhead, rather than committing to each bit of the original longer message directly.

2.3 Taproot: Breaking Through Script Space Limitations

The Bitcoin Taproot soft fork upgrade, activated in November 2021, includes three proposals: Schnorr signatures (BIP 340), Taproot (BIP 341), and TapScript (BIP 342). It introduces a new transaction type—Pay-to-Taproot (P2TR) transactions. P2TR transactions can create a more private, flexible, and scalable transaction format by combining the advantages of Taproot, MAST (Merkel Abstract Syntax Tree), and Schnorr signatures.

P2TR supports two spending methods: spending according to the key path or the script path.

According to the provisions in Taproot (BIP 341), when spending via the script path, the corresponding Merkle path cannot exceed a maximum length of 128. This means that the number of tapleafs in the taptree cannot exceed 2^128. Since the SegWit upgrade in 2017, the Bitcoin network measures block size in weight units, with a maximum support of 4 million weight units (approximately 4MB). When a P2TR output is spent via the script path, it only needs to reveal a single tapleaf script, meaning that the block is packed with a single tapleaf script. This implies that for P2TR transactions, the script size corresponding to each tapleaf can be a maximum of about 4MB. However, under Bitcoin’s default policy, many nodes only forward transactions smaller than 400K; larger transactions need to collaborate with miners to be packed.

The script space premium brought by Taproot makes it more valuable to simulate cryptographic operations like multiplication and hashing using existing opcodes.

When building verifiable computation based on P2TR, the corresponding script size is no longer limited to the 4MB constraint, but can be split into multiple sub-functions distributed across multiple tapleafs, thus breaking through the 4MB script space limitation. In fact, the Groth16 verifier algorithm implemented in the current BitVM2 has a size of up to 2GB. However, it can be split and distributed across about 1000 tapleafs, and by combining it with bit commitment, the consistency between the inputs and outputs of each sub-function can be constrained, ensuring the integrity and correctness of the entire computation.

2.4 Connector Output: Breaking Through UTXO Spending Method Limitations

Currently, Bitcoin provides two native spending methods for a single UTXO: spending by script or by public key. Therefore, as long as the correct public key signature or script conditions are met, the UTXO can be spent. Two UTXOs can be spent independently, and no restrictions can be added to constrain the two UTXOs, meaning that additional conditions must be met for them to be spent.

However, Burak, the founder of the Ark protocol, cleverly utilized the SIGHASH flag to achieve connector output. Specifically, Alice can create a signature to send her BTC to Bob. However, since the signature can commit to multiple inputs, Alice can set her signature to be conditional: the signature is valid for the Taketx transaction if and only if that transaction consumes the second input. The second input is called the connector, linking UTXO A and UTXO B. In other words, the Taketx transaction is valid if and only if UTXO B has not been spent by the Challengetx. Therefore, by destroying the connector output, the effectiveness of the Taketx transaction can be blocked.


Figure 1: Connector Output Illustration

In the BitVM2 protocol, the connector output acts as an if…else function. Once the connector output is spent by a transaction, it cannot be spent by another transaction to ensure its exclusive spending. In practical deployment, to allow for a challenge-response period, additional UTXOs with timelock are introduced. Furthermore, the corresponding connector output can also be set with different spending strategies as needed, such as allowing any party to spend in the case of challenge transactions, while allowing only the operator to spend or allowing anyone to spend after a timeout for response transactions.

2.5 Contracts: Breaking Through Pre-Signing Limitations

Currently, Bitcoin scripts mainly limit the conditions for unlocking, without restricting how the UTXO can be further spent. This is because Bitcoin scripts cannot read the content of the transaction itself, meaning they cannot achieve transaction introspection. If Bitcoin scripts could check any content of the transaction (including outputs), contract functionality could be realized.

Current contract implementations can be divided into two categories:

  • Pre-signing: Based on existing Bitcoin script capabilities, limited-functionality pre-determined contracts are constructed through pre-signing. This means designing and signing all possible future transactions in advance, locking participants into specific private keys and fee rates. Some schemes even require participants to generate short-term private keys for all pre-signed transactions. Once pre-signing is complete, these short-term private keys are securely deleted, preventing attackers from obtaining them and stealing funds. However, each time a new participant is added or an operation is updated, the above process needs to be repeated, leading to high maintenance costs. For example, the Lightning Network achieves two-party contracts through pre-signing and uses Hash Time-Locked Contracts (HTLC) technology to implement routing functions for multiple two-party contracts, thus achieving a trust-minimized scaling strategy. However, the Lightning Network requires pre-signing multiple transactions and is limited to two parties, making it somewhat cumbersome; in BitVM1, hundreds of transactions need to be pre-signed at each initialization, while in the optimized BitVM2, the number of transactions that need to be pre-signed at each initialization also reaches dozens. In both BitVM1 and BitVM2, only operators participating in pre-signing are eligible for reimbursement. If n participants are involved in pre-signing, and each participant needs to pre-sign m transactions, the pre-signing complexity for each participant will be n * m.
  • Introducing Contract Opcodes: Introducing new contract function opcodes can significantly reduce the communication complexity and maintenance costs between contract participants, thus introducing a more flexible contract implementation method for Bitcoin. For example, OPCAT: used to concatenate byte strings. Although its function is very simple, it is very powerful and can significantly reduce the complexity of BitVM; OPTXHASH: allows for better granular control of actions within the contract. If used in BitVM, it can support a larger set of operators, thus greatly improving the security assumptions of BitVM and minimizing trust. Additionally, the pre-signing method inherently means that operators in BitVM can only adopt a reimbursement process, leading to lower capital utilization efficiency; whereas through new contract opcodes, it may be possible to directly pay out from the peg-in fund pool to the output users, further improving capital efficiency. Therefore, a flexible contract model will effectively break through the traditional pre-signing limitations.

3 Bitcoin Layer2 Scaling: Validity Proofs and Fraud Proofs

Both validity proofs and fraud proofs can be used for Bitcoin L2 scaling, with the main differences shown in Table 2.


Table 2: Validity Proofs vs. Fraud Proofs

Based on bit commitment, taproot, pre-signing, and connector output, fraud proofs based on Bitcoin can be constructed. Based on taproot, validity proofs based on Bitcoin can also be constructed by introducing contract opcodes, such as OP_CAT. Additionally, depending on whether Bob has an admission system, fraud proofs can be divided into permissioned fraud proofs and permissionless fraud proofs. In permissioned fraud proofs, only specific groups can act as Bob to initiate challenges, while in permissionless fraud proofs, any third party can act as Bob to initiate challenges. The security of permissionless fraud proofs is superior to that of permissioned ones, reducing the risk of collusion among the permitted participants.

According to the number of challenge-response interactions between Alice and Bob, fraud proofs can be divided into one-round fraud proofs and multi-round fraud proofs, as shown in Figure 2.


Figure 2: One-Round Fraud Proofs vs. Multi-Round Fraud Proofs

As shown in Table 3, fraud proofs can be implemented through different interaction models, including one-round interaction models and multi-round interaction models.


Table 3: One-Round Interaction vs. Multi-Round Interaction

In the Bitcoin Layer2 scaling paradigm, BitVM1 adopts a multi-round fraud proof mechanism, while BitVM2 employs a one-round fraud proof mechanism, and bitcoin-circle stark utilizes validity proofs. Among these, BitVM1 and BitVM2 can be implemented without making any modifications to the Bitcoin protocol, while bitcoin-circle stark requires the introduction of a new opcode OP_CAT.

For most computational tasks, Bitcoin’s signet, testnet, and mainnet cannot fully represent a 4MB script, necessitating the use of script split technology—i.e., splitting the complete computation script into chunks smaller than 4MB, distributed across various tapleafs.

3.1 Multi-Round Fraud Proofs on Bitcoin

As shown in Table 3, multi-round fraud proofs are suitable for scenarios where there is a desire to reduce on-chain arbitration computation and/or where it is not possible to pinpoint problematic computation segments in one step. Multi-round fraud proofs, as the name suggests, require multiple rounds of interaction between the prover and the verifier to locate the problematic computation segments, followed by arbitration based on the identified segments.

Robin Linus’s early BitVM white paper (commonly referred to as BitVM1) utilized multi-round fraud proofs. Assuming each challenge period lasts a week and employing a binary search method to locate the problematic computation segments, the on-chain challenge response period for the Groth16 Verifier could extend up to 30 weeks, resulting in poor timeliness. Although there are currently teams researching more efficient n-ary search methods than binary search, their timeliness is still significantly lower compared to the 2-week cycle in one-round fraud proofs.

Currently, multi-round fraud proofs in the Bitcoin paradigm employ permissioned challenges, while one-round fraud proofs have achieved a permissionless challenge method, reducing the risk of collusion among participants and thus enhancing security. To this end, Robin Linus fully leveraged the advantages of Taproot to optimize BitVM1. Not only was the number of interaction rounds reduced to one, but the challenge method was also expanded to a permissionless approach, albeit at the cost of increased on-chain arbitration computation.

3.2 One-Round Fraud Proofs on Bitcoin

In this model, verification of fraud proofs can be completed through a single interaction between the prover and the verifier. The verifier only needs to initiate one challenge, and the prover only needs to respond once. In this response, the prover must provide evidence claiming that their computation is correct. If the verifier can find inconsistencies in that evidence, the challenge is successful; otherwise, it fails. The characteristics of one-round interactive fraud proofs are shown in Table 3.


Figure 3: One-Round Fraud Proof

On August 15, 2024, Robin Linus released the BitVM2: Bridging Bitcoin to Second Layers technical white paper, which implemented a cross-chain bridge using a one-round fraud proof method similar to that shown in Figure 3.

3.3 Validity Proofs on Bitcoin with OP_CAT

OPCAT was part of the original scripting language when Bitcoin was released but was disabled in 2010 due to security vulnerabilities. However, the Bitcoin community has been discussing its reactivation for years. OPCAT has now been assigned the number 347 and has been enabled on Bitcoin’s signet.

The main function of OP_CAT is to combine two elements in the stack and push the merged result back onto the stack. This functionality opens up the possibility for contracts and STARK Verifiers on Bitcoin:

  • Contracts: Andrew Poelstra proposed CAT and Schnorr Tricks I, using OPCAT and Schnorr techniques to implement contracts on Bitcoin. The Schnorr algorithm is a digital signature for P2TR output types; for other output types, similar ECDSA techniques can be used, as seen in Covenants with CAT and ECDSA. With the help of OPCAT contracts, the STARK Verifier algorithm can be split into multiple transactions, gradually verifying the entire STARK proof.
  • STARK Verifier: The STARK Verifier essentially connects data together and hashes it. Unlike algebraic operations, hashing is a native Bitcoin script operation that can save a significant amount of overhead. For example, OPSHA256 is a single opcode in its native form, while a simulated version requires hundreds of K opcodes. The main hashing operations in STARK involve verifying Merkle paths and Fiat-Shamir transformations. Therefore, OPCAT is very friendly to the STARK Verifier algorithm.

3.4 Bitcoin Script Split Technology

Although the computational load required to run the corresponding verifier algorithm to verify the proof after SNARK/STARK proof is much smaller than that required to directly run the original computation f, the amount of script needed when converting it to implement the verifier algorithm in Bitcoin script is still enormous. Currently, based on existing Bitcoin opcodes, the optimized Groth16 verifier script size and Fflonk verifier script size are still both greater than 2GB. However, the size of a single Bitcoin block is only 4MB, making it impossible to run the entire verifier script within a single block. However, since the Taproot upgrade, Bitcoin supports executing scripts by tapleaf, allowing the verifier script to be split into multiple chunks, with each chunk serving as a tapleaf to construct a taptree. The consistency of values between chunks can be ensured through bit commitment.

In the presence of OP_CAT contracts, the STARK Verifier can be split into multiple standard transactions smaller than 400KB, allowing the entire STARK validity proof verification to be completed without needing to collaborate with miners.

This section focuses on the relevant split technology of Bitcoin scripts under the existing conditions without introducing or activating any new opcodes.

When performing script splitting, the following dimensions of information must be balanced:

  • The size of a single chunk script must not exceed 4MB and should include input bit commitment scripts, transaction signatures, and other space.
  • The maximum stack size of a single chunk must not exceed 1000. Therefore, each chunk’s stack should only retain the necessary elements to reserve enough stack space for script size optimization, as Bitcoin transaction fees do not depend on the stack size used.
  • Bit commitment on Bitcoin is expensive. Therefore, the number of bits in the input and output between two adjacent chunks should be minimized, as currently, 1 bit corresponds to 26 bytes.
  • For ease of auditing, the functionality of each chunk should be as clear as possible.

Currently, the methods for script splitting can be divided into the following three main categories:

  • Automatic Splitting: This method seeks a splitting approach where the script size is around 3MB and the stack size is minimized based on stack size and script size. The advantages of this method are that it is independent of specific verifier algorithms and can be extended to script splitting for any computation. The disadvantages are: (1) the entire logical block must be separately marked, such as OP_IF code blocks that cannot be split; otherwise, the execution result of the split script will be incorrect; (2) the execution result of a chunk may correspond to multiple elements on the stack, requiring marking of the number of stack elements that need to apply bit commitment based on actual computation logic; (3) the readability of the logical functionality implemented by each chunk script is poor, making auditing difficult; (4) the stack may contain elements not needed for the next chunk, wasting stack space.
  • Functional Splitting: This method splits based on various functional sub-functions in the computation, with clear input and output values for the sub-functions. While splitting the script, it also implements the necessary bit commitment scripts for each chunk, ensuring that the total script size of the final chunks is less than 4MB and the stack size is less than 1000. The advantages are: clear functionality, clear logic for each chunk, good readability, and ease of auditing. The disadvantages are: the expression of the original computation logic may not match the script-level logic, and the original computation sub-functions may be optimal but not represent script-level optimality.
  • Manual Splitting: In this method, the splitting points are not based on functional sub-functions but are manually set. This is especially suitable for cases where a single sub-function size exceeds 4MB. The advantages are: it allows for manual splitting of heavy script size sub-functions, such as those related to Fq12 calculations; clear logic for each chunk, good readability, and ease of auditing. The disadvantages are: limited by manual tuning capabilities, when the overall script has been optimized, previously set manual splitting points may not be optimal and need to be readjusted.

For example, after multiple rounds of optimization, the script size of the Groth16 verifier was reduced from about 7GB to approximately 1.26GB. In addition to this overall computational optimization, each chunk can also be optimized individually to fully utilize stack space. For instance, by introducing better lookup table-based algorithms and dynamically loading and unloading the lookup table, the script size of each chunk can be further reduced.

The computational costs and runtime environments of web2 programming languages are completely different from those of Bitcoin scripts, so simply translating existing implementations for various algorithms into Bitcoin scripts is not feasible. Therefore, optimizations specific to the Bitcoin scenario need to be considered:

  • Seek algorithms that optimize memory locality, even at the cost of some computational load, to reduce the number of input/output bits between chunks, thereby decreasing the amount of data required for commitments in the assertTx transaction design of BitVM2.
  • Utilize the commutativity of related operations (e.g., logical operations), such as x&y = y&x, to save nearly half of the lookup table.
  • Currently, the script size corresponding to Fq12 operations is large; consider leveraging Fiat-Shamir, Schwartz-Zipple, and polynomial commitment schemes to significantly reduce the computational complexity of Fq12 extension operations.

4 Summary

This article first introduces the limitations of Bitcoin scripts and discusses using Bitcoin commitments to overcome the UTXO stateless limitation, using Taproot to break through script space limitations, using connector outputs to bypass UTXO spending method restrictions, and using contracts to overcome pre-signing limitations. It then provides a comprehensive overview and summary of the characteristics of fraud proofs and validity proofs, the features of permissioned and permissionless fraud proofs, the distinctions between one-round and multi-round fraud proofs, and the technology of Bitcoin script splitting.

Disclaimer:

  1. This article is reprinted from [aicoin]. All copyrights belong to the original author [mutourend & lynndell, Bitlayer Labs]. If there are objections to this reprint, please contact the Gate Learn team, and they will handle it promptly.
  2. Liability Disclaimer: The views and opinions expressed in this article are solely those of the author and do not constitute any investment advice.
  3. Translations of the article into other languages are done by the Gate Learn team. Unless mentioned, copying, distributing, or plagiarizing the translated articles is prohibited.

Analysis of Bitcoin Layer 2 Scaling Technology: Validity Proof and Fraud Proof

Advanced28.13
Get an in-depth understanding of the Layer 2 expansion plan in the Bitcoin network, especially the validity proof and fraud proof technology. This article analyzes how to achieve Layer 2 expansion through technological innovation under the strict restrictions of Bitcoin, including Bit Commitment, Taproot, and Connector Output. and contracts, etc.
Analysis of Bitcoin Layer 2 Scaling Technology: Validity Proof and Fraud Proof

1 Introduction

For an algorithm f, two mutually distrustful participants, Alice and Bob, can establish trust in the following ways:

  1. Alice inputs x, runs algorithm f, and obtains result y. Bob also runs algorithm f with the same input x, resulting in y′. If y = y′, then Bob acknowledges Alice’s provided input x and output y. This is a special validity proof mechanism commonly used in blockchain consensus, where Alice is the node packaging the block and Bob is the node participating in the consensus.
  2. Alice inputs x, runs the zk.prove program on algorithm f, and obtains result y and proof proof. Bob runs the zk.verify program based on f, y, and proof. If the result is true, then Bob acknowledges Alice’s result y; if the result is false, then Bob does not acknowledge Alice’s result y. This is a validity proof, where Bob can be an on-chain contract.
  3. Alice inputs x, runs algorithm f, and obtains result y. Bob also runs algorithm f with the same input x, resulting in y′. If y = y′, then nothing is done; if y ≠ y′, then Bob challenges Alice, with the challenged program being f. The number of interactions between Alice and Bob can be one or multiple. Arbitration is achieved through the challenge-response process. This is a fraud proof, where Bob is the challenger, listening off-chain and challenging on-chain.
  4. Alice inputs x, runs the zk.prove program on algorithm f, and obtains result y and proof proof. Bob runs the zk.verify program based on f, y, and proof. If the result is true, then nothing is done; if the result is false, then Bob challenges Alice, with the challenged program being zk.verify. The number of interactions between Alice and Bob can be one or multiple. Arbitration is achieved through the challenge-response process. This is a fraud proof, where Bob is the challenger, listening off-chain and challenging on-chain.

For the above 2, 3, and 4, let x be Layer2 transactions and the initial state, f be the Layer2 consensus program, and y be the transaction end state, corresponding to the blockchain Layer2 scaling solution. Among them:

  • Validity Proof: Based on a pessimistic assumption, it must be proven valid before acceptance, and it takes effect immediately. In a validity proof, evidence of correct L2 state transitions must be provided, reflecting a pessimistic view of the world—only when a state is proven correct will it be accepted.
  • Fraud Proof: Based on an optimistic assumption, it is accepted by default unless someone proves it to be incorrect. It has a challenge window period, which only takes effect after the challenge window period. In a fraud proof, evidence of incorrect L2 state transitions must be provided, reflecting an optimistic view of the world—a state transition is assumed to be correct unless proven otherwise.


Table 1: Ways to Establish Trust

Additionally, it is important to note:

  • The key to distinguishing between fraud proofs and validity proofs is not whether ZK proof systems like SNARK/STARK are used. The ZK proof system expresses the method of proof used, while fraud or validity represents the content being proven. This is why scenario 1 in Table 1 is said to represent a validity proof.
  • Validity proofs have better timeliness, but the on-chain verification complexity is relatively high; fraud proofs have lower on-chain verification complexity, but their timeliness is relatively poor.
  • For cases 2 and 4 in Table 1, using ZK recursion and combination techniques, multiple f can be computed and compressed, significantly reducing the verification cost of off-chain computation on-chain.

Currently, benefiting from the Turing-complete smart contracts of Solidity, fraud proofs and validity proof technologies are widely used in Ethereum Layer2 scaling. However, under the Bitcoin paradigm, limited by Bitcoin’s limited opcode functionality, 1000 stack elements, and other restrictions, the application of these technologies is still in the exploratory stage. This article summarizes the limitations and breakthrough technologies under the Bitcoin paradigm in the context of Bitcoin Layer2 scaling, studies validity proof and fraud proof technologies, and outlines the unique script segmentation technology under the Bitcoin paradigm.

2 Limitations and Breakthroughs under the Bitcoin Paradigm

There are many limitations under the Bitcoin paradigm, but various clever methods or techniques can be used to overcome these limitations. For example, bit commitment can break through the UTXO stateless limitation, taproot can break through script space limitations, connector output can break through UTXO spending method limitations, and contracts can break through pre-signature limitations.

2.1 UTXO Model and Script Limitations

Bitcoin adopts the UTXO model, where each UTXO is locked in a locking script that defines the conditions that must be met to spend that UTXO. Bitcoin scripts have the following limitations:

  1. Bitcoin scripts are stateless;
  2. For P2TR output types, the maximum number of opcodes that can be accommodated in a single transaction is about 4 million, filling the entire block, while for other output types, there are only 10,000 opcodes;
  3. The spending methods of a single UTXO are limited, lacking exploration of combined spending methods;
  4. Flexible contract functions are not supported;
  5. The stack size is limited to a maximum of 1000 elements (altstack + stack), and the maximum size of a single element is 520 bytes;
  6. Arithmetic operations (such as addition and subtraction) are limited to 4-byte elements. They cannot be modified to long elements, such as 20 bytes or larger, which are necessary for cryptographic operations;
  7. Opcodes like OPMUL and OPCAT have been disabled, and simulating them with existing opcodes incurs extremely high costs, such as simulating a one-round BLAKE3 hash, with a script size of about 75K.

2.2 Bit Commitment: Breaking Through the UTXO Stateless Limitation

Currently, Bitcoin scripts are completely stateless. When executing a Bitcoin script, its execution environment is reset after each script. This leads to the inability of Bitcoin scripts to natively support constraint scripts 1 and 2 having the same x value. However, this issue can be circumvented through some methods, with the core idea being to sign a value in some way. If a signature can be created for a value, it is possible to achieve stateful Bitcoin scripts. That is, by checking the signature of the x value in scripts 1 and 2, it can be enforced that the same x value is used in both scripts. If there is a conflicting signature, meaning two different values are signed for the same variable x, a penalty can be applied. This solution is known as bit commitment.

The principle of bit commitment is relatively simple. A bit refers to setting two different hash values, hash0 and hash1, for each bit in the message to be signed. If the bit value to be signed is 0, the preimage0 of hash0 is revealed; if the bit value to be signed is 1, the preimage1 of hash1 is revealed.

Taking a single bit message m ∈{0,1} as an example, the corresponding bit commitment unlock script is just some preimages: if the bit value is 0, the corresponding unlock script is preimage0—“0xfa7fa5b1dea37d71a0b841967f6a3b119dbea140”; if the bit value is 1, the corresponding unlock script is preimage1—“0x47c31e611a3bd2f3a7a42207613046703fa27496”. Therefore, with bit commitment, it is possible to break through the UTXO stateless limitation and achieve stateful Bitcoin scripts, making various interesting new features possible.

OP_HASH160

OP_DUP

0xf592e757267b7f307324f1e78b34472f8b6f46f3> // This is hash1

OP_EQUAL

OP_DUP

OP_ROT

0x100b9f19ebd537fdc371fa1367d7ccc802dc2524> // This is hash0

OP_EQUAL

OP_BOOLOR

OP_VERIFY

// Now the value of the bit commitment is on the stack. Either “0” or “1”.

Currently, there are 2 implementations of bit commitment:

  • Lamport One-Time Signature: The principle is relatively simple and only requires the use of hash functions, making it Bitcoin-friendly. For each bit in the message, two hash values need to be committed, resulting in relatively large signature data. In other words, for a message of length v bits, the public key length is 2v bits, and the signature length is v bits.
  • Winternitz One-Time Signature: Compared to Lamport signatures, it significantly reduces the lengths of signatures and public keys but increases the complexity of signing and verification. This scheme allows for flexible setting of different hash chain lengths d, thus balancing length and complexity. For example, setting d = 15 results in both the public key length and signature length being about 4 times shorter, but the verification complexity will increase by 4 times. This is essentially a trade-off between Bitcoin’s stack space and script size. Lamport signatures can be seen as a special case of Winternitz signatures when d = 1.

Currently, the BitVM2 library implements Winternitz signatures based on the Blake3 hash function. The signature length corresponding to a single bit is about 26 bytes. Therefore, it can be seen that introducing state through bit commitment is costly. Thus, in the BitVM2 implementation, the message is first hashed to obtain a 256-bit hash value, and then bit commitment is performed on the hash value to save overhead, rather than committing to each bit of the original longer message directly.

2.3 Taproot: Breaking Through Script Space Limitations

The Bitcoin Taproot soft fork upgrade, activated in November 2021, includes three proposals: Schnorr signatures (BIP 340), Taproot (BIP 341), and TapScript (BIP 342). It introduces a new transaction type—Pay-to-Taproot (P2TR) transactions. P2TR transactions can create a more private, flexible, and scalable transaction format by combining the advantages of Taproot, MAST (Merkel Abstract Syntax Tree), and Schnorr signatures.

P2TR supports two spending methods: spending according to the key path or the script path.

According to the provisions in Taproot (BIP 341), when spending via the script path, the corresponding Merkle path cannot exceed a maximum length of 128. This means that the number of tapleafs in the taptree cannot exceed 2^128. Since the SegWit upgrade in 2017, the Bitcoin network measures block size in weight units, with a maximum support of 4 million weight units (approximately 4MB). When a P2TR output is spent via the script path, it only needs to reveal a single tapleaf script, meaning that the block is packed with a single tapleaf script. This implies that for P2TR transactions, the script size corresponding to each tapleaf can be a maximum of about 4MB. However, under Bitcoin’s default policy, many nodes only forward transactions smaller than 400K; larger transactions need to collaborate with miners to be packed.

The script space premium brought by Taproot makes it more valuable to simulate cryptographic operations like multiplication and hashing using existing opcodes.

When building verifiable computation based on P2TR, the corresponding script size is no longer limited to the 4MB constraint, but can be split into multiple sub-functions distributed across multiple tapleafs, thus breaking through the 4MB script space limitation. In fact, the Groth16 verifier algorithm implemented in the current BitVM2 has a size of up to 2GB. However, it can be split and distributed across about 1000 tapleafs, and by combining it with bit commitment, the consistency between the inputs and outputs of each sub-function can be constrained, ensuring the integrity and correctness of the entire computation.

2.4 Connector Output: Breaking Through UTXO Spending Method Limitations

Currently, Bitcoin provides two native spending methods for a single UTXO: spending by script or by public key. Therefore, as long as the correct public key signature or script conditions are met, the UTXO can be spent. Two UTXOs can be spent independently, and no restrictions can be added to constrain the two UTXOs, meaning that additional conditions must be met for them to be spent.

However, Burak, the founder of the Ark protocol, cleverly utilized the SIGHASH flag to achieve connector output. Specifically, Alice can create a signature to send her BTC to Bob. However, since the signature can commit to multiple inputs, Alice can set her signature to be conditional: the signature is valid for the Taketx transaction if and only if that transaction consumes the second input. The second input is called the connector, linking UTXO A and UTXO B. In other words, the Taketx transaction is valid if and only if UTXO B has not been spent by the Challengetx. Therefore, by destroying the connector output, the effectiveness of the Taketx transaction can be blocked.


Figure 1: Connector Output Illustration

In the BitVM2 protocol, the connector output acts as an if…else function. Once the connector output is spent by a transaction, it cannot be spent by another transaction to ensure its exclusive spending. In practical deployment, to allow for a challenge-response period, additional UTXOs with timelock are introduced. Furthermore, the corresponding connector output can also be set with different spending strategies as needed, such as allowing any party to spend in the case of challenge transactions, while allowing only the operator to spend or allowing anyone to spend after a timeout for response transactions.

2.5 Contracts: Breaking Through Pre-Signing Limitations

Currently, Bitcoin scripts mainly limit the conditions for unlocking, without restricting how the UTXO can be further spent. This is because Bitcoin scripts cannot read the content of the transaction itself, meaning they cannot achieve transaction introspection. If Bitcoin scripts could check any content of the transaction (including outputs), contract functionality could be realized.

Current contract implementations can be divided into two categories:

  • Pre-signing: Based on existing Bitcoin script capabilities, limited-functionality pre-determined contracts are constructed through pre-signing. This means designing and signing all possible future transactions in advance, locking participants into specific private keys and fee rates. Some schemes even require participants to generate short-term private keys for all pre-signed transactions. Once pre-signing is complete, these short-term private keys are securely deleted, preventing attackers from obtaining them and stealing funds. However, each time a new participant is added or an operation is updated, the above process needs to be repeated, leading to high maintenance costs. For example, the Lightning Network achieves two-party contracts through pre-signing and uses Hash Time-Locked Contracts (HTLC) technology to implement routing functions for multiple two-party contracts, thus achieving a trust-minimized scaling strategy. However, the Lightning Network requires pre-signing multiple transactions and is limited to two parties, making it somewhat cumbersome; in BitVM1, hundreds of transactions need to be pre-signed at each initialization, while in the optimized BitVM2, the number of transactions that need to be pre-signed at each initialization also reaches dozens. In both BitVM1 and BitVM2, only operators participating in pre-signing are eligible for reimbursement. If n participants are involved in pre-signing, and each participant needs to pre-sign m transactions, the pre-signing complexity for each participant will be n * m.
  • Introducing Contract Opcodes: Introducing new contract function opcodes can significantly reduce the communication complexity and maintenance costs between contract participants, thus introducing a more flexible contract implementation method for Bitcoin. For example, OPCAT: used to concatenate byte strings. Although its function is very simple, it is very powerful and can significantly reduce the complexity of BitVM; OPTXHASH: allows for better granular control of actions within the contract. If used in BitVM, it can support a larger set of operators, thus greatly improving the security assumptions of BitVM and minimizing trust. Additionally, the pre-signing method inherently means that operators in BitVM can only adopt a reimbursement process, leading to lower capital utilization efficiency; whereas through new contract opcodes, it may be possible to directly pay out from the peg-in fund pool to the output users, further improving capital efficiency. Therefore, a flexible contract model will effectively break through the traditional pre-signing limitations.

3 Bitcoin Layer2 Scaling: Validity Proofs and Fraud Proofs

Both validity proofs and fraud proofs can be used for Bitcoin L2 scaling, with the main differences shown in Table 2.


Table 2: Validity Proofs vs. Fraud Proofs

Based on bit commitment, taproot, pre-signing, and connector output, fraud proofs based on Bitcoin can be constructed. Based on taproot, validity proofs based on Bitcoin can also be constructed by introducing contract opcodes, such as OP_CAT. Additionally, depending on whether Bob has an admission system, fraud proofs can be divided into permissioned fraud proofs and permissionless fraud proofs. In permissioned fraud proofs, only specific groups can act as Bob to initiate challenges, while in permissionless fraud proofs, any third party can act as Bob to initiate challenges. The security of permissionless fraud proofs is superior to that of permissioned ones, reducing the risk of collusion among the permitted participants.

According to the number of challenge-response interactions between Alice and Bob, fraud proofs can be divided into one-round fraud proofs and multi-round fraud proofs, as shown in Figure 2.


Figure 2: One-Round Fraud Proofs vs. Multi-Round Fraud Proofs

As shown in Table 3, fraud proofs can be implemented through different interaction models, including one-round interaction models and multi-round interaction models.


Table 3: One-Round Interaction vs. Multi-Round Interaction

In the Bitcoin Layer2 scaling paradigm, BitVM1 adopts a multi-round fraud proof mechanism, while BitVM2 employs a one-round fraud proof mechanism, and bitcoin-circle stark utilizes validity proofs. Among these, BitVM1 and BitVM2 can be implemented without making any modifications to the Bitcoin protocol, while bitcoin-circle stark requires the introduction of a new opcode OP_CAT.

For most computational tasks, Bitcoin’s signet, testnet, and mainnet cannot fully represent a 4MB script, necessitating the use of script split technology—i.e., splitting the complete computation script into chunks smaller than 4MB, distributed across various tapleafs.

3.1 Multi-Round Fraud Proofs on Bitcoin

As shown in Table 3, multi-round fraud proofs are suitable for scenarios where there is a desire to reduce on-chain arbitration computation and/or where it is not possible to pinpoint problematic computation segments in one step. Multi-round fraud proofs, as the name suggests, require multiple rounds of interaction between the prover and the verifier to locate the problematic computation segments, followed by arbitration based on the identified segments.

Robin Linus’s early BitVM white paper (commonly referred to as BitVM1) utilized multi-round fraud proofs. Assuming each challenge period lasts a week and employing a binary search method to locate the problematic computation segments, the on-chain challenge response period for the Groth16 Verifier could extend up to 30 weeks, resulting in poor timeliness. Although there are currently teams researching more efficient n-ary search methods than binary search, their timeliness is still significantly lower compared to the 2-week cycle in one-round fraud proofs.

Currently, multi-round fraud proofs in the Bitcoin paradigm employ permissioned challenges, while one-round fraud proofs have achieved a permissionless challenge method, reducing the risk of collusion among participants and thus enhancing security. To this end, Robin Linus fully leveraged the advantages of Taproot to optimize BitVM1. Not only was the number of interaction rounds reduced to one, but the challenge method was also expanded to a permissionless approach, albeit at the cost of increased on-chain arbitration computation.

3.2 One-Round Fraud Proofs on Bitcoin

In this model, verification of fraud proofs can be completed through a single interaction between the prover and the verifier. The verifier only needs to initiate one challenge, and the prover only needs to respond once. In this response, the prover must provide evidence claiming that their computation is correct. If the verifier can find inconsistencies in that evidence, the challenge is successful; otherwise, it fails. The characteristics of one-round interactive fraud proofs are shown in Table 3.


Figure 3: One-Round Fraud Proof

On August 15, 2024, Robin Linus released the BitVM2: Bridging Bitcoin to Second Layers technical white paper, which implemented a cross-chain bridge using a one-round fraud proof method similar to that shown in Figure 3.

3.3 Validity Proofs on Bitcoin with OP_CAT

OPCAT was part of the original scripting language when Bitcoin was released but was disabled in 2010 due to security vulnerabilities. However, the Bitcoin community has been discussing its reactivation for years. OPCAT has now been assigned the number 347 and has been enabled on Bitcoin’s signet.

The main function of OP_CAT is to combine two elements in the stack and push the merged result back onto the stack. This functionality opens up the possibility for contracts and STARK Verifiers on Bitcoin:

  • Contracts: Andrew Poelstra proposed CAT and Schnorr Tricks I, using OPCAT and Schnorr techniques to implement contracts on Bitcoin. The Schnorr algorithm is a digital signature for P2TR output types; for other output types, similar ECDSA techniques can be used, as seen in Covenants with CAT and ECDSA. With the help of OPCAT contracts, the STARK Verifier algorithm can be split into multiple transactions, gradually verifying the entire STARK proof.
  • STARK Verifier: The STARK Verifier essentially connects data together and hashes it. Unlike algebraic operations, hashing is a native Bitcoin script operation that can save a significant amount of overhead. For example, OPSHA256 is a single opcode in its native form, while a simulated version requires hundreds of K opcodes. The main hashing operations in STARK involve verifying Merkle paths and Fiat-Shamir transformations. Therefore, OPCAT is very friendly to the STARK Verifier algorithm.

3.4 Bitcoin Script Split Technology

Although the computational load required to run the corresponding verifier algorithm to verify the proof after SNARK/STARK proof is much smaller than that required to directly run the original computation f, the amount of script needed when converting it to implement the verifier algorithm in Bitcoin script is still enormous. Currently, based on existing Bitcoin opcodes, the optimized Groth16 verifier script size and Fflonk verifier script size are still both greater than 2GB. However, the size of a single Bitcoin block is only 4MB, making it impossible to run the entire verifier script within a single block. However, since the Taproot upgrade, Bitcoin supports executing scripts by tapleaf, allowing the verifier script to be split into multiple chunks, with each chunk serving as a tapleaf to construct a taptree. The consistency of values between chunks can be ensured through bit commitment.

In the presence of OP_CAT contracts, the STARK Verifier can be split into multiple standard transactions smaller than 400KB, allowing the entire STARK validity proof verification to be completed without needing to collaborate with miners.

This section focuses on the relevant split technology of Bitcoin scripts under the existing conditions without introducing or activating any new opcodes.

When performing script splitting, the following dimensions of information must be balanced:

  • The size of a single chunk script must not exceed 4MB and should include input bit commitment scripts, transaction signatures, and other space.
  • The maximum stack size of a single chunk must not exceed 1000. Therefore, each chunk’s stack should only retain the necessary elements to reserve enough stack space for script size optimization, as Bitcoin transaction fees do not depend on the stack size used.
  • Bit commitment on Bitcoin is expensive. Therefore, the number of bits in the input and output between two adjacent chunks should be minimized, as currently, 1 bit corresponds to 26 bytes.
  • For ease of auditing, the functionality of each chunk should be as clear as possible.

Currently, the methods for script splitting can be divided into the following three main categories:

  • Automatic Splitting: This method seeks a splitting approach where the script size is around 3MB and the stack size is minimized based on stack size and script size. The advantages of this method are that it is independent of specific verifier algorithms and can be extended to script splitting for any computation. The disadvantages are: (1) the entire logical block must be separately marked, such as OP_IF code blocks that cannot be split; otherwise, the execution result of the split script will be incorrect; (2) the execution result of a chunk may correspond to multiple elements on the stack, requiring marking of the number of stack elements that need to apply bit commitment based on actual computation logic; (3) the readability of the logical functionality implemented by each chunk script is poor, making auditing difficult; (4) the stack may contain elements not needed for the next chunk, wasting stack space.
  • Functional Splitting: This method splits based on various functional sub-functions in the computation, with clear input and output values for the sub-functions. While splitting the script, it also implements the necessary bit commitment scripts for each chunk, ensuring that the total script size of the final chunks is less than 4MB and the stack size is less than 1000. The advantages are: clear functionality, clear logic for each chunk, good readability, and ease of auditing. The disadvantages are: the expression of the original computation logic may not match the script-level logic, and the original computation sub-functions may be optimal but not represent script-level optimality.
  • Manual Splitting: In this method, the splitting points are not based on functional sub-functions but are manually set. This is especially suitable for cases where a single sub-function size exceeds 4MB. The advantages are: it allows for manual splitting of heavy script size sub-functions, such as those related to Fq12 calculations; clear logic for each chunk, good readability, and ease of auditing. The disadvantages are: limited by manual tuning capabilities, when the overall script has been optimized, previously set manual splitting points may not be optimal and need to be readjusted.

For example, after multiple rounds of optimization, the script size of the Groth16 verifier was reduced from about 7GB to approximately 1.26GB. In addition to this overall computational optimization, each chunk can also be optimized individually to fully utilize stack space. For instance, by introducing better lookup table-based algorithms and dynamically loading and unloading the lookup table, the script size of each chunk can be further reduced.

The computational costs and runtime environments of web2 programming languages are completely different from those of Bitcoin scripts, so simply translating existing implementations for various algorithms into Bitcoin scripts is not feasible. Therefore, optimizations specific to the Bitcoin scenario need to be considered:

  • Seek algorithms that optimize memory locality, even at the cost of some computational load, to reduce the number of input/output bits between chunks, thereby decreasing the amount of data required for commitments in the assertTx transaction design of BitVM2.
  • Utilize the commutativity of related operations (e.g., logical operations), such as x&y = y&x, to save nearly half of the lookup table.
  • Currently, the script size corresponding to Fq12 operations is large; consider leveraging Fiat-Shamir, Schwartz-Zipple, and polynomial commitment schemes to significantly reduce the computational complexity of Fq12 extension operations.

4 Summary

This article first introduces the limitations of Bitcoin scripts and discusses using Bitcoin commitments to overcome the UTXO stateless limitation, using Taproot to break through script space limitations, using connector outputs to bypass UTXO spending method restrictions, and using contracts to overcome pre-signing limitations. It then provides a comprehensive overview and summary of the characteristics of fraud proofs and validity proofs, the features of permissioned and permissionless fraud proofs, the distinctions between one-round and multi-round fraud proofs, and the technology of Bitcoin script splitting.

Disclaimer:

  1. This article is reprinted from [aicoin]. All copyrights belong to the original author [mutourend & lynndell, Bitlayer Labs]. If there are objections to this reprint, please contact the Gate Learn team, and they will handle it promptly.
  2. Liability Disclaimer: The views and opinions expressed in this article are solely those of the author and do not constitute any investment advice.
  3. Translations of the article into other languages are done by the Gate Learn team. Unless mentioned, copying, distributing, or plagiarizing the translated articles is prohibited.
Start Now
Sign up and get a
$100
Voucher!