What is a zk-VM?

Intermediate6/3/2024, 1:43:54 PM

ZK is the bridge to the widespread adoption of cryptography. Whether in Web2 or Web3, anything involving Zero-Knowledge Proofs (ZKP) will create immense value. The Lita team has written fundamental science articles, introducing the basics of ZK and zkVM, giving a high-level overview of the process within zkVM, and finally proposing a set of standards for evaluating zkVM.

Blockchain

Forward the Original Title ‘A Zero Knowledge Paradigm: Part 1 - What is a zk-VM?’

1. Zero Knowledge Proofs: A Primer

What are zero knowledge proofs (ZKPs)?

If you have no prior knowledge of zero knowledge proofs (ZKP), this video from Wired explains the concept at five difficulty levels in an interactive manner with easily understandable examples and demonstrations. Highly recommended.

In simplest terms, ZKPs enable one party (prover) to prove to another party (the verifier) that they know something without revealing what that thing is or any additional information. More specifically, ZKPs prove knowledge of a piece of data, or knowledge of the result of a computation, without revealing the data or inputs. The process of creating a zero knowledge proof involves a series of mathematical models to transform the results of a computation into a piece of otherwise meaningless information that proves successful execution of code, which can later be verified.

In some cases, it takes less work to verify the proof, which is constructed after multiple rounds of algebraic conversions and cryptography, than it would take to run the computation. This unique combination of security and scalability is what makes zero knowledge cryptography such a powerful tool.

zkSNARKs: Zero Knowledge Succinct Non-Interactive Argument of Knowledge

Relies on an initial (trusted or non-trusted) setup process to establish parameters for verification
Requires at least one interaction between prover and verifier
Proof sizes are small and easy to verify
NARK based proofs are used by rollups like zkSync, Scroll, and Linea

zkSTARKs: Zero Knowledge Scalable Transparent Argument of Knowledge‍

No trusted setup required
Offer high transparency by using publicly verifiable randomness to create trustless verifiable systems, i.e. generating provably random parameters for proving and verifying
Highly scalable as they can generate and verify proofs fast (not always), even when the size of the underlying witness (data) is large
Requires no interaction between prover and verifier
Comes at the cost that STARKs generate larger proofs, which can be harder to verify than SNARKs
Proofs are harder to verify than some zkSNARK proofs but not as hard as some others to verify
STARKs are used by Starknet, as well as zkVMs like Lita, Risc Zero, and Succinct Labs

(Note: Succinct’s bridge uses SNARKs but SP1 is a STARK based protocol)

It is worth noting that all STARKs are SNARKs, but not all SNARKs are STARKs.

For a better general understanding of SNARKs and STARKs, we recommend reading this @krzhang/privacy-in-cryptocurrencies-zero-knowledge-and-zk-snarks-1-2-68ce1838fd9c">article series written by Yan Zhang and Yi Sun of Axiom, and this collection of articles in Ventali Tan’s github.

2. What is a zkVM?

A virtual machine (VM) is a program that runs programs. In context, a zkVM is a virtual computer that is implemented as a system for generating zero knowledge proofs, or a universal circuit, or tool, for generating ZKPs for any program or computation.

zkVMs remove the need to learn complicated mathematics and cryptography for designing and coding ZK, and enables any developer to execute programs written in their preferred languages and generate ZKPs, making it far easier to integrate and interact with zero knowledge. Broadly speaking, most references to zkVMs implicitly include the compiler toolchains and proof systems appended to the virtual machine that executes the programs, and not just the virtual machine itself. Below, we summarize the main components of a zkVM and their functions:

The main components of a zkVM

The design and implementation of each component is governed by the choice of proof (SNARKs or STARKs) and the instruction set architecture (ISA) of the zkVM. Traditionally, an ISA specifies what a CPU is capable of (data types, registers, memory etc.) and the sequence of actions the CPU performs when it executes a program. In context, the ISA determines the machine code that is interpretable and executable by the VM. Choosing an ISA can yield radical differences in the accessibility and usability of the zkVM, as well as the speed and efficiency of the proof generation processes, and underpins the construction of any zkVM.

Below are some examples of zkVMs and their components for your reference.

zkVMs and their components

For now, we will focus on the interactions between each component at a high level to provide a framework for understanding the algebraic and cryptographic processes as well as the design tradeoffs of a zkVM in a later article.

3. Abstracted zkVM Process Flow

The following diagram is an abstracted, generalized process flowchart of a zkVM, split and categorized between the format (inputs / outputs) of a program as it moves through the components of a zkVM. We will examine each process in depth in subsequent articles.

General flow for a zkVM

The process flow of a zkVM is generally as follows:

Compiler Stage

The compiler first compiles programs written in conventional languages (C, C++, Rust, Solidity) into machine code. The format of the machine code is dictated by the choice of the ISA.

VM Stage

The VM executes the machine code and generates an execution trace, which is the series of steps of the underlying program. The format of this is predetermined by the choice of arithmetization as well as the set of polynomial constraints. Common arithmetization schemes include R1CS as in Groth16, PLONKish arithmetization as in halo2, and AIR as in plonky2 and plonky3.

Prover Stage

The prover receives the trace and represents it as a set of polynomials bound by a set of constraints, essentially translating the computation into algebra by mapping out the facts mathematically.
The prover commits to these polynomials using a Polynomial Commitment Scheme (PCS). A commitment scheme is a protocol which allows the prover to create a fingerprint of some data X, which is called a commitment to X, and later prove facts about X without revealing X, using the commitment to X. The PCS is the fingerprint; a “pre-processed’’ succinct version of the constraints on the computation. This allows the prover to prove facts about the computation, now expressed in a polynomial equation, using random values proposed by the verifier in the following steps.
The prover runs a Polynomial Interactive Oracle Proof (PIOP) to show that the committed polynomials represent an execution trace which satisfies the given constraints. A PIOP is an interactive proof protocol consisting of a series of rounds in which the prover sends commitments to polynomials, the verifier responds with random field values, and the prover provides evaluations of the polynomial at these random values, akin to “solving” the polynomial equation using random values to probabilistically persuade the verifier.
Applying the Fiat-Shamir heuristic; the prover runs the PIOP in a non-interactive mode, where the verifier’s behavior is limited to sending pseudo-random challenge points. In cryptography, the Fiat–Shamir heuristic converts an interactive proof of knowledge into a digital signature for verification. This step encrypts the proof and makes it zero knowledge.
The prover must convince the verifier that the claimed polynomial evaluations are correct, with respect to the polynomial commitments it sent to the verifier. To do this, the prover produces an “evaluation” or “opening” proof, which is provided by the polynomial commitment scheme (fingerprint).

Verifier Stage

The verifier checks the proof by following the proof system’s verification protocol, either using the constraints or the commitment. The verifier either accepts or rejects the result according to the validity of the proof.

Summarily, a zkVM proof proves, for a given program, a given result, and given initial conditions, that there is some input which causes the program to produce the given result when executed from the given initial conditions. We can combine this statement with the process flow to arrive at the following description of a zkVM.

A zkVM proof proves, for a given VM program and a given output, that there is some input which causes the given program to produce the given output when executed on the VM.

4. Evaluating zkVMs

What are the criteria by which we should evaluate zkVMs? In other words, when should we say that one zkVM is better than another? Truthfully, the answer depends on the use case.

Lita’s market research suggests that for most commercial use cases, the most important properties, out of speed, efficiency, and succinctness, are either speed or core-time efficiency, depending on the application. Some applications are price sensitive and want to optimize for low energy consumption and low use of capital in proving; for these, core-time efficiency is probably the most important metric to optimize. Other applications, particularly finance or trading related applications, are latency sensitive and want to optimize for speed.

Most publicized comparisons of performance focus exclusively on speed, which is important but not a holistic measurement of performance. There are also a few important properties that measure the reliability of a zkVM; most of which are not up to production ready standards, even for market leading incumbents.

We hereby propose that zkVMs should be evaluated on the following criteria, separated into two subsets:

Main criteria for evaluating zk-VMs

Baseline: Measures the reliability of zkVMs

Correctness
Security
Trust Assumptions

Performance: Measures the capabilities of a zkVM

Efficiency
Speed
Succinctness

4.1 Baseline: Correctness, Security, and Trust Assumptions

When evaluating zkVMs for mission critical applications, correctness and security should be considered as the baseline. There needs to be sufficient reason to be confident about the correctness, and sufficiently strong claimed security. Also, the trust assumptions need to be sufficiently weak for the application.

Without these properties, the zkVM is potentially worse than useless for the application, as it may fail to perform as specified and expose users to hacking and exploits.

i. Correctness

The VM must execute the computation as intended
The proof system must satisfy its claimed security properties

Correctness is comprised of three properties:

Soundness: The proof system is truthful and therefore everything it proves is true. The verifier rejects proofs of false statements; it only accepts a computation’s result if the inputs actually produce that result.
Completeness: The proof system is complete and can prove all true statements. If the prover claims it can prove a computation’s result, it must be able to produce a proof that the verifier accepts.
Zero Knowledge: Possessing a proof doesn’t reveal any more about the computation’s inputs than knowing the result itself would

You can have completeness without soundness; if the proof system proves everything including falsity, obviously it is complete but not sound. Inversely, you can have soundness without completeness; if the proof system proves a program existed but does not cannot prove the computations, obviously it is sound (after all, it never proves any falsity), but not complete.

ii. Security

Relates to the tolerances of soundness, completeness, and zero knowledge

In practice, all three correctness properties have non-zero tolerances. This implies all proofs are statistical probabilities of correctness, and not absolute certainties. A tolerance refers to the maximum tolerable probability that one property will fail. Zero tolerances are of course the ideal, but zkVMs do not achieve zero tolerances on all of these properties in practice. Perfect soundness and completeness appear to be incompatible with succinctness, and there are no known methods for achieving perfect zero knowledge. A common way of measuring security is in bits of security, where a tolerance of 1 / (2^n) is referred to as n bits of security. More bits of security is better.

If a zkVM is perfectly correct, that does not necessarily imply that it is reliable. Correctness only implies that the zkVM satisfies its security properties up to the claimed tolerances. It does not imply that the claimed tolerances are low enough to be market ready. Also, if a zkVM is sufficiently secure, that does not imply that it is correct; security refers to the claimed tolerances, not the tolerances that are actually achieved. Only when a zkVM is both perfectly correct and sufficiently secure can it be said that the zkVM is reliable up to the claimed tolerances.

iii. Trust Assumptions

Assumptions about the honesty of the prover and verifier to reach the conclusion the zkVM functions reliably

When zkVMs have trust assumptions, this usually takes the form of a trusted setup process. A setup process for a ZK proof system runs once, before the first proof is generated using the proof system, in order to generate some information called “the setup data.” In a trusted setup process, one or more individuals generate some randomness which gets incorporated into the setup data, and it needs to be assumed that at least one of those individuals deleted the randomness which they incorporated into the setup data.

There are two common trust assumption models in practice.

An honest majority trust assumption states that out of some set of N individuals, more than N/2 of those individuals exhibited integrity in some particular interaction(s) with the system, which is commonly used by blockchains

A “1 out of N” trust assumption states that out of some set of N individuals, at least one of those individuals exhibited integrity in some particular interaction(s) with the system, which is commonly used by MPC based tools and applications.

It is generally agreed that all else being equal, zkVMs with no trust assumptions are more secure than zkVMs which require trust assumptions.

4.2 The zkVM Trilemma: Balancing Speed, Efficiency, and Succinctness in zkVMs

The zkVM trilemma: speed, efficiency, and succinctness

Speed, efficiency, and succinctness are all sliding scale properties. All of these factors contribute to the end user cost of zkVM. How they should be weighted in an evaluation depends on the application. Often, the fastest solution is not the most efficient or the most succinct; the most succinct solution is not the fastest or the most efficient; and so on and so forth. Let’s first define each property before explaining their relationship

i. Speed

How quickly can the prover generate proofs
Measured in wall clock time, which is the time elapsed from start to finish of computation

Speed should be defined and measured relative to specific test programs, inputs, and systems to ensure it can be quantitatively assessed. This metric is critical for latency-sensitive applications where prompt availability of proofs is essential, but comes with higher overhead and larger proof sizes

ii. Efficiency

The resources consumed by the prover, with less being preferable
Can be approximated by user time, which is the amount of computer time expended by the program code

The prover consumes two kinds of resources: core-time and space. Efficiency can therefore be subdivided into core-time efficiency and space efficiency.\

Core-Time Efficiency: Average amount of time the prover operates across all cores multiplied by number of cores running the prover. For a single-core prover, the core-time consumption and the speed are the same thing. For a multi-core capable prover running in multi-core mode on a multi-core system, core-time consumption and speed are not the same thing. If a program fully utilizes 5 cores or threads for 5 seconds, that would be 25 seconds of user time and 5 seconds of wall clock time.

Space Efficiency: Refers to the storage capacity used, such as RAM

User time is interesting as a proxy for the energy consumed by running a computation. In a situation where all cores are fully utilized almost all the time, the energy consumption of a CPU should remain relatively constant. In this situation, the user time spent by a CPU-bound, predominantly user-mode code execution should be roughly linearly proportional to the number of watt-hours (i.e., energy) consumed by that code execution.

Economizing on energy consumption, or the usage of computing resources, should be interesting from the point of view of any proving operations which have enough scale that their energy bill (or their cloud compute bill) for proving is a considerable operational cost. For these reasons, user time is an interesting metric. Lower costs of proving allow service providers to pass on lower proving prices to cost-sensitive customers.

Both kinds of efficiency are related to the energy consumption of the proving process and the amount of capital utilized by the proving process, which relates to the financial cost of proving. To operationalize the definition of efficiency for measurement, the definition has to be made relative to one or more test programs, one or more test inputs to each of those programs, and one or more test systems.

iii. Succinctness

The size of proofs generated and the complexity of verifying them

Succinctness is a composite of three distinct metrics, with the complexity of proof verification further subdivided:

Proof Size: The physical size of the proofs, typically measured in kilobytes
Proof Verification Time: Duration required to verify proofs.
Proof Verification Space: Memory usage during proof verification

Verification is typically a single core operation, thus speed and core-time efficiency are generally equivalent in this context. As with speed and efficiency, operationalizing the definition of succinctness requires specifying sets of test programs, test inputs, and test systems.

With each performance property defined, we now illustrate the dimunitive effects of optimizing one property over the others.

Speed: Fast proof generation leads to larger proof sizes, but proof verification is slow. More resources are consumed to generate proofs, which reduces efficiency
Succinctness: Prover needs more time to compress proofs. But proof verification is fast. The more succinct the proof, the higher the computation overhead
Efficiency: Minimizing resource usage reduces speed of proof generation and the succinctness of the proof

In general, optimizing for one quality means not optimizing for another quality, and therefore a multi-dimensional analysis is needed to pick an optimal solution on a case by case basis.

A good way of weighting these properties in an evaluation may be to define acceptable levels for each property and then determine which properties are most important. The most important properties should be optimized, subject to maintaining good enough levels on all other properties.

Below we summarize each property and their key considerations:

Evaluation properties of zkVMs

5. What’s Next?

With the above table, we hereby conclude the first article of our series. In the next articles, we will build on the flow chart shown above to explain common arithmetic and cryptographic processes in zkVMs.

If you found this helpful, visit our website and gitbook to learn more about what we are building at Lita.

Also, follow us on X and Discord to stay updated so you don’t miss the rest of the series

Disclaimer：

This article is reprinted from [lita]. Forward the Original Title‘A Zero Knowledge Paradigm: Part 1 - What is a zk-VM?’. All copyrights belong to the original author [Lita Team]. If there are objections to this reprint, please contact the Gate Learn team, and they will handle it promptly.
Liability Disclaimer: The views and opinions expressed in this article are solely those of the author and do not constitute any investment advice.
Translations of the article into other languages are done by the Gate Learn team. Unless mentioned, copying, distributing, or plagiarizing the translated articles is prohibited.

1. Zero Knowledge Proofs: A Primer

2. What is a zkVM?

3. Abstracted zkVM Process Flow

4. Evaluating zkVMs

5. What’s Next?

What is a zk-VM?

Intermediate6/3/2024, 1:43:54 PM

Blockchain

1. Zero Knowledge Proofs: A Primer

2. What is a zkVM?

3. Abstracted zkVM Process Flow

4. Evaluating zkVMs

5. What’s Next?

Forward the Original Title ‘A Zero Knowledge Paradigm: Part 1 - What is a zk-VM?’

1. Zero Knowledge Proofs: A Primer

What are zero knowledge proofs (ZKPs)?

zkSNARKs: Zero Knowledge Succinct Non-Interactive Argument of Knowledge

Relies on an initial (trusted or non-trusted) setup process to establish parameters for verification
Requires at least one interaction between prover and verifier
Proof sizes are small and easy to verify
NARK based proofs are used by rollups like zkSync, Scroll, and Linea

zkSTARKs: Zero Knowledge Scalable Transparent Argument of Knowledge‍

No trusted setup required
Offer high transparency by using publicly verifiable randomness to create trustless verifiable systems, i.e. generating provably random parameters for proving and verifying
Highly scalable as they can generate and verify proofs fast (not always), even when the size of the underlying witness (data) is large
Requires no interaction between prover and verifier
Comes at the cost that STARKs generate larger proofs, which can be harder to verify than SNARKs
Proofs are harder to verify than some zkSNARK proofs but not as hard as some others to verify
STARKs are used by Starknet, as well as zkVMs like Lita, Risc Zero, and Succinct Labs

(Note: Succinct’s bridge uses SNARKs but SP1 is a STARK based protocol)

It is worth noting that all STARKs are SNARKs, but not all SNARKs are STARKs.

2. What is a zkVM?

The main components of a zkVM

Below are some examples of zkVMs and their components for your reference.

zkVMs and their components

3. Abstracted zkVM Process Flow

General flow for a zkVM

The process flow of a zkVM is generally as follows:

Compiler Stage

The compiler first compiles programs written in conventional languages (C, C++, Rust, Solidity) into machine code. The format of the machine code is dictated by the choice of the ISA.

VM Stage

The VM executes the machine code and generates an execution trace, which is the series of steps of the underlying program. The format of this is predetermined by the choice of arithmetization as well as the set of polynomial constraints. Common arithmetization schemes include R1CS as in Groth16, PLONKish arithmetization as in halo2, and AIR as in plonky2 and plonky3.

Prover Stage

The prover receives the trace and represents it as a set of polynomials bound by a set of constraints, essentially translating the computation into algebra by mapping out the facts mathematically.
The prover commits to these polynomials using a Polynomial Commitment Scheme (PCS). A commitment scheme is a protocol which allows the prover to create a fingerprint of some data X, which is called a commitment to X, and later prove facts about X without revealing X, using the commitment to X. The PCS is the fingerprint; a “pre-processed’’ succinct version of the constraints on the computation. This allows the prover to prove facts about the computation, now expressed in a polynomial equation, using random values proposed by the verifier in the following steps.
The prover runs a Polynomial Interactive Oracle Proof (PIOP) to show that the committed polynomials represent an execution trace which satisfies the given constraints. A PIOP is an interactive proof protocol consisting of a series of rounds in which the prover sends commitments to polynomials, the verifier responds with random field values, and the prover provides evaluations of the polynomial at these random values, akin to “solving” the polynomial equation using random values to probabilistically persuade the verifier.
Applying the Fiat-Shamir heuristic; the prover runs the PIOP in a non-interactive mode, where the verifier’s behavior is limited to sending pseudo-random challenge points. In cryptography, the Fiat–Shamir heuristic converts an interactive proof of knowledge into a digital signature for verification. This step encrypts the proof and makes it zero knowledge.
The prover must convince the verifier that the claimed polynomial evaluations are correct, with respect to the polynomial commitments it sent to the verifier. To do this, the prover produces an “evaluation” or “opening” proof, which is provided by the polynomial commitment scheme (fingerprint).

Verifier Stage

The verifier checks the proof by following the proof system’s verification protocol, either using the constraints or the commitment. The verifier either accepts or rejects the result according to the validity of the proof.

A zkVM proof proves, for a given VM program and a given output, that there is some input which causes the given program to produce the given output when executed on the VM.

4. Evaluating zkVMs

What are the criteria by which we should evaluate zkVMs? In other words, when should we say that one zkVM is better than another? Truthfully, the answer depends on the use case.

We hereby propose that zkVMs should be evaluated on the following criteria, separated into two subsets:

Main criteria for evaluating zk-VMs

Baseline: Measures the reliability of zkVMs

Correctness
Security
Trust Assumptions

Performance: Measures the capabilities of a zkVM

Efficiency
Speed
Succinctness

4.1 Baseline: Correctness, Security, and Trust Assumptions

Without these properties, the zkVM is potentially worse than useless for the application, as it may fail to perform as specified and expose users to hacking and exploits.

i. Correctness

The VM must execute the computation as intended
The proof system must satisfy its claimed security properties

Correctness is comprised of three properties:

Soundness: The proof system is truthful and therefore everything it proves is true. The verifier rejects proofs of false statements; it only accepts a computation’s result if the inputs actually produce that result.
Completeness: The proof system is complete and can prove all true statements. If the prover claims it can prove a computation’s result, it must be able to produce a proof that the verifier accepts.
Zero Knowledge: Possessing a proof doesn’t reveal any more about the computation’s inputs than knowing the result itself would

ii. Security

Relates to the tolerances of soundness, completeness, and zero knowledge

iii. Trust Assumptions

Assumptions about the honesty of the prover and verifier to reach the conclusion the zkVM functions reliably

There are two common trust assumption models in practice.

It is generally agreed that all else being equal, zkVMs with no trust assumptions are more secure than zkVMs which require trust assumptions.

4.2 The zkVM Trilemma: Balancing Speed, Efficiency, and Succinctness in zkVMs

The zkVM trilemma: speed, efficiency, and succinctness

i. Speed

How quickly can the prover generate proofs
Measured in wall clock time, which is the time elapsed from start to finish of computation

ii. Efficiency

The resources consumed by the prover, with less being preferable
Can be approximated by user time, which is the amount of computer time expended by the program code

The prover consumes two kinds of resources: core-time and space. Efficiency can therefore be subdivided into core-time efficiency and space efficiency.\

Space Efficiency: Refers to the storage capacity used, such as RAM

iii. Succinctness

The size of proofs generated and the complexity of verifying them

Succinctness is a composite of three distinct metrics, with the complexity of proof verification further subdivided:

Proof Size: The physical size of the proofs, typically measured in kilobytes
Proof Verification Time: Duration required to verify proofs.
Proof Verification Space: Memory usage during proof verification

With each performance property defined, we now illustrate the dimunitive effects of optimizing one property over the others.

Speed: Fast proof generation leads to larger proof sizes, but proof verification is slow. More resources are consumed to generate proofs, which reduces efficiency
Succinctness: Prover needs more time to compress proofs. But proof verification is fast. The more succinct the proof, the higher the computation overhead
Efficiency: Minimizing resource usage reduces speed of proof generation and the succinctness of the proof

In general, optimizing for one quality means not optimizing for another quality, and therefore a multi-dimensional analysis is needed to pick an optimal solution on a case by case basis.

Below we summarize each property and their key considerations:

Evaluation properties of zkVMs

5. What’s Next?

If you found this helpful, visit our website and gitbook to learn more about what we are building at Lita.

Also, follow us on X and Discord to stay updated so you don’t miss the rest of the series

Disclaimer：

This article is reprinted from [lita]. Forward the Original Title‘A Zero Knowledge Paradigm: Part 1 - What is a zk-VM?’. All copyrights belong to the original author [Lita Team]. If there are objections to this reprint, please contact the Gate Learn team, and they will handle it promptly.
Liability Disclaimer: The views and opinions expressed in this article are solely those of the author and do not constitute any investment advice.
Translations of the article into other languages are done by the Gate Learn team. Unless mentioned, copying, distributing, or plagiarizing the translated articles is prohibited.

Start Now

$100

Voucher!

What is a zk-VM?

1. Zero Knowledge Proofs: A Primer

2. What is a zkVM?

3. Abstracted zkVM Process Flow

4. Evaluating zkVMs

5. What’s Next?

Disclaimer：

Related articles

What is a zk-VM?

1. Zero Knowledge Proofs: A Primer

2. What is a zkVM?

3. Abstracted zkVM Process Flow

4. Evaluating zkVMs

5. What’s Next?

Disclaimer：

Related articles