Forward the Original Title ‘A Zero Knowledge Paradigm: Part 1 - What is a zk-VM?’
What are zero knowledge proofs (ZKPs)?
If you have no prior knowledge of zero knowledge proofs (ZKP), this video from Wired explains the concept at five difficulty levels in an interactive manner with easily understandable examples and demonstrations. Highly recommended.
In simplest terms, ZKPs enable one party (prover) to prove to another party (the verifier) that they know something without revealing what that thing is or any additional information. More specifically, ZKPs prove knowledge of a piece of data, or knowledge of the result of a computation, without revealing the data or inputs. The process of creating a zero knowledge proof involves a series of mathematical models to transform the results of a computation into a piece of otherwise meaningless information that proves successful execution of code, which can later be verified.
In some cases, it takes less work to verify the proof, which is constructed after multiple rounds of algebraic conversions and cryptography, than it would take to run the computation. This unique combination of security and scalability is what makes zero knowledge cryptography such a powerful tool.
zkSNARKs: Zero Knowledge Succinct Non-Interactive Argument of Knowledge
zkSTARKs: Zero Knowledge Scalable Transparent Argument of Knowledge
(Note: Succinct’s bridge uses SNARKs but SP1 is a STARK based protocol)
It is worth noting that all STARKs are SNARKs, but not all SNARKs are STARKs.
For a better general understanding of SNARKs and STARKs, we recommend reading this @krzhang/privacy-in-cryptocurrencies-zero-knowledge-and-zk-snarks-1-2-68ce1838fd9c">article series written by Yan Zhang and Yi Sun of Axiom, and this collection of articles in Ventali Tan’s github.
A virtual machine (VM) is a program that runs programs. In context, a zkVM is a virtual computer that is implemented as a system for generating zero knowledge proofs, or a universal circuit, or tool, for generating ZKPs for any program or computation.
zkVMs remove the need to learn complicated mathematics and cryptography for designing and coding ZK, and enables any developer to execute programs written in their preferred languages and generate ZKPs, making it far easier to integrate and interact with zero knowledge. Broadly speaking, most references to zkVMs implicitly include the compiler toolchains and proof systems appended to the virtual machine that executes the programs, and not just the virtual machine itself. Below, we summarize the main components of a zkVM and their functions:
The main components of a zkVM
The design and implementation of each component is governed by the choice of proof (SNARKs or STARKs) and the instruction set architecture (ISA) of the zkVM. Traditionally, an ISA specifies what a CPU is capable of (data types, registers, memory etc.) and the sequence of actions the CPU performs when it executes a program. In context, the ISA determines the machine code that is interpretable and executable by the VM. Choosing an ISA can yield radical differences in the accessibility and usability of the zkVM, as well as the speed and efficiency of the proof generation processes, and underpins the construction of any zkVM.
Below are some examples of zkVMs and their components for your reference.
zkVMs and their components
For now, we will focus on the interactions between each component at a high level to provide a framework for understanding the algebraic and cryptographic processes as well as the design tradeoffs of a zkVM in a later article.
The following diagram is an abstracted, generalized process flowchart of a zkVM, split and categorized between the format (inputs / outputs) of a program as it moves through the components of a zkVM. We will examine each process in depth in subsequent articles.
General flow for a zkVM
The process flow of a zkVM is generally as follows:
The prover receives the trace and represents it as a set of polynomials bound by a set of constraints, essentially translating the computation into algebra by mapping out the facts mathematically.
The prover commits to these polynomials using a Polynomial Commitment Scheme (PCS). A commitment scheme is a protocol which allows the prover to create a fingerprint of some data X, which is called a commitment to X, and later prove facts about X without revealing X, using the commitment to X. The PCS is the fingerprint; a “pre-processed’’ succinct version of the constraints on the computation. This allows the prover to prove facts about the computation, now expressed in a polynomial equation, using random values proposed by the verifier in the following steps.
The prover runs a Polynomial Interactive Oracle Proof (PIOP) to show that the committed polynomials represent an execution trace which satisfies the given constraints. A PIOP is an interactive proof protocol consisting of a series of rounds in which the prover sends commitments to polynomials, the verifier responds with random field values, and the prover provides evaluations of the polynomial at these random values, akin to “solving” the polynomial equation using random values to probabilistically persuade the verifier.
Applying the Fiat-Shamir heuristic; the prover runs the PIOP in a non-interactive mode, where the verifier’s behavior is limited to sending pseudo-random challenge points. In cryptography, the Fiat–Shamir heuristic converts an interactive proof of knowledge into a digital signature for verification. This step encrypts the proof and makes it zero knowledge.
The prover must convince the verifier that the claimed polynomial evaluations are correct, with respect to the polynomial commitments it sent to the verifier. To do this, the prover produces an “evaluation” or “opening” proof, which is provided by the polynomial commitment scheme (fingerprint).
Summarily, a zkVM proof proves, for a given program, a given result, and given initial conditions, that there is some input which causes the program to produce the given result when executed from the given initial conditions. We can combine this statement with the process flow to arrive at the following description of a zkVM.
A zkVM proof proves, for a given VM program and a given output, that there is some input which causes the given program to produce the given output when executed on the VM.
What are the criteria by which we should evaluate zkVMs? In other words, when should we say that one zkVM is better than another? Truthfully, the answer depends on the use case.
Lita’s market research suggests that for most commercial use cases, the most important properties, out of speed, efficiency, and succinctness, are either speed or core-time efficiency, depending on the application. Some applications are price sensitive and want to optimize for low energy consumption and low use of capital in proving; for these, core-time efficiency is probably the most important metric to optimize. Other applications, particularly finance or trading related applications, are latency sensitive and want to optimize for speed.
Most publicized comparisons of performance focus exclusively on speed, which is important but not a holistic measurement of performance. There are also a few important properties that measure the reliability of a zkVM; most of which are not up to production ready standards, even for market leading incumbents.
We hereby propose that zkVMs should be evaluated on the following criteria, separated into two subsets:
Main criteria for evaluating zk-VMs
Baseline: Measures the reliability of zkVMs
Performance: Measures the capabilities of a zkVM
4.1 Baseline: Correctness, Security, and Trust Assumptions
When evaluating zkVMs for mission critical applications, correctness and security should be considered as the baseline. There needs to be sufficient reason to be confident about the correctness, and sufficiently strong claimed security. Also, the trust assumptions need to be sufficiently weak for the application.
Without these properties, the zkVM is potentially worse than useless for the application, as it may fail to perform as specified and expose users to hacking and exploits.
i. Correctness
Correctness is comprised of three properties:
You can have completeness without soundness; if the proof system proves everything including falsity, obviously it is complete but not sound. Inversely, you can have soundness without completeness; if the proof system proves a program existed but does not cannot prove the computations, obviously it is sound (after all, it never proves any falsity), but not complete.
ii. Security
In practice, all three correctness properties have non-zero tolerances. This implies all proofs are statistical probabilities of correctness, and not absolute certainties. A tolerance refers to the maximum tolerable probability that one property will fail. Zero tolerances are of course the ideal, but zkVMs do not achieve zero tolerances on all of these properties in practice. Perfect soundness and completeness appear to be incompatible with succinctness, and there are no known methods for achieving perfect zero knowledge. A common way of measuring security is in bits of security, where a tolerance of 1 / (2^n) is referred to as n bits of security. More bits of security is better.
If a zkVM is perfectly correct, that does not necessarily imply that it is reliable. Correctness only implies that the zkVM satisfies its security properties up to the claimed tolerances. It does not imply that the claimed tolerances are low enough to be market ready. Also, if a zkVM is sufficiently secure, that does not imply that it is correct; security refers to the claimed tolerances, not the tolerances that are actually achieved. Only when a zkVM is both perfectly correct and sufficiently secure can it be said that the zkVM is reliable up to the claimed tolerances.
iii. Trust Assumptions
When zkVMs have trust assumptions, this usually takes the form of a trusted setup process. A setup process for a ZK proof system runs once, before the first proof is generated using the proof system, in order to generate some information called “the setup data.” In a trusted setup process, one or more individuals generate some randomness which gets incorporated into the setup data, and it needs to be assumed that at least one of those individuals deleted the randomness which they incorporated into the setup data.
There are two common trust assumption models in practice.
An honest majority trust assumption states that out of some set of N individuals, more than N/2 of those individuals exhibited integrity in some particular interaction(s) with the system, which is commonly used by blockchains
A “1 out of N” trust assumption states that out of some set of N individuals, at least one of those individuals exhibited integrity in some particular interaction(s) with the system, which is commonly used by MPC based tools and applications.
It is generally agreed that all else being equal, zkVMs with no trust assumptions are more secure than zkVMs which require trust assumptions.
4.2 The zkVM Trilemma: Balancing Speed, Efficiency, and Succinctness in zkVMs
The zkVM trilemma: speed, efficiency, and succinctness
Speed, efficiency, and succinctness are all sliding scale properties. All of these factors contribute to the end user cost of zkVM. How they should be weighted in an evaluation depends on the application. Often, the fastest solution is not the most efficient or the most succinct; the most succinct solution is not the fastest or the most efficient; and so on and so forth. Let’s first define each property before explaining their relationship
i. Speed
Speed should be defined and measured relative to specific test programs, inputs, and systems to ensure it can be quantitatively assessed. This metric is critical for latency-sensitive applications where prompt availability of proofs is essential, but comes with higher overhead and larger proof sizes
ii. Efficiency
The prover consumes two kinds of resources: core-time and space. Efficiency can therefore be subdivided into core-time efficiency and space efficiency.\
Core-Time Efficiency: Average amount of time the prover operates across all cores multiplied by number of cores running the prover. For a single-core prover, the core-time consumption and the speed are the same thing. For a multi-core capable prover running in multi-core mode on a multi-core system, core-time consumption and speed are not the same thing. If a program fully utilizes 5 cores or threads for 5 seconds, that would be 25 seconds of user time and 5 seconds of wall clock time.
Space Efficiency: Refers to the storage capacity used, such as RAM
User time is interesting as a proxy for the energy consumed by running a computation. In a situation where all cores are fully utilized almost all the time, the energy consumption of a CPU should remain relatively constant. In this situation, the user time spent by a CPU-bound, predominantly user-mode code execution should be roughly linearly proportional to the number of watt-hours (i.e., energy) consumed by that code execution.
Economizing on energy consumption, or the usage of computing resources, should be interesting from the point of view of any proving operations which have enough scale that their energy bill (or their cloud compute bill) for proving is a considerable operational cost. For these reasons, user time is an interesting metric. Lower costs of proving allow service providers to pass on lower proving prices to cost-sensitive customers.
Both kinds of efficiency are related to the energy consumption of the proving process and the amount of capital utilized by the proving process, which relates to the financial cost of proving. To operationalize the definition of efficiency for measurement, the definition has to be made relative to one or more test programs, one or more test inputs to each of those programs, and one or more test systems.
iii. Succinctness
Succinctness is a composite of three distinct metrics, with the complexity of proof verification further subdivided:
Verification is typically a single core operation, thus speed and core-time efficiency are generally equivalent in this context. As with speed and efficiency, operationalizing the definition of succinctness requires specifying sets of test programs, test inputs, and test systems.
With each performance property defined, we now illustrate the dimunitive effects of optimizing one property over the others.
In general, optimizing for one quality means not optimizing for another quality, and therefore a multi-dimensional analysis is needed to pick an optimal solution on a case by case basis.
A good way of weighting these properties in an evaluation may be to define acceptable levels for each property and then determine which properties are most important. The most important properties should be optimized, subject to maintaining good enough levels on all other properties.
Below we summarize each property and their key considerations:
Evaluation properties of zkVMs
With the above table, we hereby conclude the first article of our series. In the next articles, we will build on the flow chart shown above to explain common arithmetic and cryptographic processes in zkVMs.
If you found this helpful, visit our website and gitbook to learn more about what we are building at Lita.
Also, follow us on X and Discord to stay updated so you don’t miss the rest of the series
Forward the Original Title ‘A Zero Knowledge Paradigm: Part 1 - What is a zk-VM?’
What are zero knowledge proofs (ZKPs)?
If you have no prior knowledge of zero knowledge proofs (ZKP), this video from Wired explains the concept at five difficulty levels in an interactive manner with easily understandable examples and demonstrations. Highly recommended.
In simplest terms, ZKPs enable one party (prover) to prove to another party (the verifier) that they know something without revealing what that thing is or any additional information. More specifically, ZKPs prove knowledge of a piece of data, or knowledge of the result of a computation, without revealing the data or inputs. The process of creating a zero knowledge proof involves a series of mathematical models to transform the results of a computation into a piece of otherwise meaningless information that proves successful execution of code, which can later be verified.
In some cases, it takes less work to verify the proof, which is constructed after multiple rounds of algebraic conversions and cryptography, than it would take to run the computation. This unique combination of security and scalability is what makes zero knowledge cryptography such a powerful tool.
zkSNARKs: Zero Knowledge Succinct Non-Interactive Argument of Knowledge
zkSTARKs: Zero Knowledge Scalable Transparent Argument of Knowledge
(Note: Succinct’s bridge uses SNARKs but SP1 is a STARK based protocol)
It is worth noting that all STARKs are SNARKs, but not all SNARKs are STARKs.
For a better general understanding of SNARKs and STARKs, we recommend reading this @krzhang/privacy-in-cryptocurrencies-zero-knowledge-and-zk-snarks-1-2-68ce1838fd9c">article series written by Yan Zhang and Yi Sun of Axiom, and this collection of articles in Ventali Tan’s github.
A virtual machine (VM) is a program that runs programs. In context, a zkVM is a virtual computer that is implemented as a system for generating zero knowledge proofs, or a universal circuit, or tool, for generating ZKPs for any program or computation.
zkVMs remove the need to learn complicated mathematics and cryptography for designing and coding ZK, and enables any developer to execute programs written in their preferred languages and generate ZKPs, making it far easier to integrate and interact with zero knowledge. Broadly speaking, most references to zkVMs implicitly include the compiler toolchains and proof systems appended to the virtual machine that executes the programs, and not just the virtual machine itself. Below, we summarize the main components of a zkVM and their functions:
The main components of a zkVM
The design and implementation of each component is governed by the choice of proof (SNARKs or STARKs) and the instruction set architecture (ISA) of the zkVM. Traditionally, an ISA specifies what a CPU is capable of (data types, registers, memory etc.) and the sequence of actions the CPU performs when it executes a program. In context, the ISA determines the machine code that is interpretable and executable by the VM. Choosing an ISA can yield radical differences in the accessibility and usability of the zkVM, as well as the speed and efficiency of the proof generation processes, and underpins the construction of any zkVM.
Below are some examples of zkVMs and their components for your reference.
zkVMs and their components
For now, we will focus on the interactions between each component at a high level to provide a framework for understanding the algebraic and cryptographic processes as well as the design tradeoffs of a zkVM in a later article.
The following diagram is an abstracted, generalized process flowchart of a zkVM, split and categorized between the format (inputs / outputs) of a program as it moves through the components of a zkVM. We will examine each process in depth in subsequent articles.
General flow for a zkVM
The process flow of a zkVM is generally as follows:
The prover receives the trace and represents it as a set of polynomials bound by a set of constraints, essentially translating the computation into algebra by mapping out the facts mathematically.
The prover commits to these polynomials using a Polynomial Commitment Scheme (PCS). A commitment scheme is a protocol which allows the prover to create a fingerprint of some data X, which is called a commitment to X, and later prove facts about X without revealing X, using the commitment to X. The PCS is the fingerprint; a “pre-processed’’ succinct version of the constraints on the computation. This allows the prover to prove facts about the computation, now expressed in a polynomial equation, using random values proposed by the verifier in the following steps.
The prover runs a Polynomial Interactive Oracle Proof (PIOP) to show that the committed polynomials represent an execution trace which satisfies the given constraints. A PIOP is an interactive proof protocol consisting of a series of rounds in which the prover sends commitments to polynomials, the verifier responds with random field values, and the prover provides evaluations of the polynomial at these random values, akin to “solving” the polynomial equation using random values to probabilistically persuade the verifier.
Applying the Fiat-Shamir heuristic; the prover runs the PIOP in a non-interactive mode, where the verifier’s behavior is limited to sending pseudo-random challenge points. In cryptography, the Fiat–Shamir heuristic converts an interactive proof of knowledge into a digital signature for verification. This step encrypts the proof and makes it zero knowledge.
The prover must convince the verifier that the claimed polynomial evaluations are correct, with respect to the polynomial commitments it sent to the verifier. To do this, the prover produces an “evaluation” or “opening” proof, which is provided by the polynomial commitment scheme (fingerprint).
Summarily, a zkVM proof proves, for a given program, a given result, and given initial conditions, that there is some input which causes the program to produce the given result when executed from the given initial conditions. We can combine this statement with the process flow to arrive at the following description of a zkVM.
A zkVM proof proves, for a given VM program and a given output, that there is some input which causes the given program to produce the given output when executed on the VM.
What are the criteria by which we should evaluate zkVMs? In other words, when should we say that one zkVM is better than another? Truthfully, the answer depends on the use case.
Lita’s market research suggests that for most commercial use cases, the most important properties, out of speed, efficiency, and succinctness, are either speed or core-time efficiency, depending on the application. Some applications are price sensitive and want to optimize for low energy consumption and low use of capital in proving; for these, core-time efficiency is probably the most important metric to optimize. Other applications, particularly finance or trading related applications, are latency sensitive and want to optimize for speed.
Most publicized comparisons of performance focus exclusively on speed, which is important but not a holistic measurement of performance. There are also a few important properties that measure the reliability of a zkVM; most of which are not up to production ready standards, even for market leading incumbents.
We hereby propose that zkVMs should be evaluated on the following criteria, separated into two subsets:
Main criteria for evaluating zk-VMs
Baseline: Measures the reliability of zkVMs
Performance: Measures the capabilities of a zkVM
4.1 Baseline: Correctness, Security, and Trust Assumptions
When evaluating zkVMs for mission critical applications, correctness and security should be considered as the baseline. There needs to be sufficient reason to be confident about the correctness, and sufficiently strong claimed security. Also, the trust assumptions need to be sufficiently weak for the application.
Without these properties, the zkVM is potentially worse than useless for the application, as it may fail to perform as specified and expose users to hacking and exploits.
i. Correctness
Correctness is comprised of three properties:
You can have completeness without soundness; if the proof system proves everything including falsity, obviously it is complete but not sound. Inversely, you can have soundness without completeness; if the proof system proves a program existed but does not cannot prove the computations, obviously it is sound (after all, it never proves any falsity), but not complete.
ii. Security
In practice, all three correctness properties have non-zero tolerances. This implies all proofs are statistical probabilities of correctness, and not absolute certainties. A tolerance refers to the maximum tolerable probability that one property will fail. Zero tolerances are of course the ideal, but zkVMs do not achieve zero tolerances on all of these properties in practice. Perfect soundness and completeness appear to be incompatible with succinctness, and there are no known methods for achieving perfect zero knowledge. A common way of measuring security is in bits of security, where a tolerance of 1 / (2^n) is referred to as n bits of security. More bits of security is better.
If a zkVM is perfectly correct, that does not necessarily imply that it is reliable. Correctness only implies that the zkVM satisfies its security properties up to the claimed tolerances. It does not imply that the claimed tolerances are low enough to be market ready. Also, if a zkVM is sufficiently secure, that does not imply that it is correct; security refers to the claimed tolerances, not the tolerances that are actually achieved. Only when a zkVM is both perfectly correct and sufficiently secure can it be said that the zkVM is reliable up to the claimed tolerances.
iii. Trust Assumptions
When zkVMs have trust assumptions, this usually takes the form of a trusted setup process. A setup process for a ZK proof system runs once, before the first proof is generated using the proof system, in order to generate some information called “the setup data.” In a trusted setup process, one or more individuals generate some randomness which gets incorporated into the setup data, and it needs to be assumed that at least one of those individuals deleted the randomness which they incorporated into the setup data.
There are two common trust assumption models in practice.
An honest majority trust assumption states that out of some set of N individuals, more than N/2 of those individuals exhibited integrity in some particular interaction(s) with the system, which is commonly used by blockchains
A “1 out of N” trust assumption states that out of some set of N individuals, at least one of those individuals exhibited integrity in some particular interaction(s) with the system, which is commonly used by MPC based tools and applications.
It is generally agreed that all else being equal, zkVMs with no trust assumptions are more secure than zkVMs which require trust assumptions.
4.2 The zkVM Trilemma: Balancing Speed, Efficiency, and Succinctness in zkVMs
The zkVM trilemma: speed, efficiency, and succinctness
Speed, efficiency, and succinctness are all sliding scale properties. All of these factors contribute to the end user cost of zkVM. How they should be weighted in an evaluation depends on the application. Often, the fastest solution is not the most efficient or the most succinct; the most succinct solution is not the fastest or the most efficient; and so on and so forth. Let’s first define each property before explaining their relationship
i. Speed
Speed should be defined and measured relative to specific test programs, inputs, and systems to ensure it can be quantitatively assessed. This metric is critical for latency-sensitive applications where prompt availability of proofs is essential, but comes with higher overhead and larger proof sizes
ii. Efficiency
The prover consumes two kinds of resources: core-time and space. Efficiency can therefore be subdivided into core-time efficiency and space efficiency.\
Core-Time Efficiency: Average amount of time the prover operates across all cores multiplied by number of cores running the prover. For a single-core prover, the core-time consumption and the speed are the same thing. For a multi-core capable prover running in multi-core mode on a multi-core system, core-time consumption and speed are not the same thing. If a program fully utilizes 5 cores or threads for 5 seconds, that would be 25 seconds of user time and 5 seconds of wall clock time.
Space Efficiency: Refers to the storage capacity used, such as RAM
User time is interesting as a proxy for the energy consumed by running a computation. In a situation where all cores are fully utilized almost all the time, the energy consumption of a CPU should remain relatively constant. In this situation, the user time spent by a CPU-bound, predominantly user-mode code execution should be roughly linearly proportional to the number of watt-hours (i.e., energy) consumed by that code execution.
Economizing on energy consumption, or the usage of computing resources, should be interesting from the point of view of any proving operations which have enough scale that their energy bill (or their cloud compute bill) for proving is a considerable operational cost. For these reasons, user time is an interesting metric. Lower costs of proving allow service providers to pass on lower proving prices to cost-sensitive customers.
Both kinds of efficiency are related to the energy consumption of the proving process and the amount of capital utilized by the proving process, which relates to the financial cost of proving. To operationalize the definition of efficiency for measurement, the definition has to be made relative to one or more test programs, one or more test inputs to each of those programs, and one or more test systems.
iii. Succinctness
Succinctness is a composite of three distinct metrics, with the complexity of proof verification further subdivided:
Verification is typically a single core operation, thus speed and core-time efficiency are generally equivalent in this context. As with speed and efficiency, operationalizing the definition of succinctness requires specifying sets of test programs, test inputs, and test systems.
With each performance property defined, we now illustrate the dimunitive effects of optimizing one property over the others.
In general, optimizing for one quality means not optimizing for another quality, and therefore a multi-dimensional analysis is needed to pick an optimal solution on a case by case basis.
A good way of weighting these properties in an evaluation may be to define acceptable levels for each property and then determine which properties are most important. The most important properties should be optimized, subject to maintaining good enough levels on all other properties.
Below we summarize each property and their key considerations:
Evaluation properties of zkVMs
With the above table, we hereby conclude the first article of our series. In the next articles, we will build on the flow chart shown above to explain common arithmetic and cryptographic processes in zkVMs.
If you found this helpful, visit our website and gitbook to learn more about what we are building at Lita.
Also, follow us on X and Discord to stay updated so you don’t miss the rest of the series