Ethereum’s roadmap incorporates a scaling tech called Data Availability Sampling (DAS) 6. DAS introduces new @djrtwo/das-requirements">requirements 4 to Ethereum’s networking stack, necessitating the implementation of specialized networking protocols 7. One prominent @dankrad/S-Kademlia-DAS">protocol proposal 4 uses a Distributed Hash Table (DHT) based on Kademlia 2 to store and retrieve the samples of the data.
However, DHTs 4 are susceptible to Sybil attacks: An attacker who controls a large number of DHT nodes can make DAS samples unavailable. To counteract this threat, a high-trust networking layer can be established, consisting solely of beacon chain validators. Such a security measure significantly raises the barrier for attackers, as they must now stake their own ETH to attack the DHT.
In this post, we introduce a proof of validator protocol, which enables DHT participants to demonstrate, in zero-knowledge, that they are an Ethereum validator.
In this section, we motivate further the proof of validator protocol by describing a Sybil attack against Data Availability Sampling.
The DAS protocol revolves around the block builder ensuring that block data is made available so that clients can fetch them. Present approaches involve partitioning data into samples, and network participants only fetch samples that pertain to their interests.
)
Consider a scenario where a Sybil attacker wants to prevent network participants from fetching samples from a victim node, which is resposible for providing the sample. As depicted in the figure above, the attacker generates many node IDs which are close to the victim’s node ID. By surrounding the victim’s node with their own nodes, the attacker hinders clients from discovering the victim node, as evil nodes will deliberately withhold information about the victim’s existence.
For more information about such Sybil attacks, see this recent research paper 2 on DHT Eclipse attacks. Furthermore, @dankrad/S-Kademlia-DAS#SKademlia-modifications">Dankrad’s DAS networking protocol proposal 8 describes how the S/Kademlia DHT protocol suffers from such attacks and shows the need for a proof of validator protocol.
The above attack motivates the need for a proof of validator protocol: If only validators can join the DHT, then an attacker who wants to launch a Sybil attack must also stake a large amount of ETH.
Using our proof of validator protocol we ensure that only beacon chain validators can join the DHT and that each validator gets a unique DHT identity.
Furthermore, for validator DoS resilience, we also aim to hide the identity of the validators on the networking layer. That is, we don’t want attackers to be able to tell which DHT node corresponds to which validator.
To fulfill these objectives, the proof of validator protocol must meet the following requirements:
Such a proof of validator protocol would be used by Bob during connection establishment in the DHT layer, so that Alice knows she is speaking to a validator.
Our proof of validator protocol is effectively a simple anonymous credential scheme. Its objective is to enable Alice to generate a unique derived key, denoted as D, if and only if she is a validator. Subsequently, Alice uses this derived key D within the networking layer.
In designing this protocol, our objective was to create a solution that was both straightforward to implement and analyze, ensuring it meets the outlined requirements in an efficient way.
The protocol employs a membership proof subprotocol, wherein Alice proves she is a validator by demonstrating knowledge of a secret hash preimage using ZK proofs. Alice then constructs a unique keypair derived from that secret hash preimage.
The membership proof subprotocol can be instantiated through different methods. In this post, we show a protocol using Merkle trees and a second protocol using lookups.
While both approaches demonstrate acceptable efficiency, they feature distinct tradeoffs. Merkle trees rely on SNARK-friendly hash functions like Poseidon (which may be considered experimental). On the other hand, efficient lookup protocols rely on a powers-of-tau trusted setup of size equal to the size of the validator set (currently 700k validators but growing).
Now let’s dive into the protocols:
Merkle trees have seen widespread use for membership proofs (e.g. see Semaphore 3). Here is the tradeoff space when designing a membership proof using Merkle trees:
Below we describe the proof of validator protocol based on Merkle trees:
)
At the end of the protocol, Alice can use D in the DHT to sign messages and derive her unique DHT node identity.
Now let’s look at a slightly more complicated, but much more efficient, solution using lookups.
Here is the tradeoff space of using lookup 2 protocols like Caulk 2:
Below we describe a concrete proof of validator protocol:
Exactly like in the Merkle approach, every validator i registers a new value pi on the blockchain such that:
We benchmarked the runtime of our membership proof protocol (link 6 to the benchmark code 5) in terms of proof creation and verification. Note that while the membership proof is just one part of our proof of validator protocol, we expect it to dominate the overall running time.
Below we provide benchmark results for a merkle tree membership proof using the Halo2 proof system with IPA as the polynomial commitment scheme. IPA is a slower scheme than KZG but it doesn’t require a trusted setup maximizing the advantages of the merkle tree approach.
We observe that both the prover and verifier times align well with our efficiency requirements. For this reason, we decided against benchmarking the Caulk-based approach, as its performance is expected to be significantly better in all categories (especially prover time and proof size).
Benchmarks were collected on a laptop running on an Intel i7-8550U (five years old CPU).
The uniqueness property of the proof of validator protocol ensures that each network participant possesses a distinct derived keypair. However, for certain networking protocols, it might be advantageous to allow validators to have rotating identities, where their derived keys change periodically, perhaps daily.
In such a scenario, if Eve misbehaves on a particular day, Alice can blocklist her for that day. However, on the next day, Eve can generate a new derived key, which is not blocklisted. If we wanted to be able to permanently blocklist validators based on their rotating identity we would need a more advanced anonymous credentials scheme like SNARKBlock 1.
An alternative (perhaps simpler) approach would be to build a commitment out of all validator identity BLS12-381 keys and do a membership proof on that commitment.
However, this approach would require validators to insert their identity private key into the ZK proof system to create a valid membership proof and compute the unique derived key.
We decided to not take this approach because it’s not good practice to insert sensitive identity keys into complicated cryptographic protocol, and it would also make it harder for validators to keep their main identity key offline.
Thanks to Enrico Bottazzi, Cedoor, Vivian Plasencia and Wanseob for the help in navigating the web of membership proof codebases.
Ethereum’s roadmap incorporates a scaling tech called Data Availability Sampling (DAS) 6. DAS introduces new @djrtwo/das-requirements">requirements 4 to Ethereum’s networking stack, necessitating the implementation of specialized networking protocols 7. One prominent @dankrad/S-Kademlia-DAS">protocol proposal 4 uses a Distributed Hash Table (DHT) based on Kademlia 2 to store and retrieve the samples of the data.
However, DHTs 4 are susceptible to Sybil attacks: An attacker who controls a large number of DHT nodes can make DAS samples unavailable. To counteract this threat, a high-trust networking layer can be established, consisting solely of beacon chain validators. Such a security measure significantly raises the barrier for attackers, as they must now stake their own ETH to attack the DHT.
In this post, we introduce a proof of validator protocol, which enables DHT participants to demonstrate, in zero-knowledge, that they are an Ethereum validator.
In this section, we motivate further the proof of validator protocol by describing a Sybil attack against Data Availability Sampling.
The DAS protocol revolves around the block builder ensuring that block data is made available so that clients can fetch them. Present approaches involve partitioning data into samples, and network participants only fetch samples that pertain to their interests.
)
Consider a scenario where a Sybil attacker wants to prevent network participants from fetching samples from a victim node, which is resposible for providing the sample. As depicted in the figure above, the attacker generates many node IDs which are close to the victim’s node ID. By surrounding the victim’s node with their own nodes, the attacker hinders clients from discovering the victim node, as evil nodes will deliberately withhold information about the victim’s existence.
For more information about such Sybil attacks, see this recent research paper 2 on DHT Eclipse attacks. Furthermore, @dankrad/S-Kademlia-DAS#SKademlia-modifications">Dankrad’s DAS networking protocol proposal 8 describes how the S/Kademlia DHT protocol suffers from such attacks and shows the need for a proof of validator protocol.
The above attack motivates the need for a proof of validator protocol: If only validators can join the DHT, then an attacker who wants to launch a Sybil attack must also stake a large amount of ETH.
Using our proof of validator protocol we ensure that only beacon chain validators can join the DHT and that each validator gets a unique DHT identity.
Furthermore, for validator DoS resilience, we also aim to hide the identity of the validators on the networking layer. That is, we don’t want attackers to be able to tell which DHT node corresponds to which validator.
To fulfill these objectives, the proof of validator protocol must meet the following requirements:
Such a proof of validator protocol would be used by Bob during connection establishment in the DHT layer, so that Alice knows she is speaking to a validator.
Our proof of validator protocol is effectively a simple anonymous credential scheme. Its objective is to enable Alice to generate a unique derived key, denoted as D, if and only if she is a validator. Subsequently, Alice uses this derived key D within the networking layer.
In designing this protocol, our objective was to create a solution that was both straightforward to implement and analyze, ensuring it meets the outlined requirements in an efficient way.
The protocol employs a membership proof subprotocol, wherein Alice proves she is a validator by demonstrating knowledge of a secret hash preimage using ZK proofs. Alice then constructs a unique keypair derived from that secret hash preimage.
The membership proof subprotocol can be instantiated through different methods. In this post, we show a protocol using Merkle trees and a second protocol using lookups.
While both approaches demonstrate acceptable efficiency, they feature distinct tradeoffs. Merkle trees rely on SNARK-friendly hash functions like Poseidon (which may be considered experimental). On the other hand, efficient lookup protocols rely on a powers-of-tau trusted setup of size equal to the size of the validator set (currently 700k validators but growing).
Now let’s dive into the protocols:
Merkle trees have seen widespread use for membership proofs (e.g. see Semaphore 3). Here is the tradeoff space when designing a membership proof using Merkle trees:
Below we describe the proof of validator protocol based on Merkle trees:
)
At the end of the protocol, Alice can use D in the DHT to sign messages and derive her unique DHT node identity.
Now let’s look at a slightly more complicated, but much more efficient, solution using lookups.
Here is the tradeoff space of using lookup 2 protocols like Caulk 2:
Below we describe a concrete proof of validator protocol:
Exactly like in the Merkle approach, every validator i registers a new value pi on the blockchain such that:
We benchmarked the runtime of our membership proof protocol (link 6 to the benchmark code 5) in terms of proof creation and verification. Note that while the membership proof is just one part of our proof of validator protocol, we expect it to dominate the overall running time.
Below we provide benchmark results for a merkle tree membership proof using the Halo2 proof system with IPA as the polynomial commitment scheme. IPA is a slower scheme than KZG but it doesn’t require a trusted setup maximizing the advantages of the merkle tree approach.
We observe that both the prover and verifier times align well with our efficiency requirements. For this reason, we decided against benchmarking the Caulk-based approach, as its performance is expected to be significantly better in all categories (especially prover time and proof size).
Benchmarks were collected on a laptop running on an Intel i7-8550U (five years old CPU).
The uniqueness property of the proof of validator protocol ensures that each network participant possesses a distinct derived keypair. However, for certain networking protocols, it might be advantageous to allow validators to have rotating identities, where their derived keys change periodically, perhaps daily.
In such a scenario, if Eve misbehaves on a particular day, Alice can blocklist her for that day. However, on the next day, Eve can generate a new derived key, which is not blocklisted. If we wanted to be able to permanently blocklist validators based on their rotating identity we would need a more advanced anonymous credentials scheme like SNARKBlock 1.
An alternative (perhaps simpler) approach would be to build a commitment out of all validator identity BLS12-381 keys and do a membership proof on that commitment.
However, this approach would require validators to insert their identity private key into the ZK proof system to create a valid membership proof and compute the unique derived key.
We decided to not take this approach because it’s not good practice to insert sensitive identity keys into complicated cryptographic protocol, and it would also make it harder for validators to keep their main identity key offline.
Thanks to Enrico Bottazzi, Cedoor, Vivian Plasencia and Wanseob for the help in navigating the web of membership proof codebases.