When one thinks about encryption, the first usages that come to mind are encryption at rest and encryption in transit. The first allows one to store data on encrypted hard drives, removable devices, or even cloud databases, and guarantees that only the legitimate owner can see or edit the plaintext content. Encryption in transit guarantees that data transmitted over the Internet is only accessible to its designated recipients, even if it transits through public routers or channels. Both scenarios rely on encryption, with the additional integrity guarantee that the data has not been tampered with by a malicious attacker in between. This is known as authenticated encryption: once data is encrypted, nobody in the chain can infer a single bit of it (confidentiality) and nobody can alter the ciphertext without detection (integrity/authenticity).
Some collaborative use cases require that non-trivial processing be possible even on ciphertexts. This is the domain of privacy-preserving techniques, or encryption in use, with fully homomorphic encryption (FHE) being one of them. One example is electronic voting in the cloud: voters encrypt their ballots, some entity in the middle aggregates all the ballots to count the votes, and only the final result is published. Unfortunately, with authenticated encryption, the entity in the middle would need to decrypt all the ballots to perform such a computation, and would see the individual votes in the clear. We could, in theory, shuffle the ballots (some e-voting protocols actually rely on this), but, unlike for paper ballots, the traditional cryptographic mechanisms that guarantee integrity also make it much harder to unlink an encrypted ballot from the identity of its sender. Alternatively, one could add a hardware wall around the entity that counts the votes. This is, for instance, the objective of trusted execution enclaves, which make it much harder for an attacker to interact with that entity. But then a failure in the hardware could leak the decryption key, and, unlike software errors, hardware design vulnerabilities cannot be patched easily.
To address this and other similar use cases, we can make use of Fully Homomorphic Encryption (FHE). FHE is a special form of encryption that allows one to compute a function over the ciphertexts without decrypting them, and to obtain an encryption of the function’s output directly.
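To make the idea concrete, here is a minimal sketch of computing on ciphertexts using a textbook Paillier cryptosystem. Paillier is only additively homomorphic (a PHE scheme, not FHE), but it is enough to illustrate the principle: adding two ciphertexts yields a ciphertext of the sum of the plaintexts. The tiny primes and the whole implementation are for illustration only and offer no real security.

```python
import math
import random

# Toy Paillier key generation -- illustrative only, NOT secure.
p, q = 293, 433          # real deployments use primes of ~1536 bits each
n = p * q
n2 = n * n
lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)   # lcm(p-1, q-1)
g = n + 1                # standard choice of generator: g = 1 + n

def L(u):                # Paillier's L function
    return (u - 1) // n

mu = pow(L(pow(g, lam, n2)), -1, n)   # precomputed decryption constant

def encrypt(m):
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return (L(pow(c, lam, n2)) * mu) % n

def add(c1, c2):         # homomorphic addition = ciphertext multiplication
    return (c1 * c2) % n2

# Each voter encrypts 0 or 1; the tallier adds ciphertexts without decrypting.
votes = [1, 0, 1, 1, 0]
tally_ct = encrypt(0)
for v in votes:
    tally_ct = add(tally_ct, encrypt(v))
print(decrypt(tally_ct))   # 3
```

The tallier only ever manipulates ciphertexts; only the decryption key owner can open the final tally.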
Most of the time, the function f to evaluate is public, so the sequence of processing steps that converts an encryption of x into an encryption of f(x) is public knowledge, and can be carried out in the cloud without involving any secret.
This figure depicts three scenarios for e-voting: in the leftmost image, a trusted entity shuffles and decrypts the individual votes before publishing their sum. We must trust the entity doing the computation to preserve voter privacy and count the votes correctly. In the center image, a trusted enclave, relied upon for both integrity and privacy guarantees, performs the same computation. In the right image, homomorphic encryption is used: the encrypted votes can be added (in public) before the result is decrypted. E( ) denotes the encryption operation, while D( ) denotes decryption.
FHE is also compact, which means that the bit-size of the output ciphertext, and the effort needed to decrypt it, depend only on the number of bits in the plaintext result, not on the chain of computations that produced it. This excludes trivial non-compact cryptosystems that would simply concatenate the input ciphertext of x with the source code of f, and let the recipient do all the work by decrypting x and then applying f.
FHE outsourcing is often presented as an alternative to secure enclaves, one based on the hardness of a mathematical problem rather than on practical barriers. FHE is therefore immune to passive side-channel attacks and other passive corruptions of the cloud host. Imagine that someone needs to outsource some computation, but the data is really sensitive. That person would probably be reluctant to use a cloud VM if someone else can be root on the machine. They would probably also hesitate to run it in an enclave like SGX, knowing that the CPU and memory of cloud hosts are constantly monitored for load, power, and temperature, and that some information might be extracted from these measurements. Such a person may be reassured by the promise of FHE that extracting even a single bit of information requires breaking a post-quantum mathematical problem, independently of any measurements an attacker could gather.
While the confidentiality provided by the scheme prevents an attacker from reading a single bit of information without the secret key, the universal malleability of FHE allows, on the other hand, an attacker to flip any bit that is computed: in a circuit, this would be the equivalent of an active side-channel attack in which the attacker is granted a magical laser beam that can target any bit. This may sound scary at first, but in FHE these attacks correspond to dishonest executions of the homomorphic operations, and they can be avoided by adding replication or redundancy to the computation.
FHE is often presented as a public-key cryptosystem. There is:
a public encryption key, which anyone can use to encrypt data;
a public evaluation key, which anyone can use to perform homomorphic operations on ciphertexts;
a secret decryption key, which its owner uses to decrypt the final result.
The decryption key owner holds the most sensitive secret of the cryptosystem. This person is responsible for making sure that the chain of homomorphic operations that were executed is legitimate and that the final ciphertext is safe to decrypt, and only then decrypting the plaintext result. If malicious operations are introduced in the chain, the decryption key could potentially be leaked at decryption time. Luckily, homomorphic operations can be done in public and verified.
In this section, we describe a few scenarios in which FHE can be used, as well as some pros and cons of each setup.
In this figure, the orange key symbolizes the decryption key (and its owner). FHE ciphertexts are represented by locks of the same color as the decryption key. Parties that contribute with private data are represented by a cylinder: here, only Alice contributes with private data. On Bob’s side, the evaluation function and the evaluation key are public, and the computation, depicted by a green box, can be made deterministically. Everyone can retrace the computation and detect if the claimed output ciphertext is incorrect.
This is historically the first use case that was published for FHE. It aims at turning cloud computing into a private computing analogous to a secure enclave, but based on cryptographic security instead of hardware security. In such a setting, Alice owns some private data, but has limited computing capabilities. Bob mimics a cloud instance with much larger computing power. Bob does not contribute with any additional private data. Alice can outsource a computation by encrypting the inputs, Bob then evaluates the desired (public) function homomorphically and sends the encrypted result back to Alice.
With current hardware capabilities, the outsourcing mode is still too slow to be used in full generality in practice – we can typically expect a running-time overhead factor of one million and a memory overhead factor of one thousand on non-linear use cases. However, dedicated FHE hardware is currently being developed to close the gap, for instance in the DARPA DPRIVE program or the CryptoLight project.
Currently, the outsourcing mode is used in practice for PIR (Private Information Retrieval) use cases, where a server (Bob) holds a large public database, a client (Alice) emits a query, and the queried index must remain private. Such PIR schemes benefit greatly from the linearity and compactness of homomorphic operations, while the small multiplicative depth of the circuits keeps the computation costs within reasonable bounds.
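As a sketch of how such a PIR scheme can work, the snippet below uses a toy LWE-style additively homomorphic scheme (all parameters are chosen for readability, not security, and the scheme itself is a simplified illustration). The client encrypts a one-hot selection vector, and the server homomorphically computes the dot product with its database, never learning which index was selected:

```python
import random

# Toy LWE-style additively homomorphic scheme -- illustrative, NOT secure.
Q = 2 ** 20          # ciphertext modulus
N = 32               # secret dimension (far too small for real security)
DELTA = Q // 256     # scaling factor: plaintexts live in [0, 256)

s = [random.randrange(Q) for _ in range(N)]   # client's secret key

def encrypt(m):
    a = [random.randrange(Q) for _ in range(N)]
    e = random.randrange(-4, 5)               # small noise
    b = (sum(ai * si for ai, si in zip(a, s)) + DELTA * m + e) % Q
    return (a, b)

def decrypt(ct):
    a, b = ct
    noisy = (b - sum(ai * si for ai, si in zip(a, s))) % Q
    return round(noisy / DELTA) % 256

def add(c1, c2):                              # homomorphic addition
    (a1, b1), (a2, b2) = c1, c2
    return ([(x + y) % Q for x, y in zip(a1, a2)], (b1 + b2) % Q)

def scale(ct, k):                             # multiply by a plaintext constant
    a, b = ct
    return ([(x * k) % Q for x in a], (b * k) % Q)

# PIR: the client wants db[idx] without revealing idx to the server.
db = [42, 7, 99, 13]                          # server's public database
idx = 2
query = [encrypt(1 if j == idx else 0) for j in range(len(db))]  # client side
# Server side: homomorphic dot product  sum_j db[j] * Enc(query[j])
resp = encrypt(0)
for j, entry in enumerate(db):
    resp = add(resp, scale(query[j], entry))
print(decrypt(resp))   # 99
```

The server sees only encryptions of 0s and 1s, which are indistinguishable from one another, so the selected index stays hidden.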
This table summarizes the pros and cons of the outsourcing mode.
This figure uses the same color coding as previously. This time, Bob contributes some private data to the computation. The computation on Bob’s side is no longer publicly verifiable, symbolized by a red box, so this mode should be restricted to honest-but-curious use cases.
In this new setup, the only difference is that Bob contributes some private data to the computation. In this case, FHE is a good two-party computation solution, with minimal communication, and provides strong guarantees on the querier side: Bob learns nothing about Alice’s data, and Alice learns nothing but the result of the computation.
A potential application for this scenario would be the millionaire’s problem, where Alice and Bob are two millionaires who want to know who is richer without revealing their wealth to the other party. Solutions for this problem are used in e-commerce applications.
This is a refinement of the outsourcing mode, in which data from many participants can be aggregated in a compact (in the sense that the output does not grow with the number of participants) and publicly verifiable manner. Typical usages include federated learning and e-voting.
This setup is a refinement of the two-party-compute mode, in which the compute party now provides a secure compute service to multiple clients, each of whom has its own secret key. FHE can for instance be used as a private model prediction service (e.g. an ML service with a private model and private inputs): the server holds the model (private, but in plaintext on the server side) and each client has its own data on which it would like to run a prediction. As a result, each client retrieves its own encrypted prediction, decryptable with its own secret key.
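A minimal sketch of this setup, again with a toy LWE-style additive scheme (illustrative parameters, not secure): each client encrypts its features under its own key, and the server evaluates the same plaintext linear model homomorphically for each of them. A result produced under Alice's key is meaningless to Bob, and vice versa.

```python
import random

Q, N, DELTA = 2 ** 20, 32, 2 ** 20 // 256   # toy LWE parameters, NOT secure

class Client:
    """Each client holds its own secret key; the server never sees it."""
    def __init__(self):
        self.s = [random.randrange(Q) for _ in range(N)]

    def encrypt(self, m):
        a = [random.randrange(Q) for _ in range(N)]
        e = random.randrange(-2, 3)           # small noise
        b = (sum(x * y for x, y in zip(a, self.s)) + DELTA * m + e) % Q
        return (a, b)

    def decrypt(self, ct):
        a, b = ct
        noisy = (b - sum(x * y for x, y in zip(a, self.s))) % Q
        return round(noisy / DELTA) % 256

def he_add(c1, c2):                           # homomorphic addition
    (a1, b1), (a2, b2) = c1, c2
    return ([(x + y) % Q for x, y in zip(a1, a2)], (b1 + b2) % Q)

def he_scale(ct, k):                          # multiply by plaintext weight k
    a, b = ct
    return ([(x * k) % Q for x in a], (b * k) % Q)

# Server: a linear model, private to the server but held in plaintext there.
weights = [2, 0, 3]

def predict(enc_features):                    # homomorphic dot product
    acc = he_scale(enc_features[0], weights[0])
    for w, ct in zip(weights[1:], enc_features[1:]):
        acc = he_add(acc, he_scale(ct, w))
    return acc

alice, bob = Client(), Client()
# Each client encrypts under its OWN key; each result decrypts only with it.
print(alice.decrypt(predict([alice.encrypt(v) for v in [1, 5, 2]])))  # 8
print(bob.decrypt(predict([bob.encrypt(v) for v in [4, 1, 1]])))      # 11
```

The server runs identical code for every client; only the key under which the inputs were encrypted differs.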
It is always easier to use FHE in collaborative scenarios where each participant has an incentive to follow the protocol honestly. FHE can for instance be used to compute statistics between two legal entities of the same group in two different countries: regulations such as GDPR allow some statistics to be published, but in general prevent gathering all individual data in one place. In this case, using FHE is possible and all participants have an incentive to follow the protocol honestly. In a non-collaborative scenario, the easiest way of making sure that the relevant function has been computed is to introduce redundancy. For instance, in the outsourcing and aggregation scenarios, homomorphic computations are entirely public and can be made deterministic. As long as two or more independent entities end up with the exact same output ciphertext, the computation is correct and the result is safe to decrypt. The more redundancy, the higher the confidence we can have in the result, which of course is a trade-off with efficiency.
Also, when the compute party vouches for an FHE result by digitally signing the input and output ciphertexts, everyone can retrace the same FHE computation and check whether the claim is valid. Any attempt at cheating by the FHE computing party can be publicly caught and associated with a publicly verifiable certificate that exposes the cheat and the cheater – we call such a model a strong covert security model.
Fully homomorphic signatures are another way of verifying the correctness of a computation, without the need for an independent verifier, but they generally require much more resources.
The easiest way of doing this is to ensure that the decryption key owner does not have access to any intermediate ciphertext.
In the two-party scenario, as in the client-server one, Alice encrypts the input, Bob performs the computation over ciphertexts, and Bob transmits the encrypted output back to Alice. Alice is thus only able to decrypt the final result: she never has access to any intermediate variable.
In a cloud aggregation scenario, like e-voting where many participants send encrypted ballots to a common cloud, another technique is used: the decryption key is in general not given to a single recipient but instead secret-shared between the members of a decryption authority. In this case, decryption of one particular ciphertext can only be triggered by executing a multiparty computation that involves online communication between the members of the authority. If one member refuses to decrypt a ciphertext, decryption is impossible. This ensures that only the ciphertexts agreed upon by all the authority members can be decrypted.
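The following sketch illustrates the idea of threshold decryption with a toy LWE-style scheme: the secret key is additively shared among three authority members, and the plaintext can only be recovered once all partial decryptions are combined. (Real protocols additionally add "noise flooding" to each partial decryption so that shares leak nothing about the key; this simplified, insecure version omits that step.)

```python
import random

Q, N, DELTA = 2 ** 20, 32, 2 ** 20 // 256   # toy LWE parameters, NOT secure

# Full secret key, additively secret-shared among 3 authority members.
s = [random.randrange(Q) for _ in range(N)]
s1 = [random.randrange(Q) for _ in range(N)]
s2 = [random.randrange(Q) for _ in range(N)]
s3 = [(si - a - b) % Q for si, a, b in zip(s, s1, s2)]   # s = s1+s2+s3 mod Q

def encrypt(m):
    a = [random.randrange(Q) for _ in range(N)]
    e = random.randrange(-2, 3)               # small noise
    b = (sum(x * y for x, y in zip(a, s)) + DELTA * m + e) % Q
    return (a, b)

def partial_decrypt(ct, share):
    a, _ = ct
    # Each member contributes its inner product with its own key share.
    return sum(x * y for x, y in zip(a, share)) % Q

def combine(ct, partials):
    _, b = ct
    noisy = (b - sum(partials)) % Q
    return round(noisy / DELTA) % 256

ct = encrypt(57)   # e.g. the final encrypted tally, agreed upon by everyone
parts = [partial_decrypt(ct, sh) for sh in (s1, s2, s3)]
print(combine(ct, parts))   # 57 -- recoverable only with ALL three shares
```

If any single member withholds its partial decryption, the remaining shares reveal nothing about the plaintext.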
There are three kinds of homomorphic encryption: partially homomorphic encryption (PHE), leveled homomorphic encryption (LHE), and fully homomorphic encryption (FHE). Partially homomorphic encryption only allows us to compute a restricted set of functions (e.g. only sums, only linear functions, only bilinear functions), whereas leveled and fully homomorphic encryption can evaluate arbitrary circuits or, equivalently, functions whose control flow is data independent. For LHE, the cryptographic parameters depend on the function and grow with the complexity of the circuit, which in turn results in larger ciphertexts and keys. FHE schemes allow us to evaluate, for a given set of parameters – and thus for a given key and ciphertext size – any function that can be represented as a circuit of arithmetic or binary gates. That is, in contrast to LHE, even as the circuit to evaluate grows, the scheme parameters (and keys and ciphertexts) do not get larger.
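To see what "data-independent control flow" means in practice, here is a tiny illustrative example: a plaintext branch is rewritten as an arithmetic selection, the form that a circuit (and hence a homomorphic evaluation) can compute without ever looking at the data.

```python
# A circuit cannot branch on encrypted data: both paths must be evaluated,
# and the condition selects between them arithmetically.

def select_plain(cond, a, b):
    # Ordinary code: the control flow depends on the (secret) condition.
    return a if cond else b

def select_circuit(cond, a, b):
    # Circuit form: cond is a 0/1 value, and the expression is evaluated
    # identically whatever the inputs are -- no data-dependent branch.
    return cond * a + (1 - cond) * b

for cond in (0, 1):
    assert select_plain(cond, 7, 9) == select_circuit(cond, 7, 9)
print("branch-free selection matches the plaintext branch")
```

Under FHE, the multiplications and additions in `select_circuit` become homomorphic gates, while `select_plain` has no homomorphic counterpart at all.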
In other words, when asked whether a given plaintext circuit can be run homomorphically, and at what cost (in time and memory overhead), PHE may answer no. LHE answers yes, but may set an arbitrarily high cost for complex circuits. FHE also answers yes and, in addition, fixes the keys, the encryption and decryption algorithms, and the way to homomorphically evaluate a universal set of gates before the plaintext circuit is even specified. FHE is therefore the only mode that guarantees that the memory and running time of the homomorphic evaluation remain proportional to those of the original plaintext circuit. To achieve this, all the FHE schemes known today deal with noisy ciphertexts that get noisier as computations proceed. To prevent the noise from overflowing into the computed data and causing decryption errors, these schemes regularly perform a quite costly operation called bootstrapping, which reduces the noise back to a manageable level. More on the specifics of each kind of scheme, on bootstrapping, and on how to minimize noise and other costs with FHE compilers, in the second blog post of this series!