Major challenges in computer and information security are the advent of quantum computers and the need to trust your data to (cloud) service providers. New cryptosystems are supposed to help, but they look daunting. At their core, however, they are just children’s riddles. An introduction to Lattice cryptography and Learning With Errors.
Two big challenges are faced by information security:
- Cryptographers are afraid that quantum computers may be able to break existing public-key cryptography easily, a few years from now. As public-key cryptography is at the heart of almost every information security mechanism, something needs to be done. And for secrets which should remain secret for many years or decades, we need solutions now. These solutions are called Quantum-safe or Post-Quantum cryptosystems.
- Cloud computing, smart cards and the Internet of Things have a common problem: You may need to process data on platforms which are in the hands of someone else. Trust in those devices and their owners or operators is necessary, but there is no way to test whether they are violating your trust. It would be great if sensitive data would not need to be processed on these systems, if instead, they could operate on encrypted versions of the data. This is the promise behind (Fully) Homomorphic Encryption: Performing mathematical operations on the encrypted data, with the results remaining encrypted; essentially, the holy grail for trustworthy computing.
When trying to learn about Quantum-resistant cryptography or Homomorphic Encryption, the terms Lattice-based Cryptography and Learning With Errors quickly pop up. However, the descriptions I was able to find on the Internet seem to discourage any normal reader from understanding them: As presented by those in the know, the math looks scary and complex; achieving magic looks trivial by comparison.
However, at their core, they are just coordinate systems and children’s riddles.
So let me give you an intuition for how they work, explained even for those who hate math, and conclude with what it all means.
Lattices are fancy coordinate systems
You know coordinate systems from blueprints, maps, you name it. They are easy:
- The coordinate axes are orthogonal to each other (and parallel to the edges of the sheet or screen)
- The measurement system (or spacing) along the coordinate axes is the same (e.g., x coordinate 1 is at the same distance from the origin as y coordinate 1, just in a different direction)
- The coordinate systems are linear (i.e., the distance from 0 to 1 is the same as from 3 to 4)
This all seems natural, because we are used to it and they are made to be easy on us. The lattices used for encryption will just add a few twists to them, which we will introduce.
Twisting the vectors
That was easy. You may even have wondered why we talk about that at all. So let’s look at slightly different vectors A and B:
The new coordinate system is a slight nuisance, but still easy to handle. (In fact, using a simple formula, any (x, y) coordinate pair can be easily transformed into the number of A’s and B’s to add.)
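That transformation is just a 2×2 change of basis. Here is a minimal sketch in Python; the basis vectors A and B are made up for illustration (the article’s figures are not reproduced here) and are chosen with determinant 1, so every integer point is reachable:

```python
# Converting an (x, y) coordinate pair into the number of A and B steps.
# A and B are hypothetical basis vectors; any pair with determinant +-1
# reaches every integer point.
A = (2, 1)
B = (1, 1)

def steps(x, y):
    """Return (nA, nB) such that nA*A + nB*B == (x, y)."""
    det = A[0] * B[1] - A[1] * B[0]    # here: 2*1 - 1*1 = 1
    nA = (x * B[1] - y * B[0]) // det  # Cramer's rule, exact for det = +-1
    nB = (A[0] * y - A[1] * x) // det
    return nA, nB

print(steps(5, 3))  # -> (2, 1): indeed 2*(2,1) + 1*(1,1) == (5, 3)
```

The same formula works for any basis pair; only when the vectors become long and nearly parallel (or the arithmetic becomes modular) does recovering the step counts stop being this easy.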
In the same spirit, let’s continue twisting the vectors further:
With these two almost-aligned basis vectors, it becomes hard to determine the number of A and B steps to take. Some might be reminded of how hard it is to reach a given square with a knight on a chess board.
Not all points are valid
In the previous section, all integer coordinates were reachable, because the vector pairs were carefully selected to enable this. Choosing vectors randomly would rarely result in this property. For example, after a slight variation of the second set of vectors above, only every second point is reachable, checkerboard-style.
By choosing longer vector lengths, you will generally create even sparser lattices. (This is a general rule. Let’s gloss over how to guarantee a certain sparseness; but it involves things like the relationships between the prime factors of the vectors’ coordinates.)
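The checkerboard effect can be seen in a few lines of code. The basis vectors here are made up for illustration (they differ from the ones in the figures above); their determinant is ±2, so only half of all integer points are reachable:

```python
# With basis A=(1,1), B=(1,-1), any reachable point has x+y = 2*nA, which
# is always even -- the checkerboard pattern described above.
A, B = (1, 1), (1, -1)

reachable = set()
for nA in range(-10, 11):
    for nB in range(-10, 11):
        reachable.add((nA * A[0] + nB * B[0], nA * A[1] + nB * B[1]))

print((2, 0) in reachable)  # True:  2+0 is even
print((1, 0) in reachable)  # False: 1+0 is odd
```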
So what happens when we make these vectors more complicated, both in their sheer number (and thus, also their number of dimensions) and their length? It becomes even harder.
Humans can easily deal with two, sometimes three dimensions. Switching to dozens or even hundreds of dimensions and having longer vectors makes it very hard to determine:
- which coordinates are reachable at all,
- how a reachable coordinate can be reached, and, especially,
- if a coordinate is unreachable, what the closest reachable point is.
The latter property is known as the closest vector problem and is at the heart of Lattice-based cryptography.
The closest vector problem is hard to solve for a complex set of basis vectors. However, if you have a simple set of basis vectors, as we started off in the first few examples, it becomes easy to solve. Having such a pair of complex and simple problems is at the heart of public-key cryptography.
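To illustrate why a simple basis makes the problem easy: with short, near-orthogonal vectors, you can express the target point in basis coordinates and simply round to the nearest whole numbers of steps (essentially Babai’s rounding technique). The basis values below are made up for illustration:

```python
# Babai-style rounding: express the target in basis coordinates, round,
# and map back. Works well only when the basis is short and near-orthogonal.
GOOD = ((3, 0), (0, 5))  # hypothetical "simple" basis

def closest(target, basis):
    (a, b), (c, d) = basis
    det = a * d - b * c
    # exact basis coordinates of the target (Cramer's rule), then round
    nA = round((target[0] * d - target[1] * c) / det)
    nB = round((a * target[1] - b * target[0]) / det)
    return (nA * a + nB * c, nA * b + nB * d)

print(closest((7, 4), GOOD))  # -> (6, 5), the nearest point on the 3x5 grid
```

With a long, nearly parallel basis describing the same lattice, this rounding lands far from the true closest point; that gap between the two bases is exactly the trapdoor.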
So in essence, Lattice-based public-key cryptography works as follows:
- Alice creates a set of simple basis vectors in a high-dimensional space, which does not cover all of the space (think of a checkerboard with most squares white; black ones few and far between, with no visible pattern to their placement). This is her private key, in which the closest vector problem is easy to solve.
- Alice also creates a second set of basis vectors, each vector built by weighted addition of many of her simple vectors. This is her public key, where it is still easy to create a valid point, but it is hard to determine whether a given point is valid or not (and where it is also hard to solve the closest vector problem).
- Alice publishes her public key and keeps her private key private (just as with any other public-key cryptosystem).
- Bob, wishing to encrypt a message to Alice, uses her public vectors to encode the message. For example, he could multiply the first public basis vector by the first byte of the message, the second vector by the second byte, and so on. Summing up all these vectors yields a valid lattice point.
- Bob then slightly and randomly moves that encrypted data point, so that it now lies off-lattice.
- He then sends that message to Alice.
- Any eavesdropper will only know the public, complicated, basis vectors and will be unable to solve the closest vector problem.
- However, Alice, with access to the private key, can solve the closest vector problem easily and therefore reverse-engineer (1) the actual data point and (2) what factors Bob used to multiply the basis vectors with, and thus the bytes of the message.
(In case you wonder: As is common in cryptography, these operations are all done in modular arithmetic.)
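The steps above can be sketched end-to-end in a toy 2D version. All numbers here are invented for illustration, and the modular arithmetic is omitted to keep the sketch readable:

```python
# Toy lattice encryption: the lattice is 4Z x 4Z. Alice's private basis is
# short and orthogonal; the public basis spans the same lattice but is skewed.
PRIV = ((4, 0), (0, 4))
PUB  = ((4, 4), (4, 8))

def encrypt(m1, m2, e=(1, -1)):
    """Bob: combine the public vectors, then nudge off-lattice by a small error."""
    x = m1 * PUB[0][0] + m2 * PUB[1][0] + e[0]
    y = m1 * PUB[0][1] + m2 * PUB[1][1] + e[1]
    return x, y

def decrypt(x, y):
    """Alice: snap to the nearest lattice point (easy with the private basis),
    then invert the public basis to recover Bob's multipliers."""
    lx, ly = 4 * round(x / 4), 4 * round(y / 4)
    det = PUB[0][0] * PUB[1][1] - PUB[0][1] * PUB[1][0]  # = 16
    m1 = (lx * PUB[1][1] - ly * PUB[1][0]) // det
    m2 = (PUB[0][0] * ly - PUB[0][1] * lx) // det
    return m1, m2

print(decrypt(*encrypt(3, 2)))  # -> (3, 2): the error is rounded away
```

Note how decryption only works because the error is small compared to the lattice spacing; a real scheme chooses its parameters to guarantee exactly that.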
Learning with Errors
As a riddle (without errors)
A mother has two children. The girl is twice the age of the boy. Together, they are 18 years old. How old are the children?
A simple children’s riddle, which can be solved using linear equations.
The riddle can be solved by trial and error, or using a set of linear equations, where B is the age of the boy and G is the age of the girl:
2*B - G = 0
B + G = 18
After solving this linear system for B and G, we get B=6 and G=12. Here, we learned the ages of the children without errors.
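In code, the substitution takes one line per variable:

```python
# From 2*B - G = 0 we get G = 2*B; substituting into B + G = 18 gives 3*B = 18.
B = 18 // 3  # boy's age
G = 2 * B    # girl's age
assert 2 * B - G == 0 and B + G == 18
print(B, G)  # -> 6 12
```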
Properties of linear equation systems
That was easy. In general, linear equations are easy: With enough equations, they can be solved in a well-defined, straightforward manner, even if you increase the number of variables into the millions or billions.
“Enough equations” generally means that the number of equations needs to be at least equal to the number of variables. (Equations that can be formed by weighted addition of other equations, i.e., those that are linearly dependent on others, don’t count, as they do not add new information.)
More information may not necessarily be better: If a third equation, “3*B + G = 100”, were added, it would conflict with the other two, as 3*6 + 12 = 30, which obviously does not equal 100. However, each of the three possible pairs of equations would still have a valid solution; in this case, the three solutions would be vastly different from each other.
For cryptography (with errors)
With these conflicts, we are on the right track toward a cryptographic puzzle. The basic idea behind learning with errors is as follows:
- Have a set of linear equations
- Increase the number of variables to be huge (into the many thousands or even millions)
- Add more valid equations, significantly beyond the number of variables. (They will automatically be linear combinations of the initial set, not adding additional information.)
- And now the magic part: Slightly jiggle the constants
A riddle with turtles
A mother turtle has two children. The girl is twice the age of the boy. Together, they are 180 years old. How old are the children?
A slightly more complicated children’s riddle, which can still be solved using linear equations (or guessing).
The ages are ten times those of the human children, but otherwise things stay the same. We also add a third equation which agrees with the other two. (It is automatically a linear combination of the first two; in this case, twice the first equation plus seven times the second equation equals three times the third equation.)
However, now we slightly change the right-hand sides of the equations, adding or subtracting a small number:
All of the equations now still almost match the girl’s and boy’s ages of 120 and 60 years, respectively, but no pair of equations yields the exact numbers anymore, and neither does the third equation on its own. Picking any pair of equations, however, gives a good estimate of the real result. With these approximations, you can perform further calculations and, as long as we are careful, the results will be close to the correct results as well. But we will never know the exact numbers. (And rounding to the nearest multiple of ten only works in this toy example.)
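As a sketch of this effect: since the exact perturbed constants of the figure are not reproduced here, small example errors of +2 and -2 are made up below. Solving the jiggled pair lands close to, but not exactly at, the true ages:

```python
# Turtle equations with invented small errors on the right-hand sides:
#   2*B - G = 0 + 2
#   B + G   = 180 - 2
# The unperturbed system has the exact solution B = 60, G = 120.
rhs1 = 0 + 2
rhs2 = 180 - 2
B = (rhs1 + rhs2) / 3  # adding both equations gives 3*B = rhs1 + rhs2
G = rhs2 - B
print(B, G)            # -> 60.0 118.0: close to 60 and 120, but not exact
```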
Learning-with-errors based public-key cryptography
To build a public-key cryptosystem, the correct values of the variables can be used as the private key and the jiggled equations as the public key. Same as in the Lattice above, the cryptographic operations are done in modular arithmetic.
- Alice chooses a modulus and creates private and public keys; she publishes the public key and the modulus (let’s assume it to be M=89).
- For each encrypted bit that Bob wants to send to Alice, Bob creates a new, unique equation as a random, non-trivial linear combination of a subset of the equations forming the public key. (For example, the lowest equation in the image above.)
- To communicate a bit with a value of zero, Bob transmits this base equation as-is (of course, modulo M, so the 131 would turn into 131 mod 89 = 42).
- To communicate a bit with a value of one, Bob transmits the base equation modified by adding roughly half the modulus, again modulo M (half of M would be roughly 44, so 42+44=86 would be transmitted as the right-hand side of the equation).
- When receiving an equation, Alice can easily compute the correct value for the right-hand side of the equation from the left-hand-side coefficients (5 and -8, in the case of the linear combination created above). The correct result here would be 120 (=5*120-8*60); modulo M that would be 31.
- When the difference between the transmitted and the correct value is small (in this case, 42-31=11), the equation is decrypted into a bit valued 0; if the difference is high (close to ½M), the equation is decrypted into a bit valued 1.
- Given the right parameters, this decryption is easy for Alice and impossibly hard for anyone else; exactly how we want public-key cryptography to behave.
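The walkthrough above can be condensed into a few lines, using the article’s concrete numbers: modulus M=89, Alice’s secret solution 120 and 60, and Bob’s combined equation 5*G - 8*B = 131 (whose small error is already baked into the 131):

```python
M = 89
secret = (120, 60)    # Alice's private key: the true variable values
coeffs = (5, -8)      # Bob's public linear combination (left-hand side)
rhs = 131             # its noisy right-hand side

def encrypt(bit):
    """Bob: send the RHS as-is for a 0-bit, shifted by ~M/2 for a 1-bit."""
    return (rhs + bit * (M // 2)) % M

def decrypt(c):
    """Alice: compare against the exact RHS her secret predicts."""
    exact = (coeffs[0] * secret[0] + coeffs[1] * secret[1]) % M  # = 31
    diff = (c - exact) % M
    # small circular difference (just the noise) -> 0; near M/2 -> 1
    return 0 if min(diff, M - diff) < M // 4 else 1

print(encrypt(0), encrypt(1))                    # -> 42 86
print(decrypt(encrypt(0)), decrypt(encrypt(1)))  # -> 0 1
```

The threshold of M//4 splits the circle of residues halfway between “pure noise” and “noise plus half the modulus”; real LWE parameter sets are chosen so the noise can never cross that boundary.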
Key and message sizes
The first thing to note is that, for both Lattices and Learning With Errors, both the keys and the messages are significantly larger than what we are used to from “traditional” public-key cryptography: Every transmitted bit(!) may be hundreds of bytes or even kilobytes in size.
Many current (“classical”) public-key algorithms rely on the hardness of factoring large composite numbers (RSA) or of computing discrete logarithms (Diffie-Hellman, ECC). A powerful quantum computer running Shor’s Algorithm would be able to solve both problems quickly and thus break this public-key cryptography. As data encrypted today might still need to remain safe when powerful quantum computers become available (at least several years from now), such data needs to stop being protected with these “classical” public-key algorithms.
The goal of homomorphic encryption is to separate the provisioning of computing power from the right to access the data in the clear: Calculations on (encrypted) data can be performed on a computer without the computer needing to decrypt and re-encrypt the data. In fact, without the computer even having the key (and thus the ability) to decrypt the data.
One of the earliest cryptosystems which allowed some simple form of homomorphic encryption was RSA. In “textbook RSA”, (modular) multiplication can be performed on the encrypted data. For most applications, this “malleability” is undesirable and needs to be actively prevented, for example by making sure that the decryption of a homomorphically-multiplied message has an illegal format.
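This malleability can be demonstrated with a toy textbook-RSA key (real keys are vastly larger, and real systems add padding precisely to prevent this):

```python
# Textbook-RSA is multiplicatively homomorphic: the product of two
# ciphertexts decrypts to the product of the plaintexts.
n, e, d = 3233, 17, 2753  # toy key: n = 61*53, e*d = 1 mod (60*52)

def enc(m):
    return pow(m, e, n)

def dec(c):
    return pow(c, d, n)

c1, c2 = enc(7), enc(6)
print(dec(c1 * c2 % n))  # -> 42 == 7 * 6, computed on encrypted data
```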
Lattice-based cryptosystems are much better suited for homomorphic encryption, as they can support multiple operations, not just multiplication. But the property of malleability remains, so in the encrypted state, almost any operation could be performed on the encrypted data. You still have to trust that the right operations were performed when using the decrypted result.
Plain Learning With Errors, as described above, is rarely used directly for homomorphic encryption; practical FHE schemes instead build on structured variants of the problem, such as Ring-LWE.
You learnt the basics of two cryptographic primitives that will be important in the future, and two of their applications. So you are well prepared for the future. I hope you enjoyed the minimal amount of math! (If you insist on these two topics being explained with just a little bit more math, read on and watch the linked videos.)
The descriptions are inspired by Kelsey Houston-Edwards’ videos:
- Lattice-based cryptography: The tricky math of dots
- Learning with errors: Encrypting with unsolvable equations
They are great and provide some additional information.
I would like to thank DALL•E 2 for the teaser image it created using the prompt “Post-Quantum Encryption without text”. Yes, it seems that “without text” was ignored, as it does not seem to understand the concept of “text” at all, and fails to recognize that it does not recognize (part of) the prompt, as image generation AI seems to be prone to do. Also, the concept of negation (“without”) seems to be hard for AI; but then again, it is hard for humans: You might remember experimenting with “do not think of a pink elephant” in your childhood. And failing to not think of a pink elephant…