There’s a very elegant cryptographic construction discovered by Barreto and Naehrig in a 2005 paper. It is beautiful from a pure mathematical perspective, but also has an impressive application: it was* part of the ingenious mechanism by which Zcash supports publicly-verifiable private transactions.
‘Publicly-verifiable private transactions’ sound paradoxical: they’re transactions where the inputs and outputs are cryptographically obfuscated (so no-one can see how much money is changing hands), but where it’s still possible for a member of the public to verify that all inputs and outputs are nonnegative and that the sum of the inputs equals the sum of the outputs (so there’s no ‘cheating’ happening).
If you’re not amazed by this, then I haven’t explained it properly: all account balances are encrypted, and the amounts of money changing hands in a transaction remain completely encrypted, but the transactions include a certificate which proves that the inputs and outputs (which are encrypted) satisfy nonnegativity and conservation of money. No information about the inputs and outputs is revealed, other than the fact that the transaction doesn’t ‘cheat’. This is an example of a ‘zero-knowledge proof’, and I still find it completely and utterly surprising that these things are even possible.
The rest of this article will attempt to explain one of the key mathematical ingredients (a construction of cryptographic pairings) and then briefly outline how it fits into these zero-knowledge proofs used in Zcash.
*Zcash changed their curve in 2017 from a Barreto-Naehrig curve to a BLS curve. BLS curves end up being marginally more efficient (higher level of security for a given size of elliptic curve), which is why the programmers made the change. The principle is still the same, namely constructing an elliptic curve which supports ‘pairings’, but there are various quantitative differences between Barreto-Naehrig and BLS curves. The reason for concentrating on Barreto-Naehrig curves is that they’re somewhat simpler and more aesthetically pleasing.
What are elliptic curves?
[If you know about the elliptic curve group law and the Cayley-Bacharach theorem from algebraic geometry, feel free to skip this section.]
The points on an elliptic curve (cubic curve with no singularities) can be made into an Abelian group. In particular, we define an operation by the following:
- take an elliptic curve Γ;
- choose one of its points of inflection and take that to be the ‘identity point’, 0;
- assert that P + Q + R = 0 whenever there exists a line which intersects the curve at P, Q, and R. (Note that if there’s a repeated point, e.g. P = Q, then the line must have a ‘double intersection’ (point of tangency) at that point. Similarly, P + P + P = 0 if and only if P is a point of inflection; that’s why the identity point 0 must be one of the points of inflection.)
Traditionally, people often study elliptic curves in Weierstrass normal form, where the equation of the curve is $y^2 = x^3 + ax + b$. The identity point 0 is then typically chosen to be the point ‘at infinity’ where the curve intersects the line at infinity on the projective plane.
To show that this works as a definition, we firstly need to see that it’s well defined. In particular, given two points P and Q, how do we know that the third intersection R with the line ℓ through P and Q is uniquely determined?
The line ℓ is given by a linear equation, so we can use it to express y in terms of x or vice-versa. Substituting into the equation of the elliptic curve Γ gives a cubic equation in one variable. We know that this cubic equation has two roots (those corresponding to P and Q), so we can divide the cubic by those two linear factors to leave a single linear factor, whose root is the third root (corresponding to R).
Note that we didn’t assume that the ambient field was algebraically closed. This is important, because cryptographers use elliptic curves over finite fields, and a finite field cannot be algebraically closed.
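To make the division step concrete, here is the computation written out for a curve in Weierstrass form, assuming P and Q have distinct x-coordinates. The symbols λ and c (the slope and intercept of ℓ) are names introduced here purely for illustration:

```latex
\begin{aligned}
\ell &: \; y = \lambda x + c, \qquad \lambda = \frac{y_Q - y_P}{x_Q - x_P}, \\
(\lambda x + c)^2 = x^3 + a x + b
  &\;\Longrightarrow\; x^3 - \lambda^2 x^2 + (a - 2\lambda c)\,x + (b - c^2) = 0, \\
x_P + x_Q + x_R = \lambda^2
  &\;\Longrightarrow\; x_R = \lambda^2 - x_P - x_Q, \qquad y_R = \lambda x_R + c.
\end{aligned}
```

Only additions, multiplications and one division in the ambient field are used, so R has coordinates in the same field as P and Q.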
This gives the following procedure for ‘adding’ two points:
- draw a line through P and Q and let it intersect the curve again at R;
- draw a line through R and 0 and let it intersect the curve again at S.
Then S is the sum of P and Q. This operation is commutative (interchanging P and Q does not affect the result), but how do we know that it’s associative? In particular, given the following diagram from a talk I gave in Winchester a couple of years ago, how do we know that the elliptic curve (blue), orange line, and green line all mutually intersect at the bottom-left?
It’s painful to verify this algebraically, but it follows immediately from the Cayley-Bacharach theorem. We’ve previously discussed this theorem, along with several miscellaneous applications to Euclidean geometry.
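If you would rather trust a computer than the Cayley-Bacharach theorem, the chord-and-tangent procedure above can be sketched in a few lines of Python and checked exhaustively on a tiny example. The curve y^2 = x^3 + 2x + 2 over the integers modulo 17 is an arbitrary toy choice of mine, and the function names are hypothetical; the identity point 0 is represented by `None`.

```python
# Toy chord-and-tangent addition on y^2 = x^3 + A*x + B over the integers mod P_MOD.
# The curve parameters are an illustrative assumption, far too small for cryptography.
P_MOD, A, B = 17, 2, 2


def on_curve(pt):
    """True if pt is the identity (None) or satisfies the curve equation."""
    if pt is None:
        return True
    x, y = pt
    return (y * y - (x * x * x + A * x + B)) % P_MOD == 0


def add(p1, p2):
    """Return p1 + p2 using the line through p1 and p2 (tangent if they coincide)."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None  # vertical line: the third intersection is the point at infinity
    if p1 == p2:
        slope = (3 * x1 * x1 + A) * pow(2 * y1, -1, P_MOD) % P_MOD  # tangent
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD  # chord
    x3 = (slope * slope - x1 - x2) % P_MOD  # x-coordinate of the third intersection R
    y3 = (slope * (x1 - x3) - y1) % P_MOD   # reflect R through the x-axis to get S
    return (x3, y3)


# Enumerate every point on the toy curve and test associativity on all triples.
points = [None] + [(x, y) for x in range(P_MOD) for y in range(P_MOD) if on_curve((x, y))]
associative = all(
    add(add(p, q), r) == add(p, add(q, r))
    for p in points for q in points for r in points
)
```

This is evidence rather than proof, of course: it only confirms associativity for one small curve.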
Elliptic curve cryptography and the discrete logarithm problem
There are two reasons why elliptic curve cryptography requires the use of a finite field instead of the real or complex numbers. One reason is practicality: there are uncountably many reals, most of which cannot be described, and therefore they cannot feasibly be represented on a computer.
The other reason is security: the Weierstrass elliptic function allows you to construct an isomorphism between the elliptic curve and a torus (specifically, the complex plane quotiented out by an appropriate lattice), and the ‘discrete logarithm problem’** on a torus is trivial; you can solve it efficiently using continued fractions.
**given two points, A and B, determine an integer m such that mA = B, where mA := A + A + … + A just means ‘the point A added to itself m times’.
On the other hand, for elliptic curves over a finite field, there is no known way to efficiently solve the elliptic curve discrete logarithm problem on a classical computer; this is how HTTPS and Bitcoin digital signatures remain secure. On a quantum computer you can use Shor’s algorithm, but you’d need a fault-tolerant 2330-qubit quantum computer with 128 billion Toffoli gates to break the 256-bit curve ‘secp256k1’ used for Bitcoin digital signatures, and that seems to be beyond the current technological capabilities of large commercial*** organisations.
***I doubt governments have this technology either. After all, they use Excel spreadsheets as a database for tracking the spread of a pandemic.
Anyway, note the following asymmetry:
- It is very easy, given a large number m and a point G, to compute the point mG: using a ‘double-and-add’ procedure, you can do it in log2(m) ‘doubles’ and H(m) ‘additions’, where H(m) is the number of ‘1’-bits in the binary expansion of m. This procedure was how Ancient Egyptians multiplied ordinary integers.
- On the other hand, it’s infeasible to go the other way. Given points G and mG, there is no known classical algorithm to reverse-engineer this to extract the original number m, without performing an amount of work proportional to the square-root of the number of points on the elliptic curve. Bitcoin’s elliptic curve has roughly 2^256 points, so it would take about 2^128 operations to steal a private-key using the Pollard rho algorithm.
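The easy direction can be sketched as follows. The function below is written against an arbitrary group operation `add` with identity `zero` (both hypothetical parameter names), so the same code works for curve points and for ordinary integers; it also counts the doublings and additions so that the operation count claimed above can be checked.

```python
import operator


def double_and_add(m, g, add, zero):
    """Compute m*g using only the group operation `add`.

    Returns (result, doubles, additions) so the cost can be inspected.
    """
    result, addend = zero, g
    doubles = additions = 0
    while m > 0:
        if m & 1:                       # this bit of m is set: add the current power
            result = add(result, addend)
            additions += 1
        m >>= 1
        if m:                           # more bits remain: double for the next bit
            addend = add(addend, addend)
            doubles += 1
    return result, doubles, additions


# 'Egyptian multiplication' of ordinary integers: 13 * 7 with 13 = 0b1101.
print(double_and_add(13, 7, operator.add, 0))  # (91, 3, 3)
```

Here 13 has three ‘1’-bits and floor(log2(13)) = 3, so the routine uses three doublings and three additions (the first ‘addition’ being the trivial one onto the identity).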
G is just a global ‘base point’ on the elliptic curve Γ which (together with the curve itself) is a public parameter of the cryptosystem. The order n of G (smallest positive integer such that nG = 0) must be prime. Then we have an isomorphism from [the additive group of] $\mathbb{F}_n$ to [the subgroup generated by G in] the elliptic curve:
$$m \mapsto mG$$
which is cryptographically irreversible.
[Note: the prime order n of the base point G is not the same as the prime order p of the field in which the elliptic curve itself lives. If G is a generator of the curve, then p and n will be relatively close but not necessarily equal. In the case where they are equal, the elliptic curve is called anomalous, and it has inherent weaknesses.]
Elliptic curve digital signatures
The existence of this one-way function enables various important cryptographic primitives to be built on top of elliptic curves, such as the aforementioned elliptic curve digital signature algorithm (ECDSA) used by HTTPS and Bitcoin. In ECDSA, Alice’s private key is some integer $m \in \mathbb{F}_n$, and her public key is the corresponding point mG on the elliptic curve Γ.
To sign a message M, Alice firstly computes a cryptographic hash z of M. In general, a cryptographic hash is a fixed-length bitstring (for SHA-256, it consists of 256 bits). In this case, we interpret z as an element of $\mathbb{F}_n$ (the integers modulo n) by interpreting it as a binary number and reducing modulo n.
Then, Alice computes a single-use cryptographically secure random number k, also in the field $\mathbb{F}_n$, and reveals the following:
- the abscissa (x-coordinate) r of the curve point kG;
- the value $s = (z + rm)/k$, with the arithmetic performed in $\mathbb{F}_n$.
Neither r nor s is allowed to be zero; if this happened (incredibly unlikely!), Alice should generate a new value k and try again. These data (r, s) together form the digital signature for the message M. Bob can verify that Alice created the signature by computing the cryptographic hash z and checking that the curve point (r/s)(mG) + (z/s)G has abscissa r. This only requires Bob to know the public-key mG, not the private-key m.
ECDSA key-pairs can be reused to sign many messages, but you must generate a different random number k each time you sign a message. Otherwise, it’s possible for someone to determine your private-key. Indeed, several bitcoins were stolen by attackers as a result of a poor random number generator on early versions of the Android operating system.
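Here is a minimal Python sketch of the signing and verification steps described above, using the published secp256k1 parameters. The function names are my own, a single round of SHA-256 is assumed for the hash (real protocols may hash differently), and the code makes no attempt at constant-time arithmetic, so treat it as an illustration rather than something to sign real transactions with.

```python
import hashlib
import secrets

# secp256k1 domain parameters: the curve y^2 = x^3 + 7 over F_P, base point G of prime order N.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def add(p1, p2):
    """Chord-and-tangent addition; None represents the identity point."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if p1 == p2:
        slope = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    return (x3, (slope * (x1 - x3) - y1) % P)


def mul(m, pt):
    """Double-and-add scalar multiplication."""
    result = None
    while m:
        if m & 1:
            result = add(result, pt)
        pt = add(pt, pt)
        m >>= 1
    return result


def hash_to_int(message):
    """Interpret SHA-256(message) as a binary number and reduce it modulo N."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(private_key, message):
    z = hash_to_int(message)
    while True:
        k = secrets.randbelow(N - 1) + 1          # fresh single-use nonce in [1, N-1]
        r = mul(k, G)[0] % N                      # abscissa of kG
        s = (z + r * private_key) * pow(k, -1, N) % N
        if r and s:                               # retry in the (very unlikely) zero case
            return (r, s)


def verify(public_key, message, signature):
    r, s = signature
    if not (0 < r < N and 0 < s < N):
        return False
    z = hash_to_int(message)
    s_inv = pow(s, -1, N)
    pt = add(mul(r * s_inv % N, public_key), mul(z * s_inv % N, G))
    return pt is not None and pt[0] % N == r


private_key = secrets.randbelow(N - 1) + 1        # Alice's m
public_key = mul(private_key, G)                  # Alice's mG
signature = sign(private_key, b"hello")
```

Calling `verify(public_key, b"hello", signature)` should succeed, while changing the message should make it fail. Note that `sign` draws a new k on every call, which is exactly the requirement discussed above.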
A cryptographic pairing on an elliptic curve Γ is a bilinear map e from Γ × Γ’ to [the multiplicative group of] some field F, where Γ’ is another elliptic curve isomorphic to Γ and related by a ‘twist’ (explained here). That is to say that:
$$e(aP, bQ) = e(P, Q)^{ab}$$
where P, Q are curve points on Γ, Γ’ (respectively) and a, b are integers. Note that the existence of a cryptographic pairing means that the elliptic curve discrete logarithm (hard) on Γ can be transported to the ordinary discrete logarithm (not quite as hard for a given size) on the field F. As such, the field F needs to be substantially larger than the curve Γ, lest it be the Achilles heel of the cryptosystem.
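To spell out how the discrete logarithm is transported, here is a short sketch. I write e for the pairing and pick any fixed point R on Γ’ for which e(P, R) is not 1; both names are introduced here only for illustration.

```latex
Q = mP
\;\Longrightarrow\;
e(Q, R) = e(mP, R) = e(P, R)^{m}
```

So anyone who can compute discrete logarithms to the base e(P, R) in the multiplicative group of F can recover m from P and Q alone. This reduction is usually credited to Menezes, Okamoto, and Vanstone.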
The field F is a finite field $\mathbb{F}_{p^k}$, whose characteristic p matches that of the ambient field in which the elliptic curve Γ lives. The minimal degree k for which a pairing exists is called the embedding degree of the elliptic curve.
For Bitcoin’s curve, the embedding degree is humongous (comparable to the number of points on Γ), which makes the pairing impossible to use. On the other hand, if k were very small (e.g. 1 or 2), the discrete logarithm in F would be much weaker than the discrete logarithm in Γ, so you’d need a massive elliptic curve to attain a desired level of security, and that would come at a computational cost.
The Barreto-Naehrig curves are a family of elliptic curves with a good embedding degree k = 12, so you can (for example) have a 256-bit elliptic curve with a 3072-bit embedding field F. This is what Zcash previously used, but it transpires that 3072-bit discrete logarithm is potentially slightly weaker than the desired security level. This means you’d want to use a slightly larger elliptic curve (384 or 512 bits), with a corresponding 4608- or 6144-bit embedding field F, respectively.
Details of the Barreto-Naehrig construction
The size of a Barreto-Naehrig curve is parametrised by an integer x. The values p and n are quartic polynomials in the value x:
- p = 36 x^4 + 36 x^3 + 24 x^2 + 6 x + 1
- n = 36 x^4 + 36 x^3 + 18 x^2 + 6 x + 1
Observe that the difference between them is only 6 x^2, which is slightly lower than the square-root of p (or n), consistent with Hasse’s bound. The validity of the construction relies on the fact that n is a factor of the 12th cyclotomic polynomial evaluated at p − n = 6 x^2:
In:= InputForm@Factor@Cyclotomic[12, 6 x^2]
(1 - 6*x + 18*x^2 - 36*x^3 + 36*x^4)*(1 + 6*x + 18*x^2 + 36*x^3 + 36*x^4)
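For readers without Mathematica, the same identities can be checked numerically with plain integer arithmetic in Python. This is only a sanity check over a range of sample values rather than a symbolic proof, and the helper names are my own.

```python
def p_of(x):
    return 36 * x**4 + 36 * x**3 + 24 * x**2 + 6 * x + 1


def n_of(x):
    return 36 * x**4 + 36 * x**3 + 18 * x**2 + 6 * x + 1


def cyclotomic_12(t):
    """The 12th cyclotomic polynomial, t^4 - t^2 + 1."""
    return t**4 - t**2 + 1


def check_bn_identities(x):
    p, n = p_of(x), n_of(x)
    return (
        p - n == 6 * x**2                                   # the two quartics differ by 6x^2
        and cyclotomic_12(p - n) == n_of(-x) * n_of(x)      # the factorisation shown above
        and (p**12 - 1) % n == 0                            # hence n divides p^12 - 1
        and (p + 1 - n) ** 2 <= 4 * p                       # Hasse's bound
    )


all_hold = all(check_bn_identities(x) for x in range(-1000, 1001))
print(all_hold)
```

The third check is the reason the embedding degree is at most 12: the n-th roots of unity live in the field with p^12 elements.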
We need to choose a value of x such that these two numbers are prime; the first few such positive values of x are:
1, 5, 6, 7, 20, 78, 82, 123, 166, 169, 173, 202, 257, 295, 308, 321, 420, 438, 448, 460, 487, 543, 596, 650, 720, 798, 810, 811, 833, 845, 869, 872, 921, 981, …
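A list like this can be regenerated with a short search. The sketch below uses a Miller-Rabin test with the first twelve primes as bases, which is known to be deterministic for inputs below 2^64 and therefore ample for x up to a few thousand; the function names are hypothetical.

```python
def is_prime(n):
    """Miller-Rabin with fixed bases; deterministic for n < 2**64."""
    if n < 2:
        return False
    bases = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    for q in bases:
        if n % q == 0:
            return n == q
    d, s = n - 1, 0
    while d % 2 == 0:          # write n - 1 = d * 2^s with d odd
        d //= 2
        s += 1
    for a in bases:
        y = pow(a, d, n)
        if y in (1, n - 1):
            continue
        for _ in range(s - 1):
            y = y * y % n
            if y == n - 1:
                break
        else:
            return False       # a witnesses that n is composite
    return True


def bn_parameters(limit):
    """Positive x below `limit` for which both p(x) and n(x) are prime."""
    found = []
    for x in range(1, limit):
        p = 36 * x**4 + 36 * x**3 + 24 * x**2 + 6 * x + 1
        n = p - 6 * x**2
        if is_prime(p) and is_prime(n):
            found.append(x)
    return found


print(bn_parameters(1000))
```

The output can be compared against the values listed above.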
which apparently isn’t yet in the OEIS. (I’ll add it.)
Of course, those values of x are far too small for cryptography. If you want a 256-bit elliptic curve, then you’ll want to choose x to be slightly lower than 2^64. By the prime number theorem, a randomly chosen x has (heuristically) a probability on the order of 1/log(x)² that the numbers p and n will both be prime.
After you’ve chosen a suitable x which passes both primality checks for p and n, you need to build the curve itself. Rather like Bitcoin’s elliptic curve secp256k1, the coefficient ‘a’ in the Weierstrass equation $y^2 = x^3 + ax + b$ is zero, so only b remains to be determined. [Note: the parameter x is not the same as the coordinate x; they just happen to have the same name.]
To determine b, you just keep trying successive values until you find one that works, as described in the algorithm given in their paper.
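The search for b can be sketched in Python as follows. I am assuming that the test for ‘works’ is the one I understand the Barreto-Naehrig paper to use: b + 1 must be a square modulo p, so that (1, y) is a point on y^2 = x^3 + b, and n times that point must be the identity. The toy parameters p = 103, n = 97 come from x = 1; the helper names and the brute-force square root are mine and only suitable at this size.

```python
def add(p1, p2, p):
    """Add points on y^2 = x^3 + b over F_p (b does not appear in the formulas); None = identity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if p1 == p2:
        slope = 3 * x1 * x1 * pow(2 * y1, -1, p) % p
    else:
        slope = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (slope * slope - x1 - x2) % p
    return (x3, (slope * (x1 - x3) - y1) % p)


def mul(m, pt, p):
    """Double-and-add scalar multiplication."""
    result = None
    while m:
        if m & 1:
            result = add(result, pt, p)
        pt = add(pt, pt, p)
        m >>= 1
    return result


def sqrt_mod(v, p):
    """Some y with y*y = v (mod p), or None. Brute force: toy sizes only."""
    for y in range(p):
        if (y * y - v) % p == 0:
            return y
    return None


def find_b(p, n):
    """Try b = 1, 2, ... until (1, y) lies on y^2 = x^3 + b and has order dividing n."""
    for b in range(1, p):
        y = sqrt_mod(b + 1, p)
        if y is None or y == 0:
            continue
        if mul(n, (1, y), p) is None:
            return b, (1, y)
    return None


def count_points(b, p):
    """Brute-force count of points on y^2 = x^3 + b over F_p, including the identity."""
    return 1 + sum((y * y - x**3 - b) % p == 0 for x in range(p) for y in range(p))


result = find_b(103, 97)
```

For the toy parameters one can then confirm with `count_points` that the curve found really has n = 97 points. At cryptographic sizes the brute-force helpers would have to be replaced by a proper modular square root, and counting points this way would be out of the question.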
Once you have the curve, how do you compute the pairing? There’s an algorithm by Miller (1986) for efficiently computing Tate/Weil pairings on arbitrary elliptic curves, and a paper by Devegili, Scott, and Dahab (2007) describes an optimised implementation of Tate and Ate pairings specifically for Barreto-Naehrig curves.
Interestingly, the paper makes the following comment:
Furthermore, for efficiency reasons in the pairing computation it is desirable to generate curves of prime order n such that n has a low Hamming weight. Constructing such curves for k = 12 or φ(k) > 4 is still a research problem.
The best choice of parameter I could find using the Barreto-Naehrig construction was x = 3(2^75 + 1), which results in n having 312 bits of which only 36 are nonzero.
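The bit counts quoted for this parameter are easy to reproduce; the snippet below just evaluates the quartic for n at x = 3(2^75 + 1) and counts bits. It does not test primality.

```python
def n_of(x):
    return 36 * x**4 + 36 * x**3 + 18 * x**2 + 6 * x + 1


x = 3 * (2**75 + 1)
n = n_of(x)

bit_length = n.bit_length()            # total number of bits in n
hamming_weight = bin(n).count("1")     # number of nonzero bits in n

print(bit_length, hamming_weight)      # 312 36
```

The weight is low because, writing t = 2^75, n expands to 2916 t^4 + 12636 t^3 + 20574 t^2 + 14922 t + 4069, and the five small coefficients occupy non-overlapping blocks of bits.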
Why are pairings useful?
They’re useful because they allow more computations on encrypted data.
Simply put, in the same way that an elliptic curve supports addition of numbers that have been encrypted as points, a pairing supports multiplication of encrypted numbers. It’s somewhat restricted, because the ‘product’ of two points belongs to F instead of Γ (i.e. it has a different type from those of the two multiplicands), so you can’t directly compute an encrypted product of three or more encrypted numbers. This is why pairings fall short of fully homomorphic encryption.
Despite this constraint, it’s still possible to take your desired computation (expressed as an arithmetic circuit) and compile it into a system of constraints that can be verified using pairings.
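A single multiplication constraint of this kind might look as follows. I write e for the pairing and G’ for a base point on Γ’ (a name introduced here for illustration), and I assume e(G, G’) has order exactly n. Given only the encrypted values aG, bG’, and cG, a verifier can test whether c = ab modulo n:

```latex
e(aG,\, bG') = e(G, G')^{ab}, \qquad e(cG,\, G') = e(G, G')^{c},
\qquad\text{so}\qquad
e(aG,\, bG') = e(cG,\, G') \iff ab \equiv c \pmod{n}.
```

Both sides of the check live in F, which is why the result of one multiplication cannot be fed into another pairing.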
There’s an excellent explanation of zk-SNARKs here, which pedagogically illustrates this property of a pairing using the following diagram:
Illustration by Maksym Petkus from his article, Why and How zk-SNARK Works: Definitive Explanation
Petkus’s explanation abstracts away the particular choices of cryptographic primitives (the words ‘elliptic curve’ being mentioned only once in the whole article), but it’s useful additional context to know that the ‘Source set’ above is the elliptic curve Γ and the ‘Output set’ is the much larger embedding field F.
In addition to Petkus’s explanation, I’d strongly recommend also reading this series of blog posts by Ariel Gabizon.