
This time we will prove Coppersmith’s theorem using the proof method of Howgrave-Graham. We will use lattices and the lattice basis reduction algorithm (LLL) to do it, and this will hopefully provide some guidance on making use of lattice basis reduction for solving other problems (e.g. CTF problems).

Prerequisite: Coppersmith’s Method (Part I): Introduction

In this post, you just need to remember that the first vector returned by the LLL algorithm satisfies \[ \|b_1\| \leq 2^{\frac{n-1}{4}}(\det L)^{\frac{1}{n}},\] and that the LLL algorithm runs in time polynomial in \(n\) (the dimension of the lattice) and \(\log\max\limits_{b_i\in B}\|b_i\|\) (the log of the length of the longest vector in the input).
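To make the statement concrete, here is a minimal, unoptimized LLL sketch in Python (with \(\delta=0.75\), naively recomputing the Gram-Schmidt data at every step; production code would use something like fpylll or Sage instead), run on a classic small example basis with \(\lvert\det\rvert=3\):

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def gso(b):
    """Gram-Schmidt orthogonalization: returns (b*, mu)."""
    n = len(b)
    bstar, mu = [], [[0.0] * n for _ in range(n)]
    for i in range(n):
        v = [float(x) for x in b[i]]
        for j in range(i):
            mu[i][j] = dot(b[i], bstar[j]) / dot(bstar[j], bstar[j])
            v = [x - mu[i][j] * y for x, y in zip(v, bstar[j])]
        bstar.append(v)
    return bstar, mu

def lll(b, delta=0.75):
    b = [list(v) for v in b]
    n = len(b)
    bstar, mu = gso(b)
    k = 1
    while k < n:
        for j in range(k - 1, -1, -1):                     # size reduction
            q = round(mu[k][j])
            if q:
                b[k] = [x - q * y for x, y in zip(b[k], b[j])]
                bstar, mu = gso(b)
        if dot(bstar[k], bstar[k]) >= (delta - mu[k][k - 1] ** 2) * dot(bstar[k - 1], bstar[k - 1]):
            k += 1                                         # Lovász condition holds
        else:
            b[k], b[k - 1] = b[k - 1], b[k]                # swap and step back
            bstar, mu = gso(b)
            k = max(k - 1, 1)
    return b

# |det| = 3 here, so the guarantee reads ||b_1|| <= 2^((3-1)/4) * 3^(1/3) ~ 2.04.
red = lll([[1, 1, 1], [-1, 0, 2], [3, 5, 6]])
assert dot(red[0], red[0]) <= (2 ** 0.5 * 3 ** (1 / 3)) ** 2 + 1e-9
```

On this input the sketch returns \([(0,1,0),(1,0,1),(-1,0,2)]\): the first vector has norm \(1\), comfortably within the guarantee.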

The proofs we are going to do today are constructive, meaning that we build an algorithm that solves our problem. The conditions are there because the algorithm requires them, so the logic may seem reversed (in fact it is kind of self-referential) if you do not have much mathematical maturity. Kind of like this: (surprise tool meme)

Howgrave-Graham’s Formulation

Recall: We use the convention that \(F(x)=\sum\limits_{i=0}^d a_i x^i\) is a polynomial with integer coefficients of degree \(d\) (so \(a_d\neq 0\)). Let \(x_0\) be a root of \(F(x)\equiv 0\pmod{N}\) such that \(\lvert x_0\rvert < X\) for some integer \(X\). For that \(X\), we associate the polynomial \(F\) with the vector \(b_F = (a_0, a_1X, a_2X^2, \cdots, a_d X^d)\), and we may use the two notations interchangeably. The norm of a vector \(v=(v_1,\cdots,v_n)\) is denoted \(\lVert v\rVert = \sqrt{\sum\limits_{i=1}^n v_i^2}\).

Theorem (Howgrave-Graham) If \(\lvert\lvert b_F\rvert\rvert < \frac{N}{\sqrt{d+1}}\), then \(F(x_0)=0\) over the integers (instead of mod \(N\) only).

The proof can be found in the first part and is not difficult, but let’s also include it here for self-containment.

Proof First we note that \(\sum\limits_{i=1}^n x_i \leq \sqrt{n \cdot \sum\limits_{i=1}^n x_i^2}\) for nonnegative \(x_i\), by the Cauchy-Schwarz inequality. Then

\[\begin{align*} \lvert F(x_0)\rvert =& \left\lvert \sum\limits_{i=0}^d a_i x_0^i\right\rvert\\ \leq& \sum\limits_{i=0}^d \lvert a_i\rvert\ \lvert x_0^i\rvert\\ < &\sum\limits_{i=0}^d \lvert a_i\rvert X^i\\ \leq &\sqrt{d+1} \lvert\lvert b_F\rvert\rvert\text{ (By the above lemma)}\\ \leq &\sqrt{d+1}\frac{N}{\sqrt{d+1}}\\ =& N \end{align*}\]

So \(-N < F(x_0) < N\). Since \(F(x_0) \equiv 0\pmod{N}\), this means \(F(x_0) = 0\) (without the mod).
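As a quick sanity check, here is a small Python helper that tests the Howgrave-Graham condition on a toy example (the polynomial \(F(x)=x^2-x-6\), modulus \(N=100\), and bound \(X=4\) are hypothetical, chosen for illustration):

```python
# Numeric sanity check of the Howgrave-Graham condition on a toy example.
from math import sqrt

def howgrave_bound_ok(coeffs, X, N):
    """coeffs = [a_0, ..., a_d]; checks ||b_F|| < N / sqrt(d + 1)."""
    b_F = [a * X ** i for i, a in enumerate(coeffs)]
    norm = sqrt(sum(c * c for c in b_F))
    return norm < N / sqrt(len(coeffs))

F = [-6, -1, 1]           # F(x) = x^2 - x - 6 = (x - 3)(x + 2)
N, X, x0 = 100, 4, 3
assert F[0] + F[1] * x0 + F[2] * x0 ** 2 == 0   # root over Z (hence mod N)
assert howgrave_bound_ok(F, X, N)               # ||b_F|| ~ 17.5 < 100/sqrt(3) ~ 57.7
```

Here the condition holds, and indeed the small root \(x_0=3\) of \(F\) mod \(N\) is a root over the integers.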

First Try

Here we use lattice vectors to correspond to polynomials. If all the basis vectors \(b_i\) (or equivalently, polynomials) that span a lattice \(\mathcal{L}\) have a common root \(\alpha\) (so \(b_i(\alpha)=0\) for all \(i\)), then any vector in the lattice also has the root \(\alpha\), since for all \(v\in \mathcal{L}\), \[v(\alpha) = \sum_i c_i b_i(\alpha) = \sum_i c_i\cdot 0 = 0.\] We will exploit this property in our derivation.

Let \(\displaystyle B=\left[\begin{array}{cccccc} N & & & & & \\ & NX & & & & \\ & & NX^{2} & & & \\ & & & \ddots & & \\ & & & & NX^{d-1} & \\ a_{0} & a_{1} X & a_{2} X^{2} & \cdots & a_{d-1} X^{d-1} & X^{d} \end{array}\right]_{(d+1)\times(d+1)},\) whose rows correspond to the polynomials \(Nx^i\) for \(0\leq i<d\) together with \(F\) itself (we assume \(F\) is monic, i.e. \(a_d=1\), hence the last entry \(X^d\)).

Since this matrix is lower-triangular, the determinant of \(B\) is the product of diagonals: \(\det B = N^d X^{\frac{d(d+1)}{2}}\).
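The construction can be sketched in a few lines of Python; the cubic \(F(x)=x^3+2x^2+3x+5\) and the values of \(N, X\) below are hypothetical, chosen only to check the determinant formula:

```python
# Build the (d+1) x (d+1) matrix B above for a monic polynomial and
# check det B = N^d * X^(d(d+1)/2) by multiplying the diagonal.
def first_try_basis(coeffs, N, X):
    """coeffs = [a_0, ..., a_{d-1}, 1] for a monic degree-d polynomial."""
    d = len(coeffs) - 1
    rows = []
    for i in range(d):                       # rows for N * x^i
        row = [0] * (d + 1)
        row[i] = N * X ** i
        rows.append(row)
    rows.append([a * X ** i for i, a in enumerate(coeffs)])  # row for F
    return rows

N, X, d = 10001, 2, 3
B = first_try_basis([5, 3, 2, 1], N, X)      # F = x^3 + 2x^2 + 3x + 5
det = 1
for i in range(len(B)):                      # lower triangular, so det is
    det *= B[i][i]                           # the product of the diagonal
assert det == N ** d * X ** (d * (d + 1) // 2)
```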

Theorem Let \(G(x)\) be the polynomial corresponding to \(b_{1}\) after LLL on \(L(B)\). Set \(C_{1}(d)=2^{-\frac{1}{2}}(d+1)^{-\frac{1}{d}}\). If \[\displaystyle X<C_{1}(d) N^{\frac{2}{d(d+1)}},\]

any root \(x_{0}\) of \(F(x) \bmod N\) such that \(\lvert x_0\rvert<X\) satisfies \(G(x_0)=0\) in \(\mathbb{Z}\).

Proof: After performing LLL on \(B\), the shortest vector \(b_1\) of the reduced basis (which corresponds to a polynomial \(G\)) satisfies \(\left\|b_{1}\right\| \leq 2^{\frac{n-1}{4}}(\det L)^{\frac{1}{n}}=2^{\frac{d}{4}} N^{\frac{d}{d+1}} X^{\frac{d}{2}}\) (as \(n=d+1\)). To satisfy the Howgrave-Graham condition, it therefore suffices that \(2^{\frac{d}{4}} N^{\frac{d}{d+1}} X^{\frac{d}{2}} < \frac{N}{\sqrt{d+1}}\). Now

\[\begin{align*} 2^{\frac{d}{4}} N^{\frac{d}{d+1}} X^{\frac{d}{2}} < &\frac{N}{\sqrt{d+1}}\\ 2^{\frac d4} X^{\frac d2} \sqrt{d+1} <& N^{1-\frac{d}{d+1}}\\ \left(2^{\frac 12}(d+1)^{\frac 1d}\right)^{\frac{d}{2}} X^{\frac d2} <& N^{\frac{1}{d+1}}\\ C_1(d)^{-\frac d2} X^{\frac d2} <& N^{\frac{1}{d+1}}\\ X^{\frac d2} <&C_1(d)^{\frac d2}N^{\frac{1}{d+1}}\\ X < & C_1(d) N^{\frac{2}{d(d+1)}} \end{align*}\]

So \(X < C_1(d) N^{\frac{2}{d(d+1)}}\) ensures that the condition in the Howgrave-Graham theorem holds, and hence the statement of the theorem follows. \(\square\)

This seems like a good method to recover small roots of polynomials mod \(N\)! However, the bound for \(X\) is very small: we have \(X\approx N^{\frac{2}{d(d+1)}}\approx N^{\frac{2}{d^2}}\). For example, for \(N=10001\) and \(d=3\), the bound for \(X\) is only about \(2.07\), which is not useful at all.

Increasing \(X\)

The remedy here is to note that increasing the dimension \(n\) of \(B\) raises the resulting bound for \(X\).

First Idea

One idea is to use higher-degree polynomials: apart from the polynomials \(g_i(x)=Nx^{i}\) and \(F(x)\) itself, we also use the x-shifts of \(F\), i.e.\[xF(x), x^2F(x),\cdots,x^kF(x).\] The right-hand side of the Howgrave-Graham bound then becomes \(\frac{N}{\sqrt{d+1+k}}\). The resulting bound on \(X\) is higher, approaching \(X\approx N^{\frac{1}{2d-1}}\) for the best choice of \(k\) (note that the power of the modulus is unchanged)! For example, again for \(N=10001, d=3\), using \(F\) together with the x-shifts \(xF\) and \(x^2F\), the new bound will be about \(3.11\).
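A small calculation illustrating the effect of x-shifts. Assuming a lattice basis of \(Nx^i\) (\(0\leq i<d\)) together with \(F, xF,\ldots,x^kF\), the dimension is \(n=d+1+k\), and redoing the analysis above gives \(X < N^{\frac{2(k+1)}{n(n-1)}}/\left(\sqrt2\, n^{\frac1{n-1}}\right)\):

```python
# Bound on X as a function of the number of x-shifts k (k = 0 means the
# original lattice with F alone).
def x_bound(N, d, k):
    n = d + 1 + k
    c = 2 ** 0.5 * n ** (1 / (n - 1))
    return N ** (2 * (k + 1) / (n * (n - 1))) / c

# No shifts: recovers the ~2.07 bound computed earlier.
print(round(x_bound(10001, 3, 0), 2))   # -> 2.07
# Using F, xF and x^2 F (k = 2): close to the ~3.11 quoted above.
print(round(x_bound(10001, 3, 2), 2))   # -> 3.12
```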

Second Idea

The second idea is to increase the power of \(N\) on the right-hand side by taking powers of \(F\). This means that instead of considering polynomials mod \(N\), we consider them mod \(N^h\) for some positive integer \(h\). Note that \(F(x)\equiv 0\pmod{N}\iff N^{h-k}F^k(x)\equiv 0\pmod{N^h}\) for any \(0\leq k\leq h\). So we can certainly work mod \(N^h\), have a bigger bound for the small roots, and any small root found is also a root of the original polynomial mod \(N\).
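The equivalence can be verified on a toy example (the values \(N=15\), \(F(x)=x^2-1\), \(x_0=4\), \(h=4\) are hypothetical):

```python
# F(x0) ≡ 0 (mod N) implies N^(h-k) * F(x0)^k ≡ 0 (mod N^h) for 0 <= k <= h.
N, h = 15, 4
F = lambda x: x * x - 1      # F(4) = 15 ≡ 0 (mod 15)
x0 = 4
assert F(x0) % N == 0
for k in range(h + 1):
    assert (N ** (h - k) * F(x0) ** k) % N ** h == 0
```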

By combining the two ideas, we will get a much higher bound. This is exactly the method used in Coppersmith’s Theorem.

Coppersmith’s Theorem

Theorem (Coppersmith) Let \(0<\varepsilon < \min\{0.18,\frac 1d\}\). Let \(F(x)\) be a monic irreducible polynomial of degree \(d\) with root(s) \(x_0\) modulo \(N\) such that \(\lvert x_0\rvert < \frac 1 2 N^{\frac 1 d - \varepsilon}\). Then such root(s) can be found in time polynomial in \(d\), \(\frac 1 \varepsilon\) and \(\log N\).

We note that even though we have an \(\varepsilon\) term, we can “improve” the bound to \(N^{\frac 1d}\) by exhaustive search on the top few bits of \(x_0\).


Let \(h\) be an integer (to be determined later) depending on \(d\) and \(\varepsilon\). We consider the lattice spanned by the vectors corresponding to the polynomials\[G_{ij}(x)=N^{h-1-j}F(x)^jx^i\]where \(0 \leq i<d, 0\leq j< h\). As above, if \(x_0\) is a solution to \(F(x)\equiv 0\pmod{N}\), then \(G_{ij}(x_0)\equiv 0\pmod{N^{h-1}}\). We also notice that the degree of \(G_{ij}\) is \(dj+i\), so the degrees run through \(0\) to \(dh-1\) exactly once.

Copying the Previous Method

In this way, the polynomials correspond to (row) vectors \(b_{G_{ij}}\), and the matrix formed by these vectors can be rearranged (by permuting rows) into a lower triangular matrix with diagonal entries \(N^{h-1-j}X^{dj+i}\). The determinant of this matrix is therefore \(N^{\frac{dh(h-1)}{2}}X^{\frac{dh(dh-1)}{2}}\), where this lattice has dimension \(dh\).
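A sketch of the construction in Python, with a hypothetical monic cubic \(F(x)=x^3+2x+1\). It checks that the degrees \(dj+i\) hit \(0,\ldots,dh-1\) exactly once, and that the would-be diagonal entry of the degree-\((dj+i)\) row is \(N^{h-1-j}X^{dj+i}\):

```python
# Build the rows b_{G_ij} for G_ij(x) = N^(h-1-j) * F(x)^j * x^i.
def poly_mul(p, q):
    r = [0] * (len(p) + len(q) - 1)
    for a, x in enumerate(p):
        for b, y in enumerate(q):
            r[a + b] += x * y
    return r

def coppersmith_rows(F, N, X, h):
    """F = [a_0, ..., a_{d-1}, 1] monic; returns {degree: row vector}."""
    d = len(F) - 1
    rows = {}
    Fj = [1]                                   # F^0
    for j in range(h):
        for i in range(d):
            G = [0] * i + [N ** (h - 1 - j) * c for c in Fj]   # N^(h-1-j) F^j x^i
            deg = d * j + i
            rows[deg] = [c * X ** e for e, c in enumerate(G)]  # vector b_G
        Fj = poly_mul(Fj, F)
    return rows

F, N, X, h = [1, 2, 0, 1], 10001, 2, 3         # F = x^3 + 2x + 1 (monic)
d = len(F) - 1
rows = coppersmith_rows(F, N, X, h)
assert sorted(rows) == list(range(d * h))      # degrees 0..dh-1, once each
for deg, row in rows.items():
    j, i = divmod(deg, d)
    assert row[deg] == N ** (h - 1 - j) * X ** deg  # leading (diagonal) entry
```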

Now as before, run LLL on the lattice, and the first vector \(b_1\) will satisfy \(\|b_1\|\leq 2^{\frac{dh-1}{4}}\left(\det L\right)^{\frac{1}{dh}}=2^{\frac{dh-1}{4}}N^{\frac 12 (h-1)}X^{\frac{dh-1}{2}}\). This vector corresponds to a polynomial \(G(x)\) with \(G(x_0)\equiv 0\pmod{N^{h-1}}\). If \(\|b_1\|\leq \frac{N^{h-1}}{\sqrt{dh}}\), then by Howgrave-Graham \(G(x_0)=0\) over the integers. So our goal becomes

\[\begin{align*} 2^{\frac{dh-1}{4}}N^{\frac 12 (h-1)}X^{\frac{dh-1}{2}} <& \frac{N^{h-1}}{\sqrt{dh}}\\ \sqrt{dh}\, 2^{\frac{dh-1}{4}}X^{\frac{dh-1}{2}} <& N^{\frac 12 (h-1)}\\ c(d,h)X<& N^{\frac{h-1}{dh-1}} \end{align*}\]

Where \(c(d,h)=\left(\sqrt{dh}2^{\frac{dh-1}{4}}\right)^{\frac{2}{dh-1}}=\sqrt{2}(dh)^{\frac{1}{dh-1}}\).

Getting \(\varepsilon\)

Now the power of \(N\) is

\[\begin{align*} \frac{h-1}{dh-1} =& \frac{d(h-1)}{d(dh-1)}\\ =& \frac{dh-1+1-d}{d(dh-1)}\\ =&\frac 1d - \frac{d-1}{d(dh-1)} \end{align*}\]

If we set \(\varepsilon = \frac{d-1}{d(dh-1)}\), then \(dh = \frac{d-1}{d\varepsilon}+1\). So \(c(d,h)\) becomes \[\sqrt{2}(dh)^{\frac{1}{dh-1}} = \sqrt{2}\left(1+\frac{d-1}{d\varepsilon}\right)^{\frac{d\varepsilon}{d-1}}=\sqrt{2}\left(1+\frac 1u\right)^{u}\] by letting \(u=\frac{d\varepsilon}{d-1}\). Going back to the original inequality \(c(d,h)X< N^{\frac{h-1}{dh-1}}\) on \(X\), substituting gives \[X < \frac{1}{c(d,h)}N^{\frac 1d - \varepsilon}.\]

Our original theorem statement requires \(X<\frac 12 N^{\frac 1d - \varepsilon}\), so it suffices to have \(c(d,h)\leq 2\). Luckily \(c(d,h)=\sqrt{2}(1+\frac 1u)^u\), and \((1+\frac 1u)^u \leq \sqrt{2}\) for \(0< u\leq 0.18\). Since \(u=\frac{d\varepsilon}{d-1}\), this amounts to \(\varepsilon \leq (1-\frac 1d)\cdot 0.18\), which is essentially what the assumption \(\varepsilon < 0.18\) guarantees (for small \(d\), take \(\varepsilon\) slightly smaller still).
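Both numeric facts used here, the closed form \(c(d,h)=\left(\sqrt{dh}\,2^{\frac{dh-1}{4}}\right)^{\frac{2}{dh-1}}=\sqrt2\,(dh)^{\frac{1}{dh-1}}\) and the bound \((1+\frac1u)^u\le\sqrt2\) for \(u\le 0.18\), are easy to check numerically:

```python
# Check the simplification of c(d,h) and the (1 + 1/u)^u <= sqrt(2) claim.
from math import isclose, sqrt

for dh in range(2, 50):
    lhs = (sqrt(dh) * 2 ** ((dh - 1) / 4)) ** (2 / (dh - 1))
    rhs = sqrt(2) * dh ** (1 / (dh - 1))
    assert isclose(lhs, rhs)

for i in range(1, 181):                 # u = 0.001, 0.002, ..., 0.180
    u = i / 1000
    assert (1 + 1 / u) ** u <= sqrt(2)
```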

Setting \(h\)

Since \(h = \frac{d-1}{d^2\varepsilon}+\frac 1d = \frac{1}{d\varepsilon} - \frac{1}{d^2\varepsilon} + \frac 1d \approx \frac{1}{d\varepsilon}\), taking \(h\) around \(\frac{1}{d\varepsilon}\) (e.g. \(h=\left\lceil\frac{1}{d\varepsilon}\right\rceil\)) will make Howgrave-Graham work.

For the polynomial-time part, we note that the lattice \(L\) to be reduced has dimension \(dh\approx\frac 1\varepsilon\), and the coefficients of the polynomials are bounded by \(N^h\), so LLL runs in time polynomial in \(d\), \(\frac 1\varepsilon\) and \(\log N\). QED.

Next time we will try to prove the recovery of \(N=pq\) given partial knowledge of \(p\).


Galbraith, S. D. (2012). Mathematics of public key cryptography. Cambridge University Press.

Appendix: To be determined later…?

Note that \(h\) is “to be determined later”. Indeed it seems that we should have chosen \(h\) first, then checked whether the condition of Howgrave-Graham is satisfied given the constraints of the theorem. However, we actually derived the value of \(h\) that fits the condition. Then if we go back and substitute the instances of the (unknown) \(h\) with the value we obtained at the end, we will be able to complete the proof.

Example: Limit of Functions in Calculus

Here we end this post with an example of this “to be determined later” pattern as it appears in mathematics. In calculus (or actually, basic mathematical analysis), we say the limit of a function \(f(x)\) at a point \(a\) is \(L\) if for any \(\varepsilon>0\), there exists a \(\delta>0\) (that depends on \(\varepsilon\) only) such that for any \(x\) with \(0<\lvert x-a\rvert < \delta\), we have \[\lvert f(x)-L\rvert < \varepsilon.\]

The statement just means, intuitively, that the value of \(f(x)\) can get arbitrarily close to the value \(L\) if \(x\) is close enough to \(a\). This is like a challenge-response protocol: we are given a desired distance from \(L\), and we should respond with a range of \(x\) (near \(a\)) that achieves the desired distance.

For instance, let’s try to prove that \[\lim\limits_{x\to 1}\frac 1x = 1\] with the definition.

For any \(\varepsilon>0\), \(\left\lvert \frac 1x - 1 \right\rvert = \left\lvert \frac{x-1}{x} \right\rvert = \frac{\lvert x-1 \rvert}{\lvert x \rvert}\).

If \(\lvert x-1\rvert < \frac 12\), then \(\frac 12 < x < \frac 32\). In particular \(x > \frac 1 2\), so with that assumption, we can have \(\frac{\lvert x-1 \rvert}{\lvert x \rvert} < 2\lvert x-1 \rvert\).

If further \(\lvert x-1 \rvert < \frac \varepsilon 2\), then \(2\lvert x-1 \rvert < \varepsilon\).

Now when we look back at the derivations, we find that \(\delta\) should be less than both \(\frac 12\) and \(\frac\varepsilon 2\), so that the requirement of \(\left\lvert \frac 1x - 1 \right\rvert < \varepsilon\) is satisfied. Let’s see how we will actually present the proof:

Proof: For any \(\varepsilon>0\), set \(\delta = \min\{\frac 12, \frac\varepsilon 2\}\). Then for any \(x\) such that \(0 < \lvert x-1\rvert < \delta\),

\[\begin{align*} \left\lvert \frac 1x - 1 \right\rvert =& \left\lvert \frac{x-1}{x} \right\rvert\\ =& \frac{\lvert x-1 \rvert}{\lvert x \rvert}\\ <& 2\lvert x-1 \rvert\\ <& \varepsilon \end{align*}\]

So \(\lim\limits_{x\to 1}\frac 1x = 1\) by definition.
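The choice \(\delta = \min\{\frac 12, \frac\varepsilon 2\}\) from the proof can also be exercised numerically, sampling points on both sides of \(1\):

```python
# Sample x with 0 < |x - 1| < delta and confirm |1/x - 1| < eps.
for eps in (1.0, 0.1, 0.01):
    delta = min(0.5, eps / 2)
    for t in range(1, 1000):
        x = 1 + delta * t / 1000        # points approaching 1 from above
        assert abs(1 / x - 1) < eps
        x = 1 - delta * t / 1000        # and from below
        assert abs(1 / x - 1) < eps
```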

See how the logic is kind of reversed? This is because we had already derived the whole thing in order to figure out the value of \(\delta\) required to satisfy the condition.