Modular inverses using Newton iteration

Let $ R$ be a commutative ring with identity element. Given $ f \in R[x]$ and $ {\ell} \in {\mbox{${\mathbb{N}}$}}$ such that $ f(0) = 1$, compute a polynomial $ g \in R[x]$ such that

$\displaystyle f \, g \ {\equiv} \ 1 \mod{ \ x^{\ell} } \ \ {\rm and} \ \ {\deg} \, g < {\ell}.$ (8)

In Proposition 3 we observe that if there is a solution, then it is unique.

Proposition 3   If Equation (8) has a solution $ g \in R[x]$ with degree less than $ {\ell}$ then it is unique.


Proof. Indeed, let $ g_1$ and $ g_2$ be solutions of Equation (8). Then the product $ f \, (g_1 - g_2)$ is a multiple of $ x^{\ell}$ . In particular its constant term, which is $ f(0) \, (g_1(0) - g_2(0)) = g_1(0) - g_2(0)$ , must be 0 . Hence there is a constant $ c \in R$ and polynomials $ h_1, h_2$ with degree less than $ {\ell} - 1$ such that

$\displaystyle g_1(x) \ = \ h_1(x) \, x + c \ \ {\rm and} \ \ g_2(x) \ = \ h_2(x) \, x + c$ (9)

It follows that $ f \, (h_1 - h_2)$ is a multiple of $ x^{{\ell} -1}$ . Repeating the same argument shows that $ h_1(0) = h_2(0)$ , and by induction on $ {\ell}$ we obtain $ g_1 = g_2$ . $ \qedsymbol$


Remark 3   Since Equation (8) is an equation in $ R[x]/\langle x^{\ell} \rangle$ , a solution of this equation can be viewed as an approximation of a more general problem. Think of truncated Taylor expansions! So let us recall from numerical analysis the celebrated Newton iteration. Let $ {\phi}(g) = 0$ be an equation that we want to solve, where $ {\phi}: {\mbox{${\mathbb{R}}$}} \longrightarrow {\mbox{${\mathbb{R}}$}} $ is a differentiable function. From a suitable initial approximation $ g_0$ , the sequence defined by the Newton iteration step

$\displaystyle g_{i+1} \ = \ g_i - \frac{{\phi}(g_i)}{{\phi}'(g_{i})}$ (10)

computes successive approximations that converge toward a desired solution. In our case we have $ {\phi}(g) = 1/g - f$ and the Newton iteration step becomes

$\displaystyle g_{i+1} \ = \ g_i - \frac{ 1/g_i - f}{ - 1/{g_i}^2} \ = \ 2 g_i - f \, {g_i}^2.$ (11)
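
As a small aside, the iteration (11) already works numerically: it computes a reciprocal without performing any division, which is exactly the feature exploited below for power series. The following Python sketch is an illustration of ours, not part of the algorithmic development.

\begin{verbatim}
# Newton iteration (11) for phi(g) = 1/g - f: computes 1/f without any division.
def reciprocal(f, g0, steps=6):
    """Approximate 1/f starting from the initial guess g0."""
    g = g0
    for _ in range(steps):
        g = 2 * g - f * g * g   # Newton step: g <- 2 g - f g^2
    return g

# Example: 1/3 from the initial guess 0.3; the error 1 - f*g is squared at each
# step, so six steps are far more than enough here.
print(reciprocal(3.0, 0.3))     # 0.3333333333333333
\end{verbatim}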


Theorem 1   Let $ R$ be a commutative ring with identity element. Let $ f$ be a polynomial in $ R[x]$ such that $ f(0) = 1$ . Let $ g_0, g_1, g_2, \ldots$ be the sequence of polynomials defined for all $ i \geq 0$ by

$\displaystyle \left\{ \begin{array}{rcl} g_0 & = & 1 \\ g_{i+1} & \equiv & 2 g_i - f \, {g_i}^2 \ \mod{\ x^{2^{i+1}}} \end{array} \right.$ (12)

Then for $ i \geq 0$ we have

$\displaystyle f \, g_i \ \equiv \ 1 \ \mod{ \ x^{2^i}}$ (13)


Proof. By induction on $ i \geq 0$ . For $ i = 0$ we have $ x^{2^i} = x$ and thus

$\displaystyle f \, g_0 \ \equiv \ f(0) \, g_0 \ \equiv \ 1 \, \times \, 1 \ \equiv \ 1 \ \mod{ \ x}$ (14)

For the induction step we have

\begin{displaymath}\begin{array}{rcll} 1 - f \, g_{i+1} & \equiv & 1 - f \, (2 g_i - f \, {g_i}^2) & \mod{ \ x^{2^{i+1}}} \\ & \equiv & 1 - 2 f \, g_i + f^2 \, {g_i}^2 & \mod{ \ x^{2^{i+1}}} \\ & \equiv & (1 - f \, g_i)^2 & \mod{ \ x^{2^{i+1}}} \\ & \equiv & 0 & \mod{ \ x^{2^{i+1}}} \\ \end{array}\end{displaymath} (15)

Indeed $ f \, g_i \ \equiv \ 1 \ \mod{ \ x^{2^i}}$ means that $ x^{2^i}$ divides $ 1 - f \, g_i$ . Thus $ x^{2^{i+1}} = x^{2^i + 2^i} = x^{2^i} \, x^{2^i}$ divides $ (1 - f \, g_i)^2$ . $ \qedsymbol$
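
The key identity used above, $ 1 - f \, g_{i+1} \equiv (1 - f \, g_i)^2 \mod{\ x^{2^{i+1}}}$ , can be checked on a small example. The Python sketch below does so over $ {\mathbb{Z}}$ , representing polynomials as coefficient lists; the helpers mul_trunc and sub are ours, written only for this illustration.

\begin{verbatim}
# Check that 1 - f*g_{i+1} equals (1 - f*g_i)^2 modulo x^(2^(i+1)).
# Polynomials are coefficient lists over Z; index k holds the coefficient of x^k.

def mul_trunc(a, b, n):
    """Product of a and b truncated modulo x^n."""
    c = [0] * n
    for i, ai in enumerate(a[:n]):
        for j, bj in enumerate(b[:n - i]):
            c[i + j] += ai * bj
    return c

def sub(a, b):
    """Difference a - b, padding the shorter list with zeros."""
    n = max(len(a), len(b))
    a, b = a + [0] * (n - len(a)), b + [0] * (n - len(b))
    return [x - y for x, y in zip(a, b)]

f = [1, 3, -2, 5]        # f(0) = 1, as required
g1 = [1, -3]             # inverse of f modulo x^2  (the case i = 1)
n = 4                    # 2^(i+1)

g2 = sub([2 * c for c in g1], mul_trunc(f, mul_trunc(g1, g1, n), n))  # g_{i+1}
lhs = sub([1], mul_trunc(f, g2, n))          # 1 - f*g_{i+1}  mod x^4
err = sub([1], mul_trunc(f, g1, n))          # 1 - f*g_i
rhs = mul_trunc(err, err, n)                 # (1 - f*g_i)^2  mod x^4
print(lhs == rhs, lhs == [0, 0, 0, 0])       # True True
\end{verbatim}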

Algorithm 2  

Input: $ f \in R[x]$ such that $ f(0) = 1$ and $ {\ell} \in {\mbox{${\mathbb{N}}$}}$ .

Output: $ g \in R[x]$ such that $ f \, g \ \equiv \ 1 \mod{ \ x^{\ell}}$ .

$ g_0 := 1$
$ r := \lceil \log_2({\ell}) \rceil$
for $ i = 1 \cdots r$ repeat
    $ g_i := \left( 2 g_{i-1} - f \, {g_{i-1}}^2 \right) \mod{ \ x^{2^i}}$
return $ g_r$
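
To make the procedure concrete, here is a straightforward Python transcription of Algorithm 2 over a prime field $ {\mathbb{Z}}/p{\mathbb{Z}}$ , using naive truncated multiplication. The names inverse_mod_power and mul_trunc_mod are ours, and this is only a sketch, not an optimized implementation (in particular it ignores Remarks 4 and 6 below).

\begin{verbatim}
from math import ceil, log2

def mul_trunc_mod(a, b, n, p):
    """Product of a and b truncated modulo x^n, coefficients in Z/pZ (naive)."""
    c = [0] * n
    for i, ai in enumerate(a[:n]):
        for j, bj in enumerate(b[:n - i]):
            c[i + j] = (c[i + j] + ai * bj) % p
    return c

def inverse_mod_power(f, l, p):
    """Inverse of f modulo x^l over Z/pZ, assuming f[0] == 1 (Algorithm 2)."""
    assert f[0] % p == 1
    r = ceil(log2(l)) if l > 1 else 0
    g = [1]                                     # g_0
    for i in range(1, r + 1):
        n = 2 ** i
        fg2 = mul_trunc_mod(f, mul_trunc_mod(g, g, n, p), n, p)  # f*g_{i-1}^2 mod x^(2^i)
        g = [(2 * (g[k] if k < len(g) else 0) - fg2[k]) % p for k in range(n)]
    return g[:l]

# Usage: invert f = 1 + x + x^2 modulo x^8 over Z/101Z and check the product.
p, l = 101, 8
f = [1, 1, 1]
g = inverse_mod_power(f, l, p)
print(mul_trunc_mod(f, g, l, p))                # [1, 0, 0, 0, 0, 0, 0, 0]
\end{verbatim}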


Definition 2   A multiplication time is a function $ {\ensuremath{\mathsf{M}}}: {\mbox{${\mathbb{N}}$}} \longrightarrow {\mbox{${\mathbb{R}}$}}$ such that for any commutative ring $ R$ with a $ 1$ , for every $ n \in {\mbox{${\mathbb{N}}$}}$ , any pair of polynomials in $ R[x]$ of degree less than $ n$ can be multiplied in at most $ {\ensuremath{\mathsf{M}}}(n)$ operations of $ R$ . In addition, $ {\ensuremath{\mathsf{M}}}$ must satisfy $ {\ensuremath{\mathsf{M}}}(n) / n \geq {\ensuremath{\mathsf{M}}}(m) / m$ , for every $ m,n \in {\mbox{${\mathbb{N}}$}}$ , with $ n \geq m$ . This implies the superlinearity properties, that is, for every $ m,n \in {\mbox{${\mathbb{N}}$}}$

$\displaystyle {\ensuremath{\mathsf{M}}}(n m) \ \geq \ m \, {\ensuremath{\mathsf{M}}}(n), \ \ \ {\ensuremath{\mathsf{M}}}(n + m) \ \geq \ {\ensuremath{\mathsf{M}}}(m) + {\ensuremath{\mathsf{M}}}(n) \ \ {\rm and} \ \ {\ensuremath{\mathsf{M}}}(n) \geq n.$ (16)
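
As a short verification (ours), the first two properties follow directly from the assumption $ {\ensuremath{\mathsf{M}}}(n)/n \geq {\ensuremath{\mathsf{M}}}(m)/m$ for $ n \geq m$ : on the one hand $ {\ensuremath{\mathsf{M}}}(n m) = n m \, {\ensuremath{\mathsf{M}}}(n m)/(n m) \geq n m \, {\ensuremath{\mathsf{M}}}(n)/n = m \, {\ensuremath{\mathsf{M}}}(n)$ , and on the other hand

$\displaystyle {\ensuremath{\mathsf{M}}}(n + m) \ = \ n \, \frac{{\ensuremath{\mathsf{M}}}(n+m)}{n+m} + m \, \frac{{\ensuremath{\mathsf{M}}}(n+m)}{n+m} \ \geq \ n \, \frac{{\ensuremath{\mathsf{M}}}(n)}{n} + m \, \frac{{\ensuremath{\mathsf{M}}}(m)}{m} \ = \ {\ensuremath{\mathsf{M}}}(n) + {\ensuremath{\mathsf{M}}}(m).$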

Example 1   Examples of multiplication times are: $ {\ensuremath{\mathsf{M}}}(n) = 2 n^2$ for classical multiplication, $ {\ensuremath{\mathsf{M}}}(n) \in O(n^{\log_2 3})$ for Karatsuba's algorithm, and $ {\ensuremath{\mathsf{M}}}(n) \in O(n \log(n) \log\log(n))$ for FFT-based (Schönhage-Strassen) multiplication. Note that the FFT-based multiplication in degree $ d$ over a ring that supports the FFT (that is, possessing a primitive $ n$ -th root of unity, where $ n$ is a power of $ 2$ greater than $ 2d$ ) can run in $ C\, d\log(d)$ operations in $ R$ , with some $ C\geq 18$ .


Theorem 2   Algorithm 2 computes the inverse of $ f$ modulo $ x^{\ell}$ in $ 3 \ensuremath{\mathsf{M}}(\ell) + O(\ell)$ operations in $ R$ .


Proof. Theorem 1 tells us that Algorithm 2 computes the inverse of $ f$ modulo $ x^{2^r}$ . Since $ x^{\ell}$ divides $ x^{2^r}$ , the result is also valid modulo $ x^{\ell}$ . Before proving the complexity result, we point out the following relation for $ i = 1 \cdots r$ .

$\displaystyle g_i \ \equiv \ g_{i-1} \ \mod{ \ x^{2^{i-1}}}$ (17)

Indeed, by virtue of Theorem 1 we have

\begin{displaymath}\begin{array}{rcll} g_i & \equiv & 2 g_{i-1} - f \, {g_{i-1}}^2 & \mod{ \ x^{2^{i-1}}} \\ & \equiv & 2 g_{i-1} - (f \, g_{i-1}) \, g_{i-1} & \mod{ \ x^{2^{i-1}}} \\ & \equiv & 2 g_{i-1} - g_{i-1} & \mod{ \ x^{2^{i-1}}} \\ & \equiv & g_{i-1} & \mod{ \ x^{2^{i-1}}} \\ \end{array}\end{displaymath} (18)

Therefore, when computing $ g_i$ , we only care about the powers of $ x$ in the range $ x^{2^{i-1}} \cdots x^{2^i}$ : the lower-order terms of $ g_i$ are simply those of $ g_{i-1}$ and need not be recomputed. Now recall that

$\displaystyle \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots \ = \ 1$ (19)

So, roughly speaking, the cost of the whole algorithm is of the order of magnitude of the cost of the last iteration, leading to $ 2 {\ensuremath{\mathsf{M}}}(2^r) + O(2^r)$ operations in $ R$ . This is not a formal proof, although the principle is correct. Let us give a more formal proof.

The cost for the $ i$ -th iteration is $ \ensuremath{\mathsf{M}}(2^{i-1})$ operations for computing $ {g_{i-1}}^2$ , plus $ \ensuremath{\mathsf{M}}(2^{i})$ operations for computing $ f \, {g_{i-1}}^2 \mod{\ x^{2^i}}$ , plus $ 2^{i-1}$ operations for subtracting it from $ 2 g_{i-1}$ (only the terms of degree at least $ 2^{i-1}$ need to be updated).

Thus we have $ \ensuremath{\mathsf{M}}(2^i) + \ensuremath{\mathsf{M}}(2^{i-1}) + 2^{i-1} \le \frac{3}{2} \, \ensuremath{\mathsf{M}}(2^i) + 2^{i-1}$ , resulting in a total running time:

$\displaystyle \sum_{1\le{i}\le{r}}{\left( \frac{3}{2} \, \ensuremath{\mathsf{M}}(2^i) + 2^{i-1} \right)} \ \le \ {3 \, \ensuremath{\mathsf{M}}(2^r)+2^r} = {3\,\ensuremath{\mathsf{M}}(\ell) +\ell}$ (20)

since $ 2\ensuremath{\mathsf{M}}(n) \le \ensuremath{\mathsf{M}}(2n)$ for all $ n \in$   $ \mbox{${\mathbb{N}}$}$ $ \qedsymbol$
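
As a quick sanity check (ours, not part of the original argument), the bound can be tested numerically for a specific multiplication time, say the classical $ \ensuremath{\mathsf{M}}(n) = 2 n^2$ , taking $ {\ell} = 2^r$ .

\begin{verbatim}
# Compare the exact per-iteration cost M(2^i) + M(2^(i-1)) + 2^(i-1), summed over
# i = 1..r, with the bound 3 M(l) + l from the proof, for M(n) = 2 n^2 and l = 2^r.
M = lambda n: 2 * n * n

for r in range(1, 8):
    l = 2 ** r
    total = sum(M(2 ** i) + M(2 ** (i - 1)) + 2 ** (i - 1) for i in range(1, r + 1))
    bound = 3 * M(l) + l
    print(r, total, bound, total <= bound)      # the bound always holds
\end{verbatim}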


Remark 4   Once again in Algorithm 2 for $ i = 1 \cdots r$ we have

$\displaystyle g_i \ \equiv \ g_{i-1} \ \mod{ \ x^{2^{i-1}}}$ (21)

So when implementing Algorithm 2 one should be careful not to recompute the low-order terms of $ g_i$ , which are already known from $ g_{i-1}$ ; see the sketch below.
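
The following variant of the sketch given after Algorithm 2 (reusing its helpers inverse_mod_power and mul_trunc_mod, with names of our own choosing) takes this remark into account: at step $ i$ it only produces the coefficients of $ x^{2^{i-1}}, \ldots, x^{2^i - 1}$ and leaves the already known lower half of $ g$ untouched.

\begin{verbatim}
from math import ceil, log2

def inverse_upper_half_only(f, l, p):
    """Algorithm 2 over Z/pZ, appending only the new upper-half coefficients."""
    r = ceil(log2(l)) if l > 1 else 0
    g = [1]                                     # g_0
    for i in range(1, r + 1):
        n = 2 ** i
        half = n // 2
        fg2 = mul_trunc_mod(f, mul_trunc_mod(g, g, n, p), n, p)
        # By Equation (21), g_i agrees with g_{i-1} below x^half; since g_{i-1}
        # has degree < half, the new upper coefficients are simply -fg2[k] mod p.
        g = g + [(-fg2[k]) % p for k in range(half, n)]
    return g[:l]

# Same result as the plain transcription of Algorithm 2.
print(inverse_upper_half_only([1, 1, 1], 8, 101) == inverse_mod_power([1, 1, 1], 8, 101))
\end{verbatim}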


Remark 5   Algorithm 2 can be adapted to the case where $ f(0)$ is a unit different from $ 1$ by initializing $ g_0$ to the inverse of $ f(0)$ instead of $ 1$ . If $ f(0)$ is not a unit, then no inverse of $ f$ modulo $ x^{\ell}$ exists. Indeed $ f \, g \equiv 1 \mod{ \ x^{\ell}}$ implies $ f(0) g(0) = 1$ which says that $ f(0)$ is a unit.
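
As a small illustration of this remark (a sketch of ours, reusing the helpers introduced after Algorithm 2), one can equivalently normalize $ f$ by $ f(0)^{-1}$ , invert the normalized polynomial with Algorithm 2 as stated, and rescale the result, which avoids modifying the initialization.

\begin{verbatim}
# Inverting f with f(0) = 2 (a unit modulo p) by normalization and rescaling.
p, l = 101, 8
f = [2, 1, 1]
c = pow(f[0], -1, p)                             # 2^(-1) = 51 modulo 101
g = [(c * a) % p for a in inverse_mod_power([(c * a) % p for a in f], l, p)]
print(mul_trunc_mod(f, g, l, p))                 # [1, 0, 0, 0, 0, 0, 0, 0]
\end{verbatim}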


Remark 6   Let us take a closer look at the computation of

$\displaystyle g_{i-1} (2 - f \, g_{i-1}) \ \mod{ \ x^{2^{i}}}$ (22)

in Algorithm 2. Consider first the product $ f \, g_{i-1}$ . It satisfies:

$\displaystyle f \, g_{i-1} \ \equiv \ 1 \ \mod{ \ x^{2^{i-1}}}$ (23)

Moreover, the polynomials $ f$ and $ g_{i-1}$ can be seen as polynomials with degrees less than $ 2^{i}$ and $ 2^{i-1}$ respectively. Hence, there exist polynomials $ S, T \in R[x]$ with degree less than $ 2^{i-1}$ such that we have:

$\displaystyle f \, g_{i-1} = 1 + T x^{2^{i-1}} + S x^{2^{i}}.$ (24)

We are only interested in computing $ T$ . In order to avoid computing $ S$ , let us observe that we have

$\displaystyle f \, g_{i-1} \ \equiv \ (1 + S) + T x^{2^{i-1}} \mod{ \ x^{2^{i}} - 1}.$ (25)

In other words, the upper part (that is, the terms of degree at least $ 2^{i-1}$ ) of the cyclic convolution product of $ f$ and $ g_{i-1}$ , that is, of their product modulo $ x^{2^{i}} - 1$ , gives us exactly $ T$ .

So let us assume from now on that we have at hand a primitive $ 2^i$ -th root of unity, so that we can compute DFTs. Then we can compute $ T$ at the cost of one multiplication in degree less than $ 2^{i-1}$ .

Consider now that we have computed $ 2 - f \, g_{i-1} \mod{ \ x^{2^{i}}}$ . Viewing $ 2 - f \, g_{i-1}$ and $ g_{i-1}$ as polynomials with degrees less than $ 2^{i}$ and $ 2^{i-1}$ respectively, there exist polynomials $ U, V, W \in R[x]$ with degree less than $ 2^{i-1}$ such that

$\displaystyle g_{i-1} (2 - f \, g_{i-1}) = U + V x^{2^{i-1}} + W x^{2^{i}}$ (26)

We know that $ g_{i-1} \ \equiv \ U \ \mod{ \ x^{2^{i-1}}}$ . Hence, we are only interested in computing $ V$ . Similarly to the above, we observe that

$\displaystyle g_{i-1} (2 - f \, g_{i-1}) \ \equiv \ (U + W) + V x^{2^{i-1}} \mod{ \ x^{2^{i}} - 1}$ (27)

Therefore, using DFT, we can compute $ V$ at the cost of one multiplication in degree less than $ 2^{i-1}$ .

It follows that, in the complexity analysis above (in the proof of Theorem 2) we can replace $ \ensuremath{\mathsf{M}}(2^i) + \ensuremath{\mathsf{M}}(2^{i-1})$ by $ \ensuremath{\mathsf{M}}(2^{i-1}) + \ensuremath{\mathsf{M}}(2^{i-1})$ leading to $ 2\,\ensuremath{\mathsf{M}}(\ell) + O({\ell})$ instead of $ 3\,\ensuremath{\mathsf{M}}(\ell) + O({\ell})$ .
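
The wrapped-product observation in Equation (25) can be illustrated numerically. The Python sketch below (ours; it uses numpy's complex FFT over small integer coefficients, with rounding, rather than a DFT over $ R$ ) computes the length-$ 2^i$ cyclic convolution of $ f$ and $ g_{i-1}$ and checks that its upper half is exactly the polynomial $ T$ of Equation (24).

\begin{verbatim}
import numpy as np

def cyclic_convolution(a, b, n):
    """Coefficients of a*b modulo x^n - 1, via length-n FFTs."""
    fa = np.fft.fft(np.asarray(a + [0] * (n - len(a)), dtype=float))
    fb = np.fft.fft(np.asarray(b + [0] * (n - len(b)), dtype=float))
    return [int(round(c)) for c in np.fft.ifft(fa * fb).real]

i = 3
n = 2 ** i                       # 8
f = [1, 2, -1, 3, 4, -2, 1, 5]   # deg f < 2^i and f(0) = 1
g = [1, -2, 5, -15]              # one checks that f*g = 1 modulo x^(2^(i-1)) = x^4

wrapped = cyclic_convolution(f, g, n)
T = wrapped[n // 2:]             # upper half of the wrapped product
plain = np.convolve(f, g).tolist()
print(T == plain[n // 2: n])     # True: the wrapped upper half is exactly T
\end{verbatim}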

Marc Moreno Maza
2008-01-07