9  Shor’s Algorithm

One of the best known quantum algorithm is Shor’s algorithm for finding the prime factors of an integer. It was developed by Peter Shor in 1994.

9.1 Discrete Fourier Transformation

One of the tools required for Shor’s algorithm is the Discrete Fourier Transformation (DFT). Generally, a Fourier transformation is a mathematical technique that decomposes a function into its constituent frequencies. We use the DFT to find the period of a vector.

The DFT is defined as follows:

Definition 9.1 (Discrete Fourier Transformation (DFT)) The discrete Fourier transform (DFT) is a linear transformation on \(\mathbb{C}^M\) represented by the matrix \[ \operatorname{DFT}_M = \frac{1}{\sqrt{M}} (\omega^{kl})_{kl} \in \mathbb{C}^{M\times M} \] with \(\omega = e^{2i\pi/M}\), which is the \(M\)-th root of unity.

This transformation is best imagined as a process, which takes a periodic vector as an input and outputs the period of that vector. The DFT has some important properties, which help us later on.

Theorem 9.1 (Properties of the DFT) Here are some properties of the DFT which can be used without further proof.

  1. The DFT is unitary.
  2. \(\omega^t = \omega^{t\mod M}\) for all \(t \in \mathbb{Z}\).
  3. Given a quantum state \(\psi \in \mathbb{C}^M\) which is \(r\)-periodic and where \(r\mid M\), \(\operatorname{DFT}_M \psi\) will compute a quantum state \(\phi \in \mathbb{C}^M\), which has non-zero values on the multiples of \(\frac{M}{r}\). Note that \(\frac{M}{r}\) intuitively represents the frequency of \(\psi\). This means, that \[ |\phi_i| = \begin{cases} \frac{1}{\sqrt{r}}, & \text{if}\ \frac{M}{r}\mid i \\ 0, & \text{otherwise} \end{cases} \]

9.2 Reducing factoring to period finding

With the DFT, we have seen, that we can use a unitary to find the period of a quantum state. We now look into using period finding to factor integers. We first look at the definition of the two problems:

Definition 9.2 (Factoring problem) Given integer \(N\) with two prime factors \(p,q\) such that \(pq=N\) and \(p \neq q\), find \(p\) and \(q\).

Note that this definition of the factoring problem is a simplified version of the factoring problem, where \(N\) has only 2 prime factors.

Definition 9.3 (Period finding problem) Given \(f: \mathbb{Z} \to X\) with \(f(x) = f(y)\) iff \(x \equiv y \bmod r\) for some fixed secret \(r\). \(r\) is called the period of \(f\). Find \(r\).

To start the reduction, we need a special case of the period finding problem called order finding:

Definition 9.4 (Order finding problem) For known \(a\) and \(N\) which are relatively prime, find the period \(r\) of \(f(i) = a^i \bmod N\). We call \(r\) the order of \(a\) written \(r = \text{ ord } a\). (This is similar to finding the smallest \(i > 0\) with \(f(i) = 1\)).

Since the order finding problem is just the period finding problem for a specific \(f(x)\), we know that if we can solve the period finding problem within reasonable runtime, we can also solve the order finding problem within reasonable runtime. We now reduce the factoring problem to the order finding problem:

We have a integer \(N\) as an input for the factoring problem.

  1. Pick an \(a \in \{1,\dots,N-1\}\) with \(a\) relatively prime to \(N\).
  2. Compute the order of \(a\), so that \(r = \text{ ord } a\) (using the solver for the order finding problem).
  3. If the order \(r\) is odd, we abort.
  4. Calculate \(x:= a^{\frac{r}{2}}+1 \bmod N\) and \(y:= a^{\frac{r}{2}}-1 \bmod N\).
  5. If \(\gcd(x,N) \in \{1,N\}\), we abort.
  6. We compute \(p = \gcd(x,N)\) and \(q = \gcd(y,N)\).

The output of the reduction are \(p,q\), such that \(pq = N\). This holds, since \[ xy = (a^{\frac{r}{2}}+1) (a^{\frac{r}{2}}-1) = a^r - 1 \equiv 1-1 = 0 \pmod N \]

Theorem 9.2 (Probability of an abort) If \(N\) has at least two different prime factors and \(N\) is odd, then the probability to abort is \(\leq \frac{1}{2}\).

All in all this reduction shows, that if we have an oracle which can solve the period finding problem within reasonable runtime, we can also solve the factoring problem within reasonable runtime (since all other operations are classically fast to compute).

9.3 The quantum algorithm for period finding

We now look into an quantum algorithm that solves the period finding problem within reasonable runtime. The quantum circuit for Shor’s algorithm requires a \(f:\mathbb{Z}\rightarrow X\) which is \(r\)-periodic. We choose a number \(m\) which needs to be big enough to encode the values of \(X\) and choose a number \(n\) under the condition of \(n\geq 2 \log_2(r)\) for the post processing to work. Note that when using this algorithm for factoring, we choose \(n\) to be \(n:=2\lvert N \rvert\), since \(r \leq N\). \(\lvert N\rvert\) denotes the number of bits needed to encode \(N\) here.

The quantum algorithm for period finding is shown in this figure:

Shor’s algorithm (quantum part)

The algorithm works as follows:

  1. We start with a \(\ket{0}\) entry on every wire.
  2. We bring the top wire into the superposition over all entries. The quantum state is then \(2^\frac{-n}{2}\sum_x \ket{x} \otimes \ket{0^m}\).
  3. We apply \(U_f\), which is the unitary of \(f:\{0,1\}^n\rightarrow\{0,1\}^m\). This calculates the superposition over all possible values \(f(x)\) on the bottom wire. The resulting quantum state is \(2^\frac{-n}{2}\sum_x \ket{x,f(x)}\).
  4. To understand the algorithm better, we measure the bottom wire at this point. This will give us one random value \(f(x_0)\) for some \(x_0\). The top wire will then contain a superposition over all values \(x\) where \(f(x) = f(x_0)\). Since \(f\) is known to be \(r\)-periodic, we know, that \(f(x) = f(x_0)\) iff \(x \equiv x_0 \bmod r\). This means, that the resulting quantum state on the top wire is periodic and can be written as \(\frac{1}{\sqrt{2^\frac{n}{r}}} \sum_{x\equiv x_0 \bmod r} \ket{x} \otimes \ket{f(x_0)}\).
  5. We apply the Discrete Fourier Transform on the top wire. This will “analyze” the top wire for the period and output a vector with entries at multiples of \(\frac{2^n}{r}\) as seen in Theorem 9.1. For simplicity we assume, that \(r \mid 2^n\) holds.
  6. We measure the top wire and get one random multiple of \(\frac{2^n}{r}\), which we can denote as \(a\cdot\frac{2^n}{r}\)

Since we get a multiple of \(\frac{2^n}{r}\) on each run, we can simply run the algorithm multiple times to get different multiples and then compute \(\frac{2^n}{r}\) by taking the gcd of those multiples. From that we compute \(r\). Unfortunately this only works because we assumed \(r \mid 2^n\). Since this does usually not hold, we only get approximate multiples of \(\frac{2^n}{r}\) (which is not even an integer) and thus post processing is a bit more complex.

9.4 Post processing

So far we have seen the DFT to analyze the period of a quantum state, we have seen a way to reduce the factoring problem to the period finding and we have seen a quantum algorithm for finding an approximate multiple of such a period of a function. We just need one final step to find \(r\). For this we start with a theorem:

Theorem 9.3 Iff \(f: \mathbb{Z} \rightarrow X\) is \(r\)-periodic, the following holds with probability \(\Omega(1/\log\log r)\): \[ \frac{-r}{2} \leq rc\bmod 2^n \leq \frac{r}{2} \] where \(c\) is the output of the second measurement of the quantum circuit described in Section 9.3 and \(n\) is the number of qubits on the upper wire of the quantum circuit.

We assume that the theorem holds for our outcome \(c\) of the second measurement (if that is not the case, our result will be wrong and we can just run the quantum algorithm again to get a different outcome):

Then exists an integer \(d\) such that: \[ \begin{aligned} &\lvert rc - d2^n\rvert \leq \frac{r}{2} \\ \iff&\lvert \frac{c}{2^n} - \frac{d}{r}\rvert \leq \frac{1}{2^{n+1}} && |\text{ division by } r\cdot 2^n \end{aligned} \] The fraction \(\frac{c}{2^n}\) is known, so the goal is to find a fraction \(\frac{d}{r}\) that is \(\frac{1}{2^{n+1}}\)-close to \(\frac{c}{2^n}\).

Since \(n\) is the number of qubits used in the quantum circuit and was chosen, such that \(n \geq 2\log_2(r)\) and thus \(2^{n} \geq 2 r^2\) holds and from this we know that \(\frac{1}{2^{n+1}} \leq \frac{1}{2r^2}\) holds as well.

So if Theorem 9.3 holds, we now \(\lvert \frac{c}{2^n} - \frac{d}{r} \rvert \leq \frac{1}{2r^2}\) also holds. Our task is now rewritten to find \(\frac{d}{r}\) under this condition. For this we use another theorem:

Theorem 9.4 For a given real number \(\varphi \geq 0\) and integer \(q > 0\) there is at most one fraction \(\frac{d}{r}\) with \(r \leq q\) and \(\lvert \varphi - \frac{d}{r} \rvert \leq \frac{1}{2q}\). In this case, this \(\frac{d}{r}\) is a convergent of the continued fraction expansion of \(\varphi\).

This theorem uses the convergent of a continued fraction expansion. A continued fraction expansion of a number \(t\) is the number rewritten as a fraction in the form

\[ t = a_0 + \frac{1}{a_1 + \frac{1}{a_2 + \frac{1}{a_3 + \dots}}} \]

where \(a_i\) always has to be the biggest possible integer. We call \([a_0,a_1,a_2,a_3,\dots]\) the continued expansion of \(t\). The expansion is finite iff t is rational. For a given continued expansion, a prefix \([a_0,\dots,a_i]\) is called a convergent. Writing this convergent as a normal fraction will give us an approximation of the number \(t\).

Example: continued expansion of a fraction

The number \(2.3\) can be written as \[ 2.3 = 2 + \frac{1}{3 + \frac{1}{3 + 0}} \] and the continued fraction expansion of \(2.3\) is \([2,3,3]\). The expansions \([2]\) and \([2,3]\) are convergents of the expansion of \(2.3\) and written as a fraction will give us the approximations \(2\) and \(2+\frac{1}{3} = 2.\bar{3}\).

The number \(0.99\) can be written as

\[ 0.99 = 0 + \frac{1}{1 + \frac{1}{99 + 0}} \] and the continued fraction expansion of \(0.99\) is \([0,1,99]\). The expansions \([0]\) and \([0,1]\) are convergents of the expansion of \(0.99\) and written as a fraction will give us the approximations \(0\) and \(0+\frac{1}{1} = 1\).

Using Theorem 9.4 (with \(\varphi:= \frac{c}{2^n}\) and \(q:=2^n\)) we can find \(\frac{d}{r}\) and from this \(r\) which is the period of our function using the following steps:

For each convergent \(\gamma\) of \(\varphi\) do the following:

  1. Compute \(\gamma\) as fraction \(\frac{d}{r}\).
  2. Stop if \(r \leq 2^n\) and this \(\frac{d}{r}\) is \(\frac{1}{2^{n+1}}\)-close to \(\frac{c}{2^n}\) and return \(r\).

Note: It can happen, that the resulting fraction does not have the right \(r\) in the denominator, because \(\frac{d}{r}\) was simplified (if numerator and denominator shared a common factor). But the probability of this happening is sufficiently small and already included in the probability in Theorem 9.3.

This completes the postprocessing of Shor’s algorithm.

9.5 Constructing the DFT

So far we have described everything necessary for Shor’s algorithm, but only described the matrix representation of the \(\operatorname{DFT}_M\). We will now take a closer look into implementing the \(\operatorname{DFT}_M\) as a quantum circuit. Since we only use the \(\operatorname{DFT}_M\) for Shor’s algorithm so far, we will only look at \(M=2^n\), which is the \(\operatorname{DFT}\) applied on \(n\) qubits.

To start the circuit, we recall the definition of the \(\operatorname{DFT}_{2^n}\) from Definition 9.1: \(\operatorname{DFT}_{2^n} := \frac{1}{\sqrt{{2^n}}} (\omega^{kl})_{kl}\) with \(\omega:= e^{2\pi i / 2^n}\). To apply the \(\operatorname{DFT}_{2^n}\) to a quantum state \(\ket{j}\) we calculate \[ \operatorname{DFT}_{2^n}\ket{j} = \frac{1}{\sqrt{{2^n}}} \sum_k e^{2\pi i j k 2^{-n}} \ket{k} \] We can rewrite this as follows: \[ \begin{aligned} \operatorname{DFT}_{2^n}\ket{j} =& \frac{1}{\sqrt{{2^n}}} \sum_k e^{2\pi i j k 2^{-n}} \ket{k}\\ =& \frac{1}{\sqrt{{2^n}}} \sum_{k_1} \dots \sum_{k_n} e^{2\pi i j (\sum_l k_l 2^{-l})} \ket{k_1 \dots k_n}\\ =& \frac{1}{\sqrt{{2^n}}} \sum_{k_1} \dots \sum_{k_n} \bigotimes^n_{l=1} e^{2\pi i j (k_l 2^{-l})} \ket{k_l}\\ =& \frac{1}{\sqrt{{2^n}}} \bigotimes_{l=1}^n \sum_{k_l} e^{2\pi i j (k_l 2^{-l})} \ket{k_l}\\ =& \frac{1}{\sqrt{{2^n}}} \bigotimes_{l=1}^n (\ket{0} + e^{2\pi i j 2^{-l}} \ket{1})\\ =& \frac{1}{\sqrt{{2^n}}} \bigotimes_{l=1}^n (\ket{0} + e^{2\pi i 0.j_{n-(l-1)} \dots j_{n}} \ket{1})\\ =& \bigotimes_{l=1}^n \frac{1}{\sqrt{2}}(\ket{0} + e^{2\pi i 0.j_{n-(l-1)} \dots j_{n}} \ket{1}) \end{aligned} \] The expression \(0.j\) expresses a binary fraction (e.g. \(0.101 = \frac{1}{2} + \frac{1}{8} = \frac{5}{8}\)).

With this we have shown, that we can write \(\operatorname{DFT}_{2^n}\ket{j}\) as the following tensor product of quantum states

\[ \frac{1}{\sqrt{2}}(\ket{0} + e^{2\pi i 0.j_n} \ket{1}) \otimes \frac{1}{\sqrt{2}}(\ket{0} + e^{2\pi i 0.j_{n-1}j_n} \ket{1}) \otimes \dots \otimes \frac{1}{\sqrt{2}}(\ket{0} + e^{2\pi i 0.j_1\dots j_n} \ket{1}) \]

From this rewritten tensor product, we can get an idea on how to construct the quantum circuit for the \(\operatorname{DFT}_{2^n}\). Namely, we can construct a quantum circuit for each element of the tensor product and from this build the general circuit.

For this, we segment the tensor product into different elements \(\psi\) as follows:

\[ \underbrace{\frac{1}{\sqrt{2}}(\ket{0} + e^{2\pi i 0.j_n} \ket{1})}_{\psi_1} \otimes \underbrace{\frac{1}{\sqrt{2}}(\ket{0} + e^{2\pi i 0.j_{n-1}j_n} \ket{1})}_{\psi_2} \otimes \dots \otimes \underbrace{\frac{1}{\sqrt{2}}(\ket{0} + e^{2\pi i 0.j_1\dots j_n} \ket{1})}_{\psi_n} \]

We also introduce a new gate \(R_k\) which is defined by the following matrix:

\[ R_k:=\begin{pmatrix} 1 & 0 \\ 0 & e^{2 \pi i / 2^k} \end{pmatrix} \]

To understand the construction of the circuit from these elements, we will look at an example for \(n=3\) first:

Example: Construction of the DFT circuit for \(n=3\)

We start by building the tensor product for \(n=3\). The input for the DFT circuit is \(\ket{j} = \ket{j_1j_2j_3}\). Using the formula from above, we can write the result of \(\operatorname{DFT}_{2^3} \ket{j}\) as the following tensor product:

\[ \underbrace{\frac{1}{\sqrt{2}}(\ket{0} + e^{2\pi i 0.j_3} \ket{1})}_{\psi_1} \otimes \underbrace{\frac{1}{\sqrt{2}}(\ket{0} + e^{2\pi i 0.j_2j_3} \ket{1})}_{\psi_2} \otimes \underbrace{\frac{1}{\sqrt{2}}(\ket{0} + e^{2\pi i 0.j_1j_2j_3} \ket{1})}_{\psi_3} \]

First we construct the \(\psi_3\) element. Contrary to the intuition, we use the top wire containing \(\ket{j_1}\) for this. We use a Hadamad-gate to bring \(\ket{j_1}\) into the superposition \(\frac{1}{\sqrt{2}}(\ket{0} + (-1)^{j_1} \ket{1}) = \frac{1}{\sqrt{2}}(\ket{0} + e^{2 \pi i 0.j_1} \ket{1})\). This looks close to \(\psi_3\) already, we now need to add the last two decimal places \(j_2j_3\) to the state. For this we use \(R_2\) and \(R_3\). We apply \(R_2\) controlled by the wire \(j_2\) and \(R_3\) controlled by the wire \(j_3\). This means, that we only apply the \(R\)-gate, if the corresponding wire contains a \(1\). You can see this written as a quantum circuit at the figure below. After applying \(R_2\) we have the state \(\frac{1}{\sqrt{2}}(\ket{0} + e^{2 \pi i 0.j_1j_2} \ket{1})\) and after applying \(R_3\) we have the state \(\frac{1}{\sqrt{2}}(\ket{0} + e^{2 \pi i 0.j_1j_2j_3} \ket{1})\) on the top wire. This is the same as \(\psi_3\), so we are done on the first wire (We are at slice 3 in the figure below at this point).

The next step is to construct the \(\psi_2\) state on the middle wire. We again use a Hadamad-gate to bring \(\ket{j_2}\) into the superposition \(\frac{1}{\sqrt{2}}(\ket{0} + e^{2 \pi i 0.j_2} \ket{1})\). We now need to include the last decimal point \(j_3\), for which we use \(R_2\) again, this time controlled by \(j_3\). The resulting superposition is now \(\frac{1}{\sqrt{2}}(\ket{0} + e^{2\pi i 0.j_2j_3} \ket{1})\), which is \(\psi_2\).

On the bottom wire, we can just do a Hadamad-gate to bring \(\ket{j_3}\) into the superposition \(\frac{1}{\sqrt{2}}(\ket{0} + e^{2 \pi i 0.j_3} \ket{1})\). We then have \(\psi_1\) on the bottom wire. The full circuit is described in this figure:

The DTF for three qubits

When applying this circuit, we get the state \(\psi_3 \otimes \psi_2 \otimes \psi_1\) as a result. This very close to our desired state \(\psi_1 \otimes \psi_2 \otimes \psi_3\), just the order of the wires is flipped. To solve this, we apply a \(\operatorname{SWAP}\) onto all wires, which flips the order of the wires an delivers the correct output for \(\operatorname{DFT}_{2^3}\).

The more general approach to construct the \(\operatorname{DFT}_{2^n}\) as a quantum circuit with \(n\) qubits works as follows:

  1. Initialize wires with input \(\ket{j}\), so that \(\ket{j_1}\) is on the top wire and \(\ket{j_n}\) is on the bottom wire. Note, that this is not part of the circuit yet.
  2. Start with the top wire. For each wire \(j_i\) do the following:
    1. Apply a Hadamad-gate on the wire \(j_i\).
    2. For each wire \(j_k\) below the current wire \(j_i\) (with \(i < k \leq n\)), add a \(R_{k-i+1}\)-gate controlled by \(j_k\). Start with \(k = i+1\) (if \(i<n\), else stop).
  3. Perform a \(\operatorname{SWAP}\) to flip all the wires. This means, that the first wire is swapped with the last wire, the second wire is swapped with the second to last wire and so on.

Note: If the the output of the DFT circuit is measured right after applying it (as in Shor’s algorithm) or if the rest of the algorithm allows for it, it is more efficient to perform the \(\operatorname{SWAP}\) classically, since this is considered to be the cheaper operation.

The more general layout of the quantum circuit for the \(\operatorname{DFT_{2^n}}\) with the \(\operatorname{SWAP}\) is shown in this figure.

The DTF for \(n\) qubits