9 Shor’s Algorithm
One of the best known quantum algorithm is Shor’s algorithm for finding the prime factors of an integer. It was developed by Peter Shor in 1994.
9.1 Discrete Fourier Transformation
One of the tools required for Shor’s algorithm is the Discrete Fourier Transformation (DFT). Generally, a Fourier transformation is a mathematical technique that decomposes a function into its constituent frequencies. We use the DFT to find the period of a vector.
The DFT is defined as follows:
This transformation is best imagined as a process, which takes a periodic vector as an input and outputs the period of that vector. The DFT has some important properties, which help us later on.
9.2 Reducing factoring to period finding
With the DFT, we have seen, that we can use a unitary to find the period of a quantum state. We now look into using period finding to factor integers. We first look at the definition of the two problems:
Note that this definition of the factoring problem is a simplified version of the factoring problem, where \(N\) has only 2 prime factors.
To start the reduction, we need a special case of the period finding problem called order finding:
Since the order finding problem is just the period finding problem for a specific \(f(x)\), we know that if we can solve the period finding problem within reasonable runtime, we can also solve the order finding problem within reasonable runtime. We now reduce the factoring problem to the order finding problem:
We have a integer \(N\) as an input for the factoring problem.
- Pick an \(a \in \{1,\dots,N-1\}\) with \(a\) relatively prime to \(N\).
- Compute the order of \(a\), so that \(r = \text{ ord } a\) (using the solver for the order finding problem).
- If the order \(r\) is odd, we abort.
- Calculate \(x:= a^{\frac{r}{2}}+1 \bmod N\) and \(y:= a^{\frac{r}{2}}-1 \bmod N\).
- If \(\gcd(x,N) \in \{1,N\}\), we abort.
- We compute \(p = \gcd(x,N)\) and \(q = \gcd(y,N)\).
The output of the reduction are \(p,q\), such that \(pq = N\). This holds, since \[ xy = (a^{\frac{r}{2}}+1) (a^{\frac{r}{2}}-1) = a^r - 1 \equiv 1-1 = 0 \pmod N \]
All in all this reduction shows, that if we have an oracle which can solve the period finding problem within reasonable runtime, we can also solve the factoring problem within reasonable runtime (since all other operations are classically fast to compute).
9.3 The quantum algorithm for period finding
We now look into an quantum algorithm that solves the period finding problem within reasonable runtime. The quantum circuit for Shor’s algorithm requires a \(f:\mathbb{Z}\rightarrow X\) which is \(r\)-periodic. \(X\) denotes an arbitrary set here. We choose a number \(m\) which needs to be big enough to encode the values of \(X\) and choose a number \(n\) under the condition of \(n\geq 2 \log_2(r)\) for the post processing to work. Note that when using this algorithm for factoring, we choose \(n\) to be \(n:=2\lvert N \rvert\), since \(r \leq N\). \(\lvert N\rvert\) denotes the number of bits needed to encode \(N\) here.
The quantum algorithm for period finding is shown in this figure:
The algorithm works as follows:
- We start with a \(\ket{0}\) entry on every wire.
- We bring the top wire into the superposition over all entries. The quantum state is then \(2^\frac{-n}{2}\sum_x \ket{x} \otimes \ket{0^m}\).
- We apply \(U_f\), which is the unitary of \(f:\{0,1\}^n\rightarrow\{0,1\}^m\). This calculates the superposition over all possible values \(f(x)\) on the bottom wire. The resulting quantum state is \(2^\frac{-n}{2}\sum_x \ket{x,f(x)}\).
- To understand the algorithm better, we measure the bottom wire at this point. This will give us one random value \(f(x_0)\) for some \(x_0\). The top wire will then contain a superposition over all values \(x\) where \(f(x) = f(x_0)\). Since \(f\) is known to be \(r\)-periodic, we know, that \(f(x) = f(x_0)\) iff \(x \equiv x_0 \bmod r\). This means, that the resulting quantum state on the top wire is periodic and can be written as \(\frac{\sqrt{r}}{\sqrt{2^n}} \sum_{x\equiv x_0 \bmod r} \ket{x} \otimes \ket{f(x_0)}\). For simplicity we assume, that \(r \mid 2^n\) holds.
- We apply the Discrete Fourier Transform on the top wire. This will “analyze” the top wire for the period and output a vector with entries at multiples of \(\frac{2^n}{r}\) as seen in Theorem 9.1. For simplicity we still assume, that \(r \mid 2^n\) holds.
- We measure the top wire and get one random multiple of \(\frac{2^n}{r}\), which we can denote as \(a\cdot\frac{2^n}{r}\)
Since we get a multiple of \(\frac{2^n}{r}\) on each run, we can simply run the algorithm multiple times to get different multiples and then compute \(\frac{2^n}{r}\) by taking the gcd of those multiples. From that we compute \(r\). Unfortunately this only works because we assumed \(r \mid 2^n\). Since this does usually not hold, we only get approximate multiples of \(\frac{2^n}{r}\) (which is not even an integer) and thus post processing is a bit more complex.
9.4 Post processing
So far we have seen the DFT to analyze the period of a quantum state, we have seen a way to reduce the factoring problem to the period finding and we have seen a quantum algorithm for finding an approximate multiple of such a period of a function. We just need one final step to find \(r\). For this we start with a theorem:
We assume that the theorem holds for our outcome \(c\) of the second measurement (if that is not the case, our result will be wrong and we can just run the quantum algorithm again to get a different outcome):
Then exists an integer \(d\) such that: \[ \begin{aligned} &\lvert rc - d2^n\rvert \leq \frac{r}{2} \\ \iff&\lvert \frac{c}{2^n} - \frac{d}{r}\rvert \leq \frac{1}{2^{n+1}} && |\text{ division by } r\cdot 2^n \end{aligned} \] The fraction \(\frac{c}{2^n}\) is known, so the goal is to find a fraction \(\frac{d}{r}\) that is \(\frac{1}{2^{n+1}}\)-close to \(\frac{c}{2^n}\).
Since \(n\) is the number of qubits used in the quantum circuit and was chosen, such that \(n \geq 2\log_2(r)\) and thus \(2^{n} \geq r^2\) holds and from this we know that \(\frac{1}{2^{n+1}} \leq \frac{1}{2r^2}\) holds as well.
So if Theorem 9.3 holds, we now \(\lvert \frac{c}{2^n} - \frac{d}{r} \rvert \leq \frac{1}{2r^2}\) also holds. Our task is now rewritten to find \(\frac{d}{r}\) under this condition. For this we use another theorem:
This theorem uses the convergent of a continued fraction expansion. A continued fraction expansion of a number \(t\) is the number rewritten as a fraction in the form
\[ t = a_0 + \frac{1}{a_1 + \frac{1}{a_2 + \frac{1}{a_3 + \dots}}} \]
where \(a_i\) always has to be the biggest possible integer. We call \([a_0,a_1,a_2,a_3,\dots]\) the continued expansion of \(t\). The expansion is finite iff t is rational. For a given continued expansion, a prefix \([a_0,\dots,a_i]\) is called a convergent. Writing this convergent as a normal fraction will give us an approximation of the number \(t\).
Using Theorem 9.4 (with \(\varphi:= \frac{c}{2^n}\) and \(q:=2^n\)) we can find \(\frac{d}{r}\) and from this \(r\) which is the period of our function using the following steps:
For each convergent \(\gamma\) of \(\varphi\) do the following:
- Compute \(\gamma\) as fraction \(\frac{d}{r}\).
- Stop if \(r \leq 2^n\) and this \(\frac{d}{r}\) is \(\frac{1}{2^{n+1}}\)-close to \(\frac{c}{2^n}\) and return \(r\).
Note: It can happen, that the resulting fraction does not have the right \(r\) in the denominator, because \(\frac{d}{r}\) was simplified (if numerator and denominator shared a common factor). But the probability of this happening is sufficiently small and already included in the probability in Theorem 9.3.
This completes the postprocessing of Shor’s algorithm.
9.5 Constructing the DFT
So far we have described everything necessary for Shor’s algorithm, but only described the matrix representation of the \(\operatorname{DFT}_M\). We will now take a closer look into implementing the \(\operatorname{DFT}_M\) as a quantum circuit. Since we only use the \(\operatorname{DFT}_M\) for Shor’s algorithm so far, we will only look at \(M=2^n\), which is the \(\operatorname{DFT}\) applied on \(n\) qubits.
To start the circuit, we recall the definition of the \(\operatorname{DFT}_{2^n}\) from Definition 9.1: \(\operatorname{DFT}_{2^n} := \frac{1}{\sqrt{{2^n}}} (\omega^{kl})_{kl}\) with \(\omega:= e^{2\pi i / 2^n}\). To apply the \(\operatorname{DFT}_{2^n}\) to a quantum state \(\ket{j}\) we calculate \[ \operatorname{DFT}_{2^n}\ket{j} = \frac{1}{\sqrt{{2^n}}} \sum_k e^{2\pi i j k 2^{-n}} \ket{k} \] We can rewrite this as follows: \[ \begin{aligned} \operatorname{DFT}_{2^n}\ket{j} =& \frac{1}{\sqrt{{2^n}}} \sum_k e^{2\pi i j k 2^{-n}} \ket{k}\\ =& \frac{1}{\sqrt{{2^n}}} \sum_{k_1} \dots \sum_{k_n} e^{2\pi i j (\sum_l k_l 2^{-l})} \ket{k_1 \dots k_n}\\ =& \frac{1}{\sqrt{{2^n}}} \sum_{k_1} \dots \sum_{k_n} \bigotimes^n_{l=1} e^{2\pi i j (k_l 2^{-l})} \ket{k_l}\\ =& \frac{1}{\sqrt{{2^n}}} \bigotimes_{l=1}^n \sum_{k_l} e^{2\pi i j (k_l 2^{-l})} \ket{k_l}\\ =& \frac{1}{\sqrt{{2^n}}} \bigotimes_{l=1}^n (\ket{0} + e^{2\pi i j 2^{-l}} \ket{1})\\ =& \frac{1}{\sqrt{{2^n}}} \bigotimes_{l=1}^n (\ket{0} + e^{2\pi i 0.j_{n-(l-1)} \dots j_{n}} \ket{1})\\ =& \bigotimes_{l=1}^n \frac{1}{\sqrt{2}}(\ket{0} + e^{2\pi i 0.j_{n-(l-1)} \dots j_{n}} \ket{1}) \end{aligned} \] The expression \(0.j\) expresses a binary fraction (e.g. \(0.101 = \frac{1}{2} + \frac{1}{8} = \frac{5}{8}\)).
With this we have shown, that we can write \(\operatorname{DFT}_{2^n}\ket{j}\) as the following tensor product of quantum states
\[ \frac{1}{\sqrt{2}}(\ket{0} + e^{2\pi i 0.j_n} \ket{1}) \otimes \frac{1}{\sqrt{2}}(\ket{0} + e^{2\pi i 0.j_{n-1}j_n} \ket{1}) \otimes \dots \otimes \frac{1}{\sqrt{2}}(\ket{0} + e^{2\pi i 0.j_1\dots j_n} \ket{1}) \]
From this rewritten tensor product, we can get an idea on how to construct the quantum circuit for the \(\operatorname{DFT}_{2^n}\). Namely, we can construct a quantum circuit for each element of the tensor product and from this build the general circuit.
For this, we segment the tensor product into different elements \(\psi\) as follows:
\[ \underbrace{\frac{1}{\sqrt{2}}(\ket{0} + e^{2\pi i 0.j_n} \ket{1})}_{\psi_1} \otimes \underbrace{\frac{1}{\sqrt{2}}(\ket{0} + e^{2\pi i 0.j_{n-1}j_n} \ket{1})}_{\psi_2} \otimes \dots \otimes \underbrace{\frac{1}{\sqrt{2}}(\ket{0} + e^{2\pi i 0.j_1\dots j_n} \ket{1})}_{\psi_n} \]
We also introduce a new gate \(R_k\) which is defined by the following matrix:
\[ R_k:=\begin{pmatrix} 1 & 0 \\ 0 & e^{2 \pi i / 2^k} \end{pmatrix} \]
To understand the construction of the circuit from these elements, we will look at an example for \(n=3\) first:
The more general approach to construct the \(\operatorname{DFT}_{2^n}\) as a quantum circuit with \(n\) qubits works as follows:
- Initialize wires with input \(\ket{j}\), so that \(\ket{j_1}\) is on the top wire and \(\ket{j_n}\) is on the bottom wire. Note, that this is not part of the circuit yet.
- Start with the top wire. For each wire \(j_i\) do the following:
- Apply a Hadamad-gate on the wire \(j_i\).
- For each wire \(j_k\) below the current wire \(j_i\) (with \(i < k \leq n\)), add a \(R_{k-i+1}\)-gate controlled by \(j_k\). Start with \(k = i+1\) (if \(i<n\), else stop).
- Perform a \(\operatorname{SWAP}\) to flip all the wires. This means, that the first wire is swapped with the last wire, the second wire is swapped with the second to last wire and so on.
Note: If the the output of the DFT circuit is measured right after applying it (as in Shor’s algorithm) or if the rest of the algorithm allows for it, it is more efficient to perform the \(\operatorname{SWAP}\) classically, since this is considered to be the cheaper operation.
The more general layout of the quantum circuit for the \(\operatorname{DFT_{2^n}}\) with the \(\operatorname{SWAP}\) is shown in this figure.