The second of two feature columns on this subject.
The Fast Fourier Transform is fast, but not fast enough for realistic factorization of large numbers. Factoring an nbit number would require 3n·2^{n} operations; that number increases exponentially with n. This month's column will examine how for a quantum computer this growth could be made polynomial, and the factorization problem could become tractable.
Data in a quantum computer are stored in qubits, and manipulated by gates.
Example: the NOTgate. This is the simplest example of a gate that does something.
NOTgate: 

Example: the Rgate. In terms of the basis vectors:
Rgate: 

Suppose the gate is acting on the two qubits qubit with state space basis 0>, 1> and qubit with state space basis 0>, 1>. The tensor product a b of a = a_{0}0> + a_{1}1> with b = b_{0}0> + b_{1}1> is a 4component object best represented by the matrix:
( 
a_{0}b_{0} a_{0}b_{1} a_{1}b_{0} a_{1}b_{1} 
). 
In terms of the bases 0>, 1> and 0>, 1>, the tensor products

 


Example: ControlledNOT Such a device inputs 2 qubits, say qubit and qubit; it leaves qubit unchanged and reverses the state of qubit whenever qubit is in state 1>.
CNOT gate: 

This operation can also be described by saying that the sum modulo 2 of qubit and qubit is stored in qubit, and that qubit is carried along to make the operation reversible.
A typical element of the tensor product of the qubit state space (basis 0>, 1>) and the qubit state space (basis 0>, 1>) will have the form:
( 
c_{0,0} c_{0,1} c_{1,0} c_{1,1}  ), 
Such an element will in general not be the tensor product ab of a = a_{0}0> + a_{1}1> and b = b_{0}0> + b_{1}1>. In fact, we can write a matrix as a tensor product

= 

only when the determinant c_{0,0} c_{1,1} – c_{1,0} c_{0,1} is 0.
A state in the 2qubit state space which is not of the form ab is called an entangled state. Here is why:
and we modify the ControlledNOT gate so that if qubit starts in state 0>, then its output state is the modulo 2 sum of the input states of qubit and qubit; the other qubits remain unchanged. The outputs when qubit starts in state 1> will not concern us, except inasmuch as they make the whole operation unitary. Let us call this 3input, 3output gate a Dgate. In terms of the standard basis for the tensor product of the three state spaces, it is described by the table
D gate: 

(1/2) 0>
0>
0>
+ (1/2) 0> 1> 0> + (1/2) 1> 0> 0> + (1/2) 1> 1> 0>. 
(1/2) 0>
0>
0>
+ (1/2) 0> 1> 1> + (1/2) 1> 0> 1> + (1/2) 1> 1> 0>. 
S_{j,k} gate: 

For 3 bits, the QFT algorithm specifies the sequence
R_{2} S_{1,2} R_{1} S_{0,2} S_{0,1} R_{0}
applied in lefttoright order. Here we abbreviate
0>0>0> as 000, etc.
input
^{ }000 
after R_{2}
2^{–1/2} (000 + 001) 
after S_{1,2} (ω_{2–1}=i)
2^{–1/2} (000 + 001) 
after R_{1}
2^{–1} (000 + 010 + 001 + 011) 
after S_{0,2} (ω_{2–0}=e^{ iπ/4} )
2^{–1} (000 + 010 + 001 + 011) 
after S_{0,1} (ω_{1–0}=i)
2^{–1} (000 + 010 + 001 + 011) 
after R_{0}
2^{–3/2} (000 + 100 + 010 + 110 + 001 + 101 + 011 + 111) 
More specifically, suppose in our 3qubit example that we carry along the input register as the first 3 qubits in the computation (this will turn out not be necessary). Then at the end if the first qubits are measured and report binary 5, i.e. 1>0>1>, the last three qubits must contain the superposition:
2^{–3/2} (0>0>0> –
1>0>0> + i0>1>0> – i1>1>0> – e^{ iπ/4}
0>0>1>
+ e^{ iπ/4} 1>0>1>
– ie^{ iπ/4}
0>1>1>
+ ie^{ iπ/4}
1>1>1>).
In the "row" corresponding to binary 5, the coefficient of the state "binary c" is exactly (2^{–3/2} times) e^{ i 5 cπ/4}; so that 0>0>0> has coefficient e^{ i5·0π/4} = 1, 0>0>1> has coefficient e^{ i5·1π/4} = –e^{ iπ/4}, 0>1>0> has coefficient e^{ i5·2π/4} = i, etc. By a combination of entanglement (for the rows) and superposition (for the columns) this 6qubit register now contains all 64 elements of the matrix e^{ i a cπ/4}, where a and c run from 0 to 7.
As set out in last month's column, we are working to discover the two prime factors of a number N by choosing an arbitrary number x smaller than N and detecting by Fourier analysis the periodicity of the sequence of remainders x^{a} mod N, a= 0, 1, 2, etc.
We follow Shor, and work with a register L of length q, where N^{2} < 2^{q} < 2N^{2}, and a second register R large enough to hold N. Initially every qubit in L is in state 0>.
Step 1. We load into L an equiprobable superposition of all the numbers between 0 and 2^{q}–1. This means applying the quantum gate R_{j} to qubit_{j}, for j = 0, ... , q–1. These gates can be applied simultaneously.
Step 2. For each number a in L, we calculate x^{a} mod N and write it in register R. This can be done simultaneously for all a in L. Now in registers L,R we have the superposition
Notation: Here and in what follows k> in register L stands for b_{q–1}> ... b_{1}> b_{0}>, where b_{q–1} ... b_{1}b_{0} is the binary representation of k, with a similar convention for register R.
Step 3. We apply the Quantum Fourier Transform to register L. This replaces each a> by . In Fig. 1, each column in the matrix corresponds to a standard basis state for L, whereas the rightmost column represents the contents of register R; the state of LR is the superposition
Fig. 1. Step 3 of the Shor Factorization Algorithm illustrated with N = 85, x = 19 and q = 4. Now the LR register holds a superposition of 16^{2} states, each weighted by 1/16. Reading from the top left, the first one is e^{2 π 0·0/16}0> 1>, i.e.
the state corresponding to the last entry in the fourth row is e^{2 π 3·15/16}15> 59>, i.e.
(e^{2 π 3·15/16} 1> 1> 1> 1>) (0> 1> 1> 1> 0> 1> 1>).
Before going to Step 4, let us note that the registers are set up to carry out a Discrete Fourier Transform. But that algorithm would require 2^{q} multiplications for each of 2^{q} rows: many too many operations. Shor's intuition was that superposition and entanglement could be harnessed to do all the work in a couple of steps, by first reading register R and then reading register L.
Step 4. We examine register R. One of the values will appear (they all have equal probability), and the others will be lost. Suppose that value is x^{α} mod N. During the readout the contents of register L collapse to those states which were coupled with x^{α} mod N (compare Example 2 above).
Fig. 2. Step 4 of Shor's Factorization Algorithm illustrated with N = 85, x = 19 and q = 4. We suppose that register R reads 59 (so α, as above, was 3 or 11). Register L now only contains the superposition of states which were entangled with 59>; these are set off by the green boxes.
Fig. 3. Step 4 of Shor's Factorization Algorithm illustrated with N = 85, x = 33 and q = 4; conventions as above. We suppose that register R reads 67.
Step 5. We now read out register L. In the two examples we have considered, the readout will be an exact multiple of p = 2^{q}/r. Repeating the experiment with the same x a small number of times leads with very high probability to a set of readouts whose only common divisor is p; so r can be determined and the problem is solved.
These examples are untypically simple in that r (8 and 4 in these two cases) was a power of 2, and therefore divided 2^{q} exactly.
In this simple situation, we can see how the readout takes us from a superposition of states (in register R) with period r to a superposition of states (in register L) which have coefficient zero unless they correspond to a multiple of 2^{q}/r. (Compare the examples; this is the way the QFT detects frequency).
In fact, there will be exactly p = 2^{q}/r values a in the range [0,2^{q}–1] such that x^{a} = x^{α} mod N, and each of them will contribute
to register L. Since each of these a's is of the form α + kr, k = 0, ..., p–1, the factor of c> is, up to a constant,
The general case requires a subtler analysis. Now 2^{q}/r is not an integer. The readout in general cannot be an exact multiple of the frequency. But using a long enough sample (here is where the condition 2^{q} > N^{2} comes into play to analyze 85 "realistically" we would have had to use q = 13 and an 8192 x 8192 table of coefficients) guarantees, with high probability, a useful approximation. Complete details are in Shor's article referenced below.
The calculated probabilities of a readout of c, with r = 10 and 2^{q} = 256. As Shor remarks, the value r = 10 could occur when factoring 33 if x were chosen to be 5, for example; 256 is chosen instead of the required 2^{11} to get a legible picture. With high probability the observed value of c is near an integral multiple of 2^{q}/r = 256/10. Image courtesy Peter Shor.
Peter W. Shor, Algorithms for Quantum Computation: In: Proceedings, 35th Annual Symposium on Foundations of Computer Science, Santa Fe, NM, November 2022, 1994, IEEE Computer Society Press, pp. 124134.
An
expanded version is available under the title
PolynomialTime Algorithms for Prime Factorization
and Discrete Logarithms on a Quantum Computer, as
arXiv:quantph/9508027.
History of the Quantum Fourier Transform:
R. Cleve, A note on computing Fourier transforms by quantum programs, unpublished, available as postscript file
D. Coppersmith, An Approximate Fourier Transform Useful in Quantum Factoring, IBM Research Report 07/12/94, available as arXiv:quantph/0201067.
A. Ekert and R. Jozsa, Quantum computation and Shor's factoring algorithm, Reviews of Modern Physics 68 (1996) 733753