The ABC of DNA Computing

## 3. NP-complete problems and Molecular Computation

Other NP-complete problems that have more recently been attacked directly by DNA Computation include the Maximal Clique Problem (1997) and the Satisfiability Problem (2000, update, 2002.)
There is a collection of NP problems, all known to be equivalent in the following sense. An algorithm for solving one of them can be adapted to solve any one of the others. These problems are called NP-complete, a notion due to Stephen Cook of the University of Toronto. The directed Hamiltonian path problem is known to be NP-complete. Another NP-complete problem is the cracking of public-key codes. A rapid method for solving any one of these problems will have important practical implications.

That is why so much excitement greeted Leonard Adleman's announcement ("Molecular computation of solutions to combinatorial problems," Science, 266 1021-1024, November 11, 1994) that he had found a way to solve the path problem using individual molecules as computing elements. The number of molecules of a compound in any tangible amount of that compound is almost unimaginably large (it helps to remember: if an apple were the size of the Earth, its molecules would be the size of apples). Adleman used very tiny quantities of his reagents; nevertheless he had 3x10^13 copies of each of 20 or so molecules, so about six hundred million million molecules at his service. This was like having six hundred million million (albeit very rudimentary) computers working on the problem at the same time: the age-old parable of the monkeys and the typewriters brought to life in a test-tube.

Molecular computation is especially appropriate for problems susceptible to a ``massively parallel'' attack. These are opposed to problems where long chains of calculations must be performed, each depending on the last one. The path problem can be parallelized as follows: imagine a huge number of passengers traveling the airline and making allowable connections at random. If the number is huge enough, then it is overwhelmingly likely that, if a Hamiltonian path exists, one of them at least will have found it.

Adleman's achievement is to have imagined a way to use molecules, (in this case, strands of DNA) to implement this strategy, and then to have devised an ingenious set of steps, using recombinant DNA technology, to filter out all except the strands representing a Hamiltonian path.

This is not the way DNA works in its natural setting. In nature a cell nucleus will have only one copy of each DNA molecule it uses, and not 10^13 copies. (On the other hand those molecules are chains of 50 to 250 million base pairs, whereas Adleman worked with strands only 20 bases long). Also, in the nucleus DNA expresses its information through chemical interactions with RNA and, through RNA, with amino acids which it organizes into proteins. None of this happens in Adleman's experiment.

Adleman uses DNA molecules for their computational ability alone. As a part of DNA's replication mechanism, a strand can recognize complementary strands or parts of strands by bonding to them. This is the only computation these molecules can perform, but it turns out to be enough.