next up previous contents
Next: Other uses of the Up: Subset-Sum (Knapsack) problems and Previous: A proposed public-key system   Contents

Breaking Knapsack Cryptosystems

We will present some of the basic ideas for attacking the system described in the preceding section, and illustrate them on a small example.4 Our source is the paper by E. Brickell and A. Odlyzko, ``Cryptanalysis: a survey of recent results,'' in Contemporary cryptology: the science of information integrity.

It is not necessary to provide an efficient algorithm guaranteed to crack all instances of a cryptosystem to call its security into question, and our analysis is only an indication that the simple system of the previous section can often be broken.


Suppose we are given the sequence $a_i$:

\begin{displaymath}\begin{array}{*{10}{r}}
611&929&1996&2456&2464&3594&3646&4085&5552&6765,\end{array}\end{displaymath}

generated using an unknown multiplier $V$ (correponding to 1000 in the previous example), an unknown modulus $M$, and an unknown super-increasing sequence $b_i$. Let ${UV}\equiv{1}\hbox{ (mod }{M})$. Then there are non-negative $k_i$ such that $b_i=Ua_i-Mk_i$.


$M$ must be larger than all the $b_i$, hence it must be significantly larger than the earliest members of the sequence.


The decryption method begins by guessing which of the known $a_i$ correspond to a few of the early members of the sequence. A trial-and-error approach is considered feasible for practical size problems.


Suppose we have guessed that 2464, 611, and 2456 correspond to the first three $b_i$, and further that 2464 corresponds to $b_1$.

\begin{displaymath}b_1=2464U-Mk_1\qquad b_2=611U-Mk_2\qquad b_3=2456U-Mk_3\end{displaymath}

$U$ may be eliminated from pairs of equations to obtain:

\begin{displaymath}611b_1-2464b_2=M(611k_1-2464k_2)\qquad2456b_1-2464b_3=M(2456k_1-2464k_3)\end{displaymath}

Since $M$ is large compared to $b_1,b_2,b_3$ this implies

\begin{displaymath}611k_1-2464k_2\quad\hbox{and}\quad2456k_1-2464k_3\end{displaymath}

must be small-- about the size of $b_1,b_2,b_3$ (positive or negative). In other words, $611k_1$ and $2456k_1$ must both be close to 0 or to 2464 mod 2464.


It is plausible that restrictions of this kind give a lot of information about $k_1$. As we look at all possible values of $k_1$, $611k_1$ mod 2464 is evenly distributed through the entire range from 0 to 2463. If we require, for example, that the number is either $\le100$ or $\ge2363$, we would expect that only about $1/12$ of the values for $k_1$ satisfy this. It seems plausible that the distribution of $2456k_1$ is independent of the distribution of $611k_1$, so a similar restriction on the second number reduces the possible values of $k_1$ by $(1/12)^2$.


In our example, ${611k_1}\equiv{s}\hbox{ (mod }{2464})$ implies $k_1{\equiv} 1355s$ and $2456k_1{\equiv} 1480s$. We look at the values of $s$ for something which makes $1480s$ mod 2464 small. Part of the results:

\begin{displaymath}\begin{array}{l\vert*{15}{c\vert}}s&1&2&3&4&5&6&7&8&9&10&11&1...
...
1480&496&1976&992&8&1488&504&1984&1000&16&1496&512
\end{array}\end{displaymath}

The multiples of $s=5$ are clearly the most likely candidates. We try $s=5$, which gives $k_1=1847$.



[The published algorithm recommends guessing more than 3 of the lowest $a_i$ and determining $k_i$ from an integer program:

\begin{eqnarray*}&\min  W\ &-W\le a_1k_j-a_jk_1\le W&\quad j\ge2
\end{eqnarray*}

using a special algorithm which is polynomial-time for a fixed number of constraints.]


Since $b_1=2464U-1847M$ implies $U/M$ is close to $1847/2464$, we try using $U=1847$, $M=2464$ on the $a_i$, which gives:

\begin{displaymath}\begin{array}{l\vert*{10}{c\vert}}
a_i&611&929&1996&2456&2464...
...552&6765\\
1847a_i&5&919&468&8&0&102&50&227&1840&11\end{array}\end{displaymath}

This is not a super-increasing sequence, since $5+8>11$. However, it is close enough to one that one can use it to solve subset-sum problems with a small amount of work.


The $a_i$ were generated using $M=6789$, $V=1234$, $U=5089$. Comparison of the ${a_i}\equiv{Vb_i}\hbox{ (mod }{M})$ with the ``almost'' super-increasing sequence obtained above shows that, except for the first few terms, the ratio of terms from the two sequences is nearly constant. Since a multiple of a super-increasing sequence is super-increasing, this explains why we would expect to obtain a useful sequence.

\begin{displaymath}\begin{array}{l\vert*{10}{c\vert}}
a_i&2464&611&2456&6765&364...
...537&5099\\
1847b_i&0&5&8&11&50&102&227&468&919&1840\end{array}\end{displaymath}



Footnotes

...4
Calculations in this section were done using the Calc program based on the emacs editor. Both programs are available free as part of the GNU system.

next up previous contents
Next: Other uses of the Up: Subset-Sum (Knapsack) problems and Previous: A proposed public-key system   Contents
Translated from LaTeX by Scott Sutherland
2002-12-14