next up previous contents
Next: Determining algorithm performance by Up: Probabilistic Encryption Previous: Weak laws of large   Contents


The magic of sampling

We have $10^6$ envelopes. Inside each envelope is a piece of paper with 0 or 1 written on it. If we want to know exactly how many envelopes have each number, we have to open them all. Suppose we want to estimate the fraction of the envelopes of each kind, and we want the proportion to be accurate to within .05. Now we need only open $9(10^5)$ envelopes.


The situation changes dramatically if we only want to estimate the proportion with high probability. If we are willing to accept a .01 probability of an error $>.05$, Lemma 24 implies we only need to open a randomly chosen sample of $10^4$ envelopes9.


The special feature of problems involving squares and pseudo-squares is that sampling is possible. We saw in our discussion of the Rabin system that every number mod $n$ has four square roots. Thus if we choose one of the $(p-1)(q-1)$ numbers $x$ not divisible10 by $p$ or $q$ and compute $x^2$ mod n, each square has a $(p-1)(q-1)/4$ chance of being chosen. It is also important that it is possible to sample from $Z^1_n$ (the set of squares and pseudo-squares) even if $p,q$ are not known.

Lemma 26   There is an efficient algorithm for deciding if $a\in Z^1_n$.

The proof of this is difficult, involving ``quadratic reciprocity'' and the ``Jacobi symbol.'' The algorithm itself is not that complicated, and is given in the RSA paper.


Given this lemma, we can sample in $Z^1_n$ by choosing $x$ at random and testing if it is in the set. If not, another $x$ is chosen. Since roughly half of $1\le x\le n$ is in $Z^1_n$, this won't take too long.


The different sampling possibilities we have discussed so far have all assumed that only $n$ was known. If we are given a single pseudo-square $y$, we can sample among all pseudo-squares by calculating $yx^2$ for $x$ randomly chosen.


The possibility of doing these various kinds of sampling is closely related to properties 2(a) and (c) in the paper (p. 277).


Footnotes

... envelopes9
Lemma 25 and Mathmatica suggest 400 envelopes are enough.
... divisible10
Even though $p,q$ are unknown, the gcd of $x,n$ can be computed.

next up previous contents
Next: Determining algorithm performance by Up: Probabilistic Encryption Previous: Weak laws of large   Contents
Translated from LaTeX by Scott Sutherland
2002-12-14