next up previous contents
Next: A recent pseudo-random number Up: Further results on pseudo-random Previous: Hard-Core Predicates and Pseudo-Random   Contents

Construction of hard-core predicates

It seems much harder to find examples of hard-core predicates than of one-way functions: the latter can be obtained from discrete logarithms, subset-sum problems, and elsewhere, but it is difficult to prove that a property cannot be guessed with probability much larger than $1/2$. This makes the following result17 interesting.

Theorem 33   Let $f$ be a one-way function which maps sequences of bits of length $n$ to sequences of length $n$. For each $S\subset\{1,\dots,n\}$ and each $n$-bit $x$, define $B(S,x)$ to be true if the number of 1's of $x$ in the positions specified by $S$ is even. There is no efficient algorithm which can guess $B(S,x)$, given $S$ and $f(x)$, with probability much greater than $1/2$.

This result does not rule out the possibility that, for some specific $S$, it may be possible to guess $B(S,x)$ in an efficient way. However, it would not be possible, for example, to guess $B(S,x)$ correctly with probability $.6$ for $10\%$ of all possible $S$ and with probability $.5$ for all other $S$, since this would make the overall probability of guessing $B(S,x)$ correctly $(.1)(.6)+(.9)(.5)=.51$.
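As a concrete illustration of the predicate in Theorem 33, here is a minimal sketch of $B(S,x)$; the function name and bit-string representation are illustrative choices, not from the text.

```python
def B(S, x):
    """Parity predicate of Theorem 33: true when the number of 1's of x
    in the positions listed in S is even.  Positions are 1-based,
    matching the text's {1, ..., n}; x is a string of '0'/'1' bits."""
    return sum(int(x[i - 1]) for i in S) % 2 == 0

# Example: x = 1011, S = {1, 3} picks bits 1 and 1 -> two 1's, even.
print(B({1, 3}, "1011"))   # True
print(B({1, 2}, "1011"))   # False: bits 1 and 0, one 1, odd
```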

To prove this, we shall consider the following scenario: We are trying to determine an unknown $x\in \{0,1\}^{n}$. We have an oracle $A(S)$ which is supposed to tell us $B(S,x)$. The oracle may lie, but must tell the truth more than half the time. Theorem 34 says we can efficiently enumerate a not-too-large set which probably includes $x$.

Theorem 34   Suppose that $A(S)=B(S,x)$ for at least $1/2+\epsilon$ of all $S\subset\{1,\dots,n\}$. We can enumerate $U\subset \{0,1\}^{n}$ in time polynomial in $n$ and $\epsilon^{-1}$ such that $x\in U$ with probability close to 1.

To deduce Theorem 33 from Theorem 34, suppose we had an efficient algorithm for guessing $B(S,x)$. We obtain $U$, which cannot be too large given the time bound. If we are given the value of $f(x)$, we can evaluate $f$ at each element of $U$ and compare with $f(x)$ to identify the correct $x$ with high probability, which would mean $f$ is not a one-way function.
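The final step of this deduction can be sketched as follows; `invert_with_candidates` and the toy stand-in for $f$ are hypothetical names for illustration, under the assumption that the candidate set $U$ has already been produced.

```python
def invert_with_candidates(f, U, y):
    """Final step of the argument sketched above: given the candidate set U
    and the value y = f(x), test each candidate by evaluating f.
    If x is in U, this recovers a preimage of y."""
    for u in U:
        if f(u) == y:
            return u
    return None

# Toy illustration with a (decidedly non-one-way) function on bit strings:
f = lambda s: s[::-1]            # stand-in for f; a real f would be one-way
U = ["000", "101", "110"]        # candidate set that happens to contain x
print(invert_with_candidates(f, U, f("110")))   # '110'
```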

The construction of $U$ proceeds in stages. At stage $k$ ($1\le k\le n$), we enumerate $U_k\subset\{0,1\}^{k}$ which includes (with probability close to 1) the first $k$ bits of $x$.

We will consider $k$ fixed for the rest of this section. Define $L=\{1,\ldots,k\}$ and $R=\{k+1,\ldots,n\}$. If $\alpha$ and $\beta$ are true or false, $\alpha==\beta$ is defined to be 1 if $\alpha$ and $\beta$ are both true or both false, and 0 otherwise. If $h$ is defined for all $S\subset L$, we will use $\mathop{\rm Avg}_{S}h(S)$ to represent the average value of $h$, in other words $2^{-k}\sum_S h(S)$, with similar definitions for other collections of $S$. The hypothesis of Theorem 34 can be written as

\begin{displaymath}\mathop{\rm Avg}_{S\subset\{1,\ldots,n\}}\Bigl(A(S)==B(S,x)\Bigr)\ge 1/2+\epsilon.\end{displaymath}

Let $v\in \{0,1\}^{k}$. If there is a $w\in \{0,1\}^{n-k}$ with $x=v\circ w$ [i.e., $v$ is the first $k$ bits of $x$], then

\begin{displaymath}1/2+\epsilon\le\mathop{\rm Avg}_{D\subset R}\left(\mathop{\rm Avg}_{C\subset L}\Bigl(A(C\cup D)==B(C\cup D,v\circ w)\Bigr)\right)\end{displaymath}

\begin{displaymath}{\rm Define}\quad T(v,D)=\mathop{\rm Avg}_{C\subset L}\Bigl(A(C\cup D)==B(C,v)\Bigr).\end{displaymath}

Recall that $B(S,x)$ is true if and only if the number of 1's among the bits of $x$ corresponding to $S$ is even; in particular, $B(C\cup D,v\circ w)$ is true exactly when $B(C,v)$ and $B(D,w)$ have the same truth value. This implies

\begin{displaymath}\mathop{\rm Avg}_{C\subset L}\Bigl(A(C\cup D)==B(C\cup D,v\circ w)\Bigr)=\left\{\begin{array}{ll}T(v,D)&{\rm if\ }B(D,w){\rm\ is\ true}\\ 1-T(v,D)&{\rm if\ }B(D,w){\rm\ is\ false.}\end{array}\right.\end{displaymath}

Thus, if $x=v\circ w$, each inner average in the bound above equals $T(v,D)$ or $1-T(v,D)$, and both are at most $1/2+\vert T(v,D)-1/2\vert$, so

\begin{displaymath}\mathop{\rm Avg}_{D\subset R}\Bigl(\vert T(v,D)-1/2\vert\Bigr)\ge\epsilon \qquad(5)\end{displaymath}

To test whether a given $v$ satisfies (5) requires looking at all possible $C$ and $D$, which would take too much time. However, as in section 7.3, we can take not-too-large random samples from all possible $C$ and $D$, with a high probability of correctly deciding whether $v$ satisfies (5). Let $N$ be the number of different $D$ used in the sampling.
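The sampling idea can be sketched as follows; `estimate_T`, the truthful toy oracle, and the `trials` parameter are illustrative assumptions, not part of the text's construction.

```python
import random

def B(S, x):
    # parity predicate of Theorem 33: true when x has an even number
    # of 1's in the (1-based) positions listed in S
    return sum(int(x[i - 1]) for i in S) % 2 == 0

def estimate_T(A, v, D, k, trials=200, rng=random):
    """Estimate T(v,D) = Avg over C subset of L = {1,...,k} of
    (A(C ∪ D) == B(C,v)) by sampling random subsets C instead of
    enumerating all 2^k of them."""
    hits = 0
    for _ in range(trials):
        C = {i for i in range(1, k + 1) if rng.random() < 0.5}  # uniform C ⊆ L
        if A(C | D) == B(C, v):
            hits += 1
    return hits / trials

# Toy check with an oracle that never lies about a hidden x:
x = "101110"                      # hidden 6-bit string; k = 3, L = {1,2,3}
truthful = lambda S: B(S, x)
t = estimate_T(truthful, "101", {5}, k=3)
print(abs(t - 0.5))               # 0.5: for the correct prefix, |T - 1/2| is large
```

Here every sample gives the same comparison outcome (bit 5 of $x$ is 1, so the oracle's answer is always the negation of $B(C,v)$), so the estimate is exact despite the sampling.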

At stage $k-1$, we have obtained $U_{k-1}$ which includes the first $k-1$ bits of $x$ with high probability. To create $U_k$, we take each member of $U_{k-1}$, append 0 and 1 to it, and use sampling to identify which of the resulting $k$-bit strings satisfy (5).
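One stage of this extend-and-filter process can be sketched as follows; `next_stage` and `passes_test` are hypothetical names, with `passes_test` standing in for the sampled test of condition (5).

```python
def next_stage(U_prev, passes_test):
    """One stage of the construction: extend each (k-1)-bit candidate by
    0 and by 1, and keep the k-bit strings that pass the (sampled) test
    of condition (5), here abstracted as passes_test."""
    return [v + b for v in U_prev for b in "01" if passes_test(v + b)]

# Toy run: pretend the test keeps exactly the prefixes of x = "101".
passes = "101".startswith
U1 = ["0", "1"]
print(next_stage(U1, passes))   # ['10']
```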

This process would take too much time if the number of strings doubled at each step. To complete the proof, we use Lemma 35 to show that, for every $D$, there are at most $(2\epsilon)^{-2}$ strings $v$ with $\vert T(v,D)-1/2\vert\ge\epsilon$: if there were $m$ such $v$, the sum in Lemma 35 would be at least $m\epsilon^2$, forcing $m\le(2\epsilon)^{-2}$. In order to be included in $U_k$, $v$ must satisfy $\vert T(v,D)-1/2\vert\ge\epsilon$ for at least one of the sets $D$ used in the sample, which gives a bound of $N(2\epsilon)^{-2}$ on $\vert U_k\vert$.

Lemma 35   For any $D$, $\sum_v\bigl(T(v,D)-1/2\bigr)^2=1/4$, where the sum is over all $v\in \{0,1\}^{k}$.

Proof: $D$ is a constant and may be ignored. We will argue by induction on $k$. Define $L'=\{1,\ldots,k-1\}$. The function $A$ may be represented by a pair $A_1,A_2$ with

\begin{displaymath}A_1(C\cup D)=A(C\cup D)\qquad A_2(C\cup D)=A(C\cup\{k\}\cup D)\qquad{\rm for\ all\ }C\subset L'.\end{displaymath}

For $v'\in\{0,1\}^{k-1}$ and $i=1,2$, define

\begin{displaymath}T_i(v',D)=\mathop{\rm Avg}_{C\subset L'}\Bigl(A_i(C\cup D)==B(C,v')\Bigr).\end{displaymath}

Write $v=v'\circ v_k$. Splitting the average defining $T(v,D)$ according to whether $k\in C$, and noting that adjoining $k$ to $C$ leaves $B(C,v)$ unchanged when $v_k=0$ and negates it when $v_k=1$, we obtain

\begin{displaymath}T(v,D)=\left\{\begin{array}{ll}\frac{1}{2}\Bigl(T_1(v',D)+T_2(v',D)\Bigr)&{\rm if\ }v_k=0\\[4pt]\frac{1}{2}\Bigl(T_1(v',D)+\bigl(1-T_2(v',D)\bigr)\Bigr)&{\rm if\ }v_k=1.\end{array}\right.\end{displaymath}

We divide $\{0,1\}^{k}$ according to whether $v_k=0$ or 1 to obtain

\begin{eqnarray*}\sum_v\bigl(T(v,D)-1/2\bigr)^2&=&\sum_{v'}\left(\left(\frac{(T_1(v',D)-1/2)+(T_2(v',D)-1/2)}{2}\right)^2+\left(\frac{(T_1(v',D)-1/2)-(T_2(v',D)-1/2)}{2}\right)^2\right)\\
&=&\frac{1}{2}\sum_{v'}\Bigl(\bigl(T_1(v',D)-1/2\bigr)^2+\bigl(T_2(v',D)-1/2\bigr)^2\Bigr)\\
&=&\frac{1}{2}\left(\frac{1}{4}+\frac{1}{4}\right)=\frac{1}{4},\end{eqnarray*}

using $(a+b)^2+(a-b)^2=2a^2+2b^2$ in the second step and the induction hypothesis applied to $A_1$ and $A_2$ in the last. When $k=0$, the only $v$ is the empty string, $B(\emptyset,v)$ is true, and $T(v,D)$ is either 0 or 1, so the sum is $1/4$, which starts the induction. \quad\mbox{\vrule height 8pt width 4pt}
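Lemma 35 holds for an arbitrary oracle $A$, which can be checked numerically; this brute-force sketch takes $D$ empty (losing no generality, since $D$ is fixed throughout the lemma) and uses hypothetical helper names.

```python
from itertools import chain, combinations
import random

def B(S, x):
    # parity predicate: true when x has an even number of 1's in positions S
    return sum(int(x[i - 1]) for i in S) % 2 == 0

def subsets(items):
    # all subsets of a finite collection, as tuples
    s = list(items)
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))

k = 3
rng = random.Random(1)
# A completely arbitrary (here random) oracle on subsets of L = {1,...,k}:
A = {frozenset(C): rng.random() < 0.5 for C in subsets(range(1, k + 1))}

total = sum(
    (sum(A[frozenset(C)] == B(set(C), format(m, f"0{k}b"))
         for C in subsets(range(1, k + 1))) / 2 ** k - 0.5) ** 2
    for m in range(2 ** k)          # m ranges over all v in {0,1}^k
)
print(total)   # 0.25 for every choice of A, as Lemma 35 asserts
```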


... result17
O. Goldreich and L. Levin, ``A Hard-Core Predicate for all One-Way Functions,'' Proceedings of the 21st ACM Symposium on Theory of Computing (1989)

Translated from LaTeX by Scott Sutherland