
Format Preserving Encryption in the Bounded Retrieval Model

Ben Morris, Hans Oberschelp, and Hamilton Samraj Santhakumar (Department of Mathematics, University of California, Davis)
(July 16, 2023)
Abstract

In the bounded retrieval model, the adversary can leak a certain amount of information from the message sender's computer (e.g., 10 percent of the hard drive). Bellare, Kane and Rogaway give an efficient symmetric encryption scheme in the bounded retrieval model. Their scheme uses a giant key (a key so large that only a fraction of it can be leaked). One property of their scheme is that the encrypted message is larger than the original message. Rogaway asked if an efficient scheme exists that does not increase the size of the message. In this paper we present such a scheme.

1 Introduction

The present paper attempts to solve the problem of format preserving encryption in the bounded retrieval model, by constructing a pseudorandom permutation and providing concrete security bounds in the random oracle model. The bounded retrieval model was introduced to study cryptographic protocols that remain secure in the presence of an adversary that can transmit or leak private information from the host's computer to a remote home base. One example of such an adversary is an APT (Advanced Persistent Threat), which is malware that stays undetected in the host's network and tries to exfiltrate the secret keys used by the host. The premise of the bounded retrieval model is that such an adversary cannot move a large amount of data to a remote base without being detected, or that it can only communicate with the remote base through a very narrow channel. That is, the model assumes an upper bound on the amount of data that an adversary can leak. In [1] Bellare, Kane and Rogaway introduce an efficient symmetric encryption scheme in this model and give concrete security bounds for it. They assume that the secret key is very large and model the leaked data as a function that takes the secret key as the input and outputs a smaller string. The length of this string is a parameter on which the security bounds depend. Their algorithm uses a random seed R along with the big key to generate a key of conventional length that is indistinguishable from a random string of the same length, even when the function used to model the leaked data depends on calls to the random oracle that the algorithm uses. It then uses this newly generated key and any of the conventionally available symmetric encryption schemes, say an AES mode of operation, to create a ciphertext C. Finally it outputs (R, C).

The above scheme is not format preserving since the final ciphertext (R, C) is longer than the original message M. A question posed by Phillip Rogaway (personal communication) is whether a secure format preserving encryption scheme exists in the bounded retrieval model. Another way to pose this question is as follows: if the adversary is allowed to leak data, is it possible to construct a pseudorandom permutation that is secure under some notion of security, say the CCA notion of security? The aim of this paper is to answer this question. Unfortunately it is not possible to come up with a pseudorandom permutation that is secure under the strong notion of CCA security. This is because in the CCA model, before trying to distinguish between a random permutation and the pseudorandom permutation, the adversary can choose to look at a sequence of plaintext-ciphertext pairs of its choosing. If leakage of data is allowed, the adversary can simply leak a single plaintext-ciphertext pair and use it to gain a very high CCA advantage. Hence we weaken the notion of security by requiring that the adversary can only look at a sequence of plaintext-ciphertext pairs where the plaintexts are uniformly random and distinct. We then ask the adversary to distinguish between a truly random permutation and the pseudorandom permutation. (See Section 3 for a precise definition of the security in our setup.)
In the present paper, we operate in the setting of the random oracle model (see [2]). Our contribution is to give a pseudorandom permutation in the bounded retrieval model and prove that it is secure in the weak sense that is discussed above.

Just as in [1] we use a big key. We now give a brief sketch of our approach. Note that if one fixes the string of leaked bits, the key is a uniform sample from the preimage of the leaked string. If the length of the leaked string is small, then on average the preimage is very large. What this means is that even when the leakage is known, with high probability the total entropy of the key is high. This implies that the sum of the entropies of the individual bits of our key is large. This means that many of the bits in the key are "unpredictable" in the sense that the probability of 1 is not close to 0 or 1. So, if one uses a random oracle to look at various positions of the key and take an XOR, it is likely that the resulting bit is close to an unbiased random bit. This idea of probing the key is similar to the one used in [1]. The content of Sections 6 and 7, which form the heart of this paper, is to show that bits generated by probing the key are close to i.i.d. unbiased random bits. To construct a pseudorandom permutation using these bits, we use a particular card shuffling scheme called the Thorp shuffle, just as in [9]. This construction is given in the next section.

2 Thorp shuffle/maximally unbalanced Feistel network

One method of turning a pseudorandom function into a pseudorandom permutation is to use a Feistel network (see [5]). The maximally unbalanced Feistel network is also known as the Thorp shuffle. Round r of this shuffle can be described as follows. Suppose that the current binary string (i.e., the value of the message after the first r−1 rounds of encryption) is LR, where length(L) = 1 and length(R) = 𝓂−1. Then round r transforms the string to RL′, where

L^{\prime}=L\oplus F_{k}(R,r),

and F_k is a pseudorandom function. Let X_t(m) denote the result of t Thorp shuffles on message m. The novel idea in the present paper is to use a pseudorandom function based on a big key k.
 
The Big Key Pseudorandom Function: Let 𝓀 be the length of the big key k. To compute F_k(R, r), apply the random oracle to (R, r) to obtain (P, 𝒮), where P = (P_1, …, P_n) is a sequence of n samples with replacement from {1, …, 𝓀} and 𝒮 is a uniform random subset of {1, …, n} that is independent of P. By analogy with [1], we define the random subkey by k[P] := (k[P_1], …, k[P_n]). Finally, define

F_{k}(R,r)=\oplus_{i\in\mathcal{S}}k[P_{i}]\,.

That is, F_k(R, r) is the XOR of a randomly chosen subsequence of the subkey. For a given key k we define our cipher to be X_T(·) for some fixed positive integer T.

Remark: We conjecture that it would also work (i.e., we would get a suitable pseudorandom function) if we took the XOR of the entire subkey; the current definition is used because it makes the proof simpler.
 

3 Security of the Cipher

In this section we introduce a notion of security for pseudorandom permutations under the assumption that there is a leakage of data. Let 𝒦 = {0,1}^𝓀 denote the set of keys and let ℳ = {0,1}^𝓂 denote the set of messages. We assume that the adversary can leak 𝓁 bits of data and, just as in [1], use a function Φ : 𝒦 → {0,1}^𝓁 to model this. Henceforth we will refer to this function as the leakage function. The adversary has the power to choose this function, and the function can depend on calls to the random oracle. For a key K, we will use L = Φ(K) to denote the output one gets by applying the leakage function to it. We will call this the leakage. We allow the adversary to make 𝓇 random oracle calls and decide on a leakage function Φ. After the adversary has chosen a leakage function, consider the following two worlds.
World 1: In this world, we first choose distinct uniformly random messages M_1, …, M_q ∈ ℳ. Then, for a uniformly random key K ∈ 𝒦, we set C_i = X_T(M_i), where X_t is the Thorp shuffle based cipher we defined in Section 2 and T is some fixed positive integer. We give the adversary access to the leakage L, the input-output pairs (M_1, C_1), …, (M_q, C_q) and the random oracle calls that were used by the algorithm to compute the X_T(M_i)'s.
World 0: In this world, again we choose distinct uniformly random messages M_1, …, M_q ∈ ℳ. We once again choose a random key K ∈ 𝒦 and compute L and all the random oracle calls necessary to evaluate the X_T(M_i)'s, just like world 1. However, instead of setting the C_i's to be the outputs of the cipher, we choose a uniformly random permutation π : ℳ → ℳ and set C_i = π(M_i). Just as in world 1, the adversary is provided access to the input-output pairs for the q messages, the leakage L and the random oracle calls.
We now place the adversary in these two worlds one at a time, without revealing which world is which, and in each case ask the adversary to guess which world it is. Let 𝒜(0) and 𝒜(1) denote the answers given in world 0 and world 1 respectively. Then, we define the advantage of an adversary as

\textbf{Adv}(\mathcal{A})=\mathbb{P}_{1}\big(\mathcal{A}(1)=1\big)-\mathbb{P}_{0}\big(\mathcal{A}(0)=1\big), (1)

where ℙ_i is the probability measure in world i. Define the maximum advantage

\mathbf{MaxAdv}_{\mathcal{r},q}=\max_{\mathcal{A}}\Big(\textbf{Adv}(\mathcal{A})\Big), (2)

where the maximum is taken over all adversaries satisfying the above mentioned conditions. Note that if, in the above setup, we allow the messages to be chosen by the adversary instead of being random, we get the notion of security of a block cipher against a chosen-plaintext attack (CPA) under leakage. Security against CPA is weaker than security against CCA (chosen-ciphertext attack). Unfortunately, if leakage is allowed, it is not possible to design a cipher that is secure in the CPA framework. To see this, consider the adversary who does the following. Let q = 1. Assume that the message length 𝓂 is less than 𝓁. For each key k, the adversary includes the ciphertext X_T(M_1) in the leakage, for a fixed message M_1. Then the adversary answers as follows: if C_1 = X_T(M_1), the adversary guesses world 1; else, the guess is world 0. In this case, ℙ_1(𝒜(1)=1) = 1 and ℙ_0(𝒜(0)=1) = 1/2^𝓂. Hence this adversary has a very high advantage. By instead providing the adversary with uniformly random plaintext-ciphertext pairs, we get the notion of security against a known-plaintext attack (KPA) under leakage. The main result of this paper is the following bound on the maximum advantage of such an adversary.

Theorem 1

The adversary’s advantage satisfies

\mathbf{MaxAdv}_{\mathcal{r},q}\leq\frac{q}{\mathcal{s}+1}\bigg(\frac{4\mathcal{m}q}{2^{\mathcal{m}}}\bigg)^{\mathcal{s}}+\frac{qT}{2}\Bigl[h^{-1}\Bigl(1-\frac{\alpha+n}{\mathcal{k}}\Bigr)\Bigr]^{n/2}+\frac{q\mathcal{r}}{2^{\mathcal{m}-1}}+\frac{qT}{2^{\mathcal{m}}},

where 𝓇 is the number of random oracle calls, α = 𝓁 + 𝓂(q+1) + T, 𝓈 is an integer satisfying the equation T = 𝓈(2𝓂−1), and h⁻¹ is the inverse of the function h restricted to [1/2, 1], where h is defined by h(p) = p log₂(1/p) + (1−p) log₂(1/(1−p)).

Let's try to make sense of this bound. The first two terms have exponents that we can control by choosing the parameters of the cipher. Specifically, with modest assumptions on the number of queries and the amount of leakage, we can make the first term as small as desired by running the cipher for T = 𝒪(log(q)) rounds, and make the second term equally small by sampling n = 𝒪(log(q)) probes in each round.
To make sense of the last two terms, let's consider an adversary which we will call the naive adversary. The naive adversary chooses a set ℳ′ of ⌊𝓁/𝓂⌋ messages and uses the 𝓁 bits of leakage to leak the ciphertext of each message in ℳ′. Next, when placed in either world 0 or world 1, the naive adversary checks if any of the q random messages provided is from the collection ℳ′. The naive adversary answers "world 1" if the corresponding ciphertext matches the leaked ciphertext. Otherwise, they answer "world 0". If none of the q provided messages are from ℳ′, then the naive adversary answers based on the flip of an independent fair coin. Let Adv_naive denote the advantage of the naive adversary. Then,

\mathbf{Adv_{naive}}=\mathbb{P}(M_{i}\in\mathcal{M}^{\prime}\text{ for some }1\leq i\leq q)\,(1-2^{-\mathcal{m}}).

Recall that the distinct messages M_1, …, M_q are sampled uniformly, and that |ℳ′| = ⌊𝓁/𝓂⌋. Let X = |{M_1, …, M_q} ∩ ℳ′|. Then X is a hypergeometric random variable and

\mathbb{P}(M_{i}\in\mathcal{M}^{\prime}\text{ for some }1\leq i\leq q)=\mathbb{P}(X>0).

Using the bound

\mathbb{P}(X>0)\geq\frac{\mathbb{E}X}{1+\mathbb{E}X}

for hypergeometric random variables, we get

\mathbf{Adv_{naive}}\geq\frac{q\lfloor\mathcal{l}/\mathcal{m}\rfloor 2^{-\mathcal{m}}}{1+q\lfloor\mathcal{l}/\mathcal{m}\rfloor 2^{-\mathcal{m}}}\cdot(1-2^{-\mathcal{m}}).
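As a sanity check, the hypergeometric bound ℙ(X&gt;0) ≥ 𝔼X/(1+𝔼X) can be verified exactly on small instances (a sketch; the population sizes below are arbitrary toy values):

```python
from math import comb

def check_bound(N: int, K: int, q: int) -> bool:
    """X ~ Hypergeometric: q draws without replacement from N items,
    K of which are 'special'. Check P(X > 0) >= E[X] / (1 + E[X])."""
    p_pos = 1 - comb(N - K, q) / comb(N, q)   # P(X = 0) = C(N-K, q) / C(N, q)
    ex = q * K / N                            # E[X] = qK/N
    return p_pos >= ex / (1 + ex) - 1e-12

assert all(check_bound(N, K, q)
           for N in (10, 40, 100) for K in (1, 4, 9) for q in (1, 3, 8))
```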

With the modest assumption that q⌊𝓁/𝓂⌋ ≤ 2^𝓂, and the fact that 𝓂 ≥ 1, we can simplify this bound to

\mathbf{Adv_{naive}}\geq\frac{q\lfloor\mathcal{l}/\mathcal{m}\rfloor}{4\cdot 2^{\mathcal{m}}}.

Returning to the bound on the advantage of the optimal adversary, if we assume that 𝓇, T ≤ ⌊𝓁/𝓂⌋ and q⌊𝓁/𝓂⌋ ≤ 2^𝓂, then

\mathbf{MaxAdv}_{\mathcal{r},q}\leq\frac{q}{\mathcal{s}+1}\bigg(\frac{4\mathcal{m}q}{2^{\mathcal{m}}}\bigg)^{\mathcal{s}}+\frac{qT}{2}\Bigl[h^{-1}\Bigl(1-\frac{\alpha+n}{\mathcal{k}}\Bigr)\Bigr]^{n/2}+12\cdot\mathbf{Adv_{naive}}.

Thus, with realistic assumptions, no adversary can do much better than the naive adversary. To make this precise, consider the following example. Let 𝓀 = 2^43 bits and 𝓁 = 𝓀/8 = 2^40 bits; i.e., the key has a size of 1 terabyte, of which 12.5%, or about 125 gigabytes, can be leaked. Assume that the message length is 𝓂 = 128 bits. Fix n = 500, 𝓈 = 2 and T = 𝓈(2𝓂−1) = 510. Let Γ(q) denote the two leading terms on the RHS of the above inequality, i.e.

\Gamma(q)=\frac{q}{\mathcal{s}+1}\bigg(\frac{4\mathcal{m}q}{2^{\mathcal{m}}}\bigg)^{\mathcal{s}}+\frac{qT}{2}\Bigl[h^{-1}\Bigl(1-\frac{\alpha+n}{\mathcal{k}}\Bigr)\Bigr]^{n/2},

with the values of 𝓀, 𝓁, 𝓂, n, 𝓈 and T fixed as discussed above. Figure 1 shows a plot of −log₂(Γ(q)) against log₂(q), for values of q satisfying q ≥ 0, q⌊𝓁/𝓂⌋ ≤ 2^𝓂, 1 − (α+n)/𝓀 ≥ 0 and Γ(q) ≤ 1. From this plot we can see that, for the example under consideration, until about q = 2^30 any adversary can only have a slightly higher advantage than 12 times the advantage obtained using the naive strategy.
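The curve in Figure 1 can be reproduced numerically. The sketch below evaluates Γ(q) for the example parameters, inverting h by bisection; the plotting itself is omitted:

```python
from math import log2

def h(p: float) -> float:
    """Binary entropy to base 2."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def h_inv(z: float) -> float:
    """Inverse of h restricted to [1/2, 1], where h is decreasing; bisection."""
    lo, hi = 0.5, 1.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if h(mid) > z:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def Gamma(q, k=2**43, l=2**40, m=128, n=500, s=2, T=510):
    """Two leading terms of the Theorem 1 bound; defaults are the example parameters."""
    alpha = l + m * (q + 1) + T
    term1 = q / (s + 1) * (4 * m * q / 2**m) ** s
    term2 = q * T / 2 * h_inv(1 - (alpha + n) / k) ** (n / 2)
    return term1 + term2
```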

Figure 1: Plot of −log₂(Γ(q)) vs log₂(q) for a particular example.

If a bound avoiding the use of h⁻¹ is desired, we can make use of Lemma 3 in Section 5, which states

h^{-1}(z)\leq\frac{1}{2}+\frac{1}{2}\sqrt{1-z^{\ln 4}}.

This gives the following bound on the adversary’s advantage:

\mathbf{MaxAdv}_{\mathcal{r},q}\leq\frac{q}{\mathcal{s}+1}\bigg(\frac{4\mathcal{m}q}{2^{\mathcal{m}}}\bigg)^{\mathcal{s}}+\frac{qT}{2}\left[\frac{1}{2}+\frac{1}{2}\sqrt{1-\Bigl(1-\frac{\alpha+n}{\mathcal{k}}\Bigr)^{\ln 4}}\right]^{n/2}+\frac{q\mathcal{r}}{2^{\mathcal{m}-1}}+\frac{qT}{2^{\mathcal{m}}}.

4 Entropy Background and Notation

Let X, Y be two random variables. Then let ℒ(X) and ℒ(X | Y) denote the law of X and the law of X given Y respectively. For example, let X be a uniform random variable over {0,1}^𝓀₀ and suppose Ψ : {0,1}^𝓀₀ → {0,1}^𝓁₀. We write ℒ(X | Ψ(X)) for the random probability measure p defined by

p(x):=\begin{cases}\dfrac{1}{|\Psi^{-1}(\Psi(X))|}&\text{if }\Psi(x)=\Psi(X);\\[5.0pt] 0&\text{otherwise.}\end{cases}

That is, if Ψ(X) = l, then ℒ(X | Ψ(X)) is the uniform distribution over {x ∈ {0,1}^𝓀₀ : Ψ(x) = l}.
 
Let 𝐇 denote entropy to base 2; that is,

\mathbf{H}(p):=\sum_{x\in\Omega}p(x)\log_{2}\frac{1}{p(x)}\,.

For a set S, define 𝐇(S) := log₂|S|. That is, 𝐇(S) is the entropy of the uniform distribution over S.

Lemma 2

Let X be a uniform random variable over A ⊂ {0,1}^𝓀₀ and suppose Ψ : A → ℒ, where |ℒ| = 2^𝓁₀. Define S(X) := Ψ⁻¹(Ψ(X)). Then

\mathbb{E}\Bigl(\mathbf{H}(S(X))\Bigr)\geq\log_{2}|A|-\mathcal{l}_{0}.

Furthermore, for any m ∈ ℝ,

\mathbf{P}\Bigl(\mathbf{H}(S(X))<\log_{2}|A|-\mathcal{l}_{0}-m\Bigr)\leq 2^{-m}\,.

Proof: For l ∈ ℒ, let S_l := {x : Ψ(x) = l}. Note that if X ∈ S_l then S(X) = S_l. It follows that

\mathbb{E}\Bigl(\mathbf{H}(S(X))\Bigr)=\sum_{l\in\mathcal{L}}\frac{|S_{l}|}{|A|}\log_{2}|S_{l}| (3)
=\log_{2}|A|+\sum_{l\in\mathcal{L}}\frac{|S_{l}|}{|A|}\log_{2}\frac{|S_{l}|}{|A|}
=\log_{2}|A|+|\mathcal{L}|\Bigl[\frac{1}{|\mathcal{L}|}\sum_{l\in\mathcal{L}}\frac{|S_{l}|}{|A|}\log_{2}\frac{|S_{l}|}{|A|}\Bigr]\,.

The average (over l) of the quantity |S_l|/|A| is 1/|ℒ|. Therefore, since the function x log x is convex, Jensen's inequality implies that the quantity (3) is at least

\log_{2}|A|+|\mathcal{L}|\left(\frac{1}{|\mathcal{L}|}\log_{2}\frac{1}{|\mathcal{L}|}\right)=\log_{2}|A|-\mathcal{l}_{0}\,.

For the second part of the lemma, note that

\mathbf{P}\Bigl(\mathbf{H}(S(X))<\log_{2}|A|-\mathcal{l}_{0}-m\Bigr)=\mathbf{P}\bigl(|S(X)|<|A|\cdot 2^{-\mathcal{l}_{0}-m}\bigr)=\sum\frac{|S_{l}|}{|A|},

where the sum is over l such that |S_l| < |A|·2^{−𝓁₀−m}. Since each term in the sum is at most 2^{−𝓁₀−m} and there are at most 2^𝓁₀ terms, the sum is at most 2^{−m}. □
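The first claim of Lemma 2 can be checked by brute force on a toy instance, taking A to be all of {0,1}^𝓀₀ and Ψ to be a uniformly random function (a sketch with assumed toy parameters):

```python
from math import log2
import random

def avg_preimage_entropy(k0: int, l0: int, seed: int = 1):
    """Return (E[H(S(X))], log2|A| - l0) for X uniform on A = {0,1}^k0
    and a random leakage map Psi into 2^l0 values."""
    random.seed(seed)
    A = range(2 ** k0)
    counts = {}
    for x in A:
        l = random.randrange(2 ** l0)       # Psi(x)
        counts[l] = counts.get(l, 0) + 1    # |S_l|
    # E[H(S(X))] = sum over l of (|S_l| / |A|) * log2 |S_l|
    expected = sum(c / len(A) * log2(c) for c in counts.values())
    return expected, k0 - l0

e, bound = avg_preimage_entropy(8, 3)
assert e >= bound     # Lemma 2: expected entropy >= log2|A| - l0
```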

5 Entropy and Bernoulli Random Variables

Let h(p) denote the entropy of a Bernoulli(p) random variable. That is, define h : [0,1] → [0,1] by

h(p)=p\log_{2}\frac{1}{p}+(1-p)\log_{2}\frac{1}{1-p}\,.

The restriction of h to [1/2, 1] is a strictly decreasing and onto function and hence has an inverse h⁻¹ : [0,1] → [1/2, 1]. Since h is concave and decreasing on [1/2, 1], the function h⁻¹ is concave. Furthermore, note that for any p ∈ [0,1] we have h(p) = h(1−p), and hence

h^{-1}(h(p))=\max(p,1-p)\,. (4)

Theorem 1.2 of [12] gives the following bound:

h(p)\leq(4pq)^{1/\ln 4}, (5)

where q = 1 − p. This implies the following lemma.

Lemma 3

For any p ∈ [0,1], we have

p\leq\frac{1}{2}\left(1+\sqrt{1-h(p)^{\ln 4}}\right).

Proof: Let Δ = max(p, 1−p) − 1/2. Then

pq=\Bigl(\frac{1}{2}+\Delta\Bigr)\Bigl(\frac{1}{2}-\Delta\Bigr)=\frac{1}{4}-\Delta^{2},

and hence

4pq=1-4\Delta^{2}. (6)

Equation (5) implies that

4pq\geq h(p)^{\ln 4}.

Combining this with (6) gives

\Delta^{2}\leq\frac{1}{4}\bigl(1-h(p)^{\ln 4}\bigr),

and hence

\Delta\leq\frac{1}{2}\sqrt{1-h(p)^{\ln 4}}\,.

It follows that

p\leq\frac{1}{2}+\Delta\leq\frac{1}{2}+\frac{1}{2}\sqrt{1-h(p)^{\ln 4}}=\frac{1}{2}\left(1+\sqrt{1-h(p)^{\ln 4}}\right).

\square
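Both the bound (5) and Lemma 3 are easy to verify numerically over a grid of p values (a sketch; the grid and tolerance are arbitrary choices):

```python
from math import log, log2, sqrt

def h(p: float) -> float:
    """Binary entropy to base 2."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

for i in range(1, 1000):
    p = i / 1000
    # Theorem 1.2 of [12]: h(p) <= (4 p (1-p))^(1/ln 4)
    assert h(p) <= (4 * p * (1 - p)) ** (1 / log(4)) + 1e-12
    # Lemma 3: p <= (1 + sqrt(1 - h(p)^(ln 4))) / 2
    assert p <= (1 + sqrt(1 - h(p) ** log(4))) / 2 + 1e-12
```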

Recall that h(p) is the entropy of a Bernoulli(p) random variable and that if S ⊂ {0,1}^𝓀 then 𝐇(S) := log₂|S|. We shall need the following entropy decomposition lemma.

Lemma 4

Suppose that S ⊂ {0,1}^𝓀 and suppose that K is uniformly distributed over S. Then

\sum_{i=1}^{\mathcal{k}}h\bigl(\mathbb{P}(K[i]=1)\bigr)\geq\mathbf{H}(S).

Proof: Note that 𝐇(S) is the entropy of K. Applying the chain rule for entropy to K = (K[1], …, K[𝓀]) gives

\sum_{i=1}^{\mathcal{k}}\mathbf{H}\bigl(K[i]\ \big|\ K[i-1],K[i-2],\ldots,K[1]\bigr)=\mathbf{H}(S)\,.

It is well known that for any two discrete random variables Z, Z′ on a common probability space, 𝐇(Z | Z′) ≤ 𝐇(Z). So the above equality gives

\sum_{i=1}^{\mathcal{k}}\mathbf{H}(K[i])\geq\sum_{i=1}^{\mathcal{k}}\mathbf{H}\bigl(K[i]\ \big|\ K[i-1],\ldots,K[1]\bigr)=\mathbf{H}(S)\,.

Finally, note that 𝐇(K[i]) = h(ℙ(K[i]=1)). □
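Lemma 4 can likewise be checked by brute force: sample a subset S of {0,1}^𝓀, compute the marginal bit biases, and compare the entropy sum against log₂|S| (a sketch with toy parameters):

```python
from math import log2
import random

def entropy_decomposition_check(k: int, seed: int = 7) -> bool:
    """Lemma 4 on a random S: sum_i h(P(K[i]=1)) >= log2|S|, K uniform on S."""
    random.seed(seed)
    S = random.sample(range(2 ** k), 2 ** (k - 1))   # a random S with |S| = 2^(k-1)
    total = 0.0
    for i in range(k):
        p = sum((x >> i) & 1 for x in S) / len(S)    # P(K[i] = 1)
        if 0.0 < p < 1.0:
            total += -p * log2(p) - (1 - p) * log2(1 - p)
    return total >= log2(len(S)) - 1e-9

assert entropy_decomposition_check(8)
```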

6 Main Technical Results

Lemma 5

Let Y = (Y_1, Y_2, …, Y_n) ∈ {0,1}^n be a random n-bit string. For S ⊆ {1, …, n}, set f_S(Y) := (−1)^{⊕_{i∈S} Y_i}, with the convention that f_∅ ≡ 1. Also let

\mathcal{E}(S)=\mathbb{E}[f_{S}(Y)]=2\Bigl(\frac{1}{2}-\mathbb{P}(\oplus_{i\in S}Y_{i}=1)\Bigr)\;. (7)

Then for a uniformly chosen random subset 𝒮 ⊆ {1, …, n}, we have

\mathbb{E}\bigl[\mathcal{E}(\mathcal{S})^{2}\bigr]=\sum_{y\in\{0,1\}^{n}}\mathbb{P}(Y=y)^{2}. (8)

Lemma 5 is a well-known consequence of Parseval’s theorem (see page 24 of [11]). For completeness, we give a proof here:
 
Proof: Let Ω = {0,1}^n. Note that ℝ^Ω, the space of real valued functions on Ω, forms a vector space of dimension |Ω| over ℝ. Define the following inner product on ℝ^Ω:

\langle f,g\rangle=\frac{1}{2^{n}}\sum_{x\in\Omega}f(x)g(x)=\mathbb{E}[f(Z)g(Z)]\quad\text{for }f,g\in\mathbb{R}^{\Omega},

where Z = (Z_1, …, Z_n) and Z_1, Z_2, …, Z_n are i.i.d. Bernoulli(1/2) random variables. Observe that when S ≠ S′,

\langle f_{S},f_{S^{\prime}}\rangle=\mathbb{E}[f_{S}(Z)f_{S^{\prime}}(Z)]=\mathbb{E}\Bigl[\prod_{i\in S\cap S^{\prime}}(-1)^{2Z_{i}}\prod_{j\in S\triangle S^{\prime}}(-1)^{Z_{j}}\Bigr]=\prod_{i\in S\cap S^{\prime}}\mathbb{E}\bigl[(-1)^{2Z_{i}}\bigr]\prod_{j\in S\triangle S^{\prime}}\mathbb{E}\bigl[(-1)^{Z_{j}}\bigr]=0,

since 𝔼[(−1)^{Z_i}] = 0 and S △ S′ is non-empty when S ≠ S′. Also observe that

\langle f_{S},f_{S}\rangle=\mathbb{E}\bigl[(-1)^{2(\oplus_{i\in S}Z_{i})}\bigr]=\mathbb{E}[1]=1.

Therefore, {f_S}_{S⊆[n]} forms an orthonormal basis for ℝ^Ω. Next, let U(y) = 1/2^n and P(y) = ℙ(Y = y) for y ∈ Ω. Then P, U, P/U ∈ ℝ^Ω. Now note that

\langle P/U,f_{S}\rangle=\frac{1}{2^{n}}\sum_{x\in\Omega}2^{n}P(x)f_{S}(x)=\sum_{x\in\Omega}P(x)f_{S}(x)=\mathbb{E}[f_{S}(Y)]=\mathcal{E}(S)\,.

It follows that

\frac{1}{2^{n}}\langle P/U,f_{S}\rangle^{2}=\frac{1}{2^{n}}\mathcal{E}(S)^{2}.

Summing the above equation over all subsets S ⊆ {1, …, n} and using the fact that the f_S's form an orthonormal basis, we get

\frac{1}{2^{n}}\langle P/U,P/U\rangle=\sum_{S\subseteq\{1,\ldots,n\}}\frac{1}{2^{n}}\langle P/U,f_{S}\rangle^{2}=\sum_{S\subseteq\{1,\ldots,n\}}\frac{1}{2^{n}}\mathcal{E}(S)^{2}=\mathbb{E}\bigl[\mathcal{E}(\mathcal{S})^{2}\bigr].

The left hand side of the above equation simplifies to Σ_{y∈Ω} P(y)², and hence the proof is complete. □
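For small n, equation (8) can be verified by direct enumeration over all 2ⁿ subsets (a sketch; the law of Y below is an arbitrary random distribution):

```python
import random

def parseval_check(n: int, seed: int = 3) -> bool:
    """Check E[E(S)^2], S a uniform random subset, equals sum_y P(Y=y)^2."""
    random.seed(seed)
    w = [random.random() for _ in range(2 ** n)]
    P = [x / sum(w) for x in w]          # an arbitrary law for Y on {0,1}^n
    lhs = 0.0
    for S in range(2 ** n):              # subsets of {1..n} encoded as bitmasks
        # E(S) = sum_y P(y) * (-1)^(parity of the bits of y selected by S)
        E = sum(P[y] * (-1) ** bin(y & S).count("1") for y in range(2 ** n))
        lhs += E * E
    lhs /= 2 ** n                        # average over the 2^n subsets
    return abs(lhs - sum(p * p for p in P)) < 1e-9

assert parseval_check(4)
```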

Note that ℰ(S) is a measure of the bias in the parity of the bits of Y whose positions are in S. More precisely, recall that for probability distributions μ and ν on a finite set Ω, the total variation distance is

\lVert\mu-\nu\rVert_{TV}:=\frac{1}{2}\sum_{x\in\Omega}|\mu(x)-\nu(x)|. (9)

For a {0,1}-valued random variable W, the total variation distance is

\lVert W-{\rm Bernoulli}({\textstyle\frac{1}{2}})\rVert_{TV}=|\mathbb{P}(W=1)-{\textstyle\frac{1}{2}}|\;.

Equation (7) implies that

\frac{1}{2}-\mathbb{P}(\oplus_{i\in S}Y_{i}=1)=\frac{1}{2}\mathcal{E}(S),

and hence

\lVert\oplus_{i\in S}Y_{i}-{\rm Bernoulli}({\textstyle\frac{1}{2}})\rVert_{TV}=\frac{1}{2}|\mathcal{E}(S)|. (10)

For S ⊆ {1, …, n}, define ℬ(S) := ‖⊕_{i∈S} Y_i − Bernoulli(1/2)‖_TV. Then

\mathbb{E}(\mathcal{B}(\mathcal{S}))=\frac{1}{2}\mathbb{E}|\mathcal{E}(\mathcal{S})|\leq\frac{1}{2}\sqrt{\mathbb{E}(\mathcal{E}(\mathcal{S})^{2})}=\frac{1}{2}\Biggl[\sum_{y\in\{0,1\}^{n}}\mathbb{P}(Y=y)^{2}\Biggr]^{1/2},

where the first equality follows from equation (10), the inequality follows from Jensen's inequality and the final equality follows from Lemma 5. This leads to the following:

Corollary 6

Let K be a random string in {0,1}^𝓀. Let (p_1, …, p_n) be a choice of probes. Let c′ be a Bernoulli(1/2) random variable, and for S ⊆ {1, …, n}, define

c(S):=\oplus_{i\in S}K[p_{i}];\qquad d(S):=\lVert c(S)-c^{\prime}\rVert_{TV}.

If 𝒮 is a uniform random subset of {1, 2, …, n} then

\mathbb{E}(d(\mathcal{S}))\leq\frac{1}{2}\mathbb{E}\Biggl[\sqrt{\sum_{y\in\{0,1\}^{n}}\mathbb{P}\bigl(K[p_{1},\dots,p_{n}]=y\bigr)^{2}}\Biggr].

This shows that the expectation (taken over the subprobes) of the distance between the random bit c(𝒮) and a Bernoulli(1/2) random variable can be bounded in terms of the l²-norm of the distribution of K[p_1, …, p_n].

7 Main Lemma

Suppose K is a uniform random element of {0,1}^𝓀 and suppose Ψ : {0,1}^𝓀 → ℒ. For l ∈ ℒ, let S_l := Ψ⁻¹(l). Define the probability measure ℙ_l by

\mathbb{P}_{l}(\,\cdot\,):=\mathbb{P}(\,\cdot\,{\;|\;}\Psi(K)=l)\,,

and write 𝔼_l for the expectation operator with respect to ℙ_l. Note that under ℙ_l, the distribution of K is uniform over S_l. For an integer r with 1 ≤ r ≤ n and probes p_1, …, p_r define

g_{l}(p_{1},\dots,p_{r}):=\sum_{x\in\{0,1\}^{r}}\mathbb{P}_{l}(K[p_{1},\dots,p_{r}]=x)^{2}\,.
Lemma 7

Suppose that the probes P_1, P_2, …, P_n are chosen independently and uniformly at random from {1, 2, …, 𝓀}. If 𝐇(S_l) ≥ 𝓀 − α, then

\mathbb{E}_{l}\bigl(g_{l}(P_{1},\dots,P_{n})\bigr)\leq\Bigl[h^{-1}\Bigl(1-\frac{\alpha+n}{\mathcal{k}}\Bigr)\Bigr]^{n}\,.
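Before turning to the proof, the lemma can be sanity-checked by exhaustive computation on a toy instance (a sketch; the choices 𝓀 = 6, a single leakage class S_l of size 56, and n = 3 probes are arbitrary, and probes index bits from 0):

```python
from itertools import product
from math import log2

def h(p):
    """Binary entropy to base 2."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def h_inv(z):
    """Inverse of h on [1/2, 1] by bisection (h is decreasing there)."""
    lo, hi = 0.5, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if h(mid) > z else (lo, mid)
    return (lo + hi) / 2

def g(S, probes):
    """g_l(p_1..p_r): collision probability of K[probes] for K uniform on S."""
    counts = {}
    for key in S:
        x = tuple((key >> p) & 1 for p in probes)
        counts[x] = counts.get(x, 0) + 1
    return sum((c / len(S)) ** 2 for c in counts.values())

k, n = 6, 3
S = list(range(56))               # one leakage class; H(S) = log2 56
alpha = k - log2(len(S))          # chosen so that H(S) = k - alpha exactly
avg_g = sum(g(S, ps) for ps in product(range(k), repeat=n)) / k ** n
assert avg_g <= h_inv(1 - (alpha + n) / k) ** n
```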

Proof: Fix l with 𝐇(S_l) ≥ 𝓀 − α. For x ∈ {0,1}^r and a choice of probes p_1, …, p_{r+1}, define

\lambda_{l}(x,p_{1},\dots,p_{r+1}):=\mathbb{P}_{l}(K[p_{r+1}]=1{\;|\;}K[p_{1},\dots,p_{r}]=x)=\mathbb{P}(K[p_{r+1}]=1{\;|\;}K[p_{1},\dots,p_{r}]=x,\Psi(K)=l)\,.

Define μ_l^r(x) := ℙ_l(K[p_1, …, p_r] = x). Note that conditional on Ψ(K) = l and K[p_1, …, p_r] = x, the distribution of K is uniform over S_l ∩ {A ∈ {0,1}^𝓀 : A[p_1, …, p_r] = x}. Furthermore,

|S_{l}\cap\{A\in\{0,1\}^{\mathcal{k}}:A[p_{1},\dots,p_{r}]=x\}|=|S_{l}|\cdot\mu_{l}^{r}(x)\,.

Hence Lemma 4 implies that

\frac{1}{\mathcal{k}}\sum_{j=1}^{\mathcal{k}}h(\lambda_{l}(x,p_{1},\dots,p_{r},j))\geq\frac{1}{\mathcal{k}}\log_{2}\bigl(|S_{l}|\cdot\mu_{l}^{r}(x)\bigr)\,. (11)

For any p1,,pr+1p_{1},\dots,p_{r+1}, we have

g_{l}(p_{1},\dots,p_{r+1})=\sum_{y\in\{0,1\}^{r+1}}\mathbb{P}_{l}(K[p_{1},\dots,p_{r+1}]=y)^{2} (13)
=\sum_{x\in\{0,1\}^{r}}\mathbb{P}_{l}(K[p_{1},\dots,p_{r}]=x)^{2}\Bigl[\lambda_{l}(x,p_{1},\dots,p_{r+1})^{2}+(1-\lambda_{l}(x,p_{1},\dots,p_{r+1}))^{2}\Bigr] (14)
=\sum_{x\in\{0,1\}^{r}}\mu_{l}^{r}(x)^{2}\Bigl[\lambda_{l}(x,p_{1},\dots,p_{r+1})^{2}+(1-\lambda_{l}(x,p_{1},\dots,p_{r+1}))^{2}\Bigr]\,. (15)

Note that for any $p\in[0,1]$ we have $p^{2}+(1-p)^{2}\leq\max(p,1-p)$. Hence, the quantity in square brackets in equation (15) is at most

h1(h(λl(x,p1,,pr+1)))h^{-1}(h({\lambda}_{l}(x,p_{1},\dots,p_{r+1})))

by equation (4). Thus

gl(p1,,pr+1)\displaystyle{g}_{l}(p_{1},\dots,p_{r+1}) \displaystyle\leq x{0,1}rμlr(x)2h1(h(λl(x,p1,,pr+1))).\displaystyle\sum_{x\in\{0,1\}^{r}}{\mu_{l}^{r}}(x)^{2}h^{-1}(h({\lambda}_{l}(x,p_{1},\dots,p_{r+1})))\,. (16)
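As a numerical sanity check (not part of the formal argument), both facts used here, $p^{2}+(1-p)^{2}\leq\max(p,1-p)$ and $\max(p,1-p)=h^{-1}(h(p))$, can be verified on a grid of values. The sketch below is an illustrative implementation: $h$ is the binary entropy function and $h^{-1}$ is its inverse on $[1/2,1]$, computed by bisection.

```python
from math import log2

def h(p):
    # Binary entropy function h(p) = -p log2 p - (1-p) log2 (1-p).
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def h_inv(y, tol=1e-12):
    # Inverse of h restricted to [1/2, 1], where h is decreasing; bisection.
    lo, hi = 0.5, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if h(mid) > y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for i in range(101):
    p = i / 100
    assert p ** 2 + (1 - p) ** 2 <= max(p, 1 - p) + 1e-12
    assert abs(h_inv(h(p)) - max(p, 1 - p)) < 1e-6
```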

Recall that the probe Pr+1P_{r+1} is chosen uniformly at random from {1,2,,𝓀}\{1,2,\dots,\mathcal{k}\}. It follows that

𝔼l(gl(p1,,pr,Pr+1))\displaystyle{{\mathbb{E}_{l}}}({g}_{l}(p_{1},\dots,p_{r},P_{r+1})) \displaystyle\leq 1𝓀j=1𝓀x{0,1}rμlr(x)2h1(h(λl(x,p1,,pr,j)))\displaystyle{1\over\mathcal{k}}\sum_{j=1}^{\mathcal{k}}\sum_{x\in\{0,1\}^{r}}{\mu_{l}^{r}}(x)^{2}h^{-1}(h({\lambda}_{l}(x,p_{1},\dots,p_{r},j))) (17)
=\displaystyle= x{0,1}rμlr(x)2[1𝓀j=1𝓀h1(h(λl(x,p1,,pr,j)))].\displaystyle\sum_{x\in\{0,1\}^{r}}{\mu_{l}^{r}}(x)^{2}\Bigl{[}{1\over\mathcal{k}}\sum_{j=1}^{\mathcal{k}}h^{-1}(h({\lambda}_{l}(x,p_{1},\dots,p_{r},j)))\Bigr{]}\,. (18)

Recall that in Section 5 we showed that h1h^{-1} is concave. Thus, Jensen’s inequality implies that the quantity (18) is at most

x{0,1}rμlr(x)2h1(1𝓀j=1𝓀h(λl(x,p1,,pr,j)))\displaystyle\sum_{x\in\{0,1\}^{r}}{\mu_{l}^{r}}(x)^{2}h^{-1}\Bigl{(}{1\over\mathcal{k}}\sum_{j=1}^{\mathcal{k}}h({\lambda}_{l}(x,p_{1},\dots,p_{r},j))\Bigr{)} (19)
\displaystyle\leq x{0,1}rμlr(x)2h1(1𝓀log2(|Sl|μlr(x))),\displaystyle\sum_{x\in\{0,1\}^{r}}{\mu_{l}^{r}}(x)^{2}h^{-1}\Bigl{(}{1\over\mathcal{k}}\log_{2}(|S_{l}|\cdot{\mu_{l}^{r}}(x))\Bigr{)}, (20)

where the inequality follows from (11) and the fact that h1h^{-1} is decreasing. Recall that the Harris-FKG inequality (see Section 2.2 of [3]) implies that if XX is a random variable and ff (respectively, gg) is an increasing (respectively, decreasing) function, then

𝔼(f(X)g(X))𝔼(f(X))𝔼(g(X)).{{\mathbb{E}}}(f(X)g(X))\leq{{\mathbb{E}}}(f(X)){{\mathbb{E}}}(g(X))\,.
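This instance of the Harris-FKG inequality (for a single real random variable it reduces to Chebyshev's sum inequality) is easy to check numerically on a finite distribution. In the sketch below the sample points and the choices $f(x)=x$ (increasing) and $g(x)=e^{-x}$ (decreasing) are arbitrary illustrations, not taken from the paper.

```python
import random
from math import exp

random.seed(7)
# Empirical distribution of a real-valued random variable X.
xs = [random.random() for _ in range(1000)]

f = lambda x: x            # increasing
g = lambda x: exp(-x)      # decreasing

mean = lambda vals: sum(vals) / len(vals)
lhs = mean([f(x) * g(x) for x in xs])       # E(f(X) g(X))
rhs = mean([f(x) for x in xs]) * mean([g(x) for x in xs])  # E(f(X)) E(g(X))
assert lhs <= rhs
```

The inequality holds exactly for any finite sample with monotone $f$ and $g$ of opposite orientation, so the assertion does not depend on the random seed.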

Now, consider the probability measure that assigns mass ${\mu_{l}^{r}}(x)$ to each $x\in\{0,1\}^{r}$, and let $X:\{0,1\}^{r}\to{\bf R}$ be the random variable defined by $X(x)={\mu_{l}^{r}}(x)$. Let $f$ be the identity function on ${\bf R}$ and define $g:{\bf R}\to{\bf R}$ by $g(u)=h^{-1}\Bigl{(}{1\over\mathcal{k}}\log_{2}(|S_{l}|\cdot u)\Bigr{)}$. Then $f$ is increasing, and since $h^{-1}$ is decreasing, $g$ is decreasing. Thus the Harris-FKG inequality implies that the quantity (20) is at most

\Bigl{(}\sum_{x\in\{0,1\}^{r}}{\mu_{l}^{r}}(x)^{2}\Bigr{)}\Bigl{(}\sum_{x\in\{0,1\}^{r}}{\mu_{l}^{r}}(x)h^{-1}\Bigl{(}{1\over\mathcal{k}}\log_{2}(|S_{l}|\cdot{\mu_{l}^{r}}(x))\Bigr{)}\Bigr{)} (22)
\displaystyle\leq (x{0,1}rμlr(x)2)h1(1𝓀x{0,1}rμlr(x)log2(|Sl|μlr(x))).\displaystyle\Bigl{(}\sum_{x\in\{0,1\}^{r}}{\mu_{l}^{r}}(x)^{2}\Bigr{)}h^{-1}\Bigl{(}{1\over\mathcal{k}}\sum_{x\in\{0,1\}^{r}}{\mu_{l}^{r}}(x)\log_{2}(|S_{l}|\cdot{\mu_{l}^{r}}(x))\Bigr{)}\,. (23)

Applying the first part of Lemma 2 with Ψ(K)=K[p1,p2,,pr]{\Psi}(K)=K[p_{1},p_{2},\dots,p_{r}] gives

x{0,1}rμlr(x)log2(|Sl|μlr(x))\displaystyle\sum_{x\in\{0,1\}^{r}}{\mu_{l}^{r}}(x)\log_{2}(|S_{l}|\cdot{\mu_{l}^{r}}(x)) \displaystyle\geq 𝐇(Sl)r\displaystyle{\mathbf{H}}(S_{l})-r
\displaystyle\geq 𝓀αn,\displaystyle\mathcal{k}-\alpha-n,

where the second inequality follows from the fact that ${\mathbf{H}}(S_{l})\geq\mathcal{k}-\alpha$ and $r\leq n$. It follows that the quantity (23) is at most

(x{0,1}rμlr(x)2)h1(1𝓀(𝓀αn))\displaystyle\Bigl{(}\sum_{x\in\{0,1\}^{r}}{\mu_{l}^{r}}(x)^{2}\Bigr{)}h^{-1}\Bigl{(}{1\over\mathcal{k}}(\mathcal{k}-\alpha-n)\Bigr{)} (24)
=\displaystyle= (x{0,1}rμlr(x)2)h1(1α+n𝓀)\displaystyle\Bigl{(}\sum_{x\in\{0,1\}^{r}}{\mu_{l}^{r}}(x)^{2}\Bigr{)}h^{-1}\Bigl{(}1-{\alpha+n\over\mathcal{k}}\Bigr{)} (25)
=\displaystyle= gl(p1,p2,,pr)h1(1α+n𝓀).\displaystyle{g}_{l}(p_{1},p_{2},\dots,p_{r})h^{-1}\Bigl{(}1-{\alpha+n\over\mathcal{k}}\Bigr{)}\,. (26)

We have shown that for any choice of p1,p2,,prp_{1},p_{2},\dots,p_{r} we have

𝔼l(gl(p1,,pr,Pr+1))gl(p1,p2,,pr)h1(1α+n𝓀).{{\mathbb{E}_{l}}}({g}_{l}(p_{1},\dots,p_{r},P_{r+1}))\leq{g}_{l}(p_{1},p_{2},\dots,p_{r})h^{-1}\Bigl{(}1-{\alpha+n\over\mathcal{k}}\Bigr{)}\,.

It follows that

𝔼l(gl(P1,,Pr+1))𝔼l(gl(P1,P2,,Pr))h1(1α+n𝓀).{{\mathbb{E}_{l}}}({g}_{l}(P_{1},\dots,P_{r+1}))\leq{{\mathbb{E}_{l}}}({g}_{l}(P_{1},P_{2},\dots,P_{r}))h^{-1}\Bigl{(}1-{\alpha+n\over\mathcal{k}}\Bigr{)}\,.

Applying this bound repeatedly for $r=n-1,n-2,\dots,1$ gives

𝔼l(gl(P1,,Pn))𝔼l(gl(P1))[h1(1α+n𝓀)]n1.{{\mathbb{E}_{l}}}({g}_{l}(P_{1},\dots,P_{n}))\leq{{\mathbb{E}_{l}}}({g}_{l}(P_{1}))\Bigl{[}h^{-1}\Bigl{(}1-{\alpha+n\over\mathcal{k}}\Bigr{)}\Bigr{]}^{n-1}\,.

Finally, an argument similar to the one above (eliminating the sums over $\{0,1\}^{r}$ and replacing ${\mu_{l}^{r}}(x)$ by $1$) shows that

𝔼l(gl(P1))h1(1α+n𝓀),{{\mathbb{E}_{l}}}({g}_{l}(P_{1}))\leq h^{-1}\Bigl{(}1-{\alpha+n\over\mathcal{k}}\Bigr{)}\,,

and the lemma follows. \square
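As a numerical sanity check on Lemma 7 (not part of the formal argument), the following sketch computes $\mathbb{E}_{l}(g_{l}(P_{1},\dots,P_{n}))$ exactly for a toy key length and a toy leakage function, and compares it with the bound $[h^{-1}(1-(\alpha+n)/\mathcal{k})]^{n}$. The parameters and the leakage function (revealing the first two key bits) are illustrative choices, not taken from the paper.

```python
import itertools
from collections import Counter
from math import log2

def h(p):
    # Binary entropy function.
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def h_inv(y, tol=1e-12):
    # Inverse of h on [1/2, 1], where h is decreasing; bisection.
    lo, hi = 0.5, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if h(mid) > y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

k, n = 8, 3          # toy key length and number of probes
alpha = 2            # toy leakage: the first 2 key bits are revealed
S_l = [key for key in itertools.product([0, 1], repeat=k)
       if key[0] == key[1] == 0]          # H(S_l) = k - alpha

def g(probes):
    # Collision probability of the probed bits under the uniform
    # distribution on S_l (repeated probes are allowed).
    counts = Counter(tuple(key[p] for p in probes) for key in S_l)
    return sum((c / len(S_l)) ** 2 for c in counts.values())

# Exact expectation over independent uniform probes P_1, ..., P_n.
all_probes = list(itertools.product(range(k), repeat=n))
avg_g = sum(g(ps) for ps in all_probes) / len(all_probes)

bound = h_inv(1 - (alpha + n) / k) ** n
assert avg_g <= bound
```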

8 Proof of Main Theorem

In this section we prove Theorem 1, which bounds the advantage of a known-plaintext (KPA) adversary with leakage against the cipher described in Section 2. Recall that the bound in question is

\mathbf{MaxAdv}_{\mathbf{\mathcal{r},q}}\leq\frac{q}{\mathcal{s}+1}\bigg{(}\frac{4\mathcal{m}q}{2^{\mathcal{m}}}\bigg{)}^{\mathcal{s}}+{qT\over 2}\Bigl{[}h^{-1}\Bigl{(}1-{\alpha+n\over\mathcal{k}}\Bigr{)}\Bigr{]}^{n/2}+\frac{q\mathcal{r}}{2^{\mathcal{m}-1}}+\frac{qT}{2^{\mathcal{m}}}\,.

First, we prove the bound assuming that the adversary makes no random oracle calls. Let (Mi,Ci)i=1q({{M}}_{i},{{C}}_{i})_{i=1}^{q} be the uniform random sequence of input/output pairs given to the adversary.
The adversary’s advantage satisfies

𝐌𝐚𝐱𝐀𝐝𝐯𝟎,𝐪(Mi,Ci)i=1q(Miu,Ciu)i=1qTV,\mathbf{MaxAdv}_{\mathbf{0,q}}\leq\lVert({{M}}_{i},C_{i})_{i=1}^{q}-({{M}}^{u}_{i},C^{u}_{i})_{i=1}^{q}\rVert_{TV},

where $({{M}}^{u}_{i},C^{u}_{i})_{i=1}^{q}$ are $q$ uniform random queries from a uniform random permutation. Let $({{M}}^{{\rm Th}}_{i},C^{{\rm Th}}_{i})_{i=1}^{q}$ be $q$ uniform random queries from $T$ rounds of an idealized Thorp shuffle that uses a uniform random round function $F$ instead of a pseudorandom function.
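For concreteness, here is a small sketch of $T$ rounds of an idealized Thorp shuffle on $\{0,1\}^{m}$ with a lazily sampled uniform round function. The toy values of $m$ and $T$ and the bit-indexing convention (leftmost bit paired with the rightmost $m-1$ bits) are illustrative assumptions; the check confirms that the rounds compose to a permutation.

```python
import random

random.seed(1)
m, T = 8, 24          # toy domain {0,1}^m and round count
F = {}                # uniform random round function, sampled lazily

def round_bit(rnd, R):
    # One fresh uniform bit per (round, R) pair, reused if queried again.
    if (rnd, R) not in F:
        F[(rnd, R)] = random.randrange(2)
    return F[(rnd, R)]

def thorp(x):
    # One round sends (b, R) to (R, b XOR F(round, R)), where b is the
    # leftmost bit and R the rightmost m-1 bits of the current state.
    for rnd in range(T):
        b, R = x >> (m - 1), x & ((1 << (m - 1)) - 1)
        x = (R << 1) | (b ^ round_bit(rnd, R))
    return x

out = [thorp(x) for x in range(2 ** m)]
assert sorted(out) == list(range(2 ** m))   # rounds compose to a permutation
```

Each round is a bijection because the two states sharing the same $R$ are sent to distinct outputs, which is why the assertion holds for any realization of $F$.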

In [9], Morris, Rogaway and Stegers prove the following.

Theorem 8

[9] Let $T=\mathcal{s}(2\mathcal{m}-1)$ for some whole number $\mathcal{s}$, where $2^{\mathcal{m}}=|\mathcal{M}|$. Then

(MiTh,CiTh)i=1q(Miu,Ciu)i=1qTVq𝓈+1(4𝓂q2𝓂)𝓈.\lVert({{M}}^{{\rm Th}}_{i},C^{{\rm Th}}_{i})_{i=1}^{q}-({{M}}^{u}_{i},C^{u}_{i})_{i=1}^{q}\rVert_{TV}\leq\frac{q}{\mathcal{s}+1}\bigg{(}\frac{4\mathcal{m}q}{2^{\mathcal{m}}}\bigg{)}^{\mathcal{s}}.

Combining this result with a bound on

(Mi,Ci)i=1q(MiTh,CiTh)i=1qTV\lVert({{M}}_{i},C_{i})_{i=1}^{q}-({{M}}^{{\rm Th}}_{i},C^{{\rm Th}}_{i})_{i=1}^{q}\rVert_{TV} (27)

will give the claimed bound on the adversary’s advantage. To bound (27) we use a hybrid argument.
 
For $0\leq l\leq q$, let $C^{l}_{i}$ be the result of applying $T$ rounds of the Thorp shuffle to $M_{i}$, where the round function used to determine the “random bits” of the shuffle is defined as follows:

  1. If $i\leq l$ then we use the round function $F_{K}$.

  2. If $l+1\leq i\leq q$, then at any step not already determined by the round functions used to evaluate the first $l$ queries, we use a uniform random function $F$.

Define $Q_{l}=({{M}}_{i},{{C}}^{l}_{i})_{i=1}^{q}$. Thus, the first $l$ queries of $Q_{l}$ correspond to the Thorp shuffle using the pseudorandom round function $F_{K}$ and the final $q-l$ queries correspond to the Thorp shuffle using a uniform random round function (except at steps that are already “forced” by the trajectories of the first $l$ queries). Note that $Q_{0}$ corresponds to the Thorp shuffle with a uniform random round function and $Q_{q}$ corresponds to the Thorp shuffle with the “big key” pseudorandom round function $F_{K}$. Thus, the triangle inequality gives us

(Mi,Ci)i=1q(MiTh,CiTh)i=1qTV\displaystyle\lVert({{M}}_{i},C_{i})_{i=1}^{q}-({{M}}^{{\rm Th}}_{i},C^{{\rm Th}}_{i})_{i=1}^{q}\rVert_{TV} \displaystyle\leq k=0q1Qk+1QkTV\displaystyle\sum_{k=0}^{q-1}\lVert Q_{k+1}-Q_{k}\rVert_{TV}

To bound the terms of this sum, we prove the following lemma:

Lemma 9

For all $s$ with $0\leq s\leq q-1$ we have

Qs+1QsTVT2[h1(1α+n𝓀)]n/2+T2𝓂.\lVert Q_{s+1}-Q_{s}\rVert_{TV}\leq{T\over 2}\Bigl{[}h^{-1}\Bigl{(}1-{\alpha+n\over\mathcal{k}}\Bigr{)}\Bigr{]}^{n/2}\,+T\cdot 2^{-\mathcal{m}}.

Proof: It is sufficient to bound

ZsZsTV,\lVert Z_{s}-Z^{\prime}_{s}\rVert_{TV},

where

Z_{s}=(Q_{s+1},{\mathcal{T}}_{s+1});\qquad Z^{\prime}_{s}=(Q_{s},{\mathcal{T}}_{s+1}),

and ${\mathcal{T}}_{s+1}=(X_{0}(M_{s+1}),\dots,X_{T}({{M}}_{s+1}))$ is the trajectory of message ${{M}}_{s+1}$.
 
Again, we use a hybrid argument. For ii with 1iT1\leq i\leq T, let Ps,iP_{s,i} be the algorithm defined as follows:
 
Algorithm $P_{s,i}$: For the first $s$ queries and for the first $i$ rounds of query $s+1$, we use the pseudorandom function $F_{K}$. For rounds $i+1,\dots,T$ of query $s+1$ and for queries $s+2,\dots,q$, any random bit (that was not already determined by the previous queries) will be defined using a uniform random function $F$.
 
Let Ws,iW_{s,i} be the value of ((Mi,Ci)i=1q,𝒯s+1)\Bigl{(}({{M}}_{i},{{C}}_{i})_{i=1}^{q},{\mathcal{T}}_{s+1}\Bigr{)} when algorithm Ps,iP_{s,i} is followed. Note that Ws,TW_{s,T} has the distribution of ZsZ_{s} and Ws,0W_{s,0} has the distribution of ZsZ^{\prime}_{s}. Therefore, by another application of the triangle inequality, we have

ZsZsTVi=0T1Ws,i+1Ws,iTV\lVert Z_{s}-Z_{s}^{\prime}\rVert_{TV}\leq\sum\limits_{i=0}^{T-1}\lVert W_{s,i+1}-W_{s,i}\rVert_{TV}

The only difference between Ps,i+1P_{s,i+1} and Ps,iP_{s,i} is the bit used in round i+1i+1 of query s+1s+1. If this bit was determined by the previous queries, then it has the same value in both Ps,i+1P_{s,i+1} and Ps,iP_{s,i}. Otherwise, it is a Bernoulli(12){{\rm Bernoulli}({{{\textstyle{1\over 2}}}})} random variable in Ps,iP_{s,i} and it uses the “big key” pseudorandom function FKF_{K} in Ps,i+1P_{s,i+1}. It is enough to show that the claimed bound on the total variation distance holds even if we condition on the input messages M1,,MqM_{1},\dots,M_{q}. So let m1,,mqm_{1},\dots,m_{q} be arbitrary input messages. Let Ψ{\Psi} be a function on {0,1}𝓀\{0,1\}^{\mathcal{k}} such that Ψ(K){\Psi}(K) encodes

  1. $\Phi(K)$;

  2. the values of $C_{1},\dots,C_{s}$ and $X_{1}(m_{s+1}),\dots,X_{i}(m_{s+1})$ when algorithm $P_{s,i+1}$ is used with key $K$ and input messages $m_{1},\dots,m_{s+1}$.

Let L=Ψ(K)L={\Psi}(K). Note that there are at most

2𝓁(2𝓂)s2i=2𝓁+𝓂s+i2𝓁+𝓂q+T2^{\mathcal{l}}\cdot\left(2^{\mathcal{m}}\right)^{s}\cdot 2^{i}=2^{\mathcal{l}+\mathcal{m}s+i}\leq 2^{\mathcal{l}+\mathcal{m}q+T}

possible values of $L$. Define $S_{l}:={\Psi}^{-1}(l)$. We can use Lemma 2 to get a lower bound on the entropy ${\mathbf{H}}(S_{L})$ that holds with high probability. More precisely, Lemma 2 gives

\mathbb{P}\Big{(}{\mathbf{H}}(S_{L})\leq\mathcal{k}-\mathcal{l}-\mathcal{m}(q+1)-T\Big{)}\leq 2^{-\mathcal{m}}\,.

On the event that ${\mathbf{H}}(S_{L})>\mathcal{k}-\mathcal{l}-\mathcal{m}(q+1)-T$, we can use Lemma 7 to bound the total variation distance. Using Lemma 7 with $\alpha=\mathcal{l}+\mathcal{m}(q+1)+T$ and combining this with Corollary 6 shows that if $B$ is the random bit generated by Algorithm $P_{s,i+1}$ then

𝔼(BBernoulli(12)TV)12[h1(1α+n𝓀)]n/2+2𝓂.{{\mathbb{E}}}\Bigl{(}\lVert B-{{\rm Bernoulli}({{{\textstyle{1\over 2}}}})}\rVert_{TV}\Bigr{)}\leq{1\over 2}\Bigl{[}h^{-1}\Bigl{(}1-{\alpha+n\over\mathcal{k}}\Bigr{)}\Bigr{]}^{n/2}+2^{-\mathcal{m}}\,.

Since this one random bit is the only nondeterministic difference between $W_{s,i+1}$ and $W_{s,i}$, we have

Ws,i+1Ws,iTV12[h1(1α+n𝓀)]n/2+2𝓂.\lVert W_{s,i+1}-W_{s,i}\rVert_{TV}\leq{1\over 2}\Bigl{[}h^{-1}\Bigl{(}1-{\alpha+n\over\mathcal{k}}\Bigr{)}\Bigr{]}^{n/2}+2^{-\mathcal{m}}\,.

This quantity is independent of ii, so

ZsZsTVT2[h1(1α+n𝓀)]n/2+T2𝓂.\lVert Z_{s}-Z_{s}^{\prime}\rVert_{TV}\leq{T\over 2}\Bigl{[}h^{-1}\Bigl{(}1-{\alpha+n\over\mathcal{k}}\Bigr{)}\Bigr{]}^{n/2}+T\cdot 2^{-\mathcal{m}}\,.

\square

Now we use Lemma 9 to bound the sum,

(Mi,Ci)i=1q(MiTh,CiTh)i=1qTV\displaystyle\lVert({{M}}_{i},C_{i})_{i=1}^{q}-({{M}}^{{\rm Th}}_{i},C^{{\rm Th}}_{i})_{i=1}^{q}\rVert_{TV} \displaystyle\leq k=0q1Qk+1QkTV\displaystyle\sum_{k=0}^{q-1}\lVert Q_{k+1}-Q_{k}\rVert_{TV}
\displaystyle\leq qT2[h1(1α+n𝓀)]n/2+qT2𝓂.\displaystyle{qT\over 2}\Bigl{[}h^{-1}\Bigl{(}1-{\alpha+n\over\mathcal{k}}\Bigr{)}\Bigr{]}^{n/2}+qT\cdot 2^{-\mathcal{m}}\,.

Combining this with Theorem 8 and another application of the triangle inequality gives

(Mi,Ci)i=1q(Miu,Ciu)i=1qTVqT2[h1(1α+n𝓀)]n/2+qT2𝓂+q𝓈+1(4𝓂q2𝓂)𝓈.\lVert({{M}}_{i},C_{i})_{i=1}^{q}-({{M}}^{u}_{i},C^{u}_{i})_{i=1}^{q}\rVert_{TV}\leq{qT\over 2}\Bigl{[}h^{-1}\Bigl{(}1-{\alpha+n\over\mathcal{k}}\Bigr{)}\Bigr{]}^{n/2}+qT\cdot 2^{-\mathcal{m}}+\frac{q}{\mathcal{s}+1}\bigg{(}\frac{4\mathcal{m}q}{2^{\mathcal{m}}}\bigg{)}^{\mathcal{s}}\,. (28)

Finally, we consider the effect of random oracle calls made by the adversary before calculation of Φ\Phi. Let 𝒞{\mathcal{C}} be the set of random oracle calls made by the adversary. Note that

𝒞=i=1T𝒞i,{\mathcal{C}}=\cup_{i=1}^{T}{\mathcal{C}}_{i},

where ${\mathcal{C}}_{i}$ is the set of random oracle calls whose input is $(R,i)$ for some $R$. Let $E$ be the event that at least one of the random oracle calls used to evaluate the ${{M}}_{i}$ is in ${\mathcal{C}}$. Note that for a uniform random message, the value of $R$ (the rightmost $\mathcal{m}-1$ bits) after any number of Thorp shuffle rounds is uniform over $\{0,1\}^{\mathcal{m}-1}$. Hence, the probability that the random oracle call used in stage $i$ of the shuffle is in ${\mathcal{C}}_{i}$ is $|{\mathcal{C}}_{i}|/2^{\mathcal{m}-1}$. Taking a union bound over queries and time steps gives

(E)\displaystyle\mathbb{P}(E) \displaystyle\leq qi=1T|𝒞i|2𝓂1\displaystyle q\sum_{i=1}^{T}\frac{|{\mathcal{C}}_{i}|}{2^{\mathcal{m}-1}}
=\displaystyle= q|𝒞|2𝓂1\displaystyle\frac{q|{\mathcal{C}}|}{2^{\mathcal{m}-1}}
=\displaystyle= q𝓇2𝓂1.\displaystyle\frac{q\mathcal{r}}{2^{\mathcal{m}-1}}.

On the event $E^{C}$, the adversary's random oracle calls are disjoint from all the oracle calls used to compute each $M_{i}$. Since distinct random oracle calls are independent of each other, the information from the adversary's random oracle calls is irrelevant to determining whether the adversary is in world 0 or world 1. Therefore, unless $E$ occurs, the adversary is no better than an adversary with no random oracle calls. We complete the proof of the theorem by adding $\mathbb{P}(E)$ to the advantage of an adversary with no random oracle calls:

𝐌𝐚𝐱𝐀𝐝𝐯𝓇,𝐪\displaystyle\mathbf{MaxAdv}_{\mathbf{\mathcal{r},q}} \displaystyle\leq 𝐌𝐚𝐱𝐀𝐝𝐯𝟎,𝐪+(E)\displaystyle\mathbf{MaxAdv}_{\mathbf{0,q}}+\mathbb{P}(E)
\displaystyle\leq qT2[h1(1α+n𝓀)]n/2+qT2𝓂+q𝓈+1(4𝓂q2𝓂)𝓈+q𝓇2𝓂1.\displaystyle{qT\over 2}\Bigl{[}h^{-1}\Bigl{(}1-{\alpha+n\over\mathcal{k}}\Bigr{)}\Bigr{]}^{n/2}+qT\cdot 2^{-\mathcal{m}}+\frac{q}{\mathcal{s}+1}\bigg{(}\frac{4\mathcal{m}q}{2^{\mathcal{m}}}\bigg{)}^{\mathcal{s}}+\frac{q\mathcal{r}}{2^{\mathcal{m}-1}}\,.

References

  • [1] M. Bellare, D. Kane, and P. Rogaway. Big-Key Symmetric Encryption: Resisting Key Exfiltration. CRYPTO 2016, pp. 373-402, 2016
  • [2] M. Bellare and P. Rogaway. Random Oracles are Practical: A Paradigm for Designing Efficient Protocols. ACM Conference on Computer and Communications Security, pp. 62–73, 1993.
  • [3] G. Grimmett. Percolation. Springer-Verlag, 1999.
  • [4] D. Levin, Y. Peres, and E. Wilmer. Markov chains and mixing times. American Mathematical Society, 2008.
  • [5] M. Luby and C. Rackoff. How to Construct Pseudorandom Permutations from Pseudorandom Functions. SIAM Journal on Computing, 17(2), pp. 373–386, 1988.
  • [6] B. Morris. Improved mixing time bounds for the Thorp shuffle. Combinatorics, Probability and Computing, 22(1), 2013.
  • [7] B. Morris. The mixing time of the Thorp shuffle. SIAM J. on Computing, 38(2), pp. 484–504, 2008. Earlier version in STOC 2005.
  • [8] B. Morris and P. Rogaway. Sometimes-Recurse shuffle: Almost-random permutations in logarithmic expected time. EUROCRYPT 2014, LNCS vol. 8441, Springer, pp. 311–326, 2014.
  • [9] B. Morris, P. Rogaway, and T. Stegers. How to encipher messages on a small domain: deterministic encryption and the Thorp shuffle. CRYPTO 2009, LNCS vol. 5677, Springer, pp. 286–302, 2009.
  • [10] B. Morris and Y. Peres. Evolving sets, mixing and heat kernel bounds. Probability Theory and Related Fields, 2005.
  • [11] R. O’Donnell. Analysis of Boolean Functions. Cambridge University Press, 2014.
  • [12] F. Topsøe. Bounds for entropy and divergence for distributions over a two-element set. Journal of Inequalities in Pure and Applied Mathematics, 2001.