
Blind Fingerprinting

Ying Wang and Pierre Moulin Y. Wang is with Qualcomm, Bedminster, NJ. P. Moulin is with the ECE Department, the Coordinated Science Laboratory, and the Beckman Institute at the University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA. Email: moulin@ifp.uiuc.edu. This work was supported by NSF under grants CCR 03-25924, CCF 06-35137 and CCF 07-29061. Part of this work was presented at ISIT’06 in Seattle, WA.
Abstract

We study blind fingerprinting, where the host sequence into which fingerprints are embedded is partially or completely unknown to the decoder. This problem relates to a multiuser version of the Gel’fand-Pinsker problem. The number of colluders and the collusion channel are unknown, and the colluders and the fingerprint embedder are subject to distortion constraints.

We propose a conditionally constant-composition random binning scheme and a universal decoding rule and derive the corresponding false-positive and false-negative error exponents. The encoder is a stacked binning scheme and makes use of an auxiliary random sequence. The decoder is a maximum doubly-penalized mutual information decoder, where the significance of each candidate coalition is assessed relative to a threshold that trades off false-positive and false-negative error exponents. The penalty is proportional to coalition size and is a function of the conditional type of the host sequence. Positive exponents are obtained at all rates below a certain value, which is therefore a lower bound on public fingerprinting capacity. We conjecture that this value is the public fingerprinting capacity. A simpler threshold decoder is also given, which has similar universality properties but lower achievable rates. An upper bound on public fingerprinting capacity is also derived.

Index Terms. Fingerprinting, traitor tracing, watermarking, data hiding, randomized codes, universal codes, method of types, maximum mutual information decoder, minimum equivocation decoder, channel coding with side information, random binning, capacity, error exponents, multiple access channels, model order selection.

I Introduction

Content fingerprinting finds applications to document protection for multimedia distribution, broadcasting, and traitor tracing [1, 2, 3, 4]. A covertext—image, video, audio, or text—is to be distributed to many users. A fingerprint, a mark unique to each user, is embedded into each copy of the covertext. In a collusion attack, several users may combine their copies in an attempt to “remove” their fingerprints and to forge a pirated copy. The distortion between the pirated copy and the colluding copies is bounded by a certain tolerance level. To trace the forgery back to the coalition members, we need fingerprinting codes that can reliably identify the fingerprints of those members. Essentially, from a communication viewpoint, the fingerprinting problem is a multiuser version of the watermarking problem [5, 6, 7, 8, 9, 10]. For watermarking, the attack is by one user and is based on one single copy, whereas for fingerprinting, the attack is modeled as a multiple-access channel (MAC). The covertext plays the role of side information to the encoder and possibly to the decoder.

Depending on the availability of the original covertext to the decoder, there are two basic versions of the problem: private and public. In the private fingerprinting setup, the covertext is available to both the encoder and decoder. In the public fingerprinting setup, the covertext is available to the encoder but not to the decoder, and thus decoding performance is generally worse. However, public fingerprinting presents an important advantage over private fingerprinting, in that it does not require the vast storage and computational resources that are needed for media registration in a large database. For example, a DVD player could detect fingerprints from a movie disc and refuse to play it if fingerprints other than the owner’s are present. Alternatively, Web crawling programs can be used to automatically search for unauthorized content on the Internet or other public networks [3].

The scenario considered in this paper is one where a degraded version $S^d$ of each host symbol $S$ is available to the decoder. Private and public fingerprinting are obtained as special cases with $S^d=S$ and $S^d=\emptyset$, respectively. We refer to this scenario as either blind or semiprivate fingerprinting. The motivation is analogous to semiprivate watermarking [11], where some information about the host signal is provided to the receiver in order to improve decoding performance. This may be necessary to guarantee an acceptable performance level when the number of colluders is large.

The capacity and reliability limits of private fingerprinting have been studied in [7, 8, 9, 10]. The decoder of [10] is a variation of Liu and Hughes’ minimum equivocation decoder [12], accounting for the presence of side information and for the fact that the number of channel inputs is unknown. Two basic types of decoders are of interest: detect-all and detect-one. The detect-all decoder aims to catch all members of the coalition and an error occurs if some colluder escapes detection. The detect-one decoder is content with catching at least one of the culprits and an error occurs only when none of the colluders is identified. A third type of error (arguably the most damaging one) is a false positive, by which the decoder accuses an innocent user.

In the same way as fingerprinting is related to the MAC problem, blind fingerprinting is related to a multiuser extension of the Gel’fand-Pinsker problem. The capacity region for the latter problem is unknown. An inner region, achievable using random binning, was given in [13].

This paper derives random-coding exponents and an upper bound on detect-all capacity for semiprivate fingerprinting. Neither the encoder nor the decoder knows the number of colluders. The collusion channel has arbitrary memory but is subject to a distortion constraint between the pirated copy and the colluding copies. Our fingerprinting scheme uses random binning because, unlike in the private setup, the availability of side information to the encoder and decoder is asymmetric. To optimize the error exponents, we propose an extension of the stacked-binning scheme that was developed for single-user channel coding with side information [11]. Here the codebook consists of a stack of variable-size codeword-arrays indexed by the conditional type of the covertext sequence. The decoder is a minimum doubly-penalized equivocation (M2PE) decoder or equivalently, a maximum doubly-penalized mutual information (M2PMI) decoder.

The proposed fingerprinting system is universal in that it can cope with unknown collusion channels and an unknown number of colluders, as in the private fingerprinting setup of [10]. A tunable parameter $\Delta$ trades off false-positive and false-negative error exponents. The derivation of these exponents combines techniques from [10] and [11]. A preliminary version of our work, assuming a fixed number of colluders, was given in [14, 15].

I-A Organization of This Paper

A mathematical statement of our generic fingerprinting problem is given in Sec. II, together with the basic definitions of error probabilities, capacity, error exponents, and fair coalitions. Sec. III presents our random coding scheme. Sec. IV presents a simple but suboptimal decoder that compares empirical mutual information scores between received data and individual fingerprints, and outputs a guilty decision whenever the score exceeds a certain tunable threshold. Sec. V presents a joint decoder that assigns a penalized empirical mutual information score to candidate coalitions and selects the coalition with the highest score. Sec. VI establishes an upper bound on blind fingerprinting capacity under the detect-all criterion. Finally, conclusions are given in Sec. VII. The proofs of the theorems are given in appendices.

I-B Notation

We use uppercase letters for random variables, lowercase letters for their individual values, calligraphic letters for finite alphabets, and boldface letters for sequences. We denote by ${\cal M}^{\star}$ the set of sequences of arbitrary length (including 0) whose elements are in ${\cal M}$. The probability mass function (p.m.f.) of a random variable $X\in{\cal X}$ is denoted by $p_X=\{p_X(x),\,x\in{\cal X}\}$. The entropy of a random variable $X$ is denoted by $H(X)$, and the mutual information between two random variables $X$ and $Y$ is denoted by $I(X;Y)=H(X)-H(X|Y)$. Should the dependency on the underlying p.m.f.s be explicit, we write the p.m.f.s as subscripts, e.g., $H_{p_X}(X)$ and $I_{p_X,p_{Y|X}}(X;Y)$. The Kullback-Leibler divergence between two p.m.f.s $p$ and $q$ is denoted by $D(p\|q)$; the conditional Kullback-Leibler divergence of $p_{Y|X}$ and $q_{Y|X}$ given $p_X$ is denoted by $D(p_{Y|X}\|q_{Y|X}|p_X)=D(p_{Y|X}\,p_X\|q_{Y|X}\,p_X)$. All logarithms are in base 2 unless specified otherwise.

Denote by $p_{\mathbf{x}}$ the type, or empirical p.m.f., induced by a sequence $\mathbf{x}\in{\cal X}^N$. The type class $T_{\mathbf{x}}$ is the set of all sequences of type $p_{\mathbf{x}}$. Likewise, we denote by $p_{\mathbf{x}\mathbf{y}}$ the joint type of a pair of sequences $(\mathbf{x},\mathbf{y})\in{\cal X}^N\times{\cal Y}^N$ and by $T_{\mathbf{x}\mathbf{y}}$ the type class associated with $p_{\mathbf{x}\mathbf{y}}$. The conditional type $p_{\mathbf{y}|\mathbf{x}}$ of a pair of sequences $(\mathbf{x},\mathbf{y})$ is defined by $p_{\mathbf{y}|\mathbf{x}}(y|x)=p_{\mathbf{x}\mathbf{y}}(x,y)/p_{\mathbf{x}}(x)$ for all $x\in{\cal X}$ such that $p_{\mathbf{x}}(x)>0$. The conditional type class $T_{\mathbf{y}|\mathbf{x}}$ given $\mathbf{x}$ is the set of all sequences $\tilde{\mathbf{y}}$ such that $(\mathbf{x},\tilde{\mathbf{y}})\in T_{\mathbf{x}\mathbf{y}}$. We denote by $H(\mathbf{x})$ the empirical entropy of the p.m.f. $p_{\mathbf{x}}$, by $H(\mathbf{y}|\mathbf{x})$ the empirical conditional entropy, and by $I(\mathbf{x};\mathbf{y})$ the empirical mutual information for the joint p.m.f. $p_{\mathbf{x}\mathbf{y}}$.
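The empirical quantities above are all functionals of the joint type. As a minimal illustrative sketch (not part of the paper), the empirical mutual information $I(\mathbf{x};\mathbf{y})$ can be computed directly from the joint type of two sequences:

```python
from collections import Counter
from math import log2

def empirical_mi(x, y):
    """Empirical mutual information I(x; y), in bits, of two equal-length
    sequences, computed from their joint type p_xy and marginal types p_x, p_y.
    Illustrative sketch only."""
    assert len(x) == len(y)
    n = len(x)
    p_xy = Counter(zip(x, y))   # joint type (unnormalized counts)
    p_x = Counter(x)            # marginal type of x
    p_y = Counter(y)            # marginal type of y
    mi = 0.0
    for (a, b), c in p_xy.items():
        mi += (c / n) * log2((c / n) / ((p_x[a] / n) * (p_y[b] / n)))
    return mi
```

For example, two identical binary sequences of balanced type give $I(\mathbf{x};\mathbf{y})=H(\mathbf{x})=1$ bit, while sequences whose joint type factors into the product of the marginal types give 0 bits.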

We use the calligraphic fonts $\mathscr{P}_X$ and $\mathscr{P}_X^{[N]}$ to represent the set of all p.m.f.s and all empirical p.m.f.s, respectively, on the alphabet ${\cal X}$. Likewise, $\mathscr{P}_{Y|X}$ and $\mathscr{P}_{Y|X}^{[N]}$ denote the sets of all conditional p.m.f.s and all empirical conditional p.m.f.s on the alphabet ${\cal Y}$. A special symbol $\mathscr{W}_K$ will be used to denote the feasible set of collusion channels $p_{Y|X_1,\cdots,X_K}$ that can be selected by a size-$K$ coalition.

Mathematical expectation is denoted by the symbol $\mathbb{E}$. The shorthands $a_N \doteq b_N$ and $a_N \mathrel{\dot{\le}} b_N$ denote asymptotic relations on the exponential scale, respectively $\lim_{N\to\infty}\frac{1}{N}\log\frac{a_N}{b_N}=0$ and $\limsup_{N\to\infty}\frac{1}{N}\log\frac{a_N}{b_N}\le 0$. We define $|t|^+\triangleq\max(t,0)$ and $\exp_2(t)\triangleq 2^t$. The indicator function of a set ${\cal A}$ is denoted by $\mathds{1}_{\{x\in{\cal A}\}}$. Finally, we adopt the convention that the minimum of a function over an empty set is $+\infty$ and the maximum of a function over an empty set is 0.

II Statement of the Problem

II-A Overview

Figure 1: Model for the semiprivate (blind) fingerprinting game, where $\mathbf{S}^d$ is a degraded version of the covertext $\mathbf{S}$. Private and public fingerprinting arise as special cases with $\mathbf{S}^d=\mathbf{S}$ and $\mathbf{S}^d=\emptyset$, respectively.

Our model for blind fingerprinting is diagrammed in Fig. 1. Let ${\cal S}$, ${\cal X}$, and ${\cal Y}$ be three finite alphabets. The covertext sequence $\mathbf{S}=(S_1,\cdots,S_N)\in{\cal S}^N$ consists of $N$ independent and identically distributed (i.i.d.) samples drawn from a p.m.f. $p_S(s)$, $s\in{\cal S}$. A random variable $V$ taking values in an alphabet ${\cal V}_N$ is shared between the encoder and decoder, and not publicly revealed. The random variable $V$ is independent of $\mathbf{S}$ and plays the role of a cryptographic key. There are $2^{NR}$ users, each of whom receives a fingerprinted copy:

\[
\mathbf{X}_m = f_N(\mathbf{S}, V, m), \quad 1\le m\le 2^{NR}, \tag{2.1}
\]

where $f_N : {\cal S}^N\times{\cal V}_N\times\{1,\cdots,2^{NR}\}\to{\cal X}^N$ is the encoding function, and $m$ is the index of the user. The encoder binds each fingerprinted copy $\mathbf{x}_m$ to the covertext $\mathbf{s}$ via a distortion constraint. Let $d : {\cal S}\times{\cal X}\to\mathbb{R}^+$ be the distortion measure and $d^N(\mathbf{s},\mathbf{x})=\frac{1}{N}\sum_{i=1}^N d(s_i,x_i)$ the extension of this measure to length-$N$ sequences. The code $f_N$ is subject to the distortion constraint

\[
d^N(\mathbf{s},\mathbf{x}_m)\le D_1, \quad 1\le m\le 2^{NR}. \tag{2.2}
\]
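The per-letter distortion constraint (2.2) is straightforward to check for a concrete distortion measure. A minimal sketch (with Hamming distortion as an illustrative choice of $d$; names are ours, not the paper's):

```python
def dN(s, x, d):
    """Average per-letter distortion d^N(s, x) between a covertext s and a
    fingerprinted copy x, for a symbolwise distortion measure d.  Sketch."""
    assert len(s) == len(x)
    return sum(d(si, xi) for si, xi in zip(s, x)) / len(s)

def satisfies_embedding_constraint(s, copies, d, D1):
    """Constraint (2.2): every fingerprinted copy x_m must stay within
    average distortion D1 of the covertext s."""
    return all(dN(s, x, d) <= D1 for x in copies)

# Illustrative Hamming distortion.
hamming = lambda a, b: 0.0 if a == b else 1.0
```

Here `copies` plays the role of the collection $\{\mathbf{x}_m\}$, one sequence per user.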

Let ${\cal K}\triangleq\{m_1,\,m_2,\,\cdots,\,m_K\}$ be a coalition of $K$ users, called colluders. No constraints are imposed on the formation of coalitions. The colluders combine their copies $\mathbf{X}_{\cal K}\triangleq\{\mathbf{X}_m,\,m\in{\cal K}\}$ to produce a pirated copy $\mathbf{Y}\in{\cal Y}^N$. Without loss of generality, we assume that $\mathbf{Y}$ is generated stochastically as the output of a collusion channel $p_{\mathbf{Y}|\mathbf{X}_{\cal K}}$. Fidelity constraints are imposed on $p_{\mathbf{Y}|\mathbf{X}_{\cal K}}$ to ensure that $\mathbf{Y}$ is “close” to the fingerprinted copies $\mathbf{X}_m,\,m\in{\cal K}$. These constraints can take the form of distortion constraints, analogously to (2.2). They are formulated below and result in the definition of a feasible class $\mathscr{W}_K$ of attacks.

The decoder knows neither $K$ nor the collusion channel $p_{\mathbf{Y}|\mathbf{X}_{\cal K}}$ selected by the $K$ colluders, and has access to the pirated copy $\mathbf{Y}$, the secret key $V$, as well as to $\mathbf{S}^d$, a degraded version of the host $\mathbf{S}$. To simplify the exposition, the degradation arises via a deterministic symbolwise mapping $h : {\cal S}\to{\cal S}^d$. The sequence $\mathbf{s}^d$, with $s_i^d=h(s_i)$, could represent a coarse version of $\mathbf{s}$, or some other features of $\mathbf{s}$. Two special cases are private fingerprinting, where $S^d=S$, and public fingerprinting, where $S^d=\emptyset$. The decoder produces an estimate

\[
\hat{\cal K} = g_N(\mathbf{Y}, \mathbf{S}^d, V) \tag{2.3}
\]

of the coalition. A possible decision is the empty set, $\hat{\cal K}=\emptyset$, which is the reasonable choice when an accusation would be unreliable. To summarize, we have

Definition II.1

A randomized rate-$R$, length-$N$ fingerprinting code $(f_N,g_N)$ with embedding distortion $D_1$ is a pair of an encoder mapping $f_N : {\cal S}^N\times{\cal V}_N\times\{1,2,\cdots,2^{NR}\}\to{\cal X}^N$ and a decoder mapping $g_N : {\cal Y}^N\times({\cal S}^d)^N\times{\cal V}_N\to\{1,2,\cdots,2^{NR}\}^{\star}$.

The randomization is via the secret key $V$ and can take the form of permutations of the symbol positions $\{1,2,\cdots,N\}$, permutations of the $2^{NR}$ fingerprint assignments, and an auxiliary time-sharing sequence, as in [6], [10], [16].

We now state the attack models and define the error probabilities, capacities, and error exponents.

II-B Collusion Channels

The conditional type $p_{\mathbf{y}|\mathbf{x}_{\cal K}}$ is a random variable whose conditional distribution given $\mathbf{x}_{\cal K}$ depends on the collusion channel $p_{\mathbf{Y}|\mathbf{X}_{\cal K}}$. Our fidelity constraint on the coalition is of the general form

\[
\Pr[p_{\mathbf{y}|\mathbf{x}_{\cal K}}\in\mathscr{W}_K]=1, \tag{2.4}
\]

where $\mathscr{W}_K$ is a convex subset of $\mathscr{P}_{Y|X_{\cal K}}$. That is, the empirical conditional p.m.f. of the pirated copy given the marked copies is restricted. Examples of $\mathscr{W}_K$ are given in [10], including hard distortion constraints on the coalition:

\[
\mathscr{W}_K=\left\{p_{Y|X_{\cal K}} : \sum_{x_{\cal K},y} p_{X_{\cal K}}(x_{\cal K})\,p_{Y|X_{\cal K}}(y|x_{\cal K})\,\mathbb{E}_{\phi}\,d_2(\phi(x_{\cal K}),y)\le D_2\right\} \tag{2.5}
\]

where $\phi : {\cal X}^K\to{\cal S}$ is a (possibly randomized) permutation-invariant estimator $\hat{S}=\phi(X_{\cal K})$ of each host signal sample based on the corresponding marked samples; $d_2 : {\cal S}\times{\cal Y}\to\mathbb{R}^+$ is the coalition’s distortion function; $p_{X_{\cal K}}$ is a reference p.m.f.; and $D_2$ is the maximum allowed distortion. Another possible choice for $\mathscr{W}_K$ is obtained using the Boneh-Shaw constraint [1, 10].

Fair Coalitions. Denote by $\pi$ a permutation of the elements of ${\cal K}$. The set of fair, feasible collusion channels is the subset of $\mathscr{W}_K$ consisting of permutation-invariant channels:

\[
\mathscr{W}_K^{fair}=\left\{p_{Y|X_{\cal K}}\in\mathscr{W}_K : p_{Y|X_{\pi({\cal K})}}=p_{Y|X_{\cal K}},\;\forall\pi\right\}. \tag{2.6}
\]

The collusion channel $p_{\mathbf{Y}|\mathbf{X}_{\cal K}}$ is said to be fair if $\Pr[p_{\mathbf{y}|\mathbf{x}_{\cal K}}\in\mathscr{W}_K^{fair}]=1$. For any fair collusion channel, the conditional type $p_{\mathbf{y}|\mathbf{x}_{\cal K}}$ is invariant to permutations of the colluders.

Strongly exchangeable collusion channels [7]. Now denote by $\pi$ a permutation of the samples of a length-$N$ sequence. For strongly exchangeable channels, $p_{\mathbf{Y}|\mathbf{X}_{\cal K}}(\pi\mathbf{y}|\pi\mathbf{x}_{\cal K})$ is independent of $\pi$, for every $(\mathbf{x}_{\cal K},\mathbf{y})$. The channel is defined by a probability assignment $\Pr[T_{\mathbf{y}|\mathbf{x}_{\cal K}}]$ on the conditional type classes. The distribution of $\mathbf{Y}$ conditioned on $\mathbf{Y}\in T_{\mathbf{y}|\mathbf{x}_{\cal K}}$ is uniform:

\[
p_{\mathbf{Y}|\mathbf{X}_{\cal K}}(\tilde{\mathbf{y}}|\mathbf{x}_{\cal K})=\frac{\Pr[T_{\mathbf{y}|\mathbf{x}_{\cal K}}]}{|T_{\mathbf{y}|\mathbf{x}_{\cal K}}|},\quad\forall\tilde{\mathbf{y}}\in T_{\mathbf{y}|\mathbf{x}_{\cal K}}. \tag{2.7}
\]
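The uniformity in (2.7) can be made concrete for a single conditioning sequence: drawing uniformly from $T_{\mathbf{y}|\mathbf{x}}$ amounts to permuting the output symbols uniformly at random within each group of positions sharing the same input symbol, which preserves the joint type. A sketch (our own illustration, not the paper's construction):

```python
import random
from collections import Counter, defaultdict

def sample_conditional_type_class(x, y):
    """Draw a sequence uniformly from the conditional type class T_{y|x}:
    within each set of positions where x takes the same symbol, apply a
    uniformly random permutation to the corresponding y-symbols.
    The output has the same joint type with x as y does.  Sketch."""
    positions = defaultdict(list)
    for i, a in enumerate(x):
        positions[a].append(i)
    out = [None] * len(x)
    for a, idxs in positions.items():
        vals = [y[i] for i in idxs]
        random.shuffle(vals)          # uniform permutation within the group
        for i, v in zip(idxs, vals):
            out[i] = v
    return out
```

Any output of this sampler lies in $T_{\mathbf{y}|\mathbf{x}}$, i.e., it induces exactly the joint type $p_{\mathbf{x}\mathbf{y}}$.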

II-C Error Probabilities

Let 𝒦{\cal K} be the actual coalition and 𝒦^=gN(𝐘,𝐒d,V)\hat{{\cal K}}=g_{N}({\mathbf{Y}},{\mathbf{S}}^{d},V) the decoder’s output. The three error probabilities of interest in this paper are the probability of false positives (one or more innocent users are accused),

\[
P_{FP}(f_N,g_N,p_{\mathbf{Y}|\mathbf{X}_{\cal K}})=\Pr[\hat{\cal K}\setminus{\cal K}\neq\emptyset],
\]

the probability of failing to catch a single colluder,

\[
P_e^{one}(f_N,g_N,p_{\mathbf{Y}|\mathbf{X}_{\cal K}})=\Pr[\hat{\cal K}\cap{\cal K}=\emptyset],
\]

and the probability of failing to catch the full coalition:

\[
P_e^{all}(f_N,g_N,p_{\mathbf{Y}|\mathbf{X}_{\cal K}})=\Pr[{\cal K}\not\subseteq\hat{\cal K}].
\]

These three probabilities are obtained by averaging over $\mathbf{S}$, $V$, and the output of the collusion channel $p_{\mathbf{Y}|\mathbf{X}_{\cal K}}$. In each case the worst-case probability is denoted by

\[
P_e(f_N,g_N,\mathscr{W}_K)=\max_{p_{\mathbf{Y}|\mathbf{X}_{\cal K}}} P_e(f_N,g_N,p_{\mathbf{Y}|\mathbf{X}_{\cal K}}), \tag{2.8}
\]

where $P_e$ denotes either $P_{FP}$, $P_e^{one}$, or $P_e^{all}$, and the maximum is over all feasible collusion channels, i.e., those satisfying (2.4).
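The three error events are set relations between the true coalition ${\cal K}$ and the decoder output $\hat{\cal K}$, so their empirical frequencies are easy to tally in a simulation. A minimal sketch (our illustration; `trials` is a hypothetical list of `(K, K_hat)` pairs of Python sets):

```python
def error_rates(trials):
    """Empirical frequencies of the three error events of Sec. II-C, from
    Monte Carlo trials given as (K, K_hat) pairs of sets.  Sketch."""
    n = len(trials)
    fp   = sum(bool(Khat - K)    for K, Khat in trials) / n  # innocent user accused
    one  = sum(not (Khat & K)    for K, Khat in trials) / n  # no colluder caught
    all_ = sum(not (K <= Khat)   for K, Khat in trials) / n  # some colluder missed
    return fp, one, all_
```

Note that a detect-one error (`Khat & K` empty) always implies a detect-all error (`K` not contained in `Khat`), consistent with $E^{all}\le E^{one}$ below.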

II-D Capacity and Random-Coding Exponents

Definition II.2

A rate $R$ is achievable for embedding distortion $D_1$, collusion class $\mathscr{W}_K$, and the detect-one criterion if there exists a sequence of $(N,\lceil 2^{NR}\rceil)$ randomized codes $(f_N,g_N)$ with maximum embedding distortion $D_1$, such that both $P_{e,N}^{one}(f_N,g_N,\mathscr{W}_K)$ and $P_{FP,N}(f_N,g_N,\mathscr{W}_K)$ vanish as $N\to\infty$.

Definition II.3

A rate $R$ is achievable for embedding distortion $D_1$, collusion class $\mathscr{W}_K$, and the detect-all criterion if there exists a sequence of $(N,\lceil 2^{NR}\rceil)$ randomized codes $(f_N,g_N)$ with maximum embedding distortion $D_1$, such that both $P_{e,N}^{all}(f_N,g_N,\mathscr{W}_K)$ and $P_{FP,N}(f_N,g_N,\mathscr{W}_K)$ vanish as $N\to\infty$.

Definition II.4

The fingerprinting capacities $C^{one}(D_1,\mathscr{W}_K)$ and $C^{all}(D_1,\mathscr{W}_K)$ are the suprema of all achievable rates with respect to the detect-one and detect-all criteria, respectively.

For random codes the error exponents corresponding to (2.8) are defined as

\[
E^{\{one,all,FP\}}(R,D_1,\mathscr{W}_K)=\liminf_{N\to\infty}\left[-\frac{1}{N}\log P_e^{\{one,all,FP\}}(f_N,g_N,\mathscr{W}_K)\right]. \tag{2.9}
\]

We have $C^{all}(D_1,\mathscr{W}_K)\le C^{one}(D_1,\mathscr{W}_K)$ and $E^{all}(R,D_1,\mathscr{W}_K)\le E^{one}(R,D_1,\mathscr{W}_K)$ because an error event for the detect-one problem is also an error event for the detect-all problem.

III Overview of Random-Coding Scheme

A brief overview of our scheme is given in this section. The decoders will be specified later. The scheme is designed to achieve a false-positive error exponent equal to $\Delta$ and assumes a nominal value $K_{nom}$ for the coalition size. Two arbitrarily large integers $L_w$ and $L_u$ are selected, defining alphabets ${\cal W}=\{1,2,\cdots,L_w\}$ and ${\cal U}=\{1,2,\cdots,L_u\}$, respectively. The parameters $\Delta, K_{nom}, L_w, L_u$ are used to identify a certain optimal type class $T_{\mathbf{w}}^*$ and conditional type classes $T_{U|S^dW}^*(\mathbf{s}^d,\mathbf{w})$, $T_{U|SW}^*(\mathbf{s},\mathbf{w})$, and $T_{X|USW}^*(\mathbf{u},\mathbf{s},\mathbf{w})$ for every possible $(\mathbf{u},\mathbf{s},\mathbf{w})$. Optimality is defined relative to either the thresholding decoder of Sec. IV or the joint decoder of Sec. V. The secret key $V$ consists of a random sequence $\mathbf{W}\in T_{\mathbf{w}}^*$ and the collection (3.1) of random codebooks indexed by $(\mathbf{s}^d,\mathbf{w},\lambda)$.

III-A Codebook

A random constant-composition code

\[
{\cal C}(\mathbf{s}^d,\mathbf{w},\lambda)=\{\mathbf{u}(l,m,\lambda),\;\,1\le l\le 2^{N\rho(\lambda)},\;1\le m\le 2^{NR}\} \tag{3.1}
\]

is generated for each pair of sequences $(\mathbf{s}^d,\mathbf{w})\in({\cal S}^d)^N\times T_{\mathbf{w}}^*$ and each conditional type $\lambda\in\mathscr{P}_{S|S^dW}^{[N]}$ by drawing $2^{N[R+\rho(\lambda)]}$ random sequences independently and uniformly from an optimized conditional type class $T_{U|S^dW}^*(\mathbf{s}^d,\mathbf{w})$, and arranging them into an array with $2^{NR}$ columns and $2^{N\rho(\lambda)}$ rows. Similarly to [11] (see Fig. 2 therein), we refer to $\rho(\lambda)$ as the depth parameter of the array.

III-B Encoding Scheme

Prior to encoding, a sequence $\mathbf{W}\in{\cal W}^N$ is drawn independently of $\mathbf{S}$ and uniformly from $T_{\mathbf{w}}^*$, and shared with the receiver. Given $(\mathbf{S},\mathbf{W})$, the encoder determines the conditional type $\lambda=p_{\mathbf{s}|\mathbf{s}^d\mathbf{w}}$ and performs the following two steps for each user $1\le m\le 2^{NR}$.

  1.

    Find $l$ such that $\mathbf{u}(l,m,\lambda)\in{\cal C}(\mathbf{s}^d,\mathbf{w},\lambda)\cap T_{U|SW}^*(\mathbf{s},\mathbf{w})$. If more than one such $l$ exists, pick one of them at random (with uniform distribution). Let $\mathbf{u}=\mathbf{u}(l,m,\lambda)$. If no such $l$ can be found, generate $\mathbf{u}$ uniformly from the conditional type class $T_{U|SW}^*(\mathbf{s},\mathbf{w})$.

  2.

    Generate $\mathbf{X}_m$ uniformly distributed over the conditional type class $T_{X|USW}^*(\mathbf{u},\mathbf{s},\mathbf{w})$, and assign this marked sequence to user $m$.
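The binning step 1 above can be sketched in a few lines: scan column $m$ of the codeword array for an entry landing in the target conditional type class, break ties uniformly, and fall back to a fresh uniform draw when the column contains no match. In this sketch (our illustration, not the paper's construction), `codebook[l][m]` stands in for $\mathbf{u}(l,m,\lambda)$ and `T_target` for $T_{U|SW}^*(\mathbf{s},\mathbf{w})$, represented as a finite set of candidate sequences:

```python
import random

def encode_user(m, codebook, T_target):
    """Step 1 of the encoding scheme, as a sketch: among the rows l of
    column m whose codeword u(l, m, lambda) lies in the target conditional
    type class, pick one uniformly at random; if none does, draw a fresh
    sequence uniformly from the target class."""
    matches = [row[m] for row in codebook if row[m] in T_target]
    if matches:
        return random.choice(matches)      # uniform over matching rows
    return random.choice(sorted(T_target)) # fallback: uniform over the class
```

The depth of the array (the number of rows, $2^{N\rho(\lambda)}$) is what makes a match likely; the fallback branch is the encoding-error event, shown later to have doubly exponentially small probability.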

III-C Worst Collusion Channel

The fingerprinting codes used in this paper are randomly-modulated (RM) codes [10, Def. 2.2]. For such codes we have the following proposition, which is a straightforward variation of [10, Prop. 2.1] with $\mathbf{S}^d$ in place of $\mathbf{S}$ at the decoder.

Proposition III.1

For any RM code $(f_N,g_N)$, the maximum of the error probability criteria (2.8) over all feasible $p_{\mathbf{Y}|\mathbf{X}_{\cal K}}$ is achieved by a strongly exchangeable collusion channel, as defined in (2.7).

To derive error exponents for such channels, it suffices to use the following upper bound:

\[
p_{\mathbf{Y}|\mathbf{X}_{\cal K}}(\tilde{\mathbf{y}}|\mathbf{x}_{\cal K})=\frac{\Pr[T_{\mathbf{y}|\mathbf{x}_{\cal K}}]}{|T_{\mathbf{y}|\mathbf{x}_{\cal K}}|}\le\frac{1}{|T_{\mathbf{y}|\mathbf{x}_{\cal K}}|}\,\mathds{1}_{\{p_{\mathbf{y}|\mathbf{x}_{\cal K}}\in\mathscr{W}_K\}},\quad\forall\,\tilde{\mathbf{y}}\in T_{\mathbf{y}|\mathbf{x}_{\cal K}}, \tag{3.2}
\]

which holds uniformly over all feasible probability assignments to the conditional type classes $T_{\mathbf{y}|\mathbf{x}_{\cal K}}$.

III-D Encoding and Decoding Errors

The array depth parameter $\rho(\lambda)$ takes the form

\[
\rho(\lambda)=I(\mathbf{u};\mathbf{s}|\mathbf{s}^d,\mathbf{w})+\epsilon,
\]

where 𝐮{\mathbf{u}} is any element of TU|SW(𝐬,𝐰)T_{U|SW}^{*}({\mathbf{s}},{\mathbf{w}}), and ϵ>0\epsilon>0 is an arbitrarily small number. The analysis shows that given any (𝐬,𝐰)({\mathbf{s}},{\mathbf{w}}), the probability of encoding errors vanishes doubly exponentially.

The analysis also shows that the decoding error probability is dominated by a single joint type class $T_{\mathbf{y}\mathbf{u}\mathbf{s}\mathbf{w}}$. Denote by $(\mathbf{y},\mathbf{u},\mathbf{s},\mathbf{w})$ an arbitrary representative of that class. The normalized logarithm of the size of the array is given by

\[
R+\rho(\lambda)=I(\mathbf{u};\mathbf{y}|\mathbf{s}^d,\mathbf{w})-\Delta,
\]

and the probability of false positives vanishes as $2^{-N\Delta}$.

IV Threshold Decoder

IV-A Decoding

The decoder has access to $(\mathbf{y},\mathbf{s}^d,\mathbf{w})$ but does not know the conditional type $\lambda=p_{\mathbf{s}|\mathbf{s}^d\mathbf{w}}$ realized at the encoder. The decoder evaluates the users one at a time and makes an innocent/guilty decision on each user independently of the other users. Specifically, the receiver outputs an estimated coalition $\hat{\cal K}$ if and only if $\hat{\cal K}$ satisfies the following condition:

m𝒦^:maxλ𝒫S|SdW[N]max1l2Nρ(λ)I(𝐮(l,m,λ);𝐲|𝐬d𝐰)ρ(λ)>R+Δ.\forall m\in\hat{{\cal K}}~:\quad\max_{\lambda\in\mathscr{P}_{S|S^{d}W}^{[N]}}\max_{1\leq l\leq 2^{N\rho(\lambda)}}I({\mathbf{u}}(l,m,\lambda);{\mathbf{y}}|{\mathbf{s}}^{d}{\mathbf{w}})-\rho(\lambda)>R+\Delta. (4.1)

If no such $\hat{\cal K}$ is found, the receiver outputs $\hat{\cal K}=\emptyset$. This decoder outputs all user indices whose empirical mutual information score, penalized by $\rho(\lambda)$, exceeds the threshold $R+\Delta$.
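Given precomputed empirical mutual information scores, decision rule (4.1) reduces to a per-user maximization followed by a threshold test. A minimal sketch (our illustration; `scores[m][lam]` is a hypothetical list of scores $I(\mathbf{u}(l,m,\lambda);\mathbf{y}|\mathbf{s}^d\mathbf{w})$ over the rows $l$, and `rho[lam]` is the depth penalty $\rho(\lambda)$):

```python
def threshold_decode(scores, rho, R, Delta):
    """Sketch of rule (4.1): accuse user m iff the best penalized score
    over (lambda, l) exceeds R + Delta; return the accused set (possibly
    empty).  Each user is tested independently of the others."""
    accused = set()
    for m, per_lambda in scores.items():
        best = max(max(s_list) - rho[lam]
                   for lam, s_list in per_lambda.items())
        if best > R + Delta:
            accused.add(m)
    return accused
```

Raising $\Delta$ raises the accusation threshold, trading a larger false-positive exponent against a smaller false-negative exponent, as quantified in Theorem IV.1.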

Observe that the maximizing $\lambda$ in (4.1) may depend on $m$; with high probability, such an event implies a decoding error. Improvements can only be obtained using a more complex joint decoder, as in Sec. V.

IV-B Error Exponents

Define the following set of conditional p.m.f.s for $(XU)_{\cal K}\triangleq(X_{\cal K},U_{\cal K})$ given $(S,W)$:

\[
{\cal M}(p_{XU|SW})=\{p_{(XU)_{\cal K}|SW} : p_{X_mU_m|SW}=p_{XU|SW},\;m\in{\cal K}\},
\]

i.e., the conditional marginal p.m.f. $p_{XU|SW}$ is the same for each $(X_m,U_m)$, $m\in{\cal K}$. Also define the sets

\[
\mathscr{P}_{XU|SW}(p_{SW},L_w,L_u,D_1) = \left\{p_{XU|SW} : \mathbb{E}[d(S,X)]\le D_1\right\},
\]
\[
\mathscr{P}_{(XU)_{\cal K}W|S}(p_S,L_w,L_u,D_1) = \left\{p_{(XU)_{\cal K}W|S}=p_W\,\prod_{k\in{\cal K}}p_{X_kU_k|SW} : p_{X_1U_1|SW}=\cdots=p_{X_KU_K|SW},\;\mathrm{and}\;\mathbb{E}[d(S,X_1)]\le D_1\right\} \tag{4.2}
\]

where in (4.2) the random variables $(X_k,U_k)$, $k\in{\cal K}$, are conditionally i.i.d. given $(S,W)$.

Define for each $m\in{\cal K}$ the set of conditional p.m.f.s

\[
\begin{aligned}
&\mathscr{P}_{Y(XU)_{\cal K}|SW}(p_W,\tilde{p}_{S|W},p_{XU|SW},\mathscr{W}_K,R,L_w,L_u,m)\\
&\quad\triangleq \Big\{\tilde{p}_{Y(XU)_{\cal K}|SW} : \tilde{p}_{(XU)_{\cal K}|SW}\in{\cal M}(p_{XU|SW}),\;\tilde{p}_{Y|X_{\cal K}}\in\mathscr{W}_K,\\
&\qquad\quad I_{p_W\tilde{p}_{S|W}\tilde{p}_{Y(XU)_{\cal K}|SW}}(U_m;Y|S^dW)-I_{p_W\tilde{p}_{S|W}p_{XU|SW}}(U;S|S^dW)\le R\Big\} \tag{4.3}
\end{aligned}
\]

and the pseudo sphere-packing exponent

\[
\begin{aligned}
&\tilde{E}_{psp,m}(R,p_W,\tilde{p}_{S|W},p_{XU|SW},\mathscr{W}_K)\\
&\quad= \min_{\tilde{p}_{Y(XU)_{\cal K}|SW}\in\mathscr{P}_{Y(XU)_{\cal K}|SW}(p_W,\tilde{p}_{S|W},p_{XU|SW},\mathscr{W}_K,R,L_w,L_u,m)} D(\tilde{p}_{Y(XU)_{\cal K}|SW}\,\tilde{p}_{S|W}\,\|\,\tilde{p}_{Y|X_{\cal K}}\,p_{XU|SW}^{\cal K}\,p_S\,|\,p_W). \tag{4.4}
\end{aligned}
\]

Taking the maximum and minimum of $\tilde{E}_{psp,m}$ above over $m\in{\cal K}$, we respectively define

\[
\begin{aligned}
\overline{\tilde{E}}_{psp}(R,L_w,L_u,p_W,\tilde{p}_{S|W},p_{XU|SW},\mathscr{W}_K) &= \max_{m\in{\cal K}}\tilde{E}_{psp,m}(R,L_w,L_u,p_W,\tilde{p}_{S|W},p_{XU|SW},\mathscr{W}_K), \tag{4.5}\\
\underline{\tilde{E}}_{psp}(R,L_w,L_u,p_W,\tilde{p}_{S|W},p_{XU|SW},\mathscr{W}_K) &= \min_{m\in{\cal K}}\tilde{E}_{psp,m}(R,L_w,L_u,p_W,\tilde{p}_{S|W},p_{XU|SW},\mathscr{W}_K). \tag{4.6}
\end{aligned}
\]

For a fair coalition ($\mathscr{W}_K=\mathscr{W}_K^{fair}$), $\tilde{E}_{psp,m}$ is independent of $m\in{\cal K}$, and the two expressions above coincide. Define

Epsp(R,Lw,Lu,D1,𝒲K)\displaystyle E_{psp}(R,L_{w},L_{u},D_{1},\mathscr{W}_{K}) =\displaystyle= maxpW𝒫Wminp~S|W𝒫S|WmaxpXU|SW𝒫XU|SW(pWp~S|W,Lw,Lu,D1)\displaystyle\max_{p_{W}\in\mathscr{P}_{W}}\min_{\tilde{p}_{S|W}\in\mathscr{P}_{S|W}}\max_{p_{XU|SW}\in\mathscr{P}_{XU|SW}(p_{W}\,\tilde{p}_{S|W},L_{w},L_{u},D_{1})} (4.7)
E~psp,1(R,Lw,Lu,pW,p~S|W,pXU|SW,𝒲Kfair).\displaystyle\quad\tilde{E}_{psp,1}(R,L_{w},L_{u},p_{W},\tilde{p}_{S|W},p_{XU|SW},\mathscr{W}_{K}^{fair}).

Denote by pWp_{W}^{*} and pXU|SWp_{XU|SW}^{*} the maximizers in (4.7), the latter to be viewed as a function of p~S|W\tilde{p}_{S|W}. Both pWp_{W}^{*} and pXU|SWp_{XU|SW}^{*} implicitly depend on RR and 𝒲Kfair\mathscr{W}_{K}^{fair}. Finally, define

E¯psp(R,Lw,Lu,D1,𝒲K)\displaystyle\overline{E}_{psp}(R,L_{w},L_{u},D_{1},\mathscr{W}_{K}) =\displaystyle= minp~S|W𝒫S|WE~¯psp(R,Lw,Lu,pW,p~S|W,pXU|SW,𝒲K)\displaystyle\min_{\tilde{p}_{S|W}\in\mathscr{P}_{S|W}}\overline{\tilde{E}}_{psp}(R,L_{w},L_{u},p_{W}^{*},\tilde{p}_{S|W},p_{XU|SW}^{*},\mathscr{W}_{K}) (4.8)
E¯psp(R,Lw,Lu,D1,𝒲K)\displaystyle\underline{E}_{psp}(R,L_{w},L_{u},D_{1},\mathscr{W}_{K}) =\displaystyle= minp~S|W𝒫S|WE¯~psp(R,Lw,Lu,pW,p~S|W,pXU|SW,𝒲K).\displaystyle\min_{\tilde{p}_{S|W}\in\mathscr{P}_{S|W}}\underline{\tilde{E}}_{psp}(R,L_{w},L_{u},p_{W}^{*},\tilde{p}_{S|W},p_{XU|SW}^{*},\mathscr{W}_{K}). (4.9)

The terminology pseudo sphere-packing exponent is used because, despite its superficial similarity to a true sphere-packing exponent, (4.4) does not provide a fundamental asymptotic lower bound on the error probability.

Theorem IV.1

The decision rule (4.1) yields the following error exponents.

(i)

The false-positive error exponent is

EFP(R,D1,𝒲K,Δ)=Δ.E_{FP}(R,D_{1},\mathscr{W}_{K},\Delta)=\Delta. (4.10)
(ii)

The detect-all error exponent is

Eall(R,Lw,Lu,D1,𝒲K,Δ)=E¯psp(R+Δ,Lw,Lu,D1,𝒲K).E^{all}(R,L_{w},L_{u},D_{1},\mathscr{W}_{K},\Delta)=\underline{E}_{psp}(R+\Delta,L_{w},L_{u},D_{1},\mathscr{W}_{K}). (4.11)
(iii)

The detect-one error exponent is

Eone(R,Lw,Lu,D1,𝒲K,Δ)=E¯psp(R+Δ,Lw,Lu,D1,𝒲K).E^{one}(R,L_{w},L_{u},D_{1},\mathscr{W}_{K},\Delta)=\overline{E}_{psp}(R+\Delta,L_{w},L_{u},D_{1},\mathscr{W}_{K}). (4.12)
(iv)

A fair collusion strategy is optimal under the detect-one error criterion:

Eone(R,Lw,Lu,D1,𝒲K,Δ)=Eone(R,Lw,Lu,D1,𝒲Kfair,Δ).E^{one}(R,L_{w},L_{u},D_{1},\mathscr{W}_{K},\Delta)=E^{one}(R,L_{w},L_{u},D_{1},\mathscr{W}_{K}^{fair},\Delta).
(v)

The detect-one and detect-all error exponents are the same when the colluders employ a fair strategy: Eone(R,Lw,Lu,D1,𝒲Kfair,Δ)=Eall(R,Lw,Lu,D1,𝒲Kfair,Δ)E^{one}(R,L_{w},L_{u},D_{1},\mathscr{W}_{K}^{fair},\Delta)=E^{all}(R,L_{w},L_{u},D_{1},\mathscr{W}_{K}^{fair},\Delta).

(vi)

For K=KnomK=K_{nom}, the supremum of all rates for which the detect-one error exponent of (4.12) is positive is

Cthr(D1,𝒲K)\displaystyle C^{thr}(D_{1},\mathscr{W}_{K}) =\displaystyle= Cthr(D1,𝒲Kfair)\displaystyle C^{thr}(D_{1},\mathscr{W}_{K}^{fair}) (4.13)
=\displaystyle= limLw,LumaxpW𝒫WmaxpXU|SW𝒫XU|SW(pWpS,Lw,Lu,D1)minpY|X𝒦𝒲Kfair\displaystyle\lim_{L_{w},L_{u}\to\infty}\;\max_{p_{W}\in\mathscr{P}_{W}}\;\max_{p_{XU|SW}\in\mathscr{P}_{XU|SW}(p_{W}\,p_{S},L_{w},L_{u},D_{1})}\;\min_{p_{Y|X_{{\cal K}}}\in\mathscr{W}_{K}^{fair}}
[I(U;Y|Sd,W)I(U;S|Sd,W)].\displaystyle\qquad[I(U;Y|S^{d},W)-I(U;S|S^{d},W)].

V Joint Fingerprint Decoder

The fundamental improvement over the simple thresholding strategy for decoding in Sec. IV resides in the use of a joint decoding rule. Specifically, the decoder maximizes a penalized empirical mutual information score over all possible coalitions of any size. The penalty depends on the conditional host sequence type p𝐬|𝐬d𝐰p_{{\mathbf{s}}|{\mathbf{s}}^{d}{\mathbf{w}}}, as in Sec. IV, and is proportional to the size of the coalition, as in [10, Sec. V]. We call this blind fingerprint decoder the maximum doubly-penalized mutual information (M2PMI) decoder.

Mutual Information of kk Random Variables. The mutual information of kk random variables X1,,XkX_{1},\cdots,X_{k} is defined as the sum of their individual entropies minus their joint entropy [21, p. 57] or equivalently, the divergence between their joint distribution and the product of their marginals:

I(X1;;Xk)\displaystyle{\overset{\circ}{I}}(X_{1};\cdots;X_{k}) =\displaystyle= H(X1)++H(Xk)H(X1,,Xk)\displaystyle H(X_{1})+\cdots+H(X_{k})-H(X_{1},\cdots,X_{k})
=\displaystyle= D(pX1XkpX1pXk).\displaystyle D(p_{X_{1}\cdots X_{k}}\|p_{X_{1}}\cdots p_{X_{k}}).

The symbol I{\overset{\circ}{I}} is used to distinguish it from ordinary mutual information II between two random variables. Similarly one can define a conditional mutual information I(X1;;Xk|Z)=iH(Xi|Z)H(X1,,Xk|Z){\overset{\circ}{I}}(X_{1};\cdots;X_{k}|Z)=\sum_{i}H(X_{i}|Z)-H(X_{1},\cdots,X_{k}|Z) conditioned on ZZ, and an empirical mutual information I(𝐱1;;𝐱k|𝐳){\overset{\circ}{I}}({\mathbf{x}}_{1};\cdots;{\mathbf{x}}_{k}|{\mathbf{z}}) between kk sequences 𝐱1,,𝐱k{\mathbf{x}}_{1},\cdots,{\mathbf{x}}_{k}, conditioned on 𝐳{\mathbf{z}}, as the conditional mutual information with respect to the joint type of 𝐱1,,𝐱k,𝐳{\mathbf{x}}_{1},\cdots,{\mathbf{x}}_{k},{\mathbf{z}}. Some properties of I{\overset{\circ}{I}} are given in [10, Sec. V.A].
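The identity above is easy to check numerically. The following sketch (plain NumPy, not part of the paper) computes I∘ of three random variables both as the sum of marginal entropies minus the joint entropy and as the divergence between the joint p.m.f. and the product of its marginals, for an arbitrary joint p.m.f.:

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a p.m.f. given as a flat array."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def multi_information(p_joint):
    """I(X1;...;Xk) as sum of marginal entropies minus the joint entropy.
    p_joint is a k-dimensional array; axis i indexes the alphabet of X_i."""
    k = p_joint.ndim
    marginals = [p_joint.sum(axis=tuple(j for j in range(k) if j != i))
                 for i in range(k)]
    return sum(entropy(m) for m in marginals) - entropy(p_joint.ravel())

def multi_information_kl(p_joint):
    """The same quantity as D(p_{X1...Xk} || p_{X1} ... p_{Xk})."""
    k = p_joint.ndim
    prod = np.ones_like(p_joint)
    for i in range(k):
        marg = p_joint.sum(axis=tuple(j for j in range(k) if j != i))
        shape = [1] * k
        shape[i] = -1
        prod = prod * marg.reshape(shape)   # broadcast product of marginals
    mask = p_joint > 0
    return np.sum(p_joint[mask] * np.log2(p_joint[mask] / prod[mask]))

rng = np.random.default_rng(0)
p = rng.random((2, 3, 2))
p /= p.sum()                                # arbitrary joint p.m.f.
assert np.isclose(multi_information(p), multi_information_kl(p))
assert multi_information(p) >= 0            # a divergence is nonnegative
```

For k = 2 both functions reduce to the ordinary mutual information I(X1;X2).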

Recall that 𝐱𝒜{\mathbf{x}}_{{\cal A}} denotes {𝐱m,m𝒜}\{{\mathbf{x}}_{m},\,m\in{\cal A}\} and that the codewords in (3.1) take the form 𝐮(l,m,λ){\mathbf{u}}(l,m,\lambda). In the following, we shall use the compact notation (𝐱𝐮)𝒜(𝐱𝒜,𝐮𝒜)({\mathbf{x}}{\mathbf{u}})_{{\cal A}}\triangleq({\mathbf{x}}_{{\cal A}},{\mathbf{u}}_{{\cal A}}), and

𝐮(l𝒜,m𝒜,λ){𝐮(lm1,m1,λ),,𝐮(lm|𝒜|,m|𝒜|,λ)}for𝒜={m1,,m|𝒜|}.{\mathbf{u}}(l_{{\cal A}},m_{{\cal A}},\lambda)\triangleq\{{\mathbf{u}}(l_{m_{1}},m_{1},\lambda),\cdots,{\mathbf{u}}(l_{m_{|{\cal A}|}},m_{|{\cal A}|},\lambda)\}\quad\mathrm{for~}{\cal A}=\{m_{1},\cdots,m_{|{\cal A}|}\}.

V-A M2PMI Criterion

Given 𝐲,𝐬d,𝐰{\mathbf{y}},{\mathbf{s}}^{d},{\mathbf{w}}, the decoder seeks the coalition size kk, the conditional host sequence type λ𝒫S|SdW[N]\lambda\in\mathscr{P}_{S|S^{d}W}^{[N]}, and the codewords 𝐮(l,m,λ){\mathbf{u}}(l,m,\lambda) in 𝒞(𝐬d,𝐰,λ){\cal C}({\mathbf{s}}^{d},{\mathbf{w}},\lambda) that maximize the M2PMI criterion below. The column indices m𝒦m\in{\cal K} of the decoded codewords form the decoded coalition 𝒦^\hat{{\cal K}}. If the maximizing kk in (5.2) is zero, the receiver outputs 𝒦^=\hat{{\cal K}}=\emptyset.

The Maximum Doubly-Penalized Mutual Information criterion is defined as

maxk0M2PMI(k)\max_{k\geq 0}M2PMI(k) (5.2)

where

M2PMI(k)={0:ifk=0maxλ𝒫S|SdW[N]max𝐮𝒦𝒞k(𝐬d,𝐰,λ)[I(𝐮𝒦;𝐲|𝐬d𝐰)k(ρ(λ)+R+Δ)]:ifk=1,2,M2PMI(k)=\left\{\begin{array}[]{ll}0&:~\mathrm{if~}k=0\\ \underset{\lambda\in\mathscr{P}_{S|S^{d}W}^{[N]}}{\max}\;\underset{{\mathbf{u}}_{{\cal K}}\in{\cal C}^{k}({\mathbf{s}}^{d},{\mathbf{w}},\lambda)}{\max}\left[{\overset{\circ}{I}}({\mathbf{u}}_{{\cal K}};{\mathbf{y}}|{\mathbf{s}}^{d}{\mathbf{w}})-k(\rho(\lambda)+R+\Delta)\right]&:~\mathrm{if~}k=1,2,\cdots\end{array}\right. (5.3)
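To illustrate the outer maximization over kk, the toy sketch below applies an M2PMI-style selection rule to hypothetical placeholder scores (plain numbers, not empirical mutual informations computed from actual sequences): each candidate coalition's score is penalized by its size times a per-user threshold, and the empty coalition, which scores zero, wins whenever no candidate clears its penalty:

```python
def m2pmi_decode(scores, penalty):
    """Toy M2PMI-style rule. 'scores' maps candidate coalitions (tuples of
    user indices) to a stand-in empirical mutual-information value. Return
    the coalition maximizing score - |coalition| * penalty, or the empty
    set when the k = 0 candidate (score 0) is not beaten."""
    best_set, best_val = (), 0.0          # k = 0 scores zero
    for coalition, info in scores.items():
        val = info - len(coalition) * penalty
        if val > best_val:
            best_set, best_val = coalition, val
    return set(best_set)

# Hypothetical scores for candidate coalitions (illustration only).
scores = {(1,): 0.9, (2,): 0.3, (1, 2): 1.7, (1, 2, 3): 1.9}
assert m2pmi_decode(scores, penalty=0.5) == {1, 2}   # 1.7 - 2*0.5 wins
assert m2pmi_decode(scores, penalty=1.0) == set()    # nothing beats k = 0
```

Raising the penalty (i.e., the threshold ρ(λ)+R+Δ) shrinks the decoded coalition, which is the mechanism behind the false-positive/false-negative tradeoff controlled by Δ.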

V-B Properties

The following lemma shows that 1) each subset of the estimated coalition is significant, and 2) any further extension of the coalition would fail a significance test. The proof parallels that of Lemma 5.1 in [10] and is therefore omitted.

Lemma V.1

Let 𝒦^\hat{{\cal K}}, λ\lambda, l𝒦^l_{\hat{{\cal K}}} achieve the maximum in (5.2)–(5.3), i.e., 𝐮𝒦^=𝐮(l𝒦^,m𝒦^,λ){\mathbf{u}}_{\hat{{\cal K}}}={\mathbf{u}}(l_{\hat{{\cal K}}},m_{\hat{{\cal K}}},\lambda). Then for each subset of the estimated coalition 𝒦^\hat{{\cal K}}, we have

𝒜𝒦^:I(𝐮(l𝒜,m𝒜,λ);𝐲𝐮(l𝒦^𝒜,m𝒦^𝒜,λ)|𝐬d𝐰)>|𝒜|(ρ(λ)+R+Δ).\forall{\cal A}\subseteq\hat{{\cal K}}~:\quad{\overset{\circ}{I}}({\mathbf{u}}(l_{{\cal A}},m_{{\cal A}},\lambda);{\mathbf{y}}{\mathbf{u}}(l_{\hat{{\cal K}}\setminus{\cal A}},m_{\hat{{\cal K}}\setminus{\cal A}},\lambda)\,|{\mathbf{s}}^{d}{\mathbf{w}})>|{\cal A}|\,(\rho(\lambda)+R+\Delta). (5.4)

Moreover, for every 𝒜{\cal A} disjoint with 𝒦^\hat{{\cal K}},

I(𝐮(l𝒜,m𝒜,λ);𝐲𝐮(l𝒦^,m𝒦^,λ)|𝐬d𝐰)|𝒜|(ρ(λ)+R+Δ).{\overset{\circ}{I}}({\mathbf{u}}(l_{{\cal A}},m_{{\cal A}},\lambda);{\mathbf{y}}{\mathbf{u}}(l_{\hat{{\cal K}}},m_{\hat{{\cal K}}},\lambda)\,|{\mathbf{s}}^{d}{\mathbf{w}})\leq|{\cal A}|\,(\rho(\lambda)+R+\Delta). (5.5)

V-C Error Exponents

Define for each 𝒜𝒦{\cal A}\subseteq{\cal K} the set of conditional p.m.f.’s

𝒫Y(XU)𝒦|SW(pW,p~S|W,pXU|SW,𝒲K,R,Lw,Lu,𝒜)\displaystyle\mathscr{P}_{Y(XU)_{{\cal K}}|SW}(p_{W},\tilde{p}_{S|W},p_{XU|SW},\mathscr{W}_{K},R,L_{w},L_{u},{\cal A}) (5.6)
\displaystyle\triangleq {p~Y(XU)𝒦|SW:p~(XU)𝒦|SW(pXU|SW),\displaystyle\left\{\tilde{p}_{Y(XU)_{{\cal K}}|SW}\,:~\tilde{p}_{(XU)_{{\cal K}}|SW}\in{\cal M}(p_{XU|SW}),\right.
1|𝒜|IpWp~S|Wp~Y(XU)𝒦|SW(U𝒜;YU𝒦𝒜|Sd,W)IpWp~S|WpXU|SW(U;S|Sd,W)+R}\displaystyle\quad\left.\frac{1}{|{\cal A}|}{\overset{\circ}{I}}_{p_{W}\,\tilde{p}_{S|W}\,\tilde{p}_{Y(XU)_{{\cal K}}|SW}}(U_{{\cal A}};YU_{{\cal K}\setminus{\cal A}}|S^{d},W)\leq I_{p_{W}\,\tilde{p}_{S|W}\,p_{XU|SW}}(U;S|S^{d},W)+R\right\}

and the pseudo sphere-packing exponent

E~psp,𝒜(R,Lw,Lu,pW,p~S|W,pXU|SW,𝒲K)\displaystyle\tilde{E}_{psp,{\cal A}}(R,L_{w},L_{u},p_{W},\tilde{p}_{S|W},p_{XU|SW},\mathscr{W}_{K}) =\displaystyle= minp~Y(XU)𝒦|SW𝒫Y(XU)𝒦|SW(pW,p~S|W,pXU|SW,𝒲K,R,Lw,Lu,𝒜)\displaystyle\min_{\tilde{p}_{Y(XU)_{{\cal K}}|SW}\in\mathscr{P}_{Y(XU)_{{\cal K}}|SW}(p_{W},\tilde{p}_{S|W},p_{XU|SW},\mathscr{W}_{K},R,L_{w},L_{u},{\cal A})} (5.7)
D(p~Y(XU)𝒦|SWp~S|Wp~Y|X𝒦p~(XU)𝒦|SWpS|pW).\displaystyle D(\tilde{p}_{Y(XU)_{{\cal K}}|SW}\,\tilde{p}_{S|W}\|\tilde{p}_{Y|X_{{\cal K}}}\,\tilde{p}_{(XU)_{{\cal K}}|SW}\,p_{S}\,|p_{W}).

Taking the maximum and the minimum of E~psp,𝒜\tilde{E}_{psp,{\cal A}} above over all subsets 𝒜{\cal A} of 𝒦{\cal K}, we define (the property that 𝒦{\cal K} achieves max𝒜𝒦E~psp,𝒜\max_{{\cal A}\subseteq{\cal K}}\tilde{E}_{psp,{\cal A}} is established in the proof of Theorem V.2, Part (iv))

E~¯psp(R,Lw,Lu,pW,p~S|W,pXU|SW,𝒲K)\displaystyle\!\!\!\!\!\overline{\tilde{E}}_{psp}(R,L_{w},L_{u},p_{W},\tilde{p}_{S|W},p_{XU|SW},\mathscr{W}_{K}) =\displaystyle= E~psp,𝒦(R,Lw,Lu,pW,p~S|W,pXU|SW,𝒲K),\displaystyle\tilde{E}_{psp,{\cal K}}(R,L_{w},L_{u},p_{W},\tilde{p}_{S|W},p_{XU|SW},\mathscr{W}_{K}), (5.8)
E¯~psp(R,Lw,Lu,pW,p~S|W,pXU|SW,𝒲K)\displaystyle\!\!\!\!\!\underline{\tilde{E}}_{psp}(R,L_{w},L_{u},p_{W},\tilde{p}_{S|W},p_{XU|SW},\mathscr{W}_{K}) =\displaystyle= min𝒜𝒦E~psp,𝒜(R,Lw,Lu,pW,p~S|W,pXU|SW,𝒲K).\displaystyle\min_{{\cal A}\subseteq{\cal K}}\tilde{E}_{psp,{\cal A}}(R,L_{w},L_{u},p_{W},\tilde{p}_{S|W},p_{XU|SW},\mathscr{W}_{K}). (5.9)

Now define

Epsp(R,Lw,Lu,D1,𝒲K)\displaystyle E_{psp}(R,L_{w},L_{u},D_{1},\mathscr{W}_{K}) =\displaystyle= maxpW𝒫Wminp~S|W𝒫S|WmaxpXU|SW𝒫XU|SW(pW,p~S|W,Lw,Lu,D1)\displaystyle\max_{p_{W}\in\mathscr{P}_{W}}\min_{\tilde{p}_{S|W}\in\mathscr{P}_{S|W}}\max_{p_{XU|SW}\in\mathscr{P}_{XU|SW}(p_{W},\tilde{p}_{S|W},L_{w},L_{u},D_{1})} (5.10)
E~psp,𝒦(R,Lw,Lu,pW,p~S|W,pXU|SW,𝒲Knomfair).\displaystyle\quad\tilde{E}_{psp,{\cal K}}(R,L_{w},L_{u},p_{W},\tilde{p}_{S|W},p_{XU|SW},\mathscr{W}_{K_{nom}}^{fair}).

Denote by pWp_{W}^{*} and pXU|SWp_{XU|SW}^{*} the maximizers in (5.10), where the latter is to be viewed as a function of p~S|W\tilde{p}_{S|W}. Both pWp_{W}^{*} and pXU|SWp_{XU|SW}^{*} implicitly depend on RR and 𝒲Kfair\mathscr{W}_{K}^{fair}. Finally, define

E¯psp(R,Lw,Lu,D1,𝒲K)\displaystyle\overline{E}_{psp}(R,L_{w},L_{u},D_{1},\mathscr{W}_{K}) =\displaystyle= minp~S|W𝒫S|WE~¯psp(R,Lw,Lu,pW,p~S|W,pXU|SW,𝒲K),\displaystyle\min_{\tilde{p}_{S|W}\in\mathscr{P}_{S|W}}\overline{\tilde{E}}_{psp}(R,L_{w},L_{u},p_{W}^{*},\tilde{p}_{S|W},p_{XU|SW}^{*},\mathscr{W}_{K}), (5.11)
E¯psp(R,Lw,Lu,D1,𝒲K)\displaystyle\underline{E}_{psp}(R,L_{w},L_{u},D_{1},\mathscr{W}_{K}) =\displaystyle= minp~S|W𝒫S|WE¯~psp(R,Lw,Lu,pW,p~S|W,pXU|SW,𝒲K).\displaystyle\min_{\tilde{p}_{S|W}\in\mathscr{P}_{S|W}}\underline{\tilde{E}}_{psp}(R,L_{w},L_{u},p_{W}^{*},\tilde{p}_{S|W},p_{XU|SW}^{*},\mathscr{W}_{K}). (5.12)
Theorem V.2

The decision rule (5.2) yields the following error exponents.

(i)

The false-positive error exponent is

EFP(R,D1,𝒲K,Δ)=Δ.E_{FP}(R,D_{1},\mathscr{W}_{K},\Delta)=\Delta. (5.13)
(ii)

The detect-all error exponent is

Eall(R,Lw,Lu,D1,𝒲K,Δ)=E¯psp(R+Δ,Lw,Lu,D1,𝒲K).E^{all}(R,L_{w},L_{u},D_{1},\mathscr{W}_{K},\Delta)=\underline{E}_{psp}(R+\Delta,L_{w},L_{u},D_{1},\mathscr{W}_{K}). (5.14)
(iii)

The detect-one error exponent is

Eone(R,Lw,Lu,D1,𝒲K,Δ)=E¯psp(R+Δ,Lw,Lu,D1,𝒲K).E^{one}(R,L_{w},L_{u},D_{1},\mathscr{W}_{K},\Delta)=\overline{E}_{psp}(R+\Delta,L_{w},L_{u},D_{1},\mathscr{W}_{K}). (5.15)
(iv)

Eone(R,Lw,Lu,D1,𝒲K,Δ)=Eone(R,Lw,Lu,D1,𝒲Kfair,Δ)E^{one}(R,L_{w},L_{u},D_{1},\mathscr{W}_{K},\Delta)=E^{one}(R,L_{w},L_{u},D_{1},\mathscr{W}_{K}^{fair},\Delta).

(v)

Eall(R,Lw,Lu,D1,𝒲Kfair,Δ)=Eone(R,Lw,Lu,D1,𝒲Kfair,Δ)E^{all}(R,L_{w},L_{u},D_{1},\mathscr{W}_{K}^{fair},\Delta)=E^{one}(R,L_{w},L_{u},D_{1},\mathscr{W}_{K}^{fair},\Delta).

(vi)

If K=KnomK=K_{nom}, the suprema of all rates for which the error exponents of (5.15) and (5.14) are positive are given by

C¯one(D1,𝒲K)\displaystyle\underline{C}^{one}(D_{1},\mathscr{W}_{K}) =\displaystyle= C¯one(D1,𝒲Kfair)\displaystyle\underline{C}^{one}(D_{1},\mathscr{W}_{K}^{fair}) (5.16)
=\displaystyle= limLw,LumaxpW𝒫Wmaxp(XU)𝒦|SW𝒫(XU)𝒦|SW(pW,pS,Lw,Lu,D1)minpY|X𝒦𝒲Kfair\displaystyle\lim_{L_{w},L_{u}\to\infty}\;\max_{p_{W}\in\mathscr{P}_{W}}\;\max_{p_{(XU)_{{\cal K}}|SW}\in\mathscr{P}_{(XU)_{{\cal K}}|SW}(p_{W},p_{S},L_{w},L_{u},D_{1})}\;\min_{p_{Y|X_{{\cal K}}}\in\mathscr{W}_{K}^{fair}}
[1KI(U𝒦;Y|Sd,W)I(U;S|Sd,W)]\displaystyle\qquad\left[\frac{1}{K}I(U_{{\cal K}};Y|S^{d},W)-I(U;S|S^{d},W)\right]

under the “detect-one” criterion, and by

C¯all(D1,𝒲K)\displaystyle\underline{C}^{all}(D_{1},\mathscr{W}_{K}) =\displaystyle= limLw,LumaxpW𝒫Wmaxp(XU)𝒦|SW𝒫(XU)𝒦|SW(pW,pS,Lw,Lu,D1)minpY|X𝒦𝒲K\displaystyle\lim_{L_{w},L_{u}\to\infty}\;\max_{p_{W}\in\mathscr{P}_{W}}\;\max_{p_{(XU)_{{\cal K}}|SW}\in\mathscr{P}_{(XU)_{{\cal K}}|SW}(p_{W},p_{S},L_{w},L_{u},D_{1})}\;\min_{p_{Y|X_{{\cal K}}}\in\mathscr{W}_{K}} (5.17)
[min𝒜𝒦1|𝒜|I(U𝒜;Y|Sd,W,U𝒦𝒜)I(U;S|Sd,W)]\displaystyle\qquad\left[\min_{{\cal A}\subseteq{\cal K}}\;\frac{1}{|{\cal A}|}I(U_{{\cal A}};Y\,|S^{d},W,U_{{\cal K}\setminus{\cal A}})-I(U;S|S^{d},W)\right]

under the “detect-all” criterion. If the colluders select a fair collusion channel, as is their collective interest, the minimization is restricted to 𝒲Kfair\mathscr{W}_{K}^{fair} in (5.17), and then

C¯all(D1,𝒲K)=C¯one(D1,𝒲K).\underline{C}^{all}(D_{1},\mathscr{W}_{K})=\underline{C}^{one}(D_{1},\mathscr{W}_{K}).

For the special case of private fingerprinting (Sd=SS^{d}=S), the term I(U;S|Sd,W)I(U;S|S^{d},W) in (5.16) is zero. Since I(U𝒦;Y|S,W)I((XU)𝒦;Y|S,W)I(U_{{\cal K}};Y|S,W)\leq I((XU)_{{\cal K}};Y|S,W), it suffices to choose Lu=|𝒳|L_{u}=|{\cal X}| and U=XU=X to achieve the maximum in (5.16). The resulting expression coincides with the capacity formula in [10, Theorem 3.2]. Similarly to the single-user case [11], when U=XU=X the binning scheme is degenerate.

V-D Bounded Coalition Size

Assume now that KK is known not to exceed some maximum value KmaxK_{\max}. The same random coding scheme can be used. In the evaluation of the M2PMI criterion of (5.2), the maximization is now limited to 0kKmax0\leq k\leq K_{\max}. In Lemma V.1, property (5.4) still holds, and property (5.5) now holds for every 𝒜{\cal A} disjoint with 𝒦^\hat{{\cal K}} and of size |𝒜|Kmax|𝒦^||{\cal A}|\leq K_{\max}-|\hat{{\cal K}}|. Following the derivation of the error exponents in the appendix, we see that these exponents remain the same as those given by Theorem V.2.

Blind watermarking. The case Kmax=1K_{\max}=1 represents blind watermark decoding with a guarantee that the false-positive exponent is at least equal to Δ\Delta. In this scenario, there is no need for a time-sharing sequence 𝐰{\mathbf{w}}, and the decoder’s input 𝐲{\mathbf{y}} is either an unwatermarked sequence (K=0K=0) or a watermarked sequence (K=1K=1). The M2PMI criterion of (5.3) reduces to

M2PMI(k)=maxλmax𝐮𝒞(𝐬d)I(𝐮;𝐲|𝐬d)(ρ(λ)+R+Δ)fork=1.M2PMI(k)=\max_{\lambda}\max_{{\mathbf{u}}\in{\cal C}({\mathbf{s}}^{d})}I({\mathbf{u}};{\mathbf{y}}|{\mathbf{s}}^{d})-(\rho(\lambda)+R+\Delta)\quad\mathrm{for~}k=1.

The resulting false-positive and false-negative exponents are given by Δ\Delta and Epsp(R+Δ,0,Lu,D1,𝒲K)E_{psp}(R+\Delta,0,L_{u},D_{1},\mathscr{W}_{K}), respectively.

VI Upper Bounds on Public Fingerprinting Capacity

Deriving public fingerprinting capacity is challenging because the capacity region for the Gel’fand-Pinsker version of the MAC is still unknown; in fact, an outer bound for this region has yet to be established. Even in the case of a MAC with side information causally available at the transmitter but not at the receiver, the expressions for the inner and outer capacity regions do not coincide [23]. Accordingly, the expression derived below is an upper bound on public fingerprinting capacity under the detect-all criterion.

Recall the definition of the set 𝒫(XU)𝒦W|S(pS,Lw,Lu,D1)\mathscr{P}_{(XU)_{{\cal K}}W|S}(p_{S},L_{w},L_{u},D_{1}) in (4.2), where WW and UU are random variables defined over alphabets 𝒲={1,2,,Lw}{\cal W}=\{1,2,\cdots,L_{w}\} and 𝒰={1,2,,Lu}{\cal U}=\{1,2,\cdots,L_{u}\}, respectively. Here we define the larger set

𝒫(XU)𝒦W|Souter(pS,Lw,Lu,D1)\displaystyle\mathscr{P}_{(XU)_{{\cal K}}W|S}^{outer}(p_{S},L_{w},L_{u},D_{1}) =\displaystyle= {p(XU)𝒦W|S=pW(k𝒦pXk|SW)pU𝒦|X𝒦SW:\displaystyle\left\{p_{(XU)_{{\cal K}}W|S}=p_{W}\,\left(\prod_{k\in{\cal K}}p_{X_{k}|SW}\right)p_{U_{{\cal K}}|X_{{\cal K}}SW}:\right. (6.1)
pX1|SW==pXK|SW,and𝔼[d(S,X1)]D1}\displaystyle\left.\quad p_{X_{1}|SW}=\cdots=p_{X_{K}|SW},\;\;\mathrm{and}\;\;{\mathbb{E}}[d(S,X_{1})]\leq D_{1}\right\}

where Xk,k𝒦X_{k},\,k\in{\cal K}, are still conditionally i.i.d. given (S,W)(S,W) but the random variables Uk,k𝒦U_{k},\,k\in{\cal K}, are generally conditionally dependent.

Define

C¯Lw,Luall(D1,𝒲K)\displaystyle\overline{C}_{L_{w},L_{u}}^{all}(D_{1},\mathscr{W}_{K}) =\displaystyle= maxp(XU)𝒦W|S𝒫(XU)𝒦W|Souter(pS,Lw,Lu,D1)minpY|X𝒦𝒲K\displaystyle\max_{p_{(XU)_{{\cal K}}W|S}\in\mathscr{P}_{(XU)_{{\cal K}}W|S}^{outer}(p_{S},L_{w},L_{u},D_{1})}\;\min_{p_{Y|X_{{\cal K}}}\in\mathscr{W}_{K}} (6.2)
min𝒜𝒦1|𝒜|[I(U𝒜;Y,Sd|U𝒦𝒜)I(U𝒜;S|U𝒦𝒜)].\displaystyle\qquad\qquad\min_{{\cal A}\subseteq{\cal K}}\;\frac{1}{|{\cal A}|}\left[I(U_{{\cal A}};Y,S^{d}|U_{{\cal K}\setminus{\cal A}})-I(U_{{\cal A}};S|U_{{\cal K}\setminus{\cal A}})\right].

Using the same derivation as in Lemma 2.1 of [11], it can be shown that C¯Lw,Luall(D1,𝒲K)\overline{C}_{L_{w},L_{u}}^{all}(D_{1},\mathscr{W}_{K}) is a nondecreasing function of LwL_{w} and LuL_{u} and converges to a finite limit. Moreover, the gap to the limit may be bounded by a polynomial function of LwL_{w} and LuL_{u}, see [11, Sec. 3.5] for a similar derivation.

Theorem VI.1

Public fingerprinting capacity is upper-bounded by

C¯all(D1,𝒲K)=limLw,LuC¯Lw,Luall(D1,𝒲K)\displaystyle\overline{C}^{all}(D_{1},\mathscr{W}_{K})=\lim_{L_{w},L_{u}\to\infty}\;\overline{C}_{L_{w},L_{u}}^{all}(D_{1},\mathscr{W}_{K}) (6.3)

under the “detect-all” criterion.

Proof: see appendix.

We conjecture that the upper bound on capacity given by Theorem VI.1 is generally not tight. The insight here is that the upper bound remains valid if the class of encoding functions is enlarged to include feedback from the receiver: Xki=f~i(𝐒,Mk,Yi1)X_{ki}=\tilde{f}_{i}({\mathbf{S}},M_{k},Y^{i-1}) for 1iN1\leq i\leq N. It can indeed be verified that all the inequalities in the proof and the Markov chain properties hold. The question is now whether feedback can increase public fingerprinting capacity. We conjecture the answer is yes, because feedback is known to increase MAC capacity [24].

We also make the stronger conjecture that the maximum over p(XU)𝒦|SWp_{(XU)_{{\cal K}}|SW} is achieved by a p.m.f. that decouples the components (Xk,Uk),k𝒦(X_{k},U_{k}),\,k\in{\cal K}, conditioned on (S,W)(S,W). If this is true, the set 𝒫(XU)𝒦W|Souter(pS,Lw,Lu,D1)\mathscr{P}_{(XU)_{{\cal K}}W|S}^{outer}(p_{S},L_{w},L_{u},D_{1}) in the formula (6.2) can be replaced with the smaller set 𝒫(XU)𝒦W|S(pS,Lw,Lu,D1)\mathscr{P}_{(XU)_{{\cal K}}W|S}(p_{S},L_{w},L_{u},D_{1}) of (4.2), and the random coding scheme of Sec. V is capacity-achieving.

VII Conclusion

We have proposed a communication model and a random-coding scheme for blind fingerprinting. While a standard binning scheme for communication with asymmetric side information at the transmitter and the receiver may seem like a reasonable candidate, such a scheme would be unable to trade false-positive error exponents against false-negative error exponents. Our proposed binning scheme combines two ideas. The first is the use of a stacked binning scheme as in [11], which demonstrated the advantages (in terms of decoding error exponents) of selecting codewords from an array whose size depends on the conditional type of the host sequence. The second is the use of an auxiliary time-sharing random variable as in [10]. The blind fingerprint decoders of Secs. IV and V combine the advantages of both methods and provide positive error exponents for a range of code rates. The tradeoff between the two fundamental types of error probabilities is determined by the value of the parameter Δ\Delta.

Appendix A Proof of Theorem IV.1

We derive the error exponents for the thresholding rule (4.1). We have 𝒲={1,2,,Lw}{\cal W}=\{1,2,\cdots,L_{w}\} and 𝒰={1,2,,Lu}{\cal U}=\{1,2,\cdots,L_{u}\}. Fix some arbitrarily small ϵ>0\epsilon>0. Define for all m𝒦m\in{\cal K}

𝒫Y(XU)𝒦|SW[N](p𝐰,p𝐬|𝐰,p𝐱𝐮|𝐬𝐰,𝒲K,R,Lw,Lu,m)\displaystyle\mathscr{P}_{Y(XU)_{{\cal K}}|SW}^{[N]}(p_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_{K},R,L_{w},L_{u},m) =\displaystyle= {p𝐲(𝐱𝐮)𝒦|𝐬𝐰:p(𝐱𝐮)𝒦|𝐬𝐰(p𝐱𝐮|𝐬𝐰),\displaystyle\left\{p_{{\mathbf{y}}({\mathbf{x}}{\mathbf{u}})_{{\cal K}}|{\mathbf{s}}{\mathbf{w}}}\,:~p_{({\mathbf{x}}{\mathbf{u}})_{{\cal K}}|{\mathbf{s}}{\mathbf{w}}}\in{\cal M}(p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}}),\right.
p𝐲|𝐱𝒦𝒲K,I(𝐮m;𝐲|𝐬d𝐰)ρ(p𝐬|𝐬d𝐰)+R}\displaystyle\left.p_{{\mathbf{y}}|{\mathbf{x}}_{{\cal K}}}\in\mathscr{W}_{K},~I({\mathbf{u}}_{m};{\mathbf{y}}|{\mathbf{s}}^{d}{\mathbf{w}})\leq\rho(p_{{\mathbf{s}}|{\mathbf{s}}^{d}{\mathbf{w}}})+R\right\}
E˘psp,m,N(R,Lw,Lu,p𝐰,p𝐬|𝐰,p𝐱𝐮|𝐬𝐰,𝒲K)\displaystyle\breve{E}_{psp,m,N}(R,L_{w},L_{u},p_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_{K}) =\displaystyle= minp𝐲(𝐱𝐮)𝒦|𝐬𝐰𝒫Y(XU)𝒦|SW[N](p𝐰,p𝐬|𝐰,p𝐱𝐮|𝐬𝐰,𝒲K,R,Lw,Lu,m)\displaystyle\min_{p_{{\mathbf{y}}({\mathbf{x}}{\mathbf{u}})_{{\cal K}}|{\mathbf{s}}{\mathbf{w}}}\in\mathscr{P}_{Y(XU)_{{\cal K}}|SW}^{[N]}(p_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_{K},R,L_{w},L_{u},m)} (A.1)
D(p𝐲(𝐱𝐮)𝒦|𝐬𝐰p𝐲|𝐱𝒦p𝐱𝐮|𝐬𝐰𝒦|p𝐬𝐰),\displaystyle\quad D(p_{{\mathbf{y}}({\mathbf{x}}{\mathbf{u}})_{{\cal K}}|{\mathbf{s}}{\mathbf{w}}}\|p_{{\mathbf{y}}|{\mathbf{x}}_{{\cal K}}}\,p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}}^{{\cal K}}|p_{{\mathbf{s}}{\mathbf{w}}}),
E^psp,m,N(R,Lw,Lu,p𝐰,p𝐬|𝐰,p𝐱𝐮|𝐬𝐰,𝒲K)\displaystyle\hat{E}_{psp,m,N}(R,L_{w},L_{u},p_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_{K}) =\displaystyle= D(p𝐬|𝐰pS|p𝐰)+E˘psp,m,N(R,Lw,Lu,\displaystyle D(p_{{\mathbf{s}}|{\mathbf{w}}}\|p_{S}\,|\,p_{{\mathbf{w}}})+\breve{E}_{psp,m,N}(R,L_{w},L_{u}, (A.2)
p𝐰,p𝐬|𝐰,p𝐱𝐮|𝐬𝐰,𝒲K)\displaystyle\quad p_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_{K})
=\displaystyle= minp𝐲(𝐱𝐮)𝒦|𝐬𝐰𝒫Y(XU)𝒦|SW[N](p𝐰,p𝐬|𝐰,p𝐱𝐮|𝐬𝐰,𝒲K,R,Lw,Lu,m)\displaystyle\min_{p_{{\mathbf{y}}({\mathbf{x}}{\mathbf{u}})_{{\cal K}}|{\mathbf{s}}{\mathbf{w}}}\in\mathscr{P}_{Y(XU)_{{\cal K}}|SW}^{[N]}(p_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_{K},R,L_{w},L_{u},m)}
D(p𝐲(𝐱𝐮)𝒦|𝐬𝐰p𝐬|𝐰p𝐲|𝐱𝒦p𝐱𝐮|𝐬𝐰𝒦pS|p𝐰),\displaystyle\quad D(p_{{\mathbf{y}}({\mathbf{x}}{\mathbf{u}})_{{\cal K}}|{\mathbf{s}}{\mathbf{w}}}\,p_{{\mathbf{s}}|{\mathbf{w}}}\|p_{{\mathbf{y}}|{\mathbf{x}}_{{\cal K}}}\,p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}}^{{\cal K}}\,p_{S}\,|p_{{\mathbf{w}}}),
E^¯psp,N(R,Lw,Lu,p𝐰,p𝐬|𝐰,p𝐱𝐮|𝐬𝐰,𝒲K)\displaystyle\overline{\hat{E}}_{psp,N}(R,L_{w},L_{u},p_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_{K}) =\displaystyle= maxm𝒦E^psp,m,N(R,Lw,Lu,p𝐰,p𝐬|𝐰,p𝐱𝐮|𝐬𝐰,𝒲K)\displaystyle\max_{m\in{\cal K}}\hat{E}_{psp,m,N}(R,L_{w},L_{u},p_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_{K})
E¯^psp,N(R,Lw,Lu,p𝐰,p𝐬|𝐰,p𝐱𝐮|𝐬𝐰,𝒲K)\displaystyle\underline{\hat{E}}_{psp,N}(R,L_{w},L_{u},p_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_{K}) =\displaystyle= minm𝒦E^psp,m,N(R,Lw,Lu,p𝐰,p𝐬|𝐰,p𝐱𝐮|𝐬𝐰,𝒲K)\displaystyle\min_{m\in{\cal K}}\hat{E}_{psp,m,N}(R,L_{w},L_{u},p_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_{K})

where (A.2) is obtained by application of the chain rule for divergence. Also define

Epsp,N(R,Lw,Lu,D1,𝒲K)\displaystyle E_{psp,N}(R,L_{w},L_{u},D_{1},\mathscr{W}_{K}) =\displaystyle= maxp𝐰𝒫W[N]minp𝐬|𝐰𝒫S|W[N]maxp𝐱𝐮|𝐬𝐰𝒫XU|SW[N](p𝐰p𝐬|𝐰,Lw,Lu,D1)\displaystyle\max_{p_{{\mathbf{w}}}\in\mathscr{P}_{W}^{[N]}}\min_{p_{{\mathbf{s}}|{\mathbf{w}}}\in\mathscr{P}_{S|W}^{[N]}}\max_{p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}}\in\mathscr{P}_{XU|SW}^{[N]}(p_{{\mathbf{w}}}\,p_{{\mathbf{s}}|{\mathbf{w}}},L_{w},L_{u},D_{1})} (A.5)
E^psp,1,N(R,Lw,Lu,p𝐰,p𝐬|𝐰,p𝐱𝐮|𝐬𝐰,𝒲Knomfair).\displaystyle\hat{E}_{psp,1,N}(R,L_{w},L_{u},p_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_{K_{nom}}^{fair}).

Denote by p𝐰p_{{\mathbf{w}}}^{*} and p𝐱𝐮|𝐬𝐰p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}}^{*} the maximizers above, the latter viewed as a function of p𝐬|𝐰p_{{\mathbf{s}}|{\mathbf{w}}}. Both maximizers depend implicitly on RR and 𝒲Knomfair\mathscr{W}_{K_{nom}}^{fair}. Let

E¯psp,N(R,Lw,Lu,D1,𝒲K)\displaystyle\overline{E}_{psp,N}(R,L_{w},L_{u},D_{1},\mathscr{W}_{K}) =\displaystyle= minp𝐬|𝐰𝒫S|W[N]E^¯psp,N(R,Lw,Lu,p𝐰,p𝐬|𝐰,p𝐱𝐮|𝐬𝐰),\displaystyle\min_{p_{{\mathbf{s}}|{\mathbf{w}}}\in\mathscr{P}_{S|W}^{[N]}}\overline{\hat{E}}_{psp,N}(R,L_{w},L_{u},p_{{\mathbf{w}}}^{*},p_{{\mathbf{s}}|{\mathbf{w}}},p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}}^{*}), (A.6)
E¯psp,N(R,Lw,Lu,D1,𝒲K)\displaystyle\underline{E}_{psp,N}(R,L_{w},L_{u},D_{1},\mathscr{W}_{K}) =\displaystyle= minp𝐬|𝐰𝒫S|W[N]E¯^psp,N(R,Lw,Lu,p𝐰,p𝐬|𝐰,p𝐱𝐮|𝐬𝐰).\displaystyle\min_{p_{{\mathbf{s}}|{\mathbf{w}}}\in\mathscr{P}_{S|W}^{[N]}}\underline{\hat{E}}_{psp,N}(R,L_{w},L_{u},p_{{\mathbf{w}}}^{*},p_{{\mathbf{s}}|{\mathbf{w}}},p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}}^{*}). (A.7)

The exponents (A.2)–(A.7) differ from (4.4)–(4.9) in that the optimizations are performed over conditional types instead of general conditional p.m.f.’s. We have

limNE¯psp,N(R,Lw,Lu,D1,𝒲K)\displaystyle\lim_{N\to\infty}\overline{E}_{psp,N}(R,L_{w},L_{u},D_{1},\mathscr{W}_{K}) =\displaystyle= E¯psp(R,Lw,Lu,D1,𝒲K)\displaystyle\overline{E}_{psp}(R,L_{w},L_{u},D_{1},\mathscr{W}_{K}) (A.8)
limNE¯psp,N(R,Lw,Lu,D1,𝒲K)\displaystyle\lim_{N\to\infty}\underline{E}_{psp,N}(R,L_{w},L_{u},D_{1},\mathscr{W}_{K}) =\displaystyle= E¯psp(R,Lw,Lu,D1,𝒲K)\displaystyle\underline{E}_{psp}(R,L_{w},L_{u},D_{1},\mathscr{W}_{K}) (A.9)

by continuity of the divergence and mutual-information functionals.

Consider the maximization over the conditional type p𝐱𝐮|𝐬𝐰p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}} in (A.5). As a result of this maximization, we may associate the following:

  • to any (𝐬,𝐰)({\mathbf{s}},{\mathbf{w}}), a conditional type class TU|SW(𝐬,𝐰)T𝐮|𝐬𝐰T_{U|SW}^{*}({\mathbf{s}},{\mathbf{w}})\triangleq T_{{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}}^{*};

  • to any (𝐬d,𝐰)({\mathbf{s}}^{d},{\mathbf{w}}), a conditional type class TU|SdW(𝐬d,𝐰)T𝐮|𝐬d𝐰T_{U|S^{d}W}^{*}({\mathbf{s}}^{d},{\mathbf{w}})\triangleq T_{{\mathbf{u}}|{\mathbf{s}}^{d}{\mathbf{w}}}^{*};

  • to any (𝐬,𝐰)({\mathbf{s}},{\mathbf{w}}) and 𝐮TU|SW(𝐬𝐰){\mathbf{u}}\in T_{U|SW}^{*}({\mathbf{s}}{\mathbf{w}}), a conditional type class TX|USW(𝐮,𝐬,𝐰)T𝐱|𝐮𝐬𝐰T_{X|USW}^{*}({\mathbf{u}},{\mathbf{s}},{\mathbf{w}})\triangleq T_{{\mathbf{x}}|{\mathbf{u}}{\mathbf{s}}{\mathbf{w}}}^{*};

  • to any type p𝐬𝐰p_{{\mathbf{s}}{\mathbf{w}}}, a conditional mutual information IUS|SdW(p𝐬𝐰)I(𝐮;𝐬|𝐬d,𝐰)I_{US|S^{d}W}^{*}(p_{{\mathbf{s}}{\mathbf{w}}})\triangleq I({\mathbf{u}};{\mathbf{s}}|{\mathbf{s}}^{d},{\mathbf{w}}) where 𝐮,𝐬,𝐰{\mathbf{u}},{\mathbf{s}},{\mathbf{w}} are any three sequences with joint type p𝐮|𝐬𝐰p𝐬𝐰p_{{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}}^{*}p_{{\mathbf{s}}{\mathbf{w}}}.

Codebook. Define the function

ρ(p𝐬|𝐬d𝐰)=IUS|SdW(p𝐬𝐰)+ϵ,p𝐬𝐰𝒫SW[N].\rho(p_{{\mathbf{s}}|{\mathbf{s}}^{d}{\mathbf{w}}})=I_{US|S^{d}W}^{*}(p_{{\mathbf{s}}{\mathbf{w}}})+\epsilon,\quad\forall p_{{\mathbf{s}}{\mathbf{w}}}\in\mathscr{P}_{SW}^{[N]}.

A random constant-composition code

𝒞(𝐬d,𝐰,p𝐬|𝐬d𝐰)={𝐮(l,m,p𝐬|𝐬d𝐰),  1lexp2{Nρ(p𝐬|𝐬d𝐰)}, 1m2NR}{\cal C}({\mathbf{s}}^{d},{\mathbf{w}},p_{{\mathbf{s}}|{\mathbf{s}}^{d}{\mathbf{w}}})=\{{\mathbf{u}}(l,m,p_{{\mathbf{s}}|{\mathbf{s}}^{d}{\mathbf{w}}}),\;\,1\leq l\leq\exp_{2}\{N\rho(p_{{\mathbf{s}}|{\mathbf{s}}^{d}{\mathbf{w}}})\},\;1\leq m\leq 2^{NR}\}

is generated for each 𝐬d(𝒮d)N{\mathbf{s}}^{d}\in({\cal S}^{d})^{N}, 𝐰T𝐰{\mathbf{w}}\in T_{{\mathbf{w}}}^{*}, and p𝐬|𝐬d𝐰𝒫S|SdW[N]p_{{\mathbf{s}}|{\mathbf{s}}^{d}{\mathbf{w}}}\in\mathscr{P}_{S|S^{d}W}^{[N]} by drawing exp2{N(R+ρ(p𝐬|𝐬d𝐰))}\exp_{2}\{N(R+\rho(p_{{\mathbf{s}}|{\mathbf{s}}^{d}{\mathbf{w}}}))\} random sequences independently and uniformly from the conditional type class TU|SdW(𝐬d,𝐰)T_{U|S^{d}W}^{*}({\mathbf{s}}^{d},{\mathbf{w}}), and arranging them into an array with 2NR2^{NR} columns and exp2{Nρ(p𝐬|𝐬d𝐰)}\exp_{2}\{N\rho(p_{{\mathbf{s}}|{\mathbf{s}}^{d}{\mathbf{w}}})\} rows.

Encoder. Prior to encoding, a sequence 𝐖𝒲N{\mathbf{W}}\in{\cal W}^{N} is drawn independently of 𝐒{\mathbf{S}} and uniformly from T𝐰T_{{\mathbf{w}}}^{*}, and shared with the receiver. Given (𝐒,𝐖)({\mathbf{S}},{\mathbf{W}}), the encoder determines the conditional type p𝐬|𝐬d𝐰p_{{\mathbf{s}}|{\mathbf{s}}^{d}{\mathbf{w}}} and performs the following two steps for each user 1m2NR1\leq m\leq 2^{NR}.

  1. 1.

    Find ll such that 𝐮(l,m,p𝐬|𝐬d𝐰)𝒞(𝐒d,𝐖,p𝐬|𝐬d𝐰)TU|SW(𝐬,𝐰){\mathbf{u}}(l,m,p_{{\mathbf{s}}|{\mathbf{s}}^{d}{\mathbf{w}}})\in{\cal C}({\mathbf{S}}^{d},{\mathbf{W}},p_{{\mathbf{s}}|{\mathbf{s}}^{d}{\mathbf{w}}})\bigcap T_{U|SW}^{*}({\mathbf{s}},{\mathbf{w}}). If more than one such ll exists, pick one of them randomly (with uniform distribution). Let 𝐮=𝐮(l,m,p𝐬|𝐬d𝐰){\mathbf{u}}={\mathbf{u}}(l,m,p_{{\mathbf{s}}|{\mathbf{s}}^{d}{\mathbf{w}}}). If no such ll can be found, generate 𝐮{\mathbf{u}} uniformly from the conditional type class TU|SW(𝐬,𝐰)T_{U|SW}^{*}({\mathbf{s}},{\mathbf{w}}).

  2. 2.

    Generate 𝐗m{\mathbf{X}}_{m} uniformly distributed over the conditional type class TX|USW(𝐮,𝐬,𝐰)T_{X|USW}^{*}({\mathbf{u}},{\mathbf{s}},{\mathbf{w}}).
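The two encoding steps above can be sketched as follows. This is a toy stand-in, not the paper's construction: the conditional type class is simplified to an agreement-count constraint between 𝐮 and 𝐬, and the array dimensions are small constants rather than the exponential sizes 2^{NR} and exp₂{Nρ(·)}:

```python
import numpy as np

rng = np.random.default_rng(1)
N, rows, users = 16, 128, 3        # toy sizes; the paper uses 2^{N rho} rows
agree_target = 12                  # stand-in for the class T*_{U|SW}(s, w)

# Stacked-binning codebook: one column of candidate codewords per user,
# drawn i.i.d. uniform (stand-in for uniform draws from T*_{U|S^d W}).
codebook = rng.integers(0, 2, size=(rows, users, N))

s = rng.integers(0, 2, N)          # host sequence

def encode(m):
    """Step 1: scan column m for a codeword whose joint 'type' with s
    meets the target (here: exact agreement count); pick one such row
    uniformly at random; fall back to a fresh draw if none exists."""
    hits = [l for l in range(rows)
            if np.sum(codebook[l, m] == s) == agree_target]
    if hits:
        return codebook[rng.choice(hits), m]
    # Encoding-error branch: draw u directly from the target class.
    flip = rng.permutation(N)[: N - agree_target]
    u = s.copy()
    u[flip] ^= 1
    return u

u = encode(0)
assert np.sum(u == s) == agree_target   # u lies in the target class
```

Step 2 (generating 𝐗ₘ from the conditional type class given 𝐮, 𝐬, 𝐰) would follow the same pattern and is omitted here.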

Collusion channel. By Prop. III.1, it is sufficient to restrict our attention to strongly exchangeable collusion channels in the error probability analysis.

Decoder. Given (𝐲,𝐬d,𝐰)({\mathbf{y}},{\mathbf{s}}^{d},{\mathbf{w}}), the decoder outputs 𝒦^\hat{{\cal K}} if and only if (4.1) is satisfied.

Encoding errors. Analogously to [11], the probability of encoding errors vanishes doubly exponentially with NN because ρ(p𝐬|𝐬d𝐰)>I(𝐮;𝐬|𝐬d𝐰)\rho(p_{{\mathbf{s}}|{\mathbf{s}}^{d}{\mathbf{w}}})>I({\mathbf{u}};{\mathbf{s}}|{\mathbf{s}}^{d}{\mathbf{w}}). Indeed an encoding error for user mm arises under the following event:

m\displaystyle{\cal E}_{m} =\displaystyle= {(𝒞,𝐬,𝐰):(𝐮(l,m,p𝐬|𝐬d𝐰)𝒞and𝐮(l,m,p𝐬|𝐬d𝐰)TU|SW(𝐬,𝐰))for1l2Nρ(p𝐬|𝐬d𝐰)}.\displaystyle\{({\cal C},{\mathbf{s}},{\mathbf{w}})~:~({\mathbf{u}}(l,m,p_{{\mathbf{s}}|{\mathbf{s}}^{d}{\mathbf{w}}})\in{\cal C}~\mathrm{and}~{\mathbf{u}}(l,m,p_{{\mathbf{s}}|{\mathbf{s}}^{d}{\mathbf{w}}})\notin T_{U|SW}^{*}({\mathbf{s}},{\mathbf{w}}))~\mathrm{for}~1\leq l\leq 2^{N\rho(p_{{\mathbf{s}}|{\mathbf{s}}^{d}{\mathbf{w}}})}\}.

The probability that a sequence ${\mathbf{U}}$ uniformly distributed over $T^*_{U|S^dW}({\mathbf{s}}^d,{\mathbf{w}})$ also belongs to $T^*_{U|SW}({\mathbf{s}},{\mathbf{w}})$ is equal to $\exp_2\{-NI^*_{US|S^dW}(p_{{\mathbf{s}}{\mathbf{w}}})\}$ on the exponential scale. Therefore the encoding error probability, conditioned on type class $T_{{\mathbf{s}}{\mathbf{w}}}$, satisfies

\begin{eqnarray*}
Pr[{\cal E}_m\,|\,({\mathbf{S}},{\mathbf{W}})\in T_{{\mathbf{s}}{\mathbf{w}}}] &=& \left(1-\frac{|T^*_{U|SW}({\mathbf{S}},{\mathbf{W}})|}{|T^*_{U|S^dW}({\mathbf{S}}^d,{\mathbf{W}})|}\right)^{2^{N\rho(p_{{\mathbf{s}}|{\mathbf{s}}^d{\mathbf{w}}})}} \qquad \mbox{(A.11)}\\
&\doteq& \left(1-2^{-NI^*_{US|S^dW}(p_{{\mathbf{s}}{\mathbf{w}}})}\right)^{2^{N\rho(p_{{\mathbf{s}}|{\mathbf{s}}^d{\mathbf{w}}})}}\\
&\leq& \exp\left\{-\exp_2\left(N\left[\rho(p_{{\mathbf{s}}|{\mathbf{s}}^d{\mathbf{w}}})-I^*_{US|S^dW}(p_{{\mathbf{s}}{\mathbf{w}}})\right]\right)\right\}\\
&=& \exp\{-2^{N\epsilon}\}
\end{eqnarray*}

where $\epsilon=\rho(p_{{\mathbf{s}}|{\mathbf{s}}^d{\mathbf{w}}})-I^*_{US|S^dW}(p_{{\mathbf{s}}{\mathbf{w}}})>0$ and the inequality follows from $1-a\leq e^{-a}$.
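The double-exponential decay can be checked numerically. The sketch below is illustrative only; `rho` and `I` are stand-ins for $\rho(p_{{\mathbf{s}}|{\mathbf{s}}^d{\mathbf{w}}})$ and $I^*_{US|S^dW}(p_{{\mathbf{s}}{\mathbf{w}}})$, and both sides of the inequality are evaluated in the log-domain to avoid overflow:

```python
import math

def encoding_error_bound(N, rho, I):
    """Evaluate, in log-domain, the two sides of
    (1 - 2^{-N I})^{2^{N rho}} <= exp(-2^{N (rho - I)}),
    for rho > I (the condition that makes the encoding-error
    probability vanish doubly exponentially in N)."""
    log_lhs = (2.0 ** (N * rho)) * math.log1p(-(2.0 ** (-N * I)))
    log_rhs = -(2.0 ** (N * (rho - I)))
    return log_lhs, log_rhs

# The bound 1 - a <= e^{-a} holds term by term, so log_lhs <= log_rhs always,
# and log_lhs itself decays like -2^{N(rho - I)}.
for N in (10, 20, 30):
    lhs, rhs = encoding_error_bound(N, rho=0.5, I=0.3)
    assert lhs <= rhs
```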

The derivation of the decoding error exponents is based on the following two asymptotic equalities which are special cases of (C.2) and (C.5) established in Lemma C.1.

1) Fix ${\mathbf{y}},{\mathbf{s}}^d,{\mathbf{w}}$ and draw ${\mathbf{u}}$ uniformly from some fixed type class, independently of $({\mathbf{y}},{\mathbf{s}}^d,{\mathbf{w}})$. Then

\begin{eqnarray*}
Pr[I({\mathbf{u}};{\mathbf{y}}|{\mathbf{s}}^d{\mathbf{w}})\geq\nu] \doteq 2^{-N\nu}. \qquad \mbox{(A.12)}
\end{eqnarray*}

2) Given ${\mathbf{s}},{\mathbf{w}}$, draw $({\mathbf{x}}_k,{\mathbf{u}}_k)$, $k\in{\cal K}$, i.i.d. uniformly from a conditional type class $T_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}}$, and then draw ${\mathbf{y}}$ uniformly over a single conditional type class $T_{{\mathbf{y}}|{\mathbf{x}}_{\cal K}}$. For any $\nu>0$, we have

\begin{eqnarray*}
Pr[I({\mathbf{u}}_m;{\mathbf{y}}|{\mathbf{s}}^d{\mathbf{w}})\leq\rho(p_{{\mathbf{s}}|{\mathbf{s}}^d{\mathbf{w}}})+\nu] \doteq \exp_2\{-N\breve{E}_{psp,m,N}(\nu,L_w,L_u,p^*_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p^*_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_K)\}. \qquad \mbox{(A.13)}
\end{eqnarray*}

(i). False Positives. From (4.1), the occurrence of a false positive implies that

\begin{eqnarray*}
\exists\lambda\in\mathscr{P}^{[N]}_{S|S^dW},\,l,\,m\notin{\cal K}\,:\quad I({\mathbf{u}}(l,m,\lambda);{\mathbf{y}}|{\mathbf{s}}^d{\mathbf{w}})>\rho(\lambda)+R+\Delta. \qquad \mbox{(A.14)}
\end{eqnarray*}

By construction of the codebook, ${\mathbf{u}}(l,m,\lambda)$ is independent of ${\mathbf{y}}$ for $m\notin{\cal K}$. For any given $\lambda$, there are at most $2^{N\rho(\lambda)}$ possible values for $l$ and $2^{NR}-K$ possible values for $m$ in (A.14). Hence the probability of false positives, conditioned on the joint type class $T_{{\mathbf{y}}({\mathbf{x}}{\mathbf{u}})_{\cal K}{\mathbf{s}}{\mathbf{w}}}$, is

\begin{eqnarray*}
P_{FP}(T_{{\mathbf{y}}({\mathbf{x}}{\mathbf{u}})_{\cal K}{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_K) &\leq& \sum_\lambda (2^{NR}-K)\,2^{N\rho(\lambda)}\,Pr[I({\mathbf{u}}(l,m,\lambda);{\mathbf{y}}|{\mathbf{s}}^d{\mathbf{w}})>\rho(\lambda)+R+\Delta] \qquad \mbox{(A.15)}\\
&\stackrel{(a)}{\doteq}& \sum_\lambda 2^{N(R+\rho(\lambda))}\,2^{-N(R+\Delta+\rho(\lambda))}\\
&\stackrel{(b)}{\leq}& (N+1)^{|{\cal S}|L_w}\,2^{-N\Delta}\\
&\doteq& 2^{-N\Delta}
\end{eqnarray*}

where (a) is obtained by application of (A.12) with $\nu=\rho(\lambda)+R+\Delta$, and (b) holds because the number of conditional types $\lambda$ is at most $(N+1)^{|{\cal S}|L_w}$.

Averaging over all type classes $T_{{\mathbf{y}}({\mathbf{x}}{\mathbf{u}})_{\cal K}{\mathbf{s}}{\mathbf{w}}}$, we obtain $P_{FP}\mbox{$\>\stackrel{\centerdot}{\leq}\>$}2^{-N\Delta}$, from which (4.10) follows.
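The key step (b) above is that a polynomial number of conditional types cannot change the exponent. A minimal numeric sketch (illustrative; `c` plays the role of $|{\cal S}|L_w$):

```python
import math

def fp_exponent(N, Delta, c):
    """Normalized exponent -(1/N) log2 of the union bound (N+1)^c * 2^{-N Delta}:
    the polynomial factor counting conditional types (at most (N+1)^c) is
    absorbed on the exponential scale, so the exponent tends to Delta."""
    log2_bound = c * math.log2(N + 1) - N * Delta
    return -log2_bound / N

exps = [fp_exponent(N, Delta=0.1, c=8) for N in (10**2, 10**4, 10**6)]
# The exponent increases toward Delta = 0.1 as N grows.
assert exps[0] < exps[1] < exps[2] < 0.1
```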

(ii). Detect-One Error Criterion (Miss All Colluders). We first derive the error exponent for the event that the decoder misses a specific colluder $m\in{\cal K}$. Any coalition $\hat{{\cal K}}$ that contains $m$ fails the test (4.1), i.e., for any such $\hat{{\cal K}}$,

\begin{eqnarray*}
\forall\lambda\in\mathscr{P}^{[N]}_{S|S^dW}\,:\quad \max_l I({\mathbf{u}}(l,m,\lambda);{\mathbf{y}}|{\mathbf{s}}^d{\mathbf{w}})\leq\rho(\lambda)+R+\Delta. \qquad \mbox{(A.16)}
\end{eqnarray*}

This implies that

\begin{eqnarray*}
I({\mathbf{u}}(l,m,p_{{\mathbf{s}}|{\mathbf{s}}^d{\mathbf{w}}});{\mathbf{y}}|{\mathbf{s}}^d{\mathbf{w}})\leq\rho(p_{{\mathbf{s}}|{\mathbf{s}}^d{\mathbf{w}}})+R+\Delta \qquad \mbox{(A.17)}
\end{eqnarray*}

where $l$ is the row index actually selected by the encoder, and $p_{{\mathbf{s}}|{\mathbf{s}}^d{\mathbf{w}}}$ is the actual conditional type of the host sequence. The probability of the miss-$m$ event, given the joint type $p^*_{{\mathbf{w}}}\,p_{{\mathbf{s}}|{\mathbf{w}}}\,p^*_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}}$, is therefore upper-bounded by the probability of the event (A.17):

\begin{eqnarray*}
p_{miss-m}(p^*_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p^*_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_K) &\leq& Pr\left[I({\mathbf{u}}(l,m,p_{{\mathbf{s}}|{\mathbf{s}}^d{\mathbf{w}}});{\mathbf{y}}|{\mathbf{s}}^d{\mathbf{w}})\leq\rho(p_{{\mathbf{s}}|{\mathbf{s}}^d{\mathbf{w}}})+R+\Delta\right]\\
&\stackrel{(a)}{\mbox{$\>\stackrel{\centerdot}{\leq}\>$}}& \exp_2\left\{-N\breve{E}_{psp,m,N}(R+\Delta,L_w,L_u,p^*_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p^*_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_K)\right\}
\end{eqnarray*}

where (a) follows from (A.13) with $\nu=R+\Delta$.

The miss-all event is the intersection of the miss-$m$ events over $m\in{\cal K}$. Its conditional probability is

\begin{eqnarray*}
p_{miss-all}(p^*_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p^*_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_K) &=& Pr\left[\bigcap_{m\in{\cal K}}\left\{\mathrm{miss~}m~|~p^*_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p^*_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}}\right\}\right] \qquad \mbox{(A.18)}\\
&\leq& \min_{m\in{\cal K}} p_{miss-m}(p^*_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p^*_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_K)\\
&\doteq& \exp_2\left\{-N\max_{m\in{\cal K}}\breve{E}_{psp,m,N}(R+\Delta,L_w,L_u,p^*_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p^*_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_K)\right\}.
\end{eqnarray*}

Averaging over ${\mathbf{S}}$, we obtain

\begin{eqnarray*}
p_{miss-all}(\mathscr{W}_K) &\leq& \sum_{p_{{\mathbf{s}}|{\mathbf{w}}}} Pr[T_{{\mathbf{s}}|{\mathbf{w}}}]\,p_{miss-all}(p^*_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p^*_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_K)\\
&\doteq& \max_{p_{{\mathbf{s}}|{\mathbf{w}}}} Pr[T_{{\mathbf{s}}|{\mathbf{w}}}]\,p_{miss-all}(p^*_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p^*_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_K)\\
&\stackrel{(a)}{\doteq}& \max_{p_{{\mathbf{s}}|{\mathbf{w}}}} \exp_2\left\{-N\left[D(p_{{\mathbf{s}}|{\mathbf{w}}}\|p_S\,|\,p^*_{{\mathbf{w}}})+\max_{m\in{\cal K}}\breve{E}_{psp,m,N}(R+\Delta,L_w,L_u,p^*_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p^*_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_K)\right]\right\}\\
&\stackrel{(b)}{\doteq}& \exp_2\left\{-N\overline{E}_{psp,N}(R+\Delta,L_w,L_u,D_1,\mathscr{W}_K)\right\}\\
&\stackrel{(c)}{\doteq}& \exp_2\left\{-N\overline{E}_{psp}(R+\Delta,L_w,L_u,D_1,\mathscr{W}_K)\right\}
\end{eqnarray*}

which establishes (4.12). Here (a) follows from (C.3) and (A.18), (b) from (LABEL:eq:Emax-tilde-simple-N) and (A.6), and (c) from (A.8).

(iii). Detect-All Error Criterion (Miss Some Colluders).

The miss-some event is the union of the miss-$m$ events over $m\in{\cal K}$. Given the joint type $p^*_{{\mathbf{w}}}\,p_{{\mathbf{s}}|{\mathbf{w}}}\,p^*_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}}$, the probability of this event is

\begin{eqnarray*}
p_{miss-some}(p^*_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p^*_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_K) &=& Pr\left[\bigcup_{m\in{\cal K}}\left\{\mathrm{miss~}m~|~p^*_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p^*_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}}\right\}\right]\\
&\leq& \sum_{m\in{\cal K}} p_{miss-m}(p^*_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p^*_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_K)\\
&\doteq& \max_{m\in{\cal K}} \exp_2\left\{-N\breve{E}_{psp,m,N}(R+\Delta,L_w,L_u,p^*_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p^*_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_K)\right\}\\
&=& \exp_2\left\{-N\min_{m\in{\cal K}}\breve{E}_{psp,m,N}(R+\Delta,L_w,L_u,p^*_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p^*_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_K)\right\}. \qquad \mbox{(A.20)}
\end{eqnarray*}
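The passage from the union bound to the $\min$-exponent uses the standard fact that a sum of exponentials is dominated, on the exponential scale, by its largest term. A small numeric sketch (illustrative; the list `E` stands for the exponents $\breve{E}_{psp,m,N}$):

```python
import math

def union_bound_exponent(N, exponents):
    """Normalized exponent -(1/N) log2 of sum_m 2^{-N E_m}: on the exponential
    scale the sum is dominated by its largest term, so the resulting
    exponent is min_m E_m."""
    log2_sum = math.log2(sum(2.0 ** (-N * E) for E in exponents))
    return -log2_sum / N

E = [0.3, 0.12, 0.45]
vals = [union_bound_exponent(N, E) for N in (50, 200, 800)]
assert abs(vals[-1] - min(E)) < 1e-2
```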

Averaging over ${\mathbf{S}}$, we obtain

\begin{eqnarray*}
p_{miss-some}(\mathscr{W}_K) &\leq& \sum_{p_{{\mathbf{s}}|{\mathbf{w}}}} Pr[T_{{\mathbf{s}}|{\mathbf{w}}}]\,p_{miss-some}(p^*_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p^*_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_K)\\
&\stackrel{(a)}{\doteq}& \max_{p_{{\mathbf{s}}|{\mathbf{w}}}} \exp_2\left\{-N\left[D(p_{{\mathbf{s}}|{\mathbf{w}}}\|p_S\,|\,p^*_{{\mathbf{w}}})+\min_{m\in{\cal K}}\breve{E}_{psp,m,N}(R+\Delta,L_w,L_u,p^*_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p^*_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_K)\right]\right\}\\
&\stackrel{(b)}{\leq}& \exp_2\left\{-N\underline{E}_{psp,N}(R+\Delta,L_w,L_u,D_1,\mathscr{W}_K)\right\}\\
&\stackrel{(c)}{\doteq}& \exp_2\left\{-N\underline{E}_{psp}(R+\Delta,L_w,L_u,D_1,\mathscr{W}_K)\right\}
\end{eqnarray*}

which establishes (4.11). Here (a) follows from (C.3) and (A.20), (b) from (A.7) and (LABEL:eq:Emin-tilde-simple-N), and (c) from (A.9).

(iv). Fair Collusion Channels. The proof parallels that of [10, Theorem 4.1(iv)], using the conditional divergence $D(\tilde{p}_{Y(XU)_{\cal K}|SW}\,\tilde{p}_{S|W}\|\tilde{p}_{Y|X_{\cal K}}\,p^{\cal K}_{XU|SW}\,p_S\,|\,p_W)$ in place of $D(\tilde{p}_{YX_{\cal K}|W}\|\tilde{p}_{Y|X_{\cal K}}\,p^{\cal K}_{X|W}\,|\,p_W)$.

(v). Immediate, because $\overline{E}_{psp}=\underline{E}_{psp}$ in this case.

(vi). Positive Error Exponents. From Part (v) above, we may restrict our attention to $\mathscr{W}_K=\mathscr{W}_K^{fair}$. Consider any ${\cal W}=\{1,\cdots,L_w\}$ and $p_W$ that is positive over its support set (if it is not, reduce the value of $L_w$ accordingly). For any $m\in{\cal K}$, the minimand in the expression (4.4) for $\tilde{E}_{psp,m}(R,L_w,L_u,p_W,p_{XU|SW},\mathscr{W}_K^{fair})$ is zero if and only if

\begin{eqnarray*}
\tilde{p}_{Y(XU)_{\cal K}|SW}\,\tilde{p}_{S|W}=\tilde{p}_{Y|X_{\cal K}}\,p^{\cal K}_{XU|SW}\,p_S,\quad \mathrm{with~}\tilde{p}_{Y|X_{\cal K}}\in\mathscr{W}_K^{fair}.
\end{eqnarray*}

Such $(\tilde{p}_{Y(XU)_{\cal K}|SW},\tilde{p}_{S|W})$ is feasible for (4.3) if and only if $(p_{XU|SW},\tilde{p}_{Y|X_{\cal K}})$ is such that $I(U_m;Y|S^d,W)\leq I(U_m;S|S^d,W)+R$. It is not feasible, and thus a positive exponent $E^{one}$ is guaranteed, if $R<I(U_1;Y|S^d,W)-I(U_1;S|S^d,W)$. The supremum of all such $R$ is given by (4.13) and is achieved by letting $\epsilon\to 0$, $\Delta\to 0$, and $L_w,L_u\to\infty$. $\Box$
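The rate expression $I(U;Y|\cdot)-I(U;S|\cdot)$ can be made concrete on a toy example. The sketch below (illustrative only; it drops the conditioning on $S^d$ and $W$, and the joint pmf `pusy` is an arbitrary assumption, not an optimized choice) computes $I(U;Y)-I(U;S)$ for a binary triple in which $U$ is more strongly correlated with $Y$ than with $S$, so the difference is positive:

```python
import itertools
import math

def mutual_information(pxy):
    """I(X;Y) in bits for a joint pmf given as {(x, y): prob}."""
    px, py = {}, {}
    for (x, y), p in pxy.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(p * math.log2(p / (px[x] * py[y]))
               for (x, y), p in pxy.items() if p > 0)

# Toy joint pmf p(u, s, y): U mildly correlated with the host S,
# strongly correlated with the channel output Y.
pusy = {}
for u, s, y in itertools.product((0, 1), repeat=3):
    p_s = 0.5
    p_u_given_s = 0.6 if u == s else 0.4
    p_y_given_u = 0.9 if y == u else 0.1
    pusy[(u, s, y)] = p_s * p_u_given_s * p_y_given_u

p_uy, p_us = {}, {}
for (u, s, y), p in pusy.items():
    p_uy[(u, y)] = p_uy.get((u, y), 0.0) + p
    p_us[(u, s)] = p_us.get((u, s), 0.0) + p

rate = mutual_information(p_uy) - mutual_information(p_us)
assert rate > 0  # a positive exponent is guaranteed for all R below this value
```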

Appendix B Proof of Theorem V.2

We derive the error exponents for the M2PMI decision rule (5.2). Define for all ${\cal A}\subseteq{\cal K}$

\begin{eqnarray*}
\mathscr{P}^{[N]}_{Y(XU)_{\cal K}|SW}(p_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_K,R,L_w,L_u,{\cal A}) &=& \left\{p_{{\mathbf{y}}({\mathbf{x}}{\mathbf{u}})_{\cal K}|{\mathbf{s}}{\mathbf{w}}}\,:\ p_{({\mathbf{x}}{\mathbf{u}})_{\cal K}|{\mathbf{s}}{\mathbf{w}}}\in{\cal M}(p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}}),\right. \qquad \mbox{(B.1)}\\
&& \quad p_{{\mathbf{y}}|{\mathbf{x}}_{\cal K}}\in\mathscr{W}_K,\\
&& \quad \left.{\overset{\circ}{I}}({\mathbf{u}}_{\cal A};{\mathbf{y}}{\mathbf{u}}_{{\cal K}\setminus{\cal A}}|{\mathbf{s}}^d{\mathbf{w}})\leq|{\cal A}|(\rho(p_{{\mathbf{s}}|{\mathbf{s}}^d{\mathbf{w}}})+R)\right\}\\
\breve{E}_{psp,{\cal A},N}(R,L_w,L_u,p_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_K) &=& \min_{p_{{\mathbf{y}}({\mathbf{x}}{\mathbf{u}})_{\cal K}|{\mathbf{s}}{\mathbf{w}}}\in\mathscr{P}^{[N]}_{Y(XU)_{\cal K}|SW}(p_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_K,R,L_w,L_u,{\cal A})} \qquad \mbox{(B.2)}\\
&& \quad D(p_{{\mathbf{y}}({\mathbf{x}}{\mathbf{u}})_{\cal K}|{\mathbf{s}}{\mathbf{w}}}\|p_{{\mathbf{y}}|{\mathbf{x}}_{\cal K}}\,p^{\cal K}_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}}\,|\,p_{{\mathbf{s}}{\mathbf{w}}}),\\
\hat{E}_{psp,{\cal A},N}(R,L_w,L_u,p_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_K) &=& D(p_{{\mathbf{s}}|{\mathbf{w}}}\|p_S\,|\,p_{{\mathbf{w}}})+\breve{E}_{psp,{\cal A},N}(R,L_w,L_u,p_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_K) \qquad \mbox{(B.3)}\\
&=& \min_{p_{{\mathbf{y}}({\mathbf{x}}{\mathbf{u}})_{\cal K}|{\mathbf{s}}{\mathbf{w}}}\in\mathscr{P}^{[N]}_{Y(XU)_{\cal K}|SW}(p_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_K,R,L_w,L_u,{\cal A})}\\
&& \quad D(p_{{\mathbf{y}}({\mathbf{x}}{\mathbf{u}})_{\cal K}|{\mathbf{s}}{\mathbf{w}}}\,p_{{\mathbf{s}}|{\mathbf{w}}}\|p_{{\mathbf{y}}|{\mathbf{x}}_{\cal K}}\,p^{\cal K}_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}}\,p_S\,|\,p_{{\mathbf{w}}}),\\
\overline{\hat{E}}_{psp,N}(R,L_w,L_u,p_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_K) &=& \hat{E}_{psp,{\cal K},N}(R,L_w,L_u,p_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_K), \qquad \mbox{(B.4)}\\
\underline{\hat{E}}_{psp,N}(R,L_w,L_u,p_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_K) &=& \min_{{\cal A}\subseteq{\cal K}}\hat{E}_{psp,{\cal A},N}(R,L_w,L_u,p_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_K),\\
E_{psp,N}(R,L_w,L_u,D_1,\mathscr{W}_K) &=& \max_{p_{{\mathbf{w}}}\in\mathscr{P}^{[N]}_W}\;\min_{p_{{\mathbf{s}}|{\mathbf{w}}}\in\mathscr{P}^{[N]}_{S|W}}\;\max_{p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}}\in\mathscr{P}^{[N]}_{XU|SW}(p_{{\mathbf{w}}}p_{{\mathbf{s}}|{\mathbf{w}}},L_w,L_u,D_1)}\\
&& \quad \hat{E}_{psp,{\cal K},N}(R,L_w,L_u,p_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}^{fair}_{K_{nom}}).
\end{eqnarray*}

Denote by $p^*_{{\mathbf{w}}}$ and $p^*_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}}$ the maximizers in (LABEL:eq:Epsp-N), the latter viewed as a function of $p_{{\mathbf{s}}|{\mathbf{w}}}$. Both maximizers depend implicitly on $R$, $D_1$, and $\mathscr{W}^{fair}_{K_{nom}}$. Let

\begin{eqnarray*}
\overline{E}_{psp,N}(R,L_w,L_u,D_1,\mathscr{W}_K) &=& \min_{p_{{\mathbf{s}}|{\mathbf{w}}}}\overline{\hat{E}}_{psp,N}(R,L_w,L_u,p^*_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p^*_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_K) \qquad \mbox{(B.7)}\\
\underline{E}_{psp,N}(R,L_w,L_u,D_1,\mathscr{W}_K) &=& \min_{p_{{\mathbf{s}}|{\mathbf{w}}}}\underline{\hat{E}}_{psp,N}(R,L_w,L_u,p^*_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p^*_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_K). \qquad \mbox{(B.8)}
\end{eqnarray*}

The exponents (B.3)–(B.8) differ from (5.7)–(5.12) in that the optimizations are performed over conditional types instead of general conditional p.m.f.'s. We have

\begin{eqnarray*}
\lim_{N\to\infty}\overline{E}_{psp,N}(R,L_w,L_u,D_1,\mathscr{W}_K) &=& \overline{E}_{psp}(R,L_w,L_u,D_1,\mathscr{W}_K) \qquad \mbox{(B.9)}\\
\lim_{N\to\infty}\underline{E}_{psp,N}(R,L_w,L_u,D_1,\mathscr{W}_K) &=& \underline{E}_{psp}(R,L_w,L_u,D_1,\mathscr{W}_K) \qquad \mbox{(B.10)}
\end{eqnarray*}

by continuity of the divergence and mutual-information functionals.

The codebook and encoding procedure are exactly as in the proof of Theorem IV, the difference being that $p^*_{{\mathbf{w}}}$ and $p^*_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}}$ are solutions to the optimization problem (LABEL:eq:Epsp-N) instead of (A.5). The decoding rule is the M2PMI rule of (5.2).

To analyze the error probability for this random-coding scheme, it is again sufficient to restrict our attention to strongly-exchangeable channels and use the bound (3.2) on the conditional probability of the collusion channel output. We also use Lemma C.1.

(i). False Positives. By application of (5.4), a false positive occurs if $\hat{{\cal K}}\setminus{\cal K}\neq\emptyset$ and

\begin{eqnarray*}
\exists\lambda\in\mathscr{P}^{[N]}_{S|S^dW}\,:\ \forall{\cal A}\subseteq\hat{{\cal K}}\,:\ \exists l_{\hat{{\cal K}}}\,:\quad {\overset{\circ}{I}}({\mathbf{u}}(l_{\cal A},m_{\cal A},\lambda);{\mathbf{y}}{\mathbf{u}}(l_{\hat{{\cal K}}\setminus{\cal A}},m_{\hat{{\cal K}}\setminus{\cal A}},\lambda)\,|\,{\mathbf{s}}^d{\mathbf{w}})>|{\cal A}|(\rho(\lambda)+R+\Delta). \qquad \mbox{(B.11)}
\end{eqnarray*}

The probability of this event is upper-bounded by the probability of the larger event

\begin{eqnarray*}
\forall{\cal A}\subseteq\hat{{\cal K}}\,:\ \exists\lambda,l_{\hat{{\cal K}}}\,:\quad {\overset{\circ}{I}}({\mathbf{u}}(l_{\cal A},m_{\cal A},\lambda);{\mathbf{y}}{\mathbf{u}}(l_{\hat{{\cal K}}\setminus{\cal A}},m_{\hat{{\cal K}}\setminus{\cal A}},\lambda)\,|\,{\mathbf{s}}^d{\mathbf{w}})>|{\cal A}|(\rho(\lambda)+R+\Delta). \qquad \mbox{(B.12)}
\end{eqnarray*}

Denote by $p^*_{{\mathbf{s}}|{\mathbf{s}}^d{\mathbf{w}}}$ the conditional type of the host sequence and by $l^*_{\cal K}$ the row indices selected by the encoder. To each triple $(\hat{{\cal K}},\lambda,l_{\hat{{\cal K}}})$, we associate a unique subset ${\cal B}$ of ${\cal K}\cap\hat{{\cal K}}$ defined as follows:

  • If $\lambda\neq p^*_{{\mathbf{s}}|{\mathbf{s}}^d{\mathbf{w}}}$, then ${\cal B}=\emptyset$.

  • If $\lambda=p^*_{{\mathbf{s}}|{\mathbf{s}}^d{\mathbf{w}}}$, then ${\cal B}$ is the (possibly empty) set of all indices $k\in{\cal K}\cap\hat{{\cal K}}$ such that $l_k=l^*_k$.

Thus {\cal B} is the set of colluder indices k𝒦k\in{\cal K} for which the decoder correctly identifies the conditional host sequence type p𝐬|𝐬d𝐰p_{{\mathbf{s}}|{\mathbf{s}}^{d}{\mathbf{w}}}^{*} and the codewords 𝐮(lk,k,p𝐬|𝐬d𝐰){\mathbf{u}}(l_{k}^{*},k,p_{{\mathbf{s}}|{\mathbf{s}}^{d}{\mathbf{w}}}^{*}) that were assigned by the encoder. Denoting by Ω()\Omega({\cal B}) the set of pairs (λ,l𝒦^)(\lambda,l_{\hat{{\cal K}}}) associated with {\cal B}, we rewrite (B.12) as
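The bookkeeping behind ${\cal B}$ is purely combinatorial and can be sketched directly (a toy illustration, not part of the proof; the function name and the small index sets are hypothetical):

```python
def associated_subset(K, K_hat, lam, l, lam_star, l_star):
    """The subset B of K ∩ K_hat associated with a triple (K_hat, lam, l):
    empty unless lam equals the true conditional type lam_star, in which
    case B collects the colluder indices k whose decoded row l[k] matches
    the row l_star[k] actually selected by the encoder."""
    if lam != lam_star:
        return set()
    return {k for k in K & K_hat if l[k] == l_star[k]}

K = {1, 2, 3}                      # true coalition
K_hat = {2, 3, 4}                  # accused coalition; user 4 is innocent
l_star = {1: 7, 2: 5, 3: 9}        # rows chosen by the encoder
l = {2: 5, 3: 1, 4: 0}             # rows hypothesized by the decoder
B = associated_subset(K, K_hat, "p*", l, "p*", l_star)
assert B == {2}
A = K_hat - B                      # contains the innocent user 4, so |A| >= 1
assert 4 in A and len(A) >= 1
```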

\begin{eqnarray*}
\forall{\cal A}\subseteq\hat{{\cal K}}\,:\ \exists{\cal B}\subseteq{\cal K}\cap\hat{{\cal K}},\ \exists(\lambda,l_{\hat{{\cal K}}})\in\Omega({\cal B})\,:\quad {\overset{\circ}{I}}({\mathbf{u}}(l_{\cal A},m_{\cal A},\lambda);{\mathbf{y}}{\mathbf{u}}(l_{\hat{{\cal K}}\setminus{\cal A}},m_{\hat{{\cal K}}\setminus{\cal A}},\lambda)\,|\,{\mathbf{s}}^d{\mathbf{w}})>|{\cal A}|(\rho(\lambda)+R+\Delta). \qquad \mbox{(B.13)}
\end{eqnarray*}

Define the complement set ${\cal A}=\hat{{\cal K}}\setminus{\cal B}$, which comprises all incorrectly accused users as well as any colluder $k$ such that $\lambda\neq p^*_{{\mathbf{s}}|{\mathbf{s}}^d{\mathbf{w}}}$ or $l_k\neq l^*_k$. Since ${\cal B}\subseteq\hat{{\cal K}}$ and there is at least one innocent user in $\hat{{\cal K}}$, the cardinality of ${\cal A}$ is at least 1. By construction of the codebook and definition of ${\cal A}$ and ${\cal B}$, ${\mathbf{u}}(l_{\cal A},m_{\cal A},\lambda)$ is independent of ${\mathbf{y}}$ and ${\mathbf{u}}(l^*_{\cal B},m_{\cal B},p^*_{{\mathbf{s}}|{\mathbf{s}}^d{\mathbf{w}}})$. The probability of the event (B.13) is upper-bounded by the probability of the larger event

\begin{eqnarray*}
\exists{\cal B}\subseteq{\cal K},\ \exists\lambda,l_{\cal A},m_{\cal A}\,:\quad {\overset{\circ}{I}}({\mathbf{u}}(l_{\cal A},m_{\cal A},\lambda);{\mathbf{y}}{\mathbf{u}}(l^*_{\cal B},m_{\cal B},p^*_{{\mathbf{s}}|{\mathbf{s}}^d{\mathbf{w}}})\,|\,{\mathbf{s}}^d{\mathbf{w}})>|{\cal A}|(\rho(\lambda)+R+\Delta). \qquad \mbox{(B.14)}
\end{eqnarray*}

Hence the probability of false positives, conditioned on $T_{{\mathbf{y}}({\mathbf{x}}{\mathbf{u}})_{\cal K}{\mathbf{s}}{\mathbf{w}}}$, satisfies

\begin{eqnarray*}
P_{FP}(T_{{\mathbf{y}}({\mathbf{x}}{\mathbf{u}})_{\cal K}{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_K) &=& Pr\left[\bigcup_{{\cal B}\subseteq{\cal K}}\bigcup_{|{\cal A}|\geq 1}\left\{\exists\lambda,l_{\cal A},m_{\cal A}\,:\ {\overset{\circ}{I}}({\mathbf{u}}(l_{\cal A},m_{\cal A},\lambda);{\mathbf{y}}{\mathbf{u}}(l^*_{\cal B},m_{\cal B},p^*_{{\mathbf{s}}|{\mathbf{s}}^d{\mathbf{w}}})\,|\,{\mathbf{s}}^d{\mathbf{w}})>|{\cal A}|(\rho(\lambda)+R+\Delta)\right\}\right] \qquad \mbox{(B.15)}\\
&\leq& \sum_{{\cal B}\subseteq{\cal K}}\sum_{|{\cal A}|\geq 1}P_{{\cal B},|{\cal A}|}(T_{{\mathbf{y}}({\mathbf{x}}{\mathbf{u}})_{\cal K}{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_K)
\end{eqnarray*}

where

\begin{eqnarray*}
P_{{\cal B},|{\cal A}|}(T_{{\mathbf{y}}({\mathbf{x}}{\mathbf{u}})_{\cal K}{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_K) = Pr\left[\exists\lambda,l_{\cal A},m_{\cal A}\,:\ {\overset{\circ}{I}}({\mathbf{u}}(l_{\cal A},m_{\cal A},\lambda);{\mathbf{y}}{\mathbf{u}}(l^*_{\cal B},m_{\cal B},p^*_{{\mathbf{s}}|{\mathbf{s}}^d{\mathbf{w}}})\,|\,{\mathbf{s}}^d{\mathbf{w}})>|{\cal A}|(\rho(\lambda)+R+\Delta)\right]. \qquad \mbox{(B.16)}
\end{eqnarray*}

By definition of ${\cal B}$, there are at most $\sum_{\lambda\neq p_{{\mathbf{s}}|{\mathbf{s}}^d{\mathbf{w}}}}2^{N|{\cal A}|\rho(\lambda)}$ possible values for $l_{\cal A}$ and $2^{N|{\cal A}|R}$ possible values for $m_{\cal A}$ in (B.16). Hence

\begin{eqnarray*}
P_{{\cal B},|{\cal A}|}(T_{{\mathbf{y}}({\mathbf{x}}{\mathbf{u}})_{\cal K}{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_K) &\leq& \sum_\lambda 2^{N|{\cal A}|(R+\rho(\lambda))}\,Pr[{\overset{\circ}{I}}({\mathbf{u}}(l_{\cal A},m_{\cal A},\lambda);{\mathbf{y}}{\mathbf{u}}(l^*_{\cal B},m_{\cal B},p^*_{{\mathbf{s}}|{\mathbf{s}}^d{\mathbf{w}}})\,|\,{\mathbf{s}}^d{\mathbf{w}})>|{\cal A}|(\rho(\lambda)+R+\Delta)] \qquad \mbox{(B.17)}\\
&\stackrel{(a)}{\doteq}& \sum_\lambda 2^{N|{\cal A}|(R+\rho(\lambda))}\,2^{-N|{\cal A}|(R+\Delta+\rho(\lambda))}\\
&\leq& (N+1)^{|{\cal S}|}\,2^{-N|{\cal A}|\Delta}\\
&\doteq& 2^{-N|{\cal A}|\Delta}
\end{eqnarray*}

where (a) is obtained by application of (C.2) with ${\mathbf{y}}{\mathbf{u}}(l^*_{\cal B},m_{\cal B},p^*_{{\mathbf{s}}|{\mathbf{s}}^d{\mathbf{w}}})$ in place of ${\mathbf{z}}$.

Combining (B.15) and (B.17) we obtain

\begin{eqnarray*}
P_{FP}(T_{{\mathbf{y}}({\mathbf{x}}{\mathbf{u}})_{\cal K}{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_K) &\leq& \sum_{{\cal B}\subseteq{\cal K}}\sum_{|{\cal A}|\geq 1}2^{-N|{\cal A}|\Delta}\\
&\doteq& 2^{-N\Delta}.
\end{eqnarray*}

Averaging over all joint type classes $T_{{\mathbf{y}}({\mathbf{x}}{\mathbf{u}})_{\cal K}{\mathbf{s}}{\mathbf{w}}}$, we obtain $P_{FP}\mbox{$\>\stackrel{\centerdot}{\leq}\>$}2^{-N\Delta}$, from which (5.13) follows.

(ii). Detect-All Criterion. (Miss Some Colluders.)

Under the miss-some error event, any coalition $\hat{{\cal K}}$ that contains ${\cal K}$ fails the test. By (5.4), this implies

\begin{eqnarray*}
\forall\lambda\in\mathscr{P}^{[N]}_{S|S^dW}\,:\ \exists{\cal A}\subseteq\hat{{\cal K}}\,:\quad \max_{l_{\hat{{\cal K}}}}{\overset{\circ}{I}}({\mathbf{u}}(l_{\cal A},m_{\cal A},\lambda);{\mathbf{y}}{\mathbf{u}}(l_{\hat{{\cal K}}\setminus{\cal A}},m_{\hat{{\cal K}}\setminus{\cal A}},\lambda)\,|\,{\mathbf{s}}^d{\mathbf{w}})\leq|{\cal A}|(\rho(\lambda)+R+\Delta). \qquad \mbox{(B.18)}
\end{eqnarray*}

In particular, for $\hat{{\cal K}}={\cal K}$ we have

\begin{eqnarray*}
\exists{\cal A}\subseteq{\cal K}\,:\quad {\overset{\circ}{I}}({\mathbf{u}}(l_{\cal A},m_{\cal A},p_{{\mathbf{s}}|{\mathbf{s}}^d{\mathbf{w}}});{\mathbf{y}}{\mathbf{u}}(l_{{\cal K}\setminus{\cal A}},m_{{\cal K}\setminus{\cal A}},p_{{\mathbf{s}}|{\mathbf{s}}^d{\mathbf{w}}})\,|\,{\mathbf{s}}^d{\mathbf{w}})\leq|{\cal A}|(\rho(p_{{\mathbf{s}}|{\mathbf{s}}^d{\mathbf{w}}})+R+\Delta) \qquad \mbox{(B.19)}
\end{eqnarray*}

where $l_{\cal K}$ are the row indices actually selected by the encoder, and $p_{{\mathbf{s}}|{\mathbf{s}}^d{\mathbf{w}}}$ is the actual conditional type of the host sequence. The probability of the miss-some event, conditioned on $({\mathbf{s}},{\mathbf{w}})$, is therefore upper-bounded by the probability of the event (B.19):

\begin{eqnarray*}
p_{miss-some}(p^*_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p^*_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_K) &\leq& Pr\left[\bigcup_{{\cal A}\subseteq{\cal K}}\left\{{\overset{\circ}{I}}({\mathbf{u}}(l_{\cal A},m_{\cal A},p_{{\mathbf{s}}|{\mathbf{s}}^d{\mathbf{w}}});{\mathbf{y}}{\mathbf{u}}(l_{{\cal K}\setminus{\cal A}},m_{{\cal K}\setminus{\cal A}},p_{{\mathbf{s}}|{\mathbf{s}}^d{\mathbf{w}}})\,|\,{\mathbf{s}}^d{\mathbf{w}})\leq|{\cal A}|(\rho(p_{{\mathbf{s}}|{\mathbf{s}}^d{\mathbf{w}}})+R+\Delta)\right\}\right] \qquad \mbox{(B.20)}\\
&\leq& \sum_{{\cal A}\subseteq{\cal K}}Pr\left[{\overset{\circ}{I}}({\mathbf{u}}(l_{\cal A},m_{\cal A},p_{{\mathbf{s}}|{\mathbf{s}}^d{\mathbf{w}}});{\mathbf{y}}{\mathbf{u}}(l_{{\cal K}\setminus{\cal A}},m_{{\cal K}\setminus{\cal A}},p_{{\mathbf{s}}|{\mathbf{s}}^d{\mathbf{w}}})\,|\,{\mathbf{s}}^d{\mathbf{w}})\leq|{\cal A}|(\rho(p_{{\mathbf{s}}|{\mathbf{s}}^d{\mathbf{w}}})+R+\Delta)\right]\\
&\stackrel{(a)}{\mbox{$\>\stackrel{\centerdot}{\leq}\>$}}& \sum_{{\cal A}\subseteq{\cal K}}\exp_2\left\{-N\breve{E}_{psp,{\cal A},N}(R+\Delta,L_w,L_u,p^*_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p^*_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_K)\right\}\\
&\doteq& \max_{{\cal A}\subseteq{\cal K}}\exp_2\left\{-N\breve{E}_{psp,{\cal A},N}(R+\Delta,L_w,L_u,p^*_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p^*_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_K)\right\}\\
&=& \exp_2\left\{-N\min_{{\cal A}\subseteq{\cal K}}\breve{E}_{psp,{\cal A},N}(R+\Delta,L_w,L_u,p^*_{{\mathbf{w}}},p_{{\mathbf{s}}|{\mathbf{w}}},p^*_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}},\mathscr{W}_K)\right\}
\end{eqnarray*}

where (a) follows from (C.5) with $\nu=R+\Delta$.

Averaging over ${\mathbf{S}}$, we obtain

pmisssome(𝒲K)\displaystyle p_{miss-some(\mathscr{W}_{K})}
=\displaystyle= p𝐬|𝐰Pr[T𝐬|𝐰]pmisssome(p𝐰,p𝐬|𝐰,p𝐱𝐮|𝐬𝐰,𝒲K)\displaystyle\sum_{p_{{\mathbf{s}}|{\mathbf{w}}}}\,Pr[T_{{\mathbf{s}}|{\mathbf{w}}}]\,p_{miss-some}(p_{{\mathbf{w}}}^{*},p_{{\mathbf{s}}|{\mathbf{w}}},p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}}^{*},\mathscr{W}_{K})
(a)\displaystyle\stackrel{{\scriptstyle(a)}}{{\doteq}} maxp𝐬|𝐰exp2{N[D(p𝐬|𝐰pS|p𝐰)+min𝒜𝒦E˘psp,𝒜,N(R+Δ,Lw,Lu,p𝐰,p𝐬|𝐰,p𝐱𝐮|𝐬𝐰,𝒲K)]}\displaystyle\max_{p_{{\mathbf{s}}|{\mathbf{w}}}}\exp_{2}\left\{-N[D(p_{{\mathbf{s}}|{\mathbf{w}}}\|p_{S}\,|\,p_{{\mathbf{w}}})+\min_{{\cal A}\subseteq{\cal K}}\,\breve{E}_{psp,{\cal A},N}(R+\Delta,L_{w},L_{u},p_{{\mathbf{w}}}^{*},p_{{\mathbf{s}}|{\mathbf{w}}},p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}}^{*},\mathscr{W}_{K})]\right\}
=(b)\displaystyle\stackrel{{\scriptstyle(b)}}{{=}} maxp𝐬|𝐰exp2{NE¯^psp,N(R+Δ,Lw,Lu,p𝐰,p𝐬|𝐰,p𝐱𝐮|𝐬𝐰,𝒲K)}\displaystyle\max_{p_{{\mathbf{s}}|{\mathbf{w}}}}\exp_{2}\left\{-N\underline{\hat{E}}_{psp,N}(R+\Delta,L_{w},L_{u},p_{{\mathbf{w}}}^{*},p_{{\mathbf{s}}|{\mathbf{w}}},p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}}^{*},\mathscr{W}_{K})\right\}
=(c)\displaystyle\stackrel{{\scriptstyle(c)}}{{=}} exp2{NE¯psp,N(R+Δ,Lw,Lu,D1,𝒲K)}\displaystyle\exp_{2}\left\{-N\underline{E}_{psp,N}(R+\Delta,L_{w},L_{u},D_{1},\mathscr{W}_{K})\right\}
(d)\displaystyle\stackrel{{\scriptstyle(d)}}{{\doteq}} exp2{NE¯psp(R+Δ,Lw,Lu,D1,𝒲K)}\displaystyle\exp_{2}\left\{-N\underline{E}_{psp}(R+\Delta,L_{w},L_{u},D_{1},\mathscr{W}_{K})\right\}

which proves (5.14). Here (a) follows from (C.3) and (B.20), (b) from the definitions of the exponents in (B.3), (c) from (B.8), and (d) from the limit property (B.10).

(iii). Detect-One Criterion (Miss All Colluders). Either the estimated coalition 𝒦^\hat{{\cal K}} is empty, or it is a set {\cal I} of innocent users (disjoint from 𝒦{\cal K}). Hence PeonePr[𝒦^=]+Pr[𝒦^=]P_{e}^{one}\leq Pr[\hat{{\cal K}}=\emptyset]+Pr[\hat{{\cal K}}={\cal I}]. The first probability, conditioned on (𝐬d,𝐰)({\mathbf{s}}^{d},{\mathbf{w}}), is bounded as

Pr[𝒦^=]\displaystyle Pr[\hat{{\cal K}}=\emptyset] =\displaystyle= Pr[𝒦:M2PMI(𝒦)0]\displaystyle Pr[\forall{\cal K}^{\prime}~:~M2PMI({\cal K}^{\prime})\leq 0]
\displaystyle\leq Pr[M2PMI(𝒦)0]\displaystyle Pr[M2PMI({\cal K})\leq 0]
=\displaystyle= Pr[I(𝐮𝒦;𝐲|𝐬d𝐰)K(ρ(p𝐬|𝐬d𝐰)+R+Δ)]\displaystyle Pr[{\overset{\circ}{I}}({\mathbf{u}}_{{\cal K}};{\mathbf{y}}|{\mathbf{s}}^{d}{\mathbf{w}})\leq K(\rho(p_{{\mathbf{s}}|{\mathbf{s}}^{d}{\mathbf{w}}})+R+\Delta)]
(a)\displaystyle\stackrel{{\scriptstyle(a)}}{{\doteq}} exp2{NE˘psp,𝒦,N(R+Δ,Lw,Lu,p𝐰,p𝐬|𝐰,p𝐱𝐮|𝐬𝐰,𝒲K)}.\displaystyle\exp_{2}\left\{-N\breve{E}_{psp,{\cal K},N}(R+\Delta,L_{w},L_{u},p_{{\mathbf{w}}}^{*},p_{{\mathbf{s}}|{\mathbf{w}}},p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}}^{*},\mathscr{W}_{K})\right\}.

where (a) follows from (C.5) with ν=R+Δ\nu=R+\Delta. To bound Pr[𝒦^=]Pr[\hat{{\cal K}}={\cal I}], we use property (5.5) with 𝒦^=\hat{{\cal K}}={\cal I} and 𝒜=𝒦{\cal A}={\cal K}, which yields

I(𝐮𝒦;𝐲𝐮|𝐬d𝐰)K(ρ(p𝐬|𝐬d𝐰)+R+Δ).{\overset{\circ}{I}}({\mathbf{u}}_{{\cal K}};{\mathbf{y}}{\mathbf{u}}_{{\cal I}}|{\mathbf{s}}^{d}{\mathbf{w}})\leq K(\rho(p_{{\mathbf{s}}|{\mathbf{s}}^{d}{\mathbf{w}}})+R+\Delta).

Since

I(𝐮𝒦;𝐲𝐮|𝐬d𝐰)=I(𝐮𝒦;𝐲|𝐬d𝐰)+I(𝐮𝒦;𝐮|𝐲𝐬d𝐰)I(𝐮𝒦;𝐲|𝐬d𝐰){\overset{\circ}{I}}({\mathbf{u}}_{{\cal K}};{\mathbf{y}}{\mathbf{u}}_{{\cal I}}|{\mathbf{s}}^{d}{\mathbf{w}})={\overset{\circ}{I}}({\mathbf{u}}_{{\cal K}};{\mathbf{y}}|{\mathbf{s}}^{d}{\mathbf{w}})+I({\mathbf{u}}_{{\cal K}};{\mathbf{u}}_{{\cal I}}|{\mathbf{y}}{\mathbf{s}}^{d}{\mathbf{w}})\geq{\overset{\circ}{I}}({\mathbf{u}}_{{\cal K}};{\mathbf{y}}|{\mathbf{s}}^{d}{\mathbf{w}})

combining the two inequalities above yields

I(𝐮𝒦;𝐲|𝐬d𝐰)K(ρ(p𝐬|𝐬d𝐰)+R+Δ).{\overset{\circ}{I}}({\mathbf{u}}_{{\cal K}};{\mathbf{y}}|{\mathbf{s}}^{d}{\mathbf{w}})\leq K(\rho(p_{{\mathbf{s}}|{\mathbf{s}}^{d}{\mathbf{w}}})+R+\Delta).

The probability of this event is again given by (C.5) with ν=R+Δ\nu=R+\Delta; we conclude that

pmissall(p𝐰p𝐬|𝐰,p𝐱𝐮|𝐬𝐰,𝒲K)exp2{NE˘psp,𝒦,N(R+Δ,Lw,Lu,p𝐰,p𝐬|𝐰,p𝐱𝐮|𝐬𝐰,𝒲K)}.p_{miss-all}(p_{{\mathbf{w}}}^{*}\,p_{{\mathbf{s}}|{\mathbf{w}}},p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}}^{*},\mathscr{W}_{K})\doteq\exp_{2}\left\{-N\breve{E}_{psp,{\cal K},N}(R+\Delta,L_{w},L_{u},p_{{\mathbf{w}}}^{*},p_{{\mathbf{s}}|{\mathbf{w}}},p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}}^{*},\mathscr{W}_{K})\right\}.

Averaging over 𝐒{\mathbf{S}} and proceeding as in Part (ii) above, we obtain

pmissall(𝒲K)\displaystyle p_{miss-all}(\mathscr{W}_{K}) \displaystyle\leq p𝐬|𝐰Pr[T𝐬|𝐰]pmissall(p𝐰p𝐬|𝐰,p𝐱𝐮|𝐬𝐰,𝒲K)\displaystyle\sum_{p_{{\mathbf{s}}|{\mathbf{w}}}}\,Pr[T_{{\mathbf{s}}|{\mathbf{w}}}]\,p_{miss-all}(p_{{\mathbf{w}}}^{*}\,p_{{\mathbf{s}}|{\mathbf{w}}},p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}}^{*},\mathscr{W}_{K})
\displaystyle\doteq exp2{NE¯psp(R+Δ,Lw,Lu,D1,𝒲K)}\displaystyle\exp_{2}\left\{-N\overline{E}_{psp}(R+\Delta,L_{w},L_{u},D_{1},\mathscr{W}_{K})\right\}

which establishes (5.15).

(iv). Optimal Collusion Channels are Fair. The proof parallels that of [10, Theorem 4.1(iv)] and is omitted.

(v). Detect-All Exponent for Fair Collusion Channels. The proof parallels that of [10, Theorem 4.1(v)] and is omitted.

(vi). Achievable Rates. Consider any 𝒲={1,,Lw}{\cal W}=\{1,\cdots,L_{w}\} and pWp_{W} that is positive over 𝒲{\cal W} (if it is not, reduce the value of LwL_{w} accordingly). For any 𝒜𝒦{\cal A}\subseteq{\cal K}, the divergence to be minimized in the expression (5.7) for E~psp,𝒜(R,Lw,Lu,pW,p~S|W,pXU|SW,𝒲K)\tilde{E}_{psp,{\cal A}}(R,L_{w},L_{u},p_{W},\tilde{p}_{S|W},p_{XU|SW},\mathscr{W}_{K}) is zero if and only if

p~Y(XU)𝒦|SW=p~Y|X𝒦pXU|SW𝒦andp~S|W=pS.\tilde{p}_{Y(XU)_{{\cal K}}|SW}=\tilde{p}_{Y|X_{{\cal K}}}\,p_{XU|SW}^{{\cal K}}\quad\mathrm{and~}\tilde{p}_{S|W}=p_{S}.

These p.m.f.’s are feasible for (5.6) if and only if the inequality below holds:

1|𝒜|I(U𝒜;YU𝒦𝒜|Sd,W)>I(U;S|Sd,W)+R.\frac{1}{|{\cal A}|}I(U_{{\cal A}};YU_{{\cal K}\setminus{\cal A}}|S^{d},W)>I(U;S|S^{d},W)+R.

They are infeasible, and thus positive error exponents are guaranteed, if

R<min𝒜𝒦1|𝒜|I(U𝒜;YU𝒦𝒜|Sd,W)I(U;S|Sd,W).R<\min_{{\cal A}\subseteq{\cal K}}\frac{1}{|{\cal A}|}I(U_{{\cal A}};YU_{{\cal K}\setminus{\cal A}}|S^{d},W)-I(U;S|S^{d},W).

From Part (iv) above, we may restrict our attention to 𝒲K=𝒲Kfair\mathscr{W}_{K}=\mathscr{W}_{K}^{fair} under the detect-one criterion. Since the p.m.f. of (S,W,(XU)𝒦,Y)(S,W,(XU)_{{\cal K}},Y) is permutation-invariant, by application of [10, Eqn. (3.3)] with (U𝒦,Sd)(U_{{\cal K}},S^{d}) in place of (X𝒦,S)(X_{{\cal K}},S), we have

min𝒜𝒦1|𝒜|I(U𝒜;YU𝒦𝒜|SdW)=1KI(U𝒦;Y|SdW).\min_{{\cal A}\subseteq{\cal K}}\frac{1}{|{\cal A}|}I(U_{{\cal A}};YU_{{\cal K}\setminus{\cal A}}|S^{d}W)=\frac{1}{K}I(U_{{\cal K}};Y|S^{d}W). (B.22)

Hence the supremum of all RR for which the error exponents are positive is given by C¯one(D1,𝒲K)\underline{C}^{one}(D_{1},\mathscr{W}_{K}) in (5.16) and is obtained by letting ϵ0\epsilon\to 0, Δ0\Delta\to 0 and Lw,LuL_{w},L_{u}\to\infty.
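Although outside the formal proof, the reduction (B.22) can be sanity-checked numerically. The Python sketch below treats the simplest instance, K = 2 with the conditioning on (S^d, W) suppressed: it builds a permutation-invariant joint p.m.f. on (U_1, U_2, Y) from illustrative random weights and confirms that the minimum over coalition subsets is attained by the full-coalition term.

```python
import itertools
import random
from collections import defaultdict
from math import log2

random.seed(1)

# Permutation-invariant joint p.m.f. on (U1, U2, Y): symmetrize random weights.
keys = list(itertools.product([0, 1], repeat=3))
w = {k: random.random() for k in keys}
p = {(u1, u2, y): (w[(u1, u2, y)] + w[(u2, u1, y)]) / 2 for (u1, u2, y) in keys}
tot = sum(p.values())
p = {k: v / tot for k, v in p.items()}

def mi(A, B):
    """I(X_A; X_B) in bits; A and B are index tuples into (U1, U2, Y)."""
    pa, pb, pab = defaultdict(float), defaultdict(float), defaultdict(float)
    for k, pk in p.items():
        a = tuple(k[i] for i in A)
        b = tuple(k[i] for i in B)
        pa[a] += pk
        pb[b] += pk
        pab[(a, b)] += pk
    total = 0.0
    for k, pk in p.items():
        a = tuple(k[i] for i in A)
        b = tuple(k[i] for i in B)
        total += pk * log2(pab[(a, b)] / (pa[a] * pb[b]))
    return total

term_single = mi((0,), (2, 1))      # |A| = 1 term: I(U1; Y U2)
term_full = 0.5 * mi((0, 1), (2,))  # |A| = 2 term: (1/2) I(U1 U2; Y)
# Under exchangeability the minimum over subsets is the full-coalition term.
print(term_single, term_full)
```

The inequality I(U_1; Y U_2) >= (1/2) I(U_1 U_2; Y) holds for any exchangeable triple, which is the K = 2 content of (B.22).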

For any 𝒲K\mathscr{W}_{K}, under the detect-all criterion, the supremum of all RR for which error exponents are positive is given by C¯all(D1,𝒲K)\underline{C}^{all}(D_{1},\mathscr{W}_{K}) in (5.17) and is obtained by letting ϵ0\epsilon\to 0, Δ0\Delta\to 0 and Lw,LuL_{w},L_{u}\to\infty. Since the optimal conditional p.m.f. is not necessarily permutation-invariant, (B.22) does not hold in general. However, if 𝒲K=𝒲Kfair\mathscr{W}_{K}=\mathscr{W}_{K}^{fair}, (B.22) holds, and the same achievable rate is obtained for the detect-one and detect-all problems. \Box

Appendix C

Lemma C.1

1) Fix (𝐬d,𝐰)({\mathbf{s}}^{d},{\mathbf{w}}) and 𝐳𝒵N{\mathbf{z}}\in{\cal Z}^{N}, and draw 𝐮𝒦={𝐮m,m𝒦}{\mathbf{u}}_{{\cal K}}=\{{\mathbf{u}}_{m},\,m\in{\cal K}\} i.i.d. uniformly over a common type class T𝐮|𝐬d𝐰T_{{\mathbf{u}}|{\mathbf{s}}^{d}{\mathbf{w}}}, independently of 𝐳{\mathbf{z}}. We have the asymptotic equality

Pr[T𝐮𝒦|𝐳𝐬d𝐰]=|T𝐮𝒦|𝐳𝐬d𝐰||T𝐮|𝐬d𝐰|K2N[KH(𝐮|𝐬d𝐰)H(𝐮𝒦|𝐳𝐬d𝐰)]=2NI(𝐮𝒦;𝐳|𝐬d𝐰)Pr[T_{{\mathbf{u}}_{{\cal K}}|{\mathbf{z}}{\mathbf{s}}^{d}{\mathbf{w}}}]=\frac{|T_{{\mathbf{u}}_{{\cal K}}|{\mathbf{z}}{\mathbf{s}}^{d}{\mathbf{w}}}|}{|T_{{\mathbf{u}}|{\mathbf{s}}^{d}{\mathbf{w}}}|^{K}}\doteq 2^{-N[KH({\mathbf{u}}|{\mathbf{s}}^{d}{\mathbf{w}})-H({\mathbf{u}}_{{\cal K}}|{\mathbf{z}}{\mathbf{s}}^{d}{\mathbf{w}})]}=2^{-N{\overset{\circ}{I}}({\mathbf{u}}_{{\cal K}};{\mathbf{z}}|{\mathbf{s}}^{d}{\mathbf{w}})} (C.1)
Pr[I(𝐮𝒦;𝐳|𝐬d𝐰)ν]\displaystyle Pr[{\overset{\circ}{I}}({\mathbf{u}}_{{\cal K}};{\mathbf{z}}|{\mathbf{s}}^{d}{\mathbf{w}})\geq\nu] \displaystyle\doteq 2Nν.\displaystyle 2^{-N\nu}. (C.2)

2) Given 𝐰{\mathbf{w}}, draw 𝐬{\mathbf{s}} i.i.d. from pSp_{S}. We have [21]

Pr[T𝐬|𝐰]2ND(p𝐬|𝐰pS|p𝐰).Pr[T_{{\mathbf{s}}|{\mathbf{w}}}]\doteq 2^{-ND(p_{{\mathbf{s}}|{\mathbf{w}}}\|p_{S}|p_{{\mathbf{w}}})}. (C.3)

3) Given (𝐬,𝐰)({\mathbf{s}},{\mathbf{w}}), draw (𝐱k,𝐮k),k𝒦({\mathbf{x}}_{k},{\mathbf{u}}_{k}),\,k\in{\cal K}, i.i.d. uniformly from a conditional type class T𝐱𝐮|𝐬𝐰T_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}}, and then draw 𝐘{\mathbf{Y}} uniformly from a single conditional type class T𝐲|𝐱𝒦T_{{\mathbf{y}}|{\mathbf{x}}_{{\cal K}}}. We have

Pr[T𝐲(𝐱𝐮)𝒦|𝐬𝐰]\displaystyle Pr[T_{{\mathbf{y}}({\mathbf{x}}{\mathbf{u}})_{{\cal K}}|{\mathbf{s}}{\mathbf{w}}}] =\displaystyle= |T𝐲|(𝐱𝐮)𝒦𝐬𝐰||T𝐲|𝐱𝒦||T(𝐱𝐮)𝒦|𝐬𝐰||T𝐱𝐮|𝐬𝐰|K\displaystyle\frac{|T_{{\mathbf{y}}|({\mathbf{x}}{\mathbf{u}})_{{\cal K}}{\mathbf{s}}{\mathbf{w}}}|}{|T_{{\mathbf{y}}|{\mathbf{x}}_{{\cal K}}}|}\,\frac{|T_{({\mathbf{x}}{\mathbf{u}})_{{\cal K}}|{\mathbf{s}}{\mathbf{w}}}|}{|T_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}}|^{K}} (C.4)
\displaystyle\doteq exp2{ND(p𝐲𝐱𝒦|𝐬𝐰p𝐲|𝐱𝒦p𝐱𝐮|𝐬𝐰K|p𝐬𝐰)}.\displaystyle\exp_{2}\left\{-ND(p_{{\mathbf{y}}{\mathbf{x}}_{{\cal K}}|{\mathbf{s}}{\mathbf{w}}}\|p_{{\mathbf{y}}|{\mathbf{x}}_{{\cal K}}}\,p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}}^{K}\,|\,p_{{\mathbf{s}}{\mathbf{w}}})\right\}.

For any feasible, strongly exchangeable collusion channel, for any 𝒜𝒦{\cal A}\subseteq{\cal K} and ν>0\nu>0, we have

Pr[I(𝐮𝒜;𝐲𝐮𝒦𝒜|𝐬d𝐰)|𝒜|(ν+ρ(p𝐬|𝐬d𝐰))]\displaystyle Pr[{\overset{\circ}{I}}({\mathbf{u}}_{{\cal A}};{\mathbf{y}}{\mathbf{u}}_{{\cal K}\setminus{\cal A}}|{\mathbf{s}}^{d}{\mathbf{w}})\leq|{\cal A}|(\nu+\rho(p_{{\mathbf{s}}|{\mathbf{s}}^{d}{\mathbf{w}}}))] (C.5)
\displaystyle\doteq exp2{NE˘psp,𝒜,N(ν,L,p𝐰,p𝐬|𝐰,p𝐱𝐮|𝐬𝐰,𝒲K)}.\displaystyle\exp_{2}\left\{-N\breve{E}_{psp,{\cal A},N}(\nu,L,p_{{\mathbf{w}}}^{*},p_{{\mathbf{s}}|{\mathbf{w}}},p_{{\mathbf{x}}{\mathbf{u}}|{\mathbf{s}}{\mathbf{w}}}^{*},\mathscr{W}_{K})\right\}.

Proof: The derivation of (C.4), (C.3), and (C.5) parallels that of (D.12), (D.15) and (D.16) in [10].
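As an illustrative complement to Lemma C.1, the exponent in (C.3) can be checked numerically for a binary alphabet. The sketch below uses illustrative parameter values and keeps the computation in the log domain to avoid floating-point underflow; it evaluates the exact probability that an i.i.d. pS sequence lands in a given type class and compares the resulting exponent with the divergence.

```python
from math import lgamma, log, log2

def kl_binary(p, q):
    """D(p || q) in bits, Bernoulli(p) versus Bernoulli(q)."""
    out = 0.0
    for a, b in ((p, q), (1 - p, 1 - q)):
        if a > 0:
            out += a * log2(a / b)
    return out

pS, p_type = 0.2, 0.5  # source parameter and empirical type (illustrative)
for N in (100, 1000, 10000):
    k = round(p_type * N)
    # log2 Pr[T] = log2 C(N, k) + k log2 pS + (N - k) log2 (1 - pS)
    log2_comb = (lgamma(N + 1) - lgamma(k + 1) - lgamma(N - k + 1)) / log(2)
    log2_prob = log2_comb + k * log2(pS) + (N - k) * log2(1 - pS)
    print(N, -log2_prob / N)  # empirical exponent at blocklength N
print(kl_binary(p_type, pS))  # limiting exponent D(p_type || pS)
```

The printed exponents approach D(p_type || pS), with the familiar O((log N)/N) polynomial correction from the type-class cardinality.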

Appendix D Proof of Theorem VI.1

Let KK be the size of the coalition and (fN,gN)(f_{N},g_{N}) a sequence of length-NN, rate-RR randomized codes. We show that for any sequence of such codes, reliable decoding of all KK fingerprints is possible only if RC¯all(D1,𝒲K)R\leq\overline{C}^{all}(D_{1},\mathscr{W}_{K}). Recall that the encoder generates marked copies 𝐱m=fN(𝐬,v,m){\mathbf{x}}_{m}=f_{N}({\mathbf{s}},v,m) for 1m2NR1\leq m\leq 2^{NR} and that the decoder outputs an estimated coalition gN(𝐲,𝐬d,v){1,,2NR}g_{N}({\mathbf{y}},{\mathbf{s}}^{d},v)\in\{1,\cdots,2^{NR}\}^{\star}. We use the notation MK{M1,,MK}M^{K}\triangleq\{M_{1},\cdots,M_{K}\} and 𝐗K{𝐗1,,𝐗K}{\mathbf{X}}^{K}\triangleq\{{\mathbf{X}}_{1},\cdots,{\mathbf{X}}_{K}\}.

To prove that C¯all(D1,𝒲K)\overline{C}^{all}(D_{1},\mathscr{W}_{K}) is an upper bound on capacity, it suffices to identify a family of collusion channels for which reliable decoding is impossible at rates above C¯all(D1,𝒲K)\overline{C}^{all}(D_{1},\mathscr{W}_{K}). As shown in [10], it is sufficient to derive such a bound for the compound family 𝒲K\mathscr{W}_{K} of memoryless channels.

Our derivation is an extension of the single-user compound Gel’fand-Pinsker problem [11] to the multiple-access case. A lower bound on error probability is obtained when an oracle informs the decoder that the coalition size is at most KK.

There are (2NRK)2KNR\left(\begin{array}[]{c}2^{NR}\\ K\end{array}\right)\leq 2^{KNR} possible coalitions of size at most KK. We represent such a coalition as MK{M1,,MK}M^{K}\triangleq\{M_{1},\cdots,M_{K}\}, where Mk, 1kKM_{k},\,1\leq k\leq K, are drawn i.i.d. uniformly from {1,,2NR}\{1,\cdots,2^{NR}\}.
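The counting bound invoked here is elementary, since binom(M, K) is at most M^K / K! and hence at most M^K with M = 2^{NR}; a quick numerical check with small illustrative parameters:

```python
from math import comb

# Check C(2^{NR}, K) <= (2^{NR})^K = 2^{KNR} on small illustrative sizes.
for NR in (4, 8, 12):   # NR stands in for the product N * R
    M = 2 ** NR         # number of users
    for K in (1, 2, 3, 5):
        assert comb(M, K) <= M ** K
print("bound holds in all tested cases")
```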

Given a memoryless channel pY|XK𝒲Kp_{Y|X^{K}}\in\mathscr{W}_{K}, the joint p.m.f. of (MK,V,𝐒,𝐗K,𝐘)(M^{K},V,{\mathbf{S}},{\mathbf{X}}^{K},{\mathbf{Y}}) is given by

pMKV𝐒𝐗K𝐘=pSNpV1kK(pMk 1{𝐗k=fN(𝐒,V,Mk)})pY|XKN.p_{M^{K}V{\mathbf{S}}{\mathbf{X}}^{K}{\mathbf{Y}}}=p_{S}^{N}\,p_{V}\,\prod_{1\leq k\leq K}\left(p_{M_{k}}\,\mathds{1}_{\{{\mathbf{X}}_{k}=f_{N}({\mathbf{S}},V,M_{k})\}}\right)p_{Y|X^{K}}^{N}. (D.1)

Our derivations make repeated use of the identity

I(U𝒜;Y|Z,U𝒦𝒜)I(U𝒜;S|Z,U𝒦𝒜)=I(U𝒜;Y,Z|U𝒦𝒜)I(U𝒜;S,Z|U𝒦𝒜)I(U_{{\cal A}};Y|Z,U_{{\cal K}\setminus{\cal A}})-I(U_{{\cal A}};S|Z,U_{{\cal K}\setminus{\cal A}})=I(U_{{\cal A}};Y,Z|U_{{\cal K}\setminus{\cal A}})-I(U_{{\cal A}};S,Z|U_{{\cal K}\setminus{\cal A}})

which follows from the chain rule for conditional mutual information and holds for any (U𝒦,S,Y,Z)(U_{{\cal K}},S,Y,Z).
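The identity can be verified mechanically on any joint distribution. The Python sketch below instantiates a random positive p.m.f. on five binary variables playing the roles of U_1, U_2, S, Y, Z (so K = 2 and A = {1}, an illustrative choice) and checks that the two sides agree.

```python
import itertools
import random
from collections import defaultdict
from math import log2

random.seed(0)

# Random positive joint p.m.f. on (U1, U2, S, Y, Z), all binary (indices 0..4).
keys = list(itertools.product([0, 1], repeat=5))
w = [random.random() for _ in keys]
tot = sum(w)
p = {k: wi / tot for k, wi in zip(keys, w)}

def marg(idx):
    m = defaultdict(float)
    for k, pk in p.items():
        m[tuple(k[i] for i in idx)] += pk
    return m

def cmi(A, B, C):
    """I(X_A; X_B | X_C) in bits; A, B, C are index tuples into the 5-tuple."""
    pabc, pac, pbc, pc = marg(A + B + C), marg(A + C), marg(B + C), marg(C)
    val = 0.0
    for k, pk in p.items():
        a = tuple(k[i] for i in A)
        b = tuple(k[i] for i in B)
        c = tuple(k[i] for i in C)
        val += pk * log2(pabc[a + b + c] * pc[c] / (pac[a + c] * pbc[b + c]))
    return val

# A = {U1}, K \ A = {U2}; Z sits in the conditioning on the left-hand side.
lhs = cmi((0,), (3,), (4, 1)) - cmi((0,), (2,), (4, 1))
rhs = cmi((0,), (3, 4), (1,)) - cmi((0,), (2, 4), (1,))
print(lhs, rhs)  # the two sides coincide for any joint distribution
```

Both sides differ from one another only through I(U_A; Z | U_2), which the chain rule adds and subtracts, so equality holds identically.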

The total error probability (including false positives and false negatives) for the detect-all decoder is

Pe(pY|XK)=Pr[𝒦^𝒦]P_{e}(p_{Y|X^{K}})=Pr[\hat{{\cal K}}\neq{\cal K}] (D.2)

when collusion channel pY|XK𝒲Kp_{Y|X^{K}}\in\mathscr{W}_{K} is in effect.

Step 1. Following the derivation of [10, Eqn. (B.20)] with (𝐘,𝐒d,V)({\mathbf{Y}},{\mathbf{S}}^{d},V) in place of (𝐘,𝐒,V)({\mathbf{Y}},{\mathbf{S}},V) at the receiver, for the error probability Pe(pY|XK)P_{e}(p_{Y|X^{K}}) to vanish for each pY|XK𝒲Kp_{Y|X^{K}}\in\mathscr{W}_{K}, we need

Rlim infNminpY|XK𝒲Kmin𝒜𝒦1N|𝒜|I(M𝒜;𝐘|𝐒d,V).R\leq\liminf_{N\to\infty}\min_{p_{Y|X^{K}}\in\mathscr{W}_{K}}\min_{{\cal A}\subseteq{\cal K}}\;\frac{1}{N|{\cal A}|}I(M_{{\cal A}};{\mathbf{Y}}|{\mathbf{S}}^{d},V). (D.3)

Step 2. Define the i.i.d. random variables

Wi={V,Sj,ji}𝒱N×𝒮N1,1iN.W_{i}=\{V,\;S_{j},j\neq i\}\in{\cal V}_{N}\times{\cal S}^{N-1},\quad 1\leq i\leq N. (D.4)

Also define the random variables

Vki\displaystyle V_{ki} =\displaystyle= (Mk,V,Si+1N),\displaystyle(M_{k},V,S_{i+1}^{N}),
Uki\displaystyle U_{ki} =\displaystyle= (Vki,(YSd)i1)=(Mk,V,Si+1N,(YSd)i1),1kK, 1iN\displaystyle(V_{ki},(YS^{d})^{i-1})=(M_{k},V,S_{i+1}^{N},(YS^{d})^{i-1}),\quad 1\leq k\leq K,\;1\leq i\leq N (D.5)

where Si+1N(Si+1,,SN)S_{i+1}^{N}\triangleq(S_{i+1},\cdots,S_{N}) and (YSd)i1(Y1,S1d,,Yi1,Si1d)(YS^{d})^{i-1}\triangleq(Y_{1},S_{1}^{d},\cdots,Y_{i-1},S_{i-1}^{d}). Hence

Vi1K=(ViK,Si),V1K=U1K,VNK=(MK,V).V_{i-1}^{K}=(V_{i}^{K},S_{i}),\quad V_{1}^{K}=U_{1}^{K},\quad V_{N}^{K}=(M^{K},V). (D.6)

The following properties hold for each 1iN1\leq i\leq N:

  • By (D.1) and (D.5), (Si,Wi,UiK)=(MK,V,𝐒,Yi1)XiKYi(S_{i},W_{i},U_{i}^{K})=(M^{K},V,{\mathbf{S}},Y^{i-1})\to X_{i}^{K}\to Y_{i} forms a Markov chain.

  • The random variables Xki, 1kKX_{ki},\,1\leq k\leq K, are conditionally i.i.d. given (𝐒,V)=(Si,Wi)({\mathbf{S}},V)=(S_{i},W_{i}).

  • Due to the term Yi1Y^{i-1} in (D.5), the random variables Uki, 1kKU_{ki},\,1\leq k\leq K, are conditionally dependent given (𝐒,V)=(Si,Wi)({\mathbf{S}},V)=(S_{i},W_{i}).

The joint p.m.f. of (Si,Wi,XiK,UiK,Yi)(S_{i},W_{i},X_{i}^{K},U_{i}^{K},Y_{i}) may thus be written as

pSipWi(1kKpXki|SiWi)pUiK|XiKSiWipY|X𝒦,1iN.p_{S_{i}}p_{W_{i}}\left(\prod_{1\leq k\leq K}p_{X_{ki}|S_{i}W_{i}}\right)\,p_{U_{i}^{K}|X_{i}^{K}S_{i}W_{i}}\;p_{Y|X_{{\cal K}}},\quad 1\leq i\leq N. (D.7)

Step 3. Consider a time-sharing random variable TT that is uniformly distributed over {1,,N}\{1,\cdots,N\} and independent of the other random variables, and define the tuple of random variables (S,Sd,W,UK,XK,Y)(S,S^{d},W,U^{K},X^{K},Y) as (ST,STd,WT,UTK,XTK,YT)(S_{T},S_{T}^{d},W_{T},U_{T}^{K},X_{T}^{K},Y_{T}). Also let W=(WT,T)W=(W_{T},T) and Uk=(Uk,T,T)U_{k}=(U_{k,T},T), 1kK1\leq k\leq K, which are defined over alphabets of respective cardinalities

Lw(N)=N|𝒱N||𝒮|N1L_{w}(N)=N\,|{\cal V}_{N}|\,|{\cal S}|^{N-1}

and

Lu(N)=N|𝒱N| 2N[R+logmax(|𝒮|,|𝒴||𝒮d|)].L_{u}(N)=N\,|{\cal V}_{N}|\,2^{N\left[R+\log\max(|{\cal S}|,|{\cal Y}|\,|{\cal S}^{d}|)\right]}.

Since (Si,Wi,UiK)XiKYi(S_{i},W_{i},U_{i}^{K})\to X_{i}^{K}\to Y_{i} forms a Markov chain, so does (S,W,UK)XKY(S,W,U^{K})\to X^{K}\to Y. From (D.7), the joint p.m.f. of (S,W,UK,XK,Y)(S,W,U^{K},X^{K},Y) takes the form

pSpW(1kKpXk|SW)pUK|XKSWpY|X𝒦.p_{S}p_{W}\left(\prod_{1\leq k\leq K}p_{X_{k}|SW}\right)\,p_{U^{K}|X^{K}SW}\;p_{Y|X_{{\cal K}}}. (D.8)

In (6.1) we have defined the set

𝒫XKUKW|Souter(pS,Lw,Lu,D1)={pXKUKW|S=pW(k=1KpXk|SW)pUK|XKSW\displaystyle\mathscr{P}_{X^{K}U^{K}W|S}^{outer}(p_{S},L_{w},L_{u},D_{1})=\left\{p_{X^{K}U^{K}W|S}=p_{W}\,\left(\prod_{k=1}^{K}p_{X_{k}|SW}\right)\,p_{U^{K}|X^{K}SW}\right. (D.9)
:pX1|SW==pXK|SW,and𝔼d(S,X1)D1}\displaystyle\hskip 130.08621pt\left.~:~p_{X_{1}|SW}=\cdots=p_{X_{K}|SW},\;\;\mathrm{and}\;\;{\mathbb{E}}d(S,X_{1})\leq D_{1}\right\}

where |𝒲|=Lw|{\cal W}|=L_{w} and |𝒰|=Lu|{\cal U}|=L_{u}. Observe that pXKUKW|Sp_{X^{K}U^{K}W|S} defined in (D.8) belongs to 𝒫XKUKW|Souter(pS,Lw,Lu,D1)\mathscr{P}_{X^{K}U^{K}W|S}^{outer}(p_{S},L_{w},L_{u},D_{1}).

Define the collection of KK indices 𝒦={1,2,,K}{\cal K}=\{1,2,\cdots,K\} and the following functionals indexed by 𝒜𝒦{\cal A}\subseteq{\cal K}:

JLw,Lu,𝒜(pS,pXKUKW|S,pY|XK)\displaystyle J_{L_{w},L_{u},{\cal A}}(p_{S},p_{X^{K}U^{K}W|S},p_{Y|X^{K}}) =\displaystyle= 1|𝒜|[I(U𝒜;YSd|U𝒦𝒜)I(U𝒜;S|U𝒦𝒜)].\displaystyle\frac{1}{|{\cal A}|}[I(U_{{\cal A}};YS^{d}|U_{{\cal K}\setminus{\cal A}})-I(U_{{\cal A}};S|U_{{\cal K}\setminus{\cal A}})]. (D.10)

Step 4. We have

I(M𝒦;𝐘|𝐒d,V)\displaystyle I(M_{\cal K};{\mathbf{Y}}|{\mathbf{S}}^{d},V) =(a)\displaystyle\stackrel{{\scriptstyle(a)}}{{=}} I(M𝒦;𝐘|𝐒d,V)I(M𝒦,V;𝐒|𝐒d)\displaystyle I(M_{\cal K};{\mathbf{Y}}|{\mathbf{S}}^{d},V)-I(M_{\cal K},V;{\mathbf{S}}|{\mathbf{S}}^{d}) (D.11)
=\displaystyle= I(M𝒦,V;𝐘|𝐒d)I(V;𝐘|𝐒d)I(M𝒦,V;𝐒|𝐒d)\displaystyle I(M_{\cal K},V;{\mathbf{Y}}|{\mathbf{S}}^{d})-I(V;{\mathbf{Y}}|{\mathbf{S}}^{d})-I(M_{\cal K},V;{\mathbf{S}}|{\mathbf{S}}^{d})
\displaystyle\leq I(M𝒦,V;𝐘|𝐒d)I(M𝒦,V;𝐒|𝐒d)\displaystyle I(M_{\cal K},V;{\mathbf{Y}}|{\mathbf{S}}^{d})-I(M_{\cal K},V;{\mathbf{S}}|{\mathbf{S}}^{d})
=(b)\displaystyle\stackrel{{\scriptstyle(b)}}{{=}} I(M𝒦,V;𝐘𝐒d)I(M𝒦,V;𝐒)\displaystyle I(M_{\cal K},V;{\mathbf{Y}}{\mathbf{S}}^{d})-I(M_{\cal K},V;{\mathbf{S}})
(c)\displaystyle\stackrel{{\scriptstyle(c)}}{{\leq}} i=1N[I(U𝒦,i;YiSid)I(U𝒦,i;Si)]\displaystyle\sum_{i=1}^{N}[I(U_{{\cal K},i};Y_{i}S_{i}^{d})-I(U_{{\cal K},i};S_{i})]
=\displaystyle= I(U𝒦,T;YSd|T)I(U𝒦,T;S|T)\displaystyle I(U_{{\cal K},T};YS^{d}|T)-I(U_{{\cal K},T};S|T)
=\displaystyle= I(U𝒦,T,T;YSd)I(T;YSd)I(U𝒦,T,T;S)+I(T;S)\displaystyle I(U_{{\cal K},T},T;YS^{d})-I(T;YS^{d})-I(U_{{\cal K},T},T;S)+I(T;S)
(d)\displaystyle\stackrel{{\scriptstyle(d)}}{{\leq}} I(U𝒦,T,T;YSd)I(U𝒦,T,T;S)\displaystyle I(U_{{\cal K},T},T;YS^{d})-I(U_{{\cal K},T},T;S)
=(e)\displaystyle\stackrel{{\scriptstyle(e)}}{{=}} I(U𝒦;YSd)I(U𝒦;S)\displaystyle I(U_{{\cal K}};YS^{d})-I(U_{{\cal K}};S)
=\displaystyle= KJLw(N),Lu(N),𝒦(pS,pXKUKW|S,pY|XK),\displaystyle K\,J_{L_{w}(N),L_{u}(N),{\cal K}}(p_{S},p_{X^{K}U^{K}W|S},p_{Y|X^{K}}),

where (a) holds because MK,V,𝐒M^{K},V,{\mathbf{S}} are mutually independent (so the subtracted term is zero), (b) follows from the chain rule for mutual information, (c) from [20, Lemma 4], using ViKV_{i}^{K} and UiKU_{i}^{K} in place of ViV_{i} and UiU_{i}, respectively, (d) holds because I(T;S)=0I(T;S)=0, and (e) follows from the definition of U𝒦U_{{\cal K}}.

For all 𝒜𝒦{\cal A}\subset{\cal K}, we have

I(M𝒜;𝐘|𝐒d,V)\displaystyle I(M_{\cal A};{\mathbf{Y}}|{\mathbf{S}}^{d},V) =\displaystyle= I(M𝒜,V;𝐘|𝐒d,V)\displaystyle I(M_{\cal A},V;{\mathbf{Y}}|{\mathbf{S}}^{d},V)
=(a)\displaystyle\stackrel{{\scriptstyle(a)}}{{=}} I(M𝒜,V;𝐘|𝐒d,V)I(M𝒜,V;𝐒|𝐒d,M𝒦𝒜,V)\displaystyle I(M_{\cal A},V;{\mathbf{Y}}|{\mathbf{S}}^{d},V)-I(M_{\cal A},V;{\mathbf{S}}|{\mathbf{S}}^{d},M_{{\cal K}\setminus{\cal A}},V)
=(b)\displaystyle\stackrel{{\scriptstyle(b)}}{{=}} I(M𝒜,V;𝐘|𝐒d,M𝒦𝒜,V)I(M𝒜,V;𝐒|𝐒d,M𝒦𝒜,V)\displaystyle I(M_{\cal A},V;{\mathbf{Y}}|{\mathbf{S}}^{d},M_{{\cal K}\setminus{\cal A}},V)-I(M_{\cal A},V;{\mathbf{S}}|{\mathbf{S}}^{d},M_{{\cal K}\setminus{\cal A}},V)
=\displaystyle= I(M𝒜,V;𝐘𝐒d|M𝒦𝒜,V)I(M𝒜,V;𝐒|𝐒d,M𝒦𝒜,V)\displaystyle I(M_{\cal A},V;{\mathbf{Y}}{\mathbf{S}}^{d}|M_{{\cal K}\setminus{\cal A}},V)-I(M_{\cal A},V;{\mathbf{S}}|{\mathbf{S}}^{d},M_{{\cal K}\setminus{\cal A}},V)
=(c)\displaystyle\stackrel{{\scriptstyle(c)}}{{=}} i=1N[I(U𝒜,i;YiSid|U𝒦𝒜,i)I(U𝒜,i;Si|U𝒦𝒜,i)]\displaystyle\sum_{i=1}^{N}[I(U_{{\cal A},i};Y_{i}S_{i}^{d}|U_{{\cal K}\setminus{\cal A},i})-I(U_{{\cal A},i};S_{i}|U_{{\cal K}\setminus{\cal A},i})]
=\displaystyle= N[I(U𝒜,T;YSd|U𝒦𝒜,T,T)I(U𝒜,T;S|U𝒦𝒜,T,T)]\displaystyle N\,[I(U_{{\cal A},T};YS^{d}|U_{{\cal K}\setminus{\cal A},T},T)-I(U_{{\cal A},T};S|U_{{\cal K}\setminus{\cal A},T},T)]
=\displaystyle= N[I(U𝒜,T,T;YSd|U𝒦𝒜,T,T)I(U𝒜,T,T;S|U𝒦𝒜,T,T)]\displaystyle N\,[I(U_{{\cal A},T},T;YS^{d}|U_{{\cal K}\setminus{\cal A},T},T)-I(U_{{\cal A},T},T;S|U_{{\cal K}\setminus{\cal A},T},T)]
=(d)\displaystyle\stackrel{{\scriptstyle(d)}}{{=}} N[I(U𝒜;YSd|U𝒦𝒜)I(U𝒜;S|U𝒦𝒜)]\displaystyle N\,[I(U_{{\cal A}};YS^{d}|U_{{\cal K}\setminus{\cal A}})-I(U_{{\cal A}};S|U_{{\cal K}\setminus{\cal A}})]
=\displaystyle= N|𝒜|JLw(N),Lu(N),𝒜(pS,pXKUKW|S,pY|XK).\displaystyle N\,|{\cal A}|\,J_{L_{w}(N),L_{u}(N),{\cal A}}(p_{S},p_{X^{K}U^{K}W|S},p_{Y|X^{K}}). (D.13)

where (a) and (b) hold because MKM^{K}, 𝐒{\mathbf{S}}, and VV are mutually independent, equality (c) is proved at the end of this appendix, and (d) follows from the definition of U𝒦U_{\cal K}.

Combining (D.3), (D.11), and (D.13), we obtain

R\displaystyle R \displaystyle\leq lim infNminpY|XK𝒲Kmin𝒜𝒦JLw(N),Lu(N),𝒜(pS,pXKUKW|S,pY|XK)\displaystyle\liminf_{N\to\infty}\min_{p_{Y|X^{K}}\in\mathscr{W}_{K}}\min_{{\cal A}\subseteq{\cal K}}J_{L_{w}(N),L_{u}(N),{\cal A}}(p_{S},p_{X^{K}U^{K}W|S},p_{Y|X^{K}}) (D.14)
(a)\displaystyle\stackrel{{\scriptstyle(a)}}{{\leq}} supLw,LuminpY|XK𝒲Kmin𝒜𝒦JLw,Lu,𝒜(pS,pXKUKW|S,pY|XK)\displaystyle\sup_{L_{w},L_{u}}\min_{p_{Y|X^{K}}\in\mathscr{W}_{K}}\min_{{\cal A}\subseteq{\cal K}}J_{L_{w},L_{u},{\cal A}}(p_{S},p_{X^{K}U^{K}W|S},p_{Y|X^{K}})
\displaystyle\leq supLw,LumaxpXKUKW|S𝒫XKUKW|S(pS,Lw,Lu,D1)minpY|XK𝒲Kmin𝒜𝒦JLw,Lu,𝒜(pS,pXKUKW|S,pY|XK)\displaystyle\sup_{L_{w},L_{u}}\;\max_{p_{X^{K}U^{K}W|S}\in\mathscr{P}_{X^{K}U^{K}W|S}(p_{S},L_{w},L_{u},D_{1})}\;\min_{p_{Y|X^{K}}\in\mathscr{W}_{K}}\min_{{\cal A}\subseteq{\cal K}}J_{L_{w},L_{u},{\cal A}}(p_{S},p_{X^{K}U^{K}W|S},p_{Y|X^{K}})
=(b)\displaystyle\stackrel{{\scriptstyle(b)}}{{=}} supLw,LuC¯Lw,Luall(D1,𝒲K)\displaystyle\sup_{L_{w},L_{u}}\;\overline{C}_{L_{w},L_{u}}^{all}(D_{1},\mathscr{W}_{K})
=\displaystyle= limLw,LuC¯Lw,Luall(D1,𝒲K)\displaystyle\lim_{L_{w},L_{u}\to\infty}\overline{C}_{L_{w},L_{u}}^{all}(D_{1},\mathscr{W}_{K})
=(c)\displaystyle\stackrel{{\scriptstyle(c)}}{{=}} C¯all(D1,𝒲K),\displaystyle\overline{C}^{all}(D_{1},\mathscr{W}_{K}),

where (a) holds because the functionals JLw,Lu,𝒜()J_{L_{w},L_{u},{\cal A}}(\cdot) are nondecreasing in Lw,LuL_{w},L_{u}, (b) uses the definition of C¯Lw,Luall\overline{C}_{L_{w},L_{u}}^{all} in (6.2), and (c) the fact that the sequence {C¯Lw,Luall}\{\overline{C}_{L_{w},L_{u}}^{all}\} is nondecreasing.

Proof of equality (c) in (D.13). Recall the definitions of V𝒦,i=(M𝒦,V,Si+1N)V_{{\cal K},i}=(M_{{\cal K}},V,S_{i+1}^{N}) and U𝒦,i=(V𝒦,i,(YSd)i1)U_{{\cal K},i}=(V_{{\cal K},i},(YS^{d})^{i-1}) in (D.5) and the recursion (D.6) for V𝒦,iV_{{\cal K},i}. We prove the following identity:

I(U𝒜,i;YiSid|U𝒦𝒜,i)I(U𝒜,i;Si|U𝒦𝒜,i)\displaystyle I(U_{{\cal A},i};Y_{i}S_{i}^{d}|U_{{\cal K}\setminus{\cal A},i})-I(U_{{\cal A},i};S_{i}|U_{{\cal K}\setminus{\cal A},i}) (D.15)
=\displaystyle= [I(V𝒜,i;(YSd)i|V𝒦𝒜,i)I(V𝒜,i;Si|V𝒦𝒜,i)]\displaystyle[I(V_{{\cal A},i};(YS^{d})^{i}|V_{{\cal K}\setminus{\cal A},i})-I(V_{{\cal A},i};S^{i}|V_{{\cal K}\setminus{\cal A},i})]
[I(V𝒜,i1;(YSd)i1|V𝒦𝒜,i1)I(V𝒜,i1;Si1|V𝒦𝒜,i1)].\displaystyle-[I(V_{{\cal A},i-1};(YS^{d})^{i-1}|V_{{\cal K}\setminus{\cal A},i-1})-I(V_{{\cal A},i-1};S^{i-1}|V_{{\cal K}\setminus{\cal A},i-1})].

Then summing both sides of this equality from i=2i=2 to NN, cancelling terms, and using the properties Vk,1=Uk,1V_{k,1}=U_{k,1} and Vk,N=(Mk,V)V_{k,N}=(M_{k},V) yields equality (c) in (D.13).
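In our reading, the telescoping can be made explicit as follows, with f_{\cal A} introduced only for this remark:

```latex
% Local notation for this remark only.
f_{\mathcal{A}}(i) \;\triangleq\;
   I(V_{\mathcal{A},i};(YS^{d})^{i}\,|\,V_{\mathcal{K}\setminus\mathcal{A},i})
 - I(V_{\mathcal{A},i};S^{i}\,|\,V_{\mathcal{K}\setminus\mathcal{A},i}),
 \qquad 1 \leq i \leq N .
% (D.15) states that the i-th per-letter difference equals
% f_A(i) - f_A(i-1) for 2 <= i <= N, while V_{k,1} = U_{k,1} makes
% f_A(1) equal to the i = 1 term. Hence the sum collapses:
\sum_{i=1}^{N}\Big[
   I(U_{\mathcal{A},i};Y_{i}S_{i}^{d}\,|\,U_{\mathcal{K}\setminus\mathcal{A},i})
 - I(U_{\mathcal{A},i};S_{i}\,|\,U_{\mathcal{K}\setminus\mathcal{A},i})\Big]
 \;=\; f_{\mathcal{A}}(N) .
```

Since V_{k,N}=(M_k,V), the quantity f_{\cal A}(N) coincides with the N-letter difference appearing in the derivation above, up to terms that vanish by the mutual independence of M^K, V, and 𝐒.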

The first of the six terms in (D.15) may be expanded as follows:

I(U𝒜,i;YiSid|U𝒦𝒜,i)\displaystyle I(U_{{\cal A},i};Y_{i}S_{i}^{d}|U_{{\cal K}\setminus{\cal A},i}) =\displaystyle= I(V𝒜,i,(YSd)i1;YiSid|V𝒦𝒜,i,(YSd)i1)\displaystyle I(V_{{\cal A},i},(YS^{d})^{i-1};Y_{i}S_{i}^{d}|V_{{\cal K}\setminus{\cal A},i},(YS^{d})^{i-1}) (D.16)
=\displaystyle= I(V𝒜,i;YiSid|V𝒦𝒜,i,(YSd)i1)\displaystyle I(V_{{\cal A},i};Y_{i}S_{i}^{d}|V_{{\cal K}\setminus{\cal A},i},(YS^{d})^{i-1})
=\displaystyle= I(V𝒜,i,(YSd)i1;YiSid|V𝒦𝒜,i)I((YSd)i1;YiSid|V𝒦𝒜,i)\displaystyle I(V_{{\cal A},i},(YS^{d})^{i-1};Y_{i}S_{i}^{d}|V_{{\cal K}\setminus{\cal A},i})-I((YS^{d})^{i-1};Y_{i}S_{i}^{d}|V_{{\cal K}\setminus{\cal A},i})
=\displaystyle= I(U𝒜,i;YiSid|V𝒦𝒜,i)I((YSd)i1;YiSid|V𝒦𝒜,i).\displaystyle I(U_{{\cal A},i};Y_{i}S_{i}^{d}|V_{{\cal K}\setminus{\cal A},i})-I((YS^{d})^{i-1};Y_{i}S_{i}^{d}|V_{{\cal K}\setminus{\cal A},i}).

Similarly for the second term, replacing (YSd)(YS^{d}) with SS in the above derivation, we obtain

I(U𝒜,i;Si|U𝒦𝒜,i)=I(U𝒜,i;Si|V𝒦𝒜,i)I((YSd)i1;Si|V𝒦𝒜,i).I(U_{{\cal A},i};S_{i}|U_{{\cal K}\setminus{\cal A},i})=I(U_{{\cal A},i};S_{i}|V_{{\cal K}\setminus{\cal A},i})-I((YS^{d})^{i-1};S_{i}|V_{{\cal K}\setminus{\cal A},i}). (D.17)

The six terms in (D.15) can be expanded using the chain rule for mutual information, in the same way as in [20, Lemma 4.2]:

I(V𝒜,i;(YSd)i|V𝒦𝒜,i)\displaystyle I(V_{{\cal A},i};(YS^{d})^{i}|V_{{\cal K}\setminus{\cal A},i}) =\displaystyle= I(V𝒜,i;(YSd)i1|V𝒦𝒜,i)+I(V𝒜,i;(YSd)i|V𝒦𝒜,i)\displaystyle I(V_{{\cal A},i};(YS^{d})^{i-1}|V_{{\cal K}\setminus{\cal A},i})+I(V_{{\cal A},i};(YS^{d})_{i}|V_{{\cal K}\setminus{\cal A},i}) (D.18)
I(V𝒜,i;Si|V𝒦𝒜,i)\displaystyle I(V_{{\cal A},i};S^{i}|V_{{\cal K}\setminus{\cal A},i}) =\displaystyle= I(V𝒜,i;Si1|V𝒦𝒜,i)+I(V𝒜,i;Si|V𝒦𝒜,i)\displaystyle I(V_{{\cal A},i};S^{i-1}|V_{{\cal K}\setminus{\cal A},i})+I(V_{{\cal A},i};S_{i}|V_{{\cal K}\setminus{\cal A},i}) (D.19)
I(V𝒜,i1;Si1|V𝒦𝒜,i1)\displaystyle I(V_{{\cal A},i-1};S^{i-1}|V_{{\cal K}\setminus{\cal A},i-1}) =\displaystyle= I(V𝒜,i;Si1|Si,V𝒦𝒜,i1)\displaystyle I(V_{{\cal A},i};S^{i-1}|S_{i},V_{{\cal K}\setminus{\cal A},i-1}) (D.20)
I(V𝒜,i1;(YSd)i1|V𝒦𝒜,i1)\displaystyle I(V_{{\cal A},i-1};(YS^{d})^{i-1}|V_{{\cal K}\setminus{\cal A},i-1}) =\displaystyle= I(V𝒜,i;(YSd)i1|Si,V𝒦𝒜,i1)\displaystyle I(V_{{\cal A},i};(YS^{d})^{i-1}|S_{i},V_{{\cal K}\setminus{\cal A},i-1}) (D.21)
I(U𝒜,i;Si|V𝒦𝒜,i)\displaystyle I(U_{{\cal A},i};S_{i}|V_{{\cal K}\setminus{\cal A},i}) =\displaystyle= I((YSd)i1;Si|V𝒦𝒜,i)+I(V𝒜,i;Si|(YSd)i1,V𝒦𝒜,i)\displaystyle I((YS^{d})^{i-1};S_{i}|V_{{\cal K}\setminus{\cal A},i})+I(V_{{\cal A},i};S_{i}|(YS^{d})^{i-1},V_{{\cal K}\setminus{\cal A},i}) (D.22)
I(U𝒜,i;(YSd)i|V𝒦𝒜,i)\displaystyle I(U_{{\cal A},i};(YS^{d})_{i}|V_{{\cal K}\setminus{\cal A},i}) =\displaystyle= I((YSd)i1;(YSd)i|V𝒦𝒜,i)+I(V𝒜,i;(YSd)i|(YSd)i1,V𝒦𝒜,i).\displaystyle I((YS^{d})^{i-1};(YS^{d})_{i}|V_{{\cal K}\setminus{\cal A},i})+I(V_{{\cal A},i};(YS^{d})_{i}|(YS^{d})^{i-1},V_{{\cal K}\setminus{\cal A},i}). (D.23)

Moreover, expanding the conditional mutual information I(V𝒜,i;Si,(YSd)i1|V𝒦𝒜,i)I(V_{{\cal A},i};S_{i},(YS^{d})^{i-1}|V_{{\cal K}\setminus{\cal A},i}) in two different ways, we obtain

I(V𝒜,i;(YSd)i1|V𝒦𝒜,i)+I(V𝒜,i;Si|(YSd)i1,V𝒦𝒜,i)\displaystyle I(V_{{\cal A},i};(YS^{d})^{i-1}|V_{{\cal K}\setminus{\cal A},i})+I(V_{{\cal A},i};S_{i}|(YS^{d})^{i-1},V_{{\cal K}\setminus{\cal A},i}) (D.24)
=\displaystyle= I(V𝒜,i;Si1|V𝒦𝒜,i)+I(V𝒜,i;(YSd)i1|Si,V𝒦𝒜,i).\displaystyle I(V_{{\cal A},i};S^{i-1}|V_{{\cal K}\setminus{\cal A},i})+I(V_{{\cal A},i};(YS^{d})^{i-1}|S_{i},V_{{\cal K}\setminus{\cal A},i}).

Subtracting the sum of (D.17), (D.18), (D.20), (D.22), (D.24) from the sum of (D.16), (D.19), (D.21), (D.23), and cancelling terms, we obtain (D.15), from which the claim follows. \Box

References

  • [1] D. Boneh and J. Shaw, “Collusion–Secure Fingerprinting for Digital Data,” in Advances in Cryptology: Proc. CRYPTO’95, Springer–Verlag, New York, 1995.
  • [2] I. J. Cox, J. Kilian, F. T. Leighton and T. Shamoon, “Secure Spread Spectrum Watermarking for Multimedia,” IEEE Trans. Image Proc., Vol. 6, No. 12, pp. 1673—1687, Dec. 1997.
  • [3] M. Wu, W. Trappe, Z. J. Wang and K. J. R. Liu, “Collusion-Resistant Fingerprinting for Multimedia,” IEEE Signal Processing Magazine, Vol. 21, No. 2, pp. 15—27, March 2004.
  • [4] K. J. R. Liu, W. Trappe, Z. J. Wang, M. Wu and H. Zhao, Multimedia Fingerprinting Forensics for Traitor Tracing, EURASIP Book Series on Signal Processing, 2006.
  • [5] P. Moulin and A. Briassouli, “The Gaussian Fingerprinting Game,” Proc. Conf. Information Sciences and Systems, Princeton, NJ, March 2002.
  • [6] P. Moulin and J. A. O’Sullivan, “Information-theoretic analysis of information hiding,” IEEE Trans. on Information Theory, Vol. 49, No. 3, pp. 563—593, March 2003.
  • [7] A. Somekh-Baruch and N. Merhav, “On the capacity game of private fingerprinting systems under collusion attacks,” IEEE Trans. Information Theory, vol. 51, no. 3, pp. 884—899, Mar. 2005.
  • [8] A. Somekh-Baruch and N. Merhav, “Achievable error exponents for the private fingerprinting game,” IEEE Trans. Information Theory, Vol. 53, No. 5, pp. 1827—1838, May 2007.
  • [9] N. P. Anthapadmanabhan, A. Barg and I. Dumer, “On the Fingerprinting Capacity Under the Marking Assumption,” submitted to IEEE Trans. Information Theory, arXiv:cs/0612073v2, July 2007.
  • [10] P. Moulin, “Universal Fingerprinting: Capacity and Random-Coding Exponents,” submitted to IEEE Trans. Information Theory. Available from arxiv:0801.3837v1 [cs.IT] 24 Jan 2008.
  • [11] P. Moulin and Y. Wang, “Capacity and Random-Coding Exponents for Channel Coding with Side Information,” IEEE Trans. on Information Theory, Vol. 53, No. 4, pp. 1326—1347, Apr. 2007.
  • [12] Y.-S. Liu and B. L. Hughes, “A new universal random coding bound for the multiple-access channel,” IEEE Trans. Information Theory, vol. 42, no. 2, pp. 376—386, Mar. 1996.
  • [13] A. Somekh-Baruch and N. Merhav, “On the Random Coding Error Exponents of the Single-User and the Multiple-Access Gel’fand-Pinsker Channels,” Proc. IEEE Int. Symp. Info. Theory, p. 448, Chicago, IL, June-July 2004.
  • [14] Y. Wang and P. Moulin, “Capacity and Random-Coding Error Exponent for Public Fingerprinting Game,” Proc. Int. Symp. on Information Theory, Seattle, WA, July 2006.
  • [15] Y. Wang, Detection- and Information-Theoretic Analysis of Steganography and Fingerprinting, Ph. D. Thesis, ECE Department, University of Illinois at Urbana-Champaign, Dec. 2006.
  • [16] G. Tardos, “Optimal Probabilistic Fingerprinting Codes,” STOC, 2003.
  • [17] R. Ahlswede, “Multiway Communication Channels,” Proc. ISIT, pp. 23—52, Tsahkadsor, Armenia, 1971.
  • [18] H. Liao, “Multiple Access Channels,” Ph. D. dissertation, EE Department, U. of Hawaii, 1972.
  • [19] A. Lapidoth and P. Narayan, “Reliable Communication Under Channel Uncertainty,” IEEE Trans. Information Theory, Vol. 44, No. 6, pp. 2148—2177, Oct. 1998.
  • [20] S. I. Gel’fand and M. S. Pinsker, “Coding for Channel with Random Parameters,” Problems of Control and Information Theory, Vol. 9, No. 1, pp. 19—31, 1980.
  • [21] I. Csiszár and J. Körner, Information Theory: Coding Theorems for Discrete Memoryless Systems, Academic Press, NY, 1981.
  • [22] A. Das and P. Narayan, “Capacities of Time-Varying Multiple-Access Channels With Side Information,” IEEE Trans. Information Theory, Vol. 48, No. 1, pp. 4—25, Jan. 2002.
  • [23] S. Sigurjónsson and Y.-H. Kim, “On Multiple User Channels with State Information at the Transmitters,” Proc. ISIT 2005.
  • [24] N. T. Gaarder and J. K. Wolf, “The Capacity Region of a Multiple-Access Discrete Memoryless Channel Can Increase with Feedback,” IEEE Trans. Information Theory, Vol. 21, No. 1, pp. 100—102, Jan. 1975.