
The Rate Loss of Single-Letter Characterization:
The “Dirty” Multiple Access Channel

Tal Philosof and Ram Zamir (This research was partially supported by BSF grant No. 2004398.)
Dept. of Electrical Engineering - Systems, Tel-Aviv University
Tel-Aviv 69978, ISRAEL
talp,zamir@eng.tau.ac.il
Submitted to IEEE Trans. on Information Theory March 2008
Abstract

For general memoryless systems, the typical information-theoretic solution, when it exists, has a “single-letter” form. This reflects the fact that optimum performance can be approached by a random code (or a random binning scheme) generated using independent and identically distributed copies of some single-letter distribution. Does the solution of every (information-theoretic) problem take this form? In fact, some counterexamples are known. The most famous is the “two help one” problem: Korner and Marton showed that if we want to decode the modulo-two sum of two binary sources from their independent encodings, then linear coding is better than random coding. In this paper we provide another counterexample, the “doubly-dirty” multiple access channel (MAC). Like the Korner-Marton problem, this is a multi-terminal scenario where side information is distributed among several terminals: each transmitter knows part of the channel interference, but the receiver is not aware of any part of it. We give an explicit solution for the capacity region of a binary version of the doubly-dirty MAC, demonstrate how the capacity region can be approached using a linear coding scheme, and prove that the “best known single-letter region” is strictly contained in it. We also state a conjecture regarding a similar rate loss of single-letter characterization in the Gaussian case.

Index Terms:
Multi-user information theory, random binning, linear lattice binning, dirty paper coding, lattice strategies, Korner-Marton problem.

I Introduction

Consider the two-user / double-state memoryless multiple access channel (MAC) with transition and state probability distributions

$$P(y|x_1,x_2,s_1,s_2) \quad \text{and} \quad P(s_1,s_2), \tag{1}$$

respectively, where the states $S_1$ and $S_2$ are known non-causally to user 1 and user 2, respectively. A special case of (1) is the additive channel shown in Fig. 1. In this channel, called the doubly-dirty MAC (after Costa’s “writing on dirty paper” [1]), the total channel noise consists of three independent components: $S_1$ and $S_2$, the interference signals, which are known to user 1 and user 2, respectively, and $Z$, the unknown noise, which is known to neither. The channel inputs $X_1$ and $X_2$ may be subject to some average cost constraint.

Neither the capacity region of (1) nor that of the special case of Fig. 1 is known. In this paper we consider a particular binary version of the doubly-dirty MAC of Fig. 1, where all variables are in $\mathbb{Z}_2$, i.e., $\{0,1\}$, and the unknown noise $Z = 0$. The channel output of the binary doubly-dirty MAC is given by

$$Y = X_1 \oplus X_2 \oplus S_1 \oplus S_2, \tag{2}$$

where $\oplus$ denotes mod-2 addition (xor), and $S_1, S_2$ are independent $\text{Bernoulli}(1/2)$. Each codeword $\mathbf{x}_i \in \mathbb{Z}_2^n$ is a function of the message $W_i$ and the interference vector $\mathbf{s}_i \in \mathbb{Z}_2^n$, and must satisfy the input constraint $\frac{1}{n} w_H(\mathbf{x}_i) \le q_i$, $i = 1,2$, where $0 \le q_1, q_2 \le 1/2$ and $w_H(\cdot)$ denotes the Hamming weight. The coding rates $R_1$ and $R_2$ of the two users are given as usual by $R_i = \frac{1}{n}\log|\mathcal{W}_i|$, where $\mathcal{W}_i$ is the message set of user $i$ and $n$ is the codeword length.
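To make the setup concrete, here is a minimal Python simulation sketch of the channel (2) and the Hamming-weight input constraint. The block length, the constraint value, and the (trivial) choice of inputs are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, q = 1000, 0.3                    # illustrative block length and input constraint

s1 = rng.integers(0, 2, n)          # interference known non-causally to user 1 only
s2 = rng.integers(0, 2, n)          # interference known non-causally to user 2 only

x1 = np.zeros(n, dtype=int)         # placeholder inputs; any code must satisfy
x2 = np.zeros(n, dtype=int)         # the weight constraint w_H(x_i)/n <= q_i
assert x1.sum() / n <= q and x2.sum() / n <= q

y = x1 ^ x2 ^ s1 ^ s2               # channel output Y = X1 + X2 + S1 + S2 (mod 2)
print(y[:10])                       # without coding, Y is pure Bernoulli(1/2) noise
```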

Figure 1: Doubly-dirty MAC

The double-state MAC (1) generalizes the point-to-point channel with side information (SI) at the transmitter considered by Gel’fand and Pinsker [2]. They prove their direct coding theorem using the framework of random binning, which is widely used in the analysis of multi-terminal source and channel coding problems [3]. They obtain a general capacity expression which involves an auxiliary random variable $U$:

$$C = \max_{P(u,x|s)} \big\{ H(U|S) - H(U|Y) \big\}, \tag{3}$$

where the maximization is over all joint distributions of the form $p(u,s,y,x) = p(s)\,p(u,x|s)\,p(y|x,s)$.

The channel in (1) with only one informed encoder (i.e., where $S_2 = \emptyset$) was considered recently by Somekh-Baruch et al. [4] and Kotagiri and Laneman [5]. The common-message ($W_1 = W_2$) capacity of this channel is known [4], and it involves using random binning by the informed user. For the binary “one dirty user” case (i.e., (2) with $S_2 = 0$), we show that Somekh-Baruch’s common-message capacity becomes (see Appendix A)

$$C_{com} = H_b(q_1), \tag{4}$$

where $H_b(x) \triangleq -x\log_2(x) - (1-x)\log_2(1-x)$ is the binary entropy function. Clearly, the doubly-dirty individual-message case is harder. Thus, it follows from (4) that the rate sum in the setting of Fig. 1 is upper bounded by

$$R_1 + R_2 \le \min\big\{H_b(q_1), H_b(q_2)\big\}. \tag{5}$$

In Theorem 1 we show that this upper bound is in fact tight.

One approach to finding achievable rates for the doubly-dirty MAC is to extend the Gel’fand-Pinsker solution [2] to the two-user / double-state case. As shown by Jafar [6], this extension leads to the following pentagonal inner bound for the capacity region of (1):

$$\begin{aligned}
\mathcal{R}(U_1,U_2) \triangleq \Big\{(R_1,R_2):\;
& R_1 \le I(U_1;Y|U_2) - I(U_1;S_1), \\
& R_2 \le I(U_2;Y|U_1) - I(U_2;S_2), \\
& R_1 + R_2 \le I(U_1,U_2;Y) - I(U_1;S_1) - I(U_2;S_2) \Big\}
\end{aligned} \tag{6}$$

for some $P(U_1,U_2,X_1,X_2|S_1,S_2) = P(U_1,X_1|S_1)\,P(U_2,X_2|S_2)$. In fact, by a standard time-sharing argument [3], the closure of the convex hull of the set of all rate pairs $(R_1,R_2)$ satisfying (6),

$$\mathcal{R}_{BSL} \triangleq \mathrm{cl}\;\mathrm{conv}\,\Big\{(R_1,R_2) \in \mathcal{R}(U_1,U_2) : P(U_1,U_2,X_1,X_2|S_1,S_2) = P(U_1,X_1|S_1)\,P(U_2,X_2|S_2)\Big\}, \tag{7}$$

is also achievable. (As in the Gel’fand-Pinsker solution, for a finite-alphabet system it is enough to optimize over auxiliary variables $U_1$ and $U_2$ whose alphabet size is bounded in terms of the sizes of the input and state alphabets.) To the best of our knowledge, the set $\mathcal{R}_{BSL}$ is the best currently known single-letter characterization of the rate region of the MAC with side information at the transmitters (1), and in particular of the doubly-dirty MAC (2). (For the case where the users also have a common message $W_0$ to be transmitted jointly by both encoders, (7) can be improved by adding another auxiliary random variable $U_0$, which plays the role of the common auxiliary r.v. in Marton’s inner bound for the non-degraded broadcast channel [7]. In this case, the joint distribution of $(U_0,U_1,U_2)$ is given by $P(U_0,U_1,U_2) = P(U_0)\,P(U_1|U_0)\,P(U_2|U_0)$, i.e., $U_1$ and $U_2$ are conditionally independent given $U_0$.) The achievability of (7) can be proved, as usual, by an i.i.d. random binning scheme [6].

A different method to cancel known interference is by “linear strategies”, i.e., binning based on the cosets of a linear code [8, 9, 10]. In the sequel, we show that the outer bound (5) can indeed be achieved by a linear coding scheme. Hence, the set of rate pairs $(R_1,R_2)$ satisfying (5) is the capacity region of the binary doubly-dirty MAC. In contrast, we show that the single-letter region (7) is strictly contained in this capacity region. Hence, a random binning scheme based on this extension of the Gel’fand-Pinsker solution [2] is not optimal for this problem.

A similar observation was made by Korner and Marton [11] for the “two help one” source coding problem. For a specific binary version known as the “modulo-two sum” problem, they showed that the minimum possible rate sum is achieved by a linear coding scheme, while the best known single-letter expression for this problem is strictly higher. See the discussion in [11, Section IV] and at the end of Section III.

Although the “single-letter characterization” is a fundamental concept in information theory, it has not been generally defined [12, p. 35]. Csiszar and Korner [13, p. 259] suggested defining it through the notion of computability, i.e., a problem has a single-letter solution if there exists an algorithm which can decide whether a point belongs to an $\varepsilon$-neighborhood of the achievable rate region with polynomial complexity in $1/\varepsilon$. Since we are not aware of any other computable solution to our problem, we shall refer to (7) as the “best known single-letter characterization”.

An extension of these observations to continuous channels would be of interest. Costa [1] considered the single-user case of the dirty channel problem $Y = X + S + Z$, where the interference $S$ and the noise $Z$ are assumed to be i.i.d. Gaussian with variances $Q$ and $N$, respectively, and the input $X$ is subject to a power constraint $P$. He showed that in this case, the transmitter side-information capacity (3) coincides with the zero-interference capacity $\frac{1}{2}\log_2(1+\mathrm{SNR})$, where $\mathrm{SNR} = P/N$. Selecting the auxiliary random variable $U$ in (3) such that

$$U = X + \alpha S, \tag{8}$$

where $X$ and $S$ are independent, and taking $\alpha = \frac{P}{P+N}$, the formula (3) and its associated random binning scheme are capacity achieving. The continuous (Gaussian) version of the doubly-dirty MAC of Fig. 1 was considered in [10]. It was shown that by using a linear structure, i.e., lattice strategies [8], the full capacity region is achieved in the limit of high SNR and high lattice dimension. In contrast, it was shown that for $Q \to \infty$ no positive rate is achievable using the natural generalization of Costa’s strategy (8) to the two-user case, while a (scalar) modulo-addition version of (8) loses $\approx 0.254$ bit in the sum capacity. We shall further elaborate on this issue in Section IV.

Similar observations regarding the advantage of modulo-lattice modulation over a separation-based solution were made by Nazer and Gastpar [14], in the context of computation over linear Gaussian networks, and also by Krithivasan and Pradhan [15] for multi-terminal rate-distortion problems.

The paper is organized as follows. In Section II the capacity region of the binary doubly-dirty MAC (2) is derived, and linear coding is shown to be optimal. Section III develops a closed-form expression for the best known single-letter characterization (7) for this channel, and demonstrates that it is strictly contained in the true capacity region. In Section IV we consider the Gaussian doubly-dirty MAC, and state a conjecture regarding the capacity loss of single-letter characterization in this case.

II The Capacity Region of the Binary Doubly-Dirty MAC

The following theorem characterizes the capacity region of the binary doubly-dirty MAC of Fig. 1.

Theorem 1.

The capacity region of the binary doubly-dirty MAC (2) is the set of all rate pairs $(R_1,R_2)$ satisfying

$$\mathcal{C}(q_1,q_2) \triangleq \Big\{(R_1,R_2) : R_1 + R_2 \le \min\big\{H_b(q_1), H_b(q_2)\big\}\Big\}. \tag{9}$$
Proof.

The converse part: As explained in the Introduction (5), one way to derive an upper bound on the rate sum is through the general one-dirty-user capacity formula [4], which we derive explicitly for the binary case in Appendix A. Here we prove the converse directly, similarly to the proof of the outer bound for the Gaussian case in [16, 10]. We assume that user 1 and user 2 intend to transmit a common message $W$. An upper bound on the rate of this message clearly upper bounds the sum rate $R_1 + R_2$ in the individual-message case. Thus,

$$\begin{aligned}
n(R_1+R_2) &\le H(W) \\
&= H(W|Y^n) + I(W;Y^n) \\
&\le I(W;Y^n) + n\epsilon_n && (10) \\
&= H(Y^n) - H(Y^n|W) + n\epsilon_n \\
&= H(Y^n) - H(Y^n|W,S_1^n,S_2^n) - I(S_1^n,S_2^n;Y^n|W) + n\epsilon_n \\
&= H(Y^n) - I(S_1^n,S_2^n;Y^n|W) + n\epsilon_n && (11) \\
&= H(Y^n) - H(S_1^n,S_2^n|W) + H(S_1^n,S_2^n|W,Y^n) + n\epsilon_n \\
&\le -n + H(S_1^n|W,Y^n) + H(S_2^n|W,Y^n,S_1^n) + n\epsilon_n && (12) \\
&\le H(X_1^n \oplus X_2^n \oplus S_1^n|W,Y^n,S_1^n) + n\epsilon_n && (13) \\
&= H(X_2^n|W,Y^n,S_1^n) + n\epsilon_n && (14) \\
&\le n H_b(q_2) + n\epsilon_n, && (15)
\end{aligned}$$

where (10) follows from Fano’s inequality, with $\epsilon_n \to 0$ as the error probability $P_e^{(n)}$ goes to zero for $n \to \infty$; (11) follows since $Y^n$ is fully known given $W$, $S_1^n$ and $S_2^n$; (12) follows from the chain rule for entropy, from $H(Y^n) \le n$, and from $H(S_1^n,S_2^n|W) = H(S_1^n) + H(S_2^n) = 2n$, since $W$, $S_1^n$ and $S_2^n$ are mutually independent; (13) follows since $H(S_1^n|W,Y^n) \le n$ and $Y^n = X_1^n \oplus X_2^n \oplus S_1^n \oplus S_2^n$; (14) follows since $X_1^n$ is a function of $(W,S_1^n)$; finally, (15) follows since $H(X_2^n|W,Y^n,S_1^n) \le H(X_2^n) \le n H_b(q_2)$.

In the same way we can show that $R_1 + R_2 \le H_b(q_1) + \epsilon_n$. The converse part follows since $\epsilon_n \to 0$ as $P_e^{(n)} \to 0$ for $n \to \infty$.

The direct part is based on the scheme for the point-to-point binary dirty-paper channel [9]. We define $q \triangleq \min\{q_1,q_2\}$. In view of the converse part, it is sufficient to show achievability of the point $(R_1,R_2) = (H_b(q),0)$, since the outer bound may then be achieved by time sharing with the symmetric point $(R_1,R_2) = (0,H_b(q))$. The corner point $(R_1,R_2) = (H_b(q),0)$ corresponds to the “helper problem”, i.e., user 2 tries to help user 1 to transmit at its highest rate. The encoders and decoder are described using a binary linear code $\mathcal{C}(n,k)$ with parity-check matrix $H$. Let $\mathbf{v} \in \mathbb{Z}_2^{n-k}$ be a syndrome of the code $\mathcal{C}$, where we note that each syndrome represents a different coset of the linear code $\mathcal{C}$. Let $f(\mathbf{v})$ denote the “leader” of (i.e., the minimum-weight vector in) the coset associated with the syndrome $\mathbf{v}$ [17, Chap. 6]; hence $f : \{0,1\}^{n-k} \to \{0,1\}^n$. For $\mathbf{a} \in \mathbb{Z}_2^n$, we define the $n$-dimensional modulo operation over the code $\mathcal{C}$ as

$$\mathbf{a} \bmod \mathcal{C} \triangleq f(H\mathbf{a}),$$

which is the leader of the coset to which the vector $\mathbf{a}$ belongs.

  • Encoder of user 1: Let the transmitted message $\mathbf{v}_1 \in \mathbb{Z}_2^{n-k}$ be a syndrome of $\mathcal{C}$, and let $\tilde{\mathbf{x}}_1 = f(\mathbf{v}_1)$ be its coset leader; in particular, $\mathbf{v}_1 = H\tilde{\mathbf{x}}_1$. Transmit the modulo-$\mathcal{C}$ reduction of the difference between $\tilde{\mathbf{x}}_1$ and $\mathbf{s}_1$, i.e.,

    $$\mathbf{x}_1 = (\tilde{\mathbf{x}}_1 \oplus \mathbf{s}_1) \bmod \mathcal{C} = f(\mathbf{v}_1 \oplus H\mathbf{s}_1).$$

  • Encoder of user 2 (functions as a “helper” for user 1): Transmit

    $$\mathbf{x}_2 = \mathbf{s}_2 \bmod \mathcal{C} = f(H\mathbf{s}_2).$$

  • Decoder:
    1. Reconstruct $\tilde{\mathbf{x}}_1$ by $\hat{\tilde{\mathbf{x}}}_1 = \mathbf{y} \bmod \mathcal{C}$.
    2. Reconstruct the transmitted coset of user 1 by $\hat{\mathbf{v}}_1 = H\hat{\tilde{\mathbf{x}}}_1$.
    In fact, the transmitted coset can be reconstructed directly as $\hat{\mathbf{v}}_1 = H\hat{\tilde{\mathbf{x}}}_1 = H(\mathbf{y} \bmod \mathcal{C}) = H\mathbf{y}$, where the last equality follows since $\mathbf{y} \bmod \mathcal{C}$ and $\mathbf{y}$ are in the same coset.

It follows that the decoder correctly decodes the message coset $\mathbf{v}_1$, since

$$\begin{aligned}
\hat{\mathbf{v}}_1 &= H\big(\mathbf{y} \bmod \mathcal{C}\big) \\
&= H\big([\tilde{\mathbf{x}}_1 \oplus \mathbf{s}_1 \oplus \mathbf{s}_2 \oplus \mathbf{s}_1 \oplus \mathbf{s}_2] \bmod \mathcal{C}\big) \\
&= H\tilde{\mathbf{x}}_1 \\
&= \mathbf{v}_1,
\end{aligned}$$

where the third equality follows since $\tilde{\mathbf{x}}_1$ and $\tilde{\mathbf{x}}_1 \bmod \mathcal{C}$ are in the same coset. It is left to relate the coding rate $R_1 = \frac{1}{n}\log\big(|\{0,1\}^{n-k}|\big) = 1 - k/n$ to the input constraint $q$. From [18], there exists a binary linear code with covering radius $\rho$ that satisfies $\frac{k}{n} \le 1 - H_b(\rho/n) + \epsilon$, where $\epsilon \to 0$ as $n \to \infty$. The achievability of the point $(H_b(q),0)$ follows by taking $q = \rho/n$, so that $R_1 = 1 - k/n \ge H_b(q) - \epsilon$, while $w_H(\mathbf{x}_1) = w_H(f(\mathbf{v}_1 \oplus H\mathbf{s}_1)) \le \rho$ and $w_H(\mathbf{x}_2) = w_H(f(H\mathbf{s}_2)) \le \rho$; hence

$$\frac{1}{n} E\,w_H\{\mathbf{x}_1\} = \frac{1}{n} E\,w_H\{f(\mathbf{v}_1 \oplus H\mathbf{s}_1)\} \le q, \qquad \frac{1}{n} E\,w_H\{\mathbf{x}_2\} = \frac{1}{n} E\,w_H\{f(H\mathbf{s}_2)\} \le q.$$

This completes the proof of the direct part of the theorem. ∎
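To illustrate the scheme concretely, the following Python sketch runs the encoders and decoder above with the [7,4] Hamming code standing in for $\mathcal{C}(n,k)$. This small code is an illustrative assumption; the proof requires a sequence of codes with good covering radius as $n \to \infty$. Coset leaders $f(\cdot)$ are found here by brute force.

```python
import itertools
import numpy as np

# Parity-check matrix of the [7,4] Hamming code, an illustrative stand-in
# for the code C(n,k) with good covering radius used in the proof.
H = np.array([[1, 0, 1, 0, 1, 0, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]])
n = H.shape[1]

def syndrome(a):
    return tuple(H.dot(a) % 2)

# f(v): the coset leader (minimum-weight vector) of each syndrome, by brute force.
f = {}
for bits in itertools.product([0, 1], repeat=n):
    a = np.array(bits)
    v = syndrome(a)
    if v not in f or a.sum() < f[v].sum():
        f[v] = a

def mod_C(a):                                 # a mod C = f(H a)
    return f[syndrome(a)]

rng = np.random.default_rng(1)
s1, s2 = rng.integers(0, 2, n), rng.integers(0, 2, n)
v1 = (1, 0, 1)                                # message of user 1 (a syndrome)

x1 = mod_C((f[v1] + s1) % 2)                  # user 1: (x~1 + s1) mod C = f(v1 + H s1)
x2 = mod_C(s2)                                # user 2 (helper): s2 mod C = f(H s2)
y = (x1 + x2 + s1 + s2) % 2                   # channel output
assert syndrome(y) == v1                      # decoder: v^1 = H y recovers v1
```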


As stated above, the achievability of the capacity region follows by time sharing between the corner points $(H_b(q),0)$ and $(0,H_b(q))$, where $q = \min\{q_1,q_2\}$. It is also interesting to see how to achieve the rate sum $H_b(q)$ for an arbitrary rate pair $(R_1,R_2)$ without time sharing. For that, let the message of user 1 be $\mathbf{m}_1 \in \mathbb{Z}_2^{l_1}$ and the message of user 2 be $\mathbf{m}_2 \in \mathbb{Z}_2^{l_2}$, where $l_1 + l_2 = n-k$. We define the following syndromes in $\mathcal{C}$:

$$\mathbf{v}_1 \triangleq [\mathbf{m}_1\ \underbrace{0\,0\,\ldots\,0}_{l_2}] \in \mathbb{Z}_2^{n-k}, \qquad \mathbf{v}_2 \triangleq [\underbrace{0\,0\,\ldots\,0}_{l_1}\ \mathbf{m}_2] \in \mathbb{Z}_2^{n-k}, \qquad \mathbf{v} \triangleq \mathbf{v}_1 \oplus \mathbf{v}_2.$$

Clearly, given the syndrome $\mathbf{v}$, the syndromes $\mathbf{v}_1$ and $\mathbf{v}_2$ are fully known, and hence so are the messages $\mathbf{m}_1$ and $\mathbf{m}_2$. Let $\tilde{\mathbf{x}}_i = f(\mathbf{v}_i)$ be the coset leader of $\mathbf{v}_i$ for $i=1,2$. In this case the transmission scheme is as follows:

  • Encoder of user 1: transmit $\mathbf{x}_1 = (\tilde{\mathbf{x}}_1 \oplus \mathbf{s}_1) \bmod \mathcal{C} = f(\mathbf{v}_1 \oplus H\mathbf{s}_1)$.

  • Encoder of user 2: transmit $\mathbf{x}_2 = (\tilde{\mathbf{x}}_2 \oplus \mathbf{s}_2) \bmod \mathcal{C} = f(\mathbf{v}_2 \oplus H\mathbf{s}_2)$.

  • Decoder: reconstruct $\hat{\mathbf{v}} = H(\mathbf{y} \bmod \mathcal{C})$.

Therefore, we have that

$$\hat{\mathbf{v}} = H\big(\mathbf{y} \bmod \mathcal{C}\big) = H\big(\tilde{\mathbf{x}}_1 \oplus \tilde{\mathbf{x}}_2\big) = \mathbf{v}_1 \oplus \mathbf{v}_2 = \mathbf{v}.$$

The sum capacity is achieved since $R_1 + R_2 = \frac{l_1+l_2}{n} = \frac{n-k}{n} \ge H_b(q) - \epsilon$, where $\epsilon \to 0$ as $n \to \infty$, while the input constraints are satisfied as before.

III A Single-Letter Characterization for the Capacity Region

In this section we characterize the best known single-letter region (7) for the binary doubly-dirty MAC (2), and show that it is strictly contained in the capacity region (9). For simplicity, we shall assume identical input constraints, i.e., $q_1 = q_2 = q$.

Definition 1.

For a given $q$, the best known single-letter rate region for the binary doubly-dirty MAC (2), denoted by $\mathcal{R}_{BSL}(q)$, is the set of all rate pairs $(R_1,R_2)$ satisfying (7) with the additional constraints $EX_1, EX_2 \le q$.

In the following theorem we give a closed-form expression for $\mathcal{R}_{BSL}(q)$.

Theorem 2.

The best known single-letter rate region for the binary doubly-dirty MAC (2) is a triangular region given by

$$\mathcal{R}_{BSL}(q) = \Big\{(R_1,R_2) : R_1 + R_2 \le \mathrm{u.c.e.}\big[2H_b(q) - 1\big]^+\Big\}, \tag{16}$$

where $\mathrm{u.c.e.}$ denotes the upper convex envelope with respect to $q$, and $[x]^+ \triangleq \max\{0,x\}$.

Fig. 2 shows the sum capacity of the binary doubly-dirty MAC (9) versus the best known single-letter rate sum (16) for equal input constraints. The latter is strictly contained in the capacity region, which is achieved by a linear code. The quantity $[2H_b(q)-1]^+$ is not a convex-$\cap$ function of $q$. Its upper convex envelope is achieved by time sharing between the points $q = 0$ and $q = q^* \triangleq 1 - 1/\sqrt{2}$; therefore it is given by

$$R_1 + R_2 \le \begin{cases} 2H_b(q) - 1, & q^* \le q \le 1/2 \\ C^* q, & 0 \le q \le q^* \end{cases} \tag{19}$$

where $C^* \triangleq \frac{2H_b(q^*)-1}{q^*}$.
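The numbers entering (19) are easy to evaluate; the following Python sketch (illustrative) computes $q^*$, $C^*$, and the resulting single-letter sum rate, alongside the sum capacity $H_b(q)$ of Theorem 1 for comparison.

```python
import math

def Hb(x):                                   # binary entropy function
    return 0.0 if x in (0.0, 1.0) else -x*math.log2(x) - (1-x)*math.log2(1-x)

q_star = 1 - 1/math.sqrt(2)                  # ≈ 0.2929, the time-sharing point
C_star = (2*Hb(q_star) - 1) / q_star         # slope of the time-sharing line in (19)

def bsl_sum_rate(q):                         # the upper convex envelope in (19)
    return C_star * q if q <= q_star else 2*Hb(q) - 1

for q in (0.1, q_star, 0.4):
    print(q, bsl_sum_rate(q), Hb(q))         # single-letter rate sum vs. capacity Hb(q)
```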

Proof.

The direct part is shown by choosing in (6) $U_1 = S_1 \oplus X_1$ and $U_2 = S_2 \oplus X_2$, where $X_1, X_2 \sim \text{Bernoulli}(q)$ and $X_1, X_2, S_1, S_2$ are independent. From (6), the achievable rate sum is given by

$$\begin{aligned}
R_1 + R_2 &= I(U_1,U_2;Y) - I(U_1,U_2;S_1,S_2) \\
&= H(U_1|S_1) + H(U_2|S_2) - H(U_1,U_2|U_1 \oplus U_2) && (20) \\
&= H(U_1|S_1) + H(U_2|S_2) - H(U_1|U_1 \oplus U_2) - H(U_2|U_1 \oplus U_2, U_1) && (21) \\
&= H(X_1) + H(X_2) - H(U_1|U_1 \oplus U_2) && (22) \\
&= 2H_b(q) - 1, && (23)
\end{aligned}$$

where (20) follows since $Y = U_1 \oplus U_2$; (21) follows from the chain rule for entropy; (22) follows since $H(U_i|S_i) = H(X_i)$ and since $U_2$ is fully known given $(U_1 \oplus U_2, U_1)$, thus $H(U_2|U_1 \oplus U_2, U_1) = 0$; (23) follows since $H(X_i) = H_b(q)$ and since $U_1, U_2$ are independent with $P(U_i = 1) = 1/2$, thus $H(U_1|U_1 \oplus U_2) = H(U_1) = 1$.

The converse part of the proof is given in Appendix B. ∎

Figure 2: The rate sum of the binary doubly-dirty MAC vs. the best known single-letter rate sum, with input constraints $EX_1, EX_2 \le q$.

We see that the binary doubly-dirty MAC is a memoryless channel coding problem, where the capacity region is achievable by a linear code, while the best known single-letter rate region is strictly contained in the capacity region. This may be explained by the fact that each user has only partial side information, and distributed random binning is unable to capture the linear structure of the channel.

In order to understand the limitation of random binning versus a linear code, we consider these two schemes for $q$ high enough that $2H_b(q) - 1 \ge 0$. The random binning scheme uses $U_i = X_i \oplus S_i$, where $X_i \sim \text{Bernoulli}(q)$ and $S_i \sim \text{Bernoulli}(1/2)$ are independent; therefore $Y = U_1 \oplus U_2$, where $U_i \sim \text{Bernoulli}(1/2)$ for $i=1,2$. Each transmitter maps the message (bin) $W_i$ into a codeword $\mathbf{u}_i$ which is, with high probability, at a Hamming distance of $nq$ from $\mathbf{s}_i$. Therefore, given the vectors $(\mathbf{s}_1, \mathbf{s}_2)$, the available input space is of size approximately $2^{nH(U_1,U_2|S_1,S_2)} = 2^{nH(X_1,X_2)} = 2^{2nH_b(q)}$. Given the received vector $\mathbf{y}$, the residual ambiguity is $2^{nH(U_1,U_2|Y)} = 2^{n[H(U_1|Y) + H(U_2|Y,U_1)]} = 2^n$, since $H(U_1|Y) = 1$ and $H(U_2|Y,U_1) = 0$. As a result, the achievable rate sum is given by

$$R_1 + R_2 = \frac{1}{n}\log_2\bigg(\frac{|\text{input space}|}{|\text{residual ambiguity space}|}\bigg) \approx 2H_b(q) - 1.$$

The linear coding scheme of Theorem 1 has the same input space size as the random binning scheme, i.e., $2^{2nH_b(q)}$, since each user has $2^{nH_b(q)}$ cosets. However, given the received vector $\mathbf{y}$, there are only $2^{nH_b(q)}$ possible pairs of cosets, i.e., the residual ambiguity is only $2^{nH_b(q)}$. Therefore, the linear code achieves a rate sum of $R_1 + R_2 \approx 2H_b(q) - H_b(q) = H_b(q)$. The advantage of the linear coding scheme results from the “ordered structure” of the linear code, which decreases the residual ambiguity from 1 bit in random coding to $H_b(q)$.

The following example illustrates the above arguments for the case where user 2 is a “helper” for user 1, i.e., $R_2 = 0$, and user 1 transmits at its highest rate for each technique (random binning or linear coding). Table I summarizes the rates and codebook sizes of each user for $q = 0.3$, for which $H_b(q) \approx 0.88$ bit.

| | Random binning | Linear code |
| --- | --- | --- |
| Rate sum | $2H_b(q) - 1 = 0.76$ bit | $H_b(q) = 0.88$ bit |
| Codewords per bin/coset | $2^{nI(U_i;S_i)} = 2^{n[1-H_b(q)]} = 2^{0.12n}$ | $2^{n[1-H_b(q)]} = 2^{0.12n}$ |
| Helper (user 2) codebook size | $2^{nI(U_2;S_2)} = 2^{n[1-H_b(q)]} = 2^{0.12n}$ | $2^{n[1-H_b(q)]} = 2^{0.12n}$ |
| User 1 codebook size | $2^{0.76n} \cdot 2^{0.12n} = 2^{0.88n}$ | $2^{0.12n} \cdot 2^{0.88n} = 2^n$ |
| Number of possible codeword pairs | $2^{0.88n} \cdot 2^{0.12n} = 2^n$ | $2^n \cdot 2^{0.12n} = 2^{1.12n}$ |

TABLE I: Codebook sizes of the random binning and linear coding schemes for the helper problem with $q = 0.3$.
Figure 3: The Korner-Marton configuration.

Korner and Marton [11] observed a similar behavior for the “two help one” source coding problem shown in Fig. 3. In this problem there are three binary sources $X, Y, Z$, where $Z = X \oplus Y$, and the joint distribution of $X$ and $Y$ is symmetric with $P(X \ne Y) = \theta$. The goal is to encode the sources $X$ and $Y$ separately such that $Z$ can be reconstructed losslessly. Korner and Marton showed that the required rate sum is at least

$$R_x + R_y \ge 2H(Z), \tag{24}$$

and furthermore, this rate sum can be achieved by a linear code: each encoder transmits the syndrome of the observed source relative to a good linear binary code for a BSC with crossover probability θ\theta.
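A minimal Python sketch of this scheme, again with the [7,4] Hamming code as an illustrative stand-in for a “good” code (it corrects exactly the weight-$\le 1$ error patterns, playing the role of the typical $Z$ patterns for small $\theta$): each encoder transmits only the syndrome of its observed source, and the decoder recovers $Z$ from the XOR of the two syndromes.

```python
import numpy as np

# [7,4] Hamming code: its columns, read as binary numbers, locate a single error,
# so Z is recovered exactly whenever w_H(Z) <= 1 (stand-in for a good BSC(theta) code).
H = np.array([[1, 0, 1, 0, 1, 0, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]])
n = H.shape[1]
col_index = {tuple(H[:, j]): j for j in range(n)}

def leader(s):                      # coset leader of syndrome s, of weight <= 1
    z = np.zeros(n, dtype=int)
    if any(s):
        z[col_index[tuple(s)]] = 1
    return z

rng = np.random.default_rng(2)
x = rng.integers(0, 2, n)
z = np.zeros(n, dtype=int); z[3] = 1          # Z = X ^ Y with a single disagreement
y = x ^ z

sx, sy = H.dot(x) % 2, H.dot(y) % 2           # each encoder sends n-k bits: its syndrome
z_hat = leader((sx + sy) % 2)                 # H x ^ H y = H(x ^ y) = H z
assert (z_hat == z).all()
```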

In contrast, the “one help one” problem [19, 20] has a closed single-letter expression for its rate region, which corresponds to a random binning coding scheme. Korner and Marton [11] generalized the expression of [19, 20] to the “two help one” problem, and showed that the minimal rate sum required by this expression is given by

$$R_x + R_y \ge H(X,Y). \tag{25}$$

The region (25) corresponds to Slepian-Wolf encoding of $X$ and $Y$, and it can also be derived from the Berger-Tung achievable region [21] for distributed coding of $X$ and $Y$ with a single reconstruction $\hat{Z}$ under the distortion measure $d(X,Y,\hat{Z}) \triangleq X \oplus Y \oplus \hat{Z}$. Clearly, the region (25) is strictly contained in the Korner-Marton region $R_x + R_y \ge 2H(Z)$ (24), since $H(X,Y) = 1 + H(Z) > 2H(Z)$ for $Z \sim \text{Bernoulli}(\theta)$ with $\theta \ne \frac{1}{2}$. For further background on related source coding problems, see [15].

IV The Gaussian Doubly-Dirty MAC

In this section we introduce our conjecture regarding the rate loss of the best known single-letter characterization for the capacity region of the two-user Gaussian doubly-dirty MAC at high SNR. The Gaussian doubly-dirty MAC [10] is given by

$$Y = X_1 + X_2 + S_1 + S_2 + Z, \tag{26}$$

where $Z \sim \mathcal{N}(0,N)$ is independent of $X_1, X_2, S_1, S_2$, and where user 1 and user 2 must satisfy the power constraints $\frac{1}{n}\sum_{i=1}^n X_{1_i}^2 \le P_1$ and $\frac{1}{n}\sum_{i=1}^n X_{2_i}^2 \le P_2$; see Fig. 1. The interference signals $S_1$ and $S_2$ are known non-causally to the transmitters of user 1 and user 2, respectively. We shall assume that $S_1$ and $S_2$ are independent Gaussian with variances going to infinity, i.e., $S_i \sim \mathcal{N}(0,Q_i)$ with $Q_i \to \infty$ for $i=1,2$. The signal-to-noise ratios of the two users are $\mathrm{SNR}_1 = P_1/N$ and $\mathrm{SNR}_2 = P_2/N$.

The capacity region at high SNR, i.e., $\mathrm{SNR}_1, \mathrm{SNR}_2 \gg 1$, is given by [10]

$$R_1 + R_2 \le \frac{1}{2}\log_2\bigg(\frac{\min\{P_1,P_2\}}{N}\bigg), \tag{27}$$

and it is achievable by a modulo-lattice coding scheme with dimension going to infinity. In contrast, it was shown in [10] that at high SNR and with strong independent Gaussian interferences, the natural generalization of Costa’s strategy (8) to the two-user case, i.e., with auxiliary random variables $U_1 = X_1 + S_1$ and $U_2 = X_2 + S_2$, is not able to achieve any positive rate. A better choice of $U_1$ and $U_2$, suggested in [10], is a modulo version of Costa’s strategy (8),

$$U_i^* = [X_i + S_i] \bmod \Delta_i, \tag{28}$$

where $\Delta_i = \sqrt{12 P_i}$, and where $X_i \sim \text{Unif}\big([-\frac{\Delta_i}{2}, \frac{\Delta_i}{2})\big)$ is independent of $S_i$, for $i=1,2$. In this case the rate loss with respect to (27) is $\frac{1}{2}\log_2\big(\frac{\pi e}{6}\big) \approx 0.254$ bit.

The best known single-letter capacity region for the Gaussian doubly-dirty MAC (26) is defined as the set of all rate pairs $(R_1,R_2)$ satisfying (7), where $X_1$ and $X_2$ are restricted to the power constraints $EX_1^2 \le P_1$ and $EX_2^2 \le P_2$. We believe that at high SNR and strong interference, the modulo-$\Delta$ strategy (28) is an optimum choice of $(X_1,X_2,U_1,U_2)$ in (7) for the Gaussian doubly-dirty MAC. This implies the following conjecture about the rate loss of the best known single-letter characterization.

Conjecture 1.

For the Gaussian doubly-dirty MAC, at high SNR and strong interference, the best known single-letter expression $R_{BSL}^{sum}$ (7) loses

$$C^{sum} - R_{BSL}^{sum} = \frac{1}{2}\log_2\Big(\frac{\pi e}{6}\Big) \approx 0.254\ \text{bit}, \tag{29}$$

with respect to the sum capacity $C^{sum}$ (27).

Note that the right-hand side of (29) is the well-known “shaping loss” [22] (equivalent to a 1.53 dB power loss).
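Both constants can be checked with a one-line computation (illustrative sketch):

```python
import math

loss_bits = 0.5 * math.log2(math.pi * math.e / 6)   # the rate loss in (29)
loss_db = 10 * math.log10(math.pi * math.e / 6)     # the equivalent power loss
print(loss_bits, loss_db)                           # ≈ 0.2546 bit, ≈ 1.533 dB
```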

A heuristic approach to attacking the proof of this conjecture is to follow the steps of the proof of the converse part in the binary case (Theorem 2). First, in Lemma 6 we derive a simplified single-letter formula, $\overline{G}_{max}(P_1,P_2)$, which is analogous to Lemma 1 in the binary case. The next step would be to optimize this expression. However, an optimal choice of the auxiliary random variables $V_1, V_1', V_2, V_2'$ (provided in the binary case by Lemma 2 and Lemma 3) is unfortunately still missing in the Gaussian case. The expression in Lemma 6 is close in spirit to the point-to-point dirty-tape capacity at high SNR and strong interference [8]. In [8] it is shown that optimizing the capacity is equivalent to minimum entropy-constrained scalar quantization at high resolution, which is achieved by a lattice quantizer. Clearly, if we could show a similar lemma for the two variable pairs in the maximization of Lemma 6, i.e., that the maximum is achieved by a pair of lattice quantizers, then the conjecture would be an immediate consequence.

It should be noted that the above discussion is valid only for strong interferences $S_1$ and $S_2$. For interference with finite power, it seems that cancelling the interference part of the time and staying silent the rest of the time (as in the time-sharing region $0 \le q \le q^*$ in the binary case) may achieve better rates.

V Summary

A memoryless information-theoretic problem is considered open as long as we are missing a general single-letter characterization of its information performance. This goes hand in hand with the optimality of the random coding approach for those problems which are currently solved. We examined this traditional view for the memoryless doubly-dirty MAC.

In the binary case, we showed that the best known single-letter characterization is strictly contained in the region achievable by linear coding, and that the latter is in fact the full capacity region of the problem. In the Gaussian case, we conjectured that the best known single-letter characterization suffers an inherent rate loss (equal to the well-known “shaping loss” $\frac{1}{2}\log_2(\pi e/6)$), and we provided a partial proof. This is in contrast to the asymptotic optimality (as the lattice dimension goes to infinity) of lattice strategies, recently shown in [10].

The underlying reason for these performance gaps is that random binning is in general not optimal when side information is distributed among more than one terminal in the network. In the specific case of the doubly-dirty MAC (as in Korner-Marton’s modulo-two sum problem [11] and similar settings [14, 15]), the linear structure of the network allows one to show that linear binning is not only better, but in fact capacity achieving.

Appendix A A Closed Form Expression for the Capacity of the Binary MAC with One Dirty User

We consider the binary dirty MAC (2) with $S_2 = 0$,

$$Y = X_1 \oplus X_2 \oplus S_1, \tag{30}$$

where $S_1 \sim \text{Bernoulli}(1/2)$ is known non-causally at the encoder of user 1, and with the input constraints $\frac{1}{n} w_H(\mathbf{x}_i) \le q_i$ for $i=1,2$. We show that the common-message ($W_1 = W_2 = W$) capacity of this channel is given by

$$C_{com} = H_b(q_1). \tag{31}$$

To prove (31), consider the general expression for the common message capacity of the MAC with one informed user [4], given by

$$C_{com} = \max_{U_1,X_1,X_2} \big\{ I(U_1,X_2;Y) - I(U_1,X_2;S_1) \big\}, \tag{32}$$

where the maximization is over all joint distributions of the form

$$P(S_1,X_1,X_2,U_1,Y) = P(S_1)\,P(X_2)\,P(U_1|X_2,S_1)\,P(X_1|S_1,U_1)\,P(Y|X_1,X_2,S_1).$$

The converse part of (31) follows since for any $U_1, X_1, X_2$, the common-message rate $R_{com}$ can be upper bounded by

$$\begin{aligned}
R_{com} &= I(U_1,X_2;Y) - I(U_1,X_2;S_1) \\
&= H(S_1|U_1,X_2) - H(Y|U_1,X_2) + H(Y) - H(S_1) \\
&\le H(S_1|U_1,X_2) - H(Y|U_1,X_2) && (33) \\
&= H(S_1|U_1,X_2) - H(X_1 \oplus S_1|U_1,X_2) && (34) \\
&= H(S_1|T) - H(X_1 \oplus S_1|T) && (35) \\
&= E_T\big\{H(S_1|T=t) - H(X_1 \oplus S_1|T=t)\big\} && (36) \\
&= E_T\big\{H_b(\alpha_t) - H_b(\beta_t)\big\}, && (37)
\end{aligned}$$

where (33) follows since $H(Y) \le 1$ and $H(S_1) = 1$; (34) follows since $Y = X_1 \oplus X_2 \oplus S_1$; (35) follows from the definition $T \triangleq (U_1,X_2)$; (36) follows from the definition of conditional entropy; (37) follows from the definitions $\alpha_t \triangleq P(S_1 = 1|T=t)$ and $\beta_t \triangleq P(S_1 \oplus X_1 = 1|T=t)$ for any $t \in \mathcal{T}$. We also define $q_{1|t} \triangleq P(X_1 = 1|T=t) = E\{X_1|T=t\}$; therefore the input constraint of user 1 can be written as

$$EX_1 = E_T E\{X_1|T=t\} = E_T\{q_{1|t}\} \le q_1. \tag{38}$$

Without loss of generality, we can consider only $\alpha_t, \beta_t, q_{1|t} \in [0,1/2]$ in (37) for any $t \in \mathcal{T}$. Thus,

$$\begin{aligned}
R_{com} &\le E_T\Big\{H_b(\alpha_t) - H_b\big([\alpha_t - q_{1|t}]^+\big)\Big\} && (39) \\
&\le E_T\big\{H_b(q_{1|t})\big\} && (40) \\
&\le H_b\big(E_T\{q_{1|t}\}\big) && (41) \\
&\le H_b(q_1), && (42)
\end{aligned}$$

where (39) follows from (37) and since $H_b(\beta_t) \ge H_b\big([\alpha_t - q_{1|t}]^+\big)$, where $[x]^+ = \max\{x,0\}$; (40) follows since $H_b(\alpha_t) - H_b\big([\alpha_t - q_{1|t}]^+\big)$ is increasing in $\alpha_t$ for $\alpha_t \le q_{1|t} \le 1/2$ and decreasing in $\alpha_t$ for $q_{1|t} < \alpha_t \le 1/2$, thus the maximum is attained at $\alpha_t = q_{1|t}$; (41) follows from Jensen’s inequality since $H_b(\cdot)$ is convex-$\cap$; (42) follows from the input constraint of user 1 (38). The converse part follows since the outer bound is valid for any $U_1$ and $X_1, X_2$ that satisfy the input constraints.

The direct part is shown by using $U_1 = X_1 \oplus S_1$, where $X_1$ and $S_1$ are independent with $X_1 \sim \text{Bernoulli}(q_1)$; thus $U_1 \sim \text{Bernoulli}(1/2)$. Furthermore, $X_2 \sim \text{Bernoulli}(q_2)$ is independent of $X_1, U_1, S_1$. In this case $Y = U_1 \oplus X_2$; hence $Y \sim \text{Bernoulli}(1/2)$. Using this choice of $U_1, X_1, X_2$, the achievable common-message rate is given by

$$\begin{aligned}
R_{com} &= I(U_1,X_2;Y) - I(U_1,X_2;S_1) \\
&= H(S_1|U_1,X_2) - H(Y|U_1,X_2) + H(Y) - H(S_1) \\
&= H(X_1) && (43) \\
&= H_b(q_1),
\end{aligned}$$

where (43) follows since $H(S_1|U_1,X_2) = H(S_1|U_1) = H(X_1)$, $H(Y|U_1,X_2) = 0$, $H(Y) = 1$ and $H(S_1) = 1$.

Appendix B Proof of the Converse Part of Theorem 2

The proof of the converse part follows from Lemma 1, Lemma 2 and Lemma 3, whereas Lemma 4 and Lemma 5 are technical results which assist in the derivation of Lemma 3.

Let us define the following functions:

$$F(P_{V_1,V_1'}, P_{V_2,V_2'}) \triangleq \Big[H(V_1) + H(V_2) - H(V_1' \oplus V_2') - 1\Big]^+, \tag{44}$$

where $[x]^+ = \max(0,x)$; its $(q_1,q_2)$-constrained maximization with respect to $V_1, V_1', V_2, V_2' \in \mathbb{Z}_2$, where $(V_1,V_1')$ and $(V_2,V_2')$ are independent, i.e.,

$$F_{max}(q_1,q_2) \triangleq \max_{V_1,V_1',V_2,V_2'} F(P_{V_1,V_1'}, P_{V_2,V_2'}) \quad \text{s.t.}\ \ P(V_i \ne V_i') \le q_i\ \text{for}\ i=1,2; \tag{45}$$

and the upper convex envelope of $F_{max}(q_1,q_2)$ with respect to $q_1, q_2$:

$$\overline{F}_{max}(q_1,q_2) \triangleq \mathrm{u.c.e.}\big\{F_{max}(q_1,q_2)\big\}. \tag{46}$$

In the following lemma we give an outer bound for the single-letter region (7) of the binary doubly-dirty MAC in the spirit of [23, Lemma 3] and [8, Proposition 1].

Lemma 1.

The best known single-letter rate sum (7) of the binary doubly-dirty MAC (2) with input constraints $q_1$ and $q_2$ is upper bounded by

$$R_1 + R_2 \le \overline{F}_{max}(q_1,q_2). \tag{47}$$
Proof.

An outer bound on the best known single-letter region (7) is given by

$$\begin{aligned}
R_{BSL}^{sum}(U_1,U_2) &\triangleq \Big[I(U_1,U_2;Y) - I(U_1,U_2;S_1,S_2)\Big]^+ && (48) \\
&= \Big[H(S_1|U_1) + H(S_2|U_2) - H(Y|U_1,U_2) + H(Y) - H(S_1) - H(S_2)\Big]^+ && (49) \\
&\le \Big[H(S_1|U_1) + H(S_2|U_2) - H(Y|U_1,U_2) - 1\Big]^+ && (50) \\
&= \Big[E_{U_1,U_2}\big\{H(S_1|U_1=u_1) + H(S_2|U_2=u_2) - H(Y|U_1=u_1,U_2=u_2) - 1\big\}\Big]^+ && (51) \\
&\le E_{U_1,U_2}\Big\{\big[H(S_1|U_1=u_1) + H(S_2|U_2=u_2) - H(Y|U_1=u_1,U_2=u_2) - 1\big]^+\Big\} && (52) \\
&\le E_{U_1,U_2}\Big\{F\big(P_{S_1,S_1\oplus X_1|U_1=u_1},\, P_{S_2,S_2\oplus X_2|U_2=u_2}\big)\Big\} && (53) \\
&\le E_{U_1,U_2}\Big\{\overline{F}_{max}\big(q_{1|u_1}, q_{2|u_2}\big)\Big\} && (54) \\
&\le \overline{F}_{max}\big(E_{U_1} q_{1|u_1},\, E_{U_2} q_{2|u_2}\big) && (55) \\
&\le \overline{F}_{max}(q_1,q_2), && (56)
\end{aligned}$$

where (50) follows since $H(S_1) = H(S_2) = 1$ and $H(Y) \le 1$; (51) follows from the definition of conditional entropy; (52) follows since $[Ex]^+ \le E\{x^+\}$; (53) follows from the definition of the function $F(P_{V_1,V_1'}, P_{V_2,V_2'})$ (44); likewise, (54) follows from the definition of the function $\overline{F}_{max}(q_1,q_2)$ (46) and from the definition

$$q_{i|u_i} \triangleq P(S_i \ne X_i \oplus S_i | U_i = u_i) = P(X_i = 1|U_i = u_i), \quad \text{for}\ i = 1,2;$$

(55) follows from Jensen’s inequality since $\overline{F}_{max}(q_1,q_2)$ is a concave function; (56) follows from the input constraints, where

$$EX_i = E_{U_i} P(X_i = 1|U_i = u_i) = \sum_{u_i \in \mathcal{U}_i} P(u_i)\,P(X_i = 1|U_i = u_i) = \sum_{u_i \in \mathcal{U}_i} P(u_i)\,q_{i|u_i} \le q_i, \quad \text{for}\ i = 1,2. \tag{57}$$

The lemma now follows since the upper bound (56) on the rate sum is independent of $U_1$ and $U_2$; hence it also bounds the single-letter region $\mathcal{R}_{BSL}(q)$. ∎

A simplified expression for the function $F_{max}(q_1,q_2)$ of (45) is given in the following lemma.

Lemma 2.

The function $F_{max}(q_1,q_2)$ (45) is given by

$$F_{max}(q_1,q_2) = \max_{\alpha_1,\alpha_2 \in [0,1/2]} \Big[H_b(\alpha_1) + H_b(\alpha_2) - H_b\big([\alpha_1 - q_1]^+ \ast [\alpha_2 - q_2]^+\big) - 1\Big]^+, \tag{58}$$

where $\ast$ denotes binary convolution, i.e., $x \ast y \triangleq (1-x)y + (1-y)x$.

Proof.

The function $F_{max}(q_1,q_2)$ is defined in (44) and (45), where $V_1, V_1', V_2, V_2'$ are binary random variables. Let us define the following probabilities:

$$\alpha_i \triangleq P(V_i = 1), \qquad \delta_i \triangleq P(V_i' = 1|V_i = 0), \qquad \gamma_i \triangleq P(V_i' = 0|V_i = 1),$$

for $i=1,2$. We thus have

$$P(V_i' = 1) = (1-\alpha_i)\delta_i + \alpha_i(1-\gamma_i) \triangleq g(\alpha_i,\delta_i,\gamma_i), \qquad P(V_i \ne V_i') = \alpha_i\gamma_i + (1-\alpha_i)\delta_i \triangleq h(\alpha_i,\delta_i,\gamma_i),$$

for $i=1,2$. The maximization (45) can be written as

$$F_{max}(q_1,q_2) = \max_{\alpha_1,\alpha_2}\bigg[H_b(\alpha_1) + H_b(\alpha_2) - \min_{\substack{\gamma_1,\delta_1,\gamma_2,\delta_2 \\ h(\alpha_i,\delta_i,\gamma_i) \le q_i,\ i=1,2}} H_b\big(g(\alpha_1,\delta_1,\gamma_1) \ast g(\alpha_2,\delta_2,\gamma_2)\big) - 1\bigg]^+. \tag{59}$$

This maximization has two equivalent solutions, $(\alpha_1^o,\alpha_2^o)$ and $(1-\alpha_1^o, 1-\alpha_2^o)$ with $0 \le \alpha_1^o, \alpha_2^o \le 0.5$, since any other $(\alpha_1,\alpha_2)$ can only increase the inner minimization in (59), which results in a lower $F_{max}(q_1,q_2)$. Therefore, without loss of generality we may assume that $0 \le \alpha_1, \alpha_2 \le 0.5$.

To prove the lemma we need to show that for any $\alpha_i$ the inner minimization is achieved by

$$\delta_i = 0, \quad \gamma_i = \min\{1, q_i/\alpha_i\}, \quad i = 1,2.$$

In other words, $V_i'$ has the smallest possible probability of being 1 under the constraint $P(V_i \ne V_i') \le q_i$, implying that the transition from $V_i$ to $V_i'$ is a “Z channel”. The inner minimization requires that $P(V_i' = 1)$ be minimized subject to the constraint $P(V_i \ne V_i') \le q_i$; therefore it is equivalent to the following minimization:

$$\min_{\substack{\gamma_i,\delta_i \\ h(\alpha_i,\delta_i,\gamma_i) \le q_i}} g(\alpha_i,\delta_i,\gamma_i), \quad i = 1,2.$$

For $\alpha_i \le q_i$, the solution is $\delta_i = 0$ and $\gamma_i = 1$, since in this case $g(\alpha_i,\delta_i,\gamma_i) = 0$ and the constraint is satisfied. For $q_i \le \alpha_i \le 0.5$, in order to minimize $g(\alpha_i,\delta_i,\gamma_i)$, it is required that $\delta_i \in [0, q_i/(1-\alpha_i)]$ be minimal and $\gamma_i \in [0, q_i/\alpha_i]$ be maximal such that the constraint is satisfied. Clearly, the best choice is $\delta_i = 0$ and $\gamma_i = q_i/\alpha_i$; in this case the constraint is satisfied and $g(\alpha_i,\delta_i,\gamma_i) = \alpha_i - q_i$. ∎
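As a sanity check on this “Z channel” claim, the following brute-force Python sketch (with illustrative values of $\alpha_i$ and $q_i$) grids over $(\delta_i, \gamma_i)$ and confirms that the constrained minimizer of $g$ is $\delta_i = 0$, $\gamma_i = q_i/\alpha_i$, with minimum value $\alpha_i - q_i$.

```python
import numpy as np

q, alpha = 0.2, 0.35                          # illustrative constraint and P(V=1)
best, arg = 2.0, None
for d in np.linspace(0, 1, 501):              # delta = P(V'=1 | V=0)
    for g in np.linspace(0, 1, 501):          # gamma = P(V'=0 | V=1)
        if alpha*g + (1 - alpha)*d <= q:      # constraint h(alpha, delta, gamma) <= q
            val = (1 - alpha)*d + alpha*(1 - g)   # objective g(alpha, delta, gamma)
            if val < best:
                best, arg = val, (d, g)
print(arg, best)                              # ≈ (0.0, q/alpha), alpha - q
```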

The next lemma gives an explicit upper bound on $F_{max}(q_1,q_2)$ (45) for the case $q_1 = q_2$. Let

$$f(x) = x - \frac{1}{1 + \big(\frac{1}{x} - 1\big)^2}, \tag{60}$$

and let

$$q_c \triangleq \max_{x \in [0,1/2]} f(x). \tag{61}$$

Since $f(x)$ is differentiable, we can characterize $q_c$ by differentiating $f(x)$ with respect to $x$ and equating to zero, which gives

$$4x^4 - 8x^3 + 10x^2 - 6x + 1 = 0.$$

This fourth-order polynomial has two complex roots and two real roots, where one of the real roots is a local minimum and the other is a local maximum. Specifically, this local maximum maximizes $f(x)$ over the interval $x \in [0,1/2]$, yielding $q_c \simeq 0.1501$, which is attained at $x \simeq 0.257$.
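These numerical values are straightforward to reproduce; a short Python sketch (illustrative) finds the real roots of the quartic and evaluates $f$ at them:

```python
import numpy as np

def f(x):                                     # f(x) of (60)
    return x - 1.0 / (1.0 + (1.0/x - 1.0)**2)

roots = np.roots([4, -8, 10, -6, 1])          # 4x^4 - 8x^3 + 10x^2 - 6x + 1 = 0
real = [r.real for r in roots if abs(r.imag) < 1e-9 and 0 < r.real < 0.5]
x_max = max(real, key=f)                      # the local maximum inside [0, 1/2]
print(x_max, f(x_max))                        # ≈ 0.257, q_c ≈ 0.1501
```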

Lemma 3.

For $q_1 = q_2 = q$, we have that:

$$\begin{array}{ll} F_{max}(q,q) = 2H_b(q) - 1, & q_c \le q \le 1/2 \\ F_{max}(q,q) \le C^* q, & 0 < q < q_c \\ F_{max}(0,0) = 0, & q = 0, \end{array} \tag{65}$$

where $q_c$ is defined in (61), while $C^* = \frac{2H_b(q^*)-1}{q^*}$ and $q^* \triangleq 1 - 1/\sqrt{2} \simeq 0.3$ are defined in (19).

Note that in the first case ($q_c \le q \le 1/2$), the maximum in (58) is achieved by $\alpha_1 = \alpha_2 = q$, while in the third case ($q = 0$), (58) is achieved by $\alpha_1 = \alpha_2 = 1/2$, as shown in Fig. 5. Although we do not have an explicit expression for $F_{max}(q,q)$ in the range $0 < q < q_c$, the bound $F_{max}(q,q) \le C^* q$ is sufficient for the purpose of proving Theorem 2 because $q_c \le q^*$. A numerical characterization of $F_{max}(q,q)$ is plotted in Fig. 4.

Figure 4: Numerical evaluation of $F_{max}(q,q)$ (58) for $q \in [0,0.12]$ (Fig. 2 shows the same plot for $q \in [0,0.5]$).

Figure 5: The optimal $\alpha_1 = \alpha_2 = \alpha(q)$ which maximizes (58).
Proof.

Define

$$F(\alpha_1,\alpha_2,q) \triangleq H_b(\alpha_1) + H_b(\alpha_2) - H_b\big([\alpha_1 - q]^+ \ast [\alpha_2 - q]^+\big) - 1.$$

From the discussion above about the cases of equality in (65), Lemma 3 will follow by showing that $F(\alpha_1,\alpha_2,q)$ is otherwise smaller, i.e.,

$$F(\alpha_1,\alpha_2,q) \le \begin{cases} C^* q, & 0 \le q \le q_c \\ 2H_b(q) - 1, & q_c \le q \le 1/2 \end{cases} \tag{68}$$

for all $0 \le \alpha_1, \alpha_2 \le 1/2$. It is easy to see that for $\alpha_1, \alpha_2 \le q$, the function $F(\alpha_1,\alpha_2,q)$ is monotonically increasing in $\alpha_1, \alpha_2$, and thus $F(\alpha_1,\alpha_2,q) \le F(q,q,q) = 2H_b(q) - 1$. For $\alpha_1 \le q$ and $q < \alpha_2 \le 1/2$, $F(\alpha_1,\alpha_2,q)$ is increasing in $\alpha_1$ and decreasing in $\alpha_2$, and thus $F(\alpha_1,\alpha_2,q) \le F(q,q,q) = 2H_b(q) - 1$. By symmetry, the same holds for $\alpha_2 \le q$ and $q \le \alpha_1 \le 1/2$. As a consequence, it remains to show that (68) is satisfied for $q \le \alpha_1, \alpha_2 \le 1/2$. Likewise, in the sequel we may assume without loss of generality that $q \le \alpha_2 \le \alpha_1 \le 1/2$.

The bound for the interval $q_c < q \le 1/2$: in this case (68) is equivalent to the following bound:

$$H_b\big((\alpha_1 - q) \ast (\alpha_2 - q)\big) - H_b(\alpha_1) - H_b(\alpha_2) + 2H_b(q) \ge 0, \quad \text{for}\ q_c \le q \le \alpha_2 \le \alpha_1 \le 1/2. \tag{69}$$

The LHS is lower bounded by

$$\begin{aligned}
& H_b\big((\alpha_1-q) \ast (\alpha_2-q)\big) - H_b(\alpha_1) - H_b(\alpha_2) + 2H_b(q) \\
&\ge H_b(\alpha_1 - q) - H_b(\alpha_1) - H_b(\alpha_2) + 2H_b(q) && (70) \\
&\ge H_b(\alpha_1 - q) - 2H_b(\alpha_1) + 2H_b(q) && (71) \\
&\ge 0, && (72)
\end{aligned}$$

where (70) follows since $H_b\big((\alpha_1-q)\ast(\alpha_2-q)\big) \ge H_b(\alpha_1-q)$; (71) follows since $\alpha_2 \le \alpha_1 \le 1/2$; (72) follows from Lemma 4 below.

The bound for the interval $0 \le q \le q_c$: in this case (68) is equivalent to the following bound:

$$H_b\big((\alpha_1-q)\ast(\alpha_2-q)\big) \ge H_b(\alpha_1) + H_b(\alpha_2) - 1 - C^* q, \quad \text{for}\ 0 \le q \le \alpha_2 \le \alpha_1 \le q_c. \tag{73}$$

For fixed $\alpha_1$ and $\alpha_2$, let us denote the LHS and the RHS of (73) by

$$g_l(q) \triangleq H_b\big((\alpha_1-q)\ast(\alpha_2-q)\big), \qquad g_r(q) \triangleq H_b(\alpha_1) + H_b(\alpha_2) - 1 - C^* q.$$

The function $g_l(q)$ is convex-$\cap$ in $q$, since it is a composition of the function $H_b(x)$, which is non-decreasing and convex-$\cap$ on $[0,1/2]$, with the function $[\alpha_1-q]\ast[\alpha_2-q]$, which is convex-$\cap$ in $q$ [24]. Since $g_r(q)$ is a linear function of $q$ and $g_l(q)$ is convex-$\cap$ in $q$, the bound (73) is satisfied if it holds at the edges of the interval ($q = 0$ and $q = \alpha_2$). For $q = 0$, (73) holds since

$$\begin{aligned}
g_l(q=0) &= H_b(\alpha_1 \ast \alpha_2) \\
&\ge \max\{H_b(\alpha_1), H_b(\alpha_2)\} \\
&\ge \min\{H_b(\alpha_1), H_b(\alpha_2)\} \\
&\ge H_b(\alpha_1) + H_b(\alpha_2) - 1 \\
&= g_r(q=0).
\end{aligned}$$

For $q = \alpha_2$, where $0 \le q \le q_c$, the bound (73) is satisfied since

$$\begin{aligned}
g_r(q=\alpha_2) &= H_b(\alpha_1) + H_b(\alpha_2) - 1 - C^* \alpha_2 && (74) \\
&\le H_b(\alpha_1) - H_b(q^*) + H_b(0.5\, q^*) - 0.5 && (75) \\
&\le H_b(\alpha_1) - H_b(q_c) && (76) \\
&\le H_b(\alpha_1) - H_b(\alpha_2) && (77) \\
&\le H_b(\alpha_1 - \alpha_2) && (78) \\
&= g_l(q=\alpha_2), && (79)
\end{aligned}$$

where (75) follows from Lemma 5, since $\arg\max_{\alpha_2 \in [0,1/2]} g_r(\alpha_2) = 0.5\, q^*$ and $C^* = \frac{2H_b(q^*)-1}{q^*}$; (76) follows since for $q^* = 1 - 1/\sqrt{2}$ and $q_c$ defined in (61), we have $H_b\big(1-1/\sqrt{2}\big) - H_b\big(0.5(1-1/\sqrt{2})\big) + 0.5 \simeq 0.68\ldots \ge H_b(q_c)$; (77) follows since $q_c \ge \alpha_2$, thus $H_b(q_c) \ge H_b(\alpha_2)$; (78) follows since $H_b(\alpha_1) - H_b(\alpha_1 - \alpha_2)$ is decreasing in $\alpha_1$, thus $H_b(\alpha_1) - H_b(\alpha_1-\alpha_2) \le H_b(\alpha_2)$ for $\alpha_2 \le \alpha_1 \le 1/2$. Therefore, the bound (73) follows, which completes the proof. ∎

Lemma 4 and Lemma 5 are auxiliary lemmas used in the proof of Lemma 3.

Lemma 4.

For $q_c \le q \le \alpha_1 \le 1/2$, the following inequality is satisfied:

$$f_1(\alpha_1) \triangleq H_b(\alpha_1 - q) - 2H_b(\alpha_1) + 2H_b(q) \ge 0. \tag{80}$$
Proof.

Since f1(α1=q)=0f_{1}(\alpha_{1}=q)=0, it is sufficient to show that f1(α1)f_{1}(\alpha_{1}) is non-decreasing function in α1\alpha_{1}, i.e., ddα1f1(α1)0\frac{d}{d\alpha_{1}}f_{1}(\alpha_{1})\geq 0 for qcqα11/2q_{c}\leq q\leq\alpha_{1}\leq 1/2, therefore

ddα1f1(α1)=log2(1α1q1)2log2(1α11)0.\displaystyle\frac{d}{d\alpha_{1}}f_{1}(\alpha_{1})=\log_{2}\Big{(}\frac{1}{\alpha_{1}-q}-1\Big{)}-2\log_{2}\Big{(}\frac{1}{\alpha_{1}}-1\Big{)}\geq 0. (81)

Due to the monotonicity of the log function, (81) is equivalent to

qα111+(1α11)2=f(α1),\displaystyle q\geq\alpha_{1}-\frac{1}{1+\Big{(}\frac{1}{\alpha_{1}}-1\Big{)}^{2}}=f(\alpha_{1}), (82)

where f()f(\cdot) was defined in (60). Since, by the definition of qcq_{c} in (61), f(x)qcf(x)\leq q_{c} for all x[0,1/2]x\in[0,1/2], it follows that f(α1)qf(\alpha_{1})\leq q for all α1\alpha_{1} whenever qcqq_{c}\leq q, and in particular for qcqα1q_{c}\leq q\leq\alpha_{1}, which implies (82) as desired. ∎
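A quick numerical sanity check of Lemma 4 (illustrative only; it again assumes qc = max f per (61)):

import numpy as np

def Hb(p):
    # binary entropy in bits
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -p*np.log2(p) - (1 - p)*np.log2(1 - p)

x = np.linspace(1e-6, 0.5, 200001)
qc = (x - 1/(1 + (1/x - 1)**2)).max()   # qc, assuming qc = max f(x)

worst = np.inf
for q in np.linspace(qc, 0.5, 200):
    a1 = np.linspace(q, 0.5, 200)       # sweep qc <= q <= a1 <= 1/2
    worst = min(worst, (Hb(a1 - q) - 2*Hb(a1) + 2*Hb(q)).min())
print(worst)                            # minimum of f1; should be >= 0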

Lemma 5.

Let

f2(x)=Hb(x)1Cx,\displaystyle f_{2}(x)=H_{b}(x)-1-C^{*}\cdot x, (83)

where x[0,1/2]x\in[0,1/2], and C=2Hb(q)1qC^{*}=\frac{2H_{b}(q^{*})-1}{q^{*}} with q=11/2q^{*}=1-1/\sqrt{2}. The maximum of f2(x)f_{2}(x) is achieved at

argmaxxf2(x)=0.5q=12(11/2).\displaystyle\arg\max_{x}f_{2}(x)=0.5q^{*}=\frac{1}{2}(1-1/\sqrt{2}). (84)
Proof.

By differentiating f2(x)f_{2}(x) with respect to xx and equating the derivative to zero, we get

0=ddxf2(x)=log2(1xx)C,\displaystyle 0=\frac{d}{dx}f_{2}(x)=\log_{2}\Big{(}\frac{1-x}{x}\Big{)}-C^{*}, (85)

thus xo=12C+1x^{o}=\frac{1}{2^{C^{*}}+1} maximizes f2(x)f_{2}(x), since the second derivative is negative, i.e., d2dx2f2(x)|x=xo<0\frac{d^{2}}{dx^{2}}f_{2}(x)|_{x=x^{o}}<0. The lemma follows since xo=12C+1=0.5qx^{o}=\frac{1}{2^{C^{*}}+1}=0.5q^{*}. ∎
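The closed form is easy to confirm numerically; the short sketch below (illustrative only) checks that the maximizer of f2 coincides with q∗/2:

import numpy as np

def Hb(p):
    # binary entropy in bits
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -p*np.log2(p) - (1 - p)*np.log2(1 - p)

qs = 1 - 1/np.sqrt(2)                 # q*
Cs = (2*Hb(qs) - 1)/qs                # C*

x = np.linspace(1e-6, 0.5, 500001)
xo = x[np.argmax(Hb(x) - 1 - Cs*x)]   # numerical maximizer of f2
print(xo, qs/2, 1/(2**Cs + 1))        # all three ~0.146447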

We are now in a position to summarize the proof of Theorem 2.

Proof of Theorem 2 - Converse Part. The rate sum is upper bounded by

R1+R2\displaystyle R_{1}+R_{2} u.c.e{Fmax(q,q)}\displaystyle\leq u.c.e\Big{\{}F_{max}(q,q)\Big{\}} (86)
u.c.e{Cq,0qqc2Hb(q)1,qc<q1/2}\displaystyle\leq u.c.e\Bigg{\{}\begin{array}[]{cc}C^{*}\cdot q,&0\leq q\leq q_{c}\\ 2H_{b}(q)-1,&q_{c}<q\leq 1/2\\ \end{array}\Bigg{\}} (89)
=u.c.e{[2Hb(q)1]+},\displaystyle=u.c.e\Big{\{}[2H_{b}(q)-1]^{+}\Big{\}}, (90)

where (86) follows from Lemma 1; (89) follows from Lemma 3; and (90) follows since the function inside the envelope in (89) is sandwiched between [2Hb(q)1]+[2H_{b}(q)-1]^{+} and its upper convex envelope, so both functions have the same upper convex envelope.
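For completeness, the following sketch computes the upper convex envelope in (90) numerically, via the upper hull of the graph of [2Hb(q)−1]+, and checks that it equals the tangent line C∗·q up to q∗ followed by the curve 2Hb(q)−1 beyond it (an illustration, not part of the proof):

import numpy as np

def Hb(p):
    # binary entropy in bits, Hb(0) = Hb(1) = 0
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -p*np.log2(p) - (1 - p)*np.log2(1 - p)

q = np.linspace(0.0, 0.5, 5001)
g = np.maximum(2*Hb(q) - 1, 0.0)          # [2Hb(q)-1]^+

def upper_envelope(x, y):
    # upper concave envelope on a grid, via the upper hull of the graph
    hull = []
    for i in range(len(x)):
        while len(hull) >= 2:
            i1, i2 = hull[-2], hull[-1]
            # drop i2 if it lies on or below the chord from i1 to i
            if (y[i2] - y[i1])*(x[i] - x[i1]) <= (y[i] - y[i1])*(x[i2] - x[i1]):
                hull.pop()
            else:
                break
        hull.append(i)
    return np.interp(x, x[hull], y[hull])

qs = 1 - 1/np.sqrt(2)                     # q* of Lemma 5
Cs = (2*Hb(qs) - 1)/qs                    # C*
cand = np.where(q <= qs, Cs*q, 2*Hb(q) - 1)
print(np.abs(upper_envelope(q, g) - cand).max())   # tiny: the envelopes agree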

Appendix C A Simplified Outer Bound for the Sum Capacity in the Strong-Interference Gaussian Case

Lemma 6.

The best known single-letter sum capacity (7) of the Gaussian doubly-dirty MAC (26) with power constraints P1P_{1}, P2P_{2}, and strong interferences (Q1,Q2Q_{1},Q_{2}\rightarrow\infty) is upper bounded by

R1+R2u.c.e{supV1,V1,V2,V2[h(V1)+h(V2)h(V1+V2+Z)+h(S1+S2)h(S1)h(S2)]+},\displaystyle R_{1}+R_{2}\leq u.c.e\bigg{\{}\sup_{V_{1},V^{\prime}_{1},V_{2},V^{\prime}_{2}}\Big{[}h(V_{1})+h(V_{2})-h\big{(}V^{\prime}_{1}+V^{\prime}_{2}+Z\big{)}+h(S_{1}+S_{2})-h(S_{1})-h(S_{2})\Big{]}^{+}\bigg{\}}, (91)

where u.c.eu.c.e is the upper convex envelope operation with respect to P1P_{1} and P2P_{2}, and [x]+=max(0,x)[x]^{+}=\max(0,x). The supremum is over all V1,V1,V2,V2V_{1},V^{\prime}_{1},V_{2},V^{\prime}_{2} such that (V1,V1)(V_{1},V^{\prime}_{1}) is independent of (V2,V2)(V_{2},V^{\prime}_{2}), and

E{(ViVi)2}Pi,\displaystyle E\Big{\{}(V_{i}-V^{\prime}_{i})^{2}\Big{\}}\leq P_{i},
h(Vi)h(Si),\displaystyle h(V_{i})\leq h(S_{i}),

for i=1,2i=1,2.

Proof.

Let us define the following two functions (corresponding to F(PV1,V1,PV2,V2)F(P_{V_{1},V^{\prime}_{1}},P_{V_{2},V^{\prime}_{2}}) of (44)). The first is

G(fV1,V1,fV2,V2)[h(V1)+h(V2)h(V1+V2+Z)+h(S1+S2)h(S1)h(S2)]+.\displaystyle G\big{(}f_{V_{1},V^{\prime}_{1}},f_{V_{2},V^{\prime}_{2}}\big{)}\triangleq\Big{[}h(V_{1})+h(V_{2})-h\big{(}V^{\prime}_{1}+V^{\prime}_{2}+Z\big{)}+h(S_{1}+S_{2})-h(S_{1})-h(S_{2})\Big{]}^{+}. (92)

The second function is the maximization of (92) with respect to V1,V1,V2,V2V_{1},V^{\prime}_{1},V_{2},V^{\prime}_{2}:

Gmax(P1,P2)\displaystyle G_{max}(P_{1},P_{2})\triangleq supV1,V1,V2,V2G(fV1,V1,fV2,V2)\displaystyle\sup_{V_{1},V^{\prime}_{1},V_{2},V^{\prime}_{2}}G\big{(}f_{V_{1},V^{\prime}_{1}},f_{V_{2},V^{\prime}_{2}}\big{)} (93)
s.t.E{(ViVi)2}Pi,h(Vi)h(Si),fori=1,2.\mbox{s.t.}\;\;E\Big{\{}(V_{i}-V^{\prime}_{i})^{2}\Big{\}}\leq P_{i},\quad h(V_{i})\leq h(S_{i}),\quad\mbox{for}\;\;i=1,2.

Finally, we define the upper convex envelope of Gmax(P1,P2)G_{max}(P_{1},P_{2}) with respect to P1P_{1} and P2P_{2}:

G¯max(P1,P2)u.c.e{Gmax(P1,P2)}.\displaystyle\overline{G}_{max}(P_{1},P_{2})\triangleq u.c.e\Big{\{}G_{max}(P_{1},P_{2})\Big{\}}. (94)

Clearly, if we keep only the rate-sum inequality in (6), we get an outer bound on the best known single-letter region:

RBSLsum(U1,U2)[I(U1,U2;Y)I(U1,U2;S1,S2)]+\displaystyle R_{BSL}^{sum}(U_{1},U_{2})\triangleq\Big{[}I(U_{1},U_{2};Y)-I(U_{1},U_{2};S_{1},S_{2})\Big{]}^{+} (95)
=[h(S1|U1)+h(S2|U2)h(Y|U1,U2)+h(Y)h(S1)h(S2)]+\displaystyle=\Big{[}h(S_{1}|U_{1})+h(S_{2}|U_{2})-h(Y|U_{1},U_{2})+h(Y)-h(S_{1})-h(S_{2})\Big{]}^{+} (96)
[h(S1|U1)+h(S2|U2)h(Y|U1,U2)+h(S1+S2)h(S1)h(S2)]++o(1)\displaystyle\leq\Big{[}h(S_{1}|U_{1})+h(S_{2}|U_{2})-h(Y|U_{1},U_{2})+h(S_{1}+S_{2})-h(S_{1})-h(S_{2})\Big{]}^{+}+o(1) (97)
=[EU1,U2{h(S1|U1=u1)+h(S2|U2=u2)h(Y|U1=u1,U2=u2)+h(S1+S2)h(S1)h(S2)}]++o(1)\displaystyle=\Bigg{[}E_{U_{1},U_{2}}\Big{\{}h(S_{1}|U_{1}=u_{1})+h(S_{2}|U_{2}=u_{2})-h(Y|U_{1}=u_{1},U_{2}=u_{2})+h(S_{1}+S_{2})-h(S_{1})-h(S_{2})\Big{\}}\Bigg{]}^{+}+o(1) (98)
EU1,U2{[h(S1|U1=u1)+h(S2|U2=u2)h(X1+S1+X2+S2+Z|U1=u1,U2=u2)\displaystyle\leq E_{U_{1},U_{2}}\Bigg{\{}\Big{[}h(S_{1}|U_{1}=u_{1})+h(S_{2}|U_{2}=u_{2})-h(X_{1}+S_{1}+X_{2}+S_{2}+Z|U_{1}=u_{1},U_{2}=u_{2})
+h(S1+S2)h(S1)h(S2)]+}+o(1)\displaystyle\qquad\qquad+h(S_{1}+S_{2})-h(S_{1})-h(S_{2})\Big{]}^{+}\Bigg{\}}+o(1) (99)
=EU1,U2{G(fS1,S1+X1|U1=u1,fS2,S2+X2|U2=u2)}+o(1)\displaystyle=E_{U_{1},U_{2}}\Bigg{\{}G\Big{(}f_{S_{1},S_{1}+X_{1}|U_{1}=u_{1}},f_{S_{2},S_{2}+X_{2}|U_{2}=u_{2}}\Big{)}\Bigg{\}}+o(1) (100)
EU1,U2{G¯max(P1|u1,P2|u2)}+o(1)\displaystyle\leq E_{U_{1},U_{2}}\Big{\{}\overline{G}_{max}\big{(}P_{1|u_{1}},P_{2|u_{2}}\big{)}\Big{\}}+o(1) (101)
G¯max(EU1P1|u1,EU2P2|u2)+o(1)\displaystyle\leq\overline{G}_{max}\big{(}E_{U_{1}}P_{1|u_{1}},E_{U_{2}}P_{2|u_{2}}\big{)}+o(1) (102)
G¯max(P1,P2)+o(1),\displaystyle\leq\overline{G}_{max}\big{(}P_{1},P_{2}\big{)}+o(1), (103)

where (97) follows since h(Y)h(S1+S2)+o(1)h(Y)\leq h(S_{1}+S_{2})+o(1), where o(1)0o(1)\rightarrow 0 as Q1,Q2Q_{1},Q_{2}\rightarrow\infty; (98) follows from the definition of conditional entropy; (99) follows since [Ex]+E{x+}[Ex]^{+}\leq E\{x^{+}\} and since Y=X1+S1+X2+S2+ZY=X_{1}+S_{1}+X_{2}+S_{2}+Z; (100) follows from the definition of the function G(fV1,V1,fV2,V2)G\big{(}f_{V_{1},V^{\prime}_{1}},f_{V_{2},V^{\prime}_{2}}\big{)} in (92); (101) follows from the definition of the function G¯max(P1,P2)\overline{G}_{max}(P_{1},P_{2}) in (94), since h(Si|Ui)h(Si)h(S_{i}|U_{i})\leq h(S_{i}), and from the definition

Pi|uiE{Xi2|Ui=ui},fori=1,2;\displaystyle P_{i|u_{i}}\triangleq E\Big{\{}X_{i}^{2}|U_{i}=u_{i}\Big{\}},\;for\;i=1,2;

(102) follows from Jensen's inequality, since G¯max(P1,P2)\overline{G}_{max}(P_{1},P_{2}) is a concave function; (103) follows from the input constraints, since

EXi2\displaystyle EX_{i}^{2} =EUiE{Xi2|Ui=ui}=EUiPi|uiPi,fori=1,2.\displaystyle=E_{U_{i}}E\big{\{}X_{i}^{2}|U_{i}=u_{i}\big{\}}=E_{U_{i}}P_{i|u_{i}}\leq P_{i},\;\mbox{for}\;\;i=1,2. (104)

The lemma follows since the upper bound (103) on the rate sum is now independent of U1U_{1} and U2U_{2}; hence it also bounds the single-letter region BSL(P1,P2)\mathcal{R}_{BSL}(P_{1},P_{2}). ∎
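To illustrate how the bound in (91) is evaluated, the sketch below computes the bracket at a single admissible jointly Gaussian test point. Since the supremum in (91) is over all admissible distributions, this is only a sample evaluation of the objective, not the bound itself; the values of P, Q, and the noise variance, the choice V′i = √(M/Q)·Vi, and the assumption Si ∼ N(0,Q) are all illustrative.

import numpy as np

def h_gauss(var):
    # differential entropy, in bits, of a N(0, var) variable
    return 0.5*np.log2(2*np.pi*np.e*var)

# Illustrative values (assumptions): P1 = P2 = P, Q1 = Q2 = Q large,
# var(Z) = sz2, and S1, S2 ~ N(0, Q) independent.
P, Q, sz2 = 1.0, 1e6, 1.0

# Admissible test point: V_i ~ N(0, Q), so h(V_i) = h(S_i), and
# V_i' = sqrt(M/Q) * V_i with M = (sqrt(Q) - sqrt(P))^2, so that
# E{(V_i - V_i')^2} = (sqrt(Q) - sqrt(M))^2 = P.
M = (np.sqrt(Q) - np.sqrt(P))**2
bracket = (2*h_gauss(Q)              # h(V1) + h(V2)
           - h_gauss(2*M + sz2)      # h(V1' + V2' + Z), independent Gaussians
           + h_gauss(2*Q)            # h(S1 + S2)
           - 2*h_gauss(Q))           # - h(S1) - h(S2)
print(max(bracket, 0.0))             # ~0.0014 bit for these values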

Acknowledgment

The authors wish to thank Ashish Khisti for earlier discussions on the binary case. The authors would also like to thank Uri Erez for helpful comments.

References

[1] M. Costa, “Writing on dirty paper,” IEEE Trans. Information Theory, vol. IT-29, pp. 439–441, May 1983.
[2] S. Gelfand and M. S. Pinsker, “Coding for channel with random parameters,” Problemy Pered. Inform. (Problems of Inform. Trans.), vol. 9, no. 1, pp. 19–31, 1980.
[3] T. M. Cover and J. A. Thomas, Elements of Information Theory. New York: Wiley, 1991.
[4] A. Somekh-Baruch, S. Shamai, and S. Verdu, “Cooperative encoding with asymmetric state information at the transmitters,” in Proceedings 44th Annual Allerton Conference on Communication, Control, and Computing, Univ. of Illinois, Urbana, IL, USA, Sep. 2006.
[5] S. Kotagiri and J. N. Laneman, “Multiple access channels with state information known at some encoders,” IEEE Trans. Information Theory, July 2006, submitted for publication.
[6] S. A. Jafar, “Capacity with causal and non-causal side information - a unified view,” IEEE Trans. Information Theory, vol. IT-52, pp. 5468–5475, Dec. 2006.
[7] K. Marton, “A coding theorem for the discrete memoryless broadcast channel,” IEEE Trans. Information Theory, vol. IT-25, pp. 306–311, May 1979.
[8] U. Erez, S. Shamai, and R. Zamir, “Capacity and lattice strategies for canceling known interference,” IEEE Trans. Information Theory, vol. IT-51, pp. 3820–3833, Nov. 2005.
[9] R. Zamir, S. Shamai, and U. Erez, “Nested linear/lattice codes for structured multiterminal binning,” IEEE Trans. Information Theory, vol. IT-48, pp. 1250–1276, June 2002.
[10] T. Philosof, A. Khisti, U. Erez, and R. Zamir, “Lattice strategies for the dirty multiple access channel,” in Proceedings of IEEE International Symposium on Information Theory, Nice, France, June 2007.
[11] J. Korner and K. Marton, “How to encode the modulo-two sum of binary sources,” IEEE Trans. Information Theory, vol. IT-25, pp. 219–221, March 1979.
[12] T. M. Cover and B. Gopinath, Open Problems in Communication and Computation. New York: Springer-Verlag, 1987.
[13] I. Csiszar and J. Korner, Information Theory: Coding Theorems for Discrete Memoryless Systems. New York: Academic Press, 1981.
[14] B. Nazer and M. Gastpar, “Computation over multiple-access channels,” IEEE Trans. Information Theory, vol. IT-53, pp. 3498–3516, Oct. 2007.
[15] D. Krithivasan and S. S. Pradhan, “Lattices for distributed source coding: Jointly Gaussian sources and reconstruction of a linear function,” arXiv:cs.IT/0707.3461v1.
[16] A. Khisti, private communication.
[17] R. G. Gallager, Information Theory and Reliable Communication. New York: Wiley, 1968.
[18] G. Cohen, I. Honkala, S. Litsyn, and A. Lobstein, Covering Codes. Amsterdam, The Netherlands: North Holland Publishing, 1997.
[19] R. Ahlswede and J. Korner, “Source coding with side information and a converse for degraded broadcast channels,” IEEE Trans. Information Theory, vol. IT-21, pp. 629–637, Nov. 1975.
[20] A. Wyner, “On source coding with side information at the decoder,” IEEE Trans. Information Theory, vol. IT-21, pp. 294–300, May 1975.
[21] T. Berger, “Multiterminal source coding,” in The Information Theory Approach to Communications, G. Longo, Ed. New York: Springer-Verlag, 1977.
[22] L. F. Wei and G. D. Forney, “Multidimensional constellations - part I: Introduction, figures of merit, and generalized cross constellations,” IEEE Journal on Selected Areas in Communications, vol. 7, pp. 877–892, Aug. 1989.
[23] A. Cohen and R. Zamir, “Entropy amplification property and the loss for writing on dirty paper,” IEEE Trans. Information Theory, to appear, April 2008.
[24] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge: Cambridge University Press, 2004.