
The Rate Loss of Single-Letter Characterization:
The “Dirty” Multiple Access Channel

Tal Philosof and Ram Zamir (This research was partially supported by BSF grant No. 2004398.)
Dept. of Electrical Engineering - Systems, Tel-Aviv University
Tel-Aviv 69978, ISRAEL
talp,zamir@eng.tau.ac.il
Submitted to IEEE Trans. on Information Theory March 2008
Abstract

For general memoryless systems, the typical information-theoretic solution, when it exists, has a “single-letter” form. This reflects the fact that optimum performance can be approached by a random code (or a random binning scheme) generated using independent and identically distributed copies of some single-letter distribution. Does the solution of every (information-theoretic) problem take this form? In fact, some counterexamples are known. The most famous is the “two help one” problem: Korner and Marton showed that if we want to decode the modulo-two sum of two binary sources from their independent encodings, then linear coding is better than random coding. In this paper we provide another counterexample, the “doubly-dirty” multiple access channel (MAC). Like the Korner-Marton problem, this is a multi-terminal scenario where side information is distributed among several terminals: each transmitter knows part of the channel interference, but the receiver is not aware of any part of it. We give an explicit solution for the capacity region of a binary version of the doubly-dirty MAC, demonstrate how the capacity region can be approached using a linear coding scheme, and prove that the “best known single-letter region” is strictly contained in it. We also state a conjecture regarding a similar rate loss of single-letter characterization in the Gaussian case.

Index Terms:
Multi-user information theory, random binning, linear lattice binning, dirty paper coding, lattice strategies, Korner-Marton problem.

I Introduction

Consider the two-user / double-state memoryless multiple access channel (MAC) with transition and state probability distributions

$$P(y|x_1,x_2,s_1,s_2) \quad \text{and} \quad P(s_1,s_2), \tag{1}$$

respectively, where the states $S_1$ and $S_2$ are known non-causally to user 1 and user 2, respectively. A special case of (1) is the additive channel shown in Fig. 1. In this channel, called the doubly-dirty MAC (after Costa’s “writing on dirty paper” [1]), the total channel noise consists of three independent components: $S_1$ and $S_2$, the interference signals, which are known to user 1 and user 2, respectively, and $Z$, the unknown noise, which is known to neither. The channel inputs $X_1$ and $X_2$ may be subject to some average cost constraint.

Neither the capacity region of (1) nor that of the special case of Fig. 1 is known. In this paper we consider a particular binary version of the doubly-dirty MAC of Fig. 1, where all variables are in $\mathbb{Z}_2$, i.e., $\{0,1\}$, and the unknown noise $Z = 0$. The channel output of the binary doubly-dirty MAC is given by

$$Y = X_1 \oplus X_2 \oplus S_1 \oplus S_2, \tag{2}$$

where $\oplus$ denotes mod-2 addition (xor), and $S_1, S_2$ are independent $\text{Bernoulli}(1/2)$. Each codeword $\mathbf{x}_i \in \mathbb{Z}_2^n$ is a function of the message $W_i$ and the interference vector $\mathbf{s}_i \in \mathbb{Z}_2^n$, and must satisfy the input constraint $\frac{1}{n} w_H(\mathbf{x}_i) \le q_i$, $i = 1,2$, where $0 \le q_1, q_2 \le 1/2$ and $w_H(\cdot)$ denotes the Hamming weight. The coding rates $R_1$ and $R_2$ of the two users are given as usual by $R_i = \frac{1}{n}\log|\mathcal{W}_i|$, where $\mathcal{W}_i$ is the message set of user $i$ and $n$ is the codeword length.
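To make the setup concrete, here is a minimal Python simulation sketch of the channel (2) and the Hamming-weight input constraint. The block length, the constraint value, and the (trivial) choice of inputs are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, q = 1000, 0.3                    # illustrative block length and input constraint

s1 = rng.integers(0, 2, n)          # interference known non-causally to user 1 only
s2 = rng.integers(0, 2, n)          # interference known non-causally to user 2 only

x1 = np.zeros(n, dtype=int)         # placeholder inputs; any code must satisfy
x2 = np.zeros(n, dtype=int)         # the weight constraint w_H(x_i)/n <= q_i
assert x1.sum() / n <= q and x2.sum() / n <= q

y = x1 ^ x2 ^ s1 ^ s2               # channel output Y = X1 + X2 + S1 + S2 (mod 2)
print(y[:10])                       # without coding, Y is pure Bernoulli(1/2) noise
```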

Figure 1: Doubly-dirty MAC

The double-state MAC (1) generalizes the point-to-point channel with side information (SI) at the transmitter considered by Gel’fand and Pinsker [2]. They prove their direct coding theorem using the framework of random binning, which is widely used in the analysis of multi-terminal source and channel coding problems [3]. They obtain a general capacity expression which involves an auxiliary random variable $U$:

$$C = \max_{P(u,x|s)} \big\{ H(U|S) - H(U|Y) \big\}, \tag{3}$$

where the maximization is over all joint distributions of the form $p(u,s,y,x) = p(s)\,p(u,x|s)\,p(y|x,s)$.

The channel in (1) with only one informed encoder (i.e., where $S_2 = \emptyset$) was considered recently by Somekh-Baruch et al. [4] and Kotagiri and Laneman [5]. The common-message ($W_1 = W_2$) capacity of this channel is known [4], and it involves using random binning by the informed user. For the binary “one dirty user” case (i.e., (2) with $S_2 = 0$), we show that Somekh-Baruch’s common-message capacity becomes (see Appendix A)

$$C_{com} = H_b(q_1), \tag{4}$$

where $H_b(x) \triangleq -x\log_2(x) - (1-x)\log_2(1-x)$ is the binary entropy function. Clearly, the doubly-dirty individual-message case is harder. Thus, it follows from (4) that the rate sum in the setting of Fig. 1 is upper bounded by

$$R_1 + R_2 \le \min\big\{H_b(q_1), H_b(q_2)\big\}. \tag{5}$$

In Theorem 1 we show that this upper bound is in fact tight.

One approach to finding achievable rates for the doubly-dirty MAC is to extend the Gel’fand-Pinsker solution [2] to the two-user / double-state case. As shown by Jafar [6], this extension leads to the following pentagonal inner bound for the capacity region of (1):

$$\begin{aligned}
\mathcal{R}(U_1,U_2) \triangleq \Big\{(R_1,R_2):\;
& R_1 \le I(U_1;Y|U_2) - I(U_1;S_1), \\
& R_2 \le I(U_2;Y|U_1) - I(U_2;S_2), \\
& R_1 + R_2 \le I(U_1,U_2;Y) - I(U_1;S_1) - I(U_2;S_2) \Big\}
\end{aligned} \tag{6}$$

for some $P(U_1,U_2,X_1,X_2|S_1,S_2) = P(U_1,X_1|S_1)\,P(U_2,X_2|S_2)$. In fact, by a standard time-sharing argument [3], the closure of the convex hull of the set of all rate pairs $(R_1,R_2)$ satisfying (6),

$$\mathcal{R}_{BSL} \triangleq \mathrm{cl}\;\mathrm{conv}\,\Big\{(R_1,R_2) \in \mathcal{R}(U_1,U_2) : P(U_1,U_2,X_1,X_2|S_1,S_2) = P(U_1,X_1|S_1)\,P(U_2,X_2|S_2)\Big\}, \tag{7}$$

is also achievable. (As in the Gel’fand-Pinsker solution, for a finite-alphabet system it is enough to optimize over auxiliary variables $U_1$ and $U_2$ whose alphabet size is bounded in terms of the sizes of the input and state alphabets.) To the best of our knowledge, the set $\mathcal{R}_{BSL}$ is the best currently known single-letter characterization of the rate region of the MAC with side information at the transmitters (1), and in particular of the doubly-dirty MAC (2). (For the case where the users also have a common message $W_0$ to be transmitted jointly by both encoders, (7) can be improved by adding another auxiliary random variable $U_0$, which plays the role of the common auxiliary r.v. in Marton’s inner bound for the non-degraded broadcast channel [7]. In this case, the joint distribution of $(U_0,U_1,U_2)$ is given by $P(U_0,U_1,U_2) = P(U_0)\,P(U_1|U_0)\,P(U_2|U_0)$, i.e., $U_1$ and $U_2$ are conditionally independent given $U_0$.) The achievability of (7) can be proved, as usual, by an i.i.d. random binning scheme [6].

A different method to cancel known interference is by “linear strategies”, i.e., binning based on the cosets of a linear code [8, 9, 10]. In the sequel, we show that the outer bound (5) can indeed be achieved by a linear coding scheme. Hence, the set of rate pairs $(R_1,R_2)$ satisfying (5) is the capacity region of the binary doubly-dirty MAC. In contrast, we show that the single-letter region (7) is strictly contained in this capacity region. Hence, a random binning scheme based on this extension of the Gel’fand-Pinsker solution [2] is not optimal for this problem.

A similar observation was made by Korner and Marton [11] for the “two help one” source coding problem. For a specific binary version known as the “modulo-two sum” problem, they showed that the minimum possible rate sum is achieved by a linear coding scheme, while the best known single-letter expression for this problem is strictly higher. See the discussion in [11, Section IV] and at the end of Section III.

Although the “single-letter characterization” is a fundamental concept in information theory, it has not been generally defined [12, p. 35]. Csiszar and Korner [13, p. 259] suggested defining it through the notion of computability, i.e., a problem has a single-letter solution if there exists an algorithm which can decide whether a point belongs to an $\varepsilon$-neighborhood of the achievable rate region with polynomial complexity in $1/\varepsilon$. Since we are not aware of any other computable solution to our problem, we shall refer to (7) as the “best known single-letter characterization”.

An extension of these observations to continuous channels would be of interest. Costa [1] considered the single-user case of the dirty channel problem $Y = X + S + Z$, where the interference $S$ and the noise $Z$ are assumed to be i.i.d. Gaussian with variances $Q$ and $N$, respectively, and the input $X$ is subject to a power constraint $P$. He showed that in this case, the transmitter side-information capacity (3) coincides with the zero-interference capacity $\frac{1}{2}\log_2(1+\mathrm{SNR})$, where $\mathrm{SNR} = P/N$. Selecting the auxiliary random variable $U$ in (3) such that

$$U = X + \alpha S, \tag{8}$$

where $X$ and $S$ are independent, and taking $\alpha = \frac{P}{P+N}$, the formula (3) and its associated random binning scheme are capacity achieving. The continuous (Gaussian) version of the doubly-dirty MAC of Fig. 1 was considered in [10]. It was shown that by using a linear structure, i.e., lattice strategies [8], the full capacity region is achieved in the limit of high SNR and high lattice dimension. In contrast, it was shown that for $Q \to \infty$ no positive rate is achievable using the natural generalization of Costa’s strategy (8) to the two-user case, while a (scalar) modulo-addition version of (8) loses $\approx 0.254$ bit in the sum capacity. We shall further elaborate on this issue in Section IV.

Similar observations regarding the advantage of modulo-lattice modulation over a separation-based solution were made by Nazer and Gastpar [14], in the context of computation over linear Gaussian networks, and also by Krithivasan and Pradhan [15] for multi-terminal rate-distortion problems.

The paper is organized as follows. In Section II the capacity region of the binary doubly-dirty MAC (2) is derived, and linear coding is shown to be optimal. Section III develops a closed-form expression for the best known single-letter characterization (7) for this channel, and demonstrates that it is strictly contained in the true capacity region. In Section IV we consider the Gaussian doubly-dirty MAC, and state a conjecture regarding the capacity loss of single-letter characterization in this case.

II The Capacity Region of the Binary Doubly-Dirty MAC

The following theorem characterizes the capacity region of the binary doubly-dirty MAC of Fig. 1.

Theorem 1.

The capacity region of the binary doubly-dirty MAC (2) is the set of all rate pairs $(R_1,R_2)$ satisfying

$$\mathcal{C}(q_1,q_2) \triangleq \Big\{(R_1,R_2) : R_1 + R_2 \le \min\big\{H_b(q_1), H_b(q_2)\big\}\Big\}. \tag{9}$$
Proof.

The converse part: As explained in the Introduction (5), one way to derive an upper bound on the rate sum is through the general one-dirty-user capacity formula [4], which we derive explicitly for the binary case in Appendix A. Here we prove the converse directly, similarly to the proof of the outer bound for the Gaussian case in [16, 10]. We assume that user 1 and user 2 intend to transmit a common message $W$. An upper bound on the rate of this message clearly upper bounds the sum rate $R_1 + R_2$ in the individual-message case. Thus,

$$\begin{aligned}
n(R_1+R_2) &\le H(W) \\
&= H(W|Y^n) + I(W;Y^n) \\
&\le I(W;Y^n) + n\epsilon_n && (10) \\
&= H(Y^n) - H(Y^n|W) + n\epsilon_n \\
&= H(Y^n) - H(Y^n|W,S_1^n,S_2^n) - I(S_1^n,S_2^n;Y^n|W) + n\epsilon_n \\
&= H(Y^n) - I(S_1^n,S_2^n;Y^n|W) + n\epsilon_n && (11) \\
&= H(Y^n) - H(S_1^n,S_2^n|W) + H(S_1^n,S_2^n|W,Y^n) + n\epsilon_n \\
&\le -n + H(S_1^n|W,Y^n) + H(S_2^n|W,Y^n,S_1^n) + n\epsilon_n && (12) \\
&\le H(X_1^n \oplus X_2^n \oplus S_1^n|W,Y^n,S_1^n) + n\epsilon_n && (13) \\
&= H(X_2^n|W,Y^n,S_1^n) + n\epsilon_n && (14) \\
&\le n H_b(q_2) + n\epsilon_n, && (15)
\end{aligned}$$

where (10) follows from Fano’s inequality, with $\epsilon_n \to 0$ as the error probability $P_e^{(n)}$ goes to zero for $n \to \infty$; (11) follows since $Y^n$ is fully known given $W$, $S_1^n$ and $S_2^n$; (12) follows from the chain rule for entropy, from $H(Y^n) \le n$, and from $H(S_1^n,S_2^n|W) = H(S_1^n) + H(S_2^n) = 2n$, since $W$, $S_1^n$ and $S_2^n$ are mutually independent; (13) follows since $H(S_1^n|W,Y^n) \le n$ and $Y^n = X_1^n \oplus X_2^n \oplus S_1^n \oplus S_2^n$; (14) follows since $X_1^n$ is a function of $(W,S_1^n)$; finally, (15) follows since $H(X_2^n|W,Y^n,S_1^n) \le H(X_2^n) \le n H_b(q_2)$.

In the same way we can show that $R_1 + R_2 \le H_b(q_1) + \epsilon_n$. The converse part follows since $\epsilon_n \to 0$ as $P_e^{(n)} \to 0$ for $n \to \infty$.

The direct part is based on the scheme for the point-to-point binary dirty-paper channel [9]. We define $q \triangleq \min\{q_1,q_2\}$. In view of the converse part, it is sufficient to show achievability of the point $(R_1,R_2) = (H_b(q),0)$, since the outer bound may then be achieved by time sharing with the symmetric point $(R_1,R_2) = (0,H_b(q))$. The corner point $(R_1,R_2) = (H_b(q),0)$ corresponds to the “helper problem”, i.e., user 2 tries to help user 1 to transmit at its highest rate. The encoders and decoder are described using a binary linear code $\mathcal{C}(n,k)$ with parity-check matrix $H$. Let $\mathbf{v} \in \mathbb{Z}_2^{n-k}$ be a syndrome of the code $\mathcal{C}$, where we note that each syndrome represents a different coset of the linear code $\mathcal{C}$. Let $f(\mathbf{v})$ denote the “leader” of (i.e., the minimum-weight vector in) the coset associated with the syndrome $\mathbf{v}$ [17, Chap. 6]; hence $f : \{0,1\}^{n-k} \to \{0,1\}^n$. For $\mathbf{a} \in \mathbb{Z}_2^n$, we define the $n$-dimensional modulo operation over the code $\mathcal{C}$ as

$$\mathbf{a} \bmod \mathcal{C} \triangleq f(H\mathbf{a}),$$

which is the leader of the coset to which the vector $\mathbf{a}$ belongs.

  • Encoder of user 1: Let the transmitted message $\mathbf{v}_1 \in \mathbb{Z}_2^{n-k}$ be a syndrome of $\mathcal{C}$, and let $\tilde{\mathbf{x}}_1 = f(\mathbf{v}_1)$ be its coset leader; in particular, $\mathbf{v}_1 = H\tilde{\mathbf{x}}_1$. Transmit the modulo-$\mathcal{C}$ reduction of the difference between $\tilde{\mathbf{x}}_1$ and $\mathbf{s}_1$, i.e.,

    $$\mathbf{x}_1 = (\tilde{\mathbf{x}}_1 \oplus \mathbf{s}_1) \bmod \mathcal{C} = f(\mathbf{v}_1 \oplus H\mathbf{s}_1).$$

  • Encoder of user 2 (functions as a “helper” for user 1): Transmit

    $$\mathbf{x}_2 = \mathbf{s}_2 \bmod \mathcal{C} = f(H\mathbf{s}_2).$$

  • Decoder:
    1. Reconstruct $\tilde{\mathbf{x}}_1$ by $\hat{\tilde{\mathbf{x}}}_1 = \mathbf{y} \bmod \mathcal{C}$.
    2. Reconstruct the transmitted coset of user 1 by $\hat{\mathbf{v}}_1 = H\hat{\tilde{\mathbf{x}}}_1$.
    In fact, the transmitted coset can be reconstructed directly as $\hat{\mathbf{v}}_1 = H\hat{\tilde{\mathbf{x}}}_1 = H(\mathbf{y} \bmod \mathcal{C}) = H\mathbf{y}$, where the last equality follows since $\mathbf{y} \bmod \mathcal{C}$ and $\mathbf{y}$ are in the same coset.

It follows that the decoder correctly decodes the message coset $\mathbf{v}_1$, since

$$\begin{aligned}
\hat{\mathbf{v}}_1 &= H\big(\mathbf{y} \bmod \mathcal{C}\big) \\
&= H\big([\tilde{\mathbf{x}}_1 \oplus \mathbf{s}_1 \oplus \mathbf{s}_2 \oplus \mathbf{s}_1 \oplus \mathbf{s}_2] \bmod \mathcal{C}\big) \\
&= H\tilde{\mathbf{x}}_1 \\
&= \mathbf{v}_1,
\end{aligned}$$

where the third equality follows since $\tilde{\mathbf{x}}_1$ and $\tilde{\mathbf{x}}_1 \bmod \mathcal{C}$ are in the same coset. It is left to relate the coding rate $R_1 = \frac{1}{n}\log\big(|\{0,1\}^{n-k}|\big) = 1 - k/n$ to the input constraint $q$. From [18], there exists a binary linear code with covering radius $\rho$ that satisfies $\frac{k}{n} \le 1 - H_b(\rho/n) + \epsilon$, where $\epsilon \to 0$ as $n \to \infty$. The achievability of the point $(H_b(q),0)$ follows by taking $q = \rho/n$, so that $R_1 = 1 - k/n \ge H_b(q) - \epsilon$, while $w_H(\mathbf{x}_1) = w_H(f(\mathbf{v}_1 \oplus H\mathbf{s}_1)) \le \rho$ and $w_H(\mathbf{x}_2) = w_H(f(H\mathbf{s}_2)) \le \rho$; hence

$$\frac{1}{n} E\,w_H\{\mathbf{x}_1\} = \frac{1}{n} E\,w_H\{f(\mathbf{v}_1 \oplus H\mathbf{s}_1)\} \le q, \qquad \frac{1}{n} E\,w_H\{\mathbf{x}_2\} = \frac{1}{n} E\,w_H\{f(H\mathbf{s}_2)\} \le q.$$

This completes the proof of the direct part of the theorem. ∎
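To illustrate the scheme concretely, the following Python sketch runs the encoders and decoder above with the [7,4] Hamming code standing in for $\mathcal{C}(n,k)$. This small code is an illustrative assumption; the proof requires a sequence of codes with good covering radius as $n \to \infty$. Coset leaders $f(\cdot)$ are found here by brute force.

```python
import itertools
import numpy as np

# Parity-check matrix of the [7,4] Hamming code, an illustrative stand-in
# for the code C(n,k) with good covering radius used in the proof.
H = np.array([[1, 0, 1, 0, 1, 0, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]])
n = H.shape[1]

def syndrome(a):
    return tuple(H.dot(a) % 2)

# f(v): the coset leader (minimum-weight vector) of each syndrome, by brute force.
f = {}
for bits in itertools.product([0, 1], repeat=n):
    a = np.array(bits)
    v = syndrome(a)
    if v not in f or a.sum() < f[v].sum():
        f[v] = a

def mod_C(a):                                 # a mod C = f(H a)
    return f[syndrome(a)]

rng = np.random.default_rng(1)
s1, s2 = rng.integers(0, 2, n), rng.integers(0, 2, n)
v1 = (1, 0, 1)                                # message of user 1 (a syndrome)

x1 = mod_C((f[v1] + s1) % 2)                  # user 1: (x~1 + s1) mod C = f(v1 + H s1)
x2 = mod_C(s2)                                # user 2 (helper): s2 mod C = f(H s2)
y = (x1 + x2 + s1 + s2) % 2                   # channel output
assert syndrome(y) == v1                      # decoder: v^1 = H y recovers v1
```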


As stated above, the achievability of the capacity region follows by time sharing between the corner points $(H_b(q),0)$ and $(0,H_b(q))$, where $q = \min\{q_1,q_2\}$. It is also interesting to see how to achieve the rate sum $H_b(q)$ for an arbitrary rate pair $(R_1,R_2)$ without time sharing. For that, let the message of user 1 be $\mathbf{m}_1 \in \mathbb{Z}_2^{l_1}$ and the message of user 2 be $\mathbf{m}_2 \in \mathbb{Z}_2^{l_2}$, where $l_1 + l_2 = n-k$. We define the following syndromes in $\mathcal{C}$:

$$\mathbf{v}_1 \triangleq [\mathbf{m}_1\ \underbrace{0\,0\,\ldots\,0}_{l_2}] \in \mathbb{Z}_2^{n-k}, \qquad \mathbf{v}_2 \triangleq [\underbrace{0\,0\,\ldots\,0}_{l_1}\ \mathbf{m}_2] \in \mathbb{Z}_2^{n-k}, \qquad \mathbf{v} \triangleq \mathbf{v}_1 \oplus \mathbf{v}_2.$$

Clearly, given the syndrome $\mathbf{v}$, the syndromes $\mathbf{v}_1$ and $\mathbf{v}_2$ are fully known, and hence so are the messages $\mathbf{m}_1$ and $\mathbf{m}_2$. Let $\tilde{\mathbf{x}}_i = f(\mathbf{v}_i)$ be the coset leader of $\mathbf{v}_i$ for $i=1,2$. In this case the transmission scheme is as follows:

  • Encoder of user 1: transmit $\mathbf{x}_1 = (\tilde{\mathbf{x}}_1 \oplus \mathbf{s}_1) \bmod \mathcal{C} = f(\mathbf{v}_1 \oplus H\mathbf{s}_1)$.

  • Encoder of user 2: transmit $\mathbf{x}_2 = (\tilde{\mathbf{x}}_2 \oplus \mathbf{s}_2) \bmod \mathcal{C} = f(\mathbf{v}_2 \oplus H\mathbf{s}_2)$.

  • Decoder: reconstruct $\hat{\mathbf{v}} = H(\mathbf{y} \bmod \mathcal{C})$.

Therefore, we have that

$$\hat{\mathbf{v}} = H\big(\mathbf{y} \bmod \mathcal{C}\big) = H\big(\tilde{\mathbf{x}}_1 \oplus \tilde{\mathbf{x}}_2\big) = \mathbf{v}_1 \oplus \mathbf{v}_2 = \mathbf{v}.$$

The sum capacity is achieved since $R_1 + R_2 = \frac{l_1+l_2}{n} = \frac{n-k}{n} \ge H_b(q) - \epsilon$, where $\epsilon \to 0$ as $n \to \infty$, while the input constraints are satisfied as before.

III A Single-Letter Characterization for the Capacity Region

In this section we characterize the best known single-letter region (7) for the binary doubly-dirty MAC (2), and show that it is strictly contained in the capacity region (9). For simplicity, we shall assume identical input constraints, i.e., $q_1 = q_2 = q$.

Definition 1.

For a given $q$, the best known single-letter rate region for the binary doubly-dirty MAC (2), denoted by $\mathcal{R}_{BSL}(q)$, is the set of all rate pairs $(R_1,R_2)$ satisfying (7) with the additional constraints $EX_1, EX_2 \le q$.

In the following theorem we give a closed-form expression for $\mathcal{R}_{BSL}(q)$.

Theorem 2.

The best known single-letter rate region for the binary doubly-dirty MAC (2) is a triangular region given by

$$\mathcal{R}_{BSL}(q) = \Big\{(R_1,R_2) : R_1 + R_2 \le \mathrm{u.c.e.}\big[2H_b(q) - 1\big]^+\Big\}, \tag{16}$$

where $\mathrm{u.c.e.}$ denotes the upper convex envelope with respect to $q$, and $[x]^+ \triangleq \max\{0,x\}$.

Fig. 2 shows the sum capacity of the binary doubly-dirty MAC (9) versus the best known single-letter rate sum (16) for equal input constraints. The latter is strictly contained in the capacity region, which is achieved by a linear code. The quantity $[2H_b(q)-1]^+$ is not a convex-$\cap$ function of $q$. Its upper convex envelope is achieved by time sharing between the points $q = 0$ and $q = q^* \triangleq 1 - 1/\sqrt{2}$; therefore it is given by

$$R_1 + R_2 \le \begin{cases} 2H_b(q) - 1, & q^* \le q \le 1/2 \\ C^* q, & 0 \le q \le q^* \end{cases} \tag{19}$$

where $C^* \triangleq \frac{2H_b(q^*)-1}{q^*}$.
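The numbers entering (19) are easy to evaluate; the following Python sketch (illustrative) computes $q^*$, $C^*$, and the resulting single-letter sum rate, alongside the sum capacity $H_b(q)$ of Theorem 1 for comparison.

```python
import math

def Hb(x):                                   # binary entropy function
    return 0.0 if x in (0.0, 1.0) else -x*math.log2(x) - (1-x)*math.log2(1-x)

q_star = 1 - 1/math.sqrt(2)                  # ≈ 0.2929, the time-sharing point
C_star = (2*Hb(q_star) - 1) / q_star         # slope of the time-sharing line in (19)

def bsl_sum_rate(q):                         # the upper convex envelope in (19)
    return C_star * q if q <= q_star else 2*Hb(q) - 1

for q in (0.1, q_star, 0.4):
    print(q, bsl_sum_rate(q), Hb(q))         # single-letter rate sum vs. capacity Hb(q)
```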

Proof.

The direct part is shown by choosing in (6) $U_1 = S_1 \oplus X_1$ and $U_2 = S_2 \oplus X_2$, where $X_1, X_2 \sim \text{Bernoulli}(q)$ and $X_1, X_2, S_1, S_2$ are independent. From (6), the achievable rate sum is given by

$$\begin{aligned}
R_1 + R_2 &= I(U_1,U_2;Y) - I(U_1,U_2;S_1,S_2) \\
&= H(U_1|S_1) + H(U_2|S_2) - H(U_1,U_2|U_1 \oplus U_2) && (20) \\
&= H(U_1|S_1) + H(U_2|S_2) - H(U_1|U_1 \oplus U_2) - H(U_2|U_1 \oplus U_2, U_1) && (21) \\
&= H(X_1) + H(X_2) - H(U_1|U_1 \oplus U_2) && (22) \\
&= 2H_b(q) - 1, && (23)
\end{aligned}$$

where (20) follows since $Y = U_1 \oplus U_2$; (21) follows from the chain rule for entropy; (22) follows since $H(U_i|S_i) = H(X_i)$ and since $U_2$ is fully known given $(U_1 \oplus U_2, U_1)$, thus $H(U_2|U_1 \oplus U_2, U_1) = 0$; (23) follows since $H(X_i) = H_b(q)$ and since $U_1, U_2$ are independent with $P(U_i = 1) = 1/2$, thus $H(U_1|U_1 \oplus U_2) = H(U_1) = 1$.

The converse part of the proof is given in Appendix B. ∎

Figure 2: The rate sum of the binary doubly-dirty MAC vs. the best known single-letter rate sum, with input constraints $EX_1, EX_2 \le q$.

We see that the binary doubly-dirty MAC is a memoryless channel coding problem, where the capacity region is achievable by a linear code, while the best known single-letter rate region is strictly contained in the capacity region. This may be explained by the fact that each user has only partial side information, and distributed random binning is unable to capture the linear structure of the channel.

In order to understand the limitation of random binning versus a linear code, we consider these two schemes for $q$ high enough that $2H_b(q) - 1 \ge 0$. The random binning scheme uses $U_i = X_i \oplus S_i$, where $X_i \sim \text{Bernoulli}(q)$ and $S_i \sim \text{Bernoulli}(1/2)$ are independent; therefore $Y = U_1 \oplus U_2$, where $U_i \sim \text{Bernoulli}(1/2)$ for $i=1,2$. Each transmitter maps the message (bin) $W_i$ into a codeword $\mathbf{u}_i$ which is, with high probability, at a Hamming distance of $nq$ from $\mathbf{s}_i$. Therefore, given the vectors $(\mathbf{s}_1, \mathbf{s}_2)$, the available input space is of size approximately $2^{nH(U_1,U_2|S_1,S_2)} = 2^{nH(X_1,X_2)} = 2^{2nH_b(q)}$. Given the received vector $\mathbf{y}$, the residual ambiguity is $2^{nH(U_1,U_2|Y)} = 2^{n[H(U_1|Y) + H(U_2|Y,U_1)]} = 2^n$, since $H(U_1|Y) = 1$ and $H(U_2|Y,U_1) = 0$. As a result, the achievable rate sum is given by

$$R_1 + R_2 = \frac{1}{n}\log_2\bigg(\frac{|\text{input space}|}{|\text{residual ambiguity space}|}\bigg) \approx 2H_b(q) - 1.$$

The linear coding scheme of Theorem 1 has the same input space size as the random binning scheme, i.e., $2^{2nH_b(q)}$, since each user has $2^{nH_b(q)}$ cosets. However, given the received vector $\mathbf{y}$, there are only $2^{nH_b(q)}$ possible pairs of cosets, i.e., the residual ambiguity is only $2^{nH_b(q)}$. Therefore, the linear code achieves a rate sum of $R_1 + R_2 \approx 2H_b(q) - H_b(q) = H_b(q)$. The advantage of the linear coding scheme results from the “ordered structure” of the linear code, which decreases the residual ambiguity from 1 bit in random coding to $H_b(q)$.

The following example illustrates the above arguments for the case where user 2 is a “helper” for user 1, i.e., $R_2 = 0$, and user 1 transmits at its highest rate for each technique (random binning or linear coding). Table I summarizes the rates and codebook sizes of each user for $q = 0.3$, for which $H_b(q) \approx 0.88$ bit.

| | Random binning | Linear code |
| --- | --- | --- |
| Rate sum | $2H_b(q) - 1 = 0.76$ bit | $H_b(q) = 0.88$ bit |
| Codewords per bin/coset | $2^{nI(U_i;S_i)} = 2^{n[1-H_b(q)]} = 2^{0.12n}$ | $2^{n[1-H_b(q)]} = 2^{0.12n}$ |
| Helper (user 2) codebook size | $2^{nI(U_2;S_2)} = 2^{n[1-H_b(q)]} = 2^{0.12n}$ | $2^{n[1-H_b(q)]} = 2^{0.12n}$ |
| User 1 codebook size | $2^{0.76n} \cdot 2^{0.12n} = 2^{0.88n}$ | $2^{0.12n} \cdot 2^{0.88n} = 2^n$ |
| Number of possible codeword pairs | $2^{0.88n} \cdot 2^{0.12n} = 2^n$ | $2^n \cdot 2^{0.12n} = 2^{1.12n}$ |

TABLE I: Codebook sizes of the random binning and linear coding schemes for the helper problem with $q = 0.3$.
Figure 3: The Korner-Marton configuration.

Korner and Marton [11] observed a similar behavior for the “two help one” source coding problem shown in Fig. 3. In this problem there are three binary sources $X, Y, Z$, where $Z = X \oplus Y$, and the joint distribution of $X$ and $Y$ is symmetric with $P(X \ne Y) = \theta$. The goal is to encode the sources $X$ and $Y$ separately such that $Z$ can be reconstructed losslessly. Korner and Marton showed that the required rate sum is at least

$$R_x + R_y \ge 2H(Z), \tag{24}$$

and furthermore, this rate sum can be achieved by a linear code: each encoder transmits the syndrome of the observed source relative to a good linear binary code for a BSC with crossover probability θ\theta.
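A minimal Python sketch of this scheme, again with the [7,4] Hamming code as an illustrative stand-in for a “good” code (it corrects exactly the weight-$\le 1$ error patterns, playing the role of the typical $Z$ patterns for small $\theta$): each encoder transmits only the syndrome of its observed source, and the decoder recovers $Z$ from the XOR of the two syndromes.

```python
import numpy as np

# [7,4] Hamming code: its columns, read as binary numbers, locate a single error,
# so Z is recovered exactly whenever w_H(Z) <= 1 (stand-in for a good BSC(theta) code).
H = np.array([[1, 0, 1, 0, 1, 0, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]])
n = H.shape[1]
col_index = {tuple(H[:, j]): j for j in range(n)}

def leader(s):                      # coset leader of syndrome s, of weight <= 1
    z = np.zeros(n, dtype=int)
    if any(s):
        z[col_index[tuple(s)]] = 1
    return z

rng = np.random.default_rng(2)
x = rng.integers(0, 2, n)
z = np.zeros(n, dtype=int); z[3] = 1          # Z = X ^ Y with a single disagreement
y = x ^ z

sx, sy = H.dot(x) % 2, H.dot(y) % 2           # each encoder sends n-k bits: its syndrome
z_hat = leader((sx + sy) % 2)                 # H x ^ H y = H(x ^ y) = H z
assert (z_hat == z).all()
```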

In contrast, the “one help one” problem [19, 20] has a closed single-letter expression for its rate region, which corresponds to a random binning coding scheme. Korner and Marton [11] generalized the expression of [19, 20] to the “two help one” problem, and showed that the minimal rate sum required by this expression is given by

$$R_x + R_y \ge H(X,Y). \tag{25}$$

The region (25) corresponds to Slepian-Wolf encoding of $X$ and $Y$, and it can also be derived from the Berger-Tung achievable region [21] for distributed coding of $X$ and $Y$ with a single reconstruction $\hat{Z}$ under the distortion measure $d(X,Y,\hat{Z}) \triangleq X \oplus Y \oplus \hat{Z}$. Clearly, the region (25) is strictly contained in the Korner-Marton region $R_x + R_y \ge 2H(Z)$ (24), since $H(X,Y) = 1 + H(Z) > 2H(Z)$ for $Z \sim \text{Bernoulli}(\theta)$ with $\theta \ne \frac{1}{2}$. For further background on related source coding problems, see [15].

IV The Gaussian Doubly-Dirty MAC

In this section we introduce our conjecture regarding the rate loss of the best known single-letter characterization for the capacity region of the two-user Gaussian doubly-dirty MAC at high SNR. The Gaussian doubly-dirty MAC [10] is given by

$$Y = X_1 + X_2 + S_1 + S_2 + Z, \tag{26}$$

where $Z \sim \mathcal{N}(0,N)$ is independent of $X_1, X_2, S_1, S_2$, and where user 1 and user 2 must satisfy the power constraints $\frac{1}{n}\sum_{i=1}^n X_{1_i}^2 \le P_1$ and $\frac{1}{n}\sum_{i=1}^n X_{2_i}^2 \le P_2$; see Fig. 1. The interference signals $S_1$ and $S_2$ are known non-causally to the transmitters of user 1 and user 2, respectively. We shall assume that $S_1$ and $S_2$ are independent Gaussian with variances going to infinity, i.e., $S_i \sim \mathcal{N}(0,Q_i)$ with $Q_i \to \infty$ for $i=1,2$. The signal-to-noise ratios of the two users are $\mathrm{SNR}_1 = P_1/N$ and $\mathrm{SNR}_2 = P_2/N$.

The capacity region at high SNR, i.e., $\mathrm{SNR}_1, \mathrm{SNR}_2 \gg 1$, is given by [10]

$$R_1 + R_2 \le \frac{1}{2}\log_2\bigg(\frac{\min\{P_1,P_2\}}{N}\bigg), \tag{27}$$

and it is achievable by a modulo-lattice coding scheme with dimension going to infinity. In contrast, it was shown in [10] that at high SNR and with strong independent Gaussian interferences, the natural generalization of Costa’s strategy (8) to the two-user case, i.e., with auxiliary random variables $U_1 = X_1 + S_1$ and $U_2 = X_2 + S_2$, is not able to achieve any positive rate. A better choice of $U_1$ and $U_2$, suggested in [10], is a modulo version of Costa’s strategy (8),

$$U_i^* = [X_i + S_i] \bmod \Delta_i, \tag{28}$$

where $\Delta_i = \sqrt{12 P_i}$, and where $X_i \sim \text{Unif}\big([-\frac{\Delta_i}{2}, \frac{\Delta_i}{2})\big)$ is independent of $S_i$, for $i=1,2$. In this case the rate loss with respect to (27) is $\frac{1}{2}\log_2\big(\frac{\pi e}{6}\big) \approx 0.254$ bit.

The best known single-letter capacity region for the Gaussian doubly-dirty MAC (26) is defined as the set of all rate pairs $(R_1,R_2)$ satisfying (7), where $X_1$ and $X_2$ are restricted to the power constraints $EX_1^2 \le P_1$ and $EX_2^2 \le P_2$. We believe that at high SNR and strong interference, the modulo-$\Delta$ strategy (28) is an optimum choice of $(X_1,X_2,U_1,U_2)$ in (7) for the Gaussian doubly-dirty MAC. This implies the following conjecture about the rate loss of the best known single-letter characterization.

Conjecture 1.

For the Gaussian doubly-dirty MAC, at high SNR and strong interference, the best known single-letter expression $R_{BSL}^{sum}$ (7) loses

$$C^{sum} - R_{BSL}^{sum} = \frac{1}{2}\log_2\Big(\frac{\pi e}{6}\Big) \approx 0.254\ \text{bit}, \tag{29}$$

with respect to the sum capacity $C^{sum}$ (27).

Note that the right-hand side of (29) is the well-known “shaping loss” [22] (equivalent to a 1.53 dB power loss).
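Both constants can be checked with a one-line computation (illustrative sketch):

```python
import math

loss_bits = 0.5 * math.log2(math.pi * math.e / 6)   # the rate loss in (29)
loss_db = 10 * math.log10(math.pi * math.e / 6)     # the equivalent power loss
print(loss_bits, loss_db)                           # ≈ 0.2546 bit, ≈ 1.533 dB
```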

A heuristic approach to attacking the proof of this conjecture is to follow the steps of the proof of the converse part in the binary case (Theorem 2). First, in Lemma 6 we derive a simplified single-letter formula, $\overline{G}_{max}(P_1,P_2)$, which is analogous to Lemma 1 in the binary case. The next step would be to optimize this expression. However, an optimal choice of the auxiliary random variables $V_1, V_1', V_2, V_2'$ (provided in the binary case by Lemma 2 and Lemma 3) is unfortunately still missing in the Gaussian case. The expression in Lemma 6 is close in spirit to the point-to-point dirty-tape capacity at high SNR and strong interference [8]. In [8] it is shown that optimizing the capacity is equivalent to minimum entropy-constrained scalar quantization at high resolution, which is achieved by a lattice quantizer. Clearly, if we could show a similar lemma for the two variable pairs in the maximization of Lemma 6, i.e., that the maximum is achieved by a pair of lattice quantizers, then the conjecture would be an immediate consequence.

It should be noted that the above discussion is valid only for strong interferences $S_1$ and $S_2$. For interference with finite power, it seems that cancelling the interference part of the time and staying silent the rest of the time (as in the time-sharing region $0 \le q \le q^*$ in the binary case) may achieve better rates.

V Summary

A memoryless information-theoretic problem is considered open as long as we are missing a general single-letter characterization of its information performance. This goes hand in hand with the optimality of the random coding approach for those problems which are currently solved. We examined this traditional view for the memoryless doubly-dirty MAC.

In the binary case, we showed that the best known single-letter characterization is strictly contained in the region achievable by linear coding, and that the latter is in fact the full capacity region of the problem. In the Gaussian case, we conjectured that the best known single-letter characterization suffers an inherent rate loss (equal to the well-known “shaping loss” $\frac{1}{2}\log_2(\pi e/6)$), and we provided a partial proof. This is in contrast to the asymptotic optimality (as the lattice dimension goes to infinity) of lattice strategies, recently shown in [10].

The underlying reason for these performance gaps is that random binning is in general not optimal when side information is distributed among more than one terminal in the network. In the specific case of the doubly-dirty MAC (as in Korner-Marton’s modulo-two sum problem [11] and similar settings [14, 15]), the linear structure of the network allows one to show that linear binning is not only better, but in fact capacity achieving.

Appendix A A Closed Form Expression for the Capacity of the Binary MAC with One Dirty User

We consider the binary dirty MAC (2) with $S_2 = 0$,

$$Y = X_1 \oplus X_2 \oplus S_1, \tag{30}$$

where $S_1 \sim \text{Bernoulli}(1/2)$ is known non-causally at the encoder of user 1, and with the input constraints $\frac{1}{n} w_H(\mathbf{x}_i) \le q_i$ for $i=1,2$. We show that the common-message ($W_1 = W_2 = W$) capacity of this channel is given by

$$C_{com} = H_b(q_1). \tag{31}$$

To prove (31), consider the general expression for the common message capacity of the MAC with one informed user [4], given by

$$C_{com} = \max_{U_1,X_1,X_2} \big\{ I(U_1,X_2;Y) - I(U_1,X_2;S_1) \big\}, \tag{32}$$

where the maximization is over all joint distributions of the form

$$P(S_1,X_1,X_2,U_1,Y) = P(S_1)\,P(X_2)\,P(U_1|X_2,S_1)\,P(X_1|S_1,U_1)\,P(Y|X_1,X_2,S_1).$$

The converse part of (31) follows since for any $U_1, X_1, X_2$, the common-message rate $R_{com}$ can be upper bounded by

$$\begin{aligned}
R_{com} &= I(U_1,X_2;Y) - I(U_1,X_2;S_1) \\
&= H(S_1|U_1,X_2) - H(Y|U_1,X_2) + H(Y) - H(S_1) \\
&\le H(S_1|U_1,X_2) - H(Y|U_1,X_2) && (33) \\
&= H(S_1|U_1,X_2) - H(X_1 \oplus S_1|U_1,X_2) && (34) \\
&= H(S_1|T) - H(X_1 \oplus S_1|T) && (35) \\
&= E_T\big\{H(S_1|T=t) - H(X_1 \oplus S_1|T=t)\big\} && (36) \\
&= E_T\big\{H_b(\alpha_t) - H_b(\beta_t)\big\}, && (37)
\end{aligned}$$

where (33) follows since $H(Y) \le 1$ and $H(S_1) = 1$; (34) follows since $Y = X_1 \oplus X_2 \oplus S_1$; (35) follows from the definition $T \triangleq (U_1,X_2)$; (36) follows from the definition of conditional entropy; (37) follows from the definitions $\alpha_t \triangleq P(S_1 = 1|T=t)$ and $\beta_t \triangleq P(S_1 \oplus X_1 = 1|T=t)$ for any $t \in \mathcal{T}$. We also define $q_{1|t} \triangleq P(X_1 = 1|T=t) = E\{X_1|T=t\}$; therefore the input constraint of user 1 can be written as

$$EX_1 = E_T E\{X_1|T=t\} = E_T\{q_{1|t}\} \le q_1. \tag{38}$$

Without loss of generality, we can consider only $\alpha_t, \beta_t, q_{1|t} \in [0,1/2]$ in (37) for any $t \in \mathcal{T}$. Thus,

$$\begin{aligned}
R_{com} &\le E_T\Big\{H_b(\alpha_t) - H_b\big([\alpha_t - q_{1|t}]^+\big)\Big\} && (39) \\
&\le E_T\big\{H_b(q_{1|t})\big\} && (40) \\
&\le H_b\big(E_T\{q_{1|t}\}\big) && (41) \\
&\le H_b(q_1), && (42)
\end{aligned}$$

where (39) follows from (37) and since $H_b(\beta_t) \ge H_b\big([\alpha_t - q_{1|t}]^+\big)$, where $[x]^+ = \max\{x,0\}$; (40) follows since $H_b(\alpha_t) - H_b\big([\alpha_t - q_{1|t}]^+\big)$ is increasing in $\alpha_t$ for $\alpha_t \le q_{1|t} \le 1/2$ and decreasing in $\alpha_t$ for $q_{1|t} < \alpha_t \le 1/2$, thus the maximum is attained at $\alpha_t = q_{1|t}$; (41) follows from Jensen’s inequality since $H_b(\cdot)$ is convex-$\cap$; (42) follows from the input constraint of user 1 (38). The converse part follows since the outer bound is valid for any $U_1$ and $X_1, X_2$ that satisfy the input constraints.

The direct part is shown by using $U_1 = X_1 \oplus S_1$, where $X_1$ and $S_1$ are independent with $X_1 \sim \text{Bernoulli}(q_1)$; thus $U_1 \sim \text{Bernoulli}(1/2)$. Furthermore, $X_2 \sim \text{Bernoulli}(q_2)$ is independent of $X_1, U_1, S_1$. In this case $Y = U_1 \oplus X_2$; hence $Y \sim \text{Bernoulli}(1/2)$. Using this choice of $U_1, X_1, X_2$, the achievable common-message rate is given by

$$\begin{aligned}
R_{com} &= I(U_1,X_2;Y) - I(U_1,X_2;S_1) \\
&= H(S_1|U_1,X_2) - H(Y|U_1,X_2) + H(Y) - H(S_1) \\
&= H(X_1) && (43) \\
&= H_b(q_1),
\end{aligned}$$

where (43) follows since $H(S_1|U_1,X_2) = H(S_1|U_1) = H(X_1)$, $H(Y|U_1,X_2) = 0$, $H(Y) = 1$ and $H(S_1) = 1$.

Appendix B Proof of the Converse Part of Theorem 2

The proof of the converse part follows from Lemma 1, Lemma 2 and Lemma 3, whereas Lemma 4 and Lemma 5 are technical results which assist in the derivation of Lemma 3.

Let us define the following functions:

$$F(P_{V_1,V_1'}, P_{V_2,V_2'}) \triangleq \Big[H(V_1) + H(V_2) - H(V_1' \oplus V_2') - 1\Big]^+, \tag{44}$$

where $[x]^+ = \max(0,x)$; its $(q_1,q_2)$-constrained maximization with respect to $V_1, V_1', V_2, V_2' \in \mathbb{Z}_2$, where $(V_1,V_1')$ and $(V_2,V_2')$ are independent, i.e.,

$$F_{max}(q_1,q_2) \triangleq \max_{V_1,V_1',V_2,V_2'} F(P_{V_1,V_1'}, P_{V_2,V_2'}) \quad \text{s.t.}\ \ P(V_i \ne V_i') \le q_i\ \text{for}\ i=1,2; \tag{45}$$

and the upper convex envelope of $F_{max}(q_1,q_2)$ with respect to $q_1, q_2$:

$$\overline{F}_{max}(q_1,q_2) \triangleq \mathrm{u.c.e.}\big\{F_{max}(q_1,q_2)\big\}. \tag{46}$$

In the following lemma we give an outer bound for the single-letter region (7) of the binary doubly-dirty MAC in the spirit of [23, Lemma 3] and [8, Proposition 1].

Lemma 1.

The best known single-letter rate sum (7) of the binary doubly-dirty MAC (2) with input constraints $q_1$ and $q_2$ is upper bounded by

$$R_1 + R_2 \le \overline{F}_{max}(q_1,q_2). \tag{47}$$
Proof.

An outer bound on the best known single-letter region (7) is given by

$$\begin{aligned}
R_{BSL}^{sum}(U_1,U_2) &\triangleq \Big[I(U_1,U_2;Y) - I(U_1,U_2;S_1,S_2)\Big]^+ && (48) \\
&= \Big[H(S_1|U_1) + H(S_2|U_2) - H(Y|U_1,U_2) + H(Y) - H(S_1) - H(S_2)\Big]^+ && (49) \\
&\le \Big[H(S_1|U_1) + H(S_2|U_2) - H(Y|U_1,U_2) - 1\Big]^+ && (50) \\
&= \Big[E_{U_1,U_2}\big\{H(S_1|U_1=u_1) + H(S_2|U_2=u_2) - H(Y|U_1=u_1,U_2=u_2) - 1\big\}\Big]^+ && (51) \\
&\le E_{U_1,U_2}\Big\{\big[H(S_1|U_1=u_1) + H(S_2|U_2=u_2) - H(Y|U_1=u_1,U_2=u_2) - 1\big]^+\Big\} && (52) \\
&\le E_{U_1,U_2}\Big\{F\big(P_{S_1,S_1\oplus X_1|U_1=u_1},\, P_{S_2,S_2\oplus X_2|U_2=u_2}\big)\Big\} && (53) \\
&\le E_{U_1,U_2}\Big\{\overline{F}_{max}\big(q_{1|u_1}, q_{2|u_2}\big)\Big\} && (54) \\
&\le \overline{F}_{max}\big(E_{U_1} q_{1|u_1},\, E_{U_2} q_{2|u_2}\big) && (55) \\
&\le \overline{F}_{max}(q_1,q_2), && (56)
\end{aligned}$$

where (50) follows since $H(S_1) = H(S_2) = 1$ and $H(Y) \le 1$; (51) follows from the definition of conditional entropy; (52) follows since $[Ex]^+ \le E\{x^+\}$; (53) follows from the definition of the function $F(P_{V_1,V_1'}, P_{V_2,V_2'})$ (44); likewise, (54) follows from the definition of the function $\overline{F}_{max}(q_1,q_2)$ (46) and from the definition

$$q_{i|u_i} \triangleq P(S_i \ne X_i \oplus S_i | U_i = u_i) = P(X_i = 1|U_i = u_i), \quad \text{for}\ i = 1,2;$$

(55) follows from Jensen’s inequality since $\overline{F}_{max}(q_1,q_2)$ is a concave function; (56) follows from the input constraints, where

$$EX_i = E_{U_i} P(X_i = 1|U_i = u_i) = \sum_{u_i \in \mathcal{U}_i} P(u_i)\,P(X_i = 1|U_i = u_i) = \sum_{u_i \in \mathcal{U}_i} P(u_i)\,q_{i|u_i} \le q_i, \quad \text{for}\ i = 1,2. \tag{57}$$

The lemma now follows since the upper bound (56) on the rate sum is independent of $U_1$ and $U_2$; hence it also bounds the single-letter region $\mathcal{R}_{BSL}(q)$. ∎

A simplified expression for the function $F_{max}(q_1,q_2)$ of (45) is given in the following lemma.

Lemma 2.

The function $F_{max}(q_1,q_2)$ (45) is given by

$$F_{max}(q_1,q_2) = \max_{\alpha_1,\alpha_2 \in [0,1/2]} \Big[H_b(\alpha_1) + H_b(\alpha_2) - H_b\big([\alpha_1 - q_1]^+ \ast [\alpha_2 - q_2]^+\big) - 1\Big]^+, \tag{58}$$

where $\ast$ denotes binary convolution, i.e., $x \ast y \triangleq (1-x)y + (1-y)x$.

Proof.

The function $F_{max}(q_1,q_2)$ is defined in (44) and (45), where $V_1, V_1', V_2, V_2'$ are binary random variables. Let us define the following probabilities:

$$\alpha_i \triangleq P(V_i = 1), \qquad \delta_i \triangleq P(V_i' = 1|V_i = 0), \qquad \gamma_i \triangleq P(V_i' = 0|V_i = 1),$$

for $i=1,2$. We thus have

$$P(V_i' = 1) = (1-\alpha_i)\delta_i + \alpha_i(1-\gamma_i) \triangleq g(\alpha_i,\delta_i,\gamma_i), \qquad P(V_i \ne V_i') = \alpha_i\gamma_i + (1-\alpha_i)\delta_i \triangleq h(\alpha_i,\delta_i,\gamma_i),$$

for $i=1,2$. The maximization (45) can be written as

$$F_{max}(q_1,q_2) = \max_{\alpha_1,\alpha_2}\bigg[H_b(\alpha_1) + H_b(\alpha_2) - \min_{\substack{\gamma_1,\delta_1,\gamma_2,\delta_2 \\ h(\alpha_i,\delta_i,\gamma_i) \le q_i,\ i=1,2}} H_b\big(g(\alpha_1,\delta_1,\gamma_1) \ast g(\alpha_2,\delta_2,\gamma_2)\big) - 1\bigg]^+. \tag{59}$$

This maximization has two equivalent solutions, $(\alpha_1^o,\alpha_2^o)$ and $(1-\alpha_1^o, 1-\alpha_2^o)$ with $0 \le \alpha_1^o, \alpha_2^o \le 0.5$, since any other $(\alpha_1,\alpha_2)$ can only increase the inner minimization in (59), which results in a lower $F_{max}(q_1,q_2)$. Therefore, without loss of generality we may assume that $0 \le \alpha_1, \alpha_2 \le 0.5$.

To prove the lemma we need to show that for any $\alpha_i$ the inner minimization is achieved by

$$\delta_i = 0, \quad \gamma_i = \min\{1, q_i/\alpha_i\}, \quad i = 1,2.$$

In other words, $V_i'$ has the smallest possible probability of being 1 under the constraint $P(V_i \ne V_i') \le q_i$, implying that the transition from $V_i$ to $V_i'$ is a “Z channel”. The inner minimization requires that $P(V_i' = 1)$ be minimized subject to the constraint $P(V_i \ne V_i') \le q_i$; therefore it is equivalent to the following minimization:

$$\min_{\substack{\gamma_i,\delta_i \\ h(\alpha_i,\delta_i,\gamma_i) \le q_i}} g(\alpha_i,\delta_i,\gamma_i), \quad i = 1,2.$$

For $\alpha_i \le q_i$, the solution is $\delta_i = 0$ and $\gamma_i = 1$, since in this case $g(\alpha_i,\delta_i,\gamma_i) = 0$ and the constraint is satisfied. For $q_i \le \alpha_i \le 0.5$, in order to minimize $g(\alpha_i,\delta_i,\gamma_i)$, it is required that $\delta_i \in [0, q_i/(1-\alpha_i)]$ be minimal and $\gamma_i \in [0, q_i/\alpha_i]$ be maximal such that the constraint is satisfied. Clearly, the best choice is $\delta_i = 0$ and $\gamma_i = q_i/\alpha_i$; in this case the constraint is satisfied and $g(\alpha_i,\delta_i,\gamma_i) = \alpha_i - q_i$. ∎
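As a sanity check on this “Z channel” claim, the following brute-force Python sketch (with illustrative values of $\alpha_i$ and $q_i$) grids over $(\delta_i, \gamma_i)$ and confirms that the constrained minimizer of $g$ is $\delta_i = 0$, $\gamma_i = q_i/\alpha_i$, with minimum value $\alpha_i - q_i$.

```python
import numpy as np

q, alpha = 0.2, 0.35                          # illustrative constraint and P(V=1)
best, arg = 2.0, None
for d in np.linspace(0, 1, 501):              # delta = P(V'=1 | V=0)
    for g in np.linspace(0, 1, 501):          # gamma = P(V'=0 | V=1)
        if alpha*g + (1 - alpha)*d <= q:      # constraint h(alpha, delta, gamma) <= q
            val = (1 - alpha)*d + alpha*(1 - g)   # objective g(alpha, delta, gamma)
            if val < best:
                best, arg = val, (d, g)
print(arg, best)                              # ≈ (0.0, q/alpha), alpha - q
```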

The next lemma gives an explicit upper bound on $F_{max}(q_1,q_2)$ (45) for the case $q_1 = q_2$. Let

$$f(x) = x - \frac{1}{1 + \big(\frac{1}{x} - 1\big)^2}, \tag{60}$$

and let

$$q_c \triangleq \max_{x \in [0,1/2]} f(x). \tag{61}$$

Since $f(x)$ is differentiable, we can characterize $q_c$ by differentiating $f(x)$ with respect to $x$ and equating to zero, which gives

$$4x^4 - 8x^3 + 10x^2 - 6x + 1 = 0.$$

This fourth-order polynomial has two complex roots and two real roots, where one of the real roots is a local minimum and the other is a local maximum. Specifically, this local maximum maximizes $f(x)$ over the interval $x \in [0,1/2]$, yielding $q_c \simeq 0.1501$, which is attained at $x \simeq 0.257$.
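These numerical values are straightforward to reproduce; a short Python sketch (illustrative) finds the real roots of the quartic and evaluates $f$ at them:

```python
import numpy as np

def f(x):                                     # f(x) of (60)
    return x - 1.0 / (1.0 + (1.0/x - 1.0)**2)

roots = np.roots([4, -8, 10, -6, 1])          # 4x^4 - 8x^3 + 10x^2 - 6x + 1 = 0
real = [r.real for r in roots if abs(r.imag) < 1e-9 and 0 < r.real < 0.5]
x_max = max(real, key=f)                      # the local maximum inside [0, 1/2]
print(x_max, f(x_max))                        # ≈ 0.257, q_c ≈ 0.1501
```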

Lemma 3.

For $q_1 = q_2 = q$, we have that:

$$\begin{array}{ll} F_{max}(q,q) = 2H_b(q) - 1, & q_c \le q \le 1/2 \\ F_{max}(q,q) \le C^* q, & 0 < q < q_c \\ F_{max}(0,0) = 0, & q = 0, \end{array} \tag{65}$$

where $q_c$ is defined in (61), while $C^* = \frac{2H_b(q^*)-1}{q^*}$ and $q^* \triangleq 1 - 1/\sqrt{2} \simeq 0.3$ are defined in (19).

Note that in the first case ($q_c \le q \le 1/2$), the maximum in (58) is achieved by $\alpha_1 = \alpha_2 = q$, while in the third case ($q = 0$), (58) is achieved by $\alpha_1 = \alpha_2 = 1/2$, as shown in Fig. 5. Although we do not have an explicit expression for $F_{max}(q,q)$ in the range $0 < q < q_c$, the bound $F_{max}(q,q) \le C^* q$ is sufficient for the purpose of proving Theorem 2 because $q_c \le q^*$. A numerical characterization of $F_{max}(q,q)$ is plotted in Fig. 4.

Figure 4: Numerical evaluation of $F_{max}(q,q)$ (58) for $q \in [0,0.12]$ (Fig. 2 shows the same plot for $q \in [0,0.5]$).

Figure 5: The optimal $\alpha_1 = \alpha_2 = \alpha(q)$ which maximizes (58).
Proof.

Define

$$F(\alpha_1,\alpha_2,q) \triangleq H_b(\alpha_1) + H_b(\alpha_2) - H_b\big([\alpha_1 - q]^+ \ast [\alpha_2 - q]^+\big) - 1.$$

From the discussion above about the cases of equality in (65), Lemma 3 will follow by showing that $F(\alpha_1,\alpha_2,q)$ is otherwise smaller, i.e.,

$$F(\alpha_1,\alpha_2,q) \le \begin{cases} C^* q, & 0 \le q \le q_c \\ 2H_b(q) - 1, & q_c \le q \le 1/2 \end{cases} \tag{68}$$

for all $0 \le \alpha_1, \alpha_2 \le 1/2$. It is easy to see that for $\alpha_1, \alpha_2 \le q$, the function $F(\alpha_1,\alpha_2,q)$ is monotonically increasing in $\alpha_1, \alpha_2$, and thus $F(\alpha_1,\alpha_2,q) \le F(q,q,q) = 2H_b(q) - 1$. For $\alpha_1 \le q$ and $q < \alpha_2 \le 1/2$, $F(\alpha_1,\alpha_2,q)$ is increasing in $\alpha_1$ and decreasing in $\alpha_2$, and thus $F(\alpha_1,\alpha_2,q) \le F(q,q,q) = 2H_b(q) - 1$. By symmetry, the same holds for $\alpha_2 \le q$ and $q \le \alpha_1 \le 1/2$. As a consequence, it remains to show that (68) is satisfied for $q \le \alpha_1, \alpha_2 \le 1/2$. Likewise, in the sequel we may assume without loss of generality that $q \le \alpha_2 \le \alpha_1 \le 1/2$.

The bound for the interval $q_c < q \le 1/2$: in this case (68) is equivalent to the following bound:

$$H_b\big((\alpha_1 - q) \ast (\alpha_2 - q)\big) - H_b(\alpha_1) - H_b(\alpha_2) + 2H_b(q) \ge 0, \quad \text{for}\ q_c \le q \le \alpha_2 \le \alpha_1 \le 1/2. \tag{69}$$

The LHS is lower bounded by

$$\begin{aligned}
& H_b\big((\alpha_1-q) \ast (\alpha_2-q)\big) - H_b(\alpha_1) - H_b(\alpha_2) + 2H_b(q) \\
&\ge H_b(\alpha_1 - q) - H_b(\alpha_1) - H_b(\alpha_2) + 2H_b(q) && (70) \\
&\ge H_b(\alpha_1 - q) - 2H_b(\alpha_1) + 2H_b(q) && (71) \\
&\ge 0, && (72)
\end{aligned}$$

where (70) follows since $H_b\big((\alpha_1-q)\ast(\alpha_2-q)\big) \ge H_b(\alpha_1-q)$; (71) follows since $\alpha_2 \le \alpha_1 \le 1/2$; (72) follows from Lemma 4 below.

The bound for the interval $0 \le q \le q_c$: in this case (68) is equivalent to the following bound:

$$H_b\big((\alpha_1-q)\ast(\alpha_2-q)\big) \ge H_b(\alpha_1) + H_b(\alpha_2) - 1 - C^* q, \quad \text{for}\ 0 \le q \le \alpha_2 \le \alpha_1 \le q_c. \tag{73}$$

For fixed $\alpha_1$ and $\alpha_2$, let us denote the LHS and the RHS of (73) by

$$g_l(q) \triangleq H_b\big((\alpha_1-q)\ast(\alpha_2-q)\big), \qquad g_r(q) \triangleq H_b(\alpha_1) + H_b(\alpha_2) - 1 - C^* q.$$

The function $g_l(q)$ is convex-$\cap$ in $q$, since it is a composition of the function $H_b(x)$, which is non-decreasing and convex-$\cap$ on $[0,1/2]$, with the function $[\alpha_1-q]\ast[\alpha_2-q]$, which is convex-$\cap$ in $q$ [24]. Since $g_r(q)$ is a linear function of $q$ and $g_l(q)$ is convex-$\cap$ in $q$, the bound (73) is satisfied if it holds at the edges of the interval ($q = 0$ and $q = \alpha_2$). For $q = 0$, (73) holds since

$$\begin{aligned}
g_l(q=0) &= H_b(\alpha_1 \ast \alpha_2) \\
&\ge \max\{H_b(\alpha_1), H_b(\alpha_2)\} \\
&\ge \min\{H_b(\alpha_1), H_b(\alpha_2)\} \\
&\ge H_b(\alpha_1) + H_b(\alpha_2) - 1 \\
&= g_r(q=0).
\end{aligned}$$

For $q = \alpha_2$, where $0 \le q \le q_c$, the bound (73) is satisfied since

$$\begin{aligned}
g_r(q=\alpha_2) &= H_b(\alpha_1) + H_b(\alpha_2) - 1 - C^* \alpha_2 && (74) \\
&\le H_b(\alpha_1) - H_b(q^*) + H_b(0.5\, q^*) - 0.5 && (75) \\
&\le H_b(\alpha_1) - H_b(q_c) && (76) \\
&\le H_b(\alpha_1) - H_b(\alpha_2) && (77) \\
&\le H_b(\alpha_1 - \alpha_2) && (78) \\
&= g_l(q=\alpha_2), && (79)
\end{aligned}$$

where (75) follows from Lemma 5, since $\arg\max_{\alpha_2 \in [0,1/2]} g_r(\alpha_2) = 0.5\, q^*$ and $C^* = \frac{2H_b(q^*)-1}{q^*}$; (76) follows since for $q^* = 1 - 1/\sqrt{2}$ and $q_c$ defined in (61), we have $H_b\big(1-1/\sqrt{2}\big) - H_b\big(0.5(1-1/\sqrt{2})\big) + 0.5 \simeq 0.68\ldots \ge H_b(q_c)$; (77) follows since $q_c \ge \alpha_2$, thus $H_b(q_c) \ge H_b(\alpha_2)$; (78) follows since $H_b(\alpha_1) - H_b(\alpha_1 - \alpha_2)$ is decreasing in $\alpha_1$, thus $H_b(\alpha_1) - H_b(\alpha_1-\alpha_2) \le H_b(\alpha_2)$ for $\alpha_2 \le \alpha_1 \le 1/2$. Therefore, the bound (73) follows, which completes the proof. ∎

Lemma 4 and Lemma 5 are auxiliary lemmas used in the proof of Lemma 3.

Lemma 4.

For $q_c \le q \le \alpha_1 \le 1/2$, the following inequality is satisfied:

$$f_1(\alpha_1) \triangleq H_b(\alpha_1 - q) - 2H_b(\alpha_1) + 2H_b(q) \ge 0. \tag{80}$$
Proof.

Since f1(α1=q)=0f_{1}(\alpha_{1}=q)=0, it is sufficient to show that f1(α1)f_{1}(\alpha_{1}) is non-decreasing function in α1\alpha_{1}, i.e., ddα1f1(α1)0\frac{d}{d\alpha_{1}}f_{1}(\alpha_{1})\geq 0 for qcqα11/2q_{c}\leq q\leq\alpha_{1}\leq 1/2, therefore

ddα1f1(α1)=log2(1α1q1)2log2(1α11)0.\displaystyle\frac{d}{d\alpha_{1}}f_{1}(\alpha_{1})=\log_{2}\Big{(}\frac{1}{\alpha_{1}-q}-1\Big{)}-2\log_{2}\Big{(}\frac{1}{\alpha_{1}}-1\Big{)}\geq 0. (81)

Due to the monotonicity of the log function, (81) is equivalent to

qα111+(1α11)2=f(α1),\displaystyle q\geq\alpha_{1}-\frac{1}{1+\Big{(}\frac{1}{\alpha_{1}}-1\Big{)}^{2}}=f(\alpha_{1}), (82)

where f()f(\cdot) was defined in (60). Since, by the definition of qcq_{c} in (61), f(x)qcf(x)\leq q_{c} for all x[0,1/2]x\in[0,1/2], it follows that f(α1)qf(\alpha_{1})\leq q for all α1\alpha_{1} whenever qcqq_{c}\leq q, and in particular for qcqα1q_{c}\leq q\leq\alpha_{1}, which implies (82) as desired. ∎
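A quick numerical sanity check of Lemma 4 (illustrative only; it again assumes qc = max f per (61)):

import numpy as np

def Hb(p):
    # binary entropy in bits
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -p*np.log2(p) - (1 - p)*np.log2(1 - p)

x = np.linspace(1e-6, 0.5, 200001)
qc = (x - 1/(1 + (1/x - 1)**2)).max()   # qc, assuming qc = max f(x)

worst = np.inf
for q in np.linspace(qc, 0.5, 200):
    a1 = np.linspace(q, 0.5, 200)       # sweep qc <= q <= a1 <= 1/2
    worst = min(worst, (Hb(a1 - q) - 2*Hb(a1) + 2*Hb(q)).min())
print(worst)                            # minimum of f1; should be >= 0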

Lemma 5.

Let

f2(x)=Hb(x)1Cx,\displaystyle f_{2}(x)=H_{b}(x)-1-C^{*}\cdot x, (83)

where x[0,1/2]x\in[0,1/2], and C=2Hb(q)1qC^{*}=\frac{2H_{b}(q^{*})-1}{q^{*}} with q=11/2q^{*}=1-1/\sqrt{2}. The maximum of f2(x)f_{2}(x) is achieved at

argmaxxf2(x)=0.5q=12(11/2).\displaystyle\arg\max_{x}f_{2}(x)=0.5q^{*}=\frac{1}{2}(1-1/\sqrt{2}). (84)
Proof.

By differentiating f2(x)f_{2}(x) with respect to xx and equating the derivative to zero, we get

0=ddxf2(x)=log2(1xx)C,\displaystyle 0=\frac{d}{dx}f_{2}(x)=\log_{2}\Big{(}\frac{1-x}{x}\Big{)}-C^{*}, (85)

thus xo=12C+1x^{o}=\frac{1}{2^{C^{*}}+1} maximizes f2(x)f_{2}(x), since the second derivative is negative, i.e., d2dx2f2(x)|x=xo<0\frac{d^{2}}{dx^{2}}f_{2}(x)|_{x=x^{o}}<0. The lemma follows since xo=12C+1=0.5qx^{o}=\frac{1}{2^{C^{*}}+1}=0.5q^{*}. ∎
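The closed form is easy to confirm numerically; the short sketch below (illustrative only) checks that the maximizer of f2 coincides with q∗/2:

import numpy as np

def Hb(p):
    # binary entropy in bits
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -p*np.log2(p) - (1 - p)*np.log2(1 - p)

qs = 1 - 1/np.sqrt(2)                 # q*
Cs = (2*Hb(qs) - 1)/qs                # C*

x = np.linspace(1e-6, 0.5, 500001)
xo = x[np.argmax(Hb(x) - 1 - Cs*x)]   # numerical maximizer of f2
print(xo, qs/2, 1/(2**Cs + 1))        # all three ~0.146447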

We are now in a position to summarize the proof of Theorem 2.

Proof of Theorem 2 - Converse Part. The rate sum is upper bounded by

R1+R2\displaystyle R_{1}+R_{2} u.c.e{Fmax(q,q)}\displaystyle\leq u.c.e\Big{\{}F_{max}(q,q)\Big{\}} (86)
u.c.e{Cq,0qqc2Hb(q)1,qc<q1/2}\displaystyle\leq u.c.e\Bigg{\{}\begin{array}[]{cc}C^{*}\cdot q,&0\leq q\leq q_{c}\\ 2H_{b}(q)-1,&q_{c}<q\leq 1/2\\ \end{array}\Bigg{\}} (89)
=u.c.e{[2Hb(q)1]+},\displaystyle=u.c.e\Big{\{}[2H_{b}(q)-1]^{+}\Big{\}}, (90)

where (86) follows from Lemma 1; (89) follows from Lemma 3; and (90) follows since the function inside the envelope in (89) is sandwiched between [2Hb(q)1]+[2H_{b}(q)-1]^{+} and its upper convex envelope, so both functions have the same upper convex envelope.
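For completeness, the following sketch computes the upper convex envelope in (90) numerically, via the upper hull of the graph of [2Hb(q)−1]+, and checks that it equals the tangent line C∗·q up to q∗ followed by the curve 2Hb(q)−1 beyond it (an illustration, not part of the proof):

import numpy as np

def Hb(p):
    # binary entropy in bits, Hb(0) = Hb(1) = 0
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -p*np.log2(p) - (1 - p)*np.log2(1 - p)

q = np.linspace(0.0, 0.5, 5001)
g = np.maximum(2*Hb(q) - 1, 0.0)          # [2Hb(q)-1]^+

def upper_envelope(x, y):
    # upper concave envelope on a grid, via the upper hull of the graph
    hull = []
    for i in range(len(x)):
        while len(hull) >= 2:
            i1, i2 = hull[-2], hull[-1]
            # drop i2 if it lies on or below the chord from i1 to i
            if (y[i2] - y[i1])*(x[i] - x[i1]) <= (y[i] - y[i1])*(x[i2] - x[i1]):
                hull.pop()
            else:
                break
        hull.append(i)
    return np.interp(x, x[hull], y[hull])

qs = 1 - 1/np.sqrt(2)                     # q* of Lemma 5
Cs = (2*Hb(qs) - 1)/qs                    # C*
cand = np.where(q <= qs, Cs*q, 2*Hb(q) - 1)
print(np.abs(upper_envelope(q, g) - cand).max())   # tiny: the envelopes agree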

Appendix C A Simplified Outer Bound for the Sum Capacity in the Strong-Interference Gaussian Case

Lemma 6.

The best known single-letter sum capacity (7) of the Gaussian doubly-dirty MAC (26) with power constraints P1P_{1}, P2P_{2}, and strong interferences (Q1,Q2Q_{1},Q_{2}\rightarrow\infty) is upper bounded by

R1+R2u.c.e{supV1,V1,V2,V2[h(V1)+h(V2)h(V1+V2+Z)+h(S1+S2)h(S1)h(S2)]+},\displaystyle R_{1}+R_{2}\leq u.c.e\bigg{\{}\sup_{V_{1},V^{\prime}_{1},V_{2},V^{\prime}_{2}}\Big{[}h(V_{1})+h(V_{2})-h\big{(}V^{\prime}_{1}+V^{\prime}_{2}+Z\big{)}+h(S_{1}+S_{2})-h(S_{1})-h(S_{2})\Big{]}^{+}\bigg{\}}, (91)

where u.c.eu.c.e is the upper convex envelope operation with respect to P1P_{1} and P2P_{2}, and [x]+=max(0,x)[x]^{+}=\max(0,x). The supremum is over all V1,V1,V2,V2V_{1},V^{\prime}_{1},V_{2},V^{\prime}_{2} such that (V1,V1)(V_{1},V^{\prime}_{1}) is independent of (V2,V2)(V_{2},V^{\prime}_{2}), and

E{(ViVi)2}Pi,\displaystyle E\Big{\{}(V_{i}-V^{\prime}_{i})^{2}\Big{\}}\leq P_{i},
h(Vi)h(Si),\displaystyle h(V_{i})\leq h(S_{i}),

for i=1,2i=1,2.

Proof.

Let us define the following two functions (corresponding to F(PV1,V1,PV2,V2)F(P_{V_{1},V^{\prime}_{1}},P_{V_{2},V^{\prime}_{2}}) of (44)). The first is

G(fV1,V1,fV2,V2)[h(V1)+h(V2)h(V1+V2+Z)+h(S1+S2)h(S1)h(S2)]+.\displaystyle G\big{(}f_{V_{1},V^{\prime}_{1}},f_{V_{2},V^{\prime}_{2}}\big{)}\triangleq\Big{[}h(V_{1})+h(V_{2})-h\big{(}V^{\prime}_{1}+V^{\prime}_{2}+Z\big{)}+h(S_{1}+S_{2})-h(S_{1})-h(S_{2})\Big{]}^{+}. (92)

The second function is the maximization of (92) with respect to V1,V1,V2,V2V_{1},V^{\prime}_{1},V_{2},V^{\prime}_{2}:

Gmax(P1,P2)\displaystyle G_{max}(P_{1},P_{2})\triangleq supV1,V1,V2,V2G(fV1,V1,fV2,V2)\displaystyle\sup_{V_{1},V^{\prime}_{1},V_{2},V^{\prime}_{2}}G\big{(}f_{V_{1},V^{\prime}_{1}},f_{V_{2},V^{\prime}_{2}}\big{)} (93)
s.t.E{(ViVi)2}Pi,h(Vi)h(Si),fori=1,2.\mbox{s.t.}\;\;E\Big{\{}(V_{i}-V^{\prime}_{i})^{2}\Big{\}}\leq P_{i},\quad h(V_{i})\leq h(S_{i}),\quad\mbox{for}\;\;i=1,2.

Finally, we define the upper convex envelope of Gmax(P1,P2)G_{max}(P_{1},P_{2}) with respect to P1P_{1} and P2P_{2}:

G¯max(P1,P2)u.c.e{Gmax(P1,P2)}.\displaystyle\overline{G}_{max}(P_{1},P_{2})\triangleq u.c.e\Big{\{}G_{max}(P_{1},P_{2})\Big{\}}. (94)

Clearly, if we keep only the rate-sum inequality in (6), we get an outer bound on the best known single-letter region:

RBSLsum(U1,U2)[I(U1,U2;Y)I(U1,U2;S1,S2)]+\displaystyle R_{BSL}^{sum}(U_{1},U_{2})\triangleq\Big{[}I(U_{1},U_{2};Y)-I(U_{1},U_{2};S_{1},S_{2})\Big{]}^{+} (95)
=[h(S1|U1)+h(S2|U2)h(Y|U1,U2)+h(Y)h(S1)h(S2)]+\displaystyle=\Big{[}h(S_{1}|U_{1})+h(S_{2}|U_{2})-h(Y|U_{1},U_{2})+h(Y)-h(S_{1})-h(S_{2})\Big{]}^{+} (96)
[h(S1|U1)+h(S2|U2)h(Y|U1,U2)+h(S1+S2)h(S1)h(S2)]++o(1)\displaystyle\leq\Big{[}h(S_{1}|U_{1})+h(S_{2}|U_{2})-h(Y|U_{1},U_{2})+h(S_{1}+S_{2})-h(S_{1})-h(S_{2})\Big{]}^{+}+o(1) (97)
=[EU1,U2{h(S1|U1=u1)+h(S2|U2=u2)h(Y|U1=u1,U2=u2)+h(S1+S2)h(S1)h(S2)}]++o(1)\displaystyle=\Bigg{[}E_{U_{1},U_{2}}\Big{\{}h(S_{1}|U_{1}=u_{1})+h(S_{2}|U_{2}=u_{2})-h(Y|U_{1}=u_{1},U_{2}=u_{2})+h(S_{1}+S_{2})-h(S_{1})-h(S_{2})\Big{\}}\Bigg{]}^{+}+o(1) (98)
EU1,U2{[h(S1|U1=u1)+h(S2|U2=u2)h(X1+S1+X2+S2+Z|U1=u1,U2=u2)\displaystyle\leq E_{U_{1},U_{2}}\Bigg{\{}\Big{[}h(S_{1}|U_{1}=u_{1})+h(S_{2}|U_{2}=u_{2})-h(X_{1}+S_{1}+X_{2}+S_{2}+Z|U_{1}=u_{1},U_{2}=u_{2})
+h(S1+S2)h(S1)h(S2)]+}+o(1)\displaystyle\qquad\qquad+h(S_{1}+S_{2})-h(S_{1})-h(S_{2})\Big{]}^{+}\Bigg{\}}+o(1) (99)
=EU1,U2{G(fS1,S1+X1|U1=u1,fS2,S2+X2|U2=u2)}+o(1)\displaystyle=E_{U_{1},U_{2}}\Bigg{\{}G\Big{(}f_{S_{1},S_{1}+X_{1}|U_{1}=u_{1}},f_{S_{2},S_{2}+X_{2}|U_{2}=u_{2}}\Big{)}\Bigg{\}}+o(1) (100)
EU1,U2{G¯max(P1|u1,P2|u2)}+o(1)\displaystyle\leq E_{U_{1},U_{2}}\Big{\{}\overline{G}_{max}\big{(}P_{1|u_{1}},P_{2|u_{2}}\big{)}\Big{\}}+o(1) (101)
G¯max(EU1P1|u1,EU2P2|u2)+o(1)\displaystyle\leq\overline{G}_{max}\big{(}E_{U_{1}}P_{1|u_{1}},E_{U_{2}}P_{2|u_{2}}\big{)}+o(1) (102)
G¯max(P1,P2)+o(1),\displaystyle\leq\overline{G}_{max}\big{(}P_{1},P_{2}\big{)}+o(1), (103)

where (97) follows since h(Y)h(S1+S2)+o(1)h(Y)\leq h(S_{1}+S_{2})+o(1), where o(1)0o(1)\rightarrow 0 as Q1,Q2Q_{1},Q_{2}\rightarrow\infty; (98) follows from the definition of conditional entropy; (99) follows since [Ex]+E{x+}[Ex]^{+}\leq E\{x^{+}\} and since Y=X1+S1+X2+S2+ZY=X_{1}+S_{1}+X_{2}+S_{2}+Z; (100) follows from the definition of the function G(fV1,V1,fV2,V2)G\big{(}f_{V_{1},V^{\prime}_{1}},f_{V_{2},V^{\prime}_{2}}\big{)} in (92); (101) follows from the definition of the function G¯max(P1,P2)\overline{G}_{max}(P_{1},P_{2}) in (94), since h(Si|Ui)h(Si)h(S_{i}|U_{i})\leq h(S_{i}), and from the definition

Pi|uiE{Xi2|Ui=ui},fori=1,2;\displaystyle P_{i|u_{i}}\triangleq E\Big{\{}X_{i}^{2}|U_{i}=u_{i}\Big{\}},\;for\;i=1,2;

(102) follows from Jensen's inequality, since G¯max(P1,P2)\overline{G}_{max}(P_{1},P_{2}) is a concave function; (103) follows from the input constraints, since

EXi2\displaystyle EX_{i}^{2} =EUiE{Xi2|Ui=ui}=EUiPi|uiPi,fori=1,2.\displaystyle=E_{U_{i}}E\big{\{}X_{i}^{2}|U_{i}=u_{i}\big{\}}=E_{U_{i}}P_{i|u_{i}}\leq P_{i},\;\mbox{for}\;\;i=1,2. (104)

The lemma follows since the upper bound (103) on the rate sum is now independent of U1U_{1} and U2U_{2}; hence it also bounds the single-letter region BSL(P1,P2)\mathcal{R}_{BSL}(P_{1},P_{2}). ∎
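To illustrate how the bound in (91) is evaluated, the sketch below computes the bracket at a single admissible jointly Gaussian test point. Since the supremum in (91) is over all admissible distributions, this is only a sample evaluation of the objective, not the bound itself; the values of P, Q, and the noise variance, the choice V′i = √(M/Q)·Vi, and the assumption Si ∼ N(0,Q) are all illustrative.

import numpy as np

def h_gauss(var):
    # differential entropy, in bits, of a N(0, var) variable
    return 0.5*np.log2(2*np.pi*np.e*var)

# Illustrative values (assumptions): P1 = P2 = P, Q1 = Q2 = Q large,
# var(Z) = sz2, and S1, S2 ~ N(0, Q) independent.
P, Q, sz2 = 1.0, 1e6, 1.0

# Admissible test point: V_i ~ N(0, Q), so h(V_i) = h(S_i), and
# V_i' = sqrt(M/Q) * V_i with M = (sqrt(Q) - sqrt(P))^2, so that
# E{(V_i - V_i')^2} = (sqrt(Q) - sqrt(M))^2 = P.
M = (np.sqrt(Q) - np.sqrt(P))**2
bracket = (2*h_gauss(Q)              # h(V1) + h(V2)
           - h_gauss(2*M + sz2)      # h(V1' + V2' + Z), independent Gaussians
           + h_gauss(2*Q)            # h(S1 + S2)
           - 2*h_gauss(Q))           # - h(S1) - h(S2)
print(max(bracket, 0.0))             # ~0.0014 bit for these values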

Acknowledgment

The authors wish to thank Ashish Khisti for earlier discussions on the binary case. The authors would also like to thank Uri Erez for helpful comments.

References

[1] M. Costa, “Writing on dirty paper,” IEEE Trans. Information Theory, vol. IT-29, pp. 439–441, May 1983.
[2] S. Gelfand and M. S. Pinsker, “Coding for channel with random parameters,” Problemy Pered. Inform. (Problems of Inform. Trans.), vol. 9, no. 1, pp. 19–31, 1980.
[3] T. M. Cover and J. A. Thomas, Elements of Information Theory. New York: Wiley, 1991.
[4] A. Somekh-Baruch, S. Shamai, and S. Verdu, “Cooperative encoding with asymmetric state information at the transmitters,” in Proceedings 44th Annual Allerton Conference on Communication, Control, and Computing, Univ. of Illinois, Urbana, IL, USA, Sep. 2006.
[5] S. Kotagiri and J. N. Laneman, “Multiple access channels with state information known at some encoders,” IEEE Trans. Information Theory, July 2006, submitted for publication.
[6] S. A. Jafar, “Capacity with causal and non-causal side information - a unified view,” IEEE Trans. Information Theory, vol. IT-52, pp. 5468–5475, Dec. 2006.
[7] K. Marton, “A coding theorem for the discrete memoryless broadcast channel,” IEEE Trans. Information Theory, vol. IT-25, pp. 306–311, May 1979.
[8] U. Erez, S. Shamai, and R. Zamir, “Capacity and lattice strategies for canceling known interference,” IEEE Trans. Information Theory, vol. IT-51, pp. 3820–3833, Nov. 2005.
[9] R. Zamir, S. Shamai, and U. Erez, “Nested linear/lattice codes for structured multiterminal binning,” IEEE Trans. Information Theory, vol. IT-48, pp. 1250–1276, June 2002.
[10] T. Philosof, A. Khisti, U. Erez, and R. Zamir, “Lattice strategies for the dirty multiple access channel,” in Proceedings of IEEE International Symposium on Information Theory, Nice, France, June 2007.
[11] J. Korner and K. Marton, “How to encode the modulo-two sum of binary sources,” IEEE Trans. Information Theory, vol. IT-25, pp. 219–221, March 1979.
[12] T. M. Cover and B. Gopinath, Open Problems in Communication and Computation. New York: Springer-Verlag, 1987.
[13] I. Csiszar and J. Korner, Information Theory: Coding Theorems for Discrete Memoryless Systems. New York: Academic Press, 1981.
[14] B. Nazer and M. Gastpar, “Computation over multiple-access channels,” IEEE Trans. Information Theory, vol. IT-53, pp. 3498–3516, Oct. 2007.
[15] D. Krithivasan and S. S. Pradhan, “Lattices for distributed source coding: Jointly Gaussian sources and reconstruction of a linear function,” arXiv:cs.IT/0707.3461v1.
[16] A. Khisti, private communication.
[17] R. G. Gallager, Information Theory and Reliable Communication. New York: Wiley, 1968.
[18] G. Cohen, I. Honkala, S. Litsyn, and A. Lobstein, Covering Codes. Amsterdam, The Netherlands: North Holland Publishing, 1997.
[19] R. Ahlswede and J. Korner, “Source coding with side information and a converse for degraded broadcast channels,” IEEE Trans. Information Theory, vol. IT-21, pp. 629–637, Nov. 1975.
[20] A. Wyner, “On source coding with side information at the decoder,” IEEE Trans. Information Theory, vol. IT-21, pp. 294–300, May 1975.
[21] T. Berger, “Multiterminal source coding,” in The Information Theory Approach to Communications, G. Longo, Ed. New York: Springer-Verlag, 1977.
[22] L. F. Wei and G. D. Forney, “Multidimensional constellations - part I: Introduction, figures of merit, and generalized cross constellations,” IEEE Journal on Selected Areas in Communications, vol. 7, pp. 877–892, Aug. 1989.
[23] A. Cohen and R. Zamir, “Entropy amplification property and the loss for writing on dirty paper,” IEEE Trans. Information Theory, to appear, April 2008.
[24] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge: Cambridge University Press, 2004.