Capacity of a Class of Diamond Channels†

†This work was supported by NSF Grants CCF 04-47613, CCF 05-14846, CNS 07-16311 and CCF 07-29127.

Wei Kang Sennur Ulukus
Department of Electrical and Computer Engineering
University of Maryland, College Park, MD 20742
wkang@umd.edu ulukus@umd.edu

Abstract

We study a special class of diamond channels which was introduced by Schein in 2001. In this special class, each diamond channel consists of a transmitter, a noisy relay, a noiseless relay and a receiver. We prove the capacity of this class of diamond channels by providing an achievable scheme and a converse. The capacity we show is strictly smaller than the cut-set bound. Our result also shows the optimality of a combination of decode-and-forward (DAF) and compress-and-forward (CAF) at the noisy relay node. This is the first example where a combination of DAF and CAF is shown to be capacity achieving. Finally, we note that there exists a duality between this diamond channel coding problem and the Kaspi-Berger source coding problem.

1 Problem Statement and the Result

The diamond channel was first introduced by Schein in 2001 [1]. The diamond channel consists of one transmitter, two relays and a receiver, where the transmitter and the two relays form a broadcast channel as the first stage and the two relays and the receiver form a multiple access channel as the second stage. The capacity of the diamond channel in its most general form is open. Schein explored several special cases of the diamond channel, one of which [1, Section 3.5] is specified as follows (see Figure 1). The multiple access channel consists of two orthogonal links with rate constraints $R_{1}$ and $R_{2}$ , respectively. The broadcast channel contains a noisy branch and a noiseless branch, i.e., with input $X$ and two outputs $X$ and $Y$ . We refer to the relay node receiving $Y$ as the noisy relay and the relay node receiving $X$ as the noiseless relay. Schein provided two achievable schemes for this class of diamond channels. In this paper, we will prove the capacity of this special class of diamond channels.

The formal definition of the problem is as follows. Consider a channel with input alphabet $\mathcal{X}$ and output alphabet $\mathcal{Y}$ , which is characterized by the transition probability $p(y|x)$ . Assume an $n$ -length block code consisting of $(f,g,h,\varphi)$ where

$f:\{1,2,\dots,M\}\mapsto\mathcal{X}^{n}$ (1)

$g:\mathcal{Y}^{n}\mapsto\{1,2,\dots,|g|\}$ (2)

$h:\{1,2,\dots,M\}\mapsto\{1,2,\dots,|h|\}$ (3)

$\varphi:\{1,2,\dots,|g|\}\times\{1,2,\dots,|h|\}\mapsto\{1,2,\dots,M\}$ (4)

Here $f$ denotes the encoding function at the transmitter, $g$ and $h$ denote the processing functions at the noisy and noiseless relays, respectively, and $\varphi$ denotes the decoding function at the receiver.

The encoder sends $x^{n}=f(m)$ into the channel, where $m\in\{1,2,\dots,M\}$ . The decoder reconstructs $\hat{m}=\varphi(g(Y^{n}),h(m))$ . The average probability of error is defined as

$P_{e}\triangleq\frac{1}{M}\sum_{m=1}^{M}Pr(\hat{m}\neq m|m\text{ is sent})$ (5)

The rate triple $(R,R_{1},R_{2})$ is achievable if for every $0<\epsilon<1$ , $\eta>0$ and every sufficiently large $n$ , there exists an $n$ -length block code $(f,g,h,\varphi)$ , such that $P_{e}\leq\epsilon$ and

$\frac{1}{n}\ln M\geq R-\eta$ (6)

$\frac{1}{n}\ln|g|\leq R_{1}+\eta$ (7)

$\frac{1}{n}\ln|h|\leq R_{2}+\eta$ (8)

Figure 1: The diamond channel.
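As a minimal illustration of definitions (1)-(5), the following Python sketch builds a toy code $(f,g,h,\varphi)$ in which the whole message is routed through the noiseless relay, so that $|g|=1$ and $|h|=M$. The binary symmetric channel, the binary index encoding and the helper names `make_trivial_code`, `bsc` and `average_error` are assumptions made for this illustration only; they do not appear in the paper.

```python
import random

def make_trivial_code(M):
    """Toy (f, g, h, phi): the message travels only via the noiseless relay."""
    # Block length: enough bits to index M messages (illustrative choice).
    n = max(1, (M - 1).bit_length())

    def f(m):
        # Transmitter: message m in {1..M} -> binary codeword x^n.
        return tuple(((m - 1) >> k) & 1 for k in range(n))

    def g(y):
        # Noisy relay: constant output, so |g| = 1.
        return 1

    def h(m):
        # Noiseless relay: forwards the message index, so |h| = M.
        return m

    def phi(g_out, h_out):
        # Receiver: trusts the noiseless relay.
        return h_out

    return n, f, g, h, phi

def bsc(x, p, rng):
    """Hypothetical noisy branch: binary symmetric channel with crossover p."""
    return tuple(b ^ (rng.random() < p) for b in x)

def average_error(M, p, trials=200, seed=0):
    """Monte Carlo estimate of the average error probability in (5)."""
    rng = random.Random(seed)
    n, f, g, h, phi = make_trivial_code(M)
    errors = 0
    for m in range(1, M + 1):
        for _ in range(trials):
            y = bsc(f(m), p, rng)
            errors += (phi(g(y), h(m)) != m)
    return errors / (M * trials)
```

In this toy code the error probability is zero regardless of the noise level, at the cost of $\frac{1}{n}\ln|h|=\frac{1}{n}\ln M$; the interesting regime is the one in which the noiseless relay's link is too small to carry the whole message.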

The following theorem characterizes the capacity of the class of diamond channels considered in this paper.

Theorem 1

The rate triple $(R,R_{1},R_{2})$ is achievable in the above channel if and only if the following conditions are satisfied

$R\leq I(U;Y)+H(X|U)$ (9)

$R_{1}\geq I(Z;Y|U,X)$ (10)

$R_{2}\geq H(X|Z,U)$ (11)

$R_{1}+R_{2}\geq R+I(Y;Z|X,U)$ (12)

for some joint distribution

$p(u,z,x,y)=p(u,x)p(y|x)p(z|u,y)$ (13)

with cardinalities of alphabets satisfying

$|\mathcal{U}|\leq|\mathcal{X}|+4$ (14)

$|\mathcal{Z}|\leq|\mathcal{U}||\mathcal{Y}|+3\leq|\mathcal{X}||\mathcal{Y}|+4|\mathcal{Y}|+3$ (15)
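To make the conditions of Theorem 1 concrete, the following Python sketch evaluates the right-hand sides of (9)-(12), in nats, for a finite-alphabet distribution that factors as in (13). The function names, the dictionary layout of the distributions, and the example (uniform binary $X$, a binary symmetric channel with crossover probability `q`, constant $U$ and $Z=Y$) are assumptions chosen for illustration.

```python
from math import log
from collections import defaultdict

def joint(p_ux, p_y_given_x, p_z_given_uy):
    """Build p(u,x,y,z) = p(u,x) p(y|x) p(z|u,y), the factorization in (13)."""
    p = defaultdict(float)
    for (u, x), pux in p_ux.items():
        for y, pyx in p_y_given_x[x].items():
            for z, pz in p_z_given_uy[(u, y)].items():
                p[(u, x, y, z)] += pux * pyx * pz
    return dict(p)

def H(p, idx):
    """Entropy (nats) of the coordinates listed in idx."""
    marg = defaultdict(float)
    for k, v in p.items():
        marg[tuple(k[i] for i in idx)] += v
    return -sum(v * log(v) for v in marg.values() if v > 0)

def cond_H(p, a, b):
    """H(A|B)."""
    return H(p, a + b) - H(p, b)

def cond_I(p, a, b, c=()):
    """I(A;B|C)."""
    return H(p, a + c) + H(p, b + c) - H(p, a + b + c) - H(p, c)

# Coordinate positions in the tuples (u, x, y, z).
U, X, Y, Z = (0,), (1,), (2,), (3,)

def theorem1_bounds(p):
    """Right-hand sides of (9)-(12); (12) reads R1 + R2 >= R + sum_slack."""
    return {
        "R_max": cond_I(p, U, Y) + cond_H(p, X, U),
        "R1_min": cond_I(p, Z, Y, U + X),
        "R2_min": cond_H(p, X, Z + U),
        "sum_slack": cond_I(p, Y, Z, X + U),
    }

def forwarding_example(q):
    """Hypothetical choice: uniform X, BSC(q), constant U, and Z = Y."""
    p_ux = {(0, 0): 0.5, (0, 1): 0.5}
    p_y_given_x = {0: {0: 1 - q, 1: q}, 1: {0: q, 1: 1 - q}}
    p_z_given_uy = {(0, 0): {0: 1.0}, (0, 1): {1: 1.0}}
    return joint(p_ux, p_y_given_x, p_z_given_uy)
```

Calling `theorem1_bounds(forwarding_example(0.1))` returns the four quantities for this particular choice of $(U,Z)$ only; characterizing the full region requires a search over all admissible distributions, which this sketch does not attempt.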

2 The Achievability

Assume a given joint distribution

$p(u,z,x,y)=p(u,x)p(y|x)p(z|u,y)$ (16)

and consider that the information theoretic quantities on the right hand sides of (9), (10), (11) and (12) are evaluated with this fixed joint probability distribution.

Consider a message $W$ with rate $R$ . If $R\leq H(X|Z,U)$ , reliable transmission can be achieved by letting $g(Y^{n})=\phi$ (constant) and $h(W)=W$ , i.e., by sending the message through the noiseless relay. Thus, we will only consider the case where

$H(X|Z,U)<R\leq I(U;Y)+H(X|U)$ (17)

We will show that the message can be reliably transmitted with a pair of functions $(g,h)$ such that $(\frac{1}{n}\ln|g|,\frac{1}{n}\ln|h|)$ lies in the inverse pentagon with corners $a$ and $b$ in Figure 2. (By “inverse pentagon” with corner points $a$ and $b$, we mean the region in the $(R_{1},R_{2})$ space that lies to the “north-east” of the line segment $[a,b]$; more specifically, this is the region described by the inequalities in (10), (11) and (12).) In fact, we prove reliable transmission with $(\frac{1}{n}\ln|g|,\frac{1}{n}\ln|h|)$ lying in the inverse pentagon with corners $a^{\prime}$ and $b^{\prime}$, which contains the inverse pentagon with corners $a$ and $b$; this is a stronger statement. Reliable transmission with the rate pair at point $b^{\prime}$ is straightforward: let $g(Y^{n})=\phi$ (constant) and $h(W)=W$. Thus, it remains to prove that reliable transmission is possible with the rate pair at point $a^{\prime}$, i.e.,

$R_{1}=I(U;Y)+I(Y;Z|U)$ (18)

$R_{2}=R-I(U;Y)-I(X;Z|U)$ (19)

Let us assume that the message $W$ is decomposed as $W=(W_{a},W_{b},W_{c})$ . For a positive number $\epsilon$ , let us define

$M_{a}\triangleq|W_{a}|=\exp(n(I(U;Y)-3\epsilon))$ (20)

$M_{b}\triangleq|W_{b}|=\frac{M}{M_{a}M_{c}}=\exp(\ln M-n(I(U;Y)+I(X;Z|U)-6\epsilon))$ (21)

$M_{c}\triangleq|W_{c}|=\exp(n(I(X;Z|U)-3\epsilon))$ (22)

Random codebook generation: We use a superposition code structure. The size of the inner code is $M_{a}$. For each inner codeword, we independently generate $M_{b}$ outer codes. The size of each outer code is $M_{c}$.

• Independently generate $M_{a}$ sequences, $u^{n}(1),u^{n}(2),\dots,u^{n}(M_{a})$, according to $\prod_{i=1}^{n}p(u_{i})$, where $p(u_{i})=p(u)$ for $i=1,2,\dots,n$.

• For $u^{n}(j)$, $j=1,2,\dots,M_{a}$, independently generate $M_{b}$ codebooks, $\mathcal{C}(j,1),\mathcal{C}(j,2),\dots,\mathcal{C}(j,M_{b})$.

• In the codebook $\mathcal{C}(j,k)$, $j=1,2,\dots,M_{a}$, $k=1,2,\dots,M_{b}$, independently generate $M_{c}$ codewords $x^{n}(j,k,1),x^{n}(j,k,2),\dots,x^{n}(j,k,M_{c})$ according to $\prod_{i=1}^{n}p(x_{i}|U_{i}=u_{i}(j))$, where $p(x_{i}|U_{i}=u_{i}(j))=p(x|u)$ for $i=1,2,\dots,n$.
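The three generation steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the dictionary representation of $p(u)$ and $p(x|u)$, and the 0-based indices are not from the paper, and the sizes used in practice would be tiny compared with the exponential sizes in (20)-(22).

```python
import random

def sample(pmf, rng):
    """Draw one symbol from a pmf given as {symbol: probability}."""
    symbols = list(pmf)
    return rng.choices(symbols, weights=[pmf[s] for s in symbols])[0]

def generate_codebooks(n, Ma, Mb, Mc, p_u, p_x_given_u, seed=0):
    """Superposition codebooks: inner[j] = u^n(j), outer[(j, k)] = C(j, k)."""
    rng = random.Random(seed)
    # Inner code: Ma sequences drawn i.i.d. according to p(u).
    inner = [tuple(sample(p_u, rng) for _ in range(n)) for _ in range(Ma)]
    # Outer codes: for each u^n(j), Mb codebooks of Mc codewords each,
    # with symbol i drawn according to p(x | u_i(j)).
    outer = {}
    for j, u_seq in enumerate(inner):
        for k in range(Mb):
            outer[(j, k)] = [
                tuple(sample(p_x_given_u[u], rng) for u in u_seq)
                for _ in range(Mc)
            ]
    return inner, outer

def encode(outer, wa, wb, wc):
    """Transmitter map f: (W_a, W_b, W_c) -> x^n(W_a, W_b, W_c), 0-based."""
    return outer[(wa, wb)][wc]
```

For example, `generate_codebooks(5, 2, 3, 4, {0: 0.5, 1: 0.5}, {0: {0: 0.9, 1: 0.1}, 1: {0: 0.1, 1: 0.9}})` produces two inner codewords and six outer codebooks of four codewords each.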

There will be no overlapping codebooks with high probability when $n$ is sufficiently large, because

$\frac{1}{n}\ln M_{b}M_{c}<H(X|U)$ (23)

Encoding at the transmitter: Let $W=(W_{a},W_{b},W_{c})$ be the message. We send codeword $X^{n}=f(W_{a},W_{b},W_{c})\triangleq x^{n}(W_{a},W_{b},W_{c})$ into the channel.

Processing at the noisy relay: First, after having received $Y^{n}$ , seek

$\hat{U}^{n}=u^{n}(\hat{W}_{a})\in\{u^{n}(1),u^{n}(2),\dots,u^{n}(M_{a})\}$ (24)

such that

$(\hat{U}^{n},Y^{n})\in\mathcal{T}_{[UY]}^{n}$ (25)

where the definition of the strongly typical set can be found in [2, Section 1.2]. If no such $\hat{U}^{n}$ exists, then let $\hat{U}^{n}$ be an arbitrary sequence in $\{u^{n}(1),u^{n}(2),\dots,u^{n}(M_{a})\}$. Secondly, construct a conditional rate distortion code according to $\prod_{i=1}^{n}p(z_{i},y_{i}|\hat{u}_{i})$ with encoding function $g^{\prime}(Y^{n},\hat{U}^{n})$ and $|g^{\prime}|=L=\exp(n(I(Y;Z|U)+\tau))$. Finally, send $\hat{U}^{n}$ and $Z^{n}\triangleq g^{\prime}(Y^{n},\hat{U}^{n})$ to the destination, i.e.,

$g(Y^{n})=(\hat{U}^{n},Z^{n})$ (26)

where

$|g|=M_{a}\times L\leq\exp(n(I(U;Y)+I(Y;Z|U)+\tau-3\epsilon))$ (27)
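The first step at the noisy relay, the search in (24)-(25), can be sketched as below. The typicality test uses one common form of strong typicality (empirical frequencies within `delta` of the pmf, and no occurrence of zero-probability symbols); the function names, the parameter `delta`, and the tie-breaking rule of returning the first match are assumptions for illustration.

```python
from collections import Counter

def is_typical(symbols, pmf, delta):
    """Strong typicality check of a sequence against a pmf {symbol: prob}."""
    n = len(symbols)
    counts = Counter(symbols)
    # Symbols of zero probability must not occur at all.
    if any(pmf.get(s, 0.0) == 0.0 for s in counts):
        return False
    # Every empirical frequency must be within delta of its probability.
    return all(abs(counts.get(s, 0) / n - prob) <= delta
               for s, prob in pmf.items())

def noisy_relay_decode(y_seq, inner, p_uy, delta):
    """Index of the first inner codeword jointly typical with y^n.

    Falls back to index 0 (an arbitrary choice) if none is found.
    """
    for j, u_seq in enumerate(inner):
        if is_typical(list(zip(u_seq, y_seq)), p_uy, delta):
            return j
    return 0
```

The conditional rate distortion step that produces $Z^{n}$ is not sketched here, since the paper does not specify a particular construction.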

Processing at the noiseless relay: Let $h(f(W_{a},W_{b},W_{c}))=W_{b}$ where

$|h|=M_{b}=\exp(\ln M-n(I(U;Y)+I(X;Z|U)-6\epsilon))$ (28)
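As a quick consistency check, the normalized relay output sizes can be worked out directly from the definitions of $M_{a}$ and $M_{c}$ in (20) and (22), the identity $M_{b}=M/(M_{a}M_{c})$, and $L=\exp(n(I(Y;Z|U)+\tau))$. The derivation below is a sketch that ignores integer rounding of the codebook sizes.

```latex
\begin{aligned}
\frac{1}{n}\ln|g| &= \frac{1}{n}\ln M_{a}+\frac{1}{n}\ln L
  = I(U;Y)+I(Y;Z|U)+\tau-3\epsilon,\\
\frac{1}{n}\ln|h| &= \frac{1}{n}\ln M-\frac{1}{n}\ln M_{a}-\frac{1}{n}\ln M_{c}
  = \frac{1}{n}\ln M-I(U;Y)-I(X;Z|U)+6\epsilon .
\end{aligned}
```

As $\epsilon$ and $\tau$ tend to zero and $\frac{1}{n}\ln M$ approaches $R$, these two quantities approach the coordinates of point $a^{\prime}$ in (18) and (19).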

Decoding: The decoder collects $(\hat{U}^{n},Z^{n})$ from the noisy relay and $W_{b}$ from the noiseless relay. It then seeks a codeword $x^{n}(\hat{W}_{a},W_{b},i)$ from the codebook $\mathcal{C}(\hat{W}_{a},W_{b})$ such that

$(x^{n}(\hat{W}_{a},W_{b},i),Z^{n})\in\mathcal{T}_{[XZ|U]}^{n}(\hat{U}^{n})$ (29)

Probability of error: An error occurs when $(\hat{U}^{n},\hat{X}^{n})\neq(U^{n},X^{n})$. The average probability of error can be decomposed into

$Pr(E)\leq Pr(E_{1}\cup E_{2}\cup E_{3})=Pr(E_{1})+Pr(E_{2}\cap E_{1}^{c})+Pr(E_{3}\cap E_{1}^{c}\cap E_{2}^{c})$ (30)

where

$E_{1}\triangleq\{(U^{n},X^{n},Y^{n},Z^{n})\notin\mathcal{T}_{[UXYZ]}^{n}\}$ (31)

$E_{2}\triangleq\bigcup_{\bar{u}^{n}\neq U^{n},\,\bar{u}^{n}\in\{u^{n}(1),u^{n}(2),\dots,u^{n}(M_{a})\}}\{(\bar{u}^{n},Y^{n})\in\mathcal{T}_{[UY]}^{n}\}$ (32)

$E_{3}\triangleq\bigcup_{\bar{x}^{n}\neq X^{n},\,\bar{x}^{n}\in\mathcal{C}(W_{a},W_{b})}\{(\bar{x}^{n},Z^{n})\in\mathcal{T}_{[XZ|U]}^{n}(U^{n})\}$ (33)

We note that

$Pr(E_{1})\leq Pr(U^{n}\notin\mathcal{T}_{[U]}^{n})+Pr((Y^{n},Z^{n})\notin\mathcal{T}_{[YZ|U]}^{n}(U^{n}))+Pr(X^{n}\notin\mathcal{T}_{[X|YZU]}^{n}(Y^{n},Z^{n},U^{n}))$ (34)

where

• $U^{n}$ is generated in an i.i.d. fashion with probability $p(u)$. Thus, when $n$ is sufficiently large, we have

$Pr(U^{n}\notin\mathcal{T}_{[U]}^{n})\leq\epsilon$ (35)

• $Z^{n}$ is a conditional rate distortion code for $Y^{n}$ conditioned on $U^{n}$. Thus, when $n$ is sufficiently large, $L=\exp(n(I(Y;Z|U)+\tau))$, and $U^{n}\in\mathcal{T}_{[U]}^{n}$, we have

$Pr((Y^{n},Z^{n})\notin\mathcal{T}_{[YZ|U]}^{n}(U^{n}))\leq\epsilon$ (36)

• $X^{n}$ can be viewed as being generated according to an i.i.d. conditional probability $p(x|u,y)$ with respect to $(U^{n},Y^{n})$. Thus, when $n$ is sufficiently large and $(Y^{n},Z^{n},U^{n})\in\mathcal{T}_{[YZU]}^{n}$, we have

$Pr(X^{n}\notin\mathcal{T}_{[X|YZU]}^{n}(Y^{n},Z^{n},U^{n}))\leq\epsilon$ (37)

From the above calculation, we have

$Pr(E_{1})=Pr((U^{n},X^{n},Y^{n},Z^{n})\notin\mathcal{T}_{[UXYZ]}^{n})\leq 3\epsilon$ (38)

For the second error event, we note that $M_{a}=\exp(n(I(U;Y)-3\epsilon))$ and

$\displaystyle\begin{aligned}
Pr(E_{2}\cap E_{1}^{c})&=Pr\left(\bigcup_{\bar{u}^{n}\neq U^{n},\,\bar{u}^{n}\in\{u^{n}(1),u^{n}(2),\dots,u^{n}(M_{a})\}}(\bar{u}^{n},Y^{n})\in\mathcal{T}_{[UY]}^{n}\,\Big|\,Y^{n}\in\mathcal{T}_{[Y]}^{n}\right)\\
&\leq\sum_{i=1}^{M_{a}}Pr((u^{n}(i),Y^{n})\in\mathcal{T}_{[UY]}^{n}|Y^{n}\in\mathcal{T}_{[Y]}^{n})\\
&\leq M_{a}Pr(u^{n}(i)\in\mathcal{T}_{[U|Y]}^{n}(Y^{n}))\\
&\leq M_{a}\exp(-nH(U)+n\epsilon)\exp(nH(U|Y)+n\epsilon)\\
&=\exp(-n\epsilon)\\
&\leq\epsilon
\end{aligned}$ (39)

for sufficiently large $n$. We note that $M_{c}=\exp(n(I(X;Z|U)-3\epsilon))$; then

$\displaystyle\begin{aligned}
Pr(E_{3}\cap E_{1}^{c})&=Pr\left(\bigcup_{\bar{x}^{n}\neq X^{n},\,\bar{x}^{n}\in\mathcal{C}(W_{a},W_{b})}(\bar{x}^{n},Z^{n})\in\mathcal{T}_{[XZ|U]}^{n}(U^{n})\,\Big|\,(Z^{n},U^{n})\in\mathcal{T}_{[ZU]}^{n}\right)\\
&\leq\sum_{i=1}^{M_{c}}Pr((x^{n}(W_{a},W_{b},i),Z^{n})\in\mathcal{T}_{[XZ|U]}^{n}(U^{n})|(Z^{n},U^{n})\in\mathcal{T}_{[ZU]}^{n})\\
&\leq M_{c}Pr(x^{n}(W_{a},W_{b},i)\in\mathcal{T}_{[X|ZU]}^{n}(Z^{n},U^{n}))\\
&\leq M_{c}\exp(-nH(X|U)+n\epsilon)\exp(nH(X|Z,U)+n\epsilon)\\
&=\exp(-n\epsilon)\\
&\leq\epsilon
\end{aligned}$ (40)

for sufficiently large $n$. Thus, the average probability of error is upper bounded as

$Pr(E)\leq 3\epsilon+\epsilon+\epsilon=5\epsilon$ (41)

which can be made arbitrarily small, since $\epsilon>0$ is arbitrary and $n$ can be taken sufficiently large.

3 The Converse

Define $Z_{i}\triangleq g$ and $U_{i}\triangleq(Y^{i-1},X_{i+1}^{n})$ . We note that

$p(u_{i},x_{i},y_{i},z_{i})=p(u_{i},x_{i})p(y_{i}|x_{i})p(z_{i}|y_{i},u_{i})$ (42)

We have

$\displaystyle\begin{aligned}
\ln M&=H(X^{n})\\
&=\sum_{i=1}^{n}H(X_{i}|X_{i+1}^{n})\\
&\leq\sum_{i=1}^{n}\left[I(Y^{i-1};Y_{i})+H(X_{i}|X_{i+1}^{n})\right]\\
&=\sum_{i=1}^{n}\left[I(Y^{i-1},X_{i+1}^{n};Y_{i})-I(X_{i+1}^{n};Y_{i}|Y^{i-1})+H(X_{i}|Y^{i-1},X_{i+1}^{n})+I(Y^{i-1};X_{i}|X_{i+1}^{n})\right]\\
&\overset{1}{=}\sum_{i=1}^{n}\left[I(Y^{i-1},X_{i+1}^{n};Y_{i})+H(X_{i}|Y^{i-1},X_{i+1}^{n})\right]\\
&=\sum_{i=1}^{n}\left[I(U_{i};Y_{i})+H(X_{i}|U_{i})\right]
\end{aligned}$ (43)

where

1. Because of the following equality [3, Lemma 7]:

$\sum_{i=1}^{n}I(X_{i+1}^{n};Y_{i}|Y^{i-1})=\sum_{i=1}^{n}I(Y^{i-1};X_{i}|X_{i+1}^{n})$ (44)
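The equality in (44) holds for any joint distribution of $(X^{n},Y^{n})$, which can be checked numerically on a small example. The sketch below does so for a randomly generated joint pmf over binary sequences; the function names, the coordinate layout and the random distribution are assumptions made purely for this check.

```python
import random
from math import log
from itertools import product
from collections import defaultdict

def entropy(p, idx):
    """Entropy (nats) of the coordinates listed in idx."""
    marg = defaultdict(float)
    for k, v in p.items():
        marg[tuple(k[i] for i in idx)] += v
    return -sum(v * log(v) for v in marg.values() if v > 0)

def cond_mi(p, a, b, c):
    """I(A;B|C) in nats; a, b, c are tuples of coordinate indices."""
    return (entropy(p, a + c) + entropy(p, b + c)
            - entropy(p, a + b + c) - entropy(p, c))

def csiszar_sums(p, n):
    """Both sides of (44).

    Coordinates 0..n-1 hold X_1..X_n; coordinates n..2n-1 hold Y_1..Y_n.
    """
    lhs = rhs = 0.0
    for i in range(n):  # 0-based i corresponds to index i+1 in the paper
        x_future = tuple(range(i + 1, n))   # X_{i+1}^n in the paper's indexing
        y_past = tuple(range(n, n + i))     # Y^{i-1} in the paper's indexing
        lhs += cond_mi(p, x_future, (n + i,), y_past)
        rhs += cond_mi(p, y_past, (i,), x_future)
    return lhs, rhs

def random_joint(n, seed=0):
    """Random joint pmf over binary (X^n, Y^n), for testing only."""
    rng = random.Random(seed)
    keys = list(product((0, 1), repeat=2 * n))
    weights = [rng.random() for _ in keys]
    total = sum(weights)
    return {k: w / total for k, w in zip(keys, weights)}
```

For instance, `csiszar_sums(random_joint(3), 3)` returns two numbers that agree up to floating-point error.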

We have

$\displaystyle\begin{aligned}
\ln|g|&\geq H(g)\\
&\geq H(g|h)\\
&\geq H(g|h)-H(g|h,Y^{n})\\
&=I(g;Y^{n}|h)\\
&=\sum_{i=1}^{n}I(g;Y_{i}|h,Y^{i-1})\\
&=\sum_{i=1}^{n}\left[I(g,X_{i+1}^{n};Y_{i}|h,Y^{i-1})-I(X_{i+1}^{n};Y_{i}|g,h,Y^{i-1})\right]\\
&\overset{1}{=}\sum_{i=1}^{n}\left[I(g,X_{i+1}^{n};Y_{i}|h,Y^{i-1})-I(Y^{i-1};X_{i}|g,h,X_{i+1}^{n})\right]\\
&\geq\sum_{i=1}^{n}\left[I(g,X_{i+1}^{n};Y_{i}|h,Y^{i-1})-H(X_{i}|g,h,X_{i+1}^{n})\right]\\
&=-H(X^{n}|g,h)+\sum_{i=1}^{n}I(g,X_{i+1}^{n};Y_{i}|h,Y^{i-1})\\
&\overset{2}{\geq}\sum_{i=1}^{n}\left[I(g,X_{i+1}^{n};Y_{i}|h,Y^{i-1})-\epsilon\right]\\
&\geq\sum_{i=1}^{n}\left[I(g;Y_{i}|h,Y^{i-1},X_{i+1}^{n})-\epsilon\right]\\
&\overset{3}{\geq}\sum_{i=1}^{n}\left[I(g;Y_{i}|h,Y^{i-1},X_{i+1}^{n},X_{i})-\epsilon\right]\\
&\overset{4}{=}\sum_{i=1}^{n}\left[I(g;Y_{i}|Y^{i-1},X_{i+1}^{n},X_{i})-\epsilon\right]\\
&=\sum_{i=1}^{n}\left[I(Z_{i};Y_{i}|U_{i},X_{i})-\epsilon\right]
\end{aligned}$ (45)

where

1. Because of the following equality [3, Lemma 7]:

$\sum_{i=1}^{n}I(X_{i+1}^{n};Y_{i}|g,h,Y^{i-1})=\sum_{i=1}^{n}I(Y^{i-1};X_{i}|g,h,X_{i+1}^{n})$ (46)

2. Due to Fano’s inequality.

3. $g$ is a deterministic function of $Y^{n}$. Due to the memoryless property, we have

$H(g|Y_{i},h,Y^{i-1},X_{i+1}^{n},X_{i})=H(g|Y_{i},h,Y^{i-1},X_{i+1}^{n})$ (47)

4. $g$ is a deterministic function of $Y^{n}$ and $h$ is a deterministic function of $X^{n}$. Due to the memoryless property, we have

$H(g|h,Y^{i-1},X_{i+1}^{n},X_{i})=H(g|Y^{i-1},X_{i+1}^{n},X_{i})$ (48)

$H(g|h,Y^{i-1},X_{i+1}^{n},X_{i},Y_{i})=H(g|Y^{i-1},X_{i+1}^{n},X_{i},Y_{i})$ (49)

We have

$\displaystyle\begin{aligned}
\ln|h|&\geq H(h|g)\\
&\geq I(h;X^{n}|g)\\
&=H(X^{n}|g)-H(X^{n}|g,h)\\
&\overset{1}{\geq}H(X^{n}|g)-n\epsilon\\
&=\sum_{i=1}^{n}\left[H(X_{i}|X_{i+1}^{n},g)-\epsilon\right]\\
&\geq\sum_{i=1}^{n}\left[H(X_{i}|Y^{i-1},X_{i+1}^{n},g)-\epsilon\right]\\
&=\sum_{i=1}^{n}\left[H(X_{i}|U_{i},Z_{i})-\epsilon\right]
\end{aligned}$ (50)

where

1. Due to Fano’s inequality.
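For reference, one standard form of the Fano step used above is written out below, in nats. It is a sketch under the assumption that the encoder $f$ is one-to-one, so that $H(X^{n}|g,h)$ equals the conditional entropy of the message (consistent with $\ln M=H(X^{n})$ in (43)); the symbol $\epsilon_{n}$ is introduced here only for illustration and plays the role of $\epsilon$ in the derivations.

```latex
H(X^{n}|g,h)\;\leq\;H(X^{n}|\hat{m})\;\leq\;\ln 2+P_{e}\ln M\;\triangleq\;n\epsilon_{n},
\qquad
\epsilon_{n}=\frac{\ln 2}{n}+P_{e}\,\frac{\ln M}{n}.
```

The first inequality holds because $\hat{m}=\varphi(g,h)$ is a function of $(g,h)$. When $P_{e}\rightarrow 0$ and $\frac{1}{n}\ln M$ stays bounded, $\epsilon_{n}\rightarrow 0$ as $n\rightarrow\infty$.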

We have

$\displaystyle\begin{aligned}
\ln|g|+\ln|h|&\geq H(g,h)\\
&\geq I(g,h;X^{n},Y^{n})\\
&\geq I(X^{n};g,h)+I(Y^{n};g,h|X^{n})\\
&=H(X^{n})-H(X^{n}|g,h)+I(Y^{n};g,h|X^{n})\\
&\overset{1}{\geq}\ln M-n\epsilon+I(Y^{n};g,h|X^{n})\\
&\overset{2}{=}\ln M-n\epsilon+I(Y^{n};g|X^{n})\\
&=\ln M+\sum_{i=1}^{n}\left[-\epsilon+I(Y_{i};g|X^{n},Y^{i-1})\right]\\
&\overset{3}{=}\ln M+\sum_{i=1}^{n}\left[-\epsilon+I(Y_{i};g|X_{i},Y^{i-1},X_{i+1}^{n})\right]\\
&=\ln M+\sum_{i=1}^{n}\left[-\epsilon+I(Y_{i};Z_{i}|X_{i},U_{i})\right]
\end{aligned}$ (51)

where

1. Due to Fano’s inequality.

2. $h$ is a deterministic function of $X^{n}$.

3. $g$ is a deterministic function of $Y^{n}$. Due to the memoryless property, we have

$H(g|X_{i},Y^{i-1},X_{i+1}^{n},X^{i-1})=H(g|X_{i},Y^{i-1},X_{i+1}^{n})$ (52)

$H(g|Y_{i},X_{i},Y^{i-1},X_{i+1}^{n},X^{i-1})=H(g|Y_{i},X_{i},Y^{i-1},X_{i+1}^{n})$ (53)

We note that $\frac{1}{n}\ln M\geq R-\eta$, $\frac{1}{n}\ln|g|\leq R_{1}+\eta$ and $\frac{1}{n}\ln|h|\leq R_{2}+\eta$, for an arbitrary $\eta>0$. Letting $\epsilon\rightarrow 0$, from (43), (45), (50) and (51), we have

$R\leq\frac{1}{n}\sum_{i=1}^{n}\left[I(U_{i};Y_{i})+H(X_{i}|U_{i})\right]$ (54)

$R_{1}\geq\frac{1}{n}\sum_{i=1}^{n}I(Z_{i};Y_{i}|U_{i},X_{i})$ (55)

$R_{2}\geq\frac{1}{n}\sum_{i=1}^{n}H(X_{i}|U_{i},Z_{i})$ (56)

$R_{1}+R_{2}\geq R+\frac{1}{n}\sum_{i=1}^{n}I(Y_{i};Z_{i}|X_{i},U_{i})$ (57)

Define a time-sharing random variable $Q$ , which is uniformly distributed on $\{1,2,\dots,n\}$ . Also define a set of random variables $(X,Y,\tilde{U},\tilde{Z})$ such that

$Pr(X=x,Y=y,\tilde{U}=u,\tilde{Z}=z|Q=i)=p(X_{i}=x,Y_{i}=y,U_{i}=u,Z_{i}=z),\quad i=1,2,\dots,n$ (58)

Define $U=(\tilde{U},Q)$ and $Z=(\tilde{Z},Q)$ , then

$\displaystyle\begin{aligned}
R&\leq\frac{1}{n}\sum_{i=1}^{n}\left[I(U_{i};Y_{i})+H(X_{i}|U_{i})\right]\\
&=I(\tilde{U};Y|Q)+H(X|\tilde{U},Q)\\
&\leq I(\tilde{U},Q;Y)+H(X|\tilde{U},Q)\\
&=I(U;Y)+H(X|U)
\end{aligned}$ (59)

$\displaystyle\begin{aligned}
R_{1}&\geq\frac{1}{n}\sum_{i=1}^{n}I(Z_{i};Y_{i}|U_{i},X_{i})\\
&=I(\tilde{Z};Y|\tilde{U},Q,X)\\
&=I(Z;Y|U,X)
\end{aligned}$ (60)

$\displaystyle\begin{aligned}
R_{2}&\geq\frac{1}{n}\sum_{i=1}^{n}H(X_{i}|U_{i},Z_{i})\\
&=H(X|\tilde{U},\tilde{Z},Q)\\
&=H(X|U,Z)
\end{aligned}$ (61)

$\displaystyle\begin{aligned}
R_{1}+R_{2}&\geq R+\frac{1}{n}\sum_{i=1}^{n}I(Y_{i};Z_{i}|X_{i},U_{i})\\
&=R+I(\tilde{Z};Y|\tilde{U},X,Q)\\
&=R+I(Z;Y|U,X)
\end{aligned}$ (62)

where (59), (60), (61) and (62) are the same as (9), (10), (11) and (12), concluding the proof.

Finally, we note that the bounds on the cardinalities of the alphabets in (14) and (15) can be proven in a way similar to [4, Appendix D].

4 Remarks

We have several remarks regarding this result as follows:

1. The capacity is strictly smaller than the cut-set bound [5]. First, we note that

$R\leq R_{1}+R_{2}-I(Y;Z|U,X)$ (63)

An operational interpretation is that when the noisy relay cannot fully decode the message, or in other words, when the noisy relay cannot remove the noise completely, the data going through the link from the noisy relay to the receiver contains noise. Thus, the useful information flowing through the multiple access cut will be strictly less than $R_{1}+R_{2}$ . Secondly, we note that

$R\leq I(U;Y)+H(X|U)\leq H(X)$ (64)

An operational interpretation is that when the noisy relay decodes the message with a positive rate, the rate of information flowing through the broadcast cut becomes strictly less than $H(X)$ .

Consider the following example. Let $X$ and $Y$ be binary and

$Y=X\oplus W$ (65)

where the sum is a modulo-$2$ sum and $W$ has a Bernoulli distribution with entropy $0.5$ bits. We assume $R_{1}=R_{2}=0.5$ bits. The cut-set bound in this example is $1$ bit, which is not achievable. Indeed, if $R$ is equal to $1$ bit, we have

$R=I(U;Y)+H(X|U)=H(X)=1$ (66)

then, $U$ has to be independent of $X$ and $Y$ . Also, we have

$R=R_{1}+R_{2}-I(Y;Z|U,X)=R_{1}+R_{2}=1$ (67)

then, $Z$ has to be independent of $X$ and $Y$ if $U$ is independent of $X$ and $Y$ . However, if $U$ and $Z$ are independent of $X$ and $Y$ , we arrive at the following contradiction,

$0.5=R_{2}\geq H(X|Z,U)=H(X)=1$ (68)

which means that the cut-set bound is not achievable in this example. We note that, even in this binary example where $|\mathcal{X}|=|\mathcal{Y}|=2$, the cardinalities of the auxiliary random variables $U$ and $Z$ are only bounded as $|\mathcal{U}|\leq 6$ and $|\mathcal{Z}|\leq 15$. These large cardinality bounds make it practically impossible to evaluate the capacity of this diamond channel numerically. Nevertheless, even though we were not able to compute the exact value of the capacity in this example, we were able to conclude that the capacity is strictly less than the cut-set bound, which is $1$ bit.

We know that the capacity of a diamond channel with four orthogonal links is equal to the cut-set bound in this channel. Our result shows that introducing the broadcast node will reduce the capacity of this all-orthogonal diamond channel. Networks with broadcast nodes have been studied recently from different perspectives, e.g., information theory and network coding [6, 7, 8]. We note that our diamond channel model is a simple example of a general network with a broadcast node. Thus, we conclude that the cut-set bound in general is not tight in networks with broadcast nodes.
2. The processing at the noisy relay includes two operations: decoding the inner code $U^{n}$ and compressing the channel output $Y^{n}$ to $Z^{n}$ conditioned on $U^{n}$. This processing is essentially the same as Theorem $7$ in [9], i.e., a combination of DAF and CAF. DAF [9, Theorem 1] has been shown to be optimal in the degraded relay channel [9]. Partial DAF, a special case of [9, Theorem 7] without compression, has been shown to be optimal in the semi-deterministic relay channel [10] and the relay channel with an orthogonal transmitter-relay link [11]. Recently, CAF [9, Theorem 6] has been shown to be optimal in two special relay channels [12, 13]. To our knowledge, we are the first to show the optimality of the combination of DAF and CAF in a specific channel, even though the channel we consider is not a three-node relay channel in the strict sense, i.e., as in [9].

3. If we assume $R=H(X)-R_{0}$, then Theorem 1 can be rewritten as follows

$R\leq I(U;Y)+H(X|U)\quad\longleftrightarrow\quad R_{0}\geq I(U;X|Y)$ (69)

$R_{1}\geq I(Z;Y|U,X)\quad\longleftrightarrow\quad R_{1}\geq I(Z;Y|U,X)$ (70)

$R_{2}\geq H(X|Z,U)\quad\longleftrightarrow\quad R_{2}\geq I(X;X|Z,U)$ (71)

$R_{1}+R_{2}\geq R+I(Y;Z|X,U)\quad\longleftrightarrow\quad R_{0}+R_{1}+R_{2}\geq I(X,Y;U,X,Z)$ (72)

for some joint distribution

$p(u,z,x,y)=p(u,x)p(y|x)p(z|u,y)$ (73)

We note that the right hand sides of (69), (70), (71) and (72) in addition to the distribution constraint in (73) are the same as the rate region of the rate-distortion problem studied by Kaspi and Berger as shown in Figure 3 [4, Theorem 2.1, Case C].

This duality between our diamond channel coding problem and the Kaspi-Berger source coding problem is similar to the duality between the single-user channel coding problem and the Slepian-Wolf source coding problem [2, Section 3.1], obtained by viewing the codebook information in the channel coding problem as the information sent to all the terminals in the source coding problem, e.g., the information with rate $R_{0}$ in Figure 3. Thus, the achievability of our diamond channel coding problem can be obtained from the achievability of the Kaspi-Berger source coding problem, in the same way that the achievability of the multiple access channel coding problem can be obtained from the achievability of the fork network coding problem [2, Section 3.2].
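Returning to the binary example in the first remark, the numbers quoted there can be reproduced with a few lines of Python. The sketch below finds the crossover probability whose binary entropy is $0.5$ bits and lists the values of the four cuts of the network for a uniform input. The function names and the explicit cut expressions are our reading of the cut-set bound for this particular network and should be taken as assumptions of this illustration.

```python
from math import log2

def hb(p):
    """Binary entropy function in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def crossover_for_entropy(target, tol=1e-12):
    """Bisection for p in [0, 1/2] such that hb(p) = target."""
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if hb(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def cut_values(R1, R2, noise_entropy):
    """Cut values in bits for uniform binary X and Y = X xor W.

    Each key names the nodes on the transmitter side of the cut.
    """
    HX = 1.0                     # H(X) for uniform binary X
    IXY = 1.0 - noise_entropy    # I(X;Y) = H(Y) - H(W) with uniform Y
    return {
        "tx": HX,
        "tx + noisy relay": HX + R1,
        "tx + noiseless relay": IXY + R2,
        "tx + both relays": R1 + R2,
    }
```

With $R_{1}=R_{2}=0.5$ and noise entropy $0.5$ bits, the smallest cut value is $1$ bit, in agreement with the cut-set bound stated in the example; the crossover probability returned by `crossover_for_entropy(0.5)` is approximately $0.11$.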

References

[1] B. E. Schein. Distributed Coordination in Network Information Theory. PhD thesis, Massachusetts Institute of Technology, 2001.
[2] I. Csiszár and J. Körner. Information Theory: Coding Theorems for Discrete Memoryless Systems. Academic Press, 1981.
[3] I. Csiszár and J. Körner. Broadcast channels with confidential messages. IEEE Trans. Inform. Theory, 24(3):339–348, 1978.
[4] A. H. Kaspi and T. Berger. Rate-distortion for correlated sources with partially separated encoders. IEEE Trans. Inform. Theory, 28(6):828–840, 1982.
[5] T. M. Cover and J. A. Thomas. Elements of Information Theory. John Wiley and Sons, 1991.
[6] A. F. Dana, R. Gowaikar, R. Palanki, B. Hassibi, and M. Effros. Capacity of wireless erasure networks. IEEE Trans. Inform. Theory, 52(3):789–804, 2006.
[7] N. Ratnakar and G. Kramer. The multicast capacity of deterministic relay network with no interference. IEEE Trans. Inform. Theory, 52(6):2425–2432, 2006.
[8] G. Kramer, S. M. S. Tabatabaei Yazdi, and S. A. Savari. Network coding on line networks with broadcast. In Proc. Conf. Inf. Sciences and Systems (CISS), Princeton, NJ, Mar. 2008.
[9] T. M. Cover and A. El Gamal. Capacity theorems for the relay channel. IEEE Trans. Inform. Theory, 25:572–584, Sep. 1979.
[10] A. El Gamal and M. Aref. The capacity of the semideterministic relay channel. IEEE Trans. Inform. Theory, 28(3):536, 1982.
[11] A. El Gamal and S. Zahedi. Capacity of a class of relay channels with orthogonal components. IEEE Trans. Inform. Theory, 51(5):1815–1817, 2005.
[12] Y. H. Kim. Capacity of a class of deterministic relay channels. IEEE Trans. Inform. Theory, 53(3):1328–1329, 2008.
[13] M. Aleksic, P. Razaghi, and W. Yu. Capacity of a class of modulo-sum relay channels. Submitted to IEEE Transactions on Information Theory, 2007, http://arxiv.org/pdf/0704.3591.
