
Capacity of a Class of Diamond Channels

This work was supported by NSF Grants CCF 04-47613, CCF 05-14846, CNS 07-16311 and CCF 07-29127.

Wei Kang   Sennur Ulukus
Department of Electrical and Computer Engineering
University of Maryland, College Park, MD 20742
wkang@umd.edu   ulukus@umd.edu
Abstract

We study a special class of diamond channels which was introduced by Schein in 2001. In this special class, each diamond channel consists of a transmitter, a noisy relay, a noiseless relay and a receiver. We prove the capacity of this class of diamond channels by providing an achievable scheme and a converse. The capacity we show is strictly smaller than the cut-set bound. Our result also shows the optimality of a combination of decode-and-forward (DAF) and compress-and-forward (CAF) at the noisy relay node. This is the first example where a combination of DAF and CAF is shown to be capacity achieving. Finally, we note that there exists a duality between this diamond channel coding problem and the Kaspi-Berger source coding problem.

1 Problem Statement and the Result

The diamond channel was first introduced by Schein in 2001 [1]. The diamond channel consists of one transmitter, two relays and a receiver, where the transmitter and the two relays form a broadcast channel as the first stage, and the two relays and the receiver form a multiple access channel as the second stage. The capacity of the diamond channel in its most general form is open. Schein explored several special cases of the diamond channel, one of which [1, Section 3.5] is specified as follows (see Figure 1). The multiple access channel consists of two orthogonal links with rate constraints $R_{1}$ and $R_{2}$, respectively. The broadcast channel contains a noisy branch and a noiseless branch, i.e., it has input $X$ and two outputs, $X$ and $Y$. We refer to the relay node receiving $Y$ as the noisy relay and the relay node receiving $X$ as the noiseless relay. Schein provided two achievable schemes for this class of diamond channels. In this paper, we will prove the capacity of this special class of diamond channels.

The formal definition of the problem is as follows. Consider a channel with input alphabet $\mathcal{X}$ and output alphabet $\mathcal{Y}$, which is characterized by the transition probability $p(y|x)$. Assume an $n$-length block code consisting of $(f,g,h,\varphi)$ where

\begin{align}
f:&\ \{1,2,\dots,M\}\mapsto\mathcal{X}^{n} \tag{1}\\
g:&\ \mathcal{Y}^{n}\mapsto\{1,2,\dots,|g|\} \tag{2}\\
h:&\ \{1,2,\dots,M\}\mapsto\{1,2,\dots,|h|\} \tag{3}\\
\varphi:&\ \{1,2,\dots,|g|\}\times\{1,2,\dots,|h|\}\mapsto\{1,2,\dots,M\} \tag{4}
\end{align}

Here $f$ denotes the encoding function at the transmitter, $g$ and $h$ denote the processing functions at the noisy and noiseless relays, respectively, and $\varphi$ denotes the decoding function at the receiver.

The encoder sends $x^{n}=f(m)$ into the channel, where $m\in\{1,2,\dots,M\}$. The decoder reconstructs $\hat{m}=\varphi(g(Y^{n}),h(m))$. The average probability of error is defined as

$$P_{e}\triangleq\frac{1}{M}\sum_{m=1}^{M}\Pr(\hat{m}\neq m\mid m\text{ is sent}) \tag{5}$$

The rate triple $(R,R_{1},R_{2})$ is achievable if for every $0<\epsilon<1$, $\eta>0$ and every sufficiently large $n$, there exists an $n$-length block code $(f,g,h,\varphi)$ such that $P_{e}\leq\epsilon$ and

\begin{align}
\frac{1}{n}\ln M &\geq R-\eta \tag{6}\\
\frac{1}{n}\ln|g| &\leq R_{1}+\eta \tag{7}\\
\frac{1}{n}\ln|h| &\leq R_{2}+\eta \tag{8}
\end{align}
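Figure 1: The diamond channel. The encoder transmits $X^{n}$; the noisy relay (Relay 1) observes $Y^{n}$ and forwards at rate $R_{1}$; the noiseless relay (Relay 2) observes $X^{n}$ and forwards at rate $R_{2}$; both links terminate at the decoder.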

The following theorem characterizes the capacity of the class of diamond channels considered in this paper.

Theorem 1

The rate triple $(R,R_{1},R_{2})$ is achievable in the above channel if and only if the following conditions are satisfied

\begin{align}
R &\leq I(U;Y)+H(X|U) \tag{9}\\
R_{1} &\geq I(Z;Y|U,X) \tag{10}\\
R_{2} &\geq H(X|Z,U) \tag{11}\\
R_{1}+R_{2} &\geq R+I(Y;Z|X,U) \tag{12}
\end{align}

for some joint distribution

$$p(u,z,x,y)=p(u,x)p(y|x)p(z|u,y) \tag{13}$$

with cardinalities of alphabets satisfying

\begin{align}
|\mathcal{U}| &\leq|\mathcal{X}|+4 \tag{14}\\
|\mathcal{Z}| &\leq|\mathcal{U}||\mathcal{Y}|+3\leq|\mathcal{X}||\mathcal{Y}|+4|\mathcal{Y}|+3 \tag{15}
\end{align}
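The quantities in Theorem 1 can be evaluated numerically for any candidate distribution of the form (13). The following is a minimal sketch, not part of the original paper: the function names (`entropy`, `cond_mi`, `theorem1_terms`), the axis ordering, and the toy distribution (constant $U$, uniform binary $X$, crossover probability 0.11, $Z=Y$) are all illustrative assumptions. Entropies are in nats, matching the use of $\ln$ in (6)-(8).

```python
import numpy as np

U, X, Y, Z = 0, 1, 2, 3  # assumed axis order of the joint pmf array p[u, x, y, z]

def entropy(p, keep):
    """Entropy (nats) of the marginal of p on the axes listed in `keep`."""
    drop = tuple(ax for ax in range(p.ndim) if ax not in keep)
    marg = p.sum(axis=drop) if drop else p
    marg = np.asarray(marg).ravel()
    marg = marg[marg > 0]
    return float(-(marg * np.log(marg)).sum())

def cond_mi(p, a, b, c=()):
    """Conditional mutual information I(A;B|C) in nats; a, b, c hold axis indices."""
    a, b, c = set(a), set(b), set(c)
    return entropy(p, a | c) + entropy(p, b | c) - entropy(p, a | b | c) - entropy(p, c)

def theorem1_terms(p_ux, p_y_given_x, p_z_given_uy):
    """Right-hand sides of (9)-(11) for a joint pmf p(u,x) p(y|x) p(z|u,y)."""
    p = np.einsum('ux,xy,uyz->uxyz', p_ux, p_y_given_x, p_z_given_uy)
    return {
        'R_max': cond_mi(p, (U,), (Y,)) + entropy(p, {U, X}) - entropy(p, {U}),
        'R1_min': cond_mi(p, (Z,), (Y,), (U, X)),
        'R2_min': entropy(p, {X, Z, U}) - entropy(p, {Z, U}),
    }

# Toy distribution (an assumption for illustration only).
terms = theorem1_terms(
    np.array([[0.5, 0.5]]),                    # p(u, x): constant U, uniform X
    np.array([[0.89, 0.11], [0.11, 0.89]]),    # p(y | x): binary symmetric
    np.eye(2)[None, :, :],                     # p(z | u, y): Z = Y
)
print(terms)
```

For a given target rate $R\leq$ `R_max`, the sum-rate condition (12) is then $R_{1}+R_{2}\geq R+{}$`R1_min`, since the mutual information term in (12) is the same as the one in (10).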

2 The Achievability

Assume a given joint distribution

$$p(u,z,x,y)=p(u,x)p(y|x)p(z|u,y) \tag{16}$$

and consider that the information theoretic quantities on the right hand sides of (9), (10), (11) and (12) are evaluated with this fixed joint probability distribution.

Consider a message $W$ with rate $R$. If $R\leq H(X|Z,U)$, reliable transmission can be achieved by letting $g(Y^{n})=\phi$ (constant) and $h(W)=W$, i.e., by sending the message through the noiseless relay. Thus, we will only consider the case where

$$H(X|Z,U)<R\leq I(U;Y)+H(X|U) \tag{17}$$

We will show that the message can be reliably transmitted with a pair of functions $(g,h)$ such that $(\frac{1}{n}\ln|g|,\frac{1}{n}\ln|h|)$ lies in the inverse pentagon with corners $a$ and $b$ in Figure 2. (By “inverse pentagon” with corner points $a$ and $b$, we mean the region in the $(R_{1},R_{2})$ space that is to the “north-east” of the line segment $[a,b]$. More specifically, this is the region described by the inequalities in (10), (11) and (12).) However, we instead prove reliable transmission with $(\frac{1}{n}\ln|g|,\frac{1}{n}\ln|h|)$ lying in the inverse pentagon with corners $a^{\prime}$ and $b^{\prime}$, which contains the inverse pentagon with corners $a$ and $b$; this is therefore a stronger statement. Reliable transmission with the rate pair at point $b^{\prime}$ is straightforward: let $g(Y^{n})=\phi$ (constant) and $h(W)=W$. Thus, it remains to prove that reliable transmission is possible with the rate pair at point $a^{\prime}$, i.e.,

\begin{align}
R_{1} &=I(U;Y)+I(Y;Z|U) \tag{18}\\
R_{2} &=R-I(U;Y)-I(X;Z|U) \tag{19}
\end{align}
Figure 2: Rate region of $(R_{1},R_{2})$ when $H(X|U,Z)\leq R\leq I(U;Y)+I(X;Z|U)$. Axes: $R_{1}$ and $R_{2}$. Marked points: $a$, $b$, $a^{\prime}$, $b^{\prime}$. Marked values: $H(X|U,Z)$, $R$, $I(Z;Y|U,X)$, $I(U;Y)+I(Y;Z|U)$, $R-I(U;Y)-I(X;Z|U)$, $R+I(Z;Y|U,X)$, $R+I(Z;Y|U,X)-H(X|U,Z)$.
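Adding (18) and (19) gives $R_{1}+R_{2}=R+I(Y;Z|U)-I(X;Z|U)$, so the corner point $a^{\prime}$ meets the sum-rate constraint (12) with equality exactly when $I(Y;Z|U)-I(X;Z|U)=I(Y;Z|U,X)$. This identity follows from the chain rule and the structure $p(u,x)p(y|x)p(z|u,y)$ in (16). The sketch below checks it numerically on randomly drawn distributions of that form; the helper names, alphabet sizes and random seed are illustrative assumptions, not part of the original paper.

```python
import numpy as np

U, X, Y, Z = 0, 1, 2, 3  # assumed axis order of the joint pmf array p[u, x, y, z]

def entropy(p, keep):
    """Entropy (nats) of the marginal of p on the axes listed in `keep`."""
    drop = tuple(ax for ax in range(p.ndim) if ax not in keep)
    marg = p.sum(axis=drop) if drop else p
    marg = np.asarray(marg).ravel()
    marg = marg[marg > 0]
    return float(-(marg * np.log(marg)).sum())

def cond_mi(p, a, b, c=()):
    """Conditional mutual information I(A;B|C) in nats; a, b, c hold axis indices."""
    a, b, c = set(a), set(b), set(c)
    return entropy(p, a | c) + entropy(p, b | c) - entropy(p, a | b | c) - entropy(p, c)

def random_joint(rng, nu=3, nx=2, ny=3, nz=4):
    """Draw a random pmf of the form p(u,x) p(y|x) p(z|u,y)."""
    p_ux = rng.random((nu, nx))
    p_ux /= p_ux.sum()
    p_y_given_x = rng.random((nx, ny))
    p_y_given_x /= p_y_given_x.sum(axis=1, keepdims=True)
    p_z_given_uy = rng.random((nu, ny, nz))
    p_z_given_uy /= p_z_given_uy.sum(axis=2, keepdims=True)
    return np.einsum('ux,xy,uyz->uxyz', p_ux, p_y_given_x, p_z_given_uy)

def corner_gap(p):
    """[I(Y;Z|U) - I(X;Z|U)] - I(Y;Z|U,X); zero when a' lies on the line (12)."""
    lhs = cond_mi(p, (Y,), (Z,), (U,)) - cond_mi(p, (X,), (Z,), (U,))
    rhs = cond_mi(p, (Y,), (Z,), (U, X))
    return lhs - rhs

rng = np.random.default_rng(0)
gaps = [corner_gap(random_joint(rng)) for _ in range(5)]
print(max(abs(g) for g in gaps))  # expected to be at floating-point noise level
```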

Let us assume that the message $W$ is decomposed as $W=(W_{a},W_{b},W_{c})$. For a positive number $\epsilon$, let us define

\begin{align}
M_{a}\triangleq|W_{a}| &=\exp(n(I(U;Y)-3\epsilon)) \tag{20}\\
M_{b}\triangleq|W_{b}| &=\frac{M}{M_{a}M_{c}}=\exp(\ln M-n(I(U;Y)+I(X;Z|U)-6\epsilon)) \tag{21}\\
M_{c}\triangleq|W_{c}| &=\exp(n(I(X;Z|U)-3\epsilon)) \tag{22}
\end{align}

Random codebook generation: We use a superposition code structure. The size of the inner code is $M_{a}$. For each inner codeword, we independently generate $M_{b}$ outer codes. The size of each outer code is $M_{c}$.

  • Independently generate $M_{a}$ sequences, $u^{n}(1),u^{n}(2),\dots,u^{n}(M_{a})$, according to $\prod_{i=1}^{n}p(u_{i})$ where $p(u_{i})=p(u)$, for $i=1,2,\dots,n$.

  • For $u^{n}(j)$, $j=1,2,\dots,M_{a}$, independently generate $M_{b}$ codebooks, $\mathcal{C}(j,1),\mathcal{C}(j,2),\dots,\mathcal{C}(j,M_{b})$.

  • In the codebook $\mathcal{C}(j,k)$, $j=1,2,\dots,M_{a}$, $k=1,2,\dots,M_{b}$, independently generate $M_{c}$ codewords $x^{n}(j,k,1),x^{n}(j,k,2),\dots,x^{n}(j,k,M_{c})$ according to $\prod_{i=1}^{n}p(x_{i}|U_{i}=u_{i}(j))$, where $p(x_{i}|U_{i}=u_{i}(j))=p(x|u)$, for $i=1,2,\dots,n$, $j=1,2,\dots,M_{a}$, $k=1,2,\dots,M_{b}$.

There will be no overlapping codebooks with high probability when $n$ is sufficiently large, because

$$\frac{1}{n}\ln M_{b}M_{c}<H(X|U) \tag{23}$$

Encoding at the transmitter: Let $W=(W_{a},W_{b},W_{c})$ be the message. We send the codeword $X^{n}=f(W_{a},W_{b},W_{c})\triangleq x^{n}(W_{a},W_{b},W_{c})$ into the channel.

Processing at the noisy relay: First, after having received $Y^{n}$, seek

$$\hat{U}^{n}=u^{n}(\hat{W}_{a})\in\{u^{n}(1),u^{n}(2),\dots,u^{n}(M_{a})\} \tag{24}$$

such that

$$(\hat{U}^{n},Y^{n})\in\mathcal{T}_{[UY]}^{n} \tag{25}$$

where the definition of the strongly typical set can be found in [2, Section 1.2]. If there is no such $\hat{U}^{n}$, then let $\hat{U}^{n}$ be an arbitrary sequence in $\{u^{n}(1),u^{n}(2),\dots,u^{n}(M_{a})\}$. Secondly, construct a conditional rate distortion code according to $\prod_{i=1}^{n}p(z_{i},y_{i}|\hat{u}_{i})$ with encoding function $g^{\prime}(Y^{n},\hat{U}^{n})$ and $|g^{\prime}|=L=\exp(n(I(Y;Z|U)+\tau))$. Finally, send $\hat{U}^{n}$ and $Z^{n}\triangleq g^{\prime}(Y^{n},\hat{U}^{n})$ to the destination, i.e.,

$$g(Y^{n})=(\hat{U}^{n},Z^{n}) \tag{26}$$

where

$$|g|=M_{a}\times L\leq\exp(n(I(U;Y)+I(Y;Z|U)+\tau-3\epsilon)) \tag{27}$$

Processing at the noiseless relay: Let $h(f(W_{a},W_{b},W_{c}))=W_{b}$ where

$$|h|=M_{b}=\exp(\ln M-n(I(U;Y)+I(X;Z|U)-6\epsilon)) \tag{28}$$

Decoding: The decoder collects $(\hat{U}^{n},Z^{n})$ from the noisy relay and $W_{b}$ from the noiseless relay. It then seeks a codeword $x^{n}(\hat{W}_{a},W_{b},i)$ from the codebook $\mathcal{C}(\hat{W}_{a},W_{b})$ such that

$$(x^{n}(\hat{W}_{a},W_{b},i),Z^{n})\in\mathcal{T}_{[XZ|U]}^{n}(\hat{U}^{n}) \tag{29}$$

Probability of error: An error occurs when $(\hat{U},\hat{X})\neq(U,X)$. The average probability of error can be decomposed into

$$\Pr(E)\leq\Pr(E_{1}\cup E_{2}\cup E_{3})=\Pr(E_{1})+\Pr(E_{2}\cap E_{1}^{c})+\Pr(E_{3}\cap E_{1}^{c}\cap E_{2}^{c}) \tag{30}$$

where

\begin{align}
E_{1} &\triangleq\left\{(U^{n},X^{n},Y^{n},Z^{n})\notin\mathcal{T}_{[UXYZ]}^{n}\right\} \tag{31}\\
E_{2} &\triangleq\bigcup_{\bar{u}^{n}\neq U^{n},\ \bar{u}^{n}\in\{u^{n}(1),u^{n}(2),\dots,u^{n}(M_{a})\}}\left\{(\bar{u}^{n},Y^{n})\in\mathcal{T}_{[UY]}^{n}\right\} \tag{32}\\
E_{3} &\triangleq\bigcup_{\bar{x}^{n}\neq X^{n},\ \bar{x}^{n}\in\mathcal{C}(W_{a},W_{b})}\left\{(\bar{x}^{n},Z^{n})\in\mathcal{T}_{[XZ|U]}^{n}(U^{n})\right\} \tag{33}
\end{align}

We note that

$$\Pr(E_{1})\leq\Pr(U^{n}\notin\mathcal{T}_{[U]}^{n})+\Pr((Y^{n},Z^{n})\notin\mathcal{T}_{[YZ|U]}^{n}(U^{n}))+\Pr(X^{n}\notin\mathcal{T}_{[X|YZU]}^{n}(Y^{n},Z^{n},U^{n})) \tag{34}$$

where

  • $U^{n}$ is generated in an i.i.d. fashion with probability $p(u)$. Thus, when $n$ is sufficiently large, we have

    $$\Pr(U^{n}\notin\mathcal{T}_{[U]}^{n})\leq\epsilon \tag{35}$$
  • $Z^{n}$ is a conditional rate distortion code for $Y^{n}$ conditioned on $U^{n}$. Thus, when $n$ is sufficiently large, $L=\exp(n(I(Y;Z|U)+\tau))$, and $U^{n}\in\mathcal{T}_{[U]}^{n}$, we have

    $$\Pr((Y^{n},Z^{n})\notin\mathcal{T}_{[YZ|U]}^{n}(U^{n}))\leq\epsilon \tag{36}$$
  • $X^{n}$ can be viewed as being generated according to an i.i.d. conditional probability $p(x|u,y)$ with respect to $(U^{n},Y^{n})$. Thus, when $n$ is sufficiently large and $(Y^{n},Z^{n},U^{n})\in\mathcal{T}_{[YZU]}^{n}$,

    $$\Pr(X^{n}\notin\mathcal{T}_{[X|YZU]}^{n}(Y^{n},Z^{n},U^{n}))\leq\epsilon \tag{37}$$

From the above calculation, we have

$$\Pr(E_{1})=\Pr((U^{n},X^{n},Y^{n},Z^{n})\notin\mathcal{T}_{[UXYZ]}^{n})\leq 3\epsilon \tag{38}$$

For the second error event, we note that $M_{a}=\exp(n(I(U;Y)-3\epsilon))$ and

\begin{align}
\Pr(E_{2}\cap E_{1}^{c}) &=\Pr\left(\bigcup_{\bar{u}^{n}\neq U^{n},\ \bar{u}^{n}\in\{u^{n}(1),u^{n}(2),\dots,u^{n}(M_{a})\}}\left\{(\bar{u}^{n},Y^{n})\in\mathcal{T}_{[UY]}^{n}\right\}\,\middle|\,Y^{n}\in\mathcal{T}_{[Y]}^{n}\right) \notag\\
&\leq\sum_{i=1}^{M_{a}}\Pr\left((u^{n}(i),Y^{n})\in\mathcal{T}_{[UY]}^{n}\,\middle|\,Y^{n}\in\mathcal{T}_{[Y]}^{n}\right) \notag\\
&\leq M_{a}\Pr(u^{n}(i)\in\mathcal{T}_{[U|Y]}^{n}(Y^{n})) \notag\\
&\leq M_{a}\exp(-nH(U)+n\epsilon)\exp(nH(U|Y)+n\epsilon) \notag\\
&=\exp(-n\epsilon) \notag\\
&\leq\epsilon \tag{39}
\end{align}

for sufficiently large $n$. We note that $M_{c}=\exp(n(I(X;Z|U)-3\epsilon))$, then

\begin{align}
\Pr(E_{3}\cap E_{1}^{c}) &=\Pr\left(\bigcup_{\bar{x}^{n}\neq X^{n},\ \bar{x}^{n}\in\mathcal{C}(W_{a},W_{b})}\left\{(\bar{x}^{n},Z^{n})\in\mathcal{T}_{[XZ|U]}^{n}(U^{n})\right\}\,\middle|\,(Z^{n},U^{n})\in\mathcal{T}_{[ZU]}^{n}\right) \notag\\
&\leq\sum_{i=1}^{M_{c}}\Pr\left((x^{n}(W_{a},W_{b},i),Z^{n})\in\mathcal{T}_{[XZ|U]}^{n}(U^{n})\,\middle|\,(Z^{n},U^{n})\in\mathcal{T}_{[ZU]}^{n}\right) \notag\\
&\leq M_{c}\Pr(x^{n}(W_{a},W_{b},i)\in\mathcal{T}_{[X|ZU]}^{n}(Z^{n},U^{n})) \notag\\
&\leq M_{c}\exp(-nH(X|U)+n\epsilon)\exp(nH(X|Z,U)+n\epsilon) \notag\\
&=\exp(-n\epsilon) \notag\\
&\leq\epsilon \tag{40}
\end{align}

for sufficiently large $n$. Thus, the average probability of error is upper bounded as

$$\Pr(E)\leq 3\epsilon+\epsilon+\epsilon=5\epsilon \tag{41}$$

which can be made arbitrarily small, since $\epsilon>0$ is arbitrary and $n$ is taken sufficiently large.

3 The Converse

Define $Z_{i}\triangleq g$ and $U_{i}\triangleq(Y^{i-1},X_{i+1}^{n})$. We note that

$$p(u_{i},x_{i},y_{i},z_{i})=p(u_{i},x_{i})p(y_{i}|x_{i})p(z_{i}|y_{i},u_{i}) \tag{42}$$

We have

\begin{align}
\ln M &=H(X^{n}) \notag\\
&=\sum_{i=1}^{n}H(X_{i}|X_{i+1}^{n}) \notag\\
&\leq\sum_{i=1}^{n}\left[I(Y^{i-1};Y_{i})+H(X_{i}|X_{i+1}^{n})\right] \notag\\
&=\sum_{i=1}^{n}\left[I(Y^{i-1},X_{i+1}^{n};Y_{i})-I(X_{i+1}^{n};Y_{i}|Y^{i-1})+H(X_{i}|Y^{i-1},X_{i+1}^{n})+I(Y^{i-1};X_{i}|X_{i+1}^{n})\right] \notag\\
&\overset{1}{=}\sum_{i=1}^{n}\left[I(Y^{i-1},X_{i+1}^{n};Y_{i})+H(X_{i}|Y^{i-1},X_{i+1}^{n})\right] \notag\\
&=\sum_{i=1}^{n}\left[I(U_{i};Y_{i})+H(X_{i}|U_{i})\right] \tag{43}
\end{align}

where

  1. Because of the following equality [3, Lemma 7]

    $$\sum_{i=1}^{n}I(X_{i+1}^{n};Y_{i}|Y^{i-1})=\sum_{i=1}^{n}I(Y^{i-1};X_{i}|X_{i+1}^{n}) \tag{44}$$
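The equality (44) holds for any joint distribution of $(X^{n},Y^{n})$, so it can be spot-checked numerically. The sketch below does this for $n=3$ with binary variables and a randomly drawn joint pmf; the function names, the axis layout ($X_{i}$ on axis $i-1$, $Y_{i}$ on axis $n+i-1$) and the random seed are illustrative assumptions, not part of the original paper.

```python
import numpy as np

def entropy(p, keep):
    """Entropy (nats) of the marginal of p on the axes listed in `keep`."""
    drop = tuple(ax for ax in range(p.ndim) if ax not in keep)
    marg = p.sum(axis=drop) if drop else p
    marg = np.asarray(marg).ravel()
    marg = marg[marg > 0]
    return float(-(marg * np.log(marg)).sum())

def cond_mi(p, a, b, c=()):
    """Conditional mutual information I(A;B|C) in nats; a, b, c hold axis indices."""
    a, b, c = set(a), set(b), set(c)
    return entropy(p, a | c) + entropy(p, b | c) - entropy(p, a | b | c) - entropy(p, c)

def sum_identity_sides(p, n):
    """Both sides of (44) for a joint pmf over (X_1..X_n, Y_1..Y_n)."""
    def x_axis(i):
        return i - 1          # assumed axis of X_i
    def y_axis(i):
        return n + i - 1      # assumed axis of Y_i
    lhs = 0.0
    rhs = 0.0
    for i in range(1, n + 1):
        x_future = tuple(x_axis(j) for j in range(i + 1, n + 1))  # X_{i+1}^n
        y_past = tuple(y_axis(j) for j in range(1, i))            # Y^{i-1}
        lhs += cond_mi(p, x_future, (y_axis(i),), y_past)
        rhs += cond_mi(p, y_past, (x_axis(i),), x_future)
    return lhs, rhs

rng = np.random.default_rng(1)
n = 3
p = rng.random((2,) * (2 * n))   # arbitrary joint pmf, no channel structure assumed
p /= p.sum()
lhs, rhs = sum_identity_sides(p, n)
print(lhs, rhs)  # the two values should agree up to rounding
```

An empty variable set contributes zero mutual information in `cond_mi`, which handles the boundary terms $i=n$ on the left-hand side and $i=1$ on the right-hand side.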

We have

\begin{align}
\ln|g| &\geq H(g) \notag\\
&\geq H(g|h) \notag\\
&\geq H(g|h)-H(g|h,Y^{n}) \notag\\
&=I(g;Y^{n}|h) \notag\\
&=\sum_{i=1}^{n}I(g;Y_{i}|h,Y^{i-1}) \notag\\
&=\sum_{i=1}^{n}\left[I(g,X_{i+1}^{n};Y_{i}|h,Y^{i-1})-I(X_{i+1}^{n};Y_{i}|g,h,Y^{i-1})\right] \notag\\
&\overset{1}{=}\sum_{i=1}^{n}\left[I(g,X_{i+1}^{n};Y_{i}|h,Y^{i-1})-I(Y^{i-1};X_{i}|g,h,X_{i+1}^{n})\right] \notag\\
&\geq\sum_{i=1}^{n}\left[I(g,X_{i+1}^{n};Y_{i}|h,Y^{i-1})-H(X_{i}|g,h,X_{i+1}^{n})\right] \notag\\
&=-H(X^{n}|g,h)+\sum_{i=1}^{n}I(g,X_{i+1}^{n};Y_{i}|h,Y^{i-1}) \notag\\
&\overset{2}{\geq}\sum_{i=1}^{n}I(g,X_{i+1}^{n};Y_{i}|h,Y^{i-1})-n\epsilon \notag\\
&\geq\sum_{i=1}^{n}I(g;Y_{i}|h,Y^{i-1},X_{i+1}^{n})-n\epsilon \notag\\
&\overset{3}{\geq}\sum_{i=1}^{n}I(g;Y_{i}|h,Y^{i-1},X_{i+1}^{n},X_{i})-n\epsilon \notag\\
&\overset{4}{=}\sum_{i=1}^{n}I(g;Y_{i}|Y^{i-1},X_{i+1}^{n},X_{i})-n\epsilon \notag\\
&=\sum_{i=1}^{n}I(Z_{i};Y_{i}|U_{i},X_{i})-n\epsilon \tag{45}
\end{align}

where

  1. Because of the following equality [3, Lemma 7]

    $$\sum_{i=1}^{n}I(X_{i+1}^{n};Y_{i}|g,h,Y^{i-1})=\sum_{i=1}^{n}I(Y^{i-1};X_{i}|g,h,X_{i+1}^{n}) \tag{46}$$

  2. Due to Fano’s inequality.

  3. $g$ is a deterministic function of $Y^{n}$. Due to the memoryless property, we have

    $$H(g|Y_{i},h,Y^{i-1},X_{i+1}^{n},X_{i})=H(g|Y_{i},h,Y^{i-1},X_{i+1}^{n}) \tag{47}$$

  4. $g$ is a deterministic function of $Y^{n}$ and $h$ is a deterministic function of $X^{n}$. Due to the memoryless property, we have

    \begin{align}
    H(g|h,Y^{i-1},X_{i+1}^{n},X_{i}) &=H(g|Y^{i-1},X_{i+1}^{n},X_{i}) \tag{48}\\
    H(g|h,Y^{i-1},X_{i+1}^{n},X_{i},Y_{i}) &=H(g|Y^{i-1},X_{i+1}^{n},X_{i},Y_{i}) \tag{49}
    \end{align}

We have

\begin{align}
\ln|h| &\geq H(h|g) \notag\\
&\geq I(h;X^{n}|g) \notag\\
&=H(X^{n}|g)-H(X^{n}|g,h) \notag\\
&\overset{1}{\geq}H(X^{n}|g)-n\epsilon \notag\\
&=\sum_{i=1}^{n}H(X_{i}|X_{i+1}^{n},g)-n\epsilon \notag\\
&\geq\sum_{i=1}^{n}H(X_{i}|Y^{i-1},X_{i+1}^{n},g)-n\epsilon \notag\\
&=\sum_{i=1}^{n}H(X_{i}|U_{i},Z_{i})-n\epsilon \tag{50}
\end{align}

where

  1. Due to Fano’s inequality.

We have

\begin{align}
\ln|g|+\ln|h| &\geq H(g,h) \notag\\
&\geq I(g,h;X^{n},Y^{n}) \notag\\
&\geq I(X^{n};g,h)+I(Y^{n};g,h|X^{n}) \notag\\
&=H(X^{n})-H(X^{n}|g,h)+I(Y^{n};g,h|X^{n}) \notag\\
&\overset{1}{\geq}\ln M-n\epsilon+I(Y^{n};g,h|X^{n}) \notag\\
&\overset{2}{=}\ln M-n\epsilon+I(Y^{n};g|X^{n}) \notag\\
&=\ln M+\sum_{i=1}^{n}\left[-\epsilon+I(Y_{i};g|X^{n},Y^{i-1})\right] \notag\\
&\overset{3}{=}\ln M+\sum_{i=1}^{n}\left[-\epsilon+I(Y_{i};g|X_{i},Y^{i-1},X_{i+1}^{n})\right] \notag\\
&=\ln M+\sum_{i=1}^{n}\left[-\epsilon+I(Y_{i};Z_{i}|X_{i},U_{i})\right] \tag{51}
\end{align}
where

  1. Due to Fano’s inequality.

  2. $h$ is a deterministic function of $X^{n}$.

  3. $g$ is a deterministic function of $Y^{n}$. Due to the memoryless property, we have

    \begin{align}
    H(g|X_{i},Y^{i-1},X_{i+1}^{n},X^{i-1}) &=H(g|X_{i},Y^{i-1},X_{i+1}^{n}) \tag{52}\\
    H(g|Y_{i},X_{i},Y^{i-1},X_{i+1}^{n},X^{i-1}) &=H(g|Y_{i},X_{i},Y^{i-1},X_{i+1}^{n}) \tag{53}
    \end{align}

We note that $\frac{1}{n}\ln M\geq R-\eta$, $\frac{1}{n}\ln|g|\leq R_{1}+\eta$ and $\frac{1}{n}\ln|h|\leq R_{2}+\eta$, for an arbitrary $\eta>0$. Letting $\epsilon\rightarrow 0$, from (43), (45), (50) and (51), we have

\begin{align}
R &\leq\frac{1}{n}\sum_{i=1}^{n}\left[I(U_{i};Y_{i})+H(X_{i}|U_{i})\right] \tag{54}\\
R_{1} &\geq\frac{1}{n}\sum_{i=1}^{n}I(Z_{i};Y_{i}|U_{i},X_{i}) \tag{55}\\
R_{2} &\geq\frac{1}{n}\sum_{i=1}^{n}H(X_{i}|U_{i},Z_{i}) \tag{56}\\
R_{1}+R_{2} &\geq R+\frac{1}{n}\sum_{i=1}^{n}I(Y_{i};Z_{i}|X_{i},U_{i}) \tag{57}
\end{align}

Define a time-sharing random variable $Q$, which is uniformly distributed on $\{1,2,\dots,n\}$. Also define a set of random variables $(X,Y,\tilde{U},\tilde{Z})$ such that

$$\Pr(X=x,Y=y,\tilde{U}=u,\tilde{Z}=z\mid Q=i)=p(X_{i}=x,Y_{i}=y,U_{i}=u,Z_{i}=z),\quad i=1,2,\dots,n \tag{58}$$

Define $U=(\tilde{U},Q)$ and $Z=(\tilde{Z},Q)$, then

\begin{align}
R &\leq\frac{1}{n}\sum_{i=1}^{n}\left[I(U_{i};Y_{i})+H(X_{i}|U_{i})\right] \notag\\
&=I(\tilde{U};Y|Q)+H(X|\tilde{U},Q) \notag\\
&\leq I(\tilde{U},Q;Y)+H(X|\tilde{U},Q) \notag\\
&=I(U;Y)+H(X|U) \tag{59}\\
R_{1} &\geq\frac{1}{n}\sum_{i=1}^{n}I(Z_{i};Y_{i}|U_{i},X_{i}) \notag\\
&=I(\tilde{Z};Y|\tilde{U},Q,X) \notag\\
&=I(Z;Y|U,X) \tag{60}\\
R_{2} &\geq\frac{1}{n}\sum_{i=1}^{n}H(X_{i}|U_{i},Z_{i}) \notag\\
&=H(X|\tilde{U},\tilde{Z},Q) \notag\\
&=H(X|U,Z) \tag{61}\\
R_{1}+R_{2} &\geq R+\frac{1}{n}\sum_{i=1}^{n}I(Y_{i};Z_{i}|X_{i},U_{i}) \notag\\
&=R+I(\tilde{Z};Y|\tilde{U},X,Q) \notag\\
&=R+I(Z;Y|U,X) \tag{62}
\end{align}

where (59), (60), (61) and (62) are the same as (9), (10), (11) and (12), concluding the proof.

Finally, we note that the bounds on the cardinalities of the alphabets in (14) and (15) can be proven in a way similar to [4, Appendix D].

4 Remarks

We have several remarks regarding this result as follows:

  1. The capacity is strictly smaller than the cut-set bound [5], because, first,

    $$R\leq R_{1}+R_{2}-I(Y;Z|U,X) \tag{63}$$

    An operational interpretation is that when the noisy relay cannot fully decode the message, or in other words, when the noisy relay cannot remove the noise completely, the data going through the link from the noisy relay to the receiver contains noise. Thus, the useful information flowing through the multiple access cut will be strictly less than $R_{1}+R_{2}$. Secondly, we note that

    $$R\leq I(U;Y)+H(X|U)\leq H(X) \tag{64}$$

    An operational interpretation is that when the noisy relay decodes the message with a positive rate, the rate of information flowing through the broadcast cut becomes strictly less than $H(X)$.

    Consider the following example. Let $X$ and $Y$ be binary and

    $$Y=X\oplus W \tag{65}$$

    where the sum is a modulo-2 sum and $W$ has a Bernoulli distribution with entropy $0.5$ bits. We assume $R_{1}=R_{2}=0.5$ bits. The cut-set bound in this example is $1$ bit, which is not achievable: if $R$ is equal to $1$ bit, we have

    $$R=I(U;Y)+H(X|U)=H(X)=1 \tag{66}$$

    then, $U$ has to be independent of $X$ and $Y$. Also, we have

    $$R=R_{1}+R_{2}-I(Y;Z|U,X)=R_{1}+R_{2}=1 \tag{67}$$

    then, $Z$ has to be independent of $X$ and $Y$ if $U$ is independent of $X$ and $Y$. However, if $U$ and $Z$ are independent of $X$ and $Y$, we arrive at the following contradiction,

    $$0.5=R_{2}\geq H(X|Z,U)=H(X)=1 \tag{68}$$

    which means that the cut-set bound is not achievable in this example. We note that, even in this binary example where $|\mathcal{X}|=|\mathcal{Y}|=2$, the cardinality bounds on the auxiliary random variables $U$ and $Z$ are $|\mathcal{U}|\leq 6$ and $|\mathcal{Z}|\leq 15$. These large cardinality bounds make it practically impossible to evaluate the capacity of this diamond channel. However, even though we were not able to compute the exact value of the capacity in this example, we were able to conclude that the capacity is strictly less than the cut-set bound, which is $1$ bit.

    We know that the capacity of a diamond channel with four orthogonal links is equal to the cut-set bound in this channel. Our result shows that introducing the broadcast node will reduce the capacity of this all-orthogonal diamond channel. Networks with broadcast nodes have been studied recently from different perspectives, e.g., information theory and network coding [6, 7, 8]. We note that our diamond channel model is a simple example of a general network with a broadcast node. Thus, we conclude that the cut-set bound in general is not tight in networks with broadcast nodes.

  2. The processing at the noisy relay includes two operations: decode the inner code $U^{n}$ and compress the channel output $Y^{n}$ to $Z^{n}$ conditioned on $U^{n}$. This processing is essentially the same as Theorem 7 in [9], i.e., a combination of DAF and CAF. DAF [9, Theorem 1] has been shown to be optimal in the degraded relay channel [9]. Partial DAF, a special case of [9, Theorem 7] without compression, has been shown to be optimal in the semi-deterministic relay channel [10] and the relay channel with orthogonal transmitter-relay link [11]. Recently, CAF [9, Theorem 6] has been shown to be optimal in two special relay channels [12, 13]. To our knowledge, we are the first to show the optimality of the combination of DAF and CAF in some specific channel, even though the channel we consider is not a three-node relay channel in the strict sense, i.e., as in [9].

  3. If we assume $R=H(X)-R_{0}$, then Theorem 1 can be rewritten as follows

    \begin{align}
    R\leq I(U;Y)+H(X|U) &\quad\longleftrightarrow\quad R_{0}\geq I(U;X|Y) \tag{69}\\
    R_{1}\geq I(Z;Y|U,X) &\quad\longleftrightarrow\quad R_{1}\geq I(Z;Y|U,X) \tag{70}\\
    R_{2}\geq H(X|Z,U) &\quad\longleftrightarrow\quad R_{2}\geq I(X;X|Z,U) \tag{71}\\
    R_{1}+R_{2}\geq R+I(Y;Z|X,U) &\quad\longleftrightarrow\quad R_{0}+R_{1}+R_{2}\geq I(X,Y;U,X,Z) \tag{72}
    \end{align}

    for some joint distribution

    $$p(u,z,x,y)=p(u,x)p(y|x)p(z|u,y) \tag{73}$$

    We note that the right-hand sides of (69), (70), (71) and (72), together with the distribution constraint in (73), are the same as the rate region of the rate-distortion problem studied by Kaspi and Berger, shown in Figure 3 [4, Theorem 2.1, Case C].

    Figure 3: Kaspi-Berger rate distortion problem. Labels in the figure: source $p(x,y)$ with outputs $X^{n}$ and $Y^{n}$; Encoder; Relay 1; (Relay 2); Decoder; rates $R_{0}$, $R_{1}$ and $R_{2}$.

    This duality between our diamond channel coding problem and the Kaspi-Berger source coding problem is similar to the duality between the single-user channel coding problem and the Slepian-Wolf source coding problem [2, Section 3.1], by viewing the codebook information in the channel coding problem as the information sent to all the terminals in the source coding problem, e.g., the information with rate $R_{0}$ in Figure 3. Thus, the achievability of our diamond channel coding problem can be obtained from the achievability of the Kaspi-Berger source coding problem, in the same way that the achievability of the multiple access channel coding problem can be obtained from the achievability of the fork network coding problem [2, Section 3.2].
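The numbers in the binary example of Remark 1 can be reproduced with a few lines of code. The sketch below is illustrative only: the function names and the bisection tolerance are assumptions. It finds the Bernoulli parameter of $W$ whose entropy is $0.5$ bits, evaluates $I(X;Y)=1-H(W)$ for a uniform input, and recomputes the cardinality bounds (14) and (15) for $|\mathcal{X}|=|\mathcal{Y}|=2$.

```python
import math

def h2(p):
    """Binary entropy function in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bernoulli_param_for_entropy(target_bits, tol=1e-12):
    """Bisection on [0, 1/2], where h2 is strictly increasing."""
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if h2(mid) < target_bits:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

p_w = bernoulli_param_for_entropy(0.5)   # Pr(W = 1), roughly 0.11
i_xy_uniform = 1.0 - h2(p_w)             # I(X;Y) in bits for uniform X and Y = X xor W

card_x = card_y = 2
card_u_bound = card_x + 4                # bound (14)
card_z_bound = card_u_bound * card_y + 3 # first inequality in (15)

print(p_w, i_xy_uniform, card_u_bound, card_z_bound)
```

With these bounds, a brute-force evaluation of Theorem 1 would have to search over distributions $p(u,x)$ and $p(z|u,y)$ with up to 6 and 15 auxiliary symbols, respectively, which illustrates why the remark calls the exact computation impractical.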

References

  • [1] B. E. Schein. Distributed Coordination in Network Information Theory. PhD thesis, Massachusetts Institute of Technology, 2001.
  • [2] I. Csiszár and J. Körner. Information Theory: Coding Theorems for Discrete Memoryless Systems. Academic Press, 1981.
  • [3] I. Csiszár and J. Körner. Broadcast channels with confidential messages. IEEE Trans. Inform. Theory, 24(3):339–348, 1978.
  • [4] A. H. Kaspi and T. Berger. Rate-distortion for correlated sources with partially separated encoders. IEEE Trans. Inform. Theory, 28(6):828–840, 1982.
  • [5] T. M. Cover and J. A. Thomas. Elements of Information Theory. John Wiley and Sons, 1991.
  • [6] A. F. Dana, R. Gowaikar, R. Palanki, B. Hassibi, and M. Effros. Capacity of wireless erasure networks. IEEE Trans. Inform. Theory, 52(3):789–804, 2006.
  • [7] N. Ratnakar and G. Kramer. The multicast capacity of deterministic relay network with no interference. IEEE Trans. Inform. Theory, 52(6):2425–2432, 2006.
  • [8] G. Kramer, S. M. S. Tabatabaei Yazdi, and S. A. Savari. Network coding on line networks with broadcast. In Proc. Conf. Inf. Sciences and Systems (CISS), Princeton, NJ, Mar. 2008.
  • [9] T. M. Cover and A. El Gamal. Capacity theorems for the relay channel. IEEE Trans. Inform. Theory, 25:572–584, Sep. 1979.
  • [10] A. El Gamal and M. Aref. The capacity of the semideterministic relay channel. IEEE Trans. Inform. Theory, 28(3):536, 1982.
  • [11] A. El Gamal and S. Zahedi. Capacity of a class of relay channels with orthogonal components. IEEE Trans. Inform. Theory, 51(5):1815–1817, 2005.
  • [12] Y. H. Kim. Capacity of a class of deterministic relay channels. IEEE Trans. Inform. Theory, 53(3):1328–1329, 2008.
  • [13] M. Aleksic, P. Razaghi, and W. Yu. Capacity of a class of modulo-sum relay channels. Submitted to IEEE Transactions on Information Theory, 2007, http://arxiv.org/pdf/0704.3591.