
Joint Source-Channel Coding on a Multiple Access Channel with Side Information

R Rajesh, Vinod Sharma and V K Varshneya

Preliminary versions of parts of this paper appear in WCNC 06, ISIT 08 and ISITA 08. R Rajesh and Vinod Sharma are with the Dept of Electrical Communication Engineering, Indian Institute of Science, Bangalore, India. V K Varshneya is with IBM India Research Lab, Bangalore, India. Email: rajesh@pal.ece.iisc.ernet.in, vinod@ece.iisc.ernet.in, virendra@pal.ece.iisc.ernet.in. This work is partially supported by the DRDO-IISc program on advanced research in mathematical engineering.
Abstract

We consider the problem of transmission of several distributed correlated sources over a multiple access channel (MAC) with side information at the sources and the decoder. Source-channel separation does not hold for this channel. Sufficient conditions are provided for transmission of sources with a given distortion. The source and/or the channel could have continuous alphabets (thus Gaussian sources and Gaussian MACs are special cases). Various previous results are obtained as special cases. We also provide several good joint source-channel coding schemes for discrete sources and discrete/continuous alphabet channel.

Keywords: Multiple access channel, side information, lossy joint source-channel coding, jointly Gaussian codewords, correlated sources.

I Introduction and Survey

In this paper we consider the transmission of information from several correlated sources over a multiple access channel with side information. This system does not satisfy source-channel separation ([12]). Thus for optimal transmission one needs to consider joint source-channel coding. We will provide several good joint source-channel coding schemes.

Although this topic has been studied for the last several decades, one recent motivation is the problem of estimating a random field via sensor networks. Sensor nodes have limited computational and storage capabilities and very limited energy [3]. These sensor nodes need to transmit their observations to a fusion center which uses this data to estimate the sensed random field. Since transmission is very energy intensive, it is important to minimize it.

The proximity of the sensing nodes to each other induces high correlations between the observations of adjacent sensors. One can exploit these correlations to compress the transmitted data significantly ([3][4]). Furthermore, some of the nodes can be more powerful and act as cluster heads ([4]). Nodes transmit their data to a nearby cluster head which can further compress information before transmission to the fusion center. Transmission of data from sensor nodes to their cluster-head requires sharing the wireless multiple access channel (MAC). At the fusion center the underlying physical process is estimated. The main trade-off possible is between the rates at which the sensors send their observations and the distortion incurred in the estimation at the fusion center. The availability of side information at the encoders and/or the decoder can reduce the rate of transmission ([19][42]).

The above considerations open up new interesting problems in multi-user information theory, and the quest for finding the optimal performance for various models of sources, channels and side information has made this an active area of research. The optimal solution is not known except in a few simple cases. In this paper a joint source-channel coding approach is discussed under various assumptions on side information and distortion criteria. Sufficient conditions for transmission of discrete/continuous alphabet sources with a given distortion over a discrete/continuous alphabet MAC are provided. These results generalize the previous results available on this problem.

In the following we survey the related literature. Ahlswede [1] and Liao [28] obtained the capacity region of a discrete memoryless MAC with independent inputs. Cover, El Gamal and Salehi [12] made further significant progress by providing sufficient conditions for lossless transmission of correlated observations over a MAC. They proposed a ‘correlation preserving’ scheme for transmitting the sources. This mapping is extended to a more general system with several principal sources and several side information sources subject to cross observations at the encoders in [2]. However, a single-letter characterization of the capacity region is still unknown. Indeed, Dueck [15] proved that the conditions given in [12] are only sufficient and may not be necessary. In [26] a finite-letter upper bound for the problem is obtained. It is also shown in [12] that source-channel separation does not hold in this case. The authors of [35] obtain a condition for separation to hold in a multiple access channel.

The rate region for the distributed lossless source coding problem for correlated sources is given in the classic paper by Slepian and Wolf ([38]). Cover ([11]) extended the Slepian-Wolf results to an arbitrary number of discrete, ergodic sources using a technique called ‘random binning’. Other related papers on this problem are [2][6].

Inspired by the Slepian-Wolf results, Wyner and Ziv [42] obtained the rate distortion function for source coding with side information at the decoder. It is shown that knowledge of the side information at the encoder, in addition to the decoder, permits transmission at a lower rate, in contrast to the lossless case considered by Slepian and Wolf. The rate distortion function when the encoder and the decoder both have side information was first obtained by Gray (see [8]). Related work on side information coding is [5][14][33]. The lossy version of the Slepian-Wolf problem is called the multi-terminal source coding problem and, despite numerous attempts (e.g., [9][30]), the exact rate region is not known except for a few special cases. The first major advance was by Berger and Tung ([8]), who obtained an inner and an outer bound on the rate distortion region. Lossy coding of continuous sources in the high resolution limit is studied in [43], where an explicit single-letter bound is obtained. Gastpar ([19]) derived an inner and an outer bound with decoder side information and proved the tightness of his bounds when the sources are conditionally independent given the side information. The authors in [39] obtain inner and outer bounds on the rate region with side information at the encoders and the decoder. In [29] an achievable rate region for a MAC with correlated sources and feedback is given.

The distributed Gaussian source coding problem is discussed in [30][41]. For two users the exact rate region is provided in [41]. The capacity of a Gaussian MAC (GMAC) for independent sources with feedback is given in [32]. In [27] one necessary and two sufficient conditions for transmitting a bivariate jointly Gaussian source over a GMAC are provided, and it is shown that the amplify and forward scheme is optimal below a certain SNR. A performance comparison of the schemes given in [27] with a separation-based scheme appears in [34]. The GMAC under received power constraints is studied in [18], where it is shown that source-channel separation holds in this case.

In [20] the authors discuss a joint source-channel coding scheme over a MAC and show the scaling behavior for the Gaussian channel. A Gaussian sensor network in a distributed and collaborative setting is studied in [24], where the authors show that it is better to compress the local estimates than to compress the raw data. The scaling laws for a many-to-one data-gathering channel are discussed in [17]. It is shown that the transport capacity of the network scales as $\mathcal{O}(\log N)$ when the number of sensors $N$ grows to infinity and the total average power remains fixed. The scaling laws for the problem without side information are also discussed in [21], where it is shown that separating source coding from channel coding may require exponential growth, as a function of the number of sensors, in communication bandwidth. A lower bound is given on the best achievable distortion as a function of the number of sensors, the total transmit power, the degrees of freedom of the underlying process and the spatio-temporal communication bandwidth.

The joint source-channel coding problem also bears relationship to the CEO problem [10]. In this problem, multiple encoders observe different, noisy versions of a single information source and communicate it to a single decoder called the CEO which is required to reconstruct the source within a certain distortion. The Gaussian version of the CEO problem is studied in [31].

This paper makes the following contributions. It obtains sufficient conditions for transmission of correlated sources with given distortions over a MAC with side information. The source/channel alphabets can be discrete or continuous. The sufficient conditions are general enough that previously known results are obtained as special cases. Next we obtain a bit-to-Gaussian mapping which provides correlated Gaussian channel codewords for distributed discrete sources.

The paper is organized as follows. Sufficient conditions for transmission of distributed sources over a MAC with side information and given distortion are obtained in Section II. The sources and the channel alphabets can be continuous or discrete. Several previous results are recovered as special cases in Section III. Section IV considers the important case of transmission of discrete correlated sources over a GMAC and presents a new joint source-channel coding scheme. Section V briefly considers Gaussian sources over a GMAC. Section VI concludes the paper. The proof of the main theorem is given in Appendix A. The proofs of several other results are provided in later appendices.

II Transmission of correlated sources over a MAC

We consider the transmission of memoryless dependent sources, through a memoryless multiple access channel (Fig. 1). The sources and/or the channel input/output alphabets can be discrete or continuous. Furthermore, side information about the transmitted information may be available at the encoders and the decoder. Thus our system is very general and covers many systems studied earlier.

Figure 1: Transmission of correlated sources over a MAC with side information.

Initially we consider two sources $(U_1,U_2)$ and side information random variables $Z_1,Z_2,Z$ with a known joint distribution $F(u_1,u_2,z_1,z_2,z)$. Side information $Z_i$ is available to encoder $i$, $i=1,2$, and the decoder has side information $Z$. The random vector sequence $\{(U_{1n},U_{2n},Z_{1n},Z_{2n},Z_n),\ n\geq 1\}$ formed from the source outputs and the side information with distribution $F$ is independent and identically distributed (iid) in time. We will denote $\{U_{1k},\ k=1,\ldots,n\}$ by $U_1^n$, and similarly for the other sequences. The sources transmit their codewords $X_{in}$ to a single decoder through a memoryless multiple access channel. The channel output $Y$ has distribution $p(y|x_1,x_2)$ if $x_1$ and $x_2$ are transmitted at that time. Thus, $\{Y_n\}$ and $\{X_{1n},X_{2n}\}$ satisfy $p(y_k|y^{k-1},x_1^k,x_2^k)=p(y_k|x_{1k},x_{2k})$. The decoder receives $Y_n$ and also has access to the side information $Z_n$. The encoders at the two users do not communicate with each other except via the side information. The decoder uses the channel outputs and its side information to estimate the sensor observations $U_{in}$ as $\hat{U}_{in}$, $i=1,2$. It is of interest to find encoders and a decoder such that $\{U_{1n},U_{2n},\ n\geq 1\}$ can be transmitted over the given MAC with $E[d_1(U_1,\hat{U}_1)]\leq D_1$ and $E[d_2(U_2,\hat{U}_2)]\leq D_2$, where $d_i$ are non-negative distortion measures and $D_i$ are the given distortion constraints. If the distortion measures are unbounded we assume that there exist $u_i^*$ such that $E[d_i(U_i,u_i^*)]<\infty$, $i=1,2$. This covers the important special case of mean square error (MSE) if $E[U_i^2]<\infty$, $i=1,2$.

Source channel separation does not hold in this case.

For discrete sources a common distortion measure is Hamming distance,

$$d(x,x') = \begin{cases} 1, & \text{if } x\neq x',\\ 0, & \text{if } x=x'.\end{cases}$$

For continuous alphabet sources the most common distortion measure is $d(x,x')=(x-x')^2$. To obtain the results for the lossless case from our Theorem 1 below, we assume that $d_i(x,x')=0 \Leftrightarrow x=x'$, e.g., Hamming distance.

Definition: The source $(U_1^n,U_2^n)$ can be transmitted over the multiple access channel with distortions ${\bf D}\triangleq(D_1,D_2)$ if for any $\epsilon>0$ there is an $n_0$ such that for all $n>n_0$ there exist encoders $f_{E,i}^n:\mathcal{U}_i^n\times\mathcal{Z}_i^n\rightarrow\mathcal{X}_i^n$, $i=1,2$, and a decoder $f_D^n:\mathcal{Y}^n\times\mathcal{Z}^n\rightarrow(\hat{\mathcal{U}}_1^n,\hat{\mathcal{U}}_2^n)$ such that $\frac{1}{n}E\left[\sum_{j=1}^n d(U_{ij},\hat{U}_{ij})\right]\leq D_i+\epsilon$, $i=1,2$, where $(\hat{U}_1^n,\hat{U}_2^n)=f_D^n(Y^n,Z^n)$ and $\mathcal{U}_i,\ \mathcal{Z}_i,\ \mathcal{Z},\ \mathcal{X}_i,\ \mathcal{Y},\ \hat{\mathcal{U}}_i$ are the sets in which $U_i,\ Z_i,\ Z,\ X_i,\ Y,\ \hat{U}_i$ take values.

We denote the joint distribution of $(U_1,U_2)$ by $p(u_1,u_2)$. Also, $X\leftrightarrow Y\leftrightarrow Z$ will denote that $\{X,Y,Z\}$ form a Markov chain.

Now we state the main Theorem.

Theorem 1

A source can be transmitted over the multiple access channel with distortions $(D_1,D_2)$ if there exist random variables $(W_1,W_2,X_1,X_2)$ such that

(1) $p(u_1,u_2,z_1,z_2,z,w_1,w_2,x_1,x_2,y)=p(u_1,u_2,z_1,z_2,z)\,p(w_1|u_1,z_1)\,p(w_2|u_2,z_2)\,p(x_1|w_1)\,p(x_2|w_2)\,p(y|x_1,x_2)$

and
(2) there exists a function $f_D:\mathcal{W}_1\times\mathcal{W}_2\times\mathcal{Z}\rightarrow(\hat{\mathcal{U}}_1\times\hat{\mathcal{U}}_2)$ such that $E[d(U_i,\hat{U}_i)]\leq D_i$, $i=1,2$, where $(\hat{U}_1,\hat{U}_2)=f_D(W_1,W_2,Z)$, and the constraints

$$I(U_1,Z_1;W_1|W_2,Z) < I(X_1;Y|X_2,W_2,Z),$$
$$I(U_2,Z_2;W_2|W_1,Z) < I(X_2;Y|X_1,W_1,Z), \qquad (1)$$
$$I(U_1,U_2,Z_1,Z_2;W_1,W_2|Z) < I(X_1,X_2;Y|Z)$$

are satisfied, where $\mathcal{W}_i$ are the sets in which $W_i$ take values.

Proof: See Appendix A. $\blacksquare$

In the proof of Theorem 1 the encoding scheme involves distributed vector quantization $(W_1^n,W_2^n)$ of the sources $(U_1^n,U_2^n)$ and the side information $Z_1^n,Z_2^n$, followed by a correlation preserving mapping to the channel codewords $(X_1^n,X_2^n)$. The decoding approach involves first decoding $(W_1^n,W_2^n)$ and then obtaining the estimates $(\hat{U}_1^n,\hat{U}_2^n)$ as a function of $(W_1^n,W_2^n)$ and the decoder side information $Z^n$.

If the channel alphabets are continuous (e.g., GMAC) then in addition to the conditions in Theorem 1 certain power constraints $E[X_i^2]\leq P_i$, $i=1,2$, are also needed. In general, we could impose a more general constraint $E[g_i(X_i)]\leq\alpha_i$, where $g_i$ is some non-negative cost function. Furthermore, for continuous alphabet random variables (sources/channel input/output) we will assume that a probability density exists, so that one can use differential entropy (more general cases can be handled but for simplicity we will ignore them).

The dependence in $(U_1,U_2)$ is used in two ways in (1): to reduce the quantities on the left and to increase the quantities on the right. The side information $Z_1$ and $Z_2$ effectively increases the dependence in the inputs.

If source-channel separation holds then one can consider the capacity region of the channel. For example, when there is no side information $Z_1,Z_2,Z$ and the sources are independent, then we obtain the rate region

$$R_1\leq I(X_1;Y|X_2),\quad R_2\leq I(X_2;Y|X_1),\quad R_1+R_2\leq I(X_1,X_2;Y). \qquad (2)$$

This is the well known rate region of a MAC ([13]). To obtain (2) from (1), take $(Z_1,Z_2,Z)$ independent of $(U_1,U_2)$. Also, take $U_1,U_2$ discrete, $W_i=U_i$ and $X_i$ independent of $U_i$, $i=1,2$.

In Theorem 1 it is possible to include other distortion constraints. For example, in addition to the bounds on $E[d(U_i,\hat{U}_i)]$ one may want a bound on the joint distortion $E[d((U_1,U_2),(\hat{U}_1,\hat{U}_2))]$. Then the only modification needed in the statement of the above theorem is to include this also as a condition in defining $f_D$.

If we only want to estimate a function $g(U_1,U_2)$ at the decoder, and not $(U_1,U_2)$ themselves, then again one can use the techniques in the proof of Theorem 1 to obtain sufficient conditions. Depending upon $g$, the conditions needed may be weaker than those in (1). We will explore this in more detail in later work.

In our problem setup the side information $Z_i$ can be included with source $U_i$, and then we can consider this problem as one with no side information at the encoders. However, the above formulation has the advantage that our conditions (1) are explicit in $Z_i$.

The main problem in using Theorem 1 is in obtaining good source-channel coding schemes providing $(W_1,W_2,X_1,X_2)$ which satisfy the conditions in the theorem for a given source $(U_1,U_2)$ and a channel. A substantial part of this paper will be devoted to this problem.

II-A Extension to multiple sources

The above results can be generalized to the multiple ($\geq 2$) source case. Let $\mathcal{S}=\{1,2,\ldots,M\}$ be the set of sources with joint distribution $p(u_1,\ldots,u_M)$.

Theorem 2

Sources $(U_i^n,\ i\in\mathcal{S})$ can be communicated in a distributed fashion over the memoryless multiple access channel $p(y|x_i,\ i\in\mathcal{S})$ with distortions $(D_i,\ i\in\mathcal{S})$ if there exist auxiliary random variables $(W_i,X_i,\ i\in\mathcal{S})$ satisfying

(1) $p(u_i,z_i,z,w_i,x_i,y,\ i\in\mathcal{S})=p(u_i,z_i,z,\ i\in\mathcal{S})\,p(y|x_i,\ i\in\mathcal{S})\prod_{j\in\mathcal{S}}p(w_j|u_j,z_j)\,p(x_j|w_j),$

(2) there exists a function $f_D:\prod_{j\in\mathcal{S}}\mathcal{W}_j\times\mathcal{Z}\rightarrow(\hat{\mathcal{U}}_i,\ i\in\mathcal{S})$ such that $E[d(U_i,\hat{U}_i)]\leq D_i$, $i\in\mathcal{S}$, and the constraints

$$I(U_A,Z_A;W_A|W_{A^c},Z) < I(X_A;Y|X_{A^c},W_{A^c},Z) \quad \text{for all } A\subset\mathcal{S} \qquad (3)$$

are satisfied, where $U_A=(U_i,\ i\in A)$, $A^c$ is the complement of the set $A$, and similarly for the other random variables (in the case of continuous channel alphabets we also need the power constraints $E[X_i^2]\leq P_i$, $i=1,\ldots,|\mathcal{S}|$).

II-B Example

We provide an example to show the reduction possible in transmission rates by exploiting the correlation between the sources, the side information and the permissible distortions.

Consider $(U_1,U_2)$ with the joint distribution $P(U_1=0,U_2=0)=P(U_1=1,U_2=1)=1/3$, $P(U_1=1,U_2=0)=P(U_1=0,U_2=1)=1/6$. If we use independent encoders which do not exploit the correlation between the sources then we need $R_1\geq H(U_1)=1$ bit and $R_2\geq H(U_2)=1$ bit for lossless coding of the sources. If we use Slepian-Wolf coding ([38]), then $R_1\geq H(U_1|U_2)=0.918$ bits, $R_2\geq H(U_2|U_1)=0.918$ bits and $R_1+R_2\geq H(U_1,U_2)=1.918$ bits suffice.

Next consider a multiple access channel with $Y=X_1+X_2$, where $X_1$ and $X_2$ take values in the alphabet $\{0,1\}$ and $Y$ takes values in the alphabet $\{0,1,2\}$. This channel does not satisfy the separation conditions in [35]. The sum capacity $C$ of this channel with independent $X_1$ and $X_2$ is 1.5 bits, so under source-channel separation the given sources cannot be transmitted losslessly because $H(U_1,U_2)>C$. Now we use a joint source-channel code to increase the achievable sum rate: take $X_1=U_1$ and $X_2=U_2$. Then the sum rate of the channel improves to $I(X_1,X_2;Y)=1.585$ bits. This is still not enough to transmit the sources over the given MAC. Next we exploit the side information.
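As a quick numerical check (a Python sketch of our own, not part of the paper), the sum rates quoted above follow by enumerating the output distribution of the noiseless adder MAC; for a deterministic channel $I(X_1,X_2;Y)=H(Y)$.

```python
# Verifying the sum rates for the noiseless adder MAC Y = X1 + X2.
from collections import defaultdict
from math import log2

def entropy(p):
    return -sum(v * log2(v) for v in p.values() if v > 0)

# Joint source distribution P(U1, U2) from the example.
pU = {(0, 0): 1/3, (1, 1): 1/3, (1, 0): 1/6, (0, 1): 1/6}

# Case 1: independent, uniform channel inputs (source correlation ignored).
pY_indep = defaultdict(float)
for x1 in (0, 1):
    for x2 in (0, 1):
        pY_indep[x1 + x2] += 0.25
print("sum capacity, independent inputs:", entropy(pY_indep))   # 1.5 bits

# Case 2: uncoded, correlation-preserving mapping X1 = U1, X2 = U2.
pY_corr = defaultdict(float)
for (u1, u2), p in pU.items():
    pY_corr[u1 + u2] += p
print("sum rate, X_i = U_i:", entropy(pY_corr))                  # 1.585 bits
print("H(U1,U2):", entropy(pU))                                  # 1.918 bits
```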

Let the side information random variables be generated as follows. $Z_1$ is obtained from source 2 through a (low rate) binary symmetric channel (BSC) with crossover probability $p=0.3$; similarly, $Z_2$ is obtained from source 1 via a similar BSC. Let $Z=(Z_1,Z_2,V)$, where $V=U_1.U_2.N$, $N$ is a binary random variable with $P(N=0)=P(N=1)=0.5$ independent of $U_1$ and $U_2$, and ‘.’ denotes the logical AND operation. This models the case where the decoder has access to the encoder side information and also has some extra side information of its own. Then from (1), if we use just the side information $Z_1$, the required sum rate for the sources is 1.8 bits; by symmetry the same holds if we only have $Z_2$. If we use both $Z_1$ and $Z_2$ then a sum rate of 1.683 bits suffices. If only $V$ is used then the sum rate needed is 1.606 bits. So far we still cannot transmit $(U_1,U_2)$ losslessly with the coding $X_i=U_i$, $i=1,2$. If all the information in $Z_1,Z_2,V$ is used then we need $R_1+R_2\geq 1.4120$ bits. Thus with the aid of $Z_1,Z_2,Z$ we can transmit $(U_1,U_2)$ losslessly over the MAC even with independent $X_1$ and $X_2$.
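The conditional entropies quoted above can be reproduced by direct enumeration. The following sketch (our own illustration, assuming $Z_1,Z_2$ are BSC(0.3) observations of $U_2,U_1$ and $V=U_1.U_2.N$ as just described) computes the required sum rates $H(U_1,U_2|\,\cdot\,)$ for three of the cases.

```python
# Enumerate P(U1,U2,Z1,Z2,V) and compute conditional joint entropies.
from itertools import product
from math import log2

pU = {(0, 0): 1/3, (1, 1): 1/3, (1, 0): 1/6, (0, 1): 1/6}
bsc = lambda z, u: 0.7 if z == u else 0.3          # crossover probability 0.3

joint = {}                                          # P(u1,u2,z1,z2,v)
for (u1, u2), p in pU.items():
    for z1, z2, n in product((0, 1), repeat=3):
        key = (u1, u2, z1, z2, u1 * u2 * n)
        joint[key] = joint.get(key, 0.0) + p * bsc(z1, u2) * bsc(z2, u1) * 0.5

def H(dist):
    return -sum(p * log2(p) for p in dist.values() if p > 0)

def marginal(keep):
    out = {}
    for k, p in joint.items():
        kk = tuple(k[i] for i in keep)
        out[kk] = out.get(kk, 0.0) + p
    return out

# H(U1,U2 | side info) = H(U1,U2,side) - H(side)
print(H(marginal((0, 1, 2))) - H(marginal((2,))))             # ~1.800 (Z1 only)
print(H(marginal((0, 1, 2, 3))) - H(marginal((2, 3))))        # ~1.683 (Z1 and Z2)
print(H(marginal((0, 1, 2, 3, 4))) - H(marginal((2, 3, 4))))  # ~1.412 (Z1, Z2, V)
```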

Next we take the distortion measure to be the Hamming distance and allow a distortion of 4%. Then for compressing the individual sources without side information we need $R_i\geq H(p)-H(d)=0.758$ bits, $i=1,2$, where $H(x)=-x\log_2(x)-(1-x)\log_2(1-x)$. Thus we still cannot transmit $(U_1,U_2)$ with this distortion when $(X_1,X_2)$ are independent. Next assume the side information $Z=(Z_1,Z_2)$ is available at the decoder only. Then we need $R_1\geq I(U_1;W_1)-I(Z_1;W_1)$, where $W_1$ is an auxiliary random variable generated from $U_1$. This gives $R_1\geq 0.6577$ bits and $R_2\geq 0.6577$ bits, and we can transmit with independent $X_1$ and $X_2$.
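The no-side-information figure follows from the binary rate distortion function for a uniform source; a one-line check (ours, not from the paper):

```python
# R(D) = H(1/2) - H(d) = 1 - H(0.04) for a uniform binary source, Hamming distortion.
from math import log2
Hb = lambda x: -x * log2(x) - (1 - x) * log2(1 - x)
print(1 - Hb(0.04))   # ~0.758 bits per source symbol
```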

III Special Cases

In the following we show that our result contains several previous studies as special cases. The practically important special case of GMAC will be studied in detail in later sections. There we will discuss several specific joint source-channel coding schemes for GMAC and compare their performance.

III-A Lossless multiple access communication with correlated sources

Take $(Z_1,Z_2,Z)\bot(U_1,U_2)$ ($X\bot Y$ denotes that the random variable $X$ is independent of the random variable $Y$) and $W_1=U_1$, $W_2=U_2$, where $U_1,U_2$ are discrete sources. Then the constraints of (1) reduce to

$$H(U_1|U_2)<I(X_1;Y|X_2,U_2),\quad H(U_2|U_1)<I(X_2;Y|X_1,U_1),\quad H(U_1,U_2)<I(X_1,X_2;Y) \qquad (4)$$

where $X_1\leftrightarrow U_1\leftrightarrow U_2\leftrightarrow X_2$. These are the conditions obtained in [12].

If $U_1,U_2$ are independent, then $H(U_1|U_2)=H(U_1)$ and $I(X_1;Y|X_2,U_2)=I(X_1;Y|X_2)$.

III-B Lossy multiple access communication

Take $(Z_1,Z_2,Z)\bot(U_1,U_2,W_1,W_2)$. In this case the constraints in (1) reduce to

$$I(U_1;W_1|W_2)<I(X_1;Y|X_2,W_2),\quad I(U_2;W_2|W_1)<I(X_2;Y|X_1,W_1),$$
$$I(U_1,U_2;W_1,W_2)<I(X_1,X_2;Y). \qquad (5)$$

This is an immediate generalization of [12] to the lossy case.

III-C Lossless multiple access communication with common information

Consider $U_1=(U_1',U_0')$, $U_2=(U_2',U_0')$, where $U_0',U_1',U_2'$ are independent of each other. $U_0'$ is interpreted as the common information at the two encoders. Then, taking $(Z_1,Z_2,Z)\bot(U_1,U_2)$, $W_1=U_1$ and $W_2=U_2$, we obtain the sufficient conditions for lossless transmission

$$H(U_1')<I(X_1;Y|X_2,U_0'),\quad H(U_2')<I(X_2;Y|X_1,U_0'),$$
$$H(U_1')+H(U_2')+H(U_0')<I(X_1,X_2;Y). \qquad (6)$$

This provides the capacity region of the MAC with common information available in [37].

Our results generalize this result to lossy transmission also.

III-D Lossy distributed source coding with side information

The multiple access channel is taken as a dummy channel which reproduces its inputs. In this case we obtain that the sources can be coded with rates $R_1$ and $R_2$ to achieve the specified distortions at the decoder if

$$R_1>I(U_1,Z_1;W_1|W_2,Z),\quad R_2>I(U_2,Z_2;W_2|W_1,Z),$$
$$R_1+R_2>I(U_1,U_2,Z_1,Z_2;W_1,W_2|Z) \qquad (7)$$

where $R_1,R_2$ are obtained by taking $X_1\bot X_2$.

This recovers the result in [39], and generalizes the results in  [19][38][42].

III-E Correlated sources with lossless transmission over MAC with receiver side information

If we consider $(Z_1,Z_2)\bot(U_1,U_2)$, $W_1=U_1$ and $W_2=U_2$, then we recover the conditions

$$H(U_1|U_2,Z)<I(X_1;Y|X_2,U_2,Z),\quad H(U_2|U_1,Z)<I(X_2;Y|X_1,U_1,Z),$$
$$H(U_1,U_2|Z)<I(X_1,X_2;Y|Z) \qquad (8)$$

of Theorem 2.1 in [23].

III-F Mixed Side Information

The aim here is to determine the rate distortion function for transmitting a source $X$ with the aid of side information $(Y,Z)$ (the system in Fig. 1(c) of [16]). The encoder is provided with $Y$ and the decoder has access to both $Y$ and $Z$. This is the mixed side information (MSI) system, which combines the conditional rate distortion system and the Wyner-Ziv system, and it has the systems in Fig. 1(a) and (b) of [16] as special cases.

The result for Fig. 1(c) can be recovered from our theorem if we take $X,Y,Z,W$ in [16] as $U_1=X$, $Z=(Z,Y)$, $Z_1=Y$ and $W_1=W$. We also take $U_2$ and $Z_2$ to be constants. The achievable rate region is given by $R>I(X;W|Y,Z)$, where $W$ is a random variable with the property $W\leftrightarrow(X,Y)\leftrightarrow Z$ and for which there exists a decoder function such that the distortion constraints are met.

III-G Compound MAC and Interference channel with side information

In the compound MAC, sources $U_1$ and $U_2$ are transmitted through a MAC which has two outputs $Y_1$ and $Y_2$. Decoder $i$ is provided with $Y_i$ and $Z_i$, $i=1,2$, and each decoder has to reconstruct both sources. We take $W_1=U_1$ and $W_2=U_2$ and view this system as two MACs. Applying (1) twice, we have, for $i=1,2$,

$$H(U_1|U_2,Z_i)<I(X_1;Y_i|X_2,U_2,Z_i),\quad H(U_2|U_1,Z_i)<I(X_2;Y_i|X_1,U_1,Z_i),$$
$$H(U_1,U_2|Z_i)<I(X_1,X_2;Y_i|Z_i). \qquad (9)$$

This recovers the achievability result in [22]. It also provides the achievability conditions in [22] for the interference channel under strong interference.

III-H Correlated sources over orthogonal channels with side information

The sources transmit their codewords $X_i$ to a single decoder through memoryless orthogonal channels with transition probabilities $p(y_1|x_1)$ and $p(y_2|x_2)$. Hence in the theorem, $Y=(Y_1,Y_2)$ and $Y_1\leftrightarrow X_1\leftrightarrow W_1\leftrightarrow(U_1,Z_1)\leftrightarrow(U_2,Z_2)\leftrightarrow W_2\leftrightarrow X_2\leftrightarrow Y_2$. In this case the constraints in (1) reduce to

$$I(U_1,Z_1;W_1|W_2,Z) < I(X_1;Y_1|W_2,Z)\leq I(X_1;Y_1),$$
$$I(U_2,Z_2;W_2|W_1,Z) < I(X_2;Y_2|W_1,Z)\leq I(X_2;Y_2), \qquad (10)$$
$$I(U_1,U_2,Z_1,Z_2;W_1,W_2|Z) < I(X_1,X_2;Y_1,Y_2|Z)\leq I(X_1;Y_1)+I(X_2;Y_2).$$

The outer bounds in (10) are attained if the channel codewords $(X_1,X_2)$ are independent of each other. Also, the distribution of $(X_1,X_2)$ maximizing these bounds does not depend on the distribution of $(U_1,U_2)$.

Using Fano's inequality, for lossless transmission of discrete sources over discrete channels with side information, we can show that the outer bounds in (10) are in fact necessary and sufficient. The proof of the converse is given in Appendix B.

If we take $W_1=U_1$, $W_2=U_2$ and the side information $(Z_1,Z_2,Z)\bot(U_1,U_2)$, we recover the necessary and sufficient conditions in [7].

III-I Gaussian sources over a Gaussian MAC

Let $(U_1,U_2)$ be jointly Gaussian with mean zero, variances $\sigma_i^2$, $i=1,2$, and correlation $\rho$. These sources have to be communicated over a Gaussian MAC whose output $Y_n$ at time $n$ is given by $Y_n=X_{1n}+X_{2n}+N_n$, where $X_{1n}$ and $X_{2n}$ are the channel inputs at time $n$ and $N_n$ is a Gaussian random variable independent of $X_{1n}$ and $X_{2n}$ with $E[N_n]=0$ and $var(N_n)=\sigma_N^2$. The power constraints are $E[X_i^2]\leq P_i$, $i=1,2$. The distortion measure is the mean square error (MSE). We take $(Z_1,Z_2,Z)\bot(U_1,U_2)$. We choose $W_1$ and $W_2$ according to the coding scheme given in [27]; $X_1$ and $X_2$ are scaled versions of $W_1$ and $W_2$ respectively. Then from (1) we find that the rates at which $W_1$ and $W_2$ are encoded satisfy

$$R_1\leq 0.5\log\left[\frac{P_1}{\sigma_N^2}+\frac{1}{1-\tilde{\rho}^2}\right],\quad R_2\leq 0.5\log\left[\frac{P_2}{\sigma_N^2}+\frac{1}{1-\tilde{\rho}^2}\right],$$
$$R_1+R_2\leq 0.5\log\left[\frac{\sigma_N^2+P_1+P_2+2\tilde{\rho}\sqrt{P_1P_2}}{(1-\tilde{\rho}^2)\sigma_N^2}\right], \qquad (11)$$

where $\tilde{\rho}$ is the correlation between $X_1$ and $X_2$. The distortions achieved are

$$D_1\geq var(U_1|W_1,W_2)=\frac{\sigma_1^2\,2^{-2R_1}\left[1-\rho^2\left(1-2^{-2R_2}\right)\right]}{1-\tilde{\rho}^2},$$
$$D_2\geq var(U_2|W_1,W_2)=\frac{\sigma_2^2\,2^{-2R_2}\left[1-\rho^2\left(1-2^{-2R_1}\right)\right]}{1-\tilde{\rho}^2}.$$

This recovers the sufficient conditions in [27].
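As an illustration (the parameter values below are our own choices, not taken from the paper), the rate bounds (11) and the distortion expressions above them can be evaluated directly; all rates are in bits (base-2 logarithms).

```python
# Evaluate (11) and the distortion lower bounds for illustrative parameters.
from math import log2, sqrt

P1, P2, sigN2 = 2.0, 2.0, 1.0        # channel powers and noise variance (assumed)
sig1_2, sig2_2, rho = 1.0, 1.0, 0.5  # source variances and correlation (assumed)
rho_t = 0.3                          # channel-input correlation (assumed)

R1_max = 0.5 * log2(P1 / sigN2 + 1.0 / (1 - rho_t**2))
R2_max = 0.5 * log2(P2 / sigN2 + 1.0 / (1 - rho_t**2))
Rsum_max = 0.5 * log2((sigN2 + P1 + P2 + 2 * rho_t * sqrt(P1 * P2))
                      / ((1 - rho_t**2) * sigN2))

def distortions(R1, R2):
    # Transcription of the distortion expressions following (11).
    D1 = sig1_2 * 2**(-2*R1) * (1 - rho**2 * (1 - 2**(-2*R2))) / (1 - rho_t**2)
    D2 = sig2_2 * 2**(-2*R2) * (1 - rho**2 * (1 - 2**(-2*R1))) / (1 - rho_t**2)
    return D1, D2

R1 = R2 = Rsum_max / 2    # a symmetric operating point (also below R1_max, R2_max here)
print(R1_max, R2_max, Rsum_max)
print(distortions(R1, R2))
```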

IV Discrete Alphabet Sources over Gaussian MAC

This system is of considerable practical interest. For example, in a sensor network, the observations sensed by the sensor nodes are discretized and then transmitted over a GMAC. The physical proximity of the sensor nodes makes their observations correlated, and this correlation can be exploited to compress the transmitted data and increase the channel capacity. We present a novel distributed ‘correlation preserving’ joint source-channel coding scheme yielding jointly Gaussian channel codewords which transmit the data efficiently over a GMAC.

Sufficient conditions for lossless transmission of two discrete correlated sources $(U_1,U_2)$ (generating iid sequences in time) over a general MAC with no side information were obtained in (4).

In this section we further specialize these results to a GMAC: $Y=X_1+X_2+N$, where $N$ is a Gaussian random variable independent of $X_1$ and $X_2$ with $E[N]=0$ and $Var(N)=\sigma_N^2$. We also have the transmit power constraints $E[X_i^2]\leq P_i$, $i=1,2$. Since source-channel separation does not hold for this system, a joint source-channel coding scheme is needed for optimal performance.

The dependence of the right hand side (RHS) of (4) on the source alphabets prevents us from getting a closed form expression for the admissibility criterion. Therefore we relax the conditions by removing this dependence, in order to obtain good joint source-channel codes.

Lemma 1

Under our assumptions, $I(X_1;Y|X_2,U_2)\leq I(X_1;Y|X_2)$.

Proof: See Appendix C. $\blacksquare$

Thus from (4),

$$H(U_1|U_2) < I(X_1;Y|X_2,U_2)\leq I(X_1;Y|X_2), \qquad (12)$$
$$H(U_2|U_1) < I(X_2;Y|X_1,U_1)\leq I(X_2;Y|X_1), \qquad (13)$$
$$H(U_1,U_2) < I(X_1,X_2;Y). \qquad (14)$$

The relaxation of the upper bounds is only in (12) and (13) and not in (14).

We show that the relaxed upper bounds are maximized if $(X_1,X_2)$ is jointly Gaussian and the correlation $\rho$ between $X_1$ and $X_2$ is high (the highest possible $\rho$ may not give the largest upper bounds in (12)-(14)).

Lemma 2

A jointly Gaussian distribution for $(X_1,X_2)$ maximizes $I(X_1;Y|X_2)$, $I(X_2;Y|X_1)$ and $I(X_1,X_2;Y)$ simultaneously.

Proof: See Appendix C. $\blacksquare$

The difference between the bounds in (12) is

$$I(X_1;Y|X_2)-I(X_1;Y|X_2,U_2)=I(X_1+N;U_2|X_2). \qquad (15)$$

This difference is small if the correlation between $(U_1,U_2)$ is small. In that case $H(U_1|U_2)$ and $H(U_2|U_1)$ will be large and (12) and (13) can be active constraints. If the correlation between $(U_1,U_2)$ is large, $H(U_1|U_2)$ and $H(U_2|U_1)$ will be small and (14) will be the only active constraint. In this case the difference between the two bounds in (12) and (13) is large but not important. Thus, the outer bounds in (12) and (13) are close to the inner bounds whenever the constraints (12) and (13) are active. Often (14) will be the only active constraint.

Based on Lemma 2, we use jointly Gaussian channel inputs $(X_1,X_2)$ with the transmit power constraints. Thus we take $(X_1,X_2)$ with mean vector $[0~~0]$ and covariance matrix $K_{X_1,X_2}=\begin{pmatrix}P_1&\rho\sqrt{P_1P_2}\\ \rho\sqrt{P_1P_2}&P_2\end{pmatrix}$. The outer bounds in (12)-(14) become $0.5\log\left[1+\frac{P_1(1-\rho^2)}{\sigma_N^2}\right]$, $0.5\log\left[1+\frac{P_2(1-\rho^2)}{\sigma_N^2}\right]$ and $0.5\log\left[1+\frac{P_1+P_2+2\rho\sqrt{P_1P_2}}{\sigma_N^2}\right]$ respectively. The first two upper bounds decrease as $\rho$ increases, but the third upper bound increases with $\rho$, and often the third constraint is the limiting one. Thus, once $(X_1,X_2)$ are obtained we can check the sufficient conditions (4). If these are not satisfied for the $(X_1,X_2)$ obtained, we increase the correlation $\rho$ between $(X_1,X_2)$ if possible (see details below). Increasing the correlation in $(X_1,X_2)$ decreases the difference in (15) and increases the possibility of satisfying (4) when the outer bounds in (12) and (13) are satisfied. If not, we can increase $\rho$ further till we satisfy (4).

The next lemma provides an upper bound on the correlation $\rho$ between $(X_1,X_2)$ that is possible, in terms of the distribution of $(U_1,U_2)$.

Lemma 3

Let $(U_1,U_2)$ be the correlated sources and $X_1\leftrightarrow U_1\leftrightarrow U_2\leftrightarrow X_2$, where $X_1$ and $X_2$ are jointly Gaussian. Then the correlation $\rho$ between $(X_1,X_2)$ satisfies $\rho^2\leq 1-2^{-2I(U_1;U_2)}$.

Proof: See Appendix C. $\blacksquare$

It is stated in [35], without proof, that the correlation between $(X_1,X_2)$ cannot be greater than the correlation of the sources $(U_1,U_2)$. Lemma 3 gives a tighter bound in many cases. Consider $(U_1,U_2)$ with the joint distribution $P(U_1=0,U_2=0)=P(U_1=1,U_2=1)=0.4444$, $P(U_1=1,U_2=0)=P(U_1=0,U_2=1)=0.0556$. The correlation between the sources is 0.7778, but from Lemma 3 the correlation between $(X_1,X_2)$ cannot exceed 0.7055.
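The numbers in this example are easy to reproduce; the following sketch (ours, not from the paper) computes the source correlation and the Lemma 3 bound from the given joint distribution.

```python
# Source correlation vs. the Lemma 3 bound sqrt(1 - 2^{-2 I(U1;U2)}).
from math import log2, sqrt

pU = {(0, 0): 0.4444, (1, 1): 0.4444, (1, 0): 0.0556, (0, 1): 0.0556}
H = lambda d: -sum(p * log2(p) for p in d.values() if p > 0)

pU1 = {u: sum(p for (a, b), p in pU.items() if a == u) for u in (0, 1)}
pU2 = {u: sum(p for (a, b), p in pU.items() if b == u) for u in (0, 1)}
I12 = H(pU1) + H(pU2) - H(pU)                       # I(U1;U2)

m1 = sum(a * p for (a, b), p in pU.items())         # E[U1]
m2 = sum(b * p for (a, b), p in pU.items())         # E[U2]
cov = sum(a * b * p for (a, b), p in pU.items()) - m1 * m2
var1, var2 = m1 - m1**2, m2 - m2**2                 # Bernoulli variances
print("source correlation:", cov / sqrt(var1 * var2))   # ~0.7778
print("Lemma 3 bound:", sqrt(1 - 2**(-2 * I12)))         # ~0.7055
```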

IV-A A coding Scheme

In this section we develop a distributed coding scheme for mapping the discrete alphabets $(U_1,U_2)$ into jointly Gaussian correlated codewords $(X_1,X_2)$ which satisfy (4) and the Markov condition. The heart of the scheme is to approximate a jointly Gaussian density by a sum of products of Gaussian marginals. Although the following lemma is stated for two dimensional vectors $(X_1,X_2)$, the result holds for any finite dimensional vectors (hence it can be used for any number of users sharing the MAC).

Lemma 4

Any two dimensional jointly Gaussian density can be uniformly and arbitrarily closely approximated by a weighted sum of products of marginal Gaussian densities:

$$\sum_{i=1}^{N}\frac{p_i}{\sqrt{2\pi c_{1i}}}e^{\frac{-1}{2c_{1i}}(x_1-a_{1i})^2}\,\frac{q_i}{\sqrt{2\pi c_{2i}}}e^{\frac{-1}{2c_{2i}}(x_2-a_{2i})^2}. \qquad (16)$$

Proof: See Appendix C. $\blacksquare$

From the above lemma we can form a sequence of functions $f_n(x_1,x_2)$ of type (16) such that $\sup_{x_1,x_2}|f_n(x_1,x_2)-f(x_1,x_2)|\rightarrow 0$ as $n\rightarrow\infty$, where $f$ is a given jointly Gaussian density. Although the $f_n$ are not guaranteed to be probability densities, due to the uniform convergence they will almost be for large $n$. In the following lemma we will assume that we have made the minor modification needed to ensure that $f_n$ is a proper density for large enough $n$. This lemma shows that obtaining $(X_1,X_2)$ from such approximations can approach the (relaxed) upper bounds in (12)-(14) (we actually show this only for the third inequality, but it can be shown for the other inequalities in the same way). Of course, as mentioned earlier, these can then be used to obtain $(X_1,X_2)$ which satisfy the actual bounds in (4).

Let $(X_{m1},X_{m2})$ and $(X_1,X_2)$ be random variables with densities $f_m$ and $f$ such that $\sup_{x_1,x_2}|f_m(x_1,x_2)-f(x_1,x_2)|\rightarrow 0$ as $m\rightarrow\infty$. Let $Y_m$ and $Y$ denote the corresponding channel outputs.

Lemma 5

For the random variables defined above, if $\{\log f_m(Y_m),\ m\geq 1\}$ is uniformly integrable, then $I(X_{m1},X_{m2};Y_m)\rightarrow I(X_1,X_2;Y)$ as $m\rightarrow\infty$.

Proof: See Appendix C. $\blacksquare$

A set of sufficient conditions for uniform integrability of $\{\log f_m(Y_m),\ m\geq 1\}$ is:

(1) the number of components in (16) is bounded above;

(2) the variances of the component densities in (16) are bounded above and bounded away from zero;

(3) the means of the component densities in (16) lie in a bounded set.

From Lemma 4 a jointly Gaussian density with any correlation can be expressed as a linear combination of products of marginal Gaussian densities. But the coefficients $p_i$ and $q_i$ in (16) may be positive or negative. To realize our coding scheme, we would like the $p_i$'s and $q_i$'s to be non-negative. This introduces constraints on the Gaussian densities realizable by our coding scheme. For example, from Lemma 3, the correlation $\rho$ between $X_1$ and $X_2$ cannot exceed $\sqrt{1-2^{-2I(U_1;U_2)}}$. There is also the question of finding a good linear combination of marginal densities to approximate the joint density for a given $N$ in (16).

This motivates us to consider an optimization procedure for finding $p_i,\ q_i$, $a_{1i},\ a_{2i}$, $c_{1i}$ and $c_{2i}$ in (16) that provides the best approximation to a given jointly Gaussian density. We illustrate this with an example. Consider $U_1,U_2$ binary. Let $P(U_1=0,U_2=0)=p_{00}$, $P(U_1=0,U_2=1)=p_{01}$, $P(U_1=1,U_2=0)=p_{10}$ and $P(U_1=1,U_2=1)=p_{11}$. Define (the notation in the following is slightly changed compared to (16))

$$f(X_1=\cdot\,|U_1=0)=p_{101}\mathcal{N}(a_{101},c_{101})+p_{102}\mathcal{N}(a_{102},c_{102})+\cdots+p_{10r_1}\mathcal{N}(a_{10r_1},c_{10r_1}), \qquad (17)$$
$$f(X_1=\cdot\,|U_1=1)=p_{111}\mathcal{N}(a_{111},c_{111})+p_{112}\mathcal{N}(a_{112},c_{112})+\cdots+p_{11r_2}\mathcal{N}(a_{11r_2},c_{11r_2}), \qquad (18)$$
$$f(X_2=\cdot\,|U_2=0)=p_{201}\mathcal{N}(a_{201},c_{201})+p_{202}\mathcal{N}(a_{202},c_{202})+\cdots+p_{20r_3}\mathcal{N}(a_{20r_3},c_{20r_3}), \qquad (19)$$
$$f(X_2=\cdot\,|U_2=1)=p_{211}\mathcal{N}(a_{211},c_{211})+p_{212}\mathcal{N}(a_{212},c_{212})+\cdots+p_{21r_4}\mathcal{N}(a_{21r_4},c_{21r_4}), \qquad (20)$$

where $\mathcal{N}(a,b)$ denotes the Gaussian density with mean $a$ and variance $b$. Let $\underline{p}$ be the vector with components $p_{101},\ldots,p_{10r_1}$, $p_{111},\ldots,p_{11r_2}$, $p_{201},\ldots,p_{20r_3}$, $p_{211},\ldots,p_{21r_4}$. Similarly we denote by $\underline{a}$ and $\underline{c}$ the vectors with components $a_{101},\ldots,a_{10r_1}$, $a_{111},\ldots,a_{11r_2}$, $a_{201},\ldots,a_{20r_3}$, $a_{211},\ldots,a_{21r_4}$ and $c_{101},\ldots,c_{10r_1}$, $c_{111},\ldots,c_{11r_2}$, $c_{201},\ldots,c_{20r_3}$, $c_{211},\ldots,c_{21r_4}$. The mixtures of Gaussian densities (17)-(20) will be used to obtain the RHS in (16) for an optimal approximation. For a given $\underline{p},\underline{a},\underline{c}$, the resulting joint density is $g_{\underline{p},\underline{a},\underline{c}}=p_{00}f(X_1=\cdot\,|U_1=0)f(X_2=\cdot\,|U_2=0)+p_{01}f(X_1=\cdot\,|U_1=0)f(X_2=\cdot\,|U_2=1)+p_{10}f(X_1=\cdot\,|U_1=1)f(X_2=\cdot\,|U_2=0)+p_{11}f(X_1=\cdot\,|U_1=1)f(X_2=\cdot\,|U_2=1)$.

Let $f_\rho(x_1,x_2)$ be the jointly Gaussian density that we want to approximate. Let it have zero mean and covariance matrix $K_{X_1,X_2}=\begin{pmatrix}1&\rho\\ \rho&1\end{pmatrix}$. The best $g_{\underline{p},\underline{a},\underline{c}}$ is obtained by solving the minimization problem

$$\min_{\underline{p},\underline{a},\underline{c}}\int\left[g_{\underline{p},\underline{a},\underline{c}}(x_1,x_2)-f_\rho(x_1,x_2)\right]^2 dx_1\,dx_2 \qquad (21)$$

subject to

$$(p_{00}+p_{01})\sum_{i=1}^{r_1}p_{10i}a_{10i}+(p_{10}+p_{11})\sum_{i=1}^{r_2}p_{11i}a_{11i}=0,$$
$$(p_{00}+p_{10})\sum_{i=1}^{r_3}p_{20i}a_{20i}+(p_{01}+p_{11})\sum_{i=1}^{r_4}p_{21i}a_{21i}=0,$$
$$(p_{00}+p_{01})\sum_{i=1}^{r_1}p_{10i}(c_{10i}+a_{10i}^2)+(p_{10}+p_{11})\sum_{i=1}^{r_2}p_{11i}(c_{11i}+a_{11i}^2)=1,$$
$$(p_{00}+p_{10})\sum_{i=1}^{r_3}p_{20i}(c_{20i}+a_{20i}^2)+(p_{01}+p_{11})\sum_{i=1}^{r_4}p_{21i}(c_{21i}+a_{21i}^2)=1,$$
$$\sum_{i=1}^{r_1}p_{10i}=1,\quad \sum_{i=1}^{r_2}p_{11i}=1,\quad \sum_{i=1}^{r_3}p_{20i}=1,\quad \sum_{i=1}^{r_4}p_{21i}=1,$$
$$p_{10i}\geq 0,\ c_{10i}\geq 0 \text{ for } i\in\{1,\ldots,r_1\},\qquad p_{11i}\geq 0,\ c_{11i}\geq 0 \text{ for } i\in\{1,\ldots,r_2\},$$
$$p_{20i}\geq 0,\ c_{20i}\geq 0 \text{ for } i\in\{1,\ldots,r_3\},\qquad p_{21i}\geq 0,\ c_{21i}\geq 0 \text{ for } i\in\{1,\ldots,r_4\}.$$

These constraints ensure that the resulting distribution $g$ for $(X_1,X_2)$ satisfies $E[X_i]=0$ and $E[X_i^2]=1$, $i=1,2$.
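A minimal numerical sketch of (21) is given below, assuming NumPy/SciPy are available. The grid, the starting point, the use of SLSQP and the choice of two-component mixtures per conditional ($r_i=2$) are illustrative assumptions of ours; the paper's own computation was done in MATLAB (Section IV-B).

```python
# Fit the conditional mixtures so the induced joint density of (X1, X2)
# is close (in L2 over a grid) to a bivariate Gaussian with correlation rho.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

p00, p01, p10, p11 = 1/3, 1/3, 0.0, 1/3      # P(U1,U2) from the example in IV-B
rho = 0.3
x = np.linspace(-4, 4, 81)
X1, X2 = np.meshgrid(x, x, indexing="ij")
dx = x[1] - x[0]
# Target: zero-mean, unit-variance bivariate Gaussian with correlation rho.
f_rho = np.exp(-(X1**2 - 2*rho*X1*X2 + X2**2) / (2*(1 - rho**2))) \
        / (2*np.pi*np.sqrt(1 - rho**2))

def unpack(theta):
    # theta holds (w1, a1, a2, log c1, log c2) for each of the 4 conditionals.
    out = []
    for k in range(4):
        w1, a1, a2, lc1, lc2 = theta[5*k:5*k+5]
        out.append(((w1, 1 - w1), (a1, a2), (np.exp(lc1), np.exp(lc2))))
    return out

def conditional(params, grid):
    w, a, c = params
    return sum(wi * norm.pdf(grid, ai, np.sqrt(ci)) for wi, ai, ci in zip(w, a, c))

def induced_joint(theta):
    f10, f11, f20, f21 = [conditional(p, g) for p, g in
                          zip(unpack(theta), (X1, X1, X2, X2))]
    return p00*f10*f20 + p01*f10*f21 + p10*f11*f20 + p11*f11*f21

def objective(theta):
    return np.sum((induced_joint(theta) - f_rho)**2) * dx * dx

def moment_constraints(theta):
    # E[X_i] = 0 and E[X_i^2] = 1 under the induced marginals.
    g = []
    for i in (0, 1):
        f10, f11, f20, f21 = [conditional(p, x) for p in unpack(theta)]
        if i == 0:
            marg = (p00 + p01) * f10 + (p10 + p11) * f11
        else:
            marg = (p00 + p10) * f20 + (p01 + p11) * f21
        g += [np.sum(x * marg) * dx, np.sum(x**2 * marg) * dx - 1.0]
    return np.array(g)

theta0 = np.tile([0.5, -0.5, 0.5, 0.0, 0.0], 4)
res = minimize(objective, theta0, method="SLSQP",
               constraints=[{"type": "eq", "fun": moment_constraints}],
               bounds=[(0, 1), (-3, 3), (-3, 3), (-3, 1), (-3, 1)] * 4)
print(res.fun, unpack(res.x))
```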

The above scheme is used to obtain a codebook as follows. If user 1 produces $U_1=0$, then encoder 1, with probability $p_{10i}$, obtains the codeword symbol $X_1$ from the distribution $\mathcal{N}(a_{10i},c_{10i})$, independently of the other codewords. The codewords for $U_1=1$ and for user 2 are obtained similarly. Once the encoder maps have been found, the encoding and decoding are as described in the proof of Theorem 1. The decoding is done by joint typicality of the received $Y^n$ with $(U_1^n,U_2^n)$.
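For concreteness, a small sketch of this codebook construction is given below; the mixture parameters are placeholders of ours, not fitted values from the optimization.

```python
# Symbol-wise codeword generation for user 1: for each source symbol u,
# pick a mixture component of f(X1 | U1 = u) and draw a Gaussian channel symbol.
import numpy as np
rng = np.random.default_rng(0)

# f(X1 | U1 = u): list of (weight, mean, variance) triples -- placeholder values.
mixture = {0: [(0.6, -0.8, 0.5), (0.4, 0.9, 0.6)],
           1: [(0.5, -0.2, 1.0), (0.5, 0.4, 0.9)]}

def codeword(u_seq):
    out = np.empty(len(u_seq))
    for j, u in enumerate(u_seq):
        w, a, c = zip(*mixture[u])
        k = rng.choice(len(w), p=w)               # pick a mixture component
        out[j] = rng.normal(a[k], np.sqrt(c[k]))  # then draw the channel symbol
    return out

print(codeword(rng.integers(0, 2, size=8)))       # a toy codeword for user 1
```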

This coding scheme can be extended to any discrete alphabet case. We give an example below to illustrate the coding scheme.

IV-B Example

Consider $(U_1,U_2)$ with the joint distribution $P(U_1=0,U_2=0)=P(U_1=1,U_2=1)=P(U_1=0,U_2=1)=1/3$, $P(U_1=1,U_2=0)=0$, and power constraints $P_1=3$, $P_2=4$. Also consider a GMAC with $\sigma_N^2=1$. If the sources are mapped into independent channel codewords, then the sum rate condition in (14) with $\rho=0$ should hold. The LHS evaluates to 1.585 bits whereas the RHS is 1.5 bits. Thus (14) is violated and hence the sufficient conditions in (4) are also violated.

In the following we explore the possibility of using correlated $(X_1,X_2)$ to see if we can transmit this source over the given MAC. The inputs $(U_1,U_2)$ can be distributedly mapped to jointly Gaussian channel codewords $(X_1,X_2)$ by the technique described above. The maximum values of $\rho$ which satisfy the outer bounds in (12) and (13) are 0.7024 and 0.7874 respectively, and the minimum $\rho$ which satisfies (14) is 0.144. From Lemma 3, $\rho$ is upper bounded by 0.546. Therefore we want to obtain jointly Gaussian $(X_1,X_2)$ satisfying $X_1\leftrightarrow U_1\leftrightarrow U_2\leftrightarrow X_2$ with correlation $\rho\in[0.144, 0.546]$. If we choose $\rho=0.3$, the inner bounds in (12)-(14) (i.e., the bounds in (4)) are met: $I(X_1;Y|X_2,U_2)=0.792$, $I(X_2;Y|X_1,U_1)=0.996$, $I(X_1;Y|X_2)=0.949$, $I(X_2;Y|X_1)=1.107$, $H(U_1|U_2)=H(U_2|U_1)=0.66$.
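The thresholds on $\rho$ quoted above can be verified numerically; the following sketch (ours, not from the paper) evaluates the outer bounds of (12)-(14) for $P_1=3$, $P_2=4$, $\sigma_N^2=1$ and inverts them (all logs base 2).

```python
# Check the rho thresholds for the example: outer bounds of (12)-(14).
from math import log2, sqrt

P1, P2, sigN2 = 3.0, 4.0, 1.0
pU = {(0, 0): 1/3, (1, 1): 1/3, (0, 1): 1/3}      # P(U1=1,U2=0) = 0 omitted
H = lambda d: -sum(p * log2(p) for p in d.values() if p > 0)
HU  = H(pU)                                        # H(U1,U2) = log2(3) ~ 1.585
HU2 = H({0: 1/3, 1: 2/3})                          # H(U2) = H(U1) ~ 0.918
Hc  = HU - HU2                                     # H(U1|U2) = H(U2|U1) ~ 0.667

bound1 = lambda r: 0.5 * log2(1 + P1 * (1 - r**2) / sigN2)              # outer bound in (12)
bound2 = lambda r: 0.5 * log2(1 + P2 * (1 - r**2) / sigN2)              # outer bound in (13)
bound3 = lambda r: 0.5 * log2(1 + (P1 + P2 + 2*r*sqrt(P1*P2)) / sigN2)  # bound in (14)

# Largest rho satisfying (12) and (13), smallest rho satisfying (14).
print(sqrt(1 - (2**(2*Hc) - 1) / P1))                 # ~0.7024
print(sqrt(1 - (2**(2*Hc) - 1) / P2))                 # ~0.7874
print((2**(2*HU) - 1 - P1 - P2) / (2*sqrt(P1*P2)))    # ~0.144
print(bound1(0.3), bound2(0.3), bound3(0.3), Hc, HU)  # check rho = 0.3
```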

We choose $r_i=2$, $i=1,\ldots,4$, and solve the optimization problem (21) via MATLAB to get the function $g$. In the optimal solution both component distributions in each of (17)-(20) coincide, and they are

$$f(X_1|U_1=0)=\mathcal{N}(-0.0002,0.9108),\quad f(X_1|U_1=1)=\mathcal{N}(-0.0001,1.0446),$$
$$f(X_2|U_2=0)=\mathcal{N}(-0.0021,1.1358),\quad f(X_2|U_2=1)=\mathcal{N}(-0.0042,0.7283).$$

The normalized minimum distortion, defined as $\int\left[g_{\underline{p},\underline{a},\underline{c}}(x_1,x_2)-f_\rho(x_1,x_2)\right]^2 dx_1\,dx_2 \,\big/\, \int f_\rho^2(x_1,x_2)\,dx_1\,dx_2$, is 0.137%.

The approximation (a cross section of the two dimensional densities) is shown in Fig. 2.

If we take $\rho=0.6$, which violates Lemma 3, then the optimal solution from (21) is shown in Fig. 3. The approximation error in this case is larger: the normalized distortion is now 10.5%.

Figure 2: Cross section of the approximation of the jointly Gaussian density with $\rho=0.3$.
Figure 3: Cross section of the approximation of the jointly Gaussian density with $\rho=0.6$.

IV-C Generalizations

The procedure mentioned in Section IV-A can be extended to systems with general discrete alphabets, multiple sources, lossy transmissions and side information as follows.

Consider $N\geq 2$ users, with source $i$ taking values in a discrete alphabet $\mathcal{U}_i$. In this case, for each user we find $P(X_i=\cdot\,|U_i=u_i)$, $u_i\in\mathcal{U}_i$, using a mapping as in (17)-(20), to yield jointly Gaussian $(X_1,X_2,\ldots,X_N)$.

If $Z_1$ and $Z_2$ are the available side information, then we use $f(X_i=\cdot\,|U_i,Z_i)$, $i=1,2$, as in (17)-(20) and obtain the optimal approximation from (21).

For lossy transmission, we choose appropriate discrete auxiliary random variables $W_i$ satisfying the conditions in Theorem 1. Then we can form $(X_1,X_2)$ from $(W_1,W_2)$ via the optimization procedure (21).

V Gaussian sources over a GMAC

In this section we consider transmission of correlated Gaussian sources over a GMAC. This is an important example of transmitting continuous alphabet sources over a GMAC. For example, one comes across it when a sensor network samples a Gaussian random field. Also, in the application of detection of change ([40]) by a sensor network, one often detects a change in the mean of the sensor observations, with the observation noise being Gaussian.

We will assume that $(U_{1n},U_{2n})$ is jointly Gaussian with mean zero, variances $\sigma_i^2$, $i=1,2$, and correlation $\rho$. The distortion measure will be the mean square error (MSE). The (relaxed) sufficient conditions from (12)-(14) for transmission of the sources over the channel are given by (these continue to hold because Lemmas 1-3 are still valid)

$$I(U_1;W_1|W_2)<0.5\log\left[1+\frac{P_1(1-\tilde{\rho}^2)}{\sigma_N^2}\right],\quad I(U_2;W_2|W_1)<0.5\log\left[1+\frac{P_2(1-\tilde{\rho}^2)}{\sigma_N^2}\right], \qquad (22)$$
$$I(U_1,U_2;W_1,W_2)<0.5\log\left[1+\frac{P_1+P_2+2\tilde{\rho}\sqrt{P_1P_2}}{\sigma_N^2}\right],$$

where $\tilde{\rho}$ is the correlation between $(X_1,X_2)$, which are chosen to be jointly Gaussian as in Section IV.

We consider three specific coding schemes to obtain $W_1,W_2,X_1,X_2$, where $(W_1,W_2)$ satisfy the distortion constraints and $(X_1,X_2)$ are jointly Gaussian with an appropriate $\tilde{\rho}$ such that (22) is satisfied. These widely used schemes are Amplify and Forward (AF), Separation Based (SB) and the coding scheme of Lapidoth and Tinguely (LT) [27]. We have compared the performance of these schemes in [34]. AF and LT are joint source-channel coding schemes. In [27] it is shown that AF is optimal at low SNR. In [34] we show that at high SNR LT is close to optimal. SB, although it performs well at high SNR, is sub-optimal.
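As an aside, a minimal sketch of the AF idea under the uncoded mapping $X_i=\sqrt{P_i}\,U_i/\sigma_i$ with a linear MMSE decoder is given below; the mapping and the parameter values are our own illustrative assumptions, not the exact constructions analyzed in [27] or [34].

```python
# Distortion of uncoded (amplify-and-forward style) transmission with a
# linear MMSE estimator of U1, U2 from Y = X1 + X2 + N.
from math import sqrt

P1, P2, sigN2 = 1.0, 1.0, 1.0          # transmit powers, channel noise variance (assumed)
sig1, sig2, rho = 1.0, 1.0, 0.8        # source std deviations and correlation (assumed)

a1, a2 = sqrt(P1) / sig1, sqrt(P2) / sig2       # scaling gains meeting the power constraints
varY = P1 + P2 + 2 * rho * sqrt(P1 * P2) + sigN2
cov1Y = a1 * sig1**2 + a2 * rho * sig1 * sig2   # cov(U1, Y)
cov2Y = a2 * sig2**2 + a1 * rho * sig1 * sig2   # cov(U2, Y)

D1 = sig1**2 - cov1Y**2 / varY                  # MMSE of the linear estimator of U1
D2 = sig2**2 - cov2Y**2 / varY                  # MMSE of the linear estimator of U2
print(D1, D2)
```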

For general continuous alphabet sources $(U_1,U_2)$, not necessarily Gaussian, we vector quantize $U_1^n,U_2^n$ into $\tilde{U}_1^n,\tilde{U}_2^n$. Then, to obtain correlated Gaussian codewords $(X_1^n,X_2^n)$, we can use the scheme provided in Section IV-A. Alternatively, we can use Slepian-Wolf coding on $(\tilde{U}_1^n,\tilde{U}_2^n)$; then for large $n$, $\tilde{U}_1^n$ and $\tilde{U}_2^n$ are almost independent, and on each $\tilde{U}_i^n$, $i=1,2$, we can use the usual independent Gaussian codebooks as in a point to point channel.

VI Conclusions

In this paper, sufficient conditions are provided for transmission of correlated sources over a multiple access channel with side information. Various previous results on this problem are obtained as special cases. Suitable examples are given to emphasize the superiority of joint source-channel coding schemes. The important special cases of correlated discrete sources over a GMAC and Gaussian sources over a GMAC are discussed in more detail. In particular, a new joint source-channel coding scheme is presented for discrete sources over a GMAC.

Appendix A Proof of Theorem 1

The coding scheme involves distributed quantization $(W_1^n,W_2^n)$ of the sources and the side information $(U_1^n,Z_1^n),(U_2^n,Z_2^n)$, followed by a correlation preserving mapping to the channel codewords. The decoding approach involves first decoding $(W_1^n,W_2^n)$ and then obtaining the estimate $(\hat{U}_1^n,\hat{U}_2^n)$ as a function of $(W_1^n,W_2^n)$ and the decoder side information $Z^n$.

Let $T_\epsilon^n(X,Y)$ denote the set of weakly $\epsilon$-typical sequences of length $n$ for $(X,Y)$, where $\epsilon>0$ is an arbitrarily small fixed positive constant. We use the following lemmas in the proof.

Markov Lemma: Suppose $X\leftrightarrow Y\leftrightarrow Z$. If for a given $(x^n,y^n)\in T_\epsilon^n(X,Y)$, $Z^n$ is drawn according to $\prod_{i=1}^{n}p(z_i|y_i)$, then with high probability $(x^n,y^n,Z^n)\in T_\epsilon^n(X,Y,Z)$ for $n$ sufficiently large.

The proof of this lemma for strong typicality is given in [8]. We need it for weak typicality. By the Markov property, the triple $(x^{n},y^{n},Z^{n})$ formed in the statement of the lemma has the same joint distribution as the original sequence $(X^{n},Y^{n},Z^{n})$. Thus the statement of the lemma follows. In the same way the following lemma also holds.

Extended Markov Lemma: Suppose $W_{1}\leftrightarrow (U_{1},Z_{1})\leftrightarrow (U_{2},W_{2},Z_{2},Z)$ and $W_{2}\leftrightarrow (U_{2},Z_{2})\leftrightarrow (U_{1},W_{1},Z_{1},Z)$. If for a given $(u_{1}^{n},u_{2}^{n},z_{1}^{n},z_{2}^{n},z^{n})\in T_{\epsilon}^{n}(U_{1},U_{2},Z_{1},Z_{2},Z)$, $W_{1}^{n}$ and $W_{2}^{n}$ are drawn according to $\prod_{i=1}^{n}p(w_{1i}|u_{1i},z_{1i})$ and $\prod_{i=1}^{n}p(w_{2i}|u_{2i},z_{2i})$ respectively, then with high probability $(u_{1}^{n},u_{2}^{n},z_{1}^{n},z_{2}^{n},z^{n},W_{1}^{n},W_{2}^{n})\in T_{\epsilon}^{n}(U_{1},U_{2},Z_{1},Z_{2},Z,W_{1},W_{2})$ for $n$ sufficiently large.

We show the achievability of all points in the rate region (1).

Proof: Fix $p(w_{1}|u_{1},z_{1}),p(w_{2}|u_{2},z_{2}),p(x_{1}|w_{1}),p(x_{2}|w_{2})$ as well as $f_{D}^{n}(\cdot)$ satisfying the distortion constraints. First we give the proof for the discrete channel alphabet case.

Codebook Generation: Let $R_{i}^{\prime}=I(U_{i},Z_{i};W_{i})+\delta,~i\in\{1,2\}$, for some $\delta>0$. Generate $2^{nR_{i}^{\prime}}$ codewords of length $n$, sampled iid from the marginal distribution $p(w_{i}),~i\in\{1,2\}$. For each $w_{i}^{n}$, independently generate a sequence $X_{i}^{n}$ according to $\prod_{j=1}^{n}p(x_{ij}|w_{ij}),~i\in\{1,2\}$. Call these sequences $x_{i}(w_{i}^{n}),~i\in\{1,2\}$. Reveal the codebooks to the encoders and the decoder.

Encoding: For $i\in\{1,2\}$, given the source sequence $U_{i}^{n}$ and $Z_{i}^{n}$, the $i^{th}$ encoder looks for a codeword $W_{i}^{n}$ such that $(U_{i}^{n},Z_{i}^{n},W_{i}^{n})\in T_{\epsilon}^{n}(U_{i},Z_{i},W_{i})$ and then transmits $X_{i}(W_{i}^{n})$.

Decoding: Upon receiving $Y^{n}$, the decoder finds the unique pair $(W_{1}^{n},W_{2}^{n})$ such that $(W_{1}^{n},W_{2}^{n},x_{1}(W_{1}^{n}),x_{2}(W_{2}^{n}),Y^{n},Z^{n})\in T_{\epsilon}^{n}$. If it fails to find such a unique pair, the decoder declares an error and incurs the maximum distortion $d_{max}$ (we assume that the distortion measures are bounded; at the end we will remove this condition).
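A minimal sketch of the codebook generation and encoding steps for encoder 1 is given below. It uses a toy binary model ($U_{1}\sim$ Bern(0.5), $Z_{1}$ a BSC(0.2) observation of $U_{1}$, test channel $W_{1}$ a BSC(0.25) observation of $U_{1}$, channel input $X_{1}$ a BSC(0.05) observation of $W_{1}$), a short block length and a strong (empirical-frequency) typicality test in place of weak typicality; all of these are illustrative assumptions. At such short block lengths the encoder search may well fail, which is exactly why the proof lets $n\rightarrow\infty$.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy model (illustrative assumptions): U1 ~ Bern(0.5), Z1 = U1 through a BSC(0.2),
    # test channel W1 = U1 through a BSC(0.25), channel input X1 = W1 through a BSC(0.05).
    n, delta, eps = 48, 0.1, 0.08
    pz, pw, px = 0.2, 0.25, 0.05

    def h2(p):
        return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

    # Codebook rate R1' = I(U1,Z1;W1) + delta; here W1 is independent of Z1 given U1,
    # so I(U1,Z1;W1) = I(U1;W1) = 1 - h2(pw).
    R1 = 1 - h2(pw) + delta
    M = int(2 ** (n * R1))

    # Codebook generation: M codewords drawn iid from the marginal p(w1) = Bern(0.5);
    # for each codeword an X1-sequence is drawn from prod_j p(x1j | w1j).
    W_cb = rng.integers(0, 2, size=(M, n))
    X_cb = W_cb ^ (rng.random((M, n)) < px)

    # True joint pmf p(u1, z1, w1), used by the typicality test below.
    P = np.zeros((2, 2, 2))
    for u in (0, 1):
        for z in (0, 1):
            for w in (0, 1):
                P[u, z, w] = 0.5 * (pz if z != u else 1 - pz) * (pw if w != u else 1 - pw)

    def typical(u, z, w):
        """Strong typicality: every empirical (u1,z1,w1) frequency within eps of p(u1,z1,w1)."""
        for a in (0, 1):
            for b in (0, 1):
                for c in (0, 1):
                    if abs(np.mean((u == a) & (z == b) & (w == c)) - P[a, b, c]) > eps:
                        return False
        return True

    # Encoding: observe (U1^n, Z1^n) and search the codebook for a jointly typical W1^n.
    u1 = rng.integers(0, 2, size=n)
    z1 = u1 ^ (rng.random(n) < pz)
    matches = [m for m in range(M) if typical(u1, z1, W_cb[m])]
    print(f"{M} codewords, {len(matches)} jointly typical with the observed (U1^n, Z1^n)")
    if matches:
        print("transmit x1(W1^n) =", X_cb[matches[0]])

The decoder's joint typicality search over the two codebooks can be sketched in the same way and is omitted here.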

In the following we show that the probability of error for this encoding-decoding scheme tends to zero as $n\rightarrow\infty$. An error can occur because of the following four events E1-E4. We show that $P(Ei)\rightarrow 0$ for $i=1,2,3,4$.

E1: The encoders do not find the codewords. From rate distortion theory ([13], page 356), $\lim_{n\to\infty}P(E1)=0$ if $R_{i}^{\prime}>I(U_{i},Z_{i};W_{i}),~i\in\{1,2\}$.

E2: The codewords are not jointly typical with $(Y^{n},Z^{n})$. The probability of this event goes to zero by the Extended Markov Lemma.

E3: There exists another codeword $\hat{w}_{1}^{n}$ such that $(\hat{w}_{1}^{n},W_{2}^{n},x_{1}(\hat{w}_{1}^{n}),x_{2}(W_{2}^{n}),Y^{n},Z^{n})\in T_{\epsilon}^{n}$. Define $\alpha\buildrel\Delta\over{=}(\hat{w}_{1}^{n},W_{2}^{n},x_{1}(\hat{w}_{1}^{n}),x_{2}(W_{2}^{n}),Y^{n},Z^{n})$. Then,

P({\bf E3}) = Pr\{\text{there is } \hat{w}_{1}^{n}\neq w_{1}^{n}:\alpha\in T_{\epsilon}^{n}\} \leq \sum_{\hat{w}_{1}^{n}\neq w_{1}^{n}:(\hat{w}_{1}^{n},w_{2}^{n},z^{n})\in T_{\epsilon}^{n}} Pr\{\alpha\in T_{\epsilon}^{n}\}.   (23)

The probability term inside the summation in (23) is

\leq \sum_{(x_{1}(.),x_{2}(.),y^{n}):\alpha\in T_{\epsilon}^{n}} Pr\{x_{1}(\hat{w}_{1}^{n}),x_{2}(w_{2}^{n}),y^{n}|\hat{w}_{1}^{n},w_{2}^{n},z^{n}\}\, p(\hat{w}_{1}^{n},w_{2}^{n},z^{n})
\leq \sum_{(x_{1}(.),x_{2}(.),y^{n}):\alpha\in T_{\epsilon}^{n}} Pr\{x_{1}(\hat{w}_{1}^{n})|\hat{w}_{1}^{n}\}\, Pr\{x_{2}(w_{2}^{n}),y^{n}|w_{2}^{n},z^{n}\}
\leq \sum_{(x_{1}(.),x_{2}(.),y^{n}):\alpha\in T_{\epsilon}^{n}} 2^{-n\{H(X_{1}|W_{1})+H(X_{2},Y|W_{2},Z)-4\epsilon\}}
\leq 2^{nH(X_{1},X_{2},Y|W_{1},W_{2},Z)}\, 2^{-n\{H(X_{1}|W_{1})+H(X_{2},Y|W_{2},Z)-4\epsilon\}}.

But from hypothesis, we have

H(X_{1},X_{2},Y|W_{1},W_{2},Z)-H(X_{1}|W_{1})-H(X_{2},Y|W_{2},Z)
= H(X_{1}|W_{1})+H(X_{2}|W_{2})+H(Y|X_{1},X_{2})-H(X_{1}|W_{1})-H(X_{2},Y|W_{2},Z)
= H(Y|X_{1},X_{2})-H(Y|X_{2},W_{2},Z)
= H(Y|X_{1},X_{2},W_{2},Z)-H(Y|X_{2},W_{2},Z) = -I(X_{1};Y|X_{2},W_{2},Z).

Hence,

Pr\{(\hat{w}_{1}^{n},W_{2}^{n},x_{1}(\hat{w}_{1}^{n}),x_{2}(W_{2}^{n}),Y^{n},Z^{n})\in T_{\epsilon}^{n}\}\leq 2^{-n\{I(X_{1};Y|X_{2},W_{2},Z)-6\epsilon\}}.   (24)

Then from (23)

P({\bf E3}) \leq \sum_{\hat{w}_{1}^{n}\neq w_{1}^{n}:(\hat{w}_{1}^{n},w_{2}^{n},z^{n})\in T_{\epsilon}^{n}} 2^{-n\{I(X_{1};Y|X_{2},W_{2},Z)-6\epsilon\}}
= |\{\hat{w}_{1}^{n}:(\hat{w}_{1}^{n},w_{2}^{n},z^{n})\in T_{\epsilon}^{n}\}|\, 2^{-n\{I(X_{1};Y|X_{2},W_{2},Z)-6\epsilon\}}
\leq |\{\hat{w}_{1}^{n}\}|\, Pr\{(\hat{w}_{1}^{n},w_{2}^{n},z^{n})\in T_{\epsilon}^{n}\}\, 2^{-n\{I(X_{1};Y|X_{2},W_{2},Z)-6\epsilon\}}
\leq 2^{n\{I(U_{1},Z_{1};W_{1})+\delta\}}\, 2^{-n\{I(W_{1};W_{2},Z)-3\epsilon\}}\, 2^{-n\{I(X_{1};Y|X_{2},W_{2},Z)-6\epsilon\}}
= 2^{n\{I(U_{1},Z_{1};W_{1}|W_{2},Z)\}}\, 2^{-n\{I(X_{1};Y|X_{2},W_{2},Z)-9\epsilon-\delta\}}.   (25)

In (25) we have used the fact that

I(U_{1},Z_{1};W_{1})-I(W_{1};W_{2},Z) = H(W_{1}|W_{2},Z)-H(W_{1}|U_{1},Z_{1})
= H(W_{1}|W_{2},Z)-H(W_{1}|U_{1},Z_{1},W_{2},Z) = I(U_{1},Z_{1};W_{1}|W_{2},Z).

The RHS of (25) tends to zero if $I(U_{1},Z_{1};W_{1}|W_{2},Z)<I(X_{1};Y|X_{2},W_{2},Z)$.

Similarly, by the symmetry of the problem, we require $I(U_{2},Z_{2};W_{2}|W_{1},Z)<I(X_{2};Y|X_{1},W_{1},Z)$.

E4: There exist other codewords $\hat{w}_{1}^{n}$ and $\hat{w}_{2}^{n}$ such that $\alpha\buildrel\Delta\over{=}(\hat{w}_{1}^{n},\hat{w}_{2}^{n},x_{1}(\hat{w}_{1}^{n}),x_{2}(\hat{w}_{2}^{n}),Y^{n},Z^{n})\in T_{\epsilon}^{n}$. Then,

P({\bf E4}) = Pr\{\text{there is } (\hat{w}_{1}^{n},\hat{w}_{2}^{n})\neq(w_{1}^{n},w_{2}^{n}):\alpha\in T_{\epsilon}^{n}\} \leq \sum_{(\hat{w}_{1}^{n},\hat{w}_{2}^{n})\neq(w_{1}^{n},w_{2}^{n}):(\hat{w}_{1}^{n},\hat{w}_{2}^{n},z^{n})\in T_{\epsilon}^{n}} Pr\{\alpha\in T_{\epsilon}^{n}\}.   (26)

The probability term inside the summation in (26) is

\leq \sum_{(x_{1}(.),x_{2}(.),y^{n}):\alpha\in T_{\epsilon}^{n}} Pr\{x_{1}(\hat{w}_{1}^{n}),x_{2}(\hat{w}_{2}^{n}),y^{n}|\hat{w}_{1}^{n},\hat{w}_{2}^{n},z^{n}\}\, p(\hat{w}_{1}^{n},\hat{w}_{2}^{n},z^{n})
\leq \sum_{(x_{1}(.),x_{2}(.),y^{n}):\alpha\in T_{\epsilon}^{n}} Pr\{x_{1}(\hat{w}_{1}^{n})|\hat{w}_{1}^{n}\}\, Pr\{x_{2}(\hat{w}_{2}^{n})|\hat{w}_{2}^{n}\}\, Pr\{y^{n}|z^{n}\}
\leq \sum_{(x_{1}(.),x_{2}(.),y^{n}):\alpha\in T_{\epsilon}^{n}} 2^{-n\{H(X_{1}|W_{1})+H(X_{2}|W_{2})+H(Y|Z)-5\epsilon\}}
\leq 2^{nH(X_{1},X_{2},Y|W_{1},W_{2},Z)}\, 2^{-n\{H(X_{1}|W_{1})+H(X_{2}|W_{2})+H(Y|Z)-7\epsilon\}}.

But from hypothesis, we have

H(X_{1},X_{2},Y|W_{1},W_{2},Z)-H(X_{1}|W_{1})-H(X_{2}|W_{2})-H(Y|Z)
= H(Y|X_{1},X_{2})-H(Y|Z) = H(Y|X_{1},X_{2},Z)-H(Y|Z) = -I(X_{1},X_{2};Y|Z).

Hence,

Pr\{(\hat{w}_{1}^{n},\hat{w}_{2}^{n},x_{1}(\hat{w}_{1}^{n}),x_{2}(\hat{w}_{2}^{n}),y^{n},z^{n})\in T_{\epsilon}^{n}\}\leq 2^{-n\{I(X_{1},X_{2};Y|Z)-7\epsilon\}}.   (27)

Then from (26)

P({\bf E4}) \leq \sum_{(\hat{w}_{1}^{n},\hat{w}_{2}^{n})\neq(w_{1}^{n},w_{2}^{n}):(\hat{w}_{1}^{n},\hat{w}_{2}^{n},z^{n})\in T_{\epsilon}^{n}} 2^{-n\{I(X_{1},X_{2};Y|Z)-7\epsilon\}}
= |\{(\hat{w}_{1}^{n},\hat{w}_{2}^{n}):(\hat{w}_{1}^{n},\hat{w}_{2}^{n},z^{n})\in T_{\epsilon}^{n}\}|\, 2^{-n\{I(X_{1},X_{2};Y|Z)-7\epsilon\}}
\leq |\{\hat{w}_{1}^{n}\}|\,|\{\hat{w}_{2}^{n}\}|\, Pr\{(\hat{w}_{1}^{n},\hat{w}_{2}^{n},z^{n})\in T_{\epsilon}^{n}\}\, 2^{-n\{I(X_{1},X_{2};Y|Z)-7\epsilon\}}
\leq 2^{n\{I(U_{1},Z_{1};W_{1})+I(U_{2},Z_{2};W_{2})+2\delta\}}\, 2^{-n\{I(W_{1};W_{2},Z)+I(W_{2};W_{1},Z)-I(W_{1};W_{2}|Z)-4\epsilon\}}\, 2^{-n\{I(X_{1},X_{2};Y|Z)-7\epsilon\}}
= 2^{n\{I(U_{1},U_{2},Z_{1},Z_{2};W_{1},W_{2}|Z)\}}\, 2^{-n\{I(X_{1},X_{2};Y|Z)-11\epsilon-2\delta\}}.

The RHS of the above inequality tends to zero if $I(U_{1},U_{2},Z_{1},Z_{2};W_{1},W_{2}|Z)<I(X_{1},X_{2};Y|Z)$.

Thus as $n\rightarrow\infty$, with probability tending to 1, the decoder finds the correct sequence $(W_{1}^{n},W_{2}^{n})$, which is jointly weakly $\epsilon$-typical with $(U_{1}^{n},U_{2}^{n},Z^{n})$.

The fact that $(W_{1}^{n},W_{2}^{n})$ is weakly $\epsilon$-typical with $(U_{1}^{n},U_{2}^{n},Z^{n})$ does not guarantee that $f_{D}^{n}(W_{1}^{n},W_{2}^{n},Z^{n})$ will satisfy the distortions $D_{1},D_{2}$. For this, one needs $(W_{1}^{n},W_{2}^{n})$ to be distortion-$\epsilon$-weakly typical ([13]) with $(U_{1}^{n},U_{2}^{n},Z^{n})$. Let $T_{D,\epsilon}^{n}$ denote the set of distortion typical sequences. By the strong law of large numbers, $P(T_{D,\epsilon}^{n}|T_{\epsilon}^{n})\rightarrow 1$ as $n\rightarrow\infty$. Thus the distortion constraints are also satisfied by the $(W_{1}^{n},W_{2}^{n})$ obtained above with probability tending to 1 as $n\rightarrow\infty$. Therefore, if the distortion measures are bounded, $\lim_{n\rightarrow\infty}E[d_{i}(U_{i}^{n},\hat{U}_{i}^{n})]\leq D_{i}+\epsilon,~i=1,2$.

For the continuous channel alphabet case (e.g., the GMAC), one also needs the transmission constraints $E[g_{i}(X_{i})]\leq\alpha_{i},~i=1,2$. For this we ensure that the coding scheme chooses a distribution $p(x_{i}|w_{i})$ which satisfies $E[g_{i}(X_{i})]<\alpha_{i}-\epsilon$. Then, if a specific codeword does not satisfy $\frac{1}{n}\sum_{k=1}^{n}g_{i}(x_{ik})<\alpha_{i}$, one declares an error. As $n\rightarrow\infty$ this happens with vanishingly small probability.

If there exist $u_{i}^{*}$ such that $E[d_{i}(U_{i},u_{i}^{*})]<\infty,~i=1,2$, then the result extends to unbounded distortion measures as follows. Whenever the decoded $(W_{1}^{n},W_{2}^{n})$ is not in the distortion typical set, we estimate $(\hat{U}_{1}^{n},\hat{U}_{2}^{n})$ as $({u_{1}^{*}}^{n},{u_{2}^{*}}^{n})$. Then for $i=1,2$,

E[d_{i}(U_{i}^{n},\hat{U}_{i}^{n})]\leq D_{i}+\epsilon+E[d_{i}(U_{i}^{n},{u_{i}^{*}}^{n}){\bf 1}_{\{(T_{D,\epsilon}^{n})^{c}\}}].   (28)

Since $E[d_{i}(U_{i}^{n},{u_{i}^{*}}^{n})]<\infty$ and $P[(T_{D,\epsilon}^{n})^{c}]\rightarrow 0$ as $n\rightarrow\infty$, the last term of (28) goes to zero as $n\rightarrow\infty$.

Appendix B Proof of converse for lossless transmission of discrete correlated sources over orthogonal channels with side information

Let $P_{n}^{e}$ be the probability of error in estimating $(U_{1}^{n},U_{2}^{n})$ from $(Y_{1}^{n},Y_{2}^{n},Z^{n})$. For any given coding-decoding scheme, we will show that if $P_{n}^{e}\rightarrow 0$ then the inequalities in (III-H), specialized to lossless transmission, must be satisfied for this system.

Let $\|\mathcal{U}_{i}\|$ be the cardinality of the set $\mathcal{U}_{i}$. From Fano's inequality we have

\frac{1}{n}H(U_{1}^{n},U_{2}^{n}|Y_{1}^{n},Y_{2}^{n},Z^{n}) \leq \frac{1}{n}\log(\|\mathcal{U}_{1}^{n}\|\,\|\mathcal{U}_{2}^{n}\|)P_{n}^{e}+\frac{1}{n} = P_{n}^{e}(\log\|\mathcal{U}_{1}\|+\log\|\mathcal{U}_{2}\|)+\frac{1}{n}.

Denote $P_{n}^{e}(\log\|\mathcal{U}_{1}\|+\log\|\mathcal{U}_{2}\|)+\frac{1}{n}$ by $\lambda_{n}$. As $P_{n}^{e}\rightarrow 0$, $\lambda_{n}\rightarrow 0$.

Since,

H(U_{1}^{n},U_{2}^{n}|Y_{1}^{n},Y_{2}^{n},Z^{n})=H(U_{1}^{n}|Y_{1}^{n},Y_{2}^{n},Z^{n})+H(U_{2}^{n}|U_{1}^{n},Y_{1}^{n},Y_{2}^{n},Z^{n}),

we obtain $H(U_{1}^{n}|Y_{1}^{n},Y_{2}^{n},Z^{n})/n\leq\lambda_{n}$. Therefore, because $U_{1}^{n}$ is an iid sequence,

nH(U_{1}) = H(U_{1}^{n}) = H(U_{1}^{n}|Y_{1}^{n},Y_{2}^{n},Z^{n})+I(U_{1}^{n};Y_{1}^{n},Y_{2}^{n},Z^{n}) \leq n\lambda_{n}+I(U_{1}^{n};Y_{1}^{n},Y_{2}^{n},Z^{n}).   (29)

Also, $I(U_{1}^{n};Y_{1}^{n},Y_{2}^{n},Z^{n})\leq I(U_{1}^{n};Y_{1}^{n},U_{2}^{n},Z^{n})$, and by the chain rule and the data processing inequality,

I(U_{1}^{n};Y_{1}^{n},U_{2}^{n},Z^{n}) = I(U_{1}^{n};U_{2}^{n},Z^{n})+I(U_{1}^{n};Y_{1}^{n}|U_{2}^{n},Z^{n}) \leq I(U_{1}^{n};U_{2}^{n},Z^{n})+I(X_{1}^{n};Y_{1}^{n}|U_{2}^{n},Z^{n}).   (30)

But,

I(X_{1}^{n};Y_{1}^{n}|U_{2}^{n},Z^{n}) = H(Y_{1}^{n}|U_{2}^{n},Z^{n})-H(Y_{1}^{n}|X_{1}^{n}) \leq H(Y_{1}^{n})-H(Y_{1}^{n}|X_{1}^{n})
\leq \sum_{i=1}^{n}H(Y_{1i})-\sum_{i=1}^{n}H(Y_{1i}|Y_{1}^{i-1},X_{1i}) = \sum_{i=1}^{n}H(Y_{1i})-\sum_{i=1}^{n}H(Y_{1i}|X_{1i}) = \sum_{i=1}^{n}I(X_{1i};Y_{1i}).   (31)

The inequalities hold because conditioning reduces entropy; the equality $H(Y_{1i}|Y_{1}^{i-1},X_{1i})=H(Y_{1i}|X_{1i})$ follows from the memoryless property of the channel.

From (29), (30) and (31)

H(U_{1}) \leq \frac{1}{n}\sum_{i=1}^{n}I(U_{1i};U_{2i},Z_{i})+\frac{1}{n}\sum_{i=1}^{n}I(X_{1i};Y_{1i})+\lambda_{n}.

We can introduce a time-sharing random variable as in [13] and show that $H(U_{1})\leq I(U_{1};U_{2},Z)+I(X_{1};Y_{1})$. Since $H(U_{1})-I(U_{1};U_{2},Z)=H(U_{1}|U_{2},Z)$, this simplifies to $H(U_{1}|U_{2},Z)\leq I(X_{1};Y_{1})$.

By the symmetry of the problem we also get $H(U_{2}|U_{1},Z)\leq I(X_{2};Y_{2})$.

We also have

nH(U_{1},U_{2}) = H(U_{1}^{n},U_{2}^{n}) = H(U_{1}^{n},U_{2}^{n}|Y_{1}^{n},Y_{2}^{n},Z^{n})+I(U_{1}^{n},U_{2}^{n};Y_{1}^{n},Y_{2}^{n},Z^{n}) \leq I(U_{1}^{n},U_{2}^{n};Y_{1}^{n},Y_{2}^{n},Z^{n})+n\lambda_{n}.

But

I(U_{1}^{n},U_{2}^{n};Y_{1}^{n},Y_{2}^{n},Z^{n}) = I(U_{1}^{n},U_{2}^{n};Z^{n})+I(U_{1}^{n},U_{2}^{n};Y_{1}^{n},Y_{2}^{n}|Z^{n}) \leq I(U_{1}^{n},U_{2}^{n};Z^{n})+I(X_{1}^{n},X_{2}^{n};Y_{1}^{n},Y_{2}^{n}|Z^{n}).

Also,

I(X_{1}^{n},X_{2}^{n};Y_{1}^{n},Y_{2}^{n}|Z^{n}) = H(Y_{1}^{n},Y_{2}^{n}|Z^{n})-H(Y_{1}^{n},Y_{2}^{n}|X_{1}^{n},X_{2}^{n},Z^{n})
\leq H(Y_{1}^{n},Y_{2}^{n})-H(Y_{1}^{n}|X_{1}^{n})-H(Y_{2}^{n}|X_{2}^{n}) \leq H(Y_{1}^{n})+H(Y_{2}^{n})-H(Y_{1}^{n}|X_{1}^{n})-H(Y_{2}^{n}|X_{2}^{n}).

Then, following the steps used above, we obtain $H(U_{1},U_{2}|Z)\leq I(X_{1};Y_{1})+I(X_{2};Y_{2})$.                           $\blacksquare$
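As a numerical illustration of these necessary conditions (a toy example with assumed parameters, not taken from the paper), let $U_{1}\sim$ Bern(0.5), $U_{2}=U_{1}\oplus A$ with $A\sim$ Bern(0.1), decoder side information $Z=U_{1}\oplus B$ with $B\sim$ Bern(0.2), and two orthogonal BSCs with crossover probability 0.05, so that $\max I(X_{i};Y_{i})=1-h_{b}(0.05)$, where $h_{b}$ is the binary entropy function:

    import numpy as np

    def h2(p):
        """Binary entropy in bits."""
        return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

    def H(pmf):
        """Entropy in bits of a pmf given as an array."""
        pmf = pmf[pmf > 0]
        return float(-np.sum(pmf * np.log2(pmf)))

    a, b = 0.1, 0.2          # U2 = U1 xor Bern(a), Z = U1 xor Bern(b) (illustrative)
    p1, p2 = 0.05, 0.05      # crossover probabilities of the two orthogonal BSCs

    # Joint pmf p(u1, u2, z).
    P = np.zeros((2, 2, 2))
    for u1 in (0, 1):
        for u2 in (0, 1):
            for z in (0, 1):
                P[u1, u2, z] = 0.5 * (a if u2 != u1 else 1 - a) * (b if z != u1 else 1 - b)

    H_U1U2Z = H(P)
    cond1 = H_U1U2Z - H(P.sum(axis=0))        # H(U1 | U2, Z)
    cond2 = H_U1U2Z - H(P.sum(axis=1))        # H(U2 | U1, Z)
    cond12 = H_U1U2Z - H(P.sum(axis=(0, 1)))  # H(U1, U2 | Z)

    C1, C2 = 1 - h2(p1), 1 - h2(p2)           # BSC capacities = max I(Xi; Yi)
    print(f"H(U1|U2,Z) = {cond1:.3f} <= C1 = {C1:.3f} : {cond1 <= C1}")
    print(f"H(U2|U1,Z) = {cond2:.3f} <= C2 = {C2:.3f} : {cond2 <= C2}")
    print(f"H(U1,U2|Z) = {cond12:.3f} <= C1 + C2 = {C1 + C2:.3f} : {cond12 <= C1 + C2}")

At these values all three conditions hold; making the sources less correlated, the side information noisier or the channels worse eventually violates them.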

Appendix C Proofs of Lemmas in Section IV

Proof of Lemma 1: Let $\Delta\buildrel\Delta\over{=}I(X_{1};Y|X_{2},U_{2})-I(X_{1};Y|X_{2})$. Then, denoting differential entropy by $h$,

\Delta=h(Y|X_{2},U_{2})-h(Y|X_{1},X_{2},U_{2})-[h(Y|X_{2})-h(Y|X_{1},X_{2})].

Since the channel is memoryless, $h(Y|X_{1},X_{2},U_{2})=h(Y|X_{1},X_{2})$, and hence $\Delta=h(Y|X_{2},U_{2})-h(Y|X_{2})\leq 0$ because conditioning reduces entropy.

\blacksquare

Proof of Lemma 2: Since

I(X_{1},X_{2};Y)=h(Y)-h(Y|X_{1},X_{2})=h(X_{1}+X_{2}+N)-h(N),

it is maximized when $h(X_{1}+X_{2}+N)$ is maximized. This entropy is maximized when $X_{1}+X_{2}$ is Gaussian with the largest possible variance $=P_{1}+P_{2}$. If $(X_{1},X_{2})$ is jointly Gaussian then so is $X_{1}+X_{2}$.

Next consider $I(X_{1};Y|X_{2})$. This equals

h(Y|X_{2})-h(N)=h(X_{1}+X_{2}+N|X_{2})-h(N)=h(X_{1}+N|X_{2})-h(N),

which is maximized when $p(x_{1}|x_{2})$ is Gaussian, and this happens when $X_{1},X_{2}$ are jointly Gaussian.

A similar result holds for $I(X_{2};Y|X_{1})$.                                                                                     $\blacksquare$

Proof of Lemma 3: Since $X_{1}\leftrightarrow U_{1}\leftrightarrow U_{2}\leftrightarrow X_{2}$ is a Markov chain, by the data processing inequality $I(X_{1};X_{2})\leq I(U_{1};U_{2})$. Taking $X_{1},X_{2}$ to be jointly Gaussian with zero mean, unit variance and correlation $\rho$, $I(X_{1};X_{2})=0.5\log_{2}\left(\frac{1}{1-\rho^{2}}\right)$. This implies $\rho^{2}\leq 1-2^{-2I(U_{1};U_{2})}$.                                                 $\blacksquare$
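As a small numerical illustration (the source model is an arbitrary choice): if $(U_{1},U_{2})$ is a doubly symmetric binary source with crossover probability $q$, then $I(U_{1};U_{2})=1-h_{b}(q)$ bits, where $h_{b}$ is the binary entropy function, and Lemma 3 caps the correlation of the Gaussian codewords as computed below:

    import numpy as np

    def h2(q):
        return -q * np.log2(q) - (1 - q) * np.log2(1 - q)

    # Doubly symmetric binary source: U1 ~ Bern(0.5), U2 = U1 xor Bern(q) (illustrative choice).
    for q in (0.05, 0.1, 0.25, 0.5):
        I = 1 - h2(q)                           # I(U1; U2) in bits
        rho_max = np.sqrt(1 - 2 ** (-2 * I))    # bound of Lemma 3 on |rho|
        print(f"q = {q:.2f}: I(U1;U2) = {I:.3f} bits, max |rho| = {rho_max:.3f}")

Weakly correlated sources thus force nearly independent channel inputs, while strongly correlated sources allow $\rho$ close to 1.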

Proof of Lemma 4: By the Stone-Weierstrass theorem ([25], [36]), the class of functions $(x_{1},x_{2})\mapsto e^{\frac{-1}{2c_{1}}(x_{1}-a_{1})^{2}}e^{\frac{-1}{2c_{2}}(x_{2}-a_{2})^{2}}$ can be shown to be dense, under uniform convergence, in $C_{0}$, the set of all continuous functions $f$ on $\Re^{2}$ such that $\lim_{\|x\|\to\infty}|f(x)|=0$. Since the jointly Gaussian density $(x_{1},x_{2})\mapsto e^{\frac{-1}{2\sigma^{2}}\left(\frac{x_{1}^{2}+x_{2}^{2}-2\rho x_{1}x_{2}}{1-\rho^{2}}\right)}$ is in $C_{0}$, it can be approximated arbitrarily closely, uniformly, by the functions (16).                                                                                                                      $\blacksquare$

Proof of Lemma 5: Since

I(X_{m1},X_{m2};Y_{m})=h(Y_{m})-h(Y_{m}|X_{m1},X_{m2})=h(Y_{m})-h(N),

it is sufficient to show that $h(Y_{m})\rightarrow h(Y)$. From $(X_{m1},X_{m2})\buildrel d\over{\longrightarrow}(X_{1},X_{2})$ and the independence of $(X_{m1},X_{m2})$ from $N$, we get $Y_{m}=X_{m1}+X_{m2}+N\buildrel d\over{\longrightarrow}X_{1}+X_{2}+N=Y$. Then $f_{m}\rightarrow f$ uniformly implies $f_{m}(Y_{m})\buildrel d\over{\longrightarrow}f(Y)$. Since $f_{m}(Y_{m})\geq 0$, $f(Y)\geq 0$ a.s., and $\log$ is continuous except at 0, we obtain $\log f_{m}(Y_{m})\buildrel d\over{\longrightarrow}\log f(Y)$. Then uniform integrability provides $I(X_{m1},X_{m2};Y_{m})\rightarrow I(X_{1},X_{2};Y)$.         $\blacksquare$

References

  • [1] R. Ahlswede. Multiway communication channels. Proc. Second Int. Symp. Inform. Transmission, Armenia, USSR, Hungarian Press, 1971.
  • [2] R. Ahlswede and T. Han. On source coding with side information via a multiple access channel and related problems in information theory. IEEE Trans. Inform. Theory, 29(3):396–411, May 1983.
  • [3] I. F. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci. A survey on sensor networks. IEEE Communications Magazine, pages 1–13, Aug. 2002.
  • [4] S. J. Baek, G. Veciana, and X. Su. Minimizing energy consumption in large-scale sensor networks through distributed data compression and hierarchical aggregation. IEEE JSAC, 22(6):1130–1140, Aug. 2004.
  • [5] R. J. Barron, B. Chen, and G. W. Wornell. The duality between information embedding and source coding with side information and some applications. IEEE Trans. Inform. Theory, 49(5):1159–1180, May 2003.
  • [6] J. Barros and S. D. Servetto. Reachback capacity with non-interfering nodes. Proc. ISIT, pages 356–361, 2003.
  • [7] J. Barros and S. D. Servetto. Network information flow with correlated sources. IEEE Trans. Inform. Theory, 52(1):155–170, Jan 2006.
  • [8] T. Berger. Multiterminal source coding. Lecture notes presented at 1977 CISM summer school, Udine, Italy, July. 1977.
  • [9] T. Berger and R. W. Yeung. Multiterminal source coding with one distortion criterion. IEEE Trans. Inform. Theory, 35(2):228–236, March 1989.
  • [10] T. Berger, Z. Zhang, and H. Viswanathan. The CEO problem. IEEE Trans. Inform. Theory, 42(3):887–902, May 1996.
  • [11] T. M. Cover. A proof of the data compression theorem of Slepian and Wolf for ergodic sources. IEEE Trans. Inform. Theory, 21(2):226–228, March 1975.
  • [12] T. M. Cover, A. E. Gamal, and M. Salehi. Multiple access channels with arbitrarily correlated sources. IEEE Trans. Inform. Theory, 26(6):648–657, Nov. 1980.
  • [13] T. M. Cover and J. A. Thomas. Elements of Information Theory. Wiley Series in Telecommunication, N.Y., 2004.
  • [14] S. C. Draper and G. W. Wornell. Side information aware coding strategies for sensor networks. IEEE Journal on Selected Areas in Comm., 22:1–11, Aug 2004.
  • [15] G. Dueck. A note on the multiple access channel with correlated sources. IEEE Trans. Inform. Theory, 27(2):232–235, March 1981.
  • [16] M. Fleming and M. Effros. On rate distortion with mixed types of side information. IEEE Trans. Inform. Theory, 52(4):1698–1705, April 2006.
  • [17] H. E. Gamal. On scaling laws of dense wireless sensor networks: the data gathering channel. IEEE Trans. Inform. Theory, 51(3):1229–1234, March 2005.
  • [18] M. Gastpar. Multiple access channels under received-power constraints. Proc. IEEE Inform. Theory Workshop, pages 452–457, 2004.
  • [19] M. Gastpar. Wyner-Ziv problem with multiple sources. IEEE Trans. Inform. Theory, 50(11):2762–2768, Nov. 2004.
  • [20] M. Gastpar and M. Vetterli. Source-channel communication in sensor networks. Proc. IPSN’03, pages 162–177, 2003.
  • [21] M. Gastpar and M. Vetterli. Power spatio-temporal bandwidth and distortion in large sensor networks. IEEE JSAC, 23(4):745–754, 2005.
  • [22] D. Gunduz and E. Erkip. Interference channel and compound mac with correlated sources and receiver side information. IEEE ISIT 07, June 2007.
  • [23] D. Gunduz and E. Erkip. Transmission of correlated sources over multiuser channels with receiver side information. UCSD ITA Workshop, San Diego, CA, Jan 2007.
  • [24] P. Ishwar, R. Puri, K. Ramchandran, and S. S. Pradhan. On rate constrained distributed estimation in unreliable sensor networks. IEEE JSAC, pages 765–775, 2005.
  • [25] J. Jacod and P. Protter. Probability Essentials. Springer, N.Y., 2004.
  • [26] W. Kang and S. Ulukus. An outer bound for mac with correlated sources. Proc. 40 th annual conference on Information Sciences and Systems, pages 240–244, March 2006.
  • [27] A. Lapidoth and S. Tinguely. Sending a bivariate Gaussian source over a Gaussian MAC. IEEE ISIT 06, 2006.
  • [28] H. Liao. Multiple access channels. Ph.D. dissertation, Dept. of Elec. Engg., Univ. of Hawaii, Honolulu, 1972.
  • [29] L. Ong and M. Motani. Coding strategies for multiple-access channels with feedback and correlated sources. IEEE Trans. Inform. Theory, 53(10):3476–3497, Oct 2007.
  • [30] Y. Oohama. Gaussian multiterminal source coding. IEEE Trans. Inform. Theory, 43(6):1912–1923, Nov. 1997.
  • [31] Y. Oohama. The rate distortion function for quadratic Gaussian CEO problem. IEEE Trans. Inform. Theory, 44(3):1057–1070, May 1998.
  • [32] L. H. Ozarow. The capacity of the white Gaussian multiple access channel with feedback. IEEE Trans. Inform. Theory, 30(4):623 – 629, July 1984.
  • [33] S. S. Pradhan, J. Chou, and K. Ramchandran. Duality between source coding and channel coding and its extension to the side information case. IEEE Trans. Inform. Theory, 49(5):1181–1203, May 2003.
  • [34] R. Rajesh and V. Sharma. Source channel coding for Gaussian sources over a Gaussian multiple access channel. Proc. 45 Allerton conference on computing control and communication, Monticello, IL, 2007.
  • [35] S. Ray, M. Medard, M. Effros, and R. Kotter. On separation for multiple access channels. Proc. IEEE Inform. Theory Workshop, 2006.
  • [36] H. L. Royden. Real Analysis. Prentice Hall, Inc., Englewood Cliffs, New Jersey, 1988.
  • [37] D. Slepian and J. K. Wolf. A coding theorem for multiple access channels with correlated sources. Bell Syst. Tech. J., 52(7):1037–1076, Sept. 1973.
  • [38] D. Slepian and J. K. Wolf. Noiseless coding of correlated information sources. IEEE Trans. Inform. Theory, 19(4):471–480, Jul. 1973.
  • [39] V. K. Varshneya and V. Sharma. Lossy distributed source coding with side information. Proc. National Conference on Communication (NCC), New Delhi, Jan 2006.
  • [40] V. V. Veeravalli. Decentralized quickest change detection. IEEE Trans. Inform. Theory, 47(4):1657–1665, May 2001.
  • [41] A. B. Wagner, S. Tavildar, and P. Viswanath. The rate region of the quadratic Gaussian two terminal source coding problem. IEEE Trans. Inform. Theory, 54(5):1938–1961, May 2008.
  • [42] A. Wyner and J. Ziv. The rate distortion function for source coding with side information at the receiver. IEEE Trans. Inform. Theory, IT-22:1–11, Jan. 1976.
  • [43] R. Zamir and T. Berger. Multiterminal source coding with high resolution. IEEE Trans. Inform. Theory, 45(1):106–117, Jan. 1999.