
The invalidity of a strong capacity for a quantum channel with memory

Tony Dorlas dorlas@stp.dias.ie School of Theoretical Physics
Dublin Institute for Advanced Studies
10 Burlington Road, Dublin 4, Ireland
   Ciara Morgan cqtciara@nus.edu.sg Centre for Quantum Technologies
National University of Singapore
3 Science Drive 2, Singapore 117543
Abstract

The strong capacity of a particular channel can be interpreted as a sharp limit on the amount of information which can be transmitted reliably over that channel. To evaluate the strong capacity of a particular channel one must prove both the direct part of the channel coding theorem and the strong converse for the channel. Here we consider the strong converse theorem for the periodic quantum channel and show some rather surprising results. We first show that the strong converse does not hold in general for this channel and therefore the channel does not have a strong capacity. Instead, we find that there is a scale of capacities corresponding to error probabilities between integer multiples of the inverse of the periodicity of the channel. A similar scale also exists for the random channel.

channel coding theorem, classical capacity, quantum channels with memory
pacs:
03.37.Hk, 89.70.Kn

I Introduction

The full channel coding theorem provides a limit on the rate at which a sender can communicate an encoded message to a receiver, such that the probability of a decoding error at the receiver’s side decays exponentially in the number of channel uses. The theorem comprises two parts: the direct part, which refers to the construction of the code, and the converse. The direct part of the quantum channel coding theorem states that, using $n$ copies of the channel, we can code with an exponentially small probability of error at a rate $R=\frac{1}{n}\log|\mathcal{M}|$, provided $R<C$ in the asymptotic limit, where $\mathcal{M}$ denotes the set of possible codewords to be transmitted over the channel and $C$ denotes the capacity of the channel. If the rate at which classical information is transmitted over a quantum channel exceeds the capacity of the channel, i.e. if $R>C$, then the probability of decoding the information correctly goes to zero in the number of channel uses. The latter is known as the strong converse to the channel coding theorem. The weak converse, on the other hand, states that if $R>C$, then the probability of decoding the information correctly is bounded away from $1$, i.e. the error probability does not tend to zero, whatever encoding/decoding scheme is used.
Shannon Shannon48 first proposed the theorem for classical discrete memoryless channels and the first rigorous proof of the direct part of the theorem was provided by Feinstein Feinstein and the strong converse by Wolfowitz Wolfowitz .
However, it was observed that the strong converse, and therefore a strong capacity, does not always exist for other types of classical channels WolfowitzNoCap . See Ahlswede Ahlswede06 for a more complete discussion of converse results for various types of classical channels.
The strong converse to the channel coding theorem for memoryless classical-quantum channels with product state inputs was determined independently by Winter Winter99 and by Ogawa and Nagaoka ON99 . Their result implies that every memoryless discrete classical-quantum channel has a strong capacity which provides a sharp upper bound on the rate at which classical information can be transmitted over this type of channel using product states.
Recent results include a proof by Bjelaković and Boche BB08 ; BB09 of a full coding theorem for the discrete memoryless compound classical-quantum channel. Wehner and König KW09 proved the fully general strong converse theorem for a family of channels, that is, they proved that the strong converse theorem holds for a family of quantum channels even in the case when entangled state inputs are allowed.
In this article we relax the assumption that the communication channel in question is memoryless and we concentrate on a particular quantum channel with memory, that is, a channel with correlations between successive channel uses. In our case the correlations between successive uses of the channel can be described by a Markov chain. Communication channels with memory are widely considered to be more realistic than memoryless channels since real-world channels may not exhibit independence between successive errors and correlations are common. Noise correlations are also necessary for certain models of quantum communication Bose03 . See for example Kretschmann and Werner KW05 and Mancini Mancini06 for models of quantum memory channels.
The article is organised as follows. In Section II we introduce notation and the necessary definitions, and define the quantum periodic channel. In Section III we prove that the strong converse does not hold in general for the periodic channel, so that this channel does not have a strong capacity. The argument relies on a result, proved in Appendix A, for a particular instance of a periodic channel. In Section IV we state and prove the main result, involving a scale of capacities for the periodic channel. In Section V we remark on a corresponding scale of capacities for the random channel.
Note that $\log$ is understood to be taken to the base $2$ throughout the article.

II Preliminaries

We begin by introducing some notation. A memoryless channel is given by a completely positive trace-preserving (CPT) map $\Phi:\mathcal{B}(\mathcal{H})\to\mathcal{B}(\mathcal{K})$, where $\mathcal{B}(\mathcal{H})$ and $\mathcal{B}(\mathcal{K})$ denote the states on the input and output Hilbert spaces $\mathcal{H}$ and $\mathcal{K}$, respectively.
Equivalently, we can describe a classical-quantum channel, here also denoted $\Phi$, as a mapping from the classical message to the output state of the channel on $\mathcal{B}(\mathcal{K})$ as follows,

$$\Phi:\mathcal{X}\mapsto\mathcal{B}(\mathcal{K}), \tag{2.1}$$

where the message is first encoded into a sequence belonging to the set $\mathcal{X}^{n}$, where $\mathcal{X}$ represents the input alphabet.
We can combine the two mapping descriptions as follows. We wish to send classical information in the form of quantum states over a quantum channel $\Phi$. A (discrete) memoryless quantum channel, $\Phi$, carrying classical information can be thought of as a map from a (finite) set, or alphabet, $\mathcal{X}$ into $\mathcal{B}(\mathcal{K})$, taking each $x\in\mathcal{X}$ to $\Phi_{x}=\Phi(\rho_{x})$, where the input states to the channel are given by $\{\rho_{x}\}_{x\in\mathcal{X}}$ and each $\rho_{x}\in\mathcal{B}(\mathcal{H})$. Let $d=\dim(\mathcal{H})$ and $a=|\mathcal{X}|$.
For a probability distribution $P$ on the input alphabet $\mathcal{X}$, the average output state of a channel $\Phi$ is given by

$$P\sigma=\sum_{x\in\mathcal{X}}P(x)\Phi(\rho_{x}). \tag{2.2}$$

The conditional von Neumann entropy of $\Phi$ given $P$ is defined by

$$S(\Phi|P)=\sum_{x\in\mathcal{X}}P(x)S(\Phi(\rho_{x})), \tag{2.3}$$

and the mutual information between the probability distribution $P$ and the channel $\Phi$ is defined as follows,

$$I(P;\Phi)=S(P\sigma)-S(\Phi|P). \tag{2.4}$$

An $n$-block code for a quantum channel $\Phi$ is a pair $(C^{n},E^{n})$, where $C^{n}$ is a mapping from a finite set of messages $\mathcal{M}$, of length $n$, into $\mathcal{X}^{n}$, i.e. a sequence $x^{n}\in\mathcal{X}^{n}$ is assigned to each of the $|\mathcal{M}|$ messages, and $E^{n}$ is a POVM, i.e. a quantum measurement, on the output space $\mathcal{K}^{\otimes n}$ of the channel $\Phi_{x^{n}}^{n}$. The maximum error probability of the code $(C^{n},E^{n})$ is defined as

$$p_{e}(C^{n},E^{n})=\max\,\{1-\mathrm{Tr}(\Phi_{C^{n}(m)}^{n}E^{n}_{m}):m\in\mathcal{M}\}. \tag{2.5}$$

The code $(C^{n},E^{n})$ is called an $(n,\lambda)$-code if $p_{e}(C^{n},E^{n})\leq\lambda$. The maximum size $|\mathcal{M}|$ of an $(n,\lambda)$-code is denoted $N(n,\lambda)$. Let $\mathcal{X}$ be a finite alphabet, consider sequences $x^{n}=x_{1},\dots,x_{n}\in\mathcal{X}^{n}$ and let

$$N(x\,|\,x^{n})=\big|\{i\in\{1,\dots,n\}:x_{i}=x\}\big| \tag{2.6}$$

for $x\in\mathcal{X}$. The type of the sequence $x^{n}$ is given by the empirical distribution $P_{x^{n}}$ on $\mathcal{X}$ such that

$$P_{x^{n}}(x)=\frac{N(x\,|\,x^{n})}{n}. \tag{2.7}$$

Clearly, the number of types is upper bounded by $(n+1)^{a}$, where $a=|\mathcal{X}|$.
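
The definitions (2.6) and (2.7) can be illustrated with a few lines of Python. This is only a minimal sketch: the function names `sequence_type` and `max_number_of_types`, the three-letter alphabet and the example sequence are assumptions made for illustration and do not appear in the text.

```python
from collections import Counter
from fractions import Fraction


def sequence_type(seq, alphabet):
    """Empirical distribution P_{x^n} of seq over alphabet, as exact fractions."""
    n = len(seq)
    counts = Counter(seq)  # counts[x] plays the role of N(x | x^n) in (2.6)
    return {x: Fraction(counts[x], n) for x in alphabet}


def max_number_of_types(n, a):
    """Upper bound (n+1)^a on the number of types of length-n sequences."""
    return (n + 1) ** a


# Hypothetical example: alphabet of size a = 3, sequence of length n = 7.
alphabet = ["0", "1", "2"]
x = "0102100"
p = sequence_type(x, alphabet)
print(p)
print(max_number_of_types(len(x), len(alphabet)))
```

Exact fractions are used so that two sequences have the same type precisely when the returned dictionaries compare equal.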

II.1 Coding theorem and strong converse

The strong capacity of a particular channel provides a sharp threshold on the rate at which information may be transmitted over that channel with exponentially decreasing probability of decoding error in the number of channel uses. In order to establish a strong capacity for a particular channel one must prove both existence of a capacity achieving code and the strong converse.
The direct part of the coding theorem for memoryless quantum channels with product-state inputs was determined independently by Holevo Hol98 and Schumacher and Westmoreland SW97 . Winter Winter99 and Ogawa and Nagaoka ON99 independently proved the strong converse for memoryless quantum channels.
In Section III we require a version of the strong converse theorem proved by Winter Winter99 which holds for a single codeword type. We therefore provide this version (Lemma II.1) below, following both the direct part and strong converse theorems for memoryless classical quantum channels as stated and proved in Winter99 .

Theorem II.1

(Direct part)
For all $\lambda\in(0,1)$ and $\delta>0$ there exists $n_{0}(\lambda,\delta)\in\mathbb{N}$ such that for all $n\geq n_{0}$ and every classical quantum channel $\Phi$ and probability distribution $P$ on $\mathcal{X}$, there exists an $(n,\lambda)$-code such that the number of messages satisfies

$$|\mathcal{M}_{n}|\geq 2^{n(\chi^{*}(\Phi)-\delta)}, \tag{2.8}$$

where the Holevo capacity $\chi^{*}$ is given by

$$\chi^{*}(\Phi)=\sup_{P}I(P;\Phi), \tag{2.9}$$

the supremum being over all probability distributions $P$ on $\mathcal{X}$.

Theorem II.2

(Strong converse)
For all $\lambda\in(0,1)$ and all $\delta>0$ there exists $n_{1}(\lambda,\delta)$ such that for all $n\geq n_{1}$ and every memoryless classical quantum channel $\Phi$, the number of messages of an $(n,\lambda)$-code is bounded by

$$|\mathcal{M}_{n}|\leq 2^{n(\chi^{*}(\Phi)+\delta)}. \tag{2.10}$$

Remark. Winter in fact proved a stronger version of these theorems in which $\delta$ is replaced by a constant times $1/\sqrt{n}$.

In the following we follow the approach of Winter (Winter99 , Theorem 13) in which the strong converse is derived from a bound on the number of codewords of a given type $P$:

Lemma II.1

(Single-type strong converse)
For $\lambda\in(0,1)$ and $\delta>0$ there exists $n_{1}(\lambda,\delta)$ such that for $n\geq n_{1}$, every $(n,\lambda)$-code for which all codewords are of the same type $P$ satisfies

$$|\mathcal{M}_{n,P}|\leq 2^{n(I(P;\,\Phi)+\delta)}. \tag{2.11}$$

The strong converse follows immediately from this lemma using the fact that the number of types is upper bounded by $(1+n)^{a}$ (see CK11 Lemma 2.2).
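
The counting step can be made explicit: splitting a code by codeword type gives $|\mathcal{M}_{n}|\leq(n+1)^{a}\max_{P}|\mathcal{M}_{n,P}|\leq 2^{n(\chi^{*}(\Phi)+\delta+a\log(n+1)/n)}$, so the sum over types costs only an extra rate term $a\log(n+1)/n$. The following sketch simply evaluates this term; the function name `type_overhead` and the alphabet size `a=4` are assumptions chosen for illustration.

```python
import math


def type_overhead(n, a):
    """Rate penalty a*log2(n+1)/n from summing over at most (n+1)^a types."""
    return a * math.log2(n + 1) / n


# Hypothetical alphabet size a = 4; the penalty shrinks as n grows.
for n in (10, 100, 1000, 10000):
    print(n, round(type_overhead(n, a=4), 4))
```

Since the penalty tends to zero, it can be absorbed into $\delta$ for sufficiently large $n$.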

Remark. In contrast to the strong converse, where the decoding error goes to $1$ exponentially in the number of channel applications if $R>C$, the weak converse states that if $R>C$, then the probability of decoding the information correctly is bounded away from $1$.

II.2 Quantum channels with classical memory

Next, we provide definitions needed to describe quantum channels with classical memory Norris . Let $I$ denote a countable set and let $\lambda_{i}=\mathbb{P}(X=i)$, where $X$ is a random variable taking values in the state space $I$. Let $Q$ denote a transition matrix, with entries labeled $q_{j|i}$. A discrete time random process denoted $X_{n}$ can be considered to be a Markov chain with transition matrix $Q$ and initial distribution $\lambda$, if and only if the following holds for $i_{0},\dots,i_{n-1}\in I$,

$$\mathbb{P}(X_{0}=i_{0},X_{1}=i_{1},\dots,X_{n-1}=i_{n-1})=\lambda_{i_{0}}q_{i_{1}|i_{0}}q_{i_{2}|i_{1}}\cdots q_{i_{n-1}|i_{n-2}}. \tag{2.12}$$

In DD09Markov Datta and Dorlas analyse a quantum channel of length $n$ with Markovian noise correlations, first defined by Bowen and Mancini BM04 , as follows

$$\Phi^{n}(\rho^{n})=\sum_{i_{0}\dots i_{n-1}}q_{i_{n-1}|i_{n-2}}\dots q_{i_{1}|i_{0}}\lambda_{i_{0}}(\Phi_{i_{0}}\otimes\cdots\otimes\Phi_{i_{n-1}})(\rho^{n}), \tag{2.13}$$

where $q_{j|i}$ are the elements of the transition matrix of a discrete-time Markov chain, and $\{\lambda_{i}\}$ represents an invariant distribution on the Markov chain.
In Section III we analyse a particular channel with classical memory, namely the periodic channel. We describe this channel below.
A periodic channel acting on an $n$-fold input state can be described as follows

$$\Phi^{n}\left(\rho^{n}\right)=\frac{1}{L}\sum_{i=0}^{L-1}\left(\Phi_{i}\otimes\Phi_{i+1}\otimes\cdots\otimes\Phi_{i+n-1}\right)\left(\rho^{n}\right), \tag{2.14}$$

where $\Phi_{i}$ are CPT maps acting on the same Hilbert space and the index is cyclic, modulo the period $L$, i.e. $\Phi_{i+L}=\Phi_{i}$. In this case the elements of the corresponding transition matrix are given by $q_{j|i}=\theta_{i,j}$, where

$$\theta_{i,j}=\begin{cases}1,&\text{if $j=i+1 \bmod L$}\\ 0,&\text{otherwise.}\end{cases} \tag{2.15}$$
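
To see how (2.13) reduces to (2.14), note that with the transition matrix (2.15) and the uniform initial distribution $\lambda_{i}=1/L$ only the $L$ cyclic index paths carry non-zero weight. The sketch below checks this on a small case; the helper names and the values `L = 3`, `n = 5` are assumptions made for illustration.

```python
def periodic_transition_matrix(L):
    """theta[i][j] = 1 if j == (i + 1) mod L, else 0, as in (2.15)."""
    return [[1 if j == (i + 1) % L else 0 for j in range(L)] for i in range(L)]


def branch_indices(i0, n, L):
    """Channel indices i0, i0+1, ..., i0+n-1 (mod L) used by branch i0."""
    return [(i0 + k) % L for k in range(n)]


def path_probability(path, L):
    """Weight of an index path in (2.13) with uniform initial distribution 1/L."""
    theta = periodic_transition_matrix(L)
    prob = 1.0 / L
    for i, j in zip(path, path[1:]):
        prob *= theta[i][j]
    return prob


# Hypothetical period and block length.
L, n = 3, 5
for i0 in range(L):
    path = branch_indices(i0, n, L)
    print(path, path_probability(path, L))  # each cyclic path has weight 1/L
print(path_probability([0, 2, 1, 0, 1], L))  # a non-cyclic path has weight 0
```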

The product-state capacity of the channel, denoted $C_{p}$, is given by

$$C_{p}\left(\Phi\right)=\frac{1}{L}\sup_{P}\sum_{i=0}^{L-1}I(P;\Phi_{i}). \tag{2.16}$$

The proof of the direct part of the channel coding theorem for the periodic quantum channel is provided in Appendix B of CMThesis . This is in fact a special case of the main result proved by Datta and Dorlas in DD09Markov . Note that the proof of the direct part of the coding theorem for this channel makes use of a preamble to the code which the receiver uses upon receipt to determine which branch of the channel was selected.

Another channel of the general type (2.13) is the random channel. It is given by

$$\Phi^{n}\left(\rho^{n}\right)=\sum_{i=1}^{M}q_{i}\,\Phi_{i}^{\otimes n}\left(\rho^{n}\right), \tag{2.17}$$

where $\Phi_{i}$ ($i=1,\dots,M$) are CPT maps acting on the same Hilbert space and $q_{1},\dots,q_{M}$ is a probability distribution. In this case the elements of the corresponding transition matrix are given by $q_{j|i}=\delta_{ij}$. It was shown in DD07CC that the product state capacity of this channel is given by

$$C_{p}\left(\Phi\right)=\sup_{P}\min_{i=1}^{M}I(P;\Phi_{i}). \tag{2.18}$$

We will remark on this channel, which like the periodic channel has long-term memory, in Section V.

III Channel without a strong converse

The strong converse for the periodic quantum channel does not hold in general because, for suitable channel branches, the following strict inequality holds

$$C_{p}<\overline{C_{p}}, \tag{3.19}$$

where

$$\overline{C_{p}}=\frac{1}{L}\sum_{i=0}^{L-1}\sup_{P}I(P;\Phi_{i}). \tag{3.20}$$

The strict inequality above can be shown explicitly for a periodic channel consisting of two branches of qubit amplitude-damping channels (see Appendix A below for a detailed proof). On the other hand, equality in expression (3.19) can be shown to hold for a periodic channel with depolarising channel branches DM09 .

Let us now investigate whether we can prove a full coding theorem for rates $R$ such that

$$C_{p}<R<\overline{C_{p}}. \tag{3.21}$$

We first define the average probability of error as follows

$$\overline{p_{e}}=\frac{1}{L}\sum_{i=0}^{L-1}p_{e}^{i}\leq\lambda, \tag{3.22}$$

where $p_{e}^{i}$ denotes the probability of error for the $i$-th channel branch.

Our coding strategy is as follows. We choose a code, i.e. a particular encoding and decoding scheme, suitable for a particular channel branch labeled by the index $i\in\{0,\dots,L-1\}$. Here a ‘branch’ is defined as one term in the sum (2.14), i.e. $\Phi_{i}^{(n)}=\Phi_{i}\otimes\Phi_{i+1}\otimes\dots\otimes\Phi_{i+n-1}$. According to the coding theorem for memoryless channels, there is a code with error probability tending to zero for this branch with rate $R$. Indeed, for each $j$ there exists a probability distribution $P_{j}$ of states optimising $\chi_{j}^{*}=\sup_{P}I(P;\Phi_{j})$ and we can choose states from a typical subspace for these distributions, which can be interlaced at the positions $j-i+kL$, where $k\in[\frac{n}{L}-1]$. The probability that the channel selects this particular branch is $\frac{1}{L}$ and therefore the probability of error approaches

$$p_{e}=\frac{L-1}{L}<1. \tag{3.23}$$

We thus have a $\lambda$-code for all $\lambda>1-\frac{1}{L}$. In particular, the error probability is bounded away from $1$, and the strong converse does not hold.
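
The value (3.23) is simply the average (3.22) in the idealised limit where the code tuned to one branch decodes perfectly on that branch and, in the worst case, fails on every other branch. A small sketch of this arithmetic follows; the function name `average_error` and the worst-case assumption of error $1$ on the untuned branches are illustrative assumptions, not statements from the text.

```python
from fractions import Fraction


def average_error(L, tuned_branch=0):
    """Average of the branch errors when only the tuned branch decodes correctly."""
    # Idealised limit: error 0 on the tuned branch, error 1 on every other branch.
    branch_errors = [0 if i == tuned_branch else 1 for i in range(L)]
    return Fraction(sum(branch_errors), L)


for L in (2, 3, 5):
    print(L, average_error(L))  # equals (L - 1) / L
```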

On the other hand, the strong converse does hold for $R>\overline{C_{p}}$. Indeed, the codewords can be decomposed into sub-codewords corresponding to the different stages of a period: $x^{n}=(x_{0}^{n},\dots,x_{L-1}^{n})$, where the components of the $x_{i}^{n}$ are understood to be interlaced in $x^{n}$. We distinguish types $P_{0},\dots,P_{L-1}$ for the sub-codewords. Then we have an analogue of the single-type strong converse given by Lemma II.1:

Lemma III.1

For $\lambda\in(0,1)$ and $\delta>0$ there exists $n_{1}(\lambda,\delta)$ such that for $n\geq n_{1}$, every $(n,\lambda)$-code for which the sub-codewords are of types $P_{0},\dots,P_{L-1}$ satisfies, given that the $i$-th branch is selected,

$$|\mathcal{M}_{n,P_{0},\dots,P_{L-1}}|\leq 2^{\frac{n}{L}\sum_{k=0}^{L-1}(I(P_{k};\,\Phi_{i+k})+\delta)}. \tag{3.24}$$

Clearly, for the complete channel, it follows that the number of codewords such that the sub-codewords are of types $P_{0},\dots,P_{L-1}$ satisfies

$$|\mathcal{M}_{n,P_{0},\dots,P_{L-1}}|\leq 2^{\frac{n}{L}\sum_{k=0}^{L-1}(\sup_{P}I(P;\Phi_{i+k})+\delta)}. \tag{3.25}$$

Summing over the types, we obtain the strong converse.

We can conclude that the strong converse holds for rates $R>\overline{C_{p}}$.

IV A scale of capacities

The above obviously raises the question of whether smaller error probabilities can be attained for smaller rates, but still above $C_{p}$. For this, we define a ‘pair capacity’ $C_{p}^{\,(2)}$ as follows:

$$C_{p}^{\,(2)}=\frac{1}{2L}\max_{0\leq i_{1}<i_{2}<L}\sum_{k=0}^{L-1}\sup_{P}\left(I(P;\Phi_{i_{1}+k})+I(P;\Phi_{i_{2}+k})\right). \tag{4.26}$$

Suppose the maximum is attained at a certain pair $(i_{1},i_{2})$. With probability $2/L$, one of the two branches $i_{1}$ or $i_{2}$ is chosen. We attach a preamble to the code as in the proof of the product-state capacity of the periodic channel (2.16). If, for example, the branch $i_{1}$ is selected by the channel, the receiver can determine that this is the case by measuring the preamble, and can then choose states for each value of $k$ from the typical space corresponding to the maximising distribution $P_{k}$ for the CPT map $\Phi_{i_{1}+k}$. This constitutes an encoding with rate given by the average of the mutual informations $I(P;\Phi_{i_{1}+k})$ for $k=0,\dots,L-1$, which is greater than or equal to the pair capacity $C_{p}^{\,(2)}$ given by Equation (4.26). We have thus constructed a $\lambda$-code for $\lambda>1-\frac{2}{L}$.

On the other hand, let $R>C^{\,(2)}_{p}$ and suppose that $(C^{n},E^{n})$ is a sequence of $(n,\lambda)$-codes with $\lambda<1-\frac{1}{L}$, and assume that

$$\frac{1}{n}\log|\mathcal{M}_{n}|\geq R>C_{p}^{\,(2)}. \tag{4.27}$$

First note that we may assume that the number of codewords with sub-codewords of types $P_{0},\dots,P_{L-1}$ is bounded by

$$\frac{1}{n}\log|\mathcal{M}_{n,P_{0},\dots,P_{L-1}}|\leq\frac{1}{L}\sum_{k=0}^{L-1}(I(P_{k};\Phi_{i+k})+\delta) \tag{4.28}$$

for some fixed $i\in\{0,\dots,L-1\}$. Indeed, otherwise, by Lemma III.1, $p^{i}_{e}>\lambda$ for all $i$ and hence $p_{e}>\lambda$.

We now claim that for every other $j\neq i$, and $\epsilon>0$ small enough,

$$\frac{1}{n}\log|\mathcal{M}_{n,P_{0},\dots,P_{L-1}}|>\frac{1}{L}\sum_{k=0}^{L-1}(I(P_{k};\Phi_{j+k})+\epsilon). \tag{4.29}$$

If this were not the case then the pair capacity for a single type $P_{k}$ can be written as

$$\frac{1}{n}\log|\mathcal{M}_{n,P_{0},\dots,P_{L-1}}|\leq\frac{1}{2L}\sum_{k=0}^{L-1}\bigl(I(P_{k};\Phi_{i+k})+I(P_{k};\Phi_{j+k})+\delta\bigr) \tag{4.30}$$

and hence

$$\frac{1}{n}\log|\mathcal{M}_{n,P_{0},\dots,P_{L-1}}|\leq\frac{1}{2L}\sum_{k=0}^{L-1}\bigl(\sup_{P}\{I(P;\Phi_{i+k})+I(P;\Phi_{j+k})\}+\delta\bigr). \tag{4.31}$$

Summing over the types $P_{0},\dots,P_{L-1}$ leads to a contradiction with (4.27).

Now, expression (4.29) implies with Lemma III.1 that, if the $j$-th branch is selected by the channel, then the error probability $p^{j}_{e}>1-\eta$ for any $\eta>0$. Since with probability $1-\frac{1}{L}$ one of the branches $j$ other than $i$ is selected, we conclude that the error probability $p_{e}>\left(1-\frac{1}{L}\right)(1-\eta)>\lambda$ if $\eta<1-\frac{1}{L}-\lambda$ is small enough.

It is now clear that this argument can be generalised to prove:

Theorem IV.1

Define, for $r=1,\dots,L$, a scale of capacities $C_{p}^{(r)}$ by

$$C_{p}^{(r)}=\frac{1}{rL}\max_{0\leq i_{1}<\dots<i_{r}<L}\sum_{k=0}^{L-1}\sup_{P}\sum_{m=1}^{r}I(P;\Phi_{i_{m}+k}). \tag{4.32}$$

(Note that $C_{p}^{(1)}=\overline{C_{p}}$ and $C_{p}^{(L)}=C_{p}$: for $r=1$ the supremum is taken separately at each position $k$, as in (3.20), while for $r=L$ the inner sum runs over all branches, as in (2.16).) Then, if $\lambda>1-\frac{r}{L}$ and $R<C_{p}^{(r)}$, there exists a sequence of $(n,\lambda)$-codes with rate $R$. Conversely, if $\lambda<1-\frac{r-1}{L}$, there exists no sequence of $(n,\lambda)$-codes with rate $R>C_{p}^{(r)}$.
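
The scale (4.32) can be explored numerically. The sketch below rests on assumptions that are not part of the theorem: the branches are taken to be qubit amplitude-damping channels with hypothetical damping parameters, the supremum over $P$ is restricted to the one-parameter mirror-image ensemble of Appendix A, and it is approximated by a grid search. The names `h_bin`, `chi`, `GRID` and `scale_capacity` are illustrative only.

```python
import math
from itertools import combinations


def h_bin(p):
    """Binary entropy in bits, with 0*log(0) taken as 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def chi(a, gamma):
    """Holevo quantity of an amplitude-damping channel for the mirror-image ensemble."""
    x = math.sqrt(max(0.0, 1 - 4 * gamma * (1 - gamma) * (1 - a) ** 2))
    return h_bin((1 - a) * (1 - gamma)) - h_bin((1 + x) / 2)


GRID = [k / 1000 for k in range(1001)]  # assumed search grid for the parameter a


def scale_capacity(gammas, r):
    """Grid evaluation of C_p^{(r)} in (4.32) for branches with damping gammas."""
    L = len(gammas)
    best = 0.0
    for subset in combinations(range(L), r):
        total = 0.0
        for k in range(L):
            # sup over the ensemble parameter of the summed mutual informations
            total += max(sum(chi(a, gammas[(i + k) % L]) for i in subset) for a in GRID)
        best = max(best, total / (r * L))
    return best


gammas = [0.1, 0.5, 0.9]  # hypothetical damping parameters, period L = 3
for r in range(1, len(gammas) + 1):
    print(r, round(scale_capacity(gammas, r), 4))
```

The printed values are non-increasing in $r$, in line with the theorem: a larger admissible error probability $\lambda$ corresponds to a smaller $r$ and hence to a larger achievable rate.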

V The random channel

The situation for the random channel is similar, but more complicated due to the fact that different branches can have different probabilities $q_{i}$. We can in general distinguish break points at values of the error probability given by

$$q(\Delta)=\sum_{i\in\Delta}q_{i},\qquad\Delta\subset\{1,\dots,M\}. \tag{5.33}$$

We have an analogue of the detailed theorem for periodic channels above:

Theorem V.1

Define, for $\Delta\subset\{1,\dots,M\}$, a scale of capacities $C_{p}^{\Delta}$ by

$$C_{p}^{\Delta}=\sup_{P}\min_{i\in\Delta}I(P;\Phi_{i}). \tag{5.34}$$

Then, if $\lambda>1-q(\Delta)$ and $R<C_{p}^{\Delta}$, there exists a sequence of $(n,\lambda)$-codes with rate $R$.

For the converse to the theorem, we introduce another scale as follows:

$$\overline{C_{p}^{\Delta}}=\sup_{P}\max_{i\in\Delta}I(P;\Phi_{i}). \tag{5.35}$$

Then, if $\lambda<1-q(\Delta)$ there exist no $(n,\lambda)$-codes with rate $R>\overline{C_{p}^{\Delta}}$.
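
The break points (5.33) are easy to enumerate for a small number of branches. In the following sketch the function name `break_points` and the branch probabilities are hypothetical; it lists every non-empty subset $\Delta$ together with $q(\Delta)$ and the corresponding threshold $1-q(\Delta)$ on $\lambda$ from Theorem V.1.

```python
from itertools import combinations


def break_points(q):
    """Map each non-empty subset Delta of {1, ..., M} to q(Delta) as in (5.33)."""
    M = len(q)
    points = {}
    for size in range(1, M + 1):
        for delta in combinations(range(1, M + 1), size):
            points[delta] = sum(q[i - 1] for i in delta)
    return points


q = [0.5, 0.3, 0.2]  # hypothetical branch probabilities q_1, q_2, q_3
for delta, mass in sorted(break_points(q).items(), key=lambda kv: kv[1]):
    print(delta, round(mass, 3), "direct part needs lambda >", round(1 - mass, 3))
```

Different subsets can share the same value of $q(\Delta)$, as happens here for $\{1\}$ and $\{2,3\}$, so the number of distinct break points can be smaller than the number of subsets.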

The situation is less clear-cut than it seems, however. In fact, not every $q(\Delta)$ is necessarily a point of discontinuity for the capacity, because $C_{p}^{\Delta}$ is in general not monotonic in the probabilities $q(\Delta)$!

VI Discussion

One of the most surprising and interesting results which has emerged from Shannon Theory is the observation that the strong information-carrying capacity of a memoryless channel is independent of the upper bound on the maximum error probability of that channel, usually denoted $\lambda$. The independence of the parameter $\lambda$ is crucial to the existence of a so-called strong capacity for the channel Wolfowitz .
The dependency of some channel capacities on this parameter $\lambda$, including non-stationary discrete memoryless classical channels, led to the definition of a capacity function Ahlswede06 . Note that recently Ahlswede Ahlswede10 proved that the capacity functions can now be thought of as so-called capacity-sequences.
For the case of the quantum periodic channel, and also the random channel, we have shown that an analogous parameter-dependent capacity can be defined, which takes the form of a scale of capacities applicable for various ranges of the error parameter.

Note. It appears that similar results to ours were obtained by Datta, Hsieh and Brandão DHB11 , using different methods.

Appendix A The periodic channel with amplitude-damping channel branches

The qubit amplitude-damping channel acting on the state $\rho=\begin{pmatrix}a&b\\ \overline{b}&1-a\end{pmatrix}$ is given by

$$\Phi_{amp}(\rho)=\begin{pmatrix}a+(1-a)\gamma&b\sqrt{1-\gamma}\\ \overline{b}\sqrt{1-\gamma}&(1-a)(1-\gamma)\end{pmatrix}. \tag{1.36}$$

The expression for the product-state capacity of the qubit amplitude-damping channel is given as follows,

\begin{align}
\chi\left(\Phi_{amp}(\{p_{j},\rho_{j}\})\right)
&= S\left[\sum_{j}\begin{pmatrix}p_{j}\left(a_{j}+(1-a_{j})\gamma\right)&p_{j}b_{j}\sqrt{1-\gamma}\\ p_{j}\overline{b}_{j}\sqrt{1-\gamma}&p_{j}(1-a_{j})(1-\gamma)\end{pmatrix}\right] \tag{1.38}\\
&\quad - \sum_{j}p_{j}\,S\begin{pmatrix}a_{j}+(1-a_{j})\gamma&b_{j}\sqrt{1-\gamma}\\ \overline{b}_{j}\sqrt{1-\gamma}&(1-a_{j})(1-\gamma)\end{pmatrix}. \tag{1.41}
\end{align}

We now investigate whether the following equation holds for a periodic channel with two amplitude-damping channel branches

$$\frac{1}{2}\sup_{\{p_{j},\rho_{j}\}}\sum_{i=0}^{1}\chi_{i}(\{p_{j},\rho_{j}\})=\frac{1}{2}\sum_{i=0}^{1}\sup_{\{p_{j},\rho_{j}\}}\chi_{i}(\{p_{j},\rho_{j}\}). \tag{1.42}$$

Let $\gamma_{0}$ and $\gamma_{1}$ represent the error parameters for two amplitude-damping channels $\Phi_{0}$ and $\Phi_{1}$ respectively. We have argued DM08 that the Holevo quantity for the qubit amplitude-damping channel can be increased using an ensemble containing two mirror image pure states each with probability $\frac{1}{2}$. Using this minimal ensemble we investigate both sides of Equation (1.42), for a periodic channel with two qubit amplitude-damping channel branches.
Clearly the left hand side of Equation (1.42) will be attained for a single parameter which we denote by $a_{max}$. However, the right hand side of Equation (1.42) cannot be obtained by a single $a_{max}$. Instead, the supremum for each channel will be attained at a different value of the input state parameter $a$. We denote by $a_{max_{0}}$ and $a_{max_{1}}$ the state parameter that achieves the product-state capacity for the channels $\Phi_{0}$ and $\Phi_{1}$ respectively. Let $\chi_{0}(a)$ and $\chi_{1}(a)$ denote the Holevo quantities of the channels $\Phi_{0}$ and $\Phi_{1}$, respectively. Denoting $x_{i}=\sqrt{1-4\gamma_{i}\left(1-\gamma_{i}\right)(1-a)^{2}}$, the eigenvalues for each of the amplitude-damping channels can be written as

$$\lambda_{amp_{i}\pm}=\frac{1}{2}\left(1\pm\sqrt{1-4\gamma_{i}(1-\gamma_{i})(1-a)^{2}}\right). \tag{1.43}$$
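
As a sanity check on (1.43), the eigenvalues can be computed directly from the output matrix (1.36) for a pure input state, for which $|b|^{2}=a(1-a)$, and compared with the closed form. This is only a sketch; the function names are assumptions, and the identity being tested is that the determinant of the output state equals $\gamma(1-\gamma)(1-a)^{2}$.

```python
import math


def output_eigenvalues(a, gamma):
    """Eigenvalues of Phi_amp(rho) for a pure qubit state with |b|^2 = a(1-a)."""
    top = a + (1 - a) * gamma            # upper-left entry of (1.36)
    bottom = (1 - a) * (1 - gamma)       # lower-right entry of (1.36)
    off_sq = a * (1 - a) * (1 - gamma)   # |b|^2 (1 - gamma)
    det = top * bottom - off_sq
    disc = math.sqrt(max(0.0, 1 - 4 * det))  # the trace of the output state is 1
    return (1 + disc) / 2, (1 - disc) / 2


def closed_form(a, gamma):
    """Eigenvalues according to (1.43)."""
    x = math.sqrt(max(0.0, 1 - 4 * gamma * (1 - gamma) * (1 - a) ** 2))
    return (1 + x) / 2, (1 - x) / 2


for a, gamma in [(0.3, 0.2), (0.5, 0.5), (0.9, 0.8)]:
    print(a, gamma, output_eigenvalues(a, gamma), closed_form(a, gamma))
```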

The values for $a_{max_{0}}$ and $a_{max_{1}}$ can be determined by separately solving the following equation for each channel

$$\begin{aligned}
\frac{d\chi_{i}(a)}{da} &= (1-\gamma_{i})\ln\left(\frac{(1-a)(1-\gamma_{i})}{a+(1-a)\gamma_{i}}\right) + \frac{2\gamma_{i}(1-\gamma_{i})(1-a)}{x_{i}}\ln\left(\frac{1+x_{i}}{1-x_{i}}\right)\\
&= 0.
\end{aligned} \tag{1.44}$$

Let $\chi^{*}_{avg}(\gamma_{0},\gamma_{1},a_{max_{0}},a_{max_{1}})$ denote the average of the supremum of the Holevo capacities of the channels $\Phi_{0}$ and $\Phi_{1}$, i.e.,

$$\chi^{*}_{avg}(\gamma_{0},\gamma_{1},a_{max_{0}},a_{max_{1}})=\frac{1}{2}\left(\chi_{0}^{*}(a_{max_{0}})+\chi_{1}^{*}(a_{max_{1}})\right). \tag{1.45}$$

It is not difficult to show that

$$\chi^{*}(\gamma_{0}=1,\gamma_{1},a_{max})=\chi^{*}_{avg}(\gamma_{0}=1,\gamma_{1},a_{max_{0}},a_{max_{1}}). \tag{1.46}$$

Similarly, we can show that

$$\chi^{*}(\gamma_{0},\gamma_{1}=1,a_{max})=\chi^{*}_{avg}(\gamma_{0},\gamma_{1}=1,a_{max_{0}},a_{max_{1}}).$$

Next, we show separately for a) $\gamma_{i}=0$ and for b) $0<\gamma_{i}<1$ that the following inequality holds

$$\chi^{*}(\gamma_{0},\gamma_{1},a_{max})<\chi^{*}_{avg}(\gamma_{0},\gamma_{1},a_{max_{0}},a_{max_{1}}). \tag{1.47}$$
  a)

    Taking $\gamma_{0}=0$, the expression $\chi^{*}(\gamma_{0},\gamma_{1},a_{max})$ becomes

    $$\chi^{*}(\gamma_{0}=0,\gamma_{1},a_{max})=H_{bin}(a_{max_{1}})+H_{bin}((1-a_{max_{1}})(1-\gamma_{1}))-S\left(\Phi_{1}\left(\rho_{a_{max}}\right)\right). \tag{1.48}$$

    Denoting $\chi^{*}_{avg}(\gamma_{0},\gamma_{1},a_{max_{0}},a_{max_{1}})$ by $\chi^{*}_{avg}\left(\gamma_{1}\right)$, the right hand side becomes

    $$\chi^{*}_{avg}\left(\gamma_{1},a_{max_{1}}\right)=H_{bin}(a_{max_{1}})+H_{bin}((1-a_{max_{1}})(1-\gamma_{1}))-S\left(\Phi_{1}\left(\rho_{a_{max_{1}}}\right)\right). \tag{1.49}$$

    Clearly, $a_{max_{0}}=\frac{1}{2}$. To show that $a_{max}<a_{max_{1}}$, we must show that $\frac{d}{da}\sum_{i}\chi_{i}(a)<0$ at $a=a_{max_{1}}$.

    For $\gamma_{0}=0$ the Holevo quantity of the channel $\Phi_{0}$ becomes

    $$\chi_{0}(a)=S\begin{pmatrix}a&0\\ 0&1-a\end{pmatrix}-S(\rho). \tag{1.50}$$

    But $\rho$ is a pure state and therefore $S(\rho)=0$. Therefore, from Equation (1.44),

    $$\frac{d\chi_{0}(a)}{da}=\ln\left(\frac{1-a}{a}\right). \tag{1.51}$$

    We have previously shown that the maximising state parameter for the amplitude-damping channel is achieved at $a\geq\frac{1}{2}$ DM08 . We are considering the case where $\gamma_{0}\neq\gamma_{1}$, i.e. $\gamma_{1}\neq 0$, therefore $a_{max_{1}}>\frac{1}{2}$. The expression $\chi_{0}(a)$ now represents the binary entropy, $H(a)$, and is therefore maximised at $a=\frac{1}{2}$. It was shown above that the entropy $S(a)$ is a strictly concave function for $\gamma_{0}=0$ and $\chi_{0}(a)$ is therefore decreasing at $a=a_{max_{1}}$.

    The capacity $\chi_{1}^{*}(a)$ is achieved at $a=a_{max_{1}}$. Therefore $\frac{d\chi_{1}(a)}{da}$ is equal to zero at this point.

    We can now conclude that $\frac{d}{da}\sum_{i}\chi_{i}(a)<0$ when $a=a_{max_{1}}$ and therefore

    $$\chi^{*}(\gamma_{0}=0,\gamma_{1},a_{max})<\chi^{*}_{avg}(\gamma_{0}=0,\gamma_{1},a_{max_{0}},a_{max_{1}}).$$
  2. b)

    We now show that an inequality exists between the expressions χ(γ0,γ1,amax)\chi^{*}(\gamma_{0},\gamma_{1},a_{max}) and χavg(γ0,γ1,amax0,amax1)\chi^{*}_{avg}(\gamma_{0},\gamma_{1},a_{max_{0}},a_{max_{1}}) for fixed γ0\gamma_{0}, such that 0<γ0<10<\gamma_{0}<1.

    In DM08 we proved that if $\gamma_{0}<\gamma_{1}$, then $\chi(\gamma_{0})>\chi(\gamma_{1})$ and therefore $a_{max_{0}}<a_{max_{1}}$. Therefore, $\frac{d\chi_{0}(a)}{da}<0$ at $a=a_{max_{1}}$ and $a_{max}<a_{max_{1}}$. Similarly, if $\gamma_{0}>\gamma_{1}$, then $a_{max_{0}}>a_{max_{1}}$, so $\frac{d\chi_{0}(a)}{da}>0$ at $a=a_{max_{1}}$ and $a_{max}>a_{max_{1}}$.

    As a result, $a_{max}$ will always lie between $a_{max_{0}}$ and $a_{max_{1}}$. We have previously shown DM08 that the Holevo quantity for the qubit amplitude-damping channel is concave in its single state parameter. Therefore $a_{max}>\tilde{a}$, where $\tilde{a}$ is the parameter value associated with $\chi^{*}_{avg}(\gamma_{0},\gamma_{1},a_{max_{0}},a_{max_{1}})$, i.e. $\sum_{i}\sup_{a}\chi_{i}(a)=\chi^{*}_{\gamma_{0},\gamma_{1}}(\tilde{a})$. This proves that $\chi^{*}(\gamma_{0},\gamma_{1},a_{max})<\chi^{*}_{avg}(\gamma_{0},\gamma_{1},a_{max_{0}},a_{max_{1}})$.

In conclusion, if $\gamma_{0}=1$ or $\gamma_{1}=1$, then $a_{max}=a_{max_{0}}$ or $a_{max}=a_{max_{1}}$ respectively and $\chi^{*}(\gamma_{0},\gamma_{1},a_{max})=\chi^{*}_{avg}(\gamma_{0},\gamma_{1},a_{max_{0}},a_{max_{1}})$. However, if $\gamma_{0},\gamma_{1}\neq 1$, then $\chi^{*}(\gamma_{0},\gamma_{1},a_{max})<\chi^{*}_{avg}(\gamma_{0},\gamma_{1},a_{max_{0}},a_{max_{1}})$.
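
The inequality can also be checked numerically. The following is a minimal sketch, not the authors' method. It assumes that the Holevo quantity of branch $i$ has the form $\chi_{i}(a)=H_{bin}((1-a)(1-\gamma_{i}))-S(\Phi_{i}(\rho_{a}))$ with $\rho_{a}$ a pure state, that the eigenvalues of $\Phi_{i}(\rho_{a})$ are $\frac{1}{2}\left(1\pm\sqrt{1-4\gamma_{i}(1-\gamma_{i})(1-a)^{2}}\right)$ (the standard qubit amplitude-damping output spectrum), and that both $\chi^{*}$ and $\chi^{*}_{avg}$ carry equal weights $\frac{1}{2}$ on the two branches. The function names `h_bin`, `chi`, `argmax_on_grid` and `compare`, and the grid search itself, are illustrative choices.

```python
import math


def h_bin(p):
    """Binary entropy in nats, set to 0 at the endpoints."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)


def chi(a, gamma):
    """Assumed Holevo quantity of the amplitude-damping channel at state parameter a."""
    # Entropy of the channel output for the averaged (diagonal) input state.
    avg_entropy = h_bin((1.0 - a) * (1.0 - gamma))
    # Entropy of the output for the pure input state rho_a (assumed eigenvalue formula).
    disc = max(0.0, 1.0 - 4.0 * gamma * (1.0 - gamma) * (1.0 - a) ** 2)
    out_entropy = h_bin(0.5 * (1.0 + math.sqrt(disc)))
    return avg_entropy - out_entropy


def argmax_on_grid(f, n=2001):
    """Return (a, f(a)) maximising f over a uniform grid on [0, 1]."""
    grid = [k / (n - 1) for k in range(n)]
    best = max(grid, key=f)
    return best, f(best)


def compare(gamma0, gamma1):
    """Compare the sup of the averaged Holevo quantity with the average of the sups."""
    a0, chi0 = argmax_on_grid(lambda a: chi(a, gamma0))
    a1, chi1 = argmax_on_grid(lambda a: chi(a, gamma1))
    a_max, chi_star = argmax_on_grid(
        lambda a: 0.5 * (chi(a, gamma0) + chi(a, gamma1))
    )
    chi_avg = 0.5 * (chi0 + chi1)
    return a0, a1, a_max, chi_star, chi_avg


if __name__ == "__main__":
    a0, a1, a_max, chi_star, chi_avg = compare(0.0, 0.5)
    print(a0, a1, a_max, chi_star, chi_avg)
```

Under these assumptions, calling `compare(0.0, 0.5)` should place `a_max` strictly between `a0` and `a1` and give `chi_star < chi_avg`, in line with case a) above. The grid resolution limits the accuracy of the maximisers, so this is a sanity check rather than a proof.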

References

  • (1) C. Shannon, “A mathematical theory of communication,” The Bell Syst. Tech. J., vol. 27, pp. 379–423, 623–656, 1948.
  • (2) A. Feinstein, “A new basic theorem of information theory,” IRE Trans. Inf. Theory PGIT, vol. 4, p. 2, 1954.
  • (3) J. Wolfowitz, Coding Theorems of Information Theory. Berlin: Springer, 1964.
  • (4) J. Wolfowitz, “On channels without a capacity,” Inf. Control, vol. 6, p. 49, 1963.
  • (5) R. Ahlswede, “On concepts of performance parameters for channels,” in General Theory of Information Transfer and Combinatorics, vol. 4123 of Lecture Notes in Computer Science, pp. 639–663, Springer Berlin, Heidelberg, 2006.
  • (6) A. Winter, “Coding theorem and strong converse for quantum channels,” IEEE Trans. Inf. Theory, vol. 45, p. 2481, 1999.
  • (7) T. Ogawa and H. Nagaoka, “Strong converse to the quantum channel coding theorem,” IEEE Trans. Inf. Theory, vol. 45, p. 2486, 1999. arXiv:quant-ph/9808063.
  • (8) I. Bjelaković and H. Boche, “Classical capacities of compound quantum channels,” in Proc. IEEE Information Theory Workshop (ITW 2008) Porto, Portugal, 2008.
  • (9) I. Bjelaković and H. Boche, “Classical capacities of averaged and compound channels,” IEEE Trans. Inf. Theory, vol. 55, p. 3360, 2009.
  • (10) R. König and S. Wehner, “A strong converse for classical channel coding using entangled inputs,” Phys. Rev. Lett., vol. 103, p. 070504, 2009. arXiv:0903.2838.
  • (11) S. Bose, “Quantum communication through an unmodulated spin chain,” Phys. Rev. Lett., vol. 91, p. 207901, 2003. arXiv:quant-ph/0212041.
  • (12) D. Kretschmann and R. Werner, “Quantum channels with memory,” Phys. Rev. A, vol. 72, p. 062323, 2005. arXiv:0803.2069.
  • (13) S. Mancini, “Models for quantum memory channels,” J. Phys.: Conf. Ser., vol. 36, p. 121, 2006.
  • (14) A. Holevo, “The capacity of the quantum channel with general signal states,” IEEE Trans. Inf. Theory, vol. 44, p. 269, 1998. arXiv:quant-ph/9611023.
  • (15) B. Schumacher and M. Westmoreland, “Sending classical information via noisy quantum channels,” Phys. Rev. A, vol. 56, p. 131, 1997.
  • (16) I. Csiszár and J. Körner, Information Theory: Coding Theorems for Discrete Memoryless Systems. Cambridge: Cambridge University Press, second ed., 2011.
  • (17) J. Norris, Markov Chains. Cambridge: Cambridge University Press, 1997.
  • (18) N. Datta and T. Dorlas, “Classical capacity of quantum channels with general Markovian correlated noise,” J. Stat. Phys., vol. 134, p. 1173, 2009. arXiv:0712.0722.
  • (19) G. Bowen and S. Mancini, “Quantum channels with a finite memory,” Phys. Rev. A, vol. 69, p. 012306, 2004. arXiv:quant-ph/0305010.
  • (20) C. Morgan, The information-carrying capacity of certain quantum channels. PhD thesis, University College Dublin, 2010. arXiv:1007.2723.
  • (21) N. Datta and T. Dorlas, “The coding theorem for a class of quantum channels with long-term memory,” J. Phys. A, vol. 40, p. 8147, 2007. arXiv:quant-ph/0610049.
  • (22) T. Dorlas and C. Morgan, “Classical capacity of quantum channels with memory,” Phys. Rev. A, vol. 79, p. 032320, 2009. arXiv:0902.2834.
  • (23) R. Ahlswede, “Every channel with time structure has a capacity sequence,” in Proc. IEEE Information Theory Workshop (ITW 2010), Dublin, Ireland, 2010.
  • (24) N. Datta, M. Hsieh, and F. Brandão, “Strong converse rates and an example of violation of the strong converse property.” arXiv:1106.3089.
  • (25) T. Dorlas and C. Morgan, “Calculating a maximizer for quantum mutual information,” Int. J. Quantum Inf., vol. 6, p. 745, 2008. arXiv:1107.3741.