The invalidity of a strong capacity for a quantum channel with memory

Tony Dorlas (dorlas@stp.dias.ie)
School of Theoretical Physics
Dublin Institute for Advanced Studies
10 Burlington Road, Dublin 4, Ireland

Ciara Morgan (cqtciara@nus.edu.sg)
Centre for Quantum Technologies
National University of Singapore
3 Science Drive 2, Singapore 117543

Abstract

The strong capacity of a particular channel can be interpreted as a sharp limit on the amount of information which can be transmitted reliably over that channel. To evaluate the strong capacity of a particular channel one must prove both the direct part of the channel coding theorem and the strong converse for the channel. Here we consider the strong converse theorem for the periodic quantum channel and show some rather surprising results. We first show that the strong converse does not hold in general for this channel and therefore the channel does not have a strong capacity. Instead, we find that there is a scale of capacities corresponding to error probabilities between integer multiples of the inverse of the periodicity of the channel. A similar scale also exists for the random channel.

channel coding theorem, classical capacity, quantum channels with memory

PACS numbers: 03.67.Hk, 89.70.Kn

I Introduction

The full channel coding theorem provides a limit on the rate at which a sender can communicate an encoded message to a receiver such that the probability of a decoding error at the receiver’s side decays exponentially in the number of channel uses. The theorem comprises two parts: the direct part, which refers to the construction of the code, and the converse. The direct part of the quantum channel coding theorem states that, using $n$ copies of the channel, we can code with an exponentially small probability of error at a rate $R=\frac{1}{n}\log|\mathcal{M}|$, provided $R<C$ in the asymptotic limit, where $\cal M$ denotes the set of possible codewords to be transmitted over the channel and $C$ denotes the capacity of the channel. If the rate at which classical information is transmitted over a quantum channel exceeds the capacity of the channel, i.e. if $R>C$, then the probability of decoding the information correctly goes to zero as the number of channel uses increases. This is known as the strong converse to the channel coding theorem. The weak converse, on the other hand, states only that if $R>C$, then the probability of decoding the information correctly is bounded away from $1$, i.e. the error probability does not tend to zero, whatever encoding/decoding scheme is used.
Shannon Shannon48 first proposed the theorem for classical discrete memoryless channels and the first rigorous proof of the direct part of the theorem was provided by Feinstein Feinstein and the strong converse by Wolfowitz Wolfowitz .
However, it was observed that the strong converse, and therefore a strong capacity, does not always exist for other types of classical channels WolfowitzNoCap . See Ahlswede Ahlswede06 for a more complete discussion of converse results for various types of classical channels.
The strong converse to the channel coding theorem for memoryless classical-quantum channels with product state inputs was determined independently by Winter Winter99 and by Ogawa and Nagaoka ON99 . Their result implies that every memoryless discrete classical-quantum channel has a strong capacity which provides a sharp upper-bound on the rate at which classical information can be transmitted over this type of channel using product states.
Recent results include a proof by Bjelaković and Boche BB08 ; BB09 of a full coding theorem for the discrete memoryless compound classical-quantum channel. Wehner and König KW09 proved the fully general strong converse theorem for a family of channels, that is, they proved that the strong converse theorem holds for a family of quantum channels even in the case when entangled state inputs are allowed.
In this article we relax the assumption that the communication channel in question is memoryless and we concentrate on a particular quantum channel with memory, that is, a channel with correlations between successive channel uses. In our case the correlations between successive uses of the channel can be described by a Markov chain. Communication channels with memory are widely considered to be more realistic than memoryless channels since real-world channels may not exhibit independence between successive errors and correlations are common. Noise correlations are also necessary for certain models of quantum communication Bose03 . See for example Kretschmann and Werner KW05 and Mancini Mancini06 for models of quantum memory channels.
The article is organised as follows. In Section II we introduce notation and the necessary definitions, and define the quantum periodic channel. In Section III we prove that the periodic channel does not have a strong capacity. The observation relies on a result, proved in Appendix A, for a particular instance of a periodic channel; consequently the strong converse does not hold in general for the periodic channel. In Section IV we state and prove the main result, a scale of capacities for the periodic channel. In Section V we remark on a scale of capacities for the random channel.
Note that $\log$ is understood to be taken to the base $2$ throughout the article.

II Preliminaries

We begin by introducing some notation. A memoryless channel is given by a completely positive trace-preserving (CPT) map ${\Phi}:{\mathcal{B}}({\cal H})\to{\cal B}({\cal K})$ , where ${\cal B}({\cal H})$ and ${\cal B}({\cal K})$ denote the states on the input and output Hilbert spaces ${\cal H}$ and $\cal K$ , respectively.
Equivalently, we can describe a classical-quantum channel, here also denoted $\Phi$ , as a mapping from the classical message to the output state of the channel on $\mathcal{B}(\mathcal{K})$ as follows,

\Phi:\mathcal{X}\mapsto\mathcal{B}(\mathcal{K}),

(2.1)

where the message is first encoded into a sequence belonging to the set $\mathcal{X}^{n}$ , where $\cal X$ represents the input alphabet.
We can combine the two mapping descriptions as follows. We wish to send classical information in the form of quantum states over a quantum channel $\Phi$ . A (discrete) memoryless quantum channel, $\Phi$ , carrying classical information can be thought of as a map from a (finite) set, or alphabet, $\cal X$ into $\cal B(\cal K)$ , taking each $x\in\cal X$ to $\Phi_{x}=\Phi(\rho_{x})$ , where the input state to the channel is given by $\{\rho_{x}\}_{x\in\cal{X}}$ and each $\rho_{x}\in\cal B(\cal H)$ . Let $d=\dim(\cal H)$ and $a=|\cal X|$ .
For a probability distribution $P$ on the input alphabet $\mathcal{X}$ , the average output state of a channel $\Phi$ is given by

P\sigma=\sum_{x\in\mathcal{X}}P(x)\Phi(\rho_{x}).

(2.2)

The conditional von Neumann entropy of $\Phi$ given $P$ is defined by

S(\Phi|P)=\sum_{x\in\mathcal{X}}P(x)S(\Phi(\rho_{x})),

(2.3)

and the mutual information between the probability distribution $P$ and the channel $\Phi$ is defined as follows,

I(P;\Phi)=S(P\sigma)-S(\Phi|P).

(2.4)
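As an illustration of definitions (2.2)–(2.4), the following minimal sketch evaluates $I(P;\Phi)$ for a qubit output space. It assumes the output states $\Phi(\rho_{x})$ are supplied directly as $2\times 2$ density matrices; the function names (`h_bin`, `qubit_entropy`, `mutual_information`) are hypothetical and chosen only for this example.

```python
import math

def h_bin(p):
    """Binary entropy in bits, with the convention 0 log 0 = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def qubit_entropy(rho):
    """von Neumann entropy (bits) of a 2x2 density matrix [[r00, r01], [r10, r11]]."""
    (r00, r01), (r10, r11) = rho
    det = (r00 * r11 - r01 * r10).real
    # A unit-trace 2x2 Hermitian matrix has eigenvalues (1 +/- sqrt(1 - 4 det)) / 2.
    disc = math.sqrt(max(0.0, 1.0 - 4.0 * det))
    return h_bin(0.5 * (1.0 + disc))

def mutual_information(P, outputs):
    """I(P;Phi) = S(P sigma) - S(Phi|P), following (2.2)-(2.4).

    P       : list of probabilities P(x)
    outputs : list of 2x2 output states Phi(rho_x), in the same order
    """
    avg = [[sum(p * s[r][c] for p, s in zip(P, outputs)) for c in range(2)]
           for r in range(2)]
    conditional = sum(p * qubit_entropy(s) for p, s in zip(P, outputs))
    return qubit_entropy(avg) - conditional

# Noiseless example: two orthogonal pure outputs, used with equal probability.
noiseless = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]]
example_value = mutual_information([0.5, 0.5], noiseless)
```

For the noiseless example the average output is maximally mixed and each individual output is pure, so the sketch returns one bit.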

An $n$ -block code for a quantum channel $\Phi$ is a pair $(C^{n},E^{n})$ , where $C^{n}$ is a mapping from a finite set of messages $\cal M$ into $\mathcal{X}^{n}$ , i.e. a sequence $x^{n}\in\mathcal{X}^{n}$ of length $n$ is assigned to each of the $|\cal M|$ messages, and $E^{n}$ is a POVM, i.e. a quantum measurement, on the output space $\mathcal{K}^{\otimes n}$ of the channel $\Phi_{x^{n}}^{n}$ . The maximum error probability of the code $(C^{n},E^{n})$ is defined as

p_{e}(C^{n},E^{n})={\rm max}\,\{1-\mathop{\rm Tr}(\Phi_{C^{n}(m)}^{n}E^{n}_{m}):m\in\cal M\}.

(2.5)

The code $(C^{n},E^{n})$ is called an $(n,\lambda)$ -code, if $p_{e}(C^{n},E^{n})\leq\lambda$ . The maximum size $|\cal M|$ of an $(n,\lambda)$ -code is denoted $N(n,\lambda)$ . Consider a finite alphabet $\cal X$ and sequences $x^{n}=x_{1},\dots,x_{n}\in\mathcal{X}^{n}$ and let

N(x\big{|}x^{n})=\big{|}\{i\in\{1,\dots,n\}:x_{i}=x\}\big{|}

(2.6)

for $x\in\mathcal{X}$ . The type of the sequence $x^{n}$ is given by the empirical distribution $P_{x^{n}}$ on $\cal X$ such that

P_{x^{n}}(x)=\frac{N(x\big{|}x^{n})}{n}.

(2.7)

Clearly, the number of types is upper bounded by $(n+1)^{a}$ , where $a=\big{|}\cal X\big{|}$ .
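The type counting above can be checked by brute force for small parameters. The sketch below enumerates all sequences of length $n$ over a three-letter alphabet, collects their types via the counts $N(x|x^{n})$ of (2.6), and compares the number of distinct types with the bound $(n+1)^{a}$. The alphabet and the value of $n$ are arbitrary choices made for illustration.

```python
from collections import Counter
from itertools import product

def sequence_type(seq, alphabet):
    """Counts N(x|x^n) for each letter, as in (2.6); dividing by n gives (2.7)."""
    counts = Counter(seq)
    return tuple(counts[x] for x in alphabet)

def all_types(alphabet, n):
    """Set of distinct types realised by sequences of length n."""
    return {sequence_type(seq, alphabet) for seq in product(alphabet, repeat=n)}

alphabet = ("a", "b", "c")   # illustrative alphabet, a = 3
n = 4
types = all_types(alphabet, n)
bound = (n + 1) ** len(alphabet)
```

Here the exact number of types is the number of ways of writing $n$ as an ordered sum of $a$ non-negative integers, which is well below the polynomial bound used in the text.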

II.1 Coding theorem and strong converse

The strong capacity of a particular channel provides a sharp threshold on the rate at which information may be transmitted over that channel with exponentially decreasing probability of decoding error in the number of channel uses. In order to establish a strong capacity for a particular channel one must prove both existence of a capacity achieving code and the strong converse.
The direct part of the coding theorem for memoryless quantum channels with product-state inputs was determined independently by Holevo Hol98 and Schumacher and Westmoreland SW97 . Winter Winter99 and Ogawa and Nagaoka ON99 independently proved the strong converse for memoryless quantum channels.
In Section III we require a version of the strong converse theorem proved by Winter Winter99 which holds for a single codeword type. We therefore provide this version (Lemma II.1) below, following both the direct part and strong converse theorems for memoryless classical quantum channels as stated and proved in Winter99 .

Theorem II.1

(Direct part)
For all $\lambda\in(0,1)$ and $\delta>0$ there exists $n_{0}(\lambda,\delta)\in\mathbb{N}$ such that for all $n\geq n_{0}$ and every classical quantum channel $\Phi$ and probability distribution $P$ on $\cal{X}$ , there exists an $(n,\lambda)$ -code such that the number of messages satisfies

|\mathcal{M}_{n}|\geq 2^{n(\chi^{*}(\Phi)-\delta)},

(2.8)

where the Holevo capacity $\chi^{*}$ is given by

\chi^{*}(\Phi)=\sup_{P}I(P;\Phi)

(2.9)

the supremum being over all probability distributions $P$ on $\cal X$ .

Theorem II.2

(Strong converse)
For all $\lambda\in(0,1)$ and all $\delta>0$ there exists $n_{1}(\lambda,\delta)$ such that for all $n\geq n_{1}$ and every memoryless classical quantum channel $\Phi$ , the number of messages of an $(n,\lambda)$ -code is bounded by

|\mathcal{M}_{n}|\leq 2^{n(\chi^{*}(\Phi)+\delta)}.

(2.10)

Remark. Winter in fact proved a stronger version of these theorems in which $\delta$ is replaced by a constant times $1/\sqrt{n}$ .

In the following we follow the approach of Winter (Winter99 , Theorem 13) in which the strong converse is derived from a bound on the number of codewords of a given type $P$ :

Lemma II.1

(Single-type strong converse)
For $\lambda\in(0,1)$ and $\delta>0$ there exists $n_{1}(\lambda,\delta)$ such that for $n\geq n_{1}$ , every $(n,\lambda)$ -code for which all codewords are of the same type $P$ satisfies

|\mathcal{M}_{n,P}|\leq 2^{n(I(P;\,\Phi)+\delta)}.

(2.11)

The strong converse follows immediately from this lemma using the fact that the number of types is upper bounded by $(1+n)^{a}$ (see CK11 Lemma 2.2).

Remark. In contrast to the strong converse where the decoding error goes to $1$ exponentially in the number of channel applications if $R>C$ , the weak converse states that if $R>C$ , then the probability of decoding the information correctly is bounded away from $1$ .

II.2 Quantum channels with classical memory

Next, we provide definitions needed to describe quantum channels with classical memory Norris . Let $I$ denote a countable set and let $\lambda_{i}=\mathbb{P}(X=i)$ , where $X$ is a random variable taking values in the state space $I$ . Let $Q$ denote a transition matrix, with entries labeled $q_{j|i}$ . A discrete time random process denoted $X_{n}$ can be considered to be a Markov chain with transition matrix $Q$ and initial distribution $\lambda$ , if and only if the following holds for $i_{0},\dots,i_{n-1}\in I$ ,

\mathbb{P}(X_{0}=i_{0},X_{1}=i_{1},\dots,X_{n-1}=i_{n-1})=\lambda_{i_{0}}q_{i_{1}|i_{0}}q_{i_{2}|i_{1}}\cdots q_{i_{n-1}|i_{n-2}}.

(2.12)

In DD09Markov Datta and Dorlas analyse a quantum channel of length $n$ with Markovian noise correlations, first defined by Bowen and Mancini BM04 , as follows

\Phi^{n}(\rho^{n})=\sum_{i_{0}\dots i_{n-1}}q_{i_{n-1}|i_{n-2}}\dots q_{i_{1}|i_{0}}\lambda_{i_{0}}(\Phi_{i_{0}}\otimes\cdots\otimes\Phi_{i_{n-1}})(\rho^{n})

(2.13)

where $q_{j|i}$ are the elements of the transition matrix of a discrete-time Markov chain, and $\{\lambda_{i}\}$ represents an invariant distribution on the Markov chain.
In Section III we analyse a particular channel with classical memory, namely the periodic channel. We describe this channel below.
A periodic channel acting on an $n$ -fold input state can be described as follows

\Phi^{n}\left(\rho^{n}\right)=\frac{1}{L}\sum_{i=0}^{L-1}\left(\Phi_{i}\otimes\Phi_{i+1}\otimes\cdots\otimes\Phi_{i+n-1}\right)\left(\rho^{n}\right),

(2.14)

where $\Phi_{i}$ are CPT maps acting on the same Hilbert space and the index is cyclic, modulo the period $L$ , i.e. $\Phi_{i+L}=\Phi_{i}$ . In this case the elements of the corresponding transition matrix are given by $q_{j|i}=\theta_{i,j}$ , where

\theta_{i,j}=\begin{cases}1,&\text{if $j=i+1\mod L$}\\ 0,&\text{otherwise.}\end{cases}

(2.15)
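To see how the Markov description reduces to the periodic channel, the following sketch evaluates the path probability (2.12) for the transition matrix (2.15) with the uniform initial distribution, and lists the index sequences of non-zero probability. The helper names and the values $L=3$, $n=5$ are assumptions made for illustration only.

```python
from itertools import product

def path_probability(path, lam, Q):
    """Probability of the path i_0, ..., i_{n-1} as in (2.12); Q[i][j] = q_{j|i}."""
    p = lam[path[0]]
    for i, j in zip(path, path[1:]):
        p *= Q[i][j]
    return p

def periodic_chain(L):
    """Uniform initial distribution and the cyclic transition matrix of (2.15)."""
    lam = [1.0 / L] * L
    Q = [[1.0 if j == (i + 1) % L else 0.0 for j in range(L)] for i in range(L)]
    return lam, Q

def branches(L, n):
    """All index sequences of length n with non-zero probability."""
    lam, Q = periodic_chain(L)
    support = {}
    for path in product(range(L), repeat=n):
        p = path_probability(path, lam, Q)
        if p > 0.0:
            support[path] = p
    return support

support = branches(3, 5)
```

Only the $L$ cyclic sequences $i, i+1, \dots, i+n-1$ (modulo $L$) survive, each with weight $1/L$, which are exactly the terms of the sum (2.14).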

The product-state capacity of the channel, denoted $C_{p}$ , is given by

C_{p}\left(\Phi\right)=\frac{1}{L}\sup_{P}\sum_{i=0}^{L-1}I(P;\Phi_{i}).

(2.16)

The proof of the direct part of the channel coding theorem for the periodic quantum channel is provided in Appendix B of CMThesis . This is in fact a special case of the main result proved by Datta and Dorlas in DD09Markov . Note that the proof of the direct part of the coding theorem for this channel makes use of a preamble to the code, which the receiver uses upon receipt to determine which branch of the channel was selected.

Another channel of the general type (2.13) is the random channel. It is given by

\Phi^{n}\left(\rho^{n}\right)=\sum_{i=1}^{M}q_{i}\,\Phi_{i}^{\otimes n}\left(\rho^{n}\right),

(2.17)

where $\Phi_{i}$ ( $i=1,\dots,M$ ) are CPT maps acting on the same Hilbert space and $q_{1},\dots,q_{M}$ is a probability distribution. In this case the elements of the corresponding transition matrix are given by $q_{j|i}=\delta_{ij}$ . It was shown in DD07CC that the product state capacity of this channel is given by

C_{p}\left(\Phi\right)=\sup_{P}\min_{i=1}^{M}I(P;\Phi_{i}).

(2.18)

We will remark on this channel, which like the periodic channel has long-term memory, in Section V.

III Channel without a strong converse

The strong converse for the periodic quantum channel does not hold in general, because the following strict inequality can occur:

C_{p}<\overline{C_{p}},

(3.19)

where,

\overline{C_{p}}=\frac{1}{L}\sum_{i=0}^{L-1}\sup_{P}I(P;\Phi_{i}).

(3.20)

The strict inequality above can be shown explicitly for a periodic channel consisting of two branches of qubit amplitude-damping channels (see Appendix A below for a detailed proof). On the other hand, equality in (3.19) can be shown to hold for a periodic channel with depolarising channel branches DM09 .
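The gap between $C_{p}$ and $\overline{C_{p}}$ can be illustrated numerically. The sketch below is restricted, as an assumption, to the one-parameter family of ensembles used in Appendix A (two mirror-image pure states with equal probability and state parameter $a$), for which the average output state is diagonal and the output eigenvalues are given by (1.43). The damping parameters $\gamma_{0}=0$, $\gamma_{1}=0.5$ and the grid size are illustrative choices, and a grid search only approximates the suprema.

```python
import math

def h_bin(p):
    """Binary entropy in bits, with the convention 0 log 0 = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def chi_amp(a, gamma):
    """Holevo quantity (bits) of the amplitude-damping channel for the
    mirror-image ensemble with state parameter a."""
    x = math.sqrt(max(0.0, 1.0 - 4.0 * gamma * (1.0 - gamma) * (1.0 - a) ** 2))
    average_entropy = h_bin((1.0 - a) * (1.0 - gamma))   # diagonal average state
    output_entropy = h_bin(0.5 * (1.0 + x))              # eigenvalues from (1.43)
    return average_entropy - output_entropy

def compare(gammas, grid_size=2001):
    """Return (joint, separate): the grid versions of (2.16) and (3.20)."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    L = len(gammas)
    joint = max(sum(chi_amp(a, g) for g in gammas) for a in grid) / L
    separate = sum(max(chi_amp(a, g) for a in grid) for g in gammas) / L
    return joint, separate

c_p, c_bar = compare([0.0, 0.5])
```

With these parameters the joint optimum is attained at a single compromise value of $a$, whereas the separate optima use a different $a$ for each branch, so the first returned value is strictly smaller than the second.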

Let us now investigate whether we can prove a full coding theorem for rates $R$ such that

C_{p}<R<\overline{C_{p}}.

(3.21)

We first define the average probability of error as follows

\overline{p_{e}}=\frac{1}{L}\sum_{i=0}^{L-1}p_{e}^{i}\leq\lambda,

(3.22)

where $p_{e}^{i}$ denotes the probability of error for the $i$ -th channel branch.

Our coding strategy is as follows. We choose a code, i.e. a particular encoding and decoding scheme, suitable for a particular channel branch labeled by the index $i\in\{0,\dots,L-1\}$ . Here a ‘branch’ is defined as one term in the sum (2.14), i.e. $\Phi_{i}^{(n)}=\Phi_{i}\otimes\Phi_{i+1}\otimes\dots\otimes\Phi_{i+n-1}$ . According to the coding theorem for memoryless channels, there is a code with error probability tending to zero for this branch with rate $R$ . Indeed, for each $j$ there exists a probability distribution $P_{j}$ of states optimising $\chi_{j}^{*}=\sup_{P}I(P;\Phi_{j})$ and we can choose states from a typical subspace for these distributions, which can be interlaced at the positions $j-i+kL$ , where $k\in[\frac{n}{L}-1]$ . The probability that the channel selects the branch for which the code was designed is $\frac{1}{L}$ , and therefore the probability of error approaches

p_{e}=\frac{L-1}{L}<1.

(3.23)

We thus have a $\lambda$ -code for all $\lambda>1-\frac{1}{L}$ . In particular, the error probability is bounded away from 1, and the strong converse does not hold.

On the other hand the strong converse does hold for $R>\overline{C_{p}}$ . Indeed, the codewords can be decomposed into sub-codewords corresponding to the different stages of a period: $x^{n}=(x_{0}^{n},\dots,x_{L-1}^{n})$ , where the components of the $x_{i}^{n}$ are understood to be interlaced in $x^{n}$ . We distinguish types $P_{0},\dots,P_{L-1}$ for the sub-codewords. Then we have an analogue of the single-type strong converse given by Lemma II.1:

Lemma III.1

For $\lambda\in(0,1)$ and $\delta>0$ there exists $n_{1}(\lambda,\delta)$ such that for $n\geq n_{1}$ , every $(n,\lambda)$ -code for which all sub-codewords are of the same types $P_{0},\dots,P_{L-1}$ satisfies, given that the $i$ -th branch is selected,

|\mathcal{M}_{n,P_{0},\dots,P_{L-1}}|\leq 2^{\frac{n}{L}\sum_{k=0}^{L-1}(I(P_{k};\,\Phi_{i+k})+\delta)}.

(3.24)

Clearly, for the complete channel, it follows that the number of codewords such that the sub-codewords are of types $P_{0},\dots,P_{L-1}$ , satisfies

|{\cal M}_{n,P_{0},\dots,P_{L-1}}|\leq 2^{\frac{n}{L}\sum_{k=0}^{L-1}(\sup_{P}I(P;\Phi_{i+k})+\delta)}.

(3.25)

Summing over the types, we obtain the strong converse.

We can conclude that the strong converse holds for rates $R>\overline{C_{p}}$ .

IV A scale of capacities

The above obviously raises the question if smaller error probabilities can be attained for smaller rates, but still above $C_{p}$ . For this, we define a ‘pair capacity’ $C_{p}^{\,(2)}$ as follows:

C_{p}^{\,(2)}=\frac{1}{2L}\max_{0\leq i_{1}<i_{2}<L}\sum_{k=0}^{L-1}\sup_{P}\left(I(P;\Phi_{i_{1}+k})+I(P;\Phi_{i_{2}+k})\right).

(4.26)

Suppose the maximum is attained at a certain pair $(i_{1},i_{2})$ . With probability $2/L$ , one of the two branches $i_{1}$ or $i_{2}$ is chosen. We attach a preamble to the code as in the proof of the product-state capacity of the periodic channel (2.16). If, for example, the branch $i_{1}$ is selected by the channel, the receiver can determine that this is the case by measuring the preamble, and can then choose states for each value of $k$ from the typical space corresponding to the maximising distribution $P_{k}$ for the CPT map $\Phi_{i_{1}+k}$ . This constitutes an encoding with rate given by the average of the mutual informations $I(P;\Phi_{i_{1}+k})$ for $k=0,\dots,L-1$ , which is greater than or equal to the pair capacity $C_{p}^{\,(2)}$ given by Equation (4.26). We have thus constructed a $\lambda$ -code for $\lambda>1-\frac{2}{L}$ .

On the other hand, let $R>C^{\,(2)}_{p}$ and suppose that $(C^{n},E^{n})$ is a sequence of $(n,\lambda)$ -codes with $\lambda<1-\frac{1}{L}$ , and assume that

\frac{1}{n}\log|{\cal M}_{n}|\geq R>C_{p}^{\,(2)}.

(4.27)

First note that we may assume that the number of codewords with sub-codewords of types $P_{0},\dots,P_{L-1}$ is bounded by

\frac{1}{n}\log|{\cal M}_{n,P_{0},\dots,P_{L-1}}|\leq\frac{1}{L}\sum_{k=0}^{L-1}(I(P_{k},\Phi_{i+k})+\delta)

(4.28)

for some fixed $i\in\{0,\dots,L-1\}$ . Indeed, otherwise, by Lemma III.1, $p^{i}_{e}>\lambda$ for all $i$ and hence $p_{e}>\lambda$ .

We now claim that for every other $j\neq i$ , and $\epsilon>0$ small enough,

\frac{1}{n}\log|{\cal M}_{n,P_{0},\dots,P_{L-1}}|>\frac{1}{L}\sum_{k=0}^{L-1}(I(P_{k},\Phi_{j+k})+\epsilon).

(4.29)

If this were not the case then the pair capacity for a single type $P_{k}$ can be written as

\frac{1}{n}\log|{\cal M}_{n,P_{0},\dots,P_{L-1}}|\leq\frac{1}{2L}\sum_{k=0}^{L-1}\left(I(P_{k},\Phi_{i+k})+I(P_{k},\Phi_{j+k})+\delta\right)

(4.30)

and hence

\frac{1}{n}\log|{\cal M}_{n,P_{0},\dots,P_{L-1}}|\leq\frac{1}{2L}\sum_{k=0}^{L-1}\left(\sup_{P}\{I(P,\Phi_{i+k})+I(P,\Phi_{j+k})\}+\delta\right).

(4.31)

Summing over the types $P_{0},\dots,P_{L-1}$ leads to a contradiction with (4.27).

Now, expression (4.29) implies with Lemma III.1 that, if the $j$ -th branch is selected by the channel, then the error probability $p^{j}_{e}>1-\eta$ for any $\eta>0$ . Since with probability $1-\frac{1}{L}$ one of the branches $j$ other than $i$ is selected, we conclude that the error probability $p_{e}>\left(1-\frac{1}{L}\right)(1-\eta)>\lambda$ if $\eta<1-\frac{1}{L}-\lambda$ is small enough.

It is now clear that this argument can be generalised to prove:

Theorem IV.1

Define, for $r=1,\dots,L$ a scale of capacities $C_{p}^{(r)}$ by

C_{p}^{(r)}=\frac{1}{rL}\max_{0\leq i_{1}<\dots<i_{r}<L}\sum_{k=0}^{L-1}\sup_{P}\sum_{m=1}^{r}I(P;\Phi_{i_{m}+k}).

(4.32)

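(Note that, by (4.32), $C_{p}^{(1)}=\overline{C_{p}}$ and $C_{p}^{(L)}=C_{p}$ : for $r=1$ the supremum is taken separately for each branch, while for $r=L$ the inner sum runs over all branches and no longer depends on $k$ .) Then, if $\lambda>1-\frac{r}{L}$ and $R<C_{p}^{(r)}$ , there exists a sequence of $(n,\lambda)$ -codes with rate $R$ . Conversely, if $\lambda<1-\frac{r-1}{L}$ , there exists no sequence of $(n,\lambda)$ -codes with rate $R>C_{p}^{(r)}$ .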
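The definition (4.32) can be evaluated directly once the supremum over $P$ is replaced by a maximum over a finite list of candidate distributions. The sketch below makes that simplifying assumption: the table `I[i][p]` holds hypothetical values of $I(P_{p};\Phi_{i})$ for branch $i$ and candidate $p$, and the numbers are invented purely to exercise the formula.

```python
from itertools import combinations

def scale_capacity(I, r):
    """Evaluate (4.32) with sup over P replaced by a max over candidate columns.

    I[i][p] : mutual information of branch i under candidate distribution p
    r       : number of branches the code is required to serve, 1 <= r <= L
    """
    L = len(I)
    n_candidates = len(I[0])
    best = float("-inf")
    for subset in combinations(range(L), r):
        total = 0.0
        for k in range(L):
            # Best single candidate for the r channels seen at position k.
            total += max(sum(I[(i + k) % L][p] for i in subset)
                         for p in range(n_candidates))
        best = max(best, total / (r * L))
    return best

# Hypothetical table for L = 3 branches and three candidate distributions.
I_table = [[1.0, 0.2, 0.1],
           [0.3, 0.9, 0.2],
           [0.1, 0.3, 0.8]]
scale = [scale_capacity(I_table, r) for r in (1, 2, 3)]
```

On this table the function returns the value of (3.20) for $r=1$ and the value of (2.16) for $r=L$, with the intermediate value in between, so the scale decreases as more branches must be served by a single code.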

V The random channel

The situation for the random channel is similar, but more complicated due to the fact that different branches can have different probabilities $q_{i}$ . We can in general distinguish break points at values of the error probability given by

q(\Delta)=\sum_{i\in\Delta}q_{i},\qquad\Delta\subset\{1,\dots,M\}.

(5.33)

We have an analogue of the detailed theorem for periodic channels above:

Theorem V.1

Define, for $\Delta\subset\{1,\dots,M\}$ a scale of capacities $C_{p}^{\Delta}$ by

C_{p}^{\Delta}=\sup_{P}\min_{i\in\Delta}I(P;\Phi_{i}).

(5.34)

Then, if $\lambda>1-q(\Delta)$ and $R<C_{p}^{\Delta}$ , there exists a sequence of $(n,\lambda)$ -codes with rate $R$ .

For the converse to the theorem, we introduce another scale as follows:

\overline{C_{p}^{\Delta}}=\sup_{P}\max_{i\in\Delta}I(P;\Phi_{i}).

(5.35)

Then, if $\lambda<1-q(\Delta)$ there exist no $(n,\lambda)$ -codes with rate $R>\overline{C_{p}^{\Delta}}$ .

The situation is less clear-cut than it seems, however. In fact, not every $q(\Delta)$ is necessarily a point of discontinuity for the capacity, because $C_{p}^{\Delta}$ is in general not monotonic in the probabilities $q(\Delta)$ !
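The lack of monotonicity can be seen in a small example. The sketch below lists, for every non-empty subset $\Delta$, the weight $q(\Delta)$ of (5.33) and the capacity $C_{p}^{\Delta}$ of (5.34), again with the supremum over $P$ replaced by a maximum over a finite list of candidates. The branch probabilities and the table of mutual informations are hypothetical values, and branches are indexed from $0$ rather than $1$.

```python
from itertools import combinations

def random_channel_scale(q, I):
    """List (Delta, q(Delta), C_p^Delta) for every non-empty subset Delta.

    q[i]    : probability of branch i
    I[i][p] : mutual information of branch i under candidate distribution p
    """
    M = len(q)
    n_candidates = len(I[0])
    rows = []
    for size in range(1, M + 1):
        for delta in combinations(range(M), size):
            weight = sum(q[i] for i in delta)
            capacity = max(min(I[i][p] for i in delta)
                           for p in range(n_candidates))
            rows.append((delta, weight, capacity))
    return rows

q_example = [0.5, 0.3, 0.2]                      # hypothetical branch probabilities
I_example = [[1.0, 0.2], [0.3, 0.9], [0.6, 0.6]]  # hypothetical mutual informations
table = {delta: (w, c) for delta, w, c in random_channel_scale(q_example, I_example)}
```

In this example the subset $\{1\}$ has larger weight than $\{2\}$ and also larger capacity, while $\{0,1\}$ has larger weight than $\{0,2\}$ but smaller capacity, so the capacity is not a monotonic function of $q(\Delta)$.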

VI Discussion

One of the most surprising and interesting results which has emerged from Shannon Theory is the observation that the strong information-carrying capacity of a memoryless channel is independent of the upper bound on the maximum error probability of that channel, usually denoted $\lambda$ . The independence of the parameter $\lambda$ is crucial to the existence of a so-called strong capacity for the channel Wolfowitz .
The dependency of some channel capacities on this parameter $\lambda$ , including those of non-stationary discrete memoryless classical channels, led to the definition of a capacity function Ahlswede06 . More recently, Ahlswede Ahlswede10 showed that capacity functions can be thought of as so-called capacity sequences.
For the case of the quantum periodic channel, and also the random channel, we have shown that an analogous parameter-dependent capacity can be defined, which takes the form of a scale of capacities applicable for various ranges of the error parameter.

Note. It appears that similar results to ours were obtained by Datta, Hsieh and Brandão DHB11 , using different methods.

Appendix A The periodic channel with amplitude-damping channel branches

The qubit amplitude-damping channel acting on the state $\rho=\left(\begin{array}[]{cc}a&b\\ \overline{b}&1-a\end{array}\right)$ is given by

\Phi_{amp}(\rho)=\left(\begin{array}[]{cc}a+(1-a)\gamma&b\sqrt{1-\gamma}\\ \overline{b}\sqrt{1-\gamma}&(1-a)(1-\gamma)\\ \end{array}\right).

(1.36)

The expression for the product-state capacity of the qubit amplitude-damping channel is given as follows,

\chi\left(\Phi_{amp}(\{p_{j},\rho_{j}\})\right)=S\left[\sum_{j}\left(\begin{array}{cc}p_{j}\left(a_{j}+(1-a_{j})\gamma\right)&p_{j}b_{j}\sqrt{1-\gamma}\\ p_{j}{\overline{b}}_{j}\sqrt{1-\gamma}&p_{j}(1-a_{j})(1-\gamma)\\ \end{array}\right)\right]-\sum_{j}p_{j}\,S\left(\begin{array}{cc}a_{j}+(1-a_{j})\gamma&b_{j}\sqrt{1-\gamma}\\ {\overline{b}}_{j}\sqrt{1-\gamma}&(1-a_{j})(1-\gamma)\\ \end{array}\right).

(1.41)

We now investigate whether the following equation holds for a periodic channel with two amplitude-damping channel branches

\frac{1}{2}\sup_{\{p_{j},\rho_{j}\}}\sum_{i=0}^{1}\chi_{i}(\{p_{j},\rho_{j}\})=\frac{1}{2}\sum_{i=0}^{1}\sup_{\{p_{j},\rho_{j}\}}\chi_{i}(\{p_{j},\rho_{j}\}).

(1.42)

Let $\gamma_{0}$ and $\gamma_{1}$ represent the error parameters for two amplitude-damping channels $\Phi_{0}$ and $\Phi_{1}$ respectively. We have argued DM08 that the Holevo quantity for the qubit amplitude-damping channel can be increased using an ensemble containing two mirror image pure states each with probability $\frac{1}{2}$ . Using this minimal ensemble we investigate both sides of Equation (1.42), for a periodic channel with two qubit amplitude-damping channel branches.
Clearly the left hand side of Equation (1.42) will be attained for a single parameter which we denote by $a_{max}$ . However, the right hand side of Equation (1.42) cannot in general be attained by a single $a_{max}$ . Instead, the supremum for each channel will be attained at a different value of the input state parameter $a$ . We denote by $a_{max_{0}}$ and $a_{max_{1}}$ the state parameters that achieve the product-state capacity for the channels $\Phi_{0}$ and $\Phi_{1}$ respectively. Let $\chi_{0}(a)$ and $\chi_{1}(a)$ denote the Holevo quantities of the channels $\Phi_{0}$ and $\Phi_{1}$ , respectively. Denoting $x_{i}=\sqrt{1-4\gamma_{i}\left(1-\gamma_{i}\right)(1-a)^{2}}$ , the eigenvalues of the output state for each of the amplitude-damping channels can be written as

\lambda_{amp_{i}\pm}=\frac{1}{2}\left(1\pm\sqrt{1-4\gamma_{i}(1-\gamma_{i})(1-a)^{2}}\right).

(1.43)

The values for $a_{max_{0}}$ and $a_{max_{1}}$ can be determined by separately solving the following equation for each channel

\frac{d\chi_{i}(a)}{da}=(1-\gamma_{i})\ln\left(\frac{(1-a)(1-\gamma_{i})}{a+(1-a)\gamma_{i}}\right)+\frac{2\gamma_{i}(1-\gamma_{i})(1-a)}{x_{i}}\ln\left(\frac{1+x_{i}}{1-x_{i}}\right)=0.

(1.44)
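The stationarity condition (1.44) has no closed-form solution, but it is easy to solve numerically. The sketch below first checks the eigenvalue formula (1.43) against a direct computation for a pure input state with $b=\sqrt{a(1-a)}$ (an assumption matching the mirror-image ensemble), and then locates the root of (1.44) by bisection. It is valid for $0<\gamma<1$ only; the bracket $[0.5,\,0.999]$, the tolerance and the value $\gamma=0.5$ are illustrative choices.

```python
import math

def x_of(a, gamma):
    """The quantity x_i appearing in (1.43) and (1.44)."""
    return math.sqrt(1.0 - 4.0 * gamma * (1.0 - gamma) * (1.0 - a) ** 2)

def eig_plus_direct(a, gamma):
    """Larger eigenvalue of Phi_amp(rho) for a pure input with b = sqrt(a(1-a))."""
    p = a + (1.0 - a) * gamma
    q = (1.0 - a) * (1.0 - gamma)
    off_sq = a * (1.0 - a) * (1.0 - gamma)        # |b|^2 (1 - gamma)
    return 0.5 * (1.0 + math.sqrt((p - q) ** 2 + 4.0 * off_sq))

def h_nat(p):
    """Binary entropy in nats, matching the natural logarithms of (1.44)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)

def chi_nats(a, gamma):
    """Holevo quantity (nats) for the mirror-image ensemble with parameter a."""
    return h_nat((1.0 - a) * (1.0 - gamma)) - h_nat(0.5 * (1.0 + x_of(a, gamma)))

def dchi_da(a, gamma):
    """Right-hand side of (1.44), for 0 < gamma < 1 and 0 < a < 1."""
    x = x_of(a, gamma)
    first = (1.0 - gamma) * math.log((1.0 - a) * (1.0 - gamma) / (a + (1.0 - a) * gamma))
    second = 2.0 * gamma * (1.0 - gamma) * (1.0 - a) / x * math.log((1.0 + x) / (1.0 - x))
    return first + second

def solve_a_max(gamma, lo=0.5, hi=0.999, tol=1e-12):
    """Bisection for the root of (1.44) on [lo, hi]."""
    if dchi_da(lo, gamma) <= 0.0 or dchi_da(hi, gamma) >= 0.0:
        raise ValueError("derivative does not change sign on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if dchi_da(mid, gamma) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

a_star = solve_a_max(0.5)
```

For $\gamma=0.5$ the root lies strictly above $\frac{1}{2}$, consistent with the statement that the maximising state parameter of the amplitude-damping channel satisfies $a\geq\frac{1}{2}$.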

Let $\chi^{*}_{avg}(\gamma_{0},\gamma_{1},a_{max_{0}},a_{max_{1}})$ denote the average of the supremum of the Holevo capacities of the channels $\Phi_{0}$ and $\Phi_{1}$ , i.e.,

\chi^{*}_{avg}(\gamma_{0},\gamma_{1},a_{max_{0}},a_{max_{1}})=\frac{1}{2}\left(\chi_{0}^{*}(a_{max_{0}})+\chi_{1}^{*}(a_{max_{1}})\right).

(1.45)

It is not difficult to show that

\chi^{*}(\gamma_{0}=1,\gamma_{1},a_{max})=\chi^{*}_{avg}(\gamma_{0}=1,\gamma_{1},a_{max_{0}},a_{max_{1}}).

(1.46)

Similarly, we can show that

\chi^{*}(\gamma_{0},\gamma_{1}=1,a_{max})=\chi^{*}_{avg}(\gamma_{0},\gamma_{1}=1,a_{max_{0}},a_{max_{1}}).

Next, we show separately for a) $\gamma_{i}=0$ and for b) $0<\gamma_{i}<1$ that the following inequality holds

\chi^{*}(\gamma_{0},\gamma_{1},a_{max})<\chi^{*}_{avg}(\gamma_{0},\gamma_{1},a_{max_{0}},a_{max_{1}}).

(1.47)

Taking $\gamma_{0}=0$ , the expression $\chi^{*}(\gamma_{0},\gamma_{1},a_{max})$ becomes

\chi^{*}(\gamma_{0}=0,\gamma_{1},a_{max})=H_{bin}(a_{max_{1}})+H_{bin}((1-a_{max_{1}})(1-\gamma_{1}))-S\left(\Phi_{1}\left(\rho_{a_{max}}\right)\right).

(1.48)

Denoting $\chi^{*}_{avg}(\gamma_{0},\gamma_{1},a_{max_{0}},a_{max_{1}})$ by $\chi^{*}_{avg}\left(\gamma_{1}\right)$ the right hand side becomes

\chi^{*}_{avg}\left(\gamma_{1},a_{max_{1}}\right)=H_{bin}(a_{max_{1}})+H_{bin}((1-a_{max_{1}})(1-\gamma_{1}))-S\left(\Phi_{1}\left(\rho_{a_{max_{1}}}\right)\right).

(1.49)

Clearly, $a_{max_{0}}=\frac{1}{2}$ . To show that $a_{max}<a_{max_{1}}$ , we must show that $\frac{d}{da}\sum_{i}\chi_{i}(a)<0$ at $a=a_{max_{1}}$ .

For $\gamma_{0}=0$ the Holevo quantity of the channel $\Phi_{0}$ becomes

\chi_{0}(a)=S\left(\begin{array}[]{cc}a&0\\ 0&(1-a)\\ \end{array}\right)-S(\rho).

(1.50)

But $\rho$ is a pure state and therefore $S(\rho)=0$ . Therefore, from Equation (1.44),

\frac{d\chi_{0}(a)}{da}=\ln\left(\frac{(1-a)}{a}\right).

(1.51)

We have previously shown that the maximising state parameter for the amplitude-damping channel is achieved at $a\geq\frac{1}{2}$ DM08 . We are considering the case where $\gamma_{0}\neq\gamma_{1}$ , i.e. $\gamma_{1}\neq 0$ , therefore $a_{max_{1}}>\frac{1}{2}$ . The expression $\chi_{0}(a)$ now represents the binary entropy, $H(a)$ , and is therefore maximised at $a=\frac{1}{2}$ . It was shown above that the entropy $S(a)$ is a strictly concave function for $\gamma_{0}=0$ and $\chi_{0}(a)$ is therefore decreasing at $a=a_{max_{1}}$ .

The capacity $\chi_{1}^{*}(a)$ is achieved at $a=a_{max_{1}}$ . Therefore $\frac{d\chi_{1}(a)}{da}$ is equal to zero at this point.

We can now conclude that $\frac{d}{da}\sum_{i}\chi_{i}(a)<0$ when $a=a_{max_{1}}$ and therefore

\displaystyle\chi^{*}(\gamma_{0}=0,\gamma_{1},a_{max})<\chi^{*}_{avg}(\gamma_{0}=0,\gamma_{1},a_{max_{0}},a_{max_{1}}).

b)

We now show that an inequality exists between the expressions $\chi^{*}(\gamma_{0},\gamma_{1},a_{max})$ and $\chi^{*}_{avg}(\gamma_{0},\gamma_{1},a_{max_{0}},a_{max_{1}})$ for fixed $\gamma_{0}$ , such that $0<\gamma_{0}<1$ .

In DM08 we proved that if $\gamma_{0}<\gamma_{1}$ , then $\chi(\gamma_{0})>\chi(\gamma_{1})$ and therefore $a_{max_{0}}<a_{max_{1}}$ . Therefore, $\frac{d\chi_{0}(a)}{da}<0$ at $a=a_{max_{1}}$ and $a_{max}<a_{max_{1}}$ . Similarly, if $\gamma_{0}>\gamma_{1}$ , then $a_{max_{0}}>a_{max_{1}}$ and $\frac{d\chi_{0}(a)}{da}>0$ at $a=a_{max_{1}}$ and $a_{max}>a_{max_{1}}$ .

As a result, $a_{max}$ will always lie in between $a_{max_{0}}$ and $a_{max_{1}}$ . We have previously shown DM08 that the Holevo quantity for the qubit amplitude-damping channel is concave in its single state parameter. Therefore $a_{max}>\tilde{a}$ , where $\tilde{a}$ is the parameter value associated with $\chi^{*}_{avg}(\gamma_{0},\gamma_{1},a_{max_{0}},a_{max_{1}})$ , i.e. $\sum_{i}\sup_{a}\chi_{i}(a)=\chi^{*}_{\gamma_{0},\gamma_{1}}(\tilde{a})$ . This proves that $\chi^{*}(\gamma_{0},\gamma_{1},a_{max})<\chi^{*}_{avg}(\gamma_{0},\gamma_{1},a_{max_{0}},a_{max_{1}})$ .

In conclusion, if $\gamma_{0}=1$ or $\gamma_{1}=1$ , the corresponding branch has zero Holevo quantity, so that $a_{max}=a_{max_{1}}$ or $a_{max}=a_{max_{0}}$ respectively, and $\chi^{*}(\gamma_{0},\gamma_{1},a_{max})=\chi^{*}_{avg}(\gamma_{0},\gamma_{1},a_{max_{0}},a_{max_{1}})$ . However, if $\gamma_{0},\gamma_{1}\neq 1$ and $\gamma_{0}\neq\gamma_{1}$ , then $\chi^{*}(\gamma_{0},\gamma_{1},a_{max})<\chi^{*}_{avg}(\gamma_{0},\gamma_{1},a_{max_{0}},a_{max_{1}})$ .

References

(1) C. Shannon, “A mathematical theory of communication,” The Bell Syst. Tech. J., vol. 27, pp. 379–423, 623–656, 1948.
(2) A. Feinstein, “A new basic theorem of information theory,” IRE Trans. Inf. Theory PGIT, vol. 4, p. 2, 1954.
(3) J. Wolfowitz, Coding Theorems of Information Theory. Berlin: Springer, 1964.
(4) J. Wolfowitz, “On channels without a capacity,” Inf. Control, vol. 6, p. 49, 1963.
(5) R. Ahlswede, “On concepts of performance parameters for channels,” in General Theory of Information Transfer and Combinatorics, vol. 4123 of Lecture Notes in Computer Science, pp. 639–663, Springer Berlin, Heidelberg, 2006.
(6) A. Winter, “Coding theorem and strong converse for quantum channels,” IEEE Trans. Inf. Theory, vol. 45, p. 2481, 1999.
(7) T. Ogawa and H. Nagaoka, “Strong converse to the quantum channel coding theorem,” IEEE Trans. Inf. Theory, vol. 45, p. 2486, 1999. arXiv:quant-ph/9808063.
(8) I. Bjelaković and H. Boche, “Classical capacities of compound quantum channels,” in Proc. IEEE Information Theory Workshop (ITW 2008) Porto, Portugal, 2008.
(9) I. Bjelaković and H. Boche, “Classical capacities of averaged and compound channels,” IEEE Trans. Inf. Theory, vol. 55, p. 3360, 2009.
(10) R. König and S. Wehner, “A strong converse for classical channel coding using entangled inputs,” Phys. Rev. Lett., vol. 103, p. 070504, 2009. arXiv:0903.2838.
(11) S. Bose, “Quantum communication through an unmodulated spin chain,” Phys. Rev. Lett., vol. 91, p. 207901, 2003. arXiv:quant-ph/0212041.
(12) D. Kretschmann and R. Werner, “Quantum channels with memory,” Phys. Rev. A, vol. 72, p. 062323, 2005. arXiv:0803.2069.
(13) S. Mancini, “Models for quantum memory channels,” J. Phys.: Conf. Ser., vol. 36, p. 121, 2006.
(14) A. Holevo, “The capacity of the quantum channel with general signal states,” IEEE Trans. Inf. Theory, vol. 44, p. 269, 1998. arXiv:quant-ph/9611023.
(15) B. Schumacher and M. Westmoreland, “Sending classical information via noisy quantum channels,” Phys. Rev. A, vol. 56, p. 131, 1997.
(16) I. Csiszár and J. Körner, Information Theory : Coding Theorems for Discrete Memoryless Systems. Cambridge: Cambridge University Press, second ed., 2011.
(17) J. Norris, Markov Chains. Cambridge: Cambridge University Press, 1997.
(18) N. Datta and T. Dorlas, “Classical capacity of quantum channels with general markovian correlated noise,” J. Stat. Phys., vol. 134, p. 1173, 2009. arXiv:0712.0722.
(19) G. Bowen and S. Mancini, “Quantum channels with a finite memory,” Phys. Rev. A, vol. 69, p. 012306, 2004. arXiv:quant-ph/0305010.
(20) C. Morgan, The information-carrying capacity of certain quantum channels. PhD thesis, University College Dublin, 2010. arXiv:1007.2723.
(21) N. Datta and T. Dorlas, “The coding theorem for a class of quantum channels with long-term memory,” J. Phys. A, vol. 40, p. 8147, 2007. arXiv:quant-ph/0610049.
(22) T. Dorlas and C. Morgan, “Classical capacity of quantum channels with memory,” Phys. Rev. A, vol. 79, p. 032320, 2009. arXiv:0902.2834.
(23) R. Ahlswede, “Every channel with time structure has a capacity sequence,” in Proc. IEEE Information Theory Workshop (ITW 2010), Dublin, Ireland, 2010.
(24) N. Datta, M. Hsieh, and F. Brandão, “Strong converse rates and an example of violation of the strong converse property.” arXiv:1106.3089.
(25) T. Dorlas and C. Morgan, “Calculating a maximizer for quantum mutual information,” Int. J. Quantum Inf., vol. 6, p. 745, 2008. arXiv:1107.3741.