On Zero-Error Capacity of Graphs with One Edge
Abstract
In this paper, we study the zero-error capacity of channels with memory, which are represented by graphs. We provide a method to construct a code for any graph with one edge, thereby determining a lower bound on its zero-error capacity. Moreover, this code can achieve the zero-error capacity when the symbols in a vertex with degree one are the same. We further apply our method to the one-edge graphs representing the binary channels with two memories. There are 28 possible graphs, which can be organized into 11 categories based on their symmetries. The code constructed by our method is proved to achieve the zero-error capacity for all these graphs except for the two graphs in Case 11.
Index Terms:
zero-error capacity, graph with one edge, channel with memory
I Introduction
Let be a finite set. A channel with transition matrix is represented by a graph , where the vertex set is and the edge set is with for if
For , and are called distinguishable if . Any two sequences and are also called distinguishable if there exists a coordinate such that the -th symbols of and are adjacent in [2, 3].
Now for a set of -ary sequences, if the sequences in are pairwise distinguishable, then messages mapped to can be transmitted through without error. The set is called a code of length for , and is its rate, where the base of the logarithm is 2 and is omitted throughout this paper. Let be a sequence of codes for . The zero-error capacity of is defined to be the maximum among all
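To make these definitions concrete, the following minimal Python sketch checks pairwise distinguishability and computes the rate of a candidate code; the alphabet, edge set, and code below are toy values of our own choosing, not taken from the paper.

```python
from itertools import combinations
from math import log2

def distinguishable(x, y, edges):
    """Two equal-length sequences are distinguishable if some coordinate
    carries a pair of symbols joined by an edge of the graph."""
    return any(tuple(sorted((a, b))) in edges for a, b in zip(x, y))

def is_code(C, edges):
    """A set of sequences is a zero-error code if its members are
    pairwise distinguishable."""
    return all(distinguishable(x, y, edges) for x, y in combinations(C, 2))

# Toy example: ternary alphabet {0, 1, 2} whose only edge is (0, 1).
edges = {(0, 1)}
C = [(0, 0), (1, 1), (0, 1)]        # candidate code of length 2
print(is_code(C, edges))            # True
print(log2(len(C)) / len(C[0]))     # rate log|C| / n, about 0.79
```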
The zero-error capacity problem was introduced by Shannon [2] in 1956. He considered a typewriter channel, as shown in Fig. 1, which can be represented by a graph with length , i.e., each with is distinguishable from two other elements in . He established a lower bound on the capacity, which was proved tight by Lovász [4] in 1979. The problem remains open even for the complement of a cycle graph with length 7 (the zero-error capacity problem is trivial for the complement of a cycle graph with even length). Due to the difficulty of solving the problem in general, the zero-error capacity of some special graphs has been investigated in recent years. In 2010, Zhao and Permuter [5] introduced a dynamic programming formulation for computing the zero-error feedback capacity of channels with state information. The zero-error capacity of some special timing channels was determined by Kovačević and Popovski [6] in 2014. In 2016, Nakano and Wadayama [7] derived a lower bound and an upper bound on the zero-error capacity of nearest neighbor error channels with a multilevel alphabet.
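For illustration, the sketch below verifies Shannon's classical length-2 code for the typewriter channel (symbols 0, ..., 4, where two symbols are distinguishable exactly when they differ by 2 or 3 modulo 5); its rate is (1/2) log 5, which Lovász showed to be the capacity. This is the textbook example from [2, 4], not a construction specific to this paper.

```python
from itertools import combinations
from math import log2

# Typewriter channel: symbols i and j are confusable when equal or adjacent
# mod 5, hence distinguishable exactly when they differ by 2 or 3 mod 5.
def dist_symbols(a, b):
    return (a - b) % 5 in (2, 3)

def distinguishable(x, y):
    return any(dist_symbols(a, b) for a, b in zip(x, y))

code = [(0, 0), (1, 2), (2, 4), (3, 1), (4, 3)]   # Shannon's length-2 code
assert all(distinguishable(x, y) for x, y in combinations(code, 2))
print(log2(len(code)) / 2)   # 0.5 * log 5, about 1.161
```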
The zero-error capacity of a channel with memory was first studied by Ahlswede et al. [8] in 1998. A channel with memories can be represented by with . In [8], the authors studied a binary channel with one memory, i.e., a channel represented by with , where , and any two vertices may or may not be distinguishable. They determined the zero-error capacity when only one pair of vertices is distinguishable. Based on their work, Cohen et al. [9] in 2016 studied channels with 3 distinguishable pairs of vertices. All the remaining cases were solved in 2018 by Cao et al. [10].
However, when or , the number of cases explodes dramatically, making it infeasible to solve them one by one. For example, when and , the number of cases is billion. There is thus a pressing need for a general result. This paper considers any graph with only one edge, where , and is a finite set. This graph is denoted by or . We devise a simple method to construct a code for , thus obtaining a lower bound on its zero-error capacity. For any graph with more than one edge, let be one of its edges. Note that any code for is also a code for . Our method can therefore be used to construct a code for any non-empty graph, thus obtaining a general lower bound on the zero-error capacity.
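To give a sense of this explosion, the sketch below counts the possible channel graphs under the assumption (as in the model of [8] and Section II) that the vertex set consists of all q^(m+1) blocks and that every pair of vertices may independently be distinguishable or not; the function name num_graphs is ours.

```python
from math import comb

def num_graphs(q, m):
    """Number of possible channel graphs when each pair of the q**(m+1)
    vertices may independently be an edge (distinguishable) or not."""
    vertices = q ** (m + 1)
    return 2 ** comb(vertices, 2)

for q, m in [(2, 1), (2, 2), (3, 1)]:
    print(f"q={q}, m={m}: {num_graphs(q, m):,} graphs")
# q=2, m=1: 2**6  = 64 graphs
# q=2, m=2: 2**28 = 268,435,456 graphs
# q=3, m=1: 2**36 = 68,719,476,736 graphs
```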
We apply our method to the binary channels with two memories represented by the graphs with only one edge. There are 28 possible graphs, which can be classified into 11 categories up to symmetry (see Section II for more details). The capacities of the graphs in each category are the same, so only one of them needs to be considered. Table I summarizes all the solved and unsolved cases. The code constructed by our method achieves the zero-error capacity for all these graphs except for the two graphs in Case 11.
II Zero-Error Capacity Problem with Memories
For a finite set of symbols, the channel with memories can be represented by a graph , where , , and for any , if
and and are called distinguishable for .
For , , let . For , , we say and are distinguishable for if there exists at least one coordinate with
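Since the exact indexing in this definition is not recoverable from the extracted text, the following sketch adopts one common convention for a channel with m memories, namely that each coordinate contributes the block of m+1 consecutive symbols starting there; the edge set and the sequences used below are toy choices of ours.

```python
def blocks(x, m):
    """All length-(m+1) blocks of x under the assumed windowing convention."""
    return [tuple(x[i:i + m + 1]) for i in range(len(x) - m)]

def distinguishable(x, y, edges, m):
    """x and y are distinguishable if some coordinate carries a pair of
    blocks joined by an edge of the graph."""
    return any(tuple(sorted((u, v))) in edges
               for u, v in zip(blocks(x, m), blocks(y, m)))

# Hypothetical one-edge graph for the binary channel with two memories:
# the single edge joins the vertices 000 and 111.
m = 2
edges = {((0, 0, 0), (1, 1, 1))}
x, y = (0, 0, 0, 0), (1, 1, 1, 0)
print(distinguishable(x, y, edges, m))   # True: blocks 000 and 111 at coordinate 0
```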
Definition 1
Let be a set of length sequences and , , be a sequence of such sets indexed by . The asymptotic rate of is if it exists. If the sequences in are pairwise distinguishable for the graph , then is called a code of length for , and the sequences in are called codewords.
Definition 2
Let be a sequence of codes for the graph such that for all , achieves the largest cardinality of a code of length for . The zero-error capacity of the graph is defined as
Note that the limit above always exists because is superadditive, i.e., . Clearly, . A sequence of codes is said to be asymptotically optimal for the graph if .
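The superadditivity can be seen from concatenation: juxtaposing a code of length n with a code of length m yields a code of length n+m, so the maximum sizes satisfy M(n+m) ≥ M(n)M(m) and log M(n) is superadditive, whence the limit exists by Fekete's lemma. The sketch below checks this for the typewriter-channel code of the introduction (again the standard example from [2], used only for illustration).

```python
from itertools import combinations, product
from math import log2

# Typewriter channel: symbols differing by 2 or 3 (mod 5) are distinguishable.
def distinguishable(x, y):
    return any((a - b) % 5 in (2, 3) for a, b in zip(x, y))

C2 = [(0, 0), (1, 2), (2, 4), (3, 1), (4, 3)]    # a code of length 2
C4 = [u + v for u, v in product(C2, C2)]         # concatenated code, length 4
assert all(distinguishable(x, y) for x, y in combinations(C4, 2))
print(len(C4), log2(len(C4)) / 4)                # 25 codewords, rate 0.5 * log2(5)
# Concatenation shows M(n+m) >= M(n) * M(m), i.e. log M(n) is superadditive,
# so log M(n) / n converges, as claimed above.
```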
We apply the method in [10] to construct a new code based on an existing code. Let be a set of length sequences and be a sequence of such sets indexed by . For any , by adding an arbitrary prefix of length and an arbitrary suffix of length to all the sequences in , we obtain a new set of sequences of length , denoted by . Let be a sequence of such sets indexed by .
Lemma 1 (Lemma 2 in [10])
, i.e., .
Obviously, if is a sequence of codes for a given graph, then is also a sequence of codes for the same graph. Conversely, if is a sequence of codes for a graph, then is called a sequence of quasi-codes for the same graph.
Let , be any string with entries in . Let be uniquely decomposable (for a set of strings as defined above, denotes the family of all sequences which are concatenations of these ), i.e., any sequence in can be uniquely decomposed into a sequence of strings in . By adding a prefix and a suffix to all sequences in , we obtain a new set denoted by . If there exist a prefix and a suffix such that is a sequence of codes for the graph , then we call a sequence of quasi -codes for . Moreover, if is a sequence of codes for , then we call a sequence of -codes for . For any sequence , let denote the length of . By [11, Lemma 4.5], we have
where is the only positive root of
Moreover, if achieves the largest rate among all sequences of (quasi) -codes for , we say that is an asymptotically optimal (quasi) -code.
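Under the standard reading of [11, Lemma 4.5], the rate attached to a uniquely decomposable set {u_1, ..., u_k} is log λ, where λ is the unique positive root of Σ_i λ^(-|u_i|) = 1. The sketch below finds this root by bisection; the function name growth_root and the example set {0, 01} are our own illustrations.

```python
from math import log2

def growth_root(lengths, tol=1e-12):
    """Unique positive root lam of sum_i lam**(-l_i) = 1, found by bisection
    (the sum is strictly decreasing in lam); the rate is log2(lam)."""
    f = lambda lam: sum(lam ** (-l) for l in lengths) - 1
    lo, hi = 1.0, 2.0
    while f(hi) > 0:            # enlarge the bracket until f changes sign
        hi *= 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if f(mid) > 0 else (lo, mid)
    return (lo + hi) / 2

# Example: the uniquely decomposable set {0, 01} -- every concatenation
# parses uniquely, the lengths are (1, 2), and the rate is log2 of the
# golden ratio.
lam = growth_root([1, 2])
print(lam, log2(lam))           # about 1.618 and 0.694
```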
Now we define two kinds of mappings and as follows.
• If a graph is obtained from a graph by reversing the sequences representing the vertices of , then .
• Let be a permutation of . Let be a graph such that for any pair , if and only if . Then .
Two graphs and are called interchangeable if there exists a permutation of such that one of the following three conditions holds.
1.
2.
3.
Obviously, the zero-error capacities of two interchangeable graphs are the same. Table I lists all the graphs with one edge representing the binary channels with two memories. The graphs in each case are pairwise interchangeable. We only need to consider any one of the graphs in each case.
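The grouping into 11 cases can be reproduced mechanically, assuming the relevant symmetries are exactly the two mappings above (reversing the length-3 strings that represent the vertices, and permuting the binary alphabet) together with their composition. The helper names canon, rev, and flip in the sketch are ours.

```python
from itertools import combinations, product

verts = [''.join(b) for b in product('01', repeat=3)]     # the 8 vertices
rev  = lambda v: v[::-1]                                   # reverse the block
flip = lambda v: v.translate(str.maketrans('01', '10'))   # permute the alphabet

def canon(u, v):
    """Smallest image of the edge {u, v} under the assumed symmetry group."""
    maps = (lambda x: x, rev, flip, lambda x: flip(rev(x)))
    return min(tuple(sorted((f(u), f(v)))) for f in maps)

edges = list(combinations(verts, 2))      # all 28 possible edges
classes = {canon(u, v) for u, v in edges}
print(len(edges), len(classes))           # 28 11
```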
III Capacity of the Graphs with One Edge
In this section, we first construct an optimal quasi 2-code for any graph with one edge. Then we will show that this quasi 2-code is asymptotically optimal for a class of graphs with one edge.
III-A Optimal quasi 2-code for the graph with one edge
Recall that the graph with only one edge is represented by , where is the only edge. Before we construct the optimal (quasi) 2-code for , we first present the following definitions.
Definition 3
The sequence is a unit of if and there exists a non-negative number such that for any . Moreover, is called a prefix-unit of if and is called a suffix-unit of if .
Clearly, any sequence is a prefix-unit and suffix-unit of itself. To facilitate the channel capacity characterization, given with , we introduce the following notations.
• Let and denote, respectively, the longest common prefix and suffix of and ;
• Let and . With a slight abuse of notation, let
Since the zero-error capacities of and are the same, we only need to consider the case that , i.e., .
• Let denote the shortest prefix-unit of such that . Likewise, let denote the shortest prefix-unit of such that .
Now we have the following theorem.
Theorem 1
For with and , the set is an asymptotically optimal quasi 2-code. Moreover, , where is the only positive root of
The following two lemmas will serve as stepping stones to establish Theorem 1.
Lemma 2
For , let be an arbitrary but fixed sequence. Let , the concatenation of and . If , then for any , we have .
Proof:
The sequence can be written as , where , , and . Now we prove that for any . Note that for each , , there exists a string , such that , an entry of . Note that
We have
and
Now we denote
and
where is the indicator function. Clearly, .
By the definition of and , for any , we have
(1) |
and for any , we have
1. implies ;
2. implies .
When , and we have
where and hold since Conditions 1 and 2 above hold, and holds since (1) holds.
Likewise, when , we have
When , we have , and thus,
∎
Example 1
For , we have and . Then and . Let be an arbitrary but fixed sequence in . By Lemma 2, we have for any , i.e.,
Lemma 3
Let for each , where , , are strings with entries in . For with and , if is a sequence of quasi codes, then any with is a unit of or . Moreover, if is a unit of , then .
Proof:
Note that is a sequence of quasi codes. We can find a prefix and a suffix such that for any , by adding and to all sequences in , we obtain a code for with . Let be a sequence of these codes indexed by . For any sequence and positive integer , let denote the concatenation of with itself times.
Now we consider . Let
be two codewords in , where is an arbitrary string in with . Since and are distinguishable for , there exists a coordinate such that
Without loss of generality, we assume and , and we will prove that is a unit of .
By definition, , which implies that . On the other hand, the first and the last bits of and are respectively the same, i.e., for . Therefore,
Thus,
and
Recall that . Letting denote the modulo- addition and , we have
Note that . We have is a prefix-unit of , and then is a unit of .
Now we prove that if is a unit of , then . Suppose to the contrary that . Recall that is the shortest prefix-unit of such that . As is a prefix shorter than , we have , which implies and . Note that . Now for and , as is a unit of ,
(2) |
Now we consider . Let
(3) |
be two codewords in . There exists a coordinate such that one of the following two conditions holds.
1. and ;
2. and .
Note that and . We have
(4) |
where and . To simplify the notation, let , , and . Clearly, and . If , then , which contradicts (4). Thus . Likewise, we also have
Corollary 1
Let be a sequence of quasi codes for with . Then , .
With the above auxiliary results, we turn to the proof of Theorem 1.
Proof:
We first prove that is a quasi 2-code for . It is sufficient to prove that
is a code for with , i.e., any two different sequences are distinguishable for . The sequences can be written as and , where for and , and . Let be the smallest index such that , i.e.,
Without loss of generality, we assume that and . Let
be the coordinate of the first symbol of in .
Note that both and are in . By Lemma 2, we have
(5) |
On the other hand, the fact that and are, respectively, the prefix-unit of and implies that
(6) |
and
(7) |
Recall that . By (5) and (6), we have
Therefore,
Likewise, we can also obtain that
Hence are distinguishable for , which indicates that is a quasi 2-code for .
We now prove that has the highest rate among all the quasi 2-codes for . Assume there exists a quasi 2-code
such that . Note that and , where and satisfy, respectively, that
and
If and , then , which contradicts the assumption that . Thus, or . By Lemma 3, either or is a unit of or . Without loss of generality, let be a unit of and . Moreover, since , we have . Thus, by Lemma 3, is also a unit of or . Considering that , we can obtain that is a unit of but not a unit of , and thus . Moreover, we can also obtain that is not a unit of .
We have proved that for , each of and is a unit of , and neither of them is a unit of . In the remainder of the proof, we show that this statement cannot hold, so that there does not exist a sequence of codes whose rate is larger than . This will complete the proof of the theorem.
Now we assume that the statement holds. There exists a prefix and a suffix such that by adding and to all sequences in , we obtain a code for , denoted by
Let
and
be sequences of length . Note that these two sequences are distinguishable. There exists a coordinate such that
Letting and , since and , we have
which implies
(8) |
On the other hand, as is not a unit of , we have and thus . Since is not a unit of either, we further have
(9) |
By (8) and (9), we can obtain that
or
Letting , if , then and
Likewise, letting , if , then and
Consider the four cases:
) and ,
) and ,
) and ,
) and ,
where
and
for .
We have shown that the fact that and are distinguishable implies that Case or holds. Likewise, we have the following implications.
• and are distinguishable ⇒ Case or holds;
• and are distinguishable ⇒ Case or holds;
• and are distinguishable ⇒ Case or holds.
From all these implications, we derive that either both Cases and hold, or both Cases and hold. If Cases and hold, then , and thus . Hence, , a contradiction. Likewise, if Cases and hold, then , and thus , which also leads to a contradiction. The theorem is proved. ∎
III-B Optimality of the quasi 2-code
For any , we define two string operations and , where . Let
Let be the string obtained by deleting each symbol of whose coordinate is in . With a slight abuse of notation, let . Obviously, . The following lemma will be used in determining the upper bound on zero-error capacity.
Lemma 4
For any graph representing the channel with memories (any vertex in is of length ), let be a code for .
1. If there exists a codeword and a coordinate set such that for any , the vertex is of degree zero, then after replacing by any sequence in , the updated remains a code for .
2. If there exists a set such that for any and any , and are indistinguishable, then and is a code for .
Proof:
We first prove that when is replaced by any sequence , the updated remains a code. It is sufficient to prove that for any , , we have and are distinguishable. Since and are distinguishable, there exists a coordinate such that and are adjacent in , which implies that the degree of is not zero, i.e., . Thus,
which implies that . Therefore, and are distinguishable.
Now we prove is a code for . We only need to prove that for any codewords , and are distinguishable. Since and are distinguishable, there exists a coordinate such that and are adjacent, which implies that and so . Thus, and are distinguishable. ∎
Example 2
For the graph in Fig. 2, let be a codeword in a code for . It can be seen that and
Since the degree of the vertex is zero, when the codeword is replaced by , the updated remains a code for . Moreover, if all the codewords in start with , then is also a code.
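The following toy walkthrough of Lemma 4(1) uses the same windowing convention assumed in Section II and a hypothetical one-edge graph whose single edge joins 000 and 111: the last coordinate of the codeword 1110 is covered only by the block 110, which has degree zero, so replacing the symbol there cannot remove any coordinate that witnesses distinguishability.

```python
from itertools import combinations

# Hypothetical one-edge graph for a binary channel with two memories.
MEM = 2
EDGES = {((0, 0, 0), (1, 1, 1))}

def blocks(x):
    return [tuple(x[i:i + MEM + 1]) for i in range(len(x) - MEM)]

def distinguishable(x, y):
    return any(tuple(sorted((u, v))) in EDGES for u, v in zip(blocks(x), blocks(y)))

def is_code(C):
    return all(distinguishable(x, y) for x, y in combinations(C, 2))

C = [(0, 0, 0, 0), (1, 1, 1, 0)]      # a small code: blocks 000 vs 111 at position 0
assert is_code(C)

# The last coordinate of (1, 1, 1, 0) is covered only by the block 110, whose
# degree in the graph is zero, so Lemma 4(1) allows replacing that symbol.
C_new = [(0, 0, 0, 0), (1, 1, 1, 1)]
print("replacement preserves the code:", is_code(C_new))   # True
```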
For disjoint sets , let denote the disjoint union of and , and denote the disjoint union of , .
Theorem 2
For with and , if all the symbols of are the same, , then the capacity , where is the only positive root of
Proof:
We obtain directly from Theorem 1. Now we prove that . Let . By Theorem 1,
is an asymptotically optimal quasi 2-code for . Since each prefix of is a prefix-unit of , we have and
On the other hand, since the first symbols and the last symbols of and are all the same, that is, for , we have
is a prefix-unit of . Assume that . We have , and then
Note that for . We have
and so
which does not hold. Therefore, and .
Let be a sequence of codes for such that for all , achieves the largest cardinality of a code of length . For any codeword with , let
We can find that for any , the vertex is neither nor , i.e., this vertex is of degree zero. By Lemma 4, replacing by , the updated set remains a code, where
or equivalently
Thus, we can assume that each codeword in starts with and ends with .
Let be an arbitrary but fixed sequence in the updated . Let be the first coordinate such that . Clearly, . If one of the following two conditions holds,
• or
• and
then for any , the vertex is neither nor , and so it is of degree zero. By Lemma 4, replacing by , the updated set remains a code, where
This replacement can be repeated until every codeword in the updated satisfies that or
where is the first coordinate such that . Let
for and
Clearly,
the disjoint union of , .
Note that all the sequences in start with . Letting , we can see that for any and any ,
As , neither nor can be , and so they are indistinguishable. Therefore, by Lemma 4, we have
(10) |
IV Capacity of the Binary Channel with Two Memories
In this section, we consider all the 28 graphs with one edge, which can be classified into 11 cases and are listed in Table I. As discussed in Section I, for each case, we only need to consider any one of the graphs therein. The zero-error capacities of the graphs in Cases 1 to 10 are determined in this paper. We also give a lower bound and an upper bound on the zero-error capacity of the graphs in Case 11.
Theorem 3
for , where is the only positive root of the equation
Theorem 4
for .
Theorem 5
for , where is the only positive root of the equation
To facilitate the proofs of the following theorems, for a set of sequences of length and a string of length strictly less than , we let denote the subset of sequences in starting with .
Theorem 6
for , where is the only positive root of the equation
Proof:
By Theorem 1, we obtain that
To prove , let be a sequence of codes for such that for all , achieves the largest cardinality of a code of length . Flipping the first bit of any codeword in starting with , and the second bit of any codeword starting with , by Lemma 4, the updated remains a code. Thus, without loss of generality, we assume that any codeword in starts with , or . Equivalently, we assume that any codeword in starts with , , or , i.e.,
and so
(12) |
Obviously,
(13) |
and by Lemma 4, both and are codes for . Note that both and achieve maximum cardinality for codes of lengths and , respectively. We have
(14) |
By (12)-(14), we have , which is a classic recurrence formula. Therefore, . ∎
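As a generic illustration of this last step (the theorem's actual recurrence and characteristic equation are among the stripped formulas, so the Fibonacci-type form below is only an assumed example), a linear recurrence M(n) = M(n-1) + M(n-2) forces (1/n) log M(n) to converge to the logarithm of the positive root of λ² = λ + 1.

```python
from math import log2

lam = (1 + 5 ** 0.5) / 2          # positive root of lam**2 = lam + 1
M = {0: 1, 1: 2}                  # toy initial values
for n in range(2, 81):
    M[n] = M[n - 1] + M[n - 2]    # assumed Fibonacci-type recurrence
for n in (10, 20, 40, 80):
    print(n, round(log2(M[n]) / n, 4), round(log2(lam), 4))
```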
Theorem 7
for , where is the only positive root of the equation
Proof:
By Theorem 1, we obtain that
To prove , let be a sequence of codes for such that for all , achieves the largest cardinality of a code of length . Flip the first bit of any codeword in starting with . For any codeword containing , replace any one of 11s by 10. By Lemma 4, the updated remains a code. Thus, we assume that any codeword in starts with , or , i.e.,
and so
We also have
and by Lemma 4, both and are codes for . Thus, , which implies that . ∎
Lemma 5
for containing all four edges , , and .
Proof:
We can easily obtain a sequence of codes for :
whose rate is
Now we consider the proof of the upper bound. Let be a sequence of codes for such that for all , achieves the largest cardinality of a code of length . Obviously,
As for any , the third bits of any two codewords in are the same, by Lemma 4, we see that is also a code for and
Thus, , i.e., . ∎
The following lemma is evident, and so the proof is omitted.
Lemma 6
For with , is a code for each , and .
Theorem 8
for .
Theorem 9
for .
Theorem 10
for .
Theorem 11
for .
Proof:
We can easily obtain a sequence of codes for :
whose rate is
Let be the graph with two edges and . Since , we have . Let be asymptotically optimal for .
If no codeword in contains the substring , then is a code for . Otherwise, we perform a sequence of substring replacements. Specifically, let be an arbitrary codeword which contains the substring 111. Then replace any one of the 111s by 101. The updated remains a code. Thus the replacement can be repeated until no codeword in the updated contains the substring 111, and therefore the final updated is a code for . Thus, ∎
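A minimal sketch of the replacement step just described (the claim that the updated set remains a code is the content of the argument above, not of this snippet): each rewrite of 111 into 101 removes a 1, so the procedure terminates with no 111 left.

```python
def remove_111(word):
    """Repeatedly replace one occurrence of '111' by '101' until none remains."""
    while '111' in word:
        i = word.index('111')
        word = word[:i] + '101' + word[i + 3:]
    return word

print(remove_111('0111100'))   # '0101100'
```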
Theorem 12
for .
Proof:
By Theorem 1, we obtain that
The proof of the upper bound is also similar to the proof of Theorem 8. Let be a sequence of codes for such that for all , achieves the largest cardinality of a code of length . Flip the first bit of any codeword in starting with and the second bit of any codeword starting with . For any codeword containing or , replace any one of 111s by 110 or 0000s by 0100. By Lemma 4(1), the final updated remains a code. Thus, we can assume that any codeword in starts with 0001, 0010, 0011, 0100 or 0110, i.e.,
and so
We also have
and
Both and are codes for . Thus, i.e., . ∎
Theorem 13
for , where is the only positive root of the equation
Proof:
We can obtain a sequence of codes for :
whose rate is
The proof of the upper bound is also similar to the proof of Theorem 4. Let be a sequence of codes for such that for all , achieves the largest cardinality of a code of length . Flip the first bit of any codeword in starting with , or , the second bit of any codeword starting with , and the first two bits of any codeword starting with . Thus, we can assume that any codeword in starts with 001, 010, 100, i.e.,
and so
By Lemma 4(2), we have
and
Both and are codes for . Thus,
i.e., . ∎
Remark: We can see that for the graphs in Cases 1 to 10, the optimal quasi 2-code constructed in Theorem 1 achieves the zero-error capacity. However, for (Case 11), the optimal quasi 2-code is , whose rate is . The capacity .
V Conclusion
In this paper, we have investigated the zero-error capacity of channels characterized by graphs containing a single edge. Previous works primarily focused on binary-input channels with one or two memories. Our study extends the analysis to channels with -ary inputs and an arbitrary number of memories. We provide a method for constructing zero-error codes for such graphs with one edge, which offers a lower bound on the zero-error capacity. For the binary channel with two memories, the zero-error codes constructed by this method have been proven to be optimal in most cases.
It could be valuable to construct a method to obtain a general upper bound for graphs with one edge. If this upper bound matches the lower bound derived in this paper, it would allow us to determine the capacity for numerous graphs of this type.
References
- [1] Q. Cao and Q. Chen, “On zero-error capacity of ‘one-edge’ binary channels with two memories,” in 2022 IEEE International Symposium on Information Theory (ISIT), 2022, pp. 2762–2767.
- [2] C. Shannon, “The zero error capacity of a noisy channel,” IRE Trans. Inf. Theory, vol. 2, no. 3, pp. 8–19, 1956.
- [3] Q. Cao and R. W. Yeung, “Zero-error capacity regions of noisy networks,” IEEE Trans. Inf. Theory, vol. 68, no. 7, pp. 4201–4223, 2022.
- [4] L. Lovász, “On the Shannon capacity of a graph,” IEEE Trans. Inf. Theory, vol. 25, no. 1, pp. 1–7, 1979.
- [5] L. Zhao and H. H. Permuter, “Zero-error feedback capacity of channels with state information via dynamic programming,” IEEE Trans. Inf. Theory, vol. 56, no. 6, pp. 2640–2650, June 2010.
- [6] M. Kovačević and P. Popovski, “Zero-error capacity of a class of timing channels,” IEEE Trans. Inf. Theory, vol. 60, no. 11, pp. 6796–6800, Nov 2014.
- [7] T. Nakano and T. Wadayama, “On zero error capacity of nearest neighbor error channels with multilevel alphabet,” in 2016 International Symposium on Information Theory and Its Applications (ISITA), Oct 2016, pp. 66–70.
- [8] R. Ahlswede, N. Cai, and Z. Zhang, “Zero-error capacity for models with memory and the enlightened dictator channel,” IEEE Trans. Inf. Theory, vol. 44, no. 3, pp. 1250–1252, May 1998.
- [9] G. Cohen, E. Fachini, and J. Körner, “Zero-error capacity of binary channels with memory,” IEEE Trans. Inf. Theory, vol. 62, no. 1, pp. 3–7, Jan 2016.
- [10] Q. Cao, N. Cai, W. Guo, and R. W. Yeung, “On zero-error capacity of binary channels with one memory,” IEEE Trans. Inf. Theory, vol. 64, no. 10, pp. 6771–6778, 2018.
- [11] I. Csiszár and J. Körner, Information Theory: Coding Theorems for Discrete Memoryless Systems, 2nd ed. Cambridge University Press, 2011.