On Zero-Error Capacity of Graphs with One Edge
Abstract
In this paper, we study the zero-error capacity of channels with memory, which are represented by graphs. We provide a method to construct a code for any graph with one edge, thereby determining a lower bound on its zero-error capacity. Moreover, this code can achieve the zero-error capacity when the symbols in a vertex with degree one are the same. We further apply our method to the one-edge graphs representing the binary channels with two memories. There are 28 possible graphs, which can be organized into 11 categories based on their symmetries. The code constructed by our method is proved to achieve the zero-error capacity for all these graphs except for the two graphs in Case 11.
Index Terms:
zero-error capacity, graph with one edge, channel with memory
I Introduction
Let be a finite set. A channel with transition matrix is represented by a graph , where the vertex set is and the edge set is with for if
For , and are called distinguishable if . Any two sequences and are also called distinguishable if there exists a coordinate such that the -th symbols of and are adjacent in [2, 3].
Now for a set of -ary sequences, if the sequences in are pairwise distinguishable, then messages mapped to can be transmitted through without error. The set is called a code of length for , and is its rate, where the base of the logarithm is 2 and is omitted throughout this paper. Let be a sequence of codes for . The zero-error capacity of is defined to be the maximum among all
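To make these definitions concrete, the following minimal Python sketch checks pairwise distinguishability and computes the rate of a candidate code; the alphabet, edge set, and code below are toy values of our own choosing, not taken from the paper.

```python
from itertools import combinations
from math import log2

def distinguishable(x, y, edges):
    """Two equal-length sequences are distinguishable if some coordinate
    carries a pair of symbols joined by an edge of the graph."""
    return any(tuple(sorted((a, b))) in edges for a, b in zip(x, y))

def is_code(C, edges):
    """A set of sequences is a zero-error code if its members are
    pairwise distinguishable."""
    return all(distinguishable(x, y, edges) for x, y in combinations(C, 2))

# Toy example: ternary alphabet {0, 1, 2} whose only edge is (0, 1).
edges = {(0, 1)}
C = [(0, 0), (1, 1), (0, 1)]        # candidate code of length 2
print(is_code(C, edges))            # True
print(log2(len(C)) / len(C[0]))     # rate log|C| / n, about 0.79
```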
The zero-error capacity problem was introduced by Shannon [2] in 1956. He considered a typewriter channel, as shown in Fig. 1, which can be represented by a graph with length , i.e., each with is distinguishable from two other elements in . He established a lower bound on the capacity, which was proved tight by Lovász [4] in 1979. The problem remains open even for the complement of a cycle graph with length 7 (the zero-error capacity problem is trivial for the complement of a cycle graph with even length). Due to the difficulty of solving the problem in general, the zero-error capacity of some special graphs has been investigated in recent years. In 2010, Zhao and Permuter [5] introduced a dynamic programming formulation for computing the zero-error feedback capacity of channels with state information. The zero-error capacity of some special timing channels was determined by Kovačević and Popovski [6] in 2014. In 2016, Nakano and Wadayama [7] derived a lower bound and an upper bound on the zero-error capacity of nearest neighbor error channels with a multilevel alphabet.
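For illustration, the sketch below verifies Shannon's classical length-2 code for the typewriter channel (symbols 0, ..., 4, where two symbols are distinguishable exactly when they differ by 2 or 3 modulo 5); its rate is (1/2) log 5, which Lovász showed to be the capacity. This is the textbook example from [2, 4], not a construction specific to this paper.

```python
from itertools import combinations
from math import log2

# Typewriter channel: symbols i and j are confusable when equal or adjacent
# mod 5, hence distinguishable exactly when they differ by 2 or 3 mod 5.
def dist_symbols(a, b):
    return (a - b) % 5 in (2, 3)

def distinguishable(x, y):
    return any(dist_symbols(a, b) for a, b in zip(x, y))

code = [(0, 0), (1, 2), (2, 4), (3, 1), (4, 3)]   # Shannon's length-2 code
assert all(distinguishable(x, y) for x, y in combinations(code, 2))
print(log2(len(code)) / 2)   # 0.5 * log 5, about 1.161
```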
The zero-error capacity of a channel with memory was first studied by Ahlswede et al. [8] in 1998. A channel with memories can be represented by with . In [8], the authors studied a binary channel with one memory, i.e., a channel represented by with , where , and any two vertices may or may not be distinguishable. They determined the zero-error capacity when only one pair of vertices is distinguishable. Based on their work, Cohen et al. [9] in 2016 studied channels with 3 distinguishable pairs of vertices. All the remaining cases were solved in 2018 by Cao et al. [10].
However, when or , the number of cases explodes dramatically, making it infeasible to solve them one by one. For example, when and , the number of cases is billion. There is thus a pressing need for a general result. This paper considers any graph with only one edge, where , and is a finite set. This graph is denoted by or . We devise a simple method to construct a code for , thus obtaining a lower bound on its zero-error capacity. For any graph with more than one edge, let be one of its edges. Note that any code for is also a code for . Our method can therefore be used to construct a code for any non-empty graph, thus obtaining a general lower bound on the zero-error capacity.
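To give a sense of this explosion, the sketch below counts the possible channel graphs under the assumption (as in the model of [8] and Section II) that the vertex set consists of all q^(m+1) blocks and that every pair of vertices may independently be distinguishable or not; the function name num_graphs is ours.

```python
from math import comb

def num_graphs(q, m):
    """Number of possible channel graphs when each pair of the q**(m+1)
    vertices may independently be an edge (distinguishable) or not."""
    vertices = q ** (m + 1)
    return 2 ** comb(vertices, 2)

for q, m in [(2, 1), (2, 2), (3, 1)]:
    print(f"q={q}, m={m}: {num_graphs(q, m):,} graphs")
# q=2, m=1: 2**6  = 64 graphs
# q=2, m=2: 2**28 = 268,435,456 graphs
# q=3, m=1: 2**36 = 68,719,476,736 graphs
```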
We apply our method to the binary channels with two memories represented by the graphs with only one edge. There are 28 possible graphs, which can be classified into 11 categories up to symmetry (see Section II for more details). The capacities of the graphs in each category are the same, so only one of them needs to be considered. Table I summarizes all the solved and unsolved cases. The code constructed by our method achieves the zero-error capacity for all these graphs except for the two graphs in Case 11.
II Zero-Error Capacity Problem with Memories
For a finite set of symbols, the channel with memories can be represented by a graph , where , , and for any , if
and and are called distinguishable for .
For , , let . For , , we say and are distinguishable for if there exists at least one coordinate with
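Since the exact indexing in this definition is not recoverable from the extracted text, the following sketch adopts one common convention for a channel with m memories, namely that each coordinate contributes the block of m+1 consecutive symbols starting there; the edge set and the sequences used below are toy choices of ours.

```python
def blocks(x, m):
    """All length-(m+1) blocks of x under the assumed windowing convention."""
    return [tuple(x[i:i + m + 1]) for i in range(len(x) - m)]

def distinguishable(x, y, edges, m):
    """x and y are distinguishable if some coordinate carries a pair of
    blocks joined by an edge of the graph."""
    return any(tuple(sorted((u, v))) in edges
               for u, v in zip(blocks(x, m), blocks(y, m)))

# Hypothetical one-edge graph for the binary channel with two memories:
# the single edge joins the vertices 000 and 111.
m = 2
edges = {((0, 0, 0), (1, 1, 1))}
x, y = (0, 0, 0, 0), (1, 1, 1, 0)
print(distinguishable(x, y, edges, m))   # True: blocks 000 and 111 at coordinate 0
```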
Definition 1
Let be a set of length sequences and , , be a sequence of such sets indexed by . The asymptotic rate of is if it exists. If the sequences in are pairwise distinguishable for the graph , then is called a code of length for , and the sequences in are called codewords.
Definition 2
Let be a sequence of codes for the graph such that for all , achieves the largest cardinality of a code of length for . The zero-error capacity of the graph is defined as
Note that the limit above always exists because is superadditive, i.e., . Clearly, . A sequence of codes is said to be asymptotically optimal for the graph if .
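The superadditivity can be seen from concatenation: juxtaposing a code of length n with a code of length m yields a code of length n+m, so the maximum sizes satisfy M(n+m) ≥ M(n)M(m) and log M(n) is superadditive, whence the limit exists by Fekete's lemma. The sketch below checks this for the typewriter-channel code of the introduction (again the standard example from [2], used only for illustration).

```python
from itertools import combinations, product
from math import log2

# Typewriter channel: symbols differing by 2 or 3 (mod 5) are distinguishable.
def distinguishable(x, y):
    return any((a - b) % 5 in (2, 3) for a, b in zip(x, y))

C2 = [(0, 0), (1, 2), (2, 4), (3, 1), (4, 3)]    # a code of length 2
C4 = [u + v for u, v in product(C2, C2)]         # concatenated code, length 4
assert all(distinguishable(x, y) for x, y in combinations(C4, 2))
print(len(C4), log2(len(C4)) / 4)                # 25 codewords, rate 0.5 * log2(5)
# Concatenation shows M(n+m) >= M(n) * M(m), i.e. log M(n) is superadditive,
# so log M(n) / n converges, as claimed above.
```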
We apply the method in [10] to construct a new code based on an existing code. Let be a set of length sequences and be a sequence of such sets indexed by . For any , by adding an arbitrary prefix of length and an arbitrary suffix of length to all the sequences in , we obtain a new set of sequences of length , denoted by . Let be a sequence of such sets indexed by .
Lemma 1 (Lemma 2 in [10])
, i.e., .
Obviously, if is a sequence of codes for a given graph, then is also a sequence of codes for the same graph. Conversely, if is a sequence of codes for a graph, then is called a sequence of quasi-codes for the same graph.
Let , be any string with entries in . Let be uniquely decomposable (for a set of strings as defined above, denotes the family of all sequences which are concatenations of these ), i.e., any sequence in can be uniquely decomposed into a sequence of strings in . By adding a prefix and a suffix to all sequences in , we obtain a new set denoted by . If there exist a prefix and a suffix such that is a sequence of codes for the graph , then we call a sequence of quasi -codes for . Moreover, if is a sequence of codes for , then we call a sequence of -codes for . For any sequence , let denote the length of . By [11, Lemma 4.5], we have
where is the only positive root of
Moreover, if achieves the largest rate among all sequences of (quasi) -codes for , we say that is an asymptotically optimal (quasi) -code.
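Under the standard reading of [11, Lemma 4.5], the rate attached to a uniquely decomposable set {u_1, ..., u_k} is log λ, where λ is the unique positive root of Σ_i λ^(-|u_i|) = 1. The sketch below finds this root by bisection; the function name growth_root and the example set {0, 01} are our own illustrations.

```python
from math import log2

def growth_root(lengths, tol=1e-12):
    """Unique positive root lam of sum_i lam**(-l_i) = 1, found by bisection
    (the sum is strictly decreasing in lam); the rate is log2(lam)."""
    f = lambda lam: sum(lam ** (-l) for l in lengths) - 1
    lo, hi = 1.0, 2.0
    while f(hi) > 0:            # enlarge the bracket until f changes sign
        hi *= 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if f(mid) > 0 else (lo, mid)
    return (lo + hi) / 2

# Example: the uniquely decomposable set {0, 01} -- every concatenation
# parses uniquely, the lengths are (1, 2), and the rate is log2 of the
# golden ratio.
lam = growth_root([1, 2])
print(lam, log2(lam))           # about 1.618 and 0.694
```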
Now we define two kinds of mappings and as follows.
• If a graph is obtained from a graph by reversing the sequences representing the vertices of , then .
• Let be a permutation of . Let be a graph such that for any pair , if and only if . Then .
Two graphs and are called interchangeable if there exists a permutation of such that one of the following three conditions holds.
1.
2.
3.
Obviously, the zero-error capacities of two interchangeable graphs are the same. Table I lists all the graphs with one edge representing the binary channels with two memories. The graphs in each case are pairwise interchangeable. We only need to consider any one of the graphs in each case.
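The grouping into 11 cases can be reproduced mechanically, assuming the relevant symmetries are exactly the two mappings above (reversing the length-3 strings that represent the vertices, and permuting the binary alphabet) together with their composition. The helper names canon, rev, and flip in the sketch are ours.

```python
from itertools import combinations, product

verts = [''.join(b) for b in product('01', repeat=3)]     # the 8 vertices
rev  = lambda v: v[::-1]                                   # reverse the block
flip = lambda v: v.translate(str.maketrans('01', '10'))   # permute the alphabet

def canon(u, v):
    """Smallest image of the edge {u, v} under the assumed symmetry group."""
    maps = (lambda x: x, rev, flip, lambda x: flip(rev(x)))
    return min(tuple(sorted((f(u), f(v)))) for f in maps)

edges = list(combinations(verts, 2))      # all 28 possible edges
classes = {canon(u, v) for u, v in edges}
print(len(edges), len(classes))           # 28 11
```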
III Capacity of the Graphs with One Edge
In this section, we first construct an optimal quasi 2-code for any graph with one edge. Then we will show that this quasi 2-code is asymptotically optimal for a class of graphs with one edge.
III-A Optimal quasi 2-code for the graph with one edge
Recall that the graph with only one edge is represented by , where is the only edge. Before we construct the optimal (quasi) 2-code for , we first present the following definitions.
Definition 3
The sequence is a unit of if and there exists a non-negative number such that for any . Moreover, is called a prefix-unit of if and is called a suffix-unit of if .
Clearly, any sequence is a prefix-unit and suffix-unit of itself. To facilitate the channel capacity characterization, given with , we introduce the following notations.
• Let and denote, respectively, the longest common prefix and suffix of and ;
• Let and . With a slight abuse of notation, let
Since the zero-error capacities of and are the same, we only need to consider the case that , i.e., .
• Let denote the shortest prefix-unit of such that . Likewise, let denote the shortest prefix-unit of such that .
Now we have the following theorem.
Theorem 1
For with and , the set is an asymptotically optimal quasi 2-code. Moreover, , where is the only positive root of
The following two lemmas will serve as stepping stones to establish Theorem 1.
Lemma 2
For , let be an arbitrary but fixed sequence. Let , the concatenation of and . If , then for any , we have .
Proof:
The sequence can be written as , where , , and . Now we prove that for any . Note that for each , , there exists a string , such that , an entry of . Note that
We have
and
Now we denote
and
where is the indicator function. Clearly, .
By the definition of and , for any , we have
(1) |
and for any , we have
1. implies ;
2. implies .
When , and we have
where and hold since Conditions 1 and 2 above hold, and holds since (1) holds.
Likewise, when , we have
When , we have , and thus,
∎
Example 1
For , we have and . Then and . Let be an arbitrary but fixed sequence in . By Lemma 2, we have for any , i.e.,
Lemma 3
Let for each , where , , are strings with entries in . For with and , if is a sequence of quasi codes, then any with is a unit of or . Moreover, if is a unit of , then .
Proof:
Note that is a sequence of quasi codes. We can find a prefix and a suffix such that for any , by adding and to all sequences in , we obtain a code for with . Let be a sequence of these codes indexed by . For any sequence and positive integer , let denote the concatenation of with itself times.
Now we consider . Let
be two codewords in , where is an arbitrary string in with . Since and are distinguishable for , there exists a coordinate such that
Without loss of generality, we assume and , and we will prove that is a unit of .
By definition, , which implies that . On the other hand, the first and the last bits of and are respectively the same, i.e., for . Therefore,
Thus,
and
Recall that . Letting denote the modulo- addition and , we have
Note that . We have is a prefix-unit of , and then is a unit of .
Now we prove that if is a unit of , then . Suppose to the contrary that . Recall that is the shortest prefix-unit of such that . As is a prefix shorter than , we have , which implies and . Note that . Now for and , as is a unit of ,
(2) |
Now we consider . Let
(3) |
be two codewords in . There exists a coordinate such that one of the following two conditions holds.
1. and ;
2. and .
Note that and . We have
(4) |
where and . To simplify the notation, let , , and . Clearly, and . If , then , which contradicts (4). Thus . Likewise, we also have
Corollary 1
Let be a sequence of quasi codes for with . Then , .
With the above auxiliary results, we turn to the proof of Theorem 1.
Proof:
We first prove that is a quasi 2-code for . It is sufficient to prove that
is a code for with , i.e., any two different sequences are distinguishable for . The sequences can be written as and , where for and , and . Let be the smallest index such that , i.e.,
Without loss of generality, we assume that and . Let
be the coordinate of the first symbol of in .
Note that both and are in . By Lemma 2, we have
(5) |
On the other hand, the fact that and are, respectively, the prefix-unit of and implies that
(6) |
and
(7) |
Recall that . By (5) and (6), we have
Therefore,
Likewise, we can also obtain that
Hence are distinguishable for , which indicates that is a quasi 2-code for .
We now prove that has the highest rate among all the quasi 2-codes for . Assume there exists a quasi 2-code
such that . Note that and , where and satisfy, respectively, that
and
If and , then , which contradicts the assumption that . Thus, or . By Lemma 3, either or is a unit of or . Without loss of generality, let be a unit of and . Moreover, since , we have . Thus, by Lemma 3, is also a unit of or . Considering that , we can obtain that is a unit of but not a unit of , and thus . Moreover, we can also obtain that is not a unit of .
We have proved that for , each of and is a unit of , and neither of them is a unit of . In the remainder of the proof, we show that this statement cannot hold, so that there does not exist a sequence of codes whose rate is larger than . This will complete the proof of the theorem.
Now we assume that the statement holds. There exists a prefix and a suffix such that by adding and to all sequences in , we obtain a code for , denoted by
Let
and
be sequences of length . Note that these two sequences are distinguishable. There exists a coordinate such that
Letting and , since and , we have
which implies
(8) |
On the other hand, as is not a unit of , we have and thus . Since is not a unit of either, we further have
(9) |
By (8) and (9), we can obtain that
or
Letting , if , then and
Likewise, letting , if , then and
Consider the four cases:
) and ,
) and ,
) and ,
) and ,
where
and
for .
We have shown that the fact that and are distinguishable implies that Case or holds. Likewise, we have the following implications.
• and are distinguishable ⇒ Case or holds;
• and are distinguishable ⇒ Case or holds;
• and are distinguishable ⇒ Case or holds.
From all these implications, we derive that either both Cases and hold, or both Cases and hold. If Cases and hold, then , and thus . Hence, , a contradiction. Likewise, if Cases and hold, then , and thus , which also leads to a contradiction. The theorem is proved. ∎
III-B Optimality of the quasi 2-code
For any , we define two string operations and , where . Let
Let be the string obtained by deleting each symbol of whose coordinate is in . With a slight abuse of notation, let . Obviously, . The following lemma will be used in determining the upper bound on zero-error capacity.
Lemma 4
For any graph representing the channel with memories (any vertex in is of length ), let be a code for .
1. If there exists a codeword and a coordinate set such that for any , the vertex is of degree zero, then after replacing by any sequence in , the updated remains a code for .
2. If there exists a set such that for any and any , and are indistinguishable, then and is a code for .
Proof:
We first prove that when is replaced by any sequence , the updated remains a code. It is sufficient to prove that for any , , we have and are distinguishable. Since and are distinguishable, there exists a coordinate such that and are adjacent in , which implies that the degree of is not zero, i.e., . Thus,
which implies that . Therefore, and are distinguishable.
Now we prove is a code for . We only need to prove that for any codewords , and are distinguishable. Since and are distinguishable, there exists a coordinate such that and are adjacent, which implies that and so . Thus, and are distinguishable. ∎
Example 2
For the graph in Fig. 2, let be a codeword in a code for . It can be seen that and
Since the degree of the vertex is zero, when the codeword is replaced by , the updated remains a code for . Moreover, if all the codewords in start with , then is also a code.
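The following toy walkthrough of Lemma 4(1) uses the same windowing convention assumed in Section II and a hypothetical one-edge graph whose single edge joins 000 and 111: the last coordinate of the codeword 1110 is covered only by the block 110, which has degree zero, so replacing the symbol there cannot remove any coordinate that witnesses distinguishability.

```python
from itertools import combinations

# Hypothetical one-edge graph for a binary channel with two memories.
MEM = 2
EDGES = {((0, 0, 0), (1, 1, 1))}

def blocks(x):
    return [tuple(x[i:i + MEM + 1]) for i in range(len(x) - MEM)]

def distinguishable(x, y):
    return any(tuple(sorted((u, v))) in EDGES for u, v in zip(blocks(x), blocks(y)))

def is_code(C):
    return all(distinguishable(x, y) for x, y in combinations(C, 2))

C = [(0, 0, 0, 0), (1, 1, 1, 0)]      # a small code: blocks 000 vs 111 at position 0
assert is_code(C)

# The last coordinate of (1, 1, 1, 0) is covered only by the block 110, whose
# degree in the graph is zero, so Lemma 4(1) allows replacing that symbol.
C_new = [(0, 0, 0, 0), (1, 1, 1, 1)]
print("replacement preserves the code:", is_code(C_new))   # True
```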
For disjoint sets , let denote the disjoint union of and , and denote the disjoint union of , .
Theorem 2
For with and , if all the symbols of are the same, , then the capacity , where is the only positive root of
Proof:
We obtain directly from Theorem 1. Now we prove that . Let . By Theorem 1,
is an asymptotically optimal quasi 2-code for . Since each prefix of is a prefix-unit of , we have and
On the other hand, since the first symbols and the last symbols of and are all the same, that is, for , we have
is a prefix-unit of . Assume that . We have , and then
Note that for . We have
and so
which does not hold. Therefore, and .
Let be a sequence of codes for such that for all , achieves the largest cardinality of a code of length . For any codeword with , let
We can find that for any , the vertex is neither nor , i.e., this vertex is of degree zero. By Lemma 4, replacing by , the updated set remains a code, where
or equivalently
Thus, we can assume that each codeword in starts with and ends with .
Let be an arbitrary but fixed sequence in the updated . Let be the first coordinate such that . Clearly, . If one of the following two conditions holds,
• or
• and
then for any , the vertex is neither nor , and so it is of degree zero. By Lemma 4, replacing by , the updated set remains a code, where
This replacement can be repeated until every codeword in the updated satisfies that or
where is the first coordinate such that . Let
for and
Clearly,
the disjoint union of , .
Note that all the sequences in start with . Letting , we can see that for any and any ,
As , neither nor can be , and so they are indistinguishable. Therefore, by Lemma 4, we have
(10) |
IV Capacity of the Binary Channel with Two Memories
In this section, we consider all the 28 graphs with one edge, which can be classified into 11 cases and are listed in Table I. As discussed in Section I, for each case, we only need to consider any one of the graphs therein. The zero-error capacities of the graphs in Cases 1 to 10 are determined in this paper. We also give a lower bound and an upper bound on the zero-error capacity of the graphs in Case 11.
Theorem 3
for , where is the only positive root of the equation
Theorem 4
for .
Theorem 5
for , where is the only positive root of the equation
To facilitate the proofs of the following theorems, for a set of sequences of length and a string of length strictly less than , we let denote the subset of sequences in starting with .
Theorem 6
for , where is the only positive root of the equation
Proof:
By Theorem 1, we obtain that
To prove , let be a sequence of codes for such that for all , achieves the largest cardinality of a code of length . Flipping the first bit of any codeword in starting with , and the second bit of any codeword starting with , by Lemma 4, the updated remains a code. Thus, without loss of generality, we assume that any codeword in starts with , or . Equivalently, we assume that any codeword in starts with , , or , i.e.,
and so
(12) |
Obviously,
(13) |
and by Lemma 4, both and are codes for . Note that both and achieve maximum cardinality for codes of lengths and , respectively. We have
(14) |
By (12)-(14), we have , which is a classic recurrence formula. Therefore, . ∎
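As a generic illustration of this last step (the theorem's actual recurrence and characteristic equation are among the stripped formulas, so the Fibonacci-type form below is only an assumed example), a linear recurrence M(n) = M(n-1) + M(n-2) forces (1/n) log M(n) to converge to the logarithm of the positive root of λ² = λ + 1.

```python
from math import log2

lam = (1 + 5 ** 0.5) / 2          # positive root of lam**2 = lam + 1
M = {0: 1, 1: 2}                  # toy initial values
for n in range(2, 81):
    M[n] = M[n - 1] + M[n - 2]    # assumed Fibonacci-type recurrence
for n in (10, 20, 40, 80):
    print(n, round(log2(M[n]) / n, 4), round(log2(lam), 4))
```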
Theorem 7
for , where is the only positive root of the equation
Proof:
By Theorem 1, we obtain that
To prove , let be a sequence of codes for such that for all , achieves the largest cardinality of a code of length . Flip the first bit of any codeword in starting with . For any codeword containing , replace any one of 11s by 10. By Lemma 4, the updated remains a code. Thus, we assume that any codeword in starts with , or , i.e.,
and so
We also have
and by Lemma 4, both and are codes for . Thus, , which implies that . ∎
Lemma 5
for containing all four edges , , and .
Proof:
We can easily obtain a sequence of codes for :
whose rate is
Now we consider the proof of the upper bound. Let be a sequence of codes for such that for all , achieves the largest cardinality of a code of length . Obviously,
As for any , the third bits of any two codewords in are the same, by Lemma 4, we see that is also a code for and
Thus, , i.e., . ∎
The following lemma is evident, and so the proof is omitted.
Lemma 6
For with , is a code for each , and .
Theorem 8
for .
Theorem 9
for .
Theorem 10
for .
Theorem 11
for .
Proof:
We can easily obtain a sequence of codes for :
whose rate is
Let be the graph with two edges and . Since , we have . Let be asymptotically optimal for .
If no codeword in contains the substring , then is a code for . Otherwise, we perform a sequence of substring replacements. Specifically, let be an arbitrary codeword which contains the substring 111. Then replace any one of the 111s by 101. The updated remains a code. Thus the replacement can be repeated until no codeword in the updated contains the substring 111, and therefore the final updated is a code for . Thus, ∎
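A minimal sketch of the replacement step just described (the claim that the updated set remains a code is the content of the argument above, not of this snippet): each rewrite of 111 into 101 removes a 1, so the procedure terminates with no 111 left.

```python
def remove_111(word):
    """Repeatedly replace one occurrence of '111' by '101' until none remains."""
    while '111' in word:
        i = word.index('111')
        word = word[:i] + '101' + word[i + 3:]
    return word

print(remove_111('0111100'))   # '0101100'
```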
Theorem 12
for .
Proof:
By Theorem 1, we obtain that
The proof of the upper bound is also similar to the proof of Theorem 8. Let be a sequence of codes for such that for all , achieves the largest cardinality of a code of length . Flip the first bit of any codeword in starting with and the second bit of any codeword starting with . For any codeword containing or , replace any one of 111s by 110 or 0000s by 0100. By Lemma 4(1), the final updated remains a code. Thus, we can assume that any codeword in starts with 0001, 0010, 0011, 0100 or 0110, i.e.,
and so
We also have
and
Both and are codes for . Thus, i.e., . ∎
Theorem 13
for , where is the only positive root of the equation
Proof:
We can obtain a sequence of codes for :
whose rate is
The proof of the upper bound is also similar to the proof of Theorem 4. Let be a sequence of codes for such that for all , achieves the largest cardinality of a code of length . Flip the first bit of any codeword in starting with , or , the second bit of any codeword starting with , and the first two bits of any codeword starting with . Thus, we can assume that any codeword in starts with 001, 010, 100, i.e.,
and so
By Lemma 4(2), we have
and
Both and are codes for . Thus,
i.e., . ∎
Remark: We can see that for the graphs in Cases 1 to 10, the optimal quasi 2-code constructed in Theorem 1 achieves the zero-error capacity. However, for (Case 11), the optimal quasi 2-code is , whose rate is . The capacity .
V Conclusion
In this paper, we have investigated the zero-error capacity of channels characterized by graphs containing a single edge. Previous works primarily focused on binary-input channels with one or two memories. Our study extends the analysis to channels with -ary inputs and an arbitrary number of memories. We provide a method for constructing zero-error codes for such graphs with one edge, which offers a lower bound on the zero-error capacity. For the binary channel with two memories, the zero-error codes constructed by this method have been proven to be optimal in most cases.
It could be valuable to construct a method to obtain a general upper bound for graphs with one edge. If this upper bound matches the lower bound derived in this paper, it would allow us to determine the capacity for numerous graphs of this type.
References
- [1] Q. Cao and Q. Chen, “On zero-error capacity of ‘one-edge’ binary channels with two memories,” in 2022 IEEE International Symposium on Information Theory (ISIT), 2022, pp. 2762–2767.
- [2] C. Shannon, “The zero error capacity of a noisy channel,” IRE Trans. Inf. Theory, vol. 2, no. 3, pp. 8–19, 1956.
- [3] Q. Cao and R. W. Yeung, “Zero-error capacity regions of noisy networks,” IEEE Trans. Inf. Theory, vol. 68, no. 7, pp. 4201–4223, 2022.
- [4] L. Lovász, “On the Shannon capacity of a graph,” IEEE Trans. Inf. Theory, vol. 25, no. 1, pp. 1–7, 1979.
- [5] L. Zhao and H. H. Permuter, “Zero-error feedback capacity of channels with state information via dynamic programming,” IEEE Trans. Inf. Theory, vol. 56, no. 6, pp. 2640–2650, June 2010.
- [6] M. Kovačević and P. Popovski, “Zero-error capacity of a class of timing channels,” IEEE Trans. Inf. Theory, vol. 60, no. 11, pp. 6796–6800, Nov 2014.
- [7] T. Nakano and T. Wadayama, “On zero error capacity of nearest neighbor error channels with multilevel alphabet,” in 2016 International Symposium on Information Theory and Its Applications (ISITA), Oct 2016, pp. 66–70.
- [8] R. Ahlswede, N. Cai, and Z. Zhang, “Zero-error capacity for models with memory and the enlightened dictator channel,” IEEE Trans. Inf. Theory, vol. 44, no. 3, pp. 1250–1252, May 1998.
- [9] G. Cohen, E. Fachini, and J. Körner, “Zero-error capacity of binary channels with memory,” IEEE Trans. Inf. Theory, vol. 62, no. 1, pp. 3–7, Jan 2016.
- [10] Q. Cao, N. Cai, W. Guo, and R. W. Yeung, “On zero-error capacity of binary channels with one memory,” IEEE Trans. Inf. Theory, vol. 64, no. 10, pp. 6771–6778, 2018.
- [11] I. Csiszár and J. Körner, Information Theory: Coding Theorems for Discrete Memoryless Systems, 2nd ed. Cambridge University Press, 2011.