
National Research University Higher School of Economics
Email: rubtsov99@gmail.com

A Linear-time Simulation of Deterministic $d$-Limited Automata

Alexander A. Rubtsov (ORCID: 0000-0001-8850-9749). Supported by Russian Science Foundation grant 20–11–20203.

A $d$-limited automaton is a Turing machine that may rewrite each input cell at most $d$ times. Hibbard (1967) showed that for every $d \geqslant 2$ such automata recognize all context-free languages and that deterministic $d$-limited automata form a strict hierarchy. Later, Pighizzini and Pisoni proved that the second level of this hierarchy coincides with deterministic context-free languages (DCFLs). We present a linear-time recognition algorithm for deterministic $d$-limited automata in the RAM model, thereby extending linear-time recognition beyond DCFLs. We further generalize this result to deterministic $d(n)$-limited automata, where the bound $d$ may depend on the input length $n$. In addition, we prove an $O(n \cdot k \cdot d(n) + m)$ bound for the membership problem, where the input includes both the word and the automaton's description, with $m$ denoting the size of the description and $k$ the number of states.

1 Introduction

Context-free languages (CFLs) play a central role in computer science. Their deterministic subclass (DCFLs) is especially important in compiler construction, where parsing is based on the connection between $\mathrm{LR}(1)$-grammars and deterministic pushdown automata (DPDAs). In 1965 Knuth showed [9] that $\mathrm{LR}(1)$-grammars generate exactly the class of DCFLs, and DPDAs provide linear-time parsing algorithms for them. Thus, DCFLs form a practically significant subclass of CFLs: they are recognizable in linear time, and $\mathrm{LR}(1)$-grammars admit linear-time construction of derivation trees. All these linear-time bounds, both in classical parsing theory and in this paper, are measured in the RAM model. We emphasize that the situation is different for Turing machines: Hennie showed [7] that any language recognizable in linear time on a Turing machine is regular, so stronger models such as RAM are required to capture the linear-time parsing of non-regular languages.

It is important to distinguish between two closely related problems. We follow the convention that in the recognition problem, the language is fixed and the input is only the word $w$, while in the membership problem, both a description of the language (for instance, a context-free grammar) and the word $w$ are given as input.

According to this convention, the best known upper bound for the recognition problem for context-free languages is $O(n^{\omega})$, where $\omega \leqslant 2.373$ is the exponent of fast matrix multiplication. This bound was obtained by Valiant in 1975 [20], and his algorithm decides whether a given word belongs to the language generated by a fixed context-free grammar, but it does not construct a parse tree. Subsequent work [12, 1] confirmed that the same setting — fixed grammar and variable input word — is considered, and provided strong evidence that this upper bound is hard to substantially improve.

1.1 $d$-Limited Automata and $d$-DCFLs

We next recall the notion of $d$-limited automata, introduced by Hibbard [8]. A $d$-limited automaton ($d$-LA) is a nondeterministic Turing machine that scans only the cells containing the input word together with end-markers, and is allowed to rewrite a symbol in each cell (except the end-markers) only during its first $d$ visits. Hibbard showed that for every $d \geqslant 2$, $d$-LAs recognize precisely the class of context-free languages; it is also known that $1$-LAs recognize exactly the class of regular languages [21]. For $d = \infty$, the model coincides with linear-bounded automata, which recognize the class of context-sensitive languages.

For the deterministic case, Pighizzini and Pisoni [14] proved that deterministic $2$-LAs recognize exactly the deterministic context-free languages (DCFLs) by providing an algorithm that transforms a $2$-LA into a PDA while preserving determinism; the inverse transformation had already been established by Hibbard [8]. Following Hibbard, we call a language recognized by a deterministic $d$-LA a $d$-deterministic language ($d$-DCFL) and denote such automata by $d$-DLAs. Hibbard also established that the hierarchy is strict: for every fixed $d$, the class of $(d+1)$-DCFLs properly contains the class of $d$-DCFLs.

Although DCFLs are widely used, they suffer from certain practical limitations. First, they are not closed under reversal: while both

$$L_{d} = \{d a^{n} b^{n} c^{m} \mid n, m \geqslant 0\}, \qquad L_{e} = \{e a^{m} b^{n} c^{n} \mid n, m \geqslant 0\}$$

are DCFLs, their union is a DCFL, but the reversal $(L_{d} \cup L_{e})^{R}$ is not. This language can still be recognized by a $3$-DLA, which can scan the input from right to left before simulating a $2$-DLA for $L_{d} \cup L_{e}$.

Second, DCFLs are not closed under union. The language

$$L_{d,e} = \{a^{n} b^{n} c^{m} \mid n, m \geqslant 0\} \cup \{a^{m} b^{n} c^{n} \mid n, m \geqslant 0\}$$

is a union of two DCFLs but is known to be inherently ambiguous [17]. Every DCFL can be generated by an unambiguous grammar, in particular by an $\mathrm{LR}(1)$ grammar, and Hibbard showed [8] that the same holds for $d$-DCFLs. Hence $L_{d,e}$ is not a $d$-DCFL for any $d$, so the union $\bigcup_{d \geqslant 1} d\text{-DCFL}$ does not cover all CFLs. This illustrates that $d$-DCFLs, like DCFLs, are not closed under union; recognizing such languages in linear time via $d$-DLAs requires parallelism.

1.2 $d(n)$-Limited Automata

We now consider a natural extension of $d$-LAs. Assume that $d$ is not a constant but a function $d(n)$ depending on the input length $n$. The automaton can then rewrite the content of a cell until the number of visits to that cell reaches $d(n)$. To our knowledge, this is the first time such a generalization has been considered. In classical restrictions on Turing machine computation, the time bound is imposed on the total number of cell visits, whereas here we impose a bound on the number of visits per individual cell (after which the cell remains accessible for reading but no longer for rewriting).

1.3 Our Contribution

We focus on the membership problem for $d(n)$-DLAs. Let $m$ be the length of the description of a $d(n)$-DLA ${\cal A}$, $k$ the number of its states, and $n$ the length of the input word $w$. We present an $O(n \cdot k \cdot d(n) + m)$ algorithm in the RAM model for the membership problem. In particular, when ${\cal A}$ is fixed and $d(n) = O(1)$ (i.e., for $d$-DLAs), this yields a linear-time algorithm. Thus every $d$-DCFL is recognizable in linear time in the RAM model.

Hennie proved in [7] that every language recognizable in linear time by a deterministic Turing machine is regular (and a more general result holds for nondeterministic TMs [19, 13]). Hence no $d$-DLA with $d \geqslant 2$ recognizing a non-regular language can be simulated by a linear-time Turing machine. Guillon and Prigioniero [6] showed that every $1$-DLA can be transformed into an equivalent linear-time Turing machine (and an analogous result holds for nondeterministic $1$-LAs), and a related construction was also used by Kutrib and Wendlandt [11]. Their approach relies on Shepherdson's classical simulation of two-way DFAs by one-way DFAs [18]. We build on this idea as well, but it cannot be applied directly because of Hennie's result: otherwise one could recognize a non-regular language in linear time on a Turing machine, contradicting the theorem.

To overcome this obstacle we transform a classical Turing machine into one that operates on a doubly-linked list instead of a tape and adapt Shepherdson’s construction to this model. This forms the basis of our membership algorithm. We also reinterpret Birget’s algebraic constructions [3] in graph-theoretic terms, which provides subroutines underlying the final version of our algorithm. A related algebraic approach was developed by Kunc and Okhotin [10], who employed transformation semigroups to capture, for each substring, the state-to-state behavior of two-way deterministic finite automata. This line of work closely mirrors Birget’s method via function composition, and our construction follows the same underlying ideas.

We further establish an upper bound of $O(d(n) \cdot n^{2})$ steps for direct simulation of $d(n)$-DLAs, provided the computation does not enter an infinite loop. In particular, this implies an $O(n^{2})$ upper bound for $d$-DLAs, which, to the best of our knowledge, was previously open. This bound is tight: it is witnessed by the classical $d$-DLA recognizing the language $\{a^{n} b^{n} \mid n \geqslant 0\}$.

From a theoretical perspective, our results show that some CFLs are easy (linear-time recognizable), while in general recognition of CFLs may require $O(n^{\omega})$ time, and by conditional results there must exist hard CFLs (recognizable only in superlinear time). We discuss this point in the following subsection.

1.4 Related Results

We begin with linear-time recognizable subclasses of context-sensitive languages (CSLs) and context-free languages (CFLs). E. Bertsch and M.-J. Nederhof [2] showed that a nontrivial subclass of CFLs, the regular closure of DCFLs, is recognizable in linear time. This class consists of all languages obtained by taking a regular expression and replacing each symbol with a DCFL. It evidently contains the aforementioned language $L_{d,e}$ (as a union of DCFLs), so it is a strict extension of DCFLs. Note also that $L_{d,e}$ is recognizable by 2-DPDAs.

A broad subclass of CSLs recognizable in linear time is given by the class of languages accepted by two-way deterministic pushdown automata (2-DPDAs). A linear-time simulation algorithm for 2-DPDAs was obtained by S. Cook [4]. This class clearly contains DCFLs, and it also includes the language of palindromes over an alphabet of at least two letters, which is a well-known example of a context-free language that is not a DCFL.

A. Rubtsov and N. Chudinov introduced in [15, 16] a computational model, DPPDA, for Parsing Expression Grammars (PEGs). This model extends DCFLs, remains recognizable in linear time, and is based on a modification of classical pushdown storage. It was also shown that the class of languages recognized by 2-DPPDAs is recognizable in linear time. Moreover, they proved that parsing expression languages (the class generated by PEGs) contain a highly nontrivial subclass, namely the Boolean closure of the regular closure of DCFLs.

It remains open whether 2-DPDAs, 2-DPPDAs, or PEGs recognize all CFLs. However, the works of L. Lee [12] and Abboud et al. [1] provide strong evidence that this is very unlikely due to complexity-theoretic considerations: any CFG parser with time complexity $O(g n^{3-\varepsilon})$, where $g$ is the size of the grammar and $n$ the input length, can be efficiently converted into an algorithm for multiplying $m \times m$ Boolean matrices in time $O(m^{3-\varepsilon/3})$. This naturally raises the question: can 2-DPDAs or 2-DPPDAs simulate $d$-DLAs for $d \geqslant 3$?

Finally, T. Yamakami presented another extension of Hibbard's approach [22, 23]: in several models the input tape is one-way read-only, while the work tape obeys a similar restriction, forbidding rewriting beyond the first $d$ visits. We leave the application of our simulation technique to such models as a direction for future research.

2 Definitions

In this section we give precise definitions of the computational models used in the paper. We begin with Hibbard's original model of $d$-limited automata, introduced in [8]. We provide below a concise definition; a more formal equivalent definition can be found in [14]. Next, we introduce a modified variant, called the deleting automaton, in which the tape is replaced by a doubly linked list and cells may be deleted; this auxiliary model is central to our simulation technique. Finally, we define $d(n)$-limited automata, a natural extension of $d$-limited automata where the rewriting bound depends on the length of the input word. For clarity, we formulate our simulation algorithm for the case of a fixed constant $d$, since the extension to $d(n)$ is straightforward.

2.1 $d$-Limited Automaton

Let $d \geqslant 0$ be a fixed integer. A deterministic $d$-limited automaton ($d$-DLA) is a deterministic single-tape Turing machine whose tape initially contains the input word bordered by the left endmarker $\vartriangleright$ and the right endmarker $\vartriangleleft$. Each tape symbol is annotated with an integer in $\{0, \dots, d\}$, called its rank. Initially, all letters of the input word have rank 0, while the endmarkers have rank $d$. Whenever the head visits a cell containing a symbol of rank $r < d$, the symbol may be overwritten with a new symbol of rank $r'$ such that $r < r' \leqslant d$. Symbols of rank $d$ are read-only and cannot be changed.

Formally, a $d$-DLA ${\cal A}$ is defined by a tuple

$${\cal A} = (Q, \Sigma, \Gamma, \delta, q_{0}, F),$$

where $Q$ is a finite set of states, $\Sigma$ is the input alphabet (symbols of rank 0), $\Gamma$ is the tape alphabet with $\Sigma \cup \{\vartriangleright, \vartriangleleft\} \subseteq \Gamma$, for each $r$ we let $\Gamma_{r} \subseteq \Gamma$ denote the set of symbols of rank $r$, $q_{0} \in Q$ is the initial state, $F \subseteq Q$ is the set of accepting states, and $\delta$ is the transition function

$$\delta : Q \times \Gamma \to Q \times \Gamma \times \{\leftarrow, \rightarrow\},$$

such that:

  • for $a_{r} \in \Gamma_{r}$ with $r < d$, each transition is of the form

    $$\delta(q, a_{r}) = (q', a_{r'}, m),$$

    where $a_{r'} \in \Gamma_{r'} \setminus \{\vartriangleright, \vartriangleleft\}$, $r < r' \leqslant d$, and $m \in \{\leftarrow, \rightarrow\}$;

  • for $a_{d} \in \Gamma_{d}$, each transition is of the form

    $$\delta(q, a_{d}) = (q', a_{d}, m),$$

    where $m \in \{\leftarrow, \rightarrow\}$; moreover, if $a_{d} = \vartriangleright$ then $m = \rightarrow$, and if $a_{d} = \vartriangleleft$ then $m = \leftarrow$.

A $d$-DLA starts in state $q_{0}$ with the head positioned on the first input symbol (immediately to the right of the left endmarker $\vartriangleright$). At each step, given a state $q$ and the symbol $a$ under the head, it computes $\delta(q, a) = (q', a', m)$, replaces $a$ with $a'$, updates its state to $q'$, and moves the head left or right according to $m$. The input is accepted if the automaton reaches a state $q_{f} \in F$ while scanning the right endmarker $\vartriangleleft$; in this case the computation halts.

We slightly modify Hibbard’s original definition by requiring the transition function to be total. This modification does not change the class of recognizable languages and comes at the cost of a single additional state.
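The step semantics above can be sketched in a few lines of Python. The encoding is ours, not the paper's: $\delta$ is a dict from (state, symbol) to (state, symbol, move) with moves $-1/+1$, ranks are stored in a dict, and a step cutoff stands in for a proper loop check.

```python
# A minimal sketch of the d-DLA semantics of Section 2.1 (our encoding, not the
# paper's). Endmarkers and rank-d symbols are read-only; every rewrite must
# strictly increase the rank, up to d.

LEFT_END, RIGHT_END = ">", "<"

def run_dla(delta, d, ranks, word, q0, accepting, max_steps=10_000):
    """Run a deterministic d-limited automaton on `word`; True iff accepted."""
    tape = [LEFT_END] + list(word) + [RIGHT_END]
    q, i = q0, 1  # the head starts on the first input symbol
    for _ in range(max_steps):
        a = tape[i]
        if q in accepting and a == RIGHT_END:
            return True  # accepting state while scanning the right endmarker
        q, b, move = delta[(q, a)]
        if a in (LEFT_END, RIGHT_END) or ranks[a] == d:
            assert b == a, "rank-d symbols and endmarkers are read-only"
        else:
            assert ranks[a] < ranks[b] <= d, "rewriting must increase the rank"
        tape[i] = b
        i += move
    return False  # cut off: in this sketch a loop counts as rejection

# Toy 2-DLA over {a}: rewrite each `a` to a rank-2 symbol `A` while tracking
# parity; it accepts exactly the words of even length.
DELTA = {
    ("even", "a"): ("odd", "A", +1), ("odd", "a"): ("even", "A", +1),
    ("even", "A"): ("even", "A", +1), ("odd", "A"): ("odd", "A", -1),
    ("odd", "<"): ("odd", "<", -1), ("odd", ">"): ("odd", ">", +1),
}
RANKS = {"a": 0, "A": 2}
```

On odd-length inputs the toy automaton ends up shuttling over read-only cells and never reaches an accepting configuration, which the cutoff turns into a rejection.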

2.2 Deterministic Linked-List Automaton

We next introduce an auxiliary model, which we call the Deterministic Linked-List Automaton (DLLA). In this model there is no constraint on the number of visits to a cell. The tape is replaced by a doubly linked list, so the automaton may delete any cell between the endmarkers (but never the endmarkers themselves). Formally, a DLLA has the same components as a $d$-DLA (states, input and tape alphabets, initial and accepting states), but with a modified transition function:

$$\delta : Q \times \Gamma \to \bigl(Q \times (\Gamma \cup \{\perp\}) \times \{\leftarrow, \rightarrow\}\bigr) \cup \{\uparrow\},$$

where the special symbol $\perp$ indicates that the current cell is to be deleted immediately after the head leaves it. Once a cell is deleted, the head moves directly from its left neighbor to its right neighbor when moving right, and symmetrically when moving left. If the transition function returns $\uparrow$, the computation halts and the input is rejected. No additional restrictions are imposed on the transition function.

It is easy to see that DLLAs recognize exactly the class of deterministic context-sensitive languages. They can simulate deterministic linear-bounded automata (DLBA) directly, and conversely, a DLBA can simulate a DLLA by marking a deleted cell with a special symbol. We employ the doubly linked list representation in order to achieve the claimed upper bounds for deterministic d(n)d(n)-DLAs.

2.3 $d(n)$-Limited Automaton

For $d$-DLAs it is convenient to associate with each letter $a_{r}$ its rank $r$, representing the number of visits to the corresponding cell. For $d(n)$-DLAs this is no longer possible, since the alphabet and the machine description are fixed and cannot grow with the input length. Instead, in a $d(n)$-DLA each tape cell maintains its own visit counter, so the rank is associated with the cell rather than the symbol. Formally, each cell (except the endmarkers $\vartriangleright, \vartriangleleft$) contains a pair $(a, e)$, where $a \in \Gamma$ and $e$ is a bit. The bit $e$ is initially 0, and remains 0 while the number of visits to the cell is less than $d(n)$; once the number of visits reaches $d(n)$, $e$ is set to 1, and from that point on the cell becomes read-only.

This modification of $d$-DLAs does not affect our simulation algorithm. The value of $d(n)$ is fixed for a given input length $n$, and the algorithm never relies on $d$ being a constant rather than a precomputed parameter. Therefore, it suffices to describe the simulation algorithm for $d$-DLAs; the extension to $d(n)$-DLAs is straightforward.
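A possible concrete reading of such a cell is the following sketch; the names (`Cell`, `visit`, `frozen`) are ours and chosen only for illustration.

```python
# Sketch of a d(n)-DLA tape cell: the visit count lives in the cell, not in the
# symbol, and the bit e (`frozen` here) makes the cell read-only once the
# number of visits reaches d(n).

class Cell:
    def __init__(self, symbol):
        self.symbol = symbol
        self.visits = 0
        self.frozen = False  # the bit e: 0 while visits < d(n), then 1

    def visit(self, d_n, new_symbol=None):
        """Register one head visit; a rewrite is applied only while e = 0."""
        if self.frozen:
            return  # read-only: the symbol can still be read, never rewritten
        self.visits += 1
        if new_symbol is not None:
            self.symbol = new_symbol
        if self.visits >= d_n:
            self.frozen = True
```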

3 Linear-Time Simulation Algorithm

In this section we present a linear-time simulation algorithm for deterministic $d$-limited automata ($d$-DLAs). We begin with the recognition problem, where the automaton ${\cal A}$ is fixed and only the input word $w$ of length $n$ varies. Later we extend the construction to the membership problem, where both the automaton and the word are part of the input.

High-level idea. Our approach is inspired by Shepherdson's classical simulation of two-way deterministic finite automata by one-way DFAs [18]. The key observation is that whenever a $d$-DLA ${\cal A}$ produces a block of cells all of rank $d$, the precise contents of these cells are no longer relevant: what matters is only how ${\cal A}$ can enter and leave this block. We therefore compress each such maximal block into a single cell containing a compact mapping that summarizes the block's effect on the computation. If two adjacent blocks are compressed, their mappings can be merged by composition.

To implement this idea, we simulate ${\cal A}$ by a deterministic linked-list automaton $M$ that uses a doubly linked list in place of a tape. The machine $M$ performs two types of steps:

  • ${\cal A}$-moves, which directly simulate moves of ${\cal A}$ on symbols of rank $<d$;

  • technical moves, which occur when $M$ encounters a compressed block. In this case $M$ consults the mapping stored in the corresponding cell to decide how ${\cal A}$ would leave the block, and moves accordingly.

In this way, long stretches of redundant rank-$d$ cells are collapsed into constant-size summaries, ensuring that each cell contributes only a bounded number of times to the overall running time.

In the remainder of this section we first present the simulation algorithm for recognition, together with a correctness proof and amortized analysis. We then introduce the technical machinery of mappings and composition, which allows us to extend the algorithm to the membership problem and to prove the claimed bound.

3.1 Preparations

Directed states.

For convenience, we write $\overrightarrow{p}$ (resp. $\overleftarrow{p}$) to denote a state $p$ entered from the left (resp. right). Formally, if $\delta(q, X) = (p, Y, m)$ with $m \in \{\leftarrow, \rightarrow\}$, we abbreviate it as $\delta(q, X) = (\overleftrightarrow{p}, Y)$, where we write $\overleftrightarrow{p} = (p, m)$. We call such pairs directed states, and use this notation for both $d$-DLAs and DLLAs. We write

$$\overleftarrow{Q} = Q \times \{\leftarrow\}, \quad \overrightarrow{Q} = Q \times \{\rightarrow\}, \quad \overleftrightarrow{Q} = \overleftarrow{Q} \cup \overrightarrow{Q},$$
$$A_{\uparrow} = A \cup \{\uparrow\} \quad \text{for } A \in \{\overleftarrow{Q}, \overrightarrow{Q}, \overleftrightarrow{Q}\}.$$

Mappings.

The key idea is to collapse long segments of rank-$d$ cells into a single object. When ${\cal A}$ makes the $d$-th visit to a cell and writes a symbol of rank $d$, we replace that cell by a mapping $f$. Formally, a mapping is a function

$$f : \overleftrightarrow{Q} \to \overleftrightarrow{Q}_{\uparrow};$$

it specifies, given the entry state and the entry direction, the exit state and the exit direction when ${\cal A}$ leaves the block (or $\uparrow$ if it never does). For a mapping describing a single rank-$d$ cell, the entry direction is irrelevant, but for multi-cell segments it matters.

Segment traversal.

We denote by $W_{\cal A}[i]$ the $i$-th cell of ${\cal A}$'s tape and by

$$W_{\cal A}[l, r] = W_{\cal A}[l] \cdots W_{\cal A}[r-1]$$

the segment of ${\cal A}$'s tape. When we say that the head enters the segment $W_{\cal A}[l, r]$ in the directed state $\overleftrightarrow{q}$, we mean that $\overleftrightarrow{q} = \overrightarrow{q}$ corresponds to the head entering $W_{\cal A}[l]$ from the left, and $\overleftrightarrow{q} = \overleftarrow{q}$ corresponds to entering $W_{\cal A}[r-1]$ from the right. Symmetrically, the head leaves the segment $W_{\cal A}[l, r]$ in the directed state $\overleftrightarrow{p}$, where $\overleftrightarrow{p} = \overrightarrow{p}$ means that the head exits through $W_{\cal A}[r-1]$ to the right, and $\overleftrightarrow{p} = \overleftarrow{p}$ means that it exits through $W_{\cal A}[l]$ to the left.

Segment description mappings.

We say that a mapping $f$ describes a segment $W_{\cal A}[l, r]$ if

  • all letters in $W_{\cal A}[l, r]$ have rank $d$;

  • $f(\overleftrightarrow{q}) = \overleftrightarrow{p}$ if, whenever the head enters the segment $W_{\cal A}[l, r]$ in the directed state $\overleftrightarrow{q}$, it leaves the segment in the directed state $\overleftrightarrow{p}$;

  • $f(\overleftrightarrow{q}) = \uparrow$ if, whenever the head enters the segment $W_{\cal A}[l, r]$ in the directed state $\overleftrightarrow{q}$, it never leaves the segment (i.e., the computation loops inside).

We denote the set of all possible mappings by

$${\cal F} = \{f \mid f : \overleftrightarrow{Q} \to \overleftrightarrow{Q}_{\uparrow}\}.$$

Operations with mappings.

Let $f$ and $g$ describe segments $W_{\cal A}[L, r]$ and $W_{\cal A}[r, R]$, respectively. We define the directed composition $\diamond$ of mappings by setting $h = f \diamond g$ whenever $h$ describes the concatenated segment $W_{\cal A}[L, R]$. Assume now that the head either enters the segment $W_{\cal A}[r, R]$ in the directed state $\overrightarrow{q}$, or enters $W_{\cal A}[L, r]$ in $\overleftarrow{q}$. We define the departure function

$$D : {\cal F} \times {\cal F} \times \overleftrightarrow{Q} \to \overleftrightarrow{Q}_{\uparrow},$$

which, given mappings $f, g$ and an entry state $\overleftrightarrow{q}$, returns the directed state $\overleftrightarrow{p}$ in which ${\cal A}$ leaves the concatenated segment $W_{\cal A}[L, R]$. If the head never leaves $W_{\cal A}[L, R]$, then $D(f, g, \overleftrightarrow{q}) = \uparrow$.

Finally, we denote by $CF : \Gamma_{d} \to {\cal F}$ the function that, for each symbol $X \in \Gamma_{d}$, returns the mapping $CF(X)$ describing the one-cell segment consisting only of $X$.

Since $Q$ is finite, the set ${\cal F}$ of mappings is finite as well; therefore, for a fixed automaton ${\cal A}$, the function $CF$, the departure function $D$, and the operation $\diamond$ are all computable in constant time. In the general case (varying ${\cal A}$) we later show that they are computable in time $O(|Q_{\cal A}|)$. The operation $\diamond$ and the departure function $D$ are well-defined; this follows directly from the definitions, and we will later give a formal justification.
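To make the bookkeeping concrete, here is a sketch of how $\diamond$ and $D$ can be computed by "bouncing" between the two segment mappings. The encoding is ours: a directed state is a pair (state, direction) with "R" for $\rightarrow$ and "L" for $\leftarrow$, a mapping is a dict, and `None` plays the role of $\uparrow$. A repeated (segment, directed state) pair witnesses an infinite loop, which bounds the work per entry state by the number of directed states.

```python
# Sketch: computing the directed composition f ◇ g and the departure function D
# for two adjacent segment mappings (f on the left, g on the right).

def departure(f, g, dstate):
    """D(f, g, ·): exit of the union, entering g in (q, "R") or f in (q, "L")."""
    return _exit(f, g, dstate, in_left=(dstate[1] == "L"))

def compose(f, g, states):
    """f ◇ g: the mapping describing the concatenation of the two segments."""
    return {(q, d): _exit(f, g, (q, d), in_left=(d == "R"))
            for q in states for d in ("L", "R")}

def _exit(f, g, dstate, in_left):
    seen = set()
    while (in_left, dstate) not in seen:
        seen.add((in_left, dstate))
        res = (f if in_left else g)[dstate]
        if res is None:                  # the head loops inside one segment
            return None
        state, move = res
        if in_left and move == "L":      # leaves the union to the left
            return (state, "L")
        if not in_left and move == "R":  # leaves the union to the right
            return (state, "R")
        dstate, in_left = (state, move), not in_left  # crosses the boundary
    return None  # a configuration repeats: the head bounces forever (↑)
```

For instance, composing two "pass-through" mappings yields a pass-through mapping, while a left segment that reflects rightward next to a right segment that reflects leftward yields $\uparrow$.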

3.2 Recognition Problem

We now present a high-level simulation algorithm for the recognition problem. Detailed constructions of the subroutines will be given later in Subsection 3.3, where the membership problem is analyzed. The algorithm below both formally defines the DLLA $M$ that simulates ${\cal A}$ and, at the same time, serves as the procedure for simulating $M$ in the RAM model.

Simulation Algorithm 1 is given in pseudocode; its description in natural language is as follows. Since the DLLA $M$ deletes cells during its run, we refer to the current left and right neighbors of the $i$-th cell as $i.\mathsf{prev}$ and $i.\mathsf{next}$, respectively. When we write $j = i.\mathsf{prev}$ we assume that both $i$ and $j$ refer to indices of ${\cal A}$'s tape cells; we do not re-enumerate the cells of $M$'s tape.

Thus, we denote the $i$-th cell of $M$'s tape by $W_{M}[i]$. If $W_{\cal A}[i]$ contains a letter of rank less than $d-1$, then $W_{M}[i] = W_{\cal A}[i]$, and $M$ behaves exactly as ${\cal A}$ (an ${\cal A}$-move). When $M$ visits a cell for the $d$-th time, where ${\cal A}$ would write a symbol $X$ of rank $d$ (not an end-marker), $M$ instead writes to that cell the mapping $g = CF(X)$. When $M$ writes a mapping $g$ into a cell $i$ for the first time, it scans the neighbors $i.\mathsf{prev}$ and $i.\mathsf{next}$ and performs a procedure that we call a deletion scan.

  • If neither neighbor contains a mapping, the scan is finished.

  • If only one of the neighbors $i.\mathsf{prev}$ or $i.\mathsf{next}$ contains a mapping (say $f$ or $h$, respectively), then $M$ replaces the content of cell $i$ with $f \diamond g$ (or $g \diamond h$) and deletes that neighboring cell.

  • If both neighbors contain mappings, then $M$ replaces the content of cell $i$ with $f \diamond g \diamond h$ and deletes both neighboring cells.

After a deletion scan the cell $W_{M}[i]$ contains the resulting mapping, say $g$, while both of its neighbors contain letters. Hence $g$ describes the segment $W_{\cal A}[i.\mathsf{prev}+1,\, i.\mathsf{next}]$. $M$ then moves the head to the neighbor ($i.\mathsf{prev}$ or $i.\mathsf{next}$) at which ${\cal A}$ would arrive after leaving the rank-$d$ segment $W_{\cal A}[i.\mathsf{prev}+1,\, i.\mathsf{next}]$. If the head of ${\cal A}$ does not exit this segment immediately after visiting $W_{\cal A}[i]$, the cell at which ${\cal A}$ arrives after it exits the segment is determined via the departure function $D$ during the deletion scan.
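Structurally, the deletion scan amounts to at most two $O(1)$ unlink operations on the doubly linked list. The sketch below uses our own names: `prev` and `nxt` are the link arrays of the tape, $\diamond$ is passed in as a black box `diamond`, and Python lists stand in for mappings merely to make the merging visible.

```python
# Sketch of the deletion scan: cell i has just received a fresh mapping g; any
# neighboring mapping is absorbed via ◇ and its cell unlinked in O(1).
# `is_mapping` distinguishes mappings from letters.

def deletion_scan(tape, prev, nxt, i, g, is_mapping, diamond):
    l = prev[i]
    if is_mapping(tape[l]):
        g = diamond(tape[l], g)        # g := f ◇ g
        prev[i] = prev[l]              # unlink the left neighbor
        nxt[prev[i]] = i
    r = nxt[i]
    if is_mapping(tape[r]):
        g = diamond(g, tape[r])        # g := g ◇ h
        nxt[i] = nxt[r]                # unlink the right neighbor
        prev[nxt[i]] = i
    tape[i] = g  # the merged mapping now describes the whole segment around i
```

In the full algorithm the directed state is also updated via the departure function $D$ before each merge; we omit that here.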

We have thus described the cases in which $M$ arrives:

  • at a cell of rank $r < d$ (from any neighbor), and

  • at a cell of rank $d$ (from another rank-$d$ cell).

It remains to describe the case in which $M$ arrives at a cell containing a mapping $f$ in a directed state $\overleftrightarrow{q}$, coming from a cell with a letter of rank $r < d$ (the last case of Algorithm 1). In this case $M$ computes $f(\overleftrightarrow{q}) = \overleftrightarrow{p}$ and moves the head to the left or right neighbor according to the direction of $\overleftrightarrow{p}$, arriving at that neighbor in state $p$.

$\overleftrightarrow{q} := \overleftarrow{q_{0}}$;  $i := 1$;
while no result returned do
    case $W_{M}[i] \in \Gamma_{r} \cup \{\vartriangleright, \vartriangleleft\}$ with $r < d-1$ do   /* ${\cal A}$-move */
        $(\overleftrightarrow{p}, a_{r'}) := \delta_{\cal A}(q, W_{M}[i])$;   $W_{M}[i] := a_{r'}$;
    case $W_{M}[i] \in \Gamma_{d-1}$ do   /* deletion scan */
        $(\overleftrightarrow{p}, X) := \delta_{\cal A}(q, W_{M}[i])$;
        $g := CF(X)$;   /* $\langle CF \rangle({\cal A})$ */
        if $i.\mathsf{prev} > 0$ and $W_{M}[i.\mathsf{prev}] \in {\cal F}$ then
            $f := W_{M}[i.\mathsf{prev}]$;
            if $\overleftrightarrow{p} = \overleftarrow{p}$ then $\overleftrightarrow{p} := D(f, g, \overleftarrow{p})$;   /* $\langle D \rangle({\cal A})$ */
            if $\overleftrightarrow{p} = \,\uparrow$ then return Reject;
            $g := f \diamond g$;   $i.\mathsf{prev} := (i.\mathsf{prev}).\mathsf{prev}$;   /* $\langle \diamond \rangle({\cal A})$ */
        end if
        if $i.\mathsf{next} < n+1$ and $W_{M}[i.\mathsf{next}] \in {\cal F}$ then
            $h := W_{M}[i.\mathsf{next}]$;
            if $\overleftrightarrow{p} = \overrightarrow{p}$ then $\overleftrightarrow{p} := D(g, h, \overrightarrow{p})$;   /* $\langle D \rangle({\cal A})$ */
            if $\overleftrightarrow{p} = \,\uparrow$ then return Reject;
            $g := g \diamond h$;   $i.\mathsf{next} := (i.\mathsf{next}).\mathsf{next}$;   /* $\langle \diamond \rangle({\cal A})$ */
        end if
        $W_{M}[i] := g$;
    case $W_{M}[i] \in {\cal F}$ do
        $f := W_{M}[i]$;
        if $f(\overleftrightarrow{q}) = \,\uparrow$ then return Reject else $\overleftrightarrow{p} := f(\overleftrightarrow{q})$;
    end case
    if $\overleftrightarrow{p} = \overleftarrow{p}$ then $i := i.\mathsf{prev}$ else $i := i.\mathsf{next}$;
    $\overleftrightarrow{q} := \overleftrightarrow{p}$;
    if $q \in F$ and $W_{M}[i] = \,\vartriangleleft$ then return Accept;
end while

Algorithm 1: Simulation Algorithm

Before proving correctness we fix notation for time indices. Let $W_{\cal A}^{t}[i]$ denote the content of cell $i$ of ${\cal A}$'s tape after $t$ steps of ${\cal A}$, and let $W_{M}^{t'}[i]$ denote the content of cell $i$ of $M$'s tape after $t'$ steps of $M$. We consider only regular steps, i.e., steps performed on symbols of rank $<d$ or on endmarkers. Every regular step $t$ of ${\cal A}$ has a corresponding regular step $t'$ of $M$, and the mapping $t \mapsto t'$ is strictly increasing (if $t_{1} < t_{2}$ then $t'_{1} < t'_{2}$). This correspondence will be used below.

Lemma 1

For each $d$-DLA ${\cal A}$, the corresponding DLLA $M$ simulates ${\cal A}$. More precisely, there exists an order-preserving correspondence $t \mapsto t'$ such that:

  • (i) for every regular step $t$ of ${\cal A}$ with the corresponding step $t'$ of $M$, if ${\cal A}$ visits cell $i$ with a symbol of rank $<d$ or an endmarker, then $W_{\cal A}^{t}[i] = W_{M}^{t'}[i]$;

  • (ii) at steps $t$ and $t'$, ${\cal A}$ and $M$ are in the same state when arriving at cell $i$;

  • (iii) $M$ accepts an input iff ${\cal A}$ does.

Proof

Let ${\cal A}$ perform $N$ moves

$$\delta_{1}, \ldots, \delta_{N}, \quad \delta_{i} \in Q \times \Gamma \times \{\leftarrow, \rightarrow\}, \quad \delta_{1} = \delta(q_{0}, W^{0}_{\cal A}[1]) \qquad (1)$$

on a fixed input, and either accepts the input or enters a loop.

We call a move $\delta_{i} = (q, a, m)$ a $d$-move if $a$ has rank $d$ but is not an endmarker; otherwise we call it a regular move. Thus a run is a sequence (1) partitioned into alternating segments of regular moves and $d$-moves.

For $M$ we define runs analogously: regular moves are the same, while a $d$-move is either a step into a cell containing a mapping, or a step of the deletion scan initiated on a cell with a symbol of rank $d-1$.

We claim:

  • (i) if we delete all $d$-moves from the runs of ${\cal A}$ and $M$, the resulting sequences of regular moves are identical;

  • (ii) after each maximal block of $d$-moves, both automata end in the same cell and in the same state.

From these two properties it follows that $M$ accepts exactly the same words as ${\cal A}$, because accepting configurations are reached by regular moves only. The correspondence $t \mapsto t'$ between regular steps of ${\cal A}$ and $M$ is then precisely the index matching described before the lemma.

Note that property (i) follows immediately from (ii), since between two blocks of dd-moves both automata perform the same sequence of regular moves. Indeed, once the heads are in the same cell with the same symbol of rank <d<d and in the same state, the subsequent regular moves of MM coincide with those of 𝒜{\cal A} by construction.

It remains to prove (ii). Assume that before some dd-block both 𝒜{\cal A} and MM are in the same state on the same cell. We distinguish two cases for the first move of the dd-block of 𝒜{\cal A}:

Case 1: the head visits a cell ii containing a symbol aa of rank d1d-1. Then 𝒜{\cal A} rewrites aa to a symbol XX of rank dd. In the corresponding move MM writes into ii the mapping fXf_{X} describing the one-cell segment at ii. If the dd-block of 𝒜{\cal A} ends immediately (or on the next step 𝒜{\cal A} visits another rank-(d1)(d-1) cell), then by the definition of fXf_{X} both automata end in the same cell and state. Otherwise, 𝒜{\cal A} proceeds into a neighboring rank-dd cell and eventually leaves the contiguous rank-dd segment. By construction of MM, after the deletion scan and subsequent use of the departure function DD, MM arrives at the same cell and in the same state as 𝒜{\cal A}. Thus both automata synchronize at the end of the dd-block.

Case 2: the head of 𝒜{\cal A} enters a cell of rank dd. Hence 𝒜{\cal A} moves inside a segment of contiguous rank-dd symbols, while MM is positioned at a cell containing a mapping ff describing this segment (the invariant maintained by the deletion scan). Since ff faithfully describes the segment, after MM executes the step f(q)f(\overleftrightarrow{q}), it reaches exactly the same exit cell and state as 𝒜{\cal A}. If 𝒜{\cal A} then continues into a rank-(d1)(d-1) cell, we return to Case 1; otherwise the dd-block ends with synchronization.

Finally, if in either case 𝒜{\cal A} enters a loop, the corresponding mapping for MM returns \uparrow, and MM rejects the input. Thus (ii) holds, which completes the proof. ∎

Now we prove that the simulation algorithm for dd-DLAs works in linear time. We present the proof for the general case of a d(n)d(n)-DLA. The simulation algorithm for d(n)d(n)-DLAs is identical to that for a fixed dd, since its behavior depends only on whether the number of visits to a cell is equal to d(n)1d(n)-1 or smaller. The counters for cell visits required in the case of d(n)d(n)-DLAs can be implemented in the RAM model with O(1)O(1) overhead per operation, and thus do not affect the asymptotic running time.

We denote by F(𝒜)\langle F\rangle({\cal A}) the time complexity of the operation FF on a dd-DLA 𝒜{\cal A}. Since we will prove later that the complexity of each auxiliary step depending on 𝒜{\cal A} is O(|Q|)O(|Q|), we replace CF(𝒜)\langle CF\rangle({\cal A}), D(𝒜)\langle D\rangle({\cal A}), and (𝒜)\langle\diamond\rangle({\cal A}) by

UB(𝒜)=CF(𝒜)+D(𝒜)+(𝒜).\langle U\!B\rangle({\cal A})\;=\;\langle CF\rangle({\cal A})\;+\;\langle D\rangle({\cal A})\;+\;\langle\diamond\rangle({\cal A}). (2)
Lemma 2

The automaton MM performs O(d(n)UB(𝒜)n)O(d(n)\cdot\langle U\!B\rangle({\cal A})\cdot n) steps on processing an input of length nn.

Proof

We use amortized analysis [5], namely the accounting method. Each cell i{1,,n}i\in\{1,\ldots,n\} on MM’s tape has its own budget B[i]B[i] (credit, in the terminology of [5]). We denote by Bt[i]B^{t}[i] the value of the budget of cell ii after step tt.

The budgets are updated according to the following rules:

  • B0[i]=2d(n)B^{0}[i]=2d(n) for all ii;

  • Bt[i]=Bt1[i]1B^{t}[i]=B^{t-1}[i]-1 if at step tt the head visits cell ii and this cell still contains a letter (i.e., it has been visited fewer than d(n)d(n) times);

  • Bt[j]=Bt1[j]1B^{t}[j]=B^{t-1}[j]-1 if at step tt the head enters cell ii from cell jj, where WM[i]W_{M}[i] is a mapping and WM[j]W_{M}[j] is a letter;

  • Bt[i]=Bt1[i]B^{t}[i]=B^{t-1}[i] otherwise.

Budgets are not changed during deletion scans.

Fix a step tt. Suppose that the ii-th cell currently contains a mapping f=WMt[i]f=W^{t}_{M}[i] describing a segment W𝒜[l,r]W_{\cal A}[l,r]. Its neighbors WM[l1]=W𝒜[l1]W_{M}[l-1]=W_{\cal A}[l-1] and WM[r+1]=W𝒜[r+1]W_{M}[r+1]=W_{\cal A}[r+1] have rank <d(n)<d(n), so each has been visited fewer than d(n)d(n) times. When the head moves from a neighbor into the segment, that neighbor pays $1. Until the neighbor itself turns into a mapping, it pays at most d(n)d(n) times for such visits. Once it becomes a mapping, further payments are taken over by the new neighbor. Thus each cell pays $1 for each of its own visits and at most $1 for each visit into an adjacent mapping before it itself turns into a mapping. Since this can happen only after at most d(n)d(n) visits of the cell, each cell pays at most 2d(n)2d(n) dollars in total. Therefore, the described budget strategy guarantees Bt[i]0B^{t}[i]\geq 0 for all tt and ii.

Deletion scans were not counted above. Clearly there are at most O(n)O(n) scans, since each cell can initiate at most one. During one scan, at most two directed compositions are computed, each in O(UB(𝒜))O(\langle U\!B\rangle({\cal A})). Hence all deletion scans together cost O(nUB(𝒜))O(n\cdot\langle U\!B\rangle({\cal A})) time.

Summing up, MM performs the following kinds of operations:

  1. moves that end on a letter, but not an endmarker;

  2. moves that end on a mapping;

  3. deletion scans;

  4. moves that end on an endmarker.

By amortized analysis, Cases 1 and 2 together take O(d(n)UB(𝒜)n)O(d(n)\cdot\langle U\!B\rangle({\cal A})\cdot n): the total number of such moves is O(d(n)n)O(d(n)\cdot n) (since i=1nB0[i]=2nd(n)\sum_{i=1}^{n}B^{0}[i]=2n\cdot d(n)), and each move costs O(UB(𝒜))O(\langle U\!B\rangle({\cal A})). Case 3 costs O(nUB(𝒜))O(n\cdot\langle U\!B\rangle({\cal A})), as discussed. For Case 4, note that after the head leaves the left endmarker \vartriangleright on the very first move, each endmarker can only be visited when arriving from an inner cell. Hence the total number of endmarker visits does not exceed the number of visits to all other cells, which is O(nd(n))O(n\cdot d(n)). Since each simulation step of Algorithm 1 takes at most O(UB(𝒜))O(\langle U\!B\rangle({\cal A})) time, endmarker visits also cost O(d(n)UB(𝒜)n)O(d(n)\cdot\langle U\!B\rangle({\cal A})\cdot n).

Thus the total running time of MM on an input of length nn is O(d(n)UB(𝒜)n)O(d(n)\cdot\langle U\!B\rangle({\cal A})\cdot n). ∎

3.3 Membership Problem

To analyze the membership problem for dd-DLAs we need to formalize the auxiliary operations on mappings. Recall that mappings represent contiguous segments of rank-dd cells and that the simulation algorithm relies on three basic subroutines:

  • the cell description function CFCF, which produces the mapping for a single rank-dd cell;

  • the directed composition \diamond, which merges mappings of adjacent segments into one;

  • the departure function DD, which determines the exit state and direction when the head is located at the boundary between two adjacent segments, i.e., when it enters one segment from the other.

We prove in this subsection that all these operations are well-defined and computable in O(|Q𝒜|)O(|Q_{\cal A}|) time, so UB(𝒜)=O(|Q𝒜|)\langle U\!B\rangle({\cal A})=O(|Q_{\cal A}|).

Our constructions rely on graph representations of mappings. A mapping ff\in{\cal F} describing a segment LL is encoded by a four-partite graph GfG_{f} with parts L𝗂𝗇,L𝗂𝗇,L𝗈𝗎𝗍,L𝗈𝗎𝗍\overrightarrow{L}_{\mathsf{in}},\overleftarrow{L}_{\mathsf{in}},\overrightarrow{L}_{\mathsf{out}},\overleftarrow{L}_{\mathsf{out}}, each a copy of Q𝒜Q_{\cal A}. These four parts form a partition of the set Q\overleftrightarrow{Q} according to their labeling. We also adjoin a distinguished sink vertex ()(\uparrow) of out-degree 0. For every qL𝗂𝗇\overleftrightarrow{q}\in\overleftrightarrow{L}_{\mathsf{in}} we add:

  • an edge qp\overleftrightarrow{q}\to\overleftrightarrow{p} to some pL𝗈𝗎𝗍\overleftrightarrow{p}\in\overleftrightarrow{L}_{\mathsf{out}} if f(q)=pf(\overleftrightarrow{q})=\overleftrightarrow{p},

  • or an edge q()\overleftrightarrow{q}\to(\uparrow) if f(q)=f(\overleftrightarrow{q})=\uparrow.

Thus every vertex has out-degree at most 11: inputs have either one outgoing edge to an output or to ()(\uparrow), while outputs and ()(\uparrow) have out-degree 0. We say that GfG_{f} represents the mapping ff (or the segment LL).

If gg describes an adjacent segment RR, to compute the composition fgf\diamond g and the departure function DD we use the intermediate graph (f,g){\cal I}(f,g) obtained by gluing the graphs GfG_{f} and GgG_{g} (with parts labeled by RR). Formally, (f,g){\cal I}(f,g) is defined as follows: part L𝗈𝗎𝗍\overrightarrow{L}_{\mathsf{out}} is glued with R𝗂𝗇\overrightarrow{R}_{\mathsf{in}}, part L𝗂𝗇\overleftarrow{L}_{\mathsf{in}} with R𝗈𝗎𝗍\overleftarrow{R}_{\mathsf{out}}, and the two sinks ()(\uparrow) are glued together. By gluing two vertices we mean that one of them is deleted and all its edges are reattached to the other. When we glue parts, we glue the vertices carrying the same state label (but with opposite directions). An illustration of (f,g){\cal I}(f,g) is given in Fig. 1.

Figure 1: Graph-based computation of fgf\diamond g and DD.
Proposition 1

The directed composition h=fgh=f\diamond g is determined via the intermediate graph as follows: h(q)=ph(\overleftrightarrow{q})=\overleftrightarrow{p} iff there is a path qp\overleftrightarrow{q}\leadsto\overleftrightarrow{p} to an output vertex, and h(q)=h(\overleftrightarrow{q})=\uparrow iff the unique path from q\overleftrightarrow{q} reaches ()(\uparrow) or falls into a directed cycle. Associativity follows from associativity of concatenating segments (hence of gluing graphs).

Lemma 3

For every dd-DLA 𝒜{\cal A}, the operations \diamond, DD, and CFCF are well-defined and computable in O(|Q𝒜|)O(|Q_{\cal A}|) time, assuming that the description of 𝒜{\cal A} is already stored in RAM.

Proof

The function CFCF is trivially computable in O(|Q𝒜|)O(|Q_{\cal A}|) time: for a rank-dd symbol ada_{d}, the mapping f=CF(ad)f=CF(a_{d}) satisfies f(q)=pf(\overleftrightarrow{q})=\overleftrightarrow{p} iff δ𝒜(q,ad)=(p,ad)\delta_{\cal A}(q,a_{d})=(\overleftrightarrow{p},a_{d}).
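For illustration, in a dict-based encoding of mappings (an ad-hoc choice of ours, with entry and exit points tagged by the side of the segment), CFCF amounts to one lookup in the transition function per directed state. The sketch below assumes a total transition table on the rank-dd symbol:

```python
def cell_description(delta, symbol, states):
    """CF for a single rank-d cell, in an ad-hoc dict encoding of mappings:
    ("L", q) / ("R", q) = the head enters the one-cell segment from the
    left / right in state q; the value is the exit point.  delta is assumed
    total here and maps (state, symbol) to (state, move), move in {-1, +1};
    a rank-d symbol is never rewritten, so it stays fixed."""
    f = {}
    for q in states:
        p, move = delta[(q, symbol)]
        exit_pt = ("R", p) if move == 1 else ("L", p)
        # On a one-cell segment the very first move already leaves it,
        # so the result does not depend on the side the head entered from.
        f[("L", q)] = exit_pt
        f[("R", q)] = exit_pt
    return f
```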

We next analyze the algorithms for \diamond and DD. Let VV and EE be the vertex and edge sets of the intermediate graph (f,g){\cal I}(f,g) (Fig. 1). Since each vertex has out-degree at most 1, we have |E|=O(|V|)|E|=O(|V|) and |V|=O(|Q𝒜|)|V|=O(|Q_{\cal A}|). We maintain two global arrays: hh for the resulting mapping and mm for marks on vertices. Initially all values of mm are set to a placeholder \downarrow. These arrays serve as memoization.

The core procedure is the function FindPath (Algorithm 2), which returns the endpoint of the unique path starting from q\overleftrightarrow{q}, or \uparrow if the path falls into a loop. We explain the steps of the algorithm while proving correctness.

Function FindPath(q\overleftrightarrow{q}):
    u:=q.𝗇𝖾𝗑𝗍u:=\overleftrightarrow{q}.\mathsf{next};
    while uL𝗈𝗎𝗍R𝗈𝗎𝗍{()}u\not\in\overleftarrow{L}_{\mathsf{out}}\cup\overrightarrow{R}_{\mathsf{out}}\cup\{(\uparrow)\} do
        if m[u]=m[u]=\downarrow then
            m[u]:=qm[u]:=\overleftrightarrow{q}; u:=u.𝗇𝖾𝗑𝗍u:=u.\mathsf{next};
        else if m[u]=qm[u]=\overleftrightarrow{q} then
            return \uparrow; /* Detected a cycle reachable from q\overleftrightarrow{q} */
        else
            return h[m[u]]h[m[u]]; /* Path merges with a processed input */
        end if
    end while
    return uu; /* Reached an output or the sink ()(\uparrow) */
end

Algorithm 2 FindPath on the intermediate graph

Correctness and complexity of FindPath. Let u0=qu_{0}=\overleftrightarrow{q} and define uk+1=uk.𝗇𝖾𝗑𝗍u_{k+1}=u_{k}.\mathsf{next}. The procedure terminates when one of the following holds:

  • uku_{k} is an output: then FindPath returns uku_{k};

  • uk=()u_{k}=(\uparrow): then FindPath returns \uparrow;

  • uku_{k} is marked: let s=m[uk]\overleftrightarrow{s}=m[u_{k}].

If s=q\overleftrightarrow{s}=\overleftrightarrow{q}, a loop is detected and FindPath returns \uparrow. Otherwise FindPath returns h(s)h(\overleftrightarrow{s}), since h(s)h(\overleftrightarrow{s}) has already been computed and stored in h[m[u]]h[m[u]]. As the paths from q\overleftrightarrow{q} and s\overleftrightarrow{s} merge at the vertex uku_{k}, their endpoints coincide.

The invariant “if m[u]=sm[u]=\overleftrightarrow{s} then h[m[u]]=h(s)h[m[u]]=h(\overleftrightarrow{s})” is established by Algorithm 3, which computes h=fgh=f\diamond g using FindPath.

m[u]:=m[u]:=\downarrow for all uVu\in V; /* clear marks */
h[q]:=h[\overleftrightarrow{q}]:=\downarrow for all qL𝗂𝗇R𝗂𝗇\overleftrightarrow{q}\in\overrightarrow{L}_{\mathsf{in}}\cup\overleftarrow{R}_{\mathsf{in}};
for qL𝗂𝗇R𝗂𝗇\overleftrightarrow{q}\in\overrightarrow{L}_{\mathsf{in}}\cup\overleftarrow{R}_{\mathsf{in}} do
    h[q]:=FindPath(q)h[\overleftrightarrow{q}]:=\textnormal{{FindPath}}(\overleftrightarrow{q});
end for

Algorithm 3 Directed composition h=fgh=f\diamond g

Throughout its execution each edge of (f,g){\cal I}(f,g) is traversed at most once before the corresponding vertex is marked, so the total running time is O(|E|+|V|)=O(|Q𝒜|)O(|E|+|V|)=O(|Q_{\cal A}|). Observe that D(f,g,q)D(f,g,\overleftrightarrow{q}) is defined exactly as the endpoint of the path from q\overleftrightarrow{q} in the intermediate graph (f,g){\cal I}(f,g). Hence a single call D(f,g,q)D(f,g,\overleftrightarrow{q}) is realized by FindPath(q)\textnormal{{FindPath}}(\overleftrightarrow{q}) and runs in O(|Q𝒜|)O(|Q_{\cal A}|) time. Therefore \diamond and DD are computable in O(|Q𝒜|)O(|Q_{\cal A}|) time, and together with the bound for CFCF this completes the proof. ∎
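The following Python sketch re-implements Algorithms 2 and 3 in an ad-hoc dictionary encoding of mappings (entry and exit points are tagged by the side of the segment; the encoding, names, and the LOOP sentinel are ours, not the paper's notation). It follows the unique path through the glued graph with memoization and per-start marks, in the spirit of FindPath:

```python
LOOP = "loop"   # our sentinel for the undefined value (the sink)

def compose(f, g):
    """Directed composition h = f <> g for adjacent segments L (left) and
    R (right).  A mapping sends an entry point of its segment to an exit
    point or LOOP: ("L", q) = enter from the left moving right, ("R", q) =
    enter from the right moving left; an exit ("L", p) leaves to the left,
    ("R", p) to the right.  Entries missing from a dict are treated as LOOP.
    """
    h = {}   # the resulting mapping (memoisation array of Algorithm 3)
    m = {}   # marks: m[vertex] = entry of h whose path first visited it

    def step(u):
        """Follow one edge of the intermediate graph: either the head
        exits the combined segment LR, or it crosses the internal border."""
        which, key = u
        exit_pt = (f if which == "f" else g).get(key, LOOP)
        if exit_pt == LOOP:
            return ("halt", LOOP)
        side, state = exit_pt
        if which == "f" and side == "L":
            return ("halt", ("L", state))       # leaves LR on the left
        if which == "g" and side == "R":
            return ("halt", ("R", state))       # leaves LR on the right
        # Internal border: glue L_out-> with R_in-> and L_in<- with R_out<-.
        return ("go", ("g", ("L", state)) if which == "f"
                      else ("f", ("R", state)))

    def find_path(start, hkey):                 # FindPath of Algorithm 2
        m[start] = hkey
        u = start
        while True:
            kind, v = step(u)
            if kind == "halt":
                return v
            u = v
            if u in m:
                # Own mark: a cycle; someone else's: reuse the stored value.
                return LOOP if m[u] == hkey else h[m[u]]
            m[u] = hkey

    for start, hkey in ([(("f", k), k) for k in f if k[0] == "L"]
                        + [(("g", k), k) for k in g if k[0] == "R"]):
        h[hkey] = find_path(start, hkey)
    return h
```

In the same spirit, a single call of find_path from a border vertex realizes the departure function DD.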

3.4 Main Result

We now summarize the simulation in the following theorem.

Theorem 3.1

For every d(n)d(n)-DLA 𝒜{\cal A}, the membership problem for 𝒜{\cal A} is solvable in time O(knd(n)+m)O(k\cdot n\cdot d(n)+m) on a RAM model, where nn is the length of the input word ww, k=|Q𝒜|k=|Q_{\cal A}| is the number of states of 𝒜{\cal A}, and mm is the length of the description of 𝒜{\cal A}, assuming that the function d(n)d(n) is computable in time O(n+m)O(n+m).

Proof

By Lemma 1, for each d(n)d(n)-DLA 𝒜{\cal A} there exists a corresponding DLLA MM (described by Algorithm 1) that simulates 𝒜{\cal A}. By Lemma 2, MM performs O(d(n)UB(𝒜)n)O(d(n)\cdot\langle U\!B\rangle({\cal A})\cdot n) steps, where UB(𝒜)\langle U\!B\rangle({\cal A}) is defined by Eq. (2). By Lemma 3, UB(𝒜)=O(|Q𝒜|)=O(k)\langle U\!B\rangle({\cal A})=O(|Q_{\cal A}|)=O(k). To simulate MM on a RAM model we use Simulation Algorithm 1 with subroutines from Algorithms 2 and 3. Before running the simulation we preprocess the description of 𝒜{\cal A} and store it in RAM, which takes O(m)O(m) time. Combining all these bounds, we obtain the claimed complexity O(knd(n)+m)O(k\cdot n\cdot d(n)+m). ∎

Corollary 1

The recognition problem for a d(n)d(n)-DLA 𝒜{\cal A} is solvable in O(nd(n))O(n\cdot d(n)) time. In particular, for each fixed dd\in\mathbb{N}, every dd-DCFL is recognizable in linear time.

4 Upper bound on d(n)d(n)-DLA runtime

In this section we establish an upper bound on the runtime of a d(n)d(n)-DLA in the classical simulation model (as defined above).

Theorem 4.1

A d(n)d(n)-DLA with kk states performs at most O(d(n)n2k)O(d(n)\cdot n^{2}\cdot k) steps on an input of length nn, provided the computation does not enter an infinite loop.

Proof

Suppose the head traverses a segment of tape consisting of rank-d(n)d(n) symbols for more than knkn steps. Within the segment there are at most knkn distinct configurations (a state paired with a cell), so by the pigeonhole principle some cell must be visited at least twice in the same state. Since the automaton is deterministic and rank-d(n)d(n) cells are never rewritten, the computation would then repeat forever, i.e., fall into an infinite loop.

As noted in the proof of Lemma 1, d(n)d(n)-DLAs have two types of moves: regular moves, when the head arrives at a cell of rank less than d(n)d(n), and d(n)d(n)-moves, when the head arrives at a segment consisting of rank-d(n)d(n) cells. Any series of d(n)d(n)-moves cannot exceed knkn steps unless the computation enters a loop. Each such series must be preceded by a regular move, and the total number of regular moves is O(nd(n))O(n\cdot d(n)): each regular move strictly increases the rank of the cell it arrives at, and each of the nn cells can reach rank at most d(n)d(n).

Each regular move is a single step, and the series of d(n)d(n)-moves that may follow it takes O(kn)O(kn) steps, so the total number of steps is bounded by O(nd(n))O(kn)=O(d(n)n2k)O(n\cdot d(n))\cdot O(kn)=O(d(n)\cdot n^{2}\cdot k). Therefore the claim follows. ∎
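The pigeonhole argument above can be illustrated by a short simulation: on rank-d(n)d(n) cells the automaton no longer rewrites, so within a fixed segment it acts as a two-way finite automaton, and a loop shows up as a repeated (state, position) pair. The encoding of the transition function below is a hypothetical sketch of ours:

```python
def loops_on_segment(delta, segment, state, pos):
    """Follow the head over a fixed segment of rank-d(n) cells.

    On such cells the automaton may no longer rewrite, so it behaves like
    a two-way finite automaton: delta maps (state, symbol) to (state, move)
    with move in {-1, +1} (an illustrative encoding).  There are at most
    k * len(segment) configurations (state, position), so more than that
    many steps force a repetition, i.e. an infinite loop.
    """
    seen = set()
    while 0 <= pos < len(segment):
        if (state, pos) in seen:
            return True              # repeated configuration: a loop
        seen.add((state, pos))
        state, move = delta[(state, segment[pos])]
        pos += move
    return False                     # the head left the segment
```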

Theorem 4.1 implies that a dd-DLA runs in O(dkn2)O(d\cdot k\cdot n^{2}) steps. This bound is asymptotically tight: for instance, the classical 22-LA recognizing the language {anbnn0}\{a^{n}b^{n}\mid n\geqslant 0\} runs in quadratic time. Its behavior is as follows. The head moves right until it encounters the first bb, then returns left to find the leftmost aa of rank 11 (which is then promoted to rank 22). After this aa is located, the automaton moves right again to check for a matching bb of rank 11; if found, it proceeds to the next matching aa, and so on. When the automaton reaches the right endmarker \vartriangleleft, it verifies that no aa’s of rank 11 remain; if none remain, the input is accepted, and otherwise it is rejected.
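The quadratic behavior can be checked with a high-level re-enactment of this strategy (a sketch of ours that counts head moves, not a cell-by-cell transition table; the shape check via a regular expression is a shortcut for what the real machine detects during its sweeps):

```python
import re

def run_two_la(word):
    """Head-move count of the 2-LA for { a^n b^n } described above.
    Promoting a cell to rank 2 corresponds to marking it.
    Returns (steps, accepted)."""
    if not re.fullmatch(r"a*b*", word):
        # The real machine notices a 'b' before an 'a' during a sweep;
        # we shortcut this check to keep the sketch small.
        return (len(word), False)
    na = word.count("a")
    steps, pos = 0, 0
    steps += na                        # right to the first b (or endmarker)
    pos = na
    for i in range(min(na, len(word) - na)):
        steps += pos - i               # left to the leftmost rank-1 a
        pos = i                        # ... and promote it to rank 2
        steps += (na + i) - pos        # right to the leftmost rank-1 b
        pos = na + i                   # ... and promote it to rank 2
    steps += (len(word) - pos) + 1     # final sweep to the right endmarker
    return (steps, na == len(word) - na)
```

On inputs anbna^{n}b^{n} the step count grows quadratically in nn, matching the bound above with dd and kk fixed.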

Acknowledgments

The author thanks Dmitry Chistikov for valuable feedback, for discussions of the results presented in this text, and for helpful suggestions that improved the presentation.

References

  • [1] Abboud, A., Backurs, A., Williams, V.V.: If the current clique algorithms are optimal, so is Valiant’s parser. In: FOCS ’15, pp. 98–117. IEEE Computer Society, USA (2015)
  • [2] Bertsch, E., Nederhof, M.J.: Regular closure of deterministic languages. SIAM Journal on Computing 29(1), 81–102 (1999)
  • [3] Birget, J.C.: Concatenation of inputs in a two-way automaton. Theoretical Computer Science 63(2), 141–156 (1989)
  • [4] Cook, S.A.: Linear time simulation of deterministic two-way pushdown automata. Department of Computer Science, University of Toronto (1970)
  • [5] Cormen, T., Leiserson, C., Rivest, R., Stein, C.: Introduction to Algorithms, fourth edition. MIT Press (2022)
  • [6] Guillon, B., Prigioniero, L.: Linear-time limited automata. Theor. Comput. Sci. 798, 95–108 (2019)
  • [7] Hennie, F.: One-tape, off-line Turing machine computations. Information and Control 8(6), 553–578 (1965)
  • [8] Hibbard, T.N.: A generalization of context-free determinism. Information and Control 11(1/2), 196–238 (1967)
  • [9] Knuth, D.: On the translation of languages from left to right. Information and Control 8, 607–639 (1965)
  • [10] Kunc, M., Okhotin, A.: Describing periodicity in two-way deterministic finite automata using transformation semigroups. In: Mauri, G., Leporati, A. (eds.) Developments in Language Theory. pp. 324–336. Springer Berlin Heidelberg, Berlin, Heidelberg (2011)
  • [11] Kutrib, M., Wendlandt, M.: Reversible limited automata. Fundamenta Informaticae 155(1-2), 31–58 (2017). https://doi.org/10.3233/FI-2017-1575, https://journals.sagepub.com/doi/abs/10.3233/FI-2017-1575
  • [12] Lee, L.: Fast context-free grammar parsing requires fast boolean matrix multiplication. J. ACM 49(1), 1–15 (2002)
  • [13] Pighizzini, G.: Nondeterministic one-tape off-line Turing machines and their time complexity. J. Autom. Lang. Comb. 14(1), 107–124 (2009)
  • [14] Pighizzini, G., Pisoni, A.: Limited automata and context-free languages. In: Fundamenta Informaticae. vol. 136, pp. 157–176. IOS Press (2015)
  • [15] Rubtsov, A.A., Chudinov, N.: Computational model for parsing expression grammars. In: Královic, R., Kucera, A. (eds.) 49th International Symposium on Mathematical Foundations of Computer Science, MFCS 2024, August 26-30, 2024, Bratislava, Slovakia. LIPIcs, vol. 306, pp. 80:1–80:13. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2024). https://doi.org/10.4230/LIPICS.MFCS.2024.80, https://doi.org/10.4230/LIPIcs.MFCS.2024.80
  • [16] Rubtsov, A.A., Chudinov, N.: Computational model for parsing expression grammars. CoRR abs/2406.14911 (2024). https://doi.org/10.48550/ARXIV.2406.14911, https://doi.org/10.48550/arXiv.2406.14911
  • [17] Shallit, J.O.: A Second Course in Formal Languages and Automata Theory. Cambridge University Press (2008)
  • [18] Shepherdson, J.C.: The reduction of two-way automata to one-way automata. IBM Journal of Research and Development 3(2), 198–200 (1959)
  • [19] Tadaki, K., Yamakami, T., Lin, J.C.: Theory of one-tape linear-time Turing machines. Theoretical Computer Science 411(1), 22–43 (2010)
  • [20] Valiant, L.G.: General context-free recognition in less than cubic time. J. Comput. Syst. Sci. 10(2), 308–315 (1975)
  • [21] Wagner, K., Wechsung, G.: Computational complexity. Springer Netherlands (1986)
  • [22] Yamakami, T.: Behavioral strengths and weaknesses of various models of limited automata (2021), https://arxiv.org/abs/2111.05000
  • [23] Yamakami, T.: What is the most natural generalized pumping lemma beyond regular and context-free languages? In: Malcher, A., Prigioniero, L. (eds.) Descriptional Complexity of Formal Systems - 26th IFIP WG 1.02 International Conference, DCFS 2025, Loughborough, UK, July 22-24, 2025, Proceedings. Lecture Notes in Computer Science, vol. 15759, pp. 196–210. Springer (2025). https://doi.org/10.1007/978-3-031-97100-6_14, https://doi.org/10.1007/978-3-031-97100-6\_14