
Quantum Algorithm for Lexicographically Minimal String Rotation

Qisheng Wang (Graduate School of Mathematics, Nagoya University, Nagoya, Japan; e-mail: QishengWang1994@gmail.com. Part of the work was done when the author was at the Department of Computer Science and Technology, Tsinghua University, Beijing, China.)    Mingsheng Ying (State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing, China, and also Department of Computer Science and Technology, Tsinghua University, Beijing, China; e-mail: yingms@ios.ac.cn.)
Abstract

Lexicographically minimal string rotation (LMSR) is the problem of finding the lexicographically minimal string among all rotations of a given string, which is widely used in equality checking of graphs, polygons, automata and chemical structures. In this paper, we propose an O(n^{3/4}) quantum query algorithm for LMSR. In particular, the algorithm has average-case query complexity O(\sqrt{n}\log n), which is shown to be asymptotically optimal up to a polylogarithmic factor, compared to its \Omega\left(\sqrt{n/\log n}\right) lower bound. Furthermore, we show that our quantum algorithm outperforms any (classical) randomized algorithm in both the worst and average cases. As an application, it is used in benzenoid identification and disjoint-cycle automata minimization.

Keywords: quantum computing, quantum algorithms, quantum query complexity, string problems, lexicographically minimal string rotation.

1 Introduction

1.1 Lexicographically Minimal String Rotation

Lexicographically Minimal String Rotation (LMSR) is the problem of finding the lexicographically smallest string among all possible cyclic rotations of a given input string [Boo80]. It has been widely used in equality checking of graphs [CB80], polygons [IS89, Mae91], automata [Pup10] (and their minimizations [AZ08]) and chemical structures [Shi79], and in generating de Bruijn sequences [SWW16, DHS+18] (see also [GY03, vLWW01]). Booth [Boo80] first proposed a linear-time algorithm for LMSR based on the Knuth-Morris-Pratt string-matching algorithm [KMP77]. Shiloach [Shi81] later improved Booth’s algorithm in terms of performance. A more efficient algorithm was developed by Duval [Duv83] from a different point of view known as Lyndon Factorization. All these algorithms for LMSR are deterministic and have worst-case time complexity \Theta(n). After that, several parallel algorithms for LMSR were developed. Apostolico, Iliopoulos and Paige [AIP87] found an O(\log n) time CRCW PRAM (Concurrent Read Concurrent Write Parallel Random-Access Machine) algorithm for LMSR using O(n) processors, which was then improved by Iliopoulos and Smyth [IS92] to use only O(n/\log n) processors.

The LMSR of a string s can also be computed by finding the lexicographically minimal suffix of ss\$, i.e. the concatenation of two occurrences of s and an end-marker \$, where \$ is considered as a character lexicographically larger than every character in s. The minimal suffix of a string can be found in linear time with the help of data structures known as suffix trees [Wei73, AHU74, McC76] and suffix arrays [MM90, GBYS92, KSB06], and alternatively by some specific algorithms [Duv83, Cro92, Ryt03] based on Duval’s algorithm [Duv83] or the KMP algorithm [KMP77].
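As a concrete (classical, brute-force) illustration of computing the LMSR via the doubled string, the following sketch, which is not part of the original paper, compares the length-n windows of ss; Python's min returns the leftmost minimizer, matching the tie-breaking convention of LMSR. It runs in O(n^2) time and is intended only as an illustration.

    def lmsr_naive(s: str) -> int:
        """Return the starting offset of the lexicographically minimal rotation of s.

        Illustration of the doubling idea: rotation k of s is the length-n window
        of s+s starting at k. O(n^2) time, for exposition only.
        """
        n = len(s)
        doubled = s + s
        # min() returns the first (leftmost) index attaining the minimum,
        # which matches the convention LMSR(s) = min{k : s^{(k)} = SCR(s)}.
        return min(range(n), key=lambda k: doubled[k:k + n])

    assert lmsr_naive("baca") == 3   # rotations: baca, acab, caba, abac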

1.2 Quantum Algorithms for String Problems

Although a large number of new quantum algorithms have been found for various problems (e.g., [Sho94, Gro96, BŠ06, Amb07, MSS07, HHL09, BS17]), only a few of them solve string problems.

Pattern matching is a fundamental problem in stringology, where we are tasked with determining whether a pattern P of length m occurs in a text T of length n. In classical computing, it is considered to be closely related to LMSR. The Knuth-Morris-Pratt algorithm [KMP77] used in Booth’s algorithm [Boo80] for LMSR mentioned above is one of the first few algorithms for pattern matching, with time complexity \Theta(n+m). Recently, several quantum algorithms have been developed for pattern matching; for example, Ramesh and Vinay [RV03] developed an O\left(\sqrt{n}\log(n/m)\log m+\sqrt{m}\log^{2}m\right) quantum pattern matching algorithm based on a useful technique for parallel pattern matching, namely deterministic sampling [Vis90], and Montanaro [Mon17] proposed an average-case O\left((n/m)^{d}2^{O\left(d^{3/2}\sqrt{\log m}\right)}\right) quantum algorithm for d-dimensional pattern matching. However, it seems that these quantum algorithms for pattern matching cannot be directly generalized to solve LMSR.

Additionally, quantum algorithms for reconstructing unknown strings with nonstandard queries have been proposed; for example, substring queries [CILG+12] and wildcard queries [AM14]. Recently, a quantum algorithm that approximates the edit distance within a constant factor was developed in [BEG+18]. Soon after, Le Gall and Seddighin [LGS22] studied quantum algorithms for several other string problems: longest common substring, longest palindrome substring, and Ulam distance.

1.3 Main Contributions of This Paper

A naive quantum algorithm for LMSR (see Definition 2.1 for its formal definition) is to find the LMSR of the string within O(\sqrt{n}) comparisons of rotations by quantum minimum finding [DH96, AK99] among all rotations. However, each comparison of two rotations in the lexicographical order costs O(\sqrt{n}) queries and is bounded-error. Combining the two, an \tilde{O}(n) quantum algorithm for LMSR is obtained, which has no advantage compared to classical algorithms.

In this paper, however, we find a more efficient quantum algorithm for LMSR. Formally, we have:

Theorem 1.1 (Quantum Algorithm for LMSR).

There is a bounded-error quantum query algorithm for LMSR, for which the worst-case query complexity is O\left(n^{3/4}\right) and the average-case query complexity is O\left(\sqrt{n}\log n\right).

In the top-level design of this algorithm, we need to find the minimal value of a function that is given by a bounded-error quantum oracle. To resolve this issue, we develop an efficient error reduction for nested quantum algorithms (see Section 1.4.1 for an outline). With this framework of nested quantum algorithms, we are able to solve problems with nested structures efficiently. High-level illustrations of the algorithm for the worst and average cases are given in Section 1.4.2 and Section 1.4.3, respectively. A detailed description of the algorithm is presented in Section 5.

We assume access to a quantum-read/classical-write random access memory (QRAM) and define time complexity as the number of elementary two-qubit quantum gates, input queries and QRAM operations (see Section 2.2.2 for more details). Our quantum algorithm uses only O(\log^{2}n) classical bits in QRAM and O(\log n) “actual” qubits in the quantum computation. Thus, the time complexity of our quantum algorithms in this paper is just an O(\log n) factor bigger than their query complexity (in both the worst and average cases).

In order to show a separation between classical and quantum algorithms for LMSR, we settle the classical and quantum lower bounds for LMSR in both the worst and average cases. Let R(\operatorname{LMSR}) and R^{\mathit{unif}}(\operatorname{LMSR}) be the worst-case and average-case (classical) randomized query complexities for LMSR, and let Q(\operatorname{LMSR}) and Q^{\mathit{unif}}(\operatorname{LMSR}) be their quantum counterparts. Then we have:

Theorem 1.2 (Classical and Quantum Lower Bounds for LMSR).
  1. For every bounded-error (classical) randomized algorithm for LMSR, it has worst-case query complexity \Omega(n) and average-case query complexity \Omega(n/\log n). That is, R(\operatorname{LMSR})=\Omega(n) and R^{\mathit{unif}}(\operatorname{LMSR})=\Omega(n/\log n).

  2. For every bounded-error quantum algorithm for LMSR, it has worst-case query complexity \Omega\left(\sqrt{n}\right) and average-case query complexity \Omega\left(\sqrt{n/\log n}\right). That is, Q(\operatorname{LMSR})=\Omega(\sqrt{n}) and Q^{\mathit{unif}}(\operatorname{LMSR})=\Omega\left(\sqrt{n/\log n}\right).

Remark 1.1.

It suffices to consider only bounded-error quantum algorithms for LMSR, as we can show that every exact (resp. zero-error) quantum algorithm for LMSR has worst-case query complexity \Omega(n). This is achieved by reducing the search problem to LMSR (see Appendix F), since the search problem is known to have worst-case query complexity \Omega(n) for exact and zero-error quantum algorithms [BBC+01].

Theorem 1.2 is proved in Section 6. Our main proof technique is to reduce a total Boolean function to LMSR and to find a lower bound for that Boolean function based on the notion of block sensitivity. The key observation is that the block sensitivity of that Boolean function is related to the string sensitivity of the input (Lemma 6.3; see Section 1.4.3 for more discussion).

The results of Theorems 1.1 and 1.2 are summarized in Table 1.

             |  Classical                                    |  Quantum
             |  Lower bounds      |  Upper bounds            |  Lower bounds                        |  Upper bounds
Worst-case   |  \Omega(n)         |  O(n) [Boo80, Shi81, Duv83]  |  \Omega(\sqrt{n})                |  O\left(n^{3/4}\right)
Average-case |  \Omega(n/\log n)  |  O(n) [IS94, BCN05]      |  \Omega\left(\sqrt{n/\log n}\right)  |  O\left(\sqrt{n}\log n\right)
Table 1: Classical (randomized) and quantum query complexities of LMSR.

Note that

\Omega\left(\sqrt{n/\log n}\right)\leq Q^{\mathit{unif}}(\operatorname{LMSR})\leq O\left(\sqrt{n}\log n\right).

Therefore, our quantum algorithm is asymptotically optimal in the average case up to a logarithmic factor. Moreover, a quantum separation from (classical) randomized computation in both the worst-case and average-case query complexities is achieved:

  1. Worst case: Q(\operatorname{LMSR})=O\left(n^{3/4}\right) but R(\operatorname{LMSR})=\Omega(n); and

  2. Average case: Q^{\mathit{unif}}(\operatorname{LMSR})=O\left(\sqrt{n}\log n\right) but R^{\mathit{unif}}(\operatorname{LMSR})=\Omega(n/\log n).

In other words, our quantum algorithm is faster than any classical randomized algorithm in both the worst case and the average case.

As an application, we show that our algorithm can be used in identifying benzenoids [Baš16] and minimizing disjoint-cycle automata [AZ08] (see Section 7). The quantum speedups for these problems are illustrated in Table 2.

                                      |  Classical                    |  Quantum (this work)
LMSR                                  |  O(n) [Boo80, Shi81, Duv83]   |  O(n^{3/4})
Canonical boundary-edges code         |  O(n) [Baš16]                 |  O(n^{3/4})
Disjoint-cycle automata minimization  |  O(mn) [AZ08]                 |  \tilde{O}(m^{2/3}n^{3/4})
Table 2: Classical and quantum query complexities of LMSR, canonical boundary-edges code and disjoint-cycle automata minimization. For disjoint-cycle automata minimization, m indicates the number of disjoint cycles and n indicates the length of each cycle. Here, \tilde{O}(\cdot) suppresses logarithmic factors.
Recent Developments.

After the work described in this paper, the worst-case quantum query complexity of LMSR was further improved to n^{1/2+o(1)} in [AJ22] by refining the exclusion rule of LMSR proposed in this paper (see Lemma 4.8 of [AJ22]), and later a quasi-polylogarithmic improvement was achieved in [Wan22]. A quantum algorithm for the decision version of LMSR with worst-case query complexity \tilde{O}(\sqrt{n}) was proposed in [CKKD+22] under their quantum divide-and-conquer framework.

As an application, the near-optimal quantum algorithm for LMSR [AJ22] was then used as a subroutine in finding the longest common substring of two input strings [JN23].

1.4 Overview of the Technical Ideas

Our results presented in the above subsection are achieved by introducing the following three new ideas:

1.4.1 Optimal Error Reduction for Nested Quantum Minimum Finding

Our main algorithm for LMSR is essentially a nest of quantum search and minimum finding. A major difficulty in its design is error reduction in nested quantum oracles, which has not been considered in the previous studies of nested quantum algorithms (e.g., nested quantum search analyzed by Cerf, Grover and Williams [CGW00] and nested quantum walks introduced by Jeffery, Kothari and Magniez [JKM13]).

A d-level nested classical algorithm needs O(\log^{d-1}n) repetitions to ensure a constant error probability by majority voting. For a d-level quantum algorithm composed of quantum minimum finding, it is known that only a small factor O(\log n) of repetitions is required [CGYM08]. We show that this factor can be even better; that is, only O(1) repetitions are required, as if there were no errors in the quantum oracles:

  • We extend the quantum minimum finding algorithm [DH96, AK99] to the situation where the input is given by a bounded-error oracle so that it has query complexity O(\sqrt{n}) (see Lemma 3.4) rather than the O(\sqrt{n}\log n) obtained straightforwardly by majority voting.

  • We introduce a success probability amplification method for quantum minimum finding on bounded-error oracles, which requires O\left(\sqrt{n\log(1/\varepsilon)}\right) queries to obtain the minimum with error probability \leq\varepsilon (see Lemma 3.5). In contrast, a straightforward solution by O(\log(1/\varepsilon)) repetitions of the O(\sqrt{n}) algorithm (by Lemma 3.4) has query complexity O(\sqrt{n}\log(1/\varepsilon)).

These ideas are inspired by quantum search on bounded-error oracles [HMdW03] and amplification of the success probability of quantum search [BCdWZ99]. The above two algorithms will be used as subroutines in the main algorithm for LMSR. Both of them are optimal because their simpler version (the OR function) has lower bound \Omega(\sqrt{n}) [BBBV97, BBHT98, Zal99] for bounded-error quantum algorithms and \Omega(\sqrt{n\log(1/\varepsilon)}) [BCdWZ99] for its error reduction. For clarity, we compare the previous results on quantum search and quantum minimum finding with ours in Table 3.

Algorithm type   |  Oracle type    |  Search                                               |  Minimum finding
Bounded-error    |  exact          |  O(\sqrt{n}) [Gro96]                                  |  O(\sqrt{n}) [DH96, AK99]
                 |  bounded-error  |  O(\sqrt{n}) [HMdW03]                                 |  O(\sqrt{n}) (this work)
Error reduction  |  exact          |  O\left(\sqrt{n\log(1/\varepsilon)}\right) [BCdWZ99]  |  O\left(\sqrt{n\log(1/\varepsilon)}\right) (this work)
                 |  bounded-error  |  O\left(\sqrt{n\log(1/\varepsilon)}\right) [HMdW03]   |  O\left(\sqrt{n\log(1/\varepsilon)}\right) (this work)
Table 3: Quantum query complexities of bounded-error quantum algorithms and their error reductions for search and minimum finding.

Based on the above results, we develop an O\left(\sqrt{n\log^{3}n\log\log n}\right) quantum algorithm for deterministic sampling [Vis90], and furthermore obtain an O\left(\sqrt{n\log m}+\sqrt{m\log^{3}m\log\log m}\right) quantum algorithm for pattern matching, which is better than the best known result [RV03] of O(\sqrt{n}\log(n/m)\log m+\sqrt{m}\log^{2}m). We also develop an O\left(\sqrt{n^{d}}\right) quantum algorithm for evaluating d-level shallow MIN-MAX trees, which matches the lower bound \Omega\left(\sqrt{n^{d}}\right) [Amb00, BS04] for AND-OR trees and is therefore optimal. The best previously known quantum query complexity for MIN-MAX trees is O\left(W_{d}(n)\log n\right) [CGYM08], where W_{d}(n) is the query complexity of d-level AND-OR trees, which is optimally O\left(\sqrt{n^{d}}\right) as known from [BCW98, HMdW03]. Our improvements on these problems are summarized in Table 4.

                          |  Previous                                       |  Improved
Deterministic sampling    |  O\left(\sqrt{n}\log^{2}n\right) [RV03]         |  O\left(\sqrt{n\log^{3}n\log\log n}\right)
Pattern matching          |  O\left(\sqrt{n}\log(n/m)\log m\right) [RV03]   |  O\left(\sqrt{n\log m}\right)
d-level MIN-MAX tree      |  O\left(\sqrt{n^{d}}\log n\right) [CGYM08]      |  O\left(\sqrt{n^{d}}\right)
Table 4: Quantum query complexities improved by our nested quantum algorithm framework.

1.4.2 Exclusion Rule of LMSR

We find a useful property of LMSR, named the exclusion rule (Lemma 5.1): for any two overlapping occurrences of substrings that are prefixes of the canonical representation of a string, the LMSR of the string cannot be the starting position of the latter one. This property enables us to reduce the worst-case query complexity by splitting the string into blocks of suitable sizes, so that in each block the exclusion rule applies and there is at most one candidate for LMSR. This kind of trick has been used in parallel algorithms, e.g., Lemma 1.1 of [IS92] and the Ricochet Property of [Vis90]. However, the exclusion rule of LMSR used here is not found in the literature (to the best of our knowledge).

We outline our algorithm as follows:

  1. Let B=\lfloor\sqrt{n}\rfloor and L=\lfloor B/4\rfloor. We split s into \lceil n/L\rceil blocks of size L (except the last block).

  2. Find the prefix p of \operatorname{SCR}(s) of length B, where \operatorname{SCR}(s) is the canonical representation of s.

  3. In each block, find the leftmost index that matches p as the candidate. Only the leftmost index is required because of the exclusion rule of LMSR.

  4. Find the lexicographically minimal one among all candidates in the blocks. In case of a tie, the minimal candidate is required.

A formal description and the analysis of this algorithm are given in Section 5.1. In order to find the leftmost index that matches p (Step 3 of this algorithm) efficiently, we adopt the deterministic sampling trick [Vis90]. That is, we preprocess a deterministic sample of p, with which whether an index matches p can be checked within O(\log\lvert p\rvert) rather than O(\lvert p\rvert) comparisons. In particular, we allow p to be periodic, and therefore extend the definition of deterministic samples to periodic strings (see Definition 4.1) and propose a quantum algorithm for finding a deterministic sample of a string (either periodic or aperiodic) (see Algorithm 4).
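For intuition, here is a purely classical, brute-force sketch of the four-step outline above (it is not the quantum algorithm): the quantum subroutines for finding p, for searching the leftmost match in each block, and for taking the minimum are replaced by hypothetical classical stand-ins, and the deterministic-sampling test is replaced by direct string comparison.

    from math import isqrt, ceil

    def lmsr_block_outline(s: str) -> int:
        """Classical mirror of the worst-case outline (Section 1.4.2); O(n^2) time.

        The quantum algorithm replaces each step below with quantum search /
        minimum finding; this sketch only illustrates the block structure.
        """
        n = len(s)
        d = s + s
        B = isqrt(n)                    # block parameter B = floor(sqrt(n))
        L = max(B // 4, 1)              # block length L = floor(B/4)
        # Step 2: prefix p of SCR(s) of length B (here found by brute force).
        p = min(d[k:k + B] for k in range(n))
        # Step 3: in each block, keep only the leftmost index matching p.
        candidates = []
        for b in range(ceil(n / L)):
            for k in range(b * L, min((b + 1) * L, n)):
                if d[k:k + B] == p:     # stand-in for the deterministic-sampling test
                    candidates.append(k)
                    break               # exclusion rule: only the leftmost match per block
        # Step 4: minimal rotation among candidates; ties broken by smallest index.
        return min(candidates, key=lambda k: (d[k:k + n], k))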

1.4.3 String Sensitivity

We also observe a property of LMSR that almost all strings have low string sensitivity (Lemma 5.3), which can be used to reduce the query complexity of our quantum algorithm significantly in the average case. Here, the string sensitivity of a string (see Definition 5.1) is a metric measuring the difficulty of distinguishing its substrings, and it is also helpful for obtaining lower bounds for LMSR (see Lemma 6.3).

We outline our improvements for better average-case query complexity as follows:

  1. Let s_{1} and s_{2} be the minimal and the second minimal substrings of s of length B=O(\log n), respectively.

  2. If s_{1}<s_{2} lexicographically, then return the starting index of s_{1}; otherwise, run the basic quantum algorithm given in Section 1.4.2.

Intuitively, in the average case, we only need to consider the first O(\log n) characters in order to compare two rotations. The correctness is straightforward, but the average-case query complexity needs some analysis. See Section 5.2 for a formal description and the analysis of the improvements.
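The following classical sketch (an illustration, not the paper's algorithm) implements this shortcut: if the minimal length-B substring over all rotations is strictly smaller than the second minimal one, its starting index is already the answer; otherwise we fall back to a full LMSR computation, where the brute-force routine stands in for the basic quantum algorithm of Section 1.4.2.

    from math import ceil, log2

    def lmsr_average_case(s: str) -> int:
        """Classical mirror of the average-case shortcut (Section 1.4.3)."""
        n = len(s)
        if n <= 1:
            return 0
        d = s + s
        B = min(n, max(1, ceil(log2(n))))        # prefix length B = O(log n)
        # Minimal and second minimal length-B substrings over all rotations.
        windows = sorted((d[k:k + B], k) for k in range(n))
        (w1, k1), (w2, _) = windows[0], windows[1]
        if w1 < w2:
            # A unique minimal length-B prefix already pins down the LMSR.
            return k1
        # Otherwise, fall back to the full computation (brute force here;
        # the basic quantum algorithm of Section 1.4.2 in the paper).
        return min(range(n), key=lambda k: d[k:k + n])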

1.5 Organization of This Paper

We recall some basic definitions about strings and quantum query algorithms, and formally define the LMSR problem in Section 2. An efficient error reduction for nested quantum algorithms is developed in Section 3. An improved quantum algorithm for pattern matching based on the new error reduction technique (given in Section 3) is proposed in Section 4. The quantum algorithm for LMSR is proposed in Section 5. The classical and quantum lower bounds for LMSR are given in Section 6. The applications are discussed in Section 7.

2 Preliminaries

For convenience of the reader, in this section, we briefly review the lexicographically minimal string rotation (LMSR) problem, quantum query model and several notions of worst-case and average-case complexities used in the paper.

2.1 Lexicographically Minimal String Rotation

For any positive number n, let [n]=\{0,1,2,\dots,n-1\}. Let \Sigma be a finite alphabet with a total order <. A string s\in\Sigma^{n} of length n is a function s:[n]\to\Sigma. The empty string is denoted \epsilon. We use s[i] to denote the i-th character of s. In case of i\in\mathbb{Z}\setminus[n], we define s[i]\equiv s[i\bmod n]. If l\leq r, then s[l\dots r]=s[l]s[l+1]\dots s[r] stands for the substring consisting of the l-th to the r-th characters of s, and if l>r, we define s[l\dots r]=\epsilon. A prefix of s is a string of the form s[0\dots i] for i\in[n]\cup\{-1\}. The period of a string s\in\Sigma^{n} is the minimal positive integer d such that s[i]=s[i+d] for every 0\leq i<n-d. String s is called periodic if its period is \leq n/2, and s is aperiodic if it is not periodic.

Let s\in\Sigma^{n} and t\in\Sigma^{m}. The concatenation of s and t is the string st=s[0]s[1]\dots s[n-1]t[0]t[1]\dots t[m-1]. We write s=t if n=m and s[i]=t[i] for every i\in[n]. We say that s is smaller than t in the lexicographical order, denoted s<t, if either s is a prefix of t but s\neq t, or there exists an index 0\leq k<\min\{n,m\} such that s[i]=t[i] for i\in[k] and s[k]<t[k]. For convenience, we write s\leq t if s<t or s=t.

Definition 2.1 (Lexicographically Minimal String Rotation).

For any string s\in\Sigma^{n} of length n, we call s^{(k)}=s[k\dots k+n-1] the rotation of s by offset k. The lexicographically minimal string rotation (LMSR) problem is to find an offset k such that s^{(k)} is the minimal string among s^{(0)},s^{(1)},\dots,s^{(n-1)} in the lexicographical order. The minimal s^{(k)} is called the string canonical representation (SCR) of s, denoted \operatorname{SCR}(s); that is,

\operatorname{SCR}(s)=\min\left\{s^{(0)},s^{(1)},\dots,s^{(n-1)}\right\}.

In case of a tie, that is, if there are multiple offsets whose corresponding rotations equal \operatorname{SCR}(s), then the minimal offset is desired, and the goal is to find

\operatorname{LMSR}(s)=\min\left\{k\in[n]:s^{(k)}=\operatorname{SCR}(s)\right\}.

The LMSR problem has been well-studied in the literature [CB80, Boo80, Shi81, Duv83, Jeu93, CR94, CHL07], and several linear time (classical) algorithms for LMSR are known, namely Booth’s, Shiloach’s and Duval’s Algorithms:

Theorem 2.1 ([Boo80, Shi81, Duv83]).

There is an O(n) deterministic algorithm for LMSR.
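For reference, the following is a standard linear-time classical routine in the spirit of the algorithms cited above (it is the well-known two-pointer “minimum expression” method, given here only as an illustration; Booth’s and Duval’s original algorithms differ in their details).

    def lmsr_linear(s: str) -> int:
        """Return LMSR(s), the smallest offset of the minimal rotation, in O(n) time."""
        n = len(s)
        i, j, k = 0, 1, 0          # i, j: two candidate offsets; k: matched length
        while i < n and j < n and k < n:
            a, b = s[(i + k) % n], s[(j + k) % n]
            if a == b:
                k += 1             # the two rotations agree on one more character
            else:
                if a > b:
                    i += k + 1     # offsets i, i+1, ..., i+k cannot be minimal
                else:
                    j += k + 1     # symmetric for the other candidate
                if i == j:
                    j += 1         # keep the two candidates distinct
                k = 0
        return min(i, j)           # the surviving candidate (leftmost on ties)

    assert lmsr_linear("baca") == 3 and lmsr_linear("abab") == 0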

2.2 Quantum Query Algorithms

Our computational model is the quantum query model [Amb04, BdW02]. The goal is to compute an n-variable function f(x)=f(x_{0},x_{1},\dots,x_{n-1}), where x_{0},x_{1},\dots,x_{n-1} are the input variables. For example, the LMSR problem can be viewed as the function f(x)=\operatorname{LMSR}(x_{0},x_{1},\dots,x_{n-1}), where x_{0}x_{1}x_{2}\dots x_{n-1} denotes the string with characters x_{0},x_{1},\dots,x_{n-1}. The input variables x_{i} can be accessed by queries to a quantum oracle O_{x} (which is a quantum unitary operator) defined by O_{x}\lvert i,j\rangle=\lvert i,j\oplus x_{i}\rangle, where \oplus is the bitwise exclusive-OR operation. A quantum algorithm A with T queries is described by a sequence of quantum unitary operators

A:U_{0}\to O_{x}\to U_{1}\to O_{x}\to\dots\to O_{x}\to U_{T}.

The intermediate operators U_{0},U_{1},\dots,U_{T} can be arbitrary quantum unitary operators that are determined independently of O_{x}. The computation is performed in a Hilbert space \mathcal{H}=\mathcal{H}_{o}\otimes\mathcal{H}_{w}, where \mathcal{H}_{o} is the output space and \mathcal{H}_{w} is the work space. The computation starts from the basis state \lvert 0\rangle_{o}\lvert 0\rangle_{w}, and then we apply U_{0},O_{x},U_{1},O_{x},\dots,O_{x},U_{T} on it in that order. The resulting state is

\lvert\psi\rangle=U_{T}O_{x}\dots O_{x}U_{1}O_{x}U_{0}\lvert 0\rangle_{o}\lvert 0\rangle_{w}.

Measuring the output space, the outcome is then defined as the output A(x) of algorithm A on input x. More precisely, \Pr[A(x)=y]=\lVert M_{y}\lvert\psi\rangle\rVert^{2}, where M_{y}=\lvert y\rangle_{o}\langle y\rvert. Furthermore, A is said to be a bounded-error quantum algorithm that computes f if \Pr[A(x)=f(x)]\geq 2/3 for every x.

To deal with average-case complexity, following the setting used in [AdW99], we assume that after each U_{j}, a dedicated flag qubit is measured in the computational basis (and this measurement may change the quantum state). The measurement outcome indicates whether the algorithm is ready to halt and return its output. If the outcome is 1, then we measure the output space, take the outcome as the output, and stop the algorithm; otherwise, the algorithm continues with the next query O_{x} and U_{j+1}. Let T_{A}(x) denote the expected number of queries that A uses on input x. Note that T_{A}(x) only depends on the algorithm A and its given input x (which is fixed rather than drawn from some distribution).

2.2.1 Worst-Case and Average-Case Query Complexities

Let f:\{0,1\}^{n}\to\{0,1\} be a Boolean function. If A is a (either randomized or quantum) algorithm and y\in\{0,1\}, we use \Pr[A(x)=y] to denote the probability that A outputs y on input x. Let \mathcal{R}(f) and \mathcal{Q}(f) be the sets of randomized and quantum bounded-error algorithms that compute f, respectively:

\mathcal{R}(f)=\{\text{randomized algorithm }A:\forall x\in\{0,1\}^{n},\ \Pr[A(x)=f(x)]\geq 2/3\},
\mathcal{Q}(f)=\{\text{quantum algorithm }A:\forall x\in\{0,1\}^{n},\ \Pr[A(x)=f(x)]\geq 2/3\}.

Then the worst-case query complexities of f are:

R(f)=\inf_{A\in\mathcal{R}(f)}\max_{x\in\{0,1\}^{n}}T_{A}(x),
Q(f)=\inf_{A\in\mathcal{Q}(f)}\max_{x\in\{0,1\}^{n}}T_{A}(x).

Let \mu:\{0,1\}^{n}\to[0,1] be a probability distribution. We usually use \mathit{unif}\equiv 2^{-n} to denote the uniform distribution. The average-case query complexity of an algorithm A with respect to \mu is

T_{A}^{\mu}=\mathbb{E}_{x\sim\mu}[T_{A}(x)]=\sum_{x\in\{0,1\}^{n}}\mu(x)T_{A}(x).

Thus, the randomized and quantum average-case query complexities of f with respect to \mu are:

R^{\mu}(f)=\inf_{A\in\mathcal{R}(f)}T_{A}^{\mu},
Q^{\mu}(f)=\inf_{A\in\mathcal{Q}(f)}T_{A}^{\mu}.

Clearly, Q^{\mu}(f)\leq R^{\mu}(f) for all f and \mu.

2.2.2 Time and Space Efficiency

In order to talk about the “time” and “space” complexities of quantum algorithms, we assume access to a quantum-read/classical-write random access memory (QRAM), where it takes a single QRAM operation either to classically write a bit to the QRAM or to make a quantum query to a bit stored in the QRAM. For simplicity, we assume the access to the QRAM is described by a quantum unitary operator U_{\text{QRAM}} that swaps the accumulator and a register indexed by another register:

U_{\text{QRAM}}\lvert i,j\rangle\lvert r_{0},r_{1},\dots,r_{i},\dots,r_{M-1}\rangle=\lvert i,r_{i}\rangle\lvert r_{0},r_{1},\dots,j,\dots,r_{M-1}\rangle,

where r_{0},r_{1},\dots,r_{M-1} are bit registers that are only accessible through this QRAM operator.

Let A be a quantum query algorithm, and let t_{A}(x) denote the expected number of two-qubit quantum gates, QRAM operators U_{\text{QRAM}} composing the intermediate operators, and quantum input oracles O_{x} that A uses on input x. The space complexity of A measures the number of (qu)bits used in A. The worst-case and average-case time complexities of a Boolean function f are defined similarly to Section 2.2.1 by replacing T_{A}(x) with t_{A}(x).

3 Optimal Error Reduction for Nested Quantum Algorithms

Our quantum algorithm for LMSR (Theorem 1.1) is essentially a nested algorithm calling quantum search and quantum minimum finding. Error reduction is often crucial for nested quantum algorithms. Traditional probability amplification methods for randomized algorithms incur an O\left(\log^{d}n\right) slowdown for d-level nested quantum algorithms by repeating the algorithm O(\log n) times in each level. In this section, we introduce an efficient error reduction for nested quantum algorithms composed of quantum search and quantum minimum finding, which only costs a factor of O(1). This improvement is obtained by finding an O\left(\sqrt{n}\right) quantum algorithm for minimum finding when the comparison oracle can have bounded errors (see Algorithm 1). Moreover, we also show how to amplify the success probability of quantum minimum finding with both exact and bounded-error oracles. In particular, we obtain an O\left(\sqrt{n\log{(1/\varepsilon)}}\right) quantum algorithm for minimum finding with success probability \geq 1-\varepsilon (see Algorithm 2). These two algorithms allow us to control the error produced by nested quantum oracles better than traditional (classical) methods. Both of them are optimal because their simpler version (the OR function) has lower bound \Omega(\sqrt{n}) [BBBV97, BBHT98, Zal99] for bounded-error quantum algorithms and \Omega(\sqrt{n\log(1/\varepsilon)}) [BCdWZ99] for its error reduction. As an application, we develop a useful tool to find the first solution in the search problem.

3.1 Quantum Search

Let us start from an O\left(\sqrt{n}\right) quantum algorithm to search on bounded-error inputs [HMdW03]. The search problem is described by a function f(x_{0},x_{1},\dots,x_{n-1}) that finds an index j\in[n] (if one exists) such that x_{j}=1, where x_{i}\in\{0,1\} for all i\in[n]. It was first shown by Grover [Gro96] that the search problem can be solved by an O\left(\sqrt{n}\right) quantum algorithm, which was found after the discovery of the \Omega(\sqrt{n}) lower bound [BBBV97] (see also [BBHT98, Zal99]).

3.1.1 Quantum Search on Bounded-Error Oracles

A more robust approach for the search problem on bounded-error oracles was proposed by Høyer, Mosca and de Wolf [HMdW03]. Rather than an exact quantum oracle U_{x}\lvert i,0\rangle=\lvert i,x_{i}\rangle, they consider a bounded-error one with extra workspace \lvert 0\rangle_{w}:

U_{x}\lvert i,0\rangle\lvert 0\rangle_{w}=\sqrt{p_{i}}\lvert i,x_{i}\rangle\lvert\psi_{i}\rangle_{w}+\sqrt{1-p_{i}}\lvert i,\bar{x}_{i}\rangle\lvert\phi_{i}\rangle_{w},

where p_{i}\geq 2/3 for every i\in[n], \bar{u} denotes the negation of u, and \lvert\psi_{i}\rangle_{w} and \lvert\phi_{i}\rangle_{w} are ignorable work qubits. This kind of bounded-error oracle is general in the sense that every bounded-error quantum algorithm and (classical) randomized algorithm can be described by it. A naive way to solve the search problem on bounded-error oracles is to repeat k=O(\log n) times and choose the majority value among the k outputs. This gives an O\left(\sqrt{n}\log n\right) quantum algorithm. Surprisingly, it can be improved to O\left(\sqrt{n}\right) as shown in the following:

Theorem 3.1 (Quantum Search on Bounded-Error Oracles, [HMdW03]).

There is an O\left(\sqrt{n}\right) bounded-error quantum algorithm for the search problem on bounded-error oracles. Moreover, if there are t\geq 1 solutions, the algorithm finds a solution in expected O\left(\sqrt{n/t}\right) queries (even if t is unknown).

For convenience, we use \textbf{Search}(U_{x}) to denote the algorithm of Theorem 3.1 which, with probability \geq 2/3, returns an index j\in[n] such that x_{j}=1 or reports that no such j exists (we require the algorithm to return -1 in this case).

3.1.2 Amplification of the Success of Quantum Search

Usually, we need to amplify the success probability of a quantum or (classical) randomized algorithm to make it sufficiently large. A common trick used in randomized algorithms is to repeat the bounded-error algorithm O(\log(1/\varepsilon)) times and choose the majority value among all outputs to ensure success probability \geq 1-\varepsilon. Buhrman, Cleve, de Wolf and Zalka [BCdWZ99] showed that we can do better for quantum searching.
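As a point of reference for this classical baseline, the snippet below (illustrative only, not from the paper) repeats a bounded-error procedure k = O(\log(1/\varepsilon)) times and returns the majority answer; by a standard Chernoff bound, the error probability decays exponentially in k.

    import random
    from collections import Counter

    def majority_repeat(noisy_call, k: int):
        """Classical error reduction: run a (>= 2/3)-correct procedure k times
        and return the most frequent answer; k = O(log(1/eps)) suffices."""
        votes = Counter(noisy_call() for _ in range(k))
        return votes.most_common(1)[0][0]

    # Toy usage: a procedure that returns the correct answer 42 with probability 2/3.
    noisy = lambda: 42 if random.random() < 2 / 3 else random.randint(0, 41)
    print(majority_repeat(noisy, k=25))   # prints 42 with high probability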

Theorem 3.2 (Amplification of the success of quantum search, [BCdWZ99]).

For every \varepsilon>0, there is an O\left(\sqrt{n\log(1/\varepsilon)}\right) bounded-error quantum algorithm for the search problem with success probability \geq 1-\varepsilon. Moreover, if there is a promise of t\geq 1 solutions, the algorithm finds a solution in O\left(\sqrt{n}\left(\sqrt{t+\log(1/\varepsilon)}-\sqrt{t}\right)\right) queries.

Theorem 3.2 also holds for bounded-error oracles. For convenience, we use \textbf{Search}(U_{x},\varepsilon) to denote the algorithm of Theorem 3.2, which succeeds with probability \geq 1-\varepsilon. Note that Theorem 3.2 does not cover the case where there can be t\geq 2 solutions without a promise. In this case, we can obtain an O\left(\sqrt{n/t}\log(1/\varepsilon)\right) bounded-error quantum algorithm with error probability \leq\varepsilon by straightforward majority voting.

3.2 Quantum Minimum Finding

We now turn to the minimum-finding problem. Given x_{0},x_{1},\dots,x_{n-1}, the problem is to find an index j\in[n] such that x_{j} is the minimal element. Let \text{cmp}(i,j) be the comparator that determines whether x_{i}<x_{j}:

\text{cmp}(i,j)=\begin{cases}1&x_{i}<x_{j},\\ 0&\text{otherwise}.\end{cases}

The comparison oracle U_{\text{cmp}} simulating \text{cmp} is defined by

U_{\text{cmp}}\lvert i,j,k\rangle=\lvert i,j,k\oplus\text{cmp}(i,j)\rangle.

We measure the query complexity by counting the number of queries to this oracle U_{\text{cmp}}. A quantum algorithm was proposed by Dürr and Høyer [DH96] and Ahuja and Kapoor [AK99] for finding the minimum:

Theorem 3.3 (Minimum finding, [DH96, AK99]).

There is an O\left(\sqrt{n}\right) bounded-error quantum algorithm for the minimum-finding problem.

We also note that a generalized minimum-finding algorithm was developed in [vAGGdW17], which only needs to prepare a superposition over the search space (rather than make queries to individual elements of the search space).

3.2.1 Optimal Quantum Minimum Finding on Bounded-Error Oracles

For the purpose of this paper, we need to generalize the above algorithm to one with a bounded-error version of U_{\text{cmp}}. For simplicity, we abuse notation a little and define:

U_{\text{cmp}}\lvert i,j,0\rangle\lvert 0\rangle_{w}=\sqrt{p_{ij}}\lvert i,j,\text{cmp}(i,j)\rangle\lvert\psi_{ij}\rangle_{w}+\sqrt{1-p_{ij}}\lvert i,j,\overline{\text{cmp}(i,j)}\rangle\lvert\phi_{ij}\rangle_{w},

where p_{ij}\geq 2/3 for all i,j\in[n], and \lvert\psi_{ij}\rangle_{w} and \lvert\phi_{ij}\rangle_{w} are ignorable work qubits. Moreover, for every index j\in[n], we can obtain a bounded-error oracle U_{\text{cmp}}^{j}:

U_{\text{cmp}}^{j}\lvert i,0\rangle\lvert 0\rangle_{w}=\sqrt{p_{ij}}\lvert i,\text{cmp}(i,j)\rangle\lvert\psi_{ij}\rangle_{w}+\sqrt{1-p_{ij}}\lvert i,\overline{\text{cmp}(i,j)}\rangle\lvert\phi_{ij}\rangle_{w} (1)

with only one query to U_{\text{cmp}}. Then we can provide a quantum algorithm for minimum finding on bounded-error oracles as Algorithm 1.

Algorithm 1 \textbf{Minimum}(U_{\text{cmp}}): An algorithm for minimum finding on bounded-error oracles.
Input: A bounded-error comparison oracle U_{\text{cmp}} for x_{0},x_{1},\dots,x_{n-1}.
Output: An index j\in[n] such that x_{j}\leq x_{i} for every i\in[n], with probability \geq 2/3.
1:if n\leq 2 then
2:     return the answer by classical algorithms.
3:end if
4:m\leftarrow\lceil 12\ln n\rceil, q\leftarrow\lceil 36\ln m\rceil.
5:Choose j\in[n] uniformly at random.
6:for t=1\to m do
7:     if the total number of queries to U_{\text{cmp}} exceeds 30C\sqrt{n}+mq then
8:         break.
9:     end if
10:     i\leftarrow\textbf{Search}(U_{\text{cmp}}^{j}), where U_{\text{cmp}}^{j} is defined by Eq. (1).
11:     for k=1\to q do
12:         b_{k}\leftarrow the measurement outcome of the third register of U_{\text{cmp}}\lvert i,j,0\rangle\lvert 0\rangle_{w}.
13:     end for
14:     b\leftarrow(i\neq-1)\land\operatorname{maj}(b_{1},\dots,b_{q}), where \operatorname{maj}(b_{1},\dots,b_{q}) returns the majority of b_{1},\dots,b_{q}.
15:     if b then
16:         j\leftarrow i.
17:     end if
18:end for
19:return j.

The constant C>0 in Algorithm 1 is given so that \textbf{Search}(U_{x}) in Theorem 3.1 takes at most C\sqrt{n/\max\{t,1\}} queries to U_{x} if there are t solutions.

Lemma 3.4.

Algorithm 1 is a bounded-error quantum algorithm for minimum finding on bounded-error oracles in O\left(\sqrt{n}\right) queries.

Proof.

The query complexity is trivially O(\sqrt{n}) due to the guard (Line 7) of Algorithm 1.

The correctness is proved as follows. Let m=\lceil 12\ln n\rceil and q=\lceil 36\ln m\rceil. We assume that n\geq 3 and therefore m\geq 12. In each of the m iterations, Lines 11-14 of Algorithm 1 call U_{\text{cmp}} q times, and b gets the value (i\neq-1)\land(x_{i}<x_{j}) with probability \geq 1-1/m^{2} (this is a straightforward majority voting; for completeness, its analysis is provided in Appendix A). Here, i is an index such that x_{i}<x_{j} (with high probability), and i=-1 if no such i exists; thus b=1 means that there exists an element x_{i} smaller than x_{j}.

We only consider the case that the values of b in all iterations are as desired. This case happens with probability \geq(1-1/m^{2})^{m}\geq 1-1/m\geq 11/12. In each iteration, i finds a candidate index such that x_{i}<x_{j} (if one exists) with probability \geq 2/3 (and if there are many, any of them is obtained with equal probability). It is shown in [DH96, Lemma 2] that if i finds a candidate index with certainty, then the expected number of queries before j holds the index of the minimum is \leq\frac{5}{2}C\sqrt{n}; moreover, the expected number of iterations is \leq\ln n. In our case, i finds a candidate index in expected 3/2 iterations. Therefore, the expected number of queries to U_{\text{cmp}} is \leq\frac{15}{4}C\sqrt{n} and the expected number of iterations is \leq\frac{3}{2}\ln n. When Algorithm 1 makes \geq 30C\sqrt{n} queries to the oracle (except those negligible queries in Lines 11-14) or runs \geq m iterations (that is, more than 8 times their expectations), the error probability is \leq 1/8+1/8=1/4 by Markov’s inequality. Therefore, the overall success probability is \geq\frac{11}{12}\cdot\frac{3}{4}\geq 2/3. ∎
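To see how the structure of Algorithm 1 behaves, the following classical toy simulation (illustrative only, and not the quantum algorithm) mirrors its main loop: a noisy comparator that errs with probability 1/3 plays the role of U_{\text{cmp}}, an idealized pick of a random smaller element stands in for \textbf{Search}(U_{\text{cmp}}^{j}), and a q-fold majority vote plays the role of Lines 11-14.

    import random
    from math import ceil, log

    def noisy_cmp(x, i, j):
        """Bounded-error comparator: reports whether x[i] < x[j], correct w.p. 2/3."""
        truth = x[i] < x[j]
        return truth if random.random() < 2 / 3 else not truth

    def minimum_on_noisy_cmp(x):
        """Classical toy mirror of Algorithm 1's main loop."""
        n = len(x)
        if n <= 2:
            return min(range(n), key=lambda i: x[i])
        m = ceil(12 * log(n))
        q = ceil(36 * log(m))
        j = random.randrange(n)
        for _ in range(m):
            # Idealized stand-in for Search(U_cmp^j): a uniformly random index i
            # with x[i] < x[j]; the quantum subroutine finds such an index,
            # with bounded error, in about sqrt(n) queries.
            smaller = [c for c in range(n) if x[c] < x[j]]
            i = random.choice(smaller) if smaller else -1
            # Lines 11-14 of Algorithm 1: q-fold majority vote to confirm x[i] < x[j].
            if i != -1 and sum(noisy_cmp(x, i, j) for _ in range(q)) * 2 > q:
                j = i
        return j

    x = [random.randrange(10**6) for _ in range(50)]
    print(minimum_on_noisy_cmp(x), x.index(min(x)))   # these usually agree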

3.2.2 Amplifying the Success Probability of Quantum Minimum Finding

We can amplify the success probability for quantum minimum finding better than a naive method, as shown in Algorithm 2.

Algorithm 2 \textbf{Minimum}(U_{\text{cmp}},\varepsilon): Amplification of the success of minimum finding.
Input: A bounded-error comparison oracle U_{\text{cmp}} for x_{0},x_{1},\dots,x_{n-1}, and 0<\varepsilon<1/2.
Output: An index j\in[n] such that x_{j}\leq x_{i} for every i\in[n], with probability \geq 1-\varepsilon.
1:while true do
2:     j\leftarrow\textbf{Minimum}(U_{\text{cmp}}).
3:     if \textbf{Search}(U_{\text{cmp}}^{j},\varepsilon)=-1 then
4:         break.
5:     end if
6:end while
7:return j.
Lemma 3.5.

Algorithm 2 runs in expected O\left(\sqrt{n\log{(1/\varepsilon)}}\right) queries with error probability \leq\varepsilon.

Proof.

Algorithm 2 terminates with a guard given by \textbf{Search}(U_{\text{cmp}}^{j},\varepsilon). Here, \textbf{Search}(U_{\text{cmp}}^{j},\varepsilon)=-1 means that, with probability \geq 1-\varepsilon, there is no index i such that \text{cmp}(i,j)=1 and thus j is the desired answer. Therefore, the guard has error probability \leq\varepsilon. Let p\geq 2/3 be the probability that j holds the index of the minimal element after a single call to \textbf{Minimum}(U_{\text{cmp}}), by Lemma 3.4. Let q be the probability that Algorithm 2 breaks the “while” loop in each iteration. Then

q=p(1-\varepsilon)+(1-p)\varepsilon\geq p(1-\varepsilon)\geq 1/3,

which is greater than a constant. So, the expected number of iterations is O(1). In a single iteration, \textbf{Minimum}(U_{\text{cmp}}) takes O\left(\sqrt{n}\right) queries (by Lemma 3.4) and \textbf{Search}(U_{\text{cmp}}^{j},\varepsilon) takes O\left(\sqrt{n\log(1/\varepsilon)}\right) queries (by Theorem 3.2). Therefore, the expected query complexity of Algorithm 2 is O(1)\cdot\left(O(\sqrt{n})+O\left(\sqrt{n\log{(1/\varepsilon)}}\right)\right)=O\left(\sqrt{n\log{(1/\varepsilon)}}\right). ∎

3.3 An Application: Searching for the First Solution

In this subsection, we develop a tool needed in our quantum algorithm for LMSR as an application of the above two subsections. It solves the problem of finding the first solution (i.e. leftmost solution, or solution with the minimal index rather than an arbitrary solution) and thus can be seen as a generalization of quantum searching, but the solution is based on quantum minimum finding.

Formally, the query oracle U_{x} of x_{0},x_{1},\dots,x_{n-1} is given. The searching-first problem is to find the minimal index j\in[n] such that x_{j}=1, or report that no solution exists. This problem can be solved by minimum finding with the comparator

\text{cmp}(i,j)=\begin{cases}1&x_{i}=1\land x_{j}=0,\\ 1&x_{i}=x_{j}\land i<j,\\ 0&\text{otherwise},\end{cases}

which immediately yields an O(\sqrt{n}) solution if the query oracle U_{x} is exact.
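As a quick classical illustration of this reduction (not from the paper), the comparator orders indices so that positions holding 1 come before positions holding 0, and ties are broken by smaller index; the minimum under this order is the leftmost solution, if any. A brute-force version looks as follows.

    from functools import cmp_to_key

    def search_first_by_minimum(x):
        """Reduce searching-first to minimum finding via the comparator cmp(i, j)."""
        n = len(x)
        if 1 not in x:
            return -1                      # no solution exists
        def cmp(i, j):
            # Negative return value means index i is "smaller" (a better candidate).
            better = (x[i] == 1 and x[j] == 0) or (x[i] == x[j] and i < j)
            return -1 if better else 1
        return min(range(n), key=cmp_to_key(cmp))

    assert search_first_by_minimum([0, 0, 1, 0, 1]) == 2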

In the case that the query oracle U_{x} is bounded-error, a bounded-error comparison oracle U_{\text{cmp}} corresponding to \text{cmp} can be implemented with a constant number of queries to U_{x}. Therefore, the results in Lemma 3.4 and Lemma 3.5 also hold for the searching-first problem. For convenience in the following discussions, we write \textbf{SearchFirst}(U_{x}) and \textbf{SearchFirst}(U_{x},\varepsilon) to denote the algorithms for the searching-first problem based on the two algorithms \textbf{Minimum}(U_{\text{cmp}}) and \textbf{Minimum}(U_{\text{cmp}},\varepsilon), respectively. Symmetrically, we have \textbf{SearchLast}(U_{x}) and \textbf{SearchLast}(U_{x},\varepsilon) for searching for the last solution.

Recently, an O(\sqrt{n}) quantum algorithm for searching for the first solution was proposed in [KKM+20]. Their approach is quite different from ours presented above: it is specifically designed for this particular problem, whereas our approach is based on a more general framework of quantum minimum finding.

We believe that the techniques presented in this section can be applied in solving other problems. For this reason, we present a description of them in a general framework of nested quantum algorithm in Appendix B.

4 Quantum Deterministic Sampling

In this section, we prepare another tool to be used in our quantum algorithm for LMSR, namely an efficient quantum algorithm for deterministic sampling. It is based on our nested quantum algorithm composed of quantum search and quantum minimum finding given in the last section. Deterministic sampling is also a useful trick in parallel pattern matching [Vis90]. We provide a simple quantum lexicographical comparator in Section 4.1, and a quantum algorithm for deterministic sampling in Section 4.2. As an application, we obtain quantum algorithms for string periodicity and pattern matching in Section 4.3.

4.1 Lexicographical Comparator

Suppose there are two strings s,t\in\Sigma^{n} of length n over a finite alphabet \Sigma=[\alpha]. Let U_{s} and U_{t} be their query oracles, respectively. That is,

U_{s}\lvert i,j\rangle=\lvert i,j\oplus s[i]\rangle,\qquad\qquad U_{t}\lvert i,j\rangle=\lvert i,j\oplus t[i]\rangle.

In order to compare the two strings in the lexicographical order, we need to find the leftmost index k\in[n] such that s[k]\neq t[k]. If no such k exists, then s=t. To this end, we construct the oracle

U_{x}\lvert i,j\rangle=\begin{cases}\lvert i,j\rangle,&s[i]=t[i],\\ \lvert i,j\oplus 1\rangle,&s[i]\neq t[i],\end{cases} (2)

using one query to each of U_{s} and U_{t}. A straightforward algorithm for lexicographical comparison based on the searching-first problem is described in Algorithm 3.

Algorithm 3 \textbf{LexicographicalComparator}(U_{s},U_{t}): Lexicographical Comparator.
Input: Two query oracles U_{s} and U_{t} for two strings s and t, respectively.
Output: If s<t, return -1; if s>t, return 1; if s=t, return 0; correct with probability \geq 2/3.
1:k\leftarrow\textbf{SearchFirst}(U_{x}), where U_{x} is defined by Eq. (2).
2:if k=-1 then
3:     return 0.
4:end if
5:if s[k]<t[k] then
6:     return -1.
7:else
8:     return 1.
9:end if
Lemma 4.1.

Algorithm 3 is an O\left(\sqrt{n}\right) bounded-error quantum algorithm that compares two strings, given by their oracles, in the lexicographical order.

Remark 4.1.

We usually need to compare two strings in the lexicographical order as a subroutine nested as a low-level quantum oracle in string algorithms. However, the lexicographical comparator (Algorithm 3) introduces errors. Therefore, the error reduction trick for nested quantum oracles proposed in Section 3 is required here.

4.2 Deterministic Sampling

Deterministic sampling [Vis90] is a useful technique for pattern matching in pattern analysis. In this subsection, we provide a quantum solution to deterministic sampling in the form of a nested quantum algorithm.

For our purpose, we extend the definition of deterministic samples to the periodic case. The following is a generalized definition of deterministic samples.

Definition 4.1 (Deterministic samples).

Let s\in\Sigma^{n} and let d be its period. A deterministic sample of s consists of an offset 0\leq\delta<\lfloor n/2\rfloor and a sequence of indices i_{0},i_{1},\dots,i_{l-1} (called checkpoints) such that

  1. i_{k}-\delta\in[n] for k\in[l];

  2. for every 0\leq j<\lfloor n/2\rfloor with j\not\equiv\delta\pmod{d}, there exists k\in[l] such that i_{k}-j\in[n] and s[i_{k}-j]\neq s[i_{k}-\delta]. We denote c_{k}=s[i_{k}-\delta] when the exact values of i_{k} and \delta are ignorable.

If s is aperiodic (i.e. d>n/2), the second condition degenerates into “for every 0\leq j<\lfloor n/2\rfloor with j\neq\delta”, which is consistent with the definition for aperiodic strings in [Vis90].

The use of deterministic sampling.

Suppose that T is a text and P is a pattern. If we have a deterministic sample (\delta;i_{0},i_{1},\dots,i_{l-1}) of P with a small l, then we can test whether an index of T can be a starting position that matches P using only l comparisons according to the deterministic sample. It is worth noting that one can disqualify one of any two possible starting positions (that pass the above deterministic-sampling test) by the Ricochet Property proposed in [Vis90] (see Lemma D.1).

The following theorem shows that the size of the deterministic sample can be very small.

Theorem 4.2 (Deterministic sampling [Vis90]).

Let s\in\Sigma^{n}. There is a deterministic sample (\delta;i_{0},i_{1},\dots,i_{l-1}) of s with l\leq\lfloor\log_{2}n\rfloor.

Proof.

For the case that s is aperiodic, a simple procedure for constructing a valid deterministic sample was given in [Vis90]. We describe it as follows.

  1. Initially, let A_{0}=[\lfloor n/2\rfloor] be the set of candidates for \delta, and let S_{0}=\emptyset be the set of checkpoints.

  2. At step k\geq 0, let \delta_{k}^{\min}=\min A_{k} and \delta_{k}^{\max}=\max A_{k}.

    2.1. If \delta_{k}^{\min}=\delta_{k}^{\max}, then set \delta=\delta_{k}^{\min} and return the current set S_{k} of checkpoints.

    2.2. Otherwise, there must be an index i_{k} such that s[i_{k}-\delta_{k}^{\min}]\neq s[i_{k}-\delta_{k}^{\max}] (or s would be periodic). Let \sigma_{k} be the symbol with the fewest occurrences (but at least one) among s[i_{k}-\gamma] for \gamma\in A_{k} (choosing any of them if there are multiple such symbols). Let S_{k+1}=S_{k}\cup\{i_{k}\} and A_{k+1}=\left\{\,\gamma\in A_{k}\colon s[i_{k}-\gamma]=\sigma_{k}\,\right\}, then go to the next step for k+1.

It can be seen that the above procedure always stops, as the set A_{k} at least halves after each step, i.e., \lvert A_{k+1}\rvert\leq\lvert A_{k}\rvert/2. It can be verified that the returned \delta and checkpoints together form a valid deterministic sample. The procedure has at most \lfloor\log_{2}n\rfloor steps and each step adds one checkpoint, which implies that there exists a deterministic sample with at most \lfloor\log_{2}n\rfloor checkpoints.

For the case that s is periodic, the above procedure still works if we set the initial set of candidates to be A_{0}=[d]. Intuitively, since s is periodic, most symbols are redundant and thus we only have to consider the first d offsets. After this modification, the analysis of the modified procedure is almost identical to the original one. Here, we note that if i_{k} does not exist during the execution of the modified procedure, then s has a period smaller than d. Finally, the procedure returns at most \lfloor\log_{2}d\rfloor\leq\lfloor\log_{2}n\rfloor checkpoints. ∎
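A direct classical transcription of this construction (aperiodic case only; an illustration, not part of the paper) might look as follows.

    from collections import Counter

    def deterministic_sample(s: str):
        """Construct (delta, checkpoints) for an aperiodic string s, following the
        procedure in the proof of Theorem 4.2; at most floor(log2 n) checkpoints."""
        n = len(s)
        if n <= 1:
            return 0, []
        A = list(range(n // 2))          # A_0: candidate offsets for delta
        checkpoints = []                 # S_0: empty set of checkpoints
        while len(A) > 1:
            lo, hi = min(A), max(A)
            # Step 2.2: an index i with s[i - lo] != s[i - hi] exists, since
            # otherwise s would have period hi - lo <= n/2.
            i = next(i for i in range(hi, lo + n) if s[i - lo] != s[i - hi])
            counts = Counter(s[i - g] for g in A)
            sigma = min(counts, key=counts.get)          # least frequent symbol
            checkpoints.append(i)
            A = [g for g in A if s[i - g] == sigma]      # keep agreeing candidates
        return A[0], checkpoints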

Now let us consider how a quantum algorithm can perform deterministic sampling. We start from the case where s\in\Sigma^{n} is aperiodic. Let U_{s} be the query oracle of s, that is, U_{s}\lvert i,j\rangle=\lvert i,j\oplus s[i]\rangle. Suppose that at step l, the sequence of indices i_{0},i_{1},\dots,i_{l-1} is known, as well as c_{k}=s[i_{k}-\delta] for k\in[l] (we need not know \delta explicitly). For 0\leq j<\lfloor n/2\rfloor, let x_{j} denote whether candidate j agrees with \delta at all checkpoints, that is,

x_{j}=\begin{cases}0&\exists k\in[l],\ j\leq i_{k}<j+n\land s[i_{k}-j]\neq c_{k},\\ 1&\text{otherwise}.\end{cases} (3)

Based on the search problem, there is a bounded-error oracle U_{x} for computing x_{j} with O\left(\sqrt{l}\right)=O\left(\sqrt{\log n}\right) queries to U_{s}.
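Classically, the predicate x_{j} of Eq. (3) is just a conjunction of at most l character tests; the sketch below (illustrative) evaluates it directly, whereas the quantum oracle U_{x} evaluates it with O(\sqrt{l}) queries to U_{s} via quantum search over the checkpoints.

    def x_value(s: str, checkpoints, samples, j: int) -> int:
        """Evaluate x_j from Eq. (3): 1 if candidate offset j agrees with the
        current deterministic sample (i_k, c_k) at every applicable checkpoint."""
        n = len(s)
        for i, c in zip(checkpoints, samples):
            if j <= i < j + n and s[i - j] != c:
                return 0
        return 1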

A quantum algorithm for deterministic sampling is described in Algorithm 4. Initially, all offsets 0\leq\xi<\lfloor n/2\rfloor are candidates for \delta. The idea of the algorithm is to repeatedly find two remaining candidates p and q that differ at an index j (if there is only one remaining candidate, the algorithm has already found a deterministic sample and terminates), randomly choose a character c to be s[j-p] or s[j-q], and delete either p or q according to c. It is sufficient to select p and q to be the first and the last solution of x_{j} defined in Eq. (3). To explicitly describe how to find an index j where p and q differ, we note that q\leq j<p+n and that j must exist because of the aperiodicity of s, and let

y_{j}=\begin{cases}1&q\leq j<p+n\land s[j-p]\neq s[j-q],\\ 0&\text{otherwise}.\end{cases}

It is trivial that there is an exact oracle U_{y} for computing y_{j} with O(1) queries to U_{s}.

For the case of a periodic s\in\Sigma^{n}, the algorithm requires some careful modifications. We need a variable Q to denote the upper bound of the currently available candidates. Initially, Q=\lfloor n/2\rfloor-1. We modify the definition of x_{j} in Eq. (3) to ensure 0\leq j\leq Q:

x_{j}=\begin{cases}0&j>Q\lor\left(\exists k\in[l],\ j\leq i_{k}<j+n\land s[i_{k}-j]\neq c_{k}\right),\\ 1&\text{otherwise}.\end{cases} (4)

For an aperiodic string s, there is at least one y_{j} such that y_{j}=1, so the algorithm reaches Lines 17-25 during its execution only with small probability \leq 1/m, where m=O(\log n). But for a periodic string s with period d, if q-p is divisible by d, then y_{j}=0 for all j, and thus the algorithm reaches Lines 17-25 with high probability \geq 1-1/6m^{2}. In this case, there does not exist q\leq j<p+n such that y_{j}=1. We set Q=q-1 to eliminate all candidates \geq q. In fact, even for a periodic string s, the algorithm is intended to reach Lines 17-25 only once (with high probability). If the algorithm reaches Lines 17-25 a second (or further) time, then the current (\delta;i_{0},i_{1},\dots,i_{l-1}) is already a deterministic sample of s (with high probability \geq 1-1/m), and therefore the subsequent computation does not influence the correctness and can be ignored.

Algorithm 4 \textbf{DeterministicSampling}(U_{s}): Deterministic Sampling.
Input: The query oracle U_{s} for a string s\in\Sigma^{n}.
Output: A deterministic sample (\delta;i_{0},i_{1},\dots,i_{l-1}).
1:if n=1 then
2:     return (0;0).
3:end if
4:m\leftarrow\lceil 8\log_{2}n\rceil.
5:\varepsilon\leftarrow 1/6m^{2}.
6:l\leftarrow 0.
7:Q\leftarrow\lfloor n/2\rfloor-1.
8:for t=1\to m do
9:     Let U_{x} be the bounded-error oracle for computing x_{j} defined in Eq. (4).
10:     p\leftarrow\textbf{SearchFirst}(U_{x},\varepsilon).
11:     q\leftarrow\textbf{SearchLast}(U_{x},\varepsilon).
12:     if p=q then
13:         break.
14:     end if
15:     i_{l}\leftarrow\textbf{Search}(U_{y},\varepsilon).
16:     if i_{l}=-1 then
17:         Q\leftarrow q-1.
18:         q\leftarrow\textbf{SearchLast}(U_{x},\varepsilon).
19:         if p=q then
20:              break.
21:         end if
22:         i_{l}\leftarrow\textbf{Search}(U_{y},\varepsilon).
23:         if i_{l}=-1 then
24:              break.
25:         end if
26:     end if
27:     c_{l}\leftarrow s[i_{l}-p] with probability 1/2 and s[i_{l}-q] with probability 1/2.
28:     l\leftarrow l+1.
29:end for
30:Let U_{x} be the bounded-error oracle for computing x_{j} defined in Eq. (3).
31:\delta\leftarrow\textbf{SearchFirst}(U_{x},\varepsilon).
32:return (\delta;i_{0},i_{1},\dots,i_{l-1}).
Lemma 4.3.

Algorithm 4 is an O(nlog3nloglogn)O\left(\sqrt{n\log^{3}n\log\log n}\right) bounded-error quantum algorithm for deterministic sampling.

Proof.

Assume n2n\geq 2 and let m=8log2nm=\lceil 8\log_{2}n\rceil and ε=1/6m2\varepsilon=1/6m^{2}. There are at most mm iterations in Algorithm 4. In each iteration, there are fewer than 66 calls to Search, SearchFirst or SearchLast, each of which may introduce an error. Each call to Search, SearchFirst or SearchLast has error probability ε\leq\varepsilon. Therefore, Algorithm 4 runs with no errors from Search, SearchFirst or SearchLast with probability (1ε)6m11/m\geq(1-\varepsilon)^{6m}\geq 1-1/m.

Now suppose Algorithm 4 runs with no errors from Search, SearchFirst or SearchLast. To prove the correctness of Algorithm 4, we consider the following two cases:

  1. 1.

    Case 1. ss is aperiodic. In this case, Algorithm 4 never reaches Lines 19-27. In each iteration, the leftmost and the rightmost remaining candidates pp and qq are found. If p=qp=q, then only one candidate remains, and thus a deterministic sample is found. If pqp\neq q, then there exists an index qj<p+nq\leq j<p+n at which s[jp]s[j-p] and s[jq]s[j-q] differ. We set il=ji_{l}=j and choose clc_{l} from s[jp]s[j-p] and s[jq]s[j-q] with equal probability. Then, with probability at least 1/21/2, at least half of the remaining candidates are eliminated. In other words, a deterministic sample is expected to be found within 2log2n2\log_{2}n iterations, and the iteration limit m8log2nm\geq 8\log_{2}n makes the failure probability 1/4\leq 1/4. That is, a deterministic sample is found with probability 3/4\geq 3/4.

  2. 2.

    Case 2. ss is periodic with period dn/2d\leq n/2. In each iteration, the same argument as for aperiodic ss applies as long as Algorithm 4 does not reach Lines 19-27. When Lines 19-27 are reached for the first time, the candidates between q+1q+1 and n/21\lfloor n/2\rfloor-1 have been eliminated and qpq-p is divisible by dd. When Lines 19-27 are reached for the second time, all candidates q(modd)\not\equiv q\pmod{d} have been eliminated, and therefore a deterministic sample is found.

Combining the above two cases, a deterministic sample is found with probability 3/4(11/m)2/3\geq 3/4(1-1/m)\geq 2/3.

On the other hand, we note that a single call to SearchFirst or SearchLast in Algorithm 4 makes O(nlog(1/ε))=O(nloglogn)O\left(\sqrt{n\log(1/\varepsilon)}\right)=O\left(\sqrt{n\log\log n}\right) queries to UxU_{x} (by Lemma 3.5), and Search has query complexity O(nlog(1/ε))=O(nloglogn)O\left(\sqrt{n\log(1/\varepsilon)}\right)=O\left(\sqrt{n\log\log n}\right) (by Theorem 3.2). Hence, a single iteration has query complexity O(nlloglogn)=O(nlognloglogn)O\left(\sqrt{nl\log\log n}\right)=O\left(\sqrt{n\log n\log\log n}\right), and the total query complexity (over mm iterations) is O(mnlognloglogn)=O(nlog3nloglogn)O\left(m\sqrt{n\log n\log\log n}\right)=O\left(\sqrt{n\log^{3}n\log\log n}\right). ∎

Algorithm 4 is a 22-level nested quantum algorithm (see Appendix C for a more detailed discussion), and is a better solution for deterministic sampling in O(nlog3nloglogn)O\left(\sqrt{n\log^{3}n\log\log n}\right) queries than the known O(nlog2n)O\left(\sqrt{n}\log^{2}n\right) solution in [RV03].

Remark 4.2.

In order to make our quantum algorithm for deterministic sampling time and space efficient, we store and modify the current deterministic sample (δ;i0,i1,,il1)(\delta;i_{0},i_{1},\dots,i_{l-1}) in the QRAM during execution, which requires O(llogn)=O(log2n)O(l\log n)=O(\log^{2}n) bits of memory. Moreover, only O(logn)O(\log n) qubits are needed in the computation (for search and minimum finding). In this way, the time complexity of the quantum algorithm is O(nlog5nloglogn)O\left\lparen\sqrt{n\log^{5}n\log\log n}\right\rparen, which is only an O(logn)O\lparen\log n\rparen factor larger than its query complexity.

4.3 Applications

Based on our quantum algorithm for deterministic sampling, we provide applications for string periodicity and pattern matching.

4.3.1 String Periodicity

We can check whether a string is periodic (and if yes, find its period) with its deterministic sample. Formally, let (δ;i0,i1,,il1)(\delta;i_{0},i_{1},\dots,i_{l-1}) be a deterministic sample of ss, and xjx_{j} defined in Eq. (3). Let j1j_{1} denote the smallest index jj such that xj=1x_{j}=1, which can be computed by SearchFirst on xjx_{j}. After j1j_{1} is obtained, define

xj={0jj1(k[l],jik<j+ns[ikj]ck)1otherwise.x_{j}^{\prime}=\begin{cases}0&j\leq j_{1}\lor\left(\exists k\in[l],j\leq i_{k}<j+n\land s[i_{k}-j]\neq c_{k}\right)\\ 1&\text{otherwise}\end{cases}.

Let j2j_{2} denote the smallest index jj such that xj=1x_{j}^{\prime}=1 (with j2=1j_{2}=-1 if no such index is found), which can be computed by SearchFirst on xjx_{j}^{\prime}. If j2=1j_{2}=-1, then ss is aperiodic; otherwise, ss is periodic with period d=j2j1d=j_{2}-j_{1}. This algorithm for checking periodicity has query complexity O(nlogn)O\left(\sqrt{n\log n}\right). (See Appendix D for more details.)
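
The following classical transliteration is only a sketch of this procedure: SearchFirst is replaced by linear scans, the function name is ours, and, as an assumption on our part, j is taken to range over the candidate offsets 0 ≤ j < ⌊n/2⌋ as in Eq. (3); the exact ranges and the error analysis are fixed in Appendix D.

def period_from_sample(s, checkpoints):
    # checkpoints: the pairs (i_k, c_k) of a deterministic sample of s.
    # Returns the period d of s, or None if s is aperiodic, following the j1/j2 rule above.
    n = len(s)

    def x(j):  # x_j of Eq. (3)
        return all(j <= i < j + n and s[i - j] == c for i, c in checkpoints)

    sols = [j for j in range(n // 2) if x(j)]
    j1 = sols[0]                      # smallest j with x_j = 1 (found by SearchFirst)
    later = [j for j in sols if j > j1]
    if not later:                     # j2 = -1: no second solution, so s is aperiodic
        return None
    j2 = later[0]                     # smallest j with x'_j = 1
    return j2 - j1                    # period d = j2 - j1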

4.3.2 Pattern Matching

As an application of deterministic sampling, we have a quantum algorithm for pattern matching with query complexity O(nlogm+mlog3mloglogm)O\left(\sqrt{n\log m}+\sqrt{m\log^{3}m\log\log m}\right), better than the best known solution in [RV03] with query complexity O(nlog(n/m)logm+mlog2m)O\left(\sqrt{n}\log(n/m)\log m+\sqrt{m}\log^{2}m\right).

For readability, these algorithms are postponed to Appendix D.

5 The Quantum Algorithm for LMSR

Now we are ready to present our quantum algorithm for LMSR and thus prove Theorem 1.1. This algorithm is designed in two steps:

  1. 1.

    Design a quantum algorithm with worst-case query complexity O(n3/4)O\left(n^{3/4}\right) in Section 5.1; and

  2. 2.

    Improve the algorithm to average-case query complexity O(nlogn)O\left(\sqrt{n}\log n\right) in Section 5.2.

5.1 The Basic Algorithm

For convenience, we assume that the alphabet Σ=[α]\Sigma=[\alpha] for some α2\alpha\geq 2, where [n]={0,1,2,,n1}[n]=\{0,1,2,\dots,n-1\} and the total order of Σ\Sigma follows that of natural numbers. Suppose the input string sΣns\in\Sigma^{n} is given by an oracle UinU_{\text{in}}:

Uin|i,j=|i,js[i].U_{\text{in}}\lvert i,j\rangle=\lvert i,j\oplus s[i]\rangle.

The overall idea of our algorithm is to split ss into blocks of length Θ(B)\Theta(B), and then find a candidate in each block with the help of the length-BB prefix of SCR(s)\operatorname{SCR}(s). Candidates are then eliminated across blocks by the exclusion rule for LMSR (see Lemma 5.1). We describe this in detail in the next three subsections.

5.1.1 Find a Prefix of SCR

Our first goal is to find the prefix p=s[LMSR(s)LMSR(s)+B1]p=s[\operatorname{LMSR}(s)\dots\operatorname{LMSR}(s)+B-1] of SCR(s)\operatorname{SCR}(s) of length BB by finding an index i[n]i^{*}\in[n] such that s[ii+B1]s[i^{*}\dots i^{*}+B-1] matches pp, where B=nB=\lfloor\sqrt{n}\rfloor is chosen optimally (see later discussions). To achieve this, we need to compare two substrings of ss of length BB with the comparator cmpB\text{cmp}_{B}:

cmpB(i,j)={1s[ii+B1]<s[jj+B1]0otherwise.\text{cmp}_{B}(i,j)=\begin{cases}1&s[i\dots i+B-1]<s[j\dots j+B-1]\\ 0&\text{otherwise}\end{cases}. (5)

According to Algorithm 3, we can obtain a bounded-error comparison oracle UcmpBU_{\text{cmp}_{B}} corresponding to cmpB\text{cmp}_{B} with O(B)O\left(\sqrt{B}\right) queries to UinU_{\text{in}}. After that, we find an index i[n]i^{*}\in[n] such that s[ii+B1]=ps[i^{*}\dots i^{*}+B-1]=p by calling Minimum(UcmpB)\textbf{Minimum}(U_{\text{cmp}_{B}}), which needs O(n)O\left(\sqrt{n}\right) queries to UcmpBU_{\text{cmp}_{B}} (by Lemma 3.4) and succeeds with constant probability. In the following discussion, we use ii^{*} to find possible candidates for LMSR(s)\operatorname{LMSR}(s) and then determine the solution among all candidates.
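
Classically, cmp_B of Eq. (5) is simply a lexicographic comparison of two length-B windows of the cyclic string. A direct sketch is given below (the function name cmp_window is ours; indices are taken modulo n since windows may wrap around the end of s, whereas the quantum version realizes this comparison with O(√B) queries via Algorithm 3):

def cmp_window(s, i, j, B):
    # Eq. (5): returns 1 if s[i..i+B-1] < s[j..j+B-1] lexicographically, and 0 otherwise.
    n = len(s)
    for k in range(B):
        a, b = s[(i + k) % n], s[(j + k) % n]   # windows are read cyclically
        if a != b:
            return 1 if a < b else 0
    return 0                                    # equal windows are not "less than"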

5.1.2 Candidate in Each Block

Being able to access the contents of pp via ii^{*}, we can obtain a deterministic sample of pp by Algorithm 4 with O(Blog3BloglogB)=O~(B)O\left(\sqrt{B\log^{3}B\log\log B}\right)=\tilde{O}\lparen\sqrt{B}\rparen queries, succeeding with constant probability. Suppose a deterministic sample of pp is known to be (δ;i0,i1,,il1)(\delta;i_{0},i_{1},\dots,i_{l-1}). We split ss into blocks of length L=B/4L=\lfloor B/4\rfloor. In the ii-th block (0-indexed, 0i<n/L0\leq i<\lceil n/L\rceil), whose indices range from iLiL to min{(i+1)L,n}1\min\{(i+1)L,n\}-1, a candidate hih_{i} is computed by

hi=min{iLj<min{(i+1)L,n}:s[jj+B1]=p},h_{i}=\min\{iL\leq j<\min\{(i+1)L,n\}:s[j\dots j+B-1]=p\}, (6)

where the minimum is taken over all indices jj in the ii-th block such that s[jj+B1]=ps[j\dots j+B-1]=p, and min=\min\emptyset=\infty. Intuitively, for each 0i<n/L0\leq i<\lceil n/L\rceil, hih_{i} defined by Eq. (6) denotes the leftmost possible candidate for LMSR(s)\operatorname{LMSR}(s) such that s[hihi+B1]=ps[h_{i}\dots h_{i}+B-1]=p in the ii-th block. On the other hand, hih_{i} denotes the first occurrence of pp with starting index in the ii-th block of ss, and thus can be computed by a procedure in quantum pattern matching (see Appendix D for more details), which needs O(BlogB)=O~(B)O\left(\sqrt{B\log B}\right)=\tilde{O}\lparen\sqrt{B}\rparen queries to UinU_{\text{in}} with the help of the deterministic sample of pp. We write UhU_{h} for the bounded-error oracle of hih_{i}. Note that UhU_{h} is a 22-level nested quantum oracle.
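
A brute-force classical reference for Eq. (6) could look as follows (the function name is ours; windows are again read cyclically, whereas the quantum algorithm computes h_i via the deterministic-sample-based pattern-matching subroutine of Appendix D):

def block_candidate(s, p, i, L):
    # Eq. (6): the leftmost index j in the i-th block at which the length-|p| window equals p;
    # returns float('inf') if the block contains no such index (min over the empty set).
    n, B = len(s), len(p)
    for j in range(i * L, min((i + 1) * L, n)):
        if all(s[(j + k) % n] == p[k] for k in range(B)):
            return j
    return float('inf')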

5.1.3 Candidate Elimination between Blocks

If we know the values of hih_{i} for 0i<n/L0\leq i<\lceil n/L\rceil, where each hih_{i} is either a candidate or \infty (indicating that the block contains no candidate), then we can find LMSR(s)\operatorname{LMSR}(s) among all hih_{i}’s with the comparator

cmp(i,j)={1cmpn(hi,hj)=1,1cmpn(hi,hj)=cmpn(hj,hi)=0hi<hj,0otherwise,\text{cmp}(i,j)=\begin{cases}1&\text{cmp}_{n}(h_{i},h_{j})=1,\\ 1&\text{cmp}_{n}(h_{i},h_{j})=\text{cmp}_{n}(h_{j},h_{i})=0\land h_{i}<h_{j},\\ 0&\text{otherwise,}\end{cases} (7)

where cmpn\text{cmp}_{n} is defined by Eq. (5), and \infty can be regarded as nn explicitly in the computation. Then we can obtain the bounded-error comparison oracle UcmpU_{\text{cmp}} corresponding to cmp with a constant number of queries to UcmpnU_{\text{cmp}_{n}} and UhU_{h}, i.e. with O(BlogB+n)O\left(\sqrt{B\log B}+\sqrt{n}\right) queries to UinU_{\text{in}}. Here, UcmpU_{\text{cmp}} is a 22-level nested quantum oracle. At the end of the algorithm, LMSR(s)\operatorname{LMSR}(s) is obtained as the minimal element among the hih_{i}’s, found with the comparison oracle UcmpU_{\text{cmp}} according to the comparator cmp. It can be seen that the algorithm is a 33-level nested quantum algorithm.
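
Concretely, given the values h_i, the comparator of Eq. (7) can be phrased classically as below (a sketch only, with a function name of our own: ∞ is represented by float('inf') and treated as the index n, following the convention above, and rotations are built explicitly instead of being compared through the oracle U_cmp_n):

def cmp_blocks(s, h, i, j):
    # Eq. (7): returns 1 iff block i beats block j, i.e. the rotation starting at h[i] is
    # lexicographically smaller than the one at h[j], with ties broken by the smaller index.
    n = len(s)
    hi = n if h[i] == float('inf') else h[i]
    hj = n if h[j] == float('inf') else h[j]
    rot = lambda a: s[a % n:] + s[:a % n]       # rotation of s starting at index a
    if rot(hi) < rot(hj):
        return 1
    if rot(hi) == rot(hj) and hi < hj:
        return 1
    return 0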

The Algorithm

We summarize the above design ideas in Algorithm 5. There are four main steps (Lines 7, 8, 9 and 10) in the algorithm. In particular, Line 9 of Algorithm 5 involves a 33-level nested quantum algorithm. For convenience, we assume that each of these steps succeeds with a high enough constant probability, say 0.99\geq 0.99. To achieve this, each step only needs a constant number of repetitions to amplify the success probability from 2/32/3 up to 0.990.99.

Algorithm 5 BasicLMSR(Uin)\textbf{BasicLMSR}(U_{\text{in}}): Quantum algorithm for LMSR.
1:The query oracle UinU_{\text{in}} for string ss.
2:LMSR(s)\operatorname{LMSR}(s) with probability 2/3\geq 2/3.
3:if n15n\leq 15 then
4:     return LMSR(s)\operatorname{LMSR}(s) by classical algorithms in Theorem 2.1.
5:end if
6:BnB\leftarrow\lfloor\sqrt{n}\rfloor.
7:iMinimum(UcmpB)i^{*}\leftarrow\textbf{Minimum}(U_{\text{cmp}_{B}}), where cmpB\text{cmp}_{B} is defined by Eq. (5).
8:(δ;i0,i1,,il1)DeterministicSampling(Up)(\delta;i_{0},i_{1},\dots,i_{l-1})\leftarrow\textbf{DeterministicSampling}(U_{p}), where p=s[ii+B1]p=s[i^{*}\dots i^{*}+B-1].
9:iMinimum(Ucmp)i\leftarrow\textbf{Minimum}(U_{\text{cmp}}), where cmp is defined by Eq. (7).
10:return hih_{i}, where hih_{i} is defined by Eq. (6).

Complexity

The query complexity of Algorithm 5 comes from the following four parts:

  1. 1.

    One call to Minimum(UcmpB)\textbf{Minimum}(U_{\text{cmp}_{B}}), which needs O(n)O\left(\sqrt{n}\right) queries to UcmpBU_{\text{cmp}_{B}} (by Lemma 3.4), i.e. O(nB)O\left(\sqrt{nB}\right) queries to UinU_{\text{in}}.

  2. 2.

    One call to DeterministicSampling(Us[ii+B1])\textbf{DeterministicSampling}(U_{s[i^{*}\dots i^{*}+B-1]}), which needs O(Blog3BloglogB)O\left(\sqrt{B\log^{3}B\log\log B}\right) queries to UinU_{\text{in}} (by Lemma 4.3).

  3. 3.

    One call to Minimum(Ucmp)\textbf{Minimum}(U_{\text{cmp}}), which needs O(n/L)O\left(\sqrt{n/L}\right) queries to UcmpU_{\text{cmp}} (by Lemma 3.4), i.e.

    O(n/L(BlogB+n))=O(n/B)O\left(\sqrt{n/L}\left(\sqrt{B\log B}+\sqrt{n}\right)\right)=O\left(n/\sqrt{B}\right)

    queries to UinU_{\text{in}}.

  4. 4.

    Compute hih_{i}, i.e. one query to UhU_{h}, which needs O(BlogB)O\left(\sqrt{B\log B}\right) queries to UinU_{\text{in}}.

Therefore, the total query complexity is

O(nB+n/B)=O(n3/4)O\left(\sqrt{nB}+n/\sqrt{B}\right)=O\left(n^{3/4}\right)

by selecting B=Θ(n)B=\Theta\left(\sqrt{n}\right), which balances the two terms nB\sqrt{nB} and n/Bn/\sqrt{B}.

Correctness

The correctness of Algorithm 5 is not obvious, because we only keep one candidate in each block, while a single block may contain several candidates that match pp. This issue is resolved by the following exclusion rule:

  • For every two equal substrings s[ii+B1]s[i\dots i+B-1] and s[jj+B1]s[j\dots j+B-1] of ss that overlap each other with 0i<j<n0\leq i<j<n and 1Bn/21\leq B\leq n/2, if both of them are prefixes of SCR(s)\operatorname{SCR}(s), then LMSR(s)\operatorname{LMSR}(s) cannot be the larger index jj.

More precisely, this exclusion rule can be stated as the following:

Lemma 5.1 (Exclusion Rule for LMSR).

Suppose sΣns\in\Sigma^{n} is a string of length nn. Let 2Bn/22\leq B\leq n/2, and two indices i,j[n]i,j\in[n] with i<j<i+Bi<j<i+B. If s[ii+B1]=s[jj+B1]=s[LMSR(s)LMSR(s)+B1]s[i\dots i+B-1]=s[j\dots j+B-1]=s[\operatorname{LMSR}(s)\dots\operatorname{LMSR}(s)+B-1], then LMSR(s)j\operatorname{LMSR}(s)\neq j.

Proof.

See Appendix E. ∎

Indeed, the above exclusion rule can be viewed as the Ricochet Property of LMSR. Here, the Ricochet Property means that if two candidates are in the same block, then at most one of them can survive. This kind of Ricochet Property has been found useful in string matching, e.g., [Vis90]. If two candidates lie in the same block, then, since each block has length L=B/4<BL=\lfloor B/4\rfloor<B, they must overlap each other; by the rule above, only the smaller candidate remains. Consequently, the correctness of Algorithm 5 is guaranteed because hih_{i} defined by Eq. (6) always chooses the smallest candidate in each block.
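
The exclusion rule is also easy to check empirically; the following brute-force sketch (illustrative only, with function names of our own) verifies Lemma 5.1 on small random binary strings:

import random

def lmsr_bruteforce(s):
    # smallest index of the lexicographically minimal rotation
    return min(range(len(s)), key=lambda i: (s[i:] + s[:i], i))

def check_exclusion_rule(trials=1000):
    for _ in range(trials):
        n = random.randint(4, 12)
        s = ''.join(random.choice('ab') for _ in range(n))
        r = lmsr_bruteforce(s)
        for B in range(2, n // 2 + 1):
            pref = (s[r:] + s[:r])[:B]                           # length-B prefix of SCR(s)
            occ = [i for i in range(n) if (s[i:] + s[:i])[:B] == pref]
            for i in occ:
                for j in occ:
                    if i < j < i + B:
                        assert r != j     # Lemma 5.1: LMSR(s) is never the larger index
    return True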

After the above discussions, we obtain:

Theorem 5.2.

Algorithm 5 is an O(n3/4)O\left(n^{3/4}\right) bounded-error quantum query algorithm for LMSR.

The quantum algorithm given by Theorem 5.2 uses quantum deterministic sampling (Lemma 4.3) as a subroutine. It can also be made time-efficient in the same way as discussed in Remark 4.2.

5.2 An Improvement for Better Average-Case Query Complexity

In the previous subsection, we proposed a quantum algorithm for LMSR that is efficient in terms of its worst-case query complexity. It is easy to see that its average-case query complexity is the same as its worst-case query complexity. In this subsection, we give an improved algorithm with better average-case query complexity that retains the same worst-case query complexity.

The basic idea is to handle separately a special case that covers almost all inputs on average. Let B=3logαnB=\lceil 3\log_{\alpha}n\rceil. Our strategy is to first consider only substrings of length BB. Let

k=argmini[n]s[ii+B1]k=\arg\min_{i\in[n]}s[i\dots i+B-1]

denote the index of the minimal substring among all substrings of length BB, and then let

k=argmini[n]{k}s[ii+B1]k^{\prime}=\arg\min_{i\in[n]\setminus\{k\}}s[i\dots i+B-1]

denote the index of the second minimal substring among all substrings of length BB. If s[kk+B1]s[kk+B1]s[k\dots k+B-1]\neq s[k^{\prime}\dots k^{\prime}+B-1], then it immediately holds that LMSR(s)=k\operatorname{LMSR}(s)=k. To find the second minimal substring, the index kk of the minimal substring should be excluded. For this, we need comparator cmpBk\text{cmp}_{B\setminus k}:

cmpBk(i,j)={0i=k1ikj=kcmpB(i,j)otherwise.\text{cmp}_{B\setminus k}(i,j)=\begin{cases}0&i=k\\ 1&i\neq k\land j=k\\ \text{cmp}_{B}(i,j)&\text{otherwise}\end{cases}. (8)

The bounded-error quantum comparison oracle UcmpBkU_{\text{cmp}_{B\setminus k}} corresponding to cmpBk(i,j)\text{cmp}_{B\setminus k}(i,j) can be defined with at most one query to UcmpBU_{\text{cmp}_{B}}.
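
A classical rendering of this shortcut (a sketch in which the two quantum Minimum calls are replaced by brute-force minimum finding, with a function name of our own) makes the role of cmp_{B∖k} transparent:

def improved_lmsr_fast_path(s, B):
    # If the minimal length-B window of s is unique, its starting index is LMSR(s);
    # otherwise return None, signalling the fall-back to BasicLMSR (not reproduced here).
    # Assumes n >= 2; smaller inputs are handled classically in Algorithm 6.
    n = len(s)
    window = lambda a: ''.join(s[(a + t) % n] for t in range(B))
    k = min(range(n), key=lambda a: (window(a), a))                          # minimal window
    k2 = min((a for a in range(n) if a != k), key=lambda a: (window(a), a))  # second minimal, as in Eq. (8)
    return k if window(k) < window(k2) else None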

The Algorithm

Our improved algorithm is presented as Algorithm 6. It has three main steps (Lines 7, 8 and 12). For the same reason as in Algorithm 5, we assume that the third step (Line 12) succeeds with a high enough constant probability, say 0.99\geq 0.99.

Algorithm 6 ImprovedLMSR(Uin)\textbf{ImprovedLMSR}(U_{\text{in}}): Improved quantum algorithm for LMSR.
1:The query oracle UinU_{\text{in}} for string ss.
2:LMSR(s)\operatorname{LMSR}(s) with probability 2/3\geq 2/3.
3:if n3n\leq 3 then
4:     return LMSR(s)\operatorname{LMSR}(s) by classical algorithms.
5:end if
6:B3logαnB\leftarrow\lceil 3\log_{\alpha}n\rceil.
7:kMinimum(UcmpB,1/2n)k\leftarrow\textbf{Minimum}(U_{\text{cmp}_{B}},1/2n), where cmpB\text{cmp}_{B} is defined by Eq. (5).
8:kMinimum(UcmpBk,1/2n)k^{\prime}\leftarrow\textbf{Minimum}(U_{\text{cmp}_{B\setminus k}},1/2n), where cmpBk\text{cmp}_{B\setminus k} is defined by Eq. (8).
9:if cmpB(k,k)=1\text{cmp}_{B}(k,k^{\prime})=1 then
10:     return kk.
11:else
12:     return BasicLMSR(Uin)\textbf{BasicLMSR}(U_{\text{in}}).
13:end if

Correctness

The correctness of Algorithm 6 is straightforward. We only need to consider the case where n4n\geq 4 and all of the three main steps succeed, which happens with probability

(112n)20.99>23.\geq\left(1-\frac{1}{2n}\right)^{2}\cdot 0.99>\frac{2}{3}.

If cmpB(k,k)=1\text{cmp}_{B}(k,k^{\prime})=1, then s[kk+B1]s[k\dots k+B-1] is the unique minimal substring of length BB, and it immediately holds that LMSR(s)=k\operatorname{LMSR}(s)=k. Otherwise, the correctness follows from that of BasicLMSR(Uin)\textbf{BasicLMSR}(U_{\text{in}}), which is guaranteed by Theorem 5.2.

Complexity

The worst-case query complexity of Algorithm 6 is O(n3/4)O\left(n^{3/4}\right), which follows directly from Theorem 5.2. Settling the average-case query complexity of Algorithm 6 is a bit more subtle and requires a better understanding of some properties of LMSR. To this end, we first introduce the notion of string sensitivity.

Definition 5.1 (String Sensitivity).

Let sΣns\in\Sigma^{n} be a string of length nn over a finite alphabet Σ\Sigma. The string sensitivity of ss, denoted C(s)C(s), is the smallest positive integer ll such that s[ii+l1]s[jj+l1]s[i\dots i+l-1]\neq s[j\dots j+l-1] for all 0i<j<n0\leq i<j<n. If no such ll exists, define C(s)=C(s)=\infty.

The string sensitivity of a string is a metric indicating the difficulty to distinguish its rotations by their prefixes. If we know the string sensitivity C(s)C(s) of a string ss, we can compute LMSR(s)\operatorname{LMSR}(s) by finding the minimal string among all substrings of ss of length C(s)C(s), that is, s[ii+C(s)1]s[i\dots i+C(s)-1] for all i[n]i\in[n].
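
For concreteness, a brute-force sketch of the string sensitivity of Definition 5.1 and of the resulting LMSR computation is given below (illustrative only, with function names of our own; windows are read cyclically and no quantum speed-up is involved):

def string_sensitivity(s):
    # Definition 5.1: smallest l such that all n cyclic windows of length l are pairwise distinct;
    # float('inf') if no such l exists (e.g. when s is periodic).
    n = len(s)
    window = lambda a, l: ''.join(s[(a + t) % n] for t in range(l))
    for l in range(1, n + 1):
        if len({window(a, l) for a in range(n)}) == n:
            return l
    return float('inf')

def lmsr_via_sensitivity(s):
    # If C(s) is finite, the minimal window of length C(s) starts exactly at LMSR(s).
    l = string_sensitivity(s)
    if l == float('inf'):
        return None                    # some rotations coincide; this case is handled separately
    n = len(s)
    window = lambda a: ''.join(s[(a + t) % n] for t in range(l))
    return min(range(n), key=window)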

The following lemma shows that almost all strings have a low string sensitivity.

Lemma 5.3 (String Sensitivity Distribution).

Let ss be a uniformly random string over Σn\Sigma^{n} and 1Bn/21\leq B\leq n/2. Then

Pr[C(s)B]112n(n1)αB,\Pr[C(s)\leq B]\geq 1-\frac{1}{2}n(n-1)\alpha^{-B},

where α=|Σ|\alpha=\lvert\Sigma\rvert. In particular,

Pr[C(s)3logαn]11n.\Pr\left[C(s)\leq\lceil 3\log_{\alpha}n\rceil\right]\geq 1-\frac{1}{n}.
Proof.

Let sΣns\in\Sigma^{n} and 0i<j<n0\leq i<j<n. We claim that

Pr[s[ii+B1]=s[jj+B1]]=αB.\Pr[s[i\dots i+B-1]=s[j\dots j+B-1]]=\alpha^{-B}.

This can be seen as follows. Let dd be the number of members that appear in both sequences {imodn,(i+1)modn,,(i+B1)modn}\{i\bmod n,(i+1)\bmod n,\dots,(i+B-1)\bmod n\} and {jmodn,(j+1)modn,,(j+B1)modn}\{j\bmod n,(j+1)\bmod n,\dots,(j+B-1)\bmod n\}. It is clear that 0d<B0\leq d<B. We note that s[ii+B1]=s[jj+B1]s[i\dots i+B-1]=s[j\dots j+B-1] implies the following system of BB equations:

s[i+k]=s[j+k]forall 0k<B.s[i+k]=s[j+k]\ {\rm for\ all}\ 0\leq k<B.

On the other hand, these BB equations involve 2Bd2B-d (random) characters. Therefore, there must be (2Bd)B=Bd(2B-d)-B=B-d independent characters, and the probability that the BB equations hold is

Pr[s[ii+B1]=s[jj+B1]]=αBdα2Bd=αB.\Pr[s[i\dots i+B-1]=s[j\dots j+B-1]]=\frac{\alpha^{B-d}}{\alpha^{2B-d}}=\alpha^{-B}.

Consequently, we have:

Pr[C(s)B]\displaystyle\Pr[C(s)\leq B] =1Pr[0i<j<n,s[ii+B1]=s[jj+B1]]\displaystyle=1-\Pr[\exists 0\leq i<j<n,s[i\dots i+B-1]=s[j\dots j+B-1]]
1i=0n1j=i+1n1Pr[s[ii+B1]=s[jj+B1]]\displaystyle\geq 1-\sum_{i=0}^{n-1}\sum_{j=i+1}^{n-1}\Pr[s[i\dots i+B-1]=s[j\dots j+B-1]]
=112n(n1)αB.\displaystyle=1-\frac{1}{2}n(n-1)\alpha^{-B}.

In particular, in the case of B=3logαnB=\lceil 3\log_{\alpha}n\rceil, it holds that αBn3\alpha^{B}\geq n^{3} and we obtain:

Pr[C(s)B]112n(n1)n311n.\Pr[C(s)\leq B]\geq 1-\frac{1}{2}n(n-1)n^{-3}\geq 1-\frac{1}{n}. ∎

With the above preparation, we can now analyze the average-case query complexity of Algorithm 6. Let ss be a uniformly random string over Σn\Sigma^{n} and B=3logαnB=\lceil 3\log_{\alpha}n\rceil. Let kk and kk^{\prime} denote the indices of the minimal and the second minimal substrings of length BB of ss. To compute kk and kk^{\prime}, by Lemma 3.5, Algorithm 6 needs to make O(nlogn)O\left(\sqrt{n\log n}\right) queries to UcmpBU_{\text{cmp}_{B}}, which amounts to O(nlognB)=O(nlogn)O\left(\sqrt{n\log n}\sqrt{B}\right)=O\left(\sqrt{n}\log n\right) queries to UinU_{\text{in}}. On the other hand, it requires only O(B)=O(logn)O(B)=O(\log n) queries to UinU_{\text{in}} to check whether cmpB(k,k)=1\text{cmp}_{B}(k,k^{\prime})=1, which is negligible compared with the other terms. Based on the result of cmpB(k,k)\text{cmp}_{B}(k,k^{\prime}), we only need to further consider the following two cases:

Case 1. cmpB(k,k)=1\text{cmp}_{B}(k,k^{\prime})=1. Note that this case happens with probability

Pr[cmpB(k,k)=1]Pr[C(s)B]11n.\Pr[\text{cmp}_{B}(k,k^{\prime})=1]\geq\Pr[C(s)\leq B]\geq 1-\frac{1}{n}. (9)

In this case, Algorithm 6 returns kk immediately.

Case 2. cmpB(k,k)1\text{cmp}_{B}(k,k^{\prime})\neq 1. According to Eq. (9), this case happens with probability 1/n\leq 1/n. In this case, Algorithm 6 makes one query to BasicLMSR(Uin)\textbf{BasicLMSR}(U_{\text{in}}), which needs O(n3/4)O\left(n^{3/4}\right) queries to UinU_{\text{in}} (by Theorem 5.2).

Combining the above two cases yields the average-case query complexity of Algorithm 6:

O(nlogn)+(11n)O(1)+1nO(n3/4)=O(nlogn).\leq O\left(\sqrt{n}\log n\right)+\left(1-\frac{1}{n}\right)\cdot O(1)+\frac{1}{n}\cdot O\left(n^{3/4}\right)=O\left(\sqrt{n}\log n\right).

After the above discussions, we obtain:

Theorem 5.4.

Algorithm 6 is an O(n3/4)O\left(n^{3/4}\right) bounded-error quantum query algorithm for LMSR, whose average-case query complexity is O(nlogn)O\left(\sqrt{n}\log n\right).

Algorithm 6 for Theorem 5.4 can be made time-efficient by an argument similar to that given in Section 5.1 for time and space efficiency.

6 Lower Bounds of LMSR

In this section, we establish average-case and worst-case lower bounds of both classical and quantum algorithms for the LMSR problem and thus prove Theorem 1.2.

The notion of block sensitivity is the key tool we use to obtain lower bounds. Let f:{0,1}n{0,1}f:\{0,1\}^{n}\to\{0,1\} be a Boolean function. If x{0,1}nx\in\{0,1\}^{n} is a binary string and S[n]S\subseteq[n], we use xSx^{S} to denote the binary string obtained by flipping the values of xix_{i} for iSi\in S, where xix_{i} is the ii-th character of xx:

(xS)i={x¯iiSxiiS,\left(x^{S}\right)_{i}=\begin{cases}\bar{x}_{i}&i\in S\\ x_{i}&i\notin S\end{cases},

where u¯\bar{u} denotes the negation of uu, i.e. 0¯=1\bar{0}=1 and 1¯=0\bar{1}=0. The block sensitivity of ff on input xx, denoted 𝑏𝑠x(f)\mathit{bs}_{x}(f), is the maximal number mm such that there are mm disjoint sets S1,S2,,Sm[n]S_{1},S_{2},\dots,S_{m}\subseteq[n] for which f(x)f(xSi)f(x)\neq f(x^{S_{i}}) for 1im1\leq i\leq m.

6.1 Average-Case Lower Bounds

For settling the average-case lower bound, we need the following useful result about block sensitivities given in [AdW99].

Theorem 6.1 (General bounds for average-case complexity, [AdW99, Theorem 6.3]).

For every function f:{0,1}n{0,1}f:\{0,1\}^{n}\to\{0,1\} and probability distribution μ:{0,1}n[0,1]\mu:\{0,1\}^{n}\to[0,1], we have Rμ(f)=Ω(𝔼xμ[𝑏𝑠x(f)])R^{\mu}(f)=\Omega\left(\mathbb{E}_{x\sim\mu}[\mathit{bs}_{x}(f)]\right) and Qμ(f)=Ω(𝔼xμ[𝑏𝑠x(f)])Q^{\mu}(f)=\Omega\left(\mathbb{E}_{x\sim\mu}\left[\sqrt{\mathit{bs}_{x}(f)}\right]\right).

In order to give a lower bound for LMSR by using Theorem 6.1, we need a Boolean function that is simpler than LMSR(x)\operatorname{LMSR}(x) but can be reduced to it. Here, we choose LMSR0(x)=LMSR(x)mod2\operatorname{LMSR}_{0}(x)=\operatorname{LMSR}(x)\bmod 2. Obviously, once LMSR(x)\operatorname{LMSR}(x) is computed, LMSR0(x)\operatorname{LMSR}_{0}(x) is immediately obtained. Moreover, LMSR0\operatorname{LMSR}_{0} enjoys the following basic property:

Lemma 6.2.

Let x{0,1}nx\in\{0,1\}^{n} and 0r<n0\leq r<n. Then 𝑏𝑠x(LMSR0)=𝑏𝑠x(r)(LMSR0)\mathit{bs}_{x}(\operatorname{LMSR}_{0})=\mathit{bs}_{x^{(r)}}(\operatorname{LMSR}_{0}).

Proof.

Let m=𝑏𝑠x(LMSR0)m=\mathit{bs}_{x}(\operatorname{LMSR}_{0}) and S1,S2,,SmS_{1},S_{2},\dots,S_{m} be the mm disjoint sets for which LMSR0(x)LMSR0(xSi)\operatorname{LMSR}_{0}(x)\neq\operatorname{LMSR}_{0}\left(x^{S_{i}}\right) for 1im1\leq i\leq m. We define Si={(a+r)modn:aSi}S_{i}^{\prime}=\{(a+r)\bmod n:a\in S_{i}\} for 1im1\leq i\leq m. Then it can be verified that LMSR0(x(r))LMSR0((x(r))Si)\operatorname{LMSR}_{0}\left(x^{(r)}\right)\neq\operatorname{LMSR}_{0}\left(\left(x^{(r)}\right)^{S_{i}}\right) for 1im1\leq i\leq m. Hence, 𝑏𝑠x(r)(LMSR0)m=𝑏𝑠x(LMSR0)\mathit{bs}_{x^{(r)}}(\operatorname{LMSR}_{0})\geq m=\mathit{bs}_{x}(\operatorname{LMSR}_{0}).

The same argument yields that 𝑏𝑠x(LMSR0)𝑏𝑠x(r)(LMSR0)\mathit{bs}_{x}(\operatorname{LMSR}_{0})\geq\mathit{bs}_{x^{(r)}}(\operatorname{LMSR}_{0}). Therefore, 𝑏𝑠x(LMSR0)=𝑏𝑠x(r)(LMSR0)\mathit{bs}_{x}(\operatorname{LMSR}_{0})=\mathit{bs}_{x^{(r)}}(\operatorname{LMSR}_{0}). ∎

Next, we establish a lower bound for 𝑏𝑠x(LMSR0)\mathit{bs}_{x}(\operatorname{LMSR}_{0}).

Lemma 6.3.

Let x{0,1}nx\in\{0,1\}^{n}. Then

𝑏𝑠x(LMSR0)n4C(x)14,\mathit{bs}_{x}(\operatorname{LMSR}_{0})\geq\left\lfloor\frac{n}{4C(x)}-\frac{1}{4}\right\rfloor, (10)

where C(x)C(x) is the string sensitivity of xx.

Proof.

We first note that inequality (10) is trivially true when C(x)>n/5C(x)>n/5 because the right hand side is equal to 0. For the case of C(x)n/5C(x)\leq n/5, our proof is carried out in two steps:

Step 1. Let us start from the special case of LMSR(x)=0\operatorname{LMSR}(x)=0. Note that LMSR0(x)=0\operatorname{LMSR}_{0}(x)=0. Let B=C(x)B=C(x). We split xx into (k+1)(k+1) substrings x=y1y2ykyk+1x=y_{1}y_{2}\dots y_{k}y_{k+1}, where |yi|=B\lvert y_{i}\rvert=B for 1ik1\leq i\leq k, |yk+1|=nmodB\lvert y_{k+1}\rvert=n\bmod B, and k=n/Bk=\lfloor n/B\rfloor. By the assumption that C(x)=BC(x)=B, we have y1<min{y2,y3,,yk}y_{1}<\min\{y_{2},y_{3},\dots,y_{k}\}. It holds that y1y2>02By_{1}y_{2}>0^{2B}; otherwise, y1=y2=0By_{1}=y_{2}=0^{B}, and then a contradiction C(x)>BC(x)>B arises.

Let m=(k1)/4m=\lfloor(k-1)/4\rfloor. We select some of the yiy_{i}’s and divide them into mm groups (the others are ignored). For every 1im1\leq i\leq m, the ii-th group is zi=y4i2y4i1y4iy4i+1z_{i}=y_{4i-2}y_{4i-1}y_{4i}y_{4i+1}. Let LiL_{i} be the number of characters in front of ziz_{i}; then Li=(4i3)BL_{i}=(4i-3)B. We claim that 𝑏𝑠x(LMSR0)m\mathit{bs}_{x}(\operatorname{LMSR}_{0})\geq m by explicitly constructing mm disjoint sets S1,S2,,Sm[n]S_{1},S_{2},\dots,S_{m}\subseteq[n] such that LMSR0(x)LMSR0(xSi)\operatorname{LMSR}_{0}(x)\neq\operatorname{LMSR}_{0}\left(x^{S_{i}}\right) for 1im1\leq i\leq m:

  1. 1.

    If LiL_{i} is even, then we define:

    Si={LijLi+2B:xjδj,Li},S_{i}=\{L_{i}\leq j\leq L_{i}+2B:x_{j}\neq\delta_{j,L_{i}}\},

    where δx,y\delta_{x,y} is the Kronecker delta, that is, δx,y=1\delta_{x,y}=1 if x=yx=y and 0 otherwise.

  2. 2.

    If LiL_{i} is odd, then we define:

    Si={Li+1jLi+2B+1:xjδj,Li+1}.S_{i}=\{L_{i}+1\leq j\leq L_{i}+2B+1:x_{j}\neq\delta_{j,L_{i}+1}\}.

Note that SiS_{i}\neq\emptyset, and 02B0^{2B} is indeed the substring of xSix^{S_{i}} that starts at the index Li+1L_{i}+1 if LiL_{i} is even and at Li+2L_{i}+2 if LiL_{i} is odd. That is,

LMSR(xSi)={Li+1Li0(mod2)Li+2Li1(mod2).\operatorname{LMSR}\left(x^{S_{i}}\right)=\begin{cases}L_{i}+1&L_{i}\equiv 0\pmod{2}\\ L_{i}+2&L_{i}\equiv 1\pmod{2}\end{cases}.

Then we conclude that LMSR0(xSi)=1\operatorname{LMSR}_{0}\left(x^{S_{i}}\right)=1 for 1im1\leq i\leq m. Consequently,

𝑏𝑠x(LMSR0)m=k14=n4B14.\mathit{bs}_{x}(\operatorname{LMSR}_{0})\geq m=\left\lfloor\frac{k-1}{4}\right\rfloor=\left\lfloor\frac{n}{4B}-\frac{1}{4}\right\rfloor.

Step 2. Now we remove the condition that LMSR(x)=0\operatorname{LMSR}(x)=0 in Step 1. Let r=LMSR(x)r=\operatorname{LMSR}(x) and consider the binary string x(r)x^{(r)}. Note that LMSR(x(r))=0\operatorname{LMSR}\left(x^{(r)}\right)=0. By Lemma 6.2, we have:

𝑏𝑠x(LMSR0)=𝑏𝑠x(r)(LMSR0)n4B14.\mathit{bs}_{x}(\operatorname{LMSR}_{0})=\mathit{bs}_{x^{(r)}}(\operatorname{LMSR}_{0})\geq\left\lfloor\frac{n}{4B}-\frac{1}{4}\right\rfloor.

Therefore, inequality (10) holds for all x{0,1}nx\in\{0,1\}^{n} with C(x)n/5C(x)\leq n/5. ∎

We remark that inequality (10) can be slightly improved:

𝑏𝑠x(LMSR0)n12C(x)+2,\mathit{bs}_{x}(\operatorname{LMSR}_{0})\geq\left\lfloor\frac{n-1}{2C(x)+2}\right\rfloor,

by splitting xx more carefully. However, Lemma 6.3 is sufficient for our purpose. With it, we obtain a lower bound of the expected value of 𝑏𝑠x(LMSR0)\mathit{bs}_{x}(\operatorname{LMSR}_{0}) when xx is uniformly distributed:

Lemma 6.4.

Let 𝑢𝑛𝑖𝑓:{0,1}n[0,1]\mathit{unif}:\{0,1\}^{n}\to[0,1] be the uniform distribution that 𝑢𝑛𝑖𝑓(x)=2n\mathit{unif}(x)=2^{-n} for every x{0,1}nx\in\{0,1\}^{n}. Then

𝔼x𝑢𝑛𝑖𝑓[𝑏𝑠x(LMSR0)]\displaystyle\mathbb{E}_{x\sim\mathit{unif}}\left[\mathit{bs}_{x}(\operatorname{LMSR}_{0})\right] =Ω(n/logn),\displaystyle=\Omega\left(n/\log n\right),
𝔼x𝑢𝑛𝑖𝑓[𝑏𝑠x(LMSR0)]\displaystyle\mathbb{E}_{x\sim\mathit{unif}}\left[\sqrt{\mathit{bs}_{x}(\operatorname{LMSR}_{0})}\right] =Ω(n/logn).\displaystyle=\Omega\left(\sqrt{n/\log n}\right).
Proof.

By Lemma 5.3 and Lemma 6.3, we have:

𝔼x𝑢𝑛𝑖𝑓[𝑏𝑠x(LMSR0)]\displaystyle\mathbb{E}_{x\sim\mathit{unif}}[\mathit{bs}_{x}(\operatorname{LMSR}_{0})] =x{0,1}n2n𝑏𝑠x(LMSR0)\displaystyle=\sum_{x\in\{0,1\}^{n}}2^{-n}\mathit{bs}_{x}(\operatorname{LMSR}_{0})
x{0,1}n2nn4C(x)14\displaystyle\geq\sum_{x\in\{0,1\}^{n}}2^{-n}\left\lfloor\frac{n}{4C(x)}-\frac{1}{4}\right\rfloor
=B=1nx{0,1}n:C(x)=B2nn4B14\displaystyle=\sum_{B=1}^{n}\sum_{x\in\{0,1\}^{n}:C(x)=B}2^{-n}\left\lfloor\frac{n}{4B}-\frac{1}{4}\right\rfloor
B=13log2nx{0,1}n:C(x)=B2nn4B14\displaystyle\geq\sum_{B=1}^{\lceil 3\log_{2}n\rceil}\sum_{x\in\{0,1\}^{n}:C(x)=B}2^{-n}\left\lfloor\frac{n}{4B}-\frac{1}{4}\right\rfloor
B=13log2nx{0,1}n:C(x)=B2nn43log2n14\displaystyle\geq\sum_{B=1}^{\lceil 3\log_{2}n\rceil}\sum_{x\in\{0,1\}^{n}:C(x)=B}2^{-n}\left\lfloor\frac{n}{4\lceil 3\log_{2}n\rceil}-\frac{1}{4}\right\rfloor
=n43log2n14Prx𝑢𝑛𝑖𝑓[C(x)3log2n]\displaystyle=\left\lfloor\frac{n}{4\lceil 3\log_{2}n\rceil}-\frac{1}{4}\right\rfloor\Pr_{x\sim\mathit{unif}}[C(x)\leq\lceil 3\log_{2}n\rceil]
n43log2n14(11n)\displaystyle\geq\left\lfloor\frac{n}{4\lceil 3\log_{2}n\rceil}-\frac{1}{4}\right\rfloor\left(1-\frac{1}{n}\right)
=Ω(nlogn).\displaystyle=\Omega\left(\frac{n}{\log n}\right).

A similar argument yields that 𝔼x𝑢𝑛𝑖𝑓[𝑏𝑠x(LMSR0)]=Ω(n/logn)\mathbb{E}_{x\sim\mathit{unif}}\left[\sqrt{\mathit{bs}_{x}(\operatorname{LMSR}_{0})}\right]=\Omega\left(\sqrt{n/\log n}\right). ∎

By combining the above lemma with Theorem 6.1, we obtain lower bounds for randomized and quantum average-case bounded-error algorithms for LMSR:

R𝑢𝑛𝑖𝑓(LMSR0)=Ω(n/logn)andQ𝑢𝑛𝑖𝑓(LMSR0)=Ω(n/logn).R^{\mathit{unif}}(\operatorname{LMSR}_{0})=\Omega(n/\log n)\ {\rm and}\ Q^{\mathit{unif}}(\operatorname{LMSR}_{0})=\Omega\left(\sqrt{n/\log n}\right).

6.2 Worst-Case Lower Bounds

Now we turn to the worst-case lower bounds. The idea is similar to that for the average case. First, the following result, similar to Theorem 6.1, was also proved in [AdW99].

Theorem 6.5 ([AdW99]).

Let AA be a bounded-error algorithm for some function f:{0,1}n{0,1}f:\{0,1\}^{n}\to\{0,1\}.

  1. 1.

    If AA is classical, then TA(x)=Ω(𝑏𝑠x(f))T_{A}(x)=\Omega\left(\mathit{bs}_{x}(f)\right); and

  2. 2.

    If AA is quantum, then TA(x)=Ω(𝑏𝑠x(f))T_{A}(x)=\Omega\left(\sqrt{\mathit{bs}_{x}(f)}\right).

We still consider the function LMSR0\operatorname{LMSR}_{0} in this subsection. The following lemma shows that its block sensitivity can be linear in the worst case.

Lemma 6.6.

There is a string x{0,1}nx\in\{0,1\}^{n} such that 𝑏𝑠x(LMSR0)n/2\mathit{bs}_{x}(\operatorname{LMSR}_{0})\geq\lfloor n/2\rfloor.

Proof.

Let x=1nx=1^{n}. Then LMSR(x)=0\operatorname{LMSR}(x)=0 and LMSR0(x)=0\operatorname{LMSR}_{0}(x)=0. We can choose m=n/2m=\lfloor n/2\rfloor disjoint sets S1,S2,,SmS_{1},S_{2},\dots,S_{m} with Si={2i1}S_{i}=\{2i-1\}. It may be easily verified that LMSR0(xSi)=1\operatorname{LMSR}_{0}\left(x^{S_{i}}\right)=1 for every 1im1\leq i\leq m. Thus, by the definition of block sensitivity, we have 𝑏𝑠x(LMSR0)n/2\mathit{bs}_{x}(\operatorname{LMSR}_{0})\geq\lfloor n/2\rfloor. ∎
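
The construction in the proof is easy to verify directly; a small brute-force sketch (illustrative only) is:

def verify_worst_case_blocks(n):
    # For x = 1^n and S_i = {2i-1}, check that LMSR_0 flips from 0 to 1 for every i.
    lmsr = lambda y: min(range(len(y)), key=lambda a: (y[a:] + y[:a], a))
    x = [1] * n
    assert lmsr(x) % 2 == 0                       # LMSR_0(x) = 0
    for i in range(1, n // 2 + 1):
        y = x[:]
        y[2 * i - 1] = 0                          # flip the single position in S_i
        assert lmsr(y) % 2 == 1                   # LMSR_0(x^{S_i}) = 1, since LMSR(y) = 2i-1
    return True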

Combining the above lemma with Theorem 6.5, we conclude that R(LMSR0)=Ω(n)R(\operatorname{LMSR}_{0})=\Omega(n) and Q(LMSR0)=Ω(n)Q(\operatorname{LMSR}_{0})=\Omega\left(\sqrt{n}\right), which give worst-case lower bounds for randomized and for quantum bounded-error algorithms for LMSR, respectively. Another, more intuitive proof of the worst-case lower bound for quantum bounded-error algorithms is postponed to Appendix F.

7 Applications

In this section, we present some practical applications of our quantum algorithm for LMSR.

7.1 Benzenoid Identification

The first application of our algorithm is a quantum solution to a problem about chemical graphs. Benzenoid hydrocarbons are a very important class of compounds [Dia87, Dia88] and also popular as mimics of graphene (see [WSS+08, WMK08, PLB+16]). Several algorithmic solutions to the identification problem of benzenoids have been proposed in the previous literature; for example, Bašić [Baš16] identifies benzenoids by boundary-edges code [HLZ96] (see also [KPBF14]).

Formally, the boundary-edges code (BEC) of a benzenoid is a finite string over a finite alphabet Σ6={1,2,3,4,5,6}\Sigma_{6}=\{1,2,3,4,5,6\}. The canonical BEC of a benzenoid is essentially the lexicographically maximal string among all rotations of any of its BECs and their reverses. Our quantum algorithm for LMSR can be used to find the canonical BEC of a benzenoid in O(n3/4)O\left(n^{3/4}\right) queries, where nn is the length of its BEC.

More precisely, finding the canonical BEC is equivalent to finding the lexicographically minimal one under the reversed order 6<5<4<3<2<16<5<4<3<2<1. Suppose a benzenoid has a BEC ss. Our quantum algorithm is described as follows:

  1. 1

    Let i=LMSR(s)i=\operatorname{LMSR}(s) and iR=LMSR(sR)i^{R}=\operatorname{LMSR}(s^{R}), where sRs^{R} denotes the reverse of ss. This is achieved by Algorithm 5 in O(n3/4)O(n^{3/4}) query complexity.

  2. 2

    Return the smaller one between s[ii+n1]s[i\dots i+n-1] and sR[iRiR+n1]s^{R}[i^{R}\dots i^{R}+n-1]. This is achieved by Algorithm 3 in O(n)O(\sqrt{n}) query complexity.

It is straightforward to see that the overall query complexity is O(n3/4)O(n^{3/4}).
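
A brute-force classical sketch of this two-step procedure (illustrative only: rotations are enumerated explicitly rather than found by the quantum subroutines, and the function name is ours) reads:

def canonical_bec(s):
    # Under the order 6 < 5 < 4 < 3 < 2 < 1, the lexicographically minimal rotation is the
    # maximal one under the usual digit order, so we take the maximal rotation of s and of
    # its reverse and return the larger of the two.
    best = lambda t: max(t[i:] + t[:i] for i in range(len(t)))
    return max(best(s), best(s[::-1]))

# Example with an arbitrary string over Sigma_6: canonical_bec("515561")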

7.2 Disjoint-Cycle Automata Minimization

Another application of our algorithm is a quantum solution to the minimization of a special class of automata. Automata minimization is an important problem in automata theory [HMU00, BBCF10] and has many applications in various areas of computer science. The best known algorithm for minimizing deterministic automata runs in O(nlogn)O(n\log n) time [Hop71], where nn is the number of states. A few linear-time algorithms for minimizing special classes of automata were proposed in [Rev92, AZ08]; these are important in practice, e.g., for dictionaries in natural language processing.

We consider the minimization problem of disjoint-cycle automata discussed by Almeida and Zeitoun in [AZ08]. The key to this problem is a decision problem that checks whether there are two cycles that are equal to each other under rotations. Formally, suppose there are mm cycles, which are described by strings s1,s2,,sms_{1},s_{2},\dots,s_{m} over a finite alphabet Σ\Sigma. It is asked whether there are two strings sis_{i} and sjs_{j} (iji\neq j) such that SCR(si)=SCR(sj)\operatorname{SCR}(s_{i})=\operatorname{SCR}(s_{j}). For convenience, we assume that all strings are of equal length nn, i.e. |s1|=|s2|==|sm|=n\lvert s_{1}\rvert=\lvert s_{2}\rvert=\dots=\lvert s_{m}\rvert=n.

A classical algorithm solving the above decision problem was developed in [AZ08] with time complexity O(mn)O(mn). With the help of our quantum algorithm for LMSR, this problem can be solved more efficiently. We employ a quantum comparison oracle UcmpU_{\text{cmp}} that compares strings by their canonical representations in the lexicographical order, where the corresponding classical comparator is:

cmp(i,j)={1SCR(si)<SCR(sj),0otherwise,\text{cmp}(i,j)=\begin{cases}1&\operatorname{SCR}(s_{i})<\operatorname{SCR}(s_{j}),\\ 0&\text{otherwise},\end{cases}

and can be computed by finding ri=LMSR(si)r_{i}=\operatorname{LMSR}(s_{i}) and rj=LMSR(sj)r_{j}=\operatorname{LMSR}(s_{j}). In particular, it can be done by our quantum algorithm for LMSR in O(n3/4)O\left(n^{3/4}\right) queries. Then the lexicographical comparator in Algorithm 3 can be used to compare si[riri+n1]s_{i}[r_{i}\dots r_{i}+n-1] and sj[rjrj+n1]s_{j}[r_{j}\dots r_{j}+n-1] in query complexity O(n3/4)O\left(n^{3/4}\right). Furthermore, the problem of checking whether there are two strings that are equal to each other under rotations among the mm strings may be viewed as the element distinctness problem with quantum comparison oracle UcmpU_{\text{cmp}}, and thus can be solved by Ambainis’s quantum algorithm [Amb07] with O~(m2/3)\tilde{O}\left(m^{2/3}\right) queries to UcmpU_{\text{cmp}}. In conclusion, the decision problem can be solved in quantum time complexity O~(m2/3n3/4)\tilde{O}\left(m^{2/3}n^{3/4}\right), which is better than the best known classical O(mn)O(mn) time.
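
Classically, the whole decision procedure can be sketched in a few lines (brute force, for reference only, with a function name of our own; the quantum algorithm replaces the canonical-form computations by LMSR queries and the duplicate detection by element distinctness):

def has_equal_cycles(cycles):
    # Do two of the given cycles have the same canonical representation SCR?
    scr = lambda s: min(s[i:] + s[:i] for i in range(len(s)))
    seen = set()
    for s in cycles:
        c = scr(s)
        if c in seen:
            return True
        seen.add(c)
    return False

# Example: has_equal_cycles(["abc", "cab", "bca"]) is True, since all three are rotations
# of the same cycle; has_equal_cycles(["abc", "abd"]) is False.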

Acknowledgment

We would like to thank the anonymous referees for their valuable comments and suggestions, which helped to improve this paper. Qisheng Wang would like to thank François Le Gall for helpful discussions in an earlier version of this paper.

This work was supported in part by the National Natural Science Foundation of China under Grant 61832015. Qisheng Wang was also supported in part by the MEXT Quantum Leap Flagship Program (MEXT Q-LEAP) grants No. JPMXS0120319794.

References

  • [AdW99] A. Ambainis and R. de Wolf. Average-case quantum query complexity. Journal of Physics A General Physics, 34(35):6741–6754, 1999.
  • [AHU74] A. V. Aho, J. E. Hopcroft, and J. D. Ullman. The Design and Analysis of Computer Algorithms. Addison-Wesley, 1974.
  • [AIP87] A. Apostolico, C. S. Iliopoulos, and R. Paige. An O(nlogn)O(n\log n) cost parallel algorithm for the one function partitioning problem. In Proceedings of the International Workshop on Parallel Algorithms and Architectures, pages 70–76, 1987.
  • [AJ22] S. Akmal and C. Jin. Near-optimal quantum algorithms for string problems. In Proceedings of the 2022 Annual ACM-SIAM Symposium on Discrete Algorithms, pages 2791–2832, 2022.
  • [AK99] A. Ahuja and S. Kapoor. A quantum algorithm for finding the maximum. ArXiv e-prints, 1999. arXiv:quant-ph/9911082.
  • [AM14] A. Ambainis and A. Montanaro. Quantum algorithms for search with wildcards and combinatorial group testing. Quantum Information and Computation, 14(5-6):439–453, 2014.
  • [Amb00] A. Ambainis. Quantum lower bounds by quantum arguments. In Proceedings of the Thirty-Second Annual ACM Symposium on Theory of Computing, pages 636–643, 2000.
  • [Amb04] A. Ambainis. Quantum query algorithms and lower bounds. In Classical and New Paradigms of Computation and their Complexity Hierarchies, pages 15–32, 2004.
  • [Amb07] A. Ambainis. Quantum walk algorithm for element distinctness. SIAM Journal on Computing, 37(1):210–239, 2007.
  • [AZ08] J. Almeida and M. Zeitoun. Description and analysis of a bottom-up DFA minimization algorithm. Information Processing Letters, 107(2):52–59, 2008.
  • [Baš16] N. Bašić. Algebraic approach to several families of chemical graphs. PhD thesis, Faculty of Mathematics and Physics, University of Ljubljana, 2016.
  • [BBBV97] C. H. Bennett, E. Bernstein, G. Brassard, and U. Vazirani. Strengths and weaknesses of quantum computing. SIAM Journal on Computing, 26(5):1524–1540, 1997.
  • [BBC+01] R. Beals, H. Buhrman, R. Cleve, M. Mosca, and R. de Wolf. Quantum lower bounds by polynomials. Journal of the ACM, 48(4):778–797, 2001.
  • [BBCF10] J. Berstel, L. Boasson, O. Carton, and I. Fagnot. Minimization of automata. ArXiv e-prints, 2010. arXiv:1010.5318.
  • [BBHT98] M. Boyer, G. Brassard, P. Hoeyer, and A. Tapp. Tight bounds on quantum searching. Fortschritte der Physik, 46(4-5):493–505, 1998.
  • [BCdWZ99] H. Buhrman, R. Cleve, R. de Wolf, and C. Zalka. Bounds for small-error and zero-error quantum algorithms. In Proceedings of the Fortieth IEEE Annual Symposium on Foundations of Computer Science, pages 358–368, 1999.
  • [BCN05] F. Bassino, J. Clément, and C. Nicaud. The standard factorization of Lyndon words: an average point of view. Discrete Mathematics, 280(1):1–25, 2005.
  • [BCW98] H. Buhrman, R. Cleve, and A. Wigderson. Quantum vs. classical communication and computation. In Proceedings of the Thirtieth Annual ACM Symposium on Theory of Computing, pages 63–68, 1998.
  • [BdW02] H. Buhrman and R. de Wolf. Complexity measures and decision tree complexity: A survey. Theoretical Computer Science, 288(1):21–43, 2002.
  • [BEG+18] M. Boroujeni, S. Ehsani, M. Ghodsi, M. HajiAghayi, and S. Seddighin. Approximating edit distance in truly subquadratic time: Quantum and mapreduce. In Proceedings of the 29th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1170–1189, 2018.
  • [Boo80] K. S. Booth. Lexicographically least circular substrings. Information Processing Letters, 10(4-5):240–242, 1980.
  • [BS04] H. Barnum and M. Saks. A lower bound on the quantum query complexity of read-once functions. Journal of Computer and System Sciences, 69(2):244–258, 2004.
  • [BŠ06] H. Buhrman and R. Špalek. Quantum verification of matrix products. In Proceedings of the 17th Annual ACM-SIAM Symposium on Discrete Algorithm, pages 880–889, 2006.
  • [BS17] F. G. S. L. Brandao and K. M. Svore. Quantum speed-ups for solving semidefinite programs. In Proceedings of the 58th Annual Symposium on Foundations of Computer Science, pages 415–426, 2017.
  • [CB80] C. J. Colbourn and K. S. Booth. Linear time automorphism algorithms for trees, interval graphs, and planar graphs. SIAM Journal on Computing, 10(1):203–225, 1980.
  • [CGW00] N. J. Cerf, L. K. Grover, and C. P. Williams. Nested quantum search and structured problems. Physical Review A, 61(3):032303, 2000.
  • [CGYM08] R. Cleve, D. Gavinsky, and D. L. Yonge-Mallo. Quantum algorithms for evaluating MIN-MAX trees. In Theory of Quantum Computation, Communication, and Cryptography: Third Workshop, TQC 2008, pages 11–15, 2008.
  • [CHL07] M. Crochemore, C. Hancart, and T. Lecroq. Algorithms on Strings. Cambridge University Press, 2007.
  • [CILG+12] R. Cleve, K. Iwama, F. Le Gall, H. Nishimura, S. Tani, J. Teruyama, and S. Yamashita. Reconstructing strings from substrings with quantum queries. In Proceedings of the 13th Scandinavian Symposium and Workshops on Algorithm Theory, pages 388–397, 2012.
  • [CKKD+22] A. M. Childs, R. Kothari, M. Kovacs-Deak, A. Sundaram, and D. Wang. Quantum divide and conquer. ArXiv e-prints, 2022. arXiv:2210.06419.
  • [CR94] M. Crochemore and W. Rytter. Text Algorithms. Oxford University Press, 1994.
  • [Cro92] M. Crochemore. String-matching on ordered alphabets. Theoretical Computer Science, 92(1):33–47, 1992.
  • [DH96] C. Dürr and P. Høyer. A quantum algorithm for finding the minimum. ArXiv e-prints, 1996. arXiv:quant-ph/9607014.
  • [DHS+18] P. B. Dragon, O. I. Hernandez, J. Sawada, A. Williams, and D. Wong. Constructing de Bruijn sequences with co-lexicographic order: The k-ary Grandmama sequence. European Journal of Combinatorics, 72:1–11, 2018.
  • [Dia87] J. R. Dias. Handbook of Polycyclic Hydrocarbons: Part A, Benzenoid Hydrocarbons. Physical Sciences Data, Elsevier, 1987.
  • [Dia88] J. R. Dias. Handbook of Polycyclic Hydrocarbons: Part B, Polycyclic Isomers and Heteroatom Analogs of Benzenoid Hydrocarbons. Physical Sciences Data, Elsevier, 1988.
  • [Duv83] J. P. Duval. Factorizing words over an ordered alphabet. Journal of Algorithms, 8(8):363–381, 1983.
  • [GBYS92] G. H. Gonnet, R. A. Baeza-Yates, and T. Snider. New indices for text: PAT trees and PAT arrays. In Information Retrieval: Data Structures and Algorithms, pages 66–82, 1992.
  • [Gro96] L. K. Grover. A fast quantum mechanical algorithm for database search. In Proceedings of the Twenty-eighth Annual ACM Symposium on Theory of Computing, pages 212–219, 1996.
  • [GY03] J. L. Gross and J. Yellen. Handbook of Graph Theory. CRC Press, 2003.
  • [HHL09] A. W. Harrow, A. Hassidim, and S. Lloyd. Quantum algorithm for linear systems of equations. Physical Review Letters, 103(15):150502, 2009.
  • [HLZ96] P. Hansen, C. Lebatteux, and M. Zheng. The boundary-edges code for polyhexes. Journal of Molecular Structure: THEOCHEM, 363(2):237–247, 1996.
  • [HMdW03] P. Høyer, M. Mosca, and R. de Wolf. Quantum search on bounded-error inputs. In International Colloquium on Automata, Languages, and Programming, pages 291–299, 2003.
  • [HMU00] J. E. Hopcroft, R. Motwani, and J. D. Ullman. Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, 2 edition, 2000.
  • [Hoe63] W. Hoeffding. Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58(301):13–30, 1963.
  • [Hop71] J. E. Hopcroft. An nlognn\log n algorithm for minimizing states in a finite automaton. In Proceedings of an International Symposium on the Theory of Machines and Computations, pages 189–196, 1971.
  • [IS89] C. S. Iliopoulos and W. F. Smyth. PRAM algorithms for identifying polygon similarity. In Proceedings of the International Symposium on Optimal Algorithms, pages 25–32, 1989.
  • [IS92] C. S. Iliopoulos and W. F. Smyth. Optimal algorithms for computing the canonical form of a circular string. Theoretical Computer Science, 92(1):87–105, 1992.
  • [IS94] C. S. Iliopoulos and W. F. Smyth. A fast average case algorithm for Lyndon decomposition. International Journal of Computer Mathematics, 57(1-2):15–31, 1994.
  • [Jeu93] J. T. Jeuring. Theories for algorithm calculation. Utrecht University, 1993.
  • [JKM13] S. Jeffery, R. Kothari, and F. Magniez. Nested quantum walks with quantum data structures. In Proceedings of the Twenty-Fourth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1474–1485, 2013.
  • [JN23] C. Jin and J. Nogler. Quantum speed-ups for string synchronizing sets, longest common substring, and kk-mismatch matching. In Proceedings of the 2023 Annual ACM-SIAM Symposium on Discrete Algorithms, pages 5090–5121, 2023.
  • [KKM+20] R. Kapralov, K. Khadiev, J. Mokut, Y. Shen, and M. Yagafarov. Fast classical and quantum algorithms for online kk-server problem on trees. ArXiv e-prints, 2020. arXiv:2008.00270.
  • [KMP77] D. Knuth, J. H. Morris, and V. Pratt. Fast pattern matching in strings. SIAM Journal on Computing, 6(2):323–350, 1977.
  • [KPBF14] J. Kovič, T. Pisanski, A. T. Balaban, and P. W. Fowler. On symmetries of benzenoid systems. MATCH Communications in Mathematical and in Computer Chemistry, 72(1):3–26, 2014.
  • [KSB06] J. Kärkkäinen, P. Sanders, and S. Burkhardt. Linear work suffix array construction. Journal of the ACM, 53(6):918–936, 2006.
  • [LGS22] F. Le Gall and S. Seddighin. Quantum meets fine-grained complexity: Sublinear time quantum algorithms for string problems. In Proceedings of the 13th Innovations in Theoretical Computer Science Conference, pages 97:1–97:23, 2022.
  • [Mae91] M. Maes. Polygonal shape recognition using string-matching techniques. Pattern Recognition, 24(5):433–440, 1991.
  • [McC76] E. M. McCreight. A space-economical suffix tree construction algorithm. Journal of the ACM, 23(2):262–272, 1976.
  • [MM90] U. Manber and G. Myers. Suffix arrays: a new method for on-line string searches. In Proceedings of the First Annual ACM-SIAM Symposium on Discrete Algorithms, pages 319–327, 1990.
  • [Mon17] A. Montanaro. Quantum pattern matching fast on average. Algorithmica, 77:16–39, 2017.
  • [MSS07] F. Magniez, M. Santha, and M. Szegedy. Quantum algorithms for the triangle problem. SIAM Journal on Computing, 37(2):413–424, 2007.
  • [PLB+16] R. Papadakis, H. Li, J. Bergman, A. Lundstedt, K. Jorner, R. Ayub, S. Haldar, B. O. Jahn, A. Denisova, B. Zietz, R. Lindh, B. Sanyal, H. Grennberg, K. Leifer, and H. Ottosson. Metal-free photochemical silylations and transfer hydrogenations of benzenoid hydrocarbons and graphene. Nature Communications, 7:12962, 2016.
  • [Pup10] G. Puppis. Automata for Branching and Layered Temporal Structures: An Investigation into Regularities of Infinite Transition Systems. Springer Science & Business Media, 2010.
  • [Rev92] D. Revuz. Minimisation of acyclic deterministic automata in linear time. Theoretical Computer Science, 92(1):181–189, 1992.
  • [RV03] H. Ramesh and V. Vinay. String matching in O~(n+m)\tilde{O}(\sqrt{n}+\sqrt{m}) quantum time. Journal of Discrete Algorithms, 1(1):103–110, 2003.
  • [Ryt03] W. Rytter. On maximal suffixes and constant-space linear-time versions of KMP algorithm. Theoretical Computer Science, 299(1-3):763–774, 2003.
  • [Shi79] Y. Shiloach. A fast equivalence-checking algorithm for circular lists. Information Processing Letters, 8(5):236–238, 1979.
  • [Shi81] Y. Shiloach. Fast canonization of circular strings. Journal of Algorithms, 2(2):107–121, 1981.
  • [Sho94] P. W. Shor. Algorithms for quantum computation: discrete logarithms and factoring. In Proceedings of the 35th Annual Symposium on Foundations of Computer Science, pages 124–134, 1994.
  • [SWW16] J. Sawada, A. Williams, and D. Wong. A surprisingly simple de Bruijn sequence construction. Discrete Mathematics, 339(1):127–131, 2016.
  • [vAGGdW17] J. van Apeldoorn, A. Gilyén, S. Gribling, and R. de Wolf. Quantum SDP-solvers: better upper and lower bounds. In Proceedings of the Fifty-eighth Annual Symposium on Foundations of Computer Science, pages 403–414, 2017.
  • [Vis90] U. Vishkin. Deterministic sampling - a new technique for fast pattern matching. In Proceedings of the Twenty-Second Annual ACM Symposium on Theory of Computing, pages 170–180, 1990.
  • [vLWW01] J. H. van Lint, R. W. Wilson, and R. M. Wilson. A Course in Combinatorics. Cambridge University Press, 2001.
  • [Wan22] Q. Wang. A note on quantum divide and conquer for minimal string rotation. ArXiv e-prints, 2022. arXiv:2210.09149.
  • [Wei73] P. Weiner. Linear pattern matching algorithms. In Proceedings of the Fourteenth Annual Symposium on Switching and Automata Theory, pages 1–11, 1973.
  • [WMK08] W. L. Wang, S. Meng, and E. Kaxiras. Graphene NanoFlakes with large spin. Nano Letters, 8(1):241–245, 2008.
  • [WSS+08] T. Wassmann, A. P. Seitsonen, A. M. Saitta, M. Lazzeri, and F. Mauri. Structure, stability, edge states, and aromaticity of graphene ribbons. Physical Review Letters, 101(9):096402, 2008.
  • [Zal99] C. Zalka. Grover’s quantum searching algorithm is optimal. Physical Review A, 60(4):2746–2751, 1999.

Appendix A Probability Analysis of Majority Voting

For convenience of the reader, let us first recall Hoeffding’s inequality [Hoe63].

Theorem A.1 (Hoeffding’s inequality, [Hoe63]).

Let X1,X2,,XnX_{1},X_{2},\dots,X_{n} be independent random variables with 0Xi10\leq X_{i}\leq 1 for every 1in1\leq i\leq n. We define the empirical mean of these variables by

X¯=1ni=1nXi.\bar{X}=\frac{1}{n}\sum_{i=1}^{n}X_{i}.

Then for every ε>0\varepsilon>0, we have:

Pr[X¯𝔼[X¯]ε]exp(2nε2).\Pr[\bar{X}\leq\mathbb{E}[\bar{X}]-\varepsilon]\leq\exp(-2n\varepsilon^{2}).

We only consider the case where bb is true. In this case, we have q=36lnmq=\lceil 36\ln m\rceil independent random variables X1,X2,,XqX_{1},X_{2},\dots,X_{q}, where for 1iq1\leq i\leq q, Xi=0X_{i}=0 with probability 1/3\leq 1/3 and Xi=1X_{i}=1 with probability 2/3\geq 2/3 (Here, Xi=1X_{i}=1 means the ii-th query is true and 0 otherwise). It holds that 𝔼[X¯]2/3\mathbb{E}[\bar{X}]\geq 2/3. By Theorem A.1 and letting ε=1/6\varepsilon=1/6, we have:

Pr[X¯12]Pr[X¯𝔼[X¯]ε]exp(2qε2)exp(2lnm)=1m2.\Pr\left[\bar{X}\leq\frac{1}{2}\right]\leq\Pr[\bar{X}\leq\mathbb{E}[\bar{X}]-\varepsilon]\leq\exp(-2q\varepsilon^{2})\leq\exp(-2\ln m)=\frac{1}{m^{2}}.

By choosing b^\hat{b} to be true if X¯>1/2\bar{X}>1/2 and false otherwise, we obtain Pr[b^=b]=1Pr[X¯1/2]11/m2\Pr\left[\hat{b}=b\right]=1-\Pr\left[\bar{X}\leq 1/2\right]\geq 1-1/m^{2} as required in Algorithm 1.
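
For illustration, the majority-voting step analyzed above can be simulated as follows (a sketch with names of our own; the 1/m² bound is exactly the calculation just given):

import math, random

def majority_vote(query, m):
    # Repeat a query that is correct with probability at least 2/3 for q = ceil(36 ln m)
    # times and output the majority answer; the error probability is then at most 1/m^2.
    q = math.ceil(36 * math.log(m))
    ones = sum(1 for _ in range(q) if query())
    return ones > q / 2

# Example: a noisy Boolean query whose true answer is True.
noisy_query = lambda: random.random() < 2 / 3
# majority_vote(noisy_query, m=100) returns True except with probability at most 1/100**2.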

Appendix B A Framework of Nested Quantum Algorithms

In this appendix, we provide a general framework to explain how the improvement given in Section 3 can be achieved on nested quantum algorithms composed of quantum search and quantum minimum finding.

This framework generalizes multi-level AND-OR trees. In an ordinary AND-OR tree, a Boolean value is given at each leaf, and each non-leaf node is associated with an AND or OR operation, which alternates between levels. A query-optimal quantum algorithm for evaluating constant-depth AND-OR trees was proposed in [HMdW03]. Our framework can be seen as a generalization of the fault-tolerant quantum search studied in [HMdW03]: it deals with extended MIN-MAX-AND-OR trees, which allow four basic operations and non-Boolean values. Moreover, by combining the error-reduction idea of [BCdWZ99], we also provide error reduction for nested quantum algorithms. Our results are formally stated in Lemma B.1.

Suppose there is a dd-level nested quantum algorithm composed of dd bounded-error quantum algorithms A1,A2,,AdA_{1},A_{2},\dots,A_{d} on a dd-dimensional input x(θd1,,θ2,θ1,θ0)x(\theta_{d-1},\dots,\theta_{2},\theta_{1},\theta_{0}) given by an (exact) quantum oracle U0U_{0}:

U0|θd1,,θ2,θ1,θ0,j=|θd1,,θ2,θ1,θ0,jx(θd1,,θ2,θ1,θ0),U_{0}\lvert\theta_{d-1},\dots,\theta_{2},\theta_{1},\theta_{0},j\rangle=\lvert\theta_{d-1},\dots,\theta_{2},\theta_{1},\theta_{0},j\oplus x(\theta_{d-1},\dots,\theta_{2},\theta_{1},\theta_{0})\rangle,

where θk[nk]\theta_{k}\in[n_{k}] for k[d]k\in[d] and d2d\geq 2. For 1kd1\leq k\leq d, AkA_{k} is a bounded-error quantum algorithm given parameters θd1,,θk\theta_{d-1},\dots,\theta_{k} that computes the function fk(θd1,,θk)f_{k}(\theta_{d-1},\dots,\theta_{k}). In particular, f0xf_{0}\equiv x, and AdA_{d} computes a single value fdf_{d}, which is considered to be the output of the nested quantum algorithm A1,A2,,AdA_{1},A_{2},\dots,A_{d}.

Let tk{s,m}t_{k}\in\{s,m\} denote the type of AkA_{k}, where ss indicates the search problem and mm indicates the minimum finding problem. Let Sm={1kd:tk=m}S_{m}=\{1\leq k\leq d:t_{k}=m\} denote the set of indices of the minimum finding algorithms. The behavior of algorithm AkA_{k} is defined as follows:

  1. 1.

    Case 1. tk=st_{k}=s. Then AkA_{k} is associated with a checker pk(θd1,,θk,ξ)p_{k}(\theta_{d-1},\dots,\theta_{k},\xi) that determines whether ξ\xi is a solution (which returns 11 for “yes” and 0 for “no”). AkA_{k} is to find a solution i[nk1]i\in[n_{k-1}] such that

    pk(θd1,,θk,fk1(θd1,,θk,i))=1p_{k}(\theta_{d-1},\dots,\theta_{k},f_{k-1}(\theta_{d-1},\dots,\theta_{k},i))=1

    (which returns 1-1 if no solution exists). Here, computing pk(θd1,,θk,fk1(θd1,,θk,i))p_{k}(\theta_{d-1},\dots,\theta_{k},f_{k-1}(\theta_{d-1},\dots,\theta_{k},i)) requires the value of fk1(θd1,,θk,i)f_{k-1}(\theta_{d-1},\dots,\theta_{k},i), which in turn requires a constant number of queries to Ak1A_{k-1} with parameters θd1,,θk,i\theta_{d-1},\dots,\theta_{k},i.

  2.

    Case 2. tk=mt_{k}=m. Then AkA_{k} is associated with a comparator ck(θd1,,θk,α,β)c_{k}(\theta_{d-1},\dots,\theta_{k},\alpha,\beta) that compares α\alpha and β\beta. AkA_{k} is to find the index i[nk1]i\in[n_{k-1}] of the minimal element such that

    ck(θd1,,θk,fk1(θd1,,θk,j),fk1(θd1,,θk,i))=0c_{k}(\theta_{d-1},\dots,\theta_{k},f_{k-1}(\theta_{d-1},\dots,\theta_{k},j),f_{k-1}(\theta_{d-1},\dots,\theta_{k},i))=0

    for all jij\neq i. Here, computing ck(θd1,,θk,fk1(θd1,,θk,j),fk1(θd1,,θk,i))c_{k}(\theta_{d-1},\dots,\theta_{k},f_{k-1}(\theta_{d-1},\dots,\theta_{k},j),f_{k-1}(\theta_{d-1},\dots,\theta_{k},i)) requires the value of fk1(θd1,,θk,j)f_{k-1}(\theta_{d-1},\dots,\theta_{k},j) and fk1(θd1,,θk,i)f_{k-1}(\theta_{d-1},\dots,\theta_{k},i), which requires a constant number of queries to Ak1A_{k-1} with parameters θd1,,θk,j\theta_{d-1},\dots,\theta_{k},j and θd1,,θk,i\theta_{d-1},\dots,\theta_{k},i, respectively.
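
The two cases above can be traced by the following classical (exhaustive) sketch, in which plain loops stand in for quantum search and quantum minimum finding; it is meant only to make the recursive structure concrete. The dictionaries types, checkers and comparators are illustrative placeholders, and the comparator is assumed to return true exactly when its first argument is strictly smaller.

    def evaluate(level, params, x, n, types, checkers, comparators):
        """Classically evaluate f_level(params) for the nested structure:
        level 0 is the oracle x itself; a search level ("s") returns some
        index accepted by its checker (or -1); a minimum-finding level
        ("m") returns the index of the minimal level-(level-1) value."""
        if level == 0:
            return x(*params)
        values = [evaluate(level - 1, params + (i,), x, n,
                           types, checkers, comparators)
                  for i in range(n[level - 1])]
        if types[level] == "s":                        # quantum search stands in here
            for i, v in enumerate(values):
                if checkers[level](params, v):
                    return i
            return -1
        best = 0                                       # quantum minimum finding stands in here
        for i in range(1, len(values)):
            if comparators[level](params, values[i], values[best]):
                best = i
        return best

    # Toy instance with d = 2 and (n_0, n_1) = (4, 3): level 1 searches, level 2 minimizes.
    table = [[0, 0, 1, 0], [0, 1, 1, 0], [0, 0, 0, 1]]     # x(theta_1, theta_0)
    x = lambda t1, t0: table[t1][t0]
    n = [4, 3]
    types = {1: "s", 2: "m"}
    checkers = {1: lambda params, v: v == 1}               # p_1: accept when x(...) = 1
    comparators = {2: lambda params, a, b: a < b}          # c_2: the usual order
    print(evaluate(2, (), x, n, types, checkers, comparators))   # prints 1, since f_1 = [2, 1, 3]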

The following lemma settles the query complexity of nested quantum algorithms, which shows that nested algorithms can do much better than naively expected.

Lemma B.1 (Nested quantum algorithms).

Let A1,A2,,AdA_{1},A_{2},\dots,A_{d} be a dd-level nested quantum algorithm with d2d\geq 2, where n0,n1,,nd1n_{0},n_{1},\dots,n_{d-1} and Sm,fdS_{m},f_{d} are defined as above. Then:

  1.

    There is a bounded-error quantum algorithm that computes fdf_{d} with query complexity

    O(k=0d1nk).O\left(\sqrt{\prod_{k=0}^{d-1}n_{k}}\right).
  2.

    There is a quantum algorithm that computes fdf_{d} with error probability ε\leq\varepsilon with query complexity

    O(k=0d1nklog1ε).O\left(\sqrt{\prod_{k=0}^{d-1}n_{k}\log\frac{1}{\varepsilon}}\right).
Proof.

This follows immediately from Theorem 3.1, Theorem 3.2, Lemma 3.4 and Lemma 3.5. ∎

Remark B.1.

Lemma B.1 can be seen as a combination of Theorem 3.1, Theorem 3.2, Lemma 3.4 and Lemma 3.5. For convenience, we assume n0=n1==nd1=nn_{0}=n_{1}=\dots=n_{d-1}=n in the following discussions. Note that traditional probability amplification methods for randomized algorithms usually introduce an O(logd1n)O\left(\log^{d-1}n\right) slowdown for dd-level nested quantum algorithms by repeating the algorithm O(logn)O(\log n) times at each level. In contrast, our method incurs only an O(1)O(1)-factor overhead, as if there were no errors in the oracles at all.

Remark B.2.

Lemma B.1 only covers a special case of nested quantum algorithms. A more general form of nested quantum algorithms can be described as a tree rather than a sequence, which allows intermediate quantum algorithms to compute their results by queries to several low-level quantum algorithms. We call them adaptively nested quantum algorithms, and an example of this kind of algorithm is presented in Appendix D for pattern matching (see Figure 1).

Appendix C Remarks for Quantum Deterministic Sampling

Algorithm 4 uses several nested quantum algorithms as subroutines, but they are not described as nested quantum algorithms explicitly. Here, we provide an explicit description for Line 10 in Algorithm 4 as an example. Let θ0[l]\theta_{0}\in[l] and θ1[n]\theta_{1}\in[n]. Then:

  • The 0-level function is

    f0(θ1,θ0)={1θ1iθ0<θ1+ns[iθ0θ1]s[iθ0δ],0otherwise,f_{0}(\theta_{1},\theta_{0})=\begin{cases}1&\theta_{1}\leq i_{\theta_{0}}<\theta_{1}+n\land s[i_{\theta_{0}}-\theta_{1}]\neq s[i_{\theta_{0}}-\delta],\\ 0&\text{otherwise},\end{cases}

    which checks whether candidate θ1\theta_{1} does not match the current deterministic sample at the θ0\theta_{0}-th checkpoint.

  • The 11-level function is

    f1(θ1)={0θ1>Qi,f0(θ1,i)=1,1otherwise,f_{1}(\theta_{1})=\begin{cases}0&\theta_{1}>Q\lor\exists i,f_{0}(\theta_{1},i)=1,\\ 1&\text{otherwise},\end{cases}

    which checks whether candidate θ1\theta_{1} matches the current deterministic sample at all checkpoints.

  • The 22-level function is

    f2=min{i:f1(i)=1},f_{2}=\min\left\{i:f_{1}(i)=1\right\},

    which finds the first candidate of δ\delta.

By Lemma B.1, f2f_{2} can be computed with error probability ε=1/(6m2)\leq\varepsilon=1/(6m^{2}) in

O(mnlog(1/ε))=O(nlognloglogn).O\left(\sqrt{mn\log(1/\varepsilon)}\right)=O\left(\sqrt{n\log n\log\log n}\right).
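
For readers who prefer code, the following classical sketch mirrors the three functions above: it returns the first candidate theta_1 (at most Q) that agrees with the current deterministic sample at every in-range checkpoint. It only illustrates the logic; in the quantum algorithm the two loops are replaced by quantum search and minimum finding via Lemma B.1. The argument names s, Q, delta and checkpoints are placeholders for the corresponding quantities in Algorithm 4, and every checkpoint c is assumed to satisfy delta <= c < delta + n.

    def first_candidate(s, Q, delta, checkpoints):
        """Classical analogue of f_2: the smallest candidate theta_1 <= Q
        such that f_1(theta_1) = 1, i.e. s agrees with the current
        deterministic sample (delta; checkpoints) at every in-range
        checkpoint; returns None if no such candidate exists."""
        n = len(s)
        def mismatch(theta1, c):               # f_0(theta_1, theta_0), with c = i_{theta_0}
            return theta1 <= c < theta1 + n and s[c - theta1] != s[c - delta]
        def matches(theta1):                   # f_1(theta_1)
            return theta1 <= Q and not any(mismatch(theta1, c) for c in checkpoints)
        return min((t for t in range(n) if matches(t)), default=None)   # f_2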

Appendix D Quantum Algorithm for Pattern Matching

In this appendix, we give a detailed description of our quantum algorithm for pattern matching.

D.1 Quantum Algorithm for String Periodicity

The algorithm will be presented in the form of a nested quantum algorithm. Suppose a string sΣns\in\Sigma^{n} is given by a quantum oracle UsU_{s} such that Us|i,j=|i,js[i]U_{s}\lvert i,j\rangle=\lvert i,j\oplus s[i]\rangle. We are asked to check whether ss is periodic and, if so, to find its period.

Let (δ;i0,i1,,il1)(\delta;i_{0},i_{1},\dots,i_{l-1}) be a deterministic sample of ss. We need a 22-level nested quantum algorithm. Let θ0[l]\theta_{0}\in[l] and θ1[n]\theta_{1}\in[n]. Then:

  • The 0-level function is

    f0(θ1,θ0)={1θ1iθ0<θ1+ns[iθ0θ1]s[iθ0δ],0otherwise,f_{0}(\theta_{1},\theta_{0})=\begin{cases}1&\theta_{1}\leq i_{\theta_{0}}<\theta_{1}+n\land s[i_{\theta_{0}}-\theta_{1}]\neq s[i_{\theta_{0}}-\delta],\\ 0&\text{otherwise},\end{cases}

    which checks whether offset θ1\theta_{1} fails to match the deterministic sample at the θ0\theta_{0}-th checkpoint (so that f1f_{1} below equals 11 exactly when every in-range checkpoint matches). There is obviously an exact quantum oracle that computes f0(θ1,θ0)f_{0}(\theta_{1},\theta_{0}) with a constant number of queries to UsU_{s}.

  • The 11-level function is

    f1(θ1)={0i[l],f0(θ1,i)=1,1otherwise,f_{1}(\theta_{1})=\begin{cases}0&\exists i\in[l],f_{0}(\theta_{1},i)=1,\\ 1&\text{otherwise},\end{cases}

    where f1(θ1)f_{1}(\theta_{1}) means offset θ1\theta_{1} matches the deterministic sample of ss. The 22-level function is

    f2=min{i[n]:f1(i)=1},f_{2}=\min\{i\in[n]:f_{1}(i)=1\},

    which finds the minimal offset that matches the deterministic sample.

By Lemma B.1, we obtain an O(nl)O(\sqrt{nl}) bounded-error quantum algorithm that finds the smallest possible offset δ1\delta_{1} of the deterministic sample of ss. According to δ1\delta_{1}, we define another 22-level function

f2=min{i[n]:i>δ1f1(i)=1},f_{2}^{\prime}=\min\{i\in[n]:i>\delta_{1}\land f_{1}(i)=1\},

which finds the second minimal offset that matches the deterministic sample of ss, where min=\min\emptyset=\infty. Similarly, we can find the second smallest offset δ2\delta_{2} of the deterministic sample of ss in query complexity O(nl)O\left(\sqrt{nl}\right) with bounded error. If δ2=\delta_{2}=\infty, then ss is aperiodic; otherwise, ss is periodic with period d=δ2δ1d=\delta_{2}-\delta_{1}. Therefore, we obtain an O(nlogn)O\left(\sqrt{n\log n}\right) bounded-error quantum algorithm that checks whether a string is periodic and, if so, finds its period.
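
A classical reference version of this periodicity test is sketched below; it is only meant to make the two minimum findings explicit, with loops in place of the quantum subroutines. The helper name period_via_sample is illustrative, s is a string, and every checkpoint c is assumed to satisfy delta <= c < delta + n.

    def period_via_sample(s, delta, checkpoints):
        """Classical analogue of Appendix D.1: find the two smallest offsets
        whose shift agrees with the deterministic sample (delta; checkpoints)
        of s; their difference is the period, and None means aperiodic."""
        n = len(s)
        def matches(offset):                   # f_1(offset)
            return all(not (offset <= c < offset + n and s[c - offset] != s[c - delta])
                       for c in checkpoints)
        hits = [t for t in range(n) if matches(t)]   # offset delta is always a hit
        if len(hits) < 2:
            return None                              # delta_2 = infinity: aperiodic
        return hits[1] - hits[0]                     # period d = delta_2 - delta_1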

D.2 Quantum Algorithm for Pattern Matching

Suppose the text is tΣnt\in\Sigma^{n}, the pattern is pΣmp\in\Sigma^{m}, and a deterministic sample of pp is (δ;i0,i1,,il1)(\delta;i_{0},i_{1},\dots,i_{l-1}). The idea for pattern matching is to split the text tt into blocks of length L=m/4L=\lfloor m/4\rfloor, the ii-th (0-indexed) of which consists of indices ranging from iLiL to min{(i+1)L,n}1\min\{(i+1)L,n\}-1. Our algorithm applies to the case m4m\geq 4 but not to the case 1m31\leq m\leq 3, which can be handled by a straightforward quantum search (omitted here).

The key step for pattern matching is to find a candidate hih_{i} for 0i<n/L0\leq i<\lceil n/L\rceil, indicating the first occurrence in the ii-th block with starting index iLj<min{(i+1)L,n}iL\leq j<\min\{(i+1)L,n\}, where j+mnj+m\leq n and t[jj+m1]=pt[j\dots j+m-1]=p. Formally,

hi=min{iLj<min{(i+1)L,n}:j+mnt[jj+m1]=p},h_{i}=\min\{iL\leq j<\min\{(i+1)L,n\}:j+m\leq n\land t[j\dots j+m-1]=p\}, (11)

where min=\min\emptyset=\infty. Note that Eq. (6) is similar to Eq. (11) but without the condition j+mnj+m\leq n, which can easily be removed from the algorithm given in the following discussion. In the previous subsection, we presented an efficient quantum algorithm that checks whether a given string is periodic. We are now able to design a quantum algorithm for pattern matching. Let us consider the cases of aperiodic patterns and periodic patterns separately:

D.2.1 Aperiodic Patterns

The Algorithm. For an aperiodic pattern pp, hih_{i} can be computed in the following two steps:

  1.

    Search jj from iLj<min{(i+1)L,n}iL\leq j<\min\{(i+1)L,n\} such that t[ikδ+j]=p[ikδ]t[i_{k}-\delta+j]=p[i_{k}-\delta] for every k[l]k\in[l] (we call such jj a candidate of matching). For every jj, there is an O(l)O(\sqrt{l}) bounded-error quantum algorithm to check whether t[ikδ+j]=p[ikδ]t[i_{k}-\delta+j]=p[i_{k}-\delta] for every k[l]k\in[l]. Therefore, finding any jj requires O(Ll)=O(mlogm)O\left(\sqrt{Ll}\right)=O\left(\sqrt{m\log m}\right) queries (by Theorem 3.1).

  2.

    Check whether the index jj found in the previous step satisfies j+mnj+m\leq n and t[jj+m1]=pt[j\dots j+m-1]=p. This can be computed by quantum search in O(m)O\left(\sqrt{m}\right) queries. If the found index jj does not satisfy this condition, then hi=h_{i}=\infty; otherwise, hi=jh_{i}=j.

According to the above discussion, hih_{i} can be computed with bounded error in O(mlogm)O\left(\sqrt{m\log m}\right) queries. Recall that pp appears in tt at least once if and only if there is at least one 0i<n/L0\leq i<\lceil n/L\rceil such that hih_{i}\neq\infty. By Theorem 3.1, this can be checked in O(n/Lmlogm)=O(nlogm)O\left(\sqrt{n/L}\sqrt{m\log m}\right)=O\left(\sqrt{n\log m}\right) queries.
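
The two steps above, written classically for a single block, look as follows; the quantum algorithm performs step 1 by Grover search over the block and over the checkpoints, and step 2 by Grover search over the m positions. Here t and p are strings, INF stands for infinity, and block_candidate_aperiodic is an illustrative name rather than a routine from the paper.

    INF = float("inf")

    def block_candidate_aperiodic(t, p, i, L, delta, checkpoints):
        """Classical analogue of computing h_i for the i-th block when the
        pattern p is aperiodic: find a j in the block agreeing with the
        deterministic sample (delta; checkpoints) of p, then verify it."""
        n, m = len(t), len(p)
        lo, hi = i * L, min((i + 1) * L, n)
        def hits_sample(j):                    # step 1 test for a single j
            return all(0 <= c - delta + j < n and t[c - delta + j] == p[c - delta]
                       for c in checkpoints)
        j = next((j for j in range(lo, hi) if hits_sample(j)), None)
        if j is None or j + m > n or t[j:j + m] != p:   # step 2: full verification
            return INF
        return j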

Correctness. Note that if there is no jj in the ii-th block such that t[ikδ+j]=p[ikδ]t[i_{k}-\delta+j]=p[i_{k}-\delta] for every k[l]k\in[l], or if there are two or more such values of jj, then there is no matching in the ii-th block. More precisely, we have:

Lemma D.1 (The Ricochet Property [Vis90]).

Let pΣmp\in\Sigma^{m} be aperiodic, (δ;i0,i1,,il1)(\delta;i_{0},i_{1},\dots,i_{l-1}) be a deterministic sample, and tΣnt\in\Sigma^{n} be a string of length nmn\geq m, and let j[nm+1]j\in[n-m+1]. If t[ik+jδ]=p[ikδ]t[i_{k}+j-\delta]=p[i_{k}-\delta] for every k[l]k\in[l], then t[jj+m1]pt[j^{\prime}\dots j^{\prime}+m-1]\neq p for every j[nm+1]j^{\prime}\in[n-m+1] with jδj<jδ+m/2j-\delta\leq j^{\prime}<j-\delta+\lfloor m/2\rfloor and jjj^{\prime}\neq j.

Proof.

Assume that t[jj+m1]=pt[j^{\prime}\dots j^{\prime}+m-1]=p for some jδj<jδ+m/2j-\delta\leq j^{\prime}<j-\delta+\lfloor m/2\rfloor with jjj^{\prime}\neq j. Let x=δj+jx=\delta-j+j^{\prime}. Note that 0x<m/20\leq x<\lfloor m/2\rfloor and, since jjj^{\prime}\neq j, we have xδx\neq\delta. Then by the definition of a deterministic sample, there exists k[l]k\in[l] such that xik<x+mx\leq i_{k}<x+m and p[ikx]p[ikδ]p[i_{k}-x]\neq p[i_{k}-\delta]. On the other hand, p[ikx]=t[j+ikx]=t[ik+jδ]=p[ikδ]p[i_{k}-x]=t[j^{\prime}+i_{k}-x]=t[i_{k}+j-\delta]=p[i_{k}-\delta]. A contradiction arises. ∎

We can see that if there are two different indices j1,j2j_{1},j_{2} in the ii-th block such that t[ikδ+jr]=p[ikδ]t[i_{k}-\delta+j_{r}]=p[i_{k}-\delta] for every k[l]k\in[l] and r{1,2}r\in\{1,2\}, it must hold that |j1j2|m/4\lvert j_{1}-j_{2}\rvert\leq m/4, since each block has length L=m/4L=\lfloor m/4\rfloor. If we apply Lemma D.1 on j1j_{1}, then t[j2j2+m1]pt[j_{2}\dots j_{2}+m-1]\neq p; and if we apply it on j2j_{2}, then t[j1j1+m1]pt[j_{1}\dots j_{1}+m-1]\neq p. Thus, we conclude that neither j1j_{1} nor j2j_{2} can be a starting index of an occurrence of pp in tt. As a result, if the ii-th block contains an occurrence of pp, then the candidate of matching in that block is unique, namely that occurrence, and hence it suffices to verify the single candidate found in the first step.

A Description in the Form of a Nested Quantum Algorithm. The above algorithm can be more clearly described as a 33-level nested quantum algorithm.

  • The level-0 function is defined by

    f0(θ2,θ1,θ0)={1iθ0δ+θ2L+θ1[n]t[iθ0δ+θ2L+θ1]=p[iθ0δ]0otherwise,f_{0}(\theta_{2},\theta_{1},\theta_{0})=\begin{cases}1&i_{\theta_{0}}-\delta+\theta_{2}L+\theta_{1}\in[n]\land t[i_{\theta_{0}}-\delta+\theta_{2}L+\theta_{1}]=p[i_{\theta_{0}}-\delta]\\ 0&\text{otherwise}\end{cases},

    which checks whether tt matches pp at the θ0\theta_{0}-th checkpoint at the θ1\theta_{1}-th index in the θ2\theta_{2}-th block, where θ0[l]\theta_{0}\in[l], θ1[L]\theta_{1}\in[L] and θ2[n/L]\theta_{2}\in\left[\lceil n/L\rceil\right].

  • The level-11 function is defined by

    f1(θ2,θ1)={0i[l],f0(θ2,θ1,i)=0,1otherwise,f_{1}(\theta_{2},\theta_{1})=\begin{cases}0&\exists i\in[l],f_{0}(\theta_{2},\theta_{1},i)=0,\\ 1&\text{otherwise},\end{cases}

    which checks whether tt matches the deterministic sample at the θ1\theta_{1}-th index in the θ2\theta_{2}-th block.

  • The level-22 function f2(θ2)f_{2}(\theta_{2}) finds a solution i[L]i\in[L] such that f1(θ2,i)=1f_{1}(\theta_{2},i)=1, which indicates the (only) candidate in the θ2\theta_{2}-th block.

  • The level-33 function f3f_{3} finds a matching among f2(i)f_{2}(i) over all i[n/L]i\in[\lceil n/L\rceil] by checking whether iL+f2(i)+mniL+f_{2}(i)+m\leq n and t[iL+f2(i)iL+f2(i)+m1]=pt[iL+f_{2}(i)\dots iL+f_{2}(i)+m-1]=p, where the latter condition can be checked by a quantum searching algorithm, which can be formulated as a 1-level nested quantum algorithm:

    • The level-0 function

      g0(ξ2,ξ1,ξ0)={1t[ξ2L+ξ1+ξ0]=p[ξ0],0otherwise,g_{0}(\xi_{2},\xi_{1},\xi_{0})=\begin{cases}1&t[\xi_{2}L+\xi_{1}+\xi_{0}]=p[\xi_{0}],\\ 0&\text{otherwise},\end{cases}

      which checks whether the ξ0\xi_{0}-th character of the substring of tt starting at offset ξ1\xi_{1} in the ξ2\xi_{2}-th block matches the ξ0\xi_{0}-th character of pp, where ξ0[m]\xi_{0}\in[m], ξ1[L]\xi_{1}\in[L] and ξ2[n/L]\xi_{2}\in\left[\lceil n/L\rceil\right].

    • The 11-level function

      g1(ξ2,ξ1)={0i[m],g0(ξ2,ξ1,i)=0,1otherwise,g_{1}(\xi_{2},\xi_{1})=\begin{cases}0&\exists i\in[m],g_{0}(\xi_{2},\xi_{1},i)=0,\\ 1&\text{otherwise},\end{cases}

      which checks whether the substring of tt starting at offset ξ1\xi_{1} in the ξ2\xi_{2}-th block matches pp. Finally, we have that f3f_{3} finds a solution i[n/L]i\in\left[\lceil n/L\rceil\right] such that iL+f2(i)+mniL+f_{2}(i)+m\leq n and g1(i,f2(i))=1g_{1}(i,f_{2}(i))=1.

The structure of our algorithm can be visualized as the tree in Figure 1. It is worth noting that f3f_{3} calls both f2f_{2} and g1g_{1}.

Figure 1: Quantum pattern matching algorithm for aperiodic strings. The root f3f_{3} calls f2f_{2} and g1g_{1}; f2f_{2} calls f1f_{1}, which calls f0f_{0}; and g1g_{1} calls g0g_{0}.

By a careful analysis, we see that the query complexity of the above algorithm is O(nlogm)O\left(\sqrt{n\log m}\right).
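
The tree in Figure 1 can also be traced by a direct classical transcription of the level functions, with plain loops standing in for quantum search and minimum finding. The sketch below is illustrative only; t and p are strings, L = floor(m/4) with m >= 4, and every checkpoint c is assumed to satisfy delta <= c < delta + m.

    def find_occurrence(t, p, L, delta, checkpoints):
        """Classical transcription of the level functions f_0..f_3 and
        g_0, g_1 from Figure 1."""
        n, m, l = len(t), len(p), len(checkpoints)
        def f0(b, j, k):                   # does t match p at checkpoint k, index j, block b?
            pos = checkpoints[k] - delta + b * L + j
            return 0 <= pos < n and t[pos] == p[checkpoints[k] - delta]
        def f1(b, j):                      # does t match the whole deterministic sample there?
            return all(f0(b, j, k) for k in range(l))
        def f2(b):                         # the (unique relevant) candidate in block b
            return next((j for j in range(L) if f1(b, j)), None)
        def g0(b, j, k):                   # character comparison used by g_1
            return b * L + j + k < n and t[b * L + j + k] == p[k]
        def g1(b, j):                      # full verification of the candidate
            return all(g0(b, j, k) for k in range(m))
        for b in range((n + L - 1) // L):  # f_3: search over all blocks
            j = f2(b)
            if j is not None and b * L + j + m <= n and g1(b, j):
                return b * L + j           # starting index of an occurrence of p in t
        return None                        # p does not occur in t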

D.2.2 Periodic Patterns

For a periodic pattern pp, a similar result can be achieved with some minor modifications to the algorithm for an aperiodic pattern. First, we have:

Lemma D.2 (The Ricochet Property for periodic strings).

Let pΣmp\in\Sigma^{m} be periodic with period dm/2d\leq m/2, (δ;i0,i1,,il1)(\delta;i_{0},i_{1},\dots,i_{l-1}) be a deterministic sample, and tΣnt\in\Sigma^{n} be a string of length nmn\geq m, and let j[nm+1]j\in[n-m+1]. If t[ik+jδ]=p[ikδ]t[i_{k}+j-\delta]=p[i_{k}-\delta] for every k[l]k\in[l], then t[jj+m1]pt[j^{\prime}\dots j^{\prime}+m-1]\neq p for every j[nm+1]j^{\prime}\in[n-m+1] with jδj<jδ+m/2j-\delta\leq j^{\prime}<j-\delta+\lfloor m/2\rfloor and jj(modd)j^{\prime}\not\equiv j\pmod{d}.

Proof.

Similar to the proof of Lemma D.1. ∎

Now, in order to deal with a periodic pattern pp, the algorithm for an aperiodic pattern can be modified as follows (a classical sketch of the resulting computation of hih_{i} is given after the case analysis). For the ii-th block, in order to compute hih_{i}, we need to find (by minimum finding) the leftmost and the rightmost candidates jlj_{l} and jrj_{r}, which requires O(Ll)=O(mlogm)O\left(\sqrt{Ll}\right)=O\left(\sqrt{m\log m}\right) queries (by Algorithm 1). Let us consider two possible cases:

  1.

    If jljr(modd)j_{l}\not\equiv j_{r}\pmod{d}, then by Lemma D.2, there is no matching in the ii-th block and thus hi=h_{i}=\infty;

  2.

    If jljr(modd)j_{l}\equiv j_{r}\pmod{d}, find the smallest jlqRij_{l}\leq q\leq R_{i} such that t[qRi]=p[qjlqjl+Riq]t[q\dots R_{i}]=p[q-j_{l}\dots q-j_{l}+R_{i}-q], where Ri=min{(i+1)L,n}1R_{i}=\min\{(i+1)L,n\}-1 denotes the right endpoint of the ii-th block, by minimum finding in O(L)=O(m)O\left(\sqrt{L}\right)=O\left(\sqrt{m}\right) queries (by Algorithm 1). Then the leftmost candidate will be

    j=jl+qjldd.j=j_{l}+\left\lceil\frac{q-j_{l}}{d}\right\rceil d.

    If jmin{nm,jr}j\leq\min\left\{n-m,j_{r}\right\} and t[jj+m1]=pt[j\dots j+m-1]=p (which can be checked by quantum search in O(m)O(\sqrt{m}) queries), then hi=jh_{i}=j; otherwise, hi=h_{i}=\infty.
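
As mentioned above, here is a classical sketch of the resulting computation of h_i for one block in the periodic case; loops again stand in for minimum finding and quantum search. The names are illustrative, t and p are strings, d is the period of p, and INF stands for infinity.

    INF = float("inf")

    def block_candidate_periodic(t, p, i, L, d, delta, checkpoints):
        """Classical analogue of computing h_i for the i-th block when the
        pattern p is periodic with period d, following the two cases above."""
        n, m = len(t), len(p)
        lo, hi = i * L, min((i + 1) * L, n)
        Ri = hi - 1                                     # right endpoint of the block
        def hits_sample(j):                             # agrees with the sample at all checkpoints?
            return all(0 <= c - delta + j < n and t[c - delta + j] == p[c - delta]
                       for c in checkpoints)
        cands = [j for j in range(lo, hi) if hits_sample(j)]
        if not cands:
            return INF
        jl, jr = cands[0], cands[-1]                    # leftmost and rightmost candidates
        if (jr - jl) % d != 0:                          # Case 1: no occurrence in this block
            return INF
        # Case 2: smallest q in [jl, Ri] with t[q..Ri] = p[q-jl .. q-jl+Ri-q]
        q = next((q for q in range(jl, Ri + 1)
                  if t[q:Ri + 1] == p[q - jl:Ri - jl + 1]), None)
        if q is None:
            return INF
        j = jl + ((q - jl + d - 1) // d) * d            # jl + ceil((q - jl) / d) * d
        if j <= min(n - m, jr) and t[j:j + m] == p:     # final verification
            return j
        return INF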

Correctness. It is not immediately obvious that, in the case jljr(modd)j_{l}\equiv j_{r}\pmod{d}, the leftmost occurrence in the ii-th block is found whenever an occurrence exists in that block; we verify this now. By Lemma D.2, if there exists an occurrence starting at index jljjrj_{l}\leq j\leq j_{r} in the ii-th block, then jljjr(modd)j_{l}\equiv j\equiv j_{r}\pmod{d}. Let the leftmost and the rightmost occurrences of pp in the ii-th block be jlj_{l}^{\prime} and jrj_{r}^{\prime}, respectively. Then jljljrjrj_{l}\leq j_{l}^{\prime}\leq j_{r}^{\prime}\leq j_{r} and jljljrjr(modd)j_{l}\equiv j_{l}^{\prime}\equiv j_{r}^{\prime}\equiv j_{r}\pmod{d}. By the minimality of qq with jlqRij_{l}\leq q\leq R_{i} and t[qRi]=p[qjlqjl+Riq]t[q\dots R_{i}]=p[q-j_{l}\dots q-j_{l}+R_{i}-q], we have qjlq\leq j_{l}^{\prime}, and therefore the candidate determined by qq is

jq=jl+qjlddjl.j_{q}=j_{l}+\left\lceil\frac{q-j_{l}}{d}\right\rceil d\leq j_{l}^{\prime}.

On the other hand, if jqjlj_{q}\neq j_{l}^{\prime}, then the existence of jlj_{l}^{\prime} immediately implies that t[jqjq+m1]t[j_{q}\dots j_{q}+m-1] matches pp, i.e. jq<jlj_{q}<j_{l}^{\prime} is also an occurrence, which contradicts the minimality of jlj_{l}^{\prime}. As a result, we have jq=jlj_{q}=j_{l}^{\prime}. That is, our algorithm finds the leftmost occurrence in the block, if one exists.

Complexity. According to the above discussion, it is clear that hih_{i} can be computed with bounded-error in O(mlogm)O\left(\sqrt{m\log m}\right) queries. Thus, the entire problem can be solved by searching on bounded-error oracles (by Theorem 3.1) and the query complexity is

O(n/Lmlogm)=O(nlogm).O\left(\sqrt{n/L}\sqrt{m\log m}\right)=O\left(\sqrt{n\log m}\right).

Combining the above two cases, we conclude that there is a bounded-error quantum algorithm for pattern matching in O(nlogm+mlog3mloglogm)O\left(\sqrt{n\log m}+\sqrt{m\log^{3}m\log\log m}\right) queries.

Appendix E Proof of Exclusion Rule for LMSR

In this appendix, we present a proof of Lemma 5.1. To this end, we first observe:

Proposition E.1.

Suppose sΣns\in\Sigma^{n} and s[LMSR(s)LMSR(s)+B1]=abas[\operatorname{LMSR}(s)\dots\operatorname{LMSR}(s)+B-1]=aba, where |a|1\lvert a\rvert\geq 1, |b|0\lvert b\rvert\geq 0, and B=2|a|+|b|n/2B=2\lvert a\rvert+\lvert b\rvert\leq n/2. For every m>0m>0 and i[n]i\in[n], if s[ii+|b|+|a|1]=bas[i\dots i+\lvert b\rvert+\lvert a\rvert-1]=ba, then

s[ii+m1]s[i+|b|+|a|i+|b|+|a|+m1].s[i\dots i+m-1]\leq s[i+\lvert b\rvert+\lvert a\rvert\dots i+\lvert b\rvert+\lvert a\rvert+m-1].
Proof.

We prove it by induction on mm.

Basis. For every i[n]i\in[n] with s[ii+|b|+|a|1]=bas[i\dots i+\lvert b\rvert+\lvert a\rvert-1]=ba, we note that s[i+|b|i+|b|+B1]=as[i+|b|+|a|i+|b|+B1]s[i+\lvert b\rvert\dots i+\lvert b\rvert+B-1]=as[i+\lvert b\rvert+\lvert a\rvert\dots i+\lvert b\rvert+B-1]. On the other hand, by the definition of LMSR, we have

s[i+|b|i+|b|+B1]s[LMSR(s)LMSR(s)+B1]=aba.s[i+\lvert b\rvert\dots i+\lvert b\rvert+B-1]\geq s[\operatorname{LMSR}(s)\dots\operatorname{LMSR}(s)+B-1]=aba.

Therefore, it holds that s[i+|b|+|a|i+|b|+B1]ba=s[ii+|b|+|a|1]s[i+\lvert b\rvert+\lvert a\rvert\dots i+\lvert b\rvert+B-1]\geq ba=s[i\dots i+\lvert b\rvert+\lvert a\rvert-1]. Immediately, we see that the proposition holds for 1m|b|+|a|1\leq m\leq\lvert b\rvert+\lvert a\rvert.

Induction. Assume that the proposition holds for m=k(|b|+|a|)m^{\prime}=k(\lvert b\rvert+\lvert a\rvert) and k1k\geq 1, and we are going to prove it for the case m<m(k+1)(|b|+|a|)m^{\prime}<m\leq(k+1)(\lvert b\rvert+\lvert a\rvert). According to the induction hypothesis, we have

s[ii+m1]s[i+|b|+|a|i+|b|+|a|+m1]s[i\dots i+m^{\prime}-1]\leq s[i+\lvert b\rvert+\lvert a\rvert\dots i+\lvert b\rvert+\lvert a\rvert+m^{\prime}-1]

for every i[n]i\in[n] with s[ii+|b|+|a|1]=bas[i\dots i+\lvert b\rvert+\lvert a\rvert-1]=ba. Let us consider the following two cases:

  1.

    s[ii+m1]<s[i+|b|+|a|i+|b|+|a|+m1]s[i\dots i+m^{\prime}-1]<s[i+\lvert b\rvert+\lvert a\rvert\dots i+\lvert b\rvert+\lvert a\rvert+m^{\prime}-1]. In this case, it is trivial that s[ii+m1]<s[i+|b|+|a|i+|b|+|a|+m1]s[i\dots i+m-1]<s[i+\lvert b\rvert+\lvert a\rvert\dots i+\lvert b\rvert+\lvert a\rvert+m-1] for every m>mm>m^{\prime}.

  2.

    s[ii+m1]=s[i+|b|+|a|i+|b|+|a|+m1]s[i\dots i+m^{\prime}-1]=s[i+\lvert b\rvert+\lvert a\rvert\dots i+\lvert b\rvert+\lvert a\rvert+m^{\prime}-1]. In this case, we have s[ii+(k+1)(|b|+|a|)1]=(ba)k+1s[i\dots i+(k+1)(\lvert b\rvert+\lvert a\rvert)-1]=(ba)^{k+1}, and

    s[i+|b|+|a|i+(k+2)(|b|+|a|)1]=(ba)ks[i+(k+1)(|b|+|a|)i+(k+2)(|b|+|a|)1].s[i+\lvert b\rvert+\lvert a\rvert\dots i+(k+2)(\lvert b\rvert+\lvert a\rvert)-1]=(ba)^{k}s[i+(k+1)(\lvert b\rvert+\lvert a\rvert)\dots i+(k+2)(\lvert b\rvert+\lvert a\rvert)-1].

    According to the induction hypothesis for i=i+m=i+k(|b|+|a|)i^{\prime}=i+m^{\prime}=i+k(\lvert b\rvert+\lvert a\rvert) (this can be derived from the index i′′=imodn[n]i^{\prime\prime}=i^{\prime}\bmod n\in[n]), we have:

    s[i+k(|b|+|a|)i+(k+1)(|b|+|a|)1]s[i+(k+1)(|b|+|a|)i+(k+2)(|b|+|a|)1].s[i+k(\lvert b\rvert+\lvert a\rvert)\dots i+(k+1)(\lvert b\rvert+\lvert a\rvert)-1]\leq s[i+(k+1)(\lvert b\rvert+\lvert a\rvert)\dots i+(k+2)(\lvert b\rvert+\lvert a\rvert)-1].

    Therefore, we obtain s[ii+(k+1)(|b|+|a|)1]s[i+|b|+|a|i+(k+2)(|b|+|a|)1]s[i\dots i+(k+1)(\lvert b\rvert+\lvert a\rvert)-1]\leq s[i+\lvert b\rvert+\lvert a\rvert\dots i+(k+2)(\lvert b\rvert+\lvert a\rvert)-1], which means the proposition holds for m=m+|b|+|a|m=m^{\prime}+\lvert b\rvert+\lvert a\rvert. Immediately, we see that the proposition also holds for m<m(k+1)(|b|+|a|)m^{\prime}<m\leq(k+1)(\lvert b\rvert+\lvert a\rvert). ∎
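
Proposition E.1 can be sanity-checked by brute force on short binary strings; the snippet below (purely illustrative, not part of the proof) enumerates all decompositions of the form aba of the length-B prefix of the minimal rotation and compares the relevant cyclic substrings directly. The helper names cyc, lmsr and check_proposition_e1 are ad hoc.

    from itertools import product

    def cyc(s, i, m):
        """Cyclic substring s[i .. i+m-1], indices taken modulo len(s)."""
        return "".join(s[(i + t) % len(s)] for t in range(m))

    def lmsr(s):
        """Index of the lexicographically minimal rotation (brute force)."""
        return min(range(len(s)), key=lambda i: cyc(s, i, len(s)))

    def check_proposition_e1(n_max=8, alphabet="ab"):
        for n in range(2, n_max + 1):
            for s in map("".join, product(alphabet, repeat=n)):
                r = lmsr(s)
                for la in range(1, n // 2 + 1):                 # |a| >= 1
                    for lb in range(0, n // 2 - 2 * la + 1):    # B = 2|a| + |b| <= n/2
                        if cyc(s, r, la) != cyc(s, r + la + lb, la):
                            continue                            # prefix is not of the form a b a
                        ba = cyc(s, r + la, lb + la)            # the string b a
                        for i in range(n):
                            if cyc(s, i, lb + la) != ba:
                                continue
                            for m in range(1, n + 1):
                                assert cyc(s, i, m) <= cyc(s, i + lb + la, m), (s, i, m)
        print("Proposition E.1 verified on all tested strings")

    check_proposition_e1()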

Now we are ready to prove Lemma 5.1. Let δ=ji\delta=j-i; then 1δB11\leq\delta\leq B-1. We consider the following two cases:

  • Case 1. δ>B/2\delta>B/2. In this case, s[LMSR(s)LMSR(s)+B1]=abas[\operatorname{LMSR}(s)\dots\operatorname{LMSR}(s)+B-1]=aba for some strings aa and bb, where |a|=Bδ\lvert a\rvert=B-\delta and |b|=2δB\lvert b\rvert=2\delta-B. In order to prove that LMSR(s)j\operatorname{LMSR}(s)\neq j, it is sufficient to show that s[ii+n1]s[jj+n1]s[i\dots i+n-1]\leq s[j\dots j+n-1]. Note that

    s[ii+n1]\displaystyle s[i\dots i+n-1] =ababas[j+Bj+nδ1],\displaystyle=ababas[j+B\dots j+n-\delta-1],
    s[jj+n1]\displaystyle s[j\dots j+n-1] =abas[j+Bj+n1].\displaystyle=abas[j+B\dots j+n-1].

    We only need to show that bas[j+Bj+nδ1]s[j+Bj+n1]bas[j+B\dots j+n-\delta-1]\leq s[j+B\dots j+n-1], that is, s[i+Bi+n1]s[i+B+δi+δ+n1]s[i+B\dots i+n-1]\leq s[i+B+\delta\dots i+\delta+n-1], which can be immediately obtained from Proposition E.1 by letting mnBm\equiv n-B and ii+Bi\equiv i+B.

  • Case 2. δB/2\delta\leq B/2. Let t=s[LMSR(s)LMSR(s)+B1]t=s[\operatorname{LMSR}(s)\dots\operatorname{LMSR}(s)+B-1]. Note that t[x+δ]=t[x]t[x+\delta]=t[x] for every x[Bδ]x\in[B-\delta]. We immediately see that tt has period d=gcd(B,δ)d=\gcd(B,\delta), that is, t=akt=a^{k}, where |a|=d\lvert a\rvert=d and B=kdB=kd with k2k\geq 2. For convenience, we denote δ=ld\delta=ld for some 1lk11\leq l\leq k-1. In order to prove that LMSR(s)j\operatorname{LMSR}(s)\neq j, it is sufficient to show that s[ii+n1]s[jj+n1]s[i\dots i+n-1]\leq s[j\dots j+n-1]. Note that

    s[ii+n1]\displaystyle s[i\dots i+n-1] =ak+ls[j+Bj+nδ1],\displaystyle=a^{k+l}s[j+B\dots j+n-\delta-1],
    s[jj+n1]\displaystyle s[j\dots j+n-1] =aks[j+Bj+n1].\displaystyle=a^{k}s[j+B\dots j+n-1].

    We only need to show that als[j+Bj+nδ1]s[j+Bj+n1]a^{l}s[j+B\dots j+n-\delta-1]\leq s[j+B\dots j+n-1], i.e. s[i+Bi+n1]s[i+B+δi+δ+n1]s[i+B\dots i+n-1]\leq s[i+B+\delta\dots i+\delta+n-1], which can be immediately obtained from Proposition E.1 by noting that s[LMSR(s)LMSR(s)+B1]=alak2lals[\operatorname{LMSR}(s)\dots\operatorname{LMSR}(s)+B-1]=a^{l}a^{k-2l}a^{l}.

Appendix F Worst-case Quantum Lower Bound

The worst-case quantum lower bound can be examined in a way different from that in Section 6.2. Let us consider a special case where all strings are binary, that is, the alphabet is Σ={0,1}\Sigma=\{0,1\}. A solution to the LMSR problem immediately reveals whether the string contains the character 0: s[i]=0s[i]=0 for some i[n]i\in[n] if and only if s[LMSR(s)]=0s[\operatorname{LMSR}(s)]=0. Therefore, the (unstructured) search problem can be reduced to LMSR. It is known that the search problem has a worst-case query complexity lower bound Ω(n)\Omega\left(\sqrt{n}\right) for bounded-error quantum algorithms [BBBV97, BBHT98, Zal99], and Ω(n)\Omega(n) for exact and zero-error quantum algorithms [BBC+01]. Consequently, the LMSR problem also has a worst-case quantum query complexity lower bound Ω(n)\Omega\left(\sqrt{n}\right) for bounded-error quantum algorithms, and Ω(n)\Omega(n) for exact and zero-error quantum algorithms.
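
The reduction can be made concrete in a few lines; lmsr below is any brute-force classical LMSR routine, used only to illustrate the reduction, not the quantum algorithm.

    def lmsr(s):
        """Brute-force LMSR: index of the lexicographically minimal rotation."""
        n = len(s)
        return min(range(n), key=lambda i: (s + s)[i:i + n])

    def contains_zero(s):
        """Search reduces to LMSR: a binary string contains '0' if and only
        if its lexicographically minimal rotation starts with '0'."""
        return s[lmsr(s)] == "0"

    print(contains_zero("1101"), contains_zero("1111"))   # True False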