
University of Warsaw, Poland and Samsung R&D Poland, jrad@mimuw.edu.pl, https://orcid.org/0000-0002-0067-6401

University of Warsaw, Poland, rytter@mimuw.edu.pl, https://orcid.org/0000-0002-9162-6724

University of Warsaw, Poland, jks@mimuw.edu.pl, 0000-0003-2207-0053

University of Warsaw, Poland, walen@mimuw.edu.pl, https://orcid.org/0000-0002-7369-3309

University of Warsaw, Poland, w.zuba@mimuw.edu.pl, https://orcid.org/0000-0002-1988-3507

Acknowledgements.
The authors warmly thank Paweł Gawrychowski and Tomasz Kociumaka for helpful discussions.

Copyright: Jakub Radoszewski, Wojciech Rytter, Juliusz Straszyński, Tomasz Waleń, Wiktor Zuba

CCS concepts: Theory of computation, Pattern matching

Hardness of Detecting Abelian and Additive Square Factors in Strings

Jakub Radoszewski    Wojciech Rytter    Juliusz Straszyński    Tomasz Waleń    Wiktor Zuba
Abstract

We prove 3SUM-hardness (no strongly subquadratic-time algorithm, assuming the 3SUM conjecture) of several problems related to finding Abelian square and additive square factors in a string. In particular, we conclude conditional optimality of the state-of-the-art algorithms for finding such factors.

Overall, we show 3SUM-hardness of (a) detecting an Abelian square factor of an odd half-length, (b) computing centers of all Abelian square factors, (c) detecting an additive square factor in a length-$n$ string of integers of magnitude $n^{\mathcal{O}(1)}$, and (d) a problem of computing a double 3-term arithmetic progression (i.e., finding indices $i\neq j$ such that $(x_{i}+x_{j})/2=x_{(i+j)/2}$) in a sequence of integers $x_{1},\dots,x_{n}$ of magnitude $n^{\mathcal{O}(1)}$.

Problem (d) is essentially a convolution version of the AVERAGE problem that was proposed in a manuscript of Erickson. We obtain a conditional lower bound for it with the aid of techniques recently developed by Dudek et al. [STOC 2020]. Problem (d) immediately reduces to problem (c) and is a step in reductions to problems (a) and (b). In conditional lower bounds for problems (a) and (b) we apply an encoding of Amir et al. [ICALP 2014] and extend it using several string gadgets that include arbitrarily long Abelian-square-free strings.

Our reductions also imply conditional lower bounds for detecting Abelian squares in strings over a constant-sized alphabet. We also show a subquadratic upper bound in this case, applying a result of Chan and Lewenstein [STOC 2015].

keywords:
Abelian square, additive square, 3SUM problem

1 Introduction

Abelian squares.

An Abelian square, Ab-square in short (also known as a jumbled square), is a string of the form $XY$, where $Y$ is a permutation of $X$; we say that $X$ and $Y$ are Ab-equivalent. We are interested in factors (i.e., substrings composed of consecutive letters) of a given text string being Ab-squares.

Example 1.1.

The string

061056516101111065657861650510566506030652

has exactly two Ab-square factors of length 12, shown above (but it has also Ab-squares of other lengths, e.g. 5665, 11, 1111, 011110).
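The definition can be checked directly by comparing letter counts of the two halves. The following is a minimal brute-force sketch, not one of the algorithms cited in this paper; the function names `is_ab_square` and `ab_square_factors` are illustrative assumptions.

```python
from collections import Counter


def is_ab_square(s):
    """Return True if s = XY with |X| = |Y| > 0 and Y a permutation of X."""
    if len(s) == 0 or len(s) % 2 == 1:
        return False
    half = len(s) // 2
    # Two strings are Ab-equivalent iff their letter counts (Parikh vectors) agree.
    return Counter(s[:half]) == Counter(s[half:])


def ab_square_factors(text, length):
    """0-based start positions of all Ab-square factors of the given length."""
    return [i for i in range(len(text) - length + 1)
            if is_ab_square(text[i:i + length])]
```

For instance, `is_ab_square("011110")` holds because both halves contain one `0` and two `1`s.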

Ab-squares were first studied by Erdős [16], who posed a question on the smallest alphabet size for which there exists an infinite Ab-square-free string, i.e., an infinite string without Ab-square factors. The first example of such a string over a finite alphabet was given by Evdokimov [18]. Later the alphabet size was improved to five by Pleasants [34] and finally an optimal example over a four-letter alphabet was shown by Keränen [27]. Further results on combinatorics of Ab-square-free strings and several examples of their applications in group theory, algorithmic music and cryptography can be found in [26] and references therein. Avoidability of long Ab-squares was also considered [36].

Strings containing Ab-squares were also studied. Motivated by another problem posed by Erdős [16], Entringer et al. [15] showed that every infinite binary string has arbitrarily long Ab-square factors. Fici et al. [19] considered infinite strings containing many distinct Ab-squares. A string of length $n$ may contain $\Theta(n^{2})$ Ab-square factors that are distinct as strings, but contains only $\mathcal{O}(n^{11/6})$ Ab-squares which are pairwise Abelian nonequivalent (correspond to different Parikh vectors), see [28]. It is also conjectured that a binary string of length $n$ must have at least $\left\lfloor n/4\right\rfloor$ distinct [20] and nonequivalent [21] Ab-square factors. For more conjectures related to combinatorics of Ab-square factors of strings and circular strings, see [39].

Several algorithms computing Ab-square factors of a string are known. All Ab-squares in a string of length $n$ can be computed in $\mathcal{O}(n^{2})$ time [13]. For a string over a constant-sized alphabet, all Ab-square factors of a string can be computed in $\mathcal{O}(n^{2}/\log^{2}n+\mathsf{output})$ time and the longest Ab-square can be computed in $\mathcal{O}(n^{2}/\log^{2}n)$ time [29, 30]. Moreover, for a string of length $n$ that is given by its run-length encoding consisting of $r$ runs, the longest Ab-square that ends at each position can be computed in $\mathcal{O}(|\Sigma|(r^{2}+n))$ time [2] or in $\mathcal{O}(rn)$ time [40]; both approaches require $\Omega(n^{2})$ time in the worst case.

In [37] a different problem of enumerating strings being Ab-squares was considered.

Additive squares.

An additive square is an even-length string over an integer alphabet such that the sums of characters of the halves of this string are the same.

Example 1.2.

The following string has exactly 4 additive squares of length 10, as shown.

1203212023210123

All of them except for the rightmost one are also Ab-squares. This string does not contain any longer additive square. Altogether this string has 8 additive square factors.
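The counts in Example 1.2 can be reproduced with prefix sums, which make each candidate factor checkable in constant time. This is a quadratic-time sketch for illustration only; the name `additive_squares` is an assumption and not part of the paper.

```python
from itertools import accumulate


def additive_squares(seq):
    """Return (start, half_length) pairs of all additive square factors of seq.

    Positions are 0-based. A factor seq[start : start + 2 * half] is an
    additive square when both of its halves have the same sum.
    """
    prefix = [0] + list(accumulate(seq))  # prefix[k] = seq[0] + ... + seq[k-1]
    n = len(seq)
    found = []
    for half in range(1, n // 2 + 1):
        for start in range(n - 2 * half + 1):
            mid = start + half
            end = mid + half
            if prefix[mid] - prefix[start] == prefix[end] - prefix[mid]:
                found.append((start, half))
    return found
```

On the string of Example 1.2, read as a sequence of digits, this reports 8 additive squares in total, 4 of which have length 10.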

An Ab-square (over an integer alphabet) is an additive square, but not necessarily the other way around. Combinatorially, problems related to additive squares are hard; in particular, avoiding additive squares seems more difficult than avoiding Ab-squares. There are infinitely many strings over $\{0,1,2,3\}$ avoiding Ab-squares, but there are only finitely many strings over the same alphabet avoiding additive squares; see [22].

In fact it is unknown if there are infinitely many strings over any finite integer alphabet avoiding additive squares [7, 25, 33]. For additive cubes, however, the property was proved in [9] (see also [32]).

Nowadays, combinatorial study of Ab-square and additive square factors often involves computer experiments; see e.g. [9, 19, 36]. In addition to other applications, efficient algorithms detecting such types of squares could provide a significant aid in this research. In the case of classic square factors (i.e., factors of the form $XX$), a linear-time algorithm for computing them is known for a string over a constant-sized alphabet [24] and over an integer alphabet [4, 12]. We show that, unfortunately, in many cases the existence of near-linear-time algorithms for detecting Ab-square and additive square factors is unlikely, based on conjectured hardness of the \textsc{3SUM} problem.

\textsc{3SUM} problem.

The problem asks if there are distinct elements $a,b,c\in S$ such that $a+b=c$ for a given set $S$ of $n$ integers; see [35]. It is a general belief that the following conjecture is true for the word-RAM model.

\textsc{3SUM} conjecture:

There is no $\mathcal{O}(n^{2-\varepsilon})$ time algorithm for the \textsc{3SUM} problem, for any constant $\varepsilon>0$.

A problem with input of size $n$ is called \textsc{3SUM}-hard if an $\mathcal{O}(n^{2-\varepsilon})$-time solution to the problem implies an $\mathcal{O}(n^{2-\varepsilon^{\prime}})$-time solution for \textsc{3SUM}, for some constants $\varepsilon,\varepsilon^{\prime}>0$.
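For reference, the quadratic baseline that the conjecture says cannot be substantially improved can be sketched as follows. This is a textbook hashing approach written for this exposition, with the hypothetical name `three_sum`; it uses expected $\mathcal{O}(n^{2})$ time.

```python
def three_sum(values):
    """Return distinct (a, b, c) from the set of values with a + b = c, or None."""
    s = set(values)
    items = sorted(s)
    for idx, a in enumerate(items):
        for b in items[idx + 1:]:  # a < b, so a and b are distinct
            c = a + b
            # c must be a third element, different from both a and b.
            if c in s and c != a and c != b:
                return (a, b, c)
    return None
```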

Our results.

  • We show that the problems of computing all centers of Ab-square factors and detecting an odd half-length Ab-square factor, called an odd Ab-square (consequently also computing all lengths of Ab-square factors), for a length-$n$ string over an alphabet of size $\omega(1)$, cannot be solved in $\mathcal{O}(n^{2-\varepsilon})$ time, for constant $\varepsilon>0$, unless the 3SUM conjecture fails. Weaker conditional lower bounds are also stated in the case of a constant-sized alphabet.

  • For constant-sized alphabets, we show strongly sub-quadratic algorithms for these problems based on an involved result of [11] related to jumbled indexing.

  • En route we prove that detection of a double 3-term arithmetic progression (see [8]) and additive squares in a length-$n$ sequence of integers of magnitude $n^{\mathcal{O}(1)}$ is \textsc{3SUM}-hard.

We obtain deterministic conditional lower bounds from a convolution version of \textsc{3SUM} that is well-known to be \textsc{3SUM}-hard.

Related work.

In the jumbled indexing problem, we are given a text $T$ and are to answer queries for a pattern specified by a Parikh vector which gives, for each letter of the alphabet, the number of occurrences of this letter in the pattern. For each query, we are to check if there is a factor of the text that is Ab-equivalent to the pattern (existence query) or report all such factors (reporting query). Chan and Lewenstein [11] showed a data structure that can be constructed in truly subquadratic expected time and answers existence queries in truly sublinear time for a constant-sized alphabet (deterministic constructions for very small alphabets were also shown). Amir et al. [3] showed under a 3SUM-hardness assumption that jumbled indexing with existence queries requires $\Omega(n^{2-\varepsilon})$ preprocessing time or $\Omega(n^{1-\delta})$ queries for any $\varepsilon,\delta>0$ for an alphabet of size $\omega(1)$. They also provided particular constants $\varepsilon_{\sigma},\delta_{\sigma}$ for an alphabet of a constant size $\sigma\geq 3$ such that, under a stronger 3SUM-hardness assumption, jumbled indexing requires $\Omega(n^{2-\varepsilon_{\sigma}})$ preprocessing time or $\Omega(n^{1-\delta_{\sigma}})$ queries. We use the techniques from both results in our algorithm and conditional lower bound for Ab-squares, respectively. The lower bound of Amir et al. was later improved and extended to both existence and reporting variants and any constant $\sigma\geq 2$ by Goldstein et al. [23, Section 7] with the aid of randomization. Moreover, recently an unconditional lower bound for the reporting variant was given in [1].

Our techniques.

A subsequence of three distinct positions is a 3-term double arithmetic progression (3dap in short) if it is an arithmetic progression and the elements on these positions also form an arithmetic progression. The problem of finding a 3dap in a sequence is denoted by \textsc{3DAP}. A 3dap is odd if its first and third positions are odd and its middle position is even; the corresponding problem is denoted by \textsc{Odd-3DAP}. First we reduce the convolution variant of 3SUM (known to be 3SUM-hard) to the \textsc{3DAP} problem via \textsc{Odd-3DAP} as an intermediate problem. This uses a divide-and-conquer approach and a partition of sets into sets avoiding bad arithmetic progressions of length 3.

The \textsc{3DAP} problem reduces in a simple way to detection of an additive square, showing that the latter problem is 3SUM-hard.

Next, the \textsc{3DAP} problem is encoded as a string. We follow the high-level idea from Amir et al. Instead of checking equality of numbers, we can check equality of their remainders modulo sufficiently many prime numbers. Each prime number then corresponds to a distinct character. If the numbers are $\mathrm{poly}\,n$, then only $\mathcal{O}(\log n)$ prime numbers are needed. However, there is a certain technical complication, already present in the paper of Amir et al., which requires introducing additional gadgets working as equalizers. The details differ from the construction of Amir et al., mostly because in the end we want to ask about detection, not indexing.

Then we consider the problem of computing all centers of Ab-squares, which requires new gadgets. We show that computing all centers of Ab-squares is 3SUM-hard, as well as detection of any Ab-square which is well centred.

Later we extend this to detection of any odd Ab-square. We use a construction of a string over an alphabet of size 4 with no Ab-square. The input string is “shuffled” with such a string, with some separators added. This forces odd Ab-squares to be well centred. In this way we reduce the previously considered problem of detection of any well-centred Ab-square to the detection of any odd Ab-square. Ultimately, this shows that the latter problem is 3SUM-hard.

2 From \textsc{Conv3SUM} to finding double 3-term arithmetic progressions

For integers $a,b$, by $[a,b]$ we denote the set $\{a,\dots,b\}$. We use the following convolution variant of the \textsc{3SUM} problem that is \textsc{3SUM}-hard; see [10, 31, 35] for both randomized and deterministic reductions. As already noted in [3], the range of elements can be made $[-N^{2},N^{2}]$ using a randomized hashing reduction from [5, 35].

$\operatorname{\textsc{Conv3SUM}}(\bar{x})$

Input: A sequence $\bar{x}=[x_{1},\dots,x_{N}]$ of integers from $[-N^{2},N^{2}]$.

Output: Yes if there are $i\neq j$ such that $x_{i}+x_{j}=x_{i+j}$; no otherwise.
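To make the problem statement concrete, here is a direct quadratic check. It is only a brute-force sketch; the name `conv3sum` and the convention of returning a witness pair are assumptions made for illustration.

```python
def conv3sum(x):
    """Return 1-based (i, j), i < j, with x_i + x_j = x_{i+j}, or None.

    The Python list x stores x_1, ..., x_N at indices 0, ..., N-1.
    """
    n = len(x)
    for i in range(1, n + 1):
        # j > i suffices by symmetry; i + j <= n keeps x_{i+j} defined.
        for j in range(i + 1, n - i + 1):
            if x[i - 1] + x[j - 1] == x[i + j - 1]:
                return (i, j)
    return None
```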

Let us denote $\mathit{mid}(a,b)=(a+b)/2$ and define the condition:

$$\mathbf{\Lambda}_{\bar{x}}(i,j)\;=\;(\,i\neq j\ \land\ j-i\ \text{is even}\ \land\ x_{\mathit{mid}(i,j)}=\mathit{mid}(x_{i},x_{j})\,).$$

We omit the subscript ${\bar{x}}$ if it is clear from the context. The last part of the condition is equivalent to $x_{j}-x_{\mathit{mid}(i,j)}=x_{\mathit{mid}(i,j)}-x_{i}$.

Our first goal is to reduce the \textsc{Conv3SUM} problem to the following one with $K=N^{\mathcal{O}(1)}$.

Double 3-Term Arithmetic Progression, $\operatorname{\textsc{3DAP}}(\bar{x})$

Input: $\bar{x}=[x_{1},\dots,x_{n}]$, each $x_{i}$ is in $[0,K]$.

Output: $(\exists\,i,j)\;\mathbf{\Lambda}(i,j)$.
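The condition $\mathbf{\Lambda}$ and the \textsc{3DAP} problem translate into the following brute-force sketch. The names `lam` and `three_dap` are hypothetical; the comparison is written as $2x_{\mathit{mid}(i,j)}=x_{i}+x_{j}$ to stay within integers.

```python
def lam(x, i, j):
    """Condition Lambda_x(i, j) for 1-based positions i, j."""
    if i == j or (j - i) % 2 != 0:
        return False
    m = (i + j) // 2
    return 2 * x[m - 1] == x[i - 1] + x[j - 1]


def three_dap(x):
    """Return the first 1-based pair (i, j), i < j, with Lambda(i, j), or None."""
    n = len(x)
    for i in range(1, n + 1):
        for j in range(i + 2, n + 1, 2):  # j - i must be even and non-zero
            if lam(x, i, j):
                return (i, j)
    return None
```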

In Section 2.1 we obtain a reduction of \textsc{Conv3SUM} to an intermediate version of \textsc{3DAP} with additional constraints on $i,j$, and in Section 2.2 we show how these constraints can be avoided.

2.1 From \textsc{Conv3SUM} to \textsc{Odd-3DAP}

Let us fix an integer sequence $x_{1},\dots,x_{N}$. For an arithmetic progression (arithmetic sequence) $\mathcal{I}=i_{1},\dots,i_{n}$, where $1\leq i_{1}<\dots<i_{n}\leq N$, i.e. $i_{2}-i_{1}=\dots=i_{n}-i_{n-1}$, we define the following extended functions.

\begin{align*}
\operatorname{\textsc{Conv3SUM}}(\bar{x},\mathcal{I})\, &=\,(\,\exists\,i_{a},i_{b}\in\mathcal{I}\;:\;x_{i_{a}}+x_{i_{b}}=x_{i_{a}+i_{b}},\,i_{a}<i_{b})\\
\operatorname{\textsc{OddConv3SUM}}(\bar{x},\mathcal{I})\, &=\,(\,\exists\,i_{a},i_{b}\in\mathcal{I}\;:\;x_{i_{a}}+x_{i_{b}}=x_{i_{a}+i_{b}},\,a+b\ \mbox{is odd}).
\end{align*}

Note that it can happen that $i_{a}+i_{b}\notin\mathcal{I}$. For a fixed $\bar{x}$ the input size is $|\mathcal{I}|$.

Lemma 2.1.

An instance of $\operatorname{\textsc{Conv3SUM}}(\bar{x})$ can be reduced to an alternative of $\mathcal{O}(N)$ instances of $\operatorname{\textsc{OddConv3SUM}}(\bar{x},\mathcal{I})$ of total size $\mathcal{O}(N\log N)$ in $\mathcal{O}(N\log N)$ time.

Proof 2.2.

If $\mathcal{I}=i_{1},\dots,i_{n}$, by $\mathcal{I}_{\mathit{odd}}$ and $\mathcal{I}_{\mathit{even}}$ we denote the subsequences $i_{1},i_{3},\dots$ and $i_{2},i_{4},\dots$, respectively. We proceed recursively as shown in the following function \textsc{Conv3SUM}, with the first call to $\operatorname{\textsc{Conv3SUM}}(\bar{x},[1,2,\ldots,N])$.

function $\operatorname{\textsc{Conv3SUM}}(\bar{x},\mathcal{I})$  Comment: $\mathcal{I}$ is an arithmetic progression; a two-element $\mathcal{I}$ still contains one pair with $a+b$ odd
if $|\mathcal{I}|\leq 1$ then return false;
return $\operatorname{\textsc{OddConv3SUM}}(\bar{x},\mathcal{I})\lor\operatorname{\textsc{Conv3SUM}}(\bar{x},\mathcal{I}_{\mathit{odd}})\lor\operatorname{\textsc{Conv3SUM}}(\bar{x},\mathcal{I}_{\mathit{even}})$;

Correctness. Let $\mathcal{I}=i_{1},\dots,i_{n}$ and assume there are two indices $a,b$ such that $x_{i_{a}}+x_{i_{b}}=x_{i_{a}+i_{b}}$. If $a+b$ is odd, then $\operatorname{\textsc{OddConv3SUM}}(\bar{x},\mathcal{I})$ returns true. Otherwise both $a,b$ are of the same parity, so $i_{a},i_{b}\in\mathcal{I}_{\mathit{odd}}$ or $i_{a},i_{b}\in\mathcal{I}_{\mathit{even}}$. Consequently, the problem is split recursively into subproblems that correspond to $\mathcal{I}_{\mathit{odd}}$ and $\mathcal{I}_{\mathit{even}}$.

Complexity. Let us observe that one call to $\operatorname{\textsc{Conv3SUM}}(\bar{x},\mathcal{I})$ creates an instance of \textsc{OddConv3SUM} of $\mathcal{O}(|\mathcal{I}|)$ size in $\mathcal{O}(|\mathcal{I}|)$ time ($\bar{x}$ does not change). Let $\#(n)$ and $S(n)$ denote the total number and size of all instances of $\mathcal{I}$ generated by $\operatorname{\textsc{Conv3SUM}}(\bar{x},\mathcal{I})$, when initially $|\mathcal{I}|=n$. We then have

$$\#(n)=\#(\lfloor n/2\rfloor)+\#(\lceil n/2\rceil)+1\quad\mbox{and}\quad S(n)=S(\lfloor n/2\rfloor)+S(\lceil n/2\rceil)+\Theta(n),$$

which yields $\#(N)=\mathcal{O}(N)$ and $S(N)=\mathcal{O}(N\log N)$. The reduction takes $\mathcal{O}(S(N))$ time.
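The recursion of Lemma 2.1 can be sketched in Python as follows. This is an illustrative version under two assumptions: \textsc{OddConv3SUM} is solved by brute force here (in the reduction it is passed on as an instance), and the recursion stops only at progressions with fewer than two elements, because a two-element progression still contains one pair with $a+b$ odd. The function names are hypothetical.

```python
def odd_conv3sum(x, prog):
    """Brute-force OddConv3SUM on progression prog of 1-based indices into x."""
    n = len(x)
    for a in range(len(prog)):
        # b - a odd is the same as a + b odd, in 0-based and 1-based numbering.
        for b in range(a + 1, len(prog), 2):
            ia, ib = prog[a], prog[b]
            if ia + ib <= n and x[ia - 1] + x[ib - 1] == x[ia + ib - 1]:
                return True
    return False


def conv3sum_recursive(x, prog=None):
    """Decide Conv3SUM by splitting prog into its odd- and even-numbered terms."""
    if prog is None:
        prog = list(range(1, len(x) + 1))
    if len(prog) < 2:
        return False
    return (odd_conv3sum(x, prog)
            or conv3sum_recursive(x, prog[0::2])   # I_odd  = i_1, i_3, ...
            or conv3sum_recursive(x, prog[1::2]))  # I_even = i_2, i_4, ...
```

For example, for $\bar{x}=[1,0,2,3]$ the only solution is $x_{1}+x_{3}=x_{4}$, which is found in the recursive call on the progression $1,3$.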

We say that a 3-element arithmetic progression is a good progression if the middle element is even and two others are odd and introduce the following problem.

$\operatorname{\textsc{Odd\mbox{-}3DAP}}(\bar{x})$

Input: $\bar{x}=[x_{1},\dots,x_{n}]$, each $x_{i}$ is in $[-\mathcal{O}(N^{2}),\mathcal{O}(N^{2})]$.

Output: $(\,\exists\,i,j\,)\;[\,\mathbf{\Lambda}(i,j)$ and $(i,\mathit{mid}(i,j),j)$ is a good progression$\,]$.

Lemma 2.3.

$\operatorname{\textsc{OddConv3SUM}}(\bar{x},\mathcal{I})$ is reducible in $\mathcal{O}(|\mathcal{I}|)$ time and space to $\operatorname{\textsc{Odd\mbox{-}3DAP}}(\bar{y})$, where $|\bar{y}|=\mathcal{O}(|\mathcal{I}|)$.

Proof 2.4.

Let $\mathcal{I}=i_{1},\dots,i_{n}$. Define $\alpha_{N}=2N^{2}+1$ and let $\bar{y}$ be a sequence of length $2n-1$ that is created as follows:

  1. put $x_{i_{1}},x_{i_{2}},\dots,x_{i_{n}}$ at subsequent odd positions in $\bar{y}$;

  2. at each even position $2j$, put $x_{i_{j}+i_{j+1}}$ or, if $i_{j}+i_{j+1}>N$, put $\alpha_{N}$;

  3. multiply elements on odd positions by 2.

After the first two steps $\operatorname{\textsc{OddConv3SUM}}(\bar{x},\mathcal{I})$ is equivalent to $(\exists\,i,j)\;y_{\mathit{mid}(i,j)}=y_{i}+y_{j}$ for odd $i,j$ and even $\mathit{mid}(i,j)$; see Figure 1. Doubling the elements on odd positions turns this condition into $y_{\mathit{mid}(i,j)}=\mathit{mid}(y_{i},y_{j})$, so after the third step $\operatorname{\textsc{OddConv3SUM}}(\bar{x},\mathcal{I})$ is equivalent to $\operatorname{\textsc{Odd\mbox{-}3DAP}}(\bar{y})$.

[Figure: the sequence $x_{1},\mathbf{x_{3}},x_{2},\mathbf{x_{5}},x_{3},\mathbf{x_{7}},x_{4},\mathbf{x_{9}},x_{5},\mathbf{x_{11}},x_{6},*,x_{7},*,x_{8},*,x_{9},*,x_{10},*,x_{11}$, with elements on even positions in bold and arcs connecting selected elements.]

Figure 1: The sequence constructed in Lemma 2.3 for $\bar{x}\,=\,[x_{1},x_{2},x_{3},\dots,x_{11}]$ and $\mathcal{I}=(1,2,\ldots,11)$ after the first two steps ($*$ denotes $\alpha_{11}$). Note that the elements connected by arcs all have their sum of indices equal to $7$; this is because $\mathcal{I}$ is an arithmetic progression.
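The construction in the proof of Lemma 2.3 can be sketched as below. Two points are assumptions of this sketch: the doubling is applied to the odd positions, since $2y_{\mathit{mid}(i,j)}=y_{i}+y_{j}$ must match $x_{i_{a}+i_{b}}=x_{i_{a}}+x_{i_{b}}$, and the input values are taken to lie in $[-N^{2},N^{2}]$ so that $\alpha_{N}$ cannot take part in a solution. The function names are hypothetical.

```python
def odd_conv_to_odd_3dap(x, prog):
    """Build the sequence y of length 2n - 1 from x and the progression prog.

    x stores x_1..x_N (values assumed to lie in [-N^2, N^2]); prog is an
    arithmetic progression of 1-based indices into x.
    """
    big_n = len(x)
    alpha = 2 * big_n * big_n + 1  # filler that cannot match any doubled sum
    y = []
    for j, idx in enumerate(prog):
        y.append(2 * x[idx - 1])  # odd position of y, doubled (steps 1 and 3)
        if j + 1 < len(prog):
            s = idx + prog[j + 1]
            y.append(x[s - 1] if s <= big_n else alpha)  # even position (step 2)
    return y


def odd_3dap(y):
    """Brute-force Odd-3DAP: odd positions i < j with even mid(i, j)."""
    n = len(y)
    for i in range(1, n + 1, 2):
        for j in range(i + 2, n + 1, 4):  # (j - i) / 2 odd, so the middle is even
            m = (i + j) // 2
            if 2 * y[m - 1] == y[i - 1] + y[j - 1]:
                return True
    return False
```

For $\bar{x}=[1,0,2,3]$ and the progression $1,3$ the result is $[2,3,4]$, a 3-term progression that witnesses $x_{1}+x_{3}=x_{4}$.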

2.2 From \textsc{Odd-3DAP} to \textsc{3DAP}

Our main tool in this subsection is partitioning a set of integers into progression-free sets. A set of integers $A$ is called progression-free if it does not contain a non-constant three-element arithmetic progression. We use the following recent result that extends a classical paper of Behrend [6].

Theorem 2.5 ([14]).

Any set $A\subseteq[1,n]$ can be partitioned into $n^{o(1)}$ progression-free sets in $n^{1+o(1)}$ time.

Lemma 2.6.

We can construct in $n^{1+o(1)}$ time a family $\mathcal{F}$ of $n^{o(1)}$ subsets of $[1,n]$ satisfying:

  (a) Each good 3-element progression is contained in some $S\in\mathcal{F}$.

  (b) If $S\in\mathcal{F}$, then all 3-element arithmetic progressions in $S$ are good.

Proof 2.7.

Let us divide the elements from $[1,n]$ into three classes:

\begin{align*}
\mathsf{BLUE} &=\{i\leq n\,:\,i\text{ is even}\,\},\\
\mathsf{RED} &=\{i\leq n\,:\,i\bmod 4=1\},\quad\mathsf{GREEN}=\{i\leq n\,:\,i\bmod 4=3\}.
\end{align*}

Each element $i\in[1,n]$ has the colour blue, red or green of its corresponding class. Each class forms an arithmetic progression.

A progression is called multi-chromatic if its elements are of three distinct colours. Let us observe that a 3-element progression is good if and only if it is multi-chromatic. Indeed, this is because if $i,j\in\mathsf{RED}$ (or $\mathsf{GREEN}$), then $\mathit{mid}(i,j)$ is odd.

Now instead of good progressions we will deal with multi-chromatic progressions. We treat sets of integers as increasing sequences and for a set $C=\{c_{1},\dots,c_{m}\}$ we denote by $C_{\mathit{odd}}$ and $C_{\mathit{even}}$ the subsets $\{c_{1},c_{3},\dots\}$ and $\{c_{2},c_{4},\dots\}$.

For example $\mathsf{BLUE}_{\mathit{odd}}=\{i\leq n\,:\,i\bmod 4=2\},\ \mathsf{RED}_{\mathit{even}}=\{i\leq n\,:\,i\bmod 8=5\}$.

Our construction works as follows:

  1. Partition the set $[1,n]$ into classes $\mathsf{BLUE},\mathsf{RED},\mathsf{GREEN}$.

  2. For each class $C\in\{\mathsf{BLUE},\mathsf{RED},\mathsf{GREEN}\}$ partition it in $n^{1+o(1)}$ time into a family $\mathcal{F}_{C}$ of $n^{o(1)}$ progression-free sets with the use of Theorem 2.5.

  3. Refine each partition $\mathcal{F}_{C}$, splitting each set $X\in\mathcal{F}_{C}$ into two sets $X\cap C_{\mathit{odd}}$, $X\cap C_{\mathit{even}}$, so that for each set $X$ in the new refined partition $\mathcal{F}_{C}$ we have $X\subseteq C_{\mathit{odd}}$ or $X\subseteq C_{\mathit{even}}$. Each family is still of size $n^{o(1)}$.

  4. Return $\mathcal{F}\,=\,\{\,X\cup Y\cup Z\;:\;X\in\mathcal{F}_{\mathsf{BLUE}},Y\in\mathcal{F}_{\mathsf{RED}},Z\in\mathcal{F}_{\mathsf{GREEN}}\,\}$.

Proof of point (a). Each multi-chromatic progression is contained in some $S\in\mathcal{F}$ since each element of $C$ is contained in a set from $\mathcal{F}_{C}$.

Proof of point (b). The proof is by contradiction. Assume that $S\in\mathcal{F}$ contains a progression which is not multi-chromatic. There are two cases.

Case 1: the progression is monochromatic, hence it appears in a single set $X\in\mathcal{F}_{C}$. However every $X$ is progression-free (step 2), hence such a progression cannot appear in any $S\in\mathcal{F}$; a contradiction.

Case 2: the progression contains exactly two different colours. Observe that if $i\bmod p=\mathit{mid}(i,j)\bmod p=r$, then $j\bmod p=r$ (if the middle element of the progression belongs to the same class as one of the other elements, then the triple is monochromatic), hence the two-coloured arithmetic progression has to consist of $i,j\in C$ and $\mathit{mid}(i,j)\notin C$.

Since $i,j$ both belong to $C_{\mathit{odd}}$ or $C_{\mathit{even}}$ (step 3), $\mathit{mid}(i,j)$ must belong to $C$ (if $i\bmod 2p=j\bmod 2p$, then $i\bmod p=\mathit{mid}(i,j)\bmod p$). Consequently, the progression cannot contain exactly two colours; a contradiction.
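The four steps of the construction can be sketched as follows. The partition of Theorem 2.5 is replaced here by a simple greedy partition, which is an assumption of this sketch: it does produce progression-free sets, but it does not achieve the $n^{o(1)}$ bound on their number or the stated running time. The function names are hypothetical.

```python
def progression_free_partition(elems):
    """Greedy stand-in for Theorem 2.5: split elems into progression-free sets."""
    parts = []
    for v in sorted(elems):
        for part in parts:
            # v is the largest element seen so far, so it can only be the last
            # term of a progression (2b - v, b, v) with b already in part.
            if not any((2 * b - v) in part for b in part):
                part.add(v)
                break
        else:
            parts.append({v})
    return parts


def good_family(n):
    """Family F of Lemma 2.6 for [1, n]; assumes n >= 3 so no class is empty."""
    classes = [
        [i for i in range(1, n + 1) if i % 2 == 0],  # BLUE
        [i for i in range(1, n + 1) if i % 4 == 1],  # RED
        [i for i in range(1, n + 1) if i % 4 == 3],  # GREEN
    ]
    refined = []
    for cls in classes:
        halves = (set(cls[0::2]), set(cls[1::2]))  # C_odd and C_even
        parts = progression_free_partition(cls)
        # Step 3: split every part along C_odd / C_even, dropping empty pieces.
        refined.append([p & h for p in parts for h in halves if p & h])
    blue, red, green = refined
    return [x | y | z for x in blue for y in red for z in green]
```

For $n=3$ the family consists of the single set $\{1,2,3\}$, whose only 3-element progression $1,2,3$ is good.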

Our next tool is a deactivation of the elements whose indices are not in a given set $E$, that is, omitting them in the computation of a solution. For $E\subseteq[1,n]$ the operation $\mathit{restr}(\bar{x},E)$ replaces each element $x_{i}$ on position $i\notin E$ by $5\max\{\mathit{MAX},n^{2}\}+i^{2}$, where $\mathit{MAX}=\max_{k\geq 1}\,|x_{k}|$.

Lemma 2.8.

$\operatorname{\textsc{3DAP}}(\mathit{restr}(\bar{x},E))\iff(\exists\,i,j)\;\mathbf{\Lambda}_{\bar{x}}(i,j)\;\land\ i,j,\mathit{mid}(i,j)\in E$.

Proof 2.9.

The $(\Leftarrow)$ part is obvious, so it suffices to show $(\Rightarrow)$. If at least one of $i,j,\mathit{mid}(i,j)$ is not in $E$, then $\mathbf{\Lambda}_{\bar{y}}(i,j)$ cannot hold for $\bar{y}=\mathit{restr}(\bar{x},E)$: if none of the three positions is in $E$, then $\mathit{mid}(y_{i},y_{j})-y_{\mathit{mid}(i,j)}$ is positive, and otherwise $y_{\mathit{mid}(i,j)}$ and $\mathit{mid}(y_{i},y_{j})$ differ by at least $M:=\max\{\max_{k}\{|x_{k}|\},n^{2}\}$. Indeed, there are seven possible cases:

  1. $i,\mathit{mid}(i,j),j\notin E$, then $\mathit{mid}(y_{i},y_{j})-y_{\mathit{mid}(i,j)}=\frac{i^{2}+j^{2}}{2}-(\frac{i+j}{2})^{2}=(\frac{i-j}{2})^{2}>0$ since $i\neq j$;

  2. $i\in E$, $\mathit{mid}(i,j),j\notin E$, then $\mathit{mid}(y_{i},y_{j})-y_{\mathit{mid}(i,j)}=-\frac{5}{2}M+\frac{j^{2}}{2}-(\frac{i+j}{2})^{2}+\frac{x_{i}}{2}\leq-2M+\frac{x_{i}}{2}\leq-M$;

  3. $j\in E$, $i,\mathit{mid}(i,j)\notin E$ works as the previous case;

  4. $\mathit{mid}(i,j)\in E$, $i,j\notin E$, then $\mathit{mid}(y_{i},y_{j})-y_{\mathit{mid}(i,j)}=5M+\frac{i^{2}+j^{2}}{2}-x_{\mathit{mid}(i,j)}\geq 4M$;

  5. $i,\mathit{mid}(i,j)\in E$, $j\notin E$, then $\mathit{mid}(y_{i},y_{j})-y_{\mathit{mid}(i,j)}=\frac{5}{2}M+\frac{j^{2}}{2}+\frac{x_{i}}{2}-x_{\mathit{mid}(i,j)}\geq M$;

  6. $\mathit{mid}(i,j),j\in E$, $i\notin E$ works as the previous case;

  7. $i,j\in E$, $\mathit{mid}(i,j)\notin E$, then $\mathit{mid}(y_{i},y_{j})-y_{\mathit{mid}(i,j)}=\frac{x_{i}+x_{j}}{2}-5M-(\frac{i+j}{2})^{2}\leq-3M$.

Hence, apart from the first case, where none of the indices belongs to $E$, the absolute value of the difference is at least $M$; in the first case the difference is positive. In every case $\mathbf{\Lambda}_{\bar{y}}(i,j)$ does not hold.
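The deactivation operation and the statement of Lemma 2.8 can be illustrated with the sketch below. The names `restr` and `has_3dap` and the optional `allowed` parameter are assumptions of this sketch; the input sequence is assumed to be non-empty.

```python
def restr(x, E):
    """Replace x_i at every 1-based position i not in E by 5*max{MAX, n^2} + i^2."""
    n = len(x)
    big = 5 * max(max(abs(v) for v in x), n * n)
    return [v if i in E else big + i * i for i, v in enumerate(x, start=1)]


def has_3dap(x, allowed=None):
    """Brute-force 3DAP; if allowed is given, i, j and mid(i, j) must lie in it."""
    n = len(x)
    for i in range(1, n + 1):
        for j in range(i + 2, n + 1, 2):
            m = (i + j) // 2
            if allowed is not None and not {i, m, j} <= allowed:
                continue
            if 2 * x[m - 1] == x[i - 1] + x[j - 1]:
                return True
    return False
```

With these helpers, Lemma 2.8 says that `has_3dap(restr(x, E))` and `has_3dap(x, E)` agree for every set `E` of positions.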

An instance $\bar{x}$ is called an odd-half instance if $\mathbf{\Lambda}(i,j)$ is false for $i,j$ such that $(j-i)/2$ is even (equivalently, for $i,j$ such that $i$ and $\mathit{mid}(i,j)$ have the same parity). The efficient equivalence

$$\operatorname{\textsc{Odd\mbox{-}3DAP}}(\bar{x})\ \Leftrightarrow\ (\exists\,S\in\mathcal{F}\,)\;\operatorname{\textsc{3DAP}}(\mathit{restr}(\bar{x},S))$$

now follows from Lemmas 2.6 and 2.8.

This produces only odd-half instances because only good progressions are left in the construction of Lemma 2.6. The instances have elements in $[-\mathcal{O}(N^{2}),\mathcal{O}(N^{2})]$. We can increase all the elements by $\mathcal{O}(N^{2})$ so that they become non-negative. This implies:

Lemma 2.10.

An instance of \textsc{Odd-3DAP} can be reduced in $n^{1+o(1)}$ time to $n^{o(1)}$ odd-half instances of \textsc{3DAP} of total size $n^{1+o(1)}$ and with elements up to $K=\mathcal{O}(N^{2})$.

Finally, we show that the resulting instances can be glued together to a single equivalent one.

Theorem 2.11.

An instance of \textsc{Conv3SUM} can be reduced in $N^{1+o(1)}$ time to an odd-half instance of \textsc{3DAP} of size $n=N^{1+o(1)}$ with elements up to $K=N^{3+o(1)}$.

Proof 2.12.

With Lemmas 2.1, 2.3 and 2.10 we obtain a reduction from Conv3SUM\operatorname{\textsc{Conv3SUM}} to N1+o(1)N^{1+o(1)} odd-half instances of 3DAP\operatorname{\textsc{3DAP}} of total size N1+o(1)N^{1+o(1)}. The instances have elements in [0,𝒪(N2)][0,\mathcal{O}(N^{2})]. We will show that these instances can be reduced to a single odd-half instance of 3DAP\operatorname{\textsc{3DAP}} of size N1+o(1)N^{1+o(1)} with elements in the range [0,N3+o(1)][0,N^{3+o(1)}] in time N1+o(1)N^{1+o(1)}. The resulting instance will return true if and only if at least one of the input instances does.

Let t=N1+o(1)t=N^{1+o(1)} be the number of the instances of 3DAP\operatorname{\textsc{3DAP}}, numbered 11 through tt. We use Theorem 2.5 111Actually, a deterministic version of Behrend’s construction from [14] or an earlier construction of Salem and Spencer [38] would suffice here. and pick the largest constructed progression-free set A[1,m]A\subseteq[1,m], for some mm. By the pigeonhole principle, |A|m1o(1)|A|\geq m^{1-o(1)}. We select mm that is large enough so that m1o(1)tm^{1-o(1)}\geq t, so m=t1+o(1)=N1+o(1)m=t^{1+o(1)}=N^{1+o(1)}, and trim the set AA to the size tt. Let A={a1,,at}A=\{a_{1},\dots,a_{t}\}. For instance ii we multiply all its elements by 2m2m and add to each element the value aia_{i}. Finally we concatenate all the instances.

If any of the input instances returns true, then so does the output instance, since multiplying all elements by the same number and adding the same number to all of them cannot affect the outcome of a single instance. If none of the input instances returns true, then the only possibility for the output instance to return true is to contain a 3-element arithmetic progression with elements from multiple parts corresponding to the input instances. However, this is impossible since, taken modulo $2m$, the progression would form an arithmetic progression in the set $A$. Indeed, all elements of $A$ lie in $[1,m]$, so a congruence $a_{i}+a_{j}\equiv 2a_{l}\pmod{2m}$ between elements of $A$ is in fact an equality.
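The gluing step can be illustrated with a small Python sketch. It is not the construction used in the proof: instead of Behrend's set from Theorem 2.5 it uses the classical (much sparser) progression-free set of integers whose ternary expansion avoids the digit 2, and the names `progression_free_set`, `has_3dap` and `glue` are hypothetical. The brute-force test `has_3dap` assumes that a solution is a pair $i<j$ with $j-i$ even and $x_i+x_j=2x_{(i+j)/2}$.

```python
from itertools import count


def progression_free_set(size):
    """First `size` numbers of the form 1 + v, where v has only ternary digits 0/1.

    Assumption of this sketch: this classical set stands in for Behrend's set.
    """
    out = []
    if size == 0:
        return out
    for v in count(0):
        w, ok = v, True
        while w:
            if w % 3 == 2:
                ok = False
                break
            w //= 3
        if ok:
            out.append(v + 1)
            if len(out) == size:
                return out


def has_3dap(xs):
    """Brute force: x_i + x_j == 2 * x_{(i+j)/2} for some i < j with j - i even."""
    n = len(xs)
    return any(xs[i] + xs[j] == 2 * xs[(i + j) // 2]
               for i in range(n) for j in range(i + 2, n, 2))


def glue(instances):
    """Map each element x of instance number i to 2*m*x + a_i and concatenate."""
    if not instances:
        return []
    a = progression_free_set(len(instances))
    m = max(a)  # all a_i lie in [1, m]
    glued = []
    for a_i, inst in zip(a, instances):
        glued.extend(2 * m * x + a_i for x in inst)
    return glued
```

On small inputs, `has_3dap(glue(instances))` agrees with `any(has_3dap(inst) for inst in instances)`, which is the equivalence argued in the proof.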

Corollary 2.13.

The general 3DAP\operatorname{\textsc{3DAP}} problem is also 3SUM\operatorname{\textsc{3SUM}}-hard.

Remark 2.14.

As in \textsc{Conv3SUM}, techniques from [5] can be used to hash down the range in \textsc{3DAP} to integers of magnitude $\mathcal{O}(N^{2})$ (cf. [3]), using randomization.

Remark 2.15.

The AVERAGE problem (introduced by J. Erickson [17]) asks if there are distinct elements $a,b,c\in S$ such that $a+b=2c$ for a given set $S$ of $n$ integers. It was recently shown to be \textsc{3SUM}-hard [14]. The \textsc{3DAP} problem can be viewed as a convolution version of the AVERAGE problem (see https://cs.stackexchange.com/questions/10681/is-detecting-doubly-arithmetic-progressions-3sum-hard/10725#10725). The ideas based on almost linear hashing used in the reductions from \textsc{3SUM} to \textsc{Conv3SUM} [35, 10] can be extended with some effort to reduce AVERAGE to \textsc{3DAP}. We presented a different reduction that additionally directly leads to an instance of \textsc{3DAP} with an odd-half property, which is essential in our proof of \textsc{3SUM}-hardness of computing Ab-squares (see the proof of Lemma 4.2).

2.3 Hardness of detecting additive squares

If the alphabet is a set of integers, then a string WW is called an additive square if W=UVW=UV, where |U|=|V||U|=|V| and i=1|U|U[i]=i=1|V|V[i]\sum_{i=1}^{|U|}\,U[i]=\sum_{i=1}^{|V|}\,V[i].

Theorem 2.16.

Finding an additive square in a length-NN sequence composed of integers of magnitude N𝒪(1)N^{\mathcal{O}(1)} is 3SUM\operatorname{\textsc{3SUM}}-hard.

Proof 2.17.

We use Theorem 2.11 to reduce Conv3SUM\operatorname{\textsc{Conv3SUM}} to an instance of 3DAP\operatorname{\textsc{3DAP}} of size n=N1+o(1)n=N^{1+o(1)} with elements in the requested range. 3DAP\operatorname{\textsc{3DAP}} returns true on an instance x1,,xnx_{1},\dots,x_{n} if and only if the sequence x2x1,x3x2,,xnxn1x_{2}-x_{1},x_{3}-x_{2},\dots,x_{n}-x_{n-1} contains an additive square. As the reduction works in N1+o(1)N^{1+o(1)} total time, the conclusion follows.
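The equivalence used in this proof can be checked by brute force on small inputs. The following Python sketch is only an illustration with hypothetical function names; it assumes that a \textsc{3DAP} solution is a pair $i<j$ with $j-i$ even and $x_i+x_j=2x_{(i+j)/2}$.

```python
def has_3dap(xs):
    """Brute force: x_i + x_j == 2 * x_{(i+j)/2} for some i < j with j - i even."""
    n = len(xs)
    return any(xs[i] + xs[j] == 2 * xs[(i + j) // 2]
               for i in range(n) for j in range(i + 2, n, 2))


def differences(xs):
    """The sequence x_2 - x_1, x_3 - x_2, ..., x_n - x_{n-1}."""
    return [b - a for a, b in zip(xs, xs[1:])]


def has_additive_square(ds):
    """Brute force: two adjacent equal-length blocks of ds with equal sums."""
    pref = [0]
    for d in ds:
        pref.append(pref[-1] + d)
    n = len(ds)
    return any(pref[i + h] - pref[i] == pref[i + 2 * h] - pref[i + h]
               for h in range(1, n // 2 + 1)
               for i in range(n - 2 * h + 1))
```

The sum of the differences between positions $i$ and $\mathit{mid}(i,j)$ telescopes to $x_{\mathit{mid}(i,j)}-x_i$, so the two predicates agree on every input.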

3 From arithmetics to Abelian stringology

We use capital letters to denote strings and lower case Greek letters to denote sets of integers. We assume that the positions in a string SS are numbered 1 through |S||S|, where |S||S| denotes the length of SS. By S[i]S[i] and S[i..j]S[i..j] we denote the iith letter of SS and the string S[i]S[j]S[i]\cdots S[j] called a factor of SS. The reverse of string SS, i.e. the string S[|S|]S[1]S[|S|]\cdots S[1], is denoted as SRS^{R}. By ε\varepsilon we denote the empty string. By 𝐴𝑙𝑝ℎ(S)\mathit{Alph}(S) we denote the set of distinct letters in SS.

We denote Ab-equivalence of UU and VV by UVU\cong V. For a string UU, by 𝑃𝑎𝑟𝑖𝑘ℎ(U)\mathit{Parikh}(U) we denote the Parikh vector of UU. Then UVU\cong V if and only if 𝐴𝑙𝑝ℎ(U)=𝐴𝑙𝑝ℎ(V)\mathit{Alph}(U)=\mathit{Alph}(V) and 𝑃𝑎𝑟𝑖𝑘ℎ(U)=𝑃𝑎𝑟𝑖𝑘ℎ(V)\mathit{Parikh}(U)=\mathit{Parikh}(V).

We use an encoding of Amir et al. [3] based on the Chinese remainder theorem to connect Conv3SUM\operatorname{\textsc{Conv3SUM}}-type problems with Abelian stringology.

Let p1<p2<<pkp_{1}<p_{2}<\dots<p_{k} be prime numbers. The Chinese remainder theorem states that if one knows the remainders r1,r2,,rkr_{1},r_{2},\dots,r_{k} of an integer xx, such that 0x<pi0\leq x<\prod\,p_{i}, when dividing by pip_{i}’s, then one can uniquely determine xx. Assuming that the remainders of an integer xx are r1,r2,,rkr_{1},r_{2},\dots,r_{k}, we could encode xx as a possibly short string 𝚊1r1𝚊2r2𝚊krk\mathtt{a}_{1}^{r_{1}}\mathtt{a}_{2}^{r_{2}}\cdots\mathtt{a}_{k}^{r_{k}} over an alphabet {𝚊1,𝚊2,,𝚊k}\{\mathtt{a}_{1},\mathtt{a}_{2},\ldots,\mathtt{a}_{k}\} (the symbols correspond to consecutive prime numbers).

For example, for primes 2, 3, 5 the encoding of 11 would be $\mathtt{a}_{1}^{1}\mathtt{a}_{2}^{2}\mathtt{a}_{3}^{1}$ since its remainders modulo 2, 3, 5 are 1, 2, 1, respectively. However, we need to encode differences of two numbers, which is more complicated.

Let x¯=[x1,,xn]\bar{x}=[x_{1},\dots,x_{n}] be an instance of 3DAP\operatorname{\textsc{3DAP}} and r1(i),r2(i),,rk(i)r_{1}^{(i)},r_{2}^{(i)},\dots,r_{k}^{(i)} be remainders of xix_{i} modulo p1,p2,,pkp_{1},p_{2},\ldots,p_{k}. Like Amir et al. [3], we define for 1i<n1\leq i<n and 1jk1\leq j\leq k,

\[
\mathit{EXP}_{i}(j)=r_{j}^{(i+1)}-r_{j}^{(i)}+d\ \text{ where }\ d=\max_{j=1}^{k}p_{j},\qquad \SS_{i}\;=\;\mathtt{a}_{1}^{\mathit{EXP}_{i}(1)}\,\mathtt{a}_{2}^{\mathit{EXP}_{i}(2)}\cdots\mathtt{a}_{k}^{\mathit{EXP}_{i}(k)}.
\]

We choose a sequence $p_{1},\dots,p_{k}$ of $k$ distinct primes such that $p_{1}\cdots p_{k}>\max\{x_{i}\}$. In this way we encode the difference $x_{j}-x_{i}$, for $j>i$, by a string $\SS_{i}\,\SS_{i+1}\cdots\SS_{j-1}$. An obstacle is that the equality $(a\bmod p)-(b\bmod p)=(a-b)\bmod p$ may fail. For example,

\[
(4\bmod 3)-(2\bmod 3)\,=\,((4-2)\bmod 3)\,-\,3.
\]

However, a small correction is sufficient, due to the following observation.

Observation. $(a\bmod p)-(b\bmod p)+q\,=\,(a-b)\bmod p$, where $q\in\{0,p\}$.
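The encoding and the correction term can be made concrete with a short Python sketch. The function names are hypothetical, and the capital letters `A`, `B`, `C`, ... are an assumption of the sketch standing in for the letters $\mathtt{a}_1,\mathtt{a}_2,\dots$.

```python
import string


def remainders(x, primes):
    """Remainders r_1, ..., r_k of x modulo p_1, ..., p_k."""
    return [x % p for p in primes]


def exp_vectors(xs, primes):
    """EXP_i(j) = r_j^(i+1) - r_j^(i) + d with d = max p_j, for consecutive pairs of xs."""
    d = max(primes)
    return [[(xs[i + 1] % p) - (xs[i] % p) + d for p in primes]
            for i in range(len(xs) - 1)]


def block_string(exp, letters=string.ascii_uppercase):
    """The string a_1^EXP(1) a_2^EXP(2) ... a_k^EXP(k), with stand-in letters."""
    return "".join(letters[j] * e for j, e in enumerate(exp))


def correction(a, b, p):
    """The value q with (a mod p) - (b mod p) + q == (a - b) mod p."""
    return (a - b) % p - (a % p - b % p)
```

For instance, `remainders(11, [2, 3, 5])` gives `[1, 2, 1]`, matching the example above, and `correction(4, 2, 3)` gives `3`. Adding $d$ keeps every exponent positive, since each difference of remainders is greater than $-d$.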

If we apply the encoding to an instance x¯=x1,,xn\bar{x}=x_{1},\dots,x_{n} of 3DAP\operatorname{\textsc{3DAP}}, we obtain a lemma that is analogous to [3, Lemma 1].

Lemma 3.1.

$\mathbf{\Lambda}(i,j)$ holds for $i<j$, $j-i$ even, iff for each $t\in[1,k]$, there are $e_{t},f_{t}\in\{0,p_{t}\}$, such that

\[
e_{t}+\mathit{EXP}_{i}(t)+\mathit{EXP}_{i+1}(t)+\dots+\mathit{EXP}_{\mathit{mid}(i,j)-1}(t)\;=\;\mathit{EXP}_{\mathit{mid}(i,j)}(t)+\mathit{EXP}_{\mathit{mid}(i,j)+1}(t)+\dots+\mathit{EXP}_{j-1}(t)+f_{t}.
\]

Let Ψ\Psi be a morphism such that Ψ(i)=𝚊ipi\Psi(i)=\mathtt{a}_{i}^{p_{i}} for each i=1,,ki=1,\dots,k. We treat a set U={u1,,uw}U=\{u_{1},\dots,u_{w}\} as a string u1uwu_{1}\cdots u_{w}, where u1<u2<<uwu_{1}<u_{2}<\ldots<u_{w}. If we interpret the vector (𝐸𝑋𝑃i(1),𝐸𝑋𝑃i(2),,𝐸𝑋𝑃i(k))(\mathit{EXP}_{i}(1),\mathit{EXP}_{i}(2),\dots,\mathit{EXP}_{i}(k)) as SSi\SS_{i}, then Lemma 3.1 directly implies the following fact.

Lemma 1.

Assume i<ji<j and jij-i is even. Then

\[
\mathbf{\Lambda}(i,j)\ \iff\ (\,\Psi(\alpha)\,\SS_{i}\SS_{i+1}\cdots\SS_{\mathit{mid}(i,j)-1}\cong\SS_{\mathit{mid}(i,j)}\cdots\SS_{j-1}\;\Psi(\beta)\,)
\]

for some disjoint subsets α,β\alpha,\beta of [1,k][1,k].
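The two lemmas above can be tested numerically. The Python sketch below (hypothetical names, 0-based indices) computes, for a pair $(i,j)$, the sets $\alpha$ and $\beta$ of prime indices that would have to be added on the left and on the right, or reports that no such sets exist. To stay on the safe side, the sketch assumes that the product of the primes exceeds $2\max\{x_i\}$, which is slightly more than what is required above.

```python
def exp_sum(xs, lo, hi, primes):
    """Coordinate-wise sum of EXP_l for l in [lo, hi), with 0-based indices."""
    d = max(primes)
    return [sum(xs[l + 1] % p - xs[l] % p + d for l in range(lo, hi))
            for p in primes]


def equalizers(xs, i, j, primes):
    """Disjoint index sets (alpha, beta) balancing the two halves, or None.

    alpha collects the indices t with e_t = p_t, beta those with f_t = p_t.
    """
    mid = (i + j) // 2
    left = exp_sum(xs, i, mid, primes)
    right = exp_sum(xs, mid, j, primes)
    alpha, beta = set(), set()
    for t, p in enumerate(primes):
        diff = right[t] - left[t]  # equals e_t - f_t
        if diff == p:
            alpha.add(t)
        elif diff == -p:
            beta.add(t)
        elif diff != 0:
            return None
    return alpha, beta
```

For `xs = [3, 10, 17, 5, 24]` and primes `[2, 3, 5, 7]`, the call `equalizers(xs, 0, 2, primes)` returns `({0, 2}, set())`, reflecting $3+17=2\cdot 10$, while the other pairs at even distance yield `None`.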

4 Hardness of computing all centers of Ab-squares

We construct a text TT over the alphabet {𝚊1,,𝚊k,𝚋,,,#,$}\{\mathtt{a}_{1},\dots,\mathtt{a}_{k},\mathtt{b},\bullet,\star,\#,\$\} such that 3DAP\operatorname{\textsc{3DAP}} has a solution if and only if TT contains an Ab-square with one of specified centers, so-called well-placed Ab-square.

First we extend each $\SS_{i}$ to have the same length $M\geq\max_{i=1}^{n-1}|\SS_{i}|$, to be defined later. Intuitively, it is needed to control the number of $\SS_{i}$'s in the strings from Lemma 1. We append $M-|\SS_{i}|$ occurrences of a letter $\mathtt{b}$ to each $\SS_{i}$. Let $\SS^{I}_{i}$ denote this modified string.

Lemma 1 immediately implies the following fact.

Lemma 4.1.

Assume i<ji<j and jij-i is even. Then

\[
\mathbf{\Lambda}(i,j)\ \iff\ (\,\mathtt{b}^{e}\Psi(\alpha)\,\SS_{i}^{I}\SS_{i+1}^{I}\cdots\SS_{\mathit{mid}(i,j)-1}^{I}\,\cong\SS_{\mathit{mid}(i,j)}^{I}\cdots\SS_{j-1}^{I}\Psi(\beta)\mathtt{b}^{f}\,),
\]

for some disjoint subsets α,β\alpha,\beta of [1,k][1,k], where e+|Ψ(α)|=f+|Ψ(β)|e+|\Psi(\alpha)|=f+|\Psi(\beta)| with min(e,f)=0\min(e,f)=0.

The parts 𝚋eΨ(α)\mathtt{b}^{e}\Psi(\alpha), Ψ(β)𝚋f\Psi(\beta)\mathtt{b}^{f} in the above lemma can be treated as equalizers. Let us note that in the above lemma we can assume that max(e,f)max(|Ψ(α)|,|Ψ(β)|)kd\max(e,f)\leq\max(|\Psi(\alpha)|,|\Psi(\beta)|)\leq kd.

A pair of disjoint sets α,β\alpha,\beta that satisfies αβ=[1,k]\alpha\cup\beta=[1,k] will be called a 2-partition of [1,k][1,k]. For a 2-partition (α,β)(\alpha,\beta) of [1,k][1,k], we use the string

\[
\Gamma(\alpha,\beta)=\#\,\alpha\,\$\,\mathtt{b}^{kd}\,\#\,\beta\,\$,
\]

called a Γ\Gamma-string. If k=4k=4, d=7d=7, an example of a Γ\Gamma-string is Γ(2, 1 3 4)=# 2$𝚋28# 1 3 4$\Gamma(2,\,1\,3\,4)\,=\,\#\,2\;\$\;\mathtt{b}^{28}\;\#\;1\,3\,4\;\$.

Let (π1,π1),(π2,π2),,(πm,πm)(\pi_{1},\pi^{\prime}_{1}),(\pi_{2},\pi^{\prime}_{2}),\dots,(\pi_{m},\pi^{\prime}_{m}) be the sequence of all m=4km=4^{k} pairs of Γ\Gamma-strings. Define

\[
U=\pi_{m}\pi_{m-1}\ldots\pi_{1},\ \ V=\pi^{\prime}_{1}\,\pi^{\prime}_{2}\,\ldots\pi^{\prime}_{m}.
\]

We have {π1,,πm}={π1,,πm}\{\pi_{1},\dots,\pi_{m}\}=\{\pi^{\prime}_{1},\dots,\pi^{\prime}_{m}\}, so UVU\cong V.

Observation.

For disjoint subsets $\alpha,\beta\subseteq[1,k]$ and integers $0\leq e,f\leq kd$, there are decompositions $U\;=\;U_{1}\,\mathtt{b}^{e}\,\#\,\beta\,\$\,U_{2}$ and $V\;=\;V_{1}\,\#\,\alpha\,\$\,\mathtt{b}^{f}\,V_{2}$, where $U_{2}\cong V_{1}$.

Let us recall the morphism $\Psi$ such that $\Psi(i)=\mathtt{a}_{i}^{p_{i}}$ for each $i\in[1,k]$. We define additionally $\Psi(c)=c$ for $c\in\{\mathtt{b},\#,\$\}$ and set

\[
\mathbf{B}\,=\,\Psi(U),\ \mathbf{A}\,=\,\Psi(V),\ \ M=|\mathbf{A}|=|\mathbf{B}|.
\]

Let us observe that indeed maxi=1n1|SSi|M\max_{i=1}^{n-1}|\SS_{i}|\leq M holds since |SSi|kd+j=1kpj|\SS_{i}|\leq kd+\sum_{j=1}^{k}p_{j} and the length of Ψ(W)\Psi(W) for any Γ\Gamma-string WW is kd+j=1kpj+4kd+\sum_{j=1}^{k}p_{j}+4.
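The gadgets $\mathbf{A}$ and $\mathbf{B}$ can be generated explicitly for very small $k$. The Python sketch below is an illustration under assumptions: the names are hypothetical, strings are represented as lists of letters, the pair `('a', i)` stands for $\mathtt{a}_i$, and the $4^k$ pairs of $\Gamma$-strings are enumerated in an arbitrary fixed order.

```python
from collections import Counter
from itertools import product


def two_partitions(k):
    """All ordered pairs (alpha, beta) of disjoint sets with union {1, ..., k}."""
    for mask in range(2 ** k):
        alpha = tuple(i for i in range(1, k + 1) if mask >> (i - 1) & 1)
        beta = tuple(i for i in range(1, k + 1) if not mask >> (i - 1) & 1)
        yield alpha, beta


def psi_gamma(alpha, beta, primes):
    """Psi-image of Gamma(alpha, beta) = # alpha $ b^{kd} # beta $ as a list of letters."""
    k, d = len(primes), max(primes)

    def psi(indices):
        return [("a", i) for i in indices for _ in range(primes[i - 1])]

    return ["#"] + psi(alpha) + ["$"] + ["b"] * (k * d) + ["#"] + psi(beta) + ["$"]


def gadgets(primes):
    """Return (A, B) with B = Psi(pi_m ... pi_1) and A = Psi(pi'_1 ... pi'_m)."""
    gammas = [psi_gamma(a, b, primes) for a, b in two_partitions(len(primes))]
    pairs = list(product(gammas, repeat=2))  # all 4^k pairs (pi_l, pi'_l)
    big_b = [c for pi, _ in reversed(pairs) for c in pi]
    big_a = [c for _, pi_prime in pairs for c in pi_prime]
    return big_a, big_b
```

For primes `[2, 3]` we have $k=2$, $d=3$, every $\Psi$-image of a $\Gamma$-string has length $kd+\sum_j p_j+4=15$, and both gadgets have length $16\cdot 15=240$ with equal Parikh vectors.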

Figure 2: Internal structure of an Ab-square, shown in a thick box (proportions are symbolic), in $\mathbf{B}\SS_{1}^{I}\mathbf{A}\,\mathbf{B}\SS_{2}^{I}\mathbf{A}$. Here $x_{\mathit{mid}(1,3)}=\mathit{mid}(x_{1},x_{3})$ and $\mathtt{b}^{e}\Psi(\alpha)$, $\Psi(\beta)\mathtt{b}^{f}$ are equalizers.

We add two new letters $\bullet,\star$ and define the following string (the symbols "$\stackrel{center}{\downarrow}$" are not parts of the string, but only show supposed centers of Ab-squares).

\[
T\;=\;\bullet\,\mathbf{B}\,\star\,\SS^{I}_{1}\,\mathbf{A}\,\bullet\stackrel{center}{\downarrow}\star\,\mathbf{B}\,\bullet\,\SS^{I}_{2}\,\mathbf{A}\,\star\stackrel{center}{\downarrow}\bullet\,\mathbf{B}\,\star\,\SS^{I}_{3}\,\mathbf{A}\,\bullet\stackrel{center}{\downarrow}\star\,\mathbf{B}\,\bullet\,\SS^{I}_{4}\,\mathbf{A}\,\star\,\cdots \tag{1}
\]

An Ab-square is called well-placed if its center is between the letters $\bullet,\star$ in any order. Recall that, due to Theorem 2.11, we can assume that the input to \textsc{3DAP} is an odd-half instance.

Lemma 4.2.

Assume x¯\bar{x} is an odd-half instance. Then 3DAP(x¯)\operatorname{\textsc{3DAP}}(\bar{x}) has a solution if and only if TT contains a well-placed Ab-square.

Proof 4.3.

Let x¯=[x1,,xn]\bar{x}=[x_{1},\dots,x_{n}] be an odd-half instance of 3DAP\operatorname{\textsc{3DAP}}. We show two implications.

()(\mathbf{\Rightarrow})

Assume that $\mathbf{\Lambda}(i,j)$ holds for $\bar{x}$. Lemma 4.1 implies that for strings $W,Z$ such that $W\cong Z$ we have

\[
\begin{aligned}
&\mathtt{b}^{e}\#\Psi(\alpha)\$W\,\star\SS_{i}^{I}\mathbf{A}\bullet\;\;\star\mathbf{B}\bullet\SS_{i+1}^{I}\mathbf{A}\star\;\;\cdots\;\;\bullet\mathbf{B}\star\SS_{\mathit{mid}(i,j)-1}^{I}\mathbf{A}\bullet\;\;\cong\\
&\qquad\star\mathbf{B}\bullet\SS_{\mathit{mid}(i,j)}^{I}\mathbf{A}\star\;\;\cdots\;\;\bullet\mathbf{B}\star\SS_{j-2}^{I}\mathbf{A}\bullet\;\;\star\mathbf{B}\bullet\SS_{j-1}^{I}\,Z\#\Psi(\beta)\$\mathtt{b}^{f}
\end{aligned}
\tag{2}
\]

for some disjoint subsets $\alpha,\beta$ of $[1,k]$, where $e+|\Psi(\alpha)|=f+|\Psi(\beta)|$ with $\min(e,f)=0$. Indeed, we use the fact that $\mathbf{A}\cong\mathbf{B}$ and that the counts of letters $\bullet$ and $\star$ on both sides are equal (because $(j-i)/2$ is odd). By the observation on the decompositions of $U$ and $V$, we obtain a well-placed Ab-square in $T$ (or we obtain it after exchanging all letters $\bullet$ with $\star$).

()(\mathbf{\Leftarrow}) Assume that TT has a well-placed Ab-square factor with center immediately after 𝐁SStI𝐀\bullet\mathbf{B}\star\SS_{t}^{I}\mathbf{A}\bullet (the case that it is immediately after 𝐁SStI𝐀\star\mathbf{B}\bullet\SS_{t}^{I}\mathbf{A}\star is symmetric). Let us investigate what can be the position ss of the first letter of this Ab-square.

Recall that |SSiI|=|𝐀|=|𝐁|=M|\SS_{i}^{I}|=|\mathbf{A}|=|\mathbf{B}|=M for each i[1,n1]i\in[1,n-1], so TT can be seen as composed of blocks of length M=M+1M^{\prime}=M+1. We will check which of these blocks can contain ss, by checking the counts of each of the letters ,\bullet,\star in both halves of the Ab-square. The positions of letters ,\bullet,\star in TT repeat with period 6(M+1)6(M+1), so it is sufficient to inspect the first 6 blocks on each side, as the remaining ones will behave periodically; see Figures 3 and 4.

[Figure: the six blocks preceding the designated center, $\star\mathbf{B}$, $\bullet\SS_{t-1}^{I}$, $\mathbf{A}\star$, $\bullet\mathbf{B}$, $\star\SS_{t}^{I}$, $\mathbf{A}\bullet$, labelled "first", "none", "any", "not first", "first", "none", respectively.]
Figure 3: Which position in a block of $T$ can be the starting position of a well-placed Ab-square with the designated center, just counting letters $\bullet,\star$.

By counting letters ,\bullet,\star in both halves of the Ab-square, it can be readily verified that ss cannot be in any block 𝐀\mathbf{A}\bullet or SSiI\bullet\SS_{i}^{I}; if in any block 𝐁\star\mathbf{B} or SSiI\star\SS_{i}^{I}, it can only be the first position of the block; it cannot be the first position in a block 𝐁\bullet\mathbf{B}; and it can be in any position in a block 𝐀\mathbf{A}\star.

Moreover, ss cannot be the first position in a block 𝐁\star\mathbf{B}, since this would imply, by Lemma 4.1, that 𝚲(i,j)\operatorname{\mathbf{\mathbf{\mathbf{\Lambda}}}}(i,j) holds for ii such that the block SSiI\bullet\SS_{i}^{I} immediately follows the 𝐁\star\mathbf{B} block and j=2tij=2t-i. However, in this case (ji)/2(j-i)/2 is even, which is impossible.

If ss is the first position of a block SSiI\star\SS_{i}^{I}, then this implies, again by Lemma 4.1, that 𝚲(i,j)\operatorname{\mathbf{\mathbf{\mathbf{\Lambda}}}}(i,j) holds for j=2tij=2t-i. In this case (ji)/2(j-i)/2 is odd, so this is a valid solution to the corresponding 3DAP\operatorname{\textsc{3DAP}} instance.

We are left with the case that ss belongs to a block 𝐀\mathbf{A}\star or 𝐁\bullet\mathbf{B} (and in case of 𝐁\bullet\mathbf{B} does not coincide with the position of the letter \bullet). Henceforth it suffices to count letters different from ,\bullet,\star in the halves. Each of the gadgets 𝐀,𝐁\mathbf{A},\mathbf{B} is a concatenation of mm Ab-equivalent strings of the form #Ψ(α)$𝚋kd#Ψ(β)$\#\,\Psi(\alpha)\,\$\,\mathtt{b}^{kd}\,\#\,\Psi(\beta)\,\$, where Ψ(α),Ψ(β)\Psi(\alpha),\Psi(\beta) are composed of letters 𝚊i\mathtt{a}_{i} only. By counting the letters # and $ in both halves of the Ab-square, we see that ss can only be a position which holds the letter 𝚋\mathtt{b} or #.

Hence, the Ab-square is necessarily of the form (2), which, by Lemma 4.1, implies that 𝚲(i,j)\operatorname{\mathbf{\mathbf{\mathbf{\Lambda}}}}(i,j) holds, where SSiI\star\SS_{i}^{I} is the first such block after the position ss and j=2tij=2t-i.

Figure 4: The global structure of a fragment containing a well-placed Ab-square; there are three types of blocks: $c\mathbf{B},\,c\SS^{I}_{i},\,\mathbf{A}c$, where $c$ is one of $\bullet,\star$. The blocks of the second type (which can be considered as essential blocks) are in color, each block is of length $M+1$ (recall that $|\mathbf{A}|=|\mathbf{B}|=|\SS^{I}_{i}|=M$). The special letters $\bullet,\star$ force each half of a well-placed Ab-square to contain an odd number of full $\SS^{I}_{i}$'s and not to contain any $\SS^{I}_{i}$ only partially.

Theorem 4.4.

Computing all positions that are centers of Ab-square factors in a length-nn string over an alphabet of size ω(1)\omega(1) is 3SUM\operatorname{\textsc{3SUM}}-hard.

Proof 4.5.

Due to Theorem 2.11 we can reduce Conv3SUM\operatorname{\textsc{Conv3SUM}} in N1+o(1)N^{1+o(1)} time to an odd-half instance x¯\bar{x} of 3DAP\operatorname{\textsc{3DAP}} of size n=N1+o(1)n=N^{1+o(1)} with elements in the range [0,N3+o(1)][0,N^{3+o(1)}].

We construct the string $T$ as shown in Eq. (1) for the sequence $\bar{x}$. Then Lemma 4.2 implies that $\bar{x}$ is a YES-instance of \textsc{3DAP} if and only if $T$ has a well-placed Ab-square. The string $T$ has length $\mathcal{O}(N^{1+o(1)}M)$. Each of the strings $\mathbf{A},\mathbf{B}$ has length $M$ and is composed of $m=4^{k}$ strings of length $\mathcal{O}(kd)$, i.e., $\Psi$-images of $\Gamma$-strings.

Hence, $M=\mathcal{O}(4^{k}kd)$. We select $k$ such that $k=\omega(1)$ and simultaneously $k=\mathcal{O}(\log N/\log\log N)$. Then we have $4^{k}k=N^{o(1)}$ and the $k$ primes are of magnitude $d=\mathcal{O}(N^{(3+o(1))/k})=N^{o(1)}$ (we can choose $k$ consecutive primes computed using the sieve of Eratosthenes).

Overall |T|=N1+o(1)|T|=N^{1+o(1)} and |𝐴𝑙𝑝ℎ(T)|k+5=ω(1)|\mathit{Alph}(T)|\leq k+5=\omega(1). (One can obtain any alphabet up to 𝒪(N)\mathcal{O}(N) by appending distinct letters to TT.)
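The choice of primes can be sketched as follows. This Python fragment is a hedged illustration, not the paper's procedure: the function names are hypothetical, and the rule "take the first window of $k$ consecutive primes whose product exceeds the bound $K$" is an assumption of the sketch.

```python
from math import prod


def sieve(limit):
    """Sieve of Eratosthenes: all primes not exceeding `limit`."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            for q in range(p * p, limit + 1, p):
                is_prime[q] = False
    return [p for p, flag in enumerate(is_prime) if flag]


def consecutive_primes(k, bound):
    """First window of k consecutive primes whose product exceeds `bound`."""
    limit = 16
    while True:
        primes = sieve(limit)
        for start in range(len(primes) - k + 1):
            window = primes[start:start + k]
            if prod(window) > bound:
                return window
        limit *= 2  # not enough primes yet, enlarge the sieve
```

For example, `consecutive_primes(3, 10**6)` returns `[97, 101, 103]`, whose product is $1\,009\,091$; the largest prime plays the role of $d$.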

With the same argument for a constant-sized alphabet we obtain the following result.

Theorem 4.6.

All positions that are centers of Ab-square factors in a length-nn string over an alphabet of size 5+k5+k, for a constant kk, cannot be computed in 𝒪(n263+kε)\mathcal{O}(n^{2-\tfrac{6}{3+k}-\varepsilon}) time, for a constant ε>0\varepsilon>0, unless the 3SUM\operatorname{\textsc{3SUM}} conjecture fails.

5 Computing centers of Ab-squares for constant-sized alphabets

A set of vectors in [1,n]d[1,n]^{d} is called monotone if its elements can be ordered so that they form a monotone non-decreasing sequence on each coordinate.

Definition 5.1.

For sets 𝒜\mathcal{A} and \mathcal{B} of vectors we define

𝒜+={a+b:a𝒜,b},c𝒜={ca:a𝒜}\mathcal{A}+\mathcal{B}=\{a+b\,:\,a\in\mathcal{A},b\in\mathcal{B}\},\ c\cdot\mathcal{A}=\{ca\,:\,a\in\mathcal{A}\}

and for a string WW we define: Pl,r(W)={𝑃𝑎𝑟𝑖𝑘ℎ(W[1..k]):lkr}P_{l,r}(W)=\{\mathit{Parikh}(W[1..k]):l\leq k\leq r\}. Let us also denote by |A||A| the length of a string corresponding to a Parikh vector AA.

In the algorithm we use the following fact shown in [11]. The exact complexities can be found in [11, Theorem 3.1].

Fact 2 ([11]).

Given three monotone sequences 𝒜,,𝒞\mathcal{A},\mathcal{B},\mathcal{C} in [1,n]d[1,n]^{d} for a constant dd, we can compute (𝒜+)𝒞(\mathcal{A}+\mathcal{B})\cap\mathcal{C} in 𝒪(n2ϵ)\mathcal{O}(n^{2-\epsilon}) expected time for a constant ϵ>0\epsilon>0, or in 𝒪(n2ϵ)\mathcal{O}(n^{2-\epsilon^{\prime}}) worst case time for a constant ϵ>0\epsilon^{\prime}>0 if d7d\leq 7.

if |T|<2|T|<2 return \emptyset;
m=n/2m=\lceil n/2\rceil;
𝒜=P0,m1(T)\mathcal{A}=P_{0,m-1}(T)=Pm,n(T)\mathcal{B}=P_{m,n}(T)𝒞=P0,n(T)\mathcal{C}=P_{0,n}(T);
={|C|: 2C(𝒜+)2𝒞}\mathcal{M}=\{|C|\,:\,2C\in(\mathcal{A}+\mathcal{B})\cap 2\cdot\mathcal{C}\};
T𝑙𝑒𝑓𝑡=T[1..m1]T_{\mathit{left}}=T[1..m-1]T𝑟𝑖𝑔ℎ𝑡=T[m..n]T_{\mathit{right}}=T[m..n];
return 𝖢𝖤𝖭𝖳𝖤𝖱𝖲(T𝑙𝑒𝑓𝑡){k+m:k𝖢𝖤𝖭𝖳𝖤𝖱𝖲(T𝑟𝑖𝑔ℎ𝑡)}\mathcal{M}\cup\mathsf{CENTERS}(T_{\mathit{left}})\cup\{k+m\,:\,k\in\mathsf{CENTERS}(T_{\mathit{right}})\}
Algorithm 1 𝖢𝖤𝖭𝖳𝖤𝖱𝖲(T)\mathsf{CENTERS}(T)
Figure 5: $A\in\mathcal{A}$, $B\in\mathcal{B}$, $C\in\mathcal{C}$ denote Parikh vectors of the corresponding fragments. If $A+B=2C$, then $D=E$ and $k=|C|$ is a center of an Ab-square.

Theorem 5.2.

For a string of length nn over an alphabet of size d=𝒪(1)d=\mathcal{O}(1), we can compute centers of all Ab-squares and centers of all odd Ab-squares in expected time 𝒪(n2ϵ)\mathcal{O}(n^{2-\epsilon}) or in worst case time 𝒪(n2ϵ)\mathcal{O}(n^{2-\epsilon}) if d7d\leq 7, for ϵ>0\epsilon>0.

Proof 5.3.

We use the above algorithm. Correctness of the algorithm is straightforward; see Figure 5. If

|A|<|B|,|C|=(|A|+|B|)/2,B=A+D+E,C=A+D|A|<|B|,\ |C|=(|A|+|B|)/2,\ B=A+D+E,\ C=A+D

then A+B=2C2A+D+E=2A+2DA+B=2C\iff 2A+D+E=2A+2D.

Consequently, after cancelling the same parts on both sides, $A+B=2C\iff E=D$, that is, if and only if the factor $T[i..j]$ corresponding to $DE$ is an Ab-square centred at $k=|C|$. The figure shows the case when $k$ is in the right half of the string; the other case is symmetric.
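The divide-and-conquer scheme can be prototyped in Python as below. This is a sketch under stated assumptions rather than the algorithm with the claimed running time: the set $(\mathcal{A}+\mathcal{B})\cap 2\cdot\mathcal{C}$ is computed naively in quadratic time instead of using Fact 2, the recursive calls are made on the parts strictly to the left and to the right of position $m$, and all function names are hypothetical. A center is reported as the number of letters preceding it.

```python
def parikh_prefixes(text):
    """Parikh vectors of all prefixes text[:k], for k = 0, ..., len(text)."""
    index = {c: i for i, c in enumerate(sorted(set(text)))}
    vec = [0] * len(index)
    out = [tuple(vec)]
    for c in text:
        vec[index[c]] += 1
        out.append(tuple(vec))
    return out


def centers(text):
    """Centers of all Ab-square factors of text (naive stand-in for Fact 2)."""
    n = len(text)
    if n < 2:
        return set()
    m = (n + 1) // 2
    pref = parikh_prefixes(text)
    doubled = {tuple(2 * v for v in pref[k]): k for k in range(n + 1)}
    found = set()
    for a in pref[:m]:           # prefixes of length 0 .. m-1
        for b in pref[m:]:       # prefixes of length m .. n
            k = doubled.get(tuple(x + y for x, y in zip(a, b)))
            if k is not None:
                found.add(k)
    left, right = text[:m - 1], text[m:]
    return found | centers(left) | {k + m for k in centers(right)}


def brute_centers(text):
    """Reference implementation: try every factor of even length."""
    n = len(text)
    return {k for i in range(n) for k in range(i + 1, n)
            if 2 * k - i <= n and sorted(text[i:k]) == sorted(text[k:2 * k - i])}
```

For example, `centers("abba")` returns `{2}`, since both `bb` and `abba` are Ab-squares centred after the second letter.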

By Fact 2, the cost of the algorithm can be given by a recurrence

S(n)=2S(n2)+𝒪(n2ϵ)S(n)=2\cdot S\left(\tfrac{n}{2}\right)+\mathcal{O}(n^{2-\epsilon})

which results in S(n)=𝒪(n2ϵ)S(n)=\mathcal{O}(n^{2-\epsilon}) for ϵ>0\epsilon>0.

In the case of odd Ab-squares, let

Pl,rc(W)={𝑃𝑎𝑟𝑖𝑘ℎ(W[1..k]):lkr,kmod2=c}.P^{c}_{l,r}(W)=\{\mathit{Parikh}(W[1..k]):l\leq k\leq r,\,k\bmod{2}=c\}.

In the algorithm the statement ={|C|: 2C(𝒜+)2𝒞}\mathcal{M}=\{|C|\,:\,2C\in(\mathcal{A}+\mathcal{B})\cap 2\cdot\mathcal{C}\} is executed for both c{0,1}c\in\{0,1\}, with

𝒜=P0,m1c(T),=Pm,nc(T),𝒞=P0,n1c(T).\mathcal{A}=P^{c}_{0,m-1}(T),\quad\mathcal{B}=P^{c}_{m,n}(T),\quad\mathcal{C}=P^{1-c}_{0,n}(T).

Other parts of the algorithm, as well as its analysis, are essentially the same.

6 Detecting odd Ab-squares

Unfortunately the string TT from Lemma 4.2 has many Ab-squares which are not well-placed. Our approach is to embed the (slightly) modified string TT into a string which is a special composition of TT and a combination of long quaternary Ab-square-free strings. The resulting string will fix the potential centers in specified locations. We use additional letters: ,\diamondsuit,\,\circ and 𝟶,,𝟼\mathtt{0},\dots,\mathtt{6}.

6.1 Fixing centers

We show first a fact useful in fixing Ab-squares in specified places (Lemma 6.5). Keränen’s construction [27] of a quaternary Ab-square-free string consists in iterating a certain morphism ϕ\phi, such that |ϕ(a)|=85|\phi(a)|=85 for each of the four letters aa, on an initially single-letter string. This implies the following lemma.

Lemma 6.1 (Keränen [27]).

A length-nn quaternary Ab-square-free string can be generated in 𝒪(n)\mathcal{O}(n) time.

Let Pt2P_{t-2} be any Ab-square-free string of length t2t-2 over alphabet {𝟹,𝟺,𝟻,𝟼}\{\mathtt{3},\mathtt{4},\mathtt{5},\mathtt{6}\}. Let us define

U2t= 0Pt2 1 2Pt2R 0.U_{2t}\,=\,\mathtt{0}\,P_{t-2}\,\mathtt{1}\,\mathtt{2}\,P_{t-2}^{R}\,\mathtt{0}.
Figure 6: Illustration of the proof that a factor that starts inside $P=P_{t-2}$ and ends inside another occurrence of $P$, does not have its center between two $\mathtt{0}$'s, and has length that is not an even multiple of the period $2t$, cannot be an Ab-square. If it is an Ab-square (as shown in the figure), then $A\,\mathtt{1}\mathtt{2}\,P^{R}\,\mathtt{0}\mathtt{0}\,BC\cong DA\,\mathtt{1}\mathtt{2}\,P^{R}\,\mathtt{0}\mathtt{0}\,B$, and we can cancel equal parts on both sides. Consequently $C\cong D$, which implies that $P$ contains an Ab-square $CD$; a contradiction.

Lemma 6.2.

The string $(U_{2t})^{m}$ contains exactly the following Ab-squares:

  (1) of length divisible by $4t$; and

  (2) with the center between two $\mathtt{0}$'s and of all admissible even lengths other than $(4q+2)t$, for an integer $q\geq 0$.

Proof 6.3.

Let XX be a factor of (U2t)m(U_{2t})^{m}. If XX has length greater than or equal to 4t4t, then its middle length-4t4t factor forms a classic square, and after removing it we obtain a different factor YY of length smaller by 4t4t, which is centred exactly like XX. Hence, we can focus only on non-empty factors of length smaller than 4t4t.

If XX is centred between two 𝟶\mathtt{0}’s, then after removing letters 𝟷\mathtt{1} and 𝟸\mathtt{2} we obtain an even palindrome (hence also an Ab-square). If XX is shorter than 2t2t, then no letters 𝟷\mathtt{1} or 𝟸\mathtt{2} occur. If it is longer than 2t2t, then both parts contain one letter 𝟷\mathtt{1} and 𝟸\mathtt{2} each. If its length is exactly 2t2t, then the letters 𝟷\mathtt{1} and 𝟸\mathtt{2} remain unmatched, hence it is the only case where the factor is not an Ab-square.

Let us assume that XX is centred in a different place. If the factor does not contain any of the letters 𝟶\mathtt{0}, 𝟷\mathtt{1} or 𝟸\mathtt{2}, then it is a factor of Pt2P_{t-2} or its reverse, hence it cannot be an Ab-square. Otherwise, the factor needs to contain each of 𝟷\mathtt{1}, 𝟸\mathtt{2} twice and 𝟶\mathtt{0} four times. Then XX fully contains a factor

1 2Pt2R 0 0Pt2 1 2Pt2R 0 0or0 0Pt2 1 2Pt2R 0 0Pt2 1 2.\mathtt{1}\,\mathtt{2}\,P_{t-2}^{R}\,\mathtt{0}\,\mathtt{0}\,P_{t-2}\,\mathtt{1}\,\mathtt{2}\,P_{t-2}^{R}\,\mathtt{0}\,\mathtt{0}\quad\text{or}\quad\mathtt{0}\,\mathtt{0}\,P_{t-2}\,\mathtt{1}\,\mathtt{2}\,\,P_{t-2}^{R}\,\mathtt{0}\,\mathtt{0}\,P_{t-2}\,\mathtt{1}\,\mathtt{2}.

Let us assume the former, see Figure 6; the latter is considered analogously. String $X$ has a length-$i$ suffix of $P_{t-2}$ as a prefix and a length-$j$ prefix of $P_{t-2}$ as a suffix, with $0\leq i+j\leq t-2$ and $t-2-(i+j)$ even. Then $X$ is an Ab-square if and only if $P_{t-2}[j+1..t-2-i]$ is an Ab-square, since $P_{t-2}[(t-i+j)/2..t-2-i]\,X\,P_{t-2}[j+1..(t-i+j)/2-1]$ is a classic square.

Remark 6.4.

Lemma 6.2 works for any Ab-square-free string Pt2P_{t-2} such that 𝐴𝑙𝑝ℎ(Pt2){𝟶,𝟷,𝟸}=\mathit{Alph}(P_{t-2})\cap\{\mathtt{0},\mathtt{1},\mathtt{2}\}=\emptyset.
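The characterisation of Lemma 6.2 can be verified by brute force for short strings. The Python sketch below uses hypothetical names and, as an assumption, takes tiny hand-picked Ab-square-free strings such as `"3"` or `"345"` for $P_{t-2}$ instead of prefixes of Keränen's string. A center is given as the number of letters preceding it, so it lies between two $\mathtt{0}$'s exactly when it is divisible by $2t$.

```python
from collections import Counter


def build_u(p):
    """U_{2t} = 0 P 1 2 P^R 0 for a string P of length t - 2 over {3, 4, 5, 6}."""
    return "0" + p + "12" + p[::-1] + "0"


def is_ab_square(s):
    """True iff s is non-empty, of even length, with Ab-equivalent halves."""
    h = len(s) // 2
    return h > 0 and len(s) % 2 == 0 and Counter(s[:h]) == Counter(s[h:])


def predicted(length, center, t):
    """Lemma 6.2: length divisible by 4t, or centred between two 0's with length not (4q+2)t."""
    return length % (4 * t) == 0 or (center % (2 * t) == 0 and length % (4 * t) != 2 * t)


def check_lemma(p, m):
    """Compare brute force with the prediction for every even-length factor of (U_{2t})^m."""
    t = len(p) + 2
    s = build_u(p) * m
    for i in range(len(s)):
        for h in range(1, (len(s) - i) // 2 + 1):
            if is_ab_square(s[i:i + 2 * h]) != predicted(2 * h, i + h, t):
                return False
    return True
```

For instance, `build_u("3")` is `031230`, and in its square the Ab-squares are exactly the factors centred after position 6 of lengths 2, 4, 8, 10 and 12.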

For equal-length strings X,YX,Y we define the string

𝗌𝗁𝗎𝖿𝖿𝗅𝖾(X,Y)=X[1]Y[1]X[2]Y[2]X[3]Y[3].\mathsf{shuffle}_{\Diamond}(X,Y)\,=\,X[1]\,\Diamond\,Y[1]\;X[2]\,\Diamond\,Y[2]\;X[3]\,\Diamond\,Y[3]\;\cdots.

For example, 𝗌𝗁𝗎𝖿𝖿𝗅𝖾(𝚊𝚋𝚌,𝙰𝙱𝙲)=𝚊𝙰𝚋𝙱𝚌𝙲\mathsf{shuffle}_{\diamondsuit}(\mathtt{abc},\mathtt{ABC})\,=\,\mathtt{a}\diamondsuit\mathtt{Ab}\diamondsuit\mathtt{Bc}\diamondsuit\mathtt{C}.

The parity condition for half lengths of Ab-squares in the following observation justifies the usage of the additional letter $\Diamond$ in $\mathsf{shuffle}$. Let $U_{[X]}$ be the string resulting from $U$ by removing all letters outside $\mathit{Alph}(X)$.

Observation. Assume $X,Y$ are equal-length strings composed of disjoint sets of letters distinct from $\Diamond$ and $W$ is an Ab-square in $\mathsf{shuffle}_{\Diamond}(X,Y)$. Then $W_{[X]},W_{[Y]}$ are Ab-squares in $X,Y$, respectively (we say that these Ab-squares are implied by $W$). Moreover, $|W_{[X]}|/2,\,|W_{[Y]}|/2,\,|W|/2$ are of the same parity.

We say that an even-length factor of a string $X$ is centred at $i$ if it has its center between positions $i$ and $i+1$ in $X$. By $a\mid b$ and $a\nmid b$ we denote that $a$ divides $b$ and $a$ does not divide $b$. For an illustration of the following lemma, see Figure 7.

Lemma 6.5.

Let X=(U2t)n1X=(U_{2t})^{n-1}, YY be a string of length |X||X| such that its alphabet is disjoint with 𝐴𝑙𝑝ℎ(X){}\mathit{Alph}(X)\cup\{\Diamond\}, W=𝗌𝗁𝗎𝖿𝖿𝗅𝖾(X,Y)W=\mathsf{shuffle}_{\Diamond}(X,Y), and let an integer \ell satisfy 12t12t\nmid\ell. Then a length-\ell factor of WW is an Ab-square if and only if it is centred in WW at r{0,1,2}(mod6t)r\equiv\{0,-1,-2\}\pmod{6t}, YY contains an Ab-square factor of length /3\ell/3 centred in YY at r/3\left\lfloor r/3\right\rfloor, and 6t6t\nmid\ell.

Proof 6.6.

By the disjointness of sets of letters in $X,Y$ and $\{\Diamond\}$, each Ab-square in $W$ has length that is divisible by 3. The following claim is then readily verified (cf. the observation above on implied Ab-squares).

Claim 3.

For positive integer \ell such that 66\mid\ell, a length-\ell factor of WW centred at rr is an Ab-square if and only if the length-/3\ell/3 factors in XX and YY centred at r+23\left\lfloor\tfrac{r+2}{3}\right\rfloor and r3\left\lfloor\tfrac{r}{3}\right\rfloor, respectively, are Ab-squares.

Let integer >0\ell>0 satisfy 66\mid\ell and 12t12t\nmid\ell. We show two implications.

()(\mathbf{\Rightarrow}) If WW contains an Ab-square factor of length \ell centred at some rr, then the implied Ab-square factor of XX has length /3\ell/3, where 4t/34t\nmid\ell/3, so by Lemma 6.2 it has its center between two 𝟶\mathtt{0}’s, i.e., 2tr+232t\mid\left\lfloor\tfrac{r+2}{3}\right\rfloor. Hence, r{0,1,2}(mod6t)r\equiv\{0,-1,-2\}\pmod{6t}.

Moreover, 2t/32t\nmid\ell/3 also by Lemma 6.2. Finally, the implied Ab-square factor of YY indeed has length /3\ell/3 and is centred at r/3\left\lfloor r/3\right\rfloor.

$(\mathbf{\Leftarrow})$ Let $r\equiv\{0,-1,-2\}\pmod{6t}$, $6t\nmid\ell$, and assume that $Y$ contains an Ab-square factor of length $\ell/3$ centred at $\left\lfloor r/3\right\rfloor$. We have $2t\mid\left\lfloor\tfrac{r+2}{3}\right\rfloor$ and $2t\nmid\ell/3$, so by Lemma 6.2 the string $X$ contains an Ab-square factor of length $\ell/3$ centred at $\left\lfloor\tfrac{r+2}{3}\right\rfloor$. Finally, the unary string $\Diamond^{|X|}$ certainly contains an Ab-square factor of length $\ell/3$ centred at $\left\lfloor\tfrac{r+1}{3}\right\rfloor$. By the claim, $W$ contains an Ab-square of length $\ell$ centred at $r$ that implies the three Ab-squares.

Figure 7: Illustration of Lemma 6.5. Let $X=(U_{6})^{2}=(\mathtt{031230})^{2}$. The string $Y=(\mathtt{abba})^{3}$, composed of black letters, contains many Ab-squares. However the string $Z=\mathsf{shuffle}_{\Diamond}(X,Y)$ of length 36 contains only Ab-squares centred at 16, 17 or 18. The implied Ab-squares in $Y$ are only those which are centred at positions 5 or 6 in $Y$.
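The example of Figure 7 is small enough to be checked exhaustively. The Python sketch below uses hypothetical names and, as an assumption, represents the letter $\Diamond$ by the Unicode lozenge character; a center is again the number of letters preceding it.

```python
from collections import Counter

DIAMOND = "\u25ca"  # stand-in for the separator letter


def shuffle(x, y):
    """X[1] <> Y[1] X[2] <> Y[2] ... for equal-length strings X and Y."""
    return "".join(a + DIAMOND + b for a, b in zip(x, y))


def ab_square_centers(s):
    """Centers of all non-empty Ab-square factors of s, found by brute force."""
    return {i + h
            for i in range(len(s))
            for h in range(1, (len(s) - i) // 2 + 1)
            if Counter(s[i:i + h]) == Counter(s[i + h:i + 2 * h])}


x = "031230" * 2   # (U_6)^2 with P_1 = "3"
y = "abba" * 3
z = shuffle(x, y)
centers_in_z = ab_square_centers(z)
```

Here `centers_in_z` equals `{16, 17, 18}`, and dividing by 3 with rounding down gives the centers 5 and 6 of the implied Ab-squares in $Y$, even though $Y$ on its own has Ab-squares centred at many other positions.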

6.2 Main result

We use the technique of fixing Ab-squares from Lemma 6.5. Moreover, we make the following minor modifications to the construction of the string $T$ from Section 4:

  (1) Each fragment $\mathtt{b}^{kd}$ is extended by one letter to $\mathtt{b}^{kd+1}$, and

  (2) the letters $\bullet,\star$ are each replaced by two letters, $\bullet\circ$ and $\star\circ$, respectively.

Intuitively, (1) allows us to extend the Ab-squares considered in the proof of Lemma 4.2 by one letter $\mathtt{b}$ to either side, and (2) makes $|\mathbf{A}|=|\mathbf{B}|=|\SS_{i}^{I}|$ even, which facilitates the use of Lemma 6.5 with $Y=T$. It can be verified by inspecting the proof that Lemma 4.2 still holds after these two changes. From now on, all the notions from Section 4 refer to the construction after these modifications.

Theorem 6.7.

Checking if a length-$n$ string over an alphabet of size $\omega(1)$ contains an odd Ab-square is $\operatorname{\textsc{3SUM}}$-hard. Moreover, for a string over an alphabet of size $14+k$, for a constant $k$, the same problem cannot be solved in $\mathcal{O}(n^{2-\frac{6}{3+k}-\varepsilon})$ time, for a constant $\varepsilon>0$, unless the $\operatorname{\textsc{3SUM}}$ conjecture fails.

Proof 6.8.

It is now enough to show the following equivalence for $X=(U_{2t})^{n-1}$, where $2t=|T|/(n-1)$. We assume that $n\geq 3$.

Claim 4.

An odd-half instance of $\operatorname{\textsc{3DAP}}$ is a YES-instance if and only if $W=\mathsf{shuffle}_{\Diamond}(X,T)$ has an odd Ab-square factor.

Proof 6.9.

\[
\overbrace{\underbrace{\star\,\circ\;\mathbf{B}}\;\underbrace{\bullet\,\circ\;\SS^{I}_{1}}\;\underbrace{\mathbf{A}\;\star\,\circ}}^{2t}\;
\overbrace{\underbrace{\bullet\,\circ\;\mathbf{B}}_{M+2}\;\underbrace{\star\,\circ\;\SS^{I}_{2}}\;\underbrace{\mathbf{A}\;\bullet\,\circ}}^{2t}\;
\overbrace{\underbrace{\star\,\circ\;\mathbf{B}}\;\underbrace{\bullet\,\circ\;\SS^{I}_{3}}\;\underbrace{\mathbf{A}\;\star\,\circ}}^{2t}\;
\overbrace{\underbrace{\bullet\,\circ\;\mathbf{B}}\;\underbrace{\star\,\circ\;\SS^{I}_{4}}\;\underbrace{\mathbf{A}\,\bullet\,\circ}}^{2t}
\]

Figure 8: A schematic structure of a fragment of $T$ after insertion of the symbols $\circ$. There are $3(n-1)$ (underbraced) blocks in $T$, each of size $M+2$, and $2t=3M+6$.

$(\Rightarrow)$ Assume that $\bar{x}$ is an odd-half instance and $\operatorname{\textsc{3DAP}}(\bar{x})$ has a solution.

By Lemma 4.2, $T$ contains a well-placed Ab-square, that is, an Ab-square centred at a position $r'$ such that $2t\mid r'$. (Recall that $3(M+2)=2t$.) Moreover, in the proof of that lemma it is shown that in this case there exists a well-placed Ab-square in $T$ that satisfies the following additional requirements: (1) it starts within the gadget $\mathbf{B}$; (2) it starts and ends within a block of $\mathtt{b}$'s; (3) its maximal prefix and suffix consisting of letters $\mathtt{b}$ are $\mathtt{b}^{e}$ and $\mathtt{b}^{f}$, where $e,f\leq kd$.

Let $\ell'$ denote the half length of this Ab-square. By (2) and (3), if $\ell'$ is even, the Ab-square can be extended by one letter $\mathtt{b}$ to either side (because we have extended each block $\mathtt{b}^{kd}$) so that $\ell'$ becomes odd. Moreover, by (1), we have $\ell'\bmod(2t)\in[\tfrac{4}{3}t,2t)$; in particular, $t\nmid\ell'$. Then Lemma 6.5 implies that the factor of $W$ centred at $r=3r'\equiv 0\pmod{6t}$ and of length $6\ell'$, where $6t\nmid 6\ell'$, is an Ab-square. Its half length, $3\ell'$, is odd, as desired.

\[
\circ\,\star\,\circ\,\stackrel{\mathbf{M}}{{\color{red}\mathbf{-}}}\,\bullet\,\circ\,\stackrel{\mathbf{2M}}{{\color{red}\mathbf{--}}}\,\star\,\circ\,\,\bullet\,\circ\,\stackrel{\mathbf{M}}{{\color{red}\mathbf{-}}}\,\star\,\circ\,\,\stackrel{\mathbf{2M}}{{\color{red}\mathbf{--}}}\,\bullet\,\stackrel{\mathbf{r'}}{|}\,\circ\,\,\star\,\circ\,\stackrel{\mathbf{M}}{{\color{red}\mathbf{-}}}\,\bullet\,\circ\,\stackrel{\mathbf{2M}}{{\color{red}\mathbf{--}}}\,\star\,\circ\,\,\bullet\,\circ\,\stackrel{\mathbf{M}}{{\color{red}\mathbf{-}}}\,\star\,\circ\,\stackrel{\mathbf{2M}}{{\color{red}\mathbf{--}}}\,\bullet\,\circ
\]

Figure 9: A simplified version of Figure 8, used to determine which positions in a block can be the starting position of an Ab-square centred at $r'$ (one position to the left of a good center) when only the letters $\bullet,\star,\circ$ are counted.

$(\Leftarrow)$ Assume that $W$ has an Ab-square factor $U$ of length $\ell$ such that $\ell/2$ is odd. In particular, we have $12t\nmid\ell$, so by Lemma 6.5 the Ab-square $U$ is centred in $W$ at $r\equiv\{0,-1,-2\}\pmod{6t}$ and $T$ contains an Ab-square factor $V$ of length $\ell/3$ centred in $T$ at $r'=\left\lfloor r/3\right\rfloor$. If $6t\mid r$, then $2t\mid r'$ and $V$ is well-placed.

Otherwise, $V$ cannot be an Ab-square due to the following fact: $T$ does not contain an Ab-square factor of length $\ell$ not divisible by $4t$ and centred at $r'\equiv-1\pmod{2t}$. Indeed, similarly to the proof of Lemma 4.2, we will show that each even-length factor centred at such $r'$ contains different counts of one of the letters $\bullet,\star,\circ$ in its two halves. The positions of the letters $\bullet,\star,\circ$ in $T$ repeat with period $6(M+2)$, so it is sufficient to inspect the first 6 blocks on each side, as the remaining ones behave periodically; see Figures 8 and 9.

\begin{tabular}{l|ccccccccc}
letter & $\circ$ & $\star$ & $\circ$ & $\mathbf{B}$ & $\bullet$ & $\circ$ & $\SS$ & $\mathbf{A}$ & $\star$ \\
distance & $6M+12$ & $6M+11$ & $6M+10$ & & $5M+9$ & $5M+8$ & & & $3M+7$ \\
\hline
letter & $\circ$ & $\bullet$ & $\circ$ & $\mathbf{B}$ & $\star$ & $\circ$ & $\SS$ & $\mathbf{A}$ & $\bullet$ \\
distance & $3M+6$ & $3M+5$ & $3M+4$ & & $2M+3$ & $2M+2$ & & & $1$ \\
\end{tabular}

\begin{tabular}{l|ccccccccc}
letter & $\circ$ & $\star$ & $\circ$ & $\mathbf{B}$ & $\bullet$ & $\circ$ & $\SS$ & $\mathbf{A}$ & $\star$ \\
distance & $1$ & $2$ & $3$ & & $M+4$ & $M+5$ & & & $3M+6$ \\
\hline
letter & $\circ$ & $\bullet$ & $\circ$ & $\mathbf{B}$ & $\star$ & $\circ$ & $\SS$ & $\mathbf{A}$ & $\bullet$ \\
distance & $3M+7$ & $3M+8$ & $3M+9$ & & $4M+10$ & $4M+11$ & & & $6M+12$ \\
\end{tabular}

Table 1: Distances from the letters in $\{\bullet,\star,\circ\}$ to the center of the factor: top table for the left half, bottom table for the right half.

An exhaustive verification can be performed as follows. First, in Table 1, we list the distances of letters from $\{\bullet,\star,\circ\}$ in both directions to the center of the factor. In Table 2 we merge these two sequences of distances, assuming that $M\geq 3$.

For each distance, we write the letter that is located at this distance, with a “$+$” sign if it is in the left half and with a “$-$” sign otherwise. The remaining columns show, for each $c\in\{\bullet,\star,\circ\}$, the running difference between the number of occurrences of the letter $c$ in the left half and in the right half.

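\begin{tabular}{l|c|rrr}
position & letter & $\bullet$ & $\star$ & $\circ$ \\
\hline
$1$ & $+\bullet$ & $1$ & $0$ & $0$ \\
$1$ & $-\circ$ & $1$ & $0$ & $-1$ \\
$2$ & $-\star$ & $1$ & $-1$ & $-1$ \\
$3$ & $-\circ$ & $1$ & $-1$ & $-2$ \\
$M+4$ & $-\bullet$ & $0$ & $-1$ & $-2$ \\
$M+5$ & $-\circ$ & $0$ & $-1$ & $-3$ \\
$2M+2$ & $+\circ$ & $0$ & $-1$ & $-2$ \\
$2M+3$ & $+\star$ & $0$ & $0$ & $-2$ \\
$3M+4$ & $+\circ$ & $0$ & $0$ & $-1$ \\
$3M+5$ & $+\bullet$ & $1$ & $0$ & $-1$ \\
$3M+6$ & $+\circ$ & $1$ & $0$ & $0$ \\
$3M+6$ & $-\star$ & $1$ & $-1$ & $0$ \\
$3M+7$ & $+\star$ & $1$ & $0$ & $0$ \\
$3M+7$ & $-\circ$ & $1$ & $0$ & $-1$ \\
$3M+8$ & $-\bullet$ & $0$ & $0$ & $-1$ \\
$3M+9$ & $-\circ$ & $0$ & $0$ & $-2$ \\
$4M+10$ & $-\star$ & $0$ & $-1$ & $-2$ \\
$4M+11$ & $-\circ$ & $0$ & $-1$ & $-3$ \\
$5M+8$ & $+\circ$ & $0$ & $-1$ & $-2$ \\
$5M+9$ & $+\bullet$ & $1$ & $-1$ & $-2$ \\
$6M+10$ & $+\circ$ & $1$ & $-1$ & $-1$ \\
$6M+11$ & $+\star$ & $1$ & $0$ & $-1$ \\
$6M+12$ & $+\circ$ & $1$ & $0$ & $0$ \\
$6M+12$ & $-\bullet$ & $0$ & $0$ & $0$ \\
\end{tabular}

Table 2: The merge of distance sequences from Table 1.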
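The bookkeeping of Tables 1 and 2 can also be replayed programmatically. The sketch below is an illustration under stated assumptions, not the authors' verification: it models only the positions of $\bullet,\star,\circ$ according to the block layout of Figure 8, replaces every letter of the gadgets $\mathbf{A}$, $\mathbf{B}$, $\SS$ by a neutral filler (assuming that the gadgets contain none of the three marker letters), and uses hypothetical names throughout.

```python
from collections import Counter

STAR, CIRC, BULLET, GADGET = "*", "o", "@", "."


def marker_period(M):
    """One period of length 6(M+2): two groups of three blocks of size M+2,
    following Figure 8, with gadget letters replaced by a neutral filler."""
    g = GADGET * M
    first = STAR + CIRC + g + BULLET + CIRC + g + g + STAR + CIRC
    second = BULLET + CIRC + g + STAR + CIRC + g + g + BULLET + CIRC
    return first + second


def balanced_half_lengths(M, centre):
    """Half-lengths h in [1, 6M+12] for which the two halves around `centre`
    (the number of letters to its left) contain each marker equally often."""
    text = marker_period(M) * 4
    balanced = []
    for h in range(1, 6 * M + 12 + 1):
        left = Counter(c for c in text[centre - h:centre] if c != GADGET)
        right = Counter(c for c in text[centre:centre + h] if c != GADGET)
        if left == right:
            balanced.append(h)
    return balanced


def check_tables(M):
    p = 6 * M + 12                      # period of the marker positions
    centres = (2 * p - 1,               # between a bullet and a circle
               p + p // 2 - 1)          # between a star and a circle
    return all(balanced_half_lengths(M, c) == [p] for c in centres)
```

Both centres satisfy $r'\equiv -1\pmod{2t}$ with $2t=3M+6$. For $M\geq 3$ the only balanced half-length in this range is $6M+12=4t$, matching the last row of Table 2, where all three running differences return to zero for the first time.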

Consequently, as in Lemma 4.2, the corresponding instance of $\operatorname{\textsc{3DAP}}$ is a YES-instance.

The complexities in the theorem are obtained as in Theorems 4.4 and 4.6.

7 Open problems

The most interesting questions that remain open are as follows:

  1. Is checking Ab-square-freeness $\operatorname{\textsc{3SUM}}$-hard? Our reductions allowed us to show $\operatorname{\textsc{3SUM}}$-hardness only of detecting an odd Ab-square.

  2. Can one detect an additive square in a length-$n$ string over a constant-sized alphabet in $\mathcal{O}(n^{2-\varepsilon})$ time, for some $\varepsilon>0$? We have shown $\operatorname{\textsc{3SUM}}$-hardness of this problem for an alphabet of size polynomial in $n$.

References

  • [1] Peyman Afshani, Ingo van Duijn, Rasmus Killmann, and Jesper Sindahl Nielsen. A lower bound for jumbled indexing. In Shuchi Chawla, editor, Proceedings of the 2020 ACM-SIAM Symposium on Discrete Algorithms, SODA 2020, Salt Lake City, UT, USA, January 5-8, 2020, pages 592–606. SIAM, 2020. doi:10.1137/1.9781611975994.36.
  • [2] Amihood Amir, Alberto Apostolico, Tirza Hirst, Gad M. Landau, Noa Lewenstein, and Liat Rozenberg. Algorithms for jumbled indexing, jumbled border and jumbled square on run-length encoded strings. Theoretical Computer Science, 656:146–159, 2016. doi:10.1016/j.tcs.2016.04.030.
  • [3] Amihood Amir, Timothy M. Chan, Moshe Lewenstein, and Noa Lewenstein. On hardness of jumbled indexing. In Javier Esparza, Pierre Fraigniaud, Thore Husfeldt, and Elias Koutsoupias, editors, Automata, Languages, and Programming - 41st International Colloquium, ICALP 2014, Copenhagen, Denmark, July 8-11, 2014, Proceedings, Part I, volume 8572 of Lecture Notes in Computer Science, pages 114–125. Springer, 2014. doi:10.1007/978-3-662-43948-7_10.
  • [4] Hideo Bannai, Shunsuke Inenaga, and Dominik Köppl. Computing all distinct squares in linear time for integer alphabets. In Juha Kärkkäinen, Jakub Radoszewski, and Wojciech Rytter, editors, 28th Annual Symposium on Combinatorial Pattern Matching, CPM 2017, July 4-6, 2017, Warsaw, Poland, volume 78 of LIPIcs, pages 22:1–22:18. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2017. doi:10.4230/LIPIcs.CPM.2017.22.
  • [5] Ilya Baran, Erik D. Demaine, and Mihai Patrascu. Subquadratic algorithms for 3SUM. Algorithmica, 50(4):584–596, 2008. doi:10.1007/s00453-007-9036-3.
  • [6] Felix Adalbert Behrend. On sets of integers which contain no three terms in arithmetical progression. Proceedings of the National Academy of Sciences of the United States of America, 32(12):331–332, 1946. doi:10.1073/pnas.32.12.331.
  • [7] Tom C. Brown and Allen R. Freedman. Arithmetic progressions in lacunary sets. Rocky Mountain Journal of Mathematics, 17(3):587–596, 1987.
  • [8] Tom C. Brown, Veselin Jungić, and Andrew Poelstra. On double 3-term arithmetic progressions. Integers, 14:A43, 2014. URL: https://www.emis.de/journals/INTEGERS/papers/o43/o43.Abstract.html.
  • [9] Julien Cassaigne, James D. Currie, Luke Schaeffer, and Jeffrey O. Shallit. Avoiding three consecutive blocks of the same size and same sum. Journal of the ACM, 61(2):10:1–10:17, 2014. doi:10.1145/2590775.
  • [10] Timothy M. Chan and Qizheng He. Reducing 3SUM to Convolution-3SUM. In Martin Farach-Colton and Inge Li Gørtz, editors, 3rd Symposium on Simplicity in Algorithms, SOSA@SODA 2020, Salt Lake City, UT, USA, January 6-7, 2020, pages 1–7. SIAM, 2020. doi:10.1137/1.9781611976014.1.
  • [11] Timothy M. Chan and Moshe Lewenstein. Clustered integer 3SUM via additive combinatorics. In Rocco A. Servedio and Ronitt Rubinfeld, editors, Proceedings of the Forty-Seventh Annual ACM on Symposium on Theory of Computing, STOC 2015, Portland, OR, USA, June 14-17, 2015, pages 31–40. ACM, 2015. doi:10.1145/2746539.2746568.
  • [12] Maxime Crochemore, Costas S. Iliopoulos, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, and Tomasz Waleń. Extracting powers and periods in a word from its runs structure. Theoretical Computer Science, 521:29–41, 2014. doi:10.1016/j.tcs.2013.11.018.
  • [13] Larry J. Cummings and William F. Smyth. Weak repetitions in strings. Journal of Combinatorial Mathematics and Combinatorial Computing, 24:33–48, 1997.
  • [14] Bartłomiej Dudek, Paweł Gawrychowski, and Tatiana Starikovskaya. All non-trivial variants of 3-LDT are equivalent. In Konstantin Makarychev, Yury Makarychev, Madhur Tulsiani, Gautam Kamath, and Julia Chuzhoy, editors, Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing, STOC 2020, Chicago, IL, USA, June 22-26, 2020, pages 974–981. ACM, 2020. doi:10.1145/3357713.3384275.
  • [15] Roger C. Entringer, Douglas E. Jackson, and J.A. Schatz. On nonrepetitive sequences. Journal of Combinatorial Theory, Series A, 16(2):159–164, 1974. doi:10.1016/0097-3165(74)90041-7.
  • [16] Paul Erdős. Some unsolved problems. Magyar Tudományos Akadémia Matematikai Kutató Intézetének Közleményei, 6:221–254, 1961.
  • [17] Jeff Erickson. Finding longest arithmetic progressions, 1999. URL: https://jeffe.cs.illinois.edu/pubs/arith.html.
  • [18] Aleksandr Andreevich Evdokimov. Strongly asymmetric sequences generated by a finite number of symbols. Doklady Akademii Nauk SSSR, 179(6):1268–1271, 1968.
  • [19] Gabriele Fici, Filippo Mignosi, and Jeffrey O. Shallit. Abelian-square-rich words. Theoretical Computer Science, 684:29–42, 2017. doi:10.1016/j.tcs.2017.02.012.
  • [20] Gabriele Fici and Aleksi Saarela. On the minimum number of abelian squares in a word. In Maxime Crochemore, James Currie, Gregory Kucherov, and Dirk Nowotka, editors, Combinatorics and Algorithmics of Strings (Dagstuhl Seminar 14111), volume 4 (3), pages 34–35, Dagstuhl, Germany, 2014. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik. doi:10.4230/DagRep.4.3.28.
  • [21] Aviezri S. Fraenkel, Jamie Simpson, and Mike Paterson. On weak circular squares in binary words. In Alberto Apostolico and Jotun Hein, editors, Combinatorial Pattern Matching, 8th Annual Symposium, CPM 97, Aarhus, Denmark, June 30 - July 2, 1997, Proceedings, volume 1264 of Lecture Notes in Computer Science, pages 76–82. Springer, 1997. doi:10.1007/3-540-63220-4_51.
  • [22] Allen R. Freedman and Tom C. Brown. Sequences on sets of four numbers. Integers, 16:A33, 2016. URL: http://math.colgate.edu/~integers/q33/q33.Abstract.html.
  • [23] Isaac Goldstein, Tsvi Kopelowitz, Moshe Lewenstein, and Ely Porat. How hard is it to find (honest) witnesses? In Piotr Sankowski and Christos D. Zaroliagis, editors, 24th Annual European Symposium on Algorithms, ESA 2016, August 22-24, 2016, Aarhus, Denmark, volume 57 of LIPIcs, pages 45:1–45:16. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2016. doi:10.4230/LIPIcs.ESA.2016.45.
  • [24] Dan Gusfield and Jens Stoye. Linear time algorithms for finding and representing all the tandem repeats in a string. Journal of Computer and System Sciences, 69(4):525–546, 2004. doi:10.1016/j.jcss.2004.03.004.
  • [25] Lorenz Halbeisen and Norbert Hungerbühler. An application of van der Waerden’s theorem in additive number theory. Integers, 0:A7, 2000. URL: http://math.colgate.edu/~integers/a7/a7.pdf.
  • [26] Veikko Keränen. A powerful abelian square-free substitution over 4 letters. Theoretical Computer Science, 410(38-40):3893–3900, 2009. doi:10.1016/j.tcs.2009.05.027.
  • [27] Veikko Keränen. Abelian squares are avoidable on 4 letters. In Werner Kuich, editor, Automata, Languages and Programming, ICALP 1992, volume 623 of Lecture Notes in Computer Science, pages 41–52. Springer, 1992. doi:10.1007/3-540-55719-9_62.
  • [28] Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter, and Tomasz Waleń. Maximum number of distinct and nonequivalent nonstandard squares in a word. Theoretical Computer Science, 648:84–95, 2016. doi:10.1016/j.tcs.2016.08.010.
  • [29] Tomasz Kociumaka, Jakub Radoszewski, and Bartłomiej Wiśniewski. Subquadratic-time algorithms for abelian stringology problems. In Ilias S. Kotsireas, Siegfried M. Rump, and Chee K. Yap, editors, Mathematical Aspects of Computer and Information Sciences - 6th International Conference, MACIS 2015, Berlin, Germany, November 11-13, 2015, Revised Selected Papers, volume 9582 of Lecture Notes in Computer Science, pages 320–334. Springer, 2015. doi:10.1007/978-3-319-32859-1_27.
  • [30] Tomasz Kociumaka, Jakub Radoszewski, and Bartłomiej Wiśniewski. Subquadratic-time algorithms for abelian stringology problems. AIMS Medical Science, 4(3):332–351, 2017. doi:10.3934/ms.2017.3.332.
  • [31] Tsvi Kopelowitz, Seth Pettie, and Ely Porat. Higher lower bounds from the 3SUM conjecture. In Robert Krauthgamer, editor, Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2016, Arlington, VA, USA, January 10-12, 2016, pages 1272–1287. SIAM, 2016. doi:10.1137/1.9781611974331.ch89.
  • [32] Florian Lietard and Matthieu Rosenfeld. Avoidability of additive cubes over alphabets of four numbers. In Natasa Jonoska and Dmytro Savchuk, editors, Developments in Language Theory - 24th International Conference, DLT 2020, Tampa, FL, USA, May 11-15, 2020, Proceedings, volume 12086 of Lecture Notes in Computer Science, pages 192–206. Springer, 2020. doi:10.1007/978-3-030-48516-0_15.
  • [33] Giuseppe Pirillo and Stefano Varricchio. On uniformly repetitive semigroups. Semigroup Forum, 49:125–129, 1994. doi:10.1007/BF02573477.
  • [34] Peter A. B. Pleasants. Non-repetitive sequences. Mathematical Proceedings of the Cambridge Philosophical Society, 68:267–274, 1970.
  • [35] Mihai Pătrașcu. Towards polynomial lower bounds for dynamic problems. In Leonard J. Schulman, editor, Proceedings of the 42nd ACM Symposium on Theory of Computing, STOC 2010, Cambridge, Massachusetts, USA, 5-8 June 2010, pages 603–610. ACM, 2010. doi:10.1145/1806689.1806772.
  • [36] Michaël Rao and Matthieu Rosenfeld. Avoiding two consecutive blocks of same size and same sum over $\mathbb{Z}^{2}$. SIAM Journal on Discrete Mathematics, 32(4):2381–2397, 2018. doi:10.1137/17M1149377.
  • [37] Lawrence Bruce Richmond and Jeffrey O. Shallit. Counting abelian squares. Electronic Journal of Combinatorics, 16(1), 2009. URL: http://www.combinatorics.org/Volume_16/Abstracts/v16i1r72.html.
  • [38] Raphaël Salem and Donald C. Spencer. On sets of integers which contain no three terms in arithmetical progression. Proceedings of the National Academy of Sciences of the United States of America, 28(12):561–563, 1942. doi:10.1073/pnas.28.12.561.
  • [39] Jamie Simpson. Solved and unsolved problems about abelian squares, 2018. arXiv:1802.04481.
  • [40] Shiho Sugimoto, Naoki Noda, Shunsuke Inenaga, Hideo Bannai, and Masayuki Takeda. Computing abelian string regularities based on RLE. In Ljiljana Brankovic, Joe Ryan, and William F. Smyth, editors, Combinatorial Algorithms - 28th International Workshop, IWOCA 2017, Newcastle, NSW, Australia, July 17-21, 2017, Revised Selected Papers, volume 10765 of Lecture Notes in Computer Science, pages 420–431. Springer, 2017. doi:10.1007/978-3-319-78825-8_34.