University of Warsaw, Poland and Samsung R&D Poland, jrad@mimuw.edu.pl, https://orcid.org/0000-0002-0067-6401

University of Warsaw, Poland, rytter@mimuw.edu.pl, https://orcid.org/0000-0002-9162-6724

University of Warsaw, Poland, jks@mimuw.edu.pl, https://orcid.org/0000-0003-2207-0053

University of Warsaw, Poland, walen@mimuw.edu.pl, https://orcid.org/0000-0002-7369-3309

University of Warsaw, Poland, w.zuba@mimuw.edu.pl, https://orcid.org/0000-0002-1988-3507

Acknowledgements.

The authors warmly thank Paweł Gawrychowski and Tomasz Kociumaka for helpful discussions.

Copyright: Jakub Radoszewski, Wojciech Rytter, Juliusz Straszyński, Tomasz Waleń, Wiktor Zuba

2012 ACM Subject Classification: Theory of computation $\rightarrow$ Pattern matching

Hardness of Detecting Abelian and Additive Square Factors in Strings

Jakub Radoszewski Wojciech Rytter Juliusz Straszyński Tomasz Waleń Wiktor Zuba

Abstract

We prove 3SUM-hardness (no strongly subquadratic-time algorithm, assuming the 3SUM conjecture) of several problems related to finding Abelian square and additive square factors in a string. In particular, we conclude conditional optimality of the state-of-the-art algorithms for finding such factors.

Overall, we show 3SUM-hardness of (a) detecting an Abelian square factor of an odd half-length, (b) computing centers of all Abelian square factors, (c) detecting an additive square factor in a length- $n$ string of integers of magnitude $n^{\mathcal{O}(1)}$ , and (d) a problem of computing a double 3-term arithmetic progression (i.e., finding indices $i\neq j$ such that $(x_{i}+x_{j})/2=x_{(i+j)/2}$ ) in a sequence of integers $x_{1},\dots,x_{n}$ of magnitude $n^{\mathcal{O}(1)}$ .

Problem (d) is essentially a convolution version of the AVERAGE problem that was proposed in a manuscript of Erickson. We obtain a conditional lower bound for it with the aid of techniques recently developed by Dudek et al. [STOC 2020]. Problem (d) immediately reduces to problem (c) and is a step in reductions to problems (a) and (b). In conditional lower bounds for problems (a) and (b) we apply an encoding of Amir et al. [ICALP 2014] and extend it using several string gadgets that include arbitrarily long Abelian-square-free strings.

Our reductions also imply conditional lower bounds for detecting Abelian squares in strings over a constant-sized alphabet. We also show a subquadratic upper bound in this case, applying a result of Chan and Lewenstein [STOC 2015].

keywords:

Abelian square, additive square, 3SUM problem

1 Introduction

Abelian squares.

An Abelian square, Ab-square in short (also known as a jumbled square), is a string of the form $XY$ , where $Y$ is a permutation of $X$ ; we say that $X$ and $Y$ are Ab-equivalent. We are interested in factors (i.e., substrings composed of consecutive letters) of a given text string being Ab-squares.
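The definition can be made concrete with a small brute-force check. The following is only an illustrative sketch (the function names `is_ab_square` and `ab_square_factors` are ours, and the running time is far from the algorithms cited later): it compares the letter multisets of the two halves of a candidate factor.

```python
from collections import Counter

def is_ab_square(w):
    """True if w = XY with |X| = |Y| >= 1 and Y a permutation of X."""
    if len(w) == 0 or len(w) % 2 == 1:
        return False
    half = len(w) // 2
    # Two strings are Ab-equivalent iff their letter multisets coincide.
    return Counter(w[:half]) == Counter(w[half:])

def ab_square_factors(text):
    """All Ab-square factors as (0-based start, length), by increasing length."""
    n = len(text)
    return [(i, 2 * h)
            for h in range(1, n // 2 + 1)
            for i in range(n - 2 * h + 1)
            if is_ab_square(text[i:i + 2 * h])]
```

For instance, `ab_square_factors("0110")` reports the factor `11` and the whole string `0110`.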

Example 1.1.

The string

has exactly two Ab-square factors of length 12, shown above (but it has also Ab-squares of other lengths, e.g. 5665, 11, 1111, 011110).

Ab-squares were first studied by Erdős [16], who posed a question on the smallest alphabet size for which there exists an infinite Ab-square-free string, i.e., an infinite string without Ab-square factors. The first example of such a string over a finite alphabet was given by Evdokimov [18]. Later the alphabet size was improved to five by Pleasants [34] and finally an optimal example over a four-letter alphabet was shown by Keränen [27]. Further results on combinatorics of Ab-square-free strings and several examples of their applications in group theory, algorithmic music and cryptography can be found in [26] and references therein. Avoidability of long Ab-squares was also considered [36].

Strings containing Ab-squares were also studied. Motivated by another problem posed by Erdős [16], Entringer et al. [15] showed that every infinite binary string has arbitrarily long Ab-square factors. Fici et al. [19] considered infinite strings containing many distinct Ab-squares. A string of length $n$ may contain $\Theta(n^{2})$ Ab-square factors that are distinct as strings, but contains only $\mathcal{O}(n^{11/6})$ Ab-squares which are pairwise Abelian nonequivalent (i.e., correspond to different Parikh vectors); see [28]. It is also conjectured that a binary string of length $n$ must have at least $\left\lfloor n/4\right\rfloor$ distinct [20] and nonequivalent [21] Ab-square factors. For more conjectures related to combinatorics of Ab-square factors of strings and circular strings, see [39].

Several algorithms computing Ab-square factors of a string are known. All Ab-squares in a string of length $n$ can be computed in $\mathcal{O}(n^{2})$ time [13]. For a string over a constant-sized alphabet, all Ab-square factors of a string can be computed in $\mathcal{O}(n^{2}/\log^{2}n+\mathsf{output})$ time and the longest Ab-square can be computed in $\mathcal{O}(n^{2}/\log^{2}n)$ time [29, 30]. Moreover, for a string of length $n$ that is given by its run-length encoding consisting of $r$ runs, the longest Ab-square that ends at each position can be computed in $\mathcal{O}(|\Sigma|(r^{2}+n))$ time [2] or in $\mathcal{O}(rn)$ time [40]; both approaches require $\Omega(n^{2})$ time in the worst case.

In [37] a different problem of enumerating strings being Ab-squares was considered.

Additive squares.

An additive square is an even-length string over an integer alphabet such that the sums of characters of the halves of this string are the same.

Example 1.2.

The following string has exactly 4 additive squares of length 10, as shown.

All of them except for the rightmost one are also Ab-squares. This string does not contain any longer additive square. Altogether this string has 8 additive square factors.

An Ab-square (over an integer alphabet) is an additive square, but not necessarily the other way around. Combinatorially, problems related to additive squares are hard, in particular avoiding additive squares seems more difficult than avoiding Ab-squares. There are infinitely many strings over $\{0,1,2,3\}$ avoiding Ab-squares, but there are only finitely many strings over the same alphabet avoiding additive squares; see [22].

In fact, it is unknown if there are infinitely many strings over any finite integer alphabet avoiding additive squares [7, 25, 33]. For additive cubes, however, the property was proved in [9] (see also [32]).

Nowadays, combinatorial study of Ab-square and additive square factors often involves computer experiments; see e.g. [9, 19, 36]. In addition to other applications, efficient algorithms detecting such types of squares could provide a significant aid in this research. In the case of classic square factors (i.e., factors of the form $XX$ ), a linear-time algorithm for computing them is known for a string over a constant-sized alphabet [24] and over an integer alphabet [4, 12]. We show that, unfortunately, in many cases the existence of near-linear-time algorithms for detecting Ab-square and additive square factors is unlikely, based on conjectured hardness of the $\operatorname{\textsc{3SUM}}$ problem.

$\operatorname{\textsc{3SUM}}$ problem.

The problem asks if there are distinct elements $a,b,c\in S$ such that $a+b=c$ for a given set $S$ of $n$ integers; see [35]. It is a general belief that the following conjecture is true for the word-RAM model.

$\operatorname{\textsc{3SUM}}$ conjecture:

There is no $\mathcal{O}(n^{2-\epsilon})$ time algorithm for the $\operatorname{\textsc{3SUM}}$ problem, for any constant $\epsilon>0$ .

A problem with input of size $n$ is called $\operatorname{\textsc{3SUM}}$ -hard if an $\mathcal{O}(n^{2-\varepsilon})$ -time solution to the problem implies an $\mathcal{O}(n^{2-\varepsilon^{\prime}})$ -time solution for $\operatorname{\textsc{3SUM}}$ , for some constants $\varepsilon,\varepsilon^{\prime}>0$ .

Our results.

•

We show that the problems of computing all centers of Ab-square factors and detecting an odd half-length Ab-square factor, called an odd Ab-square (consequently also computing all lengths of Ab-square factors), for a length- $n$ string over an alphabet of size $\omega(1)$ , cannot be solved in $\mathcal{O}(n^{2-\varepsilon})$ time, for constant $\varepsilon>0$ , unless the 3SUM conjecture fails. Weaker conditional lower bounds are also stated in the case of a constant-sized alphabet.
•

For constant-sized alphabets, we show strongly sub-quadratic algorithms for these problems based on an involved result of [11] related to jumbled indexing.
•

En route we prove that detection of a double 3-term arithmetic progression (see [8]) and additive squares in a length- $n$ sequence of integers of magnitude $n^{\mathcal{O}(1)}$ is $\operatorname{\textsc{3SUM}}$ -hard.

We obtain deterministic conditional lower bounds from a convolution version of $\operatorname{\textsc{3SUM}}$ that is well-known to be $\operatorname{\textsc{3SUM}}$ -hard.

Related work.

In the jumbled indexing problem, we are given a text $T$ and are to answer queries for a pattern specified by a Parikh vector which gives, for each letter of the alphabet, the number of occurrences of this letter in the pattern. For each query, we are to check if there is a factor of the text that is Ab-equivalent to the pattern (existence query) or report all such factors (reporting query). Chan and Lewenstein [11] showed a data structure that can be constructed in truly subquadratic expected time and answers existence queries in truly sublinear time for a constant-sized alphabet (deterministic constructions for very small alphabets were also shown). Amir et al. [3] showed under a 3SUM-hardness assumption that jumbled indexing with existence queries requires $\Omega(n^{2-\varepsilon})$ preprocessing time or $\Omega(n^{1-\delta})$ query time for any $\varepsilon,\delta>0$ for an alphabet of size $\omega(1)$ . They also provided particular constants $\varepsilon_{\sigma},\delta_{\sigma}$ for an alphabet of a constant size $\sigma\geq 3$ such that, under a stronger 3SUM-hardness assumption, jumbled indexing requires $\Omega(n^{2-\varepsilon_{\sigma}})$ preprocessing time or $\Omega(n^{1-\delta_{\sigma}})$ query time. We use the techniques from both results in our algorithm and conditional lower bound for Ab-squares, respectively. The lower bound of Amir et al. was later improved and extended to both existence and reporting variants and any constant $\sigma\geq 2$ by Goldstein et al. [23, Section 7] with the aid of randomization. Moreover, recently an unconditional lower bound for the reporting variant was given in [1].

Our techniques.

A subsequence of three distinct positions is a 3-term double arithmetic progression (3dap in short) if the positions form an arithmetic progression and the elements at these positions also form an arithmetic progression. The problem of finding a 3dap in a sequence is denoted by $\operatorname{\textsc{3DAP}}$ . A 3dap is an odd 3dap if its first and third positions are odd and its middle position is even. The corresponding problem is denoted by $\operatorname{\textsc{Odd\mbox{-}3DAP}}$ . First we reduce the convolution version of 3SUM (known to be 3SUM-hard) to the $\operatorname{\textsc{3DAP}}$ problem via $\operatorname{\textsc{Odd\mbox{-}3DAP}}$ as an intermediate problem. This uses a divide-and-conquer approach and a partition of sets into sets avoiding bad arithmetic progressions of length 3.

The $\operatorname{\textsc{3DAP}}$ problem reduces in a simple way to detection of an additive square, showing that the latter problem is 3SUM-hard.

Next, the $\operatorname{\textsc{3DAP}}$ problem is encoded as a string. We follow the high-level idea from Amir et al. Instead of checking equality of numbers, we can check equality of their remainders modulo sufficiently many prime numbers. Then, each prime number corresponds to a distinct character. If the numbers are $\mathrm{poly}\,n$ then only $\mathcal{O}(\log n)$ prime numbers are needed. However, there is a certain technical complication, already present in the paper of Amir et al., which requires introducing additional gadgets working as equalizers. The details differ from the construction of Amir et al., mostly because in the end we want to ask about detection, not indexing.

Then we consider the problem of computing all centers of Ab-squares; this requires new gadgets. We show that computing all centers of Ab-squares is 3SUM-hard, as well as detection of any Ab-square which is well centered.

Later we extend this to detection of any odd Ab-square. We use a construction of a string over the alphabet of size 4 with no Ab-square. The input string is “shuffled” with such a string, with some separators added. This forces odd Ab-squares to be well centered. In this way we reduce the previously considered problem of detection of any well-centered Ab-square to the detection of any odd Ab-square. Ultimately, this shows that the latter problem is 3SUM-hard.

2 From $\operatorname{\textsc{Conv3SUM}}$ to finding double 3-term arithmetic progressions

For integers $a,b$ , by $[a,b]$ we denote the set $\{a,\dots,b\}$ . We use the following convolution variant of the $\operatorname{\textsc{3SUM}}$ problem that is $\operatorname{\textsc{3SUM}}$ -hard; see [10, 31, 35] for both randomized and deterministic reductions. As already noted in [3], the range of elements can be made $[-N^{2},N^{2}]$ using a randomized hashing reduction from [5, 35].

$\operatorname{\textsc{Conv3SUM}}(\bar{x})$

Input: A sequence $\bar{x}=[x_{1},\dots,x_{N}]$ with each $x_{i}\in[-N^{2},N^{2}]$ .

Output: Yes if there are $i\neq j$ such that $x_{i}+x_{j}=x_{i+j}$ ; no otherwise.
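As a point of reference, the problem statement translates directly into a quadratic-time check. This is a minimal sketch for experimenting with small inputs; the name `conv3sum` and the convention of storing $x_{1},\dots,x_{N}$ in a 0-based Python list are our assumptions.

```python
def conv3sum(x):
    """Brute-force Conv3SUM: x[k - 1] holds x_k for k = 1..N."""
    n = len(x)
    for i in range(1, n + 1):
        # By symmetry it suffices to take i < j; we also need i + j <= N.
        for j in range(i + 1, n - i + 1):
            if x[i - 1] + x[j - 1] == x[i + j - 1]:
                return True
    return False
```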

Let us denote $\mathit{mid}(a,b)=(a+b)/2$ and define the condition:

\[
\operatorname{\mathbf{\Lambda}}_{\bar{x}}(i,j)\;=\;(\,i\neq j\ \land\ j-i\ \text{is even}\ \land\ x_{\mathit{mid}(i,j)}=\mathit{mid}(x_{i},x_{j})\,).
\]

We omit the subscript ${\bar{x}}$ if it is clear from the context. The last part of the condition is equivalent to $x_{j}-x_{\mathit{mid}(i,j)}=x_{\mathit{mid}(i,j)}-x_{i}$ .

Our first goal is to reduce the $\operatorname{\textsc{Conv3SUM}}$ problem to the following one with $K=N^{\mathcal{O}(1)}$ .

Double 3-Term Arithmetic Progression, $\operatorname{\textsc{3DAP}}(\bar{x})$

Input: $\bar{x}=[x_{1},\dots,x_{n}]$ , where each $x_{i}$ is in $[0,K]$ .

Output: $(\exists\,i,j)\;\operatorname{\mathbf{\Lambda}}(i,j)$ .
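The condition $\mathbf{\Lambda}$ and the resulting problem can be checked naively as follows. This is a sketch for small experiments only (the names `lam` and `three_dap` are ours); the midpoint equation is tested in the form $2x_{\mathit{mid}(i,j)}=x_{i}+x_{j}$ to stay within integers.

```python
def lam(x, i, j):
    """Condition Lambda_x(i, j) for 1-based positions i, j."""
    if i == j or (j - i) % 2 != 0:
        return False
    m = (i + j) // 2
    return 2 * x[m - 1] == x[i - 1] + x[j - 1]

def three_dap(x):
    """Brute-force 3DAP: is Lambda_x(i, j) true for some i < j?"""
    n = len(x)
    return any(lam(x, i, j)
               for i in range(1, n + 1)
               for j in range(i + 2, n + 1, 2))
```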

In Section 2.1 we obtain a reduction of $\operatorname{\textsc{Conv3SUM}}$ to an intermediate version of $\operatorname{\textsc{3DAP}}$ with additional constraints on $i,j$ , and in Section 2.2 we show how these constraints can be avoided.

2.1 From $\operatorname{\textsc{Conv3SUM}}$ to $\operatorname{\textsc{Odd\mbox{-}3DAP}}$

Let us fix an integer sequence $x_{1},\dots,x_{N}$ . For an arithmetic progression (arithmetic sequence) $\mathcal{I}=i_{1},\dots,i_{n}$ , where $1\leq i_{1}<\dots<i_{n}\leq N$ , i.e. $i_{2}-i_{1}=\dots=i_{n}-i_{n-1}$ , we define the following extended functions.

\begin{align*}
\operatorname{\textsc{Conv3SUM}}(\bar{x},\mathcal{I}) &= (\,\exists\,i_{a},i_{b}\in\mathcal{I}\;:\;x_{i_{a}}+x_{i_{b}}=x_{i_{a}+i_{b}},\ i_{a}<i_{b}\,),\\
\operatorname{\textsc{OddConv3SUM}}(\bar{x},\mathcal{I}) &= (\,\exists\,i_{a},i_{b}\in\mathcal{I}\;:\;x_{i_{a}}+x_{i_{b}}=x_{i_{a}+i_{b}},\ a+b\ \mbox{is odd}\,).
\end{align*}

Note that it can happen that $i_{a}+i_{b}\notin\mathcal{I}$ . For a fixed $\bar{x}$ the input size is $|\mathcal{I}|$ .

Lemma 2.1.

An instance of $\operatorname{\textsc{Conv3SUM}}(\bar{x})$ can be reduced to a disjunction of $\mathcal{O}(N)$ instances of $\operatorname{\textsc{OddConv3SUM}}(\bar{x},\mathcal{I})$ of total size $\mathcal{O}(N\log N)$ in $\mathcal{O}(N\log N)$ time.

Proof 2.2.

If $\mathcal{I}=i_{1},\dots,i_{n}$ , by $\mathcal{I}_{\mathit{odd}}$ and $\mathcal{I}_{\mathit{even}}$ we denote the subsequences $i_{1},i_{3},\dots$ and $i_{2},i_{4},\dots$ , respectively. We proceed recursively as shown in the following function $\operatorname{\textsc{Conv3SUM}}$ , with the first call to $\operatorname{\textsc{Conv3SUM}}(\bar{x},[1,2,\ldots,N])$ .

function $\operatorname{\textsc{Conv3SUM}}(\bar{x},\mathcal{I})$
    Comment: $\mathcal{I}$ is an arithmetic progression
    if $|\mathcal{I}|\leq 1$ then return false;
    return $\operatorname{\textsc{OddConv3SUM}}(\bar{x},\mathcal{I})\lor\operatorname{\textsc{Conv3SUM}}(\bar{x},\mathcal{I}_{\mathit{odd}})\lor\operatorname{\textsc{Conv3SUM}}(\bar{x},\mathcal{I}_{\mathit{even}})$ ;

Correctness. Let $\mathcal{I}=i_{1},\dots,i_{n}$ and assume there are two indices $a,b$ such that $x_{i_{a}}+x_{i_{b}}=x_{i_{a}+i_{b}}$ . If $a+b$ is odd, then $\operatorname{\textsc{OddConv3SUM}}(\bar{x},\mathcal{I})$ returns true. Otherwise both $a,b$ are of the same parity, so $i_{a},i_{b}\in\mathcal{I}_{\mathit{odd}}$ or $i_{a},i_{b}\in\mathcal{I}_{\mathit{even}}$ . Consequently, the problem is split recursively into subproblems that correspond to $\mathcal{I}_{\mathit{odd}}$ and $\mathcal{I}_{\mathit{even}}$ .

Complexity. Let us observe that one call to $\operatorname{\textsc{Conv3SUM}}(\bar{x},\mathcal{I})$ creates an instance of $\operatorname{\textsc{OddConv3SUM}}$ of $\mathcal{O}(|\mathcal{I}|)$ size in $\mathcal{O}(|\mathcal{I}|)$ time ( $\bar{x}$ does not change). Let $\#(n)$ and $S(n)$ denote the total number and size of all instances of $\mathcal{I}$ generated by $\operatorname{\textsc{Conv3SUM}}(\bar{x},\mathcal{I})$ , when initially $|\mathcal{I}|=n$ . We then have

\[
\#(n)=\#(\lfloor n/2\rfloor)+\#(\lceil n/2\rceil)+1\quad\mbox{and}\quad S(n)=S(\lfloor n/2\rfloor)+S(\lceil n/2\rceil)+\Theta(n),
\]

which yields $\#(N)=\mathcal{O}(N)$ and $S(N)=\mathcal{O}(N\log N)$ . The reduction takes $\mathcal{O}(S(N))$ time.
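The recursion of Lemma 2.1 can be simulated on small inputs. The sketch below is illustrative only: the function names are ours, the $\operatorname{\textsc{OddConv3SUM}}$ instances are solved by brute force instead of being passed on to a further reduction, and the recursion stops only at progressions with fewer than two elements, so that a two-element progression (which still has one pair with $a+b$ odd) is checked.

```python
def conv3sum(x):
    """Brute-force Conv3SUM; x[k - 1] holds x_k."""
    n = len(x)
    return any(x[i - 1] + x[j - 1] == x[i + j - 1]
               for i in range(1, n + 1) for j in range(i + 1, n - i + 1))

def odd_conv3sum(x, I):
    """Brute-force OddConv3SUM(x, I): a pair of indices a < b in I with a + b odd."""
    n = len(x)
    for a in range(len(I)):
        for b in range(a + 1, len(I), 2):      # b - a odd, hence a + b odd
            s = I[a] + I[b]
            if s <= n and x[I[a] - 1] + x[I[b] - 1] == x[s - 1]:
                return True
    return False

def instances(I):
    """Progressions visited by the recursion: I, then I_odd and I_even recursively."""
    if len(I) <= 1:
        return []
    return [I] + instances(I[0::2]) + instances(I[1::2])

def conv3sum_via_odd_instances(x):
    """Conv3SUM(x) as a disjunction of OddConv3SUM instances."""
    return any(odd_conv3sum(x, I) for I in instances(list(range(1, len(x) + 1))))
```

For $N=8$ the recursion visits 7 progressions of total size 24, in line with the bounds $\#(N)=\mathcal{O}(N)$ and $S(N)=\mathcal{O}(N\log N)$.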

We say that a 3-element arithmetic progression is a good progression if the middle element is even and two others are odd and introduce the following problem.

$\operatorname{\textsc{Odd\mbox{-}3DAP}}(\bar{x})$

Input: $\bar{x}=[x_{1},\dots,x_{n}]$ , where each $x_{i}$ is in $[-\mathcal{O}(N^{2}),\mathcal{O}(N^{2})]$ .

Output: $(\,\exists\,i,j\,)\;[\,\operatorname{\mathbf{\Lambda}}(i,j)$ and $(i,\mathit{mid}(i,j),j)$ is a good progression ].

Lemma 2.3.

$\operatorname{\textsc{OddConv3SUM}}(\bar{x},\mathcal{I})$ is reducible in $\mathcal{O}(|\mathcal{I}|)$ time and space to $\operatorname{\textsc{Odd\mbox{-}3DAP}}(\bar{y})$ , where $|\bar{y}|=\mathcal{O}(|\mathcal{I}|)$ .

Proof 2.4.

Let $\mathcal{I}=i_{1},\dots,i_{n}$ . Define $\alpha_{N}=2N^{2}+1$ and let $\bar{y}$ be a sequence of length $2n-1$ that is created as follows:

1.

put $x_{i_{1}},x_{i_{2}},\dots,x_{i_{n}}$ at subsequent odd positions in $\bar{y}$ ;
2.

at each even position $2j$ , put $x_{i_{j}+i_{j+1}}$ or, if $i_{j}+i_{j+1}>N$ , put $\alpha_{N}$ ;
3.

multiply elements on odd positions by 2.

After the first two steps $\operatorname{\textsc{OddConv3SUM}}(\bar{x},\mathcal{I})$ is equivalent to $(\exists i,j)\;y_{\mathit{mid}(i,j)}=y_{i}+y_{j}$ for odd $i,j$ and even $\mathit{mid}(i,j)$ ; see Figure 1. Then, after the third step, $\operatorname{\textsc{OddConv3SUM}}(\bar{x},\mathcal{I})$ is equivalent to $\operatorname{\textsc{Odd\mbox{-}3DAP}}(\bar{y})$ .

Figure 1: The sequence constructed in Lemma 2.3 for $\bar{x}\,=\,[x_{1},x_{2},x_{3},\dots,x_{11}]$ and $\mathcal{I}=(1,2,\ldots,11)$ after the first two steps ( $*$ denotes $\alpha_{11}$ ). Note that the elements connected by arcs all have their sum of indices equal to $7$ ; this is because $\mathcal{I}$ is an arithmetic progression.
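The construction of Lemma 2.3 can be sketched as follows. The names `build_y`, `odd_3dap` and `odd_conv3sum` are ours, and both problems are solved by brute force. In this sketch the factor 2 is applied to the entries $x_{i_{1}},\dots,x_{i_{n}}$ stored at odd positions, which is the scaling that turns $y_{\mathit{mid}(i,j)}=y_{i}+y_{j}$ into the midpoint condition $y_{\mathit{mid}(i,j)}=\mathit{mid}(y_{i},y_{j})$ ; the filler $\alpha_{N}$ is left unscaled.

```python
def odd_conv3sum(x, I):
    """Brute-force OddConv3SUM(x, I)."""
    n = len(x)
    for a in range(len(I)):
        for b in range(a + 1, len(I), 2):          # a + b odd
            s = I[a] + I[b]
            if s <= n and x[I[a] - 1] + x[I[b] - 1] == x[s - 1]:
                return True
    return False

def build_y(x, I):
    """Sequence of length 2|I| - 1 for an arithmetic progression I of indices."""
    n = len(x)
    alpha = 2 * n * n + 1        # exceeds every sum of two elements of [-N^2, N^2]
    y = []
    for j, idx in enumerate(I):
        y.append(2 * x[idx - 1])                   # odd position, doubled
        if j + 1 < len(I):
            s = idx + I[j + 1]
            y.append(x[s - 1] if s <= n else alpha)  # even position
    return y

def odd_3dap(y):
    """Brute-force Odd-3DAP: i, j odd, mid(i, j) even, 2 * y_mid = y_i + y_j."""
    n = len(y)
    for i in range(1, n + 1, 2):
        for j in range(i + 2, n + 1, 4):           # j odd and (i + j) / 2 even
            if 2 * y[(i + j) // 2 - 1] == y[i - 1] + y[j - 1]:
                return True
    return False
```

For example, for $\bar{x}=[3,1,4,7]$ and $\mathcal{I}=(1,2,3,4)$ the sketch produces $[6,4,2,33,8,33,14]$ , where $33=\alpha_{4}$ .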

2.2 From $\operatorname{\textsc{Odd\mbox{-}3DAP}}$ to $\operatorname{\textsc{3DAP}}$

Our main tool in this subsection is partitioning a set of integers into progression-free sets. A set of integers $A$ is called progression-free if it does not contain a non-constant three-element arithmetic progression. We use the following recent result that extends a classical paper of Behrend [6].

Theorem 2.5 ([14]).

Any set $A\subseteq[1,n]$ can be partitioned into $n^{o(1)}$ progression-free sets in $n^{1+o(1)}$ time.

Lemma 2.6.

We can construct in $n^{1+o(1)}$ time a family $\mathcal{F}$ of $n^{o(1)}$ subsets of $[1,n]$ satisfying:

(a)

Each good 3-element progression is contained in some $S\in\mathcal{F}$ .
(b)

If $S\in\mathcal{F}$ , then all 3-element arithmetic progressions in $S$ are good.

Proof 2.7.

Let us divide the elements from $[1,n]$ into three classes:

\begin{align*}
\mathsf{BLUE} &= \{i\leq n\,:\,i\text{ is even}\,\},\\
\mathsf{RED} &= \{i\leq n\,:\,i\bmod 4=1\},\quad\mathsf{GREEN}=\{i\leq n\,:\,i\bmod 4=3\}.
\end{align*}

Each element $i\in[1,n]$ has the colour blue, red or green of its corresponding class. Each class forms an arithmetic progression.

A progression is called multi-chromatic if its elements are of three distinct colours. Let us observe that a 3-element progression is good if and only if it is multi-chromatic. Indeed, this is because if $i,j\in\mathsf{RED}$ (or $\mathsf{GREEN}$ ), then $\mathit{mid}(i,j)$ is odd.

Now instead of good progressions we will deal with multi-chromatic progressions. We treat sets of integers as increasing sequences and for a set $C=\{c_{1},\dots,c_{m}\}$ we denote by $C_{\mathit{odd}}$ and $C_{\mathit{even}}$ the subsets $\{c_{1},c_{3},\dots\}$ and $\{c_{2},c_{4},\dots\}$ .

For example, $\mathsf{BLUE}_{\mathit{odd}}=\{i\leq n\,:\,i\bmod 4=2\}$ and $\mathsf{RED}_{\mathit{even}}=\{i\leq n\,:\,i\bmod 8=5\}$ .

Our construction works as follows:

1.

Partition the set $[1,n]$ into classes $\mathsf{BLUE},\mathsf{RED},\mathsf{GREEN}$ .
2.

For each class $C\in\{\mathsf{BLUE},\mathsf{RED},\mathsf{GREEN}\}$ partition it in $n^{1+o(1)}$ time into a family $\mathcal{F}_{C}$ of $n^{o(1)}$ progression-free sets with the use of Theorem 2.5.
3.

Refine each partition $\mathcal{F}_{C}$ , splitting each set $X\in\mathcal{F}_{C}$ into two sets $X\cap C_{\mathit{odd}}$ , $X\cap C_{\mathit{even}}$ , so that for each set $X$ in the new refined partition $\mathcal{F}_{C}$ we have $X\subseteq C_{\mathit{odd}}$ or $X\subseteq C_{\mathit{even}}$ . Each family is still of size $n^{o(1)}$ .
4.

Return $\mathcal{F}\,=\,\{\,X\cup Y\cup Z\;:\;X\in\mathcal{F}_{\mathsf{BLUE}},Y\in\mathcal{F}_{\mathsf{RED}},Z\in\mathcal{F}_{\mathsf{GREEN}}\,\}$ .

Proof of point (a). Each multi-chromatic progression is contained in some $S\in\mathcal{F}$ since each element of $C$ is contained in a set from $\mathcal{F}_{C}$ .

Proof of point (b). The proof is by contradiction. Assume that $S\in\mathcal{F}$ contains a progression which is not multi-chromatic. There are two cases.

Case 1: the progression is monochromatic. Then it is contained in a single set $X\in\mathcal{F}_{C}$ . However, every such $X$ is progression-free (step 2), hence such a progression cannot appear in any $S\in\mathcal{F}$ ; a contradiction.

Case 2: the progression contains exactly two different colours. Each class $C$ consists of the elements of $[1,n]$ with a fixed remainder $r$ modulo some $p\in\{2,4\}$ . Observe that if $i\bmod p=\mathit{mid}(i,j)\bmod p=r$ , then $j\bmod p=r$ (that is, if the middle element of the progression belongs to the same class as one of the other elements, then the triple is monochromatic). Hence the two-coloured arithmetic progression has to consist of $i,j\in C$ and $\mathit{mid}(i,j)\notin C$ .

Since $i,j$ both belong to $C_{\mathit{odd}}$ or both belong to $C_{\mathit{even}}$ (step 3), $\mathit{mid}(i,j)$ must belong to $C$ (if $i\bmod 2p=j\bmod 2p$ , then $i\bmod p=\mathit{mid}(i,j)\bmod p$ ). Consequently, the progression cannot contain exactly two colours; a contradiction.
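The four steps of the construction can be tried out on small $n$ . The sketch below makes one important substitution, which is our assumption and not part of the paper: instead of the partition of Theorem 2.5, it uses a simple greedy partition into progression-free sets, which gives no $n^{o(1)}$ guarantee on the number of parts but suffices to test properties (a) and (b). All function names are hypothetical.

```python
from itertools import product

def progression_free_partition(elems):
    """Greedy stand-in for Theorem 2.5 (no bound on the number of parts)."""
    parts = []
    for e in sorted(elems):
        for part in parts:
            # e is the largest element so far, so it can only end a progression.
            if not any(2 * b - a == e for a in part for b in part if a < b):
                part.append(e)
                break
        else:
            parts.append([e])
    return parts

def good_family(n):
    classes = [[i for i in range(1, n + 1) if i % 2 == 0],   # BLUE
               [i for i in range(1, n + 1) if i % 4 == 1],   # RED
               [i for i in range(1, n + 1) if i % 4 == 3]]   # GREEN
    refined = []
    for C in classes:
        c_odd = set(C[0::2])                                 # C_odd
        fam = []
        for X in progression_free_partition(C):              # step 2
            fam.append([v for v in X if v in c_odd])          # step 3
            fam.append([v for v in X if v not in c_odd])
        refined.append(fam)
    return [set(X) | set(Y) | set(Z) for X, Y, Z in product(*refined)]  # step 4

def check_family(n):
    """Return (property (a) holds, property (b) holds) for the family on [1, n]."""
    F = good_family(n)
    triples = [(i, (i + j) // 2, j)
               for i in range(1, n + 1) for j in range(i + 2, n + 1, 2)]
    good = lambda t: t[0] % 2 == 1 and t[1] % 2 == 0 and t[2] % 2 == 1
    a_holds = all(any(set(t) <= S for S in F) for t in triples if good(t))
    b_holds = all(good(t) for S in F for t in triples if set(t) <= S)
    return a_holds, b_holds
```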

Our next tool is a deactivation of the elements whose indices are not in a given set $E$ , that is, omitting them in the computation of a solution. For $E\subseteq[1,n]$ the operation $\mathit{restr}(\bar{x},E)$ replaces each element $x_{i}$ at position $i\notin E$ by $5\max\{\mathit{MAX},n^{2}\}+i^{2}$ , where $\mathit{MAX}=\max_{k\geq 1}\,|x_{k}|$ .

Lemma 2.8.

$\operatorname{\textsc{3DAP}}(\mathit{restr}(\bar{x},E))\iff(\exists\,i,j)\;\operatorname{\mathbf{\mathbf{\mathbf{\Lambda}}}}_{\bar{x}}(i,j)\;\land\ i,j,\mathit{mid}(i,j)\in E$ .

Proof 2.9.

The $(\Leftarrow)$ part is obvious, so it suffices to show $(\Rightarrow)$ . If at least one, but not all, of $i,j,\mathit{mid}(i,j)$ is not in $E$ , then it can be checked that $\operatorname{\mathbf{\Lambda}}_{\bar{y}}(i,j)$ cannot hold for $\bar{y}=\mathit{restr}(\bar{x},E)$ because $y_{\mathit{mid}(i,j)}$ and $\mathit{mid}(y_{i},y_{j})$ differ by at least $M:=\max\{\max_{k}\{|x_{k}|\},n^{2}\}$ . Indeed, there are seven possible cases:

1.

$i,\mathit{mid}(i,j),j\notin E$ , then $\mathit{mid}(y_{i},y_{j})-y_{\mathit{mid}(i,j)}=\frac{i^{2}+j^{2}}{2}-(\frac{i+j}{2})^{2}=(\frac{i-j}{2})^{2}>0\ \mbox{since}\ i\neq j$
2.

$i\in E$ , $\mathit{mid}(i,j),j\notin E$ , then $\mathit{mid}(y_{i},y_{j})-y_{\mathit{mid}(i,j)}=-\frac{5}{2}M+\frac{j^{2}}{2}-(\frac{i+j}{2})^{2}+\frac{x_{i}}{2}\leq-2M+\frac{x_{i}}{2}\leq-M$
3.

$j\in E$ , $i,\mathit{mid}(i,j)\notin E$ works as the previous case
4.

$\mathit{mid}(i,j)\in E$ , $i,j\notin E$ , then $\mathit{mid}(y_{i},y_{j})-y_{\mathit{mid}(i,j)}=5M+\frac{i^{2}+j^{2}}{2}-x_{\mathit{mid}(i,j)}\geq 4M$
5.

$i,\mathit{mid}(i,j)\in E$ , $j\notin E$ , then $\mathit{mid}(y_{i},y_{j})-y_{\mathit{mid}(i,j)}=\frac{5}{2}M+\frac{j^{2}}{2}+\frac{x_{i}}{2}-x_{\mathit{mid}(i,j)}\geq M$
6.

$\mathit{mid}(i,j),j\in E$ , $i\notin E$ works as the previous case
7.

$i,j\in E$ , $\mathit{mid}(i,j)\notin E$ , then $\mathit{mid}(y_{i},y_{j})-y_{\mathit{mid}(i,j)}=\frac{x_{i}+x_{j}}{2}-5M-(\frac{i+j}{2})^{2}\leq-3M$ .

Hence, apart from the first case, where none of the indices belongs to $E$ , the absolute value of difference is at least $M$ .

Otherwise, if none of the positions $i,j,\mathit{mid}(i,j)$ is in $E$ , then $\operatorname{\mathbf{\Lambda}}_{\bar{y}}(i,j)$ does not hold because $\mathit{mid}(i^{2},j^{2})-(\frac{i+j}{2})^{2}=(\frac{i-j}{2})^{2}>0\ \mbox{since}\ i\neq j.$
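Lemma 2.8 is easy to test exhaustively on short sequences. The following sketch (function names are ours) implements $\mathit{restr}$ exactly as defined above and compares both sides of the equivalence by brute force.

```python
def lam(x, i, j):
    """Condition Lambda_x(i, j) for 1-based positions i < j."""
    if i == j or (j - i) % 2 != 0:
        return False
    return 2 * x[(i + j) // 2 - 1] == x[i - 1] + x[j - 1]

def three_dap(x):
    n = len(x)
    return any(lam(x, i, j) for i in range(1, n + 1) for j in range(i + 2, n + 1, 2))

def restr(x, E):
    """Replace x_i for i not in E by 5 * max(MAX, n^2) + i^2."""
    n = len(x)
    big = 5 * max(max(abs(v) for v in x), n * n)
    return [x[i - 1] if i in E else big + i * i for i in range(1, n + 1)]

def three_dap_within(x, E):
    """Right-hand side of Lemma 2.8: a solution with i, j, mid(i, j) all in E."""
    n = len(x)
    return any(lam(x, i, j) and {i, (i + j) // 2, j} <= E
               for i in range(1, n + 1) for j in range(i + 2, n + 1, 2))
```

For example, $[1,5,9,0,0]$ contains the progression at positions $1,2,3$ , but after restricting to $E=\{1,3,4,5\}$ position 2 is deactivated and no solution remains.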

An instance $\bar{x}$ is called an odd-half instance if $\operatorname{\mathbf{\Lambda}}(i,j)$ is false for $i,j$ such that $(j-i)/2$ is even (equivalently, for $i,j$ such that $i$ and $\mathit{mid}(i,j)$ have the same parity). The equivalence

\[
\operatorname{\textsc{Odd\mbox{-}3DAP}}(\bar{x})\ \Leftrightarrow\ (\exists\,S\in\mathcal{F}\,)\;\operatorname{\textsc{3DAP}}(\mathit{restr}(\bar{x},S))
\]

now follows from Lemmas 2.6 and 2.8; it is efficient because $\mathcal{F}$ consists of only $n^{o(1)}$ sets that are constructed in $n^{1+o(1)}$ time.

This produces only odd-half instances because only good progressions are left in the construction of Lemma 2.6. The instances have elements in $[-\mathcal{O}(N^{2}),\mathcal{O}(N^{2})]$ . We can increase all the elements by $\mathcal{O}(N^{2})$ so that they become non-negative. This implies:

Lemma 2.10.

An instance of $\operatorname{\textsc{Odd\mbox{-}3DAP}}$ can be reduced in $n^{1+o(1)}$ time to $n^{o(1)}$ odd-half instances of $\operatorname{\textsc{3DAP}}$ of total size $n^{1+o(1)}$ and with elements up to $K=\mathcal{O}(N^{2})$ .

Finally, we show that the resulting instances can be glued together to a single equivalent one.

Theorem 2.11.

An instance of $\operatorname{\textsc{Conv3SUM}}$ can be reduced in $N^{1+o(1)}$ time to an odd-half instance of $\operatorname{\textsc{3DAP}}$ of size $n=N^{1+o(1)}$ with elements up to $K=N^{3+o(1)}$ .

Proof 2.12.

With Lemmas 2.1, 2.3 and 2.10 we obtain a reduction from $\operatorname{\textsc{Conv3SUM}}$ to $N^{1+o(1)}$ odd-half instances of $\operatorname{\textsc{3DAP}}$ of total size $N^{1+o(1)}$ . The instances have elements in $[0,\mathcal{O}(N^{2})]$ . We will show that these instances can be reduced to a single odd-half instance of $\operatorname{\textsc{3DAP}}$ of size $N^{1+o(1)}$ with elements in the range $[0,N^{3+o(1)}]$ in time $N^{1+o(1)}$ . The resulting instance will return true if and only if at least one of the input instances does.

Let $t=N^{1+o(1)}$ be the number of the instances of $\operatorname{\textsc{3DAP}}$ , numbered $1$ through $t$ . We use Theorem 2.5\footnote{Actually, a deterministic version of Behrend’s construction from [14] or an earlier construction of Salem and Spencer [38] would suffice here.} and pick the largest constructed progression-free set $A\subseteq[1,m]$ , for some $m$ . By the pigeonhole principle, $|A|\geq m^{1-o(1)}$ . We select $m$ that is large enough so that $m^{1-o(1)}\geq t$ , so $m=t^{1+o(1)}=N^{1+o(1)}$ , and trim the set $A$ to the size $t$ . Let $A=\{a_{1},\dots,a_{t}\}$ . For the $i$ th instance, we multiply all its elements by $2m$ and add the value $a_{i}$ to each of them. Finally we concatenate all the instances.

If any of the input instances returns true, then so does the output instance, since multiplying all elements by the same positive number and adding the same number to all of them cannot affect the outcome of a single instance. If none of the input instances returns true, then the only possibility for the output instance to return true is to contain a 3-element arithmetic progression with elements from multiple parts corresponding to the input instances. However, this is impossible since, taken modulo $2m$ , the elements of the progression would form a non-constant arithmetic progression in the set $A$ .
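The gluing step can be illustrated on toy instances. In the sketch below the progression-free set $A$ is produced greedily (an assumption made for simplicity; it replaces the set obtained from Theorem 2.5 and gives no density guarantee), $m$ is taken as $\max A$ , and the names `greedy_progression_free` and `glue` are ours. The input instances are assumed to be non-empty lists of non-negative integers.

```python
def three_dap(x):
    """Brute-force 3DAP."""
    n = len(x)
    return any(2 * x[(i + j) // 2 - 1] == x[i - 1] + x[j - 1]
               for i in range(1, n + 1) for j in range(i + 2, n + 1, 2))

def greedy_progression_free(t):
    """First t elements of the greedy progression-free sequence 1, 2, 4, 5, ..."""
    A, c = [], 1
    while len(A) < t:
        if not any(2 * b - a == c for a in A for b in A if a < b):
            A.append(c)
        c += 1
    return A

def glue(instances):
    """Scale the i-th instance by 2m, shift it by a_i, and concatenate."""
    A = greedy_progression_free(len(instances))
    m = max(A)
    out = []
    for inst, a in zip(instances, A):
        out.extend(2 * m * v + a for v in inst)
    return out
```

For three instances the sketch uses $A=\{1,2,4\}$ and $m=4$ , so, e.g., $[0,2,4]$ as the second instance becomes $[2,18,34]$ .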

Corollary 2.13.

The general $\operatorname{\textsc{3DAP}}$ problem is also $\operatorname{\textsc{3SUM}}$ -hard.

Remark 2.14.

Similarly to $\operatorname{\textsc{Conv3SUM}}$ , techniques from [5] can be used to hash down the range in $\operatorname{\textsc{3DAP}}$ to integers of magnitude $\mathcal{O}(N^{2})$ (cf. [3]), using randomization.

Remark 2.15.

The AVERAGE problem (introduced by J. Erickson [17]) asks if there are distinct elements $a,b,c\in S$ such that $a+b=2c$ for a given set $S$ of $n$ integers. It was recently shown to be $\operatorname{\textsc{3SUM}}$ -hard [14]. The $\operatorname{\textsc{3DAP}}$ problem can be viewed as a convolution version of the AVERAGE problem\footnote{\url{https://cs.stackexchange.com/questions/10681/is-detecting-doubly-arithmetic-progressions-3sum-hard/10725#10725}}. The ideas based on almost linear hashing used in the reductions from $\operatorname{\textsc{3SUM}}$ to $\operatorname{\textsc{Conv3SUM}}$ [35, 10] can be extended with some effort to reduce AVERAGE to $\operatorname{\textsc{3DAP}}$ . We presented a different reduction that additionally directly leads to an instance of $\operatorname{\textsc{3DAP}}$ with an odd-half property, which is essential in our proof of $\operatorname{\textsc{3SUM}}$ -hardness of computing Ab-squares (see the proof of Lemma 4.2).

2.3 Hardness of detecting additive squares

If the alphabet is a set of integers, then a string $W$ is called an additive square if $W=UV$ , where $|U|=|V|$ and $\sum_{i=1}^{|U|}\,U[i]=\sum_{i=1}^{|V|}\,V[i]$ .

Theorem 2.16.

Finding an additive square in a length- $N$ sequence composed of integers of magnitude $N^{\mathcal{O}(1)}$ is $\operatorname{\textsc{3SUM}}$ -hard.

Proof 2.17.

We use Theorem 2.11 to reduce $\operatorname{\textsc{Conv3SUM}}$ to an instance of $\operatorname{\textsc{3DAP}}$ of size $n=N^{1+o(1)}$ with elements in the requested range. $\operatorname{\textsc{3DAP}}$ returns true on an instance $x_{1},\dots,x_{n}$ if and only if the sequence $x_{2}-x_{1},x_{3}-x_{2},\dots,x_{n}-x_{n-1}$ contains an additive square. As the reduction works in $N^{1+o(1)}$ total time, the conclusion follows.
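The equivalence used in this proof follows from telescoping: the sum of the differences between positions $i$ and $\mathit{mid}(i,j)$ equals $x_{\mathit{mid}(i,j)}-x_{i}$ , and similarly for the second half. A small brute-force sketch (function names are ours) makes this testable:

```python
def three_dap(x):
    """Brute-force 3DAP."""
    n = len(x)
    return any(2 * x[(i + j) // 2 - 1] == x[i - 1] + x[j - 1]
               for i in range(1, n + 1) for j in range(i + 2, n + 1, 2))

def differences(x):
    """The sequence x_2 - x_1, x_3 - x_2, ..., x_n - x_{n-1}."""
    return [b - a for a, b in zip(x, x[1:])]

def has_additive_square(w):
    """Brute-force detection of an additive square factor using prefix sums."""
    n = len(w)
    prefix = [0]
    for v in w:
        prefix.append(prefix[-1] + v)
    for h in range(1, n // 2 + 1):             # half-length
        for s in range(n - 2 * h + 1):         # 0-based start
            if prefix[s + h] - prefix[s] == prefix[s + 2 * h] - prefix[s + h]:
                return True
    return False
```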

3 From arithmetics to Abelian stringology

We use capital letters to denote strings and lower case Greek letters to denote sets of integers. We assume that the positions in a string $S$ are numbered 1 through $|S|$ , where $|S|$ denotes the length of $S$ . By $S[i]$ and $S[i..j]$ we denote the $i$ th letter of $S$ and the string $S[i]\cdots S[j]$ called a factor of $S$ . The reverse of string $S$ , i.e. the string $S[|S|]\cdots S[1]$ , is denoted as $S^{R}$ . By $\varepsilon$ we denote the empty string. By $\mathit{Alph}(S)$ we denote the set of distinct letters in $S$ .

We denote Ab-equivalence of $U$ and $V$ by $U\cong V$ . For a string $U$ , by $\mathit{Parikh}(U)$ we denote the Parikh vector of $U$ . Then $U\cong V$ if and only if $\mathit{Alph}(U)=\mathit{Alph}(V)$ and $\mathit{Parikh}(U)=\mathit{Parikh}(V)$ .

We use an encoding of Amir et al. [3] based on the Chinese remainder theorem to connect $\operatorname{\textsc{Conv3SUM}}$ -type problems with Abelian stringology.

Let $p_{1}<p_{2}<\dots<p_{k}$ be prime numbers. The Chinese remainder theorem states that if one knows the remainders $r_{1},r_{2},\dots,r_{k}$ of an integer $x$ , such that $0\leq x<\prod\,p_{i}$ , when dividing by $p_{i}$ ’s, then one can uniquely determine $x$ . Assuming that the remainders of an integer $x$ are $r_{1},r_{2},\dots,r_{k}$ , we could encode $x$ as a possibly short string $\mathtt{a}_{1}^{r_{1}}\mathtt{a}_{2}^{r_{2}}\cdots\mathtt{a}_{k}^{r_{k}}$ over an alphabet $\{\mathtt{a}_{1},\mathtt{a}_{2},\ldots,\mathtt{a}_{k}\}$ (the symbols correspond to consecutive prime numbers).

For example for primes 2,3,5 the encoding of 11 would be $\mathtt{a}_{1}^{1}\mathtt{a}_{2}^{2}\mathtt{a}_{3}^{1}$ since its remainders modulo 2,3,5 are 1,2,1, respectively. However, we are interested in encodings of subtractions of one number from another one, and it is more complicated.
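The encoding of a single integer can be sketched in a few lines of Python. This is an illustrative sketch only; the names `remainders` and `encode` are hypothetical, and the letters $\mathtt{a}_{j}$ are represented by the tokens `"a1"`, `"a2"`, and so on.

```python
def remainders(x, primes):
    """Remainders of x modulo each prime, in the order of the primes."""
    return [x % p for p in primes]


def encode(x, primes):
    """Encode x as a list of tokens: 'a_j' repeated (x mod p_j) times."""
    word = []
    for j, r in enumerate(remainders(x, primes), start=1):
        word.extend([f"a{j}"] * r)
    return word


# The example from the text: primes 2, 3, 5 and x = 11.
example = encode(11, [2, 3, 5])
```

For the primes 2, 3, 5 all integers in $[0,30)$ receive pairwise distinct remainder vectors, which is the uniqueness guaranteed by the Chinese remainder theorem.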

Let $\bar{x}=[x_{1},\dots,x_{n}]$ be an instance of $\operatorname{\textsc{3DAP}}$ and $r_{1}^{(i)},r_{2}^{(i)},\dots,r_{k}^{(i)}$ be remainders of $x_{i}$ modulo $p_{1},p_{2},\ldots,p_{k}$ . Like Amir et al. [3], we define for $1\leq i<n$ and $1\leq j\leq k$ ,

\mathit{EXP}_{i}(j)=r_{j}^{(i+1)}-r_{j}^{(i)}+d\text{ where }d=\max_{j=1}^{k}p_{j},\ \ \SS_{i}\;=\;\mathtt{a}_{1}^{\mathit{EXP}_{i}(1)}\,\mathtt{a}_{2}^{\mathit{EXP}_{i}(2)}\cdots\mathtt{a}_{k}^{\mathit{EXP}_{i}(k)}.

We choose a sequence $p_{1},\dots,p_{k}$ of $k$ distinct primes such that $p_{1}\cdots p_{k}>\max\{x_{i}\}$ . In this way we encode the difference $x_{j}-x_{i}$ , for $j>i$ , by a string $\SS_{i}\,\SS_{i+1}\cdots\SS_{j-1}$ . An obstacle is that $(a\bmod p)-(b\bmod p)$ and $(a-b)\bmod p$ may differ. For example

(4\ \text{mod}\ 3)-(2\ \text{mod}\ 3)\,=\,((4-2)\ \text{mod}\ 3)\,-\,3.

However a small correction is sufficient, due to the following observation.

Observation.

$(a\bmod p)-(b\bmod p)+q\,=\,(a-b)\bmod p$ , where $q\in\{0,p\}$ .
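The correction term $q$ can be computed directly, which gives a quick sanity check of the observation. The helper name `correction` is hypothetical; the sketch relies on Python's `%` operator returning a non-negative result for a positive modulus.

```python
def correction(a, b, p):
    """Return q such that (a mod p) - (b mod p) + q == (a - b) mod p."""
    return (a - b) % p - (a % p - b % p)


# The example from the text: a = 4, b = 2, p = 3 needs the correction q = p.
q_example = correction(4, 2, 3)
```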

If we apply the encoding to an instance $\bar{x}=x_{1},\dots,x_{n}$ of $\operatorname{\textsc{3DAP}}$ , we obtain a lemma that is analogous to [3, Lemma 1].

Lemma 3.1.

$\operatorname{\mathbf{\mathbf{\mathbf{\Lambda}}}}(i,j)$ holds for $i<j$ , $j-i$ even, iff for each $t\in[1,k]$ , there are $e_{t},f_{t}\in\{0,p_{t}\}$ , such that

e_{t}+\mathit{EXP}_{i}(t)+\mathit{EXP}_{i+1}(t)+\dots+\mathit{EXP}_{\mathit{mid}(i,j)-1}(t)\,=\,

\hskip 28.45274pt\mathit{EXP}_{\mathit{mid}(i,j)}(t)+\mathit{EXP}_{\mathit{mid}(i,j)+1}(t)+\cdots+\mathit{EXP}_{j-1}(t)+f_{t}.

Let $\Psi$ be a morphism such that $\Psi(i)=\mathtt{a}_{i}^{p_{i}}$ for each $i=1,\dots,k$ . We treat a set $U=\{u_{1},\dots,u_{w}\}$ as a string $u_{1}\cdots u_{w}$ , where $u_{1}<u_{2}<\ldots<u_{w}$ . If we interpret the vector $(\mathit{EXP}_{i}(1),\mathit{EXP}_{i}(2),\dots,\mathit{EXP}_{i}(k))$ as $\SS_{i}$ , then Lemma 3.1 directly implies the following fact.

Lemma 1.

Assume $i<j$ and $j-i$ is even. Then

\operatorname{\mathbf{\mathbf{\mathbf{\Lambda}}}}(i,j)\ \iff\ (\,\Psi(\alpha)\,\SS_{i}\SS_{i+1}\cdots\SS_{\mathit{mid}(i,j)-1}\cong\SS_{\mathit{mid}(i,j)}\cdots\SS_{j-1}\;\Psi(\beta)\,)

for some disjoint subsets $\alpha,\beta$ of $[1,k]$ .
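The statement can be tested on small instances with the following Python sketch. It is not the authors' code: the names are hypothetical, letters $\mathtt{a}_{t}$ are represented by the integers $t$, and $\operatorname{\mathbf{\Lambda}}(i,j)$ is assumed to mean $x_{i}+x_{j}=2x_{\mathit{mid}(i,j)}$. For the brute-force comparison the sketch also assumes that the product of the primes exceeds $2\max\{x_{i}\}$, a conservative choice made here so that a vanishing residue modulo every prime forces equality over the integers.

```python
from collections import Counter


def exp_strings(xs, primes):
    """Strings S_1, ..., S_{n-1}; letter t is repeated EXP_i(t) times."""
    d = max(primes)
    strings = []
    for a, b in zip(xs, xs[1:]):
        s = []
        for t, p in enumerate(primes, start=1):
            s.extend([t] * (b % p - a % p + d))
        strings.append(s)
    return strings


def equalizers(xs, primes, i, j):
    """Return disjoint (alpha, beta) with Psi(alpha) S_i..S_{mid-1} Ab-equivalent
    to S_mid..S_{j-1} Psi(beta), or None if no such pair exists.
    Indices i < j are 1-based and j - i is assumed to be even."""
    strings = exp_strings(xs, primes)
    mid = (i + j) // 2
    left = Counter(c for s in strings[i - 1:mid - 1] for c in s)
    right = Counter(c for s in strings[mid - 1:j - 1] for c in s)
    alpha, beta = set(), set()
    for t, p in enumerate(primes, start=1):
        gap = right[t] - left[t]
        if gap == p:
            alpha.add(t)    # the left side needs p_t extra copies of a_t
        elif gap == -p:
            beta.add(t)     # the right side needs p_t extra copies of a_t
        elif gap != 0:
            return None     # no choice of equalizers can balance letter a_t
    return alpha, beta
```

For $\bar{x}=[3,10,17,\dots]$ and primes 3, 5, 7 the pair $(1,3)$ is a solution, and the sketch reports the equalizer $\alpha=\{2\}$, $\beta=\emptyset$.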

4 Hardness of computing all centers of Ab-squares

We construct a text $T$ over the alphabet $\{\mathtt{a}_{1},\dots,\mathtt{a}_{k},\mathtt{b},\bullet,\star,\#,\$\}$ such that $\operatorname{\textsc{3DAP}}$ has a solution if and only if $T$ contains an Ab-square with one of specified centers, so-called well-placed Ab-square.

First we extend each $\SS_{i}$ to have the same length $M\geq\max_{i=1}^{n-1}|\SS_{i}|$ , to be defined later. Intuitively, it is needed to control the number of $\SS_{i}$ ’s in the strings from Lemma 1. We append $M-|\SS_{i}|$ occurrences of a letter $\mathtt{b}$ to each $\SS_{i}$ . Let $\SS^{I}_{i}$ denote this modified string.

Lemma 1 immediately implies the following fact.

Lemma 4.1.

Assume $i<j$ and $j-i$ is even. Then

\operatorname{\mathbf{\mathbf{\mathbf{\Lambda}}}}(i,j)\ \iff\ (\,\mathtt{b}^{e}\Psi(\alpha)\,\SS_{i}^{I}\SS_{i+1}^{I}\cdots\SS_{\mathit{mid}(i,j)-1}^{I}\,\cong\SS_{\mathit{mid}(i,j)}^{I}\cdots\SS_{j-1}^{I}\Psi(\beta)\mathtt{b}^{f}\,),

for some disjoint subsets $\alpha,\beta$ of $[1,k]$ , where $e+|\Psi(\alpha)|=f+|\Psi(\beta)|$ with $\min(e,f)=0$ .

The parts $\mathtt{b}^{e}\Psi(\alpha)$ , $\Psi(\beta)\mathtt{b}^{f}$ in the above lemma can be treated as equalizers. Let us note that in the above lemma we can assume that $\max(e,f)\leq\max(|\Psi(\alpha)|,|\Psi(\beta)|)\leq kd$ .

A pair of disjoint sets $\alpha,\beta$ that satisfies $\alpha\cup\beta=[1,k]$ will be called a 2-partition of $[1,k]$ . For a 2-partition $(\alpha,\beta)$ of $[1,k]$ , we use the string

\Gamma(\alpha,\beta)=\#\,\alpha\,\$\,\mathtt{b}^{kd}\,\#\,\beta\,\$,

called a $\Gamma$ -string. If $k=4$ , $d=7$ , an example of a $\Gamma$ -string is $\Gamma(2,\,1\,3\,4)\,=\,\#\,2\;\$\;\mathtt{b}^{28}\;\#\;1\,3\,4\;\$$ .

Let $(\pi_{1},\pi^{\prime}_{1}),(\pi_{2},\pi^{\prime}_{2}),\dots,(\pi_{m},\pi^{\prime}_{m})$ be the sequence of all $m=4^{k}$ pairs of $\Gamma$ -strings. Define

U=\pi_{m}\pi_{m-1}\ldots\pi_{1},\ \ V=\pi^{\prime}_{1}\,\pi^{\prime}_{2}\,\ldots\pi^{\prime}_{m}.

We have $\{\pi_{1},\dots,\pi_{m}\}=\{\pi^{\prime}_{1},\dots,\pi^{\prime}_{m}\}$ , so $U\cong V$ .

Observation.

For disjoint subsets $\alpha,\beta\subseteq[1,k]$ and integers $0\leq e,f\leq kd$ , there are decompositions $U\;=\;U_{1}\,\mathtt{b}^{e}\,\#\,\beta\,\$\,U_{2}$ and $V\;=\;V_{1}\,\#\,\alpha\,\$\,\mathtt{b}^{f}\,V_{2}$ , where $U_{2}\cong V_{1}$ .

Let us recall the morphism $\Psi$ such that $\Psi(i)=\mathtt{a}_{i}^{p_{i}}$ for each $i\in[1,k]$ . We define additionally $\Psi(c)=c$ for $c\in\{\mathtt{b},\#,\$\}$ and set

\mathbf{B}\,=\,\Psi(U),\ \mathbf{A}\,=\,\Psi(V),\ \ M=|\mathbf{A}|=|\mathbf{B}|.

Let us observe that indeed $\max_{i=1}^{n-1}|\SS_{i}|\leq M$ holds since $|\SS_{i}|\leq kd+\sum_{j=1}^{k}p_{j}$ and the length of $\Psi(W)$ for any $\Gamma$ -string $W$ is $kd+\sum_{j=1}^{k}p_{j}+4$ .
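The length and Ab-equivalence claims about $\Psi$-images of $\Gamma$-strings can be checked with a short Python sketch. The function names are hypothetical; indices $1,\dots,k$ are kept as integers, and the letters $\mathtt{b},\#,\$$ as one-character strings.

```python
from collections import Counter
from itertools import combinations


def two_partitions(k):
    """All ordered pairs (alpha, beta) of disjoint sets with union [1, k]."""
    indices = range(1, k + 1)
    for size in range(k + 1):
        for alpha in combinations(indices, size):
            beta = tuple(t for t in indices if t not in alpha)
            yield alpha, beta


def gamma(alpha, beta, k, d):
    """The Gamma-string  # alpha $ b^{kd} # beta $  as a list of symbols."""
    return ["#", *alpha, "$", *(["b"] * (k * d)), "#", *beta, "$"]


def psi(word, primes):
    """Morphism Psi: index i maps to a_i repeated p_i times, other letters to themselves."""
    image = []
    for c in word:
        if isinstance(c, int):
            image.extend([f"a{c}"] * primes[c - 1])
        else:
            image.append(c)
    return image


# Small illustration with k = 3 primes, so d = 5.
PRIMES = [2, 3, 5]
IMAGES = [psi(gamma(a, b, 3, 5), PRIMES) for a, b in two_partitions(3)]
```

With these parameters there are $2^{3}=8$ $\Gamma$-strings, each image has length $kd+\sum_{j}p_{j}+4=29$, and all images share one Parikh vector, which is why any concatenation of the same number of them is Ab-equivalent.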

Figure 2: Internal structure of an Ab-square, shown in a thick box (proportions are symbolic), in $\mathbf{B}\SS_{1}^{I}\mathbf{A}\,\mathbf{B}\SS_{2}^{I}\mathbf{A}$ . Here $x_{\mathit{mid}(1,3)}=\mathit{mid}(x_{1},x_{3})$ and $\mathtt{b}^{e}\Psi(\alpha)$ , $\Psi(\beta)\mathtt{b}^{f}$ are equalizers.

We add two new letters $\bullet,\star$ and define the following string (the symbols “ $\stackrel{\scriptstyle center}{\downarrow}$ ” are not parts of the string, but only show supposed centers of Ab-squares).

T\;=\;\bullet\,\;\mathbf{B}\;\star\,\;\SS^{I}_{1}\;\mathbf{A}\;\bullet\,\stackrel{\scriptstyle center}{\downarrow}\star\,\;\mathbf{B}\;\bullet\,\;\SS^{I}_{2}\;\;\mathbf{A}\;\star\,\stackrel{\scriptstyle center}{\downarrow}\bullet\,\;\mathbf{B}\;\star\,\;\SS^{I}_{3}\;\mathbf{A}\;\bullet\,\stackrel{\scriptstyle center}{\downarrow}\star\,\;\mathbf{B}\;\bullet\,\;\SS^{I}_{4}\;\mathbf{A}\,\star\,\cdots.

(1)

An Ab-square is called well-placed if its center is between the letters $\bullet,\star$ in any order. Recall that, due to Theorem 2.11, we can assume that the input to $\operatorname{\textsc{3DAP}}$ guarantees that only odd-half instances could have solutions.

Lemma 4.2.

Assume $\bar{x}$ is an odd-half instance. Then $\operatorname{\textsc{3DAP}}(\bar{x})$ has a solution if and only if $T$ contains a well-placed Ab-square.

Proof 4.3.

Let $\bar{x}=[x_{1},\dots,x_{n}]$ be an odd-half instance of $\operatorname{\textsc{3DAP}}$ . We show two implications.

$(\mathbf{\Rightarrow})$

Assume that $\operatorname{\mathbf{\mathbf{\mathbf{\Lambda}}}}(i,j)$ holds for $\bar{x}$ . Lemma 4.1 implies that for strings $W,Z$ such that $W\cong Z$ we have

\mathtt{b}^{e}\#\Psi(\alpha)\$W\,\star\SS_{i}^{I}\mathbf{A}\bullet\,\,\,\star\mathbf{B}\bullet\SS_{i+1}^{I}\mathbf{A}\star\,\,\,\cdots\,\,\,\bullet\mathbf{B}\star\SS_{\mathit{mid}(i,j)-1}^{I}\mathbf{A}\bullet\,\,\,\cong\\ \quad\quad\star\mathbf{B}\bullet\SS_{\mathit{mid}(i,j)}^{I}\mathbf{A}\star\,\,\,\cdots\,\,\,\bullet\mathbf{B}\star\SS_{j-2}^{I}\mathbf{A}\bullet\,\,\,\star\mathbf{B}\bullet\SS_{j-1}^{I}\,Z\#\Psi(\beta)\$\mathtt{b}^{f}

(2)

for some disjoint subsets $\alpha,\beta$ of $[1,k]$ , where $e+|\Psi(\alpha)|=f+|\Psi(\beta)|$ with $\min(e,f)=0$ . Indeed, we use the fact that $\mathbf{A}\cong\mathbf{B}$ and that the counts of letters $\bullet$ and $\star$ on both sides are equal (because $(j-i)/2$ is odd). By the observation on decompositions of $U$ and $V$ above, we obtain a well-placed Ab-square in $T$ (or we obtain it after exchanging all letters $\bullet$ with $\star$ ).

$(\mathbf{\Leftarrow})$ Assume that $T$ has a well-placed Ab-square factor with center immediately after $\bullet\mathbf{B}\star\SS_{t}^{I}\mathbf{A}\bullet$ (the case that it is immediately after $\star\mathbf{B}\bullet\SS_{t}^{I}\mathbf{A}\star$ is symmetric). Let us investigate what can be the position $s$ of the first letter of this Ab-square.

Recall that $|\SS_{i}^{I}|=|\mathbf{A}|=|\mathbf{B}|=M$ for each $i\in[1,n-1]$ , so $T$ can be seen as composed of blocks of length $M^{\prime}=M+1$ . We will check which of these blocks can contain $s$ , by checking the counts of each of the letters $\bullet,\star$ in both halves of the Ab-square. The positions of letters $\bullet,\star$ in $T$ repeat with period $6(M+1)$ , so it is sufficient to inspect the first 6 blocks on each side, as the remaining ones will behave periodically; see Figures 3 and 4.

Figure 3: Which position in a block of $T$ can be the starting position of a well-placed Ab-square with the designated center, just counting letters $\bullet,\star$ .

By counting letters $\bullet,\star$ in both halves of the Ab-square, it can be readily verified that $s$ cannot be in any block $\mathbf{A}\bullet$ or $\bullet\SS_{i}^{I}$ ; if in any block $\star\mathbf{B}$ or $\star\SS_{i}^{I}$ , it can only be the first position of the block; it cannot be the first position in a block $\bullet\mathbf{B}$ ; and it can be in any position in a block $\mathbf{A}\star$ .

Moreover, $s$ cannot be the first position in a block $\star\mathbf{B}$ , since this would imply, by Lemma 4.1, that $\operatorname{\mathbf{\mathbf{\mathbf{\Lambda}}}}(i,j)$ holds for $i$ such that the block $\bullet\SS_{i}^{I}$ immediately follows the $\star\mathbf{B}$ block and $j=2t-i$ . However, in this case $(j-i)/2$ is even, which is impossible.

If $s$ is the first position of a block $\star\SS_{i}^{I}$ , then this implies, again by Lemma 4.1, that $\operatorname{\mathbf{\mathbf{\mathbf{\Lambda}}}}(i,j)$ holds for $j=2t-i$ . In this case $(j-i)/2$ is odd, so this is a valid solution to the corresponding $\operatorname{\textsc{3DAP}}$ instance.

We are left with the case that $s$ belongs to a block $\mathbf{A}\star$ or $\bullet\mathbf{B}$ (and in case of $\bullet\mathbf{B}$ does not coincide with the position of the letter $\bullet$ ). Henceforth it suffices to count letters different from $\bullet,\star$ in the halves. Each of the gadgets $\mathbf{A},\mathbf{B}$ is a concatenation of $m$ Ab-equivalent strings of the form $\#\,\Psi(\alpha)\,\$\,\mathtt{b}^{kd}\,\#\,\Psi(\beta)\,\$$ , where $\Psi(\alpha),\Psi(\beta)$ are composed of letters $\mathtt{a}_{i}$ only. By counting the letters # and $ in both halves of the Ab-square, we see that $s$ can only be a position which holds the letter $\mathtt{b}$ or #.

Hence, the Ab-square is necessarily of the form (2), which, by Lemma 4.1, implies that $\operatorname{\mathbf{\mathbf{\mathbf{\Lambda}}}}(i,j)$ holds, where $\star\SS_{i}^{I}$ is the first such block after the position $s$ and $j=2t-i$ .

Figure 4: The global structure of a fragment containing a well-placed Ab-square; there are three types of blocks: $c\mathbf{B},\,c\SS^{I}_{i},\,\mathbf{A}c$ , where $c$ is one of $\bullet,\star$ . The blocks of the second type (which can be considered as essential blocks) are in color; each block is of length $M+1$ (recall that $|\mathbf{A}|=|\mathbf{B}|=|\SS_{i}|=M$ ). The special letters $\bullet,\star$ force each half of a well-placed Ab-square to contain an odd number of full $\SS_{i}$ ’s and not to contain any $\SS_{i}$ only partially.

Theorem 4.4.

Computing all positions that are centers of Ab-square factors in a length- $n$ string over an alphabet of size $\omega(1)$ is $\operatorname{\textsc{3SUM}}$ -hard.

Proof 4.5.

Due to Theorem 2.11 we can reduce $\operatorname{\textsc{Conv3SUM}}$ in $N^{1+o(1)}$ time to an odd-half instance $\bar{x}$ of $\operatorname{\textsc{3DAP}}$ of size $n=N^{1+o(1)}$ with elements in the range $[0,N^{3+o(1)}]$ .

We construct the string $T$ as shown in Eq. 1 for the sequence $\bar{x}$ . Then Lemma 4.2 implies that $\bar{x}$ is a YES-instance of $\operatorname{\textsc{3DAP}}$ if and only if $T$ has a well-placed Ab-square. The string $T$ has length $\mathcal{O}(N^{1+o(1)}M)$ . Each of the strings $\mathbf{A},\mathbf{B}$ has length $M$ and is composed of $m=4^{k}$ strings of length $\mathcal{O}(kd)$ , i.e., $\Psi$ -images of $\Gamma$ -strings.

Hence, $M=\mathcal{O}(4^{k}kd)$ . We select $k$ such that $k=\omega(1)$ and simultaneously $k=\mathcal{O}(\log N/\log\log N)$ . Then we have $4^{k}k=N^{o(1)}$ and the $k$ primes are of magnitude $d=\mathcal{O}(N^{(3+o(1))/k})=N^{o(1)}$ (we can choose $k$ consecutive primes computed using Eratosthenes’s sieve).
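The choice of primes can be illustrated by a small Python sketch. It is an assumption-laden illustration rather than the procedure of the proof: the names `sieve` and `choose_primes` are hypothetical, and the sketch simply slides a window of $k$ consecutive primes until their product exceeds a given bound (which would play the role of the largest element of $\bar{x}$).

```python
from math import prod


def sieve(limit):
    """All primes up to limit (inclusive), by the sieve of Eratosthenes."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            for q in range(p * p, limit + 1, p):
                is_prime[q] = False
    return [p for p, flag in enumerate(is_prime) if flag]


def choose_primes(bound, k):
    """First window of k consecutive primes whose product exceeds bound."""
    limit = 64
    while True:
        primes = sieve(limit)
        for start in range(len(primes) - k + 1):
            window = primes[start:start + k]
            if prod(window) > bound:
                return window
        limit *= 2  # not enough primes yet: enlarge the sieve and retry
```

Since the product of the window only just exceeds the bound, the largest prime $d$ in the window stays close to the $k$-th root of the bound, which matches the magnitude $d=\mathcal{O}(N^{(3+o(1))/k})$ used above.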

Overall $|T|=N^{1+o(1)}$ and $|\mathit{Alph}(T)|\leq k+5=\omega(1)$ . (One can obtain any alphabet up to $\mathcal{O}(N)$ by appending distinct letters to $T$ .)

With the same argument for a constant-sized alphabet we obtain the following result.

Theorem 4.6.

All positions that are centers of Ab-square factors in a length- $n$ string over an alphabet of size $5+k$ , for a constant $k$ , cannot be computed in $\mathcal{O}(n^{2-\tfrac{6}{3+k}-\varepsilon})$ time, for a constant $\varepsilon>0$ , unless the $\operatorname{\textsc{3SUM}}$ conjecture fails.

5 Computing centers of Ab-squares for constant-sized alphabets

A set of vectors in $[1,n]^{d}$ is called monotone if its elements can be ordered so that they form a monotone non-decreasing sequence on each coordinate.

Definition 5.1.

For sets $\mathcal{A}$ and $\mathcal{B}$ of vectors we define

\mathcal{A}+\mathcal{B}=\{a+b\,:\,a\in\mathcal{A},b\in\mathcal{B}\},\ c\cdot\mathcal{A}=\{ca\,:\,a\in\mathcal{A}\}

and for a string $W$ we define: $P_{l,r}(W)=\{\mathit{Parikh}(W[1..k]):l\leq k\leq r\}$ . Let us also denote by $|A|$ the length of a string corresponding to a Parikh vector $A$ .

In the algorithm we use the following fact shown in [11]. The exact complexities can be found in [11, Theorem 3.1].

Fact 2 ([11]).

Given three monotone sequences $\mathcal{A},\mathcal{B},\mathcal{C}$ in $[1,n]^{d}$ for a constant $d$ , we can compute $(\mathcal{A}+\mathcal{B})\cap\mathcal{C}$ in $\mathcal{O}(n^{2-\epsilon})$ expected time for a constant $\epsilon>0$ , or in $\mathcal{O}(n^{2-\epsilon^{\prime}})$ worst case time for a constant $\epsilon^{\prime}>0$ if $d\leq 7$ .

if $|T|<2$ then return $\emptyset$ ;
$m=\lceil n/2\rceil$ ;
$\mathcal{A}=P_{0,m-1}(T)$ ;
$\mathcal{B}=P_{m,n}(T)$ ;
$\mathcal{C}=P_{0,n}(T)$ ;
$\mathcal{M}=\{|C|\,:\,2C\in(\mathcal{A}+\mathcal{B})\cap 2\cdot\mathcal{C}\}$ ;
$T_{\mathit{left}}=T[1..m-1]$ ;
$T_{\mathit{right}}=T[m..n]$ ;
return $\mathcal{M}\cup\mathsf{CENTERS}(T_{\mathit{left}})\cup\{k+m\,:\,k\in\mathsf{CENTERS}(T_{\mathit{right}})\}$

Algorithm 1: $\mathsf{CENTERS}(T)$

Figure 5: $A\in\mathcal{A}$ , $B\in\mathcal{B}$ , $C\in\mathcal{C}$ denote Parikh vectors of the corresponding fragments. If $A+B=2C$ , then $D=E$ and $k=|C|$ is a center of an Ab-square.

Theorem 5.2.

For a string of length $n$ over an alphabet of size $d=\mathcal{O}(1)$ , we can compute centers of all Ab-squares and centers of all odd Ab-squares in expected time $\mathcal{O}(n^{2-\epsilon})$ or in worst case time $\mathcal{O}(n^{2-\epsilon})$ if $d\leq 7$ , for $\epsilon>0$ .

Proof 5.3.

We use the above algorithm. Correctness of the algorithm is straightforward; see Figure 5. If

|A|<|B|,\ |C|=(|A|+|B|)/2,\ B=A+D+E,\ C=A+D

then $A+B=2C\iff 2A+D+E=2A+2D$ .

Consequently, after cancelling the same parts on both sides, $A+B=2C\iff E=D$ , equivalently if and only if the factor $T[i..j]$ corresponding to $DE$ is an Ab-square centred in $k=|C|$ . The figure shows the case when $k$ is in the right half of the strings; the other case is symmetric.

By Fact 2, the cost of the algorithm can be given by a recurrence

S(n)=2\cdot S\left(\tfrac{n}{2}\right)+\mathcal{O}(n^{2-\epsilon})

which results in $S(n)=\mathcal{O}(n^{2-\epsilon})$ for $\epsilon>0$ .

In the case of odd Ab-squares let

P^{c}_{l,r}(W)=\{\mathit{Parikh}(W[1..k]):l\leq k\leq r,\,k\bmod{2}=c\}.

In the algorithm the statement $\mathcal{M}=\{|C|\,:\,2C\in(\mathcal{A}+\mathcal{B})\cap 2\cdot\mathcal{C}\}$ is executed for both $c\in\{0,1\}$ , with

\mathcal{A}=P^{c}_{0,m-1}(T),\quad\mathcal{B}=P^{c}_{m,n}(T),\quad\mathcal{C}=P^{1-c}_{0,n}(T).

Other parts of the algorithm, as well as its analysis, are essentially the same.
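The condition $A+B=2C$ on prefix Parikh vectors can be turned into a simple quadratic-time reference implementation, useful for testing faster code. The Python sketch below is not Algorithm 1 and does not use the subquadratic sumset computation of Fact 2; the function names and the `odd_only` flag are hypothetical. A center $k$ means that exactly $k$ letters precede the middle of the Ab-square.

```python
def prefix_parikh(text):
    """Parikh vectors of all prefixes text[:0], text[:1], ..., text[:n]."""
    alphabet = sorted(set(text))
    index = {c: i for i, c in enumerate(alphabet)}
    vector = [0] * len(alphabet)
    prefixes = [tuple(vector)]
    for c in text:
        vector[index[c]] += 1
        prefixes.append(tuple(vector))
    return prefixes


def centers(text, odd_only=False):
    """Centers k of Ab-squares; with odd_only, only odd half lengths count."""
    P = prefix_parikh(text)
    n = len(text)
    step = 2 if odd_only else 1
    result = set()
    for k in range(1, n):
        for h in range(1, min(k, n - k) + 1, step):
            # A = P[k-h], B = P[k+h], C = P[k]; test A + B == 2C coordinate-wise.
            if all(a + b == 2 * c for a, b, c in zip(P[k - h], P[k + h], P[k])):
                result.add(k)
                break
    return result
```

For instance, `abab` has the single Ab-square center 2, witnessed only by half length 2, so it has no center of an odd Ab-square.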

6 Detecting odd Ab-squares

Unfortunately the string $T$ from Lemma 4.2 has many Ab-squares which are not well-placed. Our approach is to embed the (slightly) modified string $T$ into a string which is a special composition of $T$ and a combination of long quaternary Ab-square-free strings. The resulting string will fix the potential centers in specified locations. We use additional letters: $\diamondsuit,\,\circ$ and $\mathtt{0},\dots,\mathtt{6}$ .

6.1 Fixing centers

We show first a fact useful in fixing Ab-squares in specified places (Lemma 6.5). Keränen’s construction [27] of a quaternary Ab-square-free string consists in iterating a certain morphism $\phi$ , such that $|\phi(a)|=85$ for each of the four letters $a$ , on an initially single-letter string. This implies the following lemma.

Lemma 6.1 (Keränen [27]).

A length- $n$ quaternary Ab-square-free string can be generated in $\mathcal{O}(n)$ time.

Let $P_{t-2}$ be any Ab-square-free string of length $t-2$ over alphabet $\{\mathtt{3},\mathtt{4},\mathtt{5},\mathtt{6}\}$ . Let us define

U_{2t}\,=\,\mathtt{0}\,P_{t-2}\,\mathtt{1}\,\mathtt{2}\,P_{t-2}^{R}\,\mathtt{0}.

Figure 6: Illustration of the proof that an Ab-square cannot start inside $P=P_{t-2}$ and end inside another instance of $P$ if its center is not between two $\mathtt{0}$ ’s and its length is not an even multiple of the period $2t$ . If it were an Ab-square (as shown in the figure), then $A\mathtt{1}\mathtt{2}P^{R}\mathtt{0}\mathtt{0}BC\cong DA\mathtt{1}\mathtt{2}P^{R}\mathtt{0}\mathtt{0}B$ , and we could cancel equal parts on both sides. Consequently $C\cong D$ , which implies that $P^{R}$ contains an Ab-square $CD$ ; a contradiction.

Lemma 6.2.

The string $(U_{2t})^{m}$ contains exactly the following Ab-squares:

(1) of length divisible by $4t$ ; and

(2) with the center between two $\mathtt{0}$ ’s and of all admissible even lengths other than $(4q+2)t$ , for an integer $q\geq 0$ .

Proof 6.3.

Let $X$ be a factor of $(U_{2t})^{m}$ . If $X$ has length greater than or equal to $4t$ , then its middle length- $4t$ factor forms a classic square, and after removing it we obtain a different factor $Y$ of length smaller by $4t$ , which is centred exactly like $X$ . Hence, we can focus only on non-empty factors of length smaller than $4t$ .

If $X$ is centred between two $\mathtt{0}$ ’s, then after removing letters $\mathtt{1}$ and $\mathtt{2}$ we obtain an even palindrome (hence also an Ab-square). If $X$ is shorter than $2t$ , then no letters $\mathtt{1}$ or $\mathtt{2}$ occur. If it is longer than $2t$ , then both parts contain one letter $\mathtt{1}$ and $\mathtt{2}$ each. If its length is exactly $2t$ , then the letters $\mathtt{1}$ and $\mathtt{2}$ remain unmatched, hence it is the only case where the factor is not an Ab-square.

Let us assume that $X$ is centred in a different place. If the factor does not contain any of the letters $\mathtt{0}$ , $\mathtt{1}$ or $\mathtt{2}$ , then it is a factor of $P_{t-2}$ or its reverse, hence it cannot be an Ab-square. Otherwise, the factor needs to contain each of $\mathtt{1}$ , $\mathtt{2}$ twice and $\mathtt{0}$ four times. Then $X$ fully contains a factor

$\mathtt{1}\,\mathtt{2}\,P_{t-2}^{R}\,\mathtt{0}\,\mathtt{0}\,P_{t-2}\,\mathtt{1}\,\mathtt{2}\,P_{t-2}^{R}\,\mathtt{0}\,\mathtt{0}\quad\text{or}\quad\mathtt{0}\,\mathtt{0}\,P_{t-2}\,\mathtt{1}\,\mathtt{2}\,\,P_{t-2}^{R}\,\mathtt{0}\,\mathtt{0}\,P_{t-2}\,\mathtt{1}\,\mathtt{2}.$

Let us assume the former, see Figure 6; the latter is considered analogously. String $X$ has a length- $i$ suffix of $P_{t-2}$ as a prefix and a length- $j$ prefix of $P_{t-2}$ as a suffix, with $0\leq i+j\leq t-2$ and $t-2-(i+j)$ even. Then $X$ is an Ab-square if and only if $P_{t-2}[j+1..t-2-i]$ is an Ab-square, since $P_{t-2}[(t-i+j)/2..t-2-i]\,X\,P_{t-2}[j+1..(t-i+j)/2-1]$ is a classic square; see Figure 6.

Remark 6.4.

Lemma 6.2 works for any Ab-square-free string $P_{t-2}$ such that $\mathit{Alph}(P_{t-2})\cap\{\mathtt{0},\mathtt{1},\mathtt{2}\}=\emptyset$ .
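Lemma 6.2 can be confirmed by exhaustive search for small $t$ and $m$. The Python sketch below is a hedged illustration: instead of Keränen's morphism it finds a short Ab-square-free string by plain backtracking (a swapped-in technique that is only practical for small lengths), and all function names are hypothetical.

```python
from collections import Counter


def is_ab_square(w):
    """True if w is non-empty, of even length, with Ab-equivalent halves."""
    h = len(w) // 2
    return len(w) % 2 == 0 and h > 0 and Counter(w[:h]) == Counter(w[h:])


def ab_square_free_string(length, alphabet="3456"):
    """Backtracking search for an Ab-square-free string of the given length."""
    def extend(prefix):
        if len(prefix) == length:
            return prefix
        for c in alphabet:
            cand = prefix + c
            # Only suffixes of cand can be new Ab-squares.
            if not any(is_ab_square(cand[-2 * h:]) for h in range(1, len(cand) // 2 + 1)):
                found = extend(cand)
                if found is not None:
                    return found
        return None
    return extend("")


def build_u(t):
    """U_{2t} = 0 P 1 2 P^R 0 for an Ab-square-free P of length t - 2."""
    p = ab_square_free_string(t - 2)
    return "0" + p + "12" + p[::-1] + "0"


def lemma_holds(t, m):
    """Compare every even-length factor of (U_{2t})^m with the lemma's prediction."""
    s = build_u(t) * m
    n = len(s)
    for c in range(1, n):                      # c letters precede the center
        for h in range(1, min(c, n - c) + 1):  # h is the half length
            predicted = (2 * h) % (4 * t) == 0 or (
                c % (2 * t) == 0 and (2 * h) % (4 * t) != 2 * t)
            if is_ab_square(s[c - h:c + h]) != predicted:
                return False
    return True
```

Here a center lies between two $\mathtt{0}$ ’s exactly when $2t$ divides $c$, and the excluded lengths $(4q+2)t$ are those congruent to $2t$ modulo $4t$.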

For equal-length strings $X,Y$ we define the string

\mathsf{shuffle}_{\Diamond}(X,Y)\,=\,X[1]\,\Diamond\,Y[1]\;X[2]\,\Diamond\,Y[2]\;X[3]\,\Diamond\,Y[3]\;\cdots.

For example, $\mathsf{shuffle}_{\diamondsuit}(\mathtt{abc},\mathtt{ABC})\,=\,\mathtt{a}\diamondsuit\mathtt{Ab}\diamondsuit\mathtt{Bc}\diamondsuit\mathtt{C}$ .
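The shuffle operation and the removal of letters from a string are easy to sketch in Python. The names `shuffle` and `restrict` are hypothetical, and the ASCII character `*` stands in for the letter $\Diamond$.

```python
def shuffle(x, y, sep="*"):
    """Interleave equal-length strings: x[0] sep y[0] x[1] sep y[1] ..."""
    if len(x) != len(y):
        raise ValueError("shuffle is defined for equal-length strings only")
    return "".join(a + sep + b for a, b in zip(x, y))


def restrict(w, letters):
    """Remove from w all letters outside the given set of letters."""
    return "".join(c for c in w if c in letters)


# The example from the text, with '*' in place of the diamond letter.
example = shuffle("abc", "ABC")
```

Restricting the shuffled string to the letters of either argument recovers that argument, which is the operation used when an Ab-square of the shuffle is projected onto its components.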

The parity condition for half lengths of Ab-squares in the following observation justifies the usage of the additional letter $\Diamond$ in $\mathsf{shuffle}$ . Let $U_{[X]}$ be the string resulting from $U$ by removing all letters outside $\mathit{Alph}(X)$ .

Observation.

Assume $X,Y$ are equal-length strings composed of disjoint sets of letters distinct from $\Diamond$ and $W$ is an Ab-square in $\mathsf{shuffle}_{\Diamond}(X,Y)$ . Then $W_{[X]},W_{[Y]}$ are Ab-squares in $X,Y$ , respectively (we say that these Ab-squares are implied by $W$ ). Moreover, $|W_{[X]}|/2,\,|W_{[Y]}|/2,\,|W|/2$ are of the same parity.

We say that an even-length factor of a string $X$ is centred at $i$ if it has its center between positions $i$ and $i+1$ in $X$ . By $a\mid b$ and $a\nmid b$ we denote that $a$ divides $b$ and $a$ does not divide $b$ . For an illustration of the following lemma, see Figure 7.

Lemma 6.5.

Let $X=(U_{2t})^{n-1}$ , $Y$ be a string of length $|X|$ such that its alphabet is disjoint with $\mathit{Alph}(X)\cup\{\Diamond\}$ , $W=\mathsf{shuffle}_{\Diamond}(X,Y)$ , and let an integer $\ell$ satisfy $12t\nmid\ell$ . Then a length- $\ell$ factor of $W$ is an Ab-square if and only if it is centred in $W$ at $r\equiv\{0,-1,-2\}\pmod{6t}$ , $Y$ contains an Ab-square factor of length $\ell/3$ centred in $Y$ at $\left\lfloor r/3\right\rfloor$ , and $6t\nmid\ell$ .

Proof 6.6.

By the disjointness of sets of letters in $X,Y$ and $\{\Diamond\}$ , each Ab-square in $W$ has length that is divisible by 3. The following claim is then readily verified (cf. the observation above).

Claim 3.

For positive integer $\ell$ such that $6\mid\ell$ , a length- $\ell$ factor of $W$ centred at $r$ is an Ab-square if and only if the length- $\ell/3$ factors in $X$ and $Y$ centred at $\left\lfloor\tfrac{r+2}{3}\right\rfloor$ and $\left\lfloor\tfrac{r}{3}\right\rfloor$ , respectively, are Ab-squares.

Let integer $\ell>0$ satisfy $6\mid\ell$ and $12t\nmid\ell$ . We show two implications.

$(\mathbf{\Rightarrow})$ If $W$ contains an Ab-square factor of length $\ell$ centred at some $r$ , then the implied Ab-square factor of $X$ has length $\ell/3$ , where $4t\nmid\ell/3$ , so by Lemma 6.2 it has its center between two $\mathtt{0}$ ’s, i.e., $2t\mid\left\lfloor\tfrac{r+2}{3}\right\rfloor$ . Hence, $r\equiv\{0,-1,-2\}\pmod{6t}$ .

Moreover, $2t\nmid\ell/3$ also by Lemma 6.2. Finally, the implied Ab-square factor of $Y$ indeed has length $\ell/3$ and is centred at $\left\lfloor r/3\right\rfloor$ .

$(\mathbf{\Leftarrow})$ Let $r\equiv\{0,-1,-2\}\pmod{6t}$ , $6t\nmid\ell$ , and assume that $Y$ contains an Ab-square factor of length $\ell/3$ centred at $\left\lfloor r/3\right\rfloor$ . We have $2t\mid\left\lfloor\tfrac{r+2}{3}\right\rfloor\ \text{and}\ 2t\nmid\ell/3,$ so by Lemma 6.2 the string $X$ contains an Ab-square factor of length $\ell/3$ centred at $\left\lfloor\tfrac{r+2}{3}\right\rfloor$ . Finally, the unary string $\Diamond^{2tm}$ certainly contains an Ab-square factor of length $\ell/3$ centred at $\left\lfloor\tfrac{r+1}{3}\right\rfloor$ . By the claim, $W$ contains an Ab-square of length $\ell$ centred at $r$ that implies the three Ab-squares.

Figure 7: Illustration of Lemma 6.5. Let $X=(U_{6})^{2}$ . The string $Y=(\mathtt{abba})^{3}$ , composed of black letters, contains many Ab-squares. However the string $Z=\mathsf{shuffle}_{\Diamond}(X,Y)$ of length 36, shown above, contains only Ab-squares centred at 16, 17 or 18, as in the figure. The implied Ab-squares in $Y$ are only those which are centred at positions 5 or 6 in $Y$ .
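The example of Figure 7 can be reproduced by exhaustive search. The Python sketch below makes two labeled assumptions: the Ab-square-free string $P_{1}$ inside $U_{6}$ is taken to be the single letter $\mathtt{3}$, and `*` stands in for $\Diamond$. The function names are hypothetical.

```python
from collections import Counter


def shuffle(x, y, sep="*"):
    """Interleave equal-length strings: x[0] sep y[0] x[1] sep y[1] ..."""
    return "".join(a + sep + b for a, b in zip(x, y))


def ab_square_centers(s):
    """Centers c (number of letters before the middle) of all Ab-squares in s."""
    n = len(s)
    found = set()
    for c in range(1, n):
        for h in range(1, min(c, n - c) + 1):
            if Counter(s[c - h:c]) == Counter(s[c:c + h]):
                found.add(c)
                break
    return found


X = "031230" * 2      # (U_6)^2 with P_1 = "3" (an assumption of this sketch)
Y = "abba" * 3
Z = shuffle(X, Y)     # length 36
```

Although $Y$ alone has Ab-squares at many centers, every Ab-square of $Z$ must project onto an Ab-square of $X$, and those are all centred between the two middle $\mathtt{0}$ ’s of $X$.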

6.2 Main result

We use the technique of fixing Ab-squares from Lemma 6.5. Moreover, we make the following minor modifications to the construction of string $T$ in Section 4:

(1) Each fragment $\mathtt{b}^{kd}$ is extended by one letter to $\mathtt{b}^{kd+1}$ , and

(2) the letters $\bullet,\star$ are replaced each by two letters $\bullet\circ,\star\circ$ , respectively.

Intuitively, (1) allows us to extend the Ab-squares considered in the proof of Lemma 4.2 by one letter $\mathtt{b}$ on either side, and (2) makes $|\mathbf{A}|=|\mathbf{B}|=|\SS_{i}^{I}|$ even, which facilitates the usage of Lemma 6.5 with $Y=T$ . It can be verified by inspecting the proof that Lemma 4.2 still holds after these two changes. We refer to all the notions from Section 4 after these modifications.

Theorem 6.7.

Checking if a length- $n$ string over an alphabet of size $\omega(1)$ contains an odd Ab-square is $\operatorname{\textsc{3SUM}}$ -hard. Moreover, for a string over an alphabet of size $14+k$ , for a constant $k$ , the same problem cannot be solved in $\mathcal{O}(n^{2-\tfrac{6}{3+k}-\varepsilon})$ time, for a constant $\varepsilon>0$ , unless the $\operatorname{\textsc{3SUM}}$ conjecture fails.

Proof 6.8.

It is enough now to show the following equivalence for $X=(U_{2t})^{n-1}$ , where $2t=|T|/(n-1)$ . We assume that $n\geq 3$ .

Claim 4.

An odd-half instance of $\operatorname{\textsc{3DAP}}$ is a YES-instance if and only if $W=\mathsf{shuffle}_{\Diamond}(X,T)$ has an odd Ab-square factor.

Proof 6.9.

$\overbrace{\underbrace{\star\,\circ\;\mathbf{B}}\;\underbrace{\bullet\,\circ\;\SS^{I}_{1}}\;\underbrace{\mathbf{A}\;\star\,\circ}}^{2t}\;\overbrace{\underbrace{\bullet\,\circ\;\mathbf{B}}_{M+2}\;\underbrace{\star\,\circ\;\SS^{I}_{2}}\;\;\underbrace{\mathbf{A}\;\bullet\,\circ}}^{2t}\;\overbrace{\underbrace{\star\,\circ\;\mathbf{B}}\;\underbrace{\bullet\,\circ\;\SS^{I}_{3}}\;\underbrace{\mathbf{A}\;\star\,\circ}}^{2t}\;\overbrace{\underbrace{\bullet\,\circ\;\mathbf{B}}\;\underbrace{\star\,\circ\;\SS^{I}_{4}}\;\underbrace{\mathbf{A}\,\bullet\,\circ}}^{2t}$

Figure 8: A schematic structure of a fragment of $T$ after insertion of symbols $\circ$ . There are $3(n-1)$ (underbraced) blocks in $T$ , each of size $M+2$ , and $2t=3M+6$ .

$(\mathbf{\Rightarrow})$ Assume that $\bar{x}$ is an odd-half instance and $\operatorname{\textsc{3DAP}}(\bar{x})$ has a solution.

By Lemma 4.2, $T$ contains a well-placed Ab-square, that is, an Ab-square centred at a position $r^{\prime}$ such that $2t\mid r^{\prime}$ . (Recall that $3(M+2)=2t$ .) Moreover, in the proof of that lemma it is shown that in this case there exists a well-placed Ab-square in $T$ that satisfies the following additional requirements: (1) it starts within the gadget $\mathbf{B}$ ; (2) it starts and ends within a block of $\mathtt{b}$ ’s; (3) its maximal prefix and suffix consisting of letters $\mathtt{b}$ are $\mathtt{b}^{e}$ and $\mathtt{b}^{f}$ , where $e,f\leq kd$ .

Let $\ell^{\prime}$ denote the half length of this Ab-square. By (2) and (3), if $\ell^{\prime}$ is even, the Ab-square can be extended by one letter $\mathtt{b}$ on either side (because we have extended each block $\mathtt{b}^{kd}$ ) so that $\ell^{\prime}$ becomes odd. Moreover, by (1), we have $\ell^{\prime}\bmod(2t)\in[\tfrac{4}{3}t,2t)$ , in particular, $t\nmid\ell^{\prime}$ . Then Lemma 6.5 implies that the factor of $W$ centred at $r=3r^{\prime}\equiv 0\pmod{6t}$ and of length $6\ell^{\prime}$ such that $6t\nmid 6\ell^{\prime}$ is an Ab-square. Its half length, $3\ell^{\prime}$ , is odd, as desired.

$\circ\,\star\,\circ\,\stackrel{\scriptstyle\mathbf{M}}{\mathbf{-}}\,\bullet\,\circ\,\stackrel{\scriptstyle\mathbf{2M}}{\mathbf{-}\mathbf{-}}\,\star\,\circ\,\,\bullet\,\circ\,\stackrel{\scriptstyle\mathbf{M}}{\mathbf{-}}\,\star\,\circ\,\,\stackrel{\scriptstyle\mathbf{2M}}{\mathbf{-}\mathbf{-}}\,\bullet\,\stackrel{\scriptstyle\mathbf{r^{\prime}}}{|}\,\circ\,\,\star\,\circ\,\stackrel{\scriptstyle\mathbf{M}}{\mathbf{-}}\,\bullet\,\circ\,\stackrel{\scriptstyle\mathbf{2M}}{\mathbf{-}\mathbf{-}}\,\star\,\circ\,\,\bullet\,\circ\,\stackrel{\scriptstyle\mathbf{M}}{\mathbf{-}}\,\star\,\circ\,\stackrel{\scriptstyle\mathbf{2M}}{\mathbf{-}\mathbf{-}}\,\bullet\,\circ$

Figure 9: A simplified version of Figure 8. Which position in a block can be the starting position of an Ab-square with the center at $r^{\prime}$ (one position to the left of a good center), only counting $\bullet,\star,\circ$ .

$(\mathbf{\Leftarrow})$ Assume that $W$ has an Ab-square factor $U$ of length $\ell$ such that $\ell/2$ is odd. In particular, we have $12t\nmid\ell$ , so by Lemma 6.5 the Ab-square $U$ is centred in $W$ at $r\equiv\{0,-1,-2\}\pmod{6t}$ and $T$ contains an Ab-square factor $V$ of length $\ell/3$ centred in $T$ at $r^{\prime}=\left\lfloor r/3\right\rfloor$ . If $6t\mid r$ , then $2t\mid r^{\prime}$ and $V$ is well-placed.

Otherwise, $V$ cannot be an Ab-square due to the following fact: $T$ does not contain an Ab-square factor of length $\ell$ not divisible by $4t$ and centred at $r^{\prime}\equiv-1\pmod{2t}$ . Indeed, similarly as in the proof of Lemma 4.2, we will show that each even-length factor centred at such $r^{\prime}$ contains different counts of one of the letters $\bullet,\star,\circ$ in both halves. The positions of letters $\bullet,\star,\circ$ in $T$ repeat with period $6(M+2)$ , so it is sufficient to inspect the first 6 blocks on each side, as the remaining ones will behave periodically; see Figures 8 and 9.

letter	$\circ$	$\star$	$\circ$	$\mathbf{B}$	$\bullet$	$\circ$	$\SS$	$\mathbf{A}$	$\star$
distance	$6M+12$	$6M+11$	$6M+10$		$5M+9$	$5M+8$			$3M+7$
letter	$\circ$	$\bullet$	$\circ$	$\mathbf{B}$	$\star$	$\circ$	$\SS$	$\mathbf{A}$	$\bullet$
distance	$3M+6$	$3M+5$	$3M+4$		$2M+3$	$2M+2$			$1$

letter	$\circ$	$\star$	$\circ$	$\mathbf{B}$	$\bullet$	$\circ$	$\SS$	$\mathbf{A}$	$\star$
distance	$1$	$2$	$3$		$M+4$	$M+5$			$3M+6$
letter	$\circ$	$\bullet$	$\circ$	$\mathbf{B}$	$\star$	$\circ$	$\SS$	$\mathbf{A}$	$\bullet$
distance	$3M+7$	$3M+8$	$3M+9$		$4M+10$	$4M+11$			$6M+12$

Table 1: Top/bottom table: the distances of letters from $\{\bullet,\star,\circ\}$ from the left/right half to the center of the factor.

An exhaustive verification can be performed as follows. First, Table 1 lists the distances from the center of the factor to the letters from $\{\bullet,\star,\circ\}$ in both directions. Table 2 then merges these two sequences of distances, assuming that $M\geq 3$.

For each distance, we write the letter that is located at this distance, with a “$+$” sign if it is in the left half and with a “$-$” sign otherwise. The remaining columns show, for each $c\in\{\bullet,\star,\circ\}$, the running difference between the number of occurrences of the letter $c$ in the left half and in the right half. Once all letters at a given distance have been processed, the three differences are simultaneously zero only at distance $6M+12$, that is, after a full period.

distance	letter	$\bullet$	$\star$	$\circ$
$1$	$+\bullet$	$1$	$0$	$0$
$1$	$-\circ$	$1$	$0$	$-1$
$2$	$-\star$	$1$	$-1$	$-1$
$3$	$-\circ$	$1$	$-1$	$-2$
$M+4$	$-\bullet$	$0$	$-1$	$-2$
$M+5$	$-\circ$	$0$	$-1$	$-3$
$2M+2$	$+\circ$	$0$	$-1$	$-2$
$2M+3$	$+\star$	$0$	$0$	$-2$
$3M+4$	$+\circ$	$0$	$0$	$-1$
$3M+5$	$+\bullet$	$1$	$0$	$-1$
$3M+6$	$+\circ$	$1$	$0$	$0$
$3M+6$	$-\star$	$1$	$-1$	$0$

distance	letter	$\bullet$	$\star$	$\circ$
$3M+7$	$+\star$	$1$	$0$	$0$
$3M+7$	$-\circ$	$1$	$0$	$-1$
$3M+8$	$-\bullet$	$0$	$0$	$-1$
$3M+9$	$-\circ$	$0$	$0$	$-2$
$4M+10$	$-\star$	$0$	$-1$	$-2$
$4M+11$	$-\circ$	$0$	$-1$	$-3$
$5M+8$	$+\circ$	$0$	$-1$	$-2$
$5M+9$	$+\bullet$	$1$	$-1$	$-2$
$6M+10$	$+\circ$	$1$	$-1$	$-1$
$6M+11$	$+\star$	$1$	$0$	$-1$
$6M+12$	$+\circ$	$1$	$0$	$0$
$6M+12$	$-\bullet$	$0$	$0$	$0$

Table 2: The merge of distance sequences from Table 1.
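
The exhaustive verification behind Tables 1 and 2 can also be replayed mechanically. The following Python sketch is an illustration and not part of the original proof: it assumes that one period of the $\bullet,\star,\circ$ skeleton to the right of $r^{\prime}$ is exactly as drawn in Figure 9, treats all other letters as anonymous filler, and uses the hypothetical names `period` and `balanced_half_lengths`. It merges the left and right distance sequences and reports the half-lengths at which all three running differences are zero.

```python
from collections import Counter

LETTERS = ("bullet", "star", "circ")

def period(M):
    """One period (length 6M+12) of the skeleton right of r', read off Figure 9.

    None stands for any letter outside {bullet, star, circ} (an assumption
    made for this sketch: such letters are simply ignored).
    """
    def filler(k):
        return [None] * k
    return (["circ", "star", "circ"] + filler(M)
            + ["bullet", "circ"] + filler(2 * M)
            + ["star", "circ", "bullet", "circ"] + filler(M)
            + ["star", "circ"] + filler(2 * M)
            + ["bullet"])

def balanced_half_lengths(M, periods=2):
    """Half-lengths k such that both halves of the length-2k factor centred
    at r' contain equally many bullets, stars and circs."""
    right = period(M) * periods   # letters at distances 1, 2, ... to the right
    left = right[::-1]            # to the left, the same period is read backwards
    diff = Counter()              # (#occurrences in left half) - (#in right half)
    balanced = []
    for k, (a, b) in enumerate(zip(left, right), start=1):
        if a is not None:
            diff[a] += 1
        if b is not None:
            diff[b] -= 1
        if all(diff[c] == 0 for c in LETTERS):
            balanced.append(k)
    return balanced

if __name__ == "__main__":
    for M in (3, 4, 10):
        print(M, balanced_half_lengths(M))
```

Under these assumptions and for $M\geq 3$, the only half-lengths reported within two periods are $6M+12$ and $12M+24$, in line with the last row of Table 2.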

Consequently, as in Lemma 4.2, the corresponding instance of $\operatorname{\textsc{3DAP}}$ is a YES-instance.

The complexities in the theorem are obtained as in Theorems 4.4 and 4.6.

7 Open problems

The most interesting questions that remain open are as follows:

1. Is checking Ab-square-freeness $\operatorname{\textsc{3SUM}}$-hard? Our reductions allowed us to show $\operatorname{\textsc{3SUM}}$-hardness of detecting an odd Ab-square.
2. Can one detect an additive square in a length-$n$ string over a constant-sized alphabet in $\mathcal{O}(n^{2-\varepsilon})$ time, for some $\varepsilon>0$? We have shown $\operatorname{\textsc{3SUM}}$-hardness of this problem for an alphabet that is polynomial in $n$.

References

[1] Peyman Afshani, Ingo van Duijn, Rasmus Killmann, and Jesper Sindahl Nielsen. A lower bound for jumbled indexing. In Shuchi Chawla, editor, Proceedings of the 2020 ACM-SIAM Symposium on Discrete Algorithms, SODA 2020, Salt Lake City, UT, USA, January 5-8, 2020, pages 592–606. SIAM, 2020. doi:10.1137/1.9781611975994.36.
[2] Amihood Amir, Alberto Apostolico, Tirza Hirst, Gad M. Landau, Noa Lewenstein, and Liat Rozenberg. Algorithms for jumbled indexing, jumbled border and jumbled square on run-length encoded strings. Theoretical Computer Science, 656:146–159, 2016. doi:10.1016/j.tcs.2016.04.030.
[3] Amihood Amir, Timothy M. Chan, Moshe Lewenstein, and Noa Lewenstein. On hardness of jumbled indexing. In Javier Esparza, Pierre Fraigniaud, Thore Husfeldt, and Elias Koutsoupias, editors, Automata, Languages, and Programming - 41st International Colloquium, ICALP 2014, Copenhagen, Denmark, July 8-11, 2014, Proceedings, Part I, volume 8572 of Lecture Notes in Computer Science, pages 114–125. Springer, 2014. doi:10.1007/978-3-662-43948-7_10.
[4] Hideo Bannai, Shunsuke Inenaga, and Dominik Köppl. Computing all distinct squares in linear time for integer alphabets. In Juha Kärkkäinen, Jakub Radoszewski, and Wojciech Rytter, editors, 28th Annual Symposium on Combinatorial Pattern Matching, CPM 2017, July 4-6, 2017, Warsaw, Poland, volume 78 of LIPIcs, pages 22:1–22:18. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2017. doi:10.4230/LIPIcs.CPM.2017.22.
[5] Ilya Baran, Erik D. Demaine, and Mihai Patrascu. Subquadratic algorithms for 3SUM. Algorithmica, 50(4):584–596, 2008. doi:10.1007/s00453-007-9036-3.
[6] Felix Adalbert Behrend. On sets of integers which contain no three terms in arithmetical progression. Proceedings of the National Academy of Sciences of the United States of America, 32(12):331–332, 1946. doi:10.1073/pnas.32.12.331.
[7] Tom C. Brown and Allen R. Freedman. Arithmetic progressions in lacunary sets. Rocky Mountain Journal of Mathematics, 17(3):587–596, 1987.
[8] Tom C. Brown, Veselin Jungić, and Andrew Poelstra. On double 3-term arithmetic progressions. Integers, 14:A43, 2014. URL: https://www.emis.de/journals/INTEGERS/papers/o43/o43.Abstract.html.
[9] Julien Cassaigne, James D. Currie, Luke Schaeffer, and Jeffrey O. Shallit. Avoiding three consecutive blocks of the same size and same sum. Journal of the ACM, 61(2):10:1–10:17, 2014. doi:10.1145/2590775.
[10] Timothy M. Chan and Qizheng He. Reducing 3SUM to Convolution-3SUM. In Martin Farach-Colton and Inge Li Gørtz, editors, 3rd Symposium on Simplicity in Algorithms, SOSA@SODA 2020, Salt Lake City, UT, USA, January 6-7, 2020, pages 1–7. SIAM, 2020. doi:10.1137/1.9781611976014.1.
[11] Timothy M. Chan and Moshe Lewenstein. Clustered integer 3SUM via additive combinatorics. In Rocco A. Servedio and Ronitt Rubinfeld, editors, Proceedings of the Forty-Seventh Annual ACM on Symposium on Theory of Computing, STOC 2015, Portland, OR, USA, June 14-17, 2015, pages 31–40. ACM, 2015. doi:10.1145/2746539.2746568.
[12] Maxime Crochemore, Costas S. Iliopoulos, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, and Tomasz Waleń. Extracting powers and periods in a word from its runs structure. Theoretical Computer Science, 521:29–41, 2014. doi:10.1016/j.tcs.2013.11.018.
[13] Larry J. Cummings and William F. Smyth. Weak repetitions in strings. Journal of Combinatorial Mathematics and Combinatorial Computing, 24:33–48, 1997.
[14] Bartłomiej Dudek, Paweł Gawrychowski, and Tatiana Starikovskaya. All non-trivial variants of 3-LDT are equivalent. In Konstantin Makarychev, Yury Makarychev, Madhur Tulsiani, Gautam Kamath, and Julia Chuzhoy, editors, Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing, STOC 2020, Chicago, IL, USA, June 22-26, 2020, pages 974–981. ACM, 2020. doi:10.1145/3357713.3384275.
[15] Roger C. Entringer, Douglas E. Jackson, and J.A. Schatz. On nonrepetitive sequences. Journal of Combinatorial Theory, Series A, 16(2):159–164, 1974. doi:10.1016/0097-3165(74)90041-7.
[16] Paul Erdős. Some unsolved problems. Magyar Tudományos Akadémia Matematikai Kutató Intézetének Közleményei, 6:221–254, 1961.
[17] Jeff Erickson. Finding longest arithmetic progressions, 1999. URL: https://jeffe.cs.illinois.edu/pubs/arith.html.
[18] Aleksandr Andreevich Evdokimov. Strongly asymmetric sequences generated by a finite number of symbols. Doklady Akademii Nauk SSSR, 179(6):1268–1271, 1968.
[19] Gabriele Fici, Filippo Mignosi, and Jeffrey O. Shallit. Abelian-square-rich words. Theoretical Computer Science, 684:29–42, 2017. doi:10.1016/j.tcs.2017.02.012.
[20] Gabriele Fici and Aleksi Saarela. On the minimum number of abelian squares in a word. In Maxime Crochemore, James Currie, Gregory Kucherov, and Dirk Nowotka, editors, Combinatorics and Algorithmics of Strings (Dagstuhl Seminar 14111), volume 4 (3), pages 34–35, Dagstuhl, Germany, 2014. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik. doi:10.4230/DagRep.4.3.28.
[21] Aviezri S. Fraenkel, Jamie Simpson, and Mike Paterson. On weak circular squares in binary words. In Alberto Apostolico and Jotun Hein, editors, Combinatorial Pattern Matching, 8th Annual Symposium, CPM 97, Aarhus, Denmark, June 30 - July 2, 1997, Proceedings, volume 1264 of Lecture Notes in Computer Science, pages 76–82. Springer, 1997. doi:10.1007/3-540-63220-4_51.
[22] Allen R. Freedman and Tom C. Brown. Sequences on sets of four numbers. Integers, 16:A33, 2016. URL: http://math.colgate.edu/~integers/q33/q33.Abstract.html.
[23] Isaac Goldstein, Tsvi Kopelowitz, Moshe Lewenstein, and Ely Porat. How hard is it to find (honest) witnesses? In Piotr Sankowski and Christos D. Zaroliagis, editors, 24th Annual European Symposium on Algorithms, ESA 2016, August 22-24, 2016, Aarhus, Denmark, volume 57 of LIPIcs, pages 45:1–45:16. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2016. doi:10.4230/LIPIcs.ESA.2016.45.
[24] Dan Gusfield and Jens Stoye. Linear time algorithms for finding and representing all the tandem repeats in a string. Journal of Computer and System Sciences, 69(4):525–546, 2004. doi:10.1016/j.jcss.2004.03.004.
[25] Lorenz Halbeisen and Norbert Hungerbühler. An application of van der Waerden’s theorem in additive number theory. Integers, 0:A7, 2000. URL: http://math.colgate.edu/~integers/a7/a7.pdf.
[26] Veikko Keränen. A powerful abelian square-free substitution over 4 letters. Theoretical Computer Science, 410(38-40):3893–3900, 2009. doi:10.1016/j.tcs.2009.05.027.
[27] Veikko Keränen. Abelian squares are avoidable on 4 letters. In Werner Kuich, editor, Automata, Languages and Programming, ICALP 1992, volume 623 of Lecture Notes in Computer Science, pages 41–52. Springer, 1992. doi:10.1007/3-540-55719-9_62.
[28] Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter, and Tomasz Waleń. Maximum number of distinct and nonequivalent nonstandard squares in a word. Theoretical Computer Science, 648:84–95, 2016. doi:10.1016/j.tcs.2016.08.010.
[29] Tomasz Kociumaka, Jakub Radoszewski, and Bartłomiej Wiśniewski. Subquadratic-time algorithms for abelian stringology problems. In Ilias S. Kotsireas, Siegfried M. Rump, and Chee K. Yap, editors, Mathematical Aspects of Computer and Information Sciences - 6th International Conference, MACIS 2015, Berlin, Germany, November 11-13, 2015, Revised Selected Papers, volume 9582 of Lecture Notes in Computer Science, pages 320–334. Springer, 2015. doi:10.1007/978-3-319-32859-1_27.
[30] Tomasz Kociumaka, Jakub Radoszewski, and Bartłomiej Wiśniewski. Subquadratic-time algorithms for abelian stringology problems. AIMS Medical Science, 4(3):332–351, 2017. doi:10.3934/ms.2017.3.332.
[31] Tsvi Kopelowitz, Seth Pettie, and Ely Porat. Higher lower bounds from the 3SUM conjecture. In Robert Krauthgamer, editor, Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2016, Arlington, VA, USA, January 10-12, 2016, pages 1272–1287. SIAM, 2016. doi:10.1137/1.9781611974331.ch89.
[32] Florian Lietard and Matthieu Rosenfeld. Avoidability of additive cubes over alphabets of four numbers. In Natasa Jonoska and Dmytro Savchuk, editors, Developments in Language Theory - 24th International Conference, DLT 2020, Tampa, FL, USA, May 11-15, 2020, Proceedings, volume 12086 of Lecture Notes in Computer Science, pages 192–206. Springer, 2020. doi:10.1007/978-3-030-48516-0_15.
[33] Giuseppe Pirillo and Stefano Varricchio. On uniformly repetitive semigroups. Semigroup Forum, 49:125–129, 1994. doi:10.1007/BF02573477.
[34] Peter A. B. Pleasants. Non-repetitive sequences. Mathematical Proceedings of the Cambridge Philosophical Society, 68:267–274, 1970.
[35] Mihai Pătrașcu. Towards polynomial lower bounds for dynamic problems. In Leonard J. Schulman, editor, Proceedings of the 42nd ACM Symposium on Theory of Computing, STOC 2010, Cambridge, Massachusetts, USA, 5-8 June 2010, pages 603–610. ACM, 2010. doi:10.1145/1806689.1806772.
[36] Michaël Rao and Matthieu Rosenfeld. Avoiding two consecutive blocks of same size and same sum over $\mathbb{Z}^{2}$ . SIAM Journal on Discrete Mathematics, 32(4):2381–2397, 2018. doi:10.1137/17M1149377.
[37] Lawrence Bruce Richmond and Jeffrey O. Shallit. Counting abelian squares. Electronic Journal of Combinatorics, 16(1), 2009. URL: http://www.combinatorics.org/Volume_16/Abstracts/v16i1r72.html.
[38] Raphaël Salem and Donald C. Spencer. On sets of integers which contain no three terms in arithmetical progression. Proceedings of the National Academy of Sciences of the United States of America, 28(12):561–563, 1942. doi:10.1073/pnas.28.12.561.
[39] Jamie Simpson. Solved and unsolved problems about abelian squares, 2018. arXiv:1802.04481.
[40] Shiho Sugimoto, Naoki Noda, Shunsuke Inenaga, Hideo Bannai, and Masayuki Takeda. Computing abelian string regularities based on RLE. In Ljiljana Brankovic, Joe Ryan, and William F. Smyth, editors, Combinatorial Algorithms - 28th International Workshop, IWOCA 2017, Newcastle, NSW, Australia, July 17-21, 2017, Revised Selected Papers, volume 10765 of Lecture Notes in Computer Science, pages 420–431. Springer, 2017. doi:10.1007/978-3-319-78825-8_34.