Some structural complexity results for $\exists{\mathbb{R}}$

Klaus Meer and Adrian Wurm
Computer Science Institute, BTU Cottbus-Senftenberg
Platz der Deutschen Einheit 1
D-03046 Cottbus, Germany
{meer,wurm}@b-tu.de

Abstract

The complexity class ${\exists{\mathbb{R}}}$, standing for the complexity of deciding the existential first-order theory of the reals as a real closed field in the Turing model, has raised considerable interest in recent years. It is well known that ${\rm NP}\subseteq{\exists{\mathbb{R}}}\subseteq{\rm PSPACE}.$ In their compendium [Schaefer2024], Schaefer, Cardinal, and Miltzow give a comprehensive presentation of results together with a rich collection of open problems. Here, we answer some of them dealing with structural issues of ${\exists{\mathbb{R}}}$ as a complexity class. We show analogues of the classical results of Baker, Gill, and Solovay finding oracles which do and do not separate NP from ${\exists{\mathbb{R}}}$, of Ladner’s theorem showing the existence of problems in ${\exists{\mathbb{R}}}\setminus{\rm NP}$ not being complete for ${\exists{\mathbb{R}}}$ (in case the two classes are different), as well as a characterization of ${\exists{\mathbb{R}}}$ by means of descriptive complexity.

1 Introduction

The existential theory of the reals collects all true sentences in existential first-order logic over ${\mathbb{R}},$ i.e., sentences of the form $\exists x_{1}\ldots\exists x_{n}\Phi(x_{1},\ldots,x_{n}),$ where $\Phi$ is a quantifier-free formula expressing a Boolean combination of polynomial equalities and inequalities. The complexity of deciding the truth of a given sentence plays a prominent role for several computational models and complexity classes. In the Blum-Shub-Smale model (henceforth BSS for short) over ${\mathbb{R}}$, basic arithmetic operations can be executed with unit cost, and the size of $\Phi$ is basically the algebraic size of a dense encoding of all polynomials occurring in it. The above decision problem then turns out to be ${\rm NP}_{{\mathbb{R}}}$-complete, where the latter class is the real number analogue of classical NP [BSS]. However, the decision problem also makes perfect sense in the Turing model of computation if we restrict all coefficients of the polynomials in $\Phi$ to be rational. Then, its size is the usual bit-size and the problem of deciding truth is easily seen to be NP-hard. In the Turing model, the exact complexity of the existential theory of the reals is (wide) open. Using deep results on quantifier elimination it has been shown that it can be decided within PSPACE, see [Canny] and also [Renegar]. The open placement of the problem somewhere between NP and PSPACE has led to the definition of a separate complexity class denoted ${\exists{\mathbb{R}}};$ a precise definition is given in the next subsection.

Among many other things, in [Schaefer2024] the authors ask whether several structural results well known to hold for the classical class NP in the Turing model can also be established for ${\exists{\mathbb{R}}}.$ These include an analogue of the Baker-Gill-Solovay result [BGS] on the existence of oracles separating ${\exists{\mathbb{R}}}$ from NP and the question whether ${\exists{\mathbb{R}}}$ can be characterized by means of descriptive complexity as done in [Fagin] for NP. In this paper we give positive answers to these questions and also show that Ladner’s result [Ladner] can be adapted to show that, if we assume ${\exists{\mathbb{R}}}\neq{\rm NP}$, there exist problems in the set difference that are not ${\exists{\mathbb{R}}}$-complete.

A few words on the proofs might be appropriate. All above mentioned results for NP turn out to hold in the corresponding variants as well for ${\exists{\mathbb{R}}}$ . In some cases, though not completely straightforward, proofs follow from not too complicated modifications of known proofs of related results in the BSS model. One basic ingredient for the proofs to work is the characterization of ${\exists{\mathbb{R}}}$ via the BSS class BP $({\rm NP}_{{\mathbb{R}}}^{0})$ in Theorem 1.1 below together with the most important observation that the class of ${\rm NP}_{{\mathbb{R}}}^{0}$ -machines in the BSS model is countable. Therefore, we see the main value of the present paper in working out the necessary proof details of expected results in order to enlarge the body of structural results known to hold for ${\exists{\mathbb{R}}}.$

1.1 Basic definitions: ${\exists{\mathbb{R}}}$ and BSS-machines without constants

We now define precisely the class ${\exists{\mathbb{R}}}$ together with BSS computations using only rational machine constants as well as related discrete complexity classes BP $(\mbox{P}_{{\mathbb{R}}}^{0})$ and BP $({\rm NP}_{{\mathbb{R}}}^{0}).$

The class ${\exists{\mathbb{R}}}$ was formally introduced in [Schaefer2009], but had in effect been examined earlier; see [Schaefer2024] for a historic account of questions related to this class that were studied already in the 1990s. The class can most easily be defined starting from a complete problem and then building the downward closure under the usual polynomial time many-one reductions. In the meantime, several problems have been shown to be complete. For our purposes, the existence of real solutions of a polynomial system of degree-two equations is the most suitable one.

Definition 1

a) The decision problem Quadratic Polynomial Systems ${\rm QPS}^{0}$ is defined as follows: Given $n,m\in{\mathbb{N}}$ and polynomials $f_{i}\in{\mathbb{Q}}[x_{1},\ldots,x_{m}],1\leq i\leq n$ of degree at most $2$, is there an $x^{*}\in{\mathbb{R}}^{m}$ such that $f_{i}(x^{*})=0$ for all $i?$ The problem is understood as a decision problem in the Turing model, i.e., instances are coded as binary strings over $\{0,1\}^{*}$. (Footnote 1: The superscript $0$ in ${\rm QPS}^{0}$ results from the close relation of the approach to Blum-Shub-Smale machines that only use rational constants and is common in that context, see below. Here it indicates rational coefficients.)

b) The complexity class ${\exists{\mathbb{R}}}$ consists of all decision problems $L\subseteq\{0,1\}^{*}$ such that there exists a polynomial time many one reduction from $L$ to ${\rm QPS}^{0}$ .

The completeness of ${\rm QPS}^{0}$ for class ${\exists{\mathbb{R}}}$ , i.e., for the existential theory of the reals easily follows from the proof in [BSS] that the corresponding problem for quadratic polynomial systems with arbitrary real coefficients is ${\rm NP}_{{\mathbb{R}}}$ -complete in the BSS model over ${\mathbb{R}}$ . In the corresponding proof, no real constants are introduced and starting from decision problems in $\{0,1\}^{*}$ the reduction runs in polynomial time also in the Turing model.

The class ${\exists{\mathbb{R}}}$ can easily be characterized via a restriction of the class ${\rm NP}_{{\mathbb{R}}}$ in the BSS model, see [BSS, BCSS97]. The full real number BSS model allows computing with reals as entities, performing the basic arithmetic operations together with a test of the form ‘is $x\geq 0$?’ at unit cost. Algorithms are allowed to use a fixed finite set of real numbers as so-called machine constants. The latter always include $0$ and $1.$ The (algebraic) size of an instance then is the number of reals necessary to specify it (in a reasonable encoding), and the class ${\rm P}_{{\mathbb{R}}}$ of problems decidable in deterministic polynomial time is defined literally the same as P, i.e., there exists a polynomial time algorithm (measured with respect to the algebraic size measure for instances and the unit cost measure for algorithms) deciding the problem correctly. Similarly, ${\rm NP}_{{\mathbb{R}}}$ stands for problems verifiable in polynomial time when having access to a witness that is a real vector of polynomial (algebraic) size in the size of the given instance. If we want to handle discrete problems in this model, i.e., problems $L\subseteq\{0,1\}^{*}$, it is appropriate to restrict inputs to binary strings and to disallow BSS algorithms from using non-rational machine constants. This leads to complexity classes highly important in connection with ${\exists{\mathbb{R}}}$.

Definition 2

The class BP $({\rm P}_{{\mathbb{R}}}^{0})$ consists of all $L\subseteq\{0,1\}^{*}$ such that there exists a BSS algorithm running in (algebraic) polynomial time, using only rational machine constants and deciding $L$ on inputs from $\{0,1\}^{*}$ . Similarly, the class BP $({\rm NP}_{{\mathbb{R}}}^{0})$ consists of all $L\subseteq\{0,1\}^{*}$ such that there exists a BSS algorithm using only rational machine constants and verifying $L$ on inputs from $\{0,1\}^{*}\times{\mathbb{R}}^{*}$ . Moreover, for inputs $(y,z)$ the (algebraic) running time is polynomially bounded in the size of $y.$

Above, BP stands for ’Boolean Part’ and ${\mathbb{R}}^{*}$ denotes the real number analogue of $\{0,1\}^{*}$ , i.e., finite sequences of real numbers. Note that for discrete inputs $y\in\{0,1\}^{*}$ the bit-size and the algebraic size coincide. However, the cost measure for algorithms working with discrete data is the algebraic one. It is also important to stress again that for verification algorithms the certificate $z\in{\mathbb{R}}^{*}$ is allowed to be a vector of reals. This is the reason for the tight relation of BP $({\rm NP}_{{\mathbb{R}}}^{0})$ to ${\exists{\mathbb{R}}}.$ It follows immediately from the original completeness proof in [BSS]:

Theorem 1.1

BP $({\rm NP}_{{\mathbb{R}}}^{0})\ =\ {\exists{\mathbb{R}}}.$

Below, a decisive aspect of this theorem is that ${\exists{\mathbb{R}}}$ can be characterized through a complexity class defined by a machine model which has a countable number of machines only.

2 Relativization of NP versus ${\exists{\mathbb{R}}}$

The first question from the compendium [Schaefer2024] we want to deal with is whether there exist oracles which do and do not separate ${\exists{\mathbb{R}}}$ from NP and PSPACE, respectively. Comparing P and NP in the Turing model, these questions were answered affirmatively long ago [BGS]; in the BSS model an analogous result holds [Emerson], see also [Gassner]. Usually, the construction of an oracle separating the respective classes is done via a diagonalization argument. As a consequence, the separating oracles in both settings are not very natural (with an exception in [Gassner], where a Knapsack problem is used to separate other classes). As we shall see, for NP and ${\exists{\mathbb{R}}}$ a quite natural oracle works. The definition of oracle classes like ${\rm P}^{A}$ and ${\rm NP}^{A}$ naturally relies on a basic model for algorithms defining a complexity class, like deterministic or non-deterministic polynomial time Turing machines. Then, one equips them with the additional feature to use intermediate results as queries to an oracle. Following the same natural approach for ${\exists{\mathbb{R}}}$, we work with the following definition.

Definition 3

Let $A\subseteq{\mathbb{R}}^{*}.$ A problem $L\subseteq\{0,1\}^{*}$ belongs to the oracle class ${\exists{\mathbb{R}}}^{A}$ , if there exists an ${\rm NP}_{{\mathbb{R}}}^{0}$ -algorithm $M$ which in addition has an oracle state. If $M$ enters this state, it can ask the oracle whether an element $y\in{\mathbb{R}}^{*}$ previously computed belongs to $A$ , receives the correct answer in one step, and continues its computation. Then, for $x\in L$ there exists a computation of $M$ which accepts $x$ and for $x\in\{0,1\}^{*}\setminus L$ no computation accepts.

Note that though this is the straightforward definition of oracle classes ${\exists{\mathbb{R}}}^{A}$, some subtleties are hidden. Since the basic machine is an ${\rm NP}_{{\mathbb{R}}}^{0}$-algorithm, it is allowed to produce real number oracle queries, something a Turing machine cannot do. Conversely, since a Turing machine only computes with discrete data, we know in advance that an NP-oracle machine will never produce non-rational queries, a property which is in general undecidable for a BSS machine. Our results below establishing the integers as separating oracle implicitly make use of such effects. We further comment on this at the end of this section.

Theorem 2.1

Given the above definition of oracle classes based on ${\exists{\mathbb{R}}}$ , the following hold:

a) There exists an oracle $A$ such that ${\rm P}^{A}={\rm NP}^{A}={\exists{\mathbb{R}}}^{A}={\rm PSPACE}^{A}.$

b) The oracle ${\mathbb{Z}}$ , seen as subset of ${\mathbb{R}}$ , both satisfies ${\rm NP}={\rm NP}^{{\mathbb{Z}}}\subsetneq{\exists{\mathbb{R}}}^{{\mathbb{Z}}}$ and ${\exists{\mathbb{R}}}^{{\mathbb{Z}}}\neq{\rm PSPACE}^{{\mathbb{Z}}}.$

c) In the full BSS model, ${\rm P}_{{\mathbb{R}}}^{{\mathbb{Z}}}\subsetneq{\rm NP}_{{\mathbb{R}}}^{{\mathbb{Z}}}.$

Proof

For a) we can choose $A$ as the PSPACE-complete problem Quantified Boolean Formulas QBF. Since PSPACE = NPSPACE = co-NPSPACE it is well known that ${\rm P}^{\rm QBF}={\rm NP}^{\rm QBF}={\rm PSPACE}^{\rm QBF}={\rm PSPACE}.$ Arguing about ${\exists{\mathbb{R}}}^{\rm QBF}$ needs some care since the underlying machine model in the definition changes. Suppose $M$ to be a basic ${\rm NP}_{{\mathbb{R}}}^{0}$ -oracle machine using oracle QBF. Let $p(n)$ be the polynomial time running bound of $M$ for inputs in $\{0,1\}^{n}.$ W.l.o.g. suppose $M$ for such inputs asks $m\leq p(n)$ oracle queries, all being of polynomial size at most $p(n).$ A PSPACE-algorithm deciding $M^{\rm QBF}$ on input $x\in\{0,1\}^{n}$ works as follows: First, consider a computable enumeration of bit-vectors $b\in\{0,1\}^{m}$ and of $m$ strings $(q_{1},\ldots,q_{m})$ , all $q_{i}$ strings over alphabet $\{0,1,*\}$ and of size at most $p(n).$ Below, we interpret the $q_{i}$ as oracle queries $M$ poses during its computation and $b_{i}$ as the answers. A component $*$ in a query indicates that it does not belong to $\{0,1\}.$ In case that $M$ computes such a query, the answer from the QBF-oracle is $0$ since the query does not code a correct QBF-instance. Now, given a fixed tuple $(b_{1},\ldots,b_{m},q_{1},\ldots,q_{m})$ in the enumeration, consider an existential first-order sentence over ${\mathbb{R}}$ without non-rational constants claiming that there exists a computation of $M$ on $x$ which produces the queries $q_{i}$ under the assumption that $b_{1},\ldots,b_{i-1}$ are the correct answers to the previous queries, and then finally accepts. Since the $q_{i}$ are vectors over $\{0,1,*\},$ the requirement that a query computed by $M$ equals $q_{i}$ is easily expressible with such a first-order sentence. 
This holds for queries being bit-vectors as well as for a query containing a component $*$ , in which case the corresponding part of the first-order sentence has to express that the component does not belong to $\{0,1\}$ . This is easily doable. The resulting entire sentence is an instance of ${\exists{\mathbb{R}}}$ and thus can be decided in PSPACE by [Canny]. It remains to guarantee in addition that the $b_{i}$ are the correct answers to QBF-queries $q_{i}$ . This of course can be done in PSPACE. Note that if a $q_{i}$ contains a non-binary component, the corresponding $b_{i}$ must be $0$ . Now the algorithm does the above for all elements in the enumeration by re-using its space and accepts, if for at least one tuple $(b_{1},\ldots,b_{m},q_{1},\ldots,q_{m})$ all conditions are satisfied. It follows ${\exists{\mathbb{R}}}^{\rm QBF}={\rm PSPACE}$ as well.

For part b) note that both ${\rm NP}^{{\mathbb{Z}}}$ and ${\rm PSPACE}^{{\mathbb{Z}}}$ only contain problems being decidable in the Turing model. Moreover, if we consider a Turing machine over $\{0,1\}^{*}$ coding computations over ${\mathbb{Q}}$ in any usual way, then it is easy to decide for an intermediate result whether it encodes an integer or not. Since for an NP-machine the size of a query is polynomial in the input, we can also check in polynomial time whether a query is integral, thus ${\rm NP}={\rm NP}^{{\mathbb{Z}}}.$ But ${\exists{\mathbb{R}}}^{{\mathbb{Z}}}$ contains problems being undecidable in the Turing model such as Hilbert’s 10th problem: Given a polynomial $f\in{\mathbb{Z}}[x_{1},\ldots,x_{n}]$ in $n$ variables with integer coefficients, an ${\rm NP}_{{\mathbb{R}}}^{0,{\mathbb{Z}}}$-algorithm can guess in a non-deterministic computation $n$ real numbers $x_{1}^{*},\ldots,x_{n}^{*}$ and then ask the oracle whether all $x_{i}^{*}$ are integers. If the answer is positive the algorithm evaluates $f(x_{1}^{*},\ldots,x_{n}^{*})$ in polynomial time in the algebraic model. It accepts iff the result is $0$. Thus, the problem to decide whether there is an integer zero of $f$ belongs to ${\exists{\mathbb{R}}}^{{\mathbb{Z}}},$ but is well known to be undecidable [Matiyasevich].
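The verification step used in part b) can be illustrated by a small sketch. It is only an illustration under stated assumptions: exact rationals (`Fraction`) stand in for the real numbers a BSS machine could guess, and the names `integer_oracle`, `evaluate`, and `verify` as well as the dictionary encoding of polynomials are hypothetical and not taken from the paper. The sketch shows the deterministic check on a given witness, not the non-deterministic guessing itself.

```python
from fractions import Fraction

def integer_oracle(y):
    """Stand-in for the oracle Z: answers whether the number y is an integer."""
    return Fraction(y).denominator == 1

def evaluate(poly, point):
    """Evaluate a polynomial given as {exponent tuple: integer coefficient}."""
    total = Fraction(0)
    for exponents, coeff in poly.items():
        term = Fraction(coeff)
        for x, e in zip(point, exponents):
            term *= Fraction(x) ** e
        total += term
    return total

def verify(poly, witness):
    """Accept iff all witness components are integers and poly vanishes there."""
    # One oracle query per guessed component.
    if not all(integer_oracle(x) for x in witness):
        return False
    # Evaluation of f at the witness; accept iff the result is 0.
    return evaluate(poly, witness) == 0

# Illustrative instance: f(x, y) = x^2 + y^2 - 25.
circle = {(2, 0): 1, (0, 2): 1, (0, 0): -25}
```

For example, `verify(circle, (3, 4))` accepts, while a witness with a non-integer component such as `(Fraction(5), Fraction(1, 2))` is rejected by the oracle check before any evaluation takes place.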

For c) we can simplify Emerson’s construction by using a result from [Meer93]. Consider the set $A:=\{t\in[0,2\pi]|\exists k\in{\mathbb{N}}\ \mbox{s.t.}\ \frac{k\cdot t}{2\pi}\in{\mathbb{N}}\}.$ It is easy to see that the decision problem: Given $x\in{\mathbb{R}}$, is $x\in A?$ belongs to ${\rm NP}_{{\mathbb{R}}}^{{\mathbb{Z}}}:$ For input $x$ guess a $k\in{\mathbb{R}}$ and check whether $k\geq 1$ and $k\in{\mathbb{Z}}$ by asking the oracle. Then compute $\frac{kx}{2\pi}$ and ask the oracle again whether it belongs to ${\mathbb{Z}}$. All this can be done in constantly many algebraic steps. Note that in the BSS model $x\in{\mathbb{R}}$ has input size $1$. Hence, in order to belong to ${\rm P}_{{\mathbb{R}}}^{{\mathbb{Z}}}$ the question would have to be decided in constant time using the oracle. In [Meer93] it is shown by using a ‘typical path argument’ that this problem cannot be decided by a polynomial time BSS machine which is allowed in addition to evaluate the sine-function in one step. Since in this ‘sine-model’ one can decide in constant time whether a number is integral, it follows that every problem in ${\rm P}_{{\mathbb{R}}}^{{\mathbb{Z}}}$ can be decided by a polynomial time sine-machine, but $A$ cannot. We conclude that $A\not\in{\rm P}_{{\mathbb{R}}}^{{\mathbb{Z}}}.$ $\Box$

Note that part c) gives, as an alternative to the proof in [Emerson], a more natural problem yielding the separation. As for part b), this seems to be an effect observable at least for some results in the BSS setting; compare a similar statement for a real number version of Post’s problem [MeerZiegler2005]. A curious reason why the separation of NP and ${\exists{\mathbb{R}}}$ works is the fact that even though the former is a subset of the latter, this no longer holds for arbitrary relativized versions. Clearly, the reason for this is the use of different machine models to define the corresponding oracle classes. Such an effect concerning relativized classes is also known from classical complexity theory for less prominent classes and is studied in [Vereshchagin]. It might be interesting both to find other oracles yielding the separation and to study the above effect in more detail in our framework.

3 Descriptive complexity for ${\exists{\mathbb{R}}}$

The next question from the compendium [Schaefer2024] is whether ${\exists{\mathbb{R}}}$ can be characterized by purely logical means in the sense of descriptive complexity theory. Corresponding results were given in the early years of finite model theory for classical NP [Fagin] and later on for the real number version ${\rm NP}_{{\mathbb{R}}}$ in the BSS model [GraedelMeer], based on the development of meta-finite model theory in [GraedelGurevich]; see [Graedeletal] for a more intensive introduction into such results. One special feature, given the above mentioned characterization of ${\exists{\mathbb{R}}}$ as BP $({\rm NP}_{{\mathbb{R}}}^{0})$ , is the mixture of discrete inputs coded as in classical complexity theory over $\{0,1\}^{*}$ , and an algebraic cost measure for the arithmetic computations performed by the underlying machine. It turns out that this split can be modelled by a restriction of so-called ${\mathbb{R}}$ -structures used in [GraedelMeer] to characterize (full) ${\rm NP}_{{\mathbb{R}}}$ and respective logics on them. Thus, the focus below will be on defining the corresponding restriction which we call discrete ${\mathbb{R}}$ -structures, and the suitable logics on them used to capture both ${\exists{\mathbb{R}}}$ and BP $({\rm P}_{{\mathbb{R}}}^{0})$ . Corresponding proofs then are quite similar to those given in [GraedelMeer] and are only sketched in the Appendix. Though below we work with restricted ${\mathbb{R}}$ -structures, we have to recall their definition in full generality, except for disallowing arbitrary real constants in the so-called secondary part due to the fact that the BSS algorithms studied here do not use such constants.

Definition 4

Let $L_{s},L_{f}$ be finite vocabularies, where $L_{s}$ may contain relation and function symbols, and $L_{f}$ contains function symbols only. An ${\mathbb{R}}$ -structure of signature $\sigma=(L_{s},L_{f})$ is a triple ${\mathfrak{D}}=({\cal A},{\cal{R}},\cal F)$ consisting of

(i): a finite structure ${\cal A}$ of vocabulary $L_{s}$ , called the primary part or skeleton of ${\mathfrak{D}}$ ; its universe $A$ is also said to be the universe of ${\mathfrak{D}}$ ;
(ii): the infinite structure ${\cal{R}}=({\mathbb{R}},0,1,+,-,\cdot,/,{\rm sign}\,,<)$ called the secondary part. Here, ${\rm sign}\,:{\mathbb{R}}\mapsto\{0,1\}$ denotes the sign-function being $0$ for negative values and $1$ otherwise;
(iii): a finite set $\cal F$ of functions $X:A^{k}\to{\mathbb{R}}$ interpreting the function symbols in $L_{f}$ , where $k$ depends on the respective symbol only.

We denote the set of all ${\mathbb{R}}$ -structures of signature $\sigma$ by ${\rm Struct}_{{\mathbb{R}}}(\sigma)$ .

For ${\mathfrak{D}}\in{\rm Struct}_{{\mathbb{R}}}(\sigma)$ the size of ${\mathfrak{D}}$ is $|{\mathfrak{D}}|:=|A|.$

We now restrict ${\mathbb{R}}$-structures to so-called discrete ${\mathbb{R}}$-structures suitable for modelling computations in $\mbox{P}_{{\mathbb{R}}}^{0}$ and ${\rm NP}_{{\mathbb{R}}}^{0}$ on Boolean languages. In order to model bit-strings as structures the universe always will have the form $A=\{0,1,\ldots,n-1\}$ and $L_{s}$ contains a unary relation $X\subseteq A$ interpreted as a bit-vector in $\{0,1\}^{n}$ via $X(i)=1\leftrightarrow i\in X,X(i)=0\leftrightarrow i\not\in X.$ In addition, for well known reasons we include a linear order $<\in L_{s}$ in order to capture polynomial time computations below. Finally, as the only element in $L_{f}$ we need a function $real:A\mapsto{\mathbb{R}}$ which is able to change the type of an element in $A$ to become a real (like the cast operator in a programming language such as C++): We define $real(i):=\left\{\begin{array}[]{ll}1\in{\mathbb{R}}&\mbox{for}\ i\neq 0,i\in A\\ 0\in{\mathbb{R}}&\mbox{for}\ i=0\in A\end{array}\right.$

Definition 5

A discrete ${\mathbb{R}}$ -structure ${\mathfrak{D}}$ is an ${\mathbb{R}}$ -structure over vocabulary $\sigma=(L_{s},L_{f}),$ where $A=\{0,1,\ldots,n-1\}$ for some $n\in{\mathbb{N}},L_{s}$ contains at least a unary relation $X$ and a binary relation $<$ interpreted as bit-string and linear order, respectively, and $L_{f}=\{real\},$ where $real:A\mapsto{\mathbb{R}}$ is interpreted as above.

We denote by ${\rm Struct}_{{\mathbb{R}}}^{0}(\sigma)$ all discrete ${\mathbb{R}}$ -structures with vocabulary $\sigma.$
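To make the shape of a discrete ${\mathbb{R}}$-structure concrete, the following sketch encodes a bit-string as such a structure. It is a toy model only: the class name `DiscreteRStructure`, the method names, and the use of Python floats for the values of $real$ are assumptions made for illustration, and the secondary part ${\cal{R}}$ is represented implicitly by ordinary arithmetic.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DiscreteRStructure:
    """Toy model of a discrete R-structure coding a bit-string of length n."""
    bits: tuple  # bits[i] is the i-th input bit

    @property
    def universe(self):
        # The universe A = {0, 1, ..., n-1}.
        return range(len(self.bits))

    def X(self, i):
        # Unary relation X: i belongs to X iff the i-th bit is 1.
        return self.bits[i] == 1

    def less(self, i, j):
        # The linear order < on the universe.
        return i < j

    def real(self, i):
        # The function real: A -> R, being 0 for i = 0 and 1 otherwise.
        return 0.0 if i == 0 else 1.0

def encode(bit_string):
    """Turn a string such as '0110' into a discrete R-structure."""
    return DiscreteRStructure(tuple(int(c) for c in bit_string))
```

With `d = encode("0110")`, the size $|{\mathfrak{D}}|$ corresponds to `len(d.universe)`, and `d.real(0)` and `d.real(3)` return the reals $0$ and $1$, respectively.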

Note that below when we use existential second-order (so) logic, both $<$ and $real$ can be avoided by claiming their existence as functions from $A$ to ${\mathbb{R}}$ using an existential quantifier together with a fo-formula.

Discrete ${\mathbb{R}}$ -structures and logics on them provide the right framework to model BSS-computations on Boolean languages that do not use constants other than rational ones. Inputs being discrete and measured by their usual bit size, the used logics will transfer those inputs into the real number part of a discrete ${\mathbb{R}}$ -structure so that computations are modelled therein, including the use of the algebraic unit cost measure. Let $V=\{v_{0},v_{1},\ldots\}$ denote a countable set of variables. The $v_{i}$ are supposed to be fo-variables ranging over the discrete universe of a discrete ${\mathbb{R}}$ -structure.

Definition 6

Fix ${\cal{R}}=({\mathbb{R}},0,1,+,-,\cdot,/,{\rm sign}\,,<).$ The language ${\rm FO}_{{\mathbb{R}}}^{0}$ contains, for each signature $\sigma=(L_{s},L_{f}),$ a set of formulas and terms. Each term $t$ takes, when interpreted in some discrete ${\mathbb{R}}$ -structure, values in either the skeleton, in which case we call it an index term, or in ${\mathbb{R}}$ , in which case we call it a number term. Terms are defined inductively as follows

(i): The set of index terms is the closure of the set $V$ of variables under applications of function symbols of $L_{s}$ .
(ii): Any rational number is a number term.
(iii): If $h_{1},\ldots,h_{k}$ are index terms and $X$ is a $k$ -ary function symbol of $L_{f},$ then $X(h_{1},\ldots,h_{k})$ is a number term. In particular, for $h$ an index term, $real(h)$ is a number term with value in $\{0,1\}\subset{\mathbb{R}}.$
(iv): If $t,t^{\prime}$ are number terms, then so are $t+t^{\prime}$ , $t-t^{\prime}$ , $t\cdot t^{\prime}$ , and ${\rm sign}\,(t)$ . Here, the sign function is defined as ${\rm sign}\,(t):=\left\{\begin{array}[]{ll}1&\ \mbox{if}\ t\geq 0\\ 0&\ \mbox{if}\ t<0\end{array}\ .\right.$

Atomic formulas are equalities $h_{1}=h_{2}$ of index terms, equalities $t_{1}=t_{2}$ and inequalities $t_{1}<t_{2},t_{1}\leq t_{2}$ of number terms, and expressions $P(h_{1},\ldots,h_{k}),$ where $P$ is a $k$ -ary predicate symbol in $L_{s}$ and $h_{1},\ldots,h_{k}$ are index terms.

The set of formulas of ${\rm FO}_{{\mathbb{R}}}^{0}$ is the smallest set containing all atomic formulas and which is closed under Boolean connectives and quantification $(\exists v)\psi$ and $(\forall v)\psi$ .

Remark 1

Having a linear order available on $A$ it is folklore in finite model theory to define the first and the last element (denoted by $0$ and $n-1$ , respectively) of $A$ with respect to the order, as well as a linear order on every $A^{k},k\in{\mathbb{N}}$ by means of fo-formulas. This is tacitly used below.

In order to capture the complexity class ${\exists{\mathbb{R}}}={\rm BP}({\rm NP}_{{\mathbb{R}}}^{0})$ as well as BP $(\mbox{P}_{{\mathbb{R}}}^{0})$ we have to extend fo-logic to existential second-order logic and fixed point logic, respectively. This is done similarly as in [GraedelMeer], and the proofs showing that the resulting logics capture the intended classes are close variations of the corresponding ones in the full BSS model. The only technical aspect to respect is that for discrete ${\mathbb{R}}$-structures and the logics used we cannot introduce arbitrary real constants. This is rather a restriction for fixed point logic than for existential so-logic since for the latter we can existentially quantify such real objects and then express their desired properties via a fo-formula. Since fixed point logic also needs to work with so-objects, we first define existential so-logic.

Definition 7

Second-order logic ${\rm SO}_{{\mathbb{R}}}^{0}$ on discrete ${\mathbb{R}}$ -structures is obtained starting from ${\rm FO}_{{\mathbb{R}}}^{0}$ logic by adding the possibility to quantify over function symbols. More precisely, given a vocabulary $\sigma=(L_{s},L_{f}),$ where $L_{f}$ contains a function symbol $Y$ interpreted as a function from some $A^{k}\to{\mathbb{R}}$ , together with a first-order formula $\phi$ over $\sigma$ , both $\exists Y\phi(Y)$ and $\forall Y\phi(Y)$ are second-order formulas. If all quantified function symbols are existentially quantified we get existential second-order logic $\exists{\rm SO}_{{\mathbb{R}}}^{0}.$

Example 1

We express the core problem of ${\exists{\mathbb{R}}}$, namely real solvability of an instance of ${\rm QPS}^{0}$ (Definition 1), as an existential so-property on suitable discrete ${\mathbb{R}}$-structures. Some (easy) technical aspects are only sketched. Consider as input instance a system of $m\in{\mathbb{N}}$ polynomials $p_{1},\ldots,p_{m}\in{\mathbb{Z}}[x_{1},\ldots,x_{n}]$ in some $n\in{\mathbb{N}}$ real variables having integer coefficients. All $p_{i}$ are supposed to have degree at most $2$ and the question is to decide whether there exists a common real zero. Recall that this is an ${\exists{\mathbb{R}}}$-complete problem. To simplify the further description, rational coefficients can be removed by multiplying each $p_{i}$ by the least common multiple of the denominators of its coefficients. We represent the system as follows as a discrete ${\mathbb{R}}$-structure ${\mathfrak{D}}=({\cal A},{\cal{R}},\{real\}),$ where ${\cal A}$ has the discrete set $A=\{1,\ldots,n\}\times\{1,\ldots,m\}\times\{0,\ldots,L\}$ as universe. Here, $L\in{\mathbb{N}}$ is an upper bound for the bit-size of the coefficients of all $p_{i}.$ The vocabulary $L_{s}$ contains a linear ordering and a coefficient function $C:A^{4}\mapsto\{0,1\}\subset A$, where for $i,j\in\{1,\ldots,n\},k\in\{1,\ldots,m\}$ and $r\in\{0,\ldots,L\}$ we interpret $C(i,j,k,0)$ as sign of the coefficient of monomial $x_{i}x_{j}$ of $p_{k};$ here $C(i,j,k,0)=0\in A$ codes sign $-1$ and $C(i,j,k,0)=1\in A$ codes sign $1.$ Furthermore, for $r\geq 1$, $C(i,j,k,r)$ is the bit (as element in $A$) of $2^{r-1}$ in the binary representation of the coefficient of $x_{i}x_{j}$ in $p_{k}$. Some technical comments are necessary. Formally, the above discrete part looks different from the definition of a discrete ${\mathbb{R}}$-structure. 
However, it is an easy task to change the above structure to one with a universe $\tilde{A}=\{0,1,\ldots,K-1\}$ for some $K\in{\mathbb{N}}$ and an $\tilde{X}\subset\tilde{A}$ interpreted as bit-string coding the instance. Towards this aim, one has to use an additional relation coding the values $n,m$ and $L$ in $\tilde{A}.$ Using the linear order on $\tilde{A}$ it is easy to express via fo-logic components representing variables, polynomials, and coefficients, respectively. Similarly, for a function $\tilde{C}:\tilde{A}^{4}\mapsto\{0,1\}\subset\tilde{A}$ it is easy to describe the correct arguments and their meaning using fo-logic to define $\tilde{C}$ arbitrarily on arguments having no interpretation. The interested reader might try to elaborate this coding; for sake of readability we prefer to use the above coding instead.

In the existential second order part we now ask for the existence of three functions mapping some $A^{k}$ to ${\mathbb{R}}.$ First, $C_{{\mathbb{R}}}:A^{4}\mapsto{\mathbb{R}}$ should represent the coefficients as real numbers in order to use them in number terms for expressing the evaluation of the $p_{k}$ in real arguments. This can be done using the following fo-formulas. Note that properties like $i,j\in\{1,\ldots,n\},k\in\{1,\ldots,m\},r\in\{0,\ldots,L\}$ can be expressed using the linear ordering; similarly, expressions like $r+1$ can be defined to have the natural meaning.

First, the correct sign of a coefficient is expressed as

\begin{array}[]{rcl}\forall i,j\in\{1,\ldots,n\},k\in\{1,\ldots,m\}\ C(i,j,k,0)=0&\Leftrightarrow&C_{{\mathbb{R}}}(i,j,k,0)=-1\in{\mathbb{R}}\\ &&\hfill{(\mbox{as real number term})}\\ \wedge\ C(i,j,k,0)=1&\Leftrightarrow&C_{{\mathbb{R}}}(i,j,k,0)=1\in{\mathbb{R}}\end{array}

Next, $\forall i,j\in\{1,\ldots,n\},k\in\{1,\ldots,m\},r\in\{1,\ldots,L-1\}$ fixing the highest bit as real and describing the real represented by bit vector $C(i,j,k,*)$ is done via

\begin{array}[]{l}C_{{\mathbb{R}}}(i,j,k,L)=real(C(i,j,k,L))\ \mbox{and}\\ C_{{\mathbb{R}}}(i,j,k,r)=C_{{\mathbb{R}}}(i,j,k,r+1)\cdot 2+real(C(i,j,k,r)),\end{array}

where of course $2$ is a real number term for $1+1$ . This way $C_{{\mathbb{R}}}(i,j,k,1)$ represents (as real) the absolute value of the coefficient of monomial $x_{i}\cdot x_{j}$ in $p_{k}$ . Finally, we require for all $i,j,k:C_{{\mathbb{R}}}(i,j,k,L+1)=C_{{\mathbb{R}}}(i,j,k,0)\cdot C_{{\mathbb{R}}}(i,j,k,1)$ to get the correct sign.
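The recursion defining $C_{{\mathbb{R}}}$ is a Horner-type scheme, and it may help to trace it on a single coefficient. The following sketch does so for one fixed triple $(i,j,k)$; the function name `coefficient_from_bits` and the list encoding of the bits are assumptions made for illustration, and Python integers stand in for the real number terms. It assumes $L\geq 1$.

```python
def coefficient_from_bits(sign_bit, bits):
    """Recover a signed integer coefficient from its discrete coding.

    sign_bit plays the role of C(i,j,k,0): 0 codes sign -1, 1 codes sign +1.
    bits[r-1] plays the role of C(i,j,k,r), the bit of 2^(r-1), r = 1..L.
    """
    L = len(bits)
    c_real = {}
    # Sign as a number term: -1 or 1.
    c_real[0] = -1 if sign_bit == 0 else 1
    # Highest bit: C_R(i,j,k,L) = real(C(i,j,k,L)).
    c_real[L] = bits[L - 1]
    # C_R(i,j,k,r) = C_R(i,j,k,r+1) * 2 + real(C(i,j,k,r)) for r = L-1, ..., 1.
    for r in range(L - 1, 0, -1):
        c_real[r] = c_real[r + 1] * 2 + bits[r - 1]
    # C_R(i,j,k,L+1) = C_R(i,j,k,0) * C_R(i,j,k,1) carries the correct sign.
    c_real[L + 1] = c_real[0] * c_real[1]
    return c_real[L + 1]
```

For instance, sign bit $1$ with bits $(1,0,1)$ for $2^{0},2^{1},2^{2}$ yields the intermediate values $1,2,5$ and hence the coefficient $5$.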

Next, use $\exists Y:A\mapsto{\mathbb{R}}$ in order to existentially quantify a potential common real zero $(Y(1),\ldots,Y(n))$ of all $p_{k}.$ Finally, we have to express the evaluation of all $p_{k}$ in $Y$ . This is easy and can be done similarly to the foregoing construction. Existentially quantify a function $Z:A^{3}\mapsto{\mathbb{R}}$ which represents the partial sums when evaluating all polynomials in $Y$ . One component in the arguments of $Z$ again addresses the respective polynomial to evaluate, the two others code $x_{i}\cdot x_{j}.$ Since all $p_{k}$ have degree at most $2$ , we need for every $k\in\{1,\ldots,m\}$ to cycle through $\{1,\ldots,n\}^{2}$ , and this can be expressed similarly to the computation of a coefficient from a binary representation above. The procedure is the same as that in [GraedelMeer] for evaluating a single degree-4 polynomial.
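The role of $Z$ as a table of partial sums can be sketched as follows. The sketch assumes that each $p_{k}$ is given only through its coefficients of the monomials $x_{i}\cdot x_{j}$, as in the text, and that the pairs $(i,j)$ are traversed in lexicographic order; the names `evaluate_system` and `is_common_zero`, the nested-list layout of `coeff`, and the 0-based indexing are illustrative assumptions.

```python
from fractions import Fraction


def evaluate_system(coeff, y):
    """Evaluate all p_k at the point y via explicit partial sums.

    coeff[k][i][j] is the coefficient of x_i * x_j in p_k (0-based,
    hypothetical layout). For each k the dictionary z plays the role
    of Z(k, i, j): the sum of all monomials up to and including (i, j)
    in lexicographic order. Requires len(y) >= 1.
    """
    n = len(y)
    results = []
    for p_k in coeff:
        z = {}
        running = Fraction(0)
        for i in range(n):
            for j in range(n):
                running += Fraction(p_k[i][j]) * y[i] * y[j]
                z[(i, j)] = running
        # The last partial sum is the value p_k(y).
        results.append(z[(n - 1, n - 1)])
    return results


def is_common_zero(coeff, y):
    """True iff y is a common zero of all polynomials in the system."""
    return all(v == 0 for v in evaluate_system(coeff, y))


# p_1 = x_1^2 - x_2^2 and p_2 = x_1*x_2 - x_2^2 (illustrative system).
coeff = [
    [[1, 0], [0, -1]],
    [[0, 1], [0, -1]],
]
print(is_common_zero(coeff, [Fraction(3), Fraction(3)]))
print(evaluate_system(coeff, [Fraction(2), Fraction(1)]))
```

In the existential second-order sentence the table `z` is not computed but guessed, and a first-order formula checks that consecutive entries differ by exactly the next monomial and that the last entry is zero for every $k$.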

A logical characterization of ${\exists{\mathbb{R}}}$ now is possible in almost the same way as for ${\rm NP}_{{\mathbb{R}}}$ in [GraedelMeer].

Theorem 3.1

Let $(F,F^{+})$ be a decision problem of discrete ${\mathbb{R}}$ -structures, i.e., $F$ is a set of such structures, where instance structures are coded as strings in $\{0,1\}^{*}$ as explained above, and $F^{+}\subset F.$ Then $(F,F^{+})\in{\exists{\mathbb{R}}}$ if and only if there is an existential second-order sentence $\psi$ such that $F^{+}=\{{\mathfrak{D}}\in F|{\mathfrak{D}}\models\psi\}.$

When dealing with a logical characterization of BP $({\rm P}_{{\mathbb{R}}}^{0})$ this is not possible, which is why at some places a bit more care for technical details is necessary. Nevertheless, this is at most tedious. The extension of FO ${}_{{\mathbb{R}}}^{0}$ to fixed point logic, necessary in order to capture polynomial time in our setting, was basically introduced already in [GraedelMeer]; here we outline the technical differences due to the lack of arbitrary real constants. The only new technical demand in order to describe ${\rm P}_{{\mathbb{R}}}^{0}$ -computations using a logic on discrete ${\mathbb{R}}$ -structures is to change at certain places discrete data (i.e., objects formalized in the discrete part of an ${\mathbb{R}}$ -structure) to real numbers. Suppose, for example, we use the given order on some $A^{k}$ and express by FO ${}^{0}_{{\mathbb{R}}}$ logic that $\underline{t}\in A^{k}$ is the successor of $\underline{s}$ in this order, i.e., we have a formula saying $\underline{t}=\underline{s}+1.$ We then often want to use the characteristic value $\chi[\underline{t}=\underline{s}+1]$ as a real number. Formally, this can be expressed in our logics as follows: $\underline{t}=\underline{s}+1$ is an abbreviation of a formula on the discrete part of an ${\mathbb{R}}$ -structure. We can introduce a new discrete variable $\sigma$ together with the fo-formula $\sigma\Leftrightarrow\underline{t}=\underline{s}+1.$ Now, $real(\sigma)$ is a number term expressing the truth value of $\underline{t}=\underline{s}+1$ as real (!) number $0$ or $1$ . We express this by writing $real(\chi[\underline{t}=\underline{s}+1]).$ Similar constructions are used below and should be clear from the context.

We need the following definitions, compare [GraedelMeer]. In order to deal with partially defined functions from some $A^{k}\to{\mathbb{R}}$ , enlarge ${\mathbb{R}}$ by an element $undef$ and extend the arithmetic operations via

\begin{array}[]{l}r+undef:=r-undef:=undef,\\ r\cdot undef:=r/undef:=\left\{\begin{array}[]{cc}0&r=0\\ undef&r\neq 0\end{array}\right.\end{array}

and $sgn(undef):=undef.$ For a number term $F(\underline{s},\underline{t})$ with free first order variables $\underline{s},\underline{t}$ , the $\max$ operator $\max\limits_{\underline{s}}F(\underline{s},\underline{t})$ is a number term with free variables $\underline{t}$ and obvious semantics. Below, we use $\max$ in order to cycle through some $A^{k}.$ Typically, we want $\underline{s}$ to be the direct predecessor of $\underline{t},$ then express the characteristic value $\chi[\underline{t}=\underline{s}+1]$ via FO ${}^{0}_{{\mathbb{R}}}$ -logic, take the corresponding number term $real(\chi[\underline{t}=\underline{s}+1])$ as explained above, and then identify $\underline{s}$ via $\max\limits_{\underline{s}}real(\chi[\underline{t}=\underline{s}+1]).$ That way, when a computation is in step $\underline{s}$ we can formalize the next step $\underline{t}.$

In order to define fixed point logic on discrete ${\mathbb{R}}$ -structures ${\mathfrak{D}}=(L_{s},{\cal R},\{real\})$ let $Z$ be a function symbol of arity $r$ interpreted on ${\mathfrak{D}}$ as function from $A^{r}\to{\mathbb{R}}$ and let $F(Z,\underline{x})$ be a number term in FO ${}^{0}_{{\mathbb{R}}}+\max$ logic on signature $(L_{s},\{real,Z\})$ (i.e., the closure of FO ${}^{0}_{{\mathbb{R}}}$ logic under use of the $\max$ operator) with free first order variables $\underline{x}=(x_{1},\ldots,x_{r})\in V^{r}.$

Definition 8

The fixed point $Z^{\infty}$ with respect to $F(Z,\underline{x})$ is a function symbol of arity $r.$ Its interpretation on a discrete ${\mathbb{R}}$ -structure ${\mathfrak{D}}$ is defined iteratively as follows: Set $Z^{(0)}(\underline{x}):=undef$ for all $\underline{x}\in A^{r}$ and for $i\in{\mathbb{N}}$ set

Z^{(i)}(\underline{x}):=\left\{\begin{array}[]{rl}Z^{(i-1)}(\underline{x})&\mbox{if it is defined already}\\ F(Z^{(i-1)},\underline{x})&\mbox{if}\ Z^{(i-1)}(\underline{x})=undef\end{array}.\right.

After at most $j\leq|A|^{r}$ iterations this process becomes saturated, i.e., $Z^{(j)}=Z^{(j+1)}.$ This fixed point then is $Z^{\infty}.$
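The iteration of Definition 8 can be illustrated by a small sketch for arity $r=1$. It assumes that $undef$ is modelled by Python's `None`, that the universe is $A=\{0,1,2\}$, and that the number term $F$ computes prefix sums of a fixed weight list; the names `UNDEF`, `add`, `fixed_point`, `prefix_sum_term` and `weights` are all illustrative assumptions.

```python
UNDEF = None  # stands in for the extra element undef


def add(r, s):
    """Addition extended by r + undef := undef."""
    return UNDEF if r is UNDEF or s is UNDEF else r + s


def fixed_point(F, universe):
    """Iterate Z^(i) as in Definition 8 until Z^(j) = Z^(j+1).

    Returns the saturated table and the number of iterations
    that actually defined a new value.
    """
    z = {x: UNDEF for x in universe}  # Z^(0) is undef everywhere
    steps = 0
    while True:
        nxt = {
            # Keep values that are defined already, otherwise apply F.
            x: (z[x] if z[x] is not UNDEF else F(z, x))
            for x in universe
        }
        if nxt == z:
            return z, steps
        z = nxt
        steps += 1


weights = [3, 1, 4]  # illustrative data


def prefix_sum_term(z, x):
    """Hypothetical number term: Z(0) = w_0, Z(x) = Z(x-1) + w_x."""
    if x == 0:
        return weights[0]
    return add(z[x - 1], weights[x])


table, steps = fixed_point(prefix_sum_term, range(len(weights)))
print(table, steps)
```

Each round defines exactly one further value here, so saturation is reached after $|A|$ rounds, in line with the bound $|A|^{r}$ stated above.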

Definition 9

Functional fixed-point logic ${\rm FFP}_{{\mathbb{R}}}^{0}$ is obtained as closure of the set of first-order number terms under the maximization rule and the fixed-point rule. ${\rm FP}_{{\mathbb{R}}}^{0}$ denotes the class of characteristic functions definable in functional fixed-point logic.

Theorem 3.2

On ranked discrete ${\mathbb{R}}$ -structures the functions in ${\rm FFP}_{{\mathbb{R}}}^{0}$ are exactly those computable in polynomial time in the constant-free BSS model. In particular, the characteristic functions in ${\rm FFP}_{{\mathbb{R}}}^{0}$ are those of polynomial time solvable decision problems in $\{0,1\}^{*},$ i.e. ${\rm FP}_{{\mathbb{R}}}^{0}={\rm BP}({\rm P}_{{\mathbb{R}}}^{0}).$

We finally note that another approach to characterize a certain fragment of ${\exists{\mathbb{R}}}$ by means of a so-called probabilistic independence logic was given in [Hannula].

4 A Ladner like theorem for NP and ${\exists{\mathbb{R}}}$

The next classical result which can be transferred relatively straightforwardly to ${\exists{\mathbb{R}}}$ is Ladner’s theorem [Ladner], guaranteeing the existence of intermediate problems between P and the NP-complete ones. The range of the classical proof technique has been analyzed in [Schoening]. In connection with the BSS model of computation over uncountable structures the theorem has been shown in [MalajovichMeer] to hold as well over the complex numbers. Over ${\mathbb{R}}$ it is open, but [BMM, ChapuisKoiran, Meer2012] give partial results. In particular, [BMM] addresses the importance of quantifier elimination (QE) algorithms when constructing intermediate problems. In the case of separating the class of ${\exists{\mathbb{R}}}$ -complete problems from NP under the assumption ${\exists{\mathbb{R}}}\neq{\rm NP}$ a combination of the arguments from [BMM, MalajovichMeer] works as well.

Theorem 4.1

Suppose ${\exists{\mathbb{R}}}\neq{\rm NP}.$ Then, there exists a problem in ${\exists{\mathbb{R}}}\setminus{\rm NP}$ which is not ${\exists{\mathbb{R}}}$ -complete.

For the rest of this section we outline the proof idea of how to construct such a problem. Technical details can then easily be filled in following the presentations in [BMM, MalajovichMeer]. Our construction starts from the ${\exists{\mathbb{R}}}$ -complete problem ${\rm QPS}^{0}$ of Definition 1. As mentioned before, the completeness proof of the corresponding problem for the class ${\rm NP}_{{\mathbb{R}}}$ in [BSS] shows as well completeness of ${\rm QPS}^{0}$ in ${\exists{\mathbb{R}}}$ : given an ${\rm NP}_{{\mathbb{R}}}^{0}$ -machine, the reduction to the full $\rm QPS$ problem, i.e., where polynomials are allowed to have real coefficients as well, does not introduce real constants and runs in polynomial time also in the Turing model; intermediate results can be subsumed under real existential quantification. (Footnote 2: Note that guessing a real solution might lead to real intermediate results; they can be existentially quantified in ${\exists{\mathbb{R}}}$ in the reduction proof.)

The main reasons why a proof of Theorem 4.1 can be done similarly to showing related results are the decidability of ${\exists{\mathbb{R}}}$ within a computable time bound in the Turing model due to real quantifier elimination, and effective countability of NP-machines together with polynomial time bounds. Starting from ${\rm QPS}^{0}$ we build certain restricted languages $L(a)$ depending on strictly increasing sequences $(a):=(a_{i})_{i\in{\mathbb{N}}}$ of natural numbers. For input dimensions $n\in\{a_{2i-1},\ldots,a_{2i}-1\}$ we let $L(a)$ be the ${\rm QPS}^{0}$ problem, for the remaining input dimensions $L(a)$ equals the empty set. The strategy now is to define two such sequences $(a),(b)\in{\mathbb{N}}^{{\mathbb{N}}}$ so that both have the desired properties. We need two sequences in order to guarantee non-completeness.

Definition 10 (cf. [MalajovichMeer])

Let $(a),(b)\in{\mathbb{N}}^{{\mathbb{N}}}$ be two sequences of natural numbers.

a) The sequences have an exponential gap iff both are strictly increasing and for all $i\in{\mathbb{N}}$ satisfy $a_{2i+1}\geq 2^{b_{2i}}$ as well as $b_{2i+1}\geq 2^{a_{2i+2}},$ i.e., $a_{1}<a_{2}<2^{a_{2}}\leq b_{1}<b_{2}<2^{b_{2}}\leq a_{3}\ldots.$

b) The decision problem $L(a)$ (and $L(b)$ similarly) is defined dimension-wise for inputs of size $n$ as follows:

L(a)\cap\{0,1\}^{n}:=\left\{\begin{array}[]{cl}{\rm QPS}^{0}\cap\{0,1\}^{n}&\mbox{if}\ \exists i\in{\mathbb{N}}\ \mbox{s.t.}\ a_{2i-1}\leq n<a_{2i}\\ \emptyset&\mbox{otherwise}\end{array}.\right.

Since ${\exists{\mathbb{R}}}$ -completeness is defined downward from ${\rm QPS}^{0}$ by usual polynomial time reductions, it is easy to see that assuming ${\exists{\mathbb{R}}}\neq{\rm NP}$ neither $L(a)$ nor $L(b)$ can be ${\exists{\mathbb{R}}}$ -complete if $(a)$ and $(b)$ have an exponential gap.
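Definition 10 can be made concrete on finite prefixes of the sequences. The following sketch assumes that $(a)$ and $(b)$ are given as Python lists holding $a_{1},a_{2},\ldots$ and $b_{1},b_{2},\ldots$, and that the gap condition is checked along the displayed chain $a_{1}<a_{2}<2^{a_{2}}\leq b_{1}<b_{2}<2^{b_{2}}\leq a_{3}\ldots$ as far as the prefixes reach. The function names and the callback `decide_qps0` are hypothetical; the stub used below is not a decision procedure for ${\rm QPS}^{0}$.

```python
def in_active_range(n, a):
    """True iff a_{2i-1} <= n < a_{2i} for some i within the prefix a."""
    for idx in range(0, len(a) - 1, 2):  # a[idx] is a_{2i-1}, a[idx+1] is a_{2i}
        if a[idx] <= n < a[idx + 1]:
            return True
    return False


def has_exponential_gap(a, b):
    """Check the chain of Definition 10 a) on finite prefixes."""
    increasing = all(
        s[i] < s[i + 1] for s in (a, b) for i in range(len(s) - 1)
    )
    # 2^{a_{2i}} <= b_{2i-1}, whenever both entries exist.
    a_to_b = all(
        2 ** a[2 * i - 1] <= b[2 * i - 2]
        for i in range(1, len(a) // 2 + 1)
        if 2 * i - 2 < len(b)
    )
    # 2^{b_{2i}} <= a_{2i+1}, whenever both entries exist.
    b_to_a = all(
        2 ** b[2 * i - 1] <= a[2 * i]
        for i in range(1, len(b) // 2 + 1)
        if 2 * i < len(a)
    )
    return increasing and a_to_b and b_to_a


def in_L(x, a, decide_qps0):
    """Membership in L(a): QPS^0 on active input sizes, empty otherwise."""
    return in_active_range(len(x), a) and decide_qps0(x)


a = [1, 2, 32, 33]  # illustrative prefixes
b = [4, 5]
print(has_exponential_gap(a, b))
print([n for n in range(1, 40) if in_active_range(n, a)])
```

With these prefixes $L(a)$ is non-trivial only for input sizes $1$ and $32$, while $L(b)$ is non-trivial only for size $4$, so the active ranges of the two languages are exponentially far apart.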

Lemma 1

Suppose ${\exists{\mathbb{R}}}\neq{\rm NP}$ and let sequences $(a),(b)$ have an exponential gap. Suppose furthermore that $L(a),L(b)$ belong to ${\exists{\mathbb{R}}}\setminus{\rm NP}.$ Then both are not ${\exists{\mathbb{R}}}$ -complete.
