
Simon Fraser University
ter@sfu.ca

Promise Algebra:
An Algebraic Model of Non-Deterministic Computations

Eugenia Ternovska
Abstract

Our goal is to define an algebraic language for reasoning about non-deterministic computations. Towards this goal, we introduce an algebra of string-to-string transductions. Specifically, it is an algebra of partial functions on words over the alphabet of relational $\tau$-structures over the same domain. The algebra has a two-level syntax, and thus two parameters to control its expressive power. The top level defines algebraic expressions, and the bottom level specifies atomic transitions. History-dependent Choice functions resolve atomic non-determinism and make general relations functional. Equivalence classes of such functions serve as certificates for computational problems specified by algebraic terms. The algebra has an equivalent syntax in the form of a Dynamic Logic, where terms describing computational processes or programs appear inside the modalities.

We define a simple secondary logic for representing atomic transitions, which is a modification of conjunctive queries. With this logic, the algebra can represent both reachability and counting examples, which is not possible in Datalog. We analyze the data complexity of this logic, measured in the size of the input structure, and show that a restricted fragment of the logic captures the complexity class NP.

The logic can be viewed as a database query language, where atomic propagations are separated from control.

Organization We start by defining the syntax and semantics of the algebra. In Section 0.1, we present the syntax of the algebra, and in Section 0.2, its semantics. In Section 0.3, we reformulate the algebra as a modal Dynamic Logic. The logic allows for complex nested tests and has an iterator construct. The main programming constructs are definable in the logic. The formulae have a free function variable ranging over Choice functions, which gives us an implicit existential quantifier over Choice functions. Due to the presence of a form of negation, an implicit universal quantification over such functions is also present. Moreover, in complex nested tests, such implicit (existential, universal) quantifiers can alternate.

In Section 0.4, we define our main computational task, formulated in the Dynamic Logic. We then define the notion of a computational problem specified by an algebraic term. A certificate for such a problem is an equivalence class of Choice functions, also called a witness or a promise. We show that a Boolean algebra of promises is embedded into the Dynamic Logic, and its underlying set has a forest structure, where the forest has a tree for each term. In Section 0.5, we show that the truth of certain (conditional) equalities between algebraic terms can be used to indicate the existence, or non-existence, of a “yes” certificate for a problem specified by an algebraic term. In Section 0.6, we explain how a secondary logic can be defined. In particular, we formulate the Law of Inertia, and define a specific logic for specifying atomic transitions. The logic is based on a modification of unary conjunctive queries. We give examples in Section 0.7, and study the complexity of query evaluation in Section 0.9. We conclude, in Section 0.11, with a summary and future research directions. Related work is mentioned throughout the paper.

We assume familiarity with the basic notions of first-order (FO) and second-order (SO) logic (see, e.g., [18]), and use '$:=$' to mean "denotes" or "is by definition".

0.1 Algebra: Syntax

In this section, we define the syntax of our algebra, and introduce some initial intuitions regarding the meaning of its operations. The purpose of the algebra is to talk about Choice functions of a certain kind. The Choice functions (or, rather, their equivalence classes) will later be considered as certificates for computational problems specified by algebraic terms, or promises. If such a promise is given for a term (intuitively, a non-deterministic program), the computation is guaranteed to proceed successfully. With this intuition in mind, we call this algebra a Promise Algebra. A promise is like an elephant in a room – it is there, but is never mentioned explicitly. (We use a free function variable $\varepsilon$ to denote the presence of this elephant, but no concrete certificate is present in the language.)

Formally, the algebra is an algebra of (functional) binary relations on strings of relational structures. It has a two-level syntax. The top level (defined in this section) specifies algebraic expressions. The algebraic operations are dynamic counterparts of classical logic connectives – negation, conjunction, disjunction – and iteration (the Kleene star, or reflexive transitive closure). Unlike classical connectives, these operations are function-preserving in the sense that if atomic elements are functional binary relations, then so are all algebraic expressions built up from them. The bottom level of the formalism specifies atomic expressions (intuitively, actions) in a separate logic (explained later).

We now remind the reader of the notion of a (relational) structure. Let $\tau$ be a relational vocabulary, which is finite (but of unlimited size). Let $\tau := \{S_{1},\dots,S_{n}\}$, where each $S_{i}$ has an associated arity $r_{i}$, and let $A$ be a non-empty set. A $\tau$-structure $\mathfrak{A}$ over domain $\operatorname{dom}(\mathfrak{A}) := A$ is $\mathfrak{A} := (A;\ S_{1}^{\mathfrak{A}},\dots,S_{n}^{\mathfrak{A}})$, where $S_{i}^{\mathfrak{A}}$ is an $r_{i}$-ary relation called the interpretation of $S_{i}$. In this paper, all structures are finite. If $\mathfrak{A}$ is a $\tau$-structure, $\mathfrak{A}|_{\sigma}$ is its restriction to a sub-vocabulary $\sigma$. We now fix a relational vocabulary $\tau$, and assume it is partitioned into "inputs" (or EDB relations, in database terminology) and unary "register" symbols (EDB is a common term in Database theory; it stands for Extensional Database, which is a relational structure):

$\tau\ :=\ \tau_{\rm EDB}\uplus\tau_{\rm reg}.$ (1)

The details and the formal requirements on these two vocabularies will become clear when a particular logic of the bottom level is explained later in the paper. We only mention now that, intuitively, the interpretations of EDB (or input) relations never change, while the interpretations of the registers are updated by applications of atomic modules.

Let a set $\mathcal{M}$ of atomic module symbols, denoting non-deterministic atomic actions, be fixed. Intuitively, the actions are atomic updates of relational structures. The atomic updates are combined into complex updates (formally, algebraic expressions, or terms) using algebraic operations. The set $\it Terms$ of well-formed terms of the algebra is defined as:

$t::=m(\varepsilon)\mid{\rm id}\mid\mathord{\curvearrowright}t\mid t\mathbin{;}t\mid t\sqcup t\mid t^{\uparrow}\mid P=Q\mid\textbf{BG}(P_{now}\neq Q),$ (2)

where $m\in\mathcal{M}$ and $\varepsilon$ is a free function variable ranging over Choice functions on the set of all relational $\tau$-structures over the same finite domain. The syntax allows only one such (free) variable per algebraic expression. (Syntactically, the variable $\varepsilon$, which occurs free in the algebraic expressions, is not really necessary. We use it to emphasize the existence of a certificate, a concrete Choice function. It will be convenient when we formalize a computational problem specified by an algebraic term.) We require that $P,Q\in\tau_{\rm reg}$. The subscript "$now$" in $P_{now}$ is not part of the syntax. It is added to remind the reader that the content of the "register" $P$ in the current position must be different from the values in $Q$ ever before (denoted '$\textbf{BG}$', for Back Globally).

In the table below, we list the variables first, and then the partial functions of the algebra, which are split into three groups: nullary (constant), unary and binary partial mappings. Unary functions take one other function as an argument, binary functions take two. The column in the middle represents the range of the variable or the type of the corresponding partial function. Our notations are: ${\bf U}$ is the set of all $\tau$-structures over the same domain, ${\sf CH}({\bf U},\mathcal{M})$ is the set of Choice functions, to be defined later in the paper, and $\mathcal{F}$ denotes partial functions on ${\bf U}^{+}$ (non-empty strings over the alphabet ${\bf U}$).

$x$                                  ${\bf U}^{+}$                        string variable
$\varepsilon$                        ${\sf CH}({\bf U},\mathcal{M})$      Choice function variable
$f,g$                                $\mathcal{F}$                        partial function variables on ${\bf U}^{+}$
${\rm id}(x)$                        $\mathcal{F}$                        Identity on ${\bf U}^{+}$
$m(h/\varepsilon)(x)$                $\mathcal{F}$                        Module, for all $m\in\mathcal{M}$, $h\in{\sf CH}({\bf U},\mathcal{M})$
$\textbf{BG}(P\neq Q)(x)$            $\mathcal{F}$                        Back Globally Non-Equal
$(P=Q)(x)$                           $\mathcal{F}$                        Equal
$\mathord{\curvearrowright}f(x)$     $\mathcal{F}\to\mathcal{F}$          Anti-Domain (Unary Negation)
$f^{\uparrow}(x)$                    $\mathcal{F}\to\mathcal{F}$          Maximum Iterate
$(f\sqcup g)(x)$                     $\mathcal{F}^{2}\to\mathcal{F}$      Preferential Union
$(f\mathbin{;}g)(x)$                 $\mathcal{F}^{2}\to\mathcal{F}$      Sequential Composition

Intuitively, partial mappings in $\mathcal{F}$ on ${\bf U}^{+}$ represent programs. Applying such a mapping to a string of length one corresponds to applying a program to an input structure, e.g., a graph in the 3-Colourability problem. The set $\mathcal{F}$ of partial functions on ${\bf U}^{+}$ contains all functional constant (i.e., nullary) operations (the rows marked with $\mathcal{F}$), and is closed under the unary (marked $\mathcal{F}\to\mathcal{F}$) and binary (marked $\mathcal{F}^{2}\to\mathcal{F}$) operations.

Nullary (constant mappings on strings in ${\bf U}^{+}$): ${\rm id}\in\mathcal{F}$; $m(h/\varepsilon)\in\mathcal{F}$ for all $m$ in $\mathcal{M}$ and $h\in{\sf CH}({\bf U},\mathcal{M})$; and $\{(P=Q),\textbf{BG}(P\neq Q)\}\subset\mathcal{F}$, for all $P,Q\in\tau_{\rm reg}$. Here, ${\rm id}$ is the Identity function (Diagonal relation) on strings in ${\bf U}^{+}$. Each $m(h/\varepsilon)$, $m\in\mathcal{M}$, is an atomic module (action), a binary relation that, intuitively, updates some of the registers in $\tau_{\rm reg}$ non-deterministically, i.e., multiple outcomes of an action $m(\varepsilon)$ are possible. With an instantiation of a concrete Choice function $h$ in $m(h/\varepsilon)$, the relation $m(\varepsilon)$ becomes a partial function in $\mathcal{F}$. Back Globally Non-Equal, denoted $\textbf{BG}(P\neq Q)(x)$, checks whether the value stored in register $Q$ at any point earlier in string $x$ is different from the value in $P$ now. The equality check $(P=Q)$ compares the current interpretations of $P$ and $Q$. Thus, access to domain elements is possible only via atomic updates (modules in $\mathcal{M}$ specified in a secondary logic) and (in)equality checks.

Unary $\textbf{op}_{1}:\mathcal{F}\to\mathcal{F}$ (partial mappings on strings in ${\bf U}^{+}$ that depend on one partial function): $\textbf{op}_{1}\in\{\mathord{\curvearrowright},\ ^{\uparrow}\}$. Each operation $\textbf{op}_{1}$ in this set takes a function in $\mathcal{F}$ and modifies it according to the semantics of $\textbf{op}_{1}$. The modified operation is applied to strings in ${\bf U}^{+}$. Anti-Domain $\mathord{\curvearrowright}f(x)$ checks if there is no outgoing $f$-transition from the state represented by the string $x$.

Maximum Iterate $f^{\uparrow}(x)$ is a "determinization" of the Kleene star $f^{*}(x)$ (reflexive transitive closure). It outputs only the longest, in terms of the number of $f$-steps, transition out of all possible transitions from the same state produced by the Kleene star. (For readers familiar with the $\mu$ operator, we mention that $f^{\uparrow} := \mu Z.(\mathord{\curvearrowright}f\cup f\mathbin{;}Z)$, although these constructs are not in our language.)

Binary $\textbf{op}_{2}:\mathcal{F}^{2}\to\mathcal{F}$ (partial mappings on strings in ${\bf U}^{+}$ that depend on two functions): $\textbf{op}_{2}\in\{\mathbin{;},\sqcup\}$. These operations take two functions as arguments and combine them to obtain a (partial) mapping on ${\bf U}^{+}$. Sequential Composition $(f\mathbin{;}g)(x)$ is the standard function composition $g(f(x))$. Preferential Union $(f\sqcup g)(x)$ applies $f$ and returns its result; but, if $f$ is not defined, it applies $g$.

We will routinely omit the string variable $x$ in algebraic terms and work entirely with partial mappings in $\mathcal{F}$ on ${\bf U}^{+}$. This is a typical notational convention in algebraic settings such as groups, algebras of functions and binary relations, etc.
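To make the functional reading of these operations concrete, here is a minimal illustrative sketch (in Python; it is not part of the formalism) of partial mappings on strings in ${\bf U}^{+}$ and of Anti-Domain, Sequential Composition, Preferential Union and Maximum Iterate. The names State, Word and PFun are our own stand-ins, and the "no repeated state" side condition on the iterate is omitted here.

from typing import Callable, Optional, Tuple

State = str                                    # a stand-in for a tau-structure in U
Word = Tuple[State, ...]                       # a string in U^+ (a history)
PFun = Callable[[Word], Optional[Word]]        # a partial mapping in F (None = undefined)

def identity(x: Word) -> Optional[Word]:
    return x                                   # id: the diagonal relation on U^+

def anti_domain(f: PFun) -> PFun:
    # (anti-domain of f)(x) is x when f is undefined at x, and undefined otherwise
    return lambda x: x if f(x) is None else None

def compose(f: PFun, g: PFun) -> PFun:
    # (f ; g)(x) = g(f(x)); undefined if either step is undefined
    def h(x: Word) -> Optional[Word]:
        y = f(x)
        return None if y is None else g(y)
    return h

def pref_union(f: PFun, g: PFun) -> PFun:
    # preferential union: return f's result; fall back to g only when f is undefined
    def h(x: Word) -> Optional[Word]:
        y = f(x)
        return y if y is not None else g(x)
    return h

def max_iterate(f: PFun) -> PFun:
    # maximum iterate: take f-steps as long as f is defined (the longest run);
    # the paper's extra "no repeated state" condition is omitted in this sketch
    def h(x: Word) -> Optional[Word]:
        cur = x
        while (nxt := f(cur)) is not None:
            cur = nxt
        return cur
    return h

Note that every combinator returns another partial mapping, which mirrors the function-preserving nature of the algebraic operations.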

0.2 Semantics

0.2.1 Atomic State-to-State Transitions

The semantics of the algebra is parameterized by an underlying transition system that specifies the structure-to-structure transitions associated with each atomic module symbol. Formally, we have a transition relation $\textbf{Tr}\llbracket{\cdot}\rrbracket$ that maps each atomic module symbol in $\mathcal{M}$ to a binary relation on ${\bf U}$ (the set of all relational $\tau$-structures over the same domain):

$\textbf{Tr}\llbracket{\cdot}\rrbracket:\mathcal{M}\to 2^{{\bf U}\times{\bf U}}.$

In general, the relation $\textbf{Tr}\llbracket{\cdot}\rrbracket$ is specified in any (secondary) logic that constitutes the bottom level of the algebra. In Section 0.6, we give a specific example of such a logic. For now, it is sufficient to think of $\textbf{Tr}\llbracket{\cdot}\rrbracket$ as assigning an arbitrary binary relation on structures in ${\bf U}$ to each module symbol. The relation $\textbf{Tr}\llbracket{m(\varepsilon)}\rrbracket\subseteq{\bf U}\times{\bf U}$, for $m\in\mathcal{M}$, is not necessarily functional. In this sense, atomic transitions are non-deterministic. However, specific Choice functions (to be defined shortly), which are given semantically only, resolve atomic non-determinism while taking the history into account. (Notice that concrete Choice functions are not part of the syntax. The syntax, summarized in (2) and detailed in the table, has only a Choice function variable $\varepsilon$. The terms impose constraints on possible Choice functions.) To explain this formally, we first give a definition of a tree.

A tree over alphabet ${\bf U}$ is a (finite or infinite) nonempty set ${\cal T}\subseteq{\bf U}^{*}$ such that for all $x\cdot c\in{\cal T}$, with $x\in{\bf U}^{*}$ and $c\in{\bf U}$, we have $x\in{\cal T}$. The elements of ${\cal T}$ are called nodes, and the empty word $\mathbf{e}$ is the root of ${\cal T}$. For every $x\in{\cal T}$, the nodes $x\cdot c\in{\cal T}$, where $c\in{\bf U}$, are the children of $x$. A node with no children is a leaf.

Figure 1: A tree over ${\bf U}$. Let $m_{17}\in\mathcal{M}$ and $(\mathfrak{A},\mathfrak{B})\in\textbf{Tr}\llbracket{m_{17}(\varepsilon)}\rrbracket$, $(\mathfrak{B},\mathfrak{A})\in\textbf{Tr}\llbracket{m_{17}(\varepsilon)}\rrbracket$, $(\mathfrak{B},\mathfrak{B})\in\textbf{Tr}\llbracket{m_{17}(\varepsilon)}\rrbracket$. A concrete Choice function $h:\mathcal{M}\to({\bf U}^{+}\rightharpoonup{\bf U}^{+})$ maps $m_{17}(\varepsilon)$ to a set of mappings containing, for example, $\mathfrak{A}\cdot\mathfrak{B}\cdot\mathfrak{B}\mapsto\mathfrak{A}\cdot\mathfrak{B}\cdot\mathfrak{B}\cdot\mathfrak{A}$, and other such mappings shown by the thicker edges in the tree. The term $t := m_{17}\mathbin{;}m_{17}\mathbin{;}m_{17}(h/\varepsilon)$, with this specific $h$, makes transitions along the branch in this tree shown by the thicker edges. There is a one-to-one correspondence between that branch and the string shown by the shaded area, which is a trace of $t$ from the input structure $\mathfrak{A}$ that corresponds to $h$.

Notations for Strings and Partial Functions We use $\mathbf{e}$ to denote the empty string; $s(i)$, for $i\geq 1$, to denote the $i$-th letter in string $s$; and $s_{i}$ to denote the prefix of string $s$ ending in position $i$. In particular, $s_{0}=\mathbf{e}$. We use the following notation for partial functions: $A\rightharpoonup B := \bigcup_{C\subseteq A}(C\to B)$.

0.2.2 Choice Functions

Each prefix sis_{i} of a string ss represents a history, i.e., a sequence of states – elements of 𝐔{\bf U}. For each history sis_{i}, Choice function hh selects one possible outcome of an atomic module m(ε)m(\varepsilon), out of those possible, i.e., those in its interpretation Trm(ε)\textbf{Tr}\llbracket{m(\varepsilon)}\rrbracket, see Figure 1. It decides m(ε)m(\varepsilon)’s outcome at sis_{i}, if such an outcome is possible according to the transition system Tr\textbf{Tr}\llbracket{\cdot}\rrbracket. The outcome is history-dependent, that is, for the same module m(ε)m(\varepsilon), we may have different outcomes at different time points sis_{i} and sjs_{j}, iji\neq j. We now present this intuition formally.

Definition 1.

Let $\mathcal{M}$ be a finite set of module symbols, and ${\bf U}$ be an alphabet. A Choice function is a function $h:\mathcal{M}\to({\bf U}^{+}\rightharpoonup{\bf U}^{+})$ such that

  • $h(m(\varepsilon))=\{(s_{i}\mapsto s_{i+1})\mid i\geq 1$ and $(s(i),s(i+1))\in\textbf{Tr}\llbracket{m(\varepsilon)}\rrbracket$, or $i=0$ and there exists $s(i+2)$ such that $(s(i+1),s(i+2))\in\textbf{Tr}\llbracket{m(\varepsilon)}\rrbracket\}$.

Note that $h$ outputs partial functions ${\bf U}^{+}\rightharpoonup{\bf U}^{+}$, so it resolves atomic non-determinism. Observe that, for the same module, different choices can be made at different time points because we merely require the existence of a transition in $\textbf{Tr}\llbracket{m(\varepsilon)}\rrbracket$.

The second case, $i=0$, is included so that we can have one tree (cf. Figure 1) per each term, with branches that correspond to different Choice functions, including the choice of allowable input structures. More specifically, when $i=0$, we have $s_{i}=\mathbf{e}$. According to the definition above, if $(\mathfrak{A},\mathfrak{B})\in\textbf{Tr}\llbracket{m(\varepsilon)}\rrbracket$, then $(\mathbf{e}\mapsto\mathfrak{A})\in h(m(\varepsilon))$, where $\mathfrak{A}$ is a possible input structure; see Figure 1. Note that $s(i)=s(i+1)$ is allowed: a "no-change" transition extends the string $s_{i}$ by repeating the same letter in ${\bf U}$.

The set of all Choice functions is denoted 𝖢𝖧(𝐔,){\sf CH}({\bf U},\mathcal{M}).

To summarize, a concrete Choice function, e.g., $h$, uses the transition relation $\textbf{Tr}$ to assign semantic meaning to an atomic expression in $\mathcal{M}$ as a set of string-to-string transductions, that is, as a (partial) function on ${\bf U}^{+}$. Each $m(\varepsilon)$ corresponds to some non-deterministic transitions in a transition system $\textbf{Tr}$. A concrete Choice function, as a semantic instantiation of $\varepsilon$, resolves this non-determinism by taking the history into account. It extends the history (a string in ${\bf U}^{+}$) by one letter in the alphabet ${\bf U}$ that corresponds to one of the possible outcomes of $m(\varepsilon)$ according to $\textbf{Tr}$. Intuitively, at each time step, $h$ resolves atomic non-determinism while taking the history, which starts with an input structure, into account. Thus, at different time points, there could be different outcomes.
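The following toy sketch (with an assumed transition relation matching Figure 1; the names Tr and choice are ours, not the paper's) illustrates how one concrete, history-dependent Choice function resolves atomic non-determinism by extending a history with a single chosen successor, and how the empty history is mapped to an allowable input structure.

from typing import Dict, Optional, Set, Tuple

State = str
Word = Tuple[State, ...]

# A toy transition relation Tr for a module "m17", as in Figure 1
Tr: Dict[str, Set[Tuple[State, State]]] = {
    "m17": {("A", "B"), ("B", "A"), ("B", "B")},
}

def choice(module: str, history: Word) -> Optional[Word]:
    # One concrete history-dependent Choice function h: it extends a history
    # by a single chosen successor of its last state, or is undefined.
    edges = Tr[module]
    if not history:
        # i = 0: pick an allowable input structure, i.e. a state with an
        # outgoing module-transition (cf. Definition 1 and Figure 1)
        starts = sorted(a for (a, _) in edges)
        return (starts[0],) if starts else None
    last = history[-1]
    succs = sorted(b for (a, b) in edges if a == last)
    if not succs:
        return None              # the module is undefined at this history
    # history-dependence: the same module may pick different successors at
    # different time points; here the pick varies with the history length
    return history + (succs[len(history) % len(succs)],)

# Example: choice("m17", ("A",)) extends the history ("A",) to ("A", "B").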

Figure 2: Choice function $h$ makes a specific selection $(s\mathfrak{B}\mapsto s\mathfrak{B}\mathfrak{A})$, provided that $(\mathfrak{B},\mathfrak{A})\in\textbf{Tr}\llbracket{m(\varepsilon)}\rrbracket$, out of all available possibilities $(s_{i}\mapsto s_{i+1})$ according to $\textbf{Tr}\llbracket{m(\varepsilon)}\rrbracket$. In set-theoretic terms, it has the standard meaning of picking an element out of a set of elements. The elements, in our case, are mappings on strings.

Each algebraic term imposes constraints on allowable Choice functions, just as any first-order formula imposes constraints on possible instantiations of its free variables. The formal definition of the extension of Choice functions to all terms is given next. Recall that only one free variable ε\varepsilon is allowed per term.

0.2.3 Extension of Choice Functions to All Terms

So far, Choice functions were defined on the domain \mathcal{M} of atomic module symbols. They are now extended to operate on all terms, to define their semantics. Formally,

$\bar{h}:\it Terms\to({\bf U}^{+}\rightharpoonup{\bf U}^{+}).$ (3)

The extension h¯\bar{h} of hh provides semantics to all algebraic terms as string-to-string transductions.

  1. $\bar{h}(m(\varepsilon))=h(m(\varepsilon))$ for $m\in\mathcal{M}$.

  2. $\bar{h}({\rm id})=\{(s_{i}\mapsto s_{i})\mid i\in\mathbb{N}\}$.

  3. $\bar{h}(\mathord{\curvearrowright}t)=\{(s_{i}\mapsto s_{i})\mid$ there is no string $s^{\prime}$, Choice function $h^{\prime}$ and $k\geq i$ such that $(s_{i}\mapsto s_{k}^{\prime})\in\bar{h}^{\prime}(t)$, where $s^{\prime}_{i}=s_{i}\}$.

  4. $\bar{h}(t\mathbin{;}t^{\prime})=\{(s_{i}\mapsto s_{j})\mid$ there exists $l$, with $i\leq l\leq j$, such that $(s_{i}\mapsto s_{l})\in\bar{h}(t)$ and $(s_{l}\mapsto s_{j})\in\bar{h}(t^{\prime})\}$.

  5. $\bar{h}(t\sqcup t^{\prime})=\{(s_{i}\mapsto s_{j})\mid$ there exists $\bar{h}^{\prime}$ such that $(s_{i}\mapsto s_{j})\in\bar{h}^{\prime}(t)$ and $\bar{h}(t\sqcup t^{\prime})=\bar{h}^{\prime}(t)$, or there is no $\bar{h}^{\prime}$ such that $(s_{i}\mapsto s_{k})\in\bar{h}^{\prime}(t)$ for any $k$ and there exists $\bar{h}^{\prime\prime}$ such that $\bar{h}(t\sqcup t^{\prime})=\bar{h}^{\prime\prime}(t^{\prime})\}$.

  6. $\bar{h}(t^{\uparrow})=\{(s_{i}\mapsto s_{j})\mid$ $i=j$ and $(s_{i}\mapsto s_{j})\in\bar{h}(\mathord{\curvearrowright}t)$, or $i<j$ and

     (1) there exists $l$ with $i\leq l\leq j$ such that $(s_{i}\mapsto s_{l})\in\bar{h}(t^{\uparrow})$ and $(s_{l}\mapsto s_{j})\in\bar{h}(t)$, and

     (2) there is no $k$ with $l<k\leq j$ and $s(l)=s(k)\}$. Thus, the interpretation of $t^{\uparrow}$ should not induce loops in the transition graph given by the binary relation $\textbf{Tr}\llbracket{\cdot}\rrbracket$. (This is similar to how the transitive closure of a graph is defined. The graph, in our case, is given by the transition system $\textbf{Tr}\llbracket{\cdot}\rrbracket$. In other applications, this condition may be omitted.)

  7. $\bar{h}(P=Q)=\{(s_{i}\mapsto s_{i})\mid Q^{s(i)}=P^{s(i)}\}$.

  8. $\bar{h}(\textbf{BG}(P\neq Q))=\{(s_{i}\mapsto s_{i})\mid$ there is no $1\leq l<i$ such that $Q^{s(l)}=P^{s(i)}\}$.

     Observe that Back Globally ($\textbf{BG}(P\neq Q)$) may "look back" through the entire string, not just the computation of a specific sub-term. (Note that the intended use of the algebra is to evaluate algebraic expressions with respect to an input structure, that is, a one-letter string. In that use, the BG construct never looks into the "pre-history", since all strings of length greater than one are traces of some processes. An alternative definition of $\textbf{BG}(P\neq Q)$ is possible, where the scope of the condition is specified explicitly, e.g., by adding parentheses. But it is not needed, since the scope can be controlled by a clever use of fresh "registers", i.e., unary predicate symbols in $\tau_{\rm reg}$.)

Thus, the semantics associates, with each algebraic expression tt, a functional binary relation (i.e., a partial mapping) h¯(t)\bar{h}(t) on strings of relational structures, that depends on a specific Choice function hh.

Example 1.

Consider the term $m_{1}\mathbin{;}m_{2}(\varepsilon)$ and a structure $\mathfrak{A}$. Suppose $h$ is such that

$(\mathfrak{A}\mapsto\mathfrak{A}\cdot\mathfrak{B})\in h(m_{1}(\varepsilon)),$
$(\mathfrak{A}\cdot\mathfrak{B}\mapsto\mathfrak{A}\cdot\mathfrak{B}\cdot\mathfrak{C})\in h(m_{2}(\varepsilon)).$

Then $(m_{1}(\varepsilon)\mathbin{;}m_{2}(\varepsilon))^{\textbf{Tr}}(h/\varepsilon)(\mathbf{e})=\mathfrak{A}\cdot\mathfrak{B}\cdot\mathfrak{C}$. The same word can also be obtained using infinitely many different terms, e.g., $(m_{1}\mathbin{;}\mathord{\curvearrowright}\mathord{\curvearrowright}{\rm id}\mathbin{;}m_{2}\mathbin{;}{\rm id}(\varepsilon))^{\textbf{Tr}}(h/\varepsilon)(\mathbf{e})=\mathfrak{A}\cdot\mathfrak{B}\cdot\mathfrak{C}$, and, similarly, $(\mathord{\curvearrowright}\mathord{\curvearrowright}\mathord{\curvearrowright}\mathord{\curvearrowright}m_{1}\mathbin{;}m_{1}\mathbin{;}\mathord{\curvearrowright}\mathord{\curvearrowright}m_{2}\mathbin{;}m_{2}(\varepsilon))^{\textbf{Tr}}(h/\varepsilon)(\mathbf{e})=\mathfrak{A}\cdot\mathfrak{B}\cdot\mathfrak{C}$.

Maximum Iterate vs the Kleene Star

To give a comparison of Maximum Iterate with the Kleene star (i.e., the reflexive transitive closure, a construct known from Regular Languages), we show how to define the two operators side-by-side, inductively.

Figure 3: Maximum Iterate: $t^{\uparrow}_{0}:=\mathord{\curvearrowright}t$,    $t^{\uparrow}_{n+1}:=t^{\uparrow}_{n}\mathbin{;}t$,    $t^{\uparrow}:=\bigcup_{n\in\mathbb{N}}t^{\uparrow}_{n}$. It is a deterministic operator (a partial function).

Figure 4: The Kleene Star: $t^{*}_{0}:={\rm id}$,    $t^{*}_{n+1}:=t\mathbin{;}t^{*}_{n}$,    $t^{*}:=\bigcup_{n\in\mathbb{N}}t^{*}_{n}$. It is a non-deterministic operator (not a function).
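As an informal illustration of the contrast (a sketch under our own assumptions, not the paper's definitions verbatim), the code below computes the set of results of iterating a non-deterministic step, versus the single output of Maximum Iterate applied to a partial-function step. It assumes the set of reachable strings is finite and again omits the "no repeated state" condition.

from typing import Callable, Optional, Set, Tuple

State = str
Word = Tuple[State, ...]

def kleene_star(step: Callable[[Word], Set[Word]], x: Word) -> Set[Word]:
    # t*: zero or more applications of a possibly non-deterministic step;
    # collects every reachable string (assumes this set is finite)
    results: Set[Word] = {x}
    frontier: Set[Word] = {x}
    while frontier:
        frontier = {y for w in frontier for y in step(w)} - results
        results |= frontier
    return results

def max_iterate(step: Callable[[Word], Optional[Word]], x: Word) -> Word:
    # t^up: keep stepping while the (already deterministic) step is defined;
    # returns only the longest run, so the result is again a single string
    while (nxt := step(x)) is not None:
        x = nxt
    return x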

0.2.4 Terms as Partial Functions (String-to-String Transductions)

The given semantics allows us to view algebraic terms, parameterized by Choice functions, as (partial) functions. Let us make the functional view more explicit. First, we do it for atomic modules, and then generalize to all terms.

Atomic Modules Given a specific Choice function, each module symbol $m$ is interpreted as a semantic partial mapping $m^{\textbf{Tr}}$ on strings in ${\bf U}^{+}$:

$m^{\textbf{Tr}}(h/\varepsilon)(s_{i})=s_{i+1}\ \text{ iff }\ (s_{i}\mapsto s_{i+1})\in h(m(\varepsilon)),$ (4)

where, as before, a semantic instantiation of the function variable $\varepsilon$ by the concrete Choice function $h$ is denoted $(h/\varepsilon)$.

All Terms The partial mappings (4) are extended from atomic modules to all terms, as presented in Section 0.2.3. Each term, with a specific Choice function, is interpreted as a partial function $t^{\textbf{Tr}}(h/\varepsilon):{\bf U}^{+}\rightharpoonup{\bf U}^{+}$ as follows:

$t^{\textbf{Tr}}(h/\varepsilon)(s_{i})=s_{j}\ \text{ iff }\ \text{the extension } \bar{h} \text{ of } h \text{ for } t \text{ exists, and } (s_{i}\mapsto s_{j})\in\bar{h}(t).$ (5)

Recall that the terms of the algebra take other functions as arguments. The summary of the syntax is presented in the table in Section 0.1. In that table, $\mathcal{F}$ denotes (partial) functions on ${\bf U}^{+}$. Thus, according to (5), the partial functions in the closure of $\mathcal{F}$ under all algebraic operations (2) map strings of structures to strings of structures, i.e., they are string-to-string transductions. This functional view is essential to develop an algebraic formalization of non-deterministic computations.

0.2.5 Tests vs Processes

Tests play a special role in our formalisation of non-deterministic computations as they specify computational decision problems. While tests are, essentially, “yes” or “no” questions, processes specify the actual content of these questions. We now define these notions formally.

Definition 2.

A term $t$ is called a test if, for all $h$, the extension $\bar{h}$ of $h$ returns an identity mapping, that is, if in the interpretation $\bar{h}(t)$ of $t$, we have $i=j$ in each map $(s_{i}\mapsto s_{j})\in\bar{h}(t)$. Otherwise, $t$ is called a process.

Thus, every test is a subset of the Identity (Diagonal) relation on ${\bf U}^{+}$. The cases for tests in the definition of the extension of a Choice function are 2, 3, 7 and 8.

Note that it is possible that an atomic module does not change the interpretations of any relations (“registers”), and thus does not make a transition to another state in the transition system Tr. However, such a module would not be considered a test because it still extends the current string by repeating the same letter.

Notice an important difference between tests and processes. A test can "reject" an input string because the domain of the corresponding partial function is empty. For example, $\mathord{\curvearrowright}{\rm id}(x)$ rejects all strings because $\bar{h}(\mathord{\curvearrowright}{\rm id})$ is always undefined. A test can also "accept" a string. For instance, ${\rm id}(x)$ is everywhere defined and thus "accepts" every string. The situation with processes is quite different. Every process defines a string-to-string transduction. It extends a string, provided it is defined on that string.

0.3 Dynamic Logic

We now provide an alternative (and equivalent) two-sorted version of the syntax, in the form of a Dynamic Logic. While the two formalizations are equivalent, it is, in many ways, easier to work with the logic. The syntax is given by the grammar:

$t::=m(\varepsilon)\mid{\rm id}\mid\mathord{\curvearrowright}t\mid t\mathbin{;}t\mid t\sqcup t\mid t^{\uparrow}\mid P=Q\mid\textbf{BG}(P\neq Q)\mid\phi?$
$\phi::=|t\rangle\phi\mid\top\mid\neg\phi\mid\phi\land\phi\mid\phi\lor\phi$ (6)

In the Dynamic Logic literature, the expressions in the first line are typically called process terms, and those in the second line state formulae or tests (cf. Definition 2). State formulae $\phi$ are "unary" in the same sense as $P(x)$ is a unary notation for $P(x,x)$. Semantically, they are subsets of the identity relation. The state formulae in the second line of (6) are shorthands that use the operations in the first line: $|t\rangle\phi:=\mathord{\curvearrowright}\mathord{\curvearrowright}(t\mathbin{;}\phi)$ is the "diamond" modality claiming the existence of a successful execution of $t$ leading to $\phi$. In addition, we define the more familiar operations $\top:={\rm id}$, $\neg\phi:=\mathord{\curvearrowright}\phi$, $\phi\land\psi:=\phi\mathbin{;}\psi$, $\phi\lor\psi:=\phi\sqcup\psi$, and also $\phi?:=\mathord{\curvearrowright}\mathord{\curvearrowright}\phi$, appearing in the top line of (6). We also define ${\rm Dom}(t):=\mathord{\curvearrowright}\mathord{\curvearrowright}(t)$.
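Continuing the earlier sketch (the combinators identity, anti_domain, compose and pref_union are assumed to be in scope from it), these shorthands can be written directly as derived combinators:

# Derived Dynamic Logic connectives over the partial-function combinators
# from the earlier sketch (identity, anti_domain, compose, pref_union).

top = identity                        # T := id

def neg(phi):                         # not phi := anti-domain of phi
    return anti_domain(phi)

def conj(phi, psi):                   # phi and psi := phi ; psi
    return compose(phi, psi)

def disj(phi, psi):                   # phi or psi := phi |_| psi (preferential union)
    return pref_union(phi, psi)

def diamond(t, phi):                  # |t> phi := anti-domain of anti-domain of (t ; phi)
    return anti_domain(anti_domain(compose(t, phi)))

def test(phi):                        # phi? := double anti-domain of phi, i.e. Dom(phi)
    return anti_domain(anti_domain(phi))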

An interesting property is that Unary Negation is not idempotent (a double negation does not cancel to a negation-free expression), but it is weakly idempotent: a triple negation cancels to a single one. We will come back to this property in Examples 6 and 7, after the notion of strong term equality is introduced in Section 0.5.

As a general rule, we will refer to algebraic process-terms of the form $\phi?$ in the first line of (6) as tests, and to expressions in the second line of (6) as state formulae. The difference between the terms "test" and "state formula" lies mainly in the context of their use, since state formulae are merely syntactic sugar of the algebra. Clearly, any process, including $\phi?$, can be transformed into a state formula by adding Anti-Domain ($\mathord{\curvearrowright}$) or Domain ($\mathord{\curvearrowright}\mathord{\curvearrowright}$) to it.

0.3.1 Semantics of Dynamic Logic

To interpret the Dynamic Logic (6), we need to provide semantics to both processes and state formulae. Processes are interpreted as before – as partial mappings (parameterized with hh) on strings in 𝐔+{\bf U}^{+}, that can be viewed as string-to-string transductions (5). For state formulas we define, for all concrete Choice functions hh and strings ss,

$\textbf{Tr},s\models\phi(h/\varepsilon)\ \ \text{ iff }\ \ \phi^{\textbf{Tr}}(h/\varepsilon)(s)=s.$ (7)

Notice that (7) is a particular case of (5), the partial mapping specified by a term, where the terms are tests (cf. Definition 2). This is immediate from cases 2,3,7 and 8 of the extended Choice function h¯\bar{h}.

In this paper, the relation Tr (that is specified by a secondary logic at the bottom level of the algebra) is fixed, and is omitted from Tr,sϕ(h/ε)\textbf{Tr},s\models\phi(h/\varepsilon) (cf. (7)), where ϕ\phi is a state formula. But, of course, other settings are possible as well.

0.3.2 Implicit Quantification in State Formulae

State formulae exhibit implicit quantification over the Choice function variable $\varepsilon$. Indeed, tests of the form $\mathord{\curvearrowright}\mathord{\curvearrowright}t$ represent the domain of $t$ (which, semantically, is a partial mapping (5)), and, intuitively, mean that there is a Choice function $h$ such that $s\models\mathord{\curvearrowright}\mathord{\curvearrowright}t(h/\varepsilon)$. On the other hand, tests of the form $\mathord{\curvearrowright}t$ are universal, because they claim that there is no Choice function that corresponds to a successful execution of $t$. To summarize, we have the following implicit quantification over Choice functions:

$\mathord{\curvearrowright}\mathord{\curvearrowright}t(\varepsilon)(x)$: implicitly, $\exists\varepsilon$;
$\mathord{\curvearrowright}t(\varepsilon)(x)$: implicitly, $\forall\varepsilon$. (8)

These are second-order quantifiers, since we quantify over functions. The quantifiers can interleave, since expressions of the form (8) may appear within the term tt. The two kinds of tests mentioned in (8) are of special interest in the study of computational problems specified by a term, as discussed in Section 0.4.1.

0.3.3 Programming Constructs

It is well-known that in Propositional Dynamic Logic [20], imperative programming constructs are definable using a fragment of regular languages; see the Dynamic Logic book by Harel, Kozen and Tiuryn [26]. The corresponding language is called Deterministic Regular (While) Programs in [26]. (Please note that Deterministic Regular expressions and the corresponding Glushkov automata are unrelated to what we study here. In those terms, the expression $a\mathbin{;}a^{*}$ is Deterministic Regular, while $a^{*}\mathbin{;}a$ is not. Neither expression is in our language.) In our case, imperative constructs are definable (cf. [31]) by:

${\bf skip} := {\rm id}$,    ${\bf fail} := \mathord{\curvearrowright}{\rm id}$,
${\bf if}\ \phi\ {\bf then}\ t\ {\bf else}\ t^{\prime} := (\phi?\mathbin{;}t)\sqcup t^{\prime}$,
${\bf while}\ \phi\ {\bf do}\ t := (\phi?\mathbin{;}t)^{\uparrow}\mathbin{;}(\mathord{\curvearrowright}\phi?)$,
${\bf repeat}\ t\ {\bf until}\ \phi := t\mathbin{;}((\mathord{\curvearrowright}\phi?)\mathbin{;}t)^{\uparrow}\mathbin{;}\phi?$.

Thus, importantly, the non-determinism of the operations $*$ and $\cup$ of regular languages is not needed to formalize these programming constructs.
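For illustration, the same encodings can be written over the combinators of the earlier sketches (identity, anti_domain, compose, pref_union, max_iterate, and test for $\phi?$ are assumed to be in scope); this is a sketch of the definitions above, not a separate semantics.

# Imperative constructs, written over the earlier combinators
# (identity, anti_domain, compose, pref_union, max_iterate, test).

skip = identity
fail = anti_domain(identity)

def if_then_else(phi, t, t_else):
    # if phi then t else t' := (phi? ; t) |_| t'
    return pref_union(compose(test(phi), t), t_else)

def while_do(phi, t):
    # while phi do t := (phi? ; t)^up ; (anti-domain of phi?)
    body = compose(test(phi), t)
    return compose(max_iterate(body), anti_domain(test(phi)))

def repeat_until(t, phi):
    # repeat t until phi := t ; ((anti-domain of phi?) ; t)^up ; phi?
    loop = compose(anti_domain(test(phi)), t)
    return compose(t, compose(max_iterate(loop), test(phi)))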

0.3.4 Duality between Sets of Strings and Transductions

Recall the difference between tests and processes discussed in Section 0.2.5. Notice that an application of a process-term $t$ can only make a string longer. This is because, in (5), if the extension $\bar{h}(t)$ of $h$ for a process $t$ exists, it is always the case that, in $(s_{i}\mapsto s_{j})\in\bar{h}(t)$, we have $i<j$. Thus, the mapping associated with a process-term can be viewed as a set of (non-empty) strings over the alphabet ${\bf U}$. We call the (non-empty) strings in this set the deltas of $t$. In certain contexts, we also call them traces of $t$. The term "delta" is more appropriate when we talk about applying a term, e.g., when we say that we apply a string delta selected by $\bar{h}$. A "trace" is something left after the execution of $t$ from an input structure $\mathfrak{A}$. We talk more about traces in Section 0.4.2. Notice that tests never extend a string. Their deltas are the empty strings.

Figure 5: The set of strings in the middle, shown above the arrow, is the set of deltas of $t(\varepsilon)$. For $(s_{i}\mapsto s_{j})\in\bar{h}(t(\varepsilon))$, the string $s(i,j)$ is one such delta. It extends $s_{i}$ and produces $s_{j}$. Note that, even if the set of deltas is finite, the domain and codomain of $\bar{h}(t(\varepsilon))$ are, in general, infinite (shown on the left and on the right).

In fact, deltas of the form s(1,j)s(1,j), for some jj, are associated with certificates for computational decision problems, and are related to some equivalence classes of Choice functions that “agree” on s(1,j)s(1,j), as will be seen shortly, in Section 0.4.2.

In summary, the interpretation of each process-term can be viewed as a set of deltas – non-empty strings – “extensions” applied to strings on the input.

0.4 Main Computational Task

In this section, we formulate the main computational task we study in this paper. The task amounts to satisfiability, with respect to a one-letter string, of a state formula of a certain form. That is, the task is a particular case of satisfying a state formula, cf. (7). Importantly, we define a computational problem specified by an algebraic term as a class of relational structures. We then discuss how certain equivalence classes of Choice functions, which, intuitively, correspond to traces of computation, can serve as certificates for such computational problems.

Recall that, in this paper, relation Tr is fixed, and is omitted from Tr,sϕ(h/ε)\textbf{Tr},s\models\phi(h/\varepsilon) in (7). This state formula ϕ\phi will be of a special form, in the computational task we define next.

Problem: Main Task (Decision Version). Given: a structure $\mathfrak{A}$ with an empty vocabulary and a term $t$. Question: $\exists h\ \mathfrak{A}\models|t\rangle\top(h/\varepsilon)$? (9)

Intuitively, when the answer is yes, (9) says that there is a successful execution of tt at the input structure 𝔄\mathfrak{A}, e.g., a graph, and hh is a witness of it.

The study of the decision version of the computational task (9) is the main goal of this paper. In particular, we are interested in the data complexity of this task, where the formula is fixed and the input structures vary [54]. (Notice that fixing the relation $\textbf{Tr}$ means fixing a formula of the secondary logic. The size of the transition graph, i.e., of the interpretation of the specification in the secondary logic, depends on the size of the input structure.)

Dually, the complement of the problem (9) can be defined, using the fact that $\mathord{\curvearrowright}t$ is the complement of $\mathord{\curvearrowright}\mathord{\curvearrowright}t$, which is a syntactic variant of $|t\rangle\top$. For instance, if a problem in the complexity class NP is specified using the (implicitly existential) term $\mathord{\curvearrowright}\mathord{\curvearrowright}t$, then its complement, a problem in co-NP, is specified by the term $\mathord{\curvearrowright}t$, with an implicit universal quantifier. The reader who is familiar with some elements of Descriptive Complexity (see, e.g., Immerman's book [30]) might notice an analogy with the connection between second-order quantifier alternations and the polynomial-time hierarchy.

Search Version The state formula $|t\rangle\top(\varepsilon)$ may be seen as defining a set of Choice functions with respect to the input structure $\mathfrak{A}$. This gives us a search version of the main task, which asks us to find all such $h$. Recall that only one free variable $\varepsilon$ is allowed per term. (There is an analogy with evaluating queries in database theory. A formula $\phi(\bar{x})$ with free variables $\bar{x}$ in classical logic is viewed as a query to a database (relational structure). The query $\mathfrak{A}\models\phi(\bar{x})$ returns the set of tuples of domain elements that, when instantiated for the free variables $\bar{x}$, make the formula true in the structure.)

0.4.1 Computational Problem 𝒫t\mathcal{P}_{t} Specified by Algebraic Term tt

Definition 3.

A computational problem specified by a process-term $t$ is an isomorphism-closed class $\mathcal{P}_{t}$ of $\tau$-structures $\mathfrak{A}$ such that a structure $\mathfrak{A}$ is in this class if and only if (9) holds, that is, there exists $h$ such that $\mathfrak{A}\models|t\rangle\top(h/\varepsilon)$. (For natural secondary logics such as the one given in Section 0.6, isomorphism-closure holds for all structures satisfying the Main Task (9).)

Here, we assume that, in 𝔄\mathfrak{A}, all relational “register” symbols in τreg\tau_{\rm reg} are interpreted by a special “blank” element, and the interpretation of τEDB\tau_{\rm EDB} describes the input to the problem.

Intuitively, the class 𝒫t\mathcal{P}_{t} contains all structures 𝔄\mathfrak{A} such that a successful execution of tt on input 𝔄\mathfrak{A} is possible, or tt is defined on input 𝔄\mathfrak{A}. We will talk more about defined and undefined terms in Section 0.5.

Many Choice functions witness the main task (9) in the "same way". We now explain this notion of similarity formally, by introducing equivalence classes on Choice functions.

0.4.2 Traces as Equivalence Classes of Choice Functions

We have seen, in (9), that Choice functions witness the main computational task. But we don't want to distinguish, as certificates, between Choice functions that are, in some sense, "similar". This similarity relates to the notion of a trace. We say that a string $s$ is a trace of a term $t$ from input $\mathfrak{A}$ if there is $h$ such that $(\mathfrak{A}\mapsto s)\in\bar{h}(t(\varepsilon))$. An example of a trace of $t := m_{17}\mathbin{;}m_{17}\mathbin{;}m_{17}(\varepsilon)$ is highlighted in Figure 1 by the shaded area.

For the same Choice function, many different terms can produce the same trace, say 𝔄𝔅\mathfrak{A}\cdot\mathfrak{B}\cdot\mathfrak{C}, as Example 1 in Section 0.2.3 shows. Dually, for the same term tt and an input structure 𝔄\mathfrak{A}, different Choice functions can generate the same trace, but act differently outside of it. We consider two different Choice functions hh and hh^{\prime} equivalent with respect to tt and 𝔄\mathfrak{A}, if for that term, they agree on a trace ss of tt from 𝔄\mathfrak{A}, i.e., return the same mapping 𝔄s\mathfrak{A}\mapsto s:

$h\cong_{(\mathfrak{A},t)}h^{\prime}\ \ \Leftrightarrow\ \ (\mathfrak{A}\mapsto s)\in\bar{h}(t)\mbox{ iff }(\mathfrak{A}\mapsto s)\in\bar{h^{\prime}}(t),\mbox{ for some string }s\in{\bf U}^{+}.$ (10)

This equivalence relation gives us equivalence classes [h](𝔄,t)[h]_{(\mathfrak{A},t)} of Choice functions. Each equivalence class has a one-to-one correspondence with a trace of tt from 𝔄\mathfrak{A}, so we can refer to traces as equivalence classes [h](𝔄,t)[h]_{(\mathfrak{A},t)} (i.e., with a string ss on which the Choice functions agree). We will see shortly that these equivalence classes (or traces) can be viewed as certificates for the membership of 𝔄\mathfrak{A} in a computational problem specified by a term.

Example 2.

Consider a tree that represents the unwinding of the transition relation Tr from the input structure 𝔄\mathfrak{A} in Figure 1. For each branch from 𝔄\mathfrak{A}, that is a trace of some term tt, there could be Choice functions that agree on that branch, but act differently (return different transductions for the same term t:=m17;m17;m17(ε)t\ :=\ m_{17}\mathbin{;}m_{17}\mathbin{;}m_{17}(\varepsilon)) on nodes not in that branch. For example, some other Choice function gg may agree with hh on the trace highlighted by the shaded area in the figure, but can return different mappings for the same term, e.g., for the node 𝔅\mathfrak{B}, which is not in that trace. So, h(𝔄,t)gh\cong_{(\mathfrak{A},t)}g, but h≇(𝔅,t)gh\not\cong_{(\mathfrak{B},t)}g.

Example 3.

For all atomic modules $m\in\mathcal{M}$ and all structures $\mathfrak{A}\in{\bf U}$, we have

$[h]_{(\mathfrak{A},m)} := \{h\mid\exists\mathfrak{B}\ (\mathfrak{A},\mathfrak{B})\in h(m(\varepsilon))\}.$
Example 4.

Interestingly, Identity, a term counterpart of a tautology, generates only "realistic" promises, i.e., those on which at least one module is defined:

$[h]_{(\mathfrak{A},{\rm id})} := \{h\mid\exists m\in\mathcal{M}\text{ and }\exists\mathfrak{B}\ (\mathfrak{A},\mathfrak{B})\in h(m(\varepsilon))\}.$

0.4.3 Promises (a.k.a. Witnesses or Certificates)

We define the set of witnesses for $\mathfrak{A}$ in $\mathcal{P}_{t}$ as

$W^{t}_{\mathfrak{A}} := \{[h]_{(\mathfrak{A},t)}\mid\mathfrak{A}\models|t\rangle\top(h/\varepsilon)\}\ \ \text{ and }\ \ W^{t} := \bigcup_{\mathfrak{A}\in\mathcal{P}_{t}}W^{t}_{\mathfrak{A}}.$

We also call the witnesses “promises”, which gives the name to the algebra (2) – Promise Algebra. If a promise is given, the computation is guaranteed to succeed.

0.4.4 Boolean Algebra of Promises

The intended use of the algebra is to study queries of the form (9). In that use, strings in 𝐔+{\bf U}^{+} are partitioned into those that are traces of some algebraic terms that represent programs, and thus can potentially act as “yes”-certificates to computational decision problems, and those that are not traces of any term whatsoever, and thus cannot be “yes”-certificates for any 𝒫t\mathcal{P}_{t}. We can consider the Boolean algebra of the set of all potential “yes” certificates as follows.

The Boolean algebra B(A)B(A) of a set AA is the set of subsets of AA that can be obtained by means of a finite number of the set operations union (OR), intersection (AND), and complementation (NOT), see pages 185-186 of Comtet’s 1974 book [15].

Let \mathcal{H} be the set of all promises, i.e., equivalence classes on functions from 𝖢𝖧(𝐔,){\sf CH}({\bf U},\mathcal{M}), with respect to the equivalence relation (10).

Definition 4.

The Boolean Algebra of Promises is the Boolean algebra of the set $\{(s\mapsto s)\mid s\in\mathcal{H}\}$.

For simplicity, since the pairs $(s\mapsto s)$ always contain identical strings in $\mathcal{H}$, we denote this Boolean algebra by $B(\mathcal{H})$.

If $\sigma$ is a signature and $A$, $B$ are $\sigma$-structures (also called $\sigma$-algebras in universal algebra), then a map $h:A\to B$ is a $\sigma$-embedding if all of the following hold:

  • $h$ is injective,

  • for every $n$-ary function symbol $f\in\sigma$ and $a_{1},\ldots,a_{n}\in A^{n}$, we have

    $h(f^{A}(a_{1},\ldots,a_{n}))=f^{B}(h(a_{1}),\ldots,h(a_{n})).$ (11)

The fact that a map $h:A\to B$ is an embedding is indicated by the use of a "hooked arrow" $h:A\hookrightarrow B$. (Note that this notation is sometimes also used for inclusion maps, but here we mean an embedding, which happens to be an inclusion.)

Proposition 1.

Let $\sigma=\{\land,\lor,\neg\}$. There is a $\sigma$-embedding of the Boolean algebra $B(\mathcal{H})$ into the Dynamic Logic (6).

Proof.

We need a structure-preserving injective mapping $h:A\hookrightarrow B$, where $A$ is the Boolean Algebra of Promises, $B(\mathcal{H})$, and $B$ is the Dynamic Logic. The elements $a_{i}\in A$ in (11) are, in our case, sets of partial mappings of the form $(s\mapsto s)$. The mapping $h:A\hookrightarrow B$ is such that all sets of maps $(s\mapsto s)$ of promises are mapped to themselves. The homomorphism property (11) clearly holds because, for $S,T\subseteq\{(s\mapsto s)\mid s\in\mathcal{H}\}$, we have $h(S\land^{A}T)=h(S)\land^{B}h(T)$, and similarly for $\lor$ and $\neg$, by the semantics of state formulae (7). ∎

The following proposition is straightforward, as it follows immediately from the semantics of terms as string-to-string transductions and the definition of the equivalence relation (10).

Proposition 2.

The underlying set \mathcal{H} of the Boolean algebra B()B(\mathcal{H}) has a forest structure, with a tree for each term.

0.5 Strong Equality for Reasoning about Computation

In this section, we explain that the truth of certain (conditional) equalities between terms indicates the existence, or non-existence, of a “yes” certificate for a computational problem specified by a term.

Recall that terms are interpreted as partial functions, see (3). This is crucial for reasoning about computations (cf. the Main Task (9)), since programs are not everywhere defined.

0.5.1 Defined and Undefined Terms

Definition 5.

We say that a term $t$ is defined in $s_{i}$, notation $t(s_{i})\!\downarrow$, if there is $h\in{\sf CH}({\bf U},\mathcal{M})$ and a string $s_{j}$ such that $t^{\textbf{Tr}}(h/\varepsilon)(s_{i})=s_{j}$; otherwise $t$ is undefined in $s_{i}$, notation $t(s_{i})\!\not\downarrow$.

Intuitively, definedness is associated with the existence of a “yes” certificate, and undefinedness is associated with the non-existence of a “yes” certificate of a computational problem. Thus, another way of viewing the computational problem 𝒫t\mathcal{P}_{t} specified by term tt is as an isomorphism-closed class of structures 𝔄\mathfrak{A} such that t(𝔄)t(\mathfrak{A})\!\downarrow.

0.5.2 Trace Equivalence (a.k.a. Strong Equivalence)

Let the state-to-state transition relation Tr (given by the secondary logic), be fixed. Let ss be a string in 𝐔+{\bf U}^{+}.

Definition 6.

Terms $t$ and $g$ are strongly equivalent (or trace-equivalent) on a string $s\in{\bf U}^{+}$, notation $t(s)\doteq g(s)$, if

  1. they are both defined on $s$, in symbols, $t(s)\!\downarrow$ and $g(s)\!\downarrow$, and

  2. for all $h\in{\sf CH}({\bf U},\mathcal{M})$, they denote the same mapping on $s$, i.e., $t^{\textbf{Tr}}(h/\varepsilon)(s)=g^{\textbf{Tr}}(h/\varepsilon)(s)$.

We say that terms $t$ and $g$ are strongly equal on a set $S\subseteq{\bf U}^{+}$ if for all strings $s$ in $S$, we have $t(s)\doteq g(s)$.

Notice that terms that are trace-equivalent on some set SS have the same sets of deltas, extending strings in that set SS.

Example 5.

All tests that are vacuously true on all strings s𝐔+s\in{\bf U}^{+} (are tautologies) are strongly equal to Identity id{\rm id} on the set of all strings 𝐔+{\bf U}^{+}.

Example 6.

The strong equivalence $\mathord{\curvearrowright}\mathord{\curvearrowright}\mathord{\curvearrowright}t(s)\doteq\mathord{\curvearrowright}t(s)$ holds on the set of those strings where $t$ is undefined (equivalently, where $\mathord{\curvearrowright}t$ is defined). The equivalence holds because the Anti-Domain of the Domain (i.e., $\mathord{\curvearrowright}\mathord{\curvearrowright}\mathord{\curvearrowright}t(s)$) is the Anti-Domain.

Example 7.

On the other hand, if $t$ is a process-term, then there is no set $S$ on which $\mathord{\curvearrowright}\mathord{\curvearrowright}t(s)\doteq t(s)$. We leave it to the reader to figure out why this is the case (one has to construct a counterexample).

Note that undefined terms are not considered (strongly) equal. Thus, strong equality is not reflexive. This is an important property for reasoning about computation, as discussed shortly in Section 0.5.4.

0.5.3 Before-After Equivalence (a.k.a. Input-Output Equivalence)

To define the notion of a Before-After equivalence of terms, we need the following notion of a projection onto 𝐔{\bf U}, which is simply the set of all pairs (start, end) of a term’s deltas.

A projection ${\sf Pr}$ of a term $t$ under Choice function $h$ onto ${\bf U}$ is defined as:

$(\,s(i),s(j)\,)\in{\sf Pr}(t,h)\ \ \Leftrightarrow\ \ (s_{i}\mapsto s_{j})\in\bar{h}(t).$
Definition 7.

Terms $t$ and $g$ are Before-After equivalent on $s$, notation $t(s)\fallingdotseq g(s)$, if

  1. they are both defined on $s$, in symbols, $t(s)\!\downarrow$ and $g(s)\!\downarrow$, and

  2. for all $h\in{\sf CH}({\bf U},\mathcal{M})$, their projections onto ${\bf U}$ coincide, i.e., ${\sf Pr}(t,h)={\sf Pr}(g,h)$.

Both equalities we have just introduced can be used to specify the existence of a certificate, that invisible elephant in the room, as we show next.

0.5.4 Strong Equality and Existence of a Certificate

An important consequence of the definitions of Strong equality and Before-After equality is that, in both cases, such an equality is not necessarily reflexive. As a consequence, self-equality is identified with definedness. On the other hand, undefinedness, in xx, denoted t(x)↓̸t(x)\!\not\downarrow, is associated with t(x)x\mathord{\curvearrowright}t(x)\doteq x (here, \fallingdotseq can be considered as well, depending on specific application needs).

To summarize,

$t(x)\!\downarrow$    abbreviates    $t(x)\doteq t(x)$:    a "yes" certificate for $x$ in $\mathcal{P}_{t}$ exists,
$t(x)\!\not\downarrow$    abbreviates    $\mathord{\curvearrowright}t(x)\doteq x$:    no "yes" certificates for $x$ in $\mathcal{P}_{t}$ exist.

Since terms denote partial mappings, strong equalities such as t(x)g(x)t(x)\doteq g(x) may hold for some strings xx, but not for all xx, as in Example 6. For that reason, the algebra (2) cannot be axiomatized by universally quantified equalities between terms. Instead of equational theories that axiomatize algebraic structures such as groups, we need to consider quasi-equational theories, i.e., those with conditional equations in the Horn form.

The fact that the existence of a certificate is associated with the equality of certain terms opens up the possibility of developing a proof system in which deriving the existence of a certificate is possible. Developing such a quasi-equational theory and a proof system is outside the scope of this paper.

0.6 A Logic for Atomic Modules

In this section, first, we discuss the Law of Inertia, that must hold regardless of what secondary logic is used for specifying atomic modules. Second, we define a specific secondary logic used further in this paper. The logic is based on a modification of Conjunctive Queries (CQs) – only one element of the set generated by the CQ is, non-deterministically, returned on the output. Finally, we come back to the Law of Inertia for that specific secondary logic.

0.6.1 Law of Inertia

Recall that we have used the relation $\textbf{Tr}\llbracket{m(\varepsilon)}\rrbracket$, which represents a transition system, in Definition 1 of the semantics of atomic modules, via Choice functions. We have, temporarily, treated it as an arbitrary binary relation. However, this relation has an important property that always holds for any update: everything that is not explicitly modified in the update must remain the same. This property is commonly called the Law of Inertia, starting from McCarthy and Hayes 1969 [36]. Figure 6 illustrates the Law of Inertia for atomic modules. Recall that the vocabulary symbols $\tau$ are partitioned into EDB relations and "registers", $\tau := \tau_{\rm EDB}\uplus\tau_{\rm reg}$, see (1).

Figure 6: One of the possible transitions (𝔄,𝔅)Trm(ε)(\mathfrak{A},\mathfrak{B})\in\textbf{Tr}\llbracket{m(\varepsilon)}\rrbracket of an atomic module m(ε)m(\varepsilon). The module “checks” whether the relation LL (or a set of such relations) is true in the τ\tau-structure 𝔄\mathfrak{A} on the left and, if so, it updates the value of the “register” RR in the τ\tau-structure 𝔅\mathfrak{B} on the right. The interpretation of all other relations in τ\tau remains unchanged, by inertia.

0.6.2 Atomic Modules as Vector Transformations

We restrict atomic transductions so that the only relations that change from structure to structure are unary and contain one domain element at a time, similarly to registers in a Register Machine [46]. Please see Section 0.6.6 for an explicit discussion of the machine model used.

Register values, i.e., the contents of unary singleton-set relations, get updated by state-to-state transitions. Updating the registers can be thought of as re-interpreting a fixed set (a vector) of kk constants, interpreted by domain elements. Such an update can be specified in any way. For example, it can be a linear-algebra transformation b¯=La¯\bar{b}=L\bar{a}, or, in general, any other transformation b¯=T(a¯,EDB)\bar{b}=T(\bar{a},{\rm EDB}) that takes the input EDB relations into account.

A vector a¯=R1Rk\bar{a}=R_{1}\cdots R_{k} of registers (one-dimensional array) can encode an ll-dimensional matrix by providing appropriate indices such as i,j,li,j,l, e.g., Ri,j,lR_{i,j,l} for a 3-dimensional matrix. These indices correspond to keys i,j,li,j,l in ([i,j,l];R)([i,j,l];\ R), a Graph Normal Form (GNF) representation of the 3-dimensional matrix.

Note that EDB relations given on the input are never updated. All computational work happens in the registers.

0.6.3 Intuitions for a Specific Secondary Logic

As mentioned above, the bottom level of the algebra can be specified in any formalism. In the sections that follow, we specify a particular secondary logic for axiomatizing atomic transductions, which is similar to Conjunctive Queries (CQs).

Intuitively, we take a monadic primitive positive (M-PP) relation, i.e., a unary relation definable by a unary CQ, and output, arbitrarily, only a single element contained in the relation, instead of the whole relation. There can be several such CQs in the same module, applied at once. A formal definition will be given shortly.

For readers familiar with Datalog (such a reader is invited to jump ahead to see the general form (16)), we mention informally that, at the bottom level of the algebra, we have a set of Datalog-like “programs”, one for each atomic module symbol mm\in\mathcal{M}. The “programs” are similar to Conjunctive Queries (non-recursive Datalog programs). The rules in such programs have a unary predicate symbol in the head of each rule. Each (simultaneous) application of the rules puts only one domain element into each unary IDB relation in the head, out of several possible. This creates non-determinism in the atomic module applications. The applications of the modules are controlled by algebraic expressions in (2) at the top level of the algebra.

This logic is the second parameter that affects the expressive power of the language, in addition to the selection of the algebraic operations presented earlier. The logic is carefully constructed to ensure complexity bounds presented towards the end of the paper.

0.6.4 Preliminaries: Conjunctive Queries

We start our formal exposition by reviewing queries, Conjunctive Queries (CQs), and PP-definable relations.

Let CC be a class of relational structures of some vocabulary τ\tau. Following Gurevich [25], we say that an rr-ary CC-global relation qq (a query) assigns to each structure 𝔄\mathfrak{A} in CC an rr-ary relation q(𝔄)q(\mathfrak{A}) on 𝔄\mathfrak{A}; the relation q(𝔄)q(\mathfrak{A}) is the specialization of qq to 𝔄\mathfrak{A}. The vocabulary τ\tau is the vocabulary of qq. If CC is the class of all τ\tau-structures, we say that qq is τ\tau-global. Let \mathcal{L} be a logic. A kk-ary query qq on CC is \mathcal{L}-definable if there is an \mathcal{L}-formula ψ(x1,,xk)\psi(x_{1},\dots,x_{k}) with x1,,xkx_{1},\dots,x_{k} as free variables such that for every 𝔄C\mathfrak{A}\in C,

q(𝔄)={(a1,,ak)𝒟k𝔄ψ(a1,,ak)}.q(\mathfrak{A})=\{(a_{1},\dots,a_{k})\in\mathcal{D}^{k}\mid\mathfrak{A}\models\psi(a_{1},\dots,a_{k})\}. (12)

In this case, q(𝔄)q(\mathfrak{A}) is called the output of qq on 𝔄\mathfrak{A}. Query qq is unary if k=1k=1. A conjunctive query (CQ) is a query definable by an FO formula in prenex normal form built from atomic formulas, \land, and \exists only. A relation is Primitive Positive (PP) if it is definable by a CQ:

x1xk(R(x1,,xk)z1zm(B1Bm)Φ(x1,,xk)),\forall x_{1}\dots\forall x_{k}\ \big{(}R(x_{1},\dots,x_{k})\leftrightarrow\underbrace{\exists z_{1}\dots\exists z_{m}\ (B_{1}\land\cdots\land B_{m})}_{\Phi(x_{1},\dots,x_{k})}\big{)},

and each atomic formula BiB_{i} has object variables from x1,,xk,z1,,zmx_{1},\dots,x_{k},z_{1},\dots,z_{m}. We will use x¯\bar{x}, u¯\bar{u}, etc., to denote tuples of variables, and use the standard Datalog notation for the sentence above:

R(x¯)B1(u¯1),,Bn(u¯n)Φbody,R(\bar{x})\leftarrow\underbrace{B_{1}(\bar{u}_{1}),\dots,B_{n}(\bar{u}_{n})}_{\Phi_{\rm body}}, (13)

where the part on the right of \leftarrow is called the body, and the part to the left of \leftarrow the head of the rule or the answer. Notice that an answer symbol is always different from the symbols in the body (13), since, by definition, CQs are not recursive. We say that R(x1,,xk)R(x_{1},\dots,x_{k}) is monadic if k=1k=1, and apply this term to the corresponding PP-relations as well.

Example 8.

Relation Path of Length Two is PP-definable, but not monadic: P(x1,x2)E(x1,z),E(z,x2)P(x_{1},x_{2})\leftarrow E(x_{1},z),E(z,x_{2}).

Example 9.

Relation At Distance Two from XX is monadic and PP-definable: D(x2)X(x1),E(x1,z),E(z,x2)D(x_{2})\leftarrow X(x_{1}),E(x_{1},z),E(z,x_{2}).
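To make the definitions concrete, the following minimal Python sketch evaluates the unary CQ of Example 9 on a toy structure; the relation names and tuples are chosen arbitrarily for illustration.

```python
# A minimal sketch: naive evaluation of the unary CQ from Example 9,
#   D(x2) <- X(x1), E(x1, z), E(z, x2),
# on a toy structure given as plain Python sets.
X = {1}                          # unary EDB relation X
E = {(1, 2), (2, 3), (2, 4)}     # binary EDB relation E

# q(A) = { x2 | exists x1, z : X(x1) and E(x1, z) and E(z, x2) }
D = {x2
     for x1 in X
     for (a, z) in E if a == x1
     for (b, x2) in E if b == z}

print(D)   # {3, 4}: the full answer set, a monadic PP-definable relation
```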

From now on, we assume that all queries are conjunctive, and their answers are monadic, as in Example 9. We now introduce a modification of CQs we use as the secondary logic, the bottom level of our algebra. Consider a relational vocabulary τ:=τEDBτreg,\tau\ :=\ \tau_{\rm EDB}\uplus\tau_{\rm reg}, consisting of two disjoint sets of relational symbols. The symbols in τreg\tau_{\rm reg} are monadic, and reg\rm reg abbreviates “registers”. Their interpretations change during a computation formalized by an algebraic term. The arities of the symbols of τEDB\tau_{\rm EDB} are arbitrary. Their interpretations are given by the input structure.

0.6.5 SM-PP Atomic Modules

Let Ri(x)ΦbodyR_{i}^{\prime}(x)\leftarrow\Phi_{\rm body} be a unary CQ where RiτregR_{i}\in\tau_{\rm reg} and Φbody\Phi_{\rm body} is in τ\tau. The reason for the renaming of RiR_{i} by RiR^{\prime}_{i} is to specify RiR_{i}’s value in a successor state, without introducing recursion over RiR_{i}, which is not allowed in CQs.

Definition 8.

Singleton-set-Monadic Primitive Positive (SM-PP) relation is a singleton-set relation RR^{\prime} implicitly definable by:

x((Ri(x)Φbody(x))xy(Ri(x)Ri(y)x=y)).\begin{array}[]{c}\forall x\ \big{(}(R_{i}^{\prime}(x)\to\Phi_{\rm body}(x))\\ \land\ \forall x\forall y(R_{i}^{\prime}(x)\land R_{i}^{\prime}(y)\to x=y)\big{)}.\end{array} (14)
Notation 1.

We use a rule-based notation for (14):

Ri(x)Φbody.R_{i}(x)\leftarrowtail\Phi_{\rm body}. (15)

Notation “\leftarrowtail” in (15), unlike Datalog’s “\leftarrow” in (13), is used to emphasize that only one domain element is put into the relation in the head of the rule.

Example 10.

Suppose we want to put just one (arbitrary) element into the extension of DD, denoting “At Distance Two” from Example 9. In the rule-based syntax (15):

D(x2)X(x1),E(x1,z),E(z,x2).D(x_{2})\leftarrowtail X(x_{1}),E(x_{1},z),E(z,x_{2}).

The defined relation is SM-PP. Since only one domain element, out of those at distance two from XX, is put into DD, there could be multiple outcomes, up to the size of the input domain.

Example 11.

The same example “At Distance Two” can be formalized in a variant of Codd’s relational algebra, as a join of several relations, with a new operation πattributeε\pi^{\varepsilon}_{\rm attribute} of Choice-projection, where one element is put into the unary answer DD. Here, the attribute is x2x_{2}.

D:=πx2ε(Xx1EzE).D\ :=\ \pi^{\varepsilon}_{x_{2}}\ (X\Join_{x_{1}}E\Join_{z}E).
Example 12.

A non-example of an SM-PP relation is Path of Length Two (Example 8). The relation PP on the output of the atomic module is neither monadic nor singleton-set.

Definition 9.

An SM-PP module is a set of rules of the form (15):

m(ε):={R1(x)Φbody1Rk(x)Φbodyk},m(\varepsilon)\ :=\ \left\{\begin{array}[]{l}R_{1}(x)\leftarrowtail\Phi^{1}_{\rm body}\\ \cdots\\ R_{k}(x)\leftarrowtail\Phi^{k}_{\rm body}\end{array}\right\}, (16)

where m(ε)m(\varepsilon) is a symbolic notation for the module, mm\in\mathcal{M}, and ε\varepsilon is a free function variable ranging over Choice functions.

In the rest of this paper, the modules are SM-PP, that is, they define SM-PP relations. We use modules to specify atomic transitions or actions. Each module may update the interpretations of several registers simultaneously. Thus, we allow limited parallel computations. The parallelism is essential – this way we avoid using relations of arity greater than one, such as in the Same Generation example, explained in Section 0.7.

Inertia for SM-PP Modules

We now formulate the Law of Inertia for SM-PP modules. Consider a module specification in the form (16). We say that a value aa of a register RiR_{i} in a successor state 𝔅\mathfrak{B} (where aa is a domain element) is forced by the transition (𝔄,𝔅)Trm(ε)(\mathfrak{A},\mathfrak{B})\in\textbf{Tr}\llbracket{m(\varepsilon)}\rrbracket if,

whenever 𝔄Φi[a/x],we have that aRi𝔅.\text{whenever }\mathfrak{A}\models\Phi^{i}[a/x],\text{we have that }a\in R_{i}^{\mathfrak{B}}.

Here, Φi[a/x]\Phi^{i}[a/x] is a formula in the secondary logic (here, a unary conjunctive query with xx being a free variable). These queries must be applied in 𝔄\mathfrak{A} simultaneously, in order for the update (𝔄,𝔅)Trm(ε)(\mathfrak{A},\mathfrak{B})\in\textbf{Tr}\llbracket{m(\varepsilon)}\rrbracket to happen.

Definition 10.

The transition relation specified by m(ε)m(\varepsilon) is a binary relation Trm(ε)\textbf{Tr}\llbracket{m(\varepsilon)}\rrbracket\subseteq{\bf U}\times{\bf U} such that, for all transitions (𝔄,𝔅)Trm(ε)(\mathfrak{A},\mathfrak{B})\in\textbf{Tr}\llbracket{m(\varepsilon)}\rrbracket, the values of all registers in 𝔅\mathfrak{B} forced by the transition are as specified by the rules (16), and all other relations remain unchanged in the transition from 𝔄\mathfrak{A} to 𝔅\mathfrak{B}.
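The following Python sketch, not the formal semantics, illustrates Definition 10 on the single-rule module of Example 10: each rule forces one element of its body's answer into its register, all other relations are copied unchanged by inertia, and one transition is produced per simultaneous choice. Treating an empty body answer as blocking the transition is a modeling assumption of the sketch.

```python
# A minimal sketch of the transitions of an SM-PP module on one structure.
from itertools import product

def successors(struct, module):
    """module maps a register name to a function computing its body's CQ answer."""
    heads = list(module)
    answers = [sorted(module[r](struct)) for r in heads]
    if any(not a for a in answers):
        return []                        # assumption: an empty body blocks the transition
    succs = []
    for choice in product(*answers):     # one simultaneous choice per rule
        new = dict(struct)               # inertia: start from a copy of the current state
        for r, a in zip(heads, choice):
            new[r] = frozenset({a})      # each register holds a singleton set
        succs.append(new)
    return succs

# the single-rule module of Example 10 on a toy graph
struct = {"X": frozenset({1}),
          "E": frozenset({(1, 2), (2, 3), (2, 4)}),
          "D": frozenset()}
module = {"D": lambda s: {x2 for (x1, z) in s["E"] if x1 in s["X"]
                             for (z2, x2) in s["E"] if z2 == z}}

print([t["D"] for t in successors(struct, module)])
# [frozenset({3}), frozenset({4})]: two possible atomic transitions; X and E are unchanged
```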

Definition 11.

By 𝕃\mathbb{L} we denote the algebra with a two-level syntax that consists of the algebra defined in Section 0.1, with SM-PP modules of the form (16), see Definition 9.

0.6.6 Machine Model

We can think of evaluations of algebraic expressions as computations of Register machines starting from input 𝔄\mathfrak{A}. The machines are reminiscent of those of Shepherdson and Sturgis [46]. Importantly, we are interested in isomorphism-invariant computations of these machines, i.e., those that do not distinguish between isomorphic input structures. Intuitively, we have:

  • monadic “registers” – predicates used during the computation, each containing only one domain element at a time;

  • the “real” inputs, e.g., the edge relation E(x,y)E(x,y) of an input graph, are of any arity;

  • atomic transitions correspond to conditional assignments with a non-deterministic outcome;

  • in each atomic step, only the registers of the previous state or the input structure are accessible;

  • a concrete Choice function, depending on the history, chooses one of the possible outputs;

  • computations are controlled by algebraic terms that, intuitively, represent programs.

In Section 0.3, we saw that the main programming constructs are definable. Thus, programs for our machines are very close to standard Imperative programs. In Section 0.7, we give examples of “programming” in this model.

0.7 Examples

We now give some cardinality and reachability examples, as well as examples with mixed propagations. We assume that the input structure 𝔄\mathfrak{A} is of combined vocabulary τ:=τEDBτreg\tau\ :=\ \tau_{\rm EDB}\uplus\tau_{\rm reg}, where 𝔄|τEDB\mathfrak{A}|_{\tau_{\rm EDB}} is an input to a computational problem, e.g., a graph, and the “registers” in 𝔄|τreg\mathfrak{A}|_{\tau_{\rm reg}} are interpreted by a “blank” symbol. We use α\alpha with subscripts to denote algebraic terms, and add τEDB\tau_{\rm EDB} symbols, e.g., PP and QQ in αeq_size(P,Q)\alpha_{\rm eq\_size}({P},{Q}), to emphasize what is given on the input.

0.7.1 Cardinality Examples

Size Four

Problem: Size Four α4\alpha_{4} Given: A structure 𝔄\mathfrak{A} with a vocabulary symbol adomadom denoting its active domain. Question: Is |adom𝔄||adom^{\mathfrak{A}}| equal to 4?

GuessP(ε):={P(x)adom(x)}.\textit{GuessP}(\varepsilon):=\left\{\begin{array}[]{l}P(x)\leftarrowtail adom(x)\end{array}\right\}.

Here, we put an arbitrary element of the active domain into PP. We specify guessing a new element by checking that the current interpretation of PP (denoted PnowP_{now}) has never appeared in the trace of the program before:

GuessNewP(ε):=GuessP; BG(PnowP).\begin{array}[]{l}\textit{GuessNewP}(\varepsilon):=\textit{GuessP}\mathbin{;}\textbf{\tiny{ BG}}(P_{now}\not=P).\end{array}

The problem Size Four is now specified as:

α4:=GuessNewP4;GuessNewP,\alpha_{4}:={\textit{GuessNewP}\,}^{4}\mathbin{;}\mathord{\curvearrowright}\textit{GuessNewP},

where the power four means that we execute the guessing procedure four times sequentially. The answer to the question 𝔄|α4(ε)\mathfrak{A}\models|\alpha_{4}\rangle{\top}(\varepsilon) is non-empty, i.e., it is possible to find a concrete Choice function to semantically instantiate ε\varepsilon, if and only if the input domain is of size four. Obviously, such a program can be written for any natural number.
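The intended semantics of α4\alpha_{4} can be checked by brute force, as in the following sketch; enumerating all sequences of four pairwise-distinct guesses stands in for the existential quantification over Choice functions, and is meant only as an illustration.

```python
# A minimal sketch: brute-force check of alpha_4 = GuessNewP^4 ; ~GuessNewP.
from itertools import permutations

def alpha_4_holds(adom):
    for trace in permutations(adom, 4):     # four pairwise-distinct guesses
        if not (set(adom) - set(trace)):    # ~GuessNewP: no fresh element is left
            return True                     # this sequence of choices is a "yes" certificate
    return False

print(alpha_4_holds({'a', 'b', 'c', 'd'}))       # True:  |adom| = 4
print(alpha_4_holds({'a', 'b', 'c', 'd', 'e'}))  # False: a fifth fresh guess exists
print(alpha_4_holds({'a', 'b', 'c'}))            # False: four distinct guesses are impossible
```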

Same Size

Problem: Same Size αeq_size(P,Q)\alpha_{\rm eq\_size}({P},{Q}) Given: Two unary relations PP and QQ. Question: Are PP and QQ of the same size?

We pick, simultaneously, a pair of elements from the two input sets, respectively:

PickPQ(ε):={PickP(x)P(x),PickQ(x)Q(x)}.\textit{PickPQ}(\varepsilon):=\left\{\begin{array}[]{l}\textit{Pick}_{P}(x)\leftarrowtail{P(x)},\ \ Pick_{Q}(x)\leftarrowtail{Q(x)}\end{array}\right\}.

Store the selected elements temporarily:

Copy(ε):={P(x)PickP(x),Q(x)PickQ(x)}.\textit{Copy}(\varepsilon):=\left\{\begin{array}[]{l}P^{\prime}(x)\leftarrowtail{\textit{Pick}_{P}(x)},\ \ Q^{\prime}(x)\leftarrowtail{\textit{Pick}_{Q}(x)}\end{array}\right\}.

Here, the Choice variable ε\varepsilon is not really necessary, since the module is deterministic. Next, we define a sub-procedure.

GuessNewPair:=(PickPQ; BG(PickPP); BG(PickQQ));Copy.\textit{GuessNewPair}:=\big{(}\textit{PickPQ}\mathbin{;}\textbf{\tiny{ BG}}(Pick_{P}\not=P^{\prime})\mathbin{;}\textbf{\tiny{ BG}}(Pick_{Q}\not=Q^{\prime})\big{)}\mathbin{;}\textit{Copy}.

In the sub-procedure above, we guess two new elements, one from each set, simultaneously. The problem Same Size is now specified as:

αeq_size(P,Q):=(GuessNewPair);PickPQ.\alpha_{\rm eq\_size}({P},{Q}):=(\textit{GuessNewPair})^{\uparrow}\mathbin{;}\mathord{\curvearrowright}\textit{PickPQ}.

The answer to the question 𝔄|αeq_size(ε)\mathfrak{A}\models|\alpha_{\rm eq\_size}\rangle{\top}(\varepsilon) is non-empty, i.e., there is a Choice function witnessing ε\varepsilon, if and only if the extensions of the predicate symbols PP and QQ in the input structure 𝔄\mathfrak{A} are of equal size.

EVEN

Problem: EVEN αE\alpha_{E} Given: A structure 𝔄\mathfrak{A} with a vocabulary symbol adomadom denoting its active domain. Question: Is |adom𝔄||adom^{\mathfrak{A}}| even?

EVEN is PTIME-computable, but it is not expressible in Datalog, or in any fixed-point logic, unless a linear order on the domain elements is given. It is also known that Monadic Second-Order (MSO) logic over the empty vocabulary cannot express EVEN.

We now show how to axiomatize it in our logic. We construct a 2-coloured path in the transition system by guessing new domain elements one-by-one, and using EE and OO as labels. (It is possible to use fewer register symbols, but we are not trying to be concise here.) To avoid infinite loops, we make sure that the elements never repeat. We define three atomic modules. For the deterministic ones, we omit the Choice variable ε\varepsilon:

GuessP(ε):={P(x)adom(x)},CopyPO:={O(x)P(x)},CopyPE:={E(x)P(x)}.\begin{array}[]{lcl}GuessP(\varepsilon)&:=&\left\{\begin{array}[]{l}P(x)\leftarrowtail adom(x)\end{array}\right\},\\ CopyPO&:=&\left\{\begin{array}[]{l}O(x)\leftarrowtail P(x)\end{array}\right\},\\ CopyPE&:=&\left\{\begin{array}[]{l}E(x)\leftarrowtail P(x)\end{array}\right\}.\end{array}
GuessNewO:=(GuessP; BG(PE) BG(PO));CopyPO,GuessNewE:=(GuessP; BG(PE) BG(PO));CopyPE.\begin{array}[]{l}GuessNewO:=\\ \ \ \ \ \ \ \ \ \ \ \ \ \ \big{(}GuessP\mathbin{;}\textbf{\tiny{ BG}}(P\not=E)\textbf{\tiny{ BG}}(P\not=O)\big{)}\ \mathbin{;}\ CopyPO,\\ GuessNewE:=\\ \ \ \ \ \ \ \ \ \ \ \ \ \ \big{(}GuessP\mathbin{;}\textbf{\tiny{ BG}}(P\not=E)\textbf{\tiny{ BG}}(P\not=O)\big{)}\ \mathbin{;}\ CopyPE.\end{array}

The problem EVEN is now formalized as:

αE:=(GuessNewO;GuessNewE);GuessNewO.\alpha_{E}:=(GuessNewO\mathbin{;}GuessNewE)^{\uparrow}\mathbin{;}\mathord{\curvearrowright}GuessNewO.

The program is successfully executed if each chosen element is different from all elements selected so far in the current information flow, and if EE and OO are guessed in alternation. Given a structure 𝔄\mathfrak{A} over an empty vocabulary, the result of the query 𝔄|αE(ε)\mathfrak{A}\models|\alpha_{E}\rangle{\top}(\varepsilon) is non-empty if and only if there is a successful execution of αE\alpha_{E}, that is, if and only if the size of the input domain is even.

0.7.2 Reachability Examples

0.7.3 s-t Connectivity

Problem: s-t-Connectivity α(E,S,T)\alpha({E},{S},{T}) Given: Binary relation EE, two constants ss and tt, as singleton-set relations SS and TT. Question: Is tt reachable from ss by following the edges?

To specify the term encoding this problem, we use the definable imperative programming constructs from Section 0.3.3:

α(E,S,T):=Mbase_case;𝐫𝐞𝐩𝐞𝐚𝐭(Mind_case; BG(ReachReach));Copy𝐮𝐧𝐭𝐢𝐥Reach=T.\begin{array}[]{l}\alpha({E},{S},{T}):=\ \ M_{base\_case}\mathbin{;}{\bf repeat\ }\big{(}M_{ind\_case}\mathbin{;}\\ \textbf{\tiny{ BG}}({\textit{Reach}^{\prime}\not=\textit{Reach}})\big{)}\mathbin{;}\textit{Copy}\ \ {\bf until}\ {\textit{Reach}=T}.\end{array}

Here, we use a unary relational symbol (a register) Reach. Initially, the corresponding relation contains the same node as SS. The execution terminates when Reach equals TT. Register Reach\textit{Reach}^{\prime} is used as temporary storage. To avoid guessing the same element multiple times, we use the BG construct. The atomic modules used in this program are:

Mbase_case(ε):={Reach(x)S(x)},Mind_case(ε):={Reach(y)Reach(x),E(x,y)},Copy(ε):={Reach(x)Reach(x)}.\begin{array}[]{c}\begin{array}[]{lcl}M_{base\_case}(\varepsilon)&:=&\ \left\{\begin{array}[]{l}\textit{Reach}(x)\leftarrowtail{S(x)}\end{array}\right\},\end{array}\\ \begin{array}[]{l}M_{ind\_case}(\varepsilon):=\left\{\begin{array}[]{l}\textit{Reach}^{\prime}(y)\leftarrowtail{\textit{Reach}(x)},{E(x,y)}\end{array}\right\},\end{array}\\ \begin{array}[]{l}\textit{Copy}(\varepsilon):={\ \ \ }\ \left\{\begin{array}[]{l}\ \textit{Reach}(x)\leftarrowtail{\textit{Reach}^{\prime}(x)}\end{array}\right\}.\end{array}\end{array}

Here, module Mind_caseM_{ind\_case} is the only non-deterministic module. The other two modules are deterministic (i.e., the corresponding binary relation is a partial function). Given a structure 𝔄\mathfrak{A} over a vocabulary that matches the input (EDB) predicate symbols E,SE,S and TT, including their arities, by checking 𝔄|α(ε),\mathfrak{A}\models|\alpha\rangle{\top}(\varepsilon), we verify that there is a successful execution of α\alpha. That is, tt is reachable from ss by following the edges of the input graph.
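The existential reading of the term can be illustrated by the following sketch: backtracking over the possible choices made by Mind_caseM_{ind\_case}, with the BG test ruling out previously reached nodes, amounts to searching for a simple path from ss to tt. The concrete edge relation is chosen arbitrarily.

```python
# A minimal sketch: backtracking over the choices made by M_ind_case.
def st_connected(E, s, t):
    def search(reach, seen):                  # 'seen' = all values Reach held so far
        if reach == t:                        # until Reach = T
            return True
        for (x, y) in E:                      # M_ind_case: Reach'(y) from Reach(x), E(x, y)
            if x == reach and y not in seen:  # BG(Reach' != Reach): only fresh nodes
                if search(y, seen | {y}):     # Copy, then the next iteration
                    return True
        return False                          # no sequence of choices succeeds
    return search(s, {s})

E = {(0, 1), (1, 2), (2, 0), (2, 3)}
print(st_connected(E, 0, 3))   # True
print(st_connected(E, 3, 0))   # False
```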

0.7.4 Same Generation

Problem: Same Generation αSG(E,Root,A,B)\alpha_{\rm SG}({E},{\textit{Root}},{A},{B}) Given: Tree – edge relation: EE; root: Root; two nodes represented by unary singleton-set relations: AA and BB Question: Do AA and BB belong to the same generation in the tree?

Note that, since we do not allow binary “register” relations (binary EDB relations are allowed), we need to capture the notion of being in the same generation through coexistence in the same structure.

Mbase_case(ε):={ReachA(x)A(x),ReachB(x)B(x)}.M_{base\_case}(\varepsilon):=\left\{\begin{array}[]{l}\textit{Reach}_{A}(x)\leftarrowtail{A(x)},\ \textit{Reach}_{B}(x)\leftarrowtail{B(x)}\end{array}\right\}.

Simultaneous propagation starting from the two nodes:

Mind_case(ε):={ReachA(x)ReachA(y),E(x,y),ReachB(v)ReachB(w),E(v,w)}.\begin{array}[]{l}M_{ind\_case}(\varepsilon):=\left\{\begin{array}[]{l}\textit{Reach}_{A}^{\prime}(x)\leftarrowtail{\textit{Reach}_{A}(y)},{E(x,y)},\\ \textit{Reach}_{B}^{\prime}(v)\leftarrowtail{\textit{Reach}_{B}(w)},{E(v,w)}\end{array}\right\}.\end{array}

This atomic module specifies that, if elements yy and ww, stored in the interpretations of ReachA\textit{Reach}_{A} and ReachB\textit{Reach}_{B} respectively, coexisted in the previous state, then xx and vv will coexist in the successor state. We copy the reached elements into “buffer” registers:

Copy(ε):={ReachA(x)ReachA(x),ReachB(x)ReachB(x)}.\begin{array}[]{l}Copy(\varepsilon):=\left\{\begin{array}[]{l}\textit{Reach}_{A}(x)\leftarrowtail{\textit{Reach}_{A}^{\prime}(x)},\\ \textit{Reach}_{B}(x)\leftarrowtail{\textit{Reach}_{B}^{\prime}(x)}\end{array}\right\}.\end{array}

The resulting interpretations of ReachA\textit{Reach}_{A} and ReachB\textit{Reach}_{B} coexist in one structure, which is a state in the transition system. The algebraic expression, using the definable imperative constructs, is:

αSG(E,Root,A,B):=Mbase_case;𝐫𝐞𝐩𝐞𝐚𝐭Mind_case;Copy;𝐮𝐧𝐭𝐢𝐥(ReachA=Root;ReachB=Root).\begin{array}[]{l}\alpha_{\rm SG}({E},{\textit{Root}},{A},{B}):=M_{base\_case};{\bf repeat}M_{ind\_case}\mathbin{;}\textit{Copy};\\ \ \ {\ \bf until}\ ({\textit{Reach}_{A}={\textit{Root}}}\mathbin{;}{\textit{Reach}_{B}={\textit{Root}}}).\end{array}

While this expression looks like an imperative program, it really is a constraint on all possible Choice functions, each specifying a particular sequence of choices. The answer to the question 𝔄|αSG(ε)\mathfrak{A}\models|\alpha_{\rm SG}\rangle{\top}(\varepsilon) is non-empty if and only if AA and BB belong to the same generation in the tree.
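Since, on a tree, the propagations of ReachA\textit{Reach}_{A} and ReachB\textit{Reach}_{B} are deterministic (each node has a unique parent), the constraint reduces to the simple simulation sketched below. The sketch assumes edges E(x,y)E(x,y) point from parent xx to child yy, matching the direction used in Mind_caseM_{ind\_case}, and the concrete tree is chosen for illustration.

```python
# A minimal sketch: Same Generation on a tree, with the (unique) choices resolved.
E = {(0, 1), (0, 2), (1, 3), (2, 4), (4, 5)}   # E(parent, child); the root is 0
Root, A, B = 0, 3, 4

def parent(x):
    return next(p for (p, c) in E if c == x)    # deterministic on a tree

reach_a, reach_b = A, B                         # M_base_case
while not (reach_a == Root and reach_b == Root):
    if reach_a == Root or reach_b == Root:      # one register arrived strictly earlier
        print("different generations")
        break
    reach_a, reach_b = parent(reach_a), parent(reach_b)   # M_ind_case ; Copy
else:
    print("same generation")                    # both registers reached the root together
```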

0.7.5 Linear Equations mod 2

Problem: mod 2 Linear Equations αF\alpha_{\rm F} Given: A system FF of linear equations mod 2 over variables VV, given by two ternary relations Eq0\textit{Eq}_{0} and Eq1\textit{Eq}_{1}. Question: Is FF solvable?

We assume that V𝔄V^{\mathfrak{A}} is a set, and Eq0𝔄\textit{Eq}_{0}^{\mathfrak{A}} and Eq1𝔄\textit{Eq}_{1}^{\mathfrak{A}} are relations, both given by an input structure 𝔄\mathfrak{A} with dom(𝔄)=V𝔄dom(\mathfrak{A})=V^{\mathfrak{A}}. Intuitively, V𝔄V^{\mathfrak{A}} is a set of variables, and (v1,v2,v3)Eq0𝔄(v_{1},v_{2},v_{3})\in\textit{Eq}_{0}^{\mathfrak{A}} iff v1v2v3=0v_{1}\oplus v_{2}\oplus v_{3}=0, and (v1,v2,v3)Eq1𝔄(v_{1},v_{2},v_{3})\in\textit{Eq}_{1}^{\mathfrak{A}} iff v1v2v3=1v_{1}\oplus v_{2}\oplus v_{3}=1. Such systems of equations are an example of a constraint satisfaction problem that is not solvable by kk-local consistency checks. This problem is known to be closely connected to the construction by Cai et al. [14], and is not expressible in infinitary counting logic, as shown by Atserias, Bulatov and Dawar [6]. Yet, the problem is solvable in polynomial time by Gaussian elimination. We use the dynamic ε\varepsilon operator to arbitrarily pick both an equation (a tuple in one of the relations) and a variable (a domain element), on which Gaussian elimination Elim is performed.

αF(Eq0,Eq1,V):=Mbase_case;𝐫𝐞𝐩𝐞𝐚𝐭Pick_Eq_V;Elim𝐮𝐧𝐭𝐢𝐥Pick_Eq_V.\begin{array}[]{l}\alpha_{\rm F}({{Eq_{0}}},{{Eq_{1}}},{{V}}):=M_{base\_case}\mathbin{;}{\bf repeat}\textit{Pick}\_\textit{Eq}\_V\mathbin{;}\textit{Elim}\\ \hskip 85.35826pt{\bf until}\ \mathord{\curvearrowright}\textit{Pick}\_\textit{Eq}\_V.\end{array}

Then, the query 𝔄|αF(ε)\mathfrak{A}\models|\alpha_{\rm F}\rangle{\top}(\varepsilon) returns a non-empty set iff FF is solvable.
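The modules Pick_Eq_V\textit{Pick}\_\textit{Eq}\_V and Elim are not spelled out here; the following sketch only illustrates the deterministic core that the term relies on, namely Gaussian elimination over GF(2) deciding solvability. Representing an equation as a pair (set of variable indices, right-hand-side bit) is an assumption of the sketch.

```python
# A minimal sketch: Gaussian elimination over GF(2) deciding solvability of a
# system of equations of the form x_i + x_j + x_k = b (mod 2).
def solvable_mod2(num_vars, equations):
    """equations: list of (set of variable indices, right-hand-side bit)."""
    rows = [(set(vs), b) for vs, b in equations]
    for pivot in range(num_vars):
        chosen = next((r for r in rows if pivot in r[0]), None)
        if chosen is None:
            continue                               # no equation mentions this variable
        rows.remove(chosen)
        rows = [((vs ^ chosen[0]), b ^ chosen[1]) if pivot in vs else (vs, b)
                for vs, b in rows]                 # eliminate the pivot variable (mod 2)
    return all(b == 0 for vs, b in rows)           # a leftover "0 = 1" means unsolvable

print(solvable_mod2(4, [({0, 1, 2}, 0), ({0, 1, 3}, 1), ({2, 3}, 1)]))  # True
print(solvable_mod2(3, [({0, 1, 2}, 0), ({0, 1, 2}, 1)]))               # False
```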

0.7.6 Observations

Two Types of Propagations In Reachability examples (s-t-Connectivity, Same Generation), propagations follow tuples of domain elements given by the input structure, from one element to another. In Counting examples (Size 4, or any fixed size, Same Size, etc.), propagations are made arbitrarily; they are unconstrained. In Mixed examples (mod 2 equations, CFI graphs), propagations are of both kinds, and they interleave. Such examples with mixed propagation cannot be formalized in Datalog, or in any fixed-point logic. We believe that an interleaving of constrained and unconstrained propagations (as in our logic) is needed to formalize CFI-like examples. Constrained propagations are of a “reachability” kind, i.e., propagations over tuples. Unconstrained propagations are of a basic “counting” type, i.e., propagations from “before” to “after” via an unconstrained choice from the active domain. We believe that the lack of this feature is the reason why adding just counting to FO(FP) is not enough to represent all properties in P-time [14]: e.g., the algorithm for mod 2 Linear Equations needs to interleave constrained and unconstrained propagations. Adding counting by itself to fixed points cannot accomplish it.

Choice-Invariant Encodings In the given implementation of s-t-Connectivity, a wrong guess of a path is possible. However, one can write a depth-first search algorithm, where the order of edge traversal does not matter. Because of this invariance, the depth-first encoding can be evaluated in P-time. This is because the length of the computation is limited to be polynomial in the size of the input structure (more about this in Section 0.9.1). Choice-invariance would not hold for Hamiltonian Path, where, because it is an NP-complete problem, the possibility of a wrong guess always exists.

0.8 Structural Operational Semantics

Our goal is to develop an algorithm that, given a Choice function hh, finds an answer to the main task 𝔄|α(h/ε)\mathfrak{A}\models|\alpha\rangle{\top}(h/\varepsilon). We represent the algorithm as a set of rules in the style of Plotkin’s Structural Operational Semantics [42]. The transitions describe “single steps” of the computation, as in the computational semantics [27].

Identity (Diagonal) id{\rm id}:

true(id,s)(id,s).\frac{true}{({\rm id},s)\longrightarrow({\rm id},s)}.

Atomic Modules m(ε)m(\varepsilon):

true(m(ε),s𝔄)(id,s𝔄𝔅) if (s𝔄s𝔄𝔅)h(m(ε)).\frac{true}{(m(\varepsilon),s\cdot\mathfrak{A})\longrightarrow({\rm id},s\cdot\mathfrak{A}\cdot\mathfrak{B})}\mbox{ if }(s\cdot\mathfrak{A}\mapsto s\cdot\mathfrak{A}\cdot\mathfrak{B})\in h(m(\varepsilon)).

Sequential Composition α;β\alpha\mathbin{;}\beta:

(α,s)(α,s)(α;β,s)(α;β,s),(β,s)(β,s)(id;β,s)(id;β,s).\frac{(\alpha,s)\longrightarrow(\alpha^{\prime},s^{\prime})}{(\alpha\mathbin{;}\beta,s)\longrightarrow(\alpha^{\prime}\mathbin{;}\beta,s^{\prime})},\ \ \ \ \ \ \ \ \ \frac{(\beta,s)\longrightarrow(\beta^{\prime},s^{\prime})}{({\rm id}\mathbin{;}\beta,s)\longrightarrow({\rm id}\mathbin{;}\beta^{\prime},s^{\prime})}.

Preferential Union αβ\alpha\sqcup\beta:

(α,s)(α,s)(αβ,s)(α,s).\frac{(\alpha,s)\longrightarrow(\alpha^{\prime},s^{\prime})}{(\alpha\sqcup\beta,s)\longrightarrow(\alpha^{\prime},s^{\prime})}.

That is, αβ\alpha\sqcup\beta evolves according to the instructions of α\alpha, if α\alpha can successfully evolve to α\alpha^{\prime}.

(β,s)(β,s) and (α,s)(id,s)(αβ,s)(β,s).\frac{(\beta,s)\longrightarrow(\beta^{\prime},s^{\prime})\mbox{ and }(\mathord{\curvearrowright}\alpha,s)\longrightarrow({\rm id},s)}{(\alpha\sqcup\beta,s)\longrightarrow(\beta^{\prime},s^{\prime})}.

The rule says that αβ\alpha\sqcup\beta evolves according to the instructions of β\beta, if β\beta can successfully evolve, while α\alpha cannot.

Right Negation (Anti-Domain) α\mathord{\curvearrowright}\alpha: There are no one-step derivation rules for α\mathord{\curvearrowright}\alpha. Instead, we try, step-by-step, to derive α\alpha, and, if not derivable, make the step.

true(α,s)(id,s) if there is no Choice function h such that derivation of α in s succeeds.\frac{true}{(\mathord{\curvearrowright}\alpha,s)\longrightarrow({\rm id},s)}\mbox{ if there is no Choice function $h^{\prime}$ such that derivation of $\alpha$ in $s$ succeeds}.

Equality Check (P=Q)(P=Q):

true((P=Q),s𝔅)(id,s𝔅) if P𝔅=Q𝔅.\frac{true}{((P=Q),s\cdot\mathfrak{B})\longrightarrow({\rm id},s\cdot\mathfrak{B})}\mbox{ if }P^{\mathfrak{B}}=Q^{\mathfrak{B}}.

Back Globally  BG(PQ)\textbf{\tiny{ BG}}(P\neq Q):

true( BG(PQ),s𝔅)(id,s𝔅) if for all i,Ps(i)Q𝔅.\frac{true}{(\textbf{\tiny{ BG}}(P\neq Q),s\cdot\mathfrak{B})\longrightarrow({\rm id},s\cdot\mathfrak{B})}\mbox{ if for all }i,\ P^{s(i)}\neq Q^{\mathfrak{B}}.

Here, s(i)s(i) is the ii’th letter of ss, 𝑓𝑖𝑟𝑠𝑡(s)i𝑙𝑎𝑠𝑡(s)\mathit{first}(s)\leq i\leq\mathit{last}(s).

Maximum Iterate α\alpha^{\uparrow}:

(α,s)(α,s)(α,s)(α;α,s),true(α,s)(id,s) if α succeeds in s.\frac{(\alpha,s)\longrightarrow(\alpha^{\prime},s^{\prime})}{(\alpha^{\uparrow},s)\longrightarrow(\alpha^{\prime}\mathbin{;}\alpha^{\uparrow},s^{\prime})},\ \ \ \ \ \ \ \frac{true}{(\alpha^{\uparrow},s)\longrightarrow({\rm id},s)}\mbox{ if $\mathord{\curvearrowright}\alpha$ succeeds in $s$}.

Thus, α\alpha^{\uparrow} evolves according to α\alpha if α\alpha can evolve successfully, and reduces to id{\rm id} if α\mathord{\curvearrowright}\alpha succeeds in ss.

The execution of the algorithm consists of “unwinding” the term α\alpha, starting with an input structure 𝔄\mathfrak{A}. The derivation process is deterministic: whenever there are two rules for a connective, only one of them is applicable. The goal of evaluation is to apply the rules of the structural operational semantics starting from (α,𝔄)(\alpha,\mathfrak{A}) and then tracing the evolution of α\alpha to the “empty” program id{\rm id} step-by-step, by completing the branch upwards that justifies the step. In that case, we say that the derivation succeeds. Otherwise, we say that it fails.
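To illustrate how these rules drive the evaluation, here is a small interpreter sketch covering three of them: atomic modules resolved by an explicitly given Choice function, sequential composition, and the equality check. Terms are nested tuples, a state is the trace built so far, and the helper names (step, derive, finished) as well as the example module are ours, not part of the formal semantics.

```python
# A minimal sketch of a few rules of the structural operational semantics.

def finished(t):                 # a term that has fully reduced to the "empty" program id
    return t == ("id",) or (t[0] == "seq" and finished(t[1]) and finished(t[2]))

def step(t, s, h):
    """One derivation step; returns (t', s') or None if no rule applies."""
    if t[0] == "module":                         # atomic module m(eps), resolved by h
        succ = h(t[1], s)                        # h may depend on the whole history s
        return (("id",), s + [succ]) if succ is not None else None
    if t[0] == "seq":                            # alpha ; beta  (two rules)
        alpha, beta = t[1], t[2]
        if finished(alpha):                      # (id ; beta) evolves by evolving beta
            res = step(beta, s, h)
            return (("seq", alpha, res[0]), res[1]) if res else None
        res = step(alpha, s, h)                  # otherwise evolve alpha
        return (("seq", res[0], beta), res[1]) if res else None
    if t[0] == "eq":                             # equality check (P = Q) on the last structure
        return (("id",), s) if s[-1][t[1]] == s[-1][t[2]] else None
    return None

def derive(t, structure, h, bound=100):
    """Unwind t from the input structure; return the trace on success, None on failure."""
    s = [structure]
    while not finished(t) and bound > 0:
        res = step(t, s, h)
        if res is None:
            return None
        t, s = res
        bound -= 1
    return s if finished(t) else None

# usage: one module that sets register R to {1}, followed by the check R = S
A = {"S": frozenset({1}), "R": frozenset()}
def h(name, trace):
    if name == "set_R":
        new = dict(trace[-1]); new["R"] = frozenset({1})   # inertia: copy, then update R
        return new
    return None

prog = ("seq", ("module", "set_R"), ("eq", "R", "S"))
print(derive(prog, A, h) is not None)   # True: the derivation succeeds
```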

Proposition 3.

The evaluation algorithm based on the structural operational semantics, that finds an answer to the main task 𝔄|α(h/ε)\mathfrak{A}\models|\alpha\rangle{\top}(h/\varepsilon), is correct with respect to the semantics of 𝕃\mathbb{L}.

Proof.

(outline) The correctness of the algorithm follows by induction on the structure of the algebraic expression, since the rules simply implement the semantics given earlier. ∎

0.9 Complexity of Query Evaluation

In this section, we discuss the complexity of query evaluation in logic 𝕃\mathbb{L}, specifically focusing on its fragment that captures precisely the complexity class NP.

0.9.1 The Length of a Choice Function

Recall that we are interested in data complexity, where the formula is fixed. For an iterate-free term (without Maximum Iterate) and a fixed Choice function hh, the main task is clearly in P-time. But, we want to understand the data complexity of query evaluation for more general terms, under some natural restrictions.

Recall that, according to (3), terms are mapped to partial functions on strings. Such a function can be continuously defined for a consecutive sequence of structures in 𝐔{\bf U}, but undefined afterwards.

By the length of a Choice function we mean the maximal length of an “extension” string, i.e., a delta, that corresponds to the mapping:

length(h):=max{ji(sisj)h(m(ε)) for some m(ε)}.length(h)\ :=\ \max\{j-i\mid(s_{i}\mapsto s_{j})\in h(m(\varepsilon))\text{ for some }m(\varepsilon)\}.

Intuitively, it corresponds to the height of the tree in Figure 1.

Definition 12.

We say that a Choice function hh is polynomial if length(h)O(nk),length(h)\in O(n^{k}), where nn is the size of the domain of 𝔄𝐔\mathfrak{A}\in{\bf U}, and kk is some constant.

Since Choice functions are certificates, their length has to be restricted. For our complexity results, we will restrict Choice functions to be polynomial.

0.9.2 Data Complexity of Query Evaluation

Data complexity refers to the case where the formula is fixed, and input structures 𝔄\mathfrak{A} vary [54]. Here, we assume that the description of the transition system, specified in the secondary logic SM-PP (14), is a part of the formula in the Main Task (9), and, therefore, is fixed.

We use the algorithm based on the structural operational semantics from Section 0.8 to analyze the data complexity of the main query 𝔄|α(ε)\mathfrak{A}\models|\alpha\rangle{\top}(\varepsilon). The complexity depends on the nesting of the implicit quantifiers on Choice functions (cf. 8), i.e., how exactly \mathord{\curvearrowright} is applied in the term, including as part of the evaluation of Maximum Iterate.

Since the implicit (existential and universal) quantifiers can alternate, a problem at any level of the Polynomial Time Hierarchy can be expressed. Thus, the upper bound is PSPACE in general.

We can restrict the language so that \mathord{\curvearrowright} is applied to atomic modules only, including in Maximum Iterate, and there is no other application of negation, except the double negation needed to define the modality (the domain of a term). For this restricted language, we can guarantee computations in NP because, in that case, the size of the certificate we guess is polynomial, and it can be verified in deterministic polynomial time.

Theorem 1.

The data complexity of checking 𝔄|α(ε)\mathfrak{A}\models|\alpha\rangle{\top}(\varepsilon) for a restriction of logic 𝕃\mathbb{L}, where negation applies to atomic modules only, is in NP.

Proof.

(outline) We guess a certificate, which is an equivalence class [h](𝔄,α)[h]_{(\mathfrak{A},\alpha)} of Choice functions (a trace, see (10)), of a polynomial length in the size of 𝔄\mathfrak{A}, to instantiate the free function variable ε\varepsilon. With such an instantiation, term α\alpha becomes deterministic. SM-PP atomic modules (essentially, conjunctive queries) can be checked, with respect to this certificate, in P-time. We argue, by induction on the structure of an algebraic term, using Structural Operational Semantics from Section 0.8, that all operations, including negation, can be evaluated in polynomial time. We take into account that choice functions are restricted to be of polynomial length, and the term is fixed. Moreover, the semantics of Maximum Iterate does not allow loops in the transition system Tr\textbf{Tr}\llbracket{\cdot}\rrbracket. Thus, we return “yes” in polynomial time if the witness [h](𝔄,α)[h]_{(\mathfrak{A},\alpha)} proves that the answer to 𝔄|α(h/ε)\mathfrak{A}\models|\alpha\rangle{\top}(h/\varepsilon) is “yes”; or “no” in polynomial time otherwise. ∎

0.9.3 Simulating NP-time Turing Machines

The complexity class NP has been at the centre of theoretical computer science research for a long time. Its logical characterization was given by Fagin. His celebrated theorem [19] states that the complexity class NP coincides, in a precise sense, with existential second-order logic.

We now prove a counterpart of Fagin’s theorem [19] for a fragment of our logic 𝕃\mathbb{L}. Recall that logic 𝕃\mathbb{L} is introduced in Definition 11. This fragment is existential, in the sense discussed in Section 0.3.2.

In this section, we demonstrate that our logic is strong enough to encode any polynomial-time Turing machine over unordered structures. Together with Theorem 1, this shows that the fragment precisely captures NP.

Theorem 2.

For every NP-recognizable class 𝒦\mathcal{K} of structures, there is a sentence of logic 𝕃\mathbb{L} (where negation applies to atomic modules only) whose models are exactly 𝒦\mathcal{K}.

Proof.

We focus on the query 𝔄|αTM(ε)\mathfrak{A}\models|\alpha_{\rm TM}\rangle{\top}(\varepsilon) from Section 0.3 and outline such a construction. The main idea is that a linear order on the domain elements of 𝔄\mathfrak{A} is induced by a path in a transition system that corresponds to a guessed Choice function hh. In this path, we guess new elements one by one, as in the examples. The linear order corresponds to an order on the tape of a Turing machine. After such an order is guessed, a deterministic computation, following the path, proceeds for that specific order. We assume, without loss of generality, that the deterministic machine runs for nkn^{k} steps, where nn is the size of the domain. The program is of the form:

αTM:=ORDER;START;𝐫𝐞𝐩𝐞𝐚𝐭STEP𝐮𝐧𝐭𝐢𝐥END.\alpha_{\rm TM}:={\rm ORDER}\ \mathbin{;}\ {\rm START}\ \mathbin{;}\ {\bf repeat\ }{\rm STEP}{\ \bf until}\ {\rm END}.

Procedure ORDER: Guessing an order is perhaps the most important part of our construction. We use a secondary numeric domain with a linear ordering, and guess elements one-by-one, using a concrete Choice function hh. We associate an element of the primary domain with an element of the secondary one, using co-existence in the same structure. Each Choice function corresponds to a possible linear ordering.

Procedure START: This procedure creates an encoding of the input τEDB\tau_{\rm EDB}-structure 𝔄\mathfrak{A} (say, a graph) in a sequence of structures in the transition system, to mimic an encoding enc(𝔄)enc(\mathfrak{A}) on a tape of a Turing machine. We use structures to represent cells of the tape of the Turing machine (one τ\tau-structure = one cell). The procedure follows a specific path, and thus a specific order generated by the procedure ORDER. Subprocedure Encode(vocab(𝔄),,Sσ,,P¯,){Encode}(vocab(\mathfrak{A}),\dots,S_{\sigma},\dots,\bar{P},\dots) operates as follows. In every state (= cell), it keeps the input structure 𝔄\mathfrak{A} itself, and adds the part of the encoding of 𝔄\mathfrak{A} that belongs to that cell. The interpretations of P¯\bar{P} over the secondary domain of labels provide cell positions on the simulated input tape. Each particular encoding is done for a specific induced order on domain elements, in the path that is being followed.

In addition to producing an encoding, the procedure START sets the state of the Turing machine to be the initial state Q0Q_{0}. It also sets initial values for the variables used to encode the instructions of the Turing machine.

Expression START is similar to the first-order formula βσ(a¯)\beta_{\sigma}(\bar{a}) used by Grädel in his proof of capturing P-time using SO-HORN logic on ordered structures [23]. The main difference is that instead of tuples of domain elements a¯\bar{a} used to refer to the addresses of the cells on a tape, we use tuples P¯\bar{P}, also of length kk. Grädel’s formula βσ(a¯)\beta_{\sigma}(\bar{a}) for encoding input structures has the following property: (𝔄,<)βσ(a¯) the a¯-th symbol of enc(𝔄) is σ.(\mathfrak{A},<)\models\beta_{\sigma}(\bar{a})\ \ \Leftrightarrow\ \ \mbox{ the $\bar{a}$-th symbol of $enc(\mathfrak{A})$ is $\sigma$.} Here, we have:

𝔄Encode(,Sσ,,P1(a1),,Pk(ak),)(h/ε) the P1(a1),,Pk(ak)-th symbol of enc(𝔄) is σ,\begin{array}[]{c}\mathfrak{A}\models Encode(\dots,S_{\sigma},\dots,P_{1}(a_{1}),\dots,P_{k}(a_{k}),\dots)(h/\varepsilon)\\ \Leftrightarrow\ \ \mbox{ the $P_{1}(a_{1}),\dots,P_{k}(a_{k})$-th symbol of $enc(\mathfrak{A})$ is $\sigma$,}\end{array}

where a¯\bar{a} is a tuple of elements of the secondary domain, hh is a Choice function that guesses a linear order on the input domain through an order on structures (states in the transition system), starting in the input structure 𝔄\mathfrak{A}. That specific generated order is used in the encoding of the input structure. Another path produces a different order, and constructs an encoding of the input structure for that order.

Procedure STEP: This procedure encodes the instructions of the deterministic Turing machine. SM-PP modules are well-suited for this purpose. Instead of time and tape positions as arguments of binary predicates as in Fagin’s [19] and Grädel’s [23] proofs, we use coexistence in the same structure with kk-tuples of domain elements, as well as lexicographic successor and predecessor on such tuples. Polynomial time of the computation is guaranteed because time, in the repeat-until part, is tracked with kk-tuples of domain elements.

Procedure END: It checks if the accepting state of the Turing machine is reached.

We have that, for any P-time Turing machine, we can construct term αTM\alpha_{\rm TM} in the logic 𝕃\mathbb{L} such that the answer to 𝔄|αTM(ε)\mathfrak{A}\models|\alpha_{\rm TM}\rangle{\top}(\varepsilon) is non-empty if and only if the Turing machine accepts an encoding of 𝔄\mathfrak{A} for some specific but arbitrary order of domain elements on its tape. ∎

Combining the theorems, we obtain the following corollary.

Corollary 1.

The fragment of logic 𝕃\mathbb{L}, where negation applies to atomic modules only, captures precisely NP with respect to its data complexity.

0.10 Related Work

Logic of Information Flows The work in the current paper is a continuation of research initiated by the author, who introduced the Logic of Information Flows (LIF) in [50]. The goal of introducing LIF was to understand how information propagates, in order to make such propagations efficient. A version of LIF [50] was initially published in the context of reasoning about modular systems, and was based on classical logic. In subsequent work, with a group of coauthors, we studied the notions of input, output and composition [4], [2] in LIF, and applications to data access in database research [3]. A short description of that work can be found in [51]. Much of the development in the papers [4], [3], and beyond, was carried out in the excellent PhD work of Heba Aamer [1].

In parallel with the work on [4], [2], the author continued working towards the main goal of studying how to make information flows efficient. Defining LIF based on classical logic, as was done in the early versions, was clearly not sufficient. One of the main observations of the author, very early in the development of LIF, was that it was necessary to make the operations function-preserving. The author conducted an extensive analysis, going through numerous versions and combinations of algebraic operations, eventually arriving at the minimal, but sufficiently expressive, set of operations presented here. Also, it was crucial to introduce choice, to handle atomic non-determinism.

Choice Operator Choice occurs in many high-level descriptions of polynomial-time algorithms, e.g., in Gaussian elimination: choose an element and continue. A big challenge, in our goal of formalizing non-deterministic computations algebraically, in an algebra of partial functions, was how to deal with binary relations. Such relations are not necessarily functional. To deal with this challenge, the author invented history-dependent Choice functions, early in her work on multiple variants of LIF. The dependence on the history, to the best of our knowledge, has not been used in defining Choice functions before. The first use of a Choice operator ε\varepsilon in logic goes back to Hilbert and Bernays in 1939 [28] for proof-theoretic purposes, without giving a model-theoretic semantics. Early approaches to Choice, in connection to Finite Model Theory, include the work by Arvind and Biswas [5], Gire and Hoang [22], Blass and Gurevich [8] and by Otto [40], among others. Richerby and Dawar [16] survey and study logics with various forms of choice. Outside of Descriptive Complexity, Hilbert’s ε\varepsilon has been studied extensively by Soviet logicians Mints, Smirnov and Dragalin in the 1970s, 80s and 90s, see [47]. The semantics of this operator is still an active research area, see, e.g., [56]. Unlike the earlier approaches, our Choice operator formalizes a strategy, i.e., what to do next, given the history, under the constraint given by a term.

An example of using strategies can be seen in building proofs in a Gentzen-style proof system. There, while selecting an element witnessing an existential quantifier, we ensure that the element is “new”, i.e., has not appeared earlier in the proof.

A problem with a set-theoretic Choice operator is that for first-order (FO) logic, and thus for its fixed-point extensions such as FO(FP), choice-invariance is undecidable [8]. Therefore, in using FO, there is a danger of obtaining an undecidable syntax, which violates a basic principle behind a well-defined logic. Choiceless Polynomial Time [9] is an attempt to characterize what can be done without introducing choice. In contrast, as a critical step, we restrict FO connectives, similarly to Description logics [7], and in strong connection to modal logics, which are robustly decidable [55, 24].

Two-Variable Fragments The step towards binary relations in LIF [50], where we partitioned the variables of atomic symbols into input and outputs, was inspired by our own work on Model Expansion [38], with its before-after-a-computation perspective, and also by bounded-variable fragments of first-order logic. Such fragments have been shown to have good algorithmic properties by Vardi [53]. Two-variable fragments have been offered as an initial explanation of the robust decidability of modal logics [55, 24]. In addition, two-variable fragments have order-invariance [58]. Unfortunately, such fragments of FO are not expressive enough to encode Turing machines. To overcome this obstacle, we lifted the algebra to operate on binary relations on strings of relational structures. Such relations, intuitively, encode state transitions.

Algebras of Binary Relations Such an algebra was first introduced by De Morgan, and was extensively developed by Peirce and then Schröder. It was abstracted to relation algebra RA by Jónsson and Tarski in [33]. For earlier uses of the operations and a historical perspective, see Pratt’s informative overview paper [44]. More recently, relation algebras were studied by Fletcher, Van den Bussche, Surinx and their collaborators in a series of papers, see, e.g., [48, 21]. The algebras of relations consider various subsets of operations on binary relations as primitive, and others as derivable. Our algebra is an algebra of binary relations, similar to Jónsson and Tarski [33]; however, our relations are over more complex entities, namely strings of relational structures. Moreover, our binary relations are functional, which is achieved by making all operations function-preserving and by introducing Choice functions.

Algebras of Functions In another direction, Jackson and Stokes and then McLean [31, 37] study partial functions and their algebraic equational axiomatizations. The work of Jackson and Stokes [31] is particularly relevant, because it introduces some connectives we use. However, they do not study algebras on strings and Choice functions. Also, we had to eliminate intersection, which they, and many other researchers, use. We had to do so because intersection wastes computational power due to confluence, and makes information flows less efficient. We came up with an example where an intersection of (the representations of) two NP-complete problems produces a problem in P-time. We have selected a minimal set of operations for our purposes. The operations correspond to dynamic and function-preserving versions of conjunction, disjunction, negation and iteration. However, other operations, such as many of those from McLean [37], can be studied as well.

Restricting Connectives Our algebraic operations are a restriction of first-order connectives, similar to the restrictions of such connectives in Description logics [7]. Classical connectives, such as negation and disjunction, and also the iterator in the form of the Kleene star (reflexive transitive closure), are incompatible with an algebraic setting of functions. We require the connectives to be function-preserving. We traced the origins of the operations we use as follows. The Unary Negation operation is from Hollenberg and Visser [29]. It is also studied, among other operations on functions, by McLean in, e.g., [37], in the form of the Antidomain operation. Restricting negation (full complementation) to its unary counterpart is already known to imply good computational properties [45]. (Unary negation is related to the negation of modal logics; it is the modal negation in the modal Dynamic Logic we discuss later.) In general, modal logics are known to be robustly decidable [55, 24] due to a combination of properties. However, in our case, all connectives and the fixed-point construct of first-order logic with least fixed point, FO(LFP), had to be restricted. The operations of Preferential Union and Maximum Iterate are from Jackson and Stokes [31]. While these algebraic operations are more restrictive than those of Regular Expressions, they define the main constructs of Imperative programming.

Database Query Languages and Comprehension Syntax Early functional languages for databases include Macchiavelli [39] and Kleisli [57]. The languages are based on the Comprehension Syntax proposed in [11]; see also [12] for a later development. In a somewhat different line of research, Libkin proposed to use a polynomial-space iterator, among other constructs, for querying databases with incomplete information [35].

Substitution Monoid and Connection to Kleene Algebra First, we explain a connection to a Kleene algebra on a meta-level, and then discuss the differences in the languages. The algebra forms a monoid with respect to term substitution. The proof of this statement is lengthy and is outside of the scope of this paper. However, as for every monoid, there is a well-known and natural connection to a Kleene algebra.

Let MM be a monoid with identity element 𝟏{\bf 1} and let AA be the set of all subsets of MM. For two such subsets SS and TT, let S+T:=STS+T\ :=\ S\cup\ T, and let ST:={st:sS and tT}.ST\ :=\ \{st:s\in S\text{ and }t\in T\}. We define SS^{*} as the submonoid of MM generated by SS, which can be described as S:={𝟏}SSSSSSS^{*}\ :=\ \{{\bf 1}\}\cup S\cup SS\cup SSS\cup... Then AA forms a Kleene algebra with 0 being the empty set and 1 being {𝟏}\{{\bf 1}\}.
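As a small illustration of this standard construction, the sketch below computes the Kleene-algebra operations on the powerset of a finite monoid; the choice of Z_4 under addition modulo 4 is only for the sake of a runnable example.

```python
# A minimal sketch: the powerset of a finite monoid as a Kleene algebra.
def op(a, b): return (a + b) % 4   # the monoid Z_4 under addition mod 4, identity 0
ONE = frozenset({0})               # Kleene-algebra 1 = { identity element }
ZERO = frozenset()                 # Kleene-algebra 0 = the empty set

def plus(S, T):                    # S + T := S union T
    return frozenset(S) | frozenset(T)

def times(S, T):                   # ST := { st : s in S and t in T }
    return frozenset(op(s, t) for s in S for t in T)

def star(S):                       # S* := {1} u S u SS u SSS u ... (a fixpoint here)
    result, frontier = set(ONE), set(ONE)
    while frontier:
        frontier = times(frontier, S) - result
        result |= frontier
    return frozenset(result)

print(sorted(star({2})))           # [0, 2]: the submonoid generated by {2}
print(sorted(star({1})))           # [0, 1, 2, 3]
print(sorted(plus(ZERO, times({1}, {2}))))   # [3]
```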

Comparison to Kleene Algebra at the Level of Algebraic Languages The connection to a Kleene algebra we have just explained exists at the meta-level, when we consider term substitution as the monoid operation. But what is the connection to Kleene algebra at the level of the algebra itself, i.e., of its algebraic operations? One difference is that, unlike the operations of Union (\cup) and Iteration (the Kleene star *) of Regular Expressions, the operations of Preferential Union (\sqcup) and Maximum Iterate (\uparrow) are function-preserving.

Temporal and Dynamic Logics The algebra (2) has an alternative (and equivalent) syntax in the form of a Dynamic logic (Section 0.3). Our Dynamic logic is fundamentally different from Propositional Dynamic Logic (PDL) [43, 20] in that, because of the Choice-function semantics, it has a linear-time (as opposed to branching-time) semantics. This is similar to Linear Temporal Logic (LTL) and Linear Dynamic Logic on finite traces, LDLf [17]. (LDLf has the same syntax as PDL, but is interpreted over traces.) But, in addition, branching from nodes is used for complex tests. So, there is some similarity to CTL. To the best of our knowledge, the data complexity of temporal and dynamic logics, with respect to an input database, has never been studied.

Algebra vs Logic Our algebra-Dynamic-logic connection is reminiscent of that of Kleene algebras with tests (KAT) [34]; but, unlike KAT, which allows for simple tests only, our logic allows for arbitrarily complex nested tests.

We believe that ours is the first algebraic formalization of a linear-time logic, i.e., a logic interpreted over traces of computation (as opposed to branching-time logics, with branching at every state). A crucial step of this formalization is the use of Choice functions that map strings to strings. We are not aware of any use of such functions in temporal or Dynamic logics.

Partial Algebras and Logics Our approach to strong equality is inspired by Partial Horn Logic (PHL) of Palmgren and Vickers [41]. PHL builds partiality directly into the logic. Their logic is, essentially, as in [32], but has a modified substitution axiom. It identifies definedness with self-equality. The axioms of this logic are universal Horn formulae. A quasi-equational theory in this partial logic has functions but no predicates (other than equality). Axioms are given in sequent form, with a conjunction of equations entailing an equation. We adopted this idea for reasoning about computation. But, in addition to definedness, we introduced undefinedness, which indicates the non-existence of a “yes” certificate. The use of partial algebras has a long history, and is surveyed in [13].

0.11 Conclusion

We have defined a query language in the form of an algebra of partial higher-order functions on strings of relational structures. The algebra has a two-level syntax, where propagations are separated from control. The operations of the top level are obtained by taming (dynamic versions of) classical connectives and a fixed-point construct, that is, by making them function-preserving. A particular example of a logic for the bottom level is a singleton-set restriction of Monadic Conjunctive Queries that, intuitively, represent non-deterministic conditional assignments.

The algebra has an associated declarative query language in the form of a Dynamic logic that is equivalent to the algebra. In this logic, typical programming constructs, such as while loops and if-then-else, are definable. In general, the logic can encode arbitrary Turing machines. We have considered a restricted fragment where the length of the computation is limited to a polynomial number of steps in the size of the input structure.

Since the logic can implicitly mimic quantification over certificates, it can express problems at any level of the Polynomial Time Hierarchy. With a further restriction, where negation is applied to atomic modules only, the logic captures precisely the complexity class NP.

We gave examples expressing counting properties on unordered structures, even though the logic does not have a special cardinality construct. The logic can also express reachability-type queries, as well as examples with mixed propagations.

A future step of this research is to understand under what general conditions on the terms tt of the logic 𝕃\mathbb{L}, evaluating the main query 𝔄|t(ε)\mathfrak{A}\models\ |t\rangle{\top}(\varepsilon) can be done by simply following one (arbitrary) sequence of atomic choices. When such an evaluation is possible, the query is choice-invariant.

Work is under way on developing a proof system (and a quasi-equational theory) for syntactic reasoning about strong equalities between terms, including definedness, t(x)t(x)\!\downarrow, and undefinedness, t(x)↓̸t(x)\!\!\not\downarrow, as particular cases of strong equalities.

Another interesting direction is to see whether preservation theorems that fail in the case of function-preserving binary relations [10] would hold for the function-preserving binary relations over strings, as in this paper.

To be practical, the proposed query language has to be extended to handle arithmetic and aggregates. This can be done using the framework of Embedded Model Expansion [52, 49], and also by defining a semiring semantics.

0.12 Acknowledgements

The author is grateful to Leonid Libkin and to Brett McLean for useful discussions on cardinality and reachability examples, and on algebras of partial functions, respectively. Many thanks to Heng Liu, Shahab Tasharrofi and Anurag Sanyal for their help with the figures. The author’s research is supported by the Natural Sciences and Engineering Research Council of Canada (NSERC). Part of the research presented in this paper was carried out while the author participated in the program on Propositional Satisfiability and Beyond of the Simons Institute for the Theory of Computing during the spring of 2021, and its extended reunion.

References

  • [1] Heba Aamer. Logical Analysis of Input and Output Sensitivity in the Logic of Information Flows. PhD thesis, Hasselt University, Belgium, 2023.
  • [2] Heba Aamer, Bart Bogaerts, Dimitri Surinx, Eugenia Ternovska, and Jan Van den Bussche. Inputs, outputs, and composition in the logic of information flows. ACM Transactions on Computational Logic (TOCL), 24:1–44.
  • [3] Heba Aamer, Bart Bogaerts, Dimitri Surinx, Eugenia Ternovska, and Jan Van den Bussche. Executable first-order queries in the logic of information flows. In Proceedings 23rd International Conference on Database Theory, volume 155 of Leibniz International Proceedings in Informatics, pages 4:1–4:14. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 2020.
  • [4] Heba Aamer, Bart Bogaerts, Dimitri Surinx, Eugenia Ternovska, and Jan Van den Bussche. Inputs, outputs, and composition in the logic of information flows. In KR’2020, 2020.
  • [5] Vikraman Arvind and Somenath Biswas. Expressibility of first order logic with a nondeterministic inductive operator. In STACS, volume 247 of Lecture Notes in Computer Science, pages 323–335. Springer, 1987.
  • [6] Albert Atserias, Andrei A. Bulatov, and Anuj Dawar. Affine systems of equations and counting infinitary logic. Theor. Comput. Sci., 410(18):1666–1683, 2009.
  • [7] Franz Baader, Diego Calvanese, Deborah L. McGuinness, Daniele Nardi, and Peter F. Patel-Schneider, editors. The Description Logic Handbook: Theory, Implementation, and Applications. Cambridge University Press, 2003.
  • [8] Andreas Blass and Yuri Gurevich. The logic of choice. J. Symb. Log., 65(3):1264–1310, 2000.
  • [9] Andreas Blass, Yuri Gurevich, and Saharon Shelah. Choiceless polynomial time. Ann. Pure Appl. Logic, 100(1-3):141–187, 1999.
  • [10] Bart Bogaerts, Balder ten Cate, Brett McLean, and Jan Van den Bussche. Preservation theorems for Tarski's relation algebra. In Proc. DaLi, 2023.
  • [11] Peter Buneman, Leonid Libkin, Dan Suciu, Val Tannen, and Limsoon Wong. Comprehension syntax. SIGMOD Rec., 23(1):87–96, 1994.
  • [12] Peter Buneman, Shamim A. Naqvi, Val Tannen, and Limsoon Wong. Principles of programming with complex objects and collection types. Theor. Comput. Sci., 149(1):3–48, 1995.
  • [13] Peter Burmeister. Partial algebras – an introductory survey. In I. G. Rosenberg and G. Sabidussi, editors, Algebras and Orders. Springer, Dordrecht, 1993.
  • [14] Jin-yi Cai, Martin Fürer, and Neil Immerman. An optimal lower bound on the number of variables for graph identifications. Combinatorica, 12(4):389–410, 1992.
  • [15] Louis Comtet. Advanced Combinatorics: The Art of Finite and Infinite Expansions. Reidel, 1974.
  • [16] Anuj Dawar and David Richerby. A fixed-point logic with symmetric choice. In CSL, volume 2803 of Lecture Notes in Computer Science, pages 169–182. Springer, 2003.
  • [17] Giuseppe De Giacomo and Moshe Y. Vardi. Linear temporal logic and linear dynamic logic on finite traces. In IJCAI 2013, Proceedings of the 23rd International Joint Conference on Artificial Intelligence, Beijing, China, August 3-9, 2013, pages 854–860, 2013.
  • [18] Herbert B. Enderton. A mathematical introduction to logic. Academic Press, 1972.
  • [19] Ronald Fagin. Generalized first-order spectra and polynomial-time recognizable sets. Complexity of computation, SIAM-AMC proceedings, 7:43–73, 1974.
  • [20] Michael J. Fischer and Richard E. Ladner. Propositional dynamic logic of regular programs. J. Comput. Syst. Sci., 18(2):194–211, 1979.
  • [21] George H. L. Fletcher, Marc Gyssens, Dirk Leinders, Dimitri Surinx, Jan Van den Bussche, Dirk Van Gucht, Stijn Vansummeren, and Yuqing Wu. Relative expressive power of navigational querying on graphs. Inf. Sci., 298:390–406, 2015.
  • [22] Françoise Gire and H. Khanh Hoang. An extension of fixpoint logic with a symmetry-based choice construct. Inf. Comput., 144(1):40–65, 1998.
  • [23] Erich Grädel. Capturing complexity classes by fragments of second order logic. In Computational Complexity Conference, pages 341–352. IEEE Computer Society, 1991.
  • [24] Erich Grädel. Why are modal logics so robustly decidable? In Current Trends in Theoretical Computer Science, pages 393–408. 2001.
  • [25] Yuri Gurevich. Logic and the challenge of computer science. In E. Börger, editor, Current Trends in Theoretical Computer Science, pages 1–57. Computer Science Press, 1988.
  • [26] David Harel, Dexter Kozen, and Jerzy Tiuryn. Dynamic Logic (Foundations of Computing). MIT Press, 2000.
  • [27] Matthew Hennessy. The Semantics of Programming Languages: An Elementary Introduction Using Structural Operational Semantics. John Wiley & Sons, Inc., New York, NY, USA, 1990.
  • [28] David Hilbert and Paul Bernays. Grundlagen der Mathematik, volume 2. Springer Verlag, 1939.
  • [29] Marco Hollenberg and Albert Visser. Dynamic negation, the one and only. J. Log. Lang. Inf., 8(2):137–141, 1999.
  • [30] Neil Immerman. Descriptive complexity. Graduate texts in computer science. Springer, 1999.
  • [31] Marcel Jackson and Timothy Stokes. Modal restriction semigroups: towards an algebra of functions. IJAC, 21(7):1053–1095, 2011.
  • [32] Peter T. Johnstone. Sketches of an Elephant: A Topos Theory Compendium, Oxford Logic Guides 44, volume 2. Oxford University Press, 2002.
  • [33] Bjarni Jónsson and Alfred Tarski. Representation problems for relation algebras. Bull. Amer. Math. Soc., 74:127–162, 1952.
  • [34] Dexter Kozen. Kleene algebra with tests. ACM Trans. Program. Lang. Syst., 19(3):427–443, 1997.
  • [35] Leonid Libkin. Query language primitives for programming with incomplete databases. In DBPL, Electronic Workshops in Computing, page 6. Springer, 1995.
  • [36] John McCarthy and Patrick J. Hayes. Some philosophical problems from the standpoint of artificial intelligence. In B. Meltzer and D. Michie, editors, Machine Intelligence 4, pages 463–502. Edinburgh University Press, 1969.
  • [37] Brett McLean. Complete representation by partial functions for composition, intersection and anti-domain. J. Log. Comput., 27(4):1143–1156, 2017.
  • [38] D. G. Mitchell and E. Ternovska. A framework for representing and solving NP search problems. In Proc. AAAI’05, pages 430–435, 2005.
  • [39] Atsushi Ohori, Peter Buneman, and Val Tannen. Database programming in Machiavelli – a polymorphic language with static type inference. In SIGMOD Conference, pages 46–57. ACM Press, 1989.
  • [40] Martin Otto. Epsilon-logic is more expressive than first-order logic over finite structures. J. Symb. Log., 65(4):1749–1757, 2000.
  • [41] Erik Palmgren and Steven J. Vickers. Partial Horn logic and cartesian categories. Ann. Pure Appl. Log., 145(3):314–353, 2007.
  • [42] Gordon D. Plotkin. A structural approach to operational semantics. Technical Report DAIMI-FN-19, Computer Science Department, Aarhus University, 1981. Also published in: Journal of Logic and Algebraic Programming, 60-61:17-140, 2004.
  • [43] Vaughan R. Pratt. Semantical considerations on Floyd-Hoare logic. In 17th Annual Symposium on Foundations of Computer Science, Houston, Texas, USA, 25-27 October 1976, pages 109–121, 1976.
  • [44] Vaughan R. Pratt. Origins of the calculus of binary relations. In Proceedings of the Seventh Annual Symposium on Logic in Computer Science (LICS ’92), Santa Cruz, California, USA, June 22-25, 1992, pages 248–254, 1992.
  • [45] Luc Segoufin and Balder ten Cate. Unary negation. Log. Methods Comput. Sci., 9(3), 2013.
  • [46] John C. Shepherdson and Howard E. Sturgis. Computability of recursive functions. J. ACM, 10(2):217–255, 1963.
  • [47] Sergei Soloviev. Studies of Hilbert’s epsilon-operator in the USSR. FLAP, 4(2), 2017.
  • [48] Dimitri Surinx, Jan Van den Bussche, and Dirk Van Gucht. The primitivity of operators in the algebra of binary relations under conjunctions of containments. In LICS ’17, 2017.
  • [49] Shahab Tasharrofi and Eugenia Ternovska. Built-in arithmetic in knowledge representation languages. In NonMon at 30 (Thirty Years of Nonmonotonic Reasoning), October 2010.
  • [50] E. Ternovska. An algebra of modular systems: Static and dynamic perspectives. In Proceedings of the 12th International Symposium on Frontiers of Combining Systems (FroCoS), September 2019.
  • [51] Eugenia Ternovska. A logic of information flows (two page abstract). In Proc. KR2020, 2020.
  • [52] Eugenia Ternovska and David G. Mitchell. Declarative programming of search problems with built-in arithmetic. In Proc. of IJCAI, pages 942–947, 2009.
  • [53] Moshe Vardi. On the complexity of bounded-variable queries. In PODS, pages 266–276. ACM Press, 1995.
  • [54] Moshe Y. Vardi. The complexity of relational query languages (extended abstract). In STOC, pages 137–146. ACM, 1982.
  • [55] Moshe Y. Vardi. Why is modal logic so robustly decidable? In Descriptive Complexity and Finite Models, Proceedings of a DIMACS Workshop 1996, Princeton, New Jersey, USA, January 14-17, 1996, pages 149–184, 1996.
  • [56] Claus-Peter Wirth. The explicit definition of quantifiers via Hilbert's epsilon is confluent and terminating. Journal of Applied Logics – IFCoLog Journal of Logics and their Applications, Special Issue on Hilbert's epsilon and tau in Logic, Informatics and Linguistics, 4(2), 2017.
  • [57] Limsoon Wong. Kleisli, a functional query system. J. Funct. Program., 10(1):19–56, 2000.
  • [58] Thomas Zeume and Frederik Harwath. Order-invariance of two-variable logic is decidable. In LICS, pages 807–816. ACM, 2016.

0.13 Appendix: Quasi-Equational Theory

In this section, we give a flavour of what the proof theory for our algebra will look like. In the proof system, we will be able to derive the existence and non-existence of a certificate, as explained in Section 0.5. First, we include all logical rules of Partial Horn Theory (PHT) [41], which we do not reproduce here. In addition to the logical rules, we need a quasi-equational theory of the specific algebraic operations we use.

Let a set $\mathcal{M}$ of module symbols be fixed. Let the signature $\sigma$ be the set of function symbols in the algebra (2), also listed in the table in Section 0.1.

A Horn theory in a signature $\sigma$ is a set of Horn sequents over $\sigma$. In the case where $\sigma$ does not contain any predicate symbols, the theory is called quasi-equational.
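For the reader's convenience, we recall from [41] the shape of these sequents: a Horn formula is $\top$ or a finite conjunction of atomic formulas (here, equations between terms), and a Horn sequent is an implication between Horn formulas relative to a context of variables,

\[
\phi \;\vdash_{\vec{x}}\; \psi,
\qquad
\phi,\psi \;::=\; \top \;\mid\; s=t \;\mid\; \phi\land\phi,
\]

where, as in [41], the definedness assertion $t\!\downarrow$ abbreviates the equation $t=t$.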

A novelty of the quasi-equational theory for our logic is that, unlike PHT [41], we compute not only defined terms $t\!\downarrow$, but also undefined terms $t\!\not\downarrow$. Axioms for $t\!\not\downarrow$ specify when each particular operation is not defined. Informally, $t(x)\!\not\downarrow$ means that the corresponding program cannot be executed at $x$.

Our specification of the quasi-equational theory for the signature $\sigma$ from the table in Section 0.1 is as follows. The universally quantified variables are annotated on the turnstiles. Some turnstiles are bidirectional ($\dashv\vdash$), meaning that the implication holds in both directions.

Identity:

$\top \;\vdash_{x}\; \mathrm{id}(x)=x$   (17)

Unary Negation:

$y(x)\!\not\downarrow \;\dashv\vdash_{xy}\; \mathord{\curvearrowright}y(x)=x$   (18)
$y(x)\!\not\downarrow \;\dashv\vdash_{xy}\; \mathord{\curvearrowright}y(x)\!\downarrow$   (19)
$y(x)\!\downarrow \;\dashv\vdash_{xy}\; \mathord{\curvearrowright}y(x)\!\not\downarrow$   (20)

Preferential Union:

$y(x)=w \;\vdash_{xyzw}\; [y\sqcup z](x)=w$   (21)
$y(x)\!\not\downarrow \,\land\, z(x)=w \;\vdash_{xyzw}\; [y\sqcup z](x)=w$   (22)
$y(x)\!\not\downarrow \,\land\, z(x)\!\not\downarrow \;\dashv\vdash_{xyzw}\; [y\sqcup z](x)\!\not\downarrow \,\land\, [z\sqcup y](x)\!\not\downarrow$   (23)

Maximum Iterate:

$y(x)\!\not\downarrow \;\vdash_{xy}\; y^{\uparrow}(x)=x$   (24)
$y^{\uparrow}(y(x))\!\downarrow \;\vdash_{xy}\; y^{\uparrow}(x)=y^{\uparrow}(y(x))$   (25)
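To make the intended semantics of these axioms concrete, the following is a minimal Python sketch on a toy state space (the helper names func, neg, pref_union and max_iterate are ours, and the model is not the paper's string-of-structures semantics); partial functions are represented as callables returning None where undefined, and the assertions instantiate instances of axioms (18)–(25).

def func(d):
    # Lift a finite dict to a partial function (None = undefined).
    return lambda x: d.get(x)

def neg(f):
    # Unary negation: acts as the identity exactly where f is undefined, cf. (18)-(20).
    return lambda x: x if f(x) is None else None

def pref_union(f, g):
    # Preferential union f ⊔ g: prefer f, fall back on g where f is undefined, cf. (21)-(23).
    return lambda x: f(x) if f(x) is not None else g(x)

def max_iterate(f):
    # Maximum iterate f↑: apply f while it is defined, return the last state reached, cf. (24)-(25).
    # On a cyclic f this loop does not terminate, mirroring the difficulty of
    # axiomatizing when the maximum iterate is undefined (see below).
    def h(x):
        while f(x) is not None:
            x = f(x)
        return x
    return h

f = func({0: 1, 1: 2})   # f is undefined at 2 and 3
g = func({2: 3})

assert neg(f)(3) == 3 and neg(f)(0) is None
assert pref_union(f, g)(0) == 1 and pref_union(f, g)(2) == 3
assert max_iterate(f)(0) == 2 and max_iterate(f)(3) == 3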

For each algebraic operation, we need both axioms for when the operation is defined and axioms for when it is undefined. If we try to write axioms for deriving when the Maximum Iterate is undefined, we immediately run into difficulties due to the cyclicity of the reasoning needed to prove that such a term is undefined. Stepan Kuznetsov has suggested a method for dealing with this case.

In addition, we assume axioms compiled from SM-PP atomic modules, for each atomic module symbol in $\mathcal{M}$.