DTU Informatics, Richard Petersens Plads, Technical University of Denmark, DK-2800 Kongens Lyngby, Denmark
Email: {pifi,nielson,riis}@imm.dtu.dk

Lattice based Least Fixed Point Logic

Piotr Filipiuk    Flemming Nielson    Hanne Riis Nielson
Abstract

As software systems become more complex, there is an increasing need for new static analyses. Thanks to its declarative style, logic programming is an attractive formalism for specifying them. However, prior work on using logic programming for static analysis focused on analyses defined over some powerset domain, which is quite limiting. In this paper we present a logic that lifts this restriction, called Lattice based Least Fixed Point Logic (LLFP), which allows interpretations over any complete lattice satisfying the Ascending Chain Condition. The main theoretical contribution is a Moore Family result that guarantees that there always is a unique least solution for a given problem. Another contribution is the development of a solving algorithm that computes the least model of LLFP formulae guaranteed by the Moore Family result.

Keywords:
Static analysis, logic programming, abstract interpretation.

1 Introduction

Nowadays, we heavily rely on software systems. At the same time they become bigger and more complex, and hence the number of potential errors increases. In order to achieve more reliable systems, formal verification techniques may be applied. A widely used verification technique is static analysis, which reasons about system behavior without executing it. It is performed statically at compile-time, and it computes safe approximations of values or behaviors that may occur at run-time. Static analysis is increasingly recognized as a fundamental technique for program verification, bug detection, compiler optimization and software understanding.

Unfortunately, developing new static analyses is difficult and error-prone. To overcome that problem it is desirable to implement prototypes of analyses that are easy to analyse for complexity and correctness. Since analysis specifications are generally written in a declarative style, logic programming presents an attractive model for producing executable specifications of analyses. Furthermore, thanks to advances in logic programming, the associated solvers have become more efficient.

In this paper we present a framework that facilitates rapid prototyping of new static analyses. The approach taken falls within the Abstract Interpretation [8, 7] framework, thus there always is a unique best solution to the analysis problem considered. The framework consists of the Lattice based Least Fixed Point Logic (LLFP) and the associated solver. The most prominent feature of the LLFP logic is its interpretation over complete lattices satisfying the Ascending Chain Condition, which makes it possible to express sophisticated analyses in LLFP. The solver combines a continuation passing style algorithm with propagation of differences, and uses prefix trees as its main data structure. The applicability of the framework is illustrated by presenting a specification of interval analysis, which could not be specified using logics traditionally used such as Datalog [1, 5] or ALFP [18].

This paper is organized as follows. In Section 2 we present the problem we want to solve and indicate a solution. Section 3 introduces syntax and semantics of the LLFP logic. In Section 4 we establish our main theoretical contribution; namely a Moore Family result for LLFP. Section 5 describes the solving algorithm for LLFP. We conclude and discuss future work in Section 6.

2 The problem

There is an immense body of work on using logic for specifying static analyses. However, the logics traditionally used have some limitations. To illustrate the problem, let us briefly introduce the Alternation-free Least Fixed Point Logic (ALFP) and then try to devise ALFP specifications of two analyses: detection of signs and interval analysis.

Alternation-free Least Fixed Point Logic.

Many static analyses can be succinctly expressed using Alternation-free Least Fixed Point Logic (ALFP) [18]. The logic is a generalization of Datalog [1, 5] and it has proved to have a number of properties essential for specifying static analyses such as the existence of a unique least model. The syntax of ALFP is given by

\[
\begin{array}{lcl}
v & ::= & x \mid a \\
pre & ::= & R(v_1,\dots,v_k) \mid \neg R(v_1,\dots,v_k) \mid pre_1 \wedge pre_2 \\
 & \mid & pre_1 \vee pre_2 \mid \exists x: pre \mid v_1 = v_2 \mid v_1 \neq v_2 \\
cl & ::= & R(v_1,\dots,v_k) \mid \mathbf{1} \mid cl_1 \wedge cl_2 \mid pre \Rightarrow cl \mid \forall x: cl
\end{array}
\]

where we write $a$ for constants, $x$ for analysis variables, $v$ for values, $R$ for predicates, $pre$ for preconditions, and $cl$ for clauses. The clauses are interpreted over a universe $\mathcal{U}$ of constants, $a\in\mathcal{U}$. The interpretation is given in terms of satisfaction relations $(\rho,\sigma)\models pre$ and $(\rho,\sigma)\models cl$ where $\rho$ is an interpretation of predicates, and $\sigma$ is an interpretation of variables. The definition is standard and hence omitted. Due to the use of negation, we impose a stratification condition similar to the one in Datalog [1, 5]. This intuitively means that no predicate depends on the negation of itself. We refer to [18] for more details.

Example 1

Using the notion of stratification we can define equality $E$ and non-equality $N$ predicates as follows

\[
(\forall x: E(x,x)) \wedge (\forall x: \forall y: \neg E(x,y) \Rightarrow N(x,y))
\]

The formula is stratified, since predicate $E$ is fully asserted before it is negatively queried in the clause asserting predicate $N$.

Detection of signs analysis.

Now, let us consider the ALFP formulation of the detection of signs analysis. The analysis aims to determine, for each program point and each variable, the possible signs (negative, zero or positive) that the variable may have whenever execution reaches that point. In the following, we use program graphs as the representation of the program under consideration [2]. Compared to the classical flow graphs [14, 16], the main difference is that in program graphs the actions label the edges rather than the states. Here we focus on three types of actions: assignments, boolean expressions and the skip action. For simplicity we assume that assignments are in three-address form. The analysis is defined by the predicate $A$, and we begin by initializing the initial state of the program graph, $q_0$, with all possible signs for all variables $v$ occurring in the underlying program graph

\[
\bigwedge_{v\in\mathit{Var}} A(q_0,v,-) \wedge A(q_0,v,0) \wedge A(q_0,v,+)
\]

Intuitively it indicates that at state $q_0$ all variables may have all possible values. Now we consider the ALFP specifications for each type of action. Whenever we have $q_s \xrightarrow{x:=y\star z} q_t$ in the program graph we generate

\[
\begin{array}{l}
(\forall s: \forall s_y: \forall s_z: A(q_s,y,s_y) \wedge A(q_s,z,s_z) \wedge R_\star(s_y,s_z,s) \Rightarrow A(q_t,x,s)) \ \wedge \\
(\forall v: \forall s: v \neq x \wedge A(q_s,v,s) \Rightarrow A(q_t,v,s))
\end{array}
\]

where we assume that we have a relation for each type of arithmetic operation, denoted by $R_\star$ in the above formula. The first conjunct states that for all possible values $s$, $s_y$ and $s_z$, if at state $q_s$ the signs of variables $y$ and $z$ are $s_y$ and $s_z$, respectively, and the sign of the result of evaluating the arithmetic operation $\star$ is $s$, then at state $q_t$ variable $x$ will have sign $s$. The second conjunct expresses that for all variables $v$ and signs $s$, if the variable is different from $x$ and at state $q_s$ it has sign $s$, then it will have the same sign at state $q_t$. Similarly, whenever we have $q_s \xrightarrow{e} q_t$ or $q_s \xrightarrow{skip} q_t$ in the program graph, we generate a clause

\[
\forall v: \forall s: A(q_s,v,s) \Rightarrow A(q_t,v,s)
\]

The clause simply propagates the signs of all variables along the edge of the program graph, without altering them.
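
The clauses above can also be read operationally. The following is a minimal Python sketch, not the solver described in this paper: it computes the least relation $A$ closed under the generated clauses by naive iteration. The edge encoding, the names `solve_signs`, `mul_sign` and `R_MUL`, the restriction to multiplication, and the optional `init` parameter (which overrides the all-signs initialisation of the text) are all assumptions made for illustration.

```python
from itertools import product

SIGNS = ("-", "0", "+")

def mul_sign(a, b):
    """Sign of a product, given the signs of the two factors."""
    if a == "0" or b == "0":
        return "0"
    return "+" if a == b else "-"

# The relation R_* of the text, instantiated for multiplication (assumption).
R_MUL = {(a, b, mul_sign(a, b)) for a, b in product(SIGNS, repeat=2)}

def solve_signs(variables, edges, q0, init=None):
    """Least set A of (state, variable, sign) triples closed under the clauses."""
    if init is None:
        # Initialisation from the text: every variable may have every sign at q0.
        init = {(q0, v, s) for v in variables for s in SIGNS}
    A = set(init)
    changed = True
    while changed:
        changed = False
        for qs, action, qt in edges:
            if action[0] == "assign":          # edge labelled x := y * z
                _, x, y, z = action
                new = {(qt, x, s) for (sy, sz, s) in R_MUL
                       if (qs, y, sy) in A and (qs, z, sz) in A}
                new |= {(qt, v, s) for (q, v, s) in A if q == qs and v != x}
            else:                              # skip or boolean test: propagate
                new = {(qt, v, s) for (q, v, s) in A if q == qs}
            if not new <= A:
                A |= new
                changed = True
    return A

# Hypothetical program graph: q0 --x := y * y--> q1 --skip--> q2
edges = [("q0", ("assign", "x", "y", "y"), "q1"), ("q1", ("skip",), "q2")]
default = solve_signs({"x", "y"}, edges, "q0")
seeded = solve_signs({"x", "y"}, edges, "q0",
                     init={("q0", "y", "-"), ("q0", "x", "0")})
```

Since the analysis tracks each variable independently, the default initialisation yields all three signs for `x` after `x := y * y`; only when `y` is seeded with a single sign does the result become precise.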

In a similar manner we could formulate other analyses such as pointer analysis [21, 15, 20, 4], or classical data flow analyses [13, 19]. In more general terms, logics traditionally used, e.g. Datalog and ALFP, can be used for specifying analyses defined over a powerset domain. However, as we show in the next paragraph, many interesting analyses are defined over some mathematical structure such as a complete lattice. Thus, let us now consider interval analysis as an example of such an analysis.

Interval analysis.

The purpose of interval analysis is to determine for each program point an interval containing possible values of variables whenever that point is reached during run-time execution. The analysis results can be used for Array Bound Analysis, which determines whether an array index is always within the bounds of the array. If this is the case, a run-time check can safely be eliminated, which makes code more efficient.

We begin by defining the complete lattice $(\mathit{Interval},\sqsubseteq_I)$ over which the analysis is defined. The underlying set is

\[
\mathit{Interval} = \{\bot\} \cup \{[z_1,z_2] \mid z_1 \leq z_2,\ z_1 \in Z\cup\{-\infty\},\ z_2 \in Z\cup\{\infty\}\}
\]

where $Z$ is a finite subset of integers, $Z\subseteq\mathbb{Z}$, and the integer ordering $\leq$ on $\mathbb{Z}$ is extended to an ordering on $Z' = Z\cup\{-\infty,\infty\}$ by taking for all $z\in Z$: $-\infty\leq z$, $z\leq\infty$ and $-\infty\leq\infty$. In the above definition, $\bot$ denotes an empty interval, whereas $[z_1,z_2]$ is the interval from $z_1$ to $z_2$ including the end points, where $z_1,z_2\in Z$. The interval $[-\infty,\infty]$ is equivalent to the top element, $\top$. In the following we use $i$ to denote an interval from $\mathit{Interval}$. The partial ordering $\sqsubseteq_I$ in $\mathit{Interval}$ uses operations $\inf$ and $\sup$

\[
\inf(i) = \left\{\begin{array}{ll} \infty & \text{if } i = \bot \\ z_1 & \text{if } i = [z_1,z_2] \end{array}\right.
\qquad
\sup(i) = \left\{\begin{array}{ll} -\infty & \text{if } i = \bot \\ z_2 & \text{if } i = [z_1,z_2] \end{array}\right.
\]

and is defined as

\[
i_1 \sqsubseteq_I i_2 \text{ iff } \inf(i_2) \leq \inf(i_1) \wedge \sup(i_1) \leq \sup(i_2)
\]

The intuition behind the partial ordering $\sqsubseteq_I$ in $\mathit{Interval}$ is that

\[
i_1 \sqsubseteq_I i_2 \Leftrightarrow \{z \mid z \text{ belongs to } i_1\} \subseteq \{z \mid z \text{ belongs to } i_2\}
\]
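
As an illustration, the ordering can be transcribed almost literally into Python. This is a sketch under assumptions: $\bot$ is encoded as `None`, an interval as a pair of bounds, $\pm\infty$ as `math.inf`, and the helper `members` (used only to compare against set inclusion on finite intervals) is hypothetical.

```python
import math

BOT = None                       # the empty interval
TOP = (-math.inf, math.inf)      # the interval [-inf, inf]

def inf(i):
    """Lower bound of an interval; +infinity for the empty interval."""
    return math.inf if i is BOT else i[0]

def sup(i):
    """Upper bound of an interval; -infinity for the empty interval."""
    return -math.inf if i is BOT else i[1]

def leq(i1, i2):
    """i1 is below i2 iff inf(i2) <= inf(i1) and sup(i1) <= sup(i2)."""
    return inf(i2) <= inf(i1) and sup(i1) <= sup(i2)

def members(i):
    """Integers belonging to a finite interval (empty set for BOT)."""
    return set() if i is BOT else set(range(i[0], i[1] + 1))

# The ordering agrees with set inclusion on a few finite intervals.
samples = [BOT, (0, 0), (1, 2), (0, 5), (3, 7)]
agrees = all(leq(a, b) == (members(a) <= members(b))
             for a in samples for b in samples)
```

The conventions $\inf(\bot)=\infty$ and $\sup(\bot)=-\infty$ are what make $\bot$ the least element without a special case in `leq`.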

Unfortunately, due to the limited expressiveness of the ALFP logic (and similarly Datalog), interval analysis cannot be specified using these formalisms. Hence, in the following section we present a solution to that problem; namely we introduce the LLFP logic.

3 Syntax and Semantics

In the previous section we briefly introduced the ALFP logic, which is interpreted over a finite universe of atoms; in this section we present an extension of ALFP called LLFP, allowing interpretations over complete lattices satisfying the Ascending Chain Condition. We also allow function terms as arguments of relations. Since functions over the universe $\mathcal{U}$ can be represented as relations, we do not consider them here. Instead, we focus on functions over a complete lattice, $\llbracket f\rrbracket: \mathcal{L}^k \rightarrow \mathcal{L}$, and we restrict our attention to monotone functions only. Recall that a function $\llbracket f\rrbracket: \mathcal{L}_1 \rightarrow \mathcal{L}_2$ between partially ordered sets $\mathcal{L}_1 = (\mathcal{L}_1,\sqsubseteq_1)$ and $\mathcal{L}_2 = (\mathcal{L}_2,\sqsubseteq_2)$ is monotone if

\[
\forall l, l' \in \mathcal{L}_1: l \sqsubseteq_1 l' \Rightarrow \llbracket f\rrbracket(l) \sqsubseteq_2 \llbracket f\rrbracket(l')
\]

Let us begin by introducing the necessary definitions.

Definition 1

A complete lattice $\mathcal{L} = (\mathcal{L},\sqsubseteq) = (\mathcal{L},\sqsubseteq,\bigsqcup,\bigsqcap,\bot,\top)$ is a partially ordered set $(\mathcal{L},\sqsubseteq)$ such that all subsets have least upper bounds as well as greatest lower bounds.

A subset $Y\subseteq\mathcal{L}$ of a partially ordered set $\mathcal{L} = (\mathcal{L},\sqsubseteq)$ is a chain if

\[
\forall l_1,l_2 \in Y: (l_1 \sqsubseteq l_2) \vee (l_2 \sqsubseteq l_1)
\]

Hence, a chain is a (possibly empty) subset of $\mathcal{L}$ that is totally ordered. A sequence $(l_n)_n$ of elements in $\mathcal{L}$ is an ascending chain if

\[
n < m \Rightarrow l_n \sqsubseteq l_m
\]

We say that a sequence $(l_n)_n$ eventually stabilises if and only if

\[
\exists n_0 \in \mathbb{N}: \forall n \in \mathbb{N}: n > n_0 \Rightarrow l_n = l_{n_0}
\]

The partially ordered set $\mathcal{L}$ satisfies the Ascending Chain Condition if and only if all ascending chains eventually stabilise. Essentially, the Ascending Chain Condition guarantees that the least fixed point computation always terminates. Due to the use of negation in the logic, we need to introduce a complement operator, $\complement$, in the underlying complete lattice. The only condition that we impose on the complement is anti-monotonicity, i.e. $\forall l_1,l_2 \in \mathcal{L}: l_1 \sqsubseteq l_2 \Rightarrow \complement l_1 \sqsupseteq \complement l_2$, which is necessary for establishing the Moore Family result. The following definition introduces the syntax of LLFP.
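
To see why the Ascending Chain Condition yields termination, consider the following minimal Python sketch of the standard iteration $\bot, f(\bot), f(f(\bot)), \ldots$ for a monotone function $f$. The function `lfp` and the example function `step` on the powerset of $\{0,\ldots,4\}$ are assumptions chosen for illustration and do not appear in the paper.

```python
def lfp(f, bottom):
    """Iterate bottom, f(bottom), f(f(bottom)), ... until the chain stabilises.

    For monotone f the iterates form an ascending chain, so the loop
    terminates whenever the lattice satisfies the Ascending Chain Condition.
    """
    current = bottom
    while True:
        nxt = f(current)
        if nxt == current:
            return current
        current = nxt

def step(s):
    # Monotone on the powerset of {0, ..., 4}: add 0 and every successor below 5.
    return frozenset(s | {0} | {n + 1 for n in s if n < 4})

least = lfp(step, frozenset())
```

Here the chain grows by one element per step and stabilises once all of $\{0,\ldots,4\}$ has been reached.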

Definition 2

Given fixed countable and pairwise disjoint sets $\mathcal{X}$ and $\mathcal{Y}$ of variables, a non-empty and finite universe $\mathcal{U}$, a complete lattice $\mathcal{L}$ satisfying the Ascending Chain Condition, and finite alphabets $\mathcal{R}$ and $\mathcal{F}$ of predicate and function symbols, respectively, we define the set of LLFP formulae (or clause sequences), $cls$, together with clauses, $cl$, preconditions, $pre$, terms $u$ and lattice terms $V$ and $V'$ by the grammar:

\[
\begin{array}{lcl}
u & ::= & x \mid a \\
V & ::= & Y \mid [u] \\
V' & ::= & V \mid f(\vec{V'}) \\
pre & ::= & R(\vec{u};V) \mid \neg R(\vec{u};V) \mid Y(u) \mid pre_1 \wedge pre_2 \mid pre_1 \vee pre_2 \\
 & \mid & \exists x: pre \mid \exists Y: pre \\
cl & ::= & R(\vec{u};V') \mid \mathbf{1} \mid cl_1 \wedge cl_2 \mid pre \Rightarrow cl \mid \forall x: cl \mid \forall Y: cl \\
cls & ::= & cl_1,\ldots,cl_s
\end{array}
\]

Here $x\in\mathcal{X}$, $a\in\mathcal{U}$, $Y\in\mathcal{Y}$, $R\in\mathcal{R}$, $f\in\mathcal{F}$, and $s\geq 1$. Furthermore, $\vec{u}$ and $\vec{V'}$ abbreviate tuples $(u_1,\ldots,u_k)$ and $(V'_1,\ldots,V'_k)$ for some $k\geq 0$, respectively.

We write $\mathit{fv}(\cdot)$ for the set of free variables in the argument $\cdot$. Occurrences of $R(\vec{u};V)$ and $\neg R(\vec{u};V)$ in preconditions are called positive, resp. negative, queries and we require that $\mathit{fv}(\vec{u})\subseteq\mathcal{X}$ and $\mathit{fv}(V)\subseteq\mathcal{Y}\cup\mathcal{X}$; these variables are defining occurrences. Occurrences of $Y(u)$ in preconditions must satisfy $Y\in\mathcal{Y}$ and $\mathit{fv}(u)\subseteq\mathcal{X}$; $Y$ is an applied occurrence, $u$ is a defining occurrence. Clauses of the form $R(\vec{u};V')$ are called assertions; we require that $\mathit{fv}(\vec{u})\subseteq\mathcal{X}$ and $\mathit{fv}(V')\subseteq\mathcal{Y}\cup\mathcal{X}$ and we note that these variables are applied occurrences. A clause $cl$ satisfying these conditions together with $\mathit{fv}(cl)=\emptyset$ is said to be well-formed; we are only interested in clause sequences $cls$ consisting of well-formed clauses.

In order to ensure desirable theoretical and pragmatic properties in the presence of negation, we impose a notion of stratification similar to the one in Datalog [1, 5]. Intuitively, stratification ensures that a negative query is not performed until the predicate has been fully asserted. This is important for ensuring that once a precondition evaluates to true it will continue to be true even after further assertions of predicates.

Definition 3

The formula $cls = cl_1,\cdots,cl_s$ is stratified if there exists a function $\mathrm{rank}: \mathcal{R}\rightarrow\{0,\cdots,s\}$ such that for all $i = 1,\cdots,s$:

  • $\mathrm{rank}(R) = i$ for every assertion $R$ in $cl_i$;

  • $\mathrm{rank}(R) \leq i$ for every positive query $R$ in $cl_i$; and

  • $\mathrm{rank}(R) < i$ for every negative query $\neg R$ in $cl_i$.

The following example illustrates the use of negation in LLFP formulae.

Example 2

Similarly to Example 1, we can define equality $E$ and non-equality $N$ predicates in LLFP as follows

\[
(\forall x: E(x;[x])),\ (\forall x: \forall Y: \neg E(x;Y) \Rightarrow N(x;Y))
\]

According to Definition 3 the formula is stratified, since predicate $E$ is fully asserted before it is negatively queried in the clause asserting predicate $N$. As a result we can dispense with an explicit treatment of $=$ and $\neq$ in the development that follows. On the other hand, Definition 3 rules out the following formula, in which $P$ and $Q$ each depend on the negation of the other

\[
(\forall x: \forall Y: \neg P(x;Y) \Rightarrow Q(x;Y)),\ (\forall x: \forall Y: \neg Q(x;Y) \Rightarrow P(x;Y))
\]
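
The three conditions of Definition 3 can be checked mechanically. The sketch below is one possible Python rendering, under the assumption that each clause $cl_i$ is summarised by the sets of predicates it asserts, queries positively and queries negatively; the dictionary keys `asserts`, `pos` and `neg` and the function `find_rank` are hypothetical names. Predicates that are never asserted are treated as having rank 0.

```python
def find_rank(clauses):
    """Return a rank function (as a dict) witnessing stratification, or None.

    clauses[i-1] summarises clause cl_i by three sets of predicate names.
    The rank of an asserted predicate is forced to be the index of its clause.
    """
    rank = {}
    for i, cl in enumerate(clauses, start=1):
        for R in cl["asserts"]:
            if rank.setdefault(R, i) != i:
                return None                    # asserted in two different clauses
    for i, cl in enumerate(clauses, start=1):
        if any(rank.get(R, 0) > i for R in cl["pos"]):
            return None                        # positive query needs rank(R) <= i
        if any(rank.get(R, 0) >= i for R in cl["neg"]):
            return None                        # negative query needs rank(R) < i
    return rank

# The formula defining E and N: E is asserted first, then queried negatively.
equality = [{"asserts": {"E"}, "pos": set(), "neg": set()},
            {"asserts": {"N"}, "pos": set(), "neg": {"E"}}]

# The formula with P and Q: each clause negates the predicate asserted by the other.
circular = [{"asserts": {"Q"}, "pos": set(), "neg": {"P"}},
            {"asserts": {"P"}, "pos": set(), "neg": {"Q"}}]
```

For `circular` the first clause would need $\mathrm{rank}(P) < 1$ while the second forces $\mathrm{rank}(P) = 2$, so no rank function exists.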

To specify the semantics of LLFP we introduce the interpretations $\varrho$, $\varsigma$ and $\zeta$ of predicate symbols, variables and function symbols, respectively. Formally we have

\[
\begin{array}{rl}
\varrho: & \prod_k \mathcal{R}_{/k} \rightarrow \mathcal{U}^k \rightarrow \mathcal{L} \\
\varsigma: & (\mathcal{X} \rightarrow \mathcal{U}) \times (\mathcal{Y} \rightarrow \mathcal{L}_{\neq\bot}) \\
\zeta: & \prod_k \mathcal{F}_{/k} \rightarrow \mathcal{L}^k \rightarrow \mathcal{L}
\end{array}
\]

In the above, $\mathcal{R}_{/k}$ stands for the set of predicate symbols of arity $k$, and $\mathcal{R}$ is the disjoint union of the $\mathcal{R}_{/k}$, hence $\mathcal{R} = \biguplus_k \mathcal{R}_{/k}$. Similarly, $\mathcal{F}_{/k}$ is the set of function symbols of arity $k$ over the complete lattice $\mathcal{L}$, and $\mathcal{F}$ is the disjoint union of the $\mathcal{F}_{/k}$; hence $\mathcal{F} = \biguplus_k \mathcal{F}_{/k}$. The interpretation of variables from $\mathcal{X}$ is given by $\llbracket x\rrbracket(\zeta,\varsigma) = \varsigma(x)$, where $\varsigma(x)$ is the element from $\mathcal{U}$ bound to $x\in\mathcal{X}$. Analogously, the interpretation of variables from $\mathcal{Y}$ is given by $\llbracket Y\rrbracket(\zeta,\varsigma) = \varsigma(Y)$, where $\varsigma(Y)$ is the element from $\mathcal{L}_{\neq\bot} = \mathcal{L}\setminus\{\bot\}$ bound to $Y\in\mathcal{Y}$. We do not allow variables from $\mathcal{Y}$ to be mapped to $\bot$ in order to establish a relationship between ALFP and LLFP in the case of the powerset lattice, i.e. $\mathcal{P}(\mathcal{U})$, which we briefly describe later.

In order to give the interpretation of $[u]$, we introduce a function $\beta: \mathcal{U}\rightarrow\mathcal{L}$. The function $\beta$ is called a representation function and the idea is that $\beta$ maps a value from the universe $\mathcal{U}$ to the best property describing it. For example, in the case of a powerset lattice, $\beta$ could be defined by $\beta(a) = \{a\}$ for all $a\in\mathcal{U}$. Then the interpretation is given by $\llbracket [u]\rrbracket(\zeta,\varsigma) = \beta(\llbracket u\rrbracket(\zeta,\varsigma))$. The interpretation of function terms is defined as $\llbracket f(\vec{V'})\rrbracket(\zeta,\varsigma) = \zeta(f)(\llbracket\vec{V'}\rrbracket(\zeta,\varsigma))$. For the functions we require that $\zeta(f): \mathcal{L}^k\rightarrow\mathcal{L}$ is monotone. The interpretation of terms is generalized to sequences $\vec{u}$ of terms in a point-wise manner by taking $\llbracket a\rrbracket(\zeta,\varsigma) = a$ for all $a\in\mathcal{U}$, thus $\llbracket(u_1,\ldots,u_k)\rrbracket(\zeta,\varsigma) = (\llbracket u_1\rrbracket(\zeta,\varsigma),\ldots,\llbracket u_k\rrbracket(\zeta,\varsigma))$. The interpretation of lattice terms $V$ (and $V'$) is generalized to sequences $\vec{V}$ (and $\vec{V'}$) of lattice terms in a similar way.

The satisfaction relations for preconditions $pre$, clauses $cl$ and clause sequences $cls$ are specified by:

\[
(\varrho,\varsigma)\models_\beta pre, \quad (\varrho,\zeta,\varsigma)\models_\beta cl \quad \text{and} \quad (\varrho,\zeta,\varsigma)\models_\beta cls
\]

The formal definition is given in Table 1; here $\varsigma[x\mapsto a]$ stands for the mapping that is as $\varsigma$ except that $x$ is mapped to $a$, and similarly $\varsigma[Y\mapsto l]$ stands for the mapping that is as $\varsigma$ except that $Y$ is mapped to $l\in\mathcal{L}_{\neq\bot}$.

Table 1: Semantics of LLFP

\[
\begin{array}{lllll}
(\varrho,\varsigma) & \models_\beta & R(\vec{u};V) & \underline{\texttt{iff}} & \varrho(R)(\varsigma(\vec{u})) \sqsupseteq \varsigma(V) \\
(\varrho,\varsigma) & \models_\beta & \neg R(\vec{u};V) & \underline{\texttt{iff}} & \complement(\varrho(R)(\varsigma(\vec{u}))) \sqsupseteq \varsigma(V) \\
(\varrho,\varsigma) & \models_\beta & Y(u) & \underline{\texttt{iff}} & \beta(\varsigma(u)) \sqsubseteq \varsigma(Y) \\
(\varrho,\varsigma) & \models_\beta & pre_1 \wedge pre_2 & \underline{\texttt{iff}} & (\varrho,\varsigma)\models_\beta pre_1 \text{ and } (\varrho,\varsigma)\models_\beta pre_2 \\
(\varrho,\varsigma) & \models_\beta & pre_1 \vee pre_2 & \underline{\texttt{iff}} & (\varrho,\varsigma)\models_\beta pre_1 \text{ or } (\varrho,\varsigma)\models_\beta pre_2 \\
(\varrho,\varsigma) & \models_\beta & \exists x: pre & \underline{\texttt{iff}} & (\varrho,\varsigma[x\mapsto a])\models_\beta pre \text{ for some } a\in\mathcal{U} \\
(\varrho,\varsigma) & \models_\beta & \exists Y: pre & \underline{\texttt{iff}} & (\varrho,\varsigma[Y\mapsto l])\models_\beta pre \text{ for some } l\in\mathcal{L}_{\neq\bot} \\[1ex]
(\varrho,\zeta,\varsigma) & \models_\beta & R(\vec{u};V') & \underline{\texttt{iff}} & \varrho(R)(\llbracket\vec{u}\rrbracket(\zeta,\varsigma)) \sqsupseteq \llbracket V'\rrbracket(\zeta,\varsigma) \\
(\varrho,\zeta,\varsigma) & \models_\beta & \mathbf{1} & \underline{\texttt{iff}} & \texttt{true} \\
(\varrho,\zeta,\varsigma) & \models_\beta & cl_1 \wedge cl_2 & \underline{\texttt{iff}} & (\varrho,\zeta,\varsigma)\models_\beta cl_1 \text{ and } (\varrho,\zeta,\varsigma)\models_\beta cl_2 \\
(\varrho,\zeta,\varsigma) & \models_\beta & pre \Rightarrow cl & \underline{\texttt{iff}} & (\varrho,\zeta,\varsigma)\models_\beta cl \text{ whenever } (\varrho,\varsigma)\models_\beta pre \\
(\varrho,\zeta,\varsigma) & \models_\beta & \forall x: cl & \underline{\texttt{iff}} & (\varrho,\zeta,\varsigma[x\mapsto a])\models_\beta cl \text{ for all } a\in\mathcal{U} \\
(\varrho,\zeta,\varsigma) & \models_\beta & \forall Y: cl & \underline{\texttt{iff}} & (\varrho,\zeta,\varsigma[Y\mapsto l])\models_\beta cl \text{ for all } l\in\mathcal{L}_{\neq\bot} \\[1ex]
(\varrho,\zeta,\varsigma) & \models_\beta & cl_1,\cdots,cl_s & \underline{\texttt{iff}} & (\varrho,\zeta,\varsigma)\models_\beta cl_i \text{ for all } i,\ 1\leq i\leq s
\end{array}
\]
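
The first three rows of Table 1 can be made concrete for the powerset lattice. The following Python sketch assumes a toy universe of three atoms, $\beta(a)=\{a\}$, set complement as the complement operator, and interpretations stored as nested dictionaries in which a missing entry stands for $\bot$; the function names `query`, `neg_query` and `applied` are hypothetical.

```python
U = frozenset({"a", "b", "c"})           # assumed toy universe

def beta(a):
    """Representation function for the powerset lattice: beta(a) = {a}."""
    return frozenset({a})

def complement(l):
    """Set complement; anti-monotone with respect to inclusion."""
    return U - l

def query(rho, R, args, l):
    """Positive query R(u;V): rho(R)(args) is above l."""
    return l <= rho.get(R, {}).get(args, frozenset())

def neg_query(rho, R, args, l):
    """Negative query: the complement of rho(R)(args) is above l."""
    return l <= complement(rho.get(R, {}).get(args, frozenset()))

def applied(Y_value, a):
    """Precondition Y(u): beta(a) is below the value bound to Y."""
    return beta(a) <= Y_value

# Least interpretation of E from the clause "forall x: E(x;[x])".
rho = {"E": {(x,): beta(x) for x in U}}
```

With this `rho`, the negative query on `E` at `"a"` holds exactly for the non-empty subsets of `{"b", "c"}`, which is what the clause asserting $N$ then collects.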

Relationship to ALFP.

As the reader may have already noticed, when the underlying complete lattice is $\mathcal{P}(\mathcal{U})$ the two logics are essentially equivalent. More precisely, in the case of the powerset lattice $\mathcal{P}(\mathcal{U})$, with the function $\beta$ given by $\beta(a)=\{a\}$ for all $a\in\mathcal{U}$ and without function terms, we can translate an LLFP formula into a corresponding ALFP one and vice versa. Intuitively, we get the following correspondence between interpretations of relations

\[
\forall\vec{a},b: ((\vec{a},b)\in\rho(R) \Leftrightarrow \varrho(R)(\vec{a}) \supseteq \{b\})
\]

The idea is that a relation $R$ in LLFP with interpretation $\varrho(R)\in\mathcal{U}^k\rightarrow\mathcal{P}(\mathcal{U})$ is replaced by a relation in ALFP (also named $R$) with interpretation $\rho(R)\in\mathcal{P}(\mathcal{U}^{k+1})$. Note that if $\varrho(R)(\vec{a})=\bot$ then $\rho(R)$ does not contain any tuples with $\vec{a}$ as the first $k$ components.
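
The correspondence amounts to currying the last component of a relation. A minimal Python sketch, assuming ALFP relations are stored as sets of tuples and LLFP interpretations as dictionaries from $k$-tuples to frozensets (the names `to_llfp` and `to_alfp` are hypothetical):

```python
def to_llfp(rho_R, k):
    """ALFP relation (set of (k+1)-tuples) to an LLFP map from k-tuples to sets."""
    grouped = {}
    for t in rho_R:
        grouped.setdefault(t[:k], set()).add(t[k])
    # Keys that are absent correspond to bottom, i.e. the empty set.
    return {a: frozenset(bs) for a, bs in grouped.items()}

def to_alfp(varrho_R):
    """LLFP map from k-tuples to sets back to an ALFP relation."""
    return {a + (b,) for a, bs in varrho_R.items() for b in bs}

# Hypothetical sign-analysis facts A(q, v, s).
rho_A = {("q0", "x", "+"), ("q0", "x", "-"), ("q1", "y", "0")}
varrho_A = to_llfp(rho_A, 2)
```

A round trip through both functions returns the original relation, mirroring the equivalence stated above.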

Interval analysis in LLFP.

Now let us give an LLFP specification of interval analysis. The analysis is defined by the predicate $A$. Similarly to Datalog or ALFP, the specification is defined over a universe $\mathcal{U}$, which in this case is the set of all variables, $\mathit{Var}$, appearing in the program together with the states in the underlying program graph. In addition, the LLFP logic allows interpretations over complete lattices satisfying the Ascending Chain Condition. Here we use the lattice $(\mathit{Interval},\sqsubseteq_I)$, defined in Section 2.

The specification consists of the initialization clauses and clauses corresponding to the three types of actions in the underlying program graph. First, for the initial state, $q_0$, we initialize all variables in the program graph with the $\top$ element, denoting that they may have all possible values

\[
\bigwedge_{v\in\mathit{Var}} A(q_0,v;\top)
\]

Furthermore, whenever we have $q_s \xrightarrow{x:=y\star z} q_t$ in the program graph we generate

\[
\begin{array}{l}
(\forall i_y: \forall i_z: A(q_s,y;i_y) \wedge A(q_s,z;i_z) \Rightarrow A(q_t,x;f_\star(i_y,i_z))) \ \wedge \\
(\forall v: \forall i: v \neq x \wedge A(q_s,v;i) \Rightarrow A(q_t,v;i))
\end{array}
\]

The first conjunct updates the possible interval of values for the assigned variable (in this case $x$) with the result of evaluating the arithmetic operation $y\star z$. The second conjunct propagates the analysis information for all variables except $x$ without altering it. Furthermore, whenever we have $q_s \xrightarrow{e} q_t$ or $q_s \xrightarrow{skip} q_t$ in the program graph, we generate a clause

\[
\forall v: \forall i: A(q_s,v;i) \Rightarrow A(q_t,v;i)
\]

which simply propagates the analysis information along the edge of the program graph, without making any changes.
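
To illustrate how these clauses behave, the following Python sketch computes their least solution for assignment edges by naive iteration. It is not the solver of Section 5. Several choices are assumptions: $Z=\{-K,\ldots,K\}$ with $K=10$, the function `f_plus` standing for $f_+$ (bounds outside $Z$ are widened to $\pm\infty$ or clamped to $\pm K$), intervals encoded as pairs with `None` for $\bot$, and a seeded initial state instead of the $\top$ initialisation so that the result is not trivially $\top$.

```python
import math

BOT = None   # the empty interval
K = 10       # assumption: Z = {-K, ..., K}

def normalise(lo, hi):
    """Force the bounds into Z plus infinities, over-approximating if needed."""
    if lo < -K:
        lo = -math.inf
    elif lo > K:
        lo = K
    if hi > K:
        hi = math.inf
    elif hi < -K:
        hi = -K
    return (lo, hi)

def join(i1, i2):
    """Least upper bound of two intervals."""
    if i1 is BOT:
        return i2
    if i2 is BOT:
        return i1
    return (min(i1[0], i2[0]), max(i1[1], i2[1]))

def f_plus(i1, i2):
    """Monotone interval addition on non-empty intervals."""
    return normalise(i1[0] + i2[0], i1[1] + i2[1])

def solve(edges, init):
    """Least A with A(qt,x) above f_plus(A(qs,y), A(qs,z)) and A(qt,v) above A(qs,v)."""
    A = dict(init)
    changed = True
    while changed:
        changed = False
        for qs, (x, y, z), qt in edges:          # edge labelled x := y + z
            updates = {}
            iy, iz = A.get((qs, y), BOT), A.get((qs, z), BOT)
            if iy is not BOT and iz is not BOT:  # lattice variables are never bottom
                updates[(qt, x)] = f_plus(iy, iz)
            for (q, v), i in list(A.items()):
                if q == qs and v != x:
                    updates[(qt, v)] = i
            for key, i in updates.items():
                new = join(A.get(key, BOT), i)
                if new != A.get(key, BOT):
                    A[key] = new
                    changed = True
    return A

# Hypothetical graph: q0 --x := y + z--> q1, with a loop q1 --x := x + y--> q1.
edges = [("q0", ("x", "y", "z"), "q1"), ("q1", ("x", "x", "y"), "q1")]
A = solve(edges, {("q0", "y"): (1, 2), ("q0", "z"): (3, 4)})
```

The loop makes the upper bound of `x` at `q1` grow until it leaves $Z$, at which point it becomes $\infty$ and the iteration stabilises; this is the Ascending Chain Condition at work.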

4 Moore family result for LLFP

In this section we establish a Moore family result for LLFP that guarantees that there always is a unique best solution for LLFP clauses.

Definition 4

A Moore family is a subset $Y$ of a complete lattice $\mathcal{L} = (\mathcal{L},\sqsubseteq)$ that is closed under greatest lower bounds: $\forall Y'\subseteq Y: \bigsqcap Y' \in Y$.

It follows that a Moore family always contains a least element, $\bigsqcap Y$, and a greatest element, $\bigsqcap\emptyset$, which equals the greatest element, $\top$, from $\mathcal{L}$; in particular, a Moore family is never empty. The property is also called the model intersection property, since whenever we take a meet of a number of models we still get a model.
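
On a finite powerset lattice, Definition 4 can be checked by brute force. The Python sketch below assumes the lattice is $\mathcal{P}(\{1,2,3\})$ with intersection as the meet; the function `is_moore_family` and the three example families are hypothetical.

```python
from itertools import chain, combinations

def is_moore_family(Y, top):
    """Check that the meet of every subfamily of Y (including the empty one) is in Y."""
    Y = set(Y)
    subfamilies = chain.from_iterable(
        combinations(Y, r) for r in range(len(Y) + 1))
    for sub in subfamilies:
        meet = top                  # the meet of the empty family is top
        for s in sub:
            meet = meet & s
        if meet not in Y:
            return False
    return True

top = frozenset({1, 2, 3})
closed = {top, frozenset({1, 2}), frozenset({2, 3}), frozenset({2})}
missing_meet = {top, frozenset({1, 2}), frozenset({2, 3})}
missing_top = {frozenset({1})}
```

The family `missing_meet` fails because $\{1,2\}\cap\{2,3\}=\{2\}$ is absent, and `missing_top` fails because the empty meet, $\top$, is absent.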

Assume $cls$ has the form $cl_1,\ldots,cl_s$, and let $\Delta = \{\varrho: \prod_k \mathcal{R}_{/k}\rightarrow\mathcal{U}^k\rightarrow\mathcal{L}\}$ denote the set of interpretations $\varrho$ of predicate symbols in $\mathcal{R}$. We also define the lexicographical ordering $\preceq$ such that $\varrho_1\preceq\varrho_2$ if and only if there is some $1\leq j\leq s$, where $s$ is the order of the formula, such that the following properties hold:

  (a) $\varrho_1(R) = \varrho_2(R)$ for all $R\in\mathcal{R}$ with $\mathrm{rank}(R) < j$,

  (b) $\varrho_1(R) \sqsubseteq \varrho_2(R)$ for all $R\in\mathcal{R}$ with $\mathrm{rank}(R) = j$,

  (c) either $j = s$ or $\varrho_1(R) \sqsubset \varrho_2(R)$ for at least one $R\in\mathcal{R}$ with $\mathrm{rank}(R) = j$.

We say that $\varrho_1(R)\sqsubseteq\varrho_2(R)$ if and only if $\forall\vec{a}\in\mathcal{U}^k: \varrho_1(R)(\vec{a}) \sqsubseteq \varrho_2(R)(\vec{a})$, where $k\geq 0$ is the arity of $R$. Notice that in the case $s=1$, the above ordering coincides with the lattice ordering $\sqsubseteq$. Intuitively, the lexicographical ordering $\preceq$ orders the relations stratum by stratum, starting with stratum 0. It is essentially analogous to the lexicographical ordering on strings, which is based on the alphabetical order of their characters.
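
A direct Python transcription of conditions (a) to (c) may help in reading the definition. The sketch makes simplifying assumptions: every predicate has arity 0, so that each $\varrho(R)$ is a single lattice element, the lattice is a powerset with `frozenset` inclusion as the ordering, and the names `preceq`, `P` and `Q` are hypothetical.

```python
def preceq(rho1, rho2, rank, s, leq):
    """Lexicographic ordering on interpretations, following conditions (a)-(c)."""
    for j in range(1, s + 1):
        lower = [R for R in rank if rank[R] < j]
        level = [R for R in rank if rank[R] == j]
        if (all(rho1[R] == rho2[R] for R in lower)                    # (a)
                and all(leq(rho1[R], rho2[R]) for R in level)         # (b)
                and (j == s                                           # (c)
                     or any(rho1[R] != rho2[R] for R in level))):
            return True
    return False

subset = lambda a, b: a <= b
rank = {"P": 1, "Q": 2}
rho1 = {"P": frozenset({1}), "Q": frozenset({1, 2, 3})}
rho2 = {"P": frozenset({1, 2}), "Q": frozenset()}
```

Here `rho1` is below `rho2` because it is strictly smaller on the rank 1 predicate `P`, even though it is larger on `Q`; this is exactly the behaviour of a lexicographical order.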

Lemma 1

$\preceq$ defines a partial order.

Proof

See Appendix 0.A.

Assume $cls$ has the form $cl_1,\cdots,cl_s$ where $cl_j$ is the clause corresponding to stratum $j$, and let $\mathcal{R}_j$ denote the set of all relation symbols $R$ defined in $cl_1,\cdots,cl_j$, taking $\mathcal{R}_0 = \emptyset$. Let $M\subseteq\Delta$ denote a set of assignments which map relation symbols to relations.

Lemma 2

$\Delta = (\Delta,\preceq)$ is a complete lattice with the greatest lower bound given by

\[
\left(\bigsqcap{}_{\Delta} M\right)(R) = \lambda\vec{a}. \bigsqcap\left\{\varrho(R)(\vec{a}) \mid \varrho\in M_{\mathrm{rank}(R)}\right\}
\]

where

\[
M_j = \left\{\varrho\in M \mid \forall R'\ \mathrm{rank}(R') < j: \varrho(R') = \left(\bigsqcap{}_{\Delta} M\right)(R')\right\}
\]

Proof

See Appendix 0.B

Note that $\bigsqcap_{\Delta} M$ is well defined by induction on $j$, observing that $M_0 = M$ and $M_j \subseteq M_{j-1}$.

Proposition 1

Assume $cls$ is a stratified LLFP clause sequence, and $\varsigma_0$ and $\zeta_0$ are interpretations of free variables and function symbols in $cls$, respectively. Furthermore, $\varrho_0$ is an interpretation of all relations of rank 0. Then $\{\varrho \mid (\varrho,\zeta_0,\varsigma_0)\models_\beta cls \wedge \forall R: \mathrm{rank}(R) = 0 \Rightarrow \varrho_0(R)\sqsubseteq\varrho(R)\}$ is a Moore family.

Proof

See Appendix 0.C

The result ensures that the approach falls within the framework of Abstract Interpretation [7, 8]; hence we can be sure that there always is a single best solution for the analysis problem under consideration, namely the one defined in Proposition 1.

5 The Algorithm

In this section we present the algorithm for solving LLFP clause sequences, which extends the differential worklist algorithm by Nielson et al. [18, 17]. The algorithm computes the relations in increasing order on their rank and therefore the negations present no obstacles. It completely abandons a worklist-like data structures, which are typical for most classical iterative fixpoint algorithms [12]. Instead, we adapt the recursive topdown approach of Le Charlier and van Hentenryck [6] which is enhanced by continuation based semi-naive iteration [3, 11].

In the following we assume that prior to solving the LLFP formula, all the clauses are transformed into a form such that all applied occurrences of variables Y𝒴Y\in\mathcal{Y} in preconditions, i.e. Y(u)Y(u), are not followed by their defining occurrences, i.e. R(u;Y)R(\vec{u};Y) and ¬R(u;Y)\neg R(\vec{u};Y). This is necessary to correctly perform late bindings of variables Y𝒴Y\in\mathcal{Y} in the presence of Y(u)Y(u) construct.

The algorithm operates on (intermediate) representations of the two interpretations $\varsigma$ and $\varrho$ of the semantics; in the following we call them env and result, respectively. The data structure env is supplied as a parameter to the functions of the algorithm and represents a partial environment. The data structure result is an imperative data structure that is updated as the computation progresses.

The partial environment env is implemented as a map from variables to optional values. An undefined variable is mapped to $\mathit{None}$. Otherwise, depending on its type, it is mapped to $\mathit{Some}(a)$ or $\mathit{Some}(l)$, meaning that the variable is bound to $a \in \mathcal{U}$ or $l \in \mathcal{L}_{\neq\bot}$, respectively. The main operation on env is the function unify, defined as follows

\[
\textsc{unify}(\beta, \mathtt{env}, (\vec{u};V), (\vec{a};l)) =
\begin{cases}
\emptyset & \text{if } \textsc{unify}_{\textsc{U}}(\mathtt{env},\vec{u},\vec{a}) = \text{fail} \\
\textsc{unify}_{\textsc{L}}(\beta,\mathtt{env}',V,l) & \text{if } \textsc{unify}_{\textsc{U}}(\mathtt{env},\vec{u},\vec{a}) = \mathtt{env}'
\end{cases}
\]

It uses two auxiliary functions that perform unification on each component of the relation. For the first component, which ranges over the universe $\mathcal{U}$, the function is given by

\[
\textsc{unify}_{\textsc{U}}(\mathtt{env},u,a) =
\begin{cases}
\mathtt{env} & \text{if } (u \in \mathcal{X} \wedge \mathtt{env}[u] = \mathit{Some}(a)) \vee u = a \\
\mathtt{env}[u \mapsto \mathit{Some}(a)] & \text{if } u \in \mathcal{X} \wedge \mathtt{env}[u] = \mathit{None} \\
\text{fail} & \text{otherwise}
\end{cases}
\]

It unifies an argument $u$ with an element $a \in \mathcal{U}$ in the environment env. If the unification succeeds, the (possibly extended) environment is returned; otherwise the function fails. The function is extended to $k$-tuples in a straightforward way. The unification function for the lattice component is given by
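The unification on the universe component can be sketched in Python as follows. This is only an illustration under stated assumptions: the class `Var`, the use of Python's `None` for the unbound case (with any other value `a` standing for $\mathit{Some}(a)$), and the function names `unify_u` and `unify_u_tuple` are ours and not part of the paper.

```python
from dataclasses import dataclass
from typing import Dict, Hashable, Optional, Sequence


@dataclass(frozen=True)
class Var:
    """A universe variable x in X (illustrative representation)."""
    name: str


# None stands for the paper's None; any other value a stands for Some(a).
Env = Dict[Var, Optional[Hashable]]


def unify_u(env: Env, u, a) -> Optional[Env]:
    """Unify argument u with universe element a; return None on failure."""
    if isinstance(u, Var):
        bound = env.get(u)
        if bound is None:                 # u is unbound: extend a copy of env
            return {**env, u: a}
        return env if bound == a else None
    return env if u == a else None        # u is a constant


def unify_u_tuple(env: Env, us: Sequence, elems: Sequence) -> Optional[Env]:
    """Componentwise extension of unify_u to k-tuples."""
    for u, a in zip(us, elems):
        env = unify_u(env, u, a)
        if env is None:
            return None
    return env
```

Returning a fresh dictionary instead of mutating `env` keeps environments persistent, which matters once the same environment is shared by several resumed computations.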

\[
\textsc{unify}_{\textsc{L}}(\beta,\mathtt{env},V,l) =
\begin{cases}
\{\mathtt{env}[V \mapsto \mathit{Some}(l \sqcap l_V)]\} & \text{if } V \in \mathcal{Y} \wedge \mathtt{env}[V] = \mathit{Some}(l_V) \wedge l \sqcap l_V \neq \bot \\
\{\mathtt{env}[V \mapsto \mathit{Some}(l)]\} & \text{if } V \in \mathcal{Y} \wedge \mathtt{env}[V] = \mathit{None} \wedge l \neq \bot \\
\{\mathtt{env}\} & \text{if } V = [u] \wedge ((u \in \mathcal{X} \wedge \mathtt{env}[u] = \mathit{Some}(a)) \vee u = a) \wedge \beta(a) \sqsubseteq l \\
\{\mathtt{env}[u \mapsto \mathit{Some}(a)] \mid \beta(a) \sqsubseteq l\} & \text{if } V = [u] \wedge u \in \mathcal{X} \wedge \mathtt{env}[u] = \mathit{None} \\
\emptyset & \text{otherwise}
\end{cases}
\]

The function is parametrized by $\beta : \mathcal{U} \rightarrow \mathcal{L}$, defined in Section 3. It unifies a lattice term $V$ with an element $l \in \mathcal{L}$ in the environment env. If the unification succeeds, the set of unified environments is returned; otherwise the function returns the empty set.
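A minimal Python sketch of the lattice unification is given below. It assumes a powerset lattice represented by frozensets (meet is intersection, $\bot$ is the empty set, $\sqsubseteq$ is the subset test) and a finite universe passed explicitly. The classes `Var`, `LVar` and `Abs` (for the term $[u]$) and the function name `unify_l` are hypothetical names chosen for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str        # universe variable x in X


@dataclass(frozen=True)
class LVar:
    name: str        # lattice variable Y


@dataclass(frozen=True)
class Abs:
    arg: object      # the lattice term [u], u a Var or a constant


def unify_l(beta, universe, env, V, l):
    """Return the list of environments unifying lattice term V with l.

    Assumes a powerset lattice of frozensets; None in env means unbound.
    """
    bottom = frozenset()
    if isinstance(V, LVar):
        bound = env.get(V)
        new_l = l if bound is None else (l & bound)   # meet with old binding
        return [] if new_l == bottom else [{**env, V: new_l}]
    if isinstance(V, Abs):
        u = V.arg
        if isinstance(u, Var) and env.get(u) is None:
            # u unbound: one environment per element whose abstraction fits l
            return [{**env, u: a} for a in universe if beta(a) <= l]
        a = env[u] if isinstance(u, Var) else u
        return [env] if beta(a) <= l else []
    return []                                          # the "otherwise" case
```

For instance, with a sign abstraction `beta` over the universe `[-1, 0, 1]`, unifying `Abs(Var("x"))` with `frozenset({"zero", "pos"})` yields two environments, binding `x` to `0` and to `1`.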

The other important operation on the partial environment is given by the function unifiable. When applied to env and a tuple $(\vec{u};V)$, it returns the set of tuples for which unify would succeed. It is defined by means of two auxiliary functions; formally we have

\[ \textsc{unifiable}(\mathtt{env},(\vec{u};V)) = (\textsc{unifiable}_{\textsc{U}}(\mathtt{env},\vec{u});\ \textsc{unifiable}_{\textsc{L}}(\mathtt{env},V)) \]

where

\[
\textsc{unifiable}_{\textsc{U}}(\mathtt{env},u) =
\begin{cases}
\{a\} & \text{if } (u \in \mathcal{X} \wedge \mathtt{env}[u] = \mathit{Some}(a)) \vee u = a \\
\mathcal{U} & \text{if } u \in \mathcal{X} \wedge \mathtt{env}[u] = \mathit{None}
\end{cases}
\]

and

\[
\textsc{unifiable}_{\textsc{L}}(\mathtt{env},V) =
\begin{cases}
l & \text{if } V \in \mathcal{Y} \wedge \mathtt{env}[V] = \mathit{Some}(l) \\
\top & \text{if } V \in \mathcal{Y} \wedge \mathtt{env}[V] = \mathit{None} \\
\beta(a) & \text{if } V = [u] \wedge (u = a \vee (u \in \mathcal{X} \wedge \mathtt{env}[u] = \mathit{Some}(a))) \\
\bigsqcup \{\beta(a) \mid a \in \mathcal{U}\} & \text{if } V = [u] \wedge u \in \mathcal{X} \wedge \mathtt{env}[u] = \mathit{None} \\
\llbracket f \rrbracket(l) & \text{if } V = f(\vec{V}) \wedge l = \textsc{unifiable}_{\textsc{L}}(\mathtt{env},\vec{V})
\end{cases}
\]

Both auxiliary functions are extended to $k$-tuples in a straightforward way.
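The two auxiliary functions can be sketched in Python as follows, again assuming a powerset lattice of frozensets (join is union) and an explicitly given finite universe. The classes `Var`, `LVar`, `Abs` and the function names are assumptions of this sketch, and the case of function application $f(\vec{V})$ is left out for brevity.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Var:
    name: str        # universe variable x in X


@dataclass(frozen=True)
class LVar:
    name: str        # lattice variable Y


@dataclass(frozen=True)
class Abs:
    arg: object      # the lattice term [u]


def unifiable_u(env, us, universe):
    """All universe tuples that unify with the argument tuple us."""
    choices = []
    for u in us:
        if isinstance(u, Var) and env.get(u) is None:
            choices.append(list(universe))            # unbound: any element
        else:
            choices.append([env[u] if isinstance(u, Var) else u])
    return list(product(*choices))


def unifiable_l(env, V, universe, beta, top):
    """Lattice element associated with term V (V is an LVar or an Abs)."""
    if isinstance(V, LVar):
        bound = env.get(V)
        return top if bound is None else bound
    u = V.arg
    if isinstance(u, Var) and env.get(u) is None:
        # join of the abstractions of all universe elements
        return frozenset().union(*(beta(a) for a in universe))
    return beta(env[u] if isinstance(u, Var) else u)
```

The k-tuple extension of the universe component is simply a Cartesian product of the per-argument candidate sets.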

The global data structure result, which is updated incrementally during the computation, is represented as a mapping from predicate names to prefix trees that, for each predicate $R$, record the tuples currently known to belong to $R$. There are three main operations on the data structure result: the operation result.has checks whether a given tuple is associated with a given predicate, the operation result.sub returns a list of the tuples associated with a given predicate, and the operation result.add adds a tuple to the interpretation of a given predicate.
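A prefix-tree representation of result can be sketched in Python as nested dictionaries. The class name `Result` and the end marker `_END` are assumptions of this sketch. It also simplifies `has` to exact membership of the tuple, whereas the paper does not spell out how the lattice component is compared.

```python
_END = object()   # marks the end of a stored tuple in the tree


class Result:
    """Maps predicate names to prefix trees of tuples (illustrative sketch)."""

    def __init__(self):
        self._trees = {}

    def add(self, pred, tup):
        node = self._trees.setdefault(pred, {})
        for component in tup:
            node = node.setdefault(component, {})
        node[_END] = True

    def has(self, pred, tup):
        node = self._trees.get(pred, {})
        for component in tup:
            if component not in node:
                return False
            node = node[component]
        return _END in node

    def sub(self, pred):
        """List all tuples currently stored for pred."""
        out = []

        def walk(node, prefix):
            for key, child in node.items():
                if key is _END:
                    out.append(tuple(prefix))
                else:
                    walk(child, prefix + [key])

        walk(self._trees.get(pred, {}), [])
        return out
```

Tuples sharing a prefix share the corresponding path in the tree, so a lookup costs time proportional to the arity of the predicate.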

Since $\varrho$ is updated as the algorithm progresses, it may happen that a query $R(\vec{v};V)$ inside a precondition fails to be satisfied at a given point in time, but may hold in the future when a new tuple $(\vec{a};l)$ is added to the interpretation of $R$. If we are not careful we may lose the consequences that adding $(\vec{a};l)$ to $R$ has on the contents of other predicates. This gives rise to the data structure infl, which records computations that have to be resumed for new tuples; these future computations are called consumers. The infl data structure is also represented as a mapping from predicate names to prefix trees that, for each predicate $R$, record the consumers that have to be resumed when the interpretation of $R$ is updated. There are two main operations on the data structure infl: the operation infl.register adds a new consumer for a given predicate, and infl.consumers returns all the consumers currently associated with a given predicate.
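The interplay between result and infl can be sketched in Python as follows. For brevity the sketch stores tuples in plain sets and consumers in plain lists rather than in prefix trees; the helper names `add_tuple` and `query` are assumptions made for the illustration.

```python
from collections import defaultdict


class Infl:
    """Maps predicate names to the consumers waiting for new tuples."""

    def __init__(self):
        self._consumers = defaultdict(list)

    def register(self, pred, consumer):
        self._consumers[pred].append(consumer)

    def consumers(self, pred):
        return list(self._consumers[pred])   # snapshot of current consumers


result = defaultdict(set)   # stand-in for the prefix trees of result
infl = Infl()


def add_tuple(pred, tup):
    """Add tup to pred and resume every consumer registered for pred."""
    if tup in result[pred]:
        return
    result[pred].add(tup)
    for consumer in infl.consumers(pred):
        consumer(tup)


def query(pred, consumer):
    """Register consumer first, then feed it the tuples known so far."""
    infl.register(pred, consumer)
    for tup in list(result[pred]):
        consumer(tup)
```

Because the consumer is registered before the known tuples are traversed, a tuple added later is still delivered to it, and a tuple that is already present is not delivered twice.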

In the algorithm, we have one function for each of the three syntactic categories. The function solve takes a clause sequence as input and calls the function execute on each of the individual clauses

\[ \textsc{solve}(cl_1,\ldots,cl_s) = \textsc{execute}(cl_1)[\,];\ \ldots;\ \textsc{execute}(cl_s)[\,] \]

where we write [ ] for the empty environment, reflecting that we have no free variables in the clause sequences.

Let us now turn to the description of the function execute. The function takes a clause $cl$ as a parameter and a representation env of the interpretation of the variables. We have one case for each of the forms of $cl$; the pseudo code is given in Figure 1.

execute($R(\vec{v};V)$) env =
    let iterFun $(\vec{a};l)$ =
        match result.has($R, (\vec{a};l)$) with
        | $true$ $\rightarrow$ ()
        | $false$ $\rightarrow$
            result.add($R, (\vec{a};l)$);
            iter (fun $f$ $\rightarrow$ $f\ (\vec{a};l)$) (infl.consumers $R$)
    in iter iterFun (unifiable(env, $(\vec{v};V)$))
execute($\mathbf{1}$) env = ()
execute($cl_1 \wedge cl_2$) env = execute($cl_1$) env; execute($cl_2$) env
execute($pre \Rightarrow cl$) env = check($pre$, execute($cl$)) env
execute($\forall x : cl$) env = execute($cl$) (env[$x \mapsto \mathit{None}$])
Figure 1: The execute function.

Let us explain the case of an assertion first. The algorithm uses the auxiliary function iter, which applies the function iterFun to each element of the list of tuples that can be unified with the argument $(\vec{v};V)$. Given a tuple $(\vec{a};l)$, the function iterFun adds the tuple to the interpretation of $R$ stored in result if it is not already present. If the tuple was added, we first create a list of all the consumers currently registered for predicate $R$ by calling the function infl.consumers. Thereafter, we resume the computations by iterating over the list of consumers and calling the corresponding continuations. The case of the always-true clause, $\mathbf{1}$, is straightforward; the function simply returns unit without performing any other action. In the case of a conjunction of clauses, the algorithm calls the execute function on both conjuncts with the current environment env. In the case of an implication, we make use of the function check which, in addition to the precondition and the environment, also takes the continuation $\textsc{execute}(cl)$ as an argument. In the case of universal quantification, we simply extend the environment to record that the value of the new variable is unknown and then recurse. The case of universal quantification over a variable $Y \in \mathcal{Y}$ is exactly the same and hence omitted.

Now, let us present the function check. It takes a precondition, a continuation and an environment as parameters. The pseudo code is given in Figure 2.

check($R(\vec{v};V)$, $next$) env =
    let consumer $(\vec{a};l)$ =
        match unify(env, $(\vec{v};V)$, $(\vec{a};l)$) with
        | fail $\rightarrow$ ()
        | envs $\rightarrow$ iter $next$ envs
    in infl.register($R$, consumer); iter consumer (result.sub $R$)
check($\neg R(\vec{v};V)$, $next$) env =
    let iterFun $(\vec{a};l)$ =
        match result.has($R, (\vec{a};l)$) with
        | $true$ $\rightarrow$ ()
        | $false$ $\rightarrow$ iter $next$ (unify(env, $(\vec{v};V)$, $(\vec{a};l)$))
    in iter iterFun (unifiable(env, $(\vec{v};V)$))
check($Y(x)$, $next$) env =
    let env' = if env($Y$) = $\mathit{Some}(l)$ then env else env[$Y \mapsto \top$]
    in let f a = if $\mathit{Some}(\beta(a)) \sqsubseteq$ env'($Y$) then $next$ env'[$x \mapsto a$] else ()
    in match env'($x$) with
        | $\mathit{Some}(a)$ $\rightarrow$ f a
        | $\mathit{None}$ $\rightarrow$ iter f $\mathcal{U}$
check($pre_1 \wedge pre_2$, $next$) env = check($pre_1$, check($pre_2$, $next$)) env
check($pre_1 \vee pre_2$, $next$) env = check($pre_1$, $next$) env; check($pre_2$, $next$) env
check($\exists x : pre$, $next$) env = check($pre$, $next \circ (\mathrm{remove}\ x)$) (env[$x \mapsto \mathit{None}$])
Figure 2: The check function.

In the case of positive queries we first ensure that the consumer is registered in infl, by calling the function register, so that future tuples associated with $R$ will be processed. Thereafter, the function inspects the data structure result to obtain the list of tuples associated with the predicate $R$. Then, the auxiliary function consumer unifies $(\vec{v};V)$ with each tuple; if the operation succeeds, the continuation $next$ is invoked on each of the updated environments in the returned set envs.

In the case of a negated query, the algorithm first computes the tuples unifiable with $(\vec{v};V)$ in the environment env. Then, for each tuple it checks whether the tuple is already in $R$ and, if not, the tuple is unified with $(\vec{v};V)$ to produce a set of new environments. Thereafter, the continuation $next$ is evaluated in each of the environments contained in the returned set. Notice that in the case of negative queries we do not register a consumer for the relation $R$. This is because the stratification condition introduced in Definition 3 ensures that the relation is fully evaluated before it is queried negatively. Thus, there is no need to register future computations, since the interpretation of $R$ will not change.

Now, let us consider the function check in the case of $Y(x)$, where $x \in \mathcal{X}$. The function begins by creating an environment env' that is exactly as env except that the binding for the variable $Y$ is set to $\top$ in the case $Y$ is undefined in env. Then, we define an auxiliary function f that checks whether env'($Y$) over-approximates the abstraction of an argument $a$, denoted by $\beta(a)$, and if so calls the continuation in the environment env'[$x \mapsto a$]. Finally, the function checks the binding for the variable $x$ in the environment env'; if it is bound to $\mathit{Some}(a)$, the function f is applied to $a$. Otherwise, the function f is called for each element of the universe, using the iter function. The case of $Y(a)$, where $a \in \mathcal{U}$, is essentially the same as the case explained above, except that we do not have to handle the case when $x \in \mathcal{X}$ is undefined in env.

For a conjunction of preconditions we exploit a continuation-passing programming style. More precisely, we call the check function for the precondition $pre_1$, and as a continuation we pass a call to the check function partially applied to the precondition $pre_2$ and the continuation $next$. In the case of a disjunction of preconditions, the function simply checks the preconditions $pre_1$ and $pre_2$ in turn in the current environment env. In order to be efficient we use memoization; this means that if both checks yield the same bindings of variables, the second check does not need to invoke the continuation, as this has already been done.

The algorithm for existential quantification checks the precondition $pre$ in the environment extended with the quantified variable. The continuation that is passed is the composition of the functions $next$ and remove $x$, where the function remove removes the variable passed as its first argument from the environment passed as its second argument. In order to be efficient we again use memoization to avoid redundant computations. The case of existential quantification over a variable $Y \in \mathcal{Y}$ is exactly the same and hence omitted.
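The continuation-passing structure of execute and check can be sketched in Python for a restricted fragment: positive queries, conjunction, implication, universal quantification and assertions, with relations over the universe only (no lattice component). The tuple-based clause encoding, the class name `Solver` and the helper `forall` are assumptions of this sketch. It further assumes that variable names never coincide with constants and that every variable in an assertion is bound when the assertion is reached, so the unifiable step is not needed.

```python
from collections import defaultdict


class Solver:
    def __init__(self):
        self.result = defaultdict(set)    # predicate -> known tuples
        self.infl = defaultdict(list)     # predicate -> registered consumers

    @staticmethod
    def unify(env, args, tup):
        env = dict(env)
        for u, a in zip(args, tup):
            if u in env:                  # u is a quantified variable
                if env[u] is None:
                    env[u] = a
                elif env[u] != a:
                    return None
            elif u != a:                  # u is a constant
                return None
        return env

    def check(self, pre, next_):
        if pre[0] == "and":               # check(pre1, check(pre2, next))
            _, pre1, pre2 = pre
            return self.check(pre1, self.check(pre2, next_))
        _, pred, args = pre               # ("query", R, args)

        def run(env):
            def consumer(tup):
                new_env = self.unify(env, args, tup)
                if new_env is not None:
                    next_(new_env)
            self.infl[pred].append(consumer)      # register before iterating
            for tup in list(self.result[pred]):
                consumer(tup)
        return run

    def execute(self, cl):
        kind = cl[0]
        if kind == "assert":
            _, pred, args = cl

            def run(env):
                tup = tuple(env[u] if u in env else u for u in args)
                if tup not in self.result[pred]:
                    self.result[pred].add(tup)
                    for consumer in list(self.infl[pred]):
                        consumer(tup)             # resume waiting computations
            return run
        if kind == "implies":
            _, pre, body = cl
            return self.check(pre, self.execute(body))
        _, x, body = cl                   # ("forall", x, body)
        inner = self.execute(body)
        return lambda env: inner({**env, x: None})

    def solve(self, clauses):
        for cl in clauses:
            self.execute(cl)({})


def forall(variables, body):
    for v in reversed(variables):
        body = ("forall", v, body)
    return body


# Transitive closure T of an edge relation E, with the rules given before the facts.
rules = [
    forall(["x", "y"], ("implies", ("query", "E", ("x", "y")),
                        ("assert", "T", ("x", "y")))),
    forall(["x", "y", "z"], ("implies",
                             ("and", ("query", "E", ("x", "y")),
                                     ("query", "T", ("y", "z"))),
                             ("assert", "T", ("x", "z")))),
]
facts = [("assert", "E", (1, 2)), ("assert", "E", (2, 3)), ("assert", "E", (3, 4))]
solver = Solver()
solver.solve(rules + facts)
```

In this sketch the rules are executed before any edge is known, so every tuple of `T` is produced by resuming registered consumers; solving `facts + rules` instead gives the same relation.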

6 Conclusions and Future Work

In this paper we introduced the LLFP logic, an expressive formalism for specifying static analysis problems. It lifts the limitation of logics such as Datalog and ALFP by allowing interpretations over complete lattices satisfying the Ascending Chain Condition. Thanks to the declarative style, analysis specifications are easy to analyse for correctness.

We established a Moore Family result that guarantees that there always is a unique best solution for LLFP formulae. More generally, this ensures that the approach falls within the general Abstract Interpretation framework. We also developed a solving algorithm for LLFP, written in continuation-passing style, that represents relations as prefix trees. We showed that the logic and the associated solver can be used for rapid prototyping of sophisticated static analyses by presenting a formulation of interval analysis.

As future work we plan to implement a front-end that automatically extracts analysis relations from program source code, and to perform experiments on real-world programs in order to evaluate the performance of the LLFP solver. Furthermore, we would like to lift the Ascending Chain Condition and use, e.g., a widening operator [9, 10] in order to ensure termination of the least fixed point computation.

References

  • [1] K. R. Apt, H. A. Blair, and A. Walker. Towards a theory of declarative knowledge. In Foundations of Deductive Databases and Logic Programming., pages 89–148. Morgan Kaufmann, 1988.
  • [2] C. Baier and J.-P. Katoen. Principles of Model Checking (Representation and Mind Series). The MIT Press, 2008.
  • [3] I. Balbin and K. Ramamohanarao. A generalization of the differential approach to recursive query evaluation. Journal of Logic Programming, 4(3):259–262, 1987.
  • [4] M. Bravenboer and Y. Smaragdakis. Strictly declarative specification of sophisticated points-to analyses. In S. Arora and G. T. Leavens, editors, OOPSLA, pages 243–262. ACM, 2009.
  • [5] A. K. Chandra and D. Harel. Computable queries for relational data bases (preliminary report). In STOC, pages 309–318, 1979.
  • [6] B. Le Charlier and P. Van Hentenryck. A universal top-down fixpoint algorithm. Technical report, CS-92-25, Brown University, 1992.
  • [7] P. Cousot and R. Cousot. Abstract interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixpoints. In POPL, pages 238–252, 1977.
  • [8] P. Cousot and R. Cousot. Systematic design of program analysis frameworks. In POPL, pages 269–282, 1979.
  • [9] P. Cousot and R. Cousot. Abstract interpretation and application to logic programs. J. Log. Program., 13(2&3):103–179, 1992.
  • [10] P. Cousot and R. Cousot. Comparing the galois connection and widening/narrowing approaches to abstract interpretation. In M. Bruynooghe and M. Wirsing, editors, PLILP, volume 631 of Lecture Notes in Computer Science, pages 269–295. Springer, 1992.
  • [11] C. Fecht and H. Seidl. Propagating differences: An efficient new fixpoint algorithm for distributive constraint systems. Nordic Journal of Computing, 5(4):304–329, 1998.
  • [12] C. Fecht and H. Seidl. A faster solver for general systems of equations. Sci. Comput. Program., 35(2):137–161, 1999.
  • [13] J. B. Kam and J. D. Ullman. Monotone data flow analysis frameworks. Acta Inf., 7:305–317, 1977.
  • [14] G. A. Kildall. A unified approach to global program optimization. In POPL, pages 194–206, 1973.
  • [15] M. S. Lam, J. Whaley, V. B. Livshits, M. C. Martin, D. Avots, M. Carbin, and C. Unkel. Context-sensitive program analysis as database queries. In PODS, pages 1–12, 2005.
  • [16] F. Nielson, H. R. Nielson, and C. Hankin. Principles of Program Analysis. Springer-Verlag New York, Inc., Secaucus, NJ, USA, 1999.
  • [17] F. Nielson, H. R. Nielson, H. Sun, M. Buchholtz, R. R. Hansen, H. Pilegaard, and H. Seidl. The succinct solver suite. In TACAS, pages 251–265, 2004.
  • [18] F. Nielson, H. Seidl, and H. R. Nielson. A Succinct Solver for ALFP. Nord. J. Comput., 9(4):335–372, 2002.
  • [19] T. W. Reps. Demand interprocedural program analysis using logic databases. In Workshop on Programming with Logic Databases (Book), ILPS, pages 163–196, 1993.
  • [20] J. Whaley, D. Avots, M. Carbin, and M. S. Lam. Using datalog with binary decision diagrams for program analysis. In APLAS, pages 97–118, 2005.
  • [21] J. Whaley and M. S. Lam. Cloning-based context-sensitive pointer alias analysis using binary decision diagrams. In PLDI, pages 131–144, 2004.

These appendices are not intended for publication and references to them will be removed in the final version.

Appendix 0.A Proof of Lemma 1

Proof

Reflexivity: $\forall \varrho \in \Delta : \varrho \preceq \varrho$.

To show that $\varrho \preceq \varrho$, let us take $j = s$. If $\mathrm{rank}(R) < j$ then $\varrho(R) = \varrho(R)$ as required. Otherwise, if $\mathrm{rank}(R) = j$, then from $\varrho(R) = \varrho(R)$ we get $\varrho(R) \sqsubseteq \varrho(R)$. Thus we get the required $\varrho \preceq \varrho$.

Transitivity: $\forall \varrho_1, \varrho_2, \varrho_3 \in \Delta : \varrho_1 \preceq \varrho_2 \wedge \varrho_2 \preceq \varrho_3 \Rightarrow \varrho_1 \preceq \varrho_3$.

Let us assume that $\varrho_1 \preceq \varrho_2 \wedge \varrho_2 \preceq \varrho_3$. From $\varrho_i \preceq \varrho_{i+1}$ we have $j_i$ such that conditions (a)–(c) are fulfilled, for $i = 1, 2$. Let us take $j$ to be the minimum of $j_1$ and $j_2$. Now we need to verify that conditions (a)–(c) hold for $j$. If $\mathrm{rank}(R) < j$ we have $\varrho_1(R) = \varrho_2(R)$ and $\varrho_2(R) = \varrho_3(R)$. It follows that $\varrho_1(R) = \varrho_3(R)$, hence (a) holds. Now let us assume that $\mathrm{rank}(R) = j$. We have $\varrho_1(R) \sqsubseteq \varrho_2(R)$ and $\varrho_2(R) \sqsubseteq \varrho_3(R)$, and from transitivity of $\sqsubseteq$ we get $\varrho_1(R) \sqsubseteq \varrho_3(R)$, which gives (b). Let us now assume that $j \neq s$, hence $\varrho_i(R) \sqsubset \varrho_{i+1}(R)$ for some $R \in \mathcal{R}$ and for $i = 1$ or $i = 2$. Without loss of generality let us assume that $\varrho_1(R) \sqsubset \varrho_2(R)$. We have $\varrho_1(R) \sqsubset \varrho_2(R)$ and $\varrho_2(R) \sqsubseteq \varrho_3(R)$, hence $\varrho_1(R) \sqsubset \varrho_3(R)$, and (c) holds.

Anti-symmetry: $\forall \varrho_1, \varrho_2 \in \Delta : \varrho_1 \preceq \varrho_2 \wedge \varrho_2 \preceq \varrho_1 \Rightarrow \varrho_1 = \varrho_2$.

Let us assume $\varrho_1 \preceq \varrho_2$ and $\varrho_2 \preceq \varrho_1$. Let $j$ be minimal such that $\mathrm{rank}(R) = j$ and $\varrho_1(R) \neq \varrho_2(R)$ for some $R \in \mathcal{R}$. Then, since $\mathrm{rank}(R) = j$, we have $\varrho_1(R) \sqsubseteq \varrho_2(R)$ and $\varrho_2(R) \sqsubseteq \varrho_1(R)$. Hence $\varrho_1(R) = \varrho_2(R)$, which is a contradiction. Thus it must be the case that $\varrho_1(R) = \varrho_2(R)$ for all $R \in \mathcal{R}$. ∎

Appendix 0.B Proof of Lemma 2

Proof

First we prove that $\bigsqcap_{\Delta} M$ is a lower bound of $M$; that is, $\bigsqcap_{\Delta} M \preceq \varrho$ for all $\varrho \in M$. Let $j$ be maximal such that $\varrho \in M_j$; since $M = M_0$ and $M_j \supseteq M_{j+1}$, clearly such $j$ exists. From the definition of $M_j$ it follows that $(\bigsqcap_{\Delta} M)(R) = \varrho(R)$ for all $R$ with $\mathrm{rank}(R) < j$; hence (a) holds. If $\mathrm{rank}(R) = j$ we have $(\bigsqcap_{\Delta} M)(R) = \lambda\vec{a}.\bigsqcap\{\varrho'(R)(\vec{a}) \mid \varrho' \in M_j\} \sqsubseteq \varrho(R)$, showing that (b) holds. Finally, let us assume that $j \neq s$; we need to show that there is some $R$ with $\mathrm{rank}(R) = j$ such that $(\bigsqcap_{\Delta} M)(R) \sqsubset \varrho(R)$. Since we know that $j$ is maximal such that $\varrho \in M_j$, it follows that $\varrho \notin M_{j+1}$, hence there is a relation $R$ with $\mathrm{rank}(R) = j$ such that $(\bigsqcap_{\Delta} M)(R) \sqsubset \varrho(R)$; thus (c) holds.

Now we need to show that $\bigsqcap_{\Delta} M$ is the greatest lower bound. Let us assume that $\varrho' \preceq \varrho$ for all $\varrho \in M$, and let us show that $\varrho' \preceq \bigsqcap_{\Delta} M$. If $\varrho' = \bigsqcap_{\Delta} M$ the result holds trivially by reflexivity, hence let us assume $\varrho' \neq \bigsqcap_{\Delta} M$. Then there exists a minimal $j$ such that $(\bigsqcap_{\Delta} M)(R) \neq \varrho'(R)$ for some $R$ with $\mathrm{rank}(R) = j$. Let us first consider $R$ such that $\mathrm{rank}(R) < j$. By our choice of $j$ we have $(\bigsqcap_{\Delta} M)(R) = \varrho'(R)$, hence (a) holds. Next assume that $\mathrm{rank}(R) = j$. Since we assumed that $\varrho' \preceq \varrho$ for all $\varrho \in M$ and $M_j \subseteq M$, it follows that $\varrho'(R) \sqsubseteq \varrho(R)$ for all $\varrho \in M_j$. Thus we have $\varrho'(R) \sqsubseteq \lambda\vec{a}.\bigsqcap\{\varrho(R)(\vec{a}) \mid \varrho \in M_j\}$. Since $(\bigsqcap_{\Delta} M)(R) = \lambda\vec{a}.\bigsqcap\{\varrho(R)(\vec{a}) \mid \varrho \in M_j\}$, we have $\varrho'(R) \sqsubseteq (\bigsqcap_{\Delta} M)(R)$, which proves (b). Finally, since we assumed that $\varrho'(R) \neq (\bigsqcap_{\Delta} M)(R)$ for some $R$ with $\mathrm{rank}(R) = j$, it follows that (c) holds. Thus we proved that $\varrho' \preceq \bigsqcap_{\Delta} M$. ∎

Appendix 0.C Proof of Proposition 1

In order to prove Proposition 1 we first state and prove two auxiliary lemmas.

Lemma 3

If $\varrho = \bigsqcap_{\Delta} M$, $pre$ occurs in $cl_j$ and $(\varrho,\varsigma) \models_\beta pre$, then also $(\varrho',\varsigma) \models_\beta pre$ for all $\varrho' \in M_j$.

Proof

We proceed by induction on $j$ and in each case perform a structural induction on the form of the precondition $pre$ occurring in $cl_j$.
Case: $pre = R(\vec{u};V)$
Let us take $\varrho = \bigsqcap_{\Delta} M$ and assume that

\[ (\varrho,\varsigma) \models_\beta R(\vec{u};V) \]

From Table 1 we have:

\[ \varrho(R)(\varsigma(\vec{u})) \sqsupseteq \varsigma(V) \]

Depending on the rank of $R$ we have two cases. If $\mathrm{rank}(R) = j$ then $\varrho(R) = \lambda\vec{a}.\bigsqcap\{\varrho'(R)(\vec{a}) \mid \varrho' \in M_j\}$ and hence we have

\[ \bigsqcap\{\varrho'(R)(\varsigma(\vec{u})) \mid \varrho' \in M_j\} \sqsupseteq \varsigma(V) \]

It follows that for all $\varrho' \in M_j$

\[ \varrho'(R)(\varsigma(\vec{u})) \sqsupseteq \varsigma(V) \]

Now if $\mathrm{rank}(R) < j$ then $\varrho(R) = \varrho'(R)$ for all $\varrho' \in M_j$, hence we have that for all $\varrho' \in M_j$

\[ \varrho'(R)(\varsigma(\vec{u})) \sqsupseteq \varsigma(V) \]

which according to Table 1 is equivalent to

\[ \forall \varrho' \in M_j : (\varrho',\varsigma) \models_\beta R(\vec{u};V) \]

which was required and finishes the case.
Case: $pre = Y(u)$
Let us take $\varrho = \bigsqcap_{\Delta} M$ and assume that

\[ (\varrho,\varsigma) \models_\beta Y(u) \]

According to the semantics of LLFP in Table 1 we have

\[ \beta(\varsigma(u)) \sqsubseteq \varsigma(Y) \]

It follows that

\[ \forall \varrho' \in M_j : \beta(\varsigma(u)) \sqsubseteq \varsigma(Y) \]

which according to the semantics of LLFP in Table 1 is equivalent to

\[ \forall \varrho' \in M_j : (\varrho',\varsigma) \models_\beta Y(u) \]

which was required and finishes the case.
Case: $pre = \neg R(\vec{u};V)$
Let us take $\varrho = \bigsqcap_{\Delta} M$ and assume that

\[ (\varrho,\varsigma) \models_\beta \neg R(\vec{u};V) \]

From Table 1 we have:

\[ \complement(\varrho(R)(\varsigma(\vec{u}))) \sqsupseteq \varsigma(V) \]

Since $\mathrm{rank}(R) < j$, we know that $\varrho(R) = \varrho'(R)$ for all $\varrho' \in M_j$, hence we have that

\[ \forall \varrho' \in M_j : \complement(\varrho'(R)(\varsigma(\vec{u}))) \sqsupseteq \varsigma(V) \]

which according to Table 1 is equivalent to

\[ \forall \varrho' \in M_j : (\varrho',\varsigma) \models_\beta \neg R(\vec{u};V) \]

which was required and finishes the case.
Case: $pre = pre_1 \wedge pre_2$
Let us take $\varrho = \bigsqcap_{\Delta} M$ and assume that

\[ (\varrho,\varsigma) \models_\beta pre_1 \wedge pre_2 \]

According to Table 1 we have

\[ (\varrho,\varsigma) \models_\beta pre_1 \]

and

\[ (\varrho,\varsigma) \models_\beta pre_2 \]

From the induction hypothesis we get that for all $\varrho' \in M_j$

\[ (\varrho',\varsigma) \models_\beta pre_1 \]

and

\[ (\varrho',\varsigma) \models_\beta pre_2 \]

It follows that for all $\varrho' \in M_j$

\[ (\varrho',\varsigma) \models_\beta pre_1 \wedge pre_2 \]

which was required and finishes the case.
Case: $pre = pre_1 \vee pre_2$
Let us take $\varrho = \bigsqcap_{\Delta} M$ and assume that

\[ (\varrho,\varsigma) \models_\beta pre_1 \vee pre_2 \]

According to Table 1 we have

\[ (\varrho,\varsigma) \models_\beta pre_1 \]

or

\[ (\varrho,\varsigma) \models_\beta pre_2 \]

From the induction hypothesis we get that for all $\varrho' \in M_j$

\[ (\varrho',\varsigma) \models_\beta pre_1 \]

or

\[ (\varrho',\varsigma) \models_\beta pre_2 \]

It follows that for all $\varrho' \in M_j$

\[ (\varrho',\varsigma) \models_\beta pre_1 \vee pre_2 \]

which was required and finishes the case.
Case: $pre = \exists x : pre'$
Let us take $\varrho = \bigsqcap_{\Delta} M$ and assume that

\[ (\varrho,\varsigma) \models_\beta \exists x : pre' \]

According to Table 1 we have

\[ \exists a \in \mathcal{U} : (\varrho,\varsigma[x \mapsto a]) \models_\beta pre' \]

From the induction hypothesis we get that for all $\varrho' \in M_j$

\[ \exists a \in \mathcal{U} : (\varrho',\varsigma[x \mapsto a]) \models_\beta pre' \]

It follows from Table 1 that for all $\varrho' \in M_j$

\[ (\varrho',\varsigma) \models_\beta \exists x : pre' \]

which was required and finishes the case.
Case: $pre = \exists Y : pre'$
Let us take $\varrho = \bigsqcap_{\Delta} M$ and assume that

\[ (\varrho,\varsigma) \models_\beta \exists Y : pre' \]

According to Table 1 we have

\[ \exists l \in \mathcal{L}_{\neq\bot} : (\varrho,\varsigma[Y \mapsto l]) \models_\beta pre' \]

From the induction hypothesis we get that for all $\varrho' \in M_j$

\[ \exists l \in \mathcal{L}_{\neq\bot} : (\varrho',\varsigma[Y \mapsto l]) \models_\beta pre' \]

It follows from Table 1 that for all $\varrho' \in M_j$

\[ (\varrho',\varsigma) \models_\beta \exists Y : pre' \]

which was required and finishes the case. ∎

Lemma 4

If $\varrho = \bigsqcap_{\Delta} M$ and $(\varrho',\zeta,\varsigma) \models_\beta cl_j$ for all $\varrho' \in M$, then $(\varrho,\zeta,\varsigma) \models_\beta cl_j$.

Proof

We proceed by induction on $j$ and in each case perform a structural induction on the form of the clause occurring in $cl_j$.
Case: $cl_j = R(\vec{u};V)$
Assume that for all $\varrho' \in M$

\[ (\varrho',\zeta,\varsigma) \models_\beta R(\vec{u};V) \]

From the semantics of LLFP we have that for all $\varrho' \in M$

\[ \varrho'(R)(\llbracket \vec{u} \rrbracket(\zeta,\varsigma)) \sqsupseteq \llbracket V \rrbracket(\zeta,\varsigma) \]

It follows that:

\[ \bigsqcap\{\varrho'(R)(\llbracket \vec{u} \rrbracket(\zeta,\varsigma)) \mid \varrho' \in M\} \sqsupseteq \llbracket V \rrbracket(\zeta,\varsigma) \]

Since $M_j \subseteq M$, we have:

\[ \bigsqcap\{\varrho'(R)(\llbracket \vec{u} \rrbracket(\zeta,\varsigma)) \mid \varrho' \in M_j\} \sqsupseteq \llbracket V \rrbracket(\zeta,\varsigma) \]

We know that $\mathrm{rank}(R) = j$; hence $\varrho(R) = \lambda\vec{a}.\bigsqcap\{\varrho'(R)(\vec{a}) \mid \varrho' \in M_j\}$; thus

\[ \varrho(R)(\llbracket \vec{u} \rrbracket(\zeta,\varsigma)) = \bigsqcap\{\varrho'(R)(\llbracket \vec{u} \rrbracket(\zeta,\varsigma)) \mid \varrho' \in M_j\} \sqsupseteq \llbracket V \rrbracket(\zeta,\varsigma) \]

which according to Table 1 is equivalent to

\[ (\varrho,\zeta,\varsigma) \models_\beta R(\vec{u};V) \]

Case: $cl_{j}=cl_{1}\wedge cl_{2}$
Assume that for all $\varrho'\in M$:

\[ (\varrho',\zeta,\varsigma)\models_{\beta}cl_{1}\wedge cl_{2} \]

By Table 1, this is equivalent to

\[ (\varrho',\zeta,\varsigma)\models_{\beta}cl_{1}\text{ and }(\varrho',\zeta,\varsigma)\models_{\beta}cl_{2} \]

The induction hypothesis gives that

\[ (\varrho,\zeta,\varsigma)\models_{\beta}cl_{1}\text{ and }(\varrho,\zeta,\varsigma)\models_{\beta}cl_{2} \]

which, according to Table 1, is equivalent to

\[ (\varrho,\zeta,\varsigma)\models_{\beta}cl_{1}\wedge cl_{2} \]

and finishes the case.
Case: $cl_{j}=pre\Rightarrow cl$
Assume that for all $\varrho'\in M$:

\[ (\varrho',\zeta,\varsigma)\models_{\beta}pre\Rightarrow cl \tag{1} \]

We have two cases. In the first one $(\varrho,\varsigma)\models_{\beta}pre$ does not hold, hence $(\varrho,\zeta,\varsigma)\models_{\beta}pre\Rightarrow cl$ holds trivially. In the second case let us assume:

\[ (\varrho,\varsigma)\models_{\beta}pre \tag{2} \]

Lemma 3 gives that for all $\varrho'\in M_{j}$

\[ (\varrho',\varsigma)\models_{\beta}pre \]

From (1) we have that for all $\varrho'\in M_{j}$

\[ (\varrho',\zeta,\varsigma)\models_{\beta}cl \]

and the induction hypothesis gives:

\[ (\varrho,\zeta,\varsigma)\models_{\beta}cl \]

Hence from (2) we get:

\[ (\varrho,\zeta,\varsigma)\models_{\beta}pre\Rightarrow cl \]

which was required and finishes the case.
Case: $cl_{j}=\forall x:cl$
Assume that for all $\varrho'\in M$

\[ (\varrho',\zeta,\varsigma)\models_{\beta}\forall x:cl \]

From Table 1 we have that for all $\varrho'\in M$ and for all $a\in\mathcal{U}$

\[ (\varrho',\zeta,\varsigma[x\mapsto a])\models_{\beta}cl \]

Thus from the induction hypothesis we get that for all $a\in\mathcal{U}$

\[ (\varrho,\zeta,\varsigma[x\mapsto a])\models_{\beta}cl \]

According to Table 1, this is equivalent to

\[ (\varrho,\zeta,\varsigma)\models_{\beta}\forall x:cl \]

which was required and finishes the case.
Case: $cl_{j}=\forall Y:cl$
Assume that for all $\varrho'\in M$

\[ (\varrho',\zeta,\varsigma)\models_{\beta}\forall Y:cl \]

From Table 1 we have that for all $\varrho'\in M$

\[ \forall l\in\mathcal{L}_{\neq\bot}:(\varrho',\zeta,\varsigma[Y\mapsto l])\models_{\beta}cl \]

Thus from the induction hypothesis we get that

\[ \forall l\in\mathcal{L}_{\neq\bot}:(\varrho,\zeta,\varsigma[Y\mapsto l])\models_{\beta}cl \]

According to Table 1, this is equivalent to

\[ (\varrho,\zeta,\varsigma)\models_{\beta}\forall Y:cl \]

which was required and finishes the case.∎
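
The base case $cl_{j}=R(\vec{u};V)$ of the proof rests on a basic lattice fact: if every element of a set lies above a bound, then so does the greatest lower bound of the set. The following minimal Python sketch illustrates this step only. The powerset lattice, the names `TOP`, `leq`, `meet`, and the sample values are assumptions made for illustration and are not part of LLFP.

```python
from functools import reduce

# Hypothetical lattice for illustration: the powerset of {"a", "b", "c"},
# ordered by inclusion, with intersection as greatest lower bound.
TOP = frozenset({"a", "b", "c"})


def leq(x, y):
    """Lattice ordering: x is below y when x is a subset of y."""
    return x <= y


def meet(values):
    """Greatest lower bound of a collection; the empty meet is TOP."""
    return reduce(frozenset.intersection, values, TOP)


# Assumed values rho'(R)(a) for three interpretations rho' at one fixed argument.
candidate_values = [
    frozenset({"a", "b"}),
    frozenset({"a", "c"}),
    frozenset({"a", "b", "c"}),
]

# Assumed value of the lattice term V under the same environments.
bound = frozenset({"a"})

# Every interpretation satisfies R(u; V) at this argument ...
every_value_above = all(leq(bound, v) for v in candidate_values)

# ... and therefore so does their greatest lower bound.
glb = meet(candidate_values)
meet_above = leq(bound, glb)
```

In the lemma the meet ranges over $M_{j}$ rather than $M$; since $M_{j}\subseteq M$, the same argument applies to the smaller set.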

Proposition 1. Assume $cls$ is a stratified LLFP clause sequence, $\varsigma_{0}$ and $\zeta_{0}$ are interpretations of free variables and function symbols in $cls$, respectively. Furthermore, $\varrho_{0}$ is an interpretation of all relations of rank 0. Then $\{\varrho\mid(\varrho,\zeta_{0},\varsigma_{0})\models_{\beta}cls\wedge\forall R:\mbox{\rm rank}(R)=0\Rightarrow\varrho_{0}(R)\sqsubseteq\varrho(R)\}$ is a Moore family.

Proof

The result follows from Lemma 4. ∎
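
To make the Moore family property concrete, the following Python sketch enumerates all solutions of a tiny constraint system and checks that they are closed under greatest lower bounds, including the empty one. It is a simplified illustration under stated assumptions: interpretations are pairs of values for two hypothetical relations `R` and `S` at a single argument, the lattice is the powerset of {0, 1}, and both relations are placed in one stratum, so the meet is taken pointwise. The constraints, written here as `R ⊒ {0}` and `S ⊒ R`, are invented for the example.

```python
from itertools import chain, combinations, product

# Hypothetical lattice: the powerset of {0, 1} ordered by inclusion.
UNIVERSE = frozenset({0, 1})
LATTICE = [
    frozenset(c)
    for k in range(len(UNIVERSE) + 1)
    for c in combinations(sorted(UNIVERSE), k)
]


def satisfies(rho):
    """Assumed constraints: R is above {0}, and S is above R."""
    r, s = rho
    return frozenset({0}) <= r and r <= s


def meet(rhos):
    """Pointwise greatest lower bound; the empty meet is the top pair."""
    r, s = UNIVERSE, UNIVERSE
    for ri, si in rhos:
        r, s = r & ri, s & si
    return (r, s)


def is_moore_family(family):
    """Check closure under meets of all subsets, including the empty one."""
    members = set(family)
    subsets = chain.from_iterable(
        combinations(family, k) for k in range(len(family) + 1)
    )
    return all(meet(sub) in members for sub in subsets)


# All interpretations that satisfy the assumed constraints.
SOLUTIONS = [rho for rho in product(LATTICE, LATTICE) if satisfies(rho)]

# The least solution is the meet of all solutions.
LEAST = meet(SOLUTIONS)

# A family that is not closed under meets, for contrast.
NOT_MOORE = [
    (frozenset({0}), frozenset({0, 1})),
    (frozenset({0, 1}), frozenset({0})),
]
```

Here `is_moore_family(SOLUTIONS)` holds and `LEAST` is itself a solution, which mirrors why the proposition yields a unique least model. With several strata the meet $\bigsqcap_{\Delta}$ is no longer pointwise, which this sketch does not model.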