
A separation logic for sequences
in pointer programs and its decidability

Tianyue Cao1, Bowen Zhang1, Zhao Jin3, Yongzhi Cao1 and Hanpin Wang1,2
1 Key Laboratory of High Confidence Software Technologies (MOE),
School of Computer Science, Peking University, Beijing, China
2 School of Computer Science and Cyber Engineering,
Guangzhou University, Guangzhou, China
3 School of Computer and Artificial Intelligence,
Zhengzhou University, Zhengzhou, China
tycao@stu.pku.edu.cn, zhangbowen@pku.edu.cn,
jinzhao@zzu.edu.cn, caoyz@pku.edu.cn, whpxhy@pku.edu.cn
Abstract

Separation logic and its variants can describe various properties of pointer programs. However, properties of sequences are hard to formalize in these logics. To deal with properties of variable-length sequences and multilevel data structures, we propose sequence-heap separation logic, which integrates sequences into logical reasoning about heap-manipulating programs. The logic adds two new constructs: quantifiers over sequence variables, and singleton heaps that store sequences (sequence singleton heaps). Further, we study the satisfiability problem of two fragments. The propositional fragment of sequence-heap separation logic is decidable, and the fragment with 2 alternations on program variables and 1 alternation on sequence variables is undecidable. In addition, we explore the boundaries between decidable and undecidable fragments of the logic in prenex normal form.

Index Terms:
separation logic, predicate logic, higher-order, sequence, decidability

I Introduction

The classical separation logic proposed by Reynolds[1] is widely used to describe and verify properties of heap-manipulating programs. It has evolved into many variants: object-oriented separation logic[2], quantitative separation logic[3], higher-order separation logic[4], separation logic with inductive predicates[5], etc. Its separating conjunction $*$ and separating implication $-\!\!*$ provide much more succinct expressions on mutable data structures.

Sequences are also widely used in programs. Many data structures can be abstracted to sequences, including $\mathtt{vector}$ in C++ and $\mathtt{ArrayList}$ in Java. There are many string solvers on sequences, such as CVC4[6] and Z3str3[7]. Sequence predicate logic[8] introduces sequence variables and sequence functions, and allows quantifiers over sequence variables. With sequence variables, a sort function can be defined elegantly.

However, to the best of our knowledge, there is no integrated formal system for logical reasoning about programs with both heaps and sequences. The example proposed in [1] provides a way to formally verify whether a program implements list reversal. The program aims to reverse the sequence $\alpha_{0}$ in $x$ and put it into $y$. The invariant of this program is written as follows:

$$\exists\alpha\exists\beta.\;\bigl((\mathtt{ls}(x,\alpha)*\mathtt{ls}(y,\beta))\land\alpha_{0}^{\dagger}==\alpha^{\dagger}\circ\beta\bigr), \tag{1}$$

where $\alpha^{\dagger}$ denotes the reversal of the sequence $\alpha$, and the term $\alpha\circ\beta$ denotes the concatenation of the two sequences $\alpha$ and $\beta$. The predicate $\mathtt{ls}(x,\alpha)$ denotes a singly-linked list starting from $x$, which can be inductively defined as follows:

$$\begin{aligned}
\mathtt{ls}(x,\varepsilon) &\quad\overset{\triangle}{=}\quad x=\mathbf{nil}, \\
\mathtt{ls}(x,n\circ\alpha) &\quad\overset{\triangle}{=}\quad \exists y.\;x\mapsto y\circ n*\mathtt{ls}(y,\alpha).
\end{aligned} \tag{2}$$

The property defined in Equation 1 is an instance of properties of variable-length sequences in pointer programs. Such properties arise in many commonly used mutable data structures, such as stacks, queues, and graphs. Besides, variable-length sequences also appear in multilevel data structures, such as paging in operating systems, and file systems. It is necessary to introduce a logic to describe and verify these kinds of properties.
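To make the invariant of Equation 1 concrete, the following Python sketch runs in-place list reversal on a small heap and checks the invariant before and after every loop iteration. It is only an illustration under stated assumptions: the heap is a dictionary, each cell stores the pair (next pointer, payload) as in Equation 2, `None` stands for $\mathbf{nil}$, and the base case is read as in Reynolds[1], where the empty list owns the empty heap. All function names are hypothetical.

```python
NIL = None  # stands in for the atom nil (assumption of this sketch)


def contents(heap, x):
    """Read the payload sequence of the list starting at x and the cells it owns."""
    seq, cells = [], set()
    while x is not NIL:
        if x in cells or x not in heap:
            raise ValueError("not a nil-terminated list")
        nxt, payload = heap[x]  # the cell stores y followed by n, as in Equation 2
        seq.append(payload)
        cells.add(x)
        x = nxt
    return tuple(seq), cells


def invariant_holds(heap, x, y, alpha0):
    """Check ls(x, alpha) * ls(y, beta) and reverse(alpha0) == reverse(alpha) + beta."""
    alpha, cells_x = contents(heap, x)
    beta, cells_y = contents(heap, y)
    # Separating conjunction: the two lists are disjoint and cover the whole heap.
    separated = cells_x.isdisjoint(cells_y) and (cells_x | cells_y) == set(heap)
    return separated and alpha0[::-1] == alpha[::-1] + beta


def reverse_checked(heap, x, alpha0):
    """Reverse the list at x in place, asserting the invariant at every step."""
    y = NIL
    assert invariant_holds(heap, x, y, alpha0)
    while x is not NIL:
        nxt, payload = heap[x]
        heap[x] = (y, payload)  # redirect the cell to the already reversed part
        y, x = x, nxt
        assert invariant_holds(heap, x, y, alpha0)
    return y
```

On exit $x=\mathbf{nil}$, so $\alpha=\varepsilon$ and the invariant reduces to $\beta==\alpha_{0}^{\dagger}$, which is the expected postcondition.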

In this paper, we aim to establish a logical foundation for variable-length sequences and multilevel data structures in pointer programs, especially on definition, expressiveness and decidability. We propose sequence-heap separation logic to integrate sequences into logical reasoning in pointer programs. The logic is an extension of both classical separation logic proposed in [1], and sequence predicate logic proposed in [8].

Sequence-heap separation logic can be seen as a fragment of higher-order separation logic proposed in [4]. Quantifiers over sequences can be reduced to those over predicates (or over sets). Sequence-heap separation logic is less expressive than higher-order separation logic, but it can be used in many scenarios involving sequences in pointer programs, and it has better decidability results.

In terms of definition for sequence-heap separation logic, we define the heap model as a finite partial mapping from locations to sequences, which is similar to the block model $\mathrm{Heap}_{B}$ defined in [9]. The model can be written as $h:\text{Loc}\overset{\mathrm{fin}}{\rightharpoonup}(\text{Loc}\cup\text{Val})^{*}$, compared to the model $h:\text{Loc}\overset{\mathrm{fin}}{\rightharpoonup}\text{Loc}\cup\text{Val}$ defined in the classical separation logic.

We also define the sequence singleton heap as the term $x\mapsto\alpha$ to cover more scenarios involving variable-length sequences. The term can be used to denote stacks, queues, variadic arguments, trees with variable-length branches, graphs with unbounded out-degree, etc. We take the stack as an example.

In the classical separation logic, one has to describe a stack with the inductive predicate $\mathtt{ls}(x,\alpha)$ defined in Equation 2, where $x$ and $\alpha$ denote the top pointer and the contents of the stack respectively. However, with the sequence singleton heap $x\mapsto\alpha$ defined in sequence-heap separation logic, one can denote a stack without any inductive predicate.

The definition of the sequence singleton heap helps us obtain a decidability result. We prove that the $\Sigma_{1}$ fragment with sequence singleton heap is decidable, in contrast to the undecidability of the $\Sigma_{1}$ fragment in [10], where the list predicate $\mathtt{ls}(x,y)$, separating conjunctions, and separating implications are defined.

Besides, properly defining the form of the stored sequence helps the logic describe multilevel data structures. For example, if the stored sequence is of the form $\alpha\#\beta$ (where $\alpha$ denotes the sequence of locations, $\beta$ denotes the sequence of contents, and $\#$ separates these two sequences), one can describe two-tier data structures[9] in block-based cloud storage systems in the following way.

$$(x\mapsto\alpha\#\varepsilon)*\forall y\exists\beta.\;\bigl(y\,\overline{\in}\,\alpha\Rightarrow(y\hookrightarrow\varepsilon\#\beta)\bigr), \tag{3}$$

where $x$ denotes the block location, $\alpha$ denotes the sequence of content locations, and $\beta$ denotes the contents in each location of $\alpha$. The term $y\,\overline{\in}\,\alpha$ denotes that $y$ appears in $\alpha$, which is defined as follows:

$$y\,\overline{\in}\,\alpha\overset{\triangle}{=}\exists\alpha_{1}\exists\alpha_{2}.\;\alpha==\alpha_{1}\circ y\circ\alpha_{2}.$$

Equation 3 is illustrated in Fig. 1.

Figure 1: Representation of two-tier data structure
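The shape required by Equation 3 can be checked on a concrete heap with a few lines of Python. This is a sketch under assumptions that are not fixed by the text so far: heaps are dictionaries from locations to tuples, the string `"#"` stands for the separator atom, and $y\hookrightarrow\varepsilon\#\beta$ is read as "the remaining heap contains a cell at $y$ whose content is $\#$ followed by some $\beta$". The function name is hypothetical.

```python
SEP = "#"  # stands in for the separator atom (assumption of this sketch)


def is_two_tier(heap, x):
    """Check the two-tier shape of Equation 3 for the block stored at x."""
    block = heap.get(x)
    # The block must have the form alpha # epsilon: locations, then the separator.
    if block is None or block.count(SEP) != 1 or block[-1] != SEP:
        return False
    alpha = block[:-1]
    # Separating conjunction: the content cells live in the heap without x.
    rest = {loc: seq for loc, seq in heap.items() if loc != x}
    # Every y occurring in alpha must point to epsilon # beta for some beta.
    return all(y in rest and rest[y][:1] == (SEP,) for y in alpha)
```

For instance, the heap `{1: (2, 3, "#"), 2: ("#", 7, 8), 3: ("#",)}` has this shape at block location 1, with contents $7\circ 8$ at location 2 and $\varepsilon$ at location 3.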

We notice that sequence predicate logic proposed in [8] is also capable of describing various properties of variable-length sequences. However, that logic cannot be directly used to describe sequences in pointer programs. Sequence-heap separation logic is more than just the combination of sequence predicate logic and separation logic: it enriches the composition of the two.

We also notice that the model of sequence-heap separation logic is similar to that of block-based separation logic proposed in [9]. However, the sequence singleton heap and quantifiers over sequences in sequence-heap separation logic provide more abstraction, and can be used for wider application scenarios, not just for block-based cloud storage systems.

For example, the combined logic can simplify the description of stacks. Besides, sequence predicate logic and separation logic alone cannot describe the multilevel data structures shown in Equation 3, but the combined logic can.

We also study decidability problems of fragments of sequence-heap separation logic. This helps us find algorithms for decidable fragments, and understand the expressiveness of the logic in more depth. We obtain two main results on satisfiability problems. One is a decidable fragment which lies between the $\Sigma_{1}$ fragment of the classical separation logic and the logic involving the list predicate. The other is an undecidable fragment with two alternations on quantifiers over program variables and one alternation on quantifiers over sequence variables.

The paper is organized as follows. In Section II, we introduce some basic concepts of separation logic and sequence predicate logic. In Section III, we define sequence-heap separation logic and explore its expressiveness. In Section IV, we prove that the satisfiability problem of the $\Sigma_{1}$ fragment is decidable. In Section V, we prove that the satisfiability problem of the fragment with the prenex normal form $\forall_{x}^{*}\forall_{\alpha}^{*}\cap\forall_{x}^{*}\exists_{x}^{*}\exists_{\alpha}^{*}$ is undecidable. In addition, we explore the boundaries between decidable and undecidable fragments. In Section VI, we conclude our work and propose future work on sequence-heap separation logic.

Related work. Separation logic can express properties of heaps, such as dangling pointers, non-circularity, and memory leaks[11]. However, the logic itself is proved to be undecidable[12]. We investigate the following two categories of fragments: basic fragments without inductive predicates and $\Sigma_{1}$ fragments with inductive predicates.

In basic fragments, there are no user-defined or hard-coded predicates. It is known that the validity problem of the $\Pi_{1}$ fragment $\mathrm{SL}(*,-\!\!*)$ is decidable[13], and that of the basic fragment $\mathrm{SL}(\forall,\hookrightarrow)$ with quantifiers and no separating implications is undecidable[13]. If we restrict the size of the heap satisfying $\varphi_{1}$ in formulae $\varphi_{1}-\!\!*\varphi_{2}$ to be bounded (by some $n$), then we get a decidable basic fragment $\mathrm{SL}(*,-\!\!*^{n},\exists)$[12]. To get decidable fragments, one can restrict either the quantifier alternations in formulae in prenex normal form, or the number of quantified variables. For quantifier alternations, the fragment $\mathrm{SL}(\exists^{*}\forall^{*})$ with 2 quantifier alternations is decidable[14], while the fragment $\mathrm{SL}(\exists^{*}\forall^{*}\exists^{*})$ with 3 is undecidable[14]. For quantified variables, the fragment $1\mathrm{SL}1$ with 1 quantified variable is decidable[15], while the fragment $1\mathrm{SL}2$ with 2 quantified variables is undecidable[16].

$\Sigma_{1}$ fragments contain separating implications (or their variants) and inductive predicates describing data structures. One basic result is that the satisfiability problem of the $\Sigma_{1}$ fragment $\mathrm{SL}(*,-\!\!*,\mathtt{ls})$ with lists is undecidable[10]. However, if list predicates are restricted to lists of length at most 2, one gets a decidable fragment $\mathrm{SL}(*,-\!\!*,\mathtt{reach}_{2})$[10]. Besides, if we further restrict heap unions in the separation model, we get strong-separation logic[17]. It is proved that the strong-separation fragment $\mathrm{SSL}(*,-\!\!*^{\lnot},\mathtt{ls})$ with list predicates is decidable.

All fragments mentioned above can be found in TABLE I.

Table I: Decidability of separation logic fragments

| Fragment | Model | Key connectives and predicates | Decidability |
|---|---|---|---|
| $\mathrm{SL}(*,-\!\!*)$ | separation model | $\lnot,\mapsto,*,-\!\!*$ | decidable |
| $\mathrm{SL}(\forall,\hookrightarrow)$ | separation model | $\lnot,\forall,\hookrightarrow$ | undecidable |
| $\mathrm{SL}(*,-\!\!*^{n},\exists)$ | separation model | $\lnot,*,-\!\!*^{n},\exists$ | decidable |
| $\mathrm{SL}(\exists^{*}\forall^{*})$ | separation model | prenex $\exists^{*}\forall^{*}$ | decidable |
| $\mathrm{SL}(\exists^{*}\forall^{*}\exists^{*})$ | separation model | prenex $\exists^{*}\forall^{*}\exists^{*}$ | undecidable |
| $1\mathrm{SL}1$ | separation model | 1 quantified variable | decidable |
| $1\mathrm{SL}2$ | separation model | 2 quantified variables | undecidable |
| $\mathrm{SL}(*,-\!\!*,\mathtt{ls})$ | separation model | $\lnot,*,-\!\!*,\mathtt{ls}(x,y)$ | undecidable |
| $\mathrm{SL}(*,-\!\!*,\mathtt{reach}_{2})$ | separation model | $\lnot,*,-\!\!*,\mathtt{reach}_{2}(x,y)$ | decidable |
| $\mathrm{SSL}(*,-\!\!*^{\lnot},\mathtt{ls})$ | strong-separation model | $\lnot,*,-\!\!*^{\lnot},\mathtt{ls}$ | decidable |

Sequence predicate logic is an extension of word equations. Specifically, word equations are the $\Sigma_{1}$ or $\Pi_{1}$ fragment of sequence predicate logic, where there are only quantifiers over sequence variables. Word equations can express many properties of sequences, such as conjugates and lexicographic ordering on words[18]. On the other hand, they cannot express properties such as "primitiveness" and "equal length". It is shown that every Boolean combination of word equations on a free semigroup can be reduced to a single word equation[19].

Different from the classical predicate logic, sequence predicate logic defines sequence variables, sequence predicates and sequence functions[8]. The problem of whether a word equation has a solution is decidable[19]. Namely, the truth of the $\Sigma_{1}$ fragment of sequence predicate logic is decidable. Further research shows that the truth of positive formulae (those without negations or inequalities) of the form $\forall^{*}\exists^{*}$[20], $\exists^{*}\forall^{*}$[21] or $\exists^{*}\forall^{*}\exists^{*}$[22] is undecidable. To get a decidable fragment, one can restrict the formulae of the form $\exists^{*}\forall^{*}$ to the positive one[22].

There are some works which combine separation logic and sequences. To formalize cloud storage systems, one can construct a modeling language IMDSS[23] and a proof system based on separation logic[9]. These papers define the values of heaps as finite sequences, and describe properties of blocks without using the sequence singleton heap. A Coq-based proof assistant[24] has been implemented based on the logic defined in [9].

II Preliminaries

In this section, we introduce some basic concepts on separation logic and sequence predicate logic.

II-A Separation logic

In this subsection, we mainly focus on definitions and expressiveness of separation logic.

Definition II.1 (Definition of separation logic)

The syntax of separation logic[25] with 1 record field is defined as follows:

$$\begin{aligned}
t\quad::=&\quad\mathbf{nil}\mid n\mid x\mid\mathtt{f}^{n}(t,\dots,t) \\
p\quad::=&\quad t=t\mid\mathbf{emp}\mid t\mapsto t\mid\mathtt{P}^{n}(t,\dots,t) \\
\varphi\quad::=&\quad p\mid\lnot\varphi\mid\varphi\land\varphi\mid\varphi\lor\varphi\mid\varphi\Rightarrow\varphi\mid\varphi*\varphi\mid\varphi-\!\!*\varphi\mid\exists x.\varphi,
\end{aligned}$$

where $n$ denotes constants, $x$ denotes variables, $\mathtt{f}^{n}$ denotes $n$-ary functions mapping $\mathbb{N}^{n}$ to $\mathbb{N}$, and $\mathtt{P}^{n}$ denotes $n$-ary predicates mapping $\mathbb{N}^{n}$ to $\{0,1\}$.

The underlying model $\sigma=(s,h)$ of separation logic consists of a stack $s$ and a heap $h$. The set $\mathrm{Loc}$ comprises locations excluding $\mathbf{nil}$. The set $\mathrm{Var}$ comprises symbols of variables, including $x,y,z,\dots$. The set $\mathrm{Val}$ comprises integer values. The model is defined as follows:

$$\begin{array}{ll}
\text{Loc},\text{Val}\subseteq\mathbb{N} & \mathbf{nil}\in\text{Atoms} \\
s:\text{Var}\to\text{Loc}\cup\text{Val} & h:\text{Loc}\overset{\mathrm{fin}}{\rightharpoonup}(\text{Loc}\cup\text{Val})^{*}.
\end{array}$$

The semantics of the term $\mathtt{f}^{n}(t_{1},\dots,t_{n})$ can be defined as follows, where $\mathtt{f}_{\sigma}^{n}$ is the interpretation of $\mathtt{f}^{n}$.

$$\llbracket\mathtt{f}^{n}(t_{1},\dots,t_{n})\rrbracket\sigma=\mathtt{f}_{\sigma}^{n}(\llbracket t_{1}\rrbracket\sigma,\dots,\llbracket t_{n}\rrbracket\sigma).$$

The semantics of formulae is inductively defined as follows:

$$\begin{array}{lll}
\sigma\models t_{1}=t_{2} & \text{iff} & \llbracket t_{1}\rrbracket\sigma=\llbracket t_{2}\rrbracket\sigma. \\
\sigma\models\mathbf{emp} & \text{iff} & \mathtt{dom}(h)=\varnothing. \\
\sigma\models t_{1}\mapsto t_{2} & \text{iff} & \llbracket t_{1}\rrbracket\sigma\neq\mathbf{nil}\text{ and }\mathtt{dom}(h)=\{\llbracket t_{1}\rrbracket\sigma\} \\
 & & \text{and }h(\llbracket t_{1}\rrbracket\sigma)=\llbracket t_{2}\rrbracket\sigma. \\
\sigma\models\mathtt{P}^{n}(t_{1},\dots,t_{n}) & \text{iff} & \mathtt{P}_{\sigma}^{n}(\llbracket t_{1}\rrbracket\sigma,\dots,\llbracket t_{n}\rrbracket\sigma)\text{ holds.} \\
\sigma\models\lnot\varphi & \text{iff} & \sigma\nvDash\varphi. \\
\sigma\models\varphi_{1}\land\varphi_{2} & \text{iff} & \sigma\models\varphi_{1}\text{ and }\sigma\models\varphi_{2}. \\
\sigma\models\varphi_{1}\lor\varphi_{2} & \text{iff} & \sigma\models\varphi_{1}\text{ or }\sigma\models\varphi_{2}. \\
\sigma\models\varphi_{1}\Rightarrow\varphi_{2} & \text{iff} & \text{if }\sigma\models\varphi_{1}\text{, then }\sigma\models\varphi_{2}. \\
\sigma\models\varphi_{1}*\varphi_{2} & \text{iff} & \text{there exist heaps }h_{1}\text{ and }h_{2} \\
 & & \text{such that }\mathtt{dom}(h_{1})\cap\mathtt{dom}(h_{2})=\varnothing, \\
 & & \text{and }h=h_{1}\uplus h_{2},\text{ and }(s,h_{1})\models\varphi_{1}, \\
 & & \text{and }(s,h_{2})\models\varphi_{2}. \\
\sigma\models\varphi_{1}-\!\!*\varphi_{2} & \text{iff} & \text{for all heaps }h_{1}, \\
 & & \text{if }\mathtt{dom}(h_{1})\cap\mathtt{dom}(h)=\varnothing \\
 & & \text{and }(s,h_{1})\models\varphi_{1}, \\
 & & \text{then }(s,h_{1}\uplus h)\models\varphi_{2}. \\
\sigma\models\exists x.\varphi & \text{iff} & \text{there exists }x_{0}\text{ in Loc}\cup\text{Val} \\
 & & \text{such that }(s[x\to x_{0}],h)\models\varphi,
\end{array}$$

where $\mathtt{P}_{\sigma}^{n}$ denotes the interpretation of $\mathtt{P}^{n}$.
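The clauses above can be executed directly on finite heaps for the connectives that do not quantify over infinitely many objects. The sketch below is a minimal model checker under our own encoding assumptions: formulae are nested tuples, terms are either variable names (strings) or constants, `None` stands for $\mathbf{nil}$, and heaps store one value per location. The separating implication and the existential quantifier are deliberately left out, since their clauses range over all heaps and all values.

```python
from itertools import combinations


def val(stack, term):
    """Evaluate a term: variable names are looked up, constants denote themselves."""
    return stack[term] if isinstance(term, str) else term


def splits(heap):
    """Yield every pair (h1, h2) of disjoint heaps whose union is heap."""
    locs = list(heap)
    for k in range(len(locs) + 1):
        for chosen in combinations(locs, k):
            h1 = {loc: heap[loc] for loc in chosen}
            h2 = {loc: heap[loc] for loc in locs if loc not in chosen}
            yield h1, h2


def sat(stack, heap, phi):
    """Decide (stack, heap) |= phi for the quantifier-free, wand-free fragment."""
    op = phi[0]
    if op == "eq":
        return val(stack, phi[1]) == val(stack, phi[2])
    if op == "emp":
        return not heap
    if op == "pto":  # t1 |-> t2: the heap is exactly one cell
        addr, content = val(stack, phi[1]), val(stack, phi[2])
        return addr is not None and heap == {addr: content}
    if op == "not":
        return not sat(stack, heap, phi[1])
    if op == "and":
        return sat(stack, heap, phi[1]) and sat(stack, heap, phi[2])
    if op == "or":
        return sat(stack, heap, phi[1]) or sat(stack, heap, phi[2])
    if op == "star":  # try every way of splitting the heap in two
        return any(sat(stack, h1, phi[1]) and sat(stack, h2, phi[2])
                   for h1, h2 in splits(heap))
    raise ValueError(f"unsupported connective: {op}")
```

The `"star"` case enumerates all $2^{|\mathtt{dom}(h)|}$ splits, which is fine for illustration but not intended as an efficient procedure.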

For convenience, we have the following notation[25] denoting that $t$ points to $t_{1},\dots,t_{n}$ with some fixed $n$.

$$t\mapsto t_{1},\dots,t_{n}\overset{\triangle}{=}(t\mapsto t_{1})*\dots*(t+n-1\mapsto t_{n}). \tag{4}$$

In some fragments of separation logic, the heap model is defined as $h:\text{Loc}\overset{\text{fin}}{\rightharpoonup}\text{Val}^{n}$ with some fixed $n$[25][26]. In this case, Equation 4 will no longer hold. The semantics of the notation is defined as follows:

$$\begin{array}{lll}
\sigma\models t\mapsto t_{1},\dots,t_{n} & \text{iff} & \llbracket t\rrbracket\sigma\neq\mathbf{nil}\text{ and }\mathtt{dom}(h)=\{\llbracket t\rrbracket\sigma\}, \\
 & & \text{and }h(\llbracket t\rrbracket\sigma)=(\llbracket t_{1}\rrbracket\sigma,\dots,\llbracket t_{n}\rrbracket\sigma).
\end{array}$$

We notice that in either separation logic or one of its fragments (symbolic heap[11][26], strong-separation logic[17], etc.), the singleton heap is defined with some fixed number $n$. It may not be suitable for describing properties of blocks, or of graphs where nodes have unbounded out-degree.

The paper [9] introduces sequences into separation logic to describe properties of block-based cloud storage systems. The model of the logic is defined as a quintuple of the form $(s_{F},s_{B},s_{V},h_{B},h_{V})$, where $s_{F},s_{B},s_{V}$ are assignments of file variables, block variables and location variables respectively, and $h_{B},h_{V}$ are heaps of blocks and heaps of values respectively. The file stack $s_{F}$ is a mapping from file variables to sequences of block addresses, denoted as $s_{F}\overset{\triangle}{=}\text{FVar}\to\text{BLoc}^{*}$, and $h_{B}$ is a finite partial mapping from blocks to sequences of locations, denoted as $h_{B}\overset{\triangle}{=}\text{BLoc}\overset{\text{fin}}{\rightharpoonup}\text{Loc}^{*}$. Although $s_{F}$ and $h_{B}$ both range over sequences, the logic does not introduce sequence variables, sequence functions, or sequence singleton heap. The logic accesses these sequences through existential quantifiers on block variables and equality over values of block variables.

II-B Word equations and sequence predicate logic

A word equation can be seen as a special case of sequence predicate logic. It can be defined as follows:

Definition II.2 (Word equation)

Suppose $X^{*}$ is a free semigroup generated by $X=\{n_{1},\dots,n_{k}\}$, with unknowns $\alpha_{1},\dots,\alpha_{m}\in X^{*}$, where $n_{1},\dots,n_{k}$ are generators. A word equation is an equality of the form:

$$U(n_{1},\dots,n_{k},\alpha_{1},\dots,\alpha_{m})==V(n_{1},\dots,n_{k},\alpha_{1},\dots,\alpha_{m}),$$

where $==$ is the equality between two concatenations of sequences. If there is a solution for $U==V$, then the word equation is satisfiable.
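As a small illustration of what a solution of a word equation is, the following Python sketch searches for one by brute force. It is not a decision procedure: it only tries values of the unknowns up to a length bound `max_len`, so returning `None` means "no solution within the bound". The encoding is our own assumption: each side is a list of symbols, a symbol listed in `unknowns` is an unknown, and every other symbol is a one-letter generator.

```python
from itertools import product


def words(alphabet, max_len):
    """Enumerate all words over alphabet of length at most max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def evaluate(side, assignment):
    """Concatenate one side of the equation under an assignment of the unknowns."""
    return "".join(assignment.get(sym, sym) for sym in side)


def solve(lhs, rhs, unknowns, alphabet, max_len):
    """Return the first assignment with lhs == rhs, or None if the bound is exhausted."""
    candidates = list(words(alphabet, max_len))
    for values in product(candidates, repeat=len(unknowns)):
        assignment = dict(zip(unknowns, values))
        if evaluate(lhs, assignment) == evaluate(rhs, assignment):
            return assignment
    return None
```

For example, `solve(["a", "X"], ["X", "a"], ["X"], "ab", 3)` searches for a solution of $a\circ\alpha==\alpha\circ a$, while the equation $\alpha\circ a==b\circ\alpha$ has no solution of any length, so the search fails for every bound.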

The definition of word equation can be enhanced with Boolean combination, as is shown in Definition II.3.

Definition II.3 (Boolean combination of word equations)

A Boolean combination of word equations is defined as follows:

$$\begin{aligned}
t\quad::=&\quad n\mid\alpha\mid t\circ t \\
\varphi\quad::=&\quad t==t\mid\lnot\varphi\mid\varphi\land\varphi\mid\varphi\lor\varphi,
\end{aligned}$$

where $n$ denotes constants, $\alpha$ denotes sequence variables, and $t_{1}\circ t_{2}$ denotes the concatenation of two sequences $t_{1}$ and $t_{2}$.

Word equations are capable of describing many properties of sequences. Every Boolean combination of word equations can be reduced to a single word equation, by Theorem II.1[18].

Theorem II.1

For every Boolean combination $\varphi$ defined above, there exists a single word equation $T(\varphi)\overset{\triangle}{=}U==V$ which is equivalent to $\varphi$. The proof sketch of the theorem can be found in the Appendix.

If word equations are enhanced with sequence functions and with quantifiers over variables and sequences, one obtains the following definition of sequence predicate logic.

Definition II.4 (Sequence predicate logic)

The syntax of sequence predicate logic[8] can be defined as follows:

$$\begin{aligned}
t\quad::=&\quad n\mid x\mid\alpha\mid t\circ t\mid\mathtt{f}(t,\dots,t)\mid\overline{\mathtt{f}}(t,\dots,t) \\
\varphi\quad::=&\quad t==t\mid\mathtt{P}(t,\dots,t)\mid\lnot\varphi\mid\varphi\land\varphi\mid\varphi\lor\varphi \\
&\quad\mid\varphi\Rightarrow\varphi\mid\exists x.\varphi\mid\exists\alpha.\varphi.
\end{aligned}$$

The variable $x$ denotes individual variables, and $\alpha$ denotes sequence variables. The arity of $\mathtt{f}$, $\overline{\mathtt{f}}$, and $\mathtt{P}$ can be either fixed or flexible. When it is fixed, there are no sequence variables among the parameters. When it is flexible, the parameters may contain sequence variables.

The structure of the logic is defined as $(\mathrm{Val},I)$, where $\mathrm{Val}$ is the domain, and $I$ is the interpretation of individual and sequence constants, functions and predicates. The assignment of the logic can be defined as $(s_{x},s_{\alpha})$, where $s_{x}:\mathrm{IVar}\to\mathrm{Val}$, $s_{\alpha}:\mathrm{SVar}\to\mathrm{Val}^{*}$, and $\mathrm{IVar}=\{x,y,z,\dots\}$, $\mathrm{SVar}=\{\alpha,\beta,\gamma,\dots\}$.

In sequence predicate logic, $t_{1}\circ t_{2}$ denotes the concatenation of two sequences $t_{1}$ and $t_{2}$, and $t_{1}==t_{2}$ denotes that the two sequence terms $t_{1}$ and $t_{2}$ are equal. Due to space limitations, we do not list the semantics of the logic. The details can be found in [8][27].

II-C Expressiveness of sequence predicate logic

We recall properties which can be expressed by sequence predicate logic. They are useful for describing properties with sequence-heap separation logic.

Length. Sequence predicate logic can be used to get the length of a sequence. The function $\mathtt{length}(\alpha)$ denotes the length of the sequence $\alpha$, which can be inductively defined as follows:

$$\begin{aligned}
\mathtt{length}(\varepsilon) &\overset{\triangle}{=} 0 \\
\mathtt{length}(n\circ\alpha) &\overset{\triangle}{=} 1+\mathtt{length}(\alpha).
\end{aligned}$$

For convenience, we define $|\alpha|$ to denote $\mathtt{length}(\alpha)$.

Statistics. Sequence predicate logic can be used to count the occurrences of specific terms. The function $\mathtt{find}(\alpha,x)$ denotes the number of occurrences of $x$ in the sequence $\alpha$, which can be inductively defined as follows:

$$\begin{aligned}
\mathtt{find}(\varepsilon,x) &\overset{\triangle}{=} 0 \\
\mathtt{find}(n\circ\alpha,x) &\overset{\triangle}{=} 1+\mathtt{find}(\alpha,x), && n=x \\
\mathtt{find}(n\circ\alpha,x) &\overset{\triangle}{=} \mathtt{find}(\alpha,x), && n\neq x
\end{aligned}$$

For convenience, we define $|\alpha|_{x}$ to denote $\mathtt{find}(\alpha,x)$.
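The two inductive definitions translate line by line into recursive functions. The sketch below assumes that sequences are represented as Python tuples, with the empty tuple playing the role of $\varepsilon$; the function names mirror $\mathtt{length}$ and $\mathtt{find}$.

```python
def length(alpha):
    """length(eps) = 0 and length(n . alpha) = 1 + length(alpha)."""
    if not alpha:
        return 0
    return 1 + length(alpha[1:])


def find(alpha, x):
    """Count the occurrences of x in alpha, following the three defining clauses."""
    if not alpha:
        return 0
    n, rest = alpha[0], alpha[1:]
    if n == x:
        return 1 + find(rest, x)
    return find(rest, x)
```

On tuples these agree with the built-ins `len(alpha)` and `alpha.count(x)`; the recursive form is kept only to mirror the definitions.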

Lookups. The predicate $\mathtt{eq}(x_{1},\alpha,x_{2})$ denotes that the $x_{2}$-th item of the sequence $\alpha$ is $x_{1}$, which can be expressed as follows:

$$\mathtt{eq}(x_{1},\alpha,x_{2})\overset{\triangle}{=}\exists\alpha_{1}\exists\alpha_{2}.\;\alpha==\alpha_{1}\circ x_{1}\circ\alpha_{2}\land|\alpha_{1}\circ x_{1}|=x_{2}.$$

Note that $\alpha(x)$ does not appear as a term in Definition II.4. If it did, one would have to handle the case where $x$ exceeds the length of the sequence $\alpha$: a new value $\perp$ would have to be introduced to denote the illegal term, which would complicate the logic. We write $x_{1}=\alpha(x_{2})$ to denote $\mathtt{eq}(x_{1},\alpha,x_{2})$ for convenience:

$$x_{1}=\alpha(x_{2})\overset{\triangle}{=}\mathtt{eq}(x_{1},\alpha,x_{2}).$$

Other properties can be expressed with lookups. We list some of them below.

The relationship between items and sequences can be defined as $x\,\overline{\in}\,\alpha$, which means that $x$ can be found in the sequence $\alpha$:

$$x\,\overline{\in}\,\alpha\overset{\triangle}{=}\exists x_{1}.\;x=\alpha(x_{1}).$$

The $x_{1}$-th item of the sequence $\alpha_{1}$ is equal to the $x_{2}$-th item of the sequence $\alpha_{2}$:

$$\mathtt{eq}(\alpha_{1},x_{1},\alpha_{2},x_{2})\overset{\triangle}{=}\exists x_{3}.\;x_{3}=\alpha_{1}(x_{1})\land x_{3}=\alpha_{2}(x_{2}).$$
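The lookup predicates quantify existentially over decompositions of a sequence, so on concrete sequences they can be evaluated by enumerating those decompositions. The following sketch assumes tuples as sequences and 1-based indices, as implied by $|\alpha_{1}\circ x_{1}|=x_{2}$; the helper `splits3` and the names `member` and `eq_items` are our own.

```python
def splits3(alpha):
    """Yield every (alpha1, x, alpha2) with alpha == alpha1 + (x,) + alpha2."""
    for i, x in enumerate(alpha):
        yield alpha[:i], x, alpha[i + 1:]


def eq(x1, alpha, x2):
    """eq(x1, alpha, x2): the x2-th item of alpha is x1 (indices start at 1)."""
    return any(x == x1 and len(a1) + 1 == x2 for a1, x, _ in splits3(alpha))


def member(x, alpha):
    """x occurs in alpha: some index x1 satisfies x = alpha(x1)."""
    # Only indices 1..|alpha| can witness the existential quantifier.
    return any(eq(x, alpha, i) for i in range(1, len(alpha) + 1))


def eq_items(alpha1, x1, alpha2, x2):
    """The x1-th item of alpha1 equals the x2-th item of alpha2."""
    # Any witness x3 must occur in alpha1, so it suffices to try those values.
    return any(eq(x3, alpha1, x1) and eq(x3, alpha2, x2) for x3 in set(alpha1))
```

An out-of-range index simply makes `eq` false, which matches the choice of a predicate over a partial lookup term with an extra value $\perp$.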

Unary property. All items in the sequence $\alpha$ satisfy the unary property $\mathtt{P}^{1}(x)$:

$$\forall x\forall\alpha_{1}\forall\alpha_{2}.\;\alpha==\alpha_{1}\circ x\circ\alpha_{2}\Rightarrow\mathtt{P}^{1}(x).$$

Binary property. Every two items with different indices in the sequence $\alpha$ satisfy the binary property $\mathtt{P}^{2}(x_{1},x_{2})$:

$$\forall x_{1}\forall x_{2}\forall\alpha_{1}\forall\alpha_{2}\forall\alpha_{3}.\;\alpha==\alpha_{1}\circ x_{1}\circ\alpha_{2}\circ x_{2}\circ\alpha_{3}\Rightarrow\mathtt{P}^{2}(x_{1},x_{2}).$$

The increment and set-like properties defined below are binary properties.

Increment. The sequence $\alpha$ is strictly increasing:

$$\mathtt{Inc}(\alpha)\overset{\triangle}{=}\;\forall x_{1}\forall x_{2}\forall\alpha_{1}\forall\alpha_{2}\forall\alpha_{3}.\;\alpha==\alpha_{1}\circ x_{1}\circ\alpha_{2}\circ x_{2}\circ\alpha_{3}\Rightarrow x_{1}<x_{2}.$$

Set-like. Every two items in the sequence $\alpha$ are distinct:

$$\mathtt{Diff}(\alpha)\overset{\triangle}{=}\;\forall x_{1}\forall x_{2}\forall\alpha_{1}\forall\alpha_{2}\forall\alpha_{3}.\;\alpha==\alpha_{1}\circ x_{1}\circ\alpha_{2}\circ x_{2}\circ\alpha_{3}\Rightarrow x_{1}\neq x_{2}.$$

Segment. The sequence $\alpha_{2}$ is a consecutive subsequence of $\alpha_{1}$:

$$\alpha_{2}\in\mathtt{Seg}(\alpha_{1})\overset{\triangle}{=}\exists\alpha_{3}\exists\alpha_{4}.\;\alpha_{1}==\alpha_{3}\circ\alpha_{2}\circ\alpha_{4}.$$

Truncation. The sequence $\alpha_{2}$ is the consecutive subsequence of $\alpha_{1}$ with indices ranging over $[x_{1},x_{2})$:

$$\begin{aligned}
&\alpha_{2}==\mathtt{Trunc}(\alpha_{1},x_{1},x_{2}) \\
\overset{\triangle}{=}\quad&\exists\alpha_{3}\exists\alpha_{4}.\;\alpha_{1}==\alpha_{3}\circ\alpha_{2}\circ\alpha_{4}\land|\alpha_{3}|=x_{1}-1\land|\alpha_{3}\circ\alpha_{2}|=x_{2}-1.
\end{aligned}$$
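On concrete sequences, the binary properties and the two subsequence notions can be evaluated by enumerating the decompositions that the quantifiers range over. The sketch below assumes tuples as sequences and 1-based indices; a decomposition $\alpha_{1}\circ x_{1}\circ\alpha_{2}\circ x_{2}\circ\alpha_{3}$ corresponds to a pair of positions $i<j$. The function names are our own lower-case versions of $\mathtt{Inc}$, $\mathtt{Diff}$, $\mathtt{Seg}$ and $\mathtt{Trunc}$.

```python
from itertools import combinations


def inc(alpha):
    """Inc(alpha): every earlier item is strictly smaller than every later item."""
    return all(alpha[i] < alpha[j] for i, j in combinations(range(len(alpha)), 2))


def diff(alpha):
    """Diff(alpha): items at different positions are pairwise distinct."""
    return all(alpha[i] != alpha[j] for i, j in combinations(range(len(alpha)), 2))


def seg(alpha2, alpha1):
    """alpha2 is a consecutive subsequence of alpha1."""
    n = len(alpha2)
    return any(alpha1[i:i + n] == alpha2 for i in range(len(alpha1) - n + 1))


def trunc(alpha1, x1, x2):
    """Return the items of alpha1 at indices x1..x2-1, or None if no such alpha2 exists."""
    # |alpha3| = x1 - 1 and |alpha3 . alpha2| = x2 - 1 force 1 <= x1 <= x2 <= |alpha1| + 1.
    if 1 <= x1 <= x2 <= len(alpha1) + 1:
        return alpha1[x1 - 1:x2 - 1]
    return None
```

Returning `None` in `trunc` reflects that $\mathtt{Trunc}$ is introduced through a predicate: for out-of-range indices no $\alpha_{2}$ satisfies the defining formula.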

III Sequence-heap separation logic

Sequence-heap separation logic is an extension of both classical separation logic and sequence predicate logic. It can describe properties of variable-length sequences and multilevel data structures, some of which cannot be expressed easily by the classical separation logic or sequence predicate logic alone.

In this section, we present the definition and expressiveness of sequence-heap separation logic (SeqSL).

III-A Definition of sequence-heap separation logic

Definition III.1 (Syntax of sequence-heap separation logic)

The syntax of the logic can be defined as follows:

$$\begin{aligned}
t_{x}\quad::=&\quad\mathbf{nil}\mid\#\mid n\mid x \\
t_{\alpha}\quad::=&\quad\varepsilon\mid t_{x}\mid\alpha\mid t_{\alpha}\circ t_{\alpha} \\
\varphi\quad::=&\quad t_{x}=t_{x}\mid t_{\alpha}==t_{\alpha}\mid\lnot\varphi\mid\varphi\land\varphi\mid\varphi\lor\varphi\mid\varphi\Rightarrow\varphi \\
&\quad\mid\mathbf{emp}\mid t_{x}\mapsto t_{\alpha}\mid\varphi*\varphi\mid\varphi-\!\!*\varphi\mid\exists_{x}x.\varphi\mid\exists_{\alpha}\alpha.\varphi.
\end{aligned}$$

Note that other functions and predicates can be defined in SeqSL following the definition in [8], such as comparison between sequences, and picking an item from a sequence. Most commonly used predicates and functions can be expressed by the logic defined above.

Definition III.2 (Model of sequence-heap separation logic)

There are two types of variables in the signature of SeqSL: program variables $\text{PVar}=\{x,y,z,\dots\}$ and sequence variables $\text{SVar}=\{\alpha,\beta,\gamma,\dots\}$. Both sets are countable. The model $\sigma=(s_{x},s_{\alpha},h)$ can be defined as follows:

$$\begin{array}{ll}
\text{Loc},\text{Val}\subseteq\mathbb{N} & \mathbf{nil},\#\in\text{Atoms} \\
\text{Loc}\cap\text{Atoms}=\varnothing & s_{x}:\text{PVar}\to\text{Loc}\cup\text{Val} \\
s_{\alpha}:\text{SVar}\to(\text{Loc}\cup\text{Val})^{*} & h:\text{Loc}\overset{\text{fin}}{\rightharpoonup}(\text{Loc}\cup\text{Val})^{*},
\end{array}$$

where Loc is the set of locations, Val is the set of values, Atoms is the set of reserved words, $s_{x}$ denotes the stack, $s_{\alpha}$ denotes the assignment of sequence variables, and $h$ denotes the heap, which stores sequences of locations, values, or atoms.

Definition III.3 (Semantics of terms)

The symbol $t_{x}$ denotes an individual term, whose value ranges over Val. The denotational semantics of the term is defined inductively as follows:

\[
\begin{array}{ll}
\llbracket\mathbf{nil}\rrbracket\sigma=\mathbf{nil} & \llbracket\#\rrbracket\sigma=\# \\
\llbracket n\rrbracket\sigma=n & \llbracket x\rrbracket\sigma=s_{x}(x),
\end{array}
\]

where $\mathbf{nil}$ and $\#$ in Atoms cannot be used as locations, and $\#$ is used to describe multilevel data structures.

The symbol $t_{\alpha}$ denotes a sequence term, whose value ranges over $(\text{Loc}\cup\text{Val})^{*}$. The denotational semantics of the term is defined inductively as follows:

\[
\begin{array}{ll}
\llbracket\varepsilon\rrbracket\sigma=\varepsilon & \llbracket t_{x}\rrbracket\sigma\text{ is defined above} \\
\llbracket\alpha\rrbracket\sigma=s_{\alpha}(\alpha) & \llbracket t_{\alpha_{1}}\circ t_{\alpha_{2}}\rrbracket\sigma=\llbracket t_{\alpha_{1}}\rrbracket\sigma\circ\llbracket t_{\alpha_{2}}\rrbracket\sigma,
\end{array}
\]

where $\llbracket t_{\alpha_{1}}\rrbracket\sigma\circ\llbracket t_{\alpha_{2}}\rrbracket\sigma$ denotes the concatenation of the two sequences $\llbracket t_{\alpha_{1}}\rrbracket\sigma$ and $\llbracket t_{\alpha_{2}}\rrbracket\sigma$.
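The term semantics above can be sketched as a small evaluator. The tuple encoding of terms and the names `eval_tx` and `eval_ta` are illustrative assumptions, not part of the paper; in particular, reading an individual term inside a sequence term as a one-item sequence is our interpretation of "is defined above".

```python
from typing import Dict, Tuple, Union

NIL, SEP = "nil", "#"            # the two atoms of SeqSL
Value = Union[int, str]          # a location/value (int) or an atom (str)
Seq = Tuple[Value, ...]          # a finite sequence

def eval_tx(t, s_x: Dict[str, Value]) -> Value:
    """Evaluate an individual term: ('nil',), ('#',), ('num', n) or ('var', x)."""
    tag = t[0]
    if tag == "nil":
        return NIL
    if tag == "#":
        return SEP
    if tag == "num":
        return t[1]
    if tag == "var":
        return s_x[t[1]]          # look the variable up in the stack
    raise ValueError(f"not an individual term: {t!r}")

def eval_ta(t, s_x: Dict[str, Value], s_a: Dict[str, Seq]) -> Seq:
    """Evaluate a sequence term: ('eps',), ('svar', a), ('cat', t1, t2) or an individual term."""
    tag = t[0]
    if tag == "eps":
        return ()
    if tag == "svar":
        return s_a[t[1]]
    if tag == "cat":              # concatenation of the two sub-sequences
        return eval_ta(t[1], s_x, s_a) + eval_ta(t[2], s_x, s_a)
    return (eval_tx(t, s_x),)     # assumption: an individual term is a one-item sequence
```

For instance, with the stack `{"x": 7}` and the assignment `{"a": (1, 2)}`, the term $\alpha\circ\#\circ x$ is encoded as `("cat", ("svar", "a"), ("cat", ("#",), ("var", "x")))`.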

Definition III.4 (Semantics of sequence-heap separation logic)

In SeqSL, the indices of sequence items start from 1. We set $\text{Values}=\text{Val}\cup\text{Loc}$ for convenience. The semantics of SeqSL is defined as follows:

\[
\begin{array}{lcl}
\sigma\models t_{x_{1}}=t_{x_{2}} & \text{iff} & \llbracket t_{x_{1}}\rrbracket\sigma=\llbracket t_{x_{2}}\rrbracket\sigma. \\
\sigma\models t_{\alpha_{1}}==t_{\alpha_{2}} & \text{iff} & \llbracket t_{\alpha_{1}}\rrbracket\sigma==\llbracket t_{\alpha_{2}}\rrbracket\sigma. \\
\sigma\models\lnot\varphi & \text{iff} & \sigma\not\models\varphi. \\
\sigma\models\varphi_{1}\land\varphi_{2} & \text{iff} & \sigma\models\varphi_{1}\text{ and }\sigma\models\varphi_{2}. \\
\sigma\models\varphi_{1}\lor\varphi_{2} & \text{iff} & \sigma\models\varphi_{1}\text{ or }\sigma\models\varphi_{2}. \\
\sigma\models\varphi_{1}\Rightarrow\varphi_{2} & \text{iff} & \text{if }\sigma\models\varphi_{1}\text{, then }\sigma\models\varphi_{2}. \\
\sigma\models\mathbf{emp} & \text{iff} & \mathtt{dom}(h)=\varnothing. \\
\sigma\models t_{x}\mapsto t_{\alpha} & \text{iff} & \llbracket t_{x}\rrbracket\sigma\notin\text{Atoms},\;\mathtt{dom}(h)=\{\llbracket t_{x}\rrbracket\sigma\}, \\
 & & \text{and }h(\llbracket t_{x}\rrbracket\sigma)=\llbracket t_{\alpha}\rrbracket\sigma. \\
\sigma\models\varphi_{1}*\varphi_{2} & \text{iff} & \text{there exist heaps }h_{1}\text{ and }h_{2} \\
 & & \text{such that }\mathtt{dom}(h_{1})\cap\mathtt{dom}(h_{2})=\varnothing, \\
 & & h=h_{1}\uplus h_{2},\;(s_{x},s_{\alpha},h_{1})\models\varphi_{1}, \\
 & & \text{and }(s_{x},s_{\alpha},h_{2})\models\varphi_{2}. \\
\sigma\models\varphi_{1}\mathrel{-\!\!*}\varphi_{2} & \text{iff} & \text{for all heaps }h_{1}, \\
 & & \text{if }\mathtt{dom}(h_{1})\cap\mathtt{dom}(h)=\varnothing \\
 & & \text{and }(s_{x},s_{\alpha},h_{1})\models\varphi_{1}, \\
 & & \text{then }(s_{x},s_{\alpha},h_{1}\uplus h)\models\varphi_{2}. \\
\sigma\models\exists_{x}x.\,\varphi & \text{iff} & \text{there exists }x_{0}\text{ in Values} \\
 & & \text{such that }(s_{x}[x\to x_{0}],s_{\alpha},h)\models\varphi. \\
\sigma\models\exists_{\alpha}\alpha.\,\varphi & \text{iff} & \text{there exists }\alpha_{0}\text{ in Values}^{*} \\
 & & \text{such that }(s_{x},s_{\alpha}[\alpha\to\alpha_{0}],h)\models\varphi,
\end{array}
\]

where $s_{x}[x\to x_{0}]$ and $s_{\alpha}[\alpha\to\alpha_{0}]$ denote substitution for a program variable and for a sequence variable, respectively. They are defined as follows.

\[
s_{x}[x\to x_{0}](x^{\prime})=\begin{cases}x_{0},&x^{\prime}=x\\ s_{x}(x^{\prime}),&\text{otherwise}\end{cases}
\qquad
s_{\alpha}[\alpha\to\alpha_{0}](\alpha^{\prime})=\begin{cases}\alpha_{0},&\alpha^{\prime}=\alpha\\ s_{\alpha}(\alpha^{\prime}),&\text{otherwise}\end{cases}
\]

For convenience, we write $\exists x.\,\varphi$ to denote $\exists_{x}x.\,\varphi$, and $\exists\alpha.\,\varphi$ to denote $\exists_{\alpha}\alpha.\,\varphi$.
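To make the heap clauses of the semantics concrete, the following sketch checks satisfaction for a small wand-free, quantifier-free subset by enumerating all splittings of a finite heap. The tuple encoding of formulae and the names `splits` and `sat` are assumptions made for illustration; separating implication is deliberately left out, since it quantifies over all disjoint heaps.

```python
from itertools import combinations

def splits(h):
    """Yield every pair (h1, h2) of disjoint heaps whose union is h."""
    locs = list(h)
    for r in range(len(locs) + 1):
        for chosen in combinations(locs, r):
            h1 = {l: h[l] for l in chosen}
            h2 = {l: h[l] for l in locs if l not in chosen}
            yield h1, h2

def sat(phi, s_x, s_a, h):
    """Decide (s_x, s_a, h) |= phi for a wand-free, quantifier-free formula."""
    tag = phi[0]
    if tag == "emp":
        return len(h) == 0
    if tag == "pto":   # ("pto", x, a) encodes x |-> a for variable names x and a
        loc = s_x[phi[1]]
        return (loc not in ("nil", "#")      # atoms are never locations
                and set(h) == {loc}          # the heap is exactly one cell
                and h[loc] == s_a[phi[2]])   # and the cell stores the sequence
    if tag == "not":
        return not sat(phi[1], s_x, s_a, h)
    if tag == "and":
        return sat(phi[1], s_x, s_a, h) and sat(phi[2], s_x, s_a, h)
    if tag == "star":  # some splitting must satisfy both conjuncts
        return any(sat(phi[1], s_x, s_a, h1) and sat(phi[2], s_x, s_a, h2)
                   for h1, h2 in splits(h))
    raise ValueError(f"unsupported connective: {tag!r}")
```

A heap with $k$ cells has $2^{k}$ splittings, so this brute-force check is only meant for small examples.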

SeqSL can express properties that are widely used in logical reasoning about pointer programs. We list some of them below. These notations will be used throughout this paper.

\[
\begin{array}{ll}
\text{septraction} & \varphi_{1}\mathrel{{-\!\!*}\mkern-15mu^{\lnot}}\varphi_{2}\overset{\triangle}{=}\lnot(\varphi_{1}\mathrel{-\!\!*}\lnot\varphi_{2}) \\
\text{universal quantifier} & \forall\alpha.\,\varphi\overset{\triangle}{=}\lnot\exists\alpha.\;\lnot\varphi \\
\alpha\text{ is stored in }x & x\hookrightarrow\alpha\overset{\triangle}{=}x\mapsto\alpha*\mathbf{true} \\
x\text{ has been allocated} & \mathtt{alloc}(x)\overset{\triangle}{=}\exists\alpha.\;x\hookrightarrow\alpha \\
\alpha_{2}\text{ is stored in }\alpha_{1}(x_{1}) & \alpha_{1}(x_{1})\hookrightarrow\alpha_{2}\overset{\triangle}{=}\exists x.\;x=\alpha_{1}(x_{1})\land x\hookrightarrow\alpha_{2}.
\end{array}
\]

III-B Expressiveness of sequence-heap separation logic

In this subsection, we list a few examples to show the expressiveness of SeqSL. We divide the examples into two categories: properties on variable-length sequences in programs, and properties on multilevel data structures.

Similar to classical separation logic, sequence-heap separation logic does not distinguish between locations and values in formulae. To accommodate more scenarios in practice, we use the atom $\#$ to manually separate locations and values. The sequence singleton heap can thus be written as $x\mapsto\alpha_{l}\circ\#\circ\alpha_{x}$, where $\alpha_{l}$ denotes a sequence of locations and $\alpha_{x}$ denotes a sequence of values.

III-B1 Properties on variable-length sequences in programs

We list some properties on graphs to show the expressiveness of SeqSL in describing variable-length sequences in programs. These properties include out-degree and reachability.

Out-degree. The out-degree of node $x$ is denoted by $\mathtt{Outdeg}(x)$. The formula $\mathtt{Outdeg}(x)=n_{x}$ means that the out-degree of node $x$ is $n_{x}$. It is defined as follows:

\[
\mathtt{Outdeg}(x)=n_{x}\;\overset{\triangle}{=}\;\exists\alpha_{l}\exists\alpha_{x}.\;x\hookrightarrow\alpha_{l}\circ\#\circ\alpha_{x}\land|\alpha_{l}|=n_{x}. \tag{5}
\]

Note that the reserved word $\mathbf{nil}$ might occur in $\alpha_{l}$. This does not affect the truth of the definition, because $\mathbf{nil}$ can be viewed as location 0. Its out-degree is 0, and its in-degree is not necessarily 0.
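As a concrete reading of the out-degree formula, the sketch below computes the out-degree of a node from a heap represented as a Python dictionary from locations to tuples. The function name `outdeg` and the heap encoding are illustrative assumptions; the sketch also assumes that each cell contains exactly one `#`, so that the split into $\alpha_{l}$ and $\alpha_{x}$ is unique.

```python
SEP = "#"   # the atom that separates locations from values in a cell

def outdeg(h, x):
    """Return n such that Outdeg(x) = n holds in heap h, or None if x stores no '#'-cell."""
    if x not in h or SEP not in h[x]:
        return None
    # |alpha_l| is the number of items stored before the separator
    return h[x].index(SEP)
```

With `h = {1: (2, 3, "#", 9)}`, node 1 stores the locations 2 and 3 followed by the value 9.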

Reachability. First, we define one-step reachability $x_{1}\leadsto x_{2}$, which means that there exists an edge from $x_{1}$ to $x_{2}$.

\[
x_{1}\leadsto x_{2}\overset{\triangle}{=}\exists\alpha_{l_{1}}\exists\alpha_{l_{2}}\exists\alpha_{x}.\;x_{1}\hookrightarrow\alpha_{l_{1}}\circ x_{2}\circ\alpha_{l_{2}}\circ\#\circ\alpha_{x}.
\]

Then we define $\mathtt{reach}^{n}(x_{1},x_{2})$, which means that there exists a path of length $n$ ($n\geq 0$) from $x_{1}$ to $x_{2}$.

\[
\begin{aligned}
\mathtt{reach}^{0}(x_{1},x_{2}) &\quad\overset{\triangle}{=}\quad x_{1}=x_{2} \\
\mathtt{reach}^{n+1}(x_{1},x_{2}) &\quad\overset{\triangle}{=}\quad \exists x_{3}.\;x_{1}\leadsto x_{3}*\mathtt{reach}^{n}(x_{3},x_{2})
\end{aligned}
\]

Finally, we define $\mathtt{reach}(x_{1},x_{2})$, which means that there exists a path from $x_{1}$ to $x_{2}$ whose length is greater than or equal to 0.

\[
\mathtt{reach}(x_{1},x_{2})\overset{\triangle}{=}\exists n.\;n\geq 0\land\mathtt{reach}^{n}(x_{1},x_{2}).
\]

Note that $\mathtt{reach}(x_{1},x_{2})$ defined in SeqSL is different from $\mathtt{ls}(x_{1},x_{2})$ defined in the separation logic fragments for shape analysis. The former can describe reachability on graphs without restrictions on the out-degree of each node, while the latter can only describe reachability on linked lists.
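On a concrete finite heap, the reachability predicate can be illustrated by a breadth-first search over the location part of each cell. This is a sketch under assumptions: the names `successors` and `reach` and the dictionary encoding of heaps are ours, and the search returns the length of a shortest path. A shortest path never leaves the same node twice, which matches the requirement that the cells used by successive steps be disjoint.

```python
from collections import deque

SEP = "#"   # the atom that separates locations from values in a cell

def successors(h, x):
    """Locations stored before the first '#' in the cell at x (empty if there is none)."""
    cell = h.get(x, ())
    return cell[:cell.index(SEP)] if SEP in cell else ()

def reach(h, x1, x2):
    """Return the length of a shortest path from x1 to x2 in heap h, or None."""
    dist = {x1: 0}
    queue = deque([x1])
    while queue:
        node = queue.popleft()
        if node == x2:
            return dist[node]
        for nxt in successors(h, node):
            if nxt not in dist:            # visit every node at most once
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return None
```

Because node 1 in a heap such as `{1: (2, 3, "#"), 2: (4, "#", 7)}` has two successors, this covers graphs with arbitrary out-degree rather than linked lists only.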

III-B2 Properties on multilevel data structures

We list some properties on two kinds of storage systems to show the expressiveness of SeqSL in describing multilevel data structures.

Unlike in the preceding examples, the model $\sigma=(s_{x},s_{\alpha},h)$ of SeqSL describing storage systems should be interpreted on disk, not in memory.

Properties on Windows storage systems. The tree-based structure of Windows storage systems can be expressed in SeqSL. The indices $\alpha_{l}$ of files are stored in the location $l$. The file $\alpha_{x}$ is stored in the location $l_{1}$.

\[
l\hookrightarrow\alpha_{l}\circ\#\circ\varepsilon\;*\;\forall l_{1}\exists\alpha_{x}.\;(l_{1}\,\overline{\in}\,\alpha_{l}\Rightarrow l_{1}\hookrightarrow\varepsilon\circ\#\circ\alpha_{x}). \tag{6}
\]

Note that the sub-formula $\forall l_{1}\exists\alpha_{x}.\;(l_{1}\,\overline{\in}\,\alpha_{l}\Rightarrow l_{1}\hookrightarrow\varepsilon\circ\#\circ\alpha_{x})$ cannot be replaced by $\forall x.\;\alpha_{l}(x)\hookrightarrow\varepsilon\circ\#\circ\alpha_{x}$. The reason is that $\alpha_{l}$ is a finite sequence: when $x$ exceeds the length of $\alpha_{l}$, it falsifies $\forall x.\;\alpha_{l}(x)\hookrightarrow\varepsilon\circ\#\circ\alpha_{x}$. So we have to restrict $l_{1}$ with $l_{1}\,\overline{\in}\,\alpha_{l}$.

We apply $\#$ in Equation 6 to describe two-tier data structures. The first tier merely stores locations $\alpha_{l}$, while the second tier merely stores data $\alpha_{x}$. These two different forms of contents prevent the location $l_{1}$ in the second tier from pointing to the location $l$ in the first tier.

With Equation 6 in hand, we can describe properties of files. We list two of them.

A file can be found by its index $x_{1}$:

\[
\exists\alpha_{x}.\;\alpha_{l}(x_{1})\hookrightarrow\varepsilon\circ\#\circ\alpha_{x}.
\]

The content $x_{1}$ can be found in the file $\alpha_{x}$:

\[
\exists x_{2}.\;x_{1}=\alpha_{x}(x_{2}).
\]

In addition, we can apply $x\mapsto\alpha_{l}\circ\#^{n}\circ\alpha_{x}$ to describe the $n$-th tier of a multilevel data structure. This is useful for describing multilevel pointers in C++, multilevel paging, multilevel addressing, etc.

Properties on block-based cloud storage systems. SeqSL can be used to describe blocks in cloud storage systems. Generally speaking, the sequence $\alpha$ in the formula $x\hookrightarrow\alpha$ can be viewed as a block. With a relative address $l$, the content $\alpha(l)$ can be found.

For convenience, we define the predicate $\mathtt{IncIndex}(\alpha_{0},n)$, which means that $\alpha_{0}$ has length $n+1$ and is strictly increasing. The first item of $\alpha_{0}$ is 1, and the last is $n+1$.

\[
\begin{aligned}
\mathtt{IncIndex}(\alpha_{0},n)\;\overset{\triangle}{=}\;&\mathtt{Inc}(\alpha_{0})\land|\alpha_{0}|=n+1 \\
&\land 1=\alpha_{0}(1)\land n+1=\alpha_{0}(n+1).
\end{aligned}
\]
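The predicate can be checked directly on a concrete sequence. The sketch below assumes that $\mathtt{Inc}$ means "strictly increasing", as the prose above states, and uses the hypothetical name `inc_index`; Python tuples are 0-indexed, so item $\alpha_{0}(i)$ is `a0[i - 1]`.

```python
def inc_index(a0, n):
    """Check IncIndex(a0, n): strictly increasing, length n+1, first item 1, last item n+1."""
    if n < 0 or len(a0) != n + 1:
        return False
    strictly_increasing = all(a < b for a, b in zip(a0, a0[1:]))
    # a0[0] is alpha_0(1) and a0[n] is alpha_0(n+1) in the paper's 1-based indexing
    return strictly_increasing and a0[0] == 1 and a0[n] == n + 1
```

Over the natural numbers, a strictly increasing sequence of $n+1$ items that starts at 1 and ends at $n+1$ is necessarily $1,2,\dots,n+1$.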

A big file $\alpha$ stored in location $l$ is divided into $n$ smaller blocks and stored on disk. The indices $\alpha_{l}$ of the file are stored in location $l$. The function $\mathtt{Trunc}(\alpha,x_{1},x_{2})$ can be used to locate these blocks. The sequence variable $\alpha_{0}$ in the following formula represents the sequence of division points, which should make the predicate $\mathtt{IncIndex}(\alpha_{l},n)$ true. The length of each block $\alpha_{x}$ can additionally be fixed to 64 by the formula $|\alpha_{x}|=64$.

\[
\begin{aligned}
\exists\alpha_{0}.\;\Bigl(&\mathtt{IncIndex}(\alpha_{l},n)\land l\hookrightarrow\alpha_{l}\circ\#\circ\varepsilon \\
&*\bigl(\forall l_{1}\exists x.\;(l_{1}=\alpha_{l}(x)\land\alpha_{x}==\mathtt{Trunc}(\alpha,\alpha_{0}(x),\alpha_{0}(x+1))) \\
&\qquad\Rightarrow l_{1}\hookrightarrow\varepsilon\circ\#\circ\alpha_{x}\bigr)\Bigr).
\end{aligned}
\]

The formula defined above can be illustrated by Fig. (2).

Figure 2: Structure of block-based cloud storage system
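The division of a file into blocks can be sketched as follows. We assume here that $\mathtt{Trunc}(\alpha,x_{1},x_{2})$ returns the items of $\alpha$ at 1-based positions $x_{1}$ to $x_{2}-1$, which is how we read its definition at the end of Section II; the names `trunc` and `blocks` are hypothetical.

```python
def trunc(alpha, x1, x2):
    """Items of alpha at 1-based positions x1 .. x2-1 (assumed reading of Trunc)."""
    return alpha[x1 - 1 : x2 - 1]

def blocks(alpha, a0):
    """Cut alpha at the division points a0 (1-based, strictly increasing)."""
    return [trunc(alpha, a0[i], a0[i + 1]) for i in range(len(a0) - 1)]
```

For example, cutting the file `(10, 20, 30, 40, 50)` at the division points `(1, 3, 6)` yields one block for positions 1 to 2 and one for positions 3 to 5; each block would then be stored at the location given by the corresponding item of $\alpha_{l}$.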

IV Decidable fragments

In this section, we consider a decidable $\Sigma_{1}$ fragment of SeqSL. It contains sequence singleton heaps, separating conjunction, separating implication, and equality and concatenation on sequences. We present the following main result of this section.

Theorem IV.1 (Decidable fragment of SeqSL)

The satisfiability problem of the language is decidable when it is restricted as follows:

\[
\begin{aligned}
t_{x}\quad &::= \quad\mathbf{nil}\mid\#\mid x \\
t_{\alpha}\quad &::= \quad\varepsilon\mid t_{x}\mid\alpha\mid t_{\alpha}\circ t_{\alpha} \\
\varphi\quad &::= \quad t_{x}=t_{x}\mid t_{\alpha}==t_{\alpha}\mid\mathbf{false}\mid\varphi\Rightarrow\varphi\mid\mathbf{emp} \\
&\hphantom{::=}\quad\mid x\mapsto t_{\alpha}\mid\varphi*\varphi\mid\varphi\mathrel{-\!\!*}\varphi.
\end{aligned}
\]

The model of the fragment is the same as that of SeqSL, which is defined in Definition III.2.

We call the above fragment the $\Sigma_{1}$ fragment of separation logic (PSeqSL, where the letter 'P' means propositional). PSeqSL is a non-trivial fragment for the following reasons:

  • The expressiveness of PSeqSL is stronger than that of the $\Sigma_{1}$ fragment of the classical separation logic proposed in [1], because the former can express properties on variable-length sequences and on multilevel data structures.

  • The heap stores variable-length sequences in PSeqSL. The models of a formula in PSeqSL may have more possibilities than those of a formula in classical separation logic.

  • Sequences and heap operations in the fragment are not independent. Their satisfiability may affect each other. For instance, to decide whether the formula $x_{1}\mapsto\alpha_{1}\land x_{1}\mapsto\alpha_{2}$ is true, we need to consider both the conditions on the heap and the implicit condition $\alpha_{1}==\alpha_{2}$ on sequences. To decide whether the formula $(x_{1}\mapsto\alpha_{1}*x_{2}\mapsto\alpha_{2})\land(x_{1}\mapsto\alpha_{3}*x_{2}\mapsto\alpha_{4})\land x_{1}\neq x_{2}$ is true, we need to consider the implicit condition $\alpha_{1}==\alpha_{3}\land\alpha_{2}==\alpha_{4}$.

  • PSeqSL can describe some of the formulae of SeqSL that contain existential quantifiers. The property $\mathtt{alloc}(x)\overset{\triangle}{=}\exists\alpha.\;x\hookrightarrow\alpha$ in SeqSL can be expressed in PSeqSL as:

    \[
    \mathtt{alloc}(x)\overset{\triangle}{=}(x\mapsto\mathbf{nil})\mathrel{-\!\!*}\mathbf{false}.
    \]
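The equivalence behind this encoding of $\mathtt{alloc}(x)$ can be unfolded from Definition III.4. The derivation below is a sketch under the assumption that $s_{x}(x)\notin\text{Atoms}$; otherwise no heap satisfies $x\mapsto\mathbf{nil}$ and the separating implication holds vacuously.

```latex
\begin{aligned}
&(s_{x},s_{\alpha},h)\models(x\mapsto\mathbf{nil})\mathrel{-\!\!*}\mathbf{false} \\
\iff\;&\text{no heap } h_{1} \text{ satisfies } \mathtt{dom}(h_{1})\cap\mathtt{dom}(h)=\varnothing
       \text{ and } (s_{x},s_{\alpha},h_{1})\models x\mapsto\mathbf{nil} \\
\iff\;&\{s_{x}(x)\}\cap\mathtt{dom}(h)\neq\varnothing
       \qquad\text{(the only candidate is } h_{1}=\{(s_{x}(x),\mathbf{nil})\}\text{)} \\
\iff\;&s_{x}(x)\in\mathtt{dom}(h) \\
\iff\;&(s_{x},s_{\alpha},h)\models\exists\alpha.\;x\hookrightarrow\alpha
       \qquad\text{(choose } \alpha \text{ to be } h(s_{x}(x))\text{)}.
\end{aligned}
```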

We prove Theorem IV.1 in the following three steps:

  1. Prove that the following problem is decidable: given a formula $\varphi$ in PSeqSL and a part of the model $s_{x},h$, decide whether there exists $s_{\alpha}$ such that $(s_{x},s_{\alpha},h)\models\varphi$.

  2. Prove that the following problem is decidable: given a formula $\varphi$ in PSeqSL and a part of the model $s_{x}$, decide whether there exist $s_{\alpha},h$ such that $(s_{x},s_{\alpha},h)\models\varphi$.

  3. Prove that the following problem is decidable: given a formula $\varphi$ in PSeqSL, decide whether there exist $s_{x},s_{\alpha},h$ such that $(s_{x},s_{\alpha},h)\models\varphi$. (Theorem IV.1)

The first step is the main part of the proof. The result on the fragment without separating implications is trivial, because the finiteness of $h$ leads to finitely many possible models satisfying the formula $\varphi$.

Similar to the paper [13], we define the size of the formula $\varphi$, written $\mathtt{sz}(\varphi)$, to represent the maximum heap size needed for deciding whether the formula $\varphi$ is true or its negation is true.

Definition IV.1 (Size of formula)

The size $\mathtt{sz}(\varphi)$ of the formula $\varphi$ is defined inductively as follows:

\[
\begin{array}{ll}
\mathtt{sz}(t_{x_{1}}=t_{x_{2}})=0 & \mathtt{sz}(t_{\alpha_{1}}==t_{\alpha_{2}})=0 \\
\mathtt{sz}(\mathbf{false})=0 & \mathtt{sz}(\mathbf{emp})=1 \\
\mathtt{sz}(x\mapsto t_{\alpha})=1 & \mathtt{sz}(\varphi_{1}\mathrel{-\!\!*}\varphi_{2})=\mathtt{sz}(\varphi_{2}) \\
\multicolumn{2}{l}{\mathtt{sz}(\varphi_{1}\Rightarrow\varphi_{2})=\mathtt{max}(\mathtt{sz}(\varphi_{1}),\mathtt{sz}(\varphi_{2}))} \\
\multicolumn{2}{l}{\mathtt{sz}(\varphi_{1}*\varphi_{2})=\mathtt{sz}(\varphi_{1})+\mathtt{sz}(\varphi_{2}).}
\end{array}
\]
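The size function is a direct structural recursion. The sketch below assumes a tuple encoding of PSeqSL formulae whose tags (`"eq"`, `"seq_eq"`, `"false"`, `"emp"`, `"pto"`, `"imp"`, `"star"`, `"wand"`) are our own naming, not the paper's.

```python
def sz(phi):
    """Size of a PSeqSL formula, following Definition IV.1 clause by clause."""
    tag = phi[0]
    if tag in ("eq", "seq_eq", "false"):
        return 0
    if tag in ("emp", "pto"):
        return 1
    if tag == "imp":
        return max(sz(phi[1]), sz(phi[2]))
    if tag == "star":
        return sz(phi[1]) + sz(phi[2])
    if tag == "wand":                 # only the consequent contributes
        return sz(phi[2])
    raise ValueError(f"unknown connective: {tag!r}")
```

For a formula of the shape $x_{1}\mapsto t*(\varphi_{1}\mathrel{-\!\!*}(x_{1}\mapsto t'*x_{2}\mapsto t''))$ the recursion adds 1 for the first cell and 2 for the consequent of the separating implication.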

We define free program variables as follows.

Definition IV.2 (Free program variables in PSeqSL)

In PSeqSL, the free program variables $\mathrm{FV}_{t,x}(t_{x})$ in the term $t_{x}$ and the free program variables $\mathrm{FV}_{t,x}(t_{\alpha})$ in the term $t_{\alpha}$ are defined inductively as follows:

\[
\begin{array}{lll}
\mathrm{FV}_{t,x}(\mathbf{nil})=\{\} & \mathrm{FV}_{t,x}(\#)=\{\} & \mathrm{FV}_{t,x}(x)=\{x\} \\
\mathrm{FV}_{t,x}(\alpha)=\{\} & \mathrm{FV}_{t,x}(\varepsilon)=\{\} & \\
\multicolumn{3}{l}{\mathrm{FV}_{t,x}(t_{\alpha_{x_{1}}}\circ t_{\alpha_{x_{2}}})=\mathrm{FV}_{t,x}(t_{\alpha_{x_{1}}})\cup\mathrm{FV}_{t,x}(t_{\alpha_{x_{2}}}).}
\end{array}
\]

The free program variables $\mathrm{FV}_{x}(\varphi)$ in the formula $\varphi$ are defined inductively as follows:

\[
\begin{aligned}
\mathrm{FV}_{x}(t_{x_{1}}=t_{x_{2}})&=\mathrm{FV}_{t,x}(t_{x_{1}})\cup\mathrm{FV}_{t,x}(t_{x_{2}}) \\
\mathrm{FV}_{x}(t_{\alpha_{1}}==t_{\alpha_{2}})&=\mathrm{FV}_{t,x}(t_{\alpha_{1}})\cup\mathrm{FV}_{t,x}(t_{\alpha_{2}}) \\
\mathrm{FV}_{x}(\mathbf{false})&=\{\} \\
\mathrm{FV}_{x}(\varphi_{1}\Rightarrow\varphi_{2})&=\mathrm{FV}_{x}(\varphi_{1})\cup\mathrm{FV}_{x}(\varphi_{2}) \\
\mathrm{FV}_{x}(\mathbf{emp})&=\{\} \\
\mathrm{FV}_{x}(x\mapsto t_{\alpha})&=\{x\}\cup\mathrm{FV}_{t,x}(t_{\alpha}) \\
\mathrm{FV}_{x}(\varphi_{1}*\varphi_{2})&=\mathrm{FV}_{x}(\varphi_{1})\cup\mathrm{FV}_{x}(\varphi_{2}) \\
\mathrm{FV}_{x}(\varphi_{1}\mathrel{-\!\!*}\varphi_{2})&=\mathrm{FV}_{x}(\varphi_{1})\cup\mathrm{FV}_{x}(\varphi_{2}).
\end{aligned}
\]
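The two definitions above translate into two mutually independent recursions, one over terms and one over formulae. The tuple encoding and the names `fv_term` and `fv` are illustrative assumptions.

```python
def fv_term(t):
    """Free program variables of an individual or sequence term."""
    tag = t[0]
    if tag == "var":
        return {t[1]}
    if tag == "cat":
        return fv_term(t[1]) | fv_term(t[2])
    return set()                      # nil, #, eps and sequence variables

def fv(phi):
    """Free program variables of a PSeqSL formula."""
    tag = phi[0]
    if tag in ("false", "emp"):
        return set()
    if tag in ("eq", "seq_eq"):
        return fv_term(phi[1]) | fv_term(phi[2])
    if tag == "pto":                  # ("pto", x, t_alpha) encodes x |-> t_alpha
        return {phi[1]} | fv_term(phi[2])
    if tag in ("imp", "star", "wand"):
        return fv(phi[1]) | fv(phi[2])
    raise ValueError(f"unknown connective: {tag!r}")
```

Sequence variables never contribute, so a cell such as $x_{1}\mapsto\alpha_{1}\circ x_{3}$ yields exactly $\{x_{1},x_{3}\}$.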

We define the set of sequence terms to collect the terms appearing in the formula. It helps us to get a finite set of all possible heap models which may satisfy the formula.

Definition IV.3 (Set of sequence terms)

The set of sequence terms $\mathtt{SeqTerms}(\varphi)$ appearing in the formula $\varphi$ is defined inductively as follows:

\[
\begin{aligned}
\mathtt{SeqTerms}(t_{x_{1}}=t_{x_{2}})&=\{t_{x_{1}},t_{x_{2}}\} \\
\mathtt{SeqTerms}(t_{\alpha_{1}}==t_{\alpha_{2}})&=\{t_{\alpha_{1}},t_{\alpha_{2}}\} \\
\mathtt{SeqTerms}(\mathbf{false})&=\{\} \\
\mathtt{SeqTerms}(\mathbf{emp})&=\{\} \\
\mathtt{SeqTerms}(x\mapsto t_{\alpha})&=\{x,t_{\alpha}\} \\
\mathtt{SeqTerms}(\varphi_{1}\Rightarrow\varphi_{2})&=\mathtt{SeqTerms}(\varphi_{1})\cup\mathtt{SeqTerms}(\varphi_{2}) \\
\mathtt{SeqTerms}(\varphi_{1}*\varphi_{2})&=\mathtt{SeqTerms}(\varphi_{1})\cup\mathtt{SeqTerms}(\varphi_{2}) \\
\mathtt{SeqTerms}(\varphi_{1}\mathrel{-\!\!*}\varphi_{2})&=\mathtt{SeqTerms}(\varphi_{1})\cup\mathtt{SeqTerms}(\varphi_{2}).
\end{aligned}
\]
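Collecting the sequence terms is again a structural recursion; the result is a finite set because every formula is finite. The sketch assumes that terms are encoded as nested tuples (so that they are hashable) and uses the hypothetical name `seq_terms`.

```python
def seq_terms(phi):
    """Set of terms appearing in a PSeqSL formula, following Definition IV.3."""
    tag = phi[0]
    if tag in ("false", "emp"):
        return set()
    if tag in ("eq", "seq_eq"):
        return {phi[1], phi[2]}
    if tag == "pto":                  # ("pto", x, t_alpha): both x and t_alpha are collected
        return {("var", phi[1]), phi[2]}
    if tag in ("imp", "star", "wand"):
        return seq_terms(phi[1]) | seq_terms(phi[2])
    raise ValueError(f"unknown connective: {tag!r}")
```

Note that only the terms occurring at the top level of an atomic formula are collected, not their sub-terms, which mirrors the clauses of the definition.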

Separating implication involves universal quantification over heaps. The paper [13] presents a solution for searching all possible models of a formula in the classical $\Sigma_{1}$ separation logic fragment: it restricts the domain and the range of the heap $h$ to finite sets. To deal with sequences, we come up with a different restriction on the range of $h$. The range of $h$ consists of the values of all sequence terms appearing in the formula, the empty sequence $\varepsilon$, and a fresh sequence term whose value is different from those of all sequence terms in the formula. Thus, universal quantifiers over heaps can be reduced to those over finitely many heaps. The reduction preserves satisfiability.

We present the following lemma to show the small model property of the formula $\varphi_{1}\mathrel{-\!\!*}\varphi_{2}$.

Lemma IV.1

Given the model $\sigma=(s_{x},s_{\alpha},h)$ and formulae $\varphi$, $\varphi_{1}$ and $\varphi_{2}$ satisfying $\varphi=\varphi_{1}\mathrel{-\!\!*}\varphi_{2}$, let $L=\mathrm{FV}_{x}(\varphi_{1})\cup\mathrm{FV}_{x}(\varphi_{2})$, let $B$ be the set containing the first $\mathtt{max}(\mathtt{sz}(\varphi_{1}),\mathtt{sz}(\varphi_{2}))$ values in $\mathrm{Loc}\setminus(\mathtt{dom}(h)\cup s(L))$, and let $\overline{\beta}$ be a fresh sequence variable satisfying $\llbracket\overline{\beta}\rrbracket\sigma^{\prime}\notin\{\llbracket t_{\alpha}\rrbracket\sigma^{\prime}\mid t_{\alpha}\in\mathtt{SeqTerms}(\varphi)\}$. We define $\sigma^{\prime}=(s_{x},s_{\alpha}^{\prime},h)$, where $s_{\alpha}^{\prime}$ contains a new assignment for $\overline{\beta}$. Let $\mathcal{D}=B\cup s(L)$ and $\mathcal{R}=\{\llbracket t_{\alpha}\rrbracket\sigma^{\prime}\mid t_{\alpha}\in\mathtt{SeqTerms}(\varphi)\}\cup\{\varepsilon,\llbracket\overline{\beta}\rrbracket\sigma^{\prime}\}$. Then $(s_{x},s_{\alpha},h)\models\varphi_{1}\mathrel{-\!\!*}\varphi_{2}$ holds if and only if, for all $h^{\prime}$ satisfying:

  • $\mathtt{dom}(h^{\prime})\cap\mathtt{dom}(h)=\varnothing$ and $(s_{x},s_{\alpha}^{\prime},h^{\prime})\models\varphi_{1}$,

  • $\mathtt{dom}(h^{\prime})\subseteq\mathcal{D}$,

  • $\mathtt{rng}(h^{\prime})\subseteq\mathcal{R}$,

the proposition $(s_{x},s_{\alpha}^{\prime},h\uplus h^{\prime})\models\varphi_{2}$ holds, where $\mathtt{rng}(h^{\prime})$ denotes the range of $h^{\prime}$.
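The finite search space that the lemma describes can be enumerated explicitly. The following sketch makes two assumptions that are not fixed by the paper: locations are the positive integers, so "the first $k$ values" are the $k$ smallest unused ones, and the set $\mathcal{R}$ is passed in as an already evaluated list of sequences. All function names are hypothetical.

```python
from itertools import count, islice, product

def fresh_locations(k, used):
    """The first k locations (assumed to be 1, 2, 3, ...) that are not in `used`."""
    return list(islice((l for l in count(1) if l not in used), k))

def candidate_domain(s_x, h, free_vars, k):
    """D = B ∪ s(L), where B holds k fresh locations and L is the set of free variables."""
    s_L = {s_x[x] for x in free_vars}
    B = fresh_locations(k, set(h) | s_L)
    return set(B) | s_L

def candidate_heaps(D, R, h):
    """Yield every heap h' with dom(h') ⊆ D, dom(h') ∩ dom(h) = ∅ and rng(h') ⊆ R."""
    locs = sorted(D - set(h))
    # each location is either left unallocated (None) or mapped to a sequence from R
    for choice in product([None] + list(R), repeat=len(locs)):
        yield {l: v for l, v in zip(locs, choice) if v is not None}
```

With $|\mathcal{R}|=r$ and $d$ usable locations the enumeration produces $(r+1)^{d}$ heaps, which is finite but exponential in the size of the formula.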

According to Lemma IV.1, we define the reduction function as follows.

Definition IV.4

The reduction function $T(s_{x},h,\varphi)$ is a mapping from a stack $s_{x}$, a heap $h$, and a PSeqSL formula $\varphi$ to a formula in $\Sigma_{1}$ sequence predicate logic. The reduction is defined inductively in Fig. (3),

\[
\begin{array}{lcl}
T(s_{x},h,t_{x_{1}}=t_{x_{2}}) & \overset{\triangle}{=} & t_{x_{1}}==t_{x_{2}} \\
T(s_{x},h,t_{\alpha_{1}}==t_{\alpha_{2}}) & \overset{\triangle}{=} & t_{\alpha_{1}}==t_{\alpha_{2}} \\
T(s_{x},h,\mathbf{false}) & \overset{\triangle}{=} & \mathbf{false} \\
T(s_{x},h,\varphi_{1}\Rightarrow\varphi_{2}) & \overset{\triangle}{=} & T(s_{x},h,\varphi_{1})\Rightarrow T(s_{x},h,\varphi_{2}) \\
T(s_{x},h,\mathbf{emp}) & \overset{\triangle}{=} & \mathtt{dom}(h)=\varnothing \\
T(s_{x},h,x\mapsto t_{\alpha}) & \overset{\triangle}{=} & \mathtt{dom}(h)=\{s_{x}(x)\}\land s_{x}(x)\notin\mathrm{Atoms}\land h(s_{x}(x))==t_{\alpha} \\
T(s_{x},h,\varphi_{1}*\varphi_{2}) & \overset{\triangle}{=} & \displaystyle\bigvee_{h=h_{1}\uplus h_{2}}(T(s_{x},h_{1},\varphi_{1})\land T(s_{x},h_{2},\varphi_{2})) \\
T(s_{x},h,\varphi_{1}\mathrel{-\!\!*}\varphi_{2}) & \overset{\triangle}{=} & \displaystyle\bigwedge_{\substack{\mathtt{dom}(h_{\varphi})\cap\mathtt{dom}(h)=\varnothing\\ \mathtt{dom}(h_{\varphi})\subseteq\mathcal{D}\\ \mathtt{rng}(h_{\varphi})\subseteq\mathcal{R}}}(T(s_{x},h_{\varphi},\varphi_{1})\Rightarrow T(s_{x},h_{\varphi}\uplus h,\varphi_{2})),
\end{array}
\]

Figure 3: The reduction function $T(s_{x},h,\varphi)$

where $\varphi=\varphi_{1}\mathrel{-\!\!*}\varphi_{2}$.

Given $s_{x}$ and $h$, the formula $T(s_{x},h,\varphi_{1}*\varphi_{2})$ consists of finitely many terms of the form $T(s_{x},h,\varphi)$, because the implicit existential quantifier in separating conjunction does not create new heaps. The formula $T(s_{x},h,\varphi_{1}\mathrel{-\!\!*}\varphi_{2})$ also consists of finitely many terms of the form $T(s_{x},h,\varphi)$ according to Lemma IV.1. Moreover, the truth values of the three formulae $\mathtt{dom}(h)=\varnothing$, $\mathtt{dom}(h)=\{s_{x}(x)\}$, and $s_{x}(x)\notin\mathrm{Atoms}$ can be determined, as well as the value of $h(s_{x}(x))$ (which is a sequence). So $T(s_{x},h,\varphi)$ consists only of equations $t_{\alpha_{1}}==t_{\alpha_{2}}$ and their conjunctions, disjunctions and negations. It can be further reduced to a single word equation by Theorem II.1. The satisfiability problem of PSeqSL is thus reduced to that of $\Sigma_{1}$ sequence predicate logic. The $\Sigma_{1}$ fragment of sequence predicate logic is shown to be decidable in [19].
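The following sketch illustrates the reduction for the clauses that do not involve separating implication. It is not the authors' implementation: the tuple encoding, the helper `splits`, and the extra constant `("true",)` (used where the heap conditions are already known to hold) are assumptions, and the `"wand"` case is omitted because it additionally needs the sets $\mathcal{D}$ and $\mathcal{R}$ of Lemma IV.1.

```python
from itertools import combinations

ATOMS = {"nil", "#"}

def splits(h):
    """Yield every pair (h1, h2) of disjoint heaps whose union is h."""
    locs = list(h)
    for r in range(len(locs) + 1):
        for chosen in combinations(locs, r):
            yield ({l: h[l] for l in chosen},
                   {l: h[l] for l in locs if l not in chosen})

def T(s_x, h, phi):
    """Reduce a wand-free PSeqSL formula to a formula over sequence equations."""
    tag = phi[0]
    if tag == "false":
        return ("false",)
    if tag in ("eq", "seq_eq"):
        return ("seq_eq", phi[1], phi[2])
    if tag == "emp":                       # dom(h) = {} is decided on the spot
        return ("true",) if not h else ("false",)
    if tag == "pto":                       # ("pto", x, t_alpha)
        loc = s_x[phi[1]]
        if loc in ATOMS or set(h) != {loc}:
            return ("false",)
        return ("seq_eq", ("const", h[loc]), phi[2])
    if tag == "imp":
        return ("imp", T(s_x, h, phi[1]), T(s_x, h, phi[2]))
    if tag == "star":                      # one disjunct per splitting of h
        return ("or", [("and", T(s_x, h1, phi[1]), T(s_x, h2, phi[2]))
                       for h1, h2 in splits(h)])
    raise NotImplementedError(f"clause not covered by this sketch: {tag!r}")
```

For $x_{1}\mapsto\alpha_{1}*x_{2}\mapsto\alpha_{2}$ on a two-cell heap, three of the four disjuncts contain $\mathbf{false}$, and the remaining one consists of two sequence equations, one per cell.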

Lemma IV.2

Let the stack $s_{x}$ and the heap $h$ be given. For every assignment $s_{\alpha}$ and formula $\varphi$, the proposition $(s_{x},s_{\alpha},h)\models\varphi$ holds if and only if there exist assignments $(s_{x}^{\prime},s_{\alpha}^{\prime})$ such that $(s_{x}^{\prime},s_{\alpha}^{\prime})\models T(s_{x},h,\varphi)$ holds, where $s_{x}^{\prime}$ and $s_{\alpha}^{\prime}$ are assignments of program variables and sequence variables respectively, and $T(s_{x},h,\varphi)$ is the reduction from PSeqSL to $\Sigma_{1}$ sequence predicate logic given in Definition IV.4.

Note that after performing the reduction in Definition IV.4 and applying Theorem II.1, one obtains a single word equation $U==V$ with constants (say $k$ constants) and sequence variables (say $k^{\prime}$ variables). It is easy to show the following fact: there is a solution for $U==V$ if and only if there is a solution for $U==V$ on the free monoid with $k+k^{\prime}$ generators, where the first $k$ generators coincide with the $k$ constants, and the other $k^{\prime}$ generators are fresh in $\mathbb{N}$.

To give readers some intuitions for Lemma IV.2, we list two examples below.

Example IV.1

Consider whether the formula $\varphi=(x_{1}\mapsto\alpha_{1}\circ x_{3})*\big((x_{1}\mapsto\alpha_{1}\lor x_{1}\mapsto\alpha_{2})\mathrel{-\!\!*}(x_{1}\mapsto\alpha_{2}*x_{2}\mapsto\alpha_{3})\big)$ can be satisfied, given a stack $s$ satisfying $\{(x_{1},n_{1}),(x_{2},n_{2}),(x_{3},n_{3})\}\subseteq s$, where $n_{1},n_{2}$ and $n_{3}$ are distinct integers, and the heap $h=\{(n_{1},n_{1}\circ n_{3}),(n_{2},n_{2}\circ n_{3})\}$. The assignment $s_{\alpha}$ is still to be determined.

Suppose $\varphi=\varphi_{1}*\varphi_{2}$, where $\varphi_{1}=x_{1}\mapsto\alpha_{1}\circ x_{3}$ and $\varphi_{2}=(x_{1}\mapsto\alpha_{1}\lor x_{1}\mapsto\alpha_{2})\mathrel{-\!\!*}(x_{1}\mapsto\alpha_{2}*x_{2}\mapsto\alpha_{3})$. We have the formula in Fig. (4).

\[
\begin{aligned}
T(s_{x},h,\varphi)\quad=\quad&\bigvee_{h=h_{1}\uplus h_{2}}(T(s_{x},h_{1},\varphi_{1})\land T(s_{x},h_{2},\varphi_{2})) \\
=\quad&\big(T(s_{x},\{\},\varphi_{1})\land T(s_{x},\{(n_{1},n_{1}\circ n_{3}),(n_{2},n_{2}\circ n_{3})\},\varphi_{2})\big) \\
&\lor\big(T(s_{x},\{(n_{1},n_{1}\circ n_{3})\},\varphi_{1})\land T(s_{x},\{(n_{2},n_{2}\circ n_{3})\},\varphi_{2})\big) \\
&\lor\big(T(s_{x},\{(n_{2},n_{2}\circ n_{3})\},\varphi_{1})\land T(s_{x},\{(n_{1},n_{1}\circ n_{3})\},\varphi_{2})\big) \\
&\lor\big(T(s_{x},\{(n_{1},n_{1}\circ n_{3}),(n_{2},n_{2}\circ n_{3})\},\varphi_{1})\land T(s_{x},\{\},\varphi_{2})\big) \\
=\quad&T(s_{x},\{(n_{1},n_{1}\circ n_{3})\},\varphi_{1})\land T(s_{x},\{(n_{2},n_{2}\circ n_{3})\},\varphi_{2}).
\end{aligned}
\]

Figure 4: Construction in Example IV.1

Suppose $\varphi_{2}=\varphi_{21}\mathrel{-\!\!*}\varphi_{22}$, where $\varphi_{21}=x_{1}\mapsto\alpha_{1}\lor x_{1}\mapsto\alpha_{2}$ and $\varphi_{22}=x_{1}\mapsto\alpha_{2}*x_{2}\mapsto\alpha_{3}$. We have:

\[
\begin{aligned}
&T(s_{x},\{(n_{2},n_{2}\circ n_{3})\},\varphi_{2}) \\
={}&\bigwedge_{\substack{\mathtt{dom}(h_{\varphi_{2}})\cap\{n_{2}\}=\varnothing\\ \mathtt{dom}(h_{\varphi_{2}})\subseteq\mathcal{D}\\ \mathtt{rng}(h_{\varphi_{2}})\subseteq\mathcal{R}}}\bigl(T(s_{x},h_{\varphi_{2}},\varphi_{21})\Rightarrow T(s_{x},h_{\varphi_{2}}\uplus\{(n_{2},n_{2}\circ n_{3})\},\varphi_{22})\bigr),
\end{aligned}
\]

where

\[
\begin{aligned}
\mathcal{D}&=B\cup\{n_{1},n_{2},n_{3}\},\quad B=\{m_{1},m_{2},m_{3}\}, \\
\mathcal{R}&=\{\llbracket\alpha_{1}\circ x_{3}\rrbracket\sigma^{\prime},\llbracket\alpha_{1}\rrbracket\sigma^{\prime},\llbracket\alpha_{2}\rrbracket\sigma^{\prime},\llbracket\alpha_{3}\rrbracket\sigma^{\prime}\}\cup\{\varepsilon,\llbracket\overline{\beta}\rrbracket\sigma^{\prime}\},
\end{aligned}
\]

and $m_{1},m_{2},m_{3}$ are the first three values in $\mathrm{Loc}\setminus(\mathtt{dom}(h)\cup s(L))$. The term $\overline{\beta}$ satisfies:

\[
\llbracket\overline{\beta}\rrbracket\sigma^{\prime}\notin\{\llbracket\alpha_{1}\circ x_{3}\rrbracket\sigma^{\prime},\llbracket\alpha_{1}\rrbracket\sigma^{\prime},\llbracket\alpha_{2}\rrbracket\sigma^{\prime},\llbracket\alpha_{3}\rrbracket\sigma^{\prime}\}.
\]

The domain $\mathtt{dom}(h_{\varphi_{2}})$ ranges over finitely many possibilities. However, there are only two heaps, $h_{\varphi_{2}}=\{(n_{1},s_{\alpha}^{\prime}(\alpha_{1}))\}$ and $h_{\varphi_{2}}=\{(n_{1},s_{\alpha}^{\prime}(\alpha_{2}))\}$, which may satisfy the formula $\varphi_{21}$. So,

\begin{align*}
& T(s_{x},\{(n_{2},n_{2}\circ n_{3})\},\varphi_{2})\\
={}& \bigl(T(s_{x},\{(n_{1},s_{\alpha}^{\prime}(\alpha_{1}))\},\varphi_{21})\Rightarrow T(s_{x},\{(n_{1},s_{\alpha}^{\prime}(\alpha_{1})),(n_{2},n_{2}\circ n_{3})\},\varphi_{22})\bigr)\\
& \land\bigl(T(s_{x},\{(n_{1},s_{\alpha}^{\prime}(\alpha_{2}))\},\varphi_{21})\Rightarrow T(s_{x},\{(n_{1},s_{\alpha}^{\prime}(\alpha_{2})),(n_{2},n_{2}\circ n_{3})\},\varphi_{22})\bigr)\\
={}& T(s_{x},\{(n_{1},s_{\alpha}^{\prime}(\alpha_{1})),(n_{2},n_{2}\circ n_{3})\},\varphi_{22})\\
& \land T(s_{x},\{(n_{1},s_{\alpha}^{\prime}(\alpha_{2})),(n_{2},n_{2}\circ n_{3})\},\varphi_{22}).
\end{align*}

Similarly, we discuss the possible models satisfying the formula $\varphi_{22}$. Then we have:

\begin{align*}
T(s_{x},\{(n_{1},s_{\alpha}^{\prime}(\alpha_{1})),(n_{2},n_{2}\circ n_{3})\},\varphi_{22}) \quad&=\quad \alpha_{1}==\alpha_{2}\land n_{2}\circ n_{3}==\alpha_{3},\\
T(s_{x},\{(n_{1},s_{\alpha}^{\prime}(\alpha_{2})),(n_{2},n_{2}\circ n_{3})\},\varphi_{22}) \quad&=\quad \alpha_{2}==\alpha_{2}\land n_{2}\circ n_{3}==\alpha_{3}.
\end{align*}

So,

\begin{align*}
& T(s_{x},h,\varphi)\\
={}& (n_{1}\circ n_{3}==\alpha_{1}\circ n_{3})\land(\alpha_{1}==\alpha_{2}\land n_{2}\circ n_{3}==\alpha_{3})\\
& \land(\alpha_{2}==\alpha_{2}\land n_{2}\circ n_{3}==\alpha_{3}).
\end{align*}

We observe that $T(s_{x},h,\varphi)$ can be satisfied, for instance by $\alpha_{1}=\alpha_{2}=n_{1}$ and $\alpha_{3}=n_{2}\circ n_{3}$. Hence, given $s_{x}$ and $h$, there exists $s_{\alpha}$ such that $(s_{x},s_{\alpha},h)\models\varphi$ holds.
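The final constraint of this example can be checked mechanically. The following is only a minimal sketch: it models locations as distinct strings and sequences as Python tuples, and the function names `cat` and `example_1_constraint` as well as the chosen witness are illustrative assumptions, not part of the decision procedure itself.

```python
# Locations n1, n2, n3 are modelled as distinct strings (an assumption for
# illustration); a sequence over Loc is a tuple of such strings.
n1, n2, n3 = "n1", "n2", "n3"


def cat(*parts):
    """Concatenate sequences (tuples) from left to right."""
    out = ()
    for p in parts:
        out += p
    return out


def example_1_constraint(a1, a2, a3):
    """Evaluate the three conjuncts of T(s_x, h, phi) for given alpha values."""
    c1 = cat((n1,), (n3,)) == cat(a1, (n3,))
    c2 = a1 == a2 and cat((n2,), (n3,)) == a3
    c3 = a2 == a2 and cat((n2,), (n3,)) == a3
    return c1 and c2 and c3


# Candidate witness: alpha1 = alpha2 = n1, alpha3 = n2 . n3
witness = ((n1,), (n1,), (n2, n3))
print(example_1_constraint(*witness))
```

Any assignment with $\alpha_{1}\neq n_{1}$ fails the first conjunct, so the witness is essentially forced by the constraint.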

Example IV.2

Consider whether the formula $\varphi=x_{1}\mapsto\alpha_{1}\circ x_{3}\land\big((x_{2}\mapsto\alpha_{1}\lor x_{2}\mapsto\alpha_{2})\mathrel{-\!\!*}(x_{1}\mapsto\alpha_{1}*x_{2}\mapsto\alpha_{3})\big)$ can be satisfied, given $s=\{(x_{1},n_{1}),(x_{2},n_{2}),(x_{3},n_{3})\}$ and $h=\{(x_{1},n_{1}\circ n_{3})\}$.

Following Definition IV.4, we have:

\begin{align*}
& T(s_{x},h,\varphi)\\
={}& n_{1}\circ n_{3}==\alpha_{1}\circ n_{3}\land\big((\alpha_{1}==\alpha_{3}\land\alpha_{1}==n_{1}\circ n_{3})\\
& \land(\alpha_{2}==\alpha_{3}\land\alpha_{1}==n_{1}\circ n_{3})\big).
\end{align*}

We observe that $T(s_{x},h,\varphi)$ cannot be satisfied. Hence, given $s_{x}$ and $h$, there does not exist $s_{\alpha}$ such that $(s_{x},s_{\alpha},h)\models\varphi$ holds. We omit the details of the computation here.
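A bounded search illustrates why no sequence stack can satisfy this constraint. This is a sketch under assumptions: sequences are modelled as tuples over the three locations, the names `words`, `example_2_constraint` and `bounded_solutions` are hypothetical, and an exhaustive search up to a length bound is an illustration rather than a proof. The underlying reason is a length argument: the first conjunct forces $\alpha_{1}$ to be the one-element sequence $n_{1}$, while the other conjuncts require $\alpha_{1}$ to be the two-element sequence $n_{1}\circ n_{3}$.

```python
from itertools import product

# Distinct strings stand in for the locations n1, n2, n3 (an assumption).
n1, n2, n3 = "n1", "n2", "n3"


def words(alphabet, max_len):
    """Yield every tuple over `alphabet` of length 0..max_len."""
    for k in range(max_len + 1):
        yield from product(alphabet, repeat=k)


def example_2_constraint(a1, a2, a3):
    """Evaluate T(s_x, h, phi) of Example IV.2 for given alpha values."""
    c1 = (n1, n3) == a1 + (n3,)
    c2 = a1 == a3 and a1 == (n1, n3)
    c3 = a2 == a3 and a1 == (n1, n3)
    return c1 and c2 and c3


def bounded_solutions(max_len=3):
    """Collect all (alpha1, alpha2, alpha3) up to max_len that satisfy T."""
    ws = list(words((n1, n2, n3), max_len))
    return [(a, b, c) for a in ws for b in ws for c in ws
            if example_2_constraint(a, b, c)]


print(bounded_solutions(2))  # no assignment within the bound satisfies T
```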

For the second step of the proof, the heap $h$ is not given beforehand. The satisfiability problem of a formula in PSeqSL can be reduced to the problem in the first step by the following lemma.

Lemma IV.3

Given a stack $s_{x}$ and a PSeqSL formula $\varphi$, the following problem is decidable: whether there exist $s_{\alpha}$ and $h$ such that $(s_{x},s_{\alpha},h)\models\varphi$.

The above problem is equivalent to the following problem: given $s_{x}$, whether there exists $s_{\alpha}$ such that $(s_{x},s_{\alpha},\{\})\models\varphi\mathrel{-\!\!*^{\lnot}}\mathbf{true}$ holds. Thus we can conclude Lemma IV.3.

For the third step, the stack $s_{x}$ is not given. Similar to [13], we need to determine a finite range of $s_{x}$ such that the formula is true on this range if and only if it is true on the infinite range of $s_{x}$.

Definition IV.5

Given two models $(s_{x},s_{\alpha},h)$ and $(s_{x}^{\prime},s_{\alpha}^{\prime},h^{\prime})$, a set $P\subseteq\mathrm{PVar}$, and a set $S\subseteq\mathrm{SVar}$, the relation $(s_{x},s_{\alpha},h)\approx_{P}(s_{x}^{\prime},s_{\alpha}^{\prime},h^{\prime})$ holds if and only if there exists an isomorphism $r:\;(\mathrm{Loc}^{*},\circ)\to(\mathrm{Loc}^{*},\circ)$ such that the following conditions are satisfied:

  • for each $\alpha_{1},\alpha_{2}\in S$, we have $r(s_{\alpha}(\alpha_{1})\circ s_{\alpha}(\alpha_{2}))=r(s_{\alpha}(\alpha_{1}))\circ r(s_{\alpha}(\alpha_{2}))$,

  • $r(\mathbf{nil})=\mathbf{nil}$, $r(\#)=\#$, and for each $x\in P$, we have $r(s_{x}(x))=s_{x}^{\prime}(x)$,

  • for each $x\in P$, we have $r(h(x))=h^{\prime}(r(x))$,

  • for each $\alpha\in S$, we have $r(s_{\alpha}(\alpha))=s_{\alpha}^{\prime}(\alpha)$.

Proposition IV.1

Given two models $(s_{x},s_{\alpha},h)$ and $(s_{x}^{\prime},s_{\alpha}^{\prime},h^{\prime})$, and a formula $\varphi$ satisfying $(s_{x},s_{\alpha},h)\approx_{\mathrm{FV}_{x}(\varphi)}(s_{x}^{\prime},s_{\alpha}^{\prime},h^{\prime})$, if $(s_{x},s_{\alpha},h)\models\varphi$, then $(s_{x}^{\prime},s_{\alpha}^{\prime},h^{\prime})\models\varphi$.

Lemma IV.4

Given a model $(s_{x},s_{\alpha},h)$ and a formula $\varphi$, let $B$ be the first $|\mathrm{FV}_{x}(\varphi)|$ locations in $\mathrm{Loc}$. There exists $(s_{x}^{\prime},s_{\alpha}^{\prime},h^{\prime})$ such that $s_{x}^{\prime}(\mathrm{PVar}\setminus\mathrm{FV}_{x}(\varphi))\subseteq\{\mathbf{nil},\#\}$, $s_{x}^{\prime}(\mathrm{FV}_{x}(\varphi))\subseteq B\cup\{\mathbf{nil},\#\}$, and $(s_{x},s_{\alpha},h)\approx_{\mathrm{FV}_{x}(\varphi)}(s_{x}^{\prime},s_{\alpha}^{\prime},h^{\prime})$.

With Lemmas IV.2, IV.3 and IV.4, we can conclude that the satisfiability problem of PSeqSL is decidable, that is, Theorem IV.1 holds.

As we know, the truth of the $\Sigma_{1}$ fragment of sequence predicate logic is decidable [19]. As the fragment admits negation without any restrictions, there is a one-to-one mapping between instances of the $\Sigma_{1}$ fragment of sequence predicate logic and those of the $\Pi_{1}$ fragment of sequence predicate logic. Thus we have Corollary IV.1.

Corollary IV.1

The theory of the $\Pi_{1}$ fragment of sequence predicate logic is decidable.

The proof of Theorem IV.1 shows that PSeqSL has the small model property. The satisfiability of a formula in PSeqSL can be reduced to that of a formula of finite length in the $\Sigma_{1}$ fragment of sequence predicate logic. According to the small model property of PSeqSL and Corollary IV.1, we have Corollary IV.2.

Corollary IV.2

The satisfiability and validity problems of the $\Pi_{1}$ fragment of PSeqSL are decidable, where the $\Pi_{1}$ fragment of PSeqSL consists of formulas of the form $\forall_{x}^{*}\forall_{\alpha}^{*}.\varphi$ in which $\varphi$ is quantifier-free.

V Undecidable fragments

In this section, we mainly focus on an undecidable fragment of the form $(\forall_{x}^{*}\forall_{\alpha}^{*}\cap\forall_{x}^{*}\exists_{x}^{*}\exists_{\alpha}^{*})\text{SeqSL}(*)$. It consists of conjunctions of a formula in $(\forall_{x}^{*}\forall_{\alpha}^{*})\text{SeqSL}(*)$ and a formula in $(\forall_{x}^{*}\exists_{x}^{*}\exists_{\alpha}^{*})\text{SeqSL}(*)$, where $\exists_{\alpha}^{*}$ denotes zero or more existential quantifiers over sequence variables, and $\exists_{x}^{*}$ denotes zero or more existential quantifiers over program variables. The meanings of $\forall_{x}^{*}$ and $\forall_{\alpha}^{*}$ are similar. We present the following main result of this section.

Theorem V.1 (Undecidable fragment of SeqSL)

The satisfiability problem of the language is undecidable when it is restricted as follows:

\begin{align*}
t_{x}\quad &::=\quad \mathbf{nil}\mid\#\mid x\\
t_{\alpha}\quad &::=\quad \varepsilon\mid t_{x}\mid\alpha\mid t_{\alpha}\circ t_{\alpha}\\
\psi\quad &::=\quad t_{x}=t_{x}\mid t_{\alpha}==t_{\alpha}\mid x\hookrightarrow t_{\alpha}\mid\mathbf{false}\mid\psi\Rightarrow\psi\mid\mathbf{emp}\mid\psi*\psi\\
\varphi\quad &::=\quad \forall_{x}^{*}\forall_{\alpha}^{*}.\psi\land\forall_{x}^{*}\exists_{x}^{*}\exists_{\alpha}^{*}.\psi,
\end{align*}

where $\varphi$ is a sentence.

The model of the fragment is defined in Definition III.2.

We reduce from the halting problem of two-counter Minsky machines. Before getting into the details of the reduction, we recall the definition of a two-counter Minsky machine.

Definition V.1 (Two-counter Minsky machine)

Let $M$ be a Minsky machine with $n\geq 1$ instructions. The machine $M$ has two counters $C_{1}$ and $C_{2}$. The instructions are defined as follows:

  1. $I:\;C_{j}:=C_{j}+1;$ goto $k$,

  2. $I:$ if $C_{j}=0$, then goto $k_{1}$, else ($C_{j}:=C_{j}-1$; goto $k_{2}$),

  3. $n:$ halt,

where $j\in[1,2]$, $I\in[1,n-1]$, and $k,k_{1},k_{2}\in[1,n]$. Machine $M$ halts if there is a run of the form

\begin{align*}
(I_{0},c_{0}^{1},c_{0}^{2}),\;(I_{1},c_{1}^{1},c_{1}^{2}),\dots,(I_{m},c_{m}^{1},c_{m}^{2}),
\end{align*}

such that $(I_{i},c_{i}^{1},c_{i}^{2})\in[1,n]\times\mathbb{N}^{2}\;(i\in[1,m])$, $I_{0}=1$, $I_{m}=n$, and $c_{0}^{1}=c_{0}^{2}=0$. The run follows the instructions defined above.

The set of instructions of type 1 (denoted by $\mathcal{I}_{1}$) consists of tuples $(k_{0},c_{j},c_{j}^{\prime},k)$, and the set of instructions of type 2 (denoted by $\mathcal{I}_{2}$) consists of tuples $(k_{0},c_{j},c_{j}^{\prime},k_{1},k_{2})$, where $k_{0}$ denotes the current pointer, $c_{j},c_{j}^{\prime}\in[1,2]$ denote the two counters ($c_{j}\neq c_{j}^{\prime}$), and $k,k_{1},k_{2}$ denote the next pointers.

The problem of deciding whether a machine MM halts is known to be undecidable[28].
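Definition V.1 can be made concrete with a small simulator. The sketch below is illustrative only: the instruction encoding (`("inc", j, k)` and `("dec", j, k1, k2)`), the function name `run_minsky`, and the step budget `max_steps` are assumptions introduced here. Because halting is undecidable, the simulator can only confirm halting within the budget; returning `None` says nothing about runs that are longer.

```python
def run_minsky(program, n, max_steps=10_000):
    """Simulate a two-counter Minsky machine.

    program maps an instruction label I in [1, n-1] to
      ("inc", j, k)        -- C_j := C_j + 1; goto k
      ("dec", j, k1, k2)   -- if C_j = 0 goto k1 else (C_j := C_j - 1; goto k2)
    Label n is the halt instruction.
    Returns the run [(I_0, c1, c2), ...] if the machine halts within
    max_steps transitions, otherwise None.
    """
    pc, counters = 1, [0, 0]
    run = [(pc, counters[0], counters[1])]
    for _ in range(max_steps):
        if pc == n:
            return run
        instr = program[pc]
        if instr[0] == "inc":
            _, j, k = instr
            counters[j - 1] += 1
            pc = k
        else:
            _, j, k1, k2 = instr
            if counters[j - 1] == 0:
                pc = k1
            else:
                counters[j - 1] -= 1
                pc = k2
        run.append((pc, counters[0], counters[1]))
    return run if pc == n else None


# A toy machine with n = 4: increment C1 twice, then count it back down.
toy = {1: ("inc", 1, 2), 2: ("inc", 1, 3), 3: ("dec", 1, 4, 3)}
print(run_minsky(toy, 4))
```

For the toy machine the run visits the states $(1,0,0)$, $(2,1,0)$, $(3,2,0)$, $(3,1,0)$, $(3,0,0)$, $(4,0,0)$, which has the shape required of a halting run.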

We first show that the undecidable fragment defined in Theorem V.1 is not easy to obtain. In general, known results for both sequence logic fragments and separation logic fragments restrict the form of undecidable fragments of SeqSL.

Results on free semigroups show that the theory of the $\Sigma_{2}$ fragment [21], the positive theory of the $\Pi_{2}$ fragment [20], and the $\Sigma_{3}$ fragment [20] of word equations are undecidable. Results on separation logic show that the satisfiability problem of the $\Sigma_{3}$ fragment of separation logic is undecidable. From these results, we obtain Fact V.1.

Fact V.1

In the fragment of SeqSL in prenex normal form, if there are 3 or more alternations of quantifiers over program variables, or 2 or more over sequence variables, then the satisfiability problems of the resulting fragments are undecidable. For instance, the satisfiability problems of $(\exists_{x}^{*}\exists_{\alpha}^{*}\forall_{\alpha}^{*})\mathrm{SeqSL}$, $(\exists_{x}^{*}\forall_{\alpha}^{*}\exists_{x}^{*})\mathrm{SeqSL}$, and $(\exists_{x}^{*}\forall_{x}^{*}\forall_{\alpha}^{*}\exists_{x}^{*})\mathrm{SeqSL}$ are undecidable.

In order to get a non-trivial undecidable fragment of SeqSL with negations, we have to restrict the alternations of quantifiers over program variables to be less than 3, and those over sequence variables to be less than 2.

To prove Theorem V.1, we first discuss the encoding used in the reduction. The general structure is a sapling, which is shown in Fig. (5).

Figure 5: Structure of sapling

We consider the encoding of the sapling. A sapling is a fishbone-like structure with 'bones' going in opposite directions. It consists of a master branch and several slave branches attached to nodes in the master branch. The depth of each slave branch is 1, and their end points do not point to any node in the master branch. Similar to the encoding of the list $x\overset{\circlearrowleft}{\longrightarrow}^{+}y$ defined in [12] and the fishbone heap defined in [16], we encode the sapling by the following conditions:

  1. Each node has at most one predecessor. This is denoted by the formula $\varPsi_{1}$,

  2. The first master node has no predecessor. This is denoted by the formula $\varPsi_{2}$,

  3. The first master node is allocated, and the last node does not point to a next master node. This is denoted by the formula $\varPsi_{3}$ (it is not necessary to require that the number of predecessors of the last node is 1),

  4. For each node except the last node, there is another allocated node which follows it. This is denoted by the formula $\varPsi_{4}$.

The sapling from node $x_{0}$ to node $x_{0}^{\prime}$ (we assume here that $x_{0}\neq x_{0}^{\prime}$) can be expressed by

\begin{align}
\mathtt{sapling}(x_{0},x_{0}^{\prime})\overset{\triangle}{=}\varPsi_{1}\land\varPsi_{2}\land\varPsi_{3}\land\varPsi_{4}, \tag{7}
\end{align}

where $\varPsi_{1},\varPsi_{2},\varPsi_{3}$, and $\varPsi_{4}$ are defined in Fig. (6).

\begin{align*}
\varPsi_{1} &\quad\overset{\triangle}{=}\quad \forall_{x}x_{1}\forall_{x}x_{2}\forall_{x}x_{3}\forall_{x}x_{4}\forall_{\alpha}\alpha_{1}\forall_{\alpha}\alpha_{2}.\;(x_{1}\hookrightarrow x_{3}\circ\alpha_{1}*x_{2}\hookrightarrow x_{4}\circ\alpha_{2}\Rightarrow x_{3}\neq x_{4})\\
\varPsi_{2} &\quad\overset{\triangle}{=}\quad \forall_{x}x_{1}\forall_{\alpha}\alpha.\;\lnot(x_{1}\hookrightarrow x_{0}\circ\alpha)\\
\varPsi_{3} &\quad\overset{\triangle}{=}\quad (\exists_{x}x_{1}.\;x_{0}\hookrightarrow x_{1}\circ\varepsilon)\land(\exists_{\alpha}\alpha.\;x_{0}^{\prime}\hookrightarrow\varepsilon\circ\alpha)\\
\varPsi_{4} &\quad\overset{\triangle}{=}\quad \forall_{x}x_{1}\forall_{x}x_{2}\exists_{x}x_{3}\exists_{\alpha}\alpha_{1}\exists_{\alpha}\alpha_{2}.\;\bigl((x_{1}\hookrightarrow x_{2}\circ\alpha_{1}\land x_{2}\neq x_{0}^{\prime})\Rightarrow x_{2}\hookrightarrow x_{3}\circ\alpha_{2}\bigr).
\end{align*}
Figure 6: Definition of Ψ1,Ψ2,Ψ3\varPsi_{1},\varPsi_{2},\varPsi_{3}, and Ψ4\varPsi_{4}

It can be easily shown that the predicate $\mathtt{sapling}(x_{0},x_{0}^{\prime})$ is of the form $(\forall_{x}^{*}\forall_{\alpha}^{*}\cap\forall_{x}^{*}\exists_{x}^{*}\exists_{\alpha}^{*})\varPsi$.

Note that the encoding in Equation 7 is just a general idea. It does not prevent the contents of slave nodes from pointing to other nodes. However, this does not affect the main result, because these contents consist only of $\mathbf{nil}$, as will be shown in Equation 8.

Note also that the predicate $\mathtt{sapling}$ encodes not only the graph consisting of a single sapling, but also graphs consisting of one sapling together with circles. It is not necessary to remove these circles, because they do not affect the correctness of the reduction, as shown in Lemma V.1.

Lemma V.1

Given the predicate $\mathtt{sapling}(x_{0},x_{0}^{\prime})$ defined in Equation 7, it can be satisfied by a model $\sigma$ if and only if there exists a model $\sigma^{\prime}$ that satisfies $\mathtt{sapling}(x_{0},x_{0}^{\prime})$ and contains no circles.

The proof of Lemma V.1 can be found in the Appendix. Now we prove Theorem V.1.

Proof:

We go into details of the reduction, which is shown in Fig. (7).

Figure 7: Structure of reduction

The master branch is the longest path from the beginning master node to the end master node. Each master node points to 0 or more slave nodes $\mathbf{nil}^{*}$. The master nodes are arranged with a period of 4, and each period represents a state of the run. In the $i$-th period ($1\leq i\leq m$), the first master node represents a delimiter which separates the states of the run; no slave node is attached to it. The second to the fourth master nodes represent the $i$-th triple $(I,C_{1},C_{2})$ satisfying $I=k_{i}$, $C_{1}=c_{i}^{1}+1$, and $C_{2}=c_{i}^{2}+1$. The slave nodes pointed to from these 3 nodes are $\mathbf{nil}^{k_{i}}$, $\mathbf{nil}^{c_{i}^{1}+1}$, and $\mathbf{nil}^{c_{i}^{2}+1}$, respectively. The first period represents the initial state, while the last represents the final state.

We observe that the values of the two counters in the sapling are greater than the real values of the counters in the run by 1. This is because we need to distinguish these nodes from the first node of each period, to which no slave nodes are attached.
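The layout of one period per state can be sketched as a function from a run to a heap. This is an illustration under assumptions, not the construction used in the proof: locations are numbered 1, 2, 3, ... along the master branch, each cell stores its successor followed by its slave values, the last cell stores no successor, and the names `encode_run` and `NIL` are hypothetical.

```python
NIL = "nil"  # stand-in for the slave value nil


def encode_run(run):
    """Encode a run [(I, c1, c2), ...] as a heap {location: tuple}.

    Each state contributes four master nodes: a delimiter with no slave
    values, then nodes carrying I, c1 + 1 and c2 + 1 copies of nil.
    """
    slave_counts = []
    for (k, c1, c2) in run:
        slave_counts += [0, k, c1 + 1, c2 + 1]
    heap = {}
    last = len(slave_counts)
    for loc, count in enumerate(slave_counts, start=1):
        # Every master node except the last points to the next one.
        succ = () if loc == last else (loc + 1,)
        heap[loc] = succ + (NIL,) * count
    return heap


heap = encode_run([(1, 0, 0), (2, 1, 0)])
for loc in sorted(heap):
    print(loc, heap[loc])
```

The counter nodes always carry at least one `nil`, so they can be told apart from the delimiter node of each period, which carries none.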

In general, the reduction formula consists of the following three parts.

  1. Construct the basic structure of the sapling, which is denoted by the formula $\varPhi_{1}=\forall_{x}^{*}\forall_{\alpha}^{*}.\varPhi_{1}^{\prime}$.

  2. Initialize the initial and the final state, which is denoted by the formula $\varPhi_{2}=\exists_{x}^{*}\exists_{\alpha}^{*}.\varPhi_{2}^{\prime}$.

  3. Encode the transition from one state to the next following the current instruction, which is denoted by the formula $\varPhi_{3}=\forall_{x}^{*}\exists_{x}^{*}\exists_{\alpha}^{*}.\varPhi_{3}^{\prime}$.

The formula is of the form

\begin{align}
\varPhi\quad=\quad & \varPhi_{1}\land\varPhi_{2}\land\varPhi_{3} \tag{8}\\
=\quad & \forall_{x}^{*}\forall_{\alpha}^{*}.\varPhi_{1}^{\prime}\land\exists_{x}^{*}\exists_{\alpha}^{*}.\varPhi_{2}^{\prime}\land\forall_{x}^{*}\exists_{x}^{*}\exists_{\alpha}^{*}.\varPhi_{3}^{\prime}, \notag
\end{align}

or

\begin{align*}
\varPhi\quad=\quad\forall_{x}^{*}\forall_{\alpha}^{*}.\varPhi_{1}^{\prime}\land\forall_{x}^{*}\exists_{x}^{*}\exists_{\alpha}^{*}.(\varPhi_{2}^{\prime}\land\varPhi_{3}^{\prime}).
\end{align*}

The detailed construction can be found in Appendix.

For each two-counter Minsky machine $M$, let $\varPhi$ be the formula of the form defined in Equation 8. If the machine $M$ halts, then there is a finite run. It can be easily shown that there is a finite sapling from the first period, which encodes the initial state, to the last period, which encodes the final state, and that each state except the final one is transformed to the next state by the corresponding instruction.

For the other direction of the proof, we suppose that the formula $\varPhi$ of the form defined in Equation 8 is satisfied. The model $\sigma$ of $\varPhi$ may contain circles. According to Lemma V.1, we can obtain a model $\sigma^{\prime}$ satisfying $\varPhi$ in which there are no circles. ∎

VI Conclusion

In this paper, we propose sequence-heap separation logic which combines sequence predicate logic and separation logic. It is capable of describing the following properties which sequence predicate logic or separation logic alone cannot describe or cannot easily describe.

  • sequence operations in programs, such as list reversal, and lookups.

  • properties corresponding to variable-length sequences in programs, such as stack, queue, and graphs with unbounded out-degree.

  • multilevel data structures in programs, such as data structure in Windows storage systems, and in block-based cloud storage systems.

Besides, we find a boundary between decidable and undecidable SeqSL fragments, both of which are in prenex normal form. We prove the following two results: the satisfiability problems of the $\Sigma_{1}$ fragment and the $\Pi_{1}$ fragment are decidable, and that of $(\forall_{x}^{*}\forall_{\alpha}^{*}\cap\forall_{x}^{*}\exists_{x}^{*}\exists_{\alpha}^{*})\text{SeqSL}(*)$ is undecidable. As corollaries, fragments with either 3 quantifier alternations over program variables or 2 quantifier alternations over sequence variables are undecidable.

We will do the following work in the future.

  • Find other boundaries between decidable and undecidable fragments of sequence-heap separation logic, such as boundaries on numbers of quantified variables, and on inductive predicates.

  • Investigate the expressiveness of sequence-heap separation logic in more depth.

  • Construct a proof system for sequence-heap separation logic.

  • Implement formal verification tools for sequence-heap separation logic fragments.

References

  • [1] J. C. Reynolds, “Separation logic: A logic for shared mutable data structures,” in Proceedings 17th Annual IEEE Symposium on Logic in Computer Science (LICS). IEEE, 2002, pp. 55–74.
  • [2] M. J. Parkinson and G. M. Bierman, “Separation logic, abstraction and inheritance,” ACM SIGPLAN Notices, vol. 43, no. 1, pp. 75–86, 2008.
  • [3] K. Batz, B. L. Kaminski, J.-P. Katoen, C. Matheja, and T. Noll, “Quantitative separation logic: a logic for reasoning about probabilistic pointer programs,” Proceedings of the ACM on Programming Languages (POPL), vol. 3, no. POPL, pp. 1–29, 2019.
  • [4] B. Biering, L. Birkedal, and N. Torp-Smith, “Bi-hyperdoctrines, higher-order separation logic, and abstraction,” ACM Transactions on Programming Languages and Systems (TOPLAS), vol. 29, no. 5, pp. 24–es, 2007.
  • [5] J. Brotherston, C. Fuhs, J. A. N. Pérez, and N. Gorogiannis, “A decision procedure for satisfiability in separation logic with inductive predicates,” in Proceedings of the Joint Meeting of the Twenty-Third EACSL Annual Conference on Computer Science Logic (CSL) and the Twenty-Ninth Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), 2014, pp. 1–10.
  • [6] C. Barrett, C. L. Conway, M. Deters, L. Hadarean, D. Jovanović, T. King, A. Reynolds, and C. Tinelli, “Cvc4,” in International Conference on Computer Aided Verification (CAV). Springer, 2011, pp. 171–177.
  • [7] M. Berzish, V. Ganesh, and Y. Zheng, “Z3str3: A string solver with theory-aware heuristics,” in 2017 Formal Methods in Computer Aided Design (FMCAD). IEEE, 2017, pp. 55–59.
  • [8] T. Kutsia and B. Buchberger, “Predicate logic with sequence variables and sequence function symbols,” in International Conference on Mathematical Knowledge Management (MKM). Springer, 2004, pp. 205–219.
  • [9] Z. Jin, B. Zhang, T. Cao, Y. Cao, and H. Wang, “Reasoning about block-based cloud storage systems via separation logic,” Theoretical Computer Science (TCS), vol. 936, pp. 43–76, 2022.
  • [10] S. Demri, É. Lozes, and A. Mansutti, “The effects of adding reachability predicates in propositional separation logic,” in International Conference on Foundations of Software Science and Computation Structures (FoSSaCS). Springer, 2018, pp. 476–493.
  • [11] J. Berdine, C. Calcagno, and P. W. O’hearn, “A decidable fragment of separation logic,” in International Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS). Springer, 2004, pp. 97–109.
  • [12] R. Brochenin, S. Demri, and E. Lozes, “On the almighty wand,” Information and Computation (IANDC), vol. 211, pp. 106–137, 2012.
  • [13] C. Calcagno, H. Yang, and P. W. O’hearn, “Computability and complexity results for a spatial assertion language for data structures,” in International Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS). Springer, 2001, pp. 108–119.
  • [14] A. Reynolds, R. Iosif, and C. Serban, “Reasoning in the bernays-schönfinkel-ramsey fragment of separation logic,” in International Conference on Verification, Model Checking, and Abstract Interpretation (VMCAI). Springer, 2017, pp. 462–482.
  • [15] S. Demri, D. Galmiche, D. Larchey-Wendling, and D. Méry, “Separation logic with one quantified variable,” Theory of Computing Systems, vol. 61, no. 2, pp. 371–461, 2017.
  • [16] S. Demri and M. Deters, “Two-variable separation logic and its inner circle,” ACM Transactions on Computational Logic (TOCL), vol. 16, no. 2, pp. 1–36, 2015.
  • [17] J. Pagel and F. Zuleger, “Strong-separation logic,” ACM Transactions on Programming Languages and Systems (TOPLAS), vol. 44, no. 3, pp. 1–40, 2022.
  • [18] J. Karhumäki, F. Mignosi, and W. Plandowski, “The expressibility of languages and relations by word equations,” Journal of the ACM (JACM), vol. 47, no. 3, pp. 483–505, 2000.
  • [19] G. S. Makanin, “The problem of solvability of equations in a free semigroup,” Matematicheskii Sbornik, vol. 145, no. 2, pp. 147–236, 1977.
  • [20] V. G. Durnev, “Undecidability of the positive $\forall\exists^{3}$-theory of a free semigroup,” Siberian Mathematical Journal, vol. 36, no. 5, pp. 917–929, 1995.
  • [21] J. D. Day, V. Ganesh, P. He, F. Manea, and D. Nowotka, “The satisfiability of word equations: Decidable and undecidable theories,” in International Conference on Reachability Problems (RP). Springer, 2018, pp. 15–29.
  • [22] J. M. Važenin and B. V. Rozenblat, “Decidability of the positive theory of a free countably generated semigroup,” Mathematics of the USSR-Sbornik, vol. 44, no. 1, p. 109, 1983.
  • [23] Y. Jing, H. Wang, Y. Huang, L. Zhang, J. Xu, and Y. Cao, “A modeling language to describe massive data storage management in cyber-physical systems,” Journal of Parallel and Distributed Computing (JPDC), vol. 103, pp. 113–120, 2017.
  • [24] B. Zhang, Z. Jin, H. Wang, and Y. Cao, “Tool for verifying cloud block storage based on separation logic,” Journal of Software, vol. 33, no. 6, pp. 2264–2287, 2022.
  • [25] J. C. Reynolds, “Intuitionistic reasoning about shared mutable data structure,” Millennial perspectives in computer science, vol. 2, no. 1, pp. 303–321, 2000.
  • [26] J. Berdine, C. Calcagno, and P. W. O’hearn, “Symbolic execution with separation logic,” in Asian Symposium on Programming Languages and Systems (APLAS). Springer, 2005, pp. 52–68.
  • [27] T. Kutsia, “Solving equations with sequence variables and sequence functions,” Journal of Symbolic Computation (JSC), vol. 42, no. 3, pp. 352–388, 2007.
  • [28] M. Minsky, Computation: Finite and Infinite Machines. JSTOR, 1968.

Appendix


Proof sketch of Theorem II.1: For every Boolean combination $\varphi$ defined in Definition II.3, there exists a single word equation $T(\varphi)\overset{\triangle}{=}U==V$ which is equivalent to $\varphi$. It is constructed inductively as follows:

\begin{align*}
T(t_{1}==t_{2}) \quad&\overset{\triangle}{=}\quad t_{1}==t_{2}\\
T(t_{1}==t_{2}\land t_{1}^{\prime}==t_{2}^{\prime}) \quad&\overset{\triangle}{=}\quad F(t_{1},t_{1}^{\prime})==F(t_{2},t_{2}^{\prime})\\
T(t_{1}==t_{2}\lor t_{1}^{\prime}==t_{2}^{\prime}) \quad&\overset{\triangle}{=}\quad T_{0}(t_{1}\circ t_{2}^{\prime}==t_{2}\circ t_{2}^{\prime}\lor t_{2}\circ t_{1}^{\prime}==t_{2}\circ t_{2}^{\prime})\\
T(\lnot\;t_{1}==t_{2}) \quad&\overset{\triangle}{=}\quad \exists\beta\exists\beta^{\prime}\exists\beta^{\prime\prime}.\Bigl(\bigvee_{n\in\Sigma}t_{1}==t_{2}\circ n\circ\beta\Bigr)\lor\Bigl(\bigvee_{n\in\Sigma}t_{2}==t_{1}\circ n\circ\beta\Bigr)\\
&\qquad\lor\Bigl(\bigvee_{\substack{n,n^{\prime}\in\Sigma\\ n\neq n^{\prime}}}t_{1}==\beta\circ n\circ\beta^{\prime}\land t_{2}==\beta\circ n^{\prime}\circ\beta^{\prime\prime}\Bigr)\\
T_{0}(u==v\lor u^{\prime}==v) \quad&\overset{\triangle}{=}\quad \exists\beta\exists\beta^{\prime}.X==\beta\circ Y\circ\beta^{\prime},
\end{align*}

where $\Sigma$ is the generator set, and

\begin{align*}
X &\quad=\quad G(u\circ u^{\prime})^{2}\circ u\circ G(u\circ u^{\prime})^{2}\circ u^{\prime}\circ G(u\circ u^{\prime})^{2}\\
Y &\quad=\quad G(u\circ u^{\prime})^{2}\circ v\circ G(u\circ u^{\prime})^{2}\\
F(t_{1},t_{2}) &\quad=\quad t_{1}\circ n\circ t_{2}\circ t_{1}\circ n^{\prime}\circ t_{2},\quad n\neq n^{\prime}\\
G(t) &\quad=\quad t\circ n\circ t\circ n^{\prime},\quad n\neq n^{\prime}
\end{align*}

$\square$
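The term $F$ acts as a pairing function on words: an equation between two $F$-terms holds exactly when both pairs of arguments agree, which is the property that lets one word equation stand for a conjunction of two. The sketch below checks this exhaustively on short words. It is an illustration only: words are modelled as tuples over a two-letter generator set, and the names `F` and `pairing_is_injective` as well as the length bound are assumptions.

```python
from itertools import product

A, B = "n", "n'"  # two distinct generators, playing the roles of n and n'


def F(t1, t2):
    """F(t1, t2) = t1 . n . t2 . t1 . n' . t2, with words as tuples."""
    return t1 + (A,) + t2 + t1 + (B,) + t2


def pairing_is_injective(max_len=2):
    """Check F(a, b) == F(c, d) iff (a == c and b == d) on short words."""
    ws = [w for k in range(max_len + 1)
          for w in product((A, B), repeat=k)]
    return all((F(a, b) == F(c, d)) == (a == c and b == d)
               for a in ws for b in ws for c in ws for d in ws)


print(pairing_is_injective(2))
```

The check also covers words that contain the generators $n$ and $n^{\prime}$ themselves, since the test alphabet consists of exactly these two letters.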

Proof sketch of Lemma V.1: The 'if' part of the lemma is straightforward. For the 'only if' part, we suppose that $\sigma\models\mathtt{sapling}(x_{0},x_{0}^{\prime})$. It is sufficient to show the following two facts.

  • It is impossible that all connected components in $\sigma$ are circles,

  • If there are one or more circles in $\sigma$, then let $\sigma^{\prime}$ be the model that keeps only the sapling in $\sigma$ and contains no circles. We can get $\sigma^{\prime}\models\mathtt{sapling}(x_{0},x_{0}^{\prime})$.

First we prove the first fact. Suppose that all connected components in $\sigma$ are circles. It follows that each master node has one predecessor. This contradicts $\varPsi_{2}$, which states that the first node $x_{0}$ has no predecessor.

Then we prove the second fact. According to the first fact, there exists one sapling in $\sigma$. The sapling does not intersect with any circle according to $\varPsi_{1}$, which states that each master node has at most one predecessor. Let $\sigma^{\prime}$ be the sapling in $\sigma$, and $\sigma^{\prime\prime}$ be all circles in $\sigma$. We have $\sigma=\sigma^{\prime}\uplus\sigma^{\prime\prime}$. So, the proposition $\sigma^{\prime}\models\mathtt{sapling}(x_{0},x_{0}^{\prime})$ is true, because removing circles does not affect the truth of $\varPsi_{1},\varPsi_{2},\varPsi_{3}$, and $\varPsi_{4}$. $\square$


Construction in Theorem V.1. The formula $\varPhi_{1}$ in part 1 is exactly of the form $\varPsi_{1}\land\varPsi_{2}$ from Equation 7. It can be written in the following way:

\begin{align}
\varPhi_{1}\;=\; & \forall_{x}x_{1}\forall_{x}x_{2}\forall_{x}x_{3}\forall_{x}x_{4}\forall_{\alpha}\alpha_{1}\forall_{\alpha}\alpha_{2}.\;(x_{1}\hookrightarrow x_{3}\circ\alpha_{1}*x_{2}\hookrightarrow x_{4}\circ\alpha_{2}\Rightarrow x_{3}\neq x_{4}) \notag\\
& \land\forall_{x}x_{1}\forall_{\alpha}\alpha.\;\lnot(x_{1}\hookrightarrow x\circ\alpha). \tag{9}
\end{align}

The formula $\varPhi_{2}$ in part 2 can be written in the following way.

\begin{align}
\varPhi_{2}\quad=\quad & \exists_{x}x_{0}\exists_{x}x_{k}\exists_{x}x_{c^{1}}\exists_{x}x_{c^{2}}\exists_{x}x_{1}\exists_{x}x_{0}^{\prime}\exists_{x}x_{k}^{\prime}\exists_{x}x_{c^{1}}^{\prime}\exists_{x}x_{c^{2}}^{\prime}\exists_{\alpha}\alpha_{c^{1}}^{\prime}\exists_{\alpha}\alpha_{c^{2}}^{\prime}. \tag{10}\\
& \mathtt{InitState}(x_{0},x_{k},x_{c^{1}},x_{c^{2}},x_{1})*\mathtt{FinState}(x_{0}^{\prime},x_{k}^{\prime},x_{c^{1}}^{\prime},x_{c^{2}}^{\prime},\alpha_{c^{1}}^{\prime},\alpha_{c^{2}}^{\prime}). \notag
\end{align}

The predicate $\mathtt{InitState}(x_{0},x_{k},x_{c^{1}},x_{c^{2}},x_{1})$ encodes the initial state $(I_{0},c_{0}^{1},c_{0}^{2})=(1,0,0)$ (which corresponds to the slave nodes on the three master nodes $x_{k},x_{c^{1}},x_{c^{2}}$ in Fig. (7)), and $\mathtt{FinState}(x_{0}^{\prime},x_{k}^{\prime},x_{c^{1}}^{\prime},x_{c^{2}}^{\prime},\alpha_{c^{1}}^{\prime},\alpha_{c^{2}}^{\prime})$ encodes the final state $(n,c_{m}^{1},c_{m}^{2})$. The state $(n,c_{m}^{1},c_{m}^{2})$ corresponds to the slave nodes on the three master nodes $x_{k}^{\prime},x_{c^{1}}^{\prime},x_{c^{2}}^{\prime}$ in Fig. (7). These two predicates are defined as follows:

\begin{align}
& \mathtt{InitState}(x_{0},x_{k},x_{c^{1}},x_{c^{2}},x_{1}) \tag{11}\\
\overset{\triangle}{=}\; & x_{0}\hookrightarrow x_{k}\circ\varepsilon*x_{k}\hookrightarrow x_{c^{1}}\circ\mathbf{nil}*x_{c^{1}}\hookrightarrow x_{c^{2}}\circ\mathbf{nil}*x_{c^{2}}\hookrightarrow x_{1}\circ\mathbf{nil}, \notag\\
& \mathtt{FinState}(x_{0}^{\prime},x_{k}^{\prime},x_{c^{1}}^{\prime},x_{c^{2}}^{\prime},\alpha_{c^{1}}^{\prime},\alpha_{c^{2}}^{\prime}) \notag\\
\overset{\triangle}{=}\; & x_{0}^{\prime}\hookrightarrow x_{k}^{\prime}\circ\varepsilon*x_{k}^{\prime}\hookrightarrow x_{c^{1}}^{\prime}\circ\mathbf{nil}^{n}*x_{c^{1}}^{\prime}\hookrightarrow x_{c^{2}}^{\prime}\circ\alpha_{c^{1}}^{\prime}\circ\mathbf{nil}*x_{c^{2}}^{\prime}\hookrightarrow\varepsilon\circ\alpha_{c^{2}}^{\prime}\circ\mathbf{nil}. \notag
\end{align}

Note that $n$ is fixed once the two-counter Minsky machine is given.
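As an informal illustration of how a machine state can be read off such a heap, the following Python sketch builds the smallest heap satisfying $\mathtt{InitState}$ and decodes the state from it. It assumes the usual reading of $\hookrightarrow$ (the heap contains the cell) and one particular reading of the encoding: the instruction index is the number of $\mathbf{nil}$ entries stored at $x_{k}$, and each counter value is the number of $\mathbf{nil}$ entries stored at its master node minus one. The names `NIL`, `init_heap` and `decode`, and the use of integers as locations, are hypothetical choices made for this sketch only.

```python
NIL = "nil"  # stands for the constant nil (an illustrative representation)


def init_heap(x0, xk, xc1, xc2, x1):
    """Build the smallest heap satisfying InitState, one cell per conjunct.

    The separating conjunction forces the four allocated locations to be
    pairwise distinct, which is checked explicitly here.
    """
    if len({x0, xk, xc1, xc2}) != 4:
        raise ValueError("x0, xk, xc1, xc2 must be pairwise distinct")
    return {
        x0: (xk,),        # x0 -> xk . epsilon
        xk: (xc1, NIL),   # one nil: instruction index I_0 = 1
        xc1: (xc2, NIL),  # empty counter sequence followed by nil: c^1_0 = 0
        xc2: (x1, NIL),   # empty counter sequence followed by nil: c^2_0 = 0
    }


def decode(heap, xk, xc1, xc2):
    """Read (instruction index, counter 1, counter 2) off the master nodes.

    Assumption: the first entry of each cell is the link to the next master
    node, and the remaining entries are the nil-valued slave nodes.
    """
    def nils(loc):
        return sum(1 for v in heap[loc][1:] if v == NIL)

    return (nils(xk), nils(xc1) - 1, nils(xc2) - 1)
```

For example, `decode(init_heap(10, 11, 12, 13, 14), 11, 12, 13)` evaluates to `(1, 0, 0)`, the initial state named in the text.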

Before constructing the formula $\varPhi_{3}$, we need to formalize the additional properties shown below.

The sequence $\alpha$ is of the form $\mathbf{nil}^{*}$:

\begin{align*}
\mathtt{ini}(\alpha)\overset{\triangle}{=}\mathbf{nil}\circ\alpha==\alpha\circ\mathbf{nil}. \tag{12}
\end{align*}

The sequence $\alpha_{1}$ of the form $\mathbf{nil}^{*}$ is one element longer than the sequence $\alpha_{2}$ of the same form:

\begin{align*}
\alpha_{1}==\alpha_{2}\circ\mathbf{nil}.
\end{align*}
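The first property relies on the fact that a sequence commutes with the single element $\mathbf{nil}$ under concatenation exactly when every element of the sequence is $\mathbf{nil}$. The Python sketch below checks this by brute force on short sequences and also states the successor relation. Sequences are modelled as tuples, and the alphabet of $\mathbf{nil}$ plus two other values is an arbitrary assumption made for illustration.

```python
from itertools import product

NIL = "nil"
ALPHABET = (NIL, 1, 2)  # nil plus two hypothetical non-nil values


def ini(alpha):
    """The sequence equation nil . alpha == alpha . nil."""
    return (NIL,) + alpha == alpha + (NIL,)


def all_nil(alpha):
    """True iff alpha has the form nil^* (including the empty sequence)."""
    return all(v == NIL for v in alpha)


def succ(alpha1, alpha2):
    """alpha1 == alpha2 . nil, i.e. alpha1 is one nil longer than alpha2."""
    return alpha1 == alpha2 + (NIL,)


def check(max_len=6):
    """Compare ini with all_nil on every sequence up to length max_len."""
    for n in range(max_len + 1):
        for alpha in product(ALPHABET, repeat=n):
            if ini(alpha) != all_nil(alpha):
                return False
    return True
```

Calling `check()` compares the two predicates on all sequences of length at most six over the chosen alphabet. This is a sanity check on small instances, not a proof.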

Following the idea used in $\varPsi_{4}$, we can construct $\varPhi_{3}$ as follows:

\begin{align*}
\varPhi_{3}\quad=\quad&\forall_{x}x_{0}\forall_{x}x_{k}\forall_{x}x_{c^{1}}\forall_{x}x_{c^{2}}\forall_{x}x_{0}^{\prime}.\;\exists_{x}x_{k}^{\prime}\exists_{x}x_{c^{1}}^{\prime}\exists_{x}x_{c^{2}}^{\prime}\exists_{x}x_{1}^{\prime}\exists_{\alpha}\alpha_{k}\exists_{\alpha}\alpha_{c^{1}}\exists_{\alpha}\alpha_{c^{2}}\exists_{\alpha}\alpha_{k}^{\prime}\exists_{\alpha}\alpha_{c^{1}}^{\prime}\exists_{\alpha}\alpha_{c^{2}}^{\prime}. \tag{13}\\
&\bigl(\mathtt{CurrState}(x_{0},x_{k},x_{c^{1}},x_{c^{2}},x_{0}^{\prime},\alpha_{k},\alpha_{c^{1}},\alpha_{c^{2}})\land\lnot\mathtt{FinState}(x_{0},x_{k},x_{c^{1}},x_{c^{2}},\alpha_{c^{1}},\alpha_{c^{2}})\bigr)\\
&\Rightarrow\mathtt{NextState}(x_{0},x_{k},x_{c^{1}},x_{c^{2}},x_{0}^{\prime},x_{k}^{\prime},x_{c^{1}}^{\prime},x_{c^{2}}^{\prime},x_{0}^{\prime\prime},\alpha_{k},\alpha_{c^{1}},\alpha_{c^{2}},\alpha_{k}^{\prime},\alpha_{c^{1}}^{\prime},\alpha_{c^{2}}^{\prime}).
\end{align*}

In $\varPhi_{3}$, the predicate $\mathtt{CurrState}$ represents the current state, starting from the master node that has no slave nodes; $\mathtt{FinState}$ represents the final state, as defined in Equation 11; and $\mathtt{NextState}$ represents the transition from one state to the next. The formula $\varPhi_{3}$ encodes the following fact: for every current state other than the final state, there exists a next state that the machine can move to from the current state. The predicates $\mathtt{CurrState}$ and $\mathtt{NextState}$ are defined as follows:

\begin{align*}
&\mathtt{CurrState}(x_{0},x_{k},x_{c^{1}},x_{c^{2}},x_{0}^{\prime},\alpha_{k},\alpha_{c^{1}},\alpha_{c^{2}})\\
\overset{\triangle}{=}\quad&x_{0}\hookrightarrow x_{k}\circ\varepsilon*x_{k}\hookrightarrow x_{c^{1}}\circ\alpha_{k}*x_{c^{1}}\hookrightarrow x_{c^{2}}\circ\alpha_{c^{1}}\circ\mathbf{nil}*x_{c^{2}}\hookrightarrow x_{0}^{\prime}\circ\alpha_{c^{2}}\circ\mathbf{nil},\\
&\mathtt{NextState}(x_{0},x_{k},x_{c^{1}},x_{c^{2}},x_{0}^{\prime},x_{k}^{\prime},x_{c^{1}}^{\prime},x_{c^{2}}^{\prime},x_{0}^{\prime\prime},\alpha_{k},\alpha_{c^{1}},\alpha_{c^{2}},\alpha_{k}^{\prime},\alpha_{c^{1}}^{\prime},\alpha_{c^{2}}^{\prime})\\
\overset{\triangle}{=}\quad&\mathtt{Init}(\alpha_{k},\alpha_{c^{1}},\alpha_{c^{2}},\alpha_{k}^{\prime},\alpha_{c^{1}}^{\prime},\alpha_{c^{2}}^{\prime})\land\mathtt{CurrState}(x_{0}^{\prime},x_{k}^{\prime},x_{c^{1}}^{\prime},x_{c^{2}}^{\prime},x_{0}^{\prime\prime},\alpha_{k}^{\prime},\alpha_{c^{1}}^{\prime},\alpha_{c^{2}}^{\prime})\\
&\land\mathtt{Next}(\alpha_{k},\alpha_{c^{1}},\alpha_{c^{2}},\alpha_{k}^{\prime},\alpha_{c^{1}}^{\prime},\alpha_{c^{2}}^{\prime}),
\end{align*}

where the predicate $\mathtt{Init}$ is defined as follows, using the predicate $\mathtt{ini}$ from Equation 12:

\begin{align*}
&\mathtt{Init}(\alpha_{k},\alpha_{c^{1}},\alpha_{c^{2}},\alpha_{k}^{\prime},\alpha_{c^{1}}^{\prime},\alpha_{c^{2}}^{\prime})\\
\overset{\triangle}{=}\quad&\mathtt{ini}(\alpha_{k})\land\mathtt{ini}(\alpha_{c^{1}})\land\mathtt{ini}(\alpha_{c^{2}})\land\mathtt{ini}(\alpha_{k}^{\prime})\land\mathtt{ini}(\alpha_{c^{1}}^{\prime})\land\mathtt{ini}(\alpha_{c^{2}}^{\prime}).
\end{align*}

The predicate $\mathtt{Next}$ is defined as follows:

\begin{align*}
&\mathtt{Next}(\alpha_{k},\alpha_{c^{1}},\alpha_{c^{2}},\alpha_{k}^{\prime},\alpha_{c^{1}}^{\prime},\alpha_{c^{2}}^{\prime})\\
\overset{\triangle}{=}\quad&\bigwedge_{(k_{0},c_{j},c_{j}^{\prime},k)\in\mathcal{I}_{1}}\Bigl(\alpha_{k}==\mathbf{nil}^{k_{0}}\land\alpha_{c^{j}}^{\prime}==\alpha_{c^{j}}\circ\mathbf{nil}\land\alpha_{c^{j^{\prime}}}^{\prime}==\alpha_{c^{j^{\prime}}}\land\alpha_{k}^{\prime}==\mathbf{nil}^{k}\Bigr)\\
\land\;&\bigwedge_{(k_{0},c_{j},c_{j}^{\prime},k_{1},k_{2})\in\mathcal{I}_{2}}\Bigl(\alpha_{k}==\mathbf{nil}^{k_{0}}\;\land\bigl(\alpha_{c^{j}}==\mathbf{nil}\Rightarrow(\alpha_{c^{1}}^{\prime}==\alpha_{c^{1}}\land\alpha_{c^{2}}^{\prime}==\alpha_{c^{2}}\land\alpha_{k}^{\prime}==\mathbf{nil}^{k_{1}})\bigr)\\
&\hspace{8em}\land\bigl(\lnot(\alpha_{c^{j}}==\mathbf{nil})\Rightarrow(\alpha_{c^{j}}^{\prime}\circ\mathbf{nil}==\alpha_{c^{j}}\land\alpha_{c^{j^{\prime}}}^{\prime}==\alpha_{c^{j^{\prime}}}\land\alpha_{k}^{\prime}==\mathbf{nil}^{k_{2}})\bigr)\Bigr).
\end{align*}

In the formula $\mathtt{Next}$, the sets $\mathcal{I}_{1}$ and $\mathcal{I}_{2}$ are those defined in Definition V.1. The reduction then follows from Equations 8, 9, 10 and 13. \hfill $\square$
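To make the case analysis in $\mathtt{Next}$ concrete, the following Python sketch computes the successor of a state given as three unary sequences. It is only one reading of the transition relation that $\mathtt{Next}$ appears intended to capture, under several assumptions. Definition V.1 is not restated here, so the instruction formats are inferred from the shape of the formula: an increment instruction is simplified to a triple `(k0, j, k)` and a test instruction to `(k0, j, k1, k2)`, dropping the component that names the other counter. The zero test compares the counter sequence against `ZERO`, which mirrors the comparison with $\mathbf{nil}$ in the formula. The names `unary`, `ZERO` and `step` are hypothetical.

```python
NIL = "nil"


def unary(n):
    """The sequence nil^n as a tuple."""
    return (NIL,) * n


# Assumption: mirrors the test alpha_{c^j} == nil used in Next.
ZERO = unary(1)


def step(state, inc_instrs, test_instrs):
    """Return the successor of state = (alpha_k, alpha_c1, alpha_c2).

    inc_instrs:  triples (k0, j, k), "at k0 increment counter j, go to k".
    test_instrs: tuples (k0, j, k1, k2), "at k0, if counter j is zero go to
                 k1 unchanged, otherwise decrement counter j and go to k2".
    Returns None when no instruction is labelled with the current index.
    """
    alpha_k, counters = state[0], list(state[1:])
    for (k0, j, k) in inc_instrs:
        if alpha_k == unary(k0):
            counters[j - 1] = counters[j - 1] + (NIL,)  # alpha' == alpha . nil
            return (unary(k), *counters)
    for (k0, j, k1, k2) in test_instrs:
        if alpha_k == unary(k0):
            if counters[j - 1] == ZERO:
                return (unary(k1), *counters)           # counters unchanged
            counters[j - 1] = counters[j - 1][:-1]      # alpha' . nil == alpha
            return (unary(k2), *counters)
    return None
```

With `inc_instrs = [(1, 1, 2)]` and `test_instrs = [(2, 1, 3, 2)]`, repeatedly applying `step` from `(unary(1), ZERO, ZERO)` increments the first counter once, decrements it back, and then moves to instruction index 3, where no instruction applies.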