Interpolants and Explicit Definitions in Extensions of the Description Logic
Abstract
We show that the vast majority of extensions of the description logic enjoy neither the Craig interpolation property nor the projective Beth definability property. This is the case, for example, for with nominals, with the universal role, with a role inclusion of the form , and for . It follows in particular that the existence of an explicit definition of a concept or individual name cannot be reduced to subsumption checking via implicit definability. We show that, nevertheless, the existence of interpolants and explicit definitions can be decided in polynomial time for standard tractable extensions of (such as ) and in ExpTime for and various extensions. It follows that these existence problems are not harder than subsumption, which is in sharp contrast to the situation for expressive DLs. We also obtain tight bounds for the size of interpolants and explicit definitions and the complexity of computing them: single exponential for tractable standard extensions of and double exponential for and extensions. We close with a discussion of Horn-DLs such as Horn-.
1 Introduction
The projective Beth definability property (PBDP) of a description logic (DL) states that a concept or individual name is explicitly definable under an -ontology by an -concept using symbols from a signature of concept, role, and individual names if, and only if, it is implicitly definable using under . The importance of the PBDP for DL research stems from the fact that it provides a polynomial time reduction of the problem to decide the existence of an explicit definition to the well understood problem of subsumption checking. The existence of explicit definitions is important for numerous knowledge engineering tasks and applications of description logic ontologies, for example, the extraction of equivalent acyclic TBoxes from ontologies (?; ?), the computation of referring expressions (or definite descriptions) for individuals (?), the equivalent rewriting of ontology-mediated queries into concepts (?; ?; ?), the construction of alignments between ontologies (?), and the decomposition of ontologies (?).
The PBDP is often investigated in tandem with the Craig interpolation property (CIP) which states that if an -concept is subsumed by another -concept under some -ontology then one finds an interpolating -concept using the shared symbols of the two input concepts only. In fact, the CIP implies the PBDP and the interpolants obtained using the CIP can serve as explicit definitions.
Many standard Boolean DLs such as , , and enjoy the CIP and PBDP and sophisticated algorithms for computing interpolants and explicit definitions have been developed (?). Important exceptions are the extensions of any of the above DLs with nominals and/or role hierarchies. In fact, it has recently been shown that the problem of deciding the existence of an interpolant/explicit definition becomes 2ExpTime-complete for ( with nominals) and for ( with role hierarchies). This result is in sharp contrast to the ExpTime-completeness of the same problem for itself inherited from the ExpTime-completeness of subsumption under -ontologies (?).
Our aim in this article is threefold: (1) determine which members of the -family of DLs enjoy the CIP/PBDP; (2) investigate the complexity of deciding the existence of interpolants/explicit definitions for those that do not enjoy it; and (3) establish tight bounds on the size of interpolants/explicit definitions and the complexity of computing them.
In what follows we discuss our main results. It has been shown in (?; ?) already that and with role hierarchies enjoy the CIP and PBDP. Rather surprisingly, it turns out that none of the remaining standard DLs in the -family enjoys the CIP or the PBDP.
Theorem 1.
The following DLs enjoy neither the CIP nor the PBDP:
1. with the universal role,
2. with nominals,
3. with a single role inclusion ,
4. with role hierarchies and a transitive role,
5. the extension of with inverse roles.
In Points 2 to 5, the CIP/PBDP also fails if the universal role can occur in interpolants/explicit definitions.
Theorem 1 also has interesting consequences that are not explicitly stated. For instance, it follows that neither the DL introduced in (?) nor the extension of with any combination of nominals, role hierarchies, or transitive roles enjoy the CIP/PBDP. With the exception of the failure of the CIP/PBDP for with nominals (without the universal role in interpolants/explicit definitions) (?), our results are new.
It follows from Theorem 1 that the behaviour of extensions of is fundamentally different from extensions of : adding role hierarchies to does not preserve the CIP/PBDP (?) but it does for ; on the other hand, adding the universal role or inverse roles to preserves the CIP/PBDP (?) but it does not for .
Theorem 1 leaves open the behaviour of a few natural DLs between and its extension with arbitrary role inclusions. For instance, what happens if one only adds transitive roles or, more generally, role inclusions using a single role name only? To cover these cases we show a general result that implies that these DLs enjoy the CIP and PBDP. In particular, it follows that in Point 4 of Theorem 1 the combination of role hierarchies with a transitive role is necessary for failure of the CIP/PBDP.
We next discuss our main result about tractable extensions of .
Theorem 2.
For and any extension with any combination of nominals, role inclusions, the universal role, or , the existence of interpolants and explicit definitions is in PTime. If an interpolant/explicit definition exists, then there exists one of at most exponential size that can be computed in exponential time. This bound is optimal.
It follows that for tractable extensions of the complexity of deciding the existence of interpolants and explicit definitions does not depend on the CIP/PBDP, in sharp contrast to the behaviour of and . Moreover, the proof shows how interpolants and explicit definitions can be computed from the canonical models introduced in (?), if they exist. It applies derivation trees (first introduced in (?) for DLs without nominals and role hierarchies) to estimate the size of interpolants and provide an exponential time algorithm for computing them.
Theorem 3.
For and any extension with any combination of nominals, the universal role, or , the existence of interpolants and explicit definitions is ExpTime-complete. If an interpolant/explicit definition exists, then there exists one of at most double exponential size that can be computed in double exponential time. This bound is optimal.
The proof of Theorem 3 shows how an interpolant or explicit definition can be extracted from a (potentially infinite) tree-shaped canonical model. The ExpTime complexity bound is proved using an encoding as an emptiness problem for tree automata that also uses derivation trees. It does not seem possible to obtain tight bounds on the size of interpolants using derivation trees; instead we generalize transfer sequences for this purpose (also first introduced in (?)).
In the final section, we consider expressive Horn-DLs such as Horn-. We first observe that Theorem 3 also holds for Horn- and extensions with nominals and the universal role, provided one asks for interpolants and explicit definitions in (and extensions with nominals and the universal role, respectively). If one admits expressive Horn-concepts as interpolants or explicit definitions, then sometimes interpolants and explicit definitions exist that previously did not exist. We show that nevertheless the CIP/PBDP also fail in this case for DLs including Horn-, , and Horn-.
Detailed proofs are given in the arXiv version of this article.
2 Related Work
The CIP and PBDP have been investigated extensively in databases, with applications to query rewriting under views and query compilation (?; ?). The computation of explicit definitions under Horn ontologies can be seen as an instance of query reformulation under constraints (?) which has been a major research topic for many years. The Chase and Backchase approach that is central to this research closely resembles our use of canonical models. We do not assume, however, that the chase terminates. In (?; ?), it is shown that the reformulation of CQs into CQs under tgds can be reduced to entailment using Lyndon interpolation of first-order logic. By linking reformulation into CQs and definability using concepts, this approach can potentially be used to obtain alternative proofs of complexity upper bounds for the existence of interpolants and explicit definitions in our languages. Also relevant is the investigation of interpolation in basic modal logic (?) and hybrid modal logic (?; ?).
The main aim of this article is to investigate explicit definability of concept and individual names under ontologies. We have therefore chosen a definition of the CIP and interpolants that generalizes the projective Beth definability property and explicit definability in a natural and useful way, following (?). There are, however, other notions of Craig interpolation that are of interest. Of particular importance for modularity and various other purposes is the following version: if is an ontology and an inclusion such that , then there exists an ontology in the shared signature of and such that . This property has been considered for and various extensions in (?; ?). Currently, it is unknown whether there exists any interesting relationship between this version of the CIP and the version we investigate in this article.
Craig interpolants should not be confused with uniform interpolants (or forgetting) (?; ?; ?; ?). Uniform interpolants generalize Craig interpolants in the sense that a uniform interpolant is an interpolant for a fixed antecedent and any formula implied by the antecedent and sharing with it a fixed set of symbols.
Interpolant and explicit definition existence have only recently been investigated for logics that do not enjoy the CIP or PBDP. Extending work on Boolean DLs we discussed already, it is shown that they become harder than validity also in the guarded and two-variable fragment (?). The interpolant existence problem for linear temporal logic LTL is considered in (?). In the context of referring expressions, explicit definition existence is investigated in (?), see also (?).
3 Preliminaries
Let , , and be disjoint and countably infinite sets of concept, role, and individual names. A role is a role name or an inverse role , with a role name. Nominals take the form , where is an individual name. The universal role is denoted by . -concepts are defined by the following syntax rule:
where ranges over concept names, over individual names, and over roles (including the universal role). Fragments of are defined as usual. For example, -concepts are -concepts without nominals and the universal role, and -concepts are -concepts without inverse roles. Given any of the DLs introduced above, an -concept inclusion (-CI) takes the form with -concepts. An -ontology is a finite set of -CIs.
We also consider ontologies with role inclusions (RIs), expressions of the form with role names. An -ontology with RIs is called an -ontology. A set of RIs is a role hierarchy if all its RIs are of the form with role names.
A signature is a set of concept, role, and individual names, uniformly referred to as (non-logical) symbols. We follow common practice and do not regard the universal role as a non-logical symbol as its interpretation is fixed. We use to denote the set of symbols used in any syntactic object such as a concept or an ontology. If is a DL and a signature, then an -concept is an -concept with . The size of a syntactic object is the number of symbols needed to write it down.
The semantics of DLs is given in terms of interpretations , where is a non-empty set (the domain) and is the interpretation function, assigning to each a set , to each a relation , and to each an element . The interpretation of a concept in is defined as usual, see (?). An interpretation satisfies a CI if and an RI if . We say that is a model of an ontology if it satisfies all inclusions in it. If is a CI or RI, we write if all models of satisfy . We write if and .
An ontology is in normal form if its CIs are of the form
and
where are concept names, is a role or the universal role, and is an individual name. It is well known that for any -ontology with or without RIs one can construct in polynomial time a conservative extension using the same constructors as that is in normal form.
-concepts can be characterized using -simulations which we define next. Let and be interpretations. A relation is called an -simulation between and if the following conditions hold:
1. if and , then , for all ;
2. if and , then , for all ;
3. if and , then there exists with and , for all .
is called an -simulation if is the domain of and an -simulation if Condition 3 also holds for inverse roles from . Condition 2 is dropped if does not use nominals. We write if there exists an -simulation between and with . We write if implies for all -concepts . The following characterization is well known (?; ?).
Lemma 1.
Let . Then implies . The converse direction holds if is finite.
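For finite interpretations, the existence of a simulation can be checked by a standard greatest-fixpoint refinement. The following Python fragment is a minimal sketch of that idea and is not taken from the article. It assumes the usual reading of the three conditions above (Condition 1 for concept names, Condition 2 for individual names, Condition 3 for role successors), and the dictionary encoding of interpretations as well as the function name largest_simulation are hypothetical choices made for illustration.

```python
from itertools import product


def largest_simulation(I1, I2, concept_names, role_names, individual_names=()):
    """Return the largest simulation between two finite interpretations.

    Assumed (hypothetical) encoding of an interpretation:
      {"domain": set, "concepts": {A: set of elements},
       "roles": {r: set of pairs}, "individuals": {a: element}}
    """

    def ext(I, A):
        return I["concepts"].get(A, set())

    def successors(I, r, d):
        return {y for (x, y) in I["roles"].get(r, set()) if x == d}

    def locally_ok(d, e):
        # Condition 1: concept names that hold at d must hold at e.
        names_ok = all(e in ext(I2, A) for A in concept_names if d in ext(I1, A))
        # Condition 2: if d interprets an individual name, so does e.
        nominals_ok = all(
            I2["individuals"][a] == e
            for a in individual_names
            if I1["individuals"][a] == d
        )
        return names_ok and nominals_ok

    S = {(d, e) for d, e in product(I1["domain"], I2["domain"]) if locally_ok(d, e)}

    def violates(d, e):
        # Condition 3 fails if some r-successor of d has no matching r-successor of e.
        return any(
            not any((d2, e2) in S for e2 in successors(I2, r, e))
            for r in role_names
            for d2 in successors(I1, r, d)
        )

    # Remove violating pairs until a fixpoint is reached.
    while True:
        bad = {(d, e) for (d, e) in S if violates(d, e)}
        if not bad:
            return S
        S -= bad


# Two small interpretations: an r-edge into an element satisfying A.
I1 = {"domain": {"d0", "d1"}, "concepts": {"A": {"d1"}}, "roles": {"r": {("d0", "d1")}}}
I2 = {"domain": {"e0", "e1"}, "concepts": {"A": {"e1"}}, "roles": {"r": {("e0", "e1")}}}
sim = largest_simulation(I1, I2, concept_names={"A"}, role_names={"r"})
```

A pair of elements is related by some simulation exactly when it belongs to the returned set, so in this encoding the check for a given pair is a membership test on sim. Inverse roles and the universal role are not covered by the sketch.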
4 Craig Interpolation Property and Projective Beth Definability Property
We introduce the Craig interpolation property (CIP) as defined in (?) and the projective Beth definability property (PBDP) and prove Theorem 1 from the introduction to this article. We observe that the CIP implies the PBDP, but lack a proof of the converse direction. Nevertheless, all DLs considered in this paper enjoying the PBDP also enjoy the CIP.
Set , for any ontology and concept . Let be -ontologies and let be -concepts. Then an -concept is called an -interpolant for under if
• ;
• ;
• .
(Footnote 1: Important variations of this definition are to drop in Point 2 and in Point 3, respectively, or to consider only one ontology and regard the signature of the interpolant as an input given independently from . This has an effect on the CIP, but our results on interpolant computation and existence are not affected.)
Definition 1.
A DL has the Craig interpolation property (CIP) if for any -ontologies and -concepts such that there exists an -interpolant for under .
We next define the relevant definability notions. Let be an ontology and a concept name. Let be a signature. An -concept is an explicit -definition of under if . We call explicitly definable in under if there is an explicit -definition of under . The -reduct of an interpretation coincides with except that no symbol that is not in is interpreted in . A concept is called implicitly definable using under if the -reduct of any model of determines the set ; in other words, if and are both models of such that , then . It is easy to see that implicit definability can be reformulated as a standard reasoning problem as follows: a concept name is implicitly definable using under iff , where is obtained from by replacing every symbol not in (including ) uniformly by a fresh symbol .
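The renaming step in this reformulation is purely syntactic and can be sketched in a few lines of Python. The fragment below is an illustration under stated assumptions, not the article's construction: concepts are encoded as nested tuples such as ("exists", "r", "B") or ("and", C, D), symbols are plain strings, "TOP" is treated as a logical constant, and the fresh symbol for a name is obtained by appending a suffix that is assumed not to occur in the ontology. The names rename and renamed_copy are hypothetical. The entailment check itself would be delegated to a subsumption reasoner and is not shown.

```python
LOGICAL = {"TOP"}  # symbols with a fixed interpretation are never renamed


def rename(concept, sigma, suffix="_copy"):
    """Replace every symbol outside sigma by a fresh copy (name + suffix)."""
    if isinstance(concept, str):
        if concept in sigma or concept in LOGICAL:
            return concept
        return concept + suffix
    # First component is the constructor tag, e.g. "and" or "exists".
    op, *args = concept
    return (op, *(rename(arg, sigma, suffix) for arg in args))


def renamed_copy(ontology, sigma, suffix="_copy"):
    """Apply the renaming uniformly to both sides of every concept inclusion."""
    return [
        (rename(lhs, sigma, suffix), rename(rhs, sigma, suffix))
        for lhs, rhs in ontology
    ]


# Toy ontology with two concept inclusions (left-hand side, right-hand side).
ontology = [(("exists", "r", "B"), "A"), ("A", ("exists", "r", "TOP"))]
sigma = {"r", "B"}
ontology_copy = renamed_copy(ontology, sigma)
```

In this toy input the concept name A lies outside the signature and is therefore renamed, while r and B are shared between the ontology and its copy.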
Definition 2.
A DL has the projective Beth definability property (PBDP) if for any -ontology , concept name , and signature the following holds: if is implicitly definable using under , then is explicitly -definable under .
Remark 1.
The CIP implies the PBDP. To see this, assume that an -ontology , concept name and a signature are given, and that is implicitly definable from under . Then , with defined above. Take an -interpolant for under . Then is an explicit -definition of under .
Remark 2.
The PBDP implies that implicitly definable nominals are explicitly definable and that, more generally, every implicitly definable concept is explicitly definable. This can be shown by adding to the ontology for a fresh concept name and asking for an explicit definition of in the extended ontology.
Remark 3.
The CIP and PBDP are invariant under adding (interpreted as the empty set) to the languages introduced above. The straightforward proof is given in the appendix of the full version.
We next prove that the majority of tractable extensions of enjoy neither the CIP nor the PBDP.
Theorem 1. The following DLs enjoy neither the CIP nor the PBDP:
1. with the universal role,
2. with nominals,
3. with a single role inclusion ,
4. with role hierarchies and a transitive role,
5. with inverse roles.
In Points 2 to 5, the CIP/PBDP also fails if the universal role can occur in interpolants/explicit definitions.
Proof.
We first show that does not enjoy the PBDP. Point 1 then follows using Remark 1. We define an -ontology , signature , and concept name such that is implicitly definable using under but not -explicitly definable under . Define as the following set of CIs:
and let . We have , so is implicitly definable using under . The interpretations and given in Figure 1 show that is not explicitly -definable under . (Footnote 2: Here and in what follows we use standard syntax and semantics and set (?).)
Indeed, and are both models of , , , and the relation is a -simulation between and . As -concepts are preserved under -simulations (Lemma 1), if for some -concept , then from we obtain . This implies , and so . As , we obtain a contradiction.
We next prove Point 2. An example from (?) shows that does not enjoy the CIP/PBDP. Here we show that does not enjoy the CIP/PBDP, even if interpolants/explicit definitions are from . Let contain the following CIs:
and let . Observe that is implicitly definable using under as . The relation is an -simulation between the interpretations and defined in Figure 2. Now we can apply the same argument as in Point 1 to show that is not explicitly -definable under .
For Point 3, let contain
and let . Then is implicitly definable using under since
We show that there does not exist any -explicit definition of under .
The interpretations and given in Figure 3 are both models of , , , and the relation is an -simulation between and . One can now show in the same way as in Point 1 that no -definition of under exists.
Point 4 is shown in the appendix of the full version using a modification of the ontology used for Point 3.
To prove Point 5, obtain an -ontology from defined above by replacing the second CI of by . Let, as before, . Then is implicitly definable from under (the same explicit definition works), but is not explicitly -definable under (the same interpretations and work). ∎
We next discuss a general positive result on interpolation and explicit definition existence that shows that Theorem 1 is essentially optimal. A set of RIs is safe for a signature if for each RI , , if then .
Theorem 4.
Let be -ontologies with RIs, be -concepts, and set . Assume that the set of RIs in is safe for and . Then an -interpolant for under , exists.
The proof technique is based on simulations and similar to (?; ?). Theorem 4 has a few interesting consequences. For instance, with transitive roles enjoys both the CIP and PBDP since transitivity is expressed by the role inclusion which is safe for any signature (as it only uses a single role name).
5 Interpolant and Explicit Definition Existence
We introduce interpolant and explicit definition existence as decision problems and establish a polynomial time reduction of the latter to the former. We then show that it suffices to consider ontologies in normal form and that the addition of does not affect the complexity of the decision problems.
Definition 3.
Let be a DL. Then -interpolant existence is the problem to decide for any -ontologies and -concepts whether there exists an -interpolant for under .
Observe that interpolant existence reduces to checking for logics with the CIP but that this is not the case for logics without the CIP.
Definition 4.
Let be a DL. Then -explicit definition existence is the problem to decide for any -ontology , signature , and concept name whether is explicitly definable in under .
Remark 4.
There is a polynomial time reduction of -explicit definition existence to -interpolant existence. Moreover, any algorithm computing -interpolants also computes -explicit definitions and any bound on the size of -interpolants provides a bound on the size of -explicit definitions. The proof is similar to the proof of Remark 1.
We next observe that replacing the original ontologies by a conservative extension preserves interpolants and explicit definitions. Thus, it suffices to consider ontologies in normal form and interpolants for inclusions between concept names.
Lemma 2.
Let be ontologies and concepts in any DL considered in this paper. Then one can compute in polynomial time -ontologies in normal form and with fresh concept names such that an -concept is an interpolant for under iff it is an interpolant for under .
Proof.
Let and be normal form conservative extensions of and, respectively, , computed in polynomial time. One can show that and are as required. ∎
Remark 5.
Assume that is any of the DLs introduced above and let denote its extension with . Then -interpolant existence and -explicit definition existence can be reduced in polynomial time to -interpolant existence and -explicit definition existence, respectively. The converse direction also holds modulo an oracle deciding whether .
6 Interpolant and Explicit Definition Existence in Tractable Extensions
The aim of this section is to analyse interpolants and explicit definitions for extensions of with any combination of nominals, role inclusions, or the universal role. We show the following result from the introduction.
Theorem 2. For and any extension with any combination of nominals, role inclusions, the universal role, or , the existence of interpolants and explicit definitions is in PTime. If an interpolant/explicit definition exists, then there exists one of at most exponential size that can be computed in exponential time. This bound is optimal.
Before we start with a sketch of the proof we give instructive examples showing that the exponential bound on the size of explicit definitions is optimal.
Example 1.
Variants of the following example have already been used for various succinctness arguments in DL. Let
and . triggers a marker and a binary tree of depth whose leaves are decorated with . Conversely, if is true at all leaves of a binary tree of depth , then is true at all nodes of the tree and together with entail at its root. Let, inductively, and , for , and . Then is the smallest explicit -definition of under . Next let
and . Then is the smallest explicit -definition of under .
Observe that using one enforces explicit definitions of exponential size by generating a binary tree of linear depth whereas using this is achieved by generating a path of exponential length. The latter can only happen if role inclusions are used in the ontology. One insight provided by the exponential upper bound on the size of explicit definitions in Theorem 2 is that the two examples cannot be combined to enforce a binary tree of exponential depth.
To continue with the proof we introduce ABoxes as a technical tool that allows us to move from interpretations to (potentially incomplete) sets of facts and concepts. An ABox is a (possibly infinite) set of assertions of the form , , , and with , , , and individual variables (we call individuals used in ABoxes variables to distinguish them from individual names used in nominals). We denote by the set of individual variables in . A -ABox is an ABox using symbols from only. Models of ABoxes are defined as usual. We do not make the unique name assumption.
Every interpretation defines an ABox by identifying every with a variable and taking if , if , if . Conversely, ABoxes define interpretations in the obvious way (by identifying variables if ). We associate with every ABox a directed graph . Let be a set of individual names. Then is ditree-shaped modulo if after dropping some facts of the form with for some , it is ditree-shaped in the sense that is acyclic and and imply . A pointed ABox is a pair with . Then -concepts correspond to pointed -ABoxes such that is ditree-shaped modulo and -concepts correspond to rooted pointed -ABoxes such that is ditree-shaped modulo , where is called rooted if for every there is a path from to in . We write if for every model of and .
Given an -ontology in normal form and a concept name , one can construct in polynomial time the canonical model of and using the approach introduced in (?). More generally, the canonical model for an ABox and ontology can be constructed in polynomial time and is a model of both and such that for any -concept using symbols from only and any ,
() iff ,
details are given in the appendix of the full version. We let with . Note that in (?) the condition () is only stated for subconcepts of the ontology , thus () requires a proof.
Example 2.
The interpretations defined in the proof of Theorem 1 define canonical models with for the ontologies . The interpretations define canonical models with the -reduct of regarded as an ABox and .
The directed unfolding of a pointed -ABox into a pointed -ABox that is ditree-shaped modulo is defined in the standard way. In the rooted directed unfolding, nodes that cannot be reached from via role names are dropped.
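The unfolding can be pictured as replacing each node by the directed paths that lead to it from the distinguished variable. The Python fragment below is a minimal sketch of this for a finite ABox, cut off at a given depth. It is an illustration under simplifying assumptions and not the article's definition: nominals (and hence the "modulo" part) are ignored, only nodes reachable from the root along role assertions are generated, and the encoding of assertions as tuples as well as the name directed_unfolding are hypothetical.

```python
def directed_unfolding(concept_facts, role_facts, root, depth):
    """Unfold a finite ABox from `root` along role assertions up to `depth`.

    concept_facts: set of (A, x) for assertions A(x)
    role_facts:    set of (r, x, y) for assertions r(x, y)
    Nodes of the unfolding are paths (root, r1, x1, ..., rn, xn).
    """
    unfolded_concepts, unfolded_roles = set(), set()
    frontier = [(root,)]
    for level in range(depth + 1):
        next_frontier = []
        for path in frontier:
            x = path[-1]  # variable of the original ABox this path ends in
            # Copy the concept assertions of x to the path node.
            unfolded_concepts |= {(A, path) for (A, y) in concept_facts if y == x}
            if level < depth:
                for (r, y, z) in role_facts:
                    if y == x:
                        child = path + (r, z)
                        unfolded_roles.add((r, path, child))
                        next_frontier.append(child)
        frontier = next_frontier
    return unfolded_concepts, unfolded_roles


# A single reflexive r-loop on a variable satisfying A unfolds into an r-chain.
concepts, roles = directed_unfolding({("A", "a")}, {("r", "a", "a")}, "a", depth=2)
```

Without the depth bound the unfolding of a cyclic ABox is infinite, which is why the sketch truncates; every node created this way has exactly one incoming role assertion, so the result is ditree-shaped.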
Assume now that is in normal form and a concept name. Let be the -reduct of the canonical model , regarded as an ABox. Denote by the directed unfolding of , by the sub-ABox of rooted in , and by its rooted directed unfolding. Theorem 2 is a direct consequence of the following characterization of interpolants.
Theorem 5.
There exists a polynomial such that the following conditions are equivalent for all -ontologies in normal form, concept names , and :
1. An -interpolant for under exists;
2. ;
3. there exists a finite subset of with such that the -concept corresponding to is an -interpolant for under .
The same equivalences hold if in Points 1 to 3, is replaced by , by , and by .
In Point 3, can be computed in exponential time, if it exists.
Note that the polynomial time decidability of interpolant existence follows from Point 2 of Theorem 5 (and the tractability of (?)).
Example 3.
The following example illustrates the difference between the existence of explicit definitions in and and thus the need for moving to the ABoxes , and if one does not admit the universal role in explicit definitions.
Example 4.
Let and let . Then is explicitly -definable under since but is not explicitly -definable. Note that in this case but .
We next sketch the proof idea for Theorem 5 for the case with universal role in interpolants. We show “1. ⇒ 2.”, observe that “3. ⇒ 1.” is trivial, and then sketch the proof of “2. ⇒ 3.” and the exponential time algorithm computing interpolants; details are provided in the appendix of the full version. For “1. ⇒ 2.” assume that is an -concept with (i) and (ii) . By () and (i), . But then by (ii) , as required.
If one does not impose a bound on the size of in Point 3, then one can prove “2. ⇒ 3.” using compactness and a generalization of unraveling tolerance according to which and entail the same (?; ?). As we are interested in an exponential bound on the size of (and a deterministic exponential time algorithm computing it), we require a more syntactic approach. Our proof of “2. ⇒ 3.” is based on derivation trees, which represent a derivation of a fact from an ontology and ABox using a labeled tree. Our derivation trees generalize those introduced in (?; ?) to languages with nominals and role inclusions. Reflecting the use of individual names and concept names in the construction of the domain of the canonical model (?), we assume and . Then a derivation tree for is a tree with a labeling function such that and satisfies rules stating under which conditions the label of is derived in one step from the labels of the successors of . To illustrate, the existence of successors of with and justifies if . The rules are given in the appendix of the full version; we only discuss the rule used to capture derivations using RIs: is justified if there are role names such that is a label of a successor of , , , and the situation depicted in Figure 4 holds, where the “dotted lines” stand for ‘either or some with are labels of successors of ’, and stands for ‘either or some is a label of a successor of and if and if ’. Moreover, for all , , there exists a successor of with label for some . The soundness of this rule should be clear; completeness can be shown similarly to the analysis of canonical models.
The length of the sequence can be exponential (for instance, in Example 1 for the fact in ). One can show, however, that its length can be bounded without affecting completeness by with a polynomial. The following lemma summarizes the main properties of derivation trees.
Lemma 3.
Let be an -ontology in normal form and a finite -ABox. Then
1. if and only if there is a derivation tree for in . Moreover, if a derivation tree exists, then there exists one of depth and outdegree bounded by which can be constructed in exponential time in .
2. If is a derivation tree for in of at most exponential size, then one can construct in exponential time (in ) a derivation tree for in with the directed unfolding of modulo and of the same depth as and such that the outdegree of does not exceed with the length of the longest chain used in the rule for RIs in the derivation tree .
Proof.
We sketch the idea. For Point 1, the bound on the depth of derivation trees can be proved by observing that one can assume (using a standard pumping argument) that the labels of distinct nodes on a single path are distinct and the bound on the outdegree can be proved by observing that one can trivially assume that all successor nodes of a node have distinct labels. For the construction of derivation trees, let denote the set of facts in for which there is a derivation tree of depth at most . Then one can construct in exponential time derivation trees for all facts in any , by starting with derivation trees of depth for members of , and then constructing derivation trees of depth for members of using the trees for members of . For Point 2, the transformation of into is by induction over rule application, the only interesting step being the rule for RIs. Using the ontology of Example 1 one can see that the exponential blow-up of the outdegree is unavoidable. ∎
We are now in a position to complete the sketch of the proof of “2. ⇒ 3.”. Assume that Point 2 holds. Then . By Point 1 of Lemma 3 we can construct a derivation tree for in of polynomial depth and outdegree in exponential time. By Point 2 of Lemma 3 we can transform into a derivation tree for in in exponential time. Now let be the restriction of to all which occur in a label of . Then is also a derivation tree for in and so . It follows that the -concept corresponding to is an interpolant for under . Its size is at most exponential in since is at most exponential in , and so also in .
7 Interpolant and Explicit Definition Existence in and Extensions
We analyze interpolants and explicit definitions for and its extensions with nominals and universal roles, and show the following result from the introduction.
Theorem 3. For and any extension with any combination of nominals, the universal role, or , the existence of interpolants and explicit definitions is ExpTime-complete. If an interpolant/explicit definition exists, then there exists one of at most double exponential size that can be computed in double exponential time. This bound is optimal.
The double exponential lower bound on the size of explicit definitions and interpolants is shown in the appendix of the full version. The proof is inspired by similar lower bounds for the size of FO-rewritings and uniform interpolants (?; ?). To prove the remaining claims of Theorem 3, we lift Theorem 5 to . The main differences are that (1) we now associate undirected graphs with ABoxes and also unfold along inverse roles; (2) that canonical models become potentially infinite but tree-shaped; (3) that therefore deciding the new variant of Point 2 of Theorem 5 is not an instance of standard entailment checking in , instead we give a reduction to emptiness checking for tree automata; and (4) that to bound the size of in Point 3, we employ transfer sequences (and not derivation trees) to represent how facts are derived.
In more detail, associate with every ABox the undirected graph We say that is tree-shaped if is acyclic, and imply , and implies for any . is tree-shaped modulo a set of individual names if after dropping some facts with or for some it is tree-shaped. We observe that -concepts correspond to pointed -ABoxes such that is tree-shaped modulo . -concepts correspond to weakly rooted pointed -ABoxes such that is tree-shaped modulo , where is called weakly rooted if for every there is a path from to in .
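As an illustration of the tree-shapedness condition, here is a minimal sketch under an assumed reading of the definition above: the undirected graph of the ABox is acyclic, and no two distinct role assertions connect the same pair of individuals, in either direction. The encoding of role assertions as triples and the function name `is_tree_shaped` are hypothetical; the "modulo" variant is not covered.

```python
from typing import Dict, Iterable, Tuple

RoleAssertion = Tuple[str, str, str]  # (role, source, target); an assumed encoding


def is_tree_shaped(assertions: Iterable[RoleAssertion]) -> bool:
    """Check acyclicity of the undirected graph and uniqueness of the
    role assertion between any two individuals (assumed reading)."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    seen: Dict[frozenset, RoleAssertion] = {}
    for role, a, b in assertions:
        if a == b:
            return False  # a self-loop is a cycle
        key = frozenset((a, b))
        if key in seen:
            if seen[key] != (role, a, b):
                return False  # second, different assertion on the same pair
            continue  # exact duplicate, ignore
        seen[key] = (role, a, b)
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            return False  # the new edge would close an undirected cycle
        parent[root_a] = root_b
    return True
```

Union-find keeps the check close to linear in the number of role assertions.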
For every -ontology and concept there exists a (potentially infinite) pointed canonical model such that the ABox corresponding to is tree-shaped modulo . The property () used in the context of canonical models for tractable extensions of holds here as well. We also require the undirected unfolding of a pointed -ABox into a pointed -ABox which is tree-shaped modulo . In the rooted undirected unfolding, nodes that cannot be reached from via roles are dropped.
Assume now that is in normal form and a concept name. Let be the -reduct of the canonical model , regarded as an ABox. Denote by the undirected unfolding of , by the sub-ABox of weakly rooted in , and by its rooted undirected unfolding. Then we lift Theorem 5 as follows.
Theorem 6.
There exists a polynomial such that the following conditions are equivalent for all -ontologies in normal form, concept names , and :
1. An -interpolant for under exists;
2. ;
3. there exists a finite subset of with such that the -concept corresponding to is an -interpolant for under .
The same equivalences hold if in Points 1 to 3, is replaced by , by , and by .
In Point 3, can be computed in double exponential time, if it exists.
We first sketch how tree automata are used to show that Point 2 entails an exponential time upper bound for deciding the existence of an interpolant. To this end we represent finite prefix-closed subsets of as trees and design
- a non-deterministic tree automaton over finite trees (NTA), , that accepts exactly those trees that represent prefix-closed finite subsets of ;
- a two-way alternating tree automaton over finite trees (2ATA), , that accepts exactly those trees that represent a pointed ABox with .
Similar tree automata techniques have been used, for example, in (?). is constructed using the definition of canonical models; its states are essentially types occurring in the canonical model and it can be constructed in exponential time. The 2ATA tries to construct a derivation tree for in , given as input a tree representing . It has polynomially many states, and can thus be turned into an equivalent NTA with exponentially many states (?). By taking the intersection with , one can then check in exponential time whether , that is, whether .
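The final emptiness check can be sketched with the standard reachability algorithm for bottom-up tree automata over finite trees. This is a generic illustration, not the construction of the paper: the encoding of transitions as triples `(symbol, child_states, target_state)` and the function names are assumptions, and the product construction for the intersection is omitted.

```python
from typing import Hashable, Iterable, Set, Tuple

State = Hashable
Transition = Tuple[str, Tuple[State, ...], State]  # (symbol, child states, target)


def reachable_states(transitions: Iterable[Transition]) -> Set[State]:
    """States q such that some finite tree has a run ending in q."""
    rules = list(transitions)
    reached: Set[State] = set()
    changed = True
    while changed:
        changed = False
        for _symbol, children, target in rules:
            # Leaf transitions have no children and fire immediately.
            if target not in reached and all(c in reached for c in children):
                reached.add(target)
                changed = True
    return reached


def is_empty(transitions: Iterable[Transition], final_states: Set[State]) -> bool:
    """The automaton accepts no tree iff no final state is reachable."""
    return reachable_states(transitions).isdisjoint(final_states)
```

The loop runs in time polynomial in the number of transitions, so the exponential cost in the text comes from the size of the automata, not from the emptiness test itself.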
We return to the proof of Theorem 6. The interesting parts are the implication “2. ⇒ 3.” and the double exponential computation of interpolants. In this case we use transfer sequences to obtain a bound on the size of the subset of needed to derive (we note that for without nominals one can also use the automata encoding above). Transfer sequences describe how facts are derived in a tree-shaped ABox and allow one to determine when individuals and behave sufficiently similarly so that the subtree rooted at can be replaced by the subtree rooted at (?) without affecting a derivation. This technique can be used to show that one can always choose a prefix-closed subset of of at most exponential depth. This also implies that can be obtained in double exponential time by constructing the canonical model up to depth with a polynomial.
8 Expressive Horn Description Logics
We address two questions regarding expressive Horn-DLs. (1) Can our results for and extensions be lifted to more expressive Horn-DLs? (2) In the examples provided in the proof of Theorem 1 we sometimes (for example, for and ) construct explicit Horn-DL definitions to show implicit definability of concept names. Are Horn-DL concepts always sufficient to obtain an explicit definition if an implicit definition exists? We provide a positive answer to (1) if one only admits -concepts (or fragments) as interpolants/explicit definitions and a negative answer to (2) in the sense that and various other Horn-DLs do not enjoy the CIP/PBDP even if one admits Horn-DL concepts as interpolants/explicit definitions.
We introduce expressive Horn DLs (?), presented here in the form proposed in (?). Horn--concepts and Horn--CIs are defined by the syntax rules
with ranging over concept names, over individual names, and over roles (including the universal role). As usual, the fragment of Horn- without nominals and the universal role is denoted by Horn- and Horn- denotes the fragment of Horn- without inverse roles.
Theorem 7.
Let be the pair Horn- or the pair Horn-, . Then
- deciding the existence of an -interpolant for an -CI under -ontologies is ExpTime-complete;
- deciding the existence of an explicit -definition of a concept name under an -ontology is ExpTime-complete.
Moreover, if an -interpolant/explicit definition exists, then there exists one of at most double exponential size that can be computed in double exponential time.
Theorem 7 follows from Theorem 3 and the fact that for any -ontology one can construct in polynomial time an -ontology in normal form that is a conservative extension of (see (?) for a similar result). We next show that despite the fact that Horn--concepts sometimes provide explicit definitions if none exist in (proof of Theorem 1), they are not sufficient to prove the CIP/PBDP.
Theorem 8.
There exists an ontology in Horn- (and in ), a signature , and a concept name such that is implicitly definable using under but does not have an explicit Horn--definition.
Proof.
We modify the ontology used in the proof of Point 1 of Theorem 1. Let and let contain and the following CIs:
Intuitively, the final two CIs should be read as
and the concept name is introduced to achieve this in a projective way as the latter CI is not in Horn-.
is implicitly definable using under since
To show that is not explicitly Horn--definable under , consider the interpretations and in Figure 5. The claim follows from the facts that and are models of , , , but implies holds for every Horn--concept . The latter can be proved by observing that there exists a Horn--simulation between and (?) containing ; for details, we refer the reader to the appendix of the full version. To obtain an example in , it suffices to take a conservative extension of in . ∎
9 Discussion
For a few important extensions of the complexity of interpolant and explicit definition existence remains to be investigated. Examples include extensions of with role inclusions, and extensions of or with functional roles or more general number restrictions. It would also be of interest to investigate interpolant existence if Horn-concepts are admitted as interpolants (using, for example, the games introduced in (?)). Finally, the question arises whether there exists at all a decidable Horn language extending, say, Horn-, with the CIP/PBDP. We note that Horn-FO enjoys the CIP (Exercise 6.2.6 in (?)) but is undecidable and that we show in the appendix of the full version that the Horn fragment of the guarded fragment does not enjoy the CIP/PBDP.
Acknowledgments
This research was supported by the EPSRC UK grant EP/S032207/1.
References
- Areces, Blackburn, and Marx 2001 Areces, C.; Blackburn, P.; and Marx, M. 2001. Hybrid logics: Characterization, interpolation and complexity. J. Symb. Log. 66(3):977–1010.
- Artale et al. 2021a Artale, A.; Jung, J. C.; Mazzullo, A.; Ozaki, A.; and Wolter, F. 2021a. Living without Beth and Craig: Explicit definitions and interpolants in description logics with nominals and role hierarchies. In Proc. of AAAI.
- Artale et al. 2021b Artale, A.; Mazzullo, A.; Ozaki, A.; and Wolter, F. 2021b. On free description logics with definite descriptions. In Proc. of KR.
- Baader et al. 2016 Baader, F.; Bienvenu, M.; Lutz, C.; and Wolter, F. 2016. Query and predicate emptiness in ontology-based data access. J. Artif. Intell. Res. 56:1–59.
- Baader et al. 2017 Baader, F.; Horrocks, I.; Lutz, C.; and Sattler, U. 2017. An Introduction to Description Logic. Cambridge University Press.
- Baader, Brandt, and Lutz 2005 Baader, F.; Brandt, S.; and Lutz, C. 2005. Pushing the envelope. In Proc. of IJCAI, 364–369.
- Benedikt et al. 2016 Benedikt, M.; Leblay, J.; ten Cate, B.; and Tsamoura, E. 2016. Generating Plans from Proofs: The Interpolation-based Approach to Query Reformulation. Synthesis Lectures on Data Management. Morgan & Claypool Publishers.
- Benedikt et al. 2017 Benedikt, M.; Kostylev, E. V.; Mogavero, F.; and Tsamoura, E. 2017. Reformulating queries: Theory and practice. In Proc. of IJCAI, 837–843.
- Bienvenu et al. 2016 Bienvenu, M.; Hansen, P.; Lutz, C.; and Wolter, F. 2016. First order-rewritability and containment of conjunctive queries in Horn description logics. In Proc. of IJCAI, 965–971.
- Bienvenu, Lutz, and Wolter 2013 Bienvenu, M.; Lutz, C.; and Wolter, F. 2013. First-order rewritability of atomic queries in Horn description logics. In Proc. of IJCAI.
- Borgida, Toman, and Weddell 2016 Borgida, A.; Toman, D.; and Weddell, G. E. 2016. On referring expressions in query answering over first order knowledge bases. In Proc. of KR, 319–328.
- Chang and Keisler 1998 Chang, C., and Keisler, H. J. 1998. Model Theory. Elsevier.
- Deutsch, Popa, and Tannen 2006 Deutsch, A.; Popa, L.; and Tannen, V. 2006. Query reformulation with constraints. SIGMOD Rec. 35(1):65–73.
- Geleta, Payne, and Tamma 2016 Geleta, D.; Payne, T. R.; and Tamma, V. A. M. 2016. An investigation of definability in ontology alignment. In Blomqvist, E.; Ciancarini, P.; Poggi, F.; and Vitali, F., eds., Proc. of EKAW, 255–271.
- Hernich et al. 2020 Hernich, A.; Lutz, C.; Papacchini, F.; and Wolter, F. 2020. Dichotomies in ontology-mediated querying with the guarded fragment. ACM Trans. Comput. Log. 21(3):20:1–20:47.
- Hustadt, Motik, and Sattler 2005 Hustadt, U.; Motik, B.; and Sattler, U. 2005. Data complexity of reasoning in very expressive description logics. In Proc. of IJCAI, 466–471.
- Jung and Wolter 2021 Jung, J. C., and Wolter, F. 2021. Living without Beth and Craig: Definitions and interpolants in the guarded and two-variable fragments. In Proc. of LICS.
- Jung et al. 2019 Jung, J. C.; Papacchini, F.; Wolter, F.; and Zakharyaschev, M. 2019. Model comparison games for Horn description logics. In Proc. of LICS, 1–14. IEEE.
- Jung et al. 2020 Jung, J. C.; Lutz, C.; Martel, M.; and Schneider, T. 2020. Conservative extensions in Horn description logics with inverse roles. J. Artif. Intell. Res. 68:365–411.
- Konev et al. 2009 Konev, B.; Lutz, C.; Walther, D.; and Wolter, F. 2009. Formal properties of modularisation. In Modular Ontologies, volume 5445 of Lecture Notes in Computer Science. Springer. 25–66.
- Konev et al. 2010 Konev, B.; Lutz, C.; Ponomaryov, D. K.; and Wolter, F. 2010. Decomposing description logic ontologies. In Proc. of KR. AAAI Press.
- Koopmann and Schmidt 2015 Koopmann, P., and Schmidt, R. A. 2015. Uniform interpolation and forgetting for ALC ontologies with ABoxes. In Proc. of AAAI, 175–181. AAAI Press.
- Lutz and Wolter 2010 Lutz, C., and Wolter, F. 2010. Deciding inseparability and conservative extensions in the description logic EL. J. Symb. Comput. 45(2):194–228.
- Lutz and Wolter 2011 Lutz, C., and Wolter, F. 2011. Foundations for uniform interpolation and forgetting in expressive description logics. In Proc. of IJCAI, 989–995. IJCAI/AAAI.
- Lutz and Wolter 2012 Lutz, C., and Wolter, F. 2012. Non-uniform data complexity of query answering in description logics. In Brewka, G.; Eiter, T.; and McIlraith, S. A., eds., Proc. of KR.
- Lutz and Wolter 2017 Lutz, C., and Wolter, F. 2017. The data complexity of description logic ontologies. Logical Methods in Computer Science 13(4).
- Lutz, Piro, and Wolter 2011 Lutz, C.; Piro, R.; and Wolter, F. 2011. Description logic TBoxes: Model-theoretic characterizations and rewritability. In Walsh, T., ed., Proc. of IJCAI, 983–988. IJCAI/AAAI.
- Lutz, Seylan, and Wolter 2012 Lutz, C.; Seylan, I.; and Wolter, F. 2012. An automata-theoretic approach to uniform interpolation and approximation in the description logic EL. In Proc. of KR. AAAI Press.
- Lutz, Seylan, and Wolter 2019 Lutz, C.; Seylan, I.; and Wolter, F. 2019. The data complexity of ontology-mediated queries with closed predicates. Logical Methods in Computer Science 15(3).
- Maksimova and Gabbay 2005 Maksimova, L., and Gabbay, D. 2005. Interpolation and Definability in Modal and Intuitionistic Logics. Clarendon Press.
- Nikitina and Rudolph 2014 Nikitina, N., and Rudolph, S. 2014. (Non-)succinctness of uniform interpolants of general terminologies in the description logic EL. Artif. Intell. 215:120–140.
- Place and Zeitoun 2016 Place, T., and Zeitoun, M. 2016. Separating regular languages with first-order logic. Log. Methods Comput. Sci. 12(1).
- Seylan, Franconi, and de Bruijn 2009 Seylan, I.; Franconi, E.; and de Bruijn, J. 2009. Effective query rewriting with ontologies over DBoxes. In Proc. of IJCAI, 923–925.
- Sofronie-Stokkermans 2008 Sofronie-Stokkermans, V. 2008. Interpolation in local theory extensions. Log. Methods Comput. Sci. 4(4).
- ten Cate et al. 2006 ten Cate, B.; Conradie, W.; Marx, M.; and Venema, Y. 2006. Definitorially complete description logics. In Proc. of KR, 79–89. AAAI Press.
- ten Cate, Franconi, and Seylan 2013 ten Cate, B.; Franconi, E.; and Seylan, İ. 2013. Beth definability in expressive description logics. J. Artif. Intell. Res. 48:347–414.
- ten Cate 2005 ten Cate, B. 2005. Interpolation for extended modal languages. J. Symb. Log. 70(1):223–234.
- Toman and Weddell 2011 Toman, D., and Weddell, G. E. 2011. Fundamentals of Physical Design and Query Compilation. Synthesis Lectures on Data Management. Morgan & Claypool Publishers.
- Toman and Weddell 2021 Toman, D., and Weddell, G. E. 2021. FO rewritability for OMQ using Beth definability and interpolation. In Homola, M.; Ryzhikov, V.; and Schmidt, R. A., eds., Proc. of DL. CEUR-WS.org.
- Vardi 1998 Vardi, M. Y. 1998. Reasoning about the past with two-way automata. In Proc. of ICALP’98, 628–641.
Appendix A Further Preliminaries
We call an ontology a conservative extension of an ontology if for all and every model of can be expanded to a model of by modifying the interpretation of symbols in . In other words, the -reducts of and coincide. The following result is folklore (?).
Lemma 4.
Let be any DL from or an extension with the universal role, and let be an -ontology. Then one can construct in polynomial time an -ontology in normal form such that is a conservative extension of .
We next give a more detailed introduction to ABoxes and how they relate to concepts. Recall that an ABox is a (possibly infinite) set of assertions of the form , , , and with , , , and individual variables. An ABox is factorized if imply .
ABox assertions are interpreted in an interpretation using a variable assignment that maps individual variables to elements of . Then satisfies an assertion if , if , if , and is always satisfied. satisfies an ABox if it satisfies all assertions in it. We write if there exists an assignment with such that satisfies . We say that an assertion is entailed by an ontology and ABox , in symbols , if for all models of and assignments such that satisfy . This is the standard notion of entailment from a knowledge base consisting of an ontology and an ABox. Deciding entailment is in PTime for the DLs between and (?) and ExpTime-complete for the DLs between and (?).
Every interpretation defines a factorized ABox by identifying every with a variable and taking if , if , if . Conversely, factorized ABoxes define interpretations in the obvious way.
The following lemma provides a formal description of the relationship between ABoxes that are ditree-shaped modulo some set of individual names and -concepts.
Lemma 5.
For any -concept one can construct in polynomial time a pointed -ABox such that is ditree-shaped modulo and iff , for all interpretations and .
Conversely, for any pointed -ABox such that is a ditree-shaped ABox modulo , one can construct in polynomial time an -concept such that and iff , for all interpretations and .
The above also holds if one replaces -concepts by -concepts and requires the pointed ABoxes to be rooted.
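The correspondence between concepts and ditree-shaped ABoxes can be illustrated for the nominal-free core. The following is a minimal sketch, assuming concepts are encoded as nested tuples `("top",)`, `("name", A)`, `("and", C, D)` and `("exists", r, C)`, and assertions as tuples tagged `"top"`, `"concept"` or `"role"`; these encodings and the function names are hypothetical, and nominals and the universal role are not handled.

```python
from itertools import count
from typing import Set, Tuple


def concept_to_abox(concept: tuple) -> Tuple[Set[tuple], str]:
    """Translate a concept into a pointed ditree-shaped ABox (assertions, root)."""
    fresh = count()
    assertions: Set[tuple] = set()

    def build(c: tuple, x: str) -> None:
        kind = c[0]
        if kind == "top":
            assertions.add(("top", x))
        elif kind == "name":
            assertions.add(("concept", c[1], x))
        elif kind == "and":
            build(c[1], x)
            build(c[2], x)
        elif kind == "exists":
            y = f"x{next(fresh)}"  # fresh individual variable for the successor
            assertions.add(("role", c[1], x, y))
            build(c[2], y)
        else:
            raise ValueError(f"unsupported constructor: {kind}")

    root = f"x{next(fresh)}"
    build(concept, root)
    return assertions, root


def abox_to_concept(assertions: Set[tuple], root: str) -> tuple:
    """Converse direction; terminates only if the ABox is ditree-shaped below root."""
    conjuncts = [("name", a[1]) for a in sorted(assertions)
                 if a[0] == "concept" and a[2] == root]
    for a in sorted(assertions):
        if a[0] == "role" and a[2] == root:
            conjuncts.append(("exists", a[1], abox_to_concept(assertions, a[3])))
    if not conjuncts:
        return ("top",)
    result = conjuncts[0]
    for c in conjuncts[1:]:
        result = ("and", result, c)
    return result
```

Both directions visit each subconcept or assertion a bounded number of times per node, in line with the polynomial bound stated in the lemma.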
We define a canonical model for an -ontology in normal form and a concept name . This has been done in (?), but as we do not use canonical models for subsumption or instance checking we give a succinct model-theoretic construction.
Assume and are given and is in normal form. Define an equivalence relation on the set of individual names in by setting if . Let and set . Say that a concept name is absorbed by an individual name if and let denote the set of concept names in such that and is not absorbed by any individual name.
Now let and let
for every concept name , , and . We often denote the nodes and by or, for simplicity, and, respectively, . If is absorbed by an individual we still often denote by .
Lemma 6.
The canonical model is a model of and for every model of and any with , , where is any signature.
Proof.
We first show that is a model of . It is straightforward to show that satisfies the CIs of the form , , .
Assume now that and with of the form or . We have , . Thus . But then and . Thus , as required.
Assume now that and . Then there exists such that and . Hence and . Thus, . Hence since , . But then , as required.
Finally, assume that and . Then there are with for all , where and . We obtain for all . Thus . Hence . Hence , as required.
Let be a model of with . Define a relation between and as follows: for any and , let if . One can now show that this is well-defined and that for any there exists a with . It is straightforward to show that is a -simulation, as required. ∎
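For finite interpretations, the existence of a simulation as used in this proof can be checked by computing the largest simulation as a greatest fixpoint. The sketch below covers only concept names and role names from a given signature; the additional conditions that the paper's simulations impose for nominals and the universal role are not modelled. The dictionary encoding of interpretations and the function names are assumptions made for illustration.

```python
from typing import Dict, Set, Tuple


def successors(interp: Dict, role: str, d: str) -> Set[str]:
    """All role-successors of d in the given interpretation."""
    return {y for (x, y) in interp["roles"].get(role, set()) if x == d}


def largest_simulation(I: Dict, J: Dict, concept_names: Set[str],
                       role_names: Set[str]) -> Set[Tuple[str, str]]:
    """Largest relation S such that (d, e) in S implies: every signature
    concept name true at d is true at e, and every r-successor of d is
    simulated by some r-successor of e."""
    def labels(interp: Dict, d: str) -> Set[str]:
        return {A for A in concept_names if d in interp["concepts"].get(A, set())}

    # Start from all pairs that agree on concept names, then refine.
    sim = {(d, e) for d in I["domain"] for e in J["domain"]
           if labels(I, d) <= labels(J, e)}
    changed = True
    while changed:
        changed = False
        for d, e in list(sim):
            for r in role_names:
                unmatched = any(
                    not any((d2, e2) in sim for e2 in successors(J, r, e))
                    for d2 in successors(I, r, d)
                )
                if unmatched:
                    sim.discard((d, e))
                    changed = True
                    break
    return sim
```

A pointed interpretation is simulated by another exactly when the pair of distinguished elements survives the refinement.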
Lemma 7.
Let be an -ontology in normal form, a concept name, and an -concept. Then the following conditions are equivalent:
-
1.
;
-
2.
.
Next assume that and an ABox are given. Assume is in normal form. Then one can construct in polynomial time a canonical model of that satisfies via an assignment . The details are straightforward, and we only give the main properties of .
Lemma 8.
Given an -ontology in normal form and an ABox one can construct in polynomial time a model of and an assignment such that for all and all -concepts the following conditions are equivalent:
-
1.
;
-
2.
.
The following lemma provides a formal description of the relationship between ABoxes that are tree-shaped modulo some set of individual names and -concepts.
Lemma 9.
For any -concept one can construct in polynomial time a pointed -ABox such that is tree-shaped modulo and iff , for all interpretations and .
Conversely, for any pointed -ABox such that is a tree-shaped ABox modulo , one can construct in polynomial time an -concept such that and iff , for all interpretations and .
The above also holds if one replaces -concepts by -concepts and requires the pointed ABoxes to be weakly rooted.
Appendix B Proof for Section 4
We start by proving Remark 3.
Proof of Remark 3. We have to show that the CIP and PBDP are invariant under adding (interpreted as the empty set) to the languages introduced in this paper. Assume that is any such language and let denote its extension with . We claim that enjoys the CIP/PBDP iff does. We show this for the CIP; the proof for the PBDP is similar. Assume first that and are a counterexample to the CIP of . Then they are also a counterexample to the CIP of . Conversely, assume that and are a counterexample to the CIP of . We may assume that no CI in uses in the concept on its left-hand side (if it does, the CI is redundant). Let be a fresh concept name and replace by in and . Also add to the CIs
for all role names in ) and . We also let range over inverse roles in ) if admits inverse roles, the universal role if admits the universal role, and over nominals in ) if admits nominals. Let denote the resulting ontology. Then it is easy to see that and are a counterexample to the CIP of .
We continue with a few comments and missing proofs for Theorem 1.
Theorem 1. The following DLs do not enjoy the CIP nor PBDP:
1. with the universal role,
2. with nominals,
3. with a single role inclusion ,
4. with role hierarchies and a transitive role,
5. with inverse roles.
In Points 2 to 5, the CIP/PBDP also fails if the universal role can occur in interpolants/explicit definitions.
Proof.
We first supply a proof for Point 4. Let contain
and let . Then is implicitly definable using under since
In the same way as above, the interpretations and given in Figure 6 show that has no -definition under .
We next observe that Point 5 can easily be strengthened. Not only does the concept name have no explicit -definition, but no such definition exists even in the positive fragment of . To see this, consider the interpretations given in Figure 7.
Observe that the interpretations show that is not definable under using any concept constructed from using since for any such concept we have for that implies . Of course, the interpretations and given in Figure 7 also demonstrate that concepts with implicit definitions in may not have explicit definitions in positive . The interpretations depicted in Figure 7 differ from the interpretations constructed previously in that they are not the canonical models. The nodes and are not enforced by the ontology but are needed to ensure that does not distinguish and . ∎
Appendix C Proofs for Section 5
We give a proof for Remark 5.
Proof of Remark 5. Assume that is any DL introduced in this paper and let denote its extension with . The polynomial time reductions of -interpolant existence and -explicit definition existence to -interpolant existence and -explicit definition existence, respectively, are trivial. For the converse direction, we consider the CIP; the reduction for the PBDP is similar. The idea is the same as in Remark 3. Assume that and are in . If , then an interpolant exists and we are done. Assume . We may assume that no CI in uses in the concept on its left-hand side (if it does, the CI is redundant). Now let be a fresh concept name and replace by in , , , and . Also add to the CIs
for all role names in ) and . We also let range over inverse roles in ) if admits inverse roles, the universal role if admits the universal role, and over nominals in ) if admits nominals. Let denote the resulting ontology. Then there exists an -interpolant for under iff there exists an -interpolant for under .
Appendix D Proofs for Section 6
We first give a proof of the polynomial time decidability of interpolant existence that has not been discussed in the main paper. Then we provide the missing proofs from the main paper.
The following complexity upper bound proof does not provide an upper bound on the size of interpolants/explicit definitions, but is more elementary than the one we sketched in the main paper.
We start by proving a characterization for the existence of interpolants using canonical models and simulations.
Lemma 10.
Let be -ontologies in normal form, concept names, and . Let . Then there does not exist an -interpolant for under iff there exists a model of and such that
-
1.
;
-
2.
.
Proof.
Assume an -interpolant exists, but there exists a model of and satisfying the conditions of the lemma. As , by Lemma 6, we obtain . By Lemma 1, . We have derived a contradiction to the condition that , is a model of , and .
Assume no -interpolant exists. Let
By Lemma 7 and compactness, there exists a model of and such that for all but . We may assume that is -saturated (see (?) for an introduction to -saturated interpretations and their properties). Thus, by a straightforward generalization of Lemma 1 from finite to -saturated interpretations, , and satisfies the conditions of the lemma. ∎
The characterization provided in Lemma 10 can be checked in polynomial time. Consider a fresh concept name for each for . We define the diagram of as the ontology consisting of the following CIs:
-
•
, for every and ;
-
•
, for every ;
-
•
, for every and ;
-
•
, for every .
Denote by the -reduct of the interpretation . Now it is straightforward to show that there exists a model of and such that the conditions of Lemma 10 hold for iff . The latter condition can be checked in polynomial time. If we aim at interpolants without the universal role we simply remove the CIs of the final item from the definition of , denote the resulting set of inclusions by and have that there exists a model of and such that the conditions of Lemma 10 hold for iff .
Directed Unfolding of ABox.
We give a precise definition of the directed unfolding of an ABox. Let be a factorized -ABox and . The directed unfolding of into a ditree-shaped ABox modulo is defined as follows. The individuals of are the words with role names and such that for any and and for all . We set and define
- if , for ;
- if and if for some and , for ;
- if , for and .
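To make the shape of the directed unfolding concrete, the following is a depth-bounded sketch under an assumed reading of the definition above: individuals of the unfolding are words that start at the distinguished individual or at an individual of the given set, follow role assertions, and are not extended through individuals of that set, which are kept as themselves. The encoding, the depth bound `max_depth`, and the name `directed_unfolding` are hypothetical and introduced only for illustration.

```python
from typing import List, Set, Tuple

Word = Tuple[str, ...]  # alternating individuals and role names, e.g. ("a", "r", "b")


def directed_unfolding(concepts: Set[Tuple[str, str]],
                       roles: Set[Tuple[str, str, str]],
                       root: str, gamma: Set[str], max_depth: int):
    """Unfold an ABox along role assertions up to max_depth steps.

    concepts: pairs (A, a); roles: triples (r, a, b).
    Returns (words, unfolded concept assertions, unfolded role assertions)."""
    words: List[Word] = [(s,) for s in sorted({root} | set(gamma))]
    frontier = list(words)
    unfolded_roles = set()
    for _ in range(max_depth):
        next_frontier = []
        for w in frontier:
            for r, a, b in sorted(roles):
                if a != w[-1]:
                    continue
                if b in gamma:
                    # Individuals of gamma are shared, not copied.
                    unfolded_roles.add((r, w, (b,)))
                else:
                    child = w + (r, b)  # fresh copy of b reached via w
                    unfolded_roles.add((r, w, child))
                    next_frontier.append(child)
        words.extend(next_frontier)
        frontier = next_frontier
    # A word inherits the concept assertions of its last individual.
    unfolded_concepts = {(A, w) for w in words for (A, a) in concepts if a == w[-1]}
    return words, unfolded_concepts, unfolded_roles
```

On a cyclic ABox the unbounded unfolding is infinite unless the cycle passes through the given set of individuals, which is why the sketch takes an explicit depth bound.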
Derivation Trees.
Fix an -ontology in normal form, a -ABox , and recall the definition of and . Let . A derivation tree for the assertion in is a finite -labeled tree , where is a set of nodes and the labeling function, such that
-
•
;
-
•
if , then (i) and or (ii) or (iii) and or
-
1.
for a concept name and has a successor with ; or
-
2.
for a concept name and has a successor such that and ; or
-
3.
has successors with for and and ; or
-
4.
has successors with , , and ; or
-
5.
the conditions of the rule for RIs discussed in the main paper hold: there are role names and members of such that is a label of a successor of , , , and the situation depicted in Figure 4 holds, where the “dotted lines” stand for ‘either or some with are labels of successors of ’, and stands for ‘either or some is a label of a successor of and if and if ’. Moreover, for all , , there exists a successor of with label for some ; or
-
6.
has a successor with and .
The purpose of Conditions 1 and 2 is to establish that it follows from and that is not empty. In this case is derived. The purpose of the remaining rules should be clear.
Example 5.
We use the ontology from Example 1. Recall that
Then is defined by setting
Recall that and that is an explicit definition of using under . Consider the ABox corresponding to the -reduct of . Then a derivation tree for in is defined by setting and taking a single successor of with . In the notation of Rule 5, we have and . We use that and .
We next show Part 1 of Lemma 3.
Proof of Part 1 of Lemma 3. Let be an -ontology in normal form and a finite -ABox. Assume with and is given. It is straightforward to show by induction that if there is a derivation tree for in , then . We construct a sequence of ABoxes as follows. Define as the union of and all assertions with an individual name in and with . Let be obtained from by applying one of the following rules:
-
1.
if , then add to ;
-
2.
if and , then add to ;
-
3.
if and , then add to ;
-
4.
if , then add to ;
-
5.
if there is a sequence of elements of and a sequence of role names such that and for every either or there is with such that for every :
-
•
; or
-
•
and there exists such that and ; or
-
•
and there exists such that and
and there exist and a role name such that , , and , then add to .
-
6.
if and , then add to .
Note that the sequence is finite, and denote by the final ABox.
Claim. There is a model of and such that for all and , implies .
Proof of the Claim. For all , we write if or for some . Notice that due to Rule 4, implies if and only if . It follows that is an equivalence relation. We let denote the equivalence class of . Start with an interpretation defined by:
By definition, satisfies all CIs in that do not involve role names or the universal role. We next extend by adding pairs of the form with to the interpretation of role names. In detail, if and there exist with and with such that , then add to . Also, if and there exist with and with such that , then add to . Finally, add any pair to if there exists an RI that follows from such that is in relation under the updated interpretations of . This defines an interpretation . By Rule 2 all CIs of the form are satisfied in . By definition, all RIs in are satisfied in . By Rules 5 and 6, all CIs of the form are satisfied as well. This finishes the proof of the claim.
Now suppose . By the Claim, we have . Since the six rules to construct are in one-to-one correspondence with Conditions (1)–(6) from the definition of derivation trees, we can inductively construct a derivation tree for in w.r.t. .
The remaining claims made in Part 1 of Lemma 3 have been shown in the main paper already. ∎
We next come to Part 2 of Lemma 3. The following example illustrates how one can construct from a derivation tree of in a derivation tree in with the directed unfolding of . The derivation tree has the same depth but the outdegree might be exponential.
Example 6.
Recall the ontology and concept name from Example 5. We consider the -reduct of the ABox corresponding to the canonical model . It is defined by . The directed unfolding has individuals
and the assertions
In a derivation tree for in we require that has successors labeled with:
We now give the general construction of the derivation tree in the directed unfolding from a derivation tree in the original ABox.
Proof of Part 2 of Lemma 3. Assume that is a derivation tree for in of at most exponential size. We obtain a very similar derivation tree for in with the directed unfolding of modulo . In fact, with the exception of Condition 5, the construction is identical. For Condition 5, one potentially has to introduce “copies” of the nodes in which correspond to the fresh individuals introduced in the unfolded ABox.
In the following construction of the following holds: if the label of in is , then the label of copies of in takes the form with . Moreover, if for some or , then the label of is identical to the label of . Note that is a mapping from to with
In detail, we define as follows from , starting with the root by setting .
Assume inductively that is a copy of , , and . To define the successors of and their labelings we consider the possible derivation steps for in : (i) if and , then and ; (ii) if , then ; (iii) if and , then . We next consider the cases 1 to 6:
-
1.
for a concept name and has a successor with : take a copy of as the only successor of and set .
-
2.
for a concept name and has a successor such that and : take a copy of as the only successor of and set .
-
3.
has successors with and and : take copies of as the successors of and set .
-
4.
has successors with , , and : take copies of as successors of and set , , and .
-
5.
Suppose that has successors such that the conditions of Point 5 for derivation trees hold for and members of . We define the new members of and relevant successors of with labeling by induction. We set . Assume that has been defined and is the label of a copy of a successor of with label .
Case 1. . Then we set .
Case 2. There exists and successors of with and . Then we let and we introduce copies of with and .
Now assume that has been defined and is the label of a copy of a successor of with label .
Case 1. and for some successor of . If for some , then we set and introduce a copy of and set . Observe that . Otherwise (if no with exists), we set and introduce a copy of and set .
Case 2. , for some successor of , and for a successor of and (if ) or (if ), respectively. Then we introduce copies of and set , , and .
-
6.
has a successor with and : then introduce a copy of and set .
Then is a derivation tree for in satisfying the conditions of the lemma. ∎
The proof of “2. ⇒ 3.” of Theorem 5 is now as sketched in the main paper. Note also that we can construct in exponential time since we can construct the derivation tree for in in exponential time, then lift it to a derivation tree in its unfolding in exponential time, and from that derivation tree obtain the individuals in the ABox in exponential time.
A proof of the statement of Theorem 5 for interpolants without the universal role is obtained from the proof above in a straightforward way.
We conclude this section with a deferred proof of Theorem 4.
Theorem 4. Let be -ontologies with RIs, be -concepts, and set . Assume that the set of RIs in is safe for and . Then an -interpolant for under , exists.
Proof.
For convenience of notation, we assume w.l.o.g., by Lemma 2, that and are in normal form, , and . Suppose for a proof by contradiction that but there exists no -interpolant for . Then . Moreover, since the language under consideration contains neither nominals nor the universal role, this strengthens to .
Let be the canonical model of and . In what follows, we identify the domain of and individuals of , and consider both to be subsets of the domain of . By the properties of the canonical model, we then have . Furthermore, as is a model for both and , there exists a -simulation between and such that for all .
Consider an interpretation defined as follows:
where is a concept or role name. If we immediately derive a contradiction as we then have and , contradicting .
-
•
If does not contain RIs, as and are identical on all elements except , for all the relation is a -simulation between and . Conversely, the embedding of into generates a simulation, that is for all . By Lemma 1, for any --concept and for all we have if, and only if . Thus, is a model of CIs in . By construction .
-
•
Suppose that contains RIs. Since the interpretation may not satisfy some RIs, we consider a sequence of interpretations obtained by extending the interpretations of roles in to satisfy RIs. We give the construction of , for , in Figure 8.
A simple inductive argument shows that by the safety condition and the fact that we have that .
Furthermore, we prove by induction that the relation is a -simulation between and . For this has been established above. For the induction step it suffices to consider -successors of in , where is from the definition of above. By the induction hypothesis, is a -simulation between and . Then there exist with and for . As is a model of , we have and as required.
As canonical models defined in this paper are finite, there exists such that for all , . It can be seen that satisfies all RIs in and the satisfaction of CIs is proved similarly to the case above. Then is a model of with and , contradicting .
∎
Appendix E Proofs for Section 7
The section is organized as follows. We first introduce canonical models and derivation trees for . We then give the automata-based proof of the ExpTime upper bound for interpolant existence. We then show the double exponential lower bound on the size of explicit definitions, the implication “2. ⇒ 3.” of Theorem 6, and that interpolants can be computed in double exponential time.
Canonical Models.
Assume is an ontology in normal form and a concept name with . We introduce the canonical model . Let denote the set of subconcepts of concepts in , and denote by the set of with or a role name in and . We may assume that . An -type is a subset of such that implies for all concepts . We sometimes identify and . For a role , we write if is a maximal (w.r.t. inclusion) -type such that . Note that the set of all -types and relation can be computed in exponential time.
For any concept name , denotes the minimal -type containing and . Similarly, for any individual , denotes the minimal -type containing and . Let with and . The canonical model of and is defined as follows:
We also use to denote . The following properties of canonical models can be proved in a standard way.
Lemma 11.
For all -ontologies in normal form and concept names :
-
1.
is a model of ;
-
2.
for every model of and any with , ;
-
3.
for every -concept , if and only if .
We use to denote the ABox associated with the canonical model , and its -reduct. We denote the individuals and by and , respectively and observe that iff and iff .
Undirected Unfolding of an ABox.
We give a precise definition of the undirected unfolding of an ABox. Let be a -ABox and . The undirected unfolding of into a tree-shaped ABox modulo is defined as follows. The individuals of are the set of words with roles and such that for any and , and if is a role name and if is an inverse role, for all . We set and let
-
•
if , for ;
-
•
if and if for some and , for ;
-
•
if and if for some and , for ;
-
•
if , for and .
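The following Python fragment is a minimal sketch of how such an unfolding can be enumerated, under simplifying assumptions that are not taken from the text: an ABox is reduced to role assertions given as triples, the unfolding is cut off at a depth bound (the full unfolding is in general infinite), and the individuals that the unfolding is taken modulo are passed in as a set `special` that is never appended to a word. The names `undirected_unfolding`, `special`, and `max_depth` are hypothetical, and any further side conditions of the definition above are omitted.

```python
from typing import Dict, Iterable, List, Set, Tuple

Edge = Tuple[str, str, str]  # (role name, source individual, target individual)


def undirected_unfolding(edges: Iterable[Edge], root: str,
                         special: Set[str], max_depth: int) -> List[Tuple[str, ...]]:
    """Enumerate words a0 r1 a1 ... rn an (as tuples) with n <= max_depth.

    A step may follow a role assertion forwards (role r) or backwards
    (written r^-). Individuals in `special` are never appended to a word.
    """
    steps: Dict[str, Set[Tuple[str, str]]] = {}
    for r, a, b in set(edges):
        steps.setdefault(a, set()).add((r, b))          # forward step via r
        steps.setdefault(b, set()).add((r + "^-", a))   # backward step via r^-
    words = [(root,)]
    frontier = [(root,)]
    for _ in range(max_depth):
        next_frontier = []
        for word in frontier:
            for role, target in sorted(steps.get(word[-1], ())):
                if target in special:
                    continue  # assumption: special individuals are not unfolded
                next_frontier.append(word + (role, target))
        words.extend(next_frontier)
        frontier = next_frontier
    return words
```

For a single assertion `("r", "a", "b")` and depth bound 2, the sketch yields the words `a`, `a r b`, and `a r b r^- a`, which illustrates why the unfolding is called undirected.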
Derivation Trees.
Fix an -ontology in normal form and an ABox , and . Let , and . A derivation tree for the assertion in is a finite -labeled tree , where is a set of nodes and the labeling function, such that:
-
•
;
-
•
If with , then or or
-
1.
has successors , with , such that or for all , and defining if , and otherwise, we have ; or
-
2.
and has a single successor with ; or
-
3.
has a single successor with such that and (where is a role name or an inverse role).
-
•
If with , then or:
-
4.
There exists such that has successors , with and or for all , and, defining if , and otherwise, we have .
Note that a special case of rule 1 is when has two successors labeled and , and a special case of rule 4 is when has two successors labeled and .
We now prove the analogue of Lemma 3 for , except that we do not consider the size of derivation trees.
Lemma 12.
Let be an -ontology in normal form and a finite -ABox. Then
-
1.
if and only if there is a derivation tree for in .
-
2.
If is a derivation tree for in , then one can construct a derivation tree for in , with the undirected unfolding of , and such that .
Proof.
We start with the proof of Part 1. is straightforward. For , we construct a sequence of ABoxes generalized with assertions of the form . Take where the ’s are fresh individual variables. Let be obtained from by applying one of the following rules, where is a concept of the form or or , and :
-
1.
if , with or for some , and , where if and if , then add ;
-
2.
if then add ;
-
3.
if and , then add ;
-
4.
if , with or for some , and , where if and if , then add .
Note that the sequence is finite, and denote by the final ABox.
Claim. There is a model of and such that for all and , implies .
Proof of the Claim. For all , we write if for some . Notice that if , then by rule 4, and by rule 1. Therefore, implies if and only if . In particular, is an equivalence relation. We let denote the equivalence class of . Start with an interpretation defined by:
Let denote the conjunction of all concepts of the form , , , or such that . Let denote the canonical model for and rooted at . Due to rule 1 and the universality of , for every concept name or nominal , we have if and only if . Similarly, because of rule 4, for every , if and only if .
We can now define as follows: is the disjoint union of and all elements in domains . Interpretations of concept names and nominals are inherited from the or each element comes from. Finally, is obtained by taking the union of and all after replacing edges to/from with edges to/from . It is clear that for the variable assignment , satisfies , and thus so does .
By rule 1, all concept inclusions of of the form , , and are satisfied by . They are also satisfied by every (since is a model of ), and thus by . Now consider a concept inclusion , where is a role name or an inverse role. Recall that for every and , if and only if . Therefore, for all , implies . The case is similar. Since every satisfies , so does . Similarly, every concept inclusion is satisfied in : if the witness pair for is part of , this follows from rule 3, and if not, then it is part of some , which is by definition a model of . For concept inclusions of the form , we can observe that if there exists some such that , then is in for some , i.e., by rule 2, for all .
Finally, for all and , implies , i.e., . This concludes the proof of the claim.
Now suppose . By the Claim, we have . Since the four rules to construct are in one-to-one correspondence with Conditions (1)–(4) from the definition of derivation trees, we can inductively construct a derivation tree for in w.r.t. . This concludes the proof of Part 1.
The proof of Part 2 is similar to that of Lemma 3. We define as follows from , starting with the root by setting . At each step, if then for some such that . To define the labelings of the successors of , we consider the possible derivation steps for in .
-
1.
, and has successors , with , such that or for all , and defining if , and otherwise, we have . Take if , and if .
-
2.
and has a single successor with . Take .
-
3.
has a single successor with such that and (where is a role name or an inverse role). Take .
-
4.
and there exists such that has successors , with and or for all , and, defining if , and otherwise, we have . Take if , and if .
Then is a derivation tree for in w.r.t. . ∎
Tree Automata.
A tree is a non-empty set closed under prefixes and such that implies . It is -ary if . The node is the root of . As a convention, we take and . Note that is undefined. Given an alphabet , a -labeled tree is a pair consisting of a tree and a node-labeling function .
A non-deterministic tree automaton (NTA) over finite -ary trees is a tuple , where is a set of states, is the input alphabet, is the set of initial states, and is the transition relation. A run of an NTA over a -ary input is a -labeled tree such that for all with children , . It is accepting if . The language accepted by , denoted , is the set of all finite -ary -labeled trees over which has an accepting run.
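As an illustration of the notion of a run, here is a small Python sketch that decides whether an NTA accepts a given finite tree. The encoding is an assumption made for this example only: a tree is a pair `(label, children)`, a transition is a triple `(state, label, successor_states)` whose third component has exactly as many entries as the node has children (so leaves use the empty tuple), and the function names are hypothetical.

```python
from typing import List, Set, Tuple

Tree = Tuple[str, List["Tree"]]                # (label, list of child subtrees)
Transition = Tuple[str, str, Tuple[str, ...]]  # (state, label, child states)


def accepting_states(tree: Tree, delta: Set[Transition]) -> Set[str]:
    """Return all states q such that some run labels the root of `tree` with q."""
    label, children = tree
    child_sets = [accepting_states(child, delta) for child in children]
    result = set()
    for state, letter, successors in delta:
        if letter != label or len(successors) != len(children):
            continue
        # Every child must admit a run starting in the prescribed state.
        if all(s in ok for s, ok in zip(successors, child_sets)):
            result.add(state)
    return result


def accepts(tree: Tree, delta: Set[Transition], initial: Set[str]) -> bool:
    """A run is accepting if the state at the root is an initial state."""
    return bool(accepting_states(tree, delta) & set(initial))
```

The check runs bottom-up and is polynomial in the size of the tree and the automaton, since it never enumerates runs explicitly.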
A two-way alternating tree automaton over finite -ary trees (2ATA) is a tuple where is a finite set of states, is the input alphabet, is the initial state, and is a transition function. The transition function maps every state and input letter to a positive Boolean formula over the truth constants and and transition atoms of the form , where . The semantics is given in terms of runs. More precisely, let be a finite -ary -labeled tree and a 2ATA. An accepting run of over is a -labeled tree such that:
-
1.
, and
-
2.
for all with , there is a subset such that and for every , there is some successor of in with .
The language accepted by , denoted , is the set of all finite -ary -labeled trees for which there is an accepting run.
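Condition 2 asks for a set of transition atoms that satisfies a positive Boolean formula. The Python sketch below shows this satisfaction check in isolation. The representation is hypothetical: a formula is `True`, `False`, `("atom", x)`, `("and", f1, ..., fn)`, or `("or", f1, ..., fn)`, and an atom `x` is any hashable value, for example a pair of a direction and a state.

```python
def holds(formula, chosen) -> bool:
    """Check whether the set `chosen` of atoms satisfies a positive Boolean formula."""
    if formula is True or formula is False:
        return formula
    tag = formula[0]
    if tag == "atom":
        return formula[1] in chosen
    if tag == "and":
        return all(holds(sub, chosen) for sub in formula[1:])
    if tag == "or":
        return any(holds(sub, chosen) for sub in formula[1:])
    raise ValueError(f"unknown connective: {tag!r}")
```

Because the formulas contain no negation, satisfaction is monotone: enlarging `chosen` can never turn a satisfied formula into an unsatisfied one.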
From a 2ATA , one can compute in exponential time an NTA whose number of states is exponential in the number of states of and such that (?).
Interpolant Existence.
We now give the proof that Point 2 in Theorem 6 entails an exponential time upper bound for deciding the existence of an interpolant. We focus on the case of . Let be -ontologies in normal form, , and . We can assume that and .
As our proof relies on tree automata, let us first explain how we represent ABoxes that are tree-shaped modulo as trees over the alphabet , where
Intuitively, the nodes of the tree correspond to the individual variables of the ABox; labels indicate concepts that hold at the current node, while labels or are used to indicate which roles (if any) connect a node to its parent. Note that there need not be such a label or , so connected nodes in the tree representation are not necessarily connected in the ABox.
More precisely, we associate with every -labeled tree the following ABox, where are fresh individual variables:
Notice that is tree-shaped modulo . Conversely, for every ABox that is tree-shaped modulo , there exists a (not necessarily unique) tree such that . In addition, if the degree of is less than , then there exists a -ary tree such that . For instance, can be represented by a -ary tree for any larger than the number of concept inclusions in .
We also denote by the -reduct of .
We describe below an NTA with exponentially many states accepting trees that represent prefix-closed finite subsets of , and a 2ATA with polynomially many states accepting trees such that . The existence of an interpolant then reduces to the non-emptiness of .
Definition of .
We represent the canonical model for and by a tree with at the root of the tree, other inserted at arbitrary positions in the tree, and below if . We want to accept finite subsets of obtained by keeping a prefix-closed finite subset of nodes, and possibly removing some concepts and relations from the labels (including all concepts and relations not in ). To do so, the automaton will simply guess in its state the type of each node, and check that all guesses are locally consistent by allowing only transitions that match the definition of canonical models. Concretely, the states of the automaton consist of a pair of -types, where state should be interpreted as the parent node having type and the current node type .
To keep the definition simple, the automaton also accepts trees where, compared to the canonical model, some nodes are duplicated (that is, we do not require that the node corresponding to some is unique). This does not change the set of concepts entailed at the root.
We take , where
-
•
, where is the set of -types introduced in the definition of ;
-
•
;
-
•
For states and input letter , if the following conditions are satisfied, for all :
-
–
the current state and label are consistent with the definition of the canonical model: for all , ;
-
–
the set of concepts associated with is a subset of the -type : ;
-
–
the current type is stored in the state of all child nodes: for all , .
Note that can be computed in exponential time.
Lemma 13.
if and only if there exists such that , where is the root of .
Proof.
The run of on some can be used to define a homomorphism from to . Therefore, if then . Conversely, if then there exists a finite subset of such that . Take as any finite prefix of an encoding of that contains all nodes corresponding to individuals in . Then the labeling of with the full types from the canonical model defines an accepting run of on , and . ∎
Definition of .
The construction of relies on derivation trees. Intuitively, runs of on some correspond to derivation trees for in . The states of are
Intuitively, state is used to check that is entailed at the current node. States and are used to check the label of the current node. The initial state is , as we are trying to construct a derivation tree for at the root.
Let us now define the transition relation. From a state or , where and , the automaton simply checks the current label:
From a state , with and , the automaton checks that the current node has an -successor from which there exists a run starting in . This -successor can be (i) the parent of the current node, i.e. there is a run from from the current node and a run from from the parent node, (ii) some -th child of the current node, i.e. there is a run from and one from from the -th child, or (iii) an individual , i.e. there is a run from and from from the current node:
From a state , the automaton checks if (i) condition 1 from derivation trees can be applied, that is, there exist concepts of the form , , or such that and there exists a run from each from the current node, or (ii) condition 2 from derivation trees can be applied, which can be checked by propagating the search for a run from to all neighbouring nodes, or (iii) condition 3 from derivation trees can be applied, that is, there exists such that and the automaton has a run from starting from the current node:
From a state where or with , the automaton checks if condition 4 from derivation trees can be applied either (i) taking the current node as , that is, there exist concepts of the form , , or such that and there exists a run from each from the current node, or (ii) taking some other node as , which can be checked by propagating the search for a run from to all neighbouring nodes:
We also set
For or , if or , and otherwise, the automaton checks if conditions 1 or 3 from derivation trees can be applied:
Lemma 14.
For all finite -ary -labeled trees , we have if and only if .
Proof.
We observe that for all ,
-
•
For every -concept of the form , or with and , has a run starting from state on if and only if there exists a derivation tree for in .
-
•
For all , for every -concept or , has a run starting from state if and only if there exists a derivation tree for in . ∎
Lower Bound for Explicit Definitions.
We construct an -ontology , signature , and concept name such that the smallest explicit -definition of under is of double exponential size in . is a variant of ontologies constructed in (?; ?) and defined as follows. It contains ,
and
Let . Note that triggers a marker and a binary tree of depth using counter concept names and . A concept name is made true at the leaves. Conversely, if is true at the leaves of a binary tree of depth then is true at all nodes of the tree and is entailed by and at its root. Define inductively
Then is the smallest explicit -definition of under .
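To see where the double exponential size comes from, the following Python sketch builds a concept that describes a full binary tree and counts how often the leaf concept occurs in it. The shape of the inductive step (a conjunction of two existential restrictions over the previous concept) and the names `B`, `r`, `s` are assumptions chosen for illustration, not the definition used in the proof.

```python
def tree_concept(depth: int, leaf: str = "B", left: str = "r", right: str = "s"):
    """Build C_depth, where C_0 = leaf and C_{i+1} = (exists left.C_i) and (exists right.C_i)."""
    concept = leaf
    for _ in range(depth):
        concept = ("and", ("exists", left, concept), ("exists", right, concept))
    return concept


def leaf_occurrences(concept) -> int:
    """Count occurrences of concept names in the fully written-out concept."""
    if isinstance(concept, str):
        return 1
    if concept[0] == "and":
        return leaf_occurrences(concept[1]) + leaf_occurrences(concept[2])
    return leaf_occurrences(concept[2])  # ("exists", role, filler)
```

Each inductive step doubles the number of leaf occurrences, so a tree of depth 2^n written out as a concept has 2^(2^n) of them; for example, `leaf_occurrences(tree_concept(2**3))` equals `2**(2**3)`.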
Transfer Sequences.
For the proof of “2. ⇒ 3.” of Theorem 6 and the proof that interpolants can be computed in double exponential time, we require an extension of the notion of transfer sequences, first introduced in (?), to logics with nominals.
Assume that Condition 2 of Theorem 6 holds. So we have -ontologies in normal form, concept names , and such that . Set . We use to denote the ABox associated with the canonical model . We require some notation for the individuals that occur in . We set if and set . We say that concept name is absorbed by if . We denote the individual of by and the individuals of by . Note that if and if is absorbed by .
Given , we call the individuals of the form the subtree of rooted at .
By compactness we have a finite subset of containing such that . We may assume that is prefix closed and that contains
-
•
and for all ;
-
•
and for all .
We obtain the ABox from by adding the assertions
-
•
and , for all , where and are fresh individuals.
Let denote the set of individuals with and let denote the set of individuals with . Observe that and entail the same assertions for , so the additional individuals do not influence what is entailed. In fact, we introduce the individuals only to enable explicit bookkeeping about when in a transfer sequence (defined below) an assertion of the form or is derived.
We aim to define a small subset of such that for the concept corresponding to and such that still . If has at most exponential depth in the size of then we are done, as then is of at most double exponential size in the size of . We obtain from by determining and which behave ‘sufficiently similar’ such that if we obtain from by replacing the subtree rooted at in by the subtree rooted at , then we still have and for the concept defined by . The replacement of subtrees is then performed exhaustively.
For and to be sufficiently similar, we firstly require that (with the final type in for any ). This ensures that for the concept corresponding to . This also has the consequence that is (isomorphic to) a prefix closed subABox of . For the second condition for being sufficiently similar, we apply the notion of transfer sequences (?). To define transfer sequences, we consider derivations using and intermediate ABoxes such that
We admit to contain equations for , with the obvious semantics. Consider such an intermediate and . Then the set is defined as the set of assertions with and of the form
-
•
or with and ; or
-
•
with and ;
-
•
with and ;
-
•
with .
and . For , let
-
•
denote the restriction of to the individuals in the subtree of rooted at and ; and let
-
•
be the ABox obtained from by dropping from except for itself and .
Define the transfer sequence of w.r.t. as follows:
Intuitively, we first consider the set of assertions that are entailed by and at if we only use assertions in . We update by those assertions. Next we consider the set of assertions that are entailed by and the updated at if we only use assertions in the updated . We update again, and so on. It is not difficult to see that if and
-
•
the restrictions of to and coincide (modulo renaming to ) and
-
•
the transfer sequence of w.r.t. coincides with the transfer sequence of w.r.t. (modulo renaming to )
then one can replace by in and it still holds that for the resulting ABox . If in addition we require that , then the resulting ABox is (isomorphic to) a prefix closed subABox of and so the concept corresponding to the ABox is still entailed by w.r.t. .
By performing the above replacement exhaustively, we obtain a prefix closed subset of that is of depth with a polynomial and therefore has the properties required for Point 3 of Theorem 6. Such an can be constructed in at most double exponential time since one can construct the canonical model up to nodes of depth in double exponential time.
The claims stated in Theorem 6 for interpolants without the universal role are shown by modifying the proof above in a straightforward way.
Appendix F Proofs for Sections 8 and 9
We first complete the proof of Theorem 8 by showing that there is a Horn--simulation between the interpretations and defined in Figure 5. The definition of Horn-simulations is as follows. For any two sets and and a binary relation , we set
-
•
if for all there exists with ;
-
•
if for all there exists with .
A relation is a Horn--simulation between and if implies and the following hold:
-
•
for any , if and , then ;
-
•
for any role in , if and , then there exist and with and ;
-
•
for any role in , if and , then there is with and ;
-
•
if , then for every (where indicates that we have a simulation that does not only respect role names in but also the inverse of role names in ).
We write if there exists a Horn--simulation between and such that . It is shown in (?) that if , then all Horn--concepts true in all nodes in are also true in .
Now observe that the relation between and containing all pairs , , and is a Horn--simulation between the interpretations and defined in Figure 5, as required.
We next observe that moving to the Horn fragment Horn-GF of the guarded fragment is not sufficient to obtain a logic in which interpolants/explicit definitions always exist. To this end we modify the ontology given in the proof of Theorem 8. In detail, let contain the following CIs:
and also . Define the signature by setting . We note that, intuitively, the third and fourth CI should be read as
and the concept name is introduced to achieve this in a projective way as the latter CI is not in Horn-.
We first observe that is implicitly definable from under since
We next sketch the proof that is not explicitly Horn-GF-definable under . For a definition of Horn-GF and Horn-GF simulations we refer the reader to (?). Now consider the interpretations and defined in Figure 9. Both and are models of , , , but implies holds for every Horn-GF-formula , and the claim follows. The latter can be proved by observing that there exists a Horn-GF-simulation between and (?) containing . In fact, one can show that the relation containing all pairs , , and is a Horn-GF-simulation.
We finally make a few observations regarding the Horn fragment of first-order logic. Recall that Horn-FO is defined as the closure of formulas of the form ,
under conjunction, universal quantification, and existential quantification, where are sequences of individual variables and individual names (?). According to Exercise 6.2.6 in (?) Horn-FO has the following property.
Theorem 9.
Let be sentences in Horn-FO such that is not satisfiable. Then there exists a sentence in Horn-FO such that , , and is not satisfiable.
We directly obtain the following interpolation result.
Theorem 10.
Let be Horn--ontologies and let be Horn--concepts such that . Then there exists a formula in Horn-FO such that
-
•
;
-
•
;
-
•
.
Proof.
Take a fresh unary relation symbol and a fresh individual name . Let be the conjunction of all sentences in and let be the conjunction of all sentences in . Then and are both equivalent to sentences in Horn-FO. By definition is not satisfiable. Thus there exists a Horn-FO sentence using only and symbols in such that and is not satisfiable. Thus:
-
•
;
-
•
.
Replace by in , and . Then
-
•
;
-
•
,
as required. ∎
Applied to Horn- ontologies and concepts we thus always obtain an interpolant in Horn-FO and an interpolant in (since enjoys the CIP (?)).
It would be interesting to find out whether there exists an interpolant in the intersection of Horn-FO and and whether it is possible to give an informative syntactic description of that intersection.