\templatetype{pnasresearcharticle}
\leadauthor{Fagin}
\significancestatement{This work is a step in the direction of explainable AI as it pertains to logical inference in neural networks. This may ultimately assist in preventing unfair, unwarranted, or otherwise undesirable outcomes from the application of modern AI methods.}
\correspondingauthor{\textsuperscript{1}To whom correspondence should be addressed. E-mail: fagin@us.ibm.com}

Foundations of Reasoning with Uncertainty via Real-valued Logics

Ronald Fagin IBM Research Ryan Riegel IBM Research Alexander Gray IBM Research

Abstract

Real-valued logics underlie an increasing number of neuro-symbolic approaches, though typically their logical inference capabilities are characterized only qualitatively. We provide foundations for establishing the correctness and power of such systems. We give a sound and strongly complete axiomatization that can be parametrized to cover essentially every real-valued logic, including all the common fuzzy logics. Our class of sentences is very rich, and each sentence describes a set of possible real values for a collection of formulas of the real-valued logic, including which combinations of real values are possible. Strong completeness allows us to derive exactly what information can be inferred about the combinations of real values of a collection of formulas given information about the combinations of real values of several other collections of formulas. We then extend the axiomatization to deal with weighted subformulas. Finally, we give a decision procedure based on linear programming for deciding, for certain real-valued logics and under certain natural assumptions, whether a set of our sentences logically implies another of our sentences.

Keywords: real-valued logic | strongly complete axiomatization


\dates{This manuscript was compiled on August 10, 2025}


Recent years have seen growing interest in approaches for augmenting the capabilities of learning-based methods with those of reasoning, often broadly referred to as neuro-symbolic (though they may not be strictly neural). One of the key goals that neuro-symbolic approaches have at their root is logical inference, or reasoning. However, the representation of classical 0-1 logic (where truth values of sentences are either 0, representing “False”, or 1, representing “True”) is generally insufficient for this goal because representing uncertainty is essential to AI. In order to merge with the ideas of neural learning, the truth values dealt with must be real-valued (we shall take these to be real numbers in the interval $[0,1]$ , where intuitively, 0 means “completely false”, and 1 means “completely true”), whether the uncertainty semantics are those of probabilities, subjective beliefs, neural network activations, or fuzzy set memberships. For this reason, many major approaches have turned to real-valued logics. Logic tensor networks (1) define a logical language on real-valued vectors corresponding to groundings of terms computed by a neural network, which can use any of the common real-valued logics (e.g., Łukasiewicz, product, or Gödel logic) for its connectives (e.g., $\mathbin{\&}$ , $\veebar$ , $\neg$ , and $\rightarrow$ ). Probabilistic soft logics (2) draw a correspondence of their approach based on Markov random fields (MRFs) with satisfiability of statements in a real-valued logic (Łukasiewicz). Tensorlog (3), also based on MRFs but implemented in neural network frameworks, draws a correspondence of its approach to the use of connectives in a real-valued logic (product). Logical neural networks (LNNs) (4) draw a correspondence between activation functions of neural networks and connectives in real-valued logics. 
To complete a full correspondence between neural networks and statements in real-valued logic, LNN defines a class of real-valued logics allowing weighted inputs, which represent the relative influence of subformulas. Although reasoning is widely regarded as fundamental to the goal of AI, the reasoning capabilities of the aforementioned systems are typically characterized qualitatively rather than quantitatively and mathematically. While learning theory (roughly, what it means to perform learning) is well articulated and, for 0-1 logic, what it means to perform reasoning is well studied, reasoning is surprisingly not well formalized for real-valued logics. As reasoning becomes an increasingly important goal of learning-based work, it becomes important to have a solid mathematical footing for it.

Formalization of the idea of real-valued logics is old and fundamental, going back to the origins of formal logic. It is not well known that Boole himself invented a probabilistic logic in the 19th century (5), where formulas were assigned real values corresponding to probabilities. It was used in AI to model the semantics of vague concepts for commonsense reasoning by expert systems (6). Real-valued logic is used in linguistics to model certain natural language phenomena (7), in hardware design to deal with multiple stable voltage levels (8), and in databases to deal with queries that are composed of multiple graded notions, such as the redness of an object, that can range from 0 (“not at all red”) to 1 (“completely red”) (9). Despite all this, while definitions of logical correctness and power (generally, soundness and completeness) are well established and corresponding procedures for theorem proving having those properties are abundant for classical logics, the equivalents for real-valued logics (where the values can take arbitrary values between 0 and 1) are rather limited.

This paper.

In this paper, there are two levels of logic. In the “inner” layer, we have formulas of the real-valued logic with its logical connectives. In this inner layer, we shall use $\mathbin{\&}$ for “and” and $\veebar$ for “or”, as is done in (10). In the “outer” layer, we have a novel class of sentences about the inner real-valued logic (such as saying which truth values a given real-valued formula may attain). For these sentences (which take on only the classical values 0 and 1 for False and True, respectively), we make use of the traditional logical symbols $\land$ for “and” and $\lor$ for “or”. We remark that, somewhat confusingly, the symbols $\land$ and $\lor$ are often used in real-valued logics for weaker versions of “and” and “or” than those given by $\mathbin{\&}$ and $\veebar$; we need not discuss these weaker versions in this paper.

Let us say that an axiomatization of a logic is strongly complete if whenever $\Gamma$ is a finite set of sentences in the (outer) logic and $\gamma$ is a single sentence in the (outer) logic that is a logical consequence of $\Gamma$ , then there is a proof of $\gamma$ from $\Gamma$ using the axiomatization. An axiomatization is weakly complete if this holds for $\Gamma=\emptyset$ . That is, an axiomatization is weakly complete if whenever $\gamma$ is a valid sentence (always true), then there is a proof of $\gamma$ using the axiomatization. Early axiomatizations of real-valued logics in the literature were typically weakly complete, but now have been improved to strongly complete (see (10) for examples).

We now explain why it is necessary to assume that $\Gamma$ is finite in the definition of strong completeness. (In our explanation, we make use of ideas from (10).) Let us restrict to Łukasiewicz logic. Let $A^{k}$ denote $A\mathbin{\&}A\mathbin{\&}\cdots\mathbin{\&}A$, where $A$ appears $k$ times. Let $\Gamma$ be the infinite set of sentences $(B\rightarrow A^{k};\{1\})$ for $k\geq 1$, along with $(A;[0,1))$, which says that the value of $A$ is less than 1. Let $\tau$ be $(B;\{0\})$. We now show that $\Gamma$ logically implies $\tau$. Assume that $\Gamma$ holds. Then the value $a$ of $A$ is less than 1. It follows from the definition of conjunction in Łukasiewicz logic that the value of $A^{k}$ is $\max\{0,1-k(1-a)\}$, so there is $k$ such that $A^{k}$ has value 0 (any $k\geq 1/(1-a)$ will do). From $(B\rightarrow A^{k};\{1\})$ this then implies that the value of $B$ is 0, so $\tau$ holds. Hence, $\Gamma$ logically implies $\tau$. Because our proofs are of finite length, there cannot be a proof of $\tau$ from $\Gamma$, since this would give a proof of $\tau$ from a finite subset of $\Gamma$, but no finite subset of $\Gamma$ logically implies $\tau$. A natural open problem is whether we can allow $\Gamma$ to be infinite if we were to restrict our attention to Gödel logic.
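The step “there is $k$ such that $A^{k}$ has value 0” can be checked numerically. The following is a minimal sketch, not part of our formal development: the function names are illustrative, and the use of exact rational arithmetic (Python's `Fraction`) is an assumption made only to avoid floating-point rounding.

```python
from fractions import Fraction

def luk_and(a, b):
    """Lukasiewicz (strong) conjunction: max{0, a + b - 1}."""
    return max(Fraction(0), a + b - 1)

def power_value(a, k):
    """Value of A^k = A & A & ... & A (k copies) when A has value a."""
    value = a
    for _ in range(k - 1):
        value = luk_and(value, a)
    return value

def first_zero_power(a):
    """Smallest k >= 1 such that A^k has value 0; requires a < 1."""
    if a >= 1:
        raise ValueError("A^k never reaches 0 when the value of A is 1")
    k = 1
    while power_value(a, k) > 0:
        k += 1
    return k

a = Fraction(9, 10)
print([power_value(a, k) for k in (1, 2, 5, 10)])
print(first_zero_power(a))
```

The same functions illustrate why no finite subset of $\Gamma$ suffices: if $K$ is the largest exponent appearing in the subset, then taking the value of $A$ to be $1-1/(K+1)$ keeps `power_value(a, k)` positive for every $k\leq K$, so $B$ can take a positive value.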

We introduce a rich, novel class of sentences.

1.

These sentences can say what the set $S$ of possible values is for a formula $\sigma$ . This set $S$ can be a singleton $\{s\}$ (meaning that the real value of $\sigma$ is $s$ ), or $S$ can be an interval, or a union of intervals, or in fact an arbitrary subset of $[0,1]$ , e.g. the set of rational numbers in $[0,1]$ .
2.

Our sentences can give not only the possible real values of formulas, but the interactions between these values. For example, if $\sigma_{1}$ and $\sigma_{2}$ are formulas, our sentences can not only say what the possible real values are for each of $\sigma_{1}$ and $\sigma_{2}$ , but also how they interact: thus, if $s_{1}$ is the real value of $\sigma_{1}$ and $s_{2}$ is the real value of $\sigma_{2}$ , then there is a sentence in our logic that says $(s_{1},s_{2})$ must lie in the set $S$ of ordered pairs, where $S$ is an arbitrary subset of $[0,1]\times[0,1]$ . We give a sound and strongly complete axiomatization for our sentences.
3.

Unlike the other axiomatizations mentioned earlier, our axiomatization can be extended to include the use of weights for subformulas (where, for example, in the formula $A_{1}\veebar A_{2}$, the subformula $A_{1}$ is considered twice as important as the subformula $A_{2}$).
4.

A surprising feature of our axiomatization is that it is parametrized, so that this one axiomatization is sound and strongly complete for essentially every real-valued logic, including those that do not obey the standard restrictions on fuzzy logics (such as conjunction being commutative). Previous axiomatizations in the literature had a separate set of axioms for each real-valued logic (for example, one of the axioms for Łukasiewicz logic is $A\leftrightarrow\neg\neg A$, and one of the axioms for Gödel logic is $A\leftrightarrow(A\mathbin{\&}A)$). In the axiomatizations mentioned earlier, each connective has a fixed associated function that tells how to evaluate it. For example, in Łukasiewicz logic, the value of $A_{1}\mathbin{\&}A_{2}$ is $f_{\mathbin{\&}}(a_{1},a_{2})$, where $a_{1}$ is the value of $A_{1}$ and $a_{2}$ is the value of $A_{2}$, and where $f_{\mathbin{\&}}(a_{1},a_{2})=\max\{0,a_{1}+a_{2}-1\}$. By contrast, for our axiomatization, $f_{\mathbin{\&}}$ is arbitrary, as long as it maps $[0,1]^{2}$ into $[0,1]$.

From now on (except in Section 8 on related work) we use “complete” to mean “strongly complete”.

An especially useful real-valued logic for logical neural nets is Łukasiewicz logic, for several reasons. First, the $\mathbin{\&}$, $\veebar$, $\neg$, and $\rightarrow$ operators are essentially linear, in that if $a_{1}$ is the truth value of a formula $A_{1}$, and $a_{2}$ is the truth value of a formula $A_{2}$, then (a) $A_{1}\mathbin{\&}A_{2}$ has value $\max\{0,a_{1}+a_{2}-1\}$, (b) $A_{1}\veebar A_{2}$ has value $\min\{1,a_{1}+a_{2}\}$, (c) $\neg A_{1}$ has value $1-a_{1}$, and (d) $A_{1}\rightarrow A_{2}$ is equivalent to $\neg A_{1}\veebar A_{2}$, and so has value $\min\{1,1-a_{1}+a_{2}\}$.\footnote{The versions of $\mathbin{\&}$ and $\veebar$ we describe here are sometimes called in the literature strong conjunction and strong disjunction. Weak conjunction is given by $\min\{a_{1},a_{2}\}$ and weak disjunction is given by $\max\{a_{1},a_{2}\}$.} Second, it is easy to incorporate weights. Thus, if $w_{1}$ and $w_{2}$ are nonnegative weights of $A_{1}$ and $A_{2}$, respectively, then we can take the weighted value of $A_{1}\veebar A_{2}$ to be $\min\{1,w_{1}a_{1}+w_{2}a_{2}\}$.
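The connectives (a)–(d) and the weighted disjunction can be written out directly. This is a minimal sketch with illustrative function names of our choosing; it is not the implementation of any of the systems cited above.

```python
from fractions import Fraction

def luk_and(a1, a2):
    """(a) Strong conjunction: max{0, a1 + a2 - 1}."""
    return max(0, a1 + a2 - 1)

def luk_or(a1, a2):
    """(b) Strong disjunction: min{1, a1 + a2}."""
    return min(1, a1 + a2)

def luk_not(a1):
    """(c) Negation: 1 - a1."""
    return 1 - a1

def luk_implies(a1, a2):
    """(d) Implication: min{1, 1 - a1 + a2}."""
    return min(1, 1 - a1 + a2)

def weighted_luk_or(w1, a1, w2, a2):
    """Weighted disjunction with nonnegative weights: min{1, w1*a1 + w2*a2}."""
    return min(1, w1 * a1 + w2 * a2)

a1, a2 = Fraction(7, 10), Fraction(6, 10)
print(luk_and(a1, a2), luk_or(a1, a2), luk_not(a1), luk_implies(a1, a2))
# A1 weighted twice as heavily as A2 (hypothetical weights).
print(weighted_luk_or(2, Fraction(1, 5), 1, Fraction(3, 10)))
```

Each function is piecewise linear in its arguments, which is the property the linear-programming-based decision procedure relies on.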

Throughout this paper, we take the domain of each function in the real-valued logic to be $[0,1]$ or $[0,1]\times[0,1]$ and the range to be $[0,1]$ . This is a common assumption for many real-valued logics, but all of our results go through with obvious modifications if the domains are $D^{k}$ for possibly multiple choices of arity $k$ and range $D$ , for arbitrary subsets $D$ of the reals. We note that real-valued logic can be viewed as a special case of multi-valued logic (11), although in multi-valued logic there is typically a finite set of truth values, not necessarily linearly ordered.

We also provide a decision procedure for deciding whether a set of our sentences logically implies another of our sentences, for certain real-valued logics, under certain natural assumptions. We implemented the decision algorithm, dubbed SoCRAtic logic (for Sound and Complete Real-valued Axiomatic logic), which we describe in detail and make available in source code.

Our sentences allow arbitrary real-valued logics, as does our sound and complete axiomatization, but our decision procedure depends heavily on the choice of real-valued logic, and in particular is tailored towards Łukasiewicz and Gödel logic. This is because a key portion of our decision procedure is linear programming, and we depend on the essentially linear nature of Łukasiewicz logic and the ease of dealing with min and max in Gödel logic.

Overview.

Until the final section, we do not allow weights. In Section 1, we give our basic notions, including what a model is and what a sentence is. In particular, we define our sentences to be of the form $(\sigma_{1},\ldots,\sigma_{k};S)$ where the $\sigma_{i}$ are formulas, where $S$ is a set of tuples $(s_{1},\ldots,s_{k})$, and where the sentence says that if the value of each $\sigma_{i}$ is $s_{i}$, for $1\leq i\leq k$, then $(s_{1},\ldots,s_{k})\in S$. In Section 2, we give our (only) axiom and our inference rules. In Section 3, we give our soundness and completeness theorem. In Section 4, we give a theorem that says that our sentences are closed under Boolean combinations. This helps show robustness of our class of sentences. In Section 5, we discuss possible simplifications of our sentences. In Section 6, we give the decision algorithm. In Section 7, we show how to extend our methodology to incorporate weights. In Section 8, we discuss related work. In the Conclusions, we review the implications for neuro-symbolic approaches.

1 Models and sentences

We assume a finite set of atomic propositions. These can be thought of as the leaves of a neural net, i.e., nodes with no inputs from other neurons. A model $M$ is an assignment $g^{M}$ of values to the atomic propositions. Thus, $M$ assigns a value $g^{M}(A)\in[0,1]$ to each atomic proposition $A$ .

Let $F$ be the set of logical formulas over the atomic propositions, where we allow arbitrary finite sets of binary and unary connectives. Typical binary connectives are conjunction (denoted by $\mathbin{\&}$ ), disjunction (denoted by $\veebar$ ), and implication (denoted by $\rightarrow$ ). Typical unary connectives are negation (denoted by $\neg$ ) and a delta function (denoted by $\bigtriangleup$ ). Sometimes $\neg x$ is taken to be $1-x$ , and $\bigtriangleup x$ is taken to be defined by $\bigtriangleup x=1$ if $x=1$ and 0 otherwise.

When considering only formulas with value 1, as most other works do when giving sound axiomatizations of real-valued logics, the convention is to consider a sentence to be simply a member of $F$ . What if we want to take into account values other than 1?

We take a sentence to be an expression of the form $(\sigma_{1},\ldots,\sigma_{k};S)$, where $\sigma_{1},\ldots,\sigma_{k}$ are in $F$, and where $S\subseteq[0,1]^{k}$. The intuition is that $(\sigma_{1},\ldots,\sigma_{k};S)$ says that if the value of each $\sigma_{i}$ is $s_{i}$, for $1\leq i\leq k$, then $(s_{1},\ldots,s_{k})\in S$. We refer to our sentences as multi-dimensional sentences, or for short MD-sentences. For a fixed $k$, we refer to the MD-sentence $(\sigma_{1},\ldots,\sigma_{k};S)$ as $k$-dimensional. The class of MD-sentences is robust. In particular, Theorem 4.3 says that MD-sentences are closed under Boolean combinations. We give a sound and (strongly) complete axiomatization that is parametrized to deal with an arbitrary fixed real-valued logic. This axiomatization allows us to derive exactly what information can be inferred about the combinations of values of a collection of formulas given information about the combinations of values of several other collections of formulas.

Note that we are not saying that the logic is multi-dimensional (which could mean that the values taken on by variables are vectors, not just numbers), but instead we are saying that the sentences in our “outer” logic are multi-dimensional. The “inner” logic we work with in this paper is real-valued, and real-valued logic has been heavily studied. What is novel in our paper are our multi-dimensional sentences.

Note that the set $S$ in $(\sigma_{1},\ldots,\sigma_{k};S)$ can be undecidable, even if $k=1$ and every member of $S$ is a rational number. For example, we could take $S$ to be the set of all numbers $1/n$, where $n$ is the Gödel number of a halting Turing machine. But our decision procedures involve only special sets $S$. Thus, we shall say in Section 6 that a sentence $(\sigma_{1},\ldots,\sigma_{k};S)$ is interval-based if $S$ is of the form $S_{1}\times\cdots\times S_{k}$, where each $S_{i}$ is the union of a finite number of intervals with rational endpoints. Our decision procedure in that section deals with interval-based sentences. However, our sound and complete axiomatization in Section 3 makes no such assumptions about the sets $S$; in particular, the sets $S$ can be undecidable.

For convenience, we assume throughout that in the sentence $(\sigma_{1},\ldots,\sigma_{k};S)$ , we have that $\sigma_{i}$ and $\sigma_{j}$ are different formulas if $i\neq j$ . We refer to $\sigma_{1},\ldots,\sigma_{k}$ as the components of $(\sigma_{1},\ldots,\sigma_{k};S)$ , and $S$ as the information set of $(\sigma_{1},\ldots,\sigma_{k};S)$ .

Let $\gamma$ be the sentence $(\sigma_{1},\ldots,\sigma_{k};S)$ . We now say what it means for a model $M$ to satisfy $\gamma$ . For $1\leq i\leq k$ , let $s_{i}$ be the value of the formula $\sigma_{i}$ under the assignment of values to the atomic propositions given by the model $M$ . We say that $M$ satisfies $\gamma$ if $(s_{1},\ldots,s_{k})\in S$ . We then say that $M$ is a model of $\gamma$ , and we write $M\vDash\gamma$ . Note that if $(\sigma_{1},\ldots,\sigma_{k};S)$ is satisfiable, that is, has a model, then $S\neq\emptyset$ .
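The definition of satisfaction can be sketched in a few lines. The encoding below is hypothetical and chosen only for illustration: an atomic proposition is a string, a compound formula is a tuple whose first entry names the connective, a model is a dictionary playing the role of $g^{M}$, and the information set $S$ is given by a membership test, since $S$ may be an arbitrary subset of $[0,1]^{k}$. The connective table uses Łukasiewicz logic, but any table of functions into $[0,1]$ could be substituted.

```python
from fractions import Fraction

# Hypothetical encoding (an assumption of this sketch): atoms are strings,
# compound formulas are tuples (connective, subformula, ...).
LUKASIEWICZ = {
    "&": lambda a, b: max(0, a + b - 1),
    "or": lambda a, b: min(1, a + b),
    "not": lambda a: 1 - a,
}

def evaluate(formula, model, connectives):
    """Value of a formula under the assignment given by the model."""
    if isinstance(formula, str):
        return model[formula]
    op, *args = formula
    return connectives[op](*(evaluate(arg, model, connectives) for arg in args))

def satisfies(model, sentence, connectives):
    """True iff the tuple of component values lies in the information set."""
    components, in_S = sentence
    values = tuple(evaluate(c, model, connectives) for c in components)
    return in_S(values)

half, three_quarters = Fraction(1, 2), Fraction(3, 4)
# The 2-dimensional sentence (A, A or B; S) with
# S = {(s1, s2) : s1 <= 1/2 and s2 >= 3/4}.
gamma = (("A", ("or", "A", "B")),
         lambda s: s[0] <= half and s[1] >= three_quarters)

M1 = {"A": Fraction(1, 4), "B": Fraction(1, 2)}
M2 = {"A": Fraction(3, 4), "B": Fraction(0)}
print(satisfies(M1, gamma, LUKASIEWICZ), satisfies(M2, gamma, LUKASIEWICZ))
```

Here `M1` is a model of the sentence and `M2` is not, because in `M2` the value of $A$ exceeds $1/2$.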

2 Axioms and inference rules

We have only one axiom:

\begin{equation}
(\sigma;[0,1]) \tag{1}
\end{equation}

Axiom (1) guarantees that all values are in $[0,1]$.

We now give our inference rules.

If $\pi$ is a permutation of $1,\ldots,k$ , then:

\begin{equation}
\mbox{From }(\sigma_{1},\ldots,\sigma_{k};S)\mbox{ infer }(\sigma_{\pi(1)},\ldots,\sigma_{\pi(k)};S^{\prime}) \tag{2}
\end{equation}

where $S^{\prime}=\{(s_{\pi(1)},\ldots,s_{\pi(k)})\colon(s_{1},\ldots,s_{k})\in S\}$ .

Rule (2) simply permutes the order of the components.

Our next inference rule is:

\begin{equation}
\mbox{From }(\sigma_{1},\ldots,\sigma_{k};S)\mbox{ infer }(\sigma_{1},\ldots,\sigma_{k},\sigma_{k+1},\ldots,\sigma_{m};S\times[0,1]^{m-k}). \tag{3}
\end{equation}

Rule (3) extends $(\sigma_{1},\ldots,\sigma_{k};S)$ to include $\sigma_{k+1},\ldots,\sigma_{m}$ with no nontrivial information being given about the new components.

Our next inference rule is:

\begin{equation}
\mbox{From }(\sigma_{1},\ldots,\sigma_{k};S_{1})\mbox{ and }(\sigma_{1},\ldots,\sigma_{k};S_{2})\mbox{ infer }(\sigma_{1},\ldots,\sigma_{k};S_{1}\cap S_{2}). \tag{4}
\end{equation}

Rule (4) enables us to join the information in $(\sigma_{1},\ldots,\sigma_{k};S_{1})$ and $(\sigma_{1},\ldots,\sigma_{k};S_{2})$ .

Our next inference rule is the following (where $0<r<k$ ):

\begin{equation}
\mbox{From }(\sigma_{1},\ldots,\sigma_{k};S)\mbox{ infer }(\sigma_{1},\ldots,\sigma_{k-r};S^{\prime}) \tag{5}
\end{equation}

where $S^{\prime}=\{(s_{1},\ldots,s_{k-r})\colon(s_{1},\ldots,s_{k})\in S\}$ .

Intuitively, $S^{\prime}$ is the projection of $S$ onto the first $k-r$ components. Rule (5) enables us to select information about $\sigma_{1},\ldots,\sigma_{k-r}$ from information about $\sigma_{1},\ldots,\sigma_{k}$ .

Our next inference rule is:

\begin{equation}
\mbox{From }(\sigma_{1},\ldots,\sigma_{k};S)\mbox{ infer }(\sigma_{1},\ldots,\sigma_{k};S^{\prime})\mbox{ if }S\subseteq S^{\prime}. \tag{6}
\end{equation}

Rule (6) says that we can go from more information to less information. The intuition is that smaller information sets are more informative.
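Rules (2) through (6) are purely set-theoretic operations on information sets, which the following minimal sketch makes explicit. To keep every set finite and enumerable, the sketch assumes a finite set of truth values $D=\{0,1/2,1\}$ in place of $[0,1]$; this finite domain, the 0-based indexing of permutations, and all function names are assumptions of the sketch only.

```python
from fractions import Fraction
from itertools import product

# Assumed finite truth-value set standing in for [0,1].
D = (Fraction(0), Fraction(1, 2), Fraction(1))

def rule2_permute(components, S, pi):
    """Rule (2): pi is a tuple of 0-based indices; slot j receives component pi[j]."""
    return (tuple(components[i] for i in pi),
            {tuple(s[i] for i in pi) for s in S})

def rule3_extend(components, S, extra):
    """Rule (3): add components with no information about them (S x D^(m-k))."""
    return (tuple(components) + tuple(extra),
            {s + t for s in S for t in product(D, repeat=len(extra))})

def rule4_intersect(components, S1, S2):
    """Rule (4): join two sentences with the same components."""
    return components, S1 & S2

def rule5_project(components, S, r):
    """Rule (5): project onto the first k - r components, with 0 < r < k."""
    k = len(components)
    if not 0 < r < k:
        raise ValueError("Rule (5) requires 0 < r < k")
    return components[:k - r], {s[:k - r] for s in S}

def rule6_weaken(components, S, S_prime):
    """Rule (6): replace S by any superset S'."""
    if not S <= S_prime:
        raise ValueError("Rule (6) requires S to be a subset of S'")
    return components, S_prime

components = ("A", "B")
S = {(Fraction(0), Fraction(1)), (Fraction(1, 2), Fraction(1, 2))}
extended = rule3_extend(components, S, ("C",))
print(len(extended[1]))                # 2 tuples of S times 3 values for C
print(rule5_project(*extended, 1) == (components, S))
```

Extending by Rule (3) and then projecting by Rule (5) returns the original sentence, which reflects the fact that Rule (3) adds no nontrivial information.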

We now give an inference rule that depends on the real-valued logic under consideration. For each real-valued binary connective $\alpha$ , let $f_{\alpha}(s_{1},s_{2})$ be the value of $\sigma_{1}\mathop{\alpha}\sigma_{2}$ when the value of $\sigma_{1}$ is $s_{1}$ and the value of $\sigma_{2}$ is $s_{2}$ . For example, in Gödel logic, $f_{\mathbin{\&}}(s_{1},s_{2})=\min\{s_{1},s_{2}\}$ . For each real-valued unary connective $\rho$ , let $f_{\rho}(s)$ be the value of $\rho\sigma$ when the value of $\sigma$ is $s$ . For example, in Łukasiewicz logic, $f_{\neg}(s)=1-s$ .

In the sentence $(\sigma_{1},\ldots,\sigma_{k};S)$ , let us say that the tuple $(s_{1},\ldots,s_{k})$ in $S$ is good if (a) $s_{m}=f_{\alpha}(s_{i},s_{j})$ whenever $\sigma_{m}$ is $\sigma_{i}\mathop{\alpha}\sigma_{j}$ and $\alpha$ is a binary connective (such as $\mathbin{\&}$ ), and (b) $s_{j}=f_{\rho}(s_{i})$ whenever $\sigma_{j}$ is $\rho\sigma_{i}$ and $\rho$ is a unary connective (such as $\neg$ ). Note that being “good” is a local property of a tuple $s$ in $S$ (that is, it depends only on the tuple $s$ and not on the other tuples in $S$ ). Of course, if the real-valued logic under consideration has higher-order connectives (ternary, etc.), then we would modify the definition of a good tuple in the obvious way. For simplicity, we will assume throughout this paper that we are in the common case where the only connectives of the real-valued logic are unary and binary, although all of our results go through in the general case.

We then have the following inference rule:

\begin{equation}
\mbox{From }(\sigma_{1},\ldots,\sigma_{k};S)\mbox{ infer }(\sigma_{1},\ldots,\sigma_{k};S^{\prime}) \tag{7}
\end{equation}

when $S^{\prime}$ is the set of good tuples of $S$ .

Rule (7) is our key rule of inference. Let $\gamma_{1}$ be the premise $(\sigma_{1},\ldots,\sigma_{k};S)$ and let $\gamma_{2}$ be the conclusion $(\sigma_{1},\ldots,\sigma_{k};S^{\prime})$ of Rule (7). As we shall discuss later, $\gamma_{1}$ and $\gamma_{2}$ are logically equivalent, and $S^{\prime}$ is as small as possible so that $\gamma_{1}$ and $\gamma_{2}$ are logically equivalent.

A simple example of a valid sentence is $(A,B,A\veebar B;S)$ where $S=\{(s_{1},s_{2},s_{3})\colon\ s_{1}\in[0,1],s_{2}\in[0,1],s_{3}=f_{\veebar}(s_{1},s_{2})\}$. This is derived from the valid sentence $(A,B,A\veebar B;[0,1]^{3})$ by applying Rule (7).
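Rule (7) and the example above can be sketched as a filter that keeps only the good tuples. As before, the sketch assumes a finite truth-value set $D=\{0,1/2,1\}$ so that the information set can be enumerated, and it encodes formulas as strings and tuples; both are illustrative assumptions. Passing a different connective table is all that is needed to change the real-valued logic, which is the sense in which the rule is parametrized.

```python
from fractions import Fraction
from itertools import product

D = (Fraction(0), Fraction(1, 2), Fraction(1))  # assumed finite stand-in for [0,1]

LUKASIEWICZ = {"&": lambda a, b: max(0, a + b - 1),
               "or": lambda a, b: min(1, a + b),
               "not": lambda a: 1 - a}
GODEL = {"&": min, "or": max}

def is_good(components, s, connectives):
    """A tuple is good if each compound component whose immediate
    subformulas are also components carries the value its connective dictates."""
    index = {formula: i for i, formula in enumerate(components)}
    for m, formula in enumerate(components):
        if isinstance(formula, str):
            continue  # atomic propositions impose no constraint
        op, *args = formula
        if all(arg in index for arg in args):
            expected = connectives[op](*(s[index[arg]] for arg in args))
            if s[m] != expected:
                return False
    return True

def rule7(components, S, connectives):
    """Rule (7): keep exactly the good tuples of S."""
    return components, {s for s in S if is_good(components, s, connectives)}

components = ("A", "B", ("or", "A", "B"))
everything = set(product(D, repeat=3))          # the analogue of [0,1]^3
_, S_luk = rule7(components, everything, LUKASIEWICZ)
_, S_godel = rule7(components, everything, GODEL)

half = Fraction(1, 2)
print(len(S_luk), (half, half, Fraction(1)) in S_luk)
print((half, half, half) in S_godel, (half, half, half) in S_luk)
```

Over this three-element domain, each choice of values for $A$ and $B$ determines the third coordinate, so each resulting information set has $3\times 3$ tuples.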

Each of our rules is of the form “From A infer B” or “From A infer B where …”. We refer to A as the premise and B as the conclusion. We need the notion of a subformula of a formula. If $\alpha$ is a binary connective, then the subformulas of $\sigma_{1}\mathop{\alpha}\sigma_{2}$ are $\sigma_{1}$ and $\sigma_{2}$ . If $\rho$ is a unary connective, then the subformula of $\rho\sigma$ is $\sigma$ .

Let $\Gamma$ be a set of MD-sentences. We define the closure $G$ of $\Gamma$ under subformulas as follows. For each sentence $(\gamma_{1},\ldots,\gamma_{m};S)$ in $\Gamma$ , the set $G$ contains $\gamma_{1},\ldots,\gamma_{m}$ , and for each formula $\gamma$ in $G$ , the set $G$ contains every subformula of $\gamma$ .

In particular, $G$ contains every atomic proposition that appears inside the components of $\Gamma$ .

3 Soundness and completeness

Let $\Gamma$ be a finite set of MD-sentences, and let $\gamma$ be a single MD-sentence. We write $\Gamma\vDash\gamma$ if every model of $\Gamma$ is a model of $\gamma$ . We write $\Gamma\vdash\gamma$ if there is a proof of $\gamma$ from $\Gamma$ , using our axiom system. Soundness says “ $\Gamma\vdash\gamma$ implies $\Gamma\vDash\gamma$ ”. Completeness says “ $\Gamma\vDash\gamma$ implies $\Gamma\vdash\gamma$ ”. In this section, we shall prove that our axiom system is sound and complete for MD-sentences.

We define a special property of certain MD-sentences, that is used in a crucial manner in our completeness proof. Let us say that a sentence $(\sigma_{1},\ldots,\sigma_{k};S)$ is minimized if whenever $(s_{1},\ldots,s_{k})\in S$ , then there is a model $M$ of $(\sigma_{1},\ldots,\sigma_{k};S)$ such that for $1\leq i\leq k$ , the value of $\sigma_{i}$ in $M$ is $s_{i}$ . Thus, $(s_{1},\ldots,s_{k})\in S$ if and only if there is a model $M$ of $(\sigma_{1},\ldots,\sigma_{k};S)$ such that for $1\leq i\leq k$ , the value of $\sigma_{i}$ in $M$ is $s_{i}$ . We use the word “minimized”, since intuitively, $S$ is as small as possible.

Our proof of completeness makes use of the following lemmas.

Lemma 3.1.

Let $(\sigma_{1},\ldots,\sigma_{k};S)$ be the premise of Rule (7). Assume that $G=\{\sigma_{1},\ldots,\sigma_{k}\}$ is closed under subformulas (so that in particular, every atomic proposition that appears in $G$ is a member of $G$ ). Then the conclusion $(\sigma_{1},\ldots,\sigma_{k};S^{\prime})$ of Rule (7) is minimized.

Proof 3.2.

Let $\varphi$ be the conclusion $(\sigma_{1},\ldots,\sigma_{k};S^{\prime})$ of Rule (7). Assume that $(s_{1},\ldots,s_{k})\in S^{\prime}$. To prove that $\varphi$ is minimized, we must show that there is a model $M$ of $\varphi$ such that for $1\leq i\leq k$, the value of $\sigma_{i}$ in $M$ is $s_{i}$. Since $G$ is closed under subformulas, every atomic proposition that appears in $G$ is one of the $\sigma_{i}$, so a portion of $(s_{1},\ldots,s_{k})$ specifies a value for each such atomic proposition. We obtain our model $M$ from this assignment (atomic propositions that do not appear in $G$ may be assigned arbitrary values). Because every tuple in $S^{\prime}$ is good, a simple induction on the structure of formulas shows that the value of each $\sigma_{i}$ in $M$ is exactly that specified by $(s_{1},\ldots,s_{k})$. Since $(s_{1},\ldots,s_{k})\in S^{\prime}$, it follows that $M$ is a model of $\varphi$. Hence, $\varphi$ is minimized.
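Lemma 3.1 can be checked by brute force on small instances. The sketch below again assumes a finite truth-value set $D=\{0,1/2,1\}$ and a tuple encoding of formulas (both illustrative assumptions), enumerates all models, and tests whether every tuple in the information set is attained by some model. It also shows that the hypothesis of closure under subformulas matters.

```python
from fractions import Fraction
from itertools import product

D = (Fraction(0), Fraction(1, 2), Fraction(1))  # assumed finite stand-in for [0,1]
CONNECTIVES = {"&": lambda a, b: max(0, a + b - 1), "not": lambda a: 1 - a}

def evaluate(formula, model):
    if isinstance(formula, str):
        return model[formula]
    op, *args = formula
    return CONNECTIVES[op](*(evaluate(arg, model) for arg in args))

def atoms_of(formula):
    if isinstance(formula, str):
        return {formula}
    return set().union(*(atoms_of(arg) for arg in formula[1:]))

def realizable(components):
    """All value tuples that some model assigns to the components."""
    atoms = sorted(set().union(*(atoms_of(c) for c in components)))
    return {tuple(evaluate(c, dict(zip(atoms, values))) for c in components)
            for values in product(D, repeat=len(atoms))}

def is_minimized(components, S):
    """Every tuple of S is attained by a model (which then satisfies the sentence)."""
    return S <= realizable(components)

def is_good(components, s):
    index = {f: i for i, f in enumerate(components)}
    for m, f in enumerate(components):
        if isinstance(f, str):
            continue
        args = f[1:]
        if all(arg in index for arg in args):
            if s[m] != CONNECTIVES[f[0]](*(s[index[arg]] for arg in args)):
                return False
    return True

def good_tuples(components):
    """Conclusion of Rule (7) applied to the full information set D^k."""
    return {s for s in product(D, repeat=len(components)) if is_good(components, s)}

not_A = ("not", "A")
closed = ("A", not_A, ("&", "A", not_A))      # closed under subformulas
not_closed = (not_A, ("&", "A", not_A))       # the atom A is missing

print(is_minimized(closed, good_tuples(closed)))
print(is_minimized(not_closed, good_tuples(not_closed)))
```

For the closed set of components, the good tuples are exactly the realizable ones. For the second set, no component has all of its immediate subformulas among the components, so every tuple counts as good and the conclusion of Rule (7) is not minimized.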

Lemma 3.3.

For each of Rules (2), (3), (4), and (7), the premise and the conclusion are logically equivalent.

Proof 3.4.

The equivalence of the premise and conclusion of Rule (2) is clear. For Rules (3), (4), and (7), the fact that the premise logically implies the conclusion follows from soundness of the rules, which we shall show shortly. We now show that for Rules (3), (4), and (7), the conclusion logically implies the premise. For Rule (3), we see that if $(s_{1},\ldots,s_{m})\in S\times[0,1]^{m-k}$ , then $(s_{1},\ldots,s_{k})\in S$ . Hence, the conclusion of Rule (3) logically implies the premise of Rule (3). For Rules (4) and (7), the conclusion logically implies the premise because of the soundness of Rule (6).

Lemma 3.5.

Minimization is preserved by Rules (2) and (4), in the following sense.

1.

If the premise of Rule (2) is minimized, then so is the conclusion.
2.

If the premises $(\sigma_{1},\ldots,\sigma_{k};S_{1})$ and $(\sigma_{1},\ldots,\sigma_{k};S_{2})$ of Rule (4) are minimized, then so is the conclusion $(\sigma_{1},\ldots,\sigma_{k};S_{1}\cap S_{2})$ .

Proof 3.6.

Part (1) is immediate, since the premise and conclusion have exactly the same information.

For part (2), assume that $(\sigma_{1},\ldots,\sigma_{k};S_{1})$ and $(\sigma_{1},\ldots,\sigma_{k};S_{2})$ are minimized. To show that $(\sigma_{1},\ldots,\sigma_{k};S_{1}\cap S_{2})$ is minimized, we must show that if $(s_{1},\ldots,s_{k})\in S_{1}\cap S_{2}$, then there is a model $M$ of $(\sigma_{1},\ldots,\sigma_{k};S_{1}\cap S_{2})$ such that for $1\leq i\leq k$, the value of $\sigma_{i}$ in $M$ is $s_{i}$. Assume that $(s_{1},\ldots,s_{k})\in S_{1}\cap S_{2}$. Hence, $(s_{1},\ldots,s_{k})\in S_{1}$. Since $(\sigma_{1},\ldots,\sigma_{k};S_{1})$ is minimized, there is a model $M$ such that for $1\leq i\leq k$, the value of $\sigma_{i}$ in $M$ is $s_{i}$. Since $(s_{1},\ldots,s_{k})\in S_{1}\cap S_{2}$, this $M$ is also a model of $(\sigma_{1},\ldots,\sigma_{k};S_{1}\cap S_{2})$, and so $M$ is the desired model.

Theorem 3.7.

Our axiom system is sound and complete for MD-sentences.

Proof 3.8.

We begin by proving soundness. We say that an axiom is sound if it is true in every model. We say that an inference rule is sound if every model that satisfies the premise also satisfies the conclusion. To prove soundness of our axiom system, it is sufficient to show that our axiom is sound and that each of our rules is sound.

Axiom (1) is sound, since every real-valued logic formula has a value in $[0,1]$ .

Rule (2) is sound, since the premise and conclusion encode exactly the same information.

Rule (3) is sound for the following reason. Let $M$ be a model, and let $s_{1},\ldots,s_{m}$ be the values of $\sigma_{1},\ldots,\sigma_{m}$, respectively, in $M$. If $M$ satisfies the premise, then $(s_{1},\ldots,s_{k})\in S$. This implies that $(s_{1},\ldots,s_{m})\in S\times[0,1]^{m-k}$, and so $M$ satisfies the conclusion.

Rule (4) is sound for the following reason. Let $M$ be a model, and let $s_{1},\ldots,s_{k}$ be the values of $\sigma_{1},\ldots,\sigma_{k}$ , respectively, in $M$ . If $M$ satisfies the premise, then $(s_{1},\ldots,s_{k})\in S_{1}$ and $(s_{1},\ldots,s_{k})\in S_{2}$ . Therefore, $(s_{1},\ldots,s_{k})\in S_{1}\cap S_{2}$ , and so $M$ satisfies the conclusion.

Rule (5) is sound for the following reason. Let $M$ be a model, and let $s_{1},\ldots,s_{k}$ be the values of $\sigma_{1},\ldots,\sigma_{k}$ , respectively, in $M$ . If $M$ satisfies the premise, then $(s_{1},\ldots,s_{k})\in S$ . Therefore $(s_{1},\ldots,s_{k-r})\in S^{\prime}$ , and so $M$ satisfies the conclusion.

Rule (6) is sound for the following reason. Let $M$ be a model, and let $s_{1},\ldots,s_{k}$ be the values of $\sigma_{1},\ldots,\sigma_{k}$ , respectively, in $M$ . If $M$ satisfies the premise, then $(s_{1},\ldots,s_{k})\in S$ . Therefore, $(s_{1},\ldots,s_{k})\in S^{\prime}$ , and so $M$ satisfies the conclusion.

Rule (7) is sound for the following reason. Let $M$ be a model, and let $s_{1},\ldots,s_{k}$ be the values of $\sigma_{1},\ldots,\sigma_{k}$ , respectively, in $M$ . If $M$ satisfies the premise, then $(s_{1},\ldots,s_{k})\in S$ . In our real-valued logic, we have that (a) $f_{\alpha}(s_{i},s_{j})=s_{m}$ when $\sigma_{m}$ is $\sigma_{i}\mathop{\alpha}\sigma_{j}$ and $\alpha$ is a binary connective (such as $\mathbin{\&}$ ), and (b) $f_{\rho}(s_{i})=s_{j}$ when $\sigma_{j}$ is $\rho\sigma_{i}$ and $\rho$ is a unary connective (such as $\neg$ ). Then $(s_{1},\ldots,s_{k})\in S^{\prime}$ , and $M$ satisfies the conclusion.

This completes the proof of soundness. We now prove completeness. Assume that $\Gamma\vDash\gamma$ ; we must show that $\Gamma\vdash\gamma$ . We can assume without loss of generality that $\Gamma$ is nonempty, because if $\Gamma$ is empty, we replace it by a singleton set containing an instance of our Axiom (1).

Let $\Gamma=\{\gamma_{1},\ldots,\gamma_{n}\}$. For $1\leq i\leq n$, assume that $\gamma_{i}$ is $(\sigma^{i}_{1},\ldots,\sigma^{i}_{k_{i}};S_{i})$, and let $\Gamma_{i}=\{\sigma^{i}_{1},\ldots,\sigma^{i}_{k_{i}}\}$. Assume that $\gamma$ is $(\sigma^{0}_{1},\ldots,\sigma^{0}_{k_{0}};S_{0})$, and let $\Gamma_{0}=\{\sigma^{0}_{1},\ldots,\sigma^{0}_{k_{0}}\}$. Let $G$ be the closure of $\Gamma_{0}\cup\Gamma_{1}\cup\cdots\cup\Gamma_{n}$ under subformulas.

For each $i$ with $1\leq i\leq n$, let $H_{i}$ be the set difference $G\setminus\Gamma_{i}$. Let $r_{i}=|H_{i}|$. Let $H_{i}=\{\tau^{i}_{1},\ldots,\tau^{i}_{r_{i}}\}$. By applying Rule (3), we prove from $\gamma_{i}$ the sentence $(\sigma^{i}_{1},\ldots,\sigma^{i}_{k_{i}},\tau^{i}_{1},\ldots,\tau^{i}_{r_{i}};S_{i}\times[0,1]^{r_{i}})$. Let $\psi_{i}$ be the conclusion of Rule (7) when the premise is $(\sigma^{i}_{1},\ldots,\sigma^{i}_{k_{i}},\tau^{i}_{1},\ldots,\tau^{i}_{r_{i}};S_{i}\times[0,1]^{r_{i}})$.

Let $\delta_{1},\ldots,\delta_{p}$ be a fixed ordering of the members of $G$ . Since the set of components of each $\psi_{i}$ is $G$ , we can use Rule (2) to rewrite $\psi_{i}$ as a sentence $(\delta_{1},\ldots,\delta_{p};T_{i})$ . Let us call this sentence $\varphi_{i}$ .

Also, since the only rules used in proving $\varphi_{i}$ from $\gamma_{i}$ are Rules (2), (3), and (7), it follows from Lemma 3.3 that $\gamma_{i}$ and $\varphi_{i}$ are logically equivalent.

We now make use of the notion of minimization. Let $T=T_{1}\cap\cdots\cap T_{n}$ . Define $\varphi$ to be the sentence $(\delta_{1},\ldots,\delta_{p};T)$ . It follows from Lemma 3.1 that each $\psi_{i}$ is minimized. So by Lemma 3.5, each $\varphi_{i}$ is minimized. By Lemma 3.5 again, $\varphi$ is minimized.

The sentence $\varphi$ was obtained from the sentences $\varphi_{i}$ by applying Rule (4) $n-1$ times. It follows from Lemma 3.3 that $\varphi$ is equivalent to $\{\varphi_{1},\ldots,\varphi_{n}\}$ . Since we also showed that $\gamma_{i}$ is logically equivalent to $\varphi_{i}$ for $1\leq i\leq n$ , it follows that $\varphi$ is logically equivalent to $\Gamma$ . Hence, since $\Gamma\vDash\gamma$ , it follows that $\{\varphi\}\vDash\gamma$ . It also follows that to prove that $\Gamma\vdash\gamma$ , we need only show that there is a proof of $\gamma$ from $\varphi$ .

Recall that $\gamma$ is $(\sigma^{0}_{1},\ldots,\sigma^{0}_{k_{0}};S_{0})$ , and $\varphi$ is $(\delta_{1},\ldots,\delta_{p};T)$ . By applying Rule (2), we can re-order the components of $\varphi$ so that the components start with $\sigma^{0}_{1},\ldots,\sigma^{0}_{k_{0}}$ . We thereby obtain from $\varphi$ a sentence $(\sigma^{0}_{1},\ldots,\sigma^{0}_{k_{0}},\ldots;T^{\prime})$ , which we denote by $\varphi^{\prime}$ . By Lemma 3.3 we know that $\varphi$ and $\varphi^{\prime}$ are logically equivalent. So $\{\varphi^{\prime}\}\vDash\gamma$ . Since $\varphi$ is minimized, so is $\varphi^{\prime}$ , by Lemma 3.5. By applying Rule (5), we obtain from $\varphi^{\prime}$ a sentence $(\sigma^{0}_{1},\ldots,\sigma^{0}_{k_{0}};T^{\prime\prime})$ , which we denote by $\varphi^{\prime\prime}$ .

We now show that $T^{\prime\prime}\subseteq S_{0}$ . This is sufficient to complete the proof of completeness, since then we can use Rule (6) to prove $\gamma$ . If $T^{\prime\prime}$ is empty, we are done. So assume that $(s_{1},\ldots,s_{k_{0}})\in T^{\prime\prime}$ ; we must show that $(s_{1},\ldots,s_{k_{0}})\in S_{0}$ .

Since $(s_{1},\ldots,s_{k_{0}})\in T^{\prime\prime}$ , it follows that there is an extension $(s_{1},\ldots,s_{k_{0}},\ldots,s_{p})$ in $T^{\prime}$ . Since $\varphi^{\prime}$ is minimized, there is a model $M$ of $\varphi^{\prime}$ such that the value of $\sigma^{0}_{i}$ is $s_{i}$ , for $1\leq i\leq k_{0}$ . Since $\{\varphi^{\prime}\}\vDash\gamma$ , it follows that $M$ is a model of $\gamma$ . By definition of what it means for $M$ to be a model of $\gamma$ , it follows that $(s_{1},\ldots,s_{k_{0}})\in S_{0}$ , as desired.

This completes the proof of soundness and completeness.

4 Closure of MD-sentences under Boolean combinations

Our next theorem implies that MD-sentences are robust, in that they are closed under Boolean combinations. Of course, since we are dealing with sentences (which take only the values True and False) in our "outer" logic, we use the standard Boolean connectives.

We begin with a useful lemma that we shall also use later.

Lemma 4.1.

The (standard logical) negation of the sentence $(\sigma_{1},\ldots,\sigma_{k};S)$ is $(\sigma_{1},\ldots,\sigma_{k};\tilde{S})$ where $\tilde{S}$ is the set difference $[0,1]^{k}\setminus S$ .

Proof 4.2.

We need only show that if $M$ is a model, then $M\vDash(\sigma_{1},\ldots,\sigma_{k};\tilde{S})$ if and only if $M\not\vDash(\sigma_{1},\ldots,\sigma_{k};S)$ . Let $s_{i}$ be the value of $\sigma_{i}$ in $M$ , for $1\leq i\leq k$ . If $M\vDash(\sigma_{1},\ldots,\sigma_{k};\tilde{S})$ , then $(s_{1},\ldots,s_{k})\in\tilde{S}$ , and so $(s_{1},\ldots,s_{k})\notin S$ . Hence, $M\not\vDash(\sigma_{1},\ldots,\sigma_{k};S)$ . Conversely, if $M\not\vDash(\sigma_{1},\ldots,\sigma_{k};S)$ , then $(s_{1},\ldots,s_{k})\notin S$ , and so $(s_{1},\ldots,s_{k})\in\tilde{S}$ . Hence, $M\vDash(\sigma_{1},\ldots,\sigma_{k};\tilde{S})$ .

Theorem 4.3.

MD-sentences are closed under Boolean combinations $\land$ , $\lor$ , and $\neg$ .

Proof 4.4.

Let $\gamma_{1}$ and $\gamma_{2}$ be MD-sentences. Assume that $\gamma_{1}$ is $(\sigma^{1}_{1},\ldots,\sigma^{1}_{m};S_{1})$ , and that $\gamma_{2}$ is $(\sigma^{2}_{1},\ldots,\sigma^{2}_{n};S_{2})$ . As in the proof of Theorem 3.7, let $G$ be the closure of $\{\sigma^{1}_{1},\ldots,\sigma^{1}_{m},\sigma^{2}_{1},\ldots,\sigma^{2}_{n}\}$ under subformulas. Assume that $G=\{\delta_{1},\ldots,\delta_{p}\}$ . As in the proof of Theorem 3.7, we know that for $i=1$ and $i=2$ , there is $T_{i}$ such that $\gamma_{i}$ is equivalent to a sentence $(\delta_{1},\ldots,\delta_{p};T_{i})$ . The conjunction $\gamma_{1}\land\gamma_{2}$ is equivalent to $(\delta_{1},\ldots,\delta_{p};T_{1}\cap T_{2})$ . The disjunction $\gamma_{1}\lor\gamma_{2}$ is equivalent to $(\delta_{1},\ldots,\delta_{p};T_{1}\cup T_{2})$ . And by Lemma 4.1, the negation $\neg\gamma_{1}$ is equivalent to $(\delta_{1},\ldots,\delta_{p};\tilde{T_{1}})$ , where $\tilde{T_{1}}$ is the set difference $[0,1]^{p}\setminus T_{1}$ .
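The set operations in this proof can be illustrated with a minimal Python sketch. It assumes that both sentences have already been rewritten over the same components $\delta_{1},\ldots,\delta_{p}$ and that each set $T_{i}$ is given as a membership predicate on tuples of values; the function names are hypothetical and are not part of any implementation described in this paper.

```python
# Sketch: Boolean combinations of MD-sentences over shared components.
# Each set T is modeled as a predicate taking the tuple of component values.

def conj(t1, t2):
    """Set portion of the conjunction: T1 intersected with T2."""
    return lambda values: t1(values) and t2(values)

def disj(t1, t2):
    """Set portion of the disjunction: T1 united with T2."""
    return lambda values: t1(values) or t2(values)

def neg(t1):
    """Set portion of the negation: [0,1]^p minus T1."""
    return lambda values: not t1(values)

# Example with p = 2 (hypothetical sets).
T1 = lambda v: v[0] <= v[1]
T2 = lambda v: v[1] >= 0.5
```

A model satisfies the combined sentence exactly when its tuple of component values passes the combined predicate.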

5 Lowering the dimensionality

It is natural to ask whether there is a $(k+1)$ -dimensional MD-sentence that in Łukasiewicz or Gödel logic is not equivalent to any $k$ -dimensional MD-sentence. For the special case $k=1$ , the next theorem gives an answer. We shall shortly state the more general case and generalizations of it as open problems.

Theorem 5.1.

There is a 2-dimensional MD-sentence that is not equivalent (in either Łukasiewicz or Gödel logic) to a 1-dimensional MD-sentence.

Proof 5.2.

Let $\sigma$ be the 2-dimensional MD-sentence $(A_{1},A_{2};S)$ where $S=\{(a_{1},a_{2}):a_{1}^{2}=a_{2}\}$ . We now show that $\sigma$ is not equivalent to a 1-dimensional MD-sentence. If $\varphi$ is a propositional formula involving only $A_{1}$ and $A_{2}$ , then it is easy to see (by induction on the structure of formulas) that for Łukasiewicz or Gödel logic, $\varphi$ defines a piecewise linear function $f_{\varphi}$ , in the sense that the 1-dimensional MD-sentence $(\varphi;S^{\prime})$ says that if $a_{1}$ is the value of $A_{1}$ and $a_{2}$ is the value of $A_{2}$ , then $f_{\varphi}(a_{1},a_{2})\in S^{\prime}$ . Since there is no such piecewise linear function $f_{\varphi}$ and set $S^{\prime}$ for our sentence $\sigma$ , the result holds.

The next theorem does not depend on restricting to Łukasiewicz or Gödel logic.

Theorem 5.3.

Every finite set of MD-sentences of arbitrary dimensions that involve only the $k$ predicate symbols $A_{1},\ldots,A_{k}$ is equivalent to a single $k$ -dimensional MD-sentence $(A_{1},\ldots,A_{k};S)$ . (The set $S$ depends on the real-valued logic being considered.)

Proof 5.4.

Let $\Gamma$ be a finite set of MD-sentences. We can view $\Gamma$ as a conjunction of MD-sentences, so by Theorem 4.3, $\Gamma$ is equivalent to a single MD-sentence $\gamma$ . As in the proof of completeness, by closing under subformulas, applying Rule (7), and reordering by applying Rule (2), we obtain an MD-sentence $(A_{1},\ldots,A_{k},\varphi_{1},\ldots,\varphi_{r};S^{\prime})$ that is equivalent to $\gamma$ . Since the tuples in $S^{\prime}$ are good tuples, this is equivalent to the sentence $(A_{1},\ldots,A_{k};S)$ where $S=\{(s_{1},\ldots,s_{k}):(s_{1},\ldots,s_{k},s^{\prime}_{1},\ldots,s^{\prime}_{r})\in S^{\prime}\}$ .

Open problems: For each $k$ with $k\geq 2$ , does there exist a $(k+1)$ -dimensional MD-sentence that in Łukasiewicz or Gödel logic is not equivalent to a $k$ -dimensional MD-sentence? And for $k\geq 1$ , how about there being a $(k+1)$ -dimensional MD-sentence not equivalent to a Boolean combination of 1-dimensional MD-sentences, or even to a Boolean combination of $k$ -dimensional MD-sentences?

6 SoCRAtic logic: A decision procedure

Given a finite set $\Gamma$ of MD-sentences, and a single MD-sentence $\gamma$ , Theorem 3.7 says that $\Gamma\vDash\gamma$ if and only if $\Gamma\vdash\gamma$ . As we shall show, under natural assumptions there is an algorithm for deciding if $\Gamma\vDash\gamma$ . We call this algorithm a decision procedure. If the information sets $S$ all have a simple structure and the size of $\Gamma$ is treated as a constant, then the algorithm runs in polynomial time.

It is natural to wonder whether we can simply use our complete axiomatization to derive a decision procedure. The usual answer is that it is not clear in what order to apply the rules of inference. In our proof of completeness, the rules of inference are applied in a specific order, so that is not an issue here. Rather, the problem is that in applying Rule (7), there is no easy way to derive $S^{\prime}$ from $S$ , even if $S$ is fairly simple. In fact, we now show that even deciding if $S^{\prime}$ is nonempty is NP-hard. Let $\varphi$ be an instance of the NP-hard problem 3SAT. Thus, $\varphi$ is of the form $(B^{1}_{1}\veebar B^{1}_{2}\veebar B^{1}_{3})\mathbin{\&}\cdots\mathbin{\&}(B^{r}_{1}\veebar B^{r}_{2}\veebar B^{r}_{3})$ , where each $B^{i}_{j}$ is a literal (an atomic proposition or its negation). Assume that the atomic propositions that appear in $\varphi$ are $A_{1},\ldots,A_{k}$ . Let $\psi$ be the sentence

(A_{1},\ldots,A_{k},\neg A_{1},\ldots,\neg A_{k},\tau_{1},\ldots,\tau_{r},\tau_{1}\veebar B^{1}_{3},\ldots,\tau_{r}\veebar B^{r}_{3};S),

where $\tau_{i}$ is $B^{i}_{1}\veebar B^{i}_{2}$ , for $1\leq i\leq r$ , and where $S=\{0,1\}^{2k+r}\times\{1\}^{r}$ . Assume that we apply Rule (7) where the premise is $\psi$ , and the conclusion is

(A_{1},\ldots,A_{k},\neg A_{1},\ldots,\neg A_{k},\tau_{1},\ldots,\tau_{r},\tau_{1}\veebar B^{1}_{3},\ldots,\tau_{r}\veebar B^{r}_{3};S^{\prime}).

We call this sentence $\psi^{\prime}$ . It follows easily from our construction of $\psi$ that the 3SAT problem $\varphi$ is satisfiable if and only if $\psi$ is satisfiable. Now $\psi$ and $\psi^{\prime}$ are logically equivalent, by Lemma 3.3. So the 3SAT problem $\varphi$ is satisfiable if and only if $\psi^{\prime}$ is satisfiable. By Lemma 3.1, we know that $\psi^{\prime}$ is minimized. Hence, if $S^{\prime}$ is nonempty, there is a model of $\psi^{\prime}$ , by the definition of minimization. And if $S^{\prime}$ is empty, then by the definition of a model of a sentence, there is no model of $\psi^{\prime}$ . Therefore, $\psi^{\prime}$ is satisfiable if and only if $S^{\prime}$ is nonempty. By combining this with our earlier observation that the 3SAT problem $\varphi$ is satisfiable if and only if $\psi^{\prime}$ is satisfiable, it follows that the 3SAT problem $\varphi$ is satisfiable if and only if $S^{\prime}$ is nonempty. Hence, deciding if $S^{\prime}$ is nonempty is NP-hard.

We now discuss our decision procedure. To have a chance of there being a decision procedure, the set portion $S$ of an MD-sentence $(\sigma_{1},\ldots,\sigma_{k};S)$ must be tractable. We now give a simple, natural choice for the set portions. A rational interval is a subset of $[0,1]$ that is of one of the four forms $(a,b)$ , $[a,b]$ , $(a,b]$ , or $[a,b)$ , where $a$ and $b$ are rational numbers. Let us say that a sentence $(\sigma_{1},\ldots,\sigma_{k};S)$ is interval-based if $S$ is of the form $S_{1}\times\cdots\times S_{k}$ , where each $S_{i}$ is a union of a finite number of rational intervals. If each $S_{i}$ is the union of at most $N$ rational intervals, then we say that the sentence is $N$ -interval-based. Note that this interval-based sentence $(\sigma_{1},\ldots,\sigma_{k};S)$ is equivalent to the set $\{(\sigma_{1};S_{1}),\ldots,(\sigma_{k};S_{k})\}$ of sentences with only one component each. This observation may be useful in implementing a decision procedure. In fact, although we do not make use of this in the decision procedure described in this section, we do use it in the implementation of the decision procedure described later, since these sentences with a single component are easy to deal with. (This is one of several ways that our implementation differs from what is described in this section.)

Theorem 6.1.

Assume either Łukasiewicz logic or Gödel logic, with the connectives $\mathbin{\&}$ , $\veebar$ , $\rightarrow$ , and $\neg$ . (For Łukasiewicz logic, we could allow each of strong and weak disjunction and conjunction, as described in an earlier footnote.) Assume that $\Gamma\cup\{\gamma\}$ is interval-based. Then there is an algorithm that determines whether $\Gamma\vDash\gamma$ . Assume that $\Gamma$ has at most $P$ sentences, each sentence in $\Gamma\cup\{\gamma\}$ is $N$ -interval-based, and $(\Gamma,\gamma)$ has nesting depth at most $M$ . If $M$ is fixed, then the algorithm runs in time polynomial in $P$ and $N$ .

Proof 6.2.

Assume throughout the proof that $\Gamma$ has at most $P$ sentences, each sentence in $\Gamma\cup\{\gamma\}$ is $N$ -interval-based, and $(\Gamma,\gamma)$ has nesting depth at most $M$ .

It is easy to see that $\Gamma\vDash\gamma$ if and only if $\Gamma\cup\{\neg\gamma\}$ is not satisfiable. So we need only give an algorithm that decides whether $\Gamma\cup\{\neg\gamma\}$ is satisfiable.

Let $\{\sigma_{1},\ldots,\sigma_{p}\}$ be the closure of $\Gamma\cup\{\gamma\}$ under subformulas. Let $\Gamma=\{\gamma_{1},\ldots,\gamma_{n}\}$ . By making use of Rules (2) and (3), for each $i$ with $1\leq i\leq n$ , we can create a sentence $\gamma^{\prime}_{i}$ of the form $(\sigma_{1},\ldots,\sigma_{p};S^{i})$ that by Lemma 3.3 is equivalent to $\gamma_{i}$ , and that has $\sigma_{1},\ldots,\sigma_{p}$ as components. By the construction, each $\gamma_{i}^{\prime}$ is $N$ -interval-based.

Similarly, create the sentence $\gamma^{\prime}$ of the form $(\sigma_{1},\ldots,\sigma_{p};T)$ that is equivalent to $\gamma$ , and that has $\sigma_{1},\ldots,\sigma_{p}$ as components. As before, $\gamma^{\prime}$ is $N$ -interval-based.

Now $\Gamma$ is equivalent to the conjunction of the sentences $\gamma^{\prime}_{i}$ for $1\leq i\leq n$ , and this conjunction is equivalent to $(\sigma_{1},\ldots,\sigma_{p};S)$ , where $S=\bigcap_{i\leq n}S^{i}$ . We now show that $(\sigma_{1},\ldots,\sigma_{p};S)$ is $PN$ -interval-based. By assumption, for each $i$ with $1\leq i\leq n$ , we have that $S^{i}$ is of the form $S^{i}_{1}\times\cdots\times S^{i}_{p}$ , where each $S^{i}_{j}$ is the union of at most $N$ intervals. For each $j$ with $1\leq j\leq p$ , let $S_{j}=\bigcap_{i}S^{i}_{j}$ . Then $S=S_{1}\times\cdots\times S_{p}$ . So to show that $(\sigma_{1},\ldots,\sigma_{p};S)$ is $PN$ -interval-based, we need only show that each $S_{j}$ is the union of at most $PN$ intervals.

Since $S_{j}=\bigcap_{i\leq n}S^{i}_{j}$ , where each $S^{i}_{j}$ is the union of at most $N$ intervals, we see that $S_{j}$ is the union of intervals where the left endpoint of each interval in $S_{j}$ is one of the left endpoints of intervals in $\bigcup_{i\leq n}S^{i}_{j}$ . For each $j$ , there are $n$ sets $S^{i}_{j}$ . And for each $i$ with $1\leq i\leq n$ , there are at most $N$ left endpoints of $S^{i}_{j}$ . So the total number of left endpoints of intervals in $\bigcup_{i\leq n}S^{i}_{j}$ is at most $nN\leq PN$ , and so the number of intervals in $S_{j}$ is at most $PN$ . Since $S=S_{1}\times\cdots\times S_{p}$ , it follows that $(\sigma_{1},\ldots,\sigma_{p};S)$ is $PN$ -interval-based.
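The counting argument above can be illustrated with a short Python sketch. For simplicity it assumes that every interval is closed and that each union is given as a sorted list of disjoint `(lo, hi)` pairs; handling open endpoints would require extra flags. The function names are hypothetical.

```python
from functools import reduce

def intersect_unions(a, b):
    """Intersect two unions of closed intervals (sorted, disjoint lists)."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])   # left endpoint comes from a or b
        hi = min(a[i][1], b[j][1])
        if lo <= hi:
            out.append((lo, hi))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out

def intersect_all(unions):
    """Intersect a nonempty list of unions, as in S_j = intersection of S^i_j."""
    return reduce(intersect_unions, unions)

example = intersect_all([
    [(0, 0.2), (0.4, 0.6)],
    [(0.1, 0.5)],
    [(0, 0.15), (0.45, 1)],
])
```

Every left endpoint in the result is a left endpoint of some input interval, which is why the number of output intervals never exceeds the total number of input intervals.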

Let us now consider $\neg\gamma$ , which is equivalent to $\neg\gamma^{\prime}$ . Recall that $\gamma^{\prime}$ is $(\sigma_{1},\ldots,\sigma_{p};T)$ , and that $\gamma^{\prime}$ is $N$ -interval-based. So $T$ is of the form $T_{1}\times\cdots\times T_{p}$ , where each $T_{j}$ is the union of at most $N$ intervals. By Lemma 4.1, the negation of $\gamma^{\prime}$ is $(\sigma_{1},\ldots,\sigma_{p};\tilde{T})$ , where $\tilde{T}$ is the set difference $[0,1]^{p}\setminus T$ . For each $j$ with $1\leq j\leq p$ , let $T^{\prime}_{j}$ be the set difference $[0,1]\setminus T_{j}$ . Clearly, $T^{\prime}_{j}$ is the union of intervals. The left endpoints of intervals in $T^{\prime}_{j}$ are the right endpoints of intervals in $T_{j}$ , possibly along with 0. So $T^{\prime}_{j}$ is the union of at most $N+1$ intervals. Let $V_{j}=[0,1]^{j-1}\times T^{\prime}_{j}\times[0,1]^{p-j}$ . It is straightforward to see that $\tilde{T}=\bigcup_{j\leq p}V_{j}$ .
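The construction of $T^{\prime}_{j}$ can be sketched in Python as follows. The representation is an assumption made for illustration: each interval is a tuple `(lo, lo_closed, hi, hi_closed)`, and the input list is sorted and pairwise disjoint. The function names are hypothetical.

```python
def nonempty(interval):
    """True if the interval contains at least one point."""
    lo, lo_closed, hi, hi_closed = interval
    return lo < hi or (lo == hi and lo_closed and hi_closed)

def complement(intervals):
    """Complement in [0, 1] of a sorted, disjoint union of intervals."""
    out = []
    cur, cur_closed = 0, True  # left end of the next candidate gap
    for lo, lo_closed, hi, hi_closed in intervals:
        # The gap ends where the interval begins, with closure flipped.
        gap = (cur, cur_closed, lo, not lo_closed)
        if nonempty(gap):
            out.append(gap)
        # The next gap starts at this interval's right endpoint.
        cur, cur_closed = hi, not hi_closed
    last = (cur, cur_closed, 1, True)
    if nonempty(last):
        out.append(last)
    return out

# [.2, .3] together with (.5, 1] has complement [0, .2) together with (.3, .5].
example = complement([(0.2, True, 0.3, True), (0.5, False, 1, True)])
```

Each left endpoint of the result is either 0 or a right endpoint of an input interval, so an input with $N$ intervals yields at most $N+1$ intervals.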

Now, showing that $\Gamma\cup\{\neg\gamma\}$ is not satisfiable is equivalent to showing that $(\sigma_{1},\ldots,\sigma_{p};S)\land(\sigma_{1},\ldots,\sigma_{p};\tilde{T})$ is not satisfiable, which is equivalent to showing that for every $j$ with $1\leq j\leq p$ , we have that $(\sigma_{1},\ldots,\sigma_{p};S)\land(\sigma_{1},\ldots,\sigma_{p};V_{j})$ is not satisfiable. So we need only give an algorithm for deciding if $(\sigma_{1},\ldots,\sigma_{p};S)\land(\sigma_{1},\ldots,\sigma_{p};V_{j})$ is satisfiable. Let us hold $j$ fixed. Since, as we showed, $(\sigma_{1},\ldots,\sigma_{p};S)$ is $PN$ -interval-based, we can write $S$ as $S_{1}\times\cdots\times S_{p}$ , where each $S_{i}$ is the union of at most $PN$ intervals. Now $(\sigma_{1},\ldots,\sigma_{p};S)\land(\sigma_{1},\ldots,\sigma_{p};V_{j})$ is equivalent to $(\sigma_{1},\ldots,\sigma_{p};S\cap V_{j})$ . Now $S\cap V_{j}$ is of the form $S^{\prime}_{1}\times\cdots\times S^{\prime}_{p}$ , where $S^{\prime}_{m}=S_{m}$ for $m\neq j$ , and where $S^{\prime}_{j}=S_{j}\cap T^{\prime}_{j}$ . We showed that $T^{\prime}_{j}$ is the union of at most $N+1$ intervals, and that $S_{j}$ is the union of at most $PN$ intervals, so it follows that $S_{j}\cap T^{\prime}_{j}$ is the union of at most $PN+N+1$ intervals, since each left endpoint of the intervals in $S_{j}\cap T^{\prime}_{j}$ is a left endpoint of an interval in $S_{j}$ or an interval in $T^{\prime}_{j}$ .

We now describe our algorithm for deciding whether the sentence $(\sigma_{1},\ldots,\sigma_{p};S\cap V_{j})$ , that is, the sentence $(\sigma_{1},\ldots,\sigma_{p};S^{\prime}_{1}\times\cdots\times S^{\prime}_{p})$ , which is $(PN+N+1)$ -interval-based, is satisfiable. This can be broken into $|S^{\prime}_{1}|\times\cdots\times|S^{\prime}_{p}|$ subproblems (where $|S^{\prime}_{k}|$ denotes the number of intervals in $S^{\prime}_{k}$ ), one for each choice $(I_{1},\ldots,I_{p})$ of a single interval $I_{k}$ from $S^{\prime}_{k}$ for each $k$ with $1\leq k\leq p$ . This gives a total of at most $(PN+N+1)^{M}$ subproblems. For each of these subproblems, we wish to decide satisfiability of the system $\{s_{1}\in I_{1},\ldots,s_{p}\in I_{p}\}$ along with (a) the binary constraints $f_{\alpha}(s_{i},s_{j})=s_{m}$ when $\sigma_{m}$ is $\sigma_{i}\mathop{\alpha}\sigma_{j}$ and $\alpha$ is one of $\mathbin{\&}$ , $\veebar$ , or $\Rightarrow$ , and (b) $f_{\neg}(s_{i})=s_{j}$ when $\sigma_{j}$ is $\neg\sigma_{i}$ .

The constraints $s_{j}\in I_{j}$ are specified by inequalities (for example, if $I_{j}$ is $(a,b]$ we get the inequalities $a<s_{j}\leq b$ ). We now show how to deal with the constraints in (a) and (b) above. A canonical example is given by dealing with $f_{\mathbin{\&}}(s_{i},s_{j})=s_{m}$ in Gödel logic, which interprets “ $f_{\mathbin{\&}}(s_{i},s_{j})=s_{m}$ ” as $\min\{s_{i},s_{j}\}=s_{m}$ . We split the system of constraints into two systems of constraints, one where we replace $\min\{s_{i},s_{j}\}=s_{m}$ by the two statements “ $s_{i}\leq s_{j}$ , $s_{i}=s_{m}$ ” and another where we replace $\min\{s_{i},s_{j}\}=s_{m}$ by the two statements “ $s_{j}<s_{i}$ , $s_{j}=s_{m}$ ”. In Łukasiewicz logic, where $f_{\mathbin{\&}}(s_{i},s_{j})$ is $\max\{0,s_{i}+s_{j}-1\}$ , we split the system of constraints into two systems of constraints, one where we replace $\max\{0,s_{i}+s_{j}-1\}=s_{m}$ by the two statements “ $s_{i}+s_{j}-1\geq 0$ , $s_{i}+s_{j}-1=s_{m}$ ” and another where we replace $\max\{0,s_{i}+s_{j}-1\}=s_{m}$ by the two statements “ $s_{i}+s_{j}-1<0$ , $s_{m}=0$ ”. The same approach works for the other binary connectives. For example, in Gödel logic, where $f_{\Rightarrow}(s_{i},s_{j})$ is 1 if $s_{i}\leq s_{j}$ and is $s_{j}$ otherwise, we would split into two cases, one where we replace $f_{\Rightarrow}(s_{i},s_{j})=s_{m}$ by the two statements “ $s_{i}\leq s_{j}$ , $s_{m}=1$ ” and another where we replace $f_{\Rightarrow}(s_{i},s_{j})=s_{m}$ by the two statements “ $s_{i}>s_{j}$ , $s_{m}=s_{j}$ ”. In considering the effect of the constraints in (a) and (b), each of our at most $(PN+N+1)^{M}$ subproblems splits at most $2^{p}\leq 2^{M}$ times, giving a grand total of at most $(PN+N+1)^{M}2^{M}$ systems of inequalities that we need to check for feasibility (that is, to see if there is a solution).
For each of these systems of inequalities, we can make use of a polynomial-time algorithm for linear programming to decide feasibility, where the size of each of these systems is linear in $M$ , and so the running time for each instance of the linear programming algorithm is polynomial in $M$ . Since also the number of systems is at most $(PN+N+1)^{M}2^{M}$ , and since $M$ is fixed by assumption, this gives us an overall decision algorithm whose running time is polynomial in $N$ and $P$ .
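The case splitting described in this proof can be sketched in Python as follows. The sketch only enumerates the resulting systems symbolically, as lists of inequality strings; it does not call a linear programming solver, and the function names and string format are hypothetical choices made for illustration.

```python
from itertools import product

def godel_and(i, j, m):
    """Two branches replacing min{s_i, s_j} = s_m."""
    return [[f"s{i} <= s{j}", f"s{i} = s{m}"],
            [f"s{j} < s{i}", f"s{j} = s{m}"]]

def lukasiewicz_and(i, j, m):
    """Two branches replacing max{0, s_i + s_j - 1} = s_m."""
    return [[f"s{i} + s{j} - 1 >= 0", f"s{i} + s{j} - 1 = s{m}"],
            [f"s{i} + s{j} - 1 < 0", f"s{m} = 0"]]

def split_systems(base, connective_constraints):
    """Return one system per choice of branch for every connective."""
    systems = []
    for choice in product(*connective_constraints):
        system = list(base)          # interval constraints are shared
        for branch in choice:
            system.extend(branch)
        systems.append(system)
    return systems

# Two connective constraints give 2 * 2 = 4 systems to test for feasibility.
base = ["0 <= s1", "s1 <= 1"]
systems = split_systems(base, [godel_and(1, 2, 3), godel_and(3, 4, 5)])
```

Each returned system would then be passed to a linear programming feasibility check, and the original subproblem is satisfiable exactly when at least one system is feasible.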

The reason we held the parameter $M$ fixed is that the running time of the algorithm is exponential in $M$ , because there are an exponential number of calls to a linear programming subroutine. The algorithm is polynomial-time if there is a fixed bound on $M$ . Such a bound is necessary, because the problem can be co-NP hard, for the following reason.

Let $\gamma$ be the sentence $(A,\neg A;[1]\times[1])$ . Then $\gamma$ is not satisfiable. Let $\Gamma$ consist of the single sentence $\psi$ from the beginning of the section. Then $\Gamma\vDash\gamma$ if and only if $\psi$ is not satisfiable. Now $\psi$ is satisfiable if and only if $S^{\prime}$ from the beginning of the section is nonempty, which we showed is an NP-hard problem to determine. Since $\Gamma\vDash\gamma$ if and only if $\psi$ is not satisfiable, it follows that deciding if $\Gamma\vDash\gamma$ is co-NP hard.

We now give an implementation of the decision procedure. The decision procedure described in Section 6 is available under the socratic-logic subdirectory provided with this supplementary material. We implemented the algorithm as a Python package named socratic, which requires Python 3.6 or 3.7 and makes use of IBM® ILOG® CPLEX® Optimization Studio V12.10.0 via the docplex Python package.

6.1 Source code organization

The source code is organized as follows:

setup.sh: A script to create a Python virtualenv and install required packages
requirements.txt: A standard pip list of package dependencies
socratic/theory.py: Implementations for theories, sentences, and bounded intervals
socratic/op.py: Implementations for each logical operator as well as propositions and truth value constants
socratic/demo.py: Two example use cases demonstrating the use of the package
socratic/test.py: A suite of unit tests also serving as our experimental setup
socratic/hajek.py: Many tautologies proved in (10) used in test.py
socratic/clock.py: A higher-order function to measure the runtime of experiments

The classes defined in the source code are:

theory.Theory

A collection of sentences that can test for satisfiability or the entailment of a query sentence under a given logic

theory.Sentence

A base-class for a collection of formulas and an associated set of candidate interpretations for the formulas

theory.SimpleSentence: A single formula and an associated collection of candidate truth value intervals for the formula

theory.FloatInterval

An open or closed interval of truth values

theory.ClosedInterval

$[l,u]$ , i.e., all values from $l$ to $u$ , inclusive

theory.Point: $[x,x]$ , i.e., just $x$

theory.OpenInterval

$(l,u)$ , i.e., all values from $l$ to $u$ , exclusive

theory.OpenLowerInterval

$(l,u]$ , i.e., all values from $l$ to $u$ excluding $l$

theory.OpenUpperInterval

$[l,u)$ , i.e., all values from $l$ to $u$ excluding $u$

op.Formula

The base-class of a data structure representing the syntax tree of a logical formula

op.Prop

A named proposition, e.g., $x$

op.Constant

A truth value constant, e.g., $.5$

op.Operator

A base-class for all formulas with subformulas

op.And: Strong conjunction $x\otimes y$
op.WeakAnd: Weak conjunction $x\mathbin{\&}y=\min\{x,y\}$
op.Or: Strong disjunction $x\oplus y$
op.WeakOr: Weak disjunction $x\veebar y=\max\{x,y\}$
op.Implies: Implication $x\Rightarrow y$ , i.e., the residuum of $\otimes$
op.Not: Negation defined $\neg x=(x\Rightarrow 0)$
op.Inv: Involute negation ${\sim}x=1-x$
op.Equiv: Logical equivalence defined $(x\equiv y)=((x\Rightarrow y)\otimes(y\Rightarrow x))$
op.Delta: The operation $\Delta x=1$ if $x=1$ else 0
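For reference, the truth functions behind the operators listed above can be written down directly. The following Python sketch uses the standard definitions for Łukasiewicz and Gödel logic; it is an illustration with hypothetical function names and is not taken from the socratic source code.

```python
from fractions import Fraction

# Lukasiewicz logic: strong connectives and residuum.
def luk_and(x, y):
    return max(0, x + y - 1)

def luk_or(x, y):
    return min(1, x + y)

def luk_implies(x, y):
    return min(1, 1 - x + y)

# Goedel logic: the strong connectives coincide with min and max.
def godel_and(x, y):
    return min(x, y)

def godel_or(x, y):
    return max(x, y)

def godel_implies(x, y):
    return 1 if x <= y else y

# Operations defined in terms of a chosen conjunction and residuum.
def neg(implies, x):
    """Negation defined as x => 0."""
    return implies(x, 0)

def equiv(conj, implies, x, y):
    """Equivalence defined as (x => y) conjoined with (y => x)."""
    return conj(implies(x, y), implies(y, x))

def inv(x):
    """Involutive negation 1 - x."""
    return 1 - x

def delta(x):
    """Delta x is 1 if x = 1 and 0 otherwise."""
    return 1 if x == 1 else 0

half = Fraction(1, 2)
```

With these definitions, negation in Łukasiewicz logic agrees with the involutive negation, while in Gödel logic it maps every nonzero value to 0.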

6.2 Implementation details

The implementation strategy closely adheres to the decision procedure described in Section 6, though with a few notable design shortcuts.

Boolean variables.

One such shortcut is the use of mixed integer linear programming (MILP) to perform the “splitting” of linear programs into two possible optimization problems, specifically by adding a Boolean variable that determines which of a set of constraints must be active. For example, given the desired constraint $z=\min\{x,y\}$ , one may write

$\displaystyle z$	$\displaystyle\leq x$	(8)
$\displaystyle z$	$\displaystyle\leq y$	(9)
$\displaystyle z$	$\displaystyle\geq x-(1-b)$	(10)
$\displaystyle y$	$\displaystyle\geq x-(1-b)$	(11)
$\displaystyle z$	$\displaystyle\geq y-b$	(12)
$\displaystyle x$	$\displaystyle\geq y-b$	(13)

for Boolean variable $b$ . For $x,y,z\in[0,1]$ , observe that (10) and (11) are effectively disabled for $b=0$ and that (12) and (13) are likewise disabled for $b=1$ . For example, when $b=1$ , the remaining constraints are $z\leq x,z\leq y,z\geq x,y\geq x$ , which is equivalent to $z=x,x\leq y$ , as desired. Observe then that MILP’s exploration of either value for the Boolean variable is equivalent to repeating linear optimization for either possible set of constraints; no feasible solution exists for any combination of such Boolean variables in exactly the case that none of the split linear programs are feasible. In practice, CPLEX has built-in support for min, max, abs, and a handful of other useful functions, though the above technique is still required to implement Gödel logic’s residuum, negation, and equivalence operations as well as to select the specific intervals a sentence’s formula truth values lie within.
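The claim that constraints (8) through (13) encode $z=\min\{x,y\}$ can be checked by brute force on a small grid. The following Python sketch does so with exact rational arithmetic; it is an independent check with hypothetical function names and does not use CPLEX or the socratic package.

```python
from fractions import Fraction
from itertools import product

def satisfies(x, y, z, b):
    """Constraints (8)-(13) for a fixed value of the Boolean variable b."""
    return (z <= x and z <= y              # (8), (9)
            and z >= x - (1 - b)           # (10)
            and y >= x - (1 - b)           # (11)
            and z >= y - b                 # (12)
            and x >= y - b)                # (13)

def encodes_min(steps=4):
    """Check on a grid of [0,1]^3 that some b is feasible iff z = min{x, y}."""
    grid = [Fraction(i, steps) for i in range(steps + 1)]
    for x, y, z in product(grid, repeat=3):
        feasible = any(satisfies(x, y, z, b) for b in (0, 1))
        if feasible != (z == min(x, y)):
            return False
    return True
```

A grid check is of course not a proof, but it matches the case analysis given above for $b=0$ and $b=1$.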

Strict inequality.

The described decision procedure also occasionally calls for continuous constraints with strict inequality, in particular when dealing with the complements of closed intervals, but also when handling input open intervals or the Gödel residuum, $(x\Rightarrow y)=y$ if $x>y$ else 1. Linear programming, however, does not inherently support this. To implement strict inequality constraints, we introduce a global gap variable $\delta\in[0,1]$ to widen the distance between either side of the inequality, e.g.

x\geq y+\delta,

(14)

and then seek to maximize $\delta$ . If optimization yields an apparently feasible solution but with $\delta=0$ , we regard it as infeasible because at least one strict inequality constraint could not be honored strictly. Again in practice, due to floating-point imprecision, MILP can sometimes return tiny though nonzero values of $\delta$ even for $x=y$ in (14); as a result, it is necessary to check if $\delta$ is greater than some threshold rather than merely nonzero. We use $\delta>10^{-8}$ , which is much larger than the imprecision we have observed and yet much smaller than most truth values we consider. We observe that this technique is roughly equivalent to replacing $\delta$ with $10^{-8}$ throughout, which has the added benefit of freeing up the optimization objective for other uses in future extensions of the decision procedure, such as determining the tightest bounds for which a theory can entail a query.
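The thresholding idea can be illustrated on a toy instance that needs no solver. The Python sketch below assumes a single strict constraint $x>y$ with $x$ and $y$ each confined to a closed interval, in which case the maximal gap has a closed form. The function names are hypothetical, and the actual implementation obtains $\delta$ from MILP optimization instead.

```python
THRESHOLD = 1e-8  # same threshold as quoted in the text

def max_gap(x_bounds, y_bounds):
    """Largest delta in [0, 1] with x >= y + delta, or None if infeasible.

    x_bounds and y_bounds are (lo, hi) pairs of closed bounds.
    """
    x_hi = x_bounds[1]
    y_lo = y_bounds[0]
    best = x_hi - y_lo           # pick x as large and y as small as allowed
    if best < 0:
        return None              # infeasible even with delta = 0
    return min(1.0, best)

def strictly_feasible(x_bounds, y_bounds):
    """Accept only if the strict inequality holds with a visible gap."""
    delta = max_gap(x_bounds, y_bounds)
    return delta is not None and delta > THRESHOLD
```

For example, with $x\in[0,.5]$ and $y\in[.5,1]$ the relaxed constraint $x\geq y$ is feasible only with $\delta=0$, so the strict constraint is rejected.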

Simple sentences.

We additionally observe that, for theories restricted to interval-based sentences, it is sufficient to support only sentences containing a single formula and collection of truth value intervals, i.e., SimpleSentences of the form $(\sigma;S)$ for a single formula $\sigma$ . This is because of the following theorem:

Theorem 6.3.

Any interval-based sentence $s=(\sigma_{1},\ldots,\sigma_{k};S_{1}\times\cdots\times S_{k})$ is equivalent to a collection of simple sentences $s_{1},\ldots,s_{k}$ , each given $s_{i}=(\sigma_{i};S_{i})$ .

Proof 6.4.

Given interval-based sentence $s$ and simple sentences $s_{1},\ldots,s_{k}$ as described, one may apply Rules (3) and (2) to obtain $s_{1}^{\prime},\ldots,s_{k}^{\prime}$ given $s_{i}^{\prime}=(\sigma_{1},\ldots,\sigma_{k};[0,1]^{i-1}\times S_{i}\times[0,1]^{k-i})$ . One may then repeatedly apply Rule (4) to compose these exactly into $s$ . Likewise, one may apply Rules (2) and (5) to obtain each $s_{i}$ directly from $s$ . Hence, the two forms are equivalent.

Accordingly, socratic implements only SimpleSentence. In order to include an interval-based sentence in a theory, one should instead include each of its component simple sentences as constructed above. In order to test the entailment of an interval-based sentence, one should separately test the entailment of each of its component simple sentences.

Complementary intervals.

As a last deviation from the described decision procedure, rather than explicitly finding the complement of a collection of truth value intervals for a given query formula, we simply adjust how constraints are expressed to force feasible solutions into the set of complementary intervals. Specifically, while the usual constraints require the formula’s truth value to lie within some one of its intervals, the complementary constraints require the formula’s truth value to not lie in any of its intervals, i.e., to lie to the left or the right of each of its intervals. We then reverse the direction of each interval’s lower and upper bound constraints, adding or removing gap variable $\delta$ as appropriate to switch between strict and nonstrict inequalities, and introduce Boolean parameters to decide which of each pair of constraints should apply, i.e., to decide whether the formula’s truth value should lie to the left or to the right. For example, if a simple sentence has truth value intervals $[.2,.3],(.5,1]$ , the above would produce constraints

	$\displaystyle x$	$\displaystyle\leq.2-\delta+(1-b_{1}),$
	$\displaystyle x$	$\displaystyle\geq.3+\delta-b_{1},$
	$\displaystyle x$	$\displaystyle\leq.5+(1-b_{2}),$
	$\displaystyle x$	$\displaystyle\geq 1+\delta-b_{2}$

Observe that the last of these cannot be satisfied for $\delta>0$ unless $b_{2}=1$ , which is consistent with the complement of $(.5,1]$ not having a right side in the interval $[0,1]$ .

6.3 Experimental results

We tested socratic in four different experimental contexts:

•

3SAT and higher $k$ -SAT problems which become satisfiable if any one of their input clauses is removed
•

82 axioms and tautologies taken from Hájek in (10), some of which hold only for one of Łukasiewicz or Gödel logic
•

A formula that is classically valid but invalid in both Łukasiewicz and Gödel logic, unless propositions are constrained to be Boolean
•

A stress test running socratic on sentences with thousands of intervals

All experiments are conducted on a MacBook Pro with a 2.9 GHz Quad-Core Intel Core i7, 16 GB 2133 MHz LPDDR3, and Intel HD Graphics 630 1536 MB running macOS Catalina 10.15.5.

$k$ -SAT.

We construct classically unsatisfiable $k$ -SAT problems of the form

(x_{1}\mathbin{\&}\neg x_{1})\veebar\cdots\veebar(x_{k}\mathbin{\&}\neg x_{k})

(15)

which, after CNF conversion, yield for 3SAT

$\displaystyle(x_{1}\veebar x_{2}\veebar x_{3}),$	$\displaystyle(\neg x_{1}\veebar x_{2}\veebar x_{3}),$	$\displaystyle(x_{1}\veebar\neg x_{2}\veebar x_{3}),$
$\displaystyle(x_{1}\veebar x_{2}\veebar\neg x_{3}),$	$\displaystyle(x_{1}\veebar\neg x_{2}\veebar\neg x_{3}),$	$\displaystyle(\neg x_{1}\veebar x_{2}\veebar\neg x_{3}),$
$\displaystyle(\neg x_{1}\veebar\neg x_{2}\veebar x_{3}),$	$\displaystyle(\neg x_{1}\veebar\neg x_{2}\veebar\neg x_{3})$

and similarly for larger $k$ . The removal of any one clause in such a problem renders it satisfiable. We observe that, when each clause is required to have truth value exactly 1 but propositions are allowed to have any truth value, socratic correctly determines the problem to be

1) unsatisfiable in Gödel logic,
2) satisfiable in Gödel logic when dropping any one clause,
3) trivially satisfiable in Łukasiewicz logic with, e.g., $x_{i}=.5$ ,
4) again unsatisfiable in Łukasiewicz logic when propositions are required to have truth values in range either $\left[0,\frac{1}{k}\right)$ or $\left(\frac{k-1}{k},1\right]$ ,
5) and yet again satisfiable in Łukasiewicz logic with constrained propositions when dropping any one clause.
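The five observations can be reproduced on small instances by direct evaluation. The sketch below is illustrative only and does not use socratic. It assumes the standard semantics: $\veebar$ is max in Gödel logic and the clamped sum in Łukasiewicz logic, and Gödel negation sends 0 to 1 and every other value to 0. The helper names and the finite grids are assumptions; a grid search is a sanity check, not a decision procedure.

```python
from fractions import Fraction
from itertools import product


def clauses(k):
    """All 2**k clauses over x_1..x_k; True marks a negated literal."""
    return list(product((False, True), repeat=k))


def godel_not(x):
    # Assumed standard Goedel negation: 1 at 0, and 0 everywhere else.
    return Fraction(1) if x == 0 else Fraction(0)


def clause_value(signs, xs, logic):
    """Truth value of one disjunctive clause under the chosen logic."""
    if logic == "godel":
        return max(godel_not(x) if neg else x for neg, x in zip(signs, xs))
    # Lukasiewicz: negation is 1 - x, disjunction is the sum clamped at 1.
    return min(Fraction(1), sum(1 - x if neg else x for neg, x in zip(signs, xs)))


def satisfied(cls, xs, logic):
    """True if every clause has truth value exactly 1."""
    return all(clause_value(c, xs, logic) == 1 for c in cls)


k = 3
cls = clauses(k)
grid = [Fraction(i, 4) for i in range(5)]
near_bool = [Fraction(0), Fraction(1, 4), Fraction(3, 4), Fraction(1)]

print(any(satisfied(cls, xs, "godel") for xs in product(grid, repeat=k)))
print(satisfied(cls[:-1], [Fraction(1)] * k, "godel"))
print(satisfied(cls, [Fraction(1, 2)] * k, "lukasiewicz"))
print(any(satisfied(cls, xs, "lukasiewicz") for xs in product(near_bool, repeat=k)))
```

For $k=3$ the values in `near_bool` lie in $\left[0,\frac{1}{3}\right)$ or $\left(\frac{2}{3},1\right]$ , matching the constraint in item 4.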

We observe that Gödel logic is much slower than Łukasiewicz logic as implemented in socratic, likely because it performs mins and maxes between many arguments throughout, while Łukasiewicz logic instead performs sums, with simpler mins and maxes serving as clamps to the $[0,1]$ range. Interestingly, the runtime difference between the unsatisfiable and satisfiable problems in Gödel logic is significant for larger $k$ ; while the satisfiable problems have one fewer clause, this is more likely explained by socratic finding a feasible solution quickly. On the other hand, the unsatisfiable and satisfiable problems (with constrained propositions) take roughly the same amount of time for Łukasiewicz logic, though the trivially satisfiable problem is quicker. The apparent exponential increase in runtime is partially explained by the fact that each larger problem has twice as many clauses, but runtime appears to grow by slightly more than a factor of 2 with each increment of $k$ .

Table 1: $k$ -SAT runtimes in seconds for socratic with different experimental configurations. The five columns pertain to items 1 through 5 above. The problem is unsatisfiable in classical and Gödel logic, satisfiable in Gödel logic after removing a clause at random, trivially satisfiable in Łukasiewicz logic with, e.g., $x_{i}=.5$ , unsatisfiable in Łukasiewicz logic if propositions are required to lie in ranges sufficiently close to 0 and 1, and again satisfiable in Łukasiewicz logic with constrained propositions when removing a clause at random.

	Gödel	Gödel	Łuka.	Łuka.	Łuka.
$k$	unsat.	satisf.	trivial	unsat.	satisf.
3	.012	.011	.014	.019	.014
4	.022	.020	.022	.031	.033
5	.054	.043	.041	.047	.043
6	.121	.107	.064	.104	.098
7	.204	.255	.173	.167	.206
8	.404	.414	.273	.286	.308
9	.861	.881	.507	.539	.554
10	5.46	1.99	1.03	1.11	1.17
11	18.000	4.34	2.09	2.44	2.21
12	33.300	10.900	4.36	5.06	5.01
13	119.000	25.800	8.72	12.400	12.300
14	696.000	71.000	18.400	38.000	35.600

Hájek tautologies.

Hájek lists many axioms and tautologies pertaining to a system of logic he describes as basic logic (BL), consistent with any t-norm logic, as well as a number of tautologies specific to Łukasiewicz and Gödel logic, all of which should have truth value exactly 1. We implement these tautologies in socratic and test whether the empty theory can entail them with truth value 1 in their respective logics. The BL tautologies are divided into batches pertaining to specific operations and properties:

axioms: 8 tests, e.g., $(\varphi\Rightarrow\psi)\Rightarrow((\psi\Rightarrow\chi)\Rightarrow(\varphi\Rightarrow\chi))$
implication: 3 tests, e.g., $\varphi\Rightarrow(\psi\Rightarrow\varphi)$
conjunction: 6 tests, e.g., $(\varphi\otimes(\varphi\Rightarrow\psi))\Rightarrow\psi$
weak_conjunction: 7 tests, e.g., $(\varphi\mathbin{\&}\psi)\Rightarrow\varphi$
weak_disjunction: 7 tests, e.g., $\varphi\Rightarrow(\varphi\veebar\psi)$
negation: 8 tests, e.g., $\varphi\Rightarrow(\neg\varphi\Rightarrow\psi)$
associativity: 6 tests, e.g., $(\varphi\mathbin{\&}(\psi\mathbin{\&}\chi))\Rightarrow((\varphi\mathbin{\&}\psi)\mathbin{\&}\chi)$
equivalence: 9 tests, e.g., $((\varphi\equiv\psi)\otimes(\psi\equiv\chi))\Rightarrow(\varphi\equiv\chi)$
distributivity: 8 tests, e.g., $(\varphi\otimes(\psi\veebar\chi))\equiv((\varphi\otimes\psi)\veebar(\varphi\otimes\chi))$
delta_operator: 3 tests, e.g., $\Delta\varphi\equiv\Delta(\varphi\otimes\varphi)$

In addition, there are logic-specific batches of tautologies:

lukasiewicz: 12 tests, e.g., $\neg\neg\varphi\equiv\varphi$
godel: 5 tests, e.g., $\varphi\Rightarrow(\varphi\otimes\varphi)$

Each of the above BL batches completes successfully for both logics, and each of the logic-specific batches completes for its respective logic and, as expected, fails for the other logic. The runtime of individual tests is negligible; the entire test suite of 82 tautologies run on both logics completes in just 2.911 seconds.
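The logic-specific behavior can be illustrated numerically for the two example tautologies. The sketch below is independent of socratic and only samples a finite grid, so it can refute a candidate tautology but cannot prove one. It assumes the standard Łukasiewicz and Gödel definitions of $\Rightarrow$ and $\otimes$ , defines $\neg\varphi$ as $\varphi\Rightarrow 0$ , and tests only the direction $\neg\neg\varphi\Rightarrow\varphi$ of the equivalence. All function names are hypothetical.

```python
from fractions import Fraction

GRID = [Fraction(i, 10) for i in range(11)]


def luk_imp(a, b):
    return min(Fraction(1), 1 - a + b)


def luk_and(a, b):
    return max(Fraction(0), a + b - 1)


def god_imp(a, b):
    return Fraction(1) if a <= b else b


def god_and(a, b):
    return min(a, b)


def neg(imp, a):
    # Negation as implication to falsity (assumed definition).
    return imp(a, Fraction(0))


def double_negation(imp, conj, p):
    """Value of (not not p) => p."""
    return imp(neg(imp, neg(imp, p)), p)


def idempotence(imp, conj, p):
    """Value of p => (p (x) p)."""
    return imp(p, conj(p, p))


def holds_on_grid(formula, imp, conj):
    """True if the formula has value exactly 1 at every grid point."""
    return all(formula(imp, conj, p) == 1 for p in GRID)


print(holds_on_grid(double_negation, luk_imp, luk_and))
print(holds_on_grid(double_negation, god_imp, god_and))
print(holds_on_grid(idempotence, god_imp, god_and))
print(holds_on_grid(idempotence, luk_imp, luk_and))
```

At $\varphi=.5$ , for instance, $\neg\neg\varphi\Rightarrow\varphi$ evaluates to .5 in Gödel logic, and $\varphi\Rightarrow(\varphi\otimes\varphi)$ evaluates to .5 in Łukasiewicz logic, so each formula fails in the other logic.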

Boolean logic.

We consider a formula $\sigma$ defined as

(\varphi\Rightarrow\psi)\Rightarrow((\neg\varphi\Rightarrow\psi)\Rightarrow\psi)

(16)

which is valid in classical logic but is not entailed with truth value 1 by the empty theory in either Łukasiewicz or Gödel logic. Conversely, constraining propositions $\varphi$ and $\psi$ to have classical truth values by introducing the sentences

	$\displaystyle(\varphi;[0,0]\cup[1,1]),$
	$\displaystyle(\psi;[0,0]\cup[1,1])$

into the theory succeeds in entailing $\sigma$ in either logic. Indeed, if even one of these sentences is added, $\sigma$ is entailed, but no looser intervals around 0 and 1 can entail $\sigma$ if both propositions are non-Boolean. In the other direction, Łukasiewicz logic with unconstrained propositions entails the sentence $(\sigma;[.5,1])$ , i.e., $\sigma$ with truth value bounded below by .5, while Gödel logic with unconstrained propositions cannot entail $\sigma$ with any interval tighter than $[0,1]$ . As a final example of the interaction between truth value intervals, Gödel logic entails $(\sigma;[t,1])$ for a lower bound truth value $t$ if either of $\varphi$ or $\psi$ is constrained in the theory to have set of candidate truth values $\{0\}\cup[t,1]$ .
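The bounds on $\sigma$ can be explored by evaluating the formula over a grid of truth values. This is a rough sketch rather than the entailment check performed by socratic. It assumes the standard Łukasiewicz and Gödel implications with $\neg\varphi$ defined as $\varphi\Rightarrow 0$ , and the names `sigma`, `min_sigma`, and the grid size `N` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def luk_imp(a, b):
    return min(Fraction(1), 1 - a + b)


def god_imp(a, b):
    return Fraction(1) if a <= b else b


def sigma(imp, p, q):
    """Value of (p => q) => ((not p => q) => q), with not p taken as p => 0."""
    not_p = imp(p, Fraction(0))
    return imp(imp(p, q), imp(imp(not_p, q), q))


N = 20  # assumed grid resolution
GRID = [Fraction(i, N) for i in range(N + 1)]
BOOL = [Fraction(0), Fraction(1)]


def min_sigma(imp, ps, qs):
    """Smallest value of sigma over the given candidate values."""
    return min(sigma(imp, p, q) for p, q in product(ps, qs))


print(min_sigma(luk_imp, GRID, GRID))   # grid minimum in Lukasiewicz logic
print(min_sigma(god_imp, GRID, GRID))   # grid minimum in Goedel logic
print(min_sigma(luk_imp, BOOL, GRID), min_sigma(god_imp, GRID, BOOL))
```

On this grid the Łukasiewicz minimum is $\frac{1}{2}$ , attained at $\varphi=\psi=\frac{1}{2}$ , while the Gödel minimum equals the smallest positive grid value $\frac{1}{N}$ and so shrinks toward 0 as the grid is refined. Restricting either proposition to $\{0,1\}$ gives value 1 throughout in both logics.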

Stress test.

We consider the experimental configuration for Boolean logic above, now with query $(\sigma;S)$ for $S$ consisting of 10000 open intervals $(\frac{1}{k+1},\frac{1}{k})$ for $k$ from 2 to 10000 plus the closed interval $[.5,1]$ , and with sentences $(\varphi;S^{\prime})$ and $(\psi;S^{\prime})$ for $S^{\prime}$ consisting of 10000 open intervals $(1-\frac{1}{k},1-\frac{1}{k+1})$ plus the closed interval $[0,0]$ . We observe the runtime of socratic to be just 11.8 seconds for Gödel logic and 9.38 seconds for Łukasiewicz logic. If we instead use closed intervals throughout, the measured runtimes are 17.4 seconds for Gödel logic and 9.29 seconds for Łukasiewicz logic.

7 Dealing with weights

In some circumstances, such as logical neural networks (4), weights are assigned to subformulas, where the weight is intended to reflect the influence, or importance, of the subformula. Each weight is a real number. For example, in the formula $\sigma_{1}\veebar\sigma_{2}$ , the weight $w_{1}$ might be assigned to $\sigma_{1}$ and the weight $w_{2}$ assigned to $\sigma_{2}$ . If $0<w_{1}=2w_{2}$ , this might indicate that $\sigma_{1}$ is twice as important as $\sigma_{2}$ in evaluating the value of $\sigma_{1}\veebar\sigma_{2}$ .

As an example of a possible way to incorporate weights, assume that we are using Łukasiewicz real-valued logic, where the value of $\sigma_{1}\veebar\sigma_{2}$ is $\min\{1,s_{1}+s_{2}\}$ , where $s_{1}$ is the value of $\sigma_{1}$ and $s_{2}$ is the value of $\sigma_{2}$ . If the weights of $\sigma_{1}$ and $\sigma_{2}$ are $w_{1}$ and $w_{2}$ , respectively, and if both $w_{1}$ and $w_{2}$ are non-negative, then we might take the value of $\sigma_{1}\veebar\sigma_{2}$ in the presence of these weights to be $\min\{1,w_{1}s_{1}+w_{2}s_{2}\}$ .
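As a small worked instance of this weighting scheme, take the hypothetical values $s_{1}=0.4$ , $s_{2}=0.3$ , $w_{1}=2$ , and $w_{2}=1$ (so that $0<w_{1}=2w_{2}$ ). These numbers are chosen purely for illustration.

```latex
\begin{align*}
\text{unweighted:}\quad & \min\{1,\, s_{1}+s_{2}\} = \min\{1,\, 0.4+0.3\} = 0.7,\\
\text{weighted:}\quad   & \min\{1,\, w_{1}s_{1}+w_{2}s_{2}\} = \min\{1,\, 2(0.4)+1(0.3)\} = \min\{1,\, 1.1\} = 1.
\end{align*}
```

With these values, doubling the weight of $\sigma_{1}$ is enough to push the disjunction to the clamp at 1.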

We now show how to incorporate weights into our approach. In fact, the ease of incorporating weights and still getting a sound and complete axiomatization is a real advantage of our approach!

To deal with weights, we define an expanded view of what a formula is, defined recursively. Each atomic proposition is a formula. If $\sigma_{1}$ and $\sigma_{2}$ are formulas, $w_{1}$ and $w_{2}$ are weights, and $\alpha$ is a binary connective (such as $\mathbin{\&}$ ) then $(\sigma_{1}\mathop{\alpha}\sigma_{2},w_{1},w_{2})$ is a formula. Here $w_{1}$ is interpreted as the weight of $\sigma_{1}$ and $w_{2}$ as the weight of $\sigma_{2}$ in the formula $\sigma_{1}\mathop{\alpha}\sigma_{2}$ . Also, if $\sigma$ is a formula, $w$ is a weight, and $\rho$ is a unary connective (such as $\neg$ ), then $(\rho\sigma,w)$ is a formula, where $w$ is interpreted as the weight of $\sigma$ . We modify our definition of subformula as follows. The subformulas of $(\sigma_{1}\mathop{\alpha}\sigma_{2},w_{1},w_{2})$ are $\sigma_{1}$ and $\sigma_{2}$ , and the subformula of $(\rho\sigma,w)$ is $\sigma$ .

If $\alpha$ is a binary connective, then $f_{\alpha}$ now has four arguments, rather than two. Thus, $f_{\alpha}(s_{1},s_{2},w_{1},w_{2})$ is the value of the formula $(\sigma_{1}\mathop{\alpha}\sigma_{2},w_{1},w_{2})$ when the value of $\sigma_{1}$ is $s_{1}$ , the value of $\sigma_{2}$ is $s_{2}$ , the weight of $\sigma_{1}$ is $w_{1}$ , and the weight of $\sigma_{2}$ is $w_{2}$ . Also, $f_{\rho}$ now has two arguments rather than one. Thus, $f_{\rho}(s,w)$ is the value of the formula $(\rho\sigma,w)$ when the value of $\sigma$ is $s$ , and the weight of $\sigma$ is $w$ .
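The recursive definition of weighted formulas and the functions $f_{\alpha}$ and $f_{\rho}$ can be sketched as a small evaluator. Everything here is hypothetical and for illustration only: the class names, the connective tables `F_BINARY` and `F_UNARY`, the choice of the weighted Łukasiewicz disjunction from the example above, and the decision to let negation ignore its weight.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class Atom:
    name: str


@dataclass(frozen=True)
class Unary:
    op: str
    sub: "Formula"
    w: float            # weight of the subformula


@dataclass(frozen=True)
class Binary:
    op: str
    left: "Formula"
    right: "Formula"
    w1: float           # weight of the left subformula
    w2: float           # weight of the right subformula


Formula = Union[Atom, Unary, Binary]

# Assumed connective semantics: f_alpha takes (s1, s2, w1, w2), f_rho takes (s, w).
F_BINARY = {"or": lambda s1, s2, w1, w2: min(1.0, w1 * s1 + w2 * s2)}
F_UNARY = {"not": lambda s, w: 1.0 - s}  # assumption: the weight is ignored here


def value(f: Formula, assignment: Dict[str, float]) -> float:
    """Evaluate a weighted formula bottom-up from atomic proposition values."""
    if isinstance(f, Atom):
        return assignment[f.name]
    if isinstance(f, Unary):
        return F_UNARY[f.op](value(f.sub, assignment), f.w)
    return F_BINARY[f.op](value(f.left, assignment), value(f.right, assignment), f.w1, f.w2)


example = Binary("or", Atom("A"), Unary("not", Atom("B"), 1.0), 1.0, 0.5)
print(value(example, {"A": 0.25, "B": 0.5}))
```

Keeping the weights inside the formula nodes mirrors the definition above, in which $(\sigma_{1}\mathop{\alpha}\sigma_{2},w_{1},w_{2})$ is itself a formula.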

The axiom and rules are just as before, except that Rule (7) is changed to:

\mbox{From }(\sigma_{1},\ldots,\sigma_{k};S)\mbox{ infer }(\sigma_{1},\ldots,\sigma_{k};S^{\prime})

(17)

where $S^{\prime}=\{(s_{1},\ldots,s_{k})\colon(s_{1},\ldots,s_{k})\in S$ and (a) $s_{m}=f_{\alpha}(s_{i},s_{j},w_{1},w_{2})$ when $\sigma_{m}$ is $(\sigma_{i}\mathop{\alpha}\sigma_{j},w_{1},w_{2})$ and $\alpha$ is a binary connective, and (b) $s_{j}=f_{\rho}(s_{i},w)$ when $\sigma_{j}$ is $(\rho\sigma_{i},w)$ and $\rho$ is a unary connective $\}$ .
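For a finite set $S$ , Rule (17) amounts to filtering out the tuples that violate conditions (a) and (b). The sketch below makes that concrete under several assumptions: $S$ is a finite set of tuples of exact rationals (in general $S$ may be infinite), indices are 0-based, and the names `apply_rule_17` and `weighted_or` are hypothetical.

```python
from fractions import Fraction


def weighted_or(s1, s2, w1, w2):
    # Weighted Lukasiewicz disjunction from the earlier example (assumed f_alpha).
    return min(Fraction(1), w1 * s1 + w2 * s2)


def apply_rule_17(S, binary=(), unary=()):
    """Return S' for a finite S.

    binary: entries (m, i, j, f_alpha, w1, w2), meaning
            sigma_m = (sigma_i alpha sigma_j, w1, w2)
    unary:  entries (j, i, f_rho, w), meaning sigma_j = (rho sigma_i, w)
    """
    def consistent(s):
        ok_binary = all(s[m] == f(s[i], s[j], w1, w2)
                        for (m, i, j, f, w1, w2) in binary)
        ok_unary = all(s[j] == f(s[i], w) for (j, i, f, w) in unary)
        return ok_binary and ok_unary

    return {s for s in S if consistent(s)}


F = Fraction
# sigma_0 = A, sigma_1 = B, sigma_2 = (A or B, 1, 1/2)
S = {
    (F(1, 4), F(1, 2), F(1, 2)),
    (F(1, 4), F(1, 2), F(3, 4)),
    (F(1), F(1), F(1)),
}
S_prime = apply_rule_17(S, binary=[(2, 0, 1, weighted_or, F(1), F(1, 2))])
print(sorted(S_prime))
```

In this example the tuple with third component $\frac{3}{4}$ is removed, since the weighted disjunction of $\frac{1}{4}$ and $\frac{1}{2}$ with weights 1 and $\frac{1}{2}$ evaluates to $\frac{1}{2}$ .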

We can extend Theorem 3.7 (soundness and completeness) and Theorem 4.3 (closure under Boolean combinations) to deal with our sentences $(\sigma_{1},\ldots,\sigma_{k};S)$ that include weights. The proofs go through just as before, where we use Rule (17) instead of Rule (7). Thus, we obtain the following theorems.

Theorem 7.1.

Our axiom system where Rule (7) is replaced by Rule (17) is sound and complete for sentences of the form $(\sigma_{1},\ldots,\sigma_{k};S)$ that include weights.

Theorem 7.2.

The sentences $(\sigma_{1},\ldots,\sigma_{k};S)$ that include weights are closed under Boolean combinations.

What about the decision procedure in Theorem 6.1? The key step is the use of a polynomial-time algorithm for linear programming. If we were to hold the weights $w_{i}$ as fixed rational constants, and if the weighting functions were linear (such as $w_{1}s_{1}+w_{2}s_{2}$ ), possibly including a min or a max, then we could use linear programming, and the decision procedure would go through in the presence of weights.
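To illustrate why fixed weights keep the constraints linear, the sketch below checks one possible big-M style encoding of $y=\min\{1,w_{1}s_{1}+w_{2}s_{2}\}$ using a single Boolean $b$ . The encoding is an assumption made for illustration and is not claimed to be the one used by socratic. With $t=w_{1}s_{1}+w_{2}s_{2}$ and $M=\max\{0,w_{1}+w_{2}-1\}$ , the constraints are $y\leq 1$ , $y\leq t$ , $y\geq t-Mb$ , and $y\geq b$ , assuming non-negative weights and $s_{1},s_{2}\in[0,1]$ .

```python
from fractions import Fraction
from itertools import product


def feasible_y_interval(s1, s2, w1, w2, b):
    """Interval of y allowed by the four linear constraints for a fixed b, or None."""
    t = w1 * s1 + w2 * s2
    big_m = max(Fraction(0), w1 + w2 - 1)
    lo = max(t - big_m * b, Fraction(b))   # y >= t - M*b and y >= b
    hi = min(Fraction(1), t)               # y <= 1 and y <= t
    return (lo, hi) if lo <= hi else None


def encodes_min(s1, s2, w1, w2):
    """True if the feasible y values over b in {0, 1} are exactly min(1, t)."""
    target = min(Fraction(1), w1 * s1 + w2 * s2)
    intervals = []
    for b in (0, 1):
        interval = feasible_y_interval(s1, s2, w1, w2, b)
        if interval is not None:
            intervals.append(interval)
    return bool(intervals) and all(iv == (target, target) for iv in intervals)


GRID = [Fraction(i, 4) for i in range(5)]
WEIGHTS = [(Fraction(1), Fraction(1)), (Fraction(2), Fraction(1)),
           (Fraction(1, 2), Fraction(1, 4)), (Fraction(1), Fraction(1, 2))]

print(all(encodes_min(s1, s2, w1, w2)
          for (w1, w2) in WEIGHTS
          for s1, s2 in product(GRID, repeat=2)))
```

Because $w_{1}$ and $w_{2}$ are constants, every constraint is linear in $s_{1}$ , $s_{2}$ , $y$ , and $b$ ; if the weights were variables, the products $w_{i}s_{i}$ would make the constraints nonlinear.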

8 Related work

Rosser (12) comments on the possibility of considering formulas whose value is guaranteed to be at least $\alpha$ . For example, in Łukasiewicz logic, if we consider weak disjunction $\underline{\veebar}$ , where $f_{\underline{\veebar}}(s_{1},s_{2})=\max(s_{1},s_{2})$ , then the real value of $A\underline{\veebar}\neg A$ is always at least 0.5, since $f_{\neg}(s)=1-s$ . But Rosser rejects this approach, since he notes that there are uncountably many choices for $\alpha$ , but only countably many recursively enumerable sets (and an axiomatization would give a recursively enumerable set of valid formulas).

Belluce (13) investigates when the set of formulas with values at least $\alpha$ is recursively enumerable. Font et al. (14) consider the question of what they call “preservation of degrees of truth”. They give a method for deciding, for a fixed $\alpha$ , if $\sigma$ having a value at least $\alpha$ implies that $\varphi$ has value at least $\alpha$ .

Novák (15) considered a logic with sentences that assign a real value to each formula of first-order real-valued logic. Thus, using our notation, his sentences would be of the form $(\varphi;\{\alpha\})$ , where $\varphi$ is a formula in first-order real-valued logic, and $\alpha$ is a single real value. He gave a sound and complete axiomatization.

An interesting logic is the rational Pavelka logic RPL, an expansion of the standard Łukasiewicz logic where rational truth-constants are allowed in formulas. For example, if $r$ is a rational number, then the formula $r\rightarrow\varphi$ says that the value of $\varphi$ is at least $r$ , and the formula $\varphi\rightarrow r$ says that the value of $\varphi$ is at most $r$ . Therefore, this logic can express the MD-sentences $(\varphi;S)$ , when $S$ is the union of a finite number of closed intervals. However, it cannot express strict inequalities. For example, it cannot express that the value of $\varphi$ is strictly greater than 0.5. This follows from the stronger fact that if $A_{1},\ldots,A_{r}$ are the atomic propositions, $\varphi$ is a formula, and $G$ is the set of all value assignments to the atomic propositions that give $\varphi$ the truth value 1, then since the operators of standard Łukasiewicz logic are continuous (and so the value of $\varphi$ is a continuous function of the values of the atomic propositions), it follows that $\{(g(A_{1}),\cdots,g(A_{r})):\ g\in G\}$ is a closed subset of $[0,1]^{r}$ . Note that if $r=0.5$ , then even though the formula $A\rightarrow r$ has the value 1 when the value $a$ of $A$ is at most 0.5, the negation $\neg(A\rightarrow r)$ does not have the value 1 when $a>0.5$ ; instead it has the value $a-0.5$ . RPL was introduced by Hájek in (10) as a simplification of the system proposed by Pavelka in (16), in which the syntax contained a truth-constant for each real number of the interval $[0,1]$ . Hájek showed that an analogous logic could be presented as an expansion of Łukasiewicz propositional logic with truth-constants only for the rational numbers in $[0,1]$ and gave a corresponding completeness theorem. Moreover, first-order fuzzy logics with real or rational constants have also been deeply studied, starting from Novák’s extension of Pavelka’s logic to a first-order predicate language in (17) (see, e.g., (18)).

Each of (19), (20), and (21) gives a decision procedure that partially covers the situation we allow in Section 6. The first two support only Łukasiewicz logic. The third, like our decision procedure, works for a variety of logics, though it is explicitly established in (21) that its approach does not support discontinuous operators. Accordingly, unlike our decision procedure, that approach does not work for Gödel logic, given its discontinuous $\rightarrow$ operator.

9 Conclusions

We give a sound and strongly complete axiomatization for sentences about real-valued formulas. By being parameterized, our axiomatization covers essentially all real-valued logics. Our axiomatization allows us to include weights on formulas. The results give us a way to establish soundness and completeness for neuro-symbolic systems that aim or purport to perform logical inference with real values. The algorithm described gives us a constructive existence proof and a baseline approach for well-founded inference. Because LNNs (4) are exactly a weighted real-valued logical system implemented in neural network form, an important immediate upshot of our results for the weighted case is that they provide provably sound and complete logical inference for LNNs. To our knowledge, such a result has not previously been established for any neuro-symbolic approach. While our main motivation was to pave the way forward for neuro-symbolic systems, our results are fundamental, filling a long-standing gap in a very old literature, and can be applied well beyond AI.

\acknow

We are very grateful to Marco Carmosino, who improved the writing in this paper by giving us many helpful comments. We are also grateful to Guillermo Badia, Ken Clarkson, Didier Dubois, Phokion Kolaitis, Carles Noguera, and Henri Prade for helpful comments. Finally, we are grateful to Lluis Godo for confirming the novelty of our approach, and for helpful comments.

\showacknow

References

(1) L Serafini, Ad Garcez, Logic tensor networks: Deep learning and logical reasoning from data and knowledge. arXiv preprint arXiv:1606.04422 (2016).
(2) SH Bach, M Broecheler, B Huang, L Getoor, Hinge-loss Markov random fields and probabilistic soft logic. The Journal of Machine Learning Research 18, 3846–3912 (2017).
(3) W Cohen, F Yang, KR Mazaitis, TensorLog: A probabilistic database implemented using deep-learning infrastructure. Journal of Artificial Intelligence Research 67, 285–325 (2020).
(4) R Riegel, et al., Logical neural networks. arXiv preprint arXiv:2006.13155 (2020).
(5) G Boole, An investigation of the laws of thought: on which are founded the mathematical theories of logic and probabilities. (Walton and Maberly) Vol. 2, (1854).
(6) LA Zadeh, Fuzzy logic and approximate reasoning. Synthese 30, 407–428 (1975).
(7) V Novák, A formal theory of intermediate quantifiers. Fuzzy Sets and Systems 159, 1229–1246 (2008).
(8) G Epstein, Multiple-valued logic design: an introduction. (CRC Press), (1993).
(9) R Fagin, A Lotem, M Naor, Optimal aggregation algorithms for middleware. Journal of Computer and System Sciences 66, 614–656 (2003).
(10) P Hájek, Metamathematics of fuzzy logic. (Springer Science & Business Media) Vol. 4, (1998).
(11) N Rescher, Many-valued logic. (McGraw-Hill), (1969).
(12) JB Rosser, Axiomatization of infinite valued logics. Logique et Analyse 3, 137–153 (1960).
(13) L Belluce, Further results on infinite valued predicate logic. J. Symbolic Logic 29, 69–78 (1964).
(14) JM Font, ÀJ Gil, A Torrens, V Verdú, On the infinite-valued Łukasiewicz logic that preserves degrees of truth. Arch. Math. Logic 45, 835–868 (2006).
(15) V Novák, Fuzzy logic with extended syntax. Handbook of Mathematical Fuzzy Logic 3, 1063–1104 (2015).
(16) J Pavelka, On fuzzy logic I, II, III. Zeitschrift für Mathematische Logik und Grundlagen der Mathematik 29, 45–52, 119–134, 447–464 (1979).
(17) V Novák, On the syntactico-semantical completeness of first-order fuzzy logic part I (syntax and semantic), part II (main results). Kybernetika 26, 47–66, 134–154 (1990).
(18) F Esteva, L Godo, C Noguera, First-order t-norm based fuzzy logics with truth-constants: distinguished semantics and completeness properties. Annals of Pure and Applied Logic 161, 185–202 (2009).
(19) G Beavers, Automated theorem proving for Łukasiewicz logics. Studia Logica 52, 183–195 (1993).
(20) D Mundici, A constructive proof of McNaughton’s theorem in infinite-valued logic. The Journal of Symbolic Logic 59, 596–602 (1994).
(21) R Hähnle, Many-valued logic and mixed integer programming. Annals of Mathematics and Artificial Intelligence 12, 231–263 (1994).
