
Significance Statement: This work is a step in the direction of explainable AI as it pertains to logical inference in neural networks. This may ultimately assist in preventing unfair, unwarranted, or otherwise undesirable outcomes from the application of modern AI methods.

Corresponding author: To whom correspondence should be addressed. E-mail: fagin@us.ibm.com

Foundations of Reasoning with Uncertainty via Real-valued Logics

Ronald Fagin IBM Research Ryan Riegel IBM Research Alexander Gray IBM Research
Abstract

Real-valued logics underlie an increasing number of neuro-symbolic approaches, though typically their logical inference capabilities are characterized only qualitatively. We provide foundations for establishing the correctness and power of such systems. We give a sound and strongly complete axiomatization that can be parametrized to cover essentially every real-valued logic, including all the common fuzzy logics. Our class of sentences is very rich, and each sentence describes a set of possible real values for a collection of formulas of the real-valued logic, including which combinations of real values are possible. Strong completeness allows us to derive exactly what information can be inferred about the combinations of real values of a collection of formulas given information about the combinations of real values of several other collections of formulas. We then extend the axiomatization to deal with weighted subformulas. Finally, we give a decision procedure based on linear programming for deciding, for certain real-valued logics and under certain natural assumptions, whether a set of our sentences logically implies another of our sentences.

Keywords: real-valued logic | strongly complete axiomatization


Recent years have seen growing interest in approaches for augmenting the capabilities of learning-based methods with those of reasoning, often broadly referred to as neuro-symbolic (though they may not be strictly neural). One of the key goals that neuro-symbolic approaches have at their root is logical inference, or reasoning. However, the representation of classical 0-1 logic (where truth values of sentences are either 0, representing “False”, or 1, representing “True”) is generally insufficient for this goal, because representing uncertainty is essential to AI. In order to merge with the ideas of neural learning, the truth values dealt with must be real-valued (we shall take these to be real numbers in the interval [0,1], where intuitively, 0 means “completely false” and 1 means “completely true”), whether the uncertainty semantics are those of probabilities, subjective beliefs, neural network activations, or fuzzy set memberships. For this reason, many major approaches have turned to real-valued logics. Logic tensor networks (1) define a logical language on real-valued vectors corresponding to groundings of terms computed by a neural network, which can use any of the common real-valued logics (e.g., Łukasiewicz, product, or Gödel logic) for its connectives (e.g., &, ⊻, ¬, and →). Probabilistic soft logic (2) draws a correspondence between its approach, based on Markov random fields (MRFs), and satisfiability of statements in a real-valued logic (Łukasiewicz). Tensorlog (3), also based on MRFs but implemented in neural network frameworks, draws a correspondence of its approach to the use of connectives in a real-valued logic (product). Logical neural networks (LNNs) (4) draw a correspondence between activation functions of neural networks and connectives in real-valued logics. To complete a full correspondence between neural networks and statements in real-valued logic, LNN defines a class of real-valued logics allowing weighted inputs, which represent the relative influence of subformulas. While reasoning is widely regarded as fundamental to the goal of AI, claims about the reasoning capabilities of the aforementioned systems are typically made qualitatively rather than quantitatively and mathematically. While learning theory (roughly, what it means to perform learning) is well articulated and, for 0-1 logic, what it means to perform reasoning is well studied, reasoning is surprisingly not well formalized for real-valued logics. As reasoning becomes an increasing goal of learning-based work, it becomes important to have a solid mathematical footing for it.

Formalization of the idea of real-valued logics is old and fundamental, going back to the origins of formal logic. It is not well known that Boole himself invented a probabilistic logic in the 19th century (5), where formulas were assigned real values corresponding to probabilities. Real-valued logic was used in AI to model the semantics of vague concepts for commonsense reasoning by expert systems (6). It is used in linguistics to model certain natural language phenomena (7), in hardware design to deal with multiple stable voltage levels (8), and in databases to deal with queries that are composed of multiple graded notions, such as the redness of an object, that can range from 0 (“not at all red”) to 1 (“completely red”) (9). Despite all this, while definitions of logical correctness and power (generally, soundness and completeness) are well established and corresponding procedures for theorem proving with those properties are abundant for classical logics, the equivalents for real-valued logics (where the values can be arbitrary real numbers between 0 and 1) are rather limited.

This paper.

In this paper, there are two levels of logic. In the “inner” layer, we have formulas of the real-valued logic with its logical connectives. In this inner layer, we shall use & for “and” and ⊻ for “or”, as is done in (10). In the “outer” layer, we have a novel class of sentences about the inner real-valued logic (such as saying which truth values a given real-valued formula may attain). For these sentences (which take on only the classical values 0 and 1, for False and True, respectively), we make use of the traditional logical symbols ∧ for “and” and ∨ for “or”. We remark that, somewhat confusingly, the symbols ∧ and ∨ are often used in real-valued logics for weaker versions of “and” and “or” than those given by & and ⊻, which we have no need to discuss in this paper.

Let us say that an axiomatization of a logic is strongly complete if whenever Γ is a finite set of sentences in the (outer) logic and γ is a single sentence in the (outer) logic that is a logical consequence of Γ, then there is a proof of γ from Γ using the axiomatization. An axiomatization is weakly complete if this holds for Γ = ∅. That is, an axiomatization is weakly complete if whenever γ is a valid sentence (always true), then there is a proof of γ using the axiomatization. Early axiomatizations of real-valued logics in the literature were typically only weakly complete, but have since been improved to strongly complete ones (see (10) for examples).

We now explain why it is necessary to assume that Γ is finite in the definition of strong completeness. (In our explanation, we make use of ideas from (10).) Let us restrict to Łukasiewicz logic. Let A^k denote A & A & ⋯ & A, where A appears k times. Let Γ be the infinite set of sentences (B → A^k; {1}) for k ≥ 1, along with (A; [0,1)), which says that the value of A is less than 1. Let τ be (B; {0}). We now show that Γ logically implies τ. Assume that Γ holds. Then the value of A is less than 1. It follows from the definition of conjunction in Łukasiewicz logic that there is k such that A^k has value 0. From (B → A^k; {1}), this implies that the value of B is 0, so τ holds. Hence, Γ logically implies τ. Because our proofs are of finite length, there cannot be a proof of τ from Γ, since this would give a proof of τ from a finite subset of Γ, but no finite subset of Γ logically implies τ. A natural open problem is whether we can allow Γ to be infinite if we were to restrict our attention to Gödel logic.
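For concreteness, here is the calculation behind the existence of such a k, a sketch using the Łukasiewicz conjunction max{0, a1 + a2 - 1} recalled later in the introduction: if A has value a, then an easy induction on k shows that A^k has value

max{0, k·a - (k - 1)},

which equals 0 as soon as k ≥ 1/(1 - a); since a < 1, such a k exists.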

We introduce a rich, novel class of sentences.

  1. These sentences can say what the set S of possible values is for a formula σ. This set S can be a singleton {s} (meaning that the real value of σ is s), or S can be an interval, or a union of intervals, or in fact an arbitrary subset of [0,1], e.g., the set of rational numbers in [0,1].

  2. Our sentences can give not only the possible real values of formulas, but the interactions between these values. For example, if σ1 and σ2 are formulas, our sentences can not only say what the possible real values are for each of σ1 and σ2, but also how they interact: thus, if s1 is the real value of σ1 and s2 is the real value of σ2, then there is a sentence in our logic that says (s1, s2) must lie in the set S of ordered pairs, where S is an arbitrary subset of [0,1] × [0,1]. We give a sound and strongly complete axiomatization for our sentences.

  3. Unlike the other axiomatizations mentioned earlier, our axiomatization can be extended to include the use of weights for subformulas (where, for example, in the formula A1 ⊻ A2, the subformula A1 is considered twice as important as the subformula A2).

  4. A surprising feature of our axiomatization is that it is parametrized, so that this one axiomatization is sound and strongly complete for essentially every real-valued logic, including those that do not obey the standard restrictions on fuzzy logics (such as conjunction being commutative). Previous axiomatizations in the literature had a separate set of axioms for each real-valued logic (for example, one of the axioms for Łukasiewicz logic is A ↔ ¬¬A, and one of the axioms for Gödel logic is A ↔ (A & A)). In the axiomatizations mentioned earlier, each connective has a fixed associated function that tells how to evaluate it. For example, in Łukasiewicz logic, the value of A1 & A2 is f&(a1, a2), where a1 is the value of A1 and a2 is the value of A2, and where f&(a1, a2) = max{0, a1 + a2 - 1}. By contrast, for our axiomatization, f& is arbitrary, as long as it maps [0,1]^2 into [0,1].

From now on (except in Section 8 on related work) we use “complete” to mean “strongly complete”.

An especially useful real-valued logic for logical neural nets is Łukasiewicz logic, for several reasons. First, the &, ⊻, ¬, and → operators are essentially linear, in that if a1 is the truth value of a formula A1, and a2 is the truth value of a formula A2, then (a) A1 & A2 has value max{0, a1 + a2 - 1}, (b) A1 ⊻ A2 has value min{1, a1 + a2}, (c) ¬A1 has value 1 - a1, and (d) A1 → A2 is equivalent to ¬A1 ⊻ A2, and so has value min{1, 1 - a1 + a2}. (The versions of & and ⊻ we describe here are sometimes called in the literature strong conjunction and strong disjunction. Weak conjunction is given by min{a1, a2} and weak disjunction is given by max{a1, a2}.) Second, it is easy to incorporate weights. Thus, if w1 and w2 are nonnegative weights of A1 and A2, respectively, then we can take the weighted value of A1 ⊻ A2 to be min{1, w1·a1 + w2·a2}.
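As an illustration, here is a minimal sketch of these truth functions in Python; the function names are ours and purely illustrative.

# Sketch of the Łukasiewicz truth functions described above.
# Values are assumed to lie in [0, 1]; names are illustrative only.

def luk_and(a1, a2):          # strong conjunction
    return max(0.0, a1 + a2 - 1.0)

def luk_or(a1, a2):           # strong disjunction
    return min(1.0, a1 + a2)

def luk_not(a1):              # negation
    return 1.0 - a1

def luk_implies(a1, a2):      # implication, equivalent to (not a1) or a2
    return min(1.0, 1.0 - a1 + a2)

def luk_weighted_or(a1, a2, w1, w2):   # weighted disjunction with weights w1, w2 >= 0
    return min(1.0, w1 * a1 + w2 * a2)

# Example: with a1 = 0.7 and a2 = 0.5,
# strong conjunction gives 0.2 and strong disjunction gives 1.0.
assert abs(luk_and(0.7, 0.5) - 0.2) < 1e-9
assert luk_or(0.7, 0.5) == 1.0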

Throughout this paper, we take the domain of each function in the real-valued logic to be [0,1] or [0,1] × [0,1] and the range to be [0,1]. This is a common assumption for many real-valued logics, but all of our results go through with obvious modifications if the domains are D^k for possibly multiple choices of arity k and the range is D, for arbitrary subsets D of the reals. We note that real-valued logic can be viewed as a special case of multi-valued logic (11), although in multi-valued logic there is typically a finite set of truth values, not necessarily linearly ordered.

We also provide a decision procedure for deciding whether a set of our sentences logically implies another of our sentences, for certain real-valued logics, under certain natural assumptions. We implemented the decision algorithm, dubbed SoCRAtic logic (for Sound and Complete Real-valued Axiomatic logic), which we describe in detail and make available in source code.

Our sentences allow arbitrary real-valued logics, as does our sound and complete axiomatization, but our decision procedure depends heavily on the choice of real-valued logic, and in particular is tailored towards Łukasiewicz and Gödel logic. This is because a key portion of our decision procedure is linear programming, and we depend on the essentially linear nature of Łukasiewicz logic and the ease of dealing with min and max in Gödel logic.

Overview.

Until the final section, we do not allow weights. In Section 1, we give our basic notions, including what a model is and what a sentence is; in particular, we define our sentences to be of the form (σ1, …, σk; S), where the σi are formulas, where S is a set of tuples (s1, …, sk), and where the sentence says that if the value of each σi is si, for 1 ≤ i ≤ k, then (s1, …, sk) ∈ S. In Section 2, we give our (only) axiom and our inference rules. In Section 3, we give our soundness and completeness theorem. In Section 4, we give a theorem that says that our sentences are closed under Boolean combinations. This helps show robustness of our class of sentences. In Section 5, we discuss possible simplifications of our sentences. In Section 6, we give the decision algorithm. In Section 7, we show how to extend our methodology to incorporate weights. In Section 8, we discuss related work. In the Conclusions, we review the implications for neuro-symbolic approaches.

1 Models and sentences

We assume a finite set of atomic propositions. These can be thought of as the leaves of a neural net, i.e., nodes with no inputs from other neurons. A model M is an assignment g^M of values to the atomic propositions. Thus, M assigns a value g^M(A) ∈ [0,1] to each atomic proposition A.

Let F be the set of logical formulas over the atomic propositions, where we allow arbitrary finite sets of binary and unary connectives. Typical binary connectives are conjunction (denoted by &), disjunction (denoted by ⊻), and implication (denoted by →). Typical unary connectives are negation (denoted by ¬) and a delta function (denoted by △). Sometimes ¬x is taken to be 1 - x, and △x is taken to be defined by △x = 1 if x = 1 and 0 otherwise.

When considering only formulas with value 1, as most other works do when giving sound axiomatizations of real-valued logics, the convention is to consider a sentence to be simply a member of FF. What if we want to take into account values other than 1?

We take a sentence to be an expression of the form (σ1, …, σk; S), where σ1, …, σk are in F, and where S ⊆ [0,1]^k. The intuition is that (σ1, …, σk; S) says that if the value of each σi is si, for 1 ≤ i ≤ k, then (s1, …, sk) ∈ S. We refer to our sentences as multi-dimensional sentences, or MD-sentences for short. For a fixed k, we refer to the MD-sentence (σ1, …, σk; S) as k-dimensional. The class of MD-sentences is robust. In particular, Theorem 4.3 says that MD-sentences are closed under Boolean combinations. We give a sound and (strongly) complete axiomatization that is parametrized to deal with an arbitrary fixed real-valued logic. This axiomatization allows us to derive exactly what information can be inferred about the combinations of values of a collection of formulas given information about the combinations of values of several other collections of formulas.

Note that we are not saying that the logic is multi-dimensional (which could mean that the values taken on by variables are vectors, not just numbers), but instead we are saying that the sentences in our "outer" logic are multi-dimensional. The "inner" logic we work with in this paper is real-valued, and real-valued logic has been heavily studied. What is novel in our paper are our multi-dimensional sentences.

Note that the set S in (σ1, …, σk; S) can be undecidable, even if k = 1 and every member of S is a rational number. For example, we could then take S to be the set of all numbers 1/k, where k is the Gödel number of a halting Turing machine. But our decision procedures involve only special sets S. Thus, we shall say in Section 6 that a sentence (σ1, …, σk; S) is interval-based if S is of the form S1 × ⋯ × Sk, where each Si is the union of a finite number of intervals with rational endpoints. And our decision procedure in that section deals with interval-based sentences. However, our sound and complete axiomatization in Section 3 makes no such assumptions about the sets S; in particular, the sets S can be undecidable.

For convenience, we assume throughout that in the sentence (σ1, …, σk; S), we have that σi and σj are different formulas if i ≠ j. We refer to σ1, …, σk as the components of (σ1, …, σk; S), and S as the information set of (σ1, …, σk; S).

Let γ be the sentence (σ1, …, σk; S). We now say what it means for a model M to satisfy γ. For 1 ≤ i ≤ k, let si be the value of the formula σi under the assignment of values to the atomic propositions given by the model M. We say that M satisfies γ if (s1, …, sk) ∈ S. We then say that M is a model of γ, and we write M ⊨ γ. Note that if (σ1, …, σk; S) is satisfiable, that is, has a model, then S ≠ ∅.
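To make the satisfaction relation concrete, the following is a small Python sketch; Gödel truth functions are chosen arbitrarily for the inner logic, S is represented as a membership test, and all names are illustrative.

# Sketch: evaluating formulas under a model and checking M ⊨ (σ1, …, σk; S).
# Formulas are nested tuples such as ('and', 'A', ('not', 'B')); the model
# assigns a value in [0, 1] to each atomic proposition.

GODEL = {                     # one arbitrary choice of real-valued logic
    'and': lambda a, b: min(a, b),
    'or':  lambda a, b: max(a, b),
    'not': lambda a: 1.0 - a,
}

def value(formula, model, logic=GODEL):
    if isinstance(formula, str):              # atomic proposition
        return model[formula]
    op, *args = formula
    return logic[op](*(value(arg, model, logic) for arg in args))

def satisfies(model, sentence, logic=GODEL):
    formulas, S = sentence                    # S is given here as a predicate on tuples
    return S(tuple(value(f, model, logic) for f in formulas))

# Example: the sentence (A, A & B; S) with S = {(s1, s2) : s2 <= s1}
# is satisfied by every model, since min(a, b) <= a.
sentence = (['A', ('and', 'A', 'B')], lambda s: s[1] <= s[0])
print(satisfies({'A': 0.8, 'B': 0.3}, sentence))   # True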

2 Axioms and inference rules

We have only one axiom:

(σ; [0,1]) (1)

Axiom (1) guarantees that all values are in [0,1].

We now give our inference rules.

If π is a permutation of 1, …, k, then:

From (σ1, …, σk; S) infer (σπ(1), …, σπ(k); S′) (2)

where S′ = {(sπ(1), …, sπ(k)) : (s1, …, sk) ∈ S}.

Rule (2) simply permutes the order of the components.

Our next inference rule is:

From (σ1, …, σk; S) infer (3)

(σ1, …, σk, σk+1, …, σm; S × [0,1]^(m-k)).

Rule (3) extends (σ1, …, σk; S) to include σk+1, …, σm with no nontrivial information being given about the new components.

Our next inference rule is:

From (σ1, …, σk; S1) and (σ1, …, σk; S2) infer (4)

(σ1, …, σk; S1 ∩ S2).

Rule (4) enables us to join the information in (σ1, …, σk; S1) and (σ1, …, σk; S2).

Our next inference rule is the following (where 0 < r < k):

From (σ1, …, σk; S) infer (σ1, …, σk-r; S′) (5)

where S′ = {(s1, …, sk-r) : (s1, …, sk) ∈ S}.

Intuitively, S′ is the projection of S onto the first k - r components. Rule (5) enables us to select information about σ1, …, σk-r from information about σ1, …, σk.

Our next inference rule is:

From (σ1, …, σk; S) infer (σ1, …, σk; S′) if S ⊆ S′. (6)

Rule (6) says that we can go from more information to less information. The intuition is that smaller information sets are more informative.
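Rules (2)-(6) are simple set operations on information sets. The following Python sketch, restricted to finite information sets and with illustrative names of our own choosing, makes this explicit.

# Sketch: Rules (2)-(6) viewed as operations on finite information sets,
# where a sentence (σ1, …, σk; S) is represented by its set S of k-tuples.

from itertools import product

def permute(S, pi):                      # Rule (2): reorder components by permutation pi (0-indexed)
    return {tuple(s[pi[i]] for i in range(len(pi))) for s in S}

def extend(S, extra_values, m_minus_k):  # Rule (3): add components carrying no information
    return {s + t for s in S for t in product(extra_values, repeat=m_minus_k)}

def join(S1, S2):                        # Rule (4): combine information about the same components
    return S1 & S2

def project(S, k_minus_r):               # Rule (5): keep only the first k - r components
    return {s[:k_minus_r] for s in S}

def weaken(S, S_prime):                  # Rule (6): allowed whenever S ⊆ S_prime
    assert S <= S_prime
    return S_prime

# Example: projecting {(0.2, 1.0), (0.5, 0.5)} onto its first component gives {(0.2,), (0.5,)}.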

We now give an inference rule that depends on the real-valued logic under consideration. For each real-valued binary connective α, let fα(s1, s2) be the value of σ1 α σ2 when the value of σ1 is s1 and the value of σ2 is s2. For example, in Gödel logic, f&(s1, s2) = min{s1, s2}. For each real-valued unary connective ρ, let fρ(s) be the value of ρσ when the value of σ is s. For example, in Łukasiewicz logic, f¬(s) = 1 - s.

In the sentence (σ1, …, σk; S), let us say that the tuple (s1, …, sk) in S is good if (a) sm = fα(si, sj) whenever σm is σi α σj and α is a binary connective (such as &), and (b) sj = fρ(si) whenever σj is ρσi and ρ is a unary connective (such as ¬). Note that being “good” is a local property of a tuple s in S (that is, it depends only on the tuple s and not on the other tuples in S). Of course, if the real-valued logic under consideration has connectives of higher arity (ternary, etc.), then we would modify the definition of a good tuple in the obvious way. For simplicity, we will assume throughout this paper that we are in the common case where the only connectives of the real-valued logic are unary and binary, although all of our results go through in the general case.

We then have the following inference rule:

From (σ1, …, σk; S) infer (σ1, …, σk; S′) (7)

when S′ is the set of good tuples of S.

Rule (7) is our key rule of inference. Let γ1 be the premise (σ1, …, σk; S) and let γ2 be the conclusion (σ1, …, σk; S′) of Rule (7). As we shall discuss later, γ1 and γ2 are logically equivalent, and S′ is as small as possible so that γ1 and γ2 are logically equivalent.

A simple example of a valid sentence is (A, B, A ⊻ B; S), where S = {(s1, s2, s3) : s1 ∈ [0,1], s2 ∈ [0,1], s3 = f⊻(s1, s2)}. This is derived from the valid sentence (A, B, A ⊻ B; [0,1]^3) by applying Rule (7).
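The following Python sketch illustrates Rule (7) on a discretized grid standing in for [0,1], with Łukasiewicz disjunction as the example connective; the grid and all names are ours and purely illustrative.

# Sketch of Rule (7) on a discretized value grid: keep only the "good" tuples,
# i.e., those in which the value of A ⊻ B equals f_or applied to the values of A and B.

GRID = [i / 10 for i in range(11)]          # stand-in for [0, 1]

def f_or(a, b):                             # Łukasiewicz disjunction
    return min(1.0, a + b)

# Premise: (A, B, A ⊻ B; S) with S = GRID^3 (no information).
S = {(a, b, c) for a in GRID for b in GRID for c in GRID}

# Conclusion of Rule (7): restrict to good tuples.
S_good = {(a, b, c) for (a, b, c) in S if abs(c - f_or(a, b)) < 1e-9}

# Every good tuple now determines the third component from the first two.
assert all(abs(c - f_or(a, b)) < 1e-9 for (a, b, c) in S_good)
print(len(S), len(S_good))                  # 1331 121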

Each of our rules is of the form “From A infer B” or “From A infer B where …”. We refer to A as the premise and B as the conclusion. We need the notion of a subformula of a formula. If α is a binary connective, then the subformulas of σ1 α σ2 are σ1 and σ2. If ρ is a unary connective, then the subformula of ρσ is σ.

Let Γ\Gamma be a set of MD-sentences. We define the closure GG of Γ\Gamma under subformulas as follows. For each sentence (γ1,,γm;S)(\gamma_{1},\ldots,\gamma_{m};S) in Γ\Gamma, the set GG contains γ1,,γm\gamma_{1},\ldots,\gamma_{m}, and for each formula γ\gamma in GG, the set GG contains every subformula of γ\gamma.

In particular, GG contains every atomic proposition that appears inside the components of Γ\Gamma.

3 Soundness and completeness

Let Γ\Gamma be a finite set of MD-sentences, and let γ\gamma be a single MD-sentence. We write Γγ\Gamma\vDash\gamma if every model of Γ\Gamma is a model of γ\gamma. We write Γγ\Gamma\vdash\gamma if there is a proof of γ\gamma from Γ\Gamma, using our axiom system. Soundness says “Γγ\Gamma\vdash\gamma implies Γγ\Gamma\vDash\gamma”. Completeness says “Γγ\Gamma\vDash\gamma implies Γγ\Gamma\vdash\gamma”. In this section, we shall prove that our axiom system is sound and complete for MD-sentences.

We define a special property of certain MD-sentences, that is used in a crucial manner in our completeness proof. Let us say that a sentence (σ1,,σk;S)(\sigma_{1},\ldots,\sigma_{k};S) is minimized if whenever (s1,,sk)S(s_{1},\ldots,s_{k})\in S, then there is a model MM of (σ1,,σk;S)(\sigma_{1},\ldots,\sigma_{k};S) such that for 1ik1\leq i\leq k, the value of σi\sigma_{i} in MM is sis_{i}. Thus, (s1,,sk)S(s_{1},\ldots,s_{k})\in S if and only if there is a model MM of (σ1,,σk;S)(\sigma_{1},\ldots,\sigma_{k};S) such that for 1ik1\leq i\leq k, the value of σi\sigma_{i} in MM is sis_{i}. We use the word “minimized”, since intuitively, SS is as small as possible.

Our proof of completeness makes use of the following lemmas.

Lemma 3.1

Let (σ1,,σk;S)(\sigma_{1},\ldots,\sigma_{k};S) be the premise of Rule (7). Assume that G={σ1,,σk}G=\{\sigma_{1},\ldots,\sigma_{k}\} is closed under subformulas (so that in particular, every atomic proposition that appears in GG is a member of GG). Then the conclusion (σ1,,σk;S)(\sigma_{1},\ldots,\sigma_{k};S^{\prime}) of Rule (7) is minimized.

Proof 3.2.

Let φ\varphi be the conclusion (σ1,,σk;S)(\sigma_{1},\ldots,\sigma_{k};S^{\prime}) of Rule (7). Assume that (s1,,sk)S(s_{1},\ldots,s_{k})\in S^{\prime}. To prove that φ\varphi is minimized, we must show that there is a model MM of φ\varphi such that for 1ik1\leq i\leq k, the value of σi\sigma_{i} in MM is sis_{i}. From the assignment of values to the atomic propositions, as specified by a portion of (s1,,sk)(s_{1},\ldots,s_{k}), we obtain our model MM. For this model MM, the value of each σi\sigma_{i} is exactly that specified by (s1,,sk)(s_{1},\ldots,s_{k}), as we can see by a simple induction on the structure of formulas. Hence, φ\varphi is minimized.

Lemma 3.3.

For each of Rules (2), (3), (4), and (7), the premise and the conclusion are logically equivalent.

Proof 3.4.

The equivalence of the premise and conclusion of Rule (2) is clear. For Rules (3), (4), and (7), the fact that the premise logically implies the conclusion follows from soundness of the rules, which we shall show shortly. We now show that for Rules (3), (4), and (7), the conclusion logically implies the premise. For Rule (3), we see that if (s1,,sm)S×[0,1]mk(s_{1},\ldots,s_{m})\in S\times[0,1]^{m-k}, then (s1,,sk)S(s_{1},\ldots,s_{k})\in S. Hence, the conclusion of Rule (3) logically implies the premise of Rule (3). For Rules (4) and (7), the conclusion logically implies the premise because of the soundness of Rule (6).

Lemma 3.5.

Minimization is preserved by Rules (2) and (4), in the following sense.

  1. 1.

    If the premise of Rule (2) is minimized, then so is the conclusion.

  2. 2.

    If the premises (σ1,,σk;S1)(\sigma_{1},\ldots,\sigma_{k};S_{1}) and (σ1,,σk;S2)(\sigma_{1},\ldots,\sigma_{k};S_{2}) of Rule (4) are minimized, then so is the conclusion (σ1,,σk;S1S2)(\sigma_{1},\ldots,\sigma_{k};S_{1}\cap S_{2}).

Proof 3.6.

Part (1) is immediate, since the premise and conclusion have exactly the same information.

For part (2), assume that (σ1,,σk;S1)(\sigma_{1},\ldots,\sigma_{k};S_{1}) and (σ1,,σk;S2)(\sigma_{1},\ldots,\sigma_{k};S_{2}) are minimized. To show that (σ1,,σk;S1S2)(\sigma_{1},\ldots,\sigma_{k};S_{1}\cap S_{2}) is minimized, we must show that if (s1,,sk)S1S2(s_{1},\ldots,s_{k})\in S_{1}\cap S_{2}, then there is a model MM of (σ1,,σk;S1S2)(\sigma_{1},\ldots,\sigma_{k};S_{1}\cap S_{2}) such that for 1ik1\leq i\leq k, the value of σi\sigma_{i} in MM is sis_{i}. Assume that (s1,,sk)S1S2(s_{1},\ldots,s_{k})\in S_{1}\cap S_{2}. Hence, (s1,,sk)S1(s_{1},\ldots,s_{k})\in S_{1}. Since (σ1,,σk;S1)(\sigma_{1},\ldots,\sigma_{k};S_{1}) is minimized, we obtain the desired model MM.

Theorem 3.7.

Our axiom system is sound and complete for MD-sentences.

Proof 3.8.

We begin by proving soundness. We say that an axiom is sound if it is true in every model. We say that an inference rule is sound if every model that satisfies the premise also satisfies the conclusion. To prove soundness of our axiom system, it is sufficient to show that our axiom is sound and that each of our rules is sound.

Axiom (1) is sound, since every real-valued logic formula has a value in [0,1][0,1].

Rule (2) is sound, since the premise and conclusion encode exactly the same information.

Rule (3) is sound for the following reason. Let MM be a model, and let s1,,sms_{1},\ldots,s_{m} be the values of σ1,,σm\sigma_{1},\ldots,\sigma_{m}, respectively, in MM. If MM satisfies the premise, then (s1,,sk)S(s_{1},\ldots,s_{k})\in S. This implies that (s1,,sm)S×[0,1]mk)(s_{1},\ldots,s_{m})\in S\times[0,1]^{m-k}) and so MM satisfies the conclusion.

Rule (4) is sound for the following reason. Let MM be a model, and let s1,,sks_{1},\ldots,s_{k} be the values of σ1,,σk\sigma_{1},\ldots,\sigma_{k}, respectively, in MM. If MM satisfies the premise, then (s1,,sk)S1(s_{1},\ldots,s_{k})\in S_{1} and (s1,,sk)S2(s_{1},\ldots,s_{k})\in S_{2}. Therefore, (s1,,sk)S1S2(s_{1},\ldots,s_{k})\in S_{1}\cap S_{2}, and so MM satisfies the conclusion.

Rule (5) is sound for the following reason. Let MM be a model, and let s1,,sks_{1},\ldots,s_{k} be the values of σ1,,σk\sigma_{1},\ldots,\sigma_{k}, respectively, in MM. If MM satisfies the premise, then (s1,,sk)S(s_{1},\ldots,s_{k})\in S. Therefore (s1,,skr)S(s_{1},\ldots,s_{k-r})\in S^{\prime}, and so MM satisfies the conclusion.

Rule (6) is sound for the following reason. Let MM be a model, and let s1,,sks_{1},\ldots,s_{k} be the values of σ1,,σk\sigma_{1},\ldots,\sigma_{k}, respectively, in MM. If MM satisfies the premise, then (s1,,sk)S(s_{1},\ldots,s_{k})\in S. Therefore, (s1,,sk)S(s_{1},\ldots,s_{k})\in S^{\prime}, and so MM satisfies the conclusion.

Rule (7) is sound for the following reason. Let MM be a model, and let s1,,sks_{1},\ldots,s_{k} be the values of σ1,,σk\sigma_{1},\ldots,\sigma_{k}, respectively, in MM. If MM satisfies the premise, then (s1,,sk)S(s_{1},\ldots,s_{k})\in S. In our real-valued logic, we have that (a) fα(si,sj)=smf_{\alpha}(s_{i},s_{j})=s_{m} when σm\sigma_{m} is σi𝛼σj\sigma_{i}\mathop{\alpha}\sigma_{j} and α\alpha is a binary connective (such as &\mathbin{\&}), and (b) fρ(si)=sjf_{\rho}(s_{i})=s_{j} when σj\sigma_{j} is ρσi\rho\sigma_{i} and ρ\rho is a unary connective (such as ¬\neg). Then (s1,,sk)S(s_{1},\ldots,s_{k})\in S^{\prime}, and MM satisfies the conclusion.

This completes the proof of soundness. We now prove completeness. Assume that Γγ\Gamma\vDash\gamma; we must show that Γγ\Gamma\vdash\gamma. We can assume without loss of generality that Γ\Gamma is nonempty, because if Γ\Gamma is empty, we replace it by a singleton set containing an instance of our Axiom (1).

Let Γ={γ1,,γn}\Gamma=\{\gamma_{1},\ldots,\gamma_{n}\}. For 1in1\leq i\leq n, assume that γi\gamma_{i} is (σ1i,,σkii;Si)(\sigma^{i}_{1},\ldots,\sigma^{i}_{k_{i}};S_{i}), and let Γi={σ1i,,σkii}\Gamma_{i}=\{\sigma^{i}_{1},\ldots,\sigma^{i}_{k_{i}}\}. Assume that γ\gamma is (σ10,,σk00;S0(\sigma^{0}_{1},\ldots,\sigma^{0}_{k_{0}};S_{0}), and let Γ0={σ10,,σk00}\Gamma_{0}=\{\sigma^{0}_{1},\ldots,\sigma^{0}_{k_{0}}\}. Let GG be the closure of Γ0Γ1Γn\Gamma_{0}\cup\Gamma_{1}\cup\cdots\cup\Gamma_{n} under subformulas.

For each ii with 1in1\leq i\leq n, let HiH_{i} be the set difference GΓiG\setminus\Gamma_{i}. Let ri=|Hi|r_{i}=|H_{i}|. Let Hi={τ1i,τrii}H_{i}=\{\tau^{i}_{1},\ldots\tau^{i}_{r_{i}}\}. By applying Rule (3), we prove from γi\gamma_{i} the sentence (σ1i,,σkii,τ1i,,τrii;Si×[0,1]ri)(\sigma^{i}_{1},\ldots,\sigma^{i}_{k_{i}},\tau^{i}_{1},\ldots,\tau^{i}_{r_{i}};S_{i}\times[0,1]^{r_{i}}). Let ψi\psi_{i} be the conclusion of Rule (7) when the premise is (σ1i,,σkii,τ1i,,τrii;Si×[0,1]ri)(\sigma^{i}_{1},\ldots,\sigma^{i}_{k_{i}},\tau^{i}_{1},\ldots,\tau^{i}_{r_{i}};S_{i}\times[0,1]^{r_{i}}).

Let δ1,,δp\delta_{1},\ldots,\delta_{p} be a fixed ordering of the members of GG. Since the set of components of each ψi\psi_{i} is GG, we can use Rule (2) to rewrite ψi\psi_{i} as a sentence (δ1,,δp;Ti)(\delta_{1},\ldots,\delta_{p};T_{i}). Let us call this sentence φi\varphi_{i}.

Also, since the only rules used in proving φi\varphi_{i} from γi\gamma_{i} are Rules (2), (3), and (7), it follows from Lemma 3.3 that γi\gamma_{i} and φi\varphi_{i} are logically equivalent.

We now make use of the notion of minimization. Let T=T1TnT=T_{1}\cap\cdots\cap T_{n}. Define φ\varphi to be the sentence (δ1,,δp;T)(\delta_{1},\ldots,\delta_{p};T). It follows from Lemma 3.1 that each ψi\psi_{i} is minimized. So by Lemma 3.5, each φi\varphi_{i} is minimized. By Lemma 3.5 again, φ\varphi is minimized.

The sentence φ\varphi was obtained from the sentences φi\varphi_{i} by applying Rule (4) n1n-1 times. It follows from Lemma 3.3 that φ\varphi is equivalent to {φ1,,φn}\{\varphi_{1},\ldots,\varphi_{n}\}. Since we also showed that γi\gamma_{i} is logically equivalent to φi\varphi_{i} for 1in1\leq i\leq n, it follows that φ\varphi is logically equivalent to Γ\Gamma. Hence, since Γγ\Gamma\vDash\gamma, it follows that {φ}γ\{\varphi\}\vDash\gamma. It also follows that to prove that Γγ\Gamma\vdash\gamma, we need only show that there is a proof of γ\gamma from φ\varphi.

Recall that γ\gamma is (σ10,,σk00;S0(\sigma^{0}_{1},\ldots,\sigma^{0}_{k_{0}};S_{0}), and φ\varphi is (δ1,,δp;T)(\delta_{1},\ldots,\delta_{p};T). By applying Rule (2), we can re-order the components of φ\varphi so that the components start with σ10,,σk00\sigma^{0}_{1},\ldots,\sigma^{0}_{k_{0}}. We thereby obtain from φ\varphi a sentence (σ10,,σk00,;T)(\sigma^{0}_{1},\ldots,\sigma^{0}_{k_{0}},\ldots;T^{\prime}), which we denote by φ\varphi^{\prime}. By Lemma 3.3 we know that φ\varphi and φ\varphi^{\prime} are logically equivalent. So {φ}γ\{\varphi^{\prime}\}\vDash\gamma. Since φ\varphi is minimized, so is φ\varphi^{\prime}, by Lemma 3.5. By applying Rule (5), we obtain from φ\varphi^{\prime} a sentence (σ10,,σk00;T′′)(\sigma^{0}_{1},\ldots,\sigma^{0}_{k_{0}};T^{\prime\prime}), which we denote by φ′′\varphi^{\prime\prime}.

We now show that T′′S0T^{\prime\prime}\subseteq S_{0}. This is sufficient to complete the proof of completeness, since then we can use Rule (6) to prove γ\gamma. If T′′T^{\prime\prime} is empty, we are done. So assume that (s1,,sk0)T′′(s_{1},\ldots,s_{k_{0}})\in T^{\prime\prime}; we must show that (s1,,sk0)S0(s_{1},\ldots,s_{k_{0}})\in S_{0}.

Since (s1,,sk0)T′′(s_{1},\ldots,s_{k_{0}})\in T^{\prime\prime}, it follows that there is an extension (s1,,sk0,,sp)(s_{1},\ldots,s_{k_{0}},\ldots,s_{p}) in TT^{\prime}. Since φ\varphi^{\prime} is minimized, there is a model MM of φ\varphi^{\prime} such that the value of σi0\sigma^{0}_{i} is sis_{i}, for 1ik01\leq i\leq k_{0}. Since {φ}γ\{\varphi^{\prime}\}\vDash\gamma, it follows that MM is a model of γ\gamma. By definition of what it means for MM to be a model of γ\gamma, it follows that (s1,,sk0)S0(s_{1},\ldots,s_{k_{0}})\in S_{0}, as desired.

This completes the proof of soundness and completeness.

4 Closure of MD-sentences under Boolean combinations

Our next theorem implies that MD-sentences are robust, in that they are closed under Boolean combinations. Of course, since we are dealing with sentences (which take only the values True and False) in our "outer" logic, we use the standard Boolean connectives.

We begin with a useful lemma that we shall also use later.

Lemma 4.1.

The (standard logical) negation of the sentence (σ1,,σk;S)(\sigma_{1},\ldots,\sigma_{k};S) is (σ1,,σk;S~)(\sigma_{1},\ldots,\sigma_{k};\tilde{S}) where S~\tilde{S} is the set difference [0,1]kS[0,1]^{k}\setminus S.

Proof 4.2.

We need only show that if MM is a model, then M(σ1,,σk;S~)M\vDash(\sigma_{1},\ldots,\sigma_{k};\tilde{S}) if and only if M⊭(σ1,,σk;S)M\not\vDash(\sigma_{1},\ldots,\sigma_{k};S). Let sis_{i} be the value of σi\sigma_{i} in MM, for 1ik1\leq i\leq k. If M(σ1,,σk;S~)M\vDash(\sigma_{1},\ldots,\sigma_{k};\tilde{S}), then (s1,,sk)S~(s_{1},\ldots,s_{k})\in\tilde{S}, and so (s1,,sk)S(s_{1},\ldots,s_{k})\notin S. Hence, M⊭(σ1,,σk;S)M\not\vDash(\sigma_{1},\ldots,\sigma_{k};S). Conversely, if M⊭(σ1,,σk;S)M\not\vDash(\sigma_{1},\ldots,\sigma_{k};S), then (s1,,sk)S(s_{1},\ldots,s_{k})\notin S, and so (s1,,sk)S~(s_{1},\ldots,s_{k})\in\tilde{S}. Hence, M(σ1,,σk;S~)M\vDash(\sigma_{1},\ldots,\sigma_{k};\tilde{S}).

Theorem 4.3.

MD-sentences are closed under Boolean combinations \land, \lor, and ¬\neg.

Proof 4.4.

Let γ1 and γ2 be MD-sentences. Assume that γ1 is (σ^1_1, …, σ^1_m; S1), and that γ2 is (σ^2_1, …, σ^2_n; S2). As in the proof of Theorem 3.7, let G be the closure of {σ^1_1, …, σ^1_m, σ^2_1, …, σ^2_n} under subformulas. Assume that G = {δ1, …, δp}. As in the proof of Theorem 3.7, we know that for i = 1 and i = 2, there is Ti such that γi is equivalent to a sentence (δ1, …, δp; Ti). The conjunction γ1 ∧ γ2 is equivalent to (δ1, …, δp; T1 ∩ T2). The disjunction γ1 ∨ γ2 is equivalent to (δ1, …, δp; T1 ∪ T2). And by Lemma 4.1, the negation ¬γ1 is equivalent to (δ1, …, δp; [0,1]^p ∖ T1), whose information set is the set difference [0,1]^p ∖ T1.
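On finite (discretized) information sets over a common component list, the constructions in this proof are plain set operations; the following is a minimal Python sketch, illustrative only.

# Sketch of Theorem 4.3 over a common component list δ1, …, δp, with information
# sets represented as finite sets of p-tuples (a discretized stand-in for [0,1]^p).

def md_and(T1, T2):          # conjunction of MD-sentences: intersect information sets
    return T1 & T2

def md_or(T1, T2):           # disjunction: union
    return T1 | T2

def md_not(T, universe):     # negation (Lemma 4.1): complement within the universe
    return universe - T

# Example on the two-point grid {0, 1} with p = 2:
universe = {(a, b) for a in (0, 1) for b in (0, 1)}
T1 = {(1, 0), (1, 1)}        # "δ1 has value 1"
T2 = {(0, 1), (1, 1)}        # "δ2 has value 1"
print(md_and(T1, T2), md_not(T1, universe))   # {(1, 1)} and {(0, 0), (0, 1)}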

5 Lowering the dimensionality

It is natural to ask whether there is a (k+1)(k+1)-dimensional MD-sentence that in Łukasiewicz or Gödel logic is not equivalent to any kk-dimensional MD-sentence. For the special case k=1k=1, the next theorem gives an answer. We shall shortly state the more general case and generalizations of it as open problems.

Theorem 5.1.

There is a 2-dimensional MD-sentence that is not equivalent (in either Łukasiewicz or Gödel logic) to a 1-dimensional MD-sentence.

Proof 5.2.

Let σ\sigma be the 2-dimensional MD-sentence (A1,A2;S)(A_{1},A_{2};S) where S={(a1,a2):a12=a2}S=\{(a_{1},a_{2}):a_{1}^{2}=a_{2}\}. We now show that σ\sigma is not equivalent to a 1-dimensional MD-sentence. If φ\varphi is a propositional formula involving only A1A_{1} and A2A_{2}, then it is easy to see (by induction on the structure of formulas) that for Łukasiewicz or Gödel logic, φ\varphi defines a piecewise linear function fφf_{\varphi}, in the sense that the 1-dimensional MD-sentence (φ;S)(\varphi;S^{\prime}) says that if a1a_{1} is the value of A1A_{1} and a2a_{2} is the value of A2A_{2}, then fφ(a1,a2)Sf_{\varphi}(a_{1},a_{2})\in S^{\prime}. Since there is no such piecewise linear function fφf_{\varphi} and set SS^{\prime} for our sentence σ\sigma, the result holds.

The next theorem does not depend on restricting to Łukasiewicz or Gödel logic.

Theorem 5.3.

Every finite set of MD-sentences of arbitrary dimensions that involve only the kk predicate symbols A1,,AkA_{1},\ldots,A_{k} is equivalent to a single kk-dimensional MD sentence (A1,,Ak;S)(A_{1},\ldots,A_{k};S). (The set SS depends on the real-valued logic being considered.)

Proof 5.4.

Let Γ be a finite set of MD-sentences. We can view Γ as a conjunction of MD-sentences, so by Theorem 4.3, Γ is equivalent to a single MD-sentence γ. As in the proof of completeness, by closing under subformulas, applying Rule (7), and reordering by applying Rule (2), we obtain an MD-sentence (A1, …, Ak, φ1, …, φr; S′) that is equivalent to γ. Since the tuples in S′ are good tuples, this is equivalent to the sentence (A1, …, Ak; S), where S = {(s1, …, sk) : (s1, …, sk, s′1, …, s′r) ∈ S′}.

Open problems: For each kk with k2k\geq 2, does there exist a (k+1)(k+1)-dimensional MD-sentence that in Łukasiewicz or Gödel logic is not equivalent to a kk-dimensional MD-sentence? And for k1k\geq 1, how about there being a (k+1)(k+1)-dimensional MD-sentence not equivalent to a Boolean combination of 1-dimensional MD-sentences, or even to a Boolean combination of kk-dimensional MD-sentences?

6 SoCRAtic logic: A decision procedure

Given a finite set Γ of MD-sentences, and a single MD-sentence γ, Theorem 3.7 says that Γ ⊨ γ if and only if Γ ⊢ γ. As we shall show, under natural assumptions there is an algorithm for deciding if Γ ⊨ γ. We call this algorithm a decision procedure. If the information sets S all have a simple structure and the size of Γ is treated as a constant, then the algorithm runs in polynomial time.

It is natural to wonder whether we can simply use our complete axiomatization to derive a decision procedure. The usual answer is that it is not clear in what order to apply the rules of inference. In our proof of completeness, the rules of inference are applied in a specific order, so that is not an issue here. Rather, the problem is that in applying Rule (7), there is no easy way to derive SS^{\prime} from SS, even if SS is fairly simple. In fact, we now show that even deciding if SS^{\prime} is nonempty is NP-hard. Let φ\varphi be an instance of the NP-hard problem 3SAT. Thus, φ\varphi is of the form (B11B21B31)&&(B1rB2rB3r)(B^{1}_{1}\veebar B^{1}_{2}\veebar B^{1}_{3})\mathbin{\&}\cdots\mathbin{\&}(B^{r}_{1}\veebar B^{r}_{2}\veebar B^{r}_{3}), where each BjiB^{i}_{j} is a literal (an atomic proposition or its negation). Assume that the atomic propositions that appear in φ\varphi are A1,,AkA_{1},\ldots,A_{k}. Let ψ\psi be the sentence

(A1,,Ak,¬A1,,¬Ak,τ1,,τr,τ1B31,,τrB3r;S),(A_{1},\ldots,A_{k},\neg A_{1},\ldots,\neg A_{k},\tau_{1},\ldots,\tau_{r},\tau_{1}\veebar B^{1}_{3},\ldots,\tau_{r}\veebar B^{r}_{3};S),

where τi\tau_{i} is B1iB2iB^{i}_{1}\veebar B^{i}_{2}, for 1ir1\leq i\leq r, and where S={0,1}2k+r×{1}rS=\{0,1\}^{2k+r}\times\{1\}^{r}. Assume that we apply Rule (7) where the premise is ψ\psi, and the conclusion is

(A1,,Ak,¬A1,,¬Ak,τ1,,τr,τ1B31,,τrB3r;S).(A_{1},\ldots,A_{k},\neg A_{1},\ldots,\neg A_{k},\tau_{1},\ldots,\tau_{r},\tau_{1}\veebar B^{1}_{3},\ldots,\tau_{r}\veebar B^{r}_{3};S^{\prime}).

We call this sentence ψ\psi^{\prime}. It follows easily from our construction of ψ\psi that the 3SAT problem φ\varphi is satisfiable if and only if ψ\psi is satisfiable. Now ψ\psi and ψ\psi^{\prime} are logically equivalent, by Lemma 3.3. So the 3SAT problem φ\varphi is satisfiable if and only if ψ\psi^{\prime} is satisfiable. By Lemma 3.1, we know that ψ\psi^{\prime} is minimized. Hence, if SS^{\prime} is nonempty, there is a model of ψ\psi^{\prime}, by the definition of minimization. And if SS^{\prime} is empty, then by the definition of a model of a sentence, there is no model of ψ\psi^{\prime}. Therefore, ψ\psi^{\prime} is satisfiable if and only if SS^{\prime} is nonempty. By combining this with our earlier observation that the 3SAT problem φ\varphi is satisfiable if and only if ψ\psi^{\prime} is satisfiable, it follows that the 3SAT problem φ\varphi is satisfiable if and only if SS^{\prime} is nonempty. Hence, deciding if SS^{\prime} is nonempty is NP-hard.

We now discuss our decision procedure. To have a chance of there being a decision procedure, the set portion S of an MD-sentence (σ1, …, σk; S) must be tractable. We now give a simple, natural choice for the set portions. A rational interval is a subset of [0,1] that is of one of the four forms (a,b), [a,b], (a,b], or [a,b), where a and b are rational numbers. Let us say that a sentence (σ1, …, σk; S) is interval-based if S is of the form S1 × ⋯ × Sk, where each Si is a union of a finite number of rational intervals. If each Si is the union of at most N rational intervals, then we say that the sentence is N-interval-based. Note that this interval-based sentence (σ1, …, σk; S) is equivalent to the set {(σ1; S1), …, (σk; Sk)} of sentences with only one component each. This observation may be useful in implementing a decision procedure. In fact, although we do not make use of this in the decision procedure described in this section, we do use it in the implementation of the decision procedure described later, since these sentences with a single component are easy to deal with. (This is one of several ways that our implementation differs from what is described in this section.)

Let Γ={γ1,,γn}\Gamma=\{\gamma_{1},\ldots,\gamma_{n}\}. For 1in1\leq i\leq n, assume that γi\gamma_{i} is (σ1i,,σkii;Si)(\sigma^{i}_{1},\ldots,\sigma^{i}_{k_{i}};S_{i}), and let Γi={σ1i,,σkii}\Gamma_{i}=\{\sigma^{i}_{1},\ldots,\sigma^{i}_{k_{i}}\}. Assume that γ\gamma is (σ10,,σk00;S0(\sigma^{0}_{1},\ldots,\sigma^{0}_{k_{0}};S_{0}), and let Γ0={σ10,,σk00}\Gamma_{0}=\{\sigma^{0}_{1},\ldots,\sigma^{0}_{k_{0}}\}. Let GG be the closure of Γ0Γ1Γn\Gamma_{0}\cup\Gamma_{1}\cup\cdots\cup\Gamma_{n} under subformulas. If |G|M|G|\leq M, then we say that the pair (Γ,γ)(\Gamma,\gamma) has nesting depth at most MM.

Theorem 6.1.

Assume either Łukasiewicz logic or Gödel logic, with the connectives &, ⊻, →, and ¬. (For Łukasiewicz logic, we could also allow the weak versions of conjunction and disjunction, as described earlier in the introduction.) Assume that Γ ∪ {γ} is interval-based. Then there is an algorithm that determines whether Γ ⊨ γ. Assume that Γ has at most P sentences, each sentence in Γ ∪ {γ} is N-interval-based, and (Γ, γ) has nesting depth at most M. If M is fixed, then the algorithm runs in time polynomial in P and N.

Proof 6.2.

Assume throughout the proof that Γ\Gamma has at most PP sentences, each sentence in Γ{γ}\Gamma\cup\{\gamma\} is NN-interval based, and (Γ,γ)(\Gamma,\gamma) has nesting depth at most MM.

It is easy to see that Γγ\Gamma\vDash\gamma if and only Γ{¬γ}\Gamma\cup\{\neg\gamma\} is not satisfiable. So we need only give an algorithm that decides whether Γ{¬γ}\Gamma\cup\{\neg\gamma\} is satisfiable.

Let {σ1,,σp}\{\sigma_{1},\ldots,\sigma_{p}\} be the closure of Γ{γ}\Gamma\cup\{\gamma\} under subformulas. Let Γ={γ1,,γn}\Gamma=\{\gamma_{1},\ldots,\gamma_{n}\}. By making use of Rules (2) and (3), for each ii with 1in1\leq i\leq n, we can create a sentence γi\gamma^{\prime}_{i} of the form (σ1,,σp;Si)(\sigma_{1},\ldots,\sigma_{p};S^{i}) that by Lemma 3.3 is equivalent to γi\gamma_{i}, and that has σ1,,σp\sigma_{1},\ldots,\sigma_{p} as components. By the construction, each γi\gamma_{i}^{\prime} is NN-interval-based.

Similarly, create the sentence γ\gamma^{\prime} of the form (σ1,,σp;T)(\sigma_{1},\ldots,\sigma_{p};T) that is equivalent to γ\gamma, and that has σ1,,σp\sigma_{1},\ldots,\sigma_{p} as components. As before, γ\gamma^{\prime} is NN-interval-based.

Now Γ\Gamma is equivalent to the conjunction of the sentences γi\gamma^{\prime}_{i} for 1in1\leq i\leq n, and this conjunction is equivalent to (σ1,,σp;S)(\sigma_{1},\ldots,\sigma_{p};S), where S=inSiS=\bigcap_{i\leq n}S^{i}. We now show that (σ1,,σp;S(\sigma_{1},\ldots,\sigma_{p};S) is PNPN-interval-based. By assumption, for each ii with 1in1\leq i\leq n, we have that SiS^{i} is of the form S1i××SpiS^{i}_{1}\times\cdots\times S^{i}_{p}, where each SjiS^{i}_{j} is the union of at most NN intervals. For each jj with 1jp1\leq j\leq p, let Sj=iSjiS_{j}=\bigcap_{i}S^{i}_{j}. Then S=S1××SpS=S_{1}\times\cdots\times S_{p}. So to show that (σ1,,σp;S(\sigma_{1},\ldots,\sigma_{p};S) is PNPN-interval-based, we need only show that each SjS_{j} is the union of at most PNPN intervals.

Since Sj=inSjiS_{j}=\bigcap_{i\leq n}S^{i}_{j}, where each SjiS^{i}_{j} is the union of at most NN intervals, we see that SjS_{j} is the union of intervals where the left endpoint of each interval in SjS_{j} is one of the left endpoints of intervals in inSji\bigcup_{i\leq n}S^{i}_{j}. For each jj, there are nn sets SjiS^{i}_{j}. And for each ii with 1in1\leq i\leq n, there are at most NN left endpoints of SjiS^{i}_{j}. So the total number of left endpoints of intervals in inSji\bigcup_{i\leq n}S^{i}_{j} is at most nNPNnN\leq PN, and so the number of intervals in SjS_{j} is at most PNPN. Since S=S1××SpS=S_{1}\times\cdots\times S_{p}, it follows that (σ1,,σp;S)(\sigma_{1},\ldots,\sigma_{p};S) is PNPN-interval-based.
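The interval bookkeeping used in this argument can be sketched as follows (Python, closed intervals only for simplicity; the names are illustrative).

# Sketch: intersecting two unions of closed intervals, as in the bound above.
# Each union is a list of (lo, hi) pairs with rational endpoints.
from fractions import Fraction as F

def intersect_unions(U, V):
    """Intersection of two unions of closed intervals, again as a union of closed intervals."""
    result = []
    for (a, b) in U:
        for (c, d) in V:
            lo, hi = max(a, c), min(b, d)
            if lo <= hi:
                result.append((lo, hi))
    return result

S1 = [(F(0), F(1, 4)), (F(1, 2), F(1))]      # union of two intervals
S2 = [(F(1, 8), F(3, 4))]                    # union of one interval
print(intersect_unions(S1, S2))              # the intervals [1/8, 1/4] and [1/2, 3/4]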

Let us now consider ¬γ, which is equivalent to ¬γ′. Recall that γ′ is (σ1, …, σp; T), and that γ′ is N-interval-based. So T is of the form T1 × ⋯ × Tp, where each Tj is the union of at most N intervals. By Lemma 4.1, the negation of γ′ is (σ1, …, σp; T̃), where T̃ is the set difference [0,1]^p ∖ T. For each j with 1 ≤ j ≤ p, let T′j be the set difference [0,1] ∖ Tj. Clearly, T′j is the union of intervals. The left endpoints of intervals in T′j are the right endpoints of intervals in Tj, possibly along with 0. So T′j is the union of at most N + 1 intervals. Let Vj = [0,1]^(j-1) × T′j × [0,1]^(p-j). It is straightforward to see that T̃ = ⋃_{j≤p} Vj.

Now, showing that Γ{¬γ}\Gamma\cup\{\neg\gamma\} is not satisfiable is equivalent to showing that (σ1,,σp;S)(σ1,,σp;T~)(\sigma_{1},\ldots,\sigma_{p};S)\land(\sigma_{1},\ldots,\sigma_{p};\tilde{T}) is not satisfiable, which is equivalent to showing that for every jj with 1jp1\leq j\leq p, we have that (σ1,,σp;S)(σ1,,σp;Vj)(\sigma_{1},\ldots,\sigma_{p};S)\land(\sigma_{1},\ldots,\sigma_{p};V_{j}) is not satisfiable. So we need only give an algorithm for deciding if (σ1,,σp;S)(σ1,,σp;Vj)(\sigma_{1},\ldots,\sigma_{p};S)\land(\sigma_{1},\ldots,\sigma_{p};V_{j}) is satisfiable. Let us hold jj fixed. Since, as we showed, (σ1,,σp;S)(\sigma_{1},\ldots,\sigma_{p};S) is PNPN-interval-based, we can write SS as S1××SpS_{1}\times\cdots\times S_{p}, where each SiS_{i} is the union of at most PNPN intervals. Now (σ1,,σp;S)(σ1,,σp;Vj)(\sigma_{1},\ldots,\sigma_{p};S)\land(\sigma_{1},\ldots,\sigma_{p};V_{j}) is equivalent to (σ1,,σp;SVj)(\sigma_{1},\ldots,\sigma_{p};S\cap V_{j}). Now SVjS\cap V_{j} is of the form S1××SpS^{\prime}_{1}\times\cdots\times S^{\prime}_{p}, where Sm=SmS^{\prime}_{m}=S_{m} for mjm\neq j, and where Sj=SjTjS^{\prime}_{j}=S_{j}\cap T^{\prime}_{j}. We showed that TjT^{\prime}_{j} is the union of at most N+1N+1 intervals, and that SjS_{j} is the union of at most PNPN intervals, so it follows that SjTjS_{j}\cap T^{\prime}_{j} is the union of at most PN+N+1PN+N+1 intervals, since each left endpoint of the intervals in SjTjS_{j}\cap T^{\prime}_{j} is a left endpoint of an interval in SjS_{j} or an interval in TjT^{\prime}_{j}.

We now describe our algorithm for deciding if the sentence (σ1,,σp;SVj)(\sigma_{1},\ldots,\sigma_{p};S\cap V_{j}), that is, for the sentence (σ1,,σp;S1××Sp(\sigma_{1},\ldots,\sigma_{p};S^{\prime}_{1}\times\cdots\times S^{\prime}_{p}), which is (PN+N+1)(PN+N+1)-interval-based, is satisfiable. This can be broken into |S1|××|Sp||S^{\prime}_{1}|\times\cdots\times|S^{\prime}_{p}| subproblems, one for each choice (I1,,Ip)(I_{1},\ldots,I_{p}) of a single interval IkI_{k} from SkS^{\prime}_{k} for each kk with 1kp1\leq k\leq p. This gives a total of at most (PN+N+1)M(PN+N+1)^{M} subproblems. For each of these subproblems, we wish to decide satisfiability of the system {s1I1,,spIp}\{s_{1}\in I_{1},\ldots,s_{p}\in I_{p}\} along with (a) the binary constraints fα(si,sj)=smf_{\alpha}(s_{i},s_{j})=s_{m} when σm\sigma_{m} is σi𝛼σj\sigma_{i}\mathop{\alpha}\sigma_{j} and α\alpha is a &\mathbin{\&}, \veebar, or \Rightarrow, and (b) f¬(si)=sjf_{\neg}(s_{i})=s_{j} when σj\sigma_{j} is ¬σi\neg\sigma_{i}.

The constraints sj ∈ Ij are specified by inequalities (for example, if Ij is (a, b] we get the inequalities a < sj ≤ b). We now show how to deal with the constraints in (a) and (b) above. A canonical example is given by dealing with f&(si, sj) = sm in Gödel logic, which interprets “f&(si, sj) = sm” as min{si, sj} = sm. We split the system of constraints into two systems of constraints, one where we replace min{si, sj} = sm by the two statements “si ≤ sj, si = sm” and another where we replace min{si, sj} = sm by the two statements “sj < si, sj = sm”. In Łukasiewicz logic, where f&(si, sj) is max{0, si + sj - 1}, we split the system of constraints into two systems of constraints, one where we replace max{0, si + sj - 1} = sm by the two statements “si + sj - 1 ≥ 0, si + sj - 1 = sm” and another where we replace max{0, si + sj - 1} = sm by the two statements “si + sj - 1 < 0, sm = 0”. The same approach works for the other binary connectives. For example, in Gödel logic, where f⇒(si, sj) is 1 if si ≤ sj and is sj otherwise, we would split into two cases, one where we replace f⇒(si, sj) = sm by the two statements “si ≤ sj, sm = 1” and another where we replace f⇒(si, sj) = sm by the two statements “sj < si, sm = sj”. In considering the effect of the constraints in (a) and (b), each of our at most (PN+N+1)^M subproblems splits at most 2^p ≤ 2^M times, giving a grand total of at most (PN+N+1)^M · 2^M systems of inequalities that we need to check for feasibility (that is, to see if there is a solution). For each of these systems of inequalities, we can make use of a polynomial-time algorithm for linear programming to decide feasibility, where the size of each of these systems is linear in M, and so the running time for each instance of the linear programming algorithm is polynomial in M. Since also the number of systems is at most (PN+N+1)^M · 2^M, and since M is fixed by assumption, this gives us an overall algorithm for decidability, whose running time is polynomial in N and P.
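A bare-bones sketch of one such feasibility check is given below, using scipy's linear programming routine rather than the CPLEX-based implementation described later, and with strict inequalities relaxed to non-strict ones; all names are ours.

# Sketch: checking feasibility of one "split" linear system with scipy.
# Variables s = (s1, s2, s3) are the values of A, B, and A & B (Łukasiewicz).
# Question: given A in [0.6, 1] and B in [0.7, 1], can A & B have value <= 0.2?
import numpy as np
from scipy.optimize import linprog

def feasible(A_ub, b_ub, A_eq, b_eq):
    """Is the system A_ub @ s <= b_ub, A_eq @ s == b_eq, 0 <= s <= 1 feasible?"""
    res = linprog(c=np.zeros(3), A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, 1)] * 3)
    return res.success

interval_ub = [[-1, 0, 0], [0, -1, 0], [0, 0, 1]]   # -s1 <= -0.6, -s2 <= -0.7, s3 <= 0.2
interval_b = [-0.6, -0.7, 0.2]

# Case 1 of max{0, s1 + s2 - 1} = s3:  s1 + s2 - 1 >= 0  and  s3 = s1 + s2 - 1
case1 = feasible(A_ub=interval_ub + [[-1, -1, 0]], b_ub=interval_b + [-1],
                 A_eq=[[1, 1, -1]], b_eq=[1])
# Case 2:  s1 + s2 - 1 <= 0  and  s3 = 0
case2 = feasible(A_ub=interval_ub + [[1, 1, 0]], b_ub=interval_b + [1],
                 A_eq=[[0, 0, 1]], b_eq=[0])

print(case1 or case2)   # False: no model gives A & B a value <= 0.2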

The reason we held the parameter MM fixed is that the running time of the algorithm is exponential in MM, because there are an exponential number of calls to a linear programming subroutine. The algorithm is polynomial-time if there is a fixed bound on MM. Such a bound is necessary, because the problem can be co-NP hard, for the following reason.

Let γ\gamma be the sentence (A,¬A;[1]×[1])(A,\neg A;[1]\times[1]). Then γ\gamma is not satisfiable. Let Γ\Gamma consist of the single sentence ψ\psi from the beginning of the section. Then Γγ\Gamma\vDash\gamma if and only if ψ\psi is not satisfiable. Now ψ\psi is satisfiable if and only if SS^{\prime} from the beginning of the section is nonempty, which we showed is an NP-hard problem to determine. Since Γγ\Gamma\vDash\gamma if and only if ψ\psi is not satisfiable, it follows that deciding if Γγ\Gamma\vDash\gamma is co-NP hard.

We now give an implementation of the decision procedure. The decision procedure described in Section 6 is available under the socratic-logic subdirectory provided with this supplementary material. We implemented the algorithm as a Python package named socratic, which requires Python 3.6 or 3.7 and makes use of IBM® ILOG® CPLEX® Optimization Studio V12.10.0 via the docplex Python package.

6.1 Source code organization

The source code is organized as follows:

setup.sh

A script to create a Python virtualenv and install required packages

requirements.txt

A standard pip list of package dependencies

socratic/theory.py

Implementations for theories, sentences, and bounded intervals

socratic/op.py

Implementations for each logical operator as well as propositions and truth value constants

socratic/demo.py

Two example use cases demonstrating the use of the package

socratic/test.py

A suite of unit tests also serving as our experimental setup

socratic/hajek.py

Many tautologies proved in (10) used in test.py

socratic/clock.py

A higher-order function to measure the runtime of experiments

The classes defined in the source code are:

theory.Theory

A collection of sentences that can test for satisfiability or the entailment of a query sentence under a given logic

theory.Sentence

A base-class for a collection of formulas and an associated set of candidate interpretations for the formulas

theory.SimpleSentence

A single formula and an associated collection of candidate truth value intervals for the formula

theory.FloatInterval

An open or closed interval of truth values

theory.ClosedInterval

[l,u][l,u], i.e., all values from ll to uu, inclusive

theory.Point

[x,x][x,x], i.e., just xx

theory.OpenInterval

(l,u)(l,u), i.e., all values from ll to uu, exclusive

theory.OpenLowerInterval

(l,u](l,u], i.e., all values from ll to uu excluding ll

theory.OpenUpperInterval

[l,u)[l,u), i.e., all values from ll to uu excluding uu

op.Formula

The base-class of a data structure representing the syntax tree of a logical formula

op.Prop

A named proposition, e.g., xx

op.Constant

A truth value constant, e.g., .5.5

op.Operator

A base-class for all formulas with subformulas

op.And

Strong conjunction xyx\otimes y

op.WeakAnd

Weak conjunction x&y=min{x,y}x\mathbin{\&}y=\min\{x,y\}

op.Or

Strong disjunction xyx\oplus y

op.WeakOr

Weak disjunction xy=max{x,y}x\veebar y=\max\{x,y\}

op.Implies

Implication xyx\Rightarrow y, i.e., the residuum of \otimes

op.Not

Negation defined ¬x=(x0)\neg x=(x\Rightarrow 0)

op.Inv

Involutive negation x=1x{\sim}x=1-x

op.Equiv

Logical equivalence defined (xy)=((xy)(yx))(x\equiv y)=((x\Rightarrow y)\otimes(y\Rightarrow x))

op.Delta

The operation Δx=1\Delta x=1 if x=1x=1 else 0

6.2 Implementation details

The implementation strategy closely adheres to the decision procedure described in Section 6, though with a few notable design shortcuts.

Boolean variables.

One such shortcut is the use of mixed integer linear programming (MILP) to perform the “splitting” of linear programs into two possible optimization problems, specifically by adding a Boolean variable that determines which subset of the constraints must be active. For example, given the desired constraint z=min{x,y}z=\min\{x,y\}, one may write

z ≤ x   (8)
z ≤ y   (9)
z ≥ x − (1 − b)   (10)
y ≥ x − (1 − b)   (11)
z ≥ y − b   (12)
x ≥ y − b   (13)

for Boolean variable bb. For x,y,z[0,1]x,y,z\in[0,1], observe that (10) and (11) are effectively disabled for b=0b=0 and that (12) and (13) are likewise disabled for b=1b=1. For example, when b=1b=1, the remaining constraints are zx,zy,zx,yxz\leq x,z\leq y,z\geq x,y\geq x, which is equivalent to z=x,xyz=x,x\leq y, as desired. Observe then that MILP’s exploration of either value for the Boolean variable is equivalent to repeating linear optimization for either possible set of constraints; no feasible solution exists for any combination of such Boolean variables in exactly the case that none of the split linear programs are feasible. In practice, CPLEX has built-in support for min, max, abs, and a handful of other useful functions, though the above technique is still required to implement Gödel logic’s residuum, negation, and equivalence operations as well as to select the specific intervals a sentence’s formula truth values lie within.
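A minimal sketch of the Boolean-variable encoding in constraints (8)–(13), written against the docplex API mentioned above (the variable names and sample values are illustrative; this is not the socratic source itself):

from docplex.mp.model import Model

mdl = Model(name="min-encoding")
x = mdl.continuous_var(lb=0, ub=1, name="x")
y = mdl.continuous_var(lb=0, ub=1, name="y")
z = mdl.continuous_var(lb=0, ub=1, name="z")
b = mdl.binary_var(name="b")

mdl.add_constraint(z <= x)            # (8)
mdl.add_constraint(z <= y)            # (9)
mdl.add_constraint(z >= x - (1 - b))  # (10): active when b = 1
mdl.add_constraint(y >= x - (1 - b))  # (11): active when b = 1
mdl.add_constraint(z >= y - b)        # (12): active when b = 0
mdl.add_constraint(x >= y - b)        # (13): active when b = 0

# Pin x and y to sample values; the only feasible z should then be min{x, y}.
mdl.add_constraint(x == 0.7)
mdl.add_constraint(y == 0.4)
sol = mdl.solve()
print(sol.get_value(z) if sol is not None else "infeasible")  # expected: 0.4

With x = 0.7 and y = 0.4, constraint (11) rules out b = 1, so the solver must take b = 0, which activates (12) and (13) and pins z to 0.4, mirroring the case analysis above.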

Strict inequality.

The described decision procedure also occasionally calls for continuous constraints with strict inequality, in particular when dealing with the complements of closed intervals, but also when handling open input intervals or the Gödel residuum, (xy)=y(x\Rightarrow y)=y if x>yx>y else 1. Linear programming, however, does not inherently support strict inequalities. To implement them, we introduce a global gap variable δ[0,1]\delta\in[0,1] that separates the two sides of each strict inequality, e.g.

x ≥ y + δ,   (14)

and then seek to maximize δ\delta. If optimization yields an apparently feasible solution but with δ=0\delta=0, we regard it as infeasible because at least one strict inequality constraint could not be honored strictly. Again in practice, due to floating-point imprecision, MILP can sometimes return tiny though nonzero values of δ\delta even for x=yx=y in (14); as a result, it is necessary to check if δ\delta is greater than some threshold rather than merely nonzero. We use δ>108\delta>10^{-8}, which is much larger than the imprecision we have observed and yet much smaller than most truth values we consider. We observe that this technique is roughly equivalent to replacing δ\delta with 10810^{-8} throughout, which has the added benefit of freeing up the optimization objective for other uses in future extensions of the decision procedure, such as determining the tightest bounds for which a theory can entail a query.
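Continuing the same docplex-style sketch, the gap-variable trick for a strict constraint x > y might look as follows (the 10⁻⁸ threshold is the one quoted above; everything else is illustrative):

from docplex.mp.model import Model

mdl = Model(name="strict-inequality")
x = mdl.continuous_var(lb=0, ub=1, name="x")
y = mdl.continuous_var(lb=0, ub=1, name="y")
delta = mdl.continuous_var(lb=0, ub=1, name="delta")

mdl.add_constraint(x >= y + delta)  # stands in for the strict inequality x > y, as in (14)
# ... the remaining constraints of the query would be added here ...
mdl.maximize(delta)                 # widen the gap as much as the other constraints allow

GAP = 1e-8  # threshold guarding against floating-point noise
sol = mdl.solve()
strictly_feasible = sol is not None and sol.get_value(delta) > GAP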

Simple sentences.

We additionally observe that, for theories restricted to interval-based sentences, it is sufficient to support only sentences containing a single formula and collection of truth value intervals, i.e., SimpleSentences of the form (σ;S)(\sigma;S) for a single formula σ\sigma. This is because of the following theorem:

Theorem 6.3.

Any interval-based sentence s=(σ1,,σk;S1××Sk)s=(\sigma_{1},\ldots,\sigma_{k};S_{1}\times\cdots\times S_{k}) is equivalent to a collection of simple sentences s1,,sks_{1},\ldots,s_{k}, each given si=(σi;Si)s_{i}=(\sigma_{i};S_{i}).

Proof 6.4.

Given interval-based sentence ss and simple sentences s1,,sks_{1},\ldots,s_{k} as described, one may apply Rules (3) and (2) to obtain s1,,sks_{1}^{\prime},\ldots,s_{k}^{\prime} given si=(σ1,,σk;[0,1]i1×Si×[0,1]ki)s_{i}^{\prime}=(\sigma_{1},\ldots,\sigma_{k};[0,1]^{i-1}\times S_{i}\times[0,1]^{k-i}). One may then repeatedly apply Rule (4) to compose these exactly into ss. Likewise, one may apply Rules (2) and (5) to obtain each sis_{i} directly from ss. Hence, the two forms are equivalent.

Accordingly, socratic implements only SimpleSentence. In order to include an interval-based sentence in a theory, one should instead include each of its component simple sentences as constructed above. In order to test the entailment of an interval-based sentence, one should separately test the entailment of each of its component simple sentences.
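As a small illustration of Theorem 6.3 and of how interval-based sentences are meant to be supplied, the sketch below uses ad hoc tuples rather than the actual SimpleSentence class (whose constructor signature is not shown here):

def to_simple_sentences(formulas, interval_sets):
    """Decompose (sigma_1, ..., sigma_k; S_1 x ... x S_k) into the simple
    sentences (sigma_1; S_1), ..., (sigma_k; S_k), per Theorem 6.3."""
    assert len(formulas) == len(interval_sets)
    return list(zip(formulas, interval_sets))

# Example: (phi, psi; S_1 x S_2) with S_1 = {[0, .3]} and S_2 = {[.7, 1]}.
simple = to_simple_sentences(["phi", "psi"], [[(0.0, 0.3)], [(0.7, 1.0)]])
# To add the interval-based sentence to a theory, add every element of `simple`;
# to test its entailment, test the entailment of each element separately.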

Complementary intervals.

As a last deviation from the described decision procedure, rather than explicitly finding the complement of a collection of truth value intervals for a given query formula, we simply adjust how constraints are expressed so as to force feasible solutions into the set of complementary intervals. Specifically, while the usual constraints require the formula’s truth value to lie within one of its intervals, the complementary constraints require the formula’s truth value not to lie in any of its intervals, i.e., to lie to the left or the right of each of its intervals. We then reverse the direction of each interval’s lower and upper bound constraints, adding or removing the gap variable δ\delta as appropriate to switch between strict and nonstrict inequalities, and introduce Boolean variables to decide which of each pair of constraints should apply, i.e., to decide whether the formula’s truth value should lie to the left or to the right. For example, if a simple sentence has truth value intervals [.2,.3],(.5,1][.2,.3],(.5,1], the above would produce constraints

x ≤ .2 − δ + (1 − b₁),
x ≥ .3 + δ − b₁,
x ≤ .5 + (1 − b₂),
x ≥ 1 + δ − b₂

Observe that the last of these cannot be satisfied for δ>0\delta>0 unless b2=1b_{2}=1, which is consistent with the complement of (.5,1](.5,1] not having a right side in the interval [0,1][0,1].
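A sketch of how the complementary constraints above might be generated for an arbitrary list of intervals, again written against the docplex API; the helper and its (lo, hi, lo_closed, hi_closed) argument format are illustrative assumptions rather than socratic's actual interface:

from docplex.mp.model import Model

def add_complement_constraints(mdl, x, delta, intervals):
    """Force x to lie outside every interval (lo, hi, lo_closed, hi_closed):
    to its left when the per-interval Boolean b = 1, to its right when b = 0.
    The gap variable delta enforces strictness exactly when the endpoint is closed."""
    for lo, hi, lo_closed, hi_closed in intervals:
        b = mdl.binary_var()
        mdl.add_constraint(x <= lo - (delta if lo_closed else 0) + (1 - b))
        mdl.add_constraint(x >= hi + (delta if hi_closed else 0) - b)

mdl = Model(name="complement")
x = mdl.continuous_var(lb=0, ub=1, name="x")
delta = mdl.continuous_var(lb=0, ub=1, name="delta")
# The example from the text: intervals [.2, .3] and (.5, 1].
add_complement_constraints(mdl, x, delta, [(0.2, 0.3, True, True), (0.5, 1.0, False, True)])
mdl.maximize(delta)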

6.3 Experimental results

We tested socratic in four different experimental contexts:

  • 3SAT and higher kk-SAT problems which become satisfiable if any one of their input clauses is removed

  • 82 axioms and tautologies taken from Hájek in (10), some of which hold only for one of Łukasiewicz or Gödel logic

  • A formula that is classically valid but invalid in both Łukasiewicz and Gödel logic, unless propositions are constrained to be Boolean

  • A stress test running socratic on sentences with thousands of intervals

All experiments are conducted on a MacBook Pro with a 2.9 GHz Quad-Core Intel Core i7, 16 GB 2133 MHz LPDDR3, and Intel HD Graphics 630 1536 MB running macOS Catalina 10.15.5.

kk-SAT.

We construct classically unsatisfiable kk-SAT problems of the form

(x₁ & ¬x₁) ⊻ ⋯ ⊻ (xₖ & ¬xₖ)   (15)

which, after CNF conversion, yields for 3SAT

(x₁ ⊻ x₂ ⊻ x₃),  (¬x₁ ⊻ x₂ ⊻ x₃),  (x₁ ⊻ ¬x₂ ⊻ x₃),
(x₁ ⊻ x₂ ⊻ ¬x₃),  (x₁ ⊻ ¬x₂ ⊻ ¬x₃),  (¬x₁ ⊻ x₂ ⊻ ¬x₃),
(¬x₁ ⊻ ¬x₂ ⊻ x₃),  (¬x₁ ⊻ ¬x₂ ⊻ ¬x₃)

and similarly for larger kk. The removal of any one clause in such a problem renders it satisfiable. We observe that, when each clause is required to have truth value exactly 1 but propositions are allowed to have any truth value, socratic correctly determines the problem to be

  1. unsatisfiable in Gödel logic,

  2. satisfiable in Gödel logic when dropping any one clause,

  3. trivially satisfiable in Łukasiewicz logic with, e.g., xᵢ = .5,

  4. again unsatisfiable in Łukasiewicz logic when propositions are required to have truth values in either [0, 1/k) or ((k−1)/k, 1],

  5. and yet again satisfiable in Łukasiewicz logic with constrained propositions when dropping any one clause.
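The 2^k clauses of the CNF above can be generated mechanically for any k; the following is a hypothetical helper (not part of socratic) that reproduces, up to ordering, the eight 3SAT clauses listed before the findings:

from itertools import product

def unsat_ksat_clauses(k):
    """Every clause (l_1 ⊻ ... ⊻ l_k) with l_i either x_i or ¬x_i, i.e., the CNF of (15)."""
    clauses = []
    for signs in product((False, True), repeat=k):
        literals = [f"¬x{i + 1}" if negated else f"x{i + 1}" for i, negated in enumerate(signs)]
        clauses.append("(" + " ⊻ ".join(literals) + ")")
    return clauses

print(unsat_ksat_clauses(3))  # 8 clauses; dropping any one of them makes the problem satisfiable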

We observe that Gödel logic is much slower than Łukasiewicz logic as implemented in socratic, likely because it performs mins and maxes over many arguments throughout, while Łukasiewicz logic instead performs sums, with simpler mins and maxes serving only as clamps to the [0,1][0,1] range. Interestingly, the runtime difference between the unsatisfiable and satisfiable problems in Gödel logic is significant; while the satisfiable problems have one fewer clause, the difference is more likely explained by socratic finding a feasible solution quickly. On the other hand, the unsatisfiable and satisfiable problems (with constrained propositions) take roughly the same amount of time in Łukasiewicz logic, though the trivially satisfiable problem is quicker. The apparent exponential increase in runtime is partially explained by the fact that each larger problem has twice as many clauses, but runtime appears to grow by slightly more than a factor of 2 with each increment of kk.

Table 1: kk-SAT runtimes in seconds for socratic with different experimental configurations. The five columns pertain to items 1 through 5 above. The problem is unsatisfiable in classical and Gödel logic, satisfiable in Gödel logic after removing a clause at random, trivially satisfiable in Łukasiewicz logic with, e.g., xi=.5x_{i}=.5, unsatisfiable in Łukasiewicz logic if propositions are required to lie in ranges sufficiently close to 0 and 1, and again satisfiable in Łukasiewicz logic with constrained propositions when removing a clause at random.
Gödel Gödel Łuka. Łuka. Łuka.
kk unsat. satisf. trivial unsat. satisf.
3 .012 .011 .014 .019 .014
4 .022 .020 .022 .031 .033
5 .054 .043 .041 .047 .043
6 .121 .107 .064 .104 .098
7 .204 .255 .173 .167 .206
8 .404 .414 .273 .286 .308
9 .861 .881 .507 .539 .554
10 5.46 1.99 1.03 1.11 1.17
11 18.0 4.34 2.09 2.44 2.21
12 33.3 10.9 4.36 5.06 5.01
13 119 25.8 8.72 12.4 12.3
14 696 71.0 18.4 38.0 35.6

Hájek tautologies.

Hájek lists many axioms and tautologies pertaining to a system of logic he describes as basic logic (BL), whose theorems are valid in every continuous t-norm logic, as well as a number of tautologies specific to Łukasiewicz and Gödel logic, all of which should have truth value exactly 1. We implement these tautologies in socratic and test whether the empty theory can entail them with truth value 1 in their respective logics. The BL tautologies are divided into batches pertaining to specific operations and properties:

axioms

8 tests, e.g., (φψ)((ψχ)(φχ))(\varphi\Rightarrow\psi)\Rightarrow((\psi\Rightarrow\chi)\Rightarrow(\varphi\Rightarrow\chi))

implication

3 tests, e.g., φ(ψφ)\varphi\Rightarrow(\psi\Rightarrow\varphi)

conjunction

6 tests, e.g., (φ(φψ))ψ(\varphi\otimes(\varphi\Rightarrow\psi))\Rightarrow\psi

weak_conjunction

7 tests, e.g., (φ&ψ)φ(\varphi\mathbin{\&}\psi)\Rightarrow\varphi

weak_disjunction

7 tests, e.g., φ(φψ)\varphi\Rightarrow(\varphi\veebar\psi)

negation

8 tests, e.g., φ(¬φψ)\varphi\Rightarrow(\neg\varphi\Rightarrow\psi)

associativity

6 tests, e.g., (φ&(ψ&χ))((φ&ψ)&χ)(\varphi\mathbin{\&}(\psi\mathbin{\&}\chi))\Rightarrow((\varphi\mathbin{\&}\psi)\mathbin{\&}\chi)

equivalence

9 tests, e.g., ((φψ)(ψχ))(φχ)((\varphi\equiv\psi)\otimes(\psi\equiv\chi))\Rightarrow(\varphi\equiv\chi)

distributivity

8 tests, e.g., (φ(ψχ))((φψ)(φχ))(\varphi\otimes(\psi\veebar\chi))\equiv((\varphi\otimes\psi)\veebar(\varphi\otimes\chi))

delta_operator

3 tests, e.g., ΔφΔ(φφ)\Delta\varphi\equiv\Delta(\varphi\otimes\varphi)

In addition, there are logic-specific batches of tautologies:

lukasiewicz

12 tests, e.g., ¬¬φφ\neg\neg\varphi\equiv\varphi

godel

5 tests, e.g., φ(φφ)\varphi\Rightarrow(\varphi\otimes\varphi)

Each of the above BL batches completes successfully for both logics, and each of the logic-specific batches completes for its respective logic and, as expected, fails for the other logic. The runtime of each individual test is negligible; the entire suite of 82 tautologies, run on both logics, completes in just 2.911 seconds.

Boolean logic.

We consider a formula σ\sigma defined

(φ ⇒ ψ) ⇒ ((¬φ ⇒ ψ) ⇒ ψ)   (16)

which is valid in classical logic but is not entailed with truth value 1 by the empty theory in either Łukasiewicz or Gödel logic. Conversely, constraining propositions φ\varphi and ψ\psi to have classical truth values by introducing the sentences

(φ; [0, 0] ∪ [1, 1]),
(ψ; [0, 0] ∪ [1, 1])

into the theory succeeds in entailing σ\sigma in either logic. Indeed, if even one of these sentences is added, σ\sigma is entailed, but no looser intervals around 0 and 1 can entail σ\sigma if both propositions are non-Boolean. In the other direction, Łukasiewicz logic with unconstrained propositions entails the sentence (σ;[.5,1])(\sigma;[.5,1]), i.e., σ\sigma with truth value at least .5, while Gödel logic with unconstrained propositions cannot entail σ\sigma with any interval tighter than [0,1][0,1]. As a final example of the interaction between truth value intervals, Gödel logic entails (σ;[t,1])(\sigma;[t,1]) for a lower bound truth value tt if either of φ\varphi or ψ\psi is constrained in the theory to have the set of candidate truth values {0}[t,1]\{0\}\cup[t,1].

Stress test.

We consider the experimental configuration for Boolean logic above now with query (σ;S)(\sigma;S) for SS consisting of 10000 open intervals (1k+1,1k)(\frac{1}{k+1},\frac{1}{k}) for kk from 2 to 10000 plus the closed interval [.5,1][.5,1] and (φ;S)(\varphi;S^{\prime}) and (ψ;S)(\psi;S^{\prime}) for SS^{\prime} consisting of 10000 open intervals (11k,11k+1)(1-\frac{1}{k},1-\frac{1}{k+1}) plus the closed interval [0,0][0,0]. We observe the runtime of socratic to be just 11.8 seconds for Gödel logic and 9.38 seconds for Łukasiewicz logic. If we instead use closed intervals throughout, measured runtimes are 17.4 seconds for Gödel and 9.29 seconds for Łukasiewicz.

7 Dealing with weights

In some circumstances, such as logical neural networks (4), weights are assigned to subformulas, where the weight is intended to reflect the influence, or importance, of the subformula. Each weight is a real number. For example, in the formula σ1σ2\sigma_{1}\veebar\sigma_{2}, the weight w1w_{1} might be assigned to σ1\sigma_{1} and the weight w2w_{2} assigned to σ2\sigma_{2}. If 0<w1=2w20<w_{1}=2w_{2}, this might indicate that σ1\sigma_{1} is twice as important as σ2\sigma_{2} in evaluating the value of σ1σ2\sigma_{1}\veebar\sigma_{2}.

As an example of a possible way to incorporate weights, assume that we are using Łukasiewicz real-valued logic, where the value of σ1σ2\sigma_{1}\veebar\sigma_{2} is min{1,s1+s2}\min\{1,s_{1}+s_{2}\}, where s1s_{1} is the value of σ1\sigma_{1} and s2s_{2} is the value of σ2\sigma_{2}. If the weights of σ1\sigma_{1} and σ2\sigma_{2} are w1w_{1} and w2w_{2}, respectively, and if both w1w_{1} and w2w_{2} are non-negative, then we might take the value of σ1σ2\sigma_{1}\veebar\sigma_{2} in the presence of these weights to be min{1,w1s1+w2s2}\min\{1,w_{1}s_{1}+w_{2}s_{2}\}.
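For concreteness, this particular weighting scheme is a one-line function (one possible choice from the text; the framework does not prescribe it):

def weighted_lukasiewicz_or(s1, s2, w1, w2):
    """Value of (sigma_1 ⊻ sigma_2, w1, w2) with nonnegative weights, per the example above."""
    return min(1.0, w1 * s1 + w2 * s2)

print(weighted_lukasiewicz_or(0.4, 0.3, 2.0, 1.0))  # 2*0.4 + 1*0.3 = 1.1, clamped to 1.0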

We now show how to incorporate weights into our approach. In fact, the ease of incorporating weights and still getting a sound and complete axiomatization is a real advantage of our approach!

To deal with weights, we define an expanded view of what a formula is, defined recursively. Each atomic proposition is a formula. If σ1\sigma_{1} and σ2\sigma_{2} are formulas, w1w_{1} and w2w_{2} are weights, and α\alpha is a binary connective (such as &\mathbin{\&}) then (σ1𝛼σ2,w1,w2)(\sigma_{1}\mathop{\alpha}\sigma_{2},w_{1},w_{2}) is a formula. Here w1w_{1} is interpreted as the weight of σ1\sigma_{1} and w2w_{2} as the weight of σ2\sigma_{2} in the formula σ1𝛼σ2\sigma_{1}\mathop{\alpha}\sigma_{2}. Also, if σ\sigma is a formula, ww is a weight, and ρ\rho is a unary connective (such as ¬\neg), then (ρσ,w)(\rho\sigma,w) is a formula, where ww is interpreted as the weight of σ\sigma. We modify our definition of subformula as follows. The subformulas of (σ1𝛼σ2,w1,w2)(\sigma_{1}\mathop{\alpha}\sigma_{2},w_{1},w_{2}) are σ1\sigma_{1} and σ2\sigma_{2}, and the subformula of (ρσ,w)(\rho\sigma,w) is σ\sigma.

If α\alpha is a binary connective, then fαf_{\alpha} now has four arguments, rather than two. Thus, fα(s1,s2,w1,w2)f_{\alpha}(s_{1},s_{2},w_{1},w_{2}) is the value of the formula (σ1𝛼σ2,w1,w2)(\sigma_{1}\mathop{\alpha}\sigma_{2},w_{1},w_{2}) when the value of σ1\sigma_{1} is s1s_{1}, the value of σ2\sigma_{2} is s2s_{2}, the weight of σ1\sigma_{1} is w1w_{1}, and the weight of σ2\sigma_{2} is w2w_{2}. Also, fρf_{\rho} now has two arguments rather than one. Thus, fρ(s,w)f_{\rho}(s,w) is the value of the formula (ρσ,w)(\rho\sigma,w) when the value of σ\sigma is ss, and the weight of σ\sigma is ww.

The axiom and rules are just as before, except that Rule (7) is changed to:

From (σ₁, …, σₖ; S) infer (σ₁, …, σₖ; S′)   (17)

where S={(s1,,sk):(s1,,sk)SS^{\prime}=\{(s_{1},\ldots,s_{k})\colon(s_{1},\ldots,s_{k})\in S and (a) sm=fα(si,sj,w1,w2)s_{m}=f_{\alpha}(s_{i},s_{j},w_{1},w_{2}) when σm\sigma_{m} is (σi𝛼σj,w1,w2)(\sigma_{i}\mathop{\alpha}\sigma_{j},w_{1},w_{2}) and α\alpha is a binary connective, and (b) sj=fρ(si,w)s_{j}=f_{\rho}(s_{i},w) when σj\sigma_{j} is (ρσi,w)(\rho\sigma_{i},w) and ρ\rho is a unary connective}\}.

We can extend Theorem 3.7 (soundness and completeness) and Theorem 4.3 (closure under Boolean combinations) to deal with our sentences (σ1,,σk;S)(\sigma_{1},\ldots,\sigma_{k};S) that include weights. The proofs go through just as before, where we use Rule (17) instead of Rule (7). Thus, we obtain the following theorems.

Theorem 7.1.

Our axiom system where Rule (7) is replaced by Rule (17) is sound and complete for sentences of the form (σ1,,σk;S)(\sigma_{1},\ldots,\sigma_{k};S) that include weights.

Theorem 7.2.

The sentences (σ1,,σk;S)(\sigma_{1},\ldots,\sigma_{k};S) that include weights are closed under Boolean combinations.

What about the decision procedure in Theorem 6.1? The key step is the use of a polynomial-time algorithm for linear programming. If we were to hold the weights wiw_{i} as fixed rational constants, and if the weighting functions were linear (such as w1s1+w2s2w_{1}s_{1}+w_{2}s_{2}), possibly including a min or a max, then we could use linear programming, and the decision procedure would go through in the presence of weights.
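A brief sketch, in the style of the earlier linprog example, of why the procedure still goes through: with the weights fixed to rational constants, the weighted Łukasiewicz constraint s_m = min{1, w1·s_i + w2·s_j} splits into two linear systems just as in the unweighted case (the weights and the gap used for the strict inequality are illustrative assumptions):

import numpy as np
from scipy.optimize import linprog

W1, W2, EPS = 2.0, 1.0, 1e-8  # example fixed rational weights; EPS stands in for a strict inequality

def feasible(A_ub, b_ub, A_eq, b_eq):
    # Variables are ordered (s_i, s_j, s_m), each bounded to [0, 1].
    res = linprog(c=np.zeros(3), A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0.0, 1.0)] * 3, method="highs")
    return res.status == 0

# Split 1: w1*s_i + w2*s_j <= 1 and s_m = w1*s_i + w2*s_j.
split1 = feasible(A_ub=[[W1, W2, 0]], b_ub=[1], A_eq=[[W1, W2, -1]], b_eq=[0])

# Split 2: w1*s_i + w2*s_j > 1 and s_m = 1.
split2 = feasible(A_ub=[[-W1, -W2, 0]], b_ub=[-(1 + EPS)], A_eq=[[0, 0, 1]], b_eq=[1])

print(split1 or split2)  # True: the weighted constraint is satisfiable over [0, 1]^3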

8 Related work

Rosser (12) comments on the possibility of considering formulas whose value is guaranteed to be at least α\alpha. For example, in Łukasiewicz logic, if we consider weak disjunction ¯\underline{\veebar}, where f¯(s1,s2)=max(s1,s2)f_{\underline{\veebar}}(s_{1},s_{2})=\max(s_{1},s_{2}), then the real value of A¯¬AA\underline{\veebar}\neg A is always at least 0.5, since f¬(s)=1sf_{\neg}(s)=1-s. But Rosser rejects this approach, since he notes that there are uncountably many choices for α\alpha, but only countably many recursively enumerable sets (and an axiomatization would give a recursively enumerable set of valid formulas).

Belluce (13) investigates when the set of formulas with values at least α\alpha is recursively enumerable. Font et al. (14) consider the question of what they call “preservation of degrees of truth”. They give a method for deciding, for a fixed α\alpha, if σ\sigma having a value at least α\alpha implies that φ\varphi has value at least α\alpha.

Novák (15) considered a logic with sentences that assign a real value to each formula of first-order real-valued logic. Thus, using our notation, his sentences would be of the form (φ;{α})(\varphi;\{\alpha\}), where φ\varphi is a formula in first-order real-valued logic, and α\alpha is a single real value. He gave a sound and complete axiomatization.

An interesting logic is the rational Pavelka logic RPL, an expansion of standard Łukasiewicz logic where rational truth-constants are allowed in formulas. For example, if rr is a rational number, then the formula rφr\rightarrow\varphi says that the value of φ\varphi is at least rr, and the formula φr\varphi\rightarrow r says that the value of φ\varphi is at most rr. Therefore, this logic can express the MD-sentences (φ;S)(\varphi;S) when SS is the union of a finite number of closed intervals. However, it cannot express strict inequalities. For example, it cannot express that the value of φ\varphi is strictly greater than 0.5. (This follows from the stronger fact that if A1,,ArA_{1},\ldots,A_{r} are the atomic propositions, φ\varphi is a formula, and GG is the set of all value assignments to the atomic propositions that give φ\varphi the truth value 1, then, since the operators of standard Łukasiewicz logic are continuous (and so the value of φ\varphi is a continuous function of the values of the atomic propositions), the set {(g(A1),,g(Ar)):gG}\{(g(A_{1}),\cdots,g(A_{r})):\ g\in G\} is a closed subset of [0,1]r[0,1]^{r}. Note that if r=0.5r=0.5, then even though the formula ArA\rightarrow r has the value 1 when the value aa of AA is at most 0.5, the negation ¬(Ar)\neg(A\rightarrow r) does not have the value 1 when a>0.5a>0.5; instead it has the value a0.5a-0.5.) RPL was introduced by Hájek in (10) as a simplification of the system proposed by Pavelka in (16), in which the syntax contained a truth-constant for each real number in the interval [0,1]. Hájek showed that an analogous logic could be presented as an expansion of Łukasiewicz propositional logic with truth-constants only for the rational numbers in [0,1], and he gave a corresponding completeness theorem. Moreover, first-order fuzzy logics with real or rational constants have been studied in depth, starting from Novák’s extension of Pavelka’s logic to a first-order predicate language in (17) (see, e.g., (18)).

Each of (19), (20), and (21) gives a decision procedure that partially covers the situation we allow in Section 6. The first two support only Łukasiewicz logic. The third, like our decision procedure, works for a variety of logics, though it is explicitly established in (21) that the approach does not support discontinuous operators. Accordingly, unlike our decision procedure, it does not work for Gödel logic, given its discontinuous implication operator.

9 Conclusions

We give a sound and strongly complete axiomatization for sentences about real-valued formulas. By being parameterized, our axiomatization covers essentially all real-valued logics. Our axiomatization allows us to include weights on formulas. The results give us a way to establish such properties for neuro-symbolic systems that aim or purport to perform logical inference with real values. The algorithm described gives us a constructive existence proof and a baseline approach for well-founded inference. Because LNNs (4) are exactly a weighted real-valued logical system implemented in neural network form, an important immediate upshot of our results for the weighted case is that they provide provably sound and complete logical inference for LNNs. Such a result has not previously been established for any neuro-symbolic approach to our knowledge. While our main motivation was to pave the way forward for neuro-symbolic systems, our results are fundamental, filling a long-standing gap in a very old literature, and can be applied well beyond AI.

\acknow

We are very grateful to Marco Carmosino, who improved the writing in this paper by giving us many helpful comments. We are also grateful to Guillermo Badia, Ken Clarkson, Didier Dubois, Phokion Kolaitis, Carles Noguera, and Henri Prade for helpful comments. Finally, we are grateful to Lluis Godo for confirming the novelty of our approach, and for helpful comments.

\showacknow

References

  • (1) L Serafini, Ad Garcez, Logic tensor networks: Deep learning and logical reasoning from data and knowledge. arXiv preprint arXiv:1606.04422 (2016).
  • (2) SH Bach, M Broecheler, B Huang, L Getoor, Hinge-loss Markov random fields and probabilistic soft logic. The Journal of Machine Learning Research 18, 3846–3912 (2017).
  • (3) W Cohen, F Yang, KR Mazaitis, TensorLog: A probabilistic database implemented using deep-learning infrastructure. Journal of Artificial Intelligence Research 67, 285–325 (2020).
  • (4) R Riegel, et al., Logical neural networks. arXiv preprint arXiv:2006.13155 (2020).
  • (5) G Boole, An investigation of the laws of thought: on which are founded the mathematical theories of logic and probabilities. (Walton and Maberly) Vol. 2, (1854).
  • (6) LA Zadeh, Fuzzy logic and approximate reasoning. Synthese 30, 407–428 (1975).
  • (7) V Novák, A formal theory of intermediate quantifiers. Fuzzy Sets and Systems 159, 1229–1246 (2008).
  • (8) G Epstein, Multiple-valued logic design: an introduction. (CRC Press), (1993).
  • (9) R Fagin, A Lotem, M Naor, Optimal aggregation algorithms for middleware. Journal of Computer and System Sciences 66, 614–656 (2003).
  • (10) P Hájek, Metamathematics of fuzzy logic. (Springer Science & Business Media) Vol. 4, (1998).
  • (11) N Rescher, Many-valued logic. (McGraw-Hill), (1969).
  • (12) JB Rosser, Axiomatization of infinite valued logics. Logique et Analyse 3, 137–153 (1960).
  • (13) L Belluce, Further results on infinite valued predicate logic. J. Symbolic Logic 29, 69–78 (1964).
  • (14) JM Font, ÀJ Gil, A Torrens, V Verdu, On the infinite-valued Łukasiewicz logic that preserves degrees of truth. Arch. Math. Logic 45, 835–868 (2006).
  • (15) V Novák, Fuzzy logic with extended syntax. Handbook of Mathematical Fuzzy Logic 3, 1063–1104 (2015).
  • (16) J Pavelka, On fuzzy logic I, II, III. Zeitschrift für Mathematische Logik und Grundlagen der Mathematik 29, 45–52, 119–134, 447–464 (1979).
  • (17) V Novák, On the syntactico-semantical completeness of first-order fuzzy logic part I (syntax and semantics), part II (main results). Kybernetika 26, 47–66, 134–154 (1990).
  • (18) F Esteva, L Godo, C Noguera, First-order t-norm based fuzzy logics with truth-constants: distinguished semantics and completeness properties. Annals of Pure and Applied Logic 161, 185–202 (2009).
  • (19) G Beavers, Automated theorem proving for Łukasiewicz logics. Studia Logica 52, 183–195 (1993).
  • (20) D Mundici, A constructive proof of McNaughton’s theorem in infinite-valued logic. The Journal of Symbolic Logic 59, 596–602 (1994).
  • (21) R Hähnle, Many-valued logic and mixed integer programming. Annals of Mathematics and Artificial Intelligence 12, 231–263 (1994).