Lead author: Fagin
Significance statement: This work is a step in the direction of explainable AI as it pertains to logical inference in neural networks. This may ultimately assist in preventing unfair, unwarranted, or otherwise undesirable outcomes from the application of modern AI methods.
Correspondence: To whom correspondence should be addressed. E-mail: fagin@us.ibm.com
Foundations of Reasoning with Uncertainty via Real-valued Logics
Abstract
Real-valued logics underlie an increasing number of neuro-symbolic approaches, though typically their logical inference capabilities are characterized only qualitatively. We provide foundations for establishing the correctness and power of such systems. We give a sound and strongly complete axiomatization that can be parametrized to cover essentially every real-valued logic, including all the common fuzzy logics. Our class of sentences are very rich, and each describes a set of possible real values for a collection of formulas of the real-valued logic, including which combinations of real values are possible. Strong completeness allows us to derive exactly what information can be inferred about the combinations of real values of a collection of formulas given information about the combinations of real values of several other collections of formulas. We then extend the axiomatization to deal with weighted subformulas. Finally, we give a decision procedure based on linear programming for deciding, for certain real-valued logics and under certain natural assumptions, whether a set of our sentences logically implies another of our sentences.
Keywords: real-valued logic, strongly complete axiomatization
This manuscript was compiled on August 10, 2025
Recent years have seen growing interest in approaches for augmenting the capabilities of learning-based methods with those of reasoning, often broadly referred to as neuro-symbolic (though they may not be strictly neural). One of the key goals that neuro-symbolic approaches have at their root is logical inference, or reasoning. However, the representation of classical 0-1 logic (where truth values of sentences are either 0, representing “False”, or 1, representing “True”) is generally insufficient for this goal because representing uncertainty is essential to AI. In order to merge with the ideas of neural learning, the truth values dealt with must be real-valued (we shall take these to be real numbers in the interval $[0,1]$, where intuitively, 0 means “completely false”, and 1 means “completely true”), whether the uncertainty semantics are those of probabilities, subjective beliefs, neural network activations, or fuzzy set memberships. For this reason, many major approaches have turned to real-valued logics. Logic tensor networks (1) define a logical language on real-valued vectors corresponding to groundings of terms computed by a neural network, which can use any of the common real-valued logics (e.g., Łukasiewicz, product, or Gödel logic) for its connectives (e.g., $\&$, $\veebar$, $\neg$, and $\rightarrow$). Probabilistic soft logics (2) draw a correspondence of their approach based on Markov random fields (MRFs) with satisfiability of statements in a real-valued logic (Łukasiewicz). Tensorlog (3), also based on MRFs but implemented in neural network frameworks, draws a correspondence of its approach to the use of connectives in a real-valued logic (product). Logical neural networks (LNNs) (4) draw a correspondence between activation functions of neural networks and connectives in real-valued logics.
To complete a full correspondence between neural networks and statements in real-valued logic, LNN defines a class of real-valued logics allowing weighted inputs, which represent the relative influence of subformulas. While reasoning is widely regarded as fundamental to the goal of AI, the reasoning capabilities of the aforementioned systems are typically characterized qualitatively rather than quantitatively and mathematically. While learning theory (roughly, what it means to perform learning) is well articulated and, for 0-1 logic, what it means to perform reasoning is well studied, reasoning is surprisingly not well formalized for real-valued logics. As reasoning becomes an increasingly central goal of learning-based work, it becomes important to have a solid mathematical footing for it.
Formalization of the idea of real-valued logics is old and fundamental, going back to the origins of formal logic. It is not well known that Boole himself invented a probabilistic logic in the 19th century (5), where formulas were assigned real values corresponding to probabilities. It was used in AI to model the semantics of vague concepts for commonsense reasoning by expert systems (6). Real-valued logic is used in linguistics to model certain natural language phenomena (7), in hardware design to deal with multiple stable voltage levels (8), and in databases to deal with queries that are composed of multiple graded notions, such as the redness of an object, that can range from 0 (“not at all red”) to 1 (“completely red”) (9). Despite all this, while definitions of logical correctness and power (generally, soundness and completeness) are well established and corresponding procedures for theorem proving having those properties are abundant for classical logics, the equivalents for real-valued logics (where the values can take arbitrary values between 0 and 1) are rather limited.
This paper.
In this paper, there are two levels of logic. In the “inner” layer, we have formulas of the real-valued logic with its logical connectives. In this inner layer, we shall use $\&$ for “and” and $\veebar$ for “or”, as is done in (10). In the “outer” layer, we have a novel class of sentences about the inner real-valued logic (such as saying which truth values a given real-valued formula may attain). For these sentences (which take on only the classical values 0 and 1 for False and True, respectively), we make use of the traditional logical symbols $\wedge$ for “and” and $\vee$ for “or”. We remark that, somewhat confusingly, the symbols $\wedge$ and $\vee$ are often used in real-valued logics for weaker versions of “and” and “or” than those given by $\&$ and $\veebar$, which we do not need to discuss in this paper.
Let us say that an axiomatization of a logic is strongly complete if whenever $\Gamma$ is a finite set of sentences in the (outer) logic and $\gamma$ is a single sentence in the (outer) logic that is a logical consequence of $\Gamma$, then there is a proof of $\gamma$ from $\Gamma$ using the axiomatization. An axiomatization is weakly complete if this holds for $\Gamma = \emptyset$. That is, an axiomatization is weakly complete if whenever $\gamma$ is a valid sentence (always true), then there is a proof of $\gamma$ using the axiomatization. Early axiomatizations of real-valued logics in the literature were typically weakly complete, but now have been improved to strongly complete (see (10) for examples).
We now explain why it is necessary to assume that is finite in the definition of strong completeness. (In our explanation, we make use of ideas from (10).) Let us restrict to Łukasiewicz logic. Let denote , where appears times. Let be the infinite set of sentences for , along with which says that the value of is less than 1. Let be . We now show that logically implies . Assume that holds but does not hold. Therefore, the value of is less than 1. It then follows from the definition of conjunction in Łukasiewicz logic that there is such that has value 0 . From this then implies that the value of is 0, so holds. Hence, logically implies . Because our proofs are of finite length, there cannot be a proof of from , since this would give a proof of from a finite subset of , but no finite subset of logically implies . A natural open problem is whether we can allow to be infinite if we were to restrict our attention to Gödel logic.
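The argument above uses the fact that, in Łukasiewicz logic, repeatedly conjoining a value strictly below 1 with itself reaches 0 after finitely many steps. The following is a minimal sketch of that fact only; the function names `luk_and` and `first_zero_power` are hypothetical and chosen for illustration, and exact rationals are used to avoid rounding.

```python
from fractions import Fraction


def luk_and(s1, s2):
    """Lukasiewicz (strong) conjunction: max(0, s1 + s2 - 1)."""
    return max(Fraction(0), s1 + s2 - 1)


def first_zero_power(s):
    """Smallest n such that the n-fold conjunction of the value s is 0.

    Requires 0 <= s < 1; for s = 1 the conjunction stays at 1 forever.
    """
    if not (0 <= s < 1):
        raise ValueError("s must satisfy 0 <= s < 1")
    value, n = s, 1
    while value > 0:
        value = luk_and(value, s)  # one more copy of the same value
        n += 1
    return n


print(first_zero_power(Fraction(9, 10)))
```

The closer the value is to 1, the larger the required number of conjuncts, which is why no single finite bound works for all values below 1.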
We introduce a rich, novel class of sentences.
1. These sentences can say what the set of possible values is for a formula $\sigma$. This set can be a singleton $\{s\}$ (meaning that the real value of $\sigma$ is $s$), or can be an interval, or a union of intervals, or in fact an arbitrary subset of $[0,1]$, e.g. the set of rational numbers in $[0,1]$.
2. Our sentences can give not only the possible real values of formulas, but the interactions between these values. For example, if $\sigma_1$ and $\sigma_2$ are formulas, our sentences can not only say what the possible real values are for each of $\sigma_1$ and $\sigma_2$, but also how they interact: thus, if $s_1$ is the real value of $\sigma_1$ and $s_2$ is the real value of $\sigma_2$, then there is a sentence in our logic that says $(s_1, s_2)$ must lie in the set $S$ of ordered pairs, where $S$ is an arbitrary subset of $[0,1]^2$. We give a sound and strongly complete axiomatization for our sentences.
3. Unlike the other axiomatizations mentioned earlier, our axiomatization can be extended to include the use of weights for subformulas (where, for example, in the formula $\sigma_1 \veebar \sigma_2$, the subformula $\sigma_1$ is considered twice as important as the subformula $\sigma_2$).
4. A surprising feature of our axiomatization is that it is parametrized, so that this one axiomatization is sound and strongly complete for essentially every real-valued logic, including those that do not obey the standard restrictions on fuzzy logics (such as conjunction being commutative). Previous axiomatizations in the literature had a separate set of axioms for each real-valued logic (for example, one of the axioms for Łukasiewicz logic is $\neg\neg\sigma \rightarrow \sigma$, and one of the axioms for Gödel logic is $\sigma \rightarrow (\sigma \,\&\, \sigma)$). In the axiomatizations mentioned earlier, each connective has a fixed associated function that tells how to evaluate it. For example, in Łukasiewicz logic, the value of $\sigma_1 \,\&\, \sigma_2$ is $f_{\&}(s_1, s_2)$, where $s_1$ is the value of $\sigma_1$ and $s_2$ is the value of $\sigma_2$, and where $f_{\&}(s_1, s_2) = \max\{0, s_1 + s_2 - 1\}$. By contrast, for our axiomatization, $f_{\&}$ is arbitrary, as long as it maps $[0,1]^2$ into $[0,1]$.
From now on (except in the Section 8 on related work) we use “complete” to mean “strongly complete”.
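An especially useful real-valued logic for logical neural nets is Łukasiewicz logic, for several reasons. First, the $\&$, $\veebar$, $\neg$, and $\rightarrow$ operators are essentially linear, in that if $s_1$ is the truth value of a formula $\sigma_1$, and $s_2$ is the truth value of a formula $\sigma_2$, then (a) $\sigma_1 \,\&\, \sigma_2$ has value $\max\{0, s_1 + s_2 - 1\}$, (b) $\sigma_1 \veebar \sigma_2$ has value $\min\{1, s_1 + s_2\}$, (c) $\neg\sigma_1$ has value $1 - s_1$, and (d) $\sigma_1 \rightarrow \sigma_2$ is equivalent to $\neg\sigma_1 \veebar \sigma_2$, and so has value $\min\{1, 1 - s_1 + s_2\}$. (Footnote 1: The versions of $\&$ and $\veebar$ we describe here are sometimes called in the literature strong conjunction and strong disjunction. Weak conjunction is given by $\min\{s_1, s_2\}$ and weak disjunction is given by $\max\{s_1, s_2\}$.) Second, it is easy to incorporate weights. Thus, if $w_1$ and $w_2$ are nonnegative weights of $\sigma_1$ and $\sigma_2$, respectively, then we can take the weighted value of $\sigma_1 \veebar \sigma_2$ to be $\min\{1, w_1 s_1 + w_2 s_2\}$.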
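The Łukasiewicz connectives described above can be sketched as ordinary functions on numbers in the unit interval. This is only an illustration: the function names are hypothetical, and the weighted disjunction assumes the form min{1, w1·s1 + w2·s2}, which is one natural reading of "incorporating weights" and is labeled as an assumption in the code.

```python
from fractions import Fraction


def luk_and(s1, s2):
    """Strong conjunction: max(0, s1 + s2 - 1)."""
    return max(Fraction(0), s1 + s2 - 1)


def luk_or(s1, s2):
    """Strong disjunction: min(1, s1 + s2)."""
    return min(Fraction(1), s1 + s2)


def luk_not(s):
    """Negation: 1 - s."""
    return 1 - s


def luk_implies(s1, s2):
    """Implication, evaluated as (not s1) or s2, i.e. min(1, 1 - s1 + s2)."""
    return luk_or(luk_not(s1), s2)


def weighted_luk_or(s1, s2, w1, w2):
    """ASSUMED weighted disjunction: min(1, w1*s1 + w2*s2), with w1, w2 >= 0."""
    if w1 < 0 or w2 < 0:
        raise ValueError("weights must be nonnegative")
    return min(Fraction(1), w1 * s1 + w2 * s2)


s1, s2 = Fraction(7, 10), Fraction(6, 10)
print(luk_and(s1, s2), luk_or(s1, s2), luk_not(s1), luk_implies(s1, s2))
```

Each operator is a clamp of a linear expression in its inputs, which is the sense in which the operators are "essentially linear".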
Throughout this paper, we take the domain of each function in the real-valued logic to be $[0,1]$ or $[0,1]^2$ and the range to be $[0,1]$. This is a common assumption for many real-valued logics, but all of our results go through with obvious modifications if the domains are $D^k$ for possibly multiple choices of arity $k$ and range $D$, for arbitrary subsets $D$ of the reals. We note that real-valued logic can be viewed as a special case of multi-valued logic (11), although in multi-valued logic there is typically a finite set of truth values, not necessarily linearly ordered.
We also provide a decision procedure for deciding whether a set of our sentences logically implies another of our sentences, for certain real-valued logics and under certain natural assumptions. We implemented the decision algorithm, dubbed SoCRAtic logic (for Sound and Complete Real-valued Axiomatic logic), which we describe in detail and make available in source code.
Our sentences allow arbitrary real-valued logics, as does our sound and complete axiomatization, but our decision procedure depends heavily on the choice of real-valued logic, and in particular is tailored towards Łukasiewicz and Gödel logic. This is because a key portion of our decision procedure is linear programming, and we depend on the essentially linear nature of Łukasiewicz logic and the ease of dealing with min and max in Gödel logic.
Overview.
Until Section 7, we do not allow weights. In Section 1, we give our basic notions, including what a model is and what a sentence is. In particular, we define our sentences to be of the form $(\sigma_1, \ldots, \sigma_k; S)$, where the $\sigma_i$ are formulas, where $S$ is a set of tuples $(s_1, \ldots, s_k)$, and where the sentence says that if the value of each $\sigma_i$ is $s_i$, for $1 \le i \le k$, then $(s_1, \ldots, s_k) \in S$. In Section 2, we give our (only) axiom and our inference rules. In Section 3, we give our soundness and completeness theorem. In Section 4, we give a theorem that says that our sentences are closed under Boolean combinations. This helps show robustness of our class of sentences. In Section 5, we discuss possible simplifications of our sentences. In Section 6, we give the decision algorithm. In Section 7, we show how to extend our methodology to incorporate weights. In Section 8, we discuss related work. In the Conclusions, we review the implications for neuro-symbolic approaches.
1 Models and sentences
We assume a finite set of atomic propositions. These can be thought of as the leaves of a neural net, i.e., nodes with no inputs from other neurons. A model $M$ is an assignment $g$ of values to the atomic propositions. Thus, $M$ assigns a value $g(A) \in [0,1]$ to each atomic proposition $A$.
Let $F$ be the set of logical formulas over the atomic propositions, where we allow arbitrary finite sets of binary and unary connectives. Typical binary connectives are conjunction (denoted by $\&$), disjunction (denoted by $\veebar$), and implication (denoted by $\rightarrow$). Typical unary connectives are negation (denoted by $\neg$) and a delta function (denoted by $\Delta$). Sometimes $\neg\sigma$ is taken to be $\sigma \rightarrow 0$, and $\Delta$ is taken to be defined by $\Delta(x) = 1$ if $x = 1$ and 0 otherwise.
When considering only formulas with value 1, as most other works do when giving sound axiomatizations of real-valued logics, the convention is to consider a sentence to be simply a member of $F$. What if we want to take into account values other than 1?
We take a sentence to be an expression of the form $(\sigma_1, \ldots, \sigma_k; S)$, where $\sigma_1, \ldots, \sigma_k$ are in $F$, and where $S \subseteq [0,1]^k$. The intuition is that $(\sigma_1, \ldots, \sigma_k; S)$ says that if the value of each $\sigma_i$ is $s_i$, for $1 \le i \le k$, then $(s_1, \ldots, s_k) \in S$. We refer to our sentences as multi-dimensional sentences, or for short MD-sentences. For a fixed $k$, we refer to the MD-sentence as $k$-dimensional. The class of MD-sentences is robust. In particular, Theorem 4.3 says that MD-sentences are closed under Boolean combinations. We give a sound and (strongly) complete axiomatization that is parameterized to deal with an arbitrary fixed real-valued logic. This axiomatization allows us to derive exactly what information can be inferred about the combinations of values of a collection of formulas given information about the combinations of values of several other collections of formulas.
Note that we are not saying that the logic is multi-dimensional (which could mean that the values taken on by variables are vectors, not just numbers), but instead we are saying that the sentences in our "outer" logic are multi-dimensional. The "inner" logic we work with in this paper is real-valued, and real-valued logic has been heavily studied. What is novel in our paper are our multi-dimensional sentences.
Note that the set $S$ in $(\sigma_1, \ldots, \sigma_k; S)$ can be undecidable, even if $k = 1$ and every member of $S$ is a rational number. For example, we could then take $S$ to be the set of all numbers $1/n$, where $n$ is the Gödel number of a halting Turing machine. But our decision procedures involve only special sets $S$. Thus, we shall say in Section 6 that a sentence $(\sigma_1, \ldots, \sigma_k; S)$ is interval-based if $S$ is of the form $S_1 \times \cdots \times S_k$, where each $S_i$ is the union of a finite number of intervals with rational endpoints. And our decision procedure in that section deals with interval-based sentences. However, our sound and complete axiomatization in Section 3 makes no such assumptions about the sets $S$; in particular, the sets $S$ can be undecidable.
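For convenience, we assume throughout that in the sentence $(\sigma_1, \ldots, \sigma_k; S)$, we have that $\sigma_i$ and $\sigma_j$ are different formulas if $i \ne j$. We refer to $\sigma_1, \ldots, \sigma_k$ as the components of $(\sigma_1, \ldots, \sigma_k; S)$, and $S$ as the information set of $(\sigma_1, \ldots, \sigma_k; S)$.
Let $\gamma$ be the sentence $(\sigma_1, \ldots, \sigma_k; S)$. We now say what it means for a model $M$ to satisfy $\gamma$. For $1 \le i \le k$, let $s_i$ be the value of the formula $\sigma_i$ under the assignment of values to the atomic propositions given by the model $M$. We say that $M$ satisfies $\gamma$ if $(s_1, \ldots, s_k) \in S$. We then say that $M$ is a model of $\gamma$, and we write $M \models \gamma$. Note that if $\gamma$ is satisfiable, that is, has a model, then $S \ne \emptyset$.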
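The definition of satisfaction above can be sketched in a few lines. This is an illustrative sketch, not the authors' implementation: formulas are represented (by assumption) as atomic-proposition names or nested tuples such as `("and", "A", "B")`, the connectives are fixed to Łukasiewicz logic, and the information set is passed as a membership predicate `in_S` so that it may be an arbitrary subset of the unit cube.

```python
from fractions import Fraction

# Assumed connective semantics: Lukasiewicz logic.
BINARY = {
    "and": lambda a, b: max(Fraction(0), a + b - 1),
    "or": lambda a, b: min(Fraction(1), a + b),
    "implies": lambda a, b: min(Fraction(1), 1 - a + b),
}
UNARY = {"not": lambda a: 1 - a}


def evaluate(formula, model):
    """Value of a formula under a model (dict: atomic proposition -> value)."""
    if isinstance(formula, str):  # atomic proposition
        return model[formula]
    op, *args = formula
    values = [evaluate(arg, model) for arg in args]
    table = BINARY if len(values) == 2 else UNARY
    return table[op](*values)


def satisfies(model, components, in_S):
    """True iff the tuple of component values lies in the information set."""
    values = tuple(evaluate(c, model) for c in components)
    return in_S(values)


model = {"A": Fraction(3, 4), "B": Fraction(1, 2)}
components = ("A", ("and", "A", "B"))
# Information set {(s1, s2) : s2 <= s1}, given as a predicate.
print(satisfies(model, components, lambda t: t[1] <= t[0]))
```

Passing the set as a predicate keeps the sketch faithful to the remark that the information set need not have any particular structure.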
2 Axioms and inference rules
We have only one axiom:
(1)  $(\sigma; [0,1])$
Axiom (1) guarantees that all values are in $[0,1]$.
We now give our inference rules.
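If $\pi$ is a permutation of $1, \ldots, k$, then:
(2)  From $(\sigma_1, \ldots, \sigma_k; S)$ infer $(\sigma_{\pi(1)}, \ldots, \sigma_{\pi(k)}; S')$
where $S' = \{(s_{\pi(1)}, \ldots, s_{\pi(k)}) : (s_1, \ldots, s_k) \in S\}$.
Rule (2) simply permutes the order of the components.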
Our next inference rule is:
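(3)  From $(\sigma_1, \ldots, \sigma_k; S)$ infer $(\sigma_1, \ldots, \sigma_k, \sigma_{k+1}, \ldots, \sigma_m; S \times [0,1]^{m-k})$
Rule (3) extends $(\sigma_1, \ldots, \sigma_k; S)$ to include $\sigma_{k+1}, \ldots, \sigma_m$ with no nontrivial information being given about the new components.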
Our next inference rule is:
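(4)  From $(\sigma_1, \ldots, \sigma_k; S_1)$ and $(\sigma_1, \ldots, \sigma_k; S_2)$ infer $(\sigma_1, \ldots, \sigma_k; S_1 \cap S_2)$
Rule (4) enables us to join the information in $(\sigma_1, \ldots, \sigma_k; S_1)$ and $(\sigma_1, \ldots, \sigma_k; S_2)$.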
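Our next inference rule is the following (where $0 < r < k$):
(5)  From $(\sigma_1, \ldots, \sigma_k; S)$ infer $(\sigma_1, \ldots, \sigma_{k-r}; S')$
where $S' = \{(s_1, \ldots, s_{k-r}) : (s_1, \ldots, s_k) \in S\}$.
Intuitively, $S'$ is the projection of $S$ onto the first $k-r$ components. Rule (5) enables us to select information about $\sigma_1, \ldots, \sigma_{k-r}$ from information about $\sigma_1, \ldots, \sigma_k$.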
Our next inference rule is:
(6)  From $(\sigma_1, \ldots, \sigma_k; S)$ infer $(\sigma_1, \ldots, \sigma_k; S')$ if $S \subseteq S'$
Rule (6) says that we can go from more information to less information. The intuition is that smaller information sets are more informative.
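Rules (2) through (6) are all simple operations on the information set, which can be made concrete when the value domain is finite. The sketch below is illustrative only: it assumes the three-element domain {0, 1/2, 1} in place of the full unit interval (the paper notes that arbitrary subsets of the reals may serve as the domain), it assumes that Rule (5) drops the last r components with 0 < r < k, and all function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# ASSUMPTION: a finite value domain, so information sets are finite sets of tuples.
D = (Fraction(0), Fraction(1, 2), Fraction(1))


def rule2_permute(components, S, perm):
    """Rule (2): reorder components; perm[i] is the old index placed at position i."""
    return (tuple(components[p] for p in perm),
            {tuple(t[p] for p in perm) for t in S})


def rule3_extend(components, S, extra):
    """Rule (3): add new components with no constraint on their values."""
    padded = {t + u for t in S for u in product(D, repeat=len(extra))}
    return tuple(components) + tuple(extra), padded


def rule4_join(components, S1, S2):
    """Rule (4): combine two sentences with the same components by intersection."""
    return tuple(components), set(S1) & set(S2)


def rule5_project(components, S, r):
    """Rule (5): drop the last r components, projecting the information set."""
    k = len(components)
    if not 0 < r < k:
        raise ValueError("need 0 < r < k")
    return tuple(components[:k - r]), {t[:k - r] for t in S}


def rule6_weaken(components, S, S_bigger):
    """Rule (6): replace the information set by any superset."""
    if not set(S) <= set(S_bigger):
        raise ValueError("the new information set must contain the old one")
    return tuple(components), set(S_bigger)


components = ("A", "B")
S = {(Fraction(0), Fraction(1)), (Fraction(1, 2), Fraction(1, 2))}
print(rule5_project(*rule3_extend(components, S, ("C",)), 1) == (components, S))
```

The final line illustrates that extending by a component and then projecting it away returns the original sentence.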
We now give an inference rule that depends on the real-valued logic under consideration. For each real-valued binary connective $\alpha$, let $f_\alpha(s_1, s_2)$ be the value of $\sigma_1 \,\alpha\, \sigma_2$ when the value of $\sigma_1$ is $s_1$ and the value of $\sigma_2$ is $s_2$. For example, in Gödel logic, $f_{\&}(s_1, s_2) = \min\{s_1, s_2\}$. For each real-valued unary connective $\rho$, let $f_\rho(s)$ be the value of $\rho\sigma$ when the value of $\sigma$ is $s$. For example, in Łukasiewicz logic, $f_\neg(s) = 1 - s$.
In the sentence $(\sigma_1, \ldots, \sigma_k; S)$, let us say that the tuple $(s_1, \ldots, s_k)$ in $S$ is good if (a) $s_\ell = f_\alpha(s_i, s_j)$ whenever $\sigma_\ell$ is $\sigma_i \,\alpha\, \sigma_j$ and $\alpha$ is a binary connective (such as $\&$), and (b) $s_j = f_\rho(s_i)$ whenever $\sigma_j$ is $\rho\sigma_i$ and $\rho$ is a unary connective (such as $\neg$). Note that being “good” is a local property of a tuple in $S$ (that is, it depends only on the tuple and not on the other tuples in $S$). Of course, if the real-valued logic under consideration has higher-order connectives (ternary, etc.), then we would modify the definition of a good tuple in the obvious way. For simplicity, we will assume throughout this paper that we are in the common case where the only connectives of the real-valued logic are unary and binary, although all of our results go through in the general case.
We then have the following inference rule:
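(7)  From $(\sigma_1, \ldots, \sigma_k; S)$ infer $(\sigma_1, \ldots, \sigma_k; S')$
when $S'$ is the set of good tuples of $S$.
Rule (7) is our key rule of inference. Let $\gamma_1$ be the premise and let $\gamma_2$ be the conclusion of Rule (7). As we shall discuss later, $\gamma_1$ and $\gamma_2$ are logically equivalent, and $S'$ is as small as possible so that $\gamma_1$ and $\gamma_2$ are logically equivalent.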
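A simple example of a valid sentence is $(A, B, A \veebar B; S)$ where $S = \{(s_1, s_2, s_3) : s_1 \in [0,1],\ s_2 \in [0,1],\ s_3 = f_{\veebar}(s_1, s_2)\}$. This is derived from the valid sentence $(A, B, A \veebar B; [0,1]^3)$ by applying Rule (7).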
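Rule (7) keeps exactly the good tuples, that is, those whose entries agree with the connective functions wherever a component is built from other components. The following sketch filters good tuples over an assumed finite domain {0, 1/2, 1} with Łukasiewicz connectives; the tuple-based formula representation and the names `is_good` and `rule7` are hypothetical. The example components A, B and the disjunction of A and B are an illustrative choice.

```python
from fractions import Fraction
from itertools import product

D = (Fraction(0), Fraction(1, 2), Fraction(1))  # ASSUMED finite value domain
BINARY = {
    "and": lambda a, b: max(Fraction(0), a + b - 1),
    "or": lambda a, b: min(Fraction(1), a + b),
}
UNARY = {"not": lambda a: 1 - a}


def is_good(components, t):
    """Check conditions (a) and (b) of the definition of a good tuple."""
    index = {c: i for i, c in enumerate(components)}
    for j, c in enumerate(components):
        if isinstance(c, str):
            continue  # atomic propositions impose no constraint
        op, *args = c
        if not all(arg in index for arg in args):
            continue  # the constraint applies only if the arguments are components
        values = [t[index[arg]] for arg in args]
        table = BINARY if len(args) == 2 else UNARY
        if t[j] != table[op](*values):
            return False
    return True


def rule7(components, S):
    """Rule (7): restrict the information set to its good tuples."""
    return tuple(components), {t for t in S if is_good(components, t)}


components = ("A", "B", ("or", "A", "B"))
everything = set(product(D, repeat=3))  # no information at all
_, good = rule7(components, everything)
print(len(everything), len(good))
```

Starting from the uninformative set of all triples, the rule leaves one triple per pair of values for A and B, namely the one whose third entry is the value of the disjunction.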
Each of our rules is of the form “From A infer B” or “From A infer B where …”. We refer to A as the premise and B as the conclusion. We need the notion of a subformula of a formula. If $\alpha$ is a binary connective, then the subformulas of $\sigma_1 \,\alpha\, \sigma_2$ are $\sigma_1$ and $\sigma_2$. If $\rho$ is a unary connective, then the subformula of $\rho\sigma$ is $\sigma$.
Let $\Gamma$ be a set of MD-sentences. We define the closure $G$ of $\Gamma$ under subformulas as follows. For each sentence $(\sigma_1, \ldots, \sigma_k; S)$ in $\Gamma$, the set $G$ contains $\sigma_1, \ldots, \sigma_k$, and for each formula $\sigma$ in $G$, the set $G$ contains every subformula of $\sigma$.
In particular, $G$ contains every atomic proposition that appears inside the components of $\Gamma$.
3 Soundness and completeness
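Let $\Gamma$ be a finite set of MD-sentences, and let $\gamma$ be a single MD-sentence. We write $\Gamma \models \gamma$ if every model of $\Gamma$ is a model of $\gamma$. We write $\Gamma \vdash \gamma$ if there is a proof of $\gamma$ from $\Gamma$, using our axiom system. Soundness says “$\Gamma \vdash \gamma$ implies $\Gamma \models \gamma$”. Completeness says “$\Gamma \models \gamma$ implies $\Gamma \vdash \gamma$”. In this section, we shall prove that our axiom system is sound and complete for MD-sentences.
We define a special property of certain MD-sentences that is used in a crucial manner in our completeness proof. Let us say that a sentence $(\sigma_1, \ldots, \sigma_k; S)$ is minimized if whenever $(s_1, \ldots, s_k) \in S$, then there is a model $M$ of $(\sigma_1, \ldots, \sigma_k; S)$ such that for $1 \le i \le k$, the value of $\sigma_i$ in $M$ is $s_i$. Thus, $(s_1, \ldots, s_k) \in S$ if and only if there is a model $M$ of $(\sigma_1, \ldots, \sigma_k; S)$ such that for $1 \le i \le k$, the value of $\sigma_i$ in $M$ is $s_i$. We use the word “minimized”, since intuitively, $S$ is as small as possible.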
Our proof of completeness makes use of the following lemmas.
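Lemma 3.1.
If the set of components of the premise of Rule (7) is closed under subformulas, then the conclusion of Rule (7) is minimized.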
Proof 3.2.
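Let $(\sigma_1, \ldots, \sigma_k; S')$ be the conclusion of Rule (7). Assume that $(s_1, \ldots, s_k) \in S'$. To prove that $(\sigma_1, \ldots, \sigma_k; S')$ is minimized, we must show that there is a model $M$ of $(\sigma_1, \ldots, \sigma_k; S')$ such that for $1 \le i \le k$, the value of $\sigma_i$ in $M$ is $s_i$. From the assignment of values to the atomic propositions, as specified by a portion of $(s_1, \ldots, s_k)$, we obtain our model $M$. For this model $M$, the value of each $\sigma_i$ is exactly that specified by $(s_1, \ldots, s_k)$, as we can see by a simple induction on the structure of formulas. Hence, $(\sigma_1, \ldots, \sigma_k; S')$ is minimized.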
Lemma 3.3.
For each of Rules (2), (3), (4), and (7), the premise and the conclusion are logically equivalent.
Proof 3.4.
The equivalence of the premise and conclusion of Rule (2) is clear. For Rules (3), (4), and (7), the fact that the premise logically implies the conclusion follows from soundness of the rules, which we shall show shortly. We now show that for Rules (3), (4), and (7), the conclusion logically implies the premise. For Rule (3), we see that if $(s_1, \ldots, s_m) \in S \times [0,1]^{m-k}$, then $(s_1, \ldots, s_k) \in S$. Hence, the conclusion of Rule (3) logically implies the premise of Rule (3). For Rules (4) and (7), the conclusion logically implies the premise because of the soundness of Rule (6).
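Lemma 3.5.
Minimization is preserved by Rules (2) and (4), in the following sense: (1) if the premise of Rule (2) is minimized, then so is the conclusion; (2) if both premises of Rule (4) are minimized, then so is the conclusion.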
Proof 3.6.
Part (1) is immediate, since the premise and conclusion have exactly the same information.
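For part (2), assume that $(\sigma_1, \ldots, \sigma_k; S_1)$ and $(\sigma_1, \ldots, \sigma_k; S_2)$ are minimized. To show that $(\sigma_1, \ldots, \sigma_k; S_1 \cap S_2)$ is minimized, we must show that if $(s_1, \ldots, s_k) \in S_1 \cap S_2$, then there is a model $M$ of $(\sigma_1, \ldots, \sigma_k; S_1 \cap S_2)$ such that for $1 \le i \le k$, the value of $\sigma_i$ in $M$ is $s_i$. Assume that $(s_1, \ldots, s_k) \in S_1 \cap S_2$. Hence, $(s_1, \ldots, s_k) \in S_1$. Since $(\sigma_1, \ldots, \sigma_k; S_1)$ is minimized, we obtain a model $M$ in which each $\sigma_i$ has value $s_i$; since $(s_1, \ldots, s_k) \in S_1 \cap S_2$, this $M$ is also a model of $(\sigma_1, \ldots, \sigma_k; S_1 \cap S_2)$, and so is the desired model.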
Theorem 3.7.
Our axiom system is sound and complete for MD-sentences.
Proof 3.8.
We begin by proving soundness. We say that an axiom is sound if it is true in every model. We say that an inference rule is sound if every model that satisfies the premise also satisfies the conclusion. To prove soundness of our axiom system, it is sufficient to show that our axiom is sound and that each of our rules is sound.
Axiom (1) is sound, since every real-valued logic formula has a value in $[0,1]$.
Rule (2) is sound, since the premise and conclusion encode exactly the same information.
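Rule (3) is sound for the following reason. Let $M$ be a model, and let $s_1, \ldots, s_m$ be the values of $\sigma_1, \ldots, \sigma_m$, respectively, in $M$. If $M$ satisfies the premise, then $(s_1, \ldots, s_k) \in S$. This implies that $(s_1, \ldots, s_m) \in S \times [0,1]^{m-k}$, and so $M$ satisfies the conclusion.
Rule (4) is sound for the following reason. Let $M$ be a model, and let $s_1, \ldots, s_k$ be the values of $\sigma_1, \ldots, \sigma_k$, respectively, in $M$. If $M$ satisfies the premise, then $(s_1, \ldots, s_k) \in S_1$ and $(s_1, \ldots, s_k) \in S_2$. Therefore, $(s_1, \ldots, s_k) \in S_1 \cap S_2$, and so $M$ satisfies the conclusion.
Rule (5) is sound for the following reason. Let $M$ be a model, and let $s_1, \ldots, s_k$ be the values of $\sigma_1, \ldots, \sigma_k$, respectively, in $M$. If $M$ satisfies the premise, then $(s_1, \ldots, s_k) \in S$. Therefore $(s_1, \ldots, s_{k-r}) \in S'$, and so $M$ satisfies the conclusion.
Rule (6) is sound for the following reason. Let $M$ be a model, and let $s_1, \ldots, s_k$ be the values of $\sigma_1, \ldots, \sigma_k$, respectively, in $M$. If $M$ satisfies the premise, then $(s_1, \ldots, s_k) \in S$. Therefore, $(s_1, \ldots, s_k) \in S'$, and so $M$ satisfies the conclusion.
Rule (7) is sound for the following reason. Let $M$ be a model, and let $s_1, \ldots, s_k$ be the values of $\sigma_1, \ldots, \sigma_k$, respectively, in $M$. If $M$ satisfies the premise, then $(s_1, \ldots, s_k) \in S$. In our real-valued logic, we have that (a) $s_\ell = f_\alpha(s_i, s_j)$ when $\sigma_\ell$ is $\sigma_i \,\alpha\, \sigma_j$ and $\alpha$ is a binary connective (such as $\&$), and (b) $s_j = f_\rho(s_i)$ when $\sigma_j$ is $\rho\sigma_i$ and $\rho$ is a unary connective (such as $\neg$). Then $(s_1, \ldots, s_k)$ is a good tuple, so $(s_1, \ldots, s_k) \in S'$, and $M$ satisfies the conclusion.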
This completes the proof of soundness. We now prove completeness. Assume that $\Gamma \models \gamma$; we must show that $\Gamma \vdash \gamma$. We can assume without loss of generality that $\Gamma$ is nonempty, because if $\Gamma$ is empty, we replace it by a singleton set containing an instance of our Axiom (1).
Let . For , assume that is , and let . Assume that is ), and let . Let be the closure of under subformulas.
For each with , let be the set difference . Let . Let . By applying Rule (3), we prove from the sentence . Let be the conclusion of Rule (7) when the premise is .
Let be a fixed ordering of the members of . Since the set of components of each is , we can use Rule (2) to rewrite as a sentence . Let us call this sentence .
Also, since the only rules used in proving from are Rules (2), (3), and (7), it follows from Lemma 3.3 that and are logically equivalent.
We now make use of the notion of minimization. Let . Define to be the sentence . It follows from Lemma 3.1 that each is minimized. So by Lemma 3.5, each is minimized. By Lemma 3.5 again, is minimized.
The sentence was obtained from the sentences by applying Rule (4) times. It follows from Lemma 3.3 that is equivalent to . Since we also showed that is logically equivalent to for , it follows that is logically equivalent to . Hence, since , it follows that . It also follows that to prove that , we need only show that there is a proof of from .
Recall that is ), and is . By applying Rule (2), we can re-order the components of so that the components start with . We thereby obtain from a sentence , which we denote by . By Lemma 3.3 we know that and are logically equivalent. So . Since is minimized, so is , by Lemma 3.5. By applying Rule (5), we obtain from a sentence , which we denote by .
We now show that . This is sufficient to complete the proof of completeness, since then we can use Rule (6) to prove . If is empty, we are done. So assume that ; we must show that .
Since , it follows that there is an extension in . Since is minimized, there is a model of such that the value of is , for . Since , it follows that is a model of . By definition of what it means for to be a model of , it follows that , as desired.
This completes the proof of soundness and completeness.
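The completeness proof applies the rules in a fixed order: expand every sentence to a common list of components closed under subformulas, keep the good tuples, intersect, project onto the components of the target sentence, and finally weaken. Over a finite value domain this order can be run directly. The sketch below is a simplified illustration under loud assumptions: the domain is {0, 1/2, 1}, the connectives are Łukasiewicz, the formula representation and all names are hypothetical, and the expansion and intersection steps are fused into a single filter rather than carried out rule by rule.

```python
from fractions import Fraction
from itertools import product

D = (Fraction(0), Fraction(1, 2), Fraction(1))  # ASSUMED finite value domain
BINARY = {
    "and": lambda a, b: max(Fraction(0), a + b - 1),
    "or": lambda a, b: min(Fraction(1), a + b),
}
UNARY = {"not": lambda a: 1 - a}


def subformula_closure(formulas):
    """The formulas together with all their subformulas, in a fixed order."""
    seen = []

    def visit(f):
        if f in seen:
            return
        if not isinstance(f, str):
            for arg in f[1:]:
                visit(arg)
        seen.append(f)

    for f in formulas:
        visit(f)
    return tuple(seen)


def is_good(G, pos, t):
    """Good-tuple check; every argument is in G because G is closed."""
    for j, f in enumerate(G):
        if isinstance(f, str):
            continue
        op, *args = f
        values = [t[pos[a]] for a in args]
        table = BINARY if len(args) == 2 else UNARY
        if t[j] != table[op](*values):
            return False
    return True


def provable(premises, conclusion):
    """Follow the proof order: expand, Rule (7), Rule (4), Rule (5), Rule (6)."""
    conc_components, conc_S = conclusion
    formulas = [c for comps, _ in premises for c in comps] + list(conc_components)
    G = subformula_closure(formulas)
    pos = {f: i for i, f in enumerate(G)}

    def project(t, comps):
        return tuple(t[pos[c]] for c in comps)

    # Rules (3), (2), (7): good tuples over G; Rule (4): intersect premise by premise.
    T = {t for t in product(D, repeat=len(G)) if is_good(G, pos, t)}
    for comps, S in premises:
        T = {t for t in T if project(t, comps) in S}
    # Rules (2), (5): project onto the target components; Rule (6): weaken.
    return {project(t, conc_components) for t in T} <= set(conc_S)


one = {(Fraction(1),)}
premises = [(("A",), one), ((("or", ("not", "A"), "B"),), one)]
print(provable(premises, (("B",), one)))
```

With both premises the target sentence is derivable; dropping the premise about A makes the final inclusion fail, because the projected set then contains values of B other than 1.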
4 Closure of MD-sentences under Boolean combinations
Our next theorem implies that MD-sentences are robust, in that they are closed under Boolean combinations. Of course, since we are dealing with sentences (which take only the values True and False) in our "outer" logic, we use the standard Boolean connectives.
We begin with a useful lemma that we shall also use later.
Lemma 4.1.
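The (standard logical) negation of the sentence $(\sigma_1, \ldots, \sigma_k; S)$ is $(\sigma_1, \ldots, \sigma_k; \overline{S})$, where $\overline{S}$ is the set difference $[0,1]^k \setminus S$.
Proof 4.2.
We need only show that if $M$ is a model, then $M \not\models (\sigma_1, \ldots, \sigma_k; S)$ if and only if $M \models (\sigma_1, \ldots, \sigma_k; \overline{S})$. Let $s_i$ be the value of $\sigma_i$ in $M$, for $1 \le i \le k$. If $M \not\models (\sigma_1, \ldots, \sigma_k; S)$, then $(s_1, \ldots, s_k) \notin S$, and so $(s_1, \ldots, s_k) \in \overline{S}$. Hence, $M \models (\sigma_1, \ldots, \sigma_k; \overline{S})$. Conversely, if $M \models (\sigma_1, \ldots, \sigma_k; \overline{S})$, then $(s_1, \ldots, s_k) \in \overline{S}$, and so $(s_1, \ldots, s_k) \notin S$. Hence, $M \not\models (\sigma_1, \ldots, \sigma_k; S)$.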
Theorem 4.3.
MD-sentences are closed under Boolean combinations $\wedge$, $\vee$, and $\neg$.
Proof 4.4.
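Let $\gamma_1$ and $\gamma_2$ be MD-sentences. Assume that $\gamma_1$ is $(\sigma^1_1, \ldots, \sigma^1_{k_1}; S_1)$, and that $\gamma_2$ is $(\sigma^2_1, \ldots, \sigma^2_{k_2}; S_2)$. As in the proof of Theorem 3.7, let $G$ be the closure of $\{\gamma_1, \gamma_2\}$ under subformulas. Assume that $G = \{\tau_1, \ldots, \tau_p\}$. As in the proof of Theorem 3.7, we know that for $i = 1$ and $i = 2$, there is $T_i$ such that $\gamma_i$ is equivalent to a sentence $(\tau_1, \ldots, \tau_p; T_i)$. The conjunction $\gamma_1 \wedge \gamma_2$ is equivalent to $(\tau_1, \ldots, \tau_p; T_1 \cap T_2)$. The disjunction $\gamma_1 \vee \gamma_2$ is equivalent to $(\tau_1, \ldots, \tau_p; T_1 \cup T_2)$. And by Lemma 4.1, the negation $\neg\gamma_1$ is equivalent to $(\sigma^1_1, \ldots, \sigma^1_{k_1}; \overline{S_1})$, where $\overline{S_1}$ is the set difference $[0,1]^{k_1} \setminus S_1$.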
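Once two sentences share the same list of components, the proof above reduces Boolean combinations to set operations on information sets. A minimal sketch follows, assuming a finite value domain {0, 1/2, 1} so that complements are finite; the function names are hypothetical, and both inputs are assumed to have already been expanded to common components.

```python
from fractions import Fraction
from itertools import product

D = (Fraction(0), Fraction(1, 2), Fraction(1))  # ASSUMED finite value domain


def negate(components, S):
    """Negation: complement of the information set within D^k."""
    universe = set(product(D, repeat=len(components)))
    return tuple(components), universe - set(S)


def conjoin(components, S1, S2):
    """Conjunction of two sentences over the same components: intersection."""
    return tuple(components), set(S1) & set(S2)


def disjoin(components, S1, S2):
    """Disjunction of two sentences over the same components: union."""
    return tuple(components), set(S1) | set(S2)


components = ("A", "B")
S1 = {t for t in product(D, repeat=2) if t[0] <= t[1]}
S2 = {t for t in product(D, repeat=2) if t[0] + t[1] == 1}
# De Morgan: not (S1 and S2) equals (not S1) or (not S2).
left = negate(*conjoin(components, S1, S2))
right = disjoin(components, negate(components, S1)[1], negate(components, S2)[1])
print(left == right)
```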
5 Lowering the dimensionality
It is natural to ask whether there is a $k$-dimensional MD-sentence that in Łukasiewicz or Gödel logic is not equivalent to any $(k-1)$-dimensional MD-sentence. For the special case $k = 2$, the next theorem gives an answer. We shall shortly state the more general case and generalizations of it as open problems.
Theorem 5.1.
There is a 2-dimensional MD-sentence that is not equivalent (in either Łukasiewicz or Gödel logic) to a 1-dimensional MD-sentence.
Proof 5.2.
Let be the 2-dimensional MD-sentence where . We now show that is not equivalent to a 1-dimensional MD-sentence. If is a propositional formula involving only and , then it is easy to see (by induction on the structure of formulas) that for Łukasiewicz or Gödel logic, defines a piecewise linear function , in the sense that the 1-dimensional MD-sentence says that if is the value of and is the value of , then . Since there is no such piecewise linear function and set for our sentence , the result holds.
The next theorem does not depend on restricting to Łukasiewicz or Gödel logic.
Theorem 5.3.
Every finite set of MD-sentences of arbitrary dimensions that involve only the atomic propositions $A_1, \ldots, A_r$ is equivalent to a single $r$-dimensional MD-sentence $(A_1, \ldots, A_r; S)$. (The set $S$ depends on the real-valued logic being considered.)
Proof 5.4.
Let $\Gamma$ be a finite set of MD-sentences. We can view $\Gamma$ as a conjunction of MD-sentences, so by Theorem 4.3, $\Gamma$ is equivalent to a single MD-sentence $\gamma$. As in the proof of completeness, by closing under subformulas, applying Rule (7), and reordering by applying Rule (2), we obtain an MD-sentence $(A_1, \ldots, A_r, \sigma_1, \ldots, \sigma_m; S')$ that is equivalent to $\gamma$. Since the tuples in $S'$ are good tuples, this is equivalent to the sentence $(A_1, \ldots, A_r; S)$ where $S = \{(s_1, \ldots, s_r) : (s_1, \ldots, s_r, t_1, \ldots, t_m) \in S' \text{ for some } t_1, \ldots, t_m\}$.
Open problems: For each $k$ with $k > 2$, does there exist a $k$-dimensional MD-sentence that in Łukasiewicz or Gödel logic is not equivalent to a $(k-1)$-dimensional MD-sentence? And for $k \ge 2$, how about there being a $k$-dimensional MD-sentence not equivalent to a Boolean combination of 1-dimensional MD-sentences, or even to a Boolean combination of $(k-1)$-dimensional MD-sentences?
6 SoCRAtic logic: A decision procedure
Given a finite set $\Gamma$ of MD-sentences, and a single MD-sentence $\gamma$, Theorem 3.7 says that $\Gamma \models \gamma$ if and only if $\Gamma \vdash \gamma$. As we shall show, under natural assumptions there is an algorithm for deciding if $\Gamma \models \gamma$. We call this algorithm a decision procedure. If the information sets all have a simple structure and the size of the closure $G$ of the sentences under subformulas is treated as a constant, then the algorithm runs in polynomial time.
It is natural to wonder whether we can simply use our complete axiomatization to derive a decision procedure. The usual answer is that it is not clear in what order to apply the rules of inference. In our proof of completeness, the rules of inference are applied in a specific order, so that is not an issue here. Rather, the problem is that in applying Rule (7), there is no easy way to derive from , even if is fairly simple. In fact, we now show that even deciding if is nonempty is NP-hard. Let be an instance of the NP-hard problem 3SAT. Thus, is of the form , where each is a literal (an atomic proposition or its negation). Assume that the atomic propositions that appear in are . Let be the sentence
where is , for , and where . Assume that we apply Rule (7) where the premise is , and the conclusion is
We call this sentence . It follows easily from our construction of that the 3SAT problem is satisfiable if and only if is satisfiable. Now and are logically equivalent, by Lemma 3.3. So the 3SAT problem is satisfiable if and only if is satisfiable. By Lemma 3.1, we know that is minimized. Hence, if is nonempty, there is a model of , by the definition of minimization. And if is empty, then by the definition of a model of a sentence, there is no model of . Therefore, is satisfiable if and only if is nonempty. By combining this with our earlier observation that the 3SAT problem is satisfiable if and only if is satisfiable, it follows that the 3SAT problem is satisfiable if and only if is nonempty. Hence, deciding if is nonempty is NP-hard.
We now discuss our decision procedure. To have a chance of there being a decision procedure, the set portion $S$ of an MD-sentence $(\sigma_1, \ldots, \sigma_k; S)$ must be tractable. We now give a simple, natural choice for the set portions. A rational interval is a subset of $[0,1]$ that is of one of the four forms $(a, b)$, $[a, b)$, $(a, b]$, or $[a, b]$, where $a$ and $b$ are rational numbers. Let us say that a sentence $(\sigma_1, \ldots, \sigma_k; S)$ is interval-based if $S$ is of the form $S_1 \times \cdots \times S_k$, where each $S_i$ is a union of a finite number of rational intervals. If each $S_i$ is the union of at most $M$ rational intervals, then we say that the sentence is $M$-interval-based. Note that this interval-based sentence is equivalent to the set $\{(\sigma_i; S_i) : 1 \le i \le k\}$ of sentences with only one component each. This observation may be useful in implementing a decision procedure. In fact, although we do not make use of this in the decision procedure described in this section, we do use it in the implementation of the decision procedure described later, since these sentences with a single component are easy to deal with. (This is one of several ways that our implementation differs from what is described in this section.)
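The observation above, that an interval-based sentence splits into one single-component sentence per component, can be sketched as a membership test. This is an illustration only and is not taken from the SoCRAtic implementation; the class `RationalInterval` and the helper names are hypothetical.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class RationalInterval:
    """An interval with rational endpoints; each end may be open or closed."""
    lo: Fraction
    hi: Fraction
    lo_closed: bool = True
    hi_closed: bool = True

    def __contains__(self, x):
        above = x >= self.lo if self.lo_closed else x > self.lo
        below = x <= self.hi if self.hi_closed else x < self.hi
        return above and below


def in_union(x, intervals):
    """Membership in a finite union of rational intervals."""
    return any(x in interval for interval in intervals)


def satisfies_interval_based(values, per_component):
    """Membership in S_1 x ... x S_k, checked one component at a time."""
    if len(values) != len(per_component):
        raise ValueError("one union of intervals is needed per component")
    return all(in_union(v, ivs) for v, ivs in zip(values, per_component))


# S_1 = [0, 1/4) union [3/4, 1], and S_2 = (1/2, 1].
S1 = [RationalInterval(Fraction(0), Fraction(1, 4), hi_closed=False),
      RationalInterval(Fraction(3, 4), Fraction(1))]
S2 = [RationalInterval(Fraction(1, 2), Fraction(1), lo_closed=False)]
print(satisfies_interval_based((Fraction(1, 8), Fraction(3, 4)), [S1, S2]))
```

Because the information set is a product, the tuple test decomposes into independent one-dimensional tests, which is what makes the single-component sentences easy to handle.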
Let . For , assume that is , and let . Assume that is ), and let . Let be the closure of under subformulas. If , then we say that the pair has nesting depth at most .
Theorem 6.1.
Assume either Łukasiewicz logic or Gödel logic, with the connectives $\&$, $\veebar$, $\neg$, and $\rightarrow$. (Footnote 2: For Łukasiewicz logic, we could allow each of strong and weak disjunction and conjunction, respectively, as described in an earlier footnote.) Assume that $\Gamma \cup \{\gamma\}$ is interval-based. Then there is an algorithm that determines whether $\Gamma \models \gamma$. Assume that $\Gamma$ has at most $n$ sentences, each sentence in $\Gamma \cup \{\gamma\}$ is $M$-interval-based, and $(\Gamma, \gamma)$ has nesting depth at most $d$. If $d$ is fixed, then the algorithm runs in time polynomial in $n$ and $M$.
Proof 6.2.
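Assume throughout the proof that $\Gamma$ has at most $n$ sentences, each sentence in $\Gamma \cup \{\gamma\}$ is $M$-interval-based, and $(\Gamma, \gamma)$ has nesting depth at most $d$.
It is easy to see that $\Gamma \models \gamma$ if and only if $\Gamma \cup \{\neg\gamma\}$ is not satisfiable. So we need only give an algorithm that decides whether $\Gamma \cup \{\neg\gamma\}$ is satisfiable.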
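The reduction of logical implication to unsatisfiability can be checked by brute force when the value domain is finite. The sketch below is not the linear-programming procedure of this section: it simply enumerates every model over an assumed domain {0, 1/2, 1} with Łukasiewicz connectives, and the names `satisfiable` and `entails` are hypothetical.

```python
from fractions import Fraction
from itertools import product

D = (Fraction(0), Fraction(1, 2), Fraction(1))  # ASSUMED finite value domain
BINARY = {
    "and": lambda a, b: max(Fraction(0), a + b - 1),
    "or": lambda a, b: min(Fraction(1), a + b),
}
UNARY = {"not": lambda a: 1 - a}


def evaluate(formula, model):
    """Value of a formula under a model (dict: atomic proposition -> value)."""
    if isinstance(formula, str):
        return model[formula]
    op, *args = formula
    values = [evaluate(arg, model) for arg in args]
    table = BINARY if len(values) == 2 else UNARY
    return table[op](*values)


def satisfiable(sentences, atoms):
    """True iff some model over D satisfies every (components, S) pair."""
    for values in product(D, repeat=len(atoms)):
        model = dict(zip(atoms, values))
        if all(tuple(evaluate(c, model) for c in comps) in S
               for comps, S in sentences):
            return True
    return False


def entails(premises, conclusion, atoms):
    """Premises imply the conclusion iff premises plus its negation are unsatisfiable."""
    comps, S = conclusion
    complement = set(product(D, repeat=len(comps))) - set(S)
    return not satisfiable(list(premises) + [(comps, complement)], atoms)


one = {(Fraction(1),)}
premises = [(("A",), one), ((("or", ("not", "A"), "B"),), one)]
print(entails(premises, (("B",), one), atoms=("A", "B")))
```

Enumeration is exponential in the number of atomic propositions, which is one reason a procedure exploiting the interval structure of the information sets is of interest.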
Let be the closure of under subformulas. Let . By making use of Rules (2) and (3), for each with , we can create a sentence of the form that by Lemma 3.3 is equivalent to , and that has as components. By the construction, each is -interval-based.
Similarly, create the sentence of the form that is equivalent to , and that has as components. As before, is -interval-based.
Now is equivalent to the conjunction of the sentences for , and this conjunction is equivalent to , where . We now show that ) is -interval-based. By assumption, for each with , we have that is of the form , where each is the union of at most intervals. For each with , let . Then . So to show that ) is -interval-based, we need only show that each is the union of at most intervals.
Since , where each is the union of at most intervals, we see that is the union of intervals where the left endpoint of each interval in is one of the left endpoints of intervals in . For each , there are sets . And for each with , there are at most left endpoints of . So the total number of left endpoints of intervals in is at most , and so the number of intervals in is at most . Since , it follows that is -interval-based.
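The counting argument above can be illustrated with a short Python sketch that intersects two unions of intervals. For simplicity it assumes closed, pairwise disjoint intervals given as `(lo, hi)` pairs; the function name is a hypothetical illustration.

```python
from fractions import Fraction as F


def intersect_unions(a, b):
    """Intersect two unions of disjoint closed intervals, each a list of (lo, hi) pairs."""
    out = []
    for lo1, hi1 in a:
        for lo2, hi2 in b:
            lo, hi = max(lo1, lo2), min(hi1, hi2)
            if lo <= hi:  # the two intervals overlap (possibly in a single point)
                out.append((lo, hi))
    return sorted(out)


A = [(F(0), F(1, 4)), (F(1, 2), F(3, 4))]
B = [(F(1, 8), F(5, 8)), (F(7, 10), F(1))]
C = intersect_unions(A, B)
left_endpoints_in = {lo for lo, _ in A} | {lo for lo, _ in B}
```

Every left endpoint of an interval in `C` is the larger of two input left endpoints, so it is already a left endpoint of some input interval. This is why the number of intervals in the intersection is bounded by the total number of input intervals rather than by their product.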
Let us now consider , which is equivalent to . Recall that is , and that is -interval-based. So is of the form , where each is the union of at most intervals. By Lemma 4.1, the negation of is , where is the set difference . For each with , let be the set difference . Clearly, is the union of intervals. The left endpoints of intervals in are the right endpoints of intervals in , possibly along with 0. So is the union of at most intervals. Let . It is straightforward to see that .
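The claim about complements can be sketched in Python as follows. The sketch assumes the input is a union of pairwise disjoint closed intervals inside $[0,1]$, and it returns each complementary interval as a hypothetical `(lo, hi, lo_open, hi_open)` tuple.

```python
from fractions import Fraction as F


def complement(intervals):
    """Complement in [0, 1] of a union of disjoint closed intervals [(lo, hi), ...]."""
    out = []
    prev_hi = None  # right endpoint of the previously visited interval
    for lo, hi in sorted(intervals):
        if prev_hi is None:
            if lo > 0:
                out.append((F(0), lo, False, True))   # [0, lo)
        else:
            out.append((prev_hi, lo, True, True))     # (prev_hi, lo)
        prev_hi = hi
    if prev_hi is None:
        out.append((F(0), F(1), False, False))        # complement of the empty set
    elif prev_hi < 1:
        out.append((prev_hi, F(1), True, False))      # (prev_hi, 1]
    return out
```

Each returned interval starts either at 0 or at the right endpoint of an input interval, so a union of $k$ intervals has a complement made of at most $k+1$ intervals.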
Now, showing that is not satisfiable is equivalent to showing that is not satisfiable, which is equivalent to showing that for every with , we have that is not satisfiable. So we need only give an algorithm for deciding if is satisfiable. Let us hold fixed. Since, as we showed, is -interval-based, we can write as , where each is the union of at most intervals. Now is equivalent to . Now is of the form , where for , and where . We showed that is the union of at most intervals, and that is the union of at most intervals, so it follows that is the union of at most intervals, since each left endpoint of the intervals in is a left endpoint of an interval in or an interval in .
We now describe our algorithm for deciding if the sentence , that is, for the sentence ), which is -interval-based, is satisfiable. This can be broken into subproblems, one for each choice of a single interval from for each with . This gives a total of at most subproblems. For each of these subproblems, we wish to decide satisfiability of the system along with (a) the binary constraints when is and is a , , or , and (b) when is .
The constraints are specified by inequalities (for example, if is we get the inequalities ). We now show how to deal with the constraints in (a) and (b) above. A canonical example is given by dealing with in Gödel logic, which interprets “” as . We split the system of constraints into two systems of constraints, one where we replace by the two statements “, ” and another where we replace by the two statements “, ”. In Łukasiewicz logic, where is , we split the system of constraints into two systems of constraints, one where we replace by the two statements “, ” and another where we replace by the two statements “, ”. The same approach works for the other binary connectives. For example, in Gödel logic, where is 1 if and is otherwise, we would split into two cases, one where we replace by the two statements “, ” and another where we replace by the two statements “, ”. In considering the effect of the constraints in (a) and (b), each of our at most subproblems splits at most times, giving a grand total of at most systems of inequalities that we need to check for feasibility (that is, to see if there is a solution). For each of these systems of inequalities, we can make use of a polynomial-time algorithm for linear programming to decide feasibility, where the size of each of these systems is linear in , and so the running time for each instance of the linear programming algorithm is polynomial in . Since also the number of systems is at most , and since is fixed by assumption, this gives us an overall algorithm for decidability, whose running time is polynomial in and .
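The case splitting can be sanity-checked numerically. The sketch below writes three connectives as lists of (guard, linear expression) cases and confirms on a rational grid that exactly one guard holds at each point and that the selected linear expression equals the usual definition. The dictionary layout and names are illustrative assumptions, and a grid check is evidence rather than a proof.

```python
from fractions import Fraction as F

# Each connective is a list of (guard, linear_value) cases: on the region where
# the guard holds, the connective coincides with the linear expression.
CASES = {
    "godel_and": [(lambda x, y: x <= y, lambda x, y: x),
                  (lambda x, y: x > y, lambda x, y: y)],
    "luk_and": [(lambda x, y: x + y <= 1, lambda x, y: F(0)),
                (lambda x, y: x + y > 1, lambda x, y: x + y - 1)],
    "godel_imp": [(lambda x, y: x <= y, lambda x, y: F(1)),
                  (lambda x, y: x > y, lambda x, y: y)],
}

DEFINITIONS = {
    "godel_and": lambda x, y: min(x, y),
    "luk_and": lambda x, y: max(F(0), x + y - 1),
    "godel_imp": lambda x, y: F(1) if x <= y else y,
}


def eval_by_cases(name, x, y):
    """Evaluate a connective through the unique case whose guard holds."""
    values = [value(x, y) for guard, value in CASES[name] if guard(x, y)]
    assert len(values) == 1, "guards must partition [0,1]^2"
    return values[0]


def cases_agree(name, steps=20):
    """Compare the case-split evaluation with the definition on a rational grid."""
    grid = [F(i, steps) for i in range(steps + 1)]
    return all(eval_by_cases(name, x, y) == DEFINITIONS[name](x, y)
               for x in grid for y in grid)
```

Within each case the connective is a linear function of its arguments, which is what allows each resulting system to be handed to a linear programming routine.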
The reason we held the parameter fixed is that the running time of the algorithm is exponential in , because there are an exponential number of calls to a linear programming subroutine. The algorithm is polynomial-time if there is a fixed bound on . Such a bound is necessary, because the problem can be co-NP hard, for the following reason.
Let be the sentence . Then is not satisfiable. Let consist of the single sentence from the beginning of the section. Then if and only if is not satisfiable. Now is satisfiable if and only if from the beginning of the section is nonempty, which we showed is an NP-hard problem to determine. Since if and only if is not satisfiable, it follows that deciding if is co-NP hard.
We now give an implementation of the decision procedure. The decision procedure described in Section 6 is available under the socratic-logic subdirectory provided with this supplementary material. We implemented the algorithm as a Python package named socratic, which requires Python 3.6 or 3.7 and makes use of IBM® ILOG® CPLEX® Optimization Studio V12.10.0 via the docplex Python package.
6.1 Source code organization
The source code is organized as follows:
- setup.sh: A script to create a Python virtualenv and install required packages
- requirements.txt: A standard pip list of package dependencies
- socratic/theory.py: Implementations for theories, sentences, and bounded intervals
- socratic/op.py: Implementations for each logical operator as well as propositions and truth value constants
- socratic/demo.py: Two example use cases demonstrating the use of the package
- socratic/test.py: A suite of unit tests also serving as our experimental setup
- socratic/hajek.py: Many tautologies proved in (10), used in test.py
- socratic/clock.py: A higher-order function to measure the runtime of experiments
The classes defined in the source code are:
- theory.Theory: A collection of sentences that can test for satisfiability or the entailment of a query sentence under a given logic
- theory.Sentence: A base class for a collection of formulas and an associated set of candidate interpretations for the formulas
- theory.SimpleSentence: A single formula and an associated collection of candidate truth value intervals for the formula
- theory.FloatInterval: An open or closed interval of truth values
- theory.ClosedInterval: $[l, u]$, i.e., all values from $l$ to $u$, inclusive
- theory.Point: $[a, a]$, i.e., just $a$
- theory.OpenInterval: $(l, u)$, i.e., all values from $l$ to $u$, exclusive
- theory.OpenLowerInterval: $(l, u]$, i.e., all values from $l$ to $u$ excluding $l$
- theory.OpenUpperInterval: $[l, u)$, i.e., all values from $l$ to $u$ excluding $u$
- op.Formula: The base class of a data structure representing the syntax tree of a logical formula
- op.Prop: A named proposition
- op.Constant: A truth value constant
- op.Operator: A base class for all formulas with subformulas
- op.And: Strong conjunction
- op.WeakAnd: Weak conjunction
- op.Or: Strong disjunction
- op.WeakOr: Weak disjunction
- op.Implies: Implication, i.e., the residuum of strong conjunction
- op.Not: Negation, defined as $\neg x = (x \to 0)$
- op.Inv: Involute negation $1 - x$
- op.Equiv: Logical equivalence, defined as $(x \to y) \wedge (y \to x)$
- op.Delta: The operation $\Delta x = 1$ if $x = 1$ else 0
6.2 Implementation details
The implementation strategy closely adheres to the decision procedure described in Section 6, though with a few notable design shortcuts.
Boolean variables.
One such shortcut is the use of mixed integer linear programming (MILP) to perform the “splitting” of linear programs into two possible optimization problems, specifically by adding a Boolean variable that determines which of a set of constraints must be active. For example, given the desired constraint , one may write
(8)–(13) [six displayed linear constraints; their content is not recoverable from the extracted text]
for Boolean variable . For , observe that (10) and (11) are effectively disabled for and that (12) and (13) are likewise disabled for . For example, when , the remaining constraints are , which is equivalent to , as desired. Observe then that MILP’s exploration of either value for the Boolean variable is equivalent to repeating linear optimization for either possible set of constraints; no feasible solution exists for any combination of such Boolean variables in exactly the case that none of the split linear programs are feasible. In practice, CPLEX has built-in support for min, max, abs, and a handful of other useful functions, though the above technique is still required to implement Gödel logic’s residuum, negation, and equivalence operations as well as to select the specific intervals a sentence’s formula truth values lie within.
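To illustrate the idea of a Boolean variable switching constraints on and off, here is a minimal sketch of one standard big-M encoding of $z = \min(x, y)$ for values in $[0,1]$, using $M = 1$. This particular set of four constraints is a textbook formulation chosen for illustration; it is an assumption and is not claimed to match constraints (8)–(13). The check is a brute-force enumeration over a rational grid rather than a call to an MILP solver.

```python
from fractions import Fraction as F


def feasible(x, y, z, b):
    """Big-M constraints (M = 1) tying z to min(x, y) through Boolean b."""
    return (z <= x and z <= y          # always active: z is below both arguments
            and z >= x - b             # binding only when b == 0, forcing z == x
            and z >= y - (1 - b))      # binding only when b == 1, forcing z == y


def encodes_min(steps=10):
    """True if some b in {0, 1} is feasible exactly when z == min(x, y) on the grid."""
    grid = [F(i, steps) for i in range(steps + 1)]
    return all(any(feasible(x, y, z, b) for b in (0, 1)) == (z == min(x, y))
               for x in grid for y in grid for z in grid)
```

When `b` is 0 the third constraint combines with `z <= x` to give `z == x`, and `z <= y` then yields `x <= y`; the fourth constraint is slack because all values lie in $[0,1]$. The case `b == 1` is symmetric.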
Strict inequality.
The described decision procedure also occasionally calls for continuous constraints with strict inequality, in particular when dealing with the complements of closed intervals, but also when handling input open intervals or the Gödel residuum, if else 1. Linear programming, however, does not inherently support this. To implement strict inequality constraints, we introduce a global gap variable to widen the distance between either side of the inequality, e.g.
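$$x + \delta \le y \quad \text{(in place of } x < y \text{, with gap variable } \delta \ge 0\text{)} \tag{14}$$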
and then seek to maximize . If optimization yields an apparently feasible solution but with , we regard it as infeasible because at least one strict inequality constraint could not be honored strictly. Again in practice, due to floating-point imprecision, MILP can sometimes return tiny though nonzero values of even for in (14); as a result, it is necessary to check if is greater than some threshold rather than merely nonzero. We use , which is much larger than the imprecision we have observed and yet much smaller than most truth values we consider. We observe that this technique is roughly equivalent to replacing with throughout, which has the added benefit of freeing up the optimization objective for other uses in future extensions of the decision procedure, such as determining the tightest bounds for which a theory can entail a query.
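The gap-variable technique can be sketched on a one-variable instance where the maximal gap has a closed form, so no solver is needed. The question is whether some $x$ satisfies both $a < x < b$ and $c \le x \le d$. The function names and the threshold value `1e-6` are assumptions made for this illustration only.

```python
def max_gap(a, b, c, d):
    """Largest gap g such that a + g <= x <= b - g and c <= x <= d for some x.

    Closed form for this one-variable case; a general instance would require an
    LP or MILP solver. Returns None when even the non-strict system is infeasible.
    """
    if c > d:
        return None
    gap = min((b - a) / 2, d - a, b - c)
    return gap if gap >= 0 else None


def strictly_feasible(a, b, c, d, threshold=1e-6):
    """Accept a solution only if the maximal gap clearly exceeds the threshold."""
    gap = max_gap(a, b, c, d)
    return gap is not None and gap > threshold
```

For example, with the open interval $(0.5, 1)$ and the closed interval $[0, 0.5]$, the non-strict system is feasible at $x = 0.5$ but the maximal gap is 0, so the strict system is correctly reported as infeasible.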
Simple sentences.
We additionally observe that, for theories restricted to interval-based sentences, it is sufficient to support only sentences containing a single formula and collection of truth value intervals, i.e., SimpleSentences of the form for a single formula . This is because of the following theorem:
Theorem 6.3.
Any interval-based sentence is equivalent to a collection of simple sentences , each given .
Proof 6.4.
Accordingly, socratic implements only SimpleSentence. In order to include an interval-based sentence in a theory, one should instead include each of its component simple sentences as constructed above. In order to test the entailment of an interval-based sentence, one should separately test the entailment of each of its component simple sentences.
Complementary intervals.
As a last deviation from the described decision procedure, rather than explicitly finding the complement of a collection of truth value intervals for a given query formula, we simply adjust how constraints are expressed to force feasible solutions into the set of complementary intervals. Specifically, while the usual constraints require the formula’s truth value to lie within some one of its intervals, the complementary constraints require the formula’s truth value to not lie in any of its intervals, i.e., to lie to the left or the right of each of its intervals. We then reverse the direction of each interval’s lower and upper bound constraints, adding or removing gap variable as appropriate to switch between strict and nonstrict inequalities, and introduce Boolean parameters to decide which of each pair of constraints should apply, i.e., to decide whether the formula’s truth value should lie to the left or to the right. For example, if simple sentence has truth value intervals , the above would produce constraints
Observe that the last of these cannot be satisfied for unless , which is consistent with the complement of not having a right side in the interval .
6.3 Experimental results
We tested socratic in four different experimental contexts:
- 3SAT and higher $k$-SAT problems which become satisfiable if any one of their input clauses is removed
- 82 axioms and tautologies taken from Hájek in (10), some of which hold only for one of Łukasiewicz or Gödel logic
- A formula that is classically valid but invalid in both Łukasiewicz and Gödel logic, unless propositions are constrained to be Boolean
- A stress test running socratic on sentences with thousands of intervals
All experiments are conducted on a MacBook Pro with a 2.9 GHz Quad-Core Intel Core i7, 16 GB 2133 MHz LPDDR3, and Intel HD Graphics 630 1536 MB running macOS Catalina 10.15.5.
$k$-SAT.
We construct classically unsatisfiable $k$-SAT problems of the form
(15) |
which, after CNF conversion, yields for 3SAT
and similarly for larger $k$. The removal of any one clause in such a problem renders it satisfiable. We observe that, when each clause is required to have truth value exactly 1 but propositions are allowed to have any truth value, socratic correctly determines the problem to be
1) unsatisfiable in Gödel logic,
2) satisfiable in Gödel logic when dropping any one clause,
3) trivially satisfiable in Łukasiewicz logic with, e.g., ,
4) again unsatisfiable in Łukasiewicz logic when propositions are required to have truth values in range either or ,
5) and yet again satisfiable in Łukasiewicz logic with constrained propositions when dropping any one clause.
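The classical behaviour of such problems can be reproduced with a brute-force sketch. It assumes that the problem for a given $k$ consists of all $2^k$ clauses over $k$ propositions, one per sign pattern, which is consistent with the clause count doubling at each step but is an assumption about the construction. The Łukasiewicz clause value assumes strong disjunction (bounded sum) with negation $1 - x$; all function names are hypothetical.

```python
from fractions import Fraction as F
from itertools import product


def all_sign_clauses(k):
    """All 2**k clauses over k propositions; True marks a negated literal."""
    return list(product((False, True), repeat=k))


def classical_sat(clauses, k):
    """Brute-force search for a 0/1 assignment making every clause true."""
    for assignment in product((0, 1), repeat=k):
        if all(any((1 - v) if neg else v for v, neg in zip(assignment, clause))
               for clause in clauses):
            return True
    return False


def luk_clause_value(values, clause):
    """Value of a clause under bounded-sum disjunction with negation 1 - x."""
    return min(F(1), sum((1 - v) if neg else v for v, neg in zip(values, clause)))


clauses3 = all_sign_clauses(3)
half = [F(1, 2)] * 3
```

Each 0/1 assignment falsifies exactly the one clause whose sign pattern matches it, which explains both the unsatisfiability of the full set and the satisfiability after removing any single clause. Under the stated Łukasiewicz assumptions, assigning every proposition the value 1/2 gives every clause the value 1.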
We observe that Gödel logic is much slower than Łukasiewicz logic as implemented in socratic, likely because it performs mins and maxes between many arguments throughout while Łukasiewicz logic instead performs sums with simpler mins and maxes serving as clamps to the $[0,1]$ range. Interestingly, the difference between unsatisfiable and satisfiable in Gödel logic is significant; while the satisfiable problems have one fewer clause, this is more likely explained by socratic finding a feasible solution quickly. On the other hand, the unsatisfiable and satisfiable problems (with constrained propositions) take roughly the same amount of time for Łukasiewicz, though the trivially satisfiable problem is quicker. The apparent exponential increase in runtime is partially explained by the fact that each larger problem has twice as many clauses, but runtime appears to be growing by slightly more than a factor of 2 for each increment of $k$.
$k$ | Gödel unsat. | Gödel satisf. | Łuka. trivial | Łuka. unsat. | Łuka. satisf. |
---|---|---|---|---|---|
3 | .012 | .011 | .014 | .019 | .014 |
4 | .022 | .020 | .022 | .031 | .033 |
5 | .054 | .043 | .041 | .047 | .043 |
6 | .121 | .107 | .064 | .104 | .098 |
7 | .204 | .255 | .173 | .167 | .206 |
8 | .404 | .414 | .273 | .286 | .308 |
9 | .861 | .881 | .507 | .539 | .554 |
10 | 5.46 | 1.99 | 1.03 | 1.11 | 1.17 |
11 | 18.0 | 4.34 | 2.09 | 2.44 | 2.21 |
12 | 33.3 | 10.9 | 4.36 | 5.06 | 5.01 |
13 | 119 | 25.8 | 8.72 | 12.4 | 12.3 |
14 | 696 | 71.0 | 18.4 | 38.0 | 35.6 |
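The growth rate can be read off the table with a few lines of Python. The two lists below copy the Gödel and Łukasiewicz unsatisfiable columns for $k = 3, \dots, 14$; the helper name is a hypothetical illustration.

```python
# Runtimes copied from the table, one entry per k = 3..14.
godel_unsat = [.012, .022, .054, .121, .204, .404, .861, 5.46, 18.0, 33.3, 119, 696]
luk_unsat = [.019, .031, .047, .104, .167, .286, .539, 1.11, 2.44, 5.06, 12.4, 38.0]


def growth_ratios(times):
    """Ratio of each runtime to the one before it (one ratio per step in k)."""
    return [later / earlier for earlier, later in zip(times, times[1:])]


luk_tail = growth_ratios(luk_unsat)[-5:]      # steps from k = 9 up to k = 14
godel_tail = growth_ratios(godel_unsat)[-5:]
```

For the larger problems every step in both columns multiplies the runtime by more than 2, with the Gödel column growing noticeably faster.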
Hájek tautologies.
Hájek lists many axioms and tautologies pertaining to a system of logic he describes as basic logic (BL), consistent with any t-norm logic, as well as a number of tautologies specific to Łukasiewicz and Gödel logic, all of which should have truth value exactly 1. We implement these tautologies in socratic and test whether the empty theory can entail them with truth value 1 in their respective logics. The BL tautologies are divided into batches pertaining to specific operations and properties:
- axioms: 8 tests
- implication: 3 tests
- conjunction: 6 tests
- weak_conjunction: 7 tests
- weak_disjunction: 7 tests
- negation: 8 tests
- associativity: 6 tests
- equivalence: 9 tests
- distributivity: 8 tests
- delta_operator: 3 tests
In addition, there are logic-specific batches of tautologies:
- lukasiewicz: 12 tests
- godel: 5 tests
Each of the above BL batches completes successfully for both logics, and each of the logic-specific batches completes for its respective logic and, as expected, fails for the other logic. The runtime of individual tests is negligible; the entire test suite of 82 tautologies run on both logics completes in just 2.911 seconds.
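The pattern of results, with BL tautologies holding in both logics and logic-specific ones failing in the other, can be illustrated by evaluating two well-known formulas on a rational grid. This is a plain numerical check and is not how socratic tests entailment; the function names are hypothetical, and a grid check is only evidence, not a proof.

```python
from fractions import Fraction as F
from itertools import product


def luk_imp(x, y):
    return min(F(1), 1 - x + y)        # Łukasiewicz residuum


def godel_imp(x, y):
    return F(1) if x <= y else y       # Gödel residuum


def holds_on_grid(formula, imp, n_props, steps=10):
    """True if the formula evaluates to 1 at every point of the rational grid."""
    grid = [F(i, steps) for i in range(steps + 1)]
    return all(formula(imp, *vals) == 1 for vals in product(grid, repeat=n_props))


def transitivity(imp, p, q, r):
    """A BL axiom: (p -> q) -> ((q -> r) -> (p -> r))."""
    return imp(imp(p, q), imp(imp(q, r), imp(p, r)))


def double_negation(imp, p):
    """Double-negation elimination, with negation defined as x -> 0."""
    def neg(x):
        return imp(x, F(0))
    return imp(neg(neg(p)), p)
```

Transitivity of implication evaluates to 1 everywhere under both residua, while double-negation elimination does so only under the Łukasiewicz residuum: in Gödel logic, at $p = 1/2$ the double negation has value 1 and the implication therefore has value 1/2.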
Boolean logic.
We consider a formula defined
(16) |
which is valid in classical logic but is not entailed with truth value 1 by the empty theory in either Łukasiewicz or Gödel logic. Conversely, constraining propositions and to have classical truth values by introducing the sentences
into the theory succeeds in entailing in either logic. Indeed, if even one of these sentences is added, is entailed, but no looser intervals around 0 and 1 can entail if both propositions are non-Boolean. In the other direction, Łukasiewicz logic with unconstrained propositions entails the sentence , i.e., with truth value bounded above .5, while Gödel logic with unconstrained propositions cannot entail with any interval tighter than . As a final example of the interaction between truth value intervals, Gödel logic entails for a lower bound truth value if either of or is constrained in the theory to have set of candidate truth values .
Stress test.
We consider the experimental configuration for Boolean logic above now with query for consisting of 10000 open intervals for from 2 to 10000 plus the closed interval and and for consisting of 10000 open intervals plus the closed interval . We observe the runtime of socratic to be just 11.8 seconds for Gödel logic and 9.38 seconds for Łukasiewicz logic. If we instead use closed intervals throughout, measured runtimes are 17.4 seconds for Gödel and 9.29 seconds for Łukasiewicz.
7 Dealing with weights
In some circumstances, such as logical neural networks (4), weights are assigned to subformulas, where the weight is intended to reflect the influence, or importance, of the subformula. Each weight is a real number. For example, in the formulas , the weight might be assigned to and the weight assigned to . If , this might indicate that is twice as important as in evaluating the value of .
As an example of a possible way to incorporate weights, assume that we are using Łukasiewicz real-valued logic, where the value of is , when is the value of and is the value of . If the weights of and are and , respectively, and if both and are non-negative, then we might take the value of in the presence of these weights to be .
We now show how to incorporate weights into our approach. In fact, the ease of incorporating weights and still getting a sound and complete axiomatization is a real advantage of our approach!
To deal with weights, we define an expanded view of what a formula is, defined recursively. Each atomic proposition is a formula. If and are formulas, and are weights, and is a binary connective (such as ) then is a formula. Here is interpreted as the weight of and as the weight of in the formula . Also, if is a formula, is a weight, and is a unary connective (such as ), then is a formula, where is interpreted as the weight of . We modify our definition of subformula as follows. The subformulas of are and , and the subformula of is .
If is a binary connective, then now has four arguments, rather than two. Thus, is the value of the formula when the value of is , the value of is , the weight of is , and the weight of in . Also, now has two arguments rather than one. Thus, is the value of the formula when the value of is , and the weight of is .
The axiom and rules are just as before, except that Rule (7) is changed to:
(17) |
where and (a) when is and is a binary connective, and (b) when is and is a unary connective.
We can extend Theorem 3.7 (soundness and completeness) and Theorem 4.3 (closure under Boolean combinations) to deal with our sentences that include weights. The proofs go through just as before, where we use Rule (17) instead of Rule (7). Thus, we obtain the following theorems.
Theorem 7.1.
Theorem 7.2.
The sentences that include weights are closed under Boolean combinations.
What about the decision procedure in Theorem 6.1? The key step is the use of a polynomial-time algorithm for linear programming. If we were to hold the weights as fixed rational constants, and if the weighting functions were linear (such as ), possibly including a min or a max, then we could use linear programming, and the decision procedure would go through in the presence of weights.
8 Related work
Rosser (12) comments on the possibility of considering formulas whose value is guaranteed to be at least . For example, in Łukasiewicz logic, if we consider weak disjunction , where , then the real value of is always at least 0.5, since . But Rosser rejects this approach, since he notes that there are uncountably many choices for , but only countably many recursively enumerable sets (and an axiomatization would give a recursively enumerable set of valid formulas).
Belluce (13) investigates when the set of formulas with values at least is recursively enumerable. Font et al. (14) consider the question of what they call “preservation of degrees of truth”. They give a method for deciding, for a fixed , if having a value at least implies that has value at least .
Novák (15) considered a logic with sentences that assign a real value to each formula of first-order real-valued logic. Thus, using our notation, his sentences would be of the form , where is a formula in first-order real-valued logic, and is a single real value. He gave a sound and complete axiomatization.
An interesting logic is the rational Pavelka logic RPL, an expansion of the standard Łukasiewicz logic where rational truth-constants are allowed in formulas. For example, if is a rational number, then the formula says that the value of is at least , and the formula says that the value of is at most . Therefore, this logic can express the MD-sentences , when is the union of a finite number of closed intervals. However, it cannot express strict inequalities. For example, it cannot express that the value of is strictly greater than 0.5.\footnote{This follows from the stronger fact that if are the atomic propositions, is a formula, and is the set of all value assignments to the atomic propositions that give the truth value 1, then since the operators of standard Łukasiewicz logic are continuous (and so the value of is a continuous function of the value of the atomic propositions), it follows that is a closed subset of . Note that if , then even though the formula has the value 1 when the value of is at most 0.5, the negation does not have the value 1 when ; instead it has the value - 0.5.} RPL was introduced by Hájek in (10) as a simplification of the system proposed by Pavelka in (16) in which the syntax contained a truth-constant for each real number of the interval [0,1]. Hájek showed that an analogous logic could be presented as an expansion of Łukasiewicz propositional logic with truth-constants only for the rational numbers in [0,1] and gave a corresponding completeness theorem. Moreover, first-order fuzzy logics with real or rational constants have also been deeply studied starting from Novák’s extension of Pavelka’s logic to a first-order predicate language in (17) (see e.g. (18)).
Each of (19), (20), and (21) gives a decision procedure that partially covers the situation we allow in Section 6. The first two support only Łukasiewicz logic. The third, like our decision procedure, works for a variety of logics, though it is explicitly established in (21) that their approach does not support discontinuous operators. Accordingly, unlike our decision procedure, their approach does not work for Gödel logic given its discontinuous operator.
9 Conclusions
We give a sound and strongly complete axiomatization for sentences about real-valued formulas. By being parameterized, our axiomatization covers essentially all real-valued logics. Our axiomatization allows us to include weights on formulas. The results give us a way to establish such properties for neuro-symbolic systems that aim or purport to perform logical inference with real values. The algorithm described gives us a constructive existence proof and a baseline approach for well-founded inference. Because LNNs (4) are exactly a weighted real-valued logical system implemented in neural network form, an important immediate upshot of our results for the weighted case is that they provide provably sound and complete logical inference for LNNs. Such a result has not previously been established for any neuro-symbolic approach to our knowledge. While our main motivation was to pave the way forward for neuro-symbolic systems, our results are fundamental, filling a long-standing gap in a very old literature, and can be applied well beyond AI.
We are very grateful to Marco Carmosino, who improved the writing in this paper by giving us many helpful comments. We are also grateful to Guillermo Badia, Ken Clarkson, Didier Dubois, Phokion Kolaitis, Carles Noguera, and Henri Prade for helpful comments. Finally, we are grateful to Lluis Godo for confirming the novelty of our approach, and for helpful comments.
References
- (1) L Serafini, Ad Garcez, Logic tensor networks: Deep learning and logical reasoning from data and knowledge. arXiv preprint arXiv:1606.04422 (2016).
- (2) SH Bach, M Broecheler, B Huang, L Getoor, Hinge-loss Markov random fields and probabilistic soft logic. The Journal of Machine Learning Research 18, 3846–3912 (2017).
- (3) W Cohen, F Yang, KR Mazaitis, TensorLog: A probabilistic database implemented using deep-learning infrastructure. Journal of Artificial Intelligence Research 67, 285–325 (2020).
- (4) R Riegel, et al., Logical neural networks. arXiv preprint arXiv:2006.13155 (2020).
- (5) G Boole, An investigation of the laws of thought: on which are founded the mathematical theories of logic and probabilities. (Walton and Maberly) Vol. 2, (1854).
- (6) LA Zadeh, Fuzzy logic and approximate reasoning. Synthese 30, 407–428 (1975).
- (7) V Novák, A formal theory of intermediate quantifiers. Fuzzy Sets and Systems 159, 1229–1246 (2008).
- (8) G Epstein, Multiple-valued logic design: an introduction. (CRC Press), (1993).
- (9) R Fagin, A Lotem, M Naor, Optimal aggregation algorithms for middleware. Journal of Computer and System Sciences 66, 614–656 (2003).
- (10) P Hájek, Metamathematics of fuzzy logic. (Springer Science & Business Media) Vol. 4, (1998).
- (11) N Rescher, Many-valued logic. (McGraw-Hill), (1969).
- (12) JB Rosser, Axiomatization of infinite valued logics. Logique et Analyse 3, 137–153 (1960).
- (13) L Belluce, Further results on infinite valued predicate logic. J. Symbolic Logic 29, 69–78 (1964).
- (14) JM Font, ÀJ Gil, A Torrens, V Verdu, On the infinite-valued Łukasiewicz logic that preserves degrees of truth. Arch. Math. Logic 45, 835–868 (2006).
- (15) V Novák, Fuzzy logic with extended syntax. Handbook of Mathematical Fuzzy Logic 3, 1063–1104 (2015).
- (16) J Pavelka, On fuzzy logic I, II, III. Zeitschrift für Mathematische Logik und Grundlagen der Mathematik 29, 45–52, 119–134, 447–464 (1979).
- (17) V Novák, On the syntactico-semantical completeness of first-order fuzzy logic part I (syntax and semantic), part II (main results). Kybernetika 26, 47–66, 134–154 (1990).
- (18) F Esteva, L Godo, C Noguera, First-order t-norm based fuzzy logics with truth-constants: distinguished semantics and completeness properties. Annals of Pure and Applied Logic 161, 185–202 (2009).
- (19) G Beavers, Automated theorem proving for Łukasiewicz logics. Studia Logica 52, 183–195 (1993).
- (20) D Mundici, A constructive proof of McNaughton’s theorem in infinite-valued logic. The Journal of Symbolic Logic 59, 596–602 (1994).
- (21) R Hähnle, Many-valued logic and mixed integer programming. Annals of Mathematics and Artificial Intelligence 12, 231–263 (1994).