Lucius T. Schoenbaum
On the Syntax of Logic and Set Theory
Abstract
We introduce an extension of the propositional calculus to include abstracts of predicates and quantifiers, employing a single rule along with a novel comprehension schema and a principle of extensionality, which are substituted for the Bernays postulates for quantifiers and the comprehension schemata of ZF and other set theories. We prove that it is consistent in any finite Boolean subset lattice. We investigate the antinomies of Russell, Cantor, Burali-Forti, and others, and discuss the relationship of the system to other set theoretic systems ZF, NBG, and NF. We discuss two methods of axiomatizing higher order quantification and abstraction, and then very briefly discuss the application of one of these methods to areas of mathematics outside of logic.
1 Introduction
This work is an extended investigation of the idea of regarding the set not as the extension of a class (in this case, the class whose existence is guaranteed in ZF by the Pairing axiom), but as a mathematical product , an approach which, as we will see, naturally gives rise to a hierarchy of products in some ways comparable to the recursive hierarchy of arithmetical operations etc. A major goal is to obtain a well-defined notion of the abstract of the predicate such that, informally stated, one has , and other theorems of the same kind. There are two facts to underline:
(i). In order to form such abstract classes, one is forced to introduce some kind of notion of -primality. Havoc would result if -divisible objects could be ``included" or ``added" to the class by satisfying the condition , because this property might not necessarily be inherited by its -divisors.
(ii). In order to form such abstract classes, the empty set must acquire a status unlike that of ordinary sets. Were such a class to be empty, , it must be well defined whether it is meant that , or perhaps neither.
It is not surprising that difficulties arise once, so to speak, the animals have all been released from their cages—once, that is, braces are eliminated and totalities give way to trickier (and more fluid) multiplicities. What is remarkable is that it seems that surpassing these obstacles, as a straightforward consequence, yields a naive comprehension principle, since two critical paradoxes, those of Russell and of Cantor, are correlated with the two points above. Cantor's paradox, which arises due to the unwieldy growth of power sets, can be avoided if the power set is not allowed to be -prime. Russell's paradox is avoided if we do not grant -primality to the empty set.
There is a second idea, closely related to the first, involved in our investigation. By building set theory not as the semantics of first order logic, but (through a kind of ``radical Henkinization") syntactically of a piece with it, we demonstrate that the basic philosophical principles of formalism can be applied in a setting further removed from logicist foundations than that setting established by Hilbert and his circle during the formative years of mathematical logic. Through a slight modification of the model-theoretic boundary between syntactic and semantic spaces, we invest formulas with location among model-theoretic individuals, allowing a number of simplifying reductions of the main artifacts of logic to be implemented. This appears to carry interesting effects through to more advanced levels of analysis in proof theory and model theory (§3), set theory (§4), and in other areas of mathematics (§5). From a broader vantage, it gives rise to an organization or Weltanschauung for mathematics which may be of interest to scientists and philosophers in areas related to mathematical foundations.
Our presentation is essentially self-contained, in order to address a broad audience inclusive of researchers (philosophers and mathematicians alike) in areas closely related to logic who may not have a background in logic comparable to those of specialists in the field. We would be most content if our work were considered interdisciplinary, and sincerely hope that our presentation contains something of interest to much-admired target readers in a plurality of disciplines.
2 Elements
2.1 Preliminaries
Our treatment of propositional calculus in this section will be brief. We study the domain of objects, or (synonymously) expressions. Anything that ``exists", we say, is an object; for example, propositions, desks, and numbers are objects. A handful of objects are designated as signs; these are:
The symbol Ø is the system's basic constant. The next three symbols of the list generate new objects. An object of the form for some object is a variable, an object of the form is an index, and a constant is, in addition to Ø, any object of the form . We call the first handful of variables etc. with the standard abbreviations etc.; this assignment is taken to be precise, but need not be specified here. In our presentation, when we have a need for metamathematical variables, we use Greek letters etc. We call the three unary connectives stops or logical stops. This name is chosen since they are meant to ``stop" the progress of a logical substitution (defined below). The index stop can be put aside for now; we will not see it again until §3.
The symbol is the fundamental relation, referred to as containment or universal containment. The symbols and are (binary) operations. Table 1 displays the system's basic formation rules. These rules allow, for any objects , , , the formation of a new object, having , , as subobjects or subexpressions; this language is heritable in the obvious sense. The object is the conjunction, union, or federation of and . [Footnote 2: We have searched for an apt name and symbol for the product (): a universal expression of objects-in-multiplicity, akin to ordinary addition but idempotent and, in practice, more basic. The symbol employed is a formal comma modeled after the commas in the notation for the pair and the Gentzen-Kleene-Rosser notation for the carriage of assumptions. The use of an ordinary comma is tempting but troublesome, so one considers using one or the other of and . However, because the product should carry neither logical nor set-theoretical connotation in every instance, stultifying and plainly odd-looking expressions arise as a result. The symbol smooths over these distinctions, and thus helps us achieve the desired conceptual view. Our approach is to use the terminology federation, union, and conjunction (and three distinct symbols) to refer to instances of the same operation appearing at three distinct positions in the order of operations (see footnote, p. 26). It is hoped that a new symbol and a new term will aid the reader and avoid confusion to the greatest possible extent.] The object is the intersection of and . We shall suppress these parentheses when they are unnecessary.
We call any object generated from Ø using only the given formation rules (and those that may be added later) formal objects, or well-formed objects. Objects which are not well-formed are informal objects, and include chairs, desks, letters of the alphabet (when not intended as abbreviations of formal objects), and strings of signs which are not well-formed. In this paper, we shall loosely allow consideration of informal or ``ordinary" objects as objects of the system, while giving precedence to the view in which all but the formal objects are laid aside or ignored. Thus the terms object and expression will hereafter refer to formal objects.
Table 1. Basic formation rules. [table body not recovered]
We make the following definitions. For we write . For we write , the negation of . For we write , the trivialization of . [Footnote 3: This notation mirrors the notation for the negation of in, e.g., Hilbert & Ackermann (1928).] We write for , the disjunction of and . This expression is classically equivalent to the expression ; however, the former expression remains serviceable in the intuitionistic case. [Footnote 4: Thus can replace Heyting’s connective .] For we may write , and for we write .
We call expressions which may be put in the form [Footnote 5: Given an object , that is, we may find objects and such that .] formulas; an object that is not a formula is said to be concrete. Expressions of the form may also be called equations or equalities in addition to being called formulas, as usual. In order to improve readability and respect normal usage, we will usually use the synonymous symbol instead of when we denote an equality between formulas. We may also similarly write in place of in the case when and are both formulas. In its role, this technique is comparable to the practice (common, for example, in physics) of using modified parentheses (brackets, braces) in lengthy expressions to highlight syntactic units. It is also comparable in its role to the system of dots devised by Peano and featured in the Principia Mathematica. Here, because our work is not too involved, and in order to gain familiarity with the machine language (so to speak), we shall use sparingly.
We must pause to clarify the aforementioned notion of logical substitution. First, an instance of an object in the syntax of an object may be one of two kinds: if is, in this instance, a subobject of , it is said to appear in , and otherwise to merely occur in . We denote the object modified so that every appearance (note) of the variable is replaced by the object by . We refer to as the target variable, and to as the substituend of the substitution. The reader can verify that this is a well-defined object for its inputs , , and , and that no expression which occurs within (``is guarded by") an index, variable, or constant stop is modified by any substitution. If there is no risk of confusion, in place of , we may sometimes write simply .
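The distinction between appearing and merely occurring can be made concrete with a small sketch. The following is only an illustration under stated assumptions: the class names `Empty`, `Stop`, and `Binary`, the operation tags `"contain"` and `"fed"`, and the function `substitute` are hypothetical names introduced here, and expressions are modeled as a plain tree. The one property it is meant to show is the one stated above: substitution replaces appearances of the target variable and never modifies anything guarded by a stop.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Empty:
    """The basic constant Ø."""


@dataclass(frozen=True)
class Stop:
    """A unary stop applied to an object; kind is 'var', 'index', or 'const'."""
    kind: str
    body: "Expr"


@dataclass(frozen=True)
class Binary:
    """A binary formation; op is a hypothetical tag such as 'contain' or 'fed'."""
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Empty, Stop, Binary]


def substitute(expr: Expr, target: Stop, substituend: Expr) -> Expr:
    """Replace every appearance of the variable `target` in `expr`."""
    if target.kind != "var":
        raise ValueError("the target of a substitution must be a variable")
    if expr == target:
        return substituend          # an appearance of the target variable
    if isinstance(expr, Stop):
        return expr                 # guarded: never descend into a stop
    if isinstance(expr, Binary):
        return Binary(
            expr.op,
            substitute(expr.left, target, substituend),
            substitute(expr.right, target, substituend),
        )
    return expr                     # the constant Ø is left unchanged


# x is a variable built on Ø; y is a variable built on x,
# so x occurs in y but does not appear in it.
x = Stop("var", Empty())
y = Stop("var", x)
expr = Binary("contain", x, Binary("fed", y, x))
result = substitute(expr, x, Empty())
```

In `result` both appearances of `x` are replaced by Ø, while `y` is untouched even though `x` occurs inside it.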
Table 2. Rules and axioms. [formulas not recovered]
R1. Exchange. R2. Substitution.
A1. A2. A3. A4a. A4b. A5a. A5b. A6. A7. A8. A9. A10. A11. A12.
A formal proof is a finite numbered list of substitutions, exchanges, and axioms. These steps of the proof are usually written in descending order down the page along with a citation or attribution column (as in the formal proof of Lemma 2.1 below). If it is possible for an expression to appear in a formal proof, we say that the expression may be obtained or deduced, and we call a formal proof of . Rules R1-2 and axioms A1-12 are presented in Table 2. They are styled postulates; a postulate is a rule or an axiom, according to Kleene (1952). They are based primarily on the systems of Heyting (1930) (see Mancosu (1998, pp. 311-27)) and Kleene, op. cit., but see also Hilbert (1927), and Hilbert & Ackermann (1928, pp. 55ff.). The system is what is known in proof theory as a substitution Frege system (see, e.g., Pudlák (1998)). The rule of Combination:
is immediately derived from A10 and A11, for formulas and .
2.2 Basics
Formal proofs are greatly simplified through the use of assumptions. An arbitrary object (normally a formula) is taken or assumed at a given step, and subsequent rules may combine relying on the assumption as if it were proven, until the assumption is discharged. At the discharge step the conclusion of the immediately preceding step is rewritten under the hypothesis of the assumption. A proof with assumptions is not complete until all assumptions have been discharged. That the use of assumptions does not allow any new formulas to be proven is a result known as the deduction theorem, which we will now prove. We first require a lemma.
Lemma 1.
Formal proof (attribution column only; step formulas not recovered):
A8
A8
A11
Exch. 4, 7
Lemma (A4a, A4b, Comb., A2, Exch.)
Exch. 2, 12
Exch. 8, 13
A7
A2
-
-
-
-
Comb. 14, 16
Exch. 22, 21
Because of A11, and because A1 implies that , the equation
(1)
is characteristic of formulas. Thus, Lemma 2.1 becomes, for all formulas and , formula (1) of Frege (1879),
(2)
The connection between (2) and assumption mechanisms was noticed early on (see van Heijenoort (1967, p. 465)). However, it does not obtain when and range over all the objects of our system, as can easily be checked by letting in (2) be concrete. [Footnote 6: Provided that one has already crossed over to the intended conceptual view, this point is obvious. We are considering logic and set theory as they might be conceived in a single common space or worldview in which a commitment exists to a general notion of objecthood encompassing both language and its referents. Consider, then, the content of (2) set-theoretically, e.g., in the case when and are ordinary objects (desks, chairs, etc.) and the relation is one of ordinary containment. In our system concrete objects live in the midrange or midst of the lattice. Viewed logically, the lattice encompasses an infinite array of truth values; a concrete expression (a desk, a number, a geometric figure, etc.) is an object whose truth value is neither true, nor false, but rather a thing unique to the object itself. As the formalism is developed in the sequel, the load on the symbol will be decreased by the introduction of the symbols etc., though the distinction between these symbols and the family etc. shall solely involve properties of their arguments. The formal unity between the concepts of implication and containment, whose naturality we propose in this section and whose fundamental importance to the system should become clear as we proceed, rests undisturbed throughout (echoing in certain respects an early system of Quine, see footnote, p. 18).]
Theorem 2 (Deduction Theorem).
A proof with assumptions in which all variables appearing in assumptions are held constant, and in which all assumptions are assumed and discharged in last-in-first-out order, may be converted to a formal proof.
Let be such a proof with assumptions. Consider a sequence of steps , , from an assumption step, , to the first subsequent discharge step, , inclusive, containing no intervening assumption steps. Replacing the entire string with the new string , it is clear that every prior use of the rule of Exchange can be carried out under the lingering hypothesis by inserting the Axiom A1, instantiating it as desired, [Footnote 7: As a minor additional hypothesis, one must let variables appearing in assumptions be distinct from those appearing in the axioms.] and applying Combination and Exchange. Every Substitution can be executed under the assumption since, by hypothesis, all variables are held constant. All of the axioms, finally, can themselves be written under the assumption of by using Lemma 2.1 and Exchange, and we obtain a new formal proof. If it has assumptions, we iterate the process described. If we continue until we have treated all the assumptions in turn, we obtain a formal proof of .
The stipulation that assumed variables be held constant is perfunctory at this level; there is no motivation to vary assumptions until quantifiers (and/or classes) are introduced. All naturally arising assumptions may in fact be discharged in any order, but one must exercise a bit of caution, since
(3)
and these expressions are not equivalent in general, [Footnote 8: If your wallet is empty, I will give you my coat. But I am not inclined, otherwise, to give you my coat for your wallet.] but are seen to be equivalent when is a formula. A few details remain, which we omit, showing that we may generalize these principles, and therefore make and manipulate assumptions naturally, provided that they are formulas.
The following relations easily obtain: conjunctions and intersections of formulas are formulas. The converses of axioms 3, 6, 11, and 12. Of union/conjunction/federation : associativity, commutativity, idempotence, identity Ø, zero element , as well as the relations
(4)
(5)
Of intersection : associativity, commutativity, idempotence, identity , zero Ø, and isotonicity (implying substitutivity). [Footnote 9: The term isotone (and the dual antitone) is found, e.g., in Birkhoff (1940); substitutivity appears, e.g., in Quine (1951); Bernays (1958). One may show that antecedents are antitone, that double antecedents are again isotone, etc.] The same laws therefore hold also for disjunction . Finally, both of the distributive laws hold, yielding a distributive lattice of objects under federation and intersection, and a Boolean ring of formulas (under conjunction and disjunction, or equivalently, federation/union and intersection) isomorphic to .
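The lattice laws just listed can be checked mechanically in a small finite model. The sketch below is an illustration only, under the assumption that objects are modeled as the subsets of a three-element set, federation as set union, and intersection as set intersection, in the spirit of the finite Boolean subset lattices mentioned in the abstract. The names `BASE`, `TOP`, `fed`, `meet`, and `check_laws` are hypothetical, and `TOP` stands in for the zero element of federation, whose symbol is not reproduced here.

```python
from itertools import combinations

BASE = frozenset({0, 1, 2})   # assumed carrier of the finite model
EMPTY = frozenset()           # plays the role of Ø
TOP = BASE                    # assumed zero of federation / identity of intersection


def powerset(s):
    """All subsets of s, as frozensets."""
    items = list(s)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


OBJECTS = powerset(BASE)


def fed(a, b):
    """Federation, modeled here as set union."""
    return a | b


def meet(a, b):
    """Intersection."""
    return a & b


def check_laws():
    """Assert the laws stated in the text on every triple of objects."""
    for a in OBJECTS:
        # idempotence, identity and zero elements
        assert fed(a, a) == a and meet(a, a) == a
        assert fed(a, EMPTY) == a and fed(a, TOP) == TOP
        assert meet(a, TOP) == a and meet(a, EMPTY) == EMPTY
        for b in OBJECTS:
            # commutativity
            assert fed(a, b) == fed(b, a) and meet(a, b) == meet(b, a)
            for c in OBJECTS:
                # associativity
                assert fed(fed(a, b), c) == fed(a, fed(b, c))
                assert meet(meet(a, b), c) == meet(a, meet(b, c))
                # both distributive laws
                assert meet(a, fed(b, c)) == fed(meet(a, b), meet(a, c))
                assert fed(a, meet(b, c)) == meet(fed(a, b), fed(a, c))
                # isotonicity of intersection
                if a <= b:
                    assert meet(a, c) <= meet(b, c)
    return True


LAWS_HOLD = check_laws()
```

A check of this kind confirms the laws only in the chosen model; it is no substitute for deriving them from the postulates.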
2.3 Remarks
All students of modern mathematics are well accustomed to the assignment of formulas and equations to some value or thing that is in some sense the storing ground of true formulas, and another which is the home and identifier of false formulas. It is a truth which is all the more remarkable given the advanced technology, both industrial and intellectual, available to scientists and philosophers who investigate these instruments in our own age, that whether they are taken to be abstract forms, letters, or empirical entities, they have up to now resisted all efforts to make them rigorously understood and canonized. We do not contest these efforts, which lie well beyond our ken. We follow formalists and intuitionists in distinguishing mathematical discourse from ordinary linguistic discourse, and place before ourselves only the very limited task of calibrating and arranging adequate grounds for the former, laying aside questions surrounding the latter. We make no claims here concerning the concepts of truth and falsehood as they arise in ordinary language.
Table 3 illustrates the syntactic medium in which we are developing logic and set theory. This arrangement may surprise some readers, many of whom may never have paused before over the evocative relations which naturally arise in proof theory (such as , denoting that the set of formulas follows under no assumptions). The author readily agrees that the arrangement is conceptually surprising (it stirs the imagination even after long familiarity); however, having weighed the matter carefully, he finds arguments which speak for the system as it has been defined, and to condemn it, neither any falsifier, nor such a drastic departure from the norm in foundations. Logic itself is surprising and elusive—it receives these qualities naturally and instantly from the concepts it traffics in. Assigning those formulas which are formally true to the formal constant Ø identifying empty and trivial structures is no more remarkable than assigning them to a letter , ensconced in a detached logical space. The formulas, in either case, remain what they are; we encounter them, and they inspire us, in the same way. The author believes that it is rare indeed to find an a priori basis for criticism of variations in formal orientation which is invulnerable to a challenge on a priori grounds. One must be practical and, as Hilbert implored us to do, measure theories by their fruits. Given the inconclusive suggestion of patterns in the structure of formal and informal thought, we should be neglecting the due diligence of science to leave either possibility unexamined, in case one orientation should offer distinct utilitarian advantages over its alternative. The author cautions the reader to carefully entertain a wide-ranging field of evidence before adopting the Einsteinian critique that the system is simple, yes—and too simple. 
So far, we have seen a number of ways in which the present system recommends itself: a compact set of basic symbols, accessible definitions, the felicitous distinction between the formula and the concrete term, as well as the convenience of a unified intuitive presentation. What remains to be discovered is, of course, unknown to the author. In the next section, the reader shall have the opportunity to consider the effects of the novel approach upon the fundamental devices of set theory and model theory.
Table 3.
Truth Table | Name | Generalization | Formula
TTTT | TRUE | empty/trivial set | Ø
FTTT | NAND | is a covering of |
TFTT | | containment |
TTFT | | containment |
TTTF | OR | at least one set is empty/trivial |
FFTT | | covers |
FTFT | | covers |
FTTF | XOR | disjointness |
TFTF | | itself |
TTFF | | itself |
FFFT | NOR | all sets are nonempty/nontrivial |
FFTF | | strict containment |
FTFF | | strict containment |
TFFF | AND | union |
FFFF | FALSE | absurd/undefined |
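The truth-table column of Table 3 can be regenerated by enumerating all sixteen binary connectives. The sketch below assumes that the four letters of each entry give the values at the argument pairs (T,T), (T,F), (F,T), (F,F), in that order. This ordering is an assumption, though it agrees with every named row of the table (NAND, OR, XOR, NOR, AND). The names `INPUTS`, `NAMED`, and `column` are hypothetical.

```python
from itertools import product

# Assumed order of the argument pairs behind each four-letter entry.
INPUTS = [(True, True), (True, False), (False, True), (False, False)]

# The connectives that Table 3 names explicitly.
NAMED = {
    "TRUE": lambda a, b: True,
    "NAND": lambda a, b: not (a and b),
    "OR": lambda a, b: a or b,
    "XOR": lambda a, b: a != b,
    "NOR": lambda a, b: not (a or b),
    "AND": lambda a, b: a and b,
    "FALSE": lambda a, b: False,
}


def column(f):
    """Four-letter truth-table entry of the binary connective f."""
    return "".join("T" if f(a, b) else "F" for a, b in INPUTS)


# All sixteen possible entries, and the names attached to seven of them.
ALL_COLUMNS = ["".join(bits) for bits in product("TF", repeat=4)]
NAME_OF = {column(f): name for name, f in NAMED.items()}

for col in ALL_COLUMNS:
    print(col, NAME_OF.get(col, ""))
```

The enumeration order produced by `product` differs from the row order of Table 3; only the pairing of entries with names is being illustrated.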
Aside from these remarks (and with respectful apologies to those in the philosophical domain), the author does not wish to suggest here a single philosophical vantage from which the formal system might best be viewed. According to strict formalism, the system is the article found on the page, dissociated from all semantic interpretations, and whatever metaphysics it might inspire is irrelevant to its use in practice. The formalist, as Carnap once said, is ``among the most pacifistic of mankind," able to partake in the methods of all others, and the author has, in principle, no disagreements with him. We believe that one could incorporate the system into intuitionism by introducing, to begin with, the symbol Ø as the empty hand (the moment before the unfolding construction of proof) while introducing the symbol to represent an impossible object of reflection, or what is called in Gödel (1944) ``the notion of `something' in an unrestricted sense". One may then proceed to define the system of Heyting. [Footnote 10: The intuitionistic case is interesting in its details, but requires additional work; we intend to discuss it elsewhere.] The opposite assignment conflicts with intuitionism, for consider: truth is obviously constructive, while the domain in which construction occurs cannot be entirely so. With the early work of Wittgenstein, there is a shared goal of a comprehensive Weltanschauung for analytic thought, and with sustained effort along these lines there might be developed a framework bearing out some of his views on the limitations of language and the world of coherent ideas. The relationship of these ideas to Platonism and its many later manifestations is also quite striking. The contradiction-as-unity () has some curious and remarkable parallels to concepts explored by several ancient Greek philosophers including Plato, later figures such as, for example, Plotinus, Cusanus, and Pascal, and modern figures such as Cantor, Brouwer, Whitehead, and many others.
We believe considerable opportunity remains to discuss the system in light of other leading schools. The relationship with Cantor's thought is further touched upon in §4.2, where we shall see that the novel approach adds a note of prescience to one of his more rarely-cited ideas.
3 Predicate Calculus and Set Theoretical Abstraction
To first summarily recount our progress in §2: to an unformalized intuitive space we have added the presence of objects, allowed that these objects may be collected into multiplicities (also considered objects), defined the intersections of such multiplicities (again, objects), and developed a system generating formulas (also objects) describing the containment relationship, if any, holding between two given objects. To this world of things, we wish to apply set-theoretical abstraction. This demands that we make a few incomplete remarks, before we begin, on the general question of how best to introduce sets of elements satisfying a stated condition—abstracts, or what most authors call classes—to a system like ours.
3.1 Individuals
In an early attempt to develop the class rigorously for analytic purposes, Peano (1889) refers to the formation of classes as inversion, apparently on the notion that if a formula were taken as a mapping into the truth values 0 and 1, the class derived from the formula would be its kernel. Prima facie, this seems quite natural, perhaps even obvious, but it runs into difficulties having to do with the fact that the domain of definition of such a function remains unclarified. In the particular case of such a system as the one at hand, there is no reason, at present, to expect that any given multiplicity should have a well-defined content. Since it is far from the case that all properties are heritable, this presents quite a grave problem.
Recall that we have chosen to build out from the notion of containment—the principle that objects are formed by fixing an interior. Let an object have ``content" if there is such that . What do we find ``inside" the set of objects such that has content? If the set can be formed, the transitivity of containment becomes destabilizing should there exist objects without content. Is such a predicate formally unstatable? It would appear not. In all nontrivial cases some objects indeed have content; if they all have content, the set is fine. With that allowance, however, we seem pushed into the boundless possibilities of a hopelessly ill-founded world. One way to deal with all of this trouble is to suppose that set-theoretical abstraction doesn't care about objects with content in this sense. If it cares only about objects that lack content, we begin to arrive at the notion of a discrete atomic point as a specialized multiplicity, and a theory of classes based upon it.
This approach, a retreat into the safety of wellfoundedness, bears a certain resonance, even if one were hypothetically prepared to dismiss the general tendency of modern set theoretical systems as an historic trend. It remains today an obvious and unavoidable fact (one that no one has ever been quite sure what to make of) that the world of mathematical models is very different, qualitatively speaking, from the world of ``real" objects. The many-pronged efforts during the early twentieth century to clarify the foundations of mathematics (Hilbert, whose school was perhaps preeminent among these, constantly emphasized the need to divest symbols of their ``meaning" in order to establish perfect rigor) have since been met with the runaway success of modern set theory, logic, and computer science, while the outcome of the movement to clear away the ambiguities of language (led by the logical positivists, significists, and others) remains unclear, having been frustrated by a wave of objections which crested during the postwar period and from which, it seems, it is not likely to recover. This suggests that the venture to clarify ordinary language and the venture to clarify mathematics are (at least in some significant respects) distinguishable. Instead of confronting or attempting to repair this dichotomy (and by more tenuous extension, that between mathematical models and empirically observed systems) we would, by enhancing the theoretical significance of the above-mentioned distinction, accept it into our system. In the endeavor to characterize things which are ``one" (robustly ``single" objects, or individuals), contradictions that have beleaguered the study of classes (the paradoxes named after Russell, Curry, Berry, Mirimanoff, Grelling, and so on) shall serve as so many tests to objectively measure candidate definitions. Should they be resolved, we shall have evidence that a satisfactory abstract definition of the intuitive notion of a point or individual is in reach.
Our approach will be based loosely upon the notion of primality in a unique factorization domain. [Footnote 11: This structure is discussed in most algebra textbooks, e.g. Hungerford (1974, §3.3).]
Definition 3.1.
We say that an object is single, , if That is, is single if it is a concrete object with no proper concrete content. A single object is an individual.
Proposition 3.
. . .
Proposition 4.
Individuals are irreducible and prime with respect to . In symbols:
(6)
(7)
Let . , so . If , then ; use . Now let and . Suppose . . So . Since , . So . Since and , , so , so . So .
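Proposition 4 can be sanity-checked in a toy model. The sketch below rests on several assumptions that go beyond what is written here: objects are the subsets of a three-element set, containment is ordinary subset inclusion, an object counts as concrete when it is neither the bottom nor the top of the lattice, and the product in Proposition 4 is federation (set union). All names (`is_concrete`, `is_single`, `is_prime`, `is_irreducible`) are hypothetical. Under these assumptions the single objects are exactly the singletons, and each is prime and irreducible.

```python
from itertools import combinations

BASE = frozenset({0, 1, 2})        # assumed carrier of the toy model
EMPTY, TOP = frozenset(), BASE     # bottom and top of the subset lattice


def powerset(s):
    """All subsets of s, as frozensets."""
    items = list(s)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


OBJECTS = powerset(BASE)


def is_concrete(x):
    """Assumption: concrete objects are those strictly between bottom and top."""
    return x != EMPTY and x != TOP


def is_single(x):
    """Concrete, with no proper concrete content (modeled as proper subsets)."""
    return is_concrete(x) and all(
        y == x for y in OBJECTS if is_concrete(y) and y <= x)


def is_prime(x):
    """x inside a federation a|b forces x inside a or inside b."""
    return all(x <= a or x <= b
               for a in OBJECTS for b in OBJECTS if x <= (a | b))


def is_irreducible(x):
    """x equal to a federation a|b forces a == x or b == x."""
    return all(a == x or b == x
               for a in OBJECTS for b in OBJECTS if (a | b) == x)


INDIVIDUALS = [x for x in OBJECTS if is_single(x)]
```

A two-element subset such as {0, 1} is concrete but not single in this model, and it fails the primality test because it lies inside {0} ∪ {1} without lying inside either part.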
Definition 3.2.
We say that is an element or member of an object , , if and .
Proposition 5.
implies or .
Proposition 6.
is not an element of any object, and nor is Ø.
Let the reader note, we do not see in the set-theoretical universe—it is a still higher maximum, but it serves only as a kind of universal wastepaper basket. We will later encounter an object, the ``universe" ; it, too, will fail to be an individual in any nontrivial system. Here, the system echoes the notion, advanced in systems like NBG and MK, that the universe is a proper class, and therefore not an element of any set.
3.2 Classes
Because we could ask meaningful questions about classes that have not been constructed—for example, whether (under some formal criteria) they may be constructed or not—we could introduce these objects initially as expressions (say, of the symbolic form ) which dwell in the lattice of objects but have ab initio no containment relationships with any previously defined concrete expression, nor with one another. Then we might ensure, proceeding abstractly, that the classes so understood behave formally as though they were objects already in some sense defined—possibly by construction. Where any kind of construction is possible, all conceivable ways of access must converge, and conclude in perfect agreement. In other words, while the introduction of classes may succeed in generating new descriptions or formulations, it must fail to generate new constructions. If the class represents a kind of bridge across a sea of complexity, we wish to ensure that it begins and ends somewhere. We would not build bridges on a planet with no land, nor if it meant giving up the land for bridges.
These ideas will resurface several times as we proceed, particularly in §4.2. Before turning to the formal development, let us consider one more rough analogy. Consider the sequence and the sequence The syntax of both expressions is very basic, and one should agree that, whatever these symbols mean relative to one another or to other things, if one of the two sequences could be constructed, then the other could as well. Most people would also understand the latter expressions to be defined in terms of the former expressions—however, they need not be. If they were not, then in order to establish equivalence with the ordinary definitions, one would demand that a proof could be supplied showing that we may understand one to be a subsequence of the other, or that a member of the latter sequence takes on a unique location somewhere in the former sequence. Even if one were not exactly sure where in the former sequence, say, the twentieth element of the latter sequence could be correctly placed, one could easily stand about discussing the twentieth element of the latter sequence, more or less as we are doing right now, and proof in hand, refer to it, think of it, regard it, etc., as an element of the former sequence simpliciter.
Now in place of the natural numbers of this example, consider the signs and the formation rules introduced in §2. We add classes to the system by adding the sign to the list of basic signs, and to the formation rules of Table 1 the formation rule
We shall axiomatize these new objects in such a way that in every respect they act precisely as though they were in fact members of the original group of well-formed objects—precisely those members that we think of when we imagine the expression factored with respect to .
We define the class or abstract of variable and object (sometimes called the extension of with respect to ), denoted or usually more simply with the abbreviation , to be a product of , the class index, and , the condition of the class. (Though may range freely over objects, there is little interest in the class if is not a formula.) The modified (N.B.) condition and the modified class index are subobjects of the class. We refer to an appearance of an index as a bound occurrence of , as usual, or say that appears as an index. Thus when simply appears, we sometimes say that appears free. This allows us in general to do all our thinking in terms of variables, some of which are ``guarded" from substitution processes.
3.3 Sets
Having discussed the individual and the class, we can introduce the remaining rules and axioms of this section, and begin to develop the rudiments of a theory of sets. We begin by introducing the simplest class of all:
Definition 3.3.
.
This class may be referred to as the single-element or type- universe, or the universal class. The formula Ø which appears as the condition can be thought of as expressing that there are no conditions upon members of the class.
If we select a multiplicity from the object just defined, then we have a set.
Definition 3.4.
An object is a set if . If is a set and , we say that is contained in , and write . If is also a set, we say that is a subset of .
Unless specified otherwise, the letters and shall always denote formulas, and we shall use the variables , etc. to denote sets. [Footnote 12: As was stated in §2, all variables are syntactically generic: for some object . We shall not make use of typed variables anywhere in the sequel.] We define all of the close relatives of and in the expected way. [Footnote 13: Because the relation is rarely used, it might be best to let denote that and . However, we shall preserve the usual convention regarding slashed relations here.] Schematically, this gives:
In expressions these relations will always bind more tightly than the ones introduced in §2, with the exception of , unless it is denoted by the character .
Passing for a moment to the context of the integers under ordinary multiplication (the model of primality that everyone knows best): we do not regard the integer 1 as a prime, since it is the multiplicative identity, and since it is what mathematicians call a unit. In a certain sense, it permeates multiplicative space; it is like the solvent in which prime number combinations are dissolved. Similarly, the basic idea of axiom schema A13 (a version of Cantor's comprehension principle, Frege's rule V, Quine's , and Zermelo's principle of Aussonderung) is to regard Ø as a different kind of object, distinct from ordinary single objects. It falls outside the domain of things which are targeted for membership in classes: it is a subset of any set, yet not a member of any set. Granting Ø such special status provides a kind of valve for the release of pressure, which checks the advance of Russell's paradox.
Axiom. (Schema) For every well-formed formula and variable ,
(8) |
Comprehension schema A13 is presented in a convenient deployable form in (8), while the formal axiom schema is displayed in Table 4. Note, as mentioned above, that since variables, when they are bound, are encased within an index stop, there is no way that a variable appearing in can be inadvertently bound in deriving from 's membership in .
There is one rule, R3, governing the classes introduced in this section. It states that under a group of assumptions which do not depend on a given variable (i.e., in which the variable does not appear), if an object contains an object , then the class is contained in the class . We point out the immediate derived rule
(9) |
The use of here is justified, since for any object ,
and R3 itself may be written
(10) |
Table 4.
R3. does not appear in .
A13 (Comprehension). objects , variables .
A14. variables .
A14a.
A14b.
Since these are the last of our rules, we may now note that the deduction theorem continues to hold, permitting the normal introduction and discharge of assumptions. We refer to a variable which appears in an open assumption as a named variable. This vocabulary is chosen naturally, because such a variable is thought of as the ``name" of something, and this name is retained as the steps of the proof flow along. We say that a proof is mindful of names if assumptions open during the occurrence of a substitution which has a named variable as target variable appear, upon discharge, with the named variable modified appropriately, and if the rule R3 is used only when the variable to be converted to an index does not appear in any open assumption. We will assume, from now on, that assumptions are formulas unless stated otherwise.
Theorem 7 (Deduction theorem).
Every formal proof with assumptions mindful of names can be converted into a formal proof.
We present the verifications to be inserted in Theorem 2.2; the setup is identical. Clearly, the rule R2 can be used to target named variables, provided the proof is mindful of names. We have to show that instances of R3 can be carried out under the lingering hypothesis . Let an occurrence of R3 (using variable ) act on step , ; then is of the form , where are objects, and does not appear in . Hence . We use this expression to derive a proof (requiring only the postulates of §2, for which the deduction theorem has been verified) of , apply R3 (noting that the proof is mindful of names), then insert a proof of , allowing the proof to proceed under the assumption . A simple induction argument shows that an arbitrary number of assumptions can be folded up in the same way, to make way for an occurrence of R3. [Footnote 14: In fact, because our system is not a resource logic, the open assumptions could simply be treated as an abiding unordered multiplicity.] This completes the proof.
In certain respects, we expect all sets to behave like deliberately constructed sets. Hence, one thinks of the set as having a unique decomposition, or of being able to break every set up into its single elements—to ``spread them out" on a universal table top. Since we have defined sets in terms of a class, , this will be an acceptable notion for us, since all sets will be classifiable, well-founded objects. At the present stage of development, however, there might exist sets which have subsets, but which do not have any elements. Suppose, that is, that there exists an object such that
and now suppose that is a set. With such an object present, could be put equal to without contradiction. We exclude the possibility of such objects being defined with a version of the principle of extensionality: that every set ``is determined by its elements" (Zermelo, 1908). The important property is
(11) |
which asserts that for any given set , the class is at worst a proper subset of . For definable sets, the two expressions are equivalent, and any finite union of single objects is definable (Theorem 3.14). For undefinable sets, however, we must enforce a rather imponderable discipline, and thus we include axiom A14b of Table 4.
Axiom. (Schema) For variables , and .
We can now begin our development in earnest.
3.4 Extensionality
The following theorem is easy to prove in the adopted line of development, and is of central importance to what follows.
Theorem 8 (Extensionality).
A straightforward application of -transitivity, relying on A14.
We can now examine some well-known antinomies. The Russell set is . A single object will belong to if and only if by A13, but for all such . Therefore contains no individuals. Therefore (by extensionality) is empty. [Footnote 15: Though we avoid Russell’s paradox, Cantor’s theorem on the size of the power set may still be proven (see §4.1), and the theories of computability and decidability appear unaffected, as one might expect.] A discussion of several similar paradoxically-self-containing sets is found in Beth (1965, pp. 483-4). Extensions of formulas
(which Quine calls ``epsilon cycles") are under our definitions all equal to Ø. It is not possible under our definitions for any paradox to arise from the relation , where implies . More generally, we have:
Theorem 9.
Curry's paradox arises from considering the extension (with respect to ) of the formula , where may be any formula or object. In naive set theory there is a paradox: let the set be called . By naive comprehension, . Therefore it is true that . So . So , which might be any statement at all, is true; all objects, in other words, are trivial. The A13 comprehension schema also implies that . However, it does not then follow from A13 that , only that if is single. It then falls to to determine whether in fact or not. [Footnote 16: In either case, the class is essentially a description which does not affect the underlying ontology.] If is the sole individual such that (or in other words, if there is but one individual such that ) then is single and , whereas if there are at least two such individuals, or no such individual, it is formally proven that is not single, and therefore that .
The recurring theme of these solutions is that paradoxes are avoided by countenancing multiplicities which are not individuals. We will see this theme again when we examine other paradoxes in §4. Next, we verify several basic theorems.
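The treatment of the Russell set and of Ø can be illustrated in a small finite model. The following is only a sketch under assumptions of ours, not the formal system: objects are encoded as bitmasks over `k` atoms, federation as bitwise OR, Ø as `0`, single objects as masks with exactly one bit set, and membership as "single and contained". The names `is_single`, `contained`, `member`, and `class_of` are hypothetical.

```python
EMPTY = 0  # stands in for Ø (an assumed encoding, not the paper's notation)


def is_single(x: int) -> bool:
    """An individual is modeled as a mask with exactly one bit set."""
    return x != 0 and x & (x - 1) == 0


def contained(x: int, y: int) -> bool:
    """Containment modeled as a subset test: federating x into y changes nothing."""
    return x | y == y


def member(x: int, y: int) -> bool:
    """Membership is assumed to mean: x is single and x is contained in y."""
    return is_single(x) and contained(x, y)


def class_of(k: int, pred) -> int:
    """Federate (bitwise OR) every individual of the k-atom model that satisfies pred."""
    result = EMPTY
    for i in range(k):
        c = 1 << i
        if pred(c):
            result |= c
    return result


k = 3
# Every individual is a member of itself here, so no individual qualifies.
russell = class_of(k, lambda x: not member(x, x))
# The class with no condition collects every individual.
universe = class_of(k, lambda x: True)
```

In this toy model `russell` comes out equal to `EMPTY`, while Ø is contained in every object and a member of none, which mirrors the special status granted to Ø in the text.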
Theorem 10.
The class of objects equal to is itself if is single. The class of all elements of a class is equal to the class itself. Alphabetic variants are equal; reindexing is permitted. Briefly:
(12) | |||
(13) | |||
(14) |
The proofs of these formulas offer no difficulty. One simply applies the general method of Theorem 3.12, as in, e.g., the following proof:
Theorem 11.
For all formulas and ,
(15) | |||
(16) |
Let (name , and the variables appearing in and ). Then by comprehension , this becomes . Dividing into cases (axiom A6), one concludes . [Footnote 17: The parentheses required here invite the use of (defined below).] Discharge the assumption (there are no longer any assumptions, thus no longer any named variables) and apply extensionality. The other direction is similar. For (16), use A6.
Theorem 12.
If and are sets, .
As in Theorem 3.15. We note that this theorem, like many mathematical theorems, requires a formal stipulation on the involved variables that is often omitted; in this instance, the variable cannot appear free in the expressions substituted for and . For us, however, this restriction is made automatically due to the index-variable distinction drawn at the syntactic level, and the formal statement of the theorem is here given in its entirety.
Since we still lack the ability to form sets of sets, we are still unable to define many familiar devices of set theory, such as the sum set and power set. However, we may define , the difference of and , , the symmetric difference or distinction of and , and , the complement of . Under our definitions, the union of two sets and is once again their federation, the fundamental product defined for any two objects and . Elementary formulas involving these can now be proven in the usual way, for example: , etc.
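As a sanity check on identities of this elementary kind, one can test them exhaustively in a small finite model. The sketch below rests on our own assumptions: sets are encoded as bitmasks over `K` atoms, union (federation) is bitwise OR, and the complement is taken relative to the full mask. The particular identities checked are our own choices for illustration, not the ones elided in the text.

```python
K = 3
UNIVERSE = (1 << K) - 1  # mask with all K atoms present (assumed stand-in for the universe)


def union(a: int, b: int) -> int:
    return a | b


def difference(a: int, b: int) -> int:
    return a & ~b & UNIVERSE


def symmetric_difference(a: int, b: int) -> int:
    return a ^ b


def complement(a: int) -> int:
    return UNIVERSE & ~a


def holds_everywhere(identity) -> bool:
    """Check a two-variable identity against every pair of sets in the model."""
    masks = range(UNIVERSE + 1)
    return all(identity(a, b) for a in masks for b in masks)


checks = {
    "symmetric difference as a union of differences": holds_everywhere(
        lambda a, b: symmetric_difference(a, b)
        == union(difference(a, b), difference(b, a))
    ),
    "difference via complement and union": holds_everywhere(
        lambda a, b: difference(a, b) == complement(union(complement(a), b))
    ),
    "double complement": holds_everywhere(
        lambda a, b: complement(complement(a)) == a
    ),
}
```

A brute-force check of this kind is of course no substitute for the formal proofs; it only confirms that the identities are not refuted in one small Boolean subset lattice.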
3.5 Quantification
The principle of Extensionality, first proposed by Frege in 1893, and later by Zermelo in 1908, has been subsequently used to provide a rigorous and intuitively natural basis for theorems throughout mathematics. We will use it to ground quantification. [Footnote 18: During the period just prior to the appearance of his work on NF, Quine (1937) experimented with a similar approach, and proved its equivalence to a system of Tarski. In the system, to the study of which Quine devoted two papers, the simple theory of types is developed using only the connectives of inclusion (containment) and abstraction. For example, the conditional is an abbreviation of the formula , where does not appear in or in . Since comprehension is one of the basic assumptions in topos theory and related areas, quantification by means of comprehension is commonly used in category-theoretical developments of deductive systems; see, e.g., Bell (1988, p. 70ff). The author is indebted to Andreas Blass for pointing out this and other references in topos theory.]
Definition 3.5.
Given a variable and formula , we write if , and if . [Footnote 19: We will suppress parentheses whenever they are unnecessary.] Our existential quantifier is the double negation of the intuitionistic one. We might also define symbols for the expressions : there exists a unique individual such that . Since does not imply that , we might write, say, to mean . This quantifier is surprisingly rare, however; in mathematical practice, discovery of the existence of structure elements or nonempty sets is the overwhelming norm. We make note of the presence, in the ordinary language of logic and mathematics, of the words such that coupled with the words there exists and the class of, which suggests, if weakly, a connection between the grammar of the quantifier and the class construction. But what then, about the incantation “for all , ”? If that hypothesis were true, shouldn’t such that appear here as well? Let denote the class (where we insert the needed substitutions should appear in or in ). Then the universal and existential quantifiers over the scope may be written: In this form, the scope appears in the condition only in the case of the existential quantifier, prompting such that , while the condition of the universal quantifier, being empty, is subject to elision; one need not remark the constraint such that…well, nothing. Some might argue that since their scope is confined to , these quantifiers establish a limited or even artificial system of quantification; others, however, might take the view that quantification over objects at the general level has already been defined to satisfaction by the universal rule of substitution R2.
Before continuing any further, we need to extend R3 to the case when may appear in :
Theorem 13.
Let , . Then . Let . Then , and since , as well, so . So by extensionality.
Once objects may have classes as subobjects, the management of assumptions becomes something of a puzzle. Theorem 3.18 presents a way around this obstacle: it directly follows from it that, given any objects , if has been proven, then one immediately has . Thus it becomes allowed to convert a named variable to an index, provided the assumptions in which the variable is named are discharged with the introduction of a universal quantifier. In practice this all falls naturally into place most of the time, but it can incite minor technical fallacies.
For example, given formulas and , the statement should imply that , but as writing produces absurdities by substituting for , this will not be a provable formula unless does not appear in or in . On the other hand, if for every individual , (a weaker assumption in general), the expected class relation obtains, as has just been shown.
Kleene (1952) investigates this phenomenon under the rubric variation (p. 102ff). There, one works with a system in which proofs with open assumptions are considered finished formal proofs. In order to keep his assumptions in order, he stipulates that variables which appear in assumptions are ``held constant" until they are substituted for. One is free to do so; however, they are then added to a list appended to the proof-theoretical symbol which indicates that they have been varied, and that further substitutions are no longer possible. His approach may have been influenced by the Peanian pasigraphy (by way of the Principia, perhaps), in which subscripts of the quantified variables are attached to certain logical connectives in place of quantifier operators. Having obtained a structural analog to Kleene's approach, we are prepared to proceed.
We next show how to pass from the universal to the particular, and vice versa.
Theorem 14 (Generalization).
Let . Suppose ; then , so by comprehension, and therefore . So by extensionality .
Theorem 15 (Instantiation).
1. Let 2. 3. 4. 5. 6. 7.
In the universal case, we have given a backbone to the quasi-reasoning which echoes in the corridors of mathematics departments, where proofs of universal statements proceed by ``choosing" elements ``arbitrarily", ``freely", sometimes even ``at random". Existential statements prompt arguments of a similar style: the mathematician argues that he may ``pick one" and insert it into his algebra; if he is subsequently able to derive an expression that does not depend upon which element he picked, he concludes that the statement is true, usually without accounting for the fate of the assumption naming his choice. This form of reasoning, too, can be filled in with the tools we have defined. We first pause momentarily to verify the inverse direction:
Theorem 16 (-Introduction).
Argue by contradiction, by concluding from that .
Theorem 17 (-Elimination).
Let appear nowhere in formulas and , and let appear nowhere in . Then if, ,
then
We might state the sense of this statement by saying that ``if it has already been proven that "—here, is simply an independent variable—``then by using R3 and R1 with the theorem as stated, it is proven that ." [Footnote 20: In Kleene’s system, one has . Notice that is false; let , , , .]
We need the following.
Lemma 18.
Let appear nowhere in . Then .
To prove the lemma, let , and let . Suppose (choosing a new variable) that . Then , that is, . Therefore . Therefore , and . So by extensionality. So , contrary to hypothesis. Discharging gives , or by the law of excluded middle.
Let and be defined as stated above.
(bind ) | |
(using Lemma 3.23) | |
(2), (1) Q.E.D. |
Many readers ought to suspect that a full system of first-order quantification over individuals has been established by these theorems. We can proceed to develop the identities of classical predicate calculus directly from them, or we can rely instead on the following theorem:
Theorem 19.
Any theorem provable in classical predicate calculus using the Bernays postulates is provable under our postulates A1-14 and R1-3. [Footnote 21: According to Hilbert & Ackermann (1928, ch. 3), the postulates defining the universal and existential quantifier in Hilbert-type systems are due to Bernays.]
Although the reader may know the four postulates in question well, we should probably review them. In Kleene's notation, they consist of two axioms: , and , where is an individual free for in , along with two rules:
where the variable does not appear free in the formula .
Theorems 3.20 and 3.21 are analogues of Bernays' axioms, with the proviso that be an individual converted to the proviso that be single. [Footnote 22: If we refer to single objects as individuals, then the vocabulary is coincidentally a match.] The rules may be translated into our system as the formula
where and are formulas and does not appear in . This formula may be formally proven using the methods already presented.
3.6 Elementary Model Theory, Consistency
We will conclude this section with a short investigation of the metamathematics of the system we have introduced, following Kleene (1952); Simmons (2000). Because the new approach to the core syntax of logic adopted in this paper demands some revision of the details of an ordinary consistency proof (while allowing much to stand unchanged), we will proceed in a thorough style, assuming that the reader familiar with logic will easily pick out the relevant changes.
We begin by introducing revisions to the system that are convenient for the work of this section, since we are only interested in the classical case. All our previous work goes through essentially unchanged. [Footnote 23: Though in order to undertake this step we must assume primality (Proposition 3.7), a principle which as an axiom is unnatural in the author’s opinion. Trying to smooth the wrinkle out causes it to pop up elsewhere; in the balance it seems A5-6 are wanted after all.]
1. Cross out axioms A5, A6, and formation rule .
2. Add abbreviations: for , and for .
We (momentarily) call an object a string if may be generated from the following formation rules:
for strings , objects , and variables | ||||
for strings and | ||||
for strings and variables |
A string is subject to the following substitutive one-step reductions:
provided that does not appear in . If a string may be reduced to a well-formed object, we call the string a well-formed proof string or simply a proof string. We say that an object is provable if there is a well-formed proof string and a many-step reduction reducing to . In this case, we may sometimes call the theorem of , and a proof of , and denote this relation . The length of proof string is defined as usual.
Let be a nonnegative integer. A -structure is a metamathematically collected group (hereafter: a set) , where , the constants of , is a set . The tables of are sets of ordered triplets, often summarized (informally) in the form of written tables. The tables are a generalization of the notion of a truth table. It is convenient to correlate the values found in the tables of a -structure to the decomposition of each concrete constant into its base 2 expansion. For example, in any -structure with , will be a member of the -table, indicating that is valid (in the sense defined shortly). This is because , and (see Table 5). The individuals of each -structure are thus the constants . We also note that the value of an object should be, according to , an object having weight 0 (as defined below).
Valuation of closed objects. Let be a subobject of . The class degree of an instance of with respect to is the number of classes in having in its scope, i.e., the number of classes escaped on a walk out of the object beginning at . If an object (formula or concrete) contains no variables, it is said to be closed. The weight of a closed object , , is defined recursively as follows, letting be formal objects: , , . This is a ``metamathematical" function, since its argument is not substitutive. The procedure we shall adopt for the evaluation or valuation of a closed object with respect to a -structure proceeds in two stages.
Direct evaluation: If every subobject in has class degree , then the object is class-free. Class-free objects are evaluated as follows: replace all subobjects having weight 1 with objects having weight 0 by using the tables of . This shifts the weights of all weighted subobjects by 1. If necessary, evaluate all multiplicities of weight 0. Iteration of this procedure eventually yields an object of weight 0.
Declassification: If not, then all classes with maximum class degree appearing in are replaced systematically with a subset of the constants of which happen to be single, namely . For each , , is tested in order with the condition of (which may be directly evaluated). If evaluating yields Ø, the constant is included in the multiplicity. If takes any other value, then Ø is included in the multiplicity instead. When this is finished, the multiplicity is itself evaluated, and the class is replaced. After treating all such classes , this step yields an object with strictly smaller class degree. The entire algorithm is then repeated until itself is class-free and may be evaluated. In the case that this procedure yields the product given , we say that takes the value in , or that the object is -valid in , and write . If the object is Ø-valid, we may say simply that it is valid.
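The declassification step for a single class can be sketched as follows. This is a minimal illustration under assumptions of ours: constants are encoded as integers whose base 2 expansion lists the individuals they contain, Ø is `0`, the single constants are the powers of two, the evaluation of a multiplicity is bitwise OR, and the condition of the class is supplied as a Python callable that returns the constant it evaluates to. The names `individuals`, `federate`, and `declassify` are hypothetical.

```python
from typing import Callable, List

EMPTY = 0  # stands in for Ø (assumed encoding)


def individuals(k: int) -> List[int]:
    """The single constants of the structure, assumed to be the powers of two below 2**k."""
    return [1 << i for i in range(k)]


def federate(a: int, b: int) -> int:
    """Evaluation of a two-element multiplicity, assumed to act as bitwise OR."""
    return a | b


def declassify(k: int, condition: Callable[[int], int]) -> int:
    """Replace a class by the evaluated multiplicity of the individuals passing its condition.

    Each individual is tested in order. If the condition evaluates to Ø, the
    individual itself is included in the multiplicity; if the condition takes
    any other value, Ø is included instead.
    """
    result = EMPTY
    for c in individuals(k):
        contribution = c if condition(c) == EMPTY else EMPTY
        result = federate(result, contribution)
    return result


# A condition that always evaluates to Ø admits every individual.
everything = declassify(3, lambda c: EMPTY)
# A condition that never evaluates to Ø on an individual admits none.
nothing = declassify(3, lambda c: c)
# A condition that evaluates to Ø only on the first and third individuals.
some = declassify(3, lambda c: EMPTY if c in (1, 4) else 2)
```

The sketch handles one class whose condition is already class-free; the full procedure repeats this step from the classes of maximum class degree outward until the whole object is class-free.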
Objects that are not closed. If an object in which variables appear may be closed by an assignment of its variables to the constants of (including Ø and ) such that it takes the value , then we say that the object can take the value in , or that it is -satisfiable in . If the object is Ø-satisfiable, we may say simply that it is satisfiable. If under every such assignment can take no other value except , we again say that is -valid.
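The definitions of satisfiability and validity for objects with variables amount to an exhaustive search over assignments, which can be sketched as follows. The encoding is again an assumption of ours: the constants are the integers below `2**k`, Ø is `0`, and an object is represented by a callable that maps an assignment (a dictionary from variable names to constants) to the constant it evaluates to. The helper names are hypothetical.

```python
from itertools import product
from typing import Callable, Dict, Sequence, Set

EMPTY = 0  # stands in for Ø (assumed encoding)

Assignment = Dict[str, int]


def possible_values(obj: Callable[[Assignment], int],
                    variables: Sequence[str], k: int) -> Set[int]:
    """Collect the values an object takes under every assignment of its variables."""
    values = set()
    for choice in product(range(1 << k), repeat=len(variables)):
        values.add(obj(dict(zip(variables, choice))))
    return values


def is_satisfiable(obj, variables, k, value=EMPTY) -> bool:
    """True if some assignment makes the object take the given value."""
    return value in possible_values(obj, variables, k)


def is_valid(obj, variables, k, value=EMPTY) -> bool:
    """True if the object takes the given value and no other under every assignment."""
    return possible_values(obj, variables, k) == {value}


# Hypothetical object: evaluates to Ø exactly when OR-federation commutes on x, y.
commutes = lambda env: EMPTY if env["x"] | env["y"] == env["y"] | env["x"] else 1
# Hypothetical object: the bare variable x, which evaluates to whatever it is assigned.
bare_x = lambda env: env["x"]

commutes_valid = is_valid(commutes, ["x", "y"], 2)
bare_x_satisfiable = is_satisfiable(bare_x, ["x"], 2)
bare_x_valid = is_valid(bare_x, ["x"], 2)
```

Under this encoding a bare variable is satisfiable but not valid whenever `k` is at least 1, whereas in the structure with `k = 0` the only constant is Ø and the same object is valid.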
Statability. An object is statable in or statable in if no constants except those of appear in . is provable in , or -provable, if there is a proof string that is statable in (in the obvious extension of the just-stated definition of statability) such that . Note that if an object is provable in , then it is necessarily statable in .
We refer to as the empty structure. In accordance with ordinary usage, we refer to class-free formulas statable in the empty structure as propositions. The system restricted to class free formulas (including propositions) statable in the empty structure we call the extended propositional calculus. The system of objects (including arbitrary formulas and concrete objects) statable in the empty structure we call the extended predicate calculus. [Footnote 24: Predicates, or predicate letters, can be modeled by variables. They are functions which take the value Ø or under their inputs, normally taken to be individuals.]
Theorem 20.
Any formal object provable in is valid in , for every . In particular, all theorems of the extended predicate calculus are valid in any -structure.
We only sketch the proof. Fix , and let be statable in . We wish to show that , that is,
where the superscripted operators represent (in order) the reduction of to its theorem (say, ), the division of into a set distinguished by different possible assignments of the variables of to constants of , the declassification of the in , and finally, the direct evaluation of the declassified using the tables of . We proceed by induction on the length of .
First, if , then is an axiom, statable in the empty structure and thus statable in . Choosing arbitrarily, we can verify that in this case is valid in . For example, to verify A1, , note that if the is assigned to the resulting formula is valid. Suppose , with , and check that in all possible cases, the formula is valid. This type of inference can be applied to produce verifications of axioms 1, 2, 3, 4, 7, 8, 9, 10, 11, 12, 13, and 14. We omit the remaining details of this step.
Next, assume the claim holds for every proof of length , and let statable in have length . There are three cases to consider. Let .
Case 1. Let . First, assume that the theorem of , , is closed with the possible exception of the variable . We know that the theorem of , , is of the form , for objects , and we know that is valid in (hereafter: ). Since (for we may assume that , since otherwise the valuation of is immediately Ø), implies that as well, for any individual in . So if (as in the previous step), then for every , , so , showing that is valid. If is not closed, the same argument can be repeated on each variable assignment.
Case 2. Let . The validity of implies that assigning any value to the variable , if it appears in , the resulting object is valid. So for any possible value of , will clearly also be valid. Since declassification and valuation both proceed from the leaves to the root of the syntax tree, this proves that is valid also in this case.
Case 3. Let . If , then is of the form , for object . Since , must necessarily also be valid; otherwise will not be valid, contrary to hypothesis.
Corollary 21.
No proof string statable in proves any constant of except Ø. In particular, is not provable in , for every .
In the author's opinion, our approach to the distinction between the syntax and semantics of model theory (treating it as a distinction in method instead of a distinction in kind) enhances the model theoretic approach in many respects, simplifying proofs and allowing ideas to more easily commingle with those areas of mathematics where a syntactic/semantic distinction is not implemented. Other routes to consistency, e.g., via the subformula property, cut elimination, the extended Hauptsatz, and the reduction of the Hilbert-type system to the Gentzen-type systems to which these theorems apply, require a great deal of technical verification to carry out. Consistency is more easily provable in natural deduction systems, by using the inversion principle and normalization lemma of Prawitz (1965), but more effort is required to relate these systems to Hilbert-type systems.
4 Second and Higher Order Logic
The quantifiers that were introduced in §3.5 are of the first order, insofar as their range is constrained to single objects, or in essence those bodies we usually term individuals. As yet there is no way, given only the tools defined so far, to form a collection of subsets distinct from their set-theoretic sum, nor is there a way to quantify over such a collection. The first-order quantifiers of ZF, in stark contrast, range over any set in the cumulative hierarchy. At this stage, further development requires that we select one of two very broad options. First, we can bestow upon sets a route to obtaining the property of singlehood, thereby filling the universe in the way normally done in ZF and similar theories. Second, following the paradigm of putting eggs into cartons, cartons onto racks, racks onto pallets, pallets into containers, and so on, we can build, atop of , a larger universe of collections. We present versions of these two approaches in this section, beginning with the latter, and discuss Cantor's theorem and the Burali-Forti paradox.
4.1 Genera
In certain respects the approach of Russell, Ramsey, Quine and other typed systems is quite similar to what follows in this subsection, so much so that the present system can be considered a descendant of theirs.
We carry out a construction of the collection of sets, which we style the genus. Let a pair of signs and be introduced, together with the formation rules
Thus () is a binary operation on objects, called type-1 federation. Objects and are subobjects of the object , as usual. In our axiomatization of (), we shall rely upon equalities between objects to formally define a product that is identical to union/conjunction/federation (), except that it does not play the role that federation plays in logic.
Axiom. Type-1 federation is associative, commutative, idempotent, substitutive, and has zero element Ø (axioms B1-6 of Table 6).
Definition 4.1.
Let and be objects. We write , or simply if context allows, if The relation is called type-1 containment or first containment. When , we say that is contained in , or type-1 contained in, or first-contained in .
We can eventually define type- containment where is any integer or even any ordinal.
Theorem 22.
The type-1 containment relation is reflexive, transitive, and antisymmetric. In addition,
(17) | |||
(18) | |||
(19) | |||
(20) |
To prove that is transitive, let and . Then and . Therefore
Table 6.
B1.
B2.
B3.
B4.
B5.
B6.
B7.
B8.
B9.
B10.
B11.
(objects , variables )
B12a.
B12b.
(variables )
R4. does not appear in
C1.
C2.
C3.
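The order properties claimed in Theorem 22 can be checked by brute force in a small model. The sketch rests on two assumptions of ours: that type-1 containment of one object in another is defined by the condition that their type-1 federation equals the second object, and that type-1 federation may be modeled as the union of families of sets. Both `federate1` and `first_contained` are hypothetical names.

```python
from itertools import combinations, product


def subfamilies(items):
    """All sub-collections of the given items, each returned as a frozenset."""
    items = list(items)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def federate1(a: frozenset, b: frozenset) -> frozenset:
    """Type-1 federation, modeled here as the union of two families."""
    return a | b


def first_contained(a: frozenset, b: frozenset) -> bool:
    """Type-1 containment, read as: federating a into b leaves b unchanged."""
    return federate1(a, b) == b


# The three nonempty sets over two individuals, and the eight families of such sets.
SETS = [s for s in subfamilies([0, 1]) if s]
GENERA = subfamilies(SETS)

reflexive = all(first_contained(g, g) for g in GENERA)
antisymmetric = all(
    a == b
    for a, b in product(GENERA, repeat=2)
    if first_contained(a, b) and first_contained(b, a)
)
transitive = all(
    first_contained(a, c)
    for a, b, c in product(GENERA, repeat=3)
    if first_contained(a, b) and first_contained(b, c)
)
```

The check uses only the associativity, commutativity, and idempotence that union shares with the operation axiomatized in B1-6, so it illustrates the structure of the argument rather than anything specific to genera.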
Definition 4.2.
An object is type-1 single or first-single, or a generic individual, , if it has no generic content. That is, We write , or , if and . A genus that is not first-single is a proper genus. A nonsingle, first-single object is a plurality.
Axioms B7-10 further characterize generic individuals. Axiom B7 ensures that no proper genus can be single (this principle is exigent in light of Theorem 4.33 below). Axiom B8 introduces the simplifying assumption that there are no generic individuals beyond the sets. Thus the generic individuals are the sets: all individuals together with all pluralities. Axiom B9 grants meaning to the expressions and when either or is a proper genus. Axiom B10, perhaps the most important of this group, asserts the primality of first-single objects with respect to .
Proposition 23.
As in §3.1.
Proposition 24.
Suppose that . Then is a proper genus by Definition 4.29. Axiom B9 therefore implies that if , then , and hence .
The rest of the apparatus is quite analogous to previous definitions and postulates. We define, for every variable and proposition , the type-1 class of and or the class of sets such that , formally as . We shall usually denote this , or , if it is necessary to distinguish the type. The modified condition and the modified class index are subobjects of the class . These objects are subject to postulates B11, B12, and R4.
Definition 4.3.
, the class of sets with no condition. If the object , is said to be a genus. If is a genus and , we say that is a subgenus of .
is the type-1 universe, or the universal genus. Script letters etc. are used to denote genera. By letting , , we define type-1 quantifiers, which range over sets as well as individuals. We usually denote these and .
The curious reader may verify that letting
everything remains the same as in §2 and §3 at the generic level, and flows from the principle of extensionality much as before. For example, to verify A6, let and . Then (by primality) Here, as elsewhere, much depends upon the formula .
We will not do much with genera here (though see §5), but we can define a few familiar objects. For every genus , the sum set of is We can also define, for all sets , its set power or subset genus, the genus of all subsets of : Thus we have, for single
In general, the power of a finite set of $n$ individuals has $2^n - 1$ elements. If this is something less than the reader expected, he or she will note that the cardinality of infinite sets is unaffected by the missing null (as noted above, Ø is not an element of any set); in fact, the power is itself of less theoretical importance without the motivation of Zermelo's principle of Aussonderung and the Power Set axiom. The usual value can be regained with a simple adjustment of the definitions above: instead of having as the type-1 identity Ø, one introduces a new identity which might be denoted , or even . However, since these constants quickly proliferate, the user of the system is then forced to define scores of symbols all intuitively signifying a trivial structure, a ``set of nothing". The author's approach is therefore to pursue the simpler of the two options, and to stand prepared should a clear and distinct reason for introducing typed identities arise. This may well occur, even though it has not so far; we should recall the nearly two millennia that passed in the West before 0 was finally distinguished from Ø. [Footnote 25: The infinite reduplication of each logically definable class at each type in the simple theory of types (which, as noted in Quine (1938, p. 137), is “particularly strange in the case of the null class”) was foremost among the concerns that gave rise to NF.]
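The count can be checked by direct enumeration. The sketch below assumes that the power of a set of n individuals consists of its nonempty subsets only (since Ø is not an element of any set), so that the expected count is 2^n - 1; the function name `subset_genus` is hypothetical and the encoding of individuals as integers is ours.

```python
from itertools import combinations


def subset_genus(elements):
    """Enumerate the nonempty subsets of a finite collection of individuals."""
    elements = list(elements)
    return [frozenset(c)
            for r in range(1, len(elements) + 1)  # r starts at 1: Ø is left out
            for c in combinations(elements, r)]


# Sizes of the power for sets of 1 through 5 individuals.
counts = {n: len(subset_genus(range(n))) for n in range(1, 6)}
```

Starting the subset size at 0 instead of 1 would restore the usual count of 2^n, which corresponds to the adjustment by a typed identity that the text mentions and sets aside.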
We assume (without formal development) the notion of a relation as a set of ordered pairs, that is, of substitutive products , for objects and . In place of , we write . We call a relation a function if , etc.
Theorem 25 (Cantor).
Let the set be a plurality. Then there is no injective function from onto .
Let be a plurality, let be an injective mapping of onto , and consider the set . On pain of contradiction, must be empty. In that case (encountered when the Russell set was examined above), an element of can be mapped to no other value but itself. But since is a plurality, this contradicts the hypothesis that is onto.
Because Cantor's theorem is unavoidable, genera cannot be members of . If genera were allowed to be single, Cantor's theorem would lead directly to Cantor's paradox, because the power of , a subgenus of , would be mapped in its entirety into itself, leading to contradictory results on the cardinality of . This means, effectively, that it is not possible to form a set of proper genera. Per axiom B9, such sets are left undefined.
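For small finite sets, the statement of Theorem 25 can be confirmed by exhaustive search. The sketch below is only an illustration under two assumptions of ours: that the power consists of the nonempty subsets, and that a plurality is a set with at least two elements. It tests every function, injective or not, which is more than the theorem requires.

```python
from itertools import combinations, product

def nonempty_subsets(items):
    """All nonempty subsets of a finite collection, as frozensets."""
    items = list(items)
    return [frozenset(c)
            for r in range(1, len(items) + 1)
            for c in combinations(items, r)]

def maps_onto_power(x):
    """True if some function from x is onto the nonempty subsets of x."""
    x = list(x)
    power = nonempty_subsets(x)
    # A function x -> power is encoded as a tuple of values, one per element.
    return any(set(values) == set(power)
               for values in product(power, repeat=len(x)))

print(maps_onto_power([0]))        # True: a single object has a one-element power
print(maps_onto_power([0, 1]))     # False
print(maps_onto_power([0, 1, 2]))  # False
```

The first case shows why the hypothesis that the set be a plurality cannot be dropped in this setting: with the null excluded, a single object is mapped onto its own power.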
4.2 Totalities
The reader is no doubt familiar with the construction introduced by von Neumann in 1923 defining sets that may be conveniently identified with order types. This construction is carried out within our system in this section. The original intent and style of thought is preserved: rather than sets (as given above) we formally generate a class of individuals sharing characteristic properties (which we label totalities) and apply abstraction to study them as an unrestricted whole. We begin by introducing two new symbols, and , with formation rules
(we will suppress these parentheses when they are unnecessary). With these come axioms C1-3: [Footnote 26: As noted above, the product , equivalent to except for a lower place in the order of operations, may be used at times to suppress unwanted parentheses. This gives rise to the order of operations as follows (where, for added context, ring operations are added):]
The operation is analogous to the braces of normal set notation. Following D. R. Finkelstein, we call it the bracing operation. The antibracing operation strips away the shell introduced by bracing. We call objects such that for some totalities. Let the ordered pair of two objects be the totality ; relations may now be defined as usual (as in, for example, Suppes (1960), or Kunen (1980)). In particular, let if . We say that a set is transitive if , as usual; however, a totality is subtransitive if .
Definition 4.4.
An object is an ordinal if it is a subtransitive totality and is well-ordered by .
The first few ordinals are then etc., or (introducing abbreviations) etc.
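For comparison, the first few ordinals of the textbook von Neumann construction can be generated and tested mechanically. This sketch is an assumption-laden stand-in: it uses ordinary Python frozensets and the standard successor a ∪ {a}, not the bracing operation or the notion of subtransitivity introduced above, and all function names are ours.

```python
def successor(a):
    """Standard von Neumann successor: a ∪ {a}."""
    return a | frozenset([a])

def is_transitive(a):
    """Every element of a is also a subset of a."""
    return all(x <= a for x in a)

def membership_is_strict_linear(a):
    """On a finite set, membership well-orders a iff it is a strict linear order."""
    elems = list(a)
    for x in elems:
        if x in x:
            return False                      # irreflexive
        for y in elems:
            if x != y and (x in y) == (y in x):
                return False                  # exactly one of x∈y, y∈x must hold
            for z in elems:
                if x in y and y in z and x not in z:
                    return False              # transitive as a relation
    return True

ordinals = [frozenset()]
for _ in range(4):
    ordinals.append(successor(ordinals[-1]))

for n, a in enumerate(ordinals):
    assert len(a) == n
    assert is_transitive(a) and membership_is_strict_linear(a)

# Collecting the ordinals built so far yields the next ordinal.
assert frozenset(ordinals[:4]) == ordinals[4]
```

The final assertion is the finite shadow of the observation that a collection of all ordinals would itself have to be an ordinal.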
Forming a totality out of the set of all ordinals leads to the oldest modern set-theoretical paradox.
Theorem 26.
Let Then if , then .
, the set of all ordinals, is transitive and well-ordered by . Therefore is itself an ordinal, that is, . But for any , , that is, . Therefore , a contradiction.
This is certainly not the first time that this paradoxical result has come to light in this fashion. It was found by Cantor while he was developing the theory of the transfinite, and it was formally derived by Burali-Forti in a paper which appeared around the turn of the century. Many years later, it impinged upon Quine when he extended his system NF with the class-existence axioms of Quine (1951). After the first edition of the book appeared in 1940, J.B. Rosser discovered that the system it contained (known as ML) was inconsistent. Quine had posited that all classes having stratified conditions and whose quantifiers could range freely over classes could be considered elements, and made members of his universe . Having worked out the difficulties related to Cantor's paradox of the largest cardinal, he had been convinced, wrongly as it turned out, that the system would not be troubled by the paradox of Burali-Forti. [Footnote 27: A repair of ML due to Hao Wang was used for the 1951 edition cited in the bibliography.]
After what happened to no less creative and exact a logician than Quine, we must count ourselves fortunate that we noticed the paradox before hastening to press ourselves. This may be due in some part to the simplicity of the derivation here compared to the rather lengthy derivation required in ML. The problem with the naive procedure seems to be that the formation rule
and the axiom
when combined give rise to an ungovernable feedback when operating in tandem with the class structure introduced in §3. A class can be chosen and traded in for a new individual; the creation of an individual sparks a revision of the classes. Given such a potential source of instability, it shouldn't be surprising that a paradox was discovered lurking.
Let us see if we can distill the logic of Theorem 4.35. Since the crucial property is , we could consider the class
which contains , and falls to the Russell antinomy. [Footnote 28: Call this set . is an object; is single. If then i.e., . And vice versa: if i.e., .] It seems that classes can now be definitions of the non-predicative kind rejected by Poincaré. Having betrayed our intention (of §3.2) never to use any class as a vehicle for the formation of new objects in addition to those already in place before the class has been formed, we have found ourselves again bushwhacked by Russell's paradox.
The problem may be as old as set theory itself, but we have new notions to employ, namely the distinct concepts of multiplicity and totality, which allow us hope that the paradox might yet be dispelled. It must be noted, however, that these notions were introduced already for this purpose by Cantor. First to arrive at the difficulty, he concluded that is a multiplicity for which there is no corresponding totality. He wrote to Dedekind in 1899: ``If we start from the notion of a definite multiplicity (a system, a totality) of things, it is necessary … to distinguish two kinds of multiplicities… For a multiplicity can be such that the assumption that all of its elements `are together' leads to a contradiction, so that it is impossible to conceive of the multiplicity as a unity, as `one finished thing'." [Footnote 29: See van Heijenoort (1967, pp. 113-7).] He called such multiplicities absolutely infinite or inconsistent multiplicities.
Cantor's interpretation of the difficulty, that the combustible totalities cannot be formed, or that , , etc., is motivated by the intuitive view (held by Cantor) that this construction and all such constructions stand on the cusp of the natural boundaries of the classifiable. For , in particular, this account is suggestive, given the commonplace associations (from a range of sources upholding varying standards of rigor) between the incomplete infinite (the Cantorian ``absolute" infinite) and the inconceivable or impossible. In one form or another, this vision of a great divide in the ontology of set theory is shared by almost all of Cantor's descendants, but our work has at least one profound contrast with others: it envisions the boundary brought down, so to speak, from a great height, and made into something more familiar: the distinction between one and many.
By taking what we have here denoted instead of what we have here denoted as set-theoretical membership, the individual is to this system essentially what the set is to NBG and MK. Since set theorists have a long experience with the use of proper classes in these systems (whose relationship with ZF has long been well understood), accepting the existence of proper, ``unfinishable", or ``untotalitizable" classes as Cantor recommended will likely seal the present system from further duress. [Footnote 30: For instance, Mirimanoff’s paradox is thus resolved. Mirimanoff considers the set of all well-founded sets, or rather, the totality of well-founded totalities. It must be well-founded, since an infinitely descending chain from implies an infinite descending chain from , but is well-founded. If is well-founded, however, then , contradiction. In NBG, it is deduced that is a proper class, or in Cantor’s idiom, is an inconsistent (we would almost prefer to say “perfect”) multiplicity, one which cannot be gathered into a whole.] As Fraenkel points out, however, in the preface of Bernays (1958, §7), a ``vast field of uncertainty still remains" between those totalities which trigger an alarm and those which can be constructively imagined. In most axiomatizations of set theory (including ZFC, NBG, and MK), this field of uncertainty is pushed back with seemingly innocuous reassurances (, , , ) and other axioms impacting the structure of the transfinite.
There is, in any case, a guiding assurance: the paradox of the largest ordinal number has arisen for us in precisely the way it arose for Quine in the 1940s—because computationally tricky totalities (or elements, in the case of ML) have been introduced to an underlying system which, as Quine put it, ``is a completely safe basal logic…to which more daring structures may be added at the constructor's peril" (Quine, 1941, p. 135). As an independent form, the system of §3 remains a consistent and adequate setting for basic mathematical intuition—the intuition of aggregates of objects, finite in number, invested with location in a common space. This lends support to the view that the problem of the Burali-Forti paradox should be diagnosed as an attribute of the construction of naïvely defined totalities, and does not reflect doubt upon the fundamental principles governing the multiplicity and the class outlined above.
There likely exist a number of modifications of C1-3 (perhaps along with the formation rules for totalities) by which the difficulties we have observed could be compellingly resolved. Set theorists have already focused a great and impressive effort upon this pursuit. One might also pursue an orthogonal route, laying axioms C1-3 aside and carrying out further constructions and experiments leading into different mathematical and logical domains, employing the B axioms, or something similar, when need arises for collections of sets and second order quantifiers. If we use the system of §3 to stage the integers (beginning along the lines of etc., or by defining Church numerals or combinatory terms), the real number system, an elementary topos, or an abstract group, things might all proceed as one might hope and expect, and we could soon find ourselves having access to a language quite indiscernible from that of researchers whose work is formally grounded by other means. However, because practical attainment of these diaphanous connections might pose unexpected challenges (and these may be instructive and fruitful), we await the findings of further investigations.
5 Lower Order Logic, Modulation, Mathematics
Having completed the task set out in the introduction, we are prepared to conclude. However, we hesitate, since readers might still wish to ask: why go to all the trouble of laying set-theoretical and logical foundations when, to be sure, set-theoretical and logical foundations have already been laid? There are several ways to answer this objection based on our previous work. There is the need in science for endless exploration and inquiry; there is the simplified form of our basic definitions and the pull of Occam's razor; there is the manner in which the elements of the system are well-correlated to natural human mathematical intuition, providing unity and explanatory power; there is the convenient reduction in the complexity of metamathematics provided, in particular, by using syntactically defined classes, syntactically defined indices and variables, and the synoptic rule R3; and there is the manner in which the great nineteenth-century set-theoretical paradoxes have been, we believe, freshly illuminated, and perhaps better understood. In sum, the construction has so far proven to possess a number of compelling features. Our purpose in this section is to add to this reply a final observation that we feel is worthy of note, by briefly discussing the ways in which our prior definitions offer practical utility to a working mathematician, and thus gain ground towards what we consider an essential motive for research in foundations. We ask the reader's forgiveness if in this section we do not pause to define all of our terms perfectly; the previous work ought to be a satisfactory guide to those seeking a more formal treatment. We begin by generalizing axiom group B slightly.
Axiom. For nonzero integers and and objects , , and , the type-n federal product, or type-n federation, is defined, subject to the following axioms:
The analogous generalization of axioms B11, B12, and R4 is straightforward. Type- abstraction is denoted , or if context allows; elements of the class must be type- single, . We write , , etc.; type subscripts may often be suppressed if confusion does not arise as a result. The type- universe is denoted . We set , etc.
Names tend to lend concreteness to abstract ideas, so we shall say that if…then we call the object a…
This taxonomic hierarchy is of course taken over from the ranks of the eighteenth century Linnaean system. The terms order and class have been removed, since they are frequently used for other purposes in mathematics. [Footnote 31: Nor are they alone: the terms form, variety, type, and domain all have currency in taxonomy.] The terms set and residue are added since there is an admirable correlation between these and existing mathematical notions. Since the term family is often used for a set of collections (as for example in Rudin (1987)), there is another agreement with standard terminology at the rank. The term species might be suggested over set, since Aristotle's original binomial taxonomy can then be recognized in the generalized system, and since that term is generally used by Brouwer in his papers on intuitionistic set theory (see, e.g., Brouwer (1923)). It also suggests itself since it is useful to have the term set available at the metalevel (as we saw in §3.5). Nevertheless, we shall continue to refer to generic individuals as sets.
Definition 5.1.
Let . The modulation of is defined to be the expression
conveniently denoted .
The inverse operation, or antimodulation of may also be defined in the natural way. Thus from a given set , one obtains a residue through modulation, and a genus through antimodulation. Among the two, the modulation (which may be pronounced modulo , or simply mod ) serves particularly well as a basic tool in algebra. Let be a set, and let be products defined on , with , for . Let
(21)
(22)
so that and are distributive over the entire federal hierarchy. Let be a ring with unity, and let . The question of when the set forms a ring may be answered by examining the elementary calculations
(23)
(24)
We wish to obtain a set of conditions under which these expressions become
(25)
(26)
which correlate multiplication and addition with and without the modulated factor . (25) will follow from (23) when is additively closed and contains 0. A solution to the problem of satisfying the constraint of (26) is to allow that for all , and that be multiplicatively closed, so that (remembering ). Thus if , and as well. Moreover since , giving an additive group, and the multiplicative closure requirement can similarly be dropped. We call such a structure an ideal inside the ring.
The reader should note that the ideal concept has been defined neither dogmatically nor informally. Rather, it has been directly detected and recovered, by means of a brief, real-time examination of algebraic structure. The style of calculation can be continued to yield elegant direct proofs of the isomorphism theorems, and generally to facilitate investigation within the theory of homomorphisms and beyond.
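The conditions recovered above can be tested on a small example. The following sketch is ours and rests on assumptions not made in the text: the ring is taken to be the integers modulo 12, a residue a + I is modeled as an ordinary finite set, and sums and products of residues are formed elementwise. For products we check only containment in ab + I, since the elementwise product of two cosets need not exhaust that coset in general.

```python
def coset(a, ideal, n):
    """The residue a + I inside Z/nZ."""
    return frozenset((a + i) % n for i in ideal)

def set_sum(s, t, n):
    """Elementwise sum of two subsets of Z/nZ."""
    return frozenset((x + y) % n for x in s for y in t)

def set_product(s, t, n):
    """Elementwise product of two subsets of Z/nZ."""
    return frozenset((x * y) % n for x in s for y in t)

def is_ideal(subset, n):
    """Contains 0, is additively closed, and absorbs multiplication by the ring."""
    return (0 in subset
            and all((x + y) % n in subset for x in subset for y in subset)
            and all((r * x) % n in subset for r in range(n) for x in subset))

n, I = 12, {0, 4, 8}
assert is_ideal(I, n)
for a in range(n):
    for b in range(n):
        # Addition of residues agrees with addition of representatives.
        assert set_sum(coset(a, I, n), coset(b, I, n), n) == coset(a + b, I, n)
        # Products of residues stay inside the residue of the product.
        assert set_product(coset(a, I, n), coset(b, I, n), n) <= coset(a * b, I, n)

# A subset that is not additively closed fails already at the sum.
J = {0, 3}
assert not is_ideal(J, n)
assert set_sum(coset(1, J, n), coset(1, J, n), n) != coset(2, J, n)
```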
Consider the case . Each ideal has the form for some . The study of equations in carrying the factor was first undertaken systematically by Gauss (1986), who wrote
for the relation
or equivalently,
Thus the formal expressions we obtain using the algebraic set bear a striking resemblance to the classical notation coined by Gauss.
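The agreement with Gauss's relation can be illustrated numerically. In the sketch below, which is ours, congruence is taken in its classical sense (m divides a - b), and the infinite residue a + mZ is approximated by a finite window of integers; both the window and the function names are assumptions made only for illustration.

```python
def congruent(a, b, m):
    """Gauss's relation in its classical form: m divides a - b."""
    return (a - b) % m == 0

def residue(a, m, window=range(-60, 61)):
    """A finite window onto the residue a + mZ (the full residue is infinite)."""
    return frozenset(x for x in window if (x - a) % m == 0)

m = 7
for a in range(-10, 11):
    for b in range(-10, 11):
        # a and b are congruent exactly when they generate the same residue.
        assert congruent(a, b, m) == (residue(a, m) == residue(b, m))
```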
We present a proof of a well-known elementary theorem in the calculus of residues, a proof that will in fact be similar to the one given by Gauss (op. cit., pp. 10-13). Because the symbol recurs frequently, it will be suppressed under modulation, thus . We will work with equalities and containments between residues, i.e., federal products .
Consider the problem of finding the integer solutions of the equation modulo an integer , that is, finding the complete solution of the expression
(27)
where . In order to bring the expression into a shorter and more standard form, we write (27) as
(28)
to indicate that the superscripted factor is applied on both sides of the equation. Let be the greatest common divisor of and . Adding the residue to both sides of (28) [Footnote 32: This operation is permitted; however, ideals do not have additive inverses, so one may not subtract them out.] gives
(29)
but since and , this becomes
(30)
or
(31)
Selecting a member from the right hand side gives
(32)
Antimodulation gives , i.e., divides . Dividing by gives
(33)
where is relatively prime to . Since (which we do not pause to prove), , and therefore
(34)
that is,
(35)
By multiplying (35) through by , we see that the solution of (28) is nonempty. Since , this solution is single modulo . [Footnote 33: We do not intend to flout tradition: a solution exists, and is unique modulo . Our intent here is to point out a certain agreement between the shared notions of mathematics and notions we have developed in the preceding sections while working in foundations.] We may thus write the solution set in the form , for some single solution . It remains only to determine the form of the solutions in the modulus . This is easy: we proceed by extracting the coarser modulus:
(36)
We therefore conclude that, when divides , there are infinitely many solutions of (28) in the set of integers, solutions modulo , and a single solution modulo . The argument has a more elementary character than a standard proof, since it relies upon natural extensions of familiar algebraic operations and has a very basic quantificational structure. We let this suffice as a simple demonstration of the computational methods organized by instruments developed in the foregoing sections, since we intend to continue this discussion in work now in progress.
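The conclusion can be checked against brute force. The sketch below assumes that the equation in question is the linear congruence a·x ≡ b (mod m) in classical notation, with d the greatest common divisor of a and m; it follows the classical outline (divide through by d, solve modulo m/d, then lift) rather than the residue calculus used above, and the function name is ours.

```python
from math import gcd

def solve_linear_congruence(a, b, m):
    """Return all x in range(m) with a*x ≡ b (mod m), for a modulus m >= 1."""
    d = gcd(a, m)
    if b % d != 0:
        return []                              # no solution unless d divides b
    a1, b1, m1 = a // d, b // d, m // d
    # a1 is relatively prime to m1, so exactly one x0 in range(m1) works;
    # a plain search keeps the sketch free of further machinery.
    x0 = next(x for x in range(m1) if (a1 * x - b1) % m1 == 0)
    # The single solution modulo m/d lifts to d solutions modulo m.
    return [x0 + k * m1 for k in range(d)]

print(solve_linear_congruence(6, 9, 15))   # [4, 9, 14]
print(solve_linear_congruence(6, 4, 15))   # []

# Agreement with exhaustive search over small moduli.
for m in range(1, 20):
    for a in range(1, 20):
        for b in range(20):
            brute = [x for x in range(m) if (a * x - b) % m == 0]
            assert solve_linear_congruence(a, b, m) == brute
```

In the first example d = 3 divides 9, so there are three solutions modulo 15 and the single solution 4 modulo 5; in the second, 3 does not divide 4 and the solution set is empty.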
6 Acknowledgements
This study began while the author was a student at the University of Georgia advised by Brad Bassler, Robert Varley, David Edwards, and Edward Halper. The author would like to thank them, as well as David Finkelstein and the anonymous referees, for all their help in editing and improving earlier versions of this paper.
References
- Bell (1988) Bell, J. (1988). Toposes and Local Set Theories: An Interpretation. Oxford: Clarendon.
- Bernays (1958) Bernays, P. (1958). Axiomatic Set Theory. Amsterdam: North Holland.
- Beth (1965) Beth, E. W. (1965). The Foundations of Mathematics (2nd ed.). Amsterdam: North-Holland.
- Birkhoff (1940) Birkhoff, G. (1940). Lattice Theory (3rd ed.). Providence, RI: American Mathematical Society.
- Brouwer (1923) Brouwer, L. E. J. (1923). Über die Bedeutung des Satzes vom ausgeschlossenen Dritten in der Mathematik, insbesondere in der Funktionentheorie. Journal für die reine und angewandte Mathematik 154, 1–7. English translation in van Heijenoort (1967, pp. 334-341).
- Frege (1879) Frege, G. (1879). Begriffsschrift, eine der arithmetischen nachgebildete Formelsprache des reinen Denkens. Halle: Louis Nebert. English translation in van Heijenoort (1967, pp. 1-82).
- Gauss (1986) Gauss, K. F. (1986). Disquisitiones Arithmeticae. Yale University Press. English translation by A. Clarke, W. Waterhouse. Originally published in 1801.
- Gödel (1944) Gödel, K. (1944). Russell's Mathematical Logic. In Schilpp, P., editor, The Philosophy of Bertrand Russell, pp. 125–53. Evanston & Chicago: Northwestern University Press.
- Heyting (1930) Heyting, A. (1930). Die formalen Regeln der intuitionistischen Logik. Sitzungsberichte der Preussischen Akademie der Wissenschaften, 42–56. English translation in Mancosu (1998, pp. 311-27).
- Hilbert (1927) Hilbert, D. (1927). Die Grundlagen der Mathematik. Abhandlungen aus dem mathematischen Seminar der Hamburgischen Universität 6, 65–85. English translation in van Heijenoort (1967, pp. 464-479).
- Hilbert & Ackermann (1928) Hilbert, D., & Ackermann, W. (1928). Grundzüge der theoretischen Logik (2nd ed.). Berlin: Springer.
- Hungerford (1974) Hungerford, T. W. (1974). Algebra. New York: Springer.
- Kleene (1952) Kleene, S. C. (1952). Introduction to Metamathematics. Amsterdam: North-Holland.
- Kunen (1980) Kunen, K. (1980). Set Theory: An Introduction to Independence Proofs. Amsterdam: Elsevier B.V.
- Mancosu (1998) Mancosu, P. (1998). From Brouwer to Hilbert: The Debate on the Foundations of Mathematics in the 1920s. Oxford University Press.
- Peano (1889) Peano, G. (1889). Arithmetices principia, nova methodo exposita. Turin. English translation in van Heijenoort (1967, pp. 83-97).
- Prawitz (1965) Prawitz, D. (1965). Natural Deduction: A Proof-Theoretical Study. Stockholm: Almqvist & Wiksell.
- Pudlák (1998) Pudlák, P. (1998). The lengths of proofs. In Buss, S., editor, Handbook of Proof Theory, Chapter 8. Elsevier Science B.V.
- Quine (1937) Quine, W. V. O. (1937). Logic Based On Inclusion and Abstraction. Journal of Symbolic Logic 2(4), 145–152.
- Quine (1938) Quine, W. V. O. (1938). On the Theory of Types. Journal of Symbolic Logic 3(4), 125–139.
- Quine (1941) Quine, W. V. O. (1941). Element and Number. Journal of Symbolic Logic 6(4), 135–149.
- Quine (1951) Quine, W. V. O. (1951). Mathematical Logic (revised ed.). Cambridge: Harvard University Press.
- Rudin (1987) Rudin, W. (1987). Real and Complex Analysis (3rd ed.). WCB/McGraw-Hill.
- Simmons (2000) Simmons, H. (2000). Derivation and Computation. Cambridge University Press.
- Suppes (1960) Suppes, P. (1960). Axiomatic Set Theory. D. Van Nostrand Company.
- van Heijenoort (1967) van Heijenoort, J., editor (1967). From Frege to Gödel: A Source Book in Mathematical Logic. New York: toExcel Press.
- Zermelo (1908) Zermelo, E. (1908). Untersuchungen über die Grundlagen der Mengenlehre I. Mathematische Annalen 65, 261–281. English translation in van Heijenoort (1967, pp. 199-215).