Construction and Set Theory
Abstract.
This paper argues that mathematical objects are constructions and that constructions introduce a flexibility in the ways that mathematical objects are represented (as sets of binary sequences, for example) and presented (in a particular order, for example). The construction approach is then applied to searching for a mathematical object in a set, and a logarithmic-time search algorithm is outlined which applies to a set of all binary sequences of length ordinal with a binary label appended to each sequence to indicate whether that sequence is a member of or not. It follows that deciding membership of a set for a given binary sequence of cardinal length takes bits, which is shown to be equivalent to the Generalised Continuum Hypothesis on the assumption that information is minimized when a mathematical object is created.
1. Philosophical Introduction
This is a short paper about set theory as a foundation for mathematics.
It is not my intention to repeat what many authors have already written
on the subject of set theory, so there is no discussion of the iterative
conception of sets, forcing or limitation of size arguments, and only
a mention of large cardinal axioms as a complexity measure. (Footnote 1: See [6] for an encyclopedic overview of set theory up
to the millennium and [1], [10] for very
readable introductions to the iterative conception of set, which remains
the standard motivation for the
axioms of first-order Zermelo-Fraenkel set theory. [4]
gives an excellent background in the development of the concept of
set, while [5] gives a structuralist interpretation
of set theory that is still unsurpassed in clarity. Large cardinal
axioms (axioms asserting the existence of infinite cardinal numbers
with certain defining properties that are not theorems of first-order
Zermelo-Fraenkel set theory) have a vast literature, but [7]
is a good introduction.) Rather, the aim of this paper is to convince the reader of a certain
way of looking at mathematics, which has some implications for set
theory. That way of looking at mathematics owes something to information
theory and computer science, and a great deal to P. Lorenzen’s notion
of construction (see [8] and [9]).
The basic idea is that all of the objects and activities of mathematics
are constructed by functions, and that the existence of the functions
enables objects (including sets) to be defined. To give a simple example,
the successor function defines the set of natural numbers (subject
to the condition that there is an initial number, 0, and the successor
function does not output 0). The construction defines the
smallest such set, because an agent with unbounded but finite resources
would construct exactly the set of natural numbers. (Footnote 2: Strictly, in terms of an ontology, each mathematical “object” is
really a function (or type) over a set of concrete individuals, because
there is an issue of non-unique types, such as in the statement “1,
2 and 3 are 3 numbers”.) Moreover, constructions can also be carried out on much larger sets
than the set of natural numbers, in much the same way as intuitionists
admit for natural numbers and real numbers, namely by free choice. (Footnote 3: See for example [11].)
The axiom of choice in the form of the well-order-ability of any well-founded
set is a key principle of infinite construction, and is constructive
because an agent with sufficient (i.e. infinite) resource could
choose elements successively and at infinite limits form the sequence
of all elements chosen so far. If one accepts infinite constructions,
then the truth or falsehood of any proposition of first-order set
theory follows. For example, the truth of a quantified proposition
has a clear inductive construction in terms of a sequence of truth
values of its subformulas that follows the constant true sequence
or constant false sequence of truth values or that does not follow
those sequences. (Footnote 4: For example, $\forall x\,\phi(x)$ is true in a model $M$ if $M$
can be well ordered as $\{x_\beta : \beta < \alpha\}$ using the axiom
of choice and the truth values of $\phi(x_\beta)$
form a constant sequence of value “true” of length $\alpha$.
The constant sequence of value “false” corresponds to $\forall x\,\neg\phi(x)$,
and not following the constant sequence of value “false” corresponds
to $\exists x\,\phi(x)$.) While constructions determine how objects come to exist, that does
not mean that relationships between the objects cannot exist that
were not intended as part of the construction. Mathematics does not
need to be predicative (i.e. defining sets in stages only in
terms of sets that are already defined) provided the rule or process
of construction is clear (which in my view includes the process of
choosing members of a set). (Footnote 5: This is a deviation from the view of Lorenzen and the school that
includes H. Poincaré, H. Weyl and S. Feferman; see [3]
for example.) As truth is well defined, the logic of mathematics does not need
to be constructive or intuitionistic. However, according to this view
the objects of mathematics are no more than constructions, and we
should not imagine that they exist independently of the process of
their construction. The objects of mathematics are possibilities of
construction, in the modal-structural sense of [5],
and it is the clarity of their rules of construction that grants them
existence.
All constructions create information. It is reasonable to suppose
that Ockham’s Razor applies: when an object is created, the amount
of information created with it is the least possible to be consistent
with other objects.
One problem with this approach is the status of these agents with
infinite resources (actually bounded by some infinite ordinal). I
do not claim that such agents exist in our physical world, but I do
claim that their existence is possible if a rule of construction that
an agent uses is clear. In the same way that Euclid’s proof of the
infinitude of primes gives a bound on finding the next prime in the
sequence of prime natural numbers, and thereby shows that the number
of prime natural numbers is infinite even though there are only finitely
many atoms in the universe, rules of construction that require infinite
resources can have interesting properties that help frame our theories
of the physical world.
This may be all very well as a philosophical position (or not of course),
but what practical value does it have? Put briefly, the value of this
position is the recognition that mathematicians have freedom to represent
a set of objects as they wish subject to the constraints of the construction,
including the presentation of the set in terms of ordering. That is
to say, if a mathematical object does not come equipped with its own
intrinsic ordering, an ordering can be added without affecting the
intrinsic properties of the mathematical object. It turns out that
freedom to present and represent mathematics does have practical consequences.
2. Search for a Member of a Set
As an example of the constructive nature of mathematics, consider
the question of what it means to search for a member of a set. In
theory, if we represent the members of a set as binary sequences (or
bitstrings for short), then we could read the bitstring $x$ and
then append a label (say 1) to the bitstring if the bitstring were
a member of the set and another label (say 0) if the bitstring were
not a member of the set. (Footnote 6: This is possible by fixing an enumeration $\{y_\beta : \beta < \alpha\}$ of a set $Y$
(by the Axiom of Choice), and for any subset $X \subseteq Y$ forming
the binary $\alpha$-sequence whose $\beta$-th bit is 1 if $y_\beta \in X$ and 0 otherwise,
where the ordinal index $\beta$ of any member $y_\beta$ is taken from the
enumeration of $Y$ (which includes all members of $X$). Thus a subset
of $Y$ can be identified with a binary $\alpha$-sequence, and a
set of subsets of $Y$ can be identified with a set of binary $\alpha$-sequences.) In general we would have to rely on an oracle to decide whether a
set defined in this way were (equivalent to) the same set as a set defined
by a property of the members, but this lack of decidability is a problem
with properties rather than with sets. We can say that if a set comprises
bitstrings that each have length of least upper bound an ordinal
of cardinal number , then the amount of information in searching
for a member of the set is, adding 1 to the length of the sequence
for the binary label, . In practice, for any reasonably
large set we will be faced with a lot of bitstrings, and have no way
to search for a particular bitstring other than to enumerate
the set of bitstrings somehow. Let us suppose (using our freedom of
construction) that we can linearly order the members of the set lexicographically (written
$<$) (Footnote 7: $x < y$ if $x$ is a proper initial segment of $y$, or
$x$ has 0 and $y$ has 1 at the first position where they differ.) such that there is a least upper bound and
greatest lower bound (in terms of bitstrings of the given length)
for the set as a whole and we can assign a distance between any two
members of the set. It is reasonable to suppose that a set can be
presented already linearly ordered, not when we are faced with a list
to sort, but when we can choose how to present a set in the first
place.
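As a finite illustration of the labelling and ordering just described, the following Python sketch builds a labelled set for bitstrings of a fixed finite length. The function name `labelled_set` and the example set are assumptions made for illustration only; the construction in the text concerns sequences of ordinal length, which a finite sketch cannot capture.

```python
from itertools import product

def labelled_set(members, length):
    """Pair every bitstring of the given length with a membership label.

    The result is a list of (bitstring, label) pairs in lexicographic order,
    where label is 1 if the bitstring is in `members` and 0 otherwise.
    """
    members = set(members)
    all_bitstrings = ("".join(bits) for bits in product("01", repeat=length))
    return [(b, 1 if b in members else 0) for b in all_bitstrings]

# Hypothetical example: a three-element set of bitstrings of length 3.
X = {"001", "101", "110"}
X_plus = labelled_set(X, 3)

for bitstring, label in X_plus:
    print(bitstring, label)
```

Every bitstring of the chosen length appears exactly once, members and non-members alike, so the presentation is already linearly ordered before any search begins.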
To justify our assumptions, we can define an interval
of binary sequences of length ordinal as a set of all such
binary -sequences (binary sequences of length ) with
the properties that every path through the tree of sequences from
root to leaves is a branch of the tree, i.e. ,
where is defined as .
Intervals defined in this way are not uniquely determined by ordinal
as the tree could have gaps between the sequences, but it
is possible to make them unique by stipulating that for interval ,
. We can also stipulate
that the root represents , so that in a sense the interval represents
the maximal interval from 0 to 1 comprising binary -sequences.
Intervals of this type are written . To justify that
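any two members of such an interval can be assigned a distance, define $d(x,y)$
to be the constant 0 sequence of the same length with 1 at the position
where $x$ and $y$ first differ (read from 0. onwards). Then $d$
can be seen to be a generalised ultrametric (Footnote 8: $d$ is a generalised ultrametric because distances are not real numbers
but binary sequences.), i.e. $d(x,z) \leq \max\{d(x,y), d(y,z)\}$, with distances compared lexicographically. (Footnote 9: To see this, fix labels $x$, $y$, $z$ arbitrarily. Then if $x$ splits
from $y$ before $y$ splits from $z$, then $d(x,z) = d(x,y)$ and
$d(y,z) < d(x,y)$, so $d(x,z) \leq \max\{d(x,y), d(y,z)\}$. If $x$
splits from $y$ after $y$ splits from $z$, then $d(x,z) = d(y,z)$
and $d(x,y) < d(y,z)$, so $d(x,z) \leq \max\{d(x,y), d(y,z)\}$. Finally,
if $x$ splits from $y$ at the same position that $y$ splits from
$z$, then $d(x,y) = d(y,z)$ and $d(x,z) < d(x,y)$, so $d(x,z) \leq \max\{d(x,y), d(y,z)\}$.
These inequalities are not strict and allow for the cases of $x = y$,
$y = z$ or $x = z$.)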
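The ultrametric property can be checked exhaustively in the finite case. The sketch below is only an illustration under that assumption: the names `distance` and `is_ultrametric` are hypothetical, bitstrings are Python strings of "0" and "1", and distances are compared lexicographically (which is how Python compares equal-length strings).

```python
from itertools import product

def distance(x, y):
    """All-zero bitstring with a single 1 at the first position where x and y differ.

    Returns the all-zero bitstring when x == y.
    """
    assert len(x) == len(y)
    for i, (a, b) in enumerate(zip(x, y)):
        if a != b:
            return "0" * i + "1" + "0" * (len(x) - i - 1)
    return "0" * len(x)

def is_ultrametric(length):
    """Check d(x, z) <= max(d(x, y), d(y, z)) for all bitstrings of the given length."""
    seqs = ["".join(bits) for bits in product("01", repeat=length)]
    return all(
        distance(x, z) <= max(distance(x, y), distance(y, z))
        for x in seqs for y in seqs for z in seqs
    )

print(distance("0110", "0101"))
print(is_ultrametric(4))
```

An earlier split yields a lexicographically larger distance, which is why the case analysis only needs to compare the positions at which the sequences first differ.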
It is possible to losslessly compress any binary -sequence
to a binary -sequence where is a cardinal .
We can thus represent any set of -sequences as .
But we actually want the construction below to use sets such that
each -sequence is labelled with a 1 (if ) and 0 (if
). These labelled sets of binary -sequences
are then such that the set of binary
-sequences without the labels . We will
write the labelled set of binary -sequences corresponding
to set as .
Any set of size can be searched for the bitstring
of length in steps by representing the set
by the labelled set and then dividing
into two equal intervals (which is possible whether the midpoint is
or not), choosing the interval that contains based on
the value of the next bit of (because
for the lower and upper limits of the interval) and iterating
times (taking the intersection of intervals at any limit
ordinal stages), and checking the label of in at the
-th step.
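For bitstrings of finite length the search just described reduces to ordinary bisection driven by the bits of the target. The following Python sketch is a finite analogue only, with hypothetical names (`labelled_set`, `decide_membership`); it does not model limit stages, where the text takes intersections of intervals.

```python
from itertools import product

def labelled_set(members, length):
    """All bitstrings of the given length, in lexicographic order, with labels."""
    members = set(members)
    all_bitstrings = ("".join(bits) for bits in product("01", repeat=length))
    return [(b, 1 if b in members else 0) for b in all_bitstrings]

def decide_membership(x, labelled):
    """Halve the interval once per bit of x, then read the label.

    Returns (is_member, steps), where steps counts the halvings plus
    the final label check.
    """
    lo, hi = 0, len(labelled)       # current interval is labelled[lo:hi]
    steps = 0
    for bit in x:
        mid = (lo + hi) // 2
        if bit == "0":
            hi = mid                # keep the lower half
        else:
            lo = mid                # keep the upper half
        steps += 1
    bitstring, label = labelled[lo]
    assert bitstring == x           # the interval has shrunk to x itself
    steps += 1                      # checking the label is the last step
    return label == 1, steps

X_plus = labelled_set({"001", "101", "110"}, 3)
print(decide_membership("101", X_plus))
print(decide_membership("000", X_plus))
```

With bitstrings of length n there are 2^n entries to search, and the sketch uses n halvings followed by one label check.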
A shorthand way to express the search for the bitstring is to
note that there are bitstrings to be searched, but
that binary search runs in logarithmic time. Therefore there are
bits of information in the search for . In the simplest case
of the real numbers, we can see that the search method amounts to
binary search for a binary -sequence in a labelled set that
extends the closed interval $[0,1]$. That is to say, every binary
-sequence is represented (starting with in the case
of ) and every -sequence has an extension at position
which states whether , where is coded as
a set of binary -sequences. It is clear that , for
a set of real numbers, can be decided in steps.
But does that mean that the set of real numbers is the closed interval
$[0,1]$? No, but it does mean that the set of real numbers is represented
by that interval insofar as purely set-theoretic properties, such as cardinality,
are concerned.
This enumeration (well-ordering) of intervals can also be regarded
as an enumeration of members of the intervals. Members of the intervals
may be members of but they do not have to be. For definiteness
and balance we alternately choose -sequences in and
as successive elements of the enumeration as
far as possible (ending when an interval has all members
or , and we see that there are
steps to decide . Thus for any given binary -sequence
there is an enumeration of and that
takes steps to decide . We call this last statement
(*).
The statement (*) is not the strongest statement we can make about
searching for members of . It is also true that (+) ,
since the last member of the labelled sequence
for trivially satisfies (+). (+) is actually
equivalent to (*) by an application of the axiom of choice.
The amount of information of bits in function ,
for functional abstraction operator , is at least
because any -bit binary code for would also
be a code for some . We can express this by
means of a diagonal function
for a -bit code for a function
of -bits, and note that we get a contradiction if we put
unless the number of bits in
is greater than the number of bits in .
This implies that the number of bits in (where )
is at least .
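The diagonal argument has a familiar finite analogue, sketched below in Python. Everything here is an assumption made for illustration: codes are n-bit strings, a "function" is a table from n-bit strings to 0 or 1, and `decode` is an arbitrary (randomly generated) assignment of codes to such functions. Whatever assignment is chosen, the diagonal function is not coded.

```python
from itertools import product
import random

def diagonal(decode, codes):
    """Function that disagrees with decode[c] at input c, for every code c."""
    return {c: 1 - decode[c][c] for c in codes}

n = 3
codes = ["".join(bits) for bits in product("01", repeat=n)]

# Hypothetical coding: each n-bit code names some 0/1-valued function on n-bit strings.
rng = random.Random(0)
decode = {c: {x: rng.randint(0, 1) for x in codes} for c in codes}

g = diagonal(decode, codes)

# g differs from the function named by c at the input c, so no code names g.
print(all(decode[c] != g for c in codes))

# Counting makes the same point: 2**n codes against 2**(2**n) functions.
print(len(codes), 2 ** len(codes))
```

The check succeeds for any choice of `decode`, not just the random one used here, because the disagreement at input c is built into the definition of `diagonal`.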
We can say (++) that for infinite cardinal ,
the concept of being a member of set , contains bits
of information, and any can be decided in
bits. The reason this is true is effectively the diagonal argument
again, because otherwise the -bit binary code for (
would also be a code for some . (++) is consistent with application
and abstraction operations in the lambda calculus, since application
and abstraction apply in this case to generic -sequences.
In fact if we were to choose to represent a generic
by an -sequence, where , we see
that is a natural information measure for (
as it is the least upper bound of .
Principle (++) is equivalent to the Generalised Continuum Hypothesis
(GCH) for , as it is a choice principle that
limits the number of bits in deciding whether any
by a function to bits in any decision process.
Theorem 1.
GCH is equivalent to (++) for .
Proof.
Assume GCH and fix a binary -sequence . Then if
then by GCH will be decided in
bits. While if then will be decided in
bits. In either case then can be decided in steps,
i.e. decided in steps since is a cardinal.
But if can be decided in steps, then it can
be decided in steps.
Now assume (++) and that GCH is false, i.e. has
cardinality , and fix a binary -sequence
. Then if , we would always find in bits by
enumeration since there are
members of to be enumerated otherwise (and .
We can now check that is consistent with (++), but
leads to being decided almost always in bits (contradiction)
and leads to (contradiction). If
, then we could either enumerate all members of
or members of . But enumerating
all of members of contradicts (++) because
leads to being decided in steps (contradiction),
leads to being enumerated almost always in
steps (contradiction) and leads to
(contradiction). The remaining possibility if is that
is enumerated in bits in .
Then is consistent with (++), and
leads to being decided almost always in steps
(contradiction) and contradicts Cantor’s theorem
that (contradiction). Since is not empty
and because has cardinality , then
both and are witnessed as
and the associated enumerations vary; hence (contradiction).
Hence GCH is true.
∎
The reason why a statement like GCH that is independent of first-order
Zermelo-Fraenkel set theory turns out to be true for almost all sets (Footnote 10: The proof of Theorem 1 uses “almost always” in its arguments.
There will be a very small proportion of sets
where the equivalence does not hold.) if (++) is true is that (++) requires a very rich theory to be true.
If we were to measure the complexity of a decision problem by the
size of any set (i.e. possibly “a large cardinal”) (Footnote 11: If the axiom of choice is assumed, then the size of a set is its only
distinguishing feature, since all sets are well-orderable and are
isomorphic to ordinals; and ordinals can be losslessly compressed
to cardinals.) that is needed to solve the decision problem by deduction from the
axioms of a first-order theory of sets, (Footnote 12: See [2], p. 417, for an example from the work of H. Friedman
of statements that can be encoded in first-order arithmetic that require
a large cardinal axiom.) then (++) indicates that for infinite cardinal ,
rather than a large cardinal would be the measure of complexity. In
Zermelo-Fraenkel set theory with the axiom of choice (ZFC), a proof
is the construction of a set (which in first-order ZFC corresponds to
a formula in the language of ZFC). A set can be identified with
an enumeration of by a (one-to-one) function such that
for some ordinal , which follows by the well-ordering theorem
(requiring the axiom of choice). Thus the cardinality of the set (as
the least ordinal) needed to decide a decision problem is a natural
and useful measure of the complexity of the decision problem. We can
conclude that (++) is not compatible with decidability by a first-order
deductive theory (that uses set cardinality as a complexity measure
of decidability), but is compatible with truth in an initial segment
of the von Neumann hierarchy of pure sets, $V$. $V$ itself is a
class model of first-order set theory.
Labelled sets are a good way to see the power of the decision criterion
(++). Labelled sets clearly represent a standard binary coding of
any set, but with the advantage that it is easy to tell which binary
sequences are members of the set or not. There are uncountably more
labelled sets than there can be sets defined in any countable formal
language of set theory, because each has
labels for all its members and non-members (which is not true for
membership defined by means of formulas through the axiom schemas
of separation or replacement). In fact we can see that all sets
can be labelled for all cardinals . It is worth noting that
labelled sets do not satisfy the axioms of first-order Zermelo-Fraenkel
set theory, because functions cannot be applied to labels in the same
way as to the data that they label; but the axioms could be easily
modified by stripping out the labels (i.e. the -th
nodes), applying the function to binary sequences of length
and adding back the labels. That is, if is a labelled set
of binary -sequences then we can form the labelled set ,
where is a -sequence and
is a union operator with the property that .
It is clear though that labelled sets preserve what sets can be formed
in initial segments of $V$.
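The strip, apply and relabel step can be illustrated for bitstrings of finite length. The Python sketch below rests on an assumption about the intended reading: applying a function to a labelled set is taken to mean forming the labelled set of the image of the underlying set. The names `labelled_set`, `strip_labels`, `apply_to_labelled` and `complement` are hypothetical.

```python
from itertools import product

def labelled_set(members, length):
    """All bitstrings of the given length, in lexicographic order, with labels."""
    members = set(members)
    all_bitstrings = ("".join(bits) for bits in product("01", repeat=length))
    return [(b, 1 if b in members else 0) for b in all_bitstrings]

def strip_labels(labelled):
    """Recover the underlying set: the bitstrings whose label is 1."""
    return {b for b, label in labelled if label == 1}

def apply_to_labelled(f, labelled, length):
    """Strip the labels, apply f to each member, then add the labels back."""
    image = {f(x) for x in strip_labels(labelled)}
    return labelled_set(image, length)

def complement(bits):
    """Example function on bitstrings: flip every bit."""
    return "".join("1" if b == "0" else "0" for b in bits)

X_plus = labelled_set({"001", "101"}, 3)
fX_plus = apply_to_labelled(complement, X_plus, 3)
print(sorted(strip_labels(fX_plus)))
```

The function never sees a label: it acts only on the unlabelled bitstrings, and the labels of the result are recomputed from the image.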
3. Conclusions
I think the example of search for a member of a set shows, at least in principle, that taking mathematical objects as constructions (for example, labelled sets) which can be represented and ordered in different ways has mathematical consequences. The alternative to the labelled set approach discussed above is to suppose that there are sets which in principle we cannot define (by means of finite formulas) and of which we are not even permitted to see their shadows.
References
- [1] G. Boolos. The iterative conception of set. The Journal of Philosophy, 68(8):215–231, 1971.
- [2] M. Davis. The incompleteness theorem. Notices of the American Mathematical Society, 53:414–418, 2006.
- [3] S. Feferman. Weyl vindicated: Das Kontinuum 70 years later. In C. Cellucci and G. Sambin, editors, Temi e prospettive della logica e della scienza contemporanee, volume 1, pages 59–93. CLUEB, 1988.
- [4] M. Hallett. Cantorian Set Theory and Limitation of Size. Oxford Logic Guides. Clarendon Press, 1986.
- [5] G. Hellman. Mathematics without numbers. Towards a modal-structural interpretation. Clarendon Press, 1989.
- [6] T. Jech. Set Theory: The Third Millennium Edition, Revised and Expanded. Springer, 2002.
- [7] A. Kanamori. The Higher Infinite: Large Cardinals in Set Theory from Their Beginnings. Springer, 1994.
- [8] P. Lorenzen. Constructive Philosophy. University of Massachusetts Press, 1987.
- [9] A. Powell. Philosophy and mathematics. Teorema, XVI(2):97–108, 1997.
- [10] J. R. Shoenfield. Axioms of Set Theory. In J. Barwise, editor, Handbook of Mathematical Logic, pages 321–344. North-Holland, Amsterdam, 1977.
- [11] A. S. Troelstra. Analysing choice sequences. Journal of Philosophical Logic, 12:197–260, 1983.