Exchangeable Laws in Borel Data Structures
Abstract.
Motivated by statistical practice, category theory terminology is used to introduce Borel data structures and study exchangeability in an abstract framework. A generalization of de Finetti’s theorem is shown and natural transformations are used to present functional representation theorems (FRTs). Proofs of the latter are based on a classical result by D.N. Hoover providing a functional representation for exchangeable arrays indexed by finite tuples of integers, together with a universality result for Borel data structures. A special class of Borel data structures are array-type data structures, which are introduced using the novel concept of an indexing system. Studying natural transformations mapping into arrays gives explicit versions of FRTs, which in examples coincide with well-known Aldous-Hoover-Kallenberg-type FRTs for (jointly) exchangeable arrays. The abstract "index arithmetic" presented unifies and generalizes technical arguments commonly encountered in the literature on exchangeability theory. Finally, the category theory approach is used to outline how an abstract notion of separate exchangeability can be derived, again motivated by statistical practice.
Key words and phrases:
exchangeability, functional representation theorems, data structures, natural transformations, arrays, Borel spaces, foundations of statistics
2010 Mathematics Subject Classification:
Primary 60G09, 68P05; secondary 62A01
1. Introduction
Let be a Borel space (a measurable space is a Borel space if there exists a measurable subset and a bi-measurable bijection , see Appendix 9.1 for basic properties of such spaces), the discrete group of bijections and
(1.1) |
a measurable group action. (The law of) A -valued random variable is called exchangeable if for every , with being equality in distribution. In many examples motivated from statistics, is exchangeable iff holds for all , with the countable group of bijections with for all but finitely many .
This work studies exchangeability when is derived from a Borel data structure (BDS), which is defined to be a functor
where is the category of injections between finite sets, its opposite and the category of measurable maps between Borel spaces. The main definitions and results are presented in Section 2, which starts with an explicit definition of Borel data structure in Definition 1. No knowledge of category theory is assumed to read this paper; references for the terminology used are [Mac78] and [Mil19], the latter providing a "programmer's" view of category theory which fits the philosophy of how it is used in this work very well.
This paper is addressed to readers interested in exchangeability and data structures; the emphasis is on decomposition, functional representation and foundations of statistical applications. Many surveys on exchangeability theory covering such topics exist, see [Ald85], [Aus08], [Ald09], [Ald10] or [OR14].
Acknowledgements.
Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation), Exchangeability theory of ID-based data structures with applications in statistics, 502386356.
1.1. Overview of the results
The main achievement of this work may be the abstract framework it provides, which makes it possible to discuss many recurring phenomena and constructions from the exchangeability theory literature in a general setting. The main results are:
• Theorem 2: Hoover’s FRT for exchangeable arrays, Theorem A below, has an equivalent formulation in the provided framework using the concept of natural transformations. Theorem A is the most important ingredient to our approach,
• Theorems 6 and 12 providing an explicit characterization of all (true) natural transformations mapping from any BDS into an array-type data structure via kernel functions. One application of this is in characterizing all local modification rules on array-type data structures, Example 16, a concept which has been introduced in [AT10],
• Theorem 5: a strong FRT for exchangeable laws in array-type data structures via true natural transformations. For a given array-type data structure and using the classification of natural transformations via kernels, it is seen that the derived FRT is often equivalent to some classical version of an Aldous-Hoover-Kallenberg-type FRT for (jointly) exchangeable arrays, see Corollary 4,
• Theorem 7: there exists a Borel data structure that is universal with respect to natural embedding,
• Theorem 9: by considering combinatorial Borel data structures, a correspondence principle between exchangeable laws and limits of combinatorial structures is shown. This generalizes many well-known correspondences of this kind, the most famous being between graph limits and exchangeable random graphs, see [DJ07], [Grü15] or [Aus08] for a more general exposition. Another is between exchangeable posets and poset limits, see [Jan11], in which further examples are listed in the introduction. A very elementary instance of this correspondence can be formulated for exchangeable -sequences, see [GGH16].
• Section 7, in which a notion of separate exchangeability is presented for a wide range of Borel data structures. A special case is the classical notion of separate exchangeability in arrays. The abstract construction of separate exchangeability is motivated by its statistical philosophy and makes heavy use of the category theory approach to exchangeability via functors.
1.2. Similar use of category theory terminology in related work
The categorical approach to exchangeability via Borel data structures can be motivated from a statistical perspective, see Section 1.5, which is, in spirit, very close to the use of category theory in [McC02] where the more general question "What is a statistical model?" is discussed, see Remark 4.
There is a close connection to the notion of combinatorial species, see [Ber+98], used in analytic combinatorics: (combinatorial) Borel data structures can be interpreted as combinatorial species equipped with a restriction mechanism compatible with the relabeling mechanism; this approach was used in [Ger18]. As with combinatorial species, a great benefit of using category theory terminology with Borel data structures is that it becomes easy to build new examples of Borel data structures by composition, which provides infinitely many examples by iterative constructions, see Example 12. Also, the category theory approach is the basis for introducing an abstract concept of separate exchangeability in Section 7.
Several definitions in this work are close to the content presented in Section 3.1 of [AT10], where contravariant functors, natural transformations and also exchangeable laws were introduced in a similar abstract setting; some aspects of that work were presented already in [Aus08] in an "explicit" form. More connections are explained throughout the work, also see Remark 24 discussing the different basic assumptions.
Remark 1 (Other connections).
The approach to exchangeability via functors modeling data structures is complemented by the approach using model theory; we refer to Section 3.8 in [Aus08] and the references therein. Also, de Finetti’s theorem for exchangeable sequences has recently been approached from a purely category-theoretic perspective, see [FGP21], [JS20] or [SS22]. Explaining all these connections is beyond the scope of this paper.
1.3. Exchangeability in arrays
FRTs are often presented for different notions of exchangeability in arrays, many of which fit in the framework (1.1) as follows: given is a Borel space , a countable set of indices and a group action
(1.2) |
on indices. This gives a (left-)group action on by defining for the action as . In this situation, -valued exchangeable random variables are arrays of -valued random variables indexed by , that is , such that
(1.3) |
Many basic examples of exchangeability in arrays are instances of (1.3), some examples together with their FRTs are presented next. Let be iid -random variables indexed by finite subsets of . The following results are organized in Chapter 7 of [Kal06].
(E1) Sequences: , that is , and . The de Finetti/Hewitt-Savage theorem states that laws of exchangeable sequences are precisely the mixtures of laws of iid processes, which directly translates into an FRT: for every exchangeable sequence there exists a measurable function such that
with being responsible for the mixture over iid laws.
(E2) Arrays indexed by size-2 sets: , that is , and . An FRT (Aldous, Hoover) reads as follows: for every exchangeable there is a measurable function , symmetric in the second and third argument, such that
An elementary proof of this is presented in [Aus12]. Note that, by symmetry of , the value does not depend on an enumeration of the set . In case exchangeable arrays indexed by correspond to exchangeable random graphs on nodes , the variables indicating edges.
(E3) Arrays indexed by length-2 tuples with different entries: , that is with , and . For every exchangeable there exists a measurable function , that does not have to be symmetric, such that
In case arrays indexed by are exchangeable directed graphs on nodes (without self-loops), indicates the presence of a directed edge .
The examples (E2) and (E3) have straightforward generalizations to indices being -size subsets or -length tuples with different entries ; of course one could also consider or . In all these cases FRTs use randomization up to size , that is involve variables . FRTs for indices of unbounded size such as (all finite subsets) or (all finite length tuples with different entries), use full randomization . The FRT in the latter case is due to Hoover, see Theorem A in Section 1.6.
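The representation in (E1) can be illustrated by a minimal Python sketch. It assumes the standard de Finetti-type form, in which entry i of the sequence is a function of one uniform variable shared by all entries and one uniform variable attached to i alone. The names `sample_sequence` and `f_coin`, and the concrete choice of the function, are hypothetical and serve only as an illustration.

```python
import random

def sample_sequence(f, n, rng):
    """Return the first n entries of a sequence with X_i = f(U_empty, U_i).

    U_empty is shared by all entries (it selects the mixture component);
    the U_i are independent uniforms, one per index.
    """
    u_empty = rng.random()
    return [f(u_empty, rng.random()) for _ in range(n)]

def f_coin(a, b):
    # Hypothetical choice: given a, the entries are iid Bernoulli(a).
    return 1 if b < a else 0

rng = random.Random(0)
xs = sample_sequence(f_coin, 10, rng)
```

Conditioned on the shared variable, the entries are iid, which is the "mixture of iid laws" statement in (E1); the sketch only produces a finite piece of the sequence.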
Definitions 8 and 9 introduce indexing systems and the derived notion of array-type data structure, the latter being special types of BDS. This provides an abstract framework to capture examples of the previous types, and Theorem 5 gives a unified formulation of FRTs in such cases, which is later translated into an explicit low-level form in Corollary 4.
Remark 2 (Graphons and Digraphons).
Representations for exchangeable graphs, (E2) with , are often presented using graphons, which are symmetric measurable functions . Given a graphon one can define an exchangeable random graph as follows: given let be independent with (Bernoulli). The FRT in (E2) shows that, loosely speaking, every exchangeable random graph appears in this way if one allows the graphon to be picked at random in a first-step experiment: define with and set (ignoring measurability details).
Representations for exchangeable directed graphs, (E3) with , are often presented using digraphons; how the FRT in (E3) translates into a digraphon representation is explained in [DJ07], Proof of Theorem 9.1.
Applications of such derived representations are, for example, in the context of Bayesian statistics, see [CAF16] or [OR14].
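The graphon construction described in this remark can be sketched in Python as follows. This is a minimal illustration under stated assumptions: nodes are labeled 0, ..., n-1, the graphon is passed as an ordinary Python function on the unit square, and the names `sample_graph` and `product_graphon` are hypothetical.

```python
import itertools
import random

def sample_graph(graphon, n, rng):
    """Sample a random graph on nodes 0..n-1 from a graphon W.

    Each node i gets an independent uniform label U_i; given the labels,
    edge {i, j} is present independently with probability W(U_i, U_j).
    Returns the set of edges as frozensets {i, j}.
    """
    labels = [rng.random() for _ in range(n)]
    edges = set()
    for i, j in itertools.combinations(range(n), 2):
        if rng.random() < graphon(labels[i], labels[j]):
            edges.add(frozenset((i, j)))
    return edges

def product_graphon(x, y):
    # Hypothetical symmetric graphon W(x, y) = x * y.
    return x * y

graph = sample_graph(product_graphon, 6, random.Random(0))
```

Picking the graphon itself at random before calling `sample_graph` corresponds to the first-step experiment mentioned in the remark.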
Remark 3 (Other notions of exchangeability in arrays).
This work mainly studies exchangeability in the sense of -invariance, motivated by the statistical philosophy in Section 1.5. In the context of arrays, , the term "exchangeability" is often also used for a probabilistic symmetry induced by a group action on indices in which is not necessarily . Examples are separately exchangeable arrays, for instance and acting on as . Considering only the diagonal action every separately exchangeable array is seen to be also (jointly) exchangeable in the sense of -invariance; the converse fails in general. The statistical philosophy of "basic" notions of separate exchangeability can be exploited to derive a notion of separate exchangeability in the abstract setting of BDS discussed in this work; this is outlined in Section 7. Functional representations for classical notions of separately exchangeable arrays are presented in Chapter 7 of [Kal06].
Other types of actions on indices, giving generalizations of classical notions of exchangeability, are studied in [AP14] (hierarchical exchangeability), [Jun+21] and [Lee22] (DAG-exchangeability). Also see [Llo+13] in which "exchangeability in databases" is discussed.
1.4. Other exchangeable random objects
Exchangeability has long been studied in random structures different from, but not unrelated to, arrays. Many such examples can be discussed within the BDS framework, to mention only a few:
Relation-type examples are given by partitions (interpreted as equivalence relations, Kingman’s Paintbox, see [Kal06], Section 7.8), posets [Jan11], strict weak orders [Gne97] or total orders (folklore, see e.g. [Ger20a] Section 3.2). Examples of this type fit into the frameworks (1.1) and (1.2) as follows: given is a action on indices and the space of interest (partitions, orders,) can be encoded as a subspace such that and such that the notion of exchangeability on is inherited from the array-setting. The exchangeable random structures can thus be seen as exchangeable arrays for which FRTs are often known – but mostly lead to unsatisfactory functional representations, as the additional structure given by is ignored. However, this approach can serve as an intermediate step to a satisfactory representation, for an example see [EGW17] (exchangeable didendritic systems). The essence of these examples – structures of interest being "embedded" in more general ones – is later introduced within the BDS framework by considering natural embeddings and sub-data structures. In (hyper)graphs sub-data structures correspond to so-called hereditary (hyper)graph properties, see the introduction in [AT10].
Another source of examples for exchangeability in random structures does not (directly) fit the array-framework: structures of set system-type. Examples are total partitions (hierarchies) [FHP18] or interval hypergraphs [Ger20]. It is not (directly) obvious how these structures could be encoded as an exchangeable array in a useful way. Later the (combinatorial) BDS of set systems is introduced and these examples are seen to be sub-data structures therein.
1.5. Statistical motivation
Studying exchangeability in context (1.1) can be motivated by thinking about how data is collected by a statistician: a small group of individuals is picked from a large population and information is measured on that group; the information could very well be about interactions between individuals, that is, relational. For storing the measured information as data (on a piece of paper, on a computer, …) it is required to give unique identifiers (IDs) to the individuals of the picked group, which are used to represent the individuals within stored data. A common choice of IDs for storing information of a finite group of individuals is , at least in mathematical papers. When studying exchangeability theory it is assumed that the finite groups can be of arbitrary finite size, which reflects the idea that the underlying population is ’large’. Based on the idea of sampling consistency one passes to modeling measurements on a countably infinite group of individuals, usually identifying individuals using IDs . Having this in mind, a group action can be interpreted as follows: represents data measured on a countably infinite group of individuals represented via IDs and represents the measurement on the same group, but with IDs of individuals redistributed according to . Now suppose randomness is involved: first, a population is picked at random and second, conditioned on the population being picked, the statistician "randomly" picks a countably infinite group of individuals and gives them IDs , also "randomly". Given that group of individuals represented by IDs , the statistician measures data on that group, which gives . The precise meaning of "randomly" is not specified (for good reasons), but it seems reasonable to model the final measurement as a -valued exchangeable random variable, that is for all .
Two thoughts about this:
(T1) all a statistician will ever see in practice are measurements on finite groups of individuals; Borel data structures model the treatment of finite measurements only, and countably infinite measurements, which are of theoretical interest, are constructed using sampling consistency,
(T2) there is no reason to restrict IDs to being elements , that is, natural numbers: IDs only serve to identify individuals within stored data, and no information of interest should be encoded in IDs. Using category theory terminology provides a suitable language to handle arbitrary sets (of IDs).
The search for a mathematical framework that replaces by something fitting the statistician's philosophy and respecting (T1) and (T2) leads directly to the definition of a Borel data structure and a notion of exchangeability therein, see Section 2.
Remark 4.
The philosophy behind IDs and exchangeability is closely related to the ideas presented in [McC02], where the much more general question of what constitutes a statistical model is discussed. In that approach, the concept of an ID is replaced by that of a statistical unit, which can encode more structure than merely serving as an identifier.
1.6. The main ingredients of the proofs
The notion of exchangeability studied in Borel data structures turns out to be equivalent to -invariance. is a countable amenable group, thus ergodic theory provides important theorems: relevant for this work are ergodic decomposition (Theorem A1.4 in [Kal06]) and pointwise convergence (Theorem 1.2 in [Lin01]). Of interest for statistical applications: the convergence in the pointwise convergence theorem is known to be asymptotically normal under mild regularity assumptions, see [AO18]. An application of this is, for example, in the analysis of cross-validation protocols, see Section 4.5 of [Aus19]. Also, an application to "generalized -statistics" is given later, see Remark 16.
The most important ingredient to the proofs of FRTs in this work is a functional representation of exchangeable arrays fitting the framework (1.3) as follows: let be a Borel space and be the set of all finite-length tuples with and for all . The group acts on as . The following theorem follows the exposition of Theorem 7.21 in [Kal06] where the result is attributed to D.N. Hoover [Hoo79].
Theorem A (FRT for exchangeable arrays indexed by , Hoover, Kallenberg).
For every -valued exchangeable array there exists a measurable function
such that
where for it is .
1.7. Notations
Let be a set, its cardinality and its power set. For define subsets of
Let be the set of all finite-length tuples over . Let be the set of all length- tuples with for . Let be the set of all finite-length tuples over with different entries.
Let be two non-empty sets and the set of functions . Note that is also defined, even if is empty: there exists exactly one function , which is always injective and bijective iff . In particular, .
For any function define functions:
• sends to the image ,
• sends to ,
• , that is is obtained from by restricting its range to its image. is surjective.
For every let
be the inclusion map and
for the identity on . It is always injective and it is bijective iff , that is .
Every function has the representation
that is as a composition of a surjective map followed by an inclusion map. is injective iff is bijective. is surjective iff is bijective, which implies and .
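The factorization of a function into a surjection onto its image followed by an inclusion map can be sketched in Python for finite sets. Functions are modeled as dictionaries, which is an assumption of this sketch; the helper names `factor`, `is_injective` and `is_surjective_onto` are hypothetical.

```python
def factor(f, codomain):
    """Split f (a dict) into a surjection onto its image and an inclusion.

    Returns (f_hat, incl) with f[a] == incl[f_hat[a]] for every a in f.
    """
    image = set(f.values())
    if not image <= set(codomain):
        raise ValueError("f takes values outside the given codomain")
    f_hat = dict(f)               # same values, range restricted to the image
    incl = {b: b for b in image}  # inclusion map: image -> codomain
    return f_hat, incl

def is_injective(g):
    return len(set(g.values())) == len(g)

def is_surjective_onto(g, target):
    return set(g.values()) == set(target)

f = {1: "a", 2: "a", 3: "c"}
f_hat, incl = factor(f, "abcd")
```

In this example `f_hat` is surjective onto the image but not injective, matching the statement that a function is injective exactly when its range-restricted version is bijective.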
For a measurable space it is the set of probability measures on . The law of -valued random variable is . For random variables denotes equality in distribution and almost sure equality. For a set it is a measurable space equipped with the product -field. For it is the discrete measurable space consisting of one point being the function , similar has the single element .
2. Main definitions and results
Arbitrary finite sets are denoted by . They represent finite sets of IDs used by a statistician to identify individuals from a finite group. An injection corresponds to picking a subgroup from a group represented by IDs using IDs . In the subgroup obtained via individuals are assigned IDs , individual corresponds to in the larger group. Each injection can be written as
with
• the inclusion map of ,
• the bijection obtained by restricting the range.
Injection corresponds to restricting group to subgroup and to a redistribution of IDs on subgroup via .
The following is an explicit definition of a contravariant functor , which is the same as a (covariant) functor .
Definition 1 (Borel data structure).
A Borel data structure (BDS) is a rule that maps
• every finite set to a Borel space ,
• every injection between finite sets to a measurable map ,
such that
(i) for every finite set ,
(ii) for all composable injections between finite sets.
In case every is a non-empty finite discrete space, is called a combinatorial data structure. Combinatorial data structures coincide with functors , where is the category of maps between non-empty finite sets.
One can interpret as the space of data a statistician can collect on a group of individuals using IDs to represent individuals. For every injection the contravariance of gives
one can think of
• as restricting measurements to subgroups,
• as transforming IDs within stored data as ,
thus combines both these operations.
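As a concrete illustration of Definition 1, the following Python sketch implements one combinatorial example: simple graphs on a finite set of IDs, where an injection acts by restriction and relabeling. The choice of graphs, the dictionary encoding of injections and the names `restrict_relabel` and `compose` are assumptions of this sketch. The two functor conditions are read here as "the identity acts as the identity" and "composition is respected contravariantly".

```python
import itertools

def restrict_relabel(tau, graph):
    """Action of an injection tau: B -> A (a dict) on a graph with node set A.

    A graph is a set of 2-element frozensets. The result is the graph on B
    containing {b, c} exactly when {tau[b], tau[c]} is an edge of `graph`.
    """
    return {frozenset((b, c))
            for b, c in itertools.combinations(list(tau), 2)
            if frozenset((tau[b], tau[c])) in graph}

def compose(tau, sigma):
    """tau after sigma, for sigma: C -> B and tau: B -> A given as dicts."""
    return {c: tau[sigma[c]] for c in sigma}

# A graph on A = {1, 2, 3, 4} and two composable injections.
g = {frozenset((1, 2)), frozenset((2, 3))}
identity = {a: a for a in (1, 2, 3, 4)}
tau = {"x": 2, "y": 1, "z": 3}   # B = {x, y, z} -> A
sigma = {0: "z", 1: "x"}         # C = {0, 1} -> B

# Identity law and contravariant composition law on this example.
assert restrict_relabel(identity, g) == g
assert (restrict_relabel(compose(tau, sigma), g)
        == restrict_relabel(sigma, restrict_relabel(tau, g)))
```

The assertions check the two laws on a single example only; they are not a proof that the construction is a functor.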
Now imagine a statistician picks a finite group of individuals from a large population and gives them IDs with both at random, then measures -valued data on that group, modeled as a -valued random variable . What "at random" means here is not specified, but it seems obvious that for every injection it should hold that
Let be the law of . In terms of laws the previous is equivalent to
which leads to the following definition:
Definition 2 (Exchangeable law).
An exchangeable law on is a rule that sends every finite set to a probability measure such that for every injection it holds that . Let be the class of all exchangeable laws on .
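The consistency condition in Definition 2 can be checked exactly on a small example. The sketch below uses the graph data structure (an injection acts by restriction and relabeling) together with the Erdős-Rényi law, in which each possible edge is present independently with probability p. Both choices, the reading of the condition as "pushing the law on the larger ID set forward along the action of an injection gives the law on the smaller ID set", and all function names are assumptions made for illustration.

```python
import itertools
from fractions import Fraction

def all_graphs(nodes):
    """Yield every simple graph on `nodes` as a frozenset of 2-element frozensets."""
    pairs = [frozenset(p) for p in itertools.combinations(nodes, 2)]
    for r in range(len(pairs) + 1):
        for chosen in itertools.combinations(pairs, r):
            yield frozenset(chosen)

def er_law(nodes, p):
    """Erdos-Renyi law on `nodes` as a dict graph -> probability."""
    m = len(nodes) * (len(nodes) - 1) // 2
    return {g: p ** len(g) * (1 - p) ** (m - len(g)) for g in all_graphs(nodes)}

def restrict_relabel(tau, graph):
    """Action of an injection tau: B -> A (a dict) on a graph with node set A."""
    return frozenset(frozenset((b, c))
                     for b, c in itertools.combinations(list(tau), 2)
                     if frozenset((tau[b], tau[c])) in graph)

def push_forward(law, tau):
    """Push a law on graphs over A forward along the action of tau: B -> A."""
    out = {}
    for g, w in law.items():
        h = restrict_relabel(tau, g)
        out[h] = out.get(h, Fraction(0)) + w
    return out

p = Fraction(1, 3)
mu_A = er_law([1, 2, 3, 4], p)
tau = {"x": 3, "y": 1, "z": 4}
assert push_forward(mu_A, tau) == er_law(["x", "y", "z"], p)
```

Exact rational arithmetic via `Fraction` avoids floating-point comparisons; the check covers one injection, not all of them.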
Remark 5.
In the statistician records information that is not about any individual, hence that information is about the population itself or, more generally, about the environment the measurement takes place in.
Example 1.
Let be a Borel space and define (sequential data over ) by and . Let be a -valued exchangeable sequence. By exchangeability, for every finite set and any two injections it holds that , which allows one to define
not depending on the choice of . It is easily seen that this defines an exchangeable law and that the construction is a one-to-one correspondence between laws of exchangeable -valued sequences and ; the inverse construction involves Kolmogorov consistency arguments.
The discussion in Section 4 shows that the previous example generalizes to any Borel data structure , that is: can be naturally identified with the space of invariant probability measures for some measurable group action on a Borel space . In particular, is a set that comes equipped with a natural Borel space (and convexity) structure such that for every finite set and measurable the evaluation map is measurable.
Remark 6 (Exchangeable laws via category theory terminology).
See [Mac78] for category theory vocabulary used here, in particular Section 4. There are at least two equivalent ways to obtain using category theory constructions. Both involve the endofunctor which sends a Borel space to the Borel space and a measurable map to the push-forward . For every BDS it is defined by and a new BDS. Having this, can be identified with either
• the limit of the functor : cones over correspond to measurable parametrizations (not necessarily injective or surjective) with the parameter space being Borel. The limit corresponds to the parametrization of by itself. An example of a cone over is with (the iid-normal-distribution model).
• the set of all natural transformations , where is the trivial data structure and , compare to Equation (29) in [AT10].
Remark 7 (Combinatorial species, see [Ber+98]).
A combinatorial species is a (covariant) functor , where is the category of bijections between non-empty finite sets. Every combinatorial data structure defines a species of structures by and . In this sense, combinatorial data structures can be seen as combinatorial species enriched with restrictions compatible with the relabeling mechanism. The restriction mechanism is of crucial importance to develop exchangeability theory.
2.1. Generalization of de Finetti’s theorem
Let . If corresponds to the law of data obtained by picking individuals from a fixed large population, it seems obvious that the measurements on disjoint subgroups should be independent, that is: if are disjoint, , and then and should be independent. The following defines this property on the level of laws.
Definition 3 (Independence property).
has the independence property if for all finite sets with
Let be the subset of exchangeable laws having this property.
It is seen later that the laws having the independence property coincide with ergodic invariant laws, hence the notation . Exchangeable laws are precisely the mixtures of exchangeable laws having the independence property:
Theorem 1.
If , then is a non-empty measurable subset of and the following map is a bijection:
where is the rule defined by
Example 2 (Exchangeable sequences are mixed iid).
Let . Let and . In terms of random variables it is
a joint distribution of two disjoint sub-collections of RVs. If has the independence property it thus holds that
Applying this inductively down to singletons and using exchangeability shows every is of the form for some ; one can identify with and Theorem 1 gives: exchangeable laws in are precisely given by the rules , bijectively parameterized through .
In case it was easily possible to use the independence property to give a perfect parametrization of . From a data structure point of view the reason for this is that for every disjoint sets the map is a bijection. This is a very special property of sequential data and fails in general. As a consequence, it is in general far from obvious what exchangeable laws having the independence property look like; functional representations offer a different approach to understanding the structure of exchangeable laws.
2.2. A weak FRT for arbitrary Borel data structures
Borel data structures have been introduced as functors and a good notion for "functions between functors" is that of a natural transformation. Also an almost sure version is introduced:
Definition 4 ((Almost sure) Natural transformations).
Let be two Borel data structures and be a rule that sends every to a measurable map .
The rule is called
• natural transformation if for all
• -a.s. natural transformation, with , if for all
Of course, a natural transformation is also a -a.s. natural transformation for every .
Example 3.
Every measurable gives a natural transformation having components and this is a one-to-one correspondence between measurable maps and natural transformations. This is generalized by Theorem 12 later.
A central observation is the following:
Proposition 1.
For every and -a.s. natural transformation it is , where is the rule that sends to the push-forward of under , that is to the probability measure .
Proof.
Let be injective, and . It is . ∎
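The computation behind Proposition 1 can be spelled out as a short chain of push-forward identities. The symbols are assumptions made for this sketch: $D$ and $E$ denote the two Borel data structures, $\mu$ the exchangeable law on $D$, $\eta$ the $\mu$-a.s. natural transformation from $D$ to $E$, and $\tau : B \to A$ an injection between finite sets, so that $D[\tau] : D_A \to D_B$ and $E[\tau] : E_A \to E_B$.

```latex
\begin{align*}
(\mu_A \circ \eta_A^{-1}) \circ E[\tau]^{-1}
  &= \mu_A \circ (E[\tau] \circ \eta_A)^{-1} \\
  &= \mu_A \circ (\eta_B \circ D[\tau])^{-1}
     && \text{(a.s.\ naturality: } E[\tau] \circ \eta_A = \eta_B \circ D[\tau]
        \ \ \mu_A\text{-a.s.)} \\
  &= (\mu_A \circ D[\tau]^{-1}) \circ \eta_B^{-1} \\
  &= \mu_B \circ \eta_B^{-1}
     && \text{(exchangeability of } \mu\text{)}.
\end{align*}
```

The second step uses that two measurable maps which agree almost surely with respect to a measure have the same push-forward of that measure.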
Four (parameterized) examples of Borel data structures are introduced to state the main results. All of these are array-type data structures; the general concept is given in Definitions 8 and 9. In Section 3 many more examples of Borel data structures and ways of composing new ones from given ones are presented.
Definition 5 (First examples of BDS).
Let be a Borel space and .
• with and .
• with and .
• with and .
• with and .
Iid uniform random variables , frequently used in FRTs, are mirrored in this framework by the following:
Definition 6 (Uniform randomizer).
The following notations are used:
The exchangeable laws and are defined by
The letter "R" is used for "Randomization".
A key result for our approach is:
Theorem 2 (Theorem A via natural transformations).
The following are equivalent:
(i) Theorem A (representation of -valued exchangeable arrays indexed by ),
(ii) for every there exists a natural transformation such that .
This equivalence allows Theorem A to be translated into the language of natural transformations, which is used to prove the following:
Theorem 3 (Weak FRT).
For every Borel data structure and exchangeable law there exists a -a.s. natural transformation such that .
In Corollary 1 the result is presented using random variables. There are examples of BDS for which no exchangeable laws exist, that is , see Example 11 later. A direct consequence of Theorem 3 is
As seen in (E1)-(E3), known FRTs may not need randomization of arbitrarily high level. This can be incorporated into the theorem by defining the depth of a BDS:
Definition 7 (Depth).
A BDS is -determined, , if for every finite set and the following implication holds
Let with .
Theorem 4 (Weak FRT for finite depth).
Let be a Borel data structure with . For every exchangeable law there exists a -a.s. natural transformation such that .
Remark 8 (Weak FRT for ergodic laws).
Another refinement of the weak FRTs can be made for ergodic exchangeable laws: define the BDS by , and the exchangeable law . It can be shown that for every Borel data structure and every ergodic there exists a -a.s. transformation with . The same can be stated for finite depth by introducing and in an obviously analogous way.
Remark 9 (Global axiom of choice).
The weak FRT is about the existence of a -almost sure natural transformation. Such objects are "rules" that map any finite set to a measurable map; from an axiomatic point of view, such rules are functions between proper classes. A suitable axiomatization of mathematics to work with proper classes is, for example, given by the NBG-axioms (von Neumann-Bernays-Gödel). Often included in the NBG-axioms is the global axiom of choice, which states that there exists a rule that simultaneously picks an element from any non-empty set. This axiom will be used several times in our proofs, which makes many results NBG-theorems. However, this is not problematic if one wishes not to leave the ZFC-world: all our NBG-theorems involving a quantifier "for all finite sets" (maybe within involved definitions) give an equally interesting theorem by restricting the quantifier to "for all finite subsets of some fixed infinite set". Our NBG-theorems obtained by this restriction talk about sets only. Now NBG is a conservative extension of ZFC: every NBG-theorem talking about sets only is also a ZFC-theorem, that is, it could have been proved within ZFC alone, see [Fel71]. An alternative approach to handle these foundational aspects is to postulate the existence of sufficiently rich Grothendieck universes and call "sets" only elements of these, see Section I.6 in [Mac78]. The global axiom of choice is also used in the index arithmetic developed in Section 6, see the discussion in Example 14 there.
2.3. A strong FRT for array-type data structures
The weak FRT is weak in the sense that it only guarantees the existence of an -almost sure natural transformation to represent via . The question arises in what circumstances this can be strengthened to a "strong" form in which a true natural transformation can be used for a functional representation. The following shows that this cannot always be the case; in fact there may exist no true natural transformations at all:
Example 4.
The existence of a true natural transformation implies that for every there exists with for every bijection ; choose with having a constant value . One example of in which there exists no true natural transformation but exchangeable laws exist is given by the combinatorial data structure of total orders, see Example 8.
A class of data structures where the weak FRT can be strengthened to a strong version are array-type data structures. Also, it is possible to give an explicit "low-level" description of the "high-level" concept of natural transformations mapping into arrays, which allows one to give low-level descriptions of the strong FRT in the usual style of such representation results.
Indexing systems are defined as functors satisfying additional axioms, in an explicit form:
Definition 8 (Indexing system).
An indexing system is a rule that maps
-
•
every finite set to a finite set
-
•
every injection to an injection
such that the following hold
-
(1)
for all composable injections ,
-
(2)
for all finite sets ,
-
(3)
for all finite sets .
Indexing systems are introduced to define array-type data structures:
Definition 9 (Array-type data structure).
Let be a Borel space (data type) and an indexing system. The Borel data structure is defined by
The previous examples of array-type data structures used the indexing systems
-
•
with and ,
-
•
with and ,
-
•
with and ,
-
•
with and .
Example 5.
The indexing system axioms give that any index i from an indexing system , that is for some , has a unique minimal set of IDs used to build i: there exists a unique finite set with and . Later we write (the domain of i). Not every functor is an indexing system, an example: let and in case and in case . For an injection let in case and the unique function on domain in case . For two sets with and it is . In this case no domains can be defined.
Every array-type data structure has exchangeable laws: gives an exchangeable law . In case and it is and the latter rule equals used in the weak FRT, Theorem 3.
Definition 10 (Products of BDS).
For every countable family of Borel data structures it is defined by
a new Borel data structure.
More constructions such as coproducts, composition or sub-data structures are presented later.
Theorem 5 (Strong FRT for products of array-type data structures).
For every countable product of array-type data structures and every exchangeable law there exists a (true) natural transformation such that .
In case one can replace by .
Theorem 5 can be reformulated using category theory vocabulary: it shows the existence of a weak universal arrow for the -functor defined on array-type data structures, see Remark 26 for details. The strong FRT becomes particularly important combined with the following, which gives an explicit description of natural transformations mapping into countable products of array-type data structures:
Theorem 6 (Characterization of natural transformations mapping into arrays).
For every Borel data structure there exists an explicit one-to-one correspondence between natural transformations and certain countable families of kernel functions in which for each there is some , and sub-group such that is measurable with for all .
Some prior work is needed to explicitly state the correspondence and how the set is constructed, the case is stated in Theorem 12. In this case, the theorem characterizes natural transformations . The groups imposing symmetry restrictions on kernel functions depend on the indexing system and in this regard the indexing systems represent the two extreme cases: the former leads to full subgroups , the latter to trivial subgroups . It is seen in Theorem 11 that, up to group-isomorphism, for every finite group there exists an indexing system such that can appear as a symmetry restriction on a kernel function.
Remark 10 (Skew-products).
In [Aus15] the notions of skew-product tuples and skew-product type functions were introduced; in our terminology these concepts are about natural transformations ; loosely speaking, skew-product tuples correspond to kernel functions in the sense of Theorem 6 and the associated skew-product type function to the obtained natural transformation.
2.4. Universality of
The key importance of the indexing system , and hence Theorem A, is that indices can be identified with injections mapping into finite sets: every index gives the injection . The whole concept of Borel data structures is based on the Borel space assumption and on handling injective maps; and in fact, plays a crucial role in the theory.
Definition 11 (Embedding and isomorphism).
Let be Borel data structures and be a natural transformation. is called
-
•
embedding if all components are injective,
-
•
isomorphism if all components are bijective.
and are called isomorphic if there exists an isomorphism between them.
It is easy to check that if is an isomorphism then the rule having as components the inverse functions of is a natural transformation with and ; measurability is given by the Borel space assumption.
Definition 12 (Sub-data structures).
Let be Borel data structures. is a sub-data structure of , denoted with , if for every and injection
-
•
is a measurable subspace,
-
•
for all .
Remark 11.
Proposition 2.
If is an embedding then defined by and is a sub-data structure of isomorphic to , an isomorphism is given by with components .
Proof.
is a measurable injection between Borel spaces, thus the image is a measurable subspace and hence a Borel space. For every there is a unique with and for it is , which shows that is a sub-data structure of . The natural inverse of has components , which are measurable by Borel space assumptions, the naturality of and is straightforward. ∎
Note that the inverse of is a natural transformation with and can in general not be extended to a natural transformation defined on the whole BDS . This is different from embeddings between Borel spaces: if is a measurable injection between Borel spaces, then there exists a measurable left-inverse of , that is . In category theory terminology: in every monomorphism is a section, which is not the case in the functor category .
In Theorem 10 it is shown that every Borel data structure can be naturally embedded in , the embedding being more or less explicit, but of little practical interest. However, together with Proposition 2 this yields:
Theorem 7 (Universality).
Every Borel data structure is naturally isomorphic to a sub-data structure of .
3. Examples and Constructions
Example 6 (Array-type data structures).
Examples of array-type data structures are obtained by giving examples of indexing systems , that is specifying the finite set and for every and the value . Note that in case is a finite set is a combinatorial data structure.
Let .
-
•
with and is the indexing system in which IDs equal indices.
-
•
Set-type indexing systems are of the form and . Examples are the indexing systems having sets of indices . Note that injectivity of gives in all these cases.
-
•
Tuple-type indexing systems are of the form and . Examples are the indexing systems having sets of indices , where the superscript indicates that only tuples with distinct entries are considered.
-
•
Let be two indexing systems. New indexing systems are defined by
-
–
Products: with and ,
-
–
Coproducts: are defined analogously,
-
–
Composition: with and .
-
•
Every species of structures can be turned into an indexing system : let and for , that is , let with .
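To make the set-type and tuple-type examples concrete, the following minimal Python sketch models two indexing systems on small finite ID sets: 2-element subsets and 2-tuples with distinct entries. The encoding (injections as dicts, the helper names `tuple_indices`, `set_indices`, `push_tuple`, `push_set`, `compose`) is an assumption made for illustration only. The check covers compatibility with composition of injections and injectivity of the relabeling map on one example; it does not verify the remaining indexing system axioms.

```python
from itertools import combinations, permutations


def tuple_indices(ids, k):
    """All k-tuples over `ids` with pairwise distinct entries."""
    return set(permutations(sorted(ids), k))


def set_indices(ids, k):
    """All k-element subsets of `ids`."""
    return {frozenset(c) for c in combinations(sorted(ids), k)}


def push_tuple(tau, index):
    """Relabel a tuple index entrywise along the injection `tau` (a dict)."""
    return tuple(tau[a] for a in index)


def push_set(tau, index):
    """Relabel a set index by taking its image under `tau`."""
    return frozenset(tau[a] for a in index)


def compose(sigma, tau):
    """The composition 'sigma after tau', both given as dicts."""
    return {a: sigma[tau[a]] for a in tau}


ids = {"a", "b", "c"}
tau = {"a": 1, "b": 2, "c": 4}            # injection into {1, 2, 3, 4}
sigma = {1: "w", 2: "x", 3: "y", 4: "z"}  # injection out of {1, 2, 3, 4}

# Relabeling along a composition equals relabeling twice.
tuples_functorial = all(
    push_tuple(sigma, push_tuple(tau, i)) == push_tuple(compose(sigma, tau), i)
    for i in tuple_indices(ids, 2)
)
sets_functorial = all(
    push_set(sigma, push_set(tau, s)) == push_set(compose(sigma, tau), s)
    for s in set_indices(ids, 2)
)

# The relabeling map is injective and lands in the index set of the target.
images = {push_tuple(tau, i) for i in tuple_indices(ids, 2)}
relabeling_injective = len(images) == len(tuple_indices(ids, 2))
lands_in_target = images <= tuple_indices({1, 2, 3, 4}, 2)
```

A check of this kind on a single pair of injections is only a sanity test, not a proof of functoriality.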
Definition 13 (Set systems).
The combinatorial data structure is defined by , that is elements are subsets , and for injective map and it is .
There is a canonical bijection between the set of set systems and the set of functions by mapping to the indicator function . This is not a natural isomorphism between and :
Proposition 3.
and are not naturally isomorphic.
Proof.
Let , and with .
Let and consider the set
This set has cardinality , not depending on the concrete choice of .
Now let , that is , and consider the set
If and were naturally isomorphic, this set would have the same cardinality, independent of the concrete choice of . But this does not hold: let . It is if and only if for all it holds that . In particular, for this specific there are precisely such . Clearly, for . ∎
Example 7 (Three implementations of graphs).
An undirected loop-free graph can be defined as either (1) a pair of vertices and edges, (2) an edge indicator function or (3) as an adjacency matrix. These "implementations" of graphs can be formalized using the BDS framework and are seen to be naturally isomorphic:
-
(1)
Pairs of vertices and edges: is defined by and ,
-
(2)
Edge indicator functions: is defined by ,
-
(3)
Adjacency matrices: is defined as a sub-data structure with
Natural isomorphisms between these implementations are
-
•
with .
-
•
with .
-
•
with .
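The three implementations and the conversions between them can be sketched in Python as follows. This is an illustration under stated assumptions, not the paper's formal construction: graphs are encoded with Python sets and dicts, the function names are hypothetical, and an injection from new IDs into old IDs is assumed to act by pulling the graph back (an edge between two new IDs is present iff the images are adjacent). The last lines check on one example that converting and pulling back commute, which is the content of naturality.

```python
from itertools import combinations


def edges_to_indicator(vertices, edges):
    """(V, E) -> indicator function on 2-element subsets of V."""
    return {
        frozenset(p): int(frozenset(p) in edges)
        for p in combinations(sorted(vertices), 2)
    }


def indicator_to_matrix(vertices, indicator):
    """Indicator on 2-subsets -> symmetric 0/1 matrix with zero diagonal."""
    return {
        (a, b): 0 if a == b else indicator[frozenset((a, b))]
        for a in vertices
        for b in vertices
    }


def matrix_to_edges(vertices, matrix):
    """Adjacency matrix -> pair (V, E)."""
    edges = {frozenset(ab) for ab, value in matrix.items() if value == 1}
    return set(vertices), edges


def pull_edges(tau, edges, new_ids):
    """Assumed action of an injection `tau` on the edge-set implementation."""
    return {
        frozenset((a, b))
        for a, b in combinations(sorted(new_ids), 2)
        if frozenset((tau[a], tau[b])) in edges
    }


def pull_matrix(tau, matrix, new_ids):
    """Assumed action of an injection `tau` on the matrix implementation."""
    return {(a, b): matrix[(tau[a], tau[b])] for a in new_ids for b in new_ids}


V = {1, 2, 3, 4}
E = {frozenset({1, 2}), frozenset({2, 3}), frozenset({3, 4})}  # path 1-2-3-4

M = indicator_to_matrix(V, edges_to_indicator(V, E))
round_trip = matrix_to_edges(V, M)

# Naturality on one example: pull back then convert == convert then pull back.
B = {"x", "y", "z"}
tau = {"x": 2, "y": 3, "z": 1}
via_edges = (B, pull_edges(tau, E, B))
via_matrix = matrix_to_edges(B, pull_matrix(tau, M, B))
```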
Definition 14 (Products, coproducts, composition).
Let be Borel data structures and let and be endofunctors.
-
•
is defined by and ,
-
•
is defined analogously,
-
•
is defined by and . In case is an indexing system it holds that .
Example 8 (Binary relations and hereditary properties therein).
A binary relation on a set can be seen as a subset . If is an injection then defines a new binary relation on and this gives the combinatorial data structure of binary relations, which is naturally isomorphic to by mapping to the indicator .
Many standard properties of binary relations are hereditary, that is stable under , such as: symmetry, transitivity, reflexivity, connectedness, anti-symmetry, and thus yield sub-data structures of .
One example important for illustrative purposes: a binary relation on , implemented as an array , is a strict total order iff for all
Being a strict total order is hereditary and gives the data structure with the subset of strict total orders on .
Example 9 (Exchangeable total order).
The exchangeability theory of is folklore, see for example [Ger20a] or [Ger20]: there exists exactly one exchangeable law on given by , which is ergodic by uniqueness. It is , but a weak representation in the style of Theorem 4 only needs level randomization. For a finite set and iid define a random strict total order on as . Note that this gives a strict total order with probability one and is equivalent to a weak representation with being and the a.s. natural transformation defined as (and arbitrary on a set with -probability zero).
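The construction in this example is easy to simulate. The following Python sketch draws iid uniform variables, one per ID, and declares one ID smaller than another iff its uniform variable is smaller. The encoding of the order as a set of pairs and the function names are assumptions made for illustration; ties have probability zero in theory and are not handled specially.

```python
import random
from collections import Counter


def random_total_order(ids, rng):
    """Random relation on `ids`: a < b iff U_a < U_b for iid uniforms U."""
    ids = sorted(ids)
    u = {a: rng.random() for a in ids}
    return {(a, b) for a in ids for b in ids if u[a] < u[b]}


def is_strict_total_order(ids, rel):
    """Check irreflexivity, totality/asymmetry and transitivity."""
    irreflexive = all((a, a) not in rel for a in ids)
    total = all(
        ((a, b) in rel) != ((b, a) in rel) for a in ids for b in ids if a != b
    )
    transitive = all(
        (a, d) in rel for (a, b) in rel for (c, d) in rel if b == c
    )
    return irreflexive and total and transitive


rng = random.Random(0)
ids = ["a", "b", "c"]
sample = random_total_order(ids, rng)
sample_is_order = is_strict_total_order(ids, sample)

# Empirical frequencies of the sampled orders over many draws.
counts = Counter(
    frozenset(random_total_order(ids, rng)) for _ in range(600)
)
```

Inspecting `counts` gives a rough empirical impression of the uniform distribution over the six strict total orders on three IDs; the simulation is of course no substitute for the uniqueness statement.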
Example 10 (Sub-data structures of ).
Sub-data structures of correspond to hereditary set system properties, that is properties such that if a set system fulfills and is an injection, then also satisfies , for every such property gives a sub-data structure . Examples of such properties are
-
•
being a partition: and (including the empty set in this case is a question of implementation and does not affect the essence of what makes a partition).
-
•
being a total partition, also called hierarchy: for all , .
-
•
being an interval hypergraph: for all and there exists a strict total order such that every is an interval with respect to , that is: and with then .
Exchangeability in partitions has a representation by Kingman’s paintbox construction; representations for exchangeable total partitions are given in [FHP18] and [Ger20], and for interval hypergraphs in [Ger20]. The functional representation in [Ger20] can be translated into the style of FRTs: for every exchangeable law over interval hypergraphs there exists a random compact subset of the triangle such that for every finite set , where are independent. Letting and defining defines a -almost sure natural transformation mapping into interval hypergraphs such that .
Example 11 (Examples with ).
Exchangeable laws always exist in array-type data structures (product measures) and in combinatorial data structures (by a compactness argument). Two examples of a BDS without exchangeable laws are:
-
•
for every (the open unit interval) and for injection . Suppose exists, write . Applying exchangeability via gives , which converges to in probability as , thus , which is a contradiction to taking values in .
-
•
let be countable infinite and the sub-data structure with the set of all injective functions . If there were it would also be such that for all . By de Finetti has to be a mixture over iid-laws, , which implies for that because is countable.
Example 12.
Let be species of structures defining graphs. The previous discussion allows to consider the Borel data structure
4. Extension, pointwise convergence and decomposition
For this Section let be a fixed BDS.
Let be countable infinite, e.g. . Imagine a statistician picks a countable infinite group of individuals from a large population, uses IDs to represent the individuals and then measures information on each finite subgroup . The obtained measurements should satisfy sampling consistency
If individuals are picked and IDs distributed at random, the obtained measurement should be an exchangeable random object in the following sense:
Definition 15 (Exchangeable -measurement).
Let be countable infinite. An exchangeable -measurement using IDs is a collection of random variables
such that for every
-
•
takes values in ,
-
•
for every (sampling consistency),
-
•
for every bijection (exchangeability).
If only the first two hold is called random -measurement using IDs . Let
that is
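Sampling consistency can be illustrated on a toy example. The sketch below is an assumption-laden simplification: the ID set is finite rather than countably infinite, measurements are graphs encoded as edge sets, and the measurement on a subgroup is assumed to be obtained from the measurement on a larger group by restriction. The names `restrict` and `is_sampling_consistent` are hypothetical.

```python
from itertools import combinations


def restrict(edges, sub_ids):
    """Keep only the edges whose endpoints both lie in `sub_ids`."""
    keep = frozenset(sub_ids)
    return {e for e in edges if e <= keep}


def is_sampling_consistent(family):
    """`family` maps each finite ID set to the measurement taken on it."""
    return all(
        restrict(family[big], small) == family[small]
        for small in family
        for big in family
        if small <= big
    )


ids = [1, 2, 3, 4]
graph = {frozenset({1, 2}), frozenset({2, 3}), frozenset({3, 4})}

# Measurements on every subgroup, all derived from one underlying graph.
family = {
    frozenset(sub): restrict(graph, sub)
    for r in range(len(ids) + 1)
    for sub in combinations(ids, r)
}

# Overwriting one measurement breaks consistency with the larger groups.
tampered = dict(family)
tampered[frozenset({1, 2})] = set()

consistent = is_sampling_consistent(family)
tampered_consistent = is_sampling_consistent(tampered)
```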
Proposition 4.
Let be countable infinite and . Let be a finite set. Choose with and a bijection . Then does not depend on the concrete choice of and , which allows one to define . The rule is an element of and the map is a one-to-one correspondence between and . In particular, is a set.
The proof is based on Kolmogorov consistency arguments and placed in the Appendix.
Definition 16 (Canonical extension to countable infinite sets of IDs).
For countable infinite let
For any countable set (finite or infinite), injection and let
It is easily seen that if is an injection between two countable infinite sets then implies . In particular: for some countable infinite implies for every countable infinite . The proof of the following is placed in the Appendix.
Proposition 5.
Assume . Then
-
(1)
For every countable infinite it is a non-empty measurable subset of and hence a Borel space. In particular, random infinite -measurements using IDs can be considered -valued random variables. For two -valued random variables it holds iff for all ,
-
(2)
The construction in Definition 16 extends to a functor , where is the category of injections between countable sets,
-
(3)
Let be a random -measurement using IDs , that is a -valued random variable. The following are equivalent:
-
(i)
for every and bijection , that is: is an exchangeable -measurement in the sense of Definition 15,
-
(ii)
for every bijection with for all but finitely many ,
-
(iii)
for every bijection ,
-
(iv)
for every injection .
-
(4)
For every injection between countable infinite sets the map is a bijection .
Remark 12.
Let be an exchangeable -measurement using IDs whose law is represented by . Let . Applying (4) to allows to represent as with being an exchangeable -measurement using IDs , whose law is necessarily also represented by . In case such constructions are a basic approach to prove functional representation theorems for arrays, see [Ald82], [Ald85] and [Aus12].
Combining the previous propositions with Theorems 3 and 4 gives the following reformulation of the FRTs:
Corollary 1 (Weak FRT for exchangeable random measurements).
For every exchangeable -measurement there exists a -almost sure natural transformation such that
where are iid . If there is a -a.s. natural transformation such that
Proof.
By Proposition 4 there is a unique with for every . By Theorem 3 there is a -a.s. natural transformation with . This gives for every . Since is -a.s. natural transformation takes values in almost surely and the same is true for . The equality in distribution at each implies equality in distribution of the whole -indexed processes by (1) of Proposition 5. The finite-depth case follows the same way by applying Theorem 4. ∎
4.1. Natural extensions of array-type data structures
Let . Since the functor can be extended to a functor by the construction of Definition 16. A more natural extension is possible for array-type data structures. The special case is very instructive: let be countable infinite, an element in the canonical extension is of the form
Let . The defining property of (sampling consistency) gives
which obviously can be represented more naturally as . This works for every : a natural extension is based on extending the indexing system, which is a functor with additional properties, to a functor and defining the natural extension of as and . The extension of is as follows:
Let be countable infinite. Define and for an injection , with finite or infinite, define
Lemma 2 later provides the main technical details to see that this extends to countable infinite sets, satisfying functor properties and also satisfying the indexing system axioms for countable infinite sets. Some examples: let be arbitrary countable and be an injection:
-
•
has natural extension and ,
-
•
has natural extension and ,
-
•
has natural extension and ,
-
•
has natural extension and .
In particular, exchangeable random measurements using IDs now fit the framework (1.3) presented in the introduction: the group action on indices is and following Proposition 4 shows that laws of exchangeable processes can be identified with (by passing from canonical to natural extension). An exchangeable array in natural extension corresponds to in canonical extension.
Remark 13.
With it is not obvious if there is an extension that is any more "natural" than the canonical one from Definition 16. Note that both in [FHP18] and [Ger20] exchangeable random objects of set system-type (hierarchies/interval hypergraphs) have been introduced as random sequences of finite growing exchangeable structures satisfying sampling consistency, that is in canonical extension.
4.2. Pointwise convergence, -statistics and the independence property
Let be a BDS with and be the canonical extension of . Propositions 4 and 5 show that studying falls into the framework (1.1) presented in the introduction: a measurable group action is derived by defining
and can be identified with , that is with laws of -valued random variables with for all . Further, is exchangeable already iff for all .
Ergodic theory results become directly applicable: let be the -field of measurable subsets with for all . An exchangeable -valued is called ergodic iff for all . Let be the set of ergodic exchangeable laws, which is non-empty and measurable. Ergodic decomposition, Theorem A1.4 in [Kal97], gives that the following two maps are bijections inverse to each other:
The abstract de Finetti theorem, Theorem 1, follows from this by identifying with exchangeable laws having the independence property, that is with . This is shown in Theorem 8 below.
Remark 14 (Convex decomposition).
Let be countable infinite and consider collections with . A strict partial order on is given by comparing sets by cardinality, that is iff . This strict partial order is directed to the right and countable at infinity. For let be a uniform random injection and define the probability kernel . By combinatorial arguments, a collection comes from some iff for all . Modulo topological assumptions: Proposition 1.1 in Chapter IV of [Lau88] gives a simplex decomposition for such collections .
The proof for characterizing ergodicity via independence heavily relies on the following, for write iff for all :
Theorem B (Pointwise convergence, Theorem 1.2 in [Lin01] applied to ).
For every and measurable with it holds that
Theorem B is applied to functions obtained from kernel functions via . For let
For a uniform random injection it is
and basic combinatorial arguments together with functoriality of give for every random -measurement and
Theorem B directly yields the following
Corollary 2.
For an exchangeable -measurement and measurable with it is
Remark 15.
An alternative approach to Corollary 2 is by backwards martingale convergence; however, the proof using the pointwise convergence theorem is much more direct.
Basic measure-theoretic considerations give that is ergodic iff for every and bounded measurable kernel it is a.s. constant, which is equivalent to the variance of being zero. For every exchangeable , not necessarily ergodic, and every square integrable kernel , that is , simple calculations using exchangeability, sampling consistency and functoriality of give for every
(4.1) |
This is used to prove:
Theorem 8.
Let be an exchangeable -measurement. Equivalent are:
-
(i)
is ergodic,
-
(ii)
has the independence property: are stochastically independent for all with ,
-
(iii)
for every countable set of bounded measurable functions there exists a deterministic sequence with such that and for every
Proof.
(i)⇒(iii)⇒(ii)⇒(i) is shown.
(i)⇒(iii). By Corollary 2 a.s. for every defined on . Since is countable the convergence almost surely holds simultaneously over , take for some from the corresponding probability-one event; in this case.
(iii)⇒(ii). Let with and be bounded measurable, so is to be shown. Let be such that and be
Applying (iii) to the three-element set gives a deterministic sequence with , , such that for a uniform random injection it holds that
Let be another random uniform injection independent from and let . Elementary combinatorial calculations show that as and that for every fixed with the joint distribution of is the same as that of conditioned on . This gives
(ii)⇒(i). Let be bounded measurable; it is shown that the variance of is zero. By pointwise and dominated convergence
By (4.1) the variance of depends on covariances with and bijective. By assumption (ii) such a covariance is zero if . For there are of such , bounding gives
for fixed the upper bound goes to zero as . ∎
Remark 16 (Asymptotic of -statistics).
Let be a symmetric measurable kernel, that is for every bijection . In this case can be extended to have domain for every with . If is exchangeable with the variance formula (4.1) can be further reduced: for it is
In case this follows from classical -statistics theory, see [KB94], and in this case has a representation as the variance of a conditional expectation, which directly gives . This also holds for general : by Corollary 1 there is a -a.s. natural transformation representing the law of ; for every let with iid . In this special construction of , the same ideas as for the sequential case give
In case is ergodic it is , which follows directly from the independence property and is also reflected in the previous formula noting that for ergodic laws no randomness from is needed in functional representations, see Remark 8.
Theorem 17 from [AO18] can be applied to consider the asymptotic distribution of : in case is ergodic it is
where the asymptotic variance can be found as
with .
4.3. Limits of combinatorial structures
Let be a combinatorial data structure. For simplicity assume the finite set can be recovered from , so one can define . If this is not the case replace by the isomorphic BDS defined as and .
For with let
that is for a uniform random injection . The value is interpreted as the (combinatorial) density of the smaller structure within the larger structure .
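For graphs, the density can be computed by brute force on small examples. The sketch below enumerates all injections from the IDs of the smaller structure into the IDs of the larger one and counts how often the pulled-back graph equals the smaller one. The edge-set encoding, the pullback action and the function names are assumptions made for illustration; exact fractions are used so that no rounding occurs.

```python
from fractions import Fraction
from itertools import combinations, permutations


def pull_back(tau, edges, small_ids):
    """Graph on `small_ids` induced from `edges` via the injection `tau`."""
    return {
        frozenset((a, b))
        for a, b in combinations(small_ids, 2)
        if frozenset((tau[a], tau[b])) in edges
    }


def density(small_ids, small_edges, big_ids, big_edges):
    """Probability that a uniform random injection pulls the big graph
    back to exactly the small graph (requires |small_ids| <= |big_ids|)."""
    small = sorted(small_ids)
    hits = total = 0
    for image in permutations(sorted(big_ids), len(small)):
        tau = dict(zip(small, image))
        total += 1
        hits += pull_back(tau, big_edges, small) == small_edges
    return Fraction(hits, total)


path_ids = {1, 2, 3, 4}
path = {frozenset({1, 2}), frozenset({2, 3}), frozenset({3, 4})}

edge_density = density({"x", "y"}, {frozenset({"x", "y"})}, path_ids, path)
non_edge_density = density({"x", "y"}, set(), path_ids, path)
```

On the path with four vertices, 6 of the 12 ordered pairs of distinct vertices are adjacent, so both densities computed here equal 1/2.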
Definition 17 (Limits of combinatorial structures).
A sequence with is said to be convergent iff and for every the limit
exists. In this case, the limit of x is the rule that maps to .
The following is an application of Theorem 8 for the combinatorial case, technical details are in the Appendix.
Theorem 9.
Limits of convergent sequences coincide with : for every convergent sequence there is exactly one rule such that
(4.2) |
and conversely, every is of this form for some convergent sequence x.
Remark 17 ( is a Bauer simplex for combinatorial data structures).
Let be the category of continuous maps between compact metrizable topological spaces. Every finite discrete space is compact metrizable and every map between finite discrete spaces is continuous, thus combinatorial data structures can be seen as functors . In this case extensions to always exist and can be seen as functors ; the derived group action is a topological group action on a compact metrizable space, and such actions are studied in ergodic theory, see [Gla03]. For any compact metrizable space and topological group action the space of invariant laws, denoted with , has the structure of a Choquet simplex. Our discussion shows that has a closed set of extremal points – either check that the independence property is closed or argue that extremal points coincide with a Martin boundary – in particular, is a Bauer simplex. This is not obvious from general theory: with and it is known that is the Poulsen simplex by the fact that is amenable and countable infinite and thus does not have Kazhdan’s property (T), see Theorem 13.15 in [Gla03].
5. Weak FRT
To recall, the indexing system is defined by , that is the set of all tuples with , and for an injection it is defined by
For a finite set and index let , which is injective.
The following result characterizes natural transformations , where is an arbitrary Borel data structure and an arbitrary Borel space. It arises as a special case of Theorem 12 later, but because is of great importance to the general theory and the proof is especially tractable in this case, it is presented here separately. Note that the components of a natural transformation are measurable maps and thus have inner component functions such that .
Proposition 6.
There is a one-to-one correspondence between
-
•
Natural Transformations
-
•
Sequences of measurable maps with
given by
-
•
with
-
•
with .
Proof.
Let . For every rule that maps a finite set to a measurable map consider the "inner" component functions which are measurable and satisfy . It is easily checked that is a natural transformation iff the inner components satisfy for every injection and index
Suppose is a natural transformation and let . For every it holds and hence
that is: is determined by , hence the construction is injective.
On the other hand, let be an arbitrary sequence of measurable functions and for define the inner component by . Let be an injection. For it holds
and hence
so the construction defines a natural transformation. It is obvious that the constructions and are inverse to each other. ∎
A first application of Proposition 6 is in proving that Theorem A has an equivalent formulation using natural transformations.
Proof of Theorem 2.
Let . The following are shown to be equivalent:
-
(i)
Theorem A
-
(ii)
For every there exists a natural transformation with .
(i)⇒(ii). Let . By Kolmogorov consistency there exists a -valued stochastic process such that for every finite set it is . Let be iid and for every let . By Theorem A there is a measurable function such that , where for it is bijective and it holds
For every let be the restriction of to . The functions give a natural transformation by the construction in Proposition 6. For every finite subset it then holds that
that is .
(ii)⇒(i). Let be exchangeable -valued. For any injection define . The law of does not depend on a concrete choice of and hence allows one to define
independent on the choice of . It is easy to check that this defines an exchangeable law . By (ii) there is a natural transformation such that for every finite . By Proposition 6 there is a sequence of measurable functions representing which glued together yield . Let be iid and . It is and hence
Note that for every by naturality and that . By Kolmogorov consistency the distributional equations holding for every finite set can thus be lifted to the whole process:
giving (i). ∎
Theorems A + 2 give:
Corollary 3.
Let . For every there exists a natural transformation with .
A second application of Proposition 6 is:
Theorem 10.
Every Borel data structure can be naturally embedded in .
Proof.
A natural transformation is constructed such that every component is injective.
For every it is a Borel space, hence there exists a measurable injection
By Proposition 6 the rule having components with is a natural transformation. For every a tuple having maximal length is an enumeration of all elements of and hence is a bijection, so is a bijection and hence is an injection (as a composition of injection and bijection ). Now is injective as already some of its component functions are. ∎
Remark 18.
In case one can construct an embedding in an analogous way, thus is naturally isomorphic to a sub-data structure of .
The embedding constructed in the proof is highly redundant and impractical for applications: every entry in the array that is indexed by a full-length tuple , of which there are many, contains all information about and thus also the information about all other entries. But this information can in general not be recovered using a true natural transformation defined on the whole of ; for some BDS there does not even exist a single true natural transformation .
Lemma 1.
Let be Borel data structures, , a -a.s. natural transformation and a rule that maps every finite set to a measurable function . Then the following are equivalent:
-
(i)
is a -a.s. natural transformation,
-
(ii)
is a -a.s. natural transformation.
Proof.
Let be an injection, and , that is .
(i)⇒(ii). It is because is -a.s. natural transformation by assumption (i). It is because is -a.s. natural transformation by assumption. Hence , so (ii).
(ii)⇒(i). It is since is -a.s. natural transformation by (ii). It is since is a -a.s. natural transformation, hence (i).
∎
Proposition 7.
Let be an embedding. Then there exists a rule that sends every finite set to measurable map such that
-
•
, that is for every ,
-
•
is a -a.s. natural transformation for every .
Proof.
Every component is a measurable injection between Borel spaces, hence has a measurable left-inverse. Applying the global axiom of choice gives a rule that picks measurable left-inverses, so . Since both and are natural transformations, they are also -a.s. natural transformations for every . By Lemma 1 is a -a.s. natural transformation. ∎
Given the previous results it is now easy to prove the weak FRT (without depth):
Proof of Theorem 3.
It is shown that for every BDS and there exists a -a.s. natural transformation with .
Let and be an embedding, which exists due to Theorem 10. It is and by Corollary 3 there is a natural transformation such that .
By Proposition 7 there is a -a.s. natural transformation such that . Let . It is a natural transformation and a -a.s. natural transformation. Applying Lemma 1 gives that is a -a.s. natural transformation. Because it holds that
so gives the desired functional representation of . ∎
Versions of (weak) FRTs for finite depth can be obtained from the unbounded-depth case by the following result; its proof is very technical and is placed in the appendix.
Proposition 8.
Let be a Borel data structure with and let
be the rule that has components
Then the following holds:
-
(i)
is a natural transformation with .
-
(ii)
for every natural transformation there exists a natural transformation with .
-
(iii)
for every -a.s. natural transformation there exists a -a.s. natural transformation with -almost surely, that is for every it holds that for -almost all .
The weak FRT for bounded depth follows easily:
6. Array-type data structures
Let be an indexing system, see Definition 8.
Definition 18.
Let be a finite set and . Define
-
•
(domain of i = IDs used to build i),
-
•
the size of i,
-
•
,
-
•
for any other index write iff there exists an injection such that .
Functoriality of shows that is a finite group. The indexing system axioms give the following; a proof is given in the Appendix.
Lemma 2.
Let and be injective.
-
(1)
does not depend on ,
-
(2)
,
-
(3)
,
-
(4)
for two injections it holds if and only if there exists with for all ,
-
(5)
is an equivalence relation on indices.
Example 13.
Some examples for Definition 18:
-
•
: for it is , , . All indices are equivalent.
-
•
: for it is , , . All indices are equivalent.
-
•
: for let be such that . For this index everything is as in the previous example. Two indices in are equivalent iff they have the same size.
-
•
: for it is , and . All indices are equivalent.
-
•
: for let be such that . For this index everything is as in the previous example. Two indices in are equivalent iff they have the same size.
-
•
: for it is the set of different entries and the number of different entries. has only one element, the identity on . Every index defines a set-partition of by declaring that fall in the same block iff . The indices are equivalent iff they induce the same partition.
-
•
defined by . For it is , , and has one element (identity). Two indices are equivalent iff .
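For tuple indices whose entries may repeat, the quantities from Definition 18 can be computed directly. The following Python sketch is an illustration under assumptions: indices are Python tuples, positions are 0-based, and two indices are treated as equivalent when some injective relabeling of the IDs maps one entrywise onto the other. It compares that relabeling criterion with the induced-partition criterion stated in the example; all function names are hypothetical.

```python
def domain(index):
    """IDs used to build the index: the set of its distinct entries."""
    return set(index)


def size(index):
    """Number of distinct entries."""
    return len(set(index))


def induced_partition(index):
    """Partition of the positions: two positions share a block iff
    the entries at these positions are equal."""
    blocks = {}
    for position, entry in enumerate(index):
        blocks.setdefault(entry, set()).add(position)
    return {frozenset(block) for block in blocks.values()}


def same_partition(i, j):
    return len(i) == len(j) and induced_partition(i) == induced_partition(j)


def related_by_injection(i, j):
    """Is there an injective relabeling tau of IDs with tau(i_m) = j_m?"""
    if len(i) != len(j):
        return False
    tau = {}
    for a, b in zip(i, j):
        if tau.setdefault(a, b) != b:      # tau would not be a function
            return False
    return len(set(tau.values())) == len(tau)  # tau must be injective


pairs = [
    (("a", "b", "a"), (7, 3, 7)),
    (("a", "a", "b"), (7, 3, 7)),
    (("a", "b", "c"), (1, 1, 2)),
    (("a", "b", "c"), (3, 1, 2)),
]
criteria_agree = all(
    same_partition(i, j) == related_by_injection(i, j) for i, j in pairs
)
```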
More complex examples can emerge from composing indexing systems:
Theorem 11.
Let . For every finite group there is an index i in such that and are isomorphic as groups.
Proof.
Wlog assume is a subgroup . An index is a set with for each it is and for injection it is
The index has and . ∎
The following is very useful for characterizing natural transformations .
Definition 19 (Skeleton of an indexing system).
Let be an indexing system. A skeleton for is a triple in which
-
•
is a set of normalized representative indices, that is for every index i there is exactly one with and for every it is with ,
-
•
is the rule that maps every index i to the unique with ,
-
•
is a rule that maps every index i with to a bijection satisfying .
Example 14.
Consider two minimal examples, related to (E2) and (E3) from the introduction:
-
•
has a skeleton given by and for it is and .
-
•
has and for (with ) it is . Now a problem arises: should be a rule that maps any two-element set to a bijection with ; but both bijections have this property. Justifying the existence of such a rule requires
-
–
Global Axiom of Choice if IDs are arbitrary,
-
–
(Usual) Axiom of Choice if IDs are elements only of some fixed but arbitrary uncountable set,
-
–
Countable Axiom of Choice if IDs are elements only of some fixed but arbitrary countable set.
A choice axiom is not needed when IDs are elements of some fixed but arbitrary set that comes equipped with a total order: in this case one can choose to be the strictly increasing function. This was done in the index arithmetic of Chapter 7 in [Kal06], where IDs are always from . We will see shortly that the concrete choice of does not really matter, but it is convenient to have one available.
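The choice-free situation for totally ordered IDs can be illustrated by a minimal Python sketch. It assumes IDs are comparable Python values and represents a bijection from the positions 1, ..., k onto a finite set as a dictionary; the helper names are hypothetical.

```python
from itertools import permutations


def all_bijections(index_set):
    """All bijections {1,...,k} -> index_set, each encoded as a dict."""
    k = len(index_set)
    return [dict(zip(range(1, k + 1), perm)) for perm in permutations(sorted(index_set))]


def increasing_bijection(index_set):
    """The canonical choice when IDs are totally ordered: the strictly increasing bijection."""
    return dict(enumerate(sorted(index_set), start=1))


two_ids = {5, 2}
print(all_bijections(two_ids))        # two candidates, neither distinguished a priori
print(increasing_bijection(two_ids))  # the order singles out one of them
```

Without an order on the IDs there is no such canonical selection among the candidates, which is where a choice axiom enters.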
As seen in the example, the following requires the global axiom of choice.
Proposition 9 (Existence of a skeleton).
Every indexing system has a skeleton .
Proof.
It is an equivalence relation on indices. Let which is a countable set. It is easy to see that every index i is equivalent to some index from . Restricting to one can apply the axiom of countable choice and obtain together with a choice function . Since for every index i it is a non-empty set, applying global choice gives a rule that maps every index i to some index with . The rule is defined as . Obviously, is uniquely determined given .
For every index i with it is , hence and by definition of it is
a non-empty set. Applying global choice again gives the rule which maps every index i to an element . ∎
Remark 19.
For with it is straightforward to check that , where is an arbitrary choice of representative indices.
Let be a BDS. As before, for a rule that maps every finite set to a measurable map it is the i-th component function of , that is .
Theorem 12.
Let be a BDS, a Borel space and an indexing system with skeleton . A one-to-one correspondence between
-
(1)
natural transformations and
-
(2)
sequences such that for every with it is
measurable with for every
is given by
-
•
with ,
-
•
with .
Further, the construction does not depend on a concrete choice of .
Proof.
Let . For a rule that maps finite sets to measurable functions let be the components of . As already noted in the proof of Theorem 6, the following are equivalent:
-
(i)
are the components of a natural transformation ,
-
(ii)
for every injection and index
(6.1)
. It is shown that this construction gives a sequence of kernels as in (2). Let with . The measurability of is clear, as it is the inner component of the measurable function . Applying (6.1) to and gives
that is the construction gives sequences of kernels as in (2).
. It is shown that this construction gives a natural transformation. Let with . Let and . Property (6.1) needs to be verified. Consider both sides of that equation by plugging in the definitions and write for short:
Now and hence . Calculating the first term gives
with bijection . Now consider the second term. It holds . Using the symmetry of , for every it follows
Comparing the final calculations for both sides shows that equality, and hence naturality of , follows if there exists such that ; such an element is simply given by , as one can check by noticing .
implies . Let with . Let be finite and . It is . Write with and apply (6.1) to the inner components of :
implies . Let be finite and . It is . For with it is
Now and , hence , which gives
Now it is such that , that is . Since is symmetric it follows that .
The one-to-one correspondence is thus shown. It only remains to show the following.
The construction does not depend on a concrete choice of . Let be a fixed choice of representative indices and let be two rules that map an index i to bijections such that . Applying (4) from Lemma 2 gives a with . Suppose is defined using and using . For a finite set and index the invariance of the kernel functions gives
∎
Example 15.
Natural transformations , with , are determined by symmetric measurable maps and the corresponding natural transformation has components . The previous theorem shows: it does not matter in which order and are picked from and plugged into , because of symmetry. However, a concrete choice is made in that theorem via .
Example 16 (Local modification rules).
Following Definition 1.27 in [AT10] the concept of a local modification rule is introduced: let be an arbitrary BDS and be a finite set representing "extra individuals from the outside". A new BDS is defined by and , where for it is the injection that operates as on and as on . A local modification rule on (using ) is a natural transformation . In case Theorem 12 gives an explicit description of local modification rules using kernel functions.
A characterization of natural transformations can be obtained easily given prior results:
Proof of Theorem 6.
For any countable collection of BDS there is an obvious one-to-one correspondence between natural transformations and sequences of natural transformations with a natural transformation for every : for every such sequence it is the component of a n.t. and this construction is one-to-one. Hence Theorem 6 directly follows from Theorem 12. ∎
It remains to show the strong FRT for array-type data structures, Theorem 5, which is obtained from the weak version by modifying almost sure natural transformations to true ones.
Lemma 3.
Let be Borel spaces, a measurable map, a countable group, a measurable group action and a probability measure such that for every it is for -almost all . Then there exists a -invariant measurable function such that for -almost all .
Proof.
For each the set is measurable with -probability one. Since is countable the same is true for . In particular . If choose , otherwise choose arbitrary and define
is measurable and satisfies -almost surely because . The -invariance of follows because for every the equivalence holds. ∎
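The modification argument can be traced on a finite toy example. The following Python sketch rests on assumptions: the space, group, measure and function are invented for illustration, and the "good" set is taken to be the set of points on which the function agrees with all of its translates, which is one reading of the construction in the proof above.

```python
from fractions import Fraction

# Toy setup (all names hypothetical): six points, a cyclic group of order 3
# acting by s -> s + 2k mod 6, and a measure concentrated on the orbit {0, 2, 4}.
points = range(6)
group = [lambda s, k=k: (s + 2 * k) % 6 for k in range(3)]
mu = {s: (Fraction(1, 3) if s % 2 == 0 else Fraction(0)) for s in points}

# f is invariant on the orbit {0, 2, 4} but not on the null orbit {1, 3, 5}.
f = {0: "x", 2: "x", 4: "x", 1: "a", 3: "b", 5: "a"}

# Good set: points where f agrees with all of its translates.
good = {s for s in points if all(f[g(s)] == f[s] for g in group)}

# Modification: keep f on the good set, use a fixed fallback value elsewhere.
FALLBACK = "t0"
f_mod = {s: (f[s] if s in good else FALLBACK) for s in points}

print(sorted(good), sum(mu[s] for s in good))
print(all(f_mod[g(s)] == f_mod[s] for g in group for s in points))
```

The good set is itself invariant under the action, which is what makes the modified function invariant at every point and not only almost surely.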
Proposition 10 (Modification).
Let be a BDS, and a countable product of array-type data structures. Every -a.s. natural transformation has a modification to a true natural transformation such that for every finite it holds that -almost surely.
Proof.
Let . For every it is , let be the -th component function of . It is a -a.s. natural transformation. If every can be modified to a true natural transformation then, because countable intersections of events with probability one have probability one, the rule defines the components of the desired modification of . Hence one can restrict to the case : showing that every -a.s. natural transformation has a modification, where is an arbitrary Borel space and an arbitrary indexing system. Let be a skeleton of . For and let be the i-th component of . Let be injective and . Since is a -a.s. natural transformation it holds that . This is an almost sure equality in and hence for every index it follows that
(6.2) |
For with define
For applying (6.2) to and gives
By Lemma 3 one can modify to a measurable function such that for all (pointwise) and . By Theorem 12 one can use to construct a true natural transformation which has components . This gives a -a.s. modification of : for a finite set and let , which is an injection such that and . Noticing gives the calculation
and hence (finite intersection of events with probability one). ∎
Proof of Theorem 5.
Let and . The weak FRT, Theorem 3, shows that there exists a -a.s. natural transformation with . Proposition 10 gives that can be modified to a true natural transformation with for -almost all , hence . In case applying Proposition 8 to the true natural transformation gives a true natural transformation with and hence . ∎
6.1. Explicit FRT for array-type data structures
Let be an indexing system with skeleton . Define and the action as
This gives a notion of exchangeability in arrays as in (1.3). For every bijection it is . The following is a consequence of Theorem 5 and Theorem 12 and is formulated in terms of natural extensions of arrays, see Section 4.1.
Corollary 4.
Let be a Borel space. For every exchangeable -valued process there exist kernel functions such that for every with it is
-
•
measurable,
-
•
for every and
and such that
with iid . The representation does not depend on the concrete choice of by symmetry of the kernels.
It is directly seen that the FRT needs randomization up to order . Applying the representation to gives the examples (E1)-(E3), applying it to gives back Theorem A (from which everything started). It is noted that deriving an FRT for a particular indexing system from Hoover’s (or any other known) FRT is often possible by "elementary" arguments, which then typically depend on the concrete indexing system considered. The result above works out these arguments simultaneously for every indexing system.
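To give a feeling for such representations, the following Python sketch samples a finite section of an array indexed by two-element sets in the classical Aldous-Hoover form: one uniform variable for the empty set, one per ID and one per pair, plugged into a kernel that is symmetric in the two ID-level arguments. The kernel `edge_kernel` and the function names are hypothetical choices made for illustration only.

```python
import random
from itertools import combinations


def edge_kernel(u_empty, u_a, u_b, u_ab):
    """Hypothetical kernel, symmetric in (u_a, u_b), returning 0 or 1."""
    return int(u_ab < u_empty * (u_a + u_b) / 2)


def sample_array(ids, kernel, rng):
    """Sample X_{a,b} = kernel(U_empty, U_a, U_b, U_{a,b}) for all two-element sets."""
    ids = sorted(ids)
    u_empty = rng.random()
    u_single = {a: rng.random() for a in ids}
    u_pair = {frozenset(pair): rng.random() for pair in combinations(ids, 2)}
    return {
        frozenset((a, b)): kernel(u_empty, u_single[a], u_single[b], u_pair[frozenset((a, b))])
        for a, b in combinations(ids, 2)
    }


rng = random.Random(0)
array = sample_array({"a", "b", "c", "d"}, edge_kernel, rng)
print(len(array))  # one entry per two-element subset
```

Because the kernel is symmetric in its two ID-level arguments, the order in which the two IDs of a pair are enumerated does not affect the sampled value.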
6.2. Atomic indexing systems
To understand what indexing systems are about it is insightful to consider atomic indexing systems. is called atomic if there exists a unique representative index, that is: if is a skeleton, then has one single element with for some . It follows that for every index i from it is , and .
Example 17.
Atomic indexing systems are with representative index , with and with . Examples of non-atomic indexing systems are in case , in case , or .
Using Lemma 2 it is straightforward to show that an atomic indexing system with is always "in between" and : for every finite set it is
Further, Theorem 12 can be used to justify that natural embeddings
are given by
-
•
; the kernel is the identity function . Injectivity of follows because ranges over as i ranges over ,
-
•
; the kernel is . Injectivity of follows because ranges over all elements from when j ranges over (because then, ranges over all injections ).
The term "atomic" is justified by the fact that every indexing system decomposes into atomic indexing systems: if is a skeleton for and the representatives are enumerated as , a countable set, then for every an atomic indexing system is given by defined as and for . For every finite set it is a disjoint union because is an equivalence relation on indices; a natural isomorphism is given by components
A formal remark on this: if and are such that , then is the discrete one-point Borel space consisting of the unique function , which for every equals . In case is countably infinite, for every finite set it is for all but finitely many .
7. Outlook to separate exchangeability
Let be fixed. The statistical philosophy behind (classical) notions of separate exchangeability is that there are large populations and a statistician picks from each of the populations a finite set of individuals, representing individuals from population via IDs from some finite set . The complete sample of individuals is represented by the tuple . Picking subgroups is performed separately on each group, that is via tuples of injections such that is injective. Composition with , with injective, is . The same ideas that lead to the study of BDS () can be extended to and lead to considering functors
where is the -fold product category of . A functor gives the Borel spaces representing spaces of measurements on a group of individuals represented by and for every way of (separately) picking subgroups a measurable map which explains how picking subgroups transforms measured data. Imagine the statistician picks individuals and assigns IDs "randomly" and then measures data. As with Borel data structures it is straightforward to model the distribution of such a random measurement by a rule mapping every to some such that for any it holds
Let be the space of such , which are called symmetric laws on . Any symmetric law is determined on its diagonal, that is by the values ranging over finite sets : for every let , it is
Let be the diagonal functor that sends to and to . It is
a Borel data structure and for the rule is an element of . The map is injective. Let
In this context it is reasonable to call a separately exchangeable law and a jointly exchangeable law on the Borel data structure . The statistical interpretation of the BDS is as follows: a statistician picks individuals from each of the populations, obtaining distinct groups of individuals each of size , and uses a single finite set of IDs with to identify individuals within each of the groups, that is every points to an individual in each of the groups. Storing (joint) information about the picked groups as data gives a value from . Picking subgroups in is performed such that for every the individuals represented by are treated as "linked together". Loosely speaking, this results in the following statistical interpretation of separately and jointly exchangeable laws:
-
•
Jointly exchangeable laws arise as follows: with is the law of a measurement in which individuals are picked with an arbitrary coupling, that is every pick represents a simultaneous pick of individuals, exactly one from each population.
-
•
Separately exchangeable laws correspond to the coupling being "independent", that is for every and every an individual from population is picked randomly and assigned ID . Of course, .
To summarize: in any BDS represented in the form there is a canonical notion of separate exchangeability that is stronger than (joint) exchangeability , that is . Of course, when it is and . Next, two ways are given to construct a BDS satisfying for some constructed from a "base" BDS .
For injections with let act as . The coproduct version is the map acting on as . Considering only the "diagonal" version of these constructions leads to the indexing systems , which sends to and to , and , which sends to (-times) and to .
Let be an arbitrary BDS.
-
(C1)
satisfies with being
-
(C2)
satisfies with being
The statistical interpretation of these constructions is straightforward: let the picked groups, each of size , be represented by . Data of the form is measured by building all pairs and using these pairs as new "individuals" on which data is measured according to . Data of the form is measured by pooling the individuals from the different groups together (in an identifiable way), which gives new IDs (the elements of ), and using to measure data on the pooled group.
Example 18.
The classical notion of separate exchangeability is about arrays indexed by . This notion can be derived from the previous construction as follows:
It is with . As seen before (natural extension of arrays + correspondence with laws of random measurements using a countably infinite set of IDs), jointly exchangeable laws correspond to laws of -valued arrays satisfying for every bijection
The derived notion of separate exchangeability is the classical one: the law of is represented by a separately exchangeable law iff
holds for any bijections .
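The difference between the two invariance requirements can be seen on finite arrays. The Python sketch below is only an illustration under assumptions: it uses finite square arrays as a stand-in for arrays indexed by pairs of natural numbers, permutations are lists of 0-based positions, and the direction of the action (permutation versus its inverse) is a convention chosen here.

```python
def act_jointly(array, pi):
    """Joint action: the same permutation relabels rows and columns."""
    n = len(array)
    return [[array[pi[i]][pi[j]] for j in range(n)] for i in range(n)]


def act_separately(array, pi_rows, pi_cols):
    """Separate action: rows and columns are relabelled independently."""
    return [[array[pi_rows[i]][pi_cols[j]] for j in range(len(pi_cols))]
            for i in range(len(pi_rows))]


identity = [[1, 0], [0, 1]]
swap, keep = [1, 0], [0, 1]

print(act_jointly(identity, swap))           # the array is unchanged
print(act_separately(identity, swap, keep))  # the array changes
```

A deterministic array equal to the identity matrix is thus fixed by every joint relabelling but not by every separate one, so its (degenerate) law is invariant under the joint action only. The joint action is the special case of the separate action in which both permutations coincide.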
Example 19.
Let be the data structure of strict total orders. Separately exchangeable laws in appeared in [CE17] in the context of identifying the Doob-Martin boundary of a specific combinatorial Markov chain producing randomly growing words over a -letter alphabet; a (functional) representation of separately exchangeable laws in this case is given by their Theorem 6.12 (where "exchangeable" instead of "separately exchangeable" is used).
Remark 20.
The constructions (C1), (C2) also have outer versions; details are only given for the product: let be BDS and consider the product . It is with and . The statistical interpretation is that on each of the groups data is measured separately, on group according to , and recorded in a tuple. Separately exchangeable laws can be easily identified: is separately exchangeable iff for every finite set
for uniquely defined probability measures on . Note the coincidence that for for all it is , which is special to sequential data.
In future work the abstract notion of separate exchangeability should be investigated further. For that, many of the results derived for functors and their exchangeable=symmetric laws should have a straightforward generalization to functors for arbitrary . Studying functional representations for separately exchangeable laws should be particularly fruitful for BDS of the form with , as in this case is of array-type again, for which general results have been presented. The same holds for .
8. Concluding remarks/outlook
Remark 21 (Kernels as morphisms).
Let be the category that has Borel spaces as objects and probability kernels as morphisms, that is: a morphism from to in is a measurable map and composition with is defined by disintegration:
A good part of our definitions and results should also hold when is replaced by , that is the initial object of study would be functors ; however, the focus of this work was on functional aspects of exchangeability based on the statistical interpretation of "manipulating measurements in a deterministic way". A possible benefit of extending the theory from to remains to be investigated. It is noted that the results of this work would embed nicely into the more general framework: the category is obtained as the Kleisli category induced by the Giry monad, see [Gir82], which has a version on . The results about BDS and natural transformations between BDS would embed in the -setting by identifying a function with the kernels .
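On finite spaces the kernel category can be written down directly, which may help to fix ideas. The Python sketch below assumes finite spaces, represents a kernel as a dictionary of probability rows, and uses the standard composition of kernels (integrating the second kernel against the first); a deterministic map is embedded as the kernel of point masses. The names `compose` and `dirac_kernel` are illustrative.

```python
from fractions import Fraction


def compose(first, second):
    """Kernel composition on finite spaces: integrate `second` against `first`."""
    result = {}
    for s, row in first.items():
        out = {}
        for t, p in row.items():
            for c, q in second[t].items():
                out[c] = out.get(c, 0) + p * q
        result[s] = out
    return result


def dirac_kernel(function, domain):
    """Embed a deterministic map as the kernel s -> point mass at function(s)."""
    return {s: {function(s): Fraction(1)} for s in domain}


def parity(s):
    return s % 2


def label(t):
    return "even" if t == 0 else "odd"


# Composing embedded functions agrees with embedding the composed function.
lhs = compose(dirac_kernel(parity, [0, 1, 2]), dirac_kernel(label, [0, 1]))
rhs = dirac_kernel(lambda s: label(parity(s)), [0, 1, 2])
print(lhs == rhs)

# A genuinely random kernel: a noisy copy of a bit, applied twice.
noisy = {0: {0: Fraction(3, 4), 1: Fraction(1, 4)},
         1: {0: Fraction(1, 4), 1: Fraction(3, 4)}}
print(compose(noisy, noisy)[0][0])
```

The first check reflects that the embedding of functions into kernels is compatible with composition, which is the property needed for the deterministic theory to sit inside the kernel-based one.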
Remark 22 (Conjecture about generalized Noise-Outsourcing Lemma).
[Aus15] studied exchangeable laws in with . As noted in Remark 10, the notions of natural transformations and kernel functions implicitly appeared in that context as skew-product type functions and skew-product tuples. The results obtained there lead to the following conjecture, formulated in a "weak" form for arbitrary BDS of arbitrary depth:
Conjecture 1.
Let and be Borel data structures.
-
•
For every and -a.s. natural transformation there exists a -a.s. natural transformation such that for every finite set it is for -almost all ,
-
•
Abstract Noise-Outsourcing Lemma: for every with first marginal there exists a -a.s. natural transformation such that with having components .
Remark 23 (Topological assumptions).
Topological assumptions may be needed to obtain further results, in particular for sub-data structures, which is reflected by the topological assumptions being made in [AT10] for studying hereditary properties. For that, one could replace with or (continuous maps between Polish/compact metrizable spaces), both of which have a version of the Giry monad (equipping probability measures with the topology of weak convergence). An interesting question arises: it is known that for every measurable group action on a Borel space there exists a Polish topology on generating its -field and such that becomes a continuous group action, see [Kec00]. Let be the forgetful functor mapping a Polish space to the obtained Borel space and a continuous map to itself; is it true that for every BDS there exists a functor such that ? Such a functor would correspond to a rule that maps every finite to a Polish topology on generating its -field and making all maps continuous.
Remark 24 (Functors in [AT10]).
In Definition 3.5 of [AT10] contravariant functors have been introduced, where has sub-Cantor spaces as objects (topological spaces homeomorphic to a compact subset of the standard Cantor space) and probability kernels as morphisms. Restricting such a functor to and keeping only the measurability structure gives a functor , see Remark 21. The derived functor obtained from a sub-Cantor palette (Definition 3.7) corresponds to in our notation.
Remark 25 (Quasi-Borel spaces).
[Heu+17] introduces the category of quasi-Borel spaces, , aiming to provide a more solid mathematical foundation for applications in probabilistic programming, motivated by the (unpleasant) observation that is not Cartesian closed. In particular, a de Finetti-type representation theorem for exchangeable sequences with values in quasi-Borel spaces is shown; given that and their statistical motivation, it seems interesting to investigate if and in what sense definitions and results in the BDS context translate to functors .
Remark 26 (Using category theory terminology).
It should be possible to translate definitions and results using more category theory terminology, which would give the opportunity to search for further abstractions. For example, Theorem 5 (strong FRT for products of arrays) can be formulated as follows: Let be the category that has countable products of array-type data structures as objects and natural transformations as morphisms. Consider the functor that sends to the Borel space of exchangeable laws and a natural transformation to the push-forward . Let be the one-point Borel space. Theorem 5 is equivalent to the existence of a weak universal arrow from to witnessed by the pair , where is viewed as a function , see [Mac78] Section X.2.
Remark 27 (Shift-invariance and contractability).
Suppose is a BDS having an extension . Let be the set of injections . It was seen that exchangeable laws correspond to laws of -valued random variables satisfying for all . Any subset introduces a weaker notion of invariance: (The law of) A -valued random variable is called -invariant iff for all ; of course exchangeability implies -invariance. To study -invariance one can assume wlog that is closed under composition and contains , that is, is a monoid under composition. Two classical examples fall into this frame:
-
•
Shift-invariance: with ,
-
•
Contractability/Spreadability: . Note that spreads IDs and so, by contravariance, contracts measurements.
Both these invariances are based on additional mathematical structure on the concrete choice of IDs : addition for shift-invariance and a total order in case of contractability. How to incorporate additional structure on IDs into an abstract category theory framework remains open for future research, but should give interesting insights: comparing Theorems 7.15 and 7.22 in [Kal06] shows a deep connection between contractability in and exchangeability in .
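The monoid structure behind these two invariances can be checked on finite windows. The following Python sketch is an illustration under assumptions: it works with maps between initial segments of the positive integers instead of self-maps of the natural numbers, encodes a map by the tuple of its images, and uses hypothetical helper names.

```python
from itertools import combinations


def increasing_injections(m, n):
    """All strictly increasing maps {1..m} -> {1..n}, encoded as tuples of images."""
    return list(combinations(range(1, n + 1), m))


def compose(outer, inner):
    """(outer after inner)(i) = outer(inner(i)); tuples are read as 1-based maps."""
    return tuple(outer[x - 1] for x in inner)


def shift(m, by):
    """The shift i -> i + by restricted to {1..m}."""
    return tuple(i + by for i in range(1, m + 1))


maps = increasing_injections(2, 3)
outer = (2, 4, 5)  # a strictly increasing map {1,2,3} -> {1,...,5}

print(maps)
print(shift(2, 1) in maps)  # shifts are particular spreading maps
print([compose(outer, m) for m in maps])
```

Compositions of strictly increasing maps are again strictly increasing, and every shift is strictly increasing, which matches the fact that contractability is the stronger of the two invariance requirements.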
9. Appendix
9.1. Borel spaces
In [Kal97] Borel spaces are introduced as measurable spaces for which there exists a Borel subset and a bi-measurable bijection . Borel spaces coincide with standard Borel spaces which are typically introduced as measurable spaces on which the -field is generated from a Polish topology on . The theory of (standard) Borel spaces is presented, for example, in [Kec95].
Borel spaces enjoy the following closure properties:
-
•
Countable products and coproducts of Borel spaces are Borel,
-
•
Measurable sub-spaces of Borel spaces are Borel,
-
•
For a measurable space let be the space of probability measures on equipped with the -field generated by the evaluation maps measurable. If is Borel, so is , see Theorem 1.5 in [Kal17].
Let be Borel spaces and measurable.
-
•
If is bijective its inverse is measurable,
-
•
If is injective and measurable, then the image is measurable and in case it is a bi-measurable bijection between the Borel spaces and , see Corollary (15.2) in [Kec95],
-
•
If is injective then there exists a measurable with .
Borel spaces make the concept of conditional distributions well-behaved, see for example Lemma 3.1 in [Aus12]:
Theorem (Noise-Outsourcing).
Let be a Borel space and an arbitrary measurable space. Let be a -valued random variable. Then there exists a measurable function such that with independent from .
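In the discrete case the outsourcing function can be written down explicitly as a conditional quantile function. The Python sketch below is a minimal illustration under assumptions: the second variable takes finitely many values, the conditional law given each value of the first variable is supplied as a list of (value, probability) pairs, and the names and the weather example are hypothetical.

```python
from fractions import Fraction


def outsourcing_function(conditional):
    """Build f(x, u) with f(x, U) distributed as the conditional law given x, U uniform on [0, 1)."""
    def f(x, u):
        cumulative = Fraction(0)
        for y, p in conditional[x]:
            cumulative += p
            if u < cumulative:
                return y
        return conditional[x][-1][0]  # guards against u at the upper boundary
    return f


conditional = {
    "sun": [("dry", Fraction(9, 10)), ("wet", Fraction(1, 10))],
    "rain": [("dry", Fraction(1, 5)), ("wet", Fraction(4, 5))],
}
f = outsourcing_function(conditional)

# On a uniform grid the frequencies reproduce the conditional probabilities exactly.
grid = [Fraction(k, 100) for k in range(100)]
print(sum(f("sun", u) == "dry" for u in grid), sum(f("rain", u) == "wet" for u in grid))
```

Feeding an independent uniform variable into the second argument reproduces the joint law of the pair, which is the content of the theorem in this special case; the general statement needs the Borel space assumption on the target.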
9.2. Some proofs
Proof of Proposition 4.
First we check that the construction is well-defined: let be finite and let and be two bijections. Then there exists a bijection with and hence . Now let be with . There exists a bijection such that . With this the functoriality of and sampling consistency and exchangeability of give
which gives and hence that the definition does not depend on the concrete choice of .
Next check : let be an injection and . Then . Let and , which is a bijection. It holds that . By functoriality of and sampling consistency of it is
hence .
Next check that the construction is a bijection. It is injective: let with constructed rule . For it is . By sampling consistency the law of is determined by hence the construction is injective.
Next check that the construction is surjective, that is for every rule there exists with for all . The Borel space assumption is needed to apply the Kolmogorov extension theorem: let be an increasing sequence of finite sets with . Applying Theorem 8.21 in [Kal97] gives the existence of a stochastic process such that for all and almost surely for all . For any finite set let be the smallest set with and define . It can easily be checked that is an exchangeable -measurement, that is , whose law gives back the rule .
∎
Proof of Proposition 5.
(1) Let be countably infinite. is a measurable subset of because it is the countable intersection of sets over , the latter are measurable because is for every . By assumption , let . By Proposition 4 there exists an exchangeable -measurement with for every , it holds that and hence . The property iff for all finite follows from for all together with laws of processes being determined by finite dimensional margins.
(2) Let be countably infinite. By definition for every it is . The -field on inherited from is also generated by these projections, in particular is measurable. It is easily checked that the extension of to arbitrary countable sets satisfies functoriality, that is for all composable injections between countable sets it holds and . Only the measurability of needs to be checked: let be injective. If is finite then is measurable by composition. If is also infinite, then is measurable iff is measurable for every . By functoriality which was seen to be measurable before.
(3) Let be an exchangeable -measurement using IDs . (iv)(iii)(ii) is clear. Assume (ii) and let and bijective. Extend to a bijection via on and on . By (ii) it is and hence . Let . It is and the latter equals by definition, hence and (i) follows. Now assume (i) and show (iv). By Proposition 4 there is with for every finite set . Let be an arbitrary injection. It is
Because laws on are determined by one-dimensional margins, see (1), only needs to be shown. Now it is
an injection , hence follows from .
(4) Let be injective between countably infinite sets and . It is a -valued random variable. For every bijection choose a bijection with . It holds , that is is exchangeable and is a map . This map is an isomorphism due to Proposition 4, which shows that both and can be identified with by the rule constructed there.
∎
Proof of Theorem 9.
For all and bijections it holds that
it is thus no restriction to consider only elements with for some when investigating limits.
Thus, only finite subsets are considered and laws are identified with laws of ergodic exchangeable -measurements .
Let and for with let be the indicator of . Let . The law of any exchangeable -measurement is determined by the expectations over , that is by .
Applying (i)(iii) of Theorem 8 to gives that for every ergodic there exists a convergent sequence such that (4.2) holds.
On the other hand it is easy to check that a limit of a convergent sequence with gives a rule via
This works because and hence are assumed to be finite.
It only needs to be checked that is ergodic. Let using IDs . By the characterization of ergodicity via independence, Theorem 8, it suffices to check that for every with and it holds that are independent.
For , using sampling consistency, the fact that probabilities for are represented by limits, and that is finite, one obtains:
The argument that the latter equals is the same as in the proof of (iii)(ii) from Theorem 8. ∎
Proof of Proposition 8.
(i) This is straightforward to check.
For both (ii) and (iii) some preparatory observations are needed. It is easy to check that defined by
and for and
defines a new Borel data structure. Again, it is straightforward to check that
is a natural transformation such that every component is injective due to , that is is an embedding.
(ii) By Proposition 2 it is a Borel data structure naturally isomorphic to . Let be the natural isomorphism obtained from by restricting the range of its components and let be the natural inverse of .
For every it is and hence
For every finite set and this allows one to define
(9.1) |
Check that : for every one can choose with
Note that for every it holds that and hence
Applying naturality of gives
It is and hence . That is, is a measurable rule and the previous calculation also showed that
Applying to the left gives , so the candidate for is the rule . All that is left to check is that this is a natural transformation. Since is it suffices to show that is. Let and choose with . Let be injective.
that is as needed.
(iii) The idea is the same as for (ii), but the technical details are a little more subtle. As before, let be the embedding and let be a left-inverse that is a -a.s. natural transformation for every , which exists due to Proposition 7. In particular, it holds that and hence for every .
For every define the value as in (9.1), which gives a rule .
Let and define , so . The -a.s. naturality of gives
that is the -a.s. equality of the rules and . Applying on the left gives -almost surely. The desired candidate for is thus and all that is left to check is that this is a -a.s. natural transformation.
First check that is a -a.s. natural transformation. Let with and be injective.
So is a -a.s. natural transformation.
Check that is a -a.s. natural transformation: it is a -a.s. natural transformation for every . Let . Because -almost surely and it holds that
Hence is a -a.s. natural transformation and is a -a.s. natural transformation. Lemma 1 gives that is a -a.s. natural transformation. ∎
Proof of Lemma 2.
(1) For the moment write . Let be another set with . Check that : since and it is . Because and
(2) Write so that . Because it is
that is equals .
(3) Let and and . Check : since it holds that , hence and so with by (2), which is an element of , thus . Applying the inverse to the equation gives and the same reasoning as before yields . Because is bijective it holds that . From and injectivity of it follows and hence . Because both are finite, together with implies .
(4) Assume there is with for all . This implies and with (2) it follows
Now assume . By (3) it is , hence both are bijections . By (2) it holds . Applying on the left gives . That is, the bijection is an element of . For it is and hence .
(5) Reflexivity: it is hence . Symmetry: let with witnessed by satisfying . By (2) it is . Applying gives and hence . Transitivity follows by composing the witnessing injections.
∎
References
- [Ald09] David J Aldous “More uses of exchangeability: representations of complex random structures” In arXiv preprint arXiv:0909.4339, 2009
- [Ald10] David J Aldous “Exchangeability and continuum limits of discrete random structures” In Proceedings of the International Congress of Mathematicians 2010 (ICM 2010) (In 4 Volumes) Vol. I: Plenary Lectures and Ceremonies Vols. II–IV: Invited Lectures, 2010, pp. 141–153 World Scientific
- [Ald82] David J Aldous “On exchangeability and conditional independence” In Exchangeability in probability and statistics (Rome, 1981) North-Holland Amsterdam, 1982, pp. 165–170
- [Ald85] David J Aldous “Exchangeability and related topics” In École d’Été de Probabilités de Saint-Flour XIII—1983 Springer, 1985, pp. 1–198
- [AO18] Morgane Austern and Peter Orbanz “Limit theorems for distributions invariant under groups of transformations” In Annals of Statistics (to appear), 2018 URL: https://www.e-publications.org/ims/submission/AOS/user/submissionFile/51328?confirm=c0a05f9c
- [AP14] Tim Austin and Dmitry Panchenko “A hierarchical version of the de Finetti and Aldous-Hoover representations” In Probability Theory and Related Fields 159.3 Springer, 2014, pp. 809–823
- [AT10] Tim Austin and Terence Tao “Testability and repair of hereditary hypergraph properties” In Random Structures & Algorithms 36.4 Wiley Online Library, 2010, pp. 373–463
- [Aus08] Tim Austin “On exchangeable random variables and the statistics of large graphs and hypergraphs” In Probability Surveys 5 The Institute of Mathematical Statistics and the Bernoulli Society, 2008, pp. 80–145
- [Aus12] Tim Austin “Exchangeable random arrays” In Notes for IAS workshop, 2012
- [Aus15] Tim Austin “Exchangeable random measures” In Annales de l’IHP Probabilités et statistiques 51.3, 2015, pp. 842–861
- [Aus19] Morgane Austern “Limit Theorems Beyond Sums of IID Observations” Columbia University, 2019
- [Ber+98] François Bergeron, Gilbert Labelle and Pierre Leroux “Combinatorial species and tree-like structures” Cambridge University Press, 1998
- [CAF16] Diana Cai, Nathanael Ackerman and Cameron Freer “Priors on exchangeable directed graphs” In Electronic Journal of Statistics 10.2 Institute of Mathematical Statistics and Bernoulli Society, 2016, pp. 3490–3515
- [CE17] Hye Soo Choi and Steven N Evans “Doob–Martin compactification of a Markov chain for growing random words sequentially” In Stochastic processes and their applications 127.7 Elsevier, 2017, pp. 2428–2445
- [DJ07] Persi Diaconis and Svante Janson “Graph limits and exchangeable random graphs” In arXiv preprint arXiv:0712.2749, 2007
- [EGW17] Steven N Evans, Rudolf Grübel and Anton Wakolbinger “Doob–Martin boundary of Rémy’s tree growth chain” In The Annals of Probability 45.1 Institute of Mathematical Statistics, 2017, pp. 225–277
- [Fel71] Ulrich Felgner “Comparison of the axioms of local and universal choice” In Fundamenta Mathematicae 71 Instytut Matematyczny Polskiej Akademii Nauk, 1971, pp. 43–62
- [FGP21] Tobias Fritz, Tomáš Gonda and Paolo Perrone “De Finetti’s Theorem in Categorical Probability” In Journal of Stochastic Analysis 2.4.6, 2021
- [FHP18] Noah Forman, Chris Haulk and Jim Pitman “A representation of exchangeable hierarchies by sampling from random real trees” In Probability Theory and Related Fields 172.1 Springer, 2018, pp. 1–29
- [Ger18] Julian Gerstenberg “Austauschbarkeit in Diskreten Strukturen: Simplizes und Filtrationen”, 2018
- [Ger20] Julian Gerstenberg “Exchangeable interval hypergraphs and limits of ordered discrete structures” In The Annals of Probability 48.3 Institute of Mathematical Statistics, 2020, pp. 1128–1167
- [Ger20a] Julian Gerstenberg “General erased-word processes: Product-type filtrations, ergodic laws and Martin boundaries” In Stochastic Processes and their Applications 130.6 Elsevier, 2020, pp. 3540–3573
- [GGH16] Julian Gerstenberg, Rudolf Grübel and Klaas Hagemann “A boundary theory approach to de Finetti’s theorem” In arXiv preprint arXiv:1610.02561, 2016
- [Gir82] Michele Giry “A categorical approach to probability theory” In Categorical aspects of topology and analysis Springer, 1982, pp. 68–85
- [Gla03] Eli Glasner “Ergodic theory via joinings” American Mathematical Soc., 2003
- [Gne97] Alexander V Gnedin “The representation of composition structures” In The Annals of Probability JSTOR, 1997, pp. 1437–1450
- [Grü15] Rudolf Grübel “Persisting randomness in randomly growing discrete structures: graphs and search trees” In Discrete Mathematics & Theoretical Computer Science 18 Episciences.org, 2015
- [Heu+17] Chris Heunen, Ohad Kammar, Sam Staton and Hongseok Yang “A convenient category for higher-order probability theory” In 2017 32nd Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), 2017, pp. 1–12 IEEE
- [Hoo79] D.N. Hoover “Relations on probability spaces and arrays of random variables” In Preprint, Institute for Advanced Study, Princeton, 1979
- [Jan11] Svante Janson “Poset limits and exchangeable random posets” In Combinatorica 31.5 Springer, 2011, pp. 529–563
- [JS20] Bart Jacobs and Sam Staton “De Finetti’s construction as a categorical limit” In International Workshop on Coalgebraic Methods in Computer Science, 2020, pp. 90–111 Springer
- [Jun+21] Paul Jung, Jiho Lee, Sam Staton and Hongseok Yang “A generalization of hierarchical exchangeability on trees to directed acyclic graphs” In Annales Henri Lebesgue 4, 2021, pp. 325–368
- [Kal06] Olav Kallenberg “Probabilistic symmetries and invariance principles” Springer New York, 2006
- [Kal17] Olav Kallenberg “Random measures, theory and applications” Springer Cham, 2017
- [Kal97] Olav Kallenberg “Foundations of modern probability” Springer New York, 1997
- [KB94] Vladimir S Korolyuk and Yu V Borovskich “Theory of U-statistics” Springer Dordrecht, 1994
- [Kec00] Alexander S. Kechris “Descriptive dynamics” In London Math. Soc. Lecture Note Series 277, 2000, pp. 231–258
- [Kec95] Alexander S. Kechris “Classical descriptive set theory” Springer New York, 1995
- [Lau88] Steffen L. Lauritzen “Extremal families and systems of sufficient statistics” Lecture Notes in Statistics. Springer-Verlag Berlin Heidelberg GmbH, 1988
- [Lee22] Jiho Lee “A de Finetti-type representation of joint hierarchically exchangeable arrays on DAGs” In ALEA, Lat. Am. J. Probab. Math. Stat. 19, 2022, pp. 925–942
- [Lin01] Elon Lindenstrauss “Pointwise theorems for amenable groups” In Inventiones mathematicae 146.2 Springer, 2001, pp. 259–295
- [Llo+13] James Robert Lloyd, Peter Orbanz, Zoubin Ghahramani and Daniel M Roy “Exchangeable databases and their functional representation” In NIPS Workshop on Frontiers of Network Analysis: Methods, Models, and Application, 2013
- [Mac78] Saunders Mac Lane “Categories for the working mathematician” Springer New York, 1978
- [McC02] Peter McCullagh “What is a statistical model?” In The Annals of Statistics 30.5 Institute of Mathematical Statistics, 2002, pp. 1225–1310
- [Mil19] Bartosz Milewski “Category theory for programmers” Bartosz Milewski, 2019
- [OR14] Peter Orbanz and Daniel M Roy “Bayesian models of graphs, arrays and other exchangeable random structures” In IEEE transactions on pattern analysis and machine intelligence 37.2 IEEE, 2014, pp. 437–461
- [SS22] Sam Staton and Ned Summers “Quantum de Finetti Theorems as Categorical Limits, and Limits of State Spaces of C*-algebras” In arXiv preprint arXiv:2207.05832, 2022