PPP-Completeness and Extremal Combinatorics
(Part of this work was done while R.B., L.F., P.H., and N.I.S. were visiting Bocconi University.)
Abstract
Many classical theorems in combinatorics establish the emergence of substructures within sufficiently large collections of objects. Well-known examples are Ramsey’s theorem on monochromatic subgraphs and the Erdős-Rado sunflower lemma. Implicit versions of the corresponding total search problems are known to be PWPP-hard; here “implicit” means that the collection is represented by a poly-sized circuit inducing an exponentially large number of objects.
We show that several other well-known theorems from extremal combinatorics – including Erdős-Ko-Rado, Sperner, and Cayley’s formula – give rise to complete problems for PWPP and PPP. This is in contrast to the Ramsey and Erdős-Rado problems, for which establishing inclusion in PWPP has remained elusive. Besides significantly expanding the set of problems that are complete for PWPP and PPP, our work identifies some key properties of combinatorial proofs of existence that can give rise to completeness for these classes.
Our completeness results rely on efficient encodings for which finding collisions allows extracting the desired substructure. These encodings are made possible by the tightness of the bounds for the problems at hand (tighter than what is known for Ramsey’s theorem and the sunflower lemma). Previous techniques for proving bounds in TFNP invariably made use of structured algorithms. Such algorithms are not known to exist for the theorems considered in this work, as their proofs “from the book” are non-constructive.
1 Introduction
A well-known theorem by Ramsey gives a lower bound on the size of the largest monochromatic clique in any edge-coloring of the complete graph using two colors.
- Ramsey [Ram30]
-
Any edge-coloring of the complete graph on $2^n$ vertices with two colors contains a monochromatic clique of size at least $n/2$.
Ramsey’s theorem gives rise to a natural computational search problem Ramsey [Kra05, KNY19]: given a description of an edge-coloring, output the vertices of a monochromatic clique of size $n/2$. Since the theorem guarantees the existence of a monochromatic clique of this size, Ramsey belongs to the complexity class TFNP consisting of efficiently verifiable search problems to which a solution is guaranteed to exist [MP91].
The computational complexity of Ramsey very much depends on its representation. On the one hand, it is efficiently solvable when the graph is given explicitly; a folklore proof of Ramsey’s theorem gives an efficient algorithm to find such a subgraph – see Appendix A. On the other hand, the situation is less clear when the graph is represented implicitly, e.g., via a Boolean circuit that, for any pair of vertices, outputs the corresponding color of the edge-coloring of the graph. (Footnote 1: Given such a representation, it might even be hard to compute the degree of a node with respect to one of the two colors.)
Another TFNP problem considered in the literature that is motivated by a result in extremal combinatorics arises from the well-known Erdős-Rado sunflower lemma.
- Erdős-Rado [ER60]
-
Any family of -sets of cardinality greater than contains an -sunflower of size , i.e., subsets such that, for some , for every distinct .
An instance of the total search problem Sunflower [KNY19] can be implicitly represented, e.g., via a Boolean circuit that, given an index of a set in the family, outputs its characteristic vector.
In general, little is known of the complexity of the implicit variants of Ramsey or Sunflower – the proofs of the corresponding theorems are either non-constructive or result in inefficient (i.e., superpolynomial-time) algorithms. Both problems are known to be PWPP-hard, as shown by Krajíček [Kra05] and Komargodski, Naor, and Yogev [KNY19]. This means that finding the desired substructure is at least as hard as finding collisions in an arbitrary poly-sized shrinking circuit and, hence, hard in the worst-case if collision-resistant hash functions exist. However, they are not known to be complete for the class PWPP and the intriguing question of whether they give rise to a complexity class distinct from PWPP has remained open for years.
1.1 Our Results
We explore new connections between classical theorems in extremal combinatorics and the complexity classes PPP [Pap94] and PWPP [Jeř16], i.e., the classes of search problems with totality guaranteed by the (weak) pigeonhole principle. We show that PPP and PWPP can be characterized via a number of new TFNP problems based on the following theorems.
- Erdős-Ko-Rado [EKR61].
-
Any family of distinct pairwise-intersecting $k$-sets on a universe of size $m \geq 2k$ has size at most $\binom{m-1}{k-1}$.
- Sperner [Spe28].
-
The largest antichain, i.e., a family of subsets such that no member is contained in any other, on a universe with $2n$ elements is unique and consists of all subsets of size $n$.
- Cayley [Cay89].
-
There are exactly $n^{n-2}$ spanning trees of the complete graph on $n$ vertices.
Just as for Ramsey and Sunflower, the corresponding search problems are efficiently solvable when given explicit access to the family of objects and, again, their computational complexity is open when we consider implicit access to the structure, e.g., where the instance is given by a circuit that on input $i$ returns an encoding of the $i$-th object in the collection. (Footnote 2: Note that an implicit representation of the collection might not necessarily satisfy the assumptions of the underlying theorem. For instance, representing sets via characteristic vectors for Erdős-Ko-Rado does not ensure that they are actually $k$-sets or that they are distinct. Importantly, such a violation could allow evading the totality of the search problem. Nevertheless, we can ensure totality by allowing locally verifiable evidence of a malformed representation as a solution, e.g., an index not corresponding to a $k$-set or two indices corresponding to the same set.) The totality of the problems we define follows from a common principle – the instances are given via an implicit representation of a sufficiently large collection of objects (e.g., subsets for Erdős-Ko-Rado) such that, by the corresponding theorem, there exists a small subset of these objects satisfying some efficiently verifiable property (e.g., a pair of disjoint subsets for Erdős-Ko-Rado).
In addition to the above completeness results, we define TFNP problems arising from the following two results in extremal combinatorics.
We show that variants of the corresponding problems are hard for PWPP and PPP. However, proving their inclusion in PWPP or PPP remains open and they join Ramsey and Sunflower as candidate problems that might define a new class above PWPP or PPP (see Section 1.5). An overview of our results in terms of weak and strong problems (see Section 1.3) is given in Table 1.
Problem | Hardness | Containment |
---|---|---|
Ramsey | PWPP [Kra05, KNY19] | TFNP |
Sunflower | | |
Ward-Szabo | PWPP [Theorems 7.4, 8.3 and 8.12] | |
weak-Mantel | PPP [Theorems 7.7, 8.4 and 8.13] | |
weak-Turán_r | | |
Ward-Szabo-Colorful-Collisions | | |
Ward-Szabo-Collisions | PWPP [Theorems 7.5, 7.4, 5.2, 6.2, 4.4 and 4.21] | |
weak-Erdős-Ko-Rado | | |
weak-general-Erdős-Ko-Rado_k | | |
weak-Sperner-Antichain | | |
weak-Cayley | | |
Erdős-Ko-Rado | PPP [Theorems 6.7, 5.7, 4.9 and 4.26] | |
general-Erdős-Ko-Rado_k | | |
Sperner-Antichain | | |
Cayley | | |
Mantel | PPP [Theorem 8.8] | TFNP |
1.2 Techniques and Ideas
A long-standing open problem regarding Ramsey and Sunflower has been to determine their status with respect to the classes PWPP and PPP. Typically, the most challenging part of establishing completeness for some syntactic subclass of TFNP lies in proving hardness (see, e.g., [DGP09, Meh18, FG18]). For subclasses of TFNP such as PPAD, PPA, and PLS, the inclusion in a subclass mostly follows from the existence of an inefficient yet structured algorithm for the problem at hand; for example, the chessplayer algorithm for PPA [Pap94] or the steepest descent algorithm for PLS [JPY88]. However, this methodology seems inapplicable for proving inclusion in PWPP or PPP as these classes do not exhibit any characterizing graph-theoretic structure that could capture some class of natural algorithms.
In contrast to many existing bounds in TFNP, our work does not make use of structured algorithms but instead makes use of encodings that translate between substructures and collisions in circuits. In order to establish inclusion in PWPP, we encode the objects of the collection using a “property-preserving encoding” that encodes the objects in a way that translates some specific relation into collisions. More precisely, we want an encoding function that is efficiently computable and (nearly) optimal, such that whenever two elements have the same encoding, these two elements give a solution to the original problem. While this technique is quite general, it is not always clear how to instantiate the encoding to get the desired collisions.
Consider, for example, the total search problem corresponding to the Erdős-Ko-Rado theorem for intersecting families of $n$-sets on a universe of size $2n$. An instance can be given by a Boolean circuit $C$ representing a family of subsets of $[2n]$, i.e., $C(i)$ is the characteristic vector of the $i$-th $n$-set in the family. Suppose the outputs of $C$ define distinct $n$-sets. Since there are more than $\binom{2n-1}{n-1}$ of them, then, by the Erdős-Ko-Rado theorem, there must exist a pair of inputs mapped to disjoint $n$-sets by $C$. We define any such pair of inputs to be a solution. (Footnote 3: To ensure the totality of the problem, we introduce additional solutions corresponding to succinct certificates that $C$ does not define a family of distinct $n$-sets, i.e., either an $i$ such that $C(i)$ is not of Hamming weight $n$ or a pair $i \neq j$ such that $C(i) = C(j)$.)
When proving that the above total search problem is contained in the complexity class PWPP, at a high level, we want to encode the $n$-sets of the family using a shrinking circuit, in such a way that collisions correspond to disjoint sets. Observe that for $n$-sets in a universe of size $2n$, the only disjoint sets are complements and, hence, we get an equivalent instance of the problem if we map each set to either itself or its complement, arbitrarily. In our construction, we map each set to the representative not containing the element 1. That is, if $1 \notin S$, the set $S$ is left unchanged and, otherwise, it is mapped to its complement $\overline{S}$. Note that, by the pigeonhole principle, two sets that do not contain 1 must have a non-empty intersection since we work with $n$-subsets of $[2n]$. To obtain a shrinking circuit, we make use of Cover encodings (Section 3.1) that give an optimal encoding of all $n$-sets by considering their lexicographic order. Notice that if the input is not an $n$-set, we may map it arbitrarily to any $n$-set, as a collision, in this case, yields a solution to the instance of the above problem motivated by the Erdős-Ko-Rado theorem.
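The mapping described above can be sketched in Python for small parameters. This is only an illustration under stated assumptions: sets are modelled as frozensets over {1, ..., 2n}, and `rank_subset` is a hypothetical stand-in for the Cover encoding of Section 3.1 (it ranks subsets by their sorted element tuples, which may differ from the order on characteristic vectors used in the paper).

```python
from itertools import combinations
from math import comb

def representative(s, n):
    """Return s if it avoids element 1, otherwise its complement in {1..2n}."""
    universe = frozenset(range(1, 2 * n + 1))
    s = frozenset(s)
    return s if 1 not in s else universe - s

def rank_subset(s, n):
    """Rank of an n-subset of {2..2n} among all such subsets, ordered
    lexicographically by sorted element tuples (hypothetical helper)."""
    rank, prev = 0, 1
    for i, e in enumerate(sorted(s)):
        remaining = n - i - 1  # elements still to be chosen after this one
        for smaller in range(prev + 1, e):
            # Subsets that pick `smaller` at this position come earlier.
            rank += comb(2 * n - smaller, remaining)
        prev = e
    return rank

def encode(s, n):
    """Shrinking encoding: colliding n-sets are equal or complementary."""
    return rank_subset(representative(s, n), n)

def all_n_subsets(n):
    """All n-subsets of {1..2n}, for exhaustive checks with tiny n."""
    return [frozenset(c) for c in combinations(range(1, 2 * n + 1), n)]
```

With n = 3 there are 20 input sets but only 10 possible codes, so every code is hit twice, and each colliding pair consists of a set and its complement, i.e., a disjoint pair.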
In contrast, the PWPP-hardness results for Ramsey and Sunflower follow an extremely elegant but rather direct (compared to other hardness results for subclasses of TFNP) technique of graph-hash product [Kra05, KNY19], which we illustrate on Ramsey. Recall that there are known randomized constructions of edge-colorings of the complete graph on vertices that do not contain a monochromatic clique of size [Erd47]. Given such an underlying edge-coloring of and a hash function mapping -bit strings to -bit strings, one can construct an edge-coloring of the complete graph on vertices by assigning to every edge the color of the edge from the underlying coloring. Since the underlying edge-coloring of does not contain a monochromatic clique of size , it is easy to see that any monochromatic clique of size in the resulting edge-coloring of (guaranteed to exist by Ramsey’s theorem) must have been introduced via a collision in the hash .
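A toy version of the graph-hash product may help to fix ideas. The sketch below is an assumption-laden illustration, not the construction from [Kra05, KNY19]: the base coloring is the standard 2-coloring of the complete graph on 5 vertices without a monochromatic triangle, `toy_hash` is a hypothetical shrinking function, and edges between colliding vertices are arbitrarily given color 0.

```python
from itertools import combinations

def base_color(a, b):
    """2-coloring of K_5 with no monochromatic triangle:
    edges of the 5-cycle get color 0, all other edges get color 1."""
    return 0 if (a - b) % 5 in (1, 4) else 1

def toy_hash(x):
    """Hypothetical shrinking 'hash' from {0..15} to {0..4}."""
    return (3 * x + 1) % 5

def product_color(u, v):
    """Color of the edge {u, v} in the graph-hash product."""
    hu, hv = toy_hash(u), toy_hash(v)
    if hu == hv:
        return 0  # arbitrary choice when the endpoints collide
    return base_color(hu, hv)

def monochromatic_triangles(num_vertices):
    """Enumerate monochromatic triangles of the product coloring."""
    for t in combinations(range(num_vertices), 3):
        if len({product_color(a, b) for a, b in combinations(t, 2)}) == 1:
            yield t

def collision_in(vertices):
    """Return two vertices with the same hash value, or None."""
    seen = {}
    for v in vertices:
        h = toy_hash(v)
        if h in seen:
            return seen[h], v
        seen[h] = v
    return None
```

Since the base coloring has no monochromatic triangle, every monochromatic triangle found by `monochromatic_triangles(16)` contains a pair of vertices on which `toy_hash` collides, and `collision_in` extracts it.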
As noted by [KNY19], the structure of a PWPP-hardness proof using the graph-hash product is not restricted to total search problems corresponding to graph-theoretic theorems of existence; indeed, [KNY19] also used the graph-hash product to prove PWPP-hardness of Sunflower. At a high level, for a problem to be amenable to the graph-hash product technique, it is sufficient to be able to construct a collection of objects such that 1) it does not contain the desired substructure, 2) its size is at least a constant fraction of the threshold necessary for the existential theorem to apply, and 3) it can be efficiently indexed. (Footnote 4, on condition 2: This is a technical condition ensuring that we can reduce from a PWPP-complete variant of the problem of finding collisions in a shrinking hash. Note that it is easy to find collisions in functions that exhibit extreme shrinking.) Then, we can interpret the output of an appropriately shrinking hash as an index into the small collection of objects, and, for each index, we can efficiently compute and output the corresponding element in the collection. Again, since the small collection does not contain the desired substructure, all solutions of the instance constructed via graph-hash product must in some way result from a collision in the hash.
For example, consider the total search problem arising from Sperner’s theorem on antichains – here, the threshold size is $\binom{2n}{n}$, meaning that if we have a family with strictly more than $\binom{2n}{n}$ distinct subsets of $[2n]$ then one subset from the family must be contained in another member of the family. It is straightforward to construct a family of subsets that does not contain the specific substructure (i.e., a subset that is included in another one) with size equal to the threshold size $\binom{2n}{n}$. It suffices to consider the family of all the $n$-subsets of $[2n]$. Similarly, for many other combinatorial problems we study, an adequate collection of objects can be found by looking at a collection of maximum size that does not contain the substructure.
We also show natural reductions between some of the problems we define (from Erdős-Ko-Rado to Sperner-Antichain for instance), which, in our opinion, highlights the relevance of these new problems and the fact that their definition is the correct one.
1.3 PPP-Completeness From Extremal Combinatorics
Up to this point, our discussion did not explicitly distinguish between the classes PWPP and PPP. However, our work highlights important structural differences between the two complexity classes. Recall that the class PWPP contains the search problems in TFNP whose totality can be proved using the weak pigeonhole principle: “In any assignment of $2n$ pigeons to $n$ holes there must be two pigeons sharing the same hole.”
This statement can be seen as a result in extremal combinatorics bounding the maximum number of pigeons that can be assigned to $n$ holes without two pigeons being sent to the same hole. More generally, we say that a theorem from extremal combinatorics is “weak” if it gives an upper bound (which may or may not be tight) on the maximum size of a collection of objects that does not contain some substructure (above, two pigeons sharing the same hole). In contrast, we say that a theorem from extremal combinatorics is “strong” if it gives a tight upper bound on the maximum size of a collection of objects that does not contain some substructure, as well as some structural property of the maximum families without the substructure. For instance, the strong pigeonhole principle can be stated as: “In any assignment of $n$ pigeons to $n$ holes there is either a pigeon in the first hole or two pigeons sharing the same hole.” Note that it is exactly this formulation of the strong pigeonhole principle that defines the class PPP.
Many results in extremal combinatorics have a weak statement and a strong statement. For such results, we can define a problem corresponding to the weak statement, which is often related to PWPP, and a problem corresponding to the strong statement, which is often related to PPP. In this paper, all PWPP-hard problems correspond to a weak theorem in extremal combinatorics, while PPP-hard problems correspond to strong theorems in extremal combinatorics. As an example, consider Cayley’s formula and note that the bound is tight. Hence, if we are given a collection of exactly $n^{n-2}$ distinct graphs on $n$ vertices, then either one of the graphs is not a spanning tree, or every spanning tree is in the collection. This observation induces a TFNP problem that we show to be PPP-complete.
1.4 Related Work
Compared to the majority of subclasses of TFNP that have been extensively studied and are known to capture various total search problems from diverse domains of mathematics, PPP and PWPP might seem less expressive and the first non-trivial completeness results appeared only recently.
Sotiraki, Zampetakis, and Zirdelis [SZZ18] and Ban, Jain, Papadimitriou, Psomas, and Rubinstein [BJP+19] demonstrated that PPP contains computational problems from number theory and the theory of integral lattices. In particular, Sotiraki et al. showed PPP-completeness of a computational problem related to Blichfeldt’s theorem and PPP-completeness (resp. PWPP-completeness) of a problem motivated by the Short Integer Solution problem. Hubáček and Václavek [HV21] showed that some general formalizations of the discrete logarithm problem are complete for PWPP and PPP and, motivated by classical constructions of collision-resistant hashing, they characterized PWPP via the problem of breaking claw-free (pseudo-)permutations.
1.5 Open Problems
Our work suggests various interesting directions for future research:
• We exploit the power of strong statements in extremal combinatorics for establishing PPP-completeness. The notorious lack of tight bounds for the Erdős-Rado sunflower lemma and Ramsey’s theorem implies that we have no strong version of these theorems, which may explain why showing the inclusion of the corresponding problems in, e.g., PPP has eluded researchers.
• We introduced total search problems corresponding to Mantel’s theorem, Turán’s theorem, and Ward-Szabo’s theorem. In this work, we only prove hardness results for these problems but no inclusion results. Hence, it is still open whether they are complete for the classes PPP and PWPP, or whether they could define a new subclass of TFNP.
• The Turán_r problem is defined in a similar fashion to Mantel, yet, unlike for Mantel, we currently do not have a proof of PPP-hardness for it. Thus, the question of PPP-hardness of Turán_r is immediate. Alternatively, it would be interesting to define a different PPP-hard problem in a natural way from Turán’s theorem.
• Another exciting question is whether the efficient Baranyai assumption (4.18) holds, as well as whether it is possible to prove the inclusion results for the problems associated with the general version of the Erdős-Ko-Rado theorem without that assumption. Showing reductions between general-Erdős-Ko-Rado_k and general-Erdős-Ko-Rado_l for $k \neq l$ without the efficient Baranyai assumption would also be intriguing.
• Finally, we believe the problems deserve a more thorough investigation to further our understanding of the classes they define and their interrelation.
2 Preliminaries
We denote by the binary logarithm of . We denote by the set . We interpret elements of as strings and write them as for . Each element is also called a bit. We say is the length of , and say is an -bit string. We denote by (resp. ) the -bit string consisting of all 0 (resp. 1). If are two strings of lengths , respectively, we denote by the concatenation of and . We denote by the lexicographical order on strings. Note that is a partial order as it is only well-defined for strings of the same length. We use to denote and . We may occasionally abuse notation and write where , in which case we mean the binary encoding of on the same number of bits as . If exceeds the length of , we define such that the order is total.
If is a set of size , we associate the set with the characteristic vectors from for some arbitrary (but fixed) order on . We denote by the partial order on where iff for every . If is a string, we denote by the complement of , defined by . We also use other set-theoretic operators that are defined in a natural way. We also denote by the number of 1s in when the length is implicit from the context.
2.1 Total Search Problems
A search problem is defined by a binary relation – a string is a solution for an instance if . A search problem defined by relation is total if for every , there exists an such that . We define TFNP as the class of all total search problems that can be efficiently verified, i.e., there is a deterministic polynomial-time Turing machine that, given , outputs 1 if and only if and, for every instance , there exists a solution of polynomial length in the size of .
To avoid unnecessarily cumbersome phrasing throughout the paper, we define TFNP relations implicitly by presenting the set of valid instances recognizable in polynomial time (in the length of an instance) and, for each instance , the set of admissible solutions for the instance . It is then implicitly assumed that, for any invalid instance , we define the corresponding solution set as .
Next, we recall the definitions of the complexity classes PWPP and PPP via their canonical complete problems weak-Pigeon and Pigeon.
Definition 2.1 (weak-Pigeon and PWPP [Jeř16]).
The problem weak-Pigeon is defined by the relation
- Instance: A Boolean circuit $C\colon \{0,1\}^n \to \{0,1\}^{n-1}$.
- Solution: $x \neq y \in \{0,1\}^n$ s.t. $C(x) = C(y)$.
The class of all TFNP problems reducible to weak-Pigeon is called PWPP.
Definition 2.2 (Pigeon and PPP [Pap94]).
The problem Pigeon is defined by the relation
- Instance: A Boolean circuit $C\colon \{0,1\}^n \to \{0,1\}^n$.
- Solution: One of the following:
  i) $x \in \{0,1\}^n$ s.t. $C(x) = 0^n$,
  ii) $x \neq y \in \{0,1\}^n$ s.t. $C(x) = C(y)$.
The class of all TFNP problems reducible to Pigeon is called PPP.
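To make the two solution types concrete, here is a minimal Python sketch of verifiers for weak-Pigeon and Pigeon, together with an exponential-time search that is only meant for tiny inputs. As an assumption for illustration, a circuit is modelled as a Python function on integers, with the integer 0 standing for the all-zero string; the function names are hypothetical.

```python
def is_weak_pigeon_solution(circuit, x, y):
    """weak-Pigeon: two distinct inputs with the same output."""
    return x != y and circuit(x) == circuit(y)

def is_pigeon_solution(circuit, solution):
    """Pigeon: a preimage of the all-zero string (1-tuple) or a collision (pair)."""
    if len(solution) == 1:
        (x,) = solution
        return circuit(x) == 0
    x, y = solution
    return x != y and circuit(x) == circuit(y)

def brute_force_pigeon(circuit, n):
    """Exhaustive search over all 2^n inputs; assumes outputs lie in [0, 2^n)."""
    seen = {}
    for x in range(2 ** n):
        out = circuit(x)
        if out == 0:
            return (x,)  # type i): x is mapped to 0^n
        if out in seen:
            return (seen[out], x)  # type ii): a collision
        seen[out] = x
    raise AssertionError("unreachable by the pigeonhole principle")
```

The final `raise` is never reached for a well-formed circuit: if no input maps to 0, then 2^n inputs share at most 2^n - 1 outputs, so a collision must be found.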
3 Property-Preserving Encodings
A key ingredient to our proofs of inclusion in PWPP and PPP is the use of efficient encodings. We rely on two different types of encodings. The first one simply consists of bijections between two different representations of the same set of objects, the first one being more natural and more convenient to work with, and the second one being more concise. The second type of encodings, which we call property-preserving encodings, consists of shrinking functions, in the sense that the range of the encoding is smaller than the domain, whose collisions exactly correspond to elements sharing some property. The following definition gives a precise description of the features we require from these encodings.
Definition 3.1 (Property-preserving encoding).
Let be sets, and let be an equivalence relation on . Let be a surjection. We say that constitutes a property-preserving encoding for on if it satisfies the following conditions.
-
•
(Efficiency). can be computed in polynomial time.
-
•
(Compression). .
-
•
(-correctness). is constant on every coset of for .
We first describe some bijective encodings before studying some property-preserving encodings.
3.1 Cover Encodings
Our reductions in Section 4 make use of Cover encodings [Cov73] that efficiently encode subsets of a specified size in optimal space: namely, we may encode every subset such that by considering the lexicographic order of all such sets (in fact we consider the lexicographic order over their characteristic vectors ), and mapping this into binary strings: this requires bits, which is optimal. We denote the encoding and decoding functions as follows, with .
We set and , and . As described in [Cov73], these functions can be made efficient.
Lemma 3.2.
For every , is the identity over all -subsets of . Similarly, is the identity over the first elements in the lexicographic order of .
Note that the behavior of is undefined for the last inputs. Furthermore, by design, is well-defined on any subset of (even if this subset does not have size ), but the encoding only makes sense for subsets of size . We also note the following identity which will be useful later when dealing with -subsets of .
(1) |
Remark 3.3.
When we encode $n$-subsets of $[2n]$, since we encode sets according to the rank of their characteristic vector in the lexicographic order, any set that does not contain element 1 is one of the first $\binom{2n-1}{n}$ ones in the lexicographic order, hence its encoding starts with a 0. Conversely, if we decode an element whose first two bits are 0’s, this means that the corresponding $n$-subset of $[2n]$ is one of the first $\binom{2n-1}{n}$ in the lexicographic order, hence that it does not contain the element 1.
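The ranking idea behind Cover encodings can be sketched as follows. This is a minimal illustration assuming that sets are given as 0/1 tuples (characteristic vectors) and that ranks are plain integers rather than fixed-width bit strings; the function names are hypothetical and the code is not taken from [Cov73].

```python
from itertools import product
from math import comb

def cover_rank(bits):
    """Rank of a 0/1 tuple among all tuples of the same length and weight,
    in lexicographic order with 0 < 1."""
    m = len(bits)
    ones_left = sum(bits)
    rank = 0
    for i, b in enumerate(bits):
        if b == 1:
            # Vectors sharing this prefix but with a 0 here come earlier.
            rank += comb(m - i - 1, ones_left)
            ones_left -= 1
    return rank

def cover_unrank(rank, m, k):
    """Inverse of cover_rank for vectors of length m and weight k."""
    bits = []
    for i in range(m):
        zero_count = comb(m - i - 1, k)  # vectors with a 0 at position i
        if rank < zero_count:
            bits.append(0)
        else:
            bits.append(1)
            rank -= zero_count
            k -= 1
    return tuple(bits)

def weight_k_vectors(m, k):
    """All vectors of length m and weight k in lexicographic order."""
    return sorted(v for v in product((0, 1), repeat=m) if sum(v) == k)
```

In this sketch, the vectors whose first coordinate is 0 (sets not containing element 1) are exactly those of rank below `comb(m - 1, k)`, which is the fact used in Remark 3.3.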
3.2 Encoding 2-subsets of
In Section 7, we need to encode the subsets of with 2 distinct elements in an injective way.
Unfortunately, since the base set is large, we cannot use Cover encodings to do so.
However, we can use the idea behind Cover encodings, that is to encode the subsets by their rank in the lexicographic order.
Consider , with . What is its rank in the lexicographic order?
All subsets whose smallest element is smaller than have a lower rank. The number of such subsets is
All subsets whose smallest element is and whose second smallest element is smaller than also have a lower rank. There are exactly such subsets.
Hence, the rank of the subset in the lexicographic order is
Note that since there are subsets of with 2 distinct elements, the rank of any subset with can be written in binary using bits. Now, denote as the following circuit. On input , it proceeds as follows.
1. If , it returns .
2. If , it computes and returns the binary encoding on bits of .
3. If , it computes and returns the binary encoding on bits of .
Note that has polynomial size, and is injective on the set of subsets of with 2 distinct elements by construction.
Remark 3.4.
In fact, this encoding is a bijection from the set of 2-subsets of to the set . The reciprocal of that bijection can also be computed by a circuit of polynomial size.
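The rank computation for 2-subsets can be sketched in Python. As an assumption for illustration, the base set is taken to be {0, ..., size-1} and ranks are returned as integers; the behavior on equal inputs is not modelled, and `pair_unrank` uses a simple linear scan, whereas a polynomial-size circuit would have to solve for the smaller element directly.

```python
from itertools import combinations

def pair_rank(x, y, size):
    """Rank of the 2-subset {x, y} of {0..size-1} in lexicographic order."""
    if x == y:
        raise ValueError("not a 2-subset")
    a, b = min(x, y), max(x, y)
    # Pairs whose smallest element i is below a: (size - 1 - i) pairs for each i.
    before_a = a * (2 * size - a - 1) // 2
    # Pairs with smallest element a and second element below b.
    return before_a + (b - a - 1)

def pair_unrank(rank, size):
    """Inverse of pair_rank (linear scan, for illustration only)."""
    a = 0
    while rank >= size - 1 - a:
        rank -= size - 1 - a
        a += 1
    return a, a + 1 + rank
```

Enumerating `combinations(range(size), 2)` in order and applying `pair_rank` yields the consecutive ranks 0, 1, 2, and so on, which matches the bijection of Remark 3.4 in this toy setting.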
3.3 Prüfer Codes
In Section 6, we make use of Prüfer codes [Pru18] that give an efficiently computable bijection between the set of labelled spanning trees on $n$ vertices and the set of sequences of $n-2$ elements of $[n]$. They were originally used by Heinz Prüfer [Pru18] to prove Classical Theorem 5.
We denote by a circuit that efficiently computes the Prüfer encoding of a spanning tree described by an element of . Similarly, let be a circuit that efficiently computes the spanning tree associated with a Prüfer code. By looking at the algorithm to compute Prüfer encodings, it is clear that we can assume these circuits to have polynomial size. We also assume that outputs elements of the right form even on inputs which do not correspond to spanning trees. Consider the lexicographic order on . Let be a circuit that efficiently computes the rank of an element of , and let . Given a spanning tree, returns the rank of its Prüfer code in the lexicographic order.
Let be a circuit which on input computes the sequence of whose rank in the lexicographic order is . Let . Given a rank, returns the spanning tree whose Prüfer code has the corresponding rank in the lexicographic order. Note that and both have polynomial size. Now, if , then , . By construction, we have the following.
Lemma 3.5.
The following statements are true.
1. is the identity over the set of labelled spanning trees on vertices.
2. is the identity over the first elements of .
Remark 3.6.
The behavior of on its last inputs is undefined.
Remark 3.7.
Let be the tree composed of the edges . Then, and .
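For reference, the standard Prüfer encoding and decoding procedures can be sketched in Python. This is the textbook algorithm, not necessarily the circuit construction used in the paper; as assumptions, trees are given as edge lists on the vertex set {1, ..., n} with n at least 2, and the quadratic-time leaf search is kept for readability.

```python
from itertools import product

def prufer_encode(edges, n):
    """Prüfer code of a labelled tree on vertices 1..n, given as an edge list."""
    adj = {v: set() for v in range(1, n + 1)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    code = []
    for _ in range(n - 2):
        leaf = min(v for v in adj if len(adj[v]) == 1)  # smallest leaf
        (neighbor,) = adj[leaf]
        code.append(neighbor)
        adj[neighbor].discard(leaf)
        del adj[leaf]
    return tuple(code)

def prufer_decode(code, n):
    """Labelled tree on 1..n (as a frozenset of edges) with the given code."""
    degree = {v: 1 for v in range(1, n + 1)}
    for v in code:
        degree[v] += 1
    edges = []
    for v in code:
        leaf = min(u for u in degree if degree[u] == 1)
        edges.append((min(leaf, v), max(leaf, v)))
        degree[leaf] -= 1  # the leaf is now removed from the tree
        degree[v] -= 1
    u, w = [x for x in degree if degree[x] == 1]  # the last two vertices
    edges.append((u, w))
    return frozenset(edges)

def all_codes(n):
    """All sequences of length n - 2 over {1..n}."""
    return list(product(range(1, n + 1), repeat=n - 2))
```

Decoding every sequence returned by `all_codes(5)` gives 125 distinct trees, in line with Cayley’s formula for n = 5.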
3.4 Catalan Factorization
Catalan factorization [EK99] is an encoding of subsets of $[2n]$ that allows us to decompose the partially ordered set of subsets of $[2n]$ (ordered by inclusion) into chains and to move efficiently within each chain to find a canonical representative, namely the only $n$-subset of the chain.
Let be a bitmap representing an element of . We introduce a new symbol , and construct the Catalan factorization as follows. We temporarily record for each symbol whether or not it is underlined.
1. Underline the leftmost substring that starts with a non-underlined 1, followed by a (possibly empty) sequence of underlined symbols, and ends in a non-underlined 0. If no such substring exists, go to step 3.
2. Go to step 1.
3. Record the number of non-underlined 1’s.
4. Replace all non-underlined symbols in with , and let be the resulting string (with underlinings removed).
5. Output .
We denote the output of the Catalan factorization as . We say is the Catalan string of . If and is the number of ’s in , then for any , we define as the string obtained from by replacing the last ’s by and the rest by 0.
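The underlining procedure can be sketched in Python. This is an illustration under assumptions: bit strings are Python strings over "0" and "1", the extra symbol is written "*", and the repeated underlining is replaced by an equivalent stack-based matching in which each 1 is paired with the next unmatched 0 to its right, as in bracket matching. The decoder clamps out-of-range values of the second argument, following the convention the text adopts for such inputs.

```python
from itertools import product

STAR = "*"  # stand-in for the extra symbol (an assumption of this sketch)

def catalan_factorization(x):
    """Return (z, j): the Catalan string of x and its number of unmatched 1s."""
    stack, matched = [], [False] * len(x)
    for i, c in enumerate(x):
        if c == "1":
            stack.append(i)
        elif stack:
            # This 0 closes the most recent open 1; both count as underlined.
            matched[stack.pop()] = matched[i] = True
    z = "".join(c if m else STAR for c, m in zip(x, matched))
    return z, len(stack)

def catalan_decode(z, j):
    """Replace the last j stars of z by 1 and the remaining stars by 0."""
    stars = z.count(STAR)
    zeros_left = stars - min(j, stars)
    out = []
    for c in z:
        if c != STAR:
            out.append(c)
        elif zeros_left > 0:
            out.append("0")
            zeros_left -= 1
        else:
            out.append("1")
    return "".join(out)

def all_bitstrings(length):
    """All 0/1 strings of the given length, for exhaustive checks."""
    return ["".join(p) for p in product("01", repeat=length)]
```

For every string x produced by `all_bitstrings(6)`, decoding the output of `catalan_factorization(x)` returns x, and all strings obtained from one Catalan string by varying j share that Catalan string; these are the two properties that the lemmas of this subsection establish.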
Example 3.8.
Let and let be the string corresponding to the set . Then, we construct the Catalan factorization by repeating step 1 to get the underlined version.
We terminate as there are no non-underlined 0’s with a 1 on its left. We record that there is non-underlined 1. We then replace all non-underlined symbols with to obtain the Catalan factorization.
Note that we have so the encoding and decoding operations behave as expected. Note also that corresponds to the set and corresponds to the set . For this reason, we say that the Catalan string identifies the following chain.
In that chain, identifies that is the element, counting from 0.
Lemma 3.9.
acts as identity over .
Proof.
Let , and be its Catalan factorization. Let be the number of ’s in . We claim that at the end of the underlining phase of the Catalan factorization of , the entries that are not underlined are first 0’s and then 1’s. Indeed, by definition, of them are 1, so of them are 0. Furthermore, if we had a non-underlined 1 before a non-underlined 0, then we could consider the rightmost non-underlined 1 that is before a non-underlined 0. This 1 is followed by a sequence of underlined symbols and then by a non-underlined 0 so this 1 and the corresponding 0 should have been underlined. Thus, we indeed have that the entries that are not underlined are first 0’s and then 1’s. These are the entries that are turned into ’s when we go from to .
Now, when we compute , we replace the last ’s in by 1’s and the other ones by 0’s, which is exactly what we had in . Hence, . ∎
We also denote by the map . If on input , is larger than the number of symbols in , all symbols are replaced with 1; this ensures the map is defined for all .
Lemma 3.10.
For every , acts as identity on the set of Catalan strings. That is, if is a Catalan string, then for every , the Catalan string of is .
Proof.
Let and let be the Catalan string of . Now let , and be the Catalan string of . We want to show that .
We proceed using induction on the steps of the algorithm. At first, no entries are underlined in either string. Next, suppose that after some number of steps, the underlined bits are exactly the same in and in . Now, consider two bits that get underlined in at the next step. Then, all the bits between them are underlined in at this point, so this is also the case in by induction hypothesis. Furthermore, since these two bits get underlined in , they are not turned into ’s at the end of the algorithm, which means that they are still the same bits in and therefore in . Hence, in we have these 2 bits, first a 1 and then a 0, such that every entry between them is underlined, so they get underlined at this step.
Conversely, consider two bits that get underlined in at the next step. Then, all the entries between them in are underlined at this point, so it is the case in too by induction hypothesis. By contradiction, suppose that the corresponding bits in do not get underlined at this step. By the previous observation, it means that this pair of bits in is not . There are three cases to consider:
-
1.
In , these two bits are ’s. Then, the first gets turned into a 1 in , which means that it never gets underlined in (otherwise it would remain the same). Then, since all the bits in between these two are already underlined, and since the first never gets underlined, this means that the second never gets underlined (there will never be a non-underlined 1 before it such that all entries between them are underlined). Hence, these two bits never get underlined in the algorithm, and are finally turned into ’s. Then, to go from to , we replace the last ’s by 1’s and the others by ’s, thus making it impossible for the first of these two bits to be turned into a 1 while the second is turned into a 0.
-
2.
In , these two bits are respectively 0 and 1. Then, both these bits are changed between and , which means that they never get underlined in , hence they are ’s in . Thus, like previously, it is impossible that the first one is turned into a 0 while the second is turned into a 1.
-
3.
In , these two bits are ’s. Then, the second bit gets turned into a 0 in , which means that it never gets underlined in . As in the first case, we get that the first bit never gets underlined either, once more making it impossible for these two bits to be turned into 1 and 0, respectively.
In all three cases, we get a contradiction. Thus, the corresponding bits in are also underlined at this step. Then, by induction, we get that at each step, the same bits are underlined in and . Finally, we turn all the bits that are not underlined into ’s to get and , hence . ∎
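The statement of the lemma can be checked exhaustively on short strings. The following Python sketch is only illustrative. It assumes that the underlining procedure of Section 3.4 coincides with stack-based bracket matching (a 1 opens a pair, a later 0 closes it), and it marks the non-underlined positions with a placeholder symbol `*`. The names `FREE`, `catalan_string`, `fill` and `check_identity` are hypothetical and are not part of the paper's notation.

```python
from itertools import product

FREE = "*"  # placeholder (an assumption) for the symbol written on non-underlined bits


def catalan_string(bits: str) -> str:
    """Keep every matched pair (a 1, then a 0, with only matched
    entries in between) and replace all remaining bits by FREE."""
    stack = []       # positions of 1's still waiting for a partner
    matched = set()
    for i, b in enumerate(bits):
        if b == "1":
            stack.append(i)
        elif stack:  # a 0 closes the nearest unmatched 1 to its left
            matched.add(stack.pop())
            matched.add(i)
    return "".join(b if i in matched else FREE for i, b in enumerate(bits))


def fill(cat: str, j: int) -> str:
    """Replace the last j FREE symbols by 1 and all other FREE symbols by 0."""
    free = [i for i, c in enumerate(cat) if c == FREE]
    out = list(cat)
    for pos, i in enumerate(free):
        out[i] = "1" if pos >= len(free) - j else "0"
    return "".join(out)


def check_identity(length: int) -> bool:
    """Check, for every bit string of the given length, that filling its
    Catalan string with any number j of 1's gives back the same Catalan string."""
    for bits in product("01", repeat=length):
        cat = catalan_string("".join(bits))
        k = cat.count(FREE)
        if any(catalan_string(fill(cat, j)) != cat for j in range(k + 1)):
            return False
    return True
```

Under these assumptions, `check_identity(6)` runs over all 64 strings of length 6 and all admissible values of the fill parameter.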
Remark 3.11.
We can define an equivalence relation over the subsets of by saying that two subsets are equivalent if and only if they have the same Catalan string.
By combining Catalan factorization and Cover encodings, we can obtain a property-preserving encoding for on .
We use this in Section 5.
4 Erdős-Ko-Rado Theorem on Intersecting Families
In this section, we define total search problems motivated by the well-known Erdős-Ko-Rado theorem on intersecting families and study their computational complexity. First, we present a PWPP-complete variant of the problem. Next, we modify the problem using a strong statement of the Erdős-Ko-Rado theorem to get a PPP-complete variant.
Recall the definition of an intersecting family and the statement of the Erdős-Ko-Rado theorem.
Definition 4.1 (Intersecting family).
Let $X$ be any set. A family $\mathcal{F} \subseteq 2^X$ of sets is an intersecting family if no two sets are disjoint, i.e., if for any $A, B \in \mathcal{F}$, it holds that $A \cap B \neq \emptyset$.
Classical Theorem 1 (Erdős-Ko-Rado [EKR61]).
Any intersecting family where each set has elements on a universe of size contains at most sets, and this bound is tight.
We start by defining a total search problem motivated by a special case of the Erdős-Ko-Rado theorem for families of -sets in a universe of size presented in the following corollary.
Corollary 4.2.
Any intersecting family where each set has $n$ elements on a universe of size $2n$ contains at most $\binom{2n-1}{n-1}$ sets, and this bound is tight. Furthermore, if $\mathcal{F}$ is an intersecting family of maximum size, then for every $n$-subset $S \subseteq [2n]$, exactly one of $S$ and $[2n] \setminus S$ is in $\mathcal{F}$.
Suppose that we have a collection, containing more than sets of size on elements. Then, by Classical Theorem 1, there must be two sets that do not intersect. This induces a total search problem of finding two such disjoint sets. We consider an implicit representation of such a collection by a circuit whose inputs serve as indices in the collection. The output of the circuit is a representation of the corresponding set as a characteristic vector of the elements. Of course, this representation does not guarantee that satisfies the conditions required for Classical Theorem 1 to apply, which would make the problem not total; in this case, we allow evidence of this fact to be a solution to the problem. Namely, if for a given input , we do not have , or two distinct indices represent the same set, i.e., , we allow such inputs as solutions.
Definition 4.3 (weak-Erdős-Ko-Rado).
The problem weak-Erdős-Ko-Rado is defined by the relation
- Instance:
-
A Boolean circuit .
- Solution:
-
One of the following:
-
i)
s.t. ,
-
ii)
s.t. ,
-
iii)
s.t. .
As we discussed in the introduction, the totality of this problem is proved using a “weak” statement in extremal combinatorics, namely the first part of Corollary 4.2, hence the name Weak. However, the analogy with weak-Pigeon goes further. Indeed, our first main theorem is the following.
Theorem 4.4.
weak-Erdős-Ko-Rado is PWPP-complete.
Throughout this section, we maintain .
Lemma 4.5.
.
Proof.
At a high level, we want to encode the sets using a shrinking circuit, in such a way that collisions correspond to disjoint sets. Observe that for -sets in a universe of size , the only disjoint sets are complements, hence we get an equivalent instance of weak-Erdős-Ko-Rado if we map each set to either itself or its complement, arbitrarily. In our construction, we map each set to the representative not containing 1. That is, if , the set is left unchanged and, otherwise, it is mapped to its complement . Note that by the pigeonhole principle, two sets that do not contain 1 must have a non-empty intersection since we work with -subsets of . To obtain a shrinking circuit, we make use of Cover encodings (Section 3.1) that give an optimal encoding of all -sets by considering their lexicographic order. Notice that if the input is not an -set, we may map it arbitrarily to any -set, as a collision, in this case, yields a solution to the weak-Erdős-Ko-Rado instance.
Formally, recall that we have and . Now let be an instance of weak-Erdős-Ko-Rado. We proceed to construct an instance of weak-Pigeon as follows:
Note that since we only encode sets whose first bit is a , by Remark 3.3, we get that the first bit of the encoding always is a , so we can consider only the last bits of for every , which is why we say that only outputs bits. Note also that if for some , does not have size , then and are still well-defined, even if they are meaningless.
Now, suppose that we have a solution to , that is such that . There are four cases to consider, depending on the first bits of . If , then . If both and have size , then by injectivity of on inputs of size (see Lemma 3.2), we get , which is a solution to weak-Erdős-Ko-Rado. If one of them does not have size , we also get a solution to weak-Erdős-Ko-Rado. The other cases are similar. ∎
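The encoding used in this proof can be illustrated on a small universe. The sketch below is a hedged illustration rather than the construction itself: it uses a plain lexicographic ranking of characteristic vectors (with 0 < 1, counting from 0) as a stand-in for the Cover encoding of Section 3.1, and the names `lex_rank`, `encode` and `n_subsets` are hypothetical.

```python
from itertools import combinations
from math import comb


def lex_rank(vec):
    """Rank of a 0/1 tuple among all tuples of the same length and weight,
    in lexicographic order with 0 < 1 (counting from 0)."""
    rank, ones_left = 0, sum(vec)
    for i, bit in enumerate(vec):
        if bit == 1:
            # all vectors with the same prefix and a 0 at position i come first
            rank += comb(len(vec) - i - 1, ones_left)
            ones_left -= 1
    return rank


def encode(vec):
    """Map an n-subset of [2n] (as a characteristic vector) to the rank of
    its representative that does not contain element 1."""
    if vec[0] == 1:  # the set contains element 1: use its complement instead
        vec = tuple(1 - b for b in vec)
    return lex_rank(vec)


def n_subsets(n):
    """Characteristic vectors of all n-subsets of a universe of size 2n."""
    size = 2 * n
    return [tuple(int(i in s) for i in range(size))
            for s in combinations(range(size), n)]
```

For n = 3, every code produced by `encode` lies below `comb(5, 3)`, which is half the number of 3-subsets of a 6-element universe, and two distinct 3-subsets receive the same code exactly when they are complements of each other, that is, exactly when they are disjoint.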
Remark 4.6.
Consider the circuit , defined as follows.
Let be the subset of corresponding to the -subsets of . We define an equivalence relation on by saying that two strings are equivalent if the corresponding subsets are either equal or disjoint. Note that this relation is transitive only because we work with -subsets of .
Then, we have that is a property-preserving encoding for on .
Furthermore, the property that is preserved by is such that if two of its inputs collide, they form a solution to the problem we’re interested in.
Then, to prove the inclusion of weak-Erdős-Ko-Rado into PWPP, it suffices to compose our instance of weak-Erdős-Ko-Rado with .
Lemma 4.7.
weak-Erdős-Ko-Rado is PWPP-hard.
Proof.
Our goal is for the weak-Erdős-Ko-Rado solver to find collisions in an instance of weak-Pigeon. We use a variation of the graph hash product [Kra05, KNY19]. The idea is to interpret the output of as an index into the collection of all -sets that do not contain 1. We then use the Cover decoding function to obtain a representation of the corresponding set, and by correctness of the encoding, any such set must have exactly elements – and all the sets intersect since they do not contain 1. Hence, the only solutions to the weak-Erdős-Ko-Rado instance are collisions, which yield solutions to the original circuit .
Formally, let be an instance of weak-Pigeon. Let be the minimal integer such that . Then, . We proceed to build a circuit whose size is polynomial in and such that from any collision in we can efficiently find a collision in . Recall that we have and . We define by
By Remark 3.3, since for every , is one of the first possible inputs, we have that the set is an -subset of which does not contain the element 1. We observe that defines an instance of weak-Erdős-Ko-Rado. Now suppose that we have a solution to this instance. By correctness of the decoding, we can only have solutions of type iii), that is such that . By injectivity of on its first inputs (see Lemma 3.2), we get that hence and from there we can retrieve a collision for .∎
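The decoding step of this reduction can be sketched in the same spirit. The code below is an illustration under assumptions: lexicographic unranking of fixed-weight vectors (0 < 1, counting from 0) stands in for the Cover decoding function, and `lex_unrank` and `decoded_family` are hypothetical names.

```python
from math import comb


def lex_unrank(rank, length, weight):
    """Return the 0/1 tuple of the given length and weight that has the
    given rank in lexicographic order (0 < 1, counting from 0)."""
    vec = []
    for i in range(length):
        zeros_first = comb(length - i - 1, weight)  # vectors with a 0 here
        if rank < zeros_first:
            vec.append(0)
        else:
            vec.append(1)
            rank -= zeros_first
            weight -= 1
    return tuple(vec)


def decoded_family(n):
    """Decode every index below C(2n-1, n) into an n-subset of [2n]."""
    return [lex_unrank(idx, 2 * n, n) for idx in range(comb(2 * n - 1, n))]
```

In this sketch, every decoded vector has weight n and starts with a 0, so the decoded sets avoid element 1. Any two of them intersect, because both are n-subsets of the remaining 2n - 1 elements. Distinct indices decode to distinct sets, so only collisions in the index can produce a solution.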
PPP-completeness using the tight bound
We remark that Corollary 4.2 gives a tight upper bound on the size of the collection. Furthermore, we know some structure of any collection whose size is exactly one : it must either not be an intersecting family, or it must contain either or . This is an example of a “strong” theorem in extremal combinatorics. As discussed in the introduction, this observation allows us to modify the problem to create a variant of weak-Erdős-Ko-Rado that is to weak-Erdős-Ko-Rado what Pigeon is to weak-Pigeon. The idea is to let encode a collection whose size exactly matches the threshold. We then let represent a collection of exactly sets, and also allow preimages of and as solutions. We show that modifying the problem in this manner makes it PPP-complete, thus strengthening the analogy with Pigeon. This technique is quite general, and we use it again in later sections.
Definition 4.8 (Erdős-Ko-Rado).
The problem Erdős-Ko-Rado is defined by the relation
- Instance:
-
A Boolean circuit .
- Solution:
-
One of the following:
-
i)
s.t. and ,
-
ii)
s.t. and ,
-
iii)
s.t. and ,
-
iv)
s.t. or and .
Theorem 4.9.
Erdős-Ko-Rado is PPP-complete.
Lemma 4.10.
Erdős-Ko-Rado is PPP-hard.
Proof.
This proof is similar in spirit to that of Lemma 4.7, except for some minor changes. The first one is that the instance of Pigeon might be a permutation, and thus not have collisions. We then need to be able to find the preimage of 0. This is done by solutions of type . The second one is that we only look at the first inputs of the Pigeon instance, so we have to modify it to make sure that all the possible solutions come from here. This is why we build the circuit .
Formally, let be an instance of Pigeon, and let be the minimal integer such that . Since , we have . Define by,
It might be the case that the output of has fewer than bits, in which case we pad it with 0’s on the left to make it an -bit string. Recall that we have and .
We proceed to build an instance of Erdős-Ko-Rado by setting . Note that for any , we have , thus is an -subset and does not contain the element 1 by Remark 3.3.
Now, suppose that we have a solution to . Since the index of a solution is , the corresponding subset(s) must have size and can’t contain . If the solution is of the form such that then we have so we must have either or , which is not possible.
Thus, any solution must be such that or such that or . There are two cases to consider:
-
•
Case . Then since is injective on its first inputs. But has range so any collision in must result from a collision in . Hence, we get that give us a solution to .
-
•
Case or . Since then does not contain element 1, so , thus . This means that we have and corresponds to a preimage of for .
In each case, we get a solution to our original problem. ∎
Remark 4.11.
We often use this technique of creating a circuit from a circuit , such that any collision (resp. preimage of 0) in must come from a collision (resp. preimage of 0) in , and happen in the first inputs of (in the range where we want it to happen).
Lemma 4.12.
.
Proof.
This proof is very similar to the proof of Lemma 4.5, with two minor differences. The first one is that in the instance of Pigeon we create, there might be preimages of 0. These solutions to Pigeon correspond to solutions of type for Erdős-Ko-Rado. The second difference is that we only perform the reduction on the first inputs, and then map the others in such a way that they neither create a collision nor result in a preimage of 0.
Formally, suppose that we have an instance of Erdős-Ko-Rado, i.e., a circuit . We proceed to construct an instance of Pigeon as follows:
In the case , since we only encode sets whose first bit is a , by Remark 3.3, we get that the first bit of the encoding always is a , so we can consider only the last bits of for every such . Furthermore, if we consider the output of as an integer, we get that this integer is (because the set we encode is one of the first in the lexicographic order). Note also that if for some such that , does not have size , then is still well-defined and less than , even if it is meaningless.
Now, suppose that we have a solution to of the form such that . Again there are four cases to consider, depending on the first bits of . If then . If both and have size , then by injectivity of on inputs of size (see Lemma 3.2), we get , which is a solution to Erdős-Ko-Rado. If one of them does not have size , we also get a solution to Erdős-Ko-Rado. The other cases are similar.
Now, suppose that we have a solution to of the form such that . Like previously, we get that . If does not have size then is a solution. Now, suppose that has size . There are two cases to consider, depending on the first bit of . If the first bit of is 0, then, so by Eq. 1 and Lemma 3.2. Thus, . Instead, if the first bit of is 1, then so and thus . In either case, we get a solution to our original problem. ∎
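The role of the preimages of 0 can be made concrete on a small example. The following sketch rests on the same assumption as before, namely that a lexicographic ranking (0 < 1, counting from 0) may stand in for the Cover encoding. The names `code` and `zero_preimages` are hypothetical. It lists the n-subsets of {1, ..., 2n} whose code is 0.

```python
from itertools import combinations
from math import comb


def code(vec):
    """Rank of the representative (the set or its complement) that avoids
    element 1, among all vectors of the same length and weight."""
    if vec[0] == 1:
        vec = tuple(1 - b for b in vec)
    rank, ones_left = 0, sum(vec)
    for i, bit in enumerate(vec):
        if bit:
            rank += comb(len(vec) - i - 1, ones_left)
            ones_left -= 1
    return rank


def zero_preimages(n):
    """All n-subsets of {1, ..., 2n} that are mapped to the code 0."""
    universe = range(1, 2 * n + 1)
    result = set()
    for s in combinations(universe, n):
        vec = tuple(int(e in s) for e in universe)
        if code(vec) == 0:
            result.add(frozenset(s))
    return result
```

In this sketch, the only sets with code 0 are {n+1, ..., 2n} (first bit 0) and its complement {1, ..., n} (first bit 1), which matches the two cases distinguished at the end of the proof.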
Remark 4.13.
Like previously, the idea behind that proof is to compose our instance of Erdős-Ko-Rado with the property-preserving encoding we defined in Remark 4.6. However, this time it is not only the collisions that are of interest to us, but also the preimages of the 0 string.
4.1 A Generalized Erdős-Ko-Rado Problem
For the previous problems, we were only considering a very restricted version of the Erdős-Ko-Rado theorem, namely for an intersecting family of -subsets of . We now consider a more general version where we consider an intersecting family of -subsets of for some .
We now fix some for the rest of this section. The Erdős-Ko-Rado theorem states that if is an intersecting family where each set has elements on a universe of size , then contains at most sets. Then, we can define the following TFNP problem, very similar to weak-Erdős-Ko-Rado.
Definition 4.14 (weak-general-Erdős-Ko-Radok).
The problem weak-general-Erdős-Ko-Radok is defined by the relation
- Instance:
-
A Boolean circuit .
- Solution:
-
One of the following:
-
i)
s.t. ,
-
ii)
s.t. ,
-
iii)
s.t. .
Proposition 4.15.
weak-general-Erdős-Ko-Radok is PWPP-hard.
Proof.
This proof is very similar to the proof of Lemma 4.7, except that instead of working with -subsets of , we work with -subsets of . There is also a technical change, which is that this time we work with -subsets of that do contain the element 1. This is necessary to make sure that we have an intersecting family, but it adds some more technicality. For the same reason, we need to shrink more than in the previous proof. However, the idea behind the proof is exactly the same, with the same use of the graph-hash product on a large intersecting family.
Formally, let be an instance of weak-Pigeon. Let be the minimal integer such that . Now, let . Then, . We also define . By definition of , we have . We also have , so . Like in the proof of Lemma 6.5, we can build a circuit whose size is polynomial in and such that from any collision in we can efficiently find a collision in . Let be the binary encoding on bits of . We use the Cover encoding functions for -subsets of : and .
We define by . For every , we have that is one of the first elements of in the lexicographic order, hence it is one of the first . Thus, the rank of in the lexicographic order is between and counting from 0. The last -subsets of in the lexicographic order correspond to subsets that contain the element 1. Hence, for every , we have that the set is an -subset of which contains the element 1. We observe that defines an instance of weak-general-Erdős-Ko-Radok.
Now, suppose that we have a solution to this instance. We consider each solution type separately.
-
i)
It cannot be such that because is an -subset of .
-
ii)
By the previous, and so , which is a contradiction.
-
iii)
By injectivity of on its first inputs (see Lemma 3.2), we get that hence and from there we can retrieve a collision for .∎
To prove that , we present some useful definitions and results related to the Erdős-Ko-Rado theorem.
Definition 4.16.
If divides , a -parallel class is a set of -subsets of which partition .
Classical Theorem 2 (Baranyai, [Bar73]).
If divides , we can define -parallel classes such that each -subset of appears in exactly one .
Remark 4.17.
Note that this result proves the Erdős-Ko-Rado theorem in the case where the size of the subsets divides the size of the universe.
Note also that up to renaming the elements, we can assume that consists exactly of the sets , and .
However, all known proofs of this theorem are inefficient, in the sense that there is no known way to define such that given a -subset of , we can find in polynomial time the only such that this subset appears in . We make this assumption explicit.
Assumption 4.18 (efficient Baranyai assumption).
There is an efficient procedure to define and a circuit which takes as input a -subset of and returns the only index such that this subset appears in . Furthermore, we assume that consists exactly of the sets , and .
Proposition 4.19.
Under 4.18, .
Proof.
At a high level, the proof goes as follows. We are given strictly more than subsets of . We map them to elements of in the following way. If one set does not have size , we map it anywhere. If it has size , we map it to the only such that the set is in . This defines an instance of weak-Pigeon. In any collision for this instance, we must have either a set that does not have size , or two sets in the same parallel class, which means that either they are equal, or they do not intersect.
Formally, by assumption, we have a circuit which takes as input an -subset of and returns the only index such that this subset appears in . We define a circuit which takes as input an -subset of and returns the binary encoding on bits of the only index such that this subset appears in . Now, suppose that we have an instance of weak-general-Erdős-Ko-Radok. We set . Then, we have so is an instance of weak-Pigeon.
Now, suppose that we have a solution to this instance of weak-Pigeon, that is such that . Then, . If one of does not have size , we have a solution to our instance of weak-general-Erdős-Ko-Radok, and similarly if . Otherwise, it means that are distinct -subsets of that appear in the same -parallel class. By definition of a parallel class, it means that these 2 sets are part of a partition of , hence they don’t intersect and they form a solution to our original instance of weak-general-Erdős-Ko-Radok. ∎
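To illustrate the kind of index map that the assumption asks for, consider the special case of subsets of size 2, where the classical round-robin (circle method) construction partitions all pairs of an even-sized universe into perfect matchings. The sketch below covers only this special case. It numbers the elements from 0, it does not enforce the normalization of the first class required by the assumption, and the name `parallel_class` is hypothetical.

```python
from itertools import combinations


def parallel_class(a, b, size):
    """Index in range(size - 1) of the perfect matching that contains the
    pair {a, b}, for distinct a, b in range(size) and even size."""
    last = size - 1  # the element placed at the centre of the circle
    if b == last:
        return (2 * a) % (size - 1)
    if a == last:
        return (2 * b) % (size - 1)
    return (a + b) % (size - 1)


def classes(size):
    """Group all pairs of range(size) by the index of their parallel class."""
    groups = {}
    for a, b in combinations(range(size), 2):
        groups.setdefault(parallel_class(a, b, size), []).append({a, b})
    return groups
```

For a universe of size 8 this yields 7 classes of 4 pairwise disjoint pairs each. Mapping a pair to its class index therefore sends any two distinct pairs that collide to disjoint sets, as in the argument above.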
Remark 4.20.
Let be the set of -subsets of . We define an equivalence relation on by saying that two -subsets and of are equivalent if and only if , meaning that they are in the same -parallel class in the partition induced by .
Then, we have that is a property-preserving encoding for on .
Note that two equivalent subsets are either equal or disjoint. Hence, the property that is preserved by is such that if two of its inputs collide, they form a solution to our problem.
Then, to prove the inclusion of weak-general-Erdős-Ko-Radok into PWPP, it suffices to compose our instance of weak-general-Erdős-Ko-Radok with .
The previous two propositions establish the following result.
Theorem 4.21.
Under 4.18, weak-general-Erdős-Ko-Radok is PWPP-complete.
PPP-completeness using the tight bound
As in the case of -subsets of , we can define a “tight” version of the previous problem, which is very similar to Erdős-Ko-Rado.
Definition 4.22 (general-Erdős-Ko-Radok).
The problem general-Erdős-Ko-Radok is defined by the relation
- Instance:
-
A Boolean circuit .
- Solution:
-
One of the following:
-
i)
s.t. and ,
-
ii)
s.t. and ,
-
iii)
s.t. and ,
-
iv)
s.t. or , or…, or and .
First, let’s see why this problem is total. Suppose that we have a list of subsets of . If one of the sets does not have elements, if two of the sets are equal, or if two of the sets don’t intersect, we have a solution. Now, suppose that we have an intersecting family of distinct -subsets of .
Now, consider a collection of -parallel classes such that each -subset of appears in exactly one (which exists by Classical Theorem 2). Up to renaming the elements, we can assume that is composed of the -subsets , , … and .
Since we have an intersecting family of distinct subsets, no two subsets can be in the same , and we have as many subsets as ’s, which means that one of the subsets is in , hence that it is one of the particular subsets we are looking for. This proves that . We then have the following result.
Proposition 4.23.
general-Erdős-Ko-Radok is PPP-hard.
Proof.
Informally, this proof is very much like the proof of Proposition 4.15, with the same technicalities as in the proof of Lemma 4.10. The idea is again to interpret the outputs of an instance of Pigeon as indices into the collection of all the -subsets of which contain the element 1. Solutions of type correspond to preimages of 0. Like for Lemma 4.10, we need to define to make sure that all solutions to our instance of general-Erdős-Ko-Radok indeed come from the instance of Pigeon.
Formally, let be an instance of Pigeon, and let be the minimal integer such that . We set and . Then, . Define by,
It might be the case that the output of has fewer than bits, in which case we pad it with 0’s on the left to make it an -bit string. Let be the binary encoding on bits of . Recall that we have and .
We proceed to build an instance of general-Erdős-Ko-Radok by setting where - represents the subtraction in binary (mod ). For every , we have that is one of the first elements of in the lexicographic order. Thus, the rank of in the lexicographic order is between and counting from 0. The last -subsets of in the lexicographic order correspond to subsets that contain the element 1. Hence, for every , we have that the set is an -subset of which contains the element 1. We observe that defines an instance of general-Erdős-Ko-Radok.
Now, suppose that we have a solution to this instance. We consider each solution type separately.
-
i)
It cannot be such that because is an -subset of .
-
ii)
By the previous, and so , which is a contradiction.
-
iii)
By injectivity of on its first inputs (see Lemma 3.2), we get that hence and from there we can retrieve a collision for by design of .
-
iv)
If it is such that is one of the particular subsets we’re looking for, since we know that , it means that . When we consider -subsets of , the characteristic vector of is the last one in the lexicographic order, which means that . Furthermore, , the rank of in the lexicographic order is between and and is injective on its first inputs. Thus, , which implies that . By definition of , this can only mean that .
In either case, we get a solution to our original problem. ∎
Proposition 4.24.
Under 4.18, .
Proof.
The proof of this result closely resembles the proof of Proposition 4.19. The idea is the same: we are given subsets of . We map each of them to an element of as follows. If a set does not have elements, we map it anywhere, and if it has elements, we map it to the only such that this set is in . This defines an instance of Pigeon. If we have a collision, it results in a solution like before. If we have a preimage of 0, it is a set in , which means it is one of the sets we are looking for. The definition of has some technicality since we need to take care of the last inputs to make sure that they are not involved in a collision or result in a preimage of 0.
More formally, we have by assumption a circuit which takes as input an -subset of and returns the only index such that this subset appears in . We define a circuit which takes as input an -subset of and returns the binary encoding on bits of where is the only index such that this subset appears in .
Now, suppose that we have an instance of general-Erdős-Ko-Radok.
We set
Then, we have so is an instance of Pigeon.
Now, suppose that we have a solution to this instance of Pigeon. There are two cases to consider.
-
1.
It is such that . By construction of (and by definition of ), this means that . We have . If one of does not have size , we have a solution to our instance of general-Erdős-Ko-Radok, and similarly if . Otherwise, it means that are distinct -subsets of that appear in the same -parallel class. By definition of a parallel class, it means that these 2 sets are part of a partition of , hence they don’t intersect and they form a solution to our original instance of general-Erdős-Ko-Radok.
-
2.
It is such that . By construction of , it means that . We have . If does not have size , it is a solution to our original instance. If it has size , it means that it is an -subset of which is in . By assumption, the only such subsets are the particular ones we’re looking for. Hence, is a solution to our original instance of general-Erdős-Ko-Radok.∎
Remark 4.25.
As before, the idea behind that proof is to compose our instance of general-Erdős-Ko-Radok with the property-preserving encoding . However, this time it is not only the collisions that are of interest to us, but also the preimages of the 0 string.
The previous two propositions establish the following result.
Theorem 4.26.
Under 4.18, general-Erdős-Ko-Radok is PPP-complete.
5 Sperner’s Theorem on Largest Antichains
We now turn our attention to a different existence theorem from extremal combinatorics, concerning antichains. We say a family $\mathcal{F}$ of sets is an antichain if for every two distinct $A, B \in \mathcal{F}$, it holds that $A \not\subseteq B$. A well-known theorem by Sperner gives a characterization of the largest antichain. As before, for an appropriate input size, this induces a total search problem of finding two distinct sets $A, B$ for which $A \subseteq B$. As in the previous section, we consider both a weak and a strong version, and prove the weak version to be PWPP-complete, and the strong one PPP-complete.
Classical Theorem 3 (Sperner [Spe28]).
The largest antichain on any universe of $2n$ elements is unique and consists of all subsets of size $n$.
Like before, we consider an implicit representation of the collection of subsets via a circuit whose input corresponds to an index into the collection, and whose output is the characteristic vector of the corresponding set.
Definition 5.1 (weak-Sperner-Antichain).
The problem weak-Sperner-Antichain is defined by the relation
- Instance:
-
A Boolean circuit .
- Solution:
-
s.t. .
Theorem 5.2.
weak-Sperner-Antichain is PWPP-complete.
For the rest of this section, we set .
Lemma 5.3.
weak-Sperner-Antichain is PWPP-hard.
Proof.
We explain the reduction at a high level. We reduce from weak-Erdős-Ko-Rado and create an instance of weak-Sperner-Antichain by including each set from the weak-Erdős-Ko-Rado instance, as well as its complement. If we find a solution to weak-Sperner-Antichain, one of the sets must be contained within another. If one of the two sets does not have size , we obtain a solution to weak-Erdős-Ko-Rado of type i). Otherwise, the duplicated sets must be equal, and hence the original sets are either equal, or one of the sets is the complement of the other.
Formally, suppose that we have an instance of weak-Erdős-Ko-Rado. Write where is a bit. We build an instance of weak-Sperner-Antichain as follows.
Now, suppose that we have a solution to this instance of weak-Sperner-Antichain, that is such that . Write and . There are four cases to consider. If , then and . If and both have size , then , and if this is not the case we get a solution for . In both cases, we get a solution for weak-Erdős-Ko-Rado. The other cases are similar; in all four cases, we get a solution to our original problem, so weak-Sperner-Antichain is PWPP-hard. ∎
Classical Theorem 4 (Dilworth’s Theorem, [Dil50]).
The size of the largest antichain in is equal to the size of the smallest chain partition, namely .
Lemma 5.4.
.
Proof.
We give a high-level overview of the reduction from weak-Sperner-Antichain to weak-Pigeon.
Fix an arbitrary partition into chains of of size (which exists by Classical Theorems 3 and 4). Since we have more than inputs in an instance of weak-Sperner-Antichain, by the pigeonhole principle, two distinct inputs must end up in the same chain. We want to give an identifier to each of these chains, using bits, such that for any subset we are able to quickly find the identifier of the chain to which it belongs. To do so, in each chain, we choose as representative the -subset of the chain, which is guaranteed to exist by Classical Theorem 4. Then, the identifier of the chain is the Cover encoding of this subset. To map a subset to the representative of its chain, we make use of Catalan factorizations (Section 3.4). Once we have this, from each subset we can efficiently get the -subset in its chain and therefore the identifier of the chain. Finally, a collision in the identifiers is equivalent to two elements in the same chain, which means a solution for weak-Sperner-Antichain.
Formally, let be an instance of weak-Sperner-Antichain. We proceed to construct an instance of weak-Pigeon as follows: if , we have which represents a subset of . Let be the Catalan factorization of , be the number of ’s in and the number of bits underlined during the construction of . Note that every time we underline bits we underline simultaneously a 0 and a 1, thus is even. Then, is an even number. Now, let . Then, since has the same number of 1’s and 0’s and since we replaced half of the ’s by 1’s and the other half by 0’s, we have that represents an -subset of . Informally, it is the -subset of the chain that contains , and replacing ’s by 1’s enables us to move inside that chain. Finally, we set . We observe that is an instance of weak-Pigeon.
Now suppose that we have a solution to this instance of weak-Pigeon, that is such that . Then, by injectivity of on the -subsets of (see Lemma 3.2), we get that . Informally, this means that and belong to the same chain and thus that one is contained in the other. Let’s now prove it formally. Let be the Catalan factorization of and be the number of ’s in , and let . We have so by Lemma 3.10, the Catalan string that corresponds to is . Similarly, the Catalan string that corresponds to is . Since , we get . We have that and that by Lemma 3.9, so and . By symmetry of and we can assume that . Then, to go from to we added elements (the ones corresponding to the last ’s in ) while to go from to we added these same elements plus others. Hence, . ∎
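The map from a subset to the middle-layer representative of its chain can be tried out on a small universe. The sketch below is a hedged illustration: it assumes a universe of even size, it assumes that the underlining procedure of Section 3.4 agrees with stack-based bracket matching (a 1 opens a pair, a later 0 closes it), and the names `chain_representative`, `chains` and `comparable` are hypothetical.

```python
from itertools import product


def chain_representative(bits):
    """Middle-layer set of the chain containing `bits`, given as a 0/1
    tuple of even length: unmatched positions get 0's, then 1's, half each."""
    stack, matched = [], set()
    for i, b in enumerate(bits):
        if b == 1:
            stack.append(i)
        elif stack:  # a 0 is paired with the nearest unmatched 1 to its left
            matched.update((stack.pop(), i))
    free = [i for i in range(len(bits)) if i not in matched]
    rep = list(bits)
    half = len(free) // 2
    for pos, i in enumerate(free):
        rep[i] = 0 if pos < half else 1
    return tuple(rep)


def chains(size):
    """Group all subsets of a universe of the given size by representative."""
    groups = {}
    for bits in product((0, 1), repeat=size):
        groups.setdefault(chain_representative(bits), []).append(bits)
    return groups


def comparable(a, b):
    """True if one characteristic vector is contained in the other."""
    return all(x <= y for x, y in zip(a, b)) or all(y <= x for x, y in zip(a, b))
```

For a universe of size 6, `chains(6)` contains 20 groups, one per 3-subset, and any two subsets with the same representative are comparable. This is the property that turns a collision of identifiers into a solution.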
Remark 5.5.
Consider the circuit , defined as follows. On input , it computes the Catalan factorization of , the number of in . Then, it computes and finally returns .
Let . We define an equivalence relation on by saying that two subsets are equivalent if and only if they have the same Catalan string.
Then, we showed in the previous proof that is a property-preserving encoding for on .
Note that we also showed that if we have two equivalent subsets, one is included in the other. Hence, the property that is preserved by is such that if two of its inputs collide, they form a solution to our problem.
Then, to prove the inclusion of weak-Sperner-Antichain into PWPP, it suffices to compose our instance of weak-Sperner-Antichain with .
PPP-completeness using the tight bound
As with Erdős-Ko-Rado, we observe that the bound in Classical Theorem 3 is tight, and we know the unique antichain of size , so we have some structural information about any collection of size . From that strong theorem, employing the same technique as before, we modify the problem to let the circuit represent a collection of that exact size. By Classical Theorem 3, we observe that if is an antichain with , then must contain . This leads us to define the following problem.
Definition 5.6 (Sperner-Antichain).
The problem Sperner-Antichain is defined by the relation
- Instance:
-
A Boolean circuit .
- Solution:
-
One of the following:
-
i)
s.t. and ,
-
ii)
s.t. and .
Theorem 5.7.
Sperner-Antichain is PPP-complete.
Lemma 5.8.
Sperner-Antichain is PPP-hard.
Proof.
Same proof as for Lemma 5.3, by reduction from Erdős-Ko-Rado. Observe that if we have a solution of type ii) for Sperner-Antichain, the corresponding set in the Erdős-Ko-Rado instance is either or , which is one of the desired solutions to Erdős-Ko-Rado. ∎
Lemma 5.9.
.
Proof.
Informally, this proof is the same as the proof of Lemma 5.4, with some additional technical details. First, we need to take care of preimages of 0. The indices corresponding to preimages of 0 correspond to solutions of type . Second, since we only care about the first inputs, we have to make sure that the last ones are not part of a collision, or result in a preimage of 0.
Formally, let be an instance of Sperner-Antichain. We proceed to construct an instance of Pigeon as follows: if , we have which is a subset of . Let be the Catalan factorization of , be the number of ’s in and the number of bits underlined during the construction of . Note that every time we underline bits we underline simultaneously a 0 and a 1, thus is even. Then, is an even number. Now, let
Then, since has the same number of 1’s and 0’s and since we replaced half of the ’s by 1’s and the other half by 0’s, we have that represents an -subset of . Informally, it is the -subset of the chain that contains , and replacing ’s by 1’s enables us to move inside that chain. Finally, we set,
Then is an instance of Pigeon and has polynomial size. Suppose that we have a solution to this instance of Pigeon of the form such that . Then, and so . Let be the Catalan factorization of . Like previously, we get that the Catalan string that corresponds to is . However, and the Catalan string that corresponds to is . Thus, . Now, , so .
Suppose instead that we have a solution to this instance of Pigeon, of the form such that . Like before, we have . Then, by injectivity of on the -subsets of (see Lemma 3.2), we get that . Informally, this means that and belong to the same chain and thus that one is contained in the other. Let be the Catalan factorization of and be the number of ’s in , and let . We have so by Lemma 3.10, the Catalan string that corresponds to is . Similarly, the Catalan string that corresponds to is . Since , we get . We have that and that by Lemma 3.9, so and . By symmetry of and we can assume that . Then, to go from to we added elements (the ones corresponding to the last ’s in ) while to go from to we added these same elements plus others. Hence, . ∎
Remark 5.10.
Like previously, the idea behind that proof is to compose our instance of Sperner-Antichain with the property-preserving encoding we defined in Remark 5.5. However, this time it is not only the collisions that are of interest to us, but also the preimages of the 0 string.
6 Cayley’s Tree Formula
We consider yet another classic theorem from combinatorics, related to spanning trees. A classic result by Cayley establishes the number of spanning trees of the complete graph on vertices. We observe then that if we have a collection of sufficiently many such graphs, either one of the graphs is not a spanning tree, or two spanning trees collide. Note that two isomorphic trees on distinct vertices are not considered a collision. This allows us to define a total search problem of either finding a collision or finding an index not corresponding to a spanning tree. We represent trees using a bitmap on all possible edges, ordered arbitrarily. We show that this problem is equivalent to weak-Pigeon, in a more direct way than for the previous results. As before, the same technique modifies the problem so that it becomes equivalent to Pigeon, and thus PPP-complete.
Classical Theorem 5 (Cayley [Cay89]).
There are exactly spanning trees of the complete graph on vertices.
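As a sanity check, Cayley's count (n^(n-2) labelled spanning trees of the complete graph on n vertices) can be verified by brute force for small n. The sketch below uses the edge-bitmap representation mentioned above; the function names and the choice of edge order are our own assumptions.

```python
from itertools import combinations, product

def is_spanning_tree(n, edges, bitmap):
    """Check whether the edges selected by bitmap form a spanning tree on n vertices."""
    chosen = [e for e, bit in zip(edges, bitmap) if bit]
    if len(chosen) != n - 1:
        return False
    parent = list(range(n))         # union-find forest used to detect cycles

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in chosen:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False            # this edge closes a cycle
        parent[ru] = rv
    return True                     # n - 1 acyclic edges are connected and spanning

def count_spanning_trees(n):
    """Count bitmaps over all possible edges that describe a spanning tree."""
    edges = list(combinations(range(n), 2))
    return sum(is_spanning_tree(n, edges, bm)
               for bm in product((0, 1), repeat=len(edges)))
```

For instance, `count_spanning_trees(4)` enumerates all 64 bitmaps on 6 edges. The search is exponential and is only meant to illustrate the statement, not the implicit setting studied here.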
Definition 6.1 (weak-Cayley).
The problem weak-Cayley is defined by the relation
- Instance:
-
A Boolean circuit .
- Solution:
-
One of the following:
-
i)
s.t. is not a spanning tree (i.e., is not spanning, not connected or contains a cycle),
-
ii)
s.t. .
Theorem 6.2.
weak-Cayley is PWPP-complete.
For the rest of this section, we set .
Lemma 6.3.
.
Proof.
We reduce to weak-Pigeon. Unlike the previous problems, here, we are interested in a very simple algebraic structure, namely equality. Thus, we want collisions in our encoding to correspond to equality. This means that we want an efficiently computable injective encoding of spanning trees. For this, we use Prüfer codes (Section 3.3). We map any input to the Prüfer encoding of and, therefore, a collision either yields a collision in the trees or a graph that is not a spanning tree.
Formally, suppose that we have an instance of weak-Cayley. We may define an instance of weak-Pigeon by setting . We observe that is indeed an instance of weak-Pigeon. By definition, is the rank in the lexicographic order of the Prüfer code of . Now, suppose that we have a solution to this instance, that is such that . Then, . If or is not a spanning tree, then we have a solution to our original instance of weak-Cayley. Otherwise, and are spanning trees, so by injectivity of on the set of labelled spanning trees on vertices (see Lemma 3.5), we have which is a solution to our original instance of weak-Cayley. ∎
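The encoding used in this reduction can be illustrated by a short sketch of the classical Prüfer construction. The helper names (`prufer_code`, `prufer_rank`, `labelled_trees`) and the vertex labelling 0..n-1 are our assumptions; the circuit of Section 3.3 may differ in its details.

```python
from itertools import combinations

def prufer_code(n, edges):
    """Prüfer sequence of a labelled tree on vertices 0..n-1 (requires n >= 2)."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    code = []
    for _ in range(n - 2):
        leaf = min(v for v in adj if len(adj[v]) == 1)   # smallest remaining leaf
        (neighbour,) = adj[leaf]
        code.append(neighbour)
        adj[neighbour].remove(leaf)
        del adj[leaf]
    return tuple(code)

def prufer_rank(n, edges):
    """Lexicographic rank of the code, i.e. the code read as a base-n number."""
    rank = 0
    for symbol in prufer_code(n, edges):
        rank = rank * n + symbol
    return rank

def is_tree(n, edges):
    """n - 1 edges without a cycle (checked with a small union-find)."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            v = parent[v]
        return v

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False
        parent[ru] = rv
    return len(edges) == n - 1

def labelled_trees(n):
    """All labelled spanning trees of the complete graph on n vertices."""
    for edges in combinations(combinations(range(n), 2), n - 1):
        if is_tree(n, edges):
            yield edges
```

On 4 vertices the ranks of the 16 labelled trees are exactly 0, ..., 15, which is the injectivity property the proof relies on.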
Remark 6.4.
Here, we can interpret as a property-preserving encoding on the set of labelled spanning trees on vertices, where the equivalence relation is equality. Hence, this is another proof of inclusion using property-preserving encodings, where we compose the instance of our problem with an appropriate property-preserving encoding. The equivalence relation has to be equality since the only spanning trees that are solutions of weak-Cayley are spanning trees that are equal.
Lemma 6.5.
weak-Cayley is PWPP-hard.
Proof.
We interpret the output of the weak-Pigeon instance as an index into the collection of all labelled spanning trees on vertices. By correctness of the encoding, the output necessarily is a spanning tree and, hence, the only solutions are collisions. We also detail some technical work to get a circuit with the right input size and output size, for which finding collisions allows solving the original instance of weak-Pigeon.
Formally, let be an instance of weak-Pigeon. We define a circuit as follows. For any , write with and . Then, we set . Note that still has polynomial size and that any collision in allows us to retrieve a collision for (like in the Merkle-Damgård construction, see [Mer79]).
Let be the smallest integer such that . Note that is polynomial in . Let . Then, , hence . Now, we define a circuit as follows. For any , write with and . Then, we set . Note that also has polynomial size and that any collision in allows us to retrieve a collision for hence for .
Recall that we have and . We now define an instance of weak-Cayley by setting . Now, suppose that we have a solution to this instance of weak-Cayley. For every , is one of the first elements of in the lexicographic order, so is well-defined and correct (i.e., it indeed returns a spanning tree) on input . Then, this solution must be such that . By injectivity of on its first inputs (Lemma 3.5), we get that and from this we can retrieve a solution to our original instance of weak-Pigeon. ∎
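The domain-extension step used in this proof follows the Merkle-Damgård pattern: a function compressing by one bit is iterated over a longer input, and any collision of the iterated function can be traced back to a collision of the original one. Below is a minimal sketch on bit strings; `extend`, `trace`, `extract_collision` and the toy function `toy_h` are hypothetical names chosen for illustration and do not reproduce the exact circuits defined above.

```python
def extend(h, n, x):
    """Iterate h (n-bit string -> (n-1)-bit string) along a longer bit string x."""
    state = h(x[:n])
    for bit in x[n:]:
        state = h(state + bit)      # n - 1 state bits plus one fresh input bit
    return state

def trace(h, n, x):
    """All inputs fed to h while evaluating extend(h, n, x), in order."""
    inputs = [x[:n]]
    for bit in x[n:]:
        inputs.append(h(inputs[-1]) + bit)
    return inputs

def extract_collision(h, n, x, y):
    """Given x != y of equal length with equal extended values, return a != b with h(a) == h(b)."""
    for a, b in zip(reversed(trace(h, n, x)), reversed(trace(h, n, y))):
        if a != b:                  # last step at which the two evaluations differ
            return a, b
    raise ValueError("x and y must differ")

def toy_h(s):
    """Hypothetical toy compression function from 3 bits to 2 bits."""
    return format((3 * int(s, 2) + 1) % 4, "02b")

# Brute-force a collision of the extended function on 5-bit inputs, then pull it back.
seen = {}
for i in range(32):
    x = format(i, "05b")
    digest = extend(toy_h, 3, x)
    if digest in seen:
        a, b = extract_collision(toy_h, 3, seen[digest], x)
        break
    seen[digest] = x
```

Walking backwards works because the two evaluations agree after the last differing step, so the two differing inputs at that step must have the same image under `h`.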
PPP-completeness using the tight bound
Again, we observe that Classical Theorem 5 gives an exact bound, namely that there are exactly labelled spanning trees on vertices. As before, this leads us to defining the following problem.
Definition 6.6 (Cayley).
The problem Cayley is defined by the relation
- Instance:
-
A Boolean circuit .
- Solution:
-
One of the following:
-
i)
s.t. is not a spanning tree and ,
-
ii)
s.t. and ,
-
iii)
s.t. and , with defined as in Remark 3.7.
Theorem 6.7.
Cayley is PPP-complete.
Lemma 6.8.
Cayley is PPP-hard.
Proof.
This proof is in spirit similar to the proof of Lemma 6.5. We interpret the outputs of the instance of Pigeon as indices in the list of all spanning trees of the complete graph on vertices. Like in previous proofs, we have to define a circuit with sufficiently many inputs such that from any collision (resp. preimage of 0) in we can find a collision (resp. preimage of 0) in the instance of Pigeon. In the instance of Cayley we create, preimages of correspond to preimages of 0.
Let be an instance of Pigeon, and let be the smallest integer such that . Note that is polynomial in . Let . We define as follows.
If necessary, we pad the outputs of on the left by ’s so that they have length (this might be necessary for ). Note that and acts as the identity over , hence any solution to as an instance of Pigeon immediately gives a solution to . Recall that we have and . Then, we define an instance of Cayley by setting .
Now, suppose that we have a solution to this instance of Cayley. Every solution must consist of inputs but by construction of , and is well-defined, correct and injective on this set by Lemma 3.5. This implies that this solution cannot be such that is not a spanning tree. Then, suppose that this solution is such that . By injectivity of on , we get that and from this we can retrieve a solution to our original instance of Pigeon. Now, if this solution is such that then this means that by Remark 3.7 and injectivity of over so . ∎
Lemma 6.9.
.
Proof.
The idea behind the proof is similar to that of Lemma 6.3, using to create an instance of Pigeon except that we restrict the circuit to only apply the first elements of the collection, and set it to the identity on the rest of the inputs. Any preimage of 0 corresponds to a preimage of , and collisions arise from graphs that are not spanning trees, as well as collisions in the Cayley instance. ∎
7 Ward-Szabo Theorem on Swell Colorings
We now focus on a different theorem from extremal combinatorics, and more precisely from extremal graph theory. Let be the complete graph on vertices. An edge-coloring for some is called a swell coloring of if it uses at least 2 colors and if every triangle is either monochromatic or trichromatic. It is rather straightforward to see that in any -coloring of , there must exist a bichromatic triangle. On the contrary, if we color each edge with a different color, we trivially get a swell coloring. The natural question that appears is then to determine the minimal number of colors required to swell-color the complete graph on vertices. This was solved in some cases by Ward and Szabo in 1995.
Classical Theorem 6 (Ward-Szabo [WS95]).
The complete graph on vertices cannot be swell-colored with fewer than colors, and this bound is tight.
From that theorem, we can define a TFNP problem as follows: the input is a coloring of the edges of the complete graph on vertices with colors, as well as three vertices such that to guarantee that at least 2 colors are used in the coloring. A solution is then the vertices of a bichromatic triangle (which is guaranteed to exist by Classical Theorem 6). We also allow extra solutions, one to specify that the edges and have the same color, and one if the coloring of the graph is not consistent.
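For an explicitly given coloring, the swell property is easy to test by enumerating triangles. The following sketch returns a bichromatic triangle if one exists; the function name and the representation of the coloring as a symmetric Python function are our assumptions, and the exhaustive search is exponential in the implicit (circuit) setting considered here.

```python
from itertools import combinations

def bichromatic_triangle(n, color):
    """Return a triangle using exactly two colors, or None if there is none."""
    for a, b, c in combinations(range(n), 3):
        if len({color(a, b), color(b, c), color(a, c)}) == 2:
            return (a, b, c)
    return None

# Two colors on four vertices: some triangle must be bichromatic.
two_coloring = lambda a, b: (a + b) % 2

# XOR of the labels: every triangle on vertices 0..3 is trichromatic,
# so this is a swell coloring with three colors.
xor_coloring = lambda a, b: a ^ b
```

Here `bichromatic_triangle(4, xor_coloring)` returns `None`, since two edges of a triangle sharing a vertex can never receive the same XOR value.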
Definition 7.1 (Ward-Szabo).
The problem Ward-Szabo is defined by the relation
- Instance:
-
The following:
-
1.
A Boolean circuit ; and,
-
2.
Distinct .
- Solution:
-
One of the following:
-
i)
if ,
-
ii)
s.t. ,
-
iii)
Distinct s.t. .
We also define two variants of this problem, whose totality is a consequence of the totality of Ward-Szabo.
In the first one, we allow an extra type of solution, namely the vertices of two distinct triangles with the same “color profile”.
Definition 7.2 (Ward-Szabo-Collisions).
The problem Ward-Szabo-Collisions is defined by the relation
- Instance:
-
The following:
-
1.
A Boolean circuit ; and,
-
2.
Distinct .
- Solution:
-
One of the following:
-
i)
if ,
-
ii)
s.t. ,
-
iii)
Distinct s.t. ,
-
iv)
Two triples, , each with 3 distinct elements, s.t. and , , .
In the second variant, we allow the same extra type of solution, namely the vertices of two distinct triangles with the same “color profile”, with the additional constraint that these triangles should be trichromatic.
Definition 7.3 (Ward-Szabo-Colorful-Collisions).
The problem Ward-Szabo-Colorful-Collisions is defined by the relation
- Instance:
-
The following:
-
1.
A Boolean circuit ; and,
-
2.
Distinct .
- Solution:
-
One of the following:
-
i)
if ,
-
ii)
s.t. ,
-
iii)
Distinct s.t. ,
-
iv)
Two triples , each with 3 distinct elements, s.t. , , , and the triangle is trichromatic.
Theorem 7.4.
.
Proof.
At a high level, we use the weak-Pigeon circuit as the coloring of the graph. If we find a bichromatic triangle, we have found a collision. If we find two triangles with the same “color-profile”, we have also found a collision.
Formally, let us prove that weak-Pigeon reduces to Ward-Szabo-Collisions. Let be an instance of weak-Pigeon. By the Merkle-Damgård construction, we can build a circuit of polynomial size such that finding a collision for allows finding a collision for . We set and . If then we have a collision for . Otherwise, we have . We define a circuit as follows.
Then, we define an instance of Ward-Szabo-Collisions by saying that the coloring is and that .
Now, suppose that we have a solution to this instance of Ward-Szabo-Collisions. Note that the solution cannot be such that by definition of .
If this solution is distinct such that then . which implies a collision for in any case.
If this solution is two triples such that , , , then by symmetry of and , and of and , we can assume . If and , then and so this gives us a collision for . Otherwise, from , from which we can find a collision for .
In all cases, we get a collision for from which we can get a collision for .
∎
Theorem 7.5.
.
Proof.
We describe the proof informally. There are only different “color profiles” possible, which is fewer than the number of triangles containing the vertex . Hence, if we map sufficiently many distinct triangles containing that vertex to their color profile, it defines an instance of weak-Pigeon, and any solution to this instance gives us a solution of type .
Formally, let , be an instance of Ward-Szabo-Collisions.
We consider the “color profile” of some triangles containing the vertex indexed by .
Let be the circuit defined as follows.
For every , write with and . Then, let and .
Then, we set . defines an instance of weak-Pigeon. Suppose now that we have a solution to this instance of weak-Pigeon, that is such that .
Then, define and as above. Since , by construction we have that and that each of these two sets has three distinct elements. Furthermore, implies that and . Hence, we have a solution of type to Ward-Szabo-Collisions.
∎
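The counting argument behind this inclusion can be illustrated on an explicit coloring: triangles through a fixed vertex are mapped to their ordered triple of edge colors, and a repeated triple is reported. This is only a sketch under our own conventions (the name `profile_collision`, the fixed vertex 0, and the order in which the three colors are listed are assumptions).

```python
from itertools import combinations

def profile_collision(n, color, v=0):
    """Find two distinct triangles through v with the same ordered color profile."""
    seen = {}
    others = [u for u in range(n) if u != v]
    for a, b in combinations(others, 2):
        profile = (color(v, a), color(a, b), color(v, b))
        if profile in seen:
            return seen[profile], (v, a, b)
        seen[profile] = (v, a, b)
    return None

# With 2 colors there are 8 profiles but 21 triangles through vertex 0 on 8 vertices.
demo_color = lambda a, b: (a + b) % 2
t1, t2 = profile_collision(8, demo_color)
```

Since there are more triangles through the vertex than profiles, the pigeonhole principle guarantees that the search succeeds in this example.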
Remark 7.6.
The last two theorems prove that Ward-Szabo-Collisions is PWPP-complete. However, notice that the proof of inclusion into PWPP does not use solutions of the first three types. Hence, if we call Ward-Szabo-Collisions’ the problem similar to Ward-Szabo-Collisions but without the first three types of solutions, this new problem is also PWPP-complete. Indeed, the proof of inclusion into PWPP would be similar, and the proof of hardness too, only with fewer cases to consider. Thus, it seems (at least that is how we prove it) that what makes Ward-Szabo-Collisions PWPP-complete is only its last type of solutions. Now, one could wonder how hard this problem becomes if we slightly modify this last type of solutions to make them harder to find. This is exactly what Ward-Szabo-Colorful-Collisions does.
Theorem 7.7.
.
Proof.
We first give an overview of the proof. It is quite similar in spirit to the previous one, but we need to work to avoid getting collisions that would give us 2 monochromatic triangles. This costs an extra bit, hence the inclusion in PPP and not in PWPP. We are given three vertices such that the colors and are distinct (otherwise we have an easy solution to the instance). We create an instance of Pigeon by mapping any vertex to the pair of colors if we don’t have which would be a monochromatic triangle, and to the color otherwise. We need bits to make sure that these two types of outputs don’t collide. We make sure that 0 has no preimage. Then, any solution to the instance of Pigeon must be a collision. If it is a collision from the first case, we found 2 distinct non-monochromatic triangles with the same profile, hence a solution of type or . If it is a collision from the second case, we found 2 non-monochromatic triangles with the same profile.
Formally, let and be an instance of Ward-Szabo-Colorful-Collisions. If then we have a solution to this instance of Ward-Szabo-Colorful-Collisions. Now, suppose . If or , then we have a solution of type to this instance of Ward-Szabo-Colorful-Collisions. Hence, we can suppose that the colors and are all distinct. Furthermore, if , we have a solution of type , so we also assume that . We use the circuit defined in Section 3.2, to encode 2-subsets of using bits.
We define an instance of Pigeon as follows.
Now, suppose that we have a solution to this instance of Pigeon. By construction of , it cannot be such that . Then, it must be such that . Furthermore, by design of , we have . We consider two cases, depending on the first bit of .
-
1.
Suppose the first bit of is a . Then, . If , then we have that otherwise the first bit of would be a 0. Then, the triangle is bichromatic so it’s a solution to our instance of Ward-Szabo-Colorful-Collisions. Similarly, if , then the triangle is bichromatic. Now, if and , then by injectivity of on subsets of 2 distinct elements of . Then, , each has three distinct elements, and either , and , or , and . The triangle is not monochromatic so this gives us a solution to our instance of Ward-Szabo-Colorful-Collisions, either of type if it is trichromatic, or of type if it is bichromatic.
-
2.
Otherwise, suppose that the first bit of is a 0. By construction of , this means that . Furthermore, since , we get that . Then, , each has three distinct elements, and , and . The triangle is not monochromatic since so this gives us a solution to our instance of Ward-Szabo-Colorful-Collisions, either of type if it is trichromatic, or of type if it is bichromatic.∎
7.1 A Hierarchy of Total Search Problems between weak-Pigeon and Pigeon?
In the last proof, we define a reduction to Pigeon where the circuit only has a range of elements. Indeed, we need exactly elements to encode the pairs of colors. We also need exactly elements for the fourth case. However, we can map the anywhere in that case if because such an would give us a bichromatic triangle. Hence, we need colors for this case. We also need 3 extra elements for and . Hence, overall, we only need a range of elements. Thus, we get a reduction from Ward-Szabo-Colorful-Collisions to a problem that is weaker than Pigeon (but stronger than weak-Pigeon), which is the following: given a circuit from bits to bits, either find a collision, or a preimage of one of the first elements.
More generally, we can define the problem as follows.
Definition 7.8 ().
The problem is defined by the relation
- Instance:
-
A Boolean circuit .
- Solution:
-
One of the following:
-
i)
s.t. ,
-
ii)
s.t. is one of the first elements of .
Note that this problem gets harder as decreases. It is trivial for , equivalent to weak-Pigeon for and to Pigeon for .
This problem induces an entire family of intermediary problems between weak-Pigeon and Pigeon. It is not clear how many non-equivalent problems appear in that hierarchy. It is also unclear whether each PWPP-hard problem that is in PPP is in fact equivalent to one of these.
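To fix ideas, here is a brute-force sketch of this family of problems for an explicitly given function. We write k for the number of forbidden outputs and identify the "first" elements with the integers 0, ..., k-1; both the parameter name and this identification are our assumptions, as is the function name `solve_k_pigeon`.

```python
def solve_k_pigeon(f, n, k):
    """Find a preimage of one of the first k values, or a collision, for f on n-bit inputs."""
    seen = {}
    for x in range(2 ** n):
        y = f(x)
        if y < k:
            return ("preimage", x)
        if y in seen:
            return ("collision", seen[y], x)
        seen[y] = x
    return None                     # only possible when k == 0 and f is a permutation
```

For k >= 1 a solution always exists: if no input hits the first k values, the function maps 2^n inputs into fewer than 2^n values and must collide. The exhaustive loop is of course exponential in n; the point of the problem is that the function is given by a small circuit.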
8 Mantel’s Theorem on Triangle-Free Graphs
Next, we move on to another classical theorem in extremal graph theory. It answers the following question: What is the maximum number of edges in a triangle-free graph on vertices?
Classical Theorem 7 (Mantel [Man07]).
If is a triangle-free graph on vertices then , and this bound is tight.
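For small explicit graphs, Mantel's bound (at most ⌊n²/4⌋ edges in a triangle-free graph on n vertices) can be checked exhaustively. The sketch below assumes edges are given as pairs (u, v) with u < v; the function names are ours.

```python
from itertools import combinations

def find_triangle(n, edges):
    """Return the vertices of a triangle in the graph, or None if it is triangle-free."""
    present = set(edges)
    for a, b, c in combinations(range(n), 3):
        if {(a, b), (a, c), (b, c)} <= present:
            return (a, b, c)
    return None

def max_triangle_free_edges(n):
    """Largest number of edges of a triangle-free graph on n vertices (brute force)."""
    all_edges = list(combinations(range(n), 2))
    for size in range(len(all_edges), -1, -1):
        if any(find_triangle(n, es) is None
               for es in combinations(all_edges, size)):
            return size
```

For n = 4 the maximum is 4, attained by the complete bipartite graph with two vertices on each side.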
This gives rise to the following search problem. Suppose that we are given a collection of strictly more than distinct edges for a graph on vertices. Then, by Mantel’s theorem, there must be three of these edges forming a triangle in the graph. The search problem is then to find them. We can turn this problem into a TFNP problem if we also allow evidence that two edges in the collection are in fact the same, or that an edge is in fact a loop. For practical reasons, we demand that the endpoints of every edge are given in the lexicographic order. When the edges are represented implicitly by a poly-sized circuit, we get the following problem.
Definition 8.1 (weak-Mantel).
The problem weak-Mantel is defined by the relation
- Instance:
-
A Boolean circuit .
- Solution:
-
One of the following:
-
i)
Distinct s.t. form a triangle,
-
ii)
s.t. with in the lexicographic order,
-
iii)
s.t. .
Remark 8.2.
Like in the other problems, the size of the collection we receive (in this case, edges) is twice the threshold size (here, ). However, here, we observe that the number of edges we receive as input is greater than the number of possible edges since . Thus, in any instance of weak-Mantel, there must be solutions of type or .
Theorem 8.3.
weak-Mantel is PWPP-hard.
Proof.
To prove this result, we apply the graph-hash product to the complete balanced bipartite graph on vertices.
Formally, let be an instance of weak-Pigeon. We define as follows. For every , write with and . We then set . Note that from any collision for we can retrieve a collision for (by looking at the first bits). Now, we define as follows. For every , write with . We then set . We observe that defines an instance of weak-Mantel. Note that the edges given by correspond to edges of the complete balanced bipartite graph on vertices where one side of the bipartition consists of the first elements in the lexicographic order. In particular, the graph described by is triangle-free, so there is no solution of type . Similarly, by construction of , there can be no solution of type . Thus, any solution to this instance of weak-Mantel is such that . By construction of , this means that and from there we can find a collision for . ∎
Theorem 8.4.
.
Proof.
We give a high-level overview of the proof. Since we have more edges than there are possible distinct edges, we encode the edges injectively, mapping only ill-defined edges to 0. This defines an instance of Pigeon, where a solution can only be a collision, meaning two different indices corresponding to the same edge.
With the circuit defined in Section 3.2, we can encode 2-subsets of using optimally many bits, that is .
Now, consider the following circuit ,
where represents the addition in binary. Note that since the range of is exactly the first elements of in the lexicographic order, if , it must be that .
Let be an instance of weak-Mantel. For every , we set . Then, is an instance of Pigeon.
Now, suppose that we have a solution to this instance of Pigeon. If it is such that , then which means that with so is a solution to our instance of weak-Mantel. If it is such that . If , by the first case we have that is a solution to the instance of weak-Mantel. Now, if , then it means that so . By injectivity of on well-defined inputs (that is inputs of the form with ), this means that which is a solution to our original instance of weak-Mantel. ∎
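The injective edge encoding used in this proof can be sketched as follows. We use the colexicographic rank of a pair as a stand-in for the circuit of Section 3.2 (whose exact definition is not reproduced here), shift it by one, and reserve 0 for ill-formed edges; the name `encode_edge` is ours.

```python
def encode_edge(u, v):
    """Return 0 for an ill-formed edge, and 1 + (rank of {u, v}) when u < v."""
    if not u < v:
        return 0                    # loops and wrongly ordered endpoints
    return 1 + v * (v - 1) // 2 + u # colexicographic rank of the pair, shifted by one

# On 6 vertices the 15 well-formed edges receive the codes 1, ..., 15.
codes = sorted(encode_edge(u, v) for v in range(6) for u in range(v))
```

With this encoding, a preimage of 0 is an ill-formed edge and a collision between well-formed edges is a repeated edge, mirroring the two cases of the proof.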
Remark 8.5.
Similarly to the proof that , we only use the last two types of solutions, which suggests that what makes this problem easier than Pigeon is only the fact that we are given more edges than there are different possible edges in a graph on vertices.
Remark 8.6.
In fact, this last proof shows that weak-Mantel reduces to .
Mantel’s theorem states that there is a unique triangle-free graph on vertices that has edges; it is the complete bipartite graph . Now, consider any labelling of the vertices of . If for every label , the vertices labelled and were on the same side of the bipartition, then all the vertices would be on the same side of the bipartition, which is impossible. Hence, there must be 2 vertices labelled and on different sides of the bipartition, and therefore there must be an edge between them. Thus, the following problem is total.
Definition 8.7 (Mantel).
The problem Mantel is defined by the relation
- Instance:
-
A Boolean circuit .
- Solution:
-
One of the following:
-
i)
Distinct s.t. form a triangle,
-
ii)
s.t. with in the lexicographic order,
-
iii)
s.t. ,
-
iv)
s.t. with when we consider and as integers.
Theorem 8.8.
Mantel is PPP-hard.
Proof.
To prove this result, we do the graph-hash product on the complete balanced bipartite graph on vertices, where one side of the bipartition consists of the first vertices in the lexicographic order. We make sure to map 0 into the edge , which is the only edge satisfying in that graph.
Formally, let be an instance of Pigeon.
We define a circuit as follows. Let . If , we set . If , we set . Otherwise, if , we set . has polynomial size and defines an instance of Mantel.
Now, suppose that we have a solution to this instance of Mantel. Like in the proof of Theorem 8.3, this solution cannot be of type because the graph described by is bipartite hence triangle-free, and it cannot be of type either, by construction. If this solution is of the form such that , by construction of it means that which is a collision for . If this solution is of the form such that with , then by definition of , it can only be that . By construction of , this means that hence is a solution to the original instance of Pigeon.
∎
8.1 Generalization with Turán’s Theorem
Mantel’s theorem investigates the maximum number of edges in a triangle-free graph on vertices. Similarly, one could wonder about the maximum number of edges in a graph on vertices that does not contain a clique on vertices, where is an arbitrary constant. This problem was solved by Turán in 1941.
Classical Theorem 8 (Turán [Tur41]).
If is a graph on vertices that does not contain any -clique, then and this bound is tight when divides .
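The extremal example behind this bound is the balanced complete multipartite graph. The sketch below builds it for small parameters and checks that it avoids an r-clique while meeting the bound (1 - 1/(r-1)) n²/2; we assume the theorem is stated for graphs without an r-clique, so the extremal graph has r - 1 parts, and the function names are ours.

```python
from itertools import combinations

def turan_graph_edges(n, parts):
    """Edges of the complete multipartite graph where vertex v lies in part v % parts."""
    return [(u, v) for u, v in combinations(range(n), 2) if u % parts != v % parts]

def has_clique(n, edges, r):
    """Brute-force test for a clique on r vertices."""
    present = set(edges)
    return any(all(pair in present for pair in combinations(vs, 2))
               for vs in combinations(range(n), r))

def turan_bound(n, r):
    """(1 - 1/(r-1)) * n^2 / 2, which is an integer when r - 1 divides n."""
    return (r - 2) * n * n // (2 * (r - 1))
```

For n = 6 and r = 4 the graph has three parts of size two, 12 edges, and no 4-clique, while any triple of vertices from distinct parts forms a 3-clique.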
Now, suppose that we are given a list of strictly more than edges for a graph on vertices. Then, by Turán’s theorem, if all these edges are distinct, the graph must contain an -clique. This induces a total search problem, namely that of finding the vertices of such a clique. If the edges are given implicitly via a Boolean circuit which on input returns the endpoints of the -th edge, we get the following TFNP problem.
Definition 8.9 (weak-Turán_r).
The problem weak-Turán_r is defined by the relation
- Instance:
-
A Boolean circuit .
- Solution:
-
One of the following:
-
i)
Distinct such that are the edges of an -clique,
-
ii)
s.t. with in the lexicographic order,
-
iii)
s.t. .
Remark 8.10.
Note that can be any polynomial in in the previous definition and it would still define a TFNP problem.
Theorem 8.11.
For every , there is a reduction from to .
Proof.
Let be an instance of . Now, we interpret it as an instance of . Suppose that we have a solution to this instance of .
If we have edges that form an -clique, it suffices to remove some of them to get the edges of an -clique. Otherwise, any solution of type or for immediately translates into a solution of the same type for .
∎
Theorem 8.12.
For every , weak-Turán_r is PWPP-hard.
Proof.
It is enough to notice that is exactly weak-Mantel, which is PWPP-hard by Theorem 8.3. Then, apply Theorem 8.11. ∎
Theorem 8.13.
For every , .
The proof is essentially identical to the proof of Theorem 8.4. In this case too, it appears that what makes the problem easier than Pigeon is that we are given too many edges.
Turán’s theorem states that if divides , there is a unique graph on vertices that does not contain any -clique and that has the maximum number of edges. This graph is the complete -partite graph, where each part has size . Like previously, there must be 2 vertices labelled and with an edge between them. We denote by the largest multiple of that is at most , and set . Thus, the following problem is in TFNP.
Definition 8.14 (Turán_r).
The problem Turán_r is defined by the relation
- Instance:
-
The following:
-
1.
A Boolean circuit ; and,
-
2.
Two integers and .
- Solution:
-
One of the following:
-
i)
if does not divide , or if , or if , or if ,
-
ii)
s.t. with or , and
-
iii)
Distinct such that are the edges of an -clique, and for every ,
-
iv)
s.t. with in the lexicographic order, and ,
-
v)
s.t. , and ,
-
vi)
s.t. with when we consider and as integers, and .
This last problem is in TFNP. However, we cannot adapt the proof of PPP-hardness of Mantel to it in a straightforward way and, in fact, it is open whether this problem is PPP-hard.
References
- [Bar73] Zsolt Baranyai. Infinite and finite sets, vol. 1. proceedings of a colloquium held at Keszthely, June 25 – July 1, 1973. Dedicated to Paul Erdős on his 60th Birthday. J. Symb. Log., 1:91–108, 1973.
- [BJP+19] Frank Ban, Kamal Jain, Christos H. Papadimitriou, Christos-Alexandros Psomas, and Aviad Rubinstein. Reductions in PPP. Inf. Process. Lett., 145:48–52, 2019.
- [Cay89] Arthur Cayley. A theorem on trees. Quarterly Journal of Mathematics, 23:376–378, 1889.
- [Cov73] Thomas M. Cover. Enumerative source encoding. IEEE Transactions on Information Theory, 19(1):73–77, 1973.
- [DGP09] Constantinos Daskalakis, Paul W. Goldberg, and Christos H. Papadimitriou. The complexity of computing a Nash equilibrium. SIAM J. Comput., 39(1):195–259, 2009.
- [Dil50] Robert P. Dilworth. A decomposition theorem for partially ordered sets. Annals of Mathematics, 51:161–166, 1950.
- [EK99] Ömer Egecioglu and Alastair King. Random walks and Catalan factorization. 1999.
- [EKR61] Paul Erdős, Chao Ko, and Richard Rado. Intersection theorems for systems of finite sets. The Quarterly Journal of Mathematics, 12(1):313–320, 1961.
- [ER60] Paul Erdős and Richard Rado. Intersection theorems for systems of sets. Journal of the London Mathematical Society, s1-35(1):85–90, 1960.
- [Erd47] Paul Erdős. Some remarks on the theory of graphs. Bulletin of the American Mathematical Society, 53(4):292–294, 1947.
- [FG18] Aris Filos-Ratsikas and Paul W. Goldberg. Consensus halving is PPA-complete. In Ilias Diakonikolas, David Kempe, and Monika Henzinger, editors, Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2018, Los Angeles, CA, USA, June 25-29, 2018, pages 51–64. ACM, 2018.
- [HV21] Pavel Hubáček and Jan Václavek. On search complexity of discrete logarithm. In Filippo Bonchi and Simon J. Puglisi, editors, 46th International Symposium on Mathematical Foundations of Computer Science, MFCS 2021, August 23-27, 2021, Tallinn, Estonia, volume 202 of LIPIcs, pages 60:1–60:16. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2021.
- [Jeř16] Emil Jeřábek. Integer factoring and modular square roots. J. Comput. Syst. Sci., 82(2):380–394, 2016.
- [JPY88] David S. Johnson, Christos H. Papadimitriou, and Mihalis Yannakakis. How easy is local search? J. Comput. Syst. Sci., 37(1):79–100, 1988.
- [KNY19] Ilan Komargodski, Moni Naor, and Eylon Yogev. White-box vs. black-box complexity of search problems: Ramsey and graph property testing. J. ACM, 66(5), July 2019.
- [Kra05] Jan Krajíček. Structured pigeonhole principle, search problems and hard tautologies. J. Symb. Log., 70(2):619–630, 2005.
- [Man07] Willem Mantel. Problem 28 (Solution by H. Gouwentak, W. Mantel, J. Teixeira de Mattes, F. Schuh and W. A. Wythoff). Wiskundige Opgaven, 18:60–61, 1907.
- [Meh18] Ruta Mehta. Constant rank two-player games are PPAD-hard. SIAM J. Comput., 47(5):1858–1887, 2018.
- [Mer79] Ralph Charles Merkle. Secrecy, Authentication, and Public Key Systems. PhD thesis, Stanford, CA, USA, 1979. AAI8001972.
- [MP91] Nimrod Megiddo and Christos H. Papadimitriou. On total functions, existence theorems and computational complexity. Theor. Comput. Sci., 81(2):317–324, 1991.
- [Pap94] Christos H. Papadimitriou. On the complexity of the parity argument and other inefficient proofs of existence. J. Comput. Syst. Sci., 48(3):498–532, 1994.
- [Pru18] Heinz Prüfer. Neuer Beweis eines Satzes über Permutationen. Archiv der Mathematischen Physik, 27:742–744, 1918.
- [Ram30] Frank P. Ramsey. On a Problem of Formal Logic. Proceedings of the London Mathematical Society, s2-30(1):264–286, 1930.
- [Spe28] Emanuel Sperner. Ein Satz über Untermengen einer endlichen Menge. Mathematische Zeitschrift, 27(1):544–548, 1928.
- [SZZ18] Katerina Sotiraki, Manolis Zampetakis, and Giorgos Zirdelis. PPP-completeness with connections to cryptography. In Mikkel Thorup, editor, 59th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2018, Paris, France, October 7-9, 2018, pages 148–158. IEEE Computer Society, 2018.
- [Tur41] Paul Turán. On an extremal problem in graph theory (in Hungarian). Matematikai és Fizikai Lapok, 48:436–452, 1941.
- [WS95] Coburn Ward and Sandor Szabo. On swell-colored complete graphs. June 1995.
Appendix A Efficient algorithm for the explicit Ramsey problem
The following proof of Ramsey’s theorem is folklore. Recall the statement of the theorem:
- Ramsey [Ram30]
-
Any edge-coloring of the complete graph on vertices with two colors contains a monochromatic clique of size at least .
Proof.
Let be the complete graph on vertices, and be a two-coloring of its edges.
Pick an arbitrary vertex .
has adjacent edges so at least of them have the same color by the pigeonhole principle.
Let be that color and .
Then, has at least elements.
Next, pick an arbitrary vertex .
There are at least edges between and another vertex in . Like before, at least of them have the same color by the pigeonhole principle.
Let be that color and .
That way, we proceed to build by induction a finite family of vertices , a finite family of colors and a finite family of sets of vertices with the following properties:
For every , .
For every , has size at least .
For every , .
For every and for every , we have .
In particular, note that the second point implies that we have at least ’s, thus we can construct at least ’s (since we need that is not empty to build ).
This means that we define at least colors . By the pigeonhole principle, at least of them are the same, say color .
Let .
Pick such that .
We claim that the subgraph whose vertices are is monochromatic.
Indeed, let .
Then, , so by the fourth point, we get that .
∎
Now, note that this proof is constructive and yields an algorithm to find a monochromatic subgraph of size of the complete graph on vertices.
In this algorithm, we have iterations, and each of them can be done in time , so overall we get an algorithm running in time.
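The procedure described in the proof can be sketched as follows for an explicitly given two-coloring. The function name, the representation of the coloring as a symmetric function with values in {0, 1}, and the choice of always picking the smallest remaining vertex are our assumptions. We also append the last remaining vertex to the clique, a small addition that is safe because it lies in every intermediate set.

```python
from collections import Counter

def monochromatic_clique(vertices, color):
    """Greedy construction from the folklore proof of Ramsey's theorem."""
    remaining = sorted(vertices)
    picks = []                      # (vertex, color of all its edges into the kept set)
    last = None
    while remaining:
        x, rest = remaining[0], remaining[1:]
        if not rest:
            last = x                # final vertex, contained in every kept set
            break
        by_color = {0: [], 1: []}
        for y in rest:
            by_color[color(x, y)].append(y)
        c = 0 if len(by_color[0]) >= len(by_color[1]) else 1
        picks.append((x, c))
        remaining = by_color[c]     # keep the majority color class
    if not picks:
        return [] if last is None else [last]
    majority = Counter(c for _, c in picks).most_common(1)[0][0]
    return [x for x, c in picks if c == majority] + [last]

# Example on 16 vertices: color an edge by the parity of the Hamming distance of its endpoints.
parity = lambda a, b: bin(a ^ b).count("1") % 2
clique = monochromatic_clique(range(16), parity)
```

Each iteration scans the remaining vertices once and at least halves their number (up to rounding), which is where the polynomial running time in the explicit setting comes from.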