The Random Turán Problem for Theta Graphs
Abstract
Given a graph $F$, we define $\mathrm{ex}(G_{n,p},F)$ to be the maximum number of edges in an $F$-free subgraph of the random graph $G_{n,p}$. Very little is known about $\mathrm{ex}(G_{n,p},F)$ when $F$ is bipartite, with essentially tight bounds known only when $F$ is either $C_4$, $C_6$, $C_{10}$, or $K_{s,t}$ with $t$ sufficiently large in terms of $s$, due to work of Füredi and of Morris and Saxton. We extend this work by establishing essentially tight bounds when $F$ is a theta graph with sufficiently many paths. Our main innovation is in proving a balanced supersaturation result for vertices, which differs from the standard approach of proving balanced supersaturation for edges.
1 Introduction
Given a graph $F$, we define the Turán number or extremal number $\mathrm{ex}(n,F)$ to be the maximum number of edges that an $n$-vertex $F$-free graph can have. If $F$ is not bipartite, then the asymptotic behavior of $\mathrm{ex}(n,F)$ is determined by the Erdős-Stone theorem [11]. Only sporadic results for $\mathrm{ex}(n,F)$ are known when $F$ is bipartite, and in most cases these bounds are not tight.
For example, the Kővári-Sós-Turán theorem [19] implies $\mathrm{ex}(n,K_{s,t}) = O(n^{2-1/s})$, and this bound is only known to be tight when $t$ is sufficiently large in terms of $s$; see for example [3]. The bound $\mathrm{ex}(n,C_{2b}) = O(n^{1+1/b})$ was first proven by Bondy and Simonovits [2]. It was shown by Faudree and Simonovits [12] that this same upper bound continues to hold for theta graphs $\theta_{a,b}$ with $a$ paths of length $b$, and it was later shown by Conlon [7] that this upper bound is tight for theta graphs which have sufficiently many paths. A more detailed treatment of Turán numbers of bipartite graphs can be found in the survey by Füredi and Simonovits [14].
In this paper, we study a probabilistic analog of the Turán number. We define the random graph $G_{n,p}$ to be the $n$-vertex graph obtained by including each possible edge independently and with probability $p$, and we let $\mathrm{ex}(G_{n,p},F)$ denote the maximum number of edges in an $F$-free subgraph of $G_{n,p}$.
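As a minimal illustration of the random graph model just described, the following sketch samples a graph on a given number of vertices by keeping each possible edge independently with a given probability. The function name `sample_gnp` and the `seed` parameter are hypothetical and chosen only for this example.

```python
import random
from itertools import combinations

def sample_gnp(n, p, seed=None):
    """Sample a random graph on vertices 0..n-1.

    Each of the n*(n-1)/2 possible edges is kept independently with
    probability p. The name and the seed parameter are illustrative only.
    """
    rng = random.Random(seed)
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]
```

With probability 1 every possible edge is kept, so the sample is the complete graph; this is the sense in which the random problem contains the classical one as a special case.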
Observe that $\mathrm{ex}(G_{n,1},F) = \mathrm{ex}(n,F)$. With this in mind, it is natural to ask if the classical bounds on $\mathrm{ex}(n,F)$ mentioned above can be extended to bounds on $\mathrm{ex}(G_{n,p},F)$ for all $p$. For example, just as in the classical case, we have a complete asymptotic understanding of $\mathrm{ex}(G_{n,p},F)$ when $F$ is not bipartite due to breakthrough work done independently by Conlon and Gowers [6] and Schacht [27]. To formally state their result, we define the 2-density of a graph $F$ by
$$m_2(F) = \max\left\{\frac{e(F')-1}{v(F')-2} : F' \subseteq F,\ e(F') \ge 2\right\},$$
and we write $f(n) \gg g(n)$ to mean $f(n)/g(n)$ tends to infinity as $n$ tends to infinity. We also recall that a sequence of events $A_n$ occurs with high probability or w.h.p. if $\Pr[A_n]$ tends to 1 as $n$ tends to infinity.
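To make the 2-density concrete, here is a brute-force sketch for small graphs. It assumes the standard definition, namely the maximum of (e(F') - 1)/(v(F') - 2) over subgraphs F' with at least two edges; the function name `two_density` is hypothetical, and the enumeration is exponential in the number of edges, so it is only meant for toy inputs.

```python
from fractions import Fraction
from itertools import combinations

def two_density(edges):
    """Brute-force 2-density of a simple graph given as a list of edges.

    Assumes the standard formula max (e(F') - 1) / (v(F') - 2) over all
    subgraphs F' with at least two edges. Returns None if the graph has
    fewer than two edges.
    """
    edges = [tuple(e) for e in edges]
    best = None
    for k in range(2, len(edges) + 1):
        for sub in combinations(edges, k):
            verts = {v for e in sub for v in e}
            # Two distinct edges of a simple graph span at least 3 vertices,
            # so the denominator below is always positive.
            ratio = Fraction(len(sub) - 1, len(verts) - 2)
            if best is None or ratio > best:
                best = ratio
    return best
```

For instance, the triangle has 2-density 2 and the 4-cycle has 2-density 3/2, in both cases attained by the whole graph.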
Theorem 1.1 to a large extent solves the random Turán problem for non-bipartite graphs, though many questions still remain in this setting; see the survey by Rödl and Schacht [25] for more on this.
For the rest of this paper we focus on the random Turán problem when is bipartite, a setting where much less is known. This lack of information is partially due to the fact that the classical Turán numbers are unknown for almost all bipartite graphs. An additional obstacle is that even if it is known that for some bipartite graph , it is typically not the case that for large ; see for example Figure 1 which plots . That is, unlike in the non-bipartite case, the extremal constructions for with bipartite are far from the intersection of with an extremal -free graph.

Despite these obstacles, what is currently known about the random Turán problem for bipartite graphs suggests that the following analog of Theorem 1.1 could be true; notice that in contrast to the non-bipartite case, where exhibits a single “phase transition” around , if Conjecture 1.2 is correct, we will see two phase transitions in the bipartite case.
Conjecture 1.2.
If is a graph with for some , then w.h.p.
Equivalently,
The bound for is easy to prove with a deletion argument, as is the lower bound of when . Thus the hard part of Conjecture 1.2 is in showing that whenever is large.
In terms of evidence supporting Conjecture 1.2, Füredi [13] showed that this conjecture holds when . This work was substantially generalized by Morris and Saxton [20] who proved w.h.p. when is large, and that w.h.p. when is large. Moreover, they showed that these bounds are tight provided and , respectively. The lower bound of Conjecture 1.2 was shown to hold for large powers of balanced trees by Spiro [28]. Under some mild conditions, Jiang and Longbrake [17] proved a general upper bound of the form
where . This bound matches Conjecture 1.2 precisely when , which happens when (giving a simpler proof of [20]), but otherwise is strictly weaker than the bound proposed in Conjecture 1.2. As far as we are aware, these are the only known bounds for the random Turán problem for bipartite graphs, though there have been a number of recent results regarding the analogous problem for degenerate hypergraphs, see for example [21, 22, 23, 24, 29].
In this paper, we contribute to this growing body of literature by studying the random Turán problem for theta graphs. Recall that a theta graph is a graph which consists of two vertices together with internally disjoint paths from to of length . For example, is depicted in Figure 2.
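The definition of a theta graph can be sketched in code as follows. The parameter names `a` (number of paths) and `b` (length of each path), the end-vertex labels `"u"` and `"v"`, and the function name `theta_graph` are assumptions made for this illustration only.

```python
def theta_graph(a, b):
    """Edge list of a theta graph: two end vertices joined by `a`
    internally disjoint paths, each of length `b`.

    The labels "u", "v" and the parameter names are illustrative only.
    Requires b >= 2 so that the result is a simple graph.
    """
    if a < 1 or b < 2:
        raise ValueError("need a >= 1 and b >= 2")
    edges = []
    for i in range(a):
        # Path u - (i,1) - (i,2) - ... - (i,b-1) - v with b edges.
        path = ["u"] + [(i, j) for j in range(1, b)] + ["v"]
        edges.extend(zip(path, path[1:]))
    return edges
```

Such a graph has a·b edges and 2 + a·(b − 1) vertices. With two paths it is a cycle of length 2b, and with paths of length 2 it is a complete bipartite graph with one side of size 2.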

Observe that $\theta_{2,b} = C_{2b}$ and $\theta_{a,2} = K_{2,a}$. Given that Morris and Saxton essentially solved the random Turán problem for cycles and complete bipartite graphs, one might hope that their methods can be extended to give bounds for $\mathrm{ex}(G_{n,p},\theta_{a,b})$ in general.
And indeed, by using similar ideas as in [20], Corsten and Tran [10] implicitly proved
(1)
which matches the general upper bound given by Jiang and Longbrake in this case. However, Faudree and Simonovits [12] proved , so Conjecture 1.2 predicts that we should have when is large, which differs substantially from (1) when is large.
By adding several new ideas (see Section 1.1.2) to the approach of Morris and Saxton, we significantly improve upon the bounds of (1) and establish essentially tight (and unconditional) bounds for when is large, which agree with what Conjecture 1.2 predicts.
Theorem 1.3.
For all , there exists such that for any fixed , w.h.p.
As far as we are aware, Theorem 1.3 is the first result since Morris and Saxton [20] which gives essentially tight bounds on for any bipartite graph .
1.1 Proof Outline
The vast majority of our paper is focused on proving the upper bound of Theorem 1.3 when is large, with the other bounds following from previous results together with the monotonicity of . Before going into our new ideas, we first recall the proof ideas from Morris and Saxton [20] for bounding , as well as the adaptation of these methods by Corsten and Tran [10] to theta graphs.
1.1.1 Previous Ideas
The main approach to upper bounding when is large is to show that exhibits “balanced supersaturation”, which roughly means that if , then one can find a large collection of copies of in which are “spread out.” Given such a result, one can derive upper bounds on by using what is by now a somewhat standard argument involving hypergraph containers; see 6.1 below.
To prove this balanced supersaturation result, let be an -vertex graph with edges. Our goal is to build a hypergraph whose vertex set is , hyperedges are copies of in , and is such that
for all , where is some suitably small constant. If we can construct such a collection with , then this together with 6.1 will give our desired result.
To construct such an , we iteratively build copies of to add to as follows. Given our current collection , we “clean up” by deleting edges with (since we will not be able to use these edges in any copy of to add to ), and by iteratively deleting vertices with low degree. We then pick a vertex that will have some number associated to it (which roughly measures how well the graph expands near ) to use in our cycle. We then run an algorithm which starts with a set , and then iteratively adds new vertices to until we build a cycle , with the exact algorithm we use depending on the value of . A key point is that at each step of our algorithm, we always have significantly more than choices for each new vertex to add. Because we have so many choices, one of the cycles that we can create will be such that and such that continues to satisfy our desired codegree conditions. By applying this algorithm repeatedly, we end up constructing a large collection of cycles which satisfies our codegree conditions, proving our desired balanced supersaturation result.
For theta graphs, Corsten and Tran [10] used an argument similar to the one outlined above, but their approach only gave effective bounds on the codegrees when is a set of edges inducing a forest. (The general result of Jiang and Longbrake [17] similarly only gives effective bounds for forests, and as such it seems like moving beyond the forest case is the main difficulty in general for upper bounding .) At a high level, the fundamental issue with the approach outlined above is that the algorithm proceeds by selecting vertices one at a time, but the codegree bounds of 6.1 are a function of the number of edges . For which are forests, this distinction turns out to be irrelevant because has at least as many vertices as edges, but this property fails substantially for general subgraphs of theta graphs.
1.1.2 New Ideas
As in previous works, our proof relies on showing that theta graphs exhibit balanced supersaturation. In particular, we use the following three main ideas in order to get around the fundamental issues outlined above:
Vertex Supersaturation. The first key idea is that instead of viewing as a hypergraph on , we essentially view it as a hypergraph on . To be more precise, we consider hypergraphs on in such a way that hyperedges of correspond to a unique labelled theta graph in . Balanced supersaturation for this hypergraph then easily translates to balanced supersaturation for the corresponding hypergraph on , which we ultimately need in order to use hypergraph containers.
Asymmetric Codegree Bounds. The second idea is that the codegree bounds we enforce on will not depend solely on the number of vertices , but also on the vertices of which the set corresponds to. For example, if are the two high-degree vertices of (i.e. the vertices of degree at least 3), then the codegree bounds we enforce on the set will be higher than those we would put on for some other since, in some sense, are the most important vertices of . A similar idea was implicitly used by Morris and Saxton when proving balanced supersaturation results for (where the vertices in the smaller part are “more important” than those in the larger part).
Multiple Collections. The final idea is that we do not build a single collection , but instead build multiple collections and impose different codegree bounds on each of these.
As a somewhat concrete example of why we do this, say we knew our graph was a random graph with edges. In this case, it would be possible to build an such that for every , there are at most roughly theta graphs in our collection which use as the two high degree vertices; and this is a strong bound on the codegree of this pair. In contrast, if was a clique with edges together with some isolated vertices, then it would be impossible to impose such a codegree bound for all . Thus if we only worked with a single collection , we would have to pessimistically use the weaker codegree bounds that work for a clique when correspond to the high degree vertices, and similarly we would have to consider the worst possible choice of when determining the codegree bounds for any given set . Doing this would ultimately give bounds that are too weak. By building multiple collections, we can impose the “correct” codegree bounds regardless of the structure of .
Other Ideas. In addition to these three main ideas, we use a slightly different algorithm to construct our theta graphs compared to those of [10, 20]. Previous algorithms worked (roughly) by first specifying a vertex to play the role of one of the high degree vertices of , then choosing a path in of length (which specifies the other vertex playing the role of a high degree vertex in ), and from there choosing the remaining paths from to one at a time. Instead, our algorithm chooses the two high degree vertices at the start and then builds all of the paths from to one at a time. This somewhat more symmetric argument allows us to overcome various technical issues that arose with previous approaches, and is crucial for our present argument to go through.
1.2 Organization and Notation
The rest of this paper is organized as follows. In Section 2 we prove several auxiliary results that will be used in our main proof, and in Section 3 we establish the main definitions used in this paper. In Section 4, which is the real heart of our argument, we establish our balanced supersaturation result for vertices. We then translate this result into balanced supersaturation for edges in Section 5 before completing the proof of Theorem 1.3 in Section 6 by invoking a result that follows from a standard containers type argument. A few open problems are given in Section 7.
Throughout the paper we adopt the following conventions. We always use to denote vertices of and to denote vertices of a (larger) graph . Further, we will almost always let denote the two vertices of with degree at least 3 (when ), and we will informally call these the “vertices of high degree” in . We write . Whenever we write asymptotic notation such as , our implicit constants will always depend on , and we will occasionally emphasize this point by writing, for example, . For a hypergraph and a set of vertices , its degree or codegree is the number of hyperedges of containing .
2 Auxiliary Results
Here we establish two results that will be crucial for our proof.
2.1 Expansion
One of the key technical lemmas of Morris and Saxton says that in a graph with sufficiently large minimum degree, there exists a vertex which is the endpoint of many “nice” paths of some length . Analogously, we will rely heavily on the following.
Proposition 2.1.
For all integers , there exists some such that the following holds. If is an -vertex graph with minimum degree and , and if is a set of forests, then there exists an integer and a set of vertices such that the following properties hold:
(a) For each , there exists a pair such that is a tuple of (not necessarily disjoint) vertex sets of with , and is a set of paths with for all .
(b) We have .
(c) We have .
(d) For every and , we have , and for every , we have .
(e) For every , we have , where denotes the set of paths of starting at and ending at .
(f) For any and non-empty set of vertices , there are at most paths in containing .
(g) If is such that for every path of with which does not contain an element of as a subgraph, there are at most vertices such that the path contains an element of as a subgraph; then no path of contains an element of as a subgraph.
(h) We have .
Morris and Saxton essentially proved this same result but with the last condition replaced by as opposed to . Our two proofs will be essentially identical outside of this improved quantitative bound, and as such we defer many of the redundant details of the proof to Appendix B. (It is possible that Proposition 2.1 holds with the even stronger quantitative bound . If true, this would significantly simplify our proof of Theorem 1.3; see 7.3 for more on this.)
For our proof, we fix a sequence of rapidly decreasing constants
which depend only on . The exact values of these constants are not particularly important, other than that they are sufficiently small with respect to 1 and with respect to each other. In particular, we demand for all . For the rest of the subsection we will fix some -vertex graph with minimum degree with (and hence ) sufficiently large in terms of the constants.
Definition 1.
For , we say that a tuple of (not necessarily disjoint) subsets of is a concentrated -neighborhood of if , , and for all .
We define to be the minimal such that there exists a concentrated -neighborhood of in . Note that for all since we can iteratively take .
Morris and Saxton implicitly proved that for any vertex with , there exist sets as in Proposition 2.1, and in particular, at least one such vertex exists. The only place where is used in their argument is in showing that there exists a tuple with , , for all , and (crucially) every has ; in other words, for their argument to go through we only need that achieves a local minimum value among vertices near , and it is not strictly necessary for to achieve a global minimum value.
Motivated by this, our main goal is to show that tuples with essentially these same properties noted above exist for many vertices . Specifically, we prove the following.
Lemma 2.2.
There exists some integer and some set of size at least such that for every , and such that for every , there exists a tuple of sets such that , , for all , and every has .
This differs ever so slightly from the condition that Morris and Saxton worked with since we only guarantee as opposed to . By slightly adjusting the constants of Morris and Saxton, their same proof still carries over word for word for any vertex as in Lemma 2.2; see Appendix B for details. Thus to prove Proposition 2.1, it suffices to prove Lemma 2.2, which will be our goal for the rest of this subsection.
For any integer , define
Note that (since is the minimum degree of an -vertex graph), i.e. , and thus is an increasing function in . From now on we let be the smallest integer such that there are at least vertices with . Note that satisfies these conditions, so such a (smallest) integer exists.
Let denote the set of vertices with . Iteratively define to be the set of vertices which have at least neighbors in , where . Note that every with has since .
To motivate these definitions, we observe that in proving Lemma 2.2 with as stated, we cannot include any vertex of in any of the sets. While we are allowed to include vertices of in these sets, these vertices are “dangerous” since a large number of their neighbors lie in , and similarly it is somewhat dangerous to include since a large number of their neighbors are in , and so on. We thus want to show that these sets are all relatively small, which is accomplished by the following lemma.
Lemma 2.3.
If then for all , and otherwise for all .
For this proof, we note that by choosing sufficiently small in Proposition 2.1, we may assume is sufficiently large compared to all of the constants .
Proof.
If then , and hence inductively we have for all . From now on we assume . We prove the result by induction on , the base case being immediate from the definition of and . Say we have proven for some . The key technical observation we need is the following.
Claim 2.4.
There exists a non-empty bipartite graph with bipartition such that , , and such that for and for .
Proof.
Let be the graph on obtained after deleting every edge within and within . Note that by definition, each vertex of has at least neighbors in (which is disjoint from ), so . Define by iteratively deleting every vertex which violates the degree conditions of the claim. Note that the number of edges deleted in this process is at most
In particular, is non-empty, and it satisfies all of the other properties by construction. ∎
Returning to our induction, we wish to show that ; our inductive hypothesis gives , so it suffices to prove that . Assume for contradiction that . Let be any vertex of (which exists since is non-empty), and let be defined by and . Note that if and only if is odd since is bipartite. Also note that for all we have
where this last step used and that is sufficiently small compared to . In particular, if is even, then is a concentrated -neighborhood of since
This implies , a contradiction to since is disjoint from .
Thus we may assume is odd. Define the random set by including each vertex of independently and with probability , which is well defined since we assumed . Observe that is a binomial random variable with trials and probability of success . Since , by Markov’s inequality we have . Thus for sufficiently large, we conclude that the event occurs with probability at least 1/2.
Similarly for each , the random variable is binomial with success probability and number of trials . Thus by the multiplicative Chernoff inequality, we have for any ,
and for sufficiently large this probability is at most . By a union bound over , we see that with probability at least , every vertex satisfies .
In total we conclude that there exists some choice of such that both and for all (since in particular, this holds with positive probability for a random subset of ). Thus is a concentrated -neighborhood of . This implies , which again is a contradiction. We conclude , and hence by the inductive hypothesis. ∎
We are now ready to prove Lemma 2.2.
Proof of Lemma 2.2.
Let be the set of vertices with and .
We claim that . Indeed, by definition of , there are at least vertices with . Every vertex with is either in or , so by the previous lemma,
It remains to find the tuple of sets guaranteed by Lemma 2.2 for each . For each , by definition of , there exists a tuple with , , and for all . Define . Note that with this we have , , and no has because we removed from each .
It remains to show that each has many neighbors in . Since each does not belong to any with , by definition has at most neighbors in . This implies
proving the result. ∎
2.2 Minimum Degrees
A very minor step in the proof of Morris and Saxton calls for deleting vertices of low degree in . In their setting this is fine, as this does not significantly decrease the number of edges in . However, because the focus of our approach is on balanced supersaturation for vertices rather than for edges, we will need to be more careful with this step.
Towards this end, we use the reduction lemma stated below, which guarantees a subgraph of large minimum degree, where the degree condition is stronger the more vertices are removed from . In particular, the tradeoff is roughly what one would expect if was a clique together with some number of isolated vertices.
As a small technical convenience, we will prove this lemma in the more general setting of multigraphs with loops. Here, the degree of a vertex is the number of edges incident to (so each loop contributes 1 to its degree).
Lemma 2.5.
Let be an -vertex multigraph with loops. For all , there exists a subgraph with and minimum degree at least
Proof.
Write . The result is trivial if , so assume . Assume for contradiction that no such subgraph exists. We claim that for all non-negative integers there exist with at most vertices and at least edges. The result holds with , so inductively assume the result has been proven through some . Let be the graph obtained after iteratively deleting vertices of degree less than from . Note that
If , then . This implies that has minimum degree at least
where this first inequality implicitly uses . This contradicts our assumption that no such subgraph of exists, so we must have . Thus satisfies the desired conditions, proving the claim. Taking , the claim implies there exists a subgraph on less than 1 vertex with at least edges, which is impossible. ∎
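The pruning step used in the proof above (iteratively deleting vertices whose degree falls below a threshold) can be sketched as follows. This is only an illustration of the deletion procedure, not of the full inductive argument; the function name `prune_low_degree` and the edge-list representation are assumptions, and isolated vertices are simply not represented. As in the text, each loop contributes 1 to the degree of its vertex.

```python
def prune_low_degree(edges, min_degree):
    """Iteratively delete vertices of degree < min_degree from a multigraph.

    `edges` is a list of pairs (u, v); a pair with u == v is a loop and
    contributes 1 to the degree of u. Returns the surviving vertex set and
    edge list. Names and representation are illustrative only.
    """
    edges = list(edges)
    vertices = {v for e in edges for v in e}
    while True:
        deg = {v: 0 for v in vertices}
        for u, v in edges:
            deg[u] += 1
            if v != u:
                deg[v] += 1
        low = {v for v in vertices if deg[v] < min_degree}
        if not low:
            return vertices, edges
        # Remove the low-degree vertices together with all incident edges,
        # then recompute degrees, since removals may create new low-degree vertices.
        vertices -= low
        edges = [(u, v) for u, v in edges if u not in low and v not in low]
```

For example, pruning a triangle with one pendant edge at threshold 2 removes only the pendant vertex, while threshold 3 removes everything.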
The only reason we proved Lemma 2.5 in the more general setting of multigraphs with loops is to prove the following technical result.
Lemma 2.6.
Let be a set, and let be any function from to . For all , there exists a subset such that
Proof.
Define an auxiliary graph on where each vertex has loops (and these are the only edges in ). Applying Lemma 2.5 gives the result. ∎
Roughly speaking, this lemma will be applied with the set of vertices that are at distance from some vertex and with the number of paths of length from to . This will allow us to choose vertices and connected by many paths of length , which we can use to construct copies of where and are the two high-degree vertices.
3 Preliminaries
3.1 Key Definitions
As noted in the proof outline (Subsection 1.1), we wish to consider hypergraphs on . To aid with this, we make use of the following definitions throughout the paper; see Figure 3 for an example.
Definition 2.
Given a graph and a set , we define the projection sets and . We say that a set is valid if
1. (equivalently, each vertex of and appears at most once in a pair of ), and
2. If with , then .
We let denote the set of valid subsets of .
Note that if is a valid set with , then by definition this means the map which sends to the unique vertex with is an injective homomorphism. In particular, the vertices induce a graph containing a copy of as a subgraph. Since our ultimate goal is to find a large collection of such subgraphs which are spread out, we make the following definitions.
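The validity condition can be checked mechanically. The sketch below assumes, based on the prose of Definition 2, that a set consists of pairs (vertex of the pattern graph, vertex of the host graph), that no vertex of either graph may appear in two pairs, and that adjacent pattern vertices must be sent to adjacent host vertices. The function name `is_valid` and the edge-list inputs are hypothetical.

```python
def is_valid(S, pattern_edges, host_edges):
    """Check whether a set S of (pattern_vertex, host_vertex) pairs is valid.

    Assumed reading of the definition: both projections of S have the same
    size as S (the induced partial map is injective), and every pattern edge
    with both endpoints in the projection is mapped to a host edge.
    """
    S = set(S)
    pattern_side = {x for x, _ in S}
    host_side = {v for _, v in S}
    if len(pattern_side) != len(S) or len(host_side) != len(S):
        return False
    host = {frozenset(e) for e in host_edges}
    image = dict(S)  # well defined because pattern vertices are distinct
    for x, y in pattern_edges:
        if x in image and y in image:
            if frozenset((image[x], image[y])) not in host:
                return False
    return True
```

When such a set uses every vertex of the pattern graph, the induced map is an injective homomorphism, matching the remark above.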
Definition 3.
We say that a hypergraph is a -hypergraph if its vertex set is and all of its hyperedges are valid sets with . We say that functions of the form are codegree functions, and for such a we say that is -good if for all .
Note that being -good means that no valid set is contained in too many hyperedges, with the exact degree condition depending only on and the projection (which allows us, for example, to impose stronger conditions if contains vertices corresponding to the high degree vertices of ).
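As a small sketch of the goodness condition, the code below computes the codegree of a set (the number of hyperedges containing it, as defined in the notation section) and checks that no set exceeds a given bound. The names `codegree` and `is_good`, the representation of the codegree function as a Python callable `D`, and the convention that "good" means degree at most `D(S)` are assumptions for illustration.

```python
from itertools import combinations

def codegree(hyperedges, S):
    """Number of hyperedges containing every element of S."""
    S = frozenset(S)
    return sum(1 for h in hyperedges if S <= frozenset(h))

def is_good(hyperedges, D):
    """Check codegree(S) <= D(S) for every non-empty S inside some hyperedge.

    Sets not contained in any hyperedge have codegree 0 and are skipped,
    which assumes D is non-negative. D is a hypothetical callable taking a
    frozenset and returning a number.
    """
    hyperedges = [frozenset(h) for h in hyperedges]
    seen = set()
    for h in hyperedges:
        for k in range(1, len(h) + 1):
            for sub in combinations(h, k):
                S = frozenset(sub)
                if S in seen:
                    continue
                seen.add(S)
                if codegree(hyperedges, S) > D(S):
                    return False
    return True
```

In the setting of the text, `D` would additionally depend on the projection of the set onto the pattern graph, which is what allows different bounds for sets containing the two high-degree vertices.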
The main technical work of this paper is in constructing -hypergraphs which have many hyperedges and which are -good for some which is sufficiently small to apply 6.1. To do this, we will consider several functions simultaneously (which will ultimately be combined into a unified codegree bound in the next two sections). The first and simplest function we consider is the following.
Definition 4.
Let , and let be positive integers. We define a codegree function as follows: if induces a forest on edges, then
and otherwise .
Having evaluate to infinity is not necessary, and we do this only to emphasize that this function essentially ignores sets which do not induce a forest with at least one edge.
Essentially, the main technical result of Morris and Saxton, and of Corsten and Tran, says that one can construct large -hypergraphs which are -good. To go beyond this, we will show that one can construct collections which are -good for functions which are finite on sets that contain cycles (and more precisely, sets that contain the two vertices of of large degree).
The specific functions we need are somewhat complex. In all of these functions, the denominator of roughly corresponds to the number of choices our algorithm has to build , with the terms to the left of the “ ” typically counting the number of choices for the two high degree vertices of . The parameter will be chosen roughly such that the graph obtained from 2.5 has vertices.
With all of this established, we define the remainder of our codegree functions.
Definition 5.
Let , and let be positive integers. For , let denote the two vertices of of degree larger than 2. For each integer , we define a codegree function as follows: if , then
and otherwise .
The definition above will be used when the value from Proposition 2.1 equals . The case is somewhat more complicated. Again this is because the denominator roughly represents the number of choices we have for our algorithm at any step, and just as in [10, 20], the case of the algorithm is somewhat more complicated.
Definition 6.
Let , and let be positive integers. For and , we define a codegree function as follows: write the paths of as for , and define
If , then
and otherwise .
The intuition for this codegree function is as follows: in the case with , our algorithm first selects , which it will be able to do in about ways (which is essentially the product of the bounds from 2.1(c) and (h)). When choosing , the number of choices will turn out to be about if and about if . Thus the denominator represents the number of choices our algorithm has for building .
One last codegree function is needed for the case.
Definition 7.
Let , and let be positive integers. For , we define a codegree function as follows: define as above and let if is even and otherwise. If for some , then
and otherwise . For notational convenience, we also define by for all .
The motivation for this definition is that the number of choices for is at least the number of choices for just , which is about , i.e. the number of total edges in the graph . As before, the number of choices for every other vertex will depend on whether it is in or not. The definition of reflects the fact that if is odd, then , but the number of choices for the first vertex of this form is already accounted for by the term, so we cannot include an extra factor of for this vertex. Note that with this codegree function, we omit counting the number of choices for in the denominator. As such, will typically be much weaker (i.e. larger) than , though it will do better when has few vertices and is small.
3.2 Saturated Sets
When building our -hypergraph , we need to be careful to avoid constructing theta graphs which contain a subset that has very large codegree in . To aid with this, we introduce the following.
Definition 8.
Let be a -hypergraph and a codegree function . We define the set of saturated sets
Given a valid set and , define the link set to be the set of with such that . If we will sometimes denote this set simply by
The intuition for the link set comes from our goal of algorithmically trying to iteratively add a new theta graph to some such that continues to be -good even after adding the theta graph. If during the algorithm we have already designated some to be used in our new theta graph, and if our algorithm is about to choose some to add to such that , then the algorithm cannot choose any for any , as otherwise the degree of would be strictly larger than what dictates. As an aside, our definition of link sets differs slightly from that of Morris and Saxton, who essentially defined the links to be .
Because the link sets represent the number of “bad” choices our algorithm has, we will want to show that these sets are relatively small. This is accomplished by the following lemma.
Lemma 3.1.
Let be a -hypergraph which is -good for some . If then , and otherwise
Proof.
If , then every with trivially has , so no such satisfies and we conclude . From now on we assume . Note that
where the second inequality used that each hyperedge containing is counted at most times by the sum over , and the last inequality used that is -good. On the other hand,
Rearranging these two inequalities gives
completing the proof. ∎
Because all of our codegree functions involve the ceiling of a real-valued function, the following result, which allows us to ignore the ceilings, will be slightly more convenient to use compared to Lemma 3.1.
Corollary 3.2.
Let be a -hypergraph which is -good for some , and suppose that for some . If then . If and , then
Moreover, this bound continues to hold for even if provided is sufficiently small.
Proof.
Note that trivially and that provided . Thus in this case, the result follows immediately from Lemma 3.1, and in particular, this situation always holds for provided is sufficiently small.
It remains to consider the case that and . These two conditions imply , so there exists no hyperedge of containing . Thus there exists no such that , i.e. such that . We conclude that the link set is empty in this case, and hence the result trivially holds. ∎
When applying this claim it will always be immediate that , and for simplicity we will omit saying this explicitly. (In terms of the notation for the next section, we will only apply Corollary 3.2 with when is a subset of an -compatible set, which by definition will not be in .)
4 Balanced Supersaturation for Vertices
In this section, we prove our main technical theorem: a balanced supersaturation result for vertices.
Theorem 4.1.
For all and , there exist constants such that the following holds for all and . If is an -vertex graph with edges, then there exists an integer and a -hypergraph with which is -good, where is defined by
We note that there is no need to consider since the case is already dealt with by Morris and Saxton [20].
Theorem 4.1 will follow quickly from the following technical result, which roughly says that given a collection of much fewer than copies of satisfying certain codegree conditions, we can find an additional copy of to add to the collection while maintaining the desired codegrees.
Proposition 4.2.
For all and , there exist constants such that the following holds for all and . Let be an -vertex graph on edges with , and let be a set of -hypergraphs such that is -good for each and , and is -good for each , and is -good.
If , then there exists some valid set of size and some such that , and such that if we define and for all , then is -good for each and , and is -good for each , and is -good.
Before proving 4.2, we show how it may be repeatedly applied to obtain our main supersaturation theorem.
Proof of 4.1.
Initially start with collections where for all . By repeatedly applying 4.2, we obtain collections satisfying all of the codegree conditions and with . In particular, there exists some such that contains at least hyperedges.
By Proposition 4.2, we have for all that
and similarly
To complete the proof, we only have to show for all such that contains a cycle. We first consider the case . Here Proposition 4.2 gives
where this last step used that either , or (by Definition 5) differs from by at most a multiplicative factor of (where the factor of 4 comes from the two ceiling functions involving and ). Since contains a cycle, we have , so the sum above is at most . We conclude that for all valid .
When , essentially the same reasoning gives that if contains a cycle then
and since this latter sum is at most . This gives the result. ∎
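The repeated application of Proposition 4.2 can be pictured as a greedy loop. The following is a minimal sketch and not the paper's construction: the names `greedy_balanced` and `cap` are hypothetical, and `cap` is a toy codegree bound depending only on the size of a subset, whereas the codegree functions used in the paper depend on more structure.

```python
from itertools import combinations


def greedy_balanced(candidates, cap):
    """Greedily keep candidate hyperedges while every nonempty proper
    subset stays within its codegree cap.

    candidates: iterable of frozensets (potential hyperedges).
    cap: hypothetical function, subset size -> maximum allowed codegree.
    """
    chosen = set()
    codeg = {}  # subset -> number of chosen hyperedges containing it
    for h in candidates:
        if h in chosen:
            continue
        subsets = [frozenset(s)
                   for r in range(1, len(h))
                   for s in combinations(sorted(h), r)]
        # Adding h raises the codegree of each of its subsets by one,
        # so h is accepted only if no cap would be exceeded.
        if all(codeg.get(s, 0) + 1 <= cap(len(s)) for s in subsets):
            chosen.add(h)
            for s in subsets:
                codeg[s] = codeg.get(s, 0) + 1
    return chosen, codeg
```

The sketch only accepts or rejects candidates in a fixed order. The substance of Proposition 4.2 is the guarantee that an acceptable new copy always exists while the collection is still small, which no greedy loop provides on its own.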
The rest of this section is dedicated to proving 4.2. Let be the graph in the hypothesis of Proposition 4.2 and the corresponding collections.
The basic idea of the proof is to algorithmically construct many copies of , and to show that at least one of them is not already contained in for some appropriate , and such that our codegree conditions continue to be satisfied.
We will begin by pruning the graph so that all of its remaining edges and vertices are “well-behaved” (Section 4.1). We then give the algorithmic construction of copies of and show that a new copy may be added to some (Section 4.2), completing the proof. Throughout the argument we fix some depending only on which is sufficiently small for our arguments to go through.
4.1 Pruning
Let be the graph obtained by deleting edges of that are already “saturated” by hyperedges of , that is, those edges for which there exists with and
This will ensure that any new theta graph constructed using only edges of will not violate our edge-codegree bounds when added to . We bound the number of these saturated edges by double-counting elements of :
Rearranging slightly, the number of such edges is at most
and if is sufficiently small this is at most , which implies . Notice that no remaining edges are “saturated,” i.e. we have
(2)
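The double-counting step behind this bound can be illustrated in isolation. This is a sketch under simplifying assumptions: each copy is represented directly as a set of graph edges, `saturated_edges` and `threshold` are hypothetical names, and a single numeric threshold replaces the codegree function used in the paper. Since every copy contributes to the count of each of its edges, the number of edges lying in at least `threshold` copies is at most the total size of all copies divided by `threshold`.

```python
from collections import Counter


def saturated_edges(copies, threshold):
    """Return the edges lying in at least `threshold` of the given copies.

    copies: list of frozensets of edges (each edge a hashable object).
    threshold: hypothetical saturation level (a positive number).
    """
    deg = Counter(e for copy in copies for e in copy)
    return {e for e, d in deg.items() if d >= threshold}


def double_counting_bound(copies, threshold):
    """Upper bound on the number of saturated edges from double counting:
    threshold * (#saturated edges) <= sum of the sizes of all copies."""
    return sum(len(copy) for copy in copies) / threshold
```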
Now we further prune the graph by eliminating low-degree vertices: let be the subgraph of high minimum degree guaranteed by Lemma 2.5. Although may have substantially fewer vertices than (meaning our algorithm will have fewer choices at various steps), in this case it will compensate by having a substantially larger minimum degree.
More concretely, let , and let be the real number such that has minimum degree . By Lemma 2.5 we have
We let be the unique integer such that
(3)
and we note that the previous inequality implies
(4)
In total this implies the minimum degree of is at least , which is the average degree of , and that the minimum degree of is much larger compared to that of if is much smaller than . Before moving on, we note
(5)
Indeed, since is the minimum degree of the -vertex graph , we must have , and rearranging shows this is equivalent to (5).
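For intuition, the classical way to pass to a subgraph of high minimum degree is to repeatedly delete low-degree vertices. The sketch below shows only that classical peeling argument: Lemma 2.5 is not reproduced in this section and appears to give a sharper trade-off between the number of vertices and the minimum degree, so the function name `high_min_degree_subgraph` and the threshold (edges divided by vertices of the input graph) are assumptions made for illustration.

```python
def high_min_degree_subgraph(adj):
    """Peel vertices of degree < m/n, where m and n are the edge and
    vertex counts of the input graph.

    adj: dict mapping each vertex to the set of its neighbours
         (simple undirected graph).
    Returns the adjacency dict of the surviving subgraph.
    """
    n = len(adj)
    m = sum(len(nb) for nb in adj.values()) // 2
    if m == 0:
        return {}
    threshold = m / n
    live = {v: set(nb) for v, nb in adj.items()}
    stack = [v for v in live if len(live[v]) < threshold]
    while stack:
        v = stack.pop()
        if v not in live:
            continue  # already deleted via an earlier stack entry
        for u in live.pop(v):
            live[u].discard(v)
            if len(live[u]) < threshold:
                stack.append(u)
    return live
```

Each deletion removes fewer than m/n edges and there are at most n deletions, so fewer than m edges are removed in total and the surviving subgraph is nonempty with minimum degree at least m/n.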
4.2 The Algorithm
We are now ready to begin finding copies of in . Our strategy is roughly as follows: first, we identify which collection we wish to add a copy to (based on the expansion properties of detailed in 2.1). After this, we carefully choose vertices and to serve as the two high-degree vertices for the copies we will add. We then show that and are not already contained in too many copies in , and algorithmically construct a large number of theta graphs in that do contain and ; this allows us to conclude that we have found at least one new copy not already contained in . Crucially, along the way, we ensure that at no step of the algorithm are our codegree conditions violated, ensuring that the copy added to is “good.”
Before delving into the meat of the proof, we introduce some more compact notation. Define
Also define
When applying 3.2, we adopt the shorthand
We say that a set is -compatible if and if no subset of lies in . Crucially, we observe that proving the proposition is equivalent to showing that for some , there exists an -compatible set with such that (since, for example, no subset of being in implies is -good).
Before moving on, we make a small but important observation.
Claim 4.3.
If is a valid set and is such that induces at most one edge in , then .
Proof.
If induces 0 edges then and the result follows from 3.2. Thus we can assume induces exactly one edge . If there exists some , then there exist pairs (since ). In this case we have
where the first inequality used that we are looking at the codegree of a smaller set, the second inequality follows from (2) which says does not contain any edges with , and the equality used that induces exactly one edge. This inequality implies , contradicting the assumption . We conclude that this link set is indeed empty. ∎
4.2.1 The Setup
We wish to apply Proposition 2.1 to the “pruned” graph . For this we need to specify a set of forests to avoid. Intuitively we wish to use the set , but this is a collection of subsets of , not of subgraphs of . To get around this minor technicality, for we define the “projection graph” by
Note that by definition of being valid, is a subgraph of which is isomorphic to the subgraph of induced by . We let . Since for which do not induce forests, every element of is a forest. To apply Proposition 2.1 with this set, it remains to verify the following, which gives the hypotheses of Proposition 2.1(g).
Claim 4.4.
Let be as in Proposition 2.1. If is sufficiently small, then for every path of with which does not contain an element of as a subgraph, the number of vertices such that some subgraph of the path is in is at most .
Proof.
Fix any path as above; we wish to bound then. We introduce the following notation which will only be used in the proof of this claim: we say that a pair with and is good if and is adjacent to at most one vertex of . We claim that if is such that some subgraph of the path is in , then for some good pair .
Indeed, say the subgraph of in was for some . Because is not a subgraph of , we must have , and thus we have for some . If , then being a subgraph of implies and that is adjacent to at most one vertex of (as otherwise would have degree greater than 1 in , contradicting this being a subgraph of ). In total we find for some good pair . Moreover, by definition of , we find
so by definition as desired.
With this we see that the number of choices for is at most the number of elements of for all possible good pairs . To count the number of such elements, fix some good pair . If induces no edges, then since is adjacent to at most one vertex of by definition of being a good pair, induces at most one edge. Claim 4.3 then implies .
Now assume induces at least one edge. By definition of and 3.2, we have for any good pair that
With this claim and the fact that is at least a sufficiently large constant due to (4), we can apply Proposition 2.1 to and , and we let be the integer and set guaranteed by this proposition. We recall that are the high degree vertices of .
Let be a vertex such that is in the fewest hyperedges of , that is, a vertex with (this will help us ensure we can find a new copy of containing ). Let with be the pair for as guaranteed by Proposition 2.1(a).
We now split our analysis into several cases based on the value of . The overarching strategy is the same in both cases, but the details are somewhat simpler in the case (where has nice “random-like” expansion near ).
4.2.2 Case 1:
Our strategy is to build copies of in which use as one of the high degree vertices . To choose the vertex which plays the role of the other high-degree vertex , we would like to ensure there are many paths in connecting to (which we will then use to build our theta graphs). And indeed, this holds for many vertices in : for , let denote the number of paths of which is an endpoint of. By 2.6, there is some so that
(6)
Notice that we face a trade-off: we may have a large set of vertices , each of which is the endpoint of approximately the average number of paths, or a smaller set where is much larger than average. We can obtain a strong balanced supersaturation result either way, but to do so, we must keep track of this trade-off. To this end, let be the unique integer such that
(7)
Since and by 2.1(c), (6) can be relaxed to
(8)
Recall from (3) that , and let
Roughly speaking, if is small, then and/or are large, which is what we would expect to happen if were a random graph (as opposed to being, e.g., a clique with isolated vertices, wherein both of these quantities would be small). We now aim to show that we can add a new theta graph to the collection . Let be such that is in the fewest hyperedges with in . As before, this will help ensure we find a new hyperedge, since and are not already contained in too many elements of .
Claim 4.5.
The set satisfies
and is -compatible provided is sufficiently small.
Proof.
First, recall that is such that is contained in the fewest number of hyperedges in among all vertices in the set , which means
where the second inequality used by 2.1(b) when and the hypothesis of 4.2. Similarly, as is such that is in the fewest number of hyperedges with in , we have
It remains to show is -compatible. First we show is a valid set. Since are distinct non-adjacent vertices of , we only need to check . And indeed, we cannot have , since by Proposition 2.1(e), every element of is the endpoint of a positive number of paths of length from (and since these are paths, cannot serve as both endpoints). Since , we conclude and that is valid.
It remains to check that every subset of satisfies the desired codegree conditions. Note that for any of size 1, we have , and as such will not belong to . Similarly , so it only remains to verify that
Since , this bound follows from the first part of the claim, completing our proof. ∎
We now wish to construct many “good” copies of in with and as the two high-degree vertices (i.e. more than the bound in Claim 4.5). To do this, we iteratively pick paths in that end in , and we take our copy of to be the union of these paths. We must ensure that the paths chosen are such that is -compatible, and in particular that they do not intersect each other and that no subset of their vertices is already saturated. For this claim, we recall that the paths of are denoted by .
Claim 4.6.
Let , and let be a collection of paths in ending in , and for each path , write . Suppose that the set
is -compatible. Then there are at least choices of a path in so that
is -compatible.

Proof.
Let denote the set of paths in ending in . Since , we have
(9)
where this last step used and .
Our goal now is to show that among these paths, there are few “bad choices” that must be avoided, i.e. few choices so that is not -compatible. To show this, 2.1(f) will be crucial, which we recall says that for any non-empty set of vertices in not containing , there are at most paths in which contain .
We first show that almost all choices of make valid. Because was already valid, this is equivalent to choosing a which contains none of the vertices of other than and . By 2.1(f) with , the number of paths in containing a given vertex from is at most . Therefore, the number of containing any of the vertices in (other than and ) is at most a constant times , which for sufficiently large (which makes sufficiently large) will be at most . Thus at least three quarters of the paths will make valid.
To show that is -compatible for most choices of paths in , it remains to bound the number of “bad” sets that must be avoided when choosing . To this end, for each integer define
Note that is -compatible if and only if it is valid and does not contain any set for any value of (here we implicitly use that since is always equal to , so we can ignore when checking for compatibility).
We may use 3.2 to bound the size of each link set above: consider and with . If or , then by 3.2. Otherwise, 3.2 gives
(10)
Similarly, if does not induce a forest on at least one edge, then . If both and induce a forest on at least one edge, then 3.2 gives
(11)
It remains to deal with the case that does not induce a forest on at least one edge but does. Analogous to the proof of Claim 4.3, in this setting 3.2 gives no meaningful bound, but we can show that for any choice of in , no element of the link set can appear in (and therefore there are no “bad options” that must be avoided in choosing from ). To this end, consider any set . By definition, this means and
(12)
Since does not induce any edges but does, all of these induced edges of must be contained in the path . Since only depends on the number of edges induced, we have
(13)
On the other hand, if for each we let
then we have
(14)
since taking smaller sets can only cause to increase. Putting this all together, we obtain
(15)
Notice that . Thus (15) says that . Using the notation introduced just before Claim 4.4, this means that the subgraph is an element of . Recall that by 2.1, no path in contains an element of as a subgraph. Since is a subgraph of , we conclude that there is no choice of such that contains in this case.
Putting it all together, and writing , we obtain
(16)
By 2.1(f), the number of which contain the projection of a given (of size ) is at most . So combining this with (16), the number of which contain for any is at most
Summing over all values of from 1 to and simplifying slightly, the number of that contain a “bad” set of any size is at most
(17)
where By (4), and since and , we have
This gives
where . By taking sufficiently small, we can assume . So, after taking into account that at most one quarter of the choices have not valid, we find that the number of choices for such that is -compatible is at least
where both inequalities used (9). This gives the desired result. ∎
With Claim 4.6 established, we are now nearly ready to finish Case 1.
Claim 4.7.
The number of -compatible sets of size containing and is at least
Proof.
This result will follow directly by an iterative application of Claim 4.6. As a base step, we take , which is -compatible by Claim 4.5. With this we may apply Claim 4.6 to obtain at least choices of a path in such that the corresponding set is -compatible. Iterating up to , we obtain at least
distinct collections such that the corresponding sets are -compatible. This completes the proof. ∎
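The iterative path selection in Claims 4.6 and 4.7 can be mimicked on a small graph. The sketch below is a simplified illustration: it counts ordered collections of internally disjoint paths between two fixed vertices, enforcing only that each new path avoids the vertices already used (the validity requirement). It does not model the set of paths supplied by Proposition 2.1 or the codegree part of compatibility, and the names `paths_of_length` and `count_theta_copies` are hypothetical.

```python
def paths_of_length(adj, start, end, length, forbidden):
    """Yield the tuple of internal vertices of every path with `length`
    edges from start to end whose internal vertices avoid `forbidden`."""
    def extend(v, remaining, used, internal):
        if remaining == 1:
            if end in adj[v]:
                yield tuple(internal)
            return
        for w in adj[v]:
            if w != end and w not in used and w not in forbidden:
                yield from extend(w, remaining - 1, used | {w}, internal + [w])

    yield from extend(start, length, {start}, [])


def count_theta_copies(adj, x, y, a, b):
    """Count ordered a-tuples of internally disjoint x-y paths with b edges,
    choosing the paths one at a time as in the iterative argument."""
    def pick(k, forbidden):
        if k == 0:
            return 1
        total = 0
        for internal in paths_of_length(adj, x, y, b, forbidden):
            # Later paths must avoid the internal vertices chosen so far.
            total += pick(k - 1, forbidden | set(internal))
        return total

    return pick(a, frozenset())
```

In the proof, the count at each step is bounded from below by showing that only a small fraction of the available paths are excluded, which is why the final bound is a product of per-step counts.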
Now we are ready to finish Case 1. By Claim 4.5, the number of hyperedges in containing and is at most
By Claim 4.7, the number of -compatible sets of size containing and is at least
Therefore, provided is sufficiently small, there must be at least one -compatible set of size that is not already in . This may be added to , completing the proof of 4.2 when .
4.2.3 Case 2: and Even
Parts of this proof are nearly identical to the previous case, and as such we omit some of the redundant details.
Recall that is the unique integer such that . Our goal in this case is to show that we can add a new theta graph to . Let be such that is in as few hyperedges in with as possible.
Claim 4.8.
The set satisfies
and is -compatible if is sufficiently small.
Proof.
Recall that is such that is contained in the fewest number of hyperedges in among all vertices in the set . This together with the definition of implies that the number of hyperedges containing both and is at most
where this last step used and that and by Proposition 2.1(b) and (h). Using and from (3) and (4) gives the first result.
As in the case, we have by Proposition 2.1(e), so and the set is valid. Any trivially fails to be in , and to show is not in this set it suffices to show
and this follows by the first result. We conclude that is -compatible.
∎
Now that we have selected our two high degree vertices of our theta graph, we build the rest of the theta graph as follows. First, we work our way out from by selecting neighbors of , then neighbors of each , and so on, until we have chosen vertices . Then, once we have chosen the vertices , we select paths from the set connecting the vertices to .
To do the first part, we use the following claim. Here we recall that the paths of are denoted , and for this claim we adopt the convention that and . We also recall that is defined to be the set of with and even. In particular, when is even.
Claim 4.9.
Let and be integers, and let be an -compatible set consisting of the pairs , and for all with either or with and .
• If is odd and , then there exist at least choices such that is -compatible.
• If is even and , then there exist at least choices such that is -compatible.

Proof.
Observe that if is a vertex such that is not -compatible, then either (which can only hold for vertices), or there exists some with
Thus it suffices to show that each of these sets are small for each .
First consider . If induces at most one edge, then this link set is empty by Claim 4.3. If this is not the case, then must induce at least one edge since only has one edge incident to (this implicitly uses , as otherwise would also be adjacent to ). By 3.2, we find
(18)
Next consider , which we recall is based off of the codegree function defined in Definition 6. If , then this link set is empty by 3.2, so we may assume . Then 3.2 gives
since if is odd, adding to keeps the parameter in Definition 6 the same while increasing . Similarly,
since and both increase by 1.
Finally consider , which we recall is based off of the codegree function defined in Definition 7. Again we may assume . If for some , then the argument and final bound are exactly the same as in the case for (with taking the role of in exactly the same way as before). We next consider the subcase for all . If , then contains no vertex of the form , so , and hence the link set is empty by 3.1. If , then by the hypothesis of the claim, so our assumption implies . Thus
where the first inequality used that we are looking at the codegree of a smaller set in a larger hypergraph, the second inequality used (2) (i.e. that every edge in has codegree smaller than that given by ), and the equality used and in the definition of for even. This implies .
By summing up the sizes of all of these sets over all possible choices of (as well as the number of choices ), we find when is odd that the number of which cannot be selected is at most
with the last step using . By Proposition 2.1(d), has at least neighbors in , and for sufficiently small this is at least twice the number of forbidden choices. Essentially the same reasoning holds for the even case after noting when applying (18). We conclude the result.
∎
By starting with the two high-degree vertices and , and iteratively applying Claim 4.9, we can find many -compatible sets with . To get the remaining vertices corresponding to with , we use the same strategy as in the case of choosing paths from .
Claim 4.10.
Let be a collection of paths in ending in , and for each path , write . Suppose that the set
is -compatible. Then there are at least choices of a path in so that
is -compatible.

Sketch of Proof.
The argument is almost identical to that of Claim 4.6 so we only sketch the details (with our notation defined analogously as before). By Proposition 2.1(c) we have that there are at least paths in from to . Using 2.1(f) we find that very few of these paths contain any of the other vertices of besides and .
By using 3.2, we find that each of the sets , and after intersecting with are all of size whenever and with (here we use that for any such , so in the definitions of do not change when going from to ). From here essentially the same computations as before go through. ∎
Combining the previous three claims, we find that our algorithm produces a large number of theta graphs.
Claim 4.11.
The number of -compatible sets of size containing and is at least
Proof.
The result follows by iteratively applying Claims 4.9 and 4.10. Starting with , which is -compatible by Claim 4.8, we repeatedly apply Claim 4.9 to build paths (for ); we then finish by repeatedly applying Claim 4.10 to select paths . In total, we find that the number of -compatible sets of size with is at least
(19)
where the first two terms use that for each path we get a factor of for each vertex in position and a factor of for each by Claim 4.9, and the last term uses Claim 4.10. The expression above is equal to some positive constant depending only on times
where the inequality used (5), i.e. , and implicitly that so that the exponent of is positive. Finally, using and crudely gives
where this last step used . ∎
We are now ready to finish Case 2. If is sufficiently small in terms of , the number of theta graphs guaranteed by Claim 4.11 exceeds the codegree bound in Claim 4.8; thus there exists some -valid set obtained through our algorithm which is not already a hyperedge of . Adding such an to gives the result in this case.
4.2.4 Case 3: and Odd
This case is nearly identical to the previous one, and as such we only sketch the proof.
Again our goal is to add a new hyperedge to . To start, we pick such that is in as few hyperedges with as possible. Here we emphasize that, in the previous case, we picked and hence immediately obtained (since each element of is the endpoint of a path with ), but here we have to be slightly more careful and explicitly enforce . However, since no hyperedge of contains both and (since every hyperedge is a valid set), and since by Proposition 2.1(b), we find that is at most twice the bound from Claim 4.8, and the rest of the proof showing that this set is -compatible goes through in exactly the same way as in Claim 4.8.
From here we apply Claim 4.9 exactly as written (since depends only on the parity of and not of ); the proof of Claim 4.9 also remains word for word the same, with the only minor exception being that we have (which again implies when and ).
Finally, we choose paths in going from each of the vertices to , and again the statement and proof of Claim 4.10 remain exactly the same. With this, the total number of choices for the algorithm to produce an -compatible set is
since in this setting we get a factor of for each vertex in position , of which there are . This quantity is at least as large as (19), so we conclude that for sufficiently small the number of choices is more than the number of hyperedges containing in . With this we conclude the result.
5 Balanced Supersaturation for Edges
In the previous section we showed that exhibits balanced supersaturation for vertices in terms of the (complicated) codegree function . We begin by simplifying this function.
Proposition 5.1.
For all and , let and be as in 4.1. There exist constants such that if and induces edges, where , then
Note that always holds if we are considering -vertex graphs with edges. We defer the proof of 5.1 for the moment and show that together with 4.1, it implies a balanced supersaturation result for edges which we will use to complete the proof of 1.3; see 6.1 below.
Corollary 5.2.
For all and , there exist constants such that the following holds for all and . If is an -vertex graph with edges, then there exists a hypergraph on whose hyperedges are copies of and is such that and such that for every with , we have
Proof.
Let be the -good -hypergraph on guaranteed by 4.1. We would like to translate into a hypergraph on satisfying the codegree bounds above.
This will be conceptually straightforward, but a little tedious. In essence, the hyperedges of correspond to theta graphs in , and we will define to be the hypergraph corresponding to these theta graphs. However, we must deal with two small issues with this translation: (1) a single theta graph in may appear isomorphically several times in , and (2) the codegree bound depends on the number of edges induced by , whereas the bound in 5.2 depends only on for an arbitrary set of edges , even if the vertices used by induce additional edges. Neither of these issues is a real obstacle (in particular, the second can only improve the codegrees), but we will need some additional notation in order to address them.
For each valid set , we will define the corresponding set of edges induced in (excluding “extraneous” edges that do not play a role in the isomorphic copy of ) as follows:
In particular, is a copy of in for every hyperedge (since every hyperedge is a valid set of size ). Define to be the hypergraph with hyperedge set . Observe that , so it remains to check the codegree conditions – that is, to bound for each set of edges in .
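The translation from vertex sets to edge sets, and the first issue mentioned above (one theta graph appearing several times), can be seen in a toy example. The sketch assumes that a valid set can be modelled as a set of pairs (vertex of the theta graph, vertex of the host graph) describing an injective map that sends edges to edges; this encoding, the fixed small theta graph `F_EDGES`, and the names `embeddings` and `edge_projection` are assumptions made for illustration.

```python
from itertools import permutations

# Theta graph with 2 paths of length 2 between x and y (a 4-cycle).
F_VERTS = ["x", "y", "m1", "m2"]
F_EDGES = [("x", "m1"), ("m1", "y"), ("x", "m2"), ("m2", "y")]


def embeddings(g_verts, g_edges):
    """All injective maps V(F) -> V(G) sending edges of F to edges of G,
    each encoded as a frozenset of (F-vertex, G-vertex) pairs."""
    g = {frozenset(e) for e in g_edges}
    out = []
    for image in permutations(g_verts, len(F_VERTS)):
        phi = dict(zip(F_VERTS, image))
        if all(frozenset((phi[u], phi[v])) in g for u, v in F_EDGES):
            out.append(frozenset(phi.items()))
    return out


def edge_projection(S):
    """Edges of G that are images of edges of F under the (partial) map S.
    Extra edges induced in G by the image vertices are ignored."""
    phi = dict(S)
    return frozenset(
        frozenset((phi[u], phi[v]))
        for u, v in F_EDGES
        if u in phi and v in phi
    )
```

For example, when the host graph is itself a 4-cycle, all labelled embeddings project to the same edge set, so the projected hypergraph is smaller than the original by a factor bounded in terms of the theta graph alone.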
Fix a set of edges . We need to get an understanding of which valid sets “correspond” to . To this end, let be the set of vertices used by the edges , and let be the set of all valid with and . See Figure 7 for an example.

With this notation, we can convert the codegree bounds in to a bound on as follows.
Claim 5.3.
We have
Proof.
We would like to show that each hyperedge counted by corresponds to at least one hyperedge counted by for some . We first observe that if , then by definition of , there exists some with . We wish to show that this contains a set , so that it will be counted by .
To this end, we observe that if the edge set is contained in , then the corresponding vertex set is contained in ; so there exists some such that . We claim that as well. To see this, note that if , then for some by the definition of . Since and , we have as well, giving . So by definition, as desired. ∎
To finish, it remains to bound the sum above. First, notice that there are only a constant number of terms: each element of is uniquely identified by (since for each ), and hence . Since is -good, we have , and hence
where this last step used that each induces edges, together with the bound on given by 5.1. This gives the desired result by taking . ∎
5.1 Proof of 5.1
The rest of this section is dedicated to proving 5.1, which we emphasize will consist entirely of (moderately involved) arithmetic and case analysis. We will abuse notation slightly by identifying a vertex set with its induced subgraph in . For example, we may say contains a cycle to mean its induced subgraph contains a cycle. Unless stated otherwise, will refer to the number of edges that induces in . We let be the constant guaranteed by 4.1, and throughout we assume and . We recall that our goal is to show that there exists a constant such that for sufficiently large and for all inducing edges for , we have
where the definition of will be recalled below. We begin with an easy case.
Lemma 5.4.
For any codegree function , if is such that and induces edges, then
Proof.
This follows immediately from . ∎
We remind the reader that
For ease of reading, we recall each of the functions mentioned above before they are used. First, we recall
whenever induces a forest with edges.
Lemma 5.5.
If , , and induces a forest on edges, then
Proof.
If then the result follows from 5.4, and otherwise
where the factor of 2 comes from the ceiling function and the assumption . The result follows since . ∎
It remains to prove the result when contains a cycle. To help with the case analysis, we show that it suffices to prove the result when consists of paths of length , i.e. when contains no leaves or isolated vertices.
Lemma 5.6.
Let be a codegree function such that if contains a cycle, then either or for every .
If is a constant such that for every which consists of paths of length , we have
then for every and which contains a cycle, we have
Proof.
Assume the hypotheses hold for . We prove by induction on that any containing a cycle satisfies the desired inequality. For this proof, we recall that always denote the two high degree vertices of .
Say we have proved the result up to some set which induces edges. If then the inequality follows from Lemma 5.4, so we may assume . If consists of paths of length then the result follows by hypothesis. Otherwise, there exists some vertex which is adjacent to at most one other vertex in . Thus induces a graph containing a cycle with at least edges. By our hypothesis on , we find
where the second inequality used that our inductive hypothesis applies to (since contains a cycle and ). This gives the desired result. ∎
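The reduction in Lemma 5.6 repeatedly discards a vertex with at most one neighbour until only complete paths remain. A minimal sketch of that stripping process on a concrete theta graph follows; the vertex labels `"u"`, `"v"`, `(i, j)` and the function names `theta_graph` and `strip_low_degree` are hypothetical choices made for this illustration.

```python
def theta_graph(a, b):
    """Adjacency dict of the theta graph with a internally disjoint paths
    of b edges between 'u' and 'v'; internal vertices are labelled (i, j)."""
    adj = {"u": set(), "v": set()}
    for i in range(a):
        prev = "u"
        for j in range(1, b):
            w = (i, j)
            adj.setdefault(w, set())
            adj[prev].add(w)
            adj[w].add(prev)
            prev = w
        adj[prev].add("v")
        adj["v"].add(prev)
    return adj


def strip_low_degree(adj, S):
    """Induce on the vertex set S, then repeatedly delete vertices with at
    most one remaining neighbour. Returns the surviving vertex set."""
    live = {x: adj[x] & S for x in S}
    stack = [x for x in live if len(live[x]) <= 1]
    while stack:
        x = stack.pop()
        if x not in live:
            continue  # already deleted via an earlier stack entry
        for w in live.pop(x):
            live[w].discard(x)
            if len(live[w]) <= 1:
                stack.append(w)
    return set(live)
```

If the induced subgraph contains a cycle, the stripping stops at a nonempty union of complete paths between the two high-degree vertices. If it is a forest, everything is deleted.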
We will show that essentially all of our remaining codegree functions are of the form described in 5.6. First, we recall
whenever contains the two high degree vertices , and otherwise. Note that if contains a cycle, then if , and otherwise if we have , where the factor of 2 comes from the ceiling function. Thus satisfies the conditions of Lemma 5.6, and using this we prove the following.
Lemma 5.7.
There exists a constant such that if is sufficiently large in terms of and if induces edges where , then
Proof.
First, if induces a forest then the result follows from 5.5, so we may assume contains a cycle.
Now consider the case . In particular, it suffices here to show that
And indeed, since , this inequality is satisfied if
which holds for all and , provided is sufficiently large.
Thus we may assume that ; in particular, since , this implies that , and as such . Possibly by adjusting the constant , it now suffices to show satisfies the inequality of the lemma. By Lemmas 5.4 and 5.6, it suffices to show this holds for consisting of paths of length . In this case and , so it suffices to show
for some constant , where implicitly we used that the ceiling function in can be ignored by increasing by a factor of 2. Using and rearranging the above gives that it suffices to show
which holds for any since . We conclude the result. ∎
It remains to deal with the case . To start, we recall that we write the paths of as for , and that we define . We recall that if then
where , with otherwise. Similarly if for some then
where if is even and otherwise, with otherwise. Note that both of these codegree functions satisfy the conditions of Lemma 5.6 since we assumed . From now on we will assume we work with and define as in the above codegree functions. It will be useful to note that if consists of paths of length , then by definition
(20)
and
(21)
where this last equality follows from if is even and otherwise .
Lemma 5.8.
Let consist of complete paths and define . There exists a constant such that
provided either
or
Proof.
If then the result holds by Lemma 5.4, so from now on we assume . We can rewrite the denominator of as
Using this and , we see that to show the desired result holds with , it suffices to show
(22)
Using that the minimum in (22) is at most and rearranging, we see that it suffices to have
i.e. , which gives the first result.
If we instead use that the minimum in (22) is at most , then we see that this inequality will be satisfied provided
which is equivalent to
This gives the last part of the lemma. ∎
Lemma 5.9.
Let consist of complete paths. There exists a constant such that
provided
Proof.
and rearranging gives the desired result. ∎
With these two results we can solve the cycle case for provided is sufficiently large.
Lemma 5.10.
If and , then there exists a constant such that if is sufficiently large in terms of and if induces edges where , then
Proof.
By using similar reasoning as in Lemma 5.7, it suffices to prove this upper bound holds for (i.e. ignoring the term in ) whenever consists of paths of length . With as in Lemma 5.8, (20) and (21) give
where follows from . We also note that both inequalities of Lemma 5.8 become easier to satisfy for larger values of (this holds for the first inequality because the function is increasing for any , and it holds for the second since ).
We first claim that is relatively large in most cases; namely, if , then . Indeed, this being false is equivalent to
If then this is equivalent to , contradicting our assumption on , so we may assume . By dropping the ceiling function, the inequality above implies
which is equivalent to
with the last step using , again giving a contradiction to our assumption on .
We conclude that if . Note that the second inequality of Lemma 5.8 trivially holds at , and since the lemma is easier to satisfy for larger values of , we conclude the result for . From now on we assume . (Footnote 4: As an aside, it is not difficult to show that Lemma 5.8 alone suffices to prove the result for if, say, . However, for it is necessary to use Lemma 5.9 as well since, in particular, we can have in this case. Dealing with the case here is the only reason the codegree functions are introduced.)
Note that
and in particular,
(where the last step uses ). First consider the case . Note that for we have
where this last step holds for (and uses ). With this we either have (in which case we are done by the argument for ), or
in which case the result follows from Lemma 5.8.
Now assume , which in particular implies . By Lemma 5.9, and using , we obtain the result if . On the other hand, using the second inequality of Lemma 5.8, which is harder to satisfy the smaller is, we see that for the result holds if
Thus the result holds for all provided
or equivalently
This holds for , proving the result. ∎
6 Completing the Proof of Theorem 1.3
Recall that we wish to show that for all , there exists such that for any fixed , w.h.p.
The case follows from Morris and Saxton [20], so from now on we assume . The lower bounds for follow from [28, Corollary 5.1], which is proven using random polynomial graphs (similar to how Conlon [7] proved whenever is sufficiently large in terms of ). The upper bound for small follows from the fact that has at most edges w.h.p., and the upper bound for in the middle range will follow from the upper bound for large due to the monotonicity of with respect to . (Footnote 5: Specifically, one applies Corollary 5.1 to the rooted tree with the path on edges and its set of leaves. With this one can check (which is also implicitly shown in Conlon [7]), and that .)
With this all in mind, it only remains to prove when . For this we utilize the following general result showing that balanced supersaturation implies upper bounds on .
Theorem 6.1.
Let be a graph and a real number satisfying the following: there exist real numbers such that for every -vertex graph with and , there exists a hypergraph on whose hyperedges are copies of and is such that , and such that for every with , we have
In this case,
We note that if is 2-balanced, i.e. if it has , then the conclusion of Theorem 6.1 is exactly the upper bound predicted by Conjecture 1.2 provided .
The proof of Theorem 6.1 uses what is by now a fairly routine argument involving hypergraph containers, which is a powerful technique developed recently and independently by Balogh, Morris and Samotij [1] and Saxton and Thomason [26]. We defer the details to Appendix A.
7 Concluding Remarks
In this paper we established upper bounds for which are essentially tight whenever is sufficiently large in terms of . It would be of interest if one could extend our ideas to prove effective upper bounds on for other . In particular, one might hope to prove upper bounds for powers of rooted trees.
More precisely, given a tree , a set , and an integer , we define to be the graph consisting of copies of which agree only on the set . For example, if is a path of length and consists of its two endpoints, then , and if and is its set of leaves, then . In particular, the only bipartite graphs for which we know tight bounds for , namely theta graphs and complete bipartite graphs, are examples of powers of trees.
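To make the gluing in this definition concrete, the following Python sketch builds the edge set of a power of a rooted tree from an edge list. The function name tree_power and the vertex labelling (shared root vertices tagged with 0, all other vertices tagged with the index of their copy) are illustrative choices and not notation from the paper.

```python
def tree_power(tree_edges, roots, copies):
    """Edge set of `copies` copies of a tree that agree only on `roots`.

    Hypothetical helper for illustration: a root vertex v is shared by all
    copies and labelled (v, 0); any other vertex v of the i-th copy is a
    fresh vertex labelled (v, i) with i >= 1.
    """
    roots = set(roots)

    def label(v, i):
        return (v, 0) if v in roots else (v, i)

    edges = set()
    for i in range(1, copies + 1):
        for u, v in tree_edges:
            # A set of frozensets merges edges that lie entirely inside `roots`.
            edges.add(frozenset((label(u, i), label(v, i))))
    return edges


# Path of length 3 rooted at its two endpoints, 4 copies:
# four internally disjoint paths of length 3 between the two roots.
theta = tree_power([(0, 1), (1, 2), (2, 3)], roots={0, 3}, copies=4)

# Star with centre "c" and two leaves, rooted at its leaves, 3 copies:
# the complete bipartite graph with parts of size 2 and 3.
biclique = tree_power([("c", "x"), ("c", "y")], roots={"x", "y"}, copies=3)
```

With these inputs, theta has 12 edges on 10 vertices and biclique has 6 edges on 5 vertices, matching the two examples in the text (a theta graph and a complete bipartite graph).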
Question 7.1.
Can one prove essentially tight bounds on for other powers of rooted trees?
The best upper bounds for this problem come from the general bounds of Jiang and Longbrake [17], and the best lower bound comes from [28]. We note that analogous to the situation for theta graphs prior to this paper, the lower bound of [28] depends only on the tree while the upper bound of [17] depends on , and as such the gaps between these bounds grow large as increases. Similar to the situation in the present paper, we suspect that the lower bound is closer to the truth, and in particular, Conjecture 1.2 claims that in many cases, the lower bound from [28] should be the correct answer.
Solving Question 7.1 for all rooted trees is likely impossible. Indeed, even the case, namely that of determining the Turán number , is an important open problem of Bukh and Conlon [4] related to the rational exponents conjecture. That being said, there are a number of special cases where this Turán number is known [8, 9, 15, 16, 18], and it might be possible to generalize our ideas to deal with some of these cases in the random setting. A more detailed discussion on this problem can be found in the concluding remarks of [28].
To prove Theorem 1.3, we first proved a balanced supersaturation result, 5.2, which is essentially optimal for . It would be desirable to do this for all .
Question 7.2.
Can one extend 5.2 to hold for all ?
Note that the case is already dealt with by Morris and Saxton [20]. Solving this question, in addition to being desirable from a philosophical standpoint, might lead to a simpler proof of 5.2 which could more easily generalize to solving Question 7.1. The simplest way to resolve this question would be to resolve the following.
Question 7.3.
Can one extend Proposition 2.1 to hold with ?
An affirmative answer here would not only give an affirmative answer to Question 7.2, but also would allow one to avoid many of the messy technical details in our proof. Namely, with this one can alter the definition of in such a way that the functions are no longer needed, and such that the computations for proving 5.1 are much simpler.
Acknowledgments. We thank Rob Morris for useful comments about the presentation of this paper.
References
- [1] József Balogh, Robert Morris and Wojciech Samotij “Independent sets in hypergraphs” In J. Amer. Math. Soc. 28.3, 2015, pp. 669–709 DOI: 10.1090/S0894-0347-2014-00816-X
- [2] J. A. Bondy and M. Simonovits “Cycles of even length in graphs” In J. Combinatorial Theory Ser. B 16, 1974, pp. 97–105 DOI: 10.1016/0095-8956(74)90052-5
- [3] Boris Bukh “Extremal graphs without exponentially-small bicliques” In arXiv:2107.04167, 2022
- [4] Boris Bukh and David Conlon “Rational exponents in extremal graph theory” In J. Eur. Math. Soc. (JEMS) 20.7, 2018, pp. 1747–1757 DOI: 10.4171/JEMS/798
- [5] Maurı́cio Collares Neto and Robert Morris “Maximum-size antichains in random set-systems” In Random Structures Algorithms 49.2, 2016, pp. 308–321 DOI: 10.1002/rsa.20647
- [6] D. Conlon and W. T. Gowers “Combinatorial theorems in sparse random sets” In Ann. of Math. (2) 184.2, 2016, pp. 367–454 DOI: 10.4007/annals.2016.184.2.2
- [7] David Conlon “Graphs with few paths of prescribed length between any two vertices” In Bull. Lond. Math. Soc. 51.6, 2019, pp. 1015–1021 DOI: 10.1112/blms.12295
- [8] David Conlon and Oliver Janzer “Rational exponents near two” In Adv. Comb., 2022, 10 pp.
- [9] David Conlon, Oliver Janzer and Joonkyung Lee “More on the extremal number of subdivisions” In Combinatorica 41.4, 2021, pp. 465–494 DOI: 10.1007/s00493-020-4202-1
- [10] Jan Corsten and Tuan Tran “Balanced supersaturation for some degenerate hypergraphs” In J. Graph Theory 97.4, 2021, pp. 600–623 DOI: 10.1002/jgt.22674
- [11] P. Erdös and A. H. Stone “On the structure of linear graphs” In Bull. Amer. Math. Soc. 52, 1946, pp. 1087–1091 DOI: 10.1090/S0002-9904-1946-08715-7
- [12] R. J. Faudree and M. Simonovits “On a class of degenerate extremal graph problems” In Combinatorica 3.1, 1983, pp. 83–93 DOI: 10.1007/BF02579343
- [13] Zoltán Füredi “Random Ramsey graphs for the four-cycle” In Discrete Math. 126.1-3, 1994, pp. 407–410 DOI: 10.1016/0012-365X(94)90287-9
- [14] Zoltán Füredi and Miklós Simonovits “The history of degenerate (bipartite) extremal graph problems” In Erdös centennial 25, Bolyai Soc. Math. Stud. János Bolyai Math. Soc., Budapest, 2013, pp. 169–264 DOI: 10.1007/978-3-642-39286-3_7
- [15] Oliver Janzer “The extremal number of the subdivisions of the complete bipartite graph” In SIAM J. Discrete Math. 34.1, 2020, pp. 241–250 DOI: 10.1137/19M1269798
- [16] Tao Jiang, Zilin Jiang and Jie Ma “Negligible obstructions and Turán exponents” In Ann. Appl. Math. 38.3, 2022, pp. 356–384
- [17] Tao Jiang and Sean Longbrake “Balanced supersaturation and Turán numbers in random graphs” In arXiv:2208.10572, 2022
- [18] Dong Yeap Kang, Jaehoon Kim and Hong Liu “On the rational Turán exponents conjecture” In J. Combin. Theory Ser. B 148, 2021, pp. 149–172 DOI: 10.1016/j.jctb.2020.12.003
- [19] T. Kövari, V. T. Sós and P. Turán “On a problem of K. Zarankiewicz” In Colloq. Math. 3, 1954, pp. 50–57 DOI: 10.4064/cm-3-1-50-57
- [20] Robert Morris and David Saxton “The number of C_{2ℓ}-free graphs” In Adv. Math. 298, 2016, pp. 534–580 DOI: 10.1016/j.aim.2016.05.001
- [21] Dhruv Mubayi and Liana Yepremyan “On The Random Turán number of linear cycles” In arXiv:2304.15003, 2023
- [22] Jiaxi Nie “Random Turán theorem for expansions of spanning subgraphs of tight trees” In arXiv:2305.04193, 2023
- [23] Jiaxi Nie “Turán theorems for even cycles in random hypergraph” In arXiv:2304.14588, 2023
- [24] Jiaxi Nie, Sam Spiro and Jacques Verstraëte “Triangle-free subgraphs of hypergraphs” In Graphs Combin. 37.6, 2021, pp. 2555–2570 DOI: 10.1007/s00373-021-02388-5
- [25] Vojtěch Rödl and Mathias Schacht “Extremal results in random graphs” In Erdös centennial 25, Bolyai Soc. Math. Stud. János Bolyai Math. Soc., Budapest, 2013, pp. 535–583 DOI: 10.1007/978-3-642-39286-3_20
- [26] David Saxton and Andrew Thomason “Hypergraph containers” In Invent. Math. 201.3, 2015, pp. 925–992 DOI: 10.1007/s00222-014-0562-8
- [27] Mathias Schacht “Extremal results for random discrete structures” In Ann. of Math. (2) 184.2, 2016, pp. 333–365 DOI: 10.4007/annals.2016.184.2.1
- [28] Sam Spiro “Random Polynomial Graphs for Random Turán Problems” In arXiv:2212.08050, 2022
- [29] Sam Spiro and Jacques Verstraëte “Relative Turán problems for uniform hypergraphs” In SIAM J. Discrete Math. 35.3, 2021, pp. 2170–2191 DOI: 10.1137/20M1364631
Appendix A Proof of Theorem 6.1
Throughout this section, we say that a graph is -good with a real number if it satisfies the following balanced supersaturation condition: there exist real numbers such that for every -vertex graph with and , there exists a hypergraph on whose hyperedges are copies of and is such that , and such that for every with , we have
Here we prove Theorem 6.1, i.e. that if is -good, then w.h.p. for all . We emphasize that our proof is nearly word-for-word the same as that of Morris and Saxton [20]. We make use of the following definition from [26].
Definition 9.
Given an -uniform hypergraph and a real number , define
where
denotes the maximum degree in of a -set containing .
We remark that we have removed some extraneous constants from the definition in [26], since these do not affect the formulation of the theorem below. We also note that is typically called a codegree function, but we emphasize that this has no relation to the definition of codegree functions that we used throughout our paper.
The following container theorem was proved by Balogh, Morris and Samotij [1, Proposition 3.1] and by Saxton and Thomason [26, Theorem 6.2], where here the notation denotes the collection of all subsets of of size at most . (Footnote 6: To be precise, Theorem 6.2 in [26] is stated where is a tuple of vertex sets rather than a single vertex set, but it is straightforward to deduce this form from the methods of [26].)
Theorem A.1.
Let and let be sufficiently small. Let be an -graph with vertices, and suppose that for some . Then there exists a collection of subsets of , and a function such that:
-
For every independent set , there exists with and , and
-
for every .
We will refer to the collection as the containers of , since, by , every independent set is contained in some member of . The reader should think of as being the edge set of some underlying graph , and as encoding (some subset of) the copies of a graph in . Thus every -free subgraph of is an independent set of .
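The correspondence described above can be checked by hand on small examples. The following Python sketch takes the forbidden graph to be the 4-cycle (chosen here only for simplicity), builds the hypergraph whose vertices are the edges of a host graph and whose hyperedges are the edge sets of its 4-cycles, and tests whether a subgraph is an independent set. The names c4_copies and is_independent are hypothetical, and the brute-force enumeration is only meant for tiny inputs.

```python
from itertools import combinations


def c4_copies(edges):
    """Hyperedges of the hypergraph of 4-cycles of a graph.

    Each hyperedge is a frozenset of four graph edges (themselves
    frozensets of two vertices) forming a 4-cycle in the host graph.
    """
    host = {frozenset(e) for e in edges}
    vertices = sorted(set().union(*host))
    copies = set()
    for a, b, c, d in combinations(vertices, 4):
        # The three distinct 4-cycles on the vertex set {a, b, c, d}.
        for cyc in ((a, b, c, d), (a, b, d, c), (a, c, b, d)):
            cyc_edges = frozenset(
                frozenset((cyc[i], cyc[(i + 1) % 4])) for i in range(4)
            )
            if cyc_edges <= host:
                copies.add(cyc_edges)
    return copies


def is_independent(subgraph_edges, hyperedges):
    """True if the edge set contains no hyperedge, i.e. no 4-cycle."""
    chosen = {frozenset(e) for e in subgraph_edges}
    return not any(h <= chosen for h in hyperedges)


k4 = list(combinations(range(4), 2))
hyper = c4_copies(k4)
star_ok = is_independent([(0, 1), (0, 2), (0, 3)], hyper)
cycle_ok = is_independent([(0, 1), (1, 2), (2, 3), (0, 3)], hyper)
```

Here the complete graph on four vertices yields three hyperedges; the star is independent (star_ok is True) while the 4-cycle is not (cycle_ok is False).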
Let us introduce some notation to simplify the statements which follow. Given a graph and real number , let denote the collection of -free graphs with vertices, and let denote the collection of all graphs with vertices and at most edges. By a colored graph, we mean a graph together with an arbitrary labelled partition of its edge set.
Theorem A.2.
If is -good, then there exists a constant such that the following holds for all sufficiently large with . There exists a collection of colored graphs with vertices and at most edges, as well as functions
with the following properties:
-
For every , the number of colored graphs in with edges is at most
-
for every .
We prove Theorem A.2 by iterating the following container result.
Proposition A.3.
If is an -good graph, then there exist and such that the following holds for every and every . Set
(23)
Given a graph with vertices and edges, there exists a function that maps subgraphs of to subgraphs of , such that, for every -free subgraph ,
-
There exists a subgraph with and , and
-
.
Proof.
By definition of being -good, there exist real numbers and a hypergraph on whose hyperedges are copies of and is such that
-
(i)
, and
-
(ii)
For every with , we have
Let , and without loss of generality we can assume is sufficiently large so that Theorem A.1 holds with and this choice of . We will show that if
then it follows from and that . Indeed, since , we have
where this last bound holds provided is sufficiently small, which we can assume to be the case without loss of generality. Thus, applying Theorem A.1 and setting , we obtain a collection of subgraphs of and a function mapping subgraphs of to elements of so that for every -free subgraph , there exists a subgraph with
and , and also
(24)
It only remains to show that this second condition implies for every (notice that the first inequality is about hyperedges and the second is about graph edges). To prove this, for each set
and recall that for every , by , , and . Therefore,
On the other hand, we have by condition (24), and so
as required. Hence the proposition follows with . ∎
With this in hand, we continue on towards the proof of Theorem A.2. We will need the following straightforward lemma (see, for example, [5, Lemma 4.3]).
Lemma A.4.
Let , and . If satisfy and for each , then
We can now deduce Theorem A.2.
Proof of Theorem A.2.
We construct the functions and and the family as follows. Given an -free graph , we repeatedly apply Proposition A.3, first to the complete graph , then to the graph , where is the set guaranteed to exist by part , then to the graph , where , and so on. We continue until we arrive at a graph with at most edges, and set
Since depends only on the sequence , the function is well-defined.
It remains to bound the number of colored graphs in with edges. To do so, it suffices to count the number of choices for the sequence of graphs with . For each , define and as follows:
and note that
Thus, fixing , and as above, and writing
for each , and
for each , it follows that the number of colored graphs in with edges is at most
Given , and , let us partition the product over according to whether or not . Since if is at least some large constant times , the product of the terms for which this is the case is at most
where in the last step we used the fact that . On the other hand, if , i.e. if , then
and hence, by Lemma A.4, the product over the remaining is at most
for some . Noting that since for at least some large constant , the theorem follows. ∎
We can now easily deduce Theorem 6.1.
Proof of Theorem 6.1.
Let be a graph satisfying the hypotheses of the theorem, i.e. a graph which is -good for some . Recall that we wish to show that for , we have w.h.p. Given such a function , define . Since , we can apply Theorem A.2 to get functions . Suppose that there exists an -free subgraph with edges, and observe that , and that contains at least elements of . The probability of this event is therefore at most
as , as long as is a sufficiently large constant times
We conclude that w.h.p., giving the result. ∎
Appendix B Proof of Proposition 2.1
We emphasize that almost everything in this section will be nearly identical to Morris and Saxton [20]. We first recall the definitions and conventions introduced in Section 2:
-
•
We fixed a sequence of rapidly decreasing constants
which depend only on . We also fixed some -vertex graph with minimum degree with (and hence ) sufficiently large in terms of the constants.
-
•
For , we say that a tuple of (not necessarily disjoint) subsets of is a concentrated -neighborhood of if , for all , and
We define to be the minimal such that there exists a concentrated -neighborhood of in .
-
•
Lemma 2.2 says that for some , there exists of size at least such that for every , and such that for every there exists a tuple of sets such that , , for all , and every has .
For the rest of this section we fix as in Lemma 2.2. We also fix some sufficiently small compared to the constants, as well as a set of forests such that for every path of which does not contain an element of as a subgraph, there are at most vertices such that the path contains an element of as a subgraph. As much as possible we use the notation of Morris and Saxton’s original proof, and in particular, we drop our convention from the main part of the text that are used only as vertices of .
We introduce some notation and definitions that will be used for the rest of the proof. Given a set of paths , we define the -branching factor of to be the maximum number such that there exist paths in with th vertex and pairwise distinct st vertices. The branching factor of is defined to be the maximum -branching factor amongst all choices of . We define and , and also define .
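The branching factor is easy to compute directly from a list of paths. The sketch below reads the definition as follows, which is an assumption about the stripped symbols: for a position i and a vertex v, count the distinct vertices that appear immediately after v among paths having v in position i, and take the maximum over all pairs (i, v). Paths are represented as tuples with positions counted from 0, and the name branching_factor is hypothetical.

```python
from collections import defaultdict


def branching_factor(paths):
    """Maximum, over positions i and vertices v, of the number of distinct
    successors of v among paths whose i-th vertex is v (0-indexed).
    Returns 0 for an empty collection of paths.
    """
    successors = defaultdict(set)
    for path in paths:
        for i in range(len(path) - 1):
            successors[(i, path[i])].add(path[i + 1])
    return max((len(s) for s in successors.values()), default=0)


paths = [(0, 1, 2), (0, 1, 3), (0, 4, 5)]
bf = branching_factor(paths)
```

In this example vertex 0 in position 0 has successors {1, 4} and vertex 1 in position 1 has successors {2, 3}, so bf equals 2.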
One lemma that we will need in several places is the following.
Lemma B.1.
Let be a collection of paths of length in from a vertex to a set . Assume that , , and that has branching factor at most . Then .
Proof.
Form a subset by starting with and then iteratively choosing such that the -branching factor of is less than and then deleting any paths which contain as their th vertex. Let be the set of vertices used as the th vertex of some path of . If is non-empty, then is a concentrated -neighborhood of by construction, which shows .
Thus it suffices to show that is non-empty. We claim that the number of paths that were destroyed is at most . Indeed, because has branching factor at most , every destroyed path can be identified by choosing some index , starting a path at , and then iteratively choosing the next vertex of the path in at most ways for each and in at most ways when , proving the claim. Since is strictly greater than the number of destroyed paths, is non-empty and the result follows. ∎
The following definition will almost be strong enough to prove Proposition 2.1.
Definition 10.
Let be a collection of (not necessarily disjoint) sets of vertices of with and let be a collection of paths of the form with for all . We say that is a balanced -neighborhood of if the following conditions hold:
-
(i)
We have and .
-
(ii)
For every with and every , we have .
-
(iii)
The branching factor of is at most .
For the next lemma we recall that is a set of forests satisfying a property that depends on .
Lemma B.2.
If , then there exists a balanced -neighborhood of with such that every subgraph of each does not lie in provided is sufficiently small.
Proof.
Let be the tuple of sets guaranteed by Lemma 2.2. We may assume that , as otherwise we can just remove vertices from while maintaining all the properties guaranteed by Lemma 2.2. For each , let be an arbitrary subset of of size . Let be the set of paths generated as follows. Given , select any such that and such that no subgraph of is contained in . Note that the number of choices at each step is at least
with the last step holding if is sufficiently small (which also implies is sufficiently large compared to ). This means
(25)
Note that by construction, every path in avoids .
We now remove some paths from to produce . If there exists with and vertices and , then choose a path with and and delete this path from . Repeat this until no such paths remain in , and let be the resulting set of paths. By construction is a balanced neighborhood, so it suffices to show is large.
We say that a pair of vertices is -unbalanced if and (we emphasize that this condition involves the original family before any paths are deleted). Let . We claim that
(26)
for all with . Assuming this is true, this fact together with the branching factor of implies that the number of paths removed is at most
with this last step holding if is sufficiently large (which holds if is sufficiently small). From this and , we conclude that the remaining set of paths has the desired properties and is as large as claimed. It thus remains to prove (26).
Fix and let . If then (since no pair of vertices can be -unbalanced), so we may assume . Observe that
where the second inequality used which has branching factor at most . Thus if we assume for contradiction that (26) does not hold, then there must exist some such that
(27)
with this last step holding if the constants decrease sufficiently quickly. Let
Note that because has branching factor at most , and that each is the last vertex of more than paths of (since by definition of , such a pair must be -unbalanced). Using gives
With this and (27), we can apply Lemma B.1 to to conclude . This gives a contradiction to and the properties of guaranteed by Lemma 2.2.
∎
A key fact about balanced neighborhoods is the following.
Lemma B.3.
Let be a balanced -neighborhood of . If is sufficiently small, then for any and non-empty set of vertices , there are at most paths in containing .
Proof.
Let , and for ease of notation let and . Given a sequence , the number of paths with is at most
Every path containing can be formed in this way, possibly by reordering the elements of and by choosing different indices . As the number of ways of doing this is some finite number depending only on , we conclude the result. ∎
We now move on to the last notion of neighborhoods that we need for this proof.
Definition 11.
Let be a balanced -neighborhood of . We say that is a refined -neighborhood of if the following conditions also hold:
-
1.
For every and every ,
-
2.
For every ,
-
3.
For every ,
Lemma B.4.
If is a balanced -neighborhood of a vertex with , then there exists a refined -neighborhood of with for all and such that
Proof.
Repeatedly delete vertices using the following three steps until no further vertices can be removed:
-
Step 1
If there exists and with
then remove from and remove all paths with .
-
Step 2
If there exists with
then remove from and remove all paths with .
-
Step 3
If there exists with
then remove from and remove all paths with .
Let and be the set of vertices and paths that remain at the end of this process and let . Note that with this, automatically satisfies every condition for a refined -neighborhood except possibly . This will follow from having large, which we prove below by arguing that few paths are destroyed in the process above.
Because is a balanced -neighborhood, its branching factor is at most . As such the number of paths removed in Step 1 is at most
(28)
and in Step 3 we remove at most
(29)
where this last step uses the definition of balanced -neighborhoods.
For Step 2, we aim to show that the number of destroyed paths is at most
(30)
Let and denote the collection of vertices and paths removed in Step 2, and let
Note that by the definition of and the bound on the branching factor of ,
(31)
so it remains to show that there are few paths which use a vertex of as the second to last vertex. For this, let
Note first that by definition of , . On the other hand, we have that since at the time each vertex is deleted, has at most neighbors in . In total then we find
(32)
where this last step used by definition of balanced neighborhoods.
Observe that if the number of paths in using a vertex of as the second to last vertex is at most then (31) implies that the number of paths removed is at most (30), so we may assume this is not the case. Letting , this assumption together with the branching factor of implies , or equivalently
(33)
with the last step holding if the constants decrease sufficiently quickly. If , then (32) and (33) together with Lemma B.1 imply , contradicting . If then (32) gives , so the fact that has branching factor at most implies that there are at most paths in whose second to last vertex is in . In either case, this bound together with (31) implies the number of paths removed is at most (30). ∎
Proof of Proposition 2.1.
Let be as in Lemma 2.2. For each , let be the balanced -neighborhood guaranteed by Lemma B.2 and the refined -neighborhood guaranteed by Lemma B.4 from . Most of the properties of Proposition 2.1 follow immediately from Definitions 10 and 11, as well as Lemmas B.2 and B.3 (with the last lemma using that is in particular a balanced -neighborhood). The only conditions which are not immediate are the bounds . If this bound did not hold for , then the tuple would be a concentrated -neighborhood of (assuming ), contradicting every having .