The Random Turán Problem for Theta Graphs

Gwen McKinley, Department of Mathematics, University of California, San Diego, La Jolla, CA. Email: gmckinley@ucsd.edu.
Sam Spiro, Department of Mathematics, Rutgers University, Piscataway, NJ. Email: sas703@scarletmail.rutgers.edu. Research supported by the NSF Mathematical Sciences Postdoctoral Research Fellowships Program under Grant DMS-2202730.
Abstract

Given a graph $F$, we define $\mathrm{ex}(G_{n,p},F)$ to be the maximum number of edges in an $F$-free subgraph of the random graph $G_{n,p}$. Very little is known about $\mathrm{ex}(G_{n,p},F)$ when $F$ is bipartite, with essentially tight bounds known only when $F$ is either $C_{4}$, $C_{6}$, $C_{10}$, or $K_{s,t}$ with $t$ sufficiently large in terms of $s$, due to work of Füredi and of Morris and Saxton. We extend this work by establishing essentially tight bounds when $F$ is a theta graph with sufficiently many paths. Our main innovation is in proving a balanced supersaturation result for vertices, which differs from the standard approach of proving balanced supersaturation for edges.

1 Introduction

Given a graph $F$, we define the Turán number or extremal number $\mathrm{ex}(n,F)$ to be the maximum number of edges that an $n$-vertex $F$-free graph can have. If $F$ is not bipartite, then the asymptotic behavior of $\mathrm{ex}(n,F)$ is determined by the Erdős-Stone theorem [11]. Only sporadic results for $\mathrm{ex}(n,F)$ are known when $F$ is bipartite, and in most cases these bounds are not tight.

For example, the Kővári-Sós-Turán theorem [19] implies $\mathrm{ex}(n,K_{s,t})=O(n^{2-1/s})$, and this bound is only known to be tight when $t$ is sufficiently large in terms of $s$; see for example [3]. The bound $\mathrm{ex}(n,C_{2b})=O(n^{1+1/b})$ was first proven by Bondy and Simonovits [2]. It was shown by Faudree and Simonovits [12] that this same upper bound continues to hold for theta graphs with paths of length $b$, and it was later shown by Conlon [7] that this upper bound is tight for theta graphs which have sufficiently many paths. A more detailed treatment of Turán numbers of bipartite graphs can be found in the survey by Füredi and Simonovits [14].

In this paper, we study a probabilistic analog of the Turán number. We define the random graph $G_{n,p}$ to be the $n$-vertex graph obtained by including each possible edge independently and with probability $p$, and we let $\mathrm{ex}(G_{n,p},F)$ denote the maximum number of edges in an $F$-free subgraph of $G_{n,p}$.

Observe that $\mathrm{ex}(G_{n,1},F)=\mathrm{ex}(n,F)$. With this in mind, it is natural to ask if the classical bounds on $\mathrm{ex}(n,F)$ mentioned above can be extended to bounds on $\mathrm{ex}(G_{n,p},F)$ for all $p$. For example, just as in the classical case, we have a complete asymptotic understanding of $\mathrm{ex}(G_{n,p},F)$ when $F$ is not bipartite due to breakthrough work done independently by Conlon and Gowers [6] and Schacht [27]. To formally state their result, we define the 2-density of a graph $F$ by

\[m_{2}(F)=\max\left\{\frac{e(F')-1}{v(F')-2}:F'\subseteq F,\ e(F')\geq 2\right\},\]

and we write $f(n)\gg g(n)$ to mean $f(n)/g(n)$ tends to infinity as $n$ tends to infinity. We also recall that a sequence of events $A_{n}$ occurs with high probability or w.h.p. if $\Pr[A_{n}]$ tends to 1 as $n$ tends to infinity.
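To make the definition of the 2-density concrete, the following is a minimal brute-force sketch in Python. The function name `two_density` and the edge-list representation are illustrative assumptions, not part of the paper; the search is exponential in the number of edges and is only meant for very small graphs.

```python
from fractions import Fraction
from itertools import combinations


def two_density(edges):
    """Brute-force m_2(F): maximize (e(F') - 1) / (v(F') - 2) over subgraphs with e(F') >= 2.

    It suffices to range over edge subsets, since adding isolated vertices to F'
    only lowers the ratio. Assumes a simple graph given as a list of vertex pairs.
    """
    edges = [tuple(e) for e in edges]
    best = None
    for size in range(2, len(edges) + 1):
        for subset in combinations(edges, size):
            vertices = {v for e in subset for v in e}
            ratio = Fraction(size - 1, len(vertices) - 2)
            if best is None or ratio > best:
                best = ratio
    return best


c4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
triangle = [(0, 1), (1, 2), (0, 2)]
k23 = [(0, 2), (0, 3), (0, 4), (1, 2), (1, 3), (1, 4)]
print(two_density(c4), two_density(triangle), two_density(k23))
```

Exact rational arithmetic is used so that the resulting exponents $1/m_{2}(F)$ can be compared without rounding issues.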

Theorem 1.1 ([6, 27]).

For any graph $F$, w.h.p.

\[\mathrm{ex}(G_{n,p},F)=\begin{cases}\left(1-\frac{1}{\chi(F)-1}+o(1)\right)p\binom{n}{2}&p\gg n^{-1/m_{2}(F)},\\ (1+o(1))p\binom{n}{2}&n^{-1/m_{2}(F)}\gg p\gg n^{-2}.\end{cases}\]

Theorem 1.1 to a large extent solves the random Turán problem for non-bipartite graphs, though many questions still remain in this setting; see the survey by Rödl and Schacht [25] for more on this.

For the rest of this paper we focus on the random Turán problem when $F$ is bipartite, a setting where much less is known. This lack of information is partially due to the fact that the classical Turán numbers $\mathrm{ex}(n,F)$ are unknown for almost all bipartite graphs. An additional obstacle is that even if it is known that $\mathrm{ex}(n,F)=\Theta(n^{\alpha})$ for some bipartite graph $F$, it is typically not the case that $\mathrm{ex}(G_{n,p},F)=\Theta(pn^{\alpha})$ for large $p$; see for example Figure 1 which plots $\mathrm{ex}(G_{n,p},C_{4})$. That is, unlike in the non-bipartite case, the extremal constructions for $\mathrm{ex}(G_{n,p},F)$ with $F$ bipartite are far from the intersection of $G_{n,p}$ with an extremal $F$-free graph.

Figure 1: A plot of the value of $\mathrm{ex}(G_{n,p},C_{4})$ as a function of $p$, with these bounds holding with high probability and up to constant factors.

Despite these obstacles, what is currently known about the random Turán problem for bipartite graphs suggests that the following analog of Theorem 1.1 could be true. Notice that in contrast to the non-bipartite case, where $\mathrm{ex}(G_{n,p},F)$ exhibits a single “phase transition” around $p=n^{-1/m_{2}(F)}$, if Conjecture 1.2 is correct, we will see two phase transitions in the bipartite case.

Conjecture 1.2.

If $F$ is a graph with $\mathrm{ex}(n,F)=\Theta(n^{\alpha})$ for some $\alpha\in(1,2]$, then w.h.p.

\[\mathrm{ex}(G_{n,p},F)=\begin{cases}\max\{\Theta(p^{\alpha-1}n^{\alpha}),\ n^{2-1/m_{2}(F)}(\log n)^{O(1)}\}&p\geq n^{-1/m_{2}(F)},\\ (1+o(1))p\binom{n}{2}&n^{-1/m_{2}(F)}\gg p\gg n^{-2}.\end{cases}\]

Equivalently,

\[\mathrm{ex}(G_{n,p},F)=\begin{cases}\Theta(p^{\alpha-1}n^{\alpha})&p\geq n^{\frac{2-\alpha-1/m_{2}(F)}{\alpha-1}}(\log n)^{O(1)},\\ n^{2-1/m_{2}(F)}(\log n)^{O(1)}&n^{\frac{2-\alpha-1/m_{2}(F)}{\alpha-1}}(\log n)^{O(1)}\geq p\geq n^{-1/m_{2}(F)},\\ (1+o(1))p\binom{n}{2}&n^{-1/m_{2}(F)}\gg p\gg n^{-2}.\end{cases}\]

The bound for $n^{-1/m_{2}(F)}\gg p\gg n^{-2}$ is easy to prove with a deletion argument, as is the lower bound of $\Omega(n^{2-1/m_{2}(F)})$ when $p\geq n^{-1/m_{2}(F)}$. Thus the hard part of Conjecture 1.2 is in showing that $\mathrm{ex}(G_{n,p},F)=\Theta(p^{\alpha-1}n^{\alpha})$ whenever $p$ is large.

In terms of evidence supporting Conjecture 1.2, Füredi [13] showed that this conjecture holds when $F=C_{4}$. This work was substantially generalized by Morris and Saxton [20], who proved $\mathrm{ex}(G_{n,p},K_{s,t})=O(p^{1-1/s}n^{2-1/s})$ w.h.p. when $p$ is large, and that $\mathrm{ex}(G_{n,p},C_{2b})=O(p^{1/b}n^{1+1/b})$ w.h.p. when $p$ is large. Moreover, they showed that these bounds are tight provided $\mathrm{ex}(n,K_{s,t})=\Omega(n^{2-1/s})$ and $\mathrm{ex}(n,\{C_{3},C_{4},\ldots,C_{2b}\})=\Omega(n^{1+1/b})$, respectively. The lower bound of Conjecture 1.2 was shown to hold for large powers of balanced trees by Spiro [28]. Under some mild conditions, Jiang and Longbrake [17] proved a general upper bound of the form

\[\mathrm{ex}(G_{n,p},F)=O\big(p^{1-m_{2}^{*}(F)(2-\alpha)}n^{\alpha}\big)\text{ w.h.p.\ when }p\text{ is large},\]

where $m_{2}^{*}(F)=\max\left\{\frac{e(F')-1}{v(F')-2}:F'\subsetneq F,\ e(F')\geq 2\right\}$. This bound matches Conjecture 1.2 precisely when $m_{2}^{*}(F)=1$, which happens when $F=C_{2b}$ (giving a simpler proof of [20]), but otherwise is strictly weaker than the bound proposed in Conjecture 1.2. As far as we are aware, these are the only known bounds for the random Turán problem for bipartite graphs, though there have been a number of recent results regarding the analogous problem for degenerate hypergraphs; see for example [21, 22, 23, 24, 29].

In this paper, we contribute to this growing body of literature by studying the random Turán problem for theta graphs. Recall that a theta graph $\theta_{a,b}$ is a graph which consists of two vertices $u,v$ together with $a$ internally disjoint paths from $u$ to $v$ of length $b$. For example, $\theta_{3,4}$ is depicted in Figure 2.

Figure 2: A depiction of the theta graph $\theta_{3,4}$.

Observe that $\theta_{2,b}=C_{2b}$ and $\theta_{a,2}=K_{2,a}$. Given that Morris and Saxton essentially solved the random Turán problem for cycles and complete bipartite graphs, one might hope that their methods can be extended to give bounds for $\mathrm{ex}(G_{n,p},\theta_{a,b})$ in general.
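The two identities above can be checked mechanically on small cases. The sketch below builds $\theta_{a,b}$ as an edge list; the function names and the vertex labels (`"u"`, `"v"`, and pairs `(i, j)` for the $j$-th internal vertex of the $i$-th path) are illustrative choices, not notation from the paper.

```python
def theta_graph(a, b):
    """Edge list of theta_{a,b}: a internally disjoint u-v paths with b edges each (b >= 2)."""
    edges = []
    for i in range(a):
        # The i-th path: u, (i, 1), ..., (i, b - 1), v.
        path = ["u"] + [(i, j) for j in range(1, b)] + ["v"]
        edges += list(zip(path, path[1:]))
    return edges


def degree_sequence(edges):
    """Sorted list of vertex degrees of the graph given by an edge list."""
    deg = {}
    for x, y in edges:
        deg[x] = deg.get(x, 0) + 1
        deg[y] = deg.get(y, 0) + 1
    return sorted(deg.values())


# theta_{2,5} has 10 vertices, all of degree 2, as expected for the cycle C_10.
print(degree_sequence(theta_graph(2, 5)))
# theta_{4,2} has two vertices of degree 4 and four of degree 2, as expected for K_{2,4}.
print(degree_sequence(theta_graph(4, 2)))
```

In general $\theta_{a,b}$ has $ab$ edges and $2+a(b-1)$ vertices, which is where the quantity $\frac{ab-1}{a(b-1)}$ appearing in the exponents of Theorem 1.3 comes from.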

And indeed, by using similar ideas as in [20], Corsten and Tran [10] implicitly proved

\[\mathrm{ex}(G_{n,p},\theta_{a,b})=O\left(p^{\frac{2}{ab}}n^{1+\frac{1}{b}}\right)\text{ w.h.p.\ when }p\text{ is large},\tag{1}\]

which matches the general upper bound given by Jiang and Longbrake in this case. However, Faudree and Simonovits [12] proved $\mathrm{ex}(n,\theta_{a,b})=O(n^{1+1/b})$, so Conjecture 1.2 predicts that we should have $\mathrm{ex}(G_{n,p},\theta_{a,b})=O(p^{1/b}n^{1+1/b})$ when $p$ is large, which differs substantially from (1) when $a$ is large.

By adding several new ideas (see Section 1.1.2) to the approach of Morris and Saxton, we significantly improve upon the bounds of (1) and establish essentially tight (and unconditional) bounds for $\mathrm{ex}(G_{n,p},\theta_{a,b})$ when $a$ is large, which agree with what Conjecture 1.2 predicts.

Theorem 1.3.

For all $b\geq 2$, there exists $a_{0}=a_{0}(b)$ such that for any fixed $a\geq a_{0}$, w.h.p.

\[\mathrm{ex}(G_{n,p},\theta_{a,b})=\begin{cases}\Theta\left(p^{\frac{1}{b}}n^{1+\frac{1}{b}}\right)&p\geq n^{-\frac{b-1}{ab-1}}(\log n)^{2b},\\ n^{2-\frac{a(b-1)}{ab-1}}(\log n)^{O(1)}&n^{-\frac{b-1}{ab-1}}(\log n)^{2b}\geq p\geq n^{-\frac{a(b-1)}{ab-1}},\\ (1+o(1))p\binom{n}{2}&n^{-\frac{a(b-1)}{ab-1}}\gg p\gg n^{-2}.\end{cases}\]

As far as we are aware, Theorem 1.3 is the first result since Morris and Saxton [20] which gives essentially tight bounds on $\mathrm{ex}(G_{n,p},F)$ for any bipartite graph $F$.
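As a sanity check that the thresholds in Theorem 1.3 are the ones Conjecture 1.2 predicts, the sketch below substitutes $\alpha=1+1/b$ and $m_{2}(\theta_{a,b})=\frac{ab-1}{a(b-1)}$ into the exponents of the conjecture. The value of $m_{2}$ is taken as an assumption here (it is the one consistent with the exponents in Theorem 1.3), and the function name is illustrative.

```python
from fractions import Fraction


def theta_thresholds(a, b):
    """Exponents of n at the two phase transitions of Conjecture 1.2 for theta_{a,b}."""
    alpha = 1 + Fraction(1, b)                  # ex(n, theta_{a,b}) = Theta(n^alpha) for large a
    m2 = Fraction(a * b - 1, a * (b - 1))       # assumed 2-density of theta_{a,b}
    upper = (2 - alpha - 1 / m2) / (alpha - 1)  # above n^upper: Theta(p^{alpha-1} n^alpha)
    lower = -1 / m2                             # below n^lower: (1 + o(1)) p C(n, 2)
    return upper, lower


for a, b in [(3, 2), (5, 3), (10, 4)]:
    upper, lower = theta_thresholds(a, b)
    # Compare with -(b-1)/(ab-1) and -a(b-1)/(ab-1) from Theorem 1.3.
    print(a, b, upper, lower)
```

The simplification behind this is $\frac{2-\alpha-1/m_{2}}{\alpha-1}=(b-1)-\frac{ab(b-1)}{ab-1}=-\frac{b-1}{ab-1}$.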

1.1 Proof Outline

The vast majority of our paper is focused on proving the upper bound of Theorem 1.3 when $p$ is large, with the other bounds following from previous results together with the monotonicity of $\mathrm{ex}(G_{n,p},\theta_{a,b})$. Before going into our new ideas, we first recall the proof ideas from Morris and Saxton [20] for bounding $\mathrm{ex}(G_{n,p},C_{2b})$, as well as the adaptation of these methods by Corsten and Tran [10] to theta graphs.

1.1.1 Previous Ideas

The main approach to upper bounding $\mathrm{ex}(G_{n,p},C_{2b})$ when $p$ is large is to show that $C_{2b}$ exhibits “balanced supersaturation”, which roughly means that if $e(G)\gg\mathrm{ex}(n,C_{2b})$, then one can find a large collection of copies of $C_{2b}$ in $G$ which are “spread out.” Given such a result, one can derive upper bounds on $\mathrm{ex}(G_{n,p},C_{2b})$ by using what is by now a somewhat standard argument involving hypergraph containers; see 6.1 below.

To prove this balanced supersaturation result, let $G$ be an $n$-vertex graph with $kn^{1+1/b}$ edges. Our goal is to build a hypergraph $\mathcal{H}$ whose vertex set is $E(G)$, whose hyperedges are copies of $C_{2b}$ in $G$, and which is such that

\[\deg_{\mathcal{H}}(\sigma)\leq\frac{k^{2b}n^{2}}{\delta kn^{1+1/b}(\delta k^{b/(b-1)})^{|\sigma|-1}},\]

for all $\sigma\subseteq E(G)$, where $\delta$ is some suitably small constant. If we can construct such a collection with $|\mathcal{H}|\approx k^{2b}n^{2}$, then this together with 6.1 will give our desired result.

To construct such an $\mathcal{H}$, we iteratively build copies of $C_{2b}$ to add to $\mathcal{H}$ as follows. Given our current collection $\mathcal{H}$, we “clean up” $G$ by deleting edges $e$ with $d_{\mathcal{H}}(e)\geq\frac{k^{2b}n^{2}}{\delta kn^{1+1/b}}$ (since we will not be able to use these edges in any copy of $C_{2b}$ to add to $\mathcal{H}$), and by iteratively deleting vertices with low degree. We then pick a vertex $x$ that will have some number $t(x)\leq b$ associated to it (which roughly measures how well the graph expands near $x$) to use in our cycle. We then run an algorithm which starts with a set $\chi=\{x\}$, and then iteratively adds new vertices to $\chi$ until we build a cycle $C$, with the exact algorithm we use depending on the value of $t(x)$. A key point is that at each step of our algorithm, we always have significantly more than $\delta k^{b/(b-1)}$ choices for each new vertex to add. Because we have so many choices, one of the cycles $C$ that we can create will be such that $C\notin\mathcal{H}$ and such that $\{C\}\cup\mathcal{H}$ continues to satisfy our desired codegree conditions. By applying this algorithm repeatedly, we end up constructing a large collection $\mathcal{H}$ of cycles which satisfies our codegree conditions, proving our desired balanced supersaturation result.

For theta graphs, Corsten and Tran [10] used an argument similar to the one outlined above, but their approach only gave effective bounds on the codegrees $\deg_{\mathcal{H}}(\sigma)$ when $\sigma$ is a set of edges inducing a forest. (The general result of Jiang and Longbrake [17] similarly only gives effective bounds for forests, and as such it seems like moving beyond the forest case is the main difficulty in general for upper bounding $\mathrm{ex}(G_{n,p},F)$.) At a high level, the fundamental issue with the approach outlined above is that the algorithm proceeds by selecting vertices one at a time, but the codegree bounds of 6.1 are a function of the number of edges $|\sigma|$. For $\sigma$ which are forests, this distinction turns out to be irrelevant because $\sigma$ has at least as many vertices as edges, but this property fails substantially for general subgraphs of theta graphs.

1.1.2 New Ideas

As in previous works, our proof relies on showing that theta graphs exhibit balanced supersaturation. In particular, we use the following three main ideas in order to get around the fundamental issues outlined above:

Vertex Supersaturation. The first key idea is that instead of viewing $\mathcal{H}$ as a hypergraph on $E(G)$, we essentially view it as a hypergraph on $V(G)$. To be more precise, we consider hypergraphs $\mathcal{H}$ on $V(\theta_{a,b})\times V(G)$ in such a way that hyperedges of $\mathcal{H}$ correspond to a unique labelled theta graph in $G$. Balanced supersaturation for this hypergraph $\mathcal{H}$ then easily translates to balanced supersaturation for the corresponding hypergraph on $E(G)$, which we ultimately need in order to use hypergraph containers.
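The following small sketch illustrates the two viewpoints for one labelled copy of a theta graph. The representation (an embedding stored as a dictionary from vertices of $\theta_{a,b}$ to vertices of $G$) and the function names are assumptions made for illustration only.

```python
def vertex_hyperedge(embedding):
    """Hyperedge on V(theta) x V(G): records which vertex of G plays which role."""
    return frozenset(embedding.items())


def edge_hyperedge(theta_edges, embedding):
    """Corresponding hyperedge on E(G): the images of the edges of the theta graph."""
    return frozenset(frozenset((embedding[u], embedding[w])) for u, w in theta_edges)


# theta_{2,2} = C_4 with high-degree vertices u, v and internal vertices w1, w2.
theta_edges = [("u", "w1"), ("w1", "v"), ("u", "w2"), ("w2", "v")]
phi = {"u": 0, "v": 2, "w1": 1, "w2": 3}
psi = {"u": 0, "v": 2, "w1": 3, "w2": 1}  # same copy of C_4, different labelling

print(vertex_hyperedge(phi) == vertex_hyperedge(psi))
print(edge_hyperedge(theta_edges, phi) == edge_hyperedge(theta_edges, psi))
```

Two labellings of the same unlabelled copy give distinct hyperedges on $V(\theta_{a,b})\times V(G)$ but the same hyperedge on $E(G)$; since the number of labellings of a copy is bounded in terms of $a,b$, passing between the two hypergraphs only costs constant factors.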

Asymmetric Codegree Bounds. The second idea is that the codegree bounds we enforce on $\chi\subseteq V(\theta_{a,b})\times V(G)$ will not depend solely on the number of vertices $|\chi|$, but also on the vertices of $\theta_{a,b}$ which the set $\chi$ corresponds to. For example, if $u,v\in V(\theta_{a,b})$ are the two high-degree vertices of $\theta_{a,b}$ (i.e. the vertices of degree at least 3), then the codegree bounds we enforce on the set $\chi=\{(u,x),(v,y)\}$ will be higher than those we would put on $\chi=\{(w,x),(w',y)\}$ for some other $w,w'\in V(\theta_{a,b})$ since, in some sense, $u,v$ are the most important vertices of $\theta_{a,b}$. A similar idea was implicitly used by Morris and Saxton when proving balanced supersaturation results for $K_{s,t}$ (where the vertices in the smaller part are “more important” than those in the larger part).

Multiple Collections. The final idea is that we do not build a single collection $\mathcal{H}$, but instead build multiple collections $\mathcal{H}_{1},\mathcal{H}_{2},\ldots$ and impose different codegree bounds on each of these.

As a somewhat concrete example of why we do this, say we knew our graph $G$ was a random graph with $kn^{1+1/b}$ edges. In this case, it would be possible to build an $\mathcal{H}$ such that for every $x,y\in V(G)$, there are at most roughly $k^{ab}$ theta graphs in our collection which use $x,y$ as the two high degree vertices; and this is a strong bound on the codegree of this pair. In contrast, if $G$ was a clique with $kn^{1+1/b}$ edges together with some isolated vertices, then it would be impossible to impose such a codegree bound for all $x,y$. Thus if we only worked with a single collection $\mathcal{H}$, we would have to pessimistically use the weaker codegree bounds that work for a clique when $x,y$ correspond to the high degree vertices, and similarly we would have to consider the worst possible choice of $G$ when determining the codegree bounds for any given set $\chi\subseteq V(G)$. Doing this would ultimately give bounds that are too weak. By building multiple collections, we can impose the “correct” codegree bounds regardless of the structure of $G$.

Other Ideas. In addition to these three main ideas, we use a slightly different algorithm to construct our theta graphs compared to those of [10, 20]. Previous algorithms worked (roughly) by first specifying a vertex $x\in V(G)$ to play the role of one of the high degree vertices of $\theta_{a,b}$, then choosing a path in $G$ of length $b$ (which specifies the other vertex $y$ playing the role of a high degree vertex in $\theta_{a,b}$), and from there choosing the remaining $a-1$ paths from $x$ to $y$ one at a time. Instead, our algorithm chooses the two high degree vertices $x,y$ at the start and then builds all of the $a$ paths from $x$ to $y$ one at a time. This somewhat more symmetric argument allows us to overcome various technical issues that arose with previous approaches, and is crucial for our present argument to go through.
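The order of choices in the symmetric approach (fix $x,y$ first, then add the $a$ paths one at a time) can be illustrated by the toy backtracking search below. This is only a sketch of the order in which vertices are chosen: it is an exhaustive search, not the paper's algorithm, which additionally tracks codegree conditions and relies on having many choices at each step. The function names and the adjacency-dictionary representation are assumptions.

```python
def paths_between(adj, cur, y, steps_left, visited):
    """Yield paths from cur to y with exactly steps_left edges, avoiding visited vertices."""
    if steps_left == 0:
        yield [cur]  # cur == y here, because of the check on the final step below
        return
    for z in sorted(adj[cur]):
        if z in visited:
            continue
        if (z == y) != (steps_left == 1):
            continue  # y may only appear as the last vertex of the path
        for tail in paths_between(adj, z, y, steps_left - 1, visited | {z}):
            yield [cur] + tail


def find_theta(adj, x, y, a, b, used=frozenset()):
    """Return a internally disjoint x-y paths of length b (x != y, b >= 2), or None."""
    if a == 0:
        return []
    for path in paths_between(adj, x, y, b, set(used) | {x}):
        rest = find_theta(adj, x, y, a - 1, b, used | frozenset(path[1:-1]))
        if rest is not None:
            return [path] + rest
    return None


# K_{2,3} with parts {0, 1} and {2, 3, 4} is itself a copy of theta_{3,2}.
k23 = {0: {2, 3, 4}, 1: {2, 3, 4}, 2: {0, 1}, 3: {0, 1}, 4: {0, 1}}
print(find_theta(k23, 0, 1, 3, 2))
print(find_theta(k23, 0, 1, 4, 2))
```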

1.2 Organization and Notation

The rest of this paper is organized as follows. In Section 2 we prove several auxiliary results that will be used in our main proof, and in Section 3 we establish the main definitions used in this paper. In Section 4, which is the real heart of our argument, we establish our balanced supersaturation result for vertices. We then translate this result into balanced supersaturation for edges in Section 5 before completing the proof of Theorem 1.3 in Section 6 by invoking a result that follows from a standard containers type argument. A few open problems are given in Section 7.

Throughout the paper we adopt the following conventions. We always use $u,v,w$ to denote vertices of $\theta_{a,b}$ and $x,y,z$ to denote vertices of a (larger) graph $G$. Further, we will almost always let $u,v$ denote the two vertices of $\theta_{a,b}$ with degree at least 3 (when $a\geq 3$), and we will informally call these the “vertices of high degree” in $\theta_{a,b}$. We write $v(G)=|V(G)|$. Whenever we write asymptotic notation such as $O(f)$, our implicit constants will always depend on $a,b$, and we will occasionally emphasize this point by writing, for example, $O_{a,b}(f)$. For a hypergraph $\mathcal{H}$ and a set of vertices $\sigma\subseteq V(\mathcal{H})$, its degree or codegree $\deg_{\mathcal{H}}(\sigma)$ is the number of hyperedges of $\mathcal{H}$ containing $\sigma$.
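For readers who prefer code, the codegree just defined can be computed as in the following minimal sketch, where a hypergraph is assumed to be given simply as a list of hyperedges (each an iterable of vertices); the function name is illustrative.

```python
def codegree(hyperedges, sigma):
    """Number of hyperedges that contain every vertex of sigma."""
    sigma = frozenset(sigma)
    return sum(1 for h in hyperedges if sigma <= frozenset(h))


H = [{1, 2, 3}, {1, 2, 4}, {2, 3, 4}, {1, 3, 4}]
print(codegree(H, {1}), codegree(H, {1, 2}), codegree(H, {1, 2, 3}))
```

Note that the codegree of the empty set is the total number of hyperedges, and the codegree can only decrease as $\sigma$ grows.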

2 Auxiliary Results

Here we establish two results that will be crucial for our proof.

2.1 Expansion

One of the key technical lemmas of Morris and Saxton says that in a graph with sufficiently large minimum degree, there exists a vertex $x$ which is the endpoint of many “nice” paths of some length $t$. Analogously, we will rely heavily on the following.

Proposition 2.1.

For all integers $b\geq 2$, there exists some $\varepsilon>0$ such that the following holds. If $G$ is an $m$-vertex graph with minimum degree $\ell m^{1/b}$ and $\ell\geq\varepsilon^{-1}$, and if $\mathcal{F}$ is a set of forests, then there exists an integer $2\leq t\leq b$ and a set of vertices $X$ such that the following properties hold:

  • (a) For each $x\in X$, there exists a pair $(\mathcal{B},\mathcal{Q})$ such that $\mathcal{B}=(B_{0},\ldots,B_{t})$ is a tuple of (not necessarily disjoint) vertex sets of $G$ with $B_{0}=\{x\}$, and $\mathcal{Q}$ is a set of paths $xz_{1}\cdots z_{t}$ with $z_{i}\in B_{i}$ for all $i$.

  • (b) We have $|B_{t-1}|,|B_{t}|\geq\varepsilon\ell^{(b-t+1)/(b-1)}m^{(t-1)/b}$.

  • (c) We have $|\mathcal{Q}|\geq\varepsilon\ell^{t}m^{t/b}$.

  • (d) For every $1\leq i\leq t$ and $z\in B_{i-1}$, we have $|N(z)\cap B_{i}|\geq\varepsilon\ell m^{1/b}$, and for every $z\in B_{t}$, we have $|N(z)\cap B_{t-1}|\geq\varepsilon\ell^{b/(b-1)}$.

  • (e) For every $y\in B_{t}$, we have $|\mathcal{Q}[x\to y]|\geq\varepsilon\ell^{(t-1)b/(b-1)}$, where $\mathcal{Q}[x\to y]$ denotes the set of paths of $\mathcal{Q}$ starting at $x$ and ending at $y$.

  • (f) For any $y\in B_{t}$ and non-empty set of vertices $S\subseteq V(G)\setminus\{x,y\}$, there are at most $\varepsilon^{-1}\ell^{(t-1-|S|)b/(b-1)}$ paths in $\mathcal{Q}[x\to y]$ containing $S$.

  • (g) If $\mathcal{F}$ is such that for every path $x_{1}\cdots x_{r}$ of $G$ with $r\leq b$ which does not contain an element of $\mathcal{F}$ as a subgraph, there are at most $\varepsilon\ell m^{1/b}$ vertices $x_{r+1}\in N_{G}(x_{r})$ such that the path $x_{1}\cdots x_{r+1}$ contains an element of $\mathcal{F}$ as a subgraph; then no path of $\mathcal{Q}$ contains an element of $\mathcal{F}$ as a subgraph.

  • (h) We have $|X|\geq\varepsilon\ell^{(b-t)/b}m^{t/b}$.

Morris and Saxton essentially proved this same result but with the last condition replaced by $|X|>0$ as opposed to $|X|\geq\varepsilon\ell^{(b-t)/b}m^{t/b}$. Our two proofs will be essentially identical outside of this improved quantitative bound, and as such we defer many of the redundant details of the proof to Appendix B. (It is possible that Proposition 2.1 holds with the even stronger quantitative bound $|X|\geq\varepsilon m$. If true, this would significantly simplify our proof of Theorem 1.3; see 7.3 for more on this.)

For our proof, we fix a sequence of rapidly decreasing constants

\[1\geq\varepsilon_{b}\geq\cdots\geq\varepsilon_{2}\geq\varepsilon_{1}>0\]

which depend only on $b$. The exact values of these constants are not particularly important, other than that they are sufficiently small with respect to 1 and with respect to each other. In particular, we demand $\varepsilon_{t}\geq 16(b+1)\varepsilon_{t-1}$ for all $t$. For the rest of the subsection we will fix some $m$-vertex graph $G$ with minimum degree $\ell m^{1/b}$ with $\ell$ (and hence $m$) sufficiently large in terms of the $\varepsilon_{t}$ constants.

Definition 1.

For $x\in V(G)$, we say that a tuple $\mathcal{A}=(A_{0},A_{1},\ldots,A_{t})$ of (not necessarily disjoint) subsets of $V(G)$ is a concentrated $t$-neighborhood of $x$ if $A_{0}=\{x\}$, $|A_{t}|\leq\ell^{(b-t)/(b-1)}m^{t/b}$, and $|N(y)\cap A_{i}|\geq\varepsilon_{t}\ell m^{1/b}$ for all $y\in A_{i-1}$.

We define $t(x)$ to be the minimal $t\geq 2$ such that there exists a concentrated $t$-neighborhood of $x$ in $G$. Note that $t(x)\leq b$ for all $x$ since we can iteratively take $A_{i}=\bigcup_{y\in A_{i-1}}N(y)$: every $y\in A_{i-1}$ then has all of its at least $\ell m^{1/b}$ neighbors in $A_{i}$, and trivially $|A_{b}|\leq m$.
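The definition of a concentrated $t$-neighborhood, and the observation that iterated neighborhoods give one for $t=b$, can be checked on a toy example with the sketch below. The function names, the adjacency-dictionary representation, and the small floating-point tolerance are assumptions; the parameters `ell` and `eps_t` play the roles of $\ell$ and $\varepsilon_{t}$.

```python
def iterated_neighborhoods(adj, x, t):
    """Layers A_0 = {x} and A_i = union of N(y) over y in A_{i-1}, for i = 1, ..., t."""
    layers = [{x}]
    for _ in range(t):
        layers.append(set().union(*(adj[y] for y in layers[-1])))
    return layers


def is_concentrated(adj, x, layers, ell, b, eps_t, tol=1e-9):
    """Check the three conditions of a concentrated t-neighborhood, t = len(layers) - 1."""
    m = len(adj)
    t = len(layers) - 1
    if layers[0] != {x}:
        return False
    if len(layers[t]) > ell ** ((b - t) / (b - 1)) * m ** (t / b) + tol:
        return False
    need = eps_t * ell * m ** (1 / b)
    return all(
        len(adj[y] & layers[i]) >= need - tol
        for i in range(1, t + 1)
        for y in layers[i - 1]
    )


# Toy example: the complete graph K_5 with b = 2, so the minimum degree 4 equals ell * 5^(1/2).
k5 = {v: set(range(5)) - {v} for v in range(5)}
ell = 4 / 5 ** 0.5
layers = iterated_neighborhoods(k5, 0, 2)
print(layers, is_concentrated(k5, 0, layers, ell, b=2, eps_t=0.5))
```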

Morris and Saxton implicitly proved that for any vertex $x$ with $t(x)=\min_{y\in V(G)}t(y)$, there exist sets $(\mathcal{B},\mathcal{Q})$ as in Proposition 2.1, and in particular, at least one such vertex exists. The only place where $t(x)=\min_{y\in V(G)}t(y)$ is used in their argument is in showing that there exists a tuple $\mathcal{A}=(A_{0},\ldots,A_{t})$ with $t=t(x)$, $A_{0}=\{x\}$, $|A_{t}|\leq\ell^{(b-t)/(b-1)}m^{t/b}$, $|N(y)\cap A_{i}|\geq\varepsilon_{t}\ell m^{1/b}$ for all $y\in A_{i-1}$, and (crucially) every $y\in\bigcup_{i=0}^{t}A_{i}$ has $t(y)\geq t$; in other words, for their argument to go through we only need that $t(x)$ achieves a local minimum value among vertices $y$ near $x$, and it is not strictly necessary for $t(x)$ to achieve a global minimum value.

Motivated by this, our main goal is to show that tuples with essentially these same properties noted above exist for many vertices $x$. Specifically, we prove the following.

Lemma 2.2.

There exists some integer $2\leq t\leq b$ and some set $X\subseteq V(G)$ of size at least $\frac{1}{2}(4b)^{t-b}\ell^{(b-t)/(b-1)}m^{t/b}$ such that $t(x)=t$ for every $x\in X$, and such that for every $x\in X$, there exists a tuple of sets $\mathcal{A}=(A_{0},\ldots,A_{t})$ such that $A_{0}=\{x\}$, $|A_{t}|\leq\ell^{(b-t)/(b-1)}m^{t/b}$, $|N(y)\cap A_{i}|\geq\frac{1}{2}\varepsilon_{t}\ell m^{1/b}$ for all $y\in A_{i-1}$, and every $y\in\bigcup_{i=0}^{t}A_{i}$ has $t(y)\geq t$.

This differs ever so slightly from the condition that Morris and Saxton worked with since we only guarantee $|N(y)\cap A_{i}|\geq\frac{1}{2}\varepsilon_{t}\ell m^{1/b}$ as opposed to $|N(y)\cap A_{i}|\geq\varepsilon_{t}\ell m^{1/b}$. By slightly adjusting the constants of Morris and Saxton, their same proof still carries over word for word for any vertex $x$ as in Lemma 2.2; see Appendix B for details. Thus to prove Proposition 2.1, it suffices to prove Lemma 2.2, which will be our goal for the rest of this subsection.

For any integer $2\leq t'\leq b$, define

\[\Lambda(t')=(4b)^{t'-b}\ell^{(b-t')/(b-1)}m^{t'/b}.\]

Note that $\ell m^{1/b}\leq m$ (since $\ell m^{1/b}$ is the minimum degree of an $m$-vertex graph), i.e. $\ell^{1/(b-1)}\leq m^{1/b}$, and thus $\Lambda(t')$ is an increasing function in $t'$; indeed, $\Lambda(t'+1)/\Lambda(t')=4b\,m^{1/b}\ell^{-1/(b-1)}\geq 4b$. From now on we let $t$ be the smallest integer such that there are at least $\Lambda(t)$ vertices with $t(x)\leq t$. Note that $t=b$ satisfies these conditions, so such a (smallest) integer exists.
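The monotonicity of $\Lambda$ can be sanity-checked numerically with the short sketch below. The function name `lam` and the sample parameter values are arbitrary choices for illustration; the only requirement taken from the text is $\ell\leq m^{(b-1)/b}$.

```python
def lam(t, b, ell, m):
    """Lambda(t) = (4b)^(t-b) * ell^((b-t)/(b-1)) * m^(t/b)."""
    return (4 * b) ** (t - b) * ell ** ((b - t) / (b - 1)) * m ** (t / b)


b, ell, m = 4, 50.0, 10 ** 6
assert ell <= m ** ((b - 1) / b)  # the minimum degree ell * m^(1/b) is at most m

values = [lam(t, b, ell, m) for t in range(2, b + 1)]
ratios = [later / earlier for earlier, later in zip(values, values[1:])]
print(values)
print(ratios)  # each ratio equals 4b * m^(1/b) / ell^(1/(b-1)), which is at least 4b
```

A consecutive ratio of at least $4b$ is also what makes $(b+1)\Lambda(t-1)\leq\frac{1}{2}\Lambda(t)$ in the proof of Lemma 2.2.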

Let $Y_{0}$ denote the set of vertices $x$ with $t(x)<t$. Iteratively define $Y_{i}$ to be the set of vertices $x\notin\bigcup_{j=0}^{i-1}Y_{j}$ which have at least $\alpha\ell m^{1/b}$ neighbors in $Y_{i-1}$, where $\alpha:=\frac{1}{2(b+1)}\varepsilon_{t}$. Note that every $x\in Y_{i}$ with $i\geq 1$ has $t(x)\geq t$ since $x\notin Y_{0}$.

To motivate these definitions, we observe that in proving Lemma 2.2 with $t$ as stated, we cannot include any vertex of $Y_{0}$ in any of the $A_{i}$ sets. While we are allowed to include vertices of $Y_{1}$ in these sets, these vertices are “dangerous” since a large number of their neighbors lie in $Y_{0}$, and similarly it is somewhat dangerous to include $Y_{2}$ since a large number of their neighbors are in $Y_{1}$, and so on. We thus want to show that these $Y_{i}$ sets are all relatively small, which is accomplished by the following lemma.
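The iterative definition of the sets $Y_{i}$ can be read as a layered procedure, sketched below. The function name, the adjacency-dictionary representation, and the parameter `threshold` (standing in for $\alpha\ell m^{1/b}$) are assumptions for illustration.

```python
def danger_layers(adj, y0, threshold, rounds):
    """Return [Y_0, Y_1, ..., Y_rounds].

    Y_i consists of the vertices outside Y_0, ..., Y_{i-1} that have
    at least `threshold` neighbors in Y_{i-1}.
    """
    layers = [set(y0)]
    seen = set(y0)
    for _ in range(rounds):
        prev = layers[-1]
        nxt = {x for x in adj if x not in seen and len(adj[x] & prev) >= threshold}
        layers.append(nxt)
        seen |= nxt
    return layers


# Toy graph with edges 0-2, 0-3, 1-2, 2-3, 2-4, 3-4 and Y_0 = {0, 1}.
adj = {0: {2, 3}, 1: {2}, 2: {0, 1, 3, 4}, 3: {0, 2, 4}, 4: {2, 3}}
print(danger_layers(adj, {0, 1}, threshold=2, rounds=2))
print(danger_layers(adj, {0, 1}, threshold=1, rounds=3))
```

By construction the layers are pairwise disjoint, which is the property used later when comparing $Y_{i}$ and $Y_{i+1}$.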

Lemma 2.3.

If $t=2$ then $Y_{i}=\emptyset$ for all $i\geq 0$, and otherwise $|Y_{i}|\leq\Lambda(t-1)$ for all $i\geq 0$.

For this proof, we note that by choosing $\varepsilon$ sufficiently small in Proposition 2.1, we may assume $m\geq\ell^{b/(b-1)}\geq\varepsilon^{-b/(b-1)}$ is sufficiently large compared to all of the constants $\varepsilon_{t'}$.

Proof.

If $t=2$ then $Y_{0}=\emptyset$, and hence inductively we have $Y_{i}=\emptyset$ for all $i$. From now on we assume $t>2$. We prove the result by induction on $i$, the base case $|Y_{0}|\leq\Lambda(t-1)$ being immediate from the definition of $t$ and $Y_{0}$. Say we have proven $|Y_{i}|\leq\Lambda(t-1)$ for some $i\geq 0$. The key technical observation we need is the following.

Claim 2.4.

There exists a non-empty bipartite graph $G'\subseteq G$ with bipartition $S\cup T$ such that $S\subseteq Y_{i}$, $T\subseteq Y_{i+1}$, and such that $d_{G'}(y)\geq\frac{1}{4}\alpha\ell m^{1/b}$ for $y\in T$ and $d_{G'}(y)\geq\frac{1}{4}\alpha\ell m^{1/b}|Y_{i}|^{-1}|Y_{i+1}|$ for $y\in S$.

Proof.

Let $G^{*}\subseteq G$ be the graph on $Y_{i}\cup Y_{i+1}$ obtained after deleting every edge within $Y_{i}$ and within $Y_{i+1}$. Note that by definition, each vertex of $Y_{i+1}$ has at least $\alpha\ell m^{1/b}$ neighbors in $Y_{i}$ (which is disjoint from $Y_{i+1}$), so $e(G^{*})\geq\alpha\ell m^{1/b}|Y_{i+1}|$. Define $G'$ by iteratively deleting every vertex which violates the degree conditions of the claim. Note that the number of edges deleted in this process is at most

\[\frac{1}{4}\alpha\ell m^{1/b}\cdot|Y_{i+1}|+\frac{1}{4}\alpha\ell m^{1/b}|Y_{i}|^{-1}|Y_{i+1}|\cdot|Y_{i}|=\frac{1}{2}\alpha\ell m^{1/b}|Y_{i+1}|<e(G^{*}).\]

In particular, $G'$ is non-empty, and it satisfies all of the other properties by construction. ∎
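The iterative deletion used to define $G'$ can be sketched as follows. The function name and data layout are assumptions; `min_deg_S` and `min_deg_T` stand in for the thresholds $\frac{1}{4}\alpha\ell m^{1/b}|Y_{i}|^{-1}|Y_{i+1}|$ and $\frac{1}{4}\alpha\ell m^{1/b}$, and the sets `S`, `T` (playing the roles of $Y_{i}$, $Y_{i+1}$) are assumed disjoint.

```python
def prune_bipartite(adj, S, T, min_deg_S, min_deg_T):
    """Keep only S-T edges, then repeatedly delete vertices below their degree threshold.

    Returns the adjacency dictionary of the surviving bipartite graph (possibly empty).
    """
    S, T = set(S), set(T)
    nbrs = {x: adj[x] & T for x in S}
    nbrs.update({y: adj[y] & S for y in T})

    def too_small(v):
        return len(nbrs[v]) < (min_deg_S if v in S else min_deg_T)

    stack = [v for v in nbrs if too_small(v)]
    removed = set()
    while stack:
        v = stack.pop()
        if v in removed:
            continue
        removed.add(v)
        for w in nbrs[v]:
            nbrs[w].discard(v)  # deleting v lowers the degree of each neighbor
            if w not in removed and too_small(w):
                stack.append(w)
        nbrs[v] = set()
    return {v: ns for v, ns in nbrs.items() if v not in removed}


# Toy example: S = {0, 1}, T = {2, 3, 4}, with edges 0-2, 0-3, 1-2, 1-3, 1-4.
adj = {0: {2, 3}, 1: {2, 3, 4}, 2: {0, 1}, 3: {0, 1}, 4: {1}}
print(prune_bipartite(adj, {0, 1}, {2, 3, 4}, min_deg_S=2, min_deg_T=2))
print(prune_bipartite(adj, {0, 1}, {2, 3, 4}, min_deg_S=3, min_deg_T=3))
```

The counting argument in the proof is what guarantees, for the thresholds of the claim, that this process cannot delete every edge.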

Returning to our induction, we wish to show that $|Y_{i+1}|\leq\Lambda(t-1)$; our inductive hypothesis gives $|Y_{i}|\leq\Lambda(t-1)$, so it suffices to prove that $|Y_{i+1}|\leq|Y_{i}|$. Assume for contradiction that $|Y_{i+1}|>|Y_{i}|$. Let $x$ be any vertex of $T$ (which exists since $G'$ is non-empty), and let $A_{0},\ldots,A_{t-1}$ be defined by $A_{0}=\{x\}$ and $A_{j}=\bigcup_{y\in A_{j-1}}N_{G'}(y)$. Note that $A_{j}\subseteq S$ if and only if $j$ is odd since $G'$ is bipartite. Also note that for all $y\in A_{j-1}$ we have

\[|N_{G}(y)\cap A_{j}|\geq|N_{G'}(y)\cap A_{j}|\geq\frac{1}{4}\alpha\ell m^{1/b}\geq\varepsilon_{t-1}\ell m^{1/b},\]

where this last step used $\alpha=\frac{1}{2(b+1)}\varepsilon_{t}$ and that $\varepsilon_{t-1}$ is sufficiently small compared to $\varepsilon_{t}$. In particular, if $t>2$ is even, then $(A_{0},\ldots,A_{t-1})$ is a concentrated $(t-1)$-neighborhood of $x$ since

\[|A_{t-1}|\leq|S|\leq|Y_{i}|\leq\Lambda(t-1)\leq\ell^{(b-t+1)/(b-1)}m^{(t-1)/b}.\]

This implies $t(x)<t$, a contradiction to $x\in T\subseteq Y_{i+1}$ since $Y_{i+1}$ is disjoint from $Y_{0}$.

Thus we may assume $t>2$ is odd. Define the random set $A'_{t-1}\subseteq Y_{i+1}$ by including each vertex of $Y_{i+1}$ independently and with probability $p=|Y_{i}||Y_{i+1}|^{-1}$, which is well defined since we assumed $|Y_{i+1}|>|Y_{i}|$. Observe that $|A'_{t-1}|$ is a binomial random variable with $|Y_{i+1}|$ trials and probability of success $p$. Since $\mathbb{E}[|A'_{t-1}|]=p|Y_{i+1}|=|Y_{i}|\leq\Lambda(t-1)$, by Markov’s inequality we have $\Pr[|A'_{t-1}|\geq 2\Lambda(t-1)]\leq 1/2$. Thus for $m$ sufficiently large, we conclude that the event $|A'_{t-1}|<2\Lambda(t-1)\leq\ell^{(b-t+1)/(b-1)}m^{(t-1)/b}$ occurs with probability at least 1/2.

Similarly for each yAt2Sy\in A_{t-2}\subseteq S, the random variable |NG(y)At1||N_{G}(y)\cap A_{t-1}^{\prime}| is binomial with success probability pp and number of trials dG(y)14αm1/bp1d_{G^{\prime}}(y)\geq\frac{1}{4}\alpha\ell m^{1/b}p^{-1}. Thus by the multiplicative Chernoff inequality, we have for any yAt2y\in A_{t-2},

\[\Pr\left[|N_{G}(y)\cap A_{t-1}^{\prime}|\leq\frac{1}{8}\alpha\ell m^{1/b}\right]\leq e^{-\alpha\ell m^{1/b}/32},\]

and for mm sufficiently large this probability is at most .1m1.1m^{-1}. By a union bound over yAt2y\in A_{t-2}, we see that with probability at least .9.9, every vertex yAt2y\in A_{t-2} satisfies |NG(y)At1|18αm1/bεt1m1/b|N_{G}(y)\cap A_{t-1}^{\prime}|\geq\frac{1}{8}\alpha\ell m^{1/b}\geq\varepsilon_{t-1}\ell m^{1/b}.

In total we conclude that there exists some choice of At1Yi+1A^{\prime}_{t-1}\subseteq Y_{i+1} such that both |At1|(bt+1)/(b1)m(t1)/b|A^{\prime}_{t-1}|\leq\ell^{(b-t+1)/(b-1)}m^{(t-1)/b} and |NG(y)At1|εt1m1/b|N_{G}(y)\cap A_{t-1}^{\prime}|\geq\varepsilon_{t-1}\ell m^{1/b} for all yAt2y\in A_{t-2} (since in particular, this holds with positive probability for a random subset of Yi+1Y_{i+1}). Thus (A0,,At2,At1)(A_{0},\ldots,A_{t-2},A_{t-1}^{\prime}) is a concentrated (t1)(t-1)-neighborhood of xx. This implies t(x)<tt(x)<t, which again is a contradiction. We conclude |Yi+1||Yi||Y_{i+1}|\leq|Y_{i}|, and hence |Yi+1|Λ(t1)|Y_{i+1}|\leq\Lambda(t-1) by the inductive hypothesis. ∎

We are now ready to prove Lemma 2.2.

Proof of Lemma 2.2.

Let XX be the set of vertices xx with t(x)=tt(x)=t and xi=1bYix\notin\bigcup_{i=1}^{b}Y_{i}.

We claim that |X|12Λ(t)|X|\geq\frac{1}{2}\Lambda(t). Indeed, by definition of tt, there are at least Λ(t)\Lambda(t) vertices with t(x)tt(x)\leq t. Every vertex with t(x)tt(x)\leq t is either in XX or i=0bYi\bigcup_{i=0}^{b}Y_{i}, so by the previous lemma,

\[|X|\geq\Lambda(t)-\Big|\bigcup_{i=0}^{b}Y_{i}\Big|\geq\Lambda(t)-(b+1)\Lambda(t-1)\geq\frac{1}{2}\Lambda(t).\]

It remains to find the tuple of sets 𝒜\mathcal{A} guaranteed by Lemma 2.2 for each xXx\in X. For each xXx\in X, by definition of t(x)=tt(x)=t, there exists a tuple (A0,A1,,At)(A^{\prime}_{0},A^{\prime}_{1},\ldots,A^{\prime}_{t}) with A0={x}A^{\prime}_{0}=\{x\}, |At|(bt)/(b1)mt/b|A^{\prime}_{t}|\leq\ell^{(b-t)/(b-1)}m^{t/b}, and |N(y)Ai|εtm1/b|N(y)\cap A_{i}^{\prime}|\geq\varepsilon_{t}\ell m^{1/b} for all yAi1y\in A_{i-1}^{\prime}. Define Ai=Aij=0biYjA_{i}=A^{\prime}_{i}\setminus\bigcup_{j=0}^{b-i}Y_{j}. Note that with this we have A0={x}A_{0}=\{x\}, |At|(bt)/(b1)mt/b|A_{t}|\leq\ell^{(b-t)/(b-1)}m^{t/b}, and no yAiy\in A_{i} has t(y)<tt(y)<t because we removed Y0Y_{0} from each AiA_{i}.

It remains to show that each yAi1y\in A_{i-1} has many neighbors in AiA_{i}. Since each yAi1y\in A_{i-1} does not belong to any YjY_{j} with 1jbi+11\leq j\leq b-i+1, by definition yy has at most (b+1)αm1/b(b+1)\alpha\ell m^{1/b} neighbors in j=0biYj\bigcup_{j=0}^{b-i}Y_{j}. This implies

\[|N(y)\cap A_{i}|=\Big|N(y)\cap\Big(A_{i}^{\prime}\setminus\bigcup_{j=0}^{b-i}Y_{j}\Big)\Big|\geq|N(y)\cap A_{i}^{\prime}|-(b+1)\alpha\ell m^{1/b}\geq(\varepsilon_{t}-(b+1)\alpha)\ell m^{1/b}=\frac{1}{2}\varepsilon_{t}\ell m^{1/b},\]

proving the result. ∎

2.2 Minimum Degrees

A very minor step in the proof of Morris and Saxton calls for deleting vertices of low degree in GG. In their setting this is fine, as this does not significantly decrease the number of edges in GG. However, because the focus of our approach is on balanced supersaturation for vertices rather than for edges, we will need to be more careful with this step.

Towards this end, we use the reduction lemma stated below, which guarantees a subgraph GGG^{\prime}\subseteq G of large minimum degree, where the degree condition is stronger the more vertices are removed from GG. In particular, the tradeoff is roughly what one would expect if GG was a clique GG^{\prime} together with some number of isolated vertices.

As a small technical convenience, we will prove this lemma in the more general setting of multigraphs with loops. Here, the degree of a vertex vv is the number of edges incident to vv (so each loop contributes 1 to its degree).

Lemma 2.5.

Let GG be an nn-vertex multigraph with loops. For all b1b\geq 1, there exists a subgraph GGG^{\prime}\subseteq G with v(G)>0v(G^{\prime})>0 and minimum degree at least

\[2^{-b}\left(\frac{v(G^{\prime})}{n}\right)^{1/b}\frac{e(G)}{v(G^{\prime})}.\]
Proof.

Write e=e(G)e=e(G). The result is trivial if e=0e=0, so assume e>0e>0. Assume for contradiction that no such subgraph GG^{\prime} exists. We claim that for all non-negative integers rr there exist GrGG_{r}\subseteq G with at most 2brn2^{-br}n vertices and at least 2re2^{-r}e edges. The result holds with G0=GG_{0}=G, so inductively assume the result has been proven through some rr. Let Gr+1GrG_{r+1}\subseteq G_{r} be the graph obtained after iteratively deleting vertices of degree less than 2(b1)r1(e/n)2^{(b-1)r-1}(e/n) from GrG_{r}. Note that

\[e(G_{r+1})\geq e(G_{r})-2^{(b-1)r-1}(e/n)\cdot v(G_{r})\geq 2^{-r}e-2^{(b-1)r-1}(e/n)\cdot 2^{-br}n\geq 2^{-r-1}e.\]

If v(Gr+1)2b(r+1)n>0v(G_{r+1})\geq 2^{-b(r+1)}n>0, then 2r12(nv(Gr+1))1/b2^{r}\geq\frac{1}{2}(\frac{n}{v(G_{r+1})})^{1/b}. This implies that Gr+1G_{r+1} has minimum degree at least

\[2^{(b-1)r-1}(e/n)\geq\left(\frac{n}{v(G_{r+1})}\right)^{\frac{b-1}{b}}2^{-b}(e/n)=2^{-b}\left(\frac{v(G_{r+1})}{n}\right)^{1/b}\frac{e(G)}{v(G_{r+1})},\]

where this first inequality implicitly uses b10b-1\geq 0. This contradicts our assumption that no such subgraph of GG exists, so we must have v(Gr+1)<2b(r+1)nv(G_{r+1})<2^{-b(r+1)}n. Thus Gr+1G_{r+1} satisfies the desired conditions, proving the claim. Taking r=log2(n)r=\log_{2}(n), the claim implies there exists a subgraph on less than 1 vertex with at least e/n>0e/n>0 edges, which is impossible. ∎

The only reason we proved Lemma 2.5 in the more general setting of multigraphs with loops is to prove the following technical result.

Lemma 2.6.

Let BB be a set, and let ff be any function from BB to 0\mathbb{Z}_{\geq 0}. For all b>1b>1, there exists a subset BBB^{\prime}\subseteq B such that

\[\min_{y^{\prime}\in B^{\prime}}f(y^{\prime})\geq 2^{-b}\left(\frac{|B^{\prime}|}{|B|}\right)^{1/b}\frac{\sum_{y\in B}f(y)}{|B^{\prime}|}.\]
Proof.

Define an auxiliary graph GG on BB where each vertex yy has f(y)f(y) loops (and these are the only edges in GG). Applying Lemma 2.5 gives the result. ∎

Roughly speaking, this lemma will be applied with BB the set of vertices that are at distance bb from some vertex xx and with f(y)f(y) the number of paths of length bb from xx to yy. This will allow us to choose vertices xx and yy connected by many paths of length bb, which we can use to construct copies of θa,b\theta_{a,b} where xx and yy are the two high-degree vertices.
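As an illustration of Lemma 2.6, the following minimal Python sketch searches for a subset $B^{\prime}$ satisfying the stated inequality. It is not the argument used in the proof (which goes through Lemma 2.5); instead it relies on the observation that, for a fixed size $j$, the best candidate is the set of the $j$ elements with the largest values of $f$. The function name `find_dense_subset` and the dictionary representation of $f$ are assumptions made for this example only.

```python
from typing import Dict, Hashable, List, Optional


def find_dense_subset(f: Dict[Hashable, int], b: int) -> Optional[List[Hashable]]:
    """Search for B' with min f over B' >= 2^-b (|B'|/|B|)^(1/b) * sum(f) / |B'|.

    Hypothetical helper: f maps each element of B to a non-negative integer.
    Returns None only if no prefix of the sorted order works (e.g. B is empty).
    """
    total = sum(f.values())
    size_b = len(f)
    # Sort by decreasing f: among subsets of size j, the top j elements
    # maximise the minimum value of f, so only these prefixes need checking.
    order = sorted(f, key=lambda y: f[y], reverse=True)
    for j in range(1, size_b + 1):
        threshold = 2.0 ** (-b) * (j / size_b) ** (1.0 / b) * total / j
        # The minimum of f over the top j elements is the j-th largest value.
        if f[order[j - 1]] >= threshold:
            return order[:j]
    return None
```

For example, with $f$ taking the values $5,1,0,0$ and $b=2$, the threshold for $j=1$ is $2^{-2}\cdot(1/4)^{1/2}\cdot 6=0.75$, so the single element of weight 5 already suffices.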

3 Preliminaries

3.1 Key Definitions

As noted in the proof outline in Subsection 1.1, we wish to consider hypergraphs on $V(\theta_{a,b})\times V(G)$. To aid with this, we make use of the following definitions throughout the paper; see Figure 3 for an example.

Definition 2.

Given a graph GG and a set χV(θa,b)×V(G)\chi\subseteq V(\theta_{a,b})\times V(G), we define the projection sets χθ={w:z,(w,z)χ}\chi_{\theta}=\{w:\exists z,(w,z)\in\chi\} and χG={z:w,(w,z)χ}\chi_{G}=\{z:\exists w,(w,z)\in\chi\}. We say that a set χV(θa,b)×V(G)\chi\subseteq V(\theta_{a,b})\times V(G) is valid if

  1. 1.

    |χ|=|χθ|=|χG||\chi|=|\chi_{\theta}|=|\chi_{G}| (equivalently, each vertex of θa,b\theta_{a,b} and GG appears at most once in a pair of χ\chi), and

  2. 2.

    If (w,z),(w,z)χ(w,z),(w^{\prime},z^{\prime})\in\chi with wwE(θa,b)ww^{\prime}\in E(\theta_{a,b}), then zzE(G)zz^{\prime}\in E(G).

We let 𝕍\mathbb{V} denote the set of valid subsets of V(θa,b)×V(G)V(\theta_{a,b})\times V(G).
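The two conditions of Definition 2 can be checked mechanically. The sketch below is one possible encoding, assuming vertices are strings, $\chi$ is a set of (theta vertex, $G$ vertex) pairs, and edges are stored as two-element frozensets; the name `is_valid` and this representation are illustrative choices, not notation from the paper.

```python
from typing import FrozenSet, Set, Tuple

Pair = Tuple[str, str]  # (vertex of theta_{a,b}, vertex of G)


def is_valid(chi: Set[Pair],
             theta_edges: Set[FrozenSet[str]],
             g_edges: Set[FrozenSet[str]]) -> bool:
    """Check conditions 1 and 2 of Definition 2 for the set chi."""
    chi_theta = {w for w, _ in chi}  # projection onto V(theta_{a,b})
    chi_g = {z for _, z in chi}      # projection onto V(G)
    # Condition 1: |chi| = |chi_theta| = |chi_G|.
    if not (len(chi) == len(chi_theta) == len(chi_g)):
        return False
    # Condition 2: edges of theta_{a,b} inside chi_theta map to edges of G.
    pairs = list(chi)
    for i, (w, z) in enumerate(pairs):
        for w2, z2 in pairs[i + 1:]:
            if frozenset((w, w2)) in theta_edges and frozenset((z, z2)) not in g_edges:
                return False
    return True
```

Note that pairs whose theta vertices are non-adjacent impose no constraint on $G$, so a set such as $\{(u,x),(v,y)\}$ is valid whether or not $xy$ is an edge of $G$.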



Figure 3: A set χV(θa,b)×V(G)\chi\subseteq V(\theta_{a,b})\times V(G) together with its projections χθV(θa,b)\chi_{\theta}\subseteq V(\theta_{a,b}) and χGV(G)\chi_{G}\subseteq V(G). The set χ\chi is valid if and only if all of the vertices depicted are distinct and xz11,z32yE(G)xz_{1}^{1},\ z_{3}^{2}y\in E(G).

Note that if $\chi$ is a valid set with $|\chi|=v(\theta_{a,b})$, then by definition this means the map $\phi_{\chi}:V(\theta_{a,b})\to V(G)$ which sends $w\in V(\theta_{a,b})$ to the unique vertex $z\in V(G)$ with $(w,z)\in\chi$ is an injective homomorphism. In particular, the vertices $\chi_{G}$ induce a graph containing a copy of $\theta_{a,b}$ as a subgraph. Since our ultimate goal is to find a large collection of such subgraphs which are spread out, we make the following definitions.

Definition 3.

We say that a hypergraph \mathcal{H} is a GG-hypergraph if its vertex set is V(θa,b)×V(G)V(\theta_{a,b})\times V(G) and all of its hyperedges are valid sets hh with |h|=v(θa,b)|h|=v(\theta_{a,b}). We say that functions of the form D:2V(θa,b){}D:2^{V(\theta_{a,b})}\to\mathbb{N}\cup\{\infty\} are codegree functions, and for such a DD we say that \mathcal{H} is DD-good if deg(χ)D(χθ)\deg_{\mathcal{H}}(\chi)\leq D(\chi_{\theta}) for all χ𝕍\chi\in\mathbb{V}.

Note that being DD-good means that no valid set χ\chi is contained in too many hyperedges, with the exact degree condition depending only on DD and the projection χθV(θa,b)\chi_{\theta}\subseteq V(\theta_{a,b}) (which allows us, for example, to impose stronger conditions if χ\chi contains vertices corresponding to the high degree vertices of V(θa,b)V(\theta_{a,b})).
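To make the $D$-good condition concrete, here is a small brute-force Python sketch. It assumes hyperedges are frozensets of (theta vertex, $G$ vertex) pairs and that the codegree function is passed as a Python callable on frozensets of theta vertices; the names `codegree` and `is_D_good` are hypothetical. Since a set of degree 0 can never violate the condition, it is enough to check subsets of hyperedges (including the empty set), which takes time exponential in $v(\theta_{a,b})$ and is only meant for tiny examples.

```python
from itertools import combinations
from typing import Callable, FrozenSet, Set, Tuple

Pair = Tuple[str, str]  # (vertex of theta_{a,b}, vertex of G)


def codegree(hyperedges: Set[FrozenSet[Pair]], chi: FrozenSet[Pair]) -> int:
    """Number of hyperedges containing chi."""
    return sum(1 for h in hyperedges if chi <= h)


def is_D_good(hyperedges: Set[FrozenSet[Pair]],
              D: Callable[[FrozenSet[str]], float]) -> bool:
    """Check deg(chi) <= D(chi_theta) for every chi contained in a hyperedge."""
    seen = set()
    for h in hyperedges:
        for size in range(len(h) + 1):
            for subset in combinations(sorted(h), size):
                chi = frozenset(subset)
                if chi in seen:
                    continue
                seen.add(chi)
                chi_theta = frozenset(w for w, _ in chi)
                if codegree(hyperedges, chi) > D(chi_theta):
                    return False
    return True
```

Returning `float("inf")` from `D` plays the role of the value $\infty$ in Definition 3, since every integer degree compares as smaller.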

The main technical work of this paper is in constructing GG-hypergraphs which have many hyperedges and which are DD-good for some DD which is sufficiently small to apply 6.1. To do this, we will consider several DD functions simultaneously (which will ultimately be combined into a unified codegree bound in the next two sections). The first and simplest function we consider is the following.

Definition 4.

Let δ>0\delta>0, and let k,nk,n be positive integers. We define a codegree function DforestD_{\operatorname{forest}} as follows: if νV(θa,b)\nu\subseteq V(\theta_{a,b}) induces a forest on e1e\geq 1 edges, then

\[D_{\operatorname{forest}}(\nu)=\left\lceil\frac{k^{ab}n^{2}}{\delta kn^{1+1/b}\cdot(\delta k^{b/(b-1)})^{e-1}}\right\rceil,\]

and otherwise Dforest(ν)=D_{\operatorname{forest}}(\nu)=\infty.

Having Dforest(ν)D_{\operatorname{forest}}(\nu) evaluate to infinity is not necessary, and we do this only to emphasize that this function essentially ignores sets ν\nu which do not induce a forest with at least one edge.

Essentially, the main technical result of Morris and Saxton and Corsten and Tran says that one can construct large GG-hypergraphs which are DforestD_{\operatorname{forest}}-good. To go beyond this, we will show that one can construct collections which are DD-good for functions DD which are finite on sets νV(θa,b)\nu\subseteq V(\theta_{a,b}) that contain cycles (and more precisely, sets that contain the two vertices of θa,b\theta_{a,b} of large degree).

The specific functions $D$ we need are somewhat complex. In all of these functions, the denominator of $D(\nu)$ roughly corresponds to the number of choices our algorithm has to build $\nu$, with the terms to the left of the “$\cdot$” typically counting the number of choices for the two high degree vertices of $\theta_{a,b}$. The parameter $s$ will be chosen roughly such that the graph $G^{\prime}$ obtained from Lemma 2.5 has $2^{-s}n$ vertices.

With all of this established, we define the remainder of our codegree functions.

Definition 5.

Let δ>0\delta>0, and let k,nk,n be positive integers. For a3a\geq 3, let u,vu,v denote the two vertices of θa,b\theta_{a,b} of degree larger than 2. For each integer 0s3logn0\leq s\leq 3\log n, we define a codegree function Ds,bD_{s,b} as follows: if u,vνu,v\in\nu, then

\[D_{s,b}(\nu)=\left\lceil\frac{k^{ab}n^{2}}{2^{-s}n^{2}\cdot\delta^{|\nu|}(2^{2s/3}k^{b})^{(|\nu|-2)/(b-1)}}\right\rceil,\]

and otherwise Ds,b(ν)=D_{s,b}(\nu)=\infty.

The definition above will be used when the $t$ value from Proposition 2.1 equals $b$. The $t<b$ case is somewhat more complicated: the denominator again roughly represents the number of choices our algorithm has at any step, and just as in [10, 20], the algorithm itself is more involved when $t<b$.
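To get a feel for the size of $D_{s,b}$ in Definition 5, the following Python sketch evaluates the formula numerically. The function name `D_sb` and its argument list are assumptions for this example; the set $\nu$ is summarised by its size and by whether it contains both $u$ and $v$, which is all the formula depends on. Floating-point arithmetic is used, so for very large parameters the ceiling may be affected by rounding.

```python
import math


def D_sb(nu_size: int, contains_u_and_v: bool, s: int,
         a: int, b: int, k: int, n: int, delta: float) -> float:
    """Evaluate D_{s,b}(nu) from Definition 5 (math.inf if u or v is missing)."""
    if not contains_u_and_v:
        return math.inf
    numerator = k ** (a * b) * n ** 2
    # Left of the dot: 2^{-s} n^2, roughly the number of choices for u and v.
    # Right of the dot: a factor for each of the remaining |nu| - 2 vertices.
    denominator = (2.0 ** (-s) * n ** 2 * delta ** nu_size
                   * (2.0 ** (2 * s / 3) * k ** b) ** ((nu_size - 2) / (b - 1)))
    return math.ceil(numerator / denominator)
```

For instance, with $a=b=3$, $k=2$, $n=16$, $\delta=1/2$ and $s=0$, a set $\nu=\{u,v\}$ gives $2^{9}\cdot 16^{2}/(16^{2}\cdot 2^{-2})=2048$, while increasing $s$ to 3 multiplies this by $2^{3}$ since $|\nu|-2=0$.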

Definition 6.

Let δ>0\delta>0, and let k,nk,n be positive integers. For 0s3logn0\leq s\leq 3\log n and 2t<b2\leq t<b, we define a codegree function Ds,tD_{s,t} as follows: write the paths of θa,b\theta_{a,b} as uw1jwb1jvuw_{1}^{j}\cdots w_{b-1}^{j}v for 1ja1\leq j\leq a, and define

\[F_{t}=\{w_{i}^{j}:t\leq i<b,\ i-t\textrm{ even}\},\qquad f=|\nu\cap F_{t}|.\]

If u,vνu,v\in\nu, then

\[D_{s,t}(\nu)=\left\lceil\frac{k^{ab}n^{2}}{2^{-2s}k^{(2b-2t+1)/(b-1)}n^{(2t-1)/b}\cdot\delta^{|\nu|}(2^{2s/3}kn^{1/b})^{f}(2^{2s/3}k^{b/(b-1)})^{|\nu|-f-2}}\right\rceil,\]

and otherwise Ds,t(ν)=D_{s,t}(\nu)=\infty.

The intuition for this codegree function is as follows: in the $t<b$ case with $s=0$, our algorithm first selects $u,v$, which it will be able to do in about $k^{(2b-2t+1)/(b-1)}n^{(2t-1)/b}$ ways (which is essentially the product of the bounds from Proposition 2.1(c) and (h)). When choosing $w_{i}^{j}$, the number of choices will turn out to be about $kn^{1/b}$ if $w_{i}^{j}\in F_{t}$ and about $k^{b/(b-1)}$ if $w_{i}^{j}\notin F_{t}$. Thus the denominator represents the number of choices our algorithm has for building $\nu$.

One last codegree function is needed for the t<bt<b case.

Definition 7.

Let δ>0\delta>0, and let k,nk,n be positive integers. For 2t<b2\leq t<b, we define a codegree function DtD_{t} as follows: define FtF_{t} as above and let g=|νFt|g=|\nu\cap F_{t}| if btb-t is even and g=|νFt|1g=|\nu\cap F_{t}|-1 otherwise. If u,v,wb1jνu,v,w_{b-1}^{j}\in\nu for some jj, then

\[D_{t}(\nu)=\left\lceil\frac{k^{ab}n^{2}}{kn^{1+1/b}\cdot(kn^{1/b})^{g}(k^{b/(b-1)})^{|\nu|-g-3}\delta^{|\nu|}}\right\rceil,\]

and otherwise Dt(ν)=D_{t}(\nu)=\infty. For notational convenience, we also define DbD_{b} by Db(ν)=D_{b}(\nu)=\infty for all νV(θa,b)\nu\subseteq V(\theta_{a,b}).

The motivation for this definition is that the number of choices for u,v,wb1ju,v,w_{b-1}^{j} is at least the number of choices for just v,wb1jv,w_{b-1}^{j}, which is about kn1+1/bkn^{1+1/b}, i.e. the number of total edges in the graph GG. As before, the number of choices for every other wijw_{i}^{j} vertex will depend on whether it is in FtF_{t} or not. The definition of gg reflects the fact that if btb-t is odd, then wb1jFtw_{b-1}^{j}\in F_{t}, but the number of choices for the first vertex of this form is already accounted for by the kn1+1/bkn^{1+1/b} term, so we can not include an extra factor of kn1/bkn^{1/b} for this vertex. Note that with this codegree function, we omit counting the number of choices for uu in the denominator. As such, DtD_{t} will typically be much weaker (i.e. larger) than Ds,tD_{s,t}, though it will do better when ν\nu has few vertices and kk is small.

3.2 Saturated Sets

When building our GG-hypergraph \mathcal{H}, we need to be careful to avoid constructing theta graphs which contain a subset that has very large codegree in \mathcal{H}. To aid with this, we introduce the following.

Definition 8.

Let $\mathcal{H}$ be a $G$-hypergraph and $D$ a codegree function $D:2^{V(\theta_{a,b})}\to\mathbb{N}\cup\{\infty\}$. We define the set of saturated sets

\[\mathcal{F}(\mathcal{H},D)=\{\chi\in\mathbb{V}:\deg(\chi)\geq D(\chi_{\theta})\}.\]

Given a valid set $\chi$ and $\nu\subseteq V(\theta_{a,b})\setminus\chi_{\theta}$, define the link set $\mathcal{J}_{\mathcal{H},D}(\chi;\nu)$ to be the set of $\gamma\in\mathbb{V}$ with $\gamma_{\theta}=\nu$ such that $\chi\cup\gamma\in\mathcal{F}(\mathcal{H},D)$. If $\nu=\{w\}$ we will sometimes denote this set simply by $\mathcal{J}_{\mathcal{H},D}(\chi;w)$.

The intuition for the link set comes from our goal of algorithmically trying to iteratively add a new theta graph to some \mathcal{H} such that \mathcal{H} continues to be DD-good even after adding the theta graph. If during the algorithm we have already designated some χ𝕍\chi\in\mathbb{V} to be used in our new theta graph, and if our algorithm is about to choose some γ\gamma to add to χ\chi such that γθ=ν\gamma_{\theta}=\nu, then the algorithm can not choose any γ𝒥,D(χ;ν)\gamma\in\mathcal{J}_{\mathcal{H},D}(\chi^{\prime};\nu) for any χχ\chi^{\prime}\subseteq\chi, as otherwise the degree of χγ\chi^{\prime}\cup\gamma would be strictly larger than what DD dictates. As an aside, our definition of link sets differs slightly from Morris and Saxton, who essentially defined the links to be χχ𝒥,D(χ;ν)\bigcup_{\chi^{\prime}\subseteq\chi}\mathcal{J}_{\mathcal{H},D}(\chi^{\prime};\nu).

Because the link sets represent the number of “bad” choices our algorithm has, we will want to show that these sets are relatively small. This is accomplished by the following lemma.

Lemma 3.1.

Let $\mathcal{H}$ be a $G$-hypergraph which is $D$-good for some $D:2^{V(\theta_{a,b})}\to\mathbb{N}\cup\{\infty\}$. If $D(\chi_{\theta}\cup\nu)=\infty$ then $\mathcal{J}_{\mathcal{H},D}(\chi;\nu)=\emptyset$, and otherwise

\[|\mathcal{J}_{\mathcal{H},D}(\chi;\nu)|\leq 2^{v(\theta_{a,b})}\frac{D(\chi_{\theta})}{D(\chi_{\theta}\cup\nu)}.\]
Proof.

If D(χθν)=D(\chi_{\theta}\cup\nu)=\infty, then every γ𝕍\gamma\in\mathbb{V} with γθ=ν\gamma_{\theta}=\nu trivially has deg(χγ)<D(χθγθ)=\deg(\chi\cup\gamma)<D(\chi_{\theta}\cup\gamma_{\theta})=\infty, so no such γ\gamma satisfies χγ(,D)\chi\cup\gamma\in\mathcal{F}(\mathcal{H},D) and we conclude 𝒥,D(χ;ν)=\mathcal{J}_{\mathcal{H},D}(\chi;\nu)=\emptyset. From now on we assume D(χθν)<D(\chi_{\theta}\cup\nu)<\infty. Note that

\[\sum_{\gamma\in\mathcal{J}_{\mathcal{H},D}(\chi;\nu)}\deg(\chi\cup\gamma)\leq\sum_{\gamma:\ |\gamma|=|\nu|}\deg(\chi\cup\gamma)\leq 2^{v(\theta_{a,b})}\deg(\chi)\leq 2^{v(\theta_{a,b})}D(\chi_{\theta}),\]

where the second inequality used that each hyperedge hh containing χ\chi is counted at most 2v(θa,b)2^{v(\theta_{a,b})} times by the sum over γ\gamma, and the last inequality used that \mathcal{H} is DD-good. On the other hand,

\[\sum_{\gamma\in\mathcal{J}_{\mathcal{H},D}(\chi;\nu)}\deg(\chi\cup\gamma)\geq\sum_{\gamma\in\mathcal{J}_{\mathcal{H},D}(\chi;\nu)}D(\chi_{\theta}\cup\gamma_{\theta})=|\mathcal{J}_{\mathcal{H},D}(\chi;\nu)|D(\chi_{\theta}\cup\nu).\]

Rearranging these two inequalities gives

\[|\mathcal{J}_{\mathcal{H},D}(\chi;\nu)|\leq 2^{v(\theta_{a,b})}\frac{D(\chi_{\theta})}{D(\chi_{\theta}\cup\nu)},\]

completing the proof. ∎

Because all of our codegree functions involve the ceiling of a real-valued function, the following result, which allows us to ignore the ceilings, will be slightly more convenient to use than Lemma 3.1.

Corollary 3.2.

Let $\mathcal{H}$ be a $G$-hypergraph which is $D$-good for some $D:2^{V(\theta_{a,b})}\to\mathbb{N}\cup\{\infty\}$, and suppose that $D(\nu)=\left\lceil D^{\prime}(\nu)\right\rceil$ for some $D^{\prime}:2^{V(\theta_{a,b})}\to\mathbb{R}_{>0}$. If $D(\chi_{\theta}\cup\nu)=\infty$ then $\mathcal{J}_{\mathcal{H},D}(\chi;\nu)=\emptyset$. If $D(\chi_{\theta}\cup\nu)\neq\infty$ and $\chi\notin\mathcal{F}(\mathcal{H},D)$, then

\[|\mathcal{J}_{\mathcal{H},D}(\chi;\nu)|\leq 2^{v(\theta_{a,b})+1}\frac{D^{\prime}(\chi_{\theta})}{D^{\prime}(\chi_{\theta}\cup\nu)}.\]

Moreover, this bound continues to hold for D=DforestD=D_{\operatorname{forest}} even if χ(,D)\chi\in\mathcal{F}(\mathcal{H},D) provided δ\delta is sufficiently small.

Proof.

Note that trivially $D(\chi_{\theta}\cup\nu)\geq D^{\prime}(\chi_{\theta}\cup\nu)$ and that $D(\chi_{\theta})\leq 2D^{\prime}(\chi_{\theta})$ provided $D(\chi_{\theta})\geq 2$. Thus in this case, the result follows immediately from Lemma 3.1, and in particular, this situation always holds for $D_{\operatorname{forest}}$ provided $\delta$ is sufficiently small.

It remains to consider the case that D(χθ)=1D(\chi_{\theta})=1 and χ(,D)\chi\notin\mathcal{F}(\mathcal{H},D). These two conditions imply deg(χ)<D(χθ)=1\deg(\chi)<D(\chi_{\theta})=1, so there exists no hyperedge of \mathcal{H} containing χ\chi. Thus there exists no γ\gamma such that χγ(,D)\chi\cup\gamma\in\mathcal{F}(\mathcal{H},D), i.e. such that deg(χγ)D(χθγθ)1\deg(\chi\cup\gamma)\geq D(\chi_{\theta}\cup\gamma_{\theta})\geq 1. We conclude that the link set is empty in this case, and hence the result trivially holds. ∎

When applying this corollary it will always be immediate that $\chi\notin\mathcal{F}(\mathcal{H},D)$, and for simplicity we will omit saying this explicitly. (Footnote 3: In terms of the notation for the next section, we will only apply Corollary 3.2 with $D\neq D_{\operatorname{forest}}$ when $\chi$ is a subset of an $(s,t)$-compatible set, which by definition will not be in $\mathcal{F}(\mathcal{H},D)$.)

4 Balanced Supersaturation for Vertices

In this section, we prove our main technical theorem: a balanced supersaturation result for vertices.

Theorem 4.1.

For all a6a\geq 6 and b3b\geq 3, there exist constants δ>0,k0>0\delta>0,k_{0}>0 such that the following holds for all nn\in\mathbb{N} and kk0k\geq k_{0}. If GG is an nn-vertex graph with kn1+1/bkn^{1+1/b} edges, then there exists an integer 2tb2\leq t\leq b and a GG-hypergraph t\mathcal{H}^{\prime}_{t} with |t|b1δkabn2|\mathcal{H}^{\prime}_{t}|\geq b^{-1}\delta k^{ab}n^{2} which is DtD^{\prime}_{t}-good, where DtD^{\prime}_{t} is defined by

\[D^{\prime}_{t}(\nu):=\begin{cases}D_{\operatorname{forest}}(\nu)&\textrm{if }\nu\textrm{ induces a forest},\\ \min\{D_{t}(\nu),20(D_{0,t}(\nu)+\left\lceil\log n\right\rceil)\}&\textrm{otherwise}.\end{cases}\]

We note that there is no need to consider b=2b=2 since the case θa,2=K2,a\theta_{a,2}=K_{2,a} is already dealt with by Morris and Saxton [20].

Theorem 4.1 will follow quickly from the following technical result, which roughly says that given a collection of far fewer than $k^{ab}n^{2}$ copies of $\theta_{a,b}$ satisfying certain codegree conditions, we can find an additional copy of $\theta_{a,b}$ to add to the collection while maintaining the desired codegrees.

Proposition 4.2.

For all $a\geq 6$ and $b\geq 3$, there exist constants $\delta>0,k_{0}>0$ such that the following holds for all $n\in\mathbb{N}$ and $k\geq k_{0}$. Let $G$ be an $n$-vertex graph with $kn^{1+1/b}$ edges, and let $\{\mathcal{H}_{s,t}\}_{s,t}$ be a set of $G$-hypergraphs such that $\mathcal{H}_{s,t}$ is $D_{s,t}$-good for each $0\leq s\leq 3\log n$ and $2\leq t\leq b$, $\bigcup_{s}\mathcal{H}_{s,t}$ is $D_{t}$-good for each $t<b$, and $\bigcup_{s,t}\mathcal{H}_{s,t}$ is $D_{\operatorname{forest}}$-good.

If |s,ts,t|δkabn2|\bigcup_{s,t}\mathcal{H}_{s,t}|\leq\delta k^{ab}n^{2}, then there exists some valid set h𝕍h\in\mathbb{V} of size v(θa,b)v(\theta_{a,b}) and some s,ts^{\prime},t^{\prime} such that hs,ts,th\notin\bigcup_{s,t}\mathcal{H}_{s,t}, and such that if we define s,t=s,t{h}\mathcal{H}^{\prime}_{s^{\prime},t^{\prime}}=\mathcal{H}_{s^{\prime},t^{\prime}}\cup\{h\} and s,t=s,t\mathcal{H}^{\prime}_{s,t}=\mathcal{H}_{s,t} for all (s,t)(s,t)(s,t)\neq(s^{\prime},t^{\prime}), then s,t\mathcal{H}_{s,t}^{\prime} is Ds,tD_{s,t}-good for each 0s3logn0\leq s\leq 3\log n and 2tb2\leq t\leq b, and ss,t\bigcup_{s}\mathcal{H}_{s,t}^{\prime} is DtD_{t}-good for each t<bt<b, and s,ts,t\bigcup_{s,t}\mathcal{H}_{s,t}^{\prime} is DforestD_{\operatorname{forest}}-good.

Before proving Proposition 4.2, we show how it may be repeatedly applied to obtain our main supersaturation theorem.

Proof of Theorem 4.1.

Start with collections $\{\mathcal{H}_{s,t}\}_{s,t}$ where $\mathcal{H}_{s,t}=\emptyset$ for all $s,t$. By repeatedly applying Proposition 4.2, we obtain collections satisfying all of the codegree conditions and with $|\bigcup_{s,t}\mathcal{H}_{s,t}|\geq\delta k^{ab}n^{2}$. In particular, there exists some $2\leq t\leq b$ such that $\mathcal{H}_{t}^{\prime}:=\bigcup_{s}\mathcal{H}_{s,t}$ contains at least $b^{-1}\delta k^{ab}n^{2}$ hyperedges.

By Proposition 4.2, we have for all χ𝕍\chi\in\mathbb{V} that

\[\deg_{\mathcal{H}^{\prime}_{t}}(\chi)\leq\deg_{\bigcup_{s,t}\mathcal{H}_{s,t}}(\chi)\leq D_{\operatorname{forest}}(\chi_{\theta}),\]

and similarly

\[\deg_{\mathcal{H}^{\prime}_{t}}(\chi)\leq D_{t}(\chi_{\theta}).\]

To complete the proof, we only have to show $\deg_{\mathcal{H}^{\prime}_{t}}(\chi)\leq 20(D_{0,t}(\nu)+\left\lceil\log n\right\rceil)$ for all $\chi$ such that $\nu:=\chi_{\theta}$ contains a cycle. We first consider the case $t=b$. Here Proposition 4.2 gives

\[\deg_{\mathcal{H}^{\prime}_{b}}(\chi)\leq\sum_{s=0}^{3\log n}\deg_{\mathcal{H}_{s,b}}(\chi)\leq\sum_{s=0}^{3\log n}D_{s,b}(\nu)\leq 3\left\lceil\log n\right\rceil+4D_{0,b}(\nu)\sum_{s=0}^{3\log n}2^{\left(1-\frac{2(|\chi_{\theta}|-2)}{3(b-1)}\right)s},\]

where this last step used that either Ds,b(ν)=1D_{s,b}(\nu)=1, or (by Definition 5) Ds,b(ν)D_{s,b}(\nu) differs from D0,b(ν)D_{0,b}(\nu) by at most a multiplicative factor of 42(12(|χθ|2)3(b1))s4\cdot 2^{\left(1-\frac{2(|\chi_{\theta}|-2)}{3(b-1)}\right)s} (where the factor of 4 comes from the two ceiling functions involving Ds,bD_{s,b} and D0,bD_{0,b}). Since χθ\chi_{\theta} contains a cycle, we have |χθ|2b|\chi_{\theta}|\geq 2b, so the sum above is at most s=02s/35\sum_{s=0}^{\infty}2^{-s/3}\leq 5. We conclude that degb(χ)Db(χθ)\deg_{\mathcal{H}_{b}^{\prime}}(\chi)\leq D^{\prime}_{b}(\chi_{\theta}) for all valid χ\chi.
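The bound on the geometric sum can be spelled out as follows. This is a short worked computation using only $|\chi_{\theta}|\geq 2b$; the numerical value is rounded.

\[
1-\frac{2(|\chi_{\theta}|-2)}{3(b-1)}\;\leq\;1-\frac{2(2b-2)}{3(b-1)}\;=\;1-\frac{4}{3}\;=\;-\frac{1}{3},
\qquad
\sum_{s=0}^{\infty}2^{-s/3}=\frac{1}{1-2^{-1/3}}\approx 4.85\leq 5.
\]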

When t<bt<b, essentially the same reasoning gives that if χθ\chi_{\theta} contains a cycle then

\[\deg_{\mathcal{H}^{\prime}_{t}}(\chi)\leq 3\left\lceil\log n\right\rceil+4D_{0,t}(\nu)\sum_{s=0}^{\infty}2^{\left(2-(2b-2)\frac{2}{3}\right)s},\]

and since b3b\geq 3 this latter sum is at most s=022s/35\sum_{s=0}^{\infty}2^{-2s/3}\leq 5. This gives the result. ∎

The rest of this section is dedicated to proving Proposition 4.2. Let $G$ be the graph in the hypothesis of Proposition 4.2 and $\{\mathcal{H}_{s,t}\}_{s,t}$ the corresponding collections.

The basic idea of the proof is to algorithmically construct many copies of θa,b\theta_{a,b}, and to show that at least one of them is not already contained in s,t\mathcal{H}_{s^{\prime},t^{\prime}} for some appropriate s,ts^{\prime},t^{\prime}, and such that our codegree conditions continue to be satisfied.

We will begin by pruning the graph GG so that all of its remaining edges and vertices are “well-behaved” (Section 4.1). We then give the algorithmic construction of copies of θa,b\theta_{a,b} and show that a new copy may be added to some s,t\mathcal{H}_{s,t} (Section 4.2), completing the proof. Throughout the argument we fix some δ\delta depending only on a,ba,b which is sufficiently small for our arguments to go through.

4.1 Pruning

Let G0GG_{0}\subseteq G be the graph obtained by deleting edges of GG that are already “saturated” by hyperedges of s,ts,t\bigcup_{s,t}\mathcal{H}_{s,t}, that is, those edges eE(G)e\in E(G) for which there exists χ𝕍\chi\in\mathbb{V} with χG=e\chi_{G}=e and

\[\deg_{\bigcup_{s,t}\mathcal{H}_{s,t}}(\chi)\,\geq\,D_{\operatorname{forest}}\big(\chi_{\theta}\big)=\left\lceil\frac{k^{ab}n^{2}}{\delta kn^{1+1/b}}\right\rceil.\]

This will ensure that any new theta graph constructed using only edges of G0G_{0} will not violate our edge-codegree bounds when added to s,ts,t\bigcup_{s,t}\mathcal{H}_{s,t}. We bound the number of these saturated edges by double-counting elements of s,ts,t\bigcup_{s,t}\mathcal{H}_{s,t}\,:

\[\left(\text{\# edges $e$ with $\chi\in\mathbb{V},\ \chi_{G}=e,\ \deg_{\bigcup_{s,t}\mathcal{H}_{s,t}}(\chi)=\left\lceil\tfrac{k^{ab}n^{2}}{\delta kn^{1+1/b}}\right\rceil$}\right)\cdot\left\lceil\frac{k^{ab}n^{2}}{\delta kn^{1+1/b}}\right\rceil\leq e(\theta_{a,b})\cdot\Big|\textstyle\bigcup_{s,t}\mathcal{H}_{s,t}\Big|.\]

Rearranging slightly, the number of such edges is at most

\[e(\theta_{a,b})\,\Big|\textstyle\bigcup_{s,t}\mathcal{H}_{s,t}\Big|\bigg/\left\lceil\dfrac{k^{ab}n^{2}}{\delta kn^{1+1/b}}\right\rceil\leq\,\delta^{2}e(\theta_{a,b})\cdot kn^{1+1/b},\]

and if δ\delta is sufficiently small this is at most 12kn1+1/b=12e(G)\frac{1}{2}kn^{1+1/b}=\frac{1}{2}e(G), which implies e(G0)12kn1+1/be(G_{0})\geq\frac{1}{2}kn^{1+1/b}. Notice that no remaining edges are “saturated,” i.e. we have

\[\deg_{\bigcup_{s,t}\mathcal{H}_{s,t}}(\chi)\leq\left\lceil\frac{k^{ab}n^{2}}{\delta kn^{1+1/b}}\right\rceil-1\textrm{ for all }\chi\in\mathbb{V}\textrm{ such that }\chi_{G}\in E(G_{0}).\tag{2}\]

Now we further prune the graph by eliminating low-degree vertices: let GG0G^{\prime}\subseteq G_{0} be the subgraph of high minimum degree guaranteed by Lemma 2.5. Although GG^{\prime} may have substantially fewer vertices than GG (meaning our algorithm will have fewer choices at various steps), in this case it will compensate by having a substantially larger minimum degree.

More concretely, let m=v(G)m=v(G^{\prime}), and let \ell be the real number such that GG^{\prime} has minimum degree m1/b\ell m^{1/b}. By Lemma 2.5 we have

\[\ell m^{1/b}\geq 2^{-b-1}kn^{1+1/b}m^{-1}(n/m)^{-1/b}\implies\ell\geq 2^{-b-1}(n/m)k.\]

We let rr be the unique integer such that

\[2^{-r}n\leq m<2^{-r+1}n,\tag{3}\]

and we note that the previous inequality implies

\[\ell\geq 4^{-b}2^{r}k.\tag{4}\]
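For completeness, the step from (3) to (4) can be written out as follows; the only extra input is $b\geq 2$, which holds throughout this section since $b\geq 3$.

\[
m<2^{-r+1}n\;\Longrightarrow\;\frac{n}{m}>2^{r-1}
\;\Longrightarrow\;
\ell\;\geq\;2^{-b-1}\frac{n}{m}\,k\;>\;2^{-b-1}\cdot 2^{r-1}k\;=\;2^{-b-2}\,2^{r}k\;\geq\;4^{-b}\,2^{r}k,
\]

where the last inequality uses $2^{-b-2}\geq 2^{-2b}$ for $b\geq 2$.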

In total this implies the minimum degree m1/b\ell m^{1/b} of GG^{\prime} is at least Ω(kn1/b)\Omega(kn^{1/b}), which is the average degree of GG, and that the minimum degree of GG^{\prime} is much larger compared to that of GG if m=v(G)m=v(G^{\prime}) is much smaller than n=v(G)n=v(G). Before moving on, we note

\[\ell^{b/(b-1)}\leq\ell m^{1/b}.\tag{5}\]

Indeed, since m1/b\ell m^{1/b} is the minimum degree of the mm-vertex graph GG^{\prime}, we must have m1/bm\ell m^{1/b}\leq m, and rearranging shows this is equivalent to (5).
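The rearrangement behind (5) is the following chain, which uses $\ell>0$ and $b>1$:

\[
\ell m^{1/b}\leq m
\;\Longleftrightarrow\;
\ell\leq m^{(b-1)/b}
\;\Longleftrightarrow\;
\ell^{1/(b-1)}\leq m^{1/b}
\;\Longleftrightarrow\;
\ell^{b/(b-1)}=\ell\cdot\ell^{1/(b-1)}\leq\ell m^{1/b}.
\]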

4.2 The Algorithm

We are now ready to begin finding copies of $\theta_{a,b}$ in $G^{\prime}$. Our strategy is roughly as follows: first, we identify which collection $\mathcal{H}_{s,t}$ we wish to add a copy to (based on the expansion properties of $G^{\prime}$ detailed in Proposition 2.1). After this, we carefully choose vertices $x$ and $y$ to serve as the two high-degree vertices for the copies we will add. We then show that $x$ and $y$ are not already contained in too many copies in $\mathcal{H}_{s,t}$, and algorithmically construct a large number of theta graphs in $G^{\prime}$ that do contain $x$ and $y$; this allows us to conclude that we have found at least one new copy not already contained in $\mathcal{H}_{s,t}$. Crucially, along the way, we ensure that at no step of the algorithm are our codegree conditions violated, ensuring that the copy added to $\mathcal{H}_{s,t}$ is “good.”

Before delving into the meat of the proof, we introduce some notation which is more compact. Define

\[\mathcal{H}=\bigcup_{s,t}\mathcal{H}_{s,t},\qquad\mathcal{H}_{t}=\bigcup_{s}\mathcal{H}_{s,t}.\]

Also define

\[\mathcal{F}_{\operatorname{forest}}=\mathcal{F}(\mathcal{H},D_{\operatorname{forest}}),\qquad\mathcal{F}_{t}=\mathcal{F}(\mathcal{H}_{t},D_{t}),\qquad\mathcal{F}_{s,t}=\mathcal{F}(\mathcal{H}_{s,t},D_{s,t}).\]

When applying 3.2, we adopt the shorthand

𝒥forest=𝒥,Dforest,𝒥t=𝒥t,Dt,𝒥s,t=𝒥s,t,Ds,t.\mathcal{J}_{\operatorname{forest}}=\mathcal{J}_{\mathcal{H},D_{\operatorname{forest}}},\hskip 20.00003pt\mathcal{J}_{t}=\mathcal{J}_{\mathcal{H}_{t},D_{t}},\hskip 20.00003pt\mathcal{J}_{s,t}=\mathcal{J}_{\mathcal{H}_{s,t},D_{s,t}}.

We say that a set χ\chi is (s,t)(s,t)-compatible if χ𝕍\chi\in\mathbb{V} and if no subset of χ\chi lies in forestts,t\mathcal{F}_{\operatorname{forest}}\cup\mathcal{F}_{t}\cup\mathcal{F}_{s,t}. Crucially, we observe that proving the proposition is equivalent to showing that for some s,ts,t, there exists an (s,t)(s,t)-compatible set hh with |h|=v(θa,b)|h|=v(\theta_{a,b}) such that hh\notin\mathcal{H} (since, for example, no subset of hh being in forest\mathcal{F}_{\operatorname{forest}} implies {h}\mathcal{H}\cup\{h\} is DforestD_{\operatorname{forest}}-good).

Before moving on, we make a small but important observation.

Claim 4.3.

If χ\chi is a valid set and νV(θa,b)χθ\nu\subseteq V(\theta_{a,b})\setminus\chi_{\theta} is such that χθν\chi_{\theta}\cup\nu induces at most one edge in θa,b\theta_{a,b}, then 𝒥forest(χ;ν)=\mathcal{J}_{\operatorname{forest}}(\chi;\nu)=\emptyset.

Proof.

If χθν\chi_{\theta}\cup\nu induces 0 edges then Dforest(χθν)=D_{\operatorname{forest}}(\chi_{\theta}\cup\nu)=\infty and the result follows from 3.2. Thus we can assume χθν\chi_{\theta}\cup\nu induces exactly one edge wwww^{\prime}. If there exists some γ𝒥forest(χ;ν)\gamma\in\mathcal{J}_{\operatorname{forest}}(\chi;\nu), then there exist pairs (w,z),(w,z)χγ(w,z),(w^{\prime},z^{\prime})\in\chi\cup\gamma (since (χγ)θ=χθν(\chi\cup\gamma)_{\theta}=\chi_{\theta}\cup\nu). In this case we have

\[\deg_{\mathcal{H}}(\chi\cup\gamma)\leq\deg_{\mathcal{H}}(\{(w,z),(w^{\prime},z^{\prime})\})<D_{\operatorname{forest}}(\{w,w^{\prime}\})=D_{\operatorname{forest}}(\chi_{\theta}\cup\gamma_{\theta}),\]

where the first inequality used that we are looking at the codegree of a smaller set, the second inequality follows from (2), which says $G^{\prime}\subseteq G_{0}$ does not contain any edges $zz^{\prime}$ with $\deg_{\mathcal{H}}(\{(w,z),(w^{\prime},z^{\prime})\})\geq D_{\operatorname{forest}}(\{w,w^{\prime}\})$, and the equality used that $\chi_{\theta}\cup\gamma_{\theta}=\chi_{\theta}\cup\nu$ induces exactly one edge. This inequality implies $\chi\cup\gamma\notin\mathcal{F}(\mathcal{H},D_{\operatorname{forest}})$, contradicting the assumption $\gamma\in\mathcal{J}_{\operatorname{forest}}(\chi;\nu)$. We conclude that this link set is indeed empty. ∎

4.2.1 The Setup

We wish to apply Proposition 2.1 to the “pruned” graph $G^{\prime}$. For this we need to specify a set of forests to avoid. Intuitively we wish to use the set $\mathcal{F}_{\operatorname{forest}}$, but this is a collection of subsets of $V(\theta_{a,b})\times V(G)$, not of subgraphs of $G^{\prime}$. To get around this minor technicality, for $\chi\in\mathbb{V}$ we define the “projection graph” $H_{\chi}$ by

\[V(H_{\chi})=\chi_{G},\qquad E(H_{\chi})=\{zz^{\prime}:\exists ww^{\prime}\in E(\theta_{a,b}),\ (w,z),(w^{\prime},z^{\prime})\in\chi\}.\]

Note that by definition of χ\chi being valid, HχH_{\chi} is a subgraph of GG which is isomorphic to the subgraph of θa,b\theta_{a,b} induced by χθ\chi_{\theta}. We let forest={Hχ:χforest}\mathcal{F}^{\prime}_{\operatorname{forest}}=\{H_{\chi}:\chi\in\mathcal{F}_{\operatorname{forest}}\}. Since Dforest(ν)=D_{\operatorname{forest}}(\nu)=\infty for ν\nu which do not induce forests, every element of forest\mathcal{F}_{\operatorname{forest}}^{\prime} is a forest. To apply Proposition 2.1 with this set, it remains to verify the following, which gives the hypotheses of Proposition 2.1(g).
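As an aside before the claim, the projection graph is easy to compute explicitly. The following is a minimal sketch, assuming a hypothetical labelling in which the vertex $w_{i}^{j}$ of $\theta_{a,b}$ is encoded as the tuple `('w', i, j)` and the high-degree vertices as the strings `'u'` and `'v'`; the function names are illustrative and not part of the paper.

```python
from itertools import combinations

def theta_edges(a, b):
    """Edge set of theta_{a,b}: a internally disjoint u-v paths with b edges each.

    Hypothetical labelling: ('w', i, j) stands for the vertex w_i^j.
    """
    edges = set()
    for j in range(1, a + 1):
        path = ['u'] + [('w', i, j) for i in range(1, b)] + ['v']
        for p, q in zip(path, path[1:]):
            edges.add(frozenset((p, q)))
    return edges

def projection_graph(chi, edges):
    """Return (V(H_chi), E(H_chi)) for a valid set chi of pairs (w, z)."""
    vertices = {z for _, z in chi}
    h_edges = set()
    for (w, z), (w2, z2) in combinations(chi, 2):
        # zz' is an edge of H_chi exactly when ww' is an edge of theta_{a,b}.
        if frozenset((w, w2)) in edges:
            h_edges.add(frozenset((z, z2)))
    return vertices, h_edges
```

For instance, with $a=2$, $b=3$ and $\chi=\{(u,1),(w_{1}^{1},2),(v,5)\}$, only the pair $u,w_{1}^{1}$ is adjacent in $\theta_{a,b}$, so $H_{\chi}$ has the single edge $\{1,2\}$ and the isolated vertex $5$.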

Claim 4.4.

Let ε>0\varepsilon>0 be as in Proposition 2.1. If δ>0\delta>0 is sufficiently small, then for every path x1xpx_{1}\cdots x_{p} of GG^{\prime} with pbp\leq b which does not contain an element of forest\mathcal{F}_{\operatorname{forest}}^{\prime} as a subgraph, the number of vertices xp+1NG(xp)x_{p+1}\in N_{G^{\prime}}(x_{p}) such that some subgraph of the path x1xp+1x_{1}\cdots x_{p+1} is in forest\mathcal{F}_{\operatorname{forest}}^{\prime} is at most εm1/b\varepsilon\ell m^{1/b}.

Proof.

Fix any path $x_{1}\cdots x_{p}$ as above; we wish to bound the number of vertices $x_{p+1}$ described in the claim. We introduce the following notation, which will only be used in the proof of this claim: we say that a pair $(\chi,w)$ with $\chi\in\mathbb{V}$ and $w\in V(\theta_{a,b})$ is good if $\chi_{G}\subseteq\{x_{1},\ldots,x_{p}\}$ and $w$ is adjacent to at most one vertex of $\chi_{\theta}$. We claim that if $x_{p+1}\in N_{G^{\prime}}(x_{p})$ is such that some subgraph of the path $x_{1}\cdots x_{p+1}$ is in $\mathcal{F}^{\prime}_{\operatorname{forest}}$, then $\{(w,x_{p+1})\}\in\mathcal{J}_{\operatorname{forest}}(\chi;w)$ for some good pair $(\chi,w)$.

Indeed, say the subgraph of x1xp+1x_{1}\cdots x_{p+1} in forest\mathcal{F}^{\prime}_{\operatorname{forest}} was HγH_{\gamma} for some γ𝕍\gamma\in\mathbb{V}. Because HγH_{\gamma} is not a subgraph of x1xpx_{1}\cdots x_{p}, we must have xp+1V(Hγ)x_{p+1}\in V(H_{\gamma}), and thus we have (w,xp+1)γ(w,x_{p+1})\in\gamma for some wV(θa,b)w\in V(\theta_{a,b}). If χ:=γ{(w,xp+1)}\chi:=\gamma\setminus\{(w,x_{p+1})\}, then HγH_{\gamma} being a subgraph of x1xp+1x_{1}\cdots x_{p+1} implies χG{x1,,xp}\chi_{G}\subseteq\{x_{1},\ldots,x_{p}\} and that ww is adjacent to at most one vertex of χθ\chi_{\theta} (as otherwise xp+1x_{p+1} would have degree greater than 1 in HγH_{\gamma}, contradicting this being a subgraph of x1xp+1x_{1}\cdots x_{p+1}). In total we find γ=χ{(w,xp+1)}\gamma=\chi\cup\{(w,x_{p+1})\} for some good pair (χ,w)(\chi,w). Moreover, by definition of HγforestH_{\gamma}\in\mathcal{F}^{\prime}_{\operatorname{forest}}, we find

\[\chi\cup\{(w,x_{p+1})\}=\gamma\in\mathcal{F}_{\operatorname{forest}}=\mathcal{F}(\mathcal{H},D_{\operatorname{forest}}),\]

so by definition {(w,xp+1)}𝒥forest(χ;w)\{(w,x_{p+1})\}\in\mathcal{J}_{\operatorname{forest}}(\chi;w) as desired.

With this we see that the number of choices for xp+1x_{p+1} is at most the number of elements of 𝒥forest(χ;w)\mathcal{J}_{\operatorname{forest}}(\chi;w) for all possible good pairs (χ,w)(\chi,w). To count the number of such elements, fix some good pair (χ,w)(\chi,w). If χθ\chi_{\theta} induces no edges, then since ww is adjacent to at most one vertex of χθ\chi_{\theta} by definition of (χ,w)(\chi,w) being a good pair, χθ{w}\chi_{\theta}\cup\{w\} induces at most one edge. Claim 4.3 then implies 𝒥forest(χ;w)=\mathcal{J}_{\operatorname{forest}}(\chi;w)=\emptyset.

Now assume χθ\chi_{\theta} induces at least one edge. By definition of DforestD_{\operatorname{forest}} and 3.2, we have for any good pair (χ,w)(\chi,w) that

\[|\mathcal{J}_{\operatorname{forest}}(\chi;w)|\leq 2^{v(\theta_{a,b})+1}\delta k^{b/(b-1)}\leq 2^{v(\theta_{a,b})+1}\delta 16^{b}\ell m^{1/b},\]

where the first inequality used that χθ{w}\chi_{\theta}\cup\{w\} induces at most one more edge than χθ\chi_{\theta}, and the last inequality used (4) and (5). As the total number of good pairs is at most v(θa,b)2p=Oa,b(1)v(\theta_{a,b})2^{p}=O_{a,b}(1), the result follows by taking δ\delta sufficiently small. ∎

With this claim and the fact that \ell is at least a sufficiently large constant due to (4), we can apply Proposition 2.1 to GG^{\prime} and forest\mathcal{F}_{\operatorname{forest}}^{\prime}, and we let t,Xt,X be the integer and set guaranteed by this proposition. We recall that u,vV(θa,b)u,v\in V(\theta_{a,b}) are the high degree vertices of θa,b\theta_{a,b}.

Let xXx\in X be a vertex such that (u,x)(u,x) is in the fewest hyperedges of \mathcal{H}, that is, a vertex with deg({(u,x)})=minyXdeg({(u,y)})\deg_{\mathcal{H}}(\{(u,x)\})=\min_{y\in X}\deg_{\mathcal{H}}(\{(u,y)\}) (this will help us ensure we can find a new copy of θa,b\theta_{a,b} containing xx). Let (,𝒬)(\mathcal{B},\mathcal{Q}) with =(B1,,Bt)\mathcal{B}=(B_{1},\ldots,B_{t}) be the pair for xx as guaranteed by Proposition 2.1(a).

We now split our analysis into several cases based on the value of $t$. The overarching strategy is the same in each case, but the details are somewhat simpler in the case $t=b$ (where $G^{\prime}$ has nice “random-like” expansion near $x$).

4.2.2 Case 1: t=bt=b

Our strategy is to build copies of θa,b\theta_{a,b} in GG^{\prime} which use xx as one of the high degree vertices uu. To choose the vertex yy which plays the role of the other high-degree vertex vv, we would like to ensure there are many paths in 𝒬\mathcal{Q} connecting yy to xx (which we will then use to build our theta graphs). And indeed, this holds for many vertices in BbB_{b}: for yBby\in B_{b}, let f(y)f(y) denote the number of paths of 𝒬\mathcal{Q} which yy is an endpoint of. By 2.6, there is some BBbB^{\prime}\subseteq B_{b} so that

\[\min_{y\in B^{\prime}}f(y)\geq 2^{-b}\left(\frac{|B^{\prime}|}{|B_{b}|}\right)^{1/b}\frac{\sum_{y\in B_{b}}f(y)}{|B^{\prime}|}. \tag{6}\]

Notice that we face a trade-off: we may have a large set of vertices BB^{\prime}, each of which is the endpoint of approximately the average number of paths, or a smaller set BB^{\prime} where f(y)f(y) is much larger than average. We can obtain a strong balanced supersaturation result either way, but to do so, we must keep track of this trade-off. To this end, let rr^{\prime} be the unique integer such that

\[2^{-r^{\prime}}m\leq|B^{\prime}|<2^{-r^{\prime}+1}m. \tag{7}\]

Since $|B_{b}|\leq m$ and $\sum_{y\in B_{b}}f(y)=|\mathcal{Q}|\geq\varepsilon\ell^{b}m$ by Proposition 2.1(c), and since $|B^{\prime}|<2^{-r^{\prime}+1}m$ by (7), we can relax (6) to

\[\min_{y\in B^{\prime}}f(y)\geq 2^{-b}\frac{\varepsilon\ell^{b}m}{|B_{b}|^{1/b}|B^{\prime}|^{1-1/b}}\geq\varepsilon 4^{-b}\cdot 2^{(1-1/b)r^{\prime}}\ell^{b}. \tag{8}\]

Recall from (3) that $2^{-r}n\leq m<2^{-r+1}n$, and let

\[s=2r+r^{\prime}.\]

Roughly speaking, if ss is small, then mm and/or |B||B^{\prime}| are large, which is what we would expect to happen if GG was a random graph (as opposed to GG being, e.g., a clique with isolated vertices, wherein both of these quantities would be small). We now aim to show that we can add a new theta graph to the collection s,b\mathcal{H}_{s,b}. Let yBy\in B^{\prime} be such that (v,y)(v,y) is in the fewest number of hyperedges with (u,x)(u,x) in \mathcal{H}. As before, this will help ensure we find a new hyperedge, since xx and yy are not already contained in too many elements of \mathcal{H}.
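To make the bookkeeping concrete, the dyadic parameters $r$, $r^{\prime}$ and $s$ can be computed as in the sketch below. The helper name `dyadic_index` and the sample sizes are hypothetical; the code only illustrates the defining inequalities (3) and (7) and uses integer arithmetic to avoid floating-point logarithms.

```python
def dyadic_index(total, part):
    """Return the unique integer r >= 0 with 2**(-r)*total <= part < 2**(-r+1)*total.

    Assumes 0 < part <= total (as for m <= n and |B'| <= m).
    """
    if not 0 < part <= total:
        raise ValueError("need 0 < part <= total")
    r = 0
    # Smallest r with total <= part * 2**r; minimality gives the strict upper bound.
    while part * 2**r < total:
        r += 1
    return r

# Illustrative sizes only: n = v(G), m = v(G'), size_B = |B'|.
n, m, size_B = 1000, 300, 40
r = dyadic_index(n, m)                # 250 <= 300 < 500
r_prime = dyadic_index(m, size_B)     # 37.5 <= 40 < 75
s = 2 * r + r_prime
```

With these sample values one gets $r=2$, $r^{\prime}=3$ and $s=7$.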

Claim 4.5.

The set χ:={(u,x),(v,y)}\chi:=\{(u,x),(v,y)\} satisfies

\[\deg_{\mathcal{H}}(\chi)\leq\delta\varepsilon^{-1}2^{s}k^{ab},\]

and is (s,b)(s,b)-compatible provided δ\delta is sufficiently small.

Proof.

First, recall that xx is such that (u,x)(u,x) is contained in the fewest number of hyperedges in \mathcal{H} among all vertices in the set XX, which means

\[\deg_{\mathcal{H}}(\{(u,x)\})\leq\frac{|\mathcal{H}|}{|X|}\leq\frac{\delta k^{ab}n^{2}}{\varepsilon m},\]

where the second inequality used |X|εm|X|\geq\varepsilon m by 2.1(b) when t=bt=b and the hypothesis ||δkabn2|\mathcal{H}|\leq\delta k^{ab}n^{2} of 4.2. Similarly, as yBy\in B^{\prime} is such that (v,y)(v,y) is in the fewest number of hyperedges with (u,x)(u,x) in \mathcal{H}, we have

\[\deg_{\mathcal{H}}(\chi)\leq\frac{\delta k^{ab}n^{2}}{\varepsilon m|B^{\prime}|}\leq\frac{\delta k^{ab}n^{2}}{\varepsilon 2^{-2r-r^{\prime}}n^{2}}=\delta\varepsilon^{-1}2^{s}k^{ab},\]

where the second inequality used (3) and (7).

It remains to show $\chi$ is $(s,b)$-compatible. First we show $\chi$ is a valid set. Since $u,v$ are distinct non-adjacent vertices of $\theta_{a,b}$, we only need to check $x\neq y$. And indeed, we cannot have $x\in B_{b}$, since by Proposition 2.1(e), every element of $B_{b}$ is the endpoint of a positive number of paths of length $b$ from $x$ (and since these are paths, $x$ cannot serve as both endpoints). Since $y\in B^{\prime}\subseteq B_{b}$, we conclude $x\neq y$ and that $\chi$ is valid.

It remains to check that every subset of χ\chi satisfies the desired codegree conditions. Note that for any χχ\chi^{\prime}\subseteq\chi of size 1, we have Dforest(χθ)=Db(χθ)=Ds,b(χθ)=D_{\operatorname{forest}}(\chi^{\prime}_{\theta})=D_{b}(\chi^{\prime}_{\theta})=D_{s,b}(\chi^{\prime}_{\theta})=\infty, and as such χ\chi^{\prime} will not belong to forestbs,b\mathcal{F}_{\operatorname{forest}}\cup\mathcal{F}_{b}\cup\mathcal{F}_{s,b}. Similarly Dforest(χθ)=Db(χθ)=D_{\operatorname{forest}}(\chi_{\theta})=D_{b}(\chi_{\theta})=\infty, so it only remains to verify that

\[\deg_{\mathcal{H}_{s,b}}(\chi)<D_{s,b}(\chi_{\theta})=\left\lceil\frac{k^{ab}n^{2}}{2^{-s}n^{2}\delta^{2}}\right\rceil=\left\lceil 2^{s}\delta^{-2}k^{ab}\right\rceil.\]

Since degs,b(χ)deg(χ)\deg_{\mathcal{H}_{s,b}}(\chi)\leq\deg_{\mathcal{H}}(\chi), this bound follows from the first part of the claim, completing our proof. ∎

We now wish to construct many “good” copies of θa,b\theta_{a,b} in GG^{\prime} with xx and yy as the two high-degree vertices (i.e. more than the bound in Claim 4.5). To do this, we iteratively pick paths P1,,PaP_{1},\ldots,P_{a} in 𝒬\mathcal{Q} that end in yy, and we take our copy of θa,b\theta_{a,b} to be the union of these paths. We must ensure that the paths chosen are such that P1PaP_{1}\cup\cdots\cup P_{a} is (s,b)(s,b)-compatible, and in particular that they do not intersect each other and that no subset of their vertices is already saturated. For this claim, we recall that the paths of θa,b\theta_{a,b} are denoted by uw1jwb1jvuw_{1}^{j}\cdots w_{b-1}^{j}v.

Claim 4.6.

Let 1ja1\leq j\leq a, and let P1,,Pj1P_{1},\dots,P_{j-1} be a collection of paths in 𝒬\mathcal{Q} ending in yy, and for each path PjP_{j^{\prime}}, write Pj=xz1jzb1jyP_{j^{\prime}}=xz_{1}^{j^{\prime}}\cdots z_{b-1}^{j^{\prime}}y. Suppose that the set

\[\chi:=\{(u,x),(v,y)\}\cup\bigcup_{j^{\prime}<j}\left\{(w_{1}^{j^{\prime}},z_{1}^{j^{\prime}}),\dots,(w_{b-1}^{j^{\prime}},z_{b-1}^{j^{\prime}})\right\}\]

is (s,b)(s,b)-compatible. Then there are at least ε42b222s/3kb\varepsilon 4^{-2b^{2}}\cdot 2^{2s/3}k^{b} choices of a path Pj=xz1jzb1jyP_{j}=xz_{1}^{j}\cdots z_{b-1}^{j}y in 𝒬\mathcal{Q} so that

\[\chi_{j}:=\chi\cup\left\{(w_{1}^{j},z_{1}^{j}),\dots,(w_{b-1}^{j},z_{b-1}^{j})\right\}\]

is (s,b)(s,b)-compatible.

Figure 4: Given paths P1P_{1} and P2P_{2}, the algorithm next picks a path P3P_{3} from yy to xx in 𝒬\mathcal{Q} while maintaining that the relevant sets are compatible.
Proof.

Let 𝒬(y)\mathcal{Q}(y) denote the set of paths in 𝒬\mathcal{Q} ending in yy. Since yBy\in B^{\prime}, we have

\[|\mathcal{Q}(y)|\,\stackrel{(8)}{\geq}\,\varepsilon 4^{-b}\cdot 2^{(1-1/b)r^{\prime}}\ell^{b}\,\stackrel{(4)}{\geq}\,\varepsilon 4^{-b}\cdot 4^{-b^{2}}\cdot 2^{(1-1/b)r^{\prime}+br}\,k^{b}\geq 2\varepsilon 4^{-2b^{2}}\cdot 2^{2s/3}k^{b}, \tag{9}\]

where this last step used b3b\geq 3 and s=2r+rs=2r+r^{\prime}.
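The last step of (9) rests on two elementary facts: the exponent of $2$ satisfies $(1-1/b)r^{\prime}+br\geq 2s/3$, and the constants satisfy $4^{-b}\cdot 4^{-b^{2}}\geq 2\cdot 4^{-2b^{2}}$. The sketch below, whose function name is hypothetical, checks both with exact rational arithmetic over a small range of parameters; it is a sanity check and not a substitute for the one-line proof.

```python
from fractions import Fraction

def exponents_ok(b, r, r_prime):
    """Check the two elementary inequalities behind the last step of (9)."""
    s = 2 * r + r_prime
    # Exponent of 2: (1 - 1/b) r' + b r >= 2s/3.
    power_of_two = (1 - Fraction(1, b)) * r_prime + b * r >= Fraction(2 * s, 3)
    # 4^{-b} * 4^{-b^2} >= 2 * 4^{-2 b^2}  <=>  2 * 4^{b + b^2} <= 4^{2 b^2}.
    constant = 2 * 4 ** (b + b * b) <= 4 ** (2 * b * b)
    return power_of_two and constant

all_ok = all(exponents_ok(b, r, rp)
             for b in range(3, 8) for r in range(10) for rp in range(10))
```

The hypothesis $b\geq 3$ is genuinely used here: for $b=2$, $r=0$, $r^{\prime}=1$ the exponent inequality fails, since $1/2<2/3$.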

Our goal now is to show that among these paths, there are few “bad choices” that must be avoided, i.e. few choices so that χj\chi_{j} is not (s,b)(s,b)-compatible. To show this, 2.1(f) will be crucial, which we recall says that for any non-empty set SS of vertices in GG^{\prime} not containing x,yx,y, there are at most ε1(b1|S|)b/(b1)\varepsilon^{-1}\ell^{(b-1-|S|)b/(b-1)} paths in 𝒬(y)\mathcal{Q}(y) which contain SS.

We first show that almost all choices of PjP_{j} make χj\chi_{j} valid. Because χ\chi was already valid, this is equivalent to choosing a PjP_{j} which contains none of the vertices of j<jPj\bigcup_{j^{\prime}<j}P_{j^{\prime}} other than xx and yy. By 2.1(f) with |S|=1|S|=1, the number of paths in 𝒬\mathcal{Q} containing a given vertex from j<jPj\bigcup_{j^{\prime}<j}P_{j^{\prime}} is at most ε1(b2)b/(b1)\varepsilon^{-1}\ell^{(b-2)b/(b-1)}. Therefore, the number of Pj𝒬(y)P_{j}\in\mathcal{Q}(y) containing any of the vertices in j<jPj\bigcup_{j^{\prime}<j}P_{j^{\prime}} (other than xx and yy) is at most a constant times (b2)b/(b1)\ell^{(b-2)b/(b-1)}, which for k0k_{0} sufficiently large (which makes \ell sufficiently large) will be at most 14|𝒬(y)|=Ω(b)\frac{1}{4}|\mathcal{Q}(y)|=\Omega(\ell^{b}). Thus at least three quarters of the paths Pj𝒬(y)P_{j}\in\mathcal{Q}(y) will make χj\chi_{j} valid.

To show that χj\chi_{j} is (s,b)(s,b)-compatible for most choices of paths in 𝒬(y)\mathcal{Q}(y), it remains to bound the number of “bad” sets γ\gamma that must be avoided when choosing PjP_{j}. To this end, for each integer 1pb11\leq p\leq b-1 define

\[\widetilde{\mathcal{J}}_{p}:=\bigcup_{\substack{\chi^{\prime}\subseteq\chi,\\ \nu\subseteq\{w_{1}^{j},\dots,w_{b-1}^{j}\}:\ |\nu|=p}}\mathcal{J}_{s,b}(\chi^{\prime};\nu)\ \ \cup\ \bigcup_{\substack{\chi^{\prime}\subseteq\chi,\\ \nu\subseteq\{w_{1}^{j},\dots,w_{b-1}^{j}\}:\ |\nu|=p}}\mathcal{J}_{\operatorname{forest}}(\chi^{\prime};\nu).\]

Note that χj\chi_{j} is (s,b)(s,b)-compatible if and only if it is valid and does not contain any set γ𝒥~p\gamma\in\widetilde{\mathcal{J}}_{p} for any value of pp (here we implicitly use that (b,Db)=\mathcal{F}(\mathcal{H}_{b},D_{b})=\emptyset since DbD_{b} is always equal to \infty, so we can ignore 𝒥b(χ;ν)\mathcal{J}_{b}(\chi^{\prime};\nu) when checking for compatibility).

We may use 3.2 to bound the size of each link set above: consider χχ\chi^{\prime}\subseteq\chi and ν{w1j,,wb1j}\nu\subseteq\{w_{1}^{j},\dots,w_{b-1}^{j}\} with |ν|=p|\nu|=p. If (u,x)χ(u,x)\not\in\chi^{\prime} or (v,y)χ(v,y)\not\in\chi^{\prime}, then 𝒥s,b(χ;ν)=\mathcal{J}_{s,b}(\chi^{\prime};\nu)=\emptyset by 3.2. Otherwise, 3.2 gives

\[\left|\mathcal{J}_{s,b}(\chi^{\prime};\nu)\right|\leq 2^{v(\theta_{a,b})+1}\delta^{p}\left(2^{2s/3}k^{b}\right)^{p/(b-1)}. \tag{10}\]

Similarly, if χθν\chi_{\theta}^{\prime}\cup\nu does not induce a forest on at least one edge, then 𝒥forest(χ;ν)=\mathcal{J}_{{\operatorname{forest}}}(\chi^{\prime};\nu)=\emptyset. If both χθν\chi_{\theta}^{\prime}\cup\nu and χθ\chi_{\theta}^{\prime} induce a forest on at least one edge, then 3.2 gives

\[|\mathcal{J}_{\operatorname{forest}}(\chi^{\prime};\nu)|\leq 2^{v(\theta_{a,b})+1}\delta^{p}k^{pb/(b-1)}. \tag{11}\]

It remains to deal with the case that χθ\chi_{\theta}^{\prime} does not induce a forest on at least one edge but χθν\chi_{\theta}^{\prime}\cup\nu does. Analogous to the proof of Claim 4.3, in this setting 3.2 gives no meaningful bound, but we can show that for any choice of PjP_{j} in 𝒬(y)\mathcal{Q}(y), no element of the link set 𝒥forest(χ;ν)\mathcal{J}_{{\operatorname{forest}}}(\chi^{\prime};\nu) can appear in χj\chi_{j} (and therefore there are no “bad options” that must be avoided in choosing PjP_{j} from 𝒬\mathcal{Q}). To this end, consider any set γ𝒥forest(χ;ν)\gamma\in\mathcal{J}_{{\operatorname{forest}}}(\chi^{\prime};\nu). By definition, this means γθ=ν\gamma_{\theta}=\nu and

\[\deg_{\mathcal{H}}(\chi^{\prime}\cup\gamma)\geq D_{\operatorname{forest}}(\chi_{\theta}^{\prime}\cup\nu). \tag{12}\]

Since $\chi_{\theta}^{\prime}$ does not induce any edges but $\chi^{\prime}_{\theta}\cup\nu$ does, all of these induced edges of $\theta_{a,b}$ must be contained in the path $W_{j}:=uw_{1}^{j}\cdots w_{b-1}^{j}v$. Since $D_{\operatorname{forest}}$ only depends on the number of edges induced, we have

\[D_{\operatorname{forest}}(\chi_{\theta}^{\prime}\cup\nu)=D_{\operatorname{forest}}\left((\chi_{\theta}^{\prime}\cup\nu)\cap W_{j}\right). \tag{13}\]

On the other hand, if for each $P_{j}\in\mathcal{Q}(y)$ we let

\[(W_{j},P_{j}):=\left\{(u,x),(w_{1}^{j},z_{1}^{j}),\dots,(w_{b-1}^{j},z_{b-1}^{j}),(v,y)\right\},\]

then we have

\[\deg_{\mathcal{H}}\big((\chi^{\prime}\cup\gamma)\cap(W_{j},P_{j})\big)\geq\deg_{\mathcal{H}}(\chi^{\prime}\cup\gamma), \tag{14}\]

since taking smaller sets can only cause $\deg_{\mathcal{H}}$ to increase. Putting this all together, we obtain

\[\deg_{\mathcal{H}}\big((\chi^{\prime}\cup\gamma)\cap(W_{j},P_{j})\big)\,\stackrel{(14),(12),(13)}{\geq}\,D_{\operatorname{forest}}\left((\chi_{\theta}^{\prime}\cup\nu)\cap W_{j}\right). \tag{15}\]

Notice that $(\chi_{\theta}^{\prime}\cup\nu)\cap W_{j}=((\chi^{\prime}\cup\gamma)\cap(W_{j},P_{j}))_{\theta}$. Thus (15) says that $(\chi^{\prime}\cup\gamma)\cap(W_{j},P_{j})\in\mathcal{F}_{\operatorname{forest}}$. Using the notation introduced just before Claim 4.4, this means that the subgraph $H^{\prime}:=H_{(\chi^{\prime}\cup\gamma)\cap(W_{j},P_{j})}\subseteq P_{j}$ is an element of $\mathcal{F}^{\prime}_{\operatorname{forest}}$. Recall that by Proposition 2.1, no path in $\mathcal{Q}$ contains an element of $\mathcal{F}^{\prime}_{\operatorname{forest}}$ as a subgraph. Since $H^{\prime}$ is a subgraph of $P_{j}\in\mathcal{Q}$, we conclude that there is no choice of $P_{j}\in\mathcal{Q}(y)$ such that $\chi_{j}$ contains $\gamma\in\mathcal{J}_{\operatorname{forest}}(\chi^{\prime};\nu)$ in this case.

Putting it all together, and writing 𝒫possible:={γ(Wj,Pj):Pj𝒬(y)}\mathcal{P}_{\text{possible}}:=\{\gamma\subseteq(W_{j},P_{j}):P_{j}\in\mathcal{Q}(y)\}, we obtain

\[\begin{aligned}
|\widetilde{\mathcal{J}}_{p}\cap\mathcal{P}_{\text{possible}}| &\stackrel{(10),(11)}{\leq} 2^{v(\theta_{a,b})+1}\delta^{p}k^{pb/(b-1)}\left[\left(2^{2s/3}\right)^{p/(b-1)}+1\right]\cdot\left|\{(\chi^{\prime},\nu)\ :\ \chi^{\prime}\subseteq\chi,\ \nu\subseteq\{w_{1}^{j},\dots,w_{b-1}^{j}\},\ |\nu|=p\}\right| \\
&\leq 2^{2v(\theta_{a,b})+2b}\cdot\delta^{p}\cdot\left(2^{2s/3}k^{b}\right)^{p/(b-1)}.
\end{aligned} \tag{16}\]

By 2.1(f), the number of PjP_{j} which contain the projection γG\gamma_{G} of a given γ\gamma (of size pp) is at most ε1(b1p)b/(b1)\varepsilon^{-1}\ell^{(b-1-p)b/(b-1)}. So combining this with (16), the number of PjP_{j} which contain γG\gamma_{G} for any γ𝒥~p𝒫possible\gamma\in\widetilde{\mathcal{J}}_{p}\cap\mathcal{P}_{\text{possible}} is at most

\[\left(\varepsilon^{-1}\ell^{(b-1-p)b/(b-1)}\right)\cdot 2^{2v(\theta_{a,b})+2b}\cdot\delta^{p}\cdot\left(2^{2s/3}k^{b}\right)^{p/(b-1)}.\]

Summing over all values of pp from 1 to b1b-1 and simplifying slightly, the number of Pj𝒬(y)P_{j}\in\mathcal{Q}(y) that contain a “bad” set γ\gamma of any size is at most

\[\max_{1\leq p\leq b-1}C\ell^{b}\cdot\left(2^{2s/3}(k/\ell)^{b}\right)^{p/(b-1)}, \tag{17}\]

where C=(ε1δ(b1)22v(θa,b)+2b).C=\left(\varepsilon^{-1}\delta(b-1)2^{2v(\theta_{a,b})+2b}\right). By (4), and since s=2r+rs=2r+r^{\prime} and b3b\geq 3, we have

\[2^{2s/3}(k/\ell)^{b}\leq 2^{2s/3}(4^{b}2^{-r})^{b}\leq 4^{b^{2}}2^{2r^{\prime}/3}.\]

This gives

\[(17)\leq C^{\prime}2^{2r^{\prime}/3}\ell^{b},\]

where C=(ε1δ(b1)22v(θa,b)+2b+2b2)C^{\prime}=\left(\varepsilon^{-1}\delta(b-1)2^{2v(\theta_{a,b})+2b+2b^{2}}\right). By taking δ\delta sufficiently small, we can assume C14ε4bC^{\prime}\leq\frac{1}{4}\varepsilon 4^{-b}. So, after taking into account that at most one quarter of the choices Pj𝒬(y)P_{j}\in\mathcal{Q}(y) have χj\chi_{j} not valid, we find that the number of choices for Pj𝒬(y)P_{j}\in\mathcal{Q}(y) such that χj\chi_{j} is (s,b)(s,b)-compatible is at least

\[\frac{3}{4}|\mathcal{Q}(y)|-C^{\prime}2^{2r^{\prime}/3}\ell^{b}\geq\frac{1}{2}|\mathcal{Q}(y)|\geq\varepsilon 4^{-2b^{2}}\cdot 2^{2s/3}k^{b},\]

where both inequalities used (9). This gives the desired result. ∎

With Claim 4.6 established, we are now nearly ready to finish Case 1.

Claim 4.7.

The number of (s,b)(s,b)-compatible sets of size v(θa,b)v(\theta_{a,b}) containing (u,x)(u,x) and (v,y)(v,y) is at least

\[(\varepsilon 4^{-2b^{2}})^{a}2^{s}k^{ab}.\]
Proof.

This result will follow directly by an iterative application of Claim 4.6. As a base step, we take χ0={(u,x),(v,y)}\chi_{0}=\{(u,x),(v,y)\}, which is (s,b)(s,b)-compatible by Claim 4.5. With this we may apply Claim 4.6 to obtain at least ε42b222s/3kb\varepsilon 4^{-2b^{2}}\cdot 2^{2s/3}k^{b} choices of a path P1P_{1} in GG^{\prime} such that the corresponding set χ1\chi_{1} is (s,b)(s,b)-compatible. Iterating up to j=aj=a, we obtain at least

\[\left(\varepsilon 4^{-2b^{2}}\cdot 2^{2s/3}k^{b}\right)^{a}\geq(\varepsilon 4^{-2b^{2}})^{a}2^{s}k^{ab}\]

distinct collections P1,,PaP_{1},\dots,P_{a} such that the corresponding sets χa\chi_{a} are (s,b)(s,b)-compatible. This completes the proof. ∎
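The iteration in Claims 4.6 and 4.7 can be pictured with the following toy sketch. It is not the counting argument itself: it greedily builds a single copy rather than counting all of them, and the predicate `is_compatible` is a hypothetical stand-in for the codegree conditions, which are not modelled here. Paths are assumed to be given as vertex tuples with common endpoints $x,y$ and at least one interior vertex.

```python
def extend_theta(paths, a, is_compatible=lambda chosen: True):
    """Greedily pick `a` paths that pairwise share only their endpoints.

    `paths` is a list of vertex tuples (x, z_1, ..., z_{b-1}, y).
    `is_compatible` (hypothetical) is called on the tuple of paths chosen so
    far, including the candidate, and may veto the candidate.
    Returns the list of chosen paths, or None if the greedy choice gets stuck.
    """
    chosen, used = [], set()
    for _ in range(a):
        for path in paths:
            interior = set(path[1:-1])
            if interior & used:
                continue  # would intersect an earlier path away from x and y
            if not is_compatible(tuple(chosen) + (path,)):
                continue  # vetoed by the stand-in compatibility check
            chosen.append(path)
            used |= interior
            break
        else:
            return None  # no admissible path left at this step
    return chosen
```

In the proof, Claim 4.6 guarantees that each step has many admissible paths, which is why the construction never gets stuck and why the choices multiply to the bound in Claim 4.7.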

Now we are ready to finish Case 1. By Claim 4.5, the number of hyperedges in s,b\mathcal{H}_{s,b} containing (u,x)(u,x) and (v,y)(v,y) is at most

\[\delta\varepsilon^{-1}2^{s}k^{ab}.\]

By Claim 4.7, the number of (s,b)(s,b)-compatible sets of size v(θa,b)v(\theta_{a,b}) containing (u,x)(u,x) and (v,y)(v,y) is at least

\[(\varepsilon 4^{-2b^{2}})^{a}2^{s}k^{ab}.\]

Therefore, provided δ\delta is sufficiently small, there must be at least one (s,b)(s,b)-compatible set hh of size v(θa,b)v(\theta_{a,b}) that is not already in s,b\mathcal{H}_{s,b}. This hh may be added to s,b\mathcal{H}_{s,b}, completing the proof of 4.2 when t=bt=b.

4.2.3 Case 2: t<bt<b and btb-t Even

Parts of this proof are nearly identical to the previous case, and as such we omit some of the redundant details.

Recall that rr is the unique integer such that 2rnm<2r+1n2^{-r}n\leq m<2^{-r+1}n. Our goal in this case is to show that we can add a new theta graph to r,t\mathcal{H}_{r,t}. Let yBty\in B_{t} be such that (v,y)(v,y) is in as few hyperedges in \mathcal{H} with (u,x)(u,x) as possible.

Claim 4.8.

The set χ={(u,x),(v,y)}\chi=\{(u,x),(v,y)\} satisfies

\[\deg_{\mathcal{H}}(\chi)\leq\frac{\delta\varepsilon^{-2}4^{2b}k^{ab}n^{2}}{2^{-2r}k^{(2b-2t+1)/(b-1)}n^{(2t-1)/b}}\]

and is (r,t)(r,t)-compatible if δ\delta is sufficiently small.

Proof.

Recall that xx is such that (u,x)(u,x) is contained in the fewest number of hyperedges in \mathcal{H} among all vertices in the set XX. This together with the definition of yy implies that the number of hyperedges containing both (u,x)(u,x) and (v,y)(v,y) is at most

\[\frac{|\mathcal{H}|}{|X|\cdot|B_{t}|}\leq\frac{\delta k^{ab}n^{2}}{\varepsilon^{2}\ell^{(2b-2t+1)/(b-1)}m^{(2t-1)/b}},\]

where this last step used ||δkabn2|\mathcal{H}|\leq\delta k^{ab}n^{2} and that |Bt|ε(bt+1)/(b1)m(t1)/b|B_{t}|\geq\varepsilon\ell^{(b-t+1)/(b-1)}m^{(t-1)/b} and |X|ε(bt)/(b1)mt/b|X|\geq\varepsilon\ell^{(b-t)/(b-1)}m^{t/b} by Proposition 2.1(b) and (h). Using m2rnm\geq 2^{-r}n and 4b2rk\ell\geq 4^{-b}2^{r}k from (3) and (4) gives the first result.

As in the t=bt=b case, we have xBtx\notin B_{t} by Proposition 2.1(e), so yxy\neq x and the set χ\chi is valid. Any χχ\chi^{\prime}\subsetneq\chi trivially fails to be in foresttr,t\mathcal{F}_{\operatorname{forest}}\cup\mathcal{F}_{t}\cup\mathcal{F}_{r,t}, and to show χ\chi is not in this set it suffices to show

\[\deg_{\mathcal{H}_{r,t}}(\chi)<D_{r,t}(\chi_{\theta})=\left\lceil\frac{\delta^{-2}k^{ab}n^{2}}{2^{-2r}k^{(2b-2t+1)/(b-1)}n^{(2t-1)/b}}\right\rceil,\]

and this follows by the first result. We conclude that $\chi$ is $(r,t)$-compatible. ∎

Now that we have selected our two high degree vertices x,yx,y of our theta graph, we build the rest of the theta graph as follows. First, we work our way out from yy by selecting neighbors zb1jBt1z_{b-1}^{j}\in B_{t-1} of yy, then neighbors zb2jBtz_{b-2}^{j}\in B_{t} of each zb1jz_{b-1}^{j}, and so on, until we have chosen vertices ztjBtz_{t}^{j}\in B_{t}. Then, once we have chosen the vertices ztjz_{t}^{j}, we select paths from the set 𝒬\mathcal{Q} connecting the vertices ztjz_{t}^{j} to xx.
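The alternation between the two layers in this first phase can be tabulated as in the sketch below. The function name and the string labels are hypothetical; the rule itself (odd $i-t$ gives $B_{t-1}$, even $i-t$ gives $B_{t}$) is the one described above, under the Case 2 assumptions $2\leq t<b$ and $b-t$ even.

```python
def layer_schedule(b, t):
    """Map each index i = b-1, ..., t to the layer that z_i^j is chosen from."""
    if not (2 <= t < b and (b - t) % 2 == 0):
        raise ValueError("Case 2 assumes 2 <= t < b with b - t even")
    # Walk outward from y = z_b^j: indices with i - t odd land in B_{t-1},
    # indices with i - t even land in B_t.
    return {i: ("B_{t-1}" if (i - t) % 2 == 1 else "B_t")
            for i in range(b - 1, t - 1, -1)}
```

For example, with $b=6$ and $t=2$ the vertices $z_{5}^{j},z_{4}^{j},z_{3}^{j},z_{2}^{j}$ are taken from $B_{t-1},B_{t},B_{t-1},B_{t}$ respectively, so the walk ends in $B_{t}$ as required before the paths of $\mathcal{Q}$ are attached.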

To do the first part, we use the following claim. Here we recall that the paths of θa,b\theta_{a,b} are denoted uw1jwb1jvuw_{1}^{j}\cdots w_{b-1}^{j}v, and for this claim we adopt the convention that wbj:=vw_{b}^{j}:=v and zbj:=yz_{b}^{j}:=y. We also recall that FtV(θa,b)F_{t}\subseteq V(\theta_{a,b}) is defined to be the set of wijw_{i}^{j} with ti<bt\leq i<b and iti-t even. In particular, wb1jFtw_{b-1}^{j}\notin F_{t} when btb-t is even.

Claim 4.9.

Let tib1t\leq i\leq b-1 and 1ja1\leq j\leq a be integers, and let χ\chi be an (r,t)(r,t)-compatible set consisting of the pairs (u,x),(v,y)(u,x),(v,y), and (wij,zij)(w_{i^{\prime}}^{j^{\prime}},z_{i^{\prime}}^{j^{\prime}}) for all i,ji^{\prime},j^{\prime} with either i>ii^{\prime}>i or with i=ii^{\prime}=i and j<jj^{\prime}<j.

  • If iti-t is odd and zi+1jBtz_{i+1}^{j}\in B_{t}, then there exist at least 12εb/(b1)\frac{1}{2}\varepsilon\ell^{b/(b-1)} choices zijBt1NG(zi+1j)z_{i}^{j}\in B_{t-1}\cap N_{G^{\prime}}(z_{i+1}^{j}) such that χ{(wij,zij)}\chi\cup\{(w_{i}^{j},z_{i}^{j})\} is (r,t)(r,t)-compatible.

  • If iti-t is even and zi+1jBt1z_{i+1}^{j}\in B_{t-1}, then there exist at least 12εm1/b\frac{1}{2}\varepsilon\ell m^{1/b} choices zijBtNG(zi+1j)z_{i}^{j}\in B_{t}\cap N_{G^{\prime}}(z_{i+1}^{j}) such that χ{(wij,zij)}\chi\cup\{(w_{i}^{j},z_{i}^{j})\} is (r,t)(r,t)-compatible.

Figure 5: When a=3a=3, after picking zb11,zb12,zb13,zb21,zb22z_{b-1}^{1},z_{b-1}^{2},z_{b-1}^{3},z_{b-2}^{1},z_{b-2}^{2} (in that order), the algorithm next selects zb23z_{b-2}^{3}.
Proof.

Observe that if zijNG(zi+1j)z_{i}^{j}\in N_{G^{\prime}}(z_{i+1}^{j}) is a vertex such that χ{(wij,zij)}\chi\cup\{(w_{i}^{j},z_{i}^{j})\} is not (r,t)(r,t)-compatible, then either zijχGz_{i}^{j}\in\chi_{G} (which can only hold for O(1)O(1) vertices), or there exists some χχ\chi^{\prime}\subseteq\chi with

\[\{(w_{i}^{j},z_{i}^{j})\}\in\mathcal{J}_{\operatorname{forest}}(\chi^{\prime};w_{i}^{j})\cup\mathcal{J}_{r,t}(\chi^{\prime};w_{i}^{j})\cup\mathcal{J}_{t}(\chi^{\prime};w_{i}^{j}).\]

Thus it suffices to show that each of these sets is small for each $\chi^{\prime}\subseteq\chi$.

First consider $\mathcal{J}_{\operatorname{forest}}(\chi^{\prime};w_{i}^{j})$. If $\chi^{\prime}_{\theta}\cup\{w_{i}^{j}\}$ induces at most one edge, then this link set is empty by Claim 4.3. If this is not the case, then $\chi^{\prime}_{\theta}$ must induce at least one edge, since $w_{i}^{j}$ has at most one neighbor in $\chi^{\prime}_{\theta}$ (this implicitly uses $i\geq t\geq 2$, as otherwise $w_{i}^{j}$ would also be adjacent to $u$). By 3.2, we find

\[|\mathcal{J}_{\operatorname{forest}}(\chi^{\prime};w_{i}^{j})|\leq 2^{v(\theta_{a,b})+1}\delta k^{b/(b-1)}. \tag{18}\]

Next consider $\mathcal{J}_{r,t}(\chi^{\prime};w_{i}^{j})$, which we recall is based on the codegree function defined in Definition 6. If $\{u,v\}\not\subseteq\chi^{\prime}_{\theta}$, then this link set is empty by 3.2, so we may assume $\{u,v\}\subseteq\chi^{\prime}_{\theta}$. Then 3.2 gives

\[|\mathcal{J}_{r,t}(\chi^{\prime};w_{i}^{j})|\leq 2^{v(\theta_{a,b})+1}\delta 2^{2r/3}k^{b/(b-1)}\quad\text{if }i-t\text{ is odd},\]

since if $i-t$ is odd, adding $w_{i}^{j}\notin F_{t}$ to $\chi^{\prime}_{\theta}$ keeps the parameter $f=|\nu\cap F_{t}|$ in Definition 6 the same while increasing $|\nu|$. Similarly,

\[|\mathcal{J}_{r,t}(\chi^{\prime};w_{i}^{j})|\leq 2^{v(\theta_{a,b})+1}\delta 2^{2r/3}kn^{1/b}\quad\text{if }i-t\text{ is even},\]

since $f$ and $|\nu|$ both increase by 1.

Finally consider $\mathcal{J}_{t}(\chi';w_{i}^{j})$, which we recall is based on the codegree function defined in Definition 7. Again we may assume $\{u,v\}\subseteq\chi'_{\theta}$. If $w_{b-1}^{j'}\in\chi'_{\theta}$ for some $j'$, then the argument and final bound are exactly the same as in the case for $\mathcal{J}_{r,t}$ (with $g$ taking the role of $f$ in exactly the same way as before). We next consider the subcase $w_{b-1}^{j'}\notin\chi'_{\theta}$ for all $j'<j$. If $i\neq b-1$, then $\chi'_{\theta}\cup\{w_{i}^{j}\}$ contains no vertex of the form $w_{b-1}^{j'}$, so $D_{t}(\chi'_{\theta}\cup\{w_{i}^{j}\})=\infty$, and hence the link set is empty by 3.1. If $i=b-1$, then $\chi'_{\theta}\subseteq\{u,v,w_{b-1}^{1},\ldots,w_{b-1}^{a}\}$ by the hypothesis of the claim, so our assumption $w_{b-1}^{j'}\notin\chi'_{\theta}$ implies $\chi'=\{(u,x),(v,y)\}$. Thus

\[\deg_{\mathcal{H}_{t}}(\chi'\cup\{(w_{b-1}^{j},z_{b-1}^{j})\})\leq\deg_{\mathcal{H}}(\{(v,y),(w_{b-1}^{j},z_{b-1}^{j})\})<D_{\operatorname{forest}}(\{v,w_{b-1}^{j}\})=D_{t}(\chi'_{\theta}\cup\{w_{b-1}^{j}\}),\]

where the first inequality used that we are looking at the codegree of a smaller set in a larger hypergraph, the second inequality used (2) (i.e. that every edge in $G'$ has codegree smaller than that given by $D_{\operatorname{forest}}$), and the equality used $|\chi'_{\theta}\cup\{w_{b-1}^{j}\}|=3$ and $g=|(\chi'_{\theta}\cup\{w_{b-1}^{j}\})\cap F_{t}|=0$ in the definition of $D_{t}$ for $b-t$ even. This implies $\mathcal{J}_{t}(\chi';w_{b-1}^{j})=\emptyset$.

By summing up the sizes of all of these sets over all possible choices of $\chi'\subseteq\chi$ (as well as the number of choices $z_{i}^{j}\in\chi_{G}$), we find when $i-t$ is odd that the number of $z_{i}^{j}$ which cannot be selected is at most

\[O_{a,b}(\delta 2^{2r/3}k^{b/(b-1)})=O_{a,b}(\delta\ell^{b/(b-1)}),\]

with the last step using $\ell\geq 4^{-b}2^{r}k$. By Proposition 2.1(d), $z_{i+1}^{j}\in B_{t}$ has at least $\varepsilon\ell^{b/(b-1)}$ neighbors in $B_{t-1}$, and for $\delta$ sufficiently small this is at least twice the number of forbidden choices. Essentially the same reasoning holds for the $i-t$ even case after noting $k^{b/(b-1)}\leq kn^{1/b}$ when applying (18). We conclude the result. ∎

By starting with the two high-degree vertices $(u,x)$ and $(v,y)$, and iteratively applying Claim 4.9, we can find many $(r,t)$-compatible sets $\chi$ with $\chi_{\theta}=\{u,v\}\cup\bigcup_{i\geq t}\{w_{i}^{1},\ldots,w_{i}^{a}\}$. To get the remaining vertices corresponding to $w_{i}^{j}$ with $i<t$, we use the same strategy as in the $t=b$ case of choosing paths from $\mathcal{Q}$.

Claim 4.10.

Let $P_{1},\dots,P_{j-1}$ be a collection of paths in $\mathcal{Q}$ ending in $B_{t}$, and for each path $P_{j'}$, write $P_{j'}=xz_{1}^{j'}\cdots z_{t}^{j'}$. Suppose that the set

\[\chi:=\{(u,x),(v,y)\}\cup\bigcup_{i\geq t}\{(w_{i}^{1},z_{i}^{1}),\ldots,(w_{i}^{a},z_{i}^{a})\}\cup\bigcup_{j'<j}\left\{(w_{1}^{j'},z_{1}^{j'}),\dots,(w_{t}^{j'},z_{t}^{j'})\right\}\]

is $(r,t)$-compatible. Then there are at least $\frac{1}{2}\varepsilon\ell^{(t-1)b/(b-1)}$ choices of a path $P_{j}=xz_{1}^{j}\cdots z_{t}^{j}$ in $\mathcal{Q}$ so that

\[\chi_{j}:=\chi\cup\left\{(w_{1}^{j},z_{1}^{j}),\dots,(w_{t}^{j},z_{t}^{j})\right\}\]

is $(r,t)$-compatible.

Figure 6: After picking all of the vertices $z_{i}^{j}$ with $i\geq t$, the algorithm next picks a path $P_{1}$ from $z_{t}^{1}$ to $x$, then a path $P_{2}$ from $z_{t}^{2}$ to $x$, and so on.
Sketch of Proof.

The argument is almost identical to that of Claim 4.6, so we only sketch the details (with our notation defined analogously as before). By Proposition 2.1(c), there are at least $\varepsilon\ell^{(t-1)b/(b-1)}$ paths in $\mathcal{Q}$ from $z_{t}^{j}$ to $x$. Using 2.1(f) we find that very few of these paths contain any of the other vertices of $\chi_{G}$ besides $x$ and $y$.

By using 3.2, we find that each of the sets $\mathcal{J}_{r,t}(\chi';\nu)$, $\mathcal{J}_{t}(\chi';\nu)$, and $\mathcal{J}_{\operatorname{forest}}(\chi';\nu)$, after intersecting with $\mathcal{P}_{\text{possible}}$, has size $O((\delta 2^{2r/3}k^{b/(b-1)})^{p})$ whenever $\chi'\subseteq\chi$ and $\nu\subseteq\{w_{1}^{j},\ldots,w_{t-1}^{j}\}$ with $|\nu|=p$ (here we use that $\nu\cap F_{t}=\emptyset$ for any such $\nu$, so $f,g$ in the definitions of $D_{r,t},D_{t}$ do not change when going from $\chi'_{\theta}$ to $\chi'_{\theta}\cup\nu$). From here essentially the same computations as before go through. ∎

Combining the previous three claims, we find that our algorithm produces a large number of theta graphs.

Claim 4.11.

The number of $(r,t)$-compatible sets of size $v(\theta_{a,b})$ containing $(u,x)$ and $(v,y)$ is at least

\[\Omega\left(2^{2r}k^{ab-\frac{2b-2t+1}{b-1}}n^{2-\frac{2t-1}{b}}\right).\]
Proof.

The result follows by iteratively applying Claims 4.9 and 4.10. Starting with $\chi_{0}:=\{(u,x),(v,y)\}$, which is $(r,t)$-compatible by Claim 4.8, we repeatedly apply Claim 4.9 to build paths $z_{t}^{j}\cdots z_{b-1}^{j}y$ (for $1\leq j\leq a$); we then finish by repeatedly applying Claim 4.10 to select paths $xz_{1}^{j}\cdots z_{t}^{j}$. In total, we find that the number of $(r,t)$-compatible sets $h$ of size $v(\theta_{a,b})$ with $(u,x),(v,y)\in h$ is at least

\[\left(\left(\frac{1}{2}\varepsilon\ell m^{1/b}\right)^{(b-t)/2}\cdot\left(\frac{1}{2}\varepsilon\ell^{b/(b-1)}\right)^{(b-t)/2}\cdot\left(\frac{1}{2}\varepsilon\ell^{(t-1)b/(b-1)}\right)\right)^{a}, \tag{19}\]

where the first two terms use that for each path $xz_{1}^{j}\cdots z_{b-1}^{j}y$ we get a factor of $\frac{1}{2}\varepsilon\ell m^{1/b}$ for each vertex in position $i\in\{t,t+2,\ldots,b-2\}$ and a factor of $\frac{1}{2}\varepsilon\ell^{b/(b-1)}$ for each $i\in\{t+1,t+3,\ldots,b-1\}$ by Claim 4.9, and the last term uses Claim 4.10. The expression above is equal to some positive constant depending only on $a,b,\varepsilon$ times

\begin{align*}
\ell^{ab-\frac{a(b-t)}{2(b-1)}}m^{\frac{a(b-t)}{2b}} &=\ell^{ab-\frac{a(b-t)}{2(b-1)}}m^{\frac{(a-4)(b-t)-2}{2b}}\cdot m^{\frac{4b-4t+2}{2b}}\\
&\geq\ell^{ab-\frac{a(b-t)}{2(b-1)}+\frac{(a-4)(b-t)-2}{2(b-1)}}\cdot m^{\frac{2b-2t+1}{b}}=\ell^{ab-\frac{2b-2t+1}{b-1}}\cdot m^{2-\frac{2t-1}{b}},
\end{align*}

where the inequality used (5), i.e. $m^{1/b}\geq\ell^{1/(b-1)}$, and implicitly that $a\geq 6$, so that the exponent $\frac{(a-4)(b-t)-2}{2b}$ of $m$ to which (5) is applied is positive. Finally, using $\ell=\Omega(2^{r}k)$ and $m=\Omega(2^{-r}n)$ crudely gives

\[\ell^{ab-\frac{2b-2t+1}{b-1}}\cdot m^{2-\frac{2t-1}{b}}=\Omega\left(2^{(ab-2)r}k^{ab-\frac{2b-2t+1}{b-1}}\cdot 2^{-(2-\frac{2t-1}{b})r}n^{2-\frac{2t-1}{b}}\right)=\Omega\left(2^{2r}k^{ab-\frac{2b-2t+1}{b-1}}n^{2-\frac{2t-1}{b}}\right),\]

where this last step used $ab\geq 6$. ∎
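The exponent bookkeeping in this proof can be double-checked mechanically. The following is a minimal sketch in Python using exact rational arithmetic; the function names `exponents_19` and `exponents_after_trade` are ours, chosen only for illustration. It tracks just the exponents of $\ell$ and $m$ (constants such as $\frac{1}{2}\varepsilon$ are ignored) and assumes $b-t$ is even, as in Case 2.

```python
from fractions import Fraction

def exponents_19(a, b, t):
    """Exponents of ell and m in the product (19); assumes b - t is even."""
    half = Fraction(b - t, 2)                   # number of positions of each parity
    ell = a * (half                             # from (ell * m^{1/b})^{(b-t)/2}
               + half * Fraction(b, b - 1)      # from (ell^{b/(b-1)})^{(b-t)/2}
               + Fraction((t - 1) * b, b - 1))  # from ell^{(t-1)b/(b-1)}
    m = a * half * Fraction(1, b)
    return ell, m

def exponents_after_trade(a, b, t):
    """Apply m^{1/b} >= ell^{1/(b-1)} to the factor m^{((a-4)(b-t)-2)/(2b)}."""
    ell, m = exponents_19(a, b, t)
    x = (a - 4) * (b - t) - 2                   # the trade needs x >= 0
    assert x >= 0
    return ell + Fraction(x, 2 * (b - 1)), m - Fraction(x, 2 * b)

# Hypothetical sample parameters with b - t even.
a, b, t = 6, 4, 2
print(exponents_19(a, b, t))           # (Fraction(22, 1), Fraction(3, 2))
print(exponents_after_trade(a, b, t))  # (Fraction(67, 3), Fraction(5, 4))
```

For these sample values the second line agrees with $ab-\frac{2b-2t+1}{b-1}=\frac{67}{3}$ and $2-\frac{2t-1}{b}=\frac{5}{4}$.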

We are now ready to finish Case 2. If $\delta$ is sufficiently small in terms of $a,b,\varepsilon$, the number of theta graphs guaranteed by Claim 4.11 exceeds the codegree bound in Claim 4.8; thus there exists some $(r,t)$-valid set $h$ obtained through our algorithm which is not already a hyperedge of $\mathcal{H}$. Adding such an $h$ to $\mathcal{H}_{r,t}$ gives the result in this case.

4.2.4 Case 3: $t<b$ and $b-t$ Odd

This case is nearly identical to the previous one, and as such we only sketch the proof.

Again our goal is to add a new hyperedge to $\mathcal{H}_{r,t}$. To start, we pick $y\in B_{t-1}\setminus\{x\}$ such that $(v,y)$ is in as few hyperedges with $(u,x)$ as possible. Here we emphasize that, in the previous case, we picked $y\in B_{t}$ and hence immediately obtained $y\neq x$ (since each element of $B_{t}$ is the endpoint of a path with $x$), but here we have to be slightly more careful and explicitly enforce $y\neq x$. However, since no hyperedge of $\mathcal{H}$ contains both $(u,x)$ and $(v,x)$ (since every hyperedge is a valid set), and since $|B_{t-1}\setminus\{x\}|\geq\frac{1}{2}\varepsilon\ell^{(b-t+1)/(b-1)}m^{(t-1)/b}$ by Proposition 2.1(b), we find that $\deg_{\mathcal{H}}(\{(u,x),(v,y)\})$ is at most twice the bound from Claim 4.8, and the rest of the proof showing that this set is $(r,t)$-compatible goes through in exactly the same way as in Claim 4.8.

From here we apply Claim 4.9 exactly as written (since $w_{i}^{j}\in F_{t}$ depends only on the parity of $i-t$ and not of $b-t$); the proof of Claim 4.9 also remains word for word the same, with the only minor exception being that we have $g=g(\nu):=|\nu\cap F_{t}|-1$ (which again implies $g=0$ when $i=b-1$ and $\chi'_{\theta}=\{u,v\}$).

Finally, we choose paths in $\mathcal{Q}$ going from each of the $z_{t}^{j}$ vertices to $x$, and again the statement and proof of Claim 4.10 remain exactly the same. With this, the total number of choices for the algorithm to produce an $(r,t)$-compatible set is

\[\left(\left(\frac{1}{2}\varepsilon\ell m^{1/b}\right)^{(b-t+1)/2}\cdot\left(\frac{1}{2}\varepsilon\ell^{b/(b-1)}\right)^{(b-t-1)/2}\cdot\left(\frac{1}{2}\varepsilon\ell^{(t-1)b/(b-1)}\right)\right)^{a},\]

since in this setting we get a factor of $\frac{1}{2}\varepsilon\ell m^{1/b}$ for each vertex in position $i\in\{t,t+2,\ldots,b-1\}$, of which there are $(b-t+1)/2$. This quantity is at least as large as (19), so we conclude that for $\delta$ sufficiently small the number of choices is more than the number of hyperedges containing $(u,x),(v,y)$ in $\mathcal{H}$. With this we conclude the result.

5 Balanced Supersaturation for Edges

In the previous section we showed that $\theta_{a,b}$ exhibits balanced supersaturation for vertices in terms of the (complicated) codegree function $D'_{t}$. We begin by simplifying this function.

Proposition 5.1.

For all $a\geq 100$ and $b\geq 3$, let $\delta>0$ and $D'_{t}$ be as in 4.1. There exist constants $C',k_{0}>0$ such that if $n^{1-1/b}\geq k\geq k_{0}$ and $\nu\subseteq V(\theta_{a,b})$ induces $e$ edges, where $1\leq e\leq e(\theta_{a,b})-1$, then

\[D'_{t}(\nu)\leq\frac{C'k^{ab}n^{2}}{kn^{1+1/b}\big(\min\big\{k^{b/(b-1)},kn^{\frac{b-1}{b(ab-1)}}\big\}\big)^{e-1}}.\]

Note that $n^{1-1/b}\geq k$ always holds if we are considering $n$-vertex graphs $G$ with $kn^{1+1/b}$ edges. We defer the proof of Proposition 5.1 for the moment and show that together with 4.1, it implies a balanced supersaturation result for edges which we will use to complete the proof of 1.3; see 6.1 below.

Corollary 5.2.

For all $a\geq 100$ and $b\geq 3$, there exist constants $C,k_{0}>0$ such that the following holds for all $n\in\mathbb{N}$ and $k\geq k_{0}$. If $G$ is an $n$-vertex graph with $kn^{1+1/b}$ edges, then there exists a hypergraph $\mathcal{H}$ on $E(G)$ whose hyperedges are copies of $\theta_{a,b}$, such that $|\mathcal{H}|\geq C^{-1}k^{ab}n^{2}$ and such that for every $\sigma\subseteq E(G)$ with $1\leq|\sigma|\leq e(\theta_{a,b})-1$, we have

\[\deg_{\mathcal{H}}(\sigma)\leq\frac{Ck^{ab}n^{2}}{kn^{1+1/b}\big(\min\big\{k^{b/(b-1)},kn^{\frac{b-1}{b(ab-1)}}\big\}\big)^{|\sigma|-1}}.\]
Proof.

Let $\mathcal{H}'_{t}$ be the $D'_{t}$-good $G$-hypergraph on $V(\theta_{a,b})\times V(G)$ guaranteed by 4.1. We would like to translate $\mathcal{H}'_{t}$ into a hypergraph $\mathcal{H}$ on $E(G)$ satisfying the codegree bounds above.

This will be conceptually straightforward, but a little tedious. In essence, the hyperedges of $\mathcal{H}'_{t}$ correspond to theta graphs in $G$, and we will define $\mathcal{H}$ to be the hypergraph corresponding to these theta graphs. However, we must deal with two small issues with this translation: (1) a single theta graph in $G$ may appear isomorphically several times in $\mathcal{H}'_{t}$, and (2) the codegree bound $D'_{t}(\nu)$ depends on the number of edges induced by $\nu$, whereas the bound in Corollary 5.2 depends only on $|\sigma|$ for an arbitrary set of edges $\sigma$, even if the vertices used by $\sigma$ induce additional edges. Neither of these issues is a real obstacle (in particular, the second can only improve the codegrees), but we will need some additional notation in order to address them.

For each valid set $\chi$, we will define the corresponding set of edges induced in $G$ (excluding “extraneous” edges that do not play a role in the isomorphic copy of $\theta_{a,b}$) as follows:

\[E_{\chi}=\{zz':(w,z),(w',z')\in\chi,\ ww'\in E(\theta_{a,b})\}.\]

In particular, $E_{h}\subseteq E(G)$ is a copy of $\theta_{a,b}$ in $G$ for every hyperedge $h\in\mathcal{H}'_{t}$ (since every hyperedge $h\in\mathcal{H}'_{t}$ is a valid set of size $v(\theta_{a,b})$). Define $\mathcal{H}$ to be the hypergraph with hyperedge set $\{E_{h}:h\in\mathcal{H}'_{t}\}$. Observe that $|\mathcal{H}|\geq\frac{1}{v(\theta_{a,b})!}|\mathcal{H}'_{t}|=\Omega(k^{ab}n^{2})$, so it remains to check the codegree conditions, that is, to bound $\deg_{\mathcal{H}}(\sigma)$ for each set of edges $\sigma$ in $G$.
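To make the map $\chi\mapsto E_{\chi}$ concrete, here is a minimal sketch in Python. The vertex labels (`"u"`, `"v"`, and tuples `("w", i, j)` for $w_{i}^{j}$), the integer labels for vertices of $G$, and the function names `theta_edges` and `edge_set` are all illustrative assumptions of ours; the code does not check validity of $\chi$ or that the resulting pairs are edges of an ambient graph $G$.

```python
from itertools import combinations

def theta_edges(a, b):
    """Edge set of theta_{a,b}: a internally disjoint u-v paths of length b."""
    edges = set()
    for j in range(1, a + 1):
        path = ["u"] + [("w", i, j) for i in range(1, b)] + ["v"]
        edges.update(frozenset(pair) for pair in zip(path, path[1:]))
    return edges

def edge_set(chi, theta):
    """E_chi = { zz' : (w,z), (w',z') in chi and ww' in E(theta) }."""
    return {frozenset((z, z2))
            for (w, z), (w2, z2) in combinations(chi, 2)
            if frozenset((w, w2)) in theta}

theta = theta_edges(3, 2)   # theta_{3,2} is a copy of K_{2,3}
h = [("u", 0), ("v", 1), (("w", 1, 1), 2), (("w", 1, 2), 3), (("w", 1, 3), 4)]
h_swapped = [("u", 0), ("v", 1), (("w", 1, 1), 3), (("w", 1, 2), 2), (("w", 1, 3), 4)]

print(len(edge_set(h, theta)))                            # 6
print(edge_set(h, theta) == edge_set(h_swapped, theta))   # True
print(edge_set(h[:2], theta))                             # set()
```

The second printed line illustrates issue (1) above: two distinct sets $h$ can give the same $E_{h}$, which is why the factor $\frac{1}{v(\theta_{a,b})!}$ appears in the lower bound on $|\mathcal{H}|$.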

Fix a set of edges $\sigma\subseteq E(G)$. We need to understand which valid sets “correspond” to $\sigma$. To this end, let $\sigma_{v}\subseteq V(G)$ be the set of vertices used by the edges $\sigma$, and let $\mathcal{X}$ be the set of all valid $\chi\subseteq V(\theta_{a,b})\times V(G)$ with $\chi_{G}=\sigma_{v}$ and $\sigma\subseteq E_{\chi}$. See Figure 7 for an example.

Figure 7: A pair of edges $\sigma\subseteq E(G)$, together with two valid sets $\chi,\chi'$ in $\mathcal{X}$, i.e. valid sets having $\chi_{G}=\chi'_{G}=\sigma_{v}$ and $\sigma\subseteq E_{\chi}$, $\sigma\subseteq E_{\chi'}$. Note that $E_{\chi'}\neq\sigma$ because $\chi'_{\theta}$ induces three edges in $\theta_{a,b}$.

With this notation, we can convert the codegree bounds in $\mathcal{H}'_{t}$ to a bound on $\deg_{\mathcal{H}}(\sigma)$ as follows.

Claim 5.3.

We have

\[\deg_{\mathcal{H}}(\sigma)\leq\sum_{\chi\in\mathcal{X}}\deg_{\mathcal{H}'_{t}}(\chi).\]
Proof.

We would like to show that each hyperedge $h\in\mathcal{H}$ counted by $\deg_{\mathcal{H}}(\sigma)$ corresponds to at least one hyperedge $h'\in\mathcal{H}'_{t}$ counted by $\deg_{\mathcal{H}'_{t}}(\chi)$ for some $\chi\in\mathcal{X}$. We first observe that if $h\in\mathcal{H}$, then by definition of $\mathcal{H}$, there exists some $h'\in\mathcal{H}'_{t}$ with $h=E_{h'}$. We wish to show that this $h'$ contains a set $\chi\in\mathcal{X}$, so that it will be counted by $\deg_{\mathcal{H}'_{t}}(\chi)$.

To this end, we observe that if the edge set $\sigma$ is contained in $h=E_{h'}$, then the corresponding vertex set $\sigma_{v}$ is contained in $h'_{G}$; so there exists some $\chi\subseteq h'$ such that $\chi_{G}=\sigma_{v}$. We claim that $\sigma\subseteq E_{\chi}$ as well. To see this, note that if $zz'\in\sigma\subseteq E_{h'}$, then $(w,z),(w',z')\in h'$ for some $ww'\in E(\theta_{a,b})$ by the definition of $E_{h'}$. Since $z,z'\in\sigma_{v}=\chi_{G}$ and $\chi\subseteq h'$, we have $(w,z),(w',z')\in\chi$ as well, giving $zz'\in E_{\chi}$. So by definition, $\chi\in\mathcal{X}$ as desired. ∎

To finish, it remains to bound the sum above. First, notice that there are only a constant number of terms: each element of $\mathcal{X}$ is uniquely identified by $\chi_{\theta}$ (since $\chi_{G}=\sigma_{v}$ for each $\chi\in\mathcal{X}$), and hence $|\mathcal{X}|\leq 2^{v(\theta_{a,b})}$. Since $\mathcal{H}'_{t}$ is $D'_{t}$-good, we have $\deg_{\mathcal{H}'_{t}}(\chi)\leq D'_{t}(\chi_{\theta})$, and hence

\[\deg_{\mathcal{H}}(\sigma)\leq\sum_{\chi\in\mathcal{X}}\deg_{\mathcal{H}'_{t}}(\chi)\leq\sum_{\chi\in\mathcal{X}}D'_{t}(\chi_{\theta})\leq 2^{v(\theta_{a,b})}\cdot\frac{C'k^{ab}n^{2}}{kn^{1+1/b}\big(\min\big\{k^{b/(b-1)},kn^{\frac{b-1}{b(ab-1)}}\big\}\big)^{|\sigma|-1}},\]

where this last step used that each $\chi\in\mathcal{X}$ induces $|E_{\chi}|\geq|\sigma|$ edges, together with the bound on $D'_{t}$ given by Proposition 5.1. This gives the desired result by taking $C=2^{v(\theta_{a,b})}C'$. ∎

5.1 Proof of Proposition 5.1

The rest of this section is dedicated to proving Proposition 5.1, which we emphasize will consist entirely of (moderately involved) arithmetic and case analysis. We will abuse notation slightly by identifying a vertex set $\nu\subseteq V(\theta_{a,b})$ with its induced subgraph in $\theta_{a,b}$. For example, we may say $\nu$ contains a cycle to mean its induced subgraph contains a cycle. Unless stated otherwise, $e$ will refer to the number of edges that $\nu$ induces in $\theta_{a,b}$. We let $\delta>0$ be the constant guaranteed by 4.1, and throughout we assume $n^{1-1/b}\geq k$ and $b\geq 3$. We recall that our goal is to show that there exists a constant $C'>0$ such that for $k$ sufficiently large and for all $\nu\subseteq V(\theta_{a,b})$ inducing $e$ edges for $1\leq e\leq e(\theta_{a,b})-1$, we have

\[D'_{t}(\nu)\leq\frac{C'k^{ab}n^{2}}{kn^{1+1/b}\big(\min\big\{k^{b/(b-1)},kn^{\frac{b-1}{b(ab-1)}}\big\}\big)^{e-1}},\]

where the definition of $D'_{t}(\nu)$ will be recalled below. We begin with an easy case.

Lemma 5.4.

For any codegree function $D$, if $\nu\subseteq V(\theta_{a,b})$ is such that $D(\nu)=1$ and $\nu$ induces $e\geq 1$ edges, then

\[D(\nu)\leq\frac{k^{ab}n^{2}}{kn^{1+1/b}\big(kn^{\frac{b-1}{b(ab-1)}}\big)^{e-1}}.\]
Proof.

This follows immediately from $e\leq ab$: the right-hand side is decreasing in $e$, and at $e=ab$ it equals $k^{ab}n^{2}/(k^{ab}n^{1+1/b}n^{(b-1)/b})=1$. ∎
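As a sanity check on this lemma, the exponents of $k$ and $n$ on the right-hand side can be computed exactly. The sketch below is ours (the name `rhs_exponents` is hypothetical) and only confirms that both exponents are nonnegative for $1\leq e\leq ab$ and vanish at $e=ab$, so that the right-hand side is at least $1$ whenever $k,n\geq 1$.

```python
from fractions import Fraction

def rhs_exponents(a, b, e):
    """Exponents (of k, of n) in k^{ab} n^2 / (k n^{1+1/b} (k n^c)^{e-1}),
    where c = (b-1)/(b(ab-1))."""
    c = Fraction(b - 1, b * (a * b - 1))
    k_exp = a * b - 1 - (e - 1)
    n_exp = 2 - (1 + Fraction(1, b)) - c * (e - 1)
    return k_exp, n_exp

# Hypothetical sample values: a = b = 3 and e = ab = 9.
print(rhs_exponents(3, 3, 9))   # (0, Fraction(0, 1))
```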

We remind the reader that

\[D'_{t}(\nu):=\begin{cases}D_{\operatorname{forest}}(\nu)&\nu\textrm{ induces a forest},\\ \min\{D_{t}(\nu),20(D_{0,t}(\nu)+\left\lceil\log n\right\rceil)\}&\textrm{otherwise}.\end{cases}\]

For ease of reading, we recall each of the functions mentioned above before they are used. First, we recall

\[D_{\operatorname{forest}}(\nu)=\left\lceil\frac{k^{ab}n^{2}}{\delta kn^{1+1/b}\cdot(\delta k^{b/(b-1)})^{e-1}}\right\rceil\]

whenever $\nu\subseteq V(\theta_{a,b})$ induces a forest with $e\geq 1$ edges.

Lemma 5.5.

If $a\geq 3$, $2\leq t\leq b$, and $\nu\subseteq V(\theta_{a,b})$ induces a forest on $e\geq 1$ edges, then

\[D'_{t}(\nu)\leq\frac{2\delta^{-ab}k^{ab}n^{2}}{kn^{1+1/b}\big(\min\big\{k^{b/(b-1)},kn^{\frac{b-1}{b(ab-1)}}\big\}\big)^{e-1}}.\]
Proof.

If $D'_{t}(\nu)=D_{\operatorname{forest}}(\nu)=1$ then the result follows from Lemma 5.4, and otherwise

\[D'_{t}(\nu)=D_{\operatorname{forest}}(\nu)\leq\frac{2k^{ab}n^{2}}{\delta kn^{1+1/b}\cdot(\delta k^{b/(b-1)})^{e-1}},\]

where the factor of 2 comes from the ceiling function and the assumption $D_{\operatorname{forest}}(\nu)>1$. The result follows since $e-1\leq ab-1$. ∎

It remains to prove the result when $\nu$ contains a cycle. To help with the case analysis, we show that it suffices to prove the result when $\nu$ consists of paths of length $b$, i.e. when $\nu$ contains no leaves or isolated vertices.

Lemma 5.6.

Let $D$ be a codegree function such that if $\nu\subseteq V(\theta_{a,b})$ contains a cycle, then either $D(\nu)=1$ or $D(\nu)\leq 2\delta^{-1}k^{-b/(b-1)}D(\nu\setminus\{w\})$ for every $w\in\nu$.

If $C\geq 1$ is a constant such that for every $\nu$ which consists of paths of length $b$, we have

\[D(\nu)\leq\frac{Ck^{ab}n^{2}}{kn^{1+1/b}\big(\min\big\{k^{b/(b-1)},kn^{\frac{b-1}{b(ab-1)}}\big\}\big)^{e-1}},\]

then for every $k\geq 2\delta^{-1}$ and $\nu$ which contains a cycle, we have

\[D(\nu)\leq\frac{C(2\delta^{-1})^{|\nu|}k^{ab}n^{2}}{kn^{1+1/b}\big(\min\big\{k^{b/(b-1)},kn^{\frac{b-1}{b(ab-1)}}\big\}\big)^{e-1}}.\]
Proof.

Assume the hypotheses hold for $D$. We prove by induction on $|\nu|$ that any $\nu\subseteq V(\theta_{a,b})$ containing a cycle satisfies the desired inequality. For this proof, we recall that $u,v$ always denote the two high degree vertices of $\theta_{a,b}$.

Suppose we have proved the result for all smaller sets, and consider a set $\nu$ which induces $e$ edges. If $D(\nu)=1$ then the inequality follows from Lemma 5.4, so we may assume $D(\nu)>1$. If $\nu$ consists of paths of length $b$ then the result follows by hypothesis. Otherwise, there exists some vertex $w\in\nu\setminus\{u,v\}$ which is adjacent to at most one other vertex in $\nu$. Thus $\nu\setminus\{w\}$ induces a graph containing a cycle with at least $e-1$ edges. By our hypothesis on $D$, we find

\begin{align*}
D(\nu) &\leq 2\delta^{-1}k^{-b/(b-1)}\cdot D(\nu\setminus\{w\})\\
&\leq 2\delta^{-1}k^{-b/(b-1)}\cdot\frac{C(2\delta^{-1})^{|\nu|-1}k^{ab}n^{2}}{kn^{1+1/b}\big(\min\big\{k^{b/(b-1)},kn^{\frac{b-1}{b(ab-1)}}\big\}\big)^{e-2}}\\
&\leq\frac{C(2\delta^{-1})^{|\nu|}k^{ab}n^{2}}{kn^{1+1/b}\big(\min\big\{k^{b/(b-1)},kn^{\frac{b-1}{b(ab-1)}}\big\}\big)^{e-1}},
\end{align*}

where the second inequality used that our inductive hypothesis applies to $\nu\setminus\{w\}$ (since $\nu\setminus\{w\}$ contains a cycle and $D(\nu\setminus\{w\})\geq(\delta/2)k^{b/(b-1)}D(\nu)>1$), and the third used that the minimum is at most $k^{b/(b-1)}$. This gives the desired result. ∎

We will show that essentially all of our remaining codegree functions are of the form described in Lemma 5.6. First, we recall

\[D_{0,b}(\nu)=\left\lceil\frac{k^{ab}n^{2}}{n^{2}(k^{b/(b-1)})^{|\nu|-2}\delta^{|\nu|}}\right\rceil\]

whenever $\nu$ contains the two high degree vertices $u,v$, and $D_{0,b}(\nu)=\infty$ otherwise. Note that if $\nu$ contains a cycle, then $D_{0,b}(\nu\setminus\{w\})=\infty$ if $w\in\{u,v\}$, and otherwise if $D_{0,b}(\nu)>1$ we have $D_{0,b}(\nu)\leq 2\delta^{-1}k^{-b/(b-1)}D_{0,b}(\nu\setminus\{w\})$, where the factor of 2 comes from the ceiling function. Thus $D_{0,b}$ satisfies the conditions of Lemma 5.6, and using this we prove the following.

Lemma 5.7.

There exists a constant $C>0$ such that if $k$ is sufficiently large in terms of $a,b,\delta$ and if $\nu\subseteq V(\theta_{a,b})$ induces $e$ edges where $1\leq e\leq e(\theta_{a,b})-1$, then

\[D'_{b}(\nu)\leq\frac{Ck^{ab}n^{2}}{kn^{1+1/b}\big(\min\big\{k^{b/(b-1)},kn^{\frac{b-1}{b(ab-1)}}\big\}\big)^{e-1}}.\]
Proof.

First, if $\nu$ induces a forest then the result follows from Lemma 5.5, so we may assume $\nu$ contains a cycle.

Now consider the case $D'_{b}(\nu)\leq 40\lceil\log n\rceil$. In particular, it suffices here to show that

\[40\lceil\log n\rceil\leq\frac{Ck^{ab}n^{2}}{kn^{1+1/b}\cdot\big(kn^{\frac{b-1}{b(ab-1)}}\big)^{e-1}}.\]

And indeed, since $e\leq ab-1$, this inequality is satisfied if

\[40\lceil\log n\rceil\leq Ckn^{\frac{b-1}{b(ab-1)}},\]

which holds for all $n$ and $k$, provided $C$ is sufficiently large.

Thus we may assume that $40\lceil\log n\rceil\leq D'_{b}(\nu)$; in particular, since $D'_{b}(\nu)\leq 20(D_{0,b}(\nu)+\lceil\log n\rceil)$, this implies that $\lceil\log n\rceil\leq D_{0,b}(\nu)$, and as such $D'_{b}(\nu)\leq 40D_{0,b}(\nu)$. Possibly by adjusting the constant $C$, it now suffices to show $D_{0,b}(\nu)$ satisfies the inequality of the lemma. By Lemmas 5.4 and 5.6, it suffices to show this holds for $\nu$ consisting of $p\geq 2$ paths of length $b$. In this case $|\nu|=p(b-1)+2$ and $e=pb$, so it suffices to show

\[\frac{k^{ab}n^{2}}{n^{2}(k^{b/(b-1)})^{p(b-1)}\delta^{|\nu|}}\leq\frac{Ck^{ab}n^{2}}{kn^{1+1/b}\big(\min\big\{k^{b/(b-1)},kn^{\frac{b-1}{b(ab-1)}}\big\}\big)^{pb-1}}\]

for some constant $C$, where implicitly we used that the ceiling function in $D_{0,b}(\nu)$ can be ignored by increasing $C$ by a factor of 2. Using $\min\big\{k^{b/(b-1)},kn^{\frac{b-1}{b(ab-1)}}\big\}\leq kn^{\frac{b-1}{b(ab-1)}}$ and rearranging the above gives that it suffices to show

\[\big(kn^{\frac{b-1}{b(ab-1)}}\big)^{pb-1}\leq C\delta^{|\nu|}k^{pb-1}n^{1-1/b},\]

which holds for any $C\geq\delta^{-|\nu|}$ since $pb-1\leq ab-1$. We conclude the result. ∎

It remains to deal with the case $t<b$. To start, we recall that we write the paths of $\theta_{a,b}$ as $uw_{1}^{j}\cdots w_{b-1}^{j}v$ for $1\leq j\leq a$, and that we define $F_{t}=\{w_{i}^{j}:t\leq i<b,\ i-t\textrm{ is even}\}$. We recall that if $u,v\in\nu$ then

\[D_{0,t}(\nu)=\left\lceil\frac{k^{ab}n^{2}}{k^{(2b-2t+1)/(b-1)}n^{(2t-1)/b}\cdot(kn^{1/b})^{f}(k^{b/(b-1)})^{|\nu|-f-2}\delta^{|\nu|}}\right\rceil\]

where $f=|\nu\cap F_{t}|$, with $D_{0,t}(\nu)=\infty$ otherwise. Similarly if $u,v,w_{b-1}^{j}\in\nu$ for some $j$ then

\[D_{t}(\nu)=\left\lceil\frac{k^{ab}n^{2}}{kn^{1+1/b}\cdot(kn^{1/b})^{g}(k^{b/(b-1)})^{|\nu|-g-3}\delta^{|\nu|}}\right\rceil,\]

where $g=|\nu\cap F_{t}|$ if $b-t$ is even and $g=|\nu\cap F_{t}|-1$ if $b-t$ is odd; if $\nu$ does not contain $u$, $v$, and some $w_{b-1}^{j}$, then $D_{t}(\nu)=\infty$. Note that both of these codegree functions satisfy the conditions of Lemma 5.6 since we assumed $n^{1-1/b}\geq k$. From now on we will assume we work with $t<b$ and define $f,g$ as in the above codegree functions. It will be useful to note that if $\nu$ consists of $p$ paths of length $b$, then by definition

\[f=p\left\lceil(b-t)/2\right\rceil \tag{20}\]

and

\[g=(p-2)\left\lceil(b-t)/2\right\rceil+b-t, \tag{21}\]

where this last equality follows from $g=f$ if $b-t$ is even and otherwise $g=f-1=(p-1)\left\lceil(b-t)/2\right\rceil+\left\lfloor(b-t)/2\right\rfloor$.
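The closed forms (20) and (21) can be compared against a direct count. The following is a minimal sketch, with internal vertices $w_{i}^{j}$ encoded as pairs `(i, j)` and with function names `f_and_g` and `closed_forms` that are ours; it assumes $\nu$ consists of $u$, $v$, and the first $p\leq a$ full paths.

```python
def f_and_g(a, p, b, t):
    """Count f = |nu ∩ F_t| directly, then derive g from the parity of b - t."""
    # Internal vertices w_i^j of the p paths making up nu (u and v are never in F_t).
    nu = {(i, j) for j in range(1, p + 1) for i in range(1, b)}
    # F_t = { w_i^j : t <= i < b, i - t even }, over all a paths.
    F_t = {(i, j) for j in range(1, a + 1) for i in range(t, b) if (i - t) % 2 == 0}
    f = len(nu & F_t)
    g = f if (b - t) % 2 == 0 else f - 1
    return f, g

def closed_forms(p, b, t):
    """Right-hand sides of (20) and (21)."""
    half_up = (b - t + 1) // 2          # ceil((b - t) / 2)
    return p * half_up, (p - 2) * half_up + (b - t)

# Hypothetical sample values with b - t odd.
print(f_and_g(5, 3, 7, 4))     # (6, 5)
print(closed_forms(3, 7, 4))   # (6, 5)
```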

Lemma 5.8.

Let $\nu$ consist of $p\geq 2$ complete paths and define $h=2t+f-b-2$. There exists a constant $C>0$ such that

\[D_{0,t}(\nu)\leq\frac{Ck^{ab}n^{2}}{kn^{1+1/b}\big(\min\big\{k^{b/(b-1)},kn^{\frac{b-1}{b(ab-1)}}\big\}\big)^{e-1}}\]

provided either

\[k\leq n^{\frac{b-1}{b}\cdot\frac{h}{(p-1)b+h}},\]

or

\[k^{b-1-h}\geq n^{\frac{b-1}{b}\cdot\frac{(b-1)(pb-1)-h(ab-1)}{ab-1}}.\]
Proof.

If $D_{0,t}(\nu)=1$ then the result holds by Lemma 5.4, so from now on we assume $D_{0,t}(\nu)>1$. We can rewrite the denominator of $D_{0,t}(\nu)$ as

\begin{align*}
&\delta^{|\nu|}k^{(2b-2t+1)/(b-1)}n^{(2t-1)/b}(kn^{1/b})^{f}(k^{b/(b-1)})^{|\nu|-f-2}\\
&\quad=\delta^{|\nu|}k^{2b/(b-1)}(k^{-1/(b-1)}n^{1/b})^{2t+f-1}(k^{b/(b-1)})^{|\nu|-2}\\
&\quad=\delta^{|\nu|}kn^{1+1/b}\cdot(k^{-1/(b-1)}n^{1/b})^{2t+f-b-2}(k^{b/(b-1)})^{|\nu|-2}.
\end{align*}

Using this and $D_{0,t}(\nu)>1$, we see that to show the desired result holds with $C=2\delta^{-v(\theta_{a,b})}$, it suffices to show

\[(k^{-1/(b-1)}n^{1/b})^{h}(k^{b/(b-1)})^{|\nu|-2}\geq\big(\min\big\{k^{b/(b-1)},kn^{\frac{b-1}{b(ab-1)}}\big\}\big)^{e-1}. \tag{22}\]

Using that the minimum in (22) is at most $k^{b/(b-1)}$ and rearranging, we see that it suffices to have

\[(k^{-1/(b-1)}n^{1/b})^{h}\geq(k^{b/(b-1)})^{(e-1)-(|\nu|-2)}=(k^{b/(b-1)})^{pb-1-p(b-1)}=k^{(p-1)b/(b-1)},\]

i.e. $k^{((p-1)b+h)/(b-1)}\leq n^{h/b}$, which gives the first result.

If we instead use that the minimum in (22) is at most $kn^{\frac{b-1}{b(ab-1)}}$, then we see that this inequality will be satisfied provided

\[(k^{-1/(b-1)}n^{1/b})^{h}k^{pb}\geq k^{pb-1}n^{\frac{(b-1)(pb-1)}{b(ab-1)}},\]

which is equivalent to

\[k^{\frac{b-1-h}{b-1}}\geq n^{\frac{(b-1)(pb-1)}{b(ab-1)}-\frac{h}{b}}=n^{\frac{(b-1)(pb-1)-h(ab-1)}{b(ab-1)}}.\]

This gives the last part of the lemma. ∎
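The rewriting of the denominator of $D_{0,t}(\nu)$ at the start of this proof is pure exponent arithmetic, and can be checked as follows. This is a sketch under our own naming (`original_exponents`, `rewritten_exponents`, and the parameter `size` standing for $|\nu|$ are illustrative); the common factor $\delta^{|\nu|}$ is omitted, and $f$ and $|\nu|$ are treated as free integer parameters.

```python
from fractions import Fraction

def original_exponents(b, t, f, size):
    """(k-exponent, n-exponent) of
    k^{(2b-2t+1)/(b-1)} n^{(2t-1)/b} (k n^{1/b})^f (k^{b/(b-1)})^{size-f-2}."""
    k_exp = Fraction(2 * b - 2 * t + 1, b - 1) + f + Fraction(b, b - 1) * (size - f - 2)
    n_exp = Fraction(2 * t - 1, b) + Fraction(f, b)
    return k_exp, n_exp

def rewritten_exponents(b, t, f, size):
    """(k-exponent, n-exponent) of
    k n^{1+1/b} (k^{-1/(b-1)} n^{1/b})^h (k^{b/(b-1)})^{size-2}, with h = 2t+f-b-2."""
    h = 2 * t + f - b - 2
    k_exp = 1 - Fraction(h, b - 1) + Fraction(b, b - 1) * (size - 2)
    n_exp = 1 + Fraction(1, b) + Fraction(h, b)
    return k_exp, n_exp

# Hypothetical sample values.
print(original_exponents(5, 3, 2, 10))    # (Fraction(43, 4), Fraction(7, 5))
print(rewritten_exponents(5, 3, 2, 10))   # (Fraction(43, 4), Fraction(7, 5))
```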

Lemma 5.9.

Let $\nu$ consist of $p\geq 2$ complete paths. There exists a constant $C>0$ such that

\[D_{t}(\nu)\leq\frac{Ck^{ab}n^{2}}{kn^{1+1/b}(k^{b/(b-1)})^{e-1}}\]

provided

\[k\leq n^{\frac{b-1}{b}\cdot\frac{g}{pb+g}}.\]
Proof.

By Lemma 5.4 we can assume $D_{t}(\nu)>1$, so the result holds for some $C\geq 2\delta^{-|\nu|}$ provided

\[(kn^{1/b})^{g}(k^{b/(b-1)})^{|\nu|-g-3}\geq(k^{b/(b-1)})^{e-1}.\]

Using $|\nu|=p(b-1)+2$ and $e=pb$, this is equivalent to

\[(k^{-1/(b-1)}n^{1/b})^{g}\geq(k^{b/(b-1)})^{e-1-|\nu|+3}=(k^{b/(b-1)})^{pb-1-p(b-1)+1}=k^{pb/(b-1)},\]

i.e. $k^{(pb+g)/(b-1)}\leq n^{g/b}$, which gives the desired result. ∎

With these two results we can solve the cycle case for $t<b$ provided $a$ is sufficiently large.

Lemma 5.10.

If $b>t\geq 2$ and $a\geq 100$, $b\geq 3$, then there exists a constant $C>0$ such that if $k$ is sufficiently large in terms of $a,b,\delta$ and if $\nu\subseteq V(\theta_{a,b})$ induces $e$ edges where $1\leq e\leq e(\theta_{a,b})-1$, then

\[D'_{t}(\nu)\leq\frac{Ck^{ab}n^{2}}{kn^{1+1/b}(k^{b/(b-1)})^{e-1}}.\]
Proof.

By using similar reasoning as in Lemma 5.7, it suffices to prove this upper bounds holds for min{D0,t(ν),Dt(ν)}\min\{D_{0,t}(\nu),D_{t}(\nu)\} (i.e. ignoring the logn\log n term in Dt(ν)D^{\prime}_{t}(\nu)) whenever ν\nu consists of p2p\geq 2 paths of length bb. With hh as in Lemma 5.8, (20) and (21) give

h=2t+p(bt)/2b20,h=2t+p\left\lceil(b-t)/2\right\rceil-b-2\geq 0,
g=(p2)(bt)/2+bt,g=(p-2)\left\lceil(b-t)/2\right\rceil+b-t,

where h0h\geq 0 follows from p,t2p,t\geq 2. We also note that both inequalities of Lemma 5.8 become easier to satisfy for larger values of hh (this holds for the first inequality because the function hc+h\frac{h}{c+h} is increasing for any c>0c>0, and it holds for the second since nb1bkn^{\frac{b-1}{b}}\geq k).

We first claim that $h$ is relatively large in most cases; namely, if $p\geq 5$, then $h\geq b-1$. Indeed, this being false is equivalent to

\[2t+p\left\lceil(b-t)/2\right\rceil-b-2<b-1.\]

If $t=b-1$ then this is equivalent to $p<2b+1-2(b-1)=3$, contradicting our assumption on $p$, so we may assume $t<b-1$. By dropping the ceiling function, the inequality above implies

\[2t+p(b-t)/2-b-2<b-1,\]

which is equivalent to

\[p<4+\frac{2}{b-t}\leq 5,\]

with the last step using $t<b-1$, again giving a contradiction to our assumption on $p$.

We conclude that $h\geq b-1$ if $p\geq 5$. Note that the second inequality of Lemma 5.8 trivially holds at $h=b-1$, and since the lemma is easier to satisfy for larger values of $h$, we conclude the result for $p\geq 5$. From now on we assume $2\leq p\leq 4$.

(Footnote 4: As an aside, it is not difficult to show that Lemma 5.8 alone suffices to prove the result for $p\geq 3$ if, say, $a\geq 12$. However, for $p=2$ it is necessary to use Lemma 5.9 as well since, in particular, we can have $h=0$ in this case. Dealing with the case $p=2$ here is the only reason the codegree functions $D_{t}$ are introduced.)
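The claim that $h\geq b-1$ whenever $p\geq 5$ can be sanity-checked by brute force. The following is only an illustrative sketch over a finite range, not a replacement for the argument above; the function names and the search bounds `max_b`, `max_p` are our own choices.

```python
def h_value(p: int, t: int, b: int) -> int:
    """h = 2t + p*ceil((b-t)/2) - b - 2, as displayed in the proof."""
    half_up = (b - t + 1) // 2  # integer ceiling of (b-t)/2 for b > t
    return 2 * t + p * half_up - b - 2


def claim_holds(max_b: int = 60, max_p: int = 40) -> bool:
    """Check h >= b-1 for all 2 <= t < b <= max_b and 5 <= p <= max_p."""
    return all(
        h_value(p, t, b) >= b - 1
        for b in range(3, max_b + 1)
        for t in range(2, b)
        for p in range(5, max_p + 1)
    )


def failures_at_p4(max_b: int = 60) -> list:
    """Pairs (b, t) with h < b-1 when p = 4."""
    return [
        (b, t)
        for b in range(3, max_b + 1)
        for t in range(2, b)
        if h_value(4, t, b) < b - 1
    ]


if __name__ == "__main__":
    print(claim_holds())
    print(failures_at_p4(8))
```

The second helper lists pairs $(b,t)$ for which the inequality fails at $p=4$ (for instance $t=b-2$), which is consistent with the cases $2\leq p\leq 4$ being handled separately.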

Note that

\[g+h=t+(2p-2)\left\lceil(b-t)/2\right\rceil-2,\]

and in particular,

\[\max\{g,h\}\geq t/2+(p-1)\left\lceil(b-t)/2\right\rceil-1\geq b/2-1\]

(where the last step uses $p\geq 2$). First consider the case $h\geq b/2-1$. Note that for $p\leq 4$ we have

\[(b-1)(pb-1)-h(ab-1)\leq(b-1)(4b-1)-(b/2-1)(ab-1)=(b/2)(8b-9-a(b-2))\leq 0,\]

where this last step holds for $a\geq 15\geq\frac{8b-9}{b-2}$ (and uses $b\geq 3$). With this we either have $h\geq b-1$ (in which case we are done by the argument for $p\geq 5$), or

\[k^{b-1-h}\geq 1\geq n^{\frac{b-1}{b}\cdot\frac{(b-1)(pb-1)-h(ab-1)}{ab-1}},\]

in which case the result follows from Lemma 5.8.
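The algebraic identity used in the case $h\geq b/2-1$, and the fact that $a\geq 15$ suffices for $b\geq 3$, can be checked with exact rational arithmetic. This is a small sketch under our own naming; the ranges tested are arbitrary.

```python
from fractions import Fraction


def left_side(a: int, b: int) -> Fraction:
    """(b-1)(4b-1) - (b/2-1)(ab-1)."""
    return (b - 1) * (4 * b - 1) - (Fraction(b, 2) - 1) * (a * b - 1)


def factored(a: int, b: int) -> Fraction:
    """(b/2)(8b-9-a(b-2))."""
    return Fraction(b, 2) * (8 * b - 9 - a * (b - 2))


def identity_holds(max_a: int = 200, max_b: int = 60) -> bool:
    """Compare both sides on a grid of integer values."""
    return all(
        left_side(a, b) == factored(a, b)
        for a in range(1, max_a + 1)
        for b in range(3, max_b + 1)
    )


def min_a(b: int) -> Fraction:
    """Smallest real a making the factored form non-positive: (8b-9)/(b-2)."""
    return Fraction(8 * b - 9, b - 2)
```

The quantity `min_a(b)` is largest at $b=3$, where it equals 15, which matches the requirement $a\geq 15$ in the text.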

Now assume $h<b/2-1$, which in particular implies $g\geq b/2-1$. By Lemma 5.9, and using $p\leq 4$, we obtain the result if $k\leq n^{\frac{b-1}{b}\cdot\frac{b/2-1}{4b+b/2-1}}=n^{\frac{b-1}{b}\cdot\frac{b-2}{9b-2}}$. On the other hand, using the second inequality of Lemma 5.8, which is harder to satisfy the smaller $h\geq 0$ is, we see that for $p\leq 4$ the result holds if

\[k^{b-1}\geq n^{\frac{b-1}{b}\cdot\frac{(b-1)(4b-1)}{ab-1}}.\]

Thus the result holds for all $k$ provided

\[\frac{4b-1}{ab-1}\leq\frac{b-2}{9b-2},\]

or equivalently

\[a\geq\frac{1}{b}+\frac{(4b-1)(9b-2)}{b(b-2)}.\]

This holds for $a\geq 100$, proving the result. ∎
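The final condition on $a$ can be evaluated exactly to see how much room the bound $a\geq 100$ leaves. The sketch below uses our own helper names and checks a finite range only.

```python
from fractions import Fraction


def a_lower_bound(b: int) -> Fraction:
    """1/b + (4b-1)(9b-2)/(b(b-2)), the threshold on a from the proof."""
    return Fraction(1, b) + Fraction((4 * b - 1) * (9 * b - 2), b * (b - 2))


def ranges_overlap(a: int, b: int) -> bool:
    """The inequality (4b-1)/(ab-1) <= (b-2)/(9b-2)."""
    return Fraction(4 * b - 1, a * b - 1) <= Fraction(b - 2, 9 * b - 2)


def equivalence_holds(max_a: int = 150, max_b: int = 40) -> bool:
    """The two displayed conditions agree on a grid of integer values."""
    return all(
        ranges_overlap(a, b) == (a >= a_lower_bound(b))
        for a in range(1, max_a + 1)
        for b in range(3, max_b + 1)
    )
```

On the range tested the threshold is largest at $b=3$, where it equals 92, so this particular step of the argument needs only $a\geq 92$.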

5.1 now follows immediately from Lemmas  5.7 and 5.10.

We note that sharper arguments can easily be used to reduce the bound $a\geq 100$ of 5.1 considerably, though the bound cannot be made arbitrarily small. In particular, one can work out that the case $b=4$ and $p=t=2$ shows that $a\geq 9$ is needed, as $D_{t}^{\prime}(\nu)$ does not satisfy the conclusion of 5.1 in this case.

6 Completing the Proof of 1.3

Recall that we wish to show that for all $b\geq 2$, there exists $a_{0}=a_{0}(b)$ such that for any fixed $a\geq a_{0}$, w.h.p.

\[\mathrm{ex}(G_{n,p},\theta_{a,b})=\begin{cases}\Theta\left(p^{\frac{1}{b}}n^{1+\frac{1}{b}}\right)&p\geq n^{-\frac{b-1}{ab-1}}(\log n)^{2b},\\ n^{2-\frac{a(b-1)}{ab-1}}(\log n)^{O(1)}&n^{-\frac{b-1}{ab-1}}(\log n)^{2b}\geq p\geq n^{-\frac{a(b-1)}{ab-1}},\\ (1+o(1))p\binom{n}{2}&n^{-\frac{a(b-1)}{ab-1}}\gg p\gg n^{-2}.\end{cases}\]

The case $b=2$ follows from Morris and Saxton [20], so from now on we assume $b\geq 3$. The lower bounds for $\mathrm{ex}(G_{n,p},\theta_{a,b})$ follow from [28, Corollary 5.1], which is proven using random polynomial graphs (similar to how Conlon [7] proved $\mathrm{ex}(n,\theta_{a,b})=\Omega(n^{1+1/b})$ whenever $a$ is sufficiently large in terms of $b$). The upper bound for $p$ small follows from the fact that $G_{n,p}$ has at most $(1+o(1))p\binom{n}{2}$ edges w.h.p., and the upper bound for $p$ in the middle range will follow from the upper bound for $p$ large due to the monotonicity of $\mathrm{ex}(G_{n,p},F)$ with respect to $p$.

(Footnote 5: Specifically, one applies Corollary 5.1 to the rooted tree $(T,R)$ with $T$ the path on $b$ edges and $R$ its set of leaves. With this one can check $\rho(T)\geq\frac{b}{b-1}$ (which is also implicitly shown in Conlon [7]), and that $\theta_{a,b}\in\mathcal{T}^{a}$.)
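As a consistency check on the three regimes, one can verify that the exponents of $n$ agree at the two boundary values of $p$, ignoring logarithmic factors. The following sketch writes $p=n^{-q}$ and compares exponents exactly; the function names are ours and the polylogarithmic terms are deliberately dropped.

```python
from fractions import Fraction


def dense_regime_exponent(b: int, q: Fraction) -> Fraction:
    """Exponent of n in p^{1/b} n^{1+1/b} when p = n^{-q}."""
    return 1 + Fraction(1, b) - Fraction(q) / b


def middle_regime_exponent(a: int, b: int) -> Fraction:
    """Exponent of n in the middle regime: 2 - a(b-1)/(ab-1)."""
    return 2 - Fraction(a * (b - 1), a * b - 1)


def sparse_regime_exponent(q: Fraction) -> Fraction:
    """Exponent of n in p * n^2 when p = n^{-q}."""
    return 2 - Fraction(q)


def regimes_agree(a: int, b: int) -> bool:
    """Both boundary values of p give the middle-regime exponent."""
    upper_q = Fraction(b - 1, a * b - 1)        # p = n^{-(b-1)/(ab-1)}
    lower_q = Fraction(a * (b - 1), a * b - 1)  # p = n^{-a(b-1)/(ab-1)}
    mid = middle_regime_exponent(a, b)
    return (
        dense_regime_exponent(b, upper_q) == mid
        and sparse_regime_exponent(lower_q) == mid
    )
```

That the exponents match at both endpoints is what one expects from the monotonicity argument: the bound in the middle range is flat in $p$ and interpolates between the other two.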

With this all in mind, it only remains to prove $\mathrm{ex}(G_{n,p},\theta_{a,b})=O(p^{1/b}n^{1+1/b})$ when $p\geq n^{-\frac{b-1}{ab-1}}(\log n)^{2b}$. For this we utilize the following general result showing that balanced supersaturation implies upper bounds on $\mathrm{ex}(G_{n,p},F)$.

Theorem 6.1.

Let $F$ be a graph and $1<\alpha<2$ a real number satisfying the following: there exist real numbers $C,k_{0}>0$ such that for every $n$-vertex graph $G$ with $e(G)=kn^{\alpha}$ and $k\geq k_{0}$, there exists a hypergraph $\mathcal{H}$ on $E(G)$ whose hyperedges are copies of $F$, such that $|\mathcal{H}|\geq C^{-1}k^{e(F)}n^{v(F)-(2-\alpha)e(F)}$, and such that for every $\sigma\subseteq E(G)$ with $1\leq|\sigma|\leq e(F)-1$, we have

\[\deg_{\mathcal{H}}(\sigma)\leq\frac{Ck^{e(F)}n^{v(F)-(2-\alpha)e(F)}}{kn^{\alpha}\left(\min\left\{k^{\frac{1}{2-\alpha}},\,kn^{\alpha-2+\frac{v(F)-2}{e(F)-1}}\right\}\right)^{|\sigma|-1}}.\]

In this case, w.h.p.

\[\mathrm{ex}(G_{n,p},F)=O(p^{\alpha-1}n^{\alpha})\qquad\text{for all }p\geq\left(n^{\alpha-2+\frac{v(F)-2}{e(F)-1}}/\log^{2}n\right)^{-\frac{1}{\alpha-1}}.\]

We note that if $F$ is 2-balanced, i.e. if it has $m_{2}(F)=\frac{e(F)-1}{v(F)-2}$, then the conclusion of Theorem 6.1 is exactly the upper bound predicted by Conjecture 1.2 provided $\mathrm{ex}(n,F)=\Theta(n^{\alpha})$.

The proof of Theorem 6.1 uses what is by now a fairly routine argument involving hypergraph containers, a powerful technique developed independently by Balogh, Morris and Samotij [1] and Saxton and Thomason [26]. We defer the details to Appendix A.

In any case, by 5.2, we see that $\theta_{a,b}$ satisfies the conditions of Theorem 6.1 for $a\geq 100$ and $b\geq 3$, proving the desired upper bound and completing the proof.
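To see how the general quantities of Theorem 6.1 specialize to $F=\theta_{a,b}$, one can substitute $v(F)=a(b-1)+2$, $e(F)=ab$ and $\alpha=1+1/b$ and compare with the expressions appearing in Lemma 5.10 and in the statement being proved. The sketch below does this with exact fractions; the dictionary keys are our own labels and the code only tracks exponents.

```python
from fractions import Fraction


def theta_parameters(a: int, b: int):
    """Vertex count, edge count and Turan exponent for theta_{a,b}."""
    v = a * (b - 1) + 2  # two endpoints plus b-1 internal vertices per path
    e = a * b            # a paths with b edges each
    alpha = 1 + Fraction(1, b)
    return v, e, alpha


def specialised_quantities(a: int, b: int) -> dict:
    """Exponents from Theorem 6.1 evaluated for theta_{a,b}."""
    v, e, alpha = theta_parameters(a, b)
    balance = alpha - 2 + Fraction(v - 2, e - 1)
    return {
        # exponent of n in k^{e(F)} n^{v(F)-(2-alpha)e(F)}
        "n_exponent_in_count": v - (2 - alpha) * e,
        # exponent of k in the first term of the minimum
        "first_min_exponent": 1 / (2 - alpha),
        # exponent of n in the second term of the minimum
        "second_min_n_exponent": balance,
        # exponent of n in the threshold on p
        "threshold_n_exponent": -balance / (alpha - 1),
        # power of log n produced by raising log^2 n to the power 1/(alpha-1)
        "log_power": 2 / (alpha - 1),
    }
```

For every $a,b$ this returns $2$, $\frac{b}{b-1}$, $\frac{b-1}{b(ab-1)}$, $-\frac{b-1}{ab-1}$ and $2b$ respectively, matching the terms $k^{ab}n^{2}$, $k^{b/(b-1)}$ and $kn^{\frac{b-1}{b(ab-1)}}$ used in Section 5 and the exponents in the range of $p$ for the first case of the main result.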

7 Concluding Remarks

In this paper we established upper bounds for $\mathrm{ex}(G_{n,p},\theta_{a,b})$ which are essentially tight whenever $a$ is sufficiently large in terms of $b$. It would be of interest if one could extend our ideas to prove effective upper bounds on $\mathrm{ex}(G_{n,p},F)$ for other $F$. In particular, one might hope to prove upper bounds for powers of rooted trees.

More precisely, given a tree $T$, a set $R\subseteq V(T)$, and an integer $a$, we define $T_{R}^{a}$ to be the graph consisting of $a$ copies of $T$ which agree only on the set $R$. For example, if $T$ is a path of length $b$ and $R$ consists of its two endpoints, then $T_{R}^{a}=\theta_{a,b}$, and if $T=K_{s,1}$ and $R$ is its set of leaves, then $T_{R}^{a}=K_{s,a}$. In particular, the only bipartite graphs for which we know tight bounds for $\mathrm{ex}(G_{n,p},F)$, namely theta graphs and complete bipartite graphs, are examples of powers of trees.

Question 7.1.

Can one prove essentially tight bounds on $\mathrm{ex}(G_{n,p},T_{R}^{a})$ for other powers of rooted trees?

The best upper bounds for this problem come from the general bounds of Jiang and Longbrake [17], and the best lower bound comes from [28]. We note that, analogous to the situation for theta graphs prior to this paper, the lower bound of [28] depends only on the tree $T$ while the upper bound of [17] depends on $a$, and as such the gaps between these bounds grow large as $a$ increases. Similar to the situation in the present paper, we suspect that the lower bound is closer to the truth, and in particular, Conjecture 1.2 claims that in many cases, the lower bound from [28] should be the correct answer.

Solving Question 7.1 for all rooted trees is likely impossible. Indeed, even the $p=1$ case, namely that of determining the Turán number $\mathrm{ex}(n,T_{R}^{a})$, is an important open problem of Bukh and Conlon [4] related to the rational exponents conjecture. That being said, there are a number of special cases where this Turán number is known [8, 9, 15, 16, 18], and it might be possible to generalize our ideas to deal with some of these cases in the random setting. A more detailed discussion on this problem can be found in the concluding remarks of [28].

To prove 1.3, we first proved a balanced supersaturation result, 5.2, which is essentially optimal for $a\geq 100$. It would be desirable to do this for all $a$.

Question 7.2.

Can one extend 5.2 to hold for all $a\geq 3$?

Note that the $a=2$ case is already dealt with by Morris and Saxton [20]. Solving this question, in addition to being desirable from a philosophical standpoint, might lead to a simpler proof of 5.2 which could more easily generalize to solving Question 7.1. The simplest way to resolve this question would be to resolve the following.

Question 7.3.

Can one extend Proposition 2.1 to hold with $|X|\geq\varepsilon m$?

An affirmative answer here would not only give an affirmative answer to Question 7.2, but also would allow one to avoid many of the messy technical details in our proof. Namely, with this one can alter the definition of $D_{s,t}$ in such a way that the $D_{t}$ functions are no longer needed, and such that the computations for proving 5.1 are much simpler.

Acknowledgments. We thank Rob Morris for useful comments about the presentation of this paper.

References

  • [1] József Balogh, Robert Morris and Wojciech Samotij “Independent sets in hypergraphs” In J. Amer. Math. Soc. 28.3, 2015, pp. 669–709 DOI: 10.1090/S0894-0347-2014-00816-X
  • [2] J. A. Bondy and M. Simonovits “Cycles of even length in graphs” In J. Combinatorial Theory Ser. B 16, 1974, pp. 97–105 DOI: 10.1016/0095-8956(74)90052-5
  • [3] Boris Bukh “Extremal graphs without exponentially-small bicliques” In arXiv:2107.04167, 2022
  • [4] Boris Bukh and David Conlon “Rational exponents in extremal graph theory” In J. Eur. Math. Soc. (JEMS) 20.7, 2018, pp. 1747–1757 DOI: 10.4171/JEMS/798
  • [5] Maurı́cio Collares Neto and Robert Morris “Maximum-size antichains in random set-systems” In Random Structures Algorithms 49.2, 2016, pp. 308–321 DOI: 10.1002/rsa.20647
  • [6] D. Conlon and W. T. Gowers “Combinatorial theorems in sparse random sets” In Ann. of Math. (2) 184.2, 2016, pp. 367–454 DOI: 10.4007/annals.2016.184.2.2
  • [7] David Conlon “Graphs with few paths of prescribed length between any two vertices” In Bull. Lond. Math. Soc. 51.6, 2019, pp. 1015–1021 DOI: 10.1112/blms.12295
  • [8] David Conlon and Oliver Janzer “Rational exponents near two” In Adv. Comb., 2022, pp. 10pp
  • [9] David Conlon, Oliver Janzer and Joonkyung Lee “More on the extremal number of subdivisions” In Combinatorica 41.4, 2021, pp. 465–494 DOI: 10.1007/s00493-020-4202-1
  • [10] Jan Corsten and Tuan Tran “Balanced supersaturation for some degenerate hypergraphs” In J. Graph Theory 97.4, 2021, pp. 600–623 DOI: 10.1002/jgt.22674
  • [11] P. Erdös and A. H. Stone “On the structure of linear graphs” In Bull. Amer. Math. Soc. 52, 1946, pp. 1087–1091 DOI: 10.1090/S0002-9904-1946-08715-7
  • [12] R. J. Faudree and M. Simonovits “On a class of degenerate extremal graph problems” In Combinatorica 3.1, 1983, pp. 83–93 DOI: 10.1007/BF02579343
  • [13] Zoltán Füredi “Random Ramsey graphs for the four-cycle” In Discrete Math. 126.1-3, 1994, pp. 407–410 DOI: 10.1016/0012-365X(94)90287-9
  • [14] Zoltán Füredi and Miklós Simonovits “The history of degenerate (bipartite) extremal graph problems” In Erdös centennial 25, Bolyai Soc. Math. Stud. János Bolyai Math. Soc., Budapest, 2013, pp. 169–264 DOI: 10.1007/978-3-642-39286-3_7
  • [15] Oliver Janzer “The extremal number of the subdivisions of the complete bipartite graph” In SIAM J. Discrete Math. 34.1, 2020, pp. 241–250 DOI: 10.1137/19M1269798
  • [16] Tao Jiang, Zilin Jiang and Jie Ma “Negligible obstructions and Turán exponents” In Ann. Appl. Math. 38.3, 2022, pp. 356–384
  • [17] Tao Jiang and Sean Longbrake “Balanced supersaturation and Turán numbers in random graphs” In arXiv:2208.10572, 2022
  • [18] Dong Yeap Kang, Jaehoon Kim and Hong Liu “On the rational Turán exponents conjecture” In J. Combin. Theory Ser. B 148, 2021, pp. 149–172 DOI: 10.1016/j.jctb.2020.12.003
  • [19] T. Kövari, V. T. Sós and P. Turán “On a problem of K. Zarankiewicz” In Colloq. Math. 3, 1954, pp. 50–57 DOI: 10.4064/cm-3-1-50-57
  • [20] Robert Morris and David Saxton “The number of $C_{2\ell}$-free graphs” In Adv. Math. 298, 2016, pp. 534–580 DOI: 10.1016/j.aim.2016.05.001
  • [21] Dhruv Mubayi and Liana Yepremyan “On The Random Turán number of linear cycles” In arXiv:2304.15003, 2023
  • [22] Jiaxi Nie “Random Turán theorem for expansions of spanning subgraphs of tight trees” In arXiv:2305.04193, 2023
  • [23] Jiaxi Nie “Turán theorems for even cycles in random hypergraph” In arXiv:2304.14588, 2023
  • [24] Jiaxi Nie, Sam Spiro and Jacques Verstraëte “Triangle-free subgraphs of hypergraphs” In Graphs Combin. 37.6, 2021, pp. 2555–2570 DOI: 10.1007/s00373-021-02388-5
  • [25] Vojtěch Rödl and Mathias Schacht “Extremal results in random graphs” In Erdös centennial 25, Bolyai Soc. Math. Stud. János Bolyai Math. Soc., Budapest, 2013, pp. 535–583 DOI: 10.1007/978-3-642-39286-3_20
  • [26] David Saxton and Andrew Thomason “Hypergraph containers” In Invent. Math. 201.3, 2015, pp. 925–992 DOI: 10.1007/s00222-014-0562-8
  • [27] Mathias Schacht “Extremal results for random discrete structures” In Ann. of Math. (2) 184.2, 2016, pp. 333–365 DOI: 10.4007/annals.2016.184.2.1
  • [28] Sam Spiro “Random Polynomial Graphs for Random Turán Problems” In arXiv:2212.08050, 2022
  • [29] Sam Spiro and Jacques Verstraëte “Relative Turán problems for uniform hypergraphs” In SIAM J. Discrete Math. 35.3, 2021, pp. 2170–2191 DOI: 10.1137/20M1364631

Appendix A Proof of Theorem 6.1

Throughout this section, we say that a graph $F$ is $\alpha$-good, with $1<\alpha<2$ a real number, if it satisfies the following balanced supersaturation condition: there exist real numbers $C,k_{0}>0$ such that for every $n$-vertex graph $G$ with $e(G)=kn^{\alpha}$ and $k\geq k_{0}$, there exists a hypergraph $\mathcal{H}$ on $E(G)$ whose hyperedges are copies of $F$, such that $|\mathcal{H}|\geq C^{-1}k^{e(F)}n^{v(F)-(2-\alpha)e(F)}$, and such that for every $\sigma\subseteq E(G)$ with $1\leq|\sigma|\leq e(F)-1$, we have

\[\deg_{\mathcal{H}}(\sigma)\leq\frac{Ck^{e(F)}n^{v(F)-(2-\alpha)e(F)}}{kn^{\alpha}\left(\min\left\{k^{\frac{1}{2-\alpha}},\,kn^{\alpha-2+\frac{v(F)-2}{e(F)-1}}\right\}\right)^{|\sigma|-1}}.\]

Here we prove Theorem 6.1, i.e. that if $F$ is $\alpha$-good, then $\mathrm{ex}(G_{n,p},F)=O(p^{\alpha-1}n^{\alpha})$ w.h.p. for all $p\geq\left(n^{\alpha-2+\frac{v(F)-2}{e(F)-1}}/\log^{2}n\right)^{\frac{-1}{\alpha-1}}$. We emphasize that our proof is nearly word-for-word the same as that of Morris and Saxton [20]. We make use of the following definition from [26].

Definition 9.

Given an $r$-uniform hypergraph $\mathcal{H}$ and a real number $\tau$, define

\[\delta(\mathcal{H},\tau)\,=\,\frac{1}{e(\mathcal{H})}\,\sum_{j=2}^{r}\,\frac{1}{\tau^{j-1}}\sum_{v\in V(\mathcal{H})}d^{(j)}(v),\]

where

\[d^{(j)}(v)\,=\,\max\big\{\deg_{\mathcal{H}}(\sigma)\,:\,v\in\sigma\subseteq V(\mathcal{H})\textup{ and }|\sigma|=j\big\}\]

denotes the maximum degree in $\mathcal{H}$ of a $j$-set containing $v$.

We remark that we have removed some extraneous constants from the definition in [26], since these do not affect the formulation of the theorem below. We also note that $\delta$ is typically called a codegree function, but we emphasize that this has no relation to the definition of codegree functions that we used throughout our paper.

The following container theorem was proved by Balogh, Morris and Samotij [1, Proposition 3.1] and by Saxton and Thomason [26, Theorem 6.2], where here the notation $S^{(\leq t)}$ denotes the collection of all subsets of $S$ of size at most $t$.

(Footnote 6: To be precise, Theorem 6.2 in [26] is stated where $T$ is a tuple of vertex sets rather than a single vertex set, but it is straightforward to deduce this form from the methods of [26].)

Theorem A.1.

Let $r\geq 2$ and let $0<\delta<\delta_{0}(r)$ be sufficiently small. Let $\mathcal{H}$ be an $r$-graph with $N$ vertices, and suppose that $\delta(\mathcal{H},\tau)\leq\delta$ for some $\tau>0$. Then there exists a collection $\mathcal{C}$ of subsets of $V(\mathcal{H})$, and a function $f\colon V(\mathcal{H})^{(\leq\tau N/\delta)}\to\mathcal{C}$ such that:

  • (a) For every independent set $I$, there exists $T\subset I$ with $|T|\leq\tau N/\delta$ and $I\subset f(T)$, and

  • (b) $e\big(\mathcal{H}[C]\big)\leq\big(1-\delta\big)e(\mathcal{H})$ for every $C\in\mathcal{C}$.

We will refer to the collection $\mathcal{C}$ as the containers of $\mathcal{H}$, since, by (a), every independent set is contained in some member of $\mathcal{C}$. The reader should think of $V(\mathcal{H})$ as being the edge set of some underlying graph $G$, and $E(\mathcal{H})$ as encoding (some subset of) the copies of a graph $F$ in $G$. Thus every $F$-free subgraph of $G$ is an independent set of $\mathcal{H}$.

Let us introduce some notation to simplify the statements which follow. Given a graph $F$ and real number $\alpha$, let $\mathcal{I}=\mathcal{I}(n)$ denote the collection of $F$-free graphs with $n$ vertices, and let $\mathcal{G}=\mathcal{G}(n,k)$ denote the collection of all graphs with $n$ vertices and at most $kn^{\alpha}$ edges. By a colored graph, we mean a graph together with an arbitrary labelled partition of its edge set.

Theorem A.2.

If $F$ is $\alpha$-good, then there exists a constant $C=C(F)$ such that the following holds for all sufficiently large $n,k\in\mathbb{N}$ with $k\leq\left(n^{\alpha-2+\frac{v(F)-2}{e(F)-1}}/\log^{2}n\right)^{\frac{2-\alpha}{\alpha-1}}$. There exists a collection $\mathcal{S}$ of colored graphs with $n$ vertices and at most $Ck^{-\frac{\alpha-1}{2-\alpha}}\cdot n^{\alpha}$ edges, as well as functions

\[g\colon\mathcal{I}\to\mathcal{S}\qquad\text{and}\qquad h\colon\mathcal{S}\to\mathcal{G}(n,k)\]

with the following properties:

  • (a) For every $s\geq 0$, the number of colored graphs in $\mathcal{S}$ with $s$ edges is at most

    \[\bigg(\frac{Cn^{\alpha}}{s}\bigg)^{\frac{1}{\alpha-1}\cdot s}\cdot\exp\Big(Ck^{-\frac{\alpha-1}{2-\alpha}}\cdot n^{\alpha}\Big).\]

  • (b) $g(I)\subset I\subset g(I)\cup h(g(I))$ for every $I\in\mathcal{I}$.

We prove Theorem A.2 by iterating the following container result.

Proposition A.3.

If $F$ is an $\alpha$-good graph, then there exist $k_{0}\in\mathbb{N}$ and $\varepsilon>0$ such that the following holds for every $k\geq k_{0}$ and every $n\in\mathbb{N}$. Set

\[\mu\,=\,\frac{1}{\varepsilon}\cdot\max\Big\{k^{-\frac{\alpha-1}{2-\alpha}},\,n^{-\left(\alpha-2+\frac{v(F)-2}{e(F)-1}\right)}\Big\}. \tag{23}\]

Given a graph $G$ with $n$ vertices and $kn^{\alpha}$ edges, there exists a function $f_{G}$ that maps subgraphs of $G$ to subgraphs of $G$, such that, for every $F$-free subgraph $I\subset G$,

  • (a) There exists a subgraph $T=T(I)\subset I$ with $e(T)\leq\mu n^{\alpha}$ and $I\subset f_{G}(T)$, and

  • (b) $e\big(f_{G}(T(I))\big)\leq(1-\varepsilon)e(G)$.

Proof.

By definition of $F$ being $\alpha$-good, there exist real numbers $C,k_{0}>0$ and a hypergraph $\mathcal{H}$ on $E(G)$ whose hyperedges are copies of $F$, such that

  • (i) $|\mathcal{H}|\geq C^{-1}k^{e(F)}n^{v(F)-(2-\alpha)e(F)}$, and

  • (ii) For every $\sigma\subseteq E(G)$ with $1\leq|\sigma|\leq e(F)-1$, we have

    \[\deg_{\mathcal{H}}(\sigma)\leq\frac{Ck^{e(F)}n^{v(F)-(2-\alpha)e(F)}}{kn^{\alpha}\left(\min\left\{k^{\frac{1}{2-\alpha}},\,kn^{\alpha-2+\frac{v(F)-2}{e(F)-1}}\right\}\right)^{|\sigma|-1}}.\]

Let $\delta:=C^{-1}$, and without loss of generality we can assume $C$ is sufficiently large so that Theorem A.1 holds with $r=e(F)$ and this choice of $\delta$. We will show that if

\[\frac{1}{\tau}=\delta^{4}k\cdot\min\Big\{k^{\frac{\alpha-1}{2-\alpha}},\,n^{\alpha-2+\frac{v(F)-2}{e(F)-1}}\Big\}=\delta^{4}\cdot\min\Big\{k^{\frac{1}{2-\alpha}},\,kn^{\alpha-2+\frac{v(F)-2}{e(F)-1}}\Big\},\]

then it follows from (i) and (ii) that $\delta(\mathcal{H},\tau)\leq\delta$. Indeed, since $v(\mathcal{H})=e(G)=kn^{\alpha}$, we have

\begin{align*}
\delta(\mathcal{H},\tau)&\,=\,\frac{1}{e(\mathcal{H})}\,\sum_{j=2}^{e(F)}\,\frac{1}{\tau^{j-1}}\cdot\sum_{v\in V(\mathcal{H})}d^{(j)}(v)\\
&\,\leq\,\frac{v(\mathcal{H})}{e(\mathcal{H})}\Bigg[\sum_{j=2}^{e(F)}\frac{1}{\tau^{j-1}}\cdot\frac{Ck^{e(F)}n^{v(F)-(2-\alpha)e(F)}}{kn^{\alpha}\left(\min\left\{k^{\frac{1}{2-\alpha}},\,kn^{\alpha-2+\frac{v(F)-2}{e(F)-1}}\right\}\right)^{j-1}}\Bigg]\\
&\,\leq\,\frac{kn^{\alpha}}{C^{-1}k^{e(F)}n^{v(F)-(2-\alpha)e(F)}}\Bigg[\sum_{j=2}^{e(F)}\left(\delta^{4}\cdot\min\Big\{k^{\frac{1}{2-\alpha}},\,kn^{\alpha-2+\frac{v(F)-2}{e(F)-1}}\Big\}\right)^{j-1}\cdot\frac{Ck^{e(F)}n^{v(F)-(2-\alpha)e(F)}}{kn^{\alpha}\left(\min\left\{k^{\frac{1}{2-\alpha}},\,kn^{\alpha-2+\frac{v(F)-2}{e(F)-1}}\right\}\right)^{j-1}}\Bigg]\\
&\,\leq\,\sum_{j=2}^{e(F)}C^{2}\delta^{4(j-1)}\,\leq\,\delta,
\end{align*}

where this last bound holds provided $\delta=C^{-1}$ is sufficiently small, which we can assume to be the case without loss of generality. Thus, applying Theorem A.1 and setting $\varepsilon=\delta^{5}$, we obtain a collection $\mathcal{C}$ of subgraphs of $G$ and a function $f_{G}$ mapping subgraphs of $G$ to elements of $\mathcal{C}$ so that for every $F$-free subgraph $I\subset G$, there exists a subgraph $T=T(I)\subset I$ with

\[e(T)\ \leq\ \tau N/\delta\ \leq\ \frac{n^{\alpha}}{\varepsilon}\cdot\max\Big\{k^{-\frac{\alpha-1}{2-\alpha}},\,n^{-\left(\alpha-2+\frac{v(F)-2}{e(F)-1}\right)}\Big\}\ =\ \mu n^{\alpha}\]

and $I\subset f_{G}(T)$, and also

\[e\big(\mathcal{H}[C]\big)\leq\big(1-\delta\big)e(\mathcal{H})\text{ for all }C\in\mathcal{C}. \tag{24}\]

It only remains to show that this second condition implies $e(C)\leq(1-\varepsilon)e(G)$ for every $C\in\mathcal{C}$ (notice that the first inequality is about hyperedges and the second is about graph edges). To prove this, for each $C\in\mathcal{C}$ set

\[\mathcal{D}(C)\,=\,E(\mathcal{H})\setminus E(\mathcal{H}[C])\,=\,\big\{e\in E(\mathcal{H})\,:\,v\in e\text{ for some }v\in V(\mathcal{H})\setminus C\big\},\]

and recall that $\deg_{\mathcal{H}}(v)\leq e(\mathcal{H})/\big(\delta^{2}kn^{\alpha}\big)$ for every $v\in V(\mathcal{H})$, by (i), (ii), and $\delta=C^{-1}$. Therefore,

\[|\mathcal{D}(C)|\,\leq\,\frac{e(\mathcal{H})}{\delta^{2}kn^{\alpha}}\cdot|E(G)\setminus C|.\]

On the other hand, we have $|\mathcal{D}(C)|=e(\mathcal{H})-e(\mathcal{H}[C])\geq\delta e(\mathcal{H})$ by condition (24), and so

\[|E(G)\setminus C|\,\geq\,\delta^{3}kn^{\alpha}\geq\varepsilon e(G),\]

as required. Hence the proposition follows with $\varepsilon=\delta^{5}$. ∎
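The step in the proof above asserting that $\sum_{j=2}^{e(F)}C^{2}\delta^{4(j-1)}\leq\delta$ once $\delta=C^{-1}$ is small can be illustrated numerically. The sketch below is our own check with integer $C$ and exact fractions; `r` plays the role of $e(F)$, and the ranges tested are arbitrary.

```python
from fractions import Fraction


def codegree_sum(C: int, r: int) -> Fraction:
    """Sum over j = 2..r of C^2 * delta^(4(j-1)) with delta = 1/C."""
    delta = Fraction(1, C)
    return sum(C**2 * delta ** (4 * (j - 1)) for j in range(2, r + 1))


def bound_holds(C: int, r: int) -> bool:
    """Whether the sum is at most delta = 1/C."""
    return codegree_sum(C, r) <= Fraction(1, C)
```

Since the sum is a geometric series bounded by $\delta^{2}/(1-\delta^{4})$, the inequality already holds for $C=2$, whereas it fails for $C=1$ as soon as $r\geq 3$.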

With this in hand, we continue on towards the proof of Theorem A.2. We will need the following straightforward lemma (see, for example, [5, Lemma 4.3]).

Lemma A.4.

Let $M>0$, $s>0$ and $0<\delta<1$. If $a_{1},\ldots,a_{m}\in\mathbb{R}$ satisfy $s=\sum_{j}a_{j}$ and $1\leq a_{j}\leq(1-\delta)^{j}M$ for each $j\in[m]$, then

\[s\log s\,\leq\,\sum_{j=1}^{m}a_{j}\log a_{j}+O(M).\]

We can now deduce Theorem A.2.

Proof of Theorem A.2.

We construct the functions $g$ and $h$ and the family $\mathcal{S}$ as follows. Given an $F$-free graph $I\in\mathcal{I}$, we repeatedly apply Proposition A.3, first to the complete graph $G_{0}=K_{n}$, then to the graph $G_{1}=f_{G_{0}}(T_{1})\setminus T_{1}$, where $T_{1}\subset I$ is the set guaranteed to exist by part (a), then to the graph $G_{2}=f_{G_{1}}(T_{2})\setminus T_{2}$, where $T_{2}\subset I\cap G_{1}=I\setminus T_{1}$, and so on. We continue until we arrive at a graph $G_{m}$ with at most $kn^{\alpha}$ edges, and set

\[g(I)=(T_{1},\ldots,T_{m})\qquad\text{and}\qquad h\big(g(I)\big)=G_{m}.\]

Since $G_{m}$ depends only on the sequence $(T_{1},\ldots,T_{m})$, the function $h$ is well-defined.

It remains to bound the number of colored graphs in $\mathcal{S}$ with $s$ edges. To do so, it suffices to count the number of choices for the sequence of graphs $(T_{1},\ldots,T_{m})$ with $\sum_{j}e(T_{j})=s$. For each $j\geq 1$, define $k(j)$ and $\mu(j)$ as follows:

\[e\big(G_{m-j}\big)=k(j)n^{\alpha}\quad\text{and}\quad\mu(j)=\frac{1}{\varepsilon}\cdot\max\Big\{k(j)^{-\frac{\alpha-1}{2-\alpha}},\,n^{-\left(\alpha-2+\frac{v(F)-2}{e(F)-1}\right)}\Big\},\]

and note that

\[k(j)\geq(1-\varepsilon)^{-j+1}k,\qquad T_{j+1}\subset G_{j}\qquad\text{and}\qquad e(T_{m-j})\leq\mu(j)n^{\alpha}.\]

Thus, fixing $k$, $\varepsilon$ and $s$ as above, and writing

\[\mathcal{K}(m)\,=\,\Big\{\mathbf{k}=(k(1),\ldots,k(m))\,:\,(1-\varepsilon)^{-j+1}k\leq k(j)\leq n^{2-\alpha}\Big\}\]

for each $m\in\mathbb{N}$, and

\[\mathcal{A}(\mathbf{k})\,=\,\Big\{\mathbf{a}=(a(1),\ldots,a(m))\,:\,a(j)\leq\mu(j)n^{\alpha}\text{ and }\sum_{j}a(j)=s\Big\},\]

for each $\mathbf{k}\in\mathcal{K}(m)$, it follows that the number of colored graphs in $\mathcal{S}$ with $s$ edges is at most

\[\sum_{m=1}^{\infty}\sum_{\mathbf{k}\in\mathcal{K}(m)}\sum_{\mathbf{a}\in\mathcal{A}(\mathbf{k})}\prod_{j=1}^{m}\binom{k(j)n^{\alpha}}{a(j)}.\]

Given $m\in\mathbb{N}$, $\mathbf{k}\in\mathcal{K}(m)$ and $\mathbf{a}\in\mathcal{A}(\mathbf{k})$, let us partition the product over $j$ according to whether or not $\mu(j)=\frac{1}{\varepsilon}\cdot n^{-\left(\alpha-2+\frac{v(F)-2}{e(F)-1}\right)}$. Since $\mathcal{K}(m)=\emptyset$ if $m$ is at least some large constant times $\log n$, the product of the terms for which this is the case is at most

\[\big(n^{2}\big)^{\sum_{j}a(j)}\,\leq\,\exp\Big(O(1)\cdot n^{2-\frac{v(F)-2}{e(F)-1}}(\log n)^{2}\Big)\,\leq\,\exp\Big(O(1)\cdot k^{-\frac{\alpha-1}{2-\alpha}}n^{\alpha}\Big),\]

where in the last step we used the fact that $k\leq\left(n^{\alpha-2+\frac{v(F)-2}{e(F)-1}}/\log^{2}n\right)^{\frac{2-\alpha}{\alpha-1}}$. On the other hand, if $a(j)\leq k(j)^{-\frac{\alpha-1}{2-\alpha}}n^{\alpha}$, i.e. if $k(j)\leq(n^{\alpha}/a(j))^{\frac{2-\alpha}{\alpha-1}}$, then

\[\binom{k(j)n^{\alpha}}{a(j)}\,\leq\,\bigg(\frac{ek(j)n^{\alpha}}{a(j)}\bigg)^{a(j)}\,\leq\,\bigg(\frac{en^{\alpha}}{a(j)}\bigg)^{\frac{1}{\alpha-1}\cdot a(j)},\]

and hence, by Lemma A.4, the product over the remaining $j$ is at most

\[\bigg(\frac{C^{\prime}n^{\alpha}}{s}\bigg)^{\frac{1}{\alpha-1}\cdot s}\cdot\exp\Big(C^{\prime}k^{-\frac{\alpha-1}{2-\alpha}}n^{\alpha}\Big)\]

for some $C^{\prime}=C^{\prime}(F)$. Noting that $\sum_{m=1}^{\infty}\sum_{\mathbf{k}\in\mathcal{K}(m)}|\mathcal{A}(\mathbf{k})|=n^{O(\log n)}$ since $\mathcal{K}(m)=\emptyset$ for $m$ at least some large constant times $\log n$, the theorem follows. ∎
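Two elementary facts are used in the binomial estimate of the proof above: the standard bound $\binom{N}{a}\leq(eN/a)^{a}$, and the exponent identity $\frac{2-\alpha}{\alpha-1}+1=\frac{1}{\alpha-1}$ that turns the hypothesis on $k(j)$ into the second inequality. The following sketch checks both on small inputs; the names are ours, and the first check is done in logarithms to avoid large floating point values.

```python
import math
from fractions import Fraction


def binomial_bound_holds(N: int, a: int) -> bool:
    """Check log C(N, a) <= a * (1 + log(N / a)), i.e. C(N, a) <= (eN/a)^a."""
    return math.log(math.comb(N, a)) <= a * (1 + math.log(N / a))


def exponent_identity(alpha: Fraction) -> bool:
    """Check (2 - alpha)/(alpha - 1) + 1 == 1/(alpha - 1) exactly."""
    return (2 - alpha) / (alpha - 1) + 1 == 1 / (alpha - 1)
```

With the identity in hand, $k(j)n^{\alpha}/a(j)\leq(n^{\alpha}/a(j))^{\frac{1}{\alpha-1}}$ follows from the assumed bound on $k(j)$, and $e\leq e^{\frac{1}{\alpha-1}}$ because $\alpha<2$.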

We can now easily deduce Theorem 6.1.

Proof of Theorem 6.1.

Let $F$ be a graph satisfying the hypotheses of the theorem, i.e. a graph which is $\alpha$-good for some $1<\alpha<2$. Recall that we wish to show that for $p\geq\left(n^{\alpha-2+\frac{v(F)-2}{e(F)-1}}/\log^{2}n\right)^{-\frac{1}{\alpha-1}}$, we have $\mathrm{ex}(G_{n,p},F)=O(p^{\alpha-1}n^{\alpha})$ w.h.p. Given such a function $p=p(n)$, define $k=p^{-(2-\alpha)}$. Since $k\leq\left(Cn^{\alpha-2+\frac{v(F)-2}{e(F)-1}}/\log^{2}n\right)^{\frac{2-\alpha}{\alpha-1}}$, we can apply Theorem A.2 to get functions $g,h$. Suppose that there exists an $F$-free subgraph $I\subset G(n,p)$ with $m$ edges, and observe that $g(I)\subset G(n,p)$, and that $G(n,p)$ contains at least $m-e\big(g(I)\big)$ elements of $h(g(I))$. The probability of this event is therefore at most

\begin{align*}
\sum_{S\in\mathcal{S}}\binom{kn^{\alpha}}{m-e(S)}p^{m}&\,\leq\sum_{s=0}^{Ck^{-\frac{\alpha-1}{2-\alpha}}n^{\alpha}}\bigg(\frac{Cp^{\alpha-1}n^{\alpha}}{s}\bigg)^{\frac{1}{\alpha-1}\cdot s}\exp\Big(Ck^{-\frac{\alpha-1}{2-\alpha}}n^{\alpha}\Big)\bigg(\frac{3pkn^{\alpha}}{m-s}\bigg)^{m-s}\\
&\,\leq\,\exp\bigg[O(1)\cdot\Big(p^{\alpha-1}n^{\alpha}+k^{-\frac{\alpha-1}{2-\alpha}}n^{\alpha}\Big)\bigg]\bigg(\frac{4pkn^{\alpha}}{m}\bigg)^{m/2}\to 0
\end{align*}

as $n\to\infty$, as long as $m$ is a sufficiently large constant times

\[\max\Big\{pkn^{\alpha},\,k^{-\frac{\alpha-1}{2-\alpha}}n^{\alpha}\Big\}=p^{\alpha-1}n^{\alpha}.\]

We conclude that $\mathrm{ex}(G_{n,p},F)=O(p^{\alpha-1}n^{\alpha})$ w.h.p., giving the result. ∎
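The final equality, that both terms in the maximum equal $p^{\alpha-1}n^{\alpha}$ under the choice $k=p^{-(2-\alpha)}$, is a short exponent computation. As a hedged illustration, the sketch below tracks only the exponent of $p$ in each term; the function name is our own.

```python
from fractions import Fraction


def p_exponents(alpha: Fraction):
    """Exponents of p in p*k and in k^{-(alpha-1)/(2-alpha)} for k = p^{-(2-alpha)}."""
    k_exp = -(2 - alpha)  # k = p^{k_exp}
    first = 1 + k_exp     # exponent of p in p * k
    second = -Fraction(alpha - 1) / (2 - alpha) * k_exp
    return first, second
```

Both exponents come out to $\alpha-1$ for any $1<\alpha<2$, which is why this choice of $k$ balances the two terms.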

Appendix B Proof of Proposition 2.1

We emphasize that almost everything in this section will be nearly identical to Morris and Saxton [20]. We first recall the definitions and conventions introduced in Section 2:

  • We fixed a sequence of rapidly decreasing constants

    \[1\geq\varepsilon_{b}\geq\cdots\geq\varepsilon_{2}\geq\varepsilon_{1}>0\]

    which depend only on $b$. We also fixed some $m$-vertex graph $G$ with minimum degree $\ell m^{1/b}$ with $\ell$ (and hence $m$) sufficiently large in terms of the $\varepsilon_{t}$ constants.

  • For $x\in V(G)$, we say that a tuple $\mathcal{A}=(A_{0},A_{1},\ldots,A_{t})$ of (not necessarily disjoint) subsets of $V(G)$ is a concentrated $t$-neighborhood of $x$ if $A_{0}=\{x\}$, $|N(v)\cap A_{i}|\geq\varepsilon_{t}\ell m^{1/b}$ for all $v\in A_{i-1}$, and

    \[|A_{t}|\leq\ell^{(b-t)/(b-1)}m^{t/b}.\]

    We define $t(x)$ to be the minimal $t\geq 2$ such that there exists a concentrated $t$-neighborhood of $x$ in $G$.

  • Lemma 2.2 says that for some $2\leq t\leq b$, there exists $X\subseteq V(G)$ of size at least $\frac{1}{2}(4b)^{t-b}\ell^{(b-t)/(b-1)}m^{t/b}$ such that $t(x)=t$ for every $x\in X$, and such that for every $x\in X$ there exists a tuple of sets $\mathcal{A}=(A_{0},\ldots,A_{t})$ such that $A_{0}=\{x\}$, $|A_{t}|\leq\ell^{(b-t)/(b-1)}m^{t/b}$, $|N(y)\cap A_{i}|\geq\frac{1}{2}\varepsilon_{t}\ell m^{1/b}$ for all $y\in A_{i-1}$, and every $y\in\bigcup A_{i}$ has $t(y)\geq t$.

For the rest of this section we fix $t,X$ as in Lemma 2.2. We also fix some $\varepsilon>0$ sufficiently small compared to the $\varepsilon_{s}$ constants, as well as a set of forests $\mathcal{F}$ such that for every path $x_{1}\cdots x_{r}$ of $G$ which does not contain an element of $\mathcal{F}$ as a subgraph, there are at most $\varepsilon\ell m^{1/b}$ vertices $x_{r+1}\in N_{G}(x_{r})$ such that the path $x_{1}\cdots x_{r+1}$ contains an element of $\mathcal{F}$ as a subgraph. As much as possible we use the notation of Morris and Saxton’s original proof, and in particular, we drop our convention from the main part of the text that $u,v,w$ are used only as vertices of $\theta_{a,b}$.

We introduce some notation and definitions that will be used for the rest of the proof. Given a set of paths $\mathcal{P}$, we define the $(i,v)$-branching factor of $\mathcal{P}$ to be the maximum number $d$ such that there exist $d$ paths in $\mathcal{P}$ with $i$th vertex $v$ and pairwise distinct $(i+1)$st vertices. The branching factor of $\mathcal{P}$ is defined to be the maximum $(i,v)$-branching factor amongst all choices of $i,v$. We define $\mathcal{P}_{i,j}=\{u_{i}\cdots u_{j}:u_{0}\cdots u_{t}\in\mathcal{P}\}$ and $\mathcal{P}[u\to v]:=\{x_{1}\cdots x_{s}\in\mathcal{P}:x_{1}=u,\ x_{s}=v\}$, and also define $\mathcal{P}[u\to S]=\bigcup_{v\in S}\mathcal{P}[u\to v]$.
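To make the definition of the branching factor concrete, the following is a minimal sketch that computes it for a small explicit family of paths. The function names `successor_sets` and `branching_factor` are hypothetical, and paths are represented as tuples indexed from 0 (matching $u_0u_1\cdots u_t$); none of this appears in the original argument.

```python
from collections import defaultdict


def successor_sets(paths):
    """Map (i, v) to the set of (i+1)st vertices over all paths whose i-th vertex is v.

    Paths are tuples of vertices indexed from 0 (an assumption of this sketch).
    The (i, v)-branching factor is the size of the set stored under (i, v).
    """
    succ = defaultdict(set)
    for path in paths:
        for i in range(len(path) - 1):
            succ[(i, path[i])].add(path[i + 1])
    return succ


def branching_factor(paths):
    """Maximum (i, v)-branching factor over all choices of i and v."""
    return max((len(nxt) for nxt in successor_sets(paths).values()), default=0)


# Toy family of paths of length 2 starting at x.
paths = {("x", "a", "c"), ("x", "a", "d"), ("x", "b", "c")}
print(len(successor_sets(paths)[(1, "a")]))  # the (1, a)-branching factor
print(branching_factor(paths))
```

Counting distinct successors is equivalent to the definition, since a largest set of paths through $v$ with pairwise distinct next vertices uses each successor exactly once.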

One lemma that we will need in several places is the following.

Lemma B.1.

Let $\mathcal{R}$ be a collection of paths of length $s\geq 2$ in $G$ from a vertex $x\in V(G)$ to a set $B\subseteq V(G)$. Assume that $|B|\leq\ell^{(b-s)/(b-1)}m^{s/b}$, $|\mathcal{R}|>s\varepsilon_{s}(\ell m^{1/b})^{s}$, and that $\mathcal{R}$ has branching factor at most $\ell m^{1/b}$. Then $t(x)\leq s$.

Proof.

Form a subset \mathcal{R}^{\prime}\subseteq\mathcal{R} by starting with =\mathcal{R}^{\prime}=\mathcal{R} and then iteratively choosing i,vi,v such that the (i,v)(i,v)-branching factor of \mathcal{R}^{\prime} is less than εsm1/b\varepsilon_{s}\ell m^{1/b} and then deleting any paths which contain vv as their iith vertex. Let AiA_{i} be the set of vertices used as the iith vertex of some path of \mathcal{R}^{\prime}. If \mathcal{R}^{\prime} is non-empty, then (A0,,As)(A_{0},\ldots,A_{s}) is a concentrated ss-neighborhood of xx by construction, which shows t(x)st(x)\leq s.

Thus it suffices to show that \mathcal{R}^{\prime} is non-empty. We claim that the number of paths that were destroyed is at most sεs(m1/b)ss\cdot\varepsilon_{s}(\ell m^{1/b})^{s}. Indeed, because \mathcal{R} has branching factor at most m1/b\ell m^{1/b}, every destroyed path can be identified by choosing some index 0i<s0\leq i<s, starting a path at u0=xu_{0}=x, and then iteratively choosing the next vertex of the path uj+1u_{j+1} in at most m1/b\ell m^{1/b} ways for each jij\neq i and in at most εsm1/b\varepsilon_{s}\ell m^{1/b} ways when j=ij=i, proving the claim. Since |||\mathcal{R}| is strictly greater than the number of destroyed paths, \mathcal{R}^{\prime} is non-empty and the result follows. ∎
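The pruning procedure in this proof can be sketched as a small program. This is only an illustration on a toy family of paths: the function names `prune_low_branching` and `layers_of` are hypothetical, the threshold is passed in as a plain integer rather than $\varepsilon_s\ell m^{1/b}$, and paths are tuples indexed from 0.

```python
from collections import defaultdict


def prune_low_branching(paths, threshold):
    """Repeatedly pick (i, v) whose branching factor is below `threshold`
    and delete every path having v as its i-th vertex."""
    remaining = set(paths)
    while True:
        succ = defaultdict(set)
        for path in remaining:
            for i in range(len(path) - 1):
                succ[(i, path[i])].add(path[i + 1])
        low = [(i, v) for (i, v), nxt in succ.items() if len(nxt) < threshold]
        if not low:
            return remaining  # every surviving (i, v) has enough successors
        i, v = low[0]
        remaining = {path for path in remaining if path[i] != v}


def layers_of(paths):
    """A_i = set of vertices used as the i-th vertex of some surviving path."""
    if not paths:
        return []
    length = len(next(iter(paths)))
    return [{path[i] for path in paths} for i in range(length)]


# Toy example: the path through e has only one continuation and is pruned.
toy = {("x", "a", "c"), ("x", "a", "d"), ("x", "b", "c"),
       ("x", "b", "d"), ("x", "e", "c")}
survivors = prune_low_branching(toy, threshold=2)
print(sorted(survivors))
print(layers_of(survivors))
```

Each round deletes at least one path, so the loop terminates. The surviving family does not depend on the order in which low-branching pairs are chosen, because deleting paths can only decrease branching factors.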

The following definition will almost be strong enough to prove Proposition 2.1.

Definition 10.

Let $\mathcal{A}=(A_{0},\ldots,A_{t})$ be a collection of (not necessarily disjoint) sets of vertices of $G$ with $A_{0}=\{x\}$ and let $\mathcal{P}$ be a collection of paths of the form $xu_{1}\cdots u_{t}$ with $u_{i}\in A_{i}$ for all $i$. We say that $(\mathcal{A},\mathcal{P})$ is a balanced $t$-neighborhood of $x$ if the following conditions hold:

  1. (i)

    We have |A1|m1/b|A_{1}|\leq\ell m^{1/b} and |At|(bt)/(b1)mt/b|A_{t}|\leq\ell^{(b-t)/(b-1)}m^{t/b}.

  2. (ii)

    For every $0\leq i<j\leq t$ with $(i,j)\neq(0,t)$ and every $u\in A_{i},v\in A_{j}$, we have $|\mathcal{P}_{i,j}[u\to v]|\leq\ell^{(j-i-1)b/(b-1)}$.

  3. (iii)

    The branching factor of 𝒫\mathcal{P} is at most εtm1/b\varepsilon_{t}\ell m^{1/b}.

For the next lemma we recall that \mathcal{F} is a set of forests satisfying a property that depends on ε\varepsilon.

Lemma B.2.

If $x\in X$, then there exists a balanced $t$-neighborhood $(\mathcal{A},\mathcal{P})$ of $x$ with $|\mathcal{P}|\geq\frac{1}{2}(\frac{1}{4}\varepsilon_{t}\ell m^{1/b})^{t}$ such that no subgraph of any $P\in\mathcal{P}$ lies in $\mathcal{F}$, provided $\varepsilon$ is sufficiently small.

Proof.

Let $\mathcal{A}$ be the tuple of sets guaranteed by Lemma 2.2. We may assume that $|A_{1}|\leq\ell m^{1/b}$, as otherwise we can just remove vertices from $A_{1}$ while maintaining all the properties guaranteed by Lemma 2.2. For each $v\in A_{i-1}$, let $Q_{i}(v)$ be an arbitrary subset of $N(v)\cap A_{i}$ of size $\frac{1}{2}\varepsilon_{t}\ell m^{1/b}$. Let $\mathcal{Q}$ be the set of paths $xu_{1}\cdots u_{t}$ generated as follows. Given $xu_{1}\cdots u_{i-1}$, select any $u_{i}\in Q_{i}(u_{i-1})$ such that $u_{i}\notin\{x,u_{1},\ldots,u_{i-1}\}$ and such that no subgraph of $xu_{1}\cdots u_{i}$ is contained in $\mathcal{F}$. Note that the number of choices at each step is at least

12εtm1/btεm1/b14εtm1/b,\frac{1}{2}\varepsilon_{t}\ell m^{1/b}-t-\varepsilon\ell m^{1/b}\geq\frac{1}{4}\varepsilon_{t}\ell m^{1/b},

with the last step holding if ε\varepsilon is sufficiently small (which also implies m1/b\ell m^{1/b} is sufficiently large compared to εt1t\varepsilon_{t}^{-1}t). This means

|𝒬|(14εtm1/b)t.|\mathcal{Q}|\geq\left(\frac{1}{4}\varepsilon_{t}\ell m^{1/b}\right)^{t}. (25)

Note that by construction, every path in 𝒬\mathcal{Q} avoids \mathcal{F}.

We now remove some paths from 𝒬\mathcal{Q} to produce 𝒫\mathcal{P}. If there exists 0i<jt0\leq i<j\leq t with (i,j)(0,t)(i,j)\neq(0,t) and vertices uAi,vAju\in A_{i},v\in A_{j} and |𝒬i,j[uv]|>(ji1)b/(b1)|\mathcal{Q}_{i,j}[u\to v]|>\ell^{(j-i-1)b/(b-1)}, then choose a path xu1ut𝒬xu_{1}\cdots u_{t}\in\mathcal{Q} with ui=uu_{i}=u and uj=vu_{j}=v and delete this path from 𝒬\mathcal{Q}. Repeat this until no such paths remain in 𝒬\mathcal{Q}, and let 𝒫\mathcal{P} be the resulting set of paths. By construction (𝒜,𝒫)(\mathcal{A},\mathcal{P}) is a balanced neighborhood, so it suffices to show |𝒫||\mathcal{P}| is large.

We say that a pair of vertices (u,v)(u,v) is (i,j)(i,j)-unbalanced if uAi,vAju\in A_{i},v\in A_{j} and |𝒬i,j[uv]|>(ji1)b/(b1)|\mathcal{Q}_{i,j}[u\to v]|>\ell^{(j-i-1)b/(b-1)} (we emphasize that this condition involves the original family 𝒬\mathcal{Q} before any paths are deleted). Let (i,j)={xu1uj𝒬0,j:(ui,uj) is (i,j)-unbalanced}\mathcal{R}(i,j)=\{xu_{1}\cdots u_{j}\in\mathcal{Q}_{0,j}:(u_{i},u_{j})\textrm{ is }(i,j)\textrm{-unbalanced}\}. We claim that

|(i,j)|t24b1(εtm1/b)j|\mathcal{R}(i,j)|\leq t^{-2}4^{-b-1}(\varepsilon_{t}\ell m^{1/b})^{j} (26)

for all 0i<jt0\leq i<j\leq t with (i,j)(0,t)(i,j)\neq(0,t). Assuming this is true, this fact together with the branching factor of 𝒬\mathcal{Q} implies that the number of paths removed is at most

i,j|(i,j)|(εtm1/b)tj12(14εtm1/b)t,\sum_{i,j}|\mathcal{R}(i,j)|(\varepsilon_{t}\ell m^{1/b})^{t-j}\leq\frac{1}{2}\left(\frac{1}{4}\varepsilon_{t}\ell m^{1/b}\right)^{t},

with this last step holding if $\ell m^{1/b}$ is sufficiently large (which holds if $\varepsilon$ is sufficiently small). From this and (25), we conclude that the remaining set of paths $\mathcal{P}$ has the desired properties and is as large as claimed. It thus remains to prove (26).

Fix (i,j)(0,t)(i,j)\neq(0,t) and let s:=jis:=j-i. If s=1s=1 then (i,j)=\mathcal{R}(i,j)=\emptyset (since no pair of vertices can be (i,i+1)(i,i+1)-unbalanced), so we may assume s2s\geq 2. Observe that

|(i,j)|uAi|(i,j)0,i[xu]||(i,j)i,j[uAj]|(εtm1/b)imaxuAi|(i,j)i,j[uAj]|,|\mathcal{R}(i,j)|\leq\sum_{u\in A_{i}}|\mathcal{R}(i,j)_{0,i}[x\to u]|\cdot|\mathcal{R}(i,j)_{i,j}[u\to A_{j}]|\leq(\varepsilon_{t}\ell m^{1/b})^{i}\cdot\max_{u\in A_{i}}|\mathcal{R}(i,j)_{i,j}[u\to A_{j}]|,

where the second inequality used (i,j)0,i𝒬0,i\mathcal{R}(i,j)_{0,i}\subseteq\mathcal{Q}_{0,i} which has branching factor at most εtm1/b\varepsilon_{t}\ell m^{1/b}. Thus if we assume for contradiction that (26) does not hold, then there must exist some uAiu\in A_{i} such that

|(i,j)i,j[uAj]|>t24b1(εtm1/b)ssεs(m1/b)s,|\mathcal{R}(i,j)_{i,j}[u\to A_{j}]|>t^{-2}4^{-b-1}(\varepsilon_{t}\ell m^{1/b})^{s}\geq s\varepsilon_{s}(\ell m^{1/b})^{s}, (27)

with this last step holding if the εs\varepsilon_{s^{\prime}} constants decrease sufficiently quickly. Let

B:={ujAj:xu1uj(i,j),ui=u}.B:=\{u_{j}\in A_{j}:\exists xu_{1}\cdots u_{j}\in\mathcal{R}(i,j),\ u_{i}=u\}.

Note that $|\mathcal{R}(i,j)_{i,j}[u\to A_{j}]|\leq(\varepsilon_{t}\ell m^{1/b})^{j-i}$ because $\mathcal{Q}$ has branching factor at most $\varepsilon_{t}\ell m^{1/b}$, and that each $v\in B$ is the last vertex of more than $\ell^{(j-i-1)b/(b-1)}$ paths of $\mathcal{R}(i,j)_{i,j}[u\to A_{j}]$ (since by definition of $\mathcal{R}(i,j)$, such a pair $(u,v)$ must be $(i,j)$-unbalanced). Using $s=j-i$ gives

|B|(εtm1/b)s(s1)b/(b1)=εts(bs)/(b1)ms/b(bs)/(b1)ms/b.|B|\leq\frac{(\varepsilon_{t}\ell m^{1/b})^{s}}{\ell^{(s-1)b/(b-1)}}=\varepsilon_{t}^{s}\ell^{(b-s)/(b-1)}m^{s/b}\leq\ell^{(b-s)/(b-1)}m^{s/b}.

With this and (27), we can apply Lemma B.1 to $\mathcal{R}(i,j)_{i,j}[u\to A_{j}]$ to conclude $t(u)\leq s<t(x)=t$. This contradicts the fact that $u\in A_{i}$, since Lemma 2.2 guarantees $t(y)\geq t$ for every $y\in\bigcup A_{i}$. ∎

A key fact about balanced neighborhoods is the following.

Lemma B.3.

Let (𝒜,𝒫)(\mathcal{A},\mathcal{P}) be a balanced tt-neighborhood of xV(G)x\in V(G). If ε\varepsilon is sufficiently small, then for any yAty\in A_{t} and non-empty set of vertices SV(G){x,y}S\subseteq V(G)\setminus\{x,y\}, there are at most ε1(t1|S|)b/(b1)\varepsilon^{-1}\ell^{(t-1-|S|)b/(b-1)} paths in 𝒫[xy]\mathcal{P}[x\to y] containing SS.

Proof.

Let S={z1,,zr}S=\{z_{1},\ldots,z_{r}\}, and for ease of notation let z0=xz_{0}=x and zr+1=yz_{r+1}=y. Given a sequence 0=i0<i1<<ir<ir+1=t0=i_{0}<i_{1}<\cdots<i_{r}<i_{r+1}=t, the number of paths xu1ut1y𝒫[xy]xu_{1}\cdots u_{t-1}y\in\mathcal{P}[x\to y] with uij=zju_{i_{j}}=z_{j} is at most

\[\prod_{j=0}^{r}|\mathcal{P}_{i_{j},i_{j+1}}[z_{j}\to z_{j+1}]|\leq\prod_{j=0}^{r}\ell^{(i_{j+1}-i_{j}-1)b/(b-1)}=\ell^{(i_{r+1}-i_{0}-(r+1))b/(b-1)}=\ell^{(t-1-|S|)b/(b-1)}.\]

Every path containing $S$ can be formed in this way, possibly by reordering the elements of $S$ and by choosing different indices $i_{j}$. The number of ways of doing this is bounded by a constant depending only on $b$, which is at most $\varepsilon^{-1}$ when $\varepsilon$ is sufficiently small, so the result follows. ∎
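For concreteness, the two computations used in this proof can be written out as follows. Here $r=|S|\leq t-1$, and the explicit constant $(t-1)!\,2^{t-1}$ is our own crude bound for illustration rather than one stated in the original argument.

```latex
\begin{align*}
\sum_{j=0}^{r}\big(i_{j+1}-i_{j}-1\big) &= (i_{r+1}-i_{0})-(r+1) = t-1-|S|,\\
\#\{\text{orderings of } S \text{ and indices } 0<i_{1}<\cdots<i_{r}<t\} &\leq r!\binom{t-1}{r} \leq (t-1)!\,2^{t-1}.
\end{align*}
```

Since $t\leq b$, the second quantity is bounded in terms of $b$ alone.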

We now move on to the last notion of neighborhoods that we need for this proof.

Definition 11.

Let (,𝒬)(\mathcal{B},\mathcal{Q}) be a balanced tt-neighborhood of xx. We say that (,𝒬)(\mathcal{B},\mathcal{Q}) is a refined tt-neighborhood of xx if the following conditions also hold:

  1. 1.

    For every i{0,1,,t1}i\in\{0,1,\ldots,t-1\} and every uBiu\in B_{i},

    |N(u)Bi+1|t142tεtm1/b.|N(u)\cap B_{i+1}|\geq t^{-1}4^{-2t}\varepsilon_{t}\ell m^{1/b}.
  2. 2.

    For every vBtv\in B_{t},

    \[|N(v)\cap B_{t-1}|\geq 4^{-2t}\varepsilon_{t}^{2}\ell^{b/(b-1)}.\]
  3. 3.

    For every vBtv\in B_{t},

    |𝒬[xv]|42tεtt(t1)b/(b1).|\mathcal{Q}[x\to v]|\geq 4^{-2t}\varepsilon_{t}^{t}\ell^{(t-1)b/(b-1)}.
Lemma B.4.

If (𝒜,𝒫)(\mathcal{A},\mathcal{P}) is a balanced tt-neighborhood of a vertex xXx\in X with |𝒫|12(14εtm1/b)t|\mathcal{P}|\geq\frac{1}{2}(\frac{1}{4}\varepsilon_{t}\ell m^{1/b})^{t}, then there exists a refined tt-neighborhood (,𝒬)(\mathcal{B},\mathcal{Q}) of xx with BiAiB_{i}\subseteq A_{i} for all ii and 𝒬𝒫\mathcal{Q}\subseteq\mathcal{P} such that

|𝒬|14(14εtm1/b)t.|\mathcal{Q}|\geq\frac{1}{4}(\frac{1}{4}\varepsilon_{t}\ell m^{1/b})^{t}.
Proof.

Repeatedly delete vertices using the following three steps until no further vertices can be removed:

  1. Step 1

    If there exists i{1,,t1}i\in\{1,\ldots,t-1\} and vAiv\in A_{i} with

    |N(v)Ai+1|<t142tεtm1/b,|N(v)\cap A_{i+1}|<t^{-1}4^{-2t}\varepsilon_{t}\ell m^{1/b},

    then remove vv from AiA_{i} and remove all paths P=xu1ut𝒫P=xu_{1}\cdots u_{t}\in\mathcal{P} with ui=vu_{i}=v.

  2. Step 2

    If there exists vAtv\in A_{t} with

    |N(v)At1|<42tεt2b/(b1),|N(v)\cap A_{t-1}|<4^{-2t}\varepsilon_{t}^{2}\ell^{b/(b-1)},

    then remove vv from AtA_{t} and remove all paths P=xu1ut𝒫P=xu_{1}\cdots u_{t}\in\mathcal{P} with ut=vu_{t}=v.

  3. Step 3

    If there exists vAtv\in A_{t} with

    |𝒫[xv]|<42tεtt(t1)b/(b1),|\mathcal{P}[x\to v]|<4^{-2t}\varepsilon_{t}^{t}\ell^{(t-1)b/(b-1)},

    then remove vv from AtA_{t} and remove all paths P=xu1ut𝒫P=xu_{1}\cdots u_{t}\in\mathcal{P} with ut=vu_{t}=v.

Let BiAiB_{i}\subseteq A_{i} and 𝒬𝒫\mathcal{Q}\subseteq\mathcal{P} be the set of vertices and paths that remain at the end of this process and let =(B0,,Bt)\mathcal{B}=(B_{0},\ldots,B_{t}). Note that with this, (,𝒬)(\mathcal{B},\mathcal{Q}) automatically satisfies every condition for a refined tt-neighborhood except possibly |N(x)B1|t142tεtm1/b|N(x)\cap B_{1}|\geq t^{-1}4^{-2t}\varepsilon_{t}\ell m^{1/b}. This will follow from having 𝒬\mathcal{Q} large, which we prove below by arguing that few paths are destroyed in the process above.

Because 𝒫\mathcal{P} is a balanced tt-neighborhood, its branching factor is at most εtm1/b\varepsilon_{t}\ell m^{1/b}. As such the number of paths removed in Step 1 is at most

tt142tεtm1/b(εtm1/b)t1=4t(14εtm1/b)t,t\cdot t^{-1}4^{-2t}\cdot\varepsilon_{t}\ell m^{1/b}\cdot(\varepsilon_{t}\ell m^{1/b})^{t-1}=4^{-t}\left(\frac{1}{4}\varepsilon_{t}\ell m^{1/b}\right)^{t}, (28)

and in Step 3 we remove at most

42tεtt(t1)b/(b1)|At|4t(14εtm1/b)t,4^{-2t}\varepsilon_{t}^{t}\ell^{(t-1)b/(b-1)}|A_{t}|\leq 4^{-t}\left(\frac{1}{4}\varepsilon_{t}\ell m^{1/b}\right)^{t}, (29)

where this last step uses the definition of balanced tt-neighborhoods.

For Step 2, we aim to show that the number of destroyed paths is at most

18(14εtm1/b)t.\frac{1}{8}\left(\frac{1}{4}\varepsilon_{t}\ell m^{1/b}\right)^{t}. (30)

Let ZAtZ\subseteq A_{t} and 𝒫(Z)\mathcal{P}(Z) denote the collection of vertices and paths removed in Step 2, and let

Y={uAt1:𝒫(Z) has (t1,u)-branching factor at least 4t2εtm1/b}.Y=\{u\in A_{t-1}:\mathcal{P}(Z)\textrm{ has }(t-1,u)\textrm{-branching factor at least }4^{-t-2}\varepsilon_{t}\ell m^{1/b}\}.

Note that by the definition of YY and the bound on the branching factor of 𝒫\mathcal{P},

|{xu1ut𝒫(Z):ut1At1Y}|4t2(εtm1/b)t,|\{xu_{1}\cdots u_{t}\in\mathcal{P}(Z):u_{t-1}\in A_{t-1}\setminus Y\}|\leq 4^{-t-2}(\varepsilon_{t}\ell m^{1/b})^{t}, (31)

so it remains to show that there are few paths which use a vertex of YY as the second to last vertex. For this, let

W={(u,v):uY,vZ,xw1wt𝒫(Z),wt1=u,wt=v}.W=\{(u,v):u\in Y,\ v\in Z,\exists xw_{1}\cdots w_{t}\in\mathcal{P}(Z),w_{t-1}=u,w_{t}=v\}.

Note first that by definition of $Y$, $|W|\geq|Y|4^{-t-2}\varepsilon_{t}\ell m^{1/b}$. On the other hand, we have that $|W|\leq|Z|4^{-2t}\varepsilon_{t}^{2}\ell^{b/(b-1)}$ since at the time each vertex $v\in Z$ is deleted, $v$ has fewer than $4^{-2t}\varepsilon_{t}^{2}\ell^{b/(b-1)}$ neighbors in $A_{t-1}\supseteq Y$. In total then we find

|Y||Z|42tεt2b/(b1)4t2εtm1/b4t+2εt(bt+1)/(b1)m(t1)/b,|Y|\leq\frac{|Z|4^{-2t}\varepsilon_{t}^{2}\ell^{b/(b-1)}}{4^{-t-2}\varepsilon_{t}\ell m^{1/b}}\leq 4^{-t+2}\varepsilon_{t}\ell^{(b-t+1)/(b-1)}m^{(t-1)/b}, (32)

where this last step used |Z||At|(bt)/(b1)mt/b|Z|\leq|A_{t}|\leq\ell^{(b-t)/(b-1)}m^{t/b} by definition of balanced neighborhoods.

Observe that if the number of paths in $\mathcal{P}(Z)$ using a vertex of $Y$ as the second to last vertex is at most $4^{-t-2}\left(\varepsilon_{t}\ell m^{1/b}\right)^{t}$ then (31) implies that the number of paths removed is at most (30), so we may assume this is not the case. Letting $S:=\mathcal{P}(Z)[x\to Y]$, this assumption together with the branching factor of $\mathcal{P}$ implies $|S|\cdot\varepsilon_{t}\ell m^{1/b}\geq 4^{-t-2}\left(\varepsilon_{t}\ell m^{1/b}\right)^{t}$, or equivalently

|S|4t2(εtm1/b)tεtm1/b>εt1(m1/b)t1,|S|\geq\frac{4^{-t-2}(\varepsilon_{t}\ell m^{1/b})^{t}}{\varepsilon_{t}\ell m^{1/b}}>\varepsilon_{t-1}(\ell m^{1/b})^{t-1}, (33)

with the last step holding if the $\varepsilon_{s}$ constants decrease sufficiently quickly. If $t-1\geq 2$, then (32) and (33) together with Lemma B.1 imply $t(x)\leq t-1$, contradicting $x\in X$. If $t=2$ then (32) gives $|Y|\leq 4^{-t-2}\varepsilon_{t}\ell m^{1/b}$, so the fact that $\mathcal{P}$ has branching factor at most $\varepsilon_{t}\ell m^{1/b}$ implies that there are at most $4^{-t-2}(\varepsilon_{t}\ell m^{1/b})^{2}$ paths in $\mathcal{P}$ whose second to last vertex is in $Y$. In either case, this bound together with (31) implies the number of paths removed is at most (30).

As t2t\geq 2, (28), (30), (29) imply that the total number of paths destroyed is at most 14(14εtm1/b)t\frac{1}{4}(\frac{1}{4}\varepsilon_{t}\ell m^{1/b})^{t}, so 𝒬\mathcal{Q} has the desired size. To prove that (,𝒬)(\mathcal{B},\mathcal{Q}) is a refined tt-neighborhood, it remains to show |N(x)B1|t142tεtm1/b|N(x)\cap B_{1}|\geq t^{-1}4^{-2t}\varepsilon_{t}\ell m^{1/b}. Since 𝒬𝒫\mathcal{Q}\subseteq\mathcal{P} has branching factor at most εtm1/b\varepsilon_{t}\ell m^{1/b}, we have that |𝒬||N(x)B1|(εtm1/b)t1|\mathcal{Q}|\leq|N(x)\cap B_{1}|\cdot(\varepsilon_{t}\ell m^{1/b})^{t-1}. Our bound on |𝒬||\mathcal{Q}| then implies

|N(x)B1|14(14εtm1/b)t(εtm1/b)1t=4t1εtm1/b.|N(x)\cap B_{1}|\geq\frac{1}{4}(\frac{1}{4}\varepsilon_{t}\ell m^{1/b})^{t}\cdot(\varepsilon_{t}\ell m^{1/b})^{1-t}=4^{-t-1}\varepsilon_{t}\ell m^{1/b}.

This gives the desired bound, proving the result. ∎
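The deletion process used in this proof can be illustrated by the following sketch on a toy graph. The function name `refine` and the thresholds `c1`, `c2`, `c3` are hypothetical stand-ins for the three quantities appearing in Steps 1 to 3, and Steps 2 and 3 are checked in a single loop for brevity, which is an implementation choice of this sketch rather than part of the original proof.

```python
def refine(adj, layers, paths, c1, c2, c3):
    """Apply Steps 1-3 repeatedly until no further vertex can be removed.

    adj:    dict mapping each vertex to its set of neighbours
    layers: list [A_0, ..., A_t] of vertex sets
    paths:  iterable of tuples (x, u_1, ..., u_t)
    c1, c2, c3: integer thresholds for Steps 1, 2 and 3 (assumed inputs)
    """
    layers = [set(A) for A in layers]
    paths = set(paths)
    t = len(layers) - 1
    changed = True
    while changed:
        changed = False
        # Step 1: vertices of A_i (1 <= i <= t-1) with few neighbours in A_{i+1}.
        for i in range(1, t):
            for v in list(layers[i]):
                if len(adj[v] & layers[i + 1]) < c1:
                    layers[i].discard(v)
                    paths = {p for p in paths if p[i] != v}
                    changed = True
        # Steps 2 and 3: vertices of A_t with few neighbours in A_{t-1},
        # or reached by few surviving paths from x.
        for v in list(layers[t]):
            few_back = len(adj[v] & layers[t - 1]) < c2
            few_paths = sum(1 for p in paths if p[t] == v) < c3
            if few_back or few_paths:
                layers[t].discard(v)
                paths = {p for p in paths if p[t] != v}
                changed = True
    return layers, paths


# Toy example with t = 2.
adj = {"x": {"a", "b"}, "a": {"x", "c", "d"}, "b": {"x", "c"},
       "c": {"a", "b"}, "d": {"a"}}
layers0 = [{"x"}, {"a", "b"}, {"c", "d"}]
paths0 = {("x", "a", "c"), ("x", "a", "d"), ("x", "b", "c")}
new_layers, new_paths = refine(adj, layers0, paths0, c1=2, c2=1, c3=1)
print(new_layers, sorted(new_paths))
```

In this example only `b` is removed, since it has a single neighbour in the last layer. The proof above is what guarantees that, with the thresholds chosen there, few paths are destroyed overall.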

Proof of Proposition 2.1.

Let t,Xt,X be as in Lemma 2.2. For each xXx\in X, let (𝒜,𝒫)(\mathcal{A},\mathcal{P}) be the balanced tt-neighborhood guaranteed by Lemma B.2 and (,𝒬)(\mathcal{B},\mathcal{Q}) the refined tt-neighborhood guaranteed by Lemma B.4 from (𝒜,𝒫)(\mathcal{A},\mathcal{P}). Most of the properties of Proposition 2.1 follow immediately from Definitions 10 and 11, as well as Lemmas B.2 and B.3 (with the last lemma using that (,𝒬)(\mathcal{B},\mathcal{Q}) is in particular a balanced tt-neighborhood). The only conditions which are not immediate are the bounds |Bt1|,|Bt|ε(bt+1)/(b1)m(t1)/b|B_{t-1}|,|B_{t}|\geq\varepsilon\ell^{(b-t+1)/(b-1)}m^{(t-1)/b}. If this bound did not hold for Bt1B_{t-1}, then the tuple (B0,B1,,Bt1)(B_{0},B_{1},\ldots,B_{t-1}) would be a concentrated (t1)(t-1)-neighborhood of xx (assuming t142tεtεt1t^{-1}4^{-2t}\varepsilon_{t}\geq\varepsilon_{t-1}), contradicting every xXx\in X having t(x)=tt(x)=t.

To prove the bound on BtB_{t}, we use Lemma B.3 to find

|𝒬|=uB1,vBt|𝒬[uv]|ε1(t2)b/(b1)|B1||Bt|ε1(t2)b/(b1)m1/b|Bt|,|\mathcal{Q}|=\sum_{u\in B_{1},v\in B_{t}}|\mathcal{Q}[u\to v]|\leq\varepsilon^{-1}\ell^{(t-2)b/(b-1)}\cdot|B_{1}|\cdot|B_{t}|\leq\varepsilon^{-1}\ell^{(t-2)b/(b-1)}\cdot\ell m^{1/b}\cdot|B_{t}|,

where this last step used Definition 10(i). As $|\mathcal{Q}|\geq\varepsilon\ell^{t}m^{t/b}$, this gives $|B_{t}|\geq\varepsilon^{2}\ell^{(b-t+1)/(b-1)}m^{(t-1)/b}$. This is the desired bound with $\varepsilon^{2}$ in place of $\varepsilon$, so the proposition as stated follows after replacing $\varepsilon$ with $\sqrt{\varepsilon}$ throughout. ∎