The Random Turán Problem for Theta Graphs

Gwen McKinley, Department of Mathematics, University of California, San Diego, La Jolla, CA. Email: gmckinley@ucsd.edu.
Sam Spiro, Department of Mathematics, Rutgers University, Piscataway, NJ. Email: sas703@scarletmail.rutgers.edu. Research supported by the NSF Mathematical Sciences Postdoctoral Research Fellowships Program under Grant DMS-2202730.

Abstract

Given a graph $F$ , we define $\mathrm{ex}(G_{n,p},F)$ to be the maximum number of edges in an $F$ -free subgraph of the random graph $G_{n,p}$ . Very little is known about $\mathrm{ex}(G_{n,p},F)$ when $F$ is bipartite, with essentially tight bounds known only when $F$ is either $C_{4},\ C_{6},\ C_{10}$ , or $K_{s,t}$ with $t$ sufficiently large in terms of $s$ , due to work of Füredi and of Morris and Saxton. We extend this work by establishing essentially tight bounds when $F$ is a theta graph with sufficiently many paths. Our main innovation is in proving a balanced supersaturation result for vertices, which differs from the standard approach of proving balanced supersaturation for edges.

1 Introduction

Given a graph $F$ , we define the Turán number or extremal number $\mathrm{ex}(n,F)$ to be the maximum number of edges that an $n$ -vertex $F$ -free graph can have. If $F$ is not bipartite, then the asymptotic behavior of $\mathrm{ex}(n,F)$ is determined by the Erdős-Stone theorem [11]. Only sporadic results for $\mathrm{ex}(n,F)$ are known when $F$ is bipartite, and in most cases these bounds are not tight.

For example, the Kővári-Sós-Turán theorem [19] implies $\mathrm{ex}(n,K_{s,t})=O(n^{2-1/s})$, and this bound is only known to be tight when $t$ is sufficiently large in terms of $s$; see for example [3]. The bound $\mathrm{ex}(n,C_{2b})=O(n^{1+1/b})$ was first proven by Bondy and Simonovits [2]. It was shown by Faudree and Simonovits [12] that this same upper bound continues to hold for theta graphs with paths of length $b$, and it was later shown by Conlon [7] that this upper bound is tight for theta graphs which have sufficiently many paths. A more detailed treatment of Turán numbers of bipartite graphs can be found in the survey by Füredi and Simonovits [14].

In this paper, we study a probabilistic analog of the Turán number. We define the random graph $G_{n,p}$ to be the $n$ -vertex graph obtained by including each possible edge independently and with probability $p$ , and we let $\mathrm{ex}(G_{n,p},F)$ denote the maximum number of edges in an $F$ -free subgraph of $G_{n,p}$ .
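To make the definition concrete, here is a minimal brute-force sketch for the special case $F=C_{4}$ on very small vertex sets. The function names (`sample_gnp`, `is_c4_free`, `ex_c4`) are illustrative choices and not notation from the paper, and the search is exponential in the number of edges, so it is only meant for toy examples.

```python
import itertools
import random


def sample_gnp(n, p, rng):
    """Sample G(n, p): keep each possible edge independently with probability p."""
    return [e for e in itertools.combinations(range(n), 2) if rng.random() < p]


def is_c4_free(n, edges):
    """A simple graph is C_4-free iff no two distinct vertices have two common neighbours."""
    adj = [set() for _ in range(n)]
    for x, y in edges:
        adj[x].add(y)
        adj[y].add(x)
    return all(len(adj[x] & adj[y]) <= 1
               for x, y in itertools.combinations(range(n), 2))


def ex_c4(n, host_edges):
    """Brute-force ex(G, C_4): size of a largest C_4-free subset of host_edges."""
    for size in range(len(host_edges), -1, -1):
        for subset in itertools.combinations(host_edges, size):
            if is_c4_free(n, subset):
                return size
    return 0


complete5 = list(itertools.combinations(range(5), 2))
print(ex_c4(5, complete5))            # the case p = 1, i.e. ex(5, C_4)

sample = sample_gnp(5, 0.5, random.Random(0))
print(len(sample), ex_c4(5, sample))  # edges of the sample, and its C_4-free optimum
```

With $p=1$ the host graph is complete, so the first call computes the classical Turán number of $C_{4}$ on five vertices.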

Observe that $\mathrm{ex}(G_{n,1},F)=\mathrm{ex}(n,F)$ . With this in mind, it is natural to ask if the classical bounds on $\mathrm{ex}(n,F)$ mentioned above can be extended to bounds on $\mathrm{ex}(G_{n,p},F)$ for all $p$ . For example, just as in the classical case, we have a complete asymptotic understanding of $\mathrm{ex}(G_{n,p},F)$ when $F$ is not bipartite due to breakthrough work done independently by Conlon and Gowers [6] and Schacht [27]. To formally state their result, we define the 2-density of a graph $F$ by

m_{2}(F)=\max\left\{\frac{e(F^{\prime})-1}{v(F^{\prime})-2}:F^{\prime}\subseteq F,\ e(F^{\prime})\geq 2\right\},

and we write $f(n)\gg g(n)$ to mean $f(n)/g(n)$ tends to infinity as $n$ tends to infinity. We also recall that a sequence of events $A_{n}$ occurs with high probability or w.h.p. if $\Pr[A_{n}]$ tends to 1 as $n$ tends to infinity.
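As an illustration of the 2-density just defined, the following sketch computes $m_{2}(F)$ by brute force over edge subsets of a small graph. The function name `two_density` is hypothetical. It only considers subgraphs without isolated vertices, which loses nothing because isolated vertices can only increase $v(F^{\prime})$.

```python
from fractions import Fraction
from itertools import combinations


def two_density(edges):
    """Brute-force m_2(F): maximise (e(F') - 1) / (v(F') - 2) over edge subsets with e(F') >= 2.

    `edges` is a list of distinct pairs of a simple graph; the running time is exponential.
    """
    best = None
    for size in range(2, len(edges) + 1):
        for sub in combinations(edges, size):
            vertices = {x for edge in sub for x in edge}
            ratio = Fraction(size - 1, len(vertices) - 2)
            if best is None or ratio > best:
                best = ratio
    return best


triangle = [(0, 1), (1, 2), (0, 2)]
c4 = [(0, 1), (1, 2), (2, 3), (0, 3)]
k4 = list(combinations(range(4), 2))
print(two_density(triangle), two_density(c4), two_density(k4))
```

Exact rational arithmetic via `Fraction` avoids floating-point ties when comparing densities of different subgraphs.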

Theorem 1.1 ([6, 27]).

For any graph $F$ , w.h.p.

\mathrm{ex}(G_{n,p},F)=\begin{cases}\left(1-\frac{1}{\chi(F)-1}+o(1)\right)p{n\choose 2}&p\gg n^{-1/m_{2}(F)},\\ (1+o(1))p{n\choose 2}&n^{-1/m_{2}(F)}\gg p\gg n^{-2}.\end{cases}

Theorem 1.1 to a large extent solves the random Turán problem for non-bipartite graphs, though many questions still remain in this setting; see the survey by Rödl and Schacht [25] for more on this.

For the rest of this paper we focus on the random Turán problem when $F$ is bipartite, a setting where much less is known. This lack of information is partially due to the fact that the classical Turán numbers $\mathrm{ex}(n,F)$ are unknown for almost all bipartite graphs. An additional obstacle is that even if it is known that $\mathrm{ex}(n,F)=\Theta(n^{\alpha})$ for some bipartite graph $F$ , it is typically not the case that $\mathrm{ex}(G_{n,p},F)=\Theta(pn^{\alpha})$ for large $p$ ; see for example Figure 1 which plots $\mathrm{ex}(G_{n,p},C_{4})$ . That is, unlike in the non-bipartite case, the extremal constructions for $\mathrm{ex}(G_{n,p},F)$ with $F$ bipartite are far from the intersection of $G_{n,p}$ with an extremal $F$ -free graph.

Figure 1: A plot of the value of $\mathrm{ex}(G_{n,p},C_{4})$ as a function of $p$, with these bounds holding with high probability and up to constant factors.

Despite these obstacles, what is currently known about the random Turán problem for bipartite graphs suggests that the following analog of Theorem 1.1 could be true. Notice that in contrast to the non-bipartite case, where $\mathrm{ex}(G_{n,p},F)$ exhibits a single “phase transition” around $p=n^{-1/m_{2}(F)}$, if Conjecture 1.2 is correct, we will see two phase transitions in the bipartite case.

Conjecture 1.2.

If $F$ is a graph with $\mathrm{ex}(n,F)=\Theta(n^{\alpha})$ for some $\alpha\in(1,2]$ , then w.h.p.

\mathrm{ex}(G_{n,p},F)=\begin{cases}\max\{\Theta(p^{\alpha-1}n^{\alpha}),n^{2-1/m_{2}(F)}(\log n)^{O(1)}\}&p\geq n^{-1/m_{2}(F)},\\ (1+o(1))p{n\choose 2}&n^{-1/m_{2}(F)}\gg p\gg n^{-2}.\end{cases}

Equivalently,

\mathrm{ex}(G_{n,p},F)=\begin{cases}\Theta(p^{\alpha-1}n^{\alpha})&p\geq n^{\frac{2-\alpha-1/m_{2}(F)}{\alpha-1}}(\log n)^{O(1)}\\ n^{2-1/m_{2}(F)}(\log n)^{O(1)}&n^{\frac{2-\alpha-1/m_{2}(F)}{\alpha-1}}(\log n)^{O(1)}\geq p\geq n^{-1/m_{2}(F)},\\ (1+o(1))p{n\choose 2}&n^{-1/m_{2}(F)}\gg p\gg n^{-2}.\end{cases}

The bound for $n^{-1/m_{2}(F)}\gg p\gg n^{-2}$ is easy to prove with a deletion argument, as is the lower bound of $\Omega(n^{2-1/m_{2}(F)})$ when $p\geq n^{-1/m_{2}(F)}$. Thus the hard part of Conjecture 1.2 is in showing that $\mathrm{ex}(G_{n,p},F)=\Theta(p^{\alpha-1}n^{\alpha})$ whenever $p$ is large.

In terms of evidence supporting Conjecture 1.2, Füredi [13] showed that this conjecture holds when $F=C_{4}$. This work was substantially generalized by Morris and Saxton [20] who proved $\mathrm{ex}(G_{n,p},K_{s,t})=O(p^{1-1/s}n^{2-1/s})$ w.h.p. when $p$ is large, and that $\mathrm{ex}(G_{n,p},C_{2b})=O(p^{1/b}n^{1+1/b})$ w.h.p. when $p$ is large. Moreover, they showed that these bounds are tight provided $\mathrm{ex}(n,K_{s,t})=\Omega(n^{2-1/s})$ and $\mathrm{ex}(n,\{C_{3},C_{4},\ldots,C_{2b}\})=\Omega(n^{1+1/b})$, respectively. The lower bound of Conjecture 1.2 was shown to hold for large powers of balanced trees by Spiro [28]. Under some mild conditions, Jiang and Longbrake [17] proved a general upper bound of the form

\mathrm{ex}(G_{n,p},F)=O\big{(}p^{1-m_{2}^{*}(F)(2-\alpha)}n^{\alpha}\big{)}\textrm{ w.h.p.\ when }p\textrm{ is large},

where $m_{2}^{*}(F)=\max\left\{\frac{e(F^{\prime})-1}{v(F^{\prime})-2}:F^{\prime}\subsetneq F,\ e(F^{\prime})\geq 2\right\}$ . This bound matches Conjecture 1.2 precisely when $m_{2}^{*}(F)=1$ , which happens when $F=C_{2b}$ (giving a simpler proof of [20]), but otherwise is strictly weaker than the bound proposed in Conjecture 1.2. As far as we are aware, these are the only known bounds for the random Turán problem for bipartite graphs, though there have been a number of recent results regarding the analogous problem for degenerate hypergraphs, see for example [21, 22, 23, 24, 29].

In this paper, we contribute to this growing body of literature by studying the random Turán problem for theta graphs. Recall that a theta graph $\theta_{a,b}$ is a graph which consists of two vertices $u,v$ together with $a$ internally disjoint paths from $u$ to $v$ of length $b$ . For example, $\theta_{3,4}$ is depicted in Figure 2.

Figure 2: A depiction of the theta graph $\theta_{3,4}$.
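The definition of $\theta_{a,b}$ can be made concrete with a short sketch that builds its edge list. The vertex labelling (0 and 1 for the two endpoints $u,v$, then consecutive integers for internal vertices) and the helper names are assumptions made for illustration only.

```python
from collections import Counter


def theta_graph(a, b):
    """Edge list of theta_{a,b}: a internally disjoint u-v paths, each of length b (b >= 2)."""
    u, v = 0, 1
    edges, next_vertex = [], 2
    for _ in range(a):
        internal = list(range(next_vertex, next_vertex + b - 1))
        next_vertex += b - 1
        path = [u] + internal + [v]
        edges.extend(zip(path, path[1:]))   # consecutive vertices on the path are adjacent
    return edges


def degree_sequence(edges):
    """Degrees of all non-isolated vertices, sorted in non-increasing order."""
    degrees = Counter()
    for x, y in edges:
        degrees[x] += 1
        degrees[y] += 1
    return sorted(degrees.values(), reverse=True)


theta_3_4 = theta_graph(3, 4)
print(len(theta_3_4), degree_sequence(theta_3_4))
```

In general the construction gives $ab$ edges and $a(b-1)+2$ vertices, with the two endpoints having degree $a$ and every internal vertex having degree 2.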

Observe that $\theta_{2,b}=C_{2b}$ and $\theta_{a,2}=K_{2,a}$ . Given that Morris and Saxton essentially solved the random Turán problem for cycles and complete bipartite graphs, one might hope that their methods can be extended to give bounds for $\mathrm{ex}(G_{n,p},\theta_{a,b})$ in general.

And indeed, by using similar ideas as in [20], Corsten and Tran [10] implicitly proved

\mathrm{ex}(G_{n,p},\theta_{a,b})=O\left(p^{\frac{2}{ab}}n^{1+\frac{1}{b}}\right)\textrm{ w.h.p.\ when }p\textrm{ is large},\tag{1}

which matches the general upper bound given by Jiang and Longbrake in this case. However, Faudree and Simonovits [12] proved $\mathrm{ex}(n,\theta_{a,b})=O(n^{1+1/b})$, so Conjecture 1.2 predicts that we should have $\mathrm{ex}(G_{n,p},\theta_{a,b})=O(p^{1/b}n^{1+1/b})$ when $p$ is large, which differs substantially from (1) when $a$ is large.

By adding several new ideas (see Section 1.1.2) to the approach of Morris and Saxton, we significantly improve upon the bounds of (1) and establish essentially tight (and unconditional) bounds for $\mathrm{ex}(G_{n,p},\theta_{a,b})$ when $a$ is large, which agree with what Conjecture 1.2 predicts.

Theorem 1.3.

For all $b\geq 2$ , there exists $a_{0}=a_{0}(b)$ such that for any fixed $a\geq a_{0}$ , w.h.p.

\mathrm{ex}(G_{n,p},\theta_{a,b})=\begin{cases}\Theta\left(p^{\frac{1}{b}}n^{1+\frac{1}{b}}\right)&p\geq n^{-\frac{b-1}{ab-1}}(\log n)^{2b},\\ n^{2-\frac{a(b-1)}{ab-1}}(\log n)^{O(1)}&n^{-\frac{b-1}{ab-1}}(\log n)^{2b}\geq p\geq n^{-\frac{a(b-1)}{ab-1}},\\ (1+o(1))p{n\choose 2}&n^{-\frac{a(b-1)}{ab-1}}\gg p\gg n^{-2}.\end{cases}

As far as we are aware, Theorem 1.3 is the first result since Morris and Saxton [20] which gives essentially tight bounds on $\mathrm{ex}(G_{n,p},F)$ for any bipartite graph $F$ .
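As a sanity check on the exponents, the sketch below specialises the thresholds of Conjecture 1.2 to $F=\theta_{a,b}$ using exact rational arithmetic. It assumes $\alpha=1+1/b$ and $m_{2}(\theta_{a,b})=\frac{ab-1}{a(b-1)}$ (the value obtained from the whole graph, which is consistent with the exponent $\frac{a(b-1)}{ab-1}$ in Theorem 1.3). The function name `theta_exponents` is hypothetical, and polylogarithmic factors are ignored.

```python
from fractions import Fraction


def theta_exponents(a, b):
    """Exponents of n predicted by Conjecture 1.2 for theta_{a,b} (polylog factors ignored)."""
    alpha = 1 + Fraction(1, b)                       # assumed: ex(n, theta_{a,b}) = Theta(n^alpha)
    m2 = Fraction(a * b - 1, a * (b - 1))            # assumed value of the 2-density
    crossover = (2 - alpha - 1 / m2) / (alpha - 1)   # where p^(alpha-1) n^alpha meets n^(2 - 1/m2)
    return {
        "alpha": alpha,
        "inv_m2": 1 / m2,           # the lower transition sits at p = n^(-inv_m2)
        "crossover": crossover,     # the upper transition sits at p = n^(crossover)
        "plateau": 2 - 1 / m2,      # exponent of n in the middle regime
    }


print(theta_exponents(3, 2))
```

For every pair $(a,b)$ the computed crossover exponent simplifies to $-\frac{b-1}{ab-1}$, matching the first case of Theorem 1.3.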

1.1 Proof Outline

The vast majority of our paper is focused on proving the upper bound of Theorem 1.3 when $p$ is large, with the other bounds following from previous results together with the monotonicity of $\mathrm{ex}(G_{n,p},\theta_{a,b})$. Before going into our new ideas, we first recall the proof ideas from Morris and Saxton [20] for bounding $\mathrm{ex}(G_{n,p},C_{2b})$, as well as the adaptation of these methods by Corsten and Tran [10] to theta graphs.

1.1.1 Previous Ideas

The main approach to upper bounding $\mathrm{ex}(G_{n,p},C_{2b})$ when $p$ is large is to show that $C_{2b}$ exhibits “balanced supersaturation”, which roughly means that if $e(G)\gg\mathrm{ex}(n,C_{2b})$ , then one can find a large collection of copies of $C_{2b}$ in $G$ which are “spread out.” Given such a result, one can derive upper bounds on $\mathrm{ex}(G_{n,p},C_{2b})$ by using what is by now a somewhat standard argument involving hypergraph containers; see 6.1 below.

To prove this balanced supersaturation result, let $G$ be an $n$-vertex graph with $kn^{1+1/b}$ edges. Our goal is to build a hypergraph $\mathcal{H}$ whose vertex set is $E(G)$, whose hyperedges are copies of $C_{2b}$ in $G$, and which satisfies

\deg_{\mathcal{H}}(\sigma)\leq\frac{k^{2b}n^{2}}{\delta kn^{1+1/b}(\delta k^{b/(b-1)})^{|\sigma|-1}},

for all $\sigma\subseteq E(G)$ , where $\delta$ is some suitably small constant. If we can construct such a collection with $|\mathcal{H}|\approx k^{2b}n^{2}$ , then this together with 6.1 will give our desired result.
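The codegree condition above can be checked mechanically on small examples. The following sketch tabulates $\deg_{\mathcal{H}}(\sigma)$ for every non-empty $\sigma$ contained in some hyperedge and compares it with the right-hand side of the displayed bound. The function names and the representation of hyperedges as sets of sortable labels are assumptions for illustration, and the hyperedges are assumed to be distinct.

```python
from collections import Counter
from itertools import combinations


def codegrees(hyperedges):
    """Map each non-empty subset sigma of a hyperedge to the number of hyperedges containing it."""
    counts = Counter()
    for edge in hyperedges:
        members = sorted(edge)
        for size in range(1, len(members) + 1):
            for sigma in combinations(members, size):
                counts[frozenset(sigma)] += 1
    return counts


def codegree_bound(size, n, k, b, delta):
    """Right-hand side of the displayed bound for a set sigma with |sigma| = size (b >= 2)."""
    spread = (delta * k ** (b / (b - 1))) ** (size - 1)
    return (k ** (2 * b) * n ** 2) / (delta * k * n ** (1 + 1 / b) * spread)


def is_balanced(hyperedges, n, k, b, delta):
    """True if every sigma satisfies deg_H(sigma) <= codegree_bound(|sigma|, ...)."""
    return all(count <= codegree_bound(len(sigma), n, k, b, delta)
               for sigma, count in codegrees(hyperedges).items())


toy = [frozenset({1, 2, 3}), frozenset({1, 2, 4})]
print(codegrees(toy)[frozenset({1, 2})])
```

Sets $\sigma$ that lie in no hyperedge have codegree 0 and satisfy the bound trivially, so they are not enumerated.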

To construct such an $\mathcal{H}$, we iteratively build copies of $C_{2b}$ to add to $\mathcal{H}$ as follows. Given our current collection $\mathcal{H}$, we “clean up” $G$ by deleting edges $e$ with $\deg_{\mathcal{H}}(e)\geq\frac{k^{2b}n^{2}}{\delta kn^{1+1/b}}$ (since we will not be able to use these edges in any copy of $C_{2b}$ to add to $\mathcal{H}$), and by iteratively deleting vertices with low degree. We then pick a vertex $x$ that will have some number $t(x)\leq b$ associated to it (which roughly measures how well the graph expands near $x$) to use in our cycle. We then run an algorithm which starts with a set $\chi=\{x\}$, and then iteratively adds new vertices to $\chi$ until we build a cycle $C$, with the exact algorithm we use depending on the value of $t(x)$. A key point is that at each step of our algorithm, we always have significantly more than $\delta k^{b/(b-1)}$ choices for each new vertex to add. Because we have so many choices, one of the cycles $C$ that we can create will be such that $C\notin\mathcal{H}$ and such that $\{C\}\cup\mathcal{H}$ continues to satisfy our desired codegree conditions. By applying this algorithm repeatedly, we end up constructing a large collection $\mathcal{H}$ of cycles which satisfies our codegree conditions, proving our desired balanced supersaturation result.

For theta graphs, Corsten and Tran [10] used an argument similar to the one outlined above, but their approach only gave effective bounds on the codegrees $\deg_{\mathcal{H}}(\sigma)$ when $\sigma$ is a set of edges inducing a forest. (The general result of Jiang and Longbrake [17] similarly only gives effective bounds for forests, and as such it seems like moving beyond the forest case is the main difficulty in general for upper bounding $\mathrm{ex}(G_{n,p},F)$.) At a high level, the fundamental issue with the approach outlined above is that the algorithm proceeds by selecting vertices one at a time, but the codegree bounds of 6.1 are a function of the number of edges $|\sigma|$. For $\sigma$ which are forests, this distinction turns out to be irrelevant because $\sigma$ has at least as many vertices as edges, but this property fails substantially for general subgraphs of theta graphs.

1.1.2 New Ideas

As in previous works, our proof relies on showing that theta graphs exhibit balanced supersaturation. In particular, we use the following three main ideas in order to get around the fundamental issues outlined above:

Vertex Supersaturation. The first key idea is that instead of viewing $\mathcal{H}$ as a hypergraph on $E(G)$, we essentially view it as a hypergraph on $V(G)$. To be more precise, we consider hypergraphs $\mathcal{H}$ on $V(\theta_{a,b})\times V(G)$ in such a way that each hyperedge of $\mathcal{H}$ corresponds to a unique labelled theta graph in $G$. Balanced supersaturation for this hypergraph $\mathcal{H}$ then easily translates to balanced supersaturation for the corresponding hypergraph on $E(G)$, which we ultimately need in order to use hypergraph containers.

Asymmetric Codegree Bounds. The second idea is that the codegree bounds we enforce on $\chi\subseteq V(\theta_{a,b})\times V(G)$ will not depend solely on the number of vertices $|\chi|$ , but also on the vertices of $\theta_{a,b}$ which the set $\chi$ corresponds to. For example, if $u,v\in V(\theta_{a,b})$ are the two high-degree vertices of $\theta_{a,b}$ (i.e. the vertices of degree at least 3), then the codegree bounds we enforce on the set $\chi=\{(u,x),(v,y)\}$ will be higher than those we would put on $\chi=\{(w,x),(w^{\prime},y)\}$ for some other $w,w^{\prime}\in V(\theta_{a,b})$ since, in some sense, $u,v$ are the most important vertices of $\theta_{a,b}$ . A similar idea was implicitly used by Morris and Saxton when proving balanced supersaturation results for $K_{s,t}$ (where the vertices in the smaller part are “more important” than those in the larger part).

Multiple Collections. The final idea is that we do not build a single collection $\mathcal{H}$ , but instead build multiple collections $\mathcal{H}_{1},\mathcal{H}_{2},\ldots$ and impose different codegree bounds on each of these.

As a somewhat concrete example of why we do this, say we knew our graph $G$ was a random graph with $kn^{1+1/b}$ edges. In this case, it would be possible to build an $\mathcal{H}$ such that for every $x,y\in V(G)$ , there are at most roughly $k^{ab}$ theta graphs in our collection which use $x,y$ as the two high degree vertices; and this is a strong bound on the codegree of this pair. In contrast, if $G$ was a clique with $kn^{1+1/b}$ edges together with some isolated vertices, then it would be impossible to impose such a codegree bound for all $x,y$ . Thus if we only worked with a single collection $\mathcal{H}$ , we would have to pessimistically use the weaker codegree bounds that work for a clique when $x,y$ correspond to the high degree vertices, and similarly we would have to consider the worst possible choice of $G$ when determining the codegree bounds for any given set $\chi\subseteq V(G)$ . Doing this would ultimately give bounds that are too weak. By building multiple collections, we can impose the “correct” codegree bounds regardless of the structure of $G$ .

Other Ideas. In addition to these three main ideas, we use a slightly different algorithm to construct our theta graphs compared to those of [10, 20]. Previous algorithms worked (roughly) by first specifying a vertex $x\in V(G)$ to play the role of one of the high degree vertices of $\theta_{a,b}$ , then choosing a path in $G$ of length $b$ (which specifies the other vertex $y$ playing the role of a high degree vertex in $\theta_{a,b}$ ), and from there choosing the remaining $a-1$ paths from $x$ to $y$ one at a time. Instead, our algorithm chooses the two high degree vertices $x,y$ at the start and then builds all of the $a$ paths from $x$ to $y$ one at a time. This somewhat more symmetric argument allows us to overcome various technical issues that arose with previous approaches, and is crucial for our present argument to go through.

1.2 Organization and Notation

The rest of this paper is organized as follows. In Section 2 we prove several auxiliary results that will be used in our main proof, and in Section 3 we establish the main definitions used in this paper. In Section 4, which is the real heart of our argument, we establish our balanced supersaturation result for vertices. We then translate this result into balanced supersaturation for edges in Section 5 before completing the proof of Theorem 1.3 in Section 6 by invoking a result that follows from a standard containers type argument. A few open problems are given in Section 7.

Throughout the paper we adopt the following conventions. We always use $u,v,w$ to denote vertices of $\theta_{a,b}$ and $x,y,z$ to denote vertices of a (larger) graph $G$ . Further, we will almost always let $u,v$ denote the two vertices of $\theta_{a,b}$ with degree at least 3 (when $a\geq 3$ ), and we will informally call these the “vertices of high degree” in $\theta_{a,b}$ . We write $v(G)=|V(G)|$ . Whenever we write asymptotic notation such as $O(f)$ , our implicit constants will always depend on $a,b$ , and we will occasionally emphasize this point by writing, for example, $O_{a,b}(f)$ . For a hypergraph $\mathcal{H}$ and a set of vertices $\sigma\subseteq V(\mathcal{H})$ , its degree or codegree $\deg_{\mathcal{H}}(\sigma)$ is the number of hyperedges of $\mathcal{H}$ containing $\sigma$ .

2 Auxiliary Results

Here we establish two results that will be crucial for our proof.

2.1 Expansion

One of the key technical lemmas of Morris and Saxton says that in a graph with sufficiently large minimum degree, there exists a vertex $x$ which is the endpoint of many “nice” paths of some length $t$. Analogously, we will rely heavily on the following.

Proposition 2.1.

For all integers $b\geq 2$ , there exists some $\varepsilon>0$ such that the following holds. If $G$ is an $m$ -vertex graph with minimum degree $\ell m^{1/b}$ and $\ell\geq\varepsilon^{-1}$ , and if $\mathcal{F}$ is a set of forests, then there exists an integer $2\leq t\leq b$ and a set of vertices $X$ such that the following properties hold:

(a) For each $x\in X$, there exists a pair $(\mathcal{B},\mathcal{Q})$ such that $\mathcal{B}=(B_{0},\ldots,B_{t})$ is a tuple of (not necessarily disjoint) vertex sets of $G$ with $B_{0}=\{x\}$, and $\mathcal{Q}$ is a set of paths $xz_{1}\cdots z_{t}$ with $z_{i}\in B_{i}$ for all $i$.

(b) We have $|B_{t-1}|,|B_{t}|\geq\varepsilon\ell^{(b-t+1)/(b-1)}m^{(t-1)/b}$.

(c) We have $|\mathcal{Q}|\geq\varepsilon\ell^{t}m^{t/b}$.

(d) For every $1\leq i\leq t$ and $z\in B_{i-1}$, we have $|N(z)\cap B_{i}|\geq\varepsilon\ell m^{1/b}$, and for every $z\in B_{t}$, we have $|N(z)\cap B_{t-1}|\geq\varepsilon\ell^{b/(b-1)}$.

(e) For every $y\in B_{t}$, we have $|\mathcal{Q}[x\to y]|\geq\varepsilon\ell^{(t-1)b/(b-1)}$, where $\mathcal{Q}[x\to y]$ denotes the set of paths of $\mathcal{Q}$ starting at $x$ and ending at $y$.

(f) For any $y\in B_{t}$ and non-empty set of vertices $S\subseteq V(G)\setminus\{x,y\}$, there are at most $\varepsilon^{-1}\ell^{(t-1-|S|)b/(b-1)}$ paths in $\mathcal{Q}[x\to y]$ containing $S$.

(g) If $\mathcal{F}$ is such that for every path $x_{1}\cdots x_{r}$ of $G$ with $r\leq b$ which does not contain an element of $\mathcal{F}$ as a subgraph, there are at most $\varepsilon\ell m^{1/b}$ vertices $x_{r+1}\in N_{G}(x_{r})$ such that the path $x_{1}\cdots x_{r+1}$ contains an element of $\mathcal{F}$ as a subgraph; then no path of $\mathcal{Q}$ contains an element of $\mathcal{F}$ as a subgraph.

(h) We have $|X|\geq\varepsilon\ell^{(b-t)/b}m^{t/b}$.

Morris and Saxton essentially proved this same result but with the last condition replaced by $|X|>0$ as opposed to $|X|\geq\varepsilon\ell^{(b-t)/b}m^{t/b}$. Our two proofs will be essentially identical outside of this improved quantitative bound, and as such we defer many of the redundant details of the proof to Appendix B. (It is possible that Proposition 2.1 holds with the even stronger quantitative bound $|X|\geq\varepsilon m$. If true, this would significantly simplify our proof of Theorem 1.3; see 7.3 for more on this.)

For our proof, we fix a sequence of rapidly decreasing constants

1\geq\varepsilon_{b}\geq\cdots\geq\varepsilon_{2}\geq\varepsilon_{1}>0

which depend only on $b$ . The exact values of these constants are not particularly important, other than that they are sufficiently small with respect to 1 and with respect to each other. In particular, we demand $\varepsilon_{t}\geq 16(b+1)\varepsilon_{t-1}$ for all $t$ . For the rest of the subsection we will fix some $m$ -vertex graph $G$ with minimum degree $\ell m^{1/b}$ with $\ell$ (and hence $m$ ) sufficiently large in terms of the $\varepsilon_{t}$ constants.

Definition 1.

For $x\in V(G)$ , we say that a tuple $\mathcal{A}=(A_{0},A_{1},\ldots,A_{t})$ of (not necessarily disjoint) subsets of $V(G)$ is a concentrated $t$ -neighborhood of $x$ if $A_{0}=\{x\}$ , $|A_{t}|\leq\ell^{(b-t)/(b-1)}m^{t/b}$ , and $|N(y)\cap A_{i}|\geq\varepsilon_{t}\ell m^{1/b}$ for all $y\in A_{i-1}$ .

We define $t(x)$ to be the minimal $t\geq 2$ such that there exists a concentrated $t$-neighborhood of $x$ in $G$. Note that $t(x)\leq b$ for all $x$ since we can iteratively take $A_{i}=\bigcup_{y\in A_{i-1}}N(y)$.
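The definition of a concentrated $t$-neighborhood, and the iterated-neighborhood construction showing $t(x)\leq b$, can be sketched as follows. The function names are hypothetical. The graph is assumed to be given as a dictionary mapping each vertex to its neighbor set, and $\ell$ is recovered from the minimum degree via $\ell m^{1/b}=\delta(G)$. A small tolerance guards against floating-point rounding.

```python
def iterated_neighbourhoods(adj, x, t):
    """A_0 = {x} and A_i = union of N(y) over y in A_{i-1}, for i = 1, ..., t."""
    layers = [{x}]
    for _ in range(t):
        layers.append(set().union(*(adj[y] for y in layers[-1])))
    return layers


def is_concentrated(adj, layers, b, eps_t):
    """Check whether layers = (A_0, ..., A_t) is a concentrated t-neighbourhood (b >= 2)."""
    m = len(adj)
    t = len(layers) - 1
    min_degree = min(len(nbrs) for nbrs in adj.values())
    ell = min_degree / m ** (1 / b)                # so the minimum degree equals ell * m^(1/b)
    size_cap = ell ** ((b - t) / (b - 1)) * m ** (t / b)
    tol = 1e-9
    if len(layers[0]) != 1 or len(layers[t]) > size_cap + tol:
        return False
    # Every y in A_{i-1} needs at least eps_t * ell * m^(1/b) neighbours inside A_i.
    return all(len(adj[y] & layers[i]) >= eps_t * min_degree - tol
               for i in range(1, t + 1) for y in layers[i - 1])


k6 = {v: set(range(6)) - {v} for v in range(6)}
layers = iterated_neighbourhoods(k6, 0, 3)
print(is_concentrated(k6, layers, b=3, eps_t=0.5))
```

With $t=b$ the size condition reads $|A_{b}|\leq m$, which always holds, and every neighbor of $y\in A_{i-1}$ lies in $A_{i}$ by construction, so the iterated neighborhoods always pass the check for $t=b$.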

Morris and Saxton implicitly proved that for any vertex $x$ with $t(x)=\min_{y\in V(G)}t(y)$ , there exist sets $(\mathcal{B},\mathcal{Q})$ as in Proposition 2.1, and in particular, at least one such vertex exists. The only place where $t(x)=\min_{y\in V(G)}t(y)$ is used in their argument is in showing that there exists a tuple $\mathcal{A}=(A_{0},\ldots,A_{t})$ with $t=t(x)$ , $A_{0}=\{x\}$ , $|A_{t}|\leq\ell^{(b-t)/(b-1)}m^{t/b},\ |N(y)\cap A_{i}|\geq\varepsilon_{t}\ell m^{1/b}$ for all $y\in A_{i-1}$ , and (crucially) every $y\in\bigcup_{i=0}^{t}A_{i}$ has $t(y)\geq t$ ; in other words, for their argument to go through we only need that $t(x)$ achieves a local minimum value among vertices $y$ near $x$ , and it is not strictly necessary for $t(x)$ to achieve a global minimum value.

Motivated by this, our main goal is to show that tuples with essentially these same properties noted above exist for many vertices $x$ . Specifically, we prove the following.

Lemma 2.2.

There exists some integer $2\leq t\leq b$ and some set $X\subseteq V(G)$ of size at least $\frac{1}{2}(4b)^{t-b}\ell^{(b-t)/(b-1)}m^{t/b}$ such that $t(x)=t$ for every $x\in X$ , and such that for every $x\in X$ , there exists a tuple of sets $\mathcal{A}=(A_{0},\ldots,A_{t})$ such that $A_{0}=\{x\}$ , $|A_{t}|\leq\ell^{(b-t)/(b-1)}m^{t/b}$ , $|N(y)\cap A_{i}|\geq\frac{1}{2}\varepsilon_{t}\ell m^{1/b}$ for all $y\in A_{i-1}$ , and every $y\in\bigcup_{i=0}^{t}A_{i}$ has $t(y)\geq t$ .

This differs ever so slightly from the condition that Morris and Saxton worked with since we only guarantee $|N(y)\cap A_{i}|\geq\frac{1}{2}\varepsilon_{t}\ell m^{1/b}$ as opposed to $|N(y)\cap A_{i}|\geq\varepsilon_{t}\ell m^{1/b}$ . By slightly adjusting the constants of Morris and Saxton, their same proof still carries over word for word for any vertex $x$ as in Lemma 2.2; see Appendix B for details. Thus to prove Proposition 2.1, it suffices to prove Lemma 2.2, which will be our goal for the rest of this subsection.

For any integer $2\leq t^{\prime}\leq b$ , define

\Lambda(t^{\prime})=(4b)^{t^{\prime}-b}\ell^{(b-t^{\prime})/(b-1)}m^{t^{\prime}/b}.

Note that $\ell m^{1/b}\leq m$ (since $\ell m^{1/b}$ is the minimum degree of an $m$ -vertex graph), i.e. $\ell^{1/(b-1)}\leq m^{1/b}$ , and thus $\Lambda(t^{\prime})$ is an increasing function in $t^{\prime}$ . From now on we let $t$ be the smallest integer such that there are at least $\Lambda(t)$ vertices with $t(x)\leq t$ . Note that $t=b$ satisfies these conditions, so such a (smallest) integer exists.
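For completeness, the monotonicity of $\Lambda$ can be verified directly from the definition. The following worked computation is a sketch using only the inequality $\ell^{1/(b-1)}\leq m^{1/b}$ noted above.

```latex
\frac{\Lambda(t'+1)}{\Lambda(t')}
  = \frac{(4b)^{t'+1-b}\,\ell^{(b-t'-1)/(b-1)}\,m^{(t'+1)/b}}
         {(4b)^{t'-b}\,\ell^{(b-t')/(b-1)}\,m^{t'/b}}
  = 4b\cdot\frac{m^{1/b}}{\ell^{1/(b-1)}}
  \geq 4b .
```

In particular, consecutive values of $\Lambda$ differ by a factor of at least $4b$, and $\Lambda(b)=m$.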

Let $Y_{0}$ denote the set of vertices $x$ with $t(x)<t$ . Iteratively define $Y_{i}$ to be the set of vertices $x\notin\bigcup_{j=0}^{i-1}Y_{j}$ which have at least $\alpha\ell m^{1/b}$ neighbors in $Y_{i-1}$ , where $\alpha:=\frac{1}{2(b+1)}\varepsilon_{t}$ . Note that every $x\in Y_{i}$ with $i\geq 1$ has $t(x)\geq t$ since $x\notin Y_{0}$ .
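The iterative definition of the sets $Y_{i}$ can be sketched in code as follows. Computing $t(x)$ itself is not attempted here: the set $Y_{0}$ and the threshold (playing the role of $\alpha\ell m^{1/b}$) are taken as inputs, and the function name `danger_layers` is a hypothetical label.

```python
def danger_layers(adj, y0, threshold, num_layers):
    """Return [Y_0, Y_1, ..., Y_num_layers].

    Y_i consists of the vertices outside Y_0, ..., Y_{i-1} that have at least
    `threshold` neighbours in Y_{i-1}. `adj` maps each vertex to its neighbour set.
    """
    layers = [set(y0)]
    used = set(y0)
    for _ in range(num_layers):
        prev = layers[-1]
        new = {x for x in adj if x not in used and len(adj[x] & prev) >= threshold}
        layers.append(new)
        used |= new
    return layers


path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
print(danger_layers(path, {0}, threshold=1, num_layers=3))
```

By construction the layers are pairwise disjoint, which is the property used when comparing $|Y_{i}|$ and $|Y_{i+1}|$.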

To motivate these definitions, we observe that in proving Lemma 2.2 with $t$ as stated, we cannot include any vertex of $Y_{0}$ in any of the $A_{i}$ sets. While we are allowed to include vertices of $Y_{1}$ in these sets, these vertices are “dangerous” since a large number of their neighbors lie in $Y_{0}$, and similarly it is somewhat dangerous to include $Y_{2}$ since a large number of their neighbors are in $Y_{1}$, and so on. We thus want to show that these $Y_{i}$ sets are all relatively small, which is accomplished by the following lemma.

Lemma 2.3.

If $t=2$ then $Y_{i}=\emptyset$ for all $i\geq 0$ , and otherwise $|Y_{i}|\leq\Lambda(t-1)$ for all $i\geq 0$ .

For this proof, we note that by choosing $\varepsilon$ sufficiently small in Proposition 2.1, we may assume $m\geq\ell^{b/(b-1)}\geq\varepsilon^{-b/(b-1)}$ is sufficiently large compared to all of the constants $\varepsilon_{t^{\prime}}$ .

Proof.

If $t=2$ then $Y_{0}=\emptyset$ , and hence inductively we have $Y_{i}=\emptyset$ for all $i$ . From now on we assume $t>2$ . We prove the result by induction on $i$ , the base case $|Y_{0}|\leq\Lambda(t-1)$ being immediate from the definition of $t$ and $Y_{0}$ . Say we have proven $|Y_{i}|\leq\Lambda(t-1)$ for some $i\geq 0$ . The key technical observation we need is the following.

Claim 2.4.

There exists a non-empty bipartite graph $G^{\prime}\subseteq G$ with bipartition $S\cup T$ such that $S\subseteq Y_{i}$ , $T\subseteq Y_{i+1}$ , and such that $d_{G^{\prime}}(y)\geq\frac{1}{4}\alpha\ell m^{1/b}$ for $y\in T$ and $d_{G^{\prime}}(y)\geq\frac{1}{4}\alpha\ell m^{1/b}|Y_{i}|^{-1}|Y_{i+1}|$ for $y\in S$ .

Proof.

Let $G^{*}\subseteq G$ be the graph on $Y_{i}\cup Y_{i+1}$ obtained after deleting every edge within $Y_{i}$ and within $Y_{i+1}$ . Note that by definition, each vertex of $Y_{i+1}$ has at least $\alpha\ell m^{1/b}$ neighbors in $Y_{i}$ (which is disjoint from $Y_{i+1}$ ), so $e(G^{*})\geq\alpha\ell m^{1/b}|Y_{i+1}|$ . Define $G^{\prime}$ by iteratively deleting every vertex which violates the degree conditions of the claim. Note that the number of edges deleted in this process is at most

\frac{1}{4}\alpha\ell m^{1/b}\cdot|Y_{i+1}|+\frac{1}{4}\alpha\ell m^{1/b}|Y_{i}|^{-1}|Y_{i+1}|\cdot|Y_{i}|=\frac{1}{2}\alpha\ell m^{1/b}|Y_{i+1}|<e(G^{*}).

In particular, $G^{\prime}$ is non-empty, and it satisfies all of the other properties by construction. ∎

Returning to our induction, we wish to show that $|Y_{i+1}|\leq\Lambda(t-1)$ ; our inductive hypothesis gives $|Y_{i}|\leq\Lambda(t-1)$ , so it suffices to prove that $|Y_{i+1}|\leq|Y_{i}|$ . Assume for contradiction that $|Y_{i+1}|>|Y_{i}|$ . Let $x$ be any vertex of $T$ (which exists since $G^{\prime}$ is non-empty), and let $A_{0},\ldots,A_{t-1}$ be defined by $A_{0}=\{x\}$ and $A_{j}=\bigcup_{y\in A_{j-1}}N_{G^{\prime}}(y)$ . Note that $A_{j}\subseteq S$ if and only if $j$ is odd since $G^{\prime}$ is bipartite. Also note that for all $y\in A_{j-1}$ we have

|N_{G}(y)\cap A_{j}|\geq|N_{G^{\prime}}(y)\cap A_{j}|\geq\frac{1}{4}\alpha\ell m^{1/b}\geq\varepsilon_{t-1}\ell m^{1/b},

where this last step used $\alpha=\frac{1}{2(b+1)}\varepsilon_{t}$ and that $\varepsilon_{t-1}$ is sufficiently small compared to $\varepsilon_{t}$ . In particular, if $t>2$ is even, then $(A_{0},\ldots,A_{t-1})$ is a concentrated $(t-1)$ -neighborhood of $x$ since

|A_{t-1}|\leq|S|\leq|Y_{i}|\leq\Lambda(t-1)\leq\ell^{(b-t+1)/(b-1)}m^{(t-1)/b}.

This implies $t(x)<t$ , a contradiction to $x\in T\subseteq Y_{i+1}$ since $Y_{i+1}$ is disjoint from $Y_{0}$ .

Thus we may assume $t>2$ is odd. Define the random set $A^{\prime}_{t-1}\subseteq Y_{i+1}$ by including each vertex of $Y_{i+1}$ independently and with probability $p=|Y_{i}||Y_{i+1}|^{-1}$, which is well defined since we assumed $|Y_{i+1}|>|Y_{i}|$. Observe that $|A^{\prime}_{t-1}|$ is a binomial random variable with $|Y_{i+1}|$ trials and probability of success $p$. Since $\mathbb{E}[|A^{\prime}_{t-1}|]=p|Y_{i+1}|=|Y_{i}|\leq\Lambda(t-1)$, by Markov’s inequality we have $\Pr[|A^{\prime}_{t-1}|\geq 2\Lambda(t-1)]\leq 1/2$. Thus we conclude that the event $|A^{\prime}_{t-1}|<2\Lambda(t-1)\leq\ell^{(b-t+1)/(b-1)}m^{(t-1)/b}$ occurs with probability at least 1/2.

Similarly, for each $y\in A_{t-2}\subseteq S$, the random variable $|N_{G^{\prime}}(y)\cap A_{t-1}^{\prime}|$, which is a lower bound for $|N_{G}(y)\cap A_{t-1}^{\prime}|$, is binomial with success probability $p$ and number of trials $d_{G^{\prime}}(y)\geq\frac{1}{4}\alpha\ell m^{1/b}p^{-1}$ (recall that every neighbor of $y$ in $G^{\prime}$ lies in $T\subseteq Y_{i+1}$). Thus by the multiplicative Chernoff inequality, we have for any $y\in A_{t-2}$,

\Pr[|N_{G}(y)\cap A_{t-1}^{\prime}|\leq\frac{1}{8}\alpha\ell m^{1/b}]\leq e^{-\alpha\ell m^{1/b}/32},

and for $m$ sufficiently large this probability is at most $.1m^{-1}$ . By a union bound over $y\in A_{t-2}$ , we see that with probability at least $.9$ , every vertex $y\in A_{t-2}$ satisfies $|N_{G}(y)\cap A_{t-1}^{\prime}|\geq\frac{1}{8}\alpha\ell m^{1/b}\geq\varepsilon_{t-1}\ell m^{1/b}$ .
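The constant in the exponent above can be traced as follows. This worked computation is a sketch that assumes the multiplicative Chernoff bound in the common form $\Pr[X\leq(1-\delta)\mu]\leq e^{-\delta^{2}\mu/2}$ for a binomial $X$ with mean $\mu$ and $0<\delta<1$, applied with $\delta=1/2$; it also uses $N_{G^{\prime}}(y)\subseteq N_{G}(y)$.

```latex
\begin{aligned}
\mu &:= \mathbb{E}\,\big|N_{G'}(y)\cap A'_{t-1}\big| = p\,d_{G'}(y) \geq \tfrac{1}{4}\alpha\ell m^{1/b},\\
\Pr\Big[\big|N_{G}(y)\cap A'_{t-1}\big| \leq \tfrac{1}{8}\alpha\ell m^{1/b}\Big]
  &\leq \Pr\Big[\big|N_{G'}(y)\cap A'_{t-1}\big| \leq \tfrac{1}{2}\mu\Big]
  \leq e^{-(1/2)^{2}\mu/2} = e^{-\mu/8} \leq e^{-\alpha\ell m^{1/b}/32}.
\end{aligned}
```

Combined with the probability at least $1/2$ from Markov’s inequality, both desired events hold simultaneously with probability at least $1-\frac{1}{2}-0.1>0$.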

In total we conclude that there exists some choice of $A^{\prime}_{t-1}\subseteq Y_{i+1}$ such that both $|A^{\prime}_{t-1}|\leq\ell^{(b-t+1)/(b-1)}m^{(t-1)/b}$ and $|N_{G}(y)\cap A_{t-1}^{\prime}|\geq\varepsilon_{t-1}\ell m^{1/b}$ for all $y\in A_{t-2}$ (since in particular, this holds with positive probability for a random subset of $Y_{i+1}$ ). Thus $(A_{0},\ldots,A_{t-2},A_{t-1}^{\prime})$ is a concentrated $(t-1)$ -neighborhood of $x$ . This implies $t(x)<t$ , which again is a contradiction. We conclude $|Y_{i+1}|\leq|Y_{i}|$ , and hence $|Y_{i+1}|\leq\Lambda(t-1)$ by the inductive hypothesis. ∎

We are now ready to prove Lemma 2.2.

Proof of Lemma 2.2.

Let $X$ be the set of vertices $x$ with $t(x)=t$ and $x\notin\bigcup_{i=1}^{b}Y_{i}$ .

We claim that $|X|\geq\frac{1}{2}\Lambda(t)$ . Indeed, by definition of $t$ , there are at least $\Lambda(t)$ vertices with $t(x)\leq t$ . Every vertex with $t(x)\leq t$ is either in $X$ or $\bigcup_{i=0}^{b}Y_{i}$ , so by the previous lemma,

|X|\geq\Lambda(t)-|\bigcup_{i=0}^{b}Y_{i}|\geq\Lambda(t)-(b+1)\Lambda(t-1)\geq\frac{1}{2}\Lambda(t).
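The last inequality in this display follows from the definition of $\Lambda$ together with $\ell^{1/(b-1)}\leq m^{1/b}$. As a brief worked check (for $t>2$; when $t=2$ the sets $Y_{i}$ are empty and the bound is immediate):

```latex
(b+1)\,\Lambda(t-1)
  = (b+1)\cdot\frac{\ell^{1/(b-1)}}{4b\,m^{1/b}}\,\Lambda(t)
  \leq \frac{b+1}{4b}\,\Lambda(t)
  \leq \frac{1}{2}\,\Lambda(t),
\qquad\text{since } b+1\leq 2b .
```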

It remains to find the tuple of sets $\mathcal{A}$ guaranteed by Lemma 2.2 for each $x\in X$ . For each $x\in X$ , by definition of $t(x)=t$ , there exists a tuple $(A^{\prime}_{0},A^{\prime}_{1},\ldots,A^{\prime}_{t})$ with $A^{\prime}_{0}=\{x\}$ , $|A^{\prime}_{t}|\leq\ell^{(b-t)/(b-1)}m^{t/b}$ , and $|N(y)\cap A_{i}^{\prime}|\geq\varepsilon_{t}\ell m^{1/b}$ for all $y\in A_{i-1}^{\prime}$ . Define $A_{i}=A^{\prime}_{i}\setminus\bigcup_{j=0}^{b-i}Y_{j}$ . Note that with this we have $A_{0}=\{x\}$ , $|A_{t}|\leq\ell^{(b-t)/(b-1)}m^{t/b}$ , and no $y\in A_{i}$ has $t(y)<t$ because we removed $Y_{0}$ from each $A_{i}$ .

It remains to show that each $y\in A_{i-1}$ has many neighbors in $A_{i}$ . Since each $y\in A_{i-1}$ does not belong to any $Y_{j}$ with $1\leq j\leq b-i+1$ , by definition $y$ has at most $(b+1)\alpha\ell m^{1/b}$ neighbors in $\bigcup_{j=0}^{b-i}Y_{j}$ . This implies

|N(y)\cap A_{i}|=|N(y)\cap(A_{i}^{\prime}\setminus\bigcup_{j=0}^{b-i}Y_{j})|\geq|N(y)\cap A_{i}^{\prime}|-(b+1)\alpha\ell m^{1/b}\geq(\varepsilon_{t}-(b+1)\alpha)\ell m^{1/b}=\frac{1}{2}\varepsilon_{t}\ell m^{1/b},

proving the result. ∎

2.2 Minimum Degrees

A very minor step in the proof of Morris and Saxton calls for deleting vertices of low degree in $G$ . In their setting this is fine, as this does not significantly decrease the number of edges in $G$ . However, because the focus of our approach is on balanced supersaturation for vertices rather than for edges, we will need to be more careful with this step.

Towards this end, we use the reduction lemma stated below, which guarantees a subgraph $G^{\prime}\subseteq G$ of large minimum degree, where the degree condition becomes stronger the more vertices are removed from $G$. In particular, the tradeoff is roughly what one would expect if $G$ were a clique $G^{\prime}$ together with some number of isolated vertices.

As a small technical convenience, we will prove this lemma in the more general setting of multigraphs with loops. Here, the degree of a vertex $v$ is the number of edges incident to $v$ (so each loop contributes 1 to its degree).

Lemma 2.5.

Let $G$ be an $n$ -vertex multigraph with loops. For all $b\geq 1$ , there exists a subgraph $G^{\prime}\subseteq G$ with $v(G^{\prime})>0$ and minimum degree at least

2^{-b}\left(\frac{v(G^{\prime})}{n}\right)^{1/b}\frac{e(G)}{v(G^{\prime})}.

Proof.

Write $e=e(G)$ . The result is trivial if $e=0$ , so assume $e>0$ . Assume for contradiction that no such subgraph $G^{\prime}$ exists. We claim that for all non-negative integers $r$ there exist $G_{r}\subseteq G$ with at most $2^{-br}n$ vertices and at least $2^{-r}e$ edges. The result holds with $G_{0}=G$ , so inductively assume the result has been proven through some $r$ . Let $G_{r+1}\subseteq G_{r}$ be the graph obtained after iteratively deleting vertices of degree less than $2^{(b-1)r-1}(e/n)$ from $G_{r}$ . Note that

e(G_{r+1})\geq e(G_{r})-2^{(b-1)r-1}(e/n)\cdot v(G_{r})\geq 2^{-r}e-2^{(b-1)r-1}(e/n)\cdot 2^{-br}n\geq 2^{-r-1}e.

If $v(G_{r+1})\geq 2^{-b(r+1)}n>0$ , then $2^{r}\geq\frac{1}{2}(\frac{n}{v(G_{r+1})})^{1/b}$ . This implies that $G_{r+1}$ has minimum degree at least

2^{(b-1)r-1}(e/n)\geq\left(\frac{n}{v(G_{r+1})}\right)^{\frac{b-1}{b}}2^{-b}(e/n)=2^{-b}\left(\frac{v(G_{r+1})}{n}\right)^{1/b}\frac{e(G)}{v(G_{r+1})},

where this first inequality implicitly uses $b-1\geq 0$. This contradicts our assumption that no such subgraph of $G$ exists, so we must have $v(G_{r+1})<2^{-b(r+1)}n$. Thus $G_{r+1}$ satisfies the desired conditions, proving the claim. Taking $r=\log_{2}(n)$, the claim implies there exists a subgraph on fewer than 1 vertex with at least $e/n>0$ edges, which is impossible. ∎
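The proof above can be read as a procedure: peel off low-degree vertices with a threshold that grows with the round $r$, and stop as soon as the surviving subgraph meets the bound of Lemma 2.5. The following Python sketch illustrates this for simple graphs only. The adjacency-dictionary representation and the function names `min_degree_bound`, `peel` and `dense_subgraph` are illustrative assumptions and are not part of the argument.

```python
def min_degree_bound(v_sub, n, e, b):
    """Right-hand side of Lemma 2.5 for a subgraph on v_sub vertices."""
    return 2.0 ** (-b) * (v_sub / n) ** (1.0 / b) * e / v_sub


def peel(adj, vertices, threshold):
    """Iteratively delete vertices whose degree inside the surviving set
    is below `threshold`; return the surviving vertex set."""
    alive = set(vertices)
    changed = True
    while changed:
        changed = False
        for x in list(alive):
            if len(adj[x] & alive) < threshold:
                alive.discard(x)
                changed = True
    return alive


def dense_subgraph(adj, b):
    """Follow the rounds r = 0, 1, 2, ... of the proof and return a vertex
    set whose induced subgraph satisfies the minimum degree bound.

    `adj` maps each vertex to the set of its neighbours (simple graph,
    at least one vertex; this representation is an assumption).
    """
    n = len(adj)
    e = sum(len(nbrs) for nbrs in adj.values()) // 2
    if e == 0:
        return set(adj)  # the bound is 0, so G itself works
    alive = set(adj)
    r = 0
    while True:
        # Threshold used in round r of the proof.
        alive = peel(adj, alive, 2.0 ** ((b - 1) * r - 1) * e / n)
        if not alive:
            # The edge count in the proof rules this out (up to rounding).
            raise RuntimeError("peeling removed every vertex")
        min_deg = min(len(adj[x] & alive) for x in alive)
        if min_deg >= min_degree_bound(len(alive), n, e, b):
            return alive
        r += 1


# A clique on 5 vertices together with 20 isolated vertices.
adj = {i: {j for j in range(5) if j != i} for i in range(5)}
adj.update({i: set() for i in range(5, 25)})
core = dense_subgraph(adj, 2)
```

In this example the first round already discards the isolated vertices and returns the clique, matching the heuristic that the tradeoff is modelled on a clique plus isolated vertices.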

The only reason we proved Lemma 2.5 in the more general setting of multigraphs with loops is to prove the following technical result.

Lemma 2.6.

Let $B$ be a set, and let $f$ be any function from $B$ to $\mathbb{Z}_{\geq 0}$ . For all $b>1$ , there exists a subset $B^{\prime}\subseteq B$ such that

\min_{y^{\prime}\in B^{\prime}}f(y^{\prime})\geq 2^{-b}\left(\frac{|B^{\prime}|}{|B|}\right)^{1/b}\frac{\sum_{y\in B}f(y)}{|B^{\prime}|}.

Proof.

Define an auxiliary graph $G$ on $B$ where each vertex $y$ has $f(y)$ loops (and these are the only edges in $G$ ). Applying Lemma 2.5 gives the result. ∎

Roughly speaking, this lemma will be applied with $B$ the set of vertices that are at distance $b$ from some vertex $x$ and with $f(y)$ the number of paths of length $b$ from $x$ to $y$ . This will allow us to choose vertices $x$ and $y$ connected by many paths of length $b$ , which we can use to construct copies of $\theta_{a,b}$ where $x$ and $y$ are the two high-degree vertices.
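For a finite set $B$, a subset $B^{\prime}$ as in Lemma 2.6 can be found by a simple search, since among all subsets of a given size the one maximizing the minimum of $f$ consists of the largest values. The Python sketch below assumes $f$ is given as a list of non-negative integers indexed by the elements of $B$; the function name `large_min_subset` is hypothetical.

```python
def large_min_subset(f, b):
    """Return a set of indices B' such that
    min_{y in B'} f[y] >= 2^{-b} (|B'|/|B|)^{1/b} * sum(f) / |B'|.

    Only prefixes of the values sorted in decreasing order are checked,
    because for a fixed size these maximize the minimum of f.
    """
    order = sorted(range(len(f)), key=lambda i: f[i], reverse=True)
    total = sum(f)
    size_b = len(f)
    for j in range(1, size_b + 1):
        bound = 2.0 ** (-b) * (j / size_b) ** (1.0 / b) * total / j
        if f[order[j - 1]] >= bound:
            return set(order[:j])
    # Lemma 2.6 guarantees that some prefix works when B is non-empty.
    raise ValueError("no suitable subset found")


# One element carries most of the weight, e.g. one endpoint y that is
# reached by many paths from x.
chosen = large_min_subset([9, 1, 1, 1], 2)
```

In the intended application the entries of `f` would be the path counts $f(y)$ described above, so the returned indices correspond to endpoints $y$ that all receive many paths from $x$.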

3 Preliminaries

3.1 Key Definitions

As noted in the proof outline (Subsection 1.1), we wish to consider hypergraphs on $V(\theta_{a,b})\times V(G)$. To aid with this, we make use of the following definitions throughout the paper; see Figure 3 for an example.

Definition 2.

Given a graph $G$ and a set $\chi\subseteq V(\theta_{a,b})\times V(G)$ , we define the projection sets $\chi_{\theta}=\{w:\exists z,(w,z)\in\chi\}$ and $\chi_{G}=\{z:\exists w,(w,z)\in\chi\}$ . We say that a set $\chi\subseteq V(\theta_{a,b})\times V(G)$ is valid if

1. $|\chi|=|\chi_{\theta}|=|\chi_{G}|$ (equivalently, each vertex of $\theta_{a,b}$ and $G$ appears at most once in a pair of $\chi$), and

2. If $(w,z),(w^{\prime},z^{\prime})\in\chi$ with $ww^{\prime}\in E(\theta_{a,b})$, then $zz^{\prime}\in E(G)$.

We let $\mathbb{V}$ denote the set of valid subsets of $V(\theta_{a,b})\times V(G)$ .
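The two conditions of Definition 2 are straightforward to check mechanically. The Python sketch below does so for a toy instance; the representation (pairs as tuples, edges as two-element frozensets) and the function name `is_valid` are assumptions made for illustration, and the three-vertex pattern used in the example is a path rather than an actual theta graph.

```python
def is_valid(chi, theta_edges, graph_edges):
    """Check whether a set of pairs (w, z) is valid in the sense of Definition 2.

    chi         -- set of pairs (pattern vertex w, graph vertex z)
    theta_edges -- set of frozensets {w, w'} (edges of the pattern)
    graph_edges -- set of frozensets {z, z'} (edges of G)
    """
    theta_part = {w for (w, _) in chi}   # projection chi_theta
    graph_part = {z for (_, z) in chi}   # projection chi_G
    # Condition 1: no pattern vertex and no graph vertex is used twice.
    if not (len(chi) == len(theta_part) == len(graph_part)):
        return False
    # Condition 2: pattern edges inside chi_theta map to edges of G.
    image = dict(chi)
    for edge in theta_edges:
        w, w2 = tuple(edge)
        if w in image and w2 in image:
            if frozenset((image[w], image[w2])) not in graph_edges:
                return False
    return True


# Toy pattern: the path u - w - v; toy graph: the path 0 - 1 - 2.
theta_edges = {frozenset(("u", "w")), frozenset(("w", "v"))}
graph_edges = {frozenset((0, 1)), frozenset((1, 2))}
ok = is_valid({("u", 0), ("w", 1), ("v", 2)}, theta_edges, graph_edges)
```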

Note that if $\chi$ is a valid set with $|\chi|=v(\theta_{a,b})$, then by definition this means the map $\phi_{\chi}:V(\theta_{a,b})\to V(G)$ which sends $w\in V(\theta_{a,b})$ to the unique vertex $z\in V(G)$ with $(w,z)\in\chi$ is an injective homomorphism. In particular, the vertices $\chi_{G}$ induce a graph containing a copy of $\theta_{a,b}$ as a subgraph. Since our ultimate goal is to find a large collection of such subgraphs which are spread out, we make the following definitions.

Definition 3.

We say that a hypergraph $\mathcal{H}$ is a $G$ -hypergraph if its vertex set is $V(\theta_{a,b})\times V(G)$ and all of its hyperedges are valid sets $h$ with $|h|=v(\theta_{a,b})$ . We say that functions of the form $D:2^{V(\theta_{a,b})}\to\mathbb{N}\cup\{\infty\}$ are codegree functions, and for such a $D$ we say that $\mathcal{H}$ is $D$ -good if $\deg_{\mathcal{H}}(\chi)\leq D(\chi_{\theta})$ for all $\chi\in\mathbb{V}$ .

Note that being $D$-good means that no valid set $\chi$ is contained in too many hyperedges, with the exact degree condition depending only on $D$ and the projection $\chi_{\theta}\subseteq V(\theta_{a,b})$ (which allows us, for example, to impose stronger conditions if $\chi$ contains vertices corresponding to the high degree vertices of $\theta_{a,b}$).

The main technical work of this paper is in constructing $G$ -hypergraphs which have many hyperedges and which are $D$ -good for some $D$ which is sufficiently small to apply 6.1. To do this, we will consider several $D$ functions simultaneously (which will ultimately be combined into a unified codegree bound in the next two sections). The first and simplest function we consider is the following.

Definition 4.

Let $\delta>0$ , and let $k,n$ be positive integers. We define a codegree function $D_{\operatorname{forest}}$ as follows: if $\nu\subseteq V(\theta_{a,b})$ induces a forest on $e\geq 1$ edges, then

D_{\operatorname{forest}}(\nu)=\left\lceil\frac{k^{ab}n^{2}}{\delta kn^{1+1/b}\cdot(\delta k^{b/(b-1)})^{e-1}}\right\rceil,

and otherwise $D_{\operatorname{forest}}(\nu)=\infty$ .

Having $D_{\operatorname{forest}}(\nu)$ evaluate to infinity is not necessary, and we do this only to emphasize that this function essentially ignores sets $\nu$ which do not induce a forest with at least one edge.

Essentially, the main technical result of Morris and Saxton and Corsten and Tran says that one can construct large $G$ -hypergraphs which are $D_{\operatorname{forest}}$ -good. To go beyond this, we will show that one can construct collections which are $D$ -good for functions $D$ which are finite on sets $\nu\subseteq V(\theta_{a,b})$ that contain cycles (and more precisely, sets that contain the two vertices of $\theta_{a,b}$ of large degree).

The specific functions $D$ we need are somewhat complex. In all of these functions, the denominator of $D(\nu)$ roughly corresponds to the number of choices our algorithm has to build $\nu$, with the terms to the left of the “$\cdot$” typically counting the number of choices for the two high degree vertices of $\theta_{a,b}$. The parameter $s$ will be chosen roughly such that the graph $G^{\prime}$ obtained from Lemma 2.5 has $2^{-s}n$ vertices.

With all of this established, we define the remainder of our codegree functions.

Definition 5.

Let $\delta>0$ , and let $k,n$ be positive integers. For $a\geq 3$ , let $u,v$ denote the two vertices of $\theta_{a,b}$ of degree larger than 2. For each integer $0\leq s\leq 3\log n$ , we define a codegree function $D_{s,b}$ as follows: if $u,v\in\nu$ , then

D_{s,b}(\nu)=\left\lceil\frac{k^{ab}n^{2}}{2^{-s}n^{2}\cdot\delta^{|\nu|}(2^{2s/3}k^{b})^{(|\nu|-2)/(b-1)}}\right\rceil,

and otherwise $D_{s,b}(\nu)=\infty$ .

The definition above will be used when the $t$ value from Proposition 2.1 equals $b$ . The $t<b$ case is somewhat more complicated. Again this is because the denominator roughly represents the number of choices we have for our algorithm at any step, and just as in [10, 20], the $t<b$ case of the algorithm is somewhat more complicated.
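To make the shape of Definition 5 concrete, the following Python sketch evaluates $D_{s,b}$ from the size of $\nu$ and a flag recording whether $u,v\in\nu$. The function name `d_sb` and its argument list are illustrative assumptions; floating-point arithmetic is used, so values for large parameters are only approximate.

```python
import math


def d_sb(nu_size, has_u_and_v, s, a, b, k, n, delta):
    """Evaluate D_{s,b}(nu) from Definition 5.

    nu_size     -- |nu|
    has_u_and_v -- True if both high-degree vertices u, v lie in nu
    """
    if not has_u_and_v:
        return math.inf
    numerator = k ** (a * b) * n ** 2
    # 2^{-s} n^2 counts the choices for (u, v); every further vertex of nu
    # contributes a factor of delta * (2^{2s/3} k^b)^{1/(b-1)}.
    denominator = (
        2.0 ** (-s)
        * n ** 2
        * delta ** nu_size
        * (2.0 ** (2 * s / 3) * k ** b) ** ((nu_size - 2) / (b - 1))
    )
    return math.ceil(numerator / denominator)


# Small illustrative parameters: a = 3, b = 3, k = 2, n = 10, delta = 1/2.
only_uv = d_sb(2, True, 0, 3, 3, 2, 10, 0.5)
with_two_more = d_sb(4, True, 0, 3, 3, 2, 10, 0.5)
```

With these parameters the bound drops from 2048 to 1024 when two further vertices are added to $\nu$, reflecting that larger sets $\nu$ are allowed smaller codegrees.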

Definition 6.

Let $\delta>0$ , and let $k,n$ be positive integers. For $0\leq s\leq 3\log n$ and $2\leq t<b$ , we define a codegree function $D_{s,t}$ as follows: write the paths of $\theta_{a,b}$ as $uw_{1}^{j}\cdots w_{b-1}^{j}v$ for $1\leq j\leq a$ , and define

F_{t}=\{w_{i}^{j}:t\leq i<b,\ i-t\textrm{ even}\},\qquad f=|\nu\cap F_{t}|.

If $u,v\in\nu$ , then

D_{s,t}(\nu)=\left\lceil\frac{k^{ab}n^{2}}{2^{-2s}k^{(2b-2t+1)/(b-1)}n^{(2t-1)/b}\cdot\delta^{|\nu|}(2^{2s/3}kn^{1/b})^{f}(2^{2s/3}k^{b/(b-1)})^{|\nu|-f-2}}\right\rceil,

and otherwise $D_{s,t}(\nu)=\infty$ .

The intuition for this codegree function is as follows: in the $t<b$ case with $s=0$, our algorithm first selects $u,v$, which it will be able to do in about $k^{(2b-2t+1)/(b-1)}n^{(2t-1)/b}$ ways (which is essentially the product of the bounds from Proposition 2.1(c) and (h)). When choosing $w_{i}^{j}$, the number of choices will turn out to be about $kn^{1/b}$ if $w_{i}^{j}\in F_{t}$ and about $k^{b/(b-1)}$ if $w_{i}^{j}\notin F_{t}$. Thus the denominator represents the number of choices our algorithm has for building $\nu$.

One last codegree function is needed for the $t<b$ case.

Definition 7.

Let $\delta>0$ , and let $k,n$ be positive integers. For $2\leq t<b$ , we define a codegree function $D_{t}$ as follows: define $F_{t}$ as above and let $g=|\nu\cap F_{t}|$ if $b-t$ is even and $g=|\nu\cap F_{t}|-1$ otherwise. If $u,v,w_{b-1}^{j}\in\nu$ for some $j$ , then

D_{t}(\nu)=\left\lceil\frac{k^{ab}n^{2}}{kn^{1+1/b}\cdot(kn^{1/b})^{g}(k^{b/(b-1)})^{|\nu|-g-3}\delta^{|\nu|}}\right\rceil,

and otherwise $D_{t}(\nu)=\infty$ . For notational convenience, we also define $D_{b}$ by $D_{b}(\nu)=\infty$ for all $\nu\subseteq V(\theta_{a,b})$ .

The motivation for this definition is that the number of choices for $u,v,w_{b-1}^{j}$ is at least the number of choices for just $v,w_{b-1}^{j}$, which is about $kn^{1+1/b}$, i.e. the total number of edges in the graph $G$. As before, the number of choices for every other $w_{i}^{j}$ vertex will depend on whether it is in $F_{t}$ or not. The definition of $g$ reflects the fact that if $b-t$ is odd, then $w_{b-1}^{j}\in F_{t}$, but the number of choices for the first vertex of this form is already accounted for by the $kn^{1+1/b}$ term, so we cannot include an extra factor of $kn^{1/b}$ for this vertex. Note that with this codegree function, we omit counting the number of choices for $u$ in the denominator. As such, $D_{t}$ will typically be much weaker (i.e. larger) than $D_{s,t}$, though it will do better when $\nu$ has few vertices and $k$ is small.

3.2 Saturated Sets

When building our $G$ -hypergraph $\mathcal{H}$ , we need to be careful to avoid constructing theta graphs which contain a subset that has very large codegree in $\mathcal{H}$ . To aid with this, we introduce the following.

Definition 8.

Let $\mathcal{H}$ be a $G$-hypergraph and $D$ a codegree function $D:2^{V(\theta_{a,b})}\to\mathbb{N}\cup\{\infty\}$. We define the set of saturated sets

\mathcal{F}(\mathcal{H},D)=\{\chi\in\mathbb{V}:\deg(\chi)\geq D(\chi_{\theta})\}.

Given a valid set $\chi$ and $\nu\subseteq V(\theta_{a,b})\setminus\chi_{\theta}$, define the link set $\mathcal{J}_{\mathcal{H},D}(\chi;\nu)$ to be the set of $\gamma\in\mathbb{V}$ with $\gamma_{\theta}=\nu$ such that $\chi\cup\gamma\in\mathcal{F}(\mathcal{H},D)$. If $\nu=\{w\}$ we will sometimes denote this set simply by $\mathcal{J}_{\mathcal{H},D}(\chi;w)$.

The intuition for the link set comes from our goal of algorithmically trying to iteratively add a new theta graph to some $\mathcal{H}$ such that $\mathcal{H}$ continues to be $D$-good even after adding the theta graph. If during the algorithm we have already designated some $\chi\in\mathbb{V}$ to be used in our new theta graph, and if our algorithm is about to choose some $\gamma$ to add to $\chi$ such that $\gamma_{\theta}=\nu$, then the algorithm cannot choose any $\gamma\in\mathcal{J}_{\mathcal{H},D}(\chi^{\prime};\nu)$ for any $\chi^{\prime}\subseteq\chi$, as otherwise the degree of $\chi^{\prime}\cup\gamma$ would be strictly larger than what $D$ dictates. As an aside, our definition of link sets differs slightly from Morris and Saxton, who essentially defined the links to be $\bigcup_{\chi^{\prime}\subseteq\chi}\mathcal{J}_{\mathcal{H},D}(\chi^{\prime};\nu)$.

Because the link sets consist of the “bad” choices for our algorithm, we will want to show that these sets are relatively small. This is accomplished by the following lemma.

Lemma 3.1.

Let $\mathcal{H}$ be a $G$-hypergraph which is $D$-good for some $D:2^{V(\theta_{a,b})}\to\mathbb{N}\cup\{\infty\}$. If $D(\chi_{\theta}\cup\nu)=\infty$ then $\mathcal{J}_{\mathcal{H},D}(\chi;\nu)=\emptyset$, and otherwise

|\mathcal{J}_{\mathcal{H},D}(\chi;\nu)|\leq 2^{v(\theta_{a,b})}\frac{D(\chi_{\theta})}{D(\chi_{\theta}\cup\nu)}.

Proof.

If $D(\chi_{\theta}\cup\nu)=\infty$ , then every $\gamma\in\mathbb{V}$ with $\gamma_{\theta}=\nu$ trivially has $\deg(\chi\cup\gamma)<D(\chi_{\theta}\cup\gamma_{\theta})=\infty$ , so no such $\gamma$ satisfies $\chi\cup\gamma\in\mathcal{F}(\mathcal{H},D)$ and we conclude $\mathcal{J}_{\mathcal{H},D}(\chi;\nu)=\emptyset$ . From now on we assume $D(\chi_{\theta}\cup\nu)<\infty$ . Note that

\sum_{\gamma\in\mathcal{J}_{\mathcal{H},D}(\chi;\nu)}\deg(\chi\cup\gamma)\leq\sum_{\gamma:\ |\gamma|=|\nu|}\deg(\chi\cup\gamma)\leq 2^{v(\theta_{a,b})}\deg(\chi)\leq 2^{v(\theta_{a,b})}D(\chi_{\theta}),

where the second inequality used that each hyperedge $h$ containing $\chi$ is counted at most $2^{v(\theta_{a,b})}$ times by the sum over $\gamma$ , and the last inequality used that $\mathcal{H}$ is $D$ -good. On the other hand,

\sum_{\gamma\in\mathcal{J}_{\mathcal{H},D}(\chi;\nu)}\deg(\chi\cup\gamma)\geq\sum_{\gamma\in\mathcal{J}_{\mathcal{H},D}(\chi;\nu)}D(\chi_{\theta}\cup\gamma_{\theta})=|\mathcal{J}_{\mathcal{H},D}(\chi;\nu)|D(\chi_{\theta}\cup\nu).

Rearranging these two inequalities gives

|\mathcal{J}_{\mathcal{H},D}(\chi;\nu)|\leq 2^{v(\theta_{a,b})}\frac{D(\chi_{\theta})}{D(\chi_{\theta}\cup\nu)},

completing the proof.

∎
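The definitions of saturated sets and link sets, together with the bound of Lemma 3.1, can be checked by brute force on a toy example. The Python sketch below assumes that every hyperedge contains exactly one pair for each pattern vertex and that $D\geq 1$ everywhere, so that any saturated set $\chi\cup\gamma$ is contained in some hyperedge and it suffices to read candidates $\gamma$ off the hyperedges containing $\chi$. The three-vertex pattern, the hyperedges, and the function $D(\nu)=2^{3-|\nu|}$ are invented purely for illustration.

```python
def projection(chi):
    """The projection chi_theta onto pattern vertices."""
    return frozenset(w for (w, _) in chi)


def degree(hyperedges, chi):
    """Number of hyperedges containing chi."""
    return sum(1 for h in hyperedges if chi <= h)


def link_set(hyperedges, D, chi, nu):
    """All gamma with gamma_theta = nu and deg(chi | gamma) >= D(chi_theta | nu)."""
    target = D(projection(chi) | nu)
    links = set()
    for h in hyperedges:
        if chi <= h:
            # The unique extension of chi inside h whose projection is nu.
            gamma = frozenset(p for p in h if p[0] in nu)
            if degree(hyperedges, chi | gamma) >= target:
                links.add(gamma)
    return links


# Toy data: pattern vertices u, w, v and three hyperedges through (u, 0).
hyperedges = [
    frozenset({("u", 0), ("w", 1), ("v", 2)}),
    frozenset({("u", 0), ("w", 1), ("v", 3)}),
    frozenset({("u", 0), ("w", 4), ("v", 5)}),
]


def D(nu):
    return 2 ** (3 - len(nu))


chi = frozenset({("u", 0)})
nu = frozenset({"w"})
links = link_set(hyperedges, D, chi, nu)
# Bound of Lemma 3.1 with v(theta) replaced by the 3 pattern vertices.
bound = 2 ** 3 * D(projection(chi)) / D(projection(chi) | nu)
```

Here only the extension by $(w,1)$ is saturated, since it lies in two hyperedges while the extension by $(w,4)$ lies in one, so the link set has size 1, well below the bound of 16.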

Because all of our codegree functions involve the ceiling of a real-valued function, the following result, which allows us to ignore the ceilings, will be slightly more convenient to use compared to Lemma 3.1.

Corollary 3.2.

Let $\mathcal{H}$ be a $G$-hypergraph which is $D$-good for some $D:2^{V(\theta_{a,b})}\to\mathbb{N}\cup\{\infty\}$, and suppose that $D(\nu)=\left\lceil D^{\prime}(\nu)\right\rceil$ for some $D^{\prime}:2^{V(\theta_{a,b})}\to\mathbb{R}_{>0}$. If $D(\chi_{\theta}\cup\nu)=\infty$ then $\mathcal{J}_{\mathcal{H},D}(\chi;\nu)=\emptyset$. If $D(\chi_{\theta}\cup\nu)\neq\infty$ and $\chi\notin\mathcal{F}(\mathcal{H},D)$, then

|\mathcal{J}_{\mathcal{H},D}(\chi;\nu)|\leq 2^{v(\theta_{a,b})+1}\frac{D^{\prime}(\chi_{\theta})}{D^{\prime}(\chi_{\theta}\cup\nu)}.

Moreover, this bound continues to hold for $D=D_{\operatorname{forest}}$ even if $\chi\in\mathcal{F}(\mathcal{H},D)$ provided $\delta$ is sufficiently small.

Proof.

Note that trivially $D(\chi_{\theta}\cup\nu)\geq D^{\prime}(\chi_{\theta}\cup\nu)$ and that $D(\chi_{\theta})\leq 2D^{\prime}(\chi_{\theta})$ provided $D(\chi_{\theta})\geq 2$. Thus in this case, the result follows immediately from Lemma 3.1, and in particular, this situation always holds for $D_{\operatorname{forest}}$ provided $\delta$ is sufficiently small.

It remains to consider the case that $D(\chi_{\theta})=1$ and $\chi\notin\mathcal{F}(\mathcal{H},D)$ . These two conditions imply $\deg(\chi)<D(\chi_{\theta})=1$ , so there exists no hyperedge of $\mathcal{H}$ containing $\chi$ . Thus there exists no $\gamma$ such that $\chi\cup\gamma\in\mathcal{F}(\mathcal{H},D)$ , i.e. such that $\deg(\chi\cup\gamma)\geq D(\chi_{\theta}\cup\gamma_{\theta})\geq 1$ . We conclude that the link set is empty in this case, and hence the result trivially holds. ∎

When applying this corollary it will always be immediate that $\chi\notin\mathcal{F}(\mathcal{H},D)$, and for simplicity we will omit saying this explicitly. (In terms of the notation for the next section, we will only apply Corollary 3.2 with $D\neq D_{\operatorname{forest}}$ when $\chi$ is a subset of an $(s,t)$-compatible set, which by definition will not be in $\mathcal{F}(\mathcal{H},D)$.)

4 Balanced Supersaturation for Vertices

In this section, we prove our main technical theorem: a balanced supersaturation result for vertices.

Theorem 4.1.

For all $a\geq 6$ and $b\geq 3$ , there exist constants $\delta>0,k_{0}>0$ such that the following holds for all $n\in\mathbb{N}$ and $k\geq k_{0}$ . If $G$ is an $n$ -vertex graph with $kn^{1+1/b}$ edges, then there exists an integer $2\leq t\leq b$ and a $G$ -hypergraph $\mathcal{H}^{\prime}_{t}$ with $|\mathcal{H}^{\prime}_{t}|\geq b^{-1}\delta k^{ab}n^{2}$ which is $D^{\prime}_{t}$ -good, where $D^{\prime}_{t}$ is defined by

D^{\prime}_{t}(\nu):=\begin{cases}D_{\operatorname{forest}}(\nu)&\textrm{if }\nu\textrm{ induces a forest},\\ \min\{D_{t}(\nu),20(D_{0,t}(\nu)+\left\lceil\log n\right\rceil)\}&\textrm{otherwise}.\end{cases}

We note that there is no need to consider $b=2$ since the case $\theta_{a,2}=K_{2,a}$ is already dealt with by Morris and Saxton [20].

Theorem 4.1 will follow quickly from the following technical result, which roughly says that given a collection of far fewer than $k^{ab}n^{2}$ copies of $\theta_{a,b}$ satisfying certain codegree conditions, we can find an additional copy of $\theta_{a,b}$ to add to the collection while maintaining the desired codegrees.

Proposition 4.2.

For all $a\geq 6$ and $b\geq 3$, there exist constants $\delta>0,k_{0}>0$ such that the following holds for all $n\in\mathbb{N}$ and $k\geq k_{0}$. Let $G$ be an $n$-vertex graph on $kn^{1+1/b}$ edges, and let $\{\mathcal{H}_{s,t}\}_{s,t}$ be a set of $G$-hypergraphs such that $\mathcal{H}_{s,t}$ is $D_{s,t}$-good for each $0\leq s\leq 3\log n$ and $2\leq t\leq b$, and $\bigcup_{s}\mathcal{H}_{s,t}$ is $D_{t}$-good for each $t<b$, and $\bigcup_{s,t}\mathcal{H}_{s,t}$ is $D_{\operatorname{forest}}$-good.

If $|\bigcup_{s,t}\mathcal{H}_{s,t}|\leq\delta k^{ab}n^{2}$ , then there exists some valid set $h\in\mathbb{V}$ of size $v(\theta_{a,b})$ and some $s^{\prime},t^{\prime}$ such that $h\notin\bigcup_{s,t}\mathcal{H}_{s,t}$ , and such that if we define $\mathcal{H}^{\prime}_{s^{\prime},t^{\prime}}=\mathcal{H}_{s^{\prime},t^{\prime}}\cup\{h\}$ and $\mathcal{H}^{\prime}_{s,t}=\mathcal{H}_{s,t}$ for all $(s,t)\neq(s^{\prime},t^{\prime})$ , then $\mathcal{H}_{s,t}^{\prime}$ is $D_{s,t}$ -good for each $0\leq s\leq 3\log n$ and $2\leq t\leq b$ , and $\bigcup_{s}\mathcal{H}_{s,t}^{\prime}$ is $D_{t}$ -good for each $t<b$ , and $\bigcup_{s,t}\mathcal{H}_{s,t}^{\prime}$ is $D_{\operatorname{forest}}$ -good.

Before proving Proposition 4.2, we show how it may be repeatedly applied to obtain our main supersaturation theorem.

Proof of Theorem 4.1.

Initially start with collections $\{\mathcal{H}_{s,t}\}_{s,t}$ where $\mathcal{H}_{s,t}=\emptyset$ for all $s,t$. By repeatedly applying Proposition 4.2, we obtain collections satisfying all of the codegree conditions and with $|\bigcup_{s,t}\mathcal{H}_{s,t}|\geq\delta k^{ab}n^{2}$. In particular, there exists some $2\leq t\leq b$ such that $\mathcal{H}_{t}^{\prime}:=\bigcup_{s}\mathcal{H}_{s,t}$ contains at least $b^{-1}\delta k^{ab}n^{2}$ hyperedges.

By Proposition 4.2, we have for all $\chi\in\mathbb{V}$ that

\deg_{\mathcal{H}^{\prime}_{t}}(\chi)\leq\deg_{\bigcup_{s,t}\mathcal{H}_{s,t}}(\chi)\leq D_{\operatorname{forest}}(\chi_{\theta}),

and similarly

\deg_{\mathcal{H}^{\prime}_{t}}(\chi)\leq D_{t}(\chi_{\theta}).

To complete the proof, writing $\nu=\chi_{\theta}$, we only have to show $\deg_{\mathcal{H}^{\prime}_{t}}(\chi)\leq 20(D_{0,t}(\nu)+\left\lceil\log n\right\rceil)$ for all $\chi$ such that $\chi_{\theta}$ contains a cycle. We first consider the case $t=b$. Here Proposition 4.2 gives

\deg_{\mathcal{H}^{\prime}_{b}}(\chi)\leq\sum_{s=0}^{3\log n}\deg_{\mathcal{H}_{s,b}}(\chi)\leq\sum_{s=0}^{3\log n}D_{s,b}(\nu)\leq 3\left\lceil\log n\right\rceil+4D_{0,b}(\nu)\sum_{s=0}^{3\log n}2^{\left(1-\frac{2(|\chi_{\theta}|-2)}{3(b-1)}\right)s},

where this last step used that either $D_{s,b}(\nu)=1$ , or (by Definition 5) $D_{s,b}(\nu)$ differs from $D_{0,b}(\nu)$ by at most a multiplicative factor of $4\cdot 2^{\left(1-\frac{2(|\chi_{\theta}|-2)}{3(b-1)}\right)s}$ (where the factor of 4 comes from the two ceiling functions involving $D_{s,b}$ and $D_{0,b}$ ). Since $\chi_{\theta}$ contains a cycle, we have $|\chi_{\theta}|\geq 2b$ , so the sum above is at most $\sum_{s=0}^{\infty}2^{-s/3}\leq 5$ . We conclude that $\deg_{\mathcal{H}_{b}^{\prime}}(\chi)\leq D^{\prime}_{b}(\chi_{\theta})$ for all valid $\chi$ .

When $t<b$ , essentially the same reasoning gives that if $\chi_{\theta}$ contains a cycle then

\deg_{\mathcal{H}^{\prime}_{t}}(\chi)\leq 3\left\lceil\log n\right\rceil+4D_{0,t}(\nu)\sum_{s=0}^{\infty}2^{\left(2-(2b-2)\frac{2}{3}\right)s},

and since $b\geq 3$ this latter sum is at most $\sum_{s=0}^{\infty}2^{-2s/3}\leq 5$ . This gives the result. ∎
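The two geometric sums in this proof can be checked numerically. The Python sketch below recomputes the exponents that arise when comparing the real-valued parts of $D_{s,t}$ and $D_{0,t}$ (read off from Definitions 5 and 6) and evaluates the resulting infinite sums in closed form. The function names are hypothetical and the snippet is only a sanity check of the constants, not part of the proof.

```python
def exponent_t_equals_b(nu_size, b):
    """Exponent c such that the real-valued part of D_{s,b}(nu) equals
    2^{c*s} times that of D_{0,b}(nu), read off from Definition 5."""
    return 1 - 2 * (nu_size - 2) / (3 * (b - 1))


def exponent_t_less_b(nu_size):
    """The analogous exponent for D_{s,t} with t < b, from Definition 6."""
    return 2 - 2 * (nu_size - 2) / 3


def geometric_sum(exponent):
    """Closed form of sum_{s >= 0} 2^{exponent * s} for a negative exponent."""
    return 1 / (1 - 2 ** exponent)


b = 3
# A set nu containing a cycle of theta_{a,b} has at least 2b vertices.
sum_b = geometric_sum(exponent_t_equals_b(2 * b, b))
sum_t = geometric_sum(exponent_t_less_b(2 * b))
```

Both sums are below 5 (roughly 4.85 and 2.70 for $b=3$), so each degree is at most $3\lceil\log n\rceil+20D_{0,t}(\nu)\leq 20(D_{0,t}(\nu)+\lceil\log n\rceil)$, as claimed.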

The rest of this section is dedicated to proving Proposition 4.2. Let $G$ be the graph in the hypothesis of Proposition 4.2 and $\{\mathcal{H}_{s,t}\}_{s,t}$ the corresponding collections.

The basic idea of the proof is to algorithmically construct many copies of $\theta_{a,b}$ , and to show that at least one of them is not already contained in $\mathcal{H}_{s^{\prime},t^{\prime}}$ for some appropriate $s^{\prime},t^{\prime}$ , and such that our codegree conditions continue to be satisfied.

We will begin by pruning the graph $G$ so that all of its remaining edges and vertices are “well-behaved” (Section 4.1). We then give the algorithmic construction of copies of $\theta_{a,b}$ and show that a new copy may be added to some $\mathcal{H}_{s,t}$ (Section 4.2), completing the proof. Throughout the argument we fix some $\delta$ depending only on $a,b$ which is sufficiently small for our arguments to go through.

4.1 Pruning

Let $G_{0}\subseteq G$ be the graph obtained by deleting edges of $G$ that are already “saturated” by hyperedges of $\bigcup_{s,t}\mathcal{H}_{s,t}$ , that is, those edges $e\in E(G)$ for which there exists $\chi\in\mathbb{V}$ with $\chi_{G}=e$ and

\deg_{\bigcup_{s,t}\mathcal{H}_{s,t}}(\chi)\,\geq\,D_{{\operatorname{forest}}}\big{(}\chi_{\theta}\big{)}=\left\lceil\frac{k^{ab}n^{2}}{\delta kn^{1+1/b}}\right\rceil.

This will ensure that any new theta graph constructed using only edges of $G_{0}$ will not violate our edge-codegree bounds when added to $\bigcup_{s,t}\mathcal{H}_{s,t}$ . We bound the number of these saturated edges by double-counting elements of $\bigcup_{s,t}\mathcal{H}_{s,t}\,$ :

\left(\text{\# edges $e$ with $\chi\in\mathbb{V},\ \chi_{G}=e,\ \deg_{\bigcup_{s,t}\mathcal{H}_{s,t}}(\chi)=\left\lceil\tfrac{k^{ab}n^{2}}{\delta kn^{1+1/b}}\right\rceil$}\right)\cdot\left\lceil\frac{k^{ab}n^{2}}{\delta kn^{1+1/b}}\right\rceil\leq e(\theta_{a,b})\cdot|\textstyle\bigcup_{s,t}\mathcal{H}_{s,t}|.

Rearranging slightly, the number of such edges is at most

e(\theta_{a,b})\,|\textstyle\bigcup_{s,t}\mathcal{H}_{s,t}|\bigg{/}\left\lceil\dfrac{k^{ab}n^{2}}{\delta kn^{1+1/b}}\right\rceil\leq\,{\delta^{2}e(\theta_{a,b})}\cdot kn^{1+1/b},

and if $\delta$ is sufficiently small this is at most $\frac{1}{2}kn^{1+1/b}=\frac{1}{2}e(G)$ , which implies $e(G_{0})\geq\frac{1}{2}kn^{1+1/b}$ . Notice that no remaining edges are “saturated,” i.e. we have

\deg_{\bigcup_{s,t}\mathcal{H}_{s,t}}(\chi)\leq\left\lceil\frac{k^{ab}n^{2}}{\delta kn^{1+1/b}}\right\rceil-1\textrm{ for all }\chi\in\mathbb{V}\textrm{ such that }\chi_{G}\in E(G_{0}).\tag{2}

Now we further prune the graph by eliminating low-degree vertices: let $G^{\prime}\subseteq G_{0}$ be the subgraph of high minimum degree guaranteed by Lemma 2.5. Although $G^{\prime}$ may have substantially fewer vertices than $G$ (meaning our algorithm will have fewer choices at various steps), in this case it will compensate by having a substantially larger minimum degree.

More concretely, let $m=v(G^{\prime})$ , and let $\ell$ be the real number such that $G^{\prime}$ has minimum degree $\ell m^{1/b}$ . By Lemma 2.5 we have

\ell m^{1/b}\geq 2^{-b-1}kn^{1+1/b}m^{-1}(n/m)^{-1/b}\implies\ell\geq 2^{-b-1}(n/m)k.

We let $r$ be the unique integer such that

2^{-r}n\leq m<2^{-r+1}n,\tag{3}

and we note that the previous inequality implies

\ell\geq 4^{-b}2^{r}k.\tag{4}

In total this implies the minimum degree $\ell m^{1/b}$ of $G^{\prime}$ is at least $\Omega(kn^{1/b})$ , which is the average degree of $G$ , and that the minimum degree of $G^{\prime}$ is much larger compared to that of $G$ if $m=v(G^{\prime})$ is much smaller than $n=v(G)$ . Before moving on, we note

\ell^{b/(b-1)}\leq\ell m^{1/b}.\tag{5}

Indeed, since $\ell m^{1/b}$ is the minimum degree of the $m$ -vertex graph $G^{\prime}$ , we must have $\ell m^{1/b}\leq m$ , and rearranging shows this is equivalent to (5).
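The passage from the bound $\ell\geq 2^{-b-1}(n/m)k$ to (4) only uses the definition of $r$ in (3) together with $b\geq 2$. The Python sketch below computes $r$ with exact integer arithmetic and checks the implication on a range of small parameters; the function names `dyadic_scale` and `ell_lower_bound` are illustrative assumptions.

```python
def dyadic_scale(n, m):
    """The unique integer r >= 0 with 2^{-r} n <= m < 2^{-r+1} n,
    assuming 1 <= m <= n (integer arithmetic avoids rounding issues)."""
    r = 0
    while m * 2 ** r < n:
        r += 1
    return r


def ell_lower_bound(n, m, k, b):
    """The bound 2^{-b-1} (n/m) k on ell obtained from Lemma 2.5."""
    return 2.0 ** (-b - 1) * (n / m) * k


# Example: n = 1000 and m = 100 give r = 4, since 1000/16 <= 100 < 1000/8.
r = dyadic_scale(1000, 100)
lhs = ell_lower_bound(1000, 100, 5, 3)   # 3.125
rhs = 4.0 ** (-3) * 2 ** r * 5           # 1.25, the right-hand side of (4)
```

The check reflects the computation $2^{-b-1}(n/m)>2^{-b-1}2^{r-1}=2^{-b-2}2^{r}\geq 4^{-b}2^{r}$, where the last step is where $b\geq 2$ enters.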

4.2 The Algorithm

We are now ready to begin finding copies of $\theta_{a,b}$ in $G^{\prime}$. Our strategy is roughly as follows: first, we identify which collection $\mathcal{H}_{s,t}$ we wish to add a copy to (based on the expansion properties of $G^{\prime}$ detailed in Proposition 2.1). After this, we carefully choose vertices $x$ and $y$ to serve as the two high-degree vertices for the copies we will add. We then show that $x$ and $y$ are not already contained in too many copies in $\mathcal{H}_{s,t}$, and algorithmically construct a large number of theta graphs in $G^{\prime}$ that do contain $x$ and $y$; this allows us to conclude that we have found at least one new copy not already contained in $\mathcal{H}_{s,t}$. Crucially, along the way, we ensure that at no step of the algorithm are our codegree conditions violated, ensuring that the copy added to $\mathcal{H}_{s,t}$ is “good.”

Before delving into the meat of the proof, we introduce some notation which is more compact. Define

\mathcal{H}=\bigcup_{s,t}\mathcal{H}_{s,t},\qquad\mathcal{H}_{t}=\bigcup_{s}\mathcal{H}_{s,t}.

Also define

\mathcal{F}_{\operatorname{forest}}=\mathcal{F}(\mathcal{H},D_{\operatorname{forest}}),\qquad\mathcal{F}_{t}=\mathcal{F}(\mathcal{H}_{t},D_{t}),\qquad\mathcal{F}_{s,t}=\mathcal{F}(\mathcal{H}_{s,t},D_{s,t}).

When applying Corollary 3.2, we adopt the shorthand

\mathcal{J}_{\operatorname{forest}}=\mathcal{J}_{\mathcal{H},D_{\operatorname{forest}}},\qquad\mathcal{J}_{t}=\mathcal{J}_{\mathcal{H}_{t},D_{t}},\qquad\mathcal{J}_{s,t}=\mathcal{J}_{\mathcal{H}_{s,t},D_{s,t}}.

We say that a set $\chi$ is $(s,t)$ -compatible if $\chi\in\mathbb{V}$ and if no subset of $\chi$ lies in $\mathcal{F}_{\operatorname{forest}}\cup\mathcal{F}_{t}\cup\mathcal{F}_{s,t}$ . Crucially, we observe that proving the proposition is equivalent to showing that for some $s,t$ , there exists an $(s,t)$ -compatible set $h$ with $|h|=v(\theta_{a,b})$ such that $h\notin\mathcal{H}$ (since, for example, no subset of $h$ being in $\mathcal{F}_{\operatorname{forest}}$ implies $\mathcal{H}\cup\{h\}$ is $D_{\operatorname{forest}}$ -good).

Before moving on, we make a small but important observation.

Claim 4.3.

If $\chi$ is a valid set and $\nu\subseteq V(\theta_{a,b})\setminus\chi_{\theta}$ is such that $\chi_{\theta}\cup\nu$ induces at most one edge in $\theta_{a,b}$ , then $\mathcal{J}_{\operatorname{forest}}(\chi;\nu)=\emptyset$ .

Proof.

If $\chi_{\theta}\cup\nu$ induces 0 edges then $D_{\operatorname{forest}}(\chi_{\theta}\cup\nu)=\infty$ and the result follows from Corollary 3.2. Thus we can assume $\chi_{\theta}\cup\nu$ induces exactly one edge $ww^{\prime}$. If there exists some $\gamma\in\mathcal{J}_{\operatorname{forest}}(\chi;\nu)$, then there exist pairs $(w,z),(w^{\prime},z^{\prime})\in\chi\cup\gamma$ (since $(\chi\cup\gamma)_{\theta}=\chi_{\theta}\cup\nu$). In this case we have

\deg_{\mathcal{H}}(\chi\cup\gamma)\leq\deg_{\mathcal{H}}(\{(w,z),(w^{\prime},z^{\prime})\})<D_{\operatorname{forest}}(\{w,w^{\prime}\})=D_{\operatorname{forest}}(\chi_{\theta}\cup\gamma_{\theta}),

where the first inequality used that we are looking at the codegree of a smaller set, the second inequality follows from (2), which says $G^{\prime}\subseteq G_{0}$ does not contain any edges $zz^{\prime}$ with $\deg_{\mathcal{H}}(\{(w,z),(w^{\prime},z^{\prime})\})\geq D_{\operatorname{forest}}(\{w,w^{\prime}\})$, and the equality used that $\chi_{\theta}\cup\gamma_{\theta}=\chi_{\theta}\cup\nu$ induces exactly one edge. This inequality implies $\chi\cup\gamma\notin\mathcal{F}(\mathcal{H},D_{\operatorname{forest}})$, contradicting the assumption $\gamma\in\mathcal{J}_{\operatorname{forest}}(\chi;\nu)$. We conclude that this link set is indeed empty. ∎

4.2.1 The Setup

We wish to apply Proposition 2.1 to the “pruned” graph $G^{\prime}$. For this we need to specify a set of forests to avoid. Intuitively we wish to use the set $\mathcal{F}_{\operatorname{forest}}$, but this is a collection of subsets of $V(\theta_{a,b})\times V(G)$, not of subgraphs of $G^{\prime}$. To get around this minor technicality, for $\chi\in\mathbb{V}$ we define the “projection graph” $H_{\chi}$ by

V(H_{\chi})=\chi_{G},\qquad E(H_{\chi})=\{zz^{\prime}:\exists ww^{\prime}\in E(\theta_{a,b}),\ (w,z),(w^{\prime},z^{\prime})\in\chi\}.

Note that by definition of $\chi$ being valid, $H_{\chi}$ is a subgraph of $G$ which is isomorphic to the subgraph of $\theta_{a,b}$ induced by $\chi_{\theta}$ . We let $\mathcal{F}^{\prime}_{\operatorname{forest}}=\{H_{\chi}:\chi\in\mathcal{F}_{\operatorname{forest}}\}$ . Since $D_{\operatorname{forest}}(\nu)=\infty$ for $\nu$ which do not induce forests, every element of $\mathcal{F}_{\operatorname{forest}}^{\prime}$ is a forest. To apply Proposition 2.1 with this set, it remains to verify the following, which gives the hypotheses of Proposition 2.1(g).

Claim 4.4.

Let $\varepsilon>0$ be as in Proposition 2.1. If $\delta>0$ is sufficiently small, then for every path $x_{1}\cdots x_{p}$ of $G^{\prime}$ with $p\leq b$ which does not contain an element of $\mathcal{F}_{\operatorname{forest}}^{\prime}$ as a subgraph, the number of vertices $x_{p+1}\in N_{G^{\prime}}(x_{p})$ such that some subgraph of the path $x_{1}\cdots x_{p+1}$ is in $\mathcal{F}_{\operatorname{forest}}^{\prime}$ is at most $\varepsilon\ell m^{1/b}$ .

Proof.

Fix any path $x_{1}\cdots x_{p}$ as above; we wish to bound the number of vertices $x_{p+1}$ described in the claim. We introduce the following notation which will only be used in the proof of this claim: we say that a pair $(\chi,w)$ with $\chi\in\mathbb{V}$ and $w\in V(\theta_{a,b})$ is good if $\chi_{G}\subseteq\{x_{1},\ldots,x_{p}\}$ and $w$ is adjacent to at most one vertex of $\chi_{\theta}$. We claim that if $x_{p+1}\in N_{G^{\prime}}(x_{p})$ is such that some subgraph of the path $x_{1}\cdots x_{p+1}$ is in $\mathcal{F}^{\prime}_{\operatorname{forest}}$, then $\{(w,x_{p+1})\}\in\mathcal{J}_{\operatorname{forest}}(\chi;w)$ for some good pair $(\chi,w)$.

Indeed, say the subgraph of $x_{1}\cdots x_{p+1}$ in $\mathcal{F}^{\prime}_{\operatorname{forest}}$ was $H_{\gamma}$ for some $\gamma\in\mathbb{V}$ . Because $H_{\gamma}$ is not a subgraph of $x_{1}\cdots x_{p}$ , we must have $x_{p+1}\in V(H_{\gamma})$ , and thus we have $(w,x_{p+1})\in\gamma$ for some $w\in V(\theta_{a,b})$ . If $\chi:=\gamma\setminus\{(w,x_{p+1})\}$ , then $H_{\gamma}$ being a subgraph of $x_{1}\cdots x_{p+1}$ implies $\chi_{G}\subseteq\{x_{1},\ldots,x_{p}\}$ and that $w$ is adjacent to at most one vertex of $\chi_{\theta}$ (as otherwise $x_{p+1}$ would have degree greater than 1 in $H_{\gamma}$ , contradicting this being a subgraph of $x_{1}\cdots x_{p+1}$ ). In total we find $\gamma=\chi\cup\{(w,x_{p+1})\}$ for some good pair $(\chi,w)$ . Moreover, by definition of $H_{\gamma}\in\mathcal{F}^{\prime}_{\operatorname{forest}}$ , we find

\chi\cup\{(w,x_{p+1})\}=\gamma\in\mathcal{F}_{\operatorname{forest}}=\mathcal{F}(\mathcal{H},D_{\operatorname{forest}}),

so by definition $\{(w,x_{p+1})\}\in\mathcal{J}_{\operatorname{forest}}(\chi;w)$ as desired.

With this we see that the number of choices for $x_{p+1}$ is at most the number of elements of $\mathcal{J}_{\operatorname{forest}}(\chi;w)$ for all possible good pairs $(\chi,w)$ . To count the number of such elements, fix some good pair $(\chi,w)$ . If $\chi_{\theta}$ induces no edges, then since $w$ is adjacent to at most one vertex of $\chi_{\theta}$ by definition of $(\chi,w)$ being a good pair, $\chi_{\theta}\cup\{w\}$ induces at most one edge. Claim 4.3 then implies $\mathcal{J}_{\operatorname{forest}}(\chi;w)=\emptyset$ .

Now assume $\chi_{\theta}$ induces at least one edge. By definition of $D_{\operatorname{forest}}$ and 3.2, we have for any good pair $(\chi,w)$ that

|\mathcal{J}_{\operatorname{forest}}(\chi;w)|\leq 2^{v(\theta_{a,b})+1}\delta k^{b/(b-1)}\leq 2^{v(\theta_{a,b})+1}\delta 16^{b}\ell m^{1/b},

where the first inequality used that $\chi_{\theta}\cup\{w\}$ induces at most one more edge than $\chi_{\theta}$ , and the last inequality used (4) and (5). As the total number of good pairs is at most $v(\theta_{a,b})2^{p}=O_{a,b}(1)$ , the result follows by taking $\delta$ sufficiently small. ∎

With this claim and the fact that $\ell$ is at least a sufficiently large constant due to (4), we can apply Proposition 2.1 to $G^{\prime}$ and $\mathcal{F}_{\operatorname{forest}}^{\prime}$ , and we let $t,X$ be the integer and set guaranteed by this proposition. We recall that $u,v\in V(\theta_{a,b})$ are the high degree vertices of $\theta_{a,b}$ .

Let $x\in X$ be a vertex such that $(u,x)$ is in the fewest hyperedges of $\mathcal{H}$ , that is, a vertex with $\deg_{\mathcal{H}}(\{(u,x)\})=\min_{y\in X}\deg_{\mathcal{H}}(\{(u,y)\})$ (this will help us ensure we can find a new copy of $\theta_{a,b}$ containing $x$ ). Let $(\mathcal{B},\mathcal{Q})$ with $\mathcal{B}=(B_{1},\ldots,B_{t})$ be the pair for $x$ as guaranteed by Proposition 2.1(a).

We now split our analysis into several cases based on the value of $t$ . The overarching strategy is the same in all cases, but the details are somewhat simpler in the case $t=b$ (where $G^{\prime}$ has nice “random-like” expansion near $x$ ).

4.2.2 Case 1: $t=b$

Our strategy is to build copies of $\theta_{a,b}$ in $G^{\prime}$ which use $x$ as one of the high degree vertices $u$ . To choose the vertex $y$ which plays the role of the other high-degree vertex $v$ , we would like to ensure there are many paths in $\mathcal{Q}$ connecting $y$ to $x$ (which we will then use to build our theta graphs). And indeed, this holds for many vertices in $B_{b}$ : for $y\in B_{b}$ , let $f(y)$ denote the number of paths of $\mathcal{Q}$ which $y$ is an endpoint of. By 2.6, there is some $B^{\prime}\subseteq B_{b}$ so that

\min_{y\in B^{\prime}}f(y)\geq 2^{-b}\left(\frac{|B^{\prime}|}{|B_{b}|}\right)^{1/b}\frac{\sum_{y\in B_{b}}f(y)}{|B^{\prime}|}.

(6)

Notice that we face a trade-off: we may have a large set of vertices $B^{\prime}$ , each of which is the endpoint of approximately the average number of paths, or a smaller set $B^{\prime}$ where $f(y)$ is much larger than average. We can obtain a strong balanced supersaturation result either way, but to do so, we must keep track of this trade-off. To this end, let $r^{\prime}$ be the unique integer such that

2^{-r^{\prime}}m\leq|B^{\prime}|<2^{-r^{\prime}+1}m.

(7)

Since $|B_{b}|\leq m$ and $\sum_{y\in B_{b}}f(y)=|\mathcal{Q}|\geq\varepsilon\ell^{b}m$ by 2.1(c), (6) can be relaxed to

\min_{y\in B^{\prime}}f(y)\geq 2^{-b}\frac{\varepsilon\ell^{b}m}{|B_{b}|^{1/b}|B^{\prime}|^{1-1/b}}\geq\varepsilon 4^{-b}\cdot 2^{(1-1/b)r^{\prime}}\ell^{b}.

(8)
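For completeness, the two steps behind (8) can be written out as follows; this is a short sketch using only $|B_{b}|\leq m$ , the upper bound $|B^{\prime}|<2^{-r^{\prime}+1}m$ from (7), and $b\geq 2$ .

```latex
\begin{align*}
|B_{b}|^{1/b}\,|B^{\prime}|^{1-1/b}
 &<m^{1/b}\left(2^{-r^{\prime}+1}m\right)^{1-1/b}
  =2^{(1-1/b)(1-r^{\prime})}\,m,\\
2^{-b}\,\frac{\varepsilon\ell^{b}m}{|B_{b}|^{1/b}|B^{\prime}|^{1-1/b}}
 &>2^{-b}\cdot 2^{-(1-1/b)}\cdot 2^{(1-1/b)r^{\prime}}\,\varepsilon\ell^{b}
  \geq\varepsilon 4^{-b}\cdot 2^{(1-1/b)r^{\prime}}\ell^{b}.
\end{align*}
```

The last inequality only uses $2^{-(1-1/b)}\geq 2^{-1}\geq 2^{-b}$ .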

Recall from (3) that $2^{-r}n\leq m<2^{-r+1}n$ , and let

s=2r+r^{\prime}.

Roughly speaking, if $s$ is small, then $m$ and/or $|B^{\prime}|$ are large, which is what we would expect to happen if $G$ was a random graph (as opposed to $G$ being, e.g., a clique with isolated vertices, wherein both of these quantities would be small). We now aim to show that we can add a new theta graph to the collection $\mathcal{H}_{s,b}$ . Let $y\in B^{\prime}$ be such that $(v,y)$ is in the fewest number of hyperedges with $(u,x)$ in $\mathcal{H}$ . As before, this will help ensure we find a new hyperedge, since $x$ and $y$ are not already contained in too many elements of $\mathcal{H}$ .

Claim 4.5.

The set $\chi:=\{(u,x),(v,y)\}$ satisfies

\deg_{\mathcal{H}}(\chi)\leq\delta\varepsilon^{-1}2^{s}k^{ab},

and is $(s,b)$ -compatible provided $\delta$ is sufficiently small.

Proof.

First, recall that $x$ is such that $(u,x)$ is contained in the fewest number of hyperedges in $\mathcal{H}$ among all vertices in the set $X$ , which means

\deg_{\mathcal{H}}(\{(u,x)\})\leq\frac{|\mathcal{H}|}{|X|}\leq\frac{\delta k^{ab}n^{2}}{\varepsilon m},

where the second inequality used $|X|\geq\varepsilon m$ by 2.1(b) when $t=b$ and the hypothesis $|\mathcal{H}|\leq\delta k^{ab}n^{2}$ of 4.2. Similarly, as $y\in B^{\prime}$ is such that $(v,y)$ is in the fewest number of hyperedges with $(u,x)$ in $\mathcal{H}$ , we have

\deg_{\mathcal{H}}(\chi)\leq\frac{\delta k^{ab}n^{2}}{\varepsilon m|B^{\prime}|}\leq\frac{\delta k^{ab}n^{2}}{\varepsilon 2^{-2r-r^{\prime}}n^{2}}=\delta\varepsilon^{-1}2^{s}k^{ab},

where the second inequality used (3) and (7).

It remains to show $\chi$ is $(s,b)$ -compatible. First we show $\chi$ is a valid set. Since $u,v$ are distinct non-adjacent vertices of $\theta_{a,b}$ , we only need to check $x\neq y$ . And indeed, we cannot have $x\in B_{b}$ , since by Proposition 2.1(e), every element of $B_{b}$ is the endpoint of a positive number of paths of length $b$ from $x$ (and since these are paths, $x$ cannot serve as both endpoints). Since $y\in B^{\prime}\subseteq B_{b}$ , we conclude $x\neq y$ and that $\chi$ is valid.

It remains to check that every subset of $\chi$ satisfies the desired codegree conditions. Note that for any $\chi^{\prime}\subseteq\chi$ of size 1, we have $D_{\operatorname{forest}}(\chi^{\prime}_{\theta})=D_{b}(\chi^{\prime}_{\theta})=D_{s,b}(\chi^{\prime}_{\theta})=\infty$ , and as such $\chi^{\prime}$ will not belong to $\mathcal{F}_{\operatorname{forest}}\cup\mathcal{F}_{b}\cup\mathcal{F}_{s,b}$ . Similarly $D_{\operatorname{forest}}(\chi_{\theta})=D_{b}(\chi_{\theta})=\infty$ , so it only remains to verify that

\deg_{\mathcal{H}_{s,b}}(\chi)<D_{s,b}(\chi_{\theta})=\left\lceil\frac{k^{ab}n^{2}}{2^{-s}n^{2}\delta^{2}}\right\rceil=\left\lceil 2^{s}\delta^{-2}k^{ab}\right\rceil.

Since $\deg_{\mathcal{H}_{s,b}}(\chi)\leq\deg_{\mathcal{H}}(\chi)$ , this bound follows from the first part of the claim, completing our proof. ∎

We now wish to construct many “good” copies of $\theta_{a,b}$ in $G^{\prime}$ with $x$ and $y$ as the two high-degree vertices (i.e. more than the bound in Claim 4.5). To do this, we iteratively pick paths $P_{1},\ldots,P_{a}$ in $\mathcal{Q}$ that end in $y$ , and we take our copy of $\theta_{a,b}$ to be the union of these paths. We must ensure that the paths chosen are such that $P_{1}\cup\cdots\cup P_{a}$ is $(s,b)$ -compatible, and in particular that they do not intersect each other and that no subset of their vertices is already saturated. For this claim, we recall that the paths of $\theta_{a,b}$ are denoted by $uw_{1}^{j}\cdots w_{b-1}^{j}v$ .

Claim 4.6.

Let $1\leq j\leq a$ , and let $P_{1},\dots,P_{j-1}$ be a collection of paths in $\mathcal{Q}$ ending in $y$ , and for each path $P_{j^{\prime}}$ , write $P_{j^{\prime}}=xz_{1}^{j^{\prime}}\cdots z_{b-1}^{j^{\prime}}y$ . Suppose that the set

\chi:=\{(u,x),(v,y)\}\cup\bigcup_{j^{\prime}<j}\left\{(w_{1}^{j^{\prime}},z_{1}^{j^{\prime}}),\dots,(w_{b-1}^{j^{\prime}},z_{b-1}^{j^{\prime}})\right\}

is $(s,b)$ -compatible. Then there are at least $\varepsilon 4^{-2b^{2}}\cdot 2^{2s/3}k^{b}$ choices of a path $P_{j}=xz_{1}^{j}\cdots z_{b-1}^{j}y$ in $\mathcal{Q}$ so that

\chi_{j}:=\chi\cup\left\{(w_{1}^{j},z_{1}^{j}),\dots,(w_{b-1}^{j},z_{b-1}^{j})\right\}

is $(s,b)$ -compatible.

Proof.

Let $\mathcal{Q}(y)$ denote the set of paths in $\mathcal{Q}$ ending in $y$ . Since $y\in B^{\prime}$ , we have

|\mathcal{Q}(y)|\,\stackrel{(8)}{\geq}\,\varepsilon 4^{-b}\cdot 2^{(1-1/b)r^{\prime}}\ell^{b}\,\stackrel{(4)}{\geq}\,\varepsilon 4^{-b}\cdot 4^{-b^{2}}\cdot 2^{(1-1/b)r^{\prime}+br}\,k^{b}\geq 2\varepsilon 4^{-2b^{2}}\cdot 2^{2s/3}k^{b},

(9)

where this last step used $b\geq 3$ and $s=2r+r^{\prime}$ .
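The last step of (9) amounts to two elementary comparisons, sketched below. The sketch assumes $r,r^{\prime}\geq 0$ : the bound $r^{\prime}\geq 0$ follows from $2^{-r^{\prime}}m\leq|B^{\prime}|\leq|B_{b}|\leq m$ , while $r\geq 0$ is an assumption here (it holds if $m\leq n$ ).

```latex
\begin{align*}
(1-1/b)\,r^{\prime}+br
 &\geq\tfrac{2}{3}r^{\prime}+\tfrac{4}{3}r=\tfrac{2}{3}s
 && \text{since } b\geq 3 \text{ and } r,r^{\prime}\geq 0,\\
4^{-b}\cdot 4^{-b^{2}}
 &=4^{b^{2}-b}\cdot 4^{-2b^{2}}\geq 2\cdot 4^{-2b^{2}}
 && \text{since } b^{2}-b\geq 6 \text{ for } b\geq 3.
\end{align*}
```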

Our goal now is to show that among these paths, there are few “bad choices” that must be avoided, i.e. few choices so that $\chi_{j}$ is not $(s,b)$ -compatible. To show this, 2.1(f) will be crucial, which we recall says that for any non-empty set $S$ of vertices in $G^{\prime}$ not containing $x,y$ , there are at most $\varepsilon^{-1}\ell^{(b-1-|S|)b/(b-1)}$ paths in $\mathcal{Q}(y)$ which contain $S$ .

We first show that almost all choices of $P_{j}$ make $\chi_{j}$ valid. Because $\chi$ was already valid, this is equivalent to choosing a $P_{j}$ which contains none of the vertices of $\bigcup_{j^{\prime}<j}P_{j^{\prime}}$ other than $x$ and $y$ . By 2.1(f) with $|S|=1$ , the number of paths in $\mathcal{Q}$ containing a given vertex from $\bigcup_{j^{\prime}<j}P_{j^{\prime}}$ is at most $\varepsilon^{-1}\ell^{(b-2)b/(b-1)}$ . Therefore, the number of $P_{j}\in\mathcal{Q}(y)$ containing any of the vertices in $\bigcup_{j^{\prime}<j}P_{j^{\prime}}$ (other than $x$ and $y$ ) is at most a constant times $\ell^{(b-2)b/(b-1)}$ , which for $k_{0}$ sufficiently large (which makes $\ell$ sufficiently large) will be at most $\frac{1}{4}|\mathcal{Q}(y)|=\Omega(\ell^{b})$ . Thus at least three quarters of the paths $P_{j}\in\mathcal{Q}(y)$ will make $\chi_{j}$ valid.

To show that $\chi_{j}$ is $(s,b)$ -compatible for most choices of paths in $\mathcal{Q}(y)$ , it remains to bound the number of “bad” sets $\gamma$ that must be avoided when choosing $P_{j}$ . To this end, for each integer $1\leq p\leq b-1$ define

\widetilde{\mathcal{J}}_{p}:=\bigcup_{\begin{subarray}{c}\chi^{\prime}\subseteq{\chi},\\ \nu\subseteq\{w_{1}^{j},\dots,w_{b-1}^{j}\}:\ |\nu|=p\end{subarray}}\mathcal{J}_{s,b}(\chi^{\prime};\nu)\ \ \cup\ \bigcup_{\begin{subarray}{c}\chi^{\prime}\subseteq{\chi},\\ \nu\subseteq\{w_{1}^{j},\dots,w_{b-1}^{j}\}:\ |\nu|=p\end{subarray}}\mathcal{J}_{{\operatorname{forest}}}(\chi^{\prime};\nu)

Note that $\chi_{j}$ is $(s,b)$ -compatible if and only if it is valid and does not contain any set $\gamma\in\widetilde{\mathcal{J}}_{p}$ for any value of $p$ (here we implicitly use that $\mathcal{F}(\mathcal{H}_{b},D_{b})=\emptyset$ since $D_{b}$ is always equal to $\infty$ , so we can ignore $\mathcal{J}_{b}(\chi^{\prime};\nu)$ when checking for compatibility).

We may use 3.2 to bound the size of each link set above: consider $\chi^{\prime}\subseteq\chi$ and $\nu\subseteq\{w_{1}^{j},\dots,w_{b-1}^{j}\}$ with $|\nu|=p$ . If $(u,x)\not\in\chi^{\prime}$ or $(v,y)\not\in\chi^{\prime}$ , then $\mathcal{J}_{s,b}(\chi^{\prime};\nu)=\emptyset$ by 3.2. Otherwise, 3.2 gives

\left|\mathcal{J}_{s,b}(\chi^{\prime};\nu)\right|\leq 2^{v(\theta_{a,b})+1}\delta^{p}\left(2^{2s/3}k^{b}\right)^{p/(b-1)}.

(10)

Similarly, if $\chi_{\theta}^{\prime}\cup\nu$ does not induce a forest on at least one edge, then $\mathcal{J}_{{\operatorname{forest}}}(\chi^{\prime};\nu)=\emptyset$ . If both $\chi_{\theta}^{\prime}\cup\nu$ and $\chi_{\theta}^{\prime}$ induce a forest on at least one edge, then 3.2 gives

|\mathcal{J}_{{\operatorname{forest}}}(\chi^{\prime};\nu)|\leq 2^{v(\theta_{a,b})+1}\delta^{p}k^{pb/(b-1)}.

(11)

It remains to deal with the case that $\chi_{\theta}^{\prime}$ does not induce a forest on at least one edge but $\chi_{\theta}^{\prime}\cup\nu$ does. Analogous to the proof of Claim 4.3, in this setting 3.2 gives no meaningful bound, but we can show that for any choice of $P_{j}$ in $\mathcal{Q}(y)$ , no element of the link set $\mathcal{J}_{{\operatorname{forest}}}(\chi^{\prime};\nu)$ can appear in $\chi_{j}$ (and therefore there are no “bad options” that must be avoided in choosing $P_{j}$ from $\mathcal{Q}$ ). To this end, consider any set $\gamma\in\mathcal{J}_{{\operatorname{forest}}}(\chi^{\prime};\nu)$ . By definition, this means $\gamma_{\theta}=\nu$ and

\deg_{\mathcal{H}}(\chi^{\prime}\cup\gamma)\geq D_{\operatorname{forest}}(\chi_{\theta}^{\prime}\cup\nu).

(12)

Since $\chi_{\theta}^{\prime}$ does not induce any edges but $\chi^{\prime}_{\theta}\cup\nu$ does, all of these induced edges of $\theta_{a,b}$ must be contained in the path $W_{j}:=uw_{1}^{j}\cdots w_{b-1}^{j}v$ . Since $D_{\operatorname{forest}}$ only depends on the number of edges induced, we have

D_{\operatorname{forest}}(\chi_{\theta}^{\prime}\cup\nu)=D_{\operatorname{forest}}\left((\chi_{\theta}^{\prime}\cup\nu)\cap W_{j}\right).

(13)

On the other hand, if for each $P_{j}\in\mathcal{Q}(y)$ we let

(W_{j},P_{j}):=\left\{(u,x),(w_{1}^{j},z_{1}^{j}),\dots,(w_{b-1}^{j},z_{b-1}^{j}),(v,y)\right\},

then we have

\deg_{\mathcal{H}}\big{(}(\chi^{\prime}\cup\gamma)\cap(W_{j},P_{j})\big{)}\geq\deg_{\mathcal{H}}(\chi^{\prime}\cup\gamma),

(14)

since taking smaller sets can only cause $\deg_{\mathcal{H}}$ to increase. Putting this all together, we obtain

\deg_{\mathcal{H}}\big((\chi^{\prime}\cup\gamma)\cap(W_{j},P_{j})\big)\,\stackrel{(12),(13),(14)}{\geq}\,D_{\operatorname{forest}}\left((\chi_{\theta}^{\prime}\cup\nu)\cap W_{j}\right).

(15)

Notice that $(\chi_{\theta}^{\prime}\cup\nu)\cap W_{j}=((\chi^{\prime}\cup\gamma)\cap(W_{j},P_{j}))_{\theta}$ . Thus (15) says that $(\chi^{\prime}\cup\gamma)\cap(W_{j},P_{j})\in\mathcal{F}_{\operatorname{forest}}$ . Using the notation introduced just before Claim 4.4, this means that the subgraph $H^{\prime}:=H_{(\chi^{\prime}\cup\gamma)\cap(W_{j},P_{j})}\subseteq P_{j}$ is an element of $\mathcal{F}^{\prime}_{\operatorname{forest}}$ . Recall that by 2.1, no path in $\mathcal{Q}$ contains an element of $\mathcal{F}^{\prime}_{\operatorname{forest}}$ as a subgraph. Since $H^{\prime}$ is a subgraph of $P_{j}\in\mathcal{Q}$ , we conclude that there is no choice of $P_{j}\in\mathcal{Q}(y)$ such that $\chi_{j}$ contains $\gamma\in\mathcal{J}_{{\operatorname{forest}}}(\chi^{\prime};\nu)$ in this case.

Putting it all together, and writing $\mathcal{P}_{\text{possible}}:=\{\gamma\subseteq(W_{j},P_{j}):P_{j}\in\mathcal{Q}(y)\}$ , we obtain

\begin{aligned}
\left|\widetilde{\mathcal{J}}_{p}\cap\mathcal{P}_{\text{possible}}\right|
&\stackrel{(10),(11)}{\leq}2^{v(\theta_{a,b})+1}\delta^{p}k^{pb/(b-1)}\left[\left(2^{2s/3}\right)^{p/(b-1)}+1\right]\cdot\left|\left\{(\chi^{\prime},\nu)\ :\ \chi^{\prime}\subseteq\chi,\ \nu\subseteq\{w_{1}^{j},\dots,w_{b-1}^{j}\},\ |\nu|=p\right\}\right|\\
&\leq 2^{2v(\theta_{a,b})+2b}\cdot\delta^{p}\cdot\left(2^{2s/3}k^{b}\right)^{p/(b-1)}.
\end{aligned}

(16)

By 2.1(f), the number of $P_{j}$ which contain the projection $\gamma_{G}$ of a given $\gamma$ (of size $p$ ) is at most $\varepsilon^{-1}\ell^{(b-1-p)b/(b-1)}$ . So combining this with (16), the number of $P_{j}$ which contain $\gamma_{G}$ for any $\gamma\in\widetilde{\mathcal{J}}_{p}\cap\mathcal{P}_{\text{possible}}$ is at most

\left(\varepsilon^{-1}\ell^{(b-1-p)b/(b-1)}\right)\cdot 2^{2v(\theta_{a,b})+2b}\cdot\delta^{p}\cdot\left(2^{2s/3}k^{b}\right)^{p/(b-1)}.

Summing over all values of $p$ from 1 to $b-1$ and simplifying slightly, the number of $P_{j}\in\mathcal{Q}(y)$ that contain a “bad” set $\gamma$ of any size is at most

\max_{1\leq p\leq b-1}C\ell^{b}\cdot\left(2^{2s/3}(k/\ell)^{b}\right)^{p/(b-1)},

(17)

where $C=\left(\varepsilon^{-1}\delta(b-1)2^{2v(\theta_{a,b})+2b}\right).$ By (4), and since $s=2r+r^{\prime}$ and $b\geq 3$ , we have

2^{2s/3}(k/\ell)^{b}\leq 2^{2s/3}(4^{b}2^{-r})^{b}\leq 4^{b^{2}}2^{2r^{\prime}/3}.

This gives

(17)\leq C^{\prime}2^{2r^{\prime}/3}\ell^{b},

where $C^{\prime}=\left(\varepsilon^{-1}\delta(b-1)2^{2v(\theta_{a,b})+2b+2b^{2}}\right)$ . By taking $\delta$ sufficiently small, we can assume $C^{\prime}\leq\frac{1}{4}\varepsilon 4^{-b}$ . So, after taking into account that at most one quarter of the choices $P_{j}\in\mathcal{Q}(y)$ have $\chi_{j}$ not valid, we find that the number of choices for $P_{j}\in\mathcal{Q}(y)$ such that $\chi_{j}$ is $(s,b)$ -compatible is at least

\frac{3}{4}|\mathcal{Q}(y)|-C^{\prime}2^{2r^{\prime}/3}\ell^{b}\geq\frac{1}{2}|\mathcal{Q}(y)|\geq\varepsilon 4^{-2b^{2}}\cdot 2^{2s/3}k^{b},

where both inequalities used (9). This gives the desired result. ∎
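The passage in the proof above from the per- $p$ count to (17), and from (17) to the bound with $C^{\prime}$ , rests on three elementary steps, sketched here. The sketch assumes $\delta\leq 1$ and $r^{\prime}\geq 0$ ; multiplying the first two lines by $\varepsilon^{-1}2^{2v(\theta_{a,b})+2b}$ recovers the constant $C$ .

```latex
\begin{align*}
\ell^{(b-1-p)b/(b-1)}\left(2^{2s/3}k^{b}\right)^{p/(b-1)}
 &=\ell^{b}\left(2^{2s/3}(k/\ell)^{b}\right)^{p/(b-1)}
 && \text{since } \ell^{(b-1-p)b/(b-1)}=\ell^{b}\,\ell^{-pb/(b-1)},\\
\sum_{p=1}^{b-1}\delta^{p}\,\ell^{b}\left(2^{2s/3}(k/\ell)^{b}\right)^{p/(b-1)}
 &\leq(b-1)\,\delta\max_{1\leq p\leq b-1}\ell^{b}\left(2^{2s/3}(k/\ell)^{b}\right)^{p/(b-1)}
 && \text{since } \delta^{p}\leq\delta \text{ for } \delta\leq 1,\\
\left(2^{2s/3}(k/\ell)^{b}\right)^{p/(b-1)}
 &\leq\left(4^{b^{2}}2^{2r^{\prime}/3}\right)^{p/(b-1)}\leq 4^{b^{2}}2^{2r^{\prime}/3}
 && \text{since } p/(b-1)\leq 1 \text{ and } 4^{b^{2}}2^{2r^{\prime}/3}\geq 1.
\end{align*}
```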

With Claim 4.6 established, we are now nearly ready to finish Case 1.

Claim 4.7.

The number of $(s,b)$ -compatible sets of size $v(\theta_{a,b})$ containing $(u,x)$ and $(v,y)$ is at least

(\varepsilon 4^{-2b^{2}})^{a}2^{s}k^{ab}.

Proof.

This result will follow directly by an iterative application of Claim 4.6. As a base step, we take $\chi_{0}=\{(u,x),(v,y)\}$ , which is $(s,b)$ -compatible by Claim 4.5. With this we may apply Claim 4.6 to obtain at least $\varepsilon 4^{-2b^{2}}\cdot 2^{2s/3}k^{b}$ choices of a path $P_{1}$ in $G^{\prime}$ such that the corresponding set $\chi_{1}$ is $(s,b)$ -compatible. Iterating up to $j=a$ , we obtain at least

\left(\varepsilon 4^{-2b^{2}}\cdot 2^{2s/3}k^{b}\right)^{a}\geq(\varepsilon 4^{-2b^{2}})^{a}2^{s}k^{ab}

distinct collections $P_{1},\dots,P_{a}$ such that the corresponding sets $\chi_{a}$ are $(s,b)$ -compatible. This completes the proof. ∎

Now we are ready to finish Case 1. By Claim 4.5, the number of hyperedges in $\mathcal{H}_{s,b}$ containing $(u,x)$ and $(v,y)$ is at most

{\delta\varepsilon^{-1}2^{s}k^{ab}}.

By Claim 4.7, the number of $(s,b)$ -compatible sets of size $v(\theta_{a,b})$ containing $(u,x)$ and $(v,y)$ is at least

(\varepsilon 4^{-2b^{2}})^{a}2^{s}k^{ab}.

Therefore, provided $\delta$ is sufficiently small, there must be at least one $(s,b)$ -compatible set $h$ of size $v(\theta_{a,b})$ that is not already in $\mathcal{H}_{s,b}$ . This $h$ may be added to $\mathcal{H}_{s,b}$ , completing the proof of 4.2 when $t=b$ .
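To make the phrase “provided $\delta$ is sufficiently small” concrete, comparing the upper bound from Claim 4.5 with the lower bound from Claim 4.7 gives the following explicit condition. This is a direct rearrangement and not a threshold stated by the argument itself; earlier steps impose further smallness conditions on $\delta$ .

```latex
\[
\delta\varepsilon^{-1}2^{s}k^{ab}<(\varepsilon 4^{-2b^{2}})^{a}\,2^{s}k^{ab}
\quad\Longleftrightarrow\quad
\delta<\varepsilon^{a+1}4^{-2ab^{2}}.
\]
```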

4.2.3 Case 2: $t<b$ and $b-t$ Even

Parts of this proof are nearly identical to the previous case, and as such we omit some of the redundant details.

Recall that $r$ is the unique integer such that $2^{-r}n\leq m<2^{-r+1}n$ . Our goal in this case is to show that we can add a new theta graph to $\mathcal{H}_{r,t}$ . Let $y\in B_{t}$ be such that $(v,y)$ is in as few hyperedges in $\mathcal{H}$ with $(u,x)$ as possible.

Claim 4.8.

The set $\chi=\{(u,x),(v,y)\}$ satisfies

\deg_{\mathcal{H}}(\chi)\leq\frac{\delta\varepsilon^{-2}4^{2b}k^{ab}n^{2}}{2^{-2r}k^{(2b-2t+1)/(b-1)}n^{(2t-1)/b}}

and is $(r,t)$ -compatible if $\delta$ is sufficiently small.

Proof.

Recall that $x$ is such that $(u,x)$ is contained in the fewest number of hyperedges in $\mathcal{H}$ among all vertices in the set $X$ . This together with the definition of $y$ implies that the number of hyperedges containing both $(u,x)$ and $(v,y)$ is at most

\frac{|\mathcal{H}|}{|X|\cdot|B_{t}|}\leq\frac{\delta k^{ab}n^{2}}{\varepsilon^{2}\ell^{(2b-2t+1)/(b-1)}m^{(2t-1)/b}},

where this last step used $|\mathcal{H}|\leq\delta k^{ab}n^{2}$ and that $|B_{t}|\geq\varepsilon\ell^{(b-t+1)/(b-1)}m^{(t-1)/b}$ and $|X|\geq\varepsilon\ell^{(b-t)/(b-1)}m^{t/b}$ by Proposition 2.1(b) and (h). Using $m\geq 2^{-r}n$ and $\ell\geq 4^{-b}2^{r}k$ from (3) and (4) gives the first result.
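The final substitution can be checked as follows. In this sketch $e_{1}:=(2b-2t+1)/(b-1)$ and $e_{2}:=(2t-1)/b$ are shorthand introduced only here, and we assume $r\geq 0$ and $2\leq t<b$ , so that $0<e_{1}<2$ and $0<e_{2}<2$ .

```latex
\begin{align*}
\ell^{e_{1}}m^{e_{2}}
 &\geq\left(4^{-b}2^{r}k\right)^{e_{1}}\left(2^{-r}n\right)^{e_{2}}
  =4^{-be_{1}}\,2^{(e_{1}-e_{2})r}\,k^{e_{1}}n^{e_{2}}\\
 &\geq 4^{-2b}\,2^{-2r}\,k^{e_{1}}n^{e_{2}}
 && \text{since } e_{1}<2 \text{ and } e_{1}-e_{2}>-2.
\end{align*}
```

Dividing $\delta k^{ab}n^{2}\varepsilon^{-2}$ by this lower bound yields the expression in the statement of the claim.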

As in the $t=b$ case, we have $x\notin B_{t}$ by Proposition 2.1(e), so $y\neq x$ and the set $\chi$ is valid. Any $\chi^{\prime}\subsetneq\chi$ trivially fails to be in $\mathcal{F}_{\operatorname{forest}}\cup\mathcal{F}_{t}\cup\mathcal{F}_{r,t}$ , and to show $\chi$ is not in this set it suffices to show

\deg_{\mathcal{H}_{r,t}}(\chi)<D_{r,t}(\chi_{\theta})=\left\lceil\frac{\delta^{-2}k^{ab}n^{2}}{2^{-2r}k^{(2b-2t+1)/(b-1)}n^{(2t-1)/b}}\right\rceil,

and this follows by the first result. We conclude that $\chi$ is $(r,t)$ -compatible.

∎

Now that we have selected our two high degree vertices $x,y$ of our theta graph, we build the rest of the theta graph as follows. First, we work our way out from $y$ by selecting neighbors $z_{b-1}^{j}\in B_{t-1}$ of $y$ , then neighbors $z_{b-2}^{j}\in B_{t}$ of each $z_{b-1}^{j}$ , and so on, until we have chosen vertices $z_{t}^{j}\in B_{t}$ . Then, once we have chosen the vertices $z_{t}^{j}$ , we select paths from the set $\mathcal{Q}$ connecting the vertices $z_{t}^{j}$ to $x$ .

To do the first part, we use the following claim. Here we recall that the paths of $\theta_{a,b}$ are denoted $uw_{1}^{j}\cdots w_{b-1}^{j}v$ , and for this claim we adopt the convention that $w_{b}^{j}:=v$ and $z_{b}^{j}:=y$ . We also recall that $F_{t}\subseteq V(\theta_{a,b})$ is defined to be the set of $w_{i}^{j}$ with $t\leq i<b$ and $i-t$ even. In particular, $w_{b-1}^{j}\notin F_{t}$ when $b-t$ is even.

Claim 4.9.

Let $t\leq i\leq b-1$ and $1\leq j\leq a$ be integers, and let $\chi$ be an $(r,t)$ -compatible set consisting of the pairs $(u,x),(v,y)$ , and $(w_{i^{\prime}}^{j^{\prime}},z_{i^{\prime}}^{j^{\prime}})$ for all $i^{\prime},j^{\prime}$ with either $i^{\prime}>i$ or with $i^{\prime}=i$ and $j^{\prime}<j$ .

• If $i-t$ is odd and $z_{i+1}^{j}\in B_{t}$ , then there exist at least $\frac{1}{2}\varepsilon\ell^{b/(b-1)}$ choices $z_{i}^{j}\in B_{t-1}\cap N_{G^{\prime}}(z_{i+1}^{j})$ such that $\chi\cup\{(w_{i}^{j},z_{i}^{j})\}$ is $(r,t)$ -compatible.

• If $i-t$ is even and $z_{i+1}^{j}\in B_{t-1}$ , then there exist at least $\frac{1}{2}\varepsilon\ell m^{1/b}$ choices $z_{i}^{j}\in B_{t}\cap N_{G^{\prime}}(z_{i+1}^{j})$ such that $\chi\cup\{(w_{i}^{j},z_{i}^{j})\}$ is $(r,t)$ -compatible.

Proof.

Observe that if $z_{i}^{j}\in N_{G^{\prime}}(z_{i+1}^{j})$ is a vertex such that $\chi\cup\{(w_{i}^{j},z_{i}^{j})\}$ is not $(r,t)$ -compatible, then either $z_{i}^{j}\in\chi_{G}$ (which can only hold for $O(1)$ vertices), or there exists some $\chi^{\prime}\subseteq\chi$ with

\{(w_{i}^{j},z_{i}^{j})\}\in\mathcal{J}_{\operatorname{forest}}(\chi^{\prime};w_{i}^{j})\cup\mathcal{J}_{r,t}(\chi^{\prime};w_{i}^{j})\cup\mathcal{J}_{t}(\chi^{\prime};w_{i}^{j}).

Thus it suffices to show that each of these sets is small for each $\chi^{\prime}\subseteq\chi$ .

First consider $\mathcal{J}_{\operatorname{forest}}(\chi^{\prime};w_{i}^{j})$ . If $\chi^{\prime}_{\theta}\cup\{w_{i}^{j}\}$ induces at most one edge, then this link set is empty by Claim 4.3. If this is not the case, then $\chi^{\prime}_{\theta}$ must induce at least one edge since $w_{i}^{j}$ only has one edge incident to $\chi^{\prime}_{\theta}$ (this implicitly uses $i\geq t\geq 2$ , as otherwise $w_{i}^{j}$ would also be adjacent to $u$ ). By 3.2, we find

|\mathcal{J}_{\operatorname{forest}}(\chi^{\prime};w_{i}^{j})|\leq 2^{v(\theta_{a,b})+1}\delta k^{b/(b-1)}.

(18)

Next consider $\mathcal{J}_{r,t}(\chi^{\prime};w_{i}^{j})$ , which we recall is based off of the codegree function defined in Definition 6. If $\{u,v\}\not\subseteq\chi^{\prime}_{\theta}$ , then this link set is empty by 3.2, so we may assume $\{u,v\}\subseteq\chi^{\prime}_{\theta}$ . Then 3.2 gives

|\mathcal{J}_{r,t}(\chi^{\prime};w_{i}^{j})|\leq 2^{v(\theta_{a,b})+1}\delta 2^{2r/3}k^{b/(b-1)}\quad\text{if }i-t\text{ is odd},

since if $i-t$ is odd, adding $w_{i}^{j}\notin F_{t}$ to $\chi^{\prime}_{\theta}$ keeps the parameter $f=|\nu\cap F_{t}|$ in Definition 6 the same while increasing $|\nu|$ . Similarly,

|\mathcal{J}_{r,t}(\chi^{\prime};w_{i}^{j})|\leq 2^{v(\theta_{a,b})+1}\delta 2^{2r/3}kn^{1/b}\quad\text{if }i-t\text{ is even},

since $f$ and $|\nu|$ both increase by 1.

Finally consider $\mathcal{J}_{t}(\chi^{\prime};w_{i}^{j})$ , which we recall is based off of the codegree function defined in Definition 7. Again we may assume $\{u,v\}\subseteq\chi^{\prime}_{\theta}$ . If $w_{b-1}^{j^{\prime}}\in\chi^{\prime}_{\theta}$ for some $j^{\prime}$ , then the argument and final bound are exactly the same as in the case for $\mathcal{J}_{r,t}$ (with $g$ taking the role of $f$ in exactly the same way as before). We next consider the subcase $w_{b-1}^{j^{\prime}}\notin\chi^{\prime}_{\theta}$ for all $j^{\prime}<j$ . If $i\neq b-1$ , then $\chi^{\prime}_{\theta}\cup\{w_{i}^{j}\}$ contains no vertex of the form $w_{b-1}^{j^{\prime}}$ , so $D_{t}(\chi^{\prime}_{\theta}\cup\{w_{i}^{j}\})=\infty$ , and hence the link set is empty by 3.1. If $i=b-1$ , then $\chi^{\prime}_{\theta}\subseteq\{u,v,w_{b-1}^{1},\ldots,w_{b-1}^{a}\}$ by the hypothesis of the claim, so our assumption $w_{b-1}^{j^{\prime}}\notin\chi^{\prime}_{\theta}$ implies $\chi^{\prime}=\{(u,x),(v,y)\}$ . Thus

\deg_{\mathcal{H}_{t}}(\chi^{\prime}\cup\{(w_{b-1}^{j},z_{b-1}^{j})\})\leq\deg_{\mathcal{H}}(\{(v,y),(w_{b-1}^{j},z_{b-1}^{j})\})<D_{\operatorname{forest}}(\{v,w_{b-1}^{j}\})=D_{t}(\chi^{\prime}_{\theta}\cup\{w_{b-1}^{j}\}),

where the first inequality used that we are looking at the codegree of a smaller set in a larger hypergraph, the second inequality used (2) (i.e. that every edge in $G^{\prime}$ has codegree smaller than that given by $D_{\operatorname{forest}}$ ), and the equality used $|\chi^{\prime}_{\theta}\cup\{w_{b-1}^{j}\}|=3$ and $g=|(\chi^{\prime}_{\theta}\cup\{w_{b-1}^{j}\})\cap F_{t}|=0$ in the definition of $D_{t}$ for $b-t$ even. This implies $\mathcal{J}_{t}(\chi^{\prime};w_{b-1}^{j})=\emptyset$ .

By summing up the sizes of all of these sets over all possible choices of $\chi^{\prime}\subseteq\chi$ (as well as the number of choices $z_{i}^{j}\in\chi_{G}$ ), we find when $i-t$ is odd that the number of $z_{i}^{j}$ which cannot be selected is at most

O_{a,b}(\delta 2^{2r/3}k^{b/(b-1)})=O_{a,b}(\delta\ell^{b/(b-1)}),

with the last step using $\ell\geq 4^{-b}2^{r}k$ . By Proposition 2.1(d), $z_{i+1}^{j}\in B_{t}$ has at least $\varepsilon\ell^{b/(b-1)}$ neighbors in $B_{t-1}$ , and for $\delta$ sufficiently small this is at least twice the number of forbidden choices. Essentially the same reasoning holds for the $i-t$ even case after noting $k^{b/(b-1)}\leq kn^{1/b}$ when applying (18). We conclude the result.

∎
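The comparison $O_{a,b}(\delta 2^{2r/3}k^{b/(b-1)})=O_{a,b}(\delta\ell^{b/(b-1)})$ used in the proof above can be verified directly. The following sketch assumes $r\geq 0$ and $b\geq 2$ .

```latex
\[
\ell^{b/(b-1)}\geq\left(4^{-b}2^{r}k\right)^{b/(b-1)}
=4^{-b^{2}/(b-1)}\,2^{rb/(b-1)}\,k^{b/(b-1)}
\geq 4^{-2b}\,2^{2r/3}\,k^{b/(b-1)},
\]
```

Here the last step uses $b^{2}/(b-1)\leq 2b$ and $rb/(b-1)\geq r\geq 2r/3$ , so that $2^{2r/3}k^{b/(b-1)}\leq 16^{b}\ell^{b/(b-1)}$ .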

By starting with the two high-degree vertices $(u,x)$ and $(v,y)$ , and iteratively applying Claim 4.9, we can find many $(r,t)$ -compatible sets $\chi$ with $\chi_{\theta}=\{u,v\}\cup\bigcup_{i\geq t}\{w_{i}^{1},\ldots,w_{i}^{a}\}$ . To get the remaining vertices corresponding to $w_{i}^{j}$ with $i<t$ , we use the same strategy as in the $t=b$ case of choosing paths from $\mathcal{Q}$ .

Claim 4.10.

Let $P_{1},\dots,P_{j-1}$ be a collection of paths in $\mathcal{Q}$ ending in $B_{t}$ , and for each path $P_{j^{\prime}}$ , write $P_{j^{\prime}}=xz_{1}^{j^{\prime}}\cdots z_{t}^{j^{\prime}}$ . Suppose that the set

\chi:=\{(u,x),(v,y)\}\cup\bigcup_{i\geq t}\{(w_{i}^{1},z_{i}^{1}),\ldots,(w_{i}^{a},z_{i}^{a})\}\cup\bigcup_{j^{\prime}<j}\left\{(w_{1}^{j^{\prime}},z_{1}^{j^{\prime}}),\dots,(w_{t}^{j^{\prime}},z_{t}^{j^{\prime}})\right\}

is $(r,t)$ -compatible. Then there are at least $\frac{1}{2}\varepsilon\ell^{(t-1)b/(b-1)}$ choices of a path $P_{j}=xz_{1}^{j}\cdots z_{t}^{j}$ in $\mathcal{Q}$ so that

\chi_{j}:=\chi\cup\left\{(w_{1}^{j},z_{1}^{j}),\dots,(w_{t}^{j},z_{t}^{j})\right\}

is $(r,t)$ -compatible.

Sketch of Proof.

The argument is almost identical to that of Claim 4.6 so we only sketch the details (with our notation defined analogously as before). By Proposition 2.1(c) we have that there are at least $\varepsilon\ell^{(t-1)b/(b-1)}$ paths in $\mathcal{Q}$ from $z_{t}^{j}$ to $x$ . Using 2.1(f) we find that very few of these paths contain any of the other vertices of $\chi_{G}$ besides $x$ and $y$ .

By using 3.2, we find that each of the sets $\mathcal{J}_{r,t}(\chi^{\prime};\nu),\ \mathcal{J}_{t}(\chi^{\prime};\nu)$ , and $\mathcal{J}_{\operatorname{forest}}(\chi^{\prime};\nu)$ after intersecting with $\mathcal{P}_{\text{possible}}$ are all of size $O((\delta 2^{2r/3}k^{b/(b-1)})^{p})$ whenever $\chi^{\prime}\subseteq\chi$ and $\nu\subseteq\{w_{1}^{j},\ldots,w_{t-1}^{j}\}$ with $|\nu|=p$ (here we use that $\nu\cap F_{t}=\emptyset$ for any such $\nu$ , so $f,g$ in the definitions of $D_{r,t},D_{t}$ do not change when going from $\chi^{\prime}_{\theta}$ to $\chi^{\prime}_{\theta}\cup\nu$ ). From here essentially the same computations as before go through. ∎

Combining the previous three claims, we find that our algorithm produces a large number of theta graphs.

Claim 4.11.

The number of $(r,t)$ -compatible sets of size $v(\theta_{a,b})$ containing $(u,x)$ and $(v,y)$ is at least

\Omega\left(2^{2r}k^{ab-\frac{2b-2t+1}{b-1}}n^{2-\frac{2t-1}{b}}\right).

Proof.

The result follows by iteratively applying Claims 4.9 and 4.10. Starting with $\chi_{0}:=\{(u,x),(v,y)\}$ , which is $(r,t)$ -compatible by Claim 4.8, we repeatedly apply Claim 4.9 to build paths $z_{t}^{j}\cdots z_{b-1}^{j}y$ (for $1\leq j\leq a$ ); we then finish by repeatedly applying Claim 4.10 to select paths $xz_{1}^{j}\cdots z_{t}^{j}$ . In total, we find that the number of $(r,t)$ -compatible sets $h$ of size $v(\theta_{a,b})$ with $(u,x),(v,y)\in h$ is at least

\left(\left(\frac{1}{2}\varepsilon\ell m^{1/b}\right)^{(b-t)/2}\cdot\left(\frac{1}{2}\varepsilon\ell^{b/(b-1)}\right)^{(b-t)/2}\cdot\left(\frac{1}{2}\varepsilon\ell^{(t-1)b/(b-1)}\right)\right)^{a},

(19)

where the first two terms use that for each path $xz_{1}^{j}\cdots z_{b-1}^{j}y$ we get a factor of $\frac{1}{2}\varepsilon\ell m^{1/b}$ for each vertex in position $i\in\{t,t+2,\ldots,b-2\}$ and a factor of $\frac{1}{2}\varepsilon\ell^{b/(b-1)}$ for each $i\in\{t+1,t+3,\ldots,b-1\}$ by Claim 4.9, and the last term uses Claim 4.10. The expression above is equal to some positive constant depending only on $a,b,\varepsilon$ times

\begin{aligned}
\ell^{ab-\frac{a(b-t)}{2(b-1)}}m^{\frac{a(b-t)}{2b}}
&=\ell^{ab-\frac{a(b-t)}{2(b-1)}}m^{\frac{(a-4)(b-t)-2}{2b}}\cdot m^{\frac{4b-4t+2}{2b}}\\
&\geq\ell^{ab-\frac{a(b-t)}{2(b-1)}+\frac{(a-4)(b-t)-2}{2(b-1)}}\cdot m^{\frac{2b-2t+1}{b}}=\ell^{ab-\frac{2b-2t+1}{b-1}}\cdot m^{2-\frac{2t-1}{b}},
\end{aligned}

where the inequality used (5), i.e. $m^{1/b}\geq\ell^{1/(b-1)}$ , and implicitly that $a\geq 6$ so that the exponent of $m$ is positive. Finally, using $\ell=\Omega(2^{r}k)$ and $m=\Omega(2^{-r}n)$ crudely gives

\ell^{ab-\frac{2b-2t+1}{b-1}}\cdot m^{2-\frac{2t-1}{b}}=\Omega\left(2^{(ab-2)r}k^{ab-\frac{2b-2t+1}{b-1}}\cdot 2^{-(2-\frac{2t-1}{b})r}n^{2-\frac{2t-1}{b}}\right)=\Omega\left(2^{2r}k^{ab-\frac{2b-2t+1}{b-1}}n^{2-\frac{2t-1}{b}}\right),

where this last step used $ab\geq 6$ . ∎
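As a check on the exponent of $\ell$ in the proof above, the contribution of a single path to (19) can be computed as follows (a routine verification; multiplying by $a$ gives the exponent $ab-\frac{a(b-t)}{2(b-1)}$ ). The exponent of $m$ per path is simply $\frac{b-t}{2b}$ .

```latex
\begin{align*}
\frac{b-t}{2}+\frac{b-t}{2}\cdot\frac{b}{b-1}+\frac{(t-1)b}{b-1}
 &=\frac{(b-t)(b-1)+(b-t)b+2(t-1)b}{2(b-1)}\\
 &=\frac{2b^{2}-3b+t}{2(b-1)}
  =b-\frac{b-t}{2(b-1)}.
\end{align*}
```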

We are now ready to finish Case 2. If $\delta$ is sufficiently small in terms of $a,b,\varepsilon$ , the number of theta graphs guaranteed by Claim 4.11 exceeds the codegree bound in Claim 4.8; thus there exists some $(r,t)$ -compatible set $h$ obtained through our algorithm which is not already a hyperedge of $\mathcal{H}$ . Adding such an $h$ to $\mathcal{H}_{r,t}$ gives the result in this case.

4.2.4 Case 3: $t<b$ and $b-t$ Odd

This case is nearly identical to the previous one, and as such we only sketch the proof.

Again our goal is to add a new hyperedge to $\mathcal{H}_{r,t}$ . To start, we pick $y\in B_{t-1}\setminus\{x\}$ such that $(v,y)$ is in as few hyperedges with $(u,x)$ as possible. Here we emphasize that, in the previous case, we picked $y\in B_{t}$ and hence immediately obtained $y\neq x$ (since each element of $B_{t}$ is the endpoint of a path with $x$ ), but here we have to be slightly more careful and explicitly enforce $y\neq x$ . However, since no hyperedge of $\mathcal{H}$ contains both $(u,x)$ and $(v,x)$ (since every hyperedge is a valid set), and since $|B_{t-1}\setminus\{x\}|\geq\frac{1}{2}\varepsilon\ell^{(b-t+1)/(b-1)}m^{(t-1)/b}$ by Proposition 2.1(b), we find that $\deg_{\mathcal{H}}(\{(u,x),(v,y)\})$ is at most twice the bound from Claim 4.8, and the rest of the proof showing that this set is $(r,t)$ -compatible goes through in exactly the same way as in Claim 4.8.

From here we apply Claim 4.9 exactly as written (since $w_{i}^{j}\in F_{t}$ depends only on the parity of $i-t$ and not of $b-t$ ); the proof of Claim 4.9 also remains word for word the same, with the only minor exception being that we have $g=g(\nu):=|\nu\cap F_{t}|-1$ (which again implies $g=0$ when $i=b-1$ and $\chi^{\prime}_{\theta}=\{u,v\}$ ).

Finally, we choose paths in $\mathcal{Q}$ going from each of the $z_{t}^{j}$ vertices to $x$ , and again the statement and proof of Claim 4.10 remain exactly the same. With this, the total number of choices for the algorithm to produce an $(r,t)$ -compatible set is

\left(\left(\frac{1}{2}\varepsilon\ell m^{1/b}\right)^{(b-t+1)/2}\cdot\left(\frac{1}{2}\varepsilon\ell^{b/(b-1)}\right)^{(b-t-1)/2}\cdot\left(\frac{1}{2}\varepsilon\ell^{(t-1)b/(b-1)}\right)\right)^{a},

since in this setting we get a factor of $\frac{1}{2}\varepsilon\ell m^{1/b}$ for each vertex in position $i\in\{t,t+2,\ldots,b-1\}$ , of which there are $(b-t+1)/2$ . This quantity is at least as large as (19), so we conclude that for $\delta$ sufficiently small the number of choices is more than the number of hyperedges containing $(u,x),(v,y)$ in $\mathcal{H}$ . With this we conclude the result.

5 Balanced Supersaturation for Edges

In the previous section we showed that $\theta_{a,b}$ exhibits balanced supersaturation for vertices in terms of the (complicated) codegree function $D^{\prime}_{t}$ . We begin by simplifying this function.

Proposition 5.1.

For all $a\geq 100$ and $b\geq 3$ , let $\delta>0$ and $D^{\prime}_{t}$ be as in 4.1. There exist constants $C^{\prime},k_{0}>0$ such that if $n^{1-1/b}\geq k\geq k_{0}$ and $\nu\subseteq V(\theta_{a,b})$ induces $e$ edges, where $1\leq e\leq e(\theta_{a,b})-1$ , then

D^{\prime}_{t}(\nu)\leq\frac{C^{\prime}k^{ab}n^{2}}{kn^{1+1/b}\big{(}\min\big{\{}k^{b/(b-1)},kn^{\frac{b-1}{b(ab-1)}}\big{\}}\big{)}^{e-1}}.

Note that $n^{1-1/b}\geq k$ always holds if we are considering $n$ -vertex graphs $G$ with $kn^{1+1/b}$ edges. We defer the proof of Proposition 5.1 for the moment and show that, together with 4.1, it implies a balanced supersaturation result for edges, which we will use to complete the proof of 1.3; see Theorem 6.1 below.

Corollary 5.2.

For all $a\geq 100$ and $b\geq 3$ , there exist constants $C,k_{0}>0$ such that the following holds for all $n\in\mathbb{N}$ and $k\geq k_{0}$ . If $G$ is an $n$ -vertex graph with $kn^{1+1/b}$ edges, then there exists a hypergraph $\mathcal{H}$ on $E(G)$ whose hyperedges are copies of $\theta_{a,b}$ and is such that $|\mathcal{H}|\geq C^{-1}k^{ab}n^{2}$ and such that for every $\sigma\subseteq E(G)$ with $1\leq|\sigma|\leq e(\theta_{a,b})-1$ , we have

\deg_{\mathcal{H}}(\sigma)\leq\frac{Ck^{ab}n^{2}}{kn^{1+1/b}\big{(}\min\big{\{}k^{b/(b-1)},kn^{\frac{b-1}{b(ab-1)}}\big{\}}\big{)}^{|\sigma|-1}}.

Proof.

Let $\mathcal{H}^{\prime}_{t}$ be the $D^{\prime}_{t}$ -good $G$ -hypergraph on $V(\theta_{a,b})\times V(G)$ guaranteed by 4.1. We would like to translate $\mathcal{H}^{\prime}_{t}$ into a hypergraph $\mathcal{H}$ on $E(G)$ satisfying the codegree bounds above.

This will be conceptually straightforward, but a little tedious. In essence, the hyperedges of $\mathcal{H}^{\prime}_{t}$ correspond to theta graphs in $G$ , and we will define $\mathcal{H}$ to be the hypergraph corresponding to these theta graphs. However, we must deal with two small issues with this translation: (1) a single theta graph in $G$ may appear isomorphically several times in $\mathcal{H}^{\prime}_{t}$ , and (2) the codegree bound $D_{t}^{\prime}(\nu)$ depends on the number of edges induced by $\nu$ , whereas the bound in Corollary 5.2 depends only on $|\sigma|$ for an arbitrary set of edges $\sigma$ , even if the vertices used by $\sigma$ induce additional edges. Neither of these issues is a real obstacle (in particular, the second can only improve the codegrees), but we will need some additional notation in order to address them.

For each valid set $\chi$ , we will define the corresponding set of edges induced in $G$ (excluding “extraneous” edges that do not play a role in the isomorphic copy of $\theta_{a,b}$ ) as follows:

E_{\chi}=\{zz^{\prime}:(w,z),(w^{\prime},z^{\prime})\in\chi,\ ww^{\prime}\in E(\theta_{a,b})\}.

In particular, $E_{h}\subseteq E(G)$ is a copy of $\theta_{a,b}$ in $G$ for every hyperedge $h\in\mathcal{H}^{\prime}_{t}$ (since every hyperedge $h\in\mathcal{H}^{\prime}_{t}$ is a valid set of size $v(\theta_{a,b})$ ). Define $\mathcal{H}$ to be the hypergraph with hyperedge set $\{E_{h}:h\in\mathcal{H}^{\prime}_{t}\}$ . Observe that $|\mathcal{H}|\geq\frac{1}{v(\theta_{a,b})!}|\mathcal{H}^{\prime}_{t}|=\Omega(k^{ab}n^{2})$ , so it remains to check the codegree conditions – that is, to bound $\deg_{\mathcal{H}}(\sigma)$ for each set of edges $\sigma$ in $G$ .
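To make the definition of $E_{\chi}$ concrete, here is a small sketch that computes it for a set of pairs $(w,z)$. The vertex labels, the helper `theta_edges`, and the toy example are assumptions made for illustration only; the code does not check that $\chi$ is a valid set in the sense used above.

```python
def theta_edges(a, b):
    """Edges of theta_{a,b}: a internally disjoint u-v paths of length b (b >= 2).

    Internal vertices are labelled ("w", i, j) for position i on path j.
    """
    edges = set()
    for j in range(a):
        path = ["u"] + [("w", i, j) for i in range(1, b)] + ["v"]
        edges.update(frozenset(pair) for pair in zip(path, path[1:]))
    return edges

def E_chi(chi, pattern_edges):
    """Edges zz' such that (w,z),(w',z') lie in chi and ww' is a pattern edge."""
    return {
        frozenset((z, z2))
        for (w, z) in chi
        for (w2, z2) in chi
        if frozenset((w, w2)) in pattern_edges
    }

# Toy example: theta_{2,2} is a 4-cycle, embedded on host vertices 1, 2, 3, 4.
theta = theta_edges(2, 2)
chi = {("u", 1), (("w", 1, 0), 2), ("v", 3), (("w", 1, 1), 4)}
full_copy = E_chi(chi, theta)
partial = E_chi({("u", 1), (("w", 1, 0), 2)}, theta)
```

In the toy example a full-size $\chi$ yields all four edges of the cycle, while a two-element $\chi$ yields only the single edge between its two host vertices.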

Fix a set of edges $\sigma\subseteq E(G)$ . We need to get an understanding of which valid sets “correspond” to $\sigma$ . To this end, let $\sigma_{v}\subseteq V(G)$ be the set of vertices used by the edges $\sigma$ , and let $\mathcal{X}$ be the set of all valid $\chi\subseteq V(\theta_{a,b})\times V(G)$ with $\chi_{G}=\sigma_{v}$ and $\sigma\subseteq E_{\chi}$ . See Figure 7 for an example.

With this notation, we can convert the codegree bounds in $\mathcal{H}^{\prime}_{t}$ to a bound on $\deg_{\mathcal{H}}(\sigma)$ as follows.

Claim 5.3.

We have

\deg_{\mathcal{H}}(\sigma)\leq\sum_{\chi\in\mathcal{X}}\deg_{\mathcal{H}^{\prime}_{t}}(\chi).

Proof.

We would like to show that each hyperedge $h\in\mathcal{H}$ counted by $\deg_{\mathcal{H}}(\sigma)$ corresponds to at least one hyperedge $h^{\prime}\in\mathcal{H}_{t}^{\prime}$ counted by $\deg_{\mathcal{H}^{\prime}_{t}}(\chi)$ for some $\chi\in\mathcal{X}$ . We first observe that if $h\in\mathcal{H}$ , then by definition of $\mathcal{H}$ , there exists some $h^{\prime}\in\mathcal{H}^{\prime}_{t}$ with $h=E_{h^{\prime}}$ . We wish to show that this $h^{\prime}$ contains a set $\chi\in\mathcal{X}$ , so that it will be counted by $\deg_{\mathcal{H}^{\prime}_{t}}(\chi)$ .

To this end, we observe that if the edge set $\sigma$ is contained in $h=E_{h^{\prime}}$ , then the corresponding vertex set $\sigma_{v}$ is contained in $h^{\prime}_{G}$ ; so there exists some $\chi\subseteq h^{\prime}$ such that $\chi_{G}=\sigma_{v}$ . We claim that $\sigma\subseteq E_{\chi}$ as well. To see this, note that if $zz^{\prime}\in\sigma\subseteq E_{h^{\prime}}$ , then $(w,z),(w^{\prime},z^{\prime})\in h^{\prime}$ for some $ww^{\prime}\in E(\theta_{a,b})$ by the definition of $E_{h^{\prime}}$ . Since $z,z^{\prime}\in\sigma_{v}=\chi_{G}$ and $\chi\subseteq h^{\prime}$ , we have $(w,z),(w^{\prime},z^{\prime})\in\chi$ as well, giving $zz^{\prime}\in E_{\chi}$ . So by definition, $\chi\in\mathcal{X}$ as desired. ∎

To finish, it remains to bound the sum above. First, notice that there are only a constant number of terms: each element of $\mathcal{X}$ is uniquely identified by $\chi_{\theta}$ (since $\chi_{G}=\sigma_{v}$ for each $\chi\in\mathcal{X}$ ), and hence $|\mathcal{X}|\leq 2^{v(\theta_{a,b})}$ . Since $\mathcal{H}^{\prime}_{t}$ is $D^{\prime}_{t}$ -good, we have $\deg_{\mathcal{H}^{\prime}_{t}}(\chi)\leq D^{\prime}_{t}(\chi_{\theta})$ , and hence

\deg_{\mathcal{H}}(\sigma)\leq\sum_{\chi\in\mathcal{X}}\deg_{\mathcal{H}^{\prime}_{t}}(\chi)\leq\sum_{\chi\in\mathcal{X}}D^{\prime}_{t}(\chi_{\theta})\leq 2^{v(\theta_{a,b})}\cdot\frac{C^{\prime}k^{ab}n^{2}}{kn^{1+1/b}\big{(}\min\big{\{}k^{b/(b-1)},kn^{\frac{b-1}{b(ab-1)}}\big{\}}\big{)}^{|\sigma|-1}},

where this last step used that each $\chi\in\mathcal{X}$ induces $|E_{\chi}|\geq|\sigma|$ edges, together with the bound on $D_{t}^{\prime}$ given by Proposition 5.1. This gives the desired result by taking $C=2^{v(\theta_{a,b})}C^{\prime}$ . ∎

5.1 Proof of Proposition 5.1

The rest of this section is dedicated to proving Proposition 5.1, which we emphasize will consist entirely of (moderately involved) arithmetic and case analysis. We will abuse notation slightly by identifying a vertex set $\nu\subseteq V(\theta_{a,b})$ with its induced subgraph in $\theta_{a,b}$ . For example, we may say $\nu$ contains a cycle to mean its induced subgraph contains a cycle. Unless stated otherwise, $e$ will refer to the number of edges that $\nu$ induces in $\theta_{a,b}$ . We let $\delta>0$ be the constant guaranteed by 4.1, and throughout we assume $n^{1-1/b}\geq k$ and $b\geq 3$ . We recall that our goal is to show that there exists a constant $C^{\prime}>0$ such that for $k$ sufficiently large and for all $\nu\subseteq V(\theta_{a,b})$ inducing $e$ edges for $1\leq e\leq e(\theta_{a,b})-1$ , we have

D^{\prime}_{t}(\nu)\leq\frac{C^{\prime}k^{ab}n^{2}}{kn^{1+1/b}\big{(}\min\big{\{}k^{b/(b-1)},kn^{\frac{b-1}{b(ab-1)}}\big{\}}\big{)}^{e-1}},

where the definition of $D^{\prime}_{t}(\nu)$ will be recalled below. We begin with an easy case.

Lemma 5.4.

For any codegree function $D$ , if $\nu\subseteq V(\theta_{a,b})$ is such that $D(\nu)=1$ and $\nu$ induces $e\geq 1$ edges, then

D(\nu)\leq\frac{k^{ab}n^{2}}{kn^{1+1/b}\big{(}kn^{\frac{b-1}{b(ab-1)}}\big{)}^{e-1}}.

Proof.

This follows immediately from $e\leq ab$ . ∎
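The one-line proof above amounts to checking that both exponents on the right-hand side are non-negative when $e\leq ab$ , with equality at $e=ab$ . The sketch below (helper name illustrative) tabulates these exponents exactly; it assumes $k,n\geq 1$ , so that non-negative exponents give a right-hand side of at least 1.

```python
from fractions import Fraction as Fr

def rhs_exponents(a, b, e):
    """Return (exponent of k, exponent of n) of
    k^{ab} n^2 / ( k n^{1+1/b} * (k n^{(b-1)/(b(ab-1))})^{e-1} )."""
    k_exp = a * b - 1 - (e - 1)
    n_exp = 2 - (1 + Fr(1, b)) - (e - 1) * Fr(b - 1, b * (a * b - 1))
    return k_exp, n_exp

def lemma_holds(a, b):
    """Both exponents are non-negative for every 1 <= e <= ab."""
    return all(
        k_exp >= 0 and n_exp >= 0
        for k_exp, n_exp in (rhs_exponents(a, b, e) for e in range(1, a * b + 1))
    )

checked = all(lemma_holds(a, b) for a in range(2, 20) for b in range(2, 12))
```

At $e=ab$ both exponents vanish, which shows that the bound $e\leq ab$ is exactly what the lemma needs.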

We remind the reader that

D^{\prime}_{t}(\nu):=\begin{cases}D_{\operatorname{forest}}(\nu)&\nu\textrm{ induces a forest},\\ \min\{D_{t}(\nu),20(D_{0,t}(\nu)+\left\lceil\log n\right\rceil)\}&\textrm{otherwise}.\end{cases}

For ease of reading, we recall each of the functions mentioned above before they are used. First, we recall

D_{\operatorname{forest}}(\nu)=\left\lceil\frac{k^{ab}n^{2}}{\delta kn^{1+1/b}\cdot(\delta k^{b/(b-1)})^{e-1}}\right\rceil

whenever $\nu\subseteq V(\theta_{a,b})$ induces a forest with $e\geq 1$ edges.

Lemma 5.5.

If $a\geq 3$ , $2\leq t\leq b$ , and $\nu\subseteq V(\theta_{a,b})$ induces a forest on $e\geq 1$ edges, then

D^{\prime}_{t}(\nu)\leq\frac{2\delta^{-ab}k^{ab}n^{2}}{kn^{1+1/b}\big{(}\min\big{\{}k^{b/(b-1)},kn^{\frac{b-1}{b(ab-1)}}\big{\}}\big{)}^{e-1}}.

Proof.

If $D^{\prime}_{t}(\nu)=D_{\operatorname{forest}}(\nu)=1$ then the result follows from Lemma 5.4, and otherwise

D^{\prime}_{t}(\nu)=D_{\operatorname{forest}}(\nu)\leq\frac{2k^{ab}n^{2}}{\delta kn^{1+1/b}\cdot(\delta k^{b/(b-1)})^{e-1}},

where the factor of 2 comes from the ceiling function and the assumption $D_{\operatorname{forest}}(\nu)>1$ . The result follows since $e-1\leq ab-1$ . ∎

It remains to prove the result when $\nu$ contains a cycle. To help with the case analysis, we show that it suffices to prove the result when $\nu$ consists of paths of length $b$ , i.e. when $\nu$ contains no leaves or isolated vertices.

Lemma 5.6.

Let $D$ be a codegree function such that if $\nu\subseteq V(\theta_{a,b})$ contains a cycle, then either $D(\nu)=1$ or $D(\nu)\leq 2\delta^{-1}k^{-b/(b-1)}D(\nu\setminus\{w\})$ for every $w\in\nu$ .

If $C\geq 1$ is a constant such that for every $\nu$ which consists of paths of length $b$ , we have

D(\nu)\leq\frac{Ck^{ab}n^{2}}{kn^{1+1/b}\big{(}\min\big{\{}k^{b/(b-1)},kn^{\frac{b-1}{b(ab-1)}}\big{\}}\big{)}^{e-1}},

then for every $k\geq 2\delta^{-1}$ and $\nu$ which contains a cycle, we have

D(\nu)\leq\frac{C(2\delta^{-1})^{|\nu|}k^{ab}n^{2}}{kn^{1+1/b}\big{(}\min\big{\{}k^{b/(b-1)},kn^{\frac{b-1}{b(ab-1)}}\big{\}}\big{)}^{e-1}}.

Proof.

Assume the hypotheses hold for $D$ . We prove by induction on $|\nu|$ that any $\nu\subseteq V(\theta_{a,b})$ containing a cycle satisfies the desired inequality. For this proof, we recall that $u,v$ always denote the two high degree vertices of $\theta_{a,b}$ .

Say we have proved the result up to some set $\nu$ which induces $e$ edges. If $D(\nu)=1$ then the inequality follows from Lemma 5.4, so we may assume $D(\nu)>1$ . If $\nu$ consists of paths of length $b$ then the result follows by hypothesis. Otherwise, there exists some vertex $w\in\nu\setminus\{u,v\}$ which is adjacent to at most one other vertex in $\nu$ . Thus $\nu\setminus\{w\}$ induces a graph containing a cycle with at least $e-1$ edges. By our hypothesis on $D$ , we find

\begin{aligned}D(\nu)&\leq 2\delta^{-1}k^{-b/(b-1)}\cdot D(\nu\setminus\{w\})\\ &\leq 2\delta^{-1}k^{-b/(b-1)}\cdot\frac{C(2\delta^{-1})^{|\nu|-1}k^{ab}n^{2}}{kn^{1+1/b}\big{(}\min\big{\{}k^{b/(b-1)},kn^{\frac{b-1}{b(ab-1)}}\big{\}}\big{)}^{e-2}}\\ &\leq\frac{C(2\delta^{-1})^{|\nu|}k^{ab}n^{2}}{kn^{1+1/b}\big{(}\min\big{\{}k^{b/(b-1)},kn^{\frac{b-1}{b(ab-1)}}\big{\}}\big{)}^{e-1}},\end{aligned}

where the second inequality used that our inductive hypothesis applies to $\nu\setminus\{w\}$ (since $\nu\setminus\{w\}$ contains a cycle and $D(\nu\setminus\{w\})\geq(\delta/2)k^{b/(b-1)}D(\nu)>1$ ). This gives the desired result. ∎

We will show that essentially all of our remaining codegree functions are of the form described in Lemma 5.6. First, we recall

D_{0,b}(\nu)=\left\lceil\frac{k^{ab}n^{2}}{n^{2}(k^{b/(b-1)})^{|\nu|-2}\delta^{|\nu|}}\right\rceil

whenever $\nu$ contains the two high degree vertices $u,v$ , and $D_{0,b}(\nu)=\infty$ otherwise. Note that if $\nu$ contains a cycle, then $D_{0,b}(\nu\setminus\{w\})=\infty$ if $w\in\{u,v\}$ , and otherwise if $D_{0,b}(\nu)>1$ we have $D_{0,b}(\nu)\leq 2\delta^{-1}k^{-b/(b-1)}D_{0,b}(\nu\setminus\{w\})$ , where the factor of 2 comes from the ceiling function. Thus $D_{0,b}$ satisfies the conditions of Lemma 5.6, and using this we prove the following.

Lemma 5.7.

There exists a constant $C>0$ such that if $k$ is sufficiently large in terms of $a,b,\delta$ and if $\nu\subseteq V(\theta_{a,b})$ induces $e$ edges where $1\leq e\leq e(\theta_{a,b})-1$ , then

D^{\prime}_{b}(\nu)\leq\frac{Ck^{ab}n^{2}}{kn^{1+1/b}\big{(}\min\big{\{}k^{b/(b-1)},kn^{\frac{b-1}{b(ab-1)}}\big{\}}\big{)}^{e-1}}.

Proof.

First, if $\nu$ induces a forest then the result follows from Lemma 5.5, so we may assume $\nu$ contains a cycle.

Now consider the case $D_{b}^{\prime}(\nu)\leq 40\lceil\log n\rceil$ . In particular, it suffices here to show that

40\lceil\log n\rceil\leq\frac{Ck^{ab}n^{2}}{kn^{1+1/b}\cdot\big{(}kn^{\frac{b-1}{b(ab-1)}}\big{)}^{e-1}}.

And indeed, since $e\leq ab-1$ , this inequality is satisfied if

40\lceil\log n\rceil\leq Ckn^{\frac{b-1}{b(ab-1)}},

which holds for all $n$ and $k$ , provided $C$ is sufficiently large.

Thus we may assume that $40\lceil\log n\rceil\leq D_{b}^{\prime}(\nu)$ ; in particular, since $D^{\prime}_{b}(\nu)\leq 20(D_{0,b}(\nu)+\lceil\log n\rceil)$ , this implies that $\lceil\log n\rceil\leq D_{0,b}(\nu)$ , and as such $D_{b}^{\prime}(\nu)\leq 40D_{0,b}(\nu)$ . Possibly by adjusting the constant $C$ , it now suffices to show $D_{0,b}(\nu)$ satisfies the inequality of the lemma. By Lemmas 5.4 and 5.6, it suffices to show this holds for $\nu$ consisting of $p\geq 2$ paths of length $b$ . In this case $|\nu|=p(b-1)+2$ and $e=pb$ , so it suffices to show

\frac{k^{ab}n^{2}}{n^{2}(k^{b/(b-1)})^{p(b-1)}\delta^{|\nu|}}\leq\frac{Ck^{ab}n^{2}}{kn^{1+1/b}\big{(}\min\big{\{}k^{b/(b-1)},kn^{\frac{b-1}{b(ab-1)}}\big{\}}\big{)}^{pb-1}}

for some constant $C$ , where implicitly we used that the ceiling function in $D_{0,b}(\nu)$ can be ignored by increasing $C$ by a factor of 2. Using $\min\big{\{}k^{b/(b-1)},kn^{\frac{b-1}{b(ab-1)}}\big{\}}\leq kn^{\frac{b-1}{b(ab-1)}}$ and rearranging the above gives that it suffices to show

\big{(}kn^{\frac{b-1}{b(ab-1)}}\big{)}^{pb-1}\leq C\delta^{|\nu|}k^{pb-1}n^{1-1/b},

which holds for any $C\geq\delta^{-|\nu|}$ since $pb-1\leq ab-1$ . We conclude the result. ∎

It remains to deal with the case $t<b$ . To start, we recall that we write the paths of $\theta_{a,b}$ as $uw_{1}^{j}\cdots w_{b-1}^{j}v$ for $1\leq j\leq a$ , and that we define $F_{t}=\{w_{i}^{j}:t\leq i<b,\ i-t\textrm{ is even}\}$ . We recall that if $u,v\in\nu$ then

D_{0,t}(\nu)=\left\lceil\frac{k^{ab}n^{2}}{k^{(2b-2t+1)/(b-1)}n^{(2t-1)/b}\cdot(kn^{1/b})^{f}(k^{b/(b-1)})^{|\nu|-f-2}\delta^{|\nu|}}\right\rceil

where $f=|\nu\cap F_{t}|$ , with $D_{0,t}(\nu)=\infty$ otherwise. Similarly if $u,v,w_{b-1}^{j}\in\nu$ for some $j$ then

D_{t}(\nu)=\left\lceil\frac{k^{ab}n^{2}}{kn^{1+1/b}\cdot(kn^{1/b})^{g}(k^{b/(b-1)})^{|\nu|-g-3}\delta^{|\nu|}}\right\rceil,

where $g=|\nu\cap F_{t}|$ if $b-t$ is even and $g=|\nu\cap F_{t}|-1$ otherwise, with $D_{t}(\nu)=\infty$ otherwise. Note that both of these codegree functions satisfy the conditions of Lemma 5.6 since we assumed $n^{1-1/b}\geq k$ . From now on we will assume we work with $t<b$ and define $f,g$ as in the above codegree functions. It will be useful to note that if $\nu$ consists of $p$ paths of length $b$ , then by definition

f=p\left\lceil(b-t)/2\right\rceil

(20)

and

g=(p-2)\left\lceil(b-t)/2\right\rceil+b-t,

(21)

where this last equality follows from $g=f$ if $b-t$ is even and otherwise $g=f-1=(p-1)\left\lceil(b-t)/2\right\rceil+\left\lfloor(b-t)/2\right\rfloor$ .
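Formulas (20) and (21) can be confirmed by counting $F_{t}$ directly. The sketch below enumerates the positions $i$ with $t\leq i<b$ and $i-t$ even on each of $p$ complete paths; the function names are illustrative, and the comparison is run only over a finite range of parameters.

```python
def F_t_positions(b, t):
    """Positions i on a single path with t <= i < b and i - t even."""
    return [i for i in range(t, b) if (i - t) % 2 == 0]

def f_and_g(p, b, t):
    """f = |nu ∩ F_t| for nu made of p complete paths; g = f or f - 1 by parity of b - t."""
    f = p * len(F_t_positions(b, t))
    g = f if (b - t) % 2 == 0 else f - 1
    return f, g

def matches_formulas(p, b, t):
    """Compare the direct count with (20) and (21)."""
    f, g = f_and_g(p, b, t)
    half_up = -(-(b - t) // 2)  # integer ceiling of (b - t) / 2
    return f == p * half_up and g == (p - 2) * half_up + (b - t)

formulas_ok = all(
    matches_formulas(p, b, t)
    for b in range(3, 25)
    for t in range(2, b)
    for p in range(2, 12)
)
```

For instance, with $p=2$ , $b=5$ , $t=2$ the positions are $i\in\{2,4\}$ on each path, giving $f=4$ and, since $b-t$ is odd, $g=3$ .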

Lemma 5.8.

Let $\nu$ consist of $p\geq 2$ complete paths and define $h=2t+f-b-2$ . There exists a constant $C>0$ such that

D_{0,t}(\nu)\leq\frac{Ck^{ab}n^{2}}{kn^{1+1/b}\big{(}\min\big{\{}k^{b/(b-1)},kn^{\frac{b-1}{b(ab-1)}}\big{\}}\big{)}^{e-1}}

provided either

k\leq n^{\frac{b-1}{b}\cdot\frac{h}{(p-1)b+h}},

or

k^{b-1-h}\geq n^{\frac{b-1}{b}\cdot\frac{(b-1)(pb-1)-h(ab-1)}{ab-1}}.

Proof.

If $D_{0,t}(\nu)=1$ then the result holds by Lemma 5.4, so from now on we assume $D_{0,t}(\nu)>1$ . We can rewrite the denominator of $D_{0,t}(\nu)$ as

\begin{aligned}&\delta^{|\nu|}k^{(2b-2t+1)/(b-1)}n^{(2t-1)/b}(kn^{1/b})^{f}(k^{b/(b-1)})^{|\nu|-f-2}\\ &\quad=\delta^{|\nu|}k^{2b/(b-1)}(k^{-1/(b-1)}n^{1/b})^{2t+f-1}(k^{b/(b-1)})^{|\nu|-2}\\ &\quad=\delta^{|\nu|}kn^{1+1/b}\cdot(k^{-1/(b-1)}n^{1/b})^{2t+f-b-2}(k^{b/(b-1)})^{|\nu|-2}.\end{aligned}

Using this and $D_{0,t}(\nu)>1$ , we see that to show the desired result holds with $C=2\delta^{-v(\theta_{a,b})}$ , it suffices to show

(k^{-1/(b-1)}n^{1/b})^{h}(k^{b/(b-1)})^{|\nu|-2}\geq\big{(}\min\big{\{}k^{b/(b-1)},kn^{\frac{b-1}{b(ab-1)}}\big{\}}\big{)}^{e-1}.

(22)

Using that the minimum in (22) is at most $k^{b/(b-1)}$ and rearranging, we see that it suffices to have

(k^{-1/(b-1)}n^{1/b})^{h}\geq(k^{b/(b-1)})^{(e-1)-(|\nu|-2)}=(k^{b/(b-1)})^{pb-1-p(b-1)}=k^{(p-1)b/(b-1)},

i.e. $k^{((p-1)b+h)/(b-1)}\leq n^{h/b}$ , which gives the first result.

If we instead use that the minimum in (22) is at most $kn^{\frac{b-1}{b(ab-1)}}$ , then we see that this inequality will be satisfied provided

(k^{-1/(b-1)}n^{1/b})^{h}k^{pb}\geq k^{pb-1}n^{\frac{(b-1)(pb-1)}{b(ab-1)}},

which is equivalent to

k^{\frac{b-1-h}{b-1}}\geq n^{\frac{(b-1)(pb-1)}{b(ab-1)}-\frac{h}{b}}=n^{\frac{(b-1)(pb-1)-h(ab-1)}{b(ab-1)}}.

This gives the last part of the lemma. ∎
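The rewriting of the denominator of $D_{0,t}(\nu)$ at the start of this proof is pure exponent arithmetic, so it can be sanity-checked by tracking the exponents of $k$ and $n$ as exact fractions. The sketch below represents a monomial $k^{x}n^{y}$ as the pair $(x,y)$ ; the common factor $\delta^{|\nu|}$ is omitted, and the helper names are illustrative.

```python
from fractions import Fraction as Fr

def power(term, r):
    """(k^x n^y)^r as an exponent pair."""
    return (term[0] * r, term[1] * r)

def product(*terms):
    """Product of monomials k^x n^y, stored as exponent pairs (x, y)."""
    return (sum(t[0] for t in terms), sum(t[1] for t in terms))

def denominator_original(b, t, f, size):
    """k^{(2b-2t+1)/(b-1)} n^{(2t-1)/b} (k n^{1/b})^f (k^{b/(b-1)})^{size-f-2}."""
    return product(
        (Fr(2 * b - 2 * t + 1, b - 1), Fr(2 * t - 1, b)),
        power((Fr(1), Fr(1, b)), f),
        power((Fr(b, b - 1), Fr(0)), size - f - 2),
    )

def denominator_rewritten(b, t, f, size):
    """k n^{1+1/b} (k^{-1/(b-1)} n^{1/b})^{2t+f-b-2} (k^{b/(b-1)})^{size-2}."""
    return product(
        (Fr(1), 1 + Fr(1, b)),
        power((Fr(-1, b - 1), Fr(1, b)), 2 * t + f - b - 2),
        power((Fr(b, b - 1), Fr(0)), size - 2),
    )

rewrite_ok = all(
    denominator_original(b, t, f, size) == denominator_rewritten(b, t, f, size)
    for b in range(3, 12)
    for t in range(2, b)
    for f in range(0, 8)
    for size in range(f + 2, f + 10)
)
```

The identity holds for arbitrary $f$ and $|\nu|$ , not only for the values arising from complete paths, which is why the loop does not tie them to $p$ .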

Lemma 5.9.

Let $\nu$ consist of $p\geq 2$ complete paths. There exists a constant $C>0$ such that

D_{t}(\nu)\leq\frac{Ck^{ab}n^{2}}{kn^{1+1/b}(k^{b/(b-1)})^{e-1}}

provided

k\leq n^{\frac{b-1}{b}\cdot\frac{g}{pb+g}}.

Proof.

By Lemma 5.4 we can assume $D_{t}(\nu)>1$ , so the result holds for some $C\geq 2\delta^{-|\nu|}$ provided

(kn^{1/b})^{g}(k^{b/(b-1)})^{|\nu|-g-3}\geq(k^{b/(b-1)})^{e-1},

and rearranging this gives

(k^{-1/(b-1)}n^{1/b})^{g}\geq(k^{b/(b-1)})^{e-1-|\nu|+3}=(k^{b/(b-1)})^{pb-1-p(b-1)+1}=k^{pb/(b-1)},

and solving for $k$ gives the desired result. ∎

With these two results we can solve the cycle case for $t<b$ provided $a$ is sufficiently large.

Lemma 5.10.

If $b>t\geq 2$ and $a\geq 100,b\geq 3$ , then there exists a constant $C>0$ such that if $k$ is sufficiently large in terms of $a,b,\delta$ and if $\nu\subseteq V(\theta_{a,b})$ induces $e$ edges where $1\leq e\leq e(\theta_{a,b})-1$ , then

D_{t}^{\prime}(\nu)\leq\frac{Ck^{ab}n^{2}}{kn^{1+1/b}(k^{b/(b-1)})^{e-1}}.

Proof.

By using similar reasoning as in Lemma 5.7, it suffices to prove this upper bound holds for $\min\{D_{0,t}(\nu),D_{t}(\nu)\}$ (i.e. ignoring the $\log n$ term in $D^{\prime}_{t}(\nu)$ ) whenever $\nu$ consists of $p\geq 2$ paths of length $b$ . With $h$ as in Lemma 5.8, (20) and (21) give

h=2t+p\left\lceil(b-t)/2\right\rceil-b-2\geq 0,

g=(p-2)\left\lceil(b-t)/2\right\rceil+b-t,

where $h\geq 0$ follows from $p,t\geq 2$ . We also note that both inequalities of Lemma 5.8 become easier to satisfy for larger values of $h$ (this holds for the first inequality because the function $\frac{h}{c+h}$ is increasing for any $c>0$ , and it holds for the second since $n^{\frac{b-1}{b}}\geq k$ ).

We first claim that $h$ is relatively large in most cases; namely, if $p\geq 5$ , then $h\geq b-1$ . Indeed, this being false is equivalent to

2t+p\left\lceil(b-t)/2\right\rceil-b-2<b-1.

If $t=b-1$ then this is equivalent to $p<2b+1-2(b-1)=3$ , contradicting our assumption on $p$ , so we may assume $t<b-1$ . By dropping the ceiling function, the inequality above implies

2t+p(b-t)/2-b-2<b-1,

which is equivalent to

p<4+\frac{2}{b-t}\leq 5,

with the last step using $t<b-1$ , again giving a contradiction to our assumption on $p$ .
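The claim that $p\geq 5$ forces $h\geq b-1$ can also be confirmed by brute force over a finite range, which is a useful guard against off-by-one errors in the ceiling. The following sketch uses an illustrative helper `h_value` and also records two small cases with $p\leq 4$ where the bound fails.

```python
def h_value(p, b, t):
    """h = 2t + p * ceil((b - t) / 2) - b - 2 for nu made of p complete paths."""
    half_up = -(-(b - t) // 2)  # integer ceiling of (b - t) / 2
    return 2 * t + p * half_up - b - 2

# All (p, b, t) in the searched range with p >= 5 and h < b - 1; expected to be empty.
violations = [
    (p, b, t)
    for b in range(3, 60)
    for t in range(2, b)
    for p in range(5, 40)
    if h_value(p, b, t) < b - 1
]

# For p <= 4 the bound can fail, e.g. b = 4, t = 2.
small_p_examples = {p: h_value(p, 4, 2) for p in (2, 3, 4)}
```

With $b=4$ and $t=2$ one gets $h=0,1,2$ for $p=2,3,4$ respectively, all below $b-1=3$ , so the restriction $p\geq 5$ cannot simply be dropped.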

We conclude that $h\geq b-1$ if $p\geq 5$ . Note that the second inequality of Lemma 5.8 trivially holds at $h=b-1$ , and since the lemma is easier to satisfy for larger values of $h$ , we conclude the result for $p\geq 5$ . From now on⁴ we assume $2\leq p\leq 4$ .

⁴ As an aside, it is not difficult to show that Lemma 5.8 alone suffices to prove the result for $p\geq 3$ if, say, $a\geq 12$ . However, for $p=2$ it is necessary to use Lemma 5.9 as well since, in particular, we can have $h=0$ in this case. Dealing with the case $p=2$ here is the only reason the codegree functions $D_{t}$ are introduced.

Note that

g+h=t+(2p-2)\left\lceil(b-t)/2\right\rceil-2,

and in particular,

\max\{g,h\}\geq t/2+(p-1)\left\lceil(b-t)/2\right\rceil-1\geq b/2-1

(where the last step uses $p\geq 2$ ). First consider the case $h\geq b/2-1$ . Note that for $p\leq 4$ we have

(b-1)(pb-1)-h(ab-1)\leq(b-1)(4b-1)-(b/2-1)(ab-1)=(b/2)(8b-9-a(b-2))\leq 0,

where this last step holds for $a\geq 15\geq\frac{8b-9}{b-2}$ (and uses $b\geq 3$ ). With this we either have $h\geq b-1$ (in which case we are done by the argument for $p\geq 5$ ), or

k^{b-1-h}\geq 1\geq n^{\frac{b-1}{b}\cdot\frac{(b-1)(pb-1)-h(ab-1)}{ab-1}},

in which case the result follows from Lemma 5.8.

Now assume $h<b/2-1$ , which in particular implies $g\geq b/2-1$ . By Lemma 5.9, and using $p\leq 4$ , we obtain the result if $k\leq n^{\frac{b-1}{b}\cdot\frac{b/2-1}{4b+b/2-1}}=n^{\frac{b-1}{b}\cdot\frac{b-2}{9b-2}}$ . On the other hand, using the second inequality of Lemma 5.8, which is harder to satisfy the smaller $h\geq 0$ is, we see that for $p\leq 4$ the result holds if

k^{b-1}\geq n^{\frac{b-1}{b}\cdot\frac{(b-1)(4b-1)}{ab-1}}.

Thus the result holds for all $k$ provided

\frac{4b-1}{ab-1}\leq\frac{b-2}{9b-2},

or equivalently

a\geq\frac{1}{b}+\frac{(4b-1)(9b-2)}{b(b-2)}.

This holds for $a\geq 100$ , proving the result. ∎
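To see where the numerical thresholds in this proof come from, one can evaluate the two lower bounds on $a$ for small $b$ . The sketch below is illustrative only: it scans a finite range of $b$ (the upper limit 2000 is an arbitrary choice), and the function names are not from the paper.

```python
from fractions import Fraction as Fr

def a_threshold(b):
    """Right-hand side of a >= 1/b + (4b-1)(9b-2) / (b(b-2)), for b >= 3."""
    return Fr(1, b) + Fr((4 * b - 1) * (9 * b - 2), b * (b - 2))

def first_case_threshold(b):
    """The quantity (8b-9)/(b-2) that a must dominate in the case h >= b/2 - 1."""
    return Fr(8 * b - 9, b - 2)

b_range = range(3, 2001)
worst = max(a_threshold(b) for b in b_range)
worst_first_case = max(first_case_threshold(b) for b in b_range)
argmax_b = max(b_range, key=a_threshold)
```

Over this range both quantities are largest at $b=3$ , where they equal 92 and 15 respectively, consistent with the assumptions $a\geq 100$ and $a\geq 15$ used above; for large $b$ the first quantity approaches 36.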

Proposition 5.1 now follows immediately from Lemmas 5.7 and 5.10.

We note that sharper arguments can easily be used to reduce the bound $a\geq 100$ of Proposition 5.1 considerably, though the bound cannot be made arbitrarily small. In particular, one can work out that the case $b=4$ and $p=t=2$ shows that $a\geq 9$ is needed, as $D_{t}^{\prime}(\nu)$ does not satisfy the conclusion of Proposition 5.1 in this case.

6 Completing the Proof of 1.3

Recall that we wish to show that for all $b\geq 2$ , there exists $a_{0}=a_{0}(b)$ such that for any fixed $a\geq a_{0}$ , w.h.p.

\mathrm{ex}(G_{n,p},\theta_{a,b})=\begin{cases}\Theta\left(p^{\frac{1}{b}}n^{1+\frac{1}{b}}\right)&p\geq n^{-\frac{b-1}{ab-1}}(\log n)^{2b},\\ n^{2-\frac{a(b-1)}{ab-1}}(\log n)^{O(1)}&n^{-\frac{b-1}{ab-1}}(\log n)^{2b}\geq p\geq n^{-\frac{a(b-1)}{ab-1}},\\ (1+o(1))p{n\choose 2}&n^{-\frac{a(b-1)}{ab-1}}\gg p\gg n^{-2}.\end{cases}

The case $b=2$ follows from Morris and Saxton [20], so from now on we assume $b\geq 3$ . The lower bounds for $\mathrm{ex}(G_{n,p},\theta_{a,b})$ follow⁵ from [28, Corollary 5.1], which is proven using random polynomial graphs (similar to how Conlon [7] proved $\mathrm{ex}(n,\theta_{a,b})=\Omega(n^{1+1/b})$ whenever $a$ is sufficiently large in terms of $b$ ). The upper bound for $p$ small follows from the fact that $G_{n,p}$ has at most $(1+o(1))p{n\choose 2}$ edges w.h.p., and the upper bound for $p$ in the middle range will follow from the upper bound for $p$ large due to the monotonicity of $\mathrm{ex}(G_{n,p},F)$ with respect to $p$ .

⁵ Specifically, one applies Corollary 5.1 of [28] to the rooted tree $(T,R)$ with $T$ the path on $b$ edges and $R$ its set of leaves. With this one can check $\rho(T)\geq\frac{b}{b-1}$ (which is also implicitly shown in Conlon [7]), and that $\theta_{a,b}\in\mathcal{T}^{a}$ .

With this all in mind, it only remains to prove $\mathrm{ex}(G_{n,p},\theta_{a,b})=O(p^{1/b}n^{1+1/b})$ when $p\geq n^{-\frac{b-1}{ab-1}}(\log n)^{2b}$ . For this we utilize the following general result showing that balanced supersaturation implies upper bounds on $\mathrm{ex}(G_{n,p},F)$ .

Theorem 6.1.

Let $F$ be a graph and $1<\alpha<2$ a real number satisfying the following: there exist real numbers $C,k_{0}>0$ such that for every $n$ -vertex graph $G$ with $e(G)=kn^{\alpha}$ and $k\geq k_{0}$ , there exists a hypergraph $\mathcal{H}$ on $E(G)$ whose hyperedges are copies of $F$ and is such that $|\mathcal{H}|\geq C^{-1}k^{e(F)}n^{v(F)-(2-\alpha)e(F)}$ , and such that for every $\sigma\subseteq E(G)$ with $1\leq|\sigma|\leq e(F)-1$ , we have

\deg_{\mathcal{H}}(\sigma)\leq\frac{Ck^{e(F)}n^{v(F)-(2-\alpha)e(F)}}{kn^{\alpha}\left(\min\left\{k^{\frac{1}{2-\alpha}},kn^{\alpha-2+\frac{v(F)-2}{e(F)-1}}\right\}\right)^{|\sigma|-1}}.

In this case,

\mathrm{ex}(G_{n,p},F)=O(p^{\alpha-1}n^{\alpha})\hskip 16.99998pt\text{for all \ }p\geq\left(n^{2-\alpha-\frac{v(F)-2}{e(F)-1}}/\log^{2}n\right)^{\frac{1}{\alpha-1}}.

We note that if $F$ is 2-balanced, i.e. if it has $m_{2}(F)=\frac{e(F)-1}{v(F)-2}$ , then the conclusion of Theorem 6.1 is exactly the upper bound predicted by Conjecture 1.2 provided $\mathrm{ex}(n,F)=\Theta(n^{\alpha})$ .

The proof of Theorem 6.1 uses what is by now a fairly routine argument involving hypergraph containers, a powerful technique developed independently by Balogh, Morris and Samotij [1] and Saxton and Thomason [26]. We defer the details to Appendix A.

In any case, by Corollary 5.2, we see that $\theta_{a,b}$ satisfies the conditions of Theorem 6.1 for $a\geq 100$ and $b\geq 3$ , proving the desired upper bound and completing the proof.
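To see why the hypotheses match, one can substitute $F=\theta_{a,b}$ , with $v(F)=a(b-1)+2$ , $e(F)=ab$ and $\alpha=1+1/b$ , into the exponents of the general theorem. The sketch below does this in exact arithmetic; the dictionary keys are illustrative names, and the logarithmic factor in the range of $p$ is deliberately ignored.

```python
from fractions import Fraction as Fr

def theta_parameters(a, b):
    """Vertex count, edge count and Turán exponent alpha = 1 + 1/b for theta_{a,b}."""
    return a * (b - 1) + 2, a * b, 1 + Fr(1, b)

def theorem_exponents(v, e, alpha):
    """Exponents appearing in the general theorem for a graph with v vertices and e edges."""
    return {
        # exponent of n in the lower bound on the number of hyperedges
        "n_exp_in_H": v - (2 - alpha) * e,
        # exponent of k in the first term of the minimum
        "k_exp_in_min": 1 / (2 - alpha),
        # exponent of n in the second term of the minimum
        "n_exp_in_min": alpha - 2 + Fr(v - 2, e - 1),
        # exponent of n in the lower end of the range of p (log factor ignored)
        "p_threshold_n_exp": (2 - alpha - Fr(v - 2, e - 1)) / (alpha - 1),
    }

def specialises_correctly(a, b):
    """Compare with n^2, k^{b/(b-1)}, k n^{(b-1)/(b(ab-1))} and n^{-(b-1)/(ab-1)}."""
    exps = theorem_exponents(*theta_parameters(a, b))
    return (
        exps["n_exp_in_H"] == 2
        and exps["k_exp_in_min"] == Fr(b, b - 1)
        and exps["n_exp_in_min"] == Fr(b - 1, b * (a * b - 1))
        and exps["p_threshold_n_exp"] == -Fr(b - 1, a * b - 1)
    )

specialisation_ok = all(
    specialises_correctly(a, b) for a in range(3, 120) for b in range(3, 12)
)
```

The first three exponents are those of the edge-supersaturation bound above, and the last is the power of $n$ at the start of the dense range of $p$ .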

7 Concluding Remarks

In this paper we established upper bounds for $\mathrm{ex}(G_{n,p},\theta_{a,b})$ which are essentially tight whenever $a$ is sufficiently large in terms of $b$ . It would be of interest if one could extend our ideas to prove effective upper bounds on $\mathrm{ex}(G_{n,p},F)$ for other $F$ . In particular, one might hope to prove upper bounds for powers of rooted trees.

More precisely, given a tree $T$ , a set $R\subseteq V(T)$ , and an integer $a$ , we define $T_{R}^{a}$ to be the graph consisting of $a$ copies of $T$ which agree only on the set $R$ . For example, if $T$ is a path of length $b$ and $R$ consists of its two endpoints, then $T_{R}^{a}=\theta_{a,b}$ , and if $T=K_{s,1}$ and $R$ is its set of leaves, then $T_{R}^{a}=K_{s,a}$ . In particular, the only bipartite graphs for which we know tight bounds for $\mathrm{ex}(G_{n,p},F)$ , namely theta graphs and complete bipartite graphs, are examples of powers of trees.
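As an illustration of this definition, the following sketch builds $T_{R}^{a}$ as an edge set by taking $a$ labelled copies of $T$ and identifying the vertices of $R$ across copies. The function names and the vertex labelling are assumptions made for the example; the path case requires length $b\geq 2$ so that the two roots are not adjacent.

```python
def power_of_rooted_tree(tree_edges, roots, a):
    """Glue a copies of a tree along `roots`; return the edge set of the result.

    Root vertices keep their label; a non-root vertex x in copy c becomes (x, c).
    """
    def label(vertex, copy):
        return vertex if vertex in roots else (vertex, copy)

    edges = set()
    for copy in range(a):
        for x, y in tree_edges:
            edges.add(frozenset((label(x, copy), label(y, copy))))
    return edges

def vertex_set(edges):
    """All endpoints of a non-empty collection of edges."""
    return set().union(*edges)

# Path of length b = 3 rooted at its endpoints: T_R^4 is theta_{4,3}.
path_edges = [(i, i + 1) for i in range(3)]
theta_4_3 = power_of_rooted_tree(path_edges, {0, 3}, 4)

# Star K_{s,1} with s = 3 rooted at its leaves: T_R^5 is K_{3,5}.
star_edges = [("centre", leaf) for leaf in range(3)]
k_3_5 = power_of_rooted_tree(star_edges, {0, 1, 2}, 5)
```

The resulting graphs have the expected sizes: $\theta_{4,3}$ has $4\cdot 2+2=10$ vertices and $12$ edges, and $K_{3,5}$ has $8$ vertices and $15$ edges.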

Question 7.1.

Can one prove essentially tight bounds on $\mathrm{ex}(G_{n,p},T_{R}^{a})$ for other powers of rooted trees?

The best upper bounds for this problem come from the general bounds of Jiang and Longbrake [17], and the best lower bound comes from [28]. We note that analogous to the situation for theta graphs prior to this paper, the lower bound of [28] depends only on the tree $T$ while the upper bound of [17] depends on $a$ , and as such the gaps between these bounds grow large as $a$ increases. Similar to the situation in the present paper, we suspect that the lower bound is closer to the truth, and in particular, Conjecture 1.2 claims that in many cases, the lower bound from [28] should be the correct answer.

Solving Question 7.1 for all rooted trees is likely impossible. Indeed, even the $p=1$ case, namely that of determining the Turán number $\mathrm{ex}(n,T_{R}^{a})$ , is an important open problem of Bukh and Conlon [4] related to the rational exponents conjecture. That being said, there are a number of special cases where this Turán number is known [8, 9, 15, 16, 18], and it might be possible to generalize our ideas to deal with some of these cases in the random setting. A more detailed discussion on this problem can be found in the concluding remarks of [28].

To prove 1.3, we first proved a balanced supersaturation result, Corollary 5.2, which is essentially optimal for $a\geq 100$ . It would be desirable to do this for all $a$ .

Question 7.2.

Can one extend Corollary 5.2 to hold for all $a\geq 3$ ?

Note that the $a=2$ case is already dealt with by Morris and Saxton [20]. Solving this question, in addition to being desirable from a philosophical standpoint, might lead to a simpler proof of Corollary 5.2 which could more easily generalize to solving Question 7.1. The simplest way to resolve this question would be to resolve the following.

Question 7.3.

Can one extend Proposition 2.1 to hold with $|X|\geq\varepsilon m$ ?

An affirmative answer here would not only give an affirmative answer to Question 7.2, but also would allow one to avoid many of the messy technical details in our proof. Namely, with this one can alter the definition of $D_{s,t}$ in such a way that the $D_{t}$ functions are no longer needed, and such that the computations for proving Proposition 5.1 are much simpler.

Acknowledgments. We thank Rob Morris for useful comments about the presentation of this paper.

References

[1] József Balogh, Robert Morris and Wojciech Samotij “Independent sets in hypergraphs” In J. Amer. Math. Soc. 28.3, 2015, pp. 669–709 DOI: 10.1090/S0894-0347-2014-00816-X
[2] J. A. Bondy and M. Simonovits “Cycles of even length in graphs” In J. Combinatorial Theory Ser. B 16, 1974, pp. 97–105 DOI: 10.1016/0095-8956(74)90052-5
[3] Boris Bukh “Extremal graphs without exponentially-small bicliques” In arXiv:2107.04167, 2022
[4] Boris Bukh and David Conlon “Rational exponents in extremal graph theory” In J. Eur. Math. Soc. (JEMS) 20.7, 2018, pp. 1747–1757 DOI: 10.4171/JEMS/798
[5] Maurício Collares Neto and Robert Morris “Maximum-size antichains in random set-systems” In Random Structures Algorithms 49.2, 2016, pp. 308–321 DOI: 10.1002/rsa.20647
[6] D. Conlon and W. T. Gowers “Combinatorial theorems in sparse random sets” In Ann. of Math. (2) 184.2, 2016, pp. 367–454 DOI: 10.4007/annals.2016.184.2.2
[7] David Conlon “Graphs with few paths of prescribed length between any two vertices” In Bull. Lond. Math. Soc. 51.6, 2019, pp. 1015–1021 DOI: 10.1112/blms.12295
[8] David Conlon and Oliver Janzer “Rational exponents near two” In Adv. Comb., 2022, 10 pp.
[9] David Conlon, Oliver Janzer and Joonkyung Lee “More on the extremal number of subdivisions” In Combinatorica 41.4, 2021, pp. 465–494 DOI: 10.1007/s00493-020-4202-1
[10] Jan Corsten and Tuan Tran “Balanced supersaturation for some degenerate hypergraphs” In J. Graph Theory 97.4, 2021, pp. 600–623 DOI: 10.1002/jgt.22674
[11] P. Erdös and A. H. Stone “On the structure of linear graphs” In Bull. Amer. Math. Soc. 52, 1946, pp. 1087–1091 DOI: 10.1090/S0002-9904-1946-08715-7
[12] R. J. Faudree and M. Simonovits “On a class of degenerate extremal graph problems” In Combinatorica 3.1, 1983, pp. 83–93 DOI: 10.1007/BF02579343
[13] Zoltán Füredi “Random Ramsey graphs for the four-cycle” In Discrete Math. 126.1-3, 1994, pp. 407–410 DOI: 10.1016/0012-365X(94)90287-9
[14] Zoltán Füredi and Miklós Simonovits “The history of degenerate (bipartite) extremal graph problems” In Erdös centennial 25, Bolyai Soc. Math. Stud. János Bolyai Math. Soc., Budapest, 2013, pp. 169–264 DOI: 10.1007/978-3-642-39286-3_7
[15] Oliver Janzer “The extremal number of the subdivisions of the complete bipartite graph” In SIAM J. Discrete Math. 34.1, 2020, pp. 241–250 DOI: 10.1137/19M1269798
[16] Tao Jiang, Zilin Jiang and Jie Ma “Negligible obstructions and Turán exponents” In Ann. Appl. Math. 38.3, 2022, pp. 356–384
[17] Tao Jiang and Sean Longbrake “Balanced supersaturation and Turán numbers in random graphs” In arXiv:2208.10572, 2022
[18] Dong Yeap Kang, Jaehoon Kim and Hong Liu “On the rational Turán exponents conjecture” In J. Combin. Theory Ser. B 148, 2021, pp. 149–172 DOI: 10.1016/j.jctb.2020.12.003
[19] T. Kövari, V. T. Sós and P. Turán “On a problem of K. Zarankiewicz” In Colloq. Math. 3, 1954, pp. 50–57 DOI: 10.4064/cm-3-1-50-57
[20] Robert Morris and David Saxton “The number of $C_{2\ell}$ -free graphs” In Adv. Math. 298, 2016, pp. 534–580 DOI: 10.1016/j.aim.2016.05.001
[21] Dhruv Mubayi and Liana Yepremyan “On The Random Turán number of linear cycles” In arXiv:2304.15003, 2023
[22] Jiaxi Nie “Random Turán theorem for expansions of spanning subgraphs of tight trees” In arXiv:2305.04193, 2023
[23] Jiaxi Nie “Turán theorems for even cycles in random hypergraph” In arXiv:2304.14588, 2023
[24] Jiaxi Nie, Sam Spiro and Jacques Verstraëte “Triangle-free subgraphs of hypergraphs” In Graphs Combin. 37.6, 2021, pp. 2555–2570 DOI: 10.1007/s00373-021-02388-5
[25] Vojtěch Rödl and Mathias Schacht “Extremal results in random graphs” In Erdös centennial 25, Bolyai Soc. Math. Stud. János Bolyai Math. Soc., Budapest, 2013, pp. 535–583 DOI: 10.1007/978-3-642-39286-3_20
[26] David Saxton and Andrew Thomason “Hypergraph containers” In Invent. Math. 201.3, 2015, pp. 925–992 DOI: 10.1007/s00222-014-0562-8
[27] Mathias Schacht “Extremal results for random discrete structures” In Ann. of Math. (2) 184.2, 2016, pp. 333–365 DOI: 10.4007/annals.2016.184.2.1
[28] Sam Spiro “Random Polynomial Graphs for Random Turán Problems” In arXiv:2212.08050, 2022
[29] Sam Spiro and Jacques Verstraëte “Relative Turán problems for uniform hypergraphs” In SIAM J. Discrete Math. 35.3, 2021, pp. 2170–2191 DOI: 10.1137/20M1364631

Appendix A Proof of Theorem 6.1

Throughout this section, we say that a graph $F$ is $\alpha$ -good, where $1<\alpha<2$ is a real number, if it satisfies the following balanced supersaturation condition: there exist real numbers $C,k_{0}>0$ such that for every $n$ -vertex graph $G$ with $e(G)=kn^{\alpha}$ and $k\geq k_{0}$ , there exists a hypergraph $\mathcal{H}$ on $E(G)$ , each of whose hyperedges is a copy of $F$ , such that $|\mathcal{H}|\geq C^{-1}k^{e(F)}n^{v(F)-(2-\alpha)e(F)}$ and such that for every $\sigma\subseteq E(G)$ with $1\leq|\sigma|\leq e(F)-1$ , we have

\deg_{\mathcal{H}}(\sigma)\leq\frac{Ck^{e(F)}n^{v(F)-(2-\alpha)e(F)}}{kn^{\alpha}\left(\min\left\{k^{\frac{1}{2-\alpha}},kn^{\alpha-2+\frac{v(F)-2}{e(F)-1}}\right\}\right)^{|\sigma|-1}}.

Here we prove Theorem 6.1, i.e. that if $F$ is $\alpha$ -good, then $\mathrm{ex}(G_{n,p},F)=O(p^{\alpha-1}n^{\alpha})$ w.h.p. for all $p\geq\left(n^{\alpha-2+\frac{v(F)-2}{e(F)-1}}/\log^{2}n\right)^{\frac{-1}{\alpha-1}}$ . We emphasize that our proof is nearly word-for-word the same as that of Morris and Saxton [20]. We make use of the following definition from [26].

Definition 9.

Given an $r$ -uniform hypergraph $\mathcal{H}$ and a real number $\tau$ , define

\delta(\mathcal{H},\tau)\,=\,\frac{1}{e(\mathcal{H})}\,\sum_{j=2}^{r}\,\frac{1}{\tau^{j-1}}\sum_{v\in V(\mathcal{H})}d^{(j)}(v),

where

d^{(j)}(v)\,=\,\max\big{\{}\deg_{\mathcal{H}}(\sigma)\,:\,v\in\sigma\subseteq V(\mathcal{H})\textup{ and }|\sigma|=j\big{\}}

denotes the maximum degree in $\mathcal{H}$ of a $j$ -set containing $v$ .

We remark that we have removed some extraneous constants from the definition in [26], since these do not affect the formulation of the theorem below. We also note that $\delta$ is typically called a codegree function, but we emphasize that this has no relation to the definition of codegree functions that we used throughout our paper.
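As a quick illustration of Definition 9 (a sanity check only, which is not used in what follows), consider the case $r=2$ , so that $\mathcal{H}$ is a simple graph. A pair $\sigma$ has $\deg_{\mathcal{H}}(\sigma)=1$ if $\sigma$ is an edge and $\deg_{\mathcal{H}}(\sigma)=0$ otherwise, so $d^{(2)}(v)=1$ whenever $v$ is not isolated. If $\mathcal{H}$ has no isolated vertices, the definition therefore reduces to

\delta(\mathcal{H},\tau)\,=\,\frac{1}{e(\mathcal{H})}\cdot\frac{1}{\tau}\sum_{v\in V(\mathcal{H})}1\,=\,\frac{v(\mathcal{H})}{\tau\,e(\mathcal{H})},

and so in this case $\delta(\mathcal{H},\tau)\leq\delta$ holds exactly when $\tau\geq v(\mathcal{H})/(\delta e(\mathcal{H}))$ .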

The following container theorem was proved by Balogh, Morris and Samotij [1, Proposition 3.1] and by Saxton and Thomason [26, Theorem 6.2], where here the notation $S^{(\leq t)}$ denotes the collection of all subsets of $S$ of size at most $t$ . (To be precise, Theorem 6.2 in [26] is stated where $T$ is a tuple of vertex sets rather than a single vertex set, but it is straightforward to deduce this form from the methods of [26].)

Theorem A.1.

Let $r\geq 2$ and let $0<\delta<\delta_{0}(r)$ be sufficiently small. Let $\mathcal{H}$ be an $r$ -graph with $N$ vertices, and suppose that $\delta(\mathcal{H},\tau)\leq\delta$ for some $\tau>0$ . Then there exists a collection $\mathcal{C}$ of subsets of $V(\mathcal{H})$ , and a function $f\colon V(\mathcal{H})^{(\leq\tau N/\delta)}\to\mathcal{C}$ such that:

$(a)$

For every independent set $I$ , there exists $T\subset I$ with $|T|\leq\tau N/\delta$ and $I\subset f(T)$ , and
$(b)$

$e\big{(}\mathcal{H}[C]\big{)}\leq\big{(}1-\delta\big{)}e(\mathcal{H})$ for every $C\in\mathcal{C}$ .

We will refer to the collection $\mathcal{C}$ as the containers of $\mathcal{H}$ , since, by $(a)$ , every independent set is contained in some member of $\mathcal{C}$ . The reader should think of $V(\mathcal{H})$ as being the edge set of some underlying graph $G$ , and $E(\mathcal{H})$ as encoding (some subset of) the copies of a graph $F$ in $G$ . Thus every $F$ -free subgraph of $G$ is an independent set of $\mathcal{H}$ .

Let us introduce some notation to simplify the statements which follow. Given a graph $F$ and real number $\alpha$ , let $\mathcal{I}=\mathcal{I}(n)$ denote the collection of $F$ -free graphs with $n$ vertices, and let $\mathcal{G}=\mathcal{G}(n,k)$ denote the collection of all graphs with $n$ vertices and at most $kn^{\alpha}$ edges. By a colored graph, we mean a graph together with an arbitrary labelled partition of its edge set.

Theorem A.2.

If $F$ is $\alpha$ -good, then there exists a constant $C=C(F)$ such that the following holds for all sufficiently large $n,k\in\mathbb{N}$ with $k\leq\left(n^{\alpha-2+\frac{v(F)-2}{e(F)-1}}/\log^{2}n\right)^{\frac{2-\alpha}{\alpha-1}}$ . There exists a collection $\mathcal{S}$ of colored graphs with $n$ vertices and at most $Ck^{-\frac{\alpha-1}{2-\alpha}}\cdot n^{\alpha}$ edges, as well as functions

g\colon\mathcal{I}\to\mathcal{S}\qquad\text{and}\qquad h\colon\mathcal{S}\to\mathcal{G}(n,k)

with the following properties:

(a)

For every $s\geq 0$ , the number of colored graphs in $\mathcal{S}$ with $s$ edges is at most

\bigg{(}\frac{Cn^{\alpha}}{s}\bigg{)}^{\frac{1}{\alpha-1}\cdot s}\cdot\exp\Big{(}Ck^{-\frac{\alpha-1}{2-\alpha}}\cdot n^{\alpha}\Big{)}.

$(b)$

$g(I)\subset I\subset g(I)\cup h(g(I))$ for every $I\in\mathcal{I}$ .

We prove Theorem A.2 by iterating the following container result.

Proposition A.3.

If $F$ is an $\alpha$ -good graph, then there exist $k_{0}\in\mathbb{N}$ and $\varepsilon>0$ such that the following holds for every $k\geq k_{0}$ and every $n\in\mathbb{N}$ . Set

\mu\,=\,\frac{1}{\varepsilon}\cdot\max\Big{\{}k^{-\frac{\alpha-1}{2-\alpha}},\,n^{-\left(\alpha-2+\frac{v(F)-2}{e(F)-1}\right)}\Big{\}}.

(23)

Given a graph $G$ with $n$ vertices and $kn^{\alpha}$ edges, there exists a function $f_{G}$ that maps subgraphs of $G$ to subgraphs of $G$ , such that, for every $F$ -free subgraph $I\subset G$ ,

$(a)$

There exists a subgraph $T=T(I)\subset I$ with $e(T)\leq\mu n^{\alpha}$ and $I\subset f_{G}(T)$ , and
$(b)$

$e\big{(}f_{G}(T(I))\big{)}\leq(1-\varepsilon)e(G)$ .

Proof.

By the definition of $F$ being $\alpha$ -good, there exist real numbers $C,k_{0}>0$ and a hypergraph $\mathcal{H}$ on $E(G)$ , each of whose hyperedges is a copy of $F$ , such that

(i)

$|\mathcal{H}|\geq C^{-1}k^{e(F)}n^{v(F)-(2-\alpha)e(F)}$ , and

(ii)

For every $\sigma\subseteq E(G)$ with $1\leq|\sigma|\leq e(F)-1$ , we have

\deg_{\mathcal{H}}(\sigma)\leq\frac{Ck^{e(F)}n^{v(F)-(2-\alpha)e(F)}}{kn^{\alpha}\left(\min\left\{k^{\frac{1}{2-\alpha}},kn^{\alpha-2+\frac{v(F)-2}{e(F)-1}}\right\}\right)^{|\sigma|-1}}.

Let $\delta:=C^{-1}$ , and without loss of generality we can assume $C$ is sufficiently large so that Theorem A.1 holds with $r=e(F)$ and this choice of $\delta$ . We will show that if

\frac{1}{\tau}=\delta^{4}k\cdot\min\Big{\{}k^{\frac{\alpha-1}{2-\alpha}},\,n^{\alpha-2+\frac{v(F)-2}{e(F)-1}}\Big{\}}=\delta^{4}\cdot\min\Big{\{}k^{\frac{1}{2-\alpha}},\,kn^{\alpha-2+\frac{v(F)-2}{e(F)-1}}\Big{\}},

then it follows from $(i)$ and $(ii)$ that $\delta(\mathcal{H},\tau)\leq\delta$ . Indeed, since $v(\mathcal{H})=e(G)=kn^{\alpha}$ , we have

	$\displaystyle\delta(\mathcal{H},\tau)$	$\displaystyle\,=\,\frac{1}{e(\mathcal{H})}\,\sum_{j=2}^{e(F)}\,\frac{1}{\tau^{j-1}}\cdot\sum_{v\in V(\mathcal{H})}d^{(j)}(v)$
		$\displaystyle\,\leq\,\frac{v(\mathcal{H})}{e(\mathcal{H})}\Bigg{[}\sum_{j=2}^{e(F)}\frac{1}{\tau^{j-1}}\cdot\frac{Ck^{e(F)}n^{v(F)-(2-\alpha)e(F)}}{kn^{\alpha}\left(\min\left\{k^{\frac{1}{2-\alpha}},kn^{\alpha-2+\frac{v(F)-2}{e(F)-1}}\right\}\right)^{j-1}}\Bigg{]}$
		$\displaystyle\,\leq\,\frac{kn^{\alpha}}{C^{-1}k^{e(F)}n^{v(F)-(2-\alpha)e(F)}}\Bigg{[}\sum_{j=2}^{e(F)}\left(\delta^{4}\cdot\min\Big{\{}k^{\frac{1}{2-\alpha}},\,kn^{\alpha-2+\frac{v(F)-2}{e(F)-1}}\Big{\}}\right)^{j-1}\cdot\frac{Ck^{e(F)}n^{v(F)-(2-\alpha)e(F)}}{kn^{\alpha}\left(\min\left\{k^{\frac{1}{2-\alpha}},kn^{\alpha-2+\frac{v(F)-2}{e(F)-1}}\right\}\right)^{j-1}}\Bigg{]}$
		$\displaystyle\,\leq\,\sum_{j=2}^{e(F)}C^{2}\delta^{4(j-1)}\,\leq\,\delta,$

where this last bound holds provided $\delta=C^{-1}$ is sufficiently small, which we can assume to be the case without loss of generality. Thus, applying Theorem A.1 and setting $\varepsilon=\delta^{5}$ , we obtain a collection $\mathcal{C}$ of subgraphs of $G$ and a function $f_{G}$ mapping subgraphs of $G$ to elements of $\mathcal{C}$ so that for every $F$ -free subgraph $I\subset G$ , there exists a subgraph $T=T(I)\subset I$ with

e(T)\ \leq\ \tau N/\delta\ \leq\frac{n^{\alpha}}{\varepsilon}\cdot\max\Big{\{}k^{-\frac{\alpha-1}{2-\alpha}},\,n^{-\left(\alpha-2+\frac{v(F)-2}{e(F)-1}\right)}\Big{\}}\ =\ \mu n^{\alpha}

and $I\subset f_{G}(T)$ , and also

e\big{(}\mathcal{H}[C]\big{)}\leq\big{(}1-\delta\big{)}e(\mathcal{H})\text{ for all }C\in\mathcal{C}.

(24)

It only remains to show that this second condition implies $e(C)\leq(1-\varepsilon)e(G)$ for every $C\in\mathcal{C}$ (notice that the first inequality is about hyperedges and the second is about graph edges). To prove this, for each $C\in\mathcal{C}$ set

\mathcal{D}(C)\,=\,E(\mathcal{H})\setminus E(\mathcal{H}[C])\,=\,\big{\{}e\in E(\mathcal{H})\,:\,v\in e\mbox{ for some }v\in V(\mathcal{H})\setminus C\big{\}},

and recall that $\deg_{\mathcal{H}}(v)\leq e(\mathcal{H})/\big{(}\delta^{2}kn^{\alpha}\big{)}$ for every $v\in V(\mathcal{H})$ , by $(i)$ , $(ii)$ , and $\delta=C^{-1}$ . Therefore,

|\mathcal{D}(C)|\,\leq\,\frac{e(\mathcal{H})}{\delta^{2}kn^{\alpha}}\cdot|E(G)\setminus C|.

On the other hand, we have $|\mathcal{D}(C)|=e(\mathcal{H})-e(\mathcal{H}[C])\geq\delta e(\mathcal{H})$ by condition (24), and so

|E(G)\setminus C|\,\geq\,\delta^{3}kn^{\alpha}\geq\varepsilon e(G),

as required. Hence the proposition follows with $\varepsilon=\delta^{5}$ . ∎
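For the reader's convenience, we spell out the arithmetic behind the bound $\tau N/\delta\leq\mu n^{\alpha}$ used in the proof above. Writing $\beta=\alpha-2+\frac{v(F)-2}{e(F)-1}$ (a shorthand used only in this remark), and using $N=v(\mathcal{H})=kn^{\alpha}$ together with the choice of $\tau$ , we have

\frac{\tau N}{\delta}\,=\,\frac{kn^{\alpha}}{\delta^{5}\min\big\{k^{\frac{1}{2-\alpha}},\,kn^{\beta}\big\}}\,=\,\frac{n^{\alpha}}{\delta^{5}}\cdot\max\Big\{k^{1-\frac{1}{2-\alpha}},\,n^{-\beta}\Big\}\,=\,\frac{n^{\alpha}}{\varepsilon}\cdot\max\Big\{k^{-\frac{\alpha-1}{2-\alpha}},\,n^{-\beta}\Big\},

since $1-\frac{1}{2-\alpha}=-\frac{\alpha-1}{2-\alpha}$ and $\varepsilon=\delta^{5}$ .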

With this in hand, we continue on towards the proof of Theorem A.2. We will need the following straightforward lemma (see, for example, [5, Lemma 4.3]).

Lemma A.4.

Let $M>0$ , $s>0$ and $0<\delta<1$ . If $a_{1},\ldots,a_{m}\in\mathbb{R}$ satisfy $s=\sum_{j}a_{j}$ and $1\leq a_{j}\leq(1-\delta)^{j}M$ for each $j\in[m]$ , then

s\log s\,\leq\,\sum_{j=1}^{m}a_{j}\log a_{j}+O(M).

We can now deduce Theorem A.2.

Proof of Theorem A.2.

We construct the functions $g$ and $h$ and the family $\mathcal{S}$ as follows. Given an $F$ -free graph $I\in\mathcal{I}$ , we repeatedly apply Proposition A.3, first to the complete graph $G_{0}=K_{n}$ , then to the graph $G_{1}=f_{G_{0}}(T_{1})\setminus T_{1}$ , where $T_{1}\subset I$ is the set guaranteed to exist by part $(a)$ , then to the graph $G_{2}=f_{G_{1}}(T_{2})\setminus T_{2}$ , where $T_{2}\subset I\cap G_{1}=I\setminus T_{1}$ , and so on. We continue until we arrive at a graph $G_{m}$ with at most $kn^{\alpha}$ edges, and set

g(I)=(T_{1},\ldots,T_{m})\qquad\text{and}\qquad h\big{(}g(I)\big{)}=G_{m}.

Since $G_{m}$ depends only on the sequence $(T_{1},\ldots,T_{m})$ , the function $h$ is well-defined.

It remains to bound the number of colored graphs in $\mathcal{S}$ with $s$ edges. To do so, it suffices to count the number of choices for the sequence of graphs $(T_{1},\ldots,T_{m})$ with $\sum_{j}e(T_{j})=s$ . For each $j\geq 1$ , define $k(j)$ and $\mu(j)$ as follows:

e\big{(}G_{m-j}\big{)}=k(j)n^{\alpha}\quad\text{and}\quad\mu(j)=\frac{1}{\varepsilon}\cdot\max\Big{\{}k(j)^{-\frac{\alpha-1}{2-\alpha}},\,n^{-\left(\alpha-2+\frac{v(F)-2}{e(F)-1}\right)}\Big{\}},

and note that

k(j)\geq(1-\varepsilon)^{-j+1}k,\qquad T_{j+1}\subset G_{j}\qquad\text{and}\qquad e(T_{m-j})\leq\mu(j)n^{\alpha}.

Thus, fixing $k$ , $\varepsilon$ and $s$ as above, and writing

\mathcal{K}(m)\,=\,\Big{\{}\mathbf{k}=(k(1),\ldots,k(m))\,:\,(1-\varepsilon)^{-j+1}k\leq k(j)\leq n^{2-\alpha}\Big{\}}

for each $m\in\mathbb{N}$ , and

\mathcal{A}(\mathbf{k})\,=\,\Big{\{}\mathbf{a}=(a(1),\ldots,a(m))\,:\,a(j)\leq\mu(j)n^{\alpha}\text{ and }\sum_{j}a(j)=s\Big{\}},

for each $\mathbf{k}\in\mathcal{K}(m)$ , it follows that the number of colored graphs in $\mathcal{S}$ with $s$ edges is at most

\sum_{m=1}^{\infty}\sum_{\mathbf{k}\in\mathcal{K}(m)}\sum_{\mathbf{a}\in\mathcal{A}(\mathbf{k})}\prod_{j=1}^{m}{k(j)n^{\alpha}\choose a(j)}.

Given $m\in\mathbb{N}$ , $\mathbf{k}\in\mathcal{K}(m)$ and $\mathbf{a}\in\mathcal{A}(\mathbf{k})$ , let us partition the product over $j$ according to whether or not $\mu(j)=\frac{1}{\varepsilon}\cdot n^{-\left(\alpha-2+\frac{v(F)-2}{e(F)-1}\right)}$ . Since $\mathcal{K}(m)=\emptyset$ if $m$ is at least some large constant times $\log n$ , the product of the terms for which this is the case is at most

\big{(}n^{2}\big{)}^{\sum_{j}a(j)}\,\leq\,\exp\Big{(}O(1)\cdot n^{2-\frac{v(F)-2}{e(F)-1}}(\log n)^{2}\Big{)}\,\leq\,\exp\Big{(}O(1)\cdot k^{-\frac{\alpha-1}{2-\alpha}}n^{\alpha}\Big{)},

where in the last step we used the fact that $k\leq\left(n^{\alpha-2+\frac{v(F)-2}{e(F)-1}}/\log^{2}n\right)^{\frac{2-\alpha}{\alpha-1}}$ . On the other hand, if $a(j)\leq k(j)^{-\frac{\alpha-1}{2-\alpha}}n^{\alpha}$ , i.e. if $k(j)\leq(n^{\alpha}/a(j))^{\frac{2-\alpha}{\alpha-1}}$ , then

{k(j)n^{\alpha}\choose a(j)}\,\leq\,\bigg{(}\frac{ek(j)n^{\alpha}}{a(j)}\bigg{)}^{a(j)}\,\leq\,\bigg{(}\frac{en^{\alpha}}{a(j)}\bigg{)}^{\frac{1}{\alpha-1}\cdot a(j)},

and hence, by Lemma A.4, the product over the remaining $j$ is at most

\bigg{(}\frac{C^{\prime}n^{\alpha}}{s}\bigg{)}^{\frac{1}{\alpha-1}\cdot s}\cdot\exp\Big{(}C^{\prime}k^{-\frac{\alpha-1}{2-\alpha}}n^{\alpha}\Big{)}

for some $C^{\prime}=C^{\prime}(F)$ . Noting that $\sum_{m=1}^{\infty}\sum_{\mathbf{k}\in\mathcal{K}(m)}|\mathcal{A}(\mathbf{k})|=n^{O(\log n)}$ since $\mathcal{K}(m)=\emptyset$ for $m$ at least some large constant times $\log n$ , the theorem follows. ∎
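We record, as a check, the exponent computation used in the binomial estimate in the proof above. If $k(j)\leq(n^{\alpha}/a(j))^{\frac{2-\alpha}{\alpha-1}}$ , then, since $\frac{2-\alpha}{\alpha-1}+1=\frac{1}{\alpha-1}$ ,

\frac{k(j)n^{\alpha}}{a(j)}\,\leq\,\bigg(\frac{n^{\alpha}}{a(j)}\bigg)^{\frac{2-\alpha}{\alpha-1}}\cdot\frac{n^{\alpha}}{a(j)}\,=\,\bigg(\frac{n^{\alpha}}{a(j)}\bigg)^{\frac{1}{\alpha-1}},

and the factor of $e$ may be moved inside the bracket because $\frac{1}{\alpha-1}>1$ .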

We can now easily deduce Theorem 6.1.

Proof of Theorem 6.1.

Let $F$ be a graph satisfying the hypotheses of the theorem, i.e. a graph which is $\alpha$ -good for some $1<\alpha<2$ . Recall that we wish to show that for $p\geq\left(n^{\alpha-2+\frac{v(F)-2}{e(F)-1}}/\log^{2}n\right)^{\frac{-1}{\alpha-1}}$ , we have $\mathrm{ex}(G_{n,p},F)=O(p^{\alpha-1}n^{\alpha})$ w.h.p. Given such a function $p=p(n)$ , define $k=p^{-(2-\alpha)}$ . Since $k\leq\left(n^{\alpha-2+\frac{v(F)-2}{e(F)-1}}/\log^{2}n\right)^{\frac{2-\alpha}{\alpha-1}}$ , we can apply Theorem A.2 to get functions $g,h$ . Suppose that there exists an $F$ -free subgraph $I\subset G(n,p)$ with $m$ edges, and observe that $g(I)\subset G(n,p)$ , and that $G(n,p)$ contains at least $m-e\big{(}g(I)\big{)}$ elements of $h(g(I))$ . The probability of this event is therefore at most

	$\displaystyle\sum_{S\in\mathcal{S}}{kn^{\alpha}\choose m-e(S)}p^{m}$	$\displaystyle\,\leq\sum_{s=0}^{Ck^{-\frac{\alpha-1}{2-\alpha}}n^{\alpha}}\bigg{(}\frac{Cp^{\alpha-1}n^{\alpha}}{s}\bigg{)}^{\frac{1}{\alpha-1}\cdot s}\exp\Big{(}Ck^{-\frac{\alpha-1}{2-\alpha}}n^{\alpha}\Big{)}\bigg{(}\frac{3pkn^{\alpha}}{m-s}\bigg{)}^{m-s}$
		$\displaystyle\,\leq\,\exp\bigg{[}O(1)\cdot\Big{(}p^{\alpha-1}n^{\alpha}+k^{-\frac{\alpha-1}{2-\alpha}}n^{\alpha}\Big{)}\bigg{]}\bigg{(}\frac{4pkn^{\alpha}}{m}\bigg{)}^{m/2}\to 0$

as $n\to\infty$ , as long as $m$ is a sufficiently large constant times

\max\Big{\{}pkn^{\alpha},\,k^{-\frac{\alpha-1}{2-\alpha}}n^{\alpha}\Big{\}}=p^{\alpha-1}n^{\alpha}.

We conclude that $\mathrm{ex}(G_{n,p},F)=O(p^{\alpha-1}n^{\alpha})$ w.h.p., giving the result. ∎
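To see why the two terms in the maximum in the proof above coincide, recall that $k=p^{-(2-\alpha)}$ . A direct substitution (included here only as a check) gives

pkn^{\alpha}\,=\,p^{1-(2-\alpha)}n^{\alpha}\,=\,p^{\alpha-1}n^{\alpha}\qquad\text{and}\qquad k^{-\frac{\alpha-1}{2-\alpha}}n^{\alpha}\,=\,\big(p^{-(2-\alpha)}\big)^{-\frac{\alpha-1}{2-\alpha}}n^{\alpha}\,=\,p^{\alpha-1}n^{\alpha},

so this choice of $k$ balances the expected number of edges of $G(n,p)$ inside $h(S)$ against the bound on $e(S)$ .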

Appendix B Proof of Proposition 2.1

We emphasize that almost everything in this section will be nearly identical to Morris and Saxton [20]. We first recall the definitions and conventions introduced in Section 2:

•

We fixed a sequence of rapidly decreasing constants

$1\geq\varepsilon_{b}\geq\cdots\geq\varepsilon_{2}\geq\varepsilon_{1}>0$

which depend only on $b$ . We also fixed some $m$ -vertex graph $G$ with minimum degree $\ell m^{1/b}$ with $\ell$ (and hence $m$ ) sufficiently large in terms of the $\varepsilon_{t}$ constants.
•

For $x\in V(G)$ , we say that a tuple $\mathcal{A}=(A_{0},A_{1},\ldots,A_{t})$ of (not necessarily disjoint) subsets of $V(G)$ is a concentrated $t$ -neighborhood of $x$ if $A_{0}=\{x\}$ , $|N(v)\cap A_{i}|\geq\varepsilon_{t}\ell m^{1/b}$ for all $v\in A_{i-1}$ , and

$|A_{t}|\leq\ell^{(b-t)/(b-1)}m^{t/b}.$

We define $t(x)$ to be the minimal $t\geq 2$ such that there exists a concentrated $t$ -neighborhood of $x$ in $G$ .
•

Lemma 2.2 says that for some $2\leq t\leq b$ , there exists $X\subseteq V(G)$ of size at least $\frac{1}{2}(4b)^{t-b}\ell^{(b-t)/(b-1)}m^{t/b}$ such that $t(x)=t$ for every $x\in X$ , and such that for every $x\in X$ there exists a tuple of sets $\mathcal{A}=(A_{0},\ldots,A_{t})$ such that $A_{0}=\{x\}$ , $|A_{t}|\leq\ell^{(b-t)/(b-1)}m^{t/b}$ , $|N(y)\cap A_{i}|\geq\frac{1}{2}\varepsilon_{t}\ell m^{1/b}$ for all $y\in A_{i-1}$ , and every $y\in\bigcup A_{i}$ has $t(y)\geq t$ .

For the rest of this section we fix $t,X$ as in Lemma 2.2. We also fix some $\varepsilon>0$ sufficiently small compared to the $\varepsilon_{s}$ constants, as well as a set of forests $\mathcal{F}$ such that for every path $x_{1}\cdots x_{r}$ of $G$ which does not contain an element of $\mathcal{F}$ as a subgraph, there are at most $\varepsilon\ell m^{1/b}$ vertices $x_{r+1}\in N_{G}(x_{r})$ such that the path $x_{1}\cdots x_{r+1}$ contains an element of $\mathcal{F}$ as a subgraph. As much as possible we use the notation of Morris and Saxton’s original proof, and in particular, we drop our convention from the main part of the text that $u,v,w$ are used only as vertices of $\theta_{a,b}$ .

We introduce some notation and definitions that will be used for the rest of the proof. Given a set of paths $\mathcal{P}$ , we define the $(i,v)$ -branching factor of $\mathcal{P}$ to be the maximum number $d$ such that there exist $d$ paths in $\mathcal{P}$ with $i$ th vertex $v$ and pairwise distinct $(i+1)$ st vertices. The branching factor of $\mathcal{P}$ is defined to be the maximum $(i,v)$ -branching factor amongst all choices of $i,v$ . We define $\mathcal{P}_{i,j}=\{u_{i}\cdots u_{j}:u_{0}\cdots u_{t}\in\mathcal{P}\}$ and $\mathcal{P}[u\to v]:=\{x_{1}\cdots x_{s}\in\mathcal{P}:x_{1}=u,\ x_{s}=v\}$ , and also define $\mathcal{P}[u\to S]=\bigcup_{v\in S}\mathcal{P}[u\to v]$ .

One lemma that we will need in several places is the following.

Lemma B.1.

Let $\mathcal{R}$ be a collection of paths of length $s\geq 2$ in $G$ from a vertex $x\in V(G)$ to a set $B\subseteq V(G)$ . Assume that $|B|\leq\ell^{(b-s)/(b-1)}m^{s/b}$ , $|\mathcal{R}|>s\varepsilon_{s}(\ell m^{1/b})^{s}$ , and that $\mathcal{R}$ has branching factor at most $\ell m^{1/b}$ . Then $t(x)\leq s$ .

Proof.

Form a subset $\mathcal{R}^{\prime}\subseteq\mathcal{R}$ by starting with $\mathcal{R}^{\prime}=\mathcal{R}$ and then iteratively choosing $i,v$ such that the $(i,v)$ -branching factor of $\mathcal{R}^{\prime}$ is less than $\varepsilon_{s}\ell m^{1/b}$ and then deleting any paths which contain $v$ as their $i$ th vertex. Let $A_{i}$ be the set of vertices used as the $i$ th vertex of some path of $\mathcal{R}^{\prime}$ . If $\mathcal{R}^{\prime}$ is non-empty, then $(A_{0},\ldots,A_{s})$ is a concentrated $s$ -neighborhood of $x$ by construction, which shows $t(x)\leq s$ .

Thus it suffices to show that $\mathcal{R}^{\prime}$ is non-empty. We claim that the number of paths that were destroyed is at most $s\cdot\varepsilon_{s}(\ell m^{1/b})^{s}$ . Indeed, because $\mathcal{R}$ has branching factor at most $\ell m^{1/b}$ , every destroyed path can be identified by choosing some index $0\leq i<s$ , starting a path at $u_{0}=x$ , and then iteratively choosing the next vertex of the path $u_{j+1}$ in at most $\ell m^{1/b}$ ways for each $j\neq i$ and in at most $\varepsilon_{s}\ell m^{1/b}$ ways when $j=i$ , proving the claim. Since $|\mathcal{R}|$ is strictly greater than the number of destroyed paths, $\mathcal{R}^{\prime}$ is non-empty and the result follows. ∎
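The count in the proof above can be written out as follows (this merely restates the argument). For a fixed index $0\leq i<s$ , a destroyed path $xu_{1}\cdots u_{s}$ is built in $s$ steps, of which $s-1$ steps offer at most $\ell m^{1/b}$ choices and one step offers at most $\varepsilon_{s}\ell m^{1/b}$ choices. Summing over $i$ gives at most

\sum_{i=0}^{s-1}(\ell m^{1/b})^{s-1}\cdot\varepsilon_{s}\ell m^{1/b}\,=\,s\cdot\varepsilon_{s}(\ell m^{1/b})^{s}

destroyed paths, which is strictly less than $|\mathcal{R}|$ by hypothesis.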

The following definition will almost be strong enough to prove Proposition 2.1.

Definition 10.

Let $\mathcal{A}=(A_{0},\ldots,A_{t})$ be a collection of (not necessarily disjoint) sets of vertices of $G$ with $A_{0}=\{x\}$ and let $\mathcal{P}$ be a collection of paths of the form $xu_{1}\cdots u_{t}$ with $u_{i}\in A_{i}$ for all $i$ . We say that $(\mathcal{A},\mathcal{P})$ is a balanced $t$ -neighborhood of $x$ if the following conditions hold:

(i)

We have $|A_{1}|\leq\ell m^{1/b}$ and $|A_{t}|\leq\ell^{(b-t)/(b-1)}m^{t/b}$ .
(ii)

For every $0\leq i<j\leq t$ with $(i,j)\neq(0,t)$ and every $u\in A_{i},v\in A_{j}$ , we have $|\mathcal{P}_{i,j}[u\to v]|\leq\ell^{(j-i-1)b/(b-1)}$ .
(iii)

The branching factor of $\mathcal{P}$ is at most $\varepsilon_{t}\ell m^{1/b}$ .

For the next lemma we recall that $\mathcal{F}$ is a set of forests satisfying a property that depends on $\varepsilon$ .

Lemma B.2.

If $x\in X$ and $\varepsilon$ is sufficiently small, then there exists a balanced $t$ -neighborhood $(\mathcal{A},\mathcal{P})$ of $x$ with $|\mathcal{P}|\geq\frac{1}{2}(\frac{1}{4}\varepsilon_{t}\ell m^{1/b})^{t}$ such that no subgraph of any $P\in\mathcal{P}$ lies in $\mathcal{F}$ .

Proof.

Let $\mathcal{A}$ be the tuple of sets guaranteed by Lemma 2.2. We may assume that $|A_{1}|\leq\ell m^{1/b}$ , as otherwise we can just remove vertices from $A_{1}$ while maintaining all the properties guaranteed by Lemma 2.2. For each $v\in A_{i-1}$ , let $Q_{i}(v)$ be an arbitrary subset of $N(v)\cap A_{i}$ of size $\frac{1}{2}\varepsilon_{t}\ell m^{1/b}$ . Let $\mathcal{Q}$ be the set of paths $xu_{1}\cdots u_{t}$ generated as follows. Given $xu_{1}\cdots u_{i-1}$ , select any $u_{i}\in Q_{i}(u_{i-1})$ such that $u_{i}\notin\{x,u_{1},\ldots,u_{i-1}\}$ and such that no subgraph of $xu_{1}\cdots u_{i}$ is contained in $\mathcal{F}$ . Note that the number of choices at each step is at least

\frac{1}{2}\varepsilon_{t}\ell m^{1/b}-t-\varepsilon\ell m^{1/b}\geq\frac{1}{4}\varepsilon_{t}\ell m^{1/b},

with the last step holding if $\varepsilon$ is sufficiently small (which also implies $\ell m^{1/b}$ is sufficiently large compared to $\varepsilon_{t}^{-1}t$ ). This means

|\mathcal{Q}|\geq\left(\frac{1}{4}\varepsilon_{t}\ell m^{1/b}\right)^{t}.

(25)

Note that by construction, every path in $\mathcal{Q}$ avoids $\mathcal{F}$ .

We now remove some paths from $\mathcal{Q}$ to produce $\mathcal{P}$ . If there exists $0\leq i<j\leq t$ with $(i,j)\neq(0,t)$ and vertices $u\in A_{i},v\in A_{j}$ and $|\mathcal{Q}_{i,j}[u\to v]|>\ell^{(j-i-1)b/(b-1)}$ , then choose a path $xu_{1}\cdots u_{t}\in\mathcal{Q}$ with $u_{i}=u$ and $u_{j}=v$ and delete this path from $\mathcal{Q}$ . Repeat this until no such paths remain in $\mathcal{Q}$ , and let $\mathcal{P}$ be the resulting set of paths. By construction $(\mathcal{A},\mathcal{P})$ is a balanced neighborhood, so it suffices to show $|\mathcal{P}|$ is large.

We say that a pair of vertices $(u,v)$ is $(i,j)$ -unbalanced if $u\in A_{i},v\in A_{j}$ and $|\mathcal{Q}_{i,j}[u\to v]|>\ell^{(j-i-1)b/(b-1)}$ (we emphasize that this condition involves the original family $\mathcal{Q}$ before any paths are deleted). Let $\mathcal{R}(i,j)=\{xu_{1}\cdots u_{j}\in\mathcal{Q}_{0,j}:(u_{i},u_{j})\textrm{ is }(i,j)\textrm{-unbalanced}\}$ . We claim that

|\mathcal{R}(i,j)|\leq t^{-2}4^{-b-1}(\varepsilon_{t}\ell m^{1/b})^{j}

(26)

for all $0\leq i<j\leq t$ with $(i,j)\neq(0,t)$ . Assuming this is true, this fact together with the branching factor of $\mathcal{Q}$ implies that the number of paths removed is at most

\sum_{i,j}|\mathcal{R}(i,j)|(\varepsilon_{t}\ell m^{1/b})^{t-j}\leq\frac{1}{2}\left(\frac{1}{4}\varepsilon_{t}\ell m^{1/b}\right)^{t},

with this last step holding if $\ell m^{1/b}$ is sufficiently large (which holds if $\varepsilon$ is sufficiently small). From this and (25), we conclude that the remaining set of paths $\mathcal{P}$ has the desired properties and is as large as claimed. It thus remains to prove (26).

Fix $(i,j)\neq(0,t)$ and let $s:=j-i$ . If $s=1$ then $\mathcal{R}(i,j)=\emptyset$ (since no pair of vertices can be $(i,i+1)$ -unbalanced), so we may assume $s\geq 2$ . Observe that

|\mathcal{R}(i,j)|\leq\sum_{u\in A_{i}}|\mathcal{R}(i,j)_{0,i}[x\to u]|\cdot|\mathcal{R}(i,j)_{i,j}[u\to A_{j}]|\leq(\varepsilon_{t}\ell m^{1/b})^{i}\cdot\max_{u\in A_{i}}|\mathcal{R}(i,j)_{i,j}[u\to A_{j}]|,

where the second inequality used $\mathcal{R}(i,j)_{0,i}\subseteq\mathcal{Q}_{0,i}$ which has branching factor at most $\varepsilon_{t}\ell m^{1/b}$ . Thus if we assume for contradiction that (26) does not hold, then there must exist some $u\in A_{i}$ such that

|\mathcal{R}(i,j)_{i,j}[u\to A_{j}]|>t^{-2}4^{-b-1}(\varepsilon_{t}\ell m^{1/b})^{s}\geq s\varepsilon_{s}(\ell m^{1/b})^{s},

(27)

with this last step holding if the $\varepsilon_{s^{\prime}}$ constants decrease sufficiently quickly. Let

B:=\{u_{j}\in A_{j}:\exists xu_{1}\cdots u_{j}\in\mathcal{R}(i,j),\ u_{i}=u\}.

Note that $|\mathcal{R}(i,j)_{i,j}[u\to A_{j}]|\leq(\varepsilon_{t}\ell m^{1/b})^{j-i}$ because $\mathcal{Q}$ has branching factor at most $\varepsilon_{t}\ell m^{1/b}$ , and that each $v\in B$ is the last vertex of more than $\ell^{(j-i-1)b/(b-1)}$ paths of $\mathcal{R}(i,j)_{i,j}[u\to A_{j}]$ (since by definition of $\mathcal{R}(i,j)$ , such a pair $(u,v)$ must be $(i,j)$ -unbalanced). Using $s=j-i$ gives

|B|\leq\frac{(\varepsilon_{t}\ell m^{1/b})^{s}}{\ell^{(s-1)b/(b-1)}}=\varepsilon_{t}^{s}\ell^{(b-s)/(b-1)}m^{s/b}\leq\ell^{(b-s)/(b-1)}m^{s/b}.

With this and (27), we can apply Lemma B.1 to $\mathcal{R}(i,j)_{i,j}[u\to A_{j}]$ to conclude $t(u)\leq s<t(x)$ . This gives a contradiction to $u\in A_{i}$ and the properties of $\mathcal{A}$ guaranteed by Lemma 2.2.

∎
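The equality in the bound on $|B|$ in the proof above rests on the following exponent identity, which we record as a check. With $s=j-i$ , the power of $\ell$ is

s-\frac{(s-1)b}{b-1}\,=\,\frac{s(b-1)-(s-1)b}{b-1}\,=\,\frac{b-s}{b-1},

while the power of $m$ is $s/b$ and the constant is $\varepsilon_{t}^{s}\leq 1$ . This is exactly the size bound on $B$ required by Lemma B.1.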

A key fact about balanced neighborhoods is the following.

Lemma B.3.

Let $(\mathcal{A},\mathcal{P})$ be a balanced $t$ -neighborhood of $x\in V(G)$ . If $\varepsilon$ is sufficiently small, then for any $y\in A_{t}$ and non-empty set of vertices $S\subseteq V(G)\setminus\{x,y\}$ , there are at most $\varepsilon^{-1}\ell^{(t-1-|S|)b/(b-1)}$ paths in $\mathcal{P}[x\to y]$ containing $S$ .

Proof.

Let $S=\{z_{1},\ldots,z_{r}\}$ , and for ease of notation let $z_{0}=x$ and $z_{r+1}=y$ . Given a sequence $0=i_{0}<i_{1}<\cdots<i_{r}<i_{r+1}=t$ , the number of paths $xu_{1}\cdots u_{t-1}y\in\mathcal{P}[x\to y]$ with $u_{i_{j}}=z_{j}$ is at most

\prod_{j=0}^{r}|\mathcal{P}_{i_{j},i_{j+1}}[z_{j}\to z_{j+1}]|\leq\prod_{j=0}^{r}\ell^{(i_{j+1}-i_{j}-1)b/(b-1)}=\ell^{(i_{r+1}-i_{0}-(r+1))b/(b-1)}=\ell^{(t-1-|S|)b/(b-1)}.

Every path containing $S$ can be formed in this way, possibly by reordering the elements of $S$ and by choosing different indices $i_{j}$ . As the number of ways of doing this is some finite number depending only on $b$ , we conclude the result. ∎
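For completeness, the exponent in the display above telescopes as

\sum_{j=0}^{r}(i_{j+1}-i_{j}-1)\,=\,i_{r+1}-i_{0}-(r+1)\,=\,t-1-|S|,

using $i_{0}=0$ , $i_{r+1}=t$ and $r=|S|$ . As for the number of configurations, one crude count (an estimate of ours, not needed in this precise form) is that there are at most $r!$ orderings of $S$ and at most $\binom{t-1}{r}$ choices of the indices $i_{1}<\cdots<i_{r}$ , and $r!\binom{t-1}{r}\leq(t-1)!\leq b!$ , which is at most $\varepsilon^{-1}$ once $\varepsilon$ is sufficiently small.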

We now move on to the last notion of neighborhood that we need for this proof.

Definition 11.

Let $(\mathcal{B},\mathcal{Q})$ be a balanced $t$ -neighborhood of $x$ . We say that $(\mathcal{B},\mathcal{Q})$ is a refined $t$ -neighborhood of $x$ if the following conditions also hold:

1.

For every $i\in\{0,1,\ldots,t-1\}$ and every $u\in B_{i}$ ,

$|N(u)\cap B_{i+1}|\geq t^{-1}4^{-2t}\varepsilon_{t}\ell m^{1/b}.$
2.

For every $v\in B_{t}$ ,

$|N(v)\cap B_{t-1}|\geq 4^{-2t}\varepsilon_{t}^{2}\ell^{b/(b-1)}.$
3.

For every $v\in B_{t}$ ,

$|\mathcal{Q}[x\to v]|\geq 4^{-2t}\varepsilon_{t}^{t}\ell^{(t-1)b/(b-1)}.$

Lemma B.4.

If $(\mathcal{A},\mathcal{P})$ is a balanced $t$ -neighborhood of a vertex $x\in X$ with $|\mathcal{P}|\geq\frac{1}{2}(\frac{1}{4}\varepsilon_{t}\ell m^{1/b})^{t}$ , then there exists a refined $t$ -neighborhood $(\mathcal{B},\mathcal{Q})$ of $x$ with $B_{i}\subseteq A_{i}$ for all $i$ and $\mathcal{Q}\subseteq\mathcal{P}$ such that

|\mathcal{Q}|\geq\frac{1}{4}(\frac{1}{4}\varepsilon_{t}\ell m^{1/b})^{t}.

Proof.

Repeatedly delete vertices using the following three steps until no further vertices can be removed:

Step 1

If there exists $i\in\{1,\ldots,t-1\}$ and $v\in A_{i}$ with

$|N(v)\cap A_{i+1}|<t^{-1}4^{-2t}\varepsilon_{t}\ell m^{1/b},$

then remove $v$ from $A_{i}$ and remove all paths $P=xu_{1}\cdots u_{t}\in\mathcal{P}$ with $u_{i}=v$ .
Step 2

If there exists $v\in A_{t}$ with

$|N(v)\cap A_{t-1}|<4^{-2t}\varepsilon_{t}^{2}\ell^{b/(b-1)},$

then remove $v$ from $A_{t}$ and remove all paths $P=xu_{1}\cdots u_{t}\in\mathcal{P}$ with $u_{t}=v$ .
Step 3

If there exists $v\in A_{t}$ with

$|\mathcal{P}[x\to v]|<4^{-2t}\varepsilon_{t}^{t}\ell^{(t-1)b/(b-1)},$

then remove $v$ from $A_{t}$ and remove all paths $P=xu_{1}\cdots u_{t}\in\mathcal{P}$ with $u_{t}=v$ .

Let $B_{i}\subseteq A_{i}$ and $\mathcal{Q}\subseteq\mathcal{P}$ be the set of vertices and paths that remain at the end of this process and let $\mathcal{B}=(B_{0},\ldots,B_{t})$ . Note that with this, $(\mathcal{B},\mathcal{Q})$ automatically satisfies every condition for a refined $t$ -neighborhood except possibly $|N(x)\cap B_{1}|\geq t^{-1}4^{-2t}\varepsilon_{t}\ell m^{1/b}$ . This will follow from having $\mathcal{Q}$ large, which we prove below by arguing that few paths are destroyed in the process above.

Because $\mathcal{P}$ is a balanced $t$ -neighborhood, its branching factor is at most $\varepsilon_{t}\ell m^{1/b}$ . As such the number of paths removed in Step 1 is at most

t\cdot t^{-1}4^{-2t}\cdot\varepsilon_{t}\ell m^{1/b}\cdot(\varepsilon_{t}\ell m^{1/b})^{t-1}=4^{-t}\left(\frac{1}{4}\varepsilon_{t}\ell m^{1/b}\right)^{t},

(28)

and in Step 3 we remove at most

4^{-2t}\varepsilon_{t}^{t}\ell^{(t-1)b/(b-1)}|A_{t}|\leq 4^{-t}\left(\frac{1}{4}\varepsilon_{t}\ell m^{1/b}\right)^{t},

(29)

where this last step uses the definition of balanced $t$ -neighborhoods.
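We note, as a check, that both (28) and (29) come down to the identity $4^{-2t}=4^{-t}\cdot(\frac{1}{4})^{t}$ . For (29), the remaining verification concerns the power of $\ell$ : using $|A_{t}|\leq\ell^{(b-t)/(b-1)}m^{t/b}$ ,

\frac{(t-1)b}{b-1}+\frac{b-t}{b-1}\,=\,\frac{tb-t}{b-1}\,=\,t,\qquad\text{so}\qquad 4^{-2t}\varepsilon_{t}^{t}\ell^{(t-1)b/(b-1)}|A_{t}|\,\leq\,4^{-2t}(\varepsilon_{t}\ell m^{1/b})^{t}.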

For Step 2, we aim to show that the number of destroyed paths is at most

\frac{1}{8}\left(\frac{1}{4}\varepsilon_{t}\ell m^{1/b}\right)^{t}.

(30)

Let $Z\subseteq A_{t}$ and $\mathcal{P}(Z)$ denote the collection of vertices and paths removed in Step 2, and let

Y=\{u\in A_{t-1}:\mathcal{P}(Z)\textrm{ has }(t-1,u)\textrm{-branching factor at least }4^{-t-2}\varepsilon_{t}\ell m^{1/b}\}.

Note that by the definition of $Y$ and the bound on the branching factor of $\mathcal{P}$ ,

|\{xu_{1}\cdots u_{t}\in\mathcal{P}(Z):u_{t-1}\in A_{t-1}\setminus Y\}|\leq 4^{-t-2}(\varepsilon_{t}\ell m^{1/b})^{t},

(31)

so it remains to show that there are few paths which use a vertex of $Y$ as the second to last vertex. For this, let

W=\{(u,v):u\in Y,\ v\in Z,\exists xw_{1}\cdots w_{t}\in\mathcal{P}(Z),w_{t-1}=u,w_{t}=v\}.

Note first that by definition of $Y$, $|W|\geq|Y|4^{-t-2}\varepsilon_{t}\ell m^{1/b}$. On the other hand, we have that $|W|\leq|Z|4^{-2t}\varepsilon_{t}^{2}\ell^{b/(b-1)}$, since at the time each vertex $v\in Z$ is deleted, $v$ has fewer than $4^{-2t}\varepsilon_{t}^{2}\ell^{b/(b-1)}$ neighbors in $A_{t-1}\supseteq Y$. In total, then, we find

|Y|\leq\frac{|Z|4^{-2t}\varepsilon_{t}^{2}\ell^{b/(b-1)}}{4^{-t-2}\varepsilon_{t}\ell m^{1/b}}\leq 4^{-t+2}\varepsilon_{t}\ell^{(b-t+1)/(b-1)}m^{(t-1)/b},

(32)

where this last step used $|Z|\leq|A_{t}|\leq\ell^{(b-t)/(b-1)}m^{t/b}$ by definition of balanced neighborhoods.
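The exponents in (32) can be checked directly. The sketch below only substitutes the stated bound $|Z|\leq\ell^{(b-t)/(b-1)}m^{t/b}$ and simplifies; it introduces no new assumptions.

```latex
\begin{align*}
\frac{|Z|\,4^{-2t}\varepsilon_{t}^{2}\ell^{b/(b-1)}}{4^{-t-2}\varepsilon_{t}\ell m^{1/b}}
  &\leq 4^{-t+2}\varepsilon_{t}\cdot\ell^{\frac{b-t}{b-1}+\frac{b}{b-1}-1}\cdot m^{\frac{t}{b}-\frac{1}{b}}\\
  &= 4^{-t+2}\varepsilon_{t}\,\ell^{(b-t+1)/(b-1)}\,m^{(t-1)/b}.
\end{align*}
```

Here the exponent of $\ell$ simplifies because $(b-t)+b-(b-1)=b-t+1$.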

Observe that if the number of paths in $\mathcal{P}(Z)$ using a vertex of $Y$ as the second to last vertex is at most $4^{-t-2}\left(\varepsilon_{t}\ell m^{1/b}\right)^{t}$, then (31) implies that the number of paths removed is at most (30), so we may assume this is not the case. Letting $S:=\mathcal{P}(Z)[x\to Y]$, each path counted by $S$ extends to at most $\varepsilon_{t}\ell m^{1/b}$ paths of $\mathcal{P}(Z)$ by the branching factor of $\mathcal{P}$, so this assumption implies $|S|\cdot\varepsilon_{t}\ell m^{1/b}\geq 4^{-t-2}\left(\varepsilon_{t}\ell m^{1/b}\right)^{t}$, or equivalently

|S|\geq\frac{4^{-t-2}(\varepsilon_{t}\ell m^{1/b})^{t}}{\varepsilon_{t}\ell m^{1/b}}>\varepsilon_{t-1}(\ell m^{1/b})^{t-1},

(33)

with the last step holding if the $\varepsilon_{s}$ constants decrease sufficiently quickly. If $t-1\geq 2$, then (32) and (33) together with Lemma B.1 imply $t(x)\leq t-1$, contradicting $x\in X$. If $t=2$ then (32) gives $|Y|\leq 4^{-t-2}\varepsilon_{t}\ell m^{1/b}$, so the fact that $\mathcal{P}$ has branching factor at most $\varepsilon_{t}\ell m^{1/b}$ implies that there are at most $4^{-t-2}(\varepsilon_{t}\ell m^{1/b})^{2}$ paths in $\mathcal{P}$ whose second to last vertex is in $Y$. In either case, this bound together with (31) implies the number of paths removed is at most (30).

As $t\geq 2$, we have $2\cdot 4^{-t}+\frac{1}{8}\leq\frac{1}{4}$, so (28), (29), and (30) imply that the total number of paths destroyed is at most $\frac{1}{4}(\frac{1}{4}\varepsilon_{t}\ell m^{1/b})^{t}$, and hence $\mathcal{Q}$ has the desired size. To prove that $(\mathcal{B},\mathcal{Q})$ is a refined $t$-neighborhood, it remains to show $|N(x)\cap B_{1}|\geq t^{-1}4^{-2t}\varepsilon_{t}\ell m^{1/b}$. Since $\mathcal{Q}\subseteq\mathcal{P}$ has branching factor at most $\varepsilon_{t}\ell m^{1/b}$, we have that $|\mathcal{Q}|\leq|N(x)\cap B_{1}|\cdot(\varepsilon_{t}\ell m^{1/b})^{t-1}$. Our bound on $|\mathcal{Q}|$ then implies

|N(x)\cap B_{1}|\geq\frac{1}{4}(\frac{1}{4}\varepsilon_{t}\ell m^{1/b})^{t}\cdot(\varepsilon_{t}\ell m^{1/b})^{1-t}=4^{-t-1}\varepsilon_{t}\ell m^{1/b}.

This gives the desired bound, proving the result. ∎

Proof of Proposition 2.1.

Let $t,X$ be as in Lemma 2.2. For each $x\in X$ , let $(\mathcal{A},\mathcal{P})$ be the balanced $t$ -neighborhood guaranteed by Lemma B.2 and $(\mathcal{B},\mathcal{Q})$ the refined $t$ -neighborhood guaranteed by Lemma B.4 from $(\mathcal{A},\mathcal{P})$ . Most of the properties of Proposition 2.1 follow immediately from Definitions 10 and 11, as well as Lemmas B.2 and B.3 (with the last lemma using that $(\mathcal{B},\mathcal{Q})$ is in particular a balanced $t$ -neighborhood). The only conditions which are not immediate are the bounds $|B_{t-1}|,|B_{t}|\geq\varepsilon\ell^{(b-t+1)/(b-1)}m^{(t-1)/b}$ . If this bound did not hold for $B_{t-1}$ , then the tuple $(B_{0},B_{1},\ldots,B_{t-1})$ would be a concentrated $(t-1)$ -neighborhood of $x$ (assuming $t^{-1}4^{-2t}\varepsilon_{t}\geq\varepsilon_{t-1}$ ), contradicting every $x\in X$ having $t(x)=t$ .

To prove the bound on $B_{t}$ , we use Lemma B.3 to find

|\mathcal{Q}|=\sum_{u\in B_{1},v\in B_{t}}|\mathcal{Q}[u\to v]|\leq\varepsilon^{-1}\ell^{(t-2)b/(b-1)}\cdot|B_{1}|\cdot|B_{t}|\leq\varepsilon^{-1}\ell^{(t-2)b/(b-1)}\cdot\ell m^{1/b}\cdot|B_{t}|,

where this last step used Definition 10(i). As $|\mathcal{Q}|\geq\varepsilon\ell^{t}m^{t/b}$, this gives $|B_{t}|\geq\varepsilon^{2}\ell^{(b-t+1)/(b-1)}m^{(t-1)/b}$. This is the desired bound with $\varepsilon^{2}$ in place of $\varepsilon$; replacing $\varepsilon$ with $\sqrt{\varepsilon}$ throughout then yields the proposition as stated. ∎
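As a check on the exponents in the final bound on $|B_{t}|$ above, the following sketch divides the lower bound $|\mathcal{Q}|\geq\varepsilon\ell^{t}m^{t/b}$ by the upper bound on $|\mathcal{Q}|/|B_{t}|$ from the preceding display. It uses only quantities already stated in the proof.

```latex
\begin{align*}
|B_{t}|
  &\geq \frac{\varepsilon\,\ell^{t}m^{t/b}}{\varepsilon^{-1}\ell^{(t-2)b/(b-1)}\cdot\ell m^{1/b}}
   = \varepsilon^{2}\,\ell^{\,t-1-\frac{(t-2)b}{b-1}}\,m^{(t-1)/b}\\
  &= \varepsilon^{2}\,\ell^{(b-t+1)/(b-1)}\,m^{(t-1)/b}.
\end{align*}
```

The last equality holds because $(t-1)(b-1)-(t-2)b=b-t+1$.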
