New algorithms for girth and cycle detection
Abstract
Let $G=(V,E)$ be an unweighted undirected graph with $n$ vertices and $m$ edges. Let $g$ be the girth of $G$, that is, the length of a shortest cycle in $G$. We present a randomized algorithm with a running time of that returns a cycle of length at most , where is an integer and , for every graph with .
Our algorithm generalizes an algorithm of Kadria et al. [SODA’22] that computes a cycle of length at most in time. Kadria et al. also presented an algorithm that finds a cycle of length at most in time, where must be an integer. Our algorithm generalizes this algorithm as well, by replacing the integer parameter in the running time exponent with a real-valued parameter , thereby offering greater flexibility in parameter selection and enabling a broader spectrum of combinations between running times and cycle lengths.
We also show that for sparse graphs a better tradeoff is possible, by presenting an time randomized algorithm that returns a cycle of length at most , where is an integer and , for every graph with .
To obtain our algorithms we develop several techniques and introduce a formal definition of hybrid cycle detection algorithms. Both may prove useful in broader contexts, including other cycle detection and approximation problems. Among our techniques is a new cycle searching technique, in which we search for a cycle from a given vertex and possibly all its neighbors in linear time. Using this technique together with more ideas we develop two hybrid algorithms. The first allows us to obtain a -time, -approximation of . The second is used to obtain our -time and -time approximation algorithms.
1 Introduction
Let $G=(V,E)$ be an unweighted undirected graph with $n$ vertices and $m$ edges. A set of vertices in , where , is a cycle of length if and , where . A is a cycle of length at most . The girth of $G$ is the length of a shortest cycle in $G$. The girth of a graph has been studied extensively since the 1970s by researchers from both the graph theory and the algorithms communities.
Itai and Rodeh [6] showed that the girth can be computed in time or in time, where [13], if Fast Matrix Multiplication (FMM) algorithms are used. They also proved that the problem of computing the girth is equivalent to the problem of deciding whether there is a (triangle) in the graph or not.
Interestingly, there is a close connection between the girth problem and the All Pairs Shortest Path (APSP) problem. Vassilevska W. and Williams [12] proved that a truly subcubic time algorithm that computes the girth, without FMM, implies a truly subcubic time algorithm that computes APSP, without FMM. Such an algorithm for APSP would be a major breakthrough. In light of this girth and APSP connection, it is natural to settle for an approximation algorithm for the girth instead of exact computation. An -approximation of (where and ), satisfies . We denote an approximation as an -approximation if and as a -approximation if .
Itai and Rodeh [6] presented a -approximation algorithm that runs in time. Notice that in contrast to the APSP problem, where a running time of is inevitable since the output size is , in the girth problem the output is a single number, thus, there is no natural barrier for sub-quadratic time algorithms. Indeed, Lingas and Lundell [8] presented a -approximation algorithm that runs in time, and Roditty and V. Williams [11] presented a -approximation algorithm that runs in time. Dahlgaard, Knudsen and Stöckel [5] presented two tradeoffs between running time and approximation. One generalizes the algorithms of [8, 11] and computes a cycle of length at most in time. The other computes, whp, a , for any integer , in time.
Kadria et al. [7] significantly improved upon the second algorithm of [5] and presented an algorithm, that for every integer , computes a in time. They also presented an algorithm, that for every , computes a cycle of length at most , in time, for every graph with .
These two algorithms of Kadria et al., as well as a few other approximation algorithms (see, for example, [8], [3], [9]), were obtained using a general framework for girth approximation in which a search is performed over the range of possible values of , using some algorithm that gets as input an integer which is a guess for the value of . In each step of the search, either returns a cycle , where is a non-decreasing function, or determines that . The goal of the search is to find the smallest for which returns a cycle, because for this value we have (and thus ), and algorithm returns a . This cycle is of length at most since and is a non-decreasing function. The two possible outcomes of and its usage in the general girth approximation framework inspired us to formally define the notion of a -hybrid algorithm as follows:
Definition 1.1.
A -hybrid algorithm is an algorithm that either outputs a or determines that .
When , the algorithm is referred to as a -hybrid algorithm. The girth approximation framework described above suggests that a possible approach to developing efficient girth approximation algorithms is to develop efficient -hybrid algorithms.
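To make the framework concrete, the following minimal Python sketch shows how any hybrid routine plugs into the search described above. The function names, the graph representation, and the starting guess of 3 are illustrative assumptions, not the paper's pseudocode.

```python
# Illustrative sketch of the girth-approximation framework (not the paper's pseudocode).
# `hybrid(H, t)` stands for any hybrid algorithm in the sense of Definition 1.1:
# it either returns a cycle (as a list of vertices) whose length is bounded by a
# non-decreasing function of t, or returns None, certifying that the girth exceeds t.
from typing import Callable, Dict, List, Optional

Graph = Dict[int, List[int]]                      # adjacency lists
Hybrid = Callable[[Graph, int], Optional[List[int]]]

def girth_approx(G: Graph, hybrid: Hybrid) -> Optional[List[int]]:
    """Scan the guesses t = 3, 4, ... and return the first cycle the hybrid finds.

    For the smallest successful guess t we have t <= g (every smaller guess was
    certified to be below the girth), so the returned cycle obeys the hybrid's
    length bound evaluated at a value that is at most g."""
    n = len(G)
    for t in range(3, n + 1):                     # g <= n whenever G has a cycle
        cycle = hybrid(G, t)
        if cycle is not None:
            return cycle
    return None                                   # G is acyclic
```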
Kadria et al. [7] designed several algorithms that satisfy the definition of -hybrid algorithms. Their girth approximation algorithms mentioned above were obtained using two different -hybrid algorithms. Additionally, for every , they presented a -hybrid algorithm and a -hybrid algorithm that run in time, and a -hybrid algorithm that runs in time. Therefore, for , and , there is a -hybrid algorithm that runs in time. A natural question is whether these three algorithms are only part of a general tradeoff between the running time, and .
Problem 1.1.
Let and be two integers and let . Is it possible to obtain a -hybrid algorithm that runs in time?
In this paper we present a -hybrid algorithm that runs, whp, in time. This algorithm provides an affirmative answer to Problem 1.1, up to the factor in the running time.
Using our -hybrid algorithm we obtain a generalization of the time algorithm of Kadria et al. [7] that computes a cycle of length at most . Our generalized algorithm runs in time, whp, and returns a cycle of length at most , where is an integer and , for every graph with . We also show that if the graph is sparse then the approximation can be improved. More specifically, we present an algorithm that runs in time, whp, and returns a cycle of length at most , where is an integer and , for every graph with .
Our -time algorithm also generalizes the -time algorithm of Kadria et al. [7], which computes a for every integer . In our algorithm, the integer parameter that appears in the exponent of the running time is replaced by a real-valued parameter . Thus, we introduce many new points on the tradeoff curve between running time and approximation ratio. Specifically, for every integer , up to additional tradeoff points are added (since for every such , when is a multiple of , we get a -time algorithm which computes a ). For example, consider and a graph with girth or . Our algorithm yields two additional points on the tradeoff curve, corresponding to and . For , we compute a in time, and for , we compute a in time. These points lie between the two points on the tradeoff curve given by the algorithm of Kadria et al. [7], which computes either a in time or a in time. See Figure 1 for a comparison.
[Figure 1: a table comparing running-time exponents and cycle-length bounds.]
The tradeoff curve to which we add new points encompasses many known algorithms, including those of Itai and Rodeh [6], Lingas and Lundell [8], and Kadria et al. [7] (and those of Roditty and V. Williams [11] for and when is an integer, and Dahlgaard et al. [5] for some values of and ). Notably, some of these algorithms have resisted improvement for many years. The addition of new points to this curve reinforces the possibility that it captures a fundamental relationship between running time and approximation quality. This, in turn, motivates further investigation into whether a matching lower bound exists for this tradeoff.
The rest of this paper is organized as follows. In Section 2 we provide an overview. Preliminaries are in Section 3. In Section 4, we present a new cycle searching technique that is used by our algorithms. In Section 5 we present a -hybrid algorithm and then use it to obtain a -approximation algorithm for the girth. In Section 6 we generalize the -hybrid algorithm and present a -hybrid algorithm. In Section 7 we use the hybrid algorithm from Section 6 to obtain two more approximation algorithms for the girth.
2 Overview
Among the techniques that we develop to obtain our new algorithms, is a new cycle searching technique that might be of independent interest. Our new technique exploits the property that if is not on a , then for any two neighbors and of , the set of vertices at distance exactly from and that are also at distance from are disjoint (see Figure 2). This allows us to check efficiently for all the neighbors of if they are on a . Using this technique, together with more tools that we develop, we obtain two hybrid algorithms.
The first is a relatively simple -time, -hybrid algorithm. We use this hybrid algorithm in the girth approximation framework described earlier, to obtain an -time, -approximation of the girth, where or . We remark that using an algorithm of [4] it is possible to obtain a -approximation in time. (Footnote 1: [4] showed that a , if it exists, can be found in time, and if not, then a , if it exists, can be found in the same time. Thus, if we run their algorithm with increasing values of we can obtain a -approximation for the girth in time, where or . However, the additional factor might be significant even for small values of .)
The second is the -hybrid algorithm that solves Problem 1.1. Its main component is an -hybrid algorithm that runs in -time, whp, and generalizes the first -time -hybrid algorithm, by introducing an additional parameter . Using we can tradeoff between the running time and the lower bound on and obtain a faster running time at the price of a worse lower bound.
We compare our -hybrid algorithm to algorithm Cycle of Kadria et al. [7], an -time -hybrid algorithm, where , that they used to obtain the -time, -approximation algorithm. (Footnote 2: Cycle runs in time, which can be reduced to time, as shown in [7].) As we show later, the running time of our -hybrid algorithm can be bounded by . Since in our algorithm is not necessarily a multiple of (compared to the of Cycle), our algorithm allows more flexibility, and we achieve many more possible tradeoffs between the running time and the output cycle length. For example, if we consider a multiplicative approximation better than , when the value of is a constant known in advance, our algorithm can return longer cycles that are still shorter than , in a faster running time. See Figure 3 for a comparison. (Footnote 3: [7] also presented an -time, -hybrid algorithm, where are integers. For , this is an -time, -hybrid algorithm, similar to our -hybrid algorithm. However, since , the possible values of are restricted and must satisfy . By choosing and an appropriate , the two algorithms have similar flexibility for a -approximation, but since our algorithm also allows larger values of , we can achieve a faster running time for a -approximation where .)
The flexibility of our algorithm is also demonstrated in Figure 4. For a given constant value of , if our -hybrid algorithm returns a cycle then its length is at most . If we want algorithm Cycle to output a , then is the largest that we can choose, since must be an integer. The running time is . Our algorithm achieves a better running time if is not divisible by . (In Figure 4 we choose .)
Next, we overview our -hybrid algorithm that either finds a or determines that in time. To determine that , we can check for every if is on a . If is on a , then all the vertices and edges of this are at distance at most from . If, for every , the number of edges at distance at most is then using standard techniques we can check for every if is on a in time. However, this is not necessarily the case, and the region at distance at most from some vertices might be dense. To deal with dense regions within the promised running time we develop an iterative sampling procedure (see BfsSample in Section 6), whose goal is to sparsify the graph, or to return a . One component of the iterative sampling procedure is a generalization of our new cycle searching technique mentioned above. In the generalization instead of checking whether a vertex and its neighbors are on a , we check whether all the vertices up to a possibly further distance from are on a , for , and if not we mark them so that they can be removed later.
If the iterative sampling procedure ends without finding a then there are two possibilities. Let . If then it holds that the number of edges at distance at most from every is , whp, as required. If then it holds that the number of edges at distance at most from every is , whp. This does not necessarily imply that the graph is sparse enough for checking whether . In this case, we run another algorithm (see HandleReminder in Section 6) that continues to sparsify the graph until the number of edges at distance at most from every is and checking whether is possible within the required running time of .
3 Preliminaries
Let $G=(V,E)$ be an unweighted undirected graph with $n$ vertices and $m$ edges. Let be a set of vertices and let be the graph obtained from by deleting all the vertices of together with their incident edges. For two graphs and , let be . We say that if and . For convenience, we use both and to say that . For every , let be the length of a shortest path between and in . The girth of is the length of a shortest cycle in . Let be the length of a cycle . For an integer , we denote a cycle of length (at most) by (). (Footnote 4: Both and might not be simple cycles. However, the cycles that our algorithms return are simple.) Let be the edges incident to and the th edge in . Let be the degree of in . Let be the set of neighbors of , namely . For an edge set , let be the endpoints of ’s edges, that is, . Let and . The distance between and is . For every and a real number let be the ball graph of , where and [7]. (Footnote 5: When the graph is clear from the context, we sometimes omit from the notation.)
We now turn to present several essential tools that are required in order to obtain our new algorithms. We first restate an important property of the ball graph .
Lemma 3.1 ([7]).
Let be two integers and let . If is a tree then no vertex in is part of a cycle of length at most in .
We use procedure [7, 8] (see Algorithm 1) that searches for a in the ball graph . We summarize the properties of BallOrCycle in the next lemma.
Lemma 3.2 ([7]).
Let . If the ball graph is not a tree then returns a from . If is a tree then returns . (Footnote 7: If is returned then we assume that is ordered by the distance from , and for every we store with . Thus, given the set , we can find for every in time.) The running time of is .
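As a rough illustration of what a BallOrCycle-style search does, here is a truncated-BFS sketch in Python. It is a simplified stand-in for Algorithm 1, assuming the procedure explores the ball of a given radius around v and stops at the first non-tree edge; the exact procedure of [7, 8] and its precise cycle-length guarantee may differ in details.

```python
# Simplified stand-in for a BallOrCycle-style search (assumed behaviour, not Algorithm 1).
from collections import deque
from typing import Dict, List, Optional, Union

Graph = Dict[int, List[int]]

def ball_or_cycle(G: Graph, v: int, r: int) -> Union[List[int], Dict[int, int]]:
    """Truncated BFS of radius r from v.

    If a non-tree edge is met among the explored vertices, return the cycle it
    closes.  Otherwise return the ball as a dict mapping each vertex within
    distance r of v to its BFS distance from v."""
    dist: Dict[int, int] = {v: 0}
    parent: Dict[int, Optional[int]] = {v: None}
    queue = deque([v])
    while queue:
        x = queue.popleft()
        if dist[x] == r:
            continue                               # do not expand beyond radius r
        for y in G.get(x, []):
            if y not in dist:
                dist[y] = dist[x] + 1
                parent[y] = x
                queue.append(y)
            elif y != parent[x]:                   # non-tree edge: the ball is not a tree
                return _close_cycle(parent, x, y)
    return dist                                    # the ball graph is a tree

def _close_cycle(parent: Dict[int, Optional[int]], x: int, y: int) -> List[int]:
    """Combine the BFS-tree paths of x and y up to their lowest common ancestor."""
    up_x = [x]
    index = {x: 0}
    while parent[up_x[-1]] is not None:
        up_x.append(parent[up_x[-1]])
        index[up_x[-1]] = len(up_x) - 1
    up_y = [y]
    while up_y[-1] not in index:
        up_y.append(parent[up_y[-1]])
    lca = up_y[-1]
    return up_x[:index[lca]] + [lca] + up_y[:-1][::-1]  # closed by the non-tree edge (y, x)
```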
Next, we obtain a simple -hybrid algorithm, called AllVtxBallOrCycle, using BallOrCycle. AllVtxBallOrCycle (see Algorithm 2) gets a graph and an integer , and runs from every as long as no cycle is found by BallOrCycle. If BallOrCycle finds a cycle then AllVtxBallOrCycle stops and returns that cycle. If no cycle is found then AllVtxBallOrCycle returns null. We prove the next Lemma.
Lemma 3.3.
either finds a or determines that , in time.
Proof.
By Lemma 3.2, if returns a cycle then . Also by Lemma 3.2, if does not return a cycle then the ball graph is a tree, and specifically is not part of a in . Therefore, if no cycle was found during any of the calls then all the vertices in are not part of a in . Hence, does not contain a , and we get that . By Lemma 3.2, the running time of is , which is in total. ∎
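Under the same assumptions, the AllVtxBallOrCycle loop just analyzed can be sketched in a few lines, reusing the ball_or_cycle sketch above (illustrative, not Algorithm 2 verbatim).

```python
from typing import Dict, List, Optional

Graph = Dict[int, List[int]]

def all_vtx_ball_or_cycle(G: Graph, r: int) -> Optional[List[int]]:
    """Run the ball_or_cycle sketch from every vertex and stop at the first cycle.

    If no call finds a cycle, every ball of radius r is a tree, which is the
    certificate used in Lemma 3.3 to conclude that no short cycle exists."""
    for v in G:
        result = ball_or_cycle(G, v, r)
        if isinstance(result, list):              # a cycle was found
            return result
    return None                                   # every ball is a tree
```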
We now show that if the input graph satisfies a certain sparsity property then the running time of can be bounded as follows.
Corollary 3.1.
If for every then runs in time.
Proof.
For every , we have , which is at most . Thus, by Lemma 3.3, the running time of is . ∎
Next, we present procedure from [7]. IsDense (see Algorithm 3) gets a graph , a vertex , a budget (real) and a distance (integer). In the procedure a BFS is executed from . The BFS counts the edges that are scanned as long as their total number is less than and the farthest vertex from is at distance at most .
Lemma 3.4 ([7]).
Procedure runs in time. If then . If then .
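The following budgeted-BFS sketch captures the behaviour stated in Lemma 3.4, under the assumption that IsDense stops as soon as the edge budget is exhausted or the distance limit is reached; the threshold test and the edge-counting convention are illustrative.

```python
# Budgeted BFS in the spirit of IsDense (assumed behaviour, not Algorithm 3).
from collections import deque
from typing import Dict, List

Graph = Dict[int, List[int]]

def is_dense(G: Graph, v: int, budget: int, radius: int) -> bool:
    """Return True if at least `budget` edge scans occur within distance `radius`
    of v; never scans more than `budget` edges, so the cost is O(budget)."""
    dist = {v: 0}
    queue = deque([v])
    scanned = 0
    while queue:
        x = queue.popleft()
        if dist[x] == radius:
            continue                               # stay inside the ball of radius `radius`
        for y in G.get(x, []):
            scanned += 1                           # edges inside the ball are counted (possibly twice)
            if scanned >= budget:
                return True                        # the ball around v is dense
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return False                                   # the ball holds fewer than `budget` edge scans
```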
Given a vertex and a distance we sometimes want to bound . Therefore, we adapt a lemma and a corollary of [7] from vertices to edges.
Lemma 3.5.
Let be positive integers, let be a real number, and let . If , and for every , then .
Proof.
Let and assume that . We also know that , for every . We denote . If then as required. Now assume that . Since , . Therefore, . As the ball graph is connected, we know that and since , . Thus, we get that so . ∎
Using Lemma 3.5, we prove the following corollary.
Corollary 3.2.
Let be a positive integer and let be a real number. If for every , then , for every and .
Proof.
The proof is by induction on . For it follows from our assumption that . Assume now that the claim holds for . This implies that , for every . Combining this with the fact that , for every , by Lemma 3.5 we get that , for every and . ∎
We also adapt procedure of [7] to our needs. SparseOrCycle (see Algorithm 4) gets a graph , a parameter , and two integers , and iterates over vertices using a for-each loop. Let be the vertex currently considered and the current graph. If then is called. If BallOrCycle returns a cycle then is returned by SparseOrCycle. Otherwise, the vertex set is removed from along with the edge set . After the loop ends, if no cycle was found, we return null. Let be the set of vertices for which BallOrCycle was called and no cycle was found, and the graph after SparseOrCycle ends.
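The pass performed by SparseOrCycle can be sketched as follows, reusing the is_dense and ball_or_cycle sketches above. The budget and radius parameters and the removal of the whole (tree) ball are illustrative assumptions about Algorithm 4, not its exact pseudocode.

```python
from typing import Dict, List, Optional

Graph = Dict[int, List[int]]

def remove_vertices(G: Graph, S: List[int]) -> None:
    """Delete the vertices of S and their incident edges from G, in place."""
    drop = set(S)
    for u in drop:
        G.pop(u, None)
    for u in G:
        G[u] = [w for w in G[u] if w not in drop]

def sparse_or_cycle(G: Graph, budget: int, radius: int) -> Optional[List[int]]:
    """One sparsification pass: for every surviving vertex whose ball is dense,
    either report a cycle found inside the ball or delete the (tree) ball."""
    for v in list(G):
        if v not in G:
            continue                               # already removed with an earlier ball
        if is_dense(G, v, budget, radius):
            result = ball_or_cycle(G, v, radius)
            if isinstance(result, list):
                return result                      # cycle returned by the inner search
            remove_vertices(G, list(result))       # the ball is a tree: safe to delete
    return None
```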
The following lemma is similar to the corresponding lemmas from [7], and the proof is omitted.
Lemma 3.6.
satisfies the following:
-
(i)
If a cycle is returned then
-
(ii)
If a cycle is not returned then , for every
-
(iii)
If then is not part of a in
-
(iv)
runs in time.
Similarly to AllVtxBallOrCycle, we show for SparseOrCycle that if satisfies a certain sparsity property, the running time can be bounded as follows.
Corollary 3.3.
If for every vertex then runs in time.
Proof.
By Lemma 3.6, runs in time. For every , the call to returned Yes, so it follows from Lemma 3.4 that . The edge set is removed while removing . Therefore, for every we remove at least edges. Since at most edges can be removed, the size of is at most . By our assumption, we have . Therefore, we get that . Thus, the running time of SparseOrCycle is . ∎
Lemma 3.7.
It is possible to obtain in time, using sampling, a set of edges of size , that hits, whp, the closest edges of every .
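Lemma 3.7 is the standard hitting-set sampling argument. A sketch, with an illustrative constant and seed, is given below: sampling roughly (m/x)·log n edges uniformly hits, whp, every fixed set of x edges, and in particular the x closest edges of every vertex.

```python
import math
import random
from typing import Dict, List, Set, Tuple

Graph = Dict[int, List[int]]

def sample_hitting_edges(G: Graph, x: int, c: float = 3.0, seed: int = 0) -> Set[Tuple[int, int]]:
    """Sample about c * (m / x) * ln n edges uniformly at random (without replacement).

    For any fixed set of x edges, the probability that the sample misses all of
    them is at most (1 - x/m)^{sample size} <= n^{-c}; a union bound over the
    vertices gives the 'hits the x closest edges of every vertex' guarantee whp."""
    rng = random.Random(seed)
    edges = [(u, w) for u in G for w in G[u] if u < w]     # each undirected edge once
    n, m = len(G), len(edges)
    if m == 0:
        return set()
    size = min(m, math.ceil(c * (m / max(x, 1)) * math.log(max(n, 2))))
    return set(rng.sample(edges, size))
```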
We remark that some of our algorithms get a graph that is being updated during their run. Within their scope, denotes the current graph that includes all updates done so far.
4 A new cycle searching technique
Consider a vertex . It is straightforward to check whether is on a , for every integer , using in time. If does not return a then for every it holds that , where , as otherwise there would be a passing through and would have returned a . We show that it is possible to exploit this property to check for every whether is on a , using , in time instead of . More specifically, we present algorithm (see Algorithm 5) that gets a graph , a vertex , and an integer . We first initialize to . Then, we run . If a cycle is found by then is returned by NbrBallOrCycle. Otherwise, we add the vertex to , keep the neighbors of in , and then remove from . Recall that equals . Next, for every we run , as long as a cycle is not found. If a cycle is returned by then is returned by NbrBallOrCycle. (Footnote 8: For our needs it suffices to stop and return a cycle passing through a neighbor once we find one, though BallOrCycle can be run from all the neighbors of in the same running time bound of .) Otherwise, is added to . After the loop ends, the vertex and its adjacent edges are added back to the graph, and the set is returned by NbrBallOrCycle. We prove the following lemma.
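The following sketch, built on the ball_or_cycle sketch above, illustrates the structure of NbrBallOrCycle: search a ball around v, then temporarily delete v and search a (one smaller) ball around each neighbour. The radii and the exact contents of the returned set are illustrative assumptions, not Algorithm 5 itself.

```python
from typing import Dict, List, Set, Union

Graph = Dict[int, List[int]]

def nbr_ball_or_cycle(G: Graph, v: int, r: int) -> Union[List[int], Set[int]]:
    """Either return a short cycle, or return a set of vertices (v and its
    neighbours, in this sketch) certified not to lie on a short cycle."""
    result = ball_or_cycle(G, v, r)
    if isinstance(result, list):
        return result                              # cycle through the ball of v
    neighbours = list(G.get(v, []))
    saved = G.pop(v)                               # temporarily delete v ...
    for u in neighbours:
        G[u] = [w for w in G[u] if w != v]         # ... together with its incident edges
    certified: Set[int] = {v}
    try:
        for u in neighbours:
            res_u = ball_or_cycle(G, u, r - 1)     # disjointness makes these searches cheap overall
            if isinstance(res_u, list):
                return res_u                       # cycle through a neighbour's ball
            certified.add(u)
        return certified
    finally:
        G[v] = saved                               # add v and its edges back in every case
        for u in neighbours:
            G[u].append(v)
```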
Lemma 4.1.
If algorithm finds a cycle then . Otherwise, no vertex in is part of a in , and the set is returned.
Proof.
If a cycle was found, it happened during the call to or one of the calls to . Therefore, by Lemma 3.2, the cycle length is at most .
If no cycle was found during the run of then is not on a in . In addition, for every did not find a cycle. Hence, is not on a in and therefore also in , since and is not on a . Since no cycle was found, and the vertices for every are added to , so by the definition of we have . ∎
To bound the running time of NbrBallOrCycle, we show how to use the fact that no was found by , to efficiently run for every .
Lemma 4.2.
Let . If the ball graph is a tree then the total cost of running for every is .
Proof.
By the definitions of , and , we know that and . Since contains no cycles, it follows that and , for any two distinct vertices . Therefore, (i) , and (ii) . From Lemma 3.2 it follows that the total cost of the calls to for every is . It holds that , for every . Thus, we get that the total cost is . This equals , and it follows from (i) and (ii) that this is at most . ∎
We use Lemma 4.2 to bound the running time of NbrBallOrCycle.
Lemma 4.3.
Algorithm NbrBallOrCycle runs in time.
Proof.
Running costs . If a cycle is found, it is returned and the running time is . If no cycle is found, then removing (and later adding back) and its edges costs . By Lemma 4.2, the cost of running for every is . Adding and to takes time. Thus, the total running time of NbrBallOrCycle is . ∎
5 A -hybrid algorithm and a -approximation of the girth
In this section we first show how to use algorithm NbrBallOrCycle from the previous section to obtain a -hybrid algorithm that in time, either returns a or determines that . Then, we use the -hybrid algorithm to compute a -approximation of .
5.1 A -hybrid algorithm
We first present algorithm that gets a graph and an integer . Let () be before (after) running -SparseOrCycle. either finds a or removes vertices that are not on a , such that for every , the ball graph is relatively sparse, that is, . -SparseOrCycle (see Algorithm 6) iterates over vertices using a for-each loop. Let be the vertex currently considered. If then is called. If NbrBallOrCycle returns a cycle then -SparseOrCycle returns . If NbrBallOrCycle returns a vertex set then is removed from . After the loop ends, if no cycle was found, we return null.
Remark. Notice that either finds a or removes vertices that are not on a , such that for every it holds that . Using NbrBallOrCycle instead of BallOrCycle in -SparseOrCycle enables us in the case that a cycle is found to bound the cycle length with rather than , while still maintaining the property that , for every , in the case that no cycle is found.
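The k-SparseOrCycle pass has the same skeleton as the sparse_or_cycle sketch above, with NbrBallOrCycle in place of BallOrCycle. Since the exact density test is not reproduced here, it is passed in as a predicate; this is an illustrative sketch, not Algorithm 6.

```python
from typing import Callable, Dict, List, Optional

Graph = Dict[int, List[int]]

def k_sparse_or_cycle(G: Graph, r: int,
                      is_heavy: Callable[[Graph, int], bool]) -> Optional[List[int]]:
    """For every surviving vertex that passes the density test, either report the
    cycle found by the neighbourhood search or delete the certified set of vertices."""
    for v in list(G):
        if v not in G:
            continue                               # removed with an earlier certified set
        if is_heavy(G, v):
            result = nbr_ball_or_cycle(G, v, r)
            if isinstance(result, list):
                return result                      # a short cycle was found
            remove_vertices(G, list(result))       # vertices not on a short cycle
    return None
```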
We prove the following lemma.
Lemma 5.1.
satisfies the following:
-
(i)
If a cycle is returned then
-
(ii)
If a cycle is not returned then , for every
-
(iii)
If then is not part of a in
-
(iv)
runs in time.
Proof.
-
(i)
Since -SparseOrCycle returns a cycle only if a call to returns a cycle , it follows from Lemma 4.1 that .
-
(ii)
Let . Since was not removed, was considered in the for-each loop at some stage during the execution of -SparseOrCycle. At this stage , as otherwise, since no cycle was returned, by Lemma 4.1 the call to NbrBallOrCycle with would have returned , so would have been removed while removing . Since we have . As edges can only be removed during the run of -SparseOrCycle, we have also in .
-
(iii)
Since it follows that there was a vertex such that after a call to did not return a cycle. By Lemma 4.1, no vertex in is part of a in . Therefore, is not part of a in . Since during the run of -SparseOrCycle we remove only vertices that are not part of a , is not part of a also in .
-
(iv)
Computing takes time, as all the degrees can be computed in advance in time. We compute this value for at most distinct vertices so the running time of this part is at most in total. By Lemma 4.1, running NbrBallOrCycle takes time. Each edge in contributes at most to the sum , so and . If a call to did not return a cycle, then by Lemma 4.1, the set is returned. -SparseOrCycle removes the set and by doing so, the edge set is also removed. We charge each edge of with . Thus the total cost that we charge for is , which covers the cost of .
Since each edge can be charged and removed from at most once during the execution of -SparseOrCycle, the running time of -SparseOrCycle is at most . ∎
Next, we use -SparseOrCycle to design a -hybrid algorithm called -Hybrid. Notice first that if for every , then it is straightforward to obtain an -time -hybrid algorithm, by running . Thus, in -Hybrid we ensure that if we call AllVtxBallOrCycle then it holds for every that . To do so, we run -SparseOrCycle and possibly SparseOrCycle. If no cycle was returned then it holds that for every , and we can safely run AllVtxBallOrCycle.
-Hybrid (see Algorithm 7) gets a graph and an integer . -Hybrid is composed of three stages. In the first stage we call . If -SparseOrCycle returns a cycle then -Hybrid stops and returns , otherwise we proceed to the second stage. In the second stage, if , we call . If SparseOrCycle returns a cycle then -Hybrid stops and returns , otherwise we proceed to the last stage. In the last stage, we call . If AllVtxBallOrCycle returns a cycle then -Hybrid stops and returns , otherwise -Hybrid returns null. In the next lemma we prove the correctness and analyze the running time of .
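Before turning to the lemma, here is how the three stages compose, reusing the earlier sketches. The guard on the second stage and the choice of budget are placeholders for the conditions in Algorithm 7, which depend on parameters not reproduced here.

```python
from typing import Callable, Dict, List, Optional

Graph = Dict[int, List[int]]

def hybrid_sketch(G: Graph, r: int, budget: Optional[int],
                  is_heavy: Callable[[Graph, int], bool]) -> Optional[List[int]]:
    """Stage 1 sparsifies via the neighbourhood search, stage 2 (only when the
    guard applies) sparsifies further, and stage 3 runs the all-vertices search,
    whose failure certifies that the graph has no short cycle."""
    cycle = k_sparse_or_cycle(G, r, is_heavy)      # stage 1
    if cycle is not None:
        return cycle
    if budget is not None:                         # stage 2: needed only in the dense case
        cycle = sparse_or_cycle(G, budget, r)
        if cycle is not None:
            return cycle
    return all_vtx_ball_or_cycle(G, r)             # stage 3
```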
Lemma 5.2.
either returns a or determines that , in time.
Proof.
First, returns a cycle only if , or returns a cycle . Therefore, if a cycle is returned then it follows from Lemmas 5.1, 3.6, or 3.3, respectively, that .
Second, we show that if no cycle was found then . If no cycle was found then it might be that some vertices were removed from the graph. A vertex can be removed either by SparseOrCycle or by -SparseOrCycle. It follows from Lemma 3.6 and Lemma 5.1, that is not part of a when is removed. Since only vertices that are not on a are removed, every that was in the input graph also belongs to the updated graph. We then call with the updated graph. Since we are in the case that no cycle was found, AllVtxBallOrCycle did not return a cycle. It follows from Lemma 3.3 that in the updated graph, and therefore also in the input graph.
Now we turn to analyze the running time of -Hybrid. At the beginning, -Hybrid calls -SparseOrCycle. By Lemma 5.1, -SparseOrCycle runs in time. Let be the graph after the call to -SparseOrCycle. By Lemma 5.1, for every we have .
Next, -Hybrid checks if . We divide the rest of the proof into the case that and the case that . If then -Hybrid calls AllVtxBallOrCycle. Since and since for every we have , it follows from Corollary 3.2 that for every . By Corollary 3.1, the running time of AllVtxBallOrCycle is .
We now turn to the case that . In this case it might be that for some vertices . Therefore, we first call . Notice that since we are in the case that it holds that . Moreover, for every . Thus, it follows from Corollary 3.2 that for every . By Corollary 3.3 if for every then the running time of is .
Let be the graph after SparseOrCycle ends. By Lemma 3.6, for every we have . It follows from Corollary 3.2 that . Now -Hybrid calls AllVtxBallOrCycle, and using Corollary 3.1 again, we get that the running time of AllVtxBallOrCycle is .
It follows from the above discussion that -Hybrid either returns a or determines that , and the running time is . ∎
5.2 A -approximation of the girth
Next, we describe algorithm AdtvGirthApprox, which uses -Hybrid and the framework described in Section 1, to obtain a -approximation of , when . AdtvGirthApprox (see Algorithm 8) gets a graph . In AdtvGirthApprox, we set to and start a while loop. In each iteration, we create a copy of and call . If -Hybrid finds a cycle then AdtvGirthApprox stops and returns , otherwise we increment by and continue to the next iteration. We prove the following theorem.
Theorem 5.1.
Algorithm returns either a or a , and runs in time, where or and . (Footnote 9: We note that when , the -time, -approximation of Itai and Rodeh [6] can be used.)
Proof.
We first prove the bound on the approximation. AdtvGirthApprox always returns a cycle in , so it cannot return a . AdtvGirthApprox starts with . It follows from Lemma 5.2 that, as long as , the calls to -Hybrid do not return a cycle, since the graph does not contain a . Consider now the iteration in which . AdtvGirthApprox calls where . It follows from Lemma 5.2 that -Hybrid either returns a or determines that . Since we assume that , -Hybrid returns a , which is either a or a , since or .
We now turn to analyze the running time. Creating a copy of takes time, and by Lemma 5.2 the running time of is . Therefore, for every , the running time of the iteration of the while loop with this value of is . From the previous part of this proof it follows that the last iteration of the while loop is when , thus, the running time of AdtvGirthApprox is . Therefore, when , the running time is . ∎
6 A general hybrid algorithm
Algorithm , presented in the previous section, either returns a or determines that , in time. In this section we introduce an additional parameter and present a -hybrid algorithm that either returns a or determines that , in time. In Section 7 we use the -hybrid algorithm to present two tradeoffs for girth approximation.
To obtain the -hybrid algorithm we first extend algorithm NbrBallOrCycle. Then, we use the extended NbrBallOrCycle together with additional tools that we develop to either return a or sparsify dense regions of the graph, so that we can check whether (or return a ) in time, by running .
6.1 Extending NbrBallOrCycle
In algorithm NbrBallOrCycle we mark vertices that can be removed from the graph, by using the property that if did not return a then is not on a . In [7], they introduced an additional parameter and used the following extended version of this property: If did not return a then no vertex of is on a . We use the same approach and modify NbrBallOrCycle to get an additional integer parameter such that . After each call to , where , if no cycle was found we add , instead of , to . The modified pseudo-code appears in Algorithm 9. We rephrase Lemma 4.1 to suit this modification.
Lemma 6.1.
Let . If finds a cycle then . Otherwise, no vertex in is part of a in , and the set is returned.
Proof.
The proof that if a cycle is returned then is as in Lemma 4.1. It is left to show that if no cycle is found then no vertex in is part of a in , and the set is returned.
From Lemma 3.1 it follows that if did not return a cycle then no vertex in is part of a in , and therefore also in , since and since itself is not on a as otherwise would not have been called. Thus, if no cycle was found, we have , as in this case contains and for each , which equals .
For the running time, we note that the sets are computed during the execution of , for every for which no cycle was found. Their total size is also , and we can obtain from them the sets and add these sets to in time. Therefore, the modified NbrBallOrCycle also runs in time.
6.2 A -hybrid algorithm
In this section we present a -hybrid algorithm called ShortCycle, where . ShortCycle (see Algorithm 10) gets a graph and two integers . If then we run algorithm (see Algorithm 11), which is based on algorithm DegenerateOrCycle of [7]. If then the main challenge is when . In this case we run algorithm (described later). The cases that or are relatively simple and treated in algorithm (see Section 6.2.1). We summarize the properties of in the next theorem.
Theorem 6.1.
Let be integers. runs whp in time and either returns a , or determines that .
The next corollary follows from Theorem 6.1, when .
Corollary 6.1.
Let . Algorithm runs whp in time and either returns a , or determines that .
In the rest of this section, we present the proof of Theorem 6.1. As follows from [7], if then returns in time a . We now consider the case in which . We prove in Section 6.2.1 that satisfies the claim of Theorem 6.1, when or .
Our main technical contribution is algorithm ShortCycleSparse that handles the case of . Notice that if for every , then is a -hybrid algorithm that either finds a (which is also a as ) or determines that , in time. Thus, in ShortCycleSparse we ensure that if we call AllVtxBallOrCycle, the property that , for every (whp), holds. To do so, we run BfsSample and possibly HandleReminder. If no cycle was returned, the property holds, and we can safely run AllVtxBallOrCycle.
ShortCycleSparse (see Algorithm 12) gets a graph and two integers such that , and is composed of three stages. In the first stage we call (described later). If BfsSample returns a cycle then ShortCycleSparse stops and returns , otherwise we proceed to the second stage. In the second stage, if , we call (also described later). If HandleReminder returns a cycle then ShortCycleSparse stops and returns , otherwise we proceed to the last stage. In the last stage, we call . If AllVtxBallOrCycle returns a cycle then ShortCycleSparse stops and returns , otherwise ShortCycleSparse returns null.
Next, we give a high level description of BfsSample. The goal of BfsSample is to either sparsify the graph without removing any , or to report a . For simplicity assume that . In such a case, if BfsSample does not report a , then the graph after BfsSample ends contains all the s that were in the original graph, and satisfies, whp, the following sparsity property: For every it holds that .
This implies that in BfsSample we need to find every which is in a dense region with , and to check if is in a , so that if not we can remove . Finding every such is possible within the time limit by running for every . The problem is that checking whether is on a for every such is too costly since there might be such vertices, and this check costs using .
One way to overcome this problem is to sample an edge set of size that hits the closest edges of each vertex, and then use to detect the vertices in the dense regions that are not on a . In BfsSample we use a detection process in which we call BallOrCycle or NbrBallOrCycle from the endpoints of ’s edges, and then, if no cycle was found, we use the information obtained from this call to identify vertices that are not on a . The detection process either detects vertices that are not on a and can be removed, or reports a . However, it is not clear how to implement this detection process efficiently, since just running BallOrCycle from the endpoints of ’s edges takes time which might be too much. Our solution is an iterative sampling procedure that starts with a smaller hitting set of edges, of size . For such a hitting set we can run our detection process. If a was not reported, then we remove the appropriate vertices and sparsify the graph without removing any . When the graph is sparser, the running time of our detection process becomes faster. Thus, in the following iteration we can sample a larger hitting set for which we run this process, and either return a or sparsify the graph further for the next iteration. We continue the iterative sampling procedure until we get to the required sparsity property in which for every (whp).
We remark that in the first iteration of BfsSample, the detection process calls NbrBallOrCycle, while in the rest of the iterations BallOrCycle is called. The use of NbrBallOrCycle allows us, in the case that no is reported, to bound with for every , rather than . This is used to achieve the required sparsity property. Since NbrBallOrCycle runs in time we can only use it in the first iteration when the sampled set is small enough. In the rest of the iterations we use BallOrCycle instead.
We now formally describe BfsSample. BfsSample (see Algorithm 13) gets a graph and two integers such that . We first set to . Then, we start the main for loop that has at most iterations. In the th iteration, we initialize to and sample a set of size . Next, we scan the endpoints in using an inner for-each loop.
If , we call from every endpoint . NbrBallOrCycle returns either a cycle or a set of vertices . If NbrBallOrCycle returns a cycle then the cycle is returned by BfsSample. Otherwise, we add to .
If then we call , where , from every endpoint . If a cycle is found by BallOrCycle then the cycle is returned by BfsSample. If BallOrCycle does not return a cycle then we add to .
Right after the inner for-each loop ends, we remove from , and continue to the next iteration of the main for loop. If no cycle was found after iterations, we return null. Let be the last iteration in which vertices were removed. Let be the set of vertices that were removed during the th iteration, where for . Let () be before (after) running BfsSample. Figure 5 and Figure 6 illustrate the key steps of the first and the following iterations of BfsSample, respectively. We summarize the properties of BfsSample in the next lemma.
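Before the lemma, the iterative sampling procedure can be summarized by the following sketch, built on the earlier sketches. The iteration-dependent sample sizes and radii, here supplied as the callables hit_target, search_radius and mark_radius, are placeholders for the paper's parameters; only the sample-search-mark-remove structure is taken from the description above.

```python
from typing import Callable, Dict, List, Optional, Set

Graph = Dict[int, List[int]]

def bfs_sample(G: Graph, iterations: int, r: int,
               hit_target: Callable[[int], int],
               search_radius: Callable[[int], int],
               mark_radius: Callable[[int], int]) -> Optional[List[int]]:
    """Each iteration samples an edge hitting set, searches around its endpoints,
    marks vertices certified not to lie on a short cycle, and removes them, so
    that later (larger) samples can be processed on an ever sparser graph."""
    for i in range(iterations):
        to_remove: Set[int] = set()
        sample = sample_hitting_edges(G, hit_target(i))
        endpoints = {u for e in sample for u in e if u in G}
        for u in endpoints:
            if i == 0:
                result = nbr_ball_or_cycle(G, u, r)
                if isinstance(result, list):
                    return result                  # a short cycle was found
                to_remove |= result                # the set returned by the neighbourhood search
            else:
                result = ball_or_cycle(G, u, search_radius(i))
                if isinstance(result, list):
                    return result
                # the ball is a tree: its inner part cannot lie on a short cycle
                to_remove |= {w for w, dw in result.items() if dw <= mark_radius(i)}
        remove_vertices(G, list(to_remove))        # sparsify before the next iteration
    return None
```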
Lemma 6.2.
satisfies the following:
-
(i)
If a cycle is returned then
-
(ii)
If a cycle is not returned then , for every , whp
-
(iii)
If then is not part of a in
-
(iv)
runs in time, whp.
Proof.
- (i)
-
(ii)
Next, we show that if a cycle is not returned then , for every , whp. Since , we have and there is at least one iteration.
Consider the th iteration of the main loop. We show that if no cycle was found during the th iteration then after the th iteration every satisfies the following property: , whp.
Let . By Lemma 3.7, the set hits the closest edges of every vertex of , whp. Assume that is indeed such a hitting set, and assume, towards a contradiction, that after the th iteration . Since we have . Now, since and since the graph is not updated in the inner for-each loop, it follows that there is an edge such that . By the definition of , either or , denoted with , satisfies that . By the definition of , we know that .
When , we have . Therefore, . At the first iteration, when , the input graph has not changed yet so . Since no cycle was found by , it follows from Lemma 6.1 that . Therefore, , and is added to after the call to . Hence, , a contradiction.
We now handle the case that . It holds that . Hence, . Since no cycle was found during the th iteration, is added to after the call to , and therefore, , a contradiction.
Now, if BfsSample does not return a cycle we get for that if then , whp.
-
(iii)
Next, we prove that if then is not part of a in . Since and since it holds that , where .
If then since it follows that there was in the first iteration a vertex such that after a call to did not return a cycle. By Lemma 6.1, and no vertex in is part of a in . Therefore, is not part of a in .
If then since it follows that there was in the th iteration a vertex such that after a call to did not return a cycle. As did not return a cycle, by Lemma 3.2 we know that is a tree. It follows from Lemma 3.1 that no vertex in , and in particular , is part of a in . Since during the run of BfsSample we remove only vertices that are not part of a , is not part of a also in .
-
(iv)
Finally, we show that BfsSample runs in time, whp. To do so, we show that the running time of the th iteration of the main for loop is whp .
We start with the first iteration, in which . The size of is . The size of is at most . For every we run . By Lemma 4.3, running NbrBallOrCycle from costs . Adding to costs . Therefore, the total running time for all is at most .
Now we assume that . We proved in (ii) that if and then after the th iteration, if no cycle was found, we have , whp. By Lemma 3.2, for every , the cost of running is . In our case this is at most . As the size of is and the size of is at most , the total running time of the calls to BallOrCycle for every is, whp,
The cost of adding to is . This is at most (whp), which is for all (similarly to the previous calculation).
The cost of removing a vertex is . Thus, for every , the total cost of removing all the vertices in is at most , so the total running time of the th iteration is , whp.
If we are in the scenario that a cycle is returned, then the th iteration stops at an earlier stage, and therefore the running time is also .
Now, since there are at most iterations of the main for loop, the running time of BfsSample is, whp, . (Footnote 10: This is the running time in the case that was a hitting set as described, for every . This happens whp since we assume that and therefore . For every , the probability that is not such a hitting set is at most . Therefore, using a standard union-bound argument, the probability that there exists such that is not a hitting set is at most . For large enough , we get that is a hitting set for every , whp.) ∎
Recall that our goal is to obtain the sparsity property that , for every , so that we can run . However, after running BfsSample the required sparsity property is guaranteed to hold (whp) only if . In the case that we need an additional step which is implemented in HandleReminder, to guarantee that the required sparsity property holds.
Next, we formally describe HandleReminder. HandleReminder (see Algorithm 14) gets a graph and two integers such that where and . We set to and to . Then, a while loop runs as long as . Let be the value of when the th iteration begins, so that is . Let be the total number of iterations and the value of after the th iteration. During the th iteration, we set to , where is the smallest multiple of that is at least (see Figure 7). Then, we call . If SparseOrCycle returns a cycle then HandleReminder returns . If SparseOrCycle does not return a cycle then it might be that some vertices were removed from , and we continue to the next iteration. If the while loop ends without returning a cycle then we return null. Let () be before (after) running HandleReminder. Next, we prove two properties on the value of during the run of HandleReminder.
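At the structural level, HandleReminder is a loop of SparseOrCycle passes in which the search radius is rounded up to a multiple of k (so that Corollary 3.3 can be applied). The sketch below, reusing sparse_or_cycle, treats the per-iteration radius, budget and stopping test as opaque callables, since the actual parameters are set by the analysis; it is an illustration of the loop structure only, not Algorithm 14.

```python
from typing import Callable, Dict, List, Optional

Graph = Dict[int, List[int]]

def handle_reminder(G: Graph, k: int,
                    radius_of: Callable[[int], int],
                    budget_of: Callable[[int], int],
                    done: Callable[[int], bool]) -> Optional[List[int]]:
    """Repeatedly sparsify with SparseOrCycle, rounding each radius up to a
    multiple of k, until either a cycle is found or the stopping test holds."""
    j = 0
    while not done(j):
        radius = k * ((radius_of(j) + k - 1) // k)  # smallest multiple of k >= radius_of(j)
        cycle = sparse_or_cycle(G, budget_of(j), radius)
        if cycle is not None:
            return cycle
        j += 1
    return None
```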
Claim 6.1.
Let and assume . (i) . (ii) , for every .
Proof.
-
(i)
First, since and we assume that , we have . Now we show that , for every . This implies that , as required.
We first show by induction that , for every . The base of the induction follows from the assumption that . We assume that and prove that . During the th iteration of the while loop, we set to . Since this occurs during the th iteration it must be that , as otherwise the th iteration would not have started. Since we have and therefore, . We get that , as required.
We now turn to prove that . As is the smallest multiple of that is at least , we know that . Therefore, . Since , we get that .
-
(ii)
Let . Since , it follows that is a multiple of , so . ∎
We now prove the main lemma regarding HandleReminder.
Lemma 6.3.
Let and assume that , , and that , for every . satisfies the following:
-
(i)
If a cycle is returned then
-
(ii)
If a cycle is not returned then , for every vertex
-
(iii)
If then is not on a in
-
(iv)
runs in time.
Proof.
- (i)
-
(ii)
Next, we show that if a cycle is not returned then , for every vertex . To do so, we show that if HandleReminder does not return a cycle then when the algorithm ends, for every vertex we have .
If we do not enter the while loop and then by our assumption , for every , as required. Now we assume that we enter the loop so . Consider the th iteration of the while loop. During the th iteration is called. If SparseOrCycle does not return a cycle, then it follows from Lemma 3.6 that if ( was not removed) then . Therefore, if no cycle was returned by HandleReminder we get for that after the th iteration, we have , for every .
-
(iii)
Next, we prove that if then is not on a in . If then by the definition of and , was removed while executing HandleReminder. During the run of HandleReminder, a vertex can be removed only by for some . Therefore, by Lemma 3.6, is not part of a in . Since during the run of HandleReminder only vertices that are not part of a are removed, is not part of a also in .
-
(iv)
Finally, we show that HandleReminder runs in time. To do so, we show that the running time of the th iteration of the while loop is . When there are no iterations and the running time is . Now we assume that . During the th iteration, we call . We proved in (ii) that for if SparseOrCycle did not return a cycle during the th iteration and ( was not removed) then after the th iteration . For by our assumption . Therefore, before the th iteration starts, by Corollary 3.2, for every integer . By Claim 6.1(ii), is divisible by , so for every vertex . It then follows from Corollary 3.3 that runs in time. By Claim 6.1(i), so , and the running time of the th iteration is . Now, it follows from Claim 6.1(i) that after at most iterations, the value of cannot decrease anymore (since it cannot become less than , and ) so the while loop ends. As we saw, the running time of each iteration is , hence the total running time of the while loop is . ∎
Now we are ready to prove the correctness and running time of ShortCycleSparse.
Lemma 6.4.
Let such that , where and are integers. Algorithm runs whp in time and either returns a , or determines that .
Proof.
First, returns a cycle only if , , or returns a cycle . If BfsSample or AllVtxBallOrCycle returns a cycle then by Lemma 6.2 or by Lemma 3.3, . If HandleReminder returns a cycle then by Lemma 6.3, since and hence , .
Second, we show that if no cycle was found then . If no cycle was found then it might be that some vertices were removed from the graph. A vertex can be removed either by BfsSample or by HandleReminder. It follows from Lemma 6.2 and Lemma 6.3, that is not part of a when is removed. Since only vertices that are not on a are removed, every that was in the input graph also belongs to the updated graph. After the (possible) removal of vertices, we call with the updated graph. Since we are in the case that no cycle was found, AllVtxBallOrCycle did not return a cycle. It follows from Lemma 3.3 that in the updated graph, and therefore also in the input graph.
Now we turn to analyze the running time of ShortCycleSparse. At the beginning, ShortCycleSparse calls BfsSample. By Lemma 6.2, BfsSample runs in time. Let be the graph after the call to BfsSample. Recall that . By Lemma 6.2, for every we have , whp. If then and . If then and .
Next, ShortCycleSparse checks if . We divide the rest of the proof into the case that and the case that . If then ShortCycleSparse calls AllVtxBallOrCycle. Since , we have after BfsSample. By Corollary 3.1, the running time of AllVtxBallOrCycle is .
We now turn to the case that . In this case it might be that for some vertices . Therefore, we first call , knowing that , and that whp, after BfsSample. By Lemma 6.3, the running time is . Let be the graph after HandleReminder ends. By Lemma 6.3, for every we have , where . Now ShortCycleSparse calls AllVtxBallOrCycle, and using Corollary 3.1 again, we get that the running time of AllVtxBallOrCycle is .
It follows from the above discussion that ShortCycleSparse either returns a or determines that , and the running time is, whp, . (Footnote 11: Throughout the run of ShortCycleSparse, some of the bounds that we get on for vertices and distances are whp, because the sets that we sample are hitting sets whp (see the proof of Lemma 6.2). Therefore, the running times of BfsSample, HandleReminder and AllVtxBallOrCycle are also whp, since they rely on these bounds.) ∎
Since ShortCycleSparse is run by ShortCycle when and when and so , we have . Thus, the running time of ShortCycleSparse is whp , which is at most .
6.2.1 Algorithm SpecialCases
We now present algorithm SpecialCases that handles special cases of and . gets as an input a graph and two integers and .
If , the algorithm simply runs from an arbitrary vertex in time, to check whether contains a cycle. If a cycle is found then its length is at most , and we return a . Otherwise, so for every integer we return that .
If , we check if the graph contains a for , using an algorithm of Alon, Yuster and Zwick [2]. Their algorithm decides whether contains s and s, and finds such cycles if it does, in time. Applying this algorithm with increasing cycle lengths until a length of (the values of are in the worst case ), we can either find the shortest cycle or determine that . The running time is time. (Footnote 12: It is possible to modify the algorithm of Alon et al. [2] to search, in time, for a shortest cycle of length at most instead of exactly or , and then run it only with , to avoid the factor in the running time.) Since , we have and therefore . In addition, since and since , we have . Therefore, the running time is .
By choosing which algorithm to run according to the relation between , and , we get that for every two integers and , algorithm runs whp in time and either returns a , or determines that . This completes the proof of Theorem 6.1.
7 Approximation of the girth
In this section we present two new tradeoffs for girth approximation that follow from Corollary 6.1. In these tradeoffs we use ShortCycle with , so by Corollary 6.1, ShortCycle is a -time, -hybrid algorithm.
7.1 Dense graphs
Kadria et al. [7] presented an -time algorithm that either returns a , or determines that , where are two integers. This is a -hybrid algorithm which, combined with a binary search, was used by [7] to compute for every a cycle such that , in time, if . We use ShortCycle in a similar way and prove:
Theorem 7.1.
Let be an integer, and . It is possible to compute, whp, in time, a cycle such that .
Proof.
For each in the range in increasing order, we call , where , and . When we find the smallest value for which ShortCycle returns a cycle, we stop and return that cycle. Since and we have , and it follows from Corollary 6.1 that ShortCycle either returns a or determines that in time, whp.
We first prove that the algorithm returns a cycle such that . Let be the smallest value for which ShortCycle returned a cycle. This implies that for the algorithm did not return a cycle, and hence . Since and are integers, we have . Also for we have since the girth is at least .
The call to returns a cycle such that . Thus, .
For the running time, there are at most calls to ShortCycle, and each call costs whp (with the values of and that correspond to that call). In each call, which is at most since . In addition, . Thus, the running time of each call is whp, which is since in each call. Therefore, the total running time is, whp, . (Footnote 13: The running time of each call to ShortCycle is whp. Since the number of calls to ShortCycle is at most , using a union-bound argument as in the proof of Lemma 6.2, we get that the total running time of all the calls is also whp.) ∎
7.2 Sparse graphs
We use a similar approach to obtain a tradeoff for girth approximation in sparse graphs. We prove the following theorem.
Theorem 7.2.
Let be an integer, and . It is possible to compute, whp, in time, a cycle such that .
Proof.
For each in the range in increasing order, we call , where , and . When we find the smallest value for which ShortCycle returns a cycle, we stop and return that cycle. Since and we have , and it follows from Corollary 6.1 that ShortCycle either returns a or determines that in time, whp.
We first prove that the algorithm returns a cycle such that . Let be the smallest value for which ShortCycle returned a cycle. As before, this implies that . The call to returns a cycle such that
Now, since , we get that .
For the running time, there are at most calls to ShortCycle, and each call costs whp (with the values of and that correspond to that call). We have . In addition, . Thus, the running time of each call is whp, which is since in each call. Therefore, the total running time is, whp, . (Footnote 14: See the previous footnote.) ∎
By setting we get an -time algorithm that computes a , as opposed to the time algorithm that computes a .
References
- [1] Donald Aingworth, Chandra Chekuri, Piotr Indyk, and Rajeev Motwani. Fast estimation of diameter and shortest paths (without matrix multiplication). SIAM Journal on Computing, 28(4):1167–1181, 1999.
- [2] Noga Alon, Raphael Yuster, and Uri Zwick. Finding and counting given length cycles. Algorithmica, 17(3):209–223, 1997.
- [3] Shiri Chechik, Yang P. Liu, Omer Rotem, and Aaron Sidford. Constant girth approximation for directed graphs in subquadratic time. In Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing, STOC 2020, Chicago, IL, USA, June 22-26, 2020, pages 1010–1023. ACM, 2020.
- [4] Søren Dahlgaard, Mathias Bæk Tejs Knudsen, and Morten Stöckel. Finding even cycles faster via capped k-walks. In Hamed Hatami, Pierre McKenzie, and Valerie King, editors, Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017, Montreal, QC, Canada, June 19-23, 2017, pages 112–120. ACM, 2017.
- [5] Søren Dahlgaard, Mathias Bæk Tejs Knudsen, and Morten Stöckel. New subquadratic approximation algorithms for the girth. arXiv preprint arXiv:1704.02178, 2017.
- [6] Alon Itai and Michael Rodeh. Finding a minimum circuit in a graph. In Proceedings of the ninth annual ACM symposium on Theory of computing, pages 1–10, 1977.
- [7] Avi Kadria, Liam Roditty, Aaron Sidford, Virginia Vassilevska Williams, and Uri Zwick. Algorithmic trade-offs for girth approximation in undirected graphs. In Proceedings of the 2022 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1471–1492. SIAM, 2022.
- [8] Andrzej Lingas and Eva-Marta Lundell. Efficient approximation algorithms for shortest cycles in undirected graphs. Information Processing Letters, 109(10):493–498, 2009.
- [9] Liam Roditty and Roei Tov. Approximating the girth. ACM Transactions on Algorithms (TALG), 9(2):1–13, 2013.
- [10] Liam Roditty and Virginia Vassilevska Williams. Fast approximation algorithms for the diameter and radius of sparse graphs. In Proceedings of the forty-fifth annual ACM symposium on Theory of computing, pages 515–524, 2013.
- [11] Liam Roditty and Virginia Vassilevska Williams. Subquadratic time approximation algorithms for the girth. In Proceedings of the Twenty-Third Annual ACM-SIAM Symposium on Discrete Algorithms, pages 833–845. SIAM, 2012.
- [12] Virginia Vassilevska Williams and R. Ryan Williams. Subcubic equivalences between path, matrix, and triangle problems. J. ACM, 65(5):27:1–27:38, 2018.
- [13] Virginia Vassilevska Williams, Yinzhan Xu, Zixuan Xu, and Renfei Zhou. New bounds for matrix multiplication: from alpha to omega. In Proceedings of the 2024 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 3792–3835. SIAM, 2024.