
New algorithms for girth and cycle detection

Liam Roditty (Department of Computer Science, Bar Ilan University, Ramat Gan 5290002, Israel. E-mail: liam.roditty@biu.ac.il. Supported in part by BSF grants 2016365 and 2020356.)
Plia Trabelsi (Department of Computer Science, Bar Ilan University, Ramat Gan 5290002, Israel. E-mail: plia.trabelsi@gmail.com.)
Abstract

Let $G=(V,E)$ be an unweighted undirected graph with $n$ vertices and $m$ edges. Let $g$ be the girth of $G$, that is, the length of a shortest cycle in $G$. We present a randomized algorithm with a running time of $\tilde{O}\big(\ell\cdot n^{1+\frac{1}{\ell-\varepsilon}}\big)$ that returns a cycle of length at most $2\ell\left\lceil\frac{g}{2}\right\rceil-2\left\lfloor\varepsilon\left\lceil\frac{g}{2}\right\rceil\right\rfloor$, where $\ell\geq 2$ is an integer and $\varepsilon\in[0,1]$, for every graph with $g={\rm polylog}(n)$.

Our algorithm generalizes an algorithm of Kadria et al. [SODA'22] that computes a cycle of length at most $4\left\lceil\frac{g}{2}\right\rceil-2\left\lfloor\varepsilon\left\lceil\frac{g}{2}\right\rceil\right\rfloor$ in $\tilde{O}\big(n^{1+\frac{1}{2-\varepsilon}}\big)$ time. Kadria et al. also presented an algorithm that finds a cycle of length at most $2\ell\left\lceil\frac{g}{2}\right\rceil$ in $\tilde{O}\big(n^{1+\frac{1}{\ell}}\big)$ time, where $\ell$ must be an integer. Our algorithm generalizes this algorithm as well, by replacing the integer parameter $\ell$ in the running-time exponent with a real-valued parameter $\ell-\varepsilon$, thereby offering greater flexibility in parameter selection and enabling a broader spectrum of combinations between running times and cycle lengths.

We also show that for sparse graphs a better tradeoff is possible, by presenting an $\tilde{O}(\ell\cdot m^{1+1/(\ell-\varepsilon)})$-time randomized algorithm that returns a cycle of length at most $2\ell\left\lfloor\frac{g-1}{2}\right\rfloor-2\left(\left\lfloor\varepsilon\left\lfloor\frac{g-1}{2}\right\rfloor\right\rfloor+1\right)$, where $\ell\geq 3$ is an integer and $\varepsilon\in[0,1)$, for every graph with $g={\rm polylog}(n)$.

To obtain our algorithms we develop several techniques and introduce a formal definition of hybrid cycle detection algorithms. Both may prove useful in broader contexts, including other cycle detection and approximation problems. Among our techniques is a new cycle searching technique, in which we search for a cycle from a given vertex and possibly all its neighbors in linear time. Using this technique together with additional ideas we develop two hybrid algorithms. The first allows us to obtain an $\tilde{O}(m^{2-\frac{2}{\lceil g/2\rceil+1}})$-time, $(+1)$-approximation of $g$. The second is used to obtain our $\tilde{O}(\ell\cdot n^{1+1/(\ell-\varepsilon)})$-time and $\tilde{O}(\ell\cdot m^{1+1/(\ell-\varepsilon)})$-time approximation algorithms.

1 Introduction

Let $G=(V,E)$ be an unweighted undirected graph with $n$ vertices and $m$ edges. A set of vertices $C_{\ell}=\{v_{1},v_{2},\ldots,v_{\ell+1}\}$ in $G$, where $\ell\geq 2$, is a cycle of length $\ell$ if $v_{1}=v_{\ell+1}$ and $(v_{i},v_{i+1})\in E$ for every $1\leq i\leq\ell$. A $C_{\leq\ell}$ is a cycle of length at most $\ell$. The girth $g$ of $G$ is the length of a shortest cycle in $G$. The girth of a graph has been studied extensively since the 1970s by researchers from both the graph theory and the algorithms communities.

Itai and Rodeh [6] showed that the girth can be computed in $O(mn)$ time, or in $O(n^{\omega})$ time, where $\omega<2.371552$ [13], if Fast Matrix Multiplication (FMM) algorithms are used. They also proved that computing the girth is equivalent to deciding whether the graph contains a $C_{3}$ (triangle).

Interestingly, there is a close connection between the girth problem and the All Pairs Shortest Path (APSP) problem. Vassilevska W. and Williams [12] proved that a truly subcubic time algorithm that computes the girth, without FMM, implies a truly subcubic time algorithm that computes APSP, without FMM. Such an algorithm for APSP would be a major breakthrough. In light of this connection between the girth and APSP, it is natural to settle for an approximation algorithm for the girth instead of an exact computation. An $(\alpha,\beta)$-approximation $\hat{g}$ of $g$ (where $\alpha\geq 1$ and $\beta\geq 0$) satisfies $g\leq\hat{g}\leq\alpha\cdot g+\beta$. We denote an approximation as an $\alpha$-approximation if $\beta=0$ and as a $(+\beta)$-approximation if $\alpha=1$.

Itai and Rodeh [6] presented a $(+1)$-approximation algorithm that runs in $O(n^{2})$ time. Notice that in contrast to the APSP problem, where a running time of $\Omega(n^{2})$ is inevitable since the output size is $\Omega(n^{2})$, in the girth problem the output is a single number; thus, there is no natural barrier to sub-quadratic time algorithms. Indeed, Lingas and Lundell [8] presented a $\frac{8}{3}$-approximation algorithm that runs in $\tilde{O}(n^{3/2})$ time, and Roditty and V. Williams [11] presented a $2$-approximation algorithm that runs in $\tilde{O}(n^{5/3})$ time. Dahlgaard, Knudsen and Stöckel [5] presented two tradeoffs between running time and approximation. One generalizes the algorithms of [8, 11] and computes a cycle of length at most $2\lceil\frac{g}{2}\rceil+2\lceil\frac{g}{2(\ell-1)}\rceil$ in $\tilde{O}(n^{2-1/\ell})$ time. The other computes, whp, a $C_{\leq 2^{\ell}g}$, for any integer $\ell\geq 2$, in $\tilde{O}(n^{1+1/\ell})$ time.

Kadria et al. [7] significantly improved upon the second algorithm of [5] and presented an algorithm that, for every integer $\ell\geq 1$, computes a $C_{\leq 2\ell\lceil g/2\rceil}$ in $\tilde{O}(n^{1+1/\ell})$ time. They also presented an algorithm that, for every $\varepsilon\in(0,1)$, computes a cycle of length at most $4\lceil\frac{g}{2}\rceil-2\lfloor\varepsilon\lceil\frac{g}{2}\rceil\rfloor\leq(2-\varepsilon)g+4$ in $\widetilde{O}(n^{1+1/(2-\varepsilon)})$ time, for every graph with $g={\rm polylog}(n)$.

These two algorithms of Kadria et al., as well as a few other approximation algorithms (see, for example, [8], [3], [9]), were obtained using a general framework for girth approximation in which a search is performed over the range of possible values of $g$, using some algorithm $\mathcal{A}$ that gets as input an integer $\tilde{g}$, a guess for the value of $g$. In each step of the search, $\mathcal{A}$ either returns a cycle $C_{\leq f(\tilde{g})}$, where $f$ is a non-decreasing function, or determines that $g>\tilde{g}$. The goal of the search is to find the smallest $\tilde{g}$ for which $\mathcal{A}$ returns a cycle, because for this value we have $g>\tilde{g}-1$ (and thus $g\geq\tilde{g}$), and algorithm $\mathcal{A}$ returns a $C_{\leq f(\tilde{g})}$. This cycle is of length at most $f(g)$ since $g\geq\tilde{g}$ and $f$ is a non-decreasing function. The two possible outcomes of $\mathcal{A}$ and its usage in the general girth approximation framework inspired us to formally define the notion of a $(\gamma,\delta)$-hybrid algorithm as follows:
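The framework above can be sketched as follows. Here `hybrid` is a placeholder for any $(f(\tilde{g}),\tilde{g})$-hybrid algorithm $\mathcal{A}$, and the linear scan over guesses is just one possible instantiation of the search; the names are ours, not the paper's.

```python
def approximate_girth(hybrid, max_guess):
    """Generic girth-approximation framework. `hybrid(g_tilde)` is any
    (f(g_tilde), g_tilde)-hybrid algorithm: it returns a cycle of length
    at most f(g_tilde), or None when the girth exceeds g_tilde.

    Scanning guesses in increasing order finds the smallest guess for
    which a cycle is returned; since g >= g_tilde at that point and f is
    non-decreasing, the returned cycle has length at most f(g)."""
    for g_tilde in range(3, max_guess + 1):
        cycle = hybrid(g_tilde)
        if cycle is not None:
            return cycle
    return None
```

For example, plugging in a toy hybrid that succeeds exactly when its guess reaches the true girth returns the underlying cycle at the first successful guess.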

Definition 1.1.

A $(\gamma,\delta)$-hybrid algorithm is an algorithm that either outputs a $C_{\leq\gamma}$ or determines that $g>\delta$.

When $\gamma=\delta$, the algorithm is referred to as a $\gamma$-hybrid algorithm. The girth approximation framework described above suggests that a possible approach for developing efficient girth approximation algorithms is to develop efficient $(\gamma,\delta)$-hybrid algorithms.

Kadria et al. [7] designed several algorithms that satisfy the definition of a $(\gamma,\delta)$-hybrid algorithm. Their girth approximation algorithms mentioned above were obtained using two different $(f(\tilde{g}),\tilde{g})$-hybrid algorithms. Additionally, for every $k\geq 2$, they presented a $(2k,3)$-hybrid and a $(2k,4)$-hybrid algorithm that run in $O(\min\{n^{1+2/k},m^{1+1/(k+1)}\})$ time, and a $(\max\{2k,g\},5)$-hybrid algorithm that runs in $O(\min\{n^{1+3/k},m^{1+2/(k+1)}\})$ time. Therefore, for $k\geq 2$, $\tilde{g}\in\{3,4,5\}$ and $\alpha=\lceil\frac{\tilde{g}}{2}\rceil$, there is a $(\max\{2k,g\},\tilde{g})$-hybrid algorithm that runs in $O(\min\{n^{1+\frac{\alpha}{k}},m^{1+\frac{\alpha-1}{k+1}}\})$ time. A natural question is whether these three algorithms are part of a more general tradeoff between the running time, $\gamma$ and $\delta$.

Problem 1.1.

Let $\tilde{g}\geq 6$ and $k\geq 2$ be two integers and let $\alpha=\lceil\frac{\tilde{g}}{2}\rceil$. Is it possible to obtain a $(\max\{2k,g\},\tilde{g})$-hybrid algorithm that runs in $O(\min\{n^{1+\frac{\alpha}{k}},m^{1+\frac{\alpha-1}{k+1}}\})$ time?

In this paper we present a $(\max\{2k,g\},\tilde{g})$-hybrid algorithm that runs, whp, in $O((\frac{k+1}{\alpha-1}+\alpha)\cdot\min\{n^{1+\frac{\alpha}{k}},m^{1+\frac{\alpha-1}{k+1}}\})$ time. This algorithm provides an affirmative answer to Problem 1.1, albeit with the extra $(\frac{k+1}{\alpha-1}+\alpha)$ factor in the running time.

Using our $(\max\{2k,g\},\tilde{g})$-hybrid algorithm we obtain a generalization of the $\widetilde{O}(n^{1+1/(2-\varepsilon)})$-time algorithm of Kadria et al. [7] that computes a cycle of length at most $4\lceil\frac{g}{2}\rceil-2\lfloor\varepsilon\lceil\frac{g}{2}\rceil\rfloor\leq(2-\varepsilon)g+4$. Our generalized algorithm runs in $\tilde{O}(\ell\cdot n^{1+1/(\ell-\varepsilon)})$ time, whp, and returns a cycle of length at most $2\ell\lceil\frac{g}{2}\rceil-2\lfloor\varepsilon\lceil\frac{g}{2}\rceil\rfloor\leq(\ell-\varepsilon)g+\ell+2$, where $\ell\geq 2$ is an integer and $\varepsilon\in[0,1]$, for every graph with $g={\rm polylog}(n)$. We also show that if the graph is sparse then the approximation can be improved. More specifically, we present an algorithm that runs in $\tilde{O}(\ell\cdot m^{1+1/(\ell-\varepsilon)})$ time, whp, and returns a cycle of length at most $(\ell-\varepsilon)g-\ell+2\varepsilon$, where $\ell\geq 3$ is an integer and $\varepsilon\in[0,1)$, for every graph with $g={\rm polylog}(n)$.

Our $\tilde{O}(\ell\cdot n^{1+1/(\ell-\varepsilon)})$-time algorithm also generalizes the $\tilde{O}(n^{1+1/\ell})$-time algorithm of Kadria et al. [7], which computes a $C_{\leq 2\ell\lceil g/2\rceil}$ for every integer $\ell\geq 1$. In our algorithm, the integer parameter $\ell$ that appears in the exponent of the running time is replaced by a real-valued parameter $\ell-\varepsilon$. Thus, we introduce many new points on the tradeoff curve between running time and approximation ratio. Specifically, for every integer $\ell\leq{\rm polylog}(n)$, up to $\lceil g/2\rceil-1$ additional tradeoff points are added (since for every such $\ell$, when $\varepsilon$ is a multiple of $\frac{1}{\lceil g/2\rceil}$, we get an $\tilde{O}(n^{1+1/(\ell-\varepsilon)})$-time algorithm which computes a $C_{\leq 2(\ell-\varepsilon)\lceil g/2\rceil}$). For example, consider $\ell=3$ and a graph with girth $g=5$ or $g=6$. Our algorithm yields two additional points on the tradeoff curve, corresponding to $\varepsilon=\frac{1}{3}$ and $\varepsilon=\frac{2}{3}$. For $\varepsilon=\frac{1}{3}$, we compute a $C_{\leq 16}$ in $\tilde{O}(n^{1+\frac{3}{8}})$ time, and for $\varepsilon=\frac{2}{3}$, we compute a $C_{\leq 14}$ in $\tilde{O}(n^{1+\frac{3}{7}})$ time. These points lie between the two points on the tradeoff curve given by the algorithm of Kadria et al. [7], which computes either a $C_{\leq 12}$ in $\tilde{O}(n^{1+\frac{3}{6}})$ time or a $C_{\leq 18}$ in $\tilde{O}(n^{1+\frac{3}{9}})$ time. See Figure 1 for a comparison.

[Figure 1(a): the running-time exponents $y=1+\frac{1}{\ell}$, $y=1+\frac{1}{\ell-1}$, and $y=1+\frac{1}{\ell-\varepsilon}$, plotted against $\varepsilon$.]
[Figure 1(b): the cycle-length bounds $y=2\ell\lceil\frac{g}{2}\rceil$, $y=2(\ell-1)\lceil\frac{g}{2}\rceil$, and $y=2\ell\lceil\frac{g}{2}\rceil-2\lfloor\varepsilon\lceil\frac{g}{2}\rceil\rfloor$, plotted against $\varepsilon$.]

$\varepsilon$    Time exponent      Cycle bound
$0$              $1+\frac{3}{9}$    $18$
$\frac{1}{3}$    $1+\frac{3}{8}$    $16$
$\frac{2}{3}$    $1+\frac{3}{7}$    $14$
$1$              $1+\frac{3}{6}$    $12$

Figure 1: Our $\tilde{O}(\ell\cdot n^{1+\frac{1}{\ell-\varepsilon}})$-time girth approximation algorithm compared to the $\tilde{O}(n^{1+\frac{1}{\ell}})$-time algorithm of [7], for every $\varepsilon\in[0,1]$, choosing $\ell=3$ and $g=5$ or $6$. (a) The $y$-axis is the exponent of $n$ in the running time, and the $x$-axis is $\varepsilon$. (b) The $y$-axis is the upper bound on the length of the returned cycle, and the $x$-axis is $\varepsilon$. The blue points correspond to our algorithm at four specific choices of $\varepsilon$: $\varepsilon=0,\frac{1}{3},\frac{2}{3},1$ (see the table on the right).
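The arithmetic behind the example points above can be reproduced directly. The helper below (our own naming, not from the paper) evaluates the running-time exponent $1+\frac{1}{\ell-\varepsilon}$ and the cycle bound $2\ell\lceil\frac{g}{2}\rceil-2\lfloor\varepsilon\lceil\frac{g}{2}\rceil\rfloor$:

```python
import math
from fractions import Fraction

def tradeoff_point(ell, eps, g):
    """Exponent of n in the running time, and the cycle-length bound, of
    the O~(ell * n^(1 + 1/(ell - eps)))-time algorithm for girth g."""
    half = math.ceil(g / 2)                     # ceil(g/2)
    exponent = 1 + Fraction(1) / (ell - eps)    # 1 + 1/(ell - eps)
    bound = 2 * ell * half - 2 * math.floor(eps * half)
    return exponent, bound

# Reproduces the example: ell = 3, g = 5 (so ceil(g/2) = 3).
assert tradeoff_point(3, Fraction(1, 3), 5) == (Fraction(11, 8), 16)  # n^{1+3/8}, C_16
assert tradeoff_point(3, Fraction(2, 3), 5) == (Fraction(10, 7), 14)  # n^{1+3/7}, C_14
```

The two endpoints $\varepsilon=0$ and $\varepsilon=1$ recover the $C_{\leq 18}$ and $C_{\leq 12}$ points of [7] from the table.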

The tradeoff curve to which we add new points encompasses many known algorithms, including those of Itai and Rodeh [6], Lingas and Lundell [8], and Kadria et al. [7] (as well as those of Roditty and V. Williams [11] for $g=4c$ and $g=4c-1$, where $c\geq 1$ is an integer, and Dahlgaard et al. [5] for some values of $g$ and $\ell$). Notably, some of these algorithms have resisted improvement for many years. The addition of new points to this curve reinforces the possibility that it captures a fundamental relationship between running time and approximation quality. This, in turn, motivates further investigation into whether a matching lower bound exists for this tradeoff.

The rest of this paper is organized as follows. In Section 2 we provide an overview. Preliminaries are in Section 3. In Section 4, we present a new cycle searching technique that is used by our algorithms. In Section 5 we present a $2k$-hybrid algorithm and then use it to obtain a $(+1)$-approximation algorithm for the girth. In Section 6 we generalize the $2k$-hybrid algorithm and present a $(\max\{2k,g\},\tilde{g})$-hybrid algorithm. In Section 7 we use the hybrid algorithm from Section 6 to obtain two more approximation algorithms for the girth.

2 Overview

Among the techniques that we develop to obtain our new algorithms is a new cycle searching technique that might be of independent interest. Our new technique exploits the property that if $s\in V$ is not on a $C_{\leq 2k}$, then for any two neighbors $x$ and $y$ of $s$, the sets of vertices at distance exactly $k-1$ from $x$ and from $y$ that are also at distance $k$ from $s$ are disjoint (see Figure 2). This allows us to check efficiently, for all the neighbors of $s$, whether they are on a $C_{\leq 2k}$. Using this technique, together with more tools that we develop, we obtain two hybrid algorithms.

Figure 2: Disjoint sets of vertices at distance $k-1$ from $x$ and $y$.
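The disjointness property can be checked on a concrete instance. The sketch below (illustrative only) places $s$ on a $9$-cycle with $k=3$, so $s$ lies on no $C_{\leq 6}$, and verifies that the two sets arising from the two neighbors of $s$ are disjoint:

```python
from collections import deque

def bfs_dist(adj, src):
    """Standard BFS distances from src in an adjacency-list graph."""
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                q.append(w)
    return dist

# A 9-cycle: with k = 3, no vertex lies on a cycle of length <= 2k = 6.
n, k, s = 9, 3, 0
adj = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
ds = bfs_dist(adj, s)
layers = []
for x in adj[s]:  # the two neighbors of s
    dx = bfs_dist(adj, x)
    layers.append({v for v in adj if dx[v] == k - 1 and ds[v] == k})
# As the property predicts for a vertex on no C_{<=2k}, the sets are disjoint.
assert layers[0].isdisjoint(layers[1])
```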

The first is a relatively simple $O(m^{1+\frac{k-1}{k+1}})$-time, $2k$-hybrid algorithm. We use this hybrid algorithm in the girth approximation framework described earlier to obtain an $\tilde{O}(m^{1+\frac{\ell-1}{\ell+1}})$-time, $(+1)$-approximation of the girth, where $g=2\ell$ or $g=2\ell-1$. We remark that using an algorithm of [4] it is possible to obtain a $(+1)$-approximation in $\tilde{O}(\ell^{O(\ell)}\cdot m^{1+\frac{\ell-1}{\ell+1}})$ time.¹
Footnote 1: [4] showed that a $C_{2k}$, if one exists, can be found in $O(k^{O(k)}\cdot m^{\frac{2k}{k+1}})$ time, and if not, then a $C_{2k-1}$, if one exists, can be found in the same time. Thus, if we run their algorithm with increasing values of $k$ we can obtain a $(+1)$-approximation for the girth in $\tilde{O}(\ell^{O(\ell)}\cdot m^{1+\frac{\ell-1}{\ell+1}})$ time, where $g=2\ell$ or $g=2\ell-1$. However, the additional $\ell^{O(\ell)}$ factor might be significant even for small values of $\ell$.

The second is the $(\max\{2k,g\},\tilde{g})$-hybrid algorithm that solves Problem 1.1. Its main component is a $(2k,2\alpha)$-hybrid algorithm that runs in $\tilde{O}((\frac{k+1}{\alpha-1}+\alpha)\cdot m^{1+\frac{\alpha-1}{k+1}})$ time, whp, and generalizes the first $O(m^{1+\frac{k-1}{k+1}})$-time $2k$-hybrid algorithm by introducing an additional parameter $\alpha\leq k$. Using $\alpha$ we can trade off between the running time and the lower bound on $g$, obtaining a faster running time at the price of a worse lower bound.

We compare our $(2k,2\alpha)$-hybrid algorithm to algorithm Cycle of Kadria et al. [7], an $O(m+n^{1+\frac{1}{\ell}})$-time² $(f(\tilde{g}),\tilde{g})$-hybrid algorithm, where $f(\tilde{g})=2\ell\lceil\tilde{g}/2\rceil=2\ell\alpha$, that they used to obtain the $\tilde{O}(n^{1+\frac{1}{\ell}})$-time, $2\ell\lceil g/2\rceil$-approximation algorithm. As we show later, the running time of our $(2k,2\alpha)$-hybrid algorithm can be bounded by $\tilde{O}((\frac{k+1}{\alpha-1}+\alpha)\cdot n^{1+\frac{\alpha}{k}})$. Since in our algorithm $k$ is not necessarily a multiple of $\alpha$ (compared to the $\ell\alpha$ of Cycle), our algorithm allows more flexibility, and we achieve many more possible tradeoffs between the running time and the output cycle length. For example, if we consider a multiplicative approximation better than $3$, when the value of $g$ is a constant known in advance, our algorithm can return longer cycles that are still shorter than $3g$, in a faster running time. See Figure 3 for a comparison.³
Footnote 2: Cycle runs in $O(n^{1+\frac{1}{\ell}}+m)$ time, which can be reduced to $O(n^{1+\frac{1}{\ell}})$ time, as shown in [7].
Footnote 3: [7] also presented an $O((\alpha-c)\cdot n^{1+\frac{\alpha}{2\alpha-c}})$-time, $(4\alpha-2c,2\alpha)$-hybrid algorithm, where $0<c\leq\alpha$ are integers. For $c=2\alpha-k$, this is an $O((k-\alpha)\cdot n^{1+\frac{\alpha}{k}})$-time, $(2k,2\alpha)$-hybrid algorithm, similar to our $(2k,2\alpha)$-hybrid algorithm. However, since $0<c\leq\alpha$, the possible values of $k$ are restricted and must satisfy $\alpha\leq k<2\alpha$. By choosing $\alpha=\lceil\frac{g}{2}\rceil$ and an appropriate $k$ the two algorithms have similar flexibility for a $2$-approximation, but since our algorithm also allows larger values of $k$, we can achieve a faster running time for a $t$-approximation where $t>2$.

[Figure 3(a): the running-time exponents of the $(2k,2\alpha)$-hybrid algorithm and of Cycle [7], together with the curves $y=1+\frac{g}{3g-2}$ and $y=1+\frac{g+1}{3g-1}$, plotted against $g$.]
[Figure 3(b): the corresponding upper bounds on the length of the returned cycle, together with the line $y=3g$, plotted against $g$.]
Figure 3: The $(2k,2\alpha)$-hybrid algorithm vs. the Cycle algorithm of [7]. Both produce a multiplicative approximation strictly better than $3$, given a constant $g$ known in advance. (a) The $y$-axis is the exponent of $n$ in the fastest running time that achieves such an approximation, and the $x$-axis is $g$. (b) The $y$-axis is the upper bound on the length of the cycle returned within this time, and the $x$-axis is $g$.

The flexibility of our algorithm is also demonstrated in Figure 4. For a given constant value of $k$, if our $(2k,2\alpha)$-hybrid algorithm returns a cycle then its length is at most $2k$. If we want algorithm Cycle to output a $C_{\leq 2k}$, then $\lfloor\frac{k}{\alpha}\rfloor$ is the largest $\ell$ that we can choose, since $\ell$ must be an integer. The running time is $O(n^{1+1/\ell})=O(n^{1+1/\lfloor\frac{k}{\alpha}\rfloor})$. Our algorithm achieves a better running time whenever $k$ is not divisible by $\alpha$. (In Figure 4 we choose $\alpha=3$.)

[Figure 4: the running-time exponents $y=1+\frac{3}{x}$ (our $(2k,2\alpha)$-hybrid algorithm) and $y=1+\frac{1}{\lfloor x/3\rfloor}$ (Cycle [7]), plotted against $x=k$.]
Figure 4: $\alpha=3$. Red points are the $O(n^{1+1/\lfloor\frac{k}{\alpha}\rfloor})$-time Cycle algorithm [7], and blue points are our $\widetilde{O}(n^{1+\frac{\alpha}{k}})$-time $(2k,2\alpha)$-hybrid algorithm. Both algorithms either return a $C_{\leq 2k}$ or determine that $g>2\alpha$, for a constant $k$. The $y$-axis is the exponent of $n$ in the running time. The $x$-axis is $k$.

Next, we overview our $(2k,2\alpha)$-hybrid algorithm that either finds a $C_{\leq 2k}$ or determines that $g>2\alpha$ in $\tilde{O}((\frac{k+1}{\alpha-1}+\alpha)\cdot m^{1+\frac{\alpha-1}{k+1}})$ time. To determine that $g>2\alpha$, we can check for every $v\in V$ whether $v$ is on a $C_{\leq 2\alpha}$. If $v$ is on a $C_{\leq 2\alpha}$, then all the vertices and edges of this $C_{\leq 2\alpha}$ are at distance at most $\alpha$ from $v$. If, for every $v\in V$, the number of edges at distance at most $\alpha$ is $O(deg(v)\cdot m^{\frac{\alpha-1}{k+1}})$, then using standard techniques we can check for every $v\in V$ whether $v$ is on a $C_{\leq 2\alpha}$ in $O(m^{1+\frac{\alpha-1}{k+1}})$ time. However, this is not necessarily the case, and the region at distance at most $\alpha$ from some vertices might be dense. To deal with dense regions within the promised running time we develop an iterative sampling procedure (see BfsSample in Section 6), whose goal is to sparsify the graph or to return a $C_{\leq 2k}$. One component of the iterative sampling procedure is a generalization of our new cycle searching technique mentioned above. In the generalization, instead of checking whether a vertex $s$ and its neighbors are on a $C_{\leq 2k}$, we check whether all the vertices up to a possibly larger distance from $s$ are on a $C_{\leq 2\alpha}$, for $\alpha\leq k$, and if not we mark them so that they can be removed later.

If the iterative sampling procedure ends without finding a $C_{\leq 2k}$ then there are two possibilities. Let $r=(k+1)\bmod(\alpha-1)$. If $r=0$ then the number of edges at distance at most $\alpha$ from every $v\in V$ is $O(deg(v)\cdot m^{\frac{\alpha-1}{k+1}})$, whp, as required. If $r>0$ then the number of edges at distance at most $r$ from every $v\in V$ is $O(m^{\frac{r}{k+1}})$, whp. This does not necessarily imply that the graph is sparse enough for checking whether $g>2\alpha$. In this case, we run another algorithm (see HandleReminder in Section 6) that continues to sparsify the graph until the number of edges at distance at most $\alpha$ from every $v\in V$ is $O(deg(v)\cdot m^{\frac{\alpha-1}{k+1}})$, and checking whether $g>2\alpha$ is possible within the required running time of $O(m^{1+\frac{\alpha-1}{k+1}})$.

3 Preliminaries

Let $G=(V,E)$ be an unweighted undirected graph with $n$ vertices and $m$ edges. Let $U\subseteq V$ be a set of vertices and let $G\setminus U$ be the graph obtained from $G$ by deleting all the vertices of $U$ together with their incident edges. For two graphs $G=(V,E)$ and $G^{\prime}=(V^{\prime},E^{\prime})$, let $G\setminus G^{\prime}$ be $G\setminus V^{\prime}$. We say that $G\subseteq G^{\prime}$ if $V\subseteq V^{\prime}$ and $E\subseteq E^{\prime}$. For convenience, we use both $u\in V$ and $u\in G$ to say that $u\in V$. For every $u,v\in V$, let $d_{G}(u,v)$ be the length of a shortest path between $u$ and $v$ in $G$. The girth $g$ of $G$ is the length of a shortest cycle in $G$. Let $wt(C)$ be the length of a cycle $C$. For an integer $\ell$, we denote a cycle of length $\ell$ by $C_{\ell}$ and a cycle of length at most $\ell$ by $C_{\leq\ell}$.⁴ Let $E(v)$ be the edges incident to $v$ and $E(v,i)$ the $i$th edge in $E(v)$. Let $deg_{G}(v)$ be the degree of $v$ in $G$. Let $N(v)$ be the set of neighbors of $v$, namely $N(v)=\{w\mid(v,w)\in E\}$. For an edge set $S$, let $V(S)$ be the endpoints of $S$'s edges, that is, $V(S)=\{u\in V\mid\exists(u,v)\in S\}$. Let $e=(u,v)\in E$ and $w\in V$. The distance $d_{G}(w,e)$ between $w$ and $e$ is $\min\{d_{G}(w,u),d_{G}(w,v)\}+1$. For every $u\in V$ and a real number $k$, let $B(G,u,k)=(V_{u}^{k}(G),E_{u}^{k}(G))$ be the ball graph of $u$, where $V_{u}^{k}(G)=\{v\in V\mid d_{G}(u,v)\leq k\}$ and $E_{u}^{k}(G)=\{e\in E\mid d_{G}(u,e)\leq k\}$ [7].⁵
Footnote 4: Both $C_{\ell}$ and $C_{\leq\ell}$ might not be simple cycles. However, the cycles that our algorithms return are simple.
Footnote 5: When the graph is clear from the context, we sometimes omit $G$ from the notations.
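As a concrete illustration of the ball-graph definitions, the sketch below (our own helper, not part of the paper) computes $B(G,u,k)$ for an adjacency-list graph, using $d_G(w,e)=\min\{d_G(w,u'),d_G(w,v')\}+1$ for an edge $e=(u',v')$:

```python
from collections import deque

def ball_graph(adj, u, k):
    """Return (V_u^k, E_u^k): V_u^k are the vertices at distance <= k from
    u, and an edge e = (a, b) belongs to E_u^k exactly when
    d(u, e) = min(d(u, a), d(u, b)) + 1 <= k."""
    dist = {u: 0}
    q = deque([u])
    while q:
        v = q.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                q.append(w)
    V = {v for v, d in dist.items() if d <= k}
    E = {frozenset((a, b)) for a in adj for b in adj[a]
         if min(dist.get(a, k), dist.get(b, k)) + 1 <= k}
    return V, E
```

On the path $0-1-2-3-4$ with $u=2$ and $k=1$, for instance, the ball contains vertices $\{1,2,3\}$ but only the two edges incident to $2$: the edge $(0,1)$ has $d(2,(0,1))=\min\{2,1\}+1=2>1$.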

We now present several essential tools that are required in order to obtain our new algorithms. We first restate an important property of the ball graph $B(v,R)$.

Lemma 3.1 ([7]).

Let $0\leq t\leq R$ be two integers and let $v\in V$. If $B(v,R)$ is a tree then no vertex in $V_{v}^{R-t}$ is part of a cycle of length at most $2t$ in $G$.

We use procedure $\texttt{BallOrCycle}(G,v,R)$ [7, 8] (see Algorithm 1), which searches for a $C_{\leq 2R}$ in the ball graph $B(v,R)$. We summarize the properties of BallOrCycle in the next lemma.

$Q\leftarrow$ a queue that contains $v$ with $d(v)=0$;
while $Q\neq\emptyset$ do
    $u\leftarrow dequeue(Q)$;
    $V_{v}^{R}\leftarrow V_{v}^{R}\cup\{u\}$;
    $i\leftarrow 1$;
    while $i\leq|E(u)|$ and $d(u)+1\leq R$ do
        $(u,w)\leftarrow E(u,i)$;
        if $w\in Q$ then
            return null, $P(LCA(u,w),u)\cup\{(u,w)\}\cup P(LCA(u,w),w)$;⁶
        $Q\leftarrow Q\cup\{w\}$ with $d(w)=d(u)+1$;
        $i\leftarrow i+1$;
return $V_{v}^{R}$, null;
Algorithm 1: $\texttt{BallOrCycle}(G,v,R)$ [7, 8]
Footnote 6: $LCA(u,w)$ is the least common ancestor of $u$ and $w$ in the tree rooted at $v$ before the edge $(u,w)$ was discovered. $P(x,y)$ is the path in this tree between $x$ and $y$.
Lemma 3.2 ([7]).

Let $v\in V$. If the ball graph $B(v,R)$ is not a tree then $\texttt{BallOrCycle}(G,v,R)$ returns a $C_{\leq 2R}$ from $B(v,R)$. If $B(v,R)$ is a tree then $\texttt{BallOrCycle}(G,v,R)$ returns $V_{v}^{R}$.⁷ The running time of $\texttt{BallOrCycle}(G,v,R)$ is $O(|V_{v}^{R}|)$.
Footnote 7: If $V_{v}^{R}$ is returned then we assume that $V_{v}^{R}$ is ordered by the distance from $v$, and for every $u\in V_{v}^{R}$ we store $d(u,v)$ with $u$. Thus, given the set $V_{v}^{R}$, we can find $V_{v}^{R^{\prime}}$ for every $R^{\prime}<R$ in $O(|V_{v}^{R^{\prime}}|)$ time.
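A minimal Python rendering of this behavior, for simple graphs given as adjacency lists (a sketch in the spirit of Algorithm 1, not the paper's exact pseudocode): it returns $(V_v^R,\texttt{None})$ when the ball is a tree, and $(\texttt{None},C)$ with a cycle of length at most $2R$ otherwise.

```python
from collections import deque

def ball_or_cycle(adj, v, R):
    """BFS from v up to depth R. If a non-tree edge closes a cycle inside
    the ball, return (None, cycle); otherwise return (ball vertices, None).
    Assumes a simple graph given as an adjacency list."""
    parent = {v: None}
    depth = {v: 0}
    q = deque([v])
    while q:
        u = q.popleft()
        if depth[u] + 1 > R:
            continue
        for w in adj[u]:
            if w not in parent:
                parent[w] = u
                depth[w] = depth[u] + 1
                q.append(w)
            elif w != parent[u]:
                # Non-tree edge (u, w): climb both tree paths to the least
                # common ancestor and stitch the cycle together.
                pu, pw = [u], [w]
                while depth[pu[-1]] > depth[pw[-1]]:
                    pu.append(parent[pu[-1]])
                while depth[pw[-1]] > depth[pu[-1]]:
                    pw.append(parent[pw[-1]])
                while pu[-1] != pw[-1]:
                    pu.append(parent[pu[-1]])
                    pw.append(parent[pw[-1]])
                return None, pu + pw[-2::-1]
    return set(parent), None
```

On a triangle with $R=1$ the ball $B(v,1)$ excludes the far edge and is a tree, so the vertex set is returned; with $R=2$ the triangle itself is found.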

Next, we obtain a simple $t$-hybrid algorithm, called AllVtxBallOrCycle, using BallOrCycle. AllVtxBallOrCycle (see Algorithm 2) gets a graph $G$ and an integer $t\geq 2$, and runs $\texttt{BallOrCycle}(v,t)$ from every $v\in G$ as long as no cycle is found by BallOrCycle. If BallOrCycle finds a cycle then AllVtxBallOrCycle stops and returns that cycle. If no cycle is found then AllVtxBallOrCycle returns null. We prove the next lemma.

foreach $v\in V$ do
    $(V_{v}^{t},C)\leftarrow\texttt{BallOrCycle}(v,t)$;
    if $C\neq\texttt{null}$ then return $C$;
return null;
Algorithm 2: $\texttt{AllVtxBallOrCycle}(G,t)$
Lemma 3.3.

$\texttt{AllVtxBallOrCycle}(G,t)$ either finds a $C_{\leq 2t}$ or determines that $g>2t$, in $O(\sum_{v\in V}|V_{v}^{t}|)$ time.

Proof.

By Lemma 3.2, if $\texttt{BallOrCycle}(u,t)$ returns a cycle $C$ then $wt(C)\leq 2t$. Also by Lemma 3.2, if $\texttt{BallOrCycle}(u,t)$ does not return a cycle then the ball graph $B(u,t)$ is a tree, and in particular $u$ is not part of a $C_{\leq 2t}$ in $G$. Therefore, if no cycle was found during any of the calls then no vertex of $G$ is part of a $C_{\leq 2t}$ in $G$. Hence, $G$ does not contain a $C_{\leq 2t}$, and we get that $g>2t$. By Lemma 3.2, the running time of $\texttt{BallOrCycle}(v,t)$ is $O(|V_{v}^{t}|)$, which is $O(\sum_{v\in V}|V_{v}^{t}|)$ in total. ∎
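As a sanity check of Lemma 3.3, the predicate below mimics AllVtxBallOrCycle for simple adjacency-list graphs (a sketch; it reports only the verdict, not a cycle). It uses the fact that the connected ball graph $B(v,t)$ is a tree iff $|E_v^t|=|V_v^t|-1$.

```python
from collections import deque

def girth_exceeds(adj, t):
    """True means every ball B(v, t) is a tree, hence g > 2t; False means
    some ball is not a tree, i.e., it contains a C_{<=2t}."""
    for v in adj:
        dist = {v: 0}
        edges = 0
        q = deque([v])
        while q:
            u = q.popleft()
            for w in adj[u]:
                # Edge (u, w) lies in E_v^t iff min(d(u), d(w)) + 1 <= t;
                # each ball edge is counted twice, once per endpoint.
                if min(dist[u], dist.get(w, t)) + 1 <= t:
                    edges += 1
                if w not in dist and dist[u] + 1 <= t:
                    dist[w] = dist[u] + 1
                    q.append(w)
        if edges // 2 > len(dist) - 1:   # more edges than a tree allows
            return False
    return True
```

On a triangle, $t=1$ certifies $g>2$, while $t=2$ detects that some ball contains a $C_{\leq 4}$ (the triangle itself).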

We now show that if the input graph satisfies a certain sparsity property then the running time of $\texttt{AllVtxBallOrCycle}(G,t)$ can be bounded as follows.

Corollary 3.1.

If $|E_{u}^{t-1}|<D^{t-1}$ for every $u\in V$ then $\texttt{AllVtxBallOrCycle}(G,t)$ runs in $O(mD^{t-1})$ time.

Proof.

For every $v\in V$, we have $O(|V_{v}^{t}|)\leq O(|E_{v}^{t}|)\leq O(\sum_{u\in N(v)}|E_{u}^{t-1}|)$, which is at most $O(\sum_{u\in N(v)}D^{t-1})=O(deg(v)\cdot D^{t-1})$. Thus, by Lemma 3.3, the running time of $\texttt{AllVtxBallOrCycle}(G,t)$ is $O(\sum_{v\in V}|V_{v}^{t}|)\leq O(\sum_{v\in V}deg(v)\cdot D^{t-1})=O(mD^{t-1})$. ∎

Next, we present procedure $\texttt{IsDense}(G,w,T,r)$ from [7]. IsDense (see Algorithm 3) gets a graph $G$, a vertex $w$, a budget $T\geq 1$ (a real number) and a distance $r\geq 0$ (an integer). The procedure executes a BFS from $w$ that counts the scanned edges, as long as their total number is less than $T$ and the farthest vertex from $w$ is at distance at most $r$.

$T^{\prime}\leftarrow 0$, $\ell_{c}\leftarrow 1$, $\ell_{n}\leftarrow 0$, $j\leftarrow 0$;
$Q\leftarrow\{w\}$;
while $Q$ is not empty and $j<r$ and $T^{\prime}<T$ do
    $u\leftarrow dequeue(Q)$;
    $\ell_{c}\leftarrow\ell_{c}-1$, $i\leftarrow 1$;
    while $i\leq|E(u)|$ and $T^{\prime}<T$ do
        $(u,v)\leftarrow E(u,i)$;
        remove $(u,v)$ from $E(v)$;
        $T^{\prime}\leftarrow T^{\prime}+1$;
        if $v$ is not marked then
            $Q\leftarrow Q\cup\{v\}$, mark $v$, $\ell_{n}\leftarrow\ell_{n}+1$;
        $i\leftarrow i+1$;
    if $\ell_{c}=0$ then
        $\ell_{c}\leftarrow\ell_{n}$, $\ell_{n}\leftarrow 0$;
        $j\leftarrow j+1$;
if $T^{\prime}\geq T$ then return Yes;
else return No;
Algorithm 3: $\texttt{IsDense}(G,w,T,r)$ [7]
Lemma 3.4 ([7]).

Procedure $\texttt{IsDense}(G,w,T,r)$ runs in $O(\lceil T\rceil)=O(T)$ time. If $\texttt{IsDense}(G,w,T,r)=\texttt{No}$ then $|E_{w}^{r}|<T$. If $\texttt{IsDense}(G,w,T,r)=\texttt{Yes}$ then $|E_{w}^{r}|\geq T$.
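The behavior guaranteed by Lemma 3.4 can be sketched as follows (an illustrative re-implementation over adjacency lists; unlike Algorithm 3 it does not destructively remove edges from the graph, and instead deduplicates them with a set):

```python
from collections import deque

def is_dense(adj, w, T, r):
    """Budgeted BFS from w: return True iff at least T distinct edges are
    scanned within distance r of w, i.e., |E_w^r| >= T. Assumes a simple
    graph given as an adjacency list."""
    scanned = 0
    dist = {w: 0}
    q = deque([w])
    seen_edges = set()
    while q and scanned < T:
        u = q.popleft()
        if dist[u] >= r:        # all edges of E_w^r have a nearer
            break               # endpoint at distance <= r - 1
        for v in adj[u]:
            if frozenset((u, v)) in seen_edges:
                continue        # each undirected edge counts once
            seen_edges.add(frozenset((u, v)))
            scanned += 1
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
            if scanned >= T:
                break
    return scanned >= T
```

For a star with five leaves, $|E_w^1|=5$ at the center, so the budget $T=5$ is exhausted but $T=6$ is not.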

Given a vertex $v$ and a distance $R$, we sometimes want to bound $|E_{v}^{R}|$. Therefore, we adapt a lemma and a corollary of [7] from vertices to edges.

Lemma 3.5.

Let $x,y$ be positive integers, let $D\geq 1$ be a real number, and let $w\in V$. If $|E_{w}^{x}|<D^{x}$, and $|E_{u}^{y}|<D^{y}$ for every $u\in V$, then $|E_{w}^{x+y}|<D^{x+y}$.

Proof.

Let $w\in V$ and assume that $|E_{w}^{x}|<D^{x}$. We also know that $|E_{u}^{y}|<D^{y}$ for every $u\in V$. We denote $L=V_{w}^{x}\setminus\{w\}$. If $L=\emptyset$ then $|E_{w}^{x}|=|E_{w}^{x+y}|=0\leq D^{x+y}$, as required. Now assume that $L\neq\emptyset$. Since $y\geq 1$, $E_{w}^{x+y}=\bigcup_{u\in L}E_{u}^{y}$. Therefore, $|E_{w}^{x+y}|=|\bigcup_{u\in L}E_{u}^{y}|$. As the ball graph $B(w,x)=(V_{w}^{x},E_{w}^{x})$ is connected, we know that $|V_{w}^{x}|\leq|E_{w}^{x}|+1<D^{x}+1$, and since $x>0$, $|L|=|V_{w}^{x}|-1<(D^{x}+1)-1=D^{x}$. Thus, we get that $|E_{w}^{x+y}|=|\bigcup_{u\in L}E_{u}^{y}|\leq\sum_{u\in L}|E_{u}^{y}|<\sum_{u\in L}D^{y}=|L|\cdot D^{y}<D^{x}\cdot D^{y}=D^{x+y}$, so $|E_{w}^{x+y}|<D^{x+y}$. ∎

Using Lemma 3.5, we prove the following corollary.

Corollary 3.2.

Let xx be a positive integer and let D1D\geq 1 be a real number. If |Ewx|<Dx|E_{w}^{x}|<D^{x} for every wVw\in V, then |Ewix|<Dix|E_{w}^{ix}|<D^{ix}, for every wVw\in V and i1i\geq 1.

Proof.

The proof is by induction on $i$. For $i=1$ the claim is exactly our assumption that $|E_w^x|<D^x$. Assume now that the claim holds for $i-1$, that is, $|E_w^{(i-1)x}|<D^{(i-1)x}$ for every $w\in V$. Combining this with the fact that $|E_w^x|<D^x$ for every $w\in V$, we get by Lemma 3.5 (applied with $(i-1)x$ and $x$) that $|E_w^{ix}|<D^{ix}$, for every $w\in V$. ∎

We also adapt procedure SparseOrCycle(G,D,x,y)\texttt{SparseOrCycle}(G,D,x,y) of [7] to our needs. SparseOrCycle (see Algorithm 4) gets a graph GG, a parameter D1D\geq 1, and two integers x,y>0x,y>0, and iterates over vertices using a for-each loop. Let ww be the vertex currently considered and G(w)G(w) the current graph. If IsDense(G(w),w,Dx,x)=Yes\texttt{IsDense}(G(w),w,D^{x},x)=\texttt{Yes} then BallOrCycle(G(w),w,x1+y)\texttt{BallOrCycle}(G(w),w,x-1+y) is called. If BallOrCycle returns a cycle CC then CC is returned by SparseOrCycle. Otherwise, the vertex set Vwx1V_{w}^{x-1} is removed from G(w)G(w) along with the edge set EwxE_{w}^{x}. After the loop ends, if no cycle was found, we return null. Let WVW\subseteq V be the set of vertices for which BallOrCycle was called and no cycle was found, and G^=(V^,E^)\hat{G}=(\hat{V},\hat{E}) the graph after SparseOrCycle ends.

foreach wVw\in V do
 if IsDense(G,w,Dx,x)=Yes\texttt{IsDense}(G,w,D^{x},x)=\texttt{Yes} then
    (Vwx1+y,C)BallOrCycle(w,x1+y)(V_{w}^{x-1+y},C)\leftarrow\texttt{BallOrCycle}(w,x-1+y);
    if CnullC\neq\texttt{null} then return CC;
    GGVwx1G\leftarrow G\setminus V_{w}^{x-1};
    
 
return null;
Algorithm 4 SparseOrCycle(G,D,x,y)\texttt{SparseOrCycle}(G,D,x,y)

The following lemma is similar to the corresponding lemmas from [7], and the proof is omitted.

Lemma 3.6.

SparseOrCycle(G,D,x,y)\texttt{SparseOrCycle}(G,D,x,y) satisfies the following:

  1. (i)

    If a cycle $C$ is returned then $wt(C)\leq 2(x-1+y)$.

  2. (ii)

    If a cycle is not returned then $|E_u^x|<D^x$, for every $u\in\hat{G}$.

  3. (iii)

    If $u\in G\setminus\hat{G}$ then $u$ is not part of a $C_{\leq 2y}$ in $G$.

  4. (iv)

    SparseOrCycle(G,D,x,y)\texttt{SparseOrCycle}(G,D,x,y) runs in O(nDx+wW(|Vwx1+y|))O(nD^{x}+\sum_{w\in W}(|V_{w}^{x-1+y}|)) time.

Similarly to AllVtxBallOrCycle, we show for SparseOrCycle that if GG satisfies a certain sparsity property, the running time can be bounded as follows.

Corollary 3.3.

If |Eux1+y|<Dx1+y|E_{u}^{x-1+y}|<D^{x-1+y} for every vertex uVu\in V then SparseOrCycle(G,D,x,y)\texttt{SparseOrCycle}(G,D,x,y) runs in O(nDx+mDy1)O(nD^{x}+mD^{y-1}) time.

Proof.

By Lemma 3.6, SparseOrCycle(G,D,x,y)\texttt{SparseOrCycle}(G,D,x,y) runs in O(|V|Dx+wW(|Vwx1+y|))O(|V|D^{x}+\sum_{w\in W}(|V_{w}^{x-1+y}|)) time. For every wWw\in W, the call to IsDense(G,w,Dx,x)\texttt{IsDense}(G,w,D^{x},x) returned Yes, so it follows from Lemma 3.4 that |Ewx|Dx|E_{w}^{x}|\geq D^{x}. The edge set EwxE_{w}^{x} is removed while removing Vwx1V_{w}^{x-1}. Therefore, for every wWw\in W we remove at least DxD^{x} edges. Since at most mm edges can be removed, the size of WW is at most mDx\frac{m}{D^{x}}. By our assumption, we have O(|Vwx1+y|)O(|Ewx1+y|)O(Dx1+y)O(|V_{w}^{x-1+y}|)\leq O(|E_{w}^{x-1+y}|)\leq O(D^{x-1+y}). Therefore, we get that O(wW(|Vwx1+y|))O(|W|Dx1+y)O(mDxDx1+y)=O(mDy1)O(\sum_{w\in W}(|V_{w}^{x-1+y}|))\leq O(|W|\cdot D^{x-1+y})\leq O(\frac{m}{D^{x}}\cdot D^{x-1+y})=O(mD^{y-1}). Thus, the running time of SparseOrCycle is O(nDx+mDy1)O(nD^{x}+mD^{y-1}). ∎

Finally, we include a standard lemma about sampling a hitting set (see, e.g., [1], [8], [10]).

Lemma 3.7.

It is possible to obtain in O(m)O(m) time, using sampling, a set of edges SS of size Θ~(ms)\tilde{\Theta}(\frac{m}{s}), that hits, whp, the ss closest edges of every vVv\in V.
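A standard way to obtain such a set is independent sampling. The following hedged Python sketch (our own illustration; the oversampling constant `c` is an assumption controlling the failure probability) keeps each edge with probability roughly $c\ln m/s$, so the sample has expected size $\tilde{\Theta}(\frac{m}{s})$ and, by a union bound over the vertices, hits the $s$ closest edges of every vertex whp:

```python
import math
import random

def sample_hitting_set(edges, s, c=3.0):
    # Keep each edge independently with probability min(1, c*ln(m)/s).
    # For any fixed set of s edges, the probability that none is kept is
    # at most (1 - c*ln(m)/s)^s <= m^{-c}, so a union bound over the n
    # candidate sets (one per vertex) gives a hitting set whp.
    m = len(edges)
    p = min(1.0, c * math.log(max(m, 2)) / s)
    return [e for e in edges if random.random() < p]
```

For small $s$ the probability saturates at $1$ and the whole edge set is returned, which trivially hits everything; the tradeoff is only interesting when $s\gg\log m$.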

We remark that some of our algorithms get a graph GG that is being updated during their run. Within their scope, GG denotes the current graph that includes all updates done so far.

4 A new cycle searching technique

Consider a vertex $s\in V$. It is straightforward to check whether $s$ is on a $C_{\leq 2k}$, for every integer $k$, using $\texttt{BallOrCycle}(G,s,k)$ in $O(n)$ time. If $\texttt{BallOrCycle}(G,s,k)$ does not return a $C_{\leq 2k}$ then for every $x,y\in N(s)$ it holds that $V_x^{k-1}(G')\cap V_y^{k-1}(G')=\emptyset$, where $G'=G\setminus\{s\}$, as otherwise there would be a $C_{\leq 2k}$ passing through $s$ and $\texttt{BallOrCycle}(G,s,k)$ would have returned it. We show that it is possible to exploit this property to check, for every $v\in N(s)$, whether $v$ is on a $C_{\leq 2k}$, using $\texttt{BallOrCycle}(G',v,k)$, in $O(n+m)$ time instead of $O(deg(s)\cdot n)$. More specifically, we present algorithm $\texttt{NbrBallOrCycle}(G,s,k)$ (see Algorithm 5) that gets a graph $G$, a vertex $s$, and an integer $k\geq 2$. We first initialize $\hat{U}$ to $\emptyset$ and run $\texttt{BallOrCycle}(G,s,k)$. If a cycle $C$ is found by $\texttt{BallOrCycle}(G,s,k)$ then $C$ is returned by NbrBallOrCycle. Otherwise, we add the vertex $s$ to $\hat{U}$, keep the neighbors of $s$ in $N_s$, and then remove $s$ from $G$. Recall that $G'$ equals $G\setminus\{s\}$. Next, for every $v\in N_s$ we run $\texttt{BallOrCycle}(G',v,k)$, as long as a cycle is not found. If a cycle $C$ is returned by $\texttt{BallOrCycle}(G',v,k)$ then $C$ is returned by NbrBallOrCycle. (For our needs it suffices to stop and return a cycle passing through a neighbor of $s$ once we find one, though BallOrCycle can be run from all the neighbors of $s$ in the same running time bound of $O(n+m)$.) Otherwise, $v$ is added to $\hat{U}$.
After the loop ends, the vertex ss and its adjacent edges are added back to the graph, and the set U^\hat{U} is returned by NbrBallOrCycle. We prove the following lemma.

U^\hat{U}\leftarrow\emptyset;
(Vsk,C)BallOrCycle(s,k)(V_{s}^{k},C)\leftarrow\texttt{BallOrCycle}(s,k);
if CnullC\neq\texttt{null} then return (,C)(\emptyset,C);
U^U^{s}\hat{U}\leftarrow\hat{U}\cup\{s\};
NsN(s)N_{s}\leftarrow N(s);
GG{s}G\leftarrow G\setminus\{s\};
foreach vNsv\in N_{s} do
 (Vvk,C)BallOrCycle(v,k)(V_{v}^{k},C)\leftarrow\texttt{BallOrCycle}(v,k);
 
 if CnullC\neq\texttt{null} then return (,C)(\emptyset,C);
 
 U^U^{v}\hat{U}\leftarrow\hat{U}\cup\{v\} ;
 
add ss and the edge set {(s,v)vNs}\{(s,v)\mid v\in N_{s}\} back to GG;
return (U^,null);(\hat{U},\texttt{null});
Algorithm 5 NbrBallOrCycle(G,s,k)\texttt{NbrBallOrCycle}(G,s,k)
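The routine can be sketched in Python as follows (a hedged illustration under stated assumptions: simple undirected graphs given as adjacency lists, and a `ball_or_cycle` helper that plays the role of BallOrCycle, reporting a cycle as soon as a non-tree edge inside the ball is met, since such an edge closes a cycle of length at most $2k$):

```python
from collections import deque

def ball_or_cycle(adj, removed, s, k):
    # Stand-in for BallOrCycle on the graph minus `removed`: BFS from s
    # to depth k; a non-tree edge met inside the ball closes a cycle of
    # length <= 2k, so we report it immediately.  Otherwise the ball
    # V_s^k is returned.
    dist, parent = {s: 0}, {s: None}
    q = deque([s])
    while q:
        u = q.popleft()
        if dist[u] == k:
            continue
        for v in adj[u]:
            if v in removed or v == parent[u]:
                continue
            if v in dist:
                return None, True      # cycle of length <= 2k found
            dist[v] = dist[u] + 1
            parent[v] = u
            q.append(v)
    return set(dist), False

def nbr_ball_or_cycle(adj, s, k):
    # Sketch of Algorithm 5: search from s first; if no short cycle was
    # found, delete s and search from each neighbor, collecting U_hat,
    # which then equals V_s^1 (Lemma 4.1).
    _, found = ball_or_cycle(adj, set(), s, k)
    if found:
        return None, True
    U = {s}
    for v in adj[s]:
        _, found = ball_or_cycle(adj, {s}, v, k)
        if found:
            return None, True
        U.add(v)
    return U, False
```

On a triangle the first search from $s$ already reports a cycle; on a tree no search does, and the returned set is exactly $V_s^1$.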
Lemma 4.1.

If algorithm NbrBallOrCycle(G,s,k)\texttt{NbrBallOrCycle}(G,s,k) finds a cycle CC then wt(C)2kwt(C)\leq 2k. Otherwise, no vertex in Vs1V_{s}^{1} is part of a C2kC_{\leq 2k} in GG, and the set U^=Vs1\hat{U}=V_{s}^{1} is returned.

Proof.

If a cycle was found, it happened during the call to BallOrCycle(G,s,k)\texttt{BallOrCycle}(G,s,k) or one of the calls to BallOrCycle(G,v,k)\texttt{BallOrCycle}(G^{\prime},v,k). Therefore, by Lemma 3.2, the cycle length is at most 2k2k.

If no cycle was found during the run of BallOrCycle(G,s,k)\texttt{BallOrCycle}(G,s,k) then ss is not on a C2kC_{\leq 2k} in GG. In addition, for every vNsv\in N_{s} BallOrCycle(G,v,k)\texttt{BallOrCycle}(G^{\prime},v,k) did not find a cycle. Hence, vv is not on a C2kC_{\leq 2k} in GG^{\prime} and therefore also in GG, since G=G{s}G^{\prime}=G\setminus\{s\} and ss is not on a C2kC_{\leq 2k}. Since no cycle was found, ss and the vertices vv for every vNsv\in N_{s} are added to U^\hat{U}, so by the definition of Vs1V_{s}^{1} we have U^=Vs1\hat{U}=V_{s}^{1}. ∎

To bound the running time of NbrBallOrCycle, we show how to use the fact that no C2kC_{\leq 2k} was found by BallOrCycle(G,s,k)\texttt{BallOrCycle}(G,s,k), to efficiently run BallOrCycle(G,v,k)\texttt{BallOrCycle}(G^{\prime},v,k) for every vNsv\in N_{s}.

Lemma 4.2.

Let sVs\in V. If the ball graph B(G,s,k)B(G,s,k) is a tree then the total cost of running BallOrCycle(G,v,k)\texttt{BallOrCycle}(G^{\prime},v,k) for every vNsv\in N_{s} is O(|Vsk(G)|+wLsk(G)deg(w))O(|V_{s}^{k}(G)|+\sum_{w\in L_{s}^{k}(G)}deg(w)).

Proof.

By the definitions of $V_s^k$, $L_s^k$ and $G'$, we know that $V_s^k(G)=\bigcup_{v\in N_s}V_v^{k-1}(G')\cup\{s\}$ and $L_s^k(G)=\bigcup_{v\in N_s}L_v^{k-1}(G')$. Since $B(G,s,k)$ contains no cycles it follows that $V_x^{k-1}(G')\cap V_y^{k-1}(G')=\emptyset$ and $L_x^{k-1}(G')\cap L_y^{k-1}(G')=\emptyset$, for any two distinct vertices $x,y\in N_s$. Therefore, (i) $|V_s^k(G)|=\sum_{v\in N_s}|V_v^{k-1}(G')|+1$, and (ii) $L_s^k(G)$ is the disjoint union $\dot\bigcup_{v\in N_s}L_v^{k-1}(G')$. From Lemma 3.2 it follows that the total cost of the calls to $\texttt{BallOrCycle}(G',v,k)$ for every $v\in N_s$ is $O(\sum_{v\in N_s}|V_v^k(G')|)$. It holds that $|V_v^k(G')|\leq|V_v^{k-1}(G')|+\sum_{w\in L_v^{k-1}(G')}deg(w)$, for every $v\in N_s$. Thus, the total cost is $O(\sum_{v\in N_s}(|V_v^{k-1}(G')|+\sum_{w\in L_v^{k-1}(G')}deg(w)))$. This equals $O(\sum_{v\in N_s}|V_v^{k-1}(G')|+\sum_{v\in N_s}\sum_{w\in L_v^{k-1}(G')}deg(w))$, and it follows from (i) and (ii) that this is at most $O(|V_s^k(G)|+\sum_{w\in L_s^k(G)}deg(w))$. ∎

We use Lemma 4.2 to bound the running time of NbrBallOrCycle.

Lemma 4.3.

Algorithm NbrBallOrCycle runs in O(n+m)O(n+m) time.

Proof.

Running $\texttt{BallOrCycle}(G,s,k)$ costs $O(n)$. If a cycle is found, it is returned and the running time is $O(n)=O(n+m)$. If no cycle is found then the ball graph $B(G,s,k)$ is a tree, and removing (and later adding back) $s$ and its edges costs $O(deg(s))$. By Lemma 4.2, the cost of running $\texttt{BallOrCycle}(G',v,k)$ for every $v\in N_s$ is $O(|V_s^k(G)|+\sum_{w\in L_s^k(G)}deg(w))=O(n+m)$. Adding $s$ and $N_s$ to $\hat{U}$ takes $O(|V_s^1|)=O(n)$ time. Thus, the total running time of NbrBallOrCycle is $O(n+m)$. ∎

5 A 2k2k-hybrid algorithm and a (+1)(+1)-approximation of the girth

In this section we first show how to use algorithm NbrBallOrCycle from the previous section to obtain a 2k2k-hybrid algorithm that in O(m1+k1k+1)O(m^{1+\frac{k-1}{k+1}}) time, either returns a C2kC_{\leq 2k} or determines that g>2kg>2k. Then, we use the 2k2k-hybrid algorithm to compute a (+1)(+1)-approximation of gg.

5.1 A 2k2k-hybrid algorithm

We first present algorithm 2-SparseOrCycle(G,k)\texttt{$2$-SparseOrCycle}(G,k) that gets a graph GG and an integer k2k\geq 2. Let G0G_{0} (G1G_{1}) be GG before (after) running 22-SparseOrCycle. 2-SparseOrCycle(G,k)\texttt{$2$-SparseOrCycle}(G,k) either finds a C2kC_{\leq 2k} or removes vertices that are not on a C2kC_{\leq 2k}, such that for every uG1u\in G_{1}, the ball graph B(G1,u,2)B(G_{1},u,2) is relatively sparse, that is, |Eu2(G1)|<m2k+1|E_{u}^{2}(G_{1})|<m^{\frac{2}{k+1}}. 22-SparseOrCycle (see Algorithm 6) iterates over vertices using a for-each loop. Let ss be the vertex currently considered. If vN(s)deg(v)m2k+1\sum_{v\in N(s)}deg(v)\geq m^{\frac{2}{k+1}} then NbrBallOrCycle(G,s,k)\texttt{NbrBallOrCycle}(G,s,k) is called. If NbrBallOrCycle returns a cycle CC then 22-SparseOrCycle returns CC. If NbrBallOrCycle returns a vertex set U^\hat{U} then U^\hat{U} is removed from GG. After the loop ends, if no cycle was found, we return null.

foreach sVs\in V do
 if  vN(s)deg(v)m2k+1\sum_{v\in N(s)}deg(v)\geq m^{\frac{2}{k+1}}  then
    
    (U^,C)NbrBallOrCycle(G,s,k)(\hat{U},C)\leftarrow\texttt{NbrBallOrCycle}(G,s,k);
    if CnullC\neq null then return CC;
    GGU^G\leftarrow G\setminus\hat{U};
    
 
return null;
Algorithm 6 2-SparseOrCycle(G,k)\texttt{$2$-SparseOrCycle}(G,k)
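The loop structure can be sketched in Python as follows (a hedged illustration, not the authors' code: `nbr` stands for an assumed implementation of NbrBallOrCycle on the current graph, returning a pair $(\hat{U},C)$ as in Algorithm 5, and vertex removal is simulated with a `removed` set):

```python
def two_sparse_or_cycle(adj, m, k, nbr):
    # Sketch of 2-SparseOrCycle(G, k): for each surviving vertex s whose
    # 2-neighborhood is dense (sum of neighbor degrees >= m^{2/(k+1)}),
    # call nbr(adj, removed, s, k); return its cycle if one is found,
    # otherwise delete the returned set U_hat.
    threshold = m ** (2.0 / (k + 1))
    removed = set()
    for s in list(adj):
        if s in removed:
            continue
        deg_sum = sum(
            sum(1 for x in adj[v] if x not in removed)
            for v in adj[s] if v not in removed
        )
        if deg_sum >= threshold:
            U, C = nbr(adj, removed, s, k)
            if C is not None:
                return C
            removed |= U
    return None
```

The degree sums are recomputed on the residual graph here for simplicity; the paper's charging argument (Lemma 5.1(iv)) is what makes the real procedure run in $O(m^{1+\frac{k-1}{k+1}})$ time.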

Remark. Notice that SparseOrCycle(G,m2k+1,2,k)\texttt{SparseOrCycle}(G,m^{\frac{2}{k+1}},2,k) either finds a C2k+2C_{\leq 2k+2} or removes vertices that are not on a C2kC_{\leq 2k}, such that for every uG1u\in G_{1} it holds that |Eu2(G1)|<m2k+1|E_{u}^{2}(G_{1})|<m^{\frac{2}{k+1}}. Using NbrBallOrCycle instead of BallOrCycle in 22-SparseOrCycle enables us in the case that a cycle is found to bound the cycle length with 2k2k rather than 2k+22k+2, while still maintaining the property that |Eu2(G1)|<m2k+1|E_{u}^{2}(G_{1})|<m^{\frac{2}{k+1}}, for every uG1u\in G_{1}, in the case that no cycle is found.

We prove the following lemma.

Lemma 5.1.

2-SparseOrCycle(G,k)\texttt{$2$-SparseOrCycle}(G,k) satisfies the following:

  1. (i)

    If a cycle $C$ is returned then $wt(C)\leq 2k$.

  2. (ii)

    If a cycle is not returned then $|E_u^2|<m^{\frac{2}{k+1}}$, for every $u\in G_1$.

  3. (iii)

    If $u\in G_0\setminus G_1$ then $u$ is not part of a $C_{\leq 2k}$ in $G_0$.

  4. (iv)

    2-SparseOrCycle(G,k)\texttt{$2$-SparseOrCycle}(G,k) runs in O(m1+k1k+1)O(m^{1+\frac{k-1}{k+1}}) time.

Proof.
  1. (i)

    Since 22-SparseOrCycle returns a cycle CC only if a call to NbrBallOrCycle(G,s,k)\texttt{NbrBallOrCycle}(G,s,k) returns a cycle CC, it follows from Lemma 4.1 that wt(C)2kwt(C)\leq 2k.

  2. (ii)

    Let $u\in G_1$. Since $u$ was not removed, $u$ was considered in the for-each loop at some stage during the execution of $2$-SparseOrCycle. At this stage $\sum_{v\in N(u)}deg(v)<m^{\frac{2}{k+1}}$, as otherwise, since no cycle was returned, by Lemma 4.1 the call to NbrBallOrCycle with $u$ would have returned $\hat{U}=V_u^1$, so $u$ would have been removed while removing $\hat{U}$. Since $|E_u^2|\leq\sum_{v\in N(u)}deg(v)$ we have $|E_u^2|<m^{\frac{2}{k+1}}$ at that stage. As edges can only be removed during the run of $2$-SparseOrCycle, we have $|E_u^2|<m^{\frac{2}{k+1}}$ also in $G_1$.

  3. (iii)

    Since uG0G1u\in G_{0}\setminus G_{1} it follows that there was a vertex ss such that uU^u\in\hat{U} after a call to NbrBallOrCycle(G,s,k)\texttt{NbrBallOrCycle}(G,s,k) did not return a cycle. By Lemma 4.1, no vertex in U^\hat{U} is part of a C2kC_{\leq 2k} in GG. Therefore, uu is not part of a C2kC_{\leq 2k} in GG. Since during the run of 22-SparseOrCycle we remove only vertices that are not part of a C2kC_{\leq 2k}, uu is not part of a C2kC_{\leq 2k} also in G0G_{0}.

  4. (iv)

    Computing $\sum_{v\in N(s)}deg(v)$ takes $O(|N(s)|)=O(deg(s))$ time, as all the degrees can be computed in advance in $O(m)$ time. We compute this value for at most $n$ distinct vertices, so the running time of this part is at most $O(\sum_{v\in V}deg(v))=O(m)$ in total. By Lemma 4.3, running NbrBallOrCycle takes $O(n+m)=O(m)$ time. Each edge in $E_s^2$ contributes at most $2$ to the sum $\sum_{v\in N(s)}deg(v)$, so $\frac{1}{2}\sum_{v\in N(s)}deg(v)\leq|E_s^2|$. If a call to $\texttt{NbrBallOrCycle}(G,s,k)$ did not return a cycle, then by Lemma 4.1, the set $\hat{U}=V_s^1$ is returned. $2$-SparseOrCycle removes the set $\hat{U}=V_s^1$ and by doing so, the edge set $E_s^2$ is also removed. We charge each edge of $E_s^2$ with $O(m^{\frac{k-1}{k+1}})$. Thus the total cost charged for $s$ is $|E_s^2|\cdot m^{\frac{k-1}{k+1}}\geq\frac{1}{2}\sum_{v\in N(s)}deg(v)\cdot m^{\frac{k-1}{k+1}}\geq\frac{1}{2}m^{\frac{2}{k+1}}\cdot m^{\frac{k-1}{k+1}}=\frac{1}{2}m$, which covers the $O(m)$ cost of $\texttt{NbrBallOrCycle}(G,s,k)$.

    Since each edge can be charged and removed from GG at most once during the execution of 22-SparseOrCycle, the running time of 22-SparseOrCycle is at most O(mmk1k+1)=O(m1+k1k+1)O(m\cdot m^{\frac{k-1}{k+1}})=O(m^{1+\frac{k-1}{k+1}}). ∎

Next, we use 22-SparseOrCycle to design a 2k2k-hybrid algorithm called 2k2k-Hybrid. Notice first that if |Euk1|<mk1k+1|E_{u}^{k-1}|<m^{\frac{k-1}{k+1}} for every uVu\in V, then it is straightforward to obtain an O(m1+k1k+1)O(m^{1+\frac{k-1}{k+1}})-time 2k2k-hybrid algorithm, by running AllVtxBallOrCycle(G,k)\texttt{AllVtxBallOrCycle}(G,k). Thus, in 2k2k-Hybrid we ensure that if we call AllVtxBallOrCycle then it holds for every uVu\in V that |Euk1|<mk1k+1|E_{u}^{k-1}|<m^{\frac{k-1}{k+1}}. To do so, we run 22-SparseOrCycle and possibly SparseOrCycle. If no cycle was returned then it holds that |Euk1|<mk1k+1|E_{u}^{k-1}|<m^{\frac{k-1}{k+1}} for every uVu\in V, and we can safely run AllVtxBallOrCycle.

2k2k-Hybrid (see Algorithm 7) gets a graph GG and an integer k2k\geq 2. 2k2k-Hybrid is composed of three stages. In the first stage we call 2-SparseOrCycle(G,k)\texttt{$2$-SparseOrCycle}(G,k). If 22-SparseOrCycle returns a cycle CC then 2k2k-Hybrid stops and returns CC, otherwise we proceed to the second stage. In the second stage, if (k1)mod20(k-1)\bmod 2\neq 0, we call SparseOrCycle(G,m1k+1,1,k)\texttt{SparseOrCycle}(G,m^{\frac{1}{k+1}},1,k). If SparseOrCycle returns a cycle CC then 2k2k-Hybrid stops and returns CC, otherwise we proceed to the last stage. In the last stage, we call AllVtxBallOrCycle(G,k)\texttt{AllVtxBallOrCycle}(G,k). If AllVtxBallOrCycle returns a cycle CC then 2k2k-Hybrid stops and returns CC, otherwise 2k2k-Hybrid returns null. In the next lemma we prove the correctness and analyze the running time of 2k-Hybrid(G,k)\texttt{$2k$-Hybrid}(G,k).

C2-SparseOrCycle(G,k)C\leftarrow\texttt{$2$-SparseOrCycle}(G,k);
if CnullC\neq\texttt{null} then return CC;
if (k1)mod20(k-1)\bmod 2\neq 0 then
 
 CSparseOrCycle(G,m1k+1,1,k)C\leftarrow\texttt{SparseOrCycle}(G,m^{\frac{1}{k+1}},1,k)
 if CnullC\neq\texttt{null} then return CC;
 
CAllVtxBallOrCycle(G,k)C\leftarrow\texttt{AllVtxBallOrCycle}(G,k);
if CnullC\neq\texttt{null} then return CC;
return null;
Algorithm 7 2k-Hybrid(G,k)\texttt{$2k$-Hybrid}(G,k)
Lemma 5.2.

2k-Hybrid(G,k)\texttt{$2k$-Hybrid}(G,k) either returns a C2kC_{\leq 2k} or determines that g>2kg>2k, in O(m1+k1k+1)O(m^{1+\frac{k-1}{k+1}}) time.

Proof.

First, 2k-Hybrid(G,k)\texttt{$2k$-Hybrid}(G,k) returns a cycle CC only if 2-SparseOrCycle(G,k)\texttt{$2$-SparseOrCycle}(G,k), SparseOrCycle(G,m1k+1,1,k)\texttt{SparseOrCycle}(G,m^{\frac{1}{k+1}},1,k) or AllVtxBallOrCycle(G,k)\texttt{AllVtxBallOrCycle}(G,k) returns a cycle CC. Therefore, if a cycle is returned then it follows from Lemmas 5.1, 3.6, or 3.3, respectively, that wt(C)2kwt(C)\leq 2k.

Second, we show that if no cycle was found then g>2kg>2k. If no cycle was found then it might be that some vertices were removed from the graph. A vertex uu can be removed either by SparseOrCycle or by 22-SparseOrCycle. It follows from Lemma 3.6 and Lemma 5.1, that uu is not part of a C2kC_{\leq 2k} when uu is removed. Since only vertices that are not on a C2kC_{\leq 2k} are removed, every C2kC_{\leq 2k} that was in the input graph also belongs to the updated graph. We then call AllVtxBallOrCycle(G,k)\texttt{AllVtxBallOrCycle}(G,k) with the updated graph. Since we are in the case that no cycle was found, AllVtxBallOrCycle did not return a cycle. It follows from Lemma 3.3 that g>2kg>2k in the updated graph, and therefore g>2kg>2k also in the input graph.

Now we turn to analyze the running time of 2k2k-Hybrid. At the beginning, 2k2k-Hybrid calls 22-SparseOrCycle. By Lemma 5.1, 22-SparseOrCycle runs in O(m1+k1k+1)O(m^{1+\frac{k-1}{k+1}}) time. Let G1G_{1} be the graph after the call to 22-SparseOrCycle. By Lemma 5.1, for every uG1u\in G_{1} we have |Eu2|<m2k+1|E_{u}^{2}|<m^{\frac{2}{k+1}}.

Next, 2k2k-Hybrid checks if (k1)mod20(k-1)\bmod 2\neq 0. We divide the rest of the proof to the case that (k1)mod2=0(k-1)\bmod 2=0 and to the case that (k1)mod20(k-1)\bmod 2\neq 0. If (k1)mod2=0(k-1)\bmod 2=0 then 2k2k-Hybrid calls AllVtxBallOrCycle. Since (k1)mod2=0(k-1)\bmod 2=0 and since for every uG1u\in G_{1} we have |Eu2|<m2k+1|E_{u}^{2}|<m^{\frac{2}{k+1}}, it follows from Corollary 3.2 that |Euk1|<mk1k+1|E_{u}^{k-1}|<m^{\frac{k-1}{k+1}} for every uG1u\in G_{1}. By Corollary 3.1, the running time of AllVtxBallOrCycle is O(mmk1k+1)=O(m1+k1k+1)O(m\cdot m^{\frac{k-1}{k+1}})=O(m^{1+\frac{k-1}{k+1}}).

We now turn to the case that (k1)mod20(k-1)\bmod 2\neq 0. In this case it might be that |Euk1|mk1k+1|E_{u}^{k-1}|\geq m^{\frac{k-1}{k+1}} for some vertices uG1u\in G_{1}. Therefore, we first call SparseOrCycle(G1,m1k+1,1,k)\texttt{SparseOrCycle}(G_{1},m^{\frac{1}{k+1}},1,k). Notice that since we are in the case that (k1)mod20(k-1)\bmod 2\neq 0 it holds that kmod2=0k\bmod 2=0. Moreover, |Eu2|<m2k+1|E_{u}^{2}|<m^{\frac{2}{k+1}} for every uG1u\in G_{1}. Thus, it follows from Corollary 3.2 that |Euk|<mkk+1|E_{u}^{k}|<m^{\frac{k}{k+1}} for every uG1u\in G_{1}. By Corollary 3.3 if |Euk|<mkk+1|E_{u}^{k}|<m^{\frac{k}{k+1}} for every uG1u\in G_{1} then the running time of SparseOrCycle(G1,m1k+1,1,k)\texttt{SparseOrCycle}(G_{1},m^{\frac{1}{k+1}},1,k) is O(nm1k+1+mmk1k+1)=O(m1+k1k+1)O(nm^{\frac{1}{k+1}}+mm^{\frac{k-1}{k+1}})=O(m^{1+\frac{k-1}{k+1}}).

Let G2G_{2} be the graph after SparseOrCycle ends. By Lemma 3.6, for every uG2u\in G_{2} we have |Eu1|<m1k+1|E_{u}^{1}|<m^{\frac{1}{k+1}}. It follows from Corollary 3.2 that |Euk1|<mk1k+1|E_{u}^{k-1}|<m^{\frac{k-1}{k+1}}. Now 2k2k-Hybrid calls AllVtxBallOrCycle, and using Corollary 3.1 again, we get that the running time of AllVtxBallOrCycle is O(mmk1k+1)=O(m1+k1k+1)O(m\cdot m^{\frac{k-1}{k+1}})=O(m^{1+\frac{k-1}{k+1}}).

It follows from the above discussion that $2k$-Hybrid either returns a $C_{\leq 2k}$ or determines that $g>2k$, and the running time is $O(m^{1+\frac{k-1}{k+1}})$. ∎

5.2 A (+1)(+1)-approximation of the girth

Next, we describe algorithm AdtvGirthApprox, which uses 2k2k-Hybrid and the framework described in Section 1, to obtain a (+1)(+1)-approximation of gg, when glogng\leq\log n. AdtvGirthApprox (see Algorithm 8) gets a graph GG. In AdtvGirthApprox, we set kk to 22 and start a while loop. In each iteration, we create a copy GG^{\prime} of GG and call 2k-Hybrid(G,k)\texttt{$2k$-Hybrid}(G^{\prime},k). If 2k2k-Hybrid finds a cycle CC then AdtvGirthApprox stops and returns CC, otherwise we increment kk by 11 and continue to the next iteration. We prove the following theorem.

k2k\leftarrow 2;
while true do
 GG^{\prime}\leftarrow a copy of GG;
 C2k-Hybrid(G,k)C\leftarrow\texttt{$2k$-Hybrid}(G^{\prime},k);
 if CnullC\neq\texttt{null} then return CC;
 
 kk+1k\leftarrow k+1;
 
Algorithm 8 AdtvGirthApprox(G)\texttt{AdtvGirthApprox}(G)
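The driver is a simple while loop; the sketch below (our own Python illustration) treats the hybrid as an assumed callable `hybrid(G, k)` that, like Algorithm 7, returns a cycle or `None`, and copies the graph before each call since the hybrid may delete vertices:

```python
def adtv_girth_approx(adj, hybrid):
    # Sketch of AdtvGirthApprox(G): try k = 2, 3, ... until the 2k-hybrid
    # finds a cycle; by Lemma 5.2 the first cycle found has length g or g+1.
    k = 2
    while True:
        # fresh copy of the adjacency lists, since the hybrid mutates G'
        g_copy = {u: list(vs) for u, vs in adj.items()}
        C = hybrid(g_copy, k)
        if C is not None:
            return C
        k += 1
```

The loop terminates once $k$ reaches $\lceil g/2\rceil$, and the geometric growth of the per-iteration cost is why the last iteration dominates the total running time.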
Theorem 5.1.

Algorithm $\texttt{AdtvGirthApprox}(G)$ returns either a $C_g$ or a $C_{g+1}$, and runs in $\tilde{O}(m^{1+\frac{\ell-1}{\ell+1}})$ time, where $g=2\ell$ or $g=2\ell-1$ and $2\leq\ell<\log n$. (We note that when $\ell\geq\log n$, the $O(n^2)$-time, $(+1)$-approximation of Itai and Rodeh [6] can be used.)

Proof.

We first prove the bound on the approximation. AdtvGirthApprox always returns a cycle in $G$, so it cannot return a $C_{<g}$. AdtvGirthApprox starts with $k=2$. It follows from Lemma 5.2 that as long as $k<\ell$ the calls to $2k$-Hybrid do not return a cycle, since the graph does not contain a $C_{\leq 2k}$. Consider now the iteration in which $k=\ell$. AdtvGirthApprox calls $\texttt{$2k$-Hybrid}(G',\ell)$, where $G'=G$. It follows from Lemma 5.2 that $2k$-Hybrid either returns a $C_{\leq 2\ell}$ or determines that $g>2\ell$. Since $g\leq 2\ell$, $2k$-Hybrid returns a $C_{\leq 2\ell}$, which is either a $C_g$ or a $C_{g+1}$ since $g=2\ell$ or $g=2\ell-1$.

We now turn to analyze the running time. Creating a copy of GG takes O(m)O(m) time, and by Lemma 5.2 the running time of 2k-Hybrid(G,k)\texttt{$2k$-Hybrid}(G^{\prime},k) is O(m1+k1k+1)O(m^{1+\frac{k-1}{k+1}}). Therefore, for every kk\leq\ell, the running time of the iteration of the while loop with this value of kk is O(m1+k1k+1)O(m^{1+\frac{k-1}{k+1}}). From the previous part of this proof it follows that the last iteration of the while loop is when k=k=\ell, thus, the running time of AdtvGirthApprox is O(m1+13+m1+24++m1+2+m1+1+1)O(m1+1+1)O(m^{1+\frac{1}{3}}+m^{1+\frac{2}{4}}+\cdots+m^{1+\frac{\ell-2}{\ell}}+m^{1+\frac{\ell-1}{\ell+1}})\leq O(\ell\cdot m^{1+\frac{\ell-1}{\ell+1}}). Therefore, when <logn\ell<\log n, the running time is O~(m22+1)\tilde{O}(m^{2-\frac{2}{\ell+1}}). ∎

6 A general hybrid algorithm

Algorithm 2k-Hybrid(G,k)\texttt{$2k$-Hybrid}(G,k), presented in the previous section, either returns a C2kC_{\leq 2k} or determines that g>2kg>2k, in O(m1+k1k+1)O(m^{1+\frac{k-1}{k+1}}) time. In this section we introduce an additional parameter 2αk2\leq\alpha\leq k and present a (2k,2α)(2k,2\alpha)-hybrid algorithm that either returns a C2kC_{\leq 2k} or determines that g>2αg>2\alpha, in O((k+1α1+α)m1+α1k+1)O((\frac{k+1}{\alpha-1}+\alpha)\cdot m^{1+\frac{\alpha-1}{k+1}}) time. In Section 7 we use the (2k,2α)(2k,2\alpha)-hybrid algorithm to present two tradeoffs for girth approximation.

To obtain the (2k,2α)(2k,2\alpha)-hybrid algorithm we first extend algorithm NbrBallOrCycle. Then, we use the extended NbrBallOrCycle together with additional tools that we develop to either return a C2kC_{\leq 2k} or sparsify dense regions of the graph, so that we can check whether g>2αg>2\alpha (or return a C2kC_{\leq 2k}) in O(m1+α1k+1)O(m^{1+\frac{\alpha-1}{k+1}}) time, by running AllVtxBallOrCycle(G,α)\texttt{AllVtxBallOrCycle}(G,\alpha).

6.1 Extending NbrBallOrCycle

In algorithm NbrBallOrCycle we mark vertices that can be removed from the graph, using the property that if $\texttt{BallOrCycle}(v,k)$ did not return a $C_{\leq 2k}$ then $v$ is not on a $C_{\leq 2k}$. In [7], an additional parameter $\alpha$ is introduced and the following extended version of this property is used: if $\texttt{BallOrCycle}(v,k)$ did not return a $C_{\leq 2k}$ then no vertex of $V_v^{k-\alpha}$ is on a $C_{\leq 2\alpha}$. We use the same approach and modify NbrBallOrCycle to get an additional integer parameter $\alpha$ such that $2\leq\alpha\leq k$. After each call to $\texttt{BallOrCycle}(G',v,k)$, where $v\in N_s$, if no cycle was found we add $V_v^{k-\alpha}$, instead of $v$, to $\hat{U}$. The modified pseudo-code appears in Algorithm 9. We rephrase Lemma 4.1 to suit this modification.

U^\hat{U}\leftarrow\emptyset;
(Vsk,C)BallOrCycle(s,k)(V_{s}^{k},C)\leftarrow\texttt{BallOrCycle}(s,k);
if CnullC\neq\texttt{null} then return (,C)(\emptyset,C);
U^U^{s}\hat{U}\leftarrow\hat{U}\cup\{s\};
NsN(s)N_{s}\leftarrow N(s);
GG{s}G\leftarrow G\setminus\{s\};
foreach $v\in N_s$ do
 (Vvk,C)BallOrCycle(v,k)(V_{v}^{k},C)\leftarrow\texttt{BallOrCycle}(v,k);
 
 if CnullC\neq\texttt{null} then return (,C)(\emptyset,C);
 
 U^U^Vvkα\hat{U}\leftarrow\hat{U}\cup V_{v}^{k-\alpha} ;
 
add ss and the edge set {(s,v)vNs}\{(s,v)\mid v\in N_{s}\} back to GG;
return (U^,null);(\hat{U},\texttt{null});
Algorithm 9 NbrBallOrCycle(G,s,k,α)\texttt{NbrBallOrCycle}(G,s,k,\alpha)
Lemma 6.1.

Let 2αk2\leq\alpha\leq k. If NbrBallOrCycle(G,s,k,α)\texttt{NbrBallOrCycle}(G,s,k,\alpha) finds a cycle CC then wt(C)2kwt(C)\leq 2k. Otherwise, no vertex in Vskα+1V_{s}^{k-\alpha+1} is part of a C2αC_{\leq 2\alpha} in GG, and the set U^=Vskα+1\hat{U}=V_{s}^{k-\alpha+1} is returned.

Proof.

The proof that if a cycle CC is returned then wt(C)2kwt(C)\leq 2k is as in Lemma 4.1. It is left to show that if no cycle is found then no vertex in Vskα+1V_{s}^{k-\alpha+1} is part of a C2αC_{\leq 2\alpha} in GG, and the set U^=Vskα+1\hat{U}=V_{s}^{k-\alpha+1} is returned.

From Lemma 3.1 it follows that if $\texttt{BallOrCycle}(G',v,k)$ did not return a cycle then no vertex in $V_v^{k-\alpha}$ is part of a $C_{\leq 2\alpha}$ in $G'$, and therefore also in $G$, since $G'=G\setminus\{s\}$ and $s$ itself is not on a $C_{\leq 2\alpha}$ (otherwise $\texttt{BallOrCycle}(G,s,k)$ would have returned a cycle and the loop would not have been reached). Thus, if no cycle was found, $\hat{U}$ contains $s$ and $V_v^{k-\alpha}(G')$ for each $v\in N(s)$, which equals $V_s^{k-\alpha+1}(G)$. ∎

For the running time, we note that the sets $V_v^k$ are computed during the execution of $\texttt{BallOrCycle}(G',v,k)$, for every $v\in N_s$ for which no cycle was found. Their total size is also $O(n+m)$, and we can obtain from them the sets $V_v^{k-\alpha}$ and add these sets to $\hat{U}$ in $O(n+m)$ time. Therefore, the modified NbrBallOrCycle also runs in $O(n+m)$ time.

6.2 A (max{2k,g},g~)(\max\{2k,g\},\tilde{g})-hybrid algorithm

In this section we present a (max{2k,g},2α)(\max\{2k,g\},2\alpha)-hybrid algorithm called ShortCycle, where α=g~2\alpha=\lceil\frac{\tilde{g}}{2}\rceil. ShortCycle (see Algorithm 10) gets a graph GG and two integers α,k2\alpha,k\geq 2. If m1+n(1+n1/k)m\geq 1+\lceil n\cdot(1+n^{1/k})\rceil then we run algorithm ShortCycleDense(G,k)\texttt{ShortCycleDense}(G,k) (see Algorithm 11), which is based on algorithm DegenerateOrCycle of [7]. If m<1+n(1+n1/k)m<1+\lceil n\cdot(1+n^{1/k})\rceil then the main challenge is when αk<n2\alpha\leq k<\frac{n}{2}. In this case we run algorithm ShortCycleSparse(G,k,α)\texttt{ShortCycleSparse}(G,k,\alpha) (described later). The cases that kn2k\geq\frac{n}{2} or k<αk<\alpha are relatively simple and treated in algorithm SpecialCases(G,k,α)\texttt{SpecialCases}(G,k,\alpha) (see Section 6.2.1). We summarize the properties of ShortCycle(G,k,α)\texttt{ShortCycle}(G,k,\alpha) in the next theorem.

if m1+n(1+n1/k)m\geq 1+\lceil n\cdot(1+n^{1/k})\rceil then
 return ShortCycleDense(G,k)\texttt{ShortCycleDense}(G,k);
 
if αk<n2\alpha\leq k<\frac{n}{2} then
 return ShortCycleSparse(G,k,α)\texttt{ShortCycleSparse}(G,k,\alpha);
 
return SpecialCases(G,k,α)\texttt{SpecialCases}(G,k,\alpha)
Algorithm 10 ShortCycle(G,k,α)\texttt{ShortCycle}(G,k,\alpha)
G(V,E)G^{\prime}\leftarrow(V^{\prime},E^{\prime}) is the edge induced subgraph formed by an arbitrary subset of 1+n(1+n1/k)1+\lceil n\cdot(1+n^{1/k})\rceil edges;
S{aVdegG(a)1+n1/k}S\leftarrow\{a\in V^{\prime}\mid deg_{G^{\prime}}(a)\leq 1+n^{1/k}\};
while SS\neq\emptyset do
   pick aa from SS;
 foreach (a,b)E(a,b)\in E^{\prime} do
    if degG(b)=2+n1/kdeg_{G^{\prime}}(b)=2+n^{1/k} then SS{b}S\leftarrow S\cup\{b\};
    
  remove aa from GG^{\prime} and from SS;
 
(,C)BallOrCycle(G,w,k)(*,C)\leftarrow\texttt{BallOrCycle}(G^{\prime},w,k) with some wVw\in V^{\prime};
return CC;
  // CC is a cycle of length 2k\leq 2k
Algorithm 11 ShortCycleDense(G,k)\texttt{ShortCycleDense}(G,k)
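The peeling step can be sketched in Python as follows (a hedged illustration of the degeneracy-style argument, with our own names: repeatedly delete vertices of current degree at most $1+n^{1/k}$; since the chosen subgraph has more than $n\cdot(1+n^{1/k})$ edges, not all vertices can be peeled, and every survivor has degree at least $2+n^{1/k}$, which forces a $C_{\leq 2k}$ within BFS distance $k$ of any survivor):

```python
from collections import deque

def peel_low_degree(adj, n, k):
    # Repeatedly remove vertices whose current degree is <= 1 + n^{1/k}.
    # Returns the surviving vertex set; in ShortCycleDense the edge count
    # guarantees it is nonempty, and a depth-k BFS from any survivor then
    # finds a cycle of length at most 2k.
    t = 1 + n ** (1.0 / k)
    deg = {u: len(vs) for u, vs in adj.items()}
    alive = set(adj)
    S = deque(u for u in adj if deg[u] <= t)
    while S:
        a = S.popleft()
        if a not in alive:        # may have been queued twice
            continue
        alive.discard(a)
        for b in adj[a]:
            if b in alive:
                deg[b] -= 1
                if deg[b] <= t:
                    S.append(b)
    return alive
```

On $K_5$ with $k=2$ nothing is peeled (every degree exceeds $1+\sqrt{5}$), while a path is peeled away entirely, matching the intuition that only genuinely dense subgraphs survive.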
Theorem 6.1.

Let α,k2\alpha,k\geq 2 be integers. ShortCycle(G,k,α)\texttt{ShortCycle}(G,k,\alpha) runs whp in O~((k+1α1+α)min{m1+α1k+1,n1+αk})\tilde{O}((\frac{k+1}{\alpha-1}+\alpha)\cdot\min\{m^{1+\frac{\alpha-1}{k+1}},n^{1+\frac{\alpha}{k}}\}) time and either returns a Cmax{2k,g}C_{\leq\max\{2k,g\}}, or determines that g>2αg>2\alpha.

The next corollary follows from Theorem 6.1, when αk\alpha\leq k.

Corollary 6.1.

Let 2αk2\leq\alpha\leq k. Algorithm ShortCycle(G,k,α)\texttt{ShortCycle}(G,k,\alpha) runs whp in O~((k+1α1+α)min{m1+α1k+1,n1+αk})\tilde{O}((\frac{k+1}{\alpha-1}+\alpha)\cdot\min\{m^{1+\frac{\alpha-1}{k+1}},n^{1+\frac{\alpha}{k}}\}) time and either returns a C2kC_{\leq 2k}, or determines that g>2αg>2\alpha.

In the rest of this section, we present the proof of Theorem 6.1. As follows from [7], if m1+n(1+n1/k)m\geq 1+\lceil n\cdot(1+n^{1/k})\rceil then ShortCycleDense(G,k)\texttt{ShortCycleDense}(G,k) returns in O(min{m,n1+1/k})O(\min\{m,n^{1+1/k}\}) time a C2kC_{\leq 2k}. We now consider the case in which m<1+n(1+n1/k)m<1+\lceil n\cdot(1+n^{1/k})\rceil. We prove in Section 6.2.1 that SpecialCases(G,k,α)\texttt{SpecialCases}(G,k,\alpha) satisfies the claim of Theorem 6.1, when kn2k\geq\frac{n}{2} or k<αk<\alpha.

Our main technical contribution is algorithm ShortCycleSparse, which handles the case of α ≤ k < n/2. Notice that if |E_u^{α−1}| < m^{(α−1)/(k+1)} for every u ∈ V, then AllVtxBallOrCycle(G,α) is a (2k,2α)-hybrid algorithm that either finds a C_{≤2α} (which is also a C_{≤2k}, as α ≤ k) or determines that g > 2α, in O(m^{1+(α−1)/(k+1)}) time. Thus, in ShortCycleSparse we ensure that, by the time we call AllVtxBallOrCycle, the property |E_u^{α−1}| < m^{(α−1)/(k+1)} holds for every u ∈ V (whp). To do so, we run BfsSample and possibly HandleReminder. If no cycle was returned, the property holds, and we can safely run AllVtxBallOrCycle.

ShortCycleSparse (see Algorithm 12) gets a graph G and two integers α, k ≥ 2 such that α ≤ k < n/2, and is composed of three stages. In the first stage we call BfsSample(G,k,α) (described later). If BfsSample returns a cycle C then ShortCycleSparse stops and returns C; otherwise we proceed to the second stage. In the second stage, if (k+1) mod (α−1) ≠ 0, we call HandleReminder(G,k,α) (also described later). If HandleReminder returns a cycle C then ShortCycleSparse stops and returns C; otherwise we proceed to the last stage. In the last stage, we call AllVtxBallOrCycle(G,α). If AllVtxBallOrCycle returns a cycle C then ShortCycleSparse stops and returns C; otherwise ShortCycleSparse returns null.

CBfsSample(G,k,α)C\leftarrow\texttt{BfsSample}(G,k,\alpha);
if CnullC\neq\texttt{null} then return CC;
if (k+1)mod(α1)0(k+1)\bmod(\alpha-1)\neq 0 then
 CHandleReminder(G,k,α)C\leftarrow\texttt{HandleReminder}(G,k,\alpha);
 
 if CnullC\neq\texttt{null} then return CC;
 
CAllVtxBallOrCycle(G,α)C\leftarrow\texttt{AllVtxBallOrCycle}(G,\alpha);
if CnullC\neq\texttt{null} then return CC;
return null;
Algorithm 12 ShortCycleSparse(G,k,α)\texttt{ShortCycleSparse}(G,k,\alpha)

Next, we give a high level description of BfsSample. The goal of BfsSample is to either sparsify the graph without removing any C2αC_{\leq 2\alpha}, or to report a C2kC_{\leq 2k}. For simplicity assume that (k+1)mod(α1)=0(k+1)\bmod(\alpha-1)=0. In such a case, if BfsSample does not report a C2kC_{\leq 2k}, then the graph after BfsSample ends contains all the C2αC_{\leq 2\alpha}s that were in the original graph, and satisfies, whp, the following sparsity property: For every uVu\in V it holds that |Euα1|<mα1k+1|E_{u}^{\alpha-1}|<m^{\frac{\alpha-1}{k+1}}.

This implies that in BfsSample we need to find every u ∈ V that lies in a dense region with |E_u^{α−1}| ≥ m^{(α−1)/(k+1)}, and to check whether u is on a C_{≤2α}, so that if it is not we can remove u. Finding every such u is possible within the time limit by running IsDense(G, u, m^{(α−1)/(k+1)}, α−1) for every u ∈ V. The problem is that checking whether u is on a C_{≤2α} for every such u is too costly, since there might be n such vertices, and each check costs O(n) using BallOrCycle(G, u, α).

One way to overcome this problem is to sample an edge set SS of size Θ~(m1α1k+1)\tilde{\Theta}(m^{1-\frac{\alpha-1}{k+1}}) that hits the mα1k+1m^{\frac{\alpha-1}{k+1}} closest edges of each vertex, and then use SS to detect the vertices in the dense regions that are not on a C2αC_{\leq 2\alpha}. In BfsSample we use a detection process in which we call BallOrCycle or NbrBallOrCycle from the endpoints of SS’s edges, and then, if no cycle was found, we use the information obtained from this call to identify vertices that are not on a C2αC_{\leq 2\alpha}. The detection process either detects vertices that are not on a C2αC_{\leq 2\alpha} and can be removed, or reports a C2kC_{\leq 2k}. However, it is not clear how to implement this detection process efficiently, since just running BallOrCycle from the endpoints of SS’s edges takes O(nm1α1k+1)O(nm^{1-\frac{\alpha-1}{k+1}}) time which might be too much. Our solution is an iterative sampling procedure that starts with a smaller hitting set of edges, of size Θ~(mα1k+1)\tilde{\Theta}(m^{\frac{\alpha-1}{k+1}}). For such a hitting set we can run our detection process. If a C2kC_{\leq 2k} was not reported, then we remove the appropriate vertices and sparsify the graph without removing any C2αC_{\leq 2\alpha}. When the graph is sparser, the running time of our detection process becomes faster. Thus, in the following iteration we can sample a larger hitting set for which we run this process, and either return a C2kC_{\leq 2k} or sparsify the graph further for the next iteration. We continue the iterative sampling procedure until we get to the required sparsity property in which |Euα1|<mα1k+1|E_{u}^{\alpha-1}|<m^{\frac{\alpha-1}{k+1}} for every uVu\in V (whp).
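The balance underlying this iterative scheme can be checked with exact exponent arithmetic: in iteration i the sample S_i has Θ̃(m^{i(α−1)/(k+1)}) edges, while the sparsification from iteration i−1 bounds the cost of each BFS by O(m^{1−(i−1)(α−1)/(k+1)}), so every iteration costs about m^{1+(α−1)/(k+1)}. A small sketch of this arithmetic (our own; the concrete k and α are arbitrary examples):

```python
from fractions import Fraction

def iteration_cost_exponents(k, alpha):
    """Exponent of m in |S_i| * (BFS cost per sampled endpoint), per iteration."""
    step = Fraction(alpha - 1, k + 1)
    y = -(-(k + 1) // (alpha - 1)) - 1       # y = ceil((k+1)/(alpha-1)) - 1
    exps = []
    for i in range(1, y + 1):
        sample = i * step                    # |S_i| ~ m^{i(alpha-1)/(k+1)}
        bfs = 1 - (i - 1) * step             # ball-size bound from iteration i-1
        exps.append(sample + bfs)
    return exps

# for k = 7, alpha = 3 there are y = 3 iterations, each of exponent 1 + 2/8 = 5/4
assert iteration_cost_exponents(7, 3) == [Fraction(5, 4)] * 3
```

The telescoping is immediate: the sample-size exponent grows by (α−1)/(k+1) per iteration exactly as the per-BFS cost exponent shrinks, so their sum is constant.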

We remark that in the first iteration of BfsSample, the detection process calls NbrBallOrCycle, while in the rest of the iterations BallOrCycle is called. The use of NbrBallOrCycle allows us, in the case that no C2kC_{\leq 2k} is reported, to bound |Evd||E_{v}^{d}| with mdk+1m^{\frac{d}{k+1}} for every vVv\in V, rather than |Evd1||E_{v}^{d-1}|. This is used to achieve the required sparsity property. Since NbrBallOrCycle runs in O(m)O(m) time we can only use it in the first iteration when the sampled set is small enough. In the rest of the iterations we use BallOrCycle instead.
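To make the role of BallOrCycle concrete, here is a minimal Python sketch of a BFS-ball primitive in its spirit (our own simplified rendition, not the paper's exact procedure): it scans every edge incident to a vertex at depth at most k−1; if all scanned edges are tree edges it returns the ball, and otherwise the first non-tree edge closes a cycle of length at most (k−1)+k+1 = 2k.

```python
from collections import deque

def ball_or_cycle(adj, s, k):
    """Sketch of a BFS-ball primitive: explore the ball of radius k around s,
    scanning all edges incident to vertices at depth <= k-1.  Returns
    (ball_vertices, None) if every scanned edge is a tree edge, and
    (None, cycle) for the first non-tree edge, which closes a cycle of
    length <= 2k through the BFS tree."""
    depth, parent = {s: 0}, {s: None}
    q = deque([s])
    while q:
        u = q.popleft()
        if depth[u] >= k:
            continue                    # edges of the boundary are not scanned
        for v in adj[u]:
            if v not in depth:
                depth[v], parent[v] = depth[u] + 1, u
                q.append(v)
            elif v != parent[u]:
                # Non-tree edge (u, v): splice the two BFS paths at their
                # lowest common ancestor to recover an explicit cycle.
                ancestors, x = set(), u
                while x is not None:
                    ancestors.add(x)
                    x = parent[x]
                lca = v
                while lca not in ancestors:
                    lca = parent[lca]
                side_u, x = [], u
                while x != lca:
                    side_u.append(x)
                    x = parent[x]
                side_v, x = [], v
                while x != lca:
                    side_v.append(x)
                    x = parent[x]
                return None, side_u + [lca] + side_v[::-1]
    return sorted(depth), None
```

For example, on the 6-cycle with k=3 the first non-tree edge yields the full 6-cycle, matching the 2k bound, while on a tree the scanned ball comes back with no cycle.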

We now formally describe BfsSample. BfsSample (see Algorithm 13) gets a graph GG and two integers α,k2\alpha,k\geq 2 such that αk<n2\alpha\leq k<\frac{n}{2}. We first set yy to k+1α11\lceil\frac{k+1}{\alpha-1}\rceil-1. Then, we start the main for loop that has at most yy iterations. In the iith iteration, we initialize V^\hat{V} to \emptyset and sample a set SiES_{i}\subseteq E of size Θ~(mi(α1)k+1)\tilde{\Theta}(m^{\frac{i\cdot(\alpha-1)}{k+1}}). Next, we scan the endpoints in V(Si)V(S_{i}) using an inner for-each loop.

If i=1i=1, we call NbrBallOrCycle(G,s,k,α)\texttt{NbrBallOrCycle}(G,s,k,\alpha) from every endpoint sV(S1)s\in V(S_{1}). NbrBallOrCycle returns either a cycle or a set of vertices U^\hat{U}. If NbrBallOrCycle returns a cycle then the cycle is returned by BfsSample. Otherwise, we add U^\hat{U} to V^\hat{V}.

If i > 1 then we call BallOrCycle(s, k_i), where k_i = (k+1) − (i−1)·(α−1), from every endpoint s ∈ V(S_i). If a cycle is found by BallOrCycle then the cycle is returned by BfsSample. If BallOrCycle does not return a cycle then we add V_s^{k_i−α} to V̂.

Right after the inner for-each loop ends, we remove V̂ from G and continue to the next iteration of the main for loop. If no cycle was found after y iterations, we return null. Let ℓ ≤ y be the last iteration in which vertices were removed, and let V̂_i be the set of vertices removed during the ith iteration, where V̂_i = ∅ for i > ℓ. Let G_0 (G_1) be G before (after) running BfsSample. Figure 5 and Figure 6 illustrate the key steps of the first and the subsequent iterations of BfsSample, respectively. We summarize the properties of BfsSample in the next lemma.

yk+1α11y\leftarrow\lceil\frac{k+1}{\alpha-1}\rceil-1;
for i1i\leftarrow 1 to yy do
   sample a set SiS_{i} of Θ~(mi(α1)k+1)\tilde{\Theta}(m^{\frac{i\cdot(\alpha-1)}{k+1}}) edges;
 
 V^\hat{V}\leftarrow\emptyset;
 
 foreach sV(Si)s\in V(S_{i}) do
    
    if i=1i=1 then
       (U^,C)NbrBallOrCycle(G,s,k,α)(\hat{U},C)\leftarrow\texttt{NbrBallOrCycle}(G,s,k,\alpha);
       
       if CnullC\neq\texttt{null} then return CC;
       V^V^U^\hat{V}\leftarrow\hat{V}\cup\hat{U};
       
    else
       ki(k+1)(i1)(α1)k_{i}\leftarrow(k+1)-(i-1)\cdot(\alpha-1);
       (Vski,C)BallOrCycle(s,ki)(V_{s}^{k_{i}},C)\leftarrow\texttt{BallOrCycle}(s,k_{i});
       
       if CnullC\neq\texttt{null} then return CC;
       V^V^Vskiα\hat{V}\leftarrow\hat{V}\cup V_{s}^{k_{i}-\alpha};
    
 GGV^G\leftarrow G\setminus\hat{V};
 
return null;
Algorithm 13 BfsSample(G,k,α)\texttt{BfsSample}(G,k,\alpha)
Figure 5: The first iteration of BfsSample. (a) The edge (x,y)(x,y) is a sampled edge from a dense region of uu. (b) The run of NbrBallOrCycle from xx and yy leads to the removal of uu.
Figure 6: Iteration i>1i>1 of BfsSample. (a) The edge (x,y)(x,y) is a sampled edge from a dense region of uu. (b) The run of BallOrCycle from xx and yy leads to the removal of uu.
Lemma 6.2.

BfsSample(G,k,α) satisfies the following:

  1. (i) If a cycle C is returned then wt(C) ≤ 2k.

  2. (ii) If a cycle is not returned then |E_u^{(k+1)−y·(α−1)}| < m^{1−y·(α−1)/(k+1)}, for every u ∈ G_1, whp.

  3. (iii) If u ∈ G_0 ∖ G_1 then u is not part of a C_{≤2α} in G_0.

  4. (iv) BfsSample(G,k,α) runs in Õ(⌊(k+1)/(α−1)⌋ · m^{1+(α−1)/(k+1)}) time, whp.

Proof.
  1. (i)

    We first show that if a cycle CC is returned then wt(C)2kwt(C)\leq 2k. BfsSample returns a cycle only if a call to NbrBallOrCycle(G,s,k,α)\texttt{NbrBallOrCycle}(G,s,k,\alpha) returns a cycle or a call to BallOrCycle(s,ki)\texttt{BallOrCycle}(s,k_{i}), where 2iy2\leq i\leq y and ki=(k+1)(i1)(α1)k_{i}=(k+1)-(i-1)\cdot(\alpha-1), returns a cycle. By Lemma 6.1, if NbrBallOrCycle(G,s,k,α)\texttt{NbrBallOrCycle}(G,s,k,\alpha) returns a cycle CC then wt(C)2kwt(C)\leq 2k. By Lemma 3.2, if BallOrCycle(s,ki)\texttt{BallOrCycle}(s,k_{i}) returns a cycle CC then wt(C)2kiwt(C)\leq 2k_{i}. Since i>1i>1 is an integer, and since α2\alpha\geq 2, we have ki=(k+1)(i1)(α1)(k+1)(α1)kk_{i}=(k+1)-(i-1)\cdot(\alpha-1)\leq(k+1)-(\alpha-1)\leq k. Therefore, wt(C)2kwt(C)\leq 2k.

  2. (ii)

    Next, we show that if a cycle is not returned then |Eu(k+1)y(α1)|<m1y(α1)k+1|E_{u}^{(k+1)-y\cdot(\alpha-1)}|<m^{1-\frac{y\cdot(\alpha-1)}{k+1}}, for every uG1u\in G_{1}, whp. Since kαk\geq\alpha, we have y1y\geq 1 and there is at least one iteration.

    Consider the iith iteration of the main loop. We show that if no cycle was found during the iith iteration then after the iith iteration every uG0(j=1iV^j)u\in G_{0}\setminus(\cup_{j=1}^{i}\hat{V}_{j}) satisfies the following property: |Eu(k+1)i(α1)|<m1i(α1)k+1|E_{u}^{(k+1)-i\cdot(\alpha-1)}|<m^{1-\frac{i\cdot(\alpha-1)}{k+1}}, whp.

    Let uG0(j=1iV^j)u\in G_{0}\setminus(\cup_{j=1}^{i}\hat{V}_{j}). By Lemma 3.7, the set SiS_{i} hits the m1i(α1)k+1m^{1-\frac{i\cdot(\alpha-1)}{k+1}} closest edges of every vertex of G0(j=1i1V^j)G_{0}\setminus(\cup_{j=1}^{i-1}\hat{V}_{j}), whp. Assume that SiS_{i} is indeed such a hitting set, and assume, towards a contradiction, that after the iith iteration |Eu(k+1)i(α1)|m1i(α1)k+1|E_{u}^{(k+1)-i\cdot(\alpha-1)}|\geq m^{1-\frac{i\cdot(\alpha-1)}{k+1}}. Since G0(j=1iV^j)G0(j=1i1V^j)G_{0}\setminus(\cup_{j=1}^{i}\hat{V}_{j})\subseteq G_{0}\setminus(\cup_{j=1}^{i-1}\hat{V}_{j}) we have uG0(j=1i1V^j)u\in G_{0}\setminus(\cup_{j=1}^{i-1}\hat{V}_{j}). Now, since uG0(j=1i1V^j)u\in G_{0}\setminus(\cup_{j=1}^{i-1}\hat{V}_{j}) and since the graph is not updated in the inner for-all loop, it follows that there is an edge (u1,u2)Si(u_{1},u_{2})\in S_{i} such that (u1,u2)Eu(k+1)i(α1)(u_{1},u_{2})\in E_{u}^{(k+1)-i\cdot(\alpha-1)}. By the definition of Eu(k+1)i(α1)E_{u}^{(k+1)-i\cdot(\alpha-1)}, either u1u_{1} or u2u_{2}, denoted with ss, satisfies that d(u,s)(k+1)i(α1)1d(u,s)\leq(k+1)-i\cdot(\alpha-1)-1. By the definition of V(Si)V(S_{i}), we know that sV(Si)s\in V(S_{i}).

When i = 1, we have d(u,s) ≤ (k+1) − i·(α−1) − 1 = k−α+1. Therefore, u ∈ V_s^{k−α+1}. In the first iteration the input graph has not changed yet, so G = G_0. Since no cycle was found by NbrBallOrCycle(G_0,s,k,α), it follows from Lemma 6.1 that Û = V_s^{k−α+1}. Therefore, u ∈ Û, and u is added to V̂_i after the call to NbrBallOrCycle(G_0,s,k,α). Hence, u ∉ G_0 ∖ (∪_{j=1}^{i} V̂_j), a contradiction.

We now handle the case that i > 1. It holds that d(u,s) ≤ (k+1) − i·(α−1) − 1 = (k+1) − (i−1)·(α−1) − α = k_i − α. Hence, u ∈ V_s^{k_i−α}. Since no cycle was found during the ith iteration, u is added to V̂_i after the call to BallOrCycle(s,k_i), and therefore u ∉ G_0 ∖ (∪_{j=1}^{i} V̂_j), a contradiction.

    Now, if BfsSample does not return a cycle we get for i=yi=y that if uG0(j=1yV^j)=G1u\in G_{0}\setminus(\cup_{j=1}^{y}\hat{V}_{j})=G_{1} then |Eu(k+1)y(α1)|<m1y(α1)k+1|E_{u}^{(k+1)-y\cdot(\alpha-1)}|<m^{1-\frac{y\cdot(\alpha-1)}{k+1}}, whp.

  3. (iii)

Next, we prove that if u ∈ G_0 ∖ G_1 then u is not part of a C_{≤2α} in G_0. Since u ∈ G_0 ∖ G_1 and since G_1 = G_0 ∖ (∪_{j=1}^{y} V̂_j), it holds that u ∈ V̂_i for some 1 ≤ i ≤ y.

    If i=1i=1 then since uV^iu\in\hat{V}_{i} it follows that there was in the first iteration a vertex ss such that uU^u\in\hat{U} after a call to NbrBallOrCycle(G0,s,k,α)\texttt{NbrBallOrCycle}(G_{0},s,k,\alpha) did not return a cycle. By Lemma 6.1, U^=Vskα+1\hat{U}=V_{s}^{k-\alpha+1} and no vertex in Vskα+1V_{s}^{k-\alpha+1} is part of a C2αC_{\leq 2\alpha} in G0G_{0}. Therefore, uu is not part of a C2αC_{\leq 2\alpha} in G0G_{0}.

    If i>1i>1 then since uV^iu\in\hat{V}_{i} it follows that there was in the iith iteration a vertex ss such that uVskiαu\in V_{s}^{k_{i}-\alpha} after a call to BallOrCycle(s,ki)\texttt{BallOrCycle}(s,k_{i}) did not return a cycle. As BallOrCycle(s,ki)\texttt{BallOrCycle}(s,k_{i}) did not return a cycle, by Lemma 3.2 we know that B(s,ki)B(s,k_{i}) is a tree. It follows from Lemma 3.1 that no vertex in VskiαV_{s}^{k_{i}-\alpha}, and in particular uu, is part of a C2αC_{\leq 2\alpha} in G0(j=1i1V^j)G_{0}\setminus(\cup_{j=1}^{i-1}\hat{V}_{j}). Since during the run of BfsSample we remove only vertices that are not part of a C2αC_{\leq 2\alpha}, uu is not part of a C2αC_{\leq 2\alpha} also in G0G_{0}.

  4. (iv)

    Finally, we show that BfsSample runs in O~(k+1α1m1+α1k+1)\tilde{O}(\lfloor\frac{k+1}{\alpha-1}\rfloor\cdot m^{1+\frac{\alpha-1}{k+1}}) time, whp. To do so, we show that the running time of the iith iteration of the main for loop is whp O~(m1+α1k+1)\tilde{O}(m^{1+\frac{\alpha-1}{k+1}}).

    We start with the first iteration, in which i=1i=1. The size of S1S_{1} is Θ~(mα1k+1)\tilde{\Theta}(m^{\frac{\alpha-1}{k+1}}). The size of V(S1)V(S_{1}) is at most 2|S1|2\cdot|S_{1}|. For every sV(S1)s\in V(S_{1}) we run NbrBallOrCycle(G0,s,k,α)\texttt{NbrBallOrCycle}(G_{0},s,k,\alpha). By Lemma 4.3, running NbrBallOrCycle from ss costs O(n+m)=O(m)O(n+m)=O(m). Adding U^\hat{U} to V^\hat{V} costs O(n)=O(m)O(n)=O(m). Therefore, the total running time for all sV(S1)s\in V(S_{1}) is at most 2|S1|O(m)=O~(mα1k+1m)=O~(m1+α1k+1)2\cdot|S_{1}|\cdot O(m)=\tilde{O}(m^{\frac{\alpha-1}{k+1}}\cdot m)=\tilde{O}(m^{1+\frac{\alpha-1}{k+1}}).

    Now we assume that i>1i>1. We proved in (ii) that if i>1i>1 and uG0(j=1i1V^j)u\in G_{0}\setminus(\cup_{j=1}^{i-1}\hat{V}_{j}) then after the (i1)(i-1)th iteration, if no cycle was found, we have |Eu(k+1)(i1)(α1)|<m1(i1)(α1)k+1|E_{u}^{(k+1)-(i-1)\cdot(\alpha-1)}|<m^{1-\frac{(i-1)\cdot(\alpha-1)}{k+1}}, whp. By Lemma 3.2, for every sV(Si)s\in V(S_{i}), the cost of running BallOrCycle(s,ki)\texttt{BallOrCycle}(s,k_{i}) is O(|Vski|)=O(|Eski|)=O(|Es(k+1)(i1)(α1)|)O(|V_{s}^{k_{i}}|)=O(|E_{s}^{k_{i}}|)=O(|E_{s}^{(k+1)-(i-1)\cdot(\alpha-1)}|). In our case this is at most O(m1(i1)(α1)k+1)O(m^{1-\frac{(i-1)\cdot(\alpha-1)}{k+1}}). As the size of SiS_{i} is Θ~(mi(α1)k+1)\tilde{\Theta}(m^{\frac{i\cdot(\alpha-1)}{k+1}}) and the size of V(Si)V(S_{i}) is at most 2|Si|2\cdot|S_{i}|, the total running time of the calls to BallOrCycle for every sV(Si)s\in V(S_{i}) is, whp, 2|Si|O(m1(i1)(α1)k+1)=Θ~(mi(α1)k+1)O(m1(i1)(α1)k+1)=O~(m1+α1k+1).2\cdot|S_{i}|\cdot O(m^{1-\frac{(i-1)\cdot(\alpha-1)}{k+1}})=\tilde{\Theta}(m^{\frac{i\cdot(\alpha-1)}{k+1}})\cdot O(m^{1-\frac{(i-1)\cdot(\alpha-1)}{k+1}})=\tilde{O}(m^{1+\frac{\alpha-1}{k+1}}).

    The cost of adding VskiαV_{s}^{k_{i}-\alpha} to V^\hat{V} is O(|Vskiα|)O(|V_{s}^{k_{i}-\alpha}|). This is at most O(|Vski|)=O(m1(i1)(α1)k+1)O(|V_{s}^{k_{i}}|)=O(m^{1-\frac{(i-1)\cdot(\alpha-1)}{k+1}}) (whp), which is O~(m1+α1k+1)\tilde{O}(m^{1+\frac{\alpha-1}{k+1}}) for all sV(Si)s\in V(S_{i}) (similarly to the previous calculation).

    The cost of removing a vertex vv is O(deg(v))O(deg(v)). Thus, for every i1i\geq 1, the total cost of removing all the vertices in Vi^\hat{V_{i}} is at most O(m)O(m), so the total running time of the iith iteration is O~(m1+α1k+1)\tilde{O}(m^{1+\frac{\alpha-1}{k+1}}), whp.

    If we are in the scenario that a cycle is returned, then the iith iteration stops at an earlier stage, and therefore the running time is also O~(m1+α1k+1)\tilde{O}(m^{1+\frac{\alpha-1}{k+1}}).

Now, since there are at most y iterations of the main for loop, the running time of BfsSample is, whp, Õ(y · m^{1+(α−1)/(k+1)}) = Õ(⌊(k+1)/(α−1)⌋ · m^{1+(α−1)/(k+1)}). (This is the running time in the case that S_i is a hitting set as described, for every 1 ≤ i ≤ y, which holds whp: since we assume that k < n/2 < n, we have y < n, and for every 1 ≤ i ≤ y the probability that S_i is not such a hitting set is at most 1/n^c. By a standard union bound, the probability that there exists 1 ≤ i ≤ y such that S_i is not a hitting set is at most y · (1/n^c) ≤ 1/n^{c−1}. For large enough c, we get that S_i is a hitting set for every 1 ≤ i ≤ y, whp.) ∎

Recall that our goal is to obtain the sparsity property that |Euα1|<mα1k+1|E_{u}^{\alpha-1}|<m^{\frac{\alpha-1}{k+1}}, for every uVu\in V, so that we can run AllVtxBallOrCycle(G,α)\texttt{AllVtxBallOrCycle}(G,\alpha). However, after running BfsSample the required sparsity property is guaranteed to hold (whp) only if (k+1)mod(α1)=0(k+1)\bmod(\alpha-1)=0. In the case that (k+1)mod(α1)0(k+1)\bmod(\alpha-1)\neq 0 we need an additional step which is implemented in HandleReminder, to guarantee that the required sparsity property holds.

Next, we formally describe HandleReminder. HandleReminder (see Algorithm 14) gets a graph G and two integers α, k ≥ 2 such that k+1 = q(α−1)+r, where q ≥ 1 and r > 0. We set D to m^{1/(k+1)} and r to (k+1) mod (α−1). Then, a while loop runs as long as (α−1) mod r ≠ 0. Let r_i be the value of r when the ith iteration begins, so that r_1 is (k+1) mod (α−1). Let ℓ be the total number of iterations and r_{ℓ+1} the value of r after the ℓth iteration. During the ith iteration, we set r_{i+1} to (⌈(α−1)/r_i⌉ · r_i) − (α−1), where ⌈(α−1)/r_i⌉ · r_i is the smallest multiple of r_i that is at least α−1 (see Figure 7). Then, we call SparseOrCycle(G, D, r_{i+1}, α). If SparseOrCycle returns a cycle C then HandleReminder returns C. If SparseOrCycle does not return a cycle then some vertices might have been removed from G, and we continue to the next iteration. If the while loop ends without returning a cycle then we return null. Let G_1 (G_2) be G before (after) running HandleReminder. Next, we prove two properties of the value of r during the run of HandleReminder.

Dm1k+1D\leftarrow m^{\frac{1}{k+1}};
r(k+1)mod(α1)r\leftarrow(k+1)\bmod(\alpha-1);
while  (α1)modr0(\alpha-1)\bmod r\neq 0  do
 
 r(α1rr)(α1)r\leftarrow(\lceil\frac{\alpha-1}{r}\rceil\cdot r)-(\alpha-1);
 
 CSparseOrCycle(G,D,r,α)C\leftarrow\texttt{SparseOrCycle}(G,D,r,\alpha);
 
 if CnullC\neq\texttt{null} then return CC;
 
return null;
Algorithm 14 HandleReminder(G,k,α)\texttt{HandleReminder}(G,k,\alpha)
Claim 6.1.

Let k+1 = q(α−1)+r_1 and assume r_1 > 0. (i) 0 < r_{ℓ+1} < r_ℓ < ⋯ < r_1 < α−1. (ii) (r_{i+1}+α−1) mod r_i = 0, for every 1 ≤ i ≤ ℓ.

Proof.
  1. (i)

First, since r_1 = (k+1) mod (α−1) and we assume that r_1 > 0, we have 0 < r_1 < α−1. Now we show that 0 < r_{i+1} < r_i, for every 1 ≤ i ≤ ℓ. This implies that 0 < r_{ℓ+1} < r_ℓ < ⋯ < r_1 < α−1, as required.

    We first show by induction that ri>0r_{i}>0, for every 1i1\leq i\leq\ell. The base of the induction follows from the assumption that r1>0r_{1}>0. We assume that ri>0r_{i}>0 and prove that ri+1>0r_{i+1}>0. During the iith iteration of the while loop, we set ri+1r_{i+1} to (α1riri)(α1)(\lceil\frac{\alpha-1}{r_{i}}\rceil\cdot r_{i})-(\alpha-1). Since this occurs during the iith iteration it must be that (α1)modri0(\alpha-1)\bmod r_{i}\neq 0, as otherwise the iith iteration would not have started. Since (α1)modri0(\alpha-1)\bmod r_{i}\neq 0 we have α1ri>α1ri\lceil\frac{\alpha-1}{r_{i}}\rceil>\frac{\alpha-1}{r_{i}} and therefore, α1riri>α1\lceil\frac{\alpha-1}{r_{i}}\rceil\cdot r_{i}>\alpha-1. We get that ri+1=(α1riri)(α1)>0r_{i+1}=(\lceil\frac{\alpha-1}{r_{i}}\rceil\cdot r_{i})-(\alpha-1)>0, as required.

    We now turn to prove that ri+1<rir_{i+1}<r_{i}. As α1riri\lceil\frac{\alpha-1}{r_{i}}\rceil\cdot r_{i} is the smallest multiple of rir_{i} that is at least α1\alpha-1, we know that (α1ri1)ri=α1ririri<α1(\lceil\frac{\alpha-1}{r_{i}}\rceil-1)\cdot r_{i}=\lceil\frac{\alpha-1}{r_{i}}\rceil\cdot r_{i}-r_{i}<\alpha-1. Therefore, α1riri(α1)<ri\lceil\frac{\alpha-1}{r_{i}}\rceil\cdot r_{i}-(\alpha-1)<r_{i}. Since ri+1=α1riri(α1)r_{i+1}=\lceil\frac{\alpha-1}{r_{i}}\rceil\cdot r_{i}-(\alpha-1), we get that ri+1<rir_{i+1}<r_{i}.

  2. (ii)

Let 1 ≤ i ≤ ℓ. Since r_{i+1} = ⌈(α−1)/r_i⌉ · r_i − (α−1), it follows that r_{i+1}+(α−1) is a multiple of r_i, so (r_{i+1}+α−1) mod r_i = 0. ∎

Figure 7: The relation between rir_{i}, ri+1r_{i+1} and α1\alpha-1
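The update rule for r can be exercised directly. The sketch below (our own illustration; the concrete k and α are arbitrary examples) computes the sequence r_1, r_2, … maintained by Algorithm 14 and checks the two properties of Claim 6.1.

```python
def remainder_sequence(k, alpha):
    """The values r_1, r_2, ... taken by r in HandleReminder(G, k, alpha).

    r_1 = (k+1) mod (alpha-1); while (alpha-1) mod r != 0, r is replaced by
    the gap between alpha-1 and the smallest multiple of r that is >= alpha-1.
    """
    r = (k + 1) % (alpha - 1)
    assert r > 0, "HandleReminder is only called when (k+1) mod (alpha-1) != 0"
    seq = [r]
    while (alpha - 1) % r != 0:
        r = -((alpha - 1) // -r) * r - (alpha - 1)  # ceil((alpha-1)/r)*r - (alpha-1)
        seq.append(r)
    return seq

seq = remainder_sequence(10, 8)          # alpha - 1 = 7, r_1 = 11 mod 7 = 4
# Claim 6.1: strictly decreasing, stays in (0, alpha-1), and
# (r_{i+1} + alpha - 1) mod r_i == 0 for consecutive entries.
assert seq == [4, 1]
assert all(b < a for a, b in zip(seq, seq[1:]))
assert all((b + 7) % a == 0 for a, b in zip(seq, seq[1:]))
```

Since the sequence consists of strictly decreasing positive integers, the while loop terminates after fewer than α−1 iterations, exactly as argued in the proof of Lemma 6.3(iv).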

We now prove the main lemma regarding HandleReminder.

Lemma 6.3.

Let k+1 = q(α−1)+r_1 and assume that r_1 > 0, q ≥ 1, and that |E_u^{r_1}| < D^{r_1}, for every u ∈ V. HandleReminder(G,k,α) satisfies the following:

  1. (i) If a cycle C is returned then wt(C) ≤ 2k.

  2. (ii) If a cycle is not returned then |E_u^{α−1}| < D^{α−1}, for every vertex u ∈ G_2.

  3. (iii) If u ∈ G_1 ∖ G_2 then u is not on a C_{≤2α} in G_1.

  4. (iv) HandleReminder(G,k,α) runs in O(r_1 · m^{1+(α−1)/(k+1)}) time.

Proof.
  1. (i)

    First, we show that if a cycle CC is returned then wt(C)2kwt(C)\leq 2k. HandleReminder returns a cycle only if a call to SparseOrCycle(G,D,ri+1,α)\texttt{SparseOrCycle}(G,D,r_{i+1},\alpha), where 1i1\leq i\leq\ell, returns a cycle. By Lemma 3.6, if SparseOrCycle(G,D,ri+1,α)\texttt{SparseOrCycle}(G,D,r_{i+1},\alpha) returns a cycle CC then wt(C)2(ri+11+α)wt(C)\leq 2(r_{i+1}-1+\alpha). By Claim 6.1(i), ri+1<r1r_{i+1}<r_{1}. In addition, since k+1=q(α1)+r1k+1=q(\alpha-1)+r_{1} with q1q\geq 1, we have ri+11+α<r11+α=(α1)+r1k+1r_{i+1}-1+\alpha<r_{1}-1+\alpha=(\alpha-1)+r_{1}\leq k+1. As ri+11+αr_{i+1}-1+\alpha and kk are integers, we have ri+11+αkr_{i+1}-1+\alpha\leq k. Therefore, wt(C)2(ri+11+α)2kwt(C)\leq 2(r_{i+1}-1+\alpha)\leq 2k.

  2. (ii)

    Next, we show that if a cycle is not returned then |Euα1|<Dα1|E_{u}^{\alpha-1}|<D^{\alpha-1}, for every vertex uG2u\in G_{2}. To do so, we show that if HandleReminder does not return a cycle then when the algorithm ends, for every vertex uG2u\in G_{2} we have |Eur+1|<Dr+1|E_{u}^{r_{\ell+1}}|<D^{r_{\ell+1}}.

    If we do not enter the while loop and =0\ell=0 then by our assumption |Eur1|<Dr1|E_{u}^{r_{1}}|<D^{r_{1}}, for every uVu\in V, as required. Now we assume that we enter the loop so >0\ell>0. Consider the iith iteration of the while loop. During the iith iteration SparseOrCycle(G,D,ri+1,α)\texttt{SparseOrCycle}(G,D,r_{i+1},\alpha) is called. If SparseOrCycle does not return a cycle, then it follows from Lemma 3.6 that if uV^u\in\hat{V} (uu was not removed) then |Euri+1|<Dri+1|E_{u}^{r_{i+1}}|<D^{r_{i+1}}. Therefore, if no cycle was returned by HandleReminder we get for i=i=\ell that after the \ellth iteration, we have |Eur+1|<Dr+1|E_{u}^{r_{\ell+1}}|<D^{r_{\ell+1}}, for every uG2u\in G_{2}.

The while loop ends when (α−1) mod r = 0. Therefore, we know that after the while loop, (α−1) mod r_{ℓ+1} = 0. Additionally, by Claim 6.1(i), r_{ℓ+1} < α−1, so there is an integer z > 0 such that α−1 = z·r_{ℓ+1}. By Corollary 3.2, since |E_u^{r_{ℓ+1}}| < D^{r_{ℓ+1}}, we know that |E_u^{α−1}| = |E_u^{z·r_{ℓ+1}}| < D^{z·r_{ℓ+1}} = D^{α−1}, as required.

  3. (iii)

    Next, we prove that if uG1G2u\in G_{1}\setminus G_{2} then uu is not on a C2αC_{\leq 2\alpha} in G1G_{1}. If uG1G2u\in G_{1}\setminus G_{2} then by the definition of G1G_{1} and G2G_{2}, uu was removed while executing HandleReminder. During the run of HandleReminder, a vertex can be removed only by SparseOrCycle(G,D,ri+1,α)\texttt{SparseOrCycle}(G,D,r_{i+1},\alpha) for some 1i1\leq i\leq\ell. Therefore, by Lemma 3.6, uu is not part of a C2αC_{\leq 2\alpha} in GG. Since during the run of HandleReminder only vertices that are not part of a C2αC_{\leq 2\alpha} are removed, uu is not part of a C2αC_{\leq 2\alpha} also in G1G_{1}.

  4. (iv)

    Finally, we show that HandleReminder runs in O(r1m1+α1k+1)O(r_{1}\cdot m^{1+\frac{\alpha-1}{k+1}}) time. To do so, we show that the running time of the iith iteration of the while loop is O(m1+α1k+1)O(m^{1+\frac{\alpha-1}{k+1}}). When =0\ell=0 there are no iterations and the running time is O(1)O(1). Now we assume that >0\ell>0. During the iith iteration, we call SparseOrCycle(G,D,ri+1,α)\texttt{SparseOrCycle}(G,D,r_{i+1},\alpha). We proved in (ii) that for i>1i>1 if SparseOrCycle did not return a cycle during the (i1)(i-1)th iteration and uV^u\in\hat{V} (uu was not removed) then after the (i1)(i-1)th iteration |Euri|<Dri|E_{u}^{r_{i}}|<D^{r_{i}}. For i=1i=1 by our assumption |Eur1|<Dr1|E_{u}^{r_{1}}|<D^{r_{1}}. Therefore, before the iith iteration starts, by Corollary 3.2, |Euzri|<Dzri|E_{u}^{zr_{i}}|<D^{zr_{i}} for every integer z>0z>0. By Claim 6.1(ii), ri+11+αr_{i+1}-1+\alpha is divisible by rir_{i}, so |Euri+11+α|<Dri+11+α|E_{u}^{r_{i+1}-1+\alpha}|<D^{r_{i+1}-1+\alpha} for every vertex uu. It then follows from Corollary 3.3 that SparseOrCycle(G,D,ri+1,α)\texttt{SparseOrCycle}(G,D,r_{i+1},\alpha) runs in O(nDri+1+mDα1)O(nD^{r_{i+1}}+mD^{\alpha-1}) time. By Claim 6.1(i), ri+1<α1r_{i+1}<\alpha-1 so O(nDri+1+mDα1)O(mDα1)=O(m1+α1k+1)O(nD^{r_{i+1}}+mD^{\alpha-1})\leq O(mD^{\alpha-1})=O(m^{1+\frac{\alpha-1}{k+1}}), and the running time of the iith iteration is O(m1+α1k+1)O(m^{1+\frac{\alpha-1}{k+1}}). Now, it follows from Claim 6.1(i) that after at most r1<α1r_{1}<\alpha-1 iterations, the value of rr cannot decrease anymore (since it cannot become less than 11, and (α1)mod1=0(\alpha-1)\bmod 1=0) so the while loop ends. As we saw, the running time of each iteration is O(m1+α1k+1)O(m^{1+\frac{\alpha-1}{k+1}}), hence the total running time of the while loop is O(r1m1+α1k+1)O(r_{1}\cdot m^{1+\frac{\alpha-1}{k+1}}). ∎

Now we are ready to prove the correctness and running time of ShortCycleSparse.

Lemma 6.4.

Let 2αk<n22\leq\alpha\leq k<\frac{n}{2} such that k+1=q(α1)+rk+1=q(\alpha-1)+r, where q1q\geq 1 and 0r<α10\leq r<\alpha-1 are integers. Algorithm ShortCycleSparse(G,k,α)\texttt{ShortCycleSparse}(G,k,\alpha) runs whp in O~((q+r)m1+α1k+1)\tilde{O}((q+r)\cdot m^{1+\frac{\alpha-1}{k+1}}) time and either returns a C2kC_{\leq 2k}, or determines that g>2αg>2\alpha.

Proof.

First, ShortCycleSparse(G,k,α)\texttt{ShortCycleSparse}(G,k,\alpha) returns a cycle CC only if BfsSample(G,k,α)\texttt{BfsSample}(G,k,\alpha), HandleReminder(G,k,α)\texttt{HandleReminder}(G,k,\alpha), or AllVtxBallOrCycle(G,α)\texttt{AllVtxBallOrCycle}(G,\alpha) returns a cycle CC. If BfsSample or AllVtxBallOrCycle returns a cycle CC then by Lemma 6.2 or by Lemma 3.3, wt(C)2kwt(C)\leq 2k. If HandleReminder returns a cycle CC then by Lemma 6.3, since kαk\geq\alpha and hence q1q\geq 1, wt(C)2kwt(C)\leq 2k.

Second, we show that if no cycle was found then g>2αg>2\alpha. If no cycle was found then it might be that some vertices were removed from the graph. A vertex uu can be removed either by BfsSample or by HandleReminder. It follows from Lemma 6.2 and Lemma 6.3, that uu is not part of a C2αC_{\leq 2\alpha} when uu is removed. Since only vertices that are not on a C2αC_{\leq 2\alpha} are removed, every C2αC_{\leq 2\alpha} that was in the input graph also belongs to the updated graph. After the (possible) removal of vertices, we call AllVtxBallOrCycle(G,α)\texttt{AllVtxBallOrCycle}(G,\alpha) with the updated graph. Since we are in the case that no cycle was found, AllVtxBallOrCycle did not return a cycle. It follows from Lemma 3.3 that g>2αg>2\alpha in the updated graph, and therefore g>2αg>2\alpha also in the input graph.

Now we turn to analyze the running time of ShortCycleSparse. At the beginning, ShortCycleSparse calls BfsSample. By Lemma 6.2, BfsSample runs in O~(k+1α1m1+α1k+1)=O~(qm1+α1k+1)\tilde{O}(\lfloor\frac{k+1}{\alpha-1}\rfloor\cdot m^{1+\frac{\alpha-1}{k+1}})=\tilde{O}(q\cdot m^{1+\frac{\alpha-1}{k+1}}) time. Let G1G_{1} be the graph after the call to BfsSample. Recall that y=k+1α11y=\lceil\frac{k+1}{\alpha-1}\rceil-1. By Lemma 6.2, for every uG1u\in G_{1} we have |Eu(k+1)y(α1)|<m1y(α1)k+1|E_{u}^{(k+1)-y\cdot(\alpha-1)}|<m^{1-\frac{y\cdot(\alpha-1)}{k+1}}, whp. If r>0r>0 then y=qy=q and |Eur|<mrk+1|E_{u}^{r}|<m^{\frac{r}{k+1}}. If r=0r=0 then y=q1y=q-1 and |Euα1|<mα1k+1|E_{u}^{\alpha-1}|<m^{\frac{\alpha-1}{k+1}}.

Next, ShortCycleSparse checks whether r = (k+1) mod (α−1) > 0. We divide the rest of the proof into the case that r = 0 and the case that r > 0. If r = 0 then ShortCycleSparse calls AllVtxBallOrCycle. Since r = 0, we have |E_u^{α−1}| < m^{(α−1)/(k+1)} after BfsSample. By Corollary 3.1, the running time of AllVtxBallOrCycle is O(m · m^{(α−1)/(k+1)}) = O(m^{1+(α−1)/(k+1)}).

We now turn to the case that $r>0$. In this case it might be that $|E_{u}^{\alpha-1}|\geq m^{\frac{\alpha-1}{k+1}}$ for some vertices $u\in G_{1}$. Therefore, we first call $\texttt{HandleReminder}(G,k,\alpha)$, knowing that $r>0$, $q\geq 1$ and that whp, $|E_{u}^{r}|<m^{\frac{r}{k+1}}$ after BfsSample. By Lemma 6.3, the running time is $O(r\cdot m^{1+\frac{\alpha-1}{k+1}})$. Let $G_{2}$ be the graph after HandleReminder ends. By Lemma 6.3, for every $u\in G_{2}$ we have $|E_{u}^{\alpha-1}|<D^{\alpha-1}$, where $D=m^{\frac{1}{k+1}}$. Now ShortCycleSparse calls AllVtxBallOrCycle, and using Corollary 3.1 again, we get that the running time of AllVtxBallOrCycle is $O(m\cdot m^{\frac{\alpha-1}{k+1}})=O(m^{1+\frac{\alpha-1}{k+1}})$.

It follows from the above discussion that ShortCycleSparse either returns a $C_{\leq 2k}$ or determines that $g>2\alpha$, and the running time is, whp, $\tilde{O}((q+r)\cdot m^{1+\frac{\alpha-1}{k+1}})$. (Throughout the run of ShortCycleSparse, some of the bounds that we get on $|E_{v}^{d}|$ for vertices $v\in V$ and distances $d$ hold whp, because the sets that we sample are hitting sets whp (see the proof of Lemma 6.2). Therefore, the running times of BfsSample, HandleReminder and AllVtxBallOrCycle are also whp, since they rely on these bounds.) ∎

Since ShortCycleSparse is run by ShortCycle when $m\leq O(n^{1+\frac{1}{k}})$ and $\alpha\leq k$, so that $1+\frac{\alpha-1}{k+1}\leq 2$, we have $m^{1+\frac{\alpha-1}{k+1}}\leq O(n^{1+\frac{\alpha}{k}})$. Thus, the running time of ShortCycleSparse is whp $\tilde{O}((q+r)\cdot\min\{m^{1+\frac{\alpha-1}{k+1}},n^{1+\frac{\alpha}{k}}\})$, which is at most $\tilde{O}((\frac{k+1}{\alpha-1}+\alpha)\cdot\min\{m^{1+\frac{\alpha-1}{k+1}},n^{1+\frac{\alpha}{k}}\})$.
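The exponent comparison above rests on the exact identity $(1+\frac{1}{k})(1+\frac{\alpha-1}{k+1})=1+\frac{\alpha}{k}$, which is why $m\leq O(n^{1+\frac{1}{k}})$ yields $m^{1+\frac{\alpha-1}{k+1}}\leq O(n^{1+\frac{\alpha}{k}})$. The identity can be confirmed with exact rational arithmetic (an illustration only, not part of the algorithm):

```python
from fractions import Fraction

# Check (1 + 1/k) * (1 + (alpha-1)/(k+1)) == 1 + alpha/k exactly,
# for all integer pairs 2 <= alpha <= k <= 40.
for k in range(2, 41):
    for alpha in range(2, k + 1):
        lhs = (1 + Fraction(1, k)) * (1 + Fraction(alpha - 1, k + 1))
        rhs = 1 + Fraction(alpha, k)
        assert lhs == rhs, (k, alpha)
print("identity holds for all tested (k, alpha)")
```

Indeed, $(1+\frac{1}{k})(1+\frac{\alpha-1}{k+1})=\frac{k+1}{k}\cdot\frac{k+\alpha}{k+1}=\frac{k+\alpha}{k}=1+\frac{\alpha}{k}$.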

6.2.1 Algorithm SpecialCases

We now present algorithm SpecialCases that handles special cases of $k$ and $\alpha$. $\texttt{SpecialCases}(G,k,\alpha)$ gets as input a graph $G$ and two integers $k\geq 2$ and $\alpha\geq 2$.

If $k\geq\frac{n}{2}$, the algorithm simply runs $\texttt{BallOrCycle}(G,w,n)$ from an arbitrary vertex $w\in V$ in $O(n)\leq O(\min\{m^{1+\frac{\alpha-1}{k+1}},n^{1+\frac{\alpha}{k}}\})$ time, to check whether $G$ contains a cycle. If a cycle is found then its length is at most $n\leq 2k$, and we return a $C_{\leq 2k}$. Otherwise, $g=\infty$, so for every integer $\alpha$ we return that $g>2\alpha$.
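The cycle check in this case amounts to a single BFS that stops at the first non-tree edge. The following stand-in is a hypothetical sketch (the paper's BallOrCycle also controls the length of the reported cycle, which this sketch does not); `adj` is an adjacency-list dictionary of a simple undirected graph:

```python
from collections import deque

def bfs_find_cycle(adj, w):
    """Return a list of vertices forming a cycle reachable from w, or None.

    Plain BFS: the first non-tree edge closes a cycle through the BFS tree.
    (Hypothetical stand-in for BallOrCycle, for illustration only.)
    """
    parent = {w: None}
    queue = deque([w])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
            elif v != parent[u]:
                # Non-tree edge (u, v): splice the two tree paths at their
                # lowest common ancestor to recover an explicit cycle.
                path_u, path_v = [u], [v]
                while path_u[-1] is not None:
                    path_u.append(parent[path_u[-1]])
                while path_v[-1] is not None:
                    path_v.append(parent[path_v[-1]])
                su, sv = path_u[:-1], path_v[:-1]  # drop trailing None
                set_u = set(su)
                lca = next(x for x in sv if x in set_u)
                return su[:su.index(lca) + 1] + sv[:sv.index(lca)][::-1]
    return None
```

For example, on a triangle the sketch returns its three vertices, and on a path it returns `None`, matching the two outcomes used above (a cycle of length at most $n$, or the conclusion $g=\infty$).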

If $k\leq\alpha-1$, we check if the graph contains a $C_{d}$ for $d=3,\ldots,2\alpha$, using an algorithm of Alon, Yuster and Zwick [2]. Their algorithm decides whether $G$ contains a $C_{2\ell-1}$ or a $C_{2\ell}$, and finds such cycles if it does, in $O(m^{2-\frac{1}{\ell}})$ time. Applying this algorithm with increasing cycle lengths up to $2\alpha$ (in the worst case, $\ell$ takes the values $2,3,\ldots,\alpha$), we can either find the shortest cycle or determine that $g>2\alpha$. The running time is $O(m^{2-\frac{1}{2}}+m^{2-\frac{1}{3}}+\dots+m^{2-\frac{1}{\alpha}})=O(\alpha\cdot m^{2-\frac{1}{\alpha}})=O(\alpha\cdot m^{1+\frac{\alpha-1}{\alpha}})$. (It is possible to modify the algorithm of Alon et al. [2] to search in $O(m^{2-\frac{1}{\ell}})$ time for a shortest cycle of length at most $2\ell$, instead of exactly $2\ell-1$ or $2\ell$, and then run it only with $\ell=\alpha$, to avoid the $\alpha$ factor in the running time.) Since $k\leq\alpha-1$, we have $k+1\leq\alpha$ and therefore $O(\alpha\cdot m^{1+\frac{\alpha-1}{\alpha}})\leq O(\alpha\cdot m^{1+\frac{\alpha-1}{k+1}})$. In addition, since $m\leq O(n^{1+\frac{1}{k}})$ and since $1+\frac{\alpha-1}{\alpha}\leq 2$, we have $m^{1+\frac{\alpha-1}{\alpha}}\leq O(n^{(1+\frac{1}{k})\cdot(1+\frac{\alpha-1}{\alpha})})\leq O(n^{(1+\frac{1}{k})\cdot(1+\frac{\alpha-1}{k+1})})=O(n^{1+\frac{\alpha}{k}})$. Therefore, the running time is $O(\alpha\cdot m^{1+\frac{\alpha-1}{\alpha}})\leq O(\alpha\cdot\min\{m^{1+\frac{\alpha-1}{k+1}},n^{1+\frac{\alpha}{k}}\})$.

By choosing which algorithm to run according to the relation between $k$, $\alpha$ and $\frac{n}{2}$, we get that for every two integers $\alpha\geq 2$ and $k\geq 2$, algorithm $\texttt{ShortCycle}(G,k,\alpha)$ runs whp in $\tilde{O}((\frac{k+1}{\alpha-1}+\alpha)\cdot\min\{m^{1+\frac{\alpha-1}{k+1}},n^{1+\frac{\alpha}{k}}\})$ time and either returns a $C_{\leq\max\{2k,g\}}$, or determines that $g>2\alpha$. This completes the proof of Theorem 6.1.

7 Approximation of the girth

In this section we present two new tradeoffs for girth approximation that follow from Corollary 6.1. In these tradeoffs we use ShortCycle with $2\leq\alpha\leq k$, so by Corollary 6.1, ShortCycle is an $\tilde{O}((\frac{k+1}{\alpha-1}+\alpha)\cdot\min\{m^{1+\frac{\alpha-1}{k+1}},n^{1+\frac{\alpha}{k}}\})$-time, $(2k,2\alpha)$-hybrid algorithm.

7.1 Dense graphs

Kadria et al. [7] presented an $O((\alpha-c)\cdot n^{1+\frac{\alpha}{2\alpha-c}})$-time algorithm that either returns a $C_{\leq 4\alpha-2c}$, or determines that $g>2\alpha$, where $c$ and $\alpha$ are integers with $0<c\leq\alpha$. This is a $(4\alpha-2c,2\alpha)$-hybrid algorithm which, combined with a binary search, was used by [7] to compute for every $\varepsilon\in(0,1]$ a cycle $C$ such that $wt(C)\leq 4\lceil\frac{g}{2}\rceil-2\lfloor\varepsilon\lceil\frac{g}{2}\rceil\rfloor\leq(2-\varepsilon)g+4$, in $\widetilde{O}(n^{1+1/(2-\varepsilon)})$ time, if $g\leq\log^{2}n$. We use ShortCycle in a similar way and prove:

Theorem 7.1.

Let $\ell\geq 2$ be an integer, $\varepsilon\in[0,1]$ and $g\leq\log^{2}n$. It is possible to compute, whp, in $\widetilde{O}(\ell\cdot n^{1+1/(\ell-\varepsilon)})$ time, a cycle $C$ such that $wt(C)\leq 2\ell\lceil\frac{g}{2}\rceil-2\lfloor\varepsilon\lceil\frac{g}{2}\rceil\rfloor\leq(\ell-\varepsilon)g+\ell+2$.

Proof.

For each $\tilde{g}$ in the range $[3,\log^{2}n]$ in increasing order, we call $\texttt{ShortCycle}(G,k(\alpha_{\tilde{g}}),\alpha_{\tilde{g}})$, where $\alpha_{\tilde{g}}=\lceil\frac{\tilde{g}}{2}\rceil$ and $k(\alpha)=\ell\alpha-\lfloor\varepsilon\alpha\rfloor$. When we find the smallest value $\tilde{g}$ for which ShortCycle returns a cycle, we stop and return that cycle. Since $\ell\geq 2$ and $\varepsilon\leq 1$ we have $k(\alpha_{\tilde{g}})\geq\alpha_{\tilde{g}}$, and it follows from Corollary 6.1 that ShortCycle either returns a $C_{\leq 2k(\alpha_{\tilde{g}})}$ or determines that $g>2\alpha_{\tilde{g}}\geq\tilde{g}$, in $\tilde{O}((\frac{k(\alpha)+1}{\alpha-1}+\alpha)\cdot n^{1+\frac{\alpha}{k(\alpha)}})$ time, whp.

We first prove that the algorithm returns a cycle $C$ such that $wt(C)\leq 2\ell\lceil\frac{g}{2}\rceil-2\lfloor\varepsilon\lceil\frac{g}{2}\rceil\rfloor\leq(\ell-\varepsilon)g+\ell+2$. Let $g^{\prime}$ be the smallest value $\tilde{g}$ for which ShortCycle returned a cycle. This implies that for $g^{\prime}-1$ the algorithm did not return a cycle, and hence $g>g^{\prime}-1$. Since $g$ and $g^{\prime}$ are integers, we have $g\geq g^{\prime}$. Also for $g^{\prime}=3$ we have $g\geq g^{\prime}$, since the girth is at least $3$.

The call to $\texttt{ShortCycle}(G,k(\alpha_{g^{\prime}}),\alpha_{g^{\prime}})$ returns a cycle $C$ such that $wt(C)\leq 2k(\alpha_{g^{\prime}})=2(\ell\alpha_{g^{\prime}}-\lfloor\varepsilon\alpha_{g^{\prime}}\rfloor)=2\ell\lceil\frac{g^{\prime}}{2}\rceil-2\lfloor\varepsilon\lceil\frac{g^{\prime}}{2}\rceil\rfloor\leq 2\ell\lceil\frac{g}{2}\rceil-2\lfloor\varepsilon\lceil\frac{g}{2}\rceil\rfloor$. Thus, $wt(C)\leq 2\ell\lceil\frac{g}{2}\rceil-2\lfloor\varepsilon\lceil\frac{g}{2}\rceil\rfloor\leq 2\ell\lceil\frac{g}{2}\rceil-2\lfloor\varepsilon\frac{g}{2}\rfloor\leq 2\ell\lceil\frac{g}{2}\rceil-2(\varepsilon\frac{g}{2}-1)\leq 2\ell\frac{g+1}{2}-\varepsilon g+2=\ell g+\ell-\varepsilon g+2=(\ell-\varepsilon)g+\ell+2$.

For the running time, there are at most $O(\log^{2}n)$ calls to ShortCycle, and each call costs $\tilde{O}((\frac{k(\alpha)+1}{\alpha-1}+\alpha)\cdot n^{1+\frac{\alpha}{k(\alpha)}})$ whp (with the values of $k$ and $\alpha$ that correspond to that call). In each call, $\frac{k(\alpha)+1}{\alpha-1}=\frac{\ell\alpha-\lfloor\varepsilon\alpha\rfloor+1}{\alpha-1}\leq\frac{\ell\alpha+1}{\alpha-1}$, which is at most $2\ell+1$ since $\alpha\geq 2$. In addition, $n^{1+\frac{\alpha}{k(\alpha)}}=n^{1+\frac{\alpha}{\ell\alpha-\lfloor\varepsilon\alpha\rfloor}}\leq n^{1+\frac{\alpha}{\ell\alpha-\varepsilon\alpha}}=n^{1+\frac{1}{\ell-\varepsilon}}$. Thus, the running time of each call is $\tilde{O}((\ell+\alpha)\cdot n^{1+\frac{1}{\ell-\varepsilon}})$ whp, which is $\tilde{O}(\ell\cdot n^{1+\frac{1}{\ell-\varepsilon}})$ since $\alpha\leq\log^{2}n$ in each call. Therefore, the total running time is, whp, $\tilde{O}(\log^{2}n\cdot\ell\cdot n^{1+\frac{1}{\ell-\varepsilon}})=\tilde{O}(\ell\cdot n^{1+\frac{1}{\ell-\varepsilon}})$. (The running time of each call to ShortCycle holds whp. Since the number of calls to ShortCycle is at most $\log^{2}n<n$, a union bound argument as in the proof of Lemma 6.2 shows that the total running time of all the calls is also whp.) ∎
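The scan over $\tilde{g}$ in the proof can be sketched as the following loop. Here `short_cycle` is a caller-supplied stand-in for the hybrid algorithm ShortCycle (stubbed in this sketch, since its internals are beyond this section); the sketch only illustrates the control flow and the choice $k(\alpha)=\ell\alpha-\lfloor\varepsilon\alpha\rfloor$:

```python
import math

def approx_girth_dense(short_cycle, n, ell, eps):
    """Scan candidate girth values g~ = 3, 4, ..., log^2 n in increasing
    order and return the first cycle found.

    short_cycle(k, alpha) must return a cycle (length <= 2k) or None,
    the latter certifying g > 2*alpha."""
    limit = max(3, int(math.log(n) ** 2))
    for g_tilde in range(3, limit + 1):
        alpha = -(-g_tilde // 2)                    # ceil(g~ / 2)
        k = ell * alpha - math.floor(eps * alpha)   # k(alpha)
        cycle = short_cycle(k, alpha)
        if cycle is not None:
            return cycle
    return None  # girth exceeds log^2 n; outside the theorem's regime
```

The first $\tilde{g}$ for which the stand-in returns a cycle plays the role of $g^{\prime}$ in the proof, so the returned cycle has length at most $2k(\alpha_{g^{\prime}})\leq 2\ell\lceil\frac{g}{2}\rceil-2\lfloor\varepsilon\lceil\frac{g}{2}\rceil\rfloor$.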

Our algorithm can be viewed as a natural generalization of two algorithms from [7]. By setting $\ell=2$ in Theorem 7.1 we get the $(2-\varepsilon)g+4$ approximation of [7]. By setting $\varepsilon=0$ we get an $\widetilde{O}(\ell\cdot n^{1+1/\ell})$-time algorithm that computes a $C_{\leq 2\ell\lceil\frac{g}{2}\rceil}$, matching the $\widetilde{O}(n^{1+1/\ell})$-time algorithm of [7] that computes a $C_{\leq 2\ell\lceil\frac{g}{2}\rceil}$.
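As a sanity check on Theorem 7.1, the inequality $2\ell\lceil\frac{g}{2}\rceil-2\lfloor\varepsilon\lceil\frac{g}{2}\rceil\rfloor\leq(\ell-\varepsilon)g+\ell+2$ can be verified exhaustively on a small parameter grid with exact rational arithmetic (an illustration, not a proof):

```python
from fractions import Fraction
from math import floor

# Verify 2*l*ceil(g/2) - 2*floor(eps*ceil(g/2)) <= (l - eps)*g + l + 2
# for l in 2..6, eps in {0, 1/4, 1/2, 3/4, 1}, and g in 3..200.
for ell in range(2, 7):
    for eps in (Fraction(0), Fraction(1, 4), Fraction(1, 2),
                Fraction(3, 4), Fraction(1)):
        for g in range(3, 201):
            h = -(-g // 2)                       # ceil(g/2)
            lhs = 2 * ell * h - 2 * floor(eps * h)
            rhs = (ell - eps) * g + ell + 2
            assert lhs <= rhs, (ell, eps, g)
print("dense bound verified on the grid")
```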

7.2 Sparse graphs

We use a similar approach to obtain a tradeoff for girth approximation in sparse graphs. We prove the following theorem.

Theorem 7.2.

Let $\ell\geq 3$ be an integer, $\varepsilon\in[0,1)$ and $g\leq\log^{2}n$. It is possible to compute, whp, in $\widetilde{O}(\ell\cdot m^{1+1/(\ell-\varepsilon)})$ time, a cycle $C$ such that $wt(C)\leq 2\ell(\lceil\frac{g}{2}\rceil-1)-2\lfloor\varepsilon(\lceil\frac{g}{2}\rceil-1)\rfloor-2\leq(\ell-\varepsilon)g-\ell+2\varepsilon$.

Proof.

For each $\tilde{g}$ in the range $[3,\log^{2}n]$ in increasing order, we call $\texttt{ShortCycle}(G,k(\alpha_{\tilde{g}}),\alpha_{\tilde{g}})$, where $\alpha_{\tilde{g}}=\lceil\frac{\tilde{g}}{2}\rceil$ and $k(\alpha)=\ell(\alpha-1)-\lfloor\varepsilon(\alpha-1)\rfloor-1$. When we find the smallest value $\tilde{g}$ for which ShortCycle returns a cycle, we stop and return that cycle. Since $\ell\geq 3$ and $\varepsilon<1$ we have $k(\alpha_{\tilde{g}})\geq\alpha_{\tilde{g}}$, and it follows from Corollary 6.1 that ShortCycle either returns a $C_{\leq 2k(\alpha_{\tilde{g}})}$ or determines that $g>2\alpha_{\tilde{g}}\geq\tilde{g}$, in $\tilde{O}((\frac{k(\alpha)+1}{\alpha-1}+\alpha)\cdot m^{1+\frac{\alpha-1}{k(\alpha)+1}})$ time, whp.

We first prove that the algorithm returns a cycle $C$ such that $wt(C)\leq 2\ell(\lceil\frac{g}{2}\rceil-1)-2\lfloor\varepsilon(\lceil\frac{g}{2}\rceil-1)\rfloor-2\leq(\ell-\varepsilon)g-\ell+2\varepsilon$. Let $g^{\prime}$ be the smallest value $\tilde{g}$ for which ShortCycle returned a cycle. As before, this implies that $g\geq g^{\prime}$. The call to $\texttt{ShortCycleSparse}(G,k(\alpha_{g^{\prime}}),\alpha_{g^{\prime}})$ returns a cycle $C$ such that

$$\begin{aligned}
wt(C)\leq 2k(\alpha_{g^{\prime}}) &= 2\big(\ell(\alpha_{g^{\prime}}-1)-\lfloor\varepsilon(\alpha_{g^{\prime}}-1)\rfloor-1\big)\\
&= 2\big(\ell(\lceil\tfrac{g^{\prime}}{2}\rceil-1)-\lfloor\varepsilon(\lceil\tfrac{g^{\prime}}{2}\rceil-1)\rfloor-1\big)\\
&\leq 2\big(\ell(\lceil\tfrac{g}{2}\rceil-1)-\lfloor\varepsilon(\lceil\tfrac{g}{2}\rceil-1)\rfloor-1\big)\\
&= 2\ell(\lceil\tfrac{g}{2}\rceil-1)-2\lfloor\varepsilon(\lceil\tfrac{g}{2}\rceil-1)\rfloor-2.
\end{aligned}$$

Now, since $\lfloor\varepsilon(\lceil\frac{g}{2}\rceil-1)\rfloor\geq\lfloor\varepsilon(\frac{g}{2}-1)\rfloor\geq\varepsilon(\frac{g}{2}-1)-1=\varepsilon\frac{g}{2}-\varepsilon-1$, we get that $wt(C)\leq 2\ell(\lceil\frac{g}{2}\rceil-1)-2(\varepsilon\frac{g}{2}-\varepsilon-1)-2=2\ell\lceil\frac{g}{2}\rceil-2\ell-\varepsilon g+2\varepsilon\leq 2\ell\frac{g+1}{2}-2\ell-\varepsilon g+2\varepsilon=\ell g+\ell-2\ell-\varepsilon g+2\varepsilon=(\ell-\varepsilon)g-\ell+2\varepsilon$.

For the running time, there are at most $O(\log^{2}n)$ calls to ShortCycle, and each call costs $\tilde{O}((\frac{k(\alpha)+1}{\alpha-1}+\alpha)\cdot m^{1+\frac{\alpha-1}{k(\alpha)+1}})$ whp (with the values of $k$ and $\alpha$ that correspond to that call). We have $\frac{k(\alpha)+1}{\alpha-1}=\frac{\ell(\alpha-1)-\lfloor\varepsilon(\alpha-1)\rfloor}{\alpha-1}\leq\frac{\ell(\alpha-1)}{\alpha-1}=\ell$. In addition, $m^{1+\frac{\alpha-1}{k(\alpha)+1}}=m^{1+\frac{\alpha-1}{\ell(\alpha-1)-\lfloor\varepsilon(\alpha-1)\rfloor}}\leq m^{1+\frac{\alpha-1}{\ell(\alpha-1)-\varepsilon(\alpha-1)}}=m^{1+\frac{1}{\ell-\varepsilon}}$. Thus, the running time of each call is $\tilde{O}((\ell+\alpha)\cdot m^{1+\frac{1}{\ell-\varepsilon}})$ whp, which is $\tilde{O}(\ell\cdot m^{1+\frac{1}{\ell-\varepsilon}})$ since $\alpha\leq\log^{2}n$ in each call. Therefore, the total running time is, whp (by the same union bound argument as in the proof of Theorem 7.1), $\tilde{O}(\log^{2}n\cdot\ell\cdot m^{1+\frac{1}{\ell-\varepsilon}})=\tilde{O}(\ell\cdot m^{1+\frac{1}{\ell-\varepsilon}})$. ∎

By setting $\varepsilon=0$ we get an $\widetilde{O}(\ell\cdot m^{1+1/\ell})$-time algorithm that computes a $C_{\leq 2\ell(\lceil\frac{g}{2}\rceil-1)}$, as opposed to the $\widetilde{O}(\ell\cdot n^{1+1/\ell})$-time algorithm of Theorem 7.1 that computes a $C_{\leq 2\ell\lceil\frac{g}{2}\rceil}$.
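As a sanity check on Theorem 7.2, the inequality $2\ell(\lceil\frac{g}{2}\rceil-1)-2\lfloor\varepsilon(\lceil\frac{g}{2}\rceil-1)\rfloor-2\leq(\ell-\varepsilon)g-\ell+2\varepsilon$ can likewise be verified exhaustively on a small parameter grid with exact rational arithmetic (an illustration, not a proof):

```python
from fractions import Fraction
from math import floor

# Verify 2*l*(ceil(g/2)-1) - 2*floor(eps*(ceil(g/2)-1)) - 2
#        <= (l - eps)*g - l + 2*eps
# for l in 3..6, eps in {0, 1/4, 1/2, 3/4, 7/8}, and g in 3..200.
for ell in range(3, 7):
    for eps in (Fraction(0), Fraction(1, 4), Fraction(1, 2),
                Fraction(3, 4), Fraction(7, 8)):
        for g in range(3, 201):
            t = -(-g // 2) - 1                   # ceil(g/2) - 1
            lhs = 2 * ell * t - 2 * floor(eps * t) - 2
            rhs = (ell - eps) * g - ell + 2 * eps
            assert lhs <= rhs, (ell, eps, g)
print("sparse bound verified on the grid")
```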

References

  • [1] Donald Aingworth, Chandra Chekuri, Piotr Indyk, and Rajeev Motwani. Fast estimation of diameter and shortest paths (without matrix multiplication). SIAM Journal on Computing, 28(4):1167–1181, 1999.
  • [2] Noga Alon, Raphael Yuster, and Uri Zwick. Finding and counting given length cycles. Algorithmica, 17(3):209–223, 1997.
  • [3] Shiri Chechik, Yang P. Liu, Omer Rotem, and Aaron Sidford. Constant girth approximation for directed graphs in subquadratic time. In Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing, STOC 2020, Chicago, IL, USA, June 22-26, 2020, pages 1010–1023. ACM, 2020.
  • [4] Søren Dahlgaard, Mathias Bæk Tejs Knudsen, and Morten Stöckel. Finding even cycles faster via capped k-walks. In Hamed Hatami, Pierre McKenzie, and Valerie King, editors, Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017, Montreal, QC, Canada, June 19-23, 2017, pages 112–120. ACM, 2017.
  • [5] Søren Dahlgaard, Mathias Bæk Tejs Knudsen, and Morten Stöckel. New subquadratic approximation algorithms for the girth. arXiv preprint arXiv:1704.02178, 2017.
  • [6] Alon Itai and Michael Rodeh. Finding a minimum circuit in a graph. In Proceedings of the ninth annual ACM symposium on Theory of computing, pages 1–10, 1977.
  • [7] Avi Kadria, Liam Roditty, Aaron Sidford, Virginia Vassilevska Williams, and Uri Zwick. Algorithmic trade-offs for girth approximation in undirected graphs. In Proceedings of the 2022 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1471–1492. SIAM, 2022.
  • [8] Andrzej Lingas and Eva-Marta Lundell. Efficient approximation algorithms for shortest cycles in undirected graphs. Information Processing Letters, 109(10):493–498, 2009.
  • [9] Liam Roditty and Roei Tov. Approximating the girth. ACM Transactions on Algorithms (TALG), 9(2):1–13, 2013.
  • [10] Liam Roditty and Virginia Vassilevska Williams. Fast approximation algorithms for the diameter and radius of sparse graphs. In Proceedings of the forty-fifth annual ACM symposium on Theory of computing, pages 515–524, 2013.
  • [11] Liam Roditty and Virginia Vassilevska Williams. Subquadratic time approximation algorithms for the girth. In Proceedings of the Twenty-Third Annual ACM-SIAM Symposium on Discrete Algorithms, pages 833–845. SIAM, 2012.
  • [12] Virginia Vassilevska Williams and R. Ryan Williams. Subcubic equivalences between path, matrix, and triangle problems. J. ACM, 65(5):27:1–27:38, 2018.
  • [13] Virginia Vassilevska Williams, Yinzhan Xu, Zixuan Xu, and Renfei Zhou. New bounds for matrix multiplication: from alpha to omega. In Proceedings of the 2024 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 3792–3835. SIAM, 2024.