
Remarks on power-law random graphs

Mei Yin Department of Mathematics, University of Denver, Denver, CO 80208 mei.yin@du.edu

The theory of graphons is an important tool in understanding properties of large networks. We investigate a power-law random graph model and cast it in the graphon framework. The distinctively different structures of the limit graph are explored in detail in the sub-critical and super-critical regimes. In the sub-critical regime, the graph is empty with high probability, and in the rare event that it is non-empty, it consists of a single edge. Contrarily, in the super-critical regime, a non-trivial random graph exists in the limit, and it serves as an uncovered boundary case between different types of graph convergence.

Keywords Power-law random graph · Graph limit · Sub-critical and super-critical regimes

Mathematics Subject Classification 05C80 · 82B26

1. Introduction

The advent of large networks in recent decades has posed exciting challenges for researchers. Many interesting questions have been asked, ranging from a notion of limit distribution for sequences of graphs, to the understanding of local and global characteristics of large graphs, as well as the existence of effective algorithms to generate graphs with desired properties, and many more. Following the earlier work of Aldous [1] and Hoover [12], Lovász and coauthors (V.T. Sós, B. Szegedy, C. Borgs, J. Chayes, K. Vesztergombi, and others) have constructed an elegant theory of graph limits in a sequence of papers [8] [9] [14] to address these questions. The graph limit theory sheds light on various topics such as graph testing and extremal graph theory, and has found applications in statistics and related areas (see for instance Bhattacharya et al. [2], Bickel and Chen [3], Chatterjee et al. [10], and Mukherjee and Xu [15]). For a comprehensive study on the theory of graph limits, we refer to van der Hofstad [11] and Lovász [13].

We present some basics of the graph limit theory in the context of dense graphs (number of edges comparable to the square of the number of vertices), which was the initial setting for this theory. Let $\mathcal{W}$ be the space of all symmetric, measurable functions $W:[0,1]^{2}\to[0,1]$ (referred to as the graph limit space or graphon space). Any simple graph $G_{n}$, irrespective of the number of vertices $n$, may be represented as an element $W_{n}\in\mathcal{W}$, such that

$$W_{n}(x,y)=\begin{cases}1&\text{if }\{\lceil nx\rceil,\lceil ny\rceil\}\text{ is an edge in }G_{n},\\ 0&\text{otherwise.}\end{cases}\tag{1.1}$$

A sequence of graphs $\{G_{n}\}_{n\geq 1}$ is said to converge to a function $W\in\mathcal{W}$ if for every finite simple graph $H$ with vertex set $V(H)=[k]=\{1,\dots,k\}$ and edge set $E(H)$,

$$\lim_{n\to\infty}t(H,W_{n})=t(H,W),\tag{1.2}$$

where by construction,

$$t(H,W_{n})=\frac{|\text{hom}(H,G_{n})|}{|V(G_{n})|^{|V(H)|}}\tag{1.3}$$

equals the density of graph homomorphisms from $H$ to $G_{n}$, and

$$t(H,W)=\int_{[0,1]^{k}}\prod_{\{i,j\}\in E(H)}W(x_{i},x_{j})\,dx_{1}\cdots dx_{k}.\tag{1.4}$$

There are generalized versions of this type of convergence for weighted graphs, but the principal ideas are the same. Every function in $\mathcal{W}$ is the limit of a certain convergent graph sequence [14]. Intuitively, the interval $[0,1]$ represents a continuum of vertices, and $W(x,y)$ denotes the probability of putting an edge between $x$ and $y$. For example, for the Erdős-Rényi random graph $G(n,\rho)$, the associated limiting graphon is represented by the function that is identically equal to $\rho$ on $[0,1]^{2}$.
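To make the homomorphism density (1.3) concrete, the following minimal Python sketch counts homomorphisms by brute force for a small graph. The function name `hom_density` and the adjacency-matrix encoding are illustrative assumptions rather than part of the original development, and the enumeration is only practical for small graphs.

```python
from itertools import product


def hom_density(h_edges, k, adj):
    """Homomorphism density t(H, G) as in (1.3).

    h_edges: edges of H as pairs of vertex indices in range(k)
    k:       number of vertices of H
    adj:     symmetric 0/1 adjacency matrix of G (list of lists)
    """
    n = len(adj)
    count = 0
    # Enumerate every map phi: V(H) -> V(G), injective or not.
    for phi in product(range(n), repeat=k):
        # phi is a homomorphism if it sends each edge of H to an edge of G.
        if all(adj[phi[i]][phi[j]] for i, j in h_edges):
            count += 1
    return count / n ** k


if __name__ == "__main__":
    # G is the complete graph on three vertices (hypothetical toy input).
    triangle_graph = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
    print(hom_density([(0, 1)], 2, triangle_graph))                  # edge density
    print(hom_density([(0, 1), (1, 2), (0, 2)], 3, triangle_graph))  # triangle density
```

For the empirical graphon $W_{n}$ of (1.1), this count coincides with the integral (1.4), since $W_{n}$ is constant on each cell of the $n\times n$ grid.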

The graphon interpretation enables us to capture the notion of convergence in terms of subgraph densities (1.2) by an explicit metric on $\mathcal{W}$, the cut distance:

$$d_{\square}(U,V)=\sup_{S,T\subseteq[0,1]}\left|\int_{S\times T}\left(U(x,y)-V(x,y)\right)dx\,dy\right|\tag{1.5}$$

for $U,V\in\mathcal{W}$. A non-trivial complication is that the topology induced by the cut metric is well defined only up to measure preserving transformations of $[0,1]$ (and up to sets of Lebesgue measure zero), which in the context of finite graphs may be thought of as vertex relabeling. To tackle this issue, an equivalence relation $\sim$ is introduced in $\mathcal{W}$. We say that $U\sim V$ if $U(x,y)=V_{\sigma}(x,y):=V(\sigma x,\sigma y)$ for some measure preserving bijection $\sigma$ of $[0,1]$. Let $\tilde{U}$ (referred to as a reduced graphon or unlabeled graphon) denote the closure of the orbit $\{U_{\sigma}\}$ in $(\mathcal{W},d_{\square})$. Since $d_{\square}$ is invariant under $\sigma$, one can then define on the resulting quotient space $\tilde{\mathcal{W}}$ the natural distance $\delta_{\square}$ by $\delta_{\square}(\tilde{U},\tilde{V})=\inf_{\sigma_{1},\sigma_{2}}d_{\square}(U_{\sigma_{1}},V_{\sigma_{2}})$, where the infimum ranges over all measure preserving bijections $\sigma_{1}$ and $\sigma_{2}$, making $(\tilde{\mathcal{W}},\delta_{\square})$ a metric space. With some abuse of notation we also refer to $\delta_{\square}$ as the cut distance. The space $(\tilde{\mathcal{W}},\delta_{\square})$ enjoys many nice properties. For example, it is a compact space and homomorphism densities $t(H,\cdot)$ are continuous functions on it.
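For step functions such as the empirical graphons in (1.1), the supremum in (1.5) is attained on unions of steps, because the integral is bilinear in the fraction of each step that $S$ and $T$ contain. The sketch below uses this observation to evaluate $d_{\square}$ exactly for two small step functions on a common labeling. It does not perform the infimum over relabelings that defines $\delta_{\square}$; the name `cut_norm_step` is hypothetical, and the running time is exponential in the number of steps.

```python
from itertools import product


def cut_norm_step(diff):
    """Cut norm of an n-by-n step function given by the matrix diff = U - V.

    Returns max over S, T subsets of range(n) of
    |sum_{i in S, j in T} diff[i][j]| / n^2.
    """
    n = len(diff)
    best = 0.0
    for s_mask in product((0, 1), repeat=n):
        # Column sums restricted to the rows selected by S.
        col = [sum(diff[i][j] for i in range(n) if s_mask[i]) for j in range(n)]
        # For a fixed S, the best T takes all positive columns or all negative ones.
        pos = sum(c for c in col if c > 0)
        neg = -sum(c for c in col if c < 0)
        best = max(best, pos, neg)
    return best / n ** 2


if __name__ == "__main__":
    # Single edge on two vertices versus the constant graphon 1/2 (toy input).
    u = [[0.0, 1.0], [1.0, 0.0]]
    v = [[0.5, 0.5], [0.5, 0.5]]
    diff = [[u[i][j] - v[i][j] for j in range(2)] for i in range(2)]
    print(cut_norm_step(diff))
```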

In addition to developing a standard theory of limits for sequences of dense graphs, serious efforts have been made at formulating parallel results for sparse graphs. Unlike dense graphs, sparse graphs display vastly different edge densities. Nevertheless, all simple sparse graphs converge to the zero graphon in the classical theory of graphons. To take into account the wide-ranging edge densities of sparse graphs, a renormalization procedure on their associated graphons has been introduced. The renormalization is executed in two ways, either by rescaling the height of the graphon or by stretching the domain on which it is defined.

The first renormalization approach was introduced in Bollobás and Riordan [4] and Borgs et al. [7], and is characterized by the rescaled cut metric $\delta^{r}_{\square}$. The graphon representation $W_{n}$ of $G_{n}$ is rescaled to $W_{n}^{r}$ by

$$W_{n}^{r}(x,y)=\|W_{n}\|_{1}^{-1}W_{n}(x,y),\tag{1.6}$$

i.e., divide the weight of each edge by the edge density $\|W_{n}\|_{1}$. The rescaled limit graphon $W^{r}$ is in $L^{1}([0,1]^{2})$, and takes values in $[0,\infty)$. Note that for sparse graphs this scaling of the edges is appropriate, as it produces a constant order for $\|W_{n}^{r}\|_{1}$. If we rescale $W_{n}$ so that $\|W_{n}^{r}\|_{1}=o(1)$ instead, then automatically its cut norm (which is upper bounded by the $L^{1}$-norm) is negligible. As for dense graphs, there are generalized versions of this type of convergence for weighted sparse graphs, as well as for $L^{p}$ graphons.

The second renormalization approach was introduced in Borgs et al. [5], and is characterized by the stretched cut metric $\delta^{s}_{\square}$. This time, the graphon representation $W_{n}$ of $G_{n}$ is stretched to $W_{n}^{s}$ by

$$W_{n}^{s}(x,y)=W_{n}\left(\|W_{n}\|_{1}^{1/2}x,\|W_{n}\|_{1}^{1/2}y\right),\tag{1.7}$$

i.e., multiply the input arguments by the square root of the edge density $\|W_{n}\|_{1}$ for valid $x,y$ values. Equivalently, the stretching may be interpreted as rescaling the measure of the underlying measure space. The stretched limit graphon $W^{s}$ is in $L^{1}([0,\infty)^{2})$, and takes values in $[0,1]$. From this rescaling-of-the-measure perspective, graphons on $\sigma$-finite measure spaces may be considered as limiting objects for sequences of sparse graphs, just as graphons on probability spaces are considered as limits of dense graphs. Again there are further generalizations for this type of convergence.

While introducing the rescaled convergence mode in [7], Borgs et al. presented a motivating example. Consider a discrete graph $G_{n}$ of $n$ vertices numbered $1$ through $n$. Connect vertices $i,j$ with probability

$$p_{n}(i,j)=\min\left\{1,n^{\beta}/(ij)^{1/\alpha}\right\}=\min\left\{1,n^{\beta-2/\alpha}(i/n)^{-1/\alpha}(j/n)^{-1/\alpha}\right\},\tag{1.8}$$

where $\alpha>1$ and $\beta\in(0,2/\alpha)$ are parameters. In other words, the edge connection probability between vertices $i,j$ behaves like $(ij)^{-1/\alpha}$, but boosted by a factor of $n^{\beta}$ in case it becomes too small. This configuration model is one of the simplest ways to get a power law degree distribution, as the expected degree of vertex $i$ scales according to an inverse power law in $i$ with exponent $1/\alpha$. The parameter range on $\alpha$ and $\beta$ is taken for practical considerations: $\alpha>1$ avoids having almost all the edges of the graph between a sub-linear number of vertices, and $\beta\in(0,2/\alpha)$ ensures that the cut-off from taking the minimum with $1$ affects only a negligible fraction of the edges. Let $\{E_{i,j}\}_{1\leq i<j\leq n}$ be Bernoulli random variables with parameter $p_{n}(i,j)$ and set $E_{i,j}=E_{j,i}$. For $x,y\in(0,1)$, let

$$W_{n}^{r}(x,y)=\frac{1}{n^{\beta-2/\alpha}}E_{\lceil xn\rceil,\lceil yn\rceil}\tag{1.9}$$

be the empirical graphon rescaled by the expected edge density of the graph. The result in [7] (for more details see [6, Example 3.3.3]) says that

$$\lim_{n\to\infty}W_{n}^{r}(x,y)=(1-1/\alpha)^{2}(xy)^{-1/\alpha}:=W^{r}(x,y).\tag{1.10}$$

The convergence is with respect to the cut metric, and the limit graphon $W^{r}(x,y)$ lies in $L^{p}([0,1]^{2})$ for any $p<\alpha$.

In this paper we will examine a graph model that is closely related to the motivating example discussed above. There are $n$ vertices numbered $1$ through $n$. We adapt the edge connection probability for vertices $i,j$ in two steps as shown below:

$$\min\left\{1,n^{\beta-2/\alpha}(i/n)^{-1/\alpha}(j/n)^{-1/\alpha}\right\}\rightarrow\min\left\{1,\frac{X_{i}X_{j}}{a_{n}}\right\}\rightarrow\mathbf{1}_{\left\{\frac{X_{i}X_{j}}{a_{n}}>1\right\}},\tag{1.11}$$

where $a_{n}=n^{-\beta+2/\alpha}$, $X_{i}\stackrel{d}{=}U_{i}^{-1/\alpha}$, and $U_{i}$ are i.i.d. $(0,1)$-uniform random variables. The first step in the adaptation replaces the discrete normalized vertex labels with a uniform measure, and implicitly relabels the vertices $1,\dots,n$ using the order statistics of their associated random variables $X_{1},\dots,X_{n}$, the latter not having a real impact on the structure of the graph. The second step in the adaptation is more significant. Every edge that our modified model connects is also connected by the original configuration construction in [7]. Call these “hard edges”. However, the original construction is less strict with the edges that we drop: it decides whether to connect each of them by Bernoulli sampling with a probability in $[0,1]$. Call these “Bernoulli edges”. It might therefore be apt to refer to the model in [7] as a power-law random graph with Bernoulli edges and our adapted model as a power-law random graph without Bernoulli edges. Also note that the parameter range investigated in [7] translates to $\alpha>1$ and $a_{n}\ll n^{2/\alpha}$ in our setting.
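The adapted model in (1.11) is straightforward to simulate. The following is a minimal sketch, assuming inverse-transform sampling $X_{i}=U_{i}^{-1/\alpha}$ and a quadratic-time scan over all vertex pairs; the function name `sample_adapted_graph` and its arguments are hypothetical choices made for illustration.

```python
import random


def sample_adapted_graph(n, a_n, alpha, seed=0):
    """Sample vertex values X_i = U_i^(-1/alpha) and the hard edges {X_i X_j > a_n}."""
    rng = random.Random(seed)
    # rng.random() lies in [0, 1); using 1 - rng.random() keeps U in (0, 1]
    # so the negative power never divides by zero.
    x = [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]
    edges = [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if x[i] * x[j] > a_n
    ]
    return x, edges


if __name__ == "__main__":
    values, edges = sample_adapted_graph(n=500, a_n=25.0, alpha=1.5, seed=42)
    print(len(edges), "edges among", len(values), "vertices")
```

Every sampled edge has at least one endpoint with value above $\sqrt{a_{n}}$, since two values that are both at most $\sqrt{a_{n}}$ cannot have a product exceeding $a_{n}$.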

At the critical regime ($a_{n}\sim n^{2/\alpha}$), a realization of this adapted model exhibits a small clique and large numbers of follower vertices asymptotically. The limit structure of the model away from criticality on the other hand is less understood, and will be the central focus of this work. We start with some straightforward calculations in Section 2 and make some quick observations. Section 3 studies the sub-critical regime ($a_{n}\gg n^{2/\alpha}$). We show that although there is no graph in the limit, in the rare event that we do see a non-empty graph, typically it consists of exactly one edge. Section 4 studies the super-critical regime ($a_{n}\ll n^{2/\alpha}$). We show that unlike the original model in [7], universality emerges in the limit graphon of the adapted model. After proper scaling, the parameter influence on the relation between number of vertices/edges disappears asymptotically. The qualitative difference between the limit graph structures in the original model vs. the adapted model is essentially due to the presence of Bernoulli edges, as the number of those edges is of larger order than the number of deterministic ones.

Putting our investigation of the asymptotic graph structure into the context of random graph limits, we will see that in the super-critical regime where there is a non-trivial random graph in the limit, our adapted model serves as an uncovered boundary case between different types of graph convergence. In contrast to the power-law random graph with Bernoulli edges analyzed in [7] whose graphon representation converges in the rescaled cut metric, the graphon representation of our power-law random graph without Bernoulli edges converges under a modified stretched convergence mode. See Theorem 2 and the accompanying implications for details.

2. First estimates

For $X_{i}\stackrel{d}{=}U_{i}^{-1/\alpha}$ where $U_{i}$ are i.i.d. $(0,1)$-uniform, one could take the i.i.d. $X_{i}$ to have pdf $\alpha x^{-\alpha-1}dx\,\mathbf{1}_{\{x\geq 1\}}$ to make all calculations explicit. Given a realization $X_{1},\dots,X_{n}$ of vertex values and a chosen normalization $a_{n}$, we group the non-isolated vertices of the graph into two parts depending on whether $X_{i}>\sqrt{a_{n}}$ or $X_{i}\leq\sqrt{a_{n}}$, respectively referred to as “clique” and “followers”. Since two vertices only get connected when the product of their vertex values exceeds $a_{n}$, a split graph is produced, as vertices are all connected within the clique and form a complete subgraph, while follower vertices can only be connected to clique vertices but not to each other. Let us label the vertices according to decreasing $X_{i}$ vertex values, where $Y_{i}=X_{i:n}$ is the $i$th largest order statistic of i.i.d. random variables $X_{1},\dots,X_{n}$. Assume that the clique consists of $K_{n,0}$ vertices indexed by $1,2,\dots,K_{n,0}$, with vertex values $Y_{1}\geq\cdots\geq Y_{K_{n,0}}>\sqrt{a_{n}}$. If $K_{n,0}=0$ then there is no graph and the structure is trivial. So suppose $K_{n,0}>0$. For each $j=1,\dots,K_{n,0}$, add in addition $K_{n,j}$ vertices that each connect to the clique vertices indexed by $1,2,\dots,K_{n,0}+1-j$, but not to the clique vertices indexed by $K_{n,0}+2-j,\dots,K_{n,0}$ or to any of the other existing vertices. One might for the sake of simplicity set $K_{n,j}=0$ for all $j>K_{n,0}$. Then, $\boldsymbol{K}_{n}=\{K_{n,j}\}_{j\in\mathbb{N}_{0}}$ determines the structure of the random graph completely. For example, the total number of vertices is $|V_{n}|=\sum_{j=0}^{\infty}K_{n,j}$ and the total number of edges is

$$|E_{n}|=\binom{K_{n,0}}{2}+\sum_{j=1}^{K_{n,0}}(K_{n,0}+1-j)K_{n,j}.\tag{2.1}$$
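The edge count (2.1) can be checked against a direct pair count on a realization. The sketch below is one possible way to extract the vector $\boldsymbol{K}_{n}$ from sampled vertex values; the helper names `structure_vector`, `edges_from_structure`, and `edges_direct` are hypothetical, and a follower is assigned to group $j$ when it has exactly $K_{n,0}+1-j$ clique neighbors.

```python
import random
from math import comb, sqrt


def structure_vector(x, a_n):
    """Return [K_{n,0}, K_{n,1}, ..., K_{n,K_{n,0}}] for vertex values x."""
    root = sqrt(a_n)
    clique = sorted((v for v in x if v > root), reverse=True)
    k0 = len(clique)
    counts = [k0] + [0] * k0
    for z in x:
        if z <= root:
            # Number of clique neighbors; these are the d largest clique values.
            d = sum(1 for y in clique if y * z > a_n)
            if d > 0:  # followers with d = 0 are isolated and not counted
                counts[k0 + 1 - d] += 1
    return counts


def edges_from_structure(counts):
    """Edge count according to formula (2.1)."""
    k0 = counts[0]
    return comb(k0, 2) + sum((k0 + 1 - j) * counts[j] for j in range(1, k0 + 1))


def edges_direct(x, a_n):
    """Edge count by checking every pair of vertices."""
    n = len(x)
    return sum(1 for i in range(n) for j in range(i + 1, n) if x[i] * x[j] > a_n)


if __name__ == "__main__":
    rng = random.Random(7)
    alpha, a_n = 1.5, 4.0  # illustrative parameter values
    x = [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(400)]
    k = structure_vector(x, a_n)
    print(edges_from_structure(k), edges_direct(x, a_n))
```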

We further compute the edge connection probability,

$$\begin{aligned}
\mathbb{P}(X_{1}X_{2}>a_{n})&=\int_{1}^{a_{n}}\alpha x_{1}^{-\alpha-1}dx_{1}\int_{\frac{a_{n}}{x_{1}}}^{\infty}\alpha x_{2}^{-\alpha-1}dx_{2}+\int_{a_{n}}^{\infty}\alpha x_{1}^{-\alpha-1}dx_{1}\int_{1}^{\infty}\alpha x_{2}^{-\alpha-1}dx_{2}\\
&=\alpha a_{n}^{-\alpha}\log a_{n}+a_{n}^{-\alpha}\sim\alpha a_{n}^{-\alpha}\log a_{n},
\end{aligned}\tag{2.2}$$

where “$\sim$” indicates that the ratio between the two quantities converges to $1$. At the critical regime $a_{n}\sim n^{2/\alpha}$, this gives the expected total number of edges as

$$\mathbb{E}|E_{n}|=\binom{n}{2}\mathbb{P}(X_{1}X_{2}>a_{n})\sim\log n.\tag{2.3}$$

Despite the logarithmic growth of edges, the average size of the clique stays at $1$ asymptotically,

$$\mathbb{E}K_{n,0}=n\mathbb{P}(X_{1}>\sqrt{a_{n}})=\frac{n}{a_{n}^{\alpha/2}}\sim 1.\tag{2.4}$$

Since $K_{n,0}=\sum_{i=1}^{n}\mathbf{1}_{\{X_{i}/\sqrt{a_{n}}>1\}}$, it is Binomial distributed with parameters $(n,1/n)$. The Binomial distribution falls off fast as one moves away from the mean, with $K_{n,0}=1$ being most probable when a non-trivial graph is produced. Consequently, the average random graph at criticality has a small clique and large numbers of follower vertices.
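The claim about the Binomial$(n,1/n)$ law can be illustrated numerically. The sketch below assumes the success probability is exactly $1/n$ and evaluates the conditional probability that the clique has a single vertex given that it is non-empty; the function names are hypothetical. As $n$ grows, the value approaches the Poisson(1) counterpart $e^{-1}/(1-e^{-1})$.

```python
from math import comb, exp


def binom_pmf(n, p, k):
    """Probability that a Binomial(n, p) variable equals k."""
    return comb(n, k) * p ** k * (1.0 - p) ** (n - k)


def prob_single_given_nonempty(n):
    """P(K = 1 | K >= 1) for K ~ Binomial(n, 1/n)."""
    p = 1.0 / n
    return binom_pmf(n, p, 1) / (1.0 - binom_pmf(n, p, 0))


if __name__ == "__main__":
    for n in (10, 100, 10000):
        print(n, prob_single_given_nonempty(n))
    print("Poisson(1) limit:", exp(-1) / (1.0 - exp(-1)))
```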

To examine the structure of the random graph away from criticality, we introduce an auxiliary parameter $\gamma>0$ and set $a_{n}^{\alpha}=n^{\gamma}\log n$. This explicit parametrization has some nice implications. First, since

$$\left\{X_{i}X_{j}>a_{n}\right\}=\left\{U_{i}U_{j}<\frac{1}{n^{\gamma}\log n}\right\},\tag{2.5}$$

$\alpha$ becomes irrelevant as a parameter, and we can concentrate on tuning the parameter $\gamma$ solely. The sub-critical regime $a_{n}\gg n^{2/\alpha}$ translates to $\gamma\geq 2$ and the super-critical regime $a_{n}\ll n^{2/\alpha}$ translates to $\gamma<2$. Second, under this parametrization, the average number of edges of the graph exhibits an asymptotic growth order that is entirely dependent on $\gamma$,

$$\mathbb{E}|E_{n}|\sim\binom{n}{2}\alpha a_{n}^{-\alpha}\log(a_{n})\sim\frac{\gamma}{2}n^{2-\gamma}.\tag{2.6}$$

Note the difference between this order and the order of the expected number of edges in the clique,

$$\left(\mathbb{E}K_{n,0}\right)^{2}\sim\frac{n^{2}}{a_{n}^{\alpha}}\sim\frac{n^{2-\gamma}}{\log n}.\tag{2.7}$$

This implies that, asymptotically, most of the edges come from followers. See Figure 1 for some simulations.
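As a consistency check on (2.2) and (2.6), the edge probability can also be derived directly from the uniform representation (2.5). The following worked computation is a sketch that introduces the shorthand $c:=a_{n}^{-\alpha}$, which is not used in the original text, and assumes $c<1$.

```latex
% Shorthand (an assumption for this sketch): c := a_n^{-\alpha} = 1/(n^{\gamma}\log n) < 1.
\begin{align*}
\mathbb{P}(U_{1}U_{2}<c)
  &= \int_{0}^{1}\mathbb{P}\!\left(U_{2}<\frac{c}{u}\right)du
   = \int_{0}^{c}1\,du+\int_{c}^{1}\frac{c}{u}\,du
   = c-c\log c \\
  &= a_{n}^{-\alpha}+\alpha a_{n}^{-\alpha}\log a_{n}, \\
\binom{n}{2}\,\alpha a_{n}^{-\alpha}\log a_{n}
  &= \frac{n(n-1)}{2}\cdot\frac{\gamma\log n+\log\log n}{n^{\gamma}\log n}
   \sim\frac{\gamma}{2}\,n^{2-\gamma}.
\end{align*}
```

The first display recovers (2.2), and the second uses $\alpha\log a_{n}=\log(a_{n}^{\alpha})=\gamma\log n+\log\log n$ to recover (2.6).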

Figure 1. Empirical graphons of the adapted model, with $\alpha=1.5$, $\gamma=0.5,0.8,1.1,1.5$, and $n=10000$. Vertices are labeled according to decreasing vertex values.

Other than the simple calculations presented above, many other distinguishing features of our adapted model could be derived explicitly. With some abuse of notation, let us denote the set of non-isolated vertices still by $V_{n}$. We compute the probability that a given vertex is non-isolated.

$$\begin{aligned}
\mathbb{P}\left(\max_{i=2,\dots,n}X_{1}X_{i}>a_{n}\right)&=\int_{1}^{\infty}\alpha x^{-\alpha-1}dx\left[1-\mathbb{P}\left(X_{1}\leq\frac{a_{n}}{x}\right)^{n-1}\right]\\
&=\frac{1}{a_{n}^{\alpha}}+\int_{1}^{a_{n}}\alpha x^{-\alpha-1}dx\left[1-\left(1-\left(\frac{x}{a_{n}}\right)^{\alpha}\right)^{n-1}\right]\\
&=\frac{1}{a_{n}^{\alpha}}+\frac{n}{a_{n}^{\alpha}}\int_{n/a_{n}^{\alpha}}^{n}y^{-2}\left[1-\left(1-\frac{y}{n}\right)^{n-1}\right]dy,
\end{aligned}\tag{2.8}$$

where the last step consists of a change of variables $(x/a_{n})^{\alpha}=y/n$. In the case $\gamma\in[1,2]$, the lower limit of integration $n/a_{n}^{\alpha}\rightarrow 0$, giving

$$\begin{aligned}
\mathbb{E}|V_{n}|&=n\mathbb{P}\left(\max_{i=2,\dots,n}X_{1}X_{i}>a_{n}\right)\sim\frac{n^{2}}{a_{n}^{\alpha}}\int_{n/a_{n}^{\alpha}}^{n}y^{-2}(1-e^{-y})dy\\
&\sim\frac{n^{2-\gamma}}{\log n}\left(\int_{n^{1-\gamma}/\log n}^{1}y^{-1}dy+\int_{1}^{n}y^{-2}dy\right)\\
&\sim\frac{n^{2-\gamma}}{\log n}\left(\log\log n-(1-\gamma)\log n+1\right).
\end{aligned}\tag{2.9}$$

In the case $\gamma\in(0,1)$, the lower limit of integration $n/a_{n}^{\alpha}\rightarrow\infty$, giving

$$\begin{aligned}
\mathbb{E}|V_{n}|&=n\mathbb{P}\left(\max_{i=2,\dots,n}X_{1}X_{i}>a_{n}\right)\sim\frac{n^{2}}{a_{n}^{\alpha}}\int_{n/a_{n}^{\alpha}}^{n}y^{-2}dy\\
&\sim\frac{n^{2-\gamma}}{\log n}\int_{n^{1-\gamma}/\log n}^{n}y^{-2}dy\sim n.
\end{aligned}\tag{2.10}$$

The case $\gamma>2$ is clear, as $\mathbb{E}|E_{n}|\to 0$. We summarize below the expected number of non-isolated vertices of the random graph in all parameter regimes:

$$\mathbb{E}|V_{n}|\sim\begin{cases}0&\gamma>2,\\ (\gamma-1)n^{2-\gamma}&\gamma\in(1,2],\\ n\frac{\log\log n}{\log n}&\gamma=1,\\ n&\gamma\in(0,1).\end{cases}\tag{2.11}$$

All our calculations so far are consistent and suggest that there is no random graph in the limit in the sub-critical regime $\gamma\geq 2$. In contrast, in the super-critical regime $\gamma\in(0,2)$, a limit random graph exists. We investigate these asymptotic phenomena in detail in Sections 3 and 4, respectively.

Letting $A_{n,i}$ denote the event that vertex $i$ is not isolated and $B_{n,(i,j)}$ the event that $(i,j)$ is an edge, the next quantity we will compute is $\rho_{n,(i,j)}=\mathbb{P}(B_{n,(i,j)}\mid A_{n,i}\cap A_{n,j})$. If this conditional probability has a limit, say $\rho_{(i,j)}$, then it may be interpreted as the probability of having an edge between two non-isolated vertices of the limit graph.

$$\begin{aligned}
\mathbb{P}\left(A_{n,1}\cap A_{n,2}\cap B_{n,(1,2)}^{c}\right)&=2\int_{1}^{\infty}\alpha x_{1}^{-\alpha-1}dx_{1}\int_{1}^{x_{1}}\alpha x_{2}^{-\alpha-1}dx_{2}\ \mathbf{1}_{\left\{x_{1}x_{2}<a_{n}\right\}}\left(1-\mathbb{P}\left(X_{1}\leq\frac{a_{n}}{x_{2}}\right)^{n-2}\right)\\
&=2\int_{1}^{\sqrt{a_{n}}}\alpha x_{2}^{-\alpha-1}dx_{2}\int_{x_{2}}^{\frac{a_{n}}{x_{2}}}\alpha x_{1}^{-\alpha-1}dx_{1}\left(1-\left(1-\left(\frac{x_{2}}{a_{n}}\right)^{\alpha}\right)^{n-2}\right)\\
&\sim\frac{2n^{2}}{a_{n}^{2\alpha}}\int_{\frac{n}{a_{n}^{\alpha}}}^{\frac{n}{a_{n}^{\alpha/2}}}y_{2}^{-2}\left(1-e^{-y_{2}}\right)dy_{2}\int_{y_{2}}^{\frac{n^{2}}{a_{n}^{\alpha}}\frac{1}{y_{2}}}y_{1}^{-2}dy_{1},
\end{aligned}\tag{2.12}$$

where for the first equality, w.l.o.g. we assumed that $x_{1}>x_{2}$. The indicator function constraint then gives $x_{2}<\sqrt{a_{n}}$ in the second equality. We then apply a change of variables $(x_{i}/a_{n})^{\alpha}=y_{i}/n$ for $i=1,2$. As was explained in the above calculation for $\mathbb{E}|V_{n}|$, a standard asymptotic study yields

$$\mathbb{P}\left(A_{n,1}\cap A_{n,2}\cap B_{n,(1,2)}^{c}\right)\sim\begin{cases}\frac{2}{n^{\gamma-1}\log n}&\gamma\geq 1,\\ 1&\gamma\in(0,1).\end{cases}\tag{2.13}$$

Complementarily,

$$\begin{aligned}
\mathbb{P}\left(A_{n,1}\cap A_{n,2}\cap B_{n,(1,2)}\right)&=\int_{1}^{\infty}\alpha x_{1}^{-\alpha-1}dx_{1}\int_{1}^{\infty}\alpha x_{2}^{-\alpha-1}dx_{2}\ \mathbf{1}_{\left\{x_{1}x_{2}>a_{n}\right\}}\\
&=\int_{1}^{a_{n}}\alpha x_{1}^{-\alpha-1}dx_{1}\int_{\frac{a_{n}}{x_{1}}}^{\infty}\alpha x_{2}^{-\alpha-1}dx_{2}+\int_{a_{n}}^{\infty}\alpha x_{1}^{-\alpha-1}dx_{1}\int_{1}^{\infty}\alpha x_{2}^{-\alpha-1}dx_{2}\sim\frac{\gamma}{n^{\gamma}}\quad\text{for all }\gamma>0.
\end{aligned}\tag{2.14}$$

Combining the above results, we have $\rho_{(i,j)}=0$ for all $\gamma>0$. The zero conditional probability of an edge connecting two non-isolated vertices may look puzzling at first sight when $\gamma\in(0,2)$ since we expect the existence of an infinite random graph in this parameter region. A possible interpretation is that, as the number of non-isolated vertices grows to infinity in an appropriate sense, given any two vertices the probability of seeing an edge between them is zero.

3. Sub-critical regime

Since there is no random graph in distribution in the limit, the sub-critical regime $a_{n}\gg n^{2/\alpha}$ is less interesting than the super-critical regime $a_{n}\ll n^{2/\alpha}$. Nevertheless, the limit object for the sub-critical regime captures some intriguing characteristics, as we will describe in this section. For explicitness, we take the i.i.d. $X_{i}$ to have pdf $\alpha x^{-\alpha-1}dx\,\mathbf{1}_{\{x\geq 1\}}$ and set $a_{n}^{\alpha}=n^{\gamma}\log n$ with $\gamma\geq 2$.

Recall that $K_{n,0}=\sum_{i=1}^{n}\mathbf{1}_{\{X_{i}/\sqrt{a_{n}}>1\}}$ denotes the number of vertices with large weight (vertex value $>\sqrt{a_{n}}$) and is Binomial distributed with parameters $\left(n,a_{n}^{-\alpha/2}\right)$. These vertices are referred to as clique vertices if they are in addition non-isolated. Since

$$K_{n,0}\rightarrow\mathbb{E}K_{n,0}\sim\frac{n^{1-\gamma/2}}{\log^{1/2}n}\to 0\tag{3.1}$$

in probability in the sub-critical regime, only a conditional limit theorem is worth investigating in this case. The conditional law $\mathcal{L}(K_{n,0}\mid K_{n,0}\geq 1)$ is easy to derive, with

$$\mathbb{P}\left(K_{n,0}\geq 1\right)=1-\left(1-\frac{1}{a_{n}^{\alpha/2}}\right)^{n}\sim\frac{n}{a_{n}^{\alpha/2}}=\frac{n^{1-\gamma/2}}{\log^{1/2}n},\tag{3.2}$$

$$\mathbb{P}\left(K_{n,0}=1\right)=n\left(\frac{1}{a_{n}^{\alpha/2}}\right)\left(1-\frac{1}{a_{n}^{\alpha/2}}\right)^{n-1}\sim\frac{n}{a_{n}^{\alpha/2}}=\frac{n^{1-\gamma/2}}{\log^{1/2}n}.\tag{3.3}$$

We conclude that given the appearance of a non-trivial random graph, the clique part (conditioned on being non-empty) typically contains only one vertex. There is a fine point when computing the probability that $K_{n,0}=1$ though. This event does not necessarily imply the appearance of a star graph as the edge number may still be zero; just having one large weight vertex is not enough. We demonstrate this subtlety below.

$$\begin{aligned}
\mathbb{P}\left(\text{one clique vertex}\right)&=n\int_{\sqrt{a_{n}}}^{a_{n}}\alpha x_{1}^{-\alpha-1}dx_{1}\left[\left(1-\frac{1}{a_{n}^{\alpha/2}}\right)^{n-1}-\left(1-\left(\frac{x_{1}}{a_{n}}\right)^{\alpha}\right)^{n-1}\right]\\
&\quad+n\int_{a_{n}}^{\infty}\alpha x_{1}^{-\alpha-1}dx_{1}\left(1-\frac{1}{a_{n}^{\alpha/2}}\right)^{n-1}.
\end{aligned}\tag{3.4}$$

Here we eliminate the situation where $K_{n,0}=1$, but no edge is formed between the clique vertex $X_{1}$ and the follower vertices $X_{2},\dots,X_{n}$. The scalar $n$ indicates that the clique could be centered at any vertex. The second term on the right is of order $na_{n}^{-\alpha}\ll n^{1-\gamma}$, while the first term, after a change of variables $(x_{1}/a_{n})^{\alpha}=y_{1}/n$, asymptotically becomes

$$\frac{n^{2}}{a_{n}^{\alpha}}\int_{\frac{n}{a_{n}^{\alpha/2}}}^{n}y_{1}^{-2}\left(1-e^{-y_{1}}\right)dy_{1}\sim\frac{n^{2}}{a_{n}^{\alpha}}\log\left(\frac{a_{n}^{\alpha/2}}{n}\right)\sim\left(\frac{\gamma}{2}-1\right)n^{2-\gamma}.\tag{3.5}$$

Combining the above analysis, the correct probability of a star graph with a lone vertex in the clique behaves like $(\gamma/2-1)n^{2-\gamma}$, which is smaller than (3.2) and (3.3).

Let us delve deeper into the structure of this star graph. Notice that $K_{n,1}$ by itself counts the number of followers in this case; $K_{n,j}=0$ for all $j>1$ automatically by construction. We have

$$\begin{aligned}
\mathbb{P}\left(\text{one clique vertex},K_{n,1}=1\right)&=n(n-1)\int_{\sqrt{a_{n}}}^{a_{n}}\alpha x_{1}^{-\alpha-1}dx_{1}\int_{\frac{a_{n}}{x_{1}}}^{\sqrt{a_{n}}}\alpha x_{2}^{-\alpha-1}dx_{2}\left(1-\left(\frac{x_{1}}{a_{n}}\right)^{\alpha}\right)^{n-2}\\
&\sim\frac{n^{2}}{a_{n}^{\alpha}}\int_{\frac{n}{a_{n}^{\alpha/2}}}^{n}y_{1}^{-1}e^{-y_{1}}dy_{1}\sim\frac{n^{2}}{a_{n}^{\alpha}}\log\left(\frac{a_{n}^{\alpha/2}}{n}\right)\sim\left(\frac{\gamma}{2}-1\right)n^{2-\gamma}.
\end{aligned}\tag{3.6}$$

Here the factors $n$ and $n-1$ in the equality indicate that the clique could be centered at any vertex and the follower could come from any of the remaining vertices. A standard asymptotic analysis then yields the asymptotic order after a change of variables $(x_{1}/a_{n})^{\alpha}=y_{1}/n$. This probability is asymptotically the same as the probability, established previously, of a star graph with a lone clique vertex. We state this finding.

Theorem 1.

Consider the adapted model at the sub-critical regime ($a_{n}^{\alpha}=n^{\gamma}\log n$ with $\gamma\geq 2$). Given that the graph is non-empty, in the limit it predominantly has exactly two vertices, one clique vertex and one follower vertex.

A physical interpretation of this phenomenon might be the following: Typically, with probability going to one, we would not see any graph eventually. In the rare event that we do see one, we would need certain ‘extra energy’ beyond the typical level to push some of the $X_{i}$ values up, and the most ‘economical’ way to do so is to push one up to the clique and another up as a follower. Pushing up two to the clique or pushing up more than one follower or any other construction, by comparison, might be too costly.

4. Super-critical regime

In this section we will examine the structure of the adapted model in the more intriguing super-critical regime $a_{n}\ll n^{2/\alpha}$, where the limit random graph is expected to take a non-trivial form. Recall that $X_{i}\stackrel{d}{=}U_{i}^{-1/\alpha}$ and $U_{i}$ are i.i.d. $(0,1)$-uniform random variables. Denote the tail distribution by

$$\overline{F}(x)=\mathbb{P}(X_{1}>x)=(x\vee 1)^{-\alpha}.\tag{4.1}$$

In a complementary manner, we also write $F(x)=\mathbb{P}(X_{1}\leq x)$ for the cumulative distribution function. For ease of notation, let $K_{n}:=K_{n,0}=\sum_{i=1}^{n}\mathbf{1}_{\{X_{i}/\sqrt{a_{n}}>1\}}$ denote the number of vertices in the clique. In the super-critical regime, $a_{n}=o(n^{2/\alpha})$ gives

$$\sigma_{n}^{2}:=\mathbb{E}K_{n}=n\overline{F}(\sqrt{a_{n}})\to\infty.\tag{4.2}$$

Since $K_{n}$ is Binomial distributed with parameters $\left(n,\overline{F}(\sqrt{a_{n}})\right)$, $K_{n}\sim\sigma_{n}^{2}$ in probability, and the random variable $K_{n}$ is well-concentrated around $\sigma_{n}^{2}$.

Introduce two i.i.d. sequences of random variables $\{Y_{n,i}\}_{i\in\mathbb{N}}$ and $\{Z_{n,i}\}_{i\in\mathbb{N}}$ with

$$\mathbb{P}(Y_{n,1}>y)=\frac{\overline{F}(y\sqrt{a_{n}})}{\overline{F}(\sqrt{a_{n}})}=y^{-\alpha},\text{ as }n\to\infty\text{ for all }y>1,\tag{4.3}$$

$$\mathbb{P}\left(Z_{n,1}\leq x\right)=\frac{F(x)}{F(\sqrt{a_{n}})},\quad x\in(0,\sqrt{a_{n}}].\tag{4.4}$$

In other words, $\{Y_{n,i}\}_{i\in\mathbb{N}}$ are i.i.d. with law $\mathcal{L}(X_{1}\mid X_{1}>\sqrt{a_{n}})$ (with scaling adjustment) and $\{Z_{n,i}\}_{i\in\mathbb{N}}$ are i.i.d. with law $\mathcal{L}(X_{1}\mid X_{1}\leq\sqrt{a_{n}})$. Assume further that these two sequences are independent. Then for every $n\in\mathbb{N}$, given $K_{n}$, the values of $\{X_{i}\}_{i=1,\dots,n}$ that are larger than (resp. no larger than) the threshold $\sqrt{a_{n}}$ share the joint law of $Y_{n,1},\dots,Y_{n,K_{n}}$ (resp. $Z_{n,1},\dots,Z_{n,n-K_{n}}$). We order $\{Y_{n,i}\}_{i=1,\dots,K_{n}}$ in increasing order statistics

$$Y_{n,K_{n}:K_{n}}>\cdots>Y_{n,1:K_{n}}>\sqrt{a_{n}}>\frac{a_{n}}{Y_{n,1:K_{n}}}>\cdots>\frac{a_{n}}{Y_{n,K_{n}:K_{n}}},\tag{4.5}$$

where listed on the right hand side are the thresholds for different groups of followers.

Define the statistics

$$\tau_{n}(x):=\frac{a_{n}}{Y_{n,\left\lceil xK_{n}\right\rceil:K_{n}}},\quad x\in(0,1). \tag{4.6}$$

We are interested in the asymptotic behavior of the height function

$$H_{n}(x):=\sum_{i=1}^{n-K_{n}}\mathbf{1}_{\left\{Z_{n,i}>\tau_{n}(x)\right\}}. \tag{4.7}$$

This construction associates the law of $H_{n}(x)$ to that of the number of not-in-clique vertices that are connected to the vertices in the clique corresponding to those top $\left\lceil xK_{n}\right\rceil$-values of $\{Y_{n,i}\}_{i=1,\dots,K_{n}}$. At one end, $H_{n}(1)$ is the number of followers of the leader from the clique, i.e., $n-K_{n}$, thus $H_{n}(1)\sim n$ with high probability. At the other end, we take $H_{n}(0)\equiv 0$ by convention. For notational convenience, set

$$B_{n,i}(x):=\mathbf{1}_{\left\{Z_{n,i}>\tau_{n}(x)\right\}}. \tag{4.8}$$
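The definitions (4.5) through (4.7) can be made concrete for a single sample. The sketch below is illustrative only: it works directly with the unscaled values $X_{i}$, treats those exceeding $\sqrt{a_{n}}$ as the clique values ordered as in (4.5), and returns $H_{n}(k/K_{n})$ for $k=1,\dots,K_{n}$. The function name `height_profile` is hypothetical.

```python
import bisect
import math

def height_profile(xs, a_n):
    """Return [H_n(k / K_n) for k = 1..K_n] for one sample xs of the X_i."""
    root = math.sqrt(a_n)
    clique = sorted(v for v in xs if v > root)    # increasing order statistics, as in (4.5)
    others = sorted(v for v in xs if v <= root)   # not-in-clique values
    profile = []
    for y in clique:
        tau = a_n / y                             # follower threshold for this clique value
        # count not-in-clique values strictly greater than tau
        profile.append(len(others) - bisect.bisect_right(others, tau))
    return profile

if __name__ == "__main__":
    print(height_profile([1.0, 1.5, 2.5, 3.0, 5.0, 8.0], a_n=16.0))
```

Because larger clique values give smaller thresholds, the returned list is nondecreasing, in line with the nesting of the indicators $B_{n,i}(x)$ in $x$.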

We are now ready to state our main result, which says that the limit fluctuation of the height function has two independent components, one as a generalized Brownian bridge, the other as a time-changed Brownian motion.

Theorem 2.

Consider the adapted model in the super-critical regime ($a_{n}^{\alpha}=n^{\gamma}\log n$ with $\gamma<2$). Let $\sigma_{n}^{2}$ and $H_{n}(x)$ be defined as in (4.2) and (4.7). We have

$$\frac{1}{\sigma_{n}}\left\{H_{n}(x)-\sigma_{n}^{2}\frac{x}{1-x}\right\}_{x\in[0,1)}\stackrel{f.d.d.}{\Rightarrow}\left\{{\mathbb{B}}_{x/(1-x)}+{\mathbb{G}}_{x}\right\}_{x\in[0,1)}, \tag{4.9}$$

where $f.d.d.$ indicates convergence of finite-dimensional distributions, $\{{\mathbb{B}}_{t}\}_{t\in[0,\infty)}$ is a standard Brownian motion, $\{{\mathbb{G}}_{x}\}_{x\in[0,1)}$ is a generalized Brownian bridge with covariance function

$${\rm{Cov}}\left({\mathbb{G}}_{x},{\mathbb{G}}_{y}\right)=\frac{\min(x,y)(1-\max(x,y))}{(1-x)^{2}(1-y)^{2}},\quad x,y\in[0,1), \tag{4.10}$$

and ${\mathbb{B}}$ and ${\mathbb{G}}$ are independent.

Note that throughout, the index $x$ is strictly less than $1$ (the covariance explodes as $x\uparrow 1$). On the other hand, convergence at $x=0$ is clear, as both sides equal zero.

Implications of Theorem 2. Recall that $\sigma_{n}^{2}={\mathbb{E}}K_{n}$ denotes the average number of vertices in the clique. From Theorem 2, we may deduce that

$$\begin{aligned}
{\mathbb{E}}H_{n}(x)&\sim\sigma_{n}^{2}\frac{x}{1-x},\\
{\rm{Var}}\left(H_{n}(x)\right)&\sim\sigma_{n}^{2}\frac{x}{1-x}\left(1+\frac{1}{(1-x)^{2}}\right).
\end{aligned}\tag{4.11}$$
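As a quick consistency check (a heuristic reading of Theorem 2, since convergence in distribution alone does not give convergence of moments), the variance asymptotics above match the sum of the variances of the two independent limit components, using ${\rm Var}({\mathbb{B}}_{t})=t$ and (4.10) with $x=y$:

```latex
\begin{align*}
{\rm Var}\left(\mathbb{B}_{x/(1-x)}\right) &= \frac{x}{1-x}, \\
{\rm Var}\left(\mathbb{G}_{x}\right) &= \frac{x(1-x)}{(1-x)^{4}} = \frac{x}{(1-x)^{3}}, \\
{\rm Var}\left(\mathbb{B}_{x/(1-x)}+\mathbb{G}_{x}\right)
  &= \frac{x}{1-x}+\frac{x}{(1-x)^{3}}
   = \frac{x}{1-x}\left(1+\frac{1}{(1-x)^{2}}\right).
\end{align*}
```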

Since we established the theorem using increasing order statistics, a simple transformation $x\mapsto 1-x$ is needed to match the simulations in Figure 1. After this transformation, for $x\in(0,1]$,

$$h(x):=1+\lim_{n\to\infty}\frac{{\mathbb{E}}H_{n}(1-x)}{\sigma_{n}^{2}}=1+\frac{1-x}{x}=\frac{1}{x} \tag{4.12}$$

should give the boundary line in question, where the extra $1$ in the above expression comes from the clique-clique contribution. This is a universal result independent of the parameters. Having the same asymptotic order for the expected value and the variance of the height function $H_{n}(x)$ also explains why the simulations look so regular.
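The first-order asymptotics ${\mathbb{E}}H_{n}(x)\sim\sigma_{n}^{2}\,x/(1-x)$ can be checked by a small Monte Carlo experiment. The sketch below is not the simulation behind Figure 1; it assumes the tail $\overline{F}(x)=(x\vee 1)^{-\alpha}$ from (4.1), the parametrization $a_{n}^{\alpha}=n^{\gamma}\log n$, and works with unscaled clique values as in (4.5). All function names and the chosen parameter values are illustrative.

```python
import math
import random

def sample_pareto(alpha, rng):
    """Inverse-transform draw with P(X > x) = max(x, 1) ** (-alpha)."""
    u = 1.0 - rng.random()            # uniform on (0, 1], avoids a zero base
    return u ** (-1.0 / alpha)

def height(xs, a_n, x):
    """H_n(x) from (4.7), with threshold a_n / (ceil(x*K_n)-th smallest clique value)."""
    root = math.sqrt(a_n)
    clique = sorted(v for v in xs if v > root)
    if not clique or x <= 0.0:
        return 0                      # convention H_n(0) = 0
    idx = min(len(clique), math.ceil(x * len(clique)))
    tau = a_n / clique[idx - 1]
    return sum(1 for v in xs if tau < v <= root)

def mean_height_ratio(n, alpha, gamma, x, reps, seed=0):
    """Monte Carlo estimate of E[H_n(x)] / sigma_n^2."""
    rng = random.Random(seed)
    a_n = (n ** gamma * math.log(n)) ** (1.0 / alpha)
    sigma_sq = n * a_n ** (-alpha / 2.0)
    total = 0
    for _ in range(reps):
        xs = [sample_pareto(alpha, rng) for _ in range(n)]
        total += height(xs, a_n, x)
    return total / (reps * sigma_sq)

if __name__ == "__main__":
    # The asymptotic prediction at x = 0.5 is x / (1 - x) = 1.
    print(mean_height_ratio(n=20000, alpha=1.5, gamma=1.0, x=0.5, reps=30))
```

At these moderate sizes $\sigma_{n}^{2}$ is only a few dozen, so the estimate should be close to, but not exactly, the limiting value; the fluctuations are of the order suggested by (4.11).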

Let $W_{n}$ denote the graphon of our model with $n$ vertices without scaling (a $\{0,1\}$-valued function on $[0,n]^{2}$). Following the explanations above,

$$W_{n}^{\prime}(x,y)=W_{n}\left({\mathbb{E}}K_{n}\cdot x,\ {\mathbb{E}}K_{n}\cdot y\right) \tag{4.13}$$

has the deterministic limit $W(x,y)=\mathbf{1}_{\left\{xy\leq 1\right\}}$, $x,y\in(0,\infty)$, with boundary line $y=1/x$ as in (4.12). The asymptotics are confirmed by the simulations in Figure 1. This mode of convergence appears very close to the stretched convergence mode (1.7). Our stretching acts in a similar way as in (1.7), but with a different stretching order, as ${\left\|W_{n}\right\|}_{1}={\mathbb{E}}|E_{n}|\sim(\gamma/2)n^{2-\gamma}$ while ${\mathbb{E}}K_{n}=na_{n}^{-\alpha/2}\sim n^{1-\gamma/2}/(\log n)^{1/2}$. So there is an extra $\log$ term in our stretching as compared to (1.7). Furthermore, contrary to the stretched convergence mode, the limit graphon $W$ is not $L^{1}$-integrable. Our adapted model may therefore be viewed as an example that lies at the boundary between rescaled convergence and stretched convergence.

To prove Theorem 2, we shall decompose $H_{n}$ further and identify the relevant contribution from the different parts of the statistics to ${\mathbb{B}}$ and ${\mathbb{G}}$. For this purpose, we introduce $\mathcal{K}_{n}:=\sigma(K_{n},Y_{n,1},\dots,Y_{n,K_{n}})$, and

$$\widehat{p}_{n}(x):=\mathbb{P}\left(Z_{n,i}>\tau_{n}(x)\;\middle|\;\mathcal{K}_{n}\right)\quad\mbox{ and }\quad p_{n}(x):=\frac{\overline{F}(\sqrt{a_{n}})}{F(\sqrt{a_{n}})}\left(\theta_{n}(1-x)-1\right), \tag{4.14}$$

where

$$\theta_{n}(x)=\begin{cases}x^{-1}&x>\overline{F}(\sqrt{a_{n}})=a_{n}^{-\alpha/2},\\ a_{n}^{\alpha/2}&x\leq a_{n}^{-\alpha/2}.\end{cases} \tag{4.15}$$

Write

$$\begin{aligned}
\overline{H}_{n}(x) &=\sum_{i=1}^{n-K_{n}}\left(B_{n,i}(x)-p_{n}(x)\right)\\
&=\sum_{i=1}^{n-K_{n}}(B_{n,i}(x)-\widehat{p}_{n}(x))+(n-K_{n})(\widehat{p}_{n}(x)-p_{n}(x))=:\overline{H}_{1,n}(x)+\overline{H}_{2,n}(x).
\end{aligned}\tag{4.16}$$

We will show that $\sigma_{n}^{-1}\overline{H}_{1,n}$ and $\sigma_{n}^{-1}\overline{H}_{2,n}$ converge to ${\mathbb{B}}$ and ${\mathbb{G}}$, respectively. We first examine $\overline{H}_{2,n}$.

Lemma 1.

With the notation above, under the assumptions of Theorem 2, we have

$$\frac{n}{\sigma_{n}}\left\{\widehat{p}_{n}(x)-p_{n}(x)\right\}_{x\in(0,1)}\stackrel{f.d.d.}{\Rightarrow}\left\{{\mathbb{G}}_{x}\right\}_{x\in(0,1)}. \tag{4.17}$$

Proof.

We begin the analysis of asymptotics by first examining the i.i.d. $Y_{n,i}$. Definition (4.3) implies that $Y_{n,1}\stackrel{d}{=}U^{-1/\alpha}$, where $U$ is a uniform random variable on $(0,1)$. Set $W_{n,i}:=Y_{n,i}^{-1}\stackrel{d}{=}U^{1/\alpha}$ and $W_{n}:=W_{n,1}$. We need some background on quantile processes from Shorack [16] [17]. Following the earlier notation, let $F_{Z}$ be the cumulative distribution function of a random variable $Z$ and $\overline{F}_{Z}$ its tail probability function. Let $F_{Z}^{-1}$ denote the left-continuous inverse function of $F_{Z}$. Let $\mathbb{F}^{-1}_{Z,n}(x)$ denote the quantile process of i.i.d. copies $Z_{1},\dots,Z_{n}$. It follows that ${\mathbb{F}}^{-1}_{Z,n}(x)=Z_{\left\lfloor xn\right\rfloor:n}$, where $Z_{i:n}$ is the $i$th smallest order statistic of $Z_{1},\dots,Z_{n}$. Then $F_{W_{n}}(W_{n})$ is a uniform random variable, and for all $m_{n}\to\infty$,

$$\sqrt{m_{n}}\left\{\mathbb{Q}_{n,m_{n}}(x)-x\right\}_{x\in[0,1]}:=\sqrt{m_{n}}\left\{F_{W_{n}}\circ\mathbb{F}_{W_{n},m_{n}}^{-1}(x)-x\right\}_{x\in[0,1]}\Rightarrow\left\{{\mathbb{B}}^{br}_{x}\right\}_{x\in[0,1]}. \tag{4.18}$$

Here, the convergence is in $D([0,1])$, and $\{{\mathbb{B}}^{br}_{x}\}_{x\in[0,1]}$ is a standard Brownian bridge, a centered Gaussian process with

$${\rm{Cov}}\left({\mathbb{B}}^{br}_{x},{\mathbb{B}}^{br}_{y}\right)=\min(x,y)(1-\max(x,y)),\quad x,y\in[0,1]. \tag{4.19}$$

The process $\mathbb{Q}_{n,m_{n}}(x)$ so defined has the law of the quantile process of $m_{n}$ i.i.d. uniform random variables. Furthermore, from (4.6),

$$\tau_{n}(x)=\sqrt{a_{n}}\,{\mathbb{F}}^{-1}_{W_{n},K_{n}}(1-x). \tag{4.20}$$

Combining the above observations, we have

$$\widehat{p}_{n}(x)=\frac{\overline{F}(a_{n}/Y_{n,\left\lceil xK_{n}\right\rceil:K_{n}})-\overline{F}(\sqrt{a_{n}})}{F(\sqrt{a_{n}})}=\widehat{\rho}_{n}(x)\frac{\overline{F}(\sqrt{a_{n}})}{F(\sqrt{a_{n}})}, \tag{4.21}$$

with

$$\widehat{\rho}_{n}(x)=\frac{\overline{F}(\sqrt{a_{n}}\,\mathbb{F}_{W_{n},K_{n}}^{-1}(1-x))}{\overline{F}(\sqrt{a_{n}})}-1. \tag{4.22}$$

We rewrite $\widehat{\rho}_{n}$ as a function of $\mathbb{Q}_{n}:=\mathbb{Q}_{n,K_{n}}$. Namely,

$$\begin{aligned}
\widehat{\rho}_{n}(x) &=\frac{\overline{F}(\sqrt{a_{n}}\,F_{W_{n}}^{-1}\circ F_{W_{n}}\circ\mathbb{F}_{W_{n},K_{n}}^{-1}(1-x))}{\overline{F}(\sqrt{a_{n}})}-1\\
&=\frac{\overline{F}(\sqrt{a_{n}}\,F_{W_{n}}^{-1}\circ\mathbb{Q}_{n}(1-x))}{\overline{F}(\sqrt{a_{n}})}-1.
\end{aligned}\tag{4.23}$$

Let us make these calculations more explicit. For $x\in(0,1)$, $F_{W_{n}}(x)=x^{\alpha}$ and $F_{W_{n}}^{-1}(x)=x^{1/\alpha}$. Hence

$$\frac{\overline{F}\left(\sqrt{a_{n}}\,F_{W_{n}}^{-1}(x)\right)}{\overline{F}(\sqrt{a_{n}})}=\theta_{n}(x), \tag{4.24}$$

where $\theta_{n}(x)$ is defined as in (4.15). This gives

$$\widehat{\rho}_{n}(x)=\theta_{n}(\mathbb{Q}_{n}(1-x))-1. \tag{4.25}$$

We have

$$\frac{n}{\sigma_{n}}(\widehat{p}_{n}(x)-p_{n}(x))=\frac{1}{F(\sqrt{a_{n}})}\cdot\frac{\sigma_{n}}{\sqrt{K_{n}}}\cdot\sqrt{K_{n}}\left(\theta_{n}(\mathbb{Q}_{n}(1-x))-\theta_{n}(1-x)\right). \tag{4.26}$$

Note that, for each fixed $x>0$, $\theta_{n}^{\prime}(x)=-x^{-2}$ once $n$ is large enough that $a_{n}^{-\alpha/2}<x$. A standard application of the delta method to (4.18) and (4.26) then yields the desired weak convergence. ∎
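The delta-method step at the end of the proof of Lemma 1 can be spelled out as follows. This is a sketch, not a replacement for the argument: it uses (4.18) with $m_{n}=K_{n}$, the covariance (4.19), and the facts that $\sigma_{n}/\sqrt{K_{n}}\to 1$ in probability and $F(\sqrt{a_{n}})\to 1$ (the latter assuming $a_{n}\to\infty$). Here $\theta(u)=u^{-1}$ denotes the pointwise limit of $\theta_{n}$. The resulting covariance agrees with (4.10).

```latex
\begin{align*}
\sqrt{K_{n}}\left(\theta_{n}(\mathbb{Q}_{n}(1-x))-\theta_{n}(1-x)\right)
  &\Rightarrow \theta'(1-x)\,\mathbb{B}^{br}_{1-x}
   = -\frac{\mathbb{B}^{br}_{1-x}}{(1-x)^{2}}, \\
{\rm Cov}\left(\frac{\mathbb{B}^{br}_{1-x}}{(1-x)^{2}},\frac{\mathbb{B}^{br}_{1-y}}{(1-y)^{2}}\right)
  &= \frac{\min(1-x,1-y)\left(1-\max(1-x,1-y)\right)}{(1-x)^{2}(1-y)^{2}} \\
  &= \frac{\left(1-\max(x,y)\right)\min(x,y)}{(1-x)^{2}(1-y)^{2}}.
\end{align*}
```

The last equality uses $\min(1-x,1-y)=1-\max(x,y)$ and $1-\max(1-x,1-y)=\min(x,y)$.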

The examination of $\overline{H}_{1,n}$ comes next. Following [17] we can actually construct, on a different probability space, for each $n\in{\mathbb{N}}$ copies $\{\widetilde{Y}_{n,i},\widetilde{K}_{n}\}_{i\in{\mathbb{N}}}$ of $\{Y_{n,i},K_{n}\}_{i\in{\mathbb{N}}}$, such that $\widetilde{K}_{n}\sim\sigma_{n}^{2}$ almost surely, and the convergence (4.18) holds in the almost sure sense. Note that this coupling construction does not necessarily imply that the joint laws of $(\widetilde{Y}_{n,i},\widetilde{Y}_{m,j})$ are the same as those of $(Y_{n,i},Y_{m,j})$ for $m\neq n$, but we will not need such joint laws in the sequel. For ease of notation, we still let $Y_{n,i},W_{n,i},K_{n},\tau_{n}(x)$ denote $\widetilde{Y}_{n,i},\widetilde{W}_{n,i},\widetilde{K}_{n},\widetilde{\tau}_{n}(x)$ respectively, and emphasize that we are working on this different probability space simply by saying "for the coupled model". Recall that $\mathcal{K}_{n}:=\sigma(K_{n},Y_{n,1},\dots,Y_{n,K_{n}})$. Further define $\mathcal{K}:=\sigma(\{\mathcal{K}_{n}\}_{n\in{\mathbb{N}}})$ for the coupled model. We also continue to assume that the $Z_{n,i}$ are as before, independent of all other random variables discussed so far in this paragraph.

We start by noticing that for each $n\in{\mathbb{N}}$, given $\mathcal{K}_{n}$, $\{B_{n,i}(x)\}_{i\in{\mathbb{N}}}$ are i.i.d. Bernoulli random variables with parameter $\widehat{p}_{n}(x)$ for every $x$. Moreover, they are nested in the sense that $B_{n,i}(x)=1$ implies $B_{n,i}(y)=1$ for all $y\in(x,1)$. We fix $x\in(0,1)$ (there is nothing to prove for $x=0$). Then, conditioning on $\mathcal{K}_{n}$, for each $n\in{\mathbb{N}}$,

$$\overline{H}_{1,n}(x)=\sum_{i=1}^{n-K_{n}}(B_{n,i}(x)-\widehat{p}_{n}(x)) \tag{4.27}$$

is a partial sum of centered i.i.d. Bernoulli random variables with parameter $\widehat{p}_{n}(x)$. Now, from (4.18), we know that

$$\mathbb{Q}_{n}(x):=F_{W_{n}}\circ{\mathbb{F}}_{W_{n},K_{n}}^{-1}(x)\to x\mbox{ almost surely,} \tag{4.28}$$

and $\lim_{n\to\infty}F_{W_{n}}(x)=x^{\alpha}$ for $x\in(0,1)$. Therefore, the conditional variance is

$$\begin{aligned}
(n-K_{n})\widehat{p}_{n}(x)(1-\widehat{p}_{n}(x)) &\sim n\overline{F}(\sqrt{a_{n}})\widehat{\rho}_{n}(x)=n\overline{F}(\sqrt{a_{n}})(\theta_{n}({\mathbb{Q}}_{n}(1-x))-1)\\
&\sim\sigma_{n}^{2}\left(\frac{1}{1-x}-1\right)=\sigma_{n}^{2}\frac{x}{1-x}
\end{aligned}\tag{4.29}$$

almost surely, where in the $\sim$ step we used the coupling ${\mathbb{Q}}_{n}(x)\to x$ almost surely and $\lim_{n\to\infty}\theta_{n}(x)=x^{-1}$. Then, by the central limit theorem for triangular arrays of i.i.d. random variables, we have that

$$\mathcal{L}\left(\frac{1}{\sigma_{n}}\overline{H}_{1,n}(x)\;\middle|\;\mathcal{K}\right)\to\mathcal{L}\left(\mathcal{N}\left(0,\frac{x}{1-x}\right)\right)\mbox{ a.s.} \tag{4.30}$$

The above statement is interpreted as almost-sure weak convergence, meaning that

$$\lim_{n\to\infty}{\mathbb{E}}\left[\phi\left(\frac{1}{\sigma_{n}}\overline{H}_{1,n}(x)\right)\;\middle|\;\mathcal{K}\right]={\mathbb{E}}\phi\left(\left(\frac{x}{1-x}\right)^{1/2}Z\right) \tag{4.31}$$

almost surely for all continuous and bounded functions $\phi:{\mathbb{R}}\to{\mathbb{R}}$, where $Z$ on the right-hand side is a standard Gaussian random variable. We shall use $\mathcal{L}_{f.d.d.}(\{\sigma_{n}^{-1}\overline{H}_{1,n}(x)\}_{x\in[0,1)}\mid\mathcal{K})$ in Lemma 2 below for the corresponding almost-sure weak convergence of finite-dimensional distributions of $\overline{H}_{1,n}$.

This argument can be readily extended to the multivariate central limit theorem, and it suffices to compute the covariance. Alternatively, by a standard Poissonization argument one sees immediately that the limit Gaussian process has independent increments. So, the limits of finite-dimensional distributions of $\{\sigma_{n}^{-1}\overline{H}_{1,n}(x)\}_{x\in[0,1)}$ are the corresponding ones of ${\mathbb{B}}_{x/(1-x)}$. We have thus proved the following.

Lemma 2.

For the coupled model, we have

$$\mathcal{L}_{f.d.d.}\left(\frac{1}{\sigma_{n}}\{\overline{H}_{1,n}(x)\}_{x\in[0,1)}\;\middle|\;\mathcal{K}\right)\to\mathcal{L}_{f.d.d.}\left(\{{\mathbb{B}}_{x/(1-x)}\}_{x\in[0,1)}\right)\mbox{ a.s.}, \tag{4.32}$$

where the interpretation of the above expression is explained right after (4.31).
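The covariance computation behind Lemma 2 can be sketched as follows, for $0<x\leq y<1$ and conditionally on $\mathcal{K}_{n}$. The sketch relies on the nesting property $B_{n,i}(x)B_{n,i}(y)=B_{n,i}(x)$, on the asymptotics (4.29), and on $\widehat{p}_{n}(y)\to 0$ (which holds when $\overline{F}(\sqrt{a_{n}})\to 0$, i.e., assuming $a_{n}\to\infty$).

```latex
\begin{align*}
{\rm Cov}\left(B_{n,i}(x),B_{n,i}(y)\mid\mathcal{K}_{n}\right)
  &= \widehat{p}_{n}(x)-\widehat{p}_{n}(x)\widehat{p}_{n}(y)
   = \widehat{p}_{n}(x)\left(1-\widehat{p}_{n}(y)\right), \\
\frac{1}{\sigma_{n}^{2}}{\rm Cov}\left(\overline{H}_{1,n}(x),\overline{H}_{1,n}(y)\mid\mathcal{K}_{n}\right)
  &= \frac{n-K_{n}}{\sigma_{n}^{2}}\,\widehat{p}_{n}(x)\left(1-\widehat{p}_{n}(y)\right)
   \to \frac{x}{1-x} \\
  &= \min\left(\frac{x}{1-x},\frac{y}{1-y}\right)
   = {\rm Cov}\left(\mathbb{B}_{x/(1-x)},\mathbb{B}_{y/(1-y)}\right).
\end{align*}
```

The identification with the Brownian covariance uses that $x\mapsto x/(1-x)$ is increasing on $[0,1)$.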

Proof of Theorem 2.

Combining Lemmas 1 and 2 we obtain immediately that

$$\frac{1}{\sigma_{n}}\left\{\overline{H}_{n}(x)\right\}_{x\in[0,1)}\stackrel{f.d.d.}{\Rightarrow}\left\{{\mathbb{B}}_{x/(1-x)}+{\mathbb{G}}_{x}\right\}_{x\in[0,1)}. \tag{4.33}$$

However, $\sigma_{n}^{-1}\overline{H}_{n}(x)$ has a slightly different centering from the one in Theorem 2. Using (4.2), (4.14) and (4.15), the difference is

$$\begin{aligned}
&\frac{1}{\sigma_{n}}\left(\sigma_{n}^{2}\frac{x}{1-x}-(n-K_{n})p_{n}(x)\right)\\
&\quad=\frac{1}{\sigma_{n}}\left(\sigma_{n}^{2}\frac{x}{1-x}-\sigma_{n}^{2}(\theta_{n}(1-x)-1)+\sigma_{n}^{2}(\theta_{n}(1-x)-1)-(n-K_{n})p_{n}(x)\right)\\
&\quad=\sigma_{n}\left((1-x)^{-1}-\theta_{n}(1-x)\right)+\frac{1}{\sigma_{n}}\left((n-{\mathbb{E}}K_{n})p_{n}(x)-(n-K_{n})p_{n}(x)\right)\\
&\quad=\sigma_{n}\left((1-x)^{-1}-\theta_{n}(1-x)\right)+\frac{K_{n}-{\mathbb{E}}K_{n}}{\sigma_{n}}p_{n}(x),
\end{aligned}\tag{4.34}$$

which converges to zero in $L^{2}$. ∎
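The $L^{2}$ convergence of (4.34) can be verified term by term. The following sketch fixes $x\in[0,1)$ and assumes $a_{n}\to\infty$, so that $\overline{F}(\sqrt{a_{n}})=a_{n}^{-\alpha/2}\to 0$. For $n$ large enough that $a_{n}^{-\alpha/2}<1-x$, definition (4.15) gives $\theta_{n}(1-x)=(1-x)^{-1}$, so the first term vanishes. For the second term, the binomial variance of $K_{n}$ gives:

```latex
\begin{align*}
\mathbb{E}\left[\left(\frac{K_{n}-\mathbb{E}K_{n}}{\sigma_{n}}\right)^{2}\right]
  &= \frac{n\overline{F}(\sqrt{a_{n}})\left(1-\overline{F}(\sqrt{a_{n}})\right)}{\sigma_{n}^{2}}
   = 1-\overline{F}(\sqrt{a_{n}}) \leq 1, \\
p_{n}(x) &= \frac{\overline{F}(\sqrt{a_{n}})}{F(\sqrt{a_{n}})}\cdot\frac{x}{1-x} \to 0, \\
\left\|\frac{K_{n}-\mathbb{E}K_{n}}{\sigma_{n}}\,p_{n}(x)\right\|_{L^{2}}
  &\leq p_{n}(x) \to 0.
\end{align*}
```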

Acknowledgements

Mei Yin’s research was supported in part by the University of Denver’s Faculty Research Fund 84688-145601. She acknowledges helpful conversations with Yufei Zhao, and is particularly grateful to Yizao Wang for many constructive comments. She also thanks the anonymous reviewer for their insightful comments and suggestions.

References

  • [1] Aldous, D.: Representations for partially exchangeable arrays of random variables. J. Multivariate Anal. 11: 581-598 (1981).
  • [2] Bhattacharya, B.B., Bhattacharya, S., Ganguly, S.: Spectral edge in sparse random graphs: Upper and lower tail large deviations. arXiv: 2004.00611 (2020).
  • [3] Bickel, P.J., Chen, A.: A nonparametric view of network models and Newman-Girvan and other modularities. Proc. Natl. Acad. Sci. USA 106: 21068-21073 (2009).
  • [4] Bollobás, B., Riordan, O.: Metrics for sparse graphs. In: Huczynska, S., Mitchell, J.D., Roney-Dougal, C.M. (eds.) Surveys in Combinatorics 2009 (Volume 365 of London Mathematical Society Lecture Note Series), pp. 211-287. Cambridge University Press, Cambridge. (2009).
  • [5] Borgs, C., Chayes, J.T., Cohn, H., Holden, N.: Sparse exchangeable graphs and their limits via graphon processes. J. Mach. Learn. Res. 18: 1-71 (2018).
  • [6] Borgs, C., Chayes, J.T., Cohn, H., Zhao, Y.: An $L^{p}$ theory of sparse graph convergence II: LD convergence, quotients, and right convergence. Ann. Probab. 46: 337-396 (2018).
  • [7] Borgs, C., Chayes, J.T., Cohn, H., Zhao, Y.: An $L^{p}$ theory of sparse graph convergence I: Limits, sparse random graph models, and power law distributions. Trans. Amer. Math. Soc. 372: 3019-3062 (2019).
  • [8] Borgs, C., Chayes, J.T., Lovász, L., Sós, V.T., Vesztergombi, K.: Convergent sequences of dense graphs I: Subgraph frequencies, metric properties and testing. Adv. Math. 219: 1801-1851 (2008).
  • [9] Borgs, C., Chayes, J.T., Lovász, L., Sós, V.T., Vesztergombi, K.: Convergent sequences of dense graphs II: Multiway cuts and statistical physics. Ann. of Math. 176: 151-219 (2012).
  • [10] Chatterjee, S., Diaconis, P., Sly, A.: Random graphs with a given degree sequence. Ann. Appl. Prob. 21: 1400-1435 (2011).
  • [11] van der Hofstad, R.: Random Graphs and Complex Networks Volume 1. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge. (2017).
  • [12] Hoover, D.: Row-column exchangeability and a generalized model for probability. In: Koch, G., Spizzichino, F. (eds.) Exchangeability in Probability and Statistics, pp. 281-291. North-Holland, Amsterdam. (1982).
  • [13] Lovász, L.: Large Networks and Graph Limits. American Mathematical Society, Providence. (2012).
  • [14] Lovász, L., Szegedy, B.: Limits of dense graph sequences. J. Combin. Theory Ser. B 96: 933-957 (2006).
  • [15] Mukherjee, S., Xu, Y.: Statistics of the two-star ERGM. arXiv: 1310.4526 (2021).
  • [16] Shorack, G.R.: Functions of order statistics. Ann. Math. Statist. 43: 412-427 (1972).
  • [17] Shorack, G.R.: Convergence of reduced empirical and quantile processes with application to functions of order statistics in the non-i.i.d. case. Ann. Statist. 1: 146-152 (1973).