This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Limit Results for Estimation of Connectivity Matrix in Multi-layer Stochastic Block Models

Wenqing Su Department of Mathematical Sciences, Tsinghua University  Xiao Guo School of Mathematics, Northwest University Ying Yang Department of Mathematical Sciences, Tsinghua University
Abstract

Multi-layer networks arise naturally in various domains including biology, finance and sociology, among others. The multi-layer stochastic block model (multi-layer SBM) is commonly used for community detection in the multi-layer networks. Most of current literature focuses on statistical consistency of community detection methods under multi-layer SBMs. However, the asymptotic distributional properties are also indispensable which play an important role in statistical inference. In this work, we aim to study the estimation and asymptotic properties of the layer-wise scaled connectivity matrices in the multi-layer SBMs. We develop a novel and efficient method to estimate the scaled connectivity matrices. Under the multi-layer SBM and its variant multi-layer degree-corrected SBM, we establish the asymptotic normality of the estimated matrices under mild conditions, which can be used for interval estimation and hypothesis testing. Simulations show the superior performance of proposed method over existing methods in two considered statistical inference tasks. We also apply the method to a real dataset and obtain interpretable results.

Keywords: Multi-layer networks, Asymptotic properties, Spectral methods

1 Introduction

The multi-layer network has gained widespread attention for its ability to represent multiple relationships among the same entities of interest, thereby enhancing the understanding of complex network data (Mucha et al., 2010; Holme and Saramäki, 2012; Kivelä et al., 2014; Han et al., 2015). For example, the Worldwide Food and Agricultural Trade (WFAT) multi-layer trade network (De Domenico et al., 2015), encompassing trade relationships among the same countries (nodes) across different commodities (layers), offers a richer insight than a single-layer trade network.

The multi-layer stochastic block model (multi-layer SBM) (Han et al., 2015; Paul and Chen, 2016) has been popularly used for the community detection task in the multi-layer network, where each layer corresponds to a stochastic block model (SBM) (Holland et al., 1983). In particular, given common unobserved communities (blocks), the edges in each layer are generated according to a layer-wise block probability matrix or the so-called connectivity matrix. Various methods have been proposed to detect the communities of multi-layer networks under the multi-layer SBM, see, e.g., Han et al. (2015); Bhattacharyya and Chatterjee (2018); Paul and Chen (2020); Jing et al. (2021); Arroyo et al. (2021); Lei and Lin (2023).

Most of the aforementioned literature on community detection of multi-layer SBMs is devoted to the statistical consistency. However, the asymptotic distributional properties are also vital and particularly useful in subsequent inference tasks. In the context of single-layer SBMs or other related models, various versions of asymptotically Gaussian behavior emerged for different purposes (Agterberg and Cape, 2023). Specifically, Tang and Priebe (2018) provided a central limit theorem for the eigenvectors of the normalized Laplacian matrix under the random dot product graph (RDPG) model (Young and Scheinerman, 2007). Rubin-Delanchy et al. (2022) extended these asymptotic results to the generalized RDPG and applied the theory to SBMs. Bickel et al. (2013) and Tang et al. (2022) established the asymptotic normality results for the estimation of the connectivity matrix using the maximum likelihood method and spectral embedding, respectively. Fan et al. (2022b) tested whether two nodes from a mixed-membership model (Airoldi et al., 2008) have the same membership parameters by establishing the asymptotic distributional properties. Jin et al. (2023) proposed a goodness-of-fit approach to select the number of communities and the asymptotic null distribution of the test statistic is normal. Also see Agterberg and Cape (2023) for a comprehensive overview of asymptotic normality in SBMs.

In the multi-layer SBM, the asymptotic distributional properties are crucial for subsequent inferences especially for those tailored for the multi-layer networks. However, as far as we are aware, the asymptotic properties under multi-layer SBMs are largely unexplored. An important inference task in multi-layer SBMs is to test whether networks across different layers are generated from the same SBM, essentially the same connectivity matrix given the consensus community memberships. For instance, in analyzing the WFAT multi-layer network, the goal extends beyond the community detection to ascertaining whether trading patterns across different commodities are similar. Asymptotic properties play a fundamental role in such inference. In the literature, Arroyo et al. (2021) and Zheng and Tang (2022) showed that the connectivity matrices estimated using the spectral embedding are asymptotically normal under the common subspace independent edge graph (COSIE) model. These studies necessitate that the connectivity matrices for all network layers are full rank. However, in practical multi-layer networks like WFAT, not all layers carry comprehensive cluster information, potentially leading to missing clusters, which results in the rank-deficiency of individual connectivity matrices (Su et al., 2024). Moreover, the methods in Arroyo et al. (2021) and Zheng and Tang (2022) treat multi-layer SBMs as special case of their COSIE model, rather than being specifically tailored for them.

Motivated by the above problems, we study the asymptotic properties in the estimation of connectivity matrices under multi-layer SBMs, without imposing the full rank assumption on their populations. We develop a simple and efficient spectral clustering-based method for the scaled connectivity matrix estimation, where the scaled connectivity matrix is also called the score matrix (Arroyo et al., 2021). Under proper conditions, we establish the asymptotic normality of the estimated scaled connectivity matrices under multi-layer SBMs and its variants, the degree-corrected multi-layer SBM. We emphasize that the asymptotic properties can be used to perform various statistical inferences, such as interval estimation and hypothesis testing.

The main contributions of this paper are threefold. First, we systematically study and specifically tailor the asymptotic normality of the scaled connectivity matrix estimation under the multi-layer SBM and its variant, the multi-layer degree-corrected SBM. To the best of our knowledge, the asymptotic normality under the degree-corrected SBM is less explored, let alone the multi-layer degree-corrected SBM. Second, the conditions under which our results hold are relatively mild. Compared to previous work which necessitated full rank in each layer (Arroyo et al., 2021; Zheng and Tang, 2022), we allow individual connectivity matrices to be rank-deficient and only require their squared summation to be full rank. This adaptability makes our approach particularly suitable for analyzing multi-layer networks where each layer only captures partial underlying communities. Third, we apply the established asymptotic normality to two statistical inference tasks, one for interval estimation of the scaled connectivity matrices, and the other for testing whether the population matrices corresponding to different layers are the same. The numerical results show superior performance of the proposed method over compared methods in these tasks. The analysis of a real-world trade network dataset provides interpretable results.

The rest of the paper is organized as follows. Section 2 introduces the estimation method of individual connectivity matrices in the multi-layer SBM. Section 3 provides the asymptotic normality of the estimates. Section 4 extends the method to multi-layer degree-corrected SBMs and presents the corresponding asymptotic normality results. Sections 5 and 6 include the simulation and real data experiment. Section 7 concludes the paper and provides possible extensions. Technical proofs are included in the Appendix.

2 Methodology

In this section, we first present and reparameterize the multi-layer stochastic block model. Then we propose the method for estimating the individual connectivity matrices. The following notation is used throughout the paper.

Notation: We use [n][n] to denote the set {1,,n}\{1,...,n\}. For a matrix An×nA\in\mathbb{R}^{n\times n}, AF\|A\|_{F} and A1,\|A\|_{1,\infty} denote the Frobenius norm and the maximum row-wise 1\ell_{1} norm, respectively. The spectral norm of a matrix and the Euclidean norm of a vector are denoted by 2\|\cdot\|_{2}. The notation diag(){\rm diag}(\cdot) is used to denote a diagonal matrix formed from either a vector or the diagonal elements of a matrix. The notation vec(){\rm vec}(\cdot) refers to the vectorization of the upper triangular part of a symmetric matrix, proceeding in a column-wise order. For two sequences ana_{n} and bnb_{n}, we write anbna_{n}\lesssim b_{n} if there exists some constant c>0c>0 such that ancbna_{n}\leq cb_{n}, and use anbna_{n}\asymp b_{n} to denote the case where both anbna_{n}\lesssim b_{n} and bnanb_{n}\lesssim a_{n}. The notation an=ω(bn)a_{n}=\omega(b_{n}) denotes that bn/an0b_{n}/a_{n}\rightarrow 0 as nn goes to infinity, while an=o(bn)a_{n}=o(b_{n}) means that bn=ω(an)b_{n}=\omega(a_{n}).

2.1 Multi-layer SBMs

Consider the multi-layer network comprising LL layers and nn shared nodes, with its symmetric adjacency matrices represented by AlA_{l}, where Al{0,1}n×nA_{l}\in\{0,1\}^{n\times n} for all 1lL1\leq l\leq L. The multi-layer SBM (Han et al., 2015; Valles-Catala et al., 2016; Paul and Chen, 2016) is as follows.

Assume the nn nodes are partitioned into KK layer-independent communities with the community assignment of node ii denoted by gi[K]g_{i}\in[K]. With this community assignment, the element Al,ij(i<j)A_{l,ij}(i<j) of Al(l[L])A_{l}(l\in[L]) is generated independently as the following SBM (Holland et al., 1983),

Al,ijBernoulli(ρBl,gigj),A_{l,ij}\sim{\rm Bernoulli}(\rho B_{l,g_{i}g_{j}}),

where ρ(0,1]\rho\in(0,1] controls the overall sparsity of the generated networks, Bl[0,1]K×KB_{l}\in\mathbb{[}0,1]^{K\times K} denotes the heterogeneous connectivity matrix, and we assume the diagonal entries Al,iiA_{l,ii}’s are all 0.

It is easy to see that

Ql=ρΘBlΘT[0,1]n×nQ_{l}=\rho\Theta B_{l}\Theta^{T}\in[0,1]^{n\times n}

is the population matrix for AlA_{l}, in the sense that Qldiag(Ql)=𝔼(Al)Q_{l}-{\rm diag}(Q_{l})=\mathbb{E}(A_{l}), where Θ{0,1}n×K\Theta\in\{0,1\}^{n\times K} denotes the membership matrix with Θik=1\Theta_{ik}=1 if and only if k=gik=g_{i} and Θik=0\Theta_{ik}=0 otherwise. Define Δ=diag(n1,,nK)\Delta={\rm diag}(n_{1},\ldots,n_{K}), where each diagonal element nkn_{k} represents the number of nodes in community k[K]k\in[K], we can then rearrange QlQ_{l} as

Ql=ρΘΔ1/2Δ1/2BlΔ1/2Δ1/2ΘT:=UMlUT,Q_{l}=\rho\Theta\Delta^{-1/2}\Delta^{1/2}B_{l}\Delta^{1/2}\Delta^{-1/2}\Theta^{T}:=UM_{l}U^{T}, (1)

where U=ΘΔ1/2U=\Theta\Delta^{-1/2} and Ml=ρΔ1/2BlΔ1/2M_{l}=\rho\Delta^{1/2}B_{l}\Delta^{1/2}. Here, UU is an n×Kn\times K orthogonal column matrix with KK different rows and serves as the eigenspace of the population matrix QlQ_{l}. The K×KK\times K symmetric matrix MlM_{l} is referred to as the scaled connectivity matrix or the score matrix. We will study its estimation and asymptotic normality in subsequent sections. Note that Arroyo et al. (2021) first considered this type of decomposition when they developed the COSIE model.

Through the decomposition in (1), the multi-layer SBM can be reparameterized by (U,M1,,MLU,M_{1},\ldots,M_{L}). Note that such parametrization is identifiable up to orthogonal transformation because for any orthogonal matrix Z𝕆(K)Z\in\mathbb{O}(K), (UZ)ZTMlZ(UZ)T(UZ)Z^{T}M_{l}Z(UZ)^{T} provides another valid decomposition, where 𝕆(K)\mathbb{O}(K) represents the set consists of all K×KK\times K orthogonal matrices.

2.2 Estimation of the scaled connectivity matrices

In order to estimate the scaled connectivity matrices Ml(l[L])M_{l}(l\in[L]), we first estimate the common eigenspace UU.

To utilize the information across layers, we regard UU as the eigenspace of the following l=1LQl2\sum_{l=1}^{L}Q_{l}^{2},

l=1LQl2=l=1Lρ2ΘBlΘTΘBlΘT=Ul=1Lρ2Δ1/2BlΔBlΔ1/2UT,\sum_{l=1}^{L}Q_{l}^{2}=\sum_{l=1}^{L}\rho^{2}\Theta B_{l}\Theta^{T}\Theta B_{l}\Theta^{T}=U\sum_{l=1}^{L}\rho^{2}\Delta^{1/2}B_{l}\Delta B_{l}\Delta^{1/2}U^{T},

where we squared QlQ_{l}’s before summing them up to avoid the community cancellation by direct summation (Lei and Lin, 2023). Suppose l=1LBl2\sum_{l=1}^{L}B_{l}^{2} is of full rank, and recalling that Δ\Delta is a full rank diagonal matrix, then l=1Lρ2Δ1/2BlΔBlΔ1/2\sum_{l=1}^{L}\rho^{2}\Delta^{1/2}B_{l}\Delta B_{l}\Delta^{1/2} is also of full rank. Denote the l=1Lρ2Δ1/2BlΔBlΔ1/2\sum_{l=1}^{L}\rho^{2}\Delta^{1/2}B_{l}\Delta B_{l}\Delta^{1/2} by WΛWTW\Lambda W^{T}, where W𝕆(K)W\in\mathbb{O}(K) and Λ\Lambda is a diagonal matrix. Then we have

l=1LQl2=UWΛWTUT.\sum_{l=1}^{L}Q_{l}^{2}=UW\Lambda W^{T}U^{T}.

Hence, UWUW are the eigenvectors of l=1LQl2\sum_{l=1}^{L}Q_{l}^{2}.

Therefore, we can estimate UU using the eigenvectors of the sample version l=1LQl2\sum_{l=1}^{L}Q_{l}^{2}. Specifically, to mitigate the bias introduced by the diagonal part of l=1LAl2\sum_{l=1}^{L}A_{l}^{2}, we conduct eigendecomposition on the following bias-adjusted sum of squares of the adjacency matrices

l=1L(Al2Dl),\sum_{l=1}^{L}(A_{l}^{2}-D_{l}),

where DlD_{l} is an n×nn\times n diagonal matrix with Dl,ii=j=1nAl,ijD_{l,ii}=\sum_{j=1}^{n}A_{l,ij} representing the degree of node ii in layer ll. The resulting eigenvectors are denoted by U^\hat{U}.

Given U^\hat{U}, we are ready to estimate the MlM_{l} matrices by recalling the decomposition of the population matrix QlQ_{l} in (1). Specifically, we can obtain the following estimator for each layer ll:

Ml^=argminMK×KAlU^MU^T=U^TAlU^.\hat{M_{l}}=\operatorname*{arg\,min}\limits_{M\in\mathbb{R}^{K\times K}}\|A_{l}-\hat{U}M\hat{U}^{T}\|=\hat{U}^{T}A_{l}\hat{U}. (2)
Remark 1.

Similar estimation approaches for scaled connectivity matrices have been used by Arroyo et al. (2021), where the common eigenspace UU is derived based on distributed estimation techniques (Fan et al., 2019). By contrast, our methodology utilizes the bias-adjusted sum of squares of the adjacency matrices to estimate the eigenspace, aiming to avoid potential signal cancellation. This aggregation method was initially proposed by Lei and Lin (2023); however, their primary focus was on the statistical consistency of community detection instead of the asymptotic normality in the estimation of connectivity matrices.

3 Asymptotic properties

In this section, we present the asymptotic properties of the individual estimates of the scaled connectivity matrices Ml(l[L])M_{l}(l\in[L]) in the multi-layer SBM.

To establish the asymptotic normality of MlM_{l}, we need the following assumptions.

Assumption 1.

The number of communities KK is fixed. The community sizes are balanced, that is, there exists some constant c1>1c_{1}>1 such that each community size is in [c11n/K,c1n/K][c_{1}^{-1}n/K,c_{1}n/K].

Assumption 1 also indicates that there exist positive constants c2,c3c_{2},c_{3}, and an orthogonal matrix Z𝕆(K)Z\in\mathbb{O}(K), such that c2/n|(UZ)is|c3/nc_{2}/\sqrt{n}\leq|(UZ)_{is}|\leq c_{3}/\sqrt{n} for all i[n]i\in[n] and s[K]s\in[K]. This is referred to as eigenvector delocalization (Rudelson and Vershynin, 2015; He et al., 2019). Indeed, by recalling the relationship U=ΘΔ1/2U=\Theta\Delta^{-1/2} and choosing the orthogonal matrix ZZ with c2/KZstc3/Kc_{2}/\sqrt{K}\leq Z_{st}\leq c_{3}/\sqrt{K} for each KK and each pair 1s,tK1\leq s,t\leq K, the delocalization of UU follows by Assumption 1.

Assumption 2.

The minimum eigenvalue of l=1LBl2\sum_{l=1}^{L}B_{l}^{2} is at least c4Lc_{4}L for some constant c4>0c_{4}>0.

Assumption 2 specifies the growth rate of the minimum eigenvalue of l=1LBl2\sum_{l=1}^{L}B_{l}^{2} in order to make the theoretical bound of the estimated common eigenspace U^\hat{U} around UU concrete. The linear growth rate is reasonable because when the connectivity matrices are common, Assumption 2 holds naturally.

Assumption 3.

Suppose the sum of the variance of the edges satisfies

s2(Ql):=i=1nj=1nQl,ij(1Ql,ij)=ω(1)s^{2}(Q_{l}):=\sum_{i=1}^{n}\sum_{j=1}^{n}Q_{l,ij}(1-Q_{l,ij})=\omega(1)

for all l[L]l\in[L].

Assumption 3 is critical to ensure the Lindeberg conditions for the Central Limit Theorem are met. For balanced community sizes, Assumption 3 simplifies to s=1Kt=1KρBl,st(1ρBl,st)=ω(1/n2)\sum_{s=1}^{K}\sum_{t=1}^{K}\rho B_{l,st}(1-\rho B_{l,st})=\omega(1/n^{2}), which suffices if ρ=ω(1/n2)\rho=\omega(1/n^{2}).

Before stating the next assumption, we first define the following K(K+1)/2×K(K+1)/2K(K+1)/2\times K(K+1)/2 matrix Σl\Sigma^{l},

Σ2s+t(t1)2,2s+t(t1)2l:=i=1n1j=i+1n(UisUjt+UjsUit)(UisUjt+UjsUit)Ql,ij(1Ql,ij)\Sigma_{\frac{2s+t(t-1)}{2},\frac{2s^{\prime}+t^{\prime}(t^{\prime}-1)}{2}}^{l}:=\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}(U_{is}U_{jt}+U_{js}U_{it})\;(U_{is^{\prime}}U_{jt^{\prime}}+U_{js^{\prime}}U_{it^{\prime}})\;Q_{l,ij}(1-Q_{l,ij}) (3)

for each 1stK1\leq s\leq t\leq K and 1stK1\leq s^{\prime}\leq t^{\prime}\leq K. Also, define Gl=AlPlG_{l}=A_{l}-P_{l} with Pl=𝔼(Al)=Qldiag(Ql)P_{l}=\mathbb{E}(A_{l})=Q_{l}-\text{diag}(Q_{l}) to be the noise matrix. As we will see, Σl\Sigma^{l} actually serves as the covariance matrix of the vectorized UTGlUU^{T}G_{l}U, denoted by vec(UTGlU)K(K+1)/2{\rm vec}(U^{T}G_{l}U)\in\mathbb{R}^{K(K+1)/2}, where we focus on the upper triangular part of the symmetric UTGlUU^{T}G_{l}U. Specifically, for any 1stK1\leq s\leq t\leq K, the 2s+t(t1)2\frac{2s+t(t-1)}{2}th entry in vec(UTGlU){\rm vec}(U^{T}G_{l}U) corresponds to (UTGlU)s,t(U^{T}G_{l}U)_{s,t}.

Assumption 4.

λmin(Σl)=ω(1/n2)\lambda_{\min}(\Sigma^{l})=\omega(1/n^{2}) for all l[L]l\in[L], where λmin()\lambda_{\min}(\cdot) denotes the smallest eigenvalue.

Similar to Assumption 3, Assumption 4 contributes to verifying the Lindeberg conditions for the Central Limit Theorem. By Assumption 3 and the discussions after Assumption 1, we can conclude that each entry of Σl\Sigma_{l} is ω(1/n2)\omega(1/n^{2}). Hence, Assumption 4 is stronger than Assumption 3.

Remark 2.

Assumptions 3 and 4 were first used in Arroyo et al. (2021).

With these assumptions, we obtain the following entry-wise and vector-wise asymptotic properties of the estimated scaled connectivity matrices under the multi-layer SBM.

Theorem 1.

Suppose Assumptions 1 and 2 hold for the multi-layer SBM, and a positive constant c5c_{5} exists such that nρc5log(L+n)n\rho\geq c_{5}\log(L+n). Then the estimate Ml^\hat{M_{l}} obtained from (2) has the following asymptotic properties.

(a) If Assumption 3 holds, then for any given l[L]l\in[L] and s,t[K]s,t\in[K], we have

(Σ2s+t(t1)2,2s+t(t1)2l)1/2(ZMl^ZTMlZElZT)s,td𝒩(0,1)\left(\Sigma_{\frac{2s+t(t-1)}{2},\frac{2s+t(t-1)}{2}}^{l}\right)^{-1/2}\left(Z\hat{M_{l}}Z^{T}-M_{l}-ZE_{l}Z^{T}\right)_{s,t}\rightarrow_{d}\mathcal{N}(0,1)

as nn goes to infinity.

(b) If Assumption 4 holds, then for any given l[L]l\in[L], we have

(Σl)1/2vec(ZMl^ZTMlZElZT)d𝒩(𝟎,𝑰)\left(\Sigma^{l}\right)^{-1/2}{\rm vec}\left(Z\hat{M_{l}}Z^{T}-M_{l}-ZE_{l}Z^{T}\right)\rightarrow_{d}\mathcal{N}(\bm{0},\bm{I})

as nn goes to infinity. Here, 𝒩(𝟎,𝐈)\mathcal{N}(\bm{0},\bm{I}) is a standard normal distribution in K(K+1)/2K(K+1)/2 dimensions.

In (a) and (b), Z=arginfZ𝕆(K)UTU^ZFZ=\operatorname*{arg\,inf}\limits_{Z\in\mathbb{O}(K)}\|U^{T}\hat{U}-Z\|_{F} is a K×KK\times K orthogonal matrix, and ElE_{l} is a diminishing term satisfying ElFc6(ρ+log1/2(L+n)L1/2)\|E_{l}\|_{F}\leq c_{6}\left(\rho+\frac{\log^{1/2}(L+n)}{L^{1/2}}\right) with high probability for some positive constant c6c_{6}.

Remark 3.

By Assumption 3 and the discussions after Assumption 1, we have Σ2s+t(t1)2,2s+t(t1)2l=s2(Ql)n2(1+o(1))\Sigma_{\frac{2s+t(t-1)}{2},\frac{2s+t(t-1)}{2}}^{l}\\ =\frac{s^{2}(Q_{l})}{n^{2}}(1+o(1)). Thus the smaller edge variance s2(Ql)s^{2}(Q_{l}) would lead to a more efficient estimate (M^l)st(\hat{M}_{l})_{st}, as long as the edge variance satisfies Assumption 3.

Remark 4.

The bias term ElF\|E_{l}\|_{F} is negligible because it is dominated by MlFnρ\|M_{l}\|_{F}\asymp n\rho. This is also numerically verified in Section 5.

Remark 5.

Compared with Arroyo et al. (2021), the merit of Theorem 1 lies in that we derive the non-asymptotic error bound for the bias term while their bound holds in expected value. In addition, we relax the full rank assumption of each connectivity matrix therein.

4 Extension to multi-layer degree-corrected SBMs

In this section, we extend the proposed method and the corresponding asymptotic normality results to multi-layer degree-corrected stochastic block models, which is a counterpart of the multi-layer SBM but in each layer the network is assumed to be generated from a degree-corrected SBM (DCSBM) (Karrer and Newman, 2011).

The SBM can not capture the degree heterogeneity inherent in the networks. To address this, the DCSBM extends the standard SBM by incorporating node specific parameters, allowing degrees to vary within the same community.

We now introduce the multi-layer DCSBM, where the layer-wise networks are generated from the DCSBM. Without specification, the notes and notation are the same as those in Section 2. Define ψ+n\psi\in\mathbb{R}^{n}_{+} to be the degree heterogeneity parameter which measures the propensity of a node in forming edges with other nodes and is consensus among layers. For identifiability, we assume maxiCkψi=1\max_{i\in C_{k}}\psi_{i}=1 for all k[K]k\in[K] (Lei and Rinaldo, 2015; Zhang et al., 2022). Given ψ\psi, the community assignments gi[K]g_{i}\in[K], the connectivity matrix Bl(l[L])B_{l}(l\in[L]), and the network sparsity parameter ρ\rho, the element Al,ij(i<j)A_{l,ij}(i<j) is generated independently as follows

Al,ijBernoulli(ρψiψjBl,gigj),A_{l,ij}\sim{\rm Bernoulli}(\rho\psi_{i}\psi_{j}B_{l,g_{i}g_{j}}),

and Al,ij=Al,jiA_{l,ij}=A_{l,ji} and Al,ii=0A_{l,ii}=0. The population adjacency matrix of AlA_{l} is then

Ql=ρdiag(ψ)ΘBlΘTdiag(ψ).Q_{l}=\rho{\rm diag}(\psi)\Theta B_{l}\Theta^{T}{\rm diag}(\psi).

To facilitate further analysis, we now give some additional notation. For each k[K]k\in[K], define Gk={1in:gi=k}G_{k}=\{1\leq i\leq n:g_{i}=k\} as the set of nodes whose community membership is kk. Let ϕk\phi_{k} be an n×1n\times 1 vector that matches ψ\psi on the index set GkG_{k} and is zero elsewhere. Define Ψ=diag(ϕ122,,ϕK22)\Psi={\rm diag}(\|\phi_{1}\|_{2}^{2},\ldots,\|\phi_{K}\|_{2}^{2}), where each diagonal element ϕk22\|\phi_{k}\|_{2}^{2} represents the effective community size of community kk. This allows us to reformat QlQ_{l} as

Ql=ρdiag(ψ)ΘΨ1/2Ψ1/2BlΘTΨ1/2Ψ1/2diag(ψ):=UMlUT,Q_{l}=\rho{\rm diag}(\psi)\Theta\Psi^{-1/2}\Psi^{1/2}B_{l}\Theta^{T}\Psi^{1/2}\Psi^{-1/2}{\rm diag}(\psi):=UM_{l}U^{T},

where U=diag(ψ)ΘΨ1/2U={\rm diag}(\psi)\Theta\Psi^{-1/2} and Ml=ρΨ1/2BlΨ1/2M_{l}=\rho\Psi^{1/2}B_{l}\Psi^{1/2}. It can be shown that UTU=IU^{T}U=I.

As described in the case of multi-layer SBMs, we regard UU as the eigenspace of the following l=1LQl2\sum_{l=1}^{L}Q_{l}^{2},

l=1LQl2=l=1Lρ2diag(ψ)ΘBlΘTdiag2(ψ)ΘBlΘTdiag(ψ)=Ul=1Lρ2Ψ1/2BlΨBlΨ1/2UT.\sum_{l=1}^{L}Q_{l}^{2}=\sum_{l=1}^{L}\rho^{2}{\rm diag}(\psi)\Theta B_{l}\Theta^{T}{\rm diag}^{2}(\psi)\Theta B_{l}\Theta^{T}{\rm diag}(\psi)=U\sum_{l=1}^{L}\rho^{2}\Psi^{1/2}B_{l}\Psi B_{l}\Psi^{1/2}U^{T}.

Suppose l=1LBl2\sum_{l=1}^{L}B_{l}^{2} is of full rank, then l=1Lρ2Ψ1/2BlΨBlΨ1/2\sum_{l=1}^{L}\rho^{2}\Psi^{1/2}B_{l}\Psi B_{l}\Psi^{1/2} is of full rank as well. Similar to the case of multi-layer SBMs, we perform the eigendecomposition on the bias-adjusted sum of squares of the adjacency matrices l=1L(Al2Dl)\sum_{l=1}^{L}(A_{l}^{2}-D_{l}), where recall DlD_{l} is an n×nn\times n diagonal matrix with Dl,ii=j=1nAl,ijD_{l,ii}=\sum_{j=1}^{n}A_{l,ij} and the resulting eigenvectors are denoted by U^\hat{U}. We further obtain Ml^\hat{M_{l}} with the help of U^\hat{U} through (2).

The following Assumptions E1, E3, and E4 are needed for establishing the asymptotic normality of M^l\hat{M}_{l}, which are counterparts of the assumptions under the multi-layer SBM.

Assumption E1.

The number of communities KK is fixed and Kϕk2ψ2K\|\phi_{k}\|_{2}\asymp\|\psi\|_{2} for all k[K]k\in[K].

Assumption E3.

minl[L]i=1nj=1nQl,ij(1Ql,ij)=ω(n2/ψ24)\min\limits_{l\in[L]}\sum_{i=1}^{n}\sum_{j=1}^{n}Q_{l,ij}(1-Q_{l,ij})=\omega(n^{2}/\|\psi\|_{2}^{4}).

Assumption E4.

λmin(Σl)=ω(1/ψ24)\lambda_{\min}(\Sigma^{l})=\omega(1/\|\psi\|^{4}_{2}) for all l[L]l\in[L], where λmin()\lambda_{\min}(\cdot) denotes the smallest eigenvalue.

Assumption E1 requires that the node propensity parameters restricted to each community have the same order of Euclidean norm. This condition is frequently imposed in the analysis of the DCSBM; see Su et al. (2020); Jin et al. (2023) and Agterberg et al. (2022). Generally, Assumptions E3 and E4 are more stringent than Assumptions 3 and 4 because ψ2n\|\psi\|_{2}\lesssim\sqrt{n}; while in the special case of multi-layer SBMs where ψ=𝟏\psi=\mathbf{1} and ψ2n\|\psi\|_{2}\asymp\sqrt{n}, Assumptions E3 and E4 reduce to Assumptions 3 and 4, respectively. With these assumptions, the entry-wise and vector-wise asymptotic normality hold for the estimated scaled connectivity matrix M^l\hat{M}_{l} under the multi-layer DCSBM.

Theorem 2.

Suppose Assumptions E1 and 2 hold for the multi-layer DCSBM, and a positive constant c7c_{7} exists such that nρc7log(L+n)n\rho\geq c_{7}\log(L+n). Consider the estimate Ml^\hat{M_{l}} obtained from (2), which has the following asymptotic properties.

(a) If Assumption E3 holds, then for any given l[L]l\in[L] and s,t[K]s,t\in[K], we have

(Σ2s+t(t1)2,2s+t(t1)2l)1/2(ZMl^ZTMlZElZT)s,td𝒩(0,1)\left(\Sigma_{\frac{2s+t(t-1)}{2},\frac{2s+t(t-1)}{2}}^{l}\right)^{-1/2}\left(Z\hat{M_{l}}Z^{T}-M_{l}-ZE_{l}Z^{T}\right)_{s,t}\rightarrow_{d}\mathcal{N}(0,1)

as nn goes to infinity.

(b) If Assumption E4 holds, then for any given l[L]l\in[L], we have

(Σl)1/2vec(ZMl^ZTMlZElZT)d𝒩(𝟎,𝑰)\left(\Sigma^{l}\right)^{-1/2}{\rm vec}\left(Z\hat{M_{l}}Z^{T}-M_{l}-ZE_{l}Z^{T}\right)\rightarrow_{d}\mathcal{N}(\bm{0},\bm{I})

as nn goes to infinity. Here, 𝒩(𝟎,𝐈)\mathcal{N}(\bm{0},\bm{I}) is a standard normal distribution in K(K+1)/2K(K+1)/2 dimensions.

In (a) and (b), Z=arginfZ𝕆(K)UTU^ZFZ=\operatorname*{arg\,inf}\limits_{Z\in\mathbb{O}(K)}\|U^{T}\hat{U}-Z\|_{F} is a K×KK\times K orthogonal matrix, and ElE_{l} is a bias term satisfying ElFc8(n2log1/2(L+n)L1/2ψ24+max{n3/2ρ1/2ψ24,ρ})\|E_{l}\|_{F}\leq c_{8}\left(\frac{n^{2}\log^{1/2}(L+n)}{L^{1/2}\|\psi\|_{2}^{4}}+\max\{\frac{n^{3/2}\rho^{1/2}}{\|\psi\|_{2}^{4}},\rho\}\right) with high probability for some positive constant c8c_{8}.

Remark 6.

ψ2\|\psi\|_{2} reveals the degree of heterogeneity to some extent. A larger ψ2\|\psi\|_{2} would lead to a smaller bias ElF\|E_{l}\|_{F}. When ψ2=ω(n1/4)\|\psi\|_{2}=\omega(n^{1/4}), the upper bound of ElF\|E_{l}\|_{F} does not exceed c9(n1/2ρ1/2+nL1/2)c_{9}\left(n^{1/2}\rho^{1/2}+\frac{n}{L^{1/2}}\right) up to log factors, where c9c_{9} is some positive constant. In this case, when ρL1/2=ω(1)\rho L^{1/2}=\omega(1), the bias term ElF\|E_{l}\|_{F} is dominated by MlFnρ\|M_{l}\|_{F}\asymp n\rho.

Remark 7.

The discrepancy between UU and U^\hat{U}, which we establish in Lemma A2, is important for Theorem 2. It is of independent interest in evaluating the misclassification performance of the spectral clustering-based algorithm under the multi-layer DCSBM.

5 Simulations

In this section, we conduct simulations to investigate the finite sample performance of the proposed method. In Section 5.1, we verify the vanishing behavior of the bias term ElE_{l} in the formulation of asymptotic normality. In Sections 5.2 and 5.3, we test the efficacy of the derived asymptotic normality in two downstream statistical inference tasks, one for the interval estimation of the entries of the scaled connectivity matrices and the other for hypothesis testing to infer whether the multi-layer network exhibits homogeneity.

The proposed method is denoted by SCCE, namely, the abbreviation for spectral clustering-based method for connectivity matrix estimation. Since studies on the asymptotic properties under multi-layer SBMs are relatively limited, we compare SCCE with the Multiple Adjacency Spectral Embedding (MASE) method studied in Arroyo et al. (2021) and Zheng and Tang (2022).

5.1 Bias evaluation

As shown in Theorems 1 and 2, M^l\hat{M}_{l} exhibits asymptotic normality with a bias term ElE_{l} for all l[L]l\in[L]. The theoretical results indicate that for a fixed number of nodes nn, the bias tends to zero in Frobenius norm as the number of layers LL increases and the edge density ρ\rho decreases. The exact formulation for ElE_{l} can be found in the proofs and (A1).

In this experiment, we numerically verify the vanishing behavior of the bias ElE_{l}. To this end, we generate the following multi-layer SBM. Consider nn nodes per network with K=3K=3 communities proportional to the number of nodes via (0.4,0.3,0.3)(0.4,0.3,0.3). We set Bl=ρB(1)B_{l}=\rho B^{(1)} for l{1,,L/2}l\in\{1,\ldots,L/2\} and Bl=ρB(2)B_{l}=\rho B^{(2)} for l{L/2+1,,L}l\in\{L/2+1,\ldots,L\}, with

B(1)=U[1.50000.20000.5]UT[0.6750.1750.460.1750.6750.460.460.460.85]B^{(1)}=U\begin{bmatrix}1.5&0&0\\ 0&0.2&0\\ 0&0&0.5\end{bmatrix}U^{T}\approx\begin{bmatrix}0.675&0.175&0.46\\ 0.175&0.675&0.46\\ 0.46&0.46&0.85\end{bmatrix}

and

B(2)=U[1.50000.20000.5]UT[0.1750.6750.460.6750.1750.460.460.460.85],B^{(2)}=U\begin{bmatrix}1.5&0&0\\ 0&0.2&0\\ 0&0&-0.5\end{bmatrix}U^{T}\approx\begin{bmatrix}0.175&0.675&0.46\\ 0.675&0.175&0.46\\ 0.46&0.46&0.85\end{bmatrix},

where

U=[1/21/22/21/21/22/22/22/20].U=\begin{bmatrix}1/2&1/2&-\sqrt{2}/2\\ 1/2&1/2&\sqrt{2}/2\\ \sqrt{2}/2&-\sqrt{2}/2&0\end{bmatrix}. (4)

We test how the bias l=1LElF/L\sum_{l=1}^{L}\|E_{l}\|_{F}/L varies against the number of layers LL. The average results over 100 replications are displayed in Figure 1. Specifically, in Figure 1(a), the number of nodes is fixed to be 500 and we test the method for various fixed ρ{0.3,0.2,0.1,0.05}\rho\in\{0.3,0.2,0.1,0.05\}. The bias decreases as LL increases and ρ\rho decreases. In particular, when ρ\rho is large, our upper bound on the bias indicates that ρ\rho is the dominating term and large LL will not decrease the bias. The numerical result with ρ=0.3\rho=0.3 coincides with this theoretical finding. In Figure 1(b), the overall edge density ρ\rho is fixed to be 0.1 and we test the results for various fixed n{200,300,400,500}n\in\{200,300,400,500\}. The results are also consistent with the theoretical bound, showing that the bias decreases as the number of layers LL or the number of nodes nn increases.

Refer to caption
(a)
Refer to caption
(b)
Figure 1: The Frobenius norm of the bias in Theorem 1, averaged over LL layers. (a) The number of nodes n=500n=500. (b) The overall edge density ρ=0.1\rho=0.1.

5.2 Interval estimation

The asymptotic distribution of M^l,ij\hat{M}_{l,ij} helps to establish the interval estimate of Ml,stM_{l,st} for 1lL1\leq l\leq L and 1s,tK1\leq s,t\leq K. In particular, the interval estimate for Ml,stM_{l,st}, as indicated by Theorems 1 and 2, is given by

(ZM^lZT)s,t±uα/2(Σ^2s+t(t1)2,2s+t(t1)2l)1/2,(Z\hat{M}_{l}Z^{T})_{s,t}\pm u_{\alpha/2}\left(\hat{\Sigma}_{\frac{2s+t(t-1)}{2},\frac{2s+t(t-1)}{2}}^{l}\right)^{1/2},

where uα/2u_{\alpha/2} is the α/2\alpha/2-th percentile of the standard normal distribution, and Σ^\hat{\Sigma} represents the estimated covariance matrix, which is the counterpart of (3) with UU and QlQ_{l}’s replaced by U^\hat{U} and A^l\hat{A}_{l}’s, respectively.

In this experiment, we test the accuracy of the estimated interval using the probability of the estimated interval covering the true underlying parameter Ml,ij{M}_{l,ij} over 200 replications. We consider two model set-ups, one for the multi-layer SBM and the other for the multi-layer DCSBM. In the multi-layer SBM, the network generation process is similar to that described in Section 5.1. We consider various combinations of n{100,200,300,400,500}n\in\{100,200,300,400,500\} and ρ{0.3,0.2,0.1,0.05}\rho\in\{0.3,0.2,0.1,0.05\}. For each combination of (n,ρ)(n,\rho), we consider two different numbers of layers, LL, set at 50 and 100, respectively. In the multi-layer DCSBM, the network generation process is also similar to that in Section 5.1 with the number of nodes n{300,400,500}n\in\{300,400,500\} except that we introduce the degree heterogeneity parameter ψ\psi. For each ψi\psi_{i}, we independently generate ψ~i\tilde{\psi}_{i} from the uniform distribution U(2,3)U(2,3) and set ψi=βnψ~i/ψ~2\psi_{i}=\beta_{n}\tilde{\psi}_{i}/\|\tilde{\psi}\|_{2} (Jin et al., 2023). Here we set the corresponding βn\beta_{n} to 10.4, 12 and 13.4 when the number of nodes nn is 300, 400 and 500, respectively.

Tables 1 and 2 show the average rate of the 95% confidence intervals covering the true parameters over 200 replications for the multi-layer SBM and multi-layer DCSBM, respectively. In particular, we calculate the average coverage rate over all the triples (s,t,L)(s,t,L). The results show that as ρ\rho, nn, and LL increase, the coverage rate under both multi-layer SBMs and multi-layer DCSBMs improves. The proposed method consistently outperforms the MASE over all range of the considered parameter settings. In addition, we observe that the 95% confidence intervals do not achieve a 95% coverage rate, which can be attributed to the presence of a bias term or to an inadequate number of nodes and network layers. Nevertheless, as the number of layers and nodes increases, the coverage rate becomes close to 95%.

Table 1: The average coverage rate of the 95% confidence intervals under multi-layer SBMs.
ρ=0.3\rho=0.3 ρ=0.2\rho=0.2 ρ=0.1\rho=0.1 ρ=0.05\rho=0.05
n{n} LL SCCE MASE SCCE MASE SCCE MASE SCCE MASE
100 50 0.860 0.764 0.752 0.696 0.787 0.697 0.817 0.743
100 0.906 0.864 0.846 0.728 0.811 0.732 0.854 0.753
200 50 0.920 0.876 0.902 0.782 0.705 0.655 0.726 0.530
100 0.924 0.906 0.919 0.872 0.865 0.715 0.755 0.598
300 50 0.924 0.895 0.920 0.799 0.831 0.618 0.588 0.550
100 0.928 0.917 0.929 0.885 0.905 0.756 0.724 0.636
400 50 0.925 0.917 0.928 0.832 0.903 0.456 0.630 0.467
100 0.927 0.923 0.928 0.894 0.922 0.658 0.857 0.490
500 50 0.929 0.926 0.927 0.854 0.911 0.404 0.752 0.413
100 0.929 0.929 0.927 0.903 0.925 0.602 0.886 0.419
Table 2: The average coverage rate of the 95% confidence intervals under multi-layer DCSBMs.
ρ=0.3\rho=0.3 ρ=0.2\rho=0.2 ρ=0.1\rho=0.1 ρ=0.05\rho=0.05
n L SCCE MASE SCCE MASE SCCE MASE SCCE MASE
300 50 0.883 0.555 0.732 0.537 0.621 0.438 0.721 0.581
100 0.915 0.692 0.881 0.579 0.664 0.434 0.758 0.577
400 50 0.905 0.480 0.843 0.450 0.522 0.363 0.660 0.463
100 0.924 0.697 0.905 0.531 0.700 0.428 0.679 0.457
500 50 0.916 0.441 0.880 0.365 0.480 0.351 0.561 0.402
100 0.925 0.634 0.914 0.434 0.796 0.433 0.583 0.392

5.3 Hypothesis testing

In multi-layer networks, an interesting statistical inference task is to test whether there is homogeneity across different layers, specifically whether some layer-wise adjacency matrices come from the same population. In the multi-layer SBM, recall the model can be reparameterized by (U,M1,,MLU,M_{1},\ldots,M_{L}), as described in (1), which implies that for any pair (k,l)(k,l) with klk\neq l and k,l{1,,L}k,l\in\{1,...,L\}, the populations are identical, namely Qk=QlQ_{k}=Q_{l}, if and only if Mk=MlM_{k}=M_{l}. Consequently, this type of homogeneity manifests as a partition of the scaled connectivity matrices MlM_{l}, say

Mi1==Mir;Mir+1==Mir+j;.M_{i_{1}}=\cdots=M_{i_{r}};\quad M_{i_{r+1}}=\cdots=M_{i_{r+j}};\quad\cdots.

Here iri_{r} is the index in {1,,L}\{1,\ldots,L\}. All pairs within a set of the partition are equal, and two MlM_{l}’s in different partitions are unequal. To infer the homogeneity of the multi-layer SBM, we consider the simultaneous testing of the (L2)\binom{L}{2} hypotheses

Hkl,0:Mk=Ml,1k<lL,H_{kl,0}:M_{k}=M_{l},\quad\quad 1\leq k<l\leq L, (5)

and use a Holm type step-down procedure to control the family-wise error rate (Lehmann and Romano, 2005). The totality of acceptance and rejection statements resulting from the multiple comparison procedure may lead to a partition of the connectivity matrices. In the Holm procedure, null hypotheses are considered successively, from most significant to least significant, with further tests depending on the outcome of earlier ones. If any hypothesis is rejected at the level α=α/(L2)\alpha^{\prime}=\alpha/\binom{L}{2}, the denominator of α\alpha^{\prime} for the next test is (L2)1\binom{L}{2}-1 and the criterion continues to be modified in a stage-wise manner, with the denominator of α\alpha^{\prime} reduced by 1 each time a hypothesis is rejected, so that tests can be conducted at successively higher significance levels. This type of multiple comparison procedure is commonly used (Dudoit et al., 2008; Noble, 2009).

The primary challenge lies in how to test the (L2)\binom{L}{2} individual hypotheses, which can be facilitated using the asymptotic distribution of the estimated scaled connectivity matrices. Specifically, to test each individual hypothesis Hkl,0H_{kl,0} at the specified significance level, we employ the Frobenius norm of the difference between the estimated scaled connectivity matrices Tkl:=M^kM^lFT_{kl}:=\|\hat{M}_{k}-\hat{M}_{l}\|_{F} as the test statistic. The distribution of the test statistic can be determined with the help of asymptotic distribution of vec(M^kM^l){\rm vec}(\hat{M}_{k}-\hat{M}_{l}).

We first specify the distribution of vec(M^kM^l){\rm vec}(\hat{M}_{k}-\hat{M}_{l}). In Theorem 1, we have provided the asymptotic distribution of vec(ZMl^ZTMl){\rm vec}(Z\hat{M_{l}}Z^{T}-M_{l}), with the vanishing bias term excluded. However, the dependence between M^k\hat{M}_{k} and M^l\hat{M}_{l} prevents a straightforward summation to obtain the asymptotic distribution of vec(ZMk^ZTMkZMl^ZT+Ml){\rm vec}(Z\hat{M_{k}}Z^{T}-M_{k}-Z\hat{M_{l}}Z^{T}+M_{l}), which, under the hypothesis Hkl,0H_{kl,0}, reduces to the distribution of vec(Z(M^kM^l)ZT){\rm vec}(Z(\hat{M}_{k}-\hat{M}_{l})Z^{T}). Fortunately, by leveraging the analogous technique used in the proof of Theorem 1, specifically the decomposition (A2), we can obtain the limiting distribution of vec(Z(M^kM^l)ZTMk+Ml){\rm vec}(Z(\hat{M}_{k}-\hat{M}_{l})Z^{T}-M_{k}+M_{l}). Specifically, we have

(Σk+Σl)1/2vec(Z(M^kM^l)ZTMk+MlZ(EkEl)ZT)d𝒩(𝟎,𝑰)\left(\Sigma^{k}+\Sigma^{l}\right)^{-1/2}{\rm vec}\left(Z(\hat{M}_{k}-\hat{M}_{l})Z^{T}-M_{k}+M_{l}-Z(E_{k}-E_{l})Z^{T}\right)\rightarrow_{d}\mathcal{N}(\bm{0},\bm{I})

for each distinct pair kk and ll satisfying 1klL1\leq k\neq l\leq L, where EkE_{k} and ElE_{l} are negligible under certain conditions and the notation is the same as those in Theorem 1. The conclusion holds for both the multi-layer SBM and the multi-layer DCSBM. As a result, under the hypothesis Hkl,0H_{kl,0} and omitting the bias terms, we can approximate the asymptotic distribution of vec(Z(M^kM^l)ZT){\rm vec}(Z(\hat{M}_{k}-\hat{M}_{l})Z^{T}) by 𝒩(𝟎,Σk+Σl)\mathcal{N}(\bm{0},\Sigma^{k}+\Sigma^{l}). With the help of asymptotic distribution of vec(Z(M^kM^l)ZT){\rm vec}(Z(\hat{M}_{k}-\hat{M}_{l})Z^{T}), we estimate the distribution of the test statistic Tkl=M^kM^lF=Z(M^kM^l)ZTFT_{kl}=\|\hat{M}_{k}-\hat{M}_{l}\|_{F}=\|Z(\hat{M}_{k}-\hat{M}_{l})Z^{T}\|_{F} by drawing samples from 𝒩(𝟎,Σ^k+Σ^l)\mathcal{N}(\bm{0},\hat{\Sigma}^{k}+\hat{\Sigma}^{l}) and calculating the empirical distribution of the Frobenius norm, where Σ^k\hat{\Sigma}^{k} and Σ^l\hat{\Sigma}^{l} denote the estimated covariance matrices defined in Section 5.2. The above procedure to determine the distribution of the test statistic is denoted by SCCE.

Experiment 1.

The distribution of the test statistic TklT_{kl} enables the testing of the individual hypothesis Hkl,0H_{kl,0}. In this experiment, we evaluate the power of the test statistic TklT_{kl} in testing Hkl,0H_{kl,0} for a given pair klk\neq l. Both the multi-layer SBM and the multi-layer DCSBM are considered. We fix L=50L=50, n=300n=300 and K=3K=3. The number of nodes in three communities is assigned according to (0.4,0.3,0.3)(0.4,0.3,0.3). The overall edge density ρ\rho is fixed at 0.2. The network generation processes of both models are similar with that in Section 5.2 except the definitions of BlB_{l}’s. In this experiment, we set Bl=ρB(1)B_{l}=\rho B^{(1)} for l{3,,L/2+1}l\in\{3,\ldots,L/2+1\}, and Bl=ρB(2)B_{l}=\rho B^{(2)} for l{L/2+2,,L}l\in\{L/2+2,\ldots,L\}, where B(1)B^{(1)} and B(2)B^{(2)} are defined in Section 5.1. For the first layer B1=ρB(1)B_{1}=\rho B^{(1)}, and for the second layer B2=ρB(2)B_{2}=\rho B^{(2^{\prime})}, the B(2)B^{(2^{\prime})} is defined to be the same as B(1)B^{(1)} except for the first entry B11(2)B^{(2^{\prime})}_{11}. We vary B11(2)B^{(2^{\prime})}_{11} to obtain a sequence of B2B_{2} matrices. Under these parameter settings, we test the individual hypothesis H12,0:M1=M2H_{12,0}:M_{1}=M_{2} at a specified significance level.

Refer to caption
(a) multi-layer SBM
Refer to caption
(b) multi-layer DCSBM
Figure 2: Empirical power to reject the hypothesis that the two different network layers have an identical population matrix at a 0.05 significance level under (a) multi-layer SBMs and (b) multi-layer DCSBMs. Empirical power is measured as the proportion of trials for which the test identified a significant result.

Figure 2 shows the averaged empirical power to reject the individual hypothesis H12,0H_{12,0} at a 0.05 significance level with B1B2F\|B_{1}-B_{2}\|_{F} increasing over 100 replications. In our comparisons, we compare SCCE with a counterpart SCCE-Oracle, which differs from SCCE primarily in that it does not use the asymptotic distribution of vec(Z(M^kM^l)ZT){\rm vec}(Z(\hat{M}_{k}-\hat{M}_{l})Z^{T}). Instead, SCCE-Oracle generates various samples from the population to obtain various M^kM^l\hat{M}_{k}-\hat{M}_{l} and uses the empirical distribution to approximate the real distribution of the test statistic TklT_{kl}. Another comparison method, MASE, is defined similarly with SCCE except that we use the estimator M^l\hat{M}_{l} and the asymptotic distribution derived in Arroyo et al. (2021). The results in Figure 2 show that the proposed method SCCE has power close to that of SCCE-Oracle, demonstrating the accuracy of the asymptotic distributions. In addition, the proposed method SCCE has higher power than MASE under both network models, showing the efficacy of the proposed method in hypothesis testing.

Experiment 2.

We shall now consider the simultaneous testing of the (L2)\binom{L}{2} hypotheses (5) by means of a Holm type step-down procedure for inferring homogeneity of multi-layer networks. For this purpose, we consider the multi-layer SBM with two distinct connectivity matrices, where all connectivity matrices constitute a partition as follows:

M1=M2==ML/2;ML/2+1=ML/2+2==ML.M_{1}=M_{2}=\cdots=M_{L/2};\quad\quad M_{L/2+1}=M_{L/2+2}=\cdots=M_{L}.

All pairs within a set of the partition are equal, and two MlM_{l}’s in different partitions are unequal. It is clear that the overall network is heterogeneous, while the network layers within each partition exhibit homogeneity. Specifically, we set Bl=ρB(1)B_{l}=\rho B^{(1)} for l{1,,L/2}l\in\{1,\ldots,L/2\} and Bl=ρB(2)B_{l}=\rho B^{(2)} for l{L/2+1,,L}l\in\{L/2+1,\ldots,L\}, with

B(1)=U[10000.40000.1]UT[0.40.30.2120.30.40.2120.2120.2120.7]B^{(1)}=U\begin{bmatrix}1&0&0\\ 0&0.4&0\\ 0&0&0.1\end{bmatrix}U^{T}\approx\begin{bmatrix}0.4&0.3&0.212\\ 0.3&0.4&0.212\\ 0.212&0.212&0.7\end{bmatrix}

and

B(2)=U[10000.40000.1]UT[0.30.40.2120.40.30.2120.2120.2120.7],B^{(2)}=U\begin{bmatrix}1&0&0\\ 0&0.4&0\\ 0&0&-0.1\end{bmatrix}U^{T}\approx\begin{bmatrix}0.3&0.4&0.212\\ 0.4&0.3&0.212\\ 0.212&0.212&0.7\end{bmatrix},

where UU is defined as in (4). We set L=20L=20, n=500n=500 and K=3K=3, with the number of nodes in three communities proportioned at (0.4,0.3,0.3)(0.4,0.3,0.3). The overall edge density ρ\rho is fixed at 0.2. Recall the definition of MlM_{l}, the difference in MlM_{l} between different partitions is now given by ρB(1)B(2)F=0.04\rho\|B^{(1)}-B^{(2)}\|_{F}=0.04.

Figure 3 presents the outcomes of the multiple comparisons conducted using a Holm type step-down procedure, where olive green signifies acceptance and white signifies rejection of the individual hypotheses. We control the Holm procedure to ensure that the family-wise error rate is no bigger than α=0.05\alpha=0.05. In the results of the proposed method SCCE, two distinct blocks are observed within which all individual hypotheses are simultaneously accepted, demonstrating the presence of homogeneity within these blocks. This is consistent with our experimental setup. However, MASE cannot accurately infer the true homogeneity. Moreover, in cases where the true hypothesis is false, the proposed method SCCE almost always rejects them.

Refer to caption
(a) SCCE
Refer to caption
(b) MASE
Figure 3: Results of the multiple comparisons conducted using a Holm type step-down procedure, where olive green signifies the acceptance of the individual hypotheses.

6 Real data analysis

In this section, we use the proposed method SCCE to measure the homogeneity across all layers of a real multi-layer network, specifically the WFAT dataset (De Domenico et al., 2015). For this purpose, we simultaneously test the (L2)\binom{L}{2} hypotheses

Hkl,0:Mk=Ml,1k<lL,H_{kl,0}:M_{k}=M_{l},\quad\quad 1\leq k<l\leq L,

using a Holm type step-down procedure. Next, we provide the data description and the results for the real dataset.

The WFAT dataset, sourced from the Food and Agriculture Organization (FAO) of the United Nations, contains annual trade records for over 400 food and agricultural products across all countries globally. Our analysis focuses specifically on 2020 trade data for cereals, a category defined by the FAO and detailed in Table A3. We constructed a multi-layer network using this data. Each layer represents a different type of cereal, with nodes representing countries and edges within each layer representing the trade relationships with respect a specific cereal. In particular, a trade link between countries is established if the trade value for a cereal exceeds $2000\$2000. To ensure network connectivity, we excluded countries with a total degree over 37 layers less than 23. This guarantees that each node is linked to at least one other node in at least half of the layers. The resulting network consists of 37 layers, each containing 114 common nodes, denoted by Al,l{1,,37}A_{l},l\in\{1,\ldots,37\}. The five continents, including America, Africa, Asia, Europe, and Oceania, suggest the choice K=5K=5 (Noroozi and Pensky, 2024).

Figure 4 shows the results of the multiple comparisons conducted using a Holm type step-down procedure, where olive green indicates the acceptance and white denotes the rejection of the individual hypotheses. We control the Holm procedure to ensure that the family-wise error rate is no greater than α=0.05\alpha=0.05. Additionally, the pp-value matrix for individual hypothesis tests across all network layers is shown. In Figure 4(a), several distinct blocks are observed where all individual hypotheses within each block are simultaneously accepted, demonstrating the presence of homogeneity within the multi-layer network. For instance, layers 5 to 14, corresponding to ‘Germ of maize’, ‘Triticale’, ‘Cereals n.e.c.’, ‘Millet’, ‘Sorghum’, ‘Bran of maize’, ‘Bran of cereals n.e.c.’, ‘Flour of mixed grain’, ‘Flour of rice’, and ‘Rye’, exhibit no rejections of the individual hypotheses in any pairwise comparisons. This outcome suggests a significant similarity in trade patterns among these cereal products. Notably, most of these cereals are consumed primarily by Asia, which might lead to simultaneous acceptance of the individual hypotheses. This observation suggests the need for an integrated analysis of the trade patterns of these cereals.

Refer to caption
(a) multiple comparisons
Refer to caption
(b) pp-values matrix
Figure 4: (a) Results of the multiple comparisons conducted using a Holm type step-down procedure for the WFAT data, with olive green indicating acceptance of the individual hypotheses. (b) Matrix of pp-values for the individual hypothesis tests corresponding to all pairs of network layers, where the layer ll correspond to the cereal numbered ll in Table A3.

7 Conclusion

In this paper, we focused on the asymptotic normality in the estimation of the scaled connectivity matrix under multi-layer SBMs and multi-layer DCSBMs. To this end, we first leveraged the leading eigenvectors of the bias-adjusted sum of squares of the adjacency matrices to estimate the common eigenspaces across layers (Lei and Lin, 2023). Building on this, we further derived efficient estimates for the layer-wise scaled connectivity matrices. These estimates are shown to satisfy asymptotic normality under mild additional assumptions. To corroborate the theoretical results, we conducted a series of simulations to apply the derived asymptotic normality to statistical inference tasks, including interval estimation and hypothesis testing. Finally, we applied the method to a real-world network dataset to infer the population homogeneity within the multi-layer network.

There are many ways to extend the content of this paper. First, we assume the connectivity matrices are different among layers. However, in practice, the network layers are often grouped such that connectivity matrices are the same within each group (Jing et al., 2021; Fan et al., 2022a). It is thus meaningful to make use of all the network layers within each group to estimate the connectivity matrices, which would lead to improved estimation accuracy and a weakened requirement for network sparsity. Second, we focused on the undirected multi-layer networks. It is of great importance to extend the method and theory to multi-layer directed networks (Zheng and Tang, 2022; Su et al., 2024). Third, we assumed the number of communities is known and fixed. In the future, it is important to study the estimation of the number of communities under multi-layer SBMs.

Appendix

Appendix A.1 provides the technical lemmas that are needed to prove the central limit theorems. Appendix A.2 includes the proof of main theorems in the main text. Appendix A.3 contains auxiliary lemmas. Appendix A.4 provides the additional results for real data analysis.

A.1 Technical lemmas

Lemma A1.

Suppose the multi-layer SBM satisfies Assumptions 1 and 2. Let U^\hat{U} denotes the leading eigenvectors of l=1L(Al2Dl)\sum_{l=1}^{L}(A_{l}^{2}-D_{l}). If nρc1log(L+n)n\rho\geq c_{1}\log(L+n) for some positive constant c1c_{1}, then

U^UZFc2(1n+log1/2(L+n)L1/2ρ1/2n1/2)\displaystyle\|\hat{U}-UZ\|_{F}\leq c_{2}\left(\frac{1}{n}+\frac{\log^{1/2}(L+n)}{L^{1/2}\rho^{1/2}n^{1/2}}\right)

with probability at least 1O((L+n)1)1-O((L+n)^{-1}) for some constant c2>0c_{2}>0.

Proof.

The bias-adjusted sum-of-squares of the adjacency matrices l=1L(Al2Dl)\sum_{l=1}^{L}(A_{l}^{2}-D_{l}) can be decomposed into a signal term and for noise terms

l=1L(Al2Dl)=l=1LQl2+N1+N2+N3+N4,\sum_{l=1}^{L}\left(A_{l}^{2}-D_{l}\right)=\sum_{l=1}^{L}Q_{l}^{2}+N_{1}+N_{2}+N_{3}+N_{4},

where N1=diag2(Ql)Qldiag(Ql)diag(Ql)QlN_{1}=\mbox{diag}^{2}(Q_{l})-Q_{l}\mbox{diag}(Q_{l})-\mbox{diag}(Q_{l})Q_{l}, N2=l=1L(PlGlT+GlPlT)N_{2}=\sum_{l=1}^{L}\left(P_{l}G_{l}^{T}+G_{l}P_{l}^{T}\right), N3=l=1LGl2diag(l=1LGl2)N_{3}=\sum_{l=1}^{L}G_{l}^{2}-\mbox{diag}(\sum_{l=1}^{L}G_{l}^{2}), and N4=diag(l=1LGl2)l=1LDlN_{4}=\mbox{diag}(\sum_{l=1}^{L}G_{l}^{2})-\sum_{l=1}^{L}D_{l}. Recall that in the context of multi-layer SBMs, Ql=ρΘBlΘTQ_{l}=\rho\Theta B_{l}\Theta^{T}, Pl=Qldiag(Ql)P_{l}=Q_{l}-\mbox{diag}(Q_{l}) and Gl=AlPlG_{l}=A_{l}-P_{l}.

According to Theorem 1 of Lei and Lin (2023), the minimum eigenvalue of the signal term l=1LQl2\sum_{l=1}^{L}Q_{l}^{2} is lower bounded by cLρ2n2cL\rho^{2}n^{2} for some positive constant cc. Note that we use cc to represent the generic constant and it may be different from line to line. We proceed to establish upper bounds for the spectral norms of all noise terms. The first noise term satisfies N12cLnρ2.\|N_{1}\|_{2}\leq cLn\rho^{2}. Regarding N2N_{2}, under the condition nρc1log(L+n)n\rho\geq c_{1}\log(L+n), we have N22cL1/2n3/2ρ3/2log1/2(n+L)\|N_{2}\|_{2}\leq cL^{1/2}n^{3/2}\rho^{3/2}\log^{1/2}(n+L) with high probability. For N3N_{3}, by Theorem 4 of Lei and Lin (2023), under the condition nρc1log(L+n)n\rho\geq c_{1}\log(L+n), we have N32cL1/2nρlog(L+n)\|N_{3}\|_{2}\leq cL^{1/2}n\rho\log(L+n) with probability at least 1O((L+n)1)1-O((L+n)^{-1}). Finally, for N4N_{4}, given (l=1LGl2)iiLnρ2+Dl,ii(\sum_{l=1}^{L}G_{l}^{2})_{ii}\leq Ln\rho^{2}+D_{l,ii} for all i[m]i\in[m], it follows that N42Lnρ2\|N_{4}\|_{2}\leq Ln\rho^{2}.

Integrating the bounds of all terms, we have with high probability,

N1+N2+N3+N42λK(l=1LQl2)cLnρ2+L1/2n3/2ρ3/2log1/2(L+n)Lρ2n2.\frac{\|N_{1}+N_{2}+N_{3}+N_{4}\|_{2}}{\lambda_{K}(\sum_{l=1}^{L}Q_{l}^{2})}\leq c\frac{Ln\rho^{2}+L^{1/2}n^{3/2}\rho^{3/2}\log^{1/2}(L+n)}{L\rho^{2}n^{2}}.

Let U^\widehat{U} be the n×Kn\times K matrices that contain the leading eigenvectors of l=1L(Al2Dl)\sum_{l=1}^{L}\left(A_{l}^{2}-D_{l}\right). In accordance with Proposition 2.2 of Vu and Lei (2013) and the Davis-Kahan sinΘ\Theta theorem (Theorem VII.3.1 of Bhatia (1997)), there exists a K×KK\times K orthogonal matrix ZZ such that

U^UZFKU^UZ2c(1n+log1/2(L+n)L1/2ρ1/2n1/2).\|\widehat{U}-UZ\|_{F}\leq\sqrt{K}\|\widehat{U}-UZ\|_{2}\leq c\left(\frac{1}{n}+\frac{\log^{1/2}(L+n)}{L^{1/2}\rho^{1/2}n^{1/2}}\right).

The proof is completed. ∎

Lemma A2.

Suppose the multi-layer DCSBM satisfies Assumptions E1 and 2. Let U^\hat{U} denotes the leading eigenvectors of l=1L(Al2Dl)\sum_{l=1}^{L}(A_{l}^{2}-D_{l}). If nρc1log(L+n)n\rho\geq c_{1}\log(L+n) for some positive constant c1c_{1}, then

U^UZFc2(nψ24+n3/2log1/2(L+n)L1/2ρ1/2ψ24)\|\hat{U}-UZ\|_{F}\leq c_{2}\left(\frac{n}{\|\psi\|_{2}^{4}}+\frac{n^{3/2}\log^{1/2}(L+n)}{L^{1/2}\rho^{1/2}\|\psi\|_{2}^{4}}\right)

with probability at least 1O((L+n)1)1-O((L+n)^{-1}) for some positive constant c2c_{2}.

Proof.

Following the approach used in the proof of Lemma A1, we separately bound the signal term and the noise terms. Recall that in the context of multi-layer degree-corrected SBMs, Pl=Qldiag(Ql)P_{l}=Q_{l}-\mbox{diag}(Q_{l}), Ql=ρdiag(ψ)ΘBlΘTdiag(ψ)Q_{l}=\rho\mbox{diag}(\psi)\Theta B_{l}\Theta^{T}\mbox{diag}(\psi) and Gl=AlPlG_{l}=A_{l}-P_{l}. Given the balanced effective community sizes assumption and the fixed number of communities, the minimum eigenvalue of Ψ\Psi is lower bounded by cψ22c\|\psi\|_{2}^{2} for some positive constant cc. Then we have

l=1LQl2\displaystyle\sum_{l=1}^{L}Q_{l}^{2} =\displaystyle= ρ2Θ~Ψ1/2[l=1LBlΨBl]Ψ1/2Θ~T\displaystyle\rho^{2}\tilde{\Theta}\Psi^{1/2}\left[\sum_{l=1}^{L}B_{l}\Psi B_{l}\right]\Psi^{1/2}\tilde{\Theta}^{T}
\displaystyle\succeq cρ2ψ22Θ~Ψ1/2l=1LBl2Ψ1/2Θ~T\displaystyle c\rho^{2}\|\psi\|_{2}^{2}\tilde{\Theta}\Psi^{1/2}\sum_{l=1}^{L}B_{l}^{2}\Psi^{1/2}\tilde{\Theta}^{T}
\displaystyle\succeq cLρ2ψ24Θ~Θ~T,\displaystyle cL\rho^{2}\|\psi\|_{2}^{4}\tilde{\Theta}\tilde{\Theta}^{T},

where \succeq denotes the Loewner partial order, in particular, let AA and BB be two Hermitian matrices of order pp, we say that ABA\succeq B if ABA-B is positive semi-definite. The latter \succeq is derived from Assumption 2. Recall that Θ~\tilde{\Theta} is an orthogonal column matrix, the smallest eigenvalue of l=1LQl2\sum_{l=1}^{L}Q_{l}^{2} is lower bounded by cLρ2ψ24cL\rho^{2}\|\psi\|_{2}^{4}. For the bounds of the spectral norms of all noise terms, they are consistent with those established in Lemma A1.

Combining the bounds of all terms, Proposition 2.2 of Vu and Lei (2013) and the Davis-Kahan sinΘ\Theta theorem, there exists a K×KK\times K orthogonal matrix ZZ such that

U^UZFKU^UZ2cLnρ2+L1/2n3/2ρ3/2log1/2(L+n)Lρ2ψ24\|\widehat{U}-UZ\|_{F}\leq\sqrt{K}\|\widehat{U}-UZ\|_{2}\leq c\frac{Ln\rho^{2}+L^{1/2}n^{3/2}\rho^{3/2}\log^{1/2}(L+n)}{L\rho^{2}\|\psi\|_{2}^{4}}

holds with high probability. The proof is completed. ∎

A.2 Main proofs

Proof of Theorem 1

Recall that Ql=ρΘBlΘTQ_{l}=\rho\Theta B_{l}\Theta^{T} and the noise matrix Gl=AlPlG_{l}=A_{l}-P_{l}, where Pl=Qldiag(Ql)P_{l}=Q_{l}-\text{diag}(Q_{l}) for all l[L]l\in[L]. Then the estimator Ml^\hat{M_{l}} can be rearranged as

Ml^\displaystyle\hat{M_{l}} =\displaystyle= U^TQlU^+U^TGlU^U^Tdiag(Ql)U^\displaystyle\hat{U}^{T}Q_{l}\hat{U}+\hat{U}^{T}G_{l}\hat{U}-\hat{U}^{T}{\rm diag}(Q_{l})\hat{U} (A1)
=\displaystyle= U^TUMlUTU^+U^TGlU^U^Tdiag(Ql)U^\displaystyle\hat{U}^{T}UM_{l}U^{T}\hat{U}+\hat{U}^{T}G_{l}\hat{U}-\hat{U}^{T}{\rm diag}(Q_{l})\hat{U}
=\displaystyle= (U^TUZT)Ml(UTU^)+ZTMl(UTU^Z)+ZTMlZ\displaystyle(\hat{U}^{T}U-Z^{T})M_{l}(U^{T}\hat{U})+Z^{T}M_{l}(U^{T}\hat{U}-Z)+Z^{T}M_{l}Z
+(U^TZTUT)Gl(U^UZ)+ZTUTGl(U^UZ)(ZTUTU^T)GlUZ\displaystyle+\leavevmode\nobreak\ (\hat{U}^{T}-Z^{T}U^{T})G_{l}(\hat{U}-UZ)+Z^{T}U^{T}G_{l}(\hat{U}-UZ)-(Z^{T}U^{T}-\hat{U}^{T})G_{l}UZ
+ZTUTGlUZU^Tdiag(Ql)U^\displaystyle+\leavevmode\nobreak\ Z^{T}U^{T}G_{l}UZ-\hat{U}^{T}{\rm diag}(Q_{l})\hat{U}
=\displaystyle= ZTMlZ+ZTUTGlUZ+El,\displaystyle Z^{T}M_{l}Z+Z^{T}U^{T}G_{l}UZ+E_{l},

where ElE_{l} in the final equality represents all terms from the penultimate equality excluding ZTMlZZ^{T}M_{l}Z and ZTUTGlUZZ^{T}U^{T}G_{l}UZ. Here, Z=arginfZ𝕆(K)UTU^ZFZ=\operatorname*{arg\,inf}\limits_{Z\in\mathbb{O}(K)}\|U^{T}\hat{U}-Z\|_{F} is a K×KK\times K orthogonal matrix. Thus,

ZMl^ZTMlZElZT=UTGlU.Z\hat{M_{l}}Z^{T}-M_{l}-ZE_{l}Z^{T}=U^{T}G_{l}U. (A2)

We initiate our argument by establishing that ZElZTZE_{l}Z^{T} tends towards zero as the number of nodes and layers increase, and the edge density decreases. We accomplish this by exerting control over each term in ElE_{l}.

ElF\displaystyle\|E_{l}\|_{F} \displaystyle\leq (U^TUZT)Ml(UTU^)F+ZTMl(UTU^Z)F\displaystyle\|(\hat{U}^{T}U-Z^{T})M_{l}(U^{T}\hat{U})\|_{F}+\|Z^{T}M_{l}(U^{T}\hat{U}-Z)\|_{F}
+(U^TZTUT)Gl(U^UZ)F+2ZTUTGl(U^UZ)F\displaystyle+\leavevmode\nobreak\ \|(\hat{U}^{T}-Z^{T}U^{T})G_{l}(\hat{U}-UZ)\|_{F}+2\|Z^{T}U^{T}G_{l}(\hat{U}-UZ)\|_{F}
+U^Tdiag(Ql)U^F\displaystyle+\leavevmode\nobreak\ \|\hat{U}^{T}{\rm diag}(Q_{l})\hat{U}\|_{F}
\displaystyle\leq U^TUZTFMl2+(UTU^Z)FMl2\displaystyle\|\hat{U}^{T}U-Z^{T}\|_{F}\|M_{l}\|_{2}+\|(U^{T}\hat{U}-Z)\|_{F}\|M_{l}\|_{2}
+U^UZF2Gl2+2U^UZFGl2+diag(Ql)2U^F\displaystyle+\leavevmode\nobreak\ \|\hat{U}-UZ\|_{F}^{2}\|G_{l}\|_{2}+2\|\hat{U}-UZ\|_{F}\|G_{l}\|_{2}+\|{\rm diag}(Q_{l})\|_{2}\|\hat{U}\|_{F}
=\displaystyle= 4(Ml2+Gl2)U^UZF2+2U^UZFGl2+Kdiag(Ql)2.\displaystyle 4(\|M_{l}\|_{2}+\|G_{l}\|_{2})\|\hat{U}-UZ\|_{F}^{2}+2\|\hat{U}-UZ\|_{F}\|G_{l}\|_{2}+\sqrt{K}\|{\rm diag}(Q_{l})\|_{2}.

The upper bound of U^UZF\|\hat{U}-UZ\|_{F} is provided in Lemma A1. With regards to the term MlM_{l}, it holds that Ml2=UMlUT2=Ql2\|M_{l}\|_{2}=\|UM_{l}U^{T}\|_{2}=\|Q_{l}\|_{2}. For GlG_{l}, applying Lemma A3, we have

Gl2cQl1,\|G_{l}\|_{2}\leq c\sqrt{\|Q_{l}\|_{1,\infty}}

with probability at least 1O(n1)1-O(n^{-1}). Note that we use cc and cc^{\prime} to represent the generic positive constants and it may be different from line to line. Consequently, it can be deduced that

ZElZTF\displaystyle\|ZE_{l}Z^{T}\|_{F} \displaystyle\leq c(Ql2+Ql1,)(1n2+log(L+n)Lnρ)\displaystyle c(\|Q_{l}\|_{2}+\sqrt{\|Q_{l}\|_{1,\infty}})\left(\frac{1}{n^{2}}+\frac{\log(L+n)}{Ln\rho}\right)
+cQl1,(1n+log1/2(L+n)L1/2n1/2ρ1/2)+Kρ\displaystyle\leavevmode\nobreak\ +c\sqrt{\|Q_{l}\|_{1,\infty}}\left(\frac{1}{n}+\frac{\log^{1/2}(L+n)}{L^{1/2}n^{1/2}\rho^{1/2}}\right)+\sqrt{K}\rho

with high probability. Additionally, given that every element of QlQ_{l} is at most ρ\rho, both Ql2\|Q_{l}\|_{2} and Ql1,\|Q_{l}\|_{1,\infty} do not exceed nρn\rho. Thus, we can deduce that ZElZTFc(ρ+log1/2(L+n)L1/2)\|ZE_{l}Z^{T}\|_{F}\leq c\left(\rho+\frac{\log^{1/2}(L+n)}{L^{1/2}}\right). As the number of layers LL increase, and the overall edge probability decrease, ZElZTZE_{l}Z^{T} tends to vanish.

Consequently, demonstrating the central limit theorem can be reduced to verifying the asymptotic normality of UTGlUU^{T}G_{l}U. The covariance of any pair (UTGlU)s,t(U^{T}G_{l}U)_{s,t} and (UTGlU)s,t(U^{T}G_{l}U)_{s^{\prime},t^{\prime}}, where 1stK1\leq s\leq t\leq K and 1stK1\leq s^{\prime}\leq t^{\prime}\leq K, is

Cov((UTGlU)s,t,(UTGlU)s,t)\displaystyle{\rm Cov}\left((U^{T}G_{l}U)_{s,t},\;(U^{T}G_{l}U)_{s^{\prime},t^{\prime}}\right)
=\displaystyle= Cov(1i,jnUisGl,ijUjt,1i,jnUisGl,ijUjt)\displaystyle{\rm Cov}\left(\sum_{1\leq i,j\leq n}U_{is}G_{l,ij}U_{jt},\;\sum_{1\leq i,j\leq n}U_{is^{\prime}}G_{l,ij}U_{jt^{\prime}}\right)
=\displaystyle= i=1n1j=i+1n(UisUjt+UjsUit)(UisUjt+UjsUit)Cov(Gl,ij,Gl,ij)\displaystyle\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}(U_{is}U_{jt}+U_{js}U_{it})\;(U_{is^{\prime}}U_{jt^{\prime}}+U_{js^{\prime}}U_{it^{\prime}})\;{\rm Cov}\left(G_{l,ij},\;G_{l,ij}\right)
=\displaystyle= i=1n1j=i+1n(UisUjt+UjsUit)(UisUjt+UjsUit)Ql,ij(1Ql,ij).\displaystyle\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}(U_{is}U_{jt}+U_{js}U_{it})\;(U_{is^{\prime}}U_{jt^{\prime}}+U_{js^{\prime}}U_{it^{\prime}})\;Q_{l,ij}(1-Q_{l,ij}).

That is the Σ2s+t(t1)2,2s+t(t1)2l\Sigma_{\frac{2s+t(t-1)}{2},\frac{2s^{\prime}+t^{\prime}(t^{\prime}-1)}{2}}^{l} term defined in (3). Regarding the expectation term, given that 𝔼(Gl)=0\mathbb{E}(G_{l})=0, it is straightforward to see that each element of UTGlUU^{T}G_{l}U has a zero-mean, as each element sums terms with zero-mean. In other words, we have 𝔼((UTGlU)s,t)=0\mathbb{E}\left((U^{T}G_{l}U)_{s,t}\right)=0 for all 1stK1\leq s\leq t\leq K.

Part (a): Note that

(UTGlU)s,t=i=1n1j=i+1n(UisUjt+UjsUit)Gl,ij:=i=1n1j=i+1nFl,ij,(U^{T}G_{l}U)_{s,t}=\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}(U_{is}U_{jt}+U_{js}U_{it})\;G_{l,ij}:=\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}F_{l,ij}, (A3)

where we let Fl,ijF_{l,ij} denote (UisUjt+UjsUit)Gl,ij(U_{is}U_{jt}+U_{js}U_{it})\;G_{l,ij}. Given ll, it is clear that the Fl,ijF_{l,ij} terms are zero-mean and independent of each other for all 1i<jn1\leq i<j\leq n. By the definition of UU and according to Assumption 1, we have |Fl,ij|c/n|F_{l,ij}|\leq c^{\prime}/n for all l[L]l\in[L] and 1i<jn1\leq i<j\leq n, where cc^{\prime} is some positive constant. To affirm the Lindeberg condition, we proceed by calculating the sum of Var(Fl,ij){\rm Var}(F_{l,ij}) for all 1i<jn1\leq i<j\leq n. Given Assumption 3 and the premise of balanced community sizes, it follows that Var((UTGlU)s,t)=ω(1/n2){\rm Var}\left((U^{T}G_{l}U)_{s,t}\right)=\omega(1/n^{2}). Therefore, for any ε>0\varepsilon>0 and sufficiently large nn, we can state that |Fl,ij|<ε[Var((UTGlU)s,t)]1/2|F_{l,ij}|<\varepsilon\left[{\rm Var}\left((U^{T}G_{l}U)_{s,t}\right)\right]^{1/2}. This leads to the Lindeberg condition as given by

limn1Var((UTGlU)s,t)i=1n1j=i+1n𝔼(|Fl,ij|2𝕀{|Fl,ij|>ε[Var((UTGlU)s,t)]1/2})=0\lim\limits_{n\rightarrow\infty}\frac{1}{{\rm Var}\left((U^{T}G_{l}U)_{s,t}\right)}\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}\mathbb{E}\left(|F_{l,ij}|^{2}\mathbb{I}_{\{|F_{l,ij}|>\varepsilon[{\rm Var}((U^{T}G_{l}U)_{s,t})]^{1/2}\}}\right)=0 (A4)

is satisfied. It should be emphasised that here UU, GlG_{l} and Fl,ijF_{l,ij} are all related to nn. For simplicity, we have not notated the nn display. By the Lindeberg-Feller Central Limit Theorem, we have

(Σ2s+t(t1)2,2s+t(t1)2l)1/2(UTGlU)s,td𝒩(0,1)\left(\Sigma_{\frac{2s+t(t-1)}{2},\frac{2s+t(t-1)}{2}}^{l}\right)^{-1/2}(U^{T}G_{l}U)_{s,t}\rightarrow_{d}\mathcal{N}(0,1)

for each l[L]l\in[L]. The claim follows by combining the above discussion with (A2).

Part (b): To complete the central limit theorem for vectorlized UTGlUU^{T}G_{l}U, we first show that this term can be expressed as a summation of independent zero-mean random variables. Let Σ~l:=(Σl)1/2\widetilde{\Sigma}^{l}:=(\Sigma^{l})^{-1/2} and consider an arbitrary vector xK(K+1)/2x\in\mathbb{R}^{K(K+1)/2}.

xT(Σl)1/2vec(UTGlU)\displaystyle x^{T}(\Sigma^{l})^{-1/2}{\rm vec}(U^{T}G_{l}U) (A5)
=\displaystyle= 1stKx2s+t(t1)21stKΣ~2s+t(t1)2,2s+t(t1)2li=1n1j=i+1n(UisUjt+UjsUit)Gl,ij\displaystyle\sum_{1\leq s\leq t\leq K}x_{\frac{2s+t(t-1)}{2}}\sum_{1\leq s^{\prime}\leq t^{\prime}\leq K}\tilde{\Sigma}_{\frac{2s+t(t-1)}{2},\frac{2s^{\prime}+t^{\prime}(t^{\prime}-1)}{2}}^{l}\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}(U_{is^{\prime}}U_{jt^{\prime}}+U_{js^{\prime}}U_{it^{\prime}})\;G_{l,ij}
=\displaystyle= i=1n1j=i+1n{Gl,ij1stKx2s+t(t1)21stKΣ~2s+t(t1)2,2s+t(t1)2l(UisUjt+UjsUit)}\displaystyle\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}\left\{G_{l,ij}\sum_{1\leq s\leq t\leq K}x_{\frac{2s+t(t-1)}{2}}\sum_{1\leq s^{\prime}\leq t^{\prime}\leq K}\tilde{\Sigma}_{\frac{2s+t(t-1)}{2},\frac{2s^{\prime}+t^{\prime}(t^{\prime}-1)}{2}}^{l}(U_{is^{\prime}}U_{jt^{\prime}}+U_{js^{\prime}}U_{it^{\prime}})\right\}
:=\displaystyle:= i=1n1j=i+1nF¯l,ij,\displaystyle\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}\bar{F}_{l,ij},

where we let F¯l,ij\bar{F}_{l,ij} denote the portion enclosed in braces in the second equality, and the first equality follows from (A3). Given ll, it is obvious that the F¯l,ij\bar{F}_{l,ij} terms are zero-mean and independent of each other for all 1i<jn1\leq i<j\leq n. In accordance with Assumption 4, we have

|F¯l,ij|\displaystyle|\bar{F}_{l,ij}| \displaystyle\leq cn1stKx2s+t(t1)21stK|Σ~2s+t(t1)2,2s+t(t1)2l|\displaystyle\frac{c^{\prime}}{n}\sum_{1\leq s\leq t\leq K}x_{\frac{2s+t(t-1)}{2}}\sum_{1\leq s^{\prime}\leq t^{\prime}\leq K}\left|\tilde{\Sigma}_{\frac{2s+t(t-1)}{2},\frac{2s^{\prime}+t^{\prime}(t^{\prime}-1)}{2}}^{l}\right|
\displaystyle\leq cnK(K+1)2x2{1stK1stK(Σ~2s+t(t1)2,2s+t(t1)2l)2}1/2\displaystyle\frac{c^{\prime}}{n}\frac{K(K+1)}{2}\|x\|_{2}\left\{\sum_{\begin{subarray}{c}1\leq s\leq t\leq K\\[3.0pt] 1\leq s^{\prime}\leq t^{\prime}\leq K\end{subarray}}\left(\tilde{\Sigma}_{\frac{2s+t(t-1)}{2},\frac{2s^{\prime}+t^{\prime}(t^{\prime}-1)}{2}}^{l}\right)^{2}\right\}^{1/2}
\displaystyle\leq cnK3x2λmax(Σ~l)\displaystyle\frac{c^{\prime}}{n}K^{3}\|x\|_{2}\lambda_{\max}(\widetilde{\Sigma}^{l})
\displaystyle\leq cK3x2nλmin(Σl)=o(x22)\displaystyle\frac{c^{\prime}K^{3}\|x\|_{2}}{n\sqrt{\lambda_{\min}(\Sigma^{l})}}=o(\|x_{2}\|_{2})

for all l[L]l\in[L] and 1i<jn1\leq i<j\leq n. To affirm the Lindeberg condition, we proceed by calculating the sum of Var(F¯l,ij){\rm Var}(\bar{F}_{l,ij}) for all 1i<jn1\leq i<j\leq n. By a simple calculation, we have Var(xT(Σl)1/2vec(UTGlU))=x22{\rm Var}\left(x^{T}(\Sigma^{l})^{-1/2}{\rm vec}(U^{T}G_{l}U)\right)=\|x\|_{2}^{2}. Therefore, for any ε>0\varepsilon>0 and sufficiently large nn, it holds true that |F¯l,ij|<ε[Var(xT(Σl)1/2vec(UTGlU))]1/2|\bar{F}_{l,ij}|<\varepsilon\left[{\rm Var}\left(x^{T}(\Sigma^{l})^{-1/2}{\rm vec}(U^{T}G_{l}U)\right)\right]^{1/2}. This subsequently satisfies the Lindeberg condition as given by

limn1x22i=1n1j=i+1n𝔼(|F¯l,ij|2𝕀{|F¯l,ij|>ε[Var(xT(Σl)1/2vec(UTGlU))]1/2})=0.\lim\limits_{n\rightarrow\infty}\frac{1}{\|x\|_{2}^{2}}\sum_{i=1}^{n-1}\sum_{j=i+1}^{n}\mathbb{E}\left(|\bar{F}_{l,ij}|^{2}\mathbb{I}_{\{|\bar{F}_{l,ij}|>\varepsilon[{\rm Var}(x^{T}(\Sigma^{l})^{-1/2}{\rm vec}(U^{T}G_{l}U))]^{1/2}\}}\right)=0.

We should also emphasize that UU, GlG_{l}, and F¯l,ij\bar{F}_{l,ij} are all dependent on nn. For the sake of simplicity, we have not explicitly displayed the dependence on nn in our notation. By the Lindeberg-Feller Central Limit Theorem and the Cramér-Wold theorem (Thoerem 3.9.5 of Durrett (2010)), we have

(Σl)1/2vec(UTGlU)d𝒩(𝟎,𝑰)\left(\Sigma^{l}\right)^{-1/2}{\rm vec}\left(U^{T}G_{l}U\right)\rightarrow_{d}\mathcal{N}(\bm{0},\bm{I})

for each l[L]l\in[L]. Here, 𝒩(𝟎,𝑰)\mathcal{N}(\bm{0},\bm{I}) is a K(K+1)/2K(K+1)/2 dimensional standard normal distribution. The claim follows by combining the above discussion with (A2). \hfill\qed

Proof of Theorem 2

Consistent with the proof of Theorem 1, a random matrix ElE_{l} exists such that A2 holds. According to Theorem A2, the bias term ElE_{l} satisfies

ZElZTF\displaystyle\|ZE_{l}Z^{T}\|_{F} \displaystyle\leq c(Ql2+Ql1,)(n2ψ28+n3log(L+n)Lρψ28)\displaystyle c(\|Q_{l}\|_{2}+\sqrt{\|Q_{l}\|_{1,\infty}})\left(\frac{n^{2}}{\|\psi\|_{2}^{8}}+\frac{n^{3}\log(L+n)}{L\rho\|\psi\|_{2}^{8}}\right)
+cQl1,(nψ24+n3/2log1/2(L+n)L1/2ρ1/2ψ24)+Kρ\displaystyle\leavevmode\nobreak\ +c\sqrt{\|Q_{l}\|_{1,\infty}}\left(\frac{n}{\|\psi\|_{2}^{4}}+\frac{n^{3/2}\log^{1/2}(L+n)}{L^{1/2}\rho^{1/2}\|\psi\|_{2}^{4}}\right)+\sqrt{K}\rho

with high probability. Furthermore, given that every element of QlQ_{l} is at most ρ\rho, neither Ql2\|Q_{l}\|_{2} nor Ql1,\|Q_{l}\|_{1,\infty} exceeds nρn\rho. Thus, we can deduce that ZElZTFc(n2log1/2(L+n)L1/2ψ24+max{n3/2ρ1/2ψ24,ρ})\|ZE_{l}Z^{T}\|_{F}\leq c\left(\frac{n^{2}\log^{1/2}(L+n)}{L^{1/2}\|\psi\|_{2}^{4}}+\max\{\frac{n^{3/2}\rho^{1/2}}{\|\psi\|_{2}^{4}},\rho\}\right). Under certain setting for ψ\psi, ZElZTZE_{l}Z^{T} tends to vanish as the number of layers LL increase, and as the overall edge density ρ\rho decrease.

To demonstrate the central limit theorem, it suffices to ascertain the asymptotic normality of UTGlUU^{T}G_{l}U. For part (a), recall that the decomposition of (UGlUT)s,t(UG_{l}U^{T})_{s,t} in A3 for all 1stK1\leq s\leq t\leq K. By the definition of UU, that is Θ~\tilde{\Theta} here, and according to Assumption E1, we have Fl,ijc/ψ22F_{l,ij}\leq c/\|\psi\|_{2}^{2} for all l[L]l\in[L] and 1ijn1\leq i\leq j\leq n. Combing with Assumption E3, the Lindeberg condition in (A4) holds. For part (b), recall that the decomposition of xT(Σl)1/2vec(UTGlU)x^{T}(\Sigma^{l})^{-1/2}{\rm vec}(U^{T}G_{l}U) and the definition of F¯l,ij\bar{F}_{l,ij} in (A5). Under Assumption E4, it follows that |F¯l,ij|cx2ψ22λmin(Σl)=o(x2)|\bar{F}_{l,ij}|\leq\frac{c\|x\|_{2}}{\|\psi\|_{2}^{2}\sqrt{\lambda_{\min}(\Sigma^{l})}}=o(\|x\|_{2}). The rest proof is similar to that of Theorem 1, we omit it here. \hfill\qed

A.3 Auxiliary lemmas

Lemma A3 (Theorem 5.2 of Lei and Rinaldo (2015)).

Let AA be the adjacency matrix of a random graph on nn nodes in which edges occur independently. Set 𝔼(A)=P=(pij)\mathbb{E}(A)=P=(p_{ij}) for i,j=1,,n{i,j=1,\ldots,n}, and assume that nmaxi,jpij<dn\max_{i,j}p_{ij}<d for d>clognd>c\log n and c>0c>0. Then, for any r>0r>0 there exists a constant C=C(r,c)C=C(r,c) such that

AP2Cd\|A-P\|_{2}\leq C\sqrt{d}

with probability at least 1nr1-n^{-r}.

A.4 Additional results for data analysis

Table A3: Categories of cereals as defined by the FAO, with ‘n.e.c.’ representing ‘not elsewhere classified’. Cereal numbers correspond to layers in the WFAT network depicted in Figure 4(b).
1.Malt, whether or not roasted;  2.Rice;  3.Canary seed;  4.Buckwheat;  5.Germ of
maize;  6.Triticale;  7.Cereals n.e.c.;  8.Millet;  9.Sorghum;  10.Bran of maize;  11.Bran
of cereals n.e.c.;  12.Flour of mixed grain;  13.Flour of rice;  14.Rye;  15.Malt extract;
16.Rice, milled (husked);  17.Oats;  18.Cereal preparations;  19.Oats, rolled;  20.Bran of
wheat;  21.Flour of cereals n.e.c.;  22.Rice, broken;  23.Flour of maize;  24.Gluten feed
and meal;  25.Bread;  26.Husked rice;  27.Wheat and meslin flour;  28.Communion
wafers, empty cachets of a kind suitable for pharmaceutical use, sealing wafers, rice
paper and similar products;  29.Barley;  30.Breakfast cereals;  31.Uncooked pasta, not
stuffed or otherwise prepared;  32.Maize (corn);  33.Wheat;  34.Mixes and doughs for
the preparation of bakers’ wares;  35.Food preparations of flour, meal or malt extract;
36.Rice, milled;  37.Pastry
Refer to caption
Figure A5: Scree plot of the top eigenvalues of l=1LAl2/n\sum_{l=1}^{L}A_{l}^{2}/n of the WFAT data graph. The plot show the 50 largest eigenvalues ordered by magnitude.

This section presents the additional experimental results that are not shown in the main text. In Section 6, for the WFAT dataset, we initially chose the number of communities as K=5K=5 in the multi-layer SBM analysis, corresponding naturally to the geographical division into five continents. Here, we adopt an alternative approach to determine KK. By observing an elbow in the scree plot of the top absolute eigenvalues of the sum of squared adjacency matrices at the 4th position, as shown in Figure A5, we select K=4K=4. Figure A6 presents the outcomes of the multiple comparisons using a Holm type step-down procedure, where olive green indicates the acceptance and white denotes the rejection of the individual hypotheses. We control the Holm procedure to ensure that the family-wise error rate is no greater than α=0.05\alpha=0.05. In this setup, layer ll corresponds to the cereal numbered ll in Table A3. Similar to Figure 4(a), the outcomes of this multiple comparison procedure also exhibit several distinct blocks where all individual hypotheses are simultaneously accepted within each block. Moreover, the results are largely consistent with those observed when K=5K=5. This strongly suggests that some global cereal trade patterns are similar and also demonstrates the stability of our method.

Refer to caption
Figure A6: Results of the multiple comparisons conducted using a Holm type step-down procedure for the WFAT data, with the number of communities set to K=4K=4. Olive green indicates the acceptance of the individual hypotheses. Each layer ll corresponding to the cereal numbered ll in Table A3.

References

  • Agterberg and Cape (2023) Agterberg, J. and J. Cape (2023). An overview of asymptotic normality in stochastic blockmodels: Cluster analysis and inference. arXiv preprint arXiv:2305.06353.
  • Agterberg et al. (2022) Agterberg, J., Z. Lubberts, and J. Arroyo (2022). Joint spectral clustering in multilayer degree-corrected stochastic blockmodels. arXiv preprint arXiv:2212.05053.
  • Airoldi et al. (2008) Airoldi, E. M., D. Blei, S. Fienberg, and E. Xing (2008). Mixed membership stochastic blockmodels. Journal of Machine Learning Research 9(65), 1981–2014.
  • Arroyo et al. (2021) Arroyo, J., A. Athreya, J. Cape, G. Chen, C. E. Priebe, and J. T. Vogelstein (2021). Inference for multiple heterogeneous networks with a common invariant subspace. Journal of Machine Learning Research 22(142), 1–49.
  • Bhatia (1997) Bhatia, R. (1997). Matrix analysis. Springer-Verlag, New York.
  • Bhattacharyya and Chatterjee (2018) Bhattacharyya, S. and S. Chatterjee (2018). Spectral clustering for multiple sparse networks: I. arXiv preprint arXiv:1805.10594.
  • Bickel et al. (2013) Bickel, P., D. Choi, X. Chang, and H. Zhang (2013). Asymptotic normality of maximum likelihood and its variational approximation for stochastic blockmodels1. The Annals of Statistics 41(4), 1922–1943.
  • De Domenico et al. (2015) De Domenico, M., V. Nicosia, A. Arenas, and V. Latora (2015). Structural reducibility of multilayer networks. Nature Communications 6, 6864.
  • Dudoit et al. (2008) Dudoit, S., M. J. Van Der Laan, and M. J. van der Laan (2008). Multiple testing procedures with applications to genomics. Springer, New York.
  • Durrett (2010) Durrett, R. (2010). Probability: theory and examples. Cambridge University Press, New York.
  • Fan et al. (2022a) Fan, J., Y. Fan, X. Han, and J. Lv (2022a). Asymptotic theory of eigenvectors for random matrices with diverging spikes. Journal of the American Statistical Association 117(538), 996–1009.
  • Fan et al. (2022b) Fan, J., Y. Fan, X. Han, and J. Lv (2022b). Simple: Statistical inference on membership profiles in large networks. Journal of the Royal Statistical Society Series B: Statistical Methodology 84(2), 630–653.
  • Fan et al. (2019) Fan, J., D. Wang, K. Wang, and Z. Zhu (2019). Distributed estimation of principal eigenspaces. The Annals of Statistics 47(6), 3009–3031.
  • Han et al. (2015) Han, Q., K. Xu, and E. Airoldi (2015). Consistent estimation of dynamic and multi-layer block models. In International Conference on Machine Learning, pp. 1511–1520. PMLR.
  • He et al. (2019) He, Y., A. Knowles, and M. Marcozzi (2019). Local law and complete eigenvector delocalization for supercritical Erdős–Rényi graphs. The Annals of Probability 47(5), 3278 – 3302.
  • Holland et al. (1983) Holland, P. W., K. B. Laskey, and S. Leinhardt (1983). Stochastic blockmodels: First steps. Social Networks 5(2), 109–137.
  • Holme and Saramäki (2012) Holme, P. and J. Saramäki (2012). Temporal networks. Physics Reports 519(3), 97–125.
  • Jin et al. (2023) Jin, J., Z. T. Ke, S. Luo, and M. Wang (2023). Optimal estimation of the number of network communities. Journal of the American Statistical Association 118(543), 2101–2116.
  • Jing et al. (2021) Jing, B.-Y., T. Li, Z. Lyu, and D. Xia (2021). Community detection on mixture multilayer networks via regularized tensor decomposition. The Annals of Statistics 49(6), 3181–3205.
  • Karrer and Newman (2011) Karrer, B. and M. E. Newman (2011). Stochastic blockmodels and community structure in networks. Physical Review E 83(1), 016107.
  • Kivelä et al. (2014) Kivelä, M., A. Arenas, M. Barthelemy, J. P. Gleeson, Y. Moreno, and M. A. Porter (2014). Multilayer networks. Journal of Complex Networks 2(3), 203–271.
  • Lehmann and Romano (2005) Lehmann, E. L. and J. P. Romano (2005). Testing statistical hypotheses. Springer, New York.
  • Lei and Lin (2023) Lei, J. and K. Z. Lin (2023). Bias-adjusted spectral clustering in multi-layer stochastic block models. Journal of the American Statistical Association 118(544), 2433–2445.
  • Lei and Rinaldo (2015) Lei, J. and A. Rinaldo (2015). Consistency of spectral clustering in stochastic block models. The Annals of Statistics 43(1), 215–237.
  • Mucha et al. (2010) Mucha, P. J., T. Richardson, K. Macon, M. A. Porter, and J.-P. Onnela (2010). Community structure in time-dependent, multiscale, and multiplex networks. Science 328(5980), 876–878.
  • Noble (2009) Noble, W. S. (2009). How does multiple testing correction work? Nature Biotechnology 27(12), 1135–1137.
  • Noroozi and Pensky (2024) Noroozi, M. and M. Pensky (2024). Sparse subspace clustering in diverse multiplex network model. Journal of Multivariate Analysis 203, 105333.
  • Paul and Chen (2016) Paul, S. and Y. Chen (2016). Consistent community detection in multi-relational data through restricted multi-layer stochastic blockmodel. Electronic Journal of Statistics 10(2), 3807–3870.
  • Paul and Chen (2020) Paul, S. and Y. Chen (2020). Spectral and matrix factorization methods for consistent community detection in multi-layer networks. The Annals of Statistics 48(1), 230–250.
  • Rubin-Delanchy et al. (2022) Rubin-Delanchy, P., J. Cape, M. Tang, and C. E. Priebe (2022). A statistical interpretation of spectral embedding: The generalised random dot product graph. Journal of the Royal Statistical Society Series B: Statistical Methodology 84(4), 1446–1473.
  • Rudelson and Vershynin (2015) Rudelson, M. and R. Vershynin (2015). Delocalization of eigenvectors of random matrices with independent entries. Duke Mathematical Journal 164(13), 2507–2538.
  • Su et al. (2020) Su, L., W. Wang, and Y. Zhang (2020). Strong consistency of spectral clustering for stochastic block models. IEEE Transactions on Information Theory 66(1), 324–338.
  • Su et al. (2024) Su, W., X. Guo, X. Chang, and Y. Yang (2024). Spectral co-clustering in multi-layer directed networks. Computational Statistics & Data Analysis 198, 107987.
  • Tang et al. (2022) Tang, M., J. Cape, and C. E. Priebe (2022). Asymptotically efficient estimators for stochastic blockmodels: The naive mle, the rank-constrained mle, and the spectral estimator. Bernoulli 28(2), 1049–1073.
  • Tang and Priebe (2018) Tang, M. and C. E. Priebe (2018). Limit theorems for eigenvectors of the normalized laplacian for random graphs. The Annals of Statistics 46(5), 2360–2415.
  • Valles-Catala et al. (2016) Valles-Catala, T., F. A. Massucci, R. Guimera, and M. Sales-Pardo (2016). Multilayer stochastic block models reveal the multilayer structure of complex networks. Physical Review X 6(1), 011036.
  • Vu and Lei (2013) Vu, V. Q. and J. Lei (2013). Minimax sparse principal subspace estimation in high dimensions. The Annals of Statistics 41(6), 2905–2947.
  • Young and Scheinerman (2007) Young, S. J. and E. R. Scheinerman (2007). Random dot product graph models for social networks. In International Workshop on Algorithms and Models for the Web-Graph, pp.  138–149. Springer.
  • Zhang et al. (2022) Zhang, H., X. Guo, and X. Chang (2022). Randomized spectral clustering in large-scale stochastic block models. Journal of Computational and Graphical Statistics 31(3), 887–906.
  • Zheng and Tang (2022) Zheng, R. and M. Tang (2022). Limit results for distributed estimation of invariant subspaces in multiple networks inference and pca. arXiv preprint arXiv:2206.04306.