Topology Identification under Spatially Correlated Noise
Abstract
This article addresses the problem of reconstructing the topology of a network of agents interacting via linear dynamics, while being excited by exogenous stochastic sources that are possibly correlated across the agents, from time-series measurements alone. It is shown that, under the assumption that the correlations are affine in nature, such a network of nodal interactions is equivalent to a network with added agents. The added agents are represented by nodes that are latent, where no corresponding time-series measurements are available; however, here all the exogenous excitations are spatially (that is, across agents) uncorrelated. Generalizing affine correlations, it is shown that, under polynomial correlations, the latent nodes in the expanded networks can be excited by clusters of noise sources, where the clusters are uncorrelated with each other. The clusters can be replaced with a single noise source if the latent nodes are allowed to have non-linear interactions. Finally, using the sparse plus low-rank matrix decomposition of the imaginary part of the inverse power spectral density matrix (IPSDM) of the time-series data, the topology of the network is reconstructed. Under non-conservative assumptions, the correlation graph of the noise sources is retrieved.
keywords:
Linear dynamical systems, time-series analysis, probabilistic graphical models, network topology identification, power spectral density, sparse estimation, latent nodes, structure learning, learning and control, sensor placement
1 Introduction
Networks and graphical models provide convenient tools for effective representations of complex high dimensional multi-agent systems. Such a representation is useful in applications including power grids [32, 47], meteorology [19], neuroscience [4], and finance [5]. Knowledge of the network interaction structure (also known as the network topology) helps in understanding, predicting, and in many applications, controlling/identifying the system behavior [16, 24, 28, 35, 39, 45]. In applications such as power grids, finance, and meteorological systems, it is either difficult or impossible to intervene and affect the system. Hence, inferring the network properties by passive means, such as time-series measurements, is of great interest in these applications. An example is unveiling the correlation structure between the stocks in a stock market from daily share prices [5], which is useful in predicting the market behavior.
Learning the conditional independence relation between variables from time-series measurements is an active research field among machine learning (ML), probabilistic graphical model (PGM), and statistics communities [6, 8, 29, 30, 33, 38], where the system modules are considered as random variables. However, such studies fail to capture dynamic dependencies between the entities in a system, which are prevalent in most of the aforementioned applications. For dynamical systems, autoregressive (AR) models that are excited by exogenous Gaussian noise sources, which are independent across time and variables, are explored in [1, 2, 3, 40]. Here, the graph topology captured the sparsity pattern of the regressor coefficient matrices, which characterized the conditional independence between the variables. It was shown that the sparsity pattern of the inverse power spectral density matrix (IPSDM), also known as concentration matrix, identifies the conditional independence relation between the variables (see [3, 40]). As shown in [27], the conditional independence structure between the variables is equivalent to the moral-graph structure of the underlying directed graph. For multi-agent systems with linear dynamical interaction, excited by wide sense stationary (WSS) noise sources that are mutually uncorrelated across the agents, moral graph reconstruction using Wiener projection has gained popularity in the last decade [21, 25, 27]. Here, the moral-graph of the underlying linear dynamic influence model (LDIM) is recovered from the magnitude response of Wiener coefficients, obtained from time-series measurements. For a wide range of applications, the spurious connections in the moral-graph can be identified–returning the true topology–by observing the phase response of the Wiener coefficients [43]. Furthermore, for systems with strictly causal dynamical dependencies, Granger causality based algorithms can unveil the exact cause-effect nature of the interactions; thus recovering the exact parent-child relationship of the underlying graph [14, 15, 27].
In many networks, time-series measurements are available only at a subset of nodes. The nodes where the time-series measurements are not available form the latent/confounding/hidden nodes. Topology reconstruction is more challenging in the presence of latent nodes, since additional spurious connections due to confounding effects are formed when applying the aforementioned techniques. AR model identification of the independent Gaussian time-series discussed above, in the presence of latent nodes, was studied in [9, 10, 11, 13, 17, 22, 48, 49]. Here, the primary goal is to eliminate the spurious connections due to latent nodes and retrieve the original conditional independence structure from the observed time-series. In applications such as the power grid, there is a need to retrieve the complete topology of the network, including that of the latent nodes. Such problems are studied for bidirected tree [42], poly-forest [36], and poly-tree [26, 37] networks excited by WSS noise sources that are uncorrelated across the agents. Recently, an approach to reconstruct the complete topology of a general linear dynamical network with WSS noise was provided in [46]. The work in [46] can be considered as a generalization of AR model identification with latent nodes, in the asymptotic time-series regime. However, one major caveat of the aforementioned literature on graphical models is that the results fail if the exogenous noise sources are spatially correlated, i.e., if the noise is correlated across the agents/variables. Prior works have studied systems with spatially correlated noise sources [18, 28, 35]; however, these studies assume knowledge of the network topology. In a related work, [34] studied topology estimation under spatially correlated noise sources and used this estimated topology in local module identification.
This article studies the problem of topology identification of the LDIMs that allow spatially correlated noise sources, similar to the problem in [34]. However, this article provides an alternate treatment, where the noise correlations are transformed to latent nodes. This transformation enables one to gain additional insights and apply techniques from topology identification with latent nodes to solve the problem.
The first major result of this article is a transformation that converts an LDIM without latent nodes, but excited by spatially correlated exogenous noise sources, to LDIMs with latent nodes. Here, the latent nodes are characterized by the maximal cliques in the correlation graph, the undirected graph that represents the spatial correlation structure. It is shown that, under the affine correlation assumption, that is, when the correlated noise sources are related in an affine way (Assumption 1), there exist transformed LDIMs with latent nodes, where all the nodes are excited by spatially uncorrelated noise sources. A key feature of the transformation is that the correlations are completely captured using the latent nodes, while the original topology remains unaltered. The original moral-graph/topology is the same as the moral-graph/topology among the observed nodes in the altered graph. Thus, the transformed problem is shown to be equivalent to topology identification of networks with latent nodes. Consequently, any of the aforementioned techniques for networks with latent nodes can be applied to the transformed LDIM to reconstruct the original moral-graph/topology.
Next, relaxing the affine correlation assumption, polynomial correlation is considered (Assumption 2); here, the focus is on noise sources with distributions that are symmetric around the mean. It is shown that, in this scenario, the transformed dynamical model can be excited by clusters of noise sources, where the clusters are uncorrelated. However, the noise sources inside the clusters can be correlated. Using the sparse plus low-rank decomposition of the IPSDM from [46], the true topology of the network along with the correlation structure is reconstructed, from the IPSDM of the original LDIM, without any additional information, if the network satisfies a sufficient condition. Notice that the results discussed here also apply to networks of static random variables and to AR models, when the exogenous noise sources are spatially correlated.
The article is organized as follows. Section 2 introduces the system model, including the LDIM, and the essential definitions. Section 3 discusses IPSDM based topology reconstruction. Section 4 describes the correlation graph to latent nodes transformation and some major results. Section 5 discusses the transformation of LDIMs with polynomial correlation to LDIMs with latent nodes. Section 6 explains the sparse plus low-rank decomposition technique and how the topology can be reconstructed without the knowledge of the correlation graph. Simulation results are provided in Section 7. Finally, Section 8 concludes the article.
Notations: Bold capital letters, , denote matrices, and represents the entry of ; bold small letters, , denote vectors, and indicates the entry of ; . The subscript denotes the observed nodes index set and the subscript indicates the latent nodes index set ; for example, and . denotes the power spectral density matrix, where is the cross power spectral density between the time-series at nodes and ; denotes the set of real rational single input single output transfer functions that are analytic on the unit circle; denotes the set of rationally related zero mean jointly wide sense stationary (JWSS) scalar stochastic processes; denotes the bilateral -transform; denotes the space of all skew-symmetric matrices; denotes the set of natural numbers, . For a set , denotes the cardinality of the set. denotes the nuclear norm and denotes the sum of absolute values of the entries of . denotes the phase of the complex number .
2 System model
Consider a network of interconnected nodes, where node is equipped with time-series measurements, , . The network interaction is described by,
x(z) = H(z) x(z) + e(z),     (1)
where , . are the exogenous noise sources with is non-singular, but possibly non-diagonal. is the weighted adjacency matrix with , for all . An LDIM is defined as the pair , whose output process is given by (1). The LDIM is well-posed if every entry of is analytic on the unit circle, and topologically detectable if is positive definite.
Every LDIM has two associated graphs, viz: 1) a directed graph, the linear dynamic influence graph (LDIG), , where and and 2) an undirected graph, the correlation graph, , where . Notice that, in a directed graph, if then there is a directed arrow from to in the graphical representation of (see Fig. 1(a) for example). Concisely, “LDIG ” is used to denote the LDIG corresponding to the LDIM . For a directed graph , the parent, the children, and the spouse sets of node are , , and respectively. The topology of a directed graph , denoted , is the undirected graph obtained by removing the directions from every edge . Moral-graph (also called kin-graph) of , denoted is an undirected graph, , where . A clique is a sub-graph of a given undirected graph where every pair of nodes in the sub-graph are adjacent. A maximal clique is a clique that is not a subset of a larger clique. For a correlation graph, , denotes the number of maximal cliques with clique size .
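As a concrete illustration of the kin relation used throughout the article, the following Python sketch builds the moral-graph (kin-graph) of a directed graph from its adjacency matrix by connecting every parent-child pair and every pair of spouses (co-parents). The adjacency convention A[i, j] = 1 for an edge from node j to node i, and the toy example, are assumptions of this sketch rather than notation from the article.

```python
import numpy as np

def moral_graph(A):
    """Undirected moral-graph (kin-graph) adjacency of a directed graph.

    Assumed convention: A[i, j] = 1 encodes a directed edge j -> i, so the
    nonzero entries of row i are the parents of node i. The moral graph joins
    every parent-child pair and every pair of co-parents (spouses)."""
    n = A.shape[0]
    M = np.zeros((n, n), dtype=int)
    for i in range(n):
        parents = np.flatnonzero(A[i])
        for p in parents:                # parent-child edges
            M[i, p] = M[p, i] = 1
        for p in parents:                # spouse edges between co-parents of i
            for q in parents:
                if p != q:
                    M[p, q] = M[q, p] = 1
    np.fill_diagonal(M, 0)
    return M

# toy example: 0 -> 1 <- 2; nodes 0 and 2 become spouses in the moral graph
A = np.zeros((3, 3), dtype=int)
A[1, 0] = 1    # edge 0 -> 1
A[1, 2] = 1    # edge 2 -> 1
print(moral_graph(A))
```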
The problem we address is described as follows.
Problem 1.
(P1) Consider a well-posed and topologically detectable LDIM, , where is allowed to be non-diagonal and its associated graphs and are unknown. Given the power spectral density matrix , where is given by (1), reconstruct the topology of .
3 IPSDM based topology reconstruction
In this section, the IPSDM based topology reconstruction is presented. In [27], the authors showed that, for any LDIM characterized by , the availability of IPSDM, which can be written as
Φ_x^{-1}(z) = (I - H(z))^* Φ_e^{-1}(z) (I - H(z)),     (2)
is sufficient for reconstructing the moral-graph of . That is, if , then and are kins. However, an important assumption for the result to hold true is that is diagonal. If is non-diagonal, then the result does not hold in general. For ,
and any of the four terms can cause , depending on . Hence, this technique cannot be applied directly to solve Problem 1. For example, consider the network given in Fig. 1(a) with . Then, since , which implies that the estimated topology has present, while .
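To make the failure mode concrete, the following numerical sketch evaluates the IPSDM at a single frequency, assuming the standard form Φ_x^{-1} = (I - H)^* Φ_e^{-1} (I - H) for (2); the three-node chain, the transfer function values, and the correlation value are all illustrative assumptions.

```python
import numpy as np

# One-frequency evaluation of the assumed relation (2) for a chain 0 -> 1 -> 2,
# with and without spatial correlation between the noise sources e_0 and e_2.
z = np.exp(1j * 0.7)                       # a point on the unit circle
H = np.zeros((3, 3), dtype=complex)
H[1, 0] = 0.5 / z                          # edge 0 -> 1
H[2, 1] = 0.4 / z                          # edge 1 -> 2; nodes 0 and 2 are NOT kins

I = np.eye(3)
Phi_e_diag = np.eye(3)                     # spatially uncorrelated noise
Phi_e_corr = Phi_e_diag.copy()
Phi_e_corr[0, 2] = Phi_e_corr[2, 0] = 0.8  # correlation between e_0 and e_2

for label, Phi_e in [("diagonal", Phi_e_diag), ("correlated", Phi_e_corr)]:
    J = (I - H).conj().T @ np.linalg.inv(Phi_e) @ (I - H)   # IPSDM at this frequency
    print(label, "| (0,2) entry of IPSDM:", np.round(abs(J[0, 2]), 3))
# diagonal   -> 0.0: the support matches the moral graph
# correlated -> nonzero: a spurious kin detection between nodes 0 and 2
```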
In the next section, the correlation graph, , is scrutinized and properties of are evaluated as the first step in unveiling the true topology when noise sources admit correlations.
4 Spatial Correlation to Latent Node Transformation
In this section, a transformation of the LDIM, , to an LDIM with latent nodes by exploiting the structural properties of the noise correlation is obtained. The transformation converts an LDIM without latent nodes and driven by spatially correlated exogenous noise sources to an LDIM with latent nodes that are excited by spatially uncorrelated exogenous noise sources. Assuming perfect knowledge of the noise correlation structure, the latent nodes and their children in the transformed LDIM are characterized. It is shown that although the transformation is not unique, the topology of the transformation is unique under affine correlation (Assumption 1).
For and , , (or with a slight abuse of notation, ) denotes an LDIM with latent nodes, where denotes the directed edge weight from latent node to observed node and denotes the directed edge weight from observed node to observed node . Notice that the latent nodes considered in this article are strict parents and they do not have incoming edges.
4.1 Relation between Spatial Correlation and Latent Nodes
Here, it is demonstrated that the spatially correlated exogenous noise sources can be viewed as the children of a latent node that is a common parent of the correlated sources. The idea is explained in the motivating example below. Towards this, let us define the following notion of equivalent networks.
Definition 4.1.
Let and be two LDIMs and let and respectively. Then, the LDIMs and are said to be equivalent, denoted if and only if .
Consider the network of three nodes in Fig. 1(a). Suppose the noise sources and are correlated, that is, . Consider the network in Fig. 1(b), where , and are jointly uncorrelated and node is a latent node. Note that is the same in both LDIMs, while are different from . From the LDIM of Fig. 1(a), it follows that
(3)
From the LDIM of Fig. 1(b), it follows that
(4)
Comparing (3) and (4), it is evident that if , and are such that and , then the time-series obtained from both the LDIMs are identical. That is, the LDIMs are equivalent.
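A minimal numerical sketch of this equivalence is given below: a single latent source driving two observed noise channels reproduces the prescribed cross-correlation exactly, so feeding either noise description through the same network dynamics yields the same output statistics. The gains and variances are illustrative assumptions, and a static (covariance) check stands in for the WSS setting.

```python
import numpy as np

# Correlated noise e_2, e_3 written as private parts plus a common latent source w
# (node 4 in Fig. 1(b)); f2, f3 and the variances below are illustrative.
rng = np.random.default_rng(0)
N = 200_000

w   = rng.standard_normal(N)          # latent source
e1  = rng.standard_normal(N)
e2p = 0.6 * rng.standard_normal(N)    # private (spatially uncorrelated) parts
e3p = 0.7 * rng.standard_normal(N)
f2, f3 = 0.8, -0.5                    # edge weights from the latent node
e2 = e2p + f2 * w
e3 = e3p + f3 * w

E = np.vstack([e1, e2, e3])
print(np.round(np.cov(E), 2))
# the (e2, e3) entry is approximately f2*f3 = -0.4: the spatial correlation of the
# original LDIM is fully explained by the latent common parent.
```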
For a given LDIM, , with correlated exogenous noise sources, one can define a space of equivalent LDIMs with uncorrelated noise sources that provide the same time-series. The following definition captures this space of “LDIMs with latent nodes and uncorrelated noise sources” that are equivalent to the original LDIM with correlated noise sources.
Assumption 1.
In LDIM , defined by (1), the exogenous noise processes are correlated only via affine interactions, i.e., if and only if there exists an affine transform, , such that either or .
Definition 4.2.
For any LDIM, , with non-diagonal, and satisfying Assumption 1,
(5)
is the space of all -transformations for a given .
Remark 4.3.
In Definition 4.2, the number of latent nodes, , is not fixed a priori.
Based on the definition of in (5), LDIM for can be rewritten as
(6)
where are mutually uncorrelated and .
An LDIM, , is called a transformed LDIM of in this article. Intuitively, the -transformation completely assigns the spatial correlation component present in the original noise source, , to latent nodes, without altering the LDIG, , similar to the aforementioned motivating example. For the LDIM in Fig. 2(a) with the correlation graph in Fig. 2(b), Fig. 2(c) and Fig. 2(d) show some examples of the LDIGs that belong to . Thus, the -transformation returns a larger network with latent nodes. Notice that the latent nodes present in an LDIG of are strict parents, whose interactions with the true nodes are characterized by .
The following lemma formalizes the relation between the correlation graph and the latent nodes in the transformed higher dimensional LDIMs. In particular, it shows that two noise sources and are correlated if and only if there is a latent node which is a common parent of both the nodes and in the transformed LDIM.
Lemma 2.
Proof: Refer Appendix A ∎
Thus, and in the original LDIM are correlated if and only if for every -transformed LDIM there exists a such that and .
The following lemma shows that the number of latent nodes, , present in any is at least , the number of maximal cliques with clique size greater than one in .
Lemma 3.
Let be an LDIM and let be the correlation graph of the exogenous noise sources, . Then, the number, , of latent nodes present in any satisfies , where is the number of maximal cliques with clique size in .
Proof: Refer Appendix B. ∎
Remark 4.4.
Let be the correlation graph of the exogenous noise sources. Then, for any given maximal clique and for any , there exists a transformed LDIG with number of latent nodes such that the set is equal to the children of the latent nodes. For , Fig. 2 shows examples with (see proof of Lemma 3 for details).
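For a given correlation graph, the lower bound of Lemma 3 can be computed directly by counting maximal cliques. The following sketch uses networkx, an assumed tooling choice, and an illustrative edge list.

```python
import networkx as nx

def min_latent_nodes(corr_edges, n_nodes):
    """Lower bound from Lemma 3: the number of maximal cliques of size >= 2
    in the correlation graph (edges between correlated noise sources)."""
    G = nx.Graph()
    G.add_nodes_from(range(n_nodes))
    G.add_edges_from(corr_edges)
    return sum(1 for clique in nx.find_cliques(G) if len(clique) >= 2)

# example: sources 0, 1, 2 pairwise correlated (one triangle) and 3, 4 correlated
print(min_latent_nodes([(0, 1), (1, 2), (0, 2), (3, 4)], n_nodes=5))   # -> 2
```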
As shown in Lemma 3 and Fig. 2, the LDIMs in the characterizing space of the equivalent LDIMs can have an arbitrary number of latent nodes, which leads to multiple transformed representations with varying numbers of latent nodes. The following definition restricts the number of latent nodes present in the equivalent transformed LDIGs and provides a minimal transformed representation, that is, a representation with the minimal number of nodes, of the true LDIG.
Definition 4.5.
Define to be the set of all LDIMs in with the number of latent nodes equal to the number of maximal cliques with clique size present in .
The following theorem shows that there exists a unique latent node in every transformed LDIG , corresponding to each maximal clique.
Theorem 4.
Let be an LDIM that satisfies Assumption 1 and let be the correlation graph of the exogenous noise sources, . Suppose has maximal cliques . Consider any LDIG with latent nodes, . Then, for a maximal clique in , there exists a unique latent node such that .
Proof: Refer Appendix C. ∎
Remark 4.6.
Notice that the existence of the unique latent node is true only for the space . Such a unique node might not exist for the transformed representations in . In other words, Theorem 4 identifies the minimal set of latent nodes required to explain the data.
4.2 Uniqueness of the topology
Here, the LDIMs in from the previous section are studied more carefully. In topology reconstruction, only the support structure of the transfer functions matters. The following proposition shows that the support structure is unique.
Proposition 5.
Let be an LDIM that satisfies Assumption 1. The topology of every LDIG is the same.
Proof: Refer Appendix D. ∎
Based on the above results, the following transformation of Problem (P1) is formulated.
Problem 6.
Remark 4.7.
The time series data among the observed nodes of the transformed LDIM is the same as the time series obtained from the original LDIM.
Remark 4.8.
Therefore, instead of reconstructing the topology of with non-diagonal, it is sufficient to reconstruct the topology among the observed nodes for one of the LDIMs , which has diagonal.
5 Polynomial Correlation
The results in the previous section assumed that the correlations between the exogenous noise sources are affine in nature. In this section, a generalization of the affine correlation is addressed. It is shown that noise sources that are correlated via non-affine, but polynomial, interactions can be characterized using latent nodes with non-linear interaction dynamics. By lifting the processes to a higher dimension, the non-linearity is converted to linear interactions.
The following definitions are useful for the presentation in this section.
Definition 5.1.
[12] Let . A monomial in is a product of the form , where . A polynomial in with coefficient in a field (or a commutative ring), , is a finite linear combination of monomials, i.e.,
- (i) The degree of a monomial is .
- (ii) The total degree of is the maximum of such that .
Definition 5.2.
For any define the list of monomials of total degree at most ,
where lists all the -degree monomials, . For example, when , , , and . In general, the total number of monomials having degree is given by , where . The total number of monomials up to total degree is . For and , .
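The following Python sketch enumerates such monomial lists for small numbers of variables and degrees; the variable names are placeholders, and whether the constant (degree-zero) monomial is included is left open here, so it is omitted in the sketch.

```python
from itertools import combinations_with_replacement

def monomials_up_to_degree(var_names, L):
    """List all monomials in the given variables with total degree 1..L,
    in lexicographic order within each degree (the lifting of Definition 5.2,
    with the constant monomial omitted in this sketch)."""
    monos = []
    for d in range(1, L + 1):
        for combo in combinations_with_replacement(var_names, d):
            monos.append("*".join(combo))
    return monos

print(monomials_up_to_degree(["e1", "e2"], 3))
# ['e1', 'e2', 'e1*e1', 'e1*e2', 'e2*e2',
#  'e1*e1*e1', 'e1*e1*e2', 'e1*e2*e2', 'e2*e2*e2']
# counts per degree: 2, 3, 4 -> 9 monomials of total degree at most 3
```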
5.1 Characterization of Polynomial Correlations
Suppose and , where and , , with diagonal. Let be the vector obtained by concatenating , . Letting and , we have , where is the vector obtained by concatenating .
Notice that is lifting of into a higher dimensional space of polynomials. In the following, a discussion on the structure of is provided, with the help of examples.
5.2 Example: Lifting of a zero mean Gaussian Process
To illustrate the lifting of noise processes to a higher dimension, consider an independent and identically distributed (IID) Gaussian process (GP). It is shown that, under lifting, the power spectral density matrix (PSDM) is block diagonal.
Consider an IID GP . Then, . It can be shown that [31]
E[w^p] = (p - 1)!! σ^p for p even, and E[w^p] = 0 for p odd,     (7)
where denotes the double factorial. Consider and let . That is, lists all the monomials of with degree .
Notice that if and only if both and are even. Then, . Straightforward computation shows that only for . Similarly, if and only if , and if and only if . Notice that corresponds to the terms with odd power on and even power on . Repeating the same for every , , one can show that, after appropriate rearrangement of rows and columns, the covariance matrix and the PSDM of form a block diagonal matrix, given by (8). Here, .
(8)
It is worth mentioning that since the Gaussian process is zero mean IID, the auto-correlation function , where is Kronecker delta and thus , is white (same for all the frequencies).
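The parity argument above can be checked directly. The sketch below evaluates the raw second moments E[m_a m_b] of the lifted monomials of two IID standard Gaussian variables using the double-factorial moment formula assumed in (7), and shows that monomials with different odd-even exponent patterns are uncorrelated; reordering by parity pattern then gives the block diagonal structure. The variable names and degrees are illustrative.

```python
import numpy as np
from itertools import combinations_with_replacement

def gaussian_moment(p, sigma=1.0):
    # E[w^p] for zero-mean Gaussian w: (p-1)!! * sigma^p if p is even, 0 if p is odd
    if p % 2 == 1:
        return 0.0
    return sigma**p * np.prod(np.arange(p - 1, 0, -2)) if p > 0 else 1.0

vars_ = ["w1", "w2"]
monos = [c for d in (1, 2) for c in combinations_with_replacement(vars_, d)]

def second_moment(ma, mb):
    # E[ma * mb] factorizes over the independent variables w1 and w2
    exps = {v: ma.count(v) + mb.count(v) for v in vars_}
    return float(np.prod([gaussian_moment(p) for p in exps.values()]))

M = np.array([[second_moment(a, b) for b in monos] for a in monos])
print(["*".join(m) for m in monos])
print(M)
# entries coupling monomials with different parity patterns are zero; grouping
# {w1}, {w2}, {w1*w1, w2*w2}, {w1*w2} and reordering makes M block diagonal.
```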
The following proposition shows this for general .
Proposition 7.
Consider the Gaussian IID process . Let . Then, there exists a , a permutation of , such that is block diagonal with non-zero blocks.
Proof: Refer Appendix E. ∎
Based on this example, one can extend the result to symmetric zero mean WSS processes. Symmetric distributions are distributions with probability density functions that are symmetric around the mean, for example, the Gaussian distribution.
Definition 5.3.
A probability distribution is said to be symmetric around the mean if and only if its density function, , satisfies for every , where is the mean of the distribution.
Proposition 8.
Consider a zero mean WSS process, with symmetric distribution and diagonal. Let . Then, there exists a , a permutation of such that is block diagonal with non-zero blocks.
Proof: The proof is similar to the IID GP case. For symmetric distributions, the odd moments are zero, since they are integrals of odd functions. ∎
As shown in the proof of Proposition 7, the monomial nodes corresponding to a given diagonal block (i.e., the same odd-even pattern) can be grouped into a cluster. Fig. 3(c) shows such a clustering with and . The red nodes are the lifted processes in the higher dimension (the monomial nodes ). The nodes inside a given cluster (shown by a blue ellipse) are correlated with each other and the nodes from different clusters are not correlated with each other; this is a result of being block-diagonal. Here, and are correlated via the cluster of and , whereas and are correlated via the cluster of and . Notice the following important distinction from the LDIMs in Fig. 1. Here, it is sufficient for the observed nodes to be connected to one of the latent nodes in a given cluster. It is not necessary for the nodes to have a common parent, as in Fig. 1. The common parent property from Section 4 is replaced with a common ancestral cluster here.
5.3 Transformation of Polynomial Correlation to Latent Nodes
Based on the aforementioned discussion, a relaxation of Assumption 1 and an extension of the LDIM transformation results from Section 4 are presented here. It is shown that, in order to relax Assumption 1, non-linear interactions are required between the latent nodes and the observed nodes. The following assumption is a generalization of Assumption 1.
Assumption 2.
In LDIM , defined by (1), the noise processes are correlated if and only if there exist polynomials and , , , with diagonal, , such that and , where , , for some .
Remark 5.4.
Definition 5.5.
For any LDIM, , with non-diagonal, and satisfying Assumption 2, and
(9)
is the space of all -transformations for a given . The matrix is obtained by concatenating . This is done by first listing all the corresponding to “degree one” monomials in lexicographic order, then “degree ” in lexicographic order, etc. That is, where denotes the column vector of corresponding to , canonical basis. For from Section 5.2, . Here, .
With this new “polynomial lifting” definition, one can transform a given LDIM with correlated noise sources to a transformed LDIM with latent nodes. As shown in the following lemma, transformation of correlations to uncorrelated latent nodes from Section 4 is replaced with uncorrelated latent clusters here. For any cluster , , that is, denotes the union of the children of the nodes present in cluster .
Lemma 9.
Let be an LDIM that satisfies Assumption 2, and let be the correlation graph of the exogenous noise sources, .
Then, for every distinct , if and only if for every LDIG , there exists a cluster such that .
Proof: Refer Appendix F. ∎
The following theorem shows that a subgraph, , of the correlation graph, , forms a maximal clique in if and only if for any transformed LDIG in , the set of nodes in is equal to the set of the children of some latent cluster.
Theorem 10.
Let be an LDIM defined by (1) which satisfies Assumption 2, and let be the correlation graph of the exogenous noise sources, . Suppose that is a maximal clique with . Then, for any LDIM , there exist latent clusters in the LDIG of such that
(10)
where . In particular, for any latent cluster in the LDIG of , forms a clique in .
Proof: See Appendix G. ∎
Consider the LDIG shown in Fig. 3(a) with the correlation graph given by Fig. 3(b). As shown in Fig. 3(d), if all three nodes are correlated, one can transform this LDIM to an LDIM with latent node . However, here node should capture the interactions from the latent nodes . Therefore, during reconstruction, one should accommodate non-linear interactions between the latent node and the observed nodes.
In the next section, we describe a way to perform the reconstruction, when is unavailable.
6 Moral-Graph Reconstruction by Matrix Decomposition
In the previous sections, frameworks that convert a network with spatially correlated noise to a network with latent nodes were studied. In this section, a technique is provided to reconstruct the topology of the transformed LDIMs using the sparse plus low-rank matrix decomposition of the IPSDM obtained from the observed time-series. Note that this result does not require any extra information other than the IPSDM obtained from the true LDIM.
6.1 Topology Reconstruction under Affine Correlation
Here, reconstruction under affine correlation is discussed. Recall from Definition 4.2 that . Then, PSDM of , , can be written as [27]: , where the second equality follows because and are uncorrelated with mean zero. Then, IPSDM of the observed nodes in ,
(11)
where and . Equality (a) follows from the matrix inversion lemma [20]. Equation (11) can then be rewritten as:
(12)
(13)
(14)
, and , which is similar to the model in [46]. If the moral-graph of the original LDIG is sparse and , then is sparse and is low-rank. It was shown in [27] that the support of retrieves the moral-graph of . Furthermore, as shown in [46], under some assumptions that are applicable to a large class of problems, the -th entry of is strictly real if and only if the edge is a strict spouse edge. Thus, it can be shown that, in a large class of problems, the support of retrieves the exact topology of . Following the approach in [46], we reconstruct the network topology from the sparse plus low-rank decomposition of , which is a skew-symmetric matrix. For completeness, the essential theory and a relevant algorithm from [46] are provided below in Section 6.4. The idea here is to decompose a given skew-symmetric matrix, i.e., , into the sparse and low-rank components ( and respectively), and then to reconstruct the moral-graph/topology from , for some .
The next subsection shows that the sparse plus low-rank decomposition is applicable in the polynomial correlation setting as well.
6.2 Topology reconstruction under polynomial correlation
Under polynomial correlation, recall from Definition 5.5 that . Let and let . Then, one can write,
(15)
where, captures the influence of on and is block diagonal.
The topology of the sub-graph restricted to the observed nodes in the above LDIM and the true topology of the network are equivalent. Moreover, similar to (11) (see [46] for details), one can obtain (12) to (14) exactly.
Here , , and . If , then is low-rank. Let be the set of monomials that have a non-zero contribution to and let . Under this scenario, is low-rank if . Section 7.1 demonstrates an example of applying the sparse plus low-rank decomposition to reconstruct the topology under polynomial correlation.
6.3 Low-rank plus Sparse Matrix Decomposition
Here, the following problem is considered: given a matrix that is known to be the sum of a sparse skew-symmetric matrix and a low-rank skew-symmetric matrix , retrieve the sparse and low-rank components. The following optimization program, modified from [7], is used to obtain the sparse plus low-rank decomposition, where is a pre-selected penalty factor [46].
(16)
where , for some .
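A possible rendering of the decomposition program is sketched below in Python with cvxpy; the exact formulation and the placement of the penalty are in [7, 46], so the objective below is an assumption, and the article's own simulations instead use Matlab with Yalmip and an SDP solver.

```python
import numpy as np
import cvxpy as cp

def sparse_plus_lowrank(M, gamma):
    """Hedged sketch of a program in the spirit of (16): split a real
    skew-symmetric matrix M into a sparse part S and a low-rank part L by
    minimizing gamma*||S||_1 + ||L||_* subject to S + L = M, with both parts
    constrained to be skew-symmetric."""
    n = M.shape[0]
    S = cp.Variable((n, n))
    L = cp.Variable((n, n))
    constraints = [S + L == M, S == -S.T, L == -L.T]
    objective = cp.Minimize(gamma * cp.sum(cp.abs(S)) + cp.normNuc(L))
    cp.Problem(objective, constraints).solve()
    return S.value, L.value

# toy usage: a rank-2 skew-symmetric part plus a sparse skew-symmetric part
rng = np.random.default_rng(1)
u, v = rng.standard_normal(6), rng.standard_normal(6)
L0 = np.outer(u, v) - np.outer(v, u)          # low-rank, skew-symmetric
S0 = np.zeros((6, 6)); S0[0, 1], S0[1, 0] = 2.0, -2.0
S_hat, L_hat = sparse_plus_lowrank(S0 + L0, gamma=0.3)
```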
In the next subsection, a sufficient condition and an algorithm for the exact recovery of the sparse and the low-rank components from using (16) are provided.
6.4 Sufficient Condition for Sparse Low-rank Matrix Decomposition
In this subsection, a sufficient condition (proved in [7, 46]) is provided to uniquely decompose a matrix as a sum of the sparse skew-symmetric and the low-rank skew-symmetric components. Furthermore, an algorithm is provided that utilizes this sufficient condition to retrieve the sparse and low-rank components.
The following definitions are used in the subsequent results.
(17)
(18)
where is the compact singular value decomposition of and is the Euclidean norm of a vector.
The following is a sufficient condition that guarantees the unique decomposition of (see [7, 46] for details).
Lemma 11.
Suppose that we are given a matrix , which is the sum of a sparse matrix and a low-rank matrix . If satisfies
(19)
then there exists a penalty factor such that (16) returns .
Remark 6.1.
The sufficient condition (19) roughly translates to being sparse and the number of maximal cliques, , being small, with clique sizes not too small (see [46] for more details).
The following metrics are used to measure the accuracy of the estimates in the optimization (16).
(20)
where denotes the Frobenius norm and is a sufficiently small fixed constant. Note that requires the knowledge of the true matrices and , whereas does not.
The following Lemma is proved in [46] and is applied in Algorithm 1 later to retrieve the sparse and low-rank components.
Lemma 12.
Suppose we are given a matrix , which is obtained by summing and , where is a sparse skew-symmetric matrix and is a low-rank skew-symmetric matrix. If and satisfies , then there exist at least three regions where . In particular, there exists an interval with such that the solution of (16) is for any .
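The selection rule suggested by Lemma 12 can be organized as a simple sweep over the penalty factor. The sketch below is an assumed rendering of such a sweep, not the algorithm of [46]: the penalty grid, the Frobenius-norm change metric, and the tolerance are placeholders, and the decomposition routine is passed in (for example, the cvxpy sketch given earlier).

```python
import numpy as np

def sweep_penalty(M, gammas, decompose, tol=1e-6):
    """Solve the decomposition for a grid of penalties and record how much the
    sparse estimate changes between consecutive penalties; by Lemma 12, a middle
    interval where this change vanishes indicates the exact decomposition."""
    prev_S, diffs, sols = None, [], []
    for g in gammas:
        S_hat, L_hat = decompose(M, g)
        sols.append((g, S_hat, L_hat))
        if prev_S is not None:
            diffs.append(np.linalg.norm(S_hat - prev_S, "fro"))
        prev_S = S_hat
    zero_idx = [i for i, d in enumerate(diffs) if d < tol]   # candidate zero regions
    return sols, diffs, zero_idx

# usage with the earlier decomposition sketch:
# sols, diffs, zero_idx = sweep_penalty(M, np.linspace(0.05, 1.0, 20), sparse_plus_lowrank)
```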
Following the procedure in [46], moral-graph/topology reconstruction from the imaginary part of the IPSDM is considered here. The following are some assumptions from [46], assumed for the exact recovery of the topology. The details can be found in [46], and are skipped here. In the absence of Assumption 4, the reconstruction algorithm will detect some false positive edges, but none of the true edges are missed.
Assumption 3.
For any , , if then , for all .
Assumption 4.
For the LDIM in (1), and , if and , then .
Lemma 13.
Theorem 14.
Proof: Refer Appendix H. ∎
Additionally, applying the algorithms in [46], the correlation graph can also be reconstructed, if the LDIM satisfies the following assumption, as shown in the simulation results.
Assumption 5.
For every pair of distinct latent nodes in the transformed dynamic graph, the distance between and is at least four hops.
7 Simulation results
In this section, we demonstrate topology reconstruction of an LDIM with non-diagonal , from , using the sparse plus low-rank decomposition technique discussed in Section 6 for an affinely correlated network. Fig. 4(a)-4(e) respectively depict , , , , and described in Section 4. Simulations are done in Matlab. Yalmip [23] with the SDP solver [44] is used to solve the convex program (16).
For the simulation, we assume access to the true PSDM, , of the LDIM of Fig. 4(a). Here, is non-diagonal with the (unknown) correlation structure shown in Fig. 4(c).
For the reconstruction, the imaginary part of the IPSDM, , is employed in the convex optimization (16) for . Optimization (16) is solved for all the values of in the interval . Notice that for . Fig. 5 shows the comparison of and diff_t versus . for is picked, which belongs to the middle zero region of diff_t, as described in [46].
returned the exact topology of Fig. 4(b). From , by following Algorithms and in [46], is also reconstructed, which matches Fig. 4(c) exactly.
7.1 Polynomial correlation
Here, the topology reconstruction of the network when the noise processes are polynomially correlated is shown. For the simulation, consider the LDIG shown in Fig. 4(a) and the correlation graph of Fig. 4(b). The noise processes are IID GPs, as described in Section 5.2, with and . The entries of are such that only the coefficients corresponding to and are non-zero, and the coefficients , for every , .
Figure 6 shows the diff_t and tol_t plots obtained by applying Algorithm 1 with . As shown in the plots, for , tol_t is zero, which corresponds to the exact decomposition. Additionally, as mentioned in Lemma 12, diff_t is zero in this interval. The support of for some reconstructs the exact topology, which validates Theorem 14.
7.2 Finite data simulation
In this section, to evaluate the effect of finite data size, simulations are run on a synthetic data set based on the network shown in Fig. 4(e). For the PSD estimation, Welch's method [41] is used. Notice that the accuracy and the sample complexity of the estimation can be improved by employing advanced IPSDM estimation techniques from the literature, for example see [2, 48, 49]. Fig. 7 shows the estimated results from a sample size of ; Fig. 7(a) shows the true and the estimated IPSDM matrices. The fourth matrix of Fig. 7(a) shows the topology estimated directly from , without decomposition. The estimation is done by hard thresholding. That is, the edge is detected if is greater than a threshold, and not detected otherwise. The detection threshold is selected to obtain the minimum number of errors. Here, 14 out of 16 edges are detected, but 6 false positive edges are also detected, giving a total error of 50% (8 errors out of 16 edges). This shows that estimating the topology directly from returns an undesirable number of errors.
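A possible rendering of this finite-data pipeline in Python/SciPy is sketched below; the article's simulations are in Matlab, and the segment length, the use of scipy.signal.csd, and the thresholding rule are assumptions of the sketch.

```python
import numpy as np
from scipy.signal import csd

def estimate_ipsdm(X, fs=1.0, nperseg=256):
    """Estimate the cross power spectral density matrix of the measured
    time series X (nodes x samples) by Welch's method, then invert it
    frequency-by-frequency to obtain the IPSDM."""
    n = X.shape[0]
    f, _ = csd(X[0], X[0], fs=fs, nperseg=nperseg)
    Phi = np.zeros((len(f), n, n), dtype=complex)
    for i in range(n):
        for j in range(n):
            _, Pij = csd(X[i], X[j], fs=fs, nperseg=nperseg)
            Phi[:, i, j] = Pij
    J = np.array([np.linalg.inv(Phi[k]) for k in range(len(f))])
    return f, J

def threshold_topology(J, thresh):
    """Naive hard-thresholding baseline: declare edge (i, j) if the maximum
    magnitude of the (i, j) IPSDM entry over frequency exceeds thresh."""
    mag = np.max(np.abs(J), axis=0)
    A = (mag > thresh).astype(int)
    np.fill_diagonal(A, 0)
    return A
```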
Towards the exact topology retrieval, the optimization (16) is performed on to obtain sparse plus low-rank decomposition. Fig. 7(b) shows the sparse part retrieved from the decomposition of for various from to . at is selected for estimating the topology. Thus, as illustrated by the above example, with the approach proposed in the article it is possible to choose a threshold that yields 100% detection without sacrificing the false alarm performance.
In order to demonstrate that the techniques proposed in this article do not degrade drastically with a smaller number of samples, a simulation is run on 6000 samples. Employing the detection from returned 48 false edges and missed one edge, for a total of 49 errors. However, with the decomposition, detection from the support of at detected 14 out of 16 edges with no false edges, that is, a detection rate of 87.5% with no false alarms. This confirms that the decomposition based reconstruction proposed by the article yields substantial advantages. The sample complexity analysis of the article's methods is open for future research.
8 Conclusion
In this article, the problem of reconstructing the topology of an LDIM with spatially correlated noise sources was studied. First, assuming affine correlation and the knowledge of the correlation graph, the given LDIM was transformed into an LDIM with latent nodes, where the latent nodes were characterized using the correlation graph and all the nodes were excited by uncorrelated noise sources. For polynomial correlation, a generalization of the affine correlation, the latent nodes in the transformed LDIMs were excited using clusters of noise sources, where the noise clusters were uncorrelated with each other. Finally, using a sparse plus low-rank matrix decomposition technique, the exact topology of the LDIM was reconstructed, solely from the IPSDM of the true LDIM, when the network satisfied a sufficient condition required for the matrix decomposition. Simulation results that verify the theoretical results were provided.
Appendix A Proof of Lemma 2
Let and let . By definition of , , where denotes the row of , for any . To prove the only if part, suppose that . By definition, . Then, , since and are uncorrelated for . Thus, , which implies that there exists a such that and . In other words, there exists a latent node in the corresponding LDIG of such that .
Let such that . Then, from the proof of only if part, which implies that or , except for a few pathological cases that occur with Lebesgue measure zero. We ignore the pathological cases here. Hence, there does not exist any latent node such that both . ∎
The following result shows that if a subgraph, , of the correlation graph, , forms a maximal clique in , then for any transformed LDIG in , the set of nodes in is equal to the set of children of some latent nodes in the LDIG.
Appendix B Proof of Lemma 3
The following lemma is useful in proving Lemma 3.
Lemma 15.
Let be an LDIM defined by (1) which satisfies Assumption 1, and let be the correlation graph of the exogenous noise sources, . Suppose that is a maximal clique with . Then, for every LDIG , there exist latent nodes such that
(21)
where .
In particular, for any latent node in the LDIG , forms a clique (not necessarily maximal) that is restricted to in .
Proof: Let , , such that forms a clique in . Lemma 2 showed that, for any , there exists a latent node such that in the LDIG, , of . Since this is true for any pair , there exists a minimal set of latent nodes , , s. t. for any , we have for some , in . Hence, . Similarly, implies . Therefore, .
Next, we prove that all the children of a given latent node belong to a single (maximal) clique, which proves and . Let be a latent node in the LDIG of and suppose . Then, from the definition of , there exists such that and , and hence a.e. Thus, , excluding the pathological cases. Notice that this is true for any . Therefore, forms a clique (not necessarily maximal) in . Since this is true for every , . The similar proof shows that . ∎
Appendix C Proof of Theorem 4
Lemma 16.
(Pigeonhole principle): The pigeonhole principle states that if pigeons are put into pigeonholes, with , then at least one hole must contain more than one pigeon.
We use pigeonhole principle and Lemma 15 to prove this via contrapositive argument. Recall that the number of latent nodes is equal to the number of cliques . Suppose there exists a clique such that, for some , there does not exist a latent node with . By Lemma 15, there exist latent nodes such that and . By Lemma 15 again, all the children of are included in a single clique. That is, and , since . Then, excluding , , and , we are left with cliques and latent nodes. Applying pigeonhole principle, with pigeons (cliques) and holes (latent nodes), there would exist at least one latent node with belonging to two different maximal cliques, which is a contradiction of Lemma 15. ∎
Appendix D Proof of Proposition 5
Let , , be the topologies of two distinct transformations and respectively. Without loss of generality, let be such that and . By the definition of , if and , then . If and , then is an observed node and is a latent node. From Lemma 2, if and only if . Thus, , which is a contradiction, since both cannot be true. Similar contradiction holds if and . If , then since . Thus, the assumption leads to a contradiction, which implies that . ∎
Appendix E Proof of Proposition 7
Consider a pair of monomials with and . For notational convenience, the index is omitted. Then, . By (7), if and only if is even, . Suppose is odd. Then, must be odd. Similarly, must be even if is even, .
Define an element-wise boolean operator, such that for , if is odd and if is even. Then, for and , if and only if . Group the monomials with the same odd-even pattern into one cluster. Since the total number of different values that can take is , there are distinct clusters that are uncorrelated with each other.
Reorder by grouping the monomials belonging to the same cluster together to obtain , similar to the example in Section 5.2. Then, is a block diagonal matrix with blocks for variable polynomials, where each diagonal block corresponds to one particular pattern of . ∎
Appendix F Proof of Lemma 9
Let and let . By definition of , , where denotes the row of , for any . To prove the only if part, suppose that . By definition, . Since , and are uncorrelated, and is block diagonal (Proposition 8),
Thus, there exists a such that and for some . That is, there exists a cluster such that .
Let such that . Then, from the proof of only if part, . That is, , except for a few pathological cases that occur with Lebesgue measure zero. We ignore the pathological cases here. Hence, there does not exist any cluster such that both . ∎
Appendix G Proof of Theorem 10
Appendix H Proof of Theorem 14
As shown in Lemma 11, if , then for the appropriately selected , the convex program (16) retrieves and when . If any one of the LDIMs satisfies this condition, then the imaginary part of returns the topology among the observed node of , by Lemma 13. The theorem follows from Remark 4.8 and Lemma 15. ∎
The authors acknowledge the support of NSF for supporting this research through the project titled "RAPID: COVID-19 Transmission Network Reconstruction from Time-Series Data" under Award Number 2030096.
References
- [1] Daniele Alpago, Mattia Zorzi, and Augusto Ferrante. Identification of sparse reciprocal graphical models. IEEE Control Systems Letters, 2(4):659–664, 2018.
- [2] Daniele Alpago, Mattia Zorzi, and Augusto Ferrante. A scalable strategy for the identification of latent-variable graphical models. IEEE Transactions on Automatic Control, 67(7):3349–3362, 2022.
- [3] Enrico Avventi, Anders G. Lindquist, and Bo Wahlberg. Arma identification of graphical models. IEEE Transactions on Automatic Control, 58(5):1167–1178, 2013.
- [4] James M. Bower and David Beeman. The book of GENESIS: exploring realistic neural models with the GEneral NEural SImulation System. Springer Science & Business Media, 2012.
- [5] David Carfi and Giovanni Caristi. Financial dynamical systems. Differential Geometry–Dynamical Systems, 2008.
- [6] Elena Ceci, Yanning Shen, Georgios B. Giannakis, and Sergio Barbarossa. Graph-based learning under perturbations via total least-squares. IEEE Transactions on Signal Processing, 68:2870–2882, 2020.
- [7] Venkat Chandrasekaran, Sujay Sanghavi, Pablo A. Parrilo, and Alan S. Willsky. Rank-sparsity incoherence for matrix decomposition. SIAM Journal on Optimization, 21(2):572–596, 2011.
- [8] Venkat Chandrasekaran, Pablo A. Parrilo, and Alan S. Willsky. Latent variable graphical model selection via convex optimization. The Annals of Statistics, 40(4):1935–1967, 2012.
- [9] Valentina Ciccone, Augusto Ferrante, and Mattia Zorzi. Robust identification of “sparse plus low-rank” graphical models: An optimization approach. In 2018 IEEE Conference on Decision and Control (CDC), 2241–2246, Dec 2018.
- [10] Valentina Ciccone, Augusto Ferrante, and Mattia Zorzi. Factor models with real data: A robust estimation of the number of factors. IEEE Transactions on Automatic Control, 64(6):2412–2425, 2019.
- [11] Valentina Ciccone, Augusto Ferrante, and Mattia Zorzi. Learning latent variable dynamic graphical models by confidence sets selection. IEEE Transactions on Automatic Control, 65(12):5130–5143, 2020.
- [12] David A. Cox, John Little, and Donal O’Shea. Ideals, varieties, and algorithms-an introduction to computational algebraic geometry and commutative algebra. Springer, 2007.
- [13] Francesca Crescente, Lucia Falconi, Federica Rozzi, Augusto Ferrante, and Mattia Zorzi. Learning ar factor models. In 2020 59th IEEE Conference on Decision and Control (CDC), 274–279, 2020.
- [14] Mihaela Dimovska and Donatello Materassi. Granger-causality meets causal inference in graphical models: Learning networks via non-invasive observations. In 2017 IEEE 56th Annual Conference on Decision and Control (CDC), 5268–5273, Dec 2017.
- [15] Mihaela Dimovska and Donatello Materassi. A control theoretic look at granger causality: extending topology reconstruction to networks with direct feedthroughs. IEEE Transactions on Automatic Control, Early Access:1–1, 2020.
- [16] H. J. Dreef, M. C. F. Donkers, and Paul M. J. Van den Hof. Identifiability of linear dynamic networks through switching modules. IFAC-PapersOnLine, 54(7):37–42, 2021.
- [17] Lucia Falconi, Augusto Ferrante, and Mattia Zorzi. A robust approach to arma factor modeling. arXiv preprint arXiv:2107.03873, 2021.
- [18] Stefanie J. M. Fonken, Karthik Raghavan Ramaswamy, and Paul M. J. Van den Hof. A scalable multi-step least squares method for network identification with unknown disturbance topology. Automatica, 141:110295, 2022.
- [19] M. Ghil, M. R. Allen, M. D. Dettinger, K. Ide, D. Kondrashov, M. E. Mann, A. W. Robertson, A. Saunders, Y. Tian, F. Varadi, and P. Yiou. Advanced spectral methods for climatic time series. Reviews of Geophysics, 40(1):3–1–3–41, 2002.
- [20] Roger A. Horn and Charles R. Johnson. Matrix Analysis. Cambridge University Press, USA, 2nd edition, 2012.
- [21] Giacomo Innocenti and Donatello Materassi. Modeling the topology of a dynamical network via wiener filtering approach. Automatica, 48(5):936–946, 2012.
- [22] Raphaël Liégeois, Bamdev Mishra, Mattia Zorzi, and Rudolph Sepulchre. Sparse plus low-rank autoregressive identification in neuroimaging time series. In 2015 54th IEEE Conference on Decision and Control (CDC), 3965–3970, Dec 2015.
- [23] Johan Lofberg. Yalmip : a toolbox for modeling and optimization in matlab. In 2004 IEEE International Conference on Robotics and Automation (IEEE Cat. No.04CH37508), 284–289, Sep. 2004.
- [24] Eduardo Mapurunga and Alexandre Sanfelici Bazanella. Optimal allocation of excitation and measurement for identification of dynamic networks. arXiv preprint arXiv:2007.09263, 2020.
- [25] Donatello Materassi and Giacomo Innocenti. Topological identification in networks of dynamical systems. IEEE Transactions on Automatic Control, 55(8):1860–1871, 2010.
- [26] Donatello Materassi and Murti V. Salapaka. Network reconstruction of dynamical polytrees with unobserved nodes. In 2012 IEEE 51st IEEE Conference on Decision and Control (CDC), 4629–4634, 2012.
- [27] Donatello Materassi and Murti V. Salapaka. On the problem of reconstructing an unknown topology via locality properties of the wiener filter. IEEE Transactions on Automatic Control, 57(7):1765–1777, July 2012.
- [28] Donatello Materassi and Murti V. Salapaka. Signal selection for estimation and identification in networks of dynamic systems: A graphical model approach. IEEE Transactions on Automatic Control, 1–1, 2019.
- [29] Rohan Money, Joshin Krishnan, and Baltasar Beferull-Lozano. Online non-linear topology identification from graph-connected time series. In 2021 IEEE Data Science and Learning Workshop (DSLW), 1–6, 2021.
- [30] Rohan Money, Joshin Krishnan, and Baltasar Beferull-Lozano. Random feature approximation for online nonlinear graph topology identification. In 2021 IEEE 31st International Workshop on Machine Learning for Signal Processing (MLSP), 1–6, 2021.
- [31] Athanasios Papoulis. Probability and Statistics. Prentice-Hall, Inc., USA, 1990.
- [32] Sourav Patel, Sandeep Attree, Saurav Talukdar, Mangal Prakash, and Murti V Salapaka. Distributed apportioning in a power network for providing demand response services. In 2017 IEEE International Conference on Smart Grid Communications (SmartGridComm), 38–44. IEEE, 2017.
- [33] Christopher J. Quinn, Negar Kiyavash, and Todd P. Coleman. Directed information graphs. IEEE Transactions on Information Theory, 61(12):6887–6909, 2015.
- [34] Venkatakrishnan C. Rajagopal, Karthik R. Ramaswamy, and Paul M. J. Van Den Hof. Learning local modules in dynamic networks without prior topology information. In 2021 60th IEEE Conference on Decision and Control (CDC), 840–845, 2021.
- [35] Karthik R. Ramaswamy and Paul M. J. Van den Hof. A local direct method for module identification in dynamic networks with correlated noise. IEEE Transactions on Automatic Control, 1–1, 2020.
- [36] Firoozeh Sepehr and Donatello Materassi. Blind learning of tree network topologies in the presence of hidden nodes. IEEE Transactions on Automatic Control, 65(3):1014–1028, March 2020.
- [37] Firoozeh Sepehr and Donatello Materassi. An algorithm to learn polytree networks with hidden nodes. In Advances in Neural Information Processing Systems 32, 15110–15119. Curran Associates, Inc., 2019.
- [38] Yanning Shen, Xiao Fu, Georgios B. Giannakis, and Nicholas D. Sidiropoulos. Topology identification of directed graphs via joint diagonalization of correlation matrices. IEEE Transactions on Signal and Information Processing over Networks, 6:271–283, 2020.
- [39] Shengling Shi, Xiaodong Cheng, and Paul M. J. Van den Hof. Single module identifiability in linear dynamic networks with partial excitation and measurement. IEEE Transactions on Automatic Control, 68(1): 285 - 300 December 2021.
- [40] Jitkomut Songsiri and Lieven Vandenberghe. Topology selection in graphical models of autoregressive processes. Journal of Machine Learning Research, 11(91):2671–2705, 2010.
- [41] Petre Stoica and Randolph L. Moses. Spectral analysis of signals, volume 452. Pearson Prentice Hall Upper Saddle River, NJ, 2005.
- [42] Saurav Talukdar, Deepjyothi Deka, Michael Chertkov, and Murti V. Salapaka. Topology learning of radial dynamical systems with latent nodes. In 2018 Annual American Control Conference (ACC), 1096–1101, June 2018.
- [43] Saurav Talukdar, Deepjyoti Deka, Harish Doddi, Donatello Materassi, Michael Chertkov, and Murti V. Salapaka. Physics informed topology learning in networks of linear dynamical systems. Automatica, 112:108705, 2020.
- [44] Reha H. Tütüncü, Kim-Chuan Toh, and Michael J. Todd. SDPT3—a Matlab software package for semidefinite-quadratic-linear programming, version 3.0. Web page http://www.math.nus.edu.sg/mattohkc/sdpt3.html, 2001.
- [45] Paul M. J. Van den Hof, Arne Dankers, Peter S. C. Heuberger, and Xavier Bombois. Identification of dynamic models in complex networks with prediction error methods—basic methods for consistent module estimates. Automatica, 49(10):2994–3006, 2013.
- [46] Mishfad S. Veedu, Doddi Harish, and Murti V. Salapaka. Topology learning of linear dynamical systems with latent nodes using matrix decomposition. IEEE Transactions on Automatic Control, 67(11): 5746 - 5761, Nov. 2022.
- [47] Allen J Wood, Bruce F Wollenberg, and Gerald B Sheblé. Power generation, operation, and control. John Wiley & Sons, 2013.
- [48] Mattia Zorzi and Rudolph Sepulchre. Ar identification of latent-variable graphical models. IEEE Transactions on Automatic Control, 61(9):2327–2340, Sep. 2016.
- [49] Mattia Zorzi and Alessandro Chiuso. Sparse plus low rank network identification: A nonparametric approach. Automatica, 76:355–366, 2017.