
A local direct method for module identification in dynamic networks with correlated noise

Karthik R. Ramaswamy, Paul M.J. Van den Hof

The authors are with the Department of Electrical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands ({k.r.ramaswamy, p.m.j.vandenhof}@tue.nl). Paper submitted for publication 2 August 2019; revised 16 January 2020 and 10 July 2020; final version 31 October 2020. This work has received funding from the European Research Council (ERC), Advanced Research Grant SYSDYNET, under the European Union's Horizon 2020 research and innovation programme (grant agreement No 694504).
Abstract

The identification of local modules in dynamic networks with known topology has recently been addressed by formulating conditions for arriving at consistent estimates of the module dynamics, under the assumption of having disturbances that are uncorrelated over the different nodes. The conditions typically reflect the selection of a set of node signals that are taken as predictor inputs in a MISO identification setup. In this paper an extension is made to arrive at an identification setup for the situation that process noises on the different node signals can be correlated with each other. In this situation the local module may need to be embedded in a MIMO identification setup for arriving at a consistent estimate with maximum likelihood properties. This requires the proper treatment of confounding variables. The result is a set of algorithms that, based on the given network topology and disturbance correlation structure, selects an appropriate set of node signals as predictor inputs and outputs in a MISO or MIMO identification setup. Three algorithms are presented that differ in their approach of selecting measured node signals. Either a maximum or a minimum number of measured node signals can be considered, as well as a preselected set of measured nodes.

Index Terms:
Closed-loop identification, dynamic networks, correlated noise, system identification, predictor input and predicted output selection.

I Introduction

In recent years increasing attention has been given to the development of new tools for the identification of large-scale interconnected systems, also known as dynamic networks. These networks are typically thought of as a set of measurable signals (the node signals) interconnected through linear dynamic systems (the modules), possibly driven by external excitations (the reference signals). Among the literature on this topic, we can distinguish three main categories of research. The first one focuses on identifying the topology of the dynamic network [1, 2, 3, 4, 5]. The second category concerns identification of the full network dynamics [6, 7, 8, 9, 10, 11], including aspects of identifiability, particularly addressed in [12, 13, 14], while the third one deals with identification of a specific component (module) of the network, assuming that the network topology is known (the so-called local module identification), see [15, 16, 17, 18, 19, 20].

In this paper we will further expand the work on the local module identification problem. In [15], the classical direct method [21] for closed-loop identification has been generalized to a dynamic network framework using a MISO identification setup. Consistent estimates of the target module can be obtained when the network topology is known and all the node signals in the MISO identification setup are measured. The work has been extended in [22, 23, 24] towards the situation where some node signals might be non-measurable, leading to an additional predictor input selection problem. A similar setup has also been studied in [18], where an approach has been presented based on empirical Bayesian methods to reduce the variance of the target module estimates. In [16] and [19], dynamic networks having node measurements corrupted by sensor noise have been studied, and informative experiments for consistent local module estimates have been addressed in [20].

A standing assumption in the aforementioned works [15], [18], [20], [23] is that the process noises entering the nodes of the dynamic network are uncorrelated with each other. This assumption facilitates the analysis and the development of methods for local module identification, reaching consistent module estimates using the direct method. However, when process noises are correlated over the nodes, the consistency results for the considered MISO direct method collapse. In this situation it is necessary to consider also the noise topology or disturbance correlation structure, when selecting an appropriate identification setup. Even though the indirect and two-stage methods in [16, 20] can handle the situation of correlated noise and deliver consistent estimates, the obtained estimates will not have minimum variance.

In this paper we particularly consider the situation of having dynamic networks with disturbance signals on different nodes that possibly are correlated, while our target moves from consistency only, to also minimum variance (or Maximum Likelihood (ML)) properties of the obtained local module estimates. We will assume that the topology of the network is known, as well as the (Boolean) correlation structure of the noise disturbances, i.e. the zero-elements in the spectral density matrix of the noise. While one could use techniques for full network identification (e.g., [8]), our aim is to develop a method that uses only local information. In this way, we avoid (i) the need to collect node measurements that are “far away” from the target module, and (ii) the need to identify unnecessary modules that would come with the price of higher variance in the estimates.

Using the reasoning first introduced in [25], we build a constructive procedure that, choosing a limited number of predictor inputs and predicted outputs, builds an identification setup that guarantees maximum likelihood (ML) properties (and thus asymptotic minimum variance) when applying a direct prediction error identification method. In this situation we have to deal with so-called confounding variables (see e.g. [25], [26]), that is, unmeasured variables that directly or indirectly influence both the predicted output and the predictor inputs, and lead to lack of consistency. The effect of confounding variables will be mitigated by extending the number of predictor inputs and/or predicted outputs in the identification setup, thus including more measured node signals in the identification. Preliminary results for the particular "full input" case have been presented in [27]. Here we generalize that reasoning to different node selection schemes, and provide a generally applicable theory that is independent of the particular node selection scheme that is used.

This paper is organized as follows. In Section II, the dynamic network setup is defined. Section III provides a summary of available results from the existing literature on local module identification related to the context of this paper. Next, important concepts and notations used in this paper are defined in Section IV, while the MIMO identification setup and main results are presented in subsequent sections. Sections VII-IX provide algorithms and illustrative examples for three different ways of selecting input and output node signals: the full input case, the minimum input case, and the user selection case. This is followed by Conclusions. The technical proofs of all results are collected in the Appendix.

II Network and identification setup

Following the basic setup of [15], a dynamic network is built up out of $L$ scalar internal variables or nodes $w_j$, $j=1,\ldots,L$, and $K$ external variables $r_k$, $k=1,\ldots,K$. Each internal variable is described as:

$$w_j(t) = \sum_{\substack{l=1 \\ l\neq j}}^{L} G_{jl}(q)\, w_l(t) + u_j(t) + v_j(t) \qquad (1)$$

where $q^{-1}$ is the delay operator, i.e. $q^{-1}w_j(t) = w_j(t-1)$;

  • $G_{jl}$ are proper rational transfer functions, referred to as modules;

  • There are no self-loops in the network, i.e. nodes are not directly connected to themselves ($G_{jj} = 0$);

  • $u_j(t)$ is generated by the external variables $r_k(t)$ that can be directly manipulated by the user, and is given by $u_j(t) = \sum_{k=1}^{K} R_{jk} r_k(t)$, where $R_{jk}$ are stable, proper rational transfer functions;

  • $v_j$ is process noise, where the vector process $v = [v_1 \cdots v_L]^T$ is modelled as a stationary stochastic process with rational spectral density $\Phi_v(\omega)$, such that there exists a white noise process $e := [e_1 \cdots e_L]^T$, with covariance matrix $\Lambda > 0$, such that $v(t) = H(q)e(t)$, where $H(q)$ is square, stable, monic and minimum-phase. The situation of correlated noise, as considered in this paper, refers to $\Phi_v(\omega)$ and $H$ being non-diagonal, while we assume that we know a priori which entries of $\Phi_v$ are nonzero.

We will assume that the data satisfy the standard regularity conditions that are required for convergence results of the prediction error identification method (see [21], page 249); this includes the property that $e(t)$ has bounded moments of order higher than 4.

When combining the $L$ node signals we arrive at the full network expression

$$\begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_L \end{bmatrix} = \begin{bmatrix} 0 & G_{12} & \cdots & G_{1L} \\ G_{21} & 0 & \ddots & \vdots \\ \vdots & \ddots & \ddots & G_{L-1\,L} \\ G_{L1} & \cdots & G_{L\,L-1} & 0 \end{bmatrix} \begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_L \end{bmatrix} + \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_L \end{bmatrix} + H \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_L \end{bmatrix}$$

which results in the matrix equation:

$$w = Gw + Rr + He. \qquad (2)$$

It is assumed that the dynamic network is stable, i.e. $(I-G)^{-1}$ is stable, and well posed (see [28] for details). The representation (2) is an extension of the dynamic structure function representation [12]. The identification problem to be considered is the problem of identifying one particular module $G_{ji}(q)$ on the basis of a selection of measured variables $w$, and possibly $r$.
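To make the data-generating equation (2) concrete: once all modules contain a delay, the implicit loop $w = Gw + \ldots$ becomes an explicit recursion that can be simulated directly. The following Python sketch is illustrative only; the three-node topology, the strictly delayed FIR modules and the static noise mixing are hypothetical choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
N, L = 2000, 3

# Hypothetical strictly delayed FIR modules: w_j(t) depends on w_l(t-1).
G1 = np.array([[0.0, 0.5, 0.0],
               [0.3, 0.0, 0.2],
               [0.0, 0.4, 0.0]])

H0 = np.array([[1.0, 0.6, 0.0],   # static non-diagonal noise mixing:
               [0.0, 1.0, 0.0],   # v_1 and v_2 are correlated (H_12 != 0)
               [0.0, 0.0, 1.0]])

e = rng.standard_normal((N, L))   # white innovations e(t)
v = e @ H0.T                      # v(t) = H(q) e(t), here H(q) = H0
r = rng.standard_normal((N, L))   # external excitation, R(q) = I

w = np.zeros((N, L))
for t in range(1, N):
    # w(t) = G(q) w(t) + R(q) r(t) + H(q) e(t), with delayed modules
    w[t] = G1 @ w[t - 1] + r[t] + v[t]
```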

Let us define $\mathcal{N}_j$ as the set of node indices $k$ such that $G_{jk}\neq 0$, i.e. the node signals in $\mathcal{N}_j$ are the $w$-in-neighbors of the node signal $w_j$. Let $\mathcal{D}_j$ denote the set of indices of the internal variables that are chosen as predictor inputs. It seems most obvious to have $\mathcal{D}_j \subset \mathcal{N}_j$, but this is not necessary, as will be shown later in this paper. Let $\mathcal{V}_j$ denote the set of node indices $k$ such that $v_k$ has a path to $w_j$. Let $\mathcal{Z}_j$ denote the set of indices not in $\{j\}\cup\mathcal{D}_j$, i.e. $\mathcal{Z}_j = \{1,\ldots,L\}\setminus\{\{j\}\cup\mathcal{D}_j\}$, reflecting the node signals that are discarded in the prediction/identification. Let $w_{\mathcal{D}}$ denote the vector $[w_{k_1} \cdots w_{k_n}]^T$, where $\{k_1,\ldots,k_n\} = \mathcal{D}_j$, and let $u_{\mathcal{D}}$ denote the vector $[u_{k_1} \cdots u_{k_n}]^T$ for the same index set. The vectors $w_{\mathcal{Z}}$, $v_{\mathcal{D}}$, $v_{\mathcal{Z}}$ and $u_{\mathcal{Z}}$ are defined analogously. The ordering of the elements of $w_{\mathcal{D}}$, $v_{\mathcal{D}}$ and $u_{\mathcal{D}}$ is not important, as long as it is the same for all vectors. The transfer function matrix between $w_{\mathcal{D}}$ and $w_j$ is denoted $G_{j\mathcal{D}}$; the other transfer function matrices are defined analogously.

TABLE I: Notation of variables and sets.

$G$ : Network matrix with modules
$H$ : Network noise model
$G_{ji}$ : Target module with input $w_i$ and output $w_j$
$w_j$ : Node signal $w_j$, output of the target module
$w_i$ : Node signal $w_i$, input of the target module
$\mathcal{Y}$ : Set of indices of nodes that appear in the vector of predicted outputs
$\mathcal{D}$ : Set of indices of nodes that appear in the vector of predictor inputs for predicted outputs $w_{\mathcal{Y}}$
$\mathcal{D}_j$ : Set of indices of nodes that appear in the vector of predictor inputs for prediction of node $w_j$
$w_o$ : Output node signal $w_j$ if it is not in the set $w_{\mathcal{Q}}$
$\mathcal{Q}$ : Set of indices of nodes that appear both in the predicted output and in the predictor input
$\mathcal{U}$ : Set of indices of nodes that only appear as predictor input: $\mathcal{U} = \mathcal{D}\backslash\mathcal{Q}$
$\mathcal{A}$ : Set of indices of nodes that only appear as predictor input and do not have any confounding variable effect: $\mathcal{A}\subseteq\mathcal{U}$
$\mathcal{B}$ : Set of indices of nodes that only appear as predictor input: $\mathcal{B} = \mathcal{U}\backslash\mathcal{A}$
$\mathcal{Z}$ : Set of indices of nodes that are removed (immersed) from the network when predicting $w_{\mathcal{Y}}$
$\mathcal{Z}_j$ : Set of indices of nodes that are removed (immersed) from the network when predicting $w_j$
$v_k$ : Disturbance signal on node $w_k$
$\mathcal{N}_j$ : Index set of nodes that are $w$-in-neighbors of $w_j$
$e$ : (White noise) innovation of the noise process $v$
$\mathcal{L}$ : Index set of all node signals: $[1, L]$
$\bar{G}$ : Network matrix of the immersed and transformed network (8)
$\xi$ : (White noise) innovation of the noise process in the immersed and transformed network (8)

To illustrate the notation, consider the network sketched in Figure 1, and let module $G_{21}^{0}$ be the target module for identification.


Figure 1: Example network with target module $G_{21}^{0}$ (in green).

Then $j=2$, $i=1$; $\mathcal{N}_j = \{1,4\}$. If we choose the set of predictor inputs as $\mathcal{D}_j = \mathcal{N}_j$, then the set of remaining (nonmeasured) signals becomes $\mathcal{Z}_j = \{3,5,6\}$.
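This set bookkeeping is mechanical; a small Python sketch reproducing the numbers of this example (only $\mathcal{N}_2$ from the text is needed, since the full topology of Figure 1 is not reproduced here):

```python
# Index bookkeeping for the example of Figure 1 (target module G_21, so j = 2).
L_nodes = {1, 2, 3, 4, 5, 6}   # node indices of the 6-node example network
N_2 = {1, 4}                   # w-in-neighbors of w_2, as given in the text

j = 2
D_j = set(N_2)                  # predictor inputs chosen as D_j = N_j
Z_j = L_nodes - ({j} | D_j)     # discarded (nonmeasured) node signals

print(D_j, Z_j)                 # {1, 4} {3, 5, 6}
```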

With this notation, the network equation (2) is rewritten as:

$$\begin{bmatrix} w_j \\ w_{\mathcal{D}} \\ w_{\mathcal{Z}} \end{bmatrix} = \begin{bmatrix} 0 & G_{j\mathcal{D}} & G_{j\mathcal{Z}} \\ G_{\mathcal{D}j} & G_{\mathcal{D}\mathcal{D}} & G_{\mathcal{D}\mathcal{Z}} \\ G_{\mathcal{Z}j} & G_{\mathcal{Z}\mathcal{D}} & G_{\mathcal{Z}\mathcal{Z}} \end{bmatrix} \begin{bmatrix} w_j \\ w_{\mathcal{D}} \\ w_{\mathcal{Z}} \end{bmatrix} + \begin{bmatrix} v_j \\ v_{\mathcal{D}} \\ v_{\mathcal{Z}} \end{bmatrix} + \begin{bmatrix} u_j \\ u_{\mathcal{D}} \\ u_{\mathcal{Z}} \end{bmatrix}, \qquad (3)$$

where $G_{\mathcal{D}\mathcal{D}}$ and $G_{\mathcal{Z}\mathcal{Z}}$ have zeros on the diagonal.

For identification of module $G_{ji}$ we select $\mathcal{D}_j$ such that $i\in\mathcal{D}_j$, and subsequently estimate a multiple-input single-output model for the transfer functions in $G_{j\mathcal{D}}$, by considering the one-step-ahead predictor $\hat{w}_j(t|t-1;\theta) := \bar{\mathbb{E}}\{w_j(t)\,|\,w_j^{t-1}, w_{\mathcal{D}_j}^{t};\theta\}$ [21], where $\bar{\mathbb{E}}$ refers to $\lim_{N\rightarrow\infty}\frac{1}{N}\sum_{t=1}^{N}\mathbb{E}$, and $w_j^{\ell}$ and $w_{\mathcal{D}_j}^{\ell}$ refer to signal samples $w_j(\tau)$ and $w_k(\tau)$, $k\in\mathcal{D}_j$, respectively, for all $\tau\leq\ell$. The resulting prediction error $\varepsilon_j(t,\theta) = w_j(t) - \hat{w}_j(t|t-1;\theta)$ then becomes:

$$\varepsilon_j(t,\theta) = H_j(\theta)^{-1}\Big[w_j - \sum_{k\in\mathcal{D}_j} G_{jk}(\theta)\, w_k - u_j\Big] \qquad (4)$$

where the arguments $q$ and $t$ have been dropped for notational clarity. The parameterized transfer functions $G_{jk}(\theta)$, $k\in\mathcal{D}_j$, and $H_j(\theta)$ are estimated by minimizing the sum of squared (prediction) errors: $V_j(\theta) = \frac{1}{N}\sum_{t=0}^{N-1}\varepsilon_j^2(t,\theta)$, where $N$ is the length of the data set. We refer to this identification method as the direct method [15].
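As an illustration of this MISO direct method, the following hedged Python sketch estimates the modules $G_{jk}$ for one node under simplifying (hypothetical) assumptions: strictly delayed FIR modules and a fixed noise model $H_j = 1$, in which case minimizing $V_j(\theta)$ reduces to linear least squares.

```python
import numpy as np

def direct_method_fir(wj, w_inputs, nb=5):
    """MISO direct method sketch for one node: fit FIR models G_jk(theta)
    with a one-sample delay and H_j = 1, by minimizing the sum of squared
    prediction errors V_j = (1/N) sum eps_j^2; with H_j fixed to 1 the
    least-squares solution is the minimizer."""
    N = len(wj)
    cols = []
    for wk in w_inputs:   # regressors wk(t-1), ..., wk(t-nb) per input
        cols += [np.concatenate([np.zeros(d), wk[:N - d]])
                 for d in range(1, nb + 1)]
    Phi = np.column_stack(cols)
    theta, *_ = np.linalg.lstsq(Phi, wj, rcond=None)
    eps = wj - Phi @ theta            # prediction errors eps_j(t, theta)
    return theta, np.mean(eps**2)     # FIR coefficients and criterion V_j
```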

III Available results and problem specification

The following results are available from previous work:

  • When $\mathcal{D}_j$ is chosen equal to $\mathcal{N}_j$ and the noise $v_j$ is uncorrelated with all $v_k$, $k\in\mathcal{V}_j$, then $G_{ji}$ can be consistently estimated in a MISO setup, provided that there is enough excitation in the predictor input signals, see [15].

  • When $\mathcal{D}_j$ is a subset of $\mathcal{N}_j$ and the disturbances are uncorrelated, confounding variables (an unmeasured variable that has paths to both the input and the output of an estimation problem [29]) can occur in the estimation problem, and these have to be taken into account in the choice of $\mathcal{D}_j$ in order to arrive at consistent estimates of $G_{ji}$, see [23].

  • In [26], relaxed conditions for the selection of $\mathcal{D}_j$ have been formulated, while still staying in the context of MISO identification with the noise spectrum of $v$ ($\Phi_v$) being diagonal. This is particularly done by choosing additional predictor input signals that are not in $\mathcal{N}_j$, i.e. that are not in-neighbors of the output $w_j$ of the target module.

  • For non-diagonal $\Phi_v$, an indirect/two-stage identification method can be used to arrive at consistent estimates of $G_{ji}$ [15, 23, 20]. However, the drawback of these methods is that they do not allow for a maximum likelihood analysis, i.e. they will not lead to minimum variance results.

  • This latter argument also holds for the method in [22, 24], where Wiener-filter estimates are combined to provide local module estimates, and a diagonal $\Phi_v$ is considered.

In this paper, we go beyond consistency properties, and address the following problem: how to identify a single module in a dynamic network for the situation that the disturbance signals can be correlated, i.e. $\Phi_v$ not necessarily being diagonal, such that the estimate is consistent and asymptotically has Maximum Likelihood, and thus also minimum variance, properties.


Figure 2: Two-node example network from [25] with $v_1$ and $v_2$ dynamically correlated and $e_1$, $e_2$ white noise processes.

Addressing this problem requires a more careful treatment and modelling of the noise that is acting on the different node signals. This can be illustrated through a simple example presented in [25], where a two-node network is considered, as given in Figure 2, with $v_1$ and $v_2$ being dynamically correlated. In this case, a SISO identification using the direct method with input $w_1$ and output $w_2$ will lead to a biased estimate of $G_{21}$ because of the unmodelled correlation of the disturbance signals on $w_1$ and $w_2$ (in this particular example the bias is caused by the presence of $H_{21}$); for an analysis, see [25]. If both node signals $w_1$ and $w_2$ are predicted as outputs, then the correlation between the disturbance signals can be incorporated in a $2\times 2$ non-diagonal noise model, thus leading to an unbiased estimate of $G_{21}$. In this way bias due to correlation in the noise signals can be avoided by predicting additional outputs besides the output of the target module. This leads to the following two suggestions:

  • confounding variables can be dealt with by modelling correlated disturbances on the node signals, and

  • this can be done by moving from a MISO identification setup to a MIMO setup.

These suggestions are explored in the current paper. Next we will present an example to further illustrate the problem.
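The bias mechanism in the two-node example of Figure 2 can be reproduced numerically. In the sketch below (illustrative; the first-order dynamics $G_{21} = g_0 q^{-1}$ and $H_{21} = h\,q^{-1}$ are our own choices, not taken from [25]), the SISO least-squares estimate converges to $g_0 + h/2$ rather than $g_0$:

```python
import numpy as np

rng = np.random.default_rng(1)
N, g0, h = 100_000, 0.8, 0.5

e1 = rng.standard_normal(N)
e2 = rng.standard_normal(N)
v1 = e1
v2 = e2 + h * np.concatenate([[0.0], e1[:-1]])   # H21 = h q^{-1}: dynamic correlation

r1 = rng.standard_normal(N)                       # excitation on node 1
w1 = r1 + v1
w2 = g0 * np.concatenate([[0.0], w1[:-1]]) + v2   # G21 = g0 q^{-1}

x = np.concatenate([[0.0], w1[:-1]])              # regressor w1(t-1)
g_hat = (x @ w2) / (x @ x)                        # SISO direct (least-squares) estimate
print(g_hat)   # tends to g0 + h/2 = 1.05, not g0, since var(w1) = 2 here
```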

Example 1

Consider the network sketched in Figure 1, and let module $G_{21}^{0}$ be the target module for identification. If the node signals $w_1$, $w_2$ and $w_4$ can be measured, then a two-input one-output model with inputs $w_1, w_4$ and output $w_2$ can be considered. This can lead to consistent estimates of $G_{21}^{0}$ and $G_{24}^{0}$, provided that the disturbance signal $v_2$ is uncorrelated with all other disturbance signals. However, if e.g. $v_4$ and $v_2$ are dynamically correlated, implying that a noise model $H$ of the two-dimensional noise process is non-diagonal, then this approach results in a biased estimate. A solution is then to include $w_4$ in the set of predicted outputs, and to add node signal $w_3$ as a predictor input for $w_4$. We then combine predicting $w_2$ on the basis of $(w_1, w_4)$ with predicting $w_4$ on the basis of $w_3$. The correlation between $v_2$ and $v_4$ is then covered by modelling a $2\times 2$ non-diagonal noise model of the joint process $(v_2, v_4)$.

In the next sections we will formalize the procedure as sketched in Example 1 for general networks.

IV Concepts and notation

In line with [29] we define the notion of confounding variable.

Definition 1 (confounding variable)

Consider a dynamic network defined by

$$w = Gw + He + u \qquad (5)$$

with $e$ a white noise process, and consider the graph related to this network, with node signals $w$ and $e$. Let $w_{\mathcal{X}}$ and $w_{\mathcal{Y}}$ be two subsets of measured node signals in $w$, and let $w_{\mathcal{Z}}$ be the set of unmeasured node signals in $w$. Then a noise component $e_{\ell}$ in $e$ is a confounding variable for the estimation problem $w_{\mathcal{X}} \rightarrow w_{\mathcal{Y}}$ if in the graph there exist simultaneous paths from $e_{\ell}$ to node signals $w_k$, $k\in\mathcal{X}$, and $w_n$, $n\in\mathcal{Y}$, while these paths are either direct or only run through nodes that are in $w_{\mathcal{Z}}$. (A simultaneous path from $e_1$ to node signals $w_1$ and $w_2$ means that there exists a path from $e_1$ to $w_1$ as well as from $e_1$ to $w_2$; a direct path from $e_1$ to node signal $w_1$ is a path that does not pass through nodes in $w$.) $\Box$

We will denote $w_{\mathcal{Y}}$ as the node signals in $w$ that serve as predicted outputs, and $w_{\mathcal{D}}$ as the node signals in $w$ that serve as predictor inputs. Next we decompose $w_{\mathcal{Y}}$ and $w_{\mathcal{D}}$ into disjoint sets according to $\mathcal{Y} = \mathcal{Q}\cup\{o\}$; $\mathcal{D} = \mathcal{Q}\cup\mathcal{U}$, where $w_{\mathcal{Q}}$ are the node signals that are common to $w_{\mathcal{Y}}$ and $w_{\mathcal{D}}$; $w_o$ is the output $w_j$ of the target module (if $j\in\mathcal{Q}$ then $\{o\}$ is void); and $w_{\mathcal{U}}$ are the node signals that are only in $w_{\mathcal{D}}$. In this situation the measured nodes will be $w_{\mathcal{D}\cup\mathcal{Y}}$, and the unmeasured nodes $w_{\mathcal{Z}}$ will be determined by the set $\mathcal{Z} = \mathcal{L}\backslash\{\mathcal{D}\cup\mathcal{Y}\}$, where $\mathcal{L} = \{1,2,\cdots,L\}$.


Figure 3: A simple network with 3 nodes $w_1$, $w_2$, $w_3$ and unmeasured noise sources $e_1$, $e_2$ and $e_3$. $G_{12}$ is the target module to be identified.

There exist two types of confounding variables, namely direct and indirect confounding variables. For direct confounding variables the simultaneous paths mentioned in the definition are both direct paths, while in all other cases we refer to the confounding variables as indirect confounding variables. For example, in the network shown in Figure 3 with $\mathcal{D} = \{2\}$, $\mathcal{Y} = \{1\}$ and $\mathcal{Z} = \{3\}$, for the estimation problem $w_2\rightarrow w_1$, $e_2$ is a direct confounding variable since it has a simultaneous path to $w_1$ and $w_2$ where both paths are direct. Meanwhile $e_3$ is an indirect confounding variable since it has a simultaneous path to $w_1$ and $w_2$ where one of the paths is an unmeasured path, i.e. a path that runs through nodes in $w_{\mathcal{Z}}$ only. (Analogously, we can define unmeasured loops through a node $w_k$.)
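Definition 1, together with the direct/indirect distinction above, can be checked mechanically by a graph search: from each noise source, collect the nodes reachable along paths whose intermediate $w$-nodes all lie in the unmeasured set $\mathcal{Z}$. The Python sketch below is illustrative; the adjacency list is our reading of Figure 3 (each $e_k$ enters $w_k$, $e_2$ also enters $w_1$ directly, and $w_3$ feeds both $w_1$ and $w_2$), which is an assumption since the figure itself is not reproduced here.

```python
from collections import deque

def reach_via_unmeasured(adj, src, Z):
    """Nodes reachable from noise source src along paths whose intermediate
    w-nodes all lie in the unmeasured set Z (cf. Definition 1)."""
    reached, seen, queue = set(), {src}, deque([src])
    while queue:
        x = queue.popleft()
        for y in adj.get(x, []):
            if y in seen:
                continue
            seen.add(y)
            reached.add(y)
            if y in Z:            # may only continue through unmeasured nodes
                queue.append(y)
    return reached

def confounders(adj, noises, X, Y, Z):
    """Noise sources that reach both some node in X and some node in Y."""
    return [e for e in noises
            if (r := reach_via_unmeasured(adj, e, Z)) & X and r & Y]

# Figure 3 as we read it (assumption): e_k enters w_k; e_2 also enters w_1
# directly (H_12 != 0), and the unmeasured w_3 feeds both w_1 and w_2.
adj = {'e1': ['w1'], 'e2': ['w2', 'w1'], 'e3': ['w3'],
       'w3': ['w1', 'w2'], 'w2': ['w1']}
print(confounders(adj, ['e1', 'e2', 'e3'], X={'w2'}, Y={'w1'}, Z={'w3'}))
# -> ['e2', 'e3']  (direct and indirect confounding variable, respectively)
```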

Remark 1

Confounding variables are defined in accordance with their use in [26], on the basis of a network description as in (5). In this definition, absence of confounding variables still allows for unmeasured signals that create correlation between the inputs and outputs of an estimation problem, in particular if the white noise signals in $e$ are statically correlated, i.e. $cov(e)$ being non-diagonal. It will appear that this type of correlation does not hinder our identification results, as analysed in Section VI-C.

V Main results - Line of reasoning

On the basis of the decomposition of node signals as defined in the previous section we are going to represent the system’s equations (5) in the following structured form:

$$\begin{bmatrix} w_{\mathcal{Q}} \\ w_o \\ w_{\mathcal{U}} \\ w_{\mathcal{Z}} \end{bmatrix} = \begin{bmatrix} G_{\mathcal{Q}\mathcal{Q}} & G_{\mathcal{Q}o} & G_{\mathcal{Q}\mathcal{U}} & G_{\mathcal{Q}\mathcal{Z}} \\ G_{o\mathcal{Q}} & G_{oo} & G_{o\mathcal{U}} & G_{o\mathcal{Z}} \\ G_{\mathcal{U}\mathcal{Q}} & G_{\mathcal{U}o} & G_{\mathcal{U}\mathcal{U}} & G_{\mathcal{U}\mathcal{Z}} \\ G_{\mathcal{Z}\mathcal{Q}} & G_{\mathcal{Z}o} & G_{\mathcal{Z}\mathcal{U}} & G_{\mathcal{Z}\mathcal{Z}} \end{bmatrix} \begin{bmatrix} w_{\mathcal{Q}} \\ w_o \\ w_{\mathcal{U}} \\ w_{\mathcal{Z}} \end{bmatrix} + R(q)r + \begin{bmatrix} H_{\mathcal{Q}\mathcal{Q}} & H_{\mathcal{Q}o} & H_{\mathcal{Q}\mathcal{U}} & H_{\mathcal{Q}\mathcal{Z}} \\ H_{o\mathcal{Q}} & H_{oo} & H_{o\mathcal{U}} & H_{o\mathcal{Z}} \\ H_{\mathcal{U}\mathcal{Q}} & H_{\mathcal{U}o} & H_{\mathcal{U}\mathcal{U}} & H_{\mathcal{U}\mathcal{Z}} \\ H_{\mathcal{Z}\mathcal{Q}} & H_{\mathcal{Z}o} & H_{\mathcal{Z}\mathcal{U}} & H_{\mathcal{Z}\mathcal{Z}} \end{bmatrix} \begin{bmatrix} e_{\mathcal{Q}} \\ e_o \\ e_{\mathcal{U}} \\ e_{\mathcal{Z}} \end{bmatrix} \qquad (6)$$

where we make the notation agreement that the matrix $H$ is not necessarily monic, and the scaling of the white noise process $e$ is such that $cov(e) = I$. Without loss of generality, we can assume $r = 0$ for the sake of brevity.

Our objective is to end up with an identification problem in which we identify the dynamics from inputs $(w_{\mathcal{Q}}, w_{\mathcal{U}})$ to outputs $(w_{\mathcal{Q}}, w_o)$, while our target module $G_{ji}(q)$ is present as one of the scalar transfers (modules) in this identified (MIMO) model. This can be realized by the following steps:

  1. Firstly, we write the system's equations for the measured variables as

$$\underbrace{\begin{bmatrix} w_{\mathcal{Q}} \\ w_o \\ \hline w_{\mathcal{U}} \end{bmatrix}}_{w_m} = \underbrace{\left[\begin{array}{c|c} \bar{G} & 0 \\ \hline \bar{G}_{\mathcal{U}\mathcal{D}} & \bar{G}_{\mathcal{U}o} \end{array}\right]}_{\bar{G}_m} \begin{bmatrix} w_{\mathcal{Q}} \\ w_{\mathcal{U}} \\ \hline w_o \end{bmatrix} + \underbrace{\left[\begin{array}{c|c} \bar{H} & 0 \\ \hline 0 & \bar{H}_{\mathcal{U}\mathcal{U}} \end{array}\right]}_{\bar{H}_m} \underbrace{\begin{bmatrix} \xi_{\mathcal{Q}} \\ \xi_o \\ \hline \xi_{\mathcal{U}} \end{bmatrix}}_{\xi_m} \qquad (7)$$

with $\xi_m$ a white noise process, while $\bar{H}$ is monic, stable and stably invertible, and the components in $\bar{G}$ are zero if they concern a mapping between identical signals. This step is made by removing the non-measured signals $w_{\mathcal{Z}}$ from the network, while maintaining the second-order properties of the remaining signals. This step is referred to as immersion of the nodes in $w_{\mathcal{Z}}$ [23].

  2. As an immediate result of the previous step we can write an expression for the output variables $w_{\mathcal{Y}}$, by considering the upper part of equation (7), as

$$\underbrace{\begin{bmatrix} w_{\mathcal{Q}} \\ w_o \end{bmatrix}}_{w_{\mathcal{Y}}} = \underbrace{\begin{bmatrix} \bar{G}_{\mathcal{Q}\mathcal{Q}} & \bar{G}_{\mathcal{Q}\mathcal{U}} \\ \bar{G}_{o\mathcal{Q}} & \bar{G}_{o\mathcal{U}} \end{bmatrix}}_{\bar{G}} \underbrace{\begin{bmatrix} w_{\mathcal{Q}} \\ w_{\mathcal{U}} \end{bmatrix}}_{w_{\mathcal{D}}} + \underbrace{\begin{bmatrix} \bar{H}_{\mathcal{Q}\mathcal{Q}} & \bar{H}_{\mathcal{Q}o} \\ \bar{H}_{o\mathcal{Q}} & \bar{H}_{oo} \end{bmatrix}}_{\bar{H}} \underbrace{\begin{bmatrix} \xi_{\mathcal{Q}} \\ \xi_o \end{bmatrix}}_{\xi_{\mathcal{Y}}} \qquad (8)$$

with $cov(\xi_{\mathcal{Y}}) := \bar{\Lambda}$.

  3. Thirdly, we will provide conditions to guarantee that $\bar{G}_{ji}(q) = G_{ji}(q)$, i.e. the target module appearing in equation (8) is the target module of the original network (invariance of the target module). This will require conditions on the selection of node signals in $w_{\mathcal{Q}}$, $w_o$, $w_{\mathcal{U}}$.

  4. Finally, it will be shown that, on the basis of (8), under fairly general conditions, the transfer functions $\bar{G}(q)$ and $\bar{H}(q)$ can be estimated consistently and with maximum likelihood properties. A pictorial representation of the identification setup with the classification of the different sets of signals in (8) is provided in Figure 4. The figure also contains the sets $\mathcal{A}$, $\mathcal{B}$, $\mathcal{F}_n$, which will be introduced in the sequel.



Figure 4: The identification setup and classification of the different sets of signals in the input and output of the identification problem.

The combination of steps 3 and 4 will lead to a consistent and maximum likelihood estimate of the target module $G_{ji}(q)$. Note that this results in an identification setup in which signals can simultaneously act as input and as output (the set $w_{\mathcal{Q}}$). Because $\bar{G}_{\mathcal{Q}\mathcal{Q}}$ is restricted to be hollow (zero diagonal), this does not lead to trivial transfers between signals that are the same. A related situation appears when identifying a full network, while using all node signals as both inputs and outputs, as in [8].

Steps 1)-4) above will require conditions on the selection of node signals, based on the known topology of the network and an allowed correlation structure of the disturbances in the network. Specifying these conditions on the selection of the sets $w_{\mathcal{Q}}$, $w_o$, $w_{\mathcal{U}}$ will be an important objective of the next section.

VI Main Results - Derivations

VI-A System representation after immersion (Steps 1-2)

First we will show that a network in which the signals in $w_{\mathcal{Z}}$ are removed (immersed) can indeed be represented by (7).

Proposition 1

Consider a dynamic network given by (6), where the set of all nodes $w_{\mathcal{L}}$ is decomposed in the disjoint sets $w_{\mathcal{Q}}$, $w_o$, $w_{\mathcal{U}}$ and $w_{\mathcal{Z}}$ as defined in Section IV. Then, for the situation $r = 0$,

  1. there exists a representation (7) of the measured node signals $w_m$, with $\bar{H}_m$ monic, stable and stably invertible, and $\xi_m$ a white noise process, and

  2. for this representation there are no confounding variables for the estimation problem $w_{\mathcal{U}} \rightarrow w_{\mathcal{Y}}$.

Proof: See appendix.


Figure 5: (a): Original network with 4 nodes $\{w_i\}_{i=1,\ldots,4}$ and unmeasured white noise sources $\{e_i\}_{i=1,\ldots,4}$; (b): Transformed network with the confounding variable for $w_4\rightarrow w_1$ removed; (c): Transformed network with also the confounding variable for $w_3\rightarrow w_1$ removed.

The consequence of Proposition 1 is that the output node signals in $w_{\mathcal{Y}}$ can be explicitly written in the form of (8), in terms of the input node signals $w_{\mathcal{D}}$ and disturbances, without relying on (unmeasured) node signals in $w_{\mathcal{Z}}$. The particular structure of the network representation (7) implies that there are no confounding variables for the estimation problem $w_{\mathcal{U}}\rightarrow w_{\mathcal{Y}}$. This will be an important phenomenon for our identification setup. Based on (8), a typical prediction error identification method can provide estimates of $\bar{G}$ and $\bar{H}$ from the measured signals $w_{\mathcal{Y}}$ and $w_{\mathcal{D}}$ with $\mathcal{D} = \mathcal{Q}\cup\mathcal{U}$. In this estimation problem, confounding variables for the estimation problem $w_{\mathcal{Q}}\rightarrow w_{\mathcal{Y}}$ are treated by correlated noise modelling in $\bar{H}$, while confounding variables for the estimation problem $w_{\mathcal{U}}\rightarrow w_{\mathcal{Y}}$ are not present, due to the structure of (7).

In the following example, the step towards (7) will be illustrated, as well as its effect on the dynamics in $\bar{G}$.

Example 2

Consider the 4-node network depicted in Figure 5(a), where all nodes are considered to be measured, and where we select $w_o = w_1$, $\mathcal{U} = \{2,3,4\}$, and $\mathcal{Q} = \emptyset$. In this network, there is a confounding variable $e_4$ for the problem $w_4\rightarrow w_1$ (i.e. $w_{\mathcal{U}}\rightarrow w_{\mathcal{Y}}$), meaning that for the situation $\xi = e$ the noise model $\bar{H}_m$ in (7) will not be block diagonal. Therefore the network does not comply with the representation in (7) and (8). We can remove the confounding variable by shifting the effect of $H_{14}$ into a transformed version of $G_{14}$, which now becomes $G_{14} + H_{44}^{-1}H_{14}$, as depicted in Figure 5(b). However, since this shift also affects the transfer from $e_3$ to $w_1$, the change of $G_{14}$ needs to be mitigated by a new term $H_{13}$, in order to keep the network signals invariant. In the resulting network the confounding variable for $w_4\rightarrow w_1$ is removed, but a new confounding variable ($e_3$) for $w_3\rightarrow w_1$ has been created. In the second step, shown in Figure 5(c), the term $H_{13}$ is removed by incorporating its effect in the module $G_{13}$, which now becomes $G_{13} + H_{33}^{-1}H_{13}$. In the resulting network there are no confounding variables for $w_{\mathcal{U}}\rightarrow w_1$. This representation complies with the structure in (7). Note that in the transformed network, the dynamics of $G_{12}$ is left invariant, while the dynamics of $G_{14}$ and $G_{13}$ have been changed. The intermediately occurring confounding variables relate to a sequence of linked confounders, as discussed in [26]. $\Box$
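To make the first shifting step concrete, consider the following hedged sketch, in which we assume (hypothetically, since the exact equations behind Figure 5 are not reproduced here) the node equations $w_1 = G_{12}w_2 + G_{13}w_3 + G_{14}w_4 + e_1 + H_{14}e_4$ and $w_4 = H_{44}e_4 + H_{43}e_3$. Substituting $H_{14}e_4 = H_{44}^{-1}H_{14}\left(w_4 - H_{43}e_3\right)$ (scalar transfer functions commute) gives

$$w_1 = G_{12}w_2 + G_{13}w_3 + \underbrace{\left(G_{14} + H_{44}^{-1}H_{14}\right)}_{\text{new } \bar{G}_{14}} w_4 + e_1 \underbrace{-\ H_{44}^{-1}H_{14}H_{43}}_{\text{new } \bar{H}_{13}}\, e_3,$$

so removing the confounder $e_4$ indeed creates a new term $\bar{H}_{13}e_3$, i.e. the new confounding variable for $w_3\rightarrow w_1$ of Figure 5(b); the second step absorbs that term into $\bar{G}_{13}$ in the same way.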

In the next subsection it will be investigated under which conditions our target module will remain invariant under the above transformation to a representation (7) without confounding variables.

VI-B Module invariance result (Step 3)

The transformation of a network into the form (7), leading to the resulting identification setup of (8), involves two basic steps, each of which can lead to a change of dynamic modules in $\bar{G}$. These two steps are

  (a) Removing the non-measured signals in $w_{\mathcal{Z}}$ (immersion), and

  (b) Transforming the system's equations to a form in which there are no confounding variables for $w_{\mathcal{U}}\rightarrow w_{\mathcal{Y}}$.

Module invariance in step (a) is covered by the following Condition:

Condition 1 (parallel path and loop condition [23])

Let $G_{ji}$ be the target network module to be identified. In the original network (6):

  • Every path from $w_i$ to $w_j$, excluding the path through $G_{ji}$, passes through a node $w_k$, $k\in\mathcal{D}$, and

  • Every loop through $w_j$ passes through a node $w_k$, $k\in\mathcal{D}$. $\Box$

This condition has been introduced in [23] for a MISO identification setup, to guarantee that when immersing (removing) nonmeasured node signals from the network, the target module will remain invariant. As an alternative, more generalized notions of network abstractions have been developed for this purpose in [30]. Condition 1 will be used to guarantee module invariance under step (a).

Step (b) above is a new step, and requires studying module invariance in the step that transforms a network from an original format where all nodes are measured into a structure that complies with (7), i.e. with absence of confounding variables for $w_{\mathcal{U}}\rightarrow w_{\mathcal{Y}}$.

We are going to tackle this problem by decomposing the set $\mathcal{U}$ into two disjoint sets $\mathcal{U} = \mathcal{A}\cup\mathcal{B}$, aiming at the situation that in the transformed network the modules $G_{\mathcal{Y}\mathcal{A}}$ stay invariant, while for the modules $G_{\mathcal{Y}\mathcal{B}}$ we accept that the transformation can lead to module changes. We construct $\mathcal{A}$ by choosing signals $w_k\in w_{\mathcal{U}}$ such that in the original network there are no confounding variables for the estimation problem $w_{\mathcal{A}}\rightarrow w_{\mathcal{Y}}$. For the selection of $\mathcal{B}$, we do allow confounding variables for the estimation problem $w_{\mathcal{B}}\rightarrow w_{\mathcal{Y}}$. By requiring a particular "disconnection" between the sets $\mathcal{A}$ and $\mathcal{B}$, we can then still guarantee that the modules $G_{\mathcal{Y}\mathcal{A}}$ stay invariant.

The following condition addresses the major requirement of step (b).

Condition 2

$\mathcal{U}$ is decomposed into two disjoint sets, $\mathcal{U} = \mathcal{A}\cup\mathcal{B}$ (see Figure 4), such that in the original network (6) there are no confounding variables for the estimation problems $w_{\mathcal{A}}\rightarrow w_{\mathcal{Y}}$ and $w_{\mathcal{A}}\rightarrow w_{\mathcal{B}}$. $\Box$

Condition 2 is not a restriction on $\mathcal{U}$, as such a decomposition can always be made, e.g. by taking $\mathcal{A} = \emptyset$ and $\mathcal{B} = \mathcal{U}$. The flexibility in choosing this decomposition will be instrumental in the sequel of this paper.

Example 3 (Example 2 continued)

In the example network depicted in Figure 5, we observe that in the original network there is a confounding variable for $w_4\rightarrow w_1$. However, in the step towards creating a network without confounding variables for $w_{\mathcal{U}}\rightarrow w_{\mathcal{Y}}$, an intermediate step occurs where there is also a confounding variable for $w_3\rightarrow w_1$, as depicted in Figure 5(b). For $\mathcal{U} = \{2,3,4\}$ the choice $\mathcal{A} = \{2,3\}$, $\mathcal{B} = \{4\}$ is not valid, since there exists a confounding variable ($e_3$) for $w_3\rightarrow w_4$, which violates the second condition that there should be no confounding variables for $w_{\mathcal{A}}\rightarrow w_{\mathcal{B}}$. Therefore the appropriate choice satisfying Condition 2 is $\mathcal{A} = \{2\}$ and $\mathcal{B} = \{3,4\}$. Note that this matches the situation that in the transformed network (Figure 5(c)) the module $G_{\mathcal{Y}\mathcal{A}}$ remains invariant, and the modules $G_{\mathcal{Y}\mathcal{B}}$ get changed. $\Box$

We can now formulate the module invariance result.

Theorem 1 (Module invariance result)

Let $G_{ji}$ be the target network module. In the transformed system's equation (8), it holds that $\bar{G}_{ji} = G_{ji}^{0}$ under the following conditions:

  1. The parallel path and loop Condition 1 is satisfied, and

  2. The following three conditions are satisfied:

    • a. $\mathcal{U}$ is decomposed in $\mathcal{A}$ and $\mathcal{B}$, satisfying Condition 2, and

    • b. $i\in\{\mathcal{A}\cup\mathcal{Q}\}$, and

    • c. Every path from $\{w_i, w_j\}$ to $w_{\mathcal{B}}$ passes through a measured node in $w_{\mathcal{L}\backslash\mathcal{Z}}$.

Proof: See appendix.

A more detailed illustration of the conditions in the theorem will be deferred to three different algorithms for selecting the node signals, to be presented in Sections VII-IX. We will first develop the identification results for the general case.

VI-C Identification results (Step 4)

If the conditions of Theorem 1 are satisfied, then the target module $\bar{G}_{ji} = G_{ji}^{0}$ can be identified on the basis of the system's equation (8). For this equation we can set up a predictor model with inputs $w_{\mathcal{D}}$ and outputs $w_{\mathcal{Y}}$, for the estimation of $\bar{G}$ and $\bar{H}$. This will be based on a parameterized model set determined by

$$\mathcal{M} := \left\{(\bar{G}(\theta), \bar{H}(\theta), \bar{\Lambda}(\theta)),\ \theta\in\Theta\right\},$$

while the actual data generating system is represented by $\mathcal{S} = (\bar{G}(\theta^0), \bar{H}(\theta^0), \bar{\Lambda}(\theta^0))$. The corresponding identification problem is defined by considering the one-step-ahead prediction of $w_{\mathcal{Y}}$ in the parametrized model, according to $\hat{w}_{\mathcal{Y}}(t|t-1;\theta) := \mathbb{E}\{w_{\mathcal{Y}}(t)\,|\,w_{\mathcal{Y}}^{t-1}, w_{\mathcal{D}}^{t};\theta\}$, where $w_{\mathcal{D}}^{t}$ denotes the past of $w_{\mathcal{D}}$, i.e. $\{w_{\mathcal{D}}(k), k\leq t\}$. The resulting prediction error becomes $\varepsilon(t,\theta) := w_{\mathcal{Y}}(t) - \hat{w}_{\mathcal{Y}}(t|t-1;\theta)$, leading to

$$\varepsilon(t,\theta) = \bar{H}(q,\theta)^{-1}\left[w_{\mathcal{Y}}(t) - \bar{G}(q,\theta)\, w_{\mathcal{D}}(t)\right], \qquad (9)$$

and the weighted least squares identification criterion

$$\hat{\theta}_N = \arg\min_{\theta} \frac{1}{N}\sum_{t=0}^{N-1} \varepsilon^{T}(t,\theta)\, W\, \varepsilon(t,\theta), \qquad (10)$$

with $W$ any positive definite weighting matrix. This parameter estimate then leads to an estimated subnetwork $\bar{G}_{\mathcal{Y}\mathcal{D}}(q,\hat{\theta}_N)$ and noise model $\bar{H}(q,\hat{\theta}_N)$, for which consistency and minimum variance results will be formulated next.
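A minimal numerical sketch of (9)-(10) follows, under the (hypothetical) assumptions that the modules and the monic inverse noise filter $\bar{H}^{-1}(q,\theta)$ are parametrized as FIR filters; a full prediction error method would wrap this criterion in a numerical optimizer over $\theta$.

```python
import numpy as np

def prediction_errors(wY, wD, Gbar, Hbar_inv):
    """eps(t) = Hbar^{-1}(q)[ wY(t) - Gbar(q) wD(t) ]  (cf. eq. (9)), for FIR
    parametrizations: Gbar[d-1] acts on wD(t-d) (d >= 1, strictly delayed
    modules) and Hbar_inv[d] acts on the residual at lag d, with the lag-0
    term of the monic inverse noise filter implicitly the identity."""
    N = wY.shape[0]
    resid = wY.copy()
    for d, Gd in enumerate(Gbar, start=1):          # wY(t) - sum_d Gd wD(t-d)
        resid[d:] -= wD[:N - d] @ Gd.T
    eps = resid.copy()
    for d, Hd in enumerate(Hbar_inv[1:], start=1):  # apply monic Hbar^{-1}(q)
        eps[d:] += resid[:N - d] @ Hd.T
    return eps

def criterion(eps, W):
    """Weighted least-squares criterion (10): (1/N) sum_t eps^T W eps."""
    return np.mean(np.einsum('ti,ij,tj->t', eps, W, eps))
```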

Theorem 2 (Consistency)

Consider a dynamic network represented by (7), and a related (MIMO) network identification setup with predictor inputs $w_{\mathcal{D}}$ and predicted outputs $w_{\mathcal{Y}}$, according to (8). Let $\mathcal{F}_n\subseteq\mathcal{U}$ be the set of node signals $k$ for which $\xi_k$ is statically uncorrelated with $\xi_{\mathcal{Y}}$ (i.e. $\mathbb{E}[\xi_k(t)\xi_{\mathcal{Y}}(t)] = 0$), and let $\mathcal{F} := \mathcal{U}\backslash\mathcal{F}_n$. Then a direct prediction error identification method according to (9)-(10), applied to a parametrized model set $\mathcal{M}$, will provide consistent estimates of $\bar{G}$ and $\bar{H}$ if:

  a. $\mathcal{M}$ is chosen to satisfy $\mathcal{S}\in\mathcal{M}$;

  b. $\Phi_{\kappa}(\omega) > 0$ for a sufficiently high number of frequencies, where $\kappa(t) := \begin{bmatrix} w_{\mathcal{D}}^{\top}(t) & \xi_{\mathcal{Q}}^{\top}(t) & w_o(t) \end{bmatrix}^{\top}$ (data-informativity condition);

  c. The following paths/loops should have at least a delay:

    • All paths/loops from $w_{\mathcal{Y}\cup\mathcal{F}}$ to $w_{\mathcal{Y}}$ in the network (8) and in its parametrized model; and

    • For every $w_k\in\mathcal{F}_n$, all paths from $w_{\mathcal{Y}\cup\mathcal{F}}$ to $w_k$ in the network (8), or all paths from $w_k$ to $w_{\mathcal{Y}}$ in the parametrized model

    (delay in path/loop condition).

Proof: See appendix.

The consistency theorem has a structure that corresponds to the classical result of the direct prediction error identification method applied to a closed-loop experimental setup [21]: a system-in-the-model-set condition (a), an informativity condition on the measured data (b), and a loop delay condition (c). Note however that conditions (b) and (c) are generalized versions of those for the typical closed-loop case [21, 15], and are dedicated to the considered network setup.

It is important to note that Theorem 2 is formulated in terms of conditions on the network in (7), which we refer to as the transformed network. However, it is essential to formulate the conditions in terms of properties of signals in the original network, represented by (6).

Proposition 2

If, in the original network, $\mathcal{U}$ is decomposed in two disjoint sets $\mathcal{A}$ and $\mathcal{B}$ satisfying Condition 2, then Condition c of Theorem 2 can be reformulated as:

  c. The following paths/loops should have at least a delay:

    • All paths/loops from $w_{\mathcal{Y}\cup\mathcal{B}}$ to $w_{\mathcal{Y}}$ in the original network (6) and in the parametrized model; and

    • For every $w_k\in\mathcal{A}$, all paths from $w_{\mathcal{Y}\cup\mathcal{B}}$ to $w_k$ in the network (6), or all paths from $w_k$ to $w_{\mathcal{Y}}$ in the parametrized model.

Proof: See appendix.

Condition (b) of Theorem 2 requires that there is enough excitation present in the node signals, which actually reflects a type of identifiability property [13]. Note that this excitation condition may require that external excitation signals are present at some locations, see also [15, 31, 32, 33, 14, 34], and [35], where it is shown that $\dim(r)\geq|\mathcal{Q}|$, with $|\mathcal{Q}|$ the cardinality of $\mathcal{Q}$. Since we are using a direct method for identification, the excitation signals $r$ are not directly used in the predictor model, although they serve the purpose of providing excitation in the network. A first result of a generalized method where, besides the node signals $w$, also the signals $r$ are included in the predictor inputs, is presented in [36].

Since in the result of Theorem 2 we arrive at white innovation signals, the result can be extended to formulate Maximum Likelihood properties of the estimate.

Theorem 3

Consider the situation of Theorem 2, and let the conditions for consistency be satisfied. Let $\xi_{\mathcal{Y}}$ be normally distributed, and let $\bar{\Lambda}(\theta)$ be parametrized independently of $\bar{G}(\theta)$ and $\bar{H}(\theta)$. Then, under zero initial conditions, the Maximum Likelihood estimate of $\theta^0$ is

$$\hat{\theta}_N^{ML} = \arg\min_{\theta} \det\left(\frac{1}{N}\sum_{t=1}^{N} \varepsilon(t,\theta)\,\varepsilon^{T}(t,\theta)\right) \qquad (11)$$

$$\Lambda(\hat{\theta}_N^{ML}) = \frac{1}{N}\sum_{t=1}^{N} \varepsilon(t,\hat{\theta}_N^{ML})\,\varepsilon^{T}(t,\hat{\theta}_N^{ML}). \qquad (12)$$

Proof: Can be shown by following reasoning similar to that in Theorem 1 of [8]. $\Box$
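In terms of the prediction-error sketch given after (10), replacing the weighted quadratic criterion by the determinant criterion (11) is a small change; a hedged sketch:

```python
import numpy as np

def ml_criterion(eps):
    """Determinant criterion (11) for a residual matrix eps of shape (N, p):
    returns det( (1/N) sum_t eps(t) eps(t)^T ) together with the estimate (12)
    of Lambda. Minimizing the determinant over theta (with any numerical
    optimizer) gives the ML estimate under Gaussian innovations."""
    Lam = (eps.T @ eps) / eps.shape[0]
    return np.linalg.det(Lam), Lam
```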

So far, we have analysed the situation for given sets of node signals $w_{\mathcal{Q}}$, $w_o$, $w_{\mathcal{A}}$, $w_{\mathcal{B}}$ and $w_{\mathcal{Z}}$. The presented results are very general and allow for different algorithms to select the appropriate signals and specify the particular signal sets that will guarantee target module invariance and consistent, minimum variance module estimates with the presented local direct method. In the next sections we will focus on formulating guidelines for the selection of these sets, such that the target module invariance property holds, as formulated in Theorem 1. For formulating these conditions, we will consider three different situations with respect to the availability of measured node signals.

  (a) In the Full input case, we assume that all in-neighbors of the predicted output signals are measured and used as predictor inputs;

  (b) In the Minimum input case, we include the smallest possible number of node signals to be measured for arriving at our objective;

  (c) In the User selection case, we formulate our results for an a priori given set of measured node signals.

VII Algorithm for signal selection: full input case

The first algorithm to be presented is based on the strategy that for any node signal that is selected as an output, we have access to all of its $w$-in-neighbors, which are to be included as predictor inputs. This strategy will lead to an identification setup with a maximum use of measured node signals that contain information relevant for modeling our target module $G_{ji}$. The following strategy will be followed:

  • We start by selecting $i\in\mathcal{D}$ and $j\in\mathcal{Y}$;

  • Then we extend $\mathcal{D}$ in such a way that all $w$-in-neighbors of $w_{\mathcal{Y}}$ are included in $w_{\mathcal{D}}$;

  • All node signals in $w_{\mathcal{D}}$ that have noise terms $v_k$, $k\in\mathcal{D}$, that are correlated with any $v_{\ell}$, $\ell\in\mathcal{Y}$ (direct confounding variables for $w_{\mathcal{D}}\rightarrow w_{\mathcal{Y}}$), are included in $\mathcal{Y}$ too. They become elements of $\mathcal{Q}$;

  • With $\mathcal{A} := \mathcal{D}\backslash\mathcal{Q}$ it follows that by construction there are no direct confounding variables for the estimation problem $w_{\mathcal{A}}\rightarrow w_{\mathcal{Y}}$;

  • Then we choose $w_{\mathcal{B}}$ as a subset of the nodes that are neither in $w_{\mathcal{Y}}$ nor in $w_{\mathcal{A}}$. This set needs to be introduced to block the indirect confounding variables for the estimation problem $w_{\mathcal{A}}\rightarrow w_{\mathcal{Y}}$, and will be chosen to satisfy Conditions 2a and 2c of Theorem 1;

  • Every node signal $w_k$, $k\in\mathcal{A}$, for which there are only indirect confounding variables that cannot be blocked by a node in $w_{\mathcal{B}}$, is

    • moved to $\mathcal{B}$ if Conditions 2a and 2c of Theorem 1 are satisfied and $k\neq i$; or (else)

    • included in $\mathcal{Y}$ and moved to $\mathcal{Q}$;

  • Finally, we define the identification setup as the estimation problem $w_{\mathcal{D}}\rightarrow w_{\mathcal{Y}}$, with $\mathcal{D} = \mathcal{Q}\cup\mathcal{A}\cup\mathcal{B}$ and $\mathcal{Y} = \mathcal{Q}\cup\{o\}$.

Note that because all $w$-in-neighbors of $w_{\mathcal{Y}}$ are included in $w_{\mathcal{D}}$, we automatically satisfy the parallel path and loop Condition 1. In order for the selection of node signals $w_{\mathcal{B}}$ to satisfy the conditions of Theorem 1, we specify the following Property 1.

Property 1

Let the node signals $w_{\mathcal{B}}$ be chosen to satisfy the following properties:

  1. If, in the original network, there are no confounding variables for the estimation problem $w_{\mathcal{A}} \rightarrow w_{\mathcal{Y}}$, then $\mathcal{B}$ is void, implying that $w_{\mathcal{B}}$ is not present;

  2. If, in the original network, there are confounding variables for the estimation problem $w_{\mathcal{A}} \rightarrow w_{\mathcal{Y}}$, then all of the following conditions need to be satisfied:

    a. For any confounding variable for the estimation problem $w_{\mathcal{A}} \rightarrow w_{\mathcal{Y}}$, the unmeasured paths from the confounding variable to node signals $w_{\mathcal{A}}$ pass through a node in $w_{\mathcal{B}}$;

    b. There are no confounding variables for the estimation problem $w_{\mathcal{A}} \rightarrow w_{\mathcal{B}}$;

    c. Every path from $\{w_i, w_j\}$ to $w_{\mathcal{B}}$ passes through a measured node in $w_{\mathcal{L}\backslash\mathcal{Z}}$. $\Box$

Property 2a) ensures that, after including $w_{\mathcal{B}}$ in the set of measured signals, there are no indirect confounding variables for the estimation problem $w_{\mathcal{A}} \rightarrow w_{\mathcal{Y}}$, while Property 2b) guarantees that there are no confounding variables for the estimation problem $w_{\mathcal{A}} \rightarrow w_{\mathcal{B}}$. Together these satisfy Condition 2a) of Theorem 1. Property 2c) guarantees that Condition 2c) of Theorem 1 is satisfied. Finally, as per the algorithm, $w_i$ can end up either in $w_{\mathcal{A}}$ or in $w_{\mathcal{Q}}$. Therefore, at the end of the algorithm we obtain sets of signals that satisfy the conditions of Theorem 1 for target module invariance.
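Property 2a) can be made operational with a simple path-blocking test. The following is a minimal sketch under our own conventions (nodes as integers, an edge set of pairs (from, to)); the helper name `blocks` and its interface are illustrative and not taken from the paper.

```python
def blocks(source, targets, B, measured, edges):
    """Check Property 2a for one confounding variable: True iff every path
    from `source` (the node where the confounder enters the network) to a
    node in `targets` passes through B or through another measured node."""
    if source in B or source in measured:
        return True
    stack, seen = [source], {source}
    while stack:
        node = stack.pop()
        for (k, l) in edges:
            if k != node or l in seen:
                continue
            if l in targets:
                return False       # an unblocked, unmeasured path remains
            if l not in B and l not in measured:
                seen.add(l)        # keep exploring through unmeasured nodes
                stack.append(l)
    return True
```

In Example 4 below, both $w_7$ and $w_8$ block the unmeasured path from $e_8$ to $w_5$ in this sense; Properties 2b) and 2c) then rule out $w_7$.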

Figure 6: Example network with $v_1$ dynamically correlated with $v_2$ and $v_8$ (red), $v_4$ dynamically correlated with $v_6$ (green), and $v_5$ dynamically correlated with $v_7$ (blue).
Example 4

Consider the network in Figure 6, where $G_{12}$ is the target module that we want to identify. We select the signals according to the algorithm presented in this section. First we include the input of the target module, $w_2$, in $w_{\mathcal{D}}$ and the output of the target module, $w_1$, in $w_{\mathcal{Y}}$. Next we include all $w$-in-neighbors of $w_{\mathcal{Y}}$ (i.e., $w_3$ and $w_4$) in $w_{\mathcal{D}}$. All node signals in $w_{\mathcal{D}}$ that have noise terms $v_k$, $k \in \mathcal{D}$, correlated with any $v_\ell$, $\ell \in \mathcal{Y}$, need to be included in $\mathcal{Y}$ too. This concerns $w_2$, since $v_1$ is correlated with $v_2$. Now $w_{\mathcal{Y}} = \{w_1, w_2\}$ has changed, and we need to include the $w$-in-neighbor of $w_2$, which is $w_5$, in $w_{\mathcal{D}}$, leading to $w_{\mathcal{D}} = \{w_2, w_3, w_4, w_5\}$. After a check we can conclude that all node signals in $w_{\mathcal{D}}$ whose noise terms are correlated with any $v_\ell$, $\ell \in \mathcal{Y}$, are now included in $\mathcal{Y}$. The result becomes

$$\mathcal{Y} = \{1, 2\}; \qquad \mathcal{D} = \{2, 3, 4, 5\} \qquad (13)$$
$$\mathcal{Q} = \mathcal{Y} \cap \mathcal{D} = \{2\}; \qquad \mathcal{A} = \mathcal{D} \backslash \mathcal{Q} = \{3, 4, 5\}. \qquad (14)$$

Since $v_8$ is dynamically correlated with $v_1$, in the resulting situation we have a confounding variable for the estimation problem $w_5 \rightarrow w_1$ (i.e., $w_{\mathcal{A}} \rightarrow w_{\mathcal{Y}}$). As per Condition 2a of Property 1, the path from the confounding variable $e_8$ to $w_5$ should be blocked by a node signal in $w_{\mathcal{B}}$, which can be either $w_7$ or $w_8$. $w_7$ cannot be chosen in $w_{\mathcal{B}}$, since this would create a confounding variable for $w_{\mathcal{A}} \rightarrow w_{\mathcal{B}}$ (i.e., $w_5 \rightarrow w_7$). Moreover, $w_7 \in w_{\mathcal{B}}$ would also create an unmeasured path $w_i \rightarrow w_7$ with $w_i = w_2$, thereby violating Condition 2c of Property 1. When $w_8$ is chosen in $w_{\mathcal{B}}$, the conditions of Property 1 are satisfied, and hence we choose $\mathcal{B} = \{8\}$. The resulting estimation problem is $(w_2, w_3, w_4, w_5, w_8) \rightarrow (w_1, w_2)$, which according to Theorem 2 provides a consistent and maximum likelihood estimate of $G_{12}$.
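The set manipulations in this example can be traced with a few lines of code. The following sketch encodes the topology and noise correlation structure of Figure 6 as we read them off from Examples 4–6 (the edge list is a partial reconstruction; edges not needed for this computation, such as the path from $w_2$ to $w_7$, are omitted), and runs the first phase of the full input algorithm: extending $\mathcal{D}$ with in-neighbors of $\mathcal{Y}$ and moving directly confounded inputs to $\mathcal{Q}$. Helper names are our own.

```python
# Figure 6, partially reconstructed: (k, l) denotes a module w_k -> w_l.
edges = {(2, 1), (3, 1), (4, 1), (2, 3), (3, 4), (5, 2), (6, 4), (7, 5), (8, 7)}
corr = {frozenset(p) for p in [(1, 2), (1, 8), (4, 6), (5, 7)]}  # correlated v's

def in_neighbors(node):
    return {k for (k, l) in edges if l == node}

def full_input_phase1(i, j):
    """First phase of the Section VII algorithm: grow D and Y until only
    indirect confounding variables remain to be handled."""
    D, Y = {i}, {j}
    while True:
        for y in set(Y):
            D |= in_neighbors(y)               # all w-in-neighbors of Y into D
        moved = {k for k in D if any(frozenset({k, y}) in corr for y in Y)}
        if moved <= Y:                         # no new direct confounders
            Q = D & Y
            return Y, D, Q, D - Q              # returns Y, D, Q, A
        Y |= moved                             # directly confounded: to outputs

Y, D, Q, A = full_input_phase1(i=2, j=1)
# Y = {1, 2}, D = {2, 3, 4, 5}, Q = {2}, A = {3, 4, 5}, matching (13)-(14).
```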

VIII Algorithm for signal selection: minimum input case

Rather than measuring all node signals that are $w$-in-neighbors of the output $w_j$ of our target module $G_{ji}$, we now focus on an identification setup that uses a minimum number of measured node signals, according to the following strategy:

  • We start by selecting $i \in \mathcal{D}$ and $j \in \mathcal{Y}$;

  • Then we extend $\mathcal{D}$ with a minimum number of node signals such that the parallel path and loop Condition 1 is satisfied;

  • Every node signal $w_k$ in $w_{\mathcal{D}}$ for which there is a direct or indirect confounding variable for the estimation problem $w_k \rightarrow w_{\mathcal{Y}}$ is included in $\mathcal{Y}$ and $\mathcal{Q}$;

  • With $\mathcal{A} := \mathcal{D} \backslash \mathcal{Q}$ and $\mathcal{B} = \emptyset$ it follows that, by construction, there are no confounding variables for the estimation problem $w_{\mathcal{A}} \rightarrow w_{\mathcal{Y}}$;

  • Finally, we define the identification setup as the estimation problem $w_{\mathcal{D}} \rightarrow w_{\mathcal{Y}}$, with $\mathcal{D} = \mathcal{Q} \cup \mathcal{A}$.

As we can observe, this algorithm does not require the selection of a set $\mathcal{B}$. This is attributed to the way we handle the indirect confounding variables for the estimation problem $w_{\mathcal{A}} \rightarrow w_{\mathcal{Y}}$. Instead of tackling these confounding variables by adding blocking node signals $w_{\mathcal{B}}$ as predictor inputs (as in the full input case), we deal with them by moving the concerned $w_k$, $k \in \mathcal{A}$, to $w_{\mathcal{Q}}$, and thus to the set of predicted outputs. We choose this approach in order to minimize the required number of measured node signals. In this way, by construction, there are no direct or indirect confounding variables for the estimation problem $w_{\mathcal{A}} \rightarrow w_{\mathcal{Y}}$. From this result, we can guarantee that the conditions of Theorem 1 are satisfied, since $\mathcal{B} = \emptyset$. Thus, at the end of the algorithm we obtain a set of signals that provides target module invariance.

Example 5

Consider the same network as in Example 4, represented by Figure 6. Applying the algorithm of this section, we first include the input of the target module, $w_2$, in $w_{\mathcal{D}}$ and the output of the target module, $w_1$, in $w_{\mathcal{Y}}$. There exist two parallel paths from $w_2$ to $w_1$, namely $w_2 \rightarrow w_3 \rightarrow w_1$ and $w_2 \rightarrow w_3 \rightarrow w_4 \rightarrow w_1$, and no loops through $w_1$. In order to satisfy Condition 1, we can include either $w_3$ in $\mathcal{D}$, such that $\mathcal{D} = \{2, 3\}$, or both $w_3$ and $w_4$ in $\mathcal{D}$, such that $\mathcal{D} = \{2, 3, 4\}$. We choose the former to arrive at the minimum number of node signals. Because of the correlation between $v_2$ and $v_1$, there is a confounding variable for the estimation problem $w_2 \rightarrow w_1$. According to Step 3 of the algorithm, $w_2$ is then moved to $\mathcal{Y}$ and $\mathcal{Q}$, leading to $w_{\mathcal{Y}} = \{w_1, w_2\}$. Because of this change of $\mathcal{Y}$, we have to recheck for the presence of confounding variables; however, this change does not introduce any additional ones. The resulting estimation problem is $(w_2, w_3) \rightarrow (w_1, w_2)$, with $w_{\mathcal{A}} = \{w_3\}$, $w_{\mathcal{B}} = \emptyset$, $w_{\mathcal{Q}} = \{w_2\}$ and $w_{\mathcal{Y}} = (w_1, w_2)$. $\Box$

In comparison with the full input case, the algorithm in this section will typically lead to a larger number of predicted output nodes and a smaller number of predictor inputs. This implies a stronger emphasis on estimating a (multivariate) noise model $\bar{H}$. Given the choice of the direct identification method, and the choice of signals to satisfy the parallel path and loop condition, this algorithm indeed adds the smallest number of additional signals to be measured, as the removal of any of the additional signals leads to conflicts with the required conditions.
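The parallel path and loop Condition 1 can likewise be checked mechanically. The sketch below assumes, in line with its use in Examples 4 and 5, that the condition requires every parallel path from $w_i$ to $w_j$, and every loop through $w_j$, to pass through a node in $w_{\mathcal{D}}$; the function names and the edge set (repeated from the earlier sketch) are again our own illustration.

```python
edges = {(2, 1), (3, 1), (4, 1), (2, 3), (3, 4), (5, 2), (6, 4), (7, 5), (8, 7)}

def path_exists(src, dst, blocked, edge_set):
    """DFS: is there a path src -> ... -> dst avoiding `blocked` nodes?"""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        for (k, l) in edge_set:
            if k != node or l in seen:
                continue
            if l == dst:
                return True
            if l not in blocked:
                seen.add(l)
                stack.append(l)
    return False

def condition1_holds(i, j, D):
    parallel = path_exists(i, j, D - {i}, edges - {(i, j)})  # non-direct i -> j
    loop = path_exists(j, j, D, edges)                       # loop through w_j
    return not (parallel or loop)

# Example 5: condition1_holds(2, 1, {2, 3}) is True, while with D = {2}
# alone the parallel path w2 -> w3 -> w1 makes the check fail.
```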

IX Algorithm for signal selection: user selection case

Next we focus on the situation where we have a prior given set of nodes that we have access to, i.e., a set of nodes that can (possibly) be measured. We refer to these nodes as accessible nodes, while the remaining nodes are called inaccessible. This strategy differs from the full input case, since we do not assume to have access to all in-neighbors of $w_{\mathcal{Y}}$. It leads to an identification setup that uses accessible node signals containing information relevant for modeling our target module $G_{ji}$. We consider the situation that the nodes $w_i$ and $w_j$ are accessible, and that there are accessible nodes that satisfy the parallel path and loop Condition 1. The following strategy is followed:

  1. We start by selecting $i \in \mathcal{D}$ and $j \in \mathcal{Y}$;

  2. Then we extend $\mathcal{D}$ to satisfy the parallel path and loop Condition 1;

  3. We include in $\mathcal{D}$ all accessible $w$-in-neighbors of $\mathcal{Y}$;

  4. We extend $\mathcal{D}$ in such a way that for every non-accessible $w$-in-neighbor $w_k$ of $w_{\mathcal{Y}}$ we include all accessible nodes that have a path to $w_k$ running through non-accessible nodes only;

  5. If there is a direct confounding variable for $w_i \rightarrow w_{\mathcal{Y}}$, or an indirect one with a path to $w_i$ that does not pass through any accessible node, then $i$ is included in $\mathcal{Y}$ and $\mathcal{Q}$;

  6. A node signal $w_k$, $k \in \mathcal{D}$, is included in $\mathcal{A}$ if there are either no confounding variables for $w_k \rightarrow w_{\mathcal{Y}}$, or only indirect confounding variables whose paths to $w_k$ pass through accessible nodes;

  7. Every node signal $w_k$, $k \in \mathcal{D} \backslash \{i\}$, that has a direct confounding variable for $w_k \rightarrow w_{\mathcal{Y}}$, or an indirect confounding variable with a path to $w_k$ that does not pass through any accessible node, is

    • included in $\mathcal{B}$ if Conditions 2a and 2c of Theorem 1 are satisfied upon including it in $w_{\mathcal{B}}$; (else)

    • included in $\mathcal{Y}$ and $\mathcal{Q}$; return to Step 3;

  8. Every node signal $w_k$, $k \in \mathcal{A}$, for which there are only indirect confounding variables as meant in Step 6, is

    • moved to $\mathcal{B}$ if Conditions 2a and 2c of Theorem 1 are satisfied and $k \neq i$; (else)

    • kept in $\mathcal{A}$, while a set of accessible nodes that blocks the paths of the confounding variable is added to $\mathcal{B} \cup \mathcal{A}$, provided Conditions 2a and 2c of Theorem 1 are satisfied; (else)

    • included in $\mathcal{Y}$ and $\mathcal{Q}$;

  9. By construction, there are now no confounding variables for $w_{\mathcal{A}} \rightarrow w_{\mathcal{Y}}$.

In the algorithm above, the prime reasoning is to deal with confounding variables for $w_{\mathcal{A}} \rightarrow w_{\mathcal{Y}}$. Direct confounding variables lead to including the respective node in the outputs $\mathcal{Y}$, or to shifting the respective input node to $\mathcal{B}$. Indirect confounding variables are treated by either shifting the input node to $\mathcal{B}$, or, if its effect can be blocked, by adding an accessible node to the inputs in $\mathcal{B}$, or, if the blocking conditions cannot be satisfied, by including the node in the outputs $\mathcal{Y}$. Note that the algorithm always provides a solution if Condition 1 of Theorem 1 (the parallel path and loop condition) can be satisfied.

Figure 7: Example network of Figure 6 with accessible nodes $w_1$, $w_2$, $w_3$, $w_6$ indicated in yellow.
Example 6

Consider the same network as in Example 4, now represented by Figure 7, where we are given that only the nodes $w_1$, $w_2$, $w_3$ and $w_6$ are accessible. We select the signals according to the algorithm presented in this section. First we include $w_i = w_2$ in $w_{\mathcal{D}}$ and $w_j = w_1$ in $w_{\mathcal{Y}}$. Then we extend $\mathcal{D}$ such that the parallel path and loop Condition 1 is satisfied, which is done by selecting $\mathcal{D} = \{2, 3\}$. According to Step 4, we extend $\mathcal{D}$ with node $w_6$, as it is the nearest accessible in-neighbor of $w_4$, which in turn is an inaccessible in-neighbor of $w_1$. As per Step 5, since $v_1$ and $v_2$ are correlated, $w_2$ is moved to $\mathcal{Y}$ and $\mathcal{Q}$. As per Step 6, there are no confounding variables for the estimation problem $w_3 \rightarrow w_1$, and hence $w_3$ is included in $w_{\mathcal{A}}$. Since $v_4$ and $v_6$ are correlated, there is an indirect confounding variable for the estimation problem $w_6 \rightarrow w_1$, whose path, however, does not pass through an accessible node. Step 7 does not apply, since $w_3 \in w_{\mathcal{A}}$ has no confounding variables. Step 8 requires us to deal with the indirect confounding variable $v_4$ for $w_6 \rightarrow w_1$. Checking Conditions 2a and 2c of Theorem 1 for $\mathcal{A}$ and $\mathcal{B}$, it appears that every path from $w_i = w_2$ or from $w_j = w_1$ to $w_6$ passes through a measured node, and that there are no confounding variables for the estimation problem $w_{\mathcal{A}} \rightarrow w_6$. Hence we include $w_6$ in $w_{\mathcal{B}}$. As a result, the estimation problem is $(w_2, w_3, w_6) \rightarrow (w_1, w_2)$.
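Step 4 of this algorithm amounts to a backwards search through inaccessible nodes. A minimal sketch, with our own helper name and the same illustrative edge set as before:

```python
edges = {(2, 1), (3, 1), (4, 1), (2, 3), (3, 4), (5, 2), (6, 4), (7, 5), (8, 7)}

def accessible_sources(k, accessible):
    """All accessible nodes that reach w_k through inaccessible nodes only."""
    found, stack, seen = set(), [k], {k}
    while stack:
        node = stack.pop()
        for (src, dst) in edges:
            if dst != node or src in seen:
                continue
            seen.add(src)
            if src in accessible:
                found.add(src)        # accessible: record it and stop here
            else:
                stack.append(src)     # inaccessible: keep walking backwards
    return found

# Example 6: for the inaccessible in-neighbor w_4 of w_1,
# accessible_sources(4, {1, 2, 3, 6}) == {3, 6}; since w_3 is already
# in D, only w_6 is newly added.
```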

Remark 2

Rather than starting the signal selection problem from a fixed set of accessible nodes, the provided theory allows for an iterative and interactive algorithm for selecting accessible nodes in sensor allocation problems in a flexible way.

X Discussion

All three presented algorithms lead to a set of selected node signals that satisfies the conditions for target module invariance, and thus provide a predictor model in which no confounding variables can deteriorate the estimation of the target module. Only in the “user selection case” is this conditioned on the availability of appropriate node signals to satisfy the parallel path and loop condition. Under these circumstances the presented algorithms are sound and complete [37]. This attractive feasibility result is mainly attributed to the addition of predicted outputs, which adds the flexibility needed to solve the problem of confounding variables.

Note that the presented algorithms do not guarantee consistency of the estimated target module. For this to hold, the additional conditions for consistency, among which data-informativity and the delay in path/loop condition, need to be satisfied too, as illustrated in Figure 8. A specification of path-based conditions for data-informativity is beyond the scope of this paper, but first results on this problem are presented in [35]. Including such path-based conditions in the signal selection algorithms would be a natural next step, as would the development of data-driven techniques to estimate the correlation structure of the disturbances.

Figure 8: Consistency of the target module estimate requires the conditions of Theorem 2 to be satisfied, in addition to an appropriate predictor model.

It can be observed that the three algorithms presented in the previous sections rely only on graphical conditions on the network. This paves the way for automating the signal selection procedure with graph-based algorithms that scale to large dimensions, taking as inputs the topology of the network and the disturbance correlation structure, represented as adjacency matrices. It can also be observed that the three considered cases will most likely lead to three different experimental setups for estimating the single target module. For all three cases we can arrive at consistent and maximum likelihood estimates of the target module. However, because the experimental setups differ, the data-informativity conditions and the statistical properties of the target module estimates will differ as well. The minimum variance expressions, in the form of the related Cramér-Rao lower bounds, will typically be different for the different experimental setups. Comparing these bounds for different experimental setups is beyond the scope of the current paper and is considered a topic for future research.
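As an illustration of such an automated, graph-based implementation, the sketch below detects confounding variables for an estimation problem $w_{\mathcal{A}} \rightarrow w_{\mathcal{Y}}$ from two Boolean adjacency matrices: the network topology $A$ (with $A[l,k]=1$ iff there is a module $w_k \rightarrow w_l$) and the noise correlation structure $C$ (with $C[k,x]=1$ iff $v_k$ depends on the white noise source $e_x$). Following the reasoning in the proof of Lemma 1, a confounding variable is operationalized as a white noise source with simultaneous paths to both node sets that are direct or pass through unmeasured nodes in $\mathcal{Z}$ only; the matrix conventions and function names are our own.

```python
import numpy as np

def reach_through_Z(A, Z):
    """R[l, m] = 1 iff there is a path w_m -> ... -> w_l that starts in Z
    and whose intermediate nodes all lie in Z."""
    n = A.shape[0]
    AZ = A.copy()
    AZ[:, [m for m in range(n) if m not in Z]] = 0   # steps must depart from Z
    R = AZ
    for _ in range(n):                               # Boolean transitive closure
        R = ((R + R @ R) > 0).astype(int)
    return R

def confounding_variables(A, C, Z, Aset, Yset):
    """Source indices x such that e_x influences both w_Aset and w_Yset,
    directly or through Z-only paths."""
    influence = (C + reach_through_Z(A, Z) @ C) > 0
    return {x for x in range(C.shape[1])
            if influence[list(Aset), x].any() and influence[list(Yset), x].any()}
```

For the network of Figure 6 one would encode, e.g., the correlation of $v_1$ and $v_8$ through a shared source column in $C$; with $\mathcal{Z} = \{6, 7, 8\}$, that source is flagged as a confounder for $w_5 \rightarrow w_1$, reproducing the reasoning in Example 4.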
We have formulated identification criteria in the realm of classical prediction error methods. This will typically lead to complex non-convex optimization problems that scale poorly with the dimension (number of parameters) of the problem. However, alternative optimization approaches are becoming available that scale well, relying on regularized kernel-based methods that exploit developments originating from machine learning, see e.g. [18], and on relaxations based on sequential convex optimization, see e.g. [38, 39].

XI CONCLUSIONS

A new local module identification approach has been presented for identifying a local module in a dynamic network with given topology and with process noise that is correlated over the different nodes. For this case, it is shown that the problem can be solved by moving from a MISO to a MIMO identification setup, in which the target module is embedded in a MIMO problem with appropriately chosen inputs and outputs that warrant consistent estimation of the target module with maximum likelihood properties. The key part of the procedure is the handling of direct and indirect confounding variables, which are induced by correlated disturbances and/or non-measured node signals, and which thus depend essentially on the (Boolean) topology of the network and the (Boolean) correlation structure of the disturbances. A general theory has been developed that allows for the specification of different types of algorithms, of which the “full input case”, the “minimum input case” and the “user selection case” have been illustrated through examples. The presented theory is suitable for generalization to the estimation of sets of target modules.

Appendix A Proof of Proposition 1

Starting with the network representation (6), we can eliminate the non-measured node variables $w_{\mathcal{Z}}$ from the equations by rewriting the last (block) row of (6) into an explicit expression for $w_{\mathcal{Z}}$:

$$w_{\mathcal{Z}} = (I - G_{\mathcal{Z}\mathcal{Z}})^{-1}\Bigg[\sum_{k \in \mathcal{Q} \cup \{o\} \cup \mathcal{U}} G_{\mathcal{Z}k}\, w_k + \sum_{\ell \in \mathcal{Q} \cup \{o\} \cup \mathcal{U} \cup \mathcal{Z}} H_{\mathcal{Z}\ell}\, e_\ell\Bigg],$$

and by substituting this $w_{\mathcal{Z}}$ into the expressions for the remaining $w$-variables. As a result,

$$\begin{bmatrix} w_{\mathcal{Q}} \\ w_o \\ w_{\mathcal{U}} \end{bmatrix} = \begin{bmatrix} \breve{G}_{\mathcal{Q}\mathcal{Q}} & \breve{G}_{\mathcal{Q}o} & \breve{G}_{\mathcal{Q}\mathcal{U}} \\ \breve{G}_{o\mathcal{Q}} & \breve{G}_{oo} & \breve{G}_{o\mathcal{U}} \\ \breve{G}_{\mathcal{U}\mathcal{Q}} & \breve{G}_{\mathcal{U}o} & \breve{G}_{\mathcal{U}\mathcal{U}} \end{bmatrix} \begin{bmatrix} w_{\mathcal{Q}} \\ w_o \\ w_{\mathcal{U}} \end{bmatrix} + \breve{v},$$

$$\breve{v} = \breve{H} \begin{bmatrix} e_{\mathcal{Q}} \\ e_o \\ e_{\mathcal{U}} \\ e_{\mathcal{Z}} \end{bmatrix} = \begin{bmatrix} \breve{H}_{\mathcal{Q}\mathcal{Q}} & \breve{H}_{\mathcal{Q}o} & \breve{H}_{\mathcal{Q}\mathcal{U}} & \breve{H}_{\mathcal{Q}\mathcal{Z}} \\ \breve{H}_{o\mathcal{Q}} & \breve{H}_{oo} & \breve{H}_{o\mathcal{U}} & \breve{H}_{o\mathcal{Z}} \\ \breve{H}_{\mathcal{U}\mathcal{Q}} & \breve{H}_{\mathcal{U}o} & \breve{H}_{\mathcal{U}\mathcal{U}} & \breve{H}_{\mathcal{U}\mathcal{Z}} \end{bmatrix} \begin{bmatrix} e_{\mathcal{Q}} \\ e_o \\ e_{\mathcal{U}} \\ e_{\mathcal{Z}} \end{bmatrix} \qquad (15)$$

with $\mathrm{cov}(e) = I$, and where

$$\breve{G}_{kh} = G_{kh} + G_{k\mathcal{Z}}(I - G_{\mathcal{Z}\mathcal{Z}})^{-1} G_{\mathcal{Z}h} \qquad (16)$$

with $k, h \in \mathcal{Q} \cup \{o\} \cup \mathcal{U}$, and

$$\breve{H}_{k\ell} = H_{k\ell} + G_{k\mathcal{Z}}(I - G_{\mathcal{Z}\mathcal{Z}})^{-1} H_{\mathcal{Z}\ell}, \qquad (17)$$

with $\ell \in \mathcal{Q} \cup \{o\} \cup \mathcal{U} \cup \mathcal{Z}$.

On the basis of (15), the spectral density of $\breve{v}$ is given by $\Phi_{\breve{v}} = \breve{H}\breve{H}^*$. Applying a spectral factorization [40] to $\Phi_{\breve{v}}$ delivers $\Phi_{\breve{v}} = \tilde{H}\tilde{\Lambda}\tilde{H}^*$, with $\tilde{H}$ a monic, stable and minimum phase rational matrix, and $\tilde{\Lambda}$ a positive definite (constant) matrix. Then there exists a white noise process $\tilde{\xi}$, defined by $\tilde{\xi} := \tilde{H}^{-1}\breve{H}e$, such that $\tilde{H}\tilde{\xi} = \breve{v}$, with $\mathrm{cov}(\tilde{\xi}) = \tilde{\Lambda}$, while $\tilde{H}$ is of the form

$$\tilde{H} = \begin{bmatrix} \tilde{H}_{11} & \tilde{H}_{12} & \tilde{H}_{13} \\ \tilde{H}_{21} & \tilde{H}_{22} & \tilde{H}_{23} \\ \tilde{H}_{31} & \tilde{H}_{32} & \tilde{H}_{33} \end{bmatrix} \qquad (18)$$

where the block dimensions are conformable to the dimensions of $w_{\mathcal{Q}}$, $w_o$ and $w_{\mathcal{U}}$, respectively. As a result, (15) can be rewritten as

$$\begin{bmatrix} w_{\mathcal{Q}} \\ w_o \\ w_{\mathcal{U}} \end{bmatrix} = \begin{bmatrix} \breve{G}_{\mathcal{Q}\mathcal{Q}} & \breve{G}_{\mathcal{Q}o} & \breve{G}_{\mathcal{Q}\mathcal{U}} \\ \breve{G}_{o\mathcal{Q}} & \breve{G}_{oo} & \breve{G}_{o\mathcal{U}} \\ \breve{G}_{\mathcal{U}\mathcal{Q}} & \breve{G}_{\mathcal{U}o} & \breve{G}_{\mathcal{U}\mathcal{U}} \end{bmatrix} \begin{bmatrix} w_{\mathcal{Q}} \\ w_o \\ w_{\mathcal{U}} \end{bmatrix} + \tilde{H} \begin{bmatrix} \tilde{\xi}_{\mathcal{Q}} \\ \tilde{\xi}_o \\ \tilde{\xi}_{\mathcal{U}} \end{bmatrix}. \qquad (19)$$

By denoting

$$\begin{bmatrix} \check{H}_{13} \\ \check{H}_{23} \end{bmatrix} := \begin{bmatrix} \tilde{H}_{13}\tilde{H}_{33}^{-1} \\ \tilde{H}_{23}\tilde{H}_{33}^{-1} \end{bmatrix} \qquad (20)$$

and premultiplying (19) with

$$\begin{bmatrix} I & 0 & -\check{H}_{13} \\ 0 & I & -\check{H}_{23} \\ 0 & 0 & I \end{bmatrix} \qquad (21)$$

while keeping only the identity terms on the left hand side, we obtain an equivalent network equation:

$$\begin{bmatrix} w_{\mathcal{Q}} \\ w_o \\ w_{\mathcal{U}} \end{bmatrix} = \begin{bmatrix} \breve{G}'_{\mathcal{Q}\mathcal{Q}} & \breve{G}'_{\mathcal{Q}o} & \breve{G}'_{\mathcal{Q}\mathcal{U}} \\ \breve{G}'_{o\mathcal{Q}} & \breve{G}'_{oo} & \breve{G}'_{o\mathcal{U}} \\ \breve{G}_{\mathcal{U}\mathcal{Q}} & \breve{G}_{\mathcal{U}o} & \breve{G}_{\mathcal{U}\mathcal{U}} \end{bmatrix} \begin{bmatrix} w_{\mathcal{Q}} \\ w_o \\ w_{\mathcal{U}} \end{bmatrix} + \begin{bmatrix} \tilde{H}'_{11} & \tilde{H}'_{12} & 0 \\ \tilde{H}'_{21} & \tilde{H}'_{22} & 0 \\ \tilde{H}_{31} & \tilde{H}_{32} & \tilde{H}_{33} \end{bmatrix} \begin{bmatrix} \tilde{\xi}_{\mathcal{Q}} \\ \tilde{\xi}_o \\ \tilde{\xi}_{\mathcal{U}} \end{bmatrix}, \qquad (22)$$

with

$$\begin{aligned}
\breve{G}'_{\mathcal{Q}\mathcal{U}} &= \breve{G}_{\mathcal{Q}\mathcal{U}} - \check{H}_{13}\breve{G}_{\mathcal{U}\mathcal{U}} + \check{H}_{13} \qquad &(23)\\
\breve{G}'_{\mathcal{Q}\star} &= \breve{G}_{\mathcal{Q}\star} - \check{H}_{13}\breve{G}_{\mathcal{U}\star} \qquad &(24)\\
\breve{G}'_{o\star} &= \breve{G}_{o\star} - \check{H}_{23}\breve{G}_{\mathcal{U}\star} \qquad &(25)\\
\breve{G}'_{o\mathcal{U}} &= \breve{G}_{o\mathcal{U}} - \check{H}_{23}\breve{G}_{\mathcal{U}\mathcal{U}} + \check{H}_{23} \qquad &(26)\\
\tilde{H}'_{1\square} &= \tilde{H}_{1\square} - \check{H}_{13}\tilde{H}_{3\square} \qquad &(27)\\
\tilde{H}'_{2\square} &= \tilde{H}_{2\square} - \check{H}_{23}\tilde{H}_{3\square}, \qquad &(28)
\end{aligned}$$

where $\star \in \mathcal{Q} \cup \{o\}$ and $\square \in \{1, 2\}$.

The next step is to show that the block elements $\breve{G}'_{\mathcal{Q}o}$ and $\breve{G}'_{oo}$ in the $G$-matrix can be made 0. This can be done by variable substitution, as follows. The second row in (22) is replaced by an explicit expression for $w_o$, according to

$$w_o = (1 - \breve{G}'_{oo})^{-1}\big[\breve{G}'_{o\mathcal{Q}}\, w_{\mathcal{Q}} + \breve{G}'_{o\mathcal{U}}\, w_{\mathcal{U}} + \tilde{H}'_{21}\tilde{\xi}_{\mathcal{Q}} + \tilde{H}'_{22}\tilde{\xi}_o\big].$$

Additionally, this expression for $w_o$ is substituted into the first block row of (22), to remove the $w_o$-dependent term on the right hand side, leading to

$$\begin{bmatrix} w_{\mathcal{Q}} \\ w_o \\ w_{\mathcal{U}} \end{bmatrix} = \begin{bmatrix} \breve{G}''_{\mathcal{Q}\mathcal{Q}} & 0 & \breve{G}''_{\mathcal{Q}\mathcal{U}} \\ \bar{G}_{o\mathcal{Q}} & 0 & \bar{G}_{o\mathcal{U}} \\ \breve{G}_{\mathcal{U}\mathcal{Q}} & \breve{G}_{\mathcal{U}o} & \breve{G}_{\mathcal{U}\mathcal{U}} \end{bmatrix} \begin{bmatrix} w_{\mathcal{Q}} \\ w_o \\ w_{\mathcal{U}} \end{bmatrix} + \begin{bmatrix} \tilde{H}''_{11} & \tilde{H}''_{12} & 0 \\ \tilde{H}''_{21} & \tilde{H}''_{22} & 0 \\ \tilde{H}_{31} & \tilde{H}_{32} & \tilde{H}_{33} \end{bmatrix} \begin{bmatrix} \tilde{\xi}_{\mathcal{Q}} \\ \tilde{\xi}_o \\ \tilde{\xi}_{\mathcal{U}} \end{bmatrix} \qquad (29)$$

with

$$\begin{aligned}
\bar{G}_{o\star} &= (I - \breve{G}'_{oo})^{-1}\breve{G}'_{o\star} \qquad &(30)\\
\tilde{H}''_{2\star} &= (I - \breve{G}'_{oo})^{-1}\tilde{H}'_{2\star} \qquad &(31)\\
\breve{G}''_{\mathcal{Q}\star} &= \breve{G}'_{\mathcal{Q}\star} + \breve{G}'_{\mathcal{Q}o}\bar{G}_{o\star} \qquad &(32)\\
\tilde{H}''_{1\star} &= \tilde{H}'_{1\star} + \breve{G}'_{\mathcal{Q}o}\tilde{H}''_{2\star}. \qquad &(33)
\end{aligned}$$

Since, as a result of these operations, the matrix $\breve{G}''_{\mathcal{Q}\mathcal{Q}}$ might not be hollow, we move any diagonal terms of this matrix to the left hand side of the equation, and premultiply the first (block) equation by the diagonal matrix $(I - \mathrm{diag}(\breve{G}''_{\mathcal{Q}\mathcal{Q}}))^{-1}$, to obtain the expression

$$\begin{bmatrix} w_{\mathcal{Q}} \\ w_o \\ w_{\mathcal{U}} \end{bmatrix} = \begin{bmatrix} \bar{G}_{\mathcal{Q}\mathcal{Q}} & 0 & \bar{G}_{\mathcal{Q}\mathcal{U}} \\ \bar{G}_{o\mathcal{Q}} & 0 & \bar{G}_{o\mathcal{U}} \\ \breve{G}_{\mathcal{U}\mathcal{Q}} & \breve{G}_{\mathcal{U}o} & \breve{G}_{\mathcal{U}\mathcal{U}} \end{bmatrix} \begin{bmatrix} w_{\mathcal{Q}} \\ w_o \\ w_{\mathcal{U}} \end{bmatrix} + \begin{bmatrix} \tilde{H}'''_{11} & \tilde{H}'''_{12} & 0 \\ \tilde{H}''_{21} & \tilde{H}''_{22} & 0 \\ \tilde{H}_{31} & \tilde{H}_{32} & \tilde{H}_{33} \end{bmatrix} \begin{bmatrix} \tilde{\xi}_{\mathcal{Q}} \\ \tilde{\xi}_o \\ \tilde{\xi}_{\mathcal{U}} \end{bmatrix} \qquad (34)$$

with

$$\begin{aligned}
\bar{G}_{\mathcal{Q}\mathcal{Q}} &= (I - \mathrm{diag}(\breve{G}''_{\mathcal{Q}\mathcal{Q}}))^{-1}(\breve{G}''_{\mathcal{Q}\mathcal{Q}} - \mathrm{diag}(\breve{G}''_{\mathcal{Q}\mathcal{Q}})), \qquad &(35)\\
\bar{G}_{\mathcal{Q}\mathcal{U}} &= (I - \mathrm{diag}(\breve{G}''_{\mathcal{Q}\mathcal{Q}}))^{-1}\breve{G}''_{\mathcal{Q}\mathcal{U}} \qquad &(36)\\
\tilde{H}'''_{1\star} &= (I - \mathrm{diag}(\breve{G}''_{\mathcal{Q}\mathcal{Q}}))^{-1}\tilde{H}''_{1\star}. \qquad &(37)
\end{aligned}$$

As a final step, we need the matrix $\tilde{H}_r := \begin{bmatrix} \tilde{H}'''_{11} & \tilde{H}'''_{12} \\ \tilde{H}''_{21} & \tilde{H}''_{22} \end{bmatrix}$ to be monic, stable and minimum phase, in order to obtain the representation (7). To that end, we consider the stochastic process $\tilde{v}_{\mathcal{Y}} := \tilde{H}_r \tilde{\xi}_{\mathcal{Y}}$, with $\tilde{\xi}_{\mathcal{Y}} := [\tilde{\xi}_{\mathcal{Q}}^\top \ \tilde{\xi}_o^\top]^\top$. The spectral density of $\tilde{v}_{\mathcal{Y}}$ is then given by $\Phi_{\tilde{v}_{\mathcal{Y}}} = \tilde{H}_r \tilde{\Lambda}_{\mathcal{Y}} \tilde{H}_r^*$, with $\tilde{\Lambda}_{\mathcal{Y}}$ the covariance matrix of $\tilde{\xi}_{\mathcal{Y}}$, which can be decomposed as $\tilde{\Lambda}_{\mathcal{Y}} = \tilde{\Gamma}_r \tilde{\Gamma}_r^\top$. From spectral factorization [40] it follows that the spectral factor $\tilde{H}_r \tilde{\Gamma}_r$ of $\Phi_{\tilde{v}_{\mathcal{Y}}}$ satisfies

$$\tilde{H}_r \tilde{\Gamma}_r = \bar{H}_s D \qquad (38)$$

with $\bar{H}_s$ a stable and minimum phase rational matrix, and $D$ an all-pass stable rational matrix satisfying $DD^* = I$. The signal $\tilde{v}_{\mathcal{Y}}$ can then be written as

$$\tilde{v}_{\mathcal{Y}} = \tilde{H}_r \tilde{\xi}_{\mathcal{Y}} = \bar{H}_s D \tilde{\Gamma}_r^{-1} \tilde{\xi}_{\mathcal{Y}}.$$

By defining $\bar{H}_s^\infty := \lim_{z \rightarrow \infty} \bar{H}_s$, this can be rewritten as

$$\tilde{v}_{\mathcal{Y}} = \tilde{H}_r \tilde{\xi}_{\mathcal{Y}} = \underbrace{\bar{H}_s (\bar{H}_s^\infty)^{-1}}_{\bar{H}} \; \underbrace{\bar{H}_s^\infty D \tilde{\Gamma}_r^{-1} \tilde{\xi}_{\mathcal{Y}}}_{\xi_{\mathcal{Y}}}.$$

As a result, $\bar{H}$ is a monic, stable and stably invertible rational matrix, and $\xi_{\mathcal{Y}}$ is a white noise process with spectral density $\bar{H}_s^\infty D \tilde{\Gamma}_r^{-1} \Phi_{\tilde{\xi}_{\mathcal{Y}}} \tilde{\Gamma}_r^{-\top} D^* (\bar{H}_s^\infty)^\top = \bar{H}_s^\infty (\bar{H}_s^\infty)^\top$. Therefore we can write (34) as

$$\begin{bmatrix} w_{\mathcal{Q}} \\ w_o \\ w_{\mathcal{U}} \end{bmatrix} = \begin{bmatrix} \bar{G}_{\mathcal{Q}\mathcal{Q}} & 0 & \bar{G}_{\mathcal{Q}\mathcal{U}} \\ \bar{G}_{o\mathcal{Q}} & 0 & \bar{G}_{o\mathcal{U}} \\ \breve{G}_{\mathcal{U}\mathcal{Q}} & \breve{G}_{\mathcal{U}o} & \breve{G}_{\mathcal{U}\mathcal{U}} \end{bmatrix} \begin{bmatrix} w_{\mathcal{Q}} \\ w_o \\ w_{\mathcal{U}} \end{bmatrix} + \begin{bmatrix} \bar{H}_{11} & \bar{H}_{12} & 0 \\ \bar{H}_{21} & \bar{H}_{22} & 0 \\ \bar{H}_{31} & \bar{H}_{32} & \tilde{H}_{33} \end{bmatrix} \begin{bmatrix} \xi_{\mathcal{Q}} \\ \xi_o \\ \tilde{\xi}_{\mathcal{U}} \end{bmatrix} \qquad (39)$$

where $[\bar{H}_{31} \ \ \bar{H}_{32}] = [\tilde{H}_{31} \ \ \tilde{H}_{32}]\, \tilde{\Gamma}_r D^{-1} (\bar{H}_s^\infty)^{-1}$. Let $[\bar{H}'_{31} \ \ \bar{H}'_{32}] = [\bar{H}_{31} \ \ \bar{H}_{32}] \begin{bmatrix} \bar{H}_{11} & \bar{H}_{12} \\ \bar{H}_{21} & \bar{H}_{22} \end{bmatrix}^{-1}$. Premultiplying (39) with

$$\begin{bmatrix} I & 0 & 0 \\ 0 & I & 0 \\ -\bar{H}'_{31} & -\bar{H}'_{32} & I \end{bmatrix}$$

while keeping only the identity terms on the left hand side, we obtain an equivalent network equation:

$$\begin{bmatrix} w_{\mathcal{Q}} \\ w_o \\ w_{\mathcal{U}} \end{bmatrix} = \begin{bmatrix} \bar{G}_{\mathcal{Q}\mathcal{Q}} & 0 & \bar{G}_{\mathcal{Q}\mathcal{U}} \\ \bar{G}_{o\mathcal{Q}} & 0 & \bar{G}_{o\mathcal{U}} \\ \breve{G}'_{\mathcal{U}\mathcal{Q}} & \breve{G}'_{\mathcal{U}o} & \breve{G}'_{\mathcal{U}\mathcal{U}} \end{bmatrix} \begin{bmatrix} w_{\mathcal{Q}} \\ w_o \\ w_{\mathcal{U}} \end{bmatrix} + \begin{bmatrix} \bar{H}_{11} & \bar{H}_{12} & 0 \\ \bar{H}_{21} & \bar{H}_{22} & 0 \\ 0 & 0 & \tilde{H}_{33} \end{bmatrix} \begin{bmatrix} \xi_{\mathcal{Q}} \\ \xi_o \\ \tilde{\xi}_{\mathcal{U}} \end{bmatrix} \qquad (40)$$

where $\breve{G}'_{\mathcal{U}\mathcal{Q}} = \breve{G}_{\mathcal{U}\mathcal{Q}} - \bar{H}'_{31}\bar{G}_{\mathcal{Q}\mathcal{Q}} - \bar{H}'_{32}\bar{G}_{o\mathcal{Q}} + \bar{H}'_{31}$, $\breve{G}'_{\mathcal{U}o} = \breve{G}_{\mathcal{U}o} + \bar{H}'_{32}$ and $\breve{G}'_{\mathcal{U}\mathcal{U}} = \breve{G}_{\mathcal{U}\mathcal{U}} - \bar{H}'_{31}\bar{G}_{\mathcal{Q}\mathcal{U}} - \bar{H}'_{32}\bar{G}_{o\mathcal{U}}$. In order to make $\breve{G}'_{\mathcal{U}\mathcal{U}}$ hollow, we move any diagonal terms of this matrix to the left hand side of the equation and premultiply the third (block) equation by the diagonal matrix $(I - \mathrm{diag}(\breve{G}'_{\mathcal{U}\mathcal{U}}))^{-1}$. This modifies the (3,3) (block) element of the $H$-matrix to $(I - \mathrm{diag}(\breve{G}'_{\mathcal{U}\mathcal{U}}))^{-1}\tilde{H}_{33}$, which we need to be monic, stable and stably invertible. Applying spectral factorization as before [40], we can write the term $(I - \mathrm{diag}(\breve{G}'_{\mathcal{U}\mathcal{U}}))^{-1}\tilde{H}_{33}\tilde{\xi}_{\mathcal{U}}$ as $\bar{H}_{33}\xi_{\mathcal{U}}$, where $\bar{H}_{33}$ is monic, stable and stably invertible and $\xi_{\mathcal{U}}$ is a white noise process with covariance $\Lambda_{33}$. This completes the proof for obtaining (7).

The absence of confounding variables for the estimation problem $w_{\mathcal{U}} \rightarrow w_{\mathcal{Y}}$ can be shown as follows. Since all non-measured nodes $w_{\mathcal{Z}}$ have been removed in the network represented by (7), the only non-measured signals in the network are the noise signals in $\xi_m$, and they do not have any unmeasured paths to any nodes in the network (i.e., to $w_m$). Due to the block-diagonal structure of $\bar{H}_m$ in (7), the only non-measured signals that have direct paths to $w_{\mathcal{U}}$ originate from $\xi_{\mathcal{U}}$, while the only non-measured signals that have direct paths to $w_{\mathcal{Y}}$ originate from $[\xi_{\mathcal{Q}}^\top \ \xi_o^\top]^\top$. Therefore there is no element of $\xi_m$ with simultaneous unmeasured or direct paths to both $w_{\mathcal{U}}$ and $w_{\mathcal{Y}}$. $\Box$
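The spectral factorization invoked repeatedly in the proof above can be illustrated numerically in the scalar case, where the monic, stable, minimum phase factor is obtained by reflecting the unstable zeros of the spectrum into the unit circle. The sketch below is purely an illustration of this standard construction and not part of the formal argument.

```python
import numpy as np

def spectral_factor(c):
    """Factor phi(omega) = |c(e^{-i omega})|^2, with c an MA polynomial in
    q^{-1}, as lam * |h(e^{-i omega})|^2 with h monic and minimum phase."""
    roots = np.roots(c)
    stable = np.where(np.abs(roots) > 1, 1.0 / roots, roots)  # reflect zeros
    h = np.real_if_close(np.poly(stable))         # monic, minimum phase factor
    gain = c[0] * np.prod(roots[np.abs(roots) > 1])
    return h, float(np.abs(gain) ** 2)            # h and noise variance lam

# v_t = e_t + 2 e_{t-1} (non-minimum phase) has spectrum 5 + 4 cos(omega);
# the factorization returns h = [1, 0.5] and lam = 4, i.e. the same spectrum
# generated by the monic, stable and stably invertible H(q) = 1 + 0.5 q^{-1}.
h, lam = spectral_factor([1.0, 2.0])
```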

Appendix B Proof of Theorem 1

In order to prove Theorem 1, we first present three preparatory lemmas.

Lemma 1

Consider a dynamic network as defined in (6), a vector $e_{\mathcal{X}}$ of white noise sources with $\mathcal{X} \subseteq \mathcal{L}$, and two subsets of nodes $w_\Phi$ and $w_\Omega$, with $\Phi, \Omega \subset \mathcal{L}\backslash\mathcal{Z}$. If in $e_{\mathcal{X}}$ there is no confounding variable for the estimation problem $w_\Phi \rightarrow w_\Omega$, then

$$\breve{H}_{\Omega\mathcal{X}}\breve{H}_{\Phi\mathcal{X}}^* = \breve{H}_{\Phi\mathcal{X}}\breve{H}_{\Omega\mathcal{X}}^* = 0,$$

where $\breve{H}_{\Omega\mathcal{X}}$ and $\breve{H}_{\Phi\mathcal{X}}$ are the noise model transfer functions in the immersed network (15), related to the appropriate variables.

Proof: If in $e_{\mathcal{X}}$ there is no confounding variable for the formulated estimation problem, then for all $e_x$, $x \in \mathcal{X}$, there do not exist simultaneous paths from $e_x$ to $w_\Phi$ and to $w_\Omega$ that are direct or pass through nodes in $\mathcal{Z}$ only. For the network in which the signals $w_{\mathcal{Z}}$ are immersed, it follows from (17) that $\breve{H}_{k\ell} = H_{k\ell} + G_{k\mathcal{Z}}(I - G_{\mathcal{Z}\mathcal{Z}})^{-1}H_{\mathcal{Z}\ell}$, with $k \in \Phi$ and $\ell \in \mathcal{X}$. The first term in the sum (i.e., $H_{k\ell}$) is the noise model transfer function of the direct path from $e_\ell$ to $w_k$, and the second term is the transfer function of the unmeasured paths (i.e., paths through $w_{\mathcal{Z}}$ only) from $e_\ell$ to $w_k$. If all paths from a noise signal $e_x$ to $w_\Phi$ pass through a node in $w_{\mathcal{L}\backslash\mathcal{Z}}$, then there are no direct or unmeasured paths from $e_x$ to nodes in $w_\Phi$. This implies that $\breve{H}_{kx} = \breve{H}_{kx}^* = 0$ for all $k \in \Phi$ (i.e., $\breve{H}_{\Phi x} = 0$). A dual reasoning applies to paths from $e_x$ to $w_\Omega$. Consider $e_{\mathcal{X}} = [e_{x_1} \ e_{x_2} \ \dots \ e_{x_n}]^\top$. Then we have $\breve{H}_{\Phi\mathcal{X}}\breve{H}_{\Omega\mathcal{X}}^* = \breve{H}_{\Phi x_1}\breve{H}_{\Omega x_1}^* + \dots + \breve{H}_{\Phi x_n}\breve{H}_{\Omega x_n}^*$. If the condition of the lemma is satisfied, implying that no simultaneous paths exist, then in each of the product terms we have either $\breve{H}_{\Phi x_k} = 0$ or $\breve{H}_{\Omega x_k}^* = 0$, for $k \in \{1, 2, \dots, n\}$. This proves the result of Lemma 1. $\Box$

Lemma 2

Consider a dynamic network as defined in (15) with target module $G_{ji}$, where the non-measured node signals $w_{\mathcal{Z}}$ are immersed, while the node sets $\{o, \mathcal{Q}, \mathcal{U}\}$ are chosen according to the specifications in Section IV. Then $\bar{G}_{ji}$ is given by the following expressions:

$$\mbox{If } i \in \mathcal{Q}: \quad \bar{G}_{ji} = (I - \breve{G}_{jj} + \check{H}_{j3}\breve{G}_{\mathcal{U}j})^{-1}(\breve{G}_{ji} - \check{H}_{j3}\breve{G}_{\mathcal{U}i}) \qquad (41)$$
$$\mbox{If } i \in \mathcal{U}: \quad \bar{G}_{ji} = (I - \breve{G}_{jj} + \check{H}_{j3}\breve{G}_{\mathcal{U}j})^{-1}(\breve{G}_{ji} - \check{H}_{j3}\breve{G}_{\mathcal{U}i} + \check{H}_{ji}) \qquad (42)$$

where $\check{H}_{j3}$ is the row vector corresponding to the row of node signal $j$ in $\check{H}_{13}$ (if $j \in \mathcal{Q}$) or in $\check{H}_{23}$ (if $j = o$), and $\check{H}_{ji}$ is the element corresponding to the column of node signal $i$ in $\check{H}_{j3}$.

Proof: For the target module $G_{ji}$ the following cases can occur:

  1. $j = o$ and $i \in \mathcal{U}$. From (30) we have $\bar{G}_{ji} = (I - \breve{G}'_{jj})^{-1}\breve{G}'_{ji}$, where $\breve{G}'_{jj}$ is given by (25) and $\breve{G}'_{ji}$ by (26). This directly leads to (42).

  2. $j = o$ and $i \in \mathcal{Q}$. From (30) we have $\bar{G}_{ji} = (I - \breve{G}'_{jj})^{-1}\breve{G}'_{ji}$, where $\breve{G}'_{jj}$ and $\breve{G}'_{ji}$ are given by (25), leading to (41).

  3. $j \in \mathcal{Q}$, $o$ is void and $i \in \mathcal{U}$. From (36) we have $\bar{G}_{ji} = (I - \breve{G}''_{jj})^{-1}\breve{G}''_{ji}$, where $\breve{G}''_{jj}$ and $\breve{G}''_{ji}$ are given by (32). Since $o$ is void, (32) reduces to $\breve{G}''_{\mathcal{Q}\star} = \breve{G}'_{\mathcal{Q}\star}$. Therefore $\breve{G}''_{jj} = \breve{G}'_{jj}$, which is specified by (24), and $\breve{G}''_{ji} = \breve{G}'_{ji}$, which is given by (23). This leads to (42).

  4. $j \in \mathcal{Q}$, $o$ is void and $i \in \mathcal{Q}$. Since $j \neq i$, it follows from (35) that $\bar{G}_{ji} = (I - \breve{G}''_{jj})^{-1}\breve{G}''_{ji}$, where $\breve{G}''_{jj}$ and $\breve{G}''_{ji}$ are given by (32). Since $o$ is void, (32) reduces to $\breve{G}''_{\mathcal{Q}\star} = \breve{G}'_{\mathcal{Q}\star}$. Therefore, for this case, $\breve{G}''_{jj} = \breve{G}'_{jj}$ and $\breve{G}''_{ji} = \breve{G}'_{ji}$, which are given by (24). This leads to (41). $\Box$

Lemma 3

Consider a dynamic network as defined in (15), where the non-measured node signals $w_{\mathcal{Z}}$ are immersed, and let $\mathcal{U}$ be decomposed into sets $\mathcal{A}$ and $\mathcal{B}$ satisfying Condition 2. Then the spectral density $\Phi_{\breve{v}}$ has the unique spectral factorization $\Phi_{\breve{v}} = \tilde{H}\Lambda\tilde{H}^*$, with $\Lambda$ constant and $\tilde{H}$ monic, stable, minimum phase, and of the form

$$\Lambda = \begin{bmatrix} \Lambda_{11} & \Lambda_{12} & \Lambda_{13} & 0 \\ \Lambda_{21} & \Lambda_{22} & \Lambda_{23} & 0 \\ \Lambda_{31} & \Lambda_{32} & \Lambda_{33} & 0 \\ 0 & 0 & 0 & \Lambda_{44} \end{bmatrix}, \qquad \tilde{H} = \begin{bmatrix} \tilde{H}_{11} & \tilde{H}_{12} & \tilde{H}_{\mathcal{Q}\mathcal{B}} & 0 \\ \tilde{H}_{21} & \tilde{H}_{22} & \tilde{H}_{o\mathcal{B}} & 0 \\ \tilde{H}_{\mathcal{B}\mathcal{Q}} & \tilde{H}_{\mathcal{B}o} & \tilde{H}_{\mathcal{B}\mathcal{B}} & 0 \\ 0 & 0 & 0 & \tilde{H}_{\mathcal{A}\mathcal{A}} \end{bmatrix}, \qquad (43)$$

where the block dimensions are conformable to the dimensions of $w_{\mathcal{Q}}$, $w_o$, $w_{\mathcal{B}}$ and $w_{\mathcal{A}}$, respectively.

Proof: On the basis of (A) we write w𝒰=[ww𝒜]w_{\!{\scriptscriptstyle\mathcal{U}}}=[\begin{matrix}w_{\!{\scriptscriptstyle\mathcal{B}}}^{\top}&w_{\!{\scriptscriptstyle\mathcal{A}}}^{\top}\end{matrix}]^{\top} and

v˘=H˘[e𝒬eoee𝒜e𝒵]=[H˘𝒬𝒬H˘𝒬oH˘𝒬H˘𝒬𝒜H˘𝒬𝒵H˘o𝒬H˘ooH˘oH˘o𝒜H˘o𝒵H˘𝒬H˘oH˘H˘𝒜H˘𝒵H˘𝒜𝒬H˘𝒜oH˘𝒜H˘𝒜𝒜H˘𝒜𝒵][e𝒬eoee𝒜e𝒵]\breve{v}=\breve{H}\begin{bmatrix}e_{\!{\scriptscriptstyle\mathcal{Q}}}\\ e_{o}\\ e_{\!{\scriptscriptstyle\mathcal{B}}}\\ e_{\!{\scriptscriptstyle\mathcal{A}}}\\ e_{\!{\scriptscriptstyle\mathcal{Z}}}\end{bmatrix}=\begin{bmatrix}\breve{H}_{\!{\scriptscriptstyle\mathcal{Q}}\!{\scriptscriptstyle\mathcal{Q}}}&\breve{H}_{\!{\scriptscriptstyle\mathcal{Q}}o}&\breve{H}_{\!{\scriptscriptstyle\mathcal{Q}}\!{\scriptscriptstyle\mathcal{B}}}&\breve{H}_{\!{\scriptscriptstyle\mathcal{Q}}\!{\scriptscriptstyle\mathcal{A}}}&\breve{H}_{\!{\scriptscriptstyle\mathcal{Q}}\!{\scriptscriptstyle\mathcal{Z}}}\\ \breve{H}_{o\!{\scriptscriptstyle\mathcal{Q}}}&\breve{H}_{oo}&\breve{H}_{o\!{\scriptscriptstyle\mathcal{B}}}&\breve{H}_{o\!{\scriptscriptstyle\mathcal{A}}}&\breve{H}_{o\!{\scriptscriptstyle\mathcal{Z}}}\\ \breve{H}_{\!{\scriptscriptstyle\mathcal{B}}\!{\scriptscriptstyle\mathcal{Q}}}&\breve{H}_{\!{\scriptscriptstyle\mathcal{B}}o}&\breve{H}_{\!{\scriptscriptstyle\mathcal{B}}\!{\scriptscriptstyle\mathcal{B}}}&\breve{H}_{\!{\scriptscriptstyle\mathcal{B}}\!{\scriptscriptstyle\mathcal{A}}}&\breve{H}_{\!{\scriptscriptstyle\mathcal{B}}\!{\scriptscriptstyle\mathcal{Z}}}\\ \breve{H}_{\!{\scriptscriptstyle\mathcal{A}}\!{\scriptscriptstyle\mathcal{Q}}}&\breve{H}_{\!{\scriptscriptstyle\mathcal{A}}o}&\breve{H}_{\!{\scriptscriptstyle\mathcal{A}}\!{\scriptscriptstyle\mathcal{B}}}&\breve{H}_{\!{\scriptscriptstyle\mathcal{A}}\!{\scriptscriptstyle\mathcal{A}}}&\breve{H}_{\!{\scriptscriptstyle\mathcal{A}}\!{\scriptscriptstyle\mathcal{Z}}}\end{bmatrix}\!\!\!\begin{bmatrix}e_{\!{\scriptscriptstyle\mathcal{Q}}}\\ e_{o}\\ e_{\!{\scriptscriptstyle\mathcal{B}}}\\ e_{\!{\scriptscriptstyle\mathcal{A}}}\\ e_{\!{\scriptscriptstyle\mathcal{Z}}}\end{bmatrix} (44)

with cov(e)=Icov(e)=I and the components of H˘\breve{H} as specified in (17). Starting from the expression (44), the spectral density Φv˘\Phi_{\breve{v}} can be written as H˘H˘\breve{H}\breve{H}^{*} while it is denoted as

Φv˘=[Φv˘𝒬Φv˘𝒬v˘oΦv˘𝒬v˘Φv˘𝒬v˘𝒜Φv˘𝒬v˘oΦv˘oΦv˘ov˘Φv˘ov˘𝒜Φv˘𝒬v˘Φv˘ov˘Φv˘Φv˘v˘𝒜Φv˘𝒬v˘𝒜Φv˘ov˘𝒜Φv˘v˘𝒜Φv˘𝒜].\Phi_{\breve{v}}=\begin{bmatrix}\Phi_{\breve{v}_{\!{\scriptscriptstyle\mathcal{Q}}}}&\Phi_{\breve{v}_{\!{\scriptscriptstyle\mathcal{Q}}}\breve{v}_{o}}&\Phi_{\breve{v}_{\!{\scriptscriptstyle\mathcal{Q}}}\breve{v}_{\!{\scriptscriptstyle\mathcal{B}}}}&\Phi_{\breve{v}_{\!{\scriptscriptstyle\mathcal{Q}}}\breve{v}_{\!{\scriptscriptstyle\mathcal{A}}}}\\ \Phi_{\breve{v}_{\!{\scriptscriptstyle\mathcal{Q}}}\breve{v}_{o}}^{*}&\Phi_{\breve{v}_{o}}&\Phi_{\breve{v}_{o}\breve{v}_{\!{\scriptscriptstyle\mathcal{B}}}}&\Phi_{\breve{v}_{o}\breve{v}_{\!{\scriptscriptstyle\mathcal{A}}}}\\ \Phi_{\breve{v}_{\!{\scriptscriptstyle\mathcal{Q}}}\breve{v}_{\!{\scriptscriptstyle\mathcal{B}}}}^{*}&\Phi_{\breve{v}_{o}\breve{v}_{\!{\scriptscriptstyle\mathcal{B}}}}^{*}&\Phi_{\breve{v}_{\!{\scriptscriptstyle\mathcal{B}}}}&\Phi_{\breve{v}_{\!{\scriptscriptstyle\mathcal{B}}}\breve{v}_{\!{\scriptscriptstyle\mathcal{A}}}}\\ \Phi_{\breve{v}_{\!{\scriptscriptstyle\mathcal{Q}}}\breve{v}_{\!{\scriptscriptstyle\mathcal{A}}}}^{*}&\Phi_{\breve{v}_{o}\breve{v}_{\!{\scriptscriptstyle\mathcal{A}}}}^{*}&\Phi_{\breve{v}_{\!{\scriptscriptstyle\mathcal{B}}}\breve{v}_{\!{\scriptscriptstyle\mathcal{A}}}}^{*}&\Phi_{\breve{v}_{\!{\scriptscriptstyle\mathcal{A}}}}\end{bmatrix}. (45)

In this structure we are particularly going to analyse the elements

\[
\begin{split}
\Phi_{\breve{v}_{\mathcal{Q}}\breve{v}_{\mathcal{A}}}&=\breve{H}_{\mathcal{Q}\mathcal{Q}}\breve{H}_{\mathcal{A}\mathcal{Q}}^{*}+\breve{H}_{\mathcal{Q}o}\breve{H}_{\mathcal{A}o}^{*}+\breve{H}_{\mathcal{Q}\mathcal{B}}\breve{H}_{\mathcal{A}\mathcal{B}}^{*}+\breve{H}_{\mathcal{Q}\mathcal{A}}\breve{H}_{\mathcal{A}\mathcal{A}}^{*}+\breve{H}_{\mathcal{Q}\mathcal{Z}}\breve{H}_{\mathcal{A}\mathcal{Z}}^{*}\\
\Phi_{\breve{v}_{o}\breve{v}_{\mathcal{A}}}&=\breve{H}_{o\mathcal{Q}}\breve{H}_{\mathcal{A}\mathcal{Q}}^{*}+\breve{H}_{oo}\breve{H}_{\mathcal{A}o}^{*}+\breve{H}_{o\mathcal{B}}\breve{H}_{\mathcal{A}\mathcal{B}}^{*}+\breve{H}_{o\mathcal{A}}\breve{H}_{\mathcal{A}\mathcal{A}}^{*}+\breve{H}_{o\mathcal{Z}}\breve{H}_{\mathcal{A}\mathcal{Z}}^{*}\\
\Phi_{\breve{v}_{\mathcal{B}}\breve{v}_{\mathcal{A}}}&=\breve{H}_{\mathcal{B}\mathcal{Q}}\breve{H}_{\mathcal{A}\mathcal{Q}}^{*}+\breve{H}_{\mathcal{B}o}\breve{H}_{\mathcal{A}o}^{*}+\breve{H}_{\mathcal{B}\mathcal{B}}\breve{H}_{\mathcal{A}\mathcal{B}}^{*}+\breve{H}_{\mathcal{B}\mathcal{A}}\breve{H}_{\mathcal{A}\mathcal{A}}^{*}+\breve{H}_{\mathcal{B}\mathcal{Z}}\breve{H}_{\mathcal{A}\mathcal{Z}}^{*}
\end{split} \tag{46}
\]

If $\mathcal{A}$ and $\mathcal{B}$ satisfy Condition 2, then none of the white noise terms $e_{x}$, $x\in\mathcal{L}$, will be a confounding variable for the estimation problems $w_{\mathcal{A}}\rightarrow w_{\mathcal{Q}}$, $w_{\mathcal{A}}\rightarrow w_{o}$ or $w_{\mathcal{A}}\rightarrow w_{\mathcal{B}}$. Then it follows from Lemma 1 that all of the terms in (46) are zero. As a result we can write the spectrum in equation (45) as

\[
\Phi_{\breve{v}}=\begin{bmatrix}\Phi_{\breve{v}_{\mathcal{Q}}}&\Phi_{\breve{v}_{\mathcal{Q}}\breve{v}_{o}}&\Phi_{\breve{v}_{\mathcal{Q}}\breve{v}_{\mathcal{B}}}&0\\
\Phi_{\breve{v}_{\mathcal{Q}}\breve{v}_{o}}^{*}&\Phi_{\breve{v}_{o}}&\Phi_{\breve{v}_{o}\breve{v}_{\mathcal{B}}}&0\\
\Phi_{\breve{v}_{\mathcal{Q}}\breve{v}_{\mathcal{B}}}^{*}&\Phi_{\breve{v}_{o}\breve{v}_{\mathcal{B}}}^{*}&\Phi_{\breve{v}_{\mathcal{B}}}&0\\
0&0&0&\Phi_{\breve{v}_{\mathcal{A}}}\end{bmatrix}. \tag{47}
\]
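To make the role of Lemma 1 concrete, a sketch of the argument for a single term (the precise statement is in Lemma 1): for every white noise source $e_{x}$, the absence of a confounding variable for $w_{\mathcal{A}}\rightarrow w_{\mathcal{Q}}$ implies that $e_{x}$ cannot drive both $\breve{v}_{\mathcal{Q}}$ and $\breve{v}_{\mathcal{A}}$, i.e. at least one of the transfers $\breve{H}_{\mathcal{Q}x}$, $\breve{H}_{\mathcal{A}x}$ is zero. Each product in the first expression of (46) then vanishes term by term,

\[
\breve{H}_{\mathcal{Q}x}\breve{H}_{\mathcal{A}x}^{*}=0,\qquad x\in\{\mathcal{Q},o,\mathcal{B},\mathcal{A},\mathcal{Z}\},
\]

and the same reasoning applies to the second and third expressions in (46).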

Then the spectral density $\Phi_{\breve{v}}$ has the unique spectral factorization [40]

\[
\Phi_{\breve{v}}=\begin{bmatrix}F_{11}\Lambda_{1}F_{11}^{*}&0\\ 0&F_{22}\Lambda_{2}F_{22}^{*}\end{bmatrix}=\tilde{H}\Lambda\tilde{H}^{*} \tag{48}
\]

where $\tilde{H}$ is of the form in (43), and monic, stable and minimum phase. $\Box$
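As a way to read (48): $F_{11}\Lambda_{1}F_{11}^{*}$ factorizes the upper-left $3\times 3$ block of (47), i.e. the joint spectrum of $(\breve{v}_{\mathcal{Q}},\breve{v}_{o},\breve{v}_{\mathcal{B}})$, while $F_{22}\Lambda_{2}F_{22}^{*}$ factorizes $\Phi_{\breve{v}_{\mathcal{A}}}$, so that

\[
\tilde{H}=\begin{bmatrix}F_{11}&0\\ 0&F_{22}\end{bmatrix},\qquad \Lambda=\begin{bmatrix}\Lambda_{1}&0\\ 0&\Lambda_{2}\end{bmatrix},
\]

consistent with the block-diagonal structure of $\tilde{H}$ and $\Lambda$ in (43).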

Next we proceed with the proof of Theorem 1.

With Lemma 2 it follows that $\bar{G}_{ji}$ is given by either (41) or (42). For analysing these two expressions, we first specify $\breve{G}_{ji}$ and $\breve{G}_{jj}$. From (16), we have

\[
\breve{G}_{ji}=G_{ji}+G_{j\mathcal{Z}}(I-G_{\mathcal{Z}\mathcal{Z}})^{-1}G_{\mathcal{Z}i} \tag{49}
\]
\[
\breve{G}_{jj}=G_{jj}+G_{j\mathcal{Z}}(I-G_{\mathcal{Z}\mathcal{Z}})^{-1}G_{\mathcal{Z}j}, \tag{50}
\]

where the first terms on the right-hand sides reflect the direct connections from $w_{i}$ to $w_{j}$ (respectively from $w_{j}$ to $w_{j}$), and the second terms reflect the connections that pass only through nodes in $\mathcal{Z}$. By definition, $G_{jj}=0$, since the matrix $G$ of the network in (6) is hollow. Under the parallel path and loop condition (Condition 1), the second terms on the right-hand sides of (49) and (50) are zero, so that $\breve{G}_{ji}=G_{ji}$ and $\breve{G}_{jj}=0$.

What remains to be shown is that in (41) and (42), it holds that

\[
\check{H}_{j3}\breve{G}_{\mathcal{U}j}=\check{H}_{j3}\breve{G}_{\mathcal{U}i}=0 \tag{51}
\]

while additionally, for $i\in\mathcal{U}$, it should hold that

\[
\check{H}_{ji}=0. \tag{52}
\]

With definition (20) for $\check{H}$, and the special structure of $\tilde{H}_{13}$ and $\tilde{H}_{23}$ in (18) that is implied by the result (43) of Lemma 3, we can write

\[
\begin{bmatrix}\check{H}_{13}\\ \check{H}_{23}\end{bmatrix}=\begin{bmatrix}\tilde{H}_{\mathcal{Q}\mathcal{B}}&0\\ \tilde{H}_{o\mathcal{B}}&0\end{bmatrix}\begin{bmatrix}\tilde{H}_{\mathcal{B}\mathcal{B}}&0\\ 0&\tilde{H}_{\mathcal{A}\mathcal{A}}\end{bmatrix}^{-1}=\begin{bmatrix}\check{H}_{\mathcal{Q}\mathcal{B}}&0\\ \check{H}_{o\mathcal{B}}&0\end{bmatrix}, \tag{53}
\]

implying that the columns in this matrix related to inputs $k\in\mathcal{A}$ are zero.
In order to satisfy (52) we need the condition that if $i\in\mathcal{U}$ then $i\in\mathcal{A}$, which is equivalently formulated as $i\in\mathcal{Q}\cup\mathcal{A}$ (condition 2b).
In order to satisfy (51) we note that $\check{H}_{j3}$ is a row vector, of which the second part (the columns related to signals in $\mathcal{A}$) is equal to $0$, according to (53). Consequently, (51) is satisfied if for every $k\in\mathcal{B}$ it holds that $\breve{G}_{kj}=\breve{G}_{ki}=0$. On the basis of (16), this condition is satisfied if for every $w_{k}\in w_{\mathcal{B}}$ there exist no direct paths, nor paths through unmeasured nodes only, from $w_{i}$ to $w_{k}$ or from $w_{j}$ to $w_{k}$ (condition 2c). $\Box$

Appendix C Proof of Theorem 2

Expression (8) can be written as

\[
w_{\mathcal{Y}}=\bar{G}^{o}w_{\mathcal{D}}+\bar{H}^{o}\xi_{\mathcal{Y}}.
\]

Substituting this into the expression for the prediction error (9) leads to

\[
\varepsilon(t,\theta):=\bar{H}(q,\theta)^{-1}\left[\Delta\bar{G}(q,\theta)w_{\mathcal{D}}+\Delta\bar{H}(q,\theta)\xi_{\mathcal{Y}}\right]+\xi_{\mathcal{Y}} \tag{54}
\]

where $\Delta\bar{G}(q,\theta)=\bar{G}^{o}-\bar{G}(q,\theta)$ and $\Delta\bar{H}(q,\theta)=\bar{H}^{o}-\bar{H}(q,\theta)$. The proof of consistency involves two steps:

  1. showing that $\bar{\mathbb{E}}\,\varepsilon^{T}(t,\theta)W\varepsilon(t,\theta)$ achieves its minimum for $\Delta\bar{G}(\theta)=0$ and $\Delta\bar{H}(\theta)=0$;

  2. showing the conditions under which this minimum is unique.

Step 1: With Proposition 1 it follows that our data generating system can always be written in the form (7), such that $w_{m}=T(q)\xi_{m}$. We denote by $T_{1}$ the matrix composed of the first and third (block) rows of $T$, such that $w_{\mathcal{D}}=T_{1}(q)\xi_{m}$. Substituting this into (54) gives

\[
\varepsilon(t,\theta):=\bar{H}(q,\theta)^{-1}\left[\Delta\bar{G}(q,\theta)T_{1}+\begin{bmatrix}\Delta\bar{H}(\theta)&0\end{bmatrix}\right]\xi_{m}+\xi_{\mathcal{Y}},
\]

where $\xi_{m}$ is (block) structured as $[\xi_{\mathcal{Y}}^{\top}\ \xi_{\mathcal{U}}^{\top}]^{\top}$.
In order to prove that the minimum of $\bar{\mathbb{E}}\left[\varepsilon^{T}(t,\theta)W\varepsilon(t,\theta)\right]$ is attained for $\Delta\bar{G}(\theta)=0$ and $\Delta\bar{H}(\theta)=0$, it is sufficient to show that

\[
\left[\Delta\bar{G}(\theta)T_{1}(q)+\begin{bmatrix}\Delta\bar{H}(\theta)&0&0\end{bmatrix}\right]\xi_{m}(t) \tag{55}
\]

is uncorrelated to $\xi_{\mathcal{Y}}(t)$. In order to show this, let $\mathcal{F}_{n}=\mathcal{U}\backslash\mathcal{F}$, with $\mathcal{F}$ as defined in the theorem, while we decompose $\xi_{m}$ according to $\xi_{m}=[\,\xi_{\mathcal{Y}}^{\top}\ \ \xi_{\mathcal{F}}^{\top}\ \ \xi_{\mathcal{F}_{n}}^{\top}\,]^{\top}$. Using a similar block-structure notation for $\Delta\bar{G}$, $T$ and $\Delta\bar{H}$, (55) can then be written as

\[
\begin{split}
&\left(\Delta\bar{G}_{\mathcal{Y}\mathcal{Q}}(\theta)T_{\mathcal{Q}\mathcal{Y}}+\Delta\bar{G}_{\mathcal{Y}\mathcal{F}}(\theta)T_{\mathcal{F}\mathcal{Y}}+\Delta\bar{G}_{\mathcal{Y}\mathcal{F}_{n}}(\theta)T_{\mathcal{F}_{n}\mathcal{Y}}+\Delta\bar{H}_{\mathcal{Y}\mathcal{Y}}(\theta)\right)\xi_{\mathcal{Y}}\\
&\ +\left(\Delta\bar{G}_{\mathcal{Y}\mathcal{Q}}(\theta)T_{\mathcal{Q}\mathcal{F}}+\Delta\bar{G}_{\mathcal{Y}\mathcal{F}}(\theta)T_{\mathcal{F}\mathcal{F}}+\Delta\bar{G}_{\mathcal{Y}\mathcal{F}_{n}}(\theta)T_{\mathcal{F}_{n}\mathcal{F}}\right)\xi_{\mathcal{F}}\\
&\ +\left(\Delta\bar{G}_{\mathcal{Y}\mathcal{Q}}(\theta)T_{\mathcal{Q}\mathcal{F}_{n}}+\Delta\bar{G}_{\mathcal{Y}\mathcal{F}}(\theta)T_{\mathcal{F}\mathcal{F}_{n}}+\Delta\bar{G}_{\mathcal{Y}\mathcal{F}_{n}}(\theta)T_{\mathcal{F}_{n}\mathcal{F}_{n}}\right)\xi_{\mathcal{F}_{n}}.
\end{split} \tag{56}
\]

Since, by definition, $\xi_{\mathcal{F}_{n}}(t)$ is statically uncorrelated to $\xi_{\mathcal{Y}}(t)$, the $\xi_{\mathcal{F}_{n}}$-dependent term in (56) cannot create any static correlation with $\xi_{\mathcal{Y}}(t)$. It then needs to be shown that the $\xi_{\mathcal{Y}}$- and $\xi_{\mathcal{F}}$-dependent terms in (56) all reflect strictly proper filters, i.e. that they each contain at least one delay.
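As a sketch of why at least one delay suffices: for any strictly proper filter $F(q)=\sum_{k\geq 1}f_{k}q^{-k}$ acting on a white noise vector $\xi(t)$, and under the standing assumption that the white noise processes are only (possibly) correlated at equal time instants,

\[
\bar{\mathbb{E}}\left[\big(F(q)\xi(t)\big)\,\xi_{\mathcal{Y}}^{\top}(t)\right]=\sum_{k\geq 1}f_{k}\,\bar{\mathbb{E}}\left[\xi(t-k)\,\xi_{\mathcal{Y}}^{\top}(t)\right]=0,
\]

since white noise samples at different time instants are uncorrelated; hence the filtered signal is statically uncorrelated with $\xi_{\mathcal{Y}}(t)$.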
$\Delta\bar{H}(\theta)$ is strictly proper since both $\bar{H}(\theta)$ and $\bar{H}^{o}$ are monic. Therefore, $\Delta\bar{H}_{\mathcal{Y}\mathcal{Y}}(\theta)$ will have at least one delay in each of its transfers.
If all paths from $w_{\mathcal{Y}\cup\mathcal{F}}$ to $w_{\mathcal{Y}}$ in the transformed network and in its parameterized model have at least one delay (as per Condition c in the theorem), then all terms $\Delta\bar{G}_{\mathcal{Y}\mathcal{Q}}(\theta)$ and $\Delta\bar{G}_{\mathcal{Y}\mathcal{F}}(\theta)$ will have a delay.
We then need to consider the two remaining terms, $\Delta\bar{G}_{\mathcal{Y}\mathcal{F}_{n}}(\theta)T_{\mathcal{F}_{n}\mathcal{Y}}$ and $\Delta\bar{G}_{\mathcal{Y}\mathcal{F}_{n}}(\theta)T_{\mathcal{F}_{n}\mathcal{F}}$. By the definition of $\Delta\bar{G}_{\mathcal{Y}\mathcal{F}_{n}}(\theta)$, each of these two terms can be represented as the sum of two contributions. $\bar{G}_{\mathcal{Y}\mathcal{F}_{n}}T_{\mathcal{F}_{n}\mathcal{Y}}$ and $\bar{G}_{\mathcal{Y}\mathcal{F}_{n}}T_{\mathcal{F}_{n}\mathcal{F}}$ represent paths from $w_{\mathcal{Y}}$ to $w_{\mathcal{Y}}$ and from $w_{\mathcal{F}}$ to $w_{\mathcal{Y}}$, respectively, in the transformed network, whereas $\bar{G}_{\mathcal{Y}\mathcal{F}_{n}}(\theta)T_{\mathcal{F}_{n}\mathcal{Y}}$ and $\bar{G}_{\mathcal{Y}\mathcal{F}_{n}}(\theta)T_{\mathcal{F}_{n}\mathcal{F}}$ are induced partly by the parameterized model and partly by the paths from $w_{\mathcal{Y}}$ to $w_{\mathcal{F}_{n}}$ and from $w_{\mathcal{F}}$ to $w_{\mathcal{F}_{n}}$, respectively, in the transformed network. According to Condition c of the theorem (the delay conditions), these transfer functions are strictly proper. This implies that (56) is statically uncorrelated to $\xi_{\mathcal{Y}}(t)$. Therefore we have $\bar{\mathbb{E}}\left[\varepsilon^{T}(t,\theta)W\varepsilon(t,\theta)\right]=\bar{\mathbb{E}}\left[\|\Delta X(\theta)\xi_{m}\|_{W}\right]+\bar{\mathbb{E}}\left[\xi_{\mathcal{Y}}^{\top}W\xi_{\mathcal{Y}}\right]$, where $\Delta X(\theta)=\bar{H}(\theta)^{-1}\left[\Delta\bar{G}(\theta)T_{1}(q)+\begin{bmatrix}\Delta\bar{H}(\theta)&0&0\end{bmatrix}\right]$. As a result, the minimum of $\bar{\mathbb{E}}\left[\varepsilon^{T}(t,\theta)W\varepsilon(t,\theta)\right]$, which is $\bar{\mathbb{E}}\left[\xi_{\mathcal{Y}}^{\top}W\xi_{\mathcal{Y}}\right]$, is achieved for $\Delta\bar{G}(\theta)=0$ and $\Delta\bar{H}(\theta)=0$.

Step 2: When the minimum is achieved, $\bar{\mathbb{E}}\left[\|\Delta X(\theta)\xi_{m}\|_{W}\right]$ is zero. From (54) we have $\Delta X(\theta)\xi_{m}=\bar{H}(q,\theta)^{-1}\begin{bmatrix}\Delta\bar{G}(q,\theta)&\Delta\bar{H}(q,\theta)\end{bmatrix}\begin{bmatrix}w_{\mathcal{D}}^{\top}&\xi_{\mathcal{Y}}^{\top}\end{bmatrix}^{\top}$. Using the expression for $\xi_{o}$ from (8) and substituting it into the expression for $\Delta X(\theta)\xi_{m}$, we get $\Delta X(\theta)\xi_{m}=\bar{H}(q,\theta)^{-1}\begin{bmatrix}\Delta\bar{G}(q,\theta)&\Delta\bar{H}(q,\theta)\end{bmatrix}J\kappa(t)=\Delta x(\theta)J\kappa(t)$, where

\[
J=\begin{bmatrix}I&0&0\\ 0&I&0\\ -(\bar{H}_{oo})^{-1}\bar{G}_{o\mathcal{D}}&-(\bar{H}_{oo})^{-1}\bar{H}_{o\mathcal{Q}}&(\bar{H}_{oo})^{-1}\end{bmatrix};\qquad \bar{G}_{o\mathcal{D}}^{\top}=\begin{bmatrix}\bar{G}_{o\mathcal{Q}}^{\top}\\ \bar{G}_{o\mathcal{U}}^{\top}\end{bmatrix}.
\]

Writing $\bar{\mathbb{E}}\left[\|\Delta X(\theta)\xi_{m}\|_{W}\right]=\bar{\mathbb{E}}\left[\|\Delta x(\theta)J\kappa(t)\|_{W}\right]=0$ using Parseval's theorem in the frequency domain, we have

\[
\frac{1}{2\pi}\int_{-\pi}^{\pi}\Delta x(e^{j\omega},\theta)^{\top}J\,\Phi_{\kappa}(\omega)\,J^{*}\,\Delta x(e^{-j\omega},\theta)\,d\omega=0. \tag{57}
\]
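Note that, since $\Phi_{\kappa}(\omega)$ is a power spectral density, $J\Phi_{\kappa}(\omega)J^{*}$ is positive semi-definite at every frequency, so the integrand in (57) is non-negative:

\[
\Delta x(e^{j\omega},\theta)^{\top}J\,\Phi_{\kappa}(\omega)\,J^{*}\,\Delta x(e^{-j\omega},\theta)\;\geq\;0\qquad\text{for all }\omega.
\]

Consequently, (57) can only hold if this integrand vanishes for (almost) all $\omega$.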

The standard reasoning for showing uniqueness of the identification result is that if $\bar{\mathbb{E}}\left[\|\Delta X(\theta)\xi_{m}\|_{W}\right]$ equals $0$ (i.e. when the minimum power is achieved), this should imply that $\Delta\bar{G}(\theta)=0$ and $\Delta\bar{H}(\theta)=0$. Since $J$ is full rank and positive definite, this implication is fulfilled only if $\Phi_{\kappa}(\omega)>0$ for a sufficiently high number of frequencies. When condition 2 of Theorem 2 is satisfied, together with the other conditions in Theorem 1, it is ensured that the minimum value is achieved only for $\bar{G}(\theta)=\bar{G}^{0}$ and $\bar{H}(\theta)=\bar{H}^{0}$. $\Box$

Appendix D Proof of Proposition 2

The disturbances in the original network are characterized by $\breve{v}$ (Appendix A). From the results of Lemma 3 we can infer that the spectral density $\Phi_{\breve{v}}$ has the unique spectral factorization $\Phi_{\breve{v}}=\tilde{H}\Lambda\tilde{H}^{*}$, where $\tilde{H}$ is monic, stable, minimum phase, and of the form given in (43). Together with the form of $\Lambda$ in (43), it follows that $\xi_{\mathcal{A}}$ is uncorrelated with $\xi_{\mathcal{Y}}$. As a result, the set $\mathcal{A}$ satisfies the properties of $\mathcal{F}_{n}$, so that in Condition c we can replace $\mathcal{F}$ by $\mathcal{B}$. What remains to be shown is that the delay in path/loop conditions in the transformed network (8) can be reformulated as the same conditions on the original network (6). To this end we will need two lemmas.

Lemma 4

Consider a dynamic network as dealt with in Theorem 2, with reference to eq. (8), where a selection of node signals is decomposed into the sets $\mathcal{D}=\mathcal{Q}\cup\mathcal{U}$ and $\mathcal{Y}=\mathcal{Q}\cup\{o\}$, obtained after immersion of the nodes in $\mathcal{Z}$. Let $i$ be any element $i\in\mathcal{Y}\cup\mathcal{U}$, and let $k$ be any element $k\in\mathcal{Y}$.
If in the original network the direct path from $w_{i}$ to $w_{k}$, as well as all paths from $w_{i}$ to $w_{k}$ that pass through non-measured nodes only, have a delay, then $\bar{G}_{ki}$ is strictly proper.

Proof: We will show that $\bar{G}_{ki}$ is strictly proper if all such paths from $w_{i}$ to $w_{k}$ have a delay. For any $k\in\mathcal{Y}$, $i\in\mathcal{D}$, $\bar{G}_{ki}$ is given by either (41) or (42) with $j=k$. The situation that is not covered by (41)-(42) is the case $i=o$, but from (34) it follows that $\bar{G}_{ko}=0$ for $k\in\mathcal{Y}$, so for this situation strict properness is guaranteed.
We will now use (41) and (42) for $j$ given by any $k\in\mathcal{Y}$. In (41) and (42), $\check{H}_{k3}$ is given by the appropriate component of (20), which, by the fact that (18) is monic, implies that $\check{H}_{k3}$ is strictly proper. By the same reasoning this also holds for $\check{H}_{ki}$.
From (41) and (42) it then follows that strict properness of $\bar{G}_{ki}$ follows from strict properness of $\breve{G}_{ki}$, provided that the inverse expression $(I-\breve{G}_{kk}+\check{H}_{k3}\breve{G}_{\mathcal{U}k})^{-1}$ is proper. This latter condition is guaranteed by the fact that $\check{H}_{k3}$ is strictly proper and $\breve{G}_{kk}$ and $(I-\breve{G}_{kk})^{-1}$ are proper, as they reflect a module and a network transfer function in the immersed network [41, 30]. Finally, strict properness of $\breve{G}_{ki}$ follows from strict properness of $G_{ki}$ and the presence of a delay in all paths from $w_{i}$ to $w_{k}$ that pass through unmeasured nodes. $\Box$
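As a sketch of the properness argument for the inverse term: evaluating at $z\rightarrow\infty$, and using that $\check{H}_{k3}$ is strictly proper while $\breve{G}_{\mathcal{U}k}$ is proper,

\[
\lim_{z\rightarrow\infty}\left(I-\breve{G}_{kk}(z)+\check{H}_{k3}(z)\breve{G}_{\mathcal{U}k}(z)\right)=I-\breve{G}_{kk}(\infty),
\]

which is invertible because $(I-\breve{G}_{kk})^{-1}$ is proper, so that the inverse expression indeed has a finite limit at $z=\infty$.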

Lemma 5

Consider the transformed network, and let $j,k$ be any elements $j,k\in\mathcal{Y}\cup\mathcal{U}$. If in the original network all paths from $w_{k}$ to $w_{j}$ have a delay, then all paths from $w_{k}$ to $w_{j}$ in the transformed network have a delay.

Proof: This is proved using Lemma 3 in [15] and Lemma 4. Let $\bar{G}(\infty)$ denote $\lim_{z\rightarrow\infty}\bar{G}(z)$. From Lemma 4 we know that $\bar{G}_{jk}$ is strictly proper if all paths from $w_{k}$ to $w_{j}$ in the original network have a delay. Therefore,

\[
\bar{G}_{m}(\infty)=\begin{bmatrix}\ast&0\\ \ast&\ast\end{bmatrix}, \tag{58}
\]

where the $0$ represents $\bar{G}_{jk}(\infty)$. Using the inversion rule for block matrices, we have

\[
(I-\bar{G}_{m}(\infty))^{-1}=\begin{bmatrix}\ast&0\\ \ast&\ast\end{bmatrix}. \tag{59}
\]
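The preservation of the zero block under inversion follows from the standard identity for an invertible block lower-triangular matrix,

\[
\begin{bmatrix}A&0\\ C&D\end{bmatrix}^{-1}=\begin{bmatrix}A^{-1}&0\\ -D^{-1}CA^{-1}&D^{-1}\end{bmatrix},
\]

so the zero pattern of $I-\bar{G}_{m}(\infty)$ implied by (58) carries over to its inverse in (59).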

Considering (7) we can write $w_{m}=\bar{G}_{m}w_{m}+v_{m}$, where $v_{m}=\bar{H}_{m}\xi_{m}$. So we have $w_{m}=(I-\bar{G}_{m})^{-1}v_{m}$, where $(I-\bar{G}_{m})^{-1}$ represents the transfer from $v_{m}$ to $w_{m}$. The $0$ in (59) indicates that the transfer function from $v_{k}$ to $w_{j}$ has a delay. Since $v_{k}$ enters the network only at $w_{k}$, through a unit transfer function, all paths from $w_{k}$ to $w_{j}$ in the transformed network have a delay. $\Box$

We now look into the proof of Proposition 2. For this we need to generalize the result of Lemma 5 from scalar node signals to sets of node signals. If all existing paths/loops from $w_{\mathcal{Y}\cup\mathcal{F}}$ to $w_{\mathcal{Y}}$ in the original network have at least one delay, then all existing paths/loops from $w_{k}$, $k\in\mathcal{Y}\cup\mathcal{F}$, to $w_{j}$, $j\in\mathcal{Y}$, in the original network have at least one delay. Then, as a result of Lemma 5, all existing paths/loops from $w_{k}$, $k\in\mathcal{Y}\cup\mathcal{F}$, to $w_{j}$, $j\in\mathcal{Y}$, in the transformed network have at least one delay, which implies that all existing paths/loops from $w_{\mathcal{Y}\cup\mathcal{F}}$ to $w_{\mathcal{Y}}$ in the transformed network have at least one delay. Following the same reasoning, we can also show that if all existing paths from $w_{\mathcal{Y}\cup\mathcal{F}}$ to $w_{k}$, $k\in\mathcal{F}_{n}$, in the original network have at least one delay, then all existing paths from $w_{\mathcal{Y}\cup\mathcal{F}}$ to $w_{k}$, $k\in\mathcal{F}_{n}$, in the transformed network have at least one delay. $\Box$

Acknowledgment

The authors gratefully acknowledge discussions with and contributions from Arne Dankers, Giulio Bottegal and Harm Weerts on the initial research that led to the presented results.

References

  • [1] D. Materassi and M. Salapaka, “On the problem of reconstructing an unknown topology via locality properties of the Wiener filter,” IEEE Trans. Automatic Control, vol. 57, no. 7, pp. 1765–1777, 2012.
  • [2] B. Sanandaji, T. Vincent, and M. Wakin, “Exact topology identification of large-scale interconnected dynamical systems from compressive observations,” in Proc. American Control Conference (ACC), San Francisco, CA, USA, 2011, pp. 649–656.
  • [3] D. Materassi and G. Innocenti, “Topological identification in networks of dynamical systems,” IEEE Trans. Automatic Control, vol. 55, no. 8, pp. 1860–1871, 2010.
  • [4] A. Chiuso and G. Pillonetto, “A Bayesian approach to sparse dynamic network identification,” Automatica, vol. 48, no. 8, pp. 1553–1565, 2012.
  • [5] S. Shi, G. Bottegal, and P. M. J. Van den Hof, “Bayesian topology identification of linear dynamic networks,” in Proc. 2019 European Control Conference, Naples, Italy, 2019, pp. 2814–2819.
  • [6] A. Haber and M. Verhaegen, “Subspace identification of large-scale interconnected systems,” IEEE Transactions on Automatic Control, vol. 59, no. 10, pp. 2754–2759, 2014.
  • [7] P. Torres, J. W. van Wingerden, and M. Verhaegen, “Hierarchical PO-MOESP subspace identification for directed acyclic graphs,” Intern. J. Control, vol. 88, no. 1, pp. 123–137, 2015.
  • [8] H. H. M. Weerts, P. M. J. Van den Hof, and A. G. Dankers, “Prediction error identification of linear dynamic networks with rank-reduced noise,” Automatica, vol. 98, pp. 256–268, December 2018.
  • [9] ——, “Identification of dynamic networks operating in the presence of algebraic loops,” in Proc. 55th IEEE Conf. on Decision and Control (CDC). IEEE, 2016, pp. 4606–4611.
  • [10] M. Zorzi and A. Chiuso, “Sparse plus low rank network identification: a nonparametric approach,” Automatica, vol. 76, pp. 355–366, 2017.
  • [11] A. S. Bazanella, M. Gevers, J. M. Hendrickx, and A. Parraga, “Identifiability of dynamical networks: which nodes need to be measured?” in Proc. 56th IEEE Conference on Decision and Control (CDC), 2017, pp. 5870–5875.
  • [12] J. Gonçalves and S. Warnick, “Necessary and sufficient conditions for dynamical structure reconstruction of LTI networks,” IEEE Trans. Automatic Control, vol. 53, no. 7, pp. 1670–1674, Aug. 2008.
  • [13] H. H. M. Weerts, P. M. J. Van den Hof, and A. G. Dankers, “Identifiability of linear dynamic networks,” Automatica, vol. 89, pp. 247–258, March 2018.
  • [14] J. Hendrickx, M. Gevers, and A. Bazanella, “Identifiability of dynamical networks with partial node measurements,” IEEE Trans. Autom. Control, vol. 64, no. 6, pp. 2240–2253, 2019.
  • [15] P. M. J. Van den Hof, A. G. Dankers, P. S. C. Heuberger, and X. Bombois, “Identification of dynamic models in complex networks with prediction error methods - basic methods for consistent module estimates,” Automatica, vol. 49, no. 10, pp. 2994–3006, 2013.
  • [16] A. G. Dankers, P. M. J. Van den Hof, X. Bombois, and P. S. C. Heuberger, “Errors-in-variables identification in dynamic networks – consistency results for an instrumental variable approach,” Automatica, vol. 62, pp. 39–50, 2015.
  • [17] J. Linder and M. Enqvist, “Identification of systems with unknown inputs using indirect input measurements,” International Journal of Control, vol. 90, no. 4, pp. 729–745, 2017.
  • [18] K. R. Ramaswamy, G. Bottegal, and P. M. J. Van den Hof, “Local module identification in dynamic networks using regularized kernel-based methods,” in Proc. 57th IEEE Conf. on Decision and Control (CDC), Miami Beach, FL, 2018, pp. 4713–4718.
  • [19] N. Everitt, G. Bottegal, and H. Hjalmarsson, “An empirical Bayes approach to identification of modules in dynamic networks,” Automatica, vol. 91, pp. 144–151, 2018.
  • [20] M. Gevers, A. Bazanella, and G. Vian da Silva, “A practical method for the consistent identification of a module in a dynamical network,” IFAC-PapersOnLine, vol. 51-15, pp. 862–867, 2018.
  • [21] L. Ljung, System Identification: Theory for the User. Englewood Cliffs, NJ: Prentice-Hall, 1999.
  • [22] D. Materassi and M. Salapaka, “Identification of network components in presence of unobserved nodes,” in Proc. 2015 IEEE 54th Conf. Decision and Control, Osaka, Japan, 2015, pp. 1563–1568.
  • [23] A. G. Dankers, P. M. J. Van den Hof, P. S. C. Heuberger, and X. Bombois, “Identification of dynamic models in complex networks with prediction error methods: Predictor input selection,” IEEE Trans. on Automatic Control, vol. 61, no. 4, pp. 937–952, 2016.
  • [24] D. Materassi and M. V. Salapaka, “Signal selection for estimation and identification in networks of dynamic systems: a graphical model approach,” IEEE Trans. Automatic Control, vol. 65, no. 10, pp. 4138–4153, October 2020.
  • [25] P. M. J. Van den Hof, A. G. Dankers, and H. H. M. Weerts, “From closed-loop identification to dynamic networks: generalization of the direct method,” in Proc. 56th IEEE Conf. on Decision and Control (CDC). Melbourne, Australia: IEEE, 2017, pp. 5845–5850.
  • [26] A. G. Dankers, P. M. J. Van den Hof, D. Materassi, and H. H. M. Weerts, “Conditions for handling confounding variables in dynamic networks,” IFAC-PapersOnLine, vol. 50, no. 1, pp. 3983–3988, 2017, proc. 20th IFAC World Congress.
  • [27] P. M. J. Van den Hof, K. R. Ramaswamy, A. G. Dankers, and G. Bottegal, “Local module identification in dynamic networks with correlated noise: the full input case,” in Proc. 58th IEEE Conf. on Decision and Control (CDC), 2019, pp. 5494–5499.
  • [28] A. G. Dankers, “System identification in dynamic networks,” PhD dissertation, Delft University of Technology, 2014.
  • [29] J. Pearl, Causality: Models, Reasoning, and Inference. New York: Cambridge University Press, 2000.
  • [30] H. H. M. Weerts, J. Linder, M. Enqvist, and P. M. J. Van den Hof, “Abstractions of linear dynamic networks for input selection in local module identification,” Automatica, vol. 117, 2020.
  • [31] M. Gevers and A. S. Bazanella, “Identification in dynamic networks: identifiability and experiment design issues,” in Proc. 54th IEEE Conference on Decision and Control (CDC). IEEE, 2015, pp. 4005–4010.
  • [32] H. J. van Waarde, P. Tesi, and M. K. Camlibel, “Topological conditions for identifiability of dynamical networks with partial node measurements,” IFAC-PapersOnLine, vol. 51-23, pp. 319–324, 2018, proc. 7th IFAC Workshop on Distrib. Estim. and Control in Networked Systems.
  • [33] H. H. M. Weerts, P. M. J. Van den Hof, and A. G. Dankers, “Single module identifiability in linear dynamic networks,” in Proc. 57th IEEE Conf. on Decision and Control (CDC). Miami Beach, FL: IEEE, 2018, pp. 4725–4730.
  • [34] X. Cheng, S. Shi, and P. M. J. Van den Hof, “Allocation of excitation signals for generic identifiability of dynamic networks,” in Proc. 58th IEEE Conf. on Decision and Control (CDC), Nice, France, 2019, pp. 5507–5512.
  • [35] P. M. J. Van den Hof and K. R. Ramaswamy, “Path-based data-informativity conditions for single module identification in dynamic networks,” in Proc. 59th IEEE Conf. Decision and Control, Jeju Island, Republic of Korea, 2020, to appear.
  • [36] K. R. Ramaswamy, P. M. J. Van den Hof, and A. G. Dankers, “Generalized sensing and actuation schemes for local module identification in dynamic networks,” in Proc. 58th IEEE Conf. on Decision and Control (CDC), 2019, pp. 5519–5524.
  • [37] D. Kroening and O. Strichman, Decision Procedures - An Algorithmic Point of View, 2nd ed. Springer, 2016.
  • [38] H. H. M. Weerts, M. Galrinho, G. Bottegal, H. Hjalmarsson, and P. M. J. Van den Hof, “A sequential least squares algorithm for ARMAX dynamic network identification,” IFAC-PapersOnLine, vol. 51-15, pp. 844–849, 2018, proc. 18th IFAC Symp. System Identification.
  • [39] M. Galrinho, C. R. Rojas, and H. Hjalmarsson, “Parametric identification using weighted null-space fitting,” IEEE Trans. Automatic Control, vol. 64, no. 7, pp. 2798–2813, 2019.
  • [40] D. Youla, “On the factorization of rational matrices,” IRE Trans. Information Theory, vol. 7, pp. 172–189, 1961.
  • [41] N. Woodbury, A. Dankers, and S. Warnick, “Dynamic networks: representations, abstractions and well-posedness,” in Proc. 57th IEEE Conf. on Decision and Control, Miami Beach, FL, 2018, pp. 4719–4724.
Karthik Raghavan Ramaswamy was born in 1989. He received his Bachelor's degree in Electrical and Electronics Engineering (with Distinction) in 2011 from Anna University, and his Master's degree in Systems and Control (with great appreciation) from TU Eindhoven in 2017. From 2011 to 2015 he was a Control & Automation engineer at Larsen & Toubro. Currently, he is a PhD researcher with the Control Systems research group, Department of Electrical Engineering, TU Eindhoven, The Netherlands. His research interests are in the area of data-driven modeling, dynamic network identification and machine learning.
Paul Van den Hof received the M.Sc. and Ph.D. degrees in electrical engineering from Eindhoven University of Technology, Eindhoven, The Netherlands, in 1982 and 1989, respectively. In 1986 he moved to Delft University of Technology, where he was appointed as Full Professor in 1999. From 2003 to 2011, he was founding co-director of the Delft Center for Systems and Control (DCSC). As of 2011, he is a Full Professor in the Electrical Engineering Department, Eindhoven University of Technology. His research interests include data-driven modeling, identification for control, dynamic network identification, and model-based control and optimization, with applications in industrial process control systems and high-tech systems. He holds an ERC Advanced Research grant for a research project on identification in dynamic networks. Paul Van den Hof is an IFAC Fellow, an IEEE Fellow, an Honorary Member of the Hungarian Academy of Sciences, and an IFAC Advisor. He has been a member of the IFAC Council (1999–2005, 2017–2020), the Board of Governors of the IEEE Control Systems Society (2003–2005), and an Associate Editor and Editor of Automatica (1992–2005). In the triennium 2017–2020 he was Vice-President of IFAC.