
A local direct method for module identification in dynamic networks with correlated noise

Karthik R. Ramaswamy, Paul M.J. Van den Hof

The authors are with the Department of Electrical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands ({k.r.ramaswamy, p.m.j.vandenhof}@tue.nl). Paper submitted for publication 2 August 2019; revised 16 January 2020 and 10 July 2020; final version 31 October 2020. This work has received funding from the European Research Council (ERC), Advanced Research Grant SYSDYNET, under the European Union's Horizon 2020 research and innovation programme (grant agreement No 694504).
Abstract

The identification of local modules in dynamic networks with known topology has recently been addressed by formulating conditions for arriving at consistent estimates of the module dynamics, under the assumption of having disturbances that are uncorrelated over the different nodes. The conditions typically reflect the selection of a set of node signals that are taken as predictor inputs in a MISO identification setup. In this paper an extension is made to arrive at an identification setup for the situation that process noises on the different node signals can be correlated with each other. In this situation the local module may need to be embedded in a MIMO identification setup for arriving at a consistent estimate with maximum likelihood properties. This requires the proper treatment of confounding variables. The result is a set of algorithms that, based on the given network topology and disturbance correlation structure, selects an appropriate set of node signals as predictor inputs and outputs in a MISO or MIMO identification setup. Three algorithms are presented that differ in their approach of selecting measured node signals. Either a maximum or a minimum number of measured node signals can be considered, as well as a preselected set of measured nodes.

Index Terms:
Closed-loop identification, dynamic networks, correlated noise, system identification, predictor input and predicted output selection.

I Introduction

In recent years increasing attention has been given to the development of new tools for the identification of large-scale interconnected systems, also known as dynamic networks. These networks are typically thought of as a set of measurable signals (the node signals) interconnected through linear dynamic systems (the modules), possibly driven by external excitations (the reference signals). Among the literature on this topic, we can distinguish three main categories of research. The first one focuses on identifying the topology of the dynamic network [1, 2, 3, 4, 5]. The second category concerns identification of the full network dynamics [6, 7, 8, 9, 10, 11], including aspects of identifiability, particularly addressed in [12, 13, 14], while the third one deals with identification of a specific component (module) of the network, assuming that the network topology is known (the so-called local module identification), see [15, 16, 17, 18, 19, 20].

In this paper we will further expand the work on the local module identification problem. In [15], the classical direct method [21] for closed-loop identification has been generalized to a dynamic network framework using a MISO identification setup. Consistent estimates of the target module can be obtained when the network topology is known and all the node signals in the MISO identification setup are measured. The work has been extended in [22, 23, 24] towards the situation where some node signals might be non-measurable, leading to an additional predictor input selection problem. A similar setup has also been studied in [18], where an approach has been presented based on empirical Bayesian methods to reduce the variance of the target module estimates. In [16] and [19], dynamic networks having node measurements corrupted by sensor noise have been studied, and informative experiments for consistent local module estimates have been addressed in [20].

A standing assumption in the aforementioned works [15], [18], [20], [23] is that the process noises entering the nodes of the dynamic network are uncorrelated with each other. This assumption facilitates the analysis and the development of methods for local module identification, reaching consistent module estimates using the direct method. However, when process noises are correlated over the nodes, the consistency results for the considered MISO direct method collapse. In this situation it is necessary to consider also the noise topology or disturbance correlation structure, when selecting an appropriate identification setup. Even though the indirect and two-stage methods in [16, 20] can handle the situation of correlated noise and deliver consistent estimates, the obtained estimates will not have minimum variance.

In this paper we particularly consider the situation of having dynamic networks with disturbance signals on different nodes that possibly are correlated, while our target moves from consistency only, to also minimum variance (or Maximum Likelihood (ML)) properties of the obtained local module estimates. We will assume that the topology of the network is known, as well as the (Boolean) correlation structure of the noise disturbances, i.e. the zero-elements in the spectral density matrix of the noise. While one could use techniques for full network identification (e.g., [8]), our aim is to develop a method that uses only local information. In this way, we avoid (i) the need to collect node measurements that are “far away” from the target module, and (ii) the need to identify unnecessary modules that would come with the price of higher variance in the estimates.

Using the reasoning first introduced in [25], we build a constructive procedure that, choosing a limited number of predictor inputs and predicted outputs, builds an identification setup that guarantees maximum likelihood (ML) properties (and thus asymptotic minimum variance) when applying a direct prediction error identification method. In this situation we have to deal with so-called confounding variables (see e.g. [25], [26]), that is, unmeasured variables that directly or indirectly influence both the predicted output and the predictor inputs, and lead to lack of consistency. The effect of confounding variables will be mitigated by extending the number of predictor inputs and/or predicted outputs in the identification setup, thus including more measured node signals in the identification. Preliminary results for the particular "full input" case have been presented in [27]. Here we generalize that reasoning to different node selection schemes, and provide a generally applicable theory that is independent of the particular node selection scheme that is used.

This paper is organized as follows. In Section II, the dynamic network setup is defined. Section III provides a summary of available results from the existing literature on local module identification related to the context of this paper. Next, important concepts and notations used in this paper are defined in Section IV, while the MIMO identification setup and main results are presented in subsequent sections. Sections VII-IX provide algorithms and illustrative examples for three different ways of selecting input and output node signals: the full input case, the minimum input case, and the user selection case. This is followed by Conclusions. The technical proofs of all results are collected in the Appendix.

II Network and identification setup

Following the basic setup of [15], a dynamic network is built up out of $L$ scalar internal variables or nodes $w_j$, $j=1,\ldots,L$, and $K$ external variables $r_k$, $k=1,\ldots,K$. Each internal variable is described as:

$$w_j(t) = \sum_{\substack{l=1 \\ l\neq j}}^{L} G_{jl}(q)\, w_l(t) + u_j(t) + v_j(t) \qquad (1)$$

where $q^{-1}$ is the delay operator, i.e. $q^{-1}w_j(t) = w_j(t-1)$;

  • $G_{jl}$ are proper rational transfer functions, referred to as modules;

  • There are no self-loops in the network, i.e. nodes are not directly connected to themselves ($G_{jj} = 0$);

  • $u_j(t)$ is generated by the external variables $r_k(t)$ that can be directly manipulated by the user, and is given by $u_j(t) = \sum_{k=1}^{K} R_{jk} r_k(t)$, where $R_{jk}$ are stable, proper rational transfer functions;

  • $v_j$ is process noise, where the vector process $v = [v_1 \cdots v_L]^T$ is modelled as a stationary stochastic process with rational spectral density $\Phi_v(\omega)$, such that there exists a white noise process $e := [e_1 \cdots e_L]^T$, with covariance matrix $\Lambda > 0$, such that $v(t) = H(q)e(t)$, where $H(q)$ is square, stable, monic and minimum-phase. The situation of correlated noise, as considered in this paper, refers to $\Phi_v(\omega)$ and $H$ being non-diagonal, while we assume that we know a priori which entries of $\Phi_v$ are nonzero.

We will assume that the data satisfy the standard regularity conditions that are required for convergence results of the prediction error identification method (see [21], page 249); this includes the property that $e(t)$ has bounded moments of order higher than 4.

When combining the $L$ node signals we arrive at the full network expression

$$\begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_L \end{bmatrix} = \begin{bmatrix} 0 & G_{12} & \cdots & G_{1L} \\ G_{21} & 0 & \ddots & \vdots \\ \vdots & \ddots & \ddots & G_{L-1\,L} \\ G_{L1} & \cdots & G_{L\,L-1} & 0 \end{bmatrix} \begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_L \end{bmatrix} + \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_L \end{bmatrix} + H \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_L \end{bmatrix}$$

which results in the matrix equation:

$$w = Gw + Rr + He. \qquad (2)$$

It is assumed that the dynamic network is stable, i.e. $(I-G)^{-1}$ is stable, and well posed (see [28] for details). The representation (2) is an extension of the dynamic structure function representation [12]. The identification problem to be considered is the problem of identifying one particular module $G_{ji}(q)$ on the basis of a selection of measured variables $w$, and possibly $r$.
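To make the data-generating equation (2) concrete: once all modules contain a delay, the implicit loop $w = Gw + \ldots$ becomes an explicit recursion that can be simulated directly. The following Python sketch is illustrative only; the three-node topology, the strictly delayed FIR modules and the static noise mixing are hypothetical choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
N, L = 2000, 3

# Hypothetical strictly delayed FIR modules: w_j(t) depends on w_l(t-1).
G1 = np.array([[0.0, 0.5, 0.0],
               [0.3, 0.0, 0.2],
               [0.0, 0.4, 0.0]])

H0 = np.array([[1.0, 0.6, 0.0],   # static non-diagonal noise mixing:
               [0.0, 1.0, 0.0],   # v_1 and v_2 are correlated (H_12 != 0)
               [0.0, 0.0, 1.0]])

e = rng.standard_normal((N, L))   # white innovations e(t)
v = e @ H0.T                      # v(t) = H(q) e(t), here H(q) = H0
r = rng.standard_normal((N, L))   # external excitation, R(q) = I

w = np.zeros((N, L))
for t in range(1, N):
    # w(t) = G(q) w(t) + R(q) r(t) + H(q) e(t), with delayed modules
    w[t] = G1 @ w[t - 1] + r[t] + v[t]
```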

Let us define $\mathcal{N}_j$ as the set of node indices $k$ such that $G_{jk}\neq 0$, i.e. the node signals in $\mathcal{N}_j$ are the $w$-in-neighbors of the node signal $w_j$. Let $\mathcal{D}_j$ denote the set of indices of the internal variables that are chosen as predictor inputs. It seems most obvious to have $\mathcal{D}_j \subset \mathcal{N}_j$, but this is not necessary, as will be shown later in this paper. Let $\mathcal{V}_j$ denote the set of node indices $k$ such that $v_k$ has a path to $w_j$. Let $\mathcal{Z}_j$ denote the set of indices not in $\{j\}\cup\mathcal{D}_j$, i.e. $\mathcal{Z}_j = \{1,\ldots,L\}\setminus\{\{j\}\cup\mathcal{D}_j\}$, reflecting the node signals that are discarded in the prediction/identification. Let $w_{\mathcal{D}}$ denote the vector $[w_{k_1} \cdots w_{k_n}]^T$, where $\{k_1,\ldots,k_n\} = \mathcal{D}_j$, and let $u_{\mathcal{D}}$ denote the vector $[u_{k_1} \cdots u_{k_n}]^T$ for the same index set. The vectors $w_{\mathcal{Z}}$, $v_{\mathcal{D}}$, $v_{\mathcal{Z}}$ and $u_{\mathcal{Z}}$ are defined analogously. The ordering of the elements of $w_{\mathcal{D}}$, $v_{\mathcal{D}}$ and $u_{\mathcal{D}}$ is not important, as long as it is the same for all vectors. The transfer function matrix between $w_{\mathcal{D}}$ and $w_j$ is denoted $G_{j\mathcal{D}}$; the other transfer function matrices are defined analogously.

TABLE I: Notation of variables and sets.

$G$ : Network matrix with modules
$H$ : Network noise model
$G_{ji}$ : Target module with input $w_i$ and output $w_j$
$w_j$ : Node signal $w_j$, output of the target module
$w_i$ : Node signal $w_i$, input of the target module
$\mathcal{Y}$ : Set of indices of nodes that appear in the vector of predicted outputs
$\mathcal{D}$ : Set of indices of nodes that appear in the vector of predictor inputs for predicted outputs $w_{\mathcal{Y}}$
$\mathcal{D}_j$ : Set of indices of nodes that appear in the vector of predictor inputs for prediction of node $w_j$
$w_o$ : Output node signal $w_j$ if it is not in the set $w_{\mathcal{Q}}$
$\mathcal{Q}$ : Set of indices of nodes that appear both in the predicted output and in the predictor input
$\mathcal{U}$ : Set of indices of nodes that only appear as predictor input: $\mathcal{U} = \mathcal{D}\backslash\mathcal{Q}$
$\mathcal{A}$ : Set of indices of nodes that only appear as predictor input and do not have any confounding variable effect: $\mathcal{A}\subseteq\mathcal{U}$
$\mathcal{B}$ : Set of indices of nodes that only appear as predictor input: $\mathcal{B} = \mathcal{U}\backslash\mathcal{A}$
$\mathcal{Z}$ : Set of indices of nodes that are removed (immersed) from the network when predicting $w_{\mathcal{Y}}$
$\mathcal{Z}_j$ : Set of indices of nodes that are removed (immersed) from the network when predicting $w_j$
$v_k$ : Disturbance signal on node $w_k$
$\mathcal{N}_j$ : Index set of nodes that are $w$-in-neighbors of $w_j$
$e$ : (White noise) innovation of the noise process $v$
$\mathcal{L}$ : Index set of all node signals: $[1, L]$
$\bar{G}$ : Network matrix of the immersed and transformed network (8)
$\xi$ : (White noise) innovation of the noise process in the immersed and transformed network (8)

To illustrate the notation, consider the network sketched in Figure 1, and let module $G_{21}^{0}$ be the target module for identification.


Figure 1: Example network with target module $G_{21}^{0}$ (in green).

Then $j=2$, $i=1$; $\mathcal{N}_j = \{1,4\}$. If we choose the set of predictor inputs as $\mathcal{D}_j = \mathcal{N}_j$, then the set of remaining (nonmeasured) signals becomes $\mathcal{Z}_j = \{3,5,6\}$.
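This set bookkeeping is mechanical; a small Python sketch reproducing the numbers of this example (only $\mathcal{N}_2$ from the text is needed, since the full topology of Figure 1 is not reproduced here):

```python
# Index bookkeeping for the example of Figure 1 (target module G_21, so j = 2).
L_nodes = {1, 2, 3, 4, 5, 6}   # node indices of the 6-node example network
N_2 = {1, 4}                   # w-in-neighbors of w_2, as given in the text

j = 2
D_j = set(N_2)                  # predictor inputs chosen as D_j = N_j
Z_j = L_nodes - ({j} | D_j)     # discarded (nonmeasured) node signals

print(D_j, Z_j)                 # {1, 4} {3, 5, 6}
```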

With this notation, the network equation (2) is rewritten as:

$$\begin{bmatrix} w_j \\ w_{\mathcal{D}} \\ w_{\mathcal{Z}} \end{bmatrix} = \begin{bmatrix} 0 & G_{j\mathcal{D}} & G_{j\mathcal{Z}} \\ G_{\mathcal{D}j} & G_{\mathcal{D}\mathcal{D}} & G_{\mathcal{D}\mathcal{Z}} \\ G_{\mathcal{Z}j} & G_{\mathcal{Z}\mathcal{D}} & G_{\mathcal{Z}\mathcal{Z}} \end{bmatrix} \begin{bmatrix} w_j \\ w_{\mathcal{D}} \\ w_{\mathcal{Z}} \end{bmatrix} + \begin{bmatrix} v_j \\ v_{\mathcal{D}} \\ v_{\mathcal{Z}} \end{bmatrix} + \begin{bmatrix} u_j \\ u_{\mathcal{D}} \\ u_{\mathcal{Z}} \end{bmatrix}, \qquad (3)$$

where $G_{\mathcal{D}\mathcal{D}}$ and $G_{\mathcal{Z}\mathcal{Z}}$ have zeros on the diagonal.

For identification of module $G_{ji}$ we select $\mathcal{D}_j$ such that $i\in\mathcal{D}_j$, and subsequently estimate a multiple-input single-output model for the transfer functions in $G_{j\mathcal{D}}$, by considering the one-step-ahead predictor $\hat{w}_j(t|t-1;\theta) := \bar{\mathbb{E}}\{w_j(t)\,|\,w_j^{t-1}, w_{\mathcal{D}_j}^{t};\theta\}$ [21], where $\bar{\mathbb{E}}$ refers to $\lim_{N\rightarrow\infty}\frac{1}{N}\sum_{t=1}^{N}\mathbb{E}$, and $w_j^{\ell}$ and $w_{\mathcal{D}_j}^{\ell}$ refer to signal samples $w_j(\tau)$ and $w_k(\tau)$, $k\in\mathcal{D}_j$, respectively, for all $\tau\leq\ell$. The resulting prediction error $\varepsilon_j(t,\theta) = w_j(t) - \hat{w}_j(t|t-1;\theta)$ then becomes:

$$\varepsilon_j(t,\theta) = H_j(\theta)^{-1}\Big[w_j - \sum_{k\in\mathcal{D}_j} G_{jk}(\theta)\, w_k - u_j\Big] \qquad (4)$$

where the arguments $q$ and $t$ have been dropped for notational clarity. The parameterized transfer functions $G_{jk}(\theta)$, $k\in\mathcal{D}_j$, and $H_j(\theta)$ are estimated by minimizing the sum of squared (prediction) errors: $V_j(\theta) = \frac{1}{N}\sum_{t=0}^{N-1}\varepsilon_j^2(t,\theta)$, where $N$ is the length of the data set. We refer to this identification method as the direct method [15].
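As an illustration of this MISO direct method, the following hedged Python sketch estimates the modules $G_{jk}$ for one node under simplifying (hypothetical) assumptions: strictly delayed FIR modules and a fixed noise model $H_j = 1$, in which case minimizing $V_j(\theta)$ reduces to linear least squares.

```python
import numpy as np

def direct_method_fir(wj, w_inputs, nb=5):
    """MISO direct method sketch for one node: fit FIR models G_jk(theta)
    with a one-sample delay and H_j = 1, by minimizing the sum of squared
    prediction errors V_j = (1/N) sum eps_j^2; with H_j fixed to 1 the
    least-squares solution is the minimizer."""
    N = len(wj)
    cols = []
    for wk in w_inputs:   # regressors wk(t-1), ..., wk(t-nb) per input
        cols += [np.concatenate([np.zeros(d), wk[:N - d]])
                 for d in range(1, nb + 1)]
    Phi = np.column_stack(cols)
    theta, *_ = np.linalg.lstsq(Phi, wj, rcond=None)
    eps = wj - Phi @ theta            # prediction errors eps_j(t, theta)
    return theta, np.mean(eps**2)     # FIR coefficients and criterion V_j
```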

III Available results and problem specification

The following results are available from previous work:

  • When $\mathcal{D}_j$ is chosen equal to $\mathcal{N}_j$ and the noise $v_j$ is uncorrelated with all $v_k$, $k\in\mathcal{V}_j$, then $G_{ji}$ can be consistently estimated in a MISO setup, provided that there is enough excitation in the predictor input signals, see [15].

  • When $\mathcal{D}_j$ is a subset of $\mathcal{N}_j$ and the disturbances are uncorrelated, confounding variables (an unmeasured variable that has paths to both the input and the output of an estimation problem [29]) can occur in the estimation problem, and these have to be taken into account in the choice of $\mathcal{D}_j$ in order to arrive at consistent estimates of $G_{ji}$, see [23].

  • In [26], relaxed conditions for the selection of $\mathcal{D}_j$ have been formulated, while still staying in the context of MISO identification with the noise spectrum of $v$ ($\Phi_v$) being diagonal. This is particularly done by choosing additional predictor input signals that are not in $\mathcal{N}_j$, i.e. that are not in-neighbors of the output $w_j$ of the target module.

  • For non-diagonal $\Phi_v$, an indirect/two-stage identification method can be used to arrive at consistent estimates of $G_{ji}$ [15, 23, 20]. However, the drawback of these methods is that they do not allow for a maximum likelihood analysis, i.e. they will not lead to minimum variance results.

  • This latter argument also holds for the method in [22, 24], where Wiener-filter estimates are combined to provide local module estimates, and a diagonal $\Phi_v$ is considered.

In this paper, we go beyond consistency properties, and address the following problem: how to identify a single module in a dynamic network for the situation that the disturbance signals can be correlated, i.e. $\Phi_v$ not necessarily being diagonal, such that the estimate is consistent and asymptotically has Maximum Likelihood, and thus also minimum variance, properties.


Figure 2: Two-node example network from [25] with $v_1$ and $v_2$ dynamically correlated and $e_1$, $e_2$ white noise processes.

Addressing this problem requires a more careful treatment and modelling of the noise that is acting on the different node signals. This can be illustrated through a simple example presented in [25], where a two-node network is considered, as given in Figure 2, with $v_1$ and $v_2$ being dynamically correlated. In this case, a SISO identification using the direct method with input $w_1$ and output $w_2$ will lead to a biased estimate of $G_{21}$ because of the unmodelled correlation of the disturbance signals on $w_1$ and $w_2$ (in this particular example the bias is caused by the presence of $H_{21}$); for an analysis, see [25]. If both node signals $w_1$ and $w_2$ are predicted as outputs, then the correlation between the disturbance signals can be incorporated in a $2\times 2$ non-diagonal noise model, thus leading to an unbiased estimate of $G_{21}$. In this way bias due to correlation in the noise signals can be avoided by predicting additional outputs besides the output of the target module. This leads to the following two suggestions:

  • confounding variables can be dealt with by modelling correlated disturbances on the node signals, and

  • this can be done by moving from a MISO identification setup to a MIMO setup.

These suggestions are explored in the current paper. Next we will present an example to further illustrate the problem.
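The bias mechanism in the two-node example of Figure 2 can be reproduced numerically. In the sketch below (illustrative; the first-order dynamics $G_{21} = g_0 q^{-1}$ and $H_{21} = h\,q^{-1}$ are our own choices, not taken from [25]), the SISO least-squares estimate converges to $g_0 + h/2$ rather than $g_0$:

```python
import numpy as np

rng = np.random.default_rng(1)
N, g0, h = 100_000, 0.8, 0.5

e1 = rng.standard_normal(N)
e2 = rng.standard_normal(N)
v1 = e1
v2 = e2 + h * np.concatenate([[0.0], e1[:-1]])   # H21 = h q^{-1}: dynamic correlation

r1 = rng.standard_normal(N)                       # excitation on node 1
w1 = r1 + v1
w2 = g0 * np.concatenate([[0.0], w1[:-1]]) + v2   # G21 = g0 q^{-1}

x = np.concatenate([[0.0], w1[:-1]])              # regressor w1(t-1)
g_hat = (x @ w2) / (x @ x)                        # SISO direct (least-squares) estimate
print(g_hat)   # tends to g0 + h/2 = 1.05, not g0, since var(w1) = 2 here
```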

Example 1

Consider the network sketched in Figure 1, and let module $G_{21}^{0}$ be the target module for identification. If the node signals $w_1$, $w_2$ and $w_4$ can be measured, then a two-input one-output model with inputs $w_1, w_4$ and output $w_2$ can be considered. This can lead to consistent estimates of $G_{21}^{0}$ and $G_{24}^{0}$, provided that the disturbance signal $v_2$ is uncorrelated with all other disturbance signals. However, if e.g. $v_4$ and $v_2$ are dynamically correlated, implying that a noise model $H$ of the two-dimensional noise process is non-diagonal, then this approach results in a biased estimate. A solution is then to include $w_4$ in the set of predicted outputs, and to add node signal $w_3$ as a predictor input for $w_4$. We then combine predicting $w_2$ on the basis of $(w_1, w_4)$ with predicting $w_4$ on the basis of $w_3$. The correlation between $v_2$ and $v_4$ is then covered by modelling a $2\times 2$ non-diagonal noise model of the joint process $(v_2, v_4)$.

In the next sections we will formalize the procedure as sketched in Example 1 for general networks.

IV Concepts and notation

In line with [29] we define the notion of confounding variable.

Definition 1 (confounding variable)

Consider a dynamic network defined by

$$w = Gw + He + u \qquad (5)$$

with $e$ a white noise process, and consider the graph related to this network, with node signals $w$ and $e$. Let $w_{\mathcal{X}}$ and $w_{\mathcal{Y}}$ be two subsets of measured node signals in $w$, and let $w_{\mathcal{Z}}$ be the set of unmeasured node signals in $w$. Then a noise component $e_{\ell}$ in $e$ is a confounding variable for the estimation problem $w_{\mathcal{X}} \rightarrow w_{\mathcal{Y}}$ if in the graph there exist simultaneous paths from $e_{\ell}$ to node signals $w_k$, $k\in\mathcal{X}$, and $w_n$, $n\in\mathcal{Y}$, while these paths are either direct or only run through nodes that are in $w_{\mathcal{Z}}$. (A simultaneous path from $e_1$ to node signals $w_1$ and $w_2$ means that there exists a path from $e_1$ to $w_1$ as well as from $e_1$ to $w_2$; a direct path from $e_1$ to node signal $w_1$ is a path that does not pass through nodes in $w$.) $\Box$

We will denote $w_{\mathcal{Y}}$ as the node signals in $w$ that serve as predicted outputs, and $w_{\mathcal{D}}$ as the node signals in $w$ that serve as predictor inputs. Next we decompose $w_{\mathcal{Y}}$ and $w_{\mathcal{D}}$ into disjoint sets according to $\mathcal{Y} = \mathcal{Q}\cup\{o\}$; $\mathcal{D} = \mathcal{Q}\cup\mathcal{U}$, where $w_{\mathcal{Q}}$ are the node signals that are common to $w_{\mathcal{Y}}$ and $w_{\mathcal{D}}$; $w_o$ is the output $w_j$ of the target module (if $j\in\mathcal{Q}$ then $\{o\}$ is void); and $w_{\mathcal{U}}$ are the node signals that are only in $w_{\mathcal{D}}$. In this situation the measured nodes will be $w_{\mathcal{D}\cup\mathcal{Y}}$, and the unmeasured nodes $w_{\mathcal{Z}}$ will be determined by the set $\mathcal{Z} = \mathcal{L}\backslash\{\mathcal{D}\cup\mathcal{Y}\}$, where $\mathcal{L} = \{1,2,\cdots,L\}$.


Figure 3: A simple network with 3 nodes $w_1$, $w_2$, $w_3$ and unmeasured noise sources $e_1$, $e_2$ and $e_3$. $G_{12}$ is the target module to be identified.

There exist two types of confounding variables, namely direct and indirect confounding variables. For direct confounding variables the simultaneous paths mentioned in the definition are both direct paths, while in all other cases we refer to the confounding variables as indirect confounding variables. For example, in the network shown in Figure 3 with $\mathcal{D} = \{2\}$, $\mathcal{Y} = \{1\}$ and $\mathcal{Z} = \{3\}$, for the estimation problem $w_2\rightarrow w_1$, $e_2$ is a direct confounding variable since it has a simultaneous path to $w_1$ and $w_2$ where both paths are direct. Meanwhile $e_3$ is an indirect confounding variable since it has a simultaneous path to $w_1$ and $w_2$ where one of the paths is an unmeasured path, i.e. a path that runs through nodes in $w_{\mathcal{Z}}$ only. (Analogously, we can define unmeasured loops through a node $w_k$.)
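Definition 1, together with the direct/indirect distinction above, can be checked mechanically by a graph search: from each noise source, collect the nodes reachable along paths whose intermediate $w$-nodes all lie in the unmeasured set $\mathcal{Z}$. The Python sketch below is illustrative; the adjacency list is our reading of Figure 3 (each $e_k$ enters $w_k$, $e_2$ also enters $w_1$ directly, and $w_3$ feeds both $w_1$ and $w_2$), which is an assumption since the figure itself is not reproduced here.

```python
from collections import deque

def reach_via_unmeasured(adj, src, Z):
    """Nodes reachable from noise source src along paths whose intermediate
    w-nodes all lie in the unmeasured set Z (cf. Definition 1)."""
    reached, seen, queue = set(), {src}, deque([src])
    while queue:
        x = queue.popleft()
        for y in adj.get(x, []):
            if y in seen:
                continue
            seen.add(y)
            reached.add(y)
            if y in Z:            # may only continue through unmeasured nodes
                queue.append(y)
    return reached

def confounders(adj, noises, X, Y, Z):
    """Noise sources that reach both some node in X and some node in Y."""
    return [e for e in noises
            if (r := reach_via_unmeasured(adj, e, Z)) & X and r & Y]

# Figure 3 as we read it (assumption): e_k enters w_k; e_2 also enters w_1
# directly (H_12 != 0), and the unmeasured w_3 feeds both w_1 and w_2.
adj = {'e1': ['w1'], 'e2': ['w2', 'w1'], 'e3': ['w3'],
       'w3': ['w1', 'w2'], 'w2': ['w1']}
print(confounders(adj, ['e1', 'e2', 'e3'], X={'w2'}, Y={'w1'}, Z={'w3'}))
# -> ['e2', 'e3']  (direct and indirect confounding variable, respectively)
```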

Remark 1

Confounding variables are defined in accordance with their use in [26], on the basis of a network description as in (5). In this definition, absence of confounding variables still allows for unmeasured signals that create correlation between the inputs and outputs of an estimation problem, in particular if the white noise signals in $e$ are statically correlated, i.e. $cov(e)$ being non-diagonal. It will appear that this type of correlation does not hinder our identification results, as analysed in Section VI-C.

V Main results - Line of reasoning

On the basis of the decomposition of node signals as defined in the previous section we are going to represent the system’s equations (5) in the following structured form:

$$\begin{bmatrix} w_{\mathcal{Q}} \\ w_o \\ w_{\mathcal{U}} \\ w_{\mathcal{Z}} \end{bmatrix} = \begin{bmatrix} G_{\mathcal{Q}\mathcal{Q}} & G_{\mathcal{Q}o} & G_{\mathcal{Q}\mathcal{U}} & G_{\mathcal{Q}\mathcal{Z}} \\ G_{o\mathcal{Q}} & G_{oo} & G_{o\mathcal{U}} & G_{o\mathcal{Z}} \\ G_{\mathcal{U}\mathcal{Q}} & G_{\mathcal{U}o} & G_{\mathcal{U}\mathcal{U}} & G_{\mathcal{U}\mathcal{Z}} \\ G_{\mathcal{Z}\mathcal{Q}} & G_{\mathcal{Z}o} & G_{\mathcal{Z}\mathcal{U}} & G_{\mathcal{Z}\mathcal{Z}} \end{bmatrix} \begin{bmatrix} w_{\mathcal{Q}} \\ w_o \\ w_{\mathcal{U}} \\ w_{\mathcal{Z}} \end{bmatrix} + R(q)r + \begin{bmatrix} H_{\mathcal{Q}\mathcal{Q}} & H_{\mathcal{Q}o} & H_{\mathcal{Q}\mathcal{U}} & H_{\mathcal{Q}\mathcal{Z}} \\ H_{o\mathcal{Q}} & H_{oo} & H_{o\mathcal{U}} & H_{o\mathcal{Z}} \\ H_{\mathcal{U}\mathcal{Q}} & H_{\mathcal{U}o} & H_{\mathcal{U}\mathcal{U}} & H_{\mathcal{U}\mathcal{Z}} \\ H_{\mathcal{Z}\mathcal{Q}} & H_{\mathcal{Z}o} & H_{\mathcal{Z}\mathcal{U}} & H_{\mathcal{Z}\mathcal{Z}} \end{bmatrix} \begin{bmatrix} e_{\mathcal{Q}} \\ e_o \\ e_{\mathcal{U}} \\ e_{\mathcal{Z}} \end{bmatrix} \qquad (6)$$

where we make the notation agreement that the matrix $H$ is not necessarily monic, and the scaling of the white noise process $e$ is such that $cov(e) = I$. Without loss of generality, we can assume $r = 0$ for the sake of brevity.

Our objective is to end up with an identification problem in which we identify the dynamics from inputs $(w_{\mathcal{Q}}, w_{\mathcal{U}})$ to outputs $(w_{\mathcal{Q}}, w_o)$, while our target module $G_{ji}(q)$ is present as one of the scalar transfers (modules) in this identified (MIMO) model. This can be realized by the following steps:

  1. Firstly, we write the system's equations for the measured variables as

$$\underbrace{\begin{bmatrix} w_{\mathcal{Q}} \\ w_o \\ \hline w_{\mathcal{U}} \end{bmatrix}}_{w_m} = \underbrace{\left[\begin{array}{c|c} \bar{G} & 0 \\ \hline \bar{G}_{\mathcal{U}\mathcal{D}} & \bar{G}_{\mathcal{U}o} \end{array}\right]}_{\bar{G}_m} \begin{bmatrix} w_{\mathcal{Q}} \\ w_{\mathcal{U}} \\ \hline w_o \end{bmatrix} + \underbrace{\left[\begin{array}{c|c} \bar{H} & 0 \\ \hline 0 & \bar{H}_{\mathcal{U}\mathcal{U}} \end{array}\right]}_{\bar{H}_m} \underbrace{\begin{bmatrix} \xi_{\mathcal{Q}} \\ \xi_o \\ \hline \xi_{\mathcal{U}} \end{bmatrix}}_{\xi_m} \qquad (7)$$

with $\xi_m$ a white noise process, while $\bar{H}$ is monic, stable and stably invertible, and the components in $\bar{G}$ are zero if they concern a mapping between identical signals. This step is made by removing the non-measured signals $w_{\mathcal{Z}}$ from the network, while maintaining the second-order properties of the remaining signals. This step is referred to as immersion of the nodes in $w_{\mathcal{Z}}$ [23].

  2. As an immediate result of the previous step we can write an expression for the output variables $w_{\mathcal{Y}}$, by considering the upper part of equation (7), as

$$\underbrace{\begin{bmatrix} w_{\mathcal{Q}} \\ w_o \end{bmatrix}}_{w_{\mathcal{Y}}} = \underbrace{\begin{bmatrix} \bar{G}_{\mathcal{Q}\mathcal{Q}} & \bar{G}_{\mathcal{Q}\mathcal{U}} \\ \bar{G}_{o\mathcal{Q}} & \bar{G}_{o\mathcal{U}} \end{bmatrix}}_{\bar{G}} \underbrace{\begin{bmatrix} w_{\mathcal{Q}} \\ w_{\mathcal{U}} \end{bmatrix}}_{w_{\mathcal{D}}} + \underbrace{\begin{bmatrix} \bar{H}_{\mathcal{Q}\mathcal{Q}} & \bar{H}_{\mathcal{Q}o} \\ \bar{H}_{o\mathcal{Q}} & \bar{H}_{oo} \end{bmatrix}}_{\bar{H}} \underbrace{\begin{bmatrix} \xi_{\mathcal{Q}} \\ \xi_o \end{bmatrix}}_{\xi_{\mathcal{Y}}} \qquad (8)$$

with $cov(\xi_{\mathcal{Y}}) := \bar{\Lambda}$.

  3. Thirdly, we will provide conditions to guarantee that $\bar{G}_{ji}(q) = G_{ji}(q)$, i.e. the target module appearing in equation (8) is the target module of the original network (invariance of the target module). This will require conditions on the selection of node signals in $w_{\mathcal{Q}}$, $w_o$, $w_{\mathcal{U}}$.

  4. Finally, it will be shown that, on the basis of (8), under fairly general conditions, the transfer functions $\bar{G}(q)$ and $\bar{H}(q)$ can be estimated consistently and with maximum likelihood properties. A pictorial representation of the identification setup with the classification of the different sets of signals in (8) is provided in Figure 4. The figure also contains the sets $\mathcal{A}$, $\mathcal{B}$, $\mathcal{F}_n$, which will be introduced in the sequel.



Figure 4: The identification setup and classification of the different sets of signals in the input and output of the identification problem.

The combination of steps 3 and 4 will lead to a consistent and maximum likelihood estimate of the target module $G_{ji}(q)$. Note that this results in an identification setup in which signals can simultaneously act as input and as output (the set $w_{\mathcal{Q}}$). Because $\bar{G}_{\mathcal{Q}\mathcal{Q}}$ is restricted to be hollow (zero diagonal), this does not lead to trivial transfers between signals that are the same. A related situation appears when identifying a full network, while using all node signals as both inputs and outputs, as in [8].

Steps 1)-4) above will require conditions on the selection of node signals, based on the known topology of the network and an allowed correlation structure of the disturbances in the network. Specifying these conditions on the selection of the sets $w_{\mathcal{Q}}$, $w_o$, $w_{\mathcal{U}}$ will be an important objective of the next section.

VI Main Results - Derivations

VI-A System representation after immersion (Steps 1-2)

First we will show that a network in which the signals in $w_{\mathcal{Z}}$ are removed (immersed) can indeed be represented by (7).

Proposition 1

Consider a dynamic network given by (6), where the set of all nodes $w_{\mathcal{L}}$ is decomposed in the disjoint sets $w_{\mathcal{Q}}$, $w_o$, $w_{\mathcal{U}}$ and $w_{\mathcal{Z}}$ as defined in Section IV. Then, for the situation $r = 0$,

  1. there exists a representation (7) of the measured node signals $w_m$, with $\bar{H}_m$ monic, stable and stably invertible, and $\xi_m$ a white noise process, and

  2. for this representation there are no confounding variables for the estimation problem $w_{\mathcal{U}} \rightarrow w_{\mathcal{Y}}$.

Proof: See appendix.


Figure 5: (a): Original network with 4 nodes $\{w_i\}_{i=1,\ldots,4}$ and unmeasured white noise sources $\{e_i\}_{i=1,\ldots,4}$; (b): Transformed network with the confounding variable for $w_4\rightarrow w_1$ removed; (c): Transformed network with also the confounding variable for $w_3\rightarrow w_1$ removed.

The consequence of Proposition 1 is that the output node signals in $w_{\mathcal{Y}}$ can be explicitly written in the form of (8), in terms of the input node signals $w_{\mathcal{D}}$ and disturbances, without relying on (unmeasured) node signals in $w_{\mathcal{Z}}$. The particular structure of the network representation (7) implies that there are no confounding variables for the estimation problem $w_{\mathcal{U}}\rightarrow w_{\mathcal{Y}}$. This will be an important phenomenon for our identification setup. Based on (8), a typical prediction error identification method can provide estimates of $\bar{G}$ and $\bar{H}$ from the measured signals $w_{\mathcal{Y}}$ and $w_{\mathcal{D}}$ with $\mathcal{D} = \mathcal{Q}\cup\mathcal{U}$. In this estimation problem, confounding variables for the estimation problem $w_{\mathcal{Q}}\rightarrow w_{\mathcal{Y}}$ are treated by correlated noise modelling in $\bar{H}$, while confounding variables for the estimation problem $w_{\mathcal{U}}\rightarrow w_{\mathcal{Y}}$ are not present, due to the structure of (7).

In the following example, the step towards (7) will be illustrated, as well as its effect on the dynamics in $\bar{G}$.

Example 2

Consider the 4-node network depicted in Figure 5(a), where all nodes are considered to be measured, and where we select $w_o = w_1$, $\mathcal{U} = \{2,3,4\}$, and $\mathcal{Q} = \emptyset$. In this network, there is a confounding variable $e_4$ for the problem $w_4\rightarrow w_1$ (i.e. $w_{\mathcal{U}}\rightarrow w_{\mathcal{Y}}$), meaning that for the situation $\xi = e$ the noise model $\bar{H}_m$ in (7) will not be block diagonal. Therefore the network does not comply with the representation in (7) and (8). We can remove the confounding variable by shifting the effect of $H_{14}$ into a transformed version of $G_{14}$, which now becomes $G_{14} + H_{44}^{-1}H_{14}$, as depicted in Figure 5(b). However, since this shift also affects the transfer from $e_3$ to $w_1$, the change of $G_{14}$ needs to be mitigated by a new term $H_{13}$, in order to keep the network signals invariant. In the resulting network the confounding variable for $w_4\rightarrow w_1$ is removed, but a new confounding variable ($e_3$) for $w_3\rightarrow w_1$ has been created. In the second step, shown in Figure 5(c), the term $H_{13}$ is removed by incorporating its effect in the module $G_{13}$, which now becomes $G_{13} + H_{33}^{-1}H_{13}$. In the resulting network there are no confounding variables for $w_{\mathcal{U}}\rightarrow w_1$. This representation complies with the structure in (7). Note that in the transformed network, the dynamics of $G_{12}$ is left invariant, while the dynamics of $G_{14}$ and $G_{13}$ have been changed. The intermediately occurring confounding variables relate to a sequence of linked confounders, as discussed in [26]. $\Box$
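To make the first shifting step concrete, consider the following hedged sketch, in which we assume (hypothetically, since the exact equations behind Figure 5 are not reproduced here) the node equations $w_1 = G_{12}w_2 + G_{13}w_3 + G_{14}w_4 + e_1 + H_{14}e_4$ and $w_4 = H_{44}e_4 + H_{43}e_3$. Substituting $H_{14}e_4 = H_{44}^{-1}H_{14}\left(w_4 - H_{43}e_3\right)$ (scalar transfer functions commute) gives

$$w_1 = G_{12}w_2 + G_{13}w_3 + \underbrace{\left(G_{14} + H_{44}^{-1}H_{14}\right)}_{\text{new } \bar{G}_{14}} w_4 + e_1 \underbrace{-\ H_{44}^{-1}H_{14}H_{43}}_{\text{new } \bar{H}_{13}}\, e_3,$$

so removing the confounder $e_4$ indeed creates a new term $\bar{H}_{13}e_3$, i.e. the new confounding variable for $w_3\rightarrow w_1$ of Figure 5(b); the second step absorbs that term into $\bar{G}_{13}$ in the same way.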

In the next subsection it will be investigated under which conditions our target module will remain invariant under the above transformation to a representation (7) without confounding variables.

VI-B Module invariance result (Step 3)

The transformation of a network into the form (7), leading to the resulting identification setup of (8), involves two basic steps, each of which can lead to a change of dynamic modules in $\bar{G}$. These two steps are

  (a) Removing the non-measured signals in $w_{\mathcal{Z}}$ (immersion), and

  (b) Transforming the system's equations to a form in which there are no confounding variables for $w_{\mathcal{U}}\rightarrow w_{\mathcal{Y}}$.

Module invariance in step (a) is covered by the following Condition:

Condition 1 (parallel path and loop condition [23])

Let $G_{ji}$ be the target network module to be identified. In the original network (6):

  • Every path from $w_i$ to $w_j$, excluding the path through $G_{ji}$, passes through a node $w_k$, $k\in\mathcal{D}$, and

  • Every loop through $w_j$ passes through a node $w_k$, $k\in\mathcal{D}$. $\Box$

This condition has been introduced in [23] for a MISO identification setup, to guarantee that when immersing (removing) nonmeasured node signals from the network, the target module will remain invariant. As an alternative, more generalized notions of network abstractions have been developed for this purpose in [30]. Condition 1 will be used to guarantee module invariance under step (a).

Step (b) above is a new step, and requires studying module invariance in the step that transforms a network from an original format where all nodes are measured into a structure that complies with (7), i.e. with absence of confounding variables for $w_{\mathcal{U}}\rightarrow w_{\mathcal{Y}}$.

We are going to tackle this problem by decomposing the set $\mathcal{U}$ into two disjoint sets $\mathcal{U} = \mathcal{A}\cup\mathcal{B}$, aiming at the situation that in the transformed network the modules $G_{\mathcal{Y}\mathcal{A}}$ stay invariant, while for the modules $G_{\mathcal{Y}\mathcal{B}}$ we accept that the transformation can lead to module changes. We construct $\mathcal{A}$ by choosing signals $w_k\in w_{\mathcal{U}}$ such that in the original network there are no confounding variables for the estimation problem $w_{\mathcal{A}}\rightarrow w_{\mathcal{Y}}$. For the selection of $\mathcal{B}$, we do allow confounding variables for the estimation problem $w_{\mathcal{B}}\rightarrow w_{\mathcal{Y}}$. By requiring a particular "disconnection" between the sets $\mathcal{A}$ and $\mathcal{B}$, we can then still guarantee that the modules $G_{\mathcal{Y}\mathcal{A}}$ stay invariant.

The following condition addresses the major requirement of step (b).

Condition 2

$\mathcal{U}$ is decomposed into two disjoint sets, $\mathcal{U} = \mathcal{A}\cup\mathcal{B}$ (see Figure 4), such that in the original network (6) there are no confounding variables for the estimation problems $w_{\mathcal{A}}\rightarrow w_{\mathcal{Y}}$ and $w_{\mathcal{A}}\rightarrow w_{\mathcal{B}}$. $\Box$

Condition 2 is not a restriction on $\mathcal{U}$, as such a decomposition can always be made, e.g. by taking $\mathcal{A} = \emptyset$ and $\mathcal{B} = \mathcal{U}$. The flexibility in choosing this decomposition will be instrumental in the sequel of this paper.

Example 3 (Example 2 continued)

In the example network depicted in Figure 5, we observe that in the original network there is a confounding variable for $w_4\rightarrow w_1$. However, in the step towards creating a network without confounding variables for $w_{\mathcal{U}}\rightarrow w_{\mathcal{Y}}$, an intermediate step occurs where there is also a confounding variable for $w_3\rightarrow w_1$, as depicted in Figure 5(b). For $\mathcal{U} = \{2,3,4\}$ the choice $\mathcal{A} = \{2,3\}$, $\mathcal{B} = \{4\}$ is not valid, since there exists a confounding variable ($e_3$) for $w_3\rightarrow w_4$, which violates the second condition that there should be no confounding variables for $w_{\mathcal{A}}\rightarrow w_{\mathcal{B}}$. Therefore the appropriate choice satisfying Condition 2 is $\mathcal{A} = \{2\}$ and $\mathcal{B} = \{3,4\}$. Note that this matches the situation that in the transformed network (Figure 5(c)) the module $G_{\mathcal{Y}\mathcal{A}}$ remains invariant, and the modules $G_{\mathcal{Y}\mathcal{B}}$ get changed. $\Box$

We can now formulate the module invariance result.

Theorem 1 (Module invariance result)

Let $G_{ji}$ be the target network module. In the transformed system's equation (8), it holds that $\bar{G}_{ji} = G_{ji}^{0}$ under the following conditions:

  1. The parallel path and loop Condition 1 is satisfied, and

  2. The following three conditions are satisfied:

    • a. $\mathcal{U}$ is decomposed in $\mathcal{A}$ and $\mathcal{B}$, satisfying Condition 2, and

    • b. $i\in\{\mathcal{A}\cup\mathcal{Q}\}$, and

    • c. Every path from $\{w_i, w_j\}$ to $w_{\mathcal{B}}$ passes through a measured node in $w_{\mathcal{L}\backslash\mathcal{Z}}$.

Proof: See appendix.

A more detailed illustration of the conditions in the theorem will be deferred to three different algorithms for selecting the node signals, to be presented in Sections VII-IX. We will first develop the identification results for the general case.

VI-C Identification results (Step 4)

If the conditions of Theorem 1 are satisfied, then the target module $\bar{G}_{ji} = G_{ji}^{0}$ can be identified on the basis of the system's equation (8). For this equation we can set up a predictor model with inputs $w_{\mathcal{D}}$ and outputs $w_{\mathcal{Y}}$, for the estimation of $\bar{G}$ and $\bar{H}$. This will be based on a parameterized model set determined by

$$\mathcal{M} := \left\{(\bar{G}(\theta), \bar{H}(\theta), \bar{\Lambda}(\theta)),\ \theta\in\Theta\right\},$$

while the actual data generating system is represented by $\mathcal{S} = (\bar{G}(\theta^0), \bar{H}(\theta^0), \bar{\Lambda}(\theta^0))$. The corresponding identification problem is defined by considering the one-step-ahead prediction of $w_{\mathcal{Y}}$ in the parametrized model, according to $\hat{w}_{\mathcal{Y}}(t|t-1;\theta) := \mathbb{E}\{w_{\mathcal{Y}}(t)\,|\,w_{\mathcal{Y}}^{t-1}, w_{\mathcal{D}}^{t};\theta\}$, where $w_{\mathcal{D}}^{t}$ denotes the past of $w_{\mathcal{D}}$, i.e. $\{w_{\mathcal{D}}(k), k\leq t\}$. The resulting prediction error becomes $\varepsilon(t,\theta) := w_{\mathcal{Y}}(t) - \hat{w}_{\mathcal{Y}}(t|t-1;\theta)$, leading to

$$\varepsilon(t,\theta) = \bar{H}(q,\theta)^{-1}\left[w_{\mathcal{Y}}(t) - \bar{G}(q,\theta)\, w_{\mathcal{D}}(t)\right], \qquad (9)$$

and the weighted least squares identification criterion

$$\hat{\theta}_N = \arg\min_{\theta} \frac{1}{N}\sum_{t=0}^{N-1} \varepsilon^{T}(t,\theta)\, W\, \varepsilon(t,\theta), \qquad (10)$$

with $W$ any positive definite weighting matrix. This parameter estimate then leads to an estimated subnetwork $\bar{G}_{\mathcal{Y}\mathcal{D}}(q,\hat{\theta}_N)$ and noise model $\bar{H}(q,\hat{\theta}_N)$, for which consistency and minimum variance results will be formulated next.
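A minimal numerical sketch of (9)-(10) follows, under the (hypothetical) assumptions that the modules and the monic inverse noise filter $\bar{H}^{-1}(q,\theta)$ are parametrized as FIR filters; a full prediction error method would wrap this criterion in a numerical optimizer over $\theta$.

```python
import numpy as np

def prediction_errors(wY, wD, Gbar, Hbar_inv):
    """eps(t) = Hbar^{-1}(q)[ wY(t) - Gbar(q) wD(t) ]  (cf. eq. (9)), for FIR
    parametrizations: Gbar[d-1] acts on wD(t-d) (d >= 1, strictly delayed
    modules) and Hbar_inv[d] acts on the residual at lag d, with the lag-0
    term of the monic inverse noise filter implicitly the identity."""
    N = wY.shape[0]
    resid = wY.copy()
    for d, Gd in enumerate(Gbar, start=1):          # wY(t) - sum_d Gd wD(t-d)
        resid[d:] -= wD[:N - d] @ Gd.T
    eps = resid.copy()
    for d, Hd in enumerate(Hbar_inv[1:], start=1):  # apply monic Hbar^{-1}(q)
        eps[d:] += resid[:N - d] @ Hd.T
    return eps

def criterion(eps, W):
    """Weighted least-squares criterion (10): (1/N) sum_t eps^T W eps."""
    return np.mean(np.einsum('ti,ij,tj->t', eps, W, eps))
```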

Theorem 2 (Consistency)

Consider a dynamic network represented by (7), and a related (MIMO) network identification setup with predictor inputs $w_{\mathcal{D}}$ and predicted outputs $w_{\mathcal{Y}}$, according to (8). Let $\mathcal{F}_n\subseteq\mathcal{U}$ be the set of node signals $k$ for which $\xi_k$ is statically uncorrelated with $\xi_{\mathcal{Y}}$ (i.e. $\mathbb{E}[\xi_k(t)\xi_{\mathcal{Y}}(t)] = 0$), and let $\mathcal{F} := \mathcal{U}\backslash\mathcal{F}_n$. Then a direct prediction error identification method according to (9)-(10), applied to a parametrized model set $\mathcal{M}$, will provide consistent estimates of $\bar{G}$ and $\bar{H}$ if:

  a. $\mathcal{M}$ is chosen to satisfy $\mathcal{S}\in\mathcal{M}$;

  b. $\Phi_{\kappa}(\omega) > 0$ for a sufficiently high number of frequencies, where $\kappa(t) := \begin{bmatrix} w_{\mathcal{D}}^{\top}(t) & \xi_{\mathcal{Q}}^{\top}(t) & w_o(t) \end{bmatrix}^{\top}$ (data-informativity condition);

  c. The following paths/loops should have at least a delay:

    • All paths/loops from $w_{\mathcal{Y}\cup\mathcal{F}}$ to $w_{\mathcal{Y}}$ in the network (8) and in its parametrized model; and

    • For every $w_k\in\mathcal{F}_n$, all paths from $w_{\mathcal{Y}\cup\mathcal{F}}$ to $w_k$ in the network (8), or all paths from $w_k$ to $w_{\mathcal{Y}}$ in the parametrized model

    (delay in path/loop condition).

Proof: See appendix.

The consistency theorem has a structure that corresponds to the classical result of the direct prediction error identification method applied to a closed-loop experimental setup [21]: a system-in-the-model-set condition (a), an informativity condition on the measured data (b), and a loop delay condition (c). Note however that conditions (b) and (c) are generalized versions of those for the typical closed-loop case [21, 15], and are dedicated to the considered network setup.

It is important to note that Theorem 2 is formulated in terms of conditions on the network in (7), which we refer to as the transformed network. However, it is essential to formulate the conditions in terms of properties of signals in the original network, represented by (6).

Proposition 2

If, in the original network, $\mathcal{U}$ is decomposed in two disjoint sets $\mathcal{A}$ and $\mathcal{B}$ satisfying Condition 2, then Condition c of Theorem 2 can be reformulated as:

  c. The following paths/loops should have at least a delay:

    • All paths/loops from $w_{\mathcal{Y}\cup\mathcal{B}}$ to $w_{\mathcal{Y}}$ in the original network (6) and in the parametrized model; and

    • For every $w_k\in\mathcal{A}$, all paths from $w_{\mathcal{Y}\cup\mathcal{B}}$ to $w_k$ in the network (6), or all paths from $w_k$ to $w_{\mathcal{Y}}$ in the parametrized model.

Proof: See appendix.

Condition (b) of Theorem 2 requires that there is enough excitation present in the node signals, which actually reflects a type of identifiability property [13]. Note that this excitation condition may require that external excitation signals are present at some locations, see also [15, 31, 32, 33, 14, 34], and [35], where it is shown that $\dim(r)\geq|\mathcal{Q}|$, with $|\mathcal{Q}|$ the cardinality of $\mathcal{Q}$. Since we are using a direct method for identification, the excitation signals $r$ are not directly used in the predictor model, although they serve the purpose of providing excitation in the network. A first result of a generalized method where, besides the node signals $w$, also the signals $r$ are included in the predictor inputs, is presented in [36].

Since in the result of Theorem 2 we arrive at white innovation signals, the result can be extended to formulate Maximum Likelihood properties of the estimate.

Theorem 3

Consider the situation of Theorem 2, and let the conditions for consistency be satisfied. Let $\xi_{\mathcal{Y}}$ be normally distributed, and let $\bar{\Lambda}(\theta)$ be parametrized independently of $\bar{G}(\theta)$ and $\bar{H}(\theta)$. Then, under zero initial conditions, the Maximum Likelihood estimate of $\theta^0$ is

$$\hat{\theta}_N^{ML} = \arg\min_{\theta} \det\left(\frac{1}{N}\sum_{t=1}^{N} \varepsilon(t,\theta)\,\varepsilon^{T}(t,\theta)\right) \qquad (11)$$

$$\Lambda(\hat{\theta}_N^{ML}) = \frac{1}{N}\sum_{t=1}^{N} \varepsilon(t,\hat{\theta}_N^{ML})\,\varepsilon^{T}(t,\hat{\theta}_N^{ML}). \qquad (12)$$

Proof: Can be shown by following reasoning similar to that in Theorem 1 of [8]. $\Box$
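In terms of the prediction-error sketch given after (10), replacing the weighted quadratic criterion by the determinant criterion (11) is a small change; a hedged sketch:

```python
import numpy as np

def ml_criterion(eps):
    """Determinant criterion (11) for a residual matrix eps of shape (N, p):
    returns det( (1/N) sum_t eps(t) eps(t)^T ) together with the estimate (12)
    of Lambda. Minimizing the determinant over theta (with any numerical
    optimizer) gives the ML estimate under Gaussian innovations."""
    Lam = (eps.T @ eps) / eps.shape[0]
    return np.linalg.det(Lam), Lam
```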

So far, we have analysed the situation for given sets of node signals $w_{\mathcal{Q}}$, $w_o$, $w_{\mathcal{A}}$, $w_{\mathcal{B}}$ and $w_{\mathcal{Z}}$. The presented results are very general and allow for different algorithms to select the appropriate signals and specify the particular signal sets that will guarantee target module invariance and consistent, minimum variance module estimates with the presented local direct method. In the next sections we will focus on formulating guidelines for the selection of these sets, such that the target module invariance property holds, as formulated in Theorem 1. For formulating these conditions, we will consider three different situations with respect to the availability of measured node signals.

  (a) In the Full input case, we assume that all in-neighbors of the predicted output signals are measured and used as predictor inputs;

  (b) In the Minimum input case, we include the smallest possible number of node signals to be measured for arriving at our objective;

  (c) In the User selection case, we formulate our results for an a priori given set of measured node signals.

VII Algorithm for signal selection: full input case

The first algorithm to be presented is based on the strategy that for any node signal that is selected as an output, we have access to all of its $w$-in-neighbors, which are to be included as predictor inputs. This strategy will lead to an identification setup with a maximum use of measured node signals that contain information relevant for modeling our target module $G_{ji}$. The following strategy will be followed:

  • We start by selecting $i\in\mathcal{D}$ and $j\in\mathcal{Y}$;

  • Then we extend $\mathcal{D}$ in such a way that all $w$-in-neighbors of $w_{\mathcal{Y}}$ are included in $w_{\mathcal{D}}$;

  • All node signals in $w_{\mathcal{D}}$ that have noise terms $v_k$, $k\in\mathcal{D}$, that are correlated with any $v_{\ell}$, $\ell\in\mathcal{Y}$ (direct confounding variables for $w_{\mathcal{D}}\rightarrow w_{\mathcal{Y}}$), are included in $\mathcal{Y}$ too. They become elements of $\mathcal{Q}$;

  • With $\mathcal{A} := \mathcal{D}\backslash\mathcal{Q}$ it follows that by construction there are no direct confounding variables for the estimation problem $w_{\mathcal{A}}\rightarrow w_{\mathcal{Y}}$;

  • Then we choose $w_{\mathcal{B}}$ as a subset of the nodes that are neither in $w_{\mathcal{Y}}$ nor in $w_{\mathcal{A}}$. This set needs to be introduced to block the indirect confounding variables for the estimation problem $w_{\mathcal{A}}\rightarrow w_{\mathcal{Y}}$, and will be chosen to satisfy Conditions 2a and 2c of Theorem 1;

  • Every node signal $w_k$, $k\in\mathcal{A}$, for which there are only indirect confounding variables that cannot be blocked by a node in $w_{\mathcal{B}}$, is

    • moved to $\mathcal{B}$ if Conditions 2a and 2c of Theorem 1 are satisfied and $k\neq i$; or (else)

    • included in $\mathcal{Y}$ and moved to $\mathcal{Q}$;

  • Finally, we define the identification setup as the estimation problem $w_{\mathcal{D}}\rightarrow w_{\mathcal{Y}}$, with $\mathcal{D} = \mathcal{Q}\cup\mathcal{A}\cup\mathcal{B}$ and $\mathcal{Y} = \mathcal{Q}\cup\{o\}$.

Note that because all $w$-in-neighbors of $w_{\mathcal{Y}}$ are included in $w_{\mathcal{D}}$, we automatically satisfy the parallel path and loop Condition 1. In order for the selection of node signals $w_{\mathcal{B}}$ to satisfy the conditions of Theorem 1, we specify the following Property 1.

Property 1

Let the node signals $w_{\mathcal{B}}$ be chosen to satisfy the following properties:

  1. If, in the original network, there are no confounding variables for the estimation problem $w_{\mathcal{A}} \rightarrow w_{\mathcal{Y}}$, then $\mathcal{B}$ is void, implying that $w_{\mathcal{B}}$ is not present;

  2. If, in the original network, there are confounding variables for the estimation problem $w_{\mathcal{A}} \rightarrow w_{\mathcal{Y}}$, then all of the following conditions need to be satisfied:

    a. For any confounding variable for the estimation problem $w_{\mathcal{A}} \rightarrow w_{\mathcal{Y}}$, the unmeasured paths from the confounding variable to node signals $w_{\mathcal{A}}$ pass through a node in $w_{\mathcal{B}}$;

    b. There are no confounding variables for the estimation problem $w_{\mathcal{A}} \rightarrow w_{\mathcal{B}}$;

    c. Every path from $\{w_i, w_j\}$ to $w_{\mathcal{B}}$ passes through a measured node in $w_{\mathcal{L}\backslash\mathcal{Z}}$. $\Box$

Property 2a) ensures that, after including $w_{\mathcal{B}}$ in the set of measured signals, there are no indirect confounding variables for the estimation problem $w_{\mathcal{A}} \rightarrow w_{\mathcal{Y}}$, while Property 2b) guarantees that there are no confounding variables for the estimation problem $w_{\mathcal{A}} \rightarrow w_{\mathcal{B}}$. Together these satisfy Condition 2a) of Theorem 1. Property 2c) guarantees that Condition 2c) of Theorem 1 is satisfied. Finally, as per the algorithm, $w_i$ can end up either in $w_{\mathcal{A}}$ or in $w_{\mathcal{Q}}$. Therefore, at the end of the algorithm we obtain sets of signals that satisfy the conditions of Theorem 1 for target module invariance.
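Property 2a) can be made operational with a simple path-blocking test. The following is a minimal sketch under our own conventions (nodes as integers, an edge set of pairs (from, to)); the helper name `blocks` and its interface are illustrative and not taken from the paper.

```python
def blocks(source, targets, B, measured, edges):
    """Check Property 2a for one confounding variable: True iff every path
    from `source` (the node where the confounder enters the network) to a
    node in `targets` passes through B or through another measured node."""
    if source in B or source in measured:
        return True
    stack, seen = [source], {source}
    while stack:
        node = stack.pop()
        for (k, l) in edges:
            if k != node or l in seen:
                continue
            if l in targets:
                return False       # an unblocked, unmeasured path remains
            if l not in B and l not in measured:
                seen.add(l)        # keep exploring through unmeasured nodes
                stack.append(l)
    return True
```

In Example 4 below, both $w_7$ and $w_8$ block the unmeasured path from $e_8$ to $w_5$ in this sense; Properties 2b) and 2c) then rule out $w_7$.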

Figure 6: Example network with $v_1$ dynamically correlated with $v_2$ and $v_8$ (red), $v_4$ dynamically correlated with $v_6$ (green), and $v_5$ dynamically correlated with $v_7$ (blue).
Example 4

Consider the network in Figure 6, where $G_{12}$ is the target module that we want to identify. We select the signals according to the algorithm presented in this section. First we include the input of the target module, $w_2$, in $w_{\mathcal{D}}$ and the output of the target module, $w_1$, in $w_{\mathcal{Y}}$. Next we include all $w$-in-neighbors of $w_{\mathcal{Y}}$ (i.e., $w_3$ and $w_4$) in $w_{\mathcal{D}}$. All node signals in $w_{\mathcal{D}}$ that have noise terms $v_k$, $k \in \mathcal{D}$, correlated with any $v_\ell$, $\ell \in \mathcal{Y}$, need to be included in $\mathcal{Y}$ too. This concerns $w_2$, since $v_1$ is correlated with $v_2$. Now $w_{\mathcal{Y}} = \{w_1, w_2\}$ has changed, and we need to include the $w$-in-neighbor of $w_2$, which is $w_5$, in $w_{\mathcal{D}}$, leading to $w_{\mathcal{D}} = \{w_2, w_3, w_4, w_5\}$. After a check we can conclude that all node signals in $w_{\mathcal{D}}$ whose noise terms are correlated with any $v_\ell$, $\ell \in \mathcal{Y}$, are now included in $\mathcal{Y}$. The result becomes

$$\mathcal{Y} = \{1, 2\}; \qquad \mathcal{D} = \{2, 3, 4, 5\} \qquad (13)$$
$$\mathcal{Q} = \mathcal{Y} \cap \mathcal{D} = \{2\}; \qquad \mathcal{A} = \mathcal{D} \backslash \mathcal{Q} = \{3, 4, 5\}. \qquad (14)$$

Since $v_8$ is dynamically correlated with $v_1$, in the resulting situation we have a confounding variable for the estimation problem $w_5 \rightarrow w_1$ (i.e., $w_{\mathcal{A}} \rightarrow w_{\mathcal{Y}}$). As per Condition 2a of Property 1, the path from the confounding variable $e_8$ to $w_5$ should be blocked by a node signal in $w_{\mathcal{B}}$, which can be either $w_7$ or $w_8$. $w_7$ cannot be chosen in $w_{\mathcal{B}}$, since this would create a confounding variable for $w_{\mathcal{A}} \rightarrow w_{\mathcal{B}}$ (i.e., $w_5 \rightarrow w_7$). Moreover, $w_7 \in w_{\mathcal{B}}$ would also create an unmeasured path $w_i \rightarrow w_7$ with $w_i = w_2$, thereby violating Condition 2c of Property 1. When $w_8$ is chosen in $w_{\mathcal{B}}$, the conditions of Property 1 are satisfied, and hence we choose $\mathcal{B} = \{8\}$. The resulting estimation problem is $(w_2, w_3, w_4, w_5, w_8) \rightarrow (w_1, w_2)$, which according to Theorem 2 provides a consistent and maximum likelihood estimate of $G_{12}$.
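The set manipulations in this example can be traced with a few lines of code. The following sketch encodes the topology and noise correlation structure of Figure 6 as we read them off from Examples 4–6 (the edge list is a partial reconstruction; edges not needed for this computation, such as the path from $w_2$ to $w_7$, are omitted), and runs the first phase of the full input algorithm: extending $\mathcal{D}$ with in-neighbors of $\mathcal{Y}$ and moving directly confounded inputs to $\mathcal{Q}$. Helper names are our own.

```python
# Figure 6, partially reconstructed: (k, l) denotes a module w_k -> w_l.
edges = {(2, 1), (3, 1), (4, 1), (2, 3), (3, 4), (5, 2), (6, 4), (7, 5), (8, 7)}
corr = {frozenset(p) for p in [(1, 2), (1, 8), (4, 6), (5, 7)]}  # correlated v's

def in_neighbors(node):
    return {k for (k, l) in edges if l == node}

def full_input_phase1(i, j):
    """First phase of the Section VII algorithm: grow D and Y until only
    indirect confounding variables remain to be handled."""
    D, Y = {i}, {j}
    while True:
        for y in set(Y):
            D |= in_neighbors(y)               # all w-in-neighbors of Y into D
        moved = {k for k in D if any(frozenset({k, y}) in corr for y in Y)}
        if moved <= Y:                         # no new direct confounders
            Q = D & Y
            return Y, D, Q, D - Q              # returns Y, D, Q, A
        Y |= moved                             # directly confounded: to outputs

Y, D, Q, A = full_input_phase1(i=2, j=1)
# Y = {1, 2}, D = {2, 3, 4, 5}, Q = {2}, A = {3, 4, 5}, matching (13)-(14).
```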

VIII Algorithm for signal selection: minimum input case

Rather than measuring all node signals that are $w$-in-neighbors of the output $w_j$ of our target module $G_{ji}$, we now focus on an identification setup that uses a minimum number of measured node signals, according to the following strategy:

  • We start by selecting $i \in \mathcal{D}$ and $j \in \mathcal{Y}$;

  • Then we extend $\mathcal{D}$ with a minimum number of node signals such that the parallel path and loop Condition 1 is satisfied;

  • Every node signal $w_k$ in $w_{\mathcal{D}}$ for which there is a direct or indirect confounding variable for the estimation problem $w_k \rightarrow w_{\mathcal{Y}}$ is included in $\mathcal{Y}$ and $\mathcal{Q}$;

  • With $\mathcal{A} := \mathcal{D} \backslash \mathcal{Q}$ and $\mathcal{B} = \emptyset$ it follows that, by construction, there are no confounding variables for the estimation problem $w_{\mathcal{A}} \rightarrow w_{\mathcal{Y}}$;

  • Finally, we define the identification setup as the estimation problem $w_{\mathcal{D}} \rightarrow w_{\mathcal{Y}}$, with $\mathcal{D} = \mathcal{Q} \cup \mathcal{A}$.

As we can observe, this algorithm does not require the selection of a set $\mathcal{B}$. This is attributed to the way we handle the indirect confounding variables for the estimation problem $w_{\mathcal{A}} \rightarrow w_{\mathcal{Y}}$. Instead of tackling these confounding variables by adding blocking node signals $w_{\mathcal{B}}$ as predictor inputs (as in the full input case), we deal with them by moving the concerned $w_k$, $k \in \mathcal{A}$, to $w_{\mathcal{Q}}$, and thus to the set of predicted outputs. We choose this approach in order to minimize the required number of measured node signals. In this way, by construction, there are no direct or indirect confounding variables for the estimation problem $w_{\mathcal{A}} \rightarrow w_{\mathcal{Y}}$. From this result, we can guarantee that the conditions of Theorem 1 are satisfied, since $\mathcal{B} = \emptyset$. Thus, at the end of the algorithm we obtain a set of signals that provides target module invariance.

Example 5

Consider the same network as in Example 4, represented by Figure 6. Applying the algorithm of this section, we first include the input of the target module, $w_2$, in $w_{\mathcal{D}}$ and the output of the target module, $w_1$, in $w_{\mathcal{Y}}$. There exist two parallel paths from $w_2$ to $w_1$, namely $w_2 \rightarrow w_3 \rightarrow w_1$ and $w_2 \rightarrow w_3 \rightarrow w_4 \rightarrow w_1$, and no loops through $w_1$. In order to satisfy Condition 1, we can include either $w_3$ in $\mathcal{D}$, such that $\mathcal{D} = \{2, 3\}$, or both $w_3$ and $w_4$ in $\mathcal{D}$, such that $\mathcal{D} = \{2, 3, 4\}$. We choose the former to arrive at the minimum number of node signals. Because of the correlation between $v_2$ and $v_1$, there is a confounding variable for the estimation problem $w_2 \rightarrow w_1$. According to Step 3 of the algorithm, $w_2$ is then moved to $\mathcal{Y}$ and $\mathcal{Q}$, leading to $w_{\mathcal{Y}} = \{w_1, w_2\}$. Because of this change of $\mathcal{Y}$, we have to recheck for the presence of confounding variables; however, this change does not introduce any additional ones. The resulting estimation problem is $(w_2, w_3) \rightarrow (w_1, w_2)$, with $w_{\mathcal{A}} = \{w_3\}$, $w_{\mathcal{B}} = \emptyset$, $w_{\mathcal{Q}} = \{w_2\}$ and $w_{\mathcal{Y}} = (w_1, w_2)$. $\Box$

In comparison with the full input case, the algorithm in this section will typically lead to a larger number of predicted output nodes and a smaller number of predictor inputs. This implies a stronger emphasis on estimating a (multivariate) noise model $\bar{H}$. Given the choice of the direct identification method, and the choice of signals to satisfy the parallel path and loop condition, this algorithm indeed adds the smallest number of additional signals to be measured, as the removal of any of the additional signals leads to conflicts with the required conditions.
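The parallel path and loop Condition 1 can likewise be checked mechanically. The sketch below assumes, in line with its use in Examples 4 and 5, that the condition requires every parallel path from $w_i$ to $w_j$, and every loop through $w_j$, to pass through a node in $w_{\mathcal{D}}$; the function names and the edge set (repeated from the earlier sketch) are again our own illustration.

```python
edges = {(2, 1), (3, 1), (4, 1), (2, 3), (3, 4), (5, 2), (6, 4), (7, 5), (8, 7)}

def path_exists(src, dst, blocked, edge_set):
    """DFS: is there a path src -> ... -> dst avoiding `blocked` nodes?"""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        for (k, l) in edge_set:
            if k != node or l in seen:
                continue
            if l == dst:
                return True
            if l not in blocked:
                seen.add(l)
                stack.append(l)
    return False

def condition1_holds(i, j, D):
    parallel = path_exists(i, j, D - {i}, edges - {(i, j)})  # non-direct i -> j
    loop = path_exists(j, j, D, edges)                       # loop through w_j
    return not (parallel or loop)

# Example 5: condition1_holds(2, 1, {2, 3}) is True, while with D = {2}
# alone the parallel path w2 -> w3 -> w1 makes the check fail.
```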

IX Algorithm for signal selection: user selection case

Next we focus on the situation where we have a prior given set of nodes that we have access to, i.e., a set of nodes that can (possibly) be measured. We refer to these nodes as accessible nodes, while the remaining nodes are called inaccessible. This strategy differs from the full input case, since we do not assume to have access to all in-neighbors of $w_{\mathcal{Y}}$. It leads to an identification setup that uses accessible node signals containing information relevant for modeling our target module $G_{ji}$. We consider the situation that the nodes $w_i$ and $w_j$ are accessible, and that there are accessible nodes that satisfy the parallel path and loop Condition 1. The following strategy is followed:

  1. We start by selecting $i \in \mathcal{D}$ and $j \in \mathcal{Y}$;

  2. Then we extend $\mathcal{D}$ to satisfy the parallel path and loop Condition 1;

  3. We include in $\mathcal{D}$ all accessible $w$-in-neighbors of $\mathcal{Y}$;

  4. We extend $\mathcal{D}$ in such a way that for every non-accessible $w$-in-neighbor $w_k$ of $w_{\mathcal{Y}}$ we include all accessible nodes that have a path to $w_k$ running through non-accessible nodes only;

  5. If there is a direct confounding variable for $w_i \rightarrow w_{\mathcal{Y}}$, or an indirect one with a path to $w_i$ that does not pass through any accessible node, then $i$ is included in $\mathcal{Y}$ and $\mathcal{Q}$;

  6. A node signal $w_k$, $k \in \mathcal{D}$, is included in $\mathcal{A}$ if there are either no confounding variables for $w_k \rightarrow w_{\mathcal{Y}}$, or only indirect confounding variables whose paths to $w_k$ pass through accessible nodes;

  7. Every node signal $w_k$, $k \in \mathcal{D} \backslash \{i\}$, that has a direct confounding variable for $w_k \rightarrow w_{\mathcal{Y}}$, or an indirect confounding variable with a path to $w_k$ that does not pass through any accessible node, is

    • included in $\mathcal{B}$ if Conditions 2a and 2c of Theorem 1 are satisfied upon including it in $w_{\mathcal{B}}$; (else)

    • included in $\mathcal{Y}$ and $\mathcal{Q}$; return to Step 3;

  8. Every node signal $w_k$, $k \in \mathcal{A}$, for which there are only indirect confounding variables as meant in Step 6, is

    • moved to $\mathcal{B}$ if Conditions 2a and 2c of Theorem 1 are satisfied and $k \neq i$; (else)

    • kept in $\mathcal{A}$, while a set of accessible nodes that blocks the paths of the confounding variable is added to $\mathcal{B} \cup \mathcal{A}$, provided Conditions 2a and 2c of Theorem 1 are satisfied; (else)

    • included in $\mathcal{Y}$ and $\mathcal{Q}$;

  9. By construction, there are now no confounding variables for $w_{\mathcal{A}} \rightarrow w_{\mathcal{Y}}$.

In the algorithm above, the prime reasoning is to deal with confounding variables for $w_{\mathcal{A}} \rightarrow w_{\mathcal{Y}}$. Direct confounding variables lead to including the respective node in the outputs $\mathcal{Y}$, or to shifting the respective input node to $\mathcal{B}$. Indirect confounding variables are treated by either shifting the input node to $\mathcal{B}$, or, if its effect can be blocked, by adding an accessible node to the inputs in $\mathcal{B}$, or, if the blocking conditions cannot be satisfied, by including the node in the outputs $\mathcal{Y}$. Note that the algorithm always provides a solution if Condition 1 of Theorem 1 (the parallel path and loop condition) can be satisfied.

Figure 7: Example network of Figure 6 with accessible nodes $w_1$, $w_2$, $w_3$, $w_6$ indicated in yellow.
Example 6

Consider the same network as in Example 4, now represented by Figure 7, where we are given that only the nodes $w_1$, $w_2$, $w_3$ and $w_6$ are accessible. We select the signals according to the algorithm presented in this section. First we include $w_i = w_2$ in $w_{\mathcal{D}}$ and $w_j = w_1$ in $w_{\mathcal{Y}}$. Then we extend $\mathcal{D}$ such that the parallel path and loop Condition 1 is satisfied, which is done by selecting $\mathcal{D} = \{2, 3\}$. According to Step 4, we extend $\mathcal{D}$ with node $w_6$, as it is the nearest accessible in-neighbor of $w_4$, which in turn is an inaccessible in-neighbor of $w_1$. As per Step 5, since $v_1$ and $v_2$ are correlated, $w_2$ is moved to $\mathcal{Y}$ and $\mathcal{Q}$. As per Step 6, there are no confounding variables for the estimation problem $w_3 \rightarrow w_1$, and hence $w_3$ is included in $w_{\mathcal{A}}$. Since $v_4$ and $v_6$ are correlated, there is an indirect confounding variable for the estimation problem $w_6 \rightarrow w_1$, whose path, however, does not pass through an accessible node. Step 7 does not apply, since $w_3 \in w_{\mathcal{A}}$ has no confounding variables. Step 8 requires us to deal with the indirect confounding variable $v_4$ for $w_6 \rightarrow w_1$. Checking Conditions 2a and 2c of Theorem 1 for $\mathcal{A}$ and $\mathcal{B}$, it appears that every path from $w_i = w_2$ or from $w_j = w_1$ to $w_6$ passes through a measured node, and that there are no confounding variables for the estimation problem $w_{\mathcal{A}} \rightarrow w_6$. Hence we include $w_6$ in $w_{\mathcal{B}}$. As a result, the estimation problem is $(w_2, w_3, w_6) \rightarrow (w_1, w_2)$.
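Step 4 of this algorithm amounts to a backwards search through inaccessible nodes. A minimal sketch, with our own helper name and the same illustrative edge set as before:

```python
edges = {(2, 1), (3, 1), (4, 1), (2, 3), (3, 4), (5, 2), (6, 4), (7, 5), (8, 7)}

def accessible_sources(k, accessible):
    """All accessible nodes that reach w_k through inaccessible nodes only."""
    found, stack, seen = set(), [k], {k}
    while stack:
        node = stack.pop()
        for (src, dst) in edges:
            if dst != node or src in seen:
                continue
            seen.add(src)
            if src in accessible:
                found.add(src)        # accessible: record it and stop here
            else:
                stack.append(src)     # inaccessible: keep walking backwards
    return found

# Example 6: for the inaccessible in-neighbor w_4 of w_1,
# accessible_sources(4, {1, 2, 3, 6}) == {3, 6}; since w_3 is already
# in D, only w_6 is newly added.
```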

Remark 2

Rather than starting the signal selection problem from a fixed set of accessible nodes, the provided theory allows for an iterative and interactive algorithm for selecting accessible nodes in sensor allocation problems in a flexible way.

X Discussion

All three presented algorithms lead to a set of selected node signals that satisfies the conditions for target module invariance, and thus provide a predictor model in which no confounding variables can deteriorate the estimation of the target module. Only in the “user selection case” is this conditioned on the availability of appropriate node signals to satisfy the parallel path and loop condition. Under these circumstances the presented algorithms are sound and complete [37]. This attractive feasibility result is mainly attributed to the addition of predicted outputs, which adds the flexibility needed to solve the problem of confounding variables.

Note that the presented algorithms do not guarantee consistency of the estimated target module. For this to hold, the additional conditions for consistency, among which data-informativity and the delay in path/loop condition, need to be satisfied too, as illustrated in Figure 8. A specification of path-based conditions for data-informativity is beyond the scope of this paper, but first results on this problem are presented in [35]. Including such path-based conditions in the signal selection algorithms would be a natural next step, as would the development of data-driven techniques to estimate the correlation structure of the disturbances.

Figure 8: Consistency of the target module estimate requires the conditions of Theorem 2 to be satisfied, in addition to an appropriate predictor model.

It can be observed that the three algorithms presented in the previous sections rely only on graphical conditions on the network. This paves the way for automating the signal selection procedure with graph-based algorithms that scale to large dimensions, taking as inputs the topology of the network and the disturbance correlation structure, represented as adjacency matrices. It can also be observed that the three considered cases will most likely lead to three different experimental setups for estimating the single target module. For all three cases we can arrive at consistent and maximum likelihood estimates of the target module. However, because the experimental setups differ, the data-informativity conditions and the statistical properties of the target module estimates will differ as well. The minimum variance expressions, in the form of the related Cramér-Rao lower bounds, will typically be different for the different experimental setups. Comparing these bounds for different experimental setups is beyond the scope of the current paper and is considered a topic for future research.
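As an illustration of such an automated, graph-based implementation, the sketch below detects confounding variables for an estimation problem $w_{\mathcal{A}} \rightarrow w_{\mathcal{Y}}$ from two Boolean adjacency matrices: the network topology $A$ (with $A[l,k]=1$ iff there is a module $w_k \rightarrow w_l$) and the noise correlation structure $C$ (with $C[k,x]=1$ iff $v_k$ depends on the white noise source $e_x$). Following the reasoning in the proof of Lemma 1, a confounding variable is operationalized as a white noise source with simultaneous paths to both node sets that are direct or pass through unmeasured nodes in $\mathcal{Z}$ only; the matrix conventions and function names are our own.

```python
import numpy as np

def reach_through_Z(A, Z):
    """R[l, m] = 1 iff there is a path w_m -> ... -> w_l that starts in Z
    and whose intermediate nodes all lie in Z."""
    n = A.shape[0]
    AZ = A.copy()
    AZ[:, [m for m in range(n) if m not in Z]] = 0   # steps must depart from Z
    R = AZ
    for _ in range(n):                               # Boolean transitive closure
        R = ((R + R @ R) > 0).astype(int)
    return R

def confounding_variables(A, C, Z, Aset, Yset):
    """Source indices x such that e_x influences both w_Aset and w_Yset,
    directly or through Z-only paths."""
    influence = (C + reach_through_Z(A, Z) @ C) > 0
    return {x for x in range(C.shape[1])
            if influence[list(Aset), x].any() and influence[list(Yset), x].any()}
```

For the network of Figure 6 one would encode, e.g., the correlation of $v_1$ and $v_8$ through a shared source column in $C$; with $\mathcal{Z} = \{6, 7, 8\}$, that source is flagged as a confounder for $w_5 \rightarrow w_1$, reproducing the reasoning in Example 4.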
We have formulated identification criteria in the realm of classical prediction error methods. This will typically lead to complex non-convex optimization problems that scale poorly with the dimension (number of parameters) of the problem. However, alternative optimization approaches are becoming available that scale well, relying on regularized kernel-based methods that exploit developments originating from machine learning, see e.g. [18], and on relaxations based on sequential convex optimization, see e.g. [38, 39].

XI CONCLUSIONS

A new local module identification approach has been presented for identifying a local module in a dynamic network with given topology and with process noise that is correlated over the different nodes. For this case, it is shown that the problem can be solved by moving from a MISO to a MIMO identification setup, in which the target module is embedded in a MIMO problem with appropriately chosen inputs and outputs that warrant consistent estimation of the target module with maximum likelihood properties. The key part of the procedure is the handling of direct and indirect confounding variables, which are induced by correlated disturbances and/or non-measured node signals, and which thus depend essentially on the (Boolean) topology of the network and the (Boolean) correlation structure of the disturbances. A general theory has been developed that allows for the specification of different types of algorithms, of which the “full input case”, the “minimum input case” and the “user selection case” have been illustrated through examples. The presented theory is suitable for generalization to the estimation of sets of target modules.

Appendix A Proof of Proposition 1

Starting with the network representation (6), we can eliminate the non-measured node variables $w_{\mathcal{Z}}$ from the equations by rewriting the last (block) row of (6) into an explicit expression for $w_{\mathcal{Z}}$:

$$w_{\mathcal{Z}} = (I - G_{\mathcal{Z}\mathcal{Z}})^{-1}\Bigg[\sum_{k \in \mathcal{Q} \cup \{o\} \cup \mathcal{U}} G_{\mathcal{Z}k}\, w_k + \sum_{\ell \in \mathcal{Q} \cup \{o\} \cup \mathcal{U} \cup \mathcal{Z}} H_{\mathcal{Z}\ell}\, e_\ell\Bigg],$$

and by substituting this $w_{\mathcal{Z}}$ into the expressions for the remaining $w$-variables. As a result,

$$\begin{bmatrix} w_{\mathcal{Q}} \\ w_o \\ w_{\mathcal{U}} \end{bmatrix} = \begin{bmatrix} \breve{G}_{\mathcal{Q}\mathcal{Q}} & \breve{G}_{\mathcal{Q}o} & \breve{G}_{\mathcal{Q}\mathcal{U}} \\ \breve{G}_{o\mathcal{Q}} & \breve{G}_{oo} & \breve{G}_{o\mathcal{U}} \\ \breve{G}_{\mathcal{U}\mathcal{Q}} & \breve{G}_{\mathcal{U}o} & \breve{G}_{\mathcal{U}\mathcal{U}} \end{bmatrix} \begin{bmatrix} w_{\mathcal{Q}} \\ w_o \\ w_{\mathcal{U}} \end{bmatrix} + \breve{v},$$

$$\breve{v} = \breve{H} \begin{bmatrix} e_{\mathcal{Q}} \\ e_o \\ e_{\mathcal{U}} \\ e_{\mathcal{Z}} \end{bmatrix} = \begin{bmatrix} \breve{H}_{\mathcal{Q}\mathcal{Q}} & \breve{H}_{\mathcal{Q}o} & \breve{H}_{\mathcal{Q}\mathcal{U}} & \breve{H}_{\mathcal{Q}\mathcal{Z}} \\ \breve{H}_{o\mathcal{Q}} & \breve{H}_{oo} & \breve{H}_{o\mathcal{U}} & \breve{H}_{o\mathcal{Z}} \\ \breve{H}_{\mathcal{U}\mathcal{Q}} & \breve{H}_{\mathcal{U}o} & \breve{H}_{\mathcal{U}\mathcal{U}} & \breve{H}_{\mathcal{U}\mathcal{Z}} \end{bmatrix} \begin{bmatrix} e_{\mathcal{Q}} \\ e_o \\ e_{\mathcal{U}} \\ e_{\mathcal{Z}} \end{bmatrix} \qquad (15)$$

with $\mathrm{cov}(e) = I$, and where

$$\breve{G}_{kh} = G_{kh} + G_{k\mathcal{Z}}(I - G_{\mathcal{Z}\mathcal{Z}})^{-1} G_{\mathcal{Z}h} \qquad (16)$$

with $k, h \in \mathcal{Q} \cup \{o\} \cup \mathcal{U}$, and

$$\breve{H}_{k\ell} = H_{k\ell} + G_{k\mathcal{Z}}(I - G_{\mathcal{Z}\mathcal{Z}})^{-1} H_{\mathcal{Z}\ell}, \qquad (17)$$

with $\ell \in \mathcal{Q} \cup \{o\} \cup \mathcal{U} \cup \mathcal{Z}$.

On the basis of (15), the spectral density of $\breve{v}$ is given by $\Phi_{\breve{v}} = \breve{H}\breve{H}^*$. Applying a spectral factorization [40] to $\Phi_{\breve{v}}$ delivers $\Phi_{\breve{v}} = \tilde{H}\tilde{\Lambda}\tilde{H}^*$, with $\tilde{H}$ a monic, stable and minimum phase rational matrix, and $\tilde{\Lambda}$ a positive definite (constant) matrix. Then there exists a white noise process $\tilde{\xi}$, defined by $\tilde{\xi} := \tilde{H}^{-1}\breve{H}e$, such that $\tilde{H}\tilde{\xi} = \breve{v}$, with $\mathrm{cov}(\tilde{\xi}) = \tilde{\Lambda}$, while $\tilde{H}$ is of the form

$$\tilde{H} = \begin{bmatrix} \tilde{H}_{11} & \tilde{H}_{12} & \tilde{H}_{13} \\ \tilde{H}_{21} & \tilde{H}_{22} & \tilde{H}_{23} \\ \tilde{H}_{31} & \tilde{H}_{32} & \tilde{H}_{33} \end{bmatrix} \qquad (18)$$

where the block dimensions are conformable to the dimensions of $w_{\mathcal{Q}}$, $w_o$ and $w_{\mathcal{U}}$, respectively. As a result, (15) can be rewritten as

$$\begin{bmatrix} w_{\mathcal{Q}} \\ w_o \\ w_{\mathcal{U}} \end{bmatrix} = \begin{bmatrix} \breve{G}_{\mathcal{Q}\mathcal{Q}} & \breve{G}_{\mathcal{Q}o} & \breve{G}_{\mathcal{Q}\mathcal{U}} \\ \breve{G}_{o\mathcal{Q}} & \breve{G}_{oo} & \breve{G}_{o\mathcal{U}} \\ \breve{G}_{\mathcal{U}\mathcal{Q}} & \breve{G}_{\mathcal{U}o} & \breve{G}_{\mathcal{U}\mathcal{U}} \end{bmatrix} \begin{bmatrix} w_{\mathcal{Q}} \\ w_o \\ w_{\mathcal{U}} \end{bmatrix} + \tilde{H} \begin{bmatrix} \tilde{\xi}_{\mathcal{Q}} \\ \tilde{\xi}_o \\ \tilde{\xi}_{\mathcal{U}} \end{bmatrix}. \qquad (19)$$

By denoting

$$\begin{bmatrix} \check{H}_{13} \\ \check{H}_{23} \end{bmatrix} := \begin{bmatrix} \tilde{H}_{13}\tilde{H}_{33}^{-1} \\ \tilde{H}_{23}\tilde{H}_{33}^{-1} \end{bmatrix} \qquad (20)$$

and premultiplying (19) with

$$\begin{bmatrix} I & 0 & -\check{H}_{13} \\ 0 & I & -\check{H}_{23} \\ 0 & 0 & I \end{bmatrix} \qquad (21)$$

while keeping only the identity terms on the left hand side, we obtain an equivalent network equation:

$$\begin{bmatrix} w_{\mathcal{Q}} \\ w_o \\ w_{\mathcal{U}} \end{bmatrix} = \begin{bmatrix} \breve{G}'_{\mathcal{Q}\mathcal{Q}} & \breve{G}'_{\mathcal{Q}o} & \breve{G}'_{\mathcal{Q}\mathcal{U}} \\ \breve{G}'_{o\mathcal{Q}} & \breve{G}'_{oo} & \breve{G}'_{o\mathcal{U}} \\ \breve{G}_{\mathcal{U}\mathcal{Q}} & \breve{G}_{\mathcal{U}o} & \breve{G}_{\mathcal{U}\mathcal{U}} \end{bmatrix} \begin{bmatrix} w_{\mathcal{Q}} \\ w_o \\ w_{\mathcal{U}} \end{bmatrix} + \begin{bmatrix} \tilde{H}'_{11} & \tilde{H}'_{12} & 0 \\ \tilde{H}'_{21} & \tilde{H}'_{22} & 0 \\ \tilde{H}_{31} & \tilde{H}_{32} & \tilde{H}_{33} \end{bmatrix} \begin{bmatrix} \tilde{\xi}_{\mathcal{Q}} \\ \tilde{\xi}_o \\ \tilde{\xi}_{\mathcal{U}} \end{bmatrix}, \qquad (22)$$

with

$$\begin{aligned}
\breve{G}'_{\mathcal{Q}\mathcal{U}} &= \breve{G}_{\mathcal{Q}\mathcal{U}} - \check{H}_{13}\breve{G}_{\mathcal{U}\mathcal{U}} + \check{H}_{13} \qquad &(23)\\
\breve{G}'_{\mathcal{Q}\star} &= \breve{G}_{\mathcal{Q}\star} - \check{H}_{13}\breve{G}_{\mathcal{U}\star} \qquad &(24)\\
\breve{G}'_{o\star} &= \breve{G}_{o\star} - \check{H}_{23}\breve{G}_{\mathcal{U}\star} \qquad &(25)\\
\breve{G}'_{o\mathcal{U}} &= \breve{G}_{o\mathcal{U}} - \check{H}_{23}\breve{G}_{\mathcal{U}\mathcal{U}} + \check{H}_{23} \qquad &(26)\\
\tilde{H}'_{1\square} &= \tilde{H}_{1\square} - \check{H}_{13}\tilde{H}_{3\square} \qquad &(27)\\
\tilde{H}'_{2\square} &= \tilde{H}_{2\square} - \check{H}_{23}\tilde{H}_{3\square}, \qquad &(28)
\end{aligned}$$

where $\star \in \mathcal{Q} \cup \{o\}$ and $\square \in \{1, 2\}$.

The next step is to show that the block elements $\breve{G}'_{\mathcal{Q}o}$ and $\breve{G}'_{oo}$ in the $G$-matrix can be made 0. This can be done by variable substitution, as follows. The second row in (22) is replaced by an explicit expression for $w_o$, according to

$$w_o = (1 - \breve{G}'_{oo})^{-1}\big[\breve{G}'_{o\mathcal{Q}}\, w_{\mathcal{Q}} + \breve{G}'_{o\mathcal{U}}\, w_{\mathcal{U}} + \tilde{H}'_{21}\tilde{\xi}_{\mathcal{Q}} + \tilde{H}'_{22}\tilde{\xi}_o\big].$$

Additionally, this expression for $w_o$ is substituted into the first block row of (22), to remove the $w_o$-dependent term on the right hand side, leading to

$$\begin{bmatrix} w_{\mathcal{Q}} \\ w_o \\ w_{\mathcal{U}} \end{bmatrix} = \begin{bmatrix} \breve{G}''_{\mathcal{Q}\mathcal{Q}} & 0 & \breve{G}''_{\mathcal{Q}\mathcal{U}} \\ \bar{G}_{o\mathcal{Q}} & 0 & \bar{G}_{o\mathcal{U}} \\ \breve{G}_{\mathcal{U}\mathcal{Q}} & \breve{G}_{\mathcal{U}o} & \breve{G}_{\mathcal{U}\mathcal{U}} \end{bmatrix} \begin{bmatrix} w_{\mathcal{Q}} \\ w_o \\ w_{\mathcal{U}} \end{bmatrix} + \begin{bmatrix} \tilde{H}''_{11} & \tilde{H}''_{12} & 0 \\ \tilde{H}''_{21} & \tilde{H}''_{22} & 0 \\ \tilde{H}_{31} & \tilde{H}_{32} & \tilde{H}_{33} \end{bmatrix} \begin{bmatrix} \tilde{\xi}_{\mathcal{Q}} \\ \tilde{\xi}_o \\ \tilde{\xi}_{\mathcal{U}} \end{bmatrix} \qquad (29)$$

with

$$\begin{aligned}
\bar{G}_{o\star} &= (I - \breve{G}'_{oo})^{-1}\breve{G}'_{o\star} \qquad &(30)\\
\tilde{H}''_{2\star} &= (I - \breve{G}'_{oo})^{-1}\tilde{H}'_{2\star} \qquad &(31)\\
\breve{G}''_{\mathcal{Q}\star} &= \breve{G}'_{\mathcal{Q}\star} + \breve{G}'_{\mathcal{Q}o}\bar{G}_{o\star} \qquad &(32)\\
\tilde{H}''_{1\star} &= \tilde{H}'_{1\star} + \breve{G}'_{\mathcal{Q}o}\tilde{H}''_{2\star}. \qquad &(33)
\end{aligned}$$

Since, as a result of these operations, the matrix $\breve{G}''_{\mathcal{Q}\mathcal{Q}}$ might not be hollow, we move any diagonal terms of this matrix to the left hand side of the equation, and premultiply the first (block) equation by the diagonal matrix $(I - \mathrm{diag}(\breve{G}''_{\mathcal{Q}\mathcal{Q}}))^{-1}$, to obtain the expression

$$\begin{bmatrix} w_{\mathcal{Q}} \\ w_o \\ w_{\mathcal{U}} \end{bmatrix} = \begin{bmatrix} \bar{G}_{\mathcal{Q}\mathcal{Q}} & 0 & \bar{G}_{\mathcal{Q}\mathcal{U}} \\ \bar{G}_{o\mathcal{Q}} & 0 & \bar{G}_{o\mathcal{U}} \\ \breve{G}_{\mathcal{U}\mathcal{Q}} & \breve{G}_{\mathcal{U}o} & \breve{G}_{\mathcal{U}\mathcal{U}} \end{bmatrix} \begin{bmatrix} w_{\mathcal{Q}} \\ w_o \\ w_{\mathcal{U}} \end{bmatrix} + \begin{bmatrix} \tilde{H}'''_{11} & \tilde{H}'''_{12} & 0 \\ \tilde{H}''_{21} & \tilde{H}''_{22} & 0 \\ \tilde{H}_{31} & \tilde{H}_{32} & \tilde{H}_{33} \end{bmatrix} \begin{bmatrix} \tilde{\xi}_{\mathcal{Q}} \\ \tilde{\xi}_o \\ \tilde{\xi}_{\mathcal{U}} \end{bmatrix} \qquad (34)$$

with

$$\begin{aligned}
\bar{G}_{\mathcal{Q}\mathcal{Q}} &= (I - \mathrm{diag}(\breve{G}''_{\mathcal{Q}\mathcal{Q}}))^{-1}(\breve{G}''_{\mathcal{Q}\mathcal{Q}} - \mathrm{diag}(\breve{G}''_{\mathcal{Q}\mathcal{Q}})), \qquad &(35)\\
\bar{G}_{\mathcal{Q}\mathcal{U}} &= (I - \mathrm{diag}(\breve{G}''_{\mathcal{Q}\mathcal{Q}}))^{-1}\breve{G}''_{\mathcal{Q}\mathcal{U}} \qquad &(36)\\
\tilde{H}'''_{1\star} &= (I - \mathrm{diag}(\breve{G}''_{\mathcal{Q}\mathcal{Q}}))^{-1}\tilde{H}''_{1\star}. \qquad &(37)
\end{aligned}$$

As a final step, we need the matrix $\tilde{H}_r := \begin{bmatrix} \tilde{H}'''_{11} & \tilde{H}'''_{12} \\ \tilde{H}''_{21} & \tilde{H}''_{22} \end{bmatrix}$ to be monic, stable and minimum phase, in order to obtain the representation (7). To that end, we consider the stochastic process $\tilde{v}_{\mathcal{Y}} := \tilde{H}_r \tilde{\xi}_{\mathcal{Y}}$, with $\tilde{\xi}_{\mathcal{Y}} := [\tilde{\xi}_{\mathcal{Q}}^\top \ \tilde{\xi}_o^\top]^\top$. The spectral density of $\tilde{v}_{\mathcal{Y}}$ is then given by $\Phi_{\tilde{v}_{\mathcal{Y}}} = \tilde{H}_r \tilde{\Lambda}_{\mathcal{Y}} \tilde{H}_r^*$, with $\tilde{\Lambda}_{\mathcal{Y}}$ the covariance matrix of $\tilde{\xi}_{\mathcal{Y}}$, which can be decomposed as $\tilde{\Lambda}_{\mathcal{Y}} = \tilde{\Gamma}_r \tilde{\Gamma}_r^\top$. From spectral factorization [40] it follows that the spectral factor $\tilde{H}_r \tilde{\Gamma}_r$ of $\Phi_{\tilde{v}_{\mathcal{Y}}}$ satisfies

$$\tilde{H}_r \tilde{\Gamma}_r = \bar{H}_s D \qquad (38)$$

with $\bar{H}_s$ a stable and minimum phase rational matrix, and $D$ an all-pass stable rational matrix satisfying $DD^* = I$. The signal $\tilde{v}_{\mathcal{Y}}$ can then be written as

$$\tilde{v}_{\mathcal{Y}} = \tilde{H}_r \tilde{\xi}_{\mathcal{Y}} = \bar{H}_s D \tilde{\Gamma}_r^{-1} \tilde{\xi}_{\mathcal{Y}}.$$

By defining $\bar{H}_s^\infty := \lim_{z \rightarrow \infty} \bar{H}_s$, this can be rewritten as

$$\tilde{v}_{\mathcal{Y}} = \tilde{H}_r \tilde{\xi}_{\mathcal{Y}} = \underbrace{\bar{H}_s (\bar{H}_s^\infty)^{-1}}_{\bar{H}} \; \underbrace{\bar{H}_s^\infty D \tilde{\Gamma}_r^{-1} \tilde{\xi}_{\mathcal{Y}}}_{\xi_{\mathcal{Y}}}.$$

As a result, $\bar{H}$ is a monic, stable and stably invertible rational matrix, and $\xi_{\mathcal{Y}}$ is a white noise process with spectral density $\bar{H}_s^\infty D \tilde{\Gamma}_r^{-1} \Phi_{\tilde{\xi}_{\mathcal{Y}}} \tilde{\Gamma}_r^{-\top} D^* (\bar{H}_s^\infty)^\top = \bar{H}_s^\infty (\bar{H}_s^\infty)^\top$. Therefore we can write (34) as

$$\begin{bmatrix} w_{\mathcal{Q}} \\ w_o \\ w_{\mathcal{U}} \end{bmatrix} = \begin{bmatrix} \bar{G}_{\mathcal{Q}\mathcal{Q}} & 0 & \bar{G}_{\mathcal{Q}\mathcal{U}} \\ \bar{G}_{o\mathcal{Q}} & 0 & \bar{G}_{o\mathcal{U}} \\ \breve{G}_{\mathcal{U}\mathcal{Q}} & \breve{G}_{\mathcal{U}o} & \breve{G}_{\mathcal{U}\mathcal{U}} \end{bmatrix} \begin{bmatrix} w_{\mathcal{Q}} \\ w_o \\ w_{\mathcal{U}} \end{bmatrix} + \begin{bmatrix} \bar{H}_{11} & \bar{H}_{12} & 0 \\ \bar{H}_{21} & \bar{H}_{22} & 0 \\ \bar{H}_{31} & \bar{H}_{32} & \tilde{H}_{33} \end{bmatrix} \begin{bmatrix} \xi_{\mathcal{Q}} \\ \xi_o \\ \tilde{\xi}_{\mathcal{U}} \end{bmatrix} \qquad (39)$$

where $[\bar{H}_{31} \ \ \bar{H}_{32}] = [\tilde{H}_{31} \ \ \tilde{H}_{32}]\, \tilde{\Gamma}_r D^{-1} (\bar{H}_s^\infty)^{-1}$. Let $[\bar{H}'_{31} \ \ \bar{H}'_{32}] = [\bar{H}_{31} \ \ \bar{H}_{32}] \begin{bmatrix} \bar{H}_{11} & \bar{H}_{12} \\ \bar{H}_{21} & \bar{H}_{22} \end{bmatrix}^{-1}$. Premultiplying (39) with

$$\begin{bmatrix} I & 0 & 0 \\ 0 & I & 0 \\ -\bar{H}'_{31} & -\bar{H}'_{32} & I \end{bmatrix}$$

while keeping only the identity terms on the left hand side, we obtain an equivalent network equation:

$$\begin{bmatrix} w_{\mathcal{Q}} \\ w_o \\ w_{\mathcal{U}} \end{bmatrix} = \begin{bmatrix} \bar{G}_{\mathcal{Q}\mathcal{Q}} & 0 & \bar{G}_{\mathcal{Q}\mathcal{U}} \\ \bar{G}_{o\mathcal{Q}} & 0 & \bar{G}_{o\mathcal{U}} \\ \breve{G}'_{\mathcal{U}\mathcal{Q}} & \breve{G}'_{\mathcal{U}o} & \breve{G}'_{\mathcal{U}\mathcal{U}} \end{bmatrix} \begin{bmatrix} w_{\mathcal{Q}} \\ w_o \\ w_{\mathcal{U}} \end{bmatrix} + \begin{bmatrix} \bar{H}_{11} & \bar{H}_{12} & 0 \\ \bar{H}_{21} & \bar{H}_{22} & 0 \\ 0 & 0 & \tilde{H}_{33} \end{bmatrix} \begin{bmatrix} \xi_{\mathcal{Q}} \\ \xi_o \\ \tilde{\xi}_{\mathcal{U}} \end{bmatrix} \qquad (40)$$

where $\breve{G}'_{\mathcal{U}\mathcal{Q}} = \breve{G}_{\mathcal{U}\mathcal{Q}} - \bar{H}'_{31}\bar{G}_{\mathcal{Q}\mathcal{Q}} - \bar{H}'_{32}\bar{G}_{o\mathcal{Q}} + \bar{H}'_{31}$, $\breve{G}'_{\mathcal{U}o} = \breve{G}_{\mathcal{U}o} + \bar{H}'_{32}$ and $\breve{G}'_{\mathcal{U}\mathcal{U}} = \breve{G}_{\mathcal{U}\mathcal{U}} - \bar{H}'_{31}\bar{G}_{\mathcal{Q}\mathcal{U}} - \bar{H}'_{32}\bar{G}_{o\mathcal{U}}$. In order to make $\breve{G}'_{\mathcal{U}\mathcal{U}}$ hollow, we move any diagonal terms of this matrix to the left hand side of the equation and premultiply the third (block) equation by the diagonal matrix $(I - \mathrm{diag}(\breve{G}'_{\mathcal{U}\mathcal{U}}))^{-1}$. This modifies the (3,3) (block) element of the $H$-matrix to $(I - \mathrm{diag}(\breve{G}'_{\mathcal{U}\mathcal{U}}))^{-1}\tilde{H}_{33}$, which we need to be monic, stable and stably invertible. Applying spectral factorization as before [40], we can write the term $(I - \mathrm{diag}(\breve{G}'_{\mathcal{U}\mathcal{U}}))^{-1}\tilde{H}_{33}\tilde{\xi}_{\mathcal{U}}$ as $\bar{H}_{33}\xi_{\mathcal{U}}$, where $\bar{H}_{33}$ is monic, stable and stably invertible and $\xi_{\mathcal{U}}$ is a white noise process with covariance $\Lambda_{33}$. This completes the proof for obtaining (7).

The absence of confounding variables for the estimation problem $w_{\mathcal{U}} \rightarrow w_{\mathcal{Y}}$ can be shown as follows. Since all non-measured nodes $w_{\mathcal{Z}}$ have been removed in the network represented by (7), the only non-measured signals in the network are the noise signals in $\xi_m$, and they do not have any unmeasured paths to any nodes in the network (i.e., to $w_m$). Due to the block-diagonal structure of $\bar{H}_m$ in (7), the only non-measured signals that have direct paths to $w_{\mathcal{U}}$ originate from $\xi_{\mathcal{U}}$, while the only non-measured signals that have direct paths to $w_{\mathcal{Y}}$ originate from $[\xi_{\mathcal{Q}}^\top \ \xi_o^\top]^\top$. Therefore there is no element of $\xi_m$ with simultaneous unmeasured or direct paths to both $w_{\mathcal{U}}$ and $w_{\mathcal{Y}}$. $\Box$
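The spectral factorization invoked repeatedly in the proof above can be illustrated numerically in the scalar case, where the monic, stable, minimum phase factor is obtained by reflecting the unstable zeros of the spectrum into the unit circle. The sketch below is purely an illustration of this standard construction and not part of the formal argument.

```python
import numpy as np

def spectral_factor(c):
    """Factor phi(omega) = |c(e^{-i omega})|^2, with c an MA polynomial in
    q^{-1}, as lam * |h(e^{-i omega})|^2 with h monic and minimum phase."""
    roots = np.roots(c)
    stable = np.where(np.abs(roots) > 1, 1.0 / roots, roots)  # reflect zeros
    h = np.real_if_close(np.poly(stable))         # monic, minimum phase factor
    gain = c[0] * np.prod(roots[np.abs(roots) > 1])
    return h, float(np.abs(gain) ** 2)            # h and noise variance lam

# v_t = e_t + 2 e_{t-1} (non-minimum phase) has spectrum 5 + 4 cos(omega);
# the factorization returns h = [1, 0.5] and lam = 4, i.e. the same spectrum
# generated by the monic, stable and stably invertible H(q) = 1 + 0.5 q^{-1}.
h, lam = spectral_factor([1.0, 2.0])
```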

Appendix B Proof of Theorem 1

In order to prove Theorem 1, we first present three preparatory lemmas.

Lemma 1

Consider a dynamic network as defined in (6), a vector $e_{\mathcal{X}}$ of white noise sources with $\mathcal{X} \subseteq \mathcal{L}$, and two subsets of nodes $w_\Phi$ and $w_\Omega$, with $\Phi, \Omega \subset \mathcal{L}\backslash\mathcal{Z}$. If in $e_{\mathcal{X}}$ there is no confounding variable for the estimation problem $w_\Phi \rightarrow w_\Omega$, then

$$\breve{H}_{\Omega\mathcal{X}}\breve{H}_{\Phi\mathcal{X}}^* = \breve{H}_{\Phi\mathcal{X}}\breve{H}_{\Omega\mathcal{X}}^* = 0,$$

where $\breve{H}_{\Omega\mathcal{X}}$ and $\breve{H}_{\Phi\mathcal{X}}$ are the noise model transfer functions in the immersed network (15), related to the appropriate variables.

Proof: If in $e_{\mathcal{X}}$ there is no confounding variable for the formulated estimation problem, then for all $e_x$, $x \in \mathcal{X}$, there do not exist simultaneous paths from $e_x$ to $w_\Phi$ and to $w_\Omega$ that are direct or pass through nodes in $\mathcal{Z}$ only. For the network in which the signals $w_{\mathcal{Z}}$ are immersed, it follows from (17) that $\breve{H}_{k\ell} = H_{k\ell} + G_{k\mathcal{Z}}(I - G_{\mathcal{Z}\mathcal{Z}})^{-1}H_{\mathcal{Z}\ell}$, with $k \in \Phi$ and $\ell \in \mathcal{X}$. The first term in the sum (i.e., $H_{k\ell}$) is the noise model transfer function of the direct path from $e_\ell$ to $w_k$, and the second term is the transfer function of the unmeasured paths (i.e., paths through $w_{\mathcal{Z}}$ only) from $e_\ell$ to $w_k$. If all paths from a noise signal $e_x$ to $w_\Phi$ pass through a node in $w_{\mathcal{L}\backslash\mathcal{Z}}$, then there are no direct or unmeasured paths from $e_x$ to nodes in $w_\Phi$. This implies that $\breve{H}_{kx} = \breve{H}_{kx}^* = 0$ for all $k \in \Phi$ (i.e., $\breve{H}_{\Phi x} = 0$). A dual reasoning applies to paths from $e_x$ to $w_\Omega$. Consider $e_{\mathcal{X}} = [e_{x_1} \ e_{x_2} \ \dots \ e_{x_n}]^\top$. Then we have $\breve{H}_{\Phi\mathcal{X}}\breve{H}_{\Omega\mathcal{X}}^* = \breve{H}_{\Phi x_1}\breve{H}_{\Omega x_1}^* + \dots + \breve{H}_{\Phi x_n}\breve{H}_{\Omega x_n}^*$. If the condition of the lemma is satisfied, implying that no simultaneous paths exist, then in each of the product terms we have either $\breve{H}_{\Phi x_k} = 0$ or $\breve{H}_{\Omega x_k}^* = 0$, for $k \in \{1, 2, \dots, n\}$. This proves the result of Lemma 1. $\Box$

Lemma 2

Consider a dynamic network as defined in (15) with target module $G_{ji}$, where the non-measured node signals $w_{\mathcal{Z}}$ are immersed, while the node sets $\{o, \mathcal{Q}, \mathcal{U}\}$ are chosen according to the specifications in Section IV. Then $\bar{G}_{ji}$ is given by the following expressions:

$$\mbox{If } i \in \mathcal{Q}: \quad \bar{G}_{ji} = (I - \breve{G}_{jj} + \check{H}_{j3}\breve{G}_{\mathcal{U}j})^{-1}(\breve{G}_{ji} - \check{H}_{j3}\breve{G}_{\mathcal{U}i}) \qquad (41)$$
$$\mbox{If } i \in \mathcal{U}: \quad \bar{G}_{ji} = (I - \breve{G}_{jj} + \check{H}_{j3}\breve{G}_{\mathcal{U}j})^{-1}(\breve{G}_{ji} - \check{H}_{j3}\breve{G}_{\mathcal{U}i} + \check{H}_{ji}) \qquad (42)$$

where $\check{H}_{j3}$ is the row vector corresponding to the row of node signal $j$ in $\check{H}_{13}$ (if $j \in \mathcal{Q}$) or in $\check{H}_{23}$ (if $j = o$), and $\check{H}_{ji}$ is the element corresponding to the column of node signal $i$ in $\check{H}_{j3}$.

Proof: For the target module $G_{ji}$ the following cases can occur:

  1. $j = o$ and $i \in \mathcal{U}$. From (30) we have $\bar{G}_{ji} = (I - \breve{G}'_{jj})^{-1}\breve{G}'_{ji}$, where $\breve{G}'_{jj}$ is given by (25) and $\breve{G}'_{ji}$ by (26). This directly leads to (42).

  2. $j = o$ and $i \in \mathcal{Q}$. From (30) we have $\bar{G}_{ji} = (I - \breve{G}'_{jj})^{-1}\breve{G}'_{ji}$, where $\breve{G}'_{jj}$ and $\breve{G}'_{ji}$ are given by (25), leading to (41).

  3. $j \in \mathcal{Q}$, $o$ is void and $i \in \mathcal{U}$. From (36) we have $\bar{G}_{ji} = (I - \breve{G}''_{jj})^{-1}\breve{G}''_{ji}$, where $\breve{G}''_{jj}$ and $\breve{G}''_{ji}$ are given by (32). Since $o$ is void, (32) reduces to $\breve{G}''_{\mathcal{Q}\star} = \breve{G}'_{\mathcal{Q}\star}$. Therefore $\breve{G}''_{jj} = \breve{G}'_{jj}$, which is specified by (24), and $\breve{G}''_{ji} = \breve{G}'_{ji}$, which is given by (23). This leads to (42).

  4. $j \in \mathcal{Q}$, $o$ is void and $i \in \mathcal{Q}$. Since $j \neq i$, it follows from (35) that $\bar{G}_{ji} = (I - \breve{G}''_{jj})^{-1}\breve{G}''_{ji}$, where $\breve{G}''_{jj}$ and $\breve{G}''_{ji}$ are given by (32). Since $o$ is void, (32) reduces to $\breve{G}''_{\mathcal{Q}\star} = \breve{G}'_{\mathcal{Q}\star}$. Therefore, for this case, $\breve{G}''_{jj} = \breve{G}'_{jj}$ and $\breve{G}''_{ji} = \breve{G}'_{ji}$, which are given by (24). This leads to (41). $\Box$

Lemma 3

Consider a dynamic network as defined in (15), where the non-measured node signals $w_{\mathcal{Z}}$ are immersed, and let $\mathcal{U}$ be decomposed into sets $\mathcal{A}$ and $\mathcal{B}$ satisfying Condition 2. Then the spectral density $\Phi_{\breve{v}}$ has the unique spectral factorization $\Phi_{\breve{v}} = \tilde{H}\Lambda\tilde{H}^*$, with $\Lambda$ constant and $\tilde{H}$ monic, stable, minimum phase, and of the form

$$\Lambda = \begin{bmatrix} \Lambda_{11} & \Lambda_{12} & \Lambda_{13} & 0 \\ \Lambda_{21} & \Lambda_{22} & \Lambda_{23} & 0 \\ \Lambda_{31} & \Lambda_{32} & \Lambda_{33} & 0 \\ 0 & 0 & 0 & \Lambda_{44} \end{bmatrix}, \qquad \tilde{H} = \begin{bmatrix} \tilde{H}_{11} & \tilde{H}_{12} & \tilde{H}_{\mathcal{Q}\mathcal{B}} & 0 \\ \tilde{H}_{21} & \tilde{H}_{22} & \tilde{H}_{o\mathcal{B}} & 0 \\ \tilde{H}_{\mathcal{B}\mathcal{Q}} & \tilde{H}_{\mathcal{B}o} & \tilde{H}_{\mathcal{B}\mathcal{B}} & 0 \\ 0 & 0 & 0 & \tilde{H}_{\mathcal{A}\mathcal{A}} \end{bmatrix}, \qquad (43)$$

where the block dimensions are conformable to the dimensions of $w_{\mathcal{Q}}$, $w_o$, $w_{\mathcal{B}}$ and $w_{\mathcal{A}}$, respectively.

Proof: On the basis of (A) we write w𝒰=[ww𝒜]w_{\!{\scriptscriptstyle\mathcal{U}}}=[\begin{matrix}w_{\!{\scriptscriptstyle\mathcal{B}}}^{\top}&w_{\!{\scriptscriptstyle\mathcal{A}}}^{\top}\end{matrix}]^{\top} and

v˘=H˘[e𝒬eoee𝒜e𝒵]=[H˘𝒬𝒬H˘𝒬oH˘𝒬H˘𝒬𝒜H˘𝒬𝒵H˘o𝒬H˘ooH˘oH˘o𝒜H˘o𝒵H˘𝒬H˘oH˘H˘𝒜H˘𝒵H˘𝒜𝒬H˘𝒜oH˘𝒜H˘𝒜𝒜H˘𝒜𝒵][e𝒬eoee𝒜e𝒵]\breve{v}=\breve{H}\begin{bmatrix}e_{\!{\scriptscriptstyle\mathcal{Q}}}\\ e_{o}\\ e_{\!{\scriptscriptstyle\mathcal{B}}}\\ e_{\!{\scriptscriptstyle\mathcal{A}}}\\ e_{\!{\scriptscriptstyle\mathcal{Z}}}\end{bmatrix}=\begin{bmatrix}\breve{H}_{\!{\scriptscriptstyle\mathcal{Q}}\!{\scriptscriptstyle\mathcal{Q}}}&\breve{H}_{\!{\scriptscriptstyle\mathcal{Q}}o}&\breve{H}_{\!{\scriptscriptstyle\mathcal{Q}}\!{\scriptscriptstyle\mathcal{B}}}&\breve{H}_{\!{\scriptscriptstyle\mathcal{Q}}\!{\scriptscriptstyle\mathcal{A}}}&\breve{H}_{\!{\scriptscriptstyle\mathcal{Q}}\!{\scriptscriptstyle\mathcal{Z}}}\\ \breve{H}_{o\!{\scriptscriptstyle\mathcal{Q}}}&\breve{H}_{oo}&\breve{H}_{o\!{\scriptscriptstyle\mathcal{B}}}&\breve{H}_{o\!{\scriptscriptstyle\mathcal{A}}}&\breve{H}_{o\!{\scriptscriptstyle\mathcal{Z}}}\\ \breve{H}_{\!{\scriptscriptstyle\mathcal{B}}\!{\scriptscriptstyle\mathcal{Q}}}&\breve{H}_{\!{\scriptscriptstyle\mathcal{B}}o}&\breve{H}_{\!{\scriptscriptstyle\mathcal{B}}\!{\scriptscriptstyle\mathcal{B}}}&\breve{H}_{\!{\scriptscriptstyle\mathcal{B}}\!{\scriptscriptstyle\mathcal{A}}}&\breve{H}_{\!{\scriptscriptstyle\mathcal{B}}\!{\scriptscriptstyle\mathcal{Z}}}\\ \breve{H}_{\!{\scriptscriptstyle\mathcal{A}}\!{\scriptscriptstyle\mathcal{Q}}}&\breve{H}_{\!{\scriptscriptstyle\mathcal{A}}o}&\breve{H}_{\!{\scriptscriptstyle\mathcal{A}}\!{\scriptscriptstyle\mathcal{B}}}&\breve{H}_{\!{\scriptscriptstyle\mathcal{A}}\!{\scriptscriptstyle\mathcal{A}}}&\breve{H}_{\!{\scriptscriptstyle\mathcal{A}}\!{\scriptscriptstyle\mathcal{Z}}}\end{bmatrix}\!\!\!\begin{bmatrix}e_{\!{\scriptscriptstyle\mathcal{Q}}}\\ e_{o}\\ e_{\!{\scriptscriptstyle\mathcal{B}}}\\ e_{\!{\scriptscriptstyle\mathcal{A}}}\\ e_{\!{\scriptscriptstyle\mathcal{Z}}}\end{bmatrix} (44)

with cov(e)=Icov(e)=I and the components of H˘\breve{H} as specified in (17). Starting from the expression (44), the spectral density Φv˘\Phi_{\breve{v}} can be written as H˘H˘\breve{H}\breve{H}^{*} while it is denoted as

Φv˘=[Φv˘𝒬Φv˘𝒬v˘oΦv˘𝒬v˘Φv˘𝒬v˘𝒜Φv˘𝒬v˘oΦv˘oΦv˘ov˘Φv˘ov˘𝒜Φv˘𝒬v˘Φv˘ov˘Φv˘Φv˘v˘𝒜Φv˘𝒬v˘𝒜Φv˘ov˘𝒜Φv˘v˘𝒜Φv˘𝒜].\Phi_{\breve{v}}=\begin{bmatrix}\Phi_{\breve{v}_{\!{\scriptscriptstyle\mathcal{Q}}}}&\Phi_{\breve{v}_{\!{\scriptscriptstyle\mathcal{Q}}}\breve{v}_{o}}&\Phi_{\breve{v}_{\!{\scriptscriptstyle\mathcal{Q}}}\breve{v}_{\!{\scriptscriptstyle\mathcal{B}}}}&\Phi_{\breve{v}_{\!{\scriptscriptstyle\mathcal{Q}}}\breve{v}_{\!{\scriptscriptstyle\mathcal{A}}}}\\ \Phi_{\breve{v}_{\!{\scriptscriptstyle\mathcal{Q}}}\breve{v}_{o}}^{*}&\Phi_{\breve{v}_{o}}&\Phi_{\breve{v}_{o}\breve{v}_{\!{\scriptscriptstyle\mathcal{B}}}}&\Phi_{\breve{v}_{o}\breve{v}_{\!{\scriptscriptstyle\mathcal{A}}}}\\ \Phi_{\breve{v}_{\!{\scriptscriptstyle\mathcal{Q}}}\breve{v}_{\!{\scriptscriptstyle\mathcal{B}}}}^{*}&\Phi_{\breve{v}_{o}\breve{v}_{\!{\scriptscriptstyle\mathcal{B}}}}^{*}&\Phi_{\breve{v}_{\!{\scriptscriptstyle\mathcal{B}}}}&\Phi_{\breve{v}_{\!{\scriptscriptstyle\mathcal{B}}}\breve{v}_{\!{\scriptscriptstyle\mathcal{A}}}}\\ \Phi_{\breve{v}_{\!{\scriptscriptstyle\mathcal{Q}}}\breve{v}_{\!{\scriptscriptstyle\mathcal{A}}}}^{*}&\Phi_{\breve{v}_{o}\breve{v}_{\!{\scriptscriptstyle\mathcal{A}}}}^{*}&\Phi_{\breve{v}_{\!{\scriptscriptstyle\mathcal{B}}}\breve{v}_{\!{\scriptscriptstyle\mathcal{A}}}}^{*}&\Phi_{\breve{v}_{\!{\scriptscriptstyle\mathcal{A}}}}\end{bmatrix}. (45)

In this structure we are particularly going to analyse the elements

\[
\begin{split}
\Phi_{\breve{v}_{\mathcal{Q}}\breve{v}_{\mathcal{A}}}&=\breve{H}_{\mathcal{Q}\mathcal{Q}}\breve{H}_{\mathcal{A}\mathcal{Q}}^{*}+\breve{H}_{\mathcal{Q}o}\breve{H}_{\mathcal{A}o}^{*}+\breve{H}_{\mathcal{Q}\mathcal{B}}\breve{H}_{\mathcal{A}\mathcal{B}}^{*}+\breve{H}_{\mathcal{Q}\mathcal{A}}\breve{H}_{\mathcal{A}\mathcal{A}}^{*}+\breve{H}_{\mathcal{Q}\mathcal{Z}}\breve{H}_{\mathcal{A}\mathcal{Z}}^{*}\\
\Phi_{\breve{v}_{o}\breve{v}_{\mathcal{A}}}&=\breve{H}_{o\mathcal{Q}}\breve{H}_{\mathcal{A}\mathcal{Q}}^{*}+\breve{H}_{oo}\breve{H}_{\mathcal{A}o}^{*}+\breve{H}_{o\mathcal{B}}\breve{H}_{\mathcal{A}\mathcal{B}}^{*}+\breve{H}_{o\mathcal{A}}\breve{H}_{\mathcal{A}\mathcal{A}}^{*}+\breve{H}_{o\mathcal{Z}}\breve{H}_{\mathcal{A}\mathcal{Z}}^{*}\\
\Phi_{\breve{v}_{\mathcal{B}}\breve{v}_{\mathcal{A}}}&=\breve{H}_{\mathcal{B}\mathcal{Q}}\breve{H}_{\mathcal{A}\mathcal{Q}}^{*}+\breve{H}_{\mathcal{B}o}\breve{H}_{\mathcal{A}o}^{*}+\breve{H}_{\mathcal{B}\mathcal{B}}\breve{H}_{\mathcal{A}\mathcal{B}}^{*}+\breve{H}_{\mathcal{B}\mathcal{A}}\breve{H}_{\mathcal{A}\mathcal{A}}^{*}+\breve{H}_{\mathcal{B}\mathcal{Z}}\breve{H}_{\mathcal{A}\mathcal{Z}}^{*}
\end{split} \tag{46}
\]

If $\mathcal{A}$ and $\mathcal{B}$ satisfy Condition 2, then none of the white noise terms $e_{x}$, $x\in\mathcal{L}$, will be a confounding variable for the estimation problems $w_{\mathcal{A}}\rightarrow w_{\mathcal{Q}}$, $w_{\mathcal{A}}\rightarrow w_{o}$ or $w_{\mathcal{A}}\rightarrow w_{\mathcal{B}}$. Then it follows from Lemma 1 that all of the terms in (46) are zero. As a result we can write the spectrum in equation (45) as

\[
\Phi_{\breve{v}}=\begin{bmatrix}\Phi_{\breve{v}_{\mathcal{Q}}}&\Phi_{\breve{v}_{\mathcal{Q}}\breve{v}_{o}}&\Phi_{\breve{v}_{\mathcal{Q}}\breve{v}_{\mathcal{B}}}&0\\
\Phi_{\breve{v}_{\mathcal{Q}}\breve{v}_{o}}^{*}&\Phi_{\breve{v}_{o}}&\Phi_{\breve{v}_{o}\breve{v}_{\mathcal{B}}}&0\\
\Phi_{\breve{v}_{\mathcal{Q}}\breve{v}_{\mathcal{B}}}^{*}&\Phi_{\breve{v}_{o}\breve{v}_{\mathcal{B}}}^{*}&\Phi_{\breve{v}_{\mathcal{B}}}&0\\
0&0&0&\Phi_{\breve{v}_{\mathcal{A}}}\end{bmatrix}. \tag{47}
\]
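To make the role of Lemma 1 concrete, a sketch of the argument for a single term (the precise statement is in Lemma 1): for every white noise source $e_{x}$, the absence of a confounding variable for $w_{\mathcal{A}}\rightarrow w_{\mathcal{Q}}$ implies that $e_{x}$ cannot drive both $\breve{v}_{\mathcal{Q}}$ and $\breve{v}_{\mathcal{A}}$, i.e. at least one of the transfers $\breve{H}_{\mathcal{Q}x}$, $\breve{H}_{\mathcal{A}x}$ is zero. Each product in the first expression of (46) then vanishes term by term,

\[
\breve{H}_{\mathcal{Q}x}\breve{H}_{\mathcal{A}x}^{*}=0,\qquad x\in\{\mathcal{Q},o,\mathcal{B},\mathcal{A},\mathcal{Z}\},
\]

and the same reasoning applies to the second and third expressions in (46).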

Then the spectral density $\Phi_{\breve{v}}$ has the unique spectral factorization [40]

\[
\Phi_{\breve{v}}=\begin{bmatrix}F_{11}\Lambda_{1}F_{11}^{*}&0\\ 0&F_{22}\Lambda_{2}F_{22}^{*}\end{bmatrix}=\tilde{H}\Lambda\tilde{H}^{*} \tag{48}
\]

where $\tilde{H}$ is of the form in (43), and monic, stable and minimum phase. $\Box$
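As a way to read (48): $F_{11}\Lambda_{1}F_{11}^{*}$ factorizes the upper-left $3\times 3$ block of (47), i.e. the joint spectrum of $(\breve{v}_{\mathcal{Q}},\breve{v}_{o},\breve{v}_{\mathcal{B}})$, while $F_{22}\Lambda_{2}F_{22}^{*}$ factorizes $\Phi_{\breve{v}_{\mathcal{A}}}$, so that

\[
\tilde{H}=\begin{bmatrix}F_{11}&0\\ 0&F_{22}\end{bmatrix},\qquad \Lambda=\begin{bmatrix}\Lambda_{1}&0\\ 0&\Lambda_{2}\end{bmatrix},
\]

consistent with the block-diagonal structure of $\tilde{H}$ and $\Lambda$ in (43).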

Next we proceed with the proof of Theorem 1.

With Lemma 2 it follows that $\bar{G}_{ji}$ is given by either (41) or (42). For analysing these two expressions, we first specify $\breve{G}_{ji}$ and $\breve{G}_{jj}$. From (16), we have

\[
\breve{G}_{ji}=G_{ji}+G_{j\mathcal{Z}}(I-G_{\mathcal{Z}\mathcal{Z}})^{-1}G_{\mathcal{Z}i} \tag{49}
\]
\[
\breve{G}_{jj}=G_{jj}+G_{j\mathcal{Z}}(I-G_{\mathcal{Z}\mathcal{Z}})^{-1}G_{\mathcal{Z}j}, \tag{50}
\]

where the first terms on the right-hand sides reflect the direct connections from $w_{i}$ to $w_{j}$ (respectively from $w_{j}$ to $w_{j}$), and the second terms reflect the connections that pass only through nodes in $\mathcal{Z}$. By definition, $G_{jj}=0$, since the matrix $G$ of the network in (6) is hollow. Under the parallel path and loop condition (Condition 1), the second terms on the right-hand sides of (49) and (50) are zero, so that $\breve{G}_{ji}=G_{ji}$ and $\breve{G}_{jj}=0$.

What remains to be shown is that in (41) and (42), it holds that

\[
\check{H}_{j3}\breve{G}_{\mathcal{U}j}=\check{H}_{j3}\breve{G}_{\mathcal{U}i}=0 \tag{51}
\]

while additionally, for $i\in\mathcal{U}$, it should hold that

\[
\check{H}_{ji}=0. \tag{52}
\]

With definition (20) for $\check{H}$, and the special structure of $\tilde{H}_{13}$ and $\tilde{H}_{23}$ in (18) that is implied by the result (43) of Lemma 3, we can write

\[
\begin{bmatrix}\check{H}_{13}\\ \check{H}_{23}\end{bmatrix}=\begin{bmatrix}\tilde{H}_{\mathcal{Q}\mathcal{B}}&0\\ \tilde{H}_{o\mathcal{B}}&0\end{bmatrix}\begin{bmatrix}\tilde{H}_{\mathcal{B}\mathcal{B}}&0\\ 0&\tilde{H}_{\mathcal{A}\mathcal{A}}\end{bmatrix}^{-1}=\begin{bmatrix}\check{H}_{\mathcal{Q}\mathcal{B}}&0\\ \check{H}_{o\mathcal{B}}&0\end{bmatrix}, \tag{53}
\]

implying that the columns in this matrix related to inputs $k\in\mathcal{A}$ are zero.
In order to satisfy (52) we need the condition that if $i\in\mathcal{U}$ then $i\in\mathcal{A}$, which is equivalently formulated as $i\in\mathcal{Q}\cup\mathcal{A}$ (condition 2b).
In order to satisfy (51) we note that $\check{H}_{j3}$ is a row vector, of which the second part (the columns related to signals in $\mathcal{A}$) is equal to $0$, according to (53). Consequently, (51) is satisfied if for every $k\in\mathcal{B}$ it holds that $\breve{G}_{kj}=\breve{G}_{ki}=0$. On the basis of (16), this condition is satisfied if for every $w_{k}\in w_{\mathcal{B}}$ there exist no direct paths, nor paths through unmeasured nodes only, from $w_{i}$ to $w_{k}$ or from $w_{j}$ to $w_{k}$ (condition 2c). $\Box$

Appendix C Proof of Theorem 2

Expression (8) can be written as

\[
w_{\mathcal{Y}}=\bar{G}^{o}w_{\mathcal{D}}+\bar{H}^{o}\xi_{\mathcal{Y}}.
\]

Substituting this into the expression for the prediction error (9) leads to

\[
\varepsilon(t,\theta):=\bar{H}(q,\theta)^{-1}\left[\Delta\bar{G}(q,\theta)w_{\mathcal{D}}+\Delta\bar{H}(q,\theta)\xi_{\mathcal{Y}}\right]+\xi_{\mathcal{Y}} \tag{54}
\]

where $\Delta\bar{G}(q,\theta)=\bar{G}^{o}-\bar{G}(q,\theta)$ and $\Delta\bar{H}(q,\theta)=\bar{H}^{o}-\bar{H}(q,\theta)$. The proof of consistency involves two steps:

  1. showing that $\bar{\mathbb{E}}\,\varepsilon^{T}(t,\theta)W\varepsilon(t,\theta)$ achieves its minimum for $\Delta\bar{G}(\theta)=0$ and $\Delta\bar{H}(\theta)=0$;

  2. showing the conditions under which this minimum is unique.

Step 1: With Proposition 1 it follows that our data generating system can always be written in the form (7), such that $w_{m}=T(q)\xi_{m}$. We denote by $T_{1}$ the matrix composed of the first and third (block) rows of $T$, such that $w_{\mathcal{D}}=T_{1}(q)\xi_{m}$. Substituting this into (54) gives

\[
\varepsilon(t,\theta):=\bar{H}(q,\theta)^{-1}\left[\Delta\bar{G}(q,\theta)T_{1}+\begin{bmatrix}\Delta\bar{H}(\theta)&0\end{bmatrix}\right]\xi_{m}+\xi_{\mathcal{Y}},
\]

where $\xi_{m}$ is (block) structured as $[\xi_{\mathcal{Y}}^{\top}\ \xi_{\mathcal{U}}^{\top}]^{\top}$.
In order to prove that the minimum of $\bar{\mathbb{E}}\left[\varepsilon^{T}(t,\theta)W\varepsilon(t,\theta)\right]$ is attained for $\Delta\bar{G}(\theta)=0$ and $\Delta\bar{H}(\theta)=0$, it is sufficient to show that

\[
\left[\Delta\bar{G}(\theta)T_{1}(q)+\begin{bmatrix}\Delta\bar{H}(\theta)&0&0\end{bmatrix}\right]\xi_{m}(t) \tag{55}
\]

is uncorrelated to $\xi_{\mathcal{Y}}(t)$. In order to show this, let $\mathcal{F}_{n}=\mathcal{U}\backslash\mathcal{F}$, with $\mathcal{F}$ as defined in the theorem, while we decompose $\xi_{m}$ according to $\xi_{m}=[\,\xi_{\mathcal{Y}}^{\top}\ \ \xi_{\mathcal{F}}^{\top}\ \ \xi_{\mathcal{F}_{n}}^{\top}\,]^{\top}$. Using a similar block-structure notation for $\Delta\bar{G}$, $T$ and $\Delta\bar{H}$, (55) can then be written as

\[
\begin{split}
&\left(\Delta\bar{G}_{\mathcal{Y}\mathcal{Q}}(\theta)T_{\mathcal{Q}\mathcal{Y}}+\Delta\bar{G}_{\mathcal{Y}\mathcal{F}}(\theta)T_{\mathcal{F}\mathcal{Y}}+\Delta\bar{G}_{\mathcal{Y}\mathcal{F}_{n}}(\theta)T_{\mathcal{F}_{n}\mathcal{Y}}+\Delta\bar{H}_{\mathcal{Y}\mathcal{Y}}(\theta)\right)\xi_{\mathcal{Y}}\\
&\ +\left(\Delta\bar{G}_{\mathcal{Y}\mathcal{Q}}(\theta)T_{\mathcal{Q}\mathcal{F}}+\Delta\bar{G}_{\mathcal{Y}\mathcal{F}}(\theta)T_{\mathcal{F}\mathcal{F}}+\Delta\bar{G}_{\mathcal{Y}\mathcal{F}_{n}}(\theta)T_{\mathcal{F}_{n}\mathcal{F}}\right)\xi_{\mathcal{F}}\\
&\ +\left(\Delta\bar{G}_{\mathcal{Y}\mathcal{Q}}(\theta)T_{\mathcal{Q}\mathcal{F}_{n}}+\Delta\bar{G}_{\mathcal{Y}\mathcal{F}}(\theta)T_{\mathcal{F}\mathcal{F}_{n}}+\Delta\bar{G}_{\mathcal{Y}\mathcal{F}_{n}}(\theta)T_{\mathcal{F}_{n}\mathcal{F}_{n}}\right)\xi_{\mathcal{F}_{n}}.
\end{split} \tag{56}
\]

Since, by definition, $\xi_{\mathcal{F}_{n}}(t)$ is statically uncorrelated to $\xi_{\mathcal{Y}}(t)$, the $\xi_{\mathcal{F}_{n}}$-dependent term in (56) cannot create any static correlation with $\xi_{\mathcal{Y}}(t)$. It then needs to be shown that the $\xi_{\mathcal{Y}}$- and $\xi_{\mathcal{F}}$-dependent terms in (56) all reflect strictly proper filters, i.e. that they each contain at least one delay.
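As a sketch of why at least one delay suffices: for any strictly proper filter $F(q)=\sum_{k\geq 1}f_{k}q^{-k}$ acting on a white noise vector $\xi(t)$, and under the standing assumption that the white noise processes are only (possibly) correlated at equal time instants,

\[
\bar{\mathbb{E}}\left[\big(F(q)\xi(t)\big)\,\xi_{\mathcal{Y}}^{\top}(t)\right]=\sum_{k\geq 1}f_{k}\,\bar{\mathbb{E}}\left[\xi(t-k)\,\xi_{\mathcal{Y}}^{\top}(t)\right]=0,
\]

since white noise samples at different time instants are uncorrelated; hence the filtered signal is statically uncorrelated with $\xi_{\mathcal{Y}}(t)$.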
$\Delta\bar{H}(\theta)$ is strictly proper since both $\bar{H}(\theta)$ and $\bar{H}^{o}$ are monic. Therefore, $\Delta\bar{H}_{\mathcal{Y}\mathcal{Y}}(\theta)$ will have at least one delay in each of its transfers.
If all paths from $w_{\mathcal{Y}\cup\mathcal{F}}$ to $w_{\mathcal{Y}}$ in the transformed network and in its parameterized model have at least one delay (as per Condition c in the theorem), then all terms $\Delta\bar{G}_{\mathcal{Y}\mathcal{Q}}(\theta)$ and $\Delta\bar{G}_{\mathcal{Y}\mathcal{F}}(\theta)$ will have a delay.
We then need to consider the two remaining terms, $\Delta\bar{G}_{\mathcal{Y}\mathcal{F}_{n}}(\theta)T_{\mathcal{F}_{n}\mathcal{Y}}$ and $\Delta\bar{G}_{\mathcal{Y}\mathcal{F}_{n}}(\theta)T_{\mathcal{F}_{n}\mathcal{F}}$. By the definition of $\Delta\bar{G}_{\mathcal{Y}\mathcal{F}_{n}}(\theta)$, each of these two terms can be represented as the sum of two contributions. $\bar{G}_{\mathcal{Y}\mathcal{F}_{n}}T_{\mathcal{F}_{n}\mathcal{Y}}$ and $\bar{G}_{\mathcal{Y}\mathcal{F}_{n}}T_{\mathcal{F}_{n}\mathcal{F}}$ represent paths from $w_{\mathcal{Y}}$ to $w_{\mathcal{Y}}$ and from $w_{\mathcal{F}}$ to $w_{\mathcal{Y}}$, respectively, in the transformed network, whereas $\bar{G}_{\mathcal{Y}\mathcal{F}_{n}}(\theta)T_{\mathcal{F}_{n}\mathcal{Y}}$ and $\bar{G}_{\mathcal{Y}\mathcal{F}_{n}}(\theta)T_{\mathcal{F}_{n}\mathcal{F}}$ are induced partly by the parameterized model and partly by the paths from $w_{\mathcal{Y}}$ to $w_{\mathcal{F}_{n}}$ and from $w_{\mathcal{F}}$ to $w_{\mathcal{F}_{n}}$, respectively, in the transformed network. According to Condition c of the theorem (the delay conditions), these transfer functions are strictly proper. This implies that (56) is statically uncorrelated to $\xi_{\mathcal{Y}}(t)$. Therefore we have $\bar{\mathbb{E}}\left[\varepsilon^{T}(t,\theta)W\varepsilon(t,\theta)\right]=\bar{\mathbb{E}}\left[\|\Delta X(\theta)\xi_{m}\|_{W}\right]+\bar{\mathbb{E}}\left[\xi_{\mathcal{Y}}^{\top}W\xi_{\mathcal{Y}}\right]$, where $\Delta X(\theta)=\bar{H}(\theta)^{-1}\left[\Delta\bar{G}(\theta)T_{1}(q)+\begin{bmatrix}\Delta\bar{H}(\theta)&0&0\end{bmatrix}\right]$. As a result, the minimum of $\bar{\mathbb{E}}\left[\varepsilon^{T}(t,\theta)W\varepsilon(t,\theta)\right]$, which is $\bar{\mathbb{E}}\left[\xi_{\mathcal{Y}}^{\top}W\xi_{\mathcal{Y}}\right]$, is achieved for $\Delta\bar{G}(\theta)=0$ and $\Delta\bar{H}(\theta)=0$.

Step 2: When the minimum is achieved, $\bar{\mathbb{E}}\left[\|\Delta X(\theta)\xi_{m}\|_{W}\right]$ is zero. From (54) we have $\Delta X(\theta)\xi_{m}=\bar{H}(q,\theta)^{-1}\begin{bmatrix}\Delta\bar{G}(q,\theta)&\Delta\bar{H}(q,\theta)\end{bmatrix}\begin{bmatrix}w_{\mathcal{D}}^{\top}&\xi_{\mathcal{Y}}^{\top}\end{bmatrix}^{\top}$. Using the expression for $\xi_{o}$ from (8) and substituting it into the expression for $\Delta X(\theta)\xi_{m}$, we get $\Delta X(\theta)\xi_{m}=\bar{H}(q,\theta)^{-1}\begin{bmatrix}\Delta\bar{G}(q,\theta)&\Delta\bar{H}(q,\theta)\end{bmatrix}J\kappa(t)=\Delta x(\theta)J\kappa(t)$, where

\[
J=\begin{bmatrix}I&0&0\\ 0&I&0\\ -(\bar{H}_{oo})^{-1}\bar{G}_{o\mathcal{D}}&-(\bar{H}_{oo})^{-1}\bar{H}_{o\mathcal{Q}}&(\bar{H}_{oo})^{-1}\end{bmatrix};\qquad \bar{G}_{o\mathcal{D}}^{\top}=\begin{bmatrix}\bar{G}_{o\mathcal{Q}}^{\top}\\ \bar{G}_{o\mathcal{U}}^{\top}\end{bmatrix}.
\]

Writing $\bar{\mathbb{E}}\left[\|\Delta X(\theta)\xi_{m}\|_{W}\right]=\bar{\mathbb{E}}\left[\|\Delta x(\theta)J\kappa(t)\|_{W}\right]=0$ using Parseval's theorem in the frequency domain, we have

\[
\frac{1}{2\pi}\int_{-\pi}^{\pi}\Delta x(e^{j\omega},\theta)^{\top}J\,\Phi_{\kappa}(\omega)\,J^{*}\,\Delta x(e^{-j\omega},\theta)\,d\omega=0. \tag{57}
\]
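Note that, since $\Phi_{\kappa}(\omega)$ is a power spectral density, $J\Phi_{\kappa}(\omega)J^{*}$ is positive semi-definite at every frequency, so the integrand in (57) is non-negative:

\[
\Delta x(e^{j\omega},\theta)^{\top}J\,\Phi_{\kappa}(\omega)\,J^{*}\,\Delta x(e^{-j\omega},\theta)\;\geq\;0\qquad\text{for all }\omega.
\]

Consequently, (57) can only hold if this integrand vanishes for (almost) all $\omega$.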

The standard reasoning for showing uniqueness of the identification result is that if $\bar{\mathbb{E}}\left[\|\Delta X(\theta)\xi_{m}\|_{W}\right]$ equals $0$ (i.e. when the minimum power is achieved), this should imply that $\Delta\bar{G}(\theta)=0$ and $\Delta\bar{H}(\theta)=0$. Since $J$ is full rank and positive definite, this implication is fulfilled only if $\Phi_{\kappa}(\omega)>0$ for a sufficiently high number of frequencies. When condition 2 of Theorem 2 is satisfied, together with the other conditions in Theorem 1, it is ensured that the minimum value is achieved only for $\bar{G}(\theta)=\bar{G}^{0}$ and $\bar{H}(\theta)=\bar{H}^{0}$. $\Box$

Appendix D Proof of Proposition 2

The disturbances in the original network are characterized by $\breve{v}$ (Appendix A). From the results of Lemma 3 we can infer that the spectral density $\Phi_{\breve{v}}$ has the unique spectral factorization $\Phi_{\breve{v}}=\tilde{H}\Lambda\tilde{H}^{*}$, where $\tilde{H}$ is monic, stable, minimum phase, and of the form given in (43). Together with the form of $\Lambda$ in (43), it follows that $\xi_{\mathcal{A}}$ is uncorrelated with $\xi_{\mathcal{Y}}$. As a result, the set $\mathcal{A}$ satisfies the properties of $\mathcal{F}_{n}$, so that in Condition c we can replace $\mathcal{F}$ by $\mathcal{B}$. What remains to be shown is that the delay in path/loop conditions in the transformed network (8) can be reformulated as the same conditions on the original network (6). To this end we will need two lemmas.

Lemma 4

Consider a dynamic network as dealt with in Theorem 2, with reference to eq. (8), where a selection of node signals is decomposed into the sets $\mathcal{D}=\mathcal{Q}\cup\mathcal{U}$ and $\mathcal{Y}=\mathcal{Q}\cup\{o\}$, obtained after immersion of the nodes in $\mathcal{Z}$. Let $i$ be any element $i\in\mathcal{Y}\cup\mathcal{U}$, and let $k$ be any element $k\in\mathcal{Y}$.
If in the original network the direct path from $w_{i}$ to $w_{k}$, as well as all paths from $w_{i}$ to $w_{k}$ that pass through non-measured nodes only, have a delay, then $\bar{G}_{ki}$ is strictly proper.

Proof: We will show that $\bar{G}_{ki}$ is strictly proper if all such paths from $w_{i}$ to $w_{k}$ have a delay. For any $k\in\mathcal{Y}$, $i\in\mathcal{D}$, $\bar{G}_{ki}$ is given by either (41) or (42) with $j=k$. The situation that is not covered by (41)-(42) is the case $i=o$, but from (34) it follows that $\bar{G}_{ko}=0$ for $k\in\mathcal{Y}$, so for this situation strict properness is guaranteed.
We will now use (41) and (42) for $j$ given by any $k\in\mathcal{Y}$. In (41) and (42), $\check{H}_{k3}$ is given by the appropriate component of (20), which, by the fact that (18) is monic, implies that $\check{H}_{k3}$ is strictly proper. By the same reasoning this also holds for $\check{H}_{ki}$.
From (41) and (42) it then follows that strict properness of $\bar{G}_{ki}$ follows from strict properness of $\breve{G}_{ki}$, provided that the inverse expression $(I-\breve{G}_{kk}+\check{H}_{k3}\breve{G}_{\mathcal{U}k})^{-1}$ is proper. This latter condition is guaranteed by the fact that $\check{H}_{k3}$ is strictly proper and $\breve{G}_{kk}$ and $(I-\breve{G}_{kk})^{-1}$ are proper, as they reflect a module and a network transfer function in the immersed network [41, 30]. Finally, strict properness of $\breve{G}_{ki}$ follows from strict properness of $G_{ki}$ and the presence of a delay in all paths from $w_{i}$ to $w_{k}$ that pass through unmeasured nodes. $\Box$
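As a sketch of the properness argument for the inverse term: evaluating at $z\rightarrow\infty$, and using that $\check{H}_{k3}$ is strictly proper while $\breve{G}_{\mathcal{U}k}$ is proper,

\[
\lim_{z\rightarrow\infty}\left(I-\breve{G}_{kk}(z)+\check{H}_{k3}(z)\breve{G}_{\mathcal{U}k}(z)\right)=I-\breve{G}_{kk}(\infty),
\]

which is invertible because $(I-\breve{G}_{kk})^{-1}$ is proper, so that the inverse expression indeed has a finite limit at $z=\infty$.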

Lemma 5

Consider the transformed network, and let $j,k$ be any elements $j,k\in\mathcal{Y}\cup\mathcal{U}$. If in the original network all paths from $w_{k}$ to $w_{j}$ have a delay, then all paths from $w_{k}$ to $w_{j}$ in the transformed network have a delay.

Proof: This is proved using Lemma 3 in [15] and Lemma 4. Let $\bar{G}(\infty)$ denote $\lim_{z\rightarrow\infty}\bar{G}(z)$. From Lemma 4 we know that $\bar{G}_{jk}$ is strictly proper if all paths from $w_{k}$ to $w_{j}$ in the original network have a delay. Therefore,

\[
\bar{G}_{m}(\infty)=\begin{bmatrix}\ast&0\\ \ast&\ast\end{bmatrix}, \tag{58}
\]

where the $0$ represents $\bar{G}_{jk}(\infty)$. Using the inversion rule for block matrices, we have

\[
(I-\bar{G}_{m}(\infty))^{-1}=\begin{bmatrix}\ast&0\\ \ast&\ast\end{bmatrix}. \tag{59}
\]
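The preservation of the zero block under inversion follows from the standard identity for an invertible block lower-triangular matrix,

\[
\begin{bmatrix}A&0\\ C&D\end{bmatrix}^{-1}=\begin{bmatrix}A^{-1}&0\\ -D^{-1}CA^{-1}&D^{-1}\end{bmatrix},
\]

so the zero pattern of $I-\bar{G}_{m}(\infty)$ implied by (58) carries over to its inverse in (59).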

Considering (7) we can write $w_{m}=\bar{G}_{m}w_{m}+v_{m}$, where $v_{m}=\bar{H}_{m}\xi_{m}$. So we have $w_{m}=(I-\bar{G}_{m})^{-1}v_{m}$, where $(I-\bar{G}_{m})^{-1}$ represents the transfer from $v_{m}$ to $w_{m}$. The $0$ in (59) indicates that the transfer function from $v_{k}$ to $w_{j}$ has a delay. Since $v_{k}$ enters the network only at $w_{k}$, through a unit transfer function, all paths from $w_{k}$ to $w_{j}$ in the transformed network have a delay. $\Box$

We now look into the proof of Proposition 2. For this we need to generalize the result of Lemma 5 from scalar node signals to sets of node signals. If all existing paths/loops from $w_{\mathcal{Y}\cup\mathcal{F}}$ to $w_{\mathcal{Y}}$ in the original network have at least one delay, then all existing paths/loops from $w_{k}$, $k\in\mathcal{Y}\cup\mathcal{F}$, to $w_{j}$, $j\in\mathcal{Y}$, in the original network have at least one delay. Then, as a result of Lemma 5, all existing paths/loops from $w_{k}$, $k\in\mathcal{Y}\cup\mathcal{F}$, to $w_{j}$, $j\in\mathcal{Y}$, in the transformed network have at least one delay, which implies that all existing paths/loops from $w_{\mathcal{Y}\cup\mathcal{F}}$ to $w_{\mathcal{Y}}$ in the transformed network have at least one delay. Following the same reasoning, we can also show that if all existing paths from $w_{\mathcal{Y}\cup\mathcal{F}}$ to $w_{k}$, $k\in\mathcal{F}_{n}$, in the original network have at least one delay, then all existing paths from $w_{\mathcal{Y}\cup\mathcal{F}}$ to $w_{k}$, $k\in\mathcal{F}_{n}$, in the transformed network have at least one delay. $\Box$

Acknowledgment

The authors gratefully acknowledge discussions with and contributions from Arne Dankers, Giulio Bottegal and Harm Weerts on the initial research that led to the presented results.

References

  • [1] D. Materassi and M. Salapaka, “On the problem of reconstructing an unknown topology via locality properties of the Wiener filter,” IEEE Trans. Automatic Control, vol. 57, no. 7, pp. 1765–1777, 2012.
  • [2] B. Sanandaji, T. Vincent, and M. Wakin, “Exact topology identification of large-scale interconnected dynamical systems from compressive observations,” in Proc. American Control Conference (ACC), San Francisco, CA, USA, 2011, pp. 649–656.
  • [3] D. Materassi and G. Innocenti, “Topological identification in networks of dynamical systems,” IEEE Trans. Automatic Control, vol. 55, no. 8, pp. 1860–1871, 2010.
  • [4] A. Chiuso and G. Pillonetto, “A Bayesian approach to sparse dynamic network identification,” Automatica, vol. 48, no. 8, pp. 1553–1565, 2012.
  • [5] S. Shi, G. Bottegal, and P. M. J. Van den Hof, “Bayesian topology identification of linear dynamic networks,” in Proc. 2019 European Control Conference, Naples, Italy, 2019, pp. 2814–2819.
  • [6] A. Haber and M. Verhaegen, “Subspace identification of large-scale interconnected systems,” IEEE Transactions on Automatic Control, vol. 59, no. 10, pp. 2754–2759, 2014.
  • [7] P. Torres, J. W. van Wingerden, and M. Verhaegen, “Hierarchical PO-MOESP subspace identification for directed acyclic graphs,” Intern. J. Control, vol. 88, no. 1, pp. 123–137, 2015.
  • [8] H. H. M. Weerts, P. M. J. Van den Hof, and A. G. Dankers, “Prediction error identification of linear dynamic networks with rank-reduced noise,” Automatica, vol. 98, pp. 256–268, December 2018.
  • [9] ——, “Identification of dynamic networks operating in the presence of algebraic loops,” in Proc. 55th IEEE Conf. on Decision and Control (CDC). IEEE, 2016, pp. 4606–4611.
  • [10] M. Zorzi and A. Chiuso, “Sparse plus low rank network identification: a nonparametric approach,” Automatica, vol. 76, pp. 355–366, 2017.
  • [11] A. S. Bazanella, M. Gevers, J. M. Hendrickx, and A. Parraga, “Identifiability of dynamical networks: which nodes need to be measured?” in Proc. 56th IEEE Conference on Decision and Control (CDC), 2017, pp. 5870–5875.
  • [12] J. Gonçalves and S. Warnick, “Necessary and sufficient conditions for dynamical structure reconstruction of LTI networks,” IEEE Trans. Automatic Control, vol. 53, no. 7, pp. 1670–1674, Aug. 2008.
  • [13] H. H. M. Weerts, P. M. J. Van den Hof, and A. G. Dankers, “Identifiability of linear dynamic networks,” Automatica, vol. 89, pp. 247–258, March 2018.
  • [14] J. Hendrickx, M. Gevers, and A. Bazanella, “Identifiability of dynamical networks with partial node measurements,” IEEE Trans. Autom. Control, vol. 64, no. 6, pp. 2240–2253, 2019.
  • [15] P. M. J. Van den Hof, A. G. Dankers, P. S. C. Heuberger, and X. Bombois, “Identification of dynamic models in complex networks with prediction error methods - basic methods for consistent module estimates,” Automatica, vol. 49, no. 10, pp. 2994–3006, 2013.
  • [16] A. G. Dankers, P. M. J. Van den Hof, X. Bombois, and P. S. C. Heuberger, “Errors-in-variables identification in dynamic networks – consistency results for an instrumental variable approach,” Automatica, vol. 62, pp. 39–50, 2015.
  • [17] J. Linder and M. Enqvist, “Identification of systems with unknown inputs using indirect input measurements,” International Journal of Control, vol. 90, no. 4, pp. 729–745, 2017.
  • [18] K. R. Ramaswamy, G. Bottegal, and P. M. J. Van den Hof, “Local module identification in dynamic networks using regularized kernel-based methods,” in Proc. 57th IEEE Conf. on Decision and Control (CDC), Miami Beach, FL, 2018, pp. 4713–4718.
  • [19] N. Everitt, G. Bottegal, and H. Hjalmarsson, “An empirical Bayes approach to identification of modules in dynamic networks,” Automatica, vol. 91, pp. 144–151, 2018.
  • [20] M. Gevers, A. Bazanella, and G. Vian da Silva, “A practical method for the consistent identification of a module in a dynamical network,” IFAC-PapersOnLine, vol. 51-15, pp. 862–867, 2018.
  • [21] L. Ljung, System Identification: Theory for the User. Englewood Cliffs, NJ: Prentice-Hall, 1999.
  • [22] D. Materassi and M. Salapaka, “Identification of network components in presence of unobserved nodes,” in Proc. 2015 IEEE 54th Conf. Decision and Control, Osaka, Japan, 2015, pp. 1563–1568.
  • [23] A. G. Dankers, P. M. J. Van den Hof, P. S. C. Heuberger, and X. Bombois, “Identification of dynamic models in complex networks with prediction error methods: Predictor input selection,” IEEE Trans. on Automatic Control, vol. 61, no. 4, pp. 937–952, 2016.
  • [24] D. Materassi and M. V. Salapaka, “Signal selection for estimation and identification in networks of dynamic systems: a graphical model approach,” IEEE Trans. Automatic Control, vol. 65, no. 10, pp. 4138–4153, October 2020.
  • [25] P. M. J. Van den Hof, A. G. Dankers, and H. H. M. Weerts, “From closed-loop identification to dynamic networks: generalization of the direct method,” in Proc. 56th IEEE Conf. on Decision and Control (CDC). Melbourne, Australia: IEEE, 2017, pp. 5845–5850.
  • [26] A. G. Dankers, P. M. J. Van den Hof, D. Materassi, and H. H. M. Weerts, “Conditions for handling confounding variables in dynamic networks,” IFAC-PapersOnLine, vol. 50, no. 1, pp. 3983–3988, 2017, proc. 20th IFAC World Congress.
  • [27] P. M. J. Van den Hof, K. R. Ramaswamy, A. G. Dankers, and G. Bottegal, “Local module identification in dynamic networks with correlated noise: the full input case,” in Proc. 58th IEEE Conf. on Decision and Control (CDC), 2019, pp. 5494–5499.
  • [28] A. G. Dankers, “System identification in dynamic networks,” PhD dissertation, Delft University of Technology, 2014.
  • [29] J. Pearl, Causality: Models, Reasoning, and Inference. New York: Cambridge University Press, 2000.
  • [30] H. H. M. Weerts, J. Linder, M. Enqvist, and P. M. J. Van den Hof, “Abstractions of linear dynamic networks for input selection in local module identification,” Automatica, vol. 117, 2020.
  • [31] M. Gevers and A. S. Bazanella, “Identification in dynamic networks: identifiability and experiment design issues,” in Proc. 54th IEEE Conference on Decision and Control (CDC). IEEE, 2015, pp. 4005–4010.
  • [32] H. J. van Waarde, P. Tesi, and M. K. Camlibel, “Topological conditions for identifiability of dynamical networks with partial node measurements,” IFAC-PapersOnLine, vol. 51-23, pp. 319–324, 2018, proc. 7th IFAC Workshop on Distrib. Estim. and Control in Networked Systems.
  • [33] H. H. M. Weerts, P. M. J. Van den Hof, and A. G. Dankers, “Single module identifiability in linear dynamic networks,” in Proc. 57th IEEE Conf. on Decision and Control (CDC). Miami Beach, FL: IEEE, 2018, pp. 4725–4730.
  • [34] X. Cheng, S. Shi, and P. M. J. Van den Hof, “Allocation of excitation signals for generic identifiability of dynamic networks,” in Proc. 58th IEEE Conf. on Decision and Control (CDC), Nice, France, 2019, pp. 5507–5512.
  • [35] P. M. J. Van den Hof and K. R. Ramaswamy, “Path-based data-informativity conditions for single module identification in dynamic networks,” in Proc. 59th IEEE Conf. Decision and Control, Jeju Island, Republic of Korea, 2020, to appear.
  • [36] K. R. Ramaswamy, P. M. J. Van den Hof, and A. G. Dankers, “Generalized sensing and actuation schemes for local module identification in dynamic networks,” in Proc. 58th IEEE Conf. on Decision and Control (CDC), 2019, pp. 5519–5524.
  • [37] D. Kroening and O. Strichman, Decision Procedures - An Algorithmic Point of View, 2nd ed. Springer, 2016.
  • [38] H. H. M. Weerts, M. Galrinho, G. Bottegal, H. Hjalmarsson, and P. M. J. Van den Hof, “A sequential least squares algorithm for ARMAX dynamic network identification,” IFAC-PapersOnLine, vol. 51-15, pp. 844–849, 2018, proc. 18th IFAC Symp. System Identification.
  • [39] M. Galrinho, C. R. Rojas, and H. Hjalmarsson, “Parametric identification using weighted null-space fitting,” IEEE Trans. Automatic Control, vol. 64, no. 7, pp. 2798–2813, 2019.
  • [40] D. Youla, “On the factorization of rational matrices,” IRE Trans. Information Theory, vol. 7, pp. 172–189, 1961.
  • [41] N. Woodbury, A. Dankers, and S. Warnick, “Dynamic networks: representations, abstractions and well-posedness,” in Proc. 57th IEEE Conf. on Decision and Control, Miami Beach, FL, 2018, pp. 4719–4724.
Karthik Raghavan Ramaswamy was born in 1989. He received his Bachelor's degree in Electrical and Electronics Engineering (with Distinction) in 2011 from Anna University, and his Master's degree in Systems and Control (with great appreciation) from TU Eindhoven in 2017. From 2011 to 2015 he was a Control & Automation engineer at Larsen & Toubro. Currently, he is a PhD researcher with the Control Systems research group, Department of Electrical Engineering, TU Eindhoven, The Netherlands. His research interests are in the area of data-driven modeling, dynamic network identification and machine learning.
Paul Van den Hof received the M.Sc. and Ph.D. degrees in electrical engineering from Eindhoven University of Technology, Eindhoven, The Netherlands, in 1982 and 1989, respectively. In 1986 he moved to Delft University of Technology, where he was appointed as Full Professor in 1999. From 2003 to 2011, he was founding co-director of the Delft Center for Systems and Control (DCSC). As of 2011, he is a Full Professor in the Electrical Engineering Department, Eindhoven University of Technology. His research interests include data-driven modeling, identification for control, dynamic network identification, and model-based control and optimization, with applications in industrial process control systems and high-tech systems. He holds an ERC Advanced Research grant for a research project on identification in dynamic networks. Paul Van den Hof is an IFAC Fellow, an IEEE Fellow, an Honorary Member of the Hungarian Academy of Sciences, and an IFAC Advisor. He has been a member of the IFAC Council (1999–2005, 2017–2020), the Board of Governors of the IEEE Control Systems Society (2003–2005), and an Associate Editor and Editor of Automatica (1992–2005). In the triennium 2017–2020 he was Vice-President of IFAC.