
Greedy Sensor Selection for Weighted Linear Least Squares Estimation under Correlated Noise

Keigo Yamada, Yuji Saito, Taku Nonomura, Keisuke Asai
Department of Aerospace Engineering, Tohoku University, Sendai, Miyagi, Japan
Abstract

Optimization of sensor selection has been studied to monitor complex and large-scale systems with data-driven linear reduced-order modeling. An algorithm for greedy sensor selection is presented under the assumption of correlated noise in the sensor signals. A noise model is given using truncated modes in reduced-order modeling, and sensor positions that are optimal for generalized least squares estimation are selected. The determinant of the covariance matrix of the estimation error is minimized by efficient one-rank computations in both underdetermined and overdetermined problems. The present study also reveals that the objective function with correlated noise is neither submodular nor supermodular. Several numerical experiments are conducted using randomly generated data and real-world data. The results show the effectiveness of the selection algorithm in terms of accuracy in the estimation of the states of large-dimensional measurement data.

1 Introduction

Observation is the first step toward understanding real-world phenomena. When monitoring quantities that cannot be observed directly, system representations are constructed to describe the dynamical behavior of the phenomena of interest as a state space of physical equations including unknown parameters. Parameter estimation using sensor measurements has therefore long attracted attention in many engineering and scientific applications [63, 2, 28, 60, 29, 65]. A reduction in the number of measurements is concurrently demanded for practical use, especially under resource constraints on sensors and communication energy, or for processing measurements in real time. Optimization problems are formulated here using metrics defined on sensor locations. The Fisher information matrix is a widely used metric for assessing the uncertainty in parameter estimation, as this optimization task is closely related to the design of experiments [26, 4, 43]. Information-theoretic and statistical criteria have also been employed for the optimization task [9, 30, 15, 44, 54].

Sensor placement based on physical equations has been adopted in the reconstruction of physical fields, such as sound or seismic wave distributions [12, 13, 57]. With a similar formulation of the placement, sensor selection has been conducted by choosing the best subset of sensor nodes in the context of network monitoring and target tracking [46, 1]. Recently, advances in data-driven techniques have enabled us to obtain a system representation from astonishingly high-dimensional monitoring data for complex phenomena, with sensor nodes defined by each sampling point in the data [5, 53, 48, 7, 31, 8, 19]. Spatiotemporal correlations between measurements at sensor nodes are represented as a superposition of a limited number of bases, sometimes with a physical structure or robustness imposed on the system [3, 58, 52, 6]. The use of optimized measurements has been accelerating the application of data-driven modeling in several engineering fields, such as face recognition [33], inbetweening of non-time-resolved data [59], noise reduction [20], state estimation for air flow [24], wind orientation detection for vehicles [18], and source localization [25].

The main challenge of such optimization problems is their intractability, as the problems are often classified as nondeterministic polynomial-time hard. Therefore, heuristics for finding suboptimal solutions have been intensely discussed. For example, the selection problem is solved by linear convex relaxation methods [23, 42, 10] or by proximal gradient methods [11, 35]. The submodularity property of these optimization problems also encourages the use of greedy algorithms [40, 14, 16, 17, 49, 50, 51]. Some recent studies attempt to improve the performance of greedy algorithms by grouping and multiobjective strategies, which consider multiple sensor subsets simultaneously [22, 36, 39].

Despite these established methodologies, handling the complex structure of measurement noise still remains a great challenge for the sensor selection problem, as treated in recent studies [45, 21, 34, 47]. The discrepancy between measurements and the model should be treated as spatially correlated measurement noise, arising, for example, from the numerical computation of equations and assumptions in the modeling [27, 66], or from truncation in the model order reduction [64, 37]. These errors adversely affect the estimation in a correlated manner; thus, the evaluation of the measurement noise should be included in the sensor selection objective. As previously illustrated by Ucinski [61], a sensor selection problem with continuous relaxation loses convexity when the noise covariance term is introduced. Liu et al. [32] put forward a semidefinite programming formulation for the sensor selection problem with spatially correlated noise, but the calculation becomes prohibitive for large problem sizes. The greedy algorithm introduced in this study smoothly integrates the noise covariance into the formulation of Saito et al. [50], although a loss of submodularity is also confirmed. Another advantage of greedy methods is that they circumvent the rounding procedures for obtaining sensor positions from a relaxed solution, which remain a debatable process, especially given the nonconvexity of the optimized objective function.

The objective functions for the greedy selection are derived in both the overdetermined and underdetermined settings, which generalizes the previous D-optimality-based formulation [50]. The Fisher information matrix is defined in this work to evaluate the uncertainty of linear least squares estimation for a static system. The algorithm leverages one-rank computations for both a sensitivity term for each sensor and a weighting term for the measurement noise. Some recent studies introduced prior distributions for Bayesian estimation [32, 64], maximum a posteriori estimation [55], and Kalman filtering [67]. In practice, the hyperparameters in those distributions must be optimized using some information criteria or cross-validation techniques. The formulation in the present study excludes a prior distribution, because the optimization of hyperparameters is difficult for the high-dimensional data treated in subsection 3.2. In summary, we herein 1) propose an optimization problem for greedy sensor selection generalized for correlated measurement noise, which is easily extended to various optimality criteria, 2) confirm that the objective function is neither submodular nor supermodular, and 3) formulate a fast greedy algorithm that selects sensors optimized for both the underdetermined and overdetermined cases of weighted linear least squares estimation.

2 Formulation and Algorithm

This section describes problem settings for sensor selection tailored for weighted least squares estimation. Then, algorithms for greedy selection are discussed.

2.1 Sparse Sensing

A linear measurement equation for $p$ sensors $\mathbf{y}\in\mathbb{R}^{p}$ and a state vector of $r$ components $\mathbf{z}\in\mathbb{R}^{r}$ is corrupted by Gaussian noise $\mathbf{w}\sim\mathcal{N}(\mathbf{0}\,|\,\mathbf{R})\in\mathbb{R}^{p}$, which is independent of $\mathbf{z}$:

$$\mathbf{y} = \mathbf{C}\mathbf{z} + \mathbf{w}. \tag{1}$$

We assume that the sensor characteristic $\mathbf{C}\in\mathbb{R}^{p\times r}$ is known in advance and of full rank, and that the covariance of the measurement noise $\mathbf{R}\in\mathbb{R}^{p\times p}$ is symmetric positive definite. A parameter vector $\tilde{\mathbf{z}}$ is estimated from Eq. (1):

$$\tilde{\mathbf{z}} = \mathbf{C}^{\top}\left(\mathbf{C}\mathbf{C}^{\top}\right)^{-1}\mathbf{y} \quad (p\leq r), \tag{2a}$$
$$\tilde{\mathbf{z}} = \left(\mathbf{C}^{\top}\mathbf{R}^{-1}\mathbf{C}\right)^{-1}\mathbf{C}^{\top}\mathbf{R}^{-1}\mathbf{y} \quad (p>r). \tag{2b}$$

Note that Eq. (2a) corresponds to the minimum-norm solution, in which the measurement noise is not considered, although Eq. (2a) can be derived from a formulation that includes measurement noise. On the other hand, Eq. (2b) is the minimum-variance unbiased estimate that accounts for the measurement noise, i.e., the generalized least squares estimate [56, Section 4.5]. The present study focuses on the formulation above, excluding any prior distribution of the state variables.

We also assume a large number of possible measurement points, e.g., $\mathbf{x}\in\mathbb{R}^{n}\,(n\gg r)$. The linear coefficients and the noise covariance for all of the measurement points are expressed as $\mathbf{U}\in\mathbb{R}^{n\times r}$ and $\mathcal{R}\in\mathbb{R}^{n\times n}$, respectively. Actual calculations for these terms are introduced in subsection 2.2. With these notations, the full measurement is expressed by substituting $(\mathbf{y},\,\mathbf{C},\,\mathbf{R})\leftarrow(\mathbf{x},\,\mathbf{U},\,\mathcal{R})$. Here, the estimation of Eq. (2a) using all $n$ points is redundant if $r$ is small, and thus a reduced measurement with $p\ll n$ sensors is sufficient in terms of both estimation quality and calculation efficiency. A sensor indication matrix $\mathbf{H}(\mathcal{S}_{p})\in\mathbb{R}^{p\times n}$ is defined for a set $\mathcal{S}_{p}$ of $p$ sensor indices selected from $n$ candidates. The position of unity in the $i$-th row of $\mathbf{H}(\mathcal{S}_{p})$ is associated with the $i$-th component of $\mathcal{S}_{p}$, whereas the rest of the row is zero. Measurements and linear coefficients for the selected sensors are denoted as $\mathbf{y}(\mathcal{S}_{p})=\mathbf{H}(\mathcal{S}_{p})\mathbf{x}$ and $\mathbf{C}(\mathcal{S}_{p})=\mathbf{H}(\mathcal{S}_{p})\mathbf{U}$, respectively. In addition, the covariance matrix of the measurement noise is expressed as $\mathbf{R}(\mathcal{S}_{p})=\mathbf{H}(\mathcal{S}_{p})\mathcal{R}\mathbf{H}(\mathcal{S}_{p})^{\top}$. The argument $(\mathcal{S}_{p})$ will be denoted as the subscript $\circ_{\mathcal{S}_{p}}$ for brevity hereinafter.
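As an illustration of this notation, a minimal NumPy sketch follows; it builds $\mathbf{H}(\mathcal{S}_{p})$ and evaluates the estimators of Eqs. (2a) and (2b). The random $\mathbf{U}$ and $\mathcal{R}$ are placeholders for the data-driven quantities introduced in subsection 2.2, and all variable names are ours.

import numpy as np

rng = np.random.default_rng(0)
n, r, p = 1000, 10, 5                       # candidate points, modes, sensors

U = np.linalg.qr(rng.standard_normal((n, r)))[0]   # placeholder modes
A = rng.standard_normal((n, n))
Rfull = A @ A.T / n + np.eye(n)                    # placeholder SPD noise covariance

Sp = [3, 141, 592, 653, 889]                # a set of selected indices S_p
H = np.zeros((p, n))
H[np.arange(p), Sp] = 1.0                   # sensor indication matrix H(S_p)

C = H @ U                                   # C(S_p) = H(S_p) U
R = H @ Rfull @ H.T                         # R(S_p) = H(S_p) Rfull H(S_p)^T

x = U @ rng.standard_normal(r)              # a noiseless full state
y = H @ x + rng.multivariate_normal(np.zeros(p), R)   # noisy measurement

if p <= r:                                  # Eq. (2a): minimum-norm estimate
    z = C.T @ np.linalg.solve(C @ C.T, y)
else:                                       # Eq. (2b): generalized least squares
    RinvC = np.linalg.solve(R, C)
    z = np.linalg.solve(C.T @ RinvC, RinvC.T @ y)
x_rec = U @ z                               # low-rank reconstruction U z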

2.2 Data-driven modeling

In our implementation, the matrices $\mathbf{U}$ and $\mathcal{R}$ are generated by modal decomposition of the collected data matrix, a process known as principal component analysis or proper orthogonal decomposition [7, 33]. The collected data $\mathbf{X}=[\mathbf{x}_{1},\,\ldots,\,\mathbf{x}_{m}]$ are assumed to consist of $n$-point measurements in rows and $m$ instances in columns, where $n\gg m$. $\mathbf{X}$ is decomposed by singular value decomposition into $m$ orthonormal spatial modes $\mathbf{U}_{X}$, temporal modes $\mathbf{V}_{X}$, and a diagonal matrix of singular values $\mathbf{\Sigma}_{X}$. The approximation mode number $r$ is chosen so that the covariance of the original data matrix is retained at a high rate:

$$\mathbf{X} = \mathbf{U}_{X}\mathbf{\Sigma}_{X}\mathbf{V}_{X}^{\top} = \mathbf{U}\mathbf{\Sigma}\mathbf{V}^{\top} + \mathbf{U}_{N}\mathbf{\Sigma}_{N}\mathbf{V}_{N}^{\top}, \tag{3}$$

where $\mathbf{X},\mathbf{U}_{X}\in\mathbb{R}^{n\times m}$, $\mathbf{\Sigma}_{X},\mathbf{V}_{X}\in\mathbb{R}^{m\times m}$, $\mathbf{U}\in\mathbb{R}^{n\times r}$, $\mathbf{\Sigma}\in\mathbb{R}^{r\times r}$, and $\mathbf{V}\in\mathbb{R}^{m\times r}$. Here, the second term on the right-hand side, with the subscript $\circ_{N}$, is the portion excluded from the rank-$r$ representation and is thus regarded as the measurement noise. The $i$-th columns of $\mathbf{\Sigma}\mathbf{V}^{\top}$ and $\mathbf{H}_{\mathcal{S}_{p}}\mathbf{X}$ are the variable vector and the measurement in Eq. (1) for the $i$-th instance, respectively. Thus, one can immediately recover the low-rank representation of the large-scale measurement as $\tilde{\mathbf{x}}=\mathbf{U}\tilde{\mathbf{z}}$ by obtaining an estimate $\tilde{\mathbf{z}}$ [33]. With respect to $\mathcal{R}$, several approaches are capable of expressing the statistical behavior of the residual between the measurement and the reduced-order model, $\mathbf{X}-\mathbf{U}\mathbf{\Sigma}\mathbf{V}^{\top}$; these are exemplified by kernel functions used in signal processing or by data-driven modeling as in Ref. [64, Section 2]. By taking the expectation $\mathcal{R}=\mathbf{E}[\mathbf{w}\mathbf{w}^{\top}]$, the noise model in the latter manner becomes $\mathcal{R}=\mathbf{U}_{N}\mathbf{\Sigma}_{N}^{2}\mathbf{U}_{N}^{\top}$ from Eq. (3), which is used in section 3. Appendix A shows how the data-driven design of the measurement-noise covariance is affected from the standpoint of the correlation and the amount of training data.
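A sketch of the decomposition of Eq. (3) and of the truncated-mode noise covariance follows; the random $\mathbf{X}$ is a stand-in for collected snapshots, and forming the dense $n\times n$ matrix $\mathcal{R}$ explicitly is only viable for moderate $n$.

import numpy as np

rng = np.random.default_rng(1)
n, m, r = 2000, 100, 10                     # points, snapshots, retained modes
X = rng.standard_normal((n, m))             # placeholder data matrix (n >> m)

Ux, sx, Vxt = np.linalg.svd(X, full_matrices=False)   # X = U_X Sigma_X V_X^T
U, Un = Ux[:, :r], Ux[:, r:]                # retained modes / truncated modes
s, sn = sx[:r], sx[r:]

Z = np.diag(s) @ Vxt[:r]                    # state snapshots, Sigma V^T
Rfull = (Un * sn**2) @ Un.T                 # R = U_N Sigma_N^2 U_N^T

retained = np.sum(s**2) / np.sum(sx**2)     # variance ratio guiding the choice of r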

2.3 Objective function for sensor selection

Several criteria for sensor selection are available for scalar evaluation of the measurement design, such as the D-, E-, or A-optimality criteria mentioned in [62]. The performances of the D-, E-, and A-optimality criteria were previously compared for sensor sets obtained by greedy sensor selection methods suited to each criterion [38]. Sensor selection based on the D-optimality criterion performed well in terms of both computational time and the values of the other criteria, and is thus adopted in the present study. Note that the efficient implementation shown in subsection 2.4 can easily be extended to the A-optimality setting.

Geometrically, this optimization corresponds to the minimization of the volume of an ellipsoid, which represents the expected estimation error variance [23]:

$$\operatorname*{argmin}_{\mathcal{S}_{p}} \det\left(E\left[\left(\mathbf{z}-\tilde{\mathbf{z}}\right)\left(\mathbf{z}-\tilde{\mathbf{z}}\right)^{\top}\right]\right), \tag{4}$$

where the operator $E[\circ]$ denotes the expected value of its argument. Furthermore, this matrix is known to correspond to the inverse of the Fisher information matrix. This equality is easily confirmed under the assumption of Gaussian measurement noise: differentiating the log-likelihood $L=-\frac{1}{2}(\mathbf{y}-\mathbf{C}_{\mathcal{S}_{p}}\mathbf{z})^{\top}\mathbf{R}_{\mathcal{S}_{p}}^{-1}(\mathbf{y}-\mathbf{C}_{\mathcal{S}_{p}}\mathbf{z})+\textrm{const.}$ with respect to $\mathbf{z}$ gives the Fisher information matrix $\mathbf{C}_{\mathcal{S}_{p}}^{\top}\mathbf{R}_{\mathcal{S}_{p}}^{-1}\mathbf{C}_{\mathcal{S}_{p}}$, which matches the inverse of the error covariance of the estimate in Eq. (2b). The optimization returns the set of measurement points $\mathcal{S}_{p}$ out of all possible locations, although this is normally an intractable process. Instead, a greedy algorithm is employed, with objective functions for both $p\leq r$ and $p>r$; they are derived hereafter by generalizing the formulation of Ref. [38] to correlated measurement noise. The set of sensors is evaluated only in the observable subspace of $\mathbf{R}_{\mathcal{S}_{p}}^{-1/2}\mathbf{C}_{\mathcal{S}_{p}}$, since the measurement system of Eq. (1) is underdetermined when $p<r$. From Eq. (4), the subspace is separated by the projection $\mathbf{z}\rightarrow\mathbf{\xi}=\hat{\mathbf{V}}_{\mathbf{C}}^{\top}\mathbf{z}$ after singular value decomposition of $\mathbf{R}_{\mathcal{S}_{p}}^{-1/2}\mathbf{C}_{\mathcal{S}_{p}}$:

$$E\left[\left(\mathbf{\xi}-\tilde{\mathbf{\xi}}\right)\left(\mathbf{\xi}-\tilde{\mathbf{\xi}}\right)^{\top}\right] = \begin{cases} \sigma_{\textrm{n}}^{2}\,\hat{\mathbf{U}}_{\mathbf{C}}^{\top}\left(\mathbf{R}_{\mathcal{S}_{p}}^{-1/2}\mathbf{C}_{\mathcal{S}_{p}}\mathbf{C}_{\mathcal{S}_{p}}^{\top}\mathbf{R}_{\mathcal{S}_{p}}^{-1/2}\right)^{-1}\hat{\mathbf{U}}_{\mathbf{C}} & (p\leq r)\\[1ex] \sigma_{\textrm{n}}^{2}\,\hat{\mathbf{V}}_{\mathbf{C}}^{\top}\left(\mathbf{C}_{\mathcal{S}_{p}}^{\top}\mathbf{R}_{\mathcal{S}_{p}}^{-1}\mathbf{C}_{\mathcal{S}_{p}}\right)^{-1}\hat{\mathbf{V}}_{\mathbf{C}} & (p>r) \end{cases} \tag{7}$$

with some matrices of appropriate dimensions,

$$\mathbf{R}_{\mathcal{S}_{p}}^{-1/2}\mathbf{C}_{\mathcal{S}_{p}} = \begin{cases} \hat{\mathbf{U}}_{\mathbf{C}}\left[\begin{array}{cc}\hat{\mathbf{\Sigma}}_{\mathbf{C}} & \mathbf{0}\end{array}\right]\left[\begin{array}{c}\hat{\mathbf{V}}_{\mathbf{C}}^{1\top}\\ \hat{\mathbf{V}}_{\mathbf{C}}^{2\top}\end{array}\right] & (p\leq r)\\[1ex] \left[\begin{array}{cc}\hat{\mathbf{U}}_{\mathbf{C}}^{1} & \hat{\mathbf{U}}_{\mathbf{C}}^{2}\end{array}\right]\left[\begin{array}{c}\hat{\mathbf{\Sigma}}_{\mathbf{C}}\\ \mathbf{0}\end{array}\right]\hat{\mathbf{V}}_{\mathbf{C}}^{\top} & (p>r). \end{cases} \tag{16}$$

The evaluation of the error covariance in the observable subspace was recently introduced by Nakai et al. [38], and it is extended to the correlated-noise case in the present study for the first time, to the best of our knowledge. One can apply various metrics to the projected covariance matrix of Eq. (7), such as its determinant, trace, or minimum eigenvalue, as in Refs. [23, 38]. The determinant of the inverse matrices in Eq. (7) is maximized in the present manuscript:

$$\operatorname*{argmax}_{\mathcal{S}_{p}}\,\det\left(\mathbf{R}_{\mathcal{S}_{p}}^{-1}\mathbf{C}_{\mathcal{S}_{p}}\mathbf{C}_{\mathcal{S}_{p}}^{\top}\right) \quad (p\leq r), \tag{17a}$$
$$\operatorname*{argmax}_{\mathcal{S}_{p}}\,\det\left(\mathbf{C}_{\mathcal{S}_{p}}^{\top}\mathbf{R}_{\mathcal{S}_{p}}^{-1}\mathbf{C}_{\mathcal{S}_{p}}\right) \quad (p>r). \tag{17b}$$

Note that whitening all candidates with $\mathcal{R}^{-1}$ before sensor selection relies on an assumption of weakly correlated noise, because $\mathcal{R}$ contains the noise covariance over sensors that are not selected, as pointed out in [32]. In subsection 2.4, an algorithm is presented for achieving Eqs. (17a) and (17b) in a greedy manner.

2.4 Efficient greedy algorithm

Algorithm 1 shows the procedure implemented in the computations of section 3, which exploits the one-rank determinant lemma as in [32, 50, 64]. It is worth mentioning that the previously presented noise-ignoring algorithm of [50], Algorithm 2, is recovered by substituting an identity matrix for $\mathcal{R}$.

Algorithm 1 Determinant-based greedy algorithm with noise covariance matrix (DG/NC)
Input: $\mathbf{U}\in\mathbb{R}^{n\times r}$, $\mathcal{R}\in\mathbb{R}^{n\times n}$, $p>0$
Output: indices of the chosen $p$ sensor positions, $\mathcal{S}_{p}$
  $\mathcal{S}_{n}\leftarrow\{1,\ldots,n\}$, $\mathcal{S}_{0}\leftarrow\emptyset$
  for $k=1,\ldots,p$ do
    if $k\leq r$ then
      $i_{k}=\operatorname{argmax}_{i\,\in\,\mathcal{S}_{n}\backslash\mathcal{S}_{k-1}}\det\left(\mathbf{R}_{\mathcal{S}_{k-1}\cup i}^{-1}\mathbf{C}_{\mathcal{S}_{k-1}\cup i}\mathbf{C}_{\mathcal{S}_{k-1}\cup i}^{\top}\right)$  [evaluated via Eq. (19)]
    else
      $i_{k}=\operatorname{argmax}_{i\,\in\,\mathcal{S}_{n}\backslash\mathcal{S}_{k-1}}\det\left(\mathbf{C}_{\mathcal{S}_{k-1}\cup i}^{\top}\mathbf{R}_{\mathcal{S}_{k-1}\cup i}^{-1}\mathbf{C}_{\mathcal{S}_{k-1}\cup i}\right)$  [evaluated via Eq. (20)]
    end if
    $\mathcal{S}_{k}\leftarrow\mathcal{S}_{k-1}\cup\{i_{k}\}$
  end for

Algorithm 2 Determinant-based greedy algorithm (DG) [50]
Input: $\mathbf{U}\in\mathbb{R}^{n\times r}$, $p>0$
Output: indices of the chosen $p$ sensor positions, $\mathcal{S}_{p}$
  $\mathcal{S}_{n}\leftarrow\{1,\ldots,n\}$, $\mathcal{S}_{0}\leftarrow\emptyset$
  for $k=1,\ldots,p$ do
    if $k\leq r$ then
      $i_{k}=\operatorname{argmax}_{i\,\in\,\mathcal{S}_{n}\backslash\mathcal{S}_{k-1}}\det\left(\mathbf{C}_{\mathcal{S}_{k-1}\cup i}\mathbf{C}_{\mathcal{S}_{k-1}\cup i}^{\top}\right)$
    else
      $i_{k}=\operatorname{argmax}_{i\,\in\,\mathcal{S}_{n}\backslash\mathcal{S}_{k-1}}\det\left(\mathbf{C}_{\mathcal{S}_{k-1}\cup i}^{\top}\mathbf{C}_{\mathcal{S}_{k-1}\cup i}\right)$
    end if
    $\mathcal{S}_{k}\leftarrow\mathcal{S}_{k-1}\cup\{i_{k}\}$
  end for

The equations are converted by the lemma shown later herein. First, consider the objective function when fewer sensors are deployed than the number of state variables ($p\leq r$):

$$\det\left(\mathbf{R}_{\mathcal{S}_{k-1}\cup i}^{-1}\mathbf{C}_{\mathcal{S}_{k-1}\cup i}\mathbf{C}_{\mathcal{S}_{k-1}\cup i}^{\top}\right) = \det\left(\mathbf{R}_{\mathcal{S}_{k-1}\cup i}^{-1}\right)\det\left(\mathbf{C}_{\mathcal{S}_{k-1}\cup i}\mathbf{C}_{\mathcal{S}_{k-1}\cup i}^{\top}\right)$$
$$= \frac{\mathbf{u}_{i}\mathbf{u}_{i}^{\top}-\mathbf{u}_{i}\mathbf{C}_{\mathcal{S}_{k-1}}^{\top}\left(\mathbf{C}_{\mathcal{S}_{k-1}}\mathbf{C}_{\mathcal{S}_{k-1}}^{\top}\right)^{-1}\mathbf{C}_{\mathcal{S}_{k-1}}\mathbf{u}_{i}^{\top}}{\left(t_{i}-\mathbf{s}_{k(i)}\mathbf{R}_{\mathcal{S}_{k-1}}^{-1}\mathbf{s}_{k(i)}^{\top}\right)\det\left[\left(\mathbf{R}_{\mathcal{S}_{k-1}}^{-1}\mathbf{C}_{\mathcal{S}_{k-1}}\mathbf{C}_{\mathcal{S}_{k-1}}^{\top}\right)^{-1}\right]}, \tag{18}$$

where the subscript $k(i)$ represents the component produced by the $i$-th sensor candidate in the $k$-th step:

$$\mathbf{C}_{\mathcal{S}_{k-1}\cup i} = \left(\begin{array}{c}\mathbf{C}_{\mathcal{S}_{k-1}}\\ \mathbf{u}_{i}\end{array}\right), \qquad \mathbf{R}_{\mathcal{S}_{k-1}\cup i} = \left(\begin{array}{cc}\mathbf{R}_{\mathcal{S}_{k-1}} & \mathbf{s}_{k(i)}^{\top}\\ \mathbf{s}_{k(i)} & t_{i}\end{array}\right).$$

Here, $\mathbf{u}_{i}$ is the $i$-th row of $\mathbf{U}$, while $\mathbf{s}_{k(i)}$ is the noise covariance between the sensors selected in the previous steps and the $i$-th candidate, and $t_{i}$ is the noise variance of the $i$-th candidate. The algorithm avoids expensive determinant computations by separating the components of the already obtained sensors from the objective function in the current selection step of Eq. (18):

$$i_{k} = \operatorname*{argmax}_{i\,\in\,\mathcal{S}_{n}\backslash\mathcal{S}_{k-1}}\,\det\left(\mathbf{R}_{\mathcal{S}_{k-1}\cup i}^{-1}\right)\det\left(\mathbf{C}_{\mathcal{S}_{k-1}\cup i}\mathbf{C}_{\mathcal{S}_{k-1}\cup i}^{\top}\right) = \operatorname*{argmax}_{i\,\in\,\mathcal{S}_{n}\backslash\mathcal{S}_{k-1}}\,\frac{\mathbf{u}_{i}\left(\mathbf{I}-\mathbf{C}_{\mathcal{S}_{k-1}}^{\top}\left(\mathbf{C}_{\mathcal{S}_{k-1}}\mathbf{C}_{\mathcal{S}_{k-1}}^{\top}\right)^{-1}\mathbf{C}_{\mathcal{S}_{k-1}}\right)\mathbf{u}_{i}^{\top}}{t_{i}-\mathbf{s}_{k(i)}\mathbf{R}_{\mathcal{S}_{k-1}}^{-1}\mathbf{s}_{k(i)}^{\top}}, \tag{19}$$

and then a unit vector $\mathbf{e}_{i_{k}}\in\mathbb{R}^{1\times n}$, of which only the $i_{k}$-th entry is unity, is appended as the $k$-th row of $\mathbf{H}$. Note that Eq. (19) corresponds to the maximization of the gain in the objective when an arbitrary sensor is added to the sensor set of the previous step. The numerator of Eq. (19) is a squared $\ell_{2}$ norm (of $\mathbf{u}_{i}^{\top}$ projected onto the orthogonal complement of the rows of $\mathbf{C}_{\mathcal{S}_{k-1}}$), and the denominator is positive because the covariance matrix $\mathbf{R}_{\mathcal{S}_{k-1}}$ is assumed to be positive definite. Subsequently, the objective function is modified for the case in which more sensors than the number of state variables have already been determined:

$$i_{k} = \operatorname*{argmax}_{i\,\in\,\mathcal{S}_{n}\backslash\mathcal{S}_{k-1}}\,\det\left(\mathbf{C}_{\mathcal{S}_{k-1}\cup i}^{\top}\mathbf{R}_{\mathcal{S}_{k-1}\cup i}^{-1}\mathbf{C}_{\mathcal{S}_{k-1}\cup i}\right) = \operatorname*{argmax}_{i\,\in\,\mathcal{S}_{n}\backslash\mathcal{S}_{k-1}}\,\frac{\mathbf{\phi}_{(i)}\left(\mathbf{C}_{\mathcal{S}_{k-1}}^{\top}\mathbf{R}_{\mathcal{S}_{k-1}}^{-1}\mathbf{C}_{\mathcal{S}_{k-1}}\right)^{-1}\mathbf{\phi}_{(i)}^{\top}}{t_{i}-\mathbf{s}_{k(i)}\mathbf{R}_{\mathcal{S}_{k-1}}^{-1}\mathbf{s}_{k(i)}^{\top}}, \tag{20}$$

where $\mathbf{\phi}_{(i)}=\mathbf{s}_{k(i)}\mathbf{R}_{\mathcal{S}_{k-1}}^{-1}\mathbf{C}_{\mathcal{S}_{k-1}}-\mathbf{u}_{i}$. The ratio in Eq. (20) is positive, and the objective function, Eq. (17b), therefore increases monotonically. Details of the expansion are found in Ref. [64].
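The full selection loop can be sketched as below. This is an illustrative NumPy implementation of Algorithm 1 built on the scores of Eqs. (19) and (20), not the authors' original code; it recomputes the selected-set inverses at every step for clarity rather than updating them one-rank-wise, and it assumes that the covariance submatrices of the selected sets remain positive definite.

import numpy as np

def dgnc(U, Rfull, p):
    """Greedy DG/NC selection (Algorithm 1), scoring candidates with the
    one-rank gains of Eq. (19) (k <= r) and Eq. (20) (k > r).
    U: (n, r) mode matrix; Rfull: (n, n) noise covariance; p: sensor count."""
    n, r = U.shape
    S = []                                     # selected indices, S_{k-1}
    for k in range(1, p + 1):
        den = np.diag(Rfull).copy()            # t_i for every candidate i
        if S:
            Cs = U[S]                          # C_{S_{k-1}}
            Rinv = np.linalg.inv(Rfull[np.ix_(S, S)])
            W = Rfull[:, S] @ Rinv             # rows are s_{k(i)} R_S^{-1}
            den -= np.einsum('ij,ij->i', W, Rfull[:, S])
        if k <= r:                             # Eq. (19): underdetermined phase
            P = (Cs.T @ np.linalg.solve(Cs @ Cs.T, Cs)
                 if S else np.zeros((r, r)))   # projection onto rows of C_S
            num = np.einsum('ij,ij->i', U, U @ (np.eye(r) - P))
        else:                                  # Eq. (20): overdetermined phase
            Phi = W @ Cs - U                   # rows are phi_(i)
            Minv = np.linalg.inv(Cs.T @ Rinv @ Cs)
            num = np.einsum('ij,ij->i', Phi @ Minv, Phi)
        den[S] = 1.0                           # avoid 0/0 at chosen sensors
        score = num / den
        score[S] = -np.inf                     # never reselect a sensor
        S.append(int(np.argmax(score)))
    return S

Passing Rfull = np.eye(n) reduces the scores to those of the noise-ignoring Algorithm 2 (DG), up to ties.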

The computational costs of the algorithms are listed in Table 1:

(a) Without correlated measurement noise
  Greedy algorithm [50]: $\mathcal{O}(pnr^{2})$
  Alternating direction method of multipliers [35]: $\mathcal{O}(nr^{2})$ per iteration
  Newton method [23]: $\mathcal{O}(n^{3})$ per iteration
  Newton method with randomized subspace [42]: $\mathcal{O}(\tilde{n}^{3})$ per iteration $(\tilde{n}<n)$

(b) With correlated measurement noise
  Alternating direction method of multipliers [37]: $\mathcal{O}(nm^{2})$ per iteration
  Semidefinite programming [32]: $\mathcal{O}(n^{4.5})$ per iteration
  Greedy algorithm (proposed): $\mathcal{O}(pnm^{2})$

Table 1: Computational complexity of sensor selection algorithms using various optimization metrics.

The objective function loses submodularity if the measurement noise at different sensor positions is strongly correlated (i.e., when the off-diagonal components of $\mathbf{R}$ are not sufficiently small compared with the diagonal components). The following example provides a nonsubmodular and nonsupermodular case for Eq. (17b). For simplicity, the spatial mode $\mathbf{U}\in\mathbb{R}^{3}$ and the noise covariance $\mathcal{R}\in\mathbb{R}^{3\times 3}$ are set as follows; the noise components for $i=2,3$ are strongly correlated, whereas those for $i=1$ are relatively independent:

$$\mathbf{U} = \left(\begin{array}{c}0.1\\ 1\\ 1\end{array}\right), \qquad \mathcal{R} = \left(\begin{array}{ccc}1 & -0.1 & 0.1\\ -0.1 & 0.8 & 0.7\\ 0.1 & 0.7 & 2\end{array}\right). \tag{27}$$

With these matrices, the marginal gains of the determinant function Eq. (17b) are

$$f_{\{1,2\}}-f_{\{1\}} = 1.2913 > f_{\{1,2,3\}}-f_{\{1,3\}} = 0.8038,$$
$$f_{\{1,3\}}-f_{\{3\}} = 0.0025 < f_{\{1,2,3\}}-f_{\{2,3\}} = 0.0450,$$

where $f_{\mathcal{S}}\,(\mathcal{S}\in 2^{\{1,2,3\}})$ refers to the value of the determinant in Eq. (17b) for a subset $\mathcal{S}$ of selected sensors. This example immediately shows that the objective function of Eq. (17b) has neither submodularity nor supermodularity, whereas submodularity holds in the case of equally distributed uncorrelated measurement noise [50]. Thus, a greedy method for the sensor selection problem of Eq. (17b) generally has no performance guarantee based on submodularity or supermodularity.
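This counterexample can be checked numerically in a few lines; in the sketch below, f evaluates Eq. (17b) directly, with zero-based indices 0, 1, 2 standing for sensors 1, 2, 3.

import numpy as np

U = np.array([[0.1], [1.0], [1.0]])                 # single mode, r = 1
R = np.array([[ 1.0, -0.1, 0.1],
              [-0.1,  0.8, 0.7],
              [ 0.1,  0.7, 2.0]])

def f(S):
    """Objective of Eq. (17b), det(C_S^T R_S^{-1} C_S), for a subset S."""
    S = list(S)
    C, Rs = U[S], R[np.ix_(S, S)]
    return np.linalg.det(C.T @ np.linalg.solve(Rs, C))

print(f((0, 1)) - f((0,)), f((0, 1, 2)) - f((0, 2)))   # 1.2913 > 0.8038
print(f((0, 2)) - f((2,)), f((0, 1, 2)) - f((1, 2)))   # 0.0025 < 0.0450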

3 Results

This section describes numerical experiments that validate the algorithm. First, data matrices are constructed from randomly generated orthonormal bases. Then, the NOAA-SST dataset [41] is used to demonstrate a practical application.

Figure 1: Estimation error comparison of two selection algorithms and two estimation methods, averaged over 2,000 tests. (Labels: DG and DG/NC are the previously presented and the proposed sensor selection algorithms; LS and GLS are ordinary linear least squares estimation and generalized least squares estimation considering noise correlation, respectively.)

3.1 Randomly generated data matrix

Generalized results are shown in this subsection. The problem considered here is as follows: a data matrix $\mathbf{X}$ is constructed as $\mathbf{X}=\mathbf{U}_{X}\mathbf{\Sigma}_{X}\mathbf{V}_{X}^{\top}$, where $\mathbf{U}_{X}$ and $\mathbf{V}_{X}$ are $5{,}000\times 100$ and $100\times 100$ orthonormal matrices, respectively, generated from appropriately sized matrices of standard normal random numbers, and $\mathbf{\Sigma}_{X}$ is a diagonal matrix with $\operatorname{diag}(\mathbf{\Sigma}_{X})=(1,\,0.99,\,\ldots,\,(101-j)/100,\,\ldots,\,0.01)$. The algorithms for sensor selection treat these matrices after dividing the first 10 columns into $\mathbf{U}$ and $\mathbf{V}$ and the remaining columns into $\mathbf{U}_{N}$ and $\mathbf{V}_{N}$; likewise, the first 10 diagonal components and the remaining ones are labeled $\mathbf{\Sigma}$ and $\mathbf{\Sigma}_{N}$, respectively. The reconstruction error $e$ is defined as $e=\|\mathbf{X}-\mathbf{U}\tilde{\mathbf{Z}}\|_{\mathrm{F}}/\|\mathbf{X}\|_{\mathrm{F}}$, where the series of estimates $\tilde{\mathbf{z}}$ of Eq. (2) is concatenated into $\tilde{\mathbf{Z}}$ and $\|\circ\|_{\mathrm{F}}$ denotes the Frobenius norm. Figure 1 shows the reconstruction results using the estimate with $p$ sensors and the $r$-dimensional reduced-order model of Eq. (3). Here, DG and DG/NC in the legend refer to the determinant-based greedy algorithm of Ref. [50] and to Algorithm 1 considering the noise covariance in the measurement, respectively, and LS and GLS refer to linear least squares estimation and generalized (weighted) least squares estimation using the noise covariance, respectively. Note that the plots for $p\leq r$ are calculated by the same estimator, Eq. (2a), and therefore the estimations with a small number of sensors are identical for GLS and LS for each selected sensor set. First, GLS estimation reduces the reconstruction error in the oversampled cases for both selection algorithms. The measurement noise is quite large here, and sensors from both selection methods exhibit comparable results under LS estimation. Second, the more sensors are deployed, the smaller the error reduction brought by GLS estimation becomes. This is partly because measurement with a large number of sensors suppresses outliers resulting from the correlated measurement noise; if far more sensors are available than estimated variables, the importance of correlation in the measurement noise might diminish.
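The experiment of this subsection can be reproduced in outline as follows, reusing the dgnc sketch from subsection 2.4; a smaller $n$ than in the text is used here so that the dense $n\times n$ covariance stays manageable, and the DG baseline corresponds to passing an identity covariance instead of Rfull.

import numpy as np

rng = np.random.default_rng(2)
n, m, r, p = 1000, 100, 10, 20                # smaller n than in the text

Ux = np.linalg.qr(rng.standard_normal((n, m)))[0]   # orthonormal spatial basis
Vx = np.linalg.qr(rng.standard_normal((m, m)))[0]   # orthonormal temporal basis
sx = (101 - np.arange(1, m + 1)) / 100.0            # 1, 0.99, ..., 0.01
X = Ux @ np.diag(sx) @ Vx.T

U, Un = Ux[:, :r], Ux[:, r:]                  # retained / truncated modes
Rfull = (Un * sx[r:]**2) @ Un.T               # noise covariance from truncation

S = dgnc(U, Rfull, p)                         # proposed DG/NC selection
C, Rs = U[S], Rfull[np.ix_(S, S)]
Y = X[S]                                      # sensor measurements, H X

RinvC = np.linalg.solve(Rs, C)                # GLS estimate, Eq. (2b) (p > r)
Z = np.linalg.solve(C.T @ RinvC, RinvC.T @ Y)
e = np.linalg.norm(X - U @ Z) / np.linalg.norm(X)
print(f"reconstruction error e = {e:.4f}")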

3.2 NOAA-SST

(a) Previous greedy algorithm with $\mathbf{U}$
(b) Presented greedy algorithm with $\mathbf{U}$ and $\mathcal{R}$
Figure 2: Fifteen sensors for the global distribution of sea surface temperature selected by the greedy algorithms, shown on color maps of the root mean square of the estimation error $\mathbf{U}\mathbf{\Sigma}\mathbf{V}^{\top}-\mathbf{U}\tilde{\mathbf{z}}$ using each sensor set. The sensors of the presented algorithm are more widely distributed and show better performance.

Here, we apply this strategy to sensor selection using large-dimensional climate data. A brief description of the NOAA-SST data is given in Table 2.

Table 2: Description of the SST data.
  Label: NOAA Optimum Interpolation (OI) SST V2 [41]
  Temporal coverage: weekly means from 1989/12/31 to 1999/12/12 (520 snapshots)
  Spatial coverage: 1.0 degree latitude $\times$ 1.0 degree longitude global grid ($n=44{,}219$ measurement points on the ocean)

Similar to subsection 3.1, orthonormal modes are prepared by performing SVD on the data matrix after subtracting the mean; the first 10 of the 520 modes are used to build the reduced-order model of the temperature distribution ($r=10$), and the remaining modes are used for the noise covariance matrix. Several sensor positions at which the noise amplitude is extremely low (smaller than 1% of the maximum RMS of the noise in this comparison) are eliminated from the candidate set $\mathcal{S}_{n}$ beforehand, as in [64]. The results in this subsection can be compared with those in Refs. [33], [64], and [50]. In Fig. 2, the positions of the sensors are represented by open circles on the colored maps, which illustrate the fluctuation of the estimation error using those sensors, namely $\mathbf{U}\mathbf{\Sigma}\mathbf{V}^{\top}-\mathbf{U}\tilde{\mathbf{z}}$. The difference in the sensor positions is remarkable: the proposed algorithm spreads the sensors and avoids neighboring sensors that might be affected by correlated measurement noise. The reduction in the estimation error is also recognizable by comparing the backgrounds.
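The candidate pruning mentioned above can be sketched as follows, assuming $\mathbf{U}$ and the full covariance were built from the truncated-mode decomposition of subsection 2.2; the function name and the threshold argument are ours, with the 1% default following the text.

import numpy as np

def prune_candidates(U, Rfull, rel_tol=0.01):
    """Drop candidate points whose noise RMS is below rel_tol of the maximum,
    as in the preprocessing of the NOAA-SST case."""
    rms_noise = np.sqrt(np.diag(Rfull))             # noise RMS per grid point
    candidates = np.flatnonzero(rms_noise > rel_tol * rms_noise.max())
    return candidates, U[candidates], Rfull[np.ix_(candidates, candidates)]

The greedy selection is then run on the pruned matrices, and the selected indices are mapped back to grid points through the returned candidate array.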

Figure 3 compares the results of estimation using the noise covariance information. Note that cross-validation is not conducted for this comparison, since the estimation error is hard to quantify because of the dynamics in the SST, which are partly extracted by the reduced-order modeling. In Appendix A, the covariance matrix of the measurement noise is characterized by the number of snapshots used to form it. The horizontal broken line in Fig. 3 shows the modeling error due to the low-rank representation of Eq. (3). The red plots show better performance for the sensors of the proposed algorithm than for those of the previous DG algorithm [50], owing to the noise covariance matrix in the sensor selection procedure. There are several differences in the trends of the plots compared with Fig. 1; e.g., the contribution of the sensors of the proposed algorithm is more significant than that of the GLS estimation. This is perhaps because of the weak amplitude of the higher-order modes, in addition to the similarity in the locations where the reduced-order phenomena and the measurement noise fluctuate greatly. The proposed algorithm, which involves the noise covariance, favors positions with less measurement noise. Therefore, accurate estimation is achieved even with linear least squares estimation, which contrasts with the errors for the sensors of the DG algorithm remaining relatively high.

Figure 3: Estimation error comparison of two selection algorithms and two estimation methods for the NOAA-SST data. (Labels: DG and DG/NC are the previously presented and the proposed sensor selection algorithms; LS and GLS are ordinary linear least squares estimation and generalized least squares estimation considering noise correlation, respectively.)

4 Conclusions

A greedy algorithm of sensor selection for generalized least squares estimation has been presented. A covariance matrix generated from the truncated modes of reduced-order modeling is applied, and a weighting matrix is built for the estimation. A specialized one-rank determinant lemma involving the covariance matrix turns the original optimization into a series of greedy scalar evaluations. In addition, the objective function is shown to be neither submodular nor supermodular. Numerical tests using two kinds of datasets were performed to assess the proposed determinant-based optimization method. The proposed algorithm selects less noisy sensor positions and results in stable estimation in the presence of measurement noise stemming from the truncated modes of the reduced-order modeling.

Acknowledgements

The present study was supported by JSPS KAKENHI (21J20671), JST ACT-X (JPMJAX20AD), JST CREST (JPMJCR1763) and JST FOREST (JPMJFR202C).

Appendix A Data-driven noise correlation in real-world data

In this section, numerical experiments are conducted and the data dependency of the data-driven modeling of the measurement noise is explored. The impact of the number of snapshots used is mainly investigated. Here, six-fold cross-validation is applied to 624 snapshots of the same NOAA-SST data used in subsection 3.2.

Figure 4: Comparison of the estimation error for the NOAA-SST data using 15 selected sensors: least squares estimation, and generalized estimation with noise-aware sensor sets determined using different numbers of snapshots. (Circles: average over the cross-validation folds and 50 resamplings; error bars: maximum and minimum.)

The procedure is summarized as follows:

  1. Save 624 snapshots in the memory of the computer.
  2. Calculate the reduced-order representation $\mathbf{U}$ by Eq. (3).
  3. Randomize the order of the snapshots and divide them into six parts.
  4. Sample the predetermined number of snapshots randomly from the 520 snapshots labeled as "training", then calculate $\mathcal{R}$.
  5. Determine 15 sensor positions using $\mathbf{U}$ and $\mathcal{R}$.
  6. Reconstruct the all-points measurement of the 104 snapshots labeled as "test" using the determined sensors and the corresponding $\mathbf{R}$.
  7. Store the reconstruction error.
  8. Resample the snapshots for calculating $\mathcal{R}$, then repeat items 4 to 6, 50 times.
  9. Exchange the roles of "training" and "test", then repeat items 4 to 8.

Here, the low-rank representation is fixed for all of the sampling cases, and changes in the reduced-order representation $\mathbf{U}$, which reflects the temperature dynamics, are excluded. An evaluation of the quality including the reduced-order model needs a more profound discussion, and this topic is thus left for future work. The calculation of the noise correlation matrix in item 4 above is carried out by taking $\mathcal{R}:=\mathbf{X}_{N}\mathbf{X}_{N}^{\top}$, where $\mathbf{X}_{N}=(\mathbf{I}-\mathbf{U}\mathbf{U}^{\top})\mathbf{X}$, with the notation of subsections 2.1 and 2.2.
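One fold of this procedure can be sketched as below; X (the $n\times 624$ snapshot matrix), $\mathbf{U}$, and the number of sampled snapshots are placeholders, and the commented lines reuse the earlier sketches for selection and reconstruction.

import numpy as np

def noise_covariance(X_train, U):
    """Item 4: R := X_N X_N^T with X_N = (I - U U^T) X_train."""
    Xn = X_train - U @ (U.T @ X_train)      # residual of the rank-r model
    return Xn @ Xn.T

rng = np.random.default_rng(3)
idx = rng.permutation(624)                  # item 3: shuffle the snapshots
train, test = idx[:520], idx[520:]          # one fold: 520 training, 104 test
for _ in range(50):                         # item 8: 50 resamplings
    sampled = rng.choice(train, size=200, replace=False)   # item 4
    # Rcal = noise_covariance(X[:, sampled], U)            # item 4: build R
    # S = dgnc(U, Rcal, p=15)                              # item 5: 15 sensors
    # reconstruct X[:, test] with Eq. (2b) and store the error (items 6-7)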

In Fig. 4, the results are summarized by the average, maximum, and minimum of the reconstruction error, with the number of snapshots used as training data on the abscissa. Of the two horizontal broken lines, the top one corresponds to the average reconstruction error over the six folds for the previous approach that only uses $\mathbf{U}$, and the bottom one to the modeling error of approximating the original snapshots by $r=10$ modes. The lack of training data for calculating $\mathcal{R}$ degrades the reconstruction at 20 snapshots, possibly because the weighting term for the measurement noise is not captured well. The presented method, however, performs better than the previous approach as the number of snapshots in the training data increases. For a further reduction in the reconstruction error, it seems effective to increase the number of sensors (as shown in section 3), or to incorporate the dynamics of the measured phenomena into the estimation when time-series snapshots are available.

References

  • [1] Charu C. Aggarwal, Amotz Bar-Noy, and Simon Shamoun. On sensor selection in linked information networks. Computer Networks, 126:100–113, 2017.
  • [2] Antonio A Alonso, Christos E Frouzakis, and Ioannis G Kevrekidis. Optimal sensor placement for state reconstruction of distributed process systems. AIChE Journal, 50(7):1438–1452, 2004.
  • [3] Peter J Baddoo, Benjamin Herrmann, Beverley J McKeon, J Nathan Kutz, and Steven L Brunton. Physics-informed dynamic mode decomposition (piDMD). arXiv preprint arXiv:2112.04307, 2021.
  • [4] RA Bates, RJ Buck, E Riccomagno, and HP Wynn. Experimental design and observation for large systems. Journal of the Royal Statistical Society: Series B (Methodological), 58(1):77–94, 1996.
  • [5] Gal Berkooz, Philip Holmes, and John L Lumley. The proper orthogonal decomposition in the analysis of turbulent flows. Annual review of fluid mechanics, 25(1):539–575, 1993.
  • [6] Matthew Brand. Incremental singular value decomposition of uncertain data with missing values. In European Conference on Computer Vision, pages 707–720. Springer, 2002.
  • [7] Steven L Brunton and J Nathan Kutz. Data-driven science and engineering: Machine learning, dynamical systems, and control. Cambridge University Press, 2019.
  • [8] Steven L Brunton, Joshua L Proctor, Jonathan H Tu, and J Nathan Kutz. Compressed sensing and dynamic mode decomposition. Journal of Computational Dynamics, 2(2):165, 2015.
  • [9] Sundeep Prabhakar Chepuri and Geert Leus. Sparse sensing for distributed detection. IEEE Transactions on Signal Processing, 64(6):1446–1460, 2015.
  • [10] Donald J Chmielewski, Tasha Palmer, and Vasilios Manousiouthakis. On the theory of optimal sensor placement. AIChE journal, 48(5):1001–1012, 2002.
  • [11] Neil K Dhingra, Mihailo R Jovanović, and Zhi-Quan Luo. An admm algorithm for optimal sensor and actuator selection. In 53rd IEEE Conference on Decision and Control, pages 4039–4044. IEEE, 2014.
  • [12] Kutluyıl Doğançay and Hatem Hmam. On optimal sensor placement for time-difference-of-arrival localization utilizing uncertainty minimization. In 2009 17th European Signal Processing Conference, pages 1136–1140. IEEE, 2009.
  • [13] Wan Du, Zikun Xing, Mo Li, Bingsheng He, Lloyd Hock Chye Chua, and Haiyan Miao. Optimal sensor placement and measurement of wind for water quality studies in urban reservoirs. In IPSN-14 Proceedings of the 13th International Symposium on Information Processing in Sensor Networks. IEEE, apr 2014.
  • [14] Uriel Feige, Vahab S Mirrokni, and Jan Vondrák. Maximizing non-monotone submodular functions. SIAM Journal on Computing, 40(4):1133–1153, 2011.
  • [15] Roman Garnett, Michael A Osborne, and Stephen J Roberts. Bayesian optimization for sensor set selection. In Proceedings of the 9th ACM/IEEE international conference on information processing in sensor networks, pages 209–219, 2010.
  • [16] Daniel Golovin and Andreas Krause. Adaptive submodularity: Theory and applications in active learning and stochastic optimization. Journal of Artificial Intelligence Research, 42:427–486, 2011.
  • [17] Abolfazl Hashemi, Mahsa Ghasemi, Haris Vikalo, and Ufuk Topcu. Randomized greedy sensor selection: Leveraging weak submodularity. IEEE Transactions on Automatic Control, 66(1):199–212, 2020.
  • [18] Ryoma Inoba, Kazuki Uchida, Yuto Iwasaki, Takayuki Nagata, Yuta Ozawa, Yuji Saito, Taku Nonomura, and Keisuke Asai. Optimization of sparse sensor placement for estimation of wind direction and surface pressure distribution using time-averaged pressure-sensitive paint data on automobile model. Journal of Wind Engineering and Industrial Aerodynamics, submitted.
  • [19] Tomoki Inoue, Tsubasa Ikami, Yasuhiro Egami, Hiroki Nagai, Yasuo Naganuma, Koichi Kimura, and Yu Matsuda. Data-driven optimal sensor placement for high-dimensional system using annealing machine. arXiv preprint arXiv:2205.05430, 2022.
  • [20] Tomoki Inoue, Yu Matsuda, Tsubasa Ikami, Taku Nonomura, Yasuhiro Egami, and Hiroki Nagai. Data-driven approach for noise reduction in pressure-sensitive paint data based on modal expansion and time-series data at optimally placed points. Physics of Fluids, 33(7):077105, 2021.
  • [21] Hadi Jamali-Rad, Andrea Simonetto, Geert Leus, and Xiaoli Ma. Sparsity-aware sensor selection for correlated noise. In 17th International Conference on Information Fusion (FUSION), pages 1–7. IEEE, 2014.
  • [22] C. Jiang, Z. Chen, R. Su, and Y. C. Soh. Group greedy method for sensor placement. IEEE Transactions on Signal Processing, 67(9):2249–2262, 2019.
  • [23] Siddharth Joshi and Stephen Boyd. Sensor selection via convex optimization. IEEE Transactions on Signal Processing, 57(2):451–462, 2009.
  • [24] Naoki Kanda, Kumi Nakai, Yuji Saito, Taku Nonomura, and Keisuke Asai. Feasibility study on real-time observation of flow velocity field using sparse processing particle image velocimetry. Transactions of the Japan Society for Aeronautical and Space Sciences, 64(4):242–245, 2021.
  • [25] Sayumi Kaneko, Yuta Ozawa, Kumi Nakai, Yuji Saito, Taku Nonomura, Keisuke Asai, and Hiroki Ura. Data-driven sparse sampling for reconstruction of acoustic-wave characteristics used in aeroacoustic beamforming. Applied Sciences, 11(9):4216, 2021.
  • [26] Rex K Kincaid and Sharon L Padula. D-optimal designs for sensor and actuator locations. Computers & Operations Research, 29(6):701–713, 2002.
  • [27] R Kirlin and L Dewey. Optimal delay estimation in a multiple sensor array having spatially correlated noise. IEEE transactions on acoustics, speech, and signal processing, 33(6):1387–1396, 1985.
  • [28] Toni Kraft, Arnaud Mignan, and Domenico Giardini. Optimization of a large-scale microseismic monitoring network in northern switzerland. Geophysical Journal International, 195(1):474–490, 2013.
  • [29] Andreas Krause, Jure Leskovec, Carlos Guestrin, Jeanne VanBriesen, and Christos Faloutsos. Efficient sensor placement optimization for securing large water distribution networks. Journal of Water Resources Planning and Management, 134(6):516–526, 2008.
  • [30] Andreas Krause, Ajit Singh, and Carlos Guestrin. Near-optimal sensor placements in gaussian processes: Theory, efficient algorithms and empirical studies. Journal of Machine Learning Research, 9(Feb):235–284, 2008.
  • [31] J Nathan Kutz, Steven L Brunton, Bingni W Brunton, and Joshua L Proctor. Dynamic mode decomposition: data-driven modeling of complex systems. SIAM, 2016.
  • [32] Sijia Liu, Sundeep Prabhakar Chepuri, Makan Fardad, Engin Maşazade, Geert Leus, and Pramod K Varshney. Sensor selection for estimation with correlated measurement noise. IEEE Transactions on Signal Processing, 64(13):3509–3522, 2016.
  • [33] Krithika Manohar, Bingni W Brunton, J Nathan Kutz, and Steven L Brunton. Data-driven sparse sensor placement for reconstruction: Demonstrating the benefits of exploiting known patterns. IEEE Control Systems Magazine, 38(3):63–86, 2018.
  • [34] Engin Masazade, Makan Fardad, and Pramod K Varshney. Sparsity-promoting extended kalman filtering for target tracking in wireless sensor networks. IEEE Signal Processing Letters, 19(12):845–848, 2012.
  • [35] Takayuki Nagata, Taku Nonomura, Kumi Nakai, Keigo Yamada, Yuji Saito, and Shunsuke Ono. Data-driven sparse sensor selection based on a-optimal design of experiment with admm. IEEE Sensors Journal, 2021.
  • [36] Takayuki Nagata, Keigo Yamada, Kumi Nakai, Yuji Saito, and Taku Nonomura. Randomized group-greedy method for data-driven sensor selection. IEEE Sensors Journal, submitted.
  • [37] Takayuki Nagata, Keigo Yamada, Taku Nonomura, Kumi Nakai, Yuji Saito, and Shunsuke Ono. Data-driven sensor selection method based on proximal optimization for high-dimensional data with correlated measurement noise. arXiv preprint arXiv:2205.06067, 2022.
  • [38] K. Nakai, K. Yamada, T. Nagata, Y. Saito, and T. Nonomura. Effect of objective function on data-driven greedy sparse sensor optimization. IEEE Access, 9:46731–46743, 2021.
  • [39] Kumi Nakai, Takayuki Nagata, Keigo Yamada, Yuji Saito, and Taku Nonomura. Nondominated-solution-based multiobjective-greedy sensor selection for optimal design of experiments. arXiv preprint arXiv:2204.12695, 2022.
  • [40] George L Nemhauser, Laurence A Wolsey, and Marshall L Fisher. An analysis of approximations for maximizing submodular set functions. Mathematical programming, 14(1):265–294, 1978.
  • [41] NOAA/OAR/ESRL. NOAA Optimum Interpolation (OI) Sea Surface Temperature (SST) V2.
  • [42] T. Nonomura, S. Ono, K. Nakai, and Y. Saito. Randomized subspace newton convex method applied to data-driven sensor selection problem. IEEE Signal Processing Letters, 28:284–288, 2021.
  • [43] S Padula, D Palumbo, and R Kincaid. Optimal sensor/actuator locations for active structural acoustic control. In 39th AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics, and Materials Conference and Exhibit, page 1865, 1998.
  • [44] Romain Paris, Samir Beneddine, and Julien Dandois. Robust flow control and optimal sensor placement using deep reinforcement learning. Journal of Fluid Mechanics, 913, 2021.
  • [45] Andrej Pázman and Werner G Müller. Optimal design of experiments subject to correlated errors. Statistics & probability letters, 52(1):29–34, 2001.
  • [46] M.S. Phatak. Recursive method for optimum gps satellite selection. IEEE Transactions on Aerospace and Electronic Systems, 37(2):751–754, apr 2001.
  • [47] Erik Rigtorp. Sensor selection with correlated noise. Master’s thesis, KTH Royal Institute of Technology, 2010.
  • [48] Clarence W Rowley, Tim Colonius, and Richard M Murray. Model reduction for compressible flows using pod and galerkin projection. Physica D: Nonlinear Phenomena, 189(1-2):115–129, 2004.
  • [49] Yuji Saito, Taku Nonomura, Koki Nankai, Keigo Yamada, Keisuke Asai, Yasuo Sasaki, and Daisuke Tsubakino. Data-driven vector-measurement-sensor selection based on greedy algorithm. IEEE Sensors Letters, 4, 2020.
  • [50] Yuji Saito, Taku Nonomura, Keigo Yamada, Kumi Nakai, Takayuki Nagata, Keisuke Asai, Yasuo Sasaki, and Daisuke Tsubakino. Determinant-based fast greedy sensor selection algorithm. IEEE Access, 9:68535–68551, 2021.
  • [51] Yuji Saito, Keigo Yamada, Naoki Kanda, Kumi Nakai, Takayuki Nagata, Taku Nonomura, and Keisuke Asai. Data-driven determinant-based greedy under/oversampling vector sensor placement. CMES-COMPUTER MODELING IN ENGINEERING & SCIENCES, 129(1):1–30, 2021.
  • [52] Isabel Scherl, Benjamin Strom, Jessica K Shang, Owen Williams, Brian L Polagye, and Steven L Brunton. Robust principal component analysis for modal decomposition of corrupt fluid flows. Physical Review Fluids, 5(5):054401, 2020.
  • [53] P. J. Schmid. Dynamic mode decomposition of numerical and experimental data. Journal of Fluid Mechanics, 656(July 2010):5–28, 2010.
  • [54] Richard Semaan. Optimal sensor placement using machine learning. Computers & Fluids, 159:167–176, 2017.
  • [55] Manohar Shamaiah, Siddhartha Banerjee, and Haris Vikalo. Greedy sensor selection: Leveraging submodularity. In 49th IEEE conference on decision and control (CDC), pages 2572–2577. IEEE, 2010.
  • [56] Steven M. Kay. Fundamentals of statistical signal processing: Estimation Theory, volume 10. PTR Prentice-Hall, Englewood Cliffs, NJ, USA, 1993.
  • [57] Mohammad J. Taghizadeh, Saeid Haghighatshoar, Afsaneh Asaei, Philip N. Garner, and Herve Bourlard. Robust microphone placement for source localization from noisy distance measurements. In 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, apr 2015.
  • [58] Naoya Takeishi, Yoshinobu Kawahara, Yasuo Tabei, and Takehisa Yairi. Bayesian dynamic mode decomposition. In IJCAI, pages 2814–2821, 2017.
  • [59] Jonathan H Tu, John Griffin, Adam Hart, Clarence W Rowley, Louis N Cattafesta, and Lawrence S Ukeiley. Integration of non-time-resolved piv and time-resolved velocity point sensors for dynamic estimation of velocity fields. Experiments in fluids, 54(2):1–20, 2013.
  • [60] Vasileios Tzoumas, Yuankun Xue, Sérgio Pequito, Paul Bogdan, and George J Pappas. Selecting sensors in biological fractional-order systems. IEEE Transactions on Control of Network Systems, 5(2):709–721, 2018.
  • [61] Dariusz Uciński. D-optimal sensor selection in the presence of correlated measurement noise. Measurement, 164:107873, 2020.
  • [62] Firdaus E Udwadia. Methodology for optimum sensor locations for parameter identification in dynamic systems. Journal of engineering mechanics, 120(2):368–390, 1994.
  • [63] Alain Vande Wouwer, Nicolas Point, Stephanie Porteman, and Marcel Remy. An approach to the selection of optimal sensor locations in distributed parameter systems. Journal of process control, 10(4):291–300, 2000.
  • [64] Keigo Yamada, Yuji Saito, Koki Nankai, Taku Nonomura, Keisuke Asai, and Daisuke Tsubakino. Fast greedy optimization of sensor selection in measurement with correlated noise. Mechanical Systems and Signal Processing, 158:107619, 2021.
  • [65] Ryoichi Yoshimura, Aiko Yakeno, Takashi Misaka, and Shigeru Obayashi. Application of observability gramian to targeted observation in wrf data assimilation. Tellus A: Dynamic Meteorology and Oceanography, 72(1):1–11, 2020.
  • [66] Jing Yu, Victor M Zavala, and Mihai Anitescu. A scalable design of experiments framework for optimal sensor placement. Journal of Process Control, 67:44–55, 2018.
  • [67] Armin Zare and Mihailo R Jovanović. Optimal sensor selection via proximal optimization algorithms. In 2018 IEEE Conference on Decision and Control (CDC), pages 6514–6518. IEEE, 2018.