
Coarse reduced model selection for nonlinear state estimation

J.A. Nichols
(August 20, 2025)
Abstract

State estimation is the task of approximately reconstructing a solution $u$ of a parametric partial differential equation when the parameter vector $y$ is unknown and the only information is $m$ linear measurements of $u$. In [3] the authors proposed a method to use a family of linear reduced spaces as a generalised nonlinear reduced model for state estimation. A computable surrogate distance is used to evaluate which linear estimate lies closest to a true solution of the PDE problem.

In this paper we propose a strategy of computing the surrogate distance on a coarse mesh while maintaining a fine-mesh reduced model, since the computational cost of the surrogate distance is large relative to the reduced modelling task. We demonstrate numerically that the error induced by the coarse distance is dominated by other approximation errors.

1 Introduction

Complex physical systems are often modelled by a parametric partial differential equation (PDE). We consider the general problem of the form

A(y)u = f(y), \qquad (1)

where $A(y):\mathcal{H}\to\mathcal{H}^{\prime}$ is an elliptic second-order operator and $\mathcal{H}$ is an appropriate Hilbert space. The problem is defined on a physical domain $D\subset\mathbb{R}^{2,3}$, and the parameter $y$ lies in a parameter domain $\mathcal{Y}$, typically a subset of $\mathbb{R}^{d}$ where $d$ might be large. Each parameter $y\in\mathcal{Y}$ is assumed to result in a unique solution $u(x,y)$, but we mostly suppress the spatial dependence $x\in D$ and write $u(y)$. We denote all $\mathcal{H}$-norms by $\|\cdot\|:=\|\cdot\|_{\mathcal{H}}$ and inner products by $\langle\cdot,\cdot\rangle:=\langle\cdot,\cdot\rangle_{\mathcal{H}}$, and use an explicit subscript if they are taken in a different space.

In practical modelling applications it is often computationally expensive to produce a high-precision numerical solution to the PDE problem (1). To our advantage, however, the mapping $u:\mathcal{Y}\to\mathcal{H}$ is typically smooth and compact, hence the set of solutions over all parameter values will be a smooth manifold, possibly of finite intrinsic dimension. We define the solution manifold as follows, assuming from here on that it is compact,

\mathcal{M} := \{u(y) : y\in\mathcal{Y}\}.

The methods proposed in this paper build on reduced basis approximations. A reduced basis is a linear subspace $V\subset\mathcal{H}$ of moderate dimension $n:=\dim(V)\leq\dim(\mathcal{M})\leq\dim(\mathcal{H})$. We use the worst-case error as a benchmark, defined as

\varepsilon(V,\mathcal{M}) = \max_{u\in\mathcal{M}} \operatorname{dist}(u,V) = \max_{u\in\mathcal{M}} \|u - P_{V}u\|, \qquad (2)

where $P_{V}$ is the orthogonal projection operator onto $V$. There are a variety of methods to construct a reduced basis with desirable worst-case error performance, and here we concentrate on greedy methods that select points in $\mathcal{M}$ which become the basis for $V$. We discuss these methods further in §1.1.

A reduced basis can be used to accelerate the forward problem. One can numerically solve the PDE problem for a given parameter $y$ by using $V$ directly in the Galerkin method, making the numerical problem vastly smaller while retaining a high level of accuracy. A thorough treatment of the development of reduced basis approximations is given in [5].

In this paper we are concerned with inverse problems. In this setting it is assumed that there is some unknown true state $u$ (which could correspond to the state of some physical system), and we do not know the parameter vector $y$ that gives this solution. Instead, we make do with a handful of $m$ linear measurements $\ell_{i}(u)$. These measurements are used to produce an accurate reconstruction of $u$ (state estimation) or an estimate of the true parameter $y$ (parameter estimation).

The Parametrized Background Data Weak (PBDW) approach introduced in [6] gives a straightforward procedure for finding an estimator $u^{*}$ of the true state $u$, using only the linear measurement information and a reduced basis $V$. One limitation, however, is that the estimator error $\|u^{*}-u\|$ is bounded from below by the Kolmogorov $n$-width, given by

d_{n}(\mathcal{M}) = \inf_{\dim(V)=n}\ \sup_{u\in\mathcal{M}} \operatorname{dist}(u,V), \qquad (3)

where the infimum is taken over all $n$-dimensional linear subspaces of $\mathcal{H}$. The $n$-width is known to decay slowly for many parametric PDE problems.

We will review methods for constructing a family of local linear reduced models and a nonlinear estimator $u^{*}$ using a surrogate distance model selection procedure. We propose the use of coarser finite element meshes to perform this selection. This coarse selection strategy is motivated by the observation in [3] that the model selection is by far the most computationally costly component in the nonlinear estimation routine. Finally, in §4 we examine numerical examples of surrogate distances over different mesh widths, and see that the coarse distances have an insubstantial impact on the nonlinear estimator.

1.1 Linear reduced models

A reduced basis is a linear space of the form $V=\operatorname{span}(u^{1},\ldots,u^{n})\subset\mathcal{H}$, where $u^{i}=u(y^{i})\in\mathcal{M}$. The parameter values $y^{i}$ are typically chosen by an iterative greedy procedure that aims to minimise $\varepsilon(V,\mathcal{M})$ at each step.

We can define a greedy procedure as follows: given $V$ of dimension $n$ and a finite subset $\widetilde{\mathcal{Y}}$ of $\mathcal{Y}$, to produce an $(n+1)$-dimensional reduced space we find the parameter $y^{n+1}\in\widetilde{\mathcal{Y}}$ which gives the largest $\operatorname{dist}(u(y^{n+1}),V)=\|u(y^{n+1})-P_{V}u(y^{n+1})\|$. We then augment the space $V$ with $u^{n+1}=u(y^{n+1})$. This simple strategy yields, in certain settings [1], a reduced basis with near-optimal convergence with respect to the Kolmogorov $n$-width of $\mathcal{M}$. In this setting the quantity

\varepsilon_{\mathrm{est}}(V,\mathcal{M}) := \max_{y\in\widetilde{\mathcal{Y}}} \|u(y) - P_{V}u(y)\| \qquad (4)

serves as a reasonable and calculable estimate of $\varepsilon(V,\mathcal{M})$, as shown in [4].
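
As an illustration of this procedure, the following is a minimal sketch in Python. It assumes the training snapshots $u(y)$, $y\in\widetilde{\mathcal{Y}}$, are supplied as columns of a matrix expressed in an $\mathcal{H}$-orthonormal basis, so that Euclidean norms and inner products stand in for the $\mathcal{H}$ ones; the function and variable names are illustrative, not taken from [1] or [4].

    import numpy as np

    def greedy_reduced_basis(snapshots, n):
        """Greedy selection of n snapshots by worst-case projection error.

        snapshots : (N, M) array whose columns are training states u(y), written in
                    an H-orthonormal basis (an assumption of this sketch).
        Returns Q, an (N, n) array with orthonormal columns spanning V.
        """
        residual = snapshots.astype(float)
        Q = np.zeros((snapshots.shape[0], 0))
        for _ in range(n):
            # distances of every training state to the current space V = span(Q)
            errs = np.linalg.norm(residual, axis=0)
            j = int(np.argmax(errs))
            # orthonormalise the selected snapshot against the current basis
            q = residual[:, j] / errs[j]
            Q = np.column_stack([Q, q])
            # remove the new direction from all remaining residuals
            residual -= np.outer(q, q @ residual)
        return Q

After $n$ steps, the largest remaining column norm of the residual matrix is exactly the estimate $\varepsilon_{\mathrm{est}}(V,\mathcal{M})$ of (4).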

In practice, we also allow $V$ to be an affine space, with an offset $\bar{u}$, such that

V = \bar{u} + \operatorname{span}(u^{1},\ldots,u^{n}).

Typically we take $\bar{u}$ to be an approximate barycenter of $\mathcal{M}$. From here we will use the term reduced space or model rather than basis.

With $V$ chosen, we now consider the state estimation problem. We have $m$ pieces of linear data $\ell_{i}(u)$ for $i=1,\ldots,m$ of the unknown state $u$, with $\ell_{i}\in\mathcal{H}^{\prime}$. We assume we know the form of the functionals $\ell_{i}$ and hence the Riesz representers $\omega_{i}$ for which $\langle\omega_{i},u\rangle=\ell_{i}(u)$. These $\omega_{i}$ define a measurement space and the measurement vector of $u$:

W := \operatorname{span}(\omega_{1},\ldots,\omega_{m}) \quad\text{and}\quad w = P_{W}u.

Note here that we assume no noise in our measurements, but allowing for random noise is straightforward and has been considered in [6, 2].

The PBDW approach, developed in [6], seeks a reconstruction candidate or estimator $u^{*}(w)$ that is close to $u$ but agrees with the measurement data, that is, $P_{W}u^{*}(w)=P_{W}u=w$. The estimator is defined as

u^{*}(w) = \operatorname{argmin}\{\operatorname{dist}(v,V) : P_{W}v = w\},

which can be calculated through a set of normal equations of size $n\times m$, using the cross-Gramian matrix of the bases of $W$ and $V$. Given only the measurement information $w$, the measurement space $W$, and the reduced space $V$, this estimator $u^{*}(w)$ is an optimal choice [2]. The estimator lies in the subspace $u^{*}(w)\in V\oplus W$.
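
A minimal sketch of this computation, under the simplifying assumption that functions are represented as coefficient vectors in an $\mathcal{H}$-orthonormal basis (so that projections are Euclidean) and that $W$ is spanned by orthonormalised representers; the names here are ours, not from [6] or [2].

    import numpy as np

    def pbdw_estimate(W_basis, V_basis, meas):
        """PBDW estimator u*(w) in an H-orthonormal ambient basis.

        W_basis : (N, m) array, orthonormal basis of the measurement space W
        V_basis : (N, n) array, basis of the (linear) reduced space V, with n < m
        meas    : (m,) array, coefficients of w = P_W u in the columns of W_basis
        """
        # cross-Gramian between the measurement basis and the reduced basis
        G = W_basis.T @ V_basis                       # (m, n)
        # v* in V minimising ||w - P_W v||: a least squares / normal equations solve
        c, *_ = np.linalg.lstsq(G, meas, rcond=None)
        v_star = V_basis @ c
        # add the component in W that makes the measurements match exactly
        return v_star + W_basis @ (meas - G @ c)

For an affine space $V=\bar{u}+\operatorname{span}(u^{1},\ldots,u^{n})$ one applies the same routine to $w-P_{W}\bar{u}$ and adds $\bar{u}$ to the result.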

We require that $W^{\perp}\cap V=\{0\}$ for this reconstruction algorithm to be well posed (otherwise there are infinitely many candidates for $u^{*}$), which in turn requires that $n=\dim(V)<\dim(W)=m$. This dimensionality requirement is reflected in the error analysis. We define an inf-sup constant

\mu(V,W) := \max_{v\in V}\frac{\|v\|}{\|P_{W}v\|} \qquad (5)

which is the inverse of the cosine of the angle between $V$ and $W$, and we have $\mu(V,W)\in[1,\infty]$. For $\mu$ to be finite we require $W^{\perp}\cap V=\{0\}$. The inf-sup constant plays the role of a stability constant for our linear estimator, as we have the well-known bound

E_{\mathrm{wc}} = \max_{u\in\mathcal{M}} \|u - u^{*}(P_{W}u)\| \leq \mu(V,W)\,\varepsilon(V,\mathcal{M}),

as demonstrated in [2]. From the definitions we have $\varepsilon(V,\mathcal{M})\geq d_{n+1}(\mathcal{M})$, hence this reconstruction error can at best be the $(n+1)$-width of $\mathcal{M}$.

2 Nonlinear reduced models

A fundamental drawback of linear reduced models is the slow decay of the Kolmogorov $n$-width for a wide variety of PDE problems. To circumvent this limitation, a framework for nonlinear reduced models and their use for state estimation was presented in [3]. The proposal involves determining a partition of the manifold

\mathcal{M} = \bigcup_{k=1}^{K} \mathcal{M}_{k}

and producing a family of affine reduced space approximations $V_{k}$, one for each portion $\mathcal{M}_{k}$. Each space has dimension $n_{k}=\dim(V_{k})$, requiring $n_{k}<m$ for well-posedness.

Given any target $\varepsilon>0$ and $\mu\geq 1$, it is possible with large enough $K$ to determine a partition and a family of reduced spaces $V_{k}$ that satisfy

\varepsilon_{k} := \varepsilon(V_{k},\mathcal{M}_{k}) \leq \varepsilon \quad\text{and}\quad \mu_{k} := \mu(V_{k},W) \leq \mu \quad\text{for all } k=1,\ldots,K, \qquad (6)

in which case we say the family $(V_{k})_{k=1}^{K}$ is $(\mu,\varepsilon)$-admissible. A slightly looser criterion on the partition can also be used: given some $\sigma>0$, we say the family $(V_{k})_{k=1}^{K}$ is $\sigma$-admissible if $\mu_{k}\varepsilon_{k}\leq\sigma$ for all $k=1,\ldots,K$. The existence of both $(\mu,\varepsilon)$- and $\sigma$-admissible families follows from the compactness of $\mathcal{M}$, and a full demonstration can be found in [3].

In practice one may construct a $\sigma$-admissible family in the following way. Say we are given a partition of the parameter space $(\mathcal{Y}_{k})_{k=1}^{K-1}$ with $\cup_{k=1}^{K-1}\mathcal{Y}_{k}=\mathcal{Y}$, and the associated partition of the manifold $(\mathcal{M}_{k})_{k=1}^{K-1}$. On each $\mathcal{M}_{k}$ a greedy algorithm produces a reduced space $V_{k}$, with an associated error estimate $\sigma_{\mathrm{est},k}:=\mu(V_{k},W)\,\varepsilon_{\mathrm{est}}(V_{k},\mathcal{M}_{k})$. We pick the largest $\sigma_{\mathrm{est},k}$, with index $\tilde{k}$ say, and split the cell $\mathcal{Y}_{\tilde{k}}$ in half along each parameter coordinate direction $i\in\{1,\ldots,d\}$ in turn, resulting in two candidate reduced spaces $V_{\tilde{k},i}^{+}$ and $V_{\tilde{k},i}^{-}$ for each split direction. We take the split direction $i$ to be the one with the smallest maximum error $\max(\sigma_{\tilde{k},i}^{+},\sigma_{\tilde{k},i}^{-})$, and we enrich the family $(V_{k})_{k=1}^{K-1}$ with the two new reduced spaces, making sure to remove $V_{\tilde{k}}$ from the collection. More details of the splitting procedure can be found in [3].
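
A sketch of this splitting loop is given below, with the construction of the local reduced space and the computation of $\sigma_{\mathrm{est},k}$ abstracted as user-supplied callables; these helpers are hypothetical and only indicate where the greedy reduced basis and the estimates $\mu_{k}$, $\varepsilon_{\mathrm{est},k}$ would enter.

    import numpy as np

    def greedy_partition(cells, build_space, sigma_est, n_splits):
        """Greedily refine a partition of the parameter domain into hyperrectangles.

        cells       : list of (lo, hi) pairs of arrays, one hyperrectangle per cell
        build_space : callable cell -> local reduced space V_k (hypothetical helper,
                      e.g. a greedy reduced basis over snapshots with y in the cell)
        sigma_est   : callable V_k -> estimate of mu_k * eps_k for that space
        n_splits    : number of splits (the final family has len(cells) + n_splits spaces)
        """
        spaces = [build_space(c) for c in cells]
        sigmas = [sigma_est(V) for V in spaces]
        for _ in range(n_splits):
            k = int(np.argmax(sigmas))                  # cell with the worst estimate
            lo, hi = cells[k]
            best = None
            for i in range(len(lo)):                    # try halving each coordinate
                mid = 0.5 * (lo[i] + hi[i])
                hi_left, lo_right = hi.copy(), lo.copy()
                hi_left[i], lo_right[i] = mid, mid
                children = [(lo.copy(), hi_left), (lo_right, hi.copy())]
                Vs = [build_space(c) for c in children]
                ss = [sigma_est(V) for V in Vs]
                if best is None or max(ss) < best[0]:
                    best = (max(ss), children, Vs, ss)
            _, children, Vs, ss = best
            cells[k:k + 1] = children                   # replace the worst cell
            spaces[k:k + 1] = Vs
            sigmas[k:k + 1] = ss
        return cells, spaces, sigmas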

2.1 Surrogate reduced model selection

Based on a measurement $w$, each affine reduced space has an associated reconstruction candidate that is found through the PBDW method,

u^{*}_{k}(w) := \operatorname{argmin}\{\operatorname{dist}(v,V_{k}) : P_{W}v = w\} \quad\text{for } k=1,\ldots,K. \qquad (7)

If we happen to know that the true state $u$ originates from some $\mathcal{M}_{k}$, then we would best use $u^{*}_{k}$ as our estimator. In this scenario we would have an error bound of $\|u-u^{*}_{k}(P_{W}u)\|\leq\varepsilon_{k}\mu_{k}\leq\sigma$. This information about the true state $u$ is not available in practice, so we require some other method to determine which candidate $u^{*}_{k}$ to choose.

Consider a surrogate distance $\mathcal{S}(v,\mathcal{M})$ from $v$ to $\mathcal{M}$ that satisfies, for some $0<r\leq R$, the uniform bound

r\operatorname{dist}(v,\mathcal{M}) \leq \mathcal{S}(v,\mathcal{M}) \leq R\operatorname{dist}(v,\mathcal{M}). \qquad (8)

If this surrogate distance is computable, then we can use it to find a surrogate selected estimator by choosing

k^{*} := \operatorname{argmin}\{\mathcal{S}(u^{*}_{k},\mathcal{M}) : k=1,\ldots,K\}, \quad\text{and define}\quad u^{*}(w) := u^{*}_{k^{*}}(w), \qquad (9)

noting that, as there is a dependence on $w$, we can write $k^{*}(w)$. We define an error benchmark

\delta_{\sigma} := \sup\{\|u-v\| : \operatorname{dist}(u,\mathcal{M}) \leq \sigma,\ \operatorname{dist}(v,\mathcal{M}) \leq \sigma,\ P_{W}u = P_{W}v\}.

This quantity takes into account errors from model bias, and we have the following result.

Theorem 2.1.

Given a $\sigma$-admissible family of affine reduced spaces $(V_{k})_{k=1}^{K}$, the estimator based on the surrogate selection (9) has worst-case error bounded above by

\max_{u\in\mathcal{M}} \|u - u^{*}(P_{W}u)\| \leq \delta_{\kappa\sigma}, \qquad (10)

where $\kappa=R/r$ depends only on the uniform bounds of the surrogate distance.

The proof is detailed in Theorem 3.2 of [3]. Note that even given some optimal nonlinear reconstruction algorithm, our best possible error would be $\delta_{0}$, and this quantity is not bounded from below by $d_{n}(\mathcal{M})$. We remark also that $\delta_{\sigma}\geq\sigma$.
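
The selection (9) itself is a simple minimisation over the $K$ candidates. A sketch, assuming a callable surrogate implementing $\mathcal{S}(v,\mathcal{M})$, for instance the residual-based surrogate of the next section:

    import numpy as np

    def select_candidate(candidates, surrogate):
        """Surrogate model selection (9).

        candidates : list of the K reconstructions u_k*(w)
        surrogate  : callable v -> S(v, M)
        Returns the selected index k* and the chosen estimator u*(w).
        """
        values = [surrogate(v) for v in candidates]
        k_star = int(np.argmin(values))
        return k_star, candidates[k_star]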

3 Affine elliptic operators

Say the operator $A(y)$ in (1) is uniformly bounded in $y$ with uniformly bounded inverse, that is, for some $0<r\leq R<\infty$ we have

\|A(y)\|_{\mathcal{H}\to\mathcal{H}^{\prime}} \leq R \quad\text{and}\quad \|A(y)^{-1}\|_{\mathcal{H}^{\prime}\to\mathcal{H}} \leq r^{-1} \quad\text{for all } y\in\mathcal{Y}. \qquad (11)

Then we can show that, for any $v\in\mathcal{H}$, the residual of the PDE,

\mathcal{R}(v,y) := \|A(y)v - f(y)\|_{\mathcal{H}^{\prime}},

satisfies the uniform bound $r\|v-u(y)\| \leq \mathcal{R}(v,y) \leq R\|v-u(y)\|$. If we define

\mathcal{S}(v,\mathcal{M}) = \min_{y\in\mathcal{Y}} \mathcal{R}(v,y),

then we have arrived at a surrogate distance that satisfies the uniform bounds of (8). Using this surrogate in the selection (9) to define $u^{*}(w)$, we have a nonlinear reconstruction algorithm with the error guarantees of (10).

We now make the further assumption that the operator $A(y)$ and source term $f(y)$ have affine dependence on $y$, that is,

A(y) = A_{0} + \sum_{j=1}^{d} y_{j}A_{j} \quad\text{and}\quad f(y) = f_{0} + \sum_{j=1}^{d} y_{j}f_{j}.

The residual can be calculated using representers in $\mathcal{H}$. We define $e_{j}$ as the member of $\mathcal{H}$ that satisfies

\langle e_{j}, z\rangle = \langle A_{j}v - f_{j}, z\rangle_{\mathcal{H}^{\prime},\mathcal{H}} \quad\text{for all}\quad z\in\mathcal{H}, \qquad (12)

and now write $e(y) := e_{0} + \sum_{j=1}^{d} y_{j}e_{j}$ to denote the representer of the overall residual. The residual is equal to $\mathcal{R}(v,y)=\|e(y)\|$, and determining the surrogate distance is a quadratic minimisation problem

\mathcal{S}(v,\mathcal{M})^{2} = \min_{y\in\mathcal{Y}} \|e(y)\|^{2} = \min_{y\in\mathcal{Y}} \Big\|e_{0} + \sum_{j=1}^{d} y_{j}e_{j}\Big\|^{2}.

This leads to a constrained least squares problem that can be solved using standard optimisation routines, using only the values $\langle e_{i},e_{j}\rangle$ for $0\leq i,j\leq d$.
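
A sketch of this minimisation over $\mathcal{Y}=[-1,1]^{d}$, using only the Gram matrix of the residual representers; the choice of a bound-constrained quasi-Newton routine from SciPy is ours and merely stands in for any standard constrained least squares solver.

    import numpy as np
    from scipy.optimize import minimize

    def surrogate_distance(gram):
        """Surrogate distance S(v, M) from the (d+1) x (d+1) Gram matrix of the
        residual representers, gram[i, j] = <e_i, e_j> for 0 <= i, j <= d.

        Minimises ||e_0 + sum_j y_j e_j||^2 over y in the box [-1, 1]^d.
        """
        d = gram.shape[0] - 1
        G = gram[1:, 1:]                 # <e_i, e_j> for i, j >= 1
        g = gram[1:, 0]                  # <e_i, e_0>
        c = gram[0, 0]                   # <e_0, e_0>

        def objective(y):
            return y @ G @ y + 2.0 * (g @ y) + c

        def gradient(y):
            return 2.0 * (G @ y + g)

        res = minimize(objective, np.zeros(d), jac=gradient,
                       bounds=[(-1.0, 1.0)] * d, method="L-BFGS-B")
        # the squared norm is nonnegative up to round-off
        return np.sqrt(max(res.fun, 0.0))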

3.1 Finite element residual evaluation

In practice these calculations take place in a finite element space $\mathcal{H}_{h}$ that is the span of polynomial elements on a triangulation $\mathcal{T}_{h}$ of width $h>0$. In this setting the residual is $\mathcal{R}_{h}(v,y)=\|e_{h}(y)\|$, where $e_{h}(y)=e_{0,h}+\sum_{j=1}^{d}y_{j}e_{j,h}$ and the $e_{j,h}\in\mathcal{H}_{h}$ satisfy the variational problem

\langle e_{j,h}, z\rangle = \langle A_{j}v - f_{j}, z\rangle_{\mathcal{H}^{\prime},\mathcal{H}} \quad\text{for all}\quad z\in\mathcal{H}_{h}. \qquad (13)

Naturally we define $\mathcal{S}_{h}(v,\mathcal{M}) := \min_{y\in\mathcal{Y}}\mathcal{R}_{h}(v,y)$. Note that when we subtract (12) from (13), taking test functions $z\in\mathcal{H}_{h}$, we obtain $\langle e_{h}(y)-e(y), z\rangle = 0$ for all $z\in\mathcal{H}_{h}$, meaning that $e_{h}(y)-e(y)\perp\mathcal{H}_{h}$.

We write $y^{*}=\operatorname{argmin}_{y\in\mathcal{Y}}\mathcal{R}(v,y)$ for the minimiser selected in $\mathcal{S}(v,\mathcal{M})$, and $y^{*}_{h}$ for the equivalent in $\mathcal{S}_{h}(v,\mathcal{M})$. In general $y^{*}\neq y^{*}_{h}$, but we have

\begin{aligned} |\mathcal{S}(v,\mathcal{M}) - \mathcal{S}_{h}(v,\mathcal{M})| &= |\mathcal{R}(v,y^{*}) - \mathcal{R}_{h}(v,y^{*}_{h})| \\ &\leq |\mathcal{R}(v,y^{*}) - \mathcal{R}_{h}(v,y^{*})| + |\mathcal{R}(v,y^{*}_{h}) - \mathcal{R}_{h}(v,y^{*}_{h})| \\ &\leq \|e(y^{*}) - e_{h}(y^{*})\| + \|e(y^{*}_{h}) - e_{h}(y^{*}_{h})\|, \end{aligned} \qquad (14)

where in the last step we have used the fact that

\left|\mathcal{R}(v,y)-\mathcal{R}_{h}(v,y)\right| = \big|\|e(y)\| - \|e_{h}(y)\|\big| \leq \left\|e(y)-e_{h}(y)\right\|.

Thus the convergence of $\mathcal{S}_{h}$ to $\mathcal{S}$ depends on the finite element convergence of the solutions $e_{h}$ to $e$. This convergence is determined by the regularity of $e(y)$, which depends on the smoothness of the data $A(y)v-f(y)$ and the so-called Riesz lift implied in the variational problem (12).

Recall that $A(y)$ in (1) is a second-order symmetric elliptic operator. If we assume homogeneous Dirichlet boundary conditions on $\partial D$ and that $f(y)\in L^{2}(D)$, then a natural choice for our ambient space is the Sobolev space $\mathcal{H}=H^{1}_{0}(D)$, with $\|\cdot\|=\|\cdot\|_{H^{1}_{0}(D)}$. In this setting the solutions $e(y)$ of (12) are the weak solutions of the Poisson problem with homogeneous Dirichlet boundary conditions,

-\nabla_{x}^{2}e(y) = A(y)v - f(y) \text{ on } D \quad\text{with}\quad e(y) = 0 \text{ on } \partial D, \qquad (15)

where we have written $\nabla_{x}^{2}$ to denote the Laplacian in the spatial variables, recalling the unwritten spatial dependence in $e(y)=e(x,y)$.

In our surrogate model selection, $e(y)$ depends on $u^{*}_{k}(w)$. This estimator $u^{*}_{k}(w)$ lies in $V_{k}\oplus W$. As $V_{k}$ is the span of solutions selected from $\mathcal{M}_{k}$, the smoothness of $u^{*}_{k}(w)$ will depend on the smoothness of the solutions of the PDE problem and of the measurement representers $\omega_{i}$.

The convergence of $e_{h}$ to $e$ in the weak form of (15) is well known in a wide variety of settings; for example, we have the classical result

\|e_{h}(y) - e(y)\| \leq c\,h\,\|A(y)v - f(y)\|_{L^{2}(D)},

which is applicable in a wide range of situations. Under these circumstances we thus have $|\mathcal{S}(v,\mathcal{M})-\mathcal{S}_{h}(v,\mathcal{M})| \sim h$.

3.2 Coarse surrogate evaluation

Say we construct a family of reduced spaces $(V_{k})_{k=1}^{K}$, where each space is of the form $V_{k}=\bar{u}_{k}+\operatorname{span}(u^{1}_{h^{\prime}},\ldots,u^{n_{k}}_{h^{\prime}})$, and the solutions $u^{i}_{h^{\prime}}$ are numerically calculated with respect to a triangulation $\mathcal{T}_{h^{\prime}}$ with mesh-width $h^{\prime}$. Our estimators $u^{*}_{k}(w)$ will then lie in $\mathcal{H}_{h^{\prime}}$.

We may consider $\mathcal{T}_{h^{\prime}}$ to be our fine mesh, and nominate another triangulation $\mathcal{T}_{h}$ with $h^{\prime}<h$ to be a coarse mesh. We can use this coarse mesh to compute the surrogate distance, noting that $\mathcal{S}_{h}(v,\mathcal{M})$ may be orders of magnitude faster to compute than the fine-mesh equivalent. This necessarily introduces an inaccuracy in the surrogate selection; however, we maintain the high fidelity of the fine-mesh reduced space approximations $V_{k}$.

If $\mathcal{T}_{h^{\prime}}$ is a fine mesh that contains $\mathcal{T}_{h}$ in the sense that $\mathcal{H}_{h}\subset\mathcal{H}_{h^{\prime}}$, then the variational problem (13) is straightforward to assemble, as we can calculate the inner product $\langle A(y)v-f(y), z\rangle$ based on the known relationships between the basis elements of $\mathcal{H}_{h}$ and $\mathcal{H}_{h^{\prime}}$. Furthermore the size of the error $\|e_{h}(y)-e_{h^{\prime}}(y)\|$, and hence $|\mathcal{S}_{h}(v,\mathcal{M})-\mathcal{S}_{h^{\prime}}(v,\mathcal{M})|$, is bounded above by a constant times $\|e(y)-e_{h}(y)\|$ through the same reasoning as in (14).
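
As a concrete illustration of this nested-mesh computation, here is a one-dimensional sketch (the experiments below are two-dimensional, but the mechanism is identical). With $\mathcal{H}_{h}\subset\mathcal{H}_{h^{\prime}}$, a coarse $\mathbb{P}_{1}$ basis function is a known combination of fine ones, encoded in a prolongation matrix $P$; the coarse load vector of the residual functional is then $P^{T}$ times the fine one, and the coarse representer follows from a single solve with the coarse Gram (stiffness) matrix. This is our own illustration of the idea, not code from [3].

    import numpy as np

    def prolongation_1d(n_coarse):
        """Prolongation from a nested coarse P1 grid with n_coarse interior nodes to
        the once-refined fine grid (h' = h/2, giving 2*n_coarse + 1 interior nodes),
        assuming homogeneous Dirichlet conditions so boundary nodes carry no unknowns."""
        n_fine = 2 * n_coarse + 1
        P = np.zeros((n_fine, n_coarse))
        for j in range(n_coarse):
            P[2 * j + 1, j] = 1.0     # coarse node coincides with a fine node
            P[2 * j, j] = 0.5         # neighbouring fine nodes sit halfway up the hat
            P[2 * j + 2, j] = 0.5
        return P

    def coarse_representer(P, K_coarse, b_fine):
        """Coarse residual representer e_h and its H-norm, given the fine-grid load
        vector b_fine of the functional z -> <A(y)v - f(y), z> and the coarse
        Gram (stiffness) matrix K_coarse."""
        b_coarse = P.T @ b_fine               # restrict the functional to H_h
        e_coarse = np.linalg.solve(K_coarse, b_coarse)
        return e_coarse, np.sqrt(e_coarse @ K_coarse @ e_coarse)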

4 Numerical tests

We present two separate studies of the surrogate as evaluated in finite element spaces. On the unit square $D=[0,1]^{2}$ we consider the PDE

-\nabla_{x}\cdot(a(x,y)\nabla_{x}u(x,y)) = 1 \quad\text{with}\quad u(x,y)=0 \text{ on } \partial D,

where our parametric diffusivity field is given by $a(x,y)=1+\sum_{j=1}^{16}c_{j}y_{j}\chi_{D_{j}}(x)$. Here the $D_{j}$ denote squares of side length $1/4$ that subdivide the unit square into 16 portions, and $\chi_{D_{j}}$ is the indicator function on $D_{j}$. The parameter range is the unit hypercube $\mathcal{Y}=[-1,1]^{16}$, and the coefficients are $c_{j}=0.9j^{-1}$ or $c_{j}=0.99j^{-1}$, so that $a(x,y)>0$ in both cases, but when $c_{j}=0.99j^{-1}$ the diffusivity $a(x,y)$ can come close to $0$ and the PDE problem approaches a loss of ellipticity.

We perform space discretisation by the Galerkin method using $\mathbb{P}_{1}$ finite elements to produce solutions $u_{h^{\prime}}(y)$, with a fine triangulation on a regular grid of mesh size $h^{\prime}=2^{-7}$. In the tests that follow, we evaluate $\mathcal{S}_{h}(v,\mathcal{M})$ with coarse meshes of size $h=2^{-s}$ for $s=2,\ldots,6$.

We generate training sets to compute the reduced models, and test sets on which we test the reconstruction algorithms. The training set $\widetilde{\mathcal{M}}^{\mathrm{tr}}$ is taken to be the collection of PDE solutions for $N_{\mathrm{tr}}=1000$ random samples $\widetilde{\mathcal{Y}}^{\mathrm{tr}}=\{y^{\mathrm{tr}}_{j}\}_{j=1,\ldots,N_{\mathrm{tr}}}$ drawn independently and uniformly on $\mathcal{Y}=[-1,1]^{16}$. The test set $\widetilde{\mathcal{M}}^{\mathrm{te}}$ is created from $N_{\mathrm{te}}=100$ independent parameter samples that are distinct from the training set samples.

In our tests the measurement space $W$ is given by a collection of $m=8$ measurement functionals $\langle\omega_{i},u\rangle=\ell_{i}(u)=|B_{i}|^{-1}\int u\,\chi_{B_{i}}$, that is, local averages over small areas $B_{i}$, which are boxes of width $2h^{\prime}=2^{-6}$, each placed randomly in the unit square.

In our first test we compute a single reduced space $V$ using the greedy procedure on $\widetilde{\mathcal{M}}^{\mathrm{tr}}$. For each test candidate $u\in\widetilde{\mathcal{M}}^{\mathrm{te}}$ we calculate $u^{*}(P_{W}u)$. In Figure 1 we plot the difference between the coarse and fine surrogate distances, $|\mathcal{S}_{h}(u^{*},\mathcal{M})-\mathcal{S}_{h^{\prime}}(u^{*},\mathcal{M})|$. We note that this difference is dominated by the calculated estimator error (4), $\sigma_{\mathrm{est}}:=\mu(V,W)\,\varepsilon_{\mathrm{est}}(V,\mathcal{M})$: we found $\sigma_{\mathrm{est}}\approx 10^{0}$ for $c_{j}=0.9j^{-1}$ and $\sigma_{\mathrm{est}}\approx 10^{1}$ for $c_{j}=0.99j^{-1}$, both of which greatly exceed the differences presented in Figure 1.

Furthermore, we see a linear relationship in the right-hand plot of Figure 1. The linear best fit of $\log(\mathrm{avg}_{\widetilde{\mathcal{M}}^{\mathrm{te}}}|\mathcal{S}_{h}-\mathcal{S}_{h^{\prime}}|)$ against $\log_{2}(h)=-s$ has slope $1.51$, which implies that on average

|\mathcal{S}_{h}-\mathcal{S}_{h^{\prime}}| \sim h^{1.51}.
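
For reference, the reported rate is an ordinary least-squares fit in log-log coordinates; a small helper illustrating the computation (the measured averages themselves are not reproduced here):

    import numpy as np

    def fitted_rate(h_values, avg_diffs):
        """Slope of log(avg |S_h - S_h'|) against log(h) over the tested mesh widths."""
        slope, _intercept = np.polyfit(np.log(np.asarray(h_values, dtype=float)),
                                       np.log(np.asarray(avg_diffs, dtype=float)), 1)
        return slope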
Figure 1: Left: average absolute error of the surrogate $\mathcal{S}_{h}$. Right: CPU wall time for the computation of $\mathcal{S}_{h}$ over all $N_{\mathrm{te}}$ test points.

For the second test we examine the impact of using the coarse surrogate $\mathcal{S}_{h}(u^{*}_{k},\mathcal{M})$ for model selection. We build the $\sigma$-admissible families as outlined in §2 using the greedy splitting of $\mathcal{Y}=[-1,1]^{16}$. Note that at every step the cells $\mathcal{Y}_{k}$ are hyperrectangles, which we can split in half along coordinate directions. We split the parameter space 7 times, resulting in $K=8$ local reduced spaces.

For each test candidate $u\in\widetilde{\mathcal{M}}^{\mathrm{te}}$ we have $K$ possible reconstructions $u_{1}^{*}(w),\ldots,u_{K}^{*}(w)$. We use the coarse surrogate in the model selection (9), writing $k^{*}_{h}(w)=k^{*}_{h}(P_{W}u)$ to make the dependence on $h$ and $w$ clear, and we inspect how often it agrees with the fine selection $k^{*}_{h^{\prime}}(w)$ over all test points $u$. We can also compare this to the “true” selection $k^{*}_{\mathrm{true}}(u)$ for which $u\in\mathcal{M}_{k^{*}_{\mathrm{true}}}$.

Table 1 demonstrates that $k^{*}_{h}(w)$ agrees with $k^{*}_{h^{\prime}}$ the vast majority of the time from $h=2^{-4}$ onwards, for both choices of $c_{j}$. We also see that the fine selection $k^{*}_{h^{\prime}}$ agrees with the true selection 77 times out of 100 for $c_{j}=0.9j^{-1}$ and 64 times for $c_{j}=0.99j^{-1}$; that is, it picks the estimator $u^{*}_{k}(w)$ that is trained on the portion of the manifold $\mathcal{M}_{k}$ that $u$ originated from. Out of interest we also plot, in Figure 2, the histogram of the selections $k^{*}_{h^{\prime}}$ and $k^{*}_{\mathrm{true}}(u)$, recalling that these are indices in $\{1,\ldots,8\}$. We see broadly similar patterns in the reduced model selection. Given the CPU time savings that we see in Figure 1, we conclude that model selection through a coarse surrogate distance is a worthwhile numerical strategy.

                                                   $c_{j}=0.9j^{-1}$                               $c_{j}=0.99j^{-1}$
Mesh width $h$                                $2^{-3}$  $2^{-4}$  $2^{-5}$  $2^{-6}$  $2^{-7}$    $2^{-3}$  $2^{-4}$  $2^{-5}$  $2^{-6}$  $2^{-7}$
$\#\{k^{*}_{h}(w)=k^{*}_{h^{\prime}}(w)\}$        97       100       100       100       100         88        94        96        98       100
$\#\{k^{*}_{h}(w)=k^{*}_{\mathrm{true}}(u)\}$     74        77        77        77        77         58        59        61        65        64

Table 1: Agreement of the coarse model selection out of $N_{\mathrm{te}}=100$ test points $u\in\widetilde{\mathcal{M}}^{\mathrm{te}}$.
Figure 2: Histograms of the fine surrogate selections $k^{*}_{h^{\prime}}$ and the true reduced model selections $k^{*}_{\mathrm{true}}(u)$.

References

  • [1] Peter Binev, Albert Cohen, Wolfgang Dahmen, Ronald DeVore, Guergana Petrova, and Przemyslaw Wojtaszczyk. Convergence rates for greedy algorithms in reduced basis methods. SIAM Journal on Mathematical Analysis, 43(3):1457–1472, January 2011.
  • [2] Peter Binev, Albert Cohen, Wolfgang Dahmen, Ronald DeVore, Guergana Petrova, and Przemyslaw Wojtaszczyk. Data assimilation in reduced modeling. SIAM/ASA Journal on Uncertainty Quantification, 5(1):1–29, 2017.
  • [3] Albert Cohen, Wolfgang Dahmen, Olga Mula, and James Nichols. Nonlinear reduced models for state and parameter estimation. arXiv:2009.02687, November 2020.
  • [4] Albert Cohen, Wolfgang Dahmen, Ronald DeVore, and James Nichols. Reduced basis greedy selection using random training sets. ESAIM: Mathematical Modelling and Numerical Analysis, January 2020.
  • [5] Jan S. Hesthaven, Gianluigi Rozza, and Benjamin Stamm. Certified Reduced Basis Methods for Parametrized Partial Differential Equations. SpringerBriefs in Mathematics. Springer International Publishing, 2016.
  • [6] Yvon Maday, Anthony T. Patera, James D. Penn, and Masayuki Yano. A parameterized-background data-weak approach to variational data assimilation: formulation, analysis, and application to acoustics. International Journal for Numerical Methods in Engineering, 102(5):933–965, May 2015.