Applications of a space-time FOSLS formulation
for parabolic PDEs

Gregor Gantner Institute of Analysis and Scientific Computing, TU Wien, Wiedner Hauptstraße 8-10, 1040 Vienna, Austria. gregor.gantner@asc.tuwien.ac.at and Rob Stevenson Korteweg-de Vries (KdV) Institute for Mathematics, University of Amsterdam, P.O. Box 94248, 1090 GE Amsterdam, The Netherlands. rob.p.stevenson@gmail.com

(Date: March 4, 2025)

Abstract.

In this work, we show that the space-time first-order system least-squares (FOSLS) formulation [Führer, Karkulik, Comput. Math. Appl. 92 (2021)] for the heat equation and its recent generalization [Gantner, Stevenson, ESAIM Math. Model. Numer. Anal. 55 (2021)] to arbitrary second-order parabolic PDEs can be used to efficiently solve parameter-dependent problems, optimal control problems, and problems on time-dependent spatial domains.

Key words and phrases:

Parabolic PDEs, space-time FOSLS, reduced basis method, optimal control problems, time-dependent domains

2010 Mathematics Subject Classification:

35K20, 49J20, 65M12, 65M15, 65M60

The first author has been supported by the Austrian Science Fund (FWF) under grant J4379-N. The second author has been supported by NSF Grant DMS 172029.

1. Introduction

It is well-known, e.g., from [Wlo87, SS09], that the operator corresponding to the heat equation $\partial_{t}u-\Delta_{\bf x}u=f$ , $u(0,\cdot)=u_{0}$ on a time-space cylinder $I\times\Omega$ , where $I:=(0,T)$ and $\Omega\subset\mathbb{R}^{d}$ , with homogeneous Dirichlet boundary conditions, is a boundedly invertible linear mapping between $X$ and $Y^{\prime}\times L_{2}(\Omega)$ , where $X:=L_{2}(I;H^{1}_{0}(\Omega))\cap H^{1}(I;H^{-1}(\Omega))$ and $Y:=L_{2}(I;H^{1}_{0}(\Omega))$ . For source term $f\in L_{2}(I\times\Omega)$ and initial data $u_{0}\in L_{2}(\Omega)$ , the recent work [FK21] has proven that

\displaystyle\operatorname*{argmin}_{{\bf u}=(u_{1},{\bf u}_{2})\in U}\|\operatorname{div}{\bf u}-f\|_{L_{2}(I\times\Omega)}^{2}+\|{\bf u}_{2}+\nabla_{\bf x}u_{1}\|^{2}_{L_{2}(I\times\Omega)^{d}}+\|u_{1}(0,\cdot)-u_{0}\|^{2}_{L_{2}(\Omega)},

where $U:=\{{\bf u}\in X\times L_{2}(I\times\Omega)^{d}\colon\operatorname{div}{\bf u}\in L_{2}(I\times\Omega)\}$ equipped with the graph norm, is a well-posed first-order system least-squares (FOSLS) formulation for the pair of the solution $u=u_{1}$ and (minus) its spatial gradient $-\nabla_{\bf x}u={\bf u}_{2}$ . This formulation can already be found in [BG09] without a proof of its well-posedness though. In [GS21], we have generalized the result to second-order parabolic PDEs with arbitrary Dirichlet and/or Neumann boundary conditions (in the case of inhomogeneous boundary conditions appending a squared norm of a boundary residual, which is not of $L_{2}$ -type, to the least-squares functional). In particular, we have shown that source terms $f\not\in L_{2}(I\times\Omega)$ are covered and $X$ in the definition of $U$ may be replaced by $Y$ . We also mention our recent generalization to the instationary Stokes problem with so-called slip boundary conditions [GS22a].

Compared to other space-time discretization approaches such as [And13, Ste15, LMN16, SW21b, SW21a], the FOSLS formulation has the major advantage that it results in a symmetric, and, w.r.t. a mesh-independent norm, bounded and coercive bilinear form so that the Galerkin approximation from any conforming trial space is a quasi-best approximation from that space. At least for homogeneous boundary conditions, the minimization is w.r.t. $L_{2}$ -norms, so that the arising stiffness matrix is sparse and can be easily computed. Moreover, the least-squares functional provides a reliable and efficient a posteriori estimator for the error in the $U$ -norm. One disadvantage of the FOSLS method from [FK21, GS21] is that the latter norm for the error in the pair $(u,-\nabla_{\bf x}u)$ appears to be considerably stronger than the $X$ -norm for the error in $u$ . Indeed, for standard Lagrange finite element spaces applied to non-smooth solutions, e.g., as those that result from a discontinuity in the transition of initial and boundary data, relatively low convergence rates are reported in [FK21], even when adaptive refinement is employed. This issue shall be addressed in our future work [GS22b] by constructing more suitable trial spaces.

In the present work, we exploit the aforementioned advantages of the FOSLS method to efficiently solve parameter-dependent problems, optimal control problems, and problems on time-dependent spatial domains.

1.1. Parameter-dependent problems

We consider a reduced basis method, where in a potentially expensive offline phase a basis of (highly accurate approximations of the) solutions corresponding to a suitable finite subset of the parameter space is computed so that in the subsequent online phase the solution for arbitrary given parameters can be efficiently approximated from the spanned reduced basis space.

Compared to reduced basis methods based on time stepping and proper orthogonal decomposition (POD), e.g., [HO08, EKP11, Haa17], reduced basis methods based on simultaneous space-time discretization, e.g., [UP14, Yan14, YPU14], yield not only a dimension reduction in space but in space-time; see the comparison in [GMU17]. Consequently, the resulting online phase can be expected to be more efficient.

As already mentioned, compared to other space-time discretization approaches, the FOSLS formulation has the advantage that it results in a coercive bilinear form. In this setting the strongest theoretical results for the reduced basis method are available (see, e.g., [Haa17]), and the construction of the reduced basis does not need to be accompanied by the construction of a corresponding test space that yields inf-sup stability.

Finally, the reliable and efficient a posteriori error estimator of the FOSLS formulation is beneficial for both building the reduced basis and for certification of the obtained approximate solution in the online phase.

1.2. Optimal control problems

We consider a large class of optimal control problems constrained by second-order parabolic PDEs. This includes controls of the source term in $L_{2}(I\times\Omega)$ or in $Y^{\prime}$ and of the initial data in $L_{2}(\Omega)$ as well as desired observations of the PDE solution in $L_{2}(I\times\Omega)$ , its spatial gradient in $L_{2}(I\times\Omega)^{d}$ , or its restriction to the final time point in $L_{2}(\Omega)$ .

It is well-known that optimal control problems can be equivalently formulated as a saddle-point problem; see, e.g., [Lio68]. The saddle-point formulation is a coupled system of the primal equation and some adjoint equation. When solving it iteratively, e.g., by a gradient descent method [Trö10, HPUU08], the approximate solutions are required over the whole space-time cylinder $I\times\Omega$ and not just at the final time point. This nullifies the potential advantage of time-stepping methods such as [MV07, MV08], which in the case of a forward problem only need to store the approximation at the current time step.

Instead space-time methods such as [GHZ12, LSTY21b, LSTY21a, LS21] aim to solve the saddle-point problem at once on some mesh of the space-time cylinder $I\times\Omega$ . All of the mentioned works consider optimal control problems constrained by the heat equation with control of the source term and desired observation of the PDE solution itself. In contrast to time-stepping methods, all of them allow for locally adapted meshes because the involved bilinear forms are coercive or inf-sup stable. However, only [GHZ12], which reformulates the problem as a system of fourth order in space and second order in time requiring additional regularity, yields quasi-optimal approximations in the usual sense with the same norm on both sides, as for the others the norms in which the bilinear forms are continuous differ from the norms in which they are coercive or inf-sup stable.

In our approach, we consider the saddle-point problem when the parabolic PDE is replaced by the equivalent FOSLS. Coercivity of the latter implies that this saddle-point problem is for arbitrary discrete subspaces uniformly inf-sup stable and continuous w.r.t. the same norm so that quasi-optimality in the usual sense is valid. We also provide one reliable and one efficient a posteriori error estimator for our method. Other estimators have been derived in [GHZ12, LS21] for the respective methods.

1.3. Time-dependent domains

Finally, we consider second-order parabolic PDEs on time-dependent domains. While this poses a technical difficulty for time-stepping methods [GS17, FR17, LO19, SG20, FS22], in principle, space-time methods [Leh15, HLZ16, LMN16, Moo18, BHT22] can be readily applied because the overall space-time domain remains fixed.

We show that this is indeed the case for the FOSLS method [FK21, GS21] verifying its well-posedness. The analysis requires minimal regularity assumptions on the mapping describing the motion of the spatial domain throughout time. However, we emphasize that our method ultimately only requires a mesh of the space-time domain. This is in contrast to the so-called arbitrary Lagrangian-Eulerian (ALE) time-stepping methods [SHD01, GS17, SG20], which solve the problem on the transformed time-independent domain and thus explicitly involve the mapping. Moreover, our method again allows for arbitrary discrete trial spaces and immediately provides a reliable and efficient a posteriori error estimator.

1.4. Outline

In Section 2, we fix the general notation (Section 2.1) and recall the formulation of general second-order parabolic PDEs as a first-order system (Section 2.2). In Section 3, we introduce a corresponding reduced basis method for parameter-dependent problems (Section 3.1), discuss the generation of a suitable basis (Section 3.2), and provide a numerical experiment (Section 3.3). In Section 4, we show that the formulation of optimal control problems constrained by the first-order system as a saddle-point problem is uniformly stable for arbitrary trial spaces (Section 4.1), derive an optimality system of PDEs for a certain class of optimal control problems (Section 4.2), which is then used to derive a posteriori error estimators (Section 4.3), and provide two numerical experiments (Section 4.4), where we also demonstrate optimal convergence rates for piecewise linear trial functions if both the optimal control and the corresponding state are sufficiently smooth. In Section 5, we show that the formulation of general second-order parabolic PDEs on time-dependent domains as a first-order system induces, as for time-independent domains, a linear isomorphism and provide two numerical experiments (Section 5.2). Finally, we draw some conclusions in Section 6.

2. Preliminaries

2.1. General notation

In this work, by $C\lesssim D$ we will mean that $C\geq 0$ can be bounded by a multiple of $D\geq 0$ , independently of parameters on which $C$ and $D$ may depend. Obviously, $C\gtrsim D$ is defined as $D\lesssim C$ , and $C\eqsim D$ as $C\lesssim D$ and $C\gtrsim D$ .

For normed linear spaces $E$ and $F$ , we will denote by $\mathcal{L}(E,F)$ the normed linear space of bounded linear mappings $E\rightarrow F$ . For simplicity only, we exclusively consider linear spaces over the scalar field $\mathbb{R}$ .

2.2. Formulation of parabolic PDEs as first-order system

Let $\Omega\subset\mathbb{R}^{d}$ , $d\geq 1$ , be a Lipschitz domain with boundary $\Gamma:=\partial\Omega$ , and $T>0$ a given end time point with corresponding time interval $I:=(0,T)$ . We abbreviate the space-time cylinder $Q:=I\times\Omega$ with lateral boundary $\Sigma:=I\times\Gamma$ . We consider the following parabolic PDE with homogeneous Dirichlet boundary conditions

\begin{array}[]{rcll}\partial_{t}u-\operatorname{div}_{\bf x}({\bf A}\nabla_{\bf x}u)+{\bf b}\cdot\nabla_{\bf x}u+cu&=&f&\text{ in }Q,\\ u&=&0&\text{ on }\Sigma,\\ u(0,\cdot)&=&u_{0}&\text{ on }\Omega.\end{array}

(2.1)

Here, we will require that ${\bf A}={\bf A}^{\top}\in L_{\infty}(Q)^{d\times d}$ is uniformly positive, ${\bf b}\in L_{\infty}(Q)^{d}$ , and $c\in L_{\infty}(Q)$ . For unique existence of a (weak) solution $u$ of (2.1), we recall the following theorem from, e.g., [SS09, Theorem 5.1].

Theorem 2.1.

With $X:=L_{2}(I;H_{0}^{1}(\Omega))\cap H^{1}(I;H^{-1}(\Omega))$ , $Y:=L_{2}(I;H_{0}^{1}(\Omega))$ ,

\displaystyle(Bu)(v):=\int_{Q}\partial_{t}u\,v+({\bf A}\nabla_{\bf x}u)\cdot\nabla_{\bf x}v+{\bf b}\cdot\nabla_{\bf x}u\,v+cuv\,{\rm d}{\bf x}\,{\rm d}t

and $\gamma_{0}(u):=u(0,\cdot)$ for all $u\in X,v\in Y$ , the mapping

\displaystyle\begin{pmatrix}B\\ \gamma_{0}\end{pmatrix}:X\to Y^{\prime}\times L_{2}(\Omega),

is a linear isomorphism. In particular, there exists a unique $u\in X$ such that

\begin{pmatrix}B\\ \gamma_{0}\end{pmatrix}u=\begin{pmatrix}f\\ u_{0}\end{pmatrix}.

(2.2)

Any $f\in Y^{\prime}=L_{2}(I;H^{-1}(\Omega))$ can be written as

f=f_{1}+\operatorname{div}_{\bf x}{\bf f}_{2},

(2.3)

for some $f_{1}\in L_{2}(Q)$ and ${\bf f}_{2}\in L_{2}(Q)^{d}$ , where $\int_{Q}\operatorname{div}_{\bf x}{\bf f}_{2}\,v\,{\rm d}{\bf x}\,{\rm d}t:=-\int_{Q}{\bf f}_{2}\cdot\nabla_{\bf x}v\,{\rm d}{\bf x}\,{\rm d}t$ for $v\in Y$ . With such a decomposition, and ${\bf u}=(u_{1},{\bf u}_{2})\colon Q\rightarrow\mathbb{R}\times\mathbb{R}^{d}$ , (2.2) is equivalent to the first-order system¹

\displaystyle G{\bf u}:=\left(\begin{array}[]{@{}c@{}}\operatorname{div}{\bf u}+{\bf b}\cdot\nabla_{\bf x}u_{1}+cu_{1}\\ -{\bf u}_{2}-{\bf A}\nabla_{\bf x}u_{1}\\ u_{1}(0,\cdot)\end{array}\right)=\left(\begin{array}[]{@{}c@{}}f_{1}\\ {\bf f}_{2}\\ u_{0}\end{array}\right)=:{\bf f},\quad u_{1}|_{\Sigma}=0,

(2.10)

as is shown in the next theorem.

¹ The system $\left(\begin{array}[]{@{}c@{}}\operatorname{div}{\bf u}-{\bf b}\cdot{\bf A}^{-1}{\bf u}_{2}+cu_{1}\\ {\bf u}_{2}+{\bf A}\nabla_{\bf x}u_{1}\\ u_{1}(0,\cdot)\end{array}\right)=\left(\begin{array}[]{@{}c@{}}f_{1}+{\bf b}\cdot{\bf A}^{-1}{\bf f}_{2}\\ -{\bf f}_{2}\\ u_{0}\end{array}\right)$ studied in [GS21] leads to (2.10) by pre-multiplication with $\left(\begin{array}[]{@{}ccc@{}}I&{\bf b}\cdot{\bf A}^{-1}&0\\ 0&-I&0\\ 0&0&I\end{array}\right)$ , which is a linear isomorphism on the space $L$ defined in (2.12).

Theorem 2.2 ([GS21, Theorem 2.3 and Proposition 2.5]).

The operator $G$ is a linear isomorphism from the space

\displaystyle U:=\{{\bf u}=(u_{1},{\bf u}_{2})\in Y\times L_{2}(Q)^{d}\colon{\bf u}\in H(\operatorname{div};Q)\}

(2.11)

(equipped with the corresponding graph norm) to the space

\displaystyle L:=L_{2}(Q)\times L_{2}(Q)^{d}\times L_{2}(\Omega).

(2.12)

If $u\in X$ solves (2.2), then ${\bf u}=(u,-{\bf A}\nabla_{\bf x}u-{\bf f}_{2})\in U$ solves (2.10). Conversely, if ${\bf u}=(u_{1},{\bf u}_{2})\in U$ solves (2.10), then $u=u_{1}$ solves (2.2).
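The equivalence can be checked formally as follows. This is only a sketch at the level of distributions, with the rigorous argument given in [GS21]; it uses that $\operatorname{div}$ is the space-time divergence, i.e., $\operatorname{div}{\bf u}=\partial_{t}u_{1}+\operatorname{div}_{\bf x}{\bf u}_{2}$ . Solving the second row of (2.10) for ${\bf u}_{2}$ and inserting it into the first row gives

```latex
\begin{aligned}
{\bf u}_{2} &= -{\bf A}\nabla_{\bf x}u_{1}-{\bf f}_{2},\\
f_{1} &= \operatorname{div}{\bf u}+{\bf b}\cdot\nabla_{\bf x}u_{1}+cu_{1}
       = \partial_{t}u_{1}-\operatorname{div}_{\bf x}({\bf A}\nabla_{\bf x}u_{1})
         -\operatorname{div}_{\bf x}{\bf f}_{2}+{\bf b}\cdot\nabla_{\bf x}u_{1}+cu_{1}.
\end{aligned}
```

Moving $\operatorname{div}_{\bf x}{\bf f}_{2}$ to the left-hand side recovers the first line of (2.1) with $f=f_{1}+\operatorname{div}_{\bf x}{\bf f}_{2}$ as in (2.3), while the third row of (2.10) is the initial condition.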

Notice that $G{\bf u}={\bf f}$ is equivalent to the variational problem

\displaystyle\langle G{\bf u},G{\bf v}\rangle_{L}=\langle{\bf f},G{\bf v}\rangle_{L}\quad\text{for all }{\bf v}\in U.

Since the bilinear form at the left-hand side is bounded, symmetric and coercive, it provides the ideal setting for the application of Galerkin discretizations. The Galerkin solution from the employed trial space is a quasi-best approximation to ${\bf u}$ w.r.t. $\|\cdot\|_{U}$ . For any approximation $\widetilde{\bf u}=(\widetilde{u}_{1},\widetilde{\bf u}_{2})$ of ${\bf u}$ we have the computable a posteriori error estimator $\|{\bf u}-\widetilde{\bf u}\|_{U}\eqsim\|{\bf f}-G\widetilde{\bf u}\|_{L}$ . Moreover, the following lemma shows that $\|u_{1}-\widetilde{u}_{1}\|_{X}\lesssim\|{\bf u}-\widetilde{\bf u}\|_{U}$ .
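As a finite-dimensional analogue of this setting (a minimal sketch, not the discretization used in this work), one may replace $G$ by an injective matrix, $L$ by a Euclidean space, and the trial space by the span of a few vectors. The Galerkin system then becomes the normal equations, and the residual norm is directly computable. All names below (`G`, `basis`, `solve`, the toy data) are illustrative assumptions.

```python
from math import sqrt

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def matvec(A, x):
    return [dot(row, x) for row in A]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            factor = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= factor * M[k][j]
    x = [0.0] * n
    for k in reversed(range(n)):
        x[k] = (M[k][n] - sum(M[k][j] * x[j] for j in range(k + 1, n))) / M[k][k]
    return x

# Toy stand-ins: G maps R^3 injectively into R^4, f is consistent data.
G = [[2.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.0, 1.0, 3.0], [0.0, 0.0, 1.0]]
u_exact = [1.0, -2.0, 0.5]
f = matvec(G, u_exact)

# Trial space: span of two vectors in R^3 (it does not contain u_exact).
basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
GV = [matvec(G, v) for v in basis]  # images G v_i

# Galerkin system <G u_h, G v_i> = <f, G v_i> for all basis vectors v_i.
A = [[dot(GV[j], GV[i]) for j in range(2)] for i in range(2)]
rhs = [dot(f, GV[i]) for i in range(2)]
c = solve(A, rhs)
u_h = [sum(c[j] * basis[j][k] for j in range(2)) for k in range(3)]

# Computable a posteriori quantity: the residual norm ||f - G u_h||.
residual = [fi - gi for fi, gi in zip(f, matvec(G, u_h))]
estimator = sqrt(dot(residual, residual))
```

In this toy setting the residual is orthogonal to the images `GV`, which is the discrete counterpart of Galerkin orthogonality in the $\langle G\cdot,G\cdot\rangle_{L}$ inner product.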

Lemma 2.3 ([GS21, Lemma 2.2]).

The mapping ${\bf u}\mapsto u_{1}$ belongs to $\mathcal{L}(U,X)$ .

Finally in this section, we provide more information about the splitting (2.3).

Remark 2.4.

For any $f\in Y^{\prime}$ , the unique solution of

\displaystyle\operatorname*{argmin}_{\{(f_{1},{\bf f}_{2})\in L_{2}(Q)\times L_{2}(Q)^{d}\colon f_{1}+\operatorname{div}_{\bf x}{\bf f}_{2}=f\}}\|f_{1}\|_{L_{2}(Q)}^{2}+\|{\bf f}_{2}\|_{L_{2}(Q)^{d}}^{2}

is given by $(f_{1},{\bf f}_{2})=(w,-\nabla_{\bf x}w)$ where $\langle w,v\rangle_{Y}:=\langle w,v\rangle_{L_{2}(Q)}+\langle\nabla_{\bf x}w,\nabla_{\bf x}v\rangle_{L_{2}(Q)^{d}}=f(v)$ for all $v\in Y$ , i.e., $w\in Y$ is the Riesz lift of $f$ , and

\displaystyle\|f_{1}\|_{L_{2}(Q)}^{2}+\|{\bf f}_{2}\|_{L_{2}(Q)^{d}}^{2}=\|f\|^{2}_{Y^{\prime}}.

Indeed, the last property is a consequence of $\|w\|_{Y}=\|f\|_{Y^{\prime}}$ , whereas for arbitrary $(f_{1},{\bf f}_{2})\in L_{2}(Q)\times L_{2}(Q)^{d}$ with $f_{1}+\operatorname{div}_{\bf x}{\bf f}_{2}=f$ , $\|f\|^{2}_{Y^{\prime}}\leq\|f_{1}\|_{L_{2}(Q)}^{2}+\|{\bf f}_{2}\|_{L_{2}(Q)^{d}}^{2}$ follows from $f(v)=\langle f_{1},v\rangle_{L_{2}(Q)}-\langle{\bf f}_{2},\nabla_{\bf x}v\rangle_{L_{2}(Q)^{d}}\leq\sqrt{\|f_{1}\|_{L_{2}(Q)}^{2}+\|{\bf f}_{2}\|_{L_{2}(Q)^{d}}^{2}}\,\|v\|_{Y}$ .
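For completeness, the following short computation (a sketch using only the definitions above) confirms that the pair $(w,-\nabla_{\bf x}w)$ is admissible and attains the norm of $f$ : for all $v\in Y$ ,

```latex
\begin{aligned}
(f_{1}+\operatorname{div}_{\bf x}{\bf f}_{2})(v)
 &= \langle w,v\rangle_{L_{2}(Q)}-\langle -\nabla_{\bf x}w,\nabla_{\bf x}v\rangle_{L_{2}(Q)^{d}}
  = \langle w,v\rangle_{Y} = f(v),\\
\|f_{1}\|_{L_{2}(Q)}^{2}+\|{\bf f}_{2}\|_{L_{2}(Q)^{d}}^{2}
 &= \|w\|_{L_{2}(Q)}^{2}+\|\nabla_{\bf x}w\|_{L_{2}(Q)^{d}}^{2}
  = \|w\|_{Y}^{2} = \|f\|_{Y^{\prime}}^{2}.
\end{aligned}
```

Together with the lower bound shown above for arbitrary admissible pairs, this identifies $(w,-\nabla_{\bf x}w)$ as a minimizer; uniqueness follows from strict convexity of the squared norm on the affine constraint set.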

Finally, notice that if $f=f_{1}+\operatorname{div}_{\bf x}{\bf f}_{2}$ with $\|f_{1}\|_{L_{2}(Q)}^{2}+\|{\bf f}_{2}\|_{L_{2}(Q)^{d}}^{2}\eqsim\|f\|_{Y^{\prime}}^{2}$ , then for the solutions ${\bf u}$ of (2.10) and $u$ of (2.2), it holds that $\|{\bf u}\|^{2}_{U}\eqsim\|{\bf f}\|^{2}_{L}\eqsim\|f\|^{2}_{Y^{\prime}}+\|u_{0}\|_{L_{2}(\Omega)}^{2}\eqsim\|u\|_{X}^{2}$ .

3. Parameter-dependent problems

In this section, we consider coefficients ${\bf A}={\bf A}[{\bm{\mu}}],{\bf b}={\bf b}[{\bm{\mu}}],c=c[{\bm{\mu}}]$ and right-hand side ${\bf f}[{\bm{\mu}}]=(f_{1}[{\bm{\mu}}],{\bf f}_{2}[{\bm{\mu}}],u_{0}[{\bm{\mu}}])$ that depend additionally on a tuple of parameters ${\bm{\mu}}$ in a set $\mathcal{P}\subset\mathbb{R}^{p}$ . We assume that the coefficients and the right-hand sides are parameter-separable in the sense that

\begin{array}[]{rcl rcl rcl}{\bf A}[{\bm{\mu}}]&=&\sum_{q=1}^{n_{\bf A}}\theta_{q}^{\bf A}({\bm{\mu}}){\bf A}_{q},&{\bf b}[{\bm{\mu}}]&=&\sum_{q=1}^{n_{\bf b}}\theta_{q}^{\bf b}({\bm{\mu}}){\bf b}_{q},&c[{\bm{\mu}}]&=&\sum_{q=1}^{n_{c}}\theta_{q}^{c}({\bm{\mu}})c_{q},\\ f_{1}[{\bm{\mu}}]&=&\sum_{q=1}^{n_{f_{1}}}\theta_{q}^{f_{1}}({\bm{\mu}})f_{1,q},&{\bf f}_{2}[{\bm{\mu}}]&=&\sum_{q=1}^{n_{{\bf f}_{2}}}\theta_{q}^{\bf f_{2}}({\bm{\mu}}){\bf f}_{2,q},&u_{0}[{\bm{\mu}}]&=&\sum_{q=1}^{n_{u_{0}}}\theta_{q}^{u_{0}}({\bm{\mu}})u_{0,q}\end{array}

(3.1)

for some functions $\theta_{q}^{(\cdot)}:\mathcal{P}\to\mathbb{R}$ and ${\bf A}_{q}\in L_{\infty}(Q)^{d\times d},{\bf b}_{q}\in L_{\infty}(Q)^{d},c_{q}\in L_{\infty}(Q)$ and $f_{1,q}\in L_{2}(Q),{\bf f}_{2,q}\in L_{2}(Q)^{d},u_{0,q}\in L_{2}(\Omega)$ . Let $F=F[{\bm{\mu}}]\in U^{\prime}$ be some given quantity of interest that is also parameter-separable, i.e., $F[{\bm{\mu}}]=\sum_{q=1}^{n_{F}}\theta_{q}^{F}({\bm{\mu}})F_{q}$ with $\theta_{q}^{F}:\mathcal{P}\to\mathbb{R}$ and $F_{q}\in U^{\prime}$ . After a possibly computationally expensive offline phase, we want to be able to instantly compute an approximation of $F[{\bm{\mu}}]({\bf u}[{\bm{\mu}}])$ for different ${\bm{\mu}}\in\mathcal{P}$ in the so-called online phase.

3.1. Reduced basis method

Given some parameter ${\bm{\mu}}$ , the idea of the reduced basis method is to compute an approximation ${\bf u}^{N}[{\bm{\mu}}]$ of ${\bf u}[{\bm{\mu}}]$ from the (low-dimensional) span of some snapshots $\{{\bf u}[{\bm{\mu}}^{(1)}],\dots,{\bf u}[{\bm{\mu}}^{(N)}]\}$ .

Instead of ${\bf u}[{\bm{\mu}}^{(i)}]$ , very accurate approximations ${\bf u}^{\delta}[{\bm{\mu}}^{(i)}]$ thereof are computed in the offline phase. We will choose ${\bf u}^{\delta}[{\bm{\mu}}^{(i)}]$ as the best approximation to ${\bf u}[{\bm{\mu}}^{(i)}]$ w.r.t. $\|G[{\bm{\mu}}^{(i)}](\cdot)\|_{L}$ from some high-dimensional subspace $U^{\delta}\subset U$ , i.e., as the solution of the Galerkin system $\langle G[{\bm{\mu}}^{(i)}]{\bf u}^{\delta}[{\bm{\mu}}^{(i)}]-{\bf f}[{\bm{\mu}}^{(i)}],G[{\bm{\mu}}^{(i)}]{\bf v}^{\delta}\rangle_{L}=0$ for all ${\bf v}^{\delta}\in U^{\delta}$ . One easily checks that the parameter-separability (3.1) of the coefficients and the right-hand sides implies parameter-separability of the bilinear form

\displaystyle\langle G[{\bm{\mu}}](\cdot)\,,\,G[{\bm{\mu}}](\cdot)\rangle_{L}=\sum_{q=1}^{n_{b}}\theta_{q}^{b}({\bm{\mu}})b_{q}(\cdot,\cdot),

the linear form

\displaystyle\langle{\bf f}[{\bm{\mu}}]\,,\,G[{\bm{\mu}}](\cdot)\rangle_{L}=\sum_{q=1}^{n_{l}}\theta_{q}^{l}({\bm{\mu}})l_{q}(\cdot),

and the squared norm

\displaystyle\|G[{\bm{\mu}}]({\bf u}[{\bm{\mu}}])\|_{L}^{2}=\|{\bf f}[{\bm{\mu}}]\|_{L}^{2}=\sum_{q=1}^{n_{s}}\theta_{q}^{s}({\bm{\mu}}),

where again $\theta_{q}^{(\cdot)}:\mathcal{P}\to\mathbb{R}$ , and each $b_{q}:U\times U\to\mathbb{R}$ is a continuous bilinear form and each $l_{q}:U\to\mathbb{R}$ is a continuous linear form. Besides the functions ${\bf u}^{\delta}[{\bm{\mu}}^{(i)}]$ , which form the reduced basis, in the offline phase we further compute the matrices and vectors

\displaystyle\big{(}b_{q}({\bf u}^{\delta}[{\bm{\mu}}^{(j)}],{\bf u}^{\delta}[{\bm{\mu}}^{(i)}])\big{)}_{i,j=1}^{N},\quad\big{(}l_{q}({\bf u}^{\delta}[{\bm{\mu}}^{(i)}])\big{)}_{i=1}^{N},\quad\big{(}F_{q}({\bf u}^{\delta}[{\bm{\mu}}^{(i)}])\big{)}_{i=1}^{N}.

In the online phase, we compute ${\bf u}^{N}[{\bm{\mu}}]$ as the best approximation to ${\bf u}[{\bm{\mu}}]$ w.r.t. $\|G[{\bm{\mu}}](\cdot)\|_{L}$ from the span of the reduced basis, i.e., as the solution of a low-dimensional Galerkin system. Assuming that the snapshots ${\bf u}^{\delta}[{\bm{\mu}}^{(1)}],$ $\dots$ , ${\bf u}^{\delta}[{\bm{\mu}}^{(N)}]$ are linearly independent, the corresponding coefficient vector ${\bf c}^{N}[{\bm{\mu}}]$ of ${\bf u}^{N}[{\bm{\mu}}]$ with respect to this basis is just given by

\displaystyle{\bf c}^{N}[{\bm{\mu}}]=\Big{(}\sum_{q=1}^{n_{b}}\theta_{q}^{b}({\bm{\mu}})\big{(}b_{q}({\bf u}^{\delta}[{\bm{\mu}}^{(j)}],{\bf u}^{\delta}[{\bm{\mu}}^{(i)}])\big{)}_{i,j=1}^{N}\Big{)}^{-1}\Big{(}\sum_{q=1}^{n_{l}}\theta_{q}^{l}({\bm{\mu}})\big{(}l_{q}({\bf u}^{\delta}[{\bm{\mu}}^{(i)}])\big{)}_{i=1}^{N}\Big{)}.

We stress that the computational effort to solve this linear system depends only on $N$ and not on ${\rm dim}(U^{\delta})$ as the involved matrices and vectors have been already computed in the offline phase. Then, the quantity of interest is given by

\displaystyle F[{\bm{\mu}}]({\bf u}^{N}[{\bm{\mu}}])=\sum_{q=1}^{n_{F}}\theta_{q}^{F}({\bm{\mu}})\Big{(}\big{(}F_{q}({\bf u}^{\delta}[{\bm{\mu}}^{(i)}])\big{)}_{i=1}^{N}\cdot{\bf c}^{N}[{\bm{\mu}}]\Big{)}.

Again, the involved vectors $\big{(}F_{q}({\bf u}^{\delta}[{\bm{\mu}}^{(i)}])\big{)}_{i=1}^{N}$ have been precomputed in the offline phase. Finally, we can even instantly estimate the discretization error in the online phase by

\displaystyle\begin{aligned}\|{\bf u}[{\bm{\mu}}]-{\bf u}^{N}[{\bm{\mu}}]\|_{U}^{2}&\eqsim\|G[{\bm{\mu}}]({\bf u}[{\bm{\mu}}])-G[{\bm{\mu}}]({\bf u}^{N}[{\bm{\mu}}])\|_{L}^{2}\\ &=\|G[{\bm{\mu}}]({\bf u}[{\bm{\mu}}])\|_{L}^{2}-\langle G[{\bm{\mu}}]({\bf u}[{\bm{\mu}}])\,,\,G[{\bm{\mu}}]({\bf u}^{N}[{\bm{\mu}}])\rangle_{L}\\ &=\sum_{q=1}^{n_{s}}\theta_{q}^{s}({\bm{\mu}})-\sum_{q=1}^{n_{l}}\theta_{q}^{l}({\bm{\mu}})\Big{(}\big{(}l_{q}({\bf u}^{\delta}[{\bm{\mu}}^{(i)}])\big{)}_{i=1}^{N}\cdot{\bf c}^{N}[{\bm{\mu}}]\Big{)}.\end{aligned}

The constants hidden in the $\eqsim$ -symbol depend only on the $L_{\infty}$ -norms of the coefficients ${\bf A}[{\bm{\mu}}],{\bf b}[{\bm{\mu}}],c[{\bm{\mu}}]$ and the smallest eigenvalue of ${\bf A}[{\bm{\mu}}]$ .
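The first equality in the preceding display rests on Galerkin orthogonality. As a short sketch, abbreviate $G=G[{\bm{\mu}}]$ , ${\bf u}={\bf u}[{\bm{\mu}}]$ , ${\bf u}^{N}={\bf u}^{N}[{\bm{\mu}}]$ , and recall that $\langle G{\bf u}-G{\bf u}^{N},G{\bf v}\rangle_{L}=0$ for all ${\bf v}$ in the span of the reduced basis, in particular for ${\bf v}={\bf u}^{N}$ . Then

```latex
\begin{aligned}
\|G{\bf u}-G{\bf u}^{N}\|_{L}^{2}
 &= \langle G{\bf u}-G{\bf u}^{N},G{\bf u}\rangle_{L}
   -\underbrace{\langle G{\bf u}-G{\bf u}^{N},G{\bf u}^{N}\rangle_{L}}_{=0}
  = \|G{\bf u}\|_{L}^{2}-\langle G{\bf u},G{\bf u}^{N}\rangle_{L},\\
\langle G{\bf u},G{\bf u}^{N}\rangle_{L}
 &= \langle{\bf f}[{\bm{\mu}}],G{\bf u}^{N}\rangle_{L}
  = \sum_{q=1}^{n_{l}}\theta_{q}^{l}({\bm{\mu}})\sum_{i=1}^{N}
    l_{q}({\bf u}^{\delta}[{\bm{\mu}}^{(i)}])\,({\bf c}^{N}[{\bm{\mu}}])_{i}.
\end{aligned}
```

Since the difference of two nearly equal numbers is formed, rounding errors may become visible when the estimated error is very small; this is a general remark on such formulas and not a statement about the experiments reported here.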

3.2. Basis generation

It remains to explain how to determine a suitable reduced basis $\{{\bf u}^{\delta}[{\bm{\mu}}^{(1)}],$ $\dots$ , ${\bf u}^{\delta}[{\bm{\mu}}^{(N)}]\}$ . Given some sufficiently large training set $\mathcal{P}_{\rm train}\subseteq\mathcal{P}$ and some tolerance $\epsilon_{\rm tol}>0$ , we employ a greedy algorithm that starting with ${\bf u}^{0}[{\bm{\mu}}]:=0$ , ${\bm{\mu}}\in\mathcal{P}$ , iteratively adds the snapshot

\displaystyle{\bf u}^{\delta}\big{[}\operatorname*{argmax}\limits_{{\bm{\mu}}\in\mathcal{P}_{\rm train}}\|G[{\bm{\mu}}]({\bf u}[{\bm{\mu}}])-G[{\bm{\mu}}]({\bf u}^{N}[{\bm{\mu}}])\|_{L}\big{]}

to the reduced basis and increments $N$ by one until

\displaystyle\max_{{\bm{\mu}}\in\mathcal{P}_{\rm train}}\|G[{\bm{\mu}}]({\bf u}[{\bm{\mu}}])-G[{\bm{\mu}}]({\bf u}^{N}[{\bm{\mu}}])\|_{L}\leq\epsilon_{\rm tol}.

Note that this procedure terminates at most after $\#\mathcal{P}_{\rm train}$ steps and provides indeed a basis if the discretization error in $U^{\delta}$ is negligible. For possible choices of the training set $\mathcal{P}_{\rm train}$ , we refer, e.g., to [Haa17, Remark 2.44].
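The offline/online splitting and the greedy loop can be sketched in a finite-dimensional toy setting. The following is a minimal illustration under loudly stated assumptions, not the implementation used for the experiments: the "truth" space is $\mathbb{R}^{2}$ (so the reduced space can at most equal it), $G[\mu]=G_{0}+\mu G_{1}$ with hypothetical matrices, and the data ${\bf f}$ is parameter-independent. All function names are invented for this sketch.

```python
from math import sqrt

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def matvec(A, x):
    return [dot(row, x) for row in A]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            factor = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= factor * M[k][j]
    x = [0.0] * n
    for k in reversed(range(n)):
        x[k] = (M[k][n] - sum(M[k][j] * x[j] for j in range(k + 1, n))) / M[k][k]
    return x

# Hypothetical toy problem: G[mu] = G_PARTS[0] + mu * G_PARTS[1] on R^2.
G_PARTS = [[[2.0, 0.0], [0.0, 2.0]], [[1.0, 1.0], [0.0, 1.0]]]
f = [1.0, 2.0]

def theta(mu):
    """Coefficient functions in G[mu] = sum_q theta(mu)[q] * G_PARTS[q]."""
    return [1.0, mu]

def snapshot(mu):
    """Stand-in for the truth solve: G[mu] is square here, so solve G[mu] u = f."""
    t = theta(mu)
    G = [[sum(t[q] * G_PARTS[q][i][j] for q in range(2)) for j in range(2)]
         for i in range(2)]
    return solve(G, f)

def offline(snaps):
    """Precompute parameter-independent reduced matrices and vectors."""
    img = [[matvec(Gq, u) for u in snaps] for Gq in G_PARTS]  # img[q][i] = G_q u_i
    N = len(snaps)
    B = {(q, p): [[dot(img[q][j], img[p][i]) for j in range(N)] for i in range(N)]
         for q in range(2) for p in range(2)}
    ell = [[dot(f, img[q][i]) for i in range(N)] for q in range(2)]
    return B, ell

def online(mu, B, ell):
    """Solve the N x N reduced system; only precomputed quantities are used."""
    t, N = theta(mu), len(ell[0])
    A = [[sum(t[q] * t[p] * B[q, p][i][j] for q, p in B) for j in range(N)]
         for i in range(N)]
    rhs = [sum(t[q] * ell[q][i] for q in range(2)) for i in range(N)]
    c = solve(A, rhs)
    est_sq = dot(f, f) - dot(rhs, c)  # ||f||^2 - <f, G[mu] u^N[mu]>
    return c, sqrt(max(est_sq, 0.0))  # clip tiny negative rounding errors

def greedy(train, tol):
    """Add the snapshot with the largest estimated error until all are <= tol."""
    mus, snaps = [], []
    B, ell = offline(snaps)
    while True:
        errs = [online(mu, B, ell)[1] for mu in train]
        worst = max(range(len(train)), key=errs.__getitem__)
        if errs[worst] <= tol:
            return mus, snaps, B, ell
        mus.append(train[worst])
        snaps.append(snapshot(train[worst]))
        B, ell = offline(snaps)

train = [k / 10 for k in range(11)]
mus, snaps, B, ell = greedy(train, 1e-3)
```

Here the bilinear form is separable with coefficient products `t[q] * t[p]`, mirroring how separability of $G[{\bm{\mu}}]$ carries over to $\langle G[{\bm{\mu}}]\cdot,G[{\bm{\mu}}]\cdot\rangle_{L}$ . The online routine never touches `G_PARTS` or the snapshots themselves.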

3.3. Numerical experiment

We consider the example from [GMU17]: Let $\Omega=(0,1)$ and $T:=0.3$ . We consider the parameter set $\mathcal{P}:=[0.5,1.5]\times[0,1]\times[0,1]\subset\mathbb{R}^{3}$ with ${\bf A}[{\bm{\mu}}]:=\mu_{1}$ , ${\bf b}[{\bm{\mu}}]:=\mu_{2}$ , and $c[{\bm{\mu}}]=\mu_{3}$ . Moreover, we choose the right-hand sides $f_{1}$ , ${\bf f}_{2}$ , and $u_{0}$ independently of the parameters with

\displaystyle f_{1}(t,x):=\sin(2\pi x)\big{(}(4\pi^{2}+0.5)\cos(4\pi t)-4\pi\sin(4\pi t)\big{)}+\pi\cos(2\pi x)\cos(4\pi t)

on the space-time cylinder $Q=I\times\Omega$ , ${\bf f}_{2}:=0$ , and $u_{0}(x):=\sin(2\pi x)$ on $\Omega$ , which corresponds to the solution $u[(1,0.5,0.5)](t,x)=\sin(2\pi x)\cos(4\pi t)$ .

Figure 3.1. Approximation error of greedy algorithm for reduced basis method of Section 3.3.

We divide both $\Omega$ and $I=(0,T)$ into $2^{6}$ subintervals and choose $U^{\delta}$ to be the $2$ -fold Cartesian product of continuous piecewise bi-cubic functions³, with the first coordinate space restricted by homogeneous Dirichlet boundary conditions on $\Sigma$ . The training set $\mathcal{P}_{\rm train}$ is chosen as $17$ equidistantly distributed points in $\mathcal{P}$ in each direction, and the tolerance $\epsilon_{\rm tol}$ is chosen as $10^{-3}$ . Figure 3.1 displays the approximation error $\max_{{\bm{\mu}}\in\mathcal{P}_{\rm train}}\|G[{\bm{\mu}}]({\bf u}[{\bm{\mu}}])-G[{\bm{\mu}}]({\bf u}^{N}[{\bm{\mu}}])\|_{L}$ throughout the greedy algorithm of the offline phase with the final $N=21$ , where the $y$ -axis is scaled logarithmically. As expected from [BCD⁺11], we observe exponential convergence. In Figure 3.2, we plot the discretization error $\|G[{\bm{\mu}}]({\bf u}[{\bm{\mu}}])-G[{\bm{\mu}}]({\bf u}^{N}[{\bm{\mu}}])\|_{L}$ of the reduced basis method applied for ${\bm{\mu}}$ in the test sets $\big{\{}(\mu_{1},0,0)\,:\,\mu_{1}\in[0.5,1.5]\big{\}}$ and $\big{\{}(0.5,\mu_{2},0.75)\,:\,\mu_{2}\in[0,1]\big{\}}$ . For comparison, we also plot the best possible error $\|G[{\bm{\mu}}]({\bf u}[{\bm{\mu}}])-G[{\bm{\mu}}]({\bf u}^{\delta}[{\bm{\mu}}])\|_{L}$ that one could hope for with the reduced basis method. Although the greedy algorithm only guarantees that these errors are below $\epsilon_{\rm tol}=10^{-3}$ on the training set $\mathcal{P}_{\rm train}$ , this bound appears to hold true also on the considered test sets.

³ The reason why we consider bi-cubic instead of bi-affine elements as in [GMU17] is that we measure in a stronger norm but still want the approximation error in the space $U^{\delta}$ to be negligible.

While a fair comparison to the space-time results of [GMU17] is hard, we only mention that the considered errors, although measured in the stronger norm $\|G(\cdot)\|_{L}\eqsim\|\cdot\|_{U}$ , are of similar magnitude as the errors of [GMU17] measured in (an approximation of) the norm $\|\cdot\|_{X}$ . The number of reduced basis functions to achieve an accuracy of $\epsilon_{\rm tol}=10^{-3}$ in [GMU17] is given by $N=12$ . It is also notable that, in contrast to the estimator of [GMU17], which is based on the residual corresponding to (the first components of) ${\bf u}^{\delta}[{\bm{\mu}}]-{\bf u}^{N}[{\bm{\mu}}]$ , the estimator considered here is provably equivalent to the actual norm of the error $\|{\bf u}[{\bm{\mu}}]-{\bf u}^{N}[{\bm{\mu}}]\|_{U}$ and can also be efficiently computed in the online phase.

4. Optimal control problems

Let ${\bf f}^{\star}=(f_{1}^{\star},{\bf f}_{2}^{\star},u_{0}^{\star})\in L$ be fixed, and let $Z$ be a Hilbert space that is continuously embedded in $L$ . We consider (2.10) with ${\bf f}=(f_{1},{\bf f}_{2},u_{0})={\bf f}^{\star}+{\bf z}$ , where ${\bf z}\in Z$ is a control variable. The corresponding solution ${\bf u}\in U$ is then called state. For $W$ being a further Hilbert space, $F\in\mathcal{L}(U,W)$ , desired observation $w^{\star}\in W$ , and a parameter $\varrho>0$ , we want to minimize the functional


\displaystyle J({\bf u},{\bf z}):=\frac{1}{2}\|F{\bf u}-w^{\star}\|_{W}^{2}+\frac{\varrho}{2}\|{\bf z}\|_{Z}^{2}

(4.1a)

over

\displaystyle\big{\{}({\bf u},{\bf z})\in U\times Z\,:\,G{\bf u}={\bf f}^{\star}+{\bf z}\big{\}}.

(4.1b)

Remark 4.1.

Let $F=\widetilde{F}\circ({\bf u}\mapsto u_{1})$ for some $\widetilde{F}\in\mathcal{L}(X,W)$ and $Z=L_{2}(Q)\times L_{2}(Q)^{d}\times Z_{3}$ for some continuously embedded subspace $Z_{3}$ of $L_{2}(\Omega)$ . Then, (4.1) is equivalent to minimizing


\displaystyle\frac{1}{2}\|\widetilde{F}u-w^{\star}\|_{W}^{2}+\frac{\varrho}{2}(\|\widetilde{z}\|^{2}_{Y^{\prime}}+\|z_{3}\|_{Z_{3}}^{2})

(4.2a)

over

\displaystyle\big{\{}(u,\widetilde{z},z_{3})\in X\times Y^{\prime}\times Z_{3}\,:\,(B,\gamma_{0})u=(f_{1}^{\star}+\operatorname{div}_{\bf x}{\bf f}_{2}^{\star}+\widetilde{z},u_{0}^{\star}+z_{3})\big{\}}.

(4.2b)

Indeed, from Theorem 2.2 we know that for $\widetilde{z}=z_{1}+\operatorname{div}_{\bf x}{\bf z}_{2}$ , $(u,\widetilde{z},z_{3})$ is in the set (4.2b) if and only if $({\bf u},{\bf z})$ with ${\bf u}=(u,-{\bf A}\nabla_{\bf x}u-{\bf f}_{2}^{\star}-{\bf z}_{2})$ is in the set (4.1b). Knowing that for any $(z_{1},{\bf z}_{2})\in L_{2}(Q)\times L_{2}(Q)^{d}$ , $\widetilde{z}:=z_{1}+\operatorname{div}_{\bf x}{\bf z}_{2}\in Y^{\prime}$ and conversely, any $\widetilde{z}\in Y^{\prime}$ can be written as $\widetilde{z}=z_{1}+\operatorname{div}_{\bf x}{\bf z}_{2}$ , where, as shown in Remark 2.4, for the pair $(z_{1},{\bf z}_{2})\in L_{2}(Q)\times L_{2}(Q)^{d}$ with smallest $\|z_{1}\|_{L_{2}(Q)}^{2}+\|{\bf z}_{2}\|_{L_{2}(Q)^{d}}^{2}$ it holds that $\|\widetilde{z}\|_{Y^{\prime}}^{2}=\|z_{1}\|_{L_{2}(Q)}^{2}+\|{\bf z}_{2}\|_{L_{2}(Q)^{d}}^{2}$ , the proof of equivalence of (4.1) and (4.2) is completed.

In particular, this shows that (4.1) covers the recently considered optimal control problem from [LSTY21a], where $\widetilde{F}={\rm Id}:X\to L_{2}(Q)$ and $Z_{3}=\{0\}$ .

4.1. Formulation as saddle-point problem

Writing (2.10) with right-hand side ${\bf f}={\bf f}^{\star}+{\bf z}$ as $\langle G{\bf u},G{\bf v}\rangle_{L}=\langle{\bf f}^{\star}+{\bf z},G{\bf v}\rangle_{L}$ for all ${\bf v}\in U$ , and introducing the bilinear forms

a({\bf u},{\bf v}):=\langle G{\bf u}\,,\,G{\bf v}\rangle_{L},\,\,b({\bf z},{\bf v}):=-\langle{\bf z}\,,\,G{\bf v}\rangle_{L},\,\,d({\bf u},{\bf v}):=\langle F{\bf u}\,,\,F{\bf v}\rangle_{W},\,\,e({\bf z},{\bf y}):=\varrho\,\langle{\bf z}\,,\,{\bf y}\rangle_{Z},

and the linear forms

f^{\star}({\bf v}):=\langle{\bf f}^{\star}\,,\,G{\bf v}\rangle_{L},\quad g^{\star}\big{(}({\bf u},{\bf z})\big{)}:=\langle F{\bf u}\,,\,w^{\star}\rangle_{W},

for ${\bf u},{\bf v}\in U$ and ${\bf z},{\bf y}\in Z$ , and noting that the constant term $\|w^{\star}\|_{W}^{2}$ in $\|F{\bf u}-w^{\star}\|_{W}^{2}$ can be neglected for minimization, the optimal control problem (4.1) can be rewritten as

\displaystyle\operatorname*{argmin}\limits_{\{({\bf u},{\bf z})\in U\times Z:\,a({\bf u},{\bf v})+b({\bf z},{\bf v})=f^{\star}({\bf v})\,({\bf v}\in U)\}}\frac{1}{2}d({\bf u},{\bf u})+\frac{1}{2}e({\bf z},{\bf z})-g^{\star}\big{(}({\bf u},{\bf z})\big{)}.

(4.3)
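The rewriting can be verified by expanding the functional (4.1a) with the forms just introduced; the following is a routine computation added as a sketch:

```latex
\begin{aligned}
J({\bf u},{\bf z})
 &= \tfrac{1}{2}\langle F{\bf u},F{\bf u}\rangle_{W}-\langle F{\bf u},w^{\star}\rangle_{W}
    +\tfrac{1}{2}\|w^{\star}\|_{W}^{2}+\tfrac{\varrho}{2}\langle{\bf z},{\bf z}\rangle_{Z}\\
 &= \tfrac{1}{2}d({\bf u},{\bf u})+\tfrac{1}{2}e({\bf z},{\bf z})
    -g^{\star}\big(({\bf u},{\bf z})\big)+\tfrac{1}{2}\|w^{\star}\|_{W}^{2}.
\end{aligned}
```

The last term does not depend on $({\bf u},{\bf z})$ , so dropping it does not change the minimizer, and the constraint in (4.3) is the variational form of $G{\bf u}={\bf f}^{\star}+{\bf z}$ .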

The following lemma implies well-posedness and equivalence to a saddle-point problem.

Lemma 4.2.

Let $U$ and $Z$ be arbitrary Hilbert spaces, $a:U\times U\to\mathbb{R}$ , $b:Z\times U\to\mathbb{R}$ , $d:U\times U\to\mathbb{R}$ , $e:Z\times Z\to\mathbb{R}$ arbitrary bounded bilinear forms such that $a$ and $e$ are coercive, $d$ is positive semi-definite, and $f^{\star}:U\to\mathbb{R}$ , $g^{\star}:U\times Z\to\mathbb{R}$ are bounded linear forms. Then there exists a unique $({\bf u},{\bf z},{\bf p})\in U\times Z\times U$ that solves the saddle-point problem

\begin{array}[]{lcll}d({\bf u},{\bf v})+e({\bf z},{\bf y})+a({\bf v},{\bf p})+b({\bf y},{\bf p})&=&g^{\star}\big{(}({\bf v},{\bf y})\big{)}&\text{for all }({\bf v},{\bf y})\in U\times Z,\\ a({\bf u},{\bf q})+b({\bf z},{\bf q})&=&f^{\star}({\bf q})&\text{for all }{\bf q}\in U,\end{array}

(4.4)

and

\displaystyle\|{\bf u}\|_{U}+\|{\bf z}\|_{Z}+\|{\bf p}\|_{U}\leq C_{\text{\rm stab}}\big{(}\|f^{\star}\|_{U^{\prime}}+\|g^{\star}\|_{U^{\prime}\times Z^{\prime}}\big{)}

(4.5)

with a constant $C_{\text{\rm stab}}>0$ that depends only on the continuity constants of $a,b,d,e$ and the coercivity constants of $a,e$ .

Assuming symmetry of $d$ and $e$ , the pair $({\bf u},{\bf z})\in U\times Z$ is the unique solution of (4.3). In this setting, ${\bf p}\in U$ is known as the co-state.

Proof.

For the first statement, we only have to show that the bilinear form $\big{(}({\bf u},{\bf z}),({\bf v},{\bf y})\big{)}\mapsto d({\bf u},{\bf v})+e({\bf z},{\bf y})$ is coercive on the kernel of the operator $B:U\times Z\to U^{\prime}$ defined by $(B({\bf u},{\bf z}))({\bf q}):=a({\bf u},{\bf q})+b({\bf z},{\bf q})$ , and that there holds the LBB (Ladyshenskaja–Babuška–Brezzi) condition

\displaystyle\sup_{({\bf v},{\bf y})\in U\times Z}\frac{a({\bf v},{\bf p})+b({\bf y},{\bf p})}{\|{\bf v}\|_{U}+\|{\bf y}\|_{Z}}\gtrsim\|{\bf p}\|_{U}\quad\text{for all }{\bf p}\in U.

Choosing ${\bf v}={\bf p}$ as well as ${\bf y}=0$ , coercivity of $a$ gives the latter inequality. To see coercivity on the kernel, let $({\bf u},{\bf z})\in U\times Z$ with $a({\bf u},{\bf q})+b({\bf z},{\bf q})=0$ for all ${\bf q}\in U$ . Then, coercivity of $a$ and continuity of $b$ yield that

\displaystyle\|{\bf u}\|_{U}^{2}\lesssim a({\bf u},{\bf u})=|-b({\bf z},{\bf u})|\lesssim\|{\bf z}\|_{Z}\|{\bf u}\|_{U},

or $\|{\bf u}\|_{U}\lesssim\|{\bf z}\|_{Z}$ . With positive semi-definiteness of $d$ and the coercivity of $e$ , we conclude that

\displaystyle d({\bf u},{\bf u})+e({\bf z},{\bf z})\geq e({\bf z},{\bf z})\gtrsim\|{\bf z}\|_{Z}^{2}\gtrsim\|{\bf u}\|_{U}^{2}+\|{\bf z}\|_{Z}^{2}

and thus the proof of the first statement.

One easily verifies that (4.3) has a unique solution $({\bf u},{\bf z})\in U\times Z$ that, assuming symmetric $d$ and $e$ , solves $a({\bf u},{\bf q})+b({\bf z},{\bf q})=f^{\star}({\bf q})$ for all ${\bf q}\in U$ , and $d({\bf u},{\bf v})+e({\bf z},{\bf y})=g^{\star}\big{(}({\bf v},{\bf y})\big{)}$ for all $({\bf v},{\bf y})\in U\times Z$ for which $a({\bf v},{\bf q})+b({\bf y},{\bf q})=0$ for all ${\bf q}\in U$ . It is known that, thanks to the LBB condition, these two equations uniquely determine the first two components of the solution $({\bf u},{\bf z},{\bf p})$ of (4.4), which proves the second statement. ∎
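To make the statement of Lemma 4.2 concrete, the following minimal sketch assembles (4.4) for a finite-dimensional toy problem with $U=\mathbb{R}^{2}$ and $Z=\mathbb{R}$ and compares its solution with the minimizer of (4.3) obtained by eliminating ${\bf u}$ through the constraint. All matrices, right-hand sides, and helper names (`solve`, `dot`, `matvec`, `transpose`) are hypothetical choices made for this illustration and are not taken from the experiments below; exact rational arithmetic is used so that the comparison is an identity rather than a floating-point approximation.

```python
from fractions import Fraction

def solve(M, rhs):
    """Solve M x = rhs by Gauss-Jordan elimination in exact rational arithmetic."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        aug[col] = [x / aug[col][col] for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                fac = aug[r][col]
                aug[r] = [x - fac * y for x, y in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def matvec(M, x):
    return [dot(row, x) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

# Hypothetical toy data with U = R^2 and Z = R^1 (not taken from the paper).
A = [[2, 1], [1, 3]]       # a(u, v) = u^T A v, coercive
B = [[-1, 0]]              # b(z, v) = z^T B v
D = [[1, 0], [0, 0]]       # d(u, v) = u^T D v, symmetric, only semi-definite
rho = Fraction(1, 100)     # e(z, y) = rho * z * y
f_star = [1, 2]            # f*(q) = f_star . q
g_U, g_Z = [1, 0], [0]     # g*((v, y)) = g_U . v + g_Z . y

# Saddle-point system (4.4); unknowns ordered as (u_1, u_2, z, p_1, p_2).
K = [
    [D[0][0], D[1][0], 0, A[0][0], A[0][1]],   # test function v = e_1
    [D[0][1], D[1][1], 0, A[1][0], A[1][1]],   # test function v = e_2
    [0, 0, rho, B[0][0], B[0][1]],             # test function y = 1
    [A[0][0], A[1][0], B[0][0], 0, 0],         # test function q = e_1
    [A[0][1], A[1][1], B[0][1], 0, 0],         # test function q = e_2
]
sol = solve(K, g_U + g_Z + f_star)
u_kkt, z_kkt, p_kkt = sol[0:2], sol[2], sol[3:5]

# Cross-check via (4.3): the constraint gives u = u0 + t * z, then minimize over z.
AT = transpose(A)
u0 = solve(AT, f_star)                    # A^T u0 = f*
t = solve(AT, [-B[0][0], -B[0][1]])       # A^T t = -B^T
z_red = (dot(g_U, t) + g_Z[0] - dot(t, matvec(D, u0))) / (dot(t, matvec(D, t)) + rho)
u_red = [a + b * z_red for a, b in zip(u0, t)]

print(u_kkt == u_red, z_kkt == z_red)
```

The reduced formula for `z_red` uses the symmetry of $d$ and $e$, in line with the assumption of the second statement of the lemma; the third solution component `p_kkt` plays the role of the co-state.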

Clearly, Lemma 4.2 is also applicable for arbitrary closed subspaces $U^{\delta}\subset U$ and $Z^{\delta}\subset Z$ with the same uniform constant $C_{\text{\rm stab}}>0$ . In particular, there exists a unique corresponding Galerkin solution $({\bf u}^{\delta},{\bf z}^{\delta},{\bf p}^{\delta})\in U^{\delta}\times Z^{\delta}\times U^{\delta}$ of (4.4), which is even quasi-optimal, i.e.,

\displaystyle\begin{split}&\|{\bf u}-{\bf u}^{\delta}\|_{U}+\|{\bf z}-{\bf z}^{\delta}\|_{Z}+\|{\bf p}-{\bf p}^{\delta}\|_{U}\\ &\qquad\leq C_{\text{\rm opt}}\inf_{({\bf v},{\bf y},{\bf q})\in U^{\delta}\times Z^{\delta}\times U^{\delta}}\big{(}\|{\bf u}-{\bf v}\|_{U}+\|{\bf z}-{\bf y}\|_{Z}+\|{\bf p}-{\bf q}\|_{U}\big{)},\end{split}

(4.6)

where $C_{\text{\rm opt}}>0$ is proportional to the product of $C_{\text{\rm stab}}$ and the maximum $C_{\text{\rm cont}}$ of the continuity constants of $a$ , $b$ , $d$ , and $e$ (see, e.g., [SW21b, Rem. 3.2]). Fixing the parabolic PDE (2.1), the spaces $Z$ and $W$ , and the mapping $F$ , from $e({\bf z},{\bf y})=\varrho\,\langle{\bf z}\,,\,{\bf y}\rangle_{Z}$ one infers that $C_{\text{\rm cont}}\eqsim\max(\varrho,1)$ , and, since the coercivity constant of $e$ is proportional to $\varrho$ , that $C_{\text{\rm stab}}\eqsim\max(\frac{1}{\varrho},1)$ (see, e.g., [EG21b, Thm. 49.13]), and thus

\displaystyle C_{\text{\rm opt}}\eqsim\max(\tfrac{1}{\varrho},\varrho).

4.2. Optimality system of PDEs

Let $\widehat{L}:=\operatorname{clos}_{L}Z$ , so that $Z\hookrightarrow\widehat{L}\simeq\widehat{L}^{\prime}\hookrightarrow Z^{\prime}$ is a Gelfand triple, let $\Pi\in\mathcal{L}(L,L)$ be the orthogonal projector onto $\widehat{L}$ , $\bm{\ell}:=G{\bf p}$ , and let $C\in\mathcal{L}(Z,Z^{\prime})$ be such that $\langle\cdot,\cdot\rangle_{Z}=\langle C\cdot,\cdot\rangle_{\widehat{L}}$ .

Then the first equation in (4.4) reads as

\varrho\langle C{\bf z},{\bf y}\rangle_{\widehat{L}}-\langle\Pi\bm{\ell},{\bf y}\rangle_{\widehat{L}}+\langle G{\bf v},\bm{\ell}\rangle_{L}=\langle F{\bf v},w^{\star}-F{\bf u}\rangle_{W}\text{ for all }({\bf v},{\bf y})\in U\times Z,

i.e., ${\bf z}=\frac{1}{\varrho}C^{-1}\Pi\bm{\ell}$ and $\bm{\ell}=G^{-*}(F^{*}(w^{\star}-F{\bf u}))$ with $G^{*}$ and $F^{*}$ the Hilbert adjoints of $G\in\mathcal{L}(U,L)$ and $F\in\mathcal{L}(U,W)$ , respectively. Together with ${\bf u}=G^{-1}({\bf f}^{\star}+\frac{1}{\varrho}C^{-1}\Pi\bm{\ell})$ , this last equation forms the coupled optimality system associated to our optimal control problem.

To derive a first-order PDE for $\bm{\ell}$ , we consider the case that

F{\bf v}=(\chi_{1}v_{1},\chi_{2}{\bf v}_{2},\chi_{3}v_{1}(T,\cdot)),\,W=L_{2}(Q)\times L_{2}(Q)^{d}\times L_{2}(\Omega),\,w^{\star}=(w_{1}^{\star},{\bf w}_{2}^{\star},w_{3}^{\star}),

(4.7)

for some $\chi_{1},\chi_{2}\in L_{\infty}(Q)$ , $\chi_{3}\in L_{\infty}(\Omega)$ , possibly with one or more $\chi_{i}$ being zero. We further make the additional regularity assumption $\operatorname{div}_{{\bf x}}{\bf b}\in L_{\infty}(Q)$ . With the outer normal vector ${\bf n}_{\bf x}$ on $\partial\Omega$ , (formal) integration-by-parts shows that

\begin{split}\langle G{\bf v},\bm{\ell}\rangle_{L}=&\langle-\partial_{t}\ell_{1}+\operatorname{div}_{\bf x}{\bf A}\bm{\ell}_{2}-{\bf b}\cdot\nabla_{\bf x}\ell_{1}+(c-\operatorname{div}_{\bf x}{\bf b})\ell_{1}\,,\,v_{1}\rangle_{L_{2}(Q)}\\ &-\langle\nabla_{\bf x}\ell_{1}+\bm{\ell}_{2}\,,\,{\bf v}_{2}\rangle_{L_{2}(Q)^{d}}+\langle\ell_{1}(t,\cdot)\,,\,v_{1}(t,\cdot)\rangle_{L_{2}(\Omega)}\big{|}_{t=0}^{t=T}\\ &+\int_{0}^{T}\langle\ell_{1}(t,\cdot){\bf v}_{2}(t,\cdot)\,,\,{\bf n}_{\bf x}\rangle_{L_{2}(\partial\Omega)}\,{\rm d}t+\langle\ell_{3}\,,\,v_{1}(0,\cdot)\rangle_{L_{2}(\Omega)}.\end{split}

(4.8)

From $\langle G{\bf v},\bm{\ell}\rangle_{L}=\langle F{\bf v},w^{\star}-F{\bf u}\rangle_{W}$ for all ${\bf v}\in U$ , we infer that $\ell_{3}=\ell_{1}(0,\cdot)$ , and

\begin{array}[]{rcll}-\partial_{t}\ell_{1}+\operatorname{div}_{\bf x}{\bf A}\bm{\ell}_{2}-{\bf b}\cdot\nabla_{\bf x}\ell_{1}+(c-\operatorname{div}_{\bf x}{\bf b})\ell_{1}&=&\chi_{1}(w_{1}^{\star}-\chi_{1}u_{1})&\text{ in }Q,\\ -\nabla_{\bf x}\ell_{1}-\bm{\ell}_{2}&=&\chi_{2}({\bf w}_{2}^{\star}-\chi_{2}{\bf u}_{2})&\text{ in }Q,\\ \ell_{1}&=&0&\text{ on }\Sigma,\\ \ell_{1}(T,\cdot)&=&\chi_{3}(w_{3}^{\star}-\chi_{3}u_{1}(T,\cdot))&\text{ on }\Omega,\end{array}

(4.9)

i.e., $\ell_{1}$ is the solution of a backward parabolic problem.

4.3. A posteriori error estimation

The well-posedness of (4.4) shows that for $({\bf u},{\bf z},{\bf p})$ being its solution, and any $({\bf u}^{\delta},{\bf z}^{\delta},{\bf p}^{\delta})\in U\times Z\times U$ ,

\displaystyle\begin{split}&\|{\bf u}-{\bf u}^{\delta}\|^{2}_{U}+\|{\bf z}-{\bf z}^{\delta}\|^{2}_{Z}+\|{\bf p}-{\bf p}^{\delta}\|^{2}_{U}\\
&\eqsim\sup_{0\neq({\bf v},{\bf y},{\bf q})\in U\times Z\times U}\frac{\big{[}d({\bf u}-{\bf u}^{\delta},{\bf v})+e({\bf z}-{\bf z}^{\delta},{\bf y})+a({\bf v},{\bf p}-{\bf p}^{\delta})+b({\bf y},{\bf p}-{\bf p}^{\delta})+a({\bf u}-{\bf u}^{\delta},{\bf q})+b({\bf z}-{\bf z}^{\delta},{\bf q})\big{]}^{2}}{\|{\bf v}\|^{2}_{U}+\|{\bf y}\|^{2}_{Z}+\|{\bf q}\|^{2}_{U}}\\
&=\sup_{0\neq({\bf v},{\bf y},{\bf q})\in U\times Z\times U}\frac{\big{[}g^{\star}(({\bf v},{\bf y}))+f^{\star}({\bf q})-\Big{(}d({\bf u}^{\delta},{\bf v})+e({\bf z}^{\delta},{\bf y})+a({\bf v},{\bf p}^{\delta})+b({\bf y},{\bf p}^{\delta})+a({\bf u}^{\delta},{\bf q})+b({\bf z}^{\delta},{\bf q})\Big{)}\big{]}^{2}}{\|{\bf v}\|^{2}_{U}+\|{\bf y}\|^{2}_{Z}+\|{\bf q}\|^{2}_{U}}\\
&=\sup_{0\neq({\bf v},{\bf y},{\bf q})\in U\times Z\times U}\frac{\big{[}\langle F{\bf v},w^{\star}-F{\bf u}^{\delta}\rangle_{W}-\langle G{\bf v},G{\bf p}^{\delta}\rangle_{L}+\langle{\bf f}^{\star}+{\bf z}^{\delta}-G{\bf u}^{\delta},G{\bf q}\rangle_{L}+\langle\Pi G{\bf p}^{\delta}-\varrho C{\bf z}^{\delta},{\bf y}\rangle_{\widehat{L}}\big{]}^{2}}{\|{\bf v}\|^{2}_{U}+\|{\bf y}\|^{2}_{Z}+\|{\bf q}\|^{2}_{U}}\\
&=\sup_{0\neq{\bf v}\in U}\frac{\big{[}\langle F{\bf v},w^{\star}-F{\bf u}^{\delta}\rangle_{W}-\langle G{\bf v},G{\bf p}^{\delta}\rangle_{L}\big{]}^{2}}{\|{\bf v}\|^{2}_{U}}+\sup_{0\neq{\bf q}\in U}\frac{\langle{\bf f}^{\star}+{\bf z}^{\delta}-G{\bf u}^{\delta},G{\bf q}\rangle_{L}^{2}}{\|{\bf q}\|_{U}^{2}}+\|\Pi G{\bf p}^{\delta}-\varrho C{\bf z}^{\delta}\|^{2}_{Z^{\prime}}\\
&\eqsim\sup_{0\neq{\bf v}\in U}\frac{\big{[}\langle F{\bf v},w^{\star}-F{\bf u}^{\delta}\rangle_{W}-\langle G{\bf v},G{\bf p}^{\delta}\rangle_{L}\big{]}^{2}}{\|{\bf v}\|^{2}_{U}}+\|{\bf f}^{\star}+{\bf z}^{\delta}-G{\bf u}^{\delta}\|_{L}^{2}+\|\Pi G{\bf p}^{\delta}-\varrho C{\bf z}^{\delta}\|^{2}_{Z^{\prime}}\end{split}

(4.10)

from $G\colon U\rightarrow L$ being a linear isomorphism. The hidden constants absorbed by the two $\eqsim$ -symbols can be quantified in terms of the well-posedness of (4.4) and that of $G$ .

The second term in (4.10) is computable, and in any case when the topology of $Z$ equals that of $\widehat{L}$ and the application of $\Pi$ is computable, so is the third. To estimate the first term, we briefly discuss two possibilities.

Remark 4.3.

For the case that $Z=L_{2}(Q)\times\{0\}\times\{0\}$ and $(\chi_{1},\chi_{2},\chi_{3})=(\mathbbold{1},0,0)$ , a functional estimator was introduced in [LS21]. For arbitrary approximations of the state $u\in X$ and the co-state $p\in X$ , this estimator is a computable guaranteed upper bound for the error in the $X$ -norm.

4.3.1. A reliable estimator

We consider the case that $F$ is of the form given in (4.7) and abbreviate $\bm{\ell}^{\delta}=(\ell_{1}^{\delta},\bm{\ell}_{2}^{\delta},\ell_{3}^{\delta}):=G{\bf p}^{\delta}$ . In view of (4.8)–(4.9), any (sufficiently smooth) $\widetilde{\bm{\ell}}^{\delta}=(\widetilde{\ell}_{1}^{\delta},\widetilde{\bm{\ell}}_{2}^{\delta},\widetilde{\ell}_{3}^{\delta})$ with $(\widetilde{\ell}_{1}^{\delta})|_{\Sigma}=0$ and $\widetilde{\ell}_{3}^{\delta}=\widetilde{\ell}_{1}^{\delta}(0,\cdot)$ provides the upper bound

\displaystyle\begin{split}&\sup_{0\neq{\bf v}\in U}\frac{\langle F{\bf v},w^{\star}-F{\bf u}^{\delta}\rangle_{W}-\langle G{\bf v},G{\bf p}^{\delta}\rangle_{L}}{\|{\bf v}\|_{U}}\leq\|G\|_{\mathcal{L}(U,L)}\|\widetilde{\bm{\ell}}^{\delta}-\bm{\ell}^{\delta}\|_{L}\\
&\qquad+\|-\partial_{t}\widetilde{\ell}_{1}^{\delta}+\operatorname{div}_{\bf x}{\bf A}\widetilde{\bm{\ell}}_{2}^{\delta}-{\bf b}\cdot\nabla_{\bf x}\widetilde{\ell}_{1}^{\delta}+(c-\operatorname{div}_{\bf x}{\bf b})\widetilde{\ell}_{1}^{\delta}-\chi_{1}(w_{1}^{\star}-\chi_{1}u_{1}^{\delta})\|_{L_{2}(Q)}\\
&\qquad+\|-\nabla_{\bf x}\widetilde{\ell}_{1}^{\delta}-\widetilde{\bm{\ell}}_{2}^{\delta}-\chi_{2}({\bf w}_{2}^{\star}-\chi_{2}{\bf u}_{2}^{\delta})\|_{L_{2}(Q)^{d}}\\
&\qquad+\|{\bf v}\mapsto v_{1}(T,\cdot)\|_{\mathcal{L}(U,L_{2}(\Omega))}\|\widetilde{\ell}_{1}^{\delta}(T,\cdot)-\chi_{3}(w_{3}^{\star}-\chi_{3}u_{1}^{\delta}(T,\cdot))\|_{L_{2}(\Omega)}.\end{split}

Choosing $(\widetilde{\ell}_{1}^{\delta},\widetilde{\bm{\ell}}_{2}^{\delta})\in U$ as the solution of (4.9) with ${\bf u}$ replaced by the computable approximation ${\bf u}^{\delta}$ , only the term $\|G\|_{\mathcal{L}(U,L)}\|\widetilde{\bm{\ell}}^{\delta}-\bm{\ell}^{\delta}\|_{L}$ would be present in the upper bound, as then $\langle F{\bf v},w^{\star}-F{\bf u}^{\delta}\rangle_{W}=\langle G{\bf v},\widetilde{\bm{\ell}}^{\delta}\rangle_{L}$ . In this case, the term $\|\widetilde{\bm{\ell}}^{\delta}-\bm{\ell}^{\delta}\|_{L}$ would even be a reliable and efficient estimator for the first term in (4.10). However, in general the mentioned solution is not computable, so that one has to approximate it, e.g., by a Galerkin method. Inspired by [LS21], a computationally cheaper option would be to simply post-process $\bm{\ell}^{\delta}$ to obtain $\widetilde{\bm{\ell}}^{\delta}$ , i.e., to apply a suitable smoothing quasi-interpolator to the in general non-smooth $\bm{\ell}^{\delta}$ .

4.3.2. An efficient estimator

A computable efficient estimator is obtained by replacing the supremum over ${\bf v}\in U$ in the first term in (4.10) by a supremum over a finite-dimensional subspace $U^{\widetilde{\delta}}\subset U$ . Since for $({\bf u}^{\delta},{\bf z}^{\delta},{\bf p}^{\delta})$ being the Galerkin solution of (4.4) from $U^{\delta}\times Z^{\delta}\times U^{\delta}$ , the numerator in the first term vanishes for ${\bf v}\in U^{\delta}$ , $U^{\widetilde{\delta}}$ needs to be a ‘sufficient’ enlargement of $U^{\delta}$ .

The resulting estimator can be proven to be even equivalent to the error whenever there exists a projector $P^{\delta}\in\mathcal{L}(U,U)$ , bounded uniformly in $\delta$ , with $\operatorname{ran}P^{\delta}\subset U^{\widetilde{\delta}}$ and $\langle F({\rm Id}-P^{\delta}){\bf v},w^{\star}-F{\bf v}^{\delta}\rangle_{W}-\langle G({\rm Id}-P^{\delta}){\bf v},G{\bf q}^{\delta}\rangle_{L}=0$ for all ${\bf v}\in U$ and ${\bf v}^{\delta},{\bf q}^{\delta}\in U^{\delta}$ . For $U^{\delta}$ being a finite element space with respect to a general partition of $Q$ , i.e., not being a product of partitions of $I$ and $\Omega$ , the construction of such ‘Fortin’ projectors however seems hard.

4.4. Numerical experiments

We consider the heat equation, i.e., ${\bf A}:={\bf Id}$ , ${\bf b}:=0$ , and $c:=0$ in (2.1). Moreover, we set ${\bf f}^{\star}:=(f_{1}^{\star},0,u_{0}^{\star})$ , $W:=L_{2}(Q)$ , $F\colon{\bf v}\mapsto v_{1}$ , $Z:=L_{2}(Q)\times\{0\}\times\{0\}\simeq L_{2}(Q)$ . In particular, with $u=u_{1}$ the optimal control problem (4.1) in strong form reads as

\displaystyle\operatorname*{argmin}\limits_{\{(u,z)\in X\times L_{2}(Q):\,\partial_{t}u-\Delta_{\bf x}u=f_{1}^{\star}+z\wedge u(0,\cdot)=u^{\star}_{0}\}}\frac{1}{2}\|u-w^{\star}\|_{L_{2}(Q)}^{2}+\frac{\varrho}{2}\|z\|_{L_{2}(Q)}^{2}.

(4.11)

In this case, the optimality system derived in Section 4.2 reads as

\begin{array}[]{rcll}\partial_{t}u_{1}-\Delta_{\bf x}u_{1}&=&f_{1}^{\star}+\frac{1}{\varrho}\ell_{1}&\text{ on }Q,\\ u_{1}&=&0&\text{ on }\Sigma,\\ u_{1}(0,\cdot)&=&u^{\star}_{0}&\text{ on }\Omega,\end{array}

and

\begin{array}[]{rcll}-\partial_{t}\ell_{1}-\Delta_{\bf x}\ell_{1}&=&w^{\star}-u_{1}&\text{ on }Q,\\ \ell_{1}&=&0&\text{ on }\Sigma,\\ \ell_{1}(T,\cdot)&=&0&\text{ on }\Omega,\end{array}

where ${\bf u}_{2}=-\nabla_{\bf x}u_{1}$ , $\bm{\ell}_{2}=-\nabla_{\bf x}\ell_{1}$ , and $\ell_{3}=\ell_{1}(0,\cdot)$ . By prescribing arbitrary $u_{1}\in X$ , $\ell_{1}\in L_{2}(Q)$ with $\partial_{t}u_{1}-\Delta_{\bf x}u_{1}\in L_{2}(Q)$ , $-\partial_{t}\ell_{1}-\Delta_{\bf x}\ell_{1}\in L_{2}(Q)$ , $\ell_{1}(0,\cdot)\in L_{2}(\Omega)$ , $\ell_{1}=0$ on $\Sigma$ , and $\ell_{1}(T,\cdot)=0$ , the parameters $f_{1}^{\star}$ , $u^{\star}_{0}$ , and $w^{\star}$ are determined by the optimality system.

The control $z$ and co-state ${\bf p}$ are determined by $z=\frac{1}{\varrho}\ell_{1}$ and $G{\bf p}=\bm{\ell}$ , i.e.,

\begin{array}[]{rcll}\partial_{t}p_{1}-\Delta_{\bf x}p_{1}&=&\ell_{1}-\Delta_{\bf x}\ell_{1}&\text{ on }Q,\\ p_{1}&=&0&\text{ on }\Sigma,\\ p_{1}(0,\cdot)&=&\ell_{1}(0,\cdot)&\text{ on }\Omega,\end{array}

and ${\bf p}_{2}=\nabla_{\bf x}(\ell_{1}-p_{1})$ .

Given some conforming quasi-uniform partition $\mathcal{T}^{\delta}$ of the space-time cylinder $Q$ into $(d+1)$ -simplices, we take $U^{\delta}$ to be the $(d+1)$ -fold Cartesian product of continuous piecewise affine functions with the first-coordinate space restricted by homogeneous Dirichlet boundary conditions on $\Sigma$ , and $Z^{\delta}$ to be the space of piecewise constants.

Taking sufficiently smooth $u_{1}$ and $\ell_{1}$ , in view of (4.6) for the Galerkin solution $({\bf u}^{\delta},z^{\delta},{\bf p}^{\delta})\in U^{\delta}\times Z^{\delta}\times U^{\delta}$ , we obtain $\|{\bf u}-{\bf u}^{\delta}\|_{U}+\|z-z^{\delta}\|_{L_{2}(Q)}+\|{\bf p}-{\bf p}^{\delta}\|_{U}={\mathcal{O}}({\rm dofs}^{-\frac{1}{d+1}})$ assuming that $\inf_{{\bf q}\in U^{\delta}}\|{\bf p}-{\bf q}\|_{U}={\mathcal{O}}({\rm dofs}^{-\frac{1}{d+1}})$ .

Concerning the latter, assuming the compatibility conditions $\ell_{1}(0,\cdot)\in H^{1}_{0}(\Omega)$ , $(\ell_{1}-\Delta_{\bf x}\ell_{1})(0,\cdot)+\Delta_{\bf x}\ell_{1}(0,\cdot)\in H^{1}_{0}(\Omega)$ , and $\partial_{t}(\ell_{1}-\Delta_{\bf x}\ell_{1})(0,\cdot)+\Delta_{\bf x}(\ell_{1}-\Delta_{\bf x}\ell_{1})(0,\cdot)+\Delta_{\bf x}^{2}\ell_{1}(0,\cdot)\in L_{2}(\Omega)$ , [Wlo87, Theorem 27.2] shows that $p_{1}\in H^{2}(I;H^{1}_{0}(\Omega))$ . For a smooth or convex $\Omega$ , by interchanging $\partial_{t}$ and $\Delta_{\bf x}$ , from $\partial_{t}(\ell_{1}-\Delta_{\bf x}\ell_{1}-\partial_{t}p_{1})\in L_{2}(I;L_{2}(\Omega))$ regularity of the Poisson problem shows that $\partial_{t}p_{1}\in L_{2}(I;H^{2}(\Omega))$ , i.e., $p_{1}\in H^{1}(I;H^{2}(\Omega))$ . For a smooth $\Omega$ , from $\ell_{1}-\Delta_{\bf x}\ell_{1}-\partial_{t}p_{1}\in L_{2}(I;H^{1}(\Omega))$ , regularity of the Poisson problem shows that $p_{1}\in L_{2}(I;H^{3}(\Omega))$ . For $\Omega$ being a square, the latter should be read as $p_{1}\in L_{2}(I;H^{3-\varepsilon}(\Omega))$ for any $\varepsilon>0$ ([Kon70], cf. also [Hac92, Ex. 9.1.25]). Pretending that $\varepsilon=0$ , we conclude that $p_{1}\in H^{2}(Q)$ and ${\bf p}_{2}\in H^{2}(Q)^{d}$ , so that $\inf_{{\bf q}\in U^{\delta}}\|{\bf p}-{\bf q}\|_{U}\lesssim\inf_{{\bf q}\in U^{\delta}}\|{\bf p}-{\bf q}\|_{H^{1}(Q)\times H^{1}(Q)^{d}}={\mathcal{O}}({\rm dofs}^{-\frac{1}{d+1}})$ , which is thus only ‘nearly’ demonstrated for $\Omega$ being a square.

4.4.1. Experiment in 1+1D

Let $\Omega:=(0,1)$ , $T:=1$ , and $\varrho=0.01$ . We prescribe

\displaystyle u(t,x)=u_{1}(t,x):=\cos(\pi t)\sin(\pi x),\quad\ell_{1}(t,x):=\varrho(1-t)\sin(\pi x)

on the space-time cylinder $Q=(0,1)^{2}$ , and determine $f_{1}^{\star}$ , $u_{0}^{\star}$ , and $w^{\star}$ by the optimality system.
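The data can be written out explicitly by inserting the prescribed $u_{1}$ and $\ell_{1}$ into the optimality system of Section 4.4. The following sketch does this by hand and then checks the two PDEs by central finite differences. The closed-form expressions in `f1_star` and `w_star` are our own computation and not quoted from elsewhere, and the names `heat_op`, `max_residual`, and `samples` are hypothetical helpers introduced only for this check.

```python
import math

RHO = 0.01  # regularization parameter of this experiment

def u1(t, x):
    return math.cos(math.pi * t) * math.sin(math.pi * x)

def ell1(t, x):
    return RHO * (1.0 - t) * math.sin(math.pi * x)

def f1_star(t, x):
    # f_1^* = d_t u_1 - d_xx u_1 - ell_1 / rho, computed by hand
    return (-math.pi * math.sin(math.pi * t)
            + math.pi**2 * math.cos(math.pi * t)
            - (1.0 - t)) * math.sin(math.pi * x)

def w_star(t, x):
    # w^* = u_1 - d_t ell_1 - d_xx ell_1, computed by hand
    return (math.cos(math.pi * t)
            + RHO * (1.0 + math.pi**2 * (1.0 - t))) * math.sin(math.pi * x)

def heat_op(fun, t, x, time_sign, h=1e-4):
    """Central differences for time_sign * d_t fun - d_xx fun."""
    d_t = (fun(t + h, x) - fun(t - h, x)) / (2.0 * h)
    d_xx = (fun(t, x + h) - 2.0 * fun(t, x) + fun(t, x - h)) / h**2
    return time_sign * d_t - d_xx

def max_residual(points):
    """Largest residual of the state and the adjoint equation over the points."""
    worst = 0.0
    for t, x in points:
        state = heat_op(u1, t, x, +1.0) - f1_star(t, x) - ell1(t, x) / RHO
        adjoint = heat_op(ell1, t, x, -1.0) - (w_star(t, x) - u1(t, x))
        worst = max(worst, abs(state), abs(adjoint))
    return worst

samples = [(0.1 * i, 0.1 * j) for i in range(1, 10) for j in range(1, 10)]
print(max_residual(samples))
```

The remaining data are $u_{0}^{\star}=u_{1}(0,\cdot)=\sin(\pi x)$ and the control $z=\frac{1}{\varrho}\ell_{1}=(1-t)\sin(\pi x)$; the terminal condition $\ell_{1}(T,\cdot)=0$ holds by construction.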

Starting on an initial triangulation $\mathcal{T}^{\delta}$ of $Q$ with two elements, we define a sequence of uniform triangulations $\mathcal{T}^{\delta}$ by splitting all elements in the previous $\mathcal{T}^{\delta}$ into four new elements by repeated newest vertex bisection. The convergence plot for the resulting Galerkin approximations $({\bf u}^{\delta},z^{\delta},{\bf p}^{\delta})\in U^{\delta}\times Z^{\delta}\times U^{\delta}$ of (4.4) is displayed in Figure 4.1. While, as expected, $\|\nabla_{\bf x}(u-u^{\delta})\|_{L_{2}(Q)}$ and $\|z-z^{\delta}\|_{L_{2}(Q)}$ converge at rate ${\mathcal{O}}({\rm dofs}^{-\frac{1}{2}})$ , $\|u(0,\cdot)-u^{\delta}(0,\cdot)\|_{L_{2}(\Omega)}$ , $\|u(T,\cdot)-u^{\delta}(T,\cdot)\|_{L_{2}(\Omega)}$ , $\|u-u^{\delta}\|_{L_{2}(Q)}$ , and $|J({\bf u},z)-J({\bf u}^{\delta},z^{\delta})|$ even converge with the double rate.
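Rates such as ${\mathcal{O}}({\rm dofs}^{-\frac{1}{2}})$ are read off from the slope of the error curve in a doubly logarithmic plot. A minimal sketch of this computation is given below; the function name `observed_rates` is hypothetical, and the input is synthetic data generated from an exact power law, not the values behind Figure 4.1.

```python
import math

def observed_rates(dofs, errors):
    """Rates r_k such that errors behave like dofs**(-r_k) between consecutive levels."""
    return [
        math.log(errors[k] / errors[k + 1]) / math.log(dofs[k + 1] / dofs[k])
        for k in range(len(dofs) - 1)
    ]

# Synthetic data (NOT results from the experiments): error = 3 * dofs**(-1/2),
# with the number of degrees of freedom growing by a factor 4 per refinement.
dofs = [9 * 4**k for k in range(6)]
errors = [3.0 * n ** -0.5 for n in dofs]
rates = observed_rates(dofs, errors)
print(rates)
```

For measured errors the computed values only approach the asymptotic rate on sufficiently fine meshes, so the last few entries of `rates` are the relevant ones.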

4.4.2. Experiment in 2+1D

Let $\Omega:=(0,1)^{2}$ , $T:=1$ , and $\varrho=0.01$ . We prescribe

\displaystyle u(t,x_{1},x_{2}):=\cos(\pi t)\sin(\pi x_{1})\sin(\pi x_{2}),\quad\ell_{1}(t,x_{1},x_{2}):=\varrho(1-t)\sin(\pi x_{1})\sin(\pi x_{2})

on the space-time cylinder $Q=(0,1)^{3}$ , and we choose $f_{1}^{\star}$ , $u_{0}^{\star}$ , and $w^{\star}$ by the optimality system.
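For orientation, inserting the prescribed $u$ and $\ell_{1}$ into the optimality system of Section 4.4 gives the data in closed form. The following expressions are our own direct computation, using $\Delta_{\bf x}u=-2\pi^{2}u$ and $\Delta_{\bf x}\ell_{1}=-2\pi^{2}\ell_{1}$ , with the shorthand $s({\bf x}):=\sin(\pi x_{1})\sin(\pi x_{2})$ introduced only here:

\displaystyle\begin{split}f_{1}^{\star}&=\partial_{t}u-\Delta_{\bf x}u-\tfrac{1}{\varrho}\ell_{1}=\big{(}-\pi\sin(\pi t)+2\pi^{2}\cos(\pi t)-(1-t)\big{)}\,s({\bf x}),\\ u_{0}^{\star}&=u(0,\cdot)=s({\bf x}),\\ w^{\star}&=u-\partial_{t}\ell_{1}-\Delta_{\bf x}\ell_{1}=\big{(}\cos(\pi t)+\varrho\,(1+2\pi^{2}(1-t))\big{)}\,s({\bf x}),\\ z&=\tfrac{1}{\varrho}\ell_{1}=(1-t)\,s({\bf x}).\end{split}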

Starting on an initial triangulation $\mathcal{T}^{\delta}$ of $Q$ with $12$ elements, we define a sequence of quasi-uniform triangulations $\mathcal{T}^{\delta}$ by splitting all elements in the previous $\mathcal{T}^{\delta}$ into eight new elements by repeated newest vertex bisection. The convergence plot for the resulting Galerkin approximations $({\bf u}^{\delta},z^{\delta},{\bf p}^{\delta})\in U^{\delta}\times Z^{\delta}\times U^{\delta}$ of (4.4) is displayed in Figure 4.2. While, as expected, $\|\nabla_{\bf x}(u-u^{\delta})\|_{L_{2}(Q)}$ and $\|z-z^{\delta}\|_{L_{2}(Q)}$ converge at rate ${\mathcal{O}}({\rm dofs}^{-\frac{1}{3}})$ , $\|u(0,\cdot)-u^{\delta}(0,\cdot)\|_{L_{2}(\Omega)}$ , $\|u(T,\cdot)-u^{\delta}(T,\cdot)\|_{L_{2}(\Omega)}$ , $\|u-u^{\delta}\|_{L_{2}(Q)}$ , and $|J({\bf u},z)-J({\bf u}^{\delta},z^{\delta})|$ even converge (at least almost) with the double rate.

5. Time-dependent domains

In this section, we assume that the spatial domain $\Omega$ changes throughout time. More precisely, let $\widehat{\Omega}\subset\mathbb{R}^{d}$ be a Lipschitz domain with boundary $\partial\widehat{\Omega}=\widehat{\Gamma}$ , and $\widehat{Q}:=I\times\widehat{\Omega}$ the corresponding space-time cylinder with lateral boundary $\widehat{\Sigma}:=I\times\widehat{\Gamma}$ . We denote the spaces $X$ , $Y$ , $U$ , and $L$ on $\widehat{Q}$ by $\widehat{X}$ , $\widehat{Y}$ , $\widehat{U}$ , and $\widehat{L}$ , respectively. We suppose that the actual space-time domain $Q\subset\mathbb{R}^{d+1}$ is given via a bijection of the form

\displaystyle\kappa:\overline{\widehat{Q}}\to\overline{Q},\quad\left(\begin{array}[]{@{}c@{}}t\\ \widehat{\bf x}\end{array}\right)\mapsto\left(\begin{array}[]{@{}c@{}}t\\ \kappa^{\prime}(t,\widehat{\bf x})\end{array}\right),

where, for all $t\in\overline{I}$ , $\kappa^{\prime}(t,\cdot):\widehat{\Omega}\to\mathbb{R}^{d}$ maps $\widehat{\Omega}$ bijectively onto some Lipschitz domain $\Omega_{t}\subset\mathbb{R}^{d}$ . We require the regularities $\kappa^{\prime}\in W^{1}_{\infty}(\widehat{Q})^{d}$ , $\operatorname{ess\,inf}_{\widehat{Q}}\det{\rm D}_{\widehat{\bf x}}\kappa^{\prime}>0$ , $\sup_{t\in I}\|\kappa^{\prime}(t,\cdot)\|_{W_{\infty}^{3}(\widehat{\Omega})^{d}}<\infty$ , and $\sup_{t\in I}\|\partial_{t}\kappa^{\prime}(t,\cdot)\|_{W_{\infty}^{1}(\widehat{\Omega})^{d}}<\infty$ . Note that

\displaystyle{\rm D}\kappa=\left(\begin{array}[]{@{}cc@{}}1&0\\ \partial_{t}\kappa^{\prime}&{\rm D}_{\widehat{\bf x}}\kappa^{\prime}\end{array}\right)\quad\text{and}\quad\det{\rm D}\kappa=\det{\rm D}_{\widehat{\bf x}}\kappa^{\prime}.

(5.3)

With the lateral boundary $\Sigma:=\kappa(\widehat{\Sigma})$ and $\Omega:=\Omega_{0}$ , we consider the parabolic PDE (2.1) with ${\bf A}={\bf A}^{\top}\in L_{\infty}(Q)^{d\times d}$ uniformly positive, ${\bf b}\in L_{\infty}(Q)^{d}$ , $c\in L_{\infty}(Q)$ , $f_{1}\in L_{2}(Q)$ , ${\bf f}_{2}\in L_{2}(Q)^{d}$ , and $u_{0}\in L_{2}(\Omega_{0})$ .

5.1. Formulation as first-order system

As in the time-independent domain case, in first-order system formulation (2.1) reads as

\displaystyle G{\bf u}:=\left(\begin{array}[]{@{}c@{}}\operatorname{div}{\bf u}+{\bf b}\cdot\nabla_{\bf x}u_{1}+cu_{1}\\ -{\bf u}_{2}-{\bf A}\nabla_{\bf x}u_{1}\\ u_{1}(0,\cdot)\end{array}\right)=\left(\begin{array}[]{@{}c@{}}f_{1}\\ {\bf f}_{2}\\ u_{0}\end{array}\right)=:{\bf f},

(5.10)

which is again well-posed:

Theorem 5.1.

With $U$ and $L$ defined as in the time-independent domain case (where we set $Y:=\big{\{}\widehat{v}\circ\kappa^{-1}\,:\,\widehat{v}\in\widehat{Y}\big{\}}=\big{\{}v\in L_{2}(Q)\,:\,\nabla_{\bf x}v\in L_{2}(Q)^{d}\wedge v|_{\Sigma}=0\big{\}}$ , equipped with the norm $\sqrt{\|v\|_{L_{2}(Q)}^{2}+\|\nabla_{\bf x}v\|_{L_{2}(Q)^{d}}^{2}}$ ), $G$ is a linear isomorphism from $U$ to $L$ .

Proof.

We show that $G=F\circ\widetilde{G}\circ H$ with linear isomorphisms $F\in\mathcal{L}(\widehat{L},L),\widetilde{G}\in\mathcal{L}(\widehat{U},\widehat{L})$ , and $H\in\mathcal{L}(U,\widehat{U})$ .

We set $H{\bf u}:=\widehat{\bf u}:=\det{\rm D}\kappa\,({\rm D}\kappa)^{-1}{\bf u}\circ\kappa$ . A familiar property of the Piola transformation (e.g. [EG21a, Lemma 9.6]) is that

\operatorname{div}\widehat{\bf u}=\det{\rm D}\kappa\,\operatorname{div}{\bf u}\circ\kappa.

(5.11)

By definition of $\widehat{\bf u}$ we have

u_{1}\circ\kappa=(\det{\rm D}\kappa)^{-1}\widehat{u}_{1},\quad{\bf u}_{2}\circ\kappa=(\det{\rm D}\kappa)^{-1}[{\rm D}_{\widehat{\bf x}}\kappa^{\prime}\,\widehat{\bf u}_{2}+\widehat{u}_{1}\partial_{t}\kappa^{\prime}].

(5.12)

Applications of the product and chain rules show that

\nabla_{{\bf x}}u_{1}\circ\kappa=(\det{\rm D}\kappa)^{-1}({\rm D}_{\widehat{\bf x}}\kappa^{\prime})^{-\top}\big{[}\nabla_{\widehat{\bf x}}\widehat{u}_{1}-(\det{\rm D}\kappa)^{-1}\widehat{u}_{1}\nabla_{\widehat{\bf x}}(\det{\rm D}\kappa)\big{]}.

(5.13)

We conclude that $\|{\bf u}\|_{U}\eqsim\|\widehat{\bf u}\|_{\widehat{U}}$ and thus that $H$ is a linear isomorphism.
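The Piola identity (5.11) can be checked numerically in the simplest case $d=1$. The sketch below assumes the map $\kappa^{\prime}(t,\widehat{x})=(\widehat{x}-\tfrac{1}{2})(|t-\tfrac{1}{2}|+\tfrac{1}{2})+\tfrac{1}{2}$ that is used in the 1+1D experiment of Section 5.2.1, for which $\det{\rm D}\kappa=|t-\tfrac{1}{2}|+\tfrac{1}{2}$ . The test field `u1`, `u2` is a hypothetical smooth choice, and the divergence of $\widehat{\bf u}$ is approximated by central differences away from the kink at $t=\tfrac{1}{2}$ .

```python
import math

def s(t):
    """det D kappa = d kappa' / d x_hat for the assumed map."""
    return abs(t - 0.5) + 0.5

def kappa(t, xh):
    return t, (xh - 0.5) * s(t) + 0.5

def dt_kappa_prime(t, xh):
    """Time derivative of kappa'; only valid for t != 1/2."""
    return (xh - 0.5) * math.copysign(1.0, t - 0.5)

# Hypothetical smooth test field u = (u1, u2) on Q.
def u1(t, x):
    return math.sin(t + 2.0 * x)

def u2(t, x):
    return math.cos(3.0 * t - x)

def div_u(t, x):
    """Space-time divergence d_t u1 + d_x u2, computed by hand."""
    return math.cos(t + 2.0 * x) + math.sin(3.0 * t - x)

# Piola transform u_hat = det(D kappa) (D kappa)^{-1} (u o kappa), written out for d = 1.
def uh1(t, xh):
    return s(t) * u1(*kappa(t, xh))

def uh2(t, xh):
    return u2(*kappa(t, xh)) - dt_kappa_prime(t, xh) * u1(*kappa(t, xh))

def div_uh(t, xh, h=1e-5):
    """Central-difference approximation of d_t uh1 + d_xhat uh2."""
    d_t = (uh1(t + h, xh) - uh1(t - h, xh)) / (2.0 * h)
    d_x = (uh2(t, xh + h) - uh2(t, xh - h)) / (2.0 * h)
    return d_t + d_x

def piola_defect(t, xh):
    """|div u_hat - det(D kappa) (div u) o kappa| at the reference point (t, xh)."""
    return abs(div_uh(t, xh) - s(t) * div_u(*kappa(t, xh)))

print(piola_defect(0.2, 0.3), piola_defect(0.8, 0.7))
```

Both defects are of the size of the finite-difference error, which is what (5.11) predicts; the two formulas for `uh1` and `uh2` are the inverse of the relations (5.12).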

We set $\widehat{\bf A}:={\bf A}\circ\kappa$ , $\widehat{\bf b}:={\bf b}\circ\kappa$ , $\widehat{c}:=c\circ\kappa$ , and

\displaystyle\begin{split}\widetilde{\bf A}&:=({\rm D}_{\widehat{\bf x}}\kappa^{\prime})^{-1}\widehat{\bf A}({\rm D}_{\widehat{\bf x}}\kappa^{\prime})^{-\top},\\
\widetilde{\bf b}&:=({\rm D}_{\widehat{\bf x}}\kappa^{\prime})^{-1}\widehat{\bf b},\\
\widetilde{c}&:=\widehat{c}-(\det{\rm D}\kappa)^{-1}({\rm D}_{\widehat{\bf x}}\kappa^{\prime})^{-1}\widehat{\bf b}\cdot\nabla_{\widehat{\bf x}}(\det{\rm D}\kappa),\\
\widehat{\bf w}&:=({\rm D}_{\widehat{\bf x}}\kappa^{\prime})^{-1}\partial_{t}\kappa^{\prime}-(\det{\rm D}\kappa)^{-1}\widetilde{\bf A}\nabla_{\widehat{\bf x}}\det{\rm D}\kappa.\end{split}

Setting $\widehat{f}_{1}:=f_{1}\circ\kappa$ , $\widehat{{\bf f}}_{2}:={\bf f}_{2}\circ\kappa$ , $\widehat{u}_{0}:=u_{0}\circ\kappa^{\prime}(0,\cdot)$ and using (5.11)–(5.13), we find that (5.10) is equivalent to

\displaystyle\begin{split}\widehat{f}_{1}=(\operatorname{div}{\bf u}+{\bf b}\cdot\nabla_{\bf x}u_{1}+cu_{1})\circ\kappa&=(\det{\rm D}\kappa)^{-1}\big{[}\operatorname{div}\widehat{\bf u}+\widetilde{\bf b}\cdot\nabla_{\widehat{\bf x}}\widehat{u}_{1}+\widetilde{c}\,\widehat{u}_{1}\big{]},\\
\widehat{{\bf f}}_{2}=-({\bf u}_{2}+{\bf A}\nabla_{\bf x}u_{1})\circ\kappa&=-(\det{\rm D}\kappa)^{-1}{\rm D}_{\widehat{\bf x}}\kappa^{\prime}\big{[}\widehat{\bf u}_{2}+\widetilde{\bf A}\nabla_{\widehat{\bf x}}\widehat{u}_{1}+\widehat{u}_{1}\widehat{\bf w}\big{]},\\
\widehat{u}_{0}=(u_{1}\circ\kappa)(0,\cdot)&=(\det{\rm D}_{\widehat{\bf x}}\kappa^{\prime}(0,\cdot))^{-1}\widehat{u}_{1}(0,\cdot).\end{split}

Defining the linear isomorphism $F$ via

\displaystyle F^{-1}{\bf f}:=\left(\begin{array}[]{@{}c@{}}\det{\rm D}\kappa\,\widehat{f}_{1}\\ \det{\rm D}\kappa\,({\rm D}_{\widehat{\bf x}}\kappa^{\prime})^{-1}\,\widehat{\bf f}_{2}\\ (\det{\rm D}_{\widehat{\bf x}}\kappa^{\prime}(0,\cdot))\widehat{u}_{0}\end{array}\right),

it remains to show that

\displaystyle\widetilde{G}\widehat{\bf u}:=\left(\begin{array}[]{@{}c@{}}\operatorname{div}\widehat{\bf u}+\widetilde{\bf b}\cdot\nabla_{\widehat{\bf x}}\widehat{u}_{1}+{\widetilde{c}}\,\widehat{u}_{1}\\ -\widehat{\bf u}_{2}-\widetilde{\bf A}\nabla_{\widehat{\bf x}}\widehat{u}_{1}-\widehat{u}_{1}\widehat{\bf w}\\ \widehat{u}_{1}(0,\cdot)\end{array}\right)

is a linear isomorphism from $\widehat{U}$ to $\widehat{L}$ .

This follows from the identity

\displaystyle\widetilde{G}\left(\begin{array}[]{@{}cc@{}}I&0\\ -\widehat{\bf w}&I\end{array}\right)\widehat{\bf u}=\left(\begin{array}[]{@{}c@{}}\operatorname{div}\widehat{\bf u}+(\widetilde{\bf b}-\widehat{\bf w})\cdot\nabla_{\widehat{\bf x}}\widehat{u}_{1}+(\widetilde{c}-\operatorname{div}_{\widehat{\bf x}}\widehat{\bf w})\widehat{u}_{1}\\ -\widehat{\bf u}_{2}-\widetilde{\bf A}\nabla_{\widehat{\bf x}}\widehat{u}_{1}\\ \widehat{u}_{1}(0,\cdot)\end{array}\right).

By the assumed regularity of $\kappa$ and $\kappa^{-1}$ , (5.3) yields that $\operatorname{div}_{\widehat{\bf x}}\widehat{\bf w}\in L_{\infty}(\widehat{Q})$ , so that the latter mapping is a linear isomorphism from $\widehat{U}$ to $\widehat{L}$ according to Theorem 2.2. Noting that $\left(\begin{array}[]{@{}cc@{}}I&0\\ -\widehat{\bf w}&I\end{array}\right)$ is a linear isomorphism from $\widehat{U}$ to $\widehat{U}$ , we conclude the proof. ∎

Remark 5.2.

Assume that also $\partial_{t}\partial_{\widehat{x}_{i}}\kappa^{\prime}\in L_{\infty}(\widehat{Q})$ for all $i\in\{1,\dots,d\}$ . Given ${\bf f}=(f_{1},{\bf f}_{2},u_{0})\in L$ , ${\bf u}=(u_{1},{\bf u}_{2})\in U$ then solves (5.10) if and only if $u_{1}\in X:=\big{\{}\widehat{u}\circ\kappa^{-1}\,:\,\widehat{u}\in\widehat{X}\big{\}}$ solves

\displaystyle\int_{Q}\partial_{t}u\,v+({\bf A}\nabla_{\bf x}u)\cdot\nabla_{\bf x}v+{\bf b}\cdot\nabla_{\bf x}u\,v+cuv\,{\rm d}{\bf x}\,{\rm d}t

\displaystyle=\int_{Q}f_{1}v+\operatorname{div}_{\bf x}{\bf f}_{2}\,v\,{\rm d}{\bf x}\,{\rm d}t\quad\text{for all }v\in Y,

and $u_{1}(0,\cdot)=u_{0}$ . Indeed, ${\bf u}=(u_{1},{\bf u}_{2})\in U$ together with (5.3), (5.12), and Lemma 2.3 imply that $\det{\rm D}_{\widehat{\bf x}}\kappa^{\prime}\,u_{1}\circ\kappa\in\widehat{X}$ , and thus by the additional regularity assumption that $u_{1}\circ\kappa\in\widehat{X}$ . The remainder follows from integration by parts. In particular, the second part of Theorem 2.2 also holds analogously for time-dependent domains. Together with the open mapping theorem, this insight also allows one to generalize Theorem 2.1.

Theorem 5.1 shows that also in the time-dependent domain case one can approximate the solution of the parabolic PDE by applying a Galerkin discretization to the variational problem $\langle G{\bf u},G{\bf v}\rangle_{L}=\langle{\bf f},G{\bf v}\rangle_{L}$ for all ${\bf v}\in U$ . Thinking of finite element discretizations, in particular for polytopal domains $Q$ , this provides an attractive alternative to discretizations that require a transformation of the PDE to a time-independent domain, e.g., ALE time-stepping methods [SHD01, GS17, SG20].

5.2. Numerical experiments

We consider the heat equation, i.e., ${\bf A}:={\bf Id}$ , ${\bf b}:=0$ , and $c:=0$ in (2.1). Moreover, we set ${\bf f}_{2}:=0$ and ${\bf u}:=(u,-\nabla_{\bf x}u)$ . We will approximate the latter function ${\bf u}\in U$ in the conforming subspace of continuous piecewise affine functions $U^{\delta}$ on triangulations of the space-time cylinder $Q$ whose first component vanishes on the lateral boundary $\Sigma$ . Using that $\{{\bf v}=(v_{1},{\bf v}_{2})\in H^{1}(Q)^{d+1}\colon v_{1}|_{\Sigma}=0\}\hookrightarrow U$ , we know that $\|{\bf u}-{\bf u}^{\delta}\|_{U}={\mathcal{O}}({\rm dofs}^{-\frac{1}{d+1}})$ for sufficiently smooth $u$ and uniform mesh-refinement.

5.2.1. Experiment in 1+1D

Let

\displaystyle Q:={\rm int}\Bigg{(}\underbrace{{\rm conv}\left\{\begin{pmatrix}0\\ 0\end{pmatrix},\begin{pmatrix}0.5\\ 0.25\end{pmatrix},\begin{pmatrix}0.5\\ 0.75\end{pmatrix},\begin{pmatrix}0\\ 1\end{pmatrix}\right\}}_{=:Q_{1}}\cup\underbrace{{\rm conv}\left\{\begin{pmatrix}0.5\\ 0.25\end{pmatrix},\begin{pmatrix}1\\ 0\end{pmatrix},\begin{pmatrix}1\\ 1\end{pmatrix},\begin{pmatrix}0.5\\ 0.75\end{pmatrix}\right\}}_{=:Q_{2}}\Bigg{)},

where ${\rm int}(\cdot)$ denotes the interior of a set and ${\rm conv}(\cdot)$ denotes the convex hull of a set of points; see Figure 5.1 for an illustration. With the bijection $\kappa\colon[0,1]^{2}\to\overline{Q}\colon(t,\widehat{x})\mapsto\big{(}t,(\widehat{x}-\tfrac{1}{2})(|t-\tfrac{1}{2}|+\tfrac{1}{2})+\tfrac{1}{2}\big{)}$ , we prescribe the solution $u$ by

\displaystyle u(\kappa(t,\widehat{x})):=\sin(\pi\widehat{x}),

and choose $f_{1}:=\partial_{t}u-\Delta_{x}u$ , and $u_{0}:=u(0,\cdot)$ .
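A short sketch of this setup is given below: it implements $\kappa$ and its inverse, confirms that the corners of the reference square are mapped to the vertices of $Q_{1}$ and $Q_{2}$ , and evaluates $f_{1}$ through the chain rule. The closed-form expression in `f1` is our own computation with $s(t):=|t-\tfrac{1}{2}|+\tfrac{1}{2}$ (a shorthand introduced here), and `f1_fd` is a hypothetical finite-difference cross-check that is only meaningful away from $t=\tfrac{1}{2}$ .

```python
import math

def s(t):
    return abs(t - 0.5) + 0.5

def kappa(t, xh):
    return (t, (xh - 0.5) * s(t) + 0.5)

def kappa_inv(t, x):
    return (t, (x - 0.5) / s(t) + 0.5)

def u(t, x):
    """Prescribed solution u(kappa(t, xh)) = sin(pi * xh), expressed in (t, x)."""
    return math.sin(math.pi * kappa_inv(t, x)[1])

def f1(t, x):
    """f_1 = d_t u - d_xx u via the chain rule; valid for t != 1/2."""
    xh = kappa_inv(t, x)[1]
    sign = math.copysign(1.0, t - 0.5)          # derivative of s(t)
    d_t = -math.pi * math.cos(math.pi * xh) * (xh - 0.5) * sign / s(t)
    d_xx = -math.pi**2 * math.sin(math.pi * xh) / s(t) ** 2
    return d_t - d_xx

def f1_fd(t, x, h=1e-4):
    """Central-difference approximation of d_t u - d_xx u."""
    d_t = (u(t + h, x) - u(t - h, x)) / (2.0 * h)
    d_xx = (u(t, x + h) - 2.0 * u(t, x) + u(t, x - h)) / h**2
    return d_t - d_xx

corners = [kappa(t, xh) for t in (0.0, 0.5, 1.0) for xh in (0.0, 1.0)]
print(corners)
print(abs(f1(0.2, 0.4) - f1_fd(0.2, 0.4)), abs(f1(0.8, 0.6) - f1_fd(0.8, 0.6)))
```

Since the derivative of $s$ changes sign at $t=\tfrac{1}{2}$ , the right-hand side $f_{1}$ is in general discontinuous across the interface between $Q_{1}$ and $Q_{2}$ , which fits the requirement below that each mesh element lies in either $\overline{Q}_{1}$ or $\overline{Q}_{2}$ .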

Starting on an initial triangulation $\mathcal{T}^{\delta}$ of $Q$ with six elements (such that each element is either in $\overline{Q}_{1}$ or $\overline{Q}_{2}$ ), we define a sequence of triangulations $\mathcal{T}^{\delta}$ by splitting all elements in the previous $\mathcal{T}^{\delta}$ into four new elements by repeated newest vertex bisection. We compute the Galerkin approximation ${\bf u}^{\delta}\in U^{\delta}$ of ${\bf u}\in U$ with respect to the scalar product $\langle G(\cdot)\,,\,G(\cdot)\rangle_{L}$ , with $U^{\delta}$ being the $2$ -fold Cartesian product of continuous piecewise affine functions on $\mathcal{T}^{\delta}$ whose first component vanishes on $\Sigma$ . The corresponding convergence plot is displayed in Figure 5.2. Here, $u^{\delta}$ denotes the first component of ${\bf u}^{\delta}$ . While $\|G{\bf u}-G{\bf u}^{\delta}\|_{L}\eqsim\|{\bf u}-{\bf u}^{\delta}\|_{U}$ and $\|\nabla_{\bf x}(u-u^{\delta})\|_{L_{2}(Q)}$ converge as expected at rate ${\mathcal{O}}({\rm dofs}^{-\frac{1}{2}})$ , $\|u(0,\cdot)-u^{\delta}(0,\cdot)\|_{L_{2}(\Omega)}$ , $\|u(T,\cdot)-u^{\delta}(T,\cdot)\|_{L_{2}(\Omega)}$ , and $\|u-u^{\delta}\|_{L_{2}(Q)}$ converge roughly with the rate ${\mathcal{O}}({\rm dofs}^{-0.8})$ .

5.2.2. Experiment in 2+1D

Let

	$\displaystyle Q:={\rm int}\Bigg{(}$	$\displaystyle\underbrace{{\rm conv}\left\{\begin{pmatrix}0\\ 0\\ 0\end{pmatrix},\begin{pmatrix}0\\ 1\\ 0\end{pmatrix},\begin{pmatrix}0\\ 1\\ 1\end{pmatrix},\begin{pmatrix}0\\ 0\\ 1\end{pmatrix},\begin{pmatrix}0.5\\ 0.25\\ 0.25\end{pmatrix},\begin{pmatrix}0.5\\ 0.75\\ 0.25\end{pmatrix},\begin{pmatrix}0.5\\ 0.75\\ 0.75\end{pmatrix},\begin{pmatrix}0.5\\ 0.25\\ 0.75\end{pmatrix}\right\}}_{=:Q_{1}}$
	$\displaystyle\cup$	$\displaystyle\underbrace{{\rm conv}\left\{\begin{pmatrix}0.5\\ 0.25\\ 0.25\end{pmatrix},\begin{pmatrix}0.5\\ 0.75\\ 0.25\end{pmatrix},\begin{pmatrix}0.5\\ 0.75\\ 0.75\end{pmatrix},\begin{pmatrix}0.5\\ 0.25\\ 0.75\end{pmatrix},\begin{pmatrix}1\\ 0\\ 0\end{pmatrix},\begin{pmatrix}1\\ 1\\ 0\end{pmatrix},\begin{pmatrix}1\\ 1\\ 1\end{pmatrix},\begin{pmatrix}1\\ 0\\ 1\end{pmatrix}\right\}}_{=:Q_{2}}\Bigg{)},$

where ${\rm int}(\cdot)$ denotes the interior of a set and ${\rm conv}(\cdot)$ denotes the convex hull of a set of points; see Figure 5.1 for an illustration. With the bijection $\kappa\colon[0,1]^{3}\to\overline{Q}\colon(t,\widehat{\bf x})\mapsto\big{(}t,\big{(}\widehat{\bf x}-(\tfrac{1}{2},\tfrac{1}{2})\big{)}(|t-\tfrac{1}{2}|+\tfrac{1}{2})+(\tfrac{1}{2},\tfrac{1}{2})\big{)}$ , we prescribe the solution $u$ by

\displaystyle u(\kappa(t,\widehat{x}_{1},\widehat{x}_{2})):=\sin(\pi\widehat{x}_{1})\sin(\pi\widehat{x}_{2}),

and choose $f_{1}:=\partial_{t}u-\Delta_{\bf x}u$ , and $u_{0}:=u(0,\cdot)$ .
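As a consistency check, the map $\kappa$ should send the corners of the reference cube at $t\in\{0,\tfrac{1}{2},1\}$ to the vertices listed in the definitions of $Q_{1}$ and $Q_{2}$ . The following Python sketch evaluates $\kappa$ at these corners. The names `kappa`, `corners`, and `slices` are illustrative only.

```python
from itertools import product

def kappa(t, xh):
    """Scale the reference square about its center (1/2, 1/2) by |t - 1/2| + 1/2."""
    scale = abs(t - 0.5) + 0.5
    return (t,) + tuple((c - 0.5) * scale + 0.5 for c in xh)

# Corners of the reference unit square in the spatial variables
corners = list(product((0.0, 1.0), repeat=2))

# Images of the corners on the bottom, middle, and top time slices
slices = {t: sorted(kappa(t, c) for c in corners) for t in (0.0, 0.5, 1.0)}
```

The slice at $t=\tfrac{1}{2}$ yields the four points with spatial coordinates in $\{0.25,0.75\}$ that are shared by $Q_{1}$ and $Q_{2}$ , while the slices at $t=0$ and $t=1$ yield the corners of the unit square.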

Starting from an initial triangulation $\mathcal{T}^{\delta}$ of $Q$ with twelve elements (such that each element is contained in either $\overline{Q}_{1}$ or $\overline{Q}_{2}$ ), we define a sequence of triangulations $\mathcal{T}^{\delta}$ by splitting every element of the previous $\mathcal{T}^{\delta}$ into eight new elements by repeated newest vertex bisection. We compute the Galerkin approximation ${\bf u}^{\delta}\in U^{\delta}$ of ${\bf u}\in U$ with respect to the scalar product $\langle G(\cdot)\,,\,G(\cdot)\rangle_{L}$ , with $U^{\delta}$ being the $3$ -fold Cartesian product of continuous piecewise affine functions on $\mathcal{T}^{\delta}$ whose first component vanishes on $\Sigma$ . The corresponding convergence plot is displayed in Figure 5.3. Here, $u^{\delta}$ denotes the first component of ${\bf u}^{\delta}$ . While $\|G{\bf u}-G{\bf u}^{\delta}\|_{L}\simeq\|{\bf u}-{\bf u}^{\delta}\|_{U}$ and $\|\nabla_{\bf x}(u-u^{\delta})\|_{L_{2}(Q)}$ converge at the expected rate ${\mathcal{O}}({\rm dofs}^{-\frac{1}{3}})$ , the errors $\|u(0,\cdot)-u^{\delta}(0,\cdot)\|_{L_{2}(\Omega)}$ , $\|u(T,\cdot)-u^{\delta}(T,\cdot)\|_{L_{2}(\Omega)}$ , and $\|u-u^{\delta}\|_{L_{2}(Q)}$ converge faster, roughly at rate ${\mathcal{O}}({\rm dofs}^{-0.6})$ .
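The expected exponents $\tfrac{1}{2}$ and $\tfrac{1}{3}$ in the two experiments match first-order convergence in the mesh size $h$ , since under uniform refinement of a $(d+1)$ -dimensional space-time domain one has $h\eqsim{\rm dofs}^{-1/(d+1)}$ . The following Python sketch does this bookkeeping together with the element counts implied by the refinement rules stated above. The helper names are hypothetical, and first-order convergence in $h$ is an assumption that requires a sufficiently smooth solution.

```python
def num_elements(initial, children_per_element, level):
    """Number of elements after `level` uniform refinements."""
    return initial * children_per_element ** level

def expected_exponent(d):
    """Exponent r in error ~ dofs^(-r), assuming error ~ h and h ~ dofs^(-1/(d+1))."""
    return 1.0 / (d + 1)

# 1+1D experiment: six initial triangles, each split into four per refinement
counts_1d = [num_elements(6, 4, k) for k in range(4)]
# 2+1D experiment: twelve initial tetrahedra, each split into eight per refinement
counts_2d = [num_elements(12, 8, k) for k in range(4)]
```

For instance, `counts_2d` grows by a factor of eight per level, so each refinement halves $h$ and, under the stated assumption, halves the error in the $U$ -norm.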

6. Conclusion

In this work, we have demonstrated that the space-time FOSLS [FK21] and its generalization [GS21] to general second-order parabolic PDEs can be easily applied to solve parameter-dependent problems, optimal control problems, and problems on time-dependent domains. In each case, completely unstructured space-time finite elements may be employed, and our numerical experiments exhibit optimal convergence. We also stress that the FOSLS is applicable to the combined problem, i.e., parameter-dependent optimal control problems on time-dependent domains. Indeed, the inf-sup stability of the saddle-point problem corresponding to an optimal control problem essentially hinges only on the coercivity of the bilinear form $a$ (see Lemma 4.2), which also holds in the case of time-dependent domains (see Theorem 5.1). As this stability holds uniformly for arbitrary trial spaces, a reduced basis method as in Section 3.1 can be employed, where the greedy algorithm of Section 3.2 can be steered by one of the estimators discussed in Section 4.3.

References

[And13] Roman Andreev. Stability of sparse space–time finite element discretizations of linear parabolic evolution equations. IMA J. Numer. Anal., 33(1):242–260, 2013.
[BCD⁺11] Peter Binev, Albert Cohen, Wolfgang Dahmen, Ronald DeVore, Guergana Petrova, and Przemyslaw Wojtaszczyk. Convergence rates for greedy algorithms in reduced basis methods. SIAM J. Math. Anal., 43(3):1457–1472, 2011.
[BG09] Pavel B. Bochev and Max D. Gunzburger. Least-squares finite element methods, volume 166 of Applied Mathematical Sciences. Springer, New York, 2009.
[BHT22] Rahel Brügger, Helmut Harbrecht, and Johannes Tausch. Boundary integral operators for the heat equation in time-dependent domains. Integral Equations Operator Theory, 94(2):1–28, 2022.
[EG21a] Alexandre Ern and Jean-Luc Guermond. Finite Elements I: Approximation and interpolation, volume 72. Springer Nature, Cham, 2021.
[EG21b] Alexandre Ern and Jean-Luc Guermond. Finite Elements II: Galerkin approximation, elliptic and mixed PDEs, volume 73. Springer Nature, 2021.
[EKP11] Jens L. Eftang, David J. Knezevic, and Anthony T. Patera. An $hp$ certified reduced basis method for parametrized parabolic partial differential equations. Math. Comput. Model. Dyn. Syst., 17(4):395–422, 2011.
[FK21] Thomas Führer and Michael Karkulik. Space–time least-squares finite elements for parabolic equations. Comput. Math. Appl., 92:27–36, 2021.
[FR17] Stefan Frei and Thomas Richter. A second order time-stepping scheme for parabolic interface problems with moving interfaces. ESAIM Math. Model. Numer. Anal., 51(4):1539–1560, 2017.
[FS22] Stefan Frei and Maneesh Kumar Singh. An implicitly extended Crank-Nicolson scheme for the heat equation on time-dependent domains. Preprint, arXiv:2203.06581, 2022.
[GHZ12] Wei Gong, Michael Hinze, and ZJ Zhou. Space-time finite element approximation of parabolic optimal control problems. J. Numer. Math., 20(2):111–146, 2012.
[GMU17] Silke Glas, Antonia Mayerhofer, and Karsten Urban. Two ways to treat time in reduced basis methods. In Model reduction of parametrized systems, pages 1–16. Springer, 2017.
[GS17] Sashikumaar Ganesan and Shweta Srivastava. ALE-SUPG finite element method for convection–diffusion problems in time-dependent domains: Conservative form. Appl. Math. Comput., 303:128–145, 2017.
[GS21] Gregor Gantner and Rob Stevenson. Further results on a space-time FOSLS formulation of parabolic PDEs. ESAIM Math. Model. Numer. Anal., 55(1):283–299, 2021.
[GS22a] Gregor Gantner and Rob Stevenson. A well-posed First Order System Least Squares formulation of the instationary Stokes equations. Preprint, arXiv:2201.10843, 2022.
[GS22b] Gregor Gantner and Rob Stevenson. Improved rates for a space-time FOSLS of parabolic PDEs. In preparation, 2022.
[Haa17] Bernard Haasdonk. Model reduction and approximation: theory and algorithms, chapter Reduced basis methods for parametrized PDEs–a tutorial introduction for stationary and instationary problems. SIAM, Philadelphia, 2017.
[Hac92] Wolfgang Hackbusch. Elliptic differential equations: theory and numerical treatment, volume 18. Springer, Berlin, 1992.
[HLZ16] Peter Hansbo, Mats G. Larson, and Sara Zahedi. A cut finite element method for coupled bulk-surface problems on time-dependent domains. Comput. Methods Appl. Mech. Engrg., 307:96–116, 2016.
[HO08] Bernard Haasdonk and Mario Ohlberger. Reduced basis method for finite volume approximations of parametrized linear evolution equations. ESAIM Math. Model. Numer. Anal., 42(2):277–302, 2008.
[HPUU08] Michael Hinze, René Pinnau, Michael Ulbrich, and Stefan Ulbrich. Optimization with PDE constraints. Springer, Berlin, 2008.
[Kon70] Vladimir A. Kondratiev. The smoothness of the solution of the Dirichlet problem for second order elliptic equations in a piecewise smooth domain. Differentsial’nye Uravneniya, 6(10):1831–1843, 1970.
[Leh15] Christoph Lehrenfeld. The Nitsche XFEM-DG space-time method and its implementation in three space dimensions. SIAM J. Sci. Comput., 37(1):A245–A270, 2015.
[Lio68] Jacques-Louis Lions. Contrôle optimal de systèmes gouvernés par des équations aux dérivées partielles. Dunod Gauthier-Villars, Paris, 1968.
[LMN16] Ulrich Langer, Stephen E. Moore, and Martin Neumüller. Space-time isogeometric analysis of parabolic evolution problems. Comput. Methods Appl. Mech. Engrg., 306:342–363, 2016.
[LO19] Christoph Lehrenfeld and Maxim Olshanskii. An Eulerian finite element method for PDEs in time-dependent domains. ESAIM Math. Model. Numer. Anal., 53(2):585–614, 2019.
[LS21] Ulrich Langer and Andreas Schafelner. Adaptive space-time finite element methods for parabolic optimal control problems. J. Numer. Math., 2021.
[LSTY21a] Ulrich Langer, Olaf Steinbach, Fredi Tröltzsch, and Huidong Yang. Space-time finite element discretization of parabolic optimal control problems with energy regularization. SIAM J. Numer. Anal., 59(2):675–695, 2021.
[LSTY21b] Ulrich Langer, Olaf Steinbach, Fredi Tröltzsch, and Huidong Yang. Unstructured space-time finite element methods for optimal control of parabolic equations. SIAM J. Sci. Comput., 43(2):A744–A771, 2021.
[Moo18] Stephen Edward Moore. A stable space–time finite element method for parabolic evolution problems. Calcolo, 55(2):1–19, 2018.
[MV07] Dominik Meidner and Boris Vexler. Adaptive space-time finite element methods for parabolic optimization problems. SIAM J. Control Optim., 46(1):116–142, 2007.
[MV08] Dominik Meidner and Boris Vexler. A priori error estimates for space-time finite element discretization of parabolic optimal control problems Part I: Problems without control constraints. SIAM J. Control Optim., 47(3):1150–1177, 2008.
[SG20] Shweta Srivastava and Sashikumaar Ganesan. Local projection stabilization with discontinuous Galerkin method in time applied to convection dominated problems in time-dependent domains. BIT, 60(2):481–507, 2020.
[SHD01] Josep Sarrate, Antonio Huerta, and Jean Donea. Arbitrary Lagrangian–Eulerian formulation for fluid–rigid body interaction. Comput. Methods Appl. Mech. Engrg., 190(24-25):3171–3188, 2001.
[SS09] Christoph Schwab and Rob Stevenson. A space-time adaptive wavelet method for parabolic evolution problems. Math. Comp., 78:1293–1318, 2009.
[Ste15] Olaf Steinbach. Space-time finite element methods for parabolic problems. Comput. Methods Appl. Math., 15(4):551–566, 2015.
[SW21a] Rob Stevenson and Jan Westerdiep. Minimal residual space-time discretizations of parabolic equations: Asymmetric spatial operators. Comput. Math. Appl., 101:107–118, 2021.
[SW21b] Rob Stevenson and Jan Westerdiep. Stability of Galerkin discretizations of a mixed space–time variational formulation of parabolic evolution equations. IMA J. Numer. Anal., 41(1):28–47, 2021.
[Trö10] Fredi Tröltzsch. Optimal control of partial differential equations: theory, methods, and applications. American Mathematical Soc., Providence, 2010.
[UP14] Karsten Urban and Anthony Patera. An improved error bound for reduced basis approximation of linear parabolic problems. Math. Comp., 83(288):1599–1615, 2014.
[Wlo87] Joseph Wloka. Partial differential equations. Cambridge University, Cambridge, 1987.
[Yan14] Masayuki Yano. A space-time Petrov–Galerkin certified reduced basis method: Application to the Boussinesq equations. SIAM J. Sci. Comput., 36(1):A232–A266, 2014.
[YPU14] Masayuki Yano, Anthony T. Patera, and Karsten Urban. A space-time $hp$ -interpolation-based certified reduced basis method for Burgers’ equation. Math. Models Methods Appl. Sci., 24(09):1903–1935, 2014.

	$\displaystyle\|{\bf u}[{\bm{\mu}}]-{\bf u}^{N}[{\bm{\mu}}]\|_{U}^{2}$	$\displaystyle\eqsim\|G[{\bm{\mu}}]({\bf u}[{\bm{\mu}}])-G[{\bm{\mu}}]({\bf u}^{N}[{\bm{\mu}}])\|_{L}^{2}$
		$\displaystyle=\|G[{\bm{\mu}}]({\bf u}[{\bm{\mu}}])\|_{L}^{2}-\langle G[{\bm{\mu}}]({\bf u}[{\bm{\mu}}])\,,\,G[{\bm{\mu}}]({\bf u}^{N}[{\bm{\mu}}])\rangle_{L}$
		$\displaystyle=\sum_{q=1}^{n_{s}}\theta_{q}^{s}({\bm{\mu}})-\sum_{q=1}^{n_{l}}\theta_{q}^{l}({\bm{\mu}})\Big{(}\big{(}l_{q}({\bf u}^{\delta}[{\bm{\mu}}^{(i)}])\big{)}_{i=1}^{N}\cdot{\bf c}^{N}[{\bm{\mu}}]\Big{)},$

		$\displaystyle\|{\bf u}-{\bf u}^{\delta}\|^{2}_{U}+\|{\bf z}-{\bf z}^{\delta}\|^{2}_{Z}+\|{\bf p}-{\bf p}^{\delta}\|^{2}_{U}\eqsim$
		$\displaystyle\sup_{0\neq({\bf v},{\bf y},{\bf q})\in U\times Z\times U}\frac{\big{[}d({\bf u}\!-\!{\bf u}^{\delta},{\bf v})\!+\!e({\bf z}\!-\!{\bf z}^{\delta},{\bf y})\!+\!a({\bf v},{\bf p}\!-\!{\bf p}^{\delta})\!+\!b({\bf y},{\bf p}\!-\!{\bf p}^{\delta})\!+\!a({\bf u}\!-\!{\bf u}^{\delta},{\bf q})\!+\!b({\bf z}\!-\!{\bf z}^{\delta},{\bf q})\big{]}^{2}}{\|{\bf v}\|^{2}_{U}\!+\!\|{\bf y}\|^{2}_{Z}\!+\!\|{\bf q}\|^{2}_{U}}=$
		$\displaystyle\sup_{0\neq({\bf v},{\bf y},{\bf q})\in U\times Z\times U}\frac{\big{[}g^{\star}(({\bf v},{\bf y}))\!+\!f({\bf q})\!-\!\Big{(}d({\bf u}^{\delta},{\bf v})\!+\!e({\bf z}^{\delta},{\bf y})\!+\!a({\bf v},{\bf p}^{\delta})\!+\!b({\bf y},{\bf p}^{\delta})\!+\!a({\bf u}^{\delta},{\bf q})\!+\!b({\bf z}^{\delta},{\bf q})\Big{)}\big{]}^{2}}{\|{\bf v}\|^{2}_{U}\!+\!\|{\bf y}\|^{2}_{Z}\!+\!\|{\bf q}\|^{2}_{U}}=$
		$\displaystyle\sup_{0\neq({\bf v},{\bf y},{\bf q})\in U\times Z\times U}\frac{\big{[}\langle F{\bf v},w^{\star}\!-\!F{\bf u}^{\delta}\rangle_{W}\!-\!\langle G{\bf v},G{\bf p}^{\delta}\rangle_{L}\!+\!\langle{\bf f}^{\star}\!+\!{\bf z}^{\delta}\!-\!G{\bf u}^{\delta},G{\bf q}\rangle_{L}\!+\!\langle\Pi G{\bf p}^{\delta}\!-\!\varrho C{\bf z}^{\delta},{\bf y}\rangle_{\widehat{L}}\big{]}^{2}}{\|{\bf v}\|^{2}_{U}\!+\!\|{\bf y}\|^{2}_{Z}\!+\!\|{\bf q}\|^{2}_{U}}=$
		$\displaystyle\sup_{0\neq{\bf v}\in U}\frac{\big{[}\langle F{\bf v},w^{\star}\!-\!F{\bf u}^{\delta}\rangle_{W}\!-\!\langle G{\bf v},G{\bf p}^{\delta}\rangle_{L}\big{]}^{2}}{\|{\bf v}\|^{2}_{U}}\!+\!\sup_{0\neq{\bf q}\in U}\frac{\langle{\bf f}^{\star}\!+\!{\bf z}^{\delta}\!-\!G{\bf u}^{\delta},G{\bf q}\rangle_{L}^{2}}{\|{\bf q}\|_{U}^{2}}\!+\!\|\Pi G{\bf p}^{\delta}\!-\!\varrho C{\bf z}^{\delta}\|^{2}_{Z^{\prime}}\eqsim$
		$\displaystyle\sup_{0\neq{\bf v}\in U}\frac{\big{[}\langle F{\bf v},w^{\star}\!-\!F{\bf u}^{\delta}\rangle_{W}\!-\!\langle G{\bf v},G{\bf p}^{\delta}\rangle_{L}\big{]}^{2}}{\|{\bf v}\|^{2}_{U}}\!+\!\|{\bf f}^{\star}\!+\!{\bf z}^{\delta}\!-\!G{\bf u}^{\delta}\|_{L}^{2}\!+\!\|\Pi G{\bf p}^{\delta}\!-\!\varrho C{\bf z}^{\delta}\|^{2}_{Z^{\prime}}$		(4.10)

	$\displaystyle\sup_{0\neq{\bf v}\in U}$	$\displaystyle\frac{\langle F{\bf v},w^{\star}-F{\bf u}^{\delta}\rangle_{W}-\langle G{\bf v},G{\bf p}^{\delta}\rangle_{L}}{\|{\bf v}\|_{U}}\leq\|G\|_{\mathcal{L}(U,L)}\|\widetilde{\bm{\ell}}^{\delta}-\bm{\ell}^{\delta}\|_{L}+$
		$\displaystyle\|-\partial_{t}\widetilde{\ell}_{1}^{\delta}+\operatorname{div}_{\bf x}{\bf A}\widetilde{\bm{\ell}}_{2}^{\delta}-{\bf b}\cdot\nabla_{\bf x}\widetilde{\ell}_{1}^{\delta}+(c-\operatorname{div}_{\bf x}{\bf b})\widetilde{\ell}_{1}^{\delta}-\chi_{1}(w_{1}^{\star}-\chi_{1}u_{1}^{\delta})\|_{L_{2}(Q)}+$
		$\displaystyle\|-\nabla_{\bf x}\widetilde{\ell}_{1}^{\delta}-\widetilde{\bm{\ell}}_{2}^{\delta}-\chi_{2}({\bf w}_{2}^{\star}-\chi_{2}{\bf u}_{2}^{\delta})\|_{L_{2}(Q)^{d}}+$
		$\displaystyle\|{\bf v}\mapsto v_{1}(T,\cdot)\|_{\mathcal{L}(U,L_{2}(\Omega))}\|\widetilde{\ell}_{1}^{\delta}(T,\cdot)-\chi_{3}(w_{3}^{\star}-\chi_{3}u_{1}^{\delta}(T,\cdot))\|_{L_{2}(\Omega)}.$
