This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Crank-Nicolson time stepping and variational discretization of control-constrained parabolic optimal control problems

Nikolaus von Daniels111Schwerpunkt Optimierung und Approximation, Universität Hamburg, Bundesstraße 55, 20146 Hamburg, Germany., Michael Hinze11footnotemark: 1, and Morten Vierling11footnotemark: 1
(6-MAR-2015)

Abstract: We consider a control constrained parabolic optimal control problem and use variational discretization for its time semi-discretization. The state equation is treated with a Petrov-Galerkin scheme using a piecewise constant Ansatz for the state and piecewise linear, continuous test functions. This results in variants of the Crank-Nicolson scheme for the state and the adjoint state. Exploiting a superconvergence result we prove second order convergence in time of the error in the controls. Moreover, the piecewise linear and continuous parabolic projection of the discrete state on the dual time grid provides a second order convergent approximation of the optimal state without further numerical effort. Numerical experiments confirm our analytical findings.

AMS Subjects Classification: 49J20, 35K20, 49M05, 49M25, 49M29, 65M12, 65M60.
Keywords: Optimal control, Heat equation, Control constraints, Crank-Nicolson time stepping, Error estimates, Variational control discretization, Finite elements.

1 Introduction

The purpose of this paper is to prove optimal a priori error bounds for the variational time semi-discretization of a generic parabolic optimal control problem, where the state in time is approximated with a Petrov Galerkin scheme. The key idea consists in choosing piecewise linear, continuous test functions and a discontinuous, piecewise constant Ansatz for the approximation of the state equation. With this Petrov Galerkin Ansatz variational discretization of the optimal control problem delivers a cG(1) time approximation of the optimal time semi-discrete adjoint state. The resulting time integration schemes for the state and the adjoint state are variants of the Crank-Nicolson scheme. Combining this setting with the supercloseness result of Corollary 4.3 for interval means we are able to prove second order in time convergence of the time discrete optimal control in Theorem 5.2. Moreover, the piecewise linear and continuous parabolic projection of the discrete state based on the the values of the discrete state on the dual time grid provides a second order convergent approximation of the optimal state without further numerical effort, see Theorem 5.3.

Our work is motivated by the work [MV11] of Meidner and Vexler, whoms technical results we use whenever possible. Under mild assumptions on the active set they show the same convergence order in time for the post-processed piecewise linear, continuous parabolic projection of the piecewise constant in time optimal control. This control is obtained by a Petrov Galerkin scheme with variational discretization of the parabolic optimal control problem. In comparison to their work, we switch Ansatz and test space in our numerical schemes.

Our work is novel in several aspects:

  • In (4.1) to the best of the authors knowledge a new, fully variational time-discretization scheme for the parabolic state equation is presented. It results in a Crank-Nicolson scheme with initial damping step for the nodal values of the state.

  • In Theorem 5.2 we provide an optimal error estimate for the time-discrete control, namely

    u¯u¯kL2(0,T,D)Ck2,\left\|{\bar{u}}-{\bar{u}_{k}}\right\|_{L^{2}(0,T,\mathbb{R}^{D})}\leq Ck^{2},

    with u¯{\bar{u}} the optimal control, which is the solution of the optimal control problem (\mathbb{P}), and u¯k{\bar{u}_{k}} denoting the optimal control obtained from the related discretized problem (k\mathbb{P}_{k}) (see below). Here, kk denotes the grid size of the time grid. This result could be compared to [MV11, Theorem 6.2]. There, under mild assumptions on the structure of the active set w.r.t. u¯{\bar{u}}, a similar bound is obtained for the post-processed parabolic projection of a piecewise constant optimal control. Our approach avoids such an assumption in the numerical analysis, and presents an error estimate for the variational-discrete optimal control.

  • In Theorem 5.3 we prove

    y¯πPky¯kL2(0,T,L2(Ω))Ck2,\left\|{\bar{y}}-\pi_{P_{k}^{*}}{\bar{y}_{k}}\right\|_{L^{2}(0,T,L^{2}(\Omega))}\leq Ck^{2},

    where πPky¯k\pi_{P_{k}^{*}}{\bar{y}_{k}} denotes the piecewise linear and continuous parabolic projection of the discrete state based on the the values of the discrete state on the dual time grid defined in Section 4. Since these values are already known, the projection is for free.

  • Our approach demonstrates that variational discretization of [Hin05] through the choice of Ansatz and test space in a Petrov Galerkin approximation of parabolic optimal control problems offers the possibility to specify the discrete structure of variational optimal controls. Of course this feature applies also to other classes of PDE constrained optimal control problems.

In our note we only consider semi-discretization in time for two reasons. On the one hand, error estimation for standard spatial finite element approximations of the time-semidiscrete optimal control problem (k\mathbb{P}_{k}) is along the lines of [MV11, Section 6.2]. On the other hand, we are interested in the approximation of optimal controls, which in a realistic time-dependent scenario only depend on time, see the possible definitions of the control operator BB below.

With I:=(0,T)I:=(0,T)\subset\mathbb{R}, T<T<\infty, and a fixed function ydL2(I,L2(Ω))y_{d}\in{L^{2}(I,{L^{2}(\Omega)})}, we consider the linear-quadratic optimal control problem

minyY,uUadJ(y,u)=12yydL2(I,L2(Ω))2+α2uU2,\displaystyle\min_{y\in Y,u\in{U_{\textup{ad}}}}J(y,u)=\frac{1}{2}\|y-y_{d}\|^{2}_{{L^{2}(I,{L^{2}(\Omega)})}}+\frac{\alpha}{2}\|u\|^{2}_{U}, (\mathbb{P})
s.t. y=S(Bu,y0).\displaystyle\text{s.t. }y=S(Bu,y_{0}).

The state space YY is given by

Y:=W(I):={vL2(I,H01(Ω)),tvL2(I,H1(Ω))}C([0,T],L2(Ω)),Y:=W(I):=\{v\in L^{2}(I,{H^{1}_{0}(\Omega)}),\partial_{t}v\in L^{2}(I,{H^{-1}(\Omega)})\}\hookrightarrow C([0,T],L^{2}(\Omega)),

and the operator S:L2(I,H1(Ω))×L2(Ω)W(I)S:L^{2}(I,{H^{-1}(\Omega)})\times L^{2}(\Omega)\rightarrow W(I), (f,κ)y:=S(f,κ)(f,\kappa)\mapsto y:=S(f,\kappa), denotes the weak solution operator associated with the parabolic problem

tyΔy\displaystyle\partial_{t}y-\Delta y =f\displaystyle=f in I×Ω,\displaystyle\text{in }I\times\Omega\,, (1.1)
y\displaystyle y =0\displaystyle=0 in I×Ω,\displaystyle\text{in }I\times\partial\Omega\,,
y(0)\displaystyle y(0) =κ\displaystyle=\kappa in Ω,\displaystyle\text{in }\Omega,

i.e. for (f,κ)L2(I,H1(Ω))×L2(Ω)(f,\kappa)\in L^{2}(I,{H^{-1}(\Omega)})\times L^{2}(\Omega) the function yW(I)y\in W(I) satisfies y(0)=κy(0)=\kappa and

0Tty(t),v(t)H1(Ω)H01(Ω)+a(y(t),v(t))dt=0Tf(t),v(t)H1(Ω)H01(Ω)𝑑tvL2(I,H01(Ω)).\int\limits_{0}^{T}\langle\partial_{t}y(t),v(t)\rangle_{{H^{-1}(\Omega)}{H^{1}_{0}(\Omega)}}+a(y(t),v(t))\,dt=\int\limits_{0}^{T}\langle f(t),v(t)\rangle_{{H^{-1}(\Omega)}{H^{1}_{0}(\Omega)}}\,dt\\ \quad\forall\,v\in L^{2}(I,{H^{1}_{0}(\Omega)}). (1.2)

Here Ωn\Omega\subset\mathbb{R}^{n}, n=2,3n=2,3, is a convex polygonal domain with boundary Ω\partial\Omega, and for y,vH01(Ω)y,v\in{H^{1}_{0}(\Omega)} we define

a(y,v):=Ωy(x)v(x)𝑑x.a(y,v):=\int\limits_{\Omega}\nabla y(x)\nabla v(x)\ dx.

In what follows several choices of control spaces are feasible. In particular we may choose as control space U=L2(I,D)U=L^{2}(I,\mathbb{R}^{D}), DD\in\mathbb{N}, and as the admissible set

Uad={uU|aiui(t)bi a.e. in Ii=1,,D},{U_{\textup{ad}}}=\left\{u\in U\,\left|\;a_{i}\leq u_{i}(t)\leq b_{i}\text{ a.e. in $I$, }i=1,\dots,D\right.\right\},

where aia_{i}, bib_{i}\in\mathbb{R}, ai<bia_{i}<b_{i} (i=1,,D)(i=1,\dots,D). In this case the control operator is given by

B:UL2(I,H1(Ω)),u(ti=1Dui(t)gi),B:U\rightarrow L^{2}(I,{H^{-1}(\Omega)})\,,\quad u\mapsto\left(t\mapsto\sum_{i=1}^{D}u_{i}(t)g_{i}\right), (1.3)

where giH1(Ω)g_{i}\in{H^{-1}(\Omega)} are given functionals, whose regularity is specified in Assumption 1.1 below. Clearly, BB is linear and bounded. A further possible choice for the control space is U=L2(I,L2(Ω))U={L^{2}(I,{L^{2}(\Omega)})} with

Uad={uU|au(t,x)b a.e. in I×Ω},{U_{\textup{ad}}}=\left\{u\in U\,\left|\;a\leq u(t,x)\leq b\text{ a.e. in }I\times\Omega\right.\right\},

where a<ba<b denote real constants. In this case the control operator BB is the injection from UU into L2(I,H1(Ω))L^{2}(I,H^{-1}(\Omega)). In both cases the admissible set Uad{U_{\textup{ad}}} is closed and convex. We build our exposition upon the practical more relevant first choice of time-dependent amplitudes as controls.

It is well known that the operator SS is well defined, i.e. for every (f,κ)L2(I,H1(Ω))×L2(Ω)(f,\kappa)\in L^{2}(I,{H^{-1}(\Omega)})\times L^{2}(\Omega) a unique state yW(I)y\in W(I) satisfying (1.2) exists. Furthermore, it fulfills

yW(I)C{fL2(I,H1(Ω))+κL2(Ω)}.\|y\|_{W(I)}\leq C\left\{\|f\|_{L^{2}(I,{H^{-1}(\Omega)})}+\|\kappa\|_{L^{2}(\Omega)}\right\}.

Now let yYy\in Y denote the unique solution of (1.2), and let vW(I)v\in W(I). Then it follows from integration by parts for functions in W(I)W(I), that with the bilinear form A:W(I)×W(I)A:W(I)\times W(I)\rightarrow\mathbb{R} defined by

A(y,v):=0Ttv(t),y(t)H1(Ω)H01(Ω)+a(y(t),v(t))dt+(y(T),v(T))L2(Ω),A(y,v):=\int\limits_{0}^{T}-\langle\partial_{t}v(t),y(t)\rangle_{{H^{-1}(\Omega)}{H^{1}_{0}(\Omega)}}+a(y(t),v(t))\,dt+(y(T),v(T))_{{L^{2}(\Omega)}}\,, (1.4)

the state yy also satisfies

A(y,v)=0Tf(t),v(t)H1(Ω)H01(Ω)𝑑t+(κ,v(0))L2(Ω)vW(I).A(y,v)=\int\limits_{0}^{T}\langle f(t),v(t)\rangle_{{H^{-1}(\Omega)}{H^{1}_{0}(\Omega)}}\,dt+(\kappa,v(0))_{{L^{2}(\Omega)}}\quad\forall\ v\in W(I). (1.5)

Furthermore, yy is the only function in YY which satisfies (1.5). In the next section we use the bilinear form AA to define our numerical approximation scheme for the state equation.

With 𝒪(k2)\mathcal{O}(k^{2}) error-bounds for the control in mind we follow [MV11] and make the following assumptions on the data.

Assumption 1.1.

Let ydH1(I,L2(Ω))y_{d}\in H^{1}(I,L^{2}(\Omega)), and yd(T)H01(Ω)y_{d}(T)\in{H^{1}_{0}(\Omega)}. Let further giH01(Ω)g_{i}\in{H^{1}_{0}(\Omega)}, i=1,,Di=1,\dots,D, and finally y0H01(Ω)y_{0}\in{H^{1}_{0}(\Omega)} with Δy0H01(Ω)\Delta y_{0}\in{H^{1}_{0}(\Omega)}.

A lot of literature is available on optimal control problems with parabolic state equations. We refer to [HPUU09] for a comprehensive discussion, and also to [AF12, MV08a, MV08b, MV11, SV13] for the most recent developments related to optimal control with Galerkin methods in time.

The paper is organized as follows. In section 2 we briefly summarize the solution theory of the optimal control problem. In section 3 we analyse the regularity of the state and the adjoint state, which plays an important role in the time discretization. In section 4 the time discretization of state and adjoint state is discussed in detail. In section 5 we introduce variational discretization of the optimal control problem (P)(P) and prove second order convergence of the variational discrete controls in time. In section 6 we present numerical results which confirm our analytical findings.

2 The continuous problem ()(\mathbb{P})

It is well known that problem (\mathbb{P}) admits a unique solution (y¯,u¯)Y×U({\bar{y}},{\bar{u}})\in Y\times U, where y¯=S(Bu¯,y0){\bar{y}}=S(B{\bar{u}},y_{0}). Moreover, using the orthogonal projection PUad:L2(I,D)UadP_{U_{\textup{ad}}}:L^{2}(I,\mathbb{R}^{D})\rightarrow{U_{\textup{ad}}}, the optimal control is characterized by the first-order necessary and sufficient condition

u¯=PUad(1αBp¯),{\bar{u}}=P_{U_{\textup{ad}}}\left(-\frac{1}{\alpha}B^{\prime}\bar{p}\right), (2.1)

where (p¯,q¯)L2(I,H01(Ω))×L2(Ω)(\bar{p},\bar{q})\in L^{2}(I,{H^{1}_{0}(\Omega)})\times{L^{2}(\Omega)} (here we use reflexivity of the involved spaces) denotes the adjoint variable which is the unique solution to

0Tty~(t),p¯(t)H1(Ω)H01(Ω)+a(y~(t),p¯(t))dt+(y~(0),q¯)L2(Ω)=0TΩ(y¯(t,x)yd(t,x))y~(t,x)𝑑x𝑑ty~W(I).\int\limits_{0}^{T}\langle\partial_{t}\tilde{y}(t),\bar{p}(t)\rangle_{{H^{-1}(\Omega)}{H^{1}_{0}(\Omega)}}+a(\tilde{y}(t),\bar{p}(t))\,dt+(\tilde{y}(0),\bar{q})_{{L^{2}(\Omega)}}\\ =\int\limits_{0}^{T}\int\limits_{\Omega}({\bar{y}}(t,x)-y_{d}(t,x))\tilde{y}(t,x)\,dxdt\quad\forall\ \tilde{y}\in W(I). (2.2)

Here, B:L2(I,H01(Ω))L2(I,D)B^{\prime}:L^{2}(I,{H^{1}_{0}(\Omega)})\rightarrow L^{2}(I,\mathbb{R}^{D}) denotes the adjoint operator of BB, which is characterized by

Bq(t)=(g1,q(t)H1(Ω)H01(Ω),,gD,q(t)H1(Ω)H01(Ω))T.B^{\prime}q(t)=\left(\langle g_{1},q(t)\rangle_{{H^{-1}(\Omega)}{H^{1}_{0}(\Omega)}}\ ,\ \dots\ ,\ \langle g_{D},q(t)\rangle_{{H^{-1}(\Omega)}{H^{1}_{0}(\Omega)}}\right)^{T}. (2.3)

Furthermore we note that for vL2(I,D)v\in L^{2}(I,\mathbb{R}^{D}) there holds

PUad(v)(t)=(P[ai,bi](vi(t)))i=1D,P_{U_{\textup{ad}}}(v)(t)=\left(P_{[a_{i},b_{i}]}(v_{i}(t))\right)_{i=1}^{D},

where for a,b,za,b,z\in\mathbb{R} with aba\leq b we set P[a,b](z):=max{a,min{z,b}}P_{[a,b]}(z):=\max\{a,\min\{z,b\}\}.

Since y¯ydL2(I,L2(Ω)){\bar{y}}-y_{d}\in L^{2}(I,L^{2}(\Omega)) in (2.2), we have p¯W(I)\bar{p}\in W(I), so that by integration by parts for functions in W(I)W(I) we conclude from (2.2) (compare (1.5))

0Ttp¯(t),y~(t)H1(Ω)H01(Ω)+a(y~(t),p¯(t))dt+(y~(0),q¯)L2(Ω)+(y~(T),p¯(T))L2(Ω)(y~(0),p¯(0))L2(Ω)=0TΩ(y¯(t,x)yd(t,x))y~(t,x)𝑑x𝑑ty~W(I),\int\limits_{0}^{T}-\langle\partial_{t}\bar{p}(t),\tilde{y}(t)\rangle_{{H^{-1}(\Omega)}{H^{1}_{0}(\Omega)}}+a(\tilde{y}(t),\bar{p}(t))\,dt+\\ (\tilde{y}(0),\bar{q})_{{L^{2}(\Omega)}}+(\tilde{y}(T),\bar{p}(T))_{{L^{2}(\Omega)}}-(\tilde{y}(0),\bar{p}(0))_{{L^{2}(\Omega)}}\\ =\int\limits_{0}^{T}\int\limits_{\Omega}({\bar{y}}(t,x)-y_{d}(t,x))\tilde{y}(t,x)\,dxdt\quad\forall\ \tilde{y}\in W(I), (2.4)

so that the function p¯\bar{p} can be identified with the unique weak solution to the adjoint equation

tp¯Δp¯\displaystyle-\partial_{t}\bar{p}-\Delta\bar{p} =h\displaystyle=h in I×Ω,\displaystyle\text{in }I\times\Omega\,, (2.5)
p¯\displaystyle\bar{p} =0\displaystyle=0 on I×Ω,\displaystyle\text{on }I\times\partial\Omega\,,
p¯(T)\displaystyle\bar{p}(T) =0\displaystyle=0 on Ω,\displaystyle\text{on }\Omega,

with h:=y¯ydh:=\bar{y}-y_{d}. Moreover, q¯=p¯(0).\bar{q}=\bar{p}(0).

3 Regularity results

In this section we summarize some existence and regularity results concerning equation (1.1) and (2.5), which can also be found in e.g. [MV11]. We abbreviate

I:=L2(I,L2(Ω)),||I:=L2(I,D).\left\|\cdot\right\|_{{I}}:=\|\cdot\|_{L^{2}(I,L^{2}(\Omega))}\,,\quad\left|\cdot\right|_{{I}}:=\|\cdot\|_{L^{2}(I,\mathbb{R}^{D})}.

For the unique weak solutions yy to (1.1) and pp to (2.5) we have from [Eva98, Theorems 7.1.5 and 5.9.4] the regularity results.

Lemma 3.1.

For f,hL2(I,L2(Ω))f,h\in L^{2}({I},{L^{2}(\Omega)}) and κH01(Ω)\kappa\in{H^{1}_{0}(\Omega)} the solutions yy of (1.1) and pp of (2.5) satisfy

y,pL2(I,H2(Ω)H01(Ω))H1(I,L2(Ω))C([0,T],H01(Ω)).y,p\in L^{2}({I},H^{2}(\Omega)\cap{H^{1}_{0}(\Omega)})\cap H^{1}({I},{L^{2}(\Omega)})\hookrightarrow C([0,T],{H^{1}_{0}(\Omega)}).

Furthermore, with some constant C>0C>0 there holds

yI+tyI+ΔyI+maxtI¯y(t)H1(Ω)C{fI+κH1(Ω)},\left\|y\right\|_{{I}}+\left\|\partial_{t}y\right\|_{{I}}+\left\|\Delta y\right\|_{{I}}+\max_{t\in\bar{I}}\|y(t)\|_{H^{1}(\Omega)}\leq C\left\{\left\|f\right\|_{{I}}+\|\kappa\|_{H^{1}(\Omega)}\right\},

and

tpI+ΔpI+maxtI¯p(t)H1(Ω)ChI.\left\|\partial_{t}p\right\|_{{I}}+\left\|\Delta p\right\|_{{I}}+\max_{t\in\bar{I}}\|p(t)\|_{H^{1}(\Omega)}\leq C\left\|h\right\|_{{I}}.

However, in order to achieve 𝒪(k2)\mathcal{O}(k^{2})-convergence we need more regularity, i.e., at least second weak time derivatives. From [MV11, Proposition 2.1] we have

Lemma 3.2.

Let f,hH1(I,L2(Ω))f,h\in H^{1}({I},{L^{2}(\Omega)}), f(0),h(T)H01(Ω)f(0),h(T)\in{H^{1}_{0}(\Omega)}, and κH01(Ω)\kappa\in{H^{1}_{0}(\Omega)} with ΔκH01(Ω)\Delta\kappa\in{H^{1}_{0}(\Omega)}. Then the solutions y of (1.1) and p of (2.5) satisfy

y,pH1(I,H2(Ω)H01(Ω))H2(I,L2(Ω)).y,p\in H^{1}({I},H^{2}(\Omega)\cap{H^{1}_{0}(\Omega)})\cap H^{2}({I},{L^{2}(\Omega)}).

With some constant C>0C>0 we have the a priori estimates

t2yI+tΔyIC{fH1(I,L2(Ω))+f(0)H1(Ω)+κH1(Ω)+ΔκH1(Ω)},\left\|\partial^{2}_{t}y\right\|_{{I}}+\left\|\partial_{t}\Delta y\right\|_{{I}}\leq C\left\{\|f\|_{H^{1}({I},L^{2}(\Omega))}+\|f(0)\|_{H^{1}(\Omega)}+\|\kappa\|_{H^{1}(\Omega)}+\|\Delta\kappa\|_{H^{1}(\Omega)}\right\},

and

t2pI+tΔpIC{hH1(I,L2(Ω))+h(T)H1(Ω)}.\left\|\partial^{2}_{t}p\right\|_{{I}}+\left\|\partial_{t}\Delta p\right\|_{{I}}\leq C\left\{\|h\|_{H^{1}({I},L^{2}(\Omega))}+\|h(T)\|_{H^{1}(\Omega)}\right\}.

From Lemma 3.1 we conclude that the optimal state y¯{\bar{y}} lives in H1(I,L2(Ω))H^{1}({I},{L^{2}(\Omega)}), and y¯(T)H01(Ω){\bar{y}}(T)\in{H^{1}_{0}(\Omega)}. Thus, by Lemma 3.2 and Assumption 1.1 the optimal adjoint state p¯{\bar{p}} is an element of H2(I,L2(Ω))H^{2}({I},{L^{2}(\Omega)}). It then follows from (2.3) that Bp¯H2(I,D)B^{\prime}{\bar{p}}\in H^{2}({I},\mathbb{R}^{D}). Furthermore, for vW1,r(I,D)v\in W^{1,r}({I},\mathbb{R}^{D}), 1r1\leq r\leq\infty, one has

tPUad(v)Lr(I,D)tvLr(I,D),\|\partial_{t}P_{{U_{\textup{ad}}}}(v)\|_{L^{r}({I},\mathbb{R}^{D})}\leq\|\partial_{t}v\|_{L^{r}({I},\mathbb{R}^{D})},

so that (2.1) and Bp¯H2(I,D)B^{\prime}{\bar{p}}\in H^{2}({I},\mathbb{R}^{D}) imply u¯W1,(I,D){\bar{u}}\in W^{1,\infty}({I},\mathbb{R}^{D}).

Hence, using our Assumption 1.1, Lemma 3.2 is applicable to the solution of the state equation and one obtains the following result, see e.g. [MV11, Proposition 2.3].

Lemma 3.3.

Let Assumption 1.1 hold. For the unique solution (y¯,u¯)({\bar{y}},{\bar{u}}) of (\mathbb{P}) and the corresponding adjoint state p¯{\bar{p}} there holds

y¯,p¯H1(I,H2(Ω)H01(Ω))H2(I,L2(Ω)),andu¯W1,(I,D).{\bar{y}},{\bar{p}}\in H^{1}({I},H^{2}(\Omega)\cap{H^{1}_{0}(\Omega)})\cap H^{2}({I},{L^{2}(\Omega)})\,,\quad\text{and}\quad{\bar{u}}\in W^{1,\infty}({I},\mathbb{R}^{D})\,.

4 Time discretization

Let [0,T)=m=1MIm[0,T)=\bigcup_{{m}=1}^{M}I_{m}, where the intervals Im=[tm1,tm)I_{m}=[t_{{m}-1},t_{m}) are defined through the partition 0=t0<t1<<tM=T0=t_{0}<t_{1}<\dots<t_{M}=T. Furthermore, let tm=tm1+tm2t_{m}^{*}=\frac{t_{{m}-1}+t_{m}}{2} for m=1,,Mm=1,\dots,{M} denote the interval midpoints. By 0=:t0<t1<<tM<tM+1:=T0=:t_{0}^{*}<t_{1}^{*}<\dots<t_{M}^{*}<t_{{M}+1}^{*}:=T we get the so-called dual partition of [0,T)[0,T), namely [0,T)=m=1M+1Im[0,T)=\bigcup_{{m}=1}^{{M}+1}I_{m}^{*}, with Im=[tm1,tm)I_{m}^{*}=[t_{{m}-1}^{*},t_{m}^{*}). The grid width of the first (primal) partition is defined by the mesh-parameters km=tmtm1k_{m}=t_{m}-t_{m-1} and

k=max1mMkm.k=\max_{1\leq{m}\leq{M}}k_{m}.

On these partitions we define the Ansatz and test spaces of our Petrov Galerkin scheme for the numerical approximation of the optimal control problem (\mathbb{P}) w.r.t. time. We set

Pk:={vC([0,T],H01(Ω))|v|Im𝒫1(Im,H01(Ω))}W(I),P_{k}:=\left\{v\in C([0,T],{H^{1}_{0}(\Omega)})\,\left|\;{\left.\kern-1.2ptv\vphantom{\big{|}}\right|_{I_{m}}}\in\mathcal{P}_{1}(I_{m},{H^{1}_{0}(\Omega)})\right.\right\}\hookrightarrow W(I),
Pk:={vC([0,T],H01(Ω))|v|Im𝒫1(Im,H01(Ω))}W(I)P_{k}^{*}:=\left\{v\in C([0,T],{H^{1}_{0}(\Omega)})\,\left|\;{\left.\kern-1.2ptv\vphantom{\big{|}}\right|_{I_{m}^{*}}}\in\mathcal{P}_{1}(I_{m}^{*},{H^{1}_{0}(\Omega)})\right.\right\}\hookrightarrow W(I)

and

Yk:={v:[0,T]H01(Ω)|v|Im𝒫0(Im,H01(Ω))}.Y_{k}:=\left\{v:[0,T]\rightarrow{H^{1}_{0}(\Omega)}\,\left|\;{\left.\kern-1.2ptv\vphantom{\big{|}}\right|_{I_{m}}}\in\mathcal{P}_{0}(I_{m},{H^{1}_{0}(\Omega)})\right.\right\}\,.

Here, 𝒫i(J,H01(Ω))\mathcal{P}_{i}(J,{H^{1}_{0}(\Omega)}), JI¯J\subset\bar{I}, i{0,1}i\in\{0,1\}, denotes the set of polynomial functions in time of degree at most ii on the interval JJ with values in H01(Ω){H^{1}_{0}(\Omega)}. We note that functions in PkYkP_{k}\cup Y_{k} can be uniquely determined by M+1{M}+1 elements from H01(Ω){H^{1}_{0}(\Omega)}. Furthermore each function in YkY_{k} is also an element of L2(I,H01(Ω))L^{2}(I,{H^{1}_{0}(\Omega)}).
In what follows we frequently use the interpolation operators

  1. 1.

    𝒫Yk:L2(I,H01(Ω))Yk\mathcal{P}_{Y_{k}}:L^{2}(I,{H^{1}_{0}(\Omega)})\rightarrow Y_{k}

    𝒫Ykv|Im:=1kmtm1tmv𝑑t for m=1,,M, and 𝒫Ykv(T):=0{\left.\kern-1.2pt\mathcal{P}_{Y_{k}}v\vphantom{\big{|}}\right|_{I_{m}}}:=\frac{1}{k_{m}}\int_{t_{{m}-1}}^{t_{m}}vdt\text{ for }m=1,\dots,{M},\text{ and }\mathcal{P}_{Y_{k}}v(T):=0
  2. 2.

    ΠYk:C([0,T],H01(Ω))Yk\Pi_{Y_{k}}:C([0,T],{H^{1}_{0}(\Omega)})\rightarrow Y_{k}

    ΠYkv|Im:=v(tm) for m=1,,M, and ΠYkv(T):=v(T).{\left.\kern-1.2pt\Pi_{Y_{k}}v\vphantom{\big{|}}\right|_{I_{m}}}:=v\left(t_{m}^{*}\right)\text{ for }m=1,\dots,{M},\text{ and }\Pi_{Y_{k}}v(T):=v(T).
  3. 3.

    πPk:C([0,T],H01(Ω))YkPk\pi_{P_{k}^{*}}:C([0,T],{H^{1}_{0}(\Omega)})\cup Y_{k}\rightarrow P_{k}^{*}

    πPkv|I1I2\displaystyle{\left.\kern-1.2pt\pi_{P_{k}^{*}}v\vphantom{\big{|}}\right|_{I_{1}^{*}\cup I_{2}^{*}}} :=v(t1)+tt1t2t1(v(t2)v(t1)),\displaystyle=v(t_{1}^{*})+\frac{t-t_{1}^{*}}{t_{2}^{*}-t_{1}^{*}}(v(t_{2}^{*})-v(t_{1}^{*})),
    πPkv|Im\displaystyle{\left.\kern-1.2pt\pi_{P_{k}^{*}}v\vphantom{\big{|}}\right|_{I_{m}^{*}}} :=v(tm1)+ttm1tmtm1(v(tm)v(tm1)), for m=3,,M1,\displaystyle=v\left(t_{{m}-1}^{*}\right)+\frac{t-t_{{m}-1}^{*}}{t_{m}^{*}-t_{{m}-1}^{*}}(v(t_{m}^{*})-v(t_{{m}-1}^{*})),\text{ for }m=3,\dots,{M}-1,
    πPkv|IMIM+1\displaystyle{\left.\kern-1.2pt\pi_{P_{k}^{*}}v\vphantom{\big{|}}\right|_{I_{M}^{*}\cup I_{{M}+1}^{*}}} :=v(tM1)+ttM1tMtM1(v(tM)v(tM1)).\displaystyle=v(t_{{M}-1}^{*})+\frac{t-t_{{M}-1}^{*}}{t_{M}^{*}-t_{{M}-1}^{*}}(v(t_{M}^{*})-v(t_{{M}-1}^{*})).

To apply variational discretization to (\mathbb{P}) we next introduce the Petrov-Galerkin scheme for the approximation of the states. For this purpose we extend the bilinear form AA of (1.4) from W(I)W(I) to W(I)YkW(I)\cup Y_{k}, i.e. we consider AA as a mapping A:W(I)Yk×W(I)A:W(I)\cup Y_{k}\times W(I)\rightarrow\mathbb{R}. Then, according to (1.5) we for (f,κ)L2(I,H1(Ω))×L2(Ω)(f,\kappa)\in L^{2}(I,{H^{-1}(\Omega)})\times L^{2}(\Omega) consider the time-semidiscrete problem: Find ykYky_{k}\in Y_{k}, such that

A(yk,vk)=0Tf(t),vk(t)H1(Ω)H01(Ω)𝑑t+(κ,vk(0))L2(Ω)vkPk.A(y_{k},v_{k})=\int\limits_{0}^{T}\langle f(t),v_{k}(t)\rangle_{{H^{-1}(\Omega)}{H^{1}_{0}(\Omega)}}\,dt+(\kappa,v_{k}(0))_{{L^{2}(\Omega)}}\quad\forall\ v_{k}\in P_{k}. (4.1)

Then ykYky_{k}\in Y_{k} is uniquely determined. This follows from the fact that with

yk=αM+1χ{T}+i=1MαiχIi,αiH01(Ω) for i=1,,M+1,y_{k}=\alpha_{M+1}\chi_{\{T\}}+\sum\limits_{i=1}^{M}\alpha_{i}\chi_{I_{i}},\quad\alpha_{i}\in{H^{1}_{0}(\Omega)}\text{ for }i=1,\dots,M+1,

the coefficients αi\alpha_{i} for i=2,,Mi=2,\dots,M are determined by a Crank-Nicolson scheme with a (Rannacher) smoothing step [Ran84] for α1\alpha_{1}, and αM+1\alpha_{M+1} is uniquely determined by αM\alpha_{M}.

Note that in all of the following results CC denotes a generic, strict positive real constant that does not depend on quantities which appear to the right of it.

The following stability result will be useful in the later analysis.

Lemma 4.1.

Let ykYky_{k}\in Y_{k} solve (4.1) for fL2(I,L2(Ω))f\in{L^{2}(I,{L^{2}(\Omega)})} and κL2(Ω)\kappa\in{L^{2}(\Omega)} given. Then there exists a constant C>0C>0 independent of the time mesh size kk such that

ykIC(fI+κL2(Ω)).\left\|y_{k}\right\|_{{I}}\leq C\left(\left\|f\right\|_{{I}}+\left\|\kappa\right\|_{{L^{2}(\Omega)}}\right).
Proof.

We test in (4.1) with the vkPkv_{k}\in P_{k} that is uniquely determined by vk(T):=0v_{k}(T):=0 and tvk|Im:=yk|Im{\left.\kern-1.2pt\partial_{t}v_{k}\vphantom{\big{|}}\right|_{I_{m}}}:=-{\left.\kern-1.2pty_{k}\vphantom{\big{|}}\right|_{I_{m}}}, m=1,,Mm=1,\dots,M. We get using integration by parts in W(I)W(I)

A(yk,vk)\displaystyle A(y_{k},v_{k}) =ykI20T(tvk,vk)L2(Ω)𝑑t\displaystyle=\left\|y_{k}\right\|_{{I}}^{2}-\int\limits_{0}^{T}(\partial_{t}\nabla v_{k},\nabla v_{k})_{{L^{2}(\Omega)}}\,dt
=ykI2+12(vk(0)L2(Ω)2vk(T)L2(Ω)2)\displaystyle=\left\|y_{k}\right\|_{{I}}^{2}+\frac{1}{2}\left(\left\|\nabla v_{k}(0)\right\|_{{L^{2}(\Omega)}}^{2}-\left\|\nabla v_{k}(T)\right\|_{{L^{2}(\Omega)}}^{2}\right)
=ykI2+12vk(0)L2(Ω)2=(4.1)0T(f,vk)L2(Ω)𝑑t+(κ,vk(0))L2(Ω)\displaystyle=\left\|y_{k}\right\|_{{I}}^{2}+\frac{1}{2}\left\|\nabla v_{k}(0)\right\|_{{L^{2}(\Omega)}}^{2}\overset{\eqref{E:WFD}}{=}\int\limits_{0}^{T}(f,v_{k})_{{L^{2}(\Omega)}}\,dt+(\kappa,v_{k}(0))_{{L^{2}(\Omega)}}
12(ykI2+vk(0)L2(Ω)2)+C(fI2+κL2(Ω)2),\displaystyle\leq\frac{1}{2}\left(\left\|y_{k}\right\|_{{I}}^{2}+\left\|\nabla v_{k}(0)\right\|_{{L^{2}(\Omega)}}^{2}\right)+C\left(\left\|f\right\|_{{I}}^{2}+\left\|\kappa\right\|_{{L^{2}(\Omega)}}^{2}\right),

where we use the Cauchy-Schwarz inequality and Poincaré’s inequality. Rearranging terms and taking the square root yields the claim.  

For Petrov Galerkin approximations ykYky_{k}\in Y_{k} of states yW(I)y\in W(I) we can only expect 𝒪(k)\mathcal{O}(k) convergence, since yky_{k} is piecewise constant in time, compare [MV11, Lemma 5.2]. In order to obtain 𝒪(k2)\mathcal{O}(k^{2}) control approximations in our convergence analysis for problem (\mathbb{P}) we rely on the following super-convergence results for the projections ΠYk\Pi_{Y_{k}} and 𝒫Yk\mathcal{P}_{Y_{k}}, see e.g. [MV11, Lemma 5.3].

Lemma 4.2.

Let (f,κ)(f,\kappa) satisfy the regularity requirements of Lemma 3.2, and let y,yky,y_{k} solve (1.2) and (4.1) with data (f,κ)(f,\kappa), thus yH1(I,H2(Ω)H01(Ω))H2(I,L2(Ω))y\in H^{1}\left({I},H^{2}(\Omega)\cap{H^{1}_{0}(\Omega)}\right)\cap H^{2}\left({I},{L^{2}(\Omega)}\right). Then there holds

ykΠYkyICk2(t2yI+tΔyI).\left\|y_{k}-\Pi_{Y_{k}}y\right\|_{{I}}\leq Ck^{2}\left(\left\|\partial_{t}^{2}y\right\|_{{I}}+\left\|\partial_{t}\Delta y\right\|_{{I}}\right).

Note that the proof of [MV11, Lemma 5.3] is applicable in our situation since the initial value κ\kappa is the same for both, the continuous problem (1.2) and (4.1).

Corollary 4.3.

Let the assumptions of Lemma 4.2 hold. Then there holds

yk𝒫YkyICk2(t2yI+tΔyI).\left\|y_{k}-\mathcal{P}_{Y_{k}}y\right\|_{{I}}\leq Ck^{2}\left(\left\|\partial_{t}^{2}y\right\|_{{I}}+\left\|\partial_{t}\Delta y\right\|_{{I}}\right)\,.
Proof.

With the result of Lemma 4.2 at hand it suffices to show that

ΠYky𝒫YkyIk2t2yI\left\|\Pi_{Y_{k}}y-\mathcal{P}_{Y_{k}}y\right\|_{{I}}\leq k^{2}\left\|\partial_{t}^{2}y\right\|_{{I}} (4.2)

holds. We prove this estimate for smooth functions wC2(I,L2(Ω))H2(I,L2(Ω))w\in C^{2}({I},{L^{2}(\Omega)})\cap H^{2}({I},{L^{2}(\Omega)}). The result then follows by a density argument.

Suppose wC2(I,L2(Ω))H2(I,L2(Ω))w\in C^{2}({I},{L^{2}(\Omega)})\cap H^{2}({I},{L^{2}(\Omega)}). We use the Taylor expansion of ww at tmt_{m}^{*} and obtain

tm1tmw(t)w(tm)dtL2(Ω)2=tm1tm(ttm)tw(tm)+tmt(ts)t2w(s)dsdtL2(Ω)2kmtm1tmtmt(ts)t2w(s)dsL2(Ω)2𝑑tkm4tm1tmtmtt2w(s)L2(Ω)2𝑑s𝑑tkm5tm1tmt2w(s)L2(Ω)2𝑑s,\left\|\int_{t_{{m}-1}}^{t_{m}}w(t)-w(t^{*}_{m})dt\right\|_{L^{2}(\Omega)}^{2}=\left\|\int_{t_{{m}-1}}^{t_{m}}(t-t^{*}_{m})\partial_{t}w(t^{*}_{m})+\int_{t^{*}_{m}}^{t}(t-s)\partial_{t}^{2}w(s)dsdt\right\|_{L^{2}(\Omega)}^{2}\\ \leq k_{m}\int_{t_{{m}-1}}^{t_{m}}\left\|\int_{t^{*}_{m}}^{t}(t-s)\partial_{t}^{2}w(s)ds\right\|_{L^{2}(\Omega)}^{2}dt\leq k_{m}^{4}\int_{t_{{m}-1}}^{t_{m}}\int_{t^{*}_{m}}^{t}\left\|\partial_{t}^{2}w(s)\right\|_{L^{2}(\Omega)}^{2}dsdt\\ \leq k^{5}_{m}\int_{t_{{m}-1}}^{t_{m}}\left\|\partial_{t}^{2}w(s)\right\|_{L^{2}(\Omega)}^{2}ds, (4.3)

where we have used the Cauchy-Schwarz inequality twice. This proves

(m=1Mkm1kmtm1tmw(t)w(tm)dtL2(Ω)2)12k2t2wI,\left(\sum_{{m}=1}^{M}k_{m}\left\|\frac{1}{k_{m}}\int_{t_{{m}-1}}^{t_{m}}w(t)-w(t^{*}_{m})dt\right\|_{L^{2}(\Omega)}^{2}\right)^{\frac{1}{2}}\leq k^{2}\left\|\partial_{t}^{2}w\right\|_{{I}},

which is (4.2).  

For the next Lemma, see [MV11, Lemma 5.6], we need the following condition on the time grid:

Assumption 4.4.

There exist constants 0<c1c2<0<c_{1}\leq c_{2}<\infty independent of kk such that

c1kmkm+1c2c_{1}\leq\frac{k_{m}}{k_{m+1}}\leq c_{2}

holds for all m=1,2,,M1m=1,2,\dots,{M}-1.

Lemma 4.5.

Let the Assumption 4.4 be fulfilled. The interpolation operator πPk\pi_{P_{k}^{*}} has the following properties, where C>0C>0 in both cases denotes a constant independent of kk.

  1. 1.

    wπPkwICk2t2wIwH2(I,L2(Ω))\left\|w-\pi_{P_{k}^{*}}w\right\|_{{I}}\leq Ck^{2}\left\|\partial_{t}^{2}w\right\|_{{I}}\quad\forall\ w\in H^{2}(I,{L^{2}(\Omega)}),

  2. 2.

    πPkwkICwkIwkYk\left\|\pi_{P_{k}^{*}}w_{k}\right\|_{{I}}\leq C\left\|w_{k}\right\|_{{I}}\quad\forall\ w_{k}\in Y_{k}.

Since the state is discretized by piecewise constant functions, we can only expect first order convergence in time for its discretization error. The following Lemma shows that a projected version of the discretized state converges second order in time to the continuous state. The benefit of this result will be discussed in the numerics section.

Lemma 4.6.

Let yy and yky_{k} be given as in Lemma 4.2. Then there holds

πPkykyICk2(t2yI+tΔyI).\left\|\pi_{P_{k}^{*}}y_{k}-y\right\|_{{I}}\leq Ck^{2}\left(\left\|\partial_{t}^{2}y\right\|_{{I}}+\left\|\partial_{t}\Delta y\right\|_{{I}}\right).
Proof.

Making use of the splitting

πPkykyI=πPk(ykΠYky)I+πPkΠYkyyI=πPk(ykΠYky)I+πPkyyI,\left\|\pi_{P_{k}^{*}}y_{k}-y\right\|_{{I}}=\left\|\pi_{P_{k}^{*}}\left(y_{k}-\Pi_{Y_{k}}y\right)\right\|_{{I}}+\left\|\pi_{P_{k}^{*}}\Pi_{Y_{k}}y-y\right\|_{{I}}=\left\|\pi_{P_{k}^{*}}\left(y_{k}-\Pi_{Y_{k}}y\right)\right\|_{{I}}+\left\|\pi_{P_{k}^{*}}y-y\right\|_{{I}},

the claim is an immediate consequence of Lemma 4.5 and Lemma 4.2.  

In the numerical treatment of problem (\mathbb{P}) we also need error estimates for discrete adjoint functions pkPkW(I)p_{k}\in P_{k}\hookrightarrow W(I). For hL2(I,H1(Ω))h\in L^{2}(I,{H^{-1}(\Omega)}) we consider the problem: Find pkPkp_{k}\in P_{k} such that

A(y~,pk)=0Th(t),y~(t)H1(Ω)H01(Ω)𝑑ty~Yk.A(\tilde{y},p_{k})=\int\limits_{0}^{T}\langle h(t),\tilde{y}(t)\rangle_{{H^{-1}(\Omega)}{H^{1}_{0}(\Omega)}}\,dt\quad\forall\ \tilde{y}\in Y_{k}. (4.4)

This problem admits a unique solution pkPkp_{k}\in P_{k}. This follows from the fact that if we write

pk(t)=i=0Mβibi(t)p_{k}(t)=\sum\limits_{i=0}^{M}\beta_{i}b_{i}(t)

with coefficients βiH01(Ω)\beta_{i}\in{H^{1}_{0}(\Omega)} and biC([0,T])b_{i}\in C([0,T]), bi(tj)=δijb_{i}(t_{j})=\delta_{ij}, for i,j=0,,Mi,j=0,\dots,M, the coefficients βi\beta_{i} are determined by a backward in time Crank-Nicolson scheme, starting with βM0\beta_{M}\equiv 0. Similar to [MV11, Lemma 4.7] we have the following stability result, which we need to prove Theorem 5.3.

Lemma 4.7.

Let pkPkp_{k}\in P_{k} solve (4.4) with hL2(I,L2(Ω))h\in{L^{2}(I,{L^{2}(\Omega)})}. Then there exists a constant C>0C>0 independent of kk such that

pkH1(I,L2(Ω))+pk(0)H1(Ω)ChI.\left\|p_{k}\right\|_{H^{1}(I,{L^{2}(\Omega)})}+\left\|p_{k}(0)\right\|_{H^{1}(\Omega)}\leq C\left\|h\right\|_{{I}}.
Proof.

We define y~Yk\tilde{y}\in Y_{k} by y~|Im:=tpk|Im{\left.\kern-1.2pt\tilde{y}\vphantom{\big{|}}\right|_{I_{m}}}:=-{\left.\kern-1.2pt\partial_{t}p_{k}\vphantom{\big{|}}\right|_{I_{m}}}, m=1,,Mm=1,\dots,M, with y~(T)H01(Ω)\tilde{y}(T)\in{H^{1}_{0}(\Omega)} arbitrary. Testing with y~\tilde{y} in (4.4) we obtain

A(tpk,pk)=12pk(0)L2(Ω)2+tpkI2=(4.4)0T(h,tpk)L2(Ω)𝑑t,A(-\partial_{t}p_{k},p_{k})=\frac{1}{2}\left\|\nabla p_{k}(0)\right\|^{2}_{{L^{2}(\Omega)}}+\left\|\partial_{t}p_{k}\right\|_{{I}}^{2}\overset{\eqref{E:AdjDiscr}}{=}\int\limits_{0}^{T}(h,-\partial_{t}p_{k})_{{L^{2}(\Omega)}}\,dt\,,

where we have used pk(T)=0p_{k}(T)=0. Arguing as in Lemma 4.1 delivers the desired result.  

Moreover, from [MV11, Lemma 6.3] we have the following convergence results for discrete adjoint approximations.

Lemma 4.8.

Let p,pkp,p_{k} solve (2.5) and (4.4), respectively, where hL2(I,L2(Ω))h\in{L^{2}(I,{L^{2}(\Omega)})}. Then there holds

pkpICk2(t2pI+tΔpI).\left\|p_{k}-p\right\|_{{I}}\leq Ck^{2}\left(\left\|\partial_{t}^{2}p\right\|_{{I}}+\left\|\partial_{t}\Delta p\right\|_{{I}}\right).

One essential ingredient of our convergence analysis is given by the following result.

Lemma 4.9.

Let yy and yky_{k} as in Lemma 4.2, and let pk(h)Pkp_{k}(h)\in P_{k} denote the solution to (4.4) with right hand side hh. Then there holds

pk(yky)ICk2(t2yI+tΔyI).\left\|p_{k}(y_{k}-y)\right\|_{{I}}\leq Ck^{2}\left(\left\|\partial_{t}^{2}y\right\|_{{I}}+\left\|\partial_{t}\Delta y\right\|_{{I}}\right).
Proof.

The function pk(yky)p_{k}(y_{k}-y) solves (4.4) with pk(T)=0p_{k}(T)=0 and h=ykyh=y_{k}-y. Since the test functions y~\tilde{y} are elements of YkY_{k} we by Galerkin orthogonality obtain the same solution pkp_{k} with right hand side h=yk𝒫Ykyh=y_{k}-\mathcal{P}_{Y_{k}}y, i.e. pk(yky)=pk(yk𝒫Yky)p_{k}(y_{k}-y)=p_{k}(y_{k}-\mathcal{P}_{Y_{k}}y). Hence by Lemma 4.7 and Corollary 4.3 we obtain

pk(yky)I=pk(yk𝒫Yky)ICyk𝒫YkyICk2(t2yI+tΔyI),\left\|p_{k}(y_{k}-y)\right\|_{{I}}=\left\|p_{k}(y_{k}-\mathcal{P}_{Y_{k}}y)\right\|_{{I}}\leq C\left\|y_{k}-\mathcal{P}_{Y_{k}}y\right\|_{{I}}\leq Ck^{2}\left(\left\|\partial_{t}^{2}y\right\|_{{I}}+\left\|\partial_{t}\Delta y\right\|_{{I}}\right),

which is the claim.  

5 Variational discretization of the optimal control problem (\mathbb{P})

To approximate the optimal control problem (\mathbb{P}) we apply variational discretization of [Hin05] w.r.t. time, where the Petrov Galerkin state discretization introduced in the previous section is applied, i.e. we consider the optimal control problem

minykYk,uUadJ(yk,u)=12ykydL2(I,L2(Ω))2+α2uU2,\displaystyle\min_{y_{k}\in Y_{k},u\in{U_{\textup{ad}}}}J(y_{k},u)=\frac{1}{2}\|y_{k}-y_{d}\|^{2}_{L^{2}(I,{L^{2}(\Omega)})}+\frac{\alpha}{2}\|u\|^{2}_{U}, (k\mathbb{P}_{k})
s.t. yk=Sk(Bu,y0),\displaystyle\text{s.t. }y_{k}=S_{k}(Bu,y_{0}),

where SkS_{k} is the solution operator associated to (4.1). This problem admits a unique solution (y¯k,u¯k)Yk×Uad(\bar{y}_{k},\bar{u}_{k})\in Y_{k}\times{U_{\textup{ad}}}, where y¯k=Sk(Bu¯k,y0)\bar{y}_{k}=S_{k}(B\bar{u}_{k},y_{0}). The first order necessary and sufficient optimality condition for problem (k\mathbb{P}_{k}) reads

u¯k=PUad(1αBp¯k),\bar{u}_{k}=P_{U_{\textup{ad}}}\left(-\frac{1}{\alpha}B^{\prime}\bar{p}_{k}\right), (5.1)

where p¯kPk\bar{p}_{k}\in P_{k} denotes the discrete adjoint variable, which is the unique solution to (4.4) with h:=y¯kydh:={\bar{y}_{k}}-y_{d}. Equation (5.1) is amenable to numerical treatment although the controls are not discretized explicitly, see [Hin05]. It is possible to implement a globalized semismooth Newton strategy in order to solve (5.1) numerically, see [HV12].

First let us establish an error estimate that resembles the standard estimate for variationally discretized problems. To begin with we for vUv\in U set y(v):=S(Bv,y0)y(v):=S(Bv,y_{0}) and denote with yk(v)y_{k}(v) the solution to (4.1) with f:=Bvf:=Bv. Furthermore, we for hL2(I,H1(Ω))h\in L^{2}(I,{H^{-1}(\Omega)}) denote with pk(h)p_{k}(h) the solution to (4.4).

Lemma 5.1.

Let u¯{\bar{u}} and u¯k{\bar{u}_{k}} solve (\mathbb{P}) and (k\mathbb{P}_{k}), respectively. Then there holds

α|u¯ku¯|I2(B(pk(y¯yd)p¯+pk(yk(u¯))pk(y¯)),u¯u¯k)L2(I,D).\alpha\left|{\bar{u}_{k}}-{\bar{u}}\right|_{{I}}^{2}\leq\left(B^{\prime}\Big{(}p_{k}({\bar{y}}-y_{d})-{\bar{p}}+p_{k}(y_{k}({\bar{u}}))-p_{k}({\bar{y}})\Big{)},{\bar{u}}-{\bar{u}_{k}}\right)_{L^{2}({I},\mathbb{R}^{D})}.
Proof.

We note that (2.1) and (5.1) can be equivalently expressed as

(αu¯+Bp¯,u¯u)L2(I,D)0uUad,\displaystyle(\alpha{\bar{u}}+B^{\prime}{\bar{p}},{\bar{u}}-u)_{L^{2}({I},\mathbb{R}^{D})}\leq 0\quad\forall\ u\in{U_{\textup{ad}}}\,, (2.1’)
(αu¯k+Bp¯k,u¯ku)L2(I,D)0uUad.\displaystyle(\alpha{\bar{u}_{k}}+B^{\prime}\bar{p}_{k},{\bar{u}_{k}}-u)_{L^{2}({I},\mathbb{R}^{D})}\leq 0\quad\forall\ u\in{U_{\textup{ad}}}\,. (5.1’)

Now inserting u¯k{\bar{u}_{k}} into (2.1) and u¯{\bar{u}} into (5.1) and adding the resulting inequalities yields

(α(u¯ku¯)+B(p¯kp¯),u¯ku¯)L2(I,D)0.\left(\alpha({\bar{u}_{k}}-{\bar{u}})+B^{\prime}\Big{(}\bar{p}_{k}-{\bar{p}}\Big{)},{\bar{u}_{k}}-{\bar{u}}\right)_{L^{2}({I},\mathbb{R}^{D})}\leq 0.

After some simple manipulations we obtain

α|u¯ku¯|I2\displaystyle\alpha\left|{\bar{u}_{k}}-{\bar{u}}\right|_{{I}}^{2}\leq (B(pk(y¯yd)p¯+pk(yk(u¯))pk(y¯)),u¯u¯k)L2(I,D)\displaystyle\left(B^{\prime}\Big{(}p_{k}({\bar{y}}-y_{d})-{\bar{p}}+p_{k}(y_{k}({\bar{u}}))-p_{k}({\bar{y}})\Big{)},{\bar{u}}-{\bar{u}_{k}}\right)_{L^{2}({I},\mathbb{R}^{D})}
+(B(p¯kpk(yk(u¯)yd)),u¯u¯k)L2(I,D)y¯kyk(u¯)I2\displaystyle\underbrace{+\left(B^{\prime}\Big{(}\bar{p}_{k}-p_{k}(y_{k}({\bar{u}})-y_{d})\Big{)},{\bar{u}}-{\bar{u}_{k}}\right)_{L^{2}({I},\mathbb{R}^{D})}}_{-\left\|\bar{y}_{k}-y_{k}({\bar{u}})\right\|_{{I}}^{2}}
\displaystyle\leq (B(pk(y¯yd)p¯+pk(yk(u¯))pk(y¯)),u¯u¯k)L2(I,D),\displaystyle\left(B^{\prime}\Big{(}p_{k}({\bar{y}}-y_{d})-{\bar{p}}+p_{k}(y_{k}({\bar{u}}))-p_{k}({\bar{y}})\Big{)},{\bar{u}}-{\bar{u}_{k}}\right)_{L^{2}({I},\mathbb{R}^{D})},

which is the desired estimate.  

We are now in the position to formulate our main result.

Theorem 5.2.

Let u¯{\bar{u}} and u¯k{\bar{u}_{k}} denote the solutions to (\mathbb{P}) and (k\mathbb{P}_{k}), respectively. Then

α|u¯ku¯|ICk2(u¯H1(I,D)+u¯(0)D+ydH1(I,L2(Ω))+yd(T)H1(Ω)+y0H1(Ω)+Δy0H1(Ω))\alpha\left|{\bar{u}_{k}}-{\bar{u}}\right|_{{I}}\leq Ck^{2}\Big{(}\left\|{\bar{u}}\right\|_{H^{1}({I},\mathbb{R}^{D})}+\left\|{\bar{u}}(0)\right\|_{\mathbb{R}^{D}}\\ +\left\|y_{d}\right\|_{H^{1}({I},{L^{2}(\Omega)})}+\left\|y_{d}(T)\right\|_{H^{1}(\Omega)}+\left\|y_{0}\right\|_{H^{1}(\Omega)}+\left\|\Delta y_{0}\right\|_{H^{1}(\Omega)}\Big{)} (5.2)

is satisfied.

Proof.

Making use of the continuity of BB and BB^{\prime}, compare (1.3) and (2.3), we directly infer from Lemma 5.1

α|u¯ku¯|I\displaystyle\alpha\left|{\bar{u}_{k}}-{\bar{u}}\right|_{{I}} C(pk(y¯yd)p¯I+pk(yk(u¯))pk(y¯)I)\displaystyle\leq C\left(\left\|p_{k}({\bar{y}}-y_{d})-{\bar{p}}\right\|_{{I}}+\left\|p_{k}(y_{k}({\bar{u}}))-p_{k}({\bar{y}})\right\|_{{I}}\right)
Ck2(t2p¯I+tΔp¯I+t2y¯I+tΔy¯I).\displaystyle\leq Ck^{2}\left(\left\|\partial_{t}^{2}{\bar{p}}\right\|_{{I}}+\left\|\partial_{t}\Delta{\bar{p}}\right\|_{{I}}+\left\|\partial_{t}^{2}{\bar{y}}\right\|_{{I}}+\left\|\partial_{t}\Delta{\bar{y}}\right\|_{{I}}\right).

The last estimate follows from the Lemmata 4.8 and 4.9. The claim is now a direct consequence of the Lemmata 3.1 and 3.2.  

Finally we prove second order convergence for πPky¯k\pi_{P_{k}^{*}}\bar{y}_{k}, where we note that this function is obtained for free from y¯k\bar{y}_{k}, since y¯k\bar{y}_{k} only has to be evaluated on the dual time grid.

Theorem 5.3.

Let u¯{\bar{u}} and u¯k{\bar{u}_{k}} denote the solutions to (\mathbb{P}) and (k\mathbb{P}_{k}), respectively. Then there holds

y¯πPky¯kICk2(|a|I+u¯H1(I,D)+u¯(0)D+ydH1(I,L2(Ω))+yd(T)H1(Ω)+y0H1(Ω)+Δy0H1(Ω)).\left\|\bar{y}-\pi_{P_{k}^{*}}\bar{y}_{k}\right\|_{{I}}\leq Ck^{2}\Big{(}\left|a\right|_{{I}}+\left\|{\bar{u}}\right\|_{H^{1}({I},\mathbb{R}^{D})}+\left\|{\bar{u}}(0)\right\|_{\mathbb{R}^{D}}\\ +\left\|y_{d}\right\|_{H^{1}({I},{L^{2}(\Omega)})}+\left\|y_{d}(T)\right\|_{H^{1}(\Omega)}+\left\|y_{0}\right\|_{H^{1}(\Omega)}+\left\|\Delta y_{0}\right\|_{H^{1}(\Omega)}\Big{)}. (5.3)
Proof.

We have

y¯πPky¯kIy¯y(u¯k)I+y(u¯k)πPky¯kI.\left\|\bar{y}-\pi_{P_{k}^{*}}\bar{y}_{k}\right\|_{{I}}\leq\left\|\bar{y}-y(\bar{u}_{k})\right\|_{{I}}+\left\|y(\bar{u}_{k})-\pi_{P_{k}^{*}}\bar{y}_{k}\right\|_{{I}}.

Lipschitz continuity combined with (5.2) yields

y¯y(u¯k)ICk2(u¯H1(I,D)+u¯(0)D+ydH1(I,L2(Ω))+yd(T)H1(Ω)+y0H1(Ω)+Δy0H1(Ω)).\left\|\bar{y}-y(\bar{u}_{k})\right\|_{{I}}\leq Ck^{2}\Big{(}\left\|{\bar{u}}\right\|_{H^{1}({I},\mathbb{R}^{D})}+\left\|{\bar{u}}(0)\right\|_{\mathbb{R}^{D}}\\ +\left\|y_{d}\right\|_{H^{1}({I},{L^{2}(\Omega)})}+\left\|y_{d}(T)\right\|_{H^{1}(\Omega)}+\left\|y_{0}\right\|_{H^{1}(\Omega)}+\left\|\Delta y_{0}\right\|_{H^{1}(\Omega)}\Big{)}.

For the second addend we have from Lemma 4.6 combined with Lemma 3.2

y(u¯k)πPky¯kICk2(t2y(u¯k)I+tΔy(u¯k)I)Ck2{u¯kH1(I,D))+u¯k(0)D+y0H1(Ω)+Δy0H1(Ω)}.\left\|y(\bar{u}_{k})-\pi_{P_{k}^{*}}\bar{y}_{k}\right\|_{{I}}\leq Ck^{2}\left(\left\|\partial_{t}^{2}y(\bar{u}_{k})\right\|_{{I}}+\left\|\partial_{t}\Delta y(\bar{u}_{k})\right\|_{{I}}\right)\\ \leq Ck^{2}\left\{\left\|\bar{u}_{k}\right\|_{H^{1}({I},\mathbb{R}^{D}))}+\left\|\bar{u}_{k}(0)\right\|_{\mathbb{R}^{D}}+\left\|y_{0}\right\|_{H^{1}(\Omega)}+\left\|\Delta y_{0}\right\|_{H^{1}(\Omega)}\right\}.

Using (2.3) we with the help of (5.1), Lipschitz continuity of the orthogonal projection, Lemma 4.7, Lemma 4.1, and (5.2) estimate

u¯kH1(I,D))+u¯k(0)D\displaystyle\left\|\bar{u}_{k}\right\|_{H^{1}({I},\mathbb{R}^{D}))}+\left\|\bar{u}_{k}(0)\right\|_{\mathbb{R}^{D}} C{|a|I+p¯kH1(I,L2(Ω))+p¯k(0)H1(Ω)}\displaystyle\leq C\left\{\left|a\right|_{{I}}+\left\|\bar{p}_{k}\right\|_{H^{1}({I},L^{2}(\Omega))}+\left\|\bar{p}_{k}(0)\right\|_{H^{1}(\Omega)}\right\}
C{|a|I+y¯kydI}\displaystyle\leq C\left\{\left|a\right|_{{I}}+\left\|\bar{y}_{k}-y_{d}\right\|_{{I}}\right\}
C{|a|I+|u¯k|I+y0L2(Ω)+ydI}\displaystyle\leq C\left\{\left|a\right|_{{I}}+\left|{\bar{u}_{k}}\right|_{{I}}+\left\|y_{0}\right\|_{{L^{2}(\Omega)}}+\left\|y_{d}\right\|_{{I}}\right\}
C{|a|I+u¯H1(I,D)+u¯(0)D\displaystyle\leq C\left\{\left|a\right|_{{I}}+\left\|{\bar{u}}\right\|_{H^{1}({I},\mathbb{R}^{D})}+\left\|{\bar{u}}(0)\right\|_{\mathbb{R}^{D}}\right.
+ydH1(I,L2(Ω))+yd(T)H1(Ω)}.\displaystyle\left.\quad\quad\quad\quad+\left\|y_{d}\right\|_{H^{1}({I},{L^{2}(\Omega)})}+\left\|y_{d}(T)\right\|_{H^{1}(\Omega)}\right\}.

Collecting all estimates gives the desired result.  

6 Numerical examples

We now construct numerical examples that validate our main result, i.e. Theorem 5.2.
In both examples we make use of the fact that instead of the linear control operator BB, given by (1.3), we can also use an affine linear control operator

B~:UL2(I,H1(Ω)),ug0+Bu.\tilde{B}:U\rightarrow L^{2}(I,{H^{-1}(\Omega)})\,,\quad u\mapsto g_{0}+Bu. (6.1)

If we assume that g0g_{0} is an element of H1(I,L2(Ω))H^{1}(I,L^{2}(\Omega)) with initial value g0(0)H01(Ω)g_{0}(0)\in{H^{1}_{0}(\Omega)}, all the preceding theory remains valid.

6.1 First example

The first example is taken from [MV11]. We recall it for convenience in our notation.

Given a space-time domain Ω×I=(0,1)2×(0,0.1)\Omega\times I=(0,1)^{2}\times(0,0.1), i.e. D=1D=1, we consider first the control operator B~\tilde{B}, which is fully characterized by means of the two functions

g1(x1,x2):=sin(πx1)sin(πx2),g_{1}(x_{1},x_{2}):=\sin(\pi x_{1})\sin(\pi x_{2})\,,
g0(t,x1,x2):=π4wa(t,x1,x2)BPUad(14α(exp(aπ2t)exp(aπ2T))),g_{0}(t,x_{1},x_{2}):=-\pi^{4}w_{a}(t,x_{1},x_{2})-BP_{U_{\textup{ad}}}\left(-\frac{1}{4\alpha}(\exp(a\pi^{2}t)-\exp(a\pi^{2}T))\right)\,,

where

wa(t,x1,x2):=exp(aπ2t)sin(πx1)sin(πx2),a,w_{a}(t,x_{1},x_{2}):=\exp(a\pi^{2}t)\,\sin(\pi x_{1})\sin(\pi x_{2})\,,\quad a\in\mathbb{R}\,,

denote eigenfunctions of ±tΔ\pm\partial_{t}-\Delta. As a consequence we have

(Bz)(t)=Ωz(t,x1,x2)g1(x1,x2)𝑑x1𝑑x2,(B^{\prime}z)(t)=\int_{\Omega}z(t,x_{1},x_{2})\cdot g_{1}(x_{1},x_{2})\,dx_{1}dx_{2}\,,

compare (2.3). Note that we consider the adjoint of BB, not of B~\tilde{B}. Furthermore we take

yd(t,x1,x2):=a252+aπ2wa(t,x1,x2)+2π2wa(T,x1,x2),y_{d}(t,x_{1},x_{2}):=\frac{a^{2}-5}{2+a}\pi^{2}w_{a}(t,x_{1},x_{2})+2\pi^{2}w_{a}(T,x_{1},x_{2})\,,

and

y0(x1,x2):=12+aπ2wa(0,x1,x2).y_{0}(x_{1},x_{2}):=\frac{-1}{2+a}\pi^{2}w_{a}(0,x_{1},x_{2})\,.

The admissible set Uad{U_{\textup{ad}}} is defined by the bounds a1:=25a_{1}:=-25 and b1:=1b_{1}:=-1. Furthermore α:=π4\alpha:=\pi^{-4} and a:=5a:=-\sqrt{5}.

The exact solution of the optimal control problem (\mathbb{P}) is given by

u¯(t)=PUad(14α(exp(aπ2t)exp(aπ2T))),{\bar{u}}(t)=P_{U_{\textup{ad}}}\left(-\frac{1}{4\alpha}(\exp(a\pi^{2}t)-\exp(a\pi^{2}T))\right)\,,
y¯(t,x1,x2)=12+aπ2wa(t,x1,x2),{\bar{y}}(t,x_{1},x_{2})=\frac{-1}{2+a}\pi^{2}w_{a}(t,x_{1},x_{2})\,,

and

p¯(t,x1,x2)=wa(t,x1,x2)wa(T,x1,x2).{\bar{p}}(t,x_{1},x_{2})=w_{a}(t,x_{1},x_{2})-w_{a}(T,x_{1},x_{2})\,.

Note that this example fulfills the Assumption 1.1.

We solve this problem numerically using a fixpoint iteration on equation (5.1). We discretize in space with a fixed number of nodes Nh=(27+1)2=16 641\text{Nh}=(2^{7}+1)^{2}=16\,641. We examine the behavior of the temporal convergence by considering a sequence of meshes with Nk=(2+1)2\text{Nk}=(2^{\ell}+1)^{2} nodes at refinement levels =1,2,3,4,5,6\ell=1,2,3,4,5,6. Each fixpoint iteration is initialized by the starting value ukh:=a1u_{kh}:=a_{1}. As a stopping criterion we require

B(pkhnewpkhold)L(Ω×I)<t0,\|B^{\prime}\left(p_{kh}^{\text{new}}-p_{kh}^{\text{old}}\right)\|_{L^{\infty}(\Omega\times I)}<t_{0}\,,

where t0:=105t_{0}:=10^{-5} is a prescribed threshold.

\ell u¯ukhL1(I,L1(Ω))\|{\bar{u}}-u_{kh}\|_{L^{1}(I,L^{1}(\Omega))} u¯ukhL2(L2)\|{\bar{u}}-u_{kh}\|_{L^{2}(L^{2})} u¯ukhL(L)\|{\bar{u}}-u_{kh}\|_{L^{\infty}(L^{\infty})} EOCL1\text{EOC}_{L^{1}} EOCL2\text{EOC}_{L^{2}} EOCL\text{EOC}_{L^{\infty}}
1 0.07338346 0.31701554 1.97701729 / / /
2 0.01653824 0.08052755 0.66237792 2.15 1.98 1.58
3 0.00396507 0.01977927 0.19440662 2.06 2.03 1.77
4 0.00088306 0.00448012 0.05014900 2.17 2.14 1.95
5 0.00017870 0.00083749 0.00970228 2.31 2.42 2.37
6 0.00018581 0.00068442 0.00462541 -0.06 0.29 1.07
Table 1: First example: Errors and EOC in the control.
\ell y¯ykhL1(I,L1(Ω))\|{\bar{y}}-y_{kh}\|_{L^{1}(I,L^{1}(\Omega))} y¯ykhL2(L2)\|{\bar{y}}-y_{kh}\|_{L^{2}(L^{2})} y¯ykhL(L)\|{\bar{y}}-y_{kh}\|_{L^{\infty}(L^{\infty})} EOCL1\text{EOC}_{L^{1}} EOCL2\text{EOC}_{L^{2}} EOCL\text{EOC}_{L^{\infty}}
1 0.19002993 0.96055898 14.78668742 / / /
2 0.09429883 0.49287844 9.02297459 1.01 0.96 0.71
3 0.04706727 0.24798983 5.06528533 1.00 0.99 0.83
4 0.02352485 0.12419027 2.69722511 1.00 1.00 0.91
5 0.01176627 0.06216408 1.39374255 1.00 1.00 0.95
6 0.00588802 0.03119134 0.70870727 1.00 0.99 0.98
Table 2: First example: Errors and EOC in the state.
\ell y¯πPkykhL1(I,L1(Ω))\|{\bar{y}}-\pi_{P_{k}^{*}}y_{kh}\|_{L^{1}(I,L^{1}(\Omega))} L2(L2)\|\dots\|_{L^{2}(L^{2})} L(L)\|\dots\|_{L^{\infty}(L^{\infty})} EOCL1\text{EOC}_{L^{1}} EOCL2\text{EOC}_{L^{2}} EOCL\text{EOC}_{L^{\infty}}
1 0.10937032 0.48300664 6.29978738 / / /
2 0.02713496 0.13212665 1.94510739 2.01 1.87 1.70
3 0.00720221 0.03723408 0.61346273 1.91 1.83 1.66
4 0.00183081 0.00982563 0.17399005 1.98 1.92 1.82
5 0.00042588 0.00242796 0.04646078 2.10 2.02 1.90
6 0.00009796 0.00054833 0.01201333 2.12 2.15 1.95
Table 3: First example: Errors and EOC in the projected state.
\ell p¯pkhL1(I,L1(Ω))\|{\bar{p}}-p_{kh}\|_{L^{1}(I,L^{1}(\Omega))} p¯pkhL2(L2)\|{\bar{p}}-p_{kh}\|_{L^{2}(L^{2})} p¯pkhL(L)\|{\bar{p}}-p_{kh}\|_{L^{\infty}(L^{\infty})} EOCL1\text{EOC}_{L^{1}} EOCL2\text{EOC}_{L^{2}} EOCL\text{EOC}_{L^{\infty}}
1 0.00125773 0.00652756 0.08119466 / / /
2 0.00029007 0.00166280 0.02721190 2.12 1.97 1.58
3 0.00006933 0.00040888 0.00799559 2.06 2.02 1.77
4 0.00001564 0.00009340 0.00207127 2.15 2.13 1.95
5 0.00000310 0.00001739 0.00041008 2.34 2.43 2.34
6 0.00000273 0.00001246 0.00017768 0.18 0.48 1.21
Table 4: First example: Errors and EOC in the adjoint.

Table 1 shows the behavior of several errors in time between the exact control u¯{\bar{u}} and its discretized computed counterpart ukhu_{kh}, obtained by the fixpoint iteration. Furthermore, the experimental order of convergence (EOC) is given. The table indicates an error behavior of 𝒪(k2)\mathcal{O}(k^{2}) for the L2L^{2} error in the control, which is in accordance with Theorem 5.2. Furthermore, the error of the adjoint, see table 4, shows the same behavior, as expected. Here we note that the EOC in our numerical example deteriorates if the temporal error reaches the size of the spatial error (which in the numerical investigations is fixed through the choice of Nh), see e.g. the last lines in Tables 1, 4, 5, 8.
Since the state is discretized piecewise constant in time, the order of convergence is only one. This is depicted in table 2. However, without further numerical effort we obtain a second order convergent approximation of the state with the projection πPkyk\pi_{P_{k}^{*}}y_{k} of the discrete state yky_{k}, see Theorem 5.3 and see table 3 for the corresponding numerical results. In practise this means that we can gain a better approximation of the state without further effort; we only have to interpret the discrete state vector yk\vec{y}_{k}, i.e. the vector containing the value of yky_{k} on each interval ImI_{m}, in the right way, namely as a vector of values on the gridpoints of the dual grid t1<<tMt_{1}^{*}<\dots<t_{M}^{*}.

Refer to caption
(a) =1\ell=1
Refer to caption
(b) =2\ell=2
Refer to caption
(c) =3\ell=3
Figure 1: First example: Optimal control u¯{\bar{u}} (solid) and ukhu_{kh} (dashed) over time after refinement level \ell.

Figure 1 illustrates the convergence of ukhu_{kh} to u¯{\bar{u}}. Note that the intersection points between inactive set kh={tI|a<ukh(t)<b}\mathcal{I}_{kh}=\left\{t\in I\,\left|\;a<u_{kh}(t)<b\right.\right\} and active set 𝒜kh:=I\kh\mathcal{A}_{kh}:=I\backslash\mathcal{I}_{kh} need not coincide with time grid points since we use variational discretization. Let us further note that the number of fixpoint iterations does not depend on the fineness of the time grid size. In our example four iterations are needed to reach the above mentioned threshold of t0:=105t_{0}:=10^{-5}. Let us mention that the fixpoint iteration only converges for large enough values of α\alpha, see e.g. [HV12] which seems to be the case in our numerical examples. For smaller values of α\alpha the semi-smooth Newton method can be applied, see also [HV12] for its numerical analysis in the case of variational discretization of elliptic optimal control problems.

6.2 Second example

This example is a slight variant of the first one yielding more intersection points between the active and inactive set. With the space-time domain Ω×I=(0,1)2×(0,0.5)\Omega\times I=(0,1)^{2}\times(0,0.5), we set

y¯(t,x1,x2):=wa(t,x1,x2):=cos(tT 2πa)g1(x1,x2),{\bar{y}}(t,x_{1},x_{2}):=w_{a}(t,x_{1},x_{2}):=\cos\left(\frac{t}{T}\,2\pi a\right)\cdot g_{1}(x_{1},x_{2})\,,
p¯(t,x1,x2):=wa(t,x1,x2)wa(T,x1,x2),{\bar{p}}(t,x_{1},x_{2}):=w_{a}(t,x_{1},x_{2})-w_{a}(T,x_{1},x_{2})\,,
y0(x1,x2):=g1(x1,x2),y_{0}(x_{1},x_{2}):=g_{1}(x_{1},x_{2})\,,

where g1g_{1} is defined in the first example. Consequently,

g0=g12π(aTsin(tT 2πa)+πcos(tT 2πa))Bu¯,g_{0}=g_{1}2\pi\left(-\frac{a}{T}\sin\left(\frac{t}{T}\,2\pi a\right)+\pi\cos\left(\frac{t}{T}\,2\pi a\right)\right)-B{\bar{u}}\,,
yd=g1(cos(tT 2πa)(12π2)2πaTsin(tT 2πa)+2π2cos(2πa)),y_{d}=g_{1}\left(\cos\left(\frac{t}{T}\,2\pi a\right)\left(1-2\pi^{2}\right)-\frac{2\pi a}{T}\sin\left(\frac{t}{T}\,2\pi a\right)+2\pi^{2}\cos\left(2\pi a\right)\right)\,,

and

u¯=PUad(14αcos(tT 2πa)+14α).{\bar{u}}=P_{U_{\textup{ad}}}\left(-\frac{1}{4\alpha}\cos\left(\frac{t}{T}\,2\pi a\right)+\frac{1}{4\alpha}\right)\,.

Futhermore, we set α=1\alpha=1, a1:=0.2a_{1}:=0.2, b1:=0.4b_{1}:=0.4 and a:=2a:=2. Note that this example also fulfills the Assumption 1.1.

We now consider refinement levels =1,2,3,4,5,6,7,8\ell=1,2,3,4,5,6,7,8 and proceed as described in the first example. We obtain the same EOCs for control, state, and adjoint, see the tables 5, 6, 7, 8, and figure 2.

\ell u¯ukhL1(I,L1(Ω))\|{\bar{u}}-u_{kh}\|_{L^{1}(I,L^{1}(\Omega))} u¯ukhL2(L2)\|{\bar{u}}-u_{kh}\|_{L^{2}(L^{2})} u¯ukhL(L)\|{\bar{u}}-u_{kh}\|_{L^{\infty}(L^{\infty})} EOCL1\text{EOC}_{L^{1}} EOCL2\text{EOC}_{L^{2}} EOCL\text{EOC}_{L^{\infty}}
1 0.04925427 0.09237138 0.20000000 / / /
2 0.00256632 0.01106114 0.07336869 4.26 3.06 1.45
3 0.00403215 0.01144324 0.04704583 -0.65 -0.05 0.64
4 0.00069342 0.00204495 0.00893696 2.54 2.48 2.40
5 0.00016762 0.00050729 0.00249463 2.05 2.01 1.84
6 0.00003989 0.00011939 0.00064497 2.07 2.09 1.95
7 0.00000948 0.00003227 0.00020672 2.07 1.89 1.64
8 0.00000764 0.00002142 0.00009457 0.31 0.59 1.13
Table 5: Second example: Errors and EOC in the control.
\ell y¯ykhL1(I,L1(Ω))\|{\bar{y}}-y_{kh}\|_{L^{1}(I,L^{1}(\Omega))} y¯ykhL2(L2)\|{\bar{y}}-y_{kh}\|_{L^{2}(L^{2})} y¯ykhL(L)\|{\bar{y}}-y_{kh}\|_{L^{\infty}(L^{\infty})} EOCL1\text{EOC}_{L^{1}} EOCL2\text{EOC}_{L^{2}} EOCL\text{EOC}_{L^{\infty}}
1 0.19657193 0.41315218 2.24553307 / / /
2 0.13005269 0.25408123 1.25552256 0.60 0.70 0.84
3 0.05650537 0.11224959 0.65977254 1.20 1.18 0.93
4 0.02611675 0.05637041 0.38210207 1.11 0.99 0.79
5 0.01277289 0.02827337 0.19029296 1.03 1.00 1.01
6 0.00635223 0.01418903 0.09710641 1.01 0.99 0.97
7 0.00317298 0.00718111 0.04892792 1.00 0.98 0.99
8 0.00158730 0.00375667 0.02456764 1.00 0.93 0.99
Table 6: Second example: Errors and EOC in the state.
\ell y¯πPkykhL1(I,L1(Ω))\|{\bar{y}}-\pi_{P_{k}^{*}}y_{kh}\|_{L^{1}(I,L^{1}(\Omega))} L2(L2)\|\dots\|_{L^{2}(L^{2})} L(L)\|\dots\|_{L^{\infty}(L^{\infty})} EOCL1\text{EOC}_{L^{1}} EOCL2\text{EOC}_{L^{2}} EOCL\text{EOC}_{L^{\infty}}
1 0.19734452 0.42154165 2.65669891 / / /
2 0.13173168 0.25800727 1.39668789 0.58 0.71 0.93
3 0.03422500 0.07418402 0.40783930 1.94 1.80 1.78
4 0.01080693 0.02168391 0.15176831 1.66 1.77 1.43
5 0.00282859 0.00567595 0.04685968 1.93 1.93 1.70
6 0.00071212 0.00143268 0.01229008 1.99 1.99 1.93
7 0.00017551 0.00035509 0.00311453 2.02 2.01 1.98
8 0.00004104 0.00008530 0.00078765 2.10 2.06 1.98
Table 7: Second example: Errors and EOC in the projected state.
\ell p¯pkhL1(I,L1(Ω))\|{\bar{p}}-p_{kh}\|_{L^{1}(I,L^{1}(\Omega))} p¯pkhL2(L2)\|{\bar{p}}-p_{kh}\|_{L^{2}(L^{2})} p¯pkhL(L)\|{\bar{p}}-p_{kh}\|_{L^{\infty}(L^{\infty})} EOCL1\text{EOC}_{L^{1}} EOCL2\text{EOC}_{L^{2}} EOCL\text{EOC}_{L^{\infty}}
1 0.20659855 0.46853028 2.86360259 / / /
2 0.03491931 0.08118048 0.56829981 2.56 2.53 2.33
3 0.01994220 0.04100552 0.20495644 0.81 0.99 1.47
4 0.00440890 0.00895349 0.05815307 2.18 2.20 1.82
5 0.00105993 0.00215639 0.01668075 2.06 2.05 1.80
6 0.00026116 0.00053258 0.00447036 2.02 2.02 1.90
7 0.00006984 0.00014824 0.00116014 1.90 1.85 1.95
8 0.00004199 0.00008530 0.00046798 0.73 0.80 1.31
Table 8: Second example: Errors and EOC in the adjoint.
Refer to caption
(a) =1\ell=1
Refer to caption
(b) =2\ell=2
Refer to caption
(c) =3\ell=3
Figure 2: Second example: Optimal control u¯{\bar{u}} (solid) and ukhu_{kh} (dashed) over time after refinement level \ell.

Acknowledgements

Christian Kahle provided some hints on the numerical implementation which are gratefully acknowledged.

References

  • [AF12] Thomas Apel and Thomas G. Flaig. Crank-Nicolson schemes for optimal control problems with evolution equations. SIAM Journal on Numerical Analysis, 50:1484–1512, 2012.
  • [Eva98] Lawrence C. Evans. Partial Differential Equations. AMS, 1998.
  • [Hin05] Michael Hinze. A Variational Discretization Concept in Control Constrained Optimization: The Linear-Quadratic Case. Computational Optimization and Applications, 30(1):45–61, 2005.
  • [HPUU09] Michael Hinze, Rene Pinnau, Michael Ulbrich, and Stefan Ulbrich. Optimization with PDE Constraints. Springer, 2009.
  • [HV12] Michael Hinze and Morten Vierling. The semi-smooth Newton method for variationally discretized control constrained elliptic optimal control problems; implementation, convergence and globalization. Optimization Methods and Software, 27(6):933–950, 2012.
  • [MV08a] Dominik Meidner and Boris Vexler. A Priori Error Estimates for Space-Time Finite Element Discretization of Parabolic Optimal Control Problems Part I. SIAM Journal on Control and Optimization, 47(3):1150–1177, 2008.
  • [MV08b] Dominik Meidner and Boris Vexler. A Priori Error Estimates for Space-Time Finite Element Discretization of Parabolic Optimal Control Problems Part II. SIAM Journal on Control and Optimization, 47(3):1301–1329, 2008.
  • [MV11] Dominik Meidner and Boris Vexler. A priori error analysis of the Petrov Galerkin Crank Nicolson scheme for parabolic optimal control problems. SIAM Journal on Control and Optimization, 49(5):2183–2211, 2011.
  • [Ran84] Rolf Rannacher. Finite Element Solution of Diffusion Problems with Irregular Data. Numerische Mathematik, 43:309–327, 1984.
  • [SV13] Andreas Springer and Boris Vexler. Third order convergent time discretization for parabolic optimal control problems with control constraints. Computational Optimization and Applications, pages 1–36, 2013.