
A short note on plain convergence of
adaptive least-squares finite element methods

Thomas Führer (corresponding author), Facultad de Matemáticas, Pontificia Universidad Católica de Chile, Santiago, Chile, tofuhrer@mat.uc.cl, and Dirk Praetorius, TU Wien, Institute of Analysis and Scientific Computing, Wiedner Hauptstr. 8–10, 1040 Wien, Austria, dirk.praetorius@asc.tuwien.ac.at
Abstract.

We show that adaptive least-squares finite element methods driven by the canonical least-squares functional converge under weak conditions on the PDE operator, the mesh-refinement, and the marking strategy. Contrary to prior works, our plain convergence result relies neither on sufficiently fine initial meshes nor on severe restrictions on marking parameters. Finally, we prove that convergence remains valid if a contractive iterative solver is used to obtain the approximate solutions (e.g., the preconditioned conjugate gradient method with an optimal preconditioner). The results apply within a fairly abstract framework which covers a variety of model problems.

Key words and phrases:
Least squares finite element methods, adaptive algorithm, convergence
2010 Mathematics Subject Classification:
65N12, 65N15, 65N30, 65N50
Acknowledgment. This work was supported by CONICYT (through FONDECYT project 11170050) and the Austrian Science Fund FWF (through project P33216 and the special research program SFB F65).

1. Introduction

Least-squares finite element methods (LSFEMs) are a class of finite element methods that minimize the residual in some norm. These methods often become rather easy to analyze and implement when the residual is measured in the space of square-integrable functions. Some features of LSFEMs are the following: First, the resulting algebraic systems are always symmetric and positive definite, thus allowing the use of standard iterative solvers. Second, inf-sup stability for any conforming finite element space is inherited from the continuous problem. Finally, another feature is the built-in (localizable) error estimator that can be used to steer adaptive mesh-refinement in, e.g., an adaptive algorithm of the form

\displaystyle\boxed{\texttt{SOLVE}}\quad\longrightarrow\quad\boxed{\texttt{ESTIMATE}}\quad\longrightarrow\quad\boxed{\texttt{MARK}}\quad\longrightarrow\quad\boxed{\texttt{REFINE}}.

For standard discretizations, the mathematical understanding of adaptive finite element methods (AFEMs) has matured over the past decades. We mention [30, 33] for abstract theories for plain convergence of AFEM, the seminal works [18, 29, 4, 34, 17] for convergence of standard AFEM with optimal algebraic rates, and the abstract framework of [13], which also provides a comprehensive review of the state of the art.

In contrast to this, only very little is known about convergence of adaptive LSFEM; see [12, 8, 14, 15]. To the best of our knowledge, plain convergence of adaptive LSFEM when using the built-in error estimator has so far only been addressed in [15]. The other works [12, 8, 14] deal with optimal convergence results, but rely on alternative error estimators.

The aim of this work is to shed some new light on the plain convergence of adaptive LSFEMs using the canonical error estimator shipped with them. Basically, the whole idea of this note is to verify that a large class of LSFEMs fits into the abstract framework of [33]. From this, we then conclude that the sequence of discrete solutions produced by the above iteration converges to the exact solution of the underlying problem (see Theorem 2 below). Let us mention that our result is weaker than [15, Theorem 4.1], as we only show (plain) convergence, whereas [15] proves that an equivalent error quantity is contractive. However, the latter result comes at the price of assuming sufficiently fine initial meshes and sufficiently large marking parameters $0<\theta<1$ in the Dörfler marking criterion. This is somewhat at odds with the by now standard proofs of optimal convergence, where $\theta$ needs to be sufficiently small; see [13] for an overview. We also note that [15] is constrained to the Dörfler marking criterion, while the present analysis, in the spirit of [33], covers a fairly wide range of marking strategies.

The remainder of the work is organized as follows: In Section 2, we state our assumptions on the PDE setting (Section 2.1), the mesh-refinement (Section 2.3), the discrete spaces (Section 2.4), and the marking strategy (Section 2.6). Moreover, we recall the least-squares discretization (Section 2.2) as well as the built-in error estimator (Section 2.5) and formulate the common adaptive algorithm (Algorithm 1). Our first main result proves plain convergence of adaptive LSFEM (Theorem 2), if the LSFEM solution is computed exactly. In practice, however, iterative solvers (e.g., multigrid or the preconditioned CG method) are used. Our second main result (Theorem 6) proves convergence of adaptive LSFEM in the presence of inexact solvers (Algorithm 4). Overall, the presented abstract setting covers several model problems like the Poisson problem (Section 3.1), general elliptic second-order PDEs (Section 3.2), linear elasticity (Section 3.3), the magnetostatic Maxwell problem (Section 3.4), and the Stokes problem (Section 3.5). While this work focusses on plain convergence, the short appendix (Appendix A) notes that LSFEM ensures discrete reliability so that, in the spirit of [13], the only missing link for the mathematical proof of optimal convergence rates is the verification of linear convergence.

2. Plain convergence of adaptive least-squares methods

2.1. Continuous model formulation

We consider a PDE in the abstract form

(1) \displaystyle\mathcal{L}\boldsymbol{u}^{\star}=\boldsymbol{F}\quad\text{in a bounded domain }\Omega\subset\mathbb{R}^{d}\text{ with }d\geq 2.

Here, $\boldsymbol{F}\in L^{2}(\Omega)^{N}$ with $N\geq 1$ are given data, and $\mathcal{L}\colon\mathbb{V}(\Omega)\to L^{2}(\Omega)^{N}$ is a linear operator from some Hilbert space $\mathbb{V}(\Omega)$ with norm $\|\cdot\|_{\mathbb{V}(\Omega)}$ to $L^{2}(\Omega)^{N}$ with norm $\|\cdot\|_{L^{2}(\Omega)}$. For simplicity, we assume that homogeneous boundary conditions are contained in the space $\mathbb{V}(\Omega)$. To abbreviate notation, we write $\|\cdot\|_{\omega}:=\|\cdot\|_{L^{2}(\omega)}$ for any measurable set $\omega\subseteq\Omega$. Moreover, $(\cdot\,,\cdot)_{\omega}$ denotes the corresponding $L^{2}(\omega)$ scalar product.

We make the following assumptions on $\mathcal{L}$ and $\boldsymbol{F}$:

  1. (A1)

    𝓛\boldsymbol{\mathcal{L}} is continuously invertible: With constants ccnt,Ccnt>0c_{\rm cnt},C_{\rm cnt}>0, it holds that

    \displaystyle c_{\rm cnt}^{-1}\|\boldsymbol{v}\|_{\mathbb{V}(\Omega)}\leq\|\mathcal{L}\boldsymbol{v}\|_{\Omega}\leq C_{\rm cnt}\|\boldsymbol{v}\|_{\mathbb{V}(\Omega)}\quad\text{for all }\boldsymbol{v}\in\mathbb{V}(\Omega).
  2. (A2)

    PDE admits solution: The given data satisfy 𝑭ran()\boldsymbol{F}\in\operatorname{ran}(\mathcal{L}).

While (A2) yields the existence of the solution $\boldsymbol{u}^{\star}\in\mathbb{V}(\Omega)$ of (1), assumption (A1) leads to

(2) \displaystyle c_{\rm cnt}^{-1}\,\|\boldsymbol{u}^{\star}-\boldsymbol{v}\|_{\mathbb{V}(\Omega)}\leq\|\boldsymbol{F}-\mathcal{L}\boldsymbol{v}\|_{\Omega}\leq C_{\rm cnt}\,\|\boldsymbol{u}^{\star}-\boldsymbol{v}\|_{\mathbb{V}(\Omega)}\quad\text{for all }\boldsymbol{v}\in\mathbb{V}(\Omega)

and hence, in particular, also guarantees uniqueness.

In practice, $\mathbb{V}(\Omega)$ is often an integer-order Sobolev space (see the examples in Section 3) whose norm satisfies the following additional properties (which coincide with [33, eq. (2.3)]):

  1. (A3)

    Additivity: The norm on 𝕍(Ω)\mathbb{V}(\Omega) is additive, i.e., for two disjoint subdomains ω1,ω2Ω\omega_{1},\omega_{2}\subset\Omega with positive Lebesgue measure, it holds that

    \displaystyle\|\boldsymbol{v}\|_{\mathbb{V}(\omega_{1}\cup\omega_{2})}^{2}=\|\boldsymbol{v}\|_{\mathbb{V}(\omega_{1})}^{2}+\|\boldsymbol{v}\|_{\mathbb{V}(\omega_{2})}^{2}\quad\text{for all }\boldsymbol{v}\in\mathbb{V}(\Omega).
  2. (A4)

    Absolute continuity: The norm on 𝕍(Ω)\mathbb{V}(\Omega) is absolutely continuous with respect to the Lebesgue measure, i.e., for all 𝒗𝕍(Ω)\boldsymbol{v}\in\mathbb{V}(\Omega) and all domains ωΩ\omega\subset\Omega, it holds that

    \displaystyle\|\boldsymbol{v}\|_{\mathbb{V}(\omega)}\to 0\quad\text{as}\quad|\omega|\to 0.
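For instance, for the space $\mathbb{V}(\Omega)=H_{0}^{1}(\Omega)\times\boldsymbol{H}({\rm div\,};\Omega)$ used for the Poisson problem in Section 3.1 below, the canonical choice

\displaystyle\|(v,{\boldsymbol{\tau}})\|_{\mathbb{V}(\omega)}^{2}:=\|v\|_{\omega}^{2}+\|\nabla v\|_{\omega}^{2}+\|{\boldsymbol{\tau}}\|_{\omega}^{2}+\|{\rm div\,}{\boldsymbol{\tau}}\|_{\omega}^{2}

satisfies both (A3) and (A4), since every contribution is a Lebesgue integral over $\omega$ and hence additive as well as absolutely continuous with respect to the Lebesgue measure.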

2.2. Least-squares method

Let 𝕍(Ω)\mathbb{V}_{\bullet}(\Omega) be a closed subspace of 𝕍(Ω)\mathbb{V}(\Omega). The least-squares method seeks 𝒖𝕍(Ω)\boldsymbol{u}_{\bullet}^{\star}\in\mathbb{V}_{\bullet}(\Omega) as the minimizer of the least-squares functional, i.e.,

(3) \displaystyle{\rm LS}(\boldsymbol{u}_{\bullet}^{\star};\boldsymbol{F})=\min_{\boldsymbol{v}_{\bullet}\in\mathbb{V}_{\bullet}(\Omega)}{\rm LS}(\boldsymbol{v}_{\bullet};\boldsymbol{F}),\quad\text{where}\quad{\rm LS}(\boldsymbol{v};\boldsymbol{F})=\|\boldsymbol{F}-\mathcal{L}\boldsymbol{v}\|_{\Omega}^{2}.

The Euler–Lagrange equations for this problem read: Find 𝒖𝕍(Ω)\boldsymbol{u}_{\bullet}^{\star}\in\mathbb{V}_{\bullet}(\Omega) such that

(4) \displaystyle b(\boldsymbol{u}_{\bullet}^{\star},\boldsymbol{v}_{\bullet})=F(\boldsymbol{v}_{\bullet})\quad\text{for all }\boldsymbol{v}_{\bullet}\in\mathbb{V}_{\bullet}(\Omega),

where

\displaystyle b(\boldsymbol{w},\boldsymbol{v}):=(\mathcal{L}\boldsymbol{w}\,,\mathcal{L}\boldsymbol{v})_{\Omega}\quad\text{and}\quad F(\boldsymbol{v}):=(\boldsymbol{F}\,,\mathcal{L}\boldsymbol{v})_{\Omega}\quad\text{for all }\boldsymbol{v},\boldsymbol{w}\in\mathbb{V}(\Omega).
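To make the derivation of (4) from (3) explicit, note that, for arbitrary $\boldsymbol{v}_{\bullet}\in\mathbb{V}_{\bullet}(\Omega)$ and $t\in\mathbb{R}$, expanding the quadratic functional yields

\displaystyle{\rm LS}(\boldsymbol{u}_{\bullet}^{\star}+t\boldsymbol{v}_{\bullet};\boldsymbol{F})=\|\boldsymbol{F}-\mathcal{L}\boldsymbol{u}_{\bullet}^{\star}\|_{\Omega}^{2}-2t\,\big[F(\boldsymbol{v}_{\bullet})-b(\boldsymbol{u}_{\bullet}^{\star},\boldsymbol{v}_{\bullet})\big]+t^{2}\,b(\boldsymbol{v}_{\bullet},\boldsymbol{v}_{\bullet}),

so that the vanishing of the first variation at $t=0$ is precisely the variational formulation (4).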

It is straightforward to see that $F(\cdot)$ and $b(\cdot,\cdot)$ satisfy the assumptions of the Lax–Milgram lemma (and, in particular, [33, eq. (2.1)–(2.2)]), i.e., for all $\boldsymbol{v},\boldsymbol{w}\in\mathbb{V}(\Omega)$, it holds that

\displaystyle|F(\boldsymbol{v})|\leq C_{\rm cnt}\,\|\boldsymbol{F}\|_{\Omega}\|\boldsymbol{v}\|_{\mathbb{V}(\Omega)},\quad|b(\boldsymbol{w},\boldsymbol{v})|\leq C_{\rm cnt}^{2}\,\|\boldsymbol{w}\|_{\mathbb{V}(\Omega)}\|\boldsymbol{v}\|_{\mathbb{V}(\Omega)},\quad c_{\rm cnt}^{-2}\,\|\boldsymbol{w}\|_{\mathbb{V}(\Omega)}^{2}\leq b(\boldsymbol{w},\boldsymbol{w}).

Therefore, the discrete variational formulation (4) admits a unique solution and is equivalent to the minimization problem (3). In particular, there holds the Céa lemma

(5) \displaystyle\begin{split}c_{\rm cnt}^{-1}\|\boldsymbol{u}^{\star}-\boldsymbol{u}_{\bullet}^{\star}\|_{\mathbb{V}(\Omega)}\stackrel{\text{(A1)}}{\leq}\|\mathcal{L}(\boldsymbol{u}^{\star}-\boldsymbol{u}_{\bullet}^{\star})\|_{\Omega}&\stackrel{\text{(3)}}{=}\min_{\boldsymbol{v}_{\bullet}\in\mathbb{V}_{\bullet}(\Omega)}\|\mathcal{L}(\boldsymbol{u}^{\star}-\boldsymbol{v}_{\bullet})\|_{\Omega}\\ &\stackrel{\text{(A1)}}{\leq}C_{\rm cnt}\,\min_{\boldsymbol{v}_{\bullet}\in\mathbb{V}_{\bullet}(\Omega)}\|\boldsymbol{u}^{\star}-\boldsymbol{v}_{\bullet}\|_{\mathbb{V}(\Omega)},\end{split}

i.e., the exact least-squares solutions are quasi-optimal. Moreover, the Galerkin orthogonality

(6) \displaystyle b(\boldsymbol{u}^{\star}-\boldsymbol{u}_{\bullet}^{\star},\boldsymbol{v}_{\bullet})=0\quad\text{for all }\boldsymbol{v}_{\bullet}\in\mathbb{V}_{\bullet}(\Omega)

follows from the variational formulation (4).

2.3. Meshes and mesh-refinement

For simplicity, as in [33], we restrict our presentation to conforming simplicial triangulations $\mathcal{T}_{\bullet}$ of $\Omega$ and refinement by, e.g., the newest vertex bisection algorithm [35, 27]. For each triangulation $\mathcal{T}_{\bullet}$, let $h_{\bullet}\in L^{\infty}(\Omega)$ denote the associated mesh-size function given by $h_{\bullet}|_{T}=h_{T}=|T|^{1/d}$.

Let $\operatorname{refine}(\cdot)$ denote the refinement routine. We write $\mathcal{T}_{\circ}=\operatorname{refine}(\mathcal{T}_{\bullet},\mathcal{M}_{\bullet})$ if $\mathcal{T}_{\circ}$ is generated from $\mathcal{T}_{\bullet}$ by refining (at least) all marked elements $\mathcal{M}_{\bullet}\subseteq\mathcal{T}_{\bullet}$. Moreover, $\mathbb{T}(\mathcal{T}_{\bullet})$ denotes the set of all meshes that can be generated by an arbitrary but finite number of refinements of $\mathcal{T}_{\bullet}$. Throughout, let $\mathcal{T}_{0}$ be a given initial mesh and $\mathbb{T}:=\mathbb{T}(\mathcal{T}_{0})$.

We make the following assumptions (which essentially coincide with [33, eq. (2.4)]):

  1. (R1)

    Reduction on refined elements: The mesh-size function is monotone and contractive with constant 0<qref<10<q_{\mathrm{ref}}<1 on refined elements, i.e., for all 𝒯𝕋\mathcal{T}_{\bullet}\in\mathbb{T} and 𝒯𝕋(𝒯)\mathcal{T}_{\circ}\in\mathbb{T}(\mathcal{T}_{\bullet}), it holds that

    \displaystyle h_{\circ}\leq h_{\bullet}\quad\text{a.e.\ in }\Omega\quad\text{and}\quad h_{\circ}|_{T}\leq q_{\mathrm{ref}}\,h_{\bullet}|_{T}\quad\text{for all }T\in\mathcal{T}_{\circ}\setminus\mathcal{T}_{\bullet}.
  2. (R2)

    Uniform shape regularity: There exists a constant κ>0\kappa>0 depending only on the initial mesh 𝒯0\mathcal{T}_{0} such that

    \displaystyle\sup_{T\in\mathcal{T}_{\bullet}}\frac{\mathrm{diam}(T)^{d}}{|T|}\leq\kappa\quad\text{for all }\mathcal{T}_{\bullet}\in\mathbb{T}.
  3. (R3)

    Marked elements are refined: It holds that

    \displaystyle\mathcal{M}_{\bullet}\cap\operatorname{refine}(\mathcal{T}_{\bullet},\mathcal{M}_{\bullet})=\emptyset\quad\text{for all }\mathcal{T}_{\bullet}\in\mathbb{T}\text{ and all }\mathcal{M}_{\bullet}\subseteq\mathcal{T}_{\bullet}.
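For newest vertex bisection, these assumptions are satisfied: every refined element is bisected into children of half the volume, so that any $T^{\prime}\in\mathcal{T}_{\circ}\setminus\mathcal{T}_{\bullet}$ with parent $T\in\mathcal{T}_{\bullet}$ satisfies

\displaystyle h_{\circ}|_{T^{\prime}}=|T^{\prime}|^{1/d}\leq\big(|T|/2\big)^{1/d}=2^{-1/d}\,h_{\bullet}|_{T^{\prime}},

i.e., (R1) holds with $q_{\mathrm{ref}}=2^{-1/d}$, while (R2) follows from the fact that newest vertex bisection generates only finitely many similarity classes of simplices; see [35, 27].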

2.4. Discrete spaces

We assume that each mesh 𝒯𝕋\mathcal{T}_{\bullet}\in\mathbb{T} is associated with some discrete space 𝕍(Ω)\mathbb{V}_{\bullet}(\Omega) satisfying the following properties (which coincide with [33, eq. (2.5)]):

  1. (S1)

    Conformity: The spaces are conforming and finite dimensional, i.e.,

    \displaystyle\mathbb{V}_{\bullet}(\Omega)\subset\mathbb{V}(\Omega)\text{ and }\dim(\mathbb{V}_{\bullet}(\Omega))<\infty\quad\text{for all }\mathcal{T}_{\bullet}\in\mathbb{T}.
  2. (S2)

    Nestedness: Mesh-refinement guarantees nested discrete spaces, i.e.,

    \displaystyle\mathbb{V}_{\bullet}(\Omega)\subseteq\mathbb{V}_{\circ}(\Omega)\quad\text{for all }\mathcal{T}_{\circ}\in\mathbb{T}(\mathcal{T}_{\bullet}).

Moreover, we assume that there exists a dense subspace 𝔻(Ω)𝕍(Ω)\mathbb{D}(\Omega)\subset\mathbb{V}(\Omega) with additive norm 𝔻(Ω)\|\cdot\|_{\mathbb{D}(\Omega)} (see (A3) with 𝕍()\mathbb{V}(\cdot) being replaced by 𝔻()\mathbb{D}(\cdot)) satisfying an approximation property:

  1. (S3)

    Local approximation property: There exist constants Capx>0C_{\rm apx}>0 and s>0s>0 such that, for all 𝒯𝕋\mathcal{T}_{\bullet}\in\mathbb{T}, there exists an approximation operator 𝒜:𝔻(Ω)𝕍(Ω)\mathcal{A}_{\bullet}\colon\mathbb{D}(\Omega)\to\mathbb{V}_{\bullet}(\Omega) such that

    \displaystyle\|\boldsymbol{v}-\mathcal{A}_{\bullet}\boldsymbol{v}\|_{\mathbb{V}(T)}\leq C_{\rm apx}\,h_{T}^{s}\|\boldsymbol{v}\|_{\mathbb{D}(T)}\quad\text{for all }\boldsymbol{v}\in\mathbb{D}(\Omega)\text{ and all }T\in\mathcal{T}_{\bullet}.

2.5. Natural least-squares error estimator

Let 𝒯𝕋\mathcal{T}_{\bullet}\in\mathbb{T}. For all 𝒖𝕍(Ω)\boldsymbol{u}_{\bullet}\in\mathbb{V}_{\bullet}(\Omega) and all 𝒰𝒯\mathcal{U}_{\bullet}\subset\mathcal{T}_{\bullet}, we consider the contributions of the least-squares functional

(7) \displaystyle\eta_{\bullet}(\mathcal{U}_{\bullet},u_{\bullet}):=\bigg(\sum_{T\in\mathcal{U}_{\bullet}}\eta_{\bullet}(T,u_{\bullet})^{2}\bigg)^{1/2},\quad\text{where}\quad\eta_{\bullet}(T,u_{\bullet}):=\|\boldsymbol{F}-\mathcal{L}\boldsymbol{u}_{\bullet}\|_{T}.

To abbreviate notation, let $\eta_{\bullet}(u_{\bullet}):=\eta_{\bullet}(\mathcal{T}_{\bullet},u_{\bullet})=\|\boldsymbol{F}-\mathcal{L}\boldsymbol{u}_{\bullet}\|_{\Omega}$. From (2), we see that

(8) \displaystyle c_{\rm cnt}^{-1}\,\|\boldsymbol{u}^{\star}-\boldsymbol{u}_{\bullet}\|_{\mathbb{V}(\Omega)}\leq\eta_{\bullet}(u_{\bullet})={\rm LS}(\boldsymbol{u}_{\bullet};\boldsymbol{F})^{1/2}=\|\mathcal{L}(\boldsymbol{u}^{\star}-\boldsymbol{u}_{\bullet})\|_{\Omega}\leq C_{\rm cnt}\,\|\boldsymbol{u}^{\star}-\boldsymbol{u}_{\bullet}\|_{\mathbb{V}(\Omega)},

i.e., the least-squares functional provides a natural (localizable) error estimator which is reliable (upper bound $\|\boldsymbol{u}^{\star}-\boldsymbol{u}_{\bullet}\|_{\mathbb{V}(\Omega)}\lesssim\eta_{\bullet}(u_{\bullet})$) and efficient (lower bound $\eta_{\bullet}(u_{\bullet})\lesssim\|\boldsymbol{u}^{\star}-\boldsymbol{u}_{\bullet}\|_{\mathbb{V}(\Omega)}$) with respect to the error of any approximation $\boldsymbol{u}_{\bullet}\in\mathbb{V}_{\bullet}(\Omega)$ of $\boldsymbol{u}^{\star}$. In contrast to usual error estimators for standard Galerkin FEM, no data approximation term appears explicitly in the definition of the estimator, nor is one needed to prove the lower bound.

2.6. Marking strategy

We recall the following assumption from [33, Section 2.2.4] on the marking strategy, where $\boldsymbol{u}_{\bullet}\in\mathbb{V}_{\bullet}(\Omega)$ is a fixed approximation $\boldsymbol{u}_{\bullet}\approx\boldsymbol{u}^{\star}$:

  1. (M)

    There exists a fixed function g:00g\colon\mathbb{R}_{\geq 0}\to\mathbb{R}_{\geq 0} being continuous at 0=g(0)0=g(0) such that the set of marked elements 𝒯\mathcal{M}_{\bullet}\subseteq\mathcal{T}_{\bullet} (corresponding to 𝒖\boldsymbol{u}_{\bullet}) satisfies that

    \displaystyle\max_{T\in\mathcal{T}_{\bullet}\setminus\mathcal{M}_{\bullet}}\eta_{\bullet}(T,\boldsymbol{u}_{\bullet})\leq g\big(\max_{T\in\mathcal{M}_{\bullet}}\eta_{\bullet}(T,\boldsymbol{u}_{\bullet})\big).

We note that the marking assumption (M) is clearly satisfied with $g(s)=s$ if $\mathcal{M}_{\bullet}$ contains at least one element with maximal error indicator, i.e., there exists $T\in\mathcal{M}_{\bullet}$ such that $\eta_{\bullet}(T,u_{\bullet})\geq\eta_{\bullet}(T^{\prime},u_{\bullet})$ for all $T^{\prime}\in\mathcal{T}_{\bullet}$. We recall from [33, Section 4.1] that the latter is always satisfied for the following common marking strategies (so that the adaptivity parameter $\theta$ could even vary between the steps of the adaptive algorithm):

  • Maximum strategy: Given $0<\theta\leq 1$, the marked elements are determined by

    \displaystyle\mathcal{M}_{\bullet}=\big\{T\in\mathcal{T}_{\bullet}\,:\,\eta_{\bullet}(T,\boldsymbol{u}_{\bullet})\geq\theta\,M_{\bullet}\big\},\quad\text{where}\quad M_{\bullet}:=\max_{T\in\mathcal{T}_{\bullet}}\eta_{\bullet}(T,\boldsymbol{u}_{\bullet}).
  • Equilibration strategy: Given $0<\theta\leq 1$, the marked elements are determined by

    \displaystyle\mathcal{M}_{\bullet}=\big\{T\in\mathcal{T}_{\bullet}\,:\,\eta_{\bullet}(T,\boldsymbol{u}_{\bullet})^{2}\geq\theta\,\eta_{\bullet}(\boldsymbol{u}_{\bullet})^{2}/\#\mathcal{T}_{\bullet}\big\}.
  • Dörfler marking strategy: Given $0<\theta\leq 1$, the set $\mathcal{M}_{\bullet}\subseteq\mathcal{T}_{\bullet}$ is chosen such that

    \displaystyle\theta\,\eta_{\bullet}(\boldsymbol{u}_{\bullet})^{2}\leq\sum_{T\in\mathcal{M}_{\bullet}}\eta_{\bullet}(T,\boldsymbol{u}_{\bullet})^{2}\quad\text{and}\quad\max_{T\in\mathcal{T}_{\bullet}\setminus\mathcal{M}_{\bullet}}\eta_{\bullet}(T,\boldsymbol{u}_{\bullet})\leq\min_{T\in\mathcal{M}_{\bullet}}\eta_{\bullet}(T,\boldsymbol{u}_{\bullet}).

We note that the second condition on the Dörfler marking is not explicitly specified in [18], but usually satisfied if the implementation is based on sorting [31].
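To make the marking step concrete, the following minimal Python sketch implements the sorting-based Dörfler marking just described; the array eta_T collects the elementwise indicators $\eta_{\bullet}(T,\boldsymbol{u}_{\bullet})$, and the function name as well as the data layout are illustrative only (they are not taken from [31]).

import numpy as np

def doerfler_marking(eta_T, theta):
    # Sort the indicators in decreasing order, so that the marked set is a
    # prefix of the sorted list; this enforces the second condition
    # max_{T not in M} eta <= min_{T in M} eta automatically.
    order = np.argsort(eta_T)[::-1]
    cumulative = np.cumsum(eta_T[order] ** 2)
    # Smallest prefix whose squared sum reaches theta * eta(u)^2.
    n_marked = int(np.searchsorted(cumulative, theta * cumulative[-1])) + 1
    return order[:n_marked]

The maximum and equilibration strategies can be realized analogously with a single pass over eta_T, without any sorting.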

2.7. Convergent adaptive algorithm

Our first theorem states convergence of the following basic algorithm in the sense that the error as well as the error estimator are driven to zero.

Algorithm 1.

Input: Initial triangulation $\mathcal{T}_{0}$.
Loop: For all $\ell=0,1,2,\dots$, iterate the following steps (i)–(iv):

  • (i)

    SOLVE. Compute the exact least-squares solution $\boldsymbol{u}_{\ell}^{\star}\in\mathbb{V}_{\ell}(\Omega)$ by solving (3)–(4).

  • (ii)

    ESTIMATE. For all $T\in\mathcal{T}_{\ell}$, compute the contributions $\eta_{\ell}(T,\boldsymbol{u}_{\ell}^{\star})$ from (7).

  • (iii)

    MARK. Determine a set $\mathcal{M}_{\ell}\subseteq\mathcal{T}_{\ell}$ of marked elements satisfying (M) for $\boldsymbol{u}_{\ell}=\boldsymbol{u}_{\ell}^{\star}$.

  • (iv)

    REFINE. Generate a new mesh $\mathcal{T}_{\ell+1}:={\rm refine}(\mathcal{T}_{\ell},\mathcal{M}_{\ell})$.

Output: Sequences of approximations $\boldsymbol{u}_{\ell}^{\star}$ and corresponding error estimators $\eta_{\ell}(\boldsymbol{u}_{\ell}^{\star})$. ∎
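In pseudocode-like Python, the loop of Algorithm 1 has the following structure; the four callables are placeholders for a concrete LSFEM implementation and are not specified in this note.

def adaptive_lsfem(mesh, solve, estimate, mark, refine, n_steps=50):
    # solve(mesh)          -> exact least-squares solution u on V(mesh), step (i)
    # estimate(mesh, u)    -> elementwise indicators eta(T, u), step (ii)
    # mark(mesh, eta)      -> marked elements satisfying (M), step (iii)
    # refine(mesh, marked) -> refined mesh, e.g., by newest vertex bisection, step (iv)
    history = []
    for ell in range(n_steps):
        u = solve(mesh)
        eta = estimate(mesh, u)
        history.append((mesh, u, sum(e ** 2 for e in eta) ** 0.5))
        marked = mark(mesh, eta)
        mesh = refine(mesh, marked)
    return history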

The following theorem is our first main result.

Theorem 2.

Suppose the assumptions (A1)–(A4), (R1)–(R3), (S1)–(S3), and (M). In addition, we make the following assumption on the locality of the operator \mathcal{L}:

  1. (L)

    Local boundedness: There exists Cloc>0C_{\rm loc}>0 such that, for all 𝒯𝕋\mathcal{T}_{\bullet}\in\mathbb{T}, it holds that

    \displaystyle\|\mathcal{L}\boldsymbol{v}\|_{T}\leq C_{\rm loc}\,\|\boldsymbol{v}\|_{\mathbb{V}(\Omega_{\bullet}(T))}\quad\text{for all }\boldsymbol{v}\in\mathbb{V}(\Omega)\text{ and all }T\in\mathcal{T}_{\bullet},

    where Ω(T):={T𝒯:TT}Ω\Omega_{\bullet}(T):=\bigcup\big{\{}T^{\prime}\in\mathcal{T}_{\bullet}\,:\,T\cap T^{\prime}\neq\emptyset\big{\}}\subset\Omega denotes the patch of TT.

Then, Algorithm 1 generates a sequence of approximations $(\boldsymbol{u}_{\ell}^{\star})_{\ell\in\mathbb{N}_{0}}$ such that

(9) \displaystyle\|\boldsymbol{u}^{\star}-\boldsymbol{u}_{\ell}^{\star}\|_{\mathbb{V}(\Omega)}+\eta_{\ell}(\boldsymbol{u}_{\ell}^{\star})\to 0\quad\text{as}\quad\ell\to\infty.
Proof.

We only need to check that the assumptions of [33, Theorem 2.1] are satisfied. Our assumptions on marking strategy, refinement, and discrete spaces are the same as in [33, Section 2]. The bilinear form of the least-squares FEM is coercive on $\mathbb{V}(\Omega)$ and thus satisfies the uniform inf-sup conditions from [33, eq. (2.6)]. It thus only remains to verify [33, eq. (2.10)]: Given an element $\boldsymbol{w}\in\mathbb{V}(\Omega)$, define the residual $\boldsymbol{R}(\boldsymbol{w})\in\mathbb{V}(\Omega)^{\prime}$ by

(10) \displaystyle\langle\boldsymbol{R}(\boldsymbol{w})\,,\boldsymbol{v}\rangle:=F(\boldsymbol{v})-b(\boldsymbol{w},\boldsymbol{v})=b(\boldsymbol{u}^{\star}-\boldsymbol{w},\boldsymbol{v})\quad\text{for all }\boldsymbol{v}\in\mathbb{V}(\Omega).

Let 𝒯𝕋\mathcal{T}_{\bullet}\in\mathbb{T} and 𝒗𝕍(Ω)\boldsymbol{v}\in\mathbb{V}(\Omega). By definition and (L), we have that

\displaystyle\langle\boldsymbol{R}(\boldsymbol{u}_{\bullet}^{\star})\,,\boldsymbol{v}\rangle=(\boldsymbol{F}-\mathcal{L}\boldsymbol{u}_{\bullet}^{\star}\,,\mathcal{L}\boldsymbol{v})_{\Omega}=\sum_{T\in\mathcal{T}_{\bullet}}(\boldsymbol{F}-\mathcal{L}\boldsymbol{u}_{\bullet}^{\star}\,,\mathcal{L}\boldsymbol{v})_{T}\leq C_{\rm loc}\sum_{T\in\mathcal{T}_{\bullet}}\eta_{\bullet}(T,\boldsymbol{u}_{\bullet}^{\star})\,\|\boldsymbol{v}\|_{\mathbb{V}(\Omega_{\bullet}(T))},

which is [33, eq. (2.10a)]. Moreover, it holds that

\displaystyle\eta_{\bullet}(T,\boldsymbol{u}_{\bullet}^{\star})\leq\|\boldsymbol{F}\|_{T}+C_{\rm loc}\,\|\boldsymbol{u}_{\bullet}^{\star}\|_{\mathbb{V}(\Omega_{\bullet}(T))}\quad\text{for all }T\in\mathcal{T}_{\bullet},

which is [33, eq. (2.10b)] and hence concludes the verification of [33, eq. (2.10)]. Altogether, [33, Theorem 2.1] applies and proves that $\|\boldsymbol{u}^{\star}-\boldsymbol{u}_{\ell}^{\star}\|_{\mathbb{V}(\Omega)}\to 0$ as $\ell\to\infty$. Since $\eta_{\ell}(\boldsymbol{u}_{\ell}^{\star})\simeq\|\boldsymbol{u}^{\star}-\boldsymbol{u}_{\ell}^{\star}\|_{\mathbb{V}(\Omega)}$, we also have that $\eta_{\ell}(\boldsymbol{u}_{\ell}^{\star})\to 0$ as $\ell\to\infty$. ∎

Remark 3.

Note that the locality assumption (L) also directly implies local efficiency

\displaystyle\eta_{\bullet}(T,u_{\bullet})=\|\mathcal{L}(\boldsymbol{u}^{\star}-\boldsymbol{u}_{\bullet})\|_{T}\leq C_{\rm loc}\,\|\boldsymbol{u}^{\star}-\boldsymbol{u}_{\bullet}\|_{\mathbb{V}(\Omega_{\bullet}(T))}\quad\text{for all }T\in\mathcal{T}_{\bullet}\text{ and all }\boldsymbol{u}_{\bullet}\in\mathbb{V}_{\bullet}(\Omega).

Again, we stress that this local lower bound does not include data approximation terms.

2.8. Convergence in the case of inexact solvers

For given $\mathcal{T}_{\bullet}\in\mathbb{T}$, the exact computation of the least-squares solution $\boldsymbol{u}_{\bullet}^{\star}\in\mathbb{V}_{\bullet}(\Omega)$ corresponds to the solution of a symmetric and positive definite algebraic system; see (3)–(4). Let us assume that we have a contractive iterative solver at hand:

  1. (C)

    Contractive iterative solver: There exists 0<qctr<10<q_{\rm ctr}<1 as well as an equivalent norm |||||||||\cdot||| on 𝕍(Ω)\mathbb{V}(\Omega) such that, for all 𝒯𝕋\mathcal{T}_{\bullet}\in\mathbb{T}, there exists Φ:𝕍(Ω)𝕍(Ω)\Phi_{\bullet}\colon\mathbb{V}_{\bullet}(\Omega)\to\mathbb{V}_{\bullet}(\Omega) with

    \displaystyle|||\boldsymbol{u}_{\bullet}^{\star}-\Phi_{\bullet}(\boldsymbol{v}_{\bullet})|||\leq q_{\rm ctr}\,|||\boldsymbol{u}_{\bullet}^{\star}-\boldsymbol{v}_{\bullet}|||\quad\text{for all }\boldsymbol{v}_{\bullet}\in\mathbb{V}_{\bullet}(\Omega).

Under this additional assumption, the following adaptive strategy steers adaptive mesh-refinement as well as the iterative solver, where we employ nested iteration in Algorithm 4(iv) to lower the number of solver steps.

Algorithm 4.

Input: Initial triangulation $\mathcal{T}_{0}$, initial guess $\boldsymbol{u}_{0,0}\in\mathbb{V}_{0}(\Omega)$.
Loop: For all $\ell=0,1,2,\dots$, iterate the following steps (i)–(iv):

  • (i)

    INEXACT SOLVE. Starting from 𝒖,0\boldsymbol{u}_{\ell,0}, do at least one step of the iterative solver

    \displaystyle\boldsymbol{u}_{\ell,n}:=\Phi_{\ell}(\boldsymbol{u}_{\ell,n-1})\quad\text{for all }n=1,\dots,\underline{n}=\underline{n}(\ell)\geq 1.
  • (ii)

    ESTIMATE. For all T𝒯T\in\mathcal{T}_{\ell}, compute the contributions η(T,𝒖,n¯)\eta_{\ell}(T,\boldsymbol{u}_{\ell,\underline{n}}) from (7).

  • (iii)

    MARK. Determine a set 𝒯\mathcal{M}_{\ell}\subseteq\mathcal{T}_{\ell} of marked elements satisfying (M) for 𝒖=𝒖,n¯\boldsymbol{u}_{\ell}=\boldsymbol{u}_{\ell,\underline{n}}.

  • (iv)

    REFINE. Generate a new mesh 𝒯+1:=refine(𝒯,)\mathcal{T}_{\ell+1}:={\rm refine}(\mathcal{T}_{\ell},\mathcal{M}_{\ell}) and define 𝒖+1,0:=𝒖,n¯\boldsymbol{u}_{\ell+1,0}:=\boldsymbol{u}_{\ell,\underline{n}}.

Output: Sequences of approximations $\boldsymbol{u}_{\ell,\underline{n}}$ and corresponding error estimators $\eta_{\ell}(\boldsymbol{u}_{\ell,\underline{n}})$. ∎

Remark 5.

For the plain convergence result of the following theorem, one solver step (i.e., n¯()=1\underline{n}(\ell)=1 for all 0\ell\in\mathbb{N}_{0}) is indeed sufficient. In practice, the steps INEXACT SOLVE and ESTIMATE in Algorithm 4(i)–(ii) are usually combined. With some parameter λ>0\lambda>0, a natural stopping criterion for the iterative solver reads

\displaystyle|||\boldsymbol{u}_{\ell,\underline{n}}-\boldsymbol{u}_{\ell,\underline{n}-1}|||\leq\lambda\,\eta_{\ell}(\boldsymbol{u}_{\ell-1,\underline{n}})\quad\text{resp.}\quad|||\boldsymbol{u}_{\ell,\underline{n}}-\boldsymbol{u}_{\ell,\underline{n}-1}|||\leq\lambda\,\eta_{\ell}(\boldsymbol{u}_{\ell,\underline{n}}).

For adaptive FEM based on locally weighted estimators, we refer, e.g., to [1, 21] for linear convergence of such a strategy (based on the first criterion) and to [20, 23] for linear convergence with optimal rates (based on the second criterion for sufficiently small $0<\lambda\ll 1$). Moreover, we note that the second criterion even allows one to prove convergence with optimal rates with respect to the computational costs [22].
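A sketch of the combined INEXACT SOLVE and ESTIMATE step with the second stopping criterion might read as follows; solver_step plays the role of the contraction $\Phi_{\ell}$ from (C), estimate returns the elementwise indicators, and norm realizes $|||\cdot|||$, all of which are placeholders here.

def inexact_solve(u_prev, solver_step, estimate, norm, lam=0.1, n_max=1000):
    # Iterate u_n = Phi(u_{n-1}) until |||u_n - u_{n-1}||| <= lam * eta(u_n);
    # at least one solver step is performed, as required in Algorithm 4(i).
    for n in range(1, n_max + 1):
        u = solver_step(u_prev)
        eta = estimate(u)
        if norm(u - u_prev) <= lam * sum(e ** 2 for e in eta) ** 0.5:
            break
        u_prev = u
    return u, eta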

Our second theorem states convergence of Algorithm 4 in the sense that, also in the case of inexact solution of the least-squares systems, the error as well as the error estimator are driven to zero.

Theorem 6.

Suppose the assumptions (A1)–(A4), (R1)–(R3), (S1)–(S3), (M), (L), and (C). Then, Algorithm 4 generates a sequence of approximations $(\boldsymbol{u}_{\ell,\underline{n}})_{\ell\in\mathbb{N}_{0}}$ such that

(11) \displaystyle\|\boldsymbol{u}^{\star}-\boldsymbol{u}_{\ell,\underline{n}}\|_{\mathbb{V}(\Omega)}+\eta_{\ell}(\boldsymbol{u}_{\ell,\underline{n}})\to 0\quad\text{as}\quad\ell\to\infty.

The proof of Theorem 6 requires some preparations. First, we recall [33, Lemma 3.1], which already dates back to the seminal work [3].

Lemma 7.

Suppose assumptions (A1)–(A2) and (S1)–(S2). Let $(\boldsymbol{u}_{\ell}^{\star})_{\ell\in\mathbb{N}_{0}}$ denote the sequence of exact least-squares solutions obtained by solving (3)–(4) for the meshes generated by Algorithm 4. Then,

(12) \displaystyle\|\boldsymbol{u}_{\infty}^{\star}-\boldsymbol{u}_{\ell}^{\star}\|_{\mathbb{V}(\Omega)}\to 0\quad\text{as }\ell\to\infty,

where $\boldsymbol{u}_{\infty}^{\star}\in\mathbb{V}_{\infty}(\Omega):=\overline{\bigcup_{\ell\in\mathbb{N}_{0}}\mathbb{V}_{\ell}(\Omega)}\subseteq\mathbb{V}(\Omega)$ solves (and is, in fact, the unique solution of)

\displaystyle b(\boldsymbol{u}_{\infty}^{\star},\boldsymbol{v}_{\infty})=F(\boldsymbol{v}_{\infty})\quad\text{for all }\boldsymbol{v}_{\infty}\in\mathbb{V}_{\infty}(\Omega).\qquad\qed

The next lemma shows that the inexact least-squares solutions indeed converge to the same limit as the exact least-squares solutions.

Lemma 8.

In addition to the assumptions of Lemma 7, suppose assumption (C). Let $(\boldsymbol{u}_{\ell,\underline{n}})_{\ell\in\mathbb{N}_{0}}$ denote the (final) approximations computed in Algorithm 4. Then,

(13) \displaystyle\|\boldsymbol{u}_{\infty}^{\star}-\boldsymbol{u}_{\ell,\underline{n}}\|_{\mathbb{V}(\Omega)}\to 0\quad\text{as }\ell\to\infty,

where 𝐮𝕍(Ω)\boldsymbol{u}_{\infty}^{\star}\in\mathbb{V}(\Omega) is the limit from Lemma 7.

Proof.

Recall the norm |||||||||\cdot||| from (C). Since n¯(+1)1\underline{n}(\ell+1)\geq 1 and 𝒖+1,0=𝒖,n¯\boldsymbol{u}_{\ell+1,0}=\boldsymbol{u}_{\ell,\underline{n}}, it holds that

\displaystyle|||\boldsymbol{u}_{\ell+1}^{\star}-\boldsymbol{u}_{\ell+1,\underline{n}}|||\leq q_{\rm ctr}^{\underline{n}(\ell+1)}\,|||\boldsymbol{u}_{\ell+1}^{\star}-\boldsymbol{u}_{\ell+1,0}|||\leq q_{\rm ctr}\,|||\boldsymbol{u}_{\ell}^{\star}-\boldsymbol{u}_{\ell,\underline{n}}|||+q_{\rm ctr}\,|||\boldsymbol{u}_{\ell+1}^{\star}-\boldsymbol{u}_{\ell}^{\star}|||.

Let α:=|𝒖𝒖,n¯|0\alpha_{\ell}:=|||\boldsymbol{u}_{\ell}^{\star}-\boldsymbol{u}_{\ell,\underline{n}}|||\geq 0. Then, the latter estimate takes the form

\displaystyle 0\leq\alpha_{\ell+1}\leq q_{\rm ctr}\,\alpha_{\ell}+|||\boldsymbol{u}_{\ell+1}^{\star}-\boldsymbol{u}_{\ell}^{\star}|||,\quad\text{where}\quad\lim_{\ell\to\infty}|||\boldsymbol{u}_{\ell+1}^{\star}-\boldsymbol{u}_{\ell}^{\star}|||=0.

Elementary calculus (see, e.g., the estimator reduction in [13, Corollary 4.8]) shows that

\displaystyle\|\boldsymbol{u}_{\ell}^{\star}-\boldsymbol{u}_{\ell,\underline{n}}\|_{\mathbb{V}(\Omega)}\simeq|||\boldsymbol{u}_{\ell}^{\star}-\boldsymbol{u}_{\ell,\underline{n}}|||=\alpha_{\ell}\to 0\quad\text{as }\ell\to\infty.
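For the convenience of the reader, the elementary argument reads as follows: given $\varepsilon>0$, choose $\ell_{0}\in\mathbb{N}_{0}$ such that $|||\boldsymbol{u}_{\ell+1}^{\star}-\boldsymbol{u}_{\ell}^{\star}|||\leq\varepsilon$ for all $\ell\geq\ell_{0}$. Induction on the recursion then shows that

\displaystyle\alpha_{\ell_{0}+m}\leq q_{\rm ctr}^{m}\,\alpha_{\ell_{0}}+\frac{\varepsilon}{1-q_{\rm ctr}}\quad\text{for all }m\in\mathbb{N}_{0},

and hence $\limsup_{\ell\to\infty}\alpha_{\ell}\leq\varepsilon/(1-q_{\rm ctr})$. Since $\varepsilon>0$ was arbitrary, this proves $\alpha_{\ell}\to 0$.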

Overall, we thus see that

\displaystyle\|\boldsymbol{u}_{\infty}^{\star}-\boldsymbol{u}_{\ell,\underline{n}}\|_{\mathbb{V}(\Omega)}\leq\|\boldsymbol{u}_{\infty}^{\star}-\boldsymbol{u}_{\ell}^{\star}\|_{\mathbb{V}(\Omega)}+\|\boldsymbol{u}_{\ell}^{\star}-\boldsymbol{u}_{\ell,\underline{n}}\|_{\mathbb{V}(\Omega)}\to 0\quad\text{as }\ell\to\infty.

This concludes the proof. ∎

Proof of Theorem 6.

We verify the validity of the building blocks of the proof of [33, Theorem 2.1] in the case of inexact solutions. This basically follows from the convergence |𝒖𝒖,n¯|0|||\boldsymbol{u}_{\infty}^{\star}-\boldsymbol{u}_{\ell,\underline{n}}|||\to 0 shown in Lemma 8. Indeed, a closer look unveils that we only need to verify [33, Lemma 3.5–3.6 and Proposition 3.7] which we do in the following steps:

Step 1 (Uniform boundedness): From (A1), (A3), and Lemma 8, it follows that

\displaystyle\eta_{\ell}(\boldsymbol{u}_{\ell,\underline{n}})=\|\boldsymbol{F}-\mathcal{L}\boldsymbol{u}_{\ell,\underline{n}}\|_{\Omega}\leq\|\boldsymbol{F}-\mathcal{L}\boldsymbol{u}_{\infty}^{\star}\|_{\Omega}+C_{\rm cnt}\,\|\boldsymbol{u}_{\infty}^{\star}-\boldsymbol{u}_{\ell,\underline{n}}\|_{\mathbb{V}(\Omega)}

and hence $\sup_{\ell\in\mathbb{N}_{0}}\eta_{\ell}(\boldsymbol{u}_{\ell,\underline{n}})<\infty$.

Step 2 (Convergence of estimator on marked elements): With (A1) and (A3), it holds that

\displaystyle\eta_{\ell}(T,\boldsymbol{u}_{\ell,\underline{n}})=\|\boldsymbol{F}-\mathcal{L}\boldsymbol{u}_{\ell,\underline{n}}\|_{T}\leq\|\boldsymbol{F}-\mathcal{L}\boldsymbol{u}_{\infty}^{\star}\|_{T}+C_{\rm cnt}\,\|\boldsymbol{u}_{\infty}^{\star}-\boldsymbol{u}_{\ell,\underline{n}}\|_{\mathbb{V}(\Omega)}.

By Lemma 8, the second term tends to zero. Arguing as in the proof of [33, Lemma 3.6] and hence exploiting (A4) and (R1)–(R3), we see that

\displaystyle\lim_{\ell\to\infty}\max\big\{\eta_{\ell}(T,\boldsymbol{u}_{\ell,\underline{n}})\,:\,T\in\mathcal{M}_{\ell}\big\}=0.

Step 3 (Weak convergence of residual): We tweak Step 2 in the proof of [33, Proposition 3.7]: Let 𝒗𝔻(Ω)\boldsymbol{v}\in\mathbb{D}(\Omega) and recall that 𝔻(Ω)𝕍(Ω)\mathbb{D}(\Omega)\subset\mathbb{V}(\Omega) is dense by assumption (S3). Recall the residual from (10). For the exact least-squares solution 𝒖𝕍(Ω)\boldsymbol{u}_{\ell}^{\star}\in\mathbb{V}_{\ell}(\Omega), we may use Galerkin orthogonality (6) and local boundedness (L) to see that

\displaystyle|\langle\boldsymbol{R}(\boldsymbol{u}_{\ell,\underline{n}})\,,\boldsymbol{v}\rangle|=|b(\boldsymbol{u}^{\star}-\boldsymbol{u}_{\ell,\underline{n}},\boldsymbol{v}-\mathcal{A}_{\ell}\boldsymbol{v})+b(\boldsymbol{u}^{\star}-\boldsymbol{u}_{\ell,\underline{n}},\mathcal{A}_{\ell}\boldsymbol{v})|
\stackrel{\text{(6)}}{=}|b(\boldsymbol{u}^{\star}-\boldsymbol{u}_{\ell,\underline{n}},\boldsymbol{v}-\mathcal{A}_{\ell}\boldsymbol{v})+b(\boldsymbol{u}_{\ell}^{\star}-\boldsymbol{u}_{\ell,\underline{n}},\mathcal{A}_{\ell}\boldsymbol{v})|
\stackrel{\text{(A1)}}{\lesssim}|(\mathcal{L}(\boldsymbol{u}^{\star}-\boldsymbol{u}_{\ell,\underline{n}})\,,\mathcal{L}(\boldsymbol{v}-\mathcal{A}_{\ell}\boldsymbol{v}))_{\Omega}|+\|\boldsymbol{u}_{\ell}^{\star}-\boldsymbol{u}_{\ell,\underline{n}}\|_{\mathbb{V}(\Omega)}\|\mathcal{A}_{\ell}\boldsymbol{v}\|_{\mathbb{V}(\Omega)}
\stackrel{\text{(L)}}{\lesssim}\sum_{T\in\mathcal{T}_{\ell}}\|\mathcal{L}(\boldsymbol{u}^{\star}-\boldsymbol{u}_{\ell,\underline{n}})\|_{T}\|\boldsymbol{v}-\mathcal{A}_{\ell}\boldsymbol{v}\|_{\mathbb{V}(\Omega_{\ell}(T))}+\|\boldsymbol{u}_{\ell}^{\star}-\boldsymbol{u}_{\ell,\underline{n}}\|_{\mathbb{V}(\Omega)}\|\mathcal{A}_{\ell}\boldsymbol{v}\|_{\mathbb{V}(\Omega)}
=\sum_{T\in\mathcal{T}_{\ell}}\eta_{\ell}(T,\boldsymbol{u}_{\ell,\underline{n}})\|\boldsymbol{v}-\mathcal{A}_{\ell}\boldsymbol{v}\|_{\mathbb{V}(\Omega_{\ell}(T))}+\|\boldsymbol{u}_{\ell}^{\star}-\boldsymbol{u}_{\ell,\underline{n}}\|_{\mathbb{V}(\Omega)}\|\mathcal{A}_{\ell}\boldsymbol{v}\|_{\mathbb{V}(\Omega)}.

For fixed 𝒗𝔻(Ω)\boldsymbol{v}\in\mathbb{D}(\Omega), it holds that 𝒖𝒖,n¯𝕍(Ω)𝒜𝒗𝕍(Ω)0\|\boldsymbol{u}_{\ell}^{\star}-\boldsymbol{u}_{\ell,\underline{n}}\|_{\mathbb{V}(\Omega)}\|\mathcal{A}_{\ell}\boldsymbol{v}\|_{\mathbb{V}(\Omega)}\to 0 by (S3) and Lemma 8. The sum on the right-hand side can be dealt with as in [33, Proposition 3.7], and we conclude that

\displaystyle\lim_{\ell\to\infty}\langle\boldsymbol{R}(\boldsymbol{u}_{\ell,\underline{n}})\,,\boldsymbol{v}\rangle=0\quad\text{for all }\boldsymbol{v}\in\mathbb{D}(\Omega).

Step 4 (Convergence of inexact solutions and estimators): With the auxiliary results established above, we can follow the proof of [33, Theorem 2.1] step by step to deduce that 𝒖𝒖,n¯𝕍(Ω)0\|\boldsymbol{u}^{\star}-\boldsymbol{u}_{\ell,\underline{n}}\|_{\mathbb{V}(\Omega)}\to 0 as \ell\to\infty. Finally, the equivalence 𝒖𝒖,n¯𝕍(Ω)η(𝒖,n¯)\|\boldsymbol{u}^{\star}-\boldsymbol{u}_{\ell,\underline{n}}\|_{\mathbb{V}(\Omega)}\simeq\eta_{\ell}(\boldsymbol{u}_{\ell,\underline{n}}) concludes the proof. ∎

2.9. Preconditioned conjugate gradient method (PCG)

We recall a well-known result which follows from [25, Theorem 11.3.3]. The presented form is, e.g., found in [20, Lemma 1].

Lemma 9.

Let 𝐀,𝐏N×N\boldsymbol{A}_{\bullet},\boldsymbol{P}_{\bullet}\in\mathbb{R}^{N\times N} be symmetric and positive definite, 𝐛N\boldsymbol{b}_{\bullet}\in\mathbb{R}^{N}, 𝐱:=𝐀1𝐛\boldsymbol{x}_{\bullet}^{\star}:=\boldsymbol{A}_{\bullet}^{-1}\boldsymbol{b}_{\bullet}, and 𝐱,0N\boldsymbol{x}_{\bullet,0}\in\mathbb{R}^{N}. Suppose the 2\ell_{2}-condition number estimate

(14) \displaystyle{\rm cond}_{2}(\boldsymbol{P}_{\bullet}^{-1/2}\boldsymbol{A}_{\bullet}\boldsymbol{P}_{\bullet}^{-1/2})\leq C_{\rm pcg}.

Then, the iterates 𝐱,n\boldsymbol{x}_{\bullet,n} of the PCG algorithm satisfy the contraction property

(15) \displaystyle\|\boldsymbol{x}_{\bullet}^{\star}-\boldsymbol{x}_{\bullet,n+1}\|_{\boldsymbol{A}_{\bullet}}\leq q_{\rm ctr}\,\|\boldsymbol{x}_{\bullet}^{\star}-\boldsymbol{x}_{\bullet,n}\|_{\boldsymbol{A}_{\bullet}}\quad\text{for all }n\in\mathbb{N}_{0},

where qctr:=(11/Cpcg)1/2<1q_{\rm ctr}:=(1-1/C_{\rm pcg})^{1/2}<1 and 𝐲𝐀2:=𝐲𝐀𝐲\|\boldsymbol{y}_{\bullet}\|_{\boldsymbol{A}_{\bullet}}^{2}:=\boldsymbol{y}_{\bullet}\cdot\boldsymbol{A}_{\bullet}\boldsymbol{y}_{\bullet}  for 𝐲N\boldsymbol{y}_{\bullet}\in\mathbb{R}^{N}.∎

Let {ϕ,1,,ϕ,N}\{\boldsymbol{\phi}_{\bullet,1},\dots,\boldsymbol{\phi}_{\bullet,N}\} denote a basis of 𝕍(Ω)\mathbb{V}_{\bullet}(\Omega). Denote with 𝑨\boldsymbol{A}_{\bullet} the Galerkin matrix and with 𝒃\boldsymbol{b}_{\bullet} the right-hand side of the least-squares method with respect to that basis, i.e., 𝑨[j,k]=b(ϕ,k,ϕ,j)\boldsymbol{A}_{\bullet}[j,k]=b(\boldsymbol{\phi}_{\bullet,k},\boldsymbol{\phi}_{\bullet,j}) and 𝒃[j]=F(ϕ,j)\boldsymbol{b}_{\bullet}[j]=F(\boldsymbol{\phi}_{\bullet,j}).

There is a one-to-one relation between vectors 𝒚N\boldsymbol{y}_{\bullet}\in\mathbb{R}^{N} and functions 𝒗𝕍(Ω)\boldsymbol{v}_{\bullet}\in\mathbb{V}_{\bullet}(\Omega) given by 𝒗=j=1N𝒚[j]ϕ,j\boldsymbol{v}_{\bullet}=\sum_{j=1}^{N}\boldsymbol{y}_{\bullet}[j]\boldsymbol{\phi}_{\bullet,j}. Let 𝒖,n𝕍(Ω)\boldsymbol{u}_{\bullet,n}\in\mathbb{V}_{\bullet}(\Omega) denote the function corresponding to the iterate 𝒙,n\boldsymbol{x}_{\bullet,n}. We note that the least-squares solution 𝒖𝕍(Ω)\boldsymbol{u}_{\bullet}^{\star}\in\mathbb{V}_{\bullet}(\Omega) corresponds to the coefficient vector 𝒙:=𝑨1𝒃\boldsymbol{x}_{\bullet}^{\star}:=\boldsymbol{A}_{\bullet}^{-1}\boldsymbol{b}_{\bullet}. With the elementary identity

\displaystyle|||\boldsymbol{u}_{\bullet}^{\star}-\boldsymbol{u}_{\bullet,n}|||^{2}:=\|\mathcal{L}(\boldsymbol{u}_{\bullet}^{\star}-\boldsymbol{u}_{\bullet,n})\|^{2}=b(\boldsymbol{u}_{\bullet}^{\star}-\boldsymbol{u}_{\bullet,n},\boldsymbol{u}_{\bullet}^{\star}-\boldsymbol{u}_{\bullet,n})=\|\boldsymbol{x}_{\bullet}^{\star}-\boldsymbol{x}_{\bullet,n}\|_{\boldsymbol{A}_{\bullet}}^{2},

the contraction property (15) thus reads

(16) \displaystyle|||\boldsymbol{u}_{\bullet}^{\star}-\boldsymbol{u}_{\bullet,n+1}|||\leq q_{\rm ctr}\,|||\boldsymbol{u}_{\bullet}^{\star}-\boldsymbol{u}_{\bullet,n}|||\quad\text{for all }n\in\mathbb{N}_{0}.

We make the following assumption on the preconditioner.

  1. (P)

    Optimal preconditioner: For all 𝒯𝕋\mathcal{T}_{\bullet}\in\mathbb{T}, there exists a symmetric preconditioner 𝑷\boldsymbol{P}_{\bullet} of the Galerkin matrix 𝑨\boldsymbol{A}_{\bullet} such that the constant in (14) depends only on the initial mesh 𝒯0\mathcal{T}_{0}.

Under this assumption, PCG fits into the abstract framework from Section 2.8.
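For orientation, a minimal Python sketch of the iteration map $\Phi_{\bullet}$ realized by PCG on the coefficient vectors might look as follows; A and b are the Galerkin matrix and right-hand side from above, and apply_Pinv is a placeholder for the action of $\boldsymbol{P}_{\bullet}^{-1}$ (e.g., one cycle of an optimal multilevel preconditioner).

import numpy as np

def pcg_steps(A, b, x0, apply_Pinv, n_steps=1):
    # n_steps PCG iterations for A x = b starting from x0; with n_steps = n,
    # this realizes x_{bullet,n} from Lemma 9, i.e., n applications of Phi.
    x = x0.copy()
    r = b - A @ x              # residual
    z = apply_Pinv(r)          # preconditioned residual
    p = z.copy()               # search direction
    rz = r @ z
    for _ in range(n_steps):
        if rz == 0.0:          # exact solution already reached
            break
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        z = apply_Pinv(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

By (15)–(16), each iteration contracts the error $|||\boldsymbol{u}_{\bullet}^{\star}-\boldsymbol{u}_{\bullet,n}|||=\|\boldsymbol{x}_{\bullet}^{\star}-\boldsymbol{x}_{\bullet,n}\|_{\boldsymbol{A}_{\bullet}}$ by at least the factor $q_{\rm ctr}$, provided that the preconditioner satisfies (P); a simple Jacobi preconditioner, say apply_Pinv = lambda r: r / A.diagonal(), would yield a contraction as well, but in general not with a mesh-independent constant.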

3. Examples

In this section, we consider some common model problems. Throughout, we assume that one of the marking strategies from Section 2.6 is used. Moreover, we assume that $\mathcal{T}_{0}$ is a conforming simplicial triangulation of some bounded Lipschitz domain $\Omega\subset\mathbb{R}^{d}$ with $d=2,3$ and that NVB is used for mesh-refinement (see Section 2.3). In particular, the assumptions (M) as well as (R1)–(R3) hold. To conclude convergence of adaptive LSFEM, it therefore remains to show that the assumptions (A1)–(A4), (L), and (S1)–(S3) are satisfied for the following examples.

3.1. Poisson problem

For given $f\in L^{2}(\Omega)$, consider the Poisson problem

(17a) \displaystyle-\Delta u=f\quad\text{in }\Omega,
(17b) \displaystyle u=0\quad\text{on }\Gamma:=\partial\Omega.

With the substitution ${\boldsymbol{\sigma}}:=\nabla u$, this is equivalently reformulated as a first-order system

(18a) \displaystyle-{\rm div\,}{\boldsymbol{\sigma}}=f\quad\text{in }\Omega,
(18b) \displaystyle\nabla u-{\boldsymbol{\sigma}}=0\quad\text{in }\Omega,
(18c) \displaystyle u=0\quad\text{on }\Gamma.

With the Hilbert space

(19) \displaystyle\mathbb{V}(\Omega):=H_{0}^{1}(\Omega)\times\boldsymbol{H}({\rm div\,};\Omega),

the first-order system (18) can equivalently be recast in the abstract form (1), i.e.,

(20) \displaystyle\mathcal{L}\begin{pmatrix}u\\ {\boldsymbol{\sigma}}\end{pmatrix}:=\begin{pmatrix}-{\rm div\,}{\boldsymbol{\sigma}}\\ \nabla u-{\boldsymbol{\sigma}}\end{pmatrix}=\begin{pmatrix}f\\ \boldsymbol{0}\end{pmatrix}=:\boldsymbol{F}\in L^{2}(\Omega)^{d+1}.

It is well-known (see, e.g., the textbook [5]) that :𝕍(Ω)L2(Ω)d+1\mathcal{L}\colon\operatorname{\mathbb{V}}(\Omega)\to L^{2}(\Omega)^{d+1} is an isomorphism, so that (A1)–(A2) are guaranteed. Clearly, (A3)–(A4) are satisfied, since 𝕍(Ω)\|\cdot\|_{\operatorname{\mathbb{V}}(\Omega)} relies on the Lebesgue norm Ω\|\cdot\|_{\Omega}. Moreover, assumption (L) follows from

\displaystyle\|\mathcal{L}(v,{\boldsymbol{\tau}})\|_{T}^{2}=\|{\rm div\,}{\boldsymbol{\tau}}\|_{T}^{2}+\|\nabla v-{\boldsymbol{\tau}}\|_{T}^{2}\lesssim\|\nabla v\|_{T}^{2}+\|v\|_{T}^{2}+\|{\rm div\,}{\boldsymbol{\tau}}\|_{T}^{2}+\|{\boldsymbol{\tau}}\|_{T}^{2}=\|(v,{\boldsymbol{\tau}})\|_{\mathbb{V}(T)}^{2}.
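In this setting, the abstract element contributions from (7) take the concrete, elementwise computable form

\displaystyle\eta_{\bullet}\big(T,(u_{\bullet},{\boldsymbol{\sigma}}_{\bullet})\big)^{2}=\|f+{\rm div\,}{\boldsymbol{\sigma}}_{\bullet}\|_{T}^{2}+\|\nabla u_{\bullet}-{\boldsymbol{\sigma}}_{\bullet}\|_{T}^{2}\quad\text{for all }T\in\mathcal{T}_{\bullet},

i.e., the estimator is obtained by simply evaluating the least-squares functional elementwise for the discrete pair $(u_{\bullet},{\boldsymbol{\sigma}}_{\bullet})\in\mathbb{V}_{\bullet}(\Omega)$.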

A common FEM discretization of the energy space (19) involves the conforming subspace

(21) \displaystyle\mathbb{V}_{\bullet}(\Omega):=S_{0}^{k+1}(\mathcal{T}_{\bullet})\times\boldsymbol{RT}^{k}(\mathcal{T}_{\bullet})\subset\mathbb{V}(\Omega)

where S0k+1(𝒯)H01(Ω)S_{0}^{k+1}(\mathcal{T}_{\bullet})\subseteq H^{1}_{0}(\Omega) is the usual Courant FEM space of order k+1k+1 and 𝑹𝑻k(𝒯)𝑯(div;Ω)\boldsymbol{RT}^{k}(\mathcal{T}_{\bullet})\subset\boldsymbol{H}({\rm div\,};\Omega) is the Raviart–Thomas FEM space of order k0k\in\mathbb{N}_{0}; see, e.g., [6]. Clearly, there hold (S1)–(S2), since 𝒯refine(𝒯)\mathcal{T}_{\circ}\in\operatorname{refine}(\mathcal{T}_{\bullet}) yields that 𝕍(Ω)𝕍(Ω)𝕍(Ω)\mathbb{V}_{\bullet}(\Omega)\subseteq\operatorname{\mathbb{V}}_{\circ}(\Omega)\subset\operatorname{\mathbb{V}}(\Omega).

It remains to show the local approximation property (S3). To that end, recall that Cc(Ω¯)C_{c}^{\infty}(\overline{\Omega}) is dense in H01(Ω)H^{1}_{0}(\Omega) and C(Ω¯)dC^{\infty}(\overline{\Omega})^{d} is dense in 𝑯(div;Ω)\boldsymbol{H}({\rm div\,};\Omega); see, e.g., [24, Theorem 2.4]. Therefore,

\displaystyle C_{c}^{\infty}(\overline{\Omega})\times C^{\infty}(\overline{\Omega})^{d}\subset\mathbb{D}(\Omega):=\big[H^{2}(\Omega)\cap H^{1}_{0}(\Omega)\big]\times H^{2}(\Omega)^{d}\subset\mathbb{V}(\Omega)\text{ is dense}.

Let :H2(Ω)S1(𝒯)\mathcal{I}_{\bullet}\colon H^{2}(\Omega)\to S^{1}(\mathcal{T}_{\bullet}) be the nodal interpolation operator onto the first-order Courant space. Then, it holds that

\displaystyle\|v-\mathcal{I}_{\bullet}v\|_{H^{1}(T)}\lesssim h_{T}\|v\|_{H^{2}(T)}\quad\text{for all }v\in H^{2}(\Omega)\cap H^{1}_{0}(\Omega)\text{ and all }T\in\mathcal{T}_{\bullet}.

Let :H1(Ω)𝑹𝑻0(𝒯)\mathcal{R}_{\bullet}\colon H^{1}(\Omega)\to\boldsymbol{RT}^{0}(\mathcal{T}_{\bullet}) be the Raviart–Thomas projector onto the lowest-order Raviart–Thomas space. Then, it holds that

\displaystyle\|{\boldsymbol{\tau}}-\mathcal{R}_{\bullet}{\boldsymbol{\tau}}\|_{\boldsymbol{H}({\rm div\,};T)}\lesssim h_{T}\|{\boldsymbol{\tau}}\|_{H^{2}(T)}\quad\text{for all }{\boldsymbol{\tau}}\in H^{2}(\Omega)^{d}\text{ and all }T\in\mathcal{T}_{\bullet}.

Overall, the approximation operator $\mathcal{A}_{\bullet}\colon\mathbb{D}(\Omega)\to\mathbb{V}_{\bullet}(\Omega)$ defined by $\mathcal{A}_{\bullet}(v,{\boldsymbol{\tau}}):=(\mathcal{I}_{\bullet}v,\mathcal{R}_{\bullet}{\boldsymbol{\tau}})$ satisfies (S3) with $s=1$.

3.2. General second-order problem

Given fL2(Ω)f\in L^{2}(\Omega), we consider the general second-order linear elliptic PDE

(22a) \displaystyle-{\rm div\,}(\boldsymbol{A}\nabla u)+\boldsymbol{b}\cdot\nabla u+c\,u=f\quad\text{in }\Omega,
(22b) \displaystyle u=0\quad\text{on }\Gamma,

where 𝑨jk,𝒃j,cL(Ω)\boldsymbol{A}_{jk},\,\boldsymbol{b}_{j},\,c\in L^{\infty}(\Omega), and 𝑨\boldsymbol{A} is symmetric and uniformly positive definite. It follows from the Fredholm alternative that existence and uniqueness of the weak solution uH01(Ω)u\in H^{1}_{0}(\Omega) to (22) is equivalent to the well-posedness of the homogeneous problem, i.e.,

\displaystyle a(u_{0},v):=\langle\boldsymbol{A}\nabla u_{0}\,,\nabla v\rangle_{\Omega}+\langle\boldsymbol{b}\cdot\nabla u_{0}+c\,u_{0}\,,v\rangle_{\Omega}

satisfies that

(23) \displaystyle\forall u_{0}\in H^{1}_{0}(\Omega):\Big(\big[\forall v\in H^{1}_{0}(\Omega)\quad a(u_{0},v)=0\big]\quad\Longrightarrow\quad u_{0}=0\Big);

see, e.g., [7]. Clearly, the general problem (22) does not only cover the Poisson problem from Section 3.1 (with 𝑨\boldsymbol{A} being the identity, 𝒃=0\boldsymbol{b}=0, and c=0c=0, where (23) follows from the Poincaré inequality), but also the Helmholtz problem (with 𝑨\boldsymbol{A} being the identity, 𝒃=0\boldsymbol{b}=0, and c=ω2<0c=-\omega^{2}<0, provided that ω2\omega^{2} is not an eigenvalue of the Dirichlet–Laplace eigenvalue problem).

The first-order reformulation of (22) reads

(24) \displaystyle\mathcal{L}\begin{pmatrix}u\\ {\boldsymbol{\sigma}}\end{pmatrix}:=\begin{pmatrix}-{\rm div\,}{\boldsymbol{\sigma}}+\boldsymbol{b}\cdot\nabla u+c\,u\\ \boldsymbol{A}\nabla u-{\boldsymbol{\sigma}}\end{pmatrix}=\begin{pmatrix}f\\ \boldsymbol{0}\end{pmatrix}=:\boldsymbol{F}\in L^{2}(\Omega)^{d+1}.

With the Hilbert space 𝕍(Ω)\mathbb{V}(\Omega) from (19) and provided that problem (22) admits a unique solution uH1(Ω)u\in H^{1}(\Omega), we conclude with [10, Theorem 3.1] that (A1)–(A2) are satisfied. As in Section 3.1, the assumptions (A3)–(A4) are clearly satisfied and (L) follows as all coefficients of \mathcal{L} are bounded.

As before, a common FEM discretization involves the conforming subspace 𝕍(Ω)\mathbb{V}_{\bullet}(\Omega) from (21), and (S1)–(S3) follow as in Section 3.1. Overall, there holds the following convergence result:

Proposition 10.

Under the well-posedness assumption (23) and for any marking strategy satisfying (M), the adaptive least-squares formulation of (24) based on $\mathbb{V}_{\bullet}(\Omega)$ from (21) guarantees plain convergence in the sense of Theorem 2 and Theorem 6. ∎

3.3. Linear Elasticity

Given $\boldsymbol{f}\in\boldsymbol{L}^{2}(\Omega):=L^{2}(\Omega)^{d}$, find the displacement $\boldsymbol{u}\in{\boldsymbol{H}}_{0}^{1}(\Omega):=H_{0}^{1}(\Omega)^{d}$ and the stress $\boldsymbol{M}\in\underline{\boldsymbol{H}}({\mathbf{div}\,};\Omega):=\big\{\boldsymbol{M}\in L^{2}(\Omega)^{d\times d}\,:\,{\mathbf{div}\,}\boldsymbol{M}\in\boldsymbol{L}^{2}(\Omega)\big\}$ satisfying

\displaystyle-{\mathbf{div}\,}\boldsymbol{M}=\boldsymbol{f},
\displaystyle\boldsymbol{M}-\mathbb{C}\boldsymbol{\epsilon}(\boldsymbol{u})=\boldsymbol{0}.

Here, 𝐝𝐢𝐯(){\mathbf{div}\,}(\cdot) denotes the divergence operator applied to each row and ϵ()=12(()+())\boldsymbol{\epsilon}(\cdot)=\tfrac{1}{2}(\boldsymbol{\nabla}(\cdot)+\boldsymbol{\nabla}(\cdot)^{\top}) is the symmetric gradient, where (𝒗)ij=j𝒗i(\boldsymbol{\nabla}\boldsymbol{v})_{ij}=\partial_{j}\boldsymbol{v}_{i} is the Jacobian. Given the Lamé parameters λ,μ>0\lambda,\mu>0, the positive definite elasticity tensor is given by

𝑵=(λtr𝑵)𝑰+2μ𝑵\displaystyle\mathbb{C}\boldsymbol{N}=(\lambda\mathrm{tr}\boldsymbol{N})\boldsymbol{I}+2\mu\boldsymbol{N}

with 𝑰\boldsymbol{I} denoting the d×dd\times d identity matrix and tr𝑵=j=1d𝑵jj\mathrm{tr}\boldsymbol{N}=\sum_{j=1}^{d}\boldsymbol{N}_{jj} the trace operator. Consider the Hilbert space

(25) \displaystyle\mathbb{V}(\Omega):={\boldsymbol{H}}_{0}^{1}(\Omega)\times\underline{\boldsymbol{H}}({\mathbf{div}\,};\Omega)

equipped with the norm

\displaystyle\|(\boldsymbol{v},\boldsymbol{N})\|_{\mathbb{V}(\Omega)}^{2}=\|\mathbb{C}^{-1/2}\boldsymbol{N}\|_{\Omega}^{2}+\|{\mathbf{div}\,}\boldsymbol{N}\|_{\Omega}^{2}+\|\mathbb{C}^{1/2}\boldsymbol{\epsilon}(\boldsymbol{v})\|_{\Omega}^{2},

which is equivalent to the canonical norm by Korn’s inequality and the properties of the elasticity tensor. Then, the first-order system can be put into the abstract form (1), i.e.,

(26) \displaystyle\mathcal{L}\begin{pmatrix}\boldsymbol{u}\\ \boldsymbol{M}\end{pmatrix}:=\begin{pmatrix}-{\mathbf{div}\,}\boldsymbol{M}\\ \mathbb{C}^{-1/2}\boldsymbol{M}-\mathbb{C}^{1/2}\boldsymbol{\epsilon}(\boldsymbol{u})\end{pmatrix}=\begin{pmatrix}\boldsymbol{f}\\ \boldsymbol{0}\end{pmatrix}=:\boldsymbol{F}\in L^{2}(\Omega)^{d+d^{2}},

where we identify d×dd\times d matrices with d2×1d^{2}\times 1 vectors. It is well known that the linear elasticity problem is well-posed so that (A1)–(A2) are satisfied; see, e.g., [9, Theorem 2.1]. The validity of the assumptions (A3)–(A4) and (L) follows as in Section 3.1. Consider the conforming discrete space

(27) \displaystyle\mathbb{V}_{\bullet}(\Omega)=S_{0}^{k+1}(\mathcal{T}_{\bullet})^{d}\times\boldsymbol{RT}^{k}(\mathcal{T}_{\bullet})^{d}\subset\mathbb{V}(\Omega).

Here, 𝑹𝑻k(𝒯)d\boldsymbol{RT}^{k}(\mathcal{T}_{\bullet})^{d} denotes the space of matrix-valued functions, where each row is an element of 𝑹𝑻k(𝒯)\boldsymbol{RT}^{k}(\mathcal{T}_{\bullet}). As in Section 3.1, we conclude that the assumptions (S1)–(S3) are satisfied. Overall, we then have the following result:

Proposition 11.

For any marking strategy satisfying (M), the adaptive least-squares formulation of (26) based on $\mathbb{V}_{\bullet}(\Omega)$ from (27) guarantees plain convergence in the sense of Theorem 2 and Theorem 6. ∎

3.4. Maxwell problem

We consider the case d=3d=3 only. Given 𝒇𝑳2(Ω)\boldsymbol{f}\in\boldsymbol{L}^{2}(\Omega) and cL(Ω)c\in L^{\infty}(\Omega), find 𝒖𝑯0(𝐜𝐮𝐫𝐥;Ω)\boldsymbol{u}\in\boldsymbol{H}_{0}(\boldsymbol{\operatorname{curl}}\,;\Omega) and 𝝈𝑯(𝐜𝐮𝐫𝐥;Ω){\boldsymbol{\sigma}}\in\boldsymbol{H}(\boldsymbol{\operatorname{curl}}\,;\Omega) satisfying

\displaystyle\boldsymbol{\operatorname{curl}}\,{\boldsymbol{\sigma}}+c\,\boldsymbol{u}=\boldsymbol{f},
\displaystyle\boldsymbol{\operatorname{curl}}\,\boldsymbol{u}-{\boldsymbol{\sigma}}=\boldsymbol{0},

where

\displaystyle\boldsymbol{H}(\boldsymbol{\operatorname{curl}}\,;\Omega):=\big\{\boldsymbol{v}\in\boldsymbol{L}^{2}(\Omega)\,:\,\boldsymbol{\operatorname{curl}}\,\boldsymbol{v}\in\boldsymbol{L}^{2}(\Omega)\big\},
\displaystyle\boldsymbol{H}_{0}(\boldsymbol{\operatorname{curl}}\,;\Omega):=\big\{\boldsymbol{v}\in\boldsymbol{H}(\boldsymbol{\operatorname{curl}}\,;\Omega)\,:\,\boldsymbol{v}\times{\boldsymbol{n}}|_{\partial\Omega}=0\big\}.

With the Hilbert space

(28) \displaystyle\mathbb{V}(\Omega):=\boldsymbol{H}_{0}(\boldsymbol{\operatorname{curl}}\,;\Omega)\times\boldsymbol{H}(\boldsymbol{\operatorname{curl}}\,;\Omega)

equipped with the norm

\displaystyle\|(\boldsymbol{v},{\boldsymbol{\tau}})\|_{\mathbb{V}(\Omega)}^{2}=\|\boldsymbol{v}\|_{\Omega}^{2}+\|\boldsymbol{\operatorname{curl}}\,\boldsymbol{v}\|_{\Omega}^{2}+\|{\boldsymbol{\tau}}\|_{\Omega}^{2}+\|\boldsymbol{\operatorname{curl}}\,{\boldsymbol{\tau}}\|_{\Omega}^{2},

the first-order system can be written in the abstract form (1), i.e.,

(29) \displaystyle\mathcal{L}\begin{pmatrix}\boldsymbol{u}\\ {\boldsymbol{\sigma}}\end{pmatrix}:=\begin{pmatrix}\boldsymbol{\operatorname{curl}}\,{\boldsymbol{\sigma}}+c\boldsymbol{u}\\ \boldsymbol{\operatorname{curl}}\,\boldsymbol{u}-{\boldsymbol{\sigma}}\end{pmatrix}=\begin{pmatrix}\boldsymbol{f}\\ \boldsymbol{0}\end{pmatrix}=:\boldsymbol{F}\in L^{2}(\Omega)^{6}.

The assumptions (A1)–(A2) are satisfied if $\operatorname{ess\,inf}_{x\in\Omega}c>0$ or $c=-\omega^{2}<0$ and $\omega^{2}$ is not an eigenvalue of the cavity problem. The argumentation is similar to the Poisson case. For the case $c=-\omega^{2}$, we also refer to [16, Section 2.4]. The validity of the assumptions (A3)–(A4) and (L) follows as in Section 3.1. Using the conforming discrete space

(30) \displaystyle\mathbb{V}_{\bullet}(\Omega)=\boldsymbol{N}_{0}^{k}(\mathcal{T}_{\bullet})\times\boldsymbol{N}^{k}(\mathcal{T}_{\bullet}):=(\boldsymbol{N}^{k}(\mathcal{T}_{\bullet})\cap\boldsymbol{H}_{0}(\boldsymbol{\operatorname{curl}}\,;\Omega))\times\boldsymbol{N}^{k}(\mathcal{T}_{\bullet})\subset\mathbb{V}(\Omega),

where 𝑵k(𝒯)\boldsymbol{N}^{k}(\mathcal{T}_{\bullet}) denotes the Nédélec space of order k0k\in\mathbb{N}_{0}, this proves that the assumptions (S1)–(S2) are satisfied. It remains to verify the local approximation property (S3). Recall from [28] that Cc(Ω¯)3C_{c}^{\infty}(\overline{\Omega})^{3} is dense in 𝑯0(𝐜𝐮𝐫𝐥;Ω)\boldsymbol{H}_{0}(\boldsymbol{\operatorname{curl}}\,;\Omega) and C(Ω¯)3C^{\infty}(\overline{\Omega})^{3} is dense in 𝑯(𝐜𝐮𝐫𝐥;Ω)\boldsymbol{H}(\boldsymbol{\operatorname{curl}}\,;\Omega). This shows that

Cc(Ω)3×C(Ω¯)3𝔻(Ω):=(H2(Ω)H01(Ω))3×H2(Ω)3𝕍(Ω)\displaystyle C_{c}^{\infty}(\Omega)^{3}\times C^{\infty}(\overline{\Omega})^{3}\subset\mathbb{D}(\Omega):=(H^{2}(\Omega)\cap H_{0}^{1}(\Omega))^{3}\times H^{2}(\Omega)^{3}\subset\mathbb{V}(\Omega)

is dense. Let 𝒩:H1(Ω)3𝑵k(𝒯)\mathcal{N}_{\bullet}\colon H^{1}(\Omega)^{3}\to\boldsymbol{N}^{k}(\mathcal{T}_{\bullet}) denote the edge interpolation operator which satisfies

𝒗𝒩𝒗𝑯(𝐜𝐮𝐫𝐥;T)hT𝒗H2(T)for all 𝒗H2(Ω)3 and all T𝒯;\displaystyle\|\boldsymbol{v}-\mathcal{N}_{\bullet}\boldsymbol{v}\|_{\boldsymbol{H}(\boldsymbol{\operatorname{curl}}\,;T)}\lesssim h_{T}\|\boldsymbol{v}\|_{H^{2}(T)}\quad\text{for all }\boldsymbol{v}\in H^{2}(\Omega)^{3}\text{ and all }T\in\mathcal{T}_{\bullet};

see, e.g., [26, Sections 3.5–3.6]. Note that 𝒩\mathcal{N}_{\bullet} by definition also preserves homogeneous (tangential) boundary conditions. Therefore, 𝒜:𝔻(Ω)𝕍(Ω)\mathcal{A}_{\bullet}\colon\mathbb{D}(\Omega)\to\mathbb{V}_{\bullet}(\Omega) defined by 𝒜(𝒗,𝝉)=(𝒩𝒗,𝒩𝝉)\mathcal{A}_{\bullet}(\boldsymbol{v},{\boldsymbol{\tau}})=(\mathcal{N}_{\bullet}\boldsymbol{v},\mathcal{N}_{\bullet}{\boldsymbol{\tau}}) satisfies assumption (S3). Overall, we have the following result:

Proposition 12.

For any marking strategy satisfying (M), the adaptive least-squares formulation of (29) based on 𝕍(Ω)\mathbb{V}_{\bullet}(\Omega) from (30) guarantees plain convergence in the sense of Theorem 2 and Theorem 6.∎
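
For completeness, we record the explicit local bound behind the verification of (S3) in this example; it is a direct consequence of the interpolation estimate for \mathcal{N}_{\bullet} stated above and of the definition of \mathcal{A}_{\bullet}:

\displaystyle\|(\boldsymbol{v},{\boldsymbol{\tau}})-\mathcal{A}_{\bullet}(\boldsymbol{v},{\boldsymbol{\tau}})\|_{\mathbb{V}(T)}\lesssim h_{T}\big(\|\boldsymbol{v}\|_{H^{2}(T)}+\|{\boldsymbol{\tau}}\|_{H^{2}(T)}\big)\quad\text{for all }(\boldsymbol{v},{\boldsymbol{\tau}})\in\mathbb{D}(\Omega)\text{ and all }T\in\mathcal{T}_{\bullet},

where \|\cdot\|_{\mathbb{V}(T)} denotes the elementwise counterpart of \|\cdot\|_{\mathbb{V}(\Omega)}.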

3.5. Stokes problem

Given 𝒇𝑳2(Ω)\boldsymbol{f}\in\boldsymbol{L}^{2}(\Omega), find the velocity 𝒖𝑯01(Ω)\boldsymbol{u}\in{\boldsymbol{H}}_{0}^{1}(\Omega) and the pressure pL2(Ω)p\in L^{2}(\Omega) satisfying

𝚫𝒖+p\displaystyle-\boldsymbol{\Delta}\boldsymbol{u}+\nabla p =𝒇,\displaystyle=\boldsymbol{f},
div𝒖\displaystyle{\rm div\,}\boldsymbol{u} =0,\displaystyle=0,
Ωp𝑑x\displaystyle\int_{\Omega}p\,dx =0.\displaystyle=0.

There are many different reformulations that are suitable for defining LSFEMs; see [5, Chapter 7]. Here, we consider a stress–velocity–pressure formulation (see, e.g., [11]), where the stress is defined by the relation

𝑴=ϵ(𝒖)p𝑰.\displaystyle\boldsymbol{M}=\boldsymbol{\epsilon}(\boldsymbol{u})-p\boldsymbol{I}.

Let ΠΩv=|Ω|1(v,1)Ω\Pi_{\Omega}v=|\Omega|^{-1}(v\hskip 1.42262pt,1)_{\Omega} denote the L2(Ω)L^{2}(\Omega)-orthogonal projection onto constants. Consider the Hilbert space

(31) 𝕍(Ω):=𝑯01(Ω)×𝑯¯(𝐝𝐢𝐯;Ω)×L2(Ω)\displaystyle\mathbb{V}(\Omega):={\boldsymbol{H}}_{0}^{1}(\Omega)\times\underline{\boldsymbol{H}\!}\hskip 0.50003pt({\mathbf{div}\,};\Omega)\times L^{2}(\Omega)

equipped with the norm

(𝒗,𝑵,q)𝕍(Ω)2=𝒗𝑯1(Ω)2+𝑵Ω2+𝐝𝐢𝐯𝑵Ω2+qΩ2+ΠΩqΩ2.\displaystyle\|(\boldsymbol{v},\boldsymbol{N},q)\|_{\mathbb{V}(\Omega)}^{2}=\|\boldsymbol{v}\|_{\boldsymbol{H}^{1}(\Omega)}^{2}+\|\boldsymbol{N}\|_{\Omega}^{2}+\|{\mathbf{div}\,}\boldsymbol{N}\|_{\Omega}^{2}+\|q\|_{\Omega}^{2}+\|\Pi_{\Omega}q\|_{\Omega}^{2}.

Then, the first-order system can be recast in the abstract form (1), i.e.,

(32) (𝒖𝑴p):=(𝐝𝐢𝐯𝑴𝑴ϵ(𝒖)+p𝑰div𝒖ΠΩp)=(𝒇𝟎00)=:𝑭L2(Ω)d+d2+2,\displaystyle\mathcal{L}\begin{pmatrix}\boldsymbol{u}\\ \boldsymbol{M}\\ p\end{pmatrix}:=\begin{pmatrix}-{\mathbf{div}\,}\boldsymbol{M}\\ \boldsymbol{M}-\boldsymbol{\epsilon}(\boldsymbol{u})+p\boldsymbol{I}\\ {\rm div\,}\boldsymbol{u}\\ \Pi_{\Omega}p\end{pmatrix}=\begin{pmatrix}\boldsymbol{f}\\ \boldsymbol{0}\\ 0\\ 0\end{pmatrix}=:\boldsymbol{F}\in L^{2}(\Omega)^{d+d^{2}+2},

where, as before, we identify d×dd\times d matrices with d2×1d^{2}\times 1 vectors.
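
As in the Maxwell example, the elementwise contributions of the canonical least-squares functional (which constitute the built-in error estimator) are obtained by restricting \|\mathcal{L}(\cdot)-\boldsymbol{F}\|_{\Omega} to the elements; spelled out for (32) and recorded only for orientation, they read

\displaystyle\|\mathcal{L}(\boldsymbol{v},\boldsymbol{N},q)-\boldsymbol{F}\|_{T}^{2}=\|{\mathbf{div}\,}\boldsymbol{N}+\boldsymbol{f}\|_{T}^{2}+\|\boldsymbol{N}-\boldsymbol{\epsilon}(\boldsymbol{v})+q\boldsymbol{I}\|_{T}^{2}+\|{\rm div\,}\boldsymbol{v}\|_{T}^{2}+\|\Pi_{\Omega}q\|_{T}^{2}\quad\text{for all }(\boldsymbol{v},\boldsymbol{N},q)\in\mathbb{V}(\Omega)\text{ and all }T\in\mathcal{T}_{\bullet}.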

Following the proof of [11, Theorem 3.2], we conclude that (A1)–(A2) are guaranteed (the only difference from [11] is that we include the zero-mean pressure condition directly in the first-order formulation). Clearly, (A3)–(A4) follow as in Section 3.1. Moreover, we have included the term ΠΩqΩ\|\Pi_{\Omega}q\|_{\Omega} in the definition of the norm 𝕍(Ω)\|\cdot\|_{\mathbb{V}(\Omega)} to ensure that (L) is satisfied, i.e.,

(𝒗,𝑵,q)T2\displaystyle\|\mathcal{L}(\boldsymbol{v},\boldsymbol{N},q)\|_{T}^{2} =𝐝𝐢𝐯𝑵T2+𝑵ϵ(𝒗)+q𝑰T2+div𝒗T2+ΠΩqT2\displaystyle=\|{\mathbf{div}\,}\boldsymbol{N}\|_{T}^{2}+\|\boldsymbol{N}-\boldsymbol{\epsilon}(\boldsymbol{v})+q\boldsymbol{I}\|_{T}^{2}+\|{\rm div\,}\boldsymbol{v}\|_{T}^{2}+\|\Pi_{\Omega}q\|_{T}^{2}
𝒗𝑯1(T)2+𝑵T2+𝐝𝐢𝐯𝑵T2+qT2+ΠΩqT2=(𝒗,𝑵,q)𝕍(T)2\displaystyle\lesssim\|\boldsymbol{v}\|_{\boldsymbol{H}^{1}(T)}^{2}+\|\boldsymbol{N}\|_{T}^{2}+\|{\mathbf{div}\,}\boldsymbol{N}\|_{T}^{2}+\|q\|_{T}^{2}+\|\Pi_{\Omega}q\|_{T}^{2}=\|(\boldsymbol{v},\boldsymbol{N},q)\|_{\mathbb{V}(T)}^{2}

holds for all (𝒗,𝑵,q)𝕍(Ω)(\boldsymbol{v},\boldsymbol{N},q)\in\mathbb{V}(\Omega).

Consider the conforming subspace

(33) 𝕍(Ω)=S0k+1(𝒯)d×𝑹𝑻k(𝒯)d×Pk(𝒯)𝕍(Ω),\displaystyle\mathbb{V}_{\bullet}(\Omega)=S_{0}^{k+1}(\mathcal{T}_{\bullet})^{d}\times\boldsymbol{RT}^{k}(\mathcal{T}_{\bullet})^{d}\times P^{k}(\mathcal{T}_{\bullet})\subset\mathbb{V}(\Omega),

where Pk(𝒯)P^{k}(\mathcal{T}_{\bullet}) denotes the space of elementwise polynomials of degree at most k0k\in\mathbb{N}_{0}. It follows that the assumptions (S1)–(S2) are satisfied. It remains to prove the local approximation assumption (S3). For the first two components (𝒗,𝑵)𝑯01(Ω)×𝑯¯(𝐝𝐢𝐯;Ω)(\boldsymbol{v},\boldsymbol{N})\in{\boldsymbol{H}}_{0}^{1}(\Omega)\times\underline{\boldsymbol{H}\!}\hskip 0.50003pt({\mathbf{div}\,};\Omega) of the space 𝕍(Ω)\mathbb{V}(\Omega), we argue as in Section 3.1. For the last component qL2(Ω)q\in L^{2}(\Omega), we note that H1(Ω)H^{1}(\Omega) is dense in L2(Ω)L^{2}(\Omega). Let 𝒫:L2(Ω)Pk(𝒯)\mathcal{P}_{\bullet}\colon L^{2}(\Omega)\to P^{k}(\mathcal{T}_{\bullet}) denote the L2(Ω)L^{2}(\Omega)-orthogonal projection and observe that ΠΩ(q𝒫q)=0\Pi_{\Omega}(q-\mathcal{P}_{\bullet}q)=0, since constant functions belong to Pk(𝒯)P^{k}(\mathcal{T}_{\bullet}). Then,

q𝒫qT+ΠΩ(q𝒫q)T=q𝒫qThTqTfor all qH1(Ω) and all T𝒯.\displaystyle\|q-\mathcal{P}_{\bullet}q\|_{T}+\|\Pi_{\Omega}(q-\mathcal{P}_{\bullet}q)\|_{T}=\|q-\mathcal{P}_{\bullet}q\|_{T}\lesssim h_{T}\|\nabla q\|_{T}\quad\text{for all }q\in H^{1}(\Omega)\text{ and all }T\in\mathcal{T}_{\bullet}.
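
The first-order decay predicted by this estimate is easy to observe numerically. The following minimal sketch is an illustration only: it uses a one-dimensional mesh and piecewise constants (k=0) so that the example stays self-contained, whereas the analysis above concerns simplicial meshes; all function and variable names are ours and not part of this note.

```python
import numpy as np

def p0_projection(q, nodes, n_quad=16):
    """Elementwise L2-orthogonal projection of q onto piecewise constants."""
    x, w = np.polynomial.legendre.leggauss(n_quad)  # Gauss rule on [-1, 1]
    means = []
    for a, b in zip(nodes[:-1], nodes[1:]):
        mid, half = 0.5 * (a + b), 0.5 * (b - a)
        means.append(0.5 * np.sum(w * q(mid + half * x)))  # mean value on [a, b]
    return np.array(means)

def l2_error(q, means, nodes, n_quad=16):
    """Global L2 error ||q - P q|| on the mesh given by `nodes`."""
    x, w = np.polynomial.legendre.leggauss(n_quad)
    err2 = 0.0
    for (a, b), c in zip(zip(nodes[:-1], nodes[1:]), means):
        mid, half = 0.5 * (a + b), 0.5 * (b - a)
        err2 += half * np.sum(w * (q(mid + half * x) - c) ** 2)
    return np.sqrt(err2)

q = lambda x: np.sin(np.pi * x)
for n in (8, 16, 32, 64):
    nodes = np.linspace(0.0, 1.0, n + 1)
    err = l2_error(q, p0_projection(q, nodes), nodes)
    print(f"h = {1.0 / n:.4f}   error = {err:.3e}")  # decays like h (first order)
```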

Altogether, we choose 𝔻(Ω)=(H2(Ω)H01(Ω))d×H2(Ω)d×d×H1(Ω)\mathbb{D}(\Omega)=(H^{2}(\Omega)\cap H_{0}^{1}(\Omega))^{d}\times H^{2}(\Omega)^{d\times d}\times H^{1}(\Omega) and 𝒜:𝔻(Ω)𝕍(Ω)\mathcal{A}_{\bullet}\colon\mathbb{D}(\Omega)\to\mathbb{V}_{\bullet}(\Omega) with 𝒜(𝒗,𝑵,q)=(𝒗,𝑵,𝒫q)\mathcal{A}_{\bullet}(\boldsymbol{v},\boldsymbol{N},q)=(\mathcal{I}_{\bullet}\boldsymbol{v},\mathcal{R}_{\bullet}\boldsymbol{N},\mathcal{P}_{\bullet}q), where \mathcal{I}_{\bullet} and \mathcal{R}_{\bullet} denote the operators from Section 3.1, applied componentwise to 𝒗\boldsymbol{v} and row-wise to 𝑵\boldsymbol{N}, respectively. Therefore, we conclude that also (S3) is satisfied. Overall, we have the following result:

Proposition 13.

For any marking strategy satisfying (M), the adaptive least-squares formulation of (32) based on 𝕍(Ω)\mathbb{V}_{\bullet}(\Omega) from (33) guarantees plain convergence in the sense of Theorem 2 and Theorem 6.∎

References

  • [1] M. Arioli, E. H. Georgoulis, and D. Loghin. Stopping criteria for adaptive finite element solvers. SIAM J. Sci. Comput., 35(3):A1537–A1559, 2013.
  • [2] M. Aurada, M. Feischl, J. Kemetmüller, M. Page, and D. Praetorius. Each H1/2H^{1/2}-stable projection yields convergence and quasi-optimality of adaptive FEM with inhomogeneous Dirichlet data in d\mathbb{R}^{d}. ESAIM Math. Model. Numer. Anal., 47(4):1207–1235, 2013.
  • [3] I. Babuška and M. Vogelius. Feedback and adaptive finite element solution of one-dimensional boundary value problems. Numer. Math., 44(1):75–102, 1984.
  • [4] P. Binev, W. Dahmen, and R. DeVore. Adaptive finite element methods with convergence rates. Numer. Math., 97(2):219–268, 2004.
  • [5] P. B. Bochev and M. D. Gunzburger. Least-squares finite element methods, volume 166 of Applied Mathematical Sciences. Springer, New York, 2009.
  • [6] D. Boffi, F. Brezzi, and M. Fortin. Mixed finite element methods and applications, volume 44 of Springer Series in Computational Mathematics. Springer, Heidelberg, 2013.
  • [7] S. C. Brenner and L. R. Scott. The mathematical theory of finite element methods, volume 15 of Texts in Applied Mathematics. Springer, New York, third edition, 2008.
  • [8] P. Bringmann, C. Carstensen, and G. Starke. An adaptive least-squares FEM for linear elasticity with optimal convergence rates. SIAM J. Numer. Anal., 56(1):428–447, 2018.
  • [9] Z. Cai, J. Korsawe, and G. Starke. An adaptive least squares mixed finite element method for the stress-displacement formulation of linear elasticity. Numer. Methods Partial Differential Equations, 21(1):132–148, 2005.
  • [10] Z. Cai, R. Lazarov, T. A. Manteuffel, and S. F. McCormick. First-order system least squares for second-order partial differential equations. I. SIAM J. Numer. Anal., 31(6):1785–1799, 1994.
  • [11] Z. Cai, B. Lee, and P. Wang. Least-squares methods for incompressible Newtonian fluid flow: linear stationary problems. SIAM J. Numer. Anal., 42(2):843–859, 2004.
  • [12] C. Carstensen. Collective marking for adaptive least-squares finite element methods with optimal rates. Math. Comp., 89(321):89–103, 2020.
  • [13] C. Carstensen, M. Feischl, M. Page, and D. Praetorius. Axioms of adaptivity. Comput. Math. Appl., 67(6):1195–1253, 2014.
  • [14] C. Carstensen and E.-J. Park. Convergence and optimality of adaptive least squares finite element methods. SIAM J. Numer. Anal., 53(1):43–62, 2015.
  • [15] C. Carstensen, E.-J. Park, and P. Bringmann. Convergence of natural adaptive least squares finite element methods. Numer. Math., 136(4):1097–1115, 2017.
  • [16] C. Carstensen and J. Storn. Asymptotic exactness of the least-squares finite element residual. SIAM J. Numer. Anal., 56(4):2008–2028, 2018.
  • [17] J. M. Cascon, C. Kreuzer, R. H. Nochetto, and K. G. Siebert. Quasi-optimal convergence rate for an adaptive finite element method. SIAM J. Numer. Anal., 46(5):2524–2550, 2008.
  • [18] W. Dörfler. A convergent adaptive algorithm for Poisson’s equation. SIAM J. Numer. Anal., 33(3):1106–1124, 1996.
  • [19] A. Ern, T. Gudi, I. Smears, and M. Vohralík. Equivalence of local- and global-best approximations, a simple stable local commuting projector, and optimal hp approximation estimates in H(div)H({\rm div}). Preprint, arXiv:1908.08158, 2019.
  • [20] T. Führer, A. Haberl, D. Praetorius, and S. Schimanko. Adaptive BEM with inexact PCG solver yields almost optimal computational costs. Numer. Math., 141(4):967–1008, 2019.
  • [21] T. Führer and D. Praetorius. A linear Uzawa-type FEM-BEM solver for nonlinear transmission problems. Comput. Math. Appl., 75(8):2678–2697, 2018.
  • [22] G. Gantner, A. Haberl, D. Praetorius, and S. Schimanko. Rate optimality of adaptive finite element methods with respect to the overall computational costs. Preprint, arXiv:2003.10785, 2020.
  • [23] G. Gantner, A. Haberl, D. Praetorius, and B. Stiftner. Rate optimal adaptive FEM with inexact solver for nonlinear operators. IMA J. Numer. Anal., 38(4):1797–1831, 2018.
  • [24] V. Girault and P.-A. Raviart. Finite element methods for Navier-Stokes equations, volume 5 of Springer Series in Computational Mathematics. Springer-Verlag, Berlin, 1986. Theory and algorithms.
  • [25] G. H. Golub and C. F. Van Loan. Matrix computations. Johns Hopkins Studies in the Mathematical Sciences. Johns Hopkins University Press, Baltimore, MD, fourth edition, 2013.
  • [26] R. Hiptmair. Finite elements in computational electromagnetism. Acta Numer., 11:237–339, 2002.
  • [27] M. Karkulik, D. Pavlicek, and D. Praetorius. On 2D newest vertex bisection: optimality of mesh-closure and H1H^{1}-stability of L2L_{2}-projection. Constr. Approx., 38(2):213–234, 2013.
  • [28] P. Monk. Finite element methods for Maxwell’s equations. Numerical Mathematics and Scientific Computation. Oxford University Press, New York, 2003.
  • [29] P. Morin, R. H. Nochetto, and K. G. Siebert. Data oscillation and convergence of adaptive FEM. SIAM J. Numer. Anal., 38(2):466–488, 2000.
  • [30] P. Morin, K. G. Siebert, and A. Veeser. A basic convergence result for conforming adaptive finite elements. Math. Models Methods Appl. Sci., 18(5):707–737, 2008.
  • [31] C.-M. Pfeiler and D. Praetorius. Dörfler marking with minimal cardinality is a linear complexity problem. Math. Comp., (accepted for publication), 2020.
  • [32] L. R. Scott and S. Zhang. Finite element interpolation of nonsmooth functions satisfying boundary conditions. Math. Comp., 54(190):483–493, 1990.
  • [33] K. G. Siebert. A convergence proof for adaptive finite elements without lower bound. IMA J. Numer. Anal., 31(3):947–970, 2011.
  • [34] R. Stevenson. Optimality of a standard adaptive finite element method. Found. Comput. Math., 7(2):245–269, 2007.
  • [35] R. Stevenson. The completion of locally refined simplicial partitions created by bisection. Math. Comp., 77(261):227–241, 2008.
  • [36] A. Veeser. Approximating gradients with continuous piecewise polynomial functions. Found. Comput. Math., 16(3):723–750, 2016.
  • [37] L. Zhong, L. Chen, S. Shu, G. Wittum, and J. Xu. Convergence and optimality of adaptive edge finite element methods for time-harmonic Maxwell equations. Math. Comp., 81(278):623–642, 2012.

Appendix A. Comments on optimal convergence rates

In this appendix, we consider the LSFEM discretization of the general second-order problem from Section 3.2. While Proposition 10 already proves plain convergence

𝒖𝒖𝕍(Ω)0as ,\displaystyle\|\boldsymbol{u}^{\star}-\boldsymbol{u}_{\ell}^{\star}\|_{\operatorname{\mathbb{V}}(\Omega)}\to 0\quad\text{as }\ell\to\infty,

the present appendix shows that optimal convergence rates (in the sense of, e.g., [13]) would already follow from the Dörfler marking criterion (see Section 2.6) together with linear convergence

(34) 𝒖𝒖+n𝕍(Ω)Clinqlinn𝒖𝒖𝕍(Ω)for all ,n0\displaystyle\|\boldsymbol{u}^{\star}-\boldsymbol{u}_{\ell+n}^{\star}\|_{\operatorname{\mathbb{V}}(\Omega)}\leq C_{\rm lin}\,q_{\rm lin}^{n}\,\|\boldsymbol{u}^{\star}-\boldsymbol{u}_{\ell}^{\star}\|_{\operatorname{\mathbb{V}}(\Omega)}\quad\text{for all }~\ell,n\in\mathbb{N}_{0}

with generic constants Clin>0C_{\rm lin}>0 and 0<qlin<10<q_{\rm lin}<1.

A.1. Scott–Zhang-type operators

For fixed p1p\geq 1, let 𝒥:H1(Ω)Sp(𝒯)\mathcal{J}_{\bullet}\colon H^{1}(\Omega)\to S^{p}(\mathcal{T}_{\bullet}) be the Scott–Zhang projector from [32]. We recall that 𝒥\mathcal{J}_{\bullet} preserves discrete boundary data so that 𝒥:H01(Ω)S0p(𝒯)\mathcal{J}_{\bullet}\colon H^{1}_{0}(\Omega)\to S^{p}_{0}(\mathcal{T}_{\bullet}) is well-defined. Besides this, we will only exploit the following two properties which hold for all vH1(Ω)v\in H^{1}(\Omega), vSp(𝒯)v_{\bullet}\in S^{p}(\mathcal{T}_{\bullet}), and T𝒯T\in\mathcal{T}_{\bullet}:

  • (i)

    local projection property: if v|Ω(T)=v|Ω(T)v|_{\Omega_{\bullet}(T)}=v_{\bullet}|_{\Omega_{\bullet}(T)}, then (𝒥v)|T=v|T(\mathcal{J}_{\bullet}v)|_{T}=v_{\bullet}|_{T};

  • (ii)

    global H𝟏\boldsymbol{H^{1}}-stability: (1𝒥)vH1(Ω)C(1+diam(Ω))vH1(Ω)\|(1-\mathcal{J}_{\bullet})v\|_{H^{1}(\Omega)}~\leq C(1+\mathrm{diam}(\Omega))\,\|v\|_{H^{1}(\Omega)}, where C>0C>0 depends only on the polynomial degree p1p\geq 1 and the shape regularity of 𝒯\mathcal{T}_{\bullet}.

The recent work [19] has constructed a Scott–Zhang-type projector 𝒫:𝑯(div;Ω)𝑹𝑻p1(𝒯)\mathcal{P}_{\bullet}\colon\boldsymbol{H}({\rm div\,};\Omega)\to\boldsymbol{RT}^{p-1}(\mathcal{T}_{\bullet}). While [19, Section 3.4] also includes Neumann boundary conditions 𝝈𝒏=0{\boldsymbol{\sigma}}\cdot\boldsymbol{n}=0 on ΓNΩ\Gamma_{N}\subseteq\partial\Omega, we only consider the full space 𝑯(div;Ω)\boldsymbol{H}({\rm div\,};\Omega). Moreover, we will only exploit the following two properties which hold for all 𝝈𝑯(div;Ω){\boldsymbol{\sigma}}\in\boldsymbol{H}({\rm div\,};\Omega), 𝝈𝑹𝑻p1(𝒯){\boldsymbol{\sigma}}_{\bullet}\in\boldsymbol{RT}^{p-1}(\mathcal{T}_{\bullet}), and T𝒯T\in\mathcal{T}_{\bullet}:

  • (i)

    local projection property: if 𝝈|Ω(T)=𝝈|Ω(T){\boldsymbol{\sigma}}|_{\Omega_{\bullet}(T)}={\boldsymbol{\sigma}}_{\bullet}|_{\Omega_{\bullet}(T)}, then (𝒫𝝈)|T=𝝈|T(\mathcal{P}_{\bullet}{\boldsymbol{\sigma}})|_{T}={\boldsymbol{\sigma}}_{\bullet}|_{T};

  • (ii)

    global H(𝐝𝐢𝐯;𝛀)\boldsymbol{H({\rm div};\Omega)}-stability: (1𝒫)𝝈𝑯(div;Ω)C(1+diam(Ω))𝝈𝑯(div;Ω)\|(1-\mathcal{P}_{\bullet}){\boldsymbol{\sigma}}\|_{\boldsymbol{H}({\rm div\,};\Omega)}~\leq C(1+\mathrm{diam}(\Omega))\,\|{\boldsymbol{\sigma}}\|_{\boldsymbol{H}({\rm div\,};\Omega)}, where again C>0C>0 depends only on p1p\geq 1 and the shape regularity of 𝒯\mathcal{T}_{\bullet}.

A.2. Discrete reliability

In the following, we exploit the local projection property and the global stability of the Scott–Zhang-type operators from Section A.1 and prove that the built-in least-squares estimator satisfies discrete reliability.

Lemma 14 (Discrete reliability).

There exist constants Cref,Cdrel1C_{\mathrm{ref}},C_{\mathrm{drel}}\geq 1 such that for all 𝒯refine(𝒯0)\mathcal{T}_{\bullet}\in\operatorname{refine}(\mathcal{T}_{0}) and all 𝒯refine(𝒯)\mathcal{T}_{\circ}\in\operatorname{refine}(\mathcal{T}_{\bullet}) there exists ,𝒯\mathcal{R}_{\bullet,\circ}\subset\mathcal{T}_{\bullet} such that

(35) 𝒖𝒖𝕍(Ω)Cdrelη(,)and#,Cref(#𝒯#𝒯).\displaystyle\|\boldsymbol{u}_{\circ}^{\star}-\boldsymbol{u}_{\bullet}^{\star}\|_{\mathbb{V}(\Omega)}\leq C_{\mathrm{drel}}\eta_{\bullet}(\mathcal{R}_{\bullet,\circ})\quad\text{and}\quad\#\mathcal{R}_{\bullet,\circ}\leq C_{\mathrm{ref}}(\#\mathcal{T}_{\circ}-\#\mathcal{T}_{\bullet}).

The constant CrefC_{\mathrm{ref}} depends only on the shape regularity of 𝒯\mathcal{T}_{\bullet}, whereas CdrelC_{\mathrm{drel}} additionally depends on the constants ccntc_{\rm cnt}, CcntC_{\rm cnt} from (A1), the polynomial degree pp\in\mathbb{N}, and diam(Ω)\mathrm{diam}(\Omega).

Proof.

The proof is split into three steps.

Step 1. Define the set

,=𝒯𝒩,with𝒩,={T𝒯:Ω(T)𝒯𝒯}.\displaystyle\mathcal{R}_{\bullet,\circ}=\mathcal{T}_{\bullet}\setminus\mathcal{N}_{\bullet,\circ}\quad\text{with}\quad\mathcal{N}_{\bullet,\circ}=\big{\{}T\in\mathcal{T}_{\bullet}\,:\,\Omega_{\bullet}(T)\subseteq\bigcup(\mathcal{T}_{\bullet}\cap\mathcal{T}_{\circ})\big{\}}.

Given T,T\in\mathcal{R}_{\bullet,\circ}, there exists T𝒯T^{\prime}\in\mathcal{T}_{\bullet} such that TTT\cap T^{\prime}\neq\emptyset and T𝒯𝒯T^{\prime}\not\in\mathcal{T}_{\bullet}\cap\mathcal{T}_{\circ}. This implies that T𝒯\𝒯T^{\prime}\in\mathcal{T}_{\bullet}\backslash\mathcal{T}_{\circ} and hence TT belongs to the patch around 𝒯\𝒯\mathcal{T}_{\bullet}\backslash\mathcal{T}_{\circ}, i.e., T{T𝒯:T(𝒯\𝒯)¯}T\in\big{\{}T^{\prime}\in\mathcal{T}_{\bullet}\,:\,T^{\prime}\cap\overline{\bigcup(\mathcal{T}_{\bullet}\backslash\mathcal{T}_{\circ})}\neq\emptyset\big{\}}. Overall, we thus conclude that

#,#{T𝒯:T(𝒯\𝒯)¯}(R2)#(𝒯\𝒯)#𝒯#𝒯.\displaystyle\#\mathcal{R}_{\bullet,\circ}\leq\#\big{\{}T^{\prime}\in\mathcal{T}_{\bullet}\,:\,T^{\prime}\cap\overline{\bigcup(\mathcal{T}_{\bullet}\backslash\mathcal{T}_{\circ})}\neq\emptyset\big{\}}\stackrel{{\scriptstyle\eqref{ass:refinement:quasiuniform}}}{{\lesssim}}\#(\mathcal{T}_{\bullet}\backslash\mathcal{T}_{\circ})\leq\#\mathcal{T}_{\circ}-\#\mathcal{T}_{\bullet}.

Step 2. We define the operator :𝕍(Ω)𝕍(Ω)\mathcal{I}_{\bullet}\colon\operatorname{\mathbb{V}}(\Omega)\to\operatorname{\mathbb{V}}_{\bullet}(\Omega) by 𝒗:=(𝒥v,𝒫𝝈)\mathcal{I}_{\bullet}\boldsymbol{v}:=(\mathcal{J}_{\bullet}v,\mathcal{P}_{\bullet}{\boldsymbol{\sigma}}) for 𝒗=(v,𝝈)𝕍(Ω)\boldsymbol{v}=(v,{\boldsymbol{\sigma}})\in\operatorname{\mathbb{V}}(\Omega). Since 𝒥\mathcal{J}_{\bullet} and 𝒫\mathcal{P}_{\bullet} are stable projections, it follows that also \mathcal{I}_{\bullet} is a stable projection, i.e., for all 𝒗𝕍(Ω)\boldsymbol{v}\in\operatorname{\mathbb{V}}(\Omega) and 𝒗𝕍(Ω)\boldsymbol{v}_{\bullet}\in\mathbb{V}_{\bullet}(\Omega), it holds that

(36) (1)𝒗𝕍(Ω)=(1)(𝒗𝒗)𝕍(Ω)𝒗𝒗𝕍(Ω),\displaystyle\|(1-\mathcal{I}_{\bullet})\boldsymbol{v}\|_{\operatorname{\mathbb{V}}(\Omega)}=\|(1-\mathcal{I}_{\bullet})(\boldsymbol{v}\!-\!\boldsymbol{v}_{\bullet})\|_{\operatorname{\mathbb{V}}(\Omega)}\lesssim\|\boldsymbol{v}-\boldsymbol{v}_{\bullet}\|_{\operatorname{\mathbb{V}}(\Omega)},

where the hidden constant depends on diam(Ω)\mathrm{diam}(\Omega) and the constants C>0C>0 from Section A.1. Moreover, let 𝒖=(u,𝝈)\boldsymbol{u}_{\circ}^{\star}=(u_{\circ}^{\star},{\boldsymbol{\sigma}}_{\circ}^{\star}). From the local projection properties of the operators 𝒥\mathcal{J}_{\bullet} and 𝒫\mathcal{P}_{\bullet} and the choice of 𝒩,\mathcal{N}_{\bullet,\circ}, it follows that 𝒖|𝒩,=(𝒖)|𝒩,\boldsymbol{u}_{\circ}^{\star}|_{\bigcup\mathcal{N}_{\bullet,\circ}}=(\mathcal{I}_{\bullet}\boldsymbol{u}_{\circ}^{\star})|_{\bigcup\mathcal{N}_{\bullet,\circ}}. This leads to supp(𝒖𝒖)(𝒯\𝒩,)=,\operatorname{supp}(\boldsymbol{u}_{\circ}^{\star}-\mathcal{I}_{\bullet}\boldsymbol{u}_{\circ}^{\star})\subseteq\bigcup(\mathcal{T}_{\bullet}\backslash\mathcal{N}_{\bullet,\circ})=\bigcup\mathcal{R}_{\bullet,\circ}. Since \mathcal{L} acts locally, this also implies that

(37) supp(𝒖𝒖),.\displaystyle\operatorname{supp}\mathcal{L}(\boldsymbol{u}_{\circ}^{\star}-\mathcal{I}_{\bullet}\boldsymbol{u}_{\circ}^{\star})\subseteq\mbox{$\bigcup$}\mathcal{R}_{\bullet,\circ}.

Step 3. For any 𝒗𝕍(Ω)\boldsymbol{v}_{\bullet}\in\mathbb{V}_{\bullet}(\Omega), the Galerkin orthogonality (6) proves that

𝒖𝒖𝕍(Ω)2\displaystyle\|\boldsymbol{u}_{\circ}^{\star}-\boldsymbol{u}_{\bullet}^{\star}\|_{\mathbb{V}(\Omega)}^{2} (A1)b(𝒖𝒖,𝒖𝒖)=(6)b(𝒖𝒖,𝒖𝒖)=(6)b(𝒖𝒖,𝒖𝒗).\displaystyle\stackrel{{\scriptstyle{\eqref{ass:pde}}}}{{\simeq}}b(\boldsymbol{u}_{\circ}^{\star}-\boldsymbol{u}_{\bullet}^{\star},\boldsymbol{u}_{\circ}^{\star}-\boldsymbol{u}_{\bullet}^{\star})\stackrel{{\scriptstyle\eqref{eq:galerkinorthogonality}}}{{=}}b(\boldsymbol{u}-\boldsymbol{u}_{\bullet}^{\star},\boldsymbol{u}_{\circ}^{\star}-\boldsymbol{u}_{\bullet}^{\star})\stackrel{{\scriptstyle\eqref{eq:galerkinorthogonality}}}{{=}}b(\boldsymbol{u}-\boldsymbol{u}_{\bullet}^{\star},\boldsymbol{u}_{\circ}^{\star}-\boldsymbol{v}_{\bullet}).

With the choice 𝒗=𝒖\boldsymbol{v}_{\bullet}=\mathcal{I}_{\bullet}\boldsymbol{u}_{\circ}^{\star}, we see that

b(𝒖𝒖,𝒖𝒗)=((𝒖𝒖),(𝒖𝒗))Ω=(37)((𝒖𝒖),(𝒖𝒗)),\displaystyle b(\boldsymbol{u}-\boldsymbol{u}_{\bullet}^{\star},\boldsymbol{u}_{\circ}^{\star}-\boldsymbol{v}_{\bullet})=(\mathcal{L}(\boldsymbol{u}-\boldsymbol{u}_{\bullet}^{\star})\hskip 1.42262pt,\mathcal{L}(\boldsymbol{u}_{\circ}^{\star}-\boldsymbol{v}_{\bullet}))_{\Omega}\stackrel{{\scriptstyle\eqref{eq:optimal2}}}{{=}}(\mathcal{L}(\boldsymbol{u}-\boldsymbol{u}_{\bullet}^{\star})\hskip 1.42262pt,\mathcal{L}(\boldsymbol{u}_{\circ}^{\star}-\boldsymbol{v}_{\bullet}))_{\bigcup\mathcal{R}_{\bullet,\circ}}
(𝒖𝒖),(𝒖𝒗),=(37)η(,)(𝒖𝒗)Ω\displaystyle\quad\stackrel{{\scriptstyle\phantom{\eqref{ass:pde}}}}{{\leq}}\|\mathcal{L}(\boldsymbol{u}-\boldsymbol{u}_{\bullet}^{\star})\|_{\bigcup\mathcal{R}_{\bullet,\circ}}\,\|\mathcal{L}(\boldsymbol{u}_{\circ}^{\star}-\boldsymbol{v}_{\bullet})\|_{\bigcup\mathcal{R}_{\bullet,\circ}}\stackrel{{\scriptstyle\eqref{eq:optimal2}}}{{=}}\eta_{\bullet}(\mathcal{R}_{\bullet,\circ})\,\|\mathcal{L}(\boldsymbol{u}_{\circ}^{\star}-\boldsymbol{v}_{\bullet})\|_{\Omega}
(A1)η(,)𝒖𝒗𝕍(Ω)(36)η(,)𝒖𝒖𝕍(Ω).\displaystyle\quad\stackrel{{\scriptstyle\eqref{ass:pde}}}{{\lesssim}}\eta_{\bullet}(\mathcal{R}_{\bullet,\circ})\|\boldsymbol{u}_{\circ}^{\star}-\boldsymbol{v}_{\bullet}\|_{\mathbb{V}(\Omega)}\stackrel{{\scriptstyle\eqref{eq:optimal1}}}{{\lesssim}}\eta_{\bullet}(\mathcal{R}_{\bullet,\circ})\|\boldsymbol{u}_{\circ}^{\star}-\boldsymbol{u}_{\bullet}^{\star}\|_{\mathbb{V}(\Omega)}.

Combining the latter two estimates, we conclude the proof. ∎

A.3. Linear convergence implies optimal algebraic convergence rates

Given s>0s>0, we consider the following approximation class

(38) 𝒖𝔸s:=supN0min𝒯𝕋#𝒯#𝒯0Nmin𝒗𝕍(Ω)(N+1)s𝒖𝒗𝕍(Ω)0{}.\displaystyle\|\boldsymbol{u}\|_{\mathbb{A}_{s}}:=\sup_{N\in\mathbb{N}_{0}}\min_{\begin{subarray}{c}\mathcal{T}_{\diamond}\in\mathbb{T}\\ \#\mathcal{T}_{\diamond}-\#\mathcal{T}_{0}\leq N\end{subarray}}\min_{\boldsymbol{v}_{\diamond}\in\operatorname{\mathbb{V}}_{\diamond}(\Omega)}(N+1)^{s}\|\boldsymbol{u}^{\star}-\boldsymbol{v}_{\diamond}\|_{\operatorname{\mathbb{V}}(\Omega)}\in\mathbb{R}_{\geq 0}\cup\{\infty\}.

We note that 𝒖𝔸s<\|\boldsymbol{u}\|_{\mathbb{A}_{s}}<\infty implies the existence of a sequence of meshes (¯𝒯)0(\bar{}\mathcal{T}_{\ell})_{\ell\in\mathbb{N}_{0}} with ¯𝒯0=𝒯0\bar{}\mathcal{T}_{0}=\mathcal{T}_{0} such that the corresponding best approximation errors satisfy

min¯𝒗𝕍¯(Ω)𝒖¯𝒗𝕍(Ω)(#¯𝒯#𝒯0)s0as .\displaystyle\min_{\bar{}\boldsymbol{v}_{\ell}\in\bar{\operatorname{\mathbb{V}}}_{\ell}(\Omega)}\|\boldsymbol{u}^{\star}-\bar{}\boldsymbol{v}_{\ell}\|_{\operatorname{\mathbb{V}}(\Omega)}\lesssim(\#\bar{}\mathcal{T}_{\ell}-\#\mathcal{T}_{0})^{-s}\to 0\quad\text{as }\ell\to\infty.

One says that Algorithm 1 is rate optimal if and only if, for all s>0s>0 with 𝒖𝔸s<\|\boldsymbol{u}\|_{\mathbb{A}_{s}}<\infty, the sequence of meshes (𝒯)0(\mathcal{T}_{\ell})_{\ell\in\mathbb{N}_{0}} generated by Algorithm 1 guarantees that

𝒖𝒖𝕍(Ω)(#𝒯#𝒯0)s0as .\displaystyle\|\boldsymbol{u}^{\star}-\boldsymbol{u}_{\ell}^{\star}\|_{\operatorname{\mathbb{V}}(\Omega)}\lesssim(\#\mathcal{T}_{\ell}-\#\mathcal{T}_{0})^{-s}\to 0\quad\text{as }\ell\to\infty.
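
In practice, rate optimality is checked by estimating the experimental rate ss from recorded error (or estimator) values along the adaptive run. The following minimal sketch illustrates this; the data are hypothetical placeholders and not results computed in this note.

```python
import numpy as np

# Hypothetical data for illustration only (not results computed in this note):
# numbers of elements #T_l and errors recorded along an adaptive run.
num_elems = np.array([120.0, 310.0, 820.0, 2100.0, 5600.0, 14800.0])
errors = np.array([9.1e-2, 5.5e-2, 3.3e-2, 2.0e-2, 1.2e-2, 7.3e-3])

# Least-squares fit of log(error) = -s * log(#T_l) + const, mimicking the decay
# (#T_l - #T_0)^(-s) in the rate-optimality statement above.
slope, _ = np.polyfit(np.log(num_elems), np.log(errors), 1)
print(f"experimental rate s = {-slope:.2f}")
```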

We refer to [13] for an abstract framework covering the state of the art of rate-optimal adaptive algorithms and note that LSFEM guarantees the equivalence

(39) η(𝒖)(8)𝒖𝒖𝕍(Ω)(5)min𝒗𝕍(Ω)𝒖𝒗𝕍(Ω)(8)min𝒗𝕍(Ω)η(𝒗)\displaystyle\eta_{\bullet}(\boldsymbol{u}_{\bullet}^{\star})\stackrel{{\scriptstyle\eqref{eq:reliable-efficient}}}{{\simeq}}\|\boldsymbol{u}^{\star}-\boldsymbol{u}_{\bullet}^{\star}\|_{\operatorname{\mathbb{V}}(\Omega)}\stackrel{{\scriptstyle\eqref{eq:cea}}}{{\simeq}}\min_{\boldsymbol{v}_{\bullet}\in\operatorname{\mathbb{V}}_{\bullet}(\Omega)}\|\boldsymbol{u}^{\star}-\boldsymbol{v}_{\bullet}\|_{\operatorname{\mathbb{V}}(\Omega)}\stackrel{{\scriptstyle\eqref{eq:reliable-efficient}}}{{\simeq}}\min_{\boldsymbol{v}_{\bullet}\in\operatorname{\mathbb{V}}_{\bullet}(\Omega)}\eta_{\bullet}(\boldsymbol{v}_{\bullet})

so that the present definition of 𝒖𝔸s\|\boldsymbol{u}\|_{\mathbb{A}_{s}} is equivalent to that of [4, 34, 17, 13].

The following result is a direct consequence of discrete reliability (35) and the analysis from [13] under the usual assumptions: First, suppose that we employ newest vertex bisection [35, 27] for mesh-refinement and that the initial mesh 𝒯0\mathcal{T}_{0} satisfies the admissibility condition from [35] for d3d\geq 3. Second, suppose that Algorithm 1 employs the Dörfler marking criterion with minimal cardinality, i.e., for given 0<θ10<\theta\leq 1, it holds that

(40) 𝕄:={𝒰𝒯:θη(u)2η(𝒰,u)2} and ##𝒰 for all 𝒰𝕄.\displaystyle\mathcal{M}_{\ell}\in\mathbb{M}_{\ell}:=\big{\{}\mathcal{U}_{\ell}\subseteq\mathcal{T}_{\ell}\,:\,\theta\,\eta_{\ell}(u_{\ell})^{2}\leq\eta_{\ell}(\mathcal{U}_{\ell},u_{\ell})^{2}\big{\}}\text{ \ \ and \ }\#\mathcal{M}_{\ell}\leq\#\mathcal{U}_{\ell}\text{ \ for all }\ \mathcal{U}_{\ell}\in\mathbb{M}_{\ell}.
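
For orientation, we sketch one standard way to realize (40) in practice: sort the refinement indicators and mark elements with the largest indicators until the fraction θ of the total is reached; this greedy choice has minimal cardinality. The sketch below is an illustration only (indicator values are hypothetical, and the function name is ours); sorting costs O(N log N), whereas [31] shows that a set satisfying (40) can even be determined in linear complexity.

```python
import numpy as np

def doerfler_marking(eta2, theta):
    """Return element indices M of a minimal-cardinality set satisfying (40), i.e.,
    theta * sum(eta2) <= sum(eta2[M]), where eta2 holds the squared refinement
    indicators. Sorting costs O(N log N); see [31] for linear complexity."""
    order = np.argsort(eta2)[::-1]                    # largest indicators first
    cumulative = np.cumsum(eta2[order])
    n_mark = int(np.searchsorted(cumulative, theta * cumulative[-1])) + 1
    return order[:n_mark]

# usage with hypothetical squared indicators eta_l(T)^2
eta2 = np.array([0.50, 0.10, 0.05, 0.30, 0.02, 0.03])
print(doerfler_marking(eta2, theta=0.6))              # marks elements 0 and 3
```
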
Proposition 15.

There exists a constant 0<θopt10<\theta_{\rm opt}\leq 1 such that for all 0<θ<θopt0<\theta<\theta_{\rm opt}, the following implication holds: If linear convergence (34) holds, then Algorithm 1 is even rate optimal, i.e., for all s>0s>0, there exists a constant Copt>0C_{\rm opt}>0 such that

(41) Copt1𝒖𝔸ssup0(#𝒯#𝒯0+1)s𝒖𝒖𝕍(Ω)Copt𝒖𝔸s.\displaystyle C_{\rm opt}^{-1}\,\|\boldsymbol{u}\|_{\mathbb{A}_{s}}\leq\sup_{\ell\in\mathbb{N}_{0}}(\#\mathcal{T}_{\ell}-\#\mathcal{T}_{0}+1)^{s}\|\boldsymbol{u}^{\star}-\boldsymbol{u}_{\ell}^{\star}\|_{\operatorname{\mathbb{V}}(\Omega)}\leq C_{\rm opt}\,\|\boldsymbol{u}\|_{\mathbb{A}_{s}}.
Sketch of proof.

By the triangle inequality, the built-in least-squares error estimator is stable on non-refined elements in the sense of [13, Axiom (A1)]. Together with discrete reliability (35), this implies optimality of the Dörfler marking; see [13, Section 4.5]. Due to (39), the error estimator is quasi-monotone with respect to mesh-refinement [13, Eq. (3.8)]. This implies the so-called comparison lemma; see [13, Lemma 4.14]. Together with linear convergence (34), [13, Proposition 4.15] yields the optimality (41). ∎

Remark 16.

We note that the (constrained) optimality result of Proposition 15 can be obtained for any of the examples presented in Section 3. For the required 𝑯(𝐜𝐮𝐫𝐥;Ω)\boldsymbol{H}(\boldsymbol{\operatorname{curl}}\,;\Omega)-stable local projection, we refer, e.g., to [37, Section 4].