
A short note on plain convergence of
adaptive least-squares finite element methods

Thomas Führer (corresponding author), Facultad de Matemáticas, Pontificia Universidad Católica de Chile, Santiago, Chile, tofuhrer@mat.uc.cl, and Dirk Praetorius, TU Wien, Institute of Analysis and Scientific Computing, Wiedner Hauptstr. 8–10, 1040 Wien, Austria, dirk.praetorius@asc.tuwien.ac.at
Abstract.

We show that adaptive least-squares finite element methods driven by the canonical least-squares functional converge under weak conditions on the PDE operator, the mesh-refinement, and the marking strategy. Contrary to prior works, our plain convergence result relies neither on sufficiently fine initial meshes nor on severe restrictions on marking parameters. Finally, we prove that convergence remains valid if a contractive iterative solver is used to obtain the approximate solutions (e.g., the preconditioned conjugate gradient method with an optimal preconditioner). The results apply within a fairly abstract framework which covers a variety of model problems.

Key words and phrases:
Least squares finite element methods, adaptive algorithm, convergence
2010 Mathematics Subject Classification:
65N12, 65N15, 65N30, 65N50
Acknowledgment. This work was supported by CONICYT (through FONDECYT project 11170050) and the Austrian Science Fund FWF (through project P33216 and the special research program SFB F65).

1. Introduction

Least-squares finite element methods (LSFEMs) are a class of finite element methods that minimize the residual in some norm. These methods often become rather easy to analyze and implement when the residual is measured in the space of square-integrable functions. Some features of LSFEMs are the following: First, the resulting algebraic systems are always symmetric and positive definite, thus allowing the use of standard iterative solvers. Second, inf-sup stability for any conforming finite element space is inherited from the continuous problem. Finally, another feature is the built-in (localizable) error estimator that can be used to steer adaptive mesh-refinement in, e.g., an adaptive algorithm of the form

\displaystyle\boxed{\texttt{SOLVE}}\quad\longrightarrow\quad\boxed{\texttt{ESTIMATE}}\quad\longrightarrow\quad\boxed{\texttt{MARK}}\quad\longrightarrow\quad\boxed{\texttt{REFINE}}.

For standard discretizations, the mathematical understanding of adaptive finite element methods (AFEMs) has matured over the past decades. We mention [30, 33] for abstract theories for plain convergence of AFEM, the seminal works [18, 29, 4, 34, 17] for convergence of standard AFEM with optimal algebraic rates, and the abstract framework of [13], which also provides a comprehensive review of the state of the art.

In contrast to this, only very little is known about convergence of adaptive LSFEM; see [12, 8, 14, 15]. To the best of our knowledge, plain convergence of adaptive LSFEM when using the built-in error estimator has so far only been addressed in [15]. The other works [12, 8, 14] deal with optimal convergence results, but rely on alternative error estimators.

The aim of this work is to shed some new light on the plain convergence of adaptive LSFEMs using the canonical error estimator shipped with them. Basically, the whole idea of this note is to verify that a large class of LSFEMs fits into the abstract framework of [33]. From this, we then conclude that the sequence of discrete solutions produced by the above iteration converges to the exact solution of the underlying problem (see Theorem 2 below). Let us mention that our result is weaker than [15, Theorem 4.1], as we only show (plain) convergence, whereas [15] proves that an equivalent error quantity is contractive. However, the latter result comes at the price of assuming sufficiently fine initial meshes and sufficiently large marking parameters $0<\theta<1$ in the Dörfler marking criterion. This is somewhat at odds with the by now standard proofs of optimal convergence, where $\theta$ needs to be sufficiently small; see [13] for an overview. We also note that [15] is constrained to the Dörfler marking criterion, while the present analysis, in the spirit of [33], covers a fairly wide range of marking strategies.

The remainder of the work is organized as follows: In Section 2, we state our assumptions on the PDE setting (Section 2.1), the mesh-refinement (Section 2.3), the discrete spaces (Section 2.4), and the marking strategy (Section 2.6). Moreover, we recall the least-squares discretization (Section 2.2) as well as the built-in error estimator (Section 2.5) and formulate the common adaptive algorithm (Algorithm 1). Our first main result proves plain convergence of adaptive LSFEM (Theorem 2), if the LSFEM solution is computed exactly. In practice, however, iterative solvers (e.g., multigrid or the preconditioned CG method) are used. Our second main result (Theorem 6) proves convergence of adaptive LSFEM in the presence of inexact solvers (Algorithm 4). Overall, the presented abstract setting covers several model problems like the Poisson problem (Section 3.1), general elliptic second-order PDEs (Section 3.2), linear elasticity (Section 3.3), the magnetostatic Maxwell problem (Section 3.4), and the Stokes problem (Section 3.5). While this work focusses on plain convergence, the short appendix (Appendix A) notes that LSFEM ensures discrete reliability so that, in the spirit of [13], the only missing link for the mathematical proof of optimal convergence rates is the verification of linear convergence.

2. Plain convergence of adaptive least-squares methods

2.1. Continuous model formulation

We consider a PDE in the abstract form

(1) \displaystyle\mathcal{L}\boldsymbol{u}^{\star}=\boldsymbol{F}\quad\text{in a bounded domain }\Omega\subset\mathbb{R}^{d}\text{ with }d\geq 2.

Here, $\boldsymbol{F}\in L^{2}(\Omega)^{N}$ with $N\geq 1$ are given data, and $\mathcal{L}\colon\mathbb{V}(\Omega)\to L^{2}(\Omega)^{N}$ is a linear operator from some Hilbert space $\mathbb{V}(\Omega)$ with norm $\|\cdot\|_{\mathbb{V}(\Omega)}$ to $L^{2}(\Omega)^{N}$ with norm $\|\cdot\|_{L^{2}(\Omega)}$. For simplicity, we assume that homogeneous boundary conditions are contained in the space $\mathbb{V}(\Omega)$. To abbreviate notation, we write $\|\cdot\|_{\omega}:=\|\cdot\|_{L^{2}(\omega)}$ for any measurable set $\omega\subseteq\Omega$. Moreover, $(\cdot\,,\cdot)_{\omega}$ denotes the corresponding $L^{2}(\omega)$ scalar product.

We make the following assumptions on $\mathcal{L}$ and $\boldsymbol{F}$:

  1. (A1)

    𝓛\boldsymbol{\mathcal{L}} is continuously invertible: With constants ccnt,Ccnt>0c_{\rm cnt},C_{\rm cnt}>0, it holds that

    \displaystyle c_{\rm cnt}^{-1}\|\boldsymbol{v}\|_{\mathbb{V}(\Omega)}\leq\|\mathcal{L}\boldsymbol{v}\|_{\Omega}\leq C_{\rm cnt}\|\boldsymbol{v}\|_{\mathbb{V}(\Omega)}\quad\text{for all }\boldsymbol{v}\in\mathbb{V}(\Omega).
  2. (A2)

    PDE admits solution: The given data satisfy 𝑭ran()\boldsymbol{F}\in\operatorname{ran}(\mathcal{L}).

While (A2) yields the existence of the solution $\boldsymbol{u}^{\star}\in\mathbb{V}(\Omega)$ of (1), assumption (A1) leads to

(2) \displaystyle c_{\rm cnt}^{-1}\,\|\boldsymbol{u}^{\star}-\boldsymbol{v}\|_{\mathbb{V}(\Omega)}\leq\|\boldsymbol{F}-\mathcal{L}\boldsymbol{v}\|_{\Omega}\leq C_{\rm cnt}\,\|\boldsymbol{u}^{\star}-\boldsymbol{v}\|_{\mathbb{V}(\Omega)}\quad\text{for all }\boldsymbol{v}\in\mathbb{V}(\Omega)

and hence, in particular, also guarantees uniqueness.

In practice, $\mathbb{V}(\Omega)$ is often an integer-order Sobolev space (see the examples in Section 3) whose norm satisfies the following additional properties (which coincide with [33, eq. (2.3)]):

  1. (A3)

    Additivity: The norm on 𝕍(Ω)\mathbb{V}(\Omega) is additive, i.e., for two disjoint subdomains ω1,ω2Ω\omega_{1},\omega_{2}\subset\Omega with positive Lebesgue measure, it holds that

    \displaystyle\|\boldsymbol{v}\|_{\mathbb{V}(\omega_{1}\cup\omega_{2})}^{2}=\|\boldsymbol{v}\|_{\mathbb{V}(\omega_{1})}^{2}+\|\boldsymbol{v}\|_{\mathbb{V}(\omega_{2})}^{2}\quad\text{for all }\boldsymbol{v}\in\mathbb{V}(\Omega).
  2. (A4)

    Absolute continuity: The norm on 𝕍(Ω)\mathbb{V}(\Omega) is absolutely continuous with respect to the Lebesgue measure, i.e., for all 𝒗𝕍(Ω)\boldsymbol{v}\in\mathbb{V}(\Omega) and all domains ωΩ\omega\subset\Omega, it holds that

    \displaystyle\|\boldsymbol{v}\|_{\mathbb{V}(\omega)}\to 0\quad\text{as}\quad|\omega|\to 0.
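For instance, for the space $\mathbb{V}(\Omega)=H_{0}^{1}(\Omega)\times\boldsymbol{H}({\rm div\,};\Omega)$ used for the Poisson problem in Section 3.1 below, the canonical choice

\displaystyle\|(v,{\boldsymbol{\tau}})\|_{\mathbb{V}(\omega)}^{2}:=\|v\|_{\omega}^{2}+\|\nabla v\|_{\omega}^{2}+\|{\boldsymbol{\tau}}\|_{\omega}^{2}+\|{\rm div\,}{\boldsymbol{\tau}}\|_{\omega}^{2}

satisfies both (A3) and (A4), since every contribution is a Lebesgue integral over $\omega$ and hence additive as well as absolutely continuous with respect to the Lebesgue measure.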

2.2. Least-squares method

Let 𝕍(Ω)\mathbb{V}_{\bullet}(\Omega) be a closed subspace of 𝕍(Ω)\mathbb{V}(\Omega). The least-squares method seeks 𝒖𝕍(Ω)\boldsymbol{u}_{\bullet}^{\star}\in\mathbb{V}_{\bullet}(\Omega) as the minimizer of the least-squares functional, i.e.,

(3) \displaystyle{\rm LS}(\boldsymbol{u}_{\bullet}^{\star};\boldsymbol{F})=\min_{\boldsymbol{v}_{\bullet}\in\mathbb{V}_{\bullet}(\Omega)}{\rm LS}(\boldsymbol{v}_{\bullet};\boldsymbol{F}),\quad\text{where}\quad{\rm LS}(\boldsymbol{v};\boldsymbol{F})=\|\boldsymbol{F}-\mathcal{L}\boldsymbol{v}\|_{\Omega}^{2}.

The Euler–Lagrange equations for this problem read: Find 𝒖𝕍(Ω)\boldsymbol{u}_{\bullet}^{\star}\in\mathbb{V}_{\bullet}(\Omega) such that

(4) \displaystyle b(\boldsymbol{u}_{\bullet}^{\star},\boldsymbol{v}_{\bullet})=F(\boldsymbol{v}_{\bullet})\quad\text{for all }\boldsymbol{v}_{\bullet}\in\mathbb{V}_{\bullet}(\Omega),

where

\displaystyle b(\boldsymbol{w},\boldsymbol{v}):=(\mathcal{L}\boldsymbol{w}\,,\mathcal{L}\boldsymbol{v})_{\Omega}\quad\text{and}\quad F(\boldsymbol{v}):=(\boldsymbol{F}\,,\mathcal{L}\boldsymbol{v})_{\Omega}\quad\text{for all }\boldsymbol{v},\boldsymbol{w}\in\mathbb{V}(\Omega).
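To make the derivation of (4) from (3) explicit, note that, for arbitrary $\boldsymbol{v}_{\bullet}\in\mathbb{V}_{\bullet}(\Omega)$ and $t\in\mathbb{R}$, expanding the quadratic functional yields

\displaystyle{\rm LS}(\boldsymbol{u}_{\bullet}^{\star}+t\boldsymbol{v}_{\bullet};\boldsymbol{F})=\|\boldsymbol{F}-\mathcal{L}\boldsymbol{u}_{\bullet}^{\star}\|_{\Omega}^{2}-2t\,\big[F(\boldsymbol{v}_{\bullet})-b(\boldsymbol{u}_{\bullet}^{\star},\boldsymbol{v}_{\bullet})\big]+t^{2}\,b(\boldsymbol{v}_{\bullet},\boldsymbol{v}_{\bullet}),

so that the vanishing of the first variation at $t=0$ is precisely the variational formulation (4).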

It is straightforward to see that $F(\cdot)$ and $b(\cdot,\cdot)$ satisfy the assumptions of the Lax–Milgram lemma (and, in particular, [33, eq. (2.1)–(2.2)]), i.e., for all $\boldsymbol{v},\boldsymbol{w}\in\mathbb{V}(\Omega)$, it holds that

\displaystyle|F(\boldsymbol{v})|\leq C_{\rm cnt}\,\|\boldsymbol{F}\|_{\Omega}\|\boldsymbol{v}\|_{\mathbb{V}(\Omega)},\quad|b(\boldsymbol{w},\boldsymbol{v})|\leq C_{\rm cnt}^{2}\,\|\boldsymbol{w}\|_{\mathbb{V}(\Omega)}\|\boldsymbol{v}\|_{\mathbb{V}(\Omega)},\quad c_{\rm cnt}^{-2}\,\|\boldsymbol{w}\|_{\mathbb{V}(\Omega)}^{2}\leq b(\boldsymbol{w},\boldsymbol{w}).

Therefore, the discrete variational formulation (4) admits a unique solution and is equivalent to the minimization problem (3). In particular, there holds the Céa lemma

(5) \displaystyle\begin{split}c_{\rm cnt}^{-1}\|\boldsymbol{u}^{\star}-\boldsymbol{u}_{\bullet}^{\star}\|_{\mathbb{V}(\Omega)}\stackrel{\text{(A1)}}{\leq}\|\mathcal{L}(\boldsymbol{u}^{\star}-\boldsymbol{u}_{\bullet}^{\star})\|_{\Omega}&\stackrel{\text{(3)}}{=}\min_{\boldsymbol{v}_{\bullet}\in\mathbb{V}_{\bullet}(\Omega)}\|\mathcal{L}(\boldsymbol{u}^{\star}-\boldsymbol{v}_{\bullet})\|_{\Omega}\\ &\stackrel{\text{(A1)}}{\leq}C_{\rm cnt}\,\min_{\boldsymbol{v}_{\bullet}\in\mathbb{V}_{\bullet}(\Omega)}\|\boldsymbol{u}^{\star}-\boldsymbol{v}_{\bullet}\|_{\mathbb{V}(\Omega)},\end{split}

i.e., the exact least-squares solutions are quasi-optimal. Moreover, the Galerkin orthogonality

(6) \displaystyle b(\boldsymbol{u}^{\star}-\boldsymbol{u}_{\bullet}^{\star},\boldsymbol{v}_{\bullet})=0\quad\text{for all }\boldsymbol{v}_{\bullet}\in\mathbb{V}_{\bullet}(\Omega)

follows from the variational formulation (4).

2.3. Meshes and mesh-refinement

For simplicity, as in [33], we restrict our presentation to conforming simplicial triangulations $\mathcal{T}_{\bullet}$ of $\Omega$ and refinement by, e.g., the newest vertex bisection algorithm [35, 27]. For each triangulation $\mathcal{T}_{\bullet}$, let $h_{\bullet}\in L^{\infty}(\Omega)$ denote the associated mesh-size function given by $h_{\bullet}|_{T}=h_{T}=|T|^{1/d}$.

Let $\operatorname{refine}(\cdot)$ denote the refinement routine. We write $\mathcal{T}_{\circ}=\operatorname{refine}(\mathcal{T}_{\bullet},\mathcal{M}_{\bullet})$ if $\mathcal{T}_{\circ}$ is generated from $\mathcal{T}_{\bullet}$ by refining (at least) all marked elements $\mathcal{M}_{\bullet}\subseteq\mathcal{T}_{\bullet}$. Moreover, $\mathbb{T}(\mathcal{T}_{\bullet})$ denotes the set of all meshes that can be generated by an arbitrary but finite number of refinements of $\mathcal{T}_{\bullet}$. Throughout, let $\mathcal{T}_{0}$ be a given initial mesh and $\mathbb{T}:=\mathbb{T}(\mathcal{T}_{0})$.

We make the following assumptions (which essentially coincide with [33, eq. (2.4)]):

  1. (R1)

    Reduction on refined elements: The mesh-size function is monotone and contractive with constant 0<qref<10<q_{\mathrm{ref}}<1 on refined elements, i.e., for all 𝒯𝕋\mathcal{T}_{\bullet}\in\mathbb{T} and 𝒯𝕋(𝒯)\mathcal{T}_{\circ}\in\mathbb{T}(\mathcal{T}_{\bullet}), it holds that

    \displaystyle h_{\circ}\leq h_{\bullet}\quad\text{a.e.\ in }\Omega\quad\text{and}\quad h_{\circ}|_{T}\leq q_{\mathrm{ref}}\,h_{\bullet}|_{T}\quad\text{for all }T\in\mathcal{T}_{\circ}\setminus\mathcal{T}_{\bullet}.
  2. (R2)

    Uniform shape regularity: There exists a constant κ>0\kappa>0 depending only on the initial mesh 𝒯0\mathcal{T}_{0} such that

    \displaystyle\sup_{T\in\mathcal{T}_{\bullet}}\frac{\mathrm{diam}(T)^{d}}{|T|}\leq\kappa\quad\text{for all }\mathcal{T}_{\bullet}\in\mathbb{T}.
  3. (R3)

    Marked elements are refined: It holds that

    \displaystyle\mathcal{M}_{\bullet}\cap\operatorname{refine}(\mathcal{T}_{\bullet},\mathcal{M}_{\bullet})=\emptyset\quad\text{for all }\mathcal{T}_{\bullet}\in\mathbb{T}\text{ and all }\mathcal{M}_{\bullet}\subseteq\mathcal{T}_{\bullet}.
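For newest vertex bisection, these assumptions are satisfied: every refined element is bisected into children of half the volume, so that any $T^{\prime}\in\mathcal{T}_{\circ}\setminus\mathcal{T}_{\bullet}$ with parent $T\in\mathcal{T}_{\bullet}$ satisfies

\displaystyle h_{\circ}|_{T^{\prime}}=|T^{\prime}|^{1/d}\leq\big(|T|/2\big)^{1/d}=2^{-1/d}\,h_{\bullet}|_{T^{\prime}},

i.e., (R1) holds with $q_{\mathrm{ref}}=2^{-1/d}$, while (R2) follows from the fact that newest vertex bisection generates only finitely many similarity classes of simplices; see [35, 27].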

2.4. Discrete spaces

We assume that each mesh 𝒯𝕋\mathcal{T}_{\bullet}\in\mathbb{T} is associated with some discrete space 𝕍(Ω)\mathbb{V}_{\bullet}(\Omega) satisfying the following properties (which coincide with [33, eq. (2.5)]):

  1. (S1)

    Conformity: The spaces are conforming and finite dimensional, i.e.,

    \displaystyle\mathbb{V}_{\bullet}(\Omega)\subset\mathbb{V}(\Omega)\text{ and }\dim(\mathbb{V}_{\bullet}(\Omega))<\infty\quad\text{for all }\mathcal{T}_{\bullet}\in\mathbb{T}.
  2. (S2)

    Nestedness: Mesh-refinement guarantees nested discrete spaces, i.e.,

    \displaystyle\mathbb{V}_{\bullet}(\Omega)\subseteq\mathbb{V}_{\circ}(\Omega)\quad\text{for all }\mathcal{T}_{\circ}\in\mathbb{T}(\mathcal{T}_{\bullet}).

Moreover, we assume that there exists a dense subspace 𝔻(Ω)𝕍(Ω)\mathbb{D}(\Omega)\subset\mathbb{V}(\Omega) with additive norm 𝔻(Ω)\|\cdot\|_{\mathbb{D}(\Omega)} (see (A3) with 𝕍()\mathbb{V}(\cdot) being replaced by 𝔻()\mathbb{D}(\cdot)) satisfying an approximation property:

  1. (S3)

    Local approximation property: There exist constants Capx>0C_{\rm apx}>0 and s>0s>0 such that, for all 𝒯𝕋\mathcal{T}_{\bullet}\in\mathbb{T}, there exists an approximation operator 𝒜:𝔻(Ω)𝕍(Ω)\mathcal{A}_{\bullet}\colon\mathbb{D}(\Omega)\to\mathbb{V}_{\bullet}(\Omega) such that

    \displaystyle\|\boldsymbol{v}-\mathcal{A}_{\bullet}\boldsymbol{v}\|_{\mathbb{V}(T)}\leq C_{\rm apx}\,h_{T}^{s}\|\boldsymbol{v}\|_{\mathbb{D}(T)}\quad\text{for all }\boldsymbol{v}\in\mathbb{D}(\Omega)\text{ and all }T\in\mathcal{T}_{\bullet}.

2.5. Natural least-squares error estimator

Let 𝒯𝕋\mathcal{T}_{\bullet}\in\mathbb{T}. For all 𝒖𝕍(Ω)\boldsymbol{u}_{\bullet}\in\mathbb{V}_{\bullet}(\Omega) and all 𝒰𝒯\mathcal{U}_{\bullet}\subset\mathcal{T}_{\bullet}, we consider the contributions of the least-squares functional

(7) \displaystyle\eta_{\bullet}(\mathcal{U}_{\bullet},u_{\bullet}):=\bigg(\sum_{T\in\mathcal{U}_{\bullet}}\eta_{\bullet}(T,u_{\bullet})^{2}\bigg)^{1/2},\quad\text{where}\quad\eta_{\bullet}(T,u_{\bullet}):=\|\boldsymbol{F}-\mathcal{L}\boldsymbol{u}_{\bullet}\|_{T}.

To abbreviate notation, let $\eta_{\bullet}(u_{\bullet}):=\eta_{\bullet}(\mathcal{T}_{\bullet},u_{\bullet})=\|\boldsymbol{F}-\mathcal{L}\boldsymbol{u}_{\bullet}\|_{\Omega}$. From (2), we see that

(8) \displaystyle c_{\rm cnt}^{-1}\,\|\boldsymbol{u}^{\star}-\boldsymbol{u}_{\bullet}\|_{\mathbb{V}(\Omega)}\leq\eta_{\bullet}(u_{\bullet})={\rm LS}(\boldsymbol{u}_{\bullet};\boldsymbol{F})^{1/2}=\|\mathcal{L}(\boldsymbol{u}^{\star}-\boldsymbol{u}_{\bullet})\|_{\Omega}\leq C_{\rm cnt}\,\|\boldsymbol{u}^{\star}-\boldsymbol{u}_{\bullet}\|_{\mathbb{V}(\Omega)},

i.e., the least-squares functional provides a natural (localizable) error estimator which is reliable (upper bound $\|\boldsymbol{u}^{\star}-\boldsymbol{u}_{\bullet}\|_{\mathbb{V}(\Omega)}\lesssim\eta_{\bullet}(u_{\bullet})$) and efficient (lower bound $\eta_{\bullet}(u_{\bullet})\lesssim\|\boldsymbol{u}^{\star}-\boldsymbol{u}_{\bullet}\|_{\mathbb{V}(\Omega)}$) with respect to the error of any approximation $\boldsymbol{u}_{\bullet}\in\mathbb{V}_{\bullet}(\Omega)$ of $\boldsymbol{u}^{\star}$. In contrast to usual error estimators for standard Galerkin FEM, no data approximation term appears explicitly in the definition of the estimator, nor is one needed to prove the lower bound.

2.6. Marking strategy

We recall the following assumption from [33, Section 2.2.4] on the marking strategy, where $\boldsymbol{u}_{\bullet}\in\mathbb{V}_{\bullet}(\Omega)$ is a fixed approximation $\boldsymbol{u}_{\bullet}\approx\boldsymbol{u}^{\star}$:

  1. (M)

    There exists a fixed function g:00g\colon\mathbb{R}_{\geq 0}\to\mathbb{R}_{\geq 0} being continuous at 0=g(0)0=g(0) such that the set of marked elements 𝒯\mathcal{M}_{\bullet}\subseteq\mathcal{T}_{\bullet} (corresponding to 𝒖\boldsymbol{u}_{\bullet}) satisfies that

    \displaystyle\max_{T\in\mathcal{T}_{\bullet}\setminus\mathcal{M}_{\bullet}}\eta_{\bullet}(T,\boldsymbol{u}_{\bullet})\leq g\big(\max_{T\in\mathcal{M}_{\bullet}}\eta_{\bullet}(T,\boldsymbol{u}_{\bullet})\big).

We note that the marking assumption (M) is clearly satisfied with $g(s)=s$ if $\mathcal{M}_{\bullet}$ contains at least one element with maximal error indicator, i.e., there exists $T\in\mathcal{M}_{\bullet}$ such that $\eta_{\bullet}(T,u_{\bullet})\geq\eta_{\bullet}(T^{\prime},u_{\bullet})$ for all $T^{\prime}\in\mathcal{T}_{\bullet}$. We recall from [33, Section 4.1] that the latter is always satisfied for the following common marking strategies (so that the adaptivity parameter $\theta$ could even vary between the steps of the adaptive algorithm):

  • Maximum strategy: Given $0<\theta\leq 1$, the marked elements are determined by

    \displaystyle\mathcal{M}_{\bullet}=\big\{T\in\mathcal{T}_{\bullet}\,:\,\eta_{\bullet}(T,\boldsymbol{u}_{\bullet})\geq\theta\,M_{\bullet}\big\},\quad\text{where}\quad M_{\bullet}:=\max_{T\in\mathcal{T}_{\bullet}}\eta_{\bullet}(T,\boldsymbol{u}_{\bullet}).
  • Equilibration strategy: Given $0<\theta\leq 1$, the marked elements are determined by

    \displaystyle\mathcal{M}_{\bullet}=\big\{T\in\mathcal{T}_{\bullet}\,:\,\eta_{\bullet}(T,\boldsymbol{u}_{\bullet})^{2}\geq\theta\,\eta_{\bullet}(\boldsymbol{u}_{\bullet})^{2}/\#\mathcal{T}_{\bullet}\big\}.
  • Dörfler marking strategy: Given $0<\theta\leq 1$, the set $\mathcal{M}_{\bullet}\subseteq\mathcal{T}_{\bullet}$ is chosen such that

    \displaystyle\theta\,\eta_{\bullet}(\boldsymbol{u}_{\bullet})^{2}\leq\sum_{T\in\mathcal{M}_{\bullet}}\eta_{\bullet}(T,\boldsymbol{u}_{\bullet})^{2}\quad\text{and}\quad\max_{T\in\mathcal{T}_{\bullet}\setminus\mathcal{M}_{\bullet}}\eta_{\bullet}(T,\boldsymbol{u}_{\bullet})\leq\min_{T\in\mathcal{M}_{\bullet}}\eta_{\bullet}(T,\boldsymbol{u}_{\bullet}).

We note that the second condition on the Dörfler marking is not explicitly specified in [18], but usually satisfied if the implementation is based on sorting [31].
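To make the marking step concrete, the following minimal Python sketch implements the sorting-based Dörfler marking just described; the array eta_T collects the elementwise indicators $\eta_{\bullet}(T,\boldsymbol{u}_{\bullet})$, and the function name as well as the data layout are illustrative only (they are not taken from [31]).

import numpy as np

def doerfler_marking(eta_T, theta):
    # Sort the indicators in decreasing order, so that the marked set is a
    # prefix of the sorted list; this enforces the second condition
    # max_{T not in M} eta <= min_{T in M} eta automatically.
    order = np.argsort(eta_T)[::-1]
    cumulative = np.cumsum(eta_T[order] ** 2)
    # Smallest prefix whose squared sum reaches theta * eta(u)^2.
    n_marked = int(np.searchsorted(cumulative, theta * cumulative[-1])) + 1
    return order[:n_marked]

The maximum and equilibration strategies can be realized analogously with a single pass over eta_T, without any sorting.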

2.7. Convergent adaptive algorithm

Our first theorem states convergence of the following basic algorithm in the sense that the error as well as the error estimator are driven to zero.

Algorithm 1.

Input: Initial triangulation $\mathcal{T}_{0}$.
Loop: For all $\ell=0,1,2,\dots$, iterate the following steps (i)–(iv):

  • (i)

    SOLVE. Compute the exact least-squares solution $\boldsymbol{u}_{\ell}^{\star}\in\mathbb{V}_{\ell}(\Omega)$ by solving (3)–(4).

  • (ii)

    ESTIMATE. For all $T\in\mathcal{T}_{\ell}$, compute the contributions $\eta_{\ell}(T,\boldsymbol{u}_{\ell}^{\star})$ from (7).

  • (iii)

    MARK. Determine a set $\mathcal{M}_{\ell}\subseteq\mathcal{T}_{\ell}$ of marked elements satisfying (M) for $\boldsymbol{u}_{\ell}=\boldsymbol{u}_{\ell}^{\star}$.

  • (iv)

    REFINE. Generate a new mesh $\mathcal{T}_{\ell+1}:={\rm refine}(\mathcal{T}_{\ell},\mathcal{M}_{\ell})$.

Output: Sequences of approximations $\boldsymbol{u}_{\ell}^{\star}$ and corresponding error estimators $\eta_{\ell}(\boldsymbol{u}_{\ell}^{\star})$. ∎
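In pseudocode-like Python, the loop of Algorithm 1 has the following structure; the four callables are placeholders for a concrete LSFEM implementation and are not specified in this note.

def adaptive_lsfem(mesh, solve, estimate, mark, refine, n_steps=50):
    # solve(mesh)          -> exact least-squares solution u on V(mesh), step (i)
    # estimate(mesh, u)    -> elementwise indicators eta(T, u), step (ii)
    # mark(mesh, eta)      -> marked elements satisfying (M), step (iii)
    # refine(mesh, marked) -> refined mesh, e.g., by newest vertex bisection, step (iv)
    history = []
    for ell in range(n_steps):
        u = solve(mesh)
        eta = estimate(mesh, u)
        history.append((mesh, u, sum(e ** 2 for e in eta) ** 0.5))
        marked = mark(mesh, eta)
        mesh = refine(mesh, marked)
    return history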

The following theorem is our first main result.

Theorem 2.

Suppose the assumptions (A1)–(A4), (R1)–(R3), (S1)–(S3), and (M). In addition, we make the following assumption on the locality of the operator \mathcal{L}:

  1. (L)

    Local boundedness: There exists Cloc>0C_{\rm loc}>0 such that, for all 𝒯𝕋\mathcal{T}_{\bullet}\in\mathbb{T}, it holds that

    \displaystyle\|\mathcal{L}\boldsymbol{v}\|_{T}\leq C_{\rm loc}\,\|\boldsymbol{v}\|_{\mathbb{V}(\Omega_{\bullet}(T))}\quad\text{for all }\boldsymbol{v}\in\mathbb{V}(\Omega)\text{ and all }T\in\mathcal{T}_{\bullet},

    where Ω(T):={T𝒯:TT}Ω\Omega_{\bullet}(T):=\bigcup\big{\{}T^{\prime}\in\mathcal{T}_{\bullet}\,:\,T\cap T^{\prime}\neq\emptyset\big{\}}\subset\Omega denotes the patch of TT.

Then, Algorithm 1 generates a sequence of approximations $(\boldsymbol{u}_{\ell}^{\star})_{\ell\in\mathbb{N}_{0}}$ such that

(9) \displaystyle\|\boldsymbol{u}^{\star}-\boldsymbol{u}_{\ell}^{\star}\|_{\mathbb{V}(\Omega)}+\eta_{\ell}(\boldsymbol{u}_{\ell}^{\star})\to 0\quad\text{as}\quad\ell\to\infty.
Proof.

We only need to check that the assumptions of [33, Theorem 2.1] are satisfied. Our assumptions on marking strategy, refinement, and discrete spaces are the same as in [33, Section 2]. The bilinear form of the least-squares FEM is coercive on $\mathbb{V}(\Omega)$ and thus satisfies the uniform inf-sup conditions from [33, eq. (2.6)]. It thus only remains to verify [33, eq. (2.10)]: Given an element $\boldsymbol{w}\in\mathbb{V}(\Omega)$, define the residual $\boldsymbol{R}(\boldsymbol{w})\in\mathbb{V}(\Omega)^{\prime}$ by

(10) \displaystyle\langle\boldsymbol{R}(\boldsymbol{w})\,,\boldsymbol{v}\rangle:=F(\boldsymbol{v})-b(\boldsymbol{w},\boldsymbol{v})=b(\boldsymbol{u}^{\star}-\boldsymbol{w},\boldsymbol{v})\quad\text{for all }\boldsymbol{v}\in\mathbb{V}(\Omega).

Let 𝒯𝕋\mathcal{T}_{\bullet}\in\mathbb{T} and 𝒗𝕍(Ω)\boldsymbol{v}\in\mathbb{V}(\Omega). By definition and (L), we have that

\displaystyle\langle\boldsymbol{R}(\boldsymbol{u}_{\bullet}^{\star})\,,\boldsymbol{v}\rangle=(\boldsymbol{F}-\mathcal{L}\boldsymbol{u}_{\bullet}^{\star}\,,\mathcal{L}\boldsymbol{v})_{\Omega}=\sum_{T\in\mathcal{T}_{\bullet}}(\boldsymbol{F}-\mathcal{L}\boldsymbol{u}_{\bullet}^{\star}\,,\mathcal{L}\boldsymbol{v})_{T}\leq C_{\rm loc}\sum_{T\in\mathcal{T}_{\bullet}}\eta_{\bullet}(T,\boldsymbol{u}_{\bullet}^{\star})\,\|\boldsymbol{v}\|_{\mathbb{V}(\Omega_{\bullet}(T))},

which is [33, eq. (2.10a)]. Moreover, it holds that

\displaystyle\eta_{\bullet}(T,\boldsymbol{u}_{\bullet}^{\star})\leq\|\boldsymbol{F}\|_{T}+C_{\rm loc}\,\|\boldsymbol{u}_{\bullet}^{\star}\|_{\mathbb{V}(\Omega_{\bullet}(T))}\quad\text{for all }T\in\mathcal{T}_{\bullet},

which is [33, eq. (2.10b)] and hence concludes the verification of [33, eq. (2.10)]. Altogether, [33, Theorem 2.1] applies and proves that $\|\boldsymbol{u}^{\star}-\boldsymbol{u}_{\ell}^{\star}\|_{\mathbb{V}(\Omega)}\to 0$ as $\ell\to\infty$. Since $\eta_{\ell}(\boldsymbol{u}_{\ell}^{\star})\simeq\|\boldsymbol{u}^{\star}-\boldsymbol{u}_{\ell}^{\star}\|_{\mathbb{V}(\Omega)}$, we also have that $\eta_{\ell}(\boldsymbol{u}_{\ell}^{\star})\to 0$ as $\ell\to\infty$. ∎

Remark 3.

Note that the locality assumption (L) also directly implies local efficiency

\displaystyle\eta_{\bullet}(T,u_{\bullet})=\|\mathcal{L}(\boldsymbol{u}^{\star}-\boldsymbol{u}_{\bullet})\|_{T}\leq C_{\rm loc}\,\|\boldsymbol{u}^{\star}-\boldsymbol{u}_{\bullet}\|_{\mathbb{V}(\Omega_{\bullet}(T))}\quad\text{for all }T\in\mathcal{T}_{\bullet}\text{ and all }\boldsymbol{u}_{\bullet}\in\mathbb{V}_{\bullet}(\Omega).

Again, we stress that this local lower bound does not include data approximation terms.

2.8. Convergence in the case of inexact solvers

For given $\mathcal{T}_{\bullet}\in\mathbb{T}$, the exact computation of the least-squares solution $\boldsymbol{u}_{\bullet}^{\star}\in\mathbb{V}_{\bullet}(\Omega)$ corresponds to the solution of a symmetric and positive definite algebraic system; see (3)–(4). Let us assume that we have a contractive iterative solver at hand:

  1. (C)

    Contractive iterative solver: There exists 0<qctr<10<q_{\rm ctr}<1 as well as an equivalent norm |||||||||\cdot||| on 𝕍(Ω)\mathbb{V}(\Omega) such that, for all 𝒯𝕋\mathcal{T}_{\bullet}\in\mathbb{T}, there exists Φ:𝕍(Ω)𝕍(Ω)\Phi_{\bullet}\colon\mathbb{V}_{\bullet}(\Omega)\to\mathbb{V}_{\bullet}(\Omega) with

    \displaystyle|||\boldsymbol{u}_{\bullet}^{\star}-\Phi_{\bullet}(\boldsymbol{v}_{\bullet})|||\leq q_{\rm ctr}\,|||\boldsymbol{u}_{\bullet}^{\star}-\boldsymbol{v}_{\bullet}|||\quad\text{for all }\boldsymbol{v}_{\bullet}\in\mathbb{V}_{\bullet}(\Omega).

Under this additional assumption, the following adaptive strategy steers adaptive mesh-refinement as well as the iterative solver, where we employ nested iteration in Algorithm 4(iv) to lower the number of solver steps.

Algorithm 4.

Input: Initial triangulation $\mathcal{T}_{0}$, initial guess $\boldsymbol{u}_{0,0}\in\mathbb{V}_{0}(\Omega)$.
Loop: For all $\ell=0,1,2,\dots$, iterate the following steps (i)–(iv):

  • (i)

    INEXACT SOLVE. Starting from 𝒖,0\boldsymbol{u}_{\ell,0}, do at least one step of the iterative solver

    \displaystyle\boldsymbol{u}_{\ell,n}:=\Phi_{\ell}(\boldsymbol{u}_{\ell,n-1})\quad\text{for all }n=1,\dots,\underline{n}=\underline{n}(\ell)\geq 1.
  • (ii)

    ESTIMATE. For all T𝒯T\in\mathcal{T}_{\ell}, compute the contributions η(T,𝒖,n¯)\eta_{\ell}(T,\boldsymbol{u}_{\ell,\underline{n}}) from (7).

  • (iii)

    MARK. Determine a set 𝒯\mathcal{M}_{\ell}\subseteq\mathcal{T}_{\ell} of marked elements satisfying (M) for 𝒖=𝒖,n¯\boldsymbol{u}_{\ell}=\boldsymbol{u}_{\ell,\underline{n}}.

  • (iv)

    REFINE. Generate a new mesh 𝒯+1:=refine(𝒯,)\mathcal{T}_{\ell+1}:={\rm refine}(\mathcal{T}_{\ell},\mathcal{M}_{\ell}) and define 𝒖+1,0:=𝒖,n¯\boldsymbol{u}_{\ell+1,0}:=\boldsymbol{u}_{\ell,\underline{n}}.

Output: Sequences of approximations $\boldsymbol{u}_{\ell,\underline{n}}$ and corresponding error estimators $\eta_{\ell}(\boldsymbol{u}_{\ell,\underline{n}})$. ∎

Remark 5.

For the plain convergence result of the following theorem, one solver step (i.e., n¯()=1\underline{n}(\ell)=1 for all 0\ell\in\mathbb{N}_{0}) is indeed sufficient. In practice, the steps INEXACT SOLVE and ESTIMATE in Algorithm 4(i)–(ii) are usually combined. With some parameter λ>0\lambda>0, a natural stopping criterion for the iterative solver reads

\displaystyle|||\boldsymbol{u}_{\ell,\underline{n}}-\boldsymbol{u}_{\ell,\underline{n}-1}|||\leq\lambda\,\eta_{\ell}(\boldsymbol{u}_{\ell-1,\underline{n}})\quad\text{resp.}\quad|||\boldsymbol{u}_{\ell,\underline{n}}-\boldsymbol{u}_{\ell,\underline{n}-1}|||\leq\lambda\,\eta_{\ell}(\boldsymbol{u}_{\ell,\underline{n}}).

For adaptive FEM based on locally weighted estimators, we refer, e.g., to [1, 21] for linear convergence of such a strategy (based on the first criterion) and to [20, 23] for linear convergence with optimal rates (based on the second criterion for sufficiently small $0<\lambda\ll 1$). Moreover, we note that the second criterion even allows one to prove convergence with optimal rates with respect to the computational costs [22].
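A sketch of the combined INEXACT SOLVE and ESTIMATE step with the second stopping criterion might read as follows; solver_step plays the role of the contraction $\Phi_{\ell}$ from (C), estimate returns the elementwise indicators, and norm realizes $|||\cdot|||$, all of which are placeholders here.

def inexact_solve(u_prev, solver_step, estimate, norm, lam=0.1, n_max=1000):
    # Iterate u_n = Phi(u_{n-1}) until |||u_n - u_{n-1}||| <= lam * eta(u_n);
    # at least one solver step is performed, as required in Algorithm 4(i).
    for n in range(1, n_max + 1):
        u = solver_step(u_prev)
        eta = estimate(u)
        if norm(u - u_prev) <= lam * sum(e ** 2 for e in eta) ** 0.5:
            break
        u_prev = u
    return u, eta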

Our second theorem states convergence of Algorithm 4 in the sense that, also in the case of inexact solution of the least-squares systems, the error as well as the error estimator are driven to zero.

Theorem 6.

Suppose the assumptions (A1)–(A4), (R1)–(R3), (S1)–(S3), (M), (L), and (C). Then, Algorithm 4 generates a sequence of approximations $(\boldsymbol{u}_{\ell,\underline{n}})_{\ell\in\mathbb{N}_{0}}$ such that

(11) \displaystyle\|\boldsymbol{u}^{\star}-\boldsymbol{u}_{\ell,\underline{n}}\|_{\mathbb{V}(\Omega)}+\eta_{\ell}(\boldsymbol{u}_{\ell,\underline{n}})\to 0\quad\text{as}\quad\ell\to\infty.

The proof of Theorem 6 requires some preparations. First, we recall [33, Lemma 3.1], which already dates back to the seminal work [3].

Lemma 7.

Suppose assumptions (A1)–(A2) and (S1)–(S2). Let $(\boldsymbol{u}_{\ell}^{\star})_{\ell\in\mathbb{N}_{0}}$ denote the sequence of exact least-squares solutions obtained by solving (3)–(4) for the meshes generated by Algorithm 4. Then,

(12) \displaystyle\|\boldsymbol{u}_{\infty}^{\star}-\boldsymbol{u}_{\ell}^{\star}\|_{\mathbb{V}(\Omega)}\to 0\quad\text{as }\ell\to\infty,

where $\boldsymbol{u}_{\infty}^{\star}\in\mathbb{V}_{\infty}(\Omega):=\overline{\bigcup_{\ell\in\mathbb{N}_{0}}\mathbb{V}_{\ell}(\Omega)}\subseteq\mathbb{V}(\Omega)$ solves (and is, in fact, the unique solution of)

\displaystyle b(\boldsymbol{u}_{\infty}^{\star},\boldsymbol{v}_{\infty})=F(\boldsymbol{v}_{\infty})\quad\text{for all }\boldsymbol{v}_{\infty}\in\mathbb{V}_{\infty}(\Omega).\qquad\qed

The next lemma shows that the inexact least-squares solutions indeed converge to the same limit as the exact least-squares solutions.

Lemma 8.

In addition to the assumptions of Lemma 7, suppose assumption (C). Let $(\boldsymbol{u}_{\ell,\underline{n}})_{\ell\in\mathbb{N}_{0}}$ denote the (final) approximations computed in Algorithm 4. Then,

(13) \displaystyle\|\boldsymbol{u}_{\infty}^{\star}-\boldsymbol{u}_{\ell,\underline{n}}\|_{\mathbb{V}(\Omega)}\to 0\quad\text{as }\ell\to\infty,

where 𝐮𝕍(Ω)\boldsymbol{u}_{\infty}^{\star}\in\mathbb{V}(\Omega) is the limit from Lemma 7.

Proof.

Recall the norm |||||||||\cdot||| from (C). Since n¯(+1)1\underline{n}(\ell+1)\geq 1 and 𝒖+1,0=𝒖,n¯\boldsymbol{u}_{\ell+1,0}=\boldsymbol{u}_{\ell,\underline{n}}, it holds that

\displaystyle|||\boldsymbol{u}_{\ell+1}^{\star}-\boldsymbol{u}_{\ell+1,\underline{n}}|||\leq q_{\rm ctr}^{\underline{n}(\ell+1)}\,|||\boldsymbol{u}_{\ell+1}^{\star}-\boldsymbol{u}_{\ell+1,0}|||\leq q_{\rm ctr}\,|||\boldsymbol{u}_{\ell}^{\star}-\boldsymbol{u}_{\ell,\underline{n}}|||+q_{\rm ctr}\,|||\boldsymbol{u}_{\ell+1}^{\star}-\boldsymbol{u}_{\ell}^{\star}|||.

Let α:=|𝒖𝒖,n¯|0\alpha_{\ell}:=|||\boldsymbol{u}_{\ell}^{\star}-\boldsymbol{u}_{\ell,\underline{n}}|||\geq 0. Then, the latter estimate takes the form

\displaystyle 0\leq\alpha_{\ell+1}\leq q_{\rm ctr}\,\alpha_{\ell}+|||\boldsymbol{u}_{\ell+1}^{\star}-\boldsymbol{u}_{\ell}^{\star}|||,\quad\text{where}\quad\lim_{\ell\to\infty}|||\boldsymbol{u}_{\ell+1}^{\star}-\boldsymbol{u}_{\ell}^{\star}|||=0.

Elementary calculus (see, e.g., the estimator reduction in [13, Corollary 4.8]) shows that

\displaystyle\|\boldsymbol{u}_{\ell}^{\star}-\boldsymbol{u}_{\ell,\underline{n}}\|_{\mathbb{V}(\Omega)}\simeq|||\boldsymbol{u}_{\ell}^{\star}-\boldsymbol{u}_{\ell,\underline{n}}|||=\alpha_{\ell}\to 0\quad\text{as }\ell\to\infty.
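For the convenience of the reader, the elementary argument reads as follows: given $\varepsilon>0$, choose $\ell_{0}\in\mathbb{N}_{0}$ such that $|||\boldsymbol{u}_{\ell+1}^{\star}-\boldsymbol{u}_{\ell}^{\star}|||\leq\varepsilon$ for all $\ell\geq\ell_{0}$. Induction on the recursion then shows that

\displaystyle\alpha_{\ell_{0}+m}\leq q_{\rm ctr}^{m}\,\alpha_{\ell_{0}}+\frac{\varepsilon}{1-q_{\rm ctr}}\quad\text{for all }m\in\mathbb{N}_{0},

and hence $\limsup_{\ell\to\infty}\alpha_{\ell}\leq\varepsilon/(1-q_{\rm ctr})$. Since $\varepsilon>0$ was arbitrary, this proves $\alpha_{\ell}\to 0$.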

Overall, we thus see that

\displaystyle\|\boldsymbol{u}_{\infty}^{\star}-\boldsymbol{u}_{\ell,\underline{n}}\|_{\mathbb{V}(\Omega)}\leq\|\boldsymbol{u}_{\infty}^{\star}-\boldsymbol{u}_{\ell}^{\star}\|_{\mathbb{V}(\Omega)}+\|\boldsymbol{u}_{\ell}^{\star}-\boldsymbol{u}_{\ell,\underline{n}}\|_{\mathbb{V}(\Omega)}\to 0\quad\text{as }\ell\to\infty.

This concludes the proof. ∎

Proof of Theorem 6.

We verify the validity of the building blocks of the proof of [33, Theorem 2.1] in the case of inexact solutions. This basically follows from the convergence |𝒖𝒖,n¯|0|||\boldsymbol{u}_{\infty}^{\star}-\boldsymbol{u}_{\ell,\underline{n}}|||\to 0 shown in Lemma 8. Indeed, a closer look unveils that we only need to verify [33, Lemma 3.5–3.6 and Proposition 3.7] which we do in the following steps:

Step 1 (Uniform boundedness): From (A1), (A3), and Lemma 8, it follows that

\displaystyle\eta_{\ell}(\boldsymbol{u}_{\ell,\underline{n}})=\|\boldsymbol{F}-\mathcal{L}\boldsymbol{u}_{\ell,\underline{n}}\|_{\Omega}\leq\|\boldsymbol{F}-\mathcal{L}\boldsymbol{u}_{\infty}^{\star}\|_{\Omega}+C_{\rm cnt}\,\|\boldsymbol{u}_{\infty}^{\star}-\boldsymbol{u}_{\ell,\underline{n}}\|_{\mathbb{V}(\Omega)}

and hence $\sup_{\ell\in\mathbb{N}_{0}}\eta_{\ell}(\boldsymbol{u}_{\ell,\underline{n}})<\infty$.

Step 2 (Convergence of estimator on marked elements): With (A1) and (A3), it holds that

\displaystyle\eta_{\ell}(T,\boldsymbol{u}_{\ell,\underline{n}})=\|\boldsymbol{F}-\mathcal{L}\boldsymbol{u}_{\ell,\underline{n}}\|_{T}\leq\|\boldsymbol{F}-\mathcal{L}\boldsymbol{u}_{\infty}^{\star}\|_{T}+C_{\rm cnt}\,\|\boldsymbol{u}_{\infty}^{\star}-\boldsymbol{u}_{\ell,\underline{n}}\|_{\mathbb{V}(\Omega)}.

By Lemma 8, the second term tends to zero. Arguing as in the proof of [33, Lemma 3.6] and hence exploiting (A4) and (R1)–(R3), we see that

\displaystyle\lim_{\ell\to\infty}\max\big\{\eta_{\ell}(T,\boldsymbol{u}_{\ell,\underline{n}})\,:\,T\in\mathcal{M}_{\ell}\big\}=0.

Step 3 (Weak convergence of residual): We tweak Step 2 in the proof of [33, Proposition 3.7]: Let 𝒗𝔻(Ω)\boldsymbol{v}\in\mathbb{D}(\Omega) and recall that 𝔻(Ω)𝕍(Ω)\mathbb{D}(\Omega)\subset\mathbb{V}(\Omega) is dense by assumption (S3). Recall the residual from (10). For the exact least-squares solution 𝒖𝕍(Ω)\boldsymbol{u}_{\ell}^{\star}\in\mathbb{V}_{\ell}(\Omega), we may use Galerkin orthogonality (6) and local boundedness (L) to see that

\displaystyle|\langle\boldsymbol{R}(\boldsymbol{u}_{\ell,\underline{n}})\,,\boldsymbol{v}\rangle|=|b(\boldsymbol{u}^{\star}-\boldsymbol{u}_{\ell,\underline{n}},\boldsymbol{v}-\mathcal{A}_{\ell}\boldsymbol{v})+b(\boldsymbol{u}^{\star}-\boldsymbol{u}_{\ell,\underline{n}},\mathcal{A}_{\ell}\boldsymbol{v})|
\stackrel{\text{(6)}}{=}|b(\boldsymbol{u}^{\star}-\boldsymbol{u}_{\ell,\underline{n}},\boldsymbol{v}-\mathcal{A}_{\ell}\boldsymbol{v})+b(\boldsymbol{u}_{\ell}^{\star}-\boldsymbol{u}_{\ell,\underline{n}},\mathcal{A}_{\ell}\boldsymbol{v})|
\stackrel{\text{(A1)}}{\lesssim}|(\mathcal{L}(\boldsymbol{u}^{\star}-\boldsymbol{u}_{\ell,\underline{n}})\,,\mathcal{L}(\boldsymbol{v}-\mathcal{A}_{\ell}\boldsymbol{v}))_{\Omega}|+\|\boldsymbol{u}_{\ell}^{\star}-\boldsymbol{u}_{\ell,\underline{n}}\|_{\mathbb{V}(\Omega)}\|\mathcal{A}_{\ell}\boldsymbol{v}\|_{\mathbb{V}(\Omega)}
\stackrel{\text{(L)}}{\lesssim}\sum_{T\in\mathcal{T}_{\ell}}\|\mathcal{L}(\boldsymbol{u}^{\star}-\boldsymbol{u}_{\ell,\underline{n}})\|_{T}\|\boldsymbol{v}-\mathcal{A}_{\ell}\boldsymbol{v}\|_{\mathbb{V}(\Omega_{\ell}(T))}+\|\boldsymbol{u}_{\ell}^{\star}-\boldsymbol{u}_{\ell,\underline{n}}\|_{\mathbb{V}(\Omega)}\|\mathcal{A}_{\ell}\boldsymbol{v}\|_{\mathbb{V}(\Omega)}
=\sum_{T\in\mathcal{T}_{\ell}}\eta_{\ell}(T,\boldsymbol{u}_{\ell,\underline{n}})\|\boldsymbol{v}-\mathcal{A}_{\ell}\boldsymbol{v}\|_{\mathbb{V}(\Omega_{\ell}(T))}+\|\boldsymbol{u}_{\ell}^{\star}-\boldsymbol{u}_{\ell,\underline{n}}\|_{\mathbb{V}(\Omega)}\|\mathcal{A}_{\ell}\boldsymbol{v}\|_{\mathbb{V}(\Omega)}.

For fixed 𝒗𝔻(Ω)\boldsymbol{v}\in\mathbb{D}(\Omega), it holds that 𝒖𝒖,n¯𝕍(Ω)𝒜𝒗𝕍(Ω)0\|\boldsymbol{u}_{\ell}^{\star}-\boldsymbol{u}_{\ell,\underline{n}}\|_{\mathbb{V}(\Omega)}\|\mathcal{A}_{\ell}\boldsymbol{v}\|_{\mathbb{V}(\Omega)}\to 0 by (S3) and Lemma 8. The sum on the right-hand side can be dealt with as in [33, Proposition 3.7], and we conclude that

\displaystyle\lim_{\ell\to\infty}\langle\boldsymbol{R}(\boldsymbol{u}_{\ell,\underline{n}})\,,\boldsymbol{v}\rangle=0\quad\text{for all }\boldsymbol{v}\in\mathbb{D}(\Omega).

Step 4 (Convergence of inexact solutions and estimators): With the auxiliary results established above, we can follow the proof of [33, Theorem 2.1] step by step to deduce that 𝒖𝒖,n¯𝕍(Ω)0\|\boldsymbol{u}^{\star}-\boldsymbol{u}_{\ell,\underline{n}}\|_{\mathbb{V}(\Omega)}\to 0 as \ell\to\infty. Finally, the equivalence 𝒖𝒖,n¯𝕍(Ω)η(𝒖,n¯)\|\boldsymbol{u}^{\star}-\boldsymbol{u}_{\ell,\underline{n}}\|_{\mathbb{V}(\Omega)}\simeq\eta_{\ell}(\boldsymbol{u}_{\ell,\underline{n}}) concludes the proof. ∎

2.9. Preconditioned conjugate gradient method (PCG)

We recall a well-known result which follows from [25, Theorem 11.3.3]. The presented form is, e.g., found in [20, Lemma 1].

Lemma 9.

Let 𝐀,𝐏N×N\boldsymbol{A}_{\bullet},\boldsymbol{P}_{\bullet}\in\mathbb{R}^{N\times N} be symmetric and positive definite, 𝐛N\boldsymbol{b}_{\bullet}\in\mathbb{R}^{N}, 𝐱:=𝐀1𝐛\boldsymbol{x}_{\bullet}^{\star}:=\boldsymbol{A}_{\bullet}^{-1}\boldsymbol{b}_{\bullet}, and 𝐱,0N\boldsymbol{x}_{\bullet,0}\in\mathbb{R}^{N}. Suppose the 2\ell_{2}-condition number estimate

(14) \displaystyle{\rm cond}_{2}(\boldsymbol{P}_{\bullet}^{-1/2}\boldsymbol{A}_{\bullet}\boldsymbol{P}_{\bullet}^{-1/2})\leq C_{\rm pcg}.

Then, the iterates 𝐱,n\boldsymbol{x}_{\bullet,n} of the PCG algorithm satisfy the contraction property

(15) \displaystyle\|\boldsymbol{x}_{\bullet}^{\star}-\boldsymbol{x}_{\bullet,n+1}\|_{\boldsymbol{A}_{\bullet}}\leq q_{\rm ctr}\,\|\boldsymbol{x}_{\bullet}^{\star}-\boldsymbol{x}_{\bullet,n}\|_{\boldsymbol{A}_{\bullet}}\quad\text{for all }n\in\mathbb{N}_{0},

where qctr:=(11/Cpcg)1/2<1q_{\rm ctr}:=(1-1/C_{\rm pcg})^{1/2}<1 and 𝐲𝐀2:=𝐲𝐀𝐲\|\boldsymbol{y}_{\bullet}\|_{\boldsymbol{A}_{\bullet}}^{2}:=\boldsymbol{y}_{\bullet}\cdot\boldsymbol{A}_{\bullet}\boldsymbol{y}_{\bullet}  for 𝐲N\boldsymbol{y}_{\bullet}\in\mathbb{R}^{N}.∎

Let {ϕ,1,,ϕ,N}\{\boldsymbol{\phi}_{\bullet,1},\dots,\boldsymbol{\phi}_{\bullet,N}\} denote a basis of 𝕍(Ω)\mathbb{V}_{\bullet}(\Omega). Denote with 𝑨\boldsymbol{A}_{\bullet} the Galerkin matrix and with 𝒃\boldsymbol{b}_{\bullet} the right-hand side of the least-squares method with respect to that basis, i.e., 𝑨[j,k]=b(ϕ,k,ϕ,j)\boldsymbol{A}_{\bullet}[j,k]=b(\boldsymbol{\phi}_{\bullet,k},\boldsymbol{\phi}_{\bullet,j}) and 𝒃[j]=F(ϕ,j)\boldsymbol{b}_{\bullet}[j]=F(\boldsymbol{\phi}_{\bullet,j}).

There is a one-to-one relation between vectors 𝒚N\boldsymbol{y}_{\bullet}\in\mathbb{R}^{N} and functions 𝒗𝕍(Ω)\boldsymbol{v}_{\bullet}\in\mathbb{V}_{\bullet}(\Omega) given by 𝒗=j=1N𝒚[j]ϕ,j\boldsymbol{v}_{\bullet}=\sum_{j=1}^{N}\boldsymbol{y}_{\bullet}[j]\boldsymbol{\phi}_{\bullet,j}. Let 𝒖,n𝕍(Ω)\boldsymbol{u}_{\bullet,n}\in\mathbb{V}_{\bullet}(\Omega) denote the function corresponding to the iterate 𝒙,n\boldsymbol{x}_{\bullet,n}. We note that the least-squares solution 𝒖𝕍(Ω)\boldsymbol{u}_{\bullet}^{\star}\in\mathbb{V}_{\bullet}(\Omega) corresponds to the coefficient vector 𝒙:=𝑨1𝒃\boldsymbol{x}_{\bullet}^{\star}:=\boldsymbol{A}_{\bullet}^{-1}\boldsymbol{b}_{\bullet}. With the elementary identity

\displaystyle|||\boldsymbol{u}_{\bullet}^{\star}-\boldsymbol{u}_{\bullet,n}|||^{2}:=\|\mathcal{L}(\boldsymbol{u}_{\bullet}^{\star}-\boldsymbol{u}_{\bullet,n})\|^{2}=b(\boldsymbol{u}_{\bullet}^{\star}-\boldsymbol{u}_{\bullet,n},\boldsymbol{u}_{\bullet}^{\star}-\boldsymbol{u}_{\bullet,n})=\|\boldsymbol{x}_{\bullet}^{\star}-\boldsymbol{x}_{\bullet,n}\|_{\boldsymbol{A}_{\bullet}}^{2},

the contraction property (15) thus reads

(16) \displaystyle|||\boldsymbol{u}_{\bullet}^{\star}-\boldsymbol{u}_{\bullet,n+1}|||\leq q_{\rm ctr}\,|||\boldsymbol{u}_{\bullet}^{\star}-\boldsymbol{u}_{\bullet,n}|||\quad\text{for all }n\in\mathbb{N}_{0}.

We make the following assumption on the preconditioner.

  1. (P)

    Optimal preconditioner: For all 𝒯𝕋\mathcal{T}_{\bullet}\in\mathbb{T}, there exists a symmetric preconditioner 𝑷\boldsymbol{P}_{\bullet} of the Galerkin matrix 𝑨\boldsymbol{A}_{\bullet} such that the constant in (14) depends only on the initial mesh 𝒯0\mathcal{T}_{0}.

Under this assumption, PCG fits into the abstract framework from Section 2.8.
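For orientation, a minimal Python sketch of the iteration map $\Phi_{\bullet}$ realized by PCG on the coefficient vectors might look as follows; A and b are the Galerkin matrix and right-hand side from above, and apply_Pinv is a placeholder for the action of $\boldsymbol{P}_{\bullet}^{-1}$ (e.g., one cycle of an optimal multilevel preconditioner).

import numpy as np

def pcg_steps(A, b, x0, apply_Pinv, n_steps=1):
    # n_steps PCG iterations for A x = b starting from x0; with n_steps = n,
    # this realizes x_{bullet,n} from Lemma 9, i.e., n applications of Phi.
    x = x0.copy()
    r = b - A @ x              # residual
    z = apply_Pinv(r)          # preconditioned residual
    p = z.copy()               # search direction
    rz = r @ z
    for _ in range(n_steps):
        if rz == 0.0:          # exact solution already reached
            break
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        z = apply_Pinv(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

By (15)–(16), each iteration contracts the error $|||\boldsymbol{u}_{\bullet}^{\star}-\boldsymbol{u}_{\bullet,n}|||=\|\boldsymbol{x}_{\bullet}^{\star}-\boldsymbol{x}_{\bullet,n}\|_{\boldsymbol{A}_{\bullet}}$ by at least the factor $q_{\rm ctr}$, provided that the preconditioner satisfies (P); a simple Jacobi preconditioner, say apply_Pinv = lambda r: r / A.diagonal(), would yield a contraction as well, but in general not with a mesh-independent constant.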

3. Examples

In this section, we consider some common model problems. Throughout, we assume that one of the marking strategies from Section 2.6 is used. Moreover, we assume that $\mathcal{T}_{0}$ is a conforming simplicial triangulation of some bounded Lipschitz domain $\Omega\subset\mathbb{R}^{d}$ with $d=2,3$ and that NVB is used for mesh-refinement (see Section 2.3). In particular, the assumptions (M) as well as (R1)–(R3) hold. To conclude convergence of adaptive LSFEM, it therefore remains to show that the assumptions (A1)–(A4), (L), and (S1)–(S3) are satisfied for the following examples.

3.1. Poisson problem

For given $f\in L^{2}(\Omega)$, consider the Poisson problem

(17a) \displaystyle-\Delta u=f\quad\text{in }\Omega,
(17b) \displaystyle u=0\quad\text{on }\Gamma:=\partial\Omega.

With the substitution ${\boldsymbol{\sigma}}:=\nabla u$, this is equivalently reformulated as a first-order system

(18a) \displaystyle-{\rm div\,}{\boldsymbol{\sigma}}=f\quad\text{in }\Omega,
(18b) \displaystyle\nabla u-{\boldsymbol{\sigma}}=0\quad\text{in }\Omega,
(18c) \displaystyle u=0\quad\text{on }\Gamma.

With the Hilbert space

(19) \displaystyle\mathbb{V}(\Omega):=H_{0}^{1}(\Omega)\times\boldsymbol{H}({\rm div\,};\Omega),

the first-order system (18) can equivalently be recast in the abstract form (1), i.e.,

(20) \displaystyle\mathcal{L}\begin{pmatrix}u\\ {\boldsymbol{\sigma}}\end{pmatrix}:=\begin{pmatrix}-{\rm div\,}{\boldsymbol{\sigma}}\\ \nabla u-{\boldsymbol{\sigma}}\end{pmatrix}=\begin{pmatrix}f\\ \boldsymbol{0}\end{pmatrix}=:\boldsymbol{F}\in L^{2}(\Omega)^{d+1}.

It is well-known (see, e.g., the textbook [5]) that :𝕍(Ω)L2(Ω)d+1\mathcal{L}\colon\operatorname{\mathbb{V}}(\Omega)\to L^{2}(\Omega)^{d+1} is an isomorphism, so that (A1)–(A2) are guaranteed. Clearly, (A3)–(A4) are satisfied, since 𝕍(Ω)\|\cdot\|_{\operatorname{\mathbb{V}}(\Omega)} relies on the Lebesgue norm Ω\|\cdot\|_{\Omega}. Moreover, assumption (L) follows from

\displaystyle\|\mathcal{L}(v,{\boldsymbol{\tau}})\|_{T}^{2}=\|{\rm div\,}{\boldsymbol{\tau}}\|_{T}^{2}+\|\nabla v-{\boldsymbol{\tau}}\|_{T}^{2}\lesssim\|\nabla v\|_{T}^{2}+\|v\|_{T}^{2}+\|{\rm div\,}{\boldsymbol{\tau}}\|_{T}^{2}+\|{\boldsymbol{\tau}}\|_{T}^{2}=\|(v,{\boldsymbol{\tau}})\|_{\mathbb{V}(T)}^{2}.
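In this setting, the abstract element contributions from (7) take the concrete, elementwise computable form

\displaystyle\eta_{\bullet}\big(T,(u_{\bullet},{\boldsymbol{\sigma}}_{\bullet})\big)^{2}=\|f+{\rm div\,}{\boldsymbol{\sigma}}_{\bullet}\|_{T}^{2}+\|\nabla u_{\bullet}-{\boldsymbol{\sigma}}_{\bullet}\|_{T}^{2}\quad\text{for all }T\in\mathcal{T}_{\bullet},

i.e., the estimator is obtained by simply evaluating the least-squares functional elementwise for the discrete pair $(u_{\bullet},{\boldsymbol{\sigma}}_{\bullet})\in\mathbb{V}_{\bullet}(\Omega)$.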

A common FEM discretization of the energy space (19) involves the conforming subspace

(21) \displaystyle\mathbb{V}_{\bullet}(\Omega):=S_{0}^{k+1}(\mathcal{T}_{\bullet})\times\boldsymbol{RT}^{k}(\mathcal{T}_{\bullet})\subset\mathbb{V}(\Omega)

where S0k+1(𝒯)H01(Ω)S_{0}^{k+1}(\mathcal{T}_{\bullet})\subseteq H^{1}_{0}(\Omega) is the usual Courant FEM space of order k+1k+1 and 𝑹𝑻k(𝒯)𝑯(div;Ω)\boldsymbol{RT}^{k}(\mathcal{T}_{\bullet})\subset\boldsymbol{H}({\rm div\,};\Omega) is the Raviart–Thomas FEM space of order k0k\in\mathbb{N}_{0}; see, e.g., [6]. Clearly, there hold (S1)–(S2), since 𝒯refine(𝒯)\mathcal{T}_{\circ}\in\operatorname{refine}(\mathcal{T}_{\bullet}) yields that 𝕍(Ω)𝕍(Ω)𝕍(Ω)\mathbb{V}_{\bullet}(\Omega)\subseteq\operatorname{\mathbb{V}}_{\circ}(\Omega)\subset\operatorname{\mathbb{V}}(\Omega).

It remains to show the local approximation property (S3). To that end, recall that Cc(Ω¯)C_{c}^{\infty}(\overline{\Omega}) is dense in H01(Ω)H^{1}_{0}(\Omega) and C(Ω¯)dC^{\infty}(\overline{\Omega})^{d} is dense in 𝑯(div;Ω)\boldsymbol{H}({\rm div\,};\Omega); see, e.g., [24, Theorem 2.4]. Therefore,

\displaystyle C_{c}^{\infty}(\overline{\Omega})\times C^{\infty}(\overline{\Omega})^{d}\subset\mathbb{D}(\Omega):=\big[H^{2}(\Omega)\cap H^{1}_{0}(\Omega)\big]\times H^{2}(\Omega)^{d}\subset\mathbb{V}(\Omega)\text{ is dense}.

Let :H2(Ω)S1(𝒯)\mathcal{I}_{\bullet}\colon H^{2}(\Omega)\to S^{1}(\mathcal{T}_{\bullet}) be the nodal interpolation operator onto the first-order Courant space. Then, it holds that

\displaystyle\|v-\mathcal{I}_{\bullet}v\|_{H^{1}(T)}\lesssim h_{T}\|v\|_{H^{2}(T)}\quad\text{for all }v\in H^{2}(\Omega)\cap H^{1}_{0}(\Omega)\text{ and all }T\in\mathcal{T}_{\bullet}.

Let :H1(Ω)𝑹𝑻0(𝒯)\mathcal{R}_{\bullet}\colon H^{1}(\Omega)\to\boldsymbol{RT}^{0}(\mathcal{T}_{\bullet}) be the Raviart–Thomas projector onto the lowest-order Raviart–Thomas space. Then, it holds that

\displaystyle\|{\boldsymbol{\tau}}-\mathcal{R}_{\bullet}{\boldsymbol{\tau}}\|_{\boldsymbol{H}({\rm div\,};T)}\lesssim h_{T}\|{\boldsymbol{\tau}}\|_{H^{2}(T)}\quad\text{for all }{\boldsymbol{\tau}}\in H^{2}(\Omega)^{d}\text{ and all }T\in\mathcal{T}_{\bullet}.

Overall, the approximation operator $\mathcal{A}_{\bullet}\colon\mathbb{D}(\Omega)\to\mathbb{V}_{\bullet}(\Omega)$ defined by $\mathcal{A}_{\bullet}(v,{\boldsymbol{\tau}}):=(\mathcal{I}_{\bullet}v,\mathcal{R}_{\bullet}{\boldsymbol{\tau}})$ satisfies (S3) with $s=1$.

3.2. General second-order problem

Given fL2(Ω)f\in L^{2}(\Omega), we consider the general second-order linear elliptic PDE

(22a) \displaystyle-{\rm div\,}(\boldsymbol{A}\nabla u)+\boldsymbol{b}\cdot\nabla u+c\,u=f\quad\text{in }\Omega,
(22b) \displaystyle u=0\quad\text{on }\Gamma,

where 𝑨jk,𝒃j,cL(Ω)\boldsymbol{A}_{jk},\,\boldsymbol{b}_{j},\,c\in L^{\infty}(\Omega), and 𝑨\boldsymbol{A} is symmetric and uniformly positive definite. It follows from the Fredholm alternative that existence and uniqueness of the weak solution uH01(Ω)u\in H^{1}_{0}(\Omega) to (22) is equivalent to the well-posedness of the homogeneous problem, i.e.,

\displaystyle a(u_{0},v):=\langle\boldsymbol{A}\nabla u_{0}\,,\nabla v\rangle_{\Omega}+\langle\boldsymbol{b}\cdot\nabla u_{0}+c\,u_{0}\,,v\rangle_{\Omega}

satisfies that

(23) \displaystyle\forall u_{0}\in H^{1}_{0}(\Omega):\Big(\big[\forall v\in H^{1}_{0}(\Omega)\quad a(u_{0},v)=0\big]\quad\Longrightarrow\quad u_{0}=0\Big);

see, e.g., [7]. Clearly, the general problem (22) does not only cover the Poisson problem from Section 3.1 (with 𝑨\boldsymbol{A} being the identity, 𝒃=0\boldsymbol{b}=0, and c=0c=0, where (23) follows from the Poincaré inequality), but also the Helmholtz problem (with 𝑨\boldsymbol{A} being the identity, 𝒃=0\boldsymbol{b}=0, and c=ω2<0c=-\omega^{2}<0, provided that ω2\omega^{2} is not an eigenvalue of the Dirichlet–Laplace eigenvalue problem).

The first-order reformulation of (22) reads

(24) \displaystyle\mathcal{L}\begin{pmatrix}u\\ {\boldsymbol{\sigma}}\end{pmatrix}:=\begin{pmatrix}-{\rm div\,}{\boldsymbol{\sigma}}+\boldsymbol{b}\cdot\nabla u+c\,u\\ \boldsymbol{A}\nabla u-{\boldsymbol{\sigma}}\end{pmatrix}=\begin{pmatrix}f\\ \boldsymbol{0}\end{pmatrix}=:\boldsymbol{F}\in L^{2}(\Omega)^{d+1}.

With the Hilbert space 𝕍(Ω)\mathbb{V}(\Omega) from (19) and provided that problem (22) admits a unique solution uH1(Ω)u\in H^{1}(\Omega), we conclude with [10, Theorem 3.1] that (A1)–(A2) are satisfied. As in Section 3.1, the assumptions (A3)–(A4) are clearly satisfied and (L) follows as all coefficients of \mathcal{L} are bounded.

As before, a common FEM discretization involves the conforming subspace 𝕍(Ω)\mathbb{V}_{\bullet}(\Omega) from (21), and (S1)–(S3) follow as in Section 3.1. Overall, there holds the following convergence result:

Proposition 10.

Under the well-posedness assumption (23) and for any marking strategy satisfying (M), the adaptive least-squares formulation of (24) based on $\mathbb{V}_{\bullet}(\Omega)$ from (21) guarantees plain convergence in the sense of Theorem 2 and Theorem 6. ∎

3.3. Linear Elasticity

Given $\boldsymbol{f}\in\boldsymbol{L}^{2}(\Omega):=L^{2}(\Omega)^{d}$, find the displacement $\boldsymbol{u}\in{\boldsymbol{H}}_{0}^{1}(\Omega):=H_{0}^{1}(\Omega)^{d}$ and the stress $\boldsymbol{M}\in\underline{\boldsymbol{H}}({\mathbf{div}\,};\Omega):=\big\{\boldsymbol{M}\in L^{2}(\Omega)^{d\times d}\,:\,{\mathbf{div}\,}\boldsymbol{M}\in\boldsymbol{L}^{2}(\Omega)\big\}$ satisfying

\displaystyle-{\mathbf{div}\,}\boldsymbol{M}=\boldsymbol{f},
\displaystyle\boldsymbol{M}-\mathbb{C}\boldsymbol{\epsilon}(\boldsymbol{u})=\boldsymbol{0}.

Here, 𝐝𝐢𝐯(){\mathbf{div}\,}(\cdot) denotes the divergence operator applied to each row and ϵ()=12(()+())\boldsymbol{\epsilon}(\cdot)=\tfrac{1}{2}(\boldsymbol{\nabla}(\cdot)+\boldsymbol{\nabla}(\cdot)^{\top}) is the symmetric gradient, where (𝒗)ij=j𝒗i(\boldsymbol{\nabla}\boldsymbol{v})_{ij}=\partial_{j}\boldsymbol{v}_{i} is the Jacobian. Given the Lamé parameters λ,μ>0\lambda,\mu>0, the positive definite elasticity tensor is given by

𝑵=(λtr𝑵)𝑰+2μ𝑵\displaystyle\mathbb{C}\boldsymbol{N}=(\lambda\mathrm{tr}\boldsymbol{N})\boldsymbol{I}+2\mu\boldsymbol{N}

with 𝑰\boldsymbol{I} denoting the d×dd\times d identity matrix and tr𝑵=j=1d𝑵jj\mathrm{tr}\boldsymbol{N}=\sum_{j=1}^{d}\boldsymbol{N}_{jj} the trace operator. Consider the Hilbert space

(25) \displaystyle\mathbb{V}(\Omega):={\boldsymbol{H}}_{0}^{1}(\Omega)\times\underline{\boldsymbol{H}}({\mathbf{div}\,};\Omega)

equipped with the norm

\displaystyle\|(\boldsymbol{v},\boldsymbol{N})\|_{\mathbb{V}(\Omega)}^{2}=\|\mathbb{C}^{-1/2}\boldsymbol{N}\|_{\Omega}^{2}+\|{\mathbf{div}\,}\boldsymbol{N}\|_{\Omega}^{2}+\|\mathbb{C}^{1/2}\boldsymbol{\epsilon}(\boldsymbol{v})\|_{\Omega}^{2},

which is equivalent to the canonical norm by Korn’s inequality and the properties of the elasticity tensor. Then, the first-order system can be put into the abstract form (1), i.e.,

(26) \displaystyle\mathcal{L}\begin{pmatrix}\boldsymbol{u}\\ \boldsymbol{M}\end{pmatrix}:=\begin{pmatrix}-{\mathbf{div}\,}\boldsymbol{M}\\ \mathbb{C}^{-1/2}\boldsymbol{M}-\mathbb{C}^{1/2}\boldsymbol{\epsilon}(\boldsymbol{u})\end{pmatrix}=\begin{pmatrix}\boldsymbol{f}\\ \boldsymbol{0}\end{pmatrix}=:\boldsymbol{F}\in L^{2}(\Omega)^{d+d^{2}},

where we identify d×dd\times d matrices with d2×1d^{2}\times 1 vectors. It is well known that the linear elasticity problem is well-posed so that (A1)–(A2) are satisfied; see, e.g., [9, Theorem 2.1]. The validity of the assumptions (A3)–(A4) and (L) follows as in Section 3.1. Consider the conforming discrete space

(27) \displaystyle\mathbb{V}_{\bullet}(\Omega)=S_{0}^{k+1}(\mathcal{T}_{\bullet})^{d}\times\boldsymbol{RT}^{k}(\mathcal{T}_{\bullet})^{d}\subset\mathbb{V}(\Omega).

Here, 𝑹𝑻k(𝒯)d\boldsymbol{RT}^{k}(\mathcal{T}_{\bullet})^{d} denotes the space of matrix-valued functions, where each row is an element of 𝑹𝑻k(𝒯)\boldsymbol{RT}^{k}(\mathcal{T}_{\bullet}). As in Section 3.1, we conclude that the assumptions (S1)–(S3) are satisfied. Overall, we then have the following result:

Proposition 11.

For any marking strategy satisfying (M), the adaptive least-squares formulation of (26) based on $\mathbb{V}_{\bullet}(\Omega)$ from (27) guarantees plain convergence in the sense of Theorem 2 and Theorem 6. ∎

3.4. Maxwell problem

We consider the case d=3d=3 only. Given 𝒇𝑳2(Ω)\boldsymbol{f}\in\boldsymbol{L}^{2}(\Omega) and cL(Ω)c\in L^{\infty}(\Omega), find 𝒖𝑯0(𝐜𝐮𝐫𝐥;Ω)\boldsymbol{u}\in\boldsymbol{H}_{0}(\boldsymbol{\operatorname{curl}}\,;\Omega) and 𝝈𝑯(𝐜𝐮𝐫𝐥;Ω){\boldsymbol{\sigma}}\in\boldsymbol{H}(\boldsymbol{\operatorname{curl}}\,;\Omega) satisfying

\displaystyle\boldsymbol{\operatorname{curl}}\,{\boldsymbol{\sigma}}+c\,\boldsymbol{u}=\boldsymbol{f},
\displaystyle\boldsymbol{\operatorname{curl}}\,\boldsymbol{u}-{\boldsymbol{\sigma}}=\boldsymbol{0},

where

\displaystyle\boldsymbol{H}(\boldsymbol{\operatorname{curl}}\,;\Omega):=\big\{\boldsymbol{v}\in\boldsymbol{L}^{2}(\Omega)\,:\,\boldsymbol{\operatorname{curl}}\,\boldsymbol{v}\in\boldsymbol{L}^{2}(\Omega)\big\},
\displaystyle\boldsymbol{H}_{0}(\boldsymbol{\operatorname{curl}}\,;\Omega):=\big\{\boldsymbol{v}\in\boldsymbol{H}(\boldsymbol{\operatorname{curl}}\,;\Omega)\,:\,\boldsymbol{v}\times{\boldsymbol{n}}|_{\partial\Omega}=0\big\}.

With the Hilbert space

(28) \displaystyle\mathbb{V}(\Omega):=\boldsymbol{H}_{0}(\boldsymbol{\operatorname{curl}}\,;\Omega)\times\boldsymbol{H}(\boldsymbol{\operatorname{curl}}\,;\Omega)

equipped with the norm

\displaystyle\|(\boldsymbol{v},{\boldsymbol{\tau}})\|_{\mathbb{V}(\Omega)}^{2}=\|\boldsymbol{v}\|_{\Omega}^{2}+\|\boldsymbol{\operatorname{curl}}\,\boldsymbol{v}\|_{\Omega}^{2}+\|{\boldsymbol{\tau}}\|_{\Omega}^{2}+\|\boldsymbol{\operatorname{curl}}\,{\boldsymbol{\tau}}\|_{\Omega}^{2},

the first-order system can be written in the abstract form (1), i.e.,

(29) \displaystyle\mathcal{L}\begin{pmatrix}\boldsymbol{u}\\ {\boldsymbol{\sigma}}\end{pmatrix}:=\begin{pmatrix}\boldsymbol{\operatorname{curl}}\,{\boldsymbol{\sigma}}+c\boldsymbol{u}\\ \boldsymbol{\operatorname{curl}}\,\boldsymbol{u}-{\boldsymbol{\sigma}}\end{pmatrix}=\begin{pmatrix}\boldsymbol{f}\\ \boldsymbol{0}\end{pmatrix}=:\boldsymbol{F}\in L^{2}(\Omega)^{6}.

The assumptions (A1)–(A2) are satisfied if $\operatorname{ess\,inf}_{x\in\Omega}c>0$ or $c=-\omega^{2}<0$ and $\omega^{2}$ is not an eigenvalue of the cavity problem. The argumentation is similar to the Poisson case. For the case $c=-\omega^{2}$, we also refer to [16, Section 2.4]. The validity of the assumptions (A3)–(A4) and (L) follows as in Section 3.1. Using the conforming discrete space

(30) \displaystyle\mathbb{V}_{\bullet}(\Omega)=\boldsymbol{N}_{0}^{k}(\mathcal{T}_{\bullet})\times\boldsymbol{N}^{k}(\mathcal{T}_{\bullet}):=(\boldsymbol{N}^{k}(\mathcal{T}_{\bullet})\cap\boldsymbol{H}_{0}(\boldsymbol{\operatorname{curl}}\,;\Omega))\times\boldsymbol{N}^{k}(\mathcal{T}_{\bullet})\subset\mathbb{V}(\Omega),

where 𝑵k(𝒯)\boldsymbol{N}^{k}(\mathcal{T}_{\bullet}) denotes the Nédélec space of order k0k\in\mathbb{N}_{0}, this proves that the assumptions (S1)–(S2) are satisfied. It remains to verify the local approximation property (S3). Recall from [28] that Cc(Ω¯)3C_{c}^{\infty}(\overline{\Omega})^{3} is dense in 𝑯0(𝐜𝐮𝐫𝐥;Ω)\boldsymbol{H}_{0}(\boldsymbol{\operatorname{curl}}\,;\Omega) and C(Ω¯)3C^{\infty}(\overline{\Omega})^{3} is dense in 𝑯(𝐜𝐮𝐫𝐥;Ω)\boldsymbol{H}(\boldsymbol{\operatorname{curl}}\,;\Omega). This shows that

Cc(Ω)3×C(Ω¯)3𝔻(Ω):=(H2(Ω)H01(Ω))3×H2(Ω)3𝕍(Ω)\displaystyle C_{c}^{\infty}(\Omega)^{3}\times C^{\infty}(\overline{\Omega})^{3}\subset\mathbb{D}(\Omega):=(H^{2}(\Omega)\cap H_{0}^{1}(\Omega))^{3}\times H^{2}(\Omega)^{3}\subset\mathbb{V}(\Omega)

is dense. Let 𝒩:H1(Ω)3𝑵k(𝒯)\mathcal{N}_{\bullet}\colon H^{1}(\Omega)^{3}\to\boldsymbol{N}^{k}(\mathcal{T}_{\bullet}) denote the edge interpolation operator which satisfies

𝒗𝒩𝒗𝑯(𝐜𝐮𝐫𝐥;T)hT𝒗H2(T)for all 𝒗H2(Ω)3 and all T𝒯;\displaystyle\|\boldsymbol{v}-\mathcal{N}_{\bullet}\boldsymbol{v}\|_{\boldsymbol{H}(\boldsymbol{\operatorname{curl}}\,;T)}\lesssim h_{T}\|\boldsymbol{v}\|_{H^{2}(T)}\quad\text{for all }\boldsymbol{v}\in H^{2}(\Omega)^{3}\text{ and all }T\in\mathcal{T}_{\bullet};

see, e.g., [26, Sections 3.5–3.6]. Note that 𝒩\mathcal{N}_{\bullet} by definition also preserves homogeneous (tangential) boundary conditions. Therefore, 𝒜:𝔻(Ω)𝕍(Ω)\mathcal{A}_{\bullet}\colon\mathbb{D}(\Omega)\to\mathbb{V}_{\bullet}(\Omega) defined by 𝒜(𝒗,𝝉)=(𝒩𝒗,𝒩𝝉)\mathcal{A}_{\bullet}(\boldsymbol{v},{\boldsymbol{\tau}})=(\mathcal{N}_{\bullet}\boldsymbol{v},\mathcal{N}_{\bullet}{\boldsymbol{\tau}}) satisfies assumption (S3). Overall, we have the following result:

Proposition 12.

For any marking strategy satisfying (M), the adaptive least-squares formulation of (29) based on 𝕍(Ω)\mathbb{V}_{\bullet}(\Omega) from (30) guarantees plain convergence in the sense of Theorem 2 and Theorem 6.∎
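
For completeness, we record the explicit local bound behind the verification of (S3) in this example; it is a direct consequence of the interpolation estimate for \mathcal{N}_{\bullet} stated above and of the definition of \mathcal{A}_{\bullet}:

\displaystyle\|(\boldsymbol{v},{\boldsymbol{\tau}})-\mathcal{A}_{\bullet}(\boldsymbol{v},{\boldsymbol{\tau}})\|_{\mathbb{V}(T)}\lesssim h_{T}\big(\|\boldsymbol{v}\|_{H^{2}(T)}+\|{\boldsymbol{\tau}}\|_{H^{2}(T)}\big)\quad\text{for all }(\boldsymbol{v},{\boldsymbol{\tau}})\in\mathbb{D}(\Omega)\text{ and all }T\in\mathcal{T}_{\bullet},

where \|\cdot\|_{\mathbb{V}(T)} denotes the elementwise counterpart of \|\cdot\|_{\mathbb{V}(\Omega)}.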

3.5. Stokes problem

Given 𝒇𝑳2(Ω)\boldsymbol{f}\in\boldsymbol{L}^{2}(\Omega), find the velocity 𝒖𝑯01(Ω)\boldsymbol{u}\in{\boldsymbol{H}}_{0}^{1}(\Omega) and the pressure pL2(Ω)p\in L^{2}(\Omega) satisfying

𝚫𝒖+p\displaystyle-\boldsymbol{\Delta}\boldsymbol{u}+\nabla p =𝒇,\displaystyle=\boldsymbol{f},
div𝒖\displaystyle{\rm div\,}\boldsymbol{u} =0,\displaystyle=0,
Ωp𝑑x\displaystyle\int_{\Omega}p\,dx =0.\displaystyle=0.

There are many different reformulations that are suitable for defining LSFEMs; see [5, Chapter 7]. Here, we consider a stress–velocity–pressure formulation (see, e.g., [11]), where the stress is defined by the relation

𝑴=ϵ(𝒖)p𝑰.\displaystyle\boldsymbol{M}=\boldsymbol{\epsilon}(\boldsymbol{u})-p\boldsymbol{I}.

Let ΠΩv=|Ω|1(v,1)Ω\Pi_{\Omega}v=|\Omega|^{-1}(v\hskip 1.42262pt,1)_{\Omega} denote the L2(Ω)L^{2}(\Omega)-orthogonal projection onto constants. Consider the Hilbert space

(31) 𝕍(Ω):=𝑯01(Ω)×𝑯¯(𝐝𝐢𝐯;Ω)×L2(Ω)\displaystyle\mathbb{V}(\Omega):={\boldsymbol{H}}_{0}^{1}(\Omega)\times\underline{\boldsymbol{H}\!}\hskip 0.50003pt({\mathbf{div}\,};\Omega)\times L^{2}(\Omega)

equipped with the norm

(𝒗,𝑵,q)𝕍(Ω)2=𝒗𝑯1(Ω)2+𝑵Ω2+𝐝𝐢𝐯𝑵Ω2+qΩ2+ΠΩqΩ2.\displaystyle\|(\boldsymbol{v},\boldsymbol{N},q)\|_{\mathbb{V}(\Omega)}^{2}=\|\boldsymbol{v}\|_{\boldsymbol{H}^{1}(\Omega)}^{2}+\|\boldsymbol{N}\|_{\Omega}^{2}+\|{\mathbf{div}\,}\boldsymbol{N}\|_{\Omega}^{2}+\|q\|_{\Omega}^{2}+\|\Pi_{\Omega}q\|_{\Omega}^{2}.

Then, the first-order system can be recast in the abstract form (1), i.e.,

(32) (𝒖𝑴p):=(𝐝𝐢𝐯𝑴𝑴ϵ(𝒖)+p𝑰div𝒖ΠΩp)=(𝒇𝟎00)=:𝑭L2(Ω)d+d2+2,\displaystyle\mathcal{L}\begin{pmatrix}\boldsymbol{u}\\ \boldsymbol{M}\\ p\end{pmatrix}:=\begin{pmatrix}-{\mathbf{div}\,}\boldsymbol{M}\\ \boldsymbol{M}-\boldsymbol{\epsilon}(\boldsymbol{u})+p\boldsymbol{I}\\ {\rm div\,}\boldsymbol{u}\\ \Pi_{\Omega}p\end{pmatrix}=\begin{pmatrix}\boldsymbol{f}\\ \boldsymbol{0}\\ 0\\ 0\end{pmatrix}=:\boldsymbol{F}\in L^{2}(\Omega)^{d+d^{2}+2},

where, as before, we identify d×dd\times d matrices with d2×1d^{2}\times 1 vectors.
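
As in the Maxwell example, the elementwise contributions of the canonical least-squares functional (which constitute the built-in error estimator) are obtained by restricting \|\mathcal{L}(\cdot)-\boldsymbol{F}\|_{\Omega} to the elements; spelled out for (32) and recorded only for orientation, they read

\displaystyle\|\mathcal{L}(\boldsymbol{v},\boldsymbol{N},q)-\boldsymbol{F}\|_{T}^{2}=\|{\mathbf{div}\,}\boldsymbol{N}+\boldsymbol{f}\|_{T}^{2}+\|\boldsymbol{N}-\boldsymbol{\epsilon}(\boldsymbol{v})+q\boldsymbol{I}\|_{T}^{2}+\|{\rm div\,}\boldsymbol{v}\|_{T}^{2}+\|\Pi_{\Omega}q\|_{T}^{2}\quad\text{for all }(\boldsymbol{v},\boldsymbol{N},q)\in\mathbb{V}(\Omega)\text{ and all }T\in\mathcal{T}_{\bullet}.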

Following the proof of [11, Theorem 3.2], we conclude that (A1)–(A2) are guaranteed (the only difference from [11] is that we include the zero-mean pressure condition directly in the first-order formulation). Clearly, (A3)–(A4) follow as in Section 3.1. Moreover, we have included the term ΠΩqΩ\|\Pi_{\Omega}q\|_{\Omega} in the definition of the norm 𝕍(Ω)\|\cdot\|_{\mathbb{V}(\Omega)} to ensure that (L) is satisfied, i.e.,

(𝒗,𝑵,q)T2\displaystyle\|\mathcal{L}(\boldsymbol{v},\boldsymbol{N},q)\|_{T}^{2} =𝐝𝐢𝐯𝑵T2+𝑵ϵ(𝒗)+q𝑰T2+div𝒗T2+ΠΩqT2\displaystyle=\|{\mathbf{div}\,}\boldsymbol{N}\|_{T}^{2}+\|\boldsymbol{N}-\boldsymbol{\epsilon}(\boldsymbol{v})+q\boldsymbol{I}\|_{T}^{2}+\|{\rm div\,}\boldsymbol{v}\|_{T}^{2}+\|\Pi_{\Omega}q\|_{T}^{2}
𝒗𝑯1(T)2+𝑵T2+𝐝𝐢𝐯𝑵T2+qT2+ΠΩqT2=(𝒗,𝑵,q)𝕍(T)2\displaystyle\lesssim\|\boldsymbol{v}\|_{\boldsymbol{H}^{1}(T)}^{2}+\|\boldsymbol{N}\|_{T}^{2}+\|{\mathbf{div}\,}\boldsymbol{N}\|_{T}^{2}+\|q\|_{T}^{2}+\|\Pi_{\Omega}q\|_{T}^{2}=\|(\boldsymbol{v},\boldsymbol{N},q)\|_{\mathbb{V}(T)}^{2}

holds for all (𝒗,𝑵,q)𝕍(Ω)(\boldsymbol{v},\boldsymbol{N},q)\in\mathbb{V}(\Omega).

Consider the conforming subspace

(33) 𝕍(Ω)=S0k+1(𝒯)d×𝑹𝑻k(𝒯)d×Pk(𝒯)𝕍(Ω),\displaystyle\mathbb{V}_{\bullet}(\Omega)=S_{0}^{k+1}(\mathcal{T}_{\bullet})^{d}\times\boldsymbol{RT}^{k}(\mathcal{T}_{\bullet})^{d}\times P^{k}(\mathcal{T}_{\bullet})\subset\mathbb{V}(\Omega),

where Pk(𝒯)P^{k}(\mathcal{T}_{\bullet}) denotes the space of elementwise polynomials of degree at most k0k\in\mathbb{N}_{0}. It follows that the assumptions (S1)–(S2) are satisfied. It remains to prove the local approximation assumption (S3). For the first two components (𝒗,𝑵)𝑯01(Ω)×𝑯¯(𝐝𝐢𝐯;Ω)(\boldsymbol{v},\boldsymbol{N})\in{\boldsymbol{H}}_{0}^{1}(\Omega)\times\underline{\boldsymbol{H}\!}\hskip 0.50003pt({\mathbf{div}\,};\Omega) of the space 𝕍(Ω)\mathbb{V}(\Omega), we argue as in Section 3.1. For the last component qL2(Ω)q\in L^{2}(\Omega), we note that H1(Ω)H^{1}(\Omega) is dense in L2(Ω)L^{2}(\Omega). Let 𝒫:L2(Ω)Pk(𝒯)\mathcal{P}_{\bullet}\colon L^{2}(\Omega)\to P^{k}(\mathcal{T}_{\bullet}) denote the L2(Ω)L^{2}(\Omega)-orthogonal projection and observe that ΠΩ(q𝒫q)=0\Pi_{\Omega}(q-\mathcal{P}_{\bullet}q)=0, since constant functions belong to Pk(𝒯)P^{k}(\mathcal{T}_{\bullet}). Then,

q𝒫qT+ΠΩ(q𝒫q)T=q𝒫qThTqTfor all qH1(Ω) and all T𝒯.\displaystyle\|q-\mathcal{P}_{\bullet}q\|_{T}+\|\Pi_{\Omega}(q-\mathcal{P}_{\bullet}q)\|_{T}=\|q-\mathcal{P}_{\bullet}q\|_{T}\lesssim h_{T}\|\nabla q\|_{T}\quad\text{for all }q\in H^{1}(\Omega)\text{ and all }T\in\mathcal{T}_{\bullet}.
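
The first-order decay predicted by this estimate is easy to observe numerically. The following minimal sketch is an illustration only: it uses a one-dimensional mesh and piecewise constants (k=0) so that the example stays self-contained, whereas the analysis above concerns simplicial meshes; all function and variable names are ours and not part of this note.

```python
import numpy as np

def p0_projection(q, nodes, n_quad=16):
    """Elementwise L2-orthogonal projection of q onto piecewise constants."""
    x, w = np.polynomial.legendre.leggauss(n_quad)  # Gauss rule on [-1, 1]
    means = []
    for a, b in zip(nodes[:-1], nodes[1:]):
        mid, half = 0.5 * (a + b), 0.5 * (b - a)
        means.append(0.5 * np.sum(w * q(mid + half * x)))  # mean value on [a, b]
    return np.array(means)

def l2_error(q, means, nodes, n_quad=16):
    """Global L2 error ||q - P q|| on the mesh given by `nodes`."""
    x, w = np.polynomial.legendre.leggauss(n_quad)
    err2 = 0.0
    for (a, b), c in zip(zip(nodes[:-1], nodes[1:]), means):
        mid, half = 0.5 * (a + b), 0.5 * (b - a)
        err2 += half * np.sum(w * (q(mid + half * x) - c) ** 2)
    return np.sqrt(err2)

q = lambda x: np.sin(np.pi * x)
for n in (8, 16, 32, 64):
    nodes = np.linspace(0.0, 1.0, n + 1)
    err = l2_error(q, p0_projection(q, nodes), nodes)
    print(f"h = {1.0 / n:.4f}   error = {err:.3e}")  # decays like h (first order)
```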

Altogether, we choose 𝔻(Ω)=(H2(Ω)H01(Ω))d×H2(Ω)d×d×H1(Ω)\mathbb{D}(\Omega)=(H^{2}(\Omega)\cap H_{0}^{1}(\Omega))^{d}\times H^{2}(\Omega)^{d\times d}\times H^{1}(\Omega) and 𝒜:𝔻(Ω)𝕍(Ω)\mathcal{A}_{\bullet}\colon\mathbb{D}(\Omega)\to\mathbb{V}_{\bullet}(\Omega) with 𝒜(𝒗,𝑵,q)=(𝒗,𝑵,𝒫q)\mathcal{A}_{\bullet}(\boldsymbol{v},\boldsymbol{N},q)=(\mathcal{I}_{\bullet}\boldsymbol{v},\mathcal{R}_{\bullet}\boldsymbol{N},\mathcal{P}_{\bullet}q), where \mathcal{I}_{\bullet} and \mathcal{R}_{\bullet} denote the operators from Section 3.1, applied componentwise to 𝒗\boldsymbol{v} and row-wise to 𝑵\boldsymbol{N}, respectively. Therefore, we conclude that also (S3) is satisfied. Overall, we have the following result:

Proposition 13.

For any marking strategy satisfying (M), the adaptive least-squares formulation of (32) based on 𝕍(Ω)\mathbb{V}_{\bullet}(\Omega) from (33) guarantees plain convergence in the sense of Theorem 2 and Theorem 6.∎

References

  • [1] M. Arioli, E. H. Georgoulis, and D. Loghin. Stopping criteria for adaptive finite element solvers. SIAM J. Sci. Comput., 35(3):A1537–A1559, 2013.
  • [2] M. Aurada, M. Feischl, J. Kemetmüller, M. Page, and D. Praetorius. Each H1/2H^{1/2}-stable projection yields convergence and quasi-optimality of adaptive FEM with inhomogeneous Dirichlet data in d\mathbb{R}^{d}. ESAIM Math. Model. Numer. Anal., 47(4):1207–1235, 2013.
  • [3] I. Babuška and M. Vogelius. Feedback and adaptive finite element solution of one-dimensional boundary value problems. Numer. Math., 44(1):75–102, 1984.
  • [4] P. Binev, W. Dahmen, and R. DeVore. Adaptive finite element methods with convergence rates. Numer. Math., 97(2):219–268, 2004.
  • [5] P. B. Bochev and M. D. Gunzburger. Least-squares finite element methods, volume 166 of Applied Mathematical Sciences. Springer, New York, 2009.
  • [6] D. Boffi, F. Brezzi, and M. Fortin. Mixed finite element methods and applications, volume 44 of Springer Series in Computational Mathematics. Springer, Heidelberg, 2013.
  • [7] S. C. Brenner and L. R. Scott. The mathematical theory of finite element methods, volume 15 of Texts in Applied Mathematics. Springer, New York, third edition, 2008.
  • [8] P. Bringmann, C. Carstensen, and G. Starke. An adaptive least-squares FEM for linear elasticity with optimal convergence rates. SIAM J. Numer. Anal., 56(1):428–447, 2018.
  • [9] Z. Cai, J. Korsawe, and G. Starke. An adaptive least squares mixed finite element method for the stress-displacement formulation of linear elasticity. Numer. Methods Partial Differential Equations, 21(1):132–148, 2005.
  • [10] Z. Cai, R. Lazarov, T. A. Manteuffel, and S. F. McCormick. First-order system least squares for second-order partial differential equations. I. SIAM J. Numer. Anal., 31(6):1785–1799, 1994.
  • [11] Z. Cai, B. Lee, and P. Wang. Least-squares methods for incompressible Newtonian fluid flow: linear stationary problems. SIAM J. Numer. Anal., 42(2):843–859, 2004.
  • [12] C. Carstensen. Collective marking for adaptive least-squares finite element methods with optimal rates. Math. Comp., 89(321):89–103, 2020.
  • [13] C. Carstensen, M. Feischl, M. Page, and D. Praetorius. Axioms of adaptivity. Comput. Math. Appl., 67(6):1195–1253, 2014.
  • [14] C. Carstensen and E.-J. Park. Convergence and optimality of adaptive least squares finite element methods. SIAM J. Numer. Anal., 53(1):43–62, 2015.
  • [15] C. Carstensen, E.-J. Park, and P. Bringmann. Convergence of natural adaptive least squares finite element methods. Numer. Math., 136(4):1097–1115, 2017.
  • [16] C. Carstensen and J. Storn. Asymptotic exactness of the least-squares finite element residual. SIAM J. Numer. Anal., 56(4):2008–2028, 2018.
  • [17] J. M. Cascon, C. Kreuzer, R. H. Nochetto, and K. G. Siebert. Quasi-optimal convergence rate for an adaptive finite element method. SIAM J. Numer. Anal., 46(5):2524–2550, 2008.
  • [18] W. Dörfler. A convergent adaptive algorithm for Poisson’s equation. SIAM J. Numer. Anal., 33(3):1106–1124, 1996.
  • [19] A. Ern, T. Gudi, I. Smears, and M. Vohralík. Equivalence of local- and global-best approximations, a simple stable local commuting projector, and optimal hp approximation estimates in H(div)H({\rm div}). Preprint, arXiv:1908.08158, 2019.
  • [20] T. Führer, A. Haberl, D. Praetorius, and S. Schimanko. Adaptive BEM with inexact PCG solver yields almost optimal computational costs. Numer. Math., 141(4):967–1008, 2019.
  • [21] T. Führer and D. Praetorius. A linear Uzawa-type FEM-BEM solver for nonlinear transmission problems. Comput. Math. Appl., 75(8):2678–2697, 2018.
  • [22] G. Gantner, A. Haberl, D. Praetorius, and S. Schimanko. Rate optimality of adaptive finite element methods with respect to the overall computational costs. Preprint, arXiv:2003.10785, 2020.
  • [23] G. Gantner, A. Haberl, D. Praetorius, and B. Stiftner. Rate optimal adaptive FEM with inexact solver for nonlinear operators. IMA J. Numer. Anal., 38(4):1797–1831, 2018.
  • [24] V. Girault and P.-A. Raviart. Finite element methods for Navier-Stokes equations, volume 5 of Springer Series in Computational Mathematics. Springer-Verlag, Berlin, 1986. Theory and algorithms.
  • [25] G. H. Golub and C. F. Van Loan. Matrix computations. Johns Hopkins Studies in the Mathematical Sciences. Johns Hopkins University Press, Baltimore, MD, fourth edition, 2013.
  • [26] R. Hiptmair. Finite elements in computational electromagnetism. Acta Numer., 11:237–339, 2002.
  • [27] M. Karkulik, D. Pavlicek, and D. Praetorius. On 2D newest vertex bisection: optimality of mesh-closure and H1H^{1}-stability of L2L_{2}-projection. Constr. Approx., 38(2):213–234, 2013.
  • [28] P. Monk. Finite element methods for Maxwell’s equations. Numerical Mathematics and Scientific Computation. Oxford University Press, New York, 2003.
  • [29] P. Morin, R. H. Nochetto, and K. G. Siebert. Data oscillation and convergence of adaptive FEM. SIAM J. Numer. Anal., 38(2):466–488, 2000.
  • [30] P. Morin, K. G. Siebert, and A. Veeser. A basic convergence result for conforming adaptive finite elements. Math. Models Methods Appl. Sci., 18(5):707–737, 2008.
  • [31] C.-M. Pfeiler and D. Praetorius. Dörfler marking with minimal cardinality is a linear complexity problem. Math. Comp., (accepted for publication), 2020.
  • [32] L. R. Scott and S. Zhang. Finite element interpolation of nonsmooth functions satisfying boundary conditions. Math. Comp., 54(190):483–493, 1990.
  • [33] K. G. Siebert. A convergence proof for adaptive finite elements without lower bound. IMA J. Numer. Anal., 31(3):947–970, 2011.
  • [34] R. Stevenson. Optimality of a standard adaptive finite element method. Found. Comput. Math., 7(2):245–269, 2007.
  • [35] R. Stevenson. The completion of locally refined simplicial partitions created by bisection. Math. Comp., 77(261):227–241, 2008.
  • [36] A. Veeser. Approximating gradients with continuous piecewise polynomial functions. Found. Comput. Math., 16(3):723–750, 2016.
  • [37] L. Zhong, L. Chen, S. Shu, G. Wittum, and J. Xu. Convergence and optimality of adaptive edge finite element methods for time-harmonic Maxwell equations. Math. Comp., 81(278):623–642, 2012.

Appendix A. Comments on optimal convergence rates

In this appendix, we consider the LSFEM discretization of the general second-order problem from Section 3.2. While Proposition 10 already proves plain convergence

𝒖𝒖𝕍(Ω)0as ,\displaystyle\|\boldsymbol{u}^{\star}-\boldsymbol{u}_{\ell}^{\star}\|_{\operatorname{\mathbb{V}}(\Omega)}\to 0\quad\text{as }\ell\to\infty,

the present appendix shows that optimal convergence rates (in the sense of, e.g., [13]) would already follow from the Dörfler marking criterion (see Section 2.6) together with linear convergence

(34) 𝒖𝒖+n𝕍(Ω)Clinqlinn𝒖𝒖𝕍(Ω)for all ,n0\displaystyle\|\boldsymbol{u}^{\star}-\boldsymbol{u}_{\ell+n}^{\star}\|_{\operatorname{\mathbb{V}}(\Omega)}\leq C_{\rm lin}\,q_{\rm lin}^{n}\,\|\boldsymbol{u}^{\star}-\boldsymbol{u}_{\ell}^{\star}\|_{\operatorname{\mathbb{V}}(\Omega)}\quad\text{for all }~\ell,n\in\mathbb{N}_{0}

with generic constants Clin>0C_{\rm lin}>0 and 0<qlin<10<q_{\rm lin}<1.

A.1. Scott–Zhang-type operators

For fixed p1p\geq 1, let 𝒥:H1(Ω)Sp(𝒯)\mathcal{J}_{\bullet}\colon H^{1}(\Omega)\to S^{p}(\mathcal{T}_{\bullet}) be the Scott–Zhang projector from [32]. We recall that 𝒥\mathcal{J}_{\bullet} preserves discrete boundary data so that 𝒥:H01(Ω)S0p(𝒯)\mathcal{J}_{\bullet}\colon H^{1}_{0}(\Omega)\to S^{p}_{0}(\mathcal{T}_{\bullet}) is well-defined. Besides this, we will only exploit the following two properties which hold for all vH1(Ω)v\in H^{1}(\Omega), vSp(𝒯)v_{\bullet}\in S^{p}(\mathcal{T}_{\bullet}), and T𝒯T\in\mathcal{T}_{\bullet}:

  • (i)

    local projection property: if v|Ω(T)=v|Ω(T)v|_{\Omega_{\bullet}(T)}=v_{\bullet}|_{\Omega_{\bullet}(T)}, then (𝒥v)|T=v|T(\mathcal{J}_{\bullet}v)|_{T}=v_{\bullet}|_{T};

  • (ii)

    global H𝟏\boldsymbol{H^{1}}-stability: (1𝒥)vH1(Ω)C(1+diam(Ω))vH1(Ω)\|(1-\mathcal{J}_{\bullet})v\|_{H^{1}(\Omega)}~\leq C(1+\mathrm{diam}(\Omega))\,\|v\|_{H^{1}(\Omega)}, where C>0C>0 depends only on the polynomial degree p1p\geq 1 and the shape regularity of 𝒯\mathcal{T}_{\bullet}.

The recent work [19] has constructed a Scott–Zhang-type projector 𝒫:𝑯(div;Ω)𝑹𝑻p1(𝒯)\mathcal{P}_{\bullet}\colon\boldsymbol{H}({\rm div\,};\Omega)\to\boldsymbol{RT}^{p-1}(\mathcal{T}_{\bullet}). While [19, Section 3.4] also includes Neumann boundary conditions 𝝈𝒏=0{\boldsymbol{\sigma}}\cdot\boldsymbol{n}=0 on ΓNΩ\Gamma_{N}\subseteq\partial\Omega, we only consider the full space 𝑯(div;Ω)\boldsymbol{H}({\rm div\,};\Omega). Moreover, we will only exploit the following two properties which hold for all 𝝈𝑯(div;Ω){\boldsymbol{\sigma}}\in\boldsymbol{H}({\rm div\,};\Omega), 𝝈𝑹𝑻p1(𝒯){\boldsymbol{\sigma}}_{\bullet}\in\boldsymbol{RT}^{p-1}(\mathcal{T}_{\bullet}), and T𝒯T\in\mathcal{T}_{\bullet}:

  • (i)

    local projection property: if 𝝈|Ω(T)=𝝈|Ω(T){\boldsymbol{\sigma}}|_{\Omega_{\bullet}(T)}={\boldsymbol{\sigma}}_{\bullet}|_{\Omega_{\bullet}(T)}, then (𝒫𝝈)|T=𝝈|T(\mathcal{P}_{\bullet}{\boldsymbol{\sigma}})|_{T}={\boldsymbol{\sigma}}_{\bullet}|_{T};

  • (ii)

    global H(𝐝𝐢𝐯;𝛀)\boldsymbol{H({\rm div};\Omega)}-stability: (1𝒫)𝝈𝑯(div;Ω)C(1+diam(Ω))𝝈𝑯(div;Ω)\|(1-\mathcal{P}_{\bullet}){\boldsymbol{\sigma}}\|_{\boldsymbol{H}({\rm div\,};\Omega)}~\leq C(1+\mathrm{diam}(\Omega))\,\|{\boldsymbol{\sigma}}\|_{\boldsymbol{H}({\rm div\,};\Omega)}, where again C>0C>0 depends only on p1p\geq 1 and the shape regularity of 𝒯\mathcal{T}_{\bullet}.

A.2. Discrete reliability

In the following, we exploit the local projection property and the global stability of the Scott–Zhang-type operators from Section A.1 and prove that the built-in least-squares estimator satisfies discrete reliability.

Lemma 14 (Discrete reliability).

There exist constants Cref,Cdrel1C_{\mathrm{ref}},C_{\mathrm{drel}}\geq 1 such that for all 𝒯refine(𝒯0)\mathcal{T}_{\bullet}\in\operatorname{refine}(\mathcal{T}_{0}) and all 𝒯refine(𝒯)\mathcal{T}_{\circ}\in\operatorname{refine}(\mathcal{T}_{\bullet}) there exists ,𝒯\mathcal{R}_{\bullet,\circ}\subset\mathcal{T}_{\bullet} such that

(35) 𝒖𝒖𝕍(Ω)Cdrelη(,)and#,Cref(#𝒯#𝒯).\displaystyle\|\boldsymbol{u}_{\circ}^{\star}-\boldsymbol{u}_{\bullet}^{\star}\|_{\mathbb{V}(\Omega)}\leq C_{\mathrm{drel}}\eta_{\bullet}(\mathcal{R}_{\bullet,\circ})\quad\text{and}\quad\#\mathcal{R}_{\bullet,\circ}\leq C_{\mathrm{ref}}(\#\mathcal{T}_{\circ}-\#\mathcal{T}_{\bullet}).

The constant CrefC_{\mathrm{ref}} depends only on the shape regularity of 𝒯\mathcal{T}_{\bullet}, whereas CdrelC_{\mathrm{drel}} additionally depends on the constants ccntc_{\rm cnt}, CcntC_{\rm cnt} from (A1), the polynomial degree pp\in\mathbb{N}, and diam(Ω)\mathrm{diam}(\Omega).

Proof.

The proof is split into three steps.

Step 1. Define the set

,=𝒯𝒩,with𝒩,={T𝒯:Ω(T)𝒯𝒯}.\displaystyle\mathcal{R}_{\bullet,\circ}=\mathcal{T}_{\bullet}\setminus\mathcal{N}_{\bullet,\circ}\quad\text{with}\quad\mathcal{N}_{\bullet,\circ}=\big{\{}T\in\mathcal{T}_{\bullet}\,:\,\Omega_{\bullet}(T)\subseteq\bigcup(\mathcal{T}_{\bullet}\cap\mathcal{T}_{\circ})\big{\}}.

Given T,T\in\mathcal{R}_{\bullet,\circ}, there exists T𝒯T^{\prime}\in\mathcal{T}_{\bullet} such that TTT\cap T^{\prime}\neq\emptyset and T𝒯𝒯T^{\prime}\not\in\mathcal{T}_{\bullet}\cap\mathcal{T}_{\circ}. This implies that T𝒯\𝒯T^{\prime}\in\mathcal{T}_{\bullet}\backslash\mathcal{T}_{\circ} and hence TT belongs to the patch around 𝒯\𝒯\mathcal{T}_{\bullet}\backslash\mathcal{T}_{\circ}, i.e., T{T𝒯:T(𝒯\𝒯)¯}T\in\big{\{}T^{\prime}\in\mathcal{T}_{\bullet}\,:\,T^{\prime}\cap\overline{\bigcup(\mathcal{T}_{\bullet}\backslash\mathcal{T}_{\circ})}\neq\emptyset\big{\}}. Overall, we thus conclude that

#,#{T𝒯:T(𝒯\𝒯)¯}(R2)#(𝒯\𝒯)#𝒯#𝒯.\displaystyle\#\mathcal{R}_{\bullet,\circ}\leq\#\big{\{}T^{\prime}\in\mathcal{T}_{\bullet}\,:\,T^{\prime}\cap\overline{\bigcup(\mathcal{T}_{\bullet}\backslash\mathcal{T}_{\circ})}\neq\emptyset\big{\}}\stackrel{{\scriptstyle\eqref{ass:refinement:quasiuniform}}}{{\lesssim}}\#(\mathcal{T}_{\bullet}\backslash\mathcal{T}_{\circ})\leq\#\mathcal{T}_{\circ}-\#\mathcal{T}_{\bullet}.

Step 2. We define the operator :𝕍(Ω)𝕍(Ω)\mathcal{I}_{\bullet}\colon\operatorname{\mathbb{V}}(\Omega)\to\operatorname{\mathbb{V}}_{\bullet}(\Omega) by 𝒗:=(𝒥v,𝒫𝝈)\mathcal{I}_{\bullet}\boldsymbol{v}:=(\mathcal{J}_{\bullet}v,\mathcal{P}_{\bullet}{\boldsymbol{\sigma}}) for 𝒗=(v,𝝈)𝕍(Ω)\boldsymbol{v}=(v,{\boldsymbol{\sigma}})\in\operatorname{\mathbb{V}}(\Omega). Since 𝒥\mathcal{J}_{\bullet} and 𝒫\mathcal{P}_{\bullet} are stable projections, it follows that also \mathcal{I}_{\bullet} is a stable projection, i.e., for all 𝒗𝕍(Ω)\boldsymbol{v}\in\operatorname{\mathbb{V}}(\Omega) and 𝒗𝕍(Ω)\boldsymbol{v}_{\bullet}\in\mathbb{V}_{\bullet}(\Omega), it holds that

(36) (1)𝒗𝕍(Ω)=(1)(𝒗𝒗)𝕍(Ω)𝒗𝒗𝕍(Ω),\displaystyle\|(1-\mathcal{I}_{\bullet})\boldsymbol{v}\|_{\operatorname{\mathbb{V}}(\Omega)}=\|(1-\mathcal{I}_{\bullet})(\boldsymbol{v}\!-\!\boldsymbol{v}_{\bullet})\|_{\operatorname{\mathbb{V}}(\Omega)}\lesssim\|\boldsymbol{v}-\boldsymbol{v}_{\bullet}\|_{\operatorname{\mathbb{V}}(\Omega)},

where the hidden constant depends on diam(Ω)\mathrm{diam}(\Omega) and the constants C>0C>0 from Section A.1. Moreover, let 𝒖=(u,𝝈)\boldsymbol{u}_{\circ}^{\star}=(u_{\circ}^{\star},{\boldsymbol{\sigma}}_{\circ}^{\star}). From the local projection properties of the operators 𝒥\mathcal{J}_{\bullet} and 𝒫\mathcal{P}_{\bullet} and the choice of 𝒩,\mathcal{N}_{\bullet,\circ}, it follows that 𝒖|𝒩,=(𝒖)|𝒩,\boldsymbol{u}_{\circ}^{\star}|_{\bigcup\mathcal{N}_{\bullet,\circ}}=(\mathcal{I}_{\bullet}\boldsymbol{u}_{\circ}^{\star})|_{\bigcup\mathcal{N}_{\bullet,\circ}}. This leads to supp(𝒖𝒖)(𝒯\𝒩,)=,\operatorname{supp}(\boldsymbol{u}_{\circ}^{\star}-\mathcal{I}_{\bullet}\boldsymbol{u}_{\circ}^{\star})\subseteq\bigcup(\mathcal{T}_{\bullet}\backslash\mathcal{N}_{\bullet,\circ})=\bigcup\mathcal{R}_{\bullet,\circ}. Since \mathcal{L} acts locally, this also implies that

(37) supp(𝒖𝒖),.\displaystyle\operatorname{supp}\mathcal{L}(\boldsymbol{u}_{\circ}^{\star}-\mathcal{I}_{\bullet}\boldsymbol{u}_{\circ}^{\star})\subseteq\mbox{$\bigcup$}\mathcal{R}_{\bullet,\circ}.

Step 3. For any 𝒗𝕍(Ω)\boldsymbol{v}_{\bullet}\in\mathbb{V}_{\bullet}(\Omega), the Galerkin orthogonality (6) proves that

𝒖𝒖𝕍(Ω)2\displaystyle\|\boldsymbol{u}_{\circ}^{\star}-\boldsymbol{u}_{\bullet}^{\star}\|_{\mathbb{V}(\Omega)}^{2} (A1)b(𝒖𝒖,𝒖𝒖)=(6)b(𝒖𝒖,𝒖𝒖)=(6)b(𝒖𝒖,𝒖𝒗).\displaystyle\stackrel{{\scriptstyle{\eqref{ass:pde}}}}{{\simeq}}b(\boldsymbol{u}_{\circ}^{\star}-\boldsymbol{u}_{\bullet}^{\star},\boldsymbol{u}_{\circ}^{\star}-\boldsymbol{u}_{\bullet}^{\star})\stackrel{{\scriptstyle\eqref{eq:galerkinorthogonality}}}{{=}}b(\boldsymbol{u}-\boldsymbol{u}_{\bullet}^{\star},\boldsymbol{u}_{\circ}^{\star}-\boldsymbol{u}_{\bullet}^{\star})\stackrel{{\scriptstyle\eqref{eq:galerkinorthogonality}}}{{=}}b(\boldsymbol{u}-\boldsymbol{u}_{\bullet}^{\star},\boldsymbol{u}_{\circ}^{\star}-\boldsymbol{v}_{\bullet}).

With the choice 𝒗=𝒖\boldsymbol{v}_{\bullet}=\mathcal{I}_{\bullet}\boldsymbol{u}_{\circ}^{\star}, we see that

b(𝒖𝒖,𝒖𝒗)=((𝒖𝒖),(𝒖𝒗))Ω=(37)((𝒖𝒖),(𝒖𝒗)),\displaystyle b(\boldsymbol{u}-\boldsymbol{u}_{\bullet}^{\star},\boldsymbol{u}_{\circ}^{\star}-\boldsymbol{v}_{\bullet})=(\mathcal{L}(\boldsymbol{u}-\boldsymbol{u}_{\bullet}^{\star})\hskip 1.42262pt,\mathcal{L}(\boldsymbol{u}_{\circ}^{\star}-\boldsymbol{v}_{\bullet}))_{\Omega}\stackrel{{\scriptstyle\eqref{eq:optimal2}}}{{=}}(\mathcal{L}(\boldsymbol{u}-\boldsymbol{u}_{\bullet}^{\star})\hskip 1.42262pt,\mathcal{L}(\boldsymbol{u}_{\circ}^{\star}-\boldsymbol{v}_{\bullet}))_{\bigcup\mathcal{R}_{\bullet,\circ}}
(𝒖𝒖),(𝒖𝒗),=(37)η(,)(𝒖𝒗)Ω\displaystyle\quad\stackrel{{\scriptstyle\phantom{\eqref{ass:pde}}}}{{\leq}}\|\mathcal{L}(\boldsymbol{u}-\boldsymbol{u}_{\bullet}^{\star})\|_{\bigcup\mathcal{R}_{\bullet,\circ}}\,\|\mathcal{L}(\boldsymbol{u}_{\circ}^{\star}-\boldsymbol{v}_{\bullet})\|_{\bigcup\mathcal{R}_{\bullet,\circ}}\stackrel{{\scriptstyle\eqref{eq:optimal2}}}{{=}}\eta_{\bullet}(\mathcal{R}_{\bullet,\circ})\,\|\mathcal{L}(\boldsymbol{u}_{\circ}^{\star}-\boldsymbol{v}_{\bullet})\|_{\Omega}
(A1)η(,)𝒖𝒗𝕍(Ω)(36)η(,)𝒖𝒖𝕍(Ω).\displaystyle\quad\stackrel{{\scriptstyle\eqref{ass:pde}}}{{\lesssim}}\eta_{\bullet}(\mathcal{R}_{\bullet,\circ})\|\boldsymbol{u}_{\circ}^{\star}-\boldsymbol{v}_{\bullet}\|_{\mathbb{V}(\Omega)}\stackrel{{\scriptstyle\eqref{eq:optimal1}}}{{\lesssim}}\eta_{\bullet}(\mathcal{R}_{\bullet,\circ})\|\boldsymbol{u}_{\circ}^{\star}-\boldsymbol{u}_{\bullet}^{\star}\|_{\mathbb{V}(\Omega)}.

Combining the latter two estimates, we conclude the proof. ∎

A.3. Linear convergence implies optimal algebraic convergence rates

Given s>0s>0, we consider the following approximation class

(38) 𝒖𝔸s:=supN0min𝒯𝕋#𝒯#𝒯0Nmin𝒗𝕍(Ω)(N+1)s𝒖𝒗𝕍(Ω)0{}.\displaystyle\|\boldsymbol{u}\|_{\mathbb{A}_{s}}:=\sup_{N\in\mathbb{N}_{0}}\min_{\begin{subarray}{c}\mathcal{T}_{\diamond}\in\mathbb{T}\\ \#\mathcal{T}_{\diamond}-\#\mathcal{T}_{0}\leq N\end{subarray}}\min_{\boldsymbol{v}_{\diamond}\in\operatorname{\mathbb{V}}_{\diamond}(\Omega)}(N+1)^{s}\|\boldsymbol{u}^{\star}-\boldsymbol{v}_{\diamond}\|_{\operatorname{\mathbb{V}}(\Omega)}\in\mathbb{R}_{\geq 0}\cup\{\infty\}.

We note that 𝒖𝔸s<\|\boldsymbol{u}\|_{\mathbb{A}_{s}}<\infty implies the existence of a sequence of meshes (¯𝒯)0(\bar{}\mathcal{T}_{\ell})_{\ell\in\mathbb{N}_{0}} with ¯𝒯0=𝒯0\bar{}\mathcal{T}_{0}=\mathcal{T}_{0} such that the corresponding best approximation errors satisfy

min¯𝒗𝕍¯(Ω)𝒖¯𝒗𝕍(Ω)(#¯𝒯#𝒯0)s0as .\displaystyle\min_{\bar{}\boldsymbol{v}_{\ell}\in\bar{\operatorname{\mathbb{V}}}_{\ell}(\Omega)}\|\boldsymbol{u}^{\star}-\bar{}\boldsymbol{v}_{\ell}\|_{\operatorname{\mathbb{V}}(\Omega)}\lesssim(\#\bar{}\mathcal{T}_{\ell}-\#\mathcal{T}_{0})^{-s}\to 0\quad\text{as }\ell\to\infty.

One says that Algorithm 1 is rate optimal if and only if, for all s>0s>0 with 𝒖𝔸s<\|\boldsymbol{u}\|_{\mathbb{A}_{s}}<\infty, the sequence of meshes (𝒯)0(\mathcal{T}_{\ell})_{\ell\in\mathbb{N}_{0}} generated by Algorithm 1 guarantees that

𝒖𝒖𝕍(Ω)(#𝒯#𝒯0)s0as .\displaystyle\|\boldsymbol{u}^{\star}-\boldsymbol{u}_{\ell}^{\star}\|_{\operatorname{\mathbb{V}}(\Omega)}\lesssim(\#\mathcal{T}_{\ell}-\#\mathcal{T}_{0})^{-s}\to 0\quad\text{as }\ell\to\infty.
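
In practice, rate optimality is checked by estimating the experimental rate ss from recorded error (or estimator) values along the adaptive run. The following minimal sketch illustrates this; the data are hypothetical placeholders and not results computed in this note.

```python
import numpy as np

# Hypothetical data for illustration only (not results computed in this note):
# numbers of elements #T_l and errors recorded along an adaptive run.
num_elems = np.array([120.0, 310.0, 820.0, 2100.0, 5600.0, 14800.0])
errors = np.array([9.1e-2, 5.5e-2, 3.3e-2, 2.0e-2, 1.2e-2, 7.3e-3])

# Least-squares fit of log(error) = -s * log(#T_l) + const, mimicking the decay
# (#T_l - #T_0)^(-s) in the rate-optimality statement above.
slope, _ = np.polyfit(np.log(num_elems), np.log(errors), 1)
print(f"experimental rate s = {-slope:.2f}")
```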

We refer to [13] for an abstract framework covering the state of the art of rate-optimal adaptive algorithms and note that LSFEM guarantees the equivalence

(39) η(𝒖)(8)𝒖𝒖𝕍(Ω)(5)min𝒗𝕍(Ω)𝒖𝒗𝕍(Ω)(8)min𝒗𝕍(Ω)η(𝒗)\displaystyle\eta_{\bullet}(\boldsymbol{u}_{\bullet}^{\star})\stackrel{{\scriptstyle\eqref{eq:reliable-efficient}}}{{\simeq}}\|\boldsymbol{u}^{\star}-\boldsymbol{u}_{\bullet}^{\star}\|_{\operatorname{\mathbb{V}}(\Omega)}\stackrel{{\scriptstyle\eqref{eq:cea}}}{{\simeq}}\min_{\boldsymbol{v}_{\bullet}\in\operatorname{\mathbb{V}}_{\bullet}(\Omega)}\|\boldsymbol{u}^{\star}-\boldsymbol{v}_{\bullet}\|_{\operatorname{\mathbb{V}}(\Omega)}\stackrel{{\scriptstyle\eqref{eq:reliable-efficient}}}{{\simeq}}\min_{\boldsymbol{v}_{\bullet}\in\operatorname{\mathbb{V}}_{\bullet}(\Omega)}\eta_{\bullet}(\boldsymbol{v}_{\bullet})

so that the present definition of 𝒖𝔸s\|\boldsymbol{u}\|_{\mathbb{A}_{s}} is equivalent to that of [4, 34, 17, 13].

The following result is a direct consequence of discrete reliability (35) and the analysis from [13] under the usual assumptions: First, suppose that we employ newest vertex bisection [35, 27] for mesh-refinement and that the initial mesh 𝒯0\mathcal{T}_{0} satisfies the admissibility condition from [35] for d3d\geq 3. Second, suppose that Algorithm 1 employs the Dörfler marking criterion with minimal cardinality, i.e., for given 0<θ10<\theta\leq 1, it holds that

(40) 𝕄:={𝒰𝒯:θη(u)2η(𝒰,u)2} and ##𝒰 for all 𝒰𝕄.\displaystyle\mathcal{M}_{\ell}\in\mathbb{M}_{\ell}:=\big{\{}\mathcal{U}_{\ell}\subseteq\mathcal{T}_{\ell}\,:\,\theta\,\eta_{\ell}(u_{\ell})^{2}\leq\eta_{\ell}(\mathcal{U}_{\ell},u_{\ell})^{2}\big{\}}\text{ \ \ and \ }\#\mathcal{M}_{\ell}\leq\#\mathcal{U}_{\ell}\text{ \ for all }\ \mathcal{U}_{\ell}\in\mathbb{M}_{\ell}.
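
For orientation, we sketch one standard way to realize (40) in practice: sort the refinement indicators and mark elements with the largest indicators until the fraction θ of the total is reached; this greedy choice has minimal cardinality. The sketch below is an illustration only (indicator values are hypothetical, and the function name is ours); sorting costs O(N log N), whereas [31] shows that a set satisfying (40) can even be determined in linear complexity.

```python
import numpy as np

def doerfler_marking(eta2, theta):
    """Return element indices M of a minimal-cardinality set satisfying (40), i.e.,
    theta * sum(eta2) <= sum(eta2[M]), where eta2 holds the squared refinement
    indicators. Sorting costs O(N log N); see [31] for linear complexity."""
    order = np.argsort(eta2)[::-1]                    # largest indicators first
    cumulative = np.cumsum(eta2[order])
    n_mark = int(np.searchsorted(cumulative, theta * cumulative[-1])) + 1
    return order[:n_mark]

# usage with hypothetical squared indicators eta_l(T)^2
eta2 = np.array([0.50, 0.10, 0.05, 0.30, 0.02, 0.03])
print(doerfler_marking(eta2, theta=0.6))              # marks elements 0 and 3
```
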
Proposition 15.

There exists a constant 0<θopt10<\theta_{\rm opt}\leq 1 such that for all 0<θ<θopt0<\theta<\theta_{\rm opt}, the following implication holds: If linear convergence (34) holds, then Algorithm 1 is even rate optimal, i.e., for all s>0s>0, there exists a constant Copt>0C_{\rm opt}>0 such that

(41) Copt1𝒖𝔸ssup0(#𝒯#𝒯0+1)s𝒖𝒖𝕍(Ω)Copt𝒖𝔸s.\displaystyle C_{\rm opt}^{-1}\,\|\boldsymbol{u}\|_{\mathbb{A}_{s}}\leq\sup_{\ell\in\mathbb{N}_{0}}(\#\mathcal{T}_{\ell}-\#\mathcal{T}_{0}+1)^{s}\|\boldsymbol{u}^{\star}-\boldsymbol{u}_{\ell}^{\star}\|_{\operatorname{\mathbb{V}}(\Omega)}\leq C_{\rm opt}\,\|\boldsymbol{u}\|_{\mathbb{A}_{s}}.
Sketch of proof.

By the triangle inequality, the built-in least-squares error estimator is stable on non-refined elements in the sense of [13, Axiom (A1)]. Together with discrete reliability (35), this implies optimality of the Dörfler marking; see [13, Section 4.5]. Due to (39), the error estimator is quasi-monotone with respect to mesh-refinement [13, Eq. (3.8)]. This implies the so-called comparison lemma; see [13, Lemma 4.14]. Together with linear convergence (34), [13, Proposition 4.15] yields the optimality (41). ∎

Remark 16.

We note that the (constrained) optimality result of Proposition 15 can be obtained for any of the examples presented in Section 3. For the required 𝑯(𝐜𝐮𝐫𝐥;Ω)\boldsymbol{H}(\boldsymbol{\operatorname{curl}}\,;\Omega)-stable local projection, we refer, e.g., to [37, Section 4].