
A 2-Level Domain Decomposition Preconditioner for KKT Systems with Heat-Equation Constraints (to appear in the proceedings of the 27th International Conference on Domain Decomposition Methods, DD27)

Eric C. Cyr (eccyr@sandia.gov), Sandia National Laboratories, Albuquerque, NM
Abstract

Solving optimization problems with transient PDE-constraints is computationally costly due to the number of nonlinear iterations and the cost of solving large-scale KKT matrices. These matrices scale with the size of the spatial discretization times the number of time steps. We propose a new two level domain decomposition preconditioner to solve these linear systems when constrained by the heat equation. Our approach leverages the observation that the Schur-complement is elliptic in time, and thus amenable to classical domain decomposition methods. Further, the application of the preconditioner uses existing time integration routines to facilitate implementation and maximize software reuse. The performance of the preconditioner is examined in an empirical study demonstrating the approach is scalable with respect to the number of time steps and subdomains.

1 Introduction

This paper develops a new domain-decomposition method for solving the KKT system with heat-equation constraints. This effort is driven by the quadratic optimization problem of the form

\min_{z}\;\frac{1}{2}\int_{0}^{T}\|u-\tilde{u}\|_{L^{2}(\Omega)}^{2}\,dt+\frac{\omega}{2}\int_{0}^{T}\|z\|_{L^{2}(\Omega)}^{2}\,dt \qquad (1)

\text{s.t.}\;\;\partial_{t}u-\nu\nabla\cdot\nabla u=z,\;\;x\in\Omega\subset\mathbb{R}^{2},\;t\in[0,T]

u(x,t)=0,\;\;x\in\partial\Omega,\;t\in[0,T],\qquad u(x,0)=u_{0}(x),\;\;x\in\Omega

This quadratic PDE-constrained optimization problem finds a control $z$ such that the solution $u$ to the heat equation matches the target $\tilde{u}$. The spatial domain is $\Omega$, the time interval is $[0,T]$, and the heat conductivity is $\nu$. Homogeneous boundary conditions are assumed for all time, and the initial condition is prescribed by $u_0$.

Many nonlinear methods use a series of quadratic approximations of the form represented by Eq. 1 to solve PDE-constrained optimization problems (see for instance sequential quadratic programming methods [8, 15, 27]). There have been several studies focused on developing scalable preconditioners for the saddle-point system that arises from the first-order necessary conditions. Often preconditioners for saddle-point systems take the form of approximate factorization block preconditioners [3]. These were explored for KKT systems in [4, 5]. Our work relies heavily on the block preconditioners from the Wathen group [23, 24, 25].

This effort focuses on transient PDE constraints where the size of the system scales with the number of spatial unknowns times the number of time steps, resulting in substantial computational effort. To alleviate this, a number of efforts have proposed accelerating the time solve using adaptive space-time discretizations [17, 16], parareal [20, 27, 12], multigrid approaches [6, 7, 10, 13, 19], block preconditioning [23, 24], and domain decomposition methods [14].

Our approach is also built on block preconditioning ideas. A difference is that our technique exploits the observation that the Schur-complement of the KKT system is elliptic in time (see [11, 18]). This allows us to leverage existing two level domain decomposition approaches for elliptic systems to improve the parallel scalability of the block preconditioner. Good performance is achieved by algorithmic choices that ensure the forward- and backward-in-time integrators can be applied on the fine level.

2 Discrete System and Block Preconditioner

In this article the PDE in Eq. 1 will be discretized on a 2D Cartesian grid using first-order backward Euler in time and a second-order finite-difference stencil in space. A row of the discrete space-time system for the heat equation satisfies:

u_{i,j}^{n+1}-u_{i,j}^{n}+\Delta t\,\nu\left(\frac{-u_{(i+1)j}^{n+1}+2u_{ij}^{n+1}-u_{(i-1)j}^{n+1}}{\Delta x^{2}}+\frac{-u_{i(j+1)}^{n+1}+2u_{ij}^{n+1}-u_{i(j-1)}^{n+1}}{\Delta y^{2}}\right)=\Delta t\,z_{ij}^{n+1}. \qquad (2)

Here $i,j$ are the interior space indices defined over $1\dots n_x-1$ and $1\dots n_y-1$. The exterior indices are eliminated using the homogeneous boundary conditions. The superscript time index $n$ runs from $0\ldots N_t$. Each $n$ is referred to below as a time-node. The control variable $z$'s index matches the implicit index on $u$; therefore $z^{n+1}$ is associated with the $n$th time interval. For a single time interval, Eq. 2 rewritten in matrix form is

J_{(n+1)(n+1)}u^{n+1}+J_{(n+1)n}u^{n}+L_{(n+1)(n+1)}z^{n+1}=0, \qquad (3)

and the global space-time system is

Ju+Lz=f. \qquad (4)

The right-hand side $f$ includes contributions from the initial conditions. The matrix $J$ is block lower triangular and the matrix $L$ is block diagonal.
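For concreteness, the per-step blocks of Eq. 3 and the space-time operators $J$ and $L$ of Eq. 4 can be assembled with SciPy sparse matrices. This is an illustrative sketch, not the paper's implementation; the sizes and the helper names (`heat_step_blocks`, `spacetime_system`) are invented for the example.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def heat_step_blocks(nx, ny, dt, nu, Lx=1.0, Ly=2.0):
    """Per-step blocks of Eq. 3: (I + dt*nu*A) u^{n+1} - u^n - dt*z^{n+1} = 0."""
    dx, dy = Lx / nx, Ly / ny
    # 1D second-difference operators on interior points (homogeneous Dirichlet)
    Dxx = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(nx - 1, nx - 1)) / dx**2
    Dyy = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(ny - 1, ny - 1)) / dy**2
    A = sp.kron(sp.eye(ny - 1), Dxx) + sp.kron(Dyy, sp.eye(nx - 1))  # discrete -Laplacian
    n = (nx - 1) * (ny - 1)
    # J_{(n+1)(n+1)}, J_{(n+1)n}, and L_{(n+1)(n+1)}
    return (sp.eye(n) + dt * nu * A).tocsc(), (-sp.eye(n)).tocsc(), (-dt * sp.eye(n)).tocsc()

def spacetime_system(nt, Jdiag, Joff, Ldiag):
    """Block lower-bidiagonal J and block-diagonal L of Eq. 4."""
    J = sp.kron(sp.eye(nt), Jdiag) + sp.kron(sp.eye(nt, k=-1), Joff)
    L = sp.kron(sp.eye(nt), Ldiag)
    return J.tocsc(), L.tocsc()

# sanity check: the all-at-once solve matches sequential backward-Euler stepping
nx, ny, nt, dt, nu = 4, 4, 3, 0.1, 1.0
n = (nx - 1) * (ny - 1)
Jd, Jo, Ld = heat_step_blocks(nx, ny, dt, nu)
J, L = spacetime_system(nt, Jd, Jo, Ld)
rng = np.random.default_rng(0)
u0, z = rng.standard_normal(n), rng.standard_normal(nt * n)
f = np.zeros(nt * n)
f[:n] = u0                        # initial-condition contribution -J_{10} u^0
u = spla.spsolve(J, f - L @ z)    # solve Ju + Lz = f in one shot
uc = u0
for k in range(nt):               # sequential stepping: Jd u^{k+1} = u^k + dt z^{k+1}
    uc = spla.spsolve(Jd, uc + dt * z[k * n:(k + 1) * n])
```

Since $J$ is block lower bidiagonal, the all-at-once solve above is equivalent to forward substitution in time, which is what makes time integration reusable later.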

The linear system whose solution solves the quadratic optimization problem from Eq. 1 is the celebrated KKT system $K\vec{u}=\vec{f}$ where

K=[MuJTωMzLTJL],u=[uzw],f=[fufzf].K=\begin{bmatrix}M_{u}&&J^{T}\\ &\omega M_{z}&L^{T}\\ J&L&\end{bmatrix},\;\vec{u}=\begin{bmatrix}u\\ z\\ w\end{bmatrix},\;\vec{f}=\begin{bmatrix}f_{u}\\ f_{z}\\ f\end{bmatrix}. (5)

The final row is the discrete form of the PDE constraint, enforced by the Lagrange multiplier $w$. We will also refer to $w$ as the adjoint solution. $M_u$ and $M_z$ are identity matrices scaled by $\Delta t\,\Delta x\,\Delta y$. The matrix $K$ is a saddle-point matrix, whose structure is frequently observed in numerical optimization. Many effective block preconditioners have been developed for this class of matrix [2, 3, 4]. We focus on the block preconditioning approach developed by Wathen and collaborators for solving linearized PDE-constrained optimization problems [23, 24, 25].

We write a block LDU factorization of the matrix $K$

K=[IIJMu1ω1LMz1I][MuωMzS][IMu1JTIω1Mz1LTI]K=\begin{bmatrix}I&&\\ &I&\\ JM_{u}^{-1}&\omega^{-1}LM_{z}^{-1}&I\end{bmatrix}\begin{bmatrix}M_{u}&&\\ &\omega M_{z}&\\ &&-S\end{bmatrix}\begin{bmatrix}I&&M_{u}^{-1}J^{T}\\ &I&\omega^{-1}M_{z}^{-1}L^{T}\\ &&I\end{bmatrix} (6)

where the Schur-complement is $S=JM_u^{-1}J^T+\frac{1}{\omega}LM_z^{-1}L^T$. Following [23], $K$ is preconditioned using the block diagonal operator

P=[MuωMzS^] where S^=J^Mu1J^T,J^=J+ω1/2L.P=\begin{bmatrix}M_{u}&&\\ &\omega M_{z}&\\ &&\hat{S}\end{bmatrix}\mbox{ where }\hat{S}=\hat{J}M_{u}^{-1}\hat{J}^{T},\hat{J}=J+\omega^{-1/2}L. (7)

This preconditioner leverages the result in [21], and approximately inverts the block diagonal in the LDU factorization. The matrix $\hat{J}$ used in the approximate Schur complement $\hat{S}$ is block lower triangular (similar to $J$), a fact that we will exploit below. The choice of $\hat{S}$ integrates the state Jacobian and the effects of the regularization parameter. In [23, 24] and [26], this approximation is developed and shown to lead to robust performance with respect to $\omega$.

3 Two-Level Domain Decomposition Schur-Complement

We propose a new domain decomposition approach for approximately inverting $\hat{S}$. This is motivated by the observation that the operator $S$ is elliptic in time (see [18] and [11]). For simplicity, we show this discretely using only the term $JM_u^{-1}J^T$. Consider the ODE $\partial_t y=-y$ discretized over three time steps with forward Euler: $y^{n+1}-y^n+\Delta t\,y^n=0$. With $M_u=I$, the Schur-complement $\hat{S}$ is

[11+Δt11+Δt1][11+Δt11+Δt1]=[1(1Δt)(1Δt)2(1Δt)+Δt2(1Δt)(1Δt)2(1Δt)+Δt2].\begin{bmatrix}1\\ -1+\Delta t&1&&\\ &-1+\Delta t&1&\\ \end{bmatrix}\begin{bmatrix}1&-1+\Delta t&\\ &1&-1+\Delta t\\ &&1\\ \end{bmatrix}\\ =\begin{bmatrix}1&-(1-\Delta t)&\\ -(1-\Delta t)&2(1-\Delta t)+\Delta t^{2}&-(1-\Delta t)\\ &-(1-\Delta t)&2(1-\Delta t)+\Delta t^{2}\end{bmatrix}. (8)

Examining the second row it is clear the operator has a 1D Laplacian stencil in time, with a positive perturbation on the diagonal. To take advantage of this ellipticity, we will apply existing domain decomposition approaches to the $\hat{S}$ operator. This ellipticity principle enables scalable performance of a preconditioned Krylov method. We also impose an efficiency constraint that the computational kernels in our preconditioner use the time integration method for the state and adjoint unknowns.
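The computation in Eq. 8 is easy to verify numerically; the snippet below forms $JJ^T$ for the three-step forward Euler model and confirms the Laplacian-like stencil entries.

```python
import numpy as np

dt = 0.1
a = 1.0 - dt                       # forward Euler: y^{n+1} = (1 - dt) y^n
J = np.array([[1.0, 0.0, 0.0],
              [-a,  1.0, 0.0],
              [0.0, -a,  1.0]])
S = J @ J.T                        # Schur-complement term J M_u^{-1} J^T with M_u = I
# middle row is a scaled 1D Laplacian stencil plus a positive diagonal shift:
# [-(1 - dt), 2(1 - dt) + dt^2, -(1 - dt)]
```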

The ellipticity principle is realized by considering a restricted additive Schwarz (RAS) method with $N_D$ subdomains (see [9]). Each subdomain contains the spatial unknowns associated with a subset of time steps. For instance, if there are $N_t=9$ time steps, then for $N_D=3$ the subdomains contain time-nodes $\{1,2,3\}$, $\{3,4,5,6\}$, and $\{6,7,8,9\}$ (the $0$th time-node is the excluded initial condition). Notice that the time-nodes are overlapped but the time steps are not. With these subsets, boolean operators $R_s$ are defined that restrict a space-time vector to the time-nodes in a subdomain, giving the RAS preconditioner

S^RAS1=s=1NDRsTDs(RsJ^Mu1J^TRsT)1Rs\hat{S}_{\text{RAS}}^{-1}=\sum_{s=1}^{N_{D}}R_{s}^{T}D_{s}\left(R_{s}\hat{J}M_{u}^{-1}\hat{J}^{T}R_{s}^{T}\right)^{-1}R_{s} (9)

where $D_s$ is the (boolean) partition of unity matrix (Defn. 1.11 of [9]). RAS is known to lead to effective preconditioners for elliptic problems and can be extended to multi-level schemes. While the ellipticity principle is exploited in $\hat{S}_{\text{RAS}}^{-1}$, the explicit formation of the product $\hat{J}M_u^{-1}\hat{J}^T$ does not satisfy the efficiency constraint.
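A minimal sketch of Eq. 9 on a scalar-in-space model (one spatial unknown per time-node, $M_u=I$) follows. The overlap and ownership conventions mirror the $\{1,2,3\},\{3,4,5,6\},\{6,7,8,9\}$ example above (shifted to 0-based indices), and note that the sketch forms $\hat{S}$ explicitly, which is exactly the expense the efficiency constraint is meant to avoid.

```python
import numpy as np

def subdomain_nodes(nt, nd):
    """Overlapping time-node sets (0-based): steps are partitioned, nodes overlap."""
    per = nt // nd
    sets = []
    for s in range(nd):
        lo = s * per
        start = lo - 1 if s > 0 else lo   # include the previous subdomain's last node
        sets.append(np.arange(start, lo + per))
    return sets

def ras_apply(S, sets, r, per):
    """Action of Eq. 9; D_s assigns each overlapped node to the earlier subdomain."""
    x = np.zeros_like(r)
    for s, idx in enumerate(sets):
        loc = np.linalg.solve(S[np.ix_(idx, idx)], r[idx])  # R_s S R_s^T subdomain solve
        owned = idx >= s * per                              # D_s ownership mask
        x[idx[owned]] += loc[owned]
    return x

# toy model: \hat{J} lower bidiagonal in time, M_u = I
nt, nd = 9, 3
a = 0.9
Jhat = np.eye(nt) - a * np.eye(nt, k=-1)
S = Jhat @ Jhat.T                  # explicitly formed product, as Eq. 9 requires
r = np.ones(nt)
x_ras = ras_apply(S, subdomain_nodes(nt, nd), r, nt // nd)
x_one = ras_apply(S, subdomain_nodes(nt, 1), r, nt)   # single subdomain = exact solve
```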

To satisfy the efficiency constraint, note that the range of $\hat{J}^T R_s^T$ is nonzero on the time-nodes in the $s$th subdomain and one time-node earlier. For instance, if the subdomain contains nodes $\{3,4,5,6\}$ then the range is nonzero on $\{2,3,4,5,6\}$. Let $Q_s$ be a new extended restriction operator whose action produces a space-time vector for the time-nodes that are nonzero in the range of $\hat{J}^T R_s^T$. Further, choose $Q_s$ so that

Qs=[WsRs] and QsQsT=I.Q_{s}=\begin{bmatrix}W_{s}\\ R_{s}\end{bmatrix}\mbox{ and }Q_{s}Q_{s}^{T}=I. (10)

The operator $W_s$ restricts the space-time vector to the time-nodes contained in the earlier time step relative to the $s$th subdomain. Because $Q_s$ is a restriction operator that represents the nonzero range of $\hat{J}^T R_s^T$, we have that $Q_s^T Q_s \hat{J}^T R_s^T = \hat{J}^T R_s^T$. Recalling that $M_u$ is diagonal, we can rewrite the subdomain solve in $\hat{S}_{\text{RAS}}^{-1}$ as

RsJ^Mu1J^TRsT=RsJ^QsT(QsMu1QsT)QsJ^TRsTR_{s}\hat{J}M_{u}^{-1}\hat{J}^{T}R_{s}^{T}=R_{s}\hat{J}Q_{s}^{T}(Q_{s}M_{u}^{-1}Q_{s}^{T})Q_{s}\hat{J}^{T}R_{s}^{T} (11)

Using the constraint in Eq. 10, we have the additional identities $R_s = R_s Q_s^T Q_s$ and $R_s^T = Q_s^T Q_s R_s^T$. This permits a final rewrite of the operator in Eq. 11

RsJ^Mu1J^TRsT=RsQsTJ^sMs1J^sTQsRsTR_{s}\hat{J}M_{u}^{-1}\hat{J}^{T}R_{s}^{T}=R_{s}Q_{s}^{T}\hat{J}_{s}M_{s}^{-1}\hat{J}_{s}^{T}Q_{s}R_{s}^{T} (12)

where $\hat{J}_s = Q_s \hat{J} Q_s^T$ and $M_s^{-1} = Q_s M_u^{-1} Q_s^T$. The inverse action of $\hat{J}_s M_s^{-1} \hat{J}_s^T$ is easily computed in a matrix-free way on the time-nodes in the extended subdomain. Motivated by this equivalence, we define a new one-level preconditioner

S^RASQ1=s=1NDRsTDs(RsQsTJ^sTMsJ^s1QsRsT)Rs.\hat{S}_{\text{RASQ}}^{-1}=\sum_{s=1}^{N_{D}}R_{s}^{T}D_{s}\left(R_{s}Q_{s}^{T}\hat{J}_{s}^{-T}M_{s}\hat{J}_{s}^{-1}Q_{s}R_{s}^{T}\right)R_{s}. (13)

The term in parentheses is different from the term inverted in Eq. 9. The difference is that the inverse computed in Eq. 9 is constrained to have a zero initial condition outside of the subdomain. This revised operator satisfies our efficiency constraint, as computing $\hat{J}_s^{-1}$ and $\hat{J}_s^{-T}$ is done using the time integration method.
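The RASQ subdomain solve can be sketched on the same scalar toy model: each application of $\hat{J}_s^{-1}$ is a forward-in-time sweep and each $\hat{J}_s^{-T}$ a backward-in-time sweep over the extended node set, standing in for the state and adjoint time integrators. The function names and sizes are illustrative, and $M_u=I$ for simplicity.

```python
import numpy as np
from scipy.linalg import solve_triangular

def rasq_apply(Jhat, sets, r, per):
    """Action of Eq. 13 with M_u = I: subdomain solves via time sweeps with \\hat{J}_s."""
    x = np.zeros_like(r)
    for s, idx in enumerate(sets):
        lo = max(idx[0] - 1, 0)
        ext = np.arange(lo, idx[-1] + 1)            # Q_s nodes: subdomain plus one earlier
        Js = Jhat[np.ix_(ext, ext)]                 # \hat{J}_s, lower triangular in time
        rext = np.zeros(len(ext))
        rext[len(ext) - len(idx):] = r[idx]         # Q_s R_s^T r
        y = solve_triangular(Js, rext, lower=True)  # \hat{J}_s^{-1}: forward-in-time sweep
        y = solve_triangular(Js.T, y, lower=False)  # \hat{J}_s^{-T}: backward (adjoint) sweep
        loc = y[len(ext) - len(idx):]               # R_s Q_s^T
        owned = idx >= s * per                      # D_s ownership mask
        x[idx[owned]] += loc[owned]
    return x

nt, nd = 9, 3
per = nt // nd
a = 0.9
Jhat = np.eye(nt) - a * np.eye(nt, k=-1)
sets = [np.arange(s * per - (1 if s else 0), (s + 1) * per) for s in range(nd)]
r = np.ones(nt)
x_rasq = rasq_apply(Jhat, sets, r, per)
# single-subdomain sanity check: RASQ reduces to the exact Schur-complement solve
x_one = rasq_apply(Jhat, [np.arange(nt)], r, nt)
```

No product $\hat{J}M_u^{-1}\hat{J}^T$ is ever formed here; only triangular (time-stepping) solves with $\hat{J}_s$ and its transpose are used, which is the point of the efficiency constraint.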

To obtain scalability with respect to the number of subdomains, a coarse grid correction is required. Again leveraging the elliptic nature of the Schur-complement, we consider the Nicolaides coarse space developed for solving the Poisson problem [22]. Following the presentation in [9], the Nicolaides coarse space is defined as

Z=[D1R1Φ1D2R2Φ2DNDRNDΦND] where Φs=[1111+2Ns11+2NsNs]Z=\begin{bmatrix}D_{1}R_{1}\Phi_{1}&&\ldots&\\ &D_{2}R_{2}\Phi_{2}&\ddots&\\ \vdots&\ddots&\ddots&\\ &\ldots&&D_{N_{D}}R_{N_{D}}\Phi_{N_{D}}\end{bmatrix}\mbox{ where }\Phi_{s}=\begin{bmatrix}1&-1\\ 1&-1+\frac{2}{N_{s}}\\ \vdots&\vdots\\ 1&-1+2\frac{N_{s}}{N_{s}}\end{bmatrix} (14)

The columns of $\Phi_s$ form a constant and linear basis over the $N_s$ subdomain time-nodes. The coarse restriction $R_0 = Z^T$ is used in the definition of the coarse operator $\hat{S}_0 = R_0 \hat{J} M_u^{-1} \hat{J}^T R_0^T$. The coarse solve is applied in a multiplicative way

S^2level1=S^RASQ1R0TS^01R0.\hat{S}_{2-\text{level}}^{-1}=\hat{S}_{\text{RASQ}}^{-1}R_{0}^{T}\hat{S}_{0}^{-1}R_{0}. (15)

Due to the structure of $R_0$, the coarse operator $\hat{S}_0$ can be constructed in parallel. This does represent a violation of the efficiency constraint, to be addressed by future work.
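A simplified construction of the coarse space $Z$ and the coarse operator $\hat{S}_0$ on the scalar toy model might look as follows. For clarity the sketch assigns each time-node to exactly one subdomain, so $D_s R_s \Phi_s$ collapses to a constant and a linear mode on the owned nodes; this is a simplification of Eq. 14, not the paper's exact construction.

```python
import numpy as np

def nicolaides_Z(nt, nd):
    """Coarse space in the spirit of Eq. 14, simplified to non-overlapping ownership:
    one constant and one linear mode per subdomain."""
    per = nt // nd
    Z = np.zeros((nt, 2 * nd))
    for s in range(nd):
        idx = np.arange(s * per, (s + 1) * per)   # nodes owned by subdomain s
        k = np.arange(1, per + 1)
        Z[idx, 2 * s] = 1.0                       # constant mode
        Z[idx, 2 * s + 1] = -1.0 + 2.0 * k / per  # linear ramp, as in \Phi_s
    return Z

# coarse operator on the scalar toy model with M_u = I
nt, nd = 9, 3
a = 0.9
Jhat = np.eye(nt) - a * np.eye(nt, k=-1)
Z = nicolaides_Z(nt, nd)
S0 = Z.T @ (Jhat @ Jhat.T) @ Z    # \hat{S}_0 = R_0 \hat{J} M_u^{-1} \hat{J}^T R_0^T
```

Because $Z$ is block diagonal over subdomains with only two columns each, $\hat{S}_0$ is a small (here $6\times 6$) symmetric positive definite matrix, which is why it can be assembled and solved cheaply relative to the fine-level problem.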

4 Numerical Experiments

To demonstrate this approach, we discretize the quadratic optimization problem from Eq. 1 as described in Sec. 2. The 2D spatial domain is $\Omega=(0,1)\times(0,2)\subset\mathbb{R}^2$, and the time domain is $[0,1]$. The initial condition and target solution are

u_{0}(x,y)=-xy(x-1)(y-2),\quad\tilde{u}(x,y,t)=\sin(2\pi t)\sin(2\pi x)\sin(2\pi y). \qquad (16)

The regularization parameter $\omega$ varies over five orders of magnitude. Experiments were run with $9\times 9$, $17\times 17$, and $33\times 33$ mesh points. Qualitatively, variation with the number of spatial points was not a factor in the convergence. This is not surprising, as the implicit operator in space is inverted with a direct solve. As a result, the computations below are all for the case of $17\times 17$ mesh points. Recall that homogeneous boundary conditions are removed, giving $15\times 15$ unknowns in each time step. The linear system $K\vec{u}=\vec{f}$ (Eq. 5) is solved using right-preconditioned GMRES from PyAMG [1], iterated until a relative residual tolerance of $10^{-6}$ is achieved.

Figure 1: Three weak scaling studies for different numbers of time steps per subdomain. The two level scheme (triangles) has flat iteration counts regardless of the number of time steps, the subdomain size, and the regularization parameter. Asymptotically, the one level method shows a strong dependence on the number of subdomains and time steps.
Figure 2: This plot demonstrates the robustness of the two level scheme (triangle markers) with respect to the regularization parameter $\omega$. Note that for many cases the one level scheme (circle markers) did not converge within the maximum allowed $420$ iterations, thus those values are omitted.

Figure 1 presents three weak-scaling studies ranging from $100$ to $3200$ time steps. The number of time steps per subdomain is fixed at $80$ in the left plot, $20$ in the center plot, and $5$ in the right plot. For the case of $80$ steps, the fewest number of time steps is $200$ (the minimum number of time steps evenly divisible by $80$ in the chosen sequence). The plots show the number of iterations as a function of time step count for GMRES preconditioned using $P$ from Eq. 7 with Schur complement approximations $\hat{S}_{\text{RASQ}}^{-1}$ for the one level case (circle markers) and $\hat{S}_{2\text{-level}}^{-1}$ for the two level case (triangle markers). Different values of the regularization parameter $\omega$ are indicated using solid ($10^{-2}$), dashed ($10^{-3}$), or dotted ($10^{-4}$) lines. These plots demonstrate that the performance of the two level method is independent of both the number of subdomains and the number of time steps. Further, this independence holds regardless of the value of the regularization parameter. As anticipated, the one level method shows substantial iteration growth with the number of time steps and variability with the regularization parameter. However, it is worth noting that, depending on the number of subdomains and the size of the regularization parameter, the one level method may be faster despite its lack of scalability. For instance, when using $40$ subdomains and a regularization parameter of $10^{-4}$, the one level method takes the same number of iterations but lacks the synchronization and added cost of the two level method.

The scaling with respect to the regularization parameter is investigated in Figure 2. In these plots the preconditioned iteration counts are plotted as a function of the inverse regularization parameter. Data points are excluded when the number of iterations exceeded the maximum iteration count for GMRES (in this case $420$). Here again the two level method scales well, yielding essentially flat iteration counts as a function of the regularization parameter. The one level method shows strong dependence on $\omega$, though it improves dramatically for smaller values.

5 Conclusion

In this paper, motivated by results in block preconditioning and the elliptic-in-time nature of the KKT system, we develop a two level domain decomposition preconditioner that facilitates a parallel-in-time solver for the discrete optimality system constrained by the heat equation. While limited in breadth, initial results for this approach show excellent scalability with respect to the number of time steps, subdomains, and the regularization parameter. Future work will focus on achieving improved scaling by including more levels in the hierarchy, and on applying this technique to a broader class of problems and discretizations.

Acknowledgements

The author is indebted to the anonymous reviewers whose comments resulted in a significant strengthening of the paper. The author also acknowledges support from the SEA-CROGS project and the Early Career program, both funded by the DOE Office of Science. This article has been authored by an employee of National Technology & Engineering Solutions of Sandia, LLC under Contract No. DE-NA0003525 with the U.S. Department of Energy (DOE). The employee owns all right, title and interest in and to the article and is solely responsible for its contents. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this article or allow others to do so, for United States Government purposes. The DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan https://www.energy.gov/downloads/doe-public-access-plan. This paper describes objective technical results and analysis. Any subjective views or opinions that might be expressed in the paper do not necessarily represent the views of the U.S. Department of Energy or the United States Government. Sandia report number: SAND2023-03175O.

References

  • [1] N Bell, L N Olson, and J Schroder. PyAMG: Algebraic multigrid solvers in Python. Journal of Open Source Software, 7(72):4142, 2022.
  • [2] M Benzi and G H Golub. A preconditioner for generalized saddle point problems. SIAM Journal on Matrix Analysis and Applications, 26(1):20–41, 2004.
  • [3] M Benzi, G H Golub, and J Liesen. Numerical solution of saddle point problems. Acta Numerica, 14:1–137, 2005.
  • [4] M Benzi, E Haber, and L Taralli. A preconditioning technique for a class of PDE-constrained optimization problems. Advances in Computational Mathematics, 35(2):149–173, 2011.
  • [5] G Biros and O Ghattas. Parallel Lagrange–Newton–Krylov–Schur methods for PDE-constrained optimization. Part I: The Krylov–Schur solver. SIAM Journal on Scientific Computing, 27(2):687–713, 2005.
  • [6] A Borzì. Multigrid methods for parabolic distributed optimal control problems. Journal of Computational and Applied Mathematics, 157(2):365–382, 2003.
  • [7] A Borzì and V Schulz. Multigrid methods for PDE optimization. SIAM Review, 51(2):361–395, 2009.
  • [8] R H Byrd, F E Curtis, and J Nocedal. An inexact SQP method for equality constrained optimization. SIAM Journal on Optimization, 19(1):351–369, 2008.
  • [9] V Dolean, P Jolivet, and F Nataf. An Introduction to Domain Decomposition Methods. Society for Industrial and Applied Mathematics, Philadelphia, PA, 2015.
  • [10] R D Falgout, J-M Maldague, I Meyers, J Munar, E Neville, and T Overman. TriMGRIT: An Extension of Multigrid Reduction in Time for Constrained Optimization. In PinT 2020 – 9th Parallel-in-Time Workshop, 2020.
  • [11] M J Gander and F Kwok. Schwarz methods for the time-parallel solution of parabolic control problems. In Domain decomposition methods in science and engineering XXII, pages 207–216. Springer, 2016.
  • [12] M J Gander, F Kwok, and J Salomon. PARAOPT: A parareal algorithm for optimality systems. SIAM Journal on Scientific Computing, 42(5):A2773–A2802, 2020.
  • [13] S Günther, N R Gauger, and J B Schroder. A non-intrusive parallel-in-time approach for simultaneous optimization with unsteady PDEs. Optimization Methods and Software, 34(6):1306–1321, 2019.
  • [14] M Heinkenschloss. A time-domain decomposition iterative method for the solution of distributed linear quadratic optimal control problems. Journal of Computational and Applied Mathematics, 173(1):169–198, 2005.
  • [15] M Heinkenschloss and D Ridzal. A matrix-free trust-region SQP method for equality constrained optimization. SIAM Journal on Optimization, 24(3):1507–1541, 2014.
  • [16] U Langer, O Steinbach, F Tröltzsch, and H Yang. Space-time finite element discretization of parabolic optimal control problems with energy regularization. SIAM Journal on Numerical Analysis, 59(2):675–695, 2021.
  • [17] U Langer, O Steinbach, F Tröltzsch, and H Yang. Unstructured space-time finite element methods for optimal control of parabolic equations. SIAM Journal on Scientific Computing, 43(2):A744–A771, 2021.
  • [18] R M Lewis and S G Nash. Model problems for the multigrid optimization of systems governed by differential equations. SIAM Journal on Scientific Computing, 26(6):1811–1837, 2005.
  • [19] S Lin. Multilevel-in-Time Methods for Optimal Control of PDEs and Training of Recurrent Neural Networks. PhD thesis, Rice University, 2022.
  • [20] Y Maday and G Turinici. A parareal in time procedure for the control of partial differential equations. Comptes Rendus Mathematique, 335(4):387–392, 2002.
  • [21] M F Murphy, G H Golub, and A J Wathen. A note on preconditioning for indefinite linear systems. SIAM Journal on Scientific Computing, 21(6):1969–1972, 2000.
  • [22] R A Nicolaides. Deflation of Conjugate Gradients with Applications to Boundary Value Problems. SIAM Journal on Numerical Analysis, 24(2):355–365, 1987.
  • [23] J W Pearson, M Stoll, and A J Wathen. Regularization-robust preconditioners for time-dependent PDE-constrained optimization problems. SIAM Journal on Matrix Analysis and Applications, 33(4):1126–1152, 2012.
  • [24] J W Pearson, M Stoll, and A J Wathen. Preconditioners for state-constrained optimal control problems with Moreau–Yosida penalty function. Numerical Linear Algebra with Applications, 21(1):81–97, 2014.
  • [25] T Rees, H S Dollar, and A J Wathen. Optimal solvers for PDE-constrained optimization. SIAM Journal on Scientific Computing, 32(1):271–298, 2010.
  • [26] A Schiela and S Ulbrich. Operator preconditioning for a class of inequality constrained optimal control problems. SIAM Journal on Optimization, 24(1):435–466, 2014.
  • [27] S Ulbrich. Generalized SQP methods with “parareal” time-domain decomposition for time-dependent PDE-constrained optimization. In Real-time PDE-constrained optimization, pages 145–168. SIAM, 2007.