\newsiamthm

remarkRemark

An adaptive planewave method for electronic structure calculations

Beilei Liu School of Mathematical Sciences, Beijing Normal University, No.19, Xinjiekouwai Street, Beijing 100875, China (, ). beilei@mail.bnu.edu.cn chen.huajie@bnu.edu.cn Huajie Chen²²footnotemark: 2 Geneviève Dusson Laboratoire de Mathématiques de Besançon, UMR CNRS 6623, Université Bourgogne Franche-Comté, 16 route de Gray, 25030 Besançon, France (). genevieve.dusson@math.cnrs.fr Jun Fang Institute of Applied Physics and Computational Mathematics, Fenghao East Road 2, Beijing 100094, China (). fang_jun@iapcm.ac.cn Xingyu Gao Laboratory of Computational Physics, Institute of Applied Physics and Computational Mathematics, Huayuan Road 6, Beijing 100088, China (). gao_xingyu@iapcm.ac.cn

Abstract

We propose an adaptive planewave method for eigenvalue problems in electronic structure calculations. The method combines a priori convergence rates and accurate a posteriori error estimates into an effective way of updating the energy cut-off for planewave discretizations, for both linear and nonlinear eigenvalue problems. The method is error controllable for linear eigenvalue problems in the sense that for a given required accuracy, an energy cut-off for which the solution matches the target accuracy can be reached efficiently. Further, the method is particularly promising for nonlinear eigenvalue problems in electronic structure calculations as it shall reduce the cost of early iterations in self-consistent algorithms. We present some numerical experiments for both linear and nonlinear eigenvalue problems. In particular, we provide electronic structure calculations for some insulator and metallic systems simulated with Kohn–Sham density functional theory (DFT) and the projector augmented wave (PAW) method, illustrating the efficiency and potential of the algorithm.

keywords:

Electronic structure calculation, Planewave method, Energy cut-off, A posteriori error estimate, Adaptive algorithm, Projector augmented wave method

{AMS}

65N25, 65N35, 65N15, 35Q40

1 Introduction

The electronic structure modelling of many-particle systems enables the investigation and prediction of properties of molecules and material systems and is used in different fields, such as in chemistry, materials science, biology, and nanosciences. Most of electronic structure models exhibit (possibly nonlinear) eigenvalue problems, in particular to compute the electronic ground state of a system within the Born–Oppenheimer approximation. To solve the underlying eigenvalue problems numerically, the planewave method is one of the most widely used discretization methods [37, 45, 50], and is employed in many packages, e.g. ABINIT [1], CASTEP [22], Quantum Espresso [49], and VASP [54]. By expressing the eigenfunction as a linear combination of planewave functions, this method has the following key advantages [50]. First, it is a natural discretization for periodic systems in materials science as the planewaves can match the translational symmetry of the systems perfectly. Second, the transformation from real to reciprocal space and vice versa can be done efficiently with fast Fourier transforms (FFTs), which makes the calculation of the elements of the Hamiltonian, the matrix that needs to be diagonalised, relatively simple. Third, as a spectral discretization method, the planewave method allows to achieve a spectral convergence rate when combined with pseudo-potential approximations [38].

A key issue in the planewave method is the choice of the energy cut-off, the parameter determining the number of basis functions. A larger cut-off will lead to more accurate results, but at the price of increasing the computational cost significantly. Therefore, it is important to choose an appropriate energy cut-off leading to the right accuracy while using as few planewaves as possible. Unfortunately, such a cut-off is a priori unknown. In practice, it is usually determined by testing the convergence of quantities of interest, such as the energy, the electron density, or the forces and stresses. Starting any simulation of a new system by a convergence test in order to determine a cut-off satisfying the above criterion can be quite demanding, as this requires to solve several times the same problem for different cut-offs.

The purpose of this paper is to present an elegant algorithm to determine an energy cut-off in an a posteriori way, so that the planewave approximation reaches the desired accuracy, and avoids superfluous convergence-test computations. For Schrödinger-type linear eigenvalue problems, our algorithm relies on three important aspects: (i) an a priori error estimate that gives the decay rate of the discretization error; (ii) an a posteriori error estimate that gives guaranteed and asymptotically accurate error indicators; (iii) the strategy which predicts sensible energy cut-offs during the simulation. We further incorporate this algorithm into the self-consistent field (SCF) algorithm used to solve nonlinear eigenvalue problems, in particular the Kohn–Sham equations [36, 40] in Density Functional Theory (DFT) [34, 36], so that the energy cut-off is adapted along the iterations of the algorithm.

The idea of adapting the discretization on-the-fly has long been exploited, in particular in finite element methods, which have been extensively studied for both general elliptic equations [2, 6, 7, 21, 27, 32] and eigenvalue problems in quantum physics [3, 23, 25, 26, 42, 51, 52]. In adaptive methods, an efficient and reliable a posteriori error estimator is essential, in order to indicate how to improve the basis set, e.g. by modifying the energy cut-off for planewave methods. To this end, this work will use some guaranteed a posteriori error estimates presented in [16, 17, 18]. For planewave methods, there are only a handful of works devoted to the construction of the a posteriori error estimates and adaptive methods. We refer to [30] for a posteriori error estimates and [20, 33] for adaptive algorithms. To our best knowledge, there is very few efficient implementations of adaptive planewave algorithm for quantum eigenvalue problems [30], and no existing work for simulations of real systems with Kohn–Sham equations. The difference lies in that the planewaves are non-local basis functions, and the traditional local refinement techniques used in adaptive finite element discretizations cannot be applied straightforwardly.

As a major application of our algorithm, we carry out electronic structure calculations in the framework of Kohn–Sham DFT and the projector augmented wave (PAW) method [11, 31, 38]. The PAW method is a very accurate and widely used method in electronic structure calculations, which has an elegant theoretical framework based on a linear transformation from the pseudo wave-functions to the real all-electron ones and then derives closed-form expressions for electronic densities, energy functionals, and forces in a consistent manner. This method has now been efficiently implemented in several simulation packages, including the well-known VASP and ABINIT codes, and is widely applied in materials science, condensed matter physics, and quantum chemistry. In practical PAW calculations, a proper setting of the planewave cut-off for the pseudo wave-function to ensure numerical convergence is critical for obtaining meaningful simulation results.

The rest of the article is organized as follows. In Section 2, we focus on linear Schrödinger-type equations. We briefly discuss the planewave discretization, and available a priori and a posteriori error estimates. In Section 3, we propose two strategies to construct an adaptive algorithm for linear eigenvalue problems. In Section 4, the Kohn–Sham equations and the PAW method are presented, and the adaptive algorithm for linear eigenvalue problems is built into SCF iterations for the nonlinear eigenvalue problem. In Section 5, we perform some numerical experiments to show the validity and efficiency of our algorithms for linear and nonlinear eigenvalue problems. In particular, Kohn–Sham DFT calculations with the PAW method for some bulk systems are presented, including a diamond system and a metallic FCC aluminum system with a vanishing band gap. We also provide a detailed comparison with the convergence test approach (see Section 5.2) to illustrate the efficiency and advantages of the adaptive algorithm. Finally, some concluding remarks are given in Section 6.

Throughout this paper, we shall use $C$ to denote a generic positive constant which may stand for different values at its different occurrences but is independent of finite dimensional subspaces. The symbol $\langle\cdot,\cdot\rangle$ denotes an abstract duality pairing between a Banach space and its dual, and $(\cdot,\cdot)$ denotes the inner product in the space $L^{2}_{\#}(\Omega)$ introduced in Section 2.

2 Linear eigenvalue problems

In this section, we consider a Schrödinger-type equation, review the planewave discretization and discuss available a priori and a posteriori error estimates. We clarify at this point that, only the error estimates for linear eigenvalue problems are required and exploited for the adaptive planewave algorithms developed in this paper, for both Schrödinger-type and Kohn–Sham equations.

Let $d\in\{1,2,3\}$ be the dimension of the system, $\mathbb{L}=P\mathbb{Z}^{d}\subset\mathbb{R}^{d}$ be a Bravais lattice, where $P\in\mathbb{R}^{d\times d}$ is a non-singular matrix. Let $\Omega$ be a unit cell of the lattice $\mathbb{L}$ , and $\mathbb{L}^{\prime}$ be the dual lattice of $\mathbb{L}$ . We denote by $e_{\boldsymbol{G}}({\boldsymbol{r}})=|\Omega|^{-1/2}e^{i\boldsymbol{G}\cdot\boldsymbol{r}}$ the planewave with wavevector $\boldsymbol{G}\in\mathbb{L}^{\prime}$ . The family $\{e_{\boldsymbol{G}}\}_{\boldsymbol{G}\in\mathbb{L}^{\prime}}$ forms an orthonormal basis set of $L_{\#}^{2}(\Omega)$ , where

L_{\#}^{2}(\Omega)=\{u\in L^{2}_{\rm loc}(\mathbb{R}^{d})~:~u~{\rm is}~\mathbb{L}{\rm-periodic}\}.

For all $u\in L_{\#}^{2}(\Omega)$ , we have

u({\boldsymbol{r}})=\sum_{\boldsymbol{G}\in\mathbb{L}^{\prime}}\hat{u}_{\boldsymbol{G}}e_{\boldsymbol{G}}({\bf r})\quad{\rm with}\quad\hat{u}_{\boldsymbol{G}}=(e_{\boldsymbol{G}},u)=|\Omega|^{-1/2}\int_{\Omega}u({\bf r})e^{-i\boldsymbol{G}\cdot\boldsymbol{r}}~{\rm d}{\boldsymbol{r}}.

We introduce the Sobolev spaces of real-valued $\mathbb{L}$ -periodic functions for $s\in\mathbb{R}$

H_{\#}^{s}(\Omega)=\left\{u(\boldsymbol{r})=\sum_{\boldsymbol{G}\in\mathbb{L}^{\prime}}\hat{u}_{\boldsymbol{G}}e_{\boldsymbol{G}}({\boldsymbol{r}}):\forall~{\boldsymbol{G}}\in\mathbb{L}^{\prime},\hat{u}_{-\boldsymbol{G}}=\hat{u}^{*}_{\boldsymbol{G}},\sum_{{\boldsymbol{G}}\in\mathbb{L}^{\prime}}\big{(}1+|\boldsymbol{G}|^{2}\big{)}^{s}|\hat{u}_{\boldsymbol{G}}|^{2}<\infty\right\}.

2.1 Schrödinger-type equations

We consider a Schrödinger-type linear eigenvalue problem: Find $(\lambda_{i},\varphi_{i})\in\mathbb{R}\times H^{1}_{\#}(\Omega)$ , such that $\|\varphi_{i}\|_{L_{\#}^{2}(\Omega)}=1$ and

(1)

\left(-\Delta+V(\boldsymbol{r})\right)\varphi_{i}(\boldsymbol{r})=\lambda_{i}\varphi_{i}(\boldsymbol{r})\qquad i=1,2,\cdots,

where the potential $V\in C^{\infty}(\mathbb{R}^{d})$ and is $\mathbb{L}$ -periodic. The smoothness assumption on the potential $V$ is reasonable, since the planewave discretization is in practice often combined with smooth pseudo-potentials. It implies in particular that $V$ is bounded below.

The weak form of Eq. 1 reads

(2)

a(\varphi_{i},v)=\lambda_{i}(\varphi_{i},v)\quad\forall~v\in H^{1}_{\#}(\Omega)\qquad i=1,2,\cdots

with the bilinear form $a(\cdot,\cdot):H^{1}_{\#}(\Omega)\times H^{1}_{\#}(\Omega)\rightarrow\mathbb{R}$ given by

(3)

\displaystyle a(u,v)=\int_{\Omega}\nabla u\cdot\nabla v+\int_{\Omega}Vuv.

Since the operator $A:=-\Delta+V$ is self-adjoint, bounded below with compact resolvent, there exists a non-decreasing sequence of eigenvalues $\lambda_{1}<\lambda_{2}\leq\cdots\leq\lambda_{j}\rightarrow+\infty$ , where the $\lambda_{j}$ ’s are repeated according to their multiplicity.

Up to shifting the operator by a positive constant, e.g. $\displaystyle\max\big{\{}1-\inf_{\boldsymbol{r}\in\Omega}V(\boldsymbol{r}),0\big{\}}$ , we can further assume that $V(\boldsymbol{r})\geq 1~(\forall~\boldsymbol{r}\in\Omega)$ , which ensures that all eigenvalues of $A$ are positive. Note that such a shifting has no impact on the eigenfunctions of the problem. Then it holds

a(v,v)\geq\|v\|_{H^{1}_{\#}(\Omega)}^{2}\qquad\forall~v\in H^{1}_{\#}(\Omega),

and we can define the corresponding energy norm by $\|\cdot\|_{a}:=\sqrt{a(\cdot,\cdot)}$ .

2.2 Planewave discretization

Given an energy cut-off $E_{\rm c}>0$ , we define the following finite dimensional space

(4)

X_{E_{\rm c}}(\Omega):=\left\{\sum_{\boldsymbol{G}\in\mathbb{L}^{\prime},~|\boldsymbol{G}|^{2}\leq 2E_{\rm c}}c_{\boldsymbol{G}}e_{\boldsymbol{G}}({\boldsymbol{r}}),~\forall~\boldsymbol{G},~c_{-\boldsymbol{G}}=c^{*}_{\boldsymbol{G}}\right\}.

For $v\in H^{s}_{\#}(\Omega)$ , the best approximation of $v$ in $X_{E_{\rm c}}(\Omega)$ is

\displaystyle\Pi_{E_{\rm c}}v=\sum_{\boldsymbol{G}\in\mathbb{L}^{\prime},~|\boldsymbol{G}|^{2}\leq 2E_{\rm c}}\hat{v}_{\boldsymbol{G}}e_{\boldsymbol{G}}({\bf r})

for any $H^{t}_{\#}$ -norm ( $t\leq s$ ). The more regularity $v$ has, the faster this truncated series converge to $v$ : For $s,t\in\mathbb{R}_{+}$ satisfying $t\leq s$ , we have that for each $v\in H^{s}_{\#}(\Omega)$ (see e.g. [19]),

(5)

\|v-\Pi_{E_{\rm c}}v\|_{H^{t}_{\#}(\Omega)}=\min_{w\in X_{E_{\rm c}}}\|v-w\|_{H^{t}_{\#}(\Omega)}\lesssim E_{\rm c}^{(t-s)/2}\|v\|_{H^{s}_{\#}(\Omega)}.

For simplicity of notation, we suppress for now the index for the considered eigenpair in the eigenvalue problem. We rewrite the linear problem Eq. 2 as

(6)

a(\varphi,v)=\lambda(\varphi,v)\qquad\forall~v\in H^{1}_{\#}(\Omega),

and present available a priori and a posteriori error estimates. We note that the following discussions are not restricted to a specific eigenpair, but require a gap between the considered eigenvalue and the surrounding ones.

Given an energy cut-off $E_{\rm c}$ , the planewave discretization is a Galerkin approximation of the eigenpairs of Eq. 6 within the finite dimensional subspace $X_{E_{\rm c}}(\Omega)\subset H_{\#}^{1}(\Omega)$ : Find $(\lambda_{E_{\rm c}},\varphi_{E_{\rm c}})\in\mathbb{R}\times X_{E_{\rm c}}(\Omega)$ , such that $\|\varphi_{E_{\rm c}}\|_{L^{2}_{\#}(\Omega)}=1$ and

(7)

a(\varphi_{E_{\rm c}},v)=\lambda_{E_{\rm c}}(\varphi_{E_{\rm c}},v)\qquad\forall~v\in X_{E_{\rm c}}(\Omega).

2.3 A priori error estimate

An a priori estimate for the solutions $(\lambda_{E_{\rm c}},\varphi_{E_{\rm c}})$ of Eq. 7 can be found in [43, Theorems 2 and 3]. There exists a constant $C>0$ independent of $E_{\rm c}$ , such that

(8)			$\displaystyle\\|\varphi-\varphi_{E_{\rm c}}\\|_{H^{1}_{\#}(\Omega)}\leq C\inf_{\psi_{E_{\rm c}}\in X_{E_{\rm c}}}\\|\varphi-\psi_{E_{\rm c}}\\|_{H^{1}_{\#}(\Omega)}\qquad{\rm and}$
(9)			$\displaystyle\|\lambda-\lambda_{E_{\rm c}}\|\leq C\\|\varphi-\varphi_{E_{\rm c}}\\|_{H^{1}_{\#}(\Omega)}^{2}.$

Hence, if the eigenfunction $\varphi$ is analytic, which we will assume in the following, then the approximation error decays exponentially as (see [19, Section 5.4, Eq. (5.4.5)])

(10)			$\displaystyle\\|\varphi-\varphi_{E_{\rm c}}\\|_{H^{1}_{\#}(\Omega)}\leq Ce^{-c\sqrt{E_{\rm c}}}\qquad{\rm and}$
(11)			$\displaystyle\big{\|}\lambda-\lambda_{E_{\rm c}}\big{\|}\leq Ce^{-\tilde{c}\sqrt{E_{\rm c}}},$

with the constants $C,c,\tilde{c}>0$ independent of $E_{\rm c}$ . The above a priori error estimates not only guarantee that the error goes to zero as the energy cut-off increases, but also offer the speed of convergence of the approximate solutions to the exact one.

We observe from the numerical experiments (see Fig. 2 in Section 5.1) that for the smooth tested potentials, the planewave approximation errors decay exponentially fast with respect to $\sqrt{E_{\rm c}}$ , which matches the a priori error estimates in Eqs. 10 and 11.

In Section 3, we will combine the a priori bounds Eqs. 10 and 11 with the a posteriori bounds presented below in a linear regression strategy for determining good energy cut-offs.

Remark 2.1.

Analogous to the linear eigenvalue problem Eq. 6, some optimal a priori error estimates for nonlinear eigenvalue problems can be obtained [12, 13, 24].

2.4 A posteriori error estimate

To develop an adaptive method, it is very useful to have a posteriori error estimators available. Unlike a priori error estimates, the goal of the a posteriori error estimates is to provide a computable bound on the error between the numerical approximation of the solution and the unknown exact solution. This bound should therefore be computable with the knowledge of the approximate solution and the parameters used for the computation only.

In this section, we review an a posteriori error estimator for the planewave approximation of the linear eigenvalue problem Eq. 1, which gives computable and asymptotically accurate approximate errors. This a posteriori error indicator was first introduced for simple eigenvalues for conforming finite element discretizations in [16], nonconforming methods in [17], and then extended to eigenvalue clusters in [18].

Let us denote $H^{1}_{\#}(\Omega)$ by $H$ and its dual by $H^{\prime}$ for simplicity of notations. Let $(\lambda,\varphi)\in\mathbb{R}\times H$ be an eigenpair of Eq. 6 and $(\lambda_{E_{\rm c}},\varphi_{E_{\rm c}})\in\mathbb{R}\times X_{E_{\rm c}}(\Omega)$ be the corresponding planewave approximation with a given energy cut-off $E_{\rm c}$ . We define the residual ${\rm Res}(\lambda_{E_{\rm c}},\varphi_{E_{\rm c}})\in H^{\prime}$ by

(12)

\big{\langle}{\rm Res}(\lambda_{E_{\rm c}},\varphi_{E_{\rm c}}),v\big{\rangle}_{H^{\prime},H}=a\big{(}\varphi_{E_{\rm c}},v\big{)}-\lambda_{E_{\rm c}}\big{(}\varphi_{E_{\rm c}},v\big{)}\qquad\forall~v\in H.

The idea of residual-based a posteriori error estimation is to estimate the approximation error by the dual norm of the residual in an appropriate Hilbert space. Note that the dual norm must be well chosen, otherwise the norm of the residual might not accurately represent the error or not even go to zero when the error goes to zero. To construct the a posteriori error estimator for the approximation $(\lambda_{E_{\rm c}},\varphi_{E_{\rm c}})$ , we consider the dual norm of the residual

(13)

\eta(\lambda_{E_{\rm c}},\varphi_{E_{\rm c}}):=\|{\rm Res}(\lambda_{E_{\rm c}},\varphi_{E_{\rm c}})\|_{a^{\prime}},

where the dual norm (with respect to the energy norm $\|\cdot\|_{a}$ ) is defined for $f\in H^{\prime}$ by

\displaystyle\|f\|_{a^{\prime}}:=\sup_{v\in H,~v\neq 0}\frac{\langle f,v\rangle_{H^{\prime},H}}{\|v\|_{a}}.

For simplicity of notation, we will denote the residual and its dual norm by

{\rm Res}_{E_{\rm c}}={\rm Res}(\lambda_{E_{\rm c}},\varphi_{E_{\rm c}})\qquad{\rm and}\qquad\eta_{E_{\rm c}}=\eta(\lambda_{E_{\rm c}},\varphi_{E_{\rm c}})

in the following. We have from [18] that if $\lambda$ is a simple eigenvalue, there exists $\alpha(E_{\rm c})$ with $\alpha(E_{\rm c})\rightarrow 0$ as $E_{\rm c}\rightarrow\infty$ such that

(14)

\displaystyle 0\leq\lambda_{E_{\rm c}}-\lambda\leq\big{(}1+\alpha(E_{\rm c})\big{)}\eta_{E_{\rm c}}^{2}.

Remark 2.2.

So far, we have ignored the index $j$ for the eigenpairs for simplicity of notation, but the error indicator for a specific eigenpair with index $j$ can also be denoted by $\eta_{E_{\rm c},j}$ for completeness. Thus, when considering multiple eigenpairs, the definition of $\eta_{E_{\rm c}}$ can be extended by taking the sum

\eta_{E_{\rm c}}:=\left(\sum_{j\in J}\eta_{E_{\rm c},j}^{2}\right)^{1/2},

where $J$ is the set of indices of eigenpairs under consideration. Moreover, if the quantity of interest is not the sum of the eigenvalues, but a weighted sum, which happens to be the case in practice when fractional occupation numbers are in use, one can obtain error indicators on this quantity of interest by introducing corresponding weights in front of the individual error indicators $\eta_{E_{\rm c},j}^{2}$ , as defined in Section 4.2.

Remark 2.3.

The above a posteriori error estimate can be directly generalized to linear operators with non-local potentials (see e.g. Eq. 21).

To evaluate the a posteriori error indicator Eq. 13, one needs to solve the infinite dimensional linear system Eq. 12 to compute the residual. A practical evaluation method is to restrict the trial function space and test function space in Eq. 12 to a finite dimensional subspace $X_{E_{\rm g}}(\Omega)$ (see definition Eq. 4) with some cut-off $E_{\rm g}\gg E_{\rm c}$ . We will then denote the computable error indicator by $\eta_{E_{\rm c}}^{E_{\rm g}}$ , whose detailed calculations are presented in Appendix A. If $E_{\rm g}$ is proportional to $E_{\rm c}$ , then a standard calculation by solving the corresponding finite dimensional linear system will require a cost of $\mathcal{O}(E_{\rm c}^{d/2}\ln E_{\rm c})$ , and we refer to Section A.1 for the explanations.

In order to reduce the cost of the error indicator, we can alternatively evaluate the residual via a perturbation-based method. The resulting error indicator will be denoted by $\eta_{E_{\rm c}}^{E_{\rm g},[k]}$ (with $k=1,2$ the order of perturbation), whose cost is reduced by avoiding the resolution of a linear system. We refer to Section A.2 for the derivation of this method and discussion of its computational cost.

3 Adaptive algorithm for linear problems

We are now ready to construct the adaptive algorithm for Schrödinger-type equations. The algorithm repeats the following procedure until the required accuracy is reached:

(a)

given the energy cut-off $E_{\rm c}$ , solve Eq. 7 to obtain the approximate eigenpair $(\lambda_{E_{\rm c}},\varphi_{E_{\rm c}})$ ;
(b)

compute the a posteriori error estimator $\eta_{E_{\rm c}}^{E_{\rm g}}$ (or $\eta_{E_{\rm c}}^{E_{\rm g},[k]}$ ) from the approximation $(\lambda_{E_{\rm c}},\varphi_{E_{\rm c}})$ and a given $E_{\rm g}$ (or $E_{\rm g}$ and $k$ );
(c)

use some strategy to determine the new energy cut-off $E_{\rm c}^{\rm opt}$ .

The step (a) is standard and the step (b) has been discussed in the previous section. To determine the energy cut-offs during the iterations, i.e. step (c), we propose two strategies. Note that to initialize the algorithm, we choose a relatively small energy cut-off at the first step.

The first strategy relying on the known a priori spectral convergence rate (in Section 2.3) and the a posteriori error estimate (in Section 2.4) is presented in Strategy A. The idea is to perform a linear regression to predict the energy cut-off leading to a required accuracy. As an input for the linear regression, we use the $l$ -th energy cut-offs and the a posteriori error indicators $\big{\{}E_{\rm c}^{(i)},\eta_{E_{\rm c}^{(i)}}^{E_{\rm g}}({\rm or}\ \eta_{E_{\rm c}^{(i)}}^{E_{\rm g},[k]})\big{\}}_{0\leq i\leq l}$ . We show in Fig. 1 a schematic plot of Strategy A.

Strategy A: Linear regression

0: energy tolerance

\varepsilon

, energy cut-offs and error estimators

\big{\{}E_{\rm c}^{(i)},\eta_{E_{\rm c}^{(i)}}^{E_{\rm g}}\big{\}}_{0\leq i\leq l}

Perform a linear regression (LR) to compute the linear function

(15)

\Theta(x):={\textbf{L}\textbf{R}}\left[\left\{\sqrt{E_{\rm c}^{(i)}},\log\big{(}\eta_{E_{\rm c}^{(i)}}^{E_{\rm g}}\big{)}\right\}_{0\leq i\leq l}\right](x).

2.

Solve the following linear equation to obtain $\bar{E}$

$\Theta(\bar{E})=\frac{1}{2}\log\varepsilon.$

0: Output energy cut-off

E_{\rm c,A}^{(l+1)}=\bar{E}

In Eq. 15, LR gives a linear function $\Theta(x)=ax+b$ , where the coefficients $a$ and $b$ are determined by least squares as follows

(16)

\min_{a,b}\left\{\sum_{0\leq i\leq l}\Big{|}a\sqrt{E_{\rm c}^{(i)}}+b-\log\big{(}\eta_{E_{\rm c}^{(i)}}^{E_{\rm g}}\big{)}\Big{|}^{2}\right\}.

We observe from the numerical results in Section 5 that the strategy based on linear regression appears to be sharp in many cases, but sometimes overestimates the optimal energy cut-offs due to the fluctuation of the errors.

A second strategy is presented in Strategy B below. This strategy is analog to the Dörlfer strategy, which is a widely used method in adaptive finite elements methods. In this method, we define an error indicator $\eta^{E_{g}}_{E_{c}^{(l)}}(E)$ composed of all contributions in $\eta^{E_{g}}_{E_{c}^{(l)}}$ from all wave vectors $\boldsymbol{G}$ in the reciprocal space $E_{\rm c}\leq\frac{1}{2}|{\boldsymbol{G}}|^{2}\leq E$ , i.e.

\eta^{E_{g}}_{E_{c}^{(l)}}(E)=\left(\sum_{{\boldsymbol{G}}\in\mathbb{L}^{\prime},~E_{\rm c}\leq\frac{1}{2}|{\boldsymbol{G}}|^{2}\leq E}\left|\widehat{\eta^{E_{g}}_{E_{c}^{(l)},\boldsymbol{G}}}\right|^{2}\right)^{1/2}.

More details are given in Appendix A.2, more precisely, in Eq. 34 and Eq. 39. The energy cut-off $E$ is then increased until sufficiently many $\boldsymbol{G}$ ’s are included so that the chosen tolerance is achieved.

Strategy B: Error reduction

0: eigenvalue tolerance

\varepsilon>0

, energy cut-off

E_{\rm c}^{(l)}

and error estimator

\eta_{E_{\rm c}^{(l)}}^{E_{\rm g}}

Find $\bar{E}>E_{\rm c}^{(l)}$ such that $\bar{E}$ is the minimal value satisfying

(17) $\eta_{E_{\rm c}^{(l)}}^{E_{\rm g}}(\bar{E})\geq\sqrt{\left(\eta_{E_{\rm c}^{(l)}}^{E_{\rm g}}\right)^{2}-\varepsilon}.$

0: Output energy cut-off

E_{\rm c,B}^{(l+1)}=\bar{E}

We show in Fig. 1 a schematic plot of Strategy B. In the picture, the red and blue circles represent the cut-offs $E_{\rm c}$ and $E_{\rm g}$ respectively, and the a posteriori error estimate consists of the sum over all reciprocal lattice vectors that lie between these two circles. Strategy B determines the black circle, which is obtained so that the planewave vectors lying between the red and black ones contribute to “most of the error” up to some given tolerance. In practice, Eq. 17 can be solved by using the bi-section or golden-section method to obtain $\bar{E}$ . We refer to [20, 44] for discussions related to the Dörlfer strategy for planewave approximations, in which no practical algorithm was implemented.

Refer to caption — Figure 1: Strategy A (left) and B (right) for determining the energy cut-offs.

From a practical point of view, it is more stable and reliable to use a combination of the above two strategies, which we describe in Algorithm 1 below. In terms of strategy, taking the minimum between the cutoffs obtained from Strategy A and B (as in Eq. 18 below) is expected to minimize the risk of overestimating the new energy cut-off. Alternatively, one could use ‘max’ in Eq. 18 instead of ‘min’ to minimize the number of iteration steps, and hence the number of calls to the linear eigensolver in the algorithm.

Algorithm 1: Adaptive planewave method for linear eigenvalue problems

0: eigenvalue tolerance

\varepsilon>0

, initial energy cut-off

E_{\rm c}^{(0)}

E_{\rm g}\gg 1

and

l=0

1.

Solve Eq. 7 within $X_{E_{\rm c}^{(l)}}$ to obtain the planewave approximation $\big{(}\lambda_{E_{\rm c}^{(l)}},\varphi_{E_{\rm c}^{(l)}}\big{)}$ .
2.

Compute the a posteriori error estimator defined by Eq. 33 or Eq. 38.
3.

If $\left(\eta_{E_{\rm c}^{(l)}}^{E_{\rm g}}\right)^{2}<\varepsilon$ , then stop and return the planewave approximation. Else, goto 4.
4.

If $l=0$ , then use Strategy B to get $E_{\rm c}^{(l+1)}=E_{\rm c,B}^{(l+1)}$ , let $l=l+1$ and goto 2.

Else, use Strategy A to compute $E_{\rm c,A}^{(l+1)}$ and Strategy B to compute $E_{\rm c,B}^{(l+1)}$ . Take

(18) $E_{\rm c}^{(l+1)}=\min\left\{E_{\rm c,A}^{(l+1)},E_{\rm c,B}^{(l+1)}\right\}$

and $l=l+1$ , goto 1.

\left(\lambda_{E_{\rm c}^{(l)}},\varphi_{E_{\rm c}^{(l)}}\right)

Note that the computational cost for both error indicators and Strategy B increase with respect to the cut-off $E_{\rm g}$ , but we can adjust $E_{\rm g}$ during the adaptive algorithm according to the energy cut-off $E_{\rm c}$ to avoid using very large $E_{\rm g}$ throughout the algorithm.

Let us now comment on the computational cost of the algorithm, assuming that we need to compute $N_{\rm b}$ eigenvalues. At each step of Algorithm 1, the cost for solving the eigenvalue problem is $\mathcal{O}(N_{\rm b}^{2}E_{\rm c}^{d/2})$ , the cost for evaluating the error indicators is $\mathcal{O}(N_{\rm b}E_{\rm g}^{d/2}\ln E_{\rm g})$ . Therefore, if $E_{\rm g}$ stays relatively small and comparable to $E_{\rm c}$ then the overall computational costs of Algorithm 1 is $\mathcal{O}(N_{\rm b}^{2}\bar{E}_{\rm c}^{d/2})$ , where $\bar{E_{\rm c}}$ is the optimal energy cut-off determined by the algorithm. Hence, the cost of the whole procedure is still dominated by the linear eigensolver. Compared to a classical convergence test, this procedure does not require user’s experience to find a good energy cut-off.

4 Applications to nonlinear eigenvalue problems

In this section, we will introduce an adaptive planewave method for nonlinear eigenvalue problems. We will first briefly review the Kohn–Sham equations, a widely used model in electronic structure calculations as well as its formulation in the framework of the projector augmented wave (PAW) method. We will then discuss how the adaptive algorithms developed for linear eigenvalue problems can be generalized to these nonlinear eigenvalue problems. We mention that only the error estimates for linear eigenvalue problems are required for the algorithms presented in this section.

4.1 Kohn–Sham equations

In the ab-initio quantum mechanical modeling of many electron systems, the Kohn–Sham density functional theory (DFT) [34, 36] has established itself as one of the most powerful electronic structure methods due to its good balance between accuracy and computational cost.

We consider a many-particle system (with $N_{\rm atom}$ nuclei and $N_{\rm e}$ electrons) in the spin-less and non-relativistic setting. The Kohn–Sham ground state solution can be obtained by solving the following Kohn–Sham equations [40]: Find $(\lambda_{i},\varphi_{i})\in\mathbb{R}\times H_{\#}^{1}(\Omega),~i=1,\cdots,N_{\rm b}$ , such that $\int_{\Omega}\varphi_{i}\varphi_{j}=\delta_{ij}$ and

(19)

H[\rho]\varphi_{i}=\lambda_{i}\varphi_{i},

where $N_{\rm b}$ is the number of eigenpairs/bands under consideration, $\{\lambda_{i}\}_{i=1}^{N_{\rm b}}$ are the $N_{\rm b}$ lowest eigenvalues, $\rho=\sum_{i=1}^{N_{\rm b}}f_{i}|\varphi_{i}|^{2}$ is the electron density with $f_{i}$ the occupancy number, and

(20)

\displaystyle H[\rho]=-\frac{1}{2}\Delta+V_{\rm{eff}}[\rho]\qquad{\rm and}\qquad V_{\rm{eff}}[\rho]=v_{\rm ps}+v_{\rm{H}}[\rho]+v_{\rm{xc}}[\rho]

with $V_{\rm{eff}}[\rho]$ the effective potential, $v_{\rm{ps}}$ the pseudo-potential generated by the nuclei and core electrons, $v_{\rm{H}}[\rho]$ and $v_{\rm{xc}}[\rho]$ the so-called Hartree potential and exchange-correlation potential, respectively. The occupancy numbers $f_{i}$ can be chosen by smearing methods like the Fermi–Dirac or the Methfessel–Paxton scheme.

We shall now give more details on the potentials used to define $V_{\rm eff}$ . In the norm-conserving pseudo-potential setting [40], the pseudo-potential $v_{\rm{ps}}$ can be written as

v_{\rm{ps}}=v_{\rm loc}+v_{\rm nl},

where $v_{\rm loc}:\mathbb{R}^{d}\rightarrow\mathbb{R}$ is the local part, $v_{\rm nl}$ is a non-local operator given by

(21)

\big{(}v_{\rm nl}\phi\big{)}({\bf r})=\sum_{I=1}^{N_{\rm atom}}\sum_{j=1}^{M_{\rm ps}}(\phi,\xi_{I,j})\xi_{I,j}({\bf r})\qquad\forall~\phi\in L_{\#}^{2}(\Omega)

with $M_{\rm ps}\in\mathbb{Z}_{+}$ and $\xi_{I,j}~(I=1,\cdots,N_{\rm atom},~j=1,\cdots,M_{\rm ps})\in L_{\#}^{2}(\Omega)$ being sufficiently smooth functions in $\mathbb{R}^{d}$ . In the context of ultra-soft pseudo-potentials (USPP) or PAW methods [39, 53], the non-local parts are formulated by

(22)

\big{(}v_{\rm nl}\varphi\big{)}({\bf r})=\sum_{I=1}^{N_{\rm atom}}\sum_{n,m=1}^{M_{\rm ps}}D_{nm}^{I}(\beta_{I,m},\varphi)\beta_{I,n}({\bf r})\qquad\forall~\varphi\in L_{\#}^{2}(\Omega),

where $D_{nm}^{I}$ depends on the eigenfunctions and need to be updated during the SCF loop. For more details on the PAW method, which will be the model used in practice for the numerical experiments in Section 5, see Appendix B.

In periodic systems, $v_{\rm{H}}[\rho]$ represents the $\mathbb{L}$ -periodic Coulomb potential generated by the $\mathbb{L}$ -periodic electron density $\rho$

v_{\rm{H}}[\rho]({\bf r})=4\pi\sum_{{\boldsymbol{G}}\in\mathbb{L}^{\prime}\backslash\{\bf 0\}}|\boldsymbol{G}|^{-2}\hat{\rho}_{\boldsymbol{G}}e_{\boldsymbol{G}}({\boldsymbol{r}}).

Finally, for the exchange-correlation potential, we use a generalized gradient approximation (GGA) [46] in our numerical experiments.

Problem Eq. 19 is a nonlinear eigenvalue problem since the operator $H[\rho]$ depends on the electron density $\rho$ , and hence on the eigenfunctions $\{\varphi_{i}\}_{i=1}^{N_{\rm b}}$ . A self-consistent field (SCF) iteration algorithm [40] is commonly resorted to for this kind of nonlinear eigenvalue problem. At each iteration, a new Hamiltonian is constructed from a trial electronic state and a linear eigenvalue problem, in the form of Schrödinger-type equation Eq. 1, is then solved to obtain the low-lying eigenvalues and corresponding eigenvectors.

Once the ground state solution of Kohn–Sham equations have be obtained, we can evaluate the band structure energy $E_{\textrm{bands}}\equiv\sum_{i=1}^{N_{b}}f_{i}\lambda_{i}$ as a weighted sum of the eigenvalues by occupation numbers. When we consider the $k$ -point sampling, the integration over the Brillouin zone also needs to be included.

In the first iterations of the self-consistent field algorithm, the iterates are far form the exact solution, so that the error coming from the iterative solver is dominant. But close to self-consistency, the choice of the basis set becomes important, as the basis set size will ultimately determine the quality of the final approximation. Therefore, an efficient planewave SCF algorithm should be able to adjust the energy cut-off on the fly, so that no computational resource is wasted while the required accuracy can be reached at the end of the iterations. We refer to [30] for a similar discussion of error balancing for the Gross-Pitaevskii equation.

4.2 Adaptive algorithm for nonlinear problems

We now discuss the construction of error indicators and the adaptive algorithm for the nonlinear Kohn–Sham equations Eq. 19. As mentioned in Section 4.1, an SCF iteration algorithm is used to solve the nonlinear eigenvalue problem, in which a linear Schrödinger-type equation is solved at each step. Therefore, we can incorporate Algorithm 1 (for linear problems) into each SCF iteration, and thus adapt the energy cut-off on the fly with some well-chosen tolerance, corresponding to the self-consistency error. The use of the self-consistency error as the tolerance for adapting the cut-off is based on the following: when the solution is far from self-consistency, one can choose a relative small energy cut-off for the linearized eigenvalue problem; when the iteration is close to self-consistency, one has to use a large energy cut-off to obtain accurate eigenpairs.

Let $\rho_{\rm in}^{(m)}$ and $\rho_{\rm out}^{(m)}$ represent the input and output electron densities at the $m$ -th step of the SCF algorithm. At the $m$ -th step, a linearized eigenvalue problem, with the trial input electron density $\rho_{\rm in}^{(m)}$ is solved

(23)

\displaystyle H\big{[}\rho_{\rm in}^{(m)}\big{]}\varphi_{i}^{(m)}=\lambda_{i}^{(m)}\varphi_{i}^{(m)}\qquad i=1,2,\cdots N_{\rm b}

with some energy cut-off $E_{\rm c}^{(m)}$ . Then an output electron density $\rho_{\rm out}^{(m)}$ is obtained from the eigenfunctions $\big{\{}\varphi_{i}^{(m)}\big{\}}_{i=1}^{N_{\rm b}}$ . The self-consistency error is defined by

(24)

\eta_{\rm SCF}^{(m)}:=\alpha^{1/2}\|\rho_{\rm in}^{(m)}-\rho_{\rm out}^{(m)}\|_{L^{2}(\Omega)}\qquad{\rm for~some}~\alpha>0.

The parameter $\alpha$ in Eq. 24 is set to control the ratio between discretization error and self-consistency error in the total error. In our numerical simulations, $\alpha$ is set empirically: we take $\alpha$ to be the ratio between the tolerances of the discretization error and self-consistency error. Note that a good choice of $\alpha$ could be obtained by a careful error analysis of the nonlinear eigenvalue problems, estimating explicit optimal weights (in the total error) of the discretization error and self-consistent error. This will be investigated in future work.

The planewave discretization error indicator is defined with respect to the linear eigenvalue problem Eq. 23, which involves all $N_{\rm b}$ eigenpairs under consideration, as discussed in Remark 2.2. In analogy with definition Eq. 13, we define the discretization error indicator as

(25)

\eta_{E_{\rm c}}^{(m)}:=\left(\sum_{i=1}^{N_{\rm b}}f_{i}\|{\rm Res}^{(m)}_{i}\|_{a^{\prime}}^{2}\right)^{1/2}

where $f_{i}$ is the occupancy number, $\|\cdot\|_{a^{\prime}}$ is the dual norm corresponding to the linear operator

A^{(m)}:=H\big{[}\rho_{\rm in}^{(m)}\big{]},

and ${\rm Res}^{(m)}_{i}$ is the residual for the $i$ -th eigenpair at the $m$ -th step

(26)

{\rm Res}^{(m)}_{i}:=H\big{[}\rho_{\rm in}^{(m)}\big{]}\varphi_{i}^{(m)}-\lambda_{i}^{(m)}\varphi_{i}^{(m)}~\in~H^{\prime}.

With a given $E_{\rm g}\gg E_{\rm c}^{(m)}$ , we define the computable discretization error indicator by

(27)

\eta_{E_{\rm c}}^{E_{\rm g},(m)}:=\sum_{i=1}^{N_{\rm b}}f_{i}\left(\sum_{{\boldsymbol{G}}\in\mathbb{L}^{\prime},~E_{\rm c}\leq\frac{1}{2}|{\boldsymbol{G}}|^{2}\leq E_{\rm g}}\widehat{{\rm Res}_{i}^{(m)}}^{E_{\rm g}}_{\boldsymbol{G}}\cdot\widehat{A^{(m),-1}{\rm Res}_{i}^{(m)}}^{E_{\rm g}}_{\boldsymbol{G}}\right)^{\frac{1}{2}}

based on the standard calculation Eq. 33, and for $k=1,2$ .

(28)

\displaystyle\eta_{E_{\rm c}}^{E_{\rm g},(m),[k]}:=\sum_{i=1}^{N_{\rm b}}f_{i}\left(\sum_{{\boldsymbol{G}}\in\mathbb{L}^{\prime},~E_{\rm c}\leq\frac{1}{2}|{\boldsymbol{G}}|^{2}\leq E_{\rm g}}\widehat{{\rm Res}_{i}^{(m)}}^{E_{\rm g}}_{\boldsymbol{G}}\cdot\widehat{A^{(m),-1,[k]}{\rm Res}_{i}^{(m)}}^{E_{\rm g}}_{\boldsymbol{G}}\right)^{\frac{1}{2}},

based on a perturbation method Eq. 38.

Combining the SCF algorithm and Algorithm 1 for linear eigenvalue problems, we propose the following adaptive algorithm for nonlinear eigenvalue problems.

Algorithm 2: Adaptive planewave method for nonlinear eigenvalue problems

0: tolerance

\delta>0

, initial energy cut-off

E_{\rm c}^{(1)}

E_{\rm g}\gg 1

m=1

and

\rho^{(0)}\in L^{2}_{\#}(\Omega)

1.

Solve Eq. 23 with the trial electron density $\rho^{(m-1)}$ within $X_{E_{\rm c}^{(m)}}$ to obtain the planewave approximations $\Big{\{}(\lambda^{(m)}_{j,E_{\rm c}^{(m)}},\varphi^{(m)}_{j,E_{\rm c}^{(m)}})\Big{\}}_{j=1}^{N_{\rm b}}$ . Compute the error indicator $\eta_{\rm SCF}^{(m)}$ .
2.

Compute the error indicator $\eta_{E_{\rm c}^{(m)}}^{E_{\rm g},(m)}$ (or $\eta_{E_{\rm c}^{(m)}}^{E_{\rm g},(m),[k]}$ ).
3.

If $\left(\eta_{\rm SCF}^{(m)}\right)^{2}<\delta$ and $\left(\eta_{E_{\rm c}^{(m)}}^{E_{\rm g},(m)}\right)^{2}<\delta$ , then stop and return the approximations. Else, goto 4.
4.

If $\left(\eta_{E_{\rm c}^{(m)}}^{E_{\rm g},(m)}\right)^{2}<\left(\eta_{\rm SCF}^{(m)}\right)^{2}$ , then goto 5.

Else, use Algorithm 1 to solve Eq. 23 with tolerance $\varepsilon=\left({\eta_{\rm SCF}^{(m)}}\right)^{2}$ , let $E_{\rm c}^{(m)}$ be the final energy cut-off of Algorithm 1, goto 5.
5.

Let $\displaystyle\rho^{(m)}$ be the electron density from $\Big{\{}\varphi^{(m)}_{j,E_{\rm c}^{(m)}}\Big{\}}_{i=1}^{N_{\rm b}}$ and $m=m+1$ , goto 1.

0: Approximate eigenfunctions and eigenvalues

\Big{\{}(\lambda^{(m)}_{j,E_{\rm c}^{(m)}},\varphi^{(m)}_{j,E_{\rm c}^{(m)}})\Big{\}}_{j=1}^{N_{\rm b}}

In Step 4 of Algorithm 2, by taking $\left(\eta_{\rm SCF}^{(m)}\right)^{2}$ as the tolerance for planewave discretizations, we are actually estimating the total energy error of the full nonlinear eigenvalue problem by the sum $\left(\eta_{E_{\rm c}}^{(m)}\right)^{2}+\left(\eta_{\rm SCF}^{(m)}\right)^{2}$ . We note that it could be possible to construct some a posteriori error estimate for the nonlinear eigenvalue problems directly (see [14, 23, 25, 30]). However, we will not use or discuss this type of estimates in this work.

In Step 5 of Algorithm 2, one can exploit a charge mixing scheme to ensure or accelerate the convergence of the SCF algorithm. The charge mixing in our implementation is performed in the Fourier space. Assume that we need to mix two charge densities $\rho_{1}$ and $\rho_{2}$ with cut-offs $E_{\rm c}^{1}$ and $E_{\rm c}^{2}$ , respectively, and $E_{\rm c}^{2}>E_{\rm c}^{1}$ . We simply fill the rest of planewave components of $\rho_{1}$ by zeros to expand it to the same grid as $\rho_{2}$ and then carry out the mixing. The additional computational cost of this implementation is negligible.

The most important feature of Algorithm 2 is that it is adaptive. It automatically determines a good discretization parameter, in our case the energy cut-off, on the fly to achieve the required accuracy. Moreover, compared to the conventional SCF iterations, Algorithm 2 avoids solving large linear eigenvalue problems at all iteration steps. Since the energy cut-offs are determined on the fly and increased during the SCF iterations, computational cost can be saved from that one only needs to solve small eigenvalue problems while the solution is still away from convergence. The extra cost of this adaptive algorithm comes from calculating the error indicator, which scales as $\mathcal{O}(N_{\rm b}E_{\rm g}^{d/2}\ln E_{\rm g})$ . Since the cost for one linear eigensolver is $\mathcal{O}(N_{\rm b}^{2}E_{\rm c}^{d/2})$ and $E_{\rm g}$ is proportional to $E_{\rm c}$ , we have that the additional cost for calculating the error indicator is way smaller than the cost saved from the linear eigensolvers used with small energy cut-offs.

5 Numerical experiments

In this section, we will first test our adaptive algorithms on simple linear and nonlinear eigenvalue problems, and then show the performance on computing the electronic structure of typical bulk systems using the PAW method. The approximation errors are calculated with respect to the reference solutions obtained with a sufficiently large energy cut-off.

5.1 Tests of principle

A linear eigenvalue problem. Consider the 2D linear eigenvalue problems of the form Eq. 1, with $\Omega=[-5,5]^{2}$ and $V(x,y)=1.5(x^{2}+y^{2})+e^{-(x-1)^{2}-y^{2}}$ . We use a large energy cut-off $E_{\rm c}=100.0$ to obtain a sufficiently accurate solution as the reference in this example.

We first compare in Fig. 2 the numerical errors in the lowest two eigenvalues with different a posteriori error indicators discussed in Sections 2.4 and A. We observe in the plots that they match almost perfectly. Moreover, the errors decay exponentially with respect to the energy cut-off which suggests that a linear regression strategy could work well. We also observe that in this case, the first order perturbation method (Eq. 38 with $k=1$ ) to compute the error indicator already leads to a very accurate approximation of the true error.

We further test the adaptive algorithm (Algorithm 1) on this linear problem. Tables 1 and 2 show the energy cut-offs obtained by using the error indicators Eq. 34 and Eq. 38 respectively, with Strategy A and Strategy B, and tolerances $\varepsilon=10^{-3}$ and $10^{-6}$ . We obtain by performing a traditional convergence test that the optimal cut-offs for these two tolerances are 7.1 and 16.0 respectively. Algorithm 1 captures energy cut-offs respectively of 7.5 and 17.12 in a few steps, which suffice for the target accuracy without overshooting too much the optimal cut-offs.

Table 1: Linear problem: Error indicators and energy cut-offs obtained by Algorithm 1 with Eq. 34.

tolerance

\varepsilon

= 1.0e-3

step	$E_{\rm c}$	$\|\eta_{E_{\rm c}}^{E_{\rm g}}\|^{2}$	$E_{c,A}$	$E_{c,B}$
0	3.00	5.121543e-01	——	7.50
1	7.50	5.370600e-04	——	——

tolerance

\varepsilon

= 1.0e-6

step	$E_{\rm c}$	$\|\eta_{E_{\rm c}}^{E_{\rm g}}\|^{2}$	$E_{c,A}$	$E_{c,B}$
0	3.00	5.121543e-01	——	12.00
1	12.00	5.329866e-05	17.12	21.00
2	17.12	4.715432e-07	——	——

Table 2: Linear problem: Error indicators and energy cut-offs obtained by Algorithm 1 with Eq. 38, k = 1.

tolerance

\varepsilon

= 1.0e-3

step	$E_{\rm c}$	$\|\eta_{E_{\rm c}}^{E_{\rm g},[1]}\|^{2}$	$E_{c,A}$	$E_{c,B}$
0	3.00	3.807044e-01	——	7.50
1	7.50	2.898487e-04	——	——

tolerance

\varepsilon

= 1.0e-6

step	$E_{\rm c}$	$\|\eta_{E_{\rm c}}^{E_{\rm g},[1]}\|^{2}$	$E_{c,A}$	$E_{c,B}$
0	3.00	3.807044e-01	——	12.00
1	12.00	3.339172e-05	16.93	21.00
2	16.93	3.395260e-07	——	——

A nonlinear eigenvalue problem. Consider the following Gross–Pitaevskii equation (GPE) originated from modelling Bose–Einstein condensates [47]: Find $(\lambda,\varphi)\in\mathbb{R}\times H^{1}_{\#}(\Omega)$ , such that $\|\varphi\|_{L^{2}(\Omega)}=1$ and

(29)

\displaystyle\left(-\frac{1}{2}\Delta+V(\boldsymbol{r})+\mu\rho(\boldsymbol{r})\right)\varphi(\boldsymbol{r})

\displaystyle=\lambda\varphi(\boldsymbol{r})

where $\mu=1.0$ , $\Omega=[-5,5]^{2}$ , $V(x,y)=10(x^{2}+y^{2})+e^{-\big{(}(x-1)^{2}+y^{2})\big{)}}$ , and $\rho(\boldsymbol{r})=|\varphi(\boldsymbol{r})|^{2}$ . Note that this equation can be viewed as a simplified version of Eq. 19, where only the lowest eigenvalue is considered and the nonlinear term is much simpler.

We apply Algorithm 2 to solve Eq. 29 with tolerance $\delta=10^{-6}$ . Note that the “best” choice of the parameter $\alpha$ (defined in Eq. 24) is not known in this example, and we perform experiments with $\alpha=1.0$ and 0.1, respectively, to illustrate how the choice of $\alpha$ affects the number of convergence steps of our algorithm. We show in Fig. 3 the discretization and SCF errors with respect to the SCF steps. The energy cut-off is iteratively adjusted with Algorithm 2 and we observe that a good balance is achieved between the SCF and discretization errors. The oscillation in the SCF errors arises from the simple mixing scheme used in the SCF iterations.

5.2 Tests on PAW Kohn–Sham equations

We implemented the a posteriori error indicator and the adaptive algorithm (Algorithm 2) in an in-house planewave PAW code [31]. For the validity tests we chose two bulk systems. The first is a diamond structure with two carbon atoms in the unit cell. This is an insulator and we used only the $\Gamma$ point for the $k$ -point sampling. The number of occupied bands is $N_{\textrm{b}}=4$ . The second system is an FCC structure with four aluminum atoms in the unit cell. It is a metallic system with a vanishing band gap, and we employed a $6\times 6\times 6$ Monkhorst–Pack $k$ -mesh. The Brillouin zone grid is fixed over all calculations. The first order Methfessel–Paxton smearing method [41] with a smearing width of $0.3\,\textrm{eV}$ was used and the number of occupied bands is $N_{\textrm{b}}=7$ . For the efficiency tests on relatively large systems, we constructed a $4\times 4\times 4$ diamond supercell containing 128 carbon atoms, and the number of occupied bands is $N_{\textrm{b}}=256$ .

In the following, we first check the accuracy of the a posteriori error indicators on the linearized Kohn–Sham equations and then test the behavior of the adaptive algorithm for solving the nonlinear equations. In order to take into account all the $N_{\textrm{b}}$ occupied eigenstates, we present error results on the band structure energy $E_{\textrm{bands}}$ . Note that the discretization error in the band structure energy and that in the total energy decay in the same rate with respect to $E_{c}$ (see [13, Theorem 4.2 and (4.90)]).

Accuracy of the error indicators for the linearized Kohn–Sham equations. We are primarily concerned with the accuracy of the error indicator for the linearized Kohn–Sham equation since it determines the effectiveness of the adaptive algorithm. In our tests we fixed the density as the superposition of atomic charge densities and solved the linearized Kohn–Sham equations for different energy cut-off values $E_{\rm c}$ . The a posteriori error indicators were calculated by using the standard and the perturbation-based methods detailed in Appendix A.

Table 3: Comparison of estimated errors and real error in

E_{\textrm{bands}}

for the linearized Kohn–Sham equation of the diamond structure. The

E_{c}

and error data are in eV. The relative differences are calculated w.r.t. the real error. The reference solution is obtained with

E_{\rm c}=4500\,

eV.

$E_{\rm c}$	Real error	$\|\eta_{E_{\rm c}}^{E_{\rm g},[1]}\|^{2}$		$\|\eta_{E_{\rm c}}^{E_{\rm g},[2]}\|^{2}$		$\|\eta_{E_{\rm c}}^{E_{\rm g}}\|^{2}$
400.0	2.445e-01	2.190e-01	-10.44 %	2.336e-01	-4.46 %	2.385e-01	-2.45 %
600.0	2.329e-02	2.163e-02	-7.11 %	2.256e-02	-3.11 %	2.264e-02	-2.78 %
800.0	9.093e-03	8.684e-03	-4.50 %	8.889e-03	-2.25 %	8.891e-03	-2.23 %
1000.0	2.690e-03	2.574e-03	-4.29 %	2.630e-03	-2.21 %	2.632e-03	-2.15 %
1200.0	1.684e-03	1.631e-03	-3.18 %	1.657e-03	-1.65 %	1.655e-03	-1.73 %
1400.0	7.283e-04	7.133e-04	-2.07 %	7.230e-04	-0.73 %	7.251e-04	-0.45 %

Table 4: Comparison of estimated errors and real error in

E_{\textrm{bands}}

for the linearized Kohn–Sham equation of the FCC aluminum system. Energy unit and relative difference calculation are the same as in Table 3. The reference solution is obtained with

E_{\rm c}=2000\,

eV.

$E_{\rm c}$	Real error	$\|\eta_{E_{\rm c}}^{E_{\rm g},[1]}\|^{2}$		$\|\eta_{E_{\rm c}}^{E_{\rm g},[2]}\|^{2}$		$\|\eta_{E_{\rm c}}^{E_{\rm g}}\|^{2}$
300.0	2.035e-02	1.805e-02	-11.29 %	1.936e-02	-4.87 %	2.000e-02	-1.71 %
400.0	4.484e-03	4.248e-03	-5.27 %	4.404e-03	-1.78 %	4.445e-03	-0.88 %
500.0	1.810e-03	1.733e-03	-4.25 %	1.790e-03	-1.10 %	1.799e-03	-0.64 %
600.0	9.420e-04	9.046e-04	-3.97 %	9.333e-04	-0.92 %	9.368e-04	-0.55 %
700.0	5.262e-04	5.032e-04	-4.37 %	5.219e-04	-0.82 %	5.240e-04	-0.42 %
800.0	3.249e-04	3.120e-04	-3.96 %	3.225e-04	-0.75 %	3.237e-04	-0.37 %

The error indicators $|\eta_{E_{\rm c}}^{E_{\rm g}}|^{2}$ , $|\eta_{E_{\rm c}}^{E_{\rm g},[1]}|^{2}$ , and $|\eta_{E_{\rm c}}^{E_{\rm g},[2]}|^{2}$ are compared with real errors in Tables 3 and 4 for the diamond structure and the aluminum systems, respectively. It is observed from the tables that the first-order indicator $|\eta_{E_{\rm c}}^{E_{\rm g},[1]}|^{2}$ already provides a very reliable approximation of the real error. The second-order indicator $|\eta_{E_{\rm c}}^{E_{\rm g},[2]}|^{2}$ exhibits a clear improvement over the first-order indicator and derives results very close to those from the standard indicator. And the errors estimated by the standard indicator $|\eta_{E_{\rm c}}^{E_{\rm g}}|^{2}$ closely match the real error.

Table 5: The CPU time cost (in seconds) of the standard and perturbation-based calculations of error indicators for the diamond structure.

$E_{\rm c}$	1st-order	2nd-order	Standard
400.0	0.01	0.02	0.62
600.0	0.02	0.05	1.22
800.0	0.03	0.06	1.86
1000.0	0.06	0.11	3.33
1200.0	0.08	0.15	4.72
1400.0	0.10	0.17	5.87

In terms of cost, however, the standard calculation of the error indicator is remarkably more expensive than the perturbation-based methods, as shown by the timings for the diamond example in Table 5. As mentioned, the standard calculation requires to solve a linear system of size $N_{E_{\rm g}}$ for each of the $N_{\textrm{b}}$ eigenvalues. Since the planewave discretization leads to large-scale dense matrices, the solution of the corresponding linear systems can be very demanding.

Therefore, the perturbation-based calculation of the a posteriori error indicator is the proper choice in practice to balance accuracy versus computational cost. We employed the perturbation-based methods in the following tests. It should be emphasized that the proposed error estimate is accurate both for the tested insulator and metallic systems. This high-quality a posteriori error indicator is critical to a reliable tuning of the energy cut-off.

Adaptive solution of the Kohn–Sham equations with an unknown converged cut-off. We now apply Algorithm 2 to solve the nonlinear Kohn–Sham equations. The density mixing is performed using Pulay’s DIIS method [48]. The tolerance for the discretization error in $E_{\textrm{bands}}$ and that for the SCF error were chosen to be $2\times 10^{-3}$ eV for both the diamond structure and the aluminum systems.

Table 6: Diamond structure: Detailed

E_{\rm c}

and error data during the adaptive solution of the Kohn–Sham equations by Algorithm 2. The

E_{c}

and errors are in eV.

SCF-step	$E_{\rm c}$	$\|\eta_{E_{\rm c}}\|^{2}$	$\|\eta_{\rm SCF}\|^{2}$
2	300	7.895e-01	2.099
3	300	6.439e-01	8.518e-01
	300	6.837e-01
4	400	1.622e-01	2.878e-02
	794.42	8.537e-03
5	794.42	9.242e-03	5.018e-03
5	894.42	3.393e-03	5.018e-03
6	894.42	3.414e-03	6.735e-04
	994.42	2.260e-03
	1044.42	2.100e-03
	1094.42	1.960e-03
7	1094.42	1.956e-03	3.004e-05

The evolution of the discretization and SCF errors with respect to the SCF step are shown in Fig. 4 for the diamond and the aluminum systems. For a detailed illustration, we take the diamond structure and further present detailed data on the energy cut-off and the two components of errors in Table 6. It is seen that a converged solution is obtained by adaptively tuning the energy cut-off depending on the discretization and SCF errors.

Comparison with convergence-test computations and discussion. We have stated in the introduction that the purpose of our algorithm is to determine the converged energy cut-off adaptively and avoid the conventional convergence-test computations. In this section we compare and discuss the two approaches in detail.

In a typical convergence test on the energy cut-off $E_{\rm c}$ , first, one needs to choose three parameters: the smallest and largest cut-offs to be tested and a fixed step size of values in between. The largest value should be greater than the converged cut-off, but the latter is unknown in advance. So the largest cut-off needs to be set appropriately with some redundancy. Then the Kohn–Sham equations are solved under all the chosen cut-offs, and the converged cut-off is determined as the one beyond which the variation in results of the concerned quantity (the energies or the forces) is consistently smaller than the given tolerance.

In the adaptive algorithm proposed here, users need to choose the starting cut-off and the ratio between the tolerances for planewave discretizations and SCF iterations. At each iteration step, the discretization error associated with $E_{\rm c}$ is estimated with the a posteriori error indicator. Then the estimated error can be used directly to determine the computation is converged or not. If not, the known error data are combined with the a priori error analysis to predict the converged $E_{\rm c}$ in step sizes not limited by the fixed cut-off step like the case of convergence tests.

It can be seen that compared with the convergence-test approach, the proposed algorithm is based on theoretically justified bounds: the error estimates. As for the computational cost, two factors should be taken into account here: (i) the extra computational cost of the a posteriori error estimation and (ii) the number of iterations used to reach the converged solution. The time complexity for calculating the a posteriori error indicator is $\mathcal{O}(N_{\textrm{b}}\,N_{E_{\rm g}}\ln N_{E_{\rm g}})$ and that for the iterative diagonalization of eigenvalue problems is $\mathcal{O}(N_{\textrm{b}}^{2}\,N_{E_{\rm c}})$ . When the number of bands $N_{\textrm{b}}$ is relatively large, the additional cost for the error estimation is small compared to the cost for solving eigenvalue problems. We expect that the efficiency advantage of the adaptive algorithm over the convergence-test approach grows with the number of bands. The number of iterations to convergence is well controlled in the adaptive algorithm if the tuning of $E_{\rm c}$ proceeds efficiently by combining the a posteriori error with the a priori analysis and using strategies stated in Section 3.

In many practical simulations, the number of occupied bands $N_{\textrm{b}}$ is large, and the convergence tests are computationally demanding. This is the case for systems with defects or vacancies, surface systems, etc., where large supercells must be used. This happens as well for high-temperature simulations, where the cell may be small but $N_{\textrm{b}}$ grows rapidly with the temperature. In the latter case, besides using a large parameter $N_{\textrm{b}}$ , the converged cutoff energy $E_{\rm c}$ is usually significantly larger than the one needed in simulations at ambient temperature [10].

In the following comparison of the convergence-test approach and the adaptive algorithm, we use a $4\times 4\times 4$ diamond supercell containing 128 carbon atoms. The tolerance for the discretization error in $E_{\textrm{bands}}$ and for the SCF iteration error was set respectively to 0.128 eV (i.e. 1 meV/atom) and 0.1 meV. Note that although for simple bulk systems it is not necessary to perform convergence tests using large supercells, we still chose the example here because the scale of the system represents well the computational cost of systems with relatively large $N_{\textrm{b}}$ .

Table 7: Energy results (in eV/atom) and time costs (in seconds) in the convergence test for the Kohn–Sham equation of the diamond supercell. The cut-off values increase from 300 eV to 1200 eV with an interval of 100 eV.

	Start from scratch			Restart from previous step
$E_{\rm c}$	$E_{\textrm{bands}}/N_{\textrm{atom}}$	$N_{\textrm{SCF}}$	$t_{\textrm{eigen}}$	$E_{\textrm{bands}}/N_{\textrm{atom}}$	$N_{\textrm{SCF}}$	$t_{\textrm{eigen}}$
300	3.1068	11	31.23	3.1068	11	30.65
400	3.0939	11	46.07	3.0932	6	26.19
500	3.1027	11	61.54	3.1078	4	21.63
600	3.0944	11	77.49	3.0942	4	24.93
700	3.0868	11	96.73	3.0865	4	31.01
800	3.0833	11	112.68	3.0830	4	32.78
900	3.0820	11	139.62	3.0819	4	39.54
1000	3.0818	11	174.29	3.0817	2	25.18
1100	3.0816	11	193.27	3.0815	2	27.74
1200	3.0813	11	233.94	3.0812	2	32.95
Total			1166.86			292.60

Table 8: Time cost (in seconds) of the adaptive algorithm for solving the Kohn–Sham equation of the diamond supercell where

t_{\textrm{eigen}}

represents the time for solving the eigenvalue problem and

t_{\textrm{err}}

the time for calculating the a posteriori error estimate. The reason for the discontinuous SCF step index in the following is that, when the discretization error is detected to be smaller than the SCF error, we take the discretization error as the tolerance and perform more than one SCF iteration until convergence.

SCF-step	$E_{\rm c}$	$t_{\textrm{eigen}}$	$t_{\textrm{err}}$
2	300	14.33	1.04
6	300	9.76	1.04
	400	9.65	1.46
	538.83	12.78	2.57
	805.86	14.59	2.71
	805.86	39.91	5.28
10	905.86	18.48	5.83
	1004.61	22.89	7.23
12	1004.61	13.93	–
Total		156.32	27.16

In the convergence test, the Kohn–Sham equations were solved under $E_{\rm c}$ values between 300 eV and 1200 eV with an interval of 100 eV. Two subsets of tests have been carried out: in one of them, the solution of the Kohn–Sham equation under each $E_{\rm c}$ started from scratch; in the other, the calculation restarted from eigenfunctions under the previous cut-off. In the adaptive algorithm, we chose the same starting cut-off $E_{\rm c}=300\,$ eV and set $\alpha=1280$ (ratio between the tolerances for SCF iterations and planewave discretizations). Results of these tests are given in Tables 7 and 8.

It is observed from Table 7 that in the convergence tests, $E_{\textrm{bands}}$ converges to 1 meV/atom for $E_{\rm c}\geq 1000\,$ eV. This cut-off is consistent with the one (1004.61 eV) obtained in the adaptive approach. For time costs, it is shown in Table 7 that in the convergence tests starting from scratch, the cost of solving the Kohn–Sham equations accumulated to 1166.86 s. This number reduced to 292.60 s for the convergence tests with restart, which originates from an obvious decrease in the number of SCF iterations. The time cost of the adaptive algorithm is composed of two parts denoted by $t_{\textrm{eigen}}$ for solving the eigenvalue problems and $t_{\textrm{err}}$ for calculating the a posteriori error. It is seen from Table 8 that $t_{\textrm{eigen}}$ and $t_{\textrm{err}}$ are 156.32 s and 27.16 s in total, respectively. Compared with convergence tests from scratch, the adaptive algorithm saved 84 % of the CPU time. Even if the adaptive algorithm is compared with convergence tests with restart, 37 % of the CPU time has been saved.

Table 9: Energy results (in eV/atom) and time costs (in seconds) in the convergence test for the Kohn–Sham equation of the diamond supercell. The cut-off values increase from 300 eV to 1300 eV with an interval of 200 eV.

	Start from scratch			Restart from previous step
$E_{\rm c}$	$E_{\textrm{bands}}/N_{\textrm{atom}}$	$N_{\textrm{SCF}}$	$t_{\textrm{eigen}}$	$E_{\textrm{bands}}/N_{\textrm{atom}}$	$N_{\textrm{SCF}}$	$t_{\textrm{eigen}}$
300	3.1068	11	31.03	3.1068	11	30.73
500	3.1027	11	60.79	3.1022	7	40.25
700	3.0868	11	96.99	3.0865	4	33.83
900	3.0820	11	139.50	3.0819	4	45.91
1100	3.0816	11	192.41	3.0815	2	28.93
1300	3.0811	11	250.13	3.0809	4	69.99
Total			770.85			249.64

Further, if one chose to perform the convergence tests with an cut-off interval of 200 eV, as shown by the results in Table 9, the saving in time of the adaptive algorithm compared with tests from scratch and tests with restart decreased to 76 % and 27 %, respectively. But at the same time, the converged cut-off has changed: energy results in the table illustrate that $E_{\textrm{bands}}$ converged to 1 meV/atom for $E_{\rm c}\geq 1100\,$ eV, so the converged energy cut-off is 100 eV over-estimated.

In summary, the comparison tests illustrate the potential efficiency advantage of the adaptive algorithm over the convergence test approach on relatively large-scale systems. This kind of algorithms is expected to be further improved by integrating new theoretical advances in the future, in particular an educated choice for the parameter $\alpha$ . In our opinion, it is worthwhile implementing these algorithms in ab-initio software packages to stimulate the application of theoretical achievements into practical electronic structure calculations.

6 Conclusions and perspectives

In this paper, we constructed an adaptive planewave method for linear eigenvalue problems and further integrated it into a self-consistent field algorithm for nonlinear eigenvalue problems. The algorithm combines the knowledge of a priori error decay and the construction of a posteriori error indicators that is asymptotically accurate and computable. This adaptive method not only provides good energy cut-offs in planewave methods for linear eigenvalue problems, but also shows potential for the adaptive resolution of Kohn-Sham equations in electronic structure calculations. Specifically, in the solution of PAW Kohn–Sham equations, the adaptive method was shown to be built on a more sound theoretical basis and achieve a clear saving in the time costs for relatively large systems, in the context of convergence-test computations.

For now, we have only considered the error from the planewave discretizations. In real calculations though, in particular to compute physical quantities, other sources of errors come into play, such as the integration over the Brillouin zone (after Bloch’s decomposition [40]). Standard methods rely on equally spaced grid points and there is no reliable a posteriori error estimate for this error. This will need to be further investigated, in order to develop efficient numerical methods for periodic systems that can estimate, control and balance the errors from planewave discretizations, SCF iterations, and Brillouin zone integrations.

Appendix A Evaluating the a posteriori error indicator

In this section, we discuss how to evaluate (or approximate) the dual norm of the residual (in Eq. 13) in an accurate and efficient way within the framework of planewave discretization. We present first a standard calculation and then a perturbation-based-method.

A.1 A standard calculation

Let $A=-\Delta+V$ be the linear operator with periodic boundary condition given in Eq. 1. From Eq. 3 we have that $A$ is positive definite up to shifting by a positive constant, and hence $A^{1/2}$ is well defined. From definition Eq. 13, we can obtain from a standard calculation and Cauchy-Schwarz inequality that

(30)

\quad\eta_{E_{\rm c}}=\sup_{v\in H}\frac{\langle{\rm Res}_{E_{\rm c}},v\rangle_{H^{\prime},H}}{\|v\|_{a}}=\sup_{v\in H}\frac{\big{(}A^{-1/2}{\rm Res}_{E_{\rm c}},A^{1/2}v\big{)}}{\sqrt{\big{(}A^{1/2}v,A^{1/2}v\big{)}}}=\big{\|}A^{-1/2}{\rm Res}_{E_{\rm c}}\big{\|}_{L^{2}(\Omega)}\\[4.30554pt] =\big{\langle}{\rm Res}_{E_{\rm c}},\textit{A}^{-1}{\rm Res}_{E_{\rm c}}\big{\rangle}_{H^{\prime},H}^{\frac{1}{2}}=\left(\sum_{\boldsymbol{G}\in\mathbb{L}^{\prime}}\widehat{{\rm Res}_{E_{\rm c}}}_{\boldsymbol{G}}\cdot\widehat{A^{-1}{\rm Res}_{E_{\rm c}}}_{\boldsymbol{G}}\right)^{\frac{1}{2}},\quad

where $\widehat{{\rm Res}_{E_{\rm c}}}$ and $\widehat{A^{-1}{\rm Res}_{E_{\rm c}}}$ are the vectors of Fourier coefficients of ${\rm Res}_{E_{\rm c}}\in H^{\prime}$ and $A^{-1}{\rm Res}_{E_{\rm c}}\in H$ respectively. For $\boldsymbol{G}\in\mathbb{L}^{\prime}$ ,

\displaystyle\widehat{{\rm Res}_{E_{\rm c}}}_{\boldsymbol{G}}=\big{\langle}{\rm Res}_{E_{\rm c}},e_{\boldsymbol{G}}\big{\rangle}_{H^{\prime},H}\qquad{\rm and}\qquad\widehat{A^{-1}{\rm Res}_{E_{\rm c}}}_{\boldsymbol{G}}=\big{(}A^{-1}{\rm Res}_{E_{\rm c}},e_{\boldsymbol{G}}\big{)}.

To compute Eq. 30, we shall approximate $L_{\#}^{2}(\Omega)$ by a finite dimensional subspace $X_{E_{\rm g}}(\Omega)$ (see definition Eq. 4) with some $E_{\rm g}\gg E_{\rm c}$ , that is, approximate the sum over $\boldsymbol{G}\in\mathbb{L}^{\prime}$ by finitely many terms $\frac{1}{2}|\boldsymbol{G}|^{2}\leq E_{\rm g}$ . We refer to [12, Eq. (80)-(82)] and [13, Theorem 3.1] for an analysis of the numerical integration error from the cut-off $E_{\rm g}$ . In our practical simulations, we observe that when we choose $E_{\rm g}\geq 4E_{\rm c}$ , then the integration error due to the cut-off $E_{\rm g}$ is negligible compared with the discretization error with cut-off $E_{\rm c}$ .

Using the orthogonality of the planewave basis functions and the fact that $(\lambda_{E_{\rm c}},\varphi_{E_{\rm c}})$ satisfies Eq. 7, the Fourier coefficients of the residual can be easily computed by

\widehat{{\rm Res}_{E_{\rm c}}}_{\boldsymbol{G}}=a(\varphi_{E_{\rm c}},e_{\boldsymbol{G}})-\lambda_{E_{\rm c}}(\varphi_{E_{\rm c}},e_{\boldsymbol{G}})=\big{(}V\varphi_{E_{\rm c}}-\Pi_{E_{\rm c}}(V\varphi_{E_{\rm c}}),e_{\boldsymbol{G}}\big{)}\qquad\forall~{\boldsymbol{G}}\in\mathbb{L}^{\prime}.

More precisely, $\widehat{{\rm Res}_{E_{\rm c}}}$ can be approximated within $X_{E_{\rm g}}$ by

(31)

\widehat{{\rm Res}_{E_{\rm c}}}^{E_{\rm g}}_{\boldsymbol{G}}:=\left\{\begin{array}[]{ll}0&~~{\rm if}~~\frac{1}{2}|{\boldsymbol{G}}|^{2}\leq E_{\rm c}\\[4.30554pt] \displaystyle\big{(}\widehat{V\varphi_{E_{\rm c}}}\big{)}_{\boldsymbol{G}}=(V\varphi_{E_{\rm c}},e_{\boldsymbol{G}})&~~{\rm if}~~E_{\rm c}<\frac{1}{2}|\boldsymbol{G}|^{2}\leq E_{\rm g}\end{array}\right..

Denote by $\widehat{{\rm Res}_{E_{\rm c}}}^{E_{\rm g}}=\left(\widehat{{\rm Res}_{E_{\rm c}}}_{\boldsymbol{G}}\right)_{\frac{1}{2}|{\boldsymbol{G}}|^{2}\leq E_{\rm g}}\in\mathbb{R}^{N_{E_{\rm g}}}$ with $N_{E_{\rm g}}$ the number of planewaves in $X_{E_{\rm g}}$ . We can approximate $\widehat{A^{-1}{\rm Res}_{E_{\rm c}}}$ by

\widehat{A^{-1}{\rm Res}_{E_{\rm c}}}^{E_{\rm g}}=\left(\widehat{A^{-1}{\rm Res}_{E_{\rm c}}}_{\boldsymbol{G}}\right)_{\frac{1}{2}|{\boldsymbol{G}}|^{2}\leq E_{\rm g}}\in X_{E_{\rm g}}(\Omega)

by solving the linear system

(32)

A_{E_{\rm g}}\cdot\widehat{A^{-1}{\rm Res}_{E_{\rm c}}}^{E_{\rm g}}=\widehat{{\rm Res}_{E_{\rm c}}}^{E_{\rm g}},

where $A_{E_{\rm g}}\in\mathbb{R}^{N_{E_{\rm g}}\times N_{E_{\rm g}}}$ has the matrix elements $\big{(}A_{E_{\rm g}}\big{)}_{ij}=a(e_{\boldsymbol{G}_{i}},e_{\boldsymbol{G}_{j}})$ . Using Eqs. 30, 31 and 32, we therefore approximate the a posteriori error indicator $\eta_{E_{\rm c}}$ in practical calculations by

(33)

\eta_{E_{\rm c}}^{E_{\rm g}}:=\left(\sum_{{\boldsymbol{G}}\in\mathbb{L}^{\prime},~E_{\rm c}\leq\frac{1}{2}|{\boldsymbol{G}}|^{2}\leq E_{\rm g}}\widehat{{\rm Res}_{E_{\rm c}}}^{E_{\rm g}}_{\boldsymbol{G}}\cdot\widehat{A^{-1}{\rm Res}_{E_{\rm c}}}^{E_{\rm g}}_{\boldsymbol{G}}\right)^{\frac{1}{2}}.

If the potential and eigenfunction are smooth, the approximation error $\big{|}\eta_{E_{\rm c}}-\eta_{E_{\rm c}}^{E_{\rm g}}\big{|}\ll\eta_{E_{\rm c}}$ becomes small for sufficiently large cut-off $E_{\rm g}$ [12]. The error indicator $\eta_{E_{\rm c}}^{E_{\rm g}}$ is therefore asymptotically accurate and fully computable. The main cost for computing the estimator comes from solving the linear system Eq. 32, which costs $\mathcal{O}(N_{E_{\rm g}}\ln N_{E_{\rm g}})$ . Note that if $E_{\rm g}$ is proportional to $E_{\rm c}$ , the cost for calculating $\eta_{E_{\rm c}}^{E_{\rm g}}$ is $\mathcal{O}\big{(}E_{\rm c}^{d/2}\ln E_{\rm c}\big{)}$ , with $d$ the dimension of the system.

We further introduce an error estimator with a parameter $E\in(E_{\rm c},E_{\rm g})$

(34)

\eta_{E_{\rm c}}^{E_{\rm g}}(E):=\left(\sum_{{\boldsymbol{G}}\in\mathbb{L}^{\prime},~E_{\rm c}\leq\frac{1}{2}|{\boldsymbol{G}}|^{2}\leq E}\widehat{{\rm Res}_{E_{\rm c}}}^{E_{\rm g}}_{\boldsymbol{G}}\cdot\widehat{A^{-1}{\rm Res}_{E_{\rm c}}}^{E_{\rm g}}_{\boldsymbol{G}}\right)^{1/2},

which represents a relatively local error estimator in reciprocal space and is used in Strategy B.

Remark A.1.

A natural alternative approach is to construct the a posteriori error estimate by the $H^{-1}$ -norm of the residual, say $\|{\rm Res}\|_{H^{-1}}$ , as discussed in [16, 17, 18] for Laplace eigenvalue problems. We show in the next subsection that it can be viewed as a first-order perturbation-method-based calculation, and can be improved without too much cost.

A.2 A perturbation-based calculation

Solving a linear system to compute the a posteriori error estimate may be expensive when the size of the basis becomes large. In this section, we discuss a possible workaround for the computation of the a posteriori error estimate based on a perturbation method. Perturbation theory [35] is a classical tool and has been widely used in electronic calculations, e.g. [4, 5, 14]. In particular, [15] provides post-processing algorithms based on a perturbation method which is related to the discussion of this section.

In our construction, we decompose $A=A_{E_{\rm c}}+\big{(}V-\Pi_{E_{\rm c}}V\Pi_{E_{\rm c}}\big{)}$ , with

(35)

A_{E_{\rm c}}=-\Delta+\Pi_{E_{\rm c}}V\Pi_{E_{\rm c}}.

Using a Dyson expansion, we obtain for $k\geq 1$ ,

	$\displaystyle A^{-1}$	$\displaystyle=\sum_{j=1}^{k}A_{E_{\rm c}}^{-1}\Big{(}\big{(}\Pi_{E_{\rm c}}V\Pi_{E_{\rm c}}-V\big{)}A_{E_{\rm c}}^{-1}\Big{)}^{j-1}+A^{-1}\Big{(}\big{(}\Pi_{E_{\rm c}}V\Pi_{E_{\rm c}}-V\big{)}A_{E_{\rm c}}^{-1}\Big{)}^{k}$
(36)			$\displaystyle=:A^{-1,[k]}+A^{-1}\Big{(}\big{(}\Pi_{E_{\rm c}}V\Pi_{E_{\rm c}}-V\big{)}A_{E_{\rm c}}^{-1}\Big{)}^{k},$

with $A^{-1,[k]}=\sum_{j=1}^{k}A_{E_{\rm c}}^{-1}\Big{(}\big{(}\Pi_{E_{\rm c}}V\Pi_{E_{\rm c}}-V\big{)}A_{E_{\rm c}}^{-1}\Big{)}^{j-1}$ . Note that the remainder term $A^{-1}\Big{(}\big{(}\Pi_{E_{\rm c}}V\Pi_{E_{\rm c}}-V\big{)}A_{E_{\rm c}}^{-1}\Big{)}^{k}$ will quickly become small if $\|\big{(}\Pi_{E_{\rm c}}V\Pi_{E_{\rm c}}-V\big{)}A_{E_{\rm c}}^{-1}\|\ll 1$ . We then approximate the error indicator Eq. 30 by

(37)

\eta_{E_{\rm c}}^{[k]}:=\big{\langle}{\rm Res}_{E_{\rm c}},\textit{A}^{-1,[k]}{\rm Res}_{E_{\rm c}}\big{\rangle}_{H^{\prime},H}^{\frac{1}{2}},

replacing $A^{-1}$ in Eq. 30 by $\textit{A}^{-1,[k]}$ . In practice, we shall perform the above calculations in $X_{E_{\rm g}}(\Omega)$ , and use the following a posteriori error indicator with $k=1,2.$

(38)

\eta_{E_{\rm c}}^{E_{\rm g},[k]}:=\left(\sum_{{\boldsymbol{G}}\in\mathbb{L}^{\prime},~E_{\rm c}\leq\frac{1}{2}|{\boldsymbol{G}}|^{2}\leq E_{\rm g}}\widehat{{\rm Res}_{E_{\rm c}}}^{E_{\rm g}}_{\boldsymbol{G}}\cdot\widehat{A^{-1,[k]}{\rm Res}_{E_{\rm c}}}^{E_{\rm g}}_{\boldsymbol{G}}\right)^{\frac{1}{2}}.

The low computational cost of this estimate comes from the following observation [14]: the operator $A_{E_{\rm c}}$ being block diagonal on $X_{E_{\rm c}}$ and $X_{E_{\rm c}}^{\perp}$ , with only the Laplace operator which is diagonal in planewaves on $X_{E_{\rm c}}^{\perp}$ , and the residual Eq. 31 vanishing on $X_{E_{\rm c}}$ , the Fourier coefficients of $A_{E_{\rm c}}^{-1}{\rm Res}_{E_{\rm c}}$ can be easily evaluated by

\widehat{A_{E_{\rm c}}^{-1}{\rm Res}_{E_{\rm c}}}^{E_{\rm g}}_{\boldsymbol{G}}=\frac{\widehat{{\rm Res}_{E_{\rm c}}}^{E_{\rm g}}_{\boldsymbol{G}}}{|\boldsymbol{G}|^{2}}

Therefore the estimator can be computed without solving a linear system Eq. 32, and the computational cost of Eq. 38 for $k=1$ and $k=2$ are $\mathcal{O}\big{(}E_{\rm g}^{d/2}\ln E_{\rm g}\big{)}$ . Although the scaling is the same as that of Eq. 33, but the cost has been significantly reduced since one does not need to solve a linear system. Note that for $k\geq 3$ , the above structure for convenient evaluation will be destroyed.

Finally, we define the corresponding error estimator with a parameter $E\in(E_{\rm c},E_{\rm g})$

(39)

\eta_{E_{\rm c}}^{E_{\rm g},[k]}\big{(}E\big{)}:=\left(\sum_{{\boldsymbol{G}}\in\mathbb{L}^{\prime},~E_{\rm c}\leq\frac{1}{2}|{\boldsymbol{G}}|^{2}\leq E}\widehat{{\rm Res}_{E_{\rm c}}}^{E_{\rm g}}_{\boldsymbol{G}}\cdot\widehat{A^{-1,[k]}{\rm Res}_{E_{\rm c}}}^{E_{\rm g}}_{\boldsymbol{G}}\right)^{\frac{1}{2}}.

Appendix B The PAW method

In this section, we introduce the theoretical framework of the PAW method [11, 28, 31, 38], which is one of the most important electronic structure calculation methods used nowadays. In particular, we refer to [28] and [8, 9, 29] for rigorous mathematical derivations and numerical analysis for the PAW method.

The PAW Kohn–Sham model is based on a linear transformation from the smooth pseudo (PS) wave-function $\varphi_{i}$ to the real all-electron (AE) wave-function $\varphi_{i}^{\textrm{AE}}$ defined as

(40)

\varphi_{i}^{\textrm{AE}}(\boldsymbol{r})=\varphi_{i}(\boldsymbol{r})+\sum_{I=1}^{N_{\rm atom}}\sum_{n}(p_{n}^{I},\varphi_{i})\big{[}\psi_{n}^{I}(\boldsymbol{r})-\phi_{n}^{I}(\boldsymbol{r})\big{]},

where $\psi_{n}^{I}$ , $\phi_{n}^{I}$ , and $p_{n}^{I}$ are atomic quantities called AE partial waves, PS partial waves, and projector functions, respectively, and for a given atomic index $I$ , the index $n$ runs over the angular momentum and an additional index labeling different partial waves for the same angular momentum. The AE partial wave $\psi_{n}^{I}$ is constructed to be identical to the associated PS partial wave $\phi_{n}^{I}$ outside the radius of the core region $\Omega_{I}$ enclosing the atomic position. The projector functions $p_{n}^{I}$ are compactly supported in the core region $\Omega_{I}$ and fulfill

(41)

\int_{\Omega_{I}}[p_{n}^{I}(\boldsymbol{r})]^{*}\,\phi_{m}^{I}(\boldsymbol{r})\,\textrm{d}\boldsymbol{r}=\delta_{nm}\quad\textrm{for any}~n,m.

Based on the PAW transformation, one can derive consistent decomposition expressions for the electron density, the kinetic energy of electrons, and the Hartree and the exchange-correlation energy functionals, and then obtain the PAW Hamiltonian by taking variation of the total energy functional with respect to the pseudo density operator.

The AE valence density $\rho$ is decomposed as

(42)

\rho(\boldsymbol{r})=\tilde{\rho}(\boldsymbol{r})+\sum_{I}\big{[}\rho_{I}^{1}(\boldsymbol{r})-\tilde{\rho}_{I}^{1}(\boldsymbol{r})\big{]},

where $\tilde{\rho}(\boldsymbol{r})=\sum_{i=1}^{N_{\textrm{b}}}f_{i}|\varphi_{i}(\boldsymbol{r})|^{2}$ , and

(43)		$\displaystyle\rho_{I}^{1}(\boldsymbol{r})$	$\displaystyle=\sum_{i=1}^{N_{\textrm{b}}}\sum_{n,m}f_{i}(\varphi_{i},p_{n}^{I})(p_{m}^{I},\varphi_{i})\,[\psi_{n}^{I}(\boldsymbol{r})]^{*}\,\psi_{m}^{I}(\boldsymbol{r}),$
(44)		$\displaystyle\tilde{\rho}_{I}^{1}(\boldsymbol{r})$	$\displaystyle=\sum_{i=1}^{N_{\textrm{b}}}\sum_{n,m}f_{i}(\varphi_{i},p_{n}^{I})(p_{m}^{I},\varphi_{i})\,[\phi_{n}^{I}(\boldsymbol{r})]^{*}\,\phi_{m}^{I}(\boldsymbol{r}).$

In the derivation of the Hartree and the exchange-correlation functionals, the so-called pseudized core charge $\tilde{\rho}_{Zc}$ , partial electronic core charge $\tilde{\rho}_{c}$ , and compensation charge $\hat{\rho}$ need to be further introduced. In particular, the compensation charge $\hat{\rho}=\sum_{I}\hat{\rho}_{I}$ and

(45)

\hat{\rho}_{I}(\boldsymbol{r})=\sum_{i=1}^{N_{\textrm{b}}}\sum_{n,m}\sum_{L,M}f_{i}(\varphi_{i},p_{n}^{I})(p_{m}^{I},\varphi_{i})\,\hat{Q}_{nm}^{LM,I}(\boldsymbol{r}),

where $\hat{Q}_{nm}^{LM,I}(\boldsymbol{r})$ are confined in each augmentation region. We refer the readers to [11, 31, 38] for the detailed explanation of these terms.

Finally, the Kohn–Sham equation in the PAW formulation is

(46)

H[\rho]\varphi_{i}=\lambda_{i}S\varphi_{i}\qquad i=1,\cdots,N_{\textrm{b}}

with the Hamiltonian $H[\rho]$ and overlap operator $S$ detailed below. The PAW Hamiltonian is

(47)

H[\rho]=-\frac{1}{2}\Delta+\tilde{v}_{\textrm{eff}}+v_{\textrm{nl}},

where the local potential is given by

(48)

\tilde{v}_{\textrm{eff}}[\tilde{\rho},\hat{\rho}](\boldsymbol{r})=v_{\rm{H}}[\tilde{\rho}+\hat{\rho}+\tilde{\rho}_{Zc}](\boldsymbol{r})+v_{\rm{xc}}[\tilde{\rho}+\hat{\rho}+\tilde{\rho}_{c}](\boldsymbol{r}),

and the non-local part is given by

(49)

\big{(}v_{\textrm{nl}}\varphi_{i}\big{)}({\bf r})=\sum_{I}\sum_{n,m}D_{nm}^{I}(p_{m}^{I},\varphi_{i})\,p_{n}^{I}(\boldsymbol{r}),

in which the non-local strength $D_{nm}^{I}=\hat{D}_{nm}^{I}+D_{nm}^{1,I}-\tilde{D}_{nm}^{1,I}$ and

(50)	$\displaystyle\hat{D}_{nm}^{I}$	$\displaystyle=\sum_{L,M}\int_{\Omega_{I}}\tilde{v}_{\textrm{eff}}(\boldsymbol{r})\,\hat{Q}_{nm}^{LM,I}(\boldsymbol{r})\,\textrm{d}\boldsymbol{r},$
(51)	$\displaystyle D_{nm}^{1,I}$	$\displaystyle=-\frac{1}{2}\int_{\Omega_{I}}[\psi_{n}^{I}(\boldsymbol{r})]^{}\left[\Delta\psi_{m}^{I}(\boldsymbol{r})\right]\,\textrm{d}\boldsymbol{r}+\int_{\Omega_{I}}[\psi_{n}^{I}(\boldsymbol{r})]^{}\,\psi_{m}^{I}(\boldsymbol{r})\,v_{\textrm{eff}}^{1,I}(\boldsymbol{r})\,\textrm{d}\boldsymbol{r},$
(52)	$\displaystyle\begin{split}\tilde{D}_{nm}^{1,I}&=-\frac{1}{2}\int_{\Omega_{I}}[\phi_{n}^{I}(\boldsymbol{r})]^{}\left[\Delta\phi_{m}^{I}(\boldsymbol{r})\right]\,\textrm{d}\boldsymbol{r}+\int_{\Omega_{I}}[\phi_{n}^{I}(\boldsymbol{r})]^{}\,\phi_{m}^{I}(\boldsymbol{r})\,\tilde{v}_{\textrm{eff}}^{1,I}(\boldsymbol{r})\,\textrm{d}\boldsymbol{r}\\ &\quad+\sum_{L,M}\int_{\Omega_{I}}\tilde{v}_{\textrm{eff}}^{1,I}(\boldsymbol{r})\,\hat{Q}_{nm}^{LM,I}(\boldsymbol{r})\,\textrm{d}\boldsymbol{r},\end{split}$

with on-site potentials

(53)		$\displaystyle v_{\textrm{eff}}^{1,I}[\rho_{I}^{1}](\boldsymbol{r})$	$\displaystyle=v_{\rm{H}}[\rho_{I}^{1}+\rho_{Zc}^{I}](\boldsymbol{r})+v_{\rm{xc}}[\rho_{I}^{1}+\rho_{c}^{I}](\boldsymbol{r}),$
(54)		$\displaystyle\tilde{v}_{\textrm{eff}}^{1,I}[\tilde{\rho}_{I}^{1},\hat{\rho}_{I}](\boldsymbol{r})$	$\displaystyle=v_{\rm{H}}[\tilde{\rho}_{I}^{1}+\hat{\rho}_{I}+\tilde{\rho}_{Zc}^{I}](\boldsymbol{r})+v_{\rm{xc}}[\tilde{\rho}_{I}^{1}+\hat{\rho}_{I}+\tilde{\rho}_{c}^{I}](\boldsymbol{r}).$

Finally, the overlap operator in Eq. 46 is given by

(55)

\big{(}S\varphi_{i})(\boldsymbol{r})=\varphi_{i}(\boldsymbol{r})+\sum_{I}\sum_{n,m}q_{nm}^{I}(p_{m}^{I},\varphi_{i})\,p_{n}^{I}(\boldsymbol{r}),

where

(56)

q_{nm}^{I}=\int_{\Omega_{I}}\Big{\{}[\psi_{n}^{I}(\boldsymbol{r})]^{*}\,\psi_{m}^{I}(\boldsymbol{r})-[\phi_{n}^{I}(\boldsymbol{r})]^{*}\,\phi_{m}^{I}(\boldsymbol{r})\Big{\}}\,\textrm{d}\boldsymbol{r}.

When replacing the standard Kohn–Sham equation Eq. 19 with the PAW formulation Eq. 46, the framework of the adaptive algorithm designed in this paper stays the same, though a lot of attention needs to be paid for numerical implementations of the above terms.

Acknowledgement

B. Liu and H. Chen’s work was supported by the National Natural Science Foundation of China under grant 11971066. G. Dusson’s work was partially supported by the French ”Investissements d’Avenir” program, project ISITE-BFC (contract ANR-15-IDEX-0003). J. Fang’s work was supported by the National Natural Science Foundation of China under grant 11701037. X. Gao’s work was supported by the Science Challenge Project under Grant TZ2018002.

References

[1] ABINIT Development Team, ABINIT Software, https://www.abinit.org (accessed 2021/01/31).
[2] I. Babuška and J. Osborn, Eigenvalue problems. In: Handbook of Numerical Analysis, Vol. II, North-Holland, Amsterdam, 1991.
[3] G. Bao, G. Hu, and D. Liu, Numerical solution of the Kohn-Sham equation by finite element methods with an adaptive mesh redistribution technique, J. Sci. Comput., 55 (2013), p. 372–391.
[4] S. Baroni, S. de Gironcoli, A. D. Corso, and P. Giannozzi, Phonons and related crystal properties from density-functional perturbation theory, Rev. Mod. Phys., 73 (2001), pp. 515–562.
[5] S. Baroni, P. Giannozzi, and A. Testa, Green’s-function approach to linear response in solids, Phys. Rev. Lett., 58 (1987), pp. 1861–1864.
[6] R. Becker and R. Rannacher, An optimal control approach to a posteriori error estimation in finite element methods, Acta Numer., 10 (2001), pp. 1–102.
[7] P. Binev, W. Dahmen, and R. DeVore, Adaptive finite element methods with convergence rates, Numer. Math., 97 (2004), pp. 219–268.
[8] X. Blanc, E. Cancès, and M. S. Dupuy, Variational projector augmented-wave method, CR Math., 355 (2020), pp. 665–670.
[9] X. Blanc, E. Cancès, and M. S. Dupuy, Variational projector augmented-wave method: Theoretical analysis and preliminary numerical results, Numer. Math., 144 (2020), p. 271–321.
[10] A. Blanchet, M. Torrent, and J. Clerouin, Requirements for very high temperature Kohn–Sham DFT simulations and how to bypass them, Phys. Plasmas, 27 (2020), p. 122706.
[11] P. E. Blöchl, Projector augmented-wave method, Phys. Rev. B, 50 (1994), pp. 17953–17979.
[12] E. Cancès, R. Chakir, and Y. Maday., Numerical analysis of nonlinear eigenvalue problems, J. Sci. Comput., 45 (2010), pp. 90–117.
[13] E. Cancès, R. Chakir, and Y. Maday, Numerical analysis of the planewave discretization of some orbital-free and Kohn-Sham models, ESAIM: M2AN., 46 (2012), pp. 341–388.
[14] E. Cancès, G. Dusson, Y. Maday, B. Stamm, and M. Vohralík, A perturbation-method-based a posteriori estimator for the planewave discretization of nonlinear Schrödinger equations, C. R. Math., 352 (2014), pp. 941–946.
[15] E. Cancès, G. Dusson, Y. Maday, B. Stamm, and M. Vohralík, A perturbation-method-based post-processing for the planewave discretization of Kohn-Sham models, J. Comput. Phys., 307 (2016), pp. 446–459.
[16] E. Cancès, G. Dusson, Y. Maday, B. Stamm, and M. Vohralík, Guaranteed and robust a posteriori bounds for laplace eigenvalues and eigenvectors: conforming approximations, SIAM J. Numer. Anal., 55 (2017), pp. 2228–2254.
[17] E. Cancès, G. Dusson, Y. Maday, B. Stamm, and M. Vohralík, Guaranteed and robust a posteriori bounds for laplace eigenvalues and eigenvectors: a unified framework, Numer. Math., 140 (2018), pp. 1033–1079.
[18] E. Cancès, G. Dusson, Y. Maday, B. Stamm, and M. Vohralík, Guaranteed a posteriori bounds for eigenvalues and eigenvectors: multiplicities and clusters, Math. Comp., 89 (2019), pp. 2563–2611.
[19] C. Canuto, M. Y. Hussaini, A. Quarteroni, and T. A. Zang, Spectral Methods: Evolution to Complex Geometries and Applications to Fluid Dynamics, Springer-Verlag, Berlin, 2014.
[20] C. Canuto, R. Nochetto, and M. Verani, Adaptive Fourier-Galerkin methods, Math. Comput., 83 (2014), pp. 1645–1687.
[21] J. M. Cascon, C. Kreuzer, R. H. Nochetto, and K. G. Siebert, Quasi-optimal convergence rate for an adaptive finite element method, SIAM J. Numer. Anal., 46 (2008), pp. 2524–2550.
[22] CASTEP Development Team, CASTEP Software, http://www.castep.org (accessed 2021/01/31).
[23] H. Chen, X. Dai, X. Gong, L. He, and A. Zhou, Adaptive finite element approximations for Kohn-Sham models, Multi. Model. & Sim., 12 (2014), pp. 1828–1869.
[24] H. Chen, X. Gong, L. He, Z. Yang, and A. Zhou, Numerical analysis of finite dimensional approximations of Kohn-Sham models, Adv. Compu. Math., 38 (2011), p. 1–32.
[25] H. Chen, X. Gong, L. He, and A. Zhou, Finite element approximations of nonlinear eigenvalue problems in quantum physics, Comput. Methods Appl. Mech. Engrg., 200 (2011), pp. 1846–1865.
[26] X. Dai, J. Xu, and A. Zhou, Convergence and optimal complexity of adaptive finite element eigenvalue computations, Numer. Math., 110 (2008), pp. 313–355.
[27] W. Dörfler, A convergent adaptive algorithm for Poisson’s equation, SIAM J. Numer. Anal., 33 (1996), pp. 1106–1124.
[28] M. S. Dupuy, Analysis of The Projector augmented-wave method for electronic structure calculations in periodic settings, PhD thesis, Université Sorbonne Paris Cité, 2018.
[29] M. S. Dupuy, Projector augmented-wave method: An analysis in a one-dimensional setting, ESAIM: M2AN, 54 (2020), pp. 25–58.
[30] G. Dusson and Y. Maday, A posteriori analysis of a non-linear Gross-Pitaevskii type eigenvalue problem, IMA J. Numer. Anal., 37 (2017), pp. 94–137.
[31] J. Fang, X. Gao, and H. Song, Implementation of the projector augmented-wave method: The use of atomic datasets in the standard PAW-XML format, Commun. Comput. Phys., 26 (2019), pp. 1196–1223.
[32] E. Garau, P. Morin, and C. Zuppa, Convergence of adaptive finite element methods for eigenvalue problems, Math. Models Methods Appl. Sci., 19 (2009), pp. 721–747.
[33] F. Gygi, Adaptive Riemannian metric for plane-wave electronic-structure calculations, Europhys. Lett., 19 (1992), pp. 617–622.
[34] P. Hohenberg and W. Kohn, Inhomogeneous electron gas, Phys. Rev. B, 136 (1964), p. B864.
[35] T. Kato, Perturbation Theory for Linear Operators, Springer-Verlag, Berlin Heidelberg, 1976.
[36] W. Kohn and L. J. Sham, Self-consistent equations including exchange and correlation effects, Phys. Rev. A, 140 (1965), pp. A1133–A1138.
[37] G. Kresse and J. Furthmiiller, Efficiency of ab-initio total energy calculations for metals and semiconductors using a plane-wave basis set, Comput. Mater. Sci., 6 (1996), pp. 15–50.
[38] G. Kresse and D. Joubert, From ultrasoft pseudopotentials to the projector augmented-wave method, Phys. Rev. B, 59 (1999), pp. 1758–1775.
[39] K. Laasonen, A. Pasquarello, R. Car, C. Lee, and D. Vanderbilt, Car-Parrinello molecular dynamics with vanderbilt ultrasoft pseudopotentials, Phys. Rev. B, 47 (1993), p. 10142.
[40] R. M. Martin, Electronic Structure: Basic Theory and Practical Methods, Cambridge University Press, 2004.
[41] M. Methfessel and A. T. Paxton, High-precision sampling for brillouin-zone integration in metals, Phys. Rev. B, 40 (1989), p. 3616.
[42] P. Motamarri, M. Nowak, K. Leiter, J. Knap, and V. Gavini, Higher-order adaptive finite-element methods for Kohn-Sham density functional theory, J. Comput. Phys., 253 (2013), pp. 308–343.
[43] J. Osborn, Spectral approximation for compact operators, Math. Comp., 29 (1975), pp. 712–725.
[44] Y. Pan, Some plane wave methods for first principles electronic structure calculation. PhD thesis, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, 2018.
[45] M. C. Payne, M. P. Teter, M. P. Allan, T. A. Arias, and J. D. Joannopoulos, Iterative minimization techniques for ab-initio total energy calculations-molecular dynamics and conjugate gradients, Rev. Mod. Phys., 64 (1992), pp. 1045–1097.
[46] J. Perdew, K. Burke, and M. Ernzerhof, Generalized gradient approximation made simple, Phys. Rev. Lett., 77 (1996), pp. 3865–3868.
[47] L. P. Pitaevskii and S. Stringari, Bose-Einstein Condensation., Clarendon, Oxford, 2003.
[48] P. Pulay, Convergence acceleration of iterative sequences: The case of scf iteration, Chem. Phys. Lett., 73 (1980), pp. 393–398.
[49] Quantum Espresso Development Team, Quantum Espresso software, https://www.quantum-espresso.org (accessed 2021/01/31).
[50] Y. Saad, J. R. Chelikowsky, and S. M. Shontz, Numerical methods for electronic structure calculations of materials, SIAM Rev., 52 (2010), pp. 3–54.
[51] P. Suryanarayana, V. Gavini, T. Blesgen, K. Bhattacharya, and M. Ortiz, Non-periodic finite-element formulation of Kohn-Sham density functional theory, J. Mech. Phys. Solids, 58 (2010), pp. 256–280.
[52] E. Tsuchida and M. Tsukada, Adaptive finite-element method for electronic-structure calculations, Phys. Rev. B, 54 (1996), pp. 7602–7605.
[53] D. Vanderbilt, Soft self-consistent pseudopotentials in a generalized eigenvalue formalism, Phys. Rev. B, 41 (1990), p. 7892.
[54] VASP Development Team, VASP Software, https://www.vasp.at (accessed 2021/01/31).