ContEvol formalism: numerical methods based on Hermite spline optimization

Kaili Cao (曹开力) (email: cao.1191@osu.edu; ORCID: 0000-0002-1699-6944)

Department of Physics, The Ohio State University, 191 West Woodruff Ave, Columbus, OH 43210, USA

Center for Cosmology and AstroParticle Physics (CCAPP), The Ohio State University, 191 West Woodruff Ave, Columbus, OH 43210, USA

Abstract

We present the ContEvol (continuous evolution) formalism, a family of implicit numerical methods that only need to solve linear equations and are almost symplectic. Because they combine values and derivatives of functions, ContEvol outputs allow users to recover the full history and render full distributions. Using the classic harmonic oscillator as a prototype case, we show that ContEvol methods lead to smaller errors than two commonly used Runge–Kutta methods. Applying first-order ContEvol to simple celestial mechanics problems, we demonstrate that the deviation of ContEvol tracks from the equation(s) of motion is still $\mathcal{O}(h^{5})$ by our definition, where $h$ is the step length. Numerical experiments with an eccentric elliptical orbit indicate that first-order ContEvol is a viable alternative to classic Runge–Kutta or the symplectic leapfrog integrator. By solving the stationary Schrödinger equation in quantum mechanics, we demonstrate the ability of ContEvol to handle boundary value or eigenvalue problems. Important directions for future work, including mathematical foundation, higher dimensions, and technical improvements, are discussed at the end of this article.

keywords:

Computational methods (1965)

1 Introduction

Numerical simulations are widely used in contemporary physics. For instance, famous computer codes in astrophysics include Arepo (Springel, 2010) and Athena++ (Jiang et al., 2014) for (magneto)hydrodynamic simulations, galpy (Bovy, 2015) for galactic dynamics, YREC (Demarque et al., 2008) and MESA (Paxton et al., 2011) for stellar evolution, mercury (Chambers, 1999) and REBOUND (Rein and Liu, 2012) for celestial mechanics, to name a few. There are certainly great works in other areas of research as well.

Because of the discreteness of the world of computers, it is common practice to convert differential equations into difference equations, so that finite difference methods can be applied. However, at spatial scales much larger than elementary particles, the physical world is arguably continuous. Finite differences might therefore be intrinsically limited: when we try to model the full history of a dynamic system or the full details of a function of spatial location, we have to resort to spline interpolation. Meanwhile, many physics problems are formulated as first- or second-order differential equations with analytic expressions, indicating that general-purpose methods might be overkill. These considerations motivate the ContEvol (continuous evolution) formalism, which we present in this work. (Footnote 2: According to context, the pronouns “we/us/our” in this work may refer to: i) the author and indirect contributors (see acknowledgements), ii) the author and researchers with similar academic background and interests, or iii) the author and the readers.)

Desire for continuity has provoked thoughts about function representation. Imagine that, in addition to the values of a one-dimensional real function $f(x):[x_{\min},x_{\max}]\mapsto\mathbb{R}$ at a series of sampling points $\{x_{\min},\ldots,x_{i},x_{i+1},\ldots,x_{\max}\}$, we have its first derivative at the same points. Then in each interval $x_{i}\leq x\leq x_{i+1}$, we can always find a cubic polynomial satisfying all boundary conditions at both ends, so that $f(x)$ can be represented as a piecewise cubic function: not only is it continuous, but its first derivative is also continuous, which is favorable to some analyses in physics. This technique is known as the Hermite spline. (Footnote 3: Anecdote: The author “independently” came up with this idea about three weeks before hearing about Hermite spline. For this reason, the author feels obliged to declare the possibility that this work might be reinventing some methods.) It can be naturally extended to higher orders: combining values and first- to $n$th-order derivatives at both ends of an interval, we can find a $(2n+1)$th-order polynomial representation of the function. However, it should be noted that basic calculus yields simple but powerful expressions for addition, subtraction, multiplication, division, and composition of representations with only values and first derivatives:

$\displaystyle h(x)=f(x)\pm g(x)\quad$	$\displaystyle\Rightarrow\quad$	$\displaystyle\dot{h}(x)=\dot{f}(x)\pm\dot{g}(x)$	(1.0.1)
$\displaystyle h(x)=f(x)\cdot g(x)\quad$	$\displaystyle\Rightarrow\quad$	$\displaystyle\dot{h}(x)=h(x)\left[\frac{\dot{f}(x)}{f(x)}+\frac{\dot{g}(x)}{g(x)}\right]$	(1.0.2)
$\displaystyle h(x)=\frac{f(x)}{g(x)}\quad$	$\displaystyle\Rightarrow\quad$	$\displaystyle\dot{h}(x)=h(x)\left[\frac{\dot{f}(x)}{f(x)}-\frac{\dot{g}(x)}{g(x)}\right]$	(1.0.3)
$\displaystyle h(x)=g(f(x))\quad$	$\displaystyle\Rightarrow\quad$	$\displaystyle\dot{h}(x)=\dot{g}(f(x))\dot{f}(x).$	(1.0.4)

Finiteness can be a blessing and a curse — we lose some high-order information, but do not need to assume that functions are infinitely differentiable, unlike when we use spectral methods (e.g., Grandclément and Novak, 2009).
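As a minimal sketch of this representation (not code from this work), the snippet below stores a function as (value, derivative) pairs at two nodes, multiplies two such representations, and evaluates the cubic Hermite interpolant between the nodes. The names `hermite_cubic` and `product` are hypothetical. The product rule is written as $\dot{f}g+f\dot{g}$, which is equivalent to Eq. (1.0.2) but remains defined where $f$ or $g$ vanishes.

```python
import math

def hermite_cubic(x0, x1, f0, d0, f1, d1, x):
    """Evaluate the cubic matching values (f0, f1) and slopes (d0, d1) at x0, x1."""
    h = x1 - x0
    s = (x - x0) / h                      # normalized coordinate in [0, 1]
    h00 = 2*s**3 - 3*s**2 + 1             # standard cubic Hermite basis functions
    h10 = s**3 - 2*s**2 + s
    h01 = -2*s**3 + 3*s**2
    h11 = s**3 - s**2
    return h00*f0 + h10*h*d0 + h01*f1 + h11*h*d1

def product(f, g):
    """Product of two (value, derivative) pairs via the product rule."""
    return (f[0]*g[0], f[1]*g[0] + f[0]*g[1])

# Represent sin and cos on [0, 0.5], multiply them node by node,
# then interpolate the product at the midpoint.
a, b = 0.0, 0.5
f = lambda x: (math.sin(x), math.cos(x))
g = lambda x: (math.cos(x), -math.sin(x))
ha, hb = product(f(a), g(a)), product(f(b), g(b))
mid = hermite_cubic(a, b, ha[0], ha[1], hb[0], hb[1], 0.25)
exact = math.sin(0.25) * math.cos(0.25)
```

Only four numbers per interval are needed to reconstruct the product anywhere inside it; the interpolation error here is governed by the fourth derivative of the product and the fourth power of the interval length.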

ContEvol is a family of numerical methods built on this idea. It approximates functions of space and time as polynomials and minimizes the deviation from the equation(s) of the problem. While details will be presented and discussed in the rest of this work, here we briefly address how this relates to other common methods (e.g., Press et al., 2007). Some of the most important dichotomies of numerical methods include: explicit or implicit, single-step or (linear) multistep, and symplectic (or in physicists’ words, phase space conserving) or not. Since ContEvol finds the optimal solution for the next step, it should be categorized as implicit; however, as we will show in this work, unlike common implicit methods, ContEvol only needs to solve linear equations. Although this work focuses on the single-step version of ContEvol, we will argue that multistep versions are straightforward to achieve. Because of the predefined functional form, ContEvol is not strictly symplectic; however, with moderately small steps, its non-symplecticity (the deviation of the Jacobian determinant from $1$) quickly drops below $2^{-53}$, i.e., it is swamped by the roundoff errors of double precision.

This work is principally for illustration and discussion of general strategies. The rest of this article is structured as follows. In Section 2, we apply first- and second-order ContEvol methods to a prototype case, the classic harmonic oscillator, and compare them to fourth- and eighth-order Runge–Kutta methods. (Footnote 4: An $n$th-order ContEvol method treats up to $n$th-order derivatives at sampling nodes as independent variables.) Then in Section 3, we showcase potential applications of ContEvol in celestial mechanics; examples in this work are two-body and three-body problems, in which the equations of motion are non-linear and multivariate. In Section 4, we use ContEvol to solve the stationary Schrödinger equation in quantum mechanics, which is physically different from the time evolution of a dynamic system. Finally in Section 5, we wrap up this work by discussing important directions for future work, including mathematical foundation, higher dimensions, and technical improvements.

2 Prototype case: classic harmonic oscillator

We start with the simplest case of a dynamical system: the time evolution of a single real variable. To check the results of numerical methods against the exact solution, we choose the classic harmonic oscillator, for which the equation of motion (EOM) is

\displaystyle m\ddot{x}=-kx,

(2.0.1)

where $m$ is the mass of the particle and $k$ is the spring constant. Setting these constants to $1$ (a natural choice which makes time dimensionless; a different scaling would lead to different cost functions and thus different optimization results, but is not explored in this work), the EOM becomes

\displaystyle\ddot{x}=-x.

(2.0.2)

Without loss of generality, we are given $x(0)=x_{0}$ , $\dot{x}(0)=v_{0}$ and try to solve for $x(h)=x_{h}$ , $\dot{x}(h)=v_{h}$ , where $h$ is the time step (usually small). The exact solution is

\displaystyle\left\{\begin{aligned} x_{\rm exact}(t)&=x_{0}\cos t+v_{0}\sin t=\left[\begin{aligned} &x_{0}\left(1-\frac{t^{2}}{2}+\frac{t^{4}}{24}-\frac{t^{6}}{720}+\frac{t^{8}}{40320}+\mathcal{O}(t^{10})\right)\\ &+v_{0}\left(t-\frac{t^{3}}{6}+\frac{t^{5}}{120}-\frac{t^{7}}{5040}+\frac{t^{9}}{362880}+\mathcal{O}(t^{11})\right)\end{aligned}\right]\\ v_{\rm exact}(t)&=-x_{0}\sin t+v_{0}\cos t=\left[\begin{aligned} &-x_{0}\left(t-\frac{t^{3}}{6}+\frac{t^{5}}{120}-\frac{t^{7}}{5040}+\frac{t^{9}}{362880}+\mathcal{O}(t^{11})\right)\\ &+v_{0}\left(1-\frac{t^{2}}{2}+\frac{t^{4}}{24}-\frac{t^{6}}{720}+\frac{t^{8}}{40320}+\mathcal{O}(t^{10})\right)\end{aligned}\right]\end{aligned}\right..

(2.0.3)

Section 2.1 showcases the ability of the first-order ContEvol method, and Section 2.2 compares it to two commonly used Runge–Kutta methods, which are explicit, single-step, multistage schemes. In Section 2.3, we explore the second-order ContEvol method, with and without strict EOM enforcement at $t=h$.

2.1 First-order ContEvol method

We approximate the solution in a parametric form (subscript “CE1” stands for first-order ContEvol)

\displaystyle x_{\rm CE1}(t)=x_{0}+v_{0}t+Bt^{2}+At^{3},\quad t\in[0,h];

(2.1.1)

“terminal” conditions at $t=h$ yield

	$\displaystyle\left\{\begin{aligned} x_{\rm CE1}(h)&=x_{0}+v_{0}h+Bh^{2}+Ah^{3}=x_{h}\\ \dot{x}_{\rm CE1}(h)&=v_{0}+2Bh+3Ah^{2}=v_{h}\end{aligned}\right.$	(2.1.2)
$\displaystyle\Rightarrow\quad$	$\displaystyle\begin{pmatrix}h^{2}&h^{3}\\ 2h&3h^{2}\end{pmatrix}\begin{pmatrix}B\\ A\end{pmatrix}=\begin{pmatrix}x_{h}-x_{0}-v_{0}h\\ v_{h}-v_{0}\end{pmatrix}$	(2.1.3)
$\displaystyle\Rightarrow\quad$	$\displaystyle\left\{\begin{aligned} A&=2(x_{0}-x_{h})h^{-3}+(v_{0}+v_{h})h^{-2}\\ B&=3(x_{h}-x_{0})h^{-2}-(2v_{0}+v_{h})h^{-1}\end{aligned}\right..$	(2.1.4)

Because of the initial conditions $(x_{0},v_{0})^{\rm T}$ , the transformation $(x_{h},v_{h})^{\rm T}\to(A,B)^{\rm T}$ is affine, not linear.
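The inversion in Eq. (2.1.4) can be checked with exact rational arithmetic. The sketch below is an illustration only (the helper name `hermite_AB` and the sample numbers are arbitrary assumptions): it computes $A$ and $B$ from chosen end states and confirms that the cubic of Eq. (2.1.1) reproduces the terminal conditions of Eq. (2.1.2).

```python
from fractions import Fraction as F

def hermite_AB(x0, v0, xh, vh, h):
    """Coefficients of Eq. (2.1.4) for x(t) = x0 + v0 t + B t^2 + A t^3."""
    A = 2*(x0 - xh)/h**3 + (v0 + vh)/h**2
    B = 3*(xh - x0)/h**2 - (2*v0 + vh)/h
    return A, B

# Arbitrary sample end states (assumed values, exact fractions).
x0, v0, xh, vh, h = F(1), F(1, 2), F(3, 4), F(-1, 3), F(1, 10)
A, B = hermite_AB(x0, v0, xh, vh, h)

# Evaluate the cubic and its derivative at t = h.
x_end = x0 + v0*h + B*h**2 + A*h**3
v_end = v0 + 2*B*h + 3*A*h**2
```

Both `x_end` and `v_end` come back equal to the prescribed `xh` and `vh`, which also makes the affine (rather than linear) dependence on the end state explicit: setting `xh = vh = 0` still gives nonzero `A` and `B`.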

We define the cost function as

$\displaystyle\epsilon_{\rm CE1}(A,B;h)$	$\displaystyle=\int_{0}^{h}(\ddot{x}+x)^{2}\,{\rm d}t=\int_{0}^{h}[(2B+x_{0})+(6A+v_{0})t+Bt^{2}+At^{3}]^{2}\,{\rm d}t$
	$\displaystyle=\int_{0}^{h}\left[\begin{aligned} &(4B^{2}+4Bx_{0}+x_{0}^{2})+(24AB+12Ax_{0}+4Bv_{0}+2v_{0}x_{0})t\\ &+(36A^{2}+12Av_{0}+4B^{2}+2Bx_{0}+v_{0}^{2})t^{2}+(16AB+2Ax_{0}+2Bv_{0})t^{3}\\ &+(12A^{2}+2Av_{0}+B^{2})t^{4}+2ABt^{5}+A^{2}t^{6}\end{aligned}\right]\,{\rm d}t$
	$\displaystyle=\left[\begin{aligned} &(4B^{2}+4Bx_{0}+x_{0}^{2})h+(12AB+6Ax_{0}+2Bv_{0}+v_{0}x_{0})h^{2}\\ &+\frac{1}{3}(36A^{2}+12Av_{0}+4B^{2}+2Bx_{0}+v_{0}^{2})h^{3}+\frac{1}{2}(8AB+Ax_{0}+Bv_{0})h^{4}\\ &+\frac{1}{5}(12A^{2}+2Av_{0}+B^{2})h^{5}+\frac{1}{3}ABh^{6}+\frac{1}{7}A^{2}h^{7}\end{aligned}\right];$	(2.1.5)

minimizing this, we obtain

	$\displaystyle\left\{\begin{aligned} \frac{\partial\epsilon_{\rm CE1}}{\partial A}&=(12B+6x_{0})h^{2}+(24A+4v_{0})h^{3}+\frac{1}{2}(8B+x_{0})h^{4}+\frac{2}{5}(12A+v_{0})h^{5}+\frac{1}{3}Bh^{6}+\frac{2}{7}Ah^{7}=0\\ \frac{\partial\epsilon_{\rm CE1}}{\partial B}&=(8B+4x_{0})h+(12A+2v_{0})h^{2}+\frac{2}{3}(4B+x_{0})h^{3}+\frac{1}{2}(8A+v_{0})h^{4}+\frac{2}{5}Bh^{5}+\frac{1}{3}Ah^{6}=0\end{aligned}\right.$	(2.1.6)
$\displaystyle\Rightarrow\quad$	$\displaystyle\begin{pmatrix}24h^{3}+\dfrac{24}{5}h^{5}+\dfrac{2}{7}h^{7}&12h^{2}+4h^{4}+\dfrac{1}{3}h^{6}\\ 12h^{2}+4h^{4}+\dfrac{1}{3}h^{6}&8h+\dfrac{8}{3}h^{3}+\dfrac{2}{5}h^{5}\end{pmatrix}\begin{pmatrix}A_{\rm CE1}\\ B_{\rm CE1}\end{pmatrix}=\begin{pmatrix}-6x_{0}h^{2}-4v_{0}h^{3}-\dfrac{1}{2}x_{0}h^{4}-\dfrac{2}{5}v_{0}h^{5}\\ -4x_{0}h-2v_{0}h^{2}-\dfrac{2}{3}x_{0}h^{3}-\dfrac{1}{2}v_{0}h^{4}\end{pmatrix}$	(2.1.7)
$\displaystyle\Rightarrow\quad$	$\displaystyle\left\{\begin{aligned} A_{\rm CE1}&=\frac{7(-3600v_{0}+1800x_{0}h+60v_{0}h^{2}+120x_{0}h^{3}+10x_{0}h^{5}+3v_{0}h^{6})}{2(75600+10080h^{2}+1080h^{4}+24h^{6}+5h^{8})}\\ B_{\rm CE1}&=-\frac{15(5040x_{0}+1092x_{0}h^{2}+168v_{0}h^{3}+72x_{0}h^{4}+8v_{0}h^{5}+5x_{0}h^{6}+2v_{0}h^{7})}{2(75600+10080h^{2}+1080h^{4}+24h^{6}+5h^{8})}\end{aligned}\right..$	(2.1.8)
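As a hedged cross-check of this step, the sketch below solves the $2\times 2$ linear system of Eq. (2.1.7) by Cramer's rule in exact rational arithmetic and compares the result with the closed form of Eq. (2.1.8). The function names and the sample initial conditions are assumptions made for illustration.

```python
from fractions import Fraction as F

def ce1_coefficients(x0, v0, h):
    """Solve the 2x2 system of Eq. (2.1.7) for (A, B) by Cramer's rule."""
    m11 = 24*h**3 + F(24, 5)*h**5 + F(2, 7)*h**7
    m12 = 12*h**2 + 4*h**4 + F(1, 3)*h**6
    m22 = 8*h + F(8, 3)*h**3 + F(2, 5)*h**5
    r1 = -6*x0*h**2 - 4*v0*h**3 - F(1, 2)*x0*h**4 - F(2, 5)*v0*h**5
    r2 = -4*x0*h - 2*v0*h**2 - F(2, 3)*x0*h**3 - F(1, 2)*v0*h**4
    det = m11*m22 - m12*m12
    return (r1*m22 - m12*r2)/det, (m11*r2 - m12*r1)/det

def ce1_closed_form(x0, v0, h):
    """Closed-form (A, B) of Eq. (2.1.8)."""
    den = 2*(75600 + 10080*h**2 + 1080*h**4 + 24*h**6 + 5*h**8)
    A = 7*(-3600*v0 + 1800*x0*h + 60*v0*h**2 + 120*x0*h**3
           + 10*x0*h**5 + 3*v0*h**6)/den
    B = -15*(5040*x0 + 1092*x0*h**2 + 168*v0*h**3 + 72*x0*h**4
             + 8*v0*h**5 + 5*x0*h**6 + 2*v0*h**7)/den
    return A, B

# Sample inputs (assumed): exact fractions avoid any roundoff in the comparison.
x0, v0, h = F(1), F(1, 3), F(1, 10)
A, B = ce1_coefficients(x0, v0, h)
```

Because the system is linear, one step costs only a fixed number of arithmetic operations; no iteration is needed even though the method is implicit.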

Plugging Eq. (2.1.8) back into Eq. (2.1.1), our solution at $t=h$ is

\displaystyle\begin{pmatrix}x_{h}\\ v_{h}\end{pmatrix}=\begin{pmatrix}G_{{\rm CE1},00}&G_{{\rm CE1},01}\\ G_{{\rm CE1},10}&G_{{\rm CE1},11}\end{pmatrix}\begin{pmatrix}x_{0}\\ v_{0}\end{pmatrix}

(2.1.9)

with

\displaystyle\left\{\begin{aligned} G_{{\rm CE1},00}&=\frac{151200-55440h^{2}-1620h^{4}-192h^{6}+5h^{8}}{2(75600+10080h^{2}+1080h^{4}+24h^{6}+5h^{8})}\\ G_{{\rm CE1},01}&=\frac{151200h-5040h^{3}+60h^{5}-72h^{7}+h^{9}}{2(75600+10080h^{2}+1080h^{4}+24h^{6}+5h^{8})}\\ G_{{\rm CE1},10}&=\frac{60h(-2520+84h^{2}+6h^{4}+h^{6})}{2(75600+10080h^{2}+1080h^{4}+24h^{6}+5h^{8})}\\ G_{{\rm CE1},11}&=\frac{151200-55440h^{2}-1620h^{4}-192h^{6}+13h^{8}}{2(75600+10080h^{2}+1080h^{4}+24h^{6}+5h^{8})}\end{aligned}\right..

(2.1.10)

The determinant of the time evolution operator $G_{\rm CE1}$ is

\displaystyle\det\begin{pmatrix}G_{{\rm CE1},00}&G_{{\rm CE1},01}\\ G_{{\rm CE1},10}&G_{{\rm CE1},11}\end{pmatrix}=1-\frac{19h^{8}}{302400+40320h^{2}+4320h^{4}+96h^{6}+20h^{8}},

(2.1.11)

i.e., unfortunately, ContEvol is not symplectic. However, the discrepancy $1-\det(G_{\rm CE1})\leq 2^{-53}$ (the common double-precision floating-point format cannot resolve discrepancies below this threshold) when $h\leq 0.03396$. Thanks to the linearity of the problem, $G_{\rm CE1}$ is diagonalizable for common choices of $h$, and the complexity of evolving the system for $N$ steps with a fixed time step can be just $2N+\mathcal{O}(1)$.
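The determinant and the quoted step-length threshold can be reproduced with a short script. The sketch below is an illustration under stated assumptions (the names `G_ce1`, `det_defect`, and `h_max` are hypothetical): it builds $G_{\rm CE1}$ from Eq. (2.1.10), compares its determinant with Eq. (2.1.11) exactly, and then bisects for the largest $h$ whose defect stays below $2^{-53}$.

```python
from fractions import Fraction as F

def G_ce1(h):
    """One-step operator of Eq. (2.1.10) as nested lists."""
    u = h*h
    den = 2*(75600 + 10080*u + 1080*u**2 + 24*u**3 + 5*u**4)
    g00 = (151200 - 55440*u - 1620*u**2 - 192*u**3 + 5*u**4)/den
    g01 = h*(151200 - 5040*u + 60*u**2 - 72*u**3 + u**4)/den
    g10 = 60*h*(-2520 + 84*u + 6*u**2 + u**3)/den
    g11 = (151200 - 55440*u - 1620*u**2 - 192*u**3 + 13*u**4)/den
    return [[g00, g01], [g10, g11]]

def det_defect(h):
    """1 - det(G_CE1) according to Eq. (2.1.11)."""
    return 19*h**8/(302400 + 40320*h**2 + 4320*h**4 + 96*h**6 + 20*h**8)

# Exact determinant for a sample step (assumed value h = 1/10).
h = F(1, 10)
G = G_ce1(h)
det = G[0][0]*G[1][1] - G[0][1]*G[1][0]

# Bisection in floating point; the defect increases monotonically with h > 0.
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = 0.5*(lo + hi)
    if det_defect(mid) <= 2.0**-53:
        lo = mid
    else:
        hi = mid
h_max = lo
```

The bisection lands at about $0.03396$, in agreement with the threshold stated above.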

Expanding Eqs. (2.1.9) and (2.1.10), first-order ContEvol yields

\displaystyle\left\{\begin{aligned} x_{\rm CE1}(h)&=\left[\begin{aligned} &x_{0}\left(1-\frac{h^{2}}{2}+\frac{h^{4}}{24}-{\color[rgb]{1,0,0}0}\cdot\frac{h^{6}}{720}+{\color[rgb]{1,0,0}\left(-\frac{284}{15}\right)}\cdot\frac{h^{8}}{40320}+\mathcal{O}(h^{10})\right)\\ &+v_{0}\left(h-\frac{h^{3}}{6}+\frac{h^{5}}{120}-{\color[rgb]{1,0,0}\left(-\frac{18}{7}\right)}\cdot\frac{h^{7}}{5040}+{\color[rgb]{1,0,0}\left(-\frac{1716}{25}\right)}\cdot\frac{h^{9}}{362880}+\mathcal{O}(h^{11})\right)\end{aligned}\right]\\ v_{\rm CE1}(h)&=\left[\begin{aligned} &-x_{0}\left(h-\frac{h^{3}}{6}+{\color[rgb]{1,0,0}\frac{2}{3}}\cdot\frac{h^{5}}{120}-{\color[rgb]{1,0,0}\left(-\frac{14}{3}\right)}\cdot\frac{h^{7}}{5040}+{\color[rgb]{1,0,0}\left(-\frac{392}{5}\right)}\cdot\frac{h^{9}}{362880}+\mathcal{O}(h^{11})\right)\\ &+v_{0}\left(1-\frac{h^{2}}{2}+\frac{h^{4}}{24}-{\color[rgb]{1,0,0}0}\cdot\frac{h^{6}}{720}+{\color[rgb]{1,0,0}(-18)}\cdot\frac{h^{8}}{40320}+\mathcal{O}(h^{10})\right)\end{aligned}\right]\end{aligned}\right..

(2.1.12)

Comparing to the exact solution Eq. (2.0.3), we see that errors in $x_{h}$ and $v_{h}$ (highlighted in red) are $\mathcal{O}(h^{6})$ and $\mathcal{O}(h^{5})$, respectively.
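These error orders can also be estimated numerically. The following sketch (an illustration, not part of the original analysis) takes one step from $x_{0}=1$, $v_{0}=0$ with two step lengths and measures how fast the one-step errors shrink; the choice of $h=0.2$ and $h=0.1$ and the helper names are assumptions.

```python
import math

def G_ce1(h):
    """One-step operator of Eq. (2.1.10), evaluated in floating point."""
    u = h*h
    den = 2*(75600 + 10080*u + 1080*u**2 + 24*u**3 + 5*u**4)
    g00 = (151200 - 55440*u - 1620*u**2 - 192*u**3 + 5*u**4)/den
    g01 = h*(151200 - 5040*u + 60*u**2 - 72*u**3 + u**4)/den
    g10 = 60*h*(-2520 + 84*u + 6*u**2 + u**3)/den
    g11 = (151200 - 55440*u - 1620*u**2 - 192*u**3 + 13*u**4)/den
    return [[g00, g01], [g10, g11]]

def one_step_errors(h):
    """Errors in x_h and v_h for x0 = 1, v0 = 0 (exact: cos h, -sin h)."""
    G = G_ce1(h)
    return abs(G[0][0] - math.cos(h)), abs(G[1][0] + math.sin(h))

ex_coarse, ev_coarse = one_step_errors(0.2)
ex_fine, ev_fine = one_step_errors(0.1)

# Halving h divides an O(h^p) error by about 2^p, so log2 of the ratio estimates p.
order_x = math.log2(ex_coarse/ex_fine)
order_v = math.log2(ev_coarse/ev_fine)
```

With these step lengths the estimated orders come out close to 6 for $x_{h}$ and close to 5 for $v_{h}$; they are not exactly integers because higher-order terms still contribute at finite $h$.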

According to Eq. (2.1.8), the minimized cost function Eq. (2.1.5) is

\displaystyle\epsilon_{{\rm CE1},\min}(h)=\left[\begin{aligned} &\frac{x_{0}^{2}}{720}h^{5}+\frac{v_{0}x_{0}}{720}h^{6}+\left(\frac{v_{0}^{2}}{2800}-\frac{x_{0}^{2}}{2160}\right)h^{7}-\frac{v_{0}x_{0}}{2800}h^{8}+\left(\frac{53x_{0}^{2}}{907200}-\frac{23v_{0}^{2}}{378000}\right)h^{9}+\frac{47v_{0}x_{0}}{1512000}h^{10}\\ &+\left(\frac{19v_{0}^{2}}{5880000}-\frac{11x_{0}^{2}}{6804000}\right)h^{11}+\frac{41v_{0}x_{0}}{79380000}h^{12}+\left(\frac{43v_{0}^{2}}{132300000}-\frac{3223x_{0}^{2}}{5715360000}\right)h^{13}\\ &-\frac{4681v_{0}x_{0}}{9525600000}h^{14}+\left(\frac{9461x_{0}^{2}}{85730400000}-\frac{31273v_{0}^{2}}{333396000000}\right)h^{15}+\frac{71909v_{0}x_{0}}{1000188000000}h^{16}\\ &+\left(\frac{18107v_{0}^{2}}{1666980000000}-\frac{360391x_{0}^{2}}{36006768000000}\right)h^{17}-\frac{287197v_{0}x_{0}}{60011280000000}h^{18}\\ &+\left(\frac{5933x_{0}^{2}}{135025380000000}-\frac{297667v_{0}^{2}}{700131600000000}\right)h^{19}-\frac{420823v_{0}x_{0}}{1575296100000000}h^{20}\end{aligned}\right];

(2.1.13)

note that $\epsilon_{{\rm CE1},\min}(h)=\mathcal{O}(h^{5})$ seems consistent with $x_{\rm CE1}(h)-x_{\rm exact}(h)=\mathcal{O}(h^{6})$. This minimization goal can be used to adapt step length, e.g., for $x_{0}=1$ and $v_{0}=0$ ($x_{0}=0$ and $v_{0}=1$), $\epsilon_{{\rm CE1},\min}(h)\leq 2^{-53}$ when $h\leq 0.002402$ ($h\leq 0.01634$). (Footnote 6: In this section, we use $2^{-53}$ as a general-purpose benchmark for numerical precision, although it is only a threshold for double-precision when the leading-order term is $1$.)
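A short sketch of how the minimized cost could drive step selection follows. It is an illustration under stated assumptions: the cost is integrated exactly from the polynomial residual $\ddot{x}+x$ of the cubic in Eq. (2.1.1), and the step estimates keep only the leading term of Eq. (2.1.13) for each of the two initial conditions. The function names are hypothetical.

```python
from fractions import Fraction as F

def ce1_closed_form(x0, v0, h):
    """Optimal (A, B) of Eq. (2.1.8)."""
    den = 2*(75600 + 10080*h**2 + 1080*h**4 + 24*h**6 + 5*h**8)
    A = 7*(-3600*v0 + 1800*x0*h + 60*v0*h**2 + 120*x0*h**3
           + 10*x0*h**5 + 3*v0*h**6)/den
    B = -15*(5040*x0 + 1092*x0*h**2 + 168*v0*h**3 + 72*x0*h**4
             + 8*v0*h**5 + 5*x0*h**6 + 2*v0*h**7)/den
    return A, B

def residual_cost(A, B, x0, v0, h):
    """Integral of (x'' + x)^2 over [0, h] for the cubic of Eq. (2.1.1)."""
    c = [2*B + x0, 6*A + v0, B, A]        # residual as a polynomial in t
    return sum(c[i]*c[j]*h**(i + j + 1)/(i + j + 1)
               for i in range(4) for j in range(4))

x0, v0, h = F(1), F(0), F(1, 10)
A, B = ce1_closed_form(x0, v0, h)
eps_min = residual_cost(A, B, x0, v0, h)
ratio = float(eps_min/(x0**2*h**5/720))   # compare with the leading term x0^2 h^5 / 720

# Leading-order step estimates for a tolerance of 2^-53 (assumed truncation).
tol = 2.0**-53
h_x = (720*tol)**(1/5)     # x0 = 1, v0 = 0: eps ~ h^5 / 720
h_v = (2800*tol)**(1/7)    # x0 = 0, v0 = 1: eps ~ h^7 / 2800
```

The two estimates reproduce the thresholds quoted above to the digits shown, which suggests that the leading term alone is adequate for choosing a step length at this tolerance.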

2.2 Fourth- and eighth-order Runge–Kutta methods

To enable Runge–Kutta methods, the equation of motion Eq. (2.0.2) has to be written as

\displaystyle\frac{\rm d}{{\rm d}t}\begin{pmatrix}x\\ v\end{pmatrix}={\boldsymbol{f}}(\begin{pmatrix}x\\ v\end{pmatrix})=\begin{pmatrix}v\\ -x\end{pmatrix}.

(2.2.1)

As in many physics problems, this derivative does not have explicit time dependence.

Applying the fourth-order (i.e., classic) Runge–Kutta method, we have

$\displaystyle{\boldsymbol{k}}_{{\rm RK4},1}$	$\displaystyle={\boldsymbol{f}}(\begin{pmatrix}x_{0}\\ v_{0}\end{pmatrix})=\begin{pmatrix}v_{0}\\ -x_{0}\end{pmatrix},$	(2.2.2)
$\displaystyle{\boldsymbol{k}}_{{\rm RK4},2}$	$\displaystyle={\boldsymbol{f}}(\begin{pmatrix}x_{0}\\ v_{0}\end{pmatrix}+\frac{{\boldsymbol{k}}_{{\rm RK4},1}}{2}h)=\left(v_{0}-\frac{x_{0}}{2}h,-x_{0}-\frac{v_{0}}{2}h\right)^{\rm T},$	(2.2.3)
$\displaystyle{\boldsymbol{k}}_{{\rm RK4},3}$	$\displaystyle={\boldsymbol{f}}(\begin{pmatrix}x_{0}\\ v_{0}\end{pmatrix}+\frac{{\boldsymbol{k}}_{{\rm RK4},2}}{2}h)=\left(v_{0}-\frac{x_{0}}{2}h-\frac{v_{0}}{4}h^{2},-x_{0}-\frac{v_{0}}{2}h+\frac{x_{0}}{4}h^{2}\right)^{\rm T},$	(2.2.4)
$\displaystyle{\boldsymbol{k}}_{{\rm RK4},4}$	$\displaystyle={\boldsymbol{f}}(\begin{pmatrix}x_{0}\\ v_{0}\end{pmatrix}+{\boldsymbol{k}}_{{\rm RK4},3}h)=\left(v_{0}-x_{0}h-\frac{v_{0}}{2}h^{2}+\frac{x_{0}}{4}h^{3},-x_{0}-v_{0}h+\frac{x_{0}}{2}h^{2}+\frac{v_{0}}{4}h^{3}\right)^{\rm T},$	(2.2.5)

and then

$\displaystyle\begin{pmatrix}x_{h}\\ v_{h}\end{pmatrix}$	$\displaystyle=\begin{pmatrix}x_{0}\\ v_{0}\end{pmatrix}+\frac{h}{6}({\boldsymbol{k}}_{{\rm RK4},1}+2{\boldsymbol{k}}_{{\rm RK4},2}+2{\boldsymbol{k}}_{{\rm RK4},3}+{\boldsymbol{k}}_{{\rm RK4},4})$
	$\displaystyle=\left(x_{0}+v_{0}h-\frac{x_{0}}{2}h^{2}-\frac{v_{0}}{6}h^{3}+\frac{x_{0}}{24}h^{4},v_{0}-x_{0}h-\frac{v_{0}}{2}h^{2}+\frac{x_{0}}{6}h^{3}+\frac{v_{0}}{24}h^{4}\right)^{\rm T}$
	$\displaystyle=\begin{pmatrix}1-\dfrac{h^{2}}{2}+\dfrac{h^{4}}{24}&h-\dfrac{h^{3}}{6}\\ -h+\dfrac{h^{3}}{6}&1-\dfrac{h^{2}}{2}+\dfrac{h^{4}}{24}\end{pmatrix}\begin{pmatrix}x_{0}\\ v_{0}\end{pmatrix}\equiv G_{\rm RK4}\begin{pmatrix}x_{0}\\ v_{0}\end{pmatrix}.$	(2.2.6)

Evidently, errors in $x_{h}$ and $v_{h}$ are both $\mathcal{O}(h^{5})$ .

The determinant of the time evolution operator $G_{\rm RK4}$ is

\displaystyle\det(G_{\rm RK4})=1-\frac{h^{6}}{72}+\frac{h^{8}}{576},

(2.2.7)

i.e., the discrepancy $1-\det(G_{\rm RK4})$ is two orders lower in $h$ than $1-\det(G_{\rm CE1})$, and therefore larger for small $h$; to achieve $1-\det(G_{\rm RK4})\leq 2^{-53}$, one needs $h\leq 0.004472$, a step $7.594$ times smaller than what was required for first-order ContEvol. To adapt the step length, the fourth-order Runge–Kutta method usually resorts to the fifth-order version, which necessitates a slight increase in computational complexity.
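For comparison, the fourth-order Runge–Kutta step for this problem can be written in a few lines. The sketch below is a generic textbook RK4 implementation in exact rational arithmetic (the function names are assumptions); applying it to the two unit vectors recovers the columns of $G_{\rm RK4}$ in Eq. (2.2.6), from which the determinant of Eq. (2.2.7) follows.

```python
from fractions import Fraction as F

def f(state):
    """Right-hand side of Eq. (2.2.1): d(x, v)/dt = (v, -x)."""
    x, v = state
    return (v, -x)

def rk4_step(state, h):
    """One classic RK4 step of length h."""
    def shifted(k, c):
        return tuple(s + c*h*ki for s, ki in zip(state, k))
    k1 = f(state)
    k2 = f(shifted(k1, F(1, 2)))
    k3 = f(shifted(k2, F(1, 2)))
    k4 = f(shifted(k3, 1))
    return tuple(s + h*(a + 2*b + 2*c + d)/6
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

h = F(1, 10)
col0 = rk4_step((F(1), F(0)), h)   # first column of G_RK4
col1 = rk4_step((F(0), F(1)), h)   # second column of G_RK4
det = col0[0]*col1[1] - col1[0]*col0[1]
```

Each RK4 step needs four evaluations of the right-hand side, whereas a first-order ContEvol step for this linear problem reduces to one $2\times 2$ matrix-vector product once $G_{\rm CE1}$ is known.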

Now let us try the eighth-order Runge–Kutta method, which gives (subscripts “RK8” on the right-hand side are omitted for simplicity; the RK8 coefficients used in this work are found on the MathWorks webpage “Runge Kutta 8th Order Integration”: https://www.mathworks.com/matlabcentral/fileexchange/55431-runge-kutta-8th-order-integration)

$\displaystyle{\boldsymbol{k}}_{{\rm RK8},0}$	$\displaystyle={\boldsymbol{f}}(\begin{pmatrix}x_{0}\\ v_{0}\end{pmatrix})=\begin{pmatrix}v_{0}\\ -x_{0}\end{pmatrix},$	(2.2.8)
$\displaystyle{\boldsymbol{k}}_{{\rm RK8},1}$	$\displaystyle={\boldsymbol{f}}(\begin{pmatrix}x_{0}\\ v_{0}\end{pmatrix}+\frac{4{\boldsymbol{k}}_{0}}{27}h)=\left(v_{0}-\frac{4x_{0}}{27}h,-x_{0}-\frac{4v_{0}}{27}h\right)^{\rm T},$	(2.2.9)
$\displaystyle{\boldsymbol{k}}_{{\rm RK8},2}$	$\displaystyle={\boldsymbol{f}}(\begin{pmatrix}x_{0}\\ v_{0}\end{pmatrix}+\frac{{\boldsymbol{k}}_{0}+3{\boldsymbol{k}}_{1}}{18}h)=\left(v_{0}-\frac{2x_{0}}{9}h-\frac{2v_{0}}{81}h^{2},-x_{0}-\frac{2v_{0}}{9}h+\frac{2x_{0}}{81}h^{2}\right)^{\rm T},$	(2.2.10)
$\displaystyle{\boldsymbol{k}}_{{\rm RK8},3}$	$\displaystyle={\boldsymbol{f}}(\begin{pmatrix}x_{0}\\ v_{0}\end{pmatrix}+\frac{{\boldsymbol{k}}_{0}+3{\boldsymbol{k}}_{2}}{12}h)=\begin{pmatrix}v_{0}-\dfrac{x_{0}}{3}h-\dfrac{v_{0}}{18}h^{2}+\dfrac{x_{0}}{162}h^{3}\\ -x_{0}-\dfrac{v_{0}}{3}h+\dfrac{x_{0}}{18}h^{2}+\dfrac{v_{0}}{162}h^{3}\end{pmatrix},$	(2.2.11)
$\displaystyle{\boldsymbol{k}}_{{\rm RK8},4}$	$\displaystyle={\boldsymbol{f}}(\begin{pmatrix}x_{0}\\ v_{0}\end{pmatrix}+\frac{{\boldsymbol{k}}_{0}+3{\boldsymbol{k}}_{3}}{8}h)=\begin{pmatrix}v_{0}-\dfrac{x_{0}}{2}h-\dfrac{v_{0}}{8}h^{2}+\dfrac{x_{0}}{48}h^{3}+\dfrac{v_{0}}{432}h^{4}\\ -x_{0}-\dfrac{v_{0}}{2}h+\dfrac{x_{0}}{8}h^{2}+\dfrac{v_{0}}{48}h^{3}-\dfrac{x_{0}}{432}h^{4}\end{pmatrix},$	(2.2.12)
$\displaystyle{\boldsymbol{k}}_{{\rm RK8},5}$	$\displaystyle={\boldsymbol{f}}(\begin{pmatrix}x_{0}\\ v_{0}\end{pmatrix}+\frac{13{\boldsymbol{k}}_{0}-27{\boldsymbol{k}}_{2}+42{\boldsymbol{k}}_{3}+8{\boldsymbol{k}}_{4}}{54}h)=\begin{pmatrix}v_{0}-\dfrac{2x_{0}}{3}h-\dfrac{2v_{0}}{9}h^{2}+\dfrac{4x_{0}}{81}h^{3}+\dfrac{23v_{0}}{2916}h^{4}-\dfrac{x_{0}}{2916}h^{5}\\ -x_{0}-\dfrac{2v_{0}}{3}h+\dfrac{2x_{0}}{9}h^{2}+\dfrac{4v_{0}}{81}h^{3}-\dfrac{23x_{0}}{2916}h^{4}-\dfrac{v_{0}}{2916}h^{5}\end{pmatrix},$	(2.2.13)
$\displaystyle{\boldsymbol{k}}_{{\rm RK8},6}$	$\displaystyle={\boldsymbol{f}}(\begin{pmatrix}x_{0}\\ v_{0}\end{pmatrix}+\frac{389{\boldsymbol{k}}_{0}-54{\boldsymbol{k}}_{2}+966{\boldsymbol{k}}_{3}-824{\boldsymbol{k}}_{4}+243{\boldsymbol{k}}_{5}}{4320}h)$
	$\displaystyle=\begin{pmatrix}v_{0}-\dfrac{x_{0}}{6}h-\dfrac{v_{0}}{72}h^{2}+\dfrac{x_{0}}{1296}h^{3}+\dfrac{43v_{0}}{233280}h^{4}-\dfrac{x_{0}}{466560}h^{5}-\dfrac{v_{0}}{51840}h^{6}\\ -x_{0}-\dfrac{v_{0}}{6}h+\dfrac{x_{0}}{72}h^{2}+\dfrac{v_{0}}{1296}h^{3}-\dfrac{43x_{0}}{233280}h^{4}-\dfrac{v_{0}}{466560}h^{5}+\dfrac{x_{0}}{51840}h^{6}\end{pmatrix},$	(2.2.14)
$\displaystyle{\boldsymbol{k}}_{{\rm RK8},7}$	$\displaystyle={\boldsymbol{f}}(\begin{pmatrix}x_{0}\\ v_{0}\end{pmatrix}+\frac{-234{\boldsymbol{k}}_{0}+81{\boldsymbol{k}}_{2}-1164{\boldsymbol{k}}_{3}+656{\boldsymbol{k}}_{4}-122{\boldsymbol{k}}_{5}+800{\boldsymbol{k}}_{6}}{20}h)$
	$\displaystyle=\begin{pmatrix}v_{0}-\dfrac{17x_{0}}{20}h-\dfrac{v_{0}}{2}h^{2}+\dfrac{x_{0}}{6}h^{3}+\dfrac{29v_{0}}{540}h^{4}-\dfrac{19x_{0}}{540}h^{5}+\dfrac{13v_{0}}{6480}h^{6}+\dfrac{x_{0}}{1296}h^{7}\\ -x_{0}-\dfrac{17v_{0}}{20}h+\dfrac{x_{0}}{2}h^{2}+\dfrac{v_{0}}{6}h^{3}-\dfrac{29x_{0}}{540}h^{4}-\dfrac{19v_{0}}{540}h^{5}-\dfrac{13x_{0}}{6480}h^{6}+\dfrac{v_{0}}{1296}h^{7}\end{pmatrix},$	(2.2.15)
$\displaystyle{\boldsymbol{k}}_{{\rm RK8},8}$	$\displaystyle={\boldsymbol{f}}(\begin{pmatrix}x_{0}\\ v_{0}\end{pmatrix}+\frac{-217{\boldsymbol{k}}_{0}+18{\boldsymbol{k}}_{2}-678{\boldsymbol{k}}_{3}+456{\boldsymbol{k}}_{4}-9{\boldsymbol{k}}_{5}+576{\boldsymbol{k}}_{6}+4{\boldsymbol{k}}_{7}}{288}h)$
	$\displaystyle=\begin{pmatrix}v_{0}-\dfrac{5x_{0}}{6}h-\dfrac{497v_{0}}{1440}h^{2}+\dfrac{125x_{0}}{1296}h^{3}+\dfrac{323v_{0}}{15552}h^{4}-\dfrac{47x_{0}}{10368}h^{5}-\dfrac{5v_{0}}{10368}h^{6}+\dfrac{x_{0}}{93312}h^{7}+\dfrac{v_{0}}{93312}h^{8}\\ -x_{0}-\dfrac{5v_{0}}{6}h+\dfrac{497x_{0}}{1440}h^{2}+\dfrac{125v_{0}}{1296}h^{3}-\dfrac{323x_{0}}{15552}h^{4}-\dfrac{47v_{0}}{10368}h^{5}+\dfrac{5x_{0}}{10368}h^{6}+\dfrac{v_{0}}{93312}h^{7}-\dfrac{x_{0}}{93312}h^{8}\end{pmatrix},$	(2.2.16)
$\displaystyle{\boldsymbol{k}}_{{\rm RK8},9}$	$\displaystyle={\boldsymbol{f}}(\begin{pmatrix}x_{0}\\ v_{0}\end{pmatrix}+\frac{1481{\boldsymbol{k}}_{0}-81{\boldsymbol{k}}_{2}+7104{\boldsymbol{k}}_{3}-3376{\boldsymbol{k}}_{4}+72{\boldsymbol{k}}_{5}-5040{\boldsymbol{k}}_{6}-60{\boldsymbol{k}}_{7}+720{\boldsymbol{k}}_{8}}{820}h)$
	$\displaystyle=\begin{pmatrix}v_{0}-x_{0}h-\dfrac{419v_{0}}{820}h^{2}+\dfrac{811x_{0}}{4920}h^{3}+\dfrac{811v_{0}}{22140}h^{4}-\dfrac{8x_{0}}{1845}h^{5}-\dfrac{7v_{0}}{4920}h^{6}+\dfrac{x_{0}}{2214}h^{7}-\dfrac{5v_{0}}{106272}h^{8}-\dfrac{x_{0}}{106272}h^{9}\\ -x_{0}-v_{0}h+\dfrac{419x_{0}}{820}h^{2}+\dfrac{811v_{0}}{4920}h^{3}-\dfrac{811x_{0}}{22140}h^{4}-\dfrac{8v_{0}}{1845}h^{5}+\dfrac{7x_{0}}{4920}h^{6}+\dfrac{v_{0}}{2214}h^{7}+\dfrac{5x_{0}}{106272}h^{8}-\dfrac{v_{0}}{106272}h^{9}\end{pmatrix},$	(2.2.17)

and then

$\displaystyle\begin{pmatrix}x_{h}\\ v_{h}\end{pmatrix}$	$\displaystyle=\begin{pmatrix}x_{0}\\ v_{0}\end{pmatrix}+\frac{h}{840}(41{\boldsymbol{k}}_{0}+27{\boldsymbol{k}}_{3}+272{\boldsymbol{k}}_{4}+27{\boldsymbol{k}}_{5}+216{\boldsymbol{k}}_{6}+216{\boldsymbol{k}}_{8}+41{\boldsymbol{k}}_{9})$
	$\displaystyle=\begin{pmatrix}x_{0}+v_{0}h-\dfrac{x_{0}}{2}h^{2}-\dfrac{v_{0}}{6}h^{3}+\dfrac{1397x_{0}}{33600}h^{4}+\dfrac{v_{0}}{120}h^{5}-\dfrac{x_{0}}{720}h^{6}-\dfrac{v_{0}}{5040}h^{7}+\dfrac{x_{0}}{40320}h^{8}+\dfrac{v_{0}}{2177280}h^{9}-\dfrac{x_{0}}{2177280}h^{10}\\ v_{0}-x_{0}h-\dfrac{v_{0}}{2}h^{2}+\dfrac{x_{0}}{6}h^{3}+\dfrac{1397v_{0}}{33600}h^{4}-\dfrac{x_{0}}{120}h^{5}-\dfrac{v_{0}}{720}h^{6}+\dfrac{x_{0}}{5040}h^{7}+\dfrac{v_{0}}{40320}h^{8}-\dfrac{x_{0}}{2177280}h^{9}-\dfrac{v_{0}}{2177280}h^{10}\end{pmatrix}$
	$\displaystyle=\begin{pmatrix}1-\dfrac{h^{2}}{2}+\dfrac{1397h^{4}}{33600}-\dfrac{h^{6}}{720}+\dfrac{h^{8}}{40320}-\dfrac{h^{10}}{2177280}&h-\dfrac{h^{3}}{6}+\dfrac{h^{5}}{120}-\dfrac{h^{7}}{5040}+\dfrac{h^{9}}{2177280}\\ -h+\dfrac{h^{3}}{6}-\dfrac{h^{5}}{120}+\dfrac{h^{7}}{5040}-\dfrac{h^{9}}{2177280}&1-\dfrac{h^{2}}{2}+\dfrac{1397h^{4}}{33600}-\dfrac{h^{6}}{720}+\dfrac{h^{8}}{40320}-\dfrac{h^{10}}{2177280}\end{pmatrix}\begin{pmatrix}x_{0}\\ v_{0}\end{pmatrix}$
	$\displaystyle\equiv G_{\rm RK8}\begin{pmatrix}x_{0}\\ v_{0}\end{pmatrix}.$	(2.2.18)

For some unknown reason, the fourth-order coefficients have a fractional error of $3/1400$ , while the fifth- to eighth-order coefficients agree with the exact solution Eq. (2.0.3).

The determinant of the time evolution operator $G_{\rm RK8}$ is

\displaystyle\det(G_{\rm RK8})=\left[\begin{aligned} &1-\frac{h^{4}}{5600}+\frac{h^{6}}{11200}-\frac{2797h^{8}}{376320000}-\frac{19h^{10}}{4032000}+\frac{18119h^{12}}{18289152000}\\ &-\frac{2197h^{14}}{36578304000}+\frac{h^{16}}{585252864}-\frac{107h^{18}}{4740548198400}+\frac{h^{20}}{4740548198400}\end{aligned}\right];

(2.2.19)

to achieve $1-\det(G_{\rm RK8})\leq 2^{-53}$, one needs $h\leq 0.0008880$, a step $5.036$ times smaller than what was required for fourth-order Runge–Kutta.
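The leading terms of Eq. (2.2.19) and the quoted thresholds can be traced back to the entries of $G_{\rm RK8}$. The sketch below is illustrative only: it squares the polynomial entries of Eq. (2.2.18) with exact fractions, using the fact that the matrix has the form $[[a,b],[-b,a]]$ so that $\det=a^{2}+b^{2}$, and then estimates the thresholds from the leading term of each defect (an assumed truncation). The helper `poly_mul` is hypothetical.

```python
from fractions import Fraction as F

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists in powers of h."""
    out = [F(0)]*(len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a*b
    return out

# Diagonal and off-diagonal entries of G_RK8 from Eq. (2.2.18).
diag = [F(1), 0, F(-1, 2), 0, F(1397, 33600), 0, F(-1, 720), 0,
        F(1, 40320), 0, F(-1, 2177280)]
off = [0, F(1), 0, F(-1, 6), 0, F(1, 120), 0, F(-1, 5040), 0, F(1, 2177280)]

d2, o2 = poly_mul(diag, diag), poly_mul(off, off)
det = [d2[k] + (o2[k] if k < len(o2) else 0) for k in range(len(d2))]

# Fractional error of the h^4 coefficient relative to the exact 1/24.
frac_err = (F(1, 24) - F(1397, 33600))/F(1, 24)

# Leading-order thresholds for a defect of 2^-53.
tol = 2.0**-53
h_rk4 = (72*tol)**(1/6)      # from 1 - det(G_RK4) ~ h^6 / 72
h_rk8 = (5600*tol)**(1/4)    # from 1 - det(G_RK8) ~ h^4 / 5600
```

The $h^{4}$ term of the determinant is twice the error of the fourth-order coefficient, $2(1397/33600-1/24)=-1/5600$, so the $3/1400$ discrepancy noted above is what sets the RK8 threshold.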

2.3 Second-order ContEvol method

The ContEvol framework can be naturally generalized to higher orders. As in Section 2.1, we approximate the solution in a parametric form (subscript “CE2” stands for second-order ContEvol)

\displaystyle x_{\rm CE2}(t)=x_{0}+v_{0}t-\frac{x_{0}}{2}t^{2}+Ct^{3}+Bt^{4}+At^{5},\quad t\in[0,h];

(2.3.1)

“terminal” conditions at $t=h$ yield

	$\displaystyle\left\{\begin{aligned} x_{\rm CE2}(h)&=x_{0}+v_{0}h-\frac{x_{0}}{2}h^{2}+Ch^{3}+Bh^{4}+Ah^{5}=x_{h}\\ \dot{x}_{\rm CE2}(h)&=v_{0}-x_{0}h+3Ch^{2}+4Bh^{3}+5Ah^{4}=v_{h}\\ \ddot{x}_{\rm CE2}(h)&=-x_{0}+6Ch+12Bh^{2}+20Ah^{3}=-x_{h}\end{aligned}\right.$	(2.3.2)
$\displaystyle\Rightarrow\quad$	$\displaystyle\begin{pmatrix}h^{3}&h^{4}&h^{5}\\ 3h^{2}&4h^{3}&5h^{4}\\ 6h&12h^{2}&20h^{3}\end{pmatrix}\begin{pmatrix}C\\ B\\ A\end{pmatrix}=\begin{pmatrix}x_{h}-x_{0}-v_{0}h+\dfrac{x_{0}}{2}h^{2}\\ v_{h}-v_{0}+x_{0}h\\ -x_{h}+x_{0}\end{pmatrix}$	(2.3.3)
$\displaystyle\Rightarrow\quad$	$\displaystyle\left\{\begin{aligned} A&=6(x_{h}-x_{0})h^{-5}-3(v_{0}+v_{h})h^{-4}+\frac{x_{0}-x_{h}}{2}h^{-3}\\ B&=15(x_{0}-x_{h})h^{-4}+(8v_{0}+7v_{h})h^{-3}+\left(x_{h}-\frac{3}{2}x_{0}\right)h^{-2}\\ C&=10(x_{h}-x_{0})h^{-3}-(6v_{0}+4v_{h})h^{-2}+\frac{3x_{0}-x_{h}}{2}h^{-1}\end{aligned}\right..$	(2.3.4)

Note that we have enforced the EOM at both $t=0$ and $t=h$ , and the three coefficients ( $A$ , $B$ , and $C$ ) are fully specified by two parameters ( $x_{h}$ and $v_{h}$ ).
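As with the first-order case, Eq. (2.3.4) can be verified in exact arithmetic. The sketch below (hypothetical helper name, arbitrary sample numbers) computes $A$, $B$, and $C$ from chosen end states and checks the three terminal conditions of Eq. (2.3.2), including the EOM at $t=h$.

```python
from fractions import Fraction as F

def hermite_ABC(x0, v0, xh, vh, h):
    """Coefficients of Eq. (2.3.4) for the quintic of Eq. (2.3.1)."""
    A = 6*(xh - x0)/h**5 - 3*(v0 + vh)/h**4 + (x0 - xh)/(2*h**3)
    B = 15*(x0 - xh)/h**4 + (8*v0 + 7*vh)/h**3 + (xh - F(3, 2)*x0)/h**2
    C = 10*(xh - x0)/h**3 - (6*v0 + 4*vh)/h**2 + (3*x0 - xh)/(2*h)
    return A, B, C

# Arbitrary sample end states (assumed values).
x0, v0, xh, vh, h = F(1), F(1, 2), F(3, 4), F(-1, 3), F(1, 10)
A, B, C = hermite_ABC(x0, v0, xh, vh, h)

# Value, first derivative, and second derivative of the quintic at t = h.
x_end = x0 + v0*h - x0*h**2/2 + C*h**3 + B*h**4 + A*h**5
v_end = v0 - x0*h + 3*C*h**2 + 4*B*h**3 + 5*A*h**4
a_end = -x0 + 6*C*h + 12*B*h**2 + 20*A*h**3
```

The quintic passes through the prescribed end state and satisfies $\ddot{x}(h)=-x_{h}$ by construction, so only $x_{h}$ and $v_{h}$ remain free.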

Likewise, we define the cost function as

$\displaystyle\epsilon_{\rm CE2}(A,B,C;h)$	$\displaystyle=\int_{0}^{h}(\ddot{x}+x)^{2}\,{\rm d}t=\int_{0}^{h}[(6C+v_{0})t+\left(12B-\frac{x_{0}}{2}\right)t^{2}+(20A+C)t^{3}+Bt^{4}+At^{5}]^{2}\,{\rm d}t$
	$\displaystyle=\int_{0}^{h}\left[\begin{aligned} &(36C^{2}+12Cv_{0}+v_{0}^{2})t^{2}+(144BC+24Bv_{0}-6Cx_{0}-v_{0}x_{0})t^{3}\\ &+\left(240AC+40Av_{0}+144B^{2}-12Bx_{0}+12C^{2}+2Cv_{0}+\frac{x_{0}^{2}}{4}\right)t^{4}\\ &+(480AB-20Ax_{0}+36BC+2Bv_{0}-Cx_{0})t^{5}\\ &+(400A^{2}+52AC+2Av_{0}+24B^{2}-Bx_{0}+C^{2})t^{6}\\ &+(64AB-Ax_{0}+2BC)t^{7}+(40A^{2}+2AC+B^{2})t^{8}+2ABt^{9}+A^{2}t^{10}\end{aligned}\right]\,{\rm d}t$
	$\displaystyle=\left[\begin{aligned} &\left(12C^{2}+4Cv_{0}+\frac{v_{0}^{2}}{3}\right)h^{3}+\left(36BC+6Bv_{0}-\frac{3Cx_{0}}{2}-\frac{v_{0}x_{0}}{4}\right)h^{4}\\ &+\frac{1}{20}\left(960AC+160Av_{0}+576B^{2}-48Bx_{0}+48C^{2}+8Cv_{0}+x_{0}^{2}\right)h^{5}\\ &+\frac{1}{6}(480AB-20Ax_{0}+36BC+2Bv_{0}-Cx_{0})h^{6}\\ &+\frac{1}{7}(400A^{2}+52AC+2Av_{0}+24B^{2}-Bx_{0}+C^{2})h^{7}+\left(8AB-\frac{Ax_{0}}{8}+\frac{BC}{4}\right)h^{8}\\ &+\frac{1}{9}(40A^{2}+2AC+B^{2})h^{9}+\frac{1}{5}ABh^{10}+\frac{1}{11}A^{2}h^{11}\end{aligned}\right];$	(2.3.5)

minimizing this, we obtain

	$\displaystyle\left\{\begin{aligned} \frac{\partial\epsilon_{\rm CE2}}{\partial A}&=\left[\begin{aligned} &(48C+8v_{0})h^{5}+\left(80B-\frac{10x_{0}}{3}\right)h^{6}+\frac{2}{7}(400A+26C+v_{0})h^{7}\\ &+\left(8B-\frac{x_{0}}{8}\right)h^{8}+\frac{2}{9}(40A+C)h^{9}+\frac{1}{5}Bh^{10}+\frac{2}{11}Ah^{11}\end{aligned}\right]=0\\ \frac{\partial\epsilon_{\rm CE2}}{\partial B}&=\left[\begin{aligned} &(36C+6v_{0})h^{4}+\frac{12}{5}\left(24B-x_{0}\right)h^{5}+\left(80A+6C+\frac{v_{0}}{3}\right)h^{6}\\ &+\frac{1}{7}(48B-x_{0})h^{7}+\left(8A+\frac{C}{4}\right)h^{8}+\frac{2}{9}Bh^{9}+\frac{1}{5}Ah^{10}\end{aligned}\right]=0\\ \frac{\partial\epsilon_{\rm CE2}}{\partial C}&=\left[\begin{aligned} &(24C+4v_{0})h^{3}+\left(36B-\frac{3x_{0}}{2}\right)h^{4}+\frac{2}{5}(120A+12C+v_{0})h^{5}\\ &+\left(6B-\frac{x_{0}}{6}\right)h^{6}+\frac{2}{7}(26A+C)h^{7}+\frac{1}{4}Bh^{8}+\frac{2}{9}Ah^{9}\end{aligned}\right]=0\end{aligned}\right.$	(2.3.6)
$\displaystyle\Rightarrow\quad$	$\displaystyle\begin{pmatrix}\dfrac{800}{7}h^{7}+\dfrac{80}{9}h^{9}+\dfrac{2}{11}h^{11}&80h^{6}+8h^{8}+\dfrac{1}{5}h^{10}&48h^{5}+\dfrac{52}{7}h^{7}+\dfrac{2}{9}h^{9}\\ 80h^{6}+8h^{8}+\dfrac{1}{5}h^{10}&\dfrac{288}{5}h^{5}+\dfrac{48}{7}h^{7}+\dfrac{2}{9}h^{9}&36h^{4}+6h^{6}+\dfrac{1}{4}h^{8}\\ 48h^{5}+\dfrac{52}{7}h^{7}+\dfrac{2}{9}h^{9}&36h^{4}+6h^{6}+\dfrac{1}{4}h^{8}&24h^{3}+\dfrac{24}{5}h^{5}+\dfrac{2}{7}h^{7}\end{pmatrix}\begin{pmatrix}A_{\rm CE2}\\ B_{\rm CE2}\\ C_{\rm CE2}\end{pmatrix}$
	$\displaystyle=\begin{pmatrix}-8v_{0}h^{5}+\dfrac{10}{3}x_{0}h^{6}-\dfrac{2}{7}v_{0}h^{7}+\dfrac{1}{8}x_{0}h^{8}\\ -6v_{0}h^{4}+\dfrac{12}{5}x_{0}h^{5}-\dfrac{1}{3}v_{0}h^{6}+\dfrac{1}{7}x_{0}h^{7}\\ -4v_{0}h^{3}+\dfrac{3}{2}x_{0}h^{4}-\dfrac{2}{5}v_{0}h^{5}+\dfrac{1}{6}x_{0}h^{6}\end{pmatrix}$	(2.3.7)
$\displaystyle\Rightarrow\quad$	$\displaystyle\left\{\begin{aligned} A_{\rm CE2}&=\frac{33\left[\begin{aligned} &487710720v_{0}-228614400x_{0}h-26127360v_{0}h^{2}-4596480x_{0}h^{3}-1209600v_{0}h^{4}\\ &-42336x_{0}h^{5}-18240v_{0}h^{6}-5040x_{0}h^{7}-1680v_{0}h^{8}+175x_{0}h^{9}\end{aligned}\right]}{16(120708403200+8622028800h^{2}+337478400h^{4}+14065920h^{6}+347760h^{8}+4536h^{10}+245h^{12})}\\ B_{\rm CE2}&=\frac{15\left[\begin{aligned} &670602240x_{0}+107775360x_{0}h^{2}+21288960v_{0}h^{3}+1542240x_{0}h^{4}+774144v_{0}h^{5}\\ &+12024x_{0}h^{6}+16128v_{0}h^{7}+1638x_{0}h^{8}+896v_{0}h^{9}-105x_{0}h^{10}\end{aligned}\right]}{2(120708403200+8622028800h^{2}+337478400h^{4}+14065920h^{6}+347760h^{8}+4536h^{10}+245h^{12})}\\ C_{\rm CE2}&=\frac{-3\left[\begin{aligned} &26824089600v_{0}+1916006400v_{0}h^{2}+199584000x_{0}h^{3}+154828800v_{0}h^{4}-5322240x_{0}h^{5}\\ &+5210880v_{0}h^{6}-312480x_{0}h^{7}+104160v_{0}h^{8}+1120x_{0}h^{9}+4704v_{0}h^{10}-735x_{0}h^{11}\end{aligned}\right]}{4(120708403200+8622028800h^{2}+337478400h^{4}+14065920h^{6}+347760h^{8}+4536h^{10}+245h^{12})}\end{aligned}\right..$	(2.3.8)

Since Eq. (2.3.8) is inconsistent with Eq. (2.3.4), we have two options.

Option 1: With EOM enforced at $t=h$ (“direct” solution).

First, we enforce $\ddot{x}(h)=-x_{h}$ by rewriting the cost function Eq. (2.3) as

\displaystyle\epsilon_{\rm CE2}(x_{h},v_{h};h)=\left[\begin{aligned} &\frac{120}{7}(x_{0}^{2}-2x_{0}x_{h}+x_{h}^{2})h^{-3}+\frac{120}{7}(v_{0}x_{0}-v_{0}x_{h}+v_{h}x_{0}-v_{h}x_{h})h^{-2}\\ &+\frac{2}{35}(96v_{0}^{2}+108v_{0}v_{h}+96v_{h}^{2}-65x_{0}^{2}+130x_{0}x_{h}-65x_{h}^{2})h^{-1}\\ &-\frac{2}{35}(61v_{0}x_{0}-19v_{0}x_{h}+19v_{h}x_{0}-61v_{h}x_{h})\\ &+\frac{1}{2310}(-1056v_{0}^{2}+132v_{0}v_{h}-1056v_{h}^{2}+1213x_{0}^{2}+346x_{0}x_{h}+1213x_{h}^{2})h\\ &+\frac{1}{154}(31v_{0}x_{0}+13v_{0}x_{h}-13v_{h}x_{0}-31v_{h}x_{h})h^{2}\\ &+\frac{1}{27720}(416v_{0}^{2}-532v_{0}v_{h}+416v_{h}^{2}-369x_{0}^{2}-450x_{0}x_{h}-369x_{h}^{2})h^{3}\\ &+\frac{1}{27720}(-69v_{0}x_{0}-52v_{0}x_{h}+52v_{h}x_{0}+69v_{h}x_{h})h^{4}+\frac{1}{27720}(3x_{0}^{2}+5x_{0}x_{h}+3x_{h}^{2})h^{5}\end{aligned}\right];

(2.3.9)

minimizing this, we obtain (“d” in the subscript stands for direct)

	$\displaystyle\left\{\begin{aligned} \frac{\partial\epsilon_{\rm CE2}}{\partial x_{h}}&=\left[\begin{aligned} &-\frac{240}{7}(x_{0}-x_{h})h^{-3}-\frac{120}{7}(v_{0}+v_{h})h^{-2}+\frac{52}{7}(x_{0}-x_{h})h^{-1}\\ &+\left(\frac{38v_{0}}{35}+\frac{122v_{h}}{35}\right)+\left(\frac{173x_{0}}{1155}+\frac{1213x_{h}}{1155}\right)h+\left(\frac{13v_{0}}{154}-\frac{31v_{h}}{154}\right)h^{2}\\ &-\left(\frac{5x_{0}}{308}+\frac{41x_{h}}{1540}\right)h^{3}+\left(\frac{23v_{h}}{9240}-\frac{13v_{0}}{6930}\right)h^{4}+\left(\frac{x_{0}}{5544}+\frac{x_{h}}{4620}\right)h^{5}\end{aligned}\right]=0\\ \frac{\partial\epsilon_{\rm CE2}}{\partial v_{h}}&=\left[\begin{aligned} &\frac{120}{7}(x_{0}-x_{h})h^{-2}+\left(\frac{216v_{0}}{35}+\frac{384v_{h}}{35}\right)h^{-1}+\left(\frac{122x_{h}}{35}-\frac{38x_{0}}{35}\right)+\frac{2}{35}(v_{0}-16v_{h})h\\ &-\left(\frac{13x_{0}}{154}+\frac{31x_{h}}{154}\right)h^{2}+\left(\frac{104v_{h}}{3465}-\frac{19v_{0}}{990}\right)h^{3}+\left(\frac{13x_{0}}{6930}+\frac{23x_{h}}{9240}\right)h^{4}\end{aligned}\right]=0\end{aligned}\right.$	(2.3.10)
$\displaystyle\Rightarrow\quad$	$\displaystyle\begin{pmatrix}\dfrac{240}{7}-\dfrac{52}{7}h^{2}+\dfrac{1213}{1155}h^{4}-\dfrac{41}{1540}h^{6}+\dfrac{1}{4620}h^{8}&-\dfrac{120}{7}h+\dfrac{122}{35}h^{3}-\dfrac{31}{154}h^{5}+\dfrac{23}{9240}h^{7}\\ -\dfrac{120}{7}h+\dfrac{122}{35}h^{3}-\dfrac{31}{154}h^{5}+\dfrac{23}{9240}h^{7}&\dfrac{384}{35}h^{2}-\dfrac{32}{35}h^{4}+\dfrac{104}{3465}h^{6}\end{pmatrix}\begin{pmatrix}x_{h,{\rm d}}\\ v_{h,{\rm d}}\end{pmatrix}$
	$\displaystyle=\begin{pmatrix}\dfrac{240x_{0}}{7}+\dfrac{120v_{0}}{7}h-\dfrac{52x_{0}}{7}h^{2}-\dfrac{38v_{0}}{35}h^{3}-\dfrac{173x_{0}}{1155}h^{4}-\dfrac{13v_{0}}{154}h^{5}+\dfrac{5x_{0}}{308}h^{6}+\dfrac{13v_{0}}{6930}h^{7}-\dfrac{x_{0}}{5544}h^{8}\\ -\dfrac{120x_{0}}{7}h-\dfrac{216v_{0}}{35}h^{2}+\dfrac{38x_{0}}{35}h^{3}-\dfrac{2v_{0}}{35}h^{4}+\dfrac{13x_{0}}{154}h^{5}+\dfrac{19v_{0}}{990}h^{6}-\dfrac{13x_{0}}{6930}h^{7}\end{pmatrix}$	(2.3.11)
$\displaystyle\Rightarrow\quad$	$\displaystyle\left\{\begin{aligned} x_{h,{\rm d}}&=\frac{4\left[\begin{aligned} &1437004800x_{0}+1437004800v_{0}h-602173440x_{0}h^{2}-123171840v_{0}h^{3}+6799680x_{0}h^{4}-2324160v_{0}h^{5}\\ &+469872x_{0}h^{6}+37296v_{0}h^{7}-8520x_{0}h^{8}-4248v_{0}h^{9}+1125x_{0}h^{10}+149v_{0}h^{11}-13x_{0}h^{12}\end{aligned}\right]}{3(1916006400+155105280h^{2}+6785280h^{4}+312576h^{6}+8464h^{8}+120h^{10}+7h^{12})}\\ v_{h,{\rm d}}&=\frac{\left[\begin{aligned} 1916006400v_{0}-1916006400x_{0}h-802897920v_{0}h^{2}+164229120x_{0}h^{3}+9066240v_{0}h^{4}+2908800x_{0}h^{5}\\ +626496v_{0}h^{6}-31776x_{0}h^{7}-11360v_{0}h^{8}+6680x_{0}h^{9}+1500v_{0}h^{10}-198x_{0}h^{11}-12v_{0}h^{12}+x_{0}h^{13}\end{aligned}\right]}{1916006400+155105280h^{2}+6785280h^{4}+312576h^{6}+8464h^{8}+120h^{10}+7h^{12}}\end{aligned}\right.,$	(2.3.12)

or equivalently

\displaystyle\begin{pmatrix}x_{h,{\rm d}}\\ v_{h,{\rm d}}\end{pmatrix}=\begin{pmatrix}G_{{\rm CE2d},00}&G_{{\rm CE2d},01}\\ G_{{\rm CE2d},10}&G_{{\rm CE2d},11}\end{pmatrix}\begin{pmatrix}x_{0}\\ v_{0}\end{pmatrix}

(2.3.13)

with

\displaystyle\left\{\begin{aligned} G_{{\rm CE2d},00}&=\frac{4(1437004800-602173440h^{2}+6799680h^{4}+469872h^{6}-8520h^{8}+1125h^{10}-13h^{12})}{3(1916006400+155105280h^{2}+6785280h^{4}+312576h^{6}+8464h^{8}+120h^{10}+7h^{12})}\\ G_{{\rm CE2d},01}&=\frac{4(1437004800h-123171840h^{3}-2324160h^{5}+37296h^{7}-4248h^{9}+149h^{11})}{3(1916006400+155105280h^{2}+6785280h^{4}+312576h^{6}+8464h^{8}+120h^{10}+7h^{12})}\\ G_{{\rm CE2d},10}&=\frac{-1916006400h+164229120h^{3}+2908800h^{5}-31776h^{7}+6680h^{9}-198h^{11}+h^{13}}{1916006400+155105280h^{2}+6785280h^{4}+312576h^{6}+8464h^{8}+120h^{10}+7h^{12}}\\ G_{{\rm CE2d},11}&=\frac{1916006400-802897920h^{2}+9066240h^{4}+626496h^{6}-11360h^{8}+1500h^{10}-12h^{12}}{1916006400+155105280h^{2}+6785280h^{4}+312576h^{6}+8464h^{8}+120h^{10}+7h^{12}}\end{aligned}\right..

(2.3.14)

The determinant of the time evolution operator $G_{\rm CE2d}$ is

\displaystyle\det\begin{pmatrix}G_{{\rm CE2d},00}&G_{{\rm CE2d},01}\\ G_{{\rm CE2d},10}&G_{{\rm CE2d},11}\end{pmatrix}=1-\frac{17h^{12}}{3(1916006400+155105280h^{2}+6785280h^{4}+312576h^{6}+8464h^{8}+120h^{10}+7h^{12})};

(2.3.15)

to achieve $1-\det(G_{\rm CE2d})\leq 2^{-53}$ , one only needs $h\leq 0.2406$ , a step $7.087$ times larger than what was required for first-order ContEvol.

Expanding Eqs. (2.3.13) and (2.3.14), second-order ContEvol with $\ddot{x}(h)=-x_{h}$ enforced yields

\displaystyle\left\{\begin{aligned} x_{\rm CE2d}(h)&=\left[\begin{aligned} &x_{0}\left(1-\frac{h^{2}}{2}+\frac{h^{4}}{24}-{\color[rgb]{1,0,0}\frac{29}{28}}\cdot\frac{h^{6}}{720}+{\color[rgb]{1,0,0}\frac{1019}{630}}\cdot\frac{h^{8}}{40320}+\mathcal{O}(h^{10})\right)\\ &+v_{0}\left(h-\frac{h^{3}}{6}+\frac{h^{5}}{120}-{\color[rgb]{1,0,0}\frac{67}{60}}\cdot\frac{h^{7}}{5040}+{\color[rgb]{1,0,0}\frac{11513}{3850}}\cdot\frac{h^{9}}{362880}+\mathcal{O}(h^{11})\right)\end{aligned}\right]\\ v_{\rm CE2d}(h)&=\left[\begin{aligned} &-x_{0}\left(h-\frac{h^{3}}{6}+{\color[rgb]{1,0,0}\frac{85}{84}}\cdot\frac{h^{5}}{120}-{\color[rgb]{1,0,0}\frac{607}{504}}\cdot\frac{h^{7}}{5040}+{\color[rgb]{1,0,0}\frac{1559}{490}}\cdot\frac{h^{9}}{362880}+\mathcal{O}(h^{11})\right)\\ &+v_{0}\left(1-\frac{h^{2}}{2}+\frac{h^{4}}{24}-{\color[rgb]{1,0,0}\frac{29}{28}}\cdot\frac{h^{6}}{720}+{\color[rgb]{1,0,0}\frac{1019}{630}}\cdot\frac{h^{8}}{40320}+\mathcal{O}(h^{10})\right)\end{aligned}\right]\end{aligned}\right..

(2.3.16)

Comparing to the exact solution Eq. (2.0.3), we see that errors in $x_{h}$ and $v_{h}$ (highlighted in red) are still $\mathcal{O}(h^{6})$ and $\mathcal{O}(h^{5})$ , respectively, the same as for first-order ContEvol Eq. (2.1.12); however, the coefficients are much closer to the exact values.
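The expansion in Eq. (2.3.16) can be cross-checked by dividing the numerator and denominator polynomials of Eq. (2.3.14) as power series in $h$. The following is a minimal sketch in exact rational arithmetic for $G_{{\rm CE2d},00}$ only; the helper name `series_div` and the dictionary layout are illustrative choices, not part of the formalism.

```python
from fractions import Fraction

def series_div(num, den, order):
    """Power-series quotient num/den up to h**order (illustrative helper).

    num and den map exponents of h to integer coefficients; den[0] must be nonzero.
    """
    coeffs = []
    for k in range(order + 1):
        s = Fraction(num.get(k, 0))
        for j in range(1, k + 1):
            s -= den.get(j, 0) * coeffs[k - j]
        coeffs.append(s / den[0])
    return coeffs

# Polynomials of G_{CE2d,00} in Eq. (2.3.14), before the 4/3 prefactor.
num00 = {0: 1437004800, 2: -602173440, 4: 6799680, 6: 469872,
         8: -8520, 10: 1125, 12: -13}
den = {0: 1916006400, 2: 155105280, 4: 6785280, 6: 312576,
       8: 8464, 10: 120, 12: 7}

# Fold the prefactor 4/3 into numerator and denominator.
num00 = {k: 4 * v for k, v in num00.items()}
den00 = {k: 3 * v for k, v in den.items()}

c = series_div(num00, den00, 8)
# Ratios to the exact cosine coefficients at orders h^6 and h^8.
print(c[6] * (-720), c[8] * 40320)
```

The two printed ratios correspond to the factors highlighted in red in the $x_{0}$ part of $x_{\rm CE2d}(h)$; the same helper can be applied to the other three matrix elements.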

The minimized cost function Eq. (2.3.9) is

\displaystyle\epsilon_{{\rm CE2d},\min}(h)=\frac{h^{9}\left[\begin{aligned} &1425600x_{0}^{2}+1425600v_{0}x_{0}h+5616(64v_{0}^{2}-55x_{0}^{2})h^{2}-193104v_{0}x_{0}h^{3}\\ &-36(512v_{0}^{2}-541x_{0}^{2})h^{4}+5220v_{0}x_{0}h^{5}+(256v_{0}^{2}-243x_{0}^{2})h^{6}-31v_{0}x_{0}h^{7}+x_{0}^{2}h^{8}\end{aligned}\right]}{7560(1916006400+155105280h^{2}+6785280h^{4}+312576h^{6}+8464h^{8}+120h^{10}+7h^{12})};

(2.3.17)

when $h\leq 0.1014$ ( $h\leq 0.1742$ ), $\epsilon_{{\rm CE2d},\min}(h)\leq 2^{-53}$ for $x_{0}=1$ and $v_{0}=0$ ( $x_{0}=0$ and $v_{0}=1$ ).

Option 2: Without EOM enforced at $t=h$ (“indirect” solution).

Second, we remove the $\ddot{x}(h)=-x_{h}$ constraint and simply adopt Eq. (2.3.8); ergo (“i” in the subscript stands for indirect)

\displaystyle\left\{\begin{aligned} x_{h,{\rm i}}&\equiv x_{\rm CE2i}(h)=x_{0}+v_{0}h-\frac{x_{0}}{2}h^{2}+C_{\rm CE2}h^{3}+B_{\rm CE2}h^{4}+A_{\rm CE2}h^{5}\\ &=\frac{\left[\begin{aligned} &1931334451200x_{0}+1931334451200v_{0}h-827714764800x_{0}h^{2}-183936614400v_{0}h^{3}\\ &+16895692800x_{0}h^{4}-1497968640v_{0}h^{5}+518987520x_{0}h^{6}+59581440v_{0}h^{7}-9711360x_{0}h^{8}\\ &-3985920v_{0}h^{9}+1086048x_{0}h^{10}+156096v_{0}h^{11}-15568x_{0}h^{12}-448v_{0}h^{13}+35x_{0}h^{14}\end{aligned}\right]}{16(120708403200+8622028800h^{2}+337478400h^{4}+14065920h^{6}+347760h^{8}+4536h^{10}+245h^{12})}\\ v_{h,{\rm i}}&\equiv\dot{x}_{\rm CE2i}(h)=v_{0}-x_{0}h+3C_{\rm CE2}h^{2}+4B_{\rm CE2}h^{3}+5A_{\rm CE2}h^{4}\\ &=\frac{\left[\begin{aligned} &1931334451200v_{0}-1931334451200x_{0}h-827714764800v_{0}h^{2}+183936614400x_{0}h^{3}\\ &+16895692800v_{0}h^{4}+1426118400x_{0}h^{5}+558904320v_{0}h^{6}-51598080x_{0}h^{7}\\ &-10022400v_{0}h^{8}+4471200x_{0}h^{9}+1054656v_{0}h^{10}-158256x_{0}h^{11}-12544v_{0}h^{12}\end{aligned}\right]}{16(120708403200+8622028800h^{2}+337478400h^{4}+14065920h^{6}+347760h^{8}+4536h^{10}+245h^{12})}\\ a_{h,{\rm i}}&\equiv\ddot{x}_{\rm CE2i}(h)=-x_{0}+6C_{\rm CE2}h+12B_{\rm CE2}h^{2}+20A_{\rm CE2}h^{3}\\ &=-\frac{\left[\begin{aligned} &482833612800x_{0}+482833612800v_{0}h-206928691200x_{0}h^{2}-45984153600v_{0}h^{3}\\ &+3864672000x_{0}h^{4}-566092800v_{0}h^{5}+163676160x_{0}h^{6}+14688000v_{0}h^{7}\\ &-1576800x_{0}h^{8}-921600v_{0}h^{9}+280224x_{0}h^{10}+39312v_{0}h^{11}-3325x_{0}h^{12}\end{aligned}\right]}{4(120708403200+8622028800h^{2}+337478400h^{4}+14065920h^{6}+347760h^{8}+4536h^{10}+245h^{12})}\end{aligned}\right.,

(2.3.18)

or equivalently (neglecting the acceleration; note that this “indirect” strategy automatically avoids accumulation of additional error in $a_{h}$ , as it “resets” $a_{0}$ to $-x_{0}$ at each step)

\displaystyle\begin{pmatrix}x_{h,{\rm i}}\\ v_{h,{\rm i}}\end{pmatrix}=\begin{pmatrix}G_{{\rm CE2i},00}&G_{{\rm CE2i},01}\\ G_{{\rm CE2i},10}&G_{{\rm CE2i},11}\end{pmatrix}\begin{pmatrix}x_{0}\\ v_{0}\end{pmatrix}

(2.3.19)

with

\displaystyle\left\{\begin{aligned} G_{{\rm CE2i},00}&=\frac{\left[\begin{aligned} &1931334451200-827714764800h^{2}+16895692800h^{4}\\ &+518987520h^{6}-9711360h^{8}+1086048h^{10}-15568h^{12}+35h^{14}\end{aligned}\right]}{16(120708403200+8622028800h^{2}+337478400h^{4}+14065920h^{6}+347760h^{8}+4536h^{10}+245h^{12})}\\ G_{{\rm CE2i},01}&=\frac{1931334451200h-183936614400h^{3}-1497968640h^{5}+59581440h^{7}-3985920h^{9}+156096h^{11}-448h^{13}}{16(120708403200+8622028800h^{2}+337478400h^{4}+14065920h^{6}+347760h^{8}+4536h^{10}+245h^{12})}\\ G_{{\rm CE2i},10}&=\frac{-1931334451200h+183936614400h^{3}+1426118400h^{5}-51598080h^{7}+4471200h^{9}-158256h^{11}}{16(120708403200+8622028800h^{2}+337478400h^{4}+14065920h^{6}+347760h^{8}+4536h^{10}+245h^{12})}\\ G_{{\rm CE2i},11}&=\frac{\left[\begin{aligned} &1931334451200-827714764800h^{2}+16895692800h^{4}\\ &+558904320h^{6}-10022400h^{8}+1054656h^{10}-12544h^{12}\end{aligned}\right]}{16(120708403200+8622028800h^{2}+337478400h^{4}+14065920h^{6}+347760h^{8}+4536h^{10}+245h^{12})}\end{aligned}\right..

(2.3.20)

The determinant of the time evolution operator $G_{\rm CE2i}$ is

\displaystyle\det\begin{pmatrix}G_{{\rm CE2i},00}&G_{{\rm CE2i},01}\\ G_{{\rm CE2i},10}&G_{{\rm CE2i},11}\end{pmatrix}=1-\frac{h^{6}(7983360+77760h^{2}-288h^{4}+816h^{6}-h^{8})}{\left[\begin{aligned} &4(120708403200+8622028800h^{2}+337478400h^{4}\\ &+14065920h^{6}+347760h^{8}+4536h^{10}+245h^{12})\end{aligned}\right]};

(2.3.21)

to achieve $1-\det(G_{\rm CE2i})\leq 2^{-53}$ , one needs $h\leq 0.01374$ , $17.52$ times smaller than what was required when we enforce $\ddot{x}(h)=-x_{h}$ .
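The two step-size thresholds quoted for the determinants can be reproduced with a simple bisection. The sketch below evaluates the closed-form defects $1-\det(G)$ from Eqs. (2.3.15) and (2.3.21) directly, which avoids the cancellation that would occur when subtracting a floating-point determinant from $1$. The function names, the bracketing interval and the iteration count are assumptions made for illustration.

```python
def one_minus_det_direct(h):
    """1 - det(G_CE2d), Eq. (2.3.15)."""
    den = (1916006400 + 155105280 * h**2 + 6785280 * h**4 + 312576 * h**6
           + 8464 * h**8 + 120 * h**10 + 7 * h**12)
    return 17 * h**12 / (3 * den)

def one_minus_det_indirect(h):
    """1 - det(G_CE2i), Eq. (2.3.21)."""
    num = h**6 * (7983360 + 77760 * h**2 - 288 * h**4 + 816 * h**6 - h**8)
    den = 4 * (120708403200 + 8622028800 * h**2 + 337478400 * h**4
               + 14065920 * h**6 + 347760 * h**8 + 4536 * h**10 + 245 * h**12)
    return num / den

def largest_step(defect, tol=2.0**-53, lo=1e-4, hi=1.0, iters=200):
    """Bisection for the largest h with defect(h) <= tol.

    Assumes defect is increasing on [lo, hi], defect(lo) <= tol < defect(hi).
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if defect(mid) <= tol:
            lo = mid
        else:
            hi = mid
    return lo

h_d = largest_step(one_minus_det_direct)
h_i = largest_step(one_minus_det_indirect)
print(h_d, h_i, h_d / h_i)
```

The printed values should agree with $0.2406$, $0.01374$ and the ratio $17.52$ to the quoted number of digits.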

Expanding Eqs. (2.3.19) and (2.3.20), second-order ContEvol without the $\ddot{x}(h)=-x_{h}$ constraint yields

\displaystyle\left\{\begin{aligned} x_{\rm CE2i}(h)&=\left[\begin{aligned} &x_{0}\left(1-\frac{h^{2}}{2}+\frac{h^{4}}{24}-{\color[rgb]{1,0,0}\frac{115}{112}}\cdot\frac{h^{6}}{720}+{\color[rgb]{1,0,0}\frac{121}{84}}\cdot\frac{h^{8}}{40320}+\mathcal{O}(h^{10})\right)\\ &+v_{0}\left(h-\frac{h^{3}}{6}+\frac{h^{5}}{120}-{\color[rgb]{1,0,0}\frac{13}{12}}\cdot\frac{h^{7}}{5040}+{\color[rgb]{1,0,0}\frac{365}{154}}\cdot\frac{h^{9}}{362880}+\mathcal{O}(h^{11})\right)\end{aligned}\right]\\ v_{\rm CE2i}(h)&=\left[\begin{aligned} &-x_{0}\left(h-\frac{h^{3}}{6}+{\color[rgb]{1,0,0}\frac{225}{224}}\cdot\frac{h^{5}}{120}-{\color[rgb]{1,0,0}\frac{751}{672}}\cdot\frac{h^{7}}{5040}+{\color[rgb]{1,0,0}\frac{125077}{51744}}\cdot\frac{h^{9}}{362880}+\mathcal{O}(h^{11})\right)\\ &+v_{0}\left(1-\frac{h^{2}}{2}+\frac{h^{4}}{24}-{\color[rgb]{1,0,0}\frac{85}{84}}\cdot\frac{h^{6}}{720}+{\color[rgb]{1,0,0}\frac{635}{462}}\cdot\frac{h^{8}}{40320}+\mathcal{O}(h^{10})\right)\end{aligned}\right]\end{aligned}\right..

(2.3.22)

Comparing to the exact solution Eq. (2.0.3), we see that errors in $x_{h}$ and $v_{h}$ (highlighted in red) are once again $\mathcal{O}(h^{6})$ and $\mathcal{O}(h^{5})$ , respectively. Note that the “indirect” coefficients Eq. (2.3.22) are slightly closer to the exact version than their “direct” counterparts Eq. (2.3.16).

According to the optimal coefficients Eq. (2.3.8), the minimized cost function Eq. (2.3) is

\displaystyle\epsilon_{{\rm CE2i},\min}(h)=\frac{h^{9}\left[\begin{aligned} &199584000x_{0}^{2}+191600640v_{0}x_{0}h+(46448640v_{0}^{2}-39916800x_{0}^{2})h^{2}-24030720(v_{0}x_{0})h^{3}\\ &-(2211840v_{0}^{2}-2337120x_{0}^{2})h^{4}+604800v_{0}x_{0}h^{5}+(28672v_{0}^{2}-27216x_{0}^{2})h^{6}-3360v_{0}x_{0}h^{7}+105x_{0}^{2}h^{8}\end{aligned}\right]}{26880(120708403200+8622028800h^{2}+337478400h^{4}+14065920h^{6}+347760h^{8}+4536h^{10}+245h^{12})};

(2.3.23)

when $h\leq 0.1068$ ( $h\leq 0.1832$ ), $\epsilon_{{\rm CE2i},\min}(h)\leq 2^{-53}$ for $x_{0}=1$ and $v_{0}=0$ ( $x_{0}=0$ and $v_{0}=1$ ).

To summarize, the marginal benefit of raising ContEvol to second order is moderate: this reduces the minimized cost function from $\mathcal{O}(h^{5})$ to $\mathcal{O}(h^{9})$ , leading to a better representation of the evolutionary track, but does not reduce the order of errors in $x_{h}$ or $v_{h}$ . One is advised to enforce the equation of motion at $t=h$ if symplecticity is more important, but to remove this constraint if error control takes priority. In the rest of this work, we only consider first-order ContEvol methods, since for real-world problems second derivatives might be unavailable or unaffordable.
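The trade-off between the two options can be illustrated numerically. The sketch below builds both time evolution matrices from Eqs. (2.3.14) and (2.3.20) in exact rational arithmetic for a single step, then compares the determinant defect and the largest deviation from the exact rotation matrix. The step $h=1/10$ and all function names are assumptions chosen for this illustration.

```python
import math
from fractions import Fraction

def poly(coeffs, h):
    """Evaluate sum(c * h**p) for a {power: coefficient} dictionary."""
    return sum(c * h**p for p, c in coeffs.items())

def g_direct(h):
    """Time evolution matrix G_CE2d, Eq. (2.3.14)."""
    den = poly({0: 1916006400, 2: 155105280, 4: 6785280, 6: 312576,
                8: 8464, 10: 120, 12: 7}, h)
    g00 = 4 * poly({0: 1437004800, 2: -602173440, 4: 6799680, 6: 469872,
                    8: -8520, 10: 1125, 12: -13}, h) / (3 * den)
    g01 = 4 * poly({1: 1437004800, 3: -123171840, 5: -2324160, 7: 37296,
                    9: -4248, 11: 149}, h) / (3 * den)
    g10 = poly({1: -1916006400, 3: 164229120, 5: 2908800, 7: -31776,
                9: 6680, 11: -198, 13: 1}, h) / den
    g11 = poly({0: 1916006400, 2: -802897920, 4: 9066240, 6: 626496,
                8: -11360, 10: 1500, 12: -12}, h) / den
    return [[g00, g01], [g10, g11]]

def g_indirect(h):
    """Time evolution matrix G_CE2i, Eq. (2.3.20)."""
    den = 16 * poly({0: 120708403200, 2: 8622028800, 4: 337478400,
                     6: 14065920, 8: 347760, 10: 4536, 12: 245}, h)
    g00 = poly({0: 1931334451200, 2: -827714764800, 4: 16895692800,
                6: 518987520, 8: -9711360, 10: 1086048, 12: -15568,
                14: 35}, h) / den
    g01 = poly({1: 1931334451200, 3: -183936614400, 5: -1497968640,
                7: 59581440, 9: -3985920, 11: 156096, 13: -448}, h) / den
    g10 = poly({1: -1931334451200, 3: 183936614400, 5: 1426118400,
                7: -51598080, 9: 4471200, 11: -158256}, h) / den
    g11 = poly({0: 1931334451200, 2: -827714764800, 4: 16895692800,
                6: 558904320, 8: -10022400, 10: 1054656, 12: -12544}, h) / den
    return [[g00, g01], [g10, g11]]

def det_defect(g):
    """Exact 1 - det(G) for a 2x2 matrix of Fractions."""
    return 1 - (g[0][0] * g[1][1] - g[0][1] * g[1][0])

def max_error(g, h):
    """Largest entrywise deviation from the exact rotation by angle h."""
    t = float(h)
    exact = [[math.cos(t), math.sin(t)], [-math.sin(t), math.cos(t)]]
    return max(abs(float(g[i][j]) - exact[i][j])
               for i in range(2) for j in range(2))

h = Fraction(1, 10)
gd, gi = g_direct(h), g_indirect(h)
print("direct:  ", float(det_defect(gd)), max_error(gd, h))
print("indirect:", float(det_defect(gi)), max_error(gi, h))
```

For this step size, the direct matrix is expected to show the smaller determinant defect and the indirect matrix the smaller one-step error, in line with the recommendation above.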

3 Celestial mechanics: two-body and three-body problems

In this section, we extend the ContEvol framework to the time evolution of multiple real variables. As astrophysicists, we choose two of the simplest cases from celestial mechanics: the two-body and three-body problems.

The equations of motion (EOMs) for a two-body problem are

\displaystyle\left\{\begin{aligned} m_{0}\ddot{\boldsymbol{r}}_{0}&=-Gm_{1}m_{0}\frac{{\boldsymbol{r}}_{0}-{\boldsymbol{r}}_{1}}{\|{\boldsymbol{r}}_{0}-{\boldsymbol{r}}_{1}\|^{3}}\\ m_{1}\ddot{\boldsymbol{r}}_{1}&=-Gm_{0}m_{1}\frac{{\boldsymbol{r}}_{1}-{\boldsymbol{r}}_{0}}{\|{\boldsymbol{r}}_{1}-{\boldsymbol{r}}_{0}\|^{3}}\end{aligned}\right.,

(3.0.1)

where $G$ is the gravitational constant and $m_{i}$ denote the masses of the two objects; setting the constant $G(m_{0}+m_{1})$ to $1$ , these can be straightforwardly reduced to

\displaystyle\ddot{\boldsymbol{r}}=-\frac{\boldsymbol{r}}{r^{3}},

(3.0.2)

with ${\boldsymbol{r}}\equiv{\boldsymbol{r}}_{1}-{\boldsymbol{r}}_{0}$ and $r\equiv\|{\boldsymbol{r}}\|$ (in the rest of this section, we use regular symbols to denote magnitudes of vectors without further notice). This problem only needs to be solved in two dimensions, as the particle never leaves the plane spanned by the initial position and velocity (or the line, if the two are collinear, in which case it is trivial to apply the full results to the one-dimensional case). The general solution to the above EOM can be expressed in parametric forms, which we do not include here; exact solutions to specific problems (i.e., for specific initial values) will be presented when needed.

The (unrestricted) three-body problem is more complicated, with equations of motion

\displaystyle\left\{\begin{aligned} m_{0}\ddot{\boldsymbol{r}}_{0}^{\prime}&=-Gm_{1}m_{0}\frac{{\boldsymbol{r}}_{0}^{\prime}-{\boldsymbol{r}}_{1}^{\prime}}{\|{\boldsymbol{r}}_{0}^{\prime}-{\boldsymbol{r}}_{1}^{\prime}\|^{3}}-Gm_{2}m_{0}\frac{{\boldsymbol{r}}_{0}^{\prime}-{\boldsymbol{r}}_{2}^{\prime}}{\|{\boldsymbol{r}}_{0}^{\prime}-{\boldsymbol{r}}_{2}^{\prime}\|^{3}}\\ m_{1}\ddot{\boldsymbol{r}}_{1}^{\prime}&=-Gm_{0}m_{1}\frac{{\boldsymbol{r}}_{1}^{\prime}-{\boldsymbol{r}}_{0}^{\prime}}{\|{\boldsymbol{r}}_{1}^{\prime}-{\boldsymbol{r}}_{0}^{\prime}\|^{3}}-Gm_{2}m_{1}\frac{{\boldsymbol{r}}_{1}^{\prime}-{\boldsymbol{r}}_{2}^{\prime}}{\|{\boldsymbol{r}}_{1}^{\prime}-{\boldsymbol{r}}_{2}^{\prime}\|^{3}}\\ m_{2}\ddot{\boldsymbol{r}}_{2}^{\prime}&=-Gm_{0}m_{2}\frac{{\boldsymbol{r}}_{2}^{\prime}-{\boldsymbol{r}}_{0}^{\prime}}{\|{\boldsymbol{r}}_{2}^{\prime}-{\boldsymbol{r}}_{0}^{\prime}\|^{3}}-Gm_{1}m_{2}\frac{{\boldsymbol{r}}_{2}^{\prime}-{\boldsymbol{r}}_{1}^{\prime}}{\|{\boldsymbol{r}}_{2}^{\prime}-{\boldsymbol{r}}_{1}^{\prime}\|^{3}}\end{aligned}\right.,

(3.0.3)

where the prime “^′” denotes the inertial coordinate system; writing ${\boldsymbol{r}}_{i}\equiv{\boldsymbol{r}}_{i}^{\prime}-{\boldsymbol{r}}_{0}^{\prime}$ , $r_{i}\equiv\|{\boldsymbol{r}}_{i}\|$ and $G(m_{0}+m_{1}+m_{2})=1$ , $\mu_{i}\equiv m_{i}/(m_{0}+m_{1}+m_{2})<1$ for $i=1,2$ , these equations can be reduced to

\displaystyle\left\{\begin{aligned} \ddot{\boldsymbol{r}}_{1}&=-(1-\mu_{2})\frac{{\boldsymbol{r}}_{1}}{r_{1}^{3}}-\mu_{2}\left(\frac{{\boldsymbol{r}}_{2}}{r_{2}^{3}}+\frac{{\boldsymbol{r}}_{1}-{\boldsymbol{r}}_{2}}{\|{\boldsymbol{r}}_{1}-{\boldsymbol{r}}_{2}\|^{3}}\right)\\ \ddot{\boldsymbol{r}}_{2}&=-(1-\mu_{1})\frac{{\boldsymbol{r}}_{2}}{r_{2}^{3}}-\mu_{1}\left(\frac{{\boldsymbol{r}}_{1}}{r_{1}^{3}}+\frac{{\boldsymbol{r}}_{2}-{\boldsymbol{r}}_{1}}{\|{\boldsymbol{r}}_{2}-{\boldsymbol{r}}_{1}\|^{3}}\right)\end{aligned}\right..

(3.0.4)

The above equations do not have a closed-form solution in general.
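Since Eq. (3.0.4) has to be integrated numerically, it may help to see its right-hand side written out as code. The following is a minimal sketch in plain Python; the function name `reduced_three_body_accel` and the tuple-based vector representation are assumptions for illustration, and units follow the convention $G(m_{0}+m_{1}+m_{2})=1$ used above.

```python
import math

def reduced_three_body_accel(r1, r2, mu1, mu2):
    """Accelerations of bodies 1 and 2 relative to body 0, Eq. (3.0.4).

    r1, r2 are relative position vectors given as equal-length tuples.
    """
    def over_cube(v):
        # v / |v|^3
        n3 = math.sqrt(sum(x * x for x in v)) ** 3
        return tuple(x / n3 for x in v)

    u1 = over_cube(r1)
    u2 = over_cube(r2)
    u12 = over_cube(tuple(a - b for a, b in zip(r1, r2)))  # (r1 - r2)/|r1 - r2|^3
    a1 = tuple(-(1 - mu2) * p - mu2 * (q + s) for p, q, s in zip(u1, u2, u12))
    # (r2 - r1)/|r2 - r1|^3 equals -u12
    a2 = tuple(-(1 - mu1) * q - mu1 * (p - s) for p, q, s in zip(u1, u2, u12))
    return a1, a2

# With mu2 = 0 the first equation reduces to the two-body EOM, Eq. (3.0.2).
a1, a2 = reduced_three_body_accel((1.0, 0.0), (0.0, 2.0), 0.1, 0.0)
print(a1)
```

Setting $\mu_{2}=0$ in this sketch recovers $\ddot{\boldsymbol{r}}_{1}=-{\boldsymbol{r}}_{1}/r_{1}^{3}$ , which gives a quick consistency check against the two-body case.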

Although $r_{x(i)}$ and $r_{y(i)}$ will be written as polynomials, Taylor expansion of $r_{(i)}=\sqrt{r_{x(i)}^{2}+r_{y(i)}^{2}}$ has infinitely many terms, hence some truncation is necessary. In Section 3.1, we apply the first-order ContEvol method to the two-body problem, keeping “adequately” many terms. We show that this is equivalent to linearization and Taylor expansion in Section 3.2. In Section 3.3, we investigate conservation of mechanical energy and angular momentum, before moving on to numerical tests with an eccentric elliptical orbit in Section 3.4. Finally in Section 3.5, we describe how ContEvol is supposed to be applied to the three-body problem.

3.1 Two-body, first-order ContEvol with “adequate” expansion

Without loss of generality, we are given ${\boldsymbol{r}}(0)={\boldsymbol{r}}_{0}$ , $\dot{\boldsymbol{r}}(0)={\boldsymbol{v}}_{0}$ and try to solve for ${\boldsymbol{r}}(h)={\boldsymbol{r}}_{h}$ , $\dot{\boldsymbol{r}}(h)={\boldsymbol{v}}_{h}$ , where $h$ is the time step. Like in Section 2.1, we approximate the solution in a parametric form (subscript “CE2” now stands for ContEvol and two-body problem; note that we are recycling the subscripts)

\displaystyle{\boldsymbol{r}}_{\rm CE2}(t)={\boldsymbol{r}}_{0}+{\boldsymbol{v}}_{0}t+{\boldsymbol{B}}t^{2}+{\boldsymbol{A}}t^{3},\quad t\in[0,h],

(3.1.1)

with coefficients ${\boldsymbol{A}}$ and ${\boldsymbol{B}}$ yielded by “terminal” conditions at $t=h$

\displaystyle\left\{\begin{aligned} {\boldsymbol{A}}&=2({\boldsymbol{r}}_{0}-{\boldsymbol{r}}_{h})h^{-3}+({\boldsymbol{v}}_{0}+{\boldsymbol{v}}_{h})h^{-2}\\ {\boldsymbol{B}}&=3({\boldsymbol{r}}_{h}-{\boldsymbol{r}}_{0})h^{-2}-(2{\boldsymbol{v}}_{0}+{\boldsymbol{v}}_{h})h^{-1}\end{aligned}\right..

(3.1.2)
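The relation between the cubic coefficients and the endpoint values in Eqs. (3.1.1) and (3.1.2) can be checked with a few lines of code. This is a minimal sketch; the function names `hermite_coefficients` and `track` and the sample endpoint values are hypothetical and chosen only for illustration.

```python
def hermite_coefficients(r0, v0, rh, vh, h):
    """Coefficients A and B of Eq. (3.1.2), computed component by component."""
    A = tuple(2 * (a - b) / h**3 + (c + d) / h**2
              for a, b, c, d in zip(r0, rh, v0, vh))
    B = tuple(3 * (b - a) / h**2 - (2 * c + d) / h
              for a, b, c, d in zip(r0, rh, v0, vh))
    return A, B

def track(r0, v0, A, B, t):
    """Position and velocity along the cubic track Eq. (3.1.1) at time t."""
    r = tuple(x + v * t + b * t**2 + a * t**3
              for x, v, a, b in zip(r0, v0, A, B))
    v = tuple(v + 2 * b * t + 3 * a * t**2
              for v, a, b in zip(v0, A, B))
    return r, v

# Arbitrary sample endpoints (not a solution of the EOM).
r0, v0 = (1.0, 0.0), (0.0, 1.0)
rh, vh = (0.995, 0.0998), (-0.0998, 0.995)
h = 0.1
A, B = hermite_coefficients(r0, v0, rh, vh, h)
r_end, v_end = track(r0, v0, A, B, h)
print(r_end, v_end)
```

By construction the track passes through both endpoints with the prescribed velocities, so `r_end` and `v_end` should reproduce `rh` and `vh` up to rounding.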

To define the cost function $\epsilon$ as a finite polynomial of $h$ , we have to truncate the Taylor expansion on the right-hand side of the EOM Eq. (3.0.2). Since ${\boldsymbol{r}}_{\rm CE2}(t)$ traces up to the third order, we do not expect any benefit from going beyond the third order; justifying this statement is left for future work — note that non-linear coefficients in ${\boldsymbol{A}}$ and ${\boldsymbol{B}}$ start to occur at the fourth order, so one would have to solve non-linear equations to minimize the cost function. Thus we have

\displaystyle r_{\rm CE2}^{2}(t)\approx{\boldsymbol{r}}_{0}^{2}+2{\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0}t+(2{\boldsymbol{B}}\cdot{\boldsymbol{r}}_{0}+v_{0}^{2})t^{2}+2({\boldsymbol{A}}\cdot{\boldsymbol{r}}_{0}+{\boldsymbol{B}}\cdot{\boldsymbol{v}}_{0})t^{3},\quad t\in[0,h],

(3.1.3)

and the cost function is defined as

$\displaystyle\epsilon_{\rm CE2}({\boldsymbol{A}},{\boldsymbol{B}};h)$	$\displaystyle=\int_{0}^{h}\left(\ddot{\boldsymbol{r}}+\frac{\boldsymbol{r}}{r^{3}}\right)^{2}\,{\rm d}t=\int_{0}^{h}\left[(2{\boldsymbol{B}}+6{\boldsymbol{A}}t)+\frac{{\boldsymbol{r}}_{0}+{\boldsymbol{v}}_{0}t+{\boldsymbol{B}}t^{2}+{\boldsymbol{A}}t^{3}}{\|{\boldsymbol{r}}_{0}+{\boldsymbol{v}}_{0}t+{\boldsymbol{B}}t^{2}+{\boldsymbol{A}}t^{3}\|^{3}}\right]^{2}\,{\rm d}t$
	$\displaystyle\approx\int_{0}^{h}\left[{\boldsymbol{C}}_{0}+{\boldsymbol{C}}_{1}t+{\boldsymbol{C}}_{2}t^{2}+{\boldsymbol{C}}_{3}t^{3}\right]^{2}\,{\rm d}t$
	$\displaystyle=\int_{0}^{h}\left[\begin{aligned} &C_{0}^{2}+2{\boldsymbol{C}}_{0}\cdot{\boldsymbol{C}}_{1}t+(C_{1}^{2}+2{\boldsymbol{C}}_{0}\cdot{\boldsymbol{C}}_{2})t^{2}+2({\boldsymbol{C}}_{0}\cdot{\boldsymbol{C}}_{3}+{\boldsymbol{C}}_{1}\cdot{\boldsymbol{C}}_{2})t^{3}\\ &+(C_{2}^{2}+2{\boldsymbol{C}}_{1}\cdot{\boldsymbol{C}}_{3})t^{4}+2{\boldsymbol{C}}_{2}\cdot{\boldsymbol{C}}_{3}t^{5}+C_{3}^{2}t^{6}\end{aligned}\right]\,{\rm d}t$
	$\displaystyle=\left[\begin{aligned} &C_{0}^{2}h+{\boldsymbol{C}}_{0}\cdot{\boldsymbol{C}}_{1}h^{2}+\frac{1}{3}(C_{1}^{2}+2{\boldsymbol{C}}_{0}\cdot{\boldsymbol{C}}_{2})h^{3}+\frac{1}{2}({\boldsymbol{C}}_{0}\cdot{\boldsymbol{C}}_{3}+{\boldsymbol{C}}_{1}\cdot{\boldsymbol{C}}_{2})h^{4}\\ &+\frac{1}{5}(C_{2}^{2}+2{\boldsymbol{C}}_{1}\cdot{\boldsymbol{C}}_{3})h^{5}+\frac{1}{3}{\boldsymbol{C}}_{2}\cdot{\boldsymbol{C}}_{3}h^{6}+\frac{1}{7}C_{3}^{2}h^{7}\end{aligned}\right]$	(3.1.4)

with

\displaystyle\left\{\begin{aligned} {\boldsymbol{C}}_{0}&=2{\boldsymbol{B}}+\frac{{\boldsymbol{r}}_{0}}{r_{0}^{3}}\\ {\boldsymbol{C}}_{1}&=6{\boldsymbol{A}}+\frac{{\boldsymbol{v}}_{0}}{r_{0}^{3}}-\frac{3{\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0}}{r_{0}^{5}}{\boldsymbol{r}}_{0}\\ {\boldsymbol{C}}_{2}&=\frac{\boldsymbol{B}}{r_{0}^{3}}-\frac{3{\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0}}{r_{0}^{5}}{\boldsymbol{v}}_{0}-\frac{3}{2}\left(\frac{2{\boldsymbol{B}}\cdot{\boldsymbol{r}}_{0}+v_{0}^{2}}{r_{0}^{5}}-\frac{5({\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0})^{2}}{r_{0}^{7}}\right){\boldsymbol{r}}_{0}\\ {\boldsymbol{C}}_{3}&=\left[\begin{aligned} &\frac{\boldsymbol{A}}{r_{0}^{3}}-\frac{3{\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0}}{r_{0}^{5}}\boldsymbol{B}-\frac{3}{2}\left(\frac{2{\boldsymbol{B}}\cdot{\boldsymbol{r}}_{0}+v_{0}^{2}}{r_{0}^{5}}-\frac{5({\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0})^{2}}{r_{0}^{7}}\right){\boldsymbol{v}}_{0}\\ &-\left(\frac{3({\boldsymbol{A}}\cdot{\boldsymbol{r}}_{0}+{\boldsymbol{B}}\cdot{\boldsymbol{v}}_{0})}{r_{0}^{5}}-\frac{15(2{\boldsymbol{B}}\cdot{\boldsymbol{r}}_{0}+v_{0}^{2})({\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0})}{2r_{0}^{7}}+\frac{35({\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0})^{3}}{2r_{0}^{9}}\right){\boldsymbol{r}}_{0}\end{aligned}\right]\end{aligned}\right.;

(3.1.5)

because of the ${\boldsymbol{B}}\cdot{\boldsymbol{r}}_{0}$ , ${\boldsymbol{A}}\cdot{\boldsymbol{r}}_{0}$ , and ${\boldsymbol{B}}\cdot{\boldsymbol{v}}_{0}$ terms, the two components are coupled with each other.
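The truncated coefficients in Eq. (3.1.5) can be sanity-checked against the untruncated integrand of Eq. (3.1.4). The sketch below evaluates ${\boldsymbol{C}}_{0}$ to ${\boldsymbol{C}}_{3}$ for given ${\boldsymbol{A}}$ and ${\boldsymbol{B}}$ and compares the cubic series with $\ddot{\boldsymbol{r}}+{\boldsymbol{r}}/r^{3}$ evaluated on the cubic track; the difference should shrink like $t^{4}$. The helper names and the sample vectors are assumptions for illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def lin(*terms):
    """Linear combination of (coefficient, vector) pairs."""
    n = len(terms[0][1])
    return tuple(sum(c * v[k] for c, v in terms) for k in range(n))

def residual_series(r0, v0, A, B):
    """Coefficients C0..C3 of Eq. (3.1.5)."""
    s = math.sqrt(dot(r0, r0))            # r_0
    rv = dot(r0, v0)
    q = 2 * dot(B, r0) + dot(v0, v0)      # t^2 coefficient of r^2
    p = dot(A, r0) + dot(B, v0)           # half the t^3 coefficient of r^2
    k1 = -3 * rv / s**5
    k2 = -1.5 * (q / s**5 - 5 * rv**2 / s**7)
    k3 = -(3 * p / s**5 - 15 * q * rv / (2 * s**7) + 35 * rv**3 / (2 * s**9))
    C0 = lin((2, B), (1 / s**3, r0))
    C1 = lin((6, A), (1 / s**3, v0), (k1, r0))
    C2 = lin((1 / s**3, B), (k1, v0), (k2, r0))
    C3 = lin((1 / s**3, A), (k1, B), (k2, v0), (k3, r0))
    return C0, C1, C2, C3

def exact_residual(r0, v0, A, B, t):
    """r'' + r/r^3 on the cubic track, without truncation."""
    r = lin((1, r0), (t, v0), (t**2, B), (t**3, A))
    acc = lin((2, B), (6 * t, A))
    n3 = dot(r, r) ** 1.5
    return tuple(a + x / n3 for a, x in zip(acc, r))

def truncation_gap(r0, v0, A, B, t):
    """Largest component difference between the series and the exact residual."""
    C0, C1, C2, C3 = residual_series(r0, v0, A, B)
    series = lin((1, C0), (t, C1), (t**2, C2), (t**3, C3))
    exact = exact_residual(r0, v0, A, B, t)
    return max(abs(x - y) for x, y in zip(series, exact))

# Arbitrary sample values (A and B are not optimized here).
r0, v0 = (1.0, 0.2), (0.1, 0.9)
A, B = (0.05, -0.1), (-0.4, 0.1)
for t in (0.1, 0.01, 0.001):
    print(t, truncation_gap(r0, v0, A, B, t))
```

The check holds for any ${\boldsymbol{A}}$ and ${\boldsymbol{B}}$ , since Eq. (3.1.5) is a Taylor expansion in $t$ and does not rely on the coefficients being optimal.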

Minimizing this, we obtain

\displaystyle\left\{\begin{aligned} \frac{\partial\epsilon_{\rm CE2}}{\partial A_{x}}&=6\left(2B_{x}+\frac{r_{0x}}{r_{0}^{3}}\right)h^{2}+4\left(6A_{x}+\frac{v_{0x}}{r_{0}^{3}}-\frac{3{\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0}}{r_{0}^{5}}r_{0x}\right)h^{3}+\cdots=0\\ \frac{\partial\epsilon_{\rm CE2}}{\partial B_{x}}&=4\left(2B_{x}+\frac{r_{0x}}{r_{0}^{3}}\right)h+2\left(6A_{x}+\frac{v_{0x}}{r_{0}^{3}}-\frac{3{\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0}}{r_{0}^{5}}r_{0x}\right)h^{2}+\cdots=0\end{aligned}\right.,

(3.1.6)

where we have omitted some high-order terms (“ $\cdots$ ”; up to $\mathcal{O}(h^{7})$ ) for simplicity, and equations $\partial\epsilon_{\rm CE2}/\partial A_{y}=0$ and $\partial\epsilon_{\rm CE2}/\partial B_{y}=0$ as they can be easily obtained via swapping subscripts $x$ and $y$ ; because of the coupling mentioned above, there are cross terms in high-order coefficients.

Put in matrix form, the system of equations is

\displaystyle\begin{pmatrix}M_{11}&M_{12}&M_{13}&M_{14}\\ M_{21}&M_{22}&M_{23}&M_{24}\\ M_{31}&M_{32}&M_{33}&M_{34}\\ M_{41}&M_{42}&M_{43}&M_{44}\end{pmatrix}\begin{pmatrix}A_{x,{\rm CE2}}\\ A_{y,{\rm CE2}}\\ B_{x,{\rm CE2}}\\ B_{y,{\rm CE2}}\end{pmatrix}=\begin{pmatrix}b_{1}\\ b_{2}\\ b_{3}\\ b_{4}\end{pmatrix}

(3.1.7)

with

\displaystyle\left\{\begin{aligned} &M_{11}=24h^{3}-\frac{24(2r_{0x}^{2}-r_{0y}^{2})}{5r_{0}^{5}}h^{5}+\cdots\\ &M_{22}=24h^{3}+\frac{24(r_{0x}^{2}-2r_{0y}^{2})}{5r_{0}^{5}}h^{5}+\cdots&&M_{12}=M_{21}=-\frac{72r_{0x}r_{0y}}{5r_{0}^{5}}h^{5}+\cdots\\ &M_{13}=M_{31}=12h^{2}-\frac{4(2r_{0x}^{2}-r_{0y}^{2})}{r_{0}^{5}}h^{4}+\cdots&&M_{14}=M_{41}=-\frac{12r_{0x}r_{0y}}{r_{0}^{5}}h^{4}+\cdots\\ &M_{24}=M_{42}=12h^{2}+\frac{4(r_{0x}^{2}-2r_{0y}^{2})}{r_{0}^{5}}h^{4}+\cdots&&M_{23}=M_{32}=-\frac{12r_{0x}r_{0y}}{r_{0}^{5}}h^{4}+\cdots\\ &M_{33}=8h-\frac{8(2r_{0x}^{2}-r_{0y}^{2})}{3r_{0}^{5}}h^{3}+\cdots\\ &M_{44}=8h+\frac{8(r_{0x}^{2}-2r_{0y}^{2})}{3r_{0}^{5}}h^{3}+\cdots&&M_{34}=M_{43}=-\frac{8r_{0x}r_{0y}}{r_{0}^{5}}h^{3}+\cdots\end{aligned}\right.

(3.1.8)

and

\displaystyle\left\{\begin{aligned} b_{1}&=-\frac{6r_{0x}}{r_{0}^{3}}h^{2}-4\left(\frac{v_{0x}}{r_{0}^{3}}-\frac{3{\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0}}{r_{0}^{5}}r_{0x}\right)h^{3}+\cdots\\ b_{2}&=-\frac{6r_{0y}}{r_{0}^{3}}h^{2}-4\left(\frac{v_{0y}}{r_{0}^{3}}-\frac{3{\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0}}{r_{0}^{5}}r_{0y}\right)h^{3}+\cdots\\ b_{3}&=-\frac{4r_{0x}}{r_{0}^{3}}h-2\left(\frac{v_{0x}}{r_{0}^{3}}-\frac{3{\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0}}{r_{0}^{5}}r_{0x}\right)h^{2}+\cdots\\ b_{4}&=-\frac{4r_{0y}}{r_{0}^{3}}h-2\left(\frac{v_{0y}}{r_{0}^{3}}-\frac{3{\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0}}{r_{0}^{5}}r_{0y}\right)h^{2}+\cdots\end{aligned}\right.;

(3.1.9)

the solution (to prevent Wolfram Mathematica from taking forever, one is advised to keep only terms up to $\mathcal{O}(h^{7})$ , or another desired order, at each step; this advice also applies to the computation of the determinant of the Jacobian matrix in this case) is

\displaystyle\left\{\begin{aligned} A_{x,{\rm CE2}}&=\left[\begin{aligned} &\frac{(2r_{0x}^{2}-r_{0y}^{2})v_{0x}+3r_{0x}r_{0y}v_{0y}}{6r_{0}^{5}}\\ &-\frac{3r_{0x}^{3}(2v_{0x}^{2}-v_{0y}^{2})+r_{0x}[2r_{0}+3r_{0y}^{2}(4v_{0y}^{2}-3v_{0x}^{2})]+6(4r_{0x}^{2}-r_{0y}^{2})r_{0y}v_{0x}v_{0y}}{12r_{0}^{7}}h+\cdots\end{aligned}\right]\\ B_{x,{\rm CE2}}&=-\frac{r_{0x}}{2r_{0}^{3}}+\frac{3r_{0x}^{3}(2v_{0x}^{2}-v_{0y}^{2})+r_{0x}[2r_{0}+3r_{0y}^{2}(4v_{0y}^{2}-3v_{0x}^{2})]+6(4r_{0x}^{2}-r_{0y}^{2})r_{0y}v_{0x}v_{0y}}{24r_{0}^{7}}h^{2}+\cdots\end{aligned}\right.,

(3.1.10)

where again we have omitted some high-order terms (up to $\mathcal{O}(h^{7})$ ) and expressions for $y$ components.

Plugging Eq. (3.1.10) back into Eq. (3.1.1), our solution at $t=h$ is

\displaystyle\left\{\begin{aligned} r_{hx}&=r_{0x}+v_{0x}h-\frac{r_{0x}}{2r_{0}^{3}}h^{2}-\left(\frac{v_{0x}}{6r_{0}^{3}}-\frac{{\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0}}{2r_{0}^{5}}r_{0x}\right)h^{3}+\cdots\\ v_{hx}&=\left[\begin{aligned} &v_{0x}-\frac{r_{0x}}{r_{0}^{3}}h+\left(\frac{v_{0x}}{2r_{0}^{3}}-\frac{3{\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0}}{2r_{0}^{5}}r_{0x}\right)h^{2}\\ &-\frac{3r_{0x}^{3}(2v_{0x}^{2}-v_{0y}^{2})+r_{0x}[2r_{0}+3r_{0y}^{2}(4v_{0y}^{2}-3v_{0x}^{2})]+6(4r_{0x}^{2}-r_{0y}^{2})r_{0y}v_{0x}v_{0y}}{6r_{0}^{7}}h^{3}+\cdots\end{aligned}\right]\end{aligned}\right.;

(3.1.11)

thus the Jacobian matrix is

\displaystyle J=\begin{pmatrix}\partial r_{hx}/\partial r_{0x}&\partial r_{hx}/\partial r_{0y}&\partial r_{hx}/\partial v_{0x}&\partial r_{hx}/\partial v_{0y}\\ \partial r_{hy}/\partial r_{0x}&\partial r_{hy}/\partial r_{0y}&\partial r_{hy}/\partial v_{0x}&\partial r_{hy}/\partial v_{0y}\\ \partial v_{hx}/\partial r_{0x}&\partial v_{hx}/\partial r_{0y}&\partial v_{hx}/\partial v_{0x}&\partial v_{hx}/\partial v_{0y}\\ \partial v_{hy}/\partial r_{0x}&\partial v_{hy}/\partial r_{0y}&\partial v_{hy}/\partial v_{0x}&\partial v_{hy}/\partial v_{0y}\end{pmatrix}\equiv\begin{pmatrix}J_{11}&J_{12}&J_{13}&J_{14}\\ J_{21}&J_{22}&J_{23}&J_{24}\\ J_{31}&J_{32}&J_{33}&J_{34}\\ J_{41}&J_{42}&J_{43}&J_{44}\end{pmatrix}

(3.1.12)

with

		$\displaystyle\left\{\begin{aligned} &J_{11}=1+\frac{2r_{0x}^{2}-r_{0y}^{2}}{2r_{0}^{5}}h^{2}+\frac{-2r_{0x}^{3}v_{0x}+3r_{0x}r_{0y}^{2}v_{0x}-4r_{0x}^{2}r_{0y}v_{0y}+r_{0y}^{3}v_{0y}}{2r_{0}^{7}}h^{3}+\cdots\\ &J_{22}=1+\frac{-r_{0x}^{2}+2r_{0y}^{2}}{2r_{0}^{5}}h^{2}+\frac{r_{0x}^{3}v_{0x}-4r_{0x}r_{0y}^{2}v_{0x}+3r_{0x}^{2}r_{0y}v_{0y}-2r_{0y}^{3}v_{0y}}{2r_{0}^{7}}h^{3}+\cdots\\ &J_{12}=\frac{3r_{0x}r_{0y}}{2r_{0}^{5}}h^{2}+\frac{-4r_{0x}^{2}r_{0y}v_{0x}+r_{0y}^{3}v_{0x}+r_{0x}^{3}v_{0y}-4r_{0x}r_{0y}^{2}v_{0y}}{2r_{0}^{7}}h^{3}+\cdots\\ &J_{21}=\frac{3r_{0x}r_{0y}}{2r_{0}^{5}}h^{2}+\frac{-4r_{0x}^{2}r_{0y}v_{0x}+r_{0y}^{3}v_{0x}+r_{0x}^{3}v_{0y}-4r_{0x}r_{0y}^{2}v_{0y}}{2r_{0}^{7}}h^{3}+\cdots\end{aligned}\right.$		(3.1.13)
		$\displaystyle\left\{\begin{aligned} &J_{13}=h+\frac{2r_{0x}^{2}-r_{0y}^{2}}{6r_{0}^{5}}h^{3}+\frac{-2r_{0x}^{3}v_{0x}+3r_{0x}r_{0y}^{2}v_{0x}-4r_{0x}^{2}r_{0y}v_{0y}+r_{0y}^{3}v_{0y}}{4r_{0}^{7}}h^{4}+\cdots\\ &J_{24}=h+\frac{-r_{0x}^{2}+2r_{0y}^{2}}{6r_{0}^{5}}h^{3}+\frac{r_{0x}^{3}v_{0x}-4r_{0x}r_{0y}^{2}v_{0x}+3r_{0x}^{2}r_{0y}v_{0y}-2r_{0y}^{3}v_{0y}}{4r_{0}^{7}}h^{4}+\cdots\\ &J_{31}=\frac{2r_{0x}^{2}-r_{0y}^{2}}{r_{0}^{5}}h-\frac{3(2r_{0x}^{3}v_{0x}-3r_{0x}r_{0y}^{2}v_{0x}+4r_{0x}^{2}r_{0y}v_{0y}-r_{0y}^{3}v_{0y})}{2r_{0}^{7}}h^{2}+\cdots\\ &J_{42}=\frac{-r_{0x}^{2}+2r_{0y}^{2}}{r_{0}^{5}}h+\frac{3(r_{0x}^{3}v_{0x}-4r_{0x}r_{0y}^{2}v_{0x}+3r_{0x}^{2}r_{0y}v_{0y}-2r_{0y}^{3}v_{0y})}{2r_{0}^{7}}h^{2}+\cdots\end{aligned}\right.$		(3.1.14)
		$\displaystyle\left\{\begin{aligned} &J_{14}=\frac{r_{0x}r_{0y}}{2r_{0}^{5}}h^{3}+\frac{-4r_{0x}^{2}r_{0y}v_{0x}+r_{0y}^{3}v_{0x}+r_{0x}^{3}v_{0y}-4r_{0x}r_{0y}^{2}v_{0y}}{4r_{0}^{7}}h^{4}+\cdots\\ &J_{23}=\frac{r_{0x}r_{0y}}{2r_{0}^{5}}h^{3}+\frac{-4r_{0x}^{2}r_{0y}v_{0x}+r_{0y}^{3}v_{0x}+r_{0x}^{3}v_{0y}-4r_{0x}r_{0y}^{2}v_{0y}}{4r_{0}^{7}}h^{4}+\cdots\\ &J_{41}=\frac{3r_{0x}r_{0y}}{r_{0}^{5}}h+\frac{3(-4r_{0x}^{2}r_{0y}v_{0x}+r_{0y}^{3}v_{0x}+r_{0x}^{3}v_{0y}-4r_{0x}r_{0y}^{2}v_{0y})}{2r_{0}^{7}}h^{2}+\cdots\\ &J_{32}=\frac{3r_{0x}r_{0y}}{r_{0}^{5}}h+\frac{3(-4r_{0x}^{2}r_{0y}v_{0x}+r_{0y}^{3}v_{0x}+r_{0x}^{3}v_{0y}-4r_{0x}r_{0y}^{2}v_{0y})}{2r_{0}^{7}}h^{2}+\cdots\end{aligned}\right.$		(3.1.15)
		$\displaystyle\left\{\begin{aligned} &J_{33}=1+\frac{2r_{0x}^{2}-r_{0y}^{2}}{2r_{0}^{5}}h^{2}+\frac{-2r_{0x}^{3}v_{0x}+3r_{0x}r_{0y}^{2}v_{0x}-4r_{0x}^{2}r_{0y}v_{0y}+r_{0y}^{3}v_{0y}}{r_{0}^{7}}h^{3}+\cdots\\ &J_{44}=1+\frac{-r_{0x}^{2}+2r_{0y}^{2}}{2r_{0}^{5}}h^{2}+\frac{r_{0x}^{3}v_{0x}-4r_{0x}r_{0y}^{2}v_{0x}+3r_{0x}^{2}r_{0y}v_{0y}-2r_{0y}^{3}v_{0y}}{r_{0}^{7}}h^{3}+\cdots\\ &J_{34}=\frac{3r_{0x}r_{0y}}{2r_{0}^{5}}h^{2}+\frac{-4r_{0x}^{2}r_{0y}v_{0x}+r_{0y}^{3}v_{0x}+r_{0x}^{3}v_{0y}-4r_{0x}r_{0y}^{2}v_{0y}}{r_{0}^{7}}h^{3}+\cdots\\ &J_{43}=\frac{3r_{0x}r_{0y}}{2r_{0}^{5}}h^{2}+\frac{-4r_{0x}^{2}r_{0y}v_{0x}+r_{0y}^{3}v_{0x}+r_{0x}^{3}v_{0y}-4r_{0x}r_{0y}^{2}v_{0y}}{r_{0}^{7}}h^{3}+\cdots\end{aligned}\right.$		(3.1.16)

Note that “symmetries” in the Jacobian are broken at high orders. Its determinant is

\displaystyle\det(J)=1+\frac{{\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0}[119r_{0}+30r_{0x}^{2}(4v_{0x}^{2}-3v_{0y}^{2})+420r_{0x}r_{0y}v_{0x}v_{0y}-30r_{0y}^{2}(3v_{0x}^{2}-4v_{0y}^{2})]}{60r_{0}^{9}}h^{5}+\cdots,

(3.1.17)

i.e., the non-symplecticity is at the $\mathcal{O}(h^{5})$ level, three orders lower in $h$ , and hence larger, than when first-order ContEvol is applied to the classic harmonic oscillator (see Section 2.1, specifically Eq. (2.1.11)).

According to Eq. (3.1.10), the minimized cost function Eq. (3.1) is

\displaystyle\epsilon_{{\rm CE2},\min}(h)=\left[\begin{aligned} &\frac{1}{180r_{0}^{10}}+\frac{6r_{0x}r_{0y}v_{0x}v_{0y}+(2r_{0x}^{2}-r_{0y}^{2})v_{0x}^{2}-(r_{0x}^{2}-2r_{0y}^{2})v_{0y}^{2}}{60r_{0}^{11}}\\ &+\frac{\left\{\begin{aligned} &(4r_{0x}^{4}+r_{0y}^{4})v_{0x}^{4}+4(4r_{0x}^{3}r_{0y}-r_{0x}r_{0y}^{3})v_{0x}^{3}v_{0y}+30r_{0x}^{2}r_{0y}^{2}v_{0x}^{2}v_{0y}^{2}\\ &-4(r_{0x}^{3}r_{0y}-4r_{0x}r_{0y}^{3})v_{0x}v_{0y}^{3}+(r_{0x}^{4}+4r_{0y}^{4})v_{0y}^{4}\end{aligned}\right\}}{80r_{0}^{12}}\end{aligned}\right]h^{5}+\cdots;

(3.1.18)

the order in $h$ is the same as in the prototype case Eq. (2.1.13); however, when $r_{0}$ is small, i.e., when $r_{0}\lesssim\sqrt{h}$ , the above expression can still be large.

Test case 1: Uniform circular motion.

Consider the initial conditions ${\boldsymbol{r}}_{0}=(1,0)^{\rm T}$ and ${\boldsymbol{v}}_{0}=(0,1)^{\rm T}$ . The particle will perform a uniform circular motion along the unit circle.

The exact solution is (subscript “UCM” stands for uniform circular motion)

\displaystyle\left\{\begin{aligned} {\boldsymbol{r}}_{\rm UCM}(t)&=\begin{pmatrix}\cos t\\ \sin t\end{pmatrix}=\begin{pmatrix}1-\dfrac{t^{2}}{2}+\dfrac{t^{4}}{24}-\dfrac{t^{6}}{720}+\mathcal{O}(t^{8})\\ t-\dfrac{t^{3}}{6}+\dfrac{t^{5}}{120}-\dfrac{t^{7}}{5040}+\mathcal{O}(t^{9})\end{pmatrix}\\ {\boldsymbol{v}}_{\rm UCM}(t)&=\begin{pmatrix}-\sin t\\ \cos t\end{pmatrix}=\begin{pmatrix}-\left[t-\dfrac{t^{3}}{6}+\dfrac{t^{5}}{120}-\dfrac{t^{7}}{5040}+\mathcal{O}(t^{9})\right]\\ 1-\dfrac{t^{2}}{2}+\dfrac{t^{4}}{24}-\dfrac{t^{6}}{720}+\mathcal{O}(t^{8})\end{pmatrix}\end{aligned}\right.,

(3.1.19)

while first-order ContEvol with “adequate” expansion yields

\displaystyle\left\{\begin{aligned} {\boldsymbol{r}}_{h}&=\begin{pmatrix}1-\dfrac{h^{2}}{2}+\dfrac{h^{4}}{24}-{\color[rgb]{1,0,0}0}\cdot\dfrac{h^{6}}{720}+\mathcal{O}(h^{8})\\ h-\dfrac{h^{3}}{6}+\dfrac{h^{5}}{120}-{\color[rgb]{1,0,0}\dfrac{303}{5}}\cdot\dfrac{h^{7}}{5040}+\mathcal{O}(h^{9})\end{pmatrix}\\ {\boldsymbol{v}}_{h}&=\begin{pmatrix}-\left[h-\dfrac{h^{3}}{6}+{\color[rgb]{1,0,0}\dfrac{4}{3}}\cdot\dfrac{h^{5}}{120}-{\color[rgb]{1,0,0}\dfrac{1486}{15}}\cdot\dfrac{h^{7}}{5040}+\mathcal{O}(h^{9})\right]\\ 1-\dfrac{h^{2}}{2}+\dfrac{h^{4}}{24}-{\color[rgb]{1,0,0}27}\cdot\dfrac{h^{6}}{720}+\mathcal{O}(h^{8})\end{pmatrix}\end{aligned}\right.,

(3.1.20)

i.e., like in Section 2.1, errors in ${\boldsymbol{r}}_{h}$ and ${\boldsymbol{v}}_{h}$ (highlighted in red) are $\mathcal{O}(h^{6})$ and $\mathcal{O}(h^{5})$ , respectively.

Test case 2: Parabolic motion.

Consider the initial conditions ${\boldsymbol{r}}_{0}=(2,0)^{\rm T}$ and ${\boldsymbol{v}}_{0}=(-1/\sqrt{2},1/\sqrt{2})^{\rm T}$ . The particle will move along the parabola $r_{y}=1-r_{x}^{2}/4$ .

According to conservation of angular momentum and mechanical energy (see Section 3.3 for further treatment), the exact solution is (subscript “PBM” stands for parabolic motion)

\displaystyle\left\{\begin{aligned} r_{{\rm PBM},x}(t)&=\frac{2\cdot 2^{2/3}}{\sqrt[3]{\sqrt{80-48\sqrt{2}t+18t^{2}}+3\sqrt{2}t-8}}-\sqrt[3]{2}\sqrt[3]{\sqrt{80-48\sqrt{2}t+18t^{2}}+3\sqrt{2}t-8}\\ &=2-\frac{t}{\sqrt{2}}-\frac{t^{2}}{8}-\frac{t^{3}}{24\sqrt{2}}-\frac{5t^{4}}{768}-\frac{t^{5}}{768\sqrt{2}}+\frac{7t^{6}}{36864}+\frac{13t^{7}}{36864\sqrt{2}}+\mathcal{O}(t^{8})\\ r_{{\rm PBM},y}(t)&=1-\frac{r_{{\rm PBM},x}^{2}(t)}{4}=\frac{t}{\sqrt{2}}-\frac{t^{3}}{48\sqrt{2}}-\frac{t^{4}}{128}-\frac{7t^{5}}{1536\sqrt{2}}-\frac{7t^{6}}{6144}-\frac{35t^{7}}{73728\sqrt{2}}+\mathcal{O}(t^{8})\end{aligned}\right.,

(3.1.21)

where $t$ is within the radius of convergence for the expansion, and

\displaystyle\left\{\begin{aligned} v_{{\rm PBM},x}(t)&=-\frac{\sqrt{2/r_{\rm PBM}(t)}}{\sqrt{1+[-r_{{\rm PBM},x}(t)/2]^{2}}}=-\frac{1}{\sqrt{2}}-\frac{t}{4}-\frac{t^{2}}{8\sqrt{2}}-\frac{5t^{3}}{192}-\frac{5t^{4}}{768\sqrt{2}}+\frac{7t^{5}}{6144}+\frac{91t^{6}}{36864\sqrt{2}}+\frac{341t^{7}}{294912}+\mathcal{O}(t^{8})\\ v_{{\rm PBM},y}(t)&=[-r_{{\rm PBM},x}(t)/2]\cdot v_{{\rm PBM},x}(t)=\frac{1}{\sqrt{2}}-\frac{t^{2}}{16\sqrt{2}}-\frac{t^{3}}{32}-\frac{35t^{4}}{1536\sqrt{2}}-\frac{7t^{5}}{1024}-\frac{245t^{6}}{73728\sqrt{2}}-\frac{9t^{7}}{16384}+\mathcal{O}(t^{8})\end{aligned}\right..

(3.1.22)

First-order ContEvol with “adequate” expansion yields

\displaystyle\left\{\begin{aligned} {\boldsymbol{r}}_{h}&=\begin{pmatrix}2-\dfrac{h}{\sqrt{2}}-\dfrac{h^{2}}{8}-\dfrac{h^{3}}{24\sqrt{2}}-\dfrac{5h^{4}}{768}-\dfrac{h^{5}}{768\sqrt{2}}+{\color[rgb]{1,0,0}0}\cdot\dfrac{7h^{6}}{36864}+{\color[rgb]{1,0,0}\dfrac{528}{455}}\cdot\dfrac{13h^{7}}{36864\sqrt{2}}+\mathcal{O}(h^{8})\\ \dfrac{h}{\sqrt{2}}-\dfrac{h^{3}}{48\sqrt{2}}-\dfrac{h^{4}}{128}-\dfrac{7h^{5}}{1536\sqrt{2}}-{\color[rgb]{1,0,0}0}\cdot\dfrac{7h^{6}}{6144}-{\color[rgb]{1,0,0}\dfrac{3}{25}}\cdot\dfrac{35h^{7}}{73728\sqrt{2}}+\mathcal{O}(h^{8})\end{pmatrix}\\ {\boldsymbol{v}}_{h}&=\begin{pmatrix}-\dfrac{1}{\sqrt{2}}-\dfrac{h}{4}-\dfrac{h^{2}}{8\sqrt{2}}-\dfrac{5h^{3}}{192}-\dfrac{5h^{4}}{768\sqrt{2}}+{\color[rgb]{1,0,0}\left(-\dfrac{512}{2373}\right)}\cdot\dfrac{7h^{5}}{6144}+{\color[rgb]{1,0,0}\dfrac{216}{455}}\cdot\dfrac{91h^{6}}{36864\sqrt{2}}+{\color[rgb]{1,0,0}\dfrac{13348}{35805}}\cdot\dfrac{341h^{7}}{294912}+\mathcal{O}(h^{8})\\ \dfrac{1}{\sqrt{2}}-\dfrac{h^{2}}{16\sqrt{2}}-\dfrac{h^{3}}{32}-\dfrac{35h^{4}}{1536\sqrt{2}}-{\color[rgb]{1,0,0}\left(-\dfrac{2}{105}\right)}\cdot\dfrac{7h^{5}}{1024}-{\color[rgb]{1,0,0}\dfrac{27}{1225}}\cdot\dfrac{245h^{6}}{73728\sqrt{2}}-{\color[rgb]{1,0,0}\dfrac{976}{2835}}\cdot\dfrac{9h^{7}}{16384}+\mathcal{O}(h^{8})\end{pmatrix}\end{aligned}\right..

(3.1.23)

Again, errors in ${\boldsymbol{r}}_{h}$ and ${\boldsymbol{v}}_{h}$ (highlighted in red) are $\mathcal{O}(h^{6})$ and $\mathcal{O}(h^{5})$ , respectively.
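As a quick numerical sanity check of the exact parabolic solution, the closed form and the truncated power series of Eq. (3.1.21) can be compared at a small $t$ . The following is a minimal sketch; the function names `r_pbm_exact` and `r_pbm_series` are hypothetical and introduced only for illustration.

```python
import math

SQRT2 = math.sqrt(2.0)

def r_pbm_exact(t):
    """Closed-form position of Eq. (3.1.21) at time t."""
    s = math.sqrt(80.0 - 48.0 * SQRT2 * t + 18.0 * t * t) + 3.0 * SQRT2 * t - 8.0
    c = s ** (1.0 / 3.0)  # real cube root; s is positive
    rx = 2.0 * 2.0 ** (2.0 / 3.0) / c - 2.0 ** (1.0 / 3.0) * c
    ry = 1.0 - rx * rx / 4.0  # the particle stays on the parabola
    return rx, ry

def r_pbm_series(t):
    """Power series of Eq. (3.1.21), truncated after the t^7 terms."""
    rx = (2.0 - t / SQRT2 - t**2 / 8.0 - t**3 / (24.0 * SQRT2)
          - 5.0 * t**4 / 768.0 - t**5 / (768.0 * SQRT2)
          + 7.0 * t**6 / 36864.0 + 13.0 * t**7 / (36864.0 * SQRT2))
    ry = (t / SQRT2 - t**3 / (48.0 * SQRT2) - t**4 / 128.0
          - 7.0 * t**5 / (1536.0 * SQRT2) - 7.0 * t**6 / 6144.0
          - 35.0 * t**7 / (73728.0 * SQRT2))
    return rx, ry

if __name__ == "__main__":
    t = 0.1
    print(r_pbm_exact(t))
    print(r_pbm_series(t))
```

For small $t$ the two evaluations should agree up to the truncation error of the series, which is $\mathcal{O}(t^{8})$ .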

3.2 Two-body, equivalence with linearization and Taylor expansion

In this section, we show that first-order ContEvol with “adequate” expansion is equivalent to both linearization and fifth-order Taylor expansion of the equation of motion.

Equivalence with linearization.

An alternative way to handle the right hand side of the EOM Eq. (3.0.2) is to define

\displaystyle{\boldsymbol{f}}(t)={\boldsymbol{f}}({\boldsymbol{r}}(t))=\frac{\boldsymbol{r}}{r^{3}}

(3.2.1)

and use its derivatives at $t=0$ and $t=h$ to approximate it as (again, subscript “CE2” stands for ContEvol and two-body problem)

\displaystyle{\boldsymbol{f}}_{\rm CE2}(t)={\boldsymbol{f}}_{0}+\dot{\boldsymbol{f}}_{0}t+{\boldsymbol{B}}_{\boldsymbol{f}}t^{2}+{\boldsymbol{A}}_{\boldsymbol{f}}t^{3},\quad t\in[0,h],

(3.2.2)

with coefficients ${\boldsymbol{A}}_{\boldsymbol{f}}$ and ${\boldsymbol{B}}_{\boldsymbol{f}}$ yielded by “terminal” conditions at $t=h$

\displaystyle\left\{\begin{aligned} {\boldsymbol{A}}_{\boldsymbol{f}}&=2({\boldsymbol{f}}_{0}-{\boldsymbol{f}}_{h})h^{-3}+(\dot{\boldsymbol{f}}_{0}+\dot{\boldsymbol{f}}_{h})h^{-2}\\ {\boldsymbol{B}}_{\boldsymbol{f}}&=3({\boldsymbol{f}}_{h}-{\boldsymbol{f}}_{0})h^{-2}-(2\dot{\boldsymbol{f}}_{0}+\dot{\boldsymbol{f}}_{h})h^{-1}\end{aligned}\right..

(3.2.3)

Evidently, we have ${\boldsymbol{f}}_{0}={\boldsymbol{C}}_{0}-2{\boldsymbol{B}}$ (see Eq. (3.1.5) for ${\boldsymbol{C}}_{i}$ , $i=0,1,2,3$ ).

Since ${\boldsymbol{f}}(t)$ only depends on time through ${\boldsymbol{r}}(t)$ , its derivative is

	$\displaystyle\dot{\boldsymbol{f}}(t)$	$\displaystyle=\dot{f}_{i}{\boldsymbol{e}}_{i}=v_{j}\frac{\partial f_{i}}{\partial r_{j}}{\boldsymbol{e}}_{i}=v_{j}\frac{\partial}{\partial r_{j}}\left[\frac{r_{i}}{(r_{k}r_{k})^{3/2}}\right]{\boldsymbol{e}}_{i}=v_{j}\frac{\delta_{ij}(r_{k}r_{k})^{3/2}-r_{i}(3/2)(r_{k}r_{k})^{1/2}(2r_{j})}{(r_{k}r_{k})^{3}}{\boldsymbol{e}}_{i}$
		$\displaystyle=\frac{v_{i}{\boldsymbol{e}}_{i}}{(r_{k}r_{k})^{3/2}}-\frac{3r_{j}v_{j}}{(r_{k}r_{k})^{5/2}}r_{i}{\boldsymbol{e}}_{i}=\frac{{\boldsymbol{v}}}{r^{3}}-\frac{3{\boldsymbol{r}}\cdot{\boldsymbol{v}}}{r^{5}}{\boldsymbol{r}},$		(3.2.4)

where we have used Einstein notation. Similarly, we have $\dot{\boldsymbol{f}}_{0}={\boldsymbol{C}}_{1}-6{\boldsymbol{A}}$ .

The coefficients ${\boldsymbol{A}}_{\boldsymbol{f}}$ and ${\boldsymbol{B}}_{\boldsymbol{f}}$ can be fully specified by either ${\boldsymbol{A}}$ and ${\boldsymbol{B}}$ (see Eq. (3.1.2)) or ${\boldsymbol{r}}_{h}$ and ${\boldsymbol{v}}_{h}$ . Proceeding with ${\boldsymbol{A}}$ and ${\boldsymbol{B}}$ , the function ${\boldsymbol{f}}(t)$ at $t=h$ is

$\displaystyle{\boldsymbol{f}}_{h}$	$\displaystyle={\boldsymbol{f}}({\boldsymbol{r}}_{h})=\frac{{\boldsymbol{r}}_{h}}{r_{h}^{3}}=\frac{{\boldsymbol{r}}_{0}+{\boldsymbol{v}}_{0}h+{\boldsymbol{B}}h^{2}+{\boldsymbol{A}}h^{3}}{\|{\boldsymbol{r}}_{0}+{\boldsymbol{v}}_{0}h+{\boldsymbol{B}}h^{2}+{\boldsymbol{A}}h^{3}\|^{3}}$
	$\displaystyle=\left[\begin{aligned} &\frac{{\boldsymbol{r}}_{0}}{r_{0}^{3}}+\left(\frac{{\boldsymbol{v}}_{0}}{r_{0}^{3}}-\frac{3{\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0}}{r_{0}^{5}}{\boldsymbol{r}}_{0}\right)h+\left[\frac{\boldsymbol{B}}{r_{0}^{3}}-\frac{3{\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0}}{r_{0}^{5}}{\boldsymbol{v}}_{0}-\frac{3}{2}\left(\frac{2{\boldsymbol{B}}\cdot{\boldsymbol{r}}_{0}+v_{0}^{2}}{r_{0}^{5}}-\frac{5({\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0})^{2}}{r_{0}^{7}}\right){\boldsymbol{r}}_{0}\right]h^{2}\\ &+\left\{\begin{aligned} &\frac{\boldsymbol{A}}{r_{0}^{3}}-\frac{3{\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0}}{r_{0}^{5}}\boldsymbol{B}-\frac{3}{2}\left(\frac{2{\boldsymbol{B}}\cdot{\boldsymbol{r}}_{0}+v_{0}^{2}}{r_{0}^{5}}-\frac{5({\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0})^{2}}{r_{0}^{7}}\right){\boldsymbol{v}}_{0}\\ &-\left(\frac{3({\boldsymbol{A}}\cdot{\boldsymbol{r}}_{0}+{\boldsymbol{B}}\cdot{\boldsymbol{v}}_{0})}{r_{0}^{5}}-\frac{15(2{\boldsymbol{B}}\cdot{\boldsymbol{r}}_{0}+v_{0}^{2})({\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0})}{2r_{0}^{7}}+\frac{35({\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0})^{3}}{2r_{0}^{9}}\right){\boldsymbol{r}}_{0}\end{aligned}\right\}h^{3}+\mathcal{O}(h^{4})\end{aligned}\right]$
	$\displaystyle=({\boldsymbol{C}}_{0}-2{\boldsymbol{B}})+({\boldsymbol{C}}_{1}-6{\boldsymbol{A}})h+{\boldsymbol{C}}_{2}h^{2}+{\boldsymbol{C}}_{3}h^{3}+\mathcal{O}(h^{4}),$	(3.2.5)

and its derivative $\dot{\boldsymbol{f}}(t)$ at $t=h$ is

$\displaystyle\dot{\boldsymbol{f}}_{h}$	$\displaystyle=\dot{\boldsymbol{f}}({\boldsymbol{r}}_{h},{\boldsymbol{v}}_{h})=\frac{{\boldsymbol{v}}_{h}}{r_{h}^{3}}-\frac{3{\boldsymbol{r}}_{h}\cdot{\boldsymbol{v}}_{h}}{r_{h}^{5}}{\boldsymbol{r}}_{h}$
	$\displaystyle=\frac{{\boldsymbol{v}}_{0}+2{\boldsymbol{B}}h+3{\boldsymbol{A}}h^{2}}{\|{\boldsymbol{r}}_{0}+{\boldsymbol{v}}_{0}h+{\boldsymbol{B}}h^{2}+{\boldsymbol{A}}h^{3}\|^{3}}-\frac{3({\boldsymbol{r}}_{0}+{\boldsymbol{v}}_{0}h+{\boldsymbol{B}}h^{2}+{\boldsymbol{A}}h^{3})\cdot({\boldsymbol{v}}_{0}+2{\boldsymbol{B}}h+3{\boldsymbol{A}}h^{2})}{\|{\boldsymbol{r}}_{0}+{\boldsymbol{v}}_{0}h+{\boldsymbol{B}}h^{2}+{\boldsymbol{A}}h^{3}\|^{5}}({\boldsymbol{r}}_{0}+{\boldsymbol{v}}_{0}h+{\boldsymbol{B}}h^{2}+{\boldsymbol{A}}h^{3})$
	$\displaystyle=\left[\begin{aligned} &\left(\frac{{\boldsymbol{v}}_{0}}{r_{0}^{3}}-\frac{3{\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0}}{r_{0}^{5}}{\boldsymbol{r}}_{0}\right)+2\left[\frac{\boldsymbol{B}}{r_{0}^{3}}-\frac{3{\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0}}{r_{0}^{5}}{\boldsymbol{v}}_{0}-\frac{3}{2}\left(\frac{2{\boldsymbol{B}}\cdot{\boldsymbol{r}}_{0}+v_{0}^{2}}{r_{0}^{5}}-\frac{5({\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0})^{2}}{r_{0}^{7}}\right){\boldsymbol{r}}_{0}\right]h\\ &+3\left\{\begin{aligned} &\frac{\boldsymbol{A}}{r_{0}^{3}}-\frac{3{\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0}}{r_{0}^{5}}\boldsymbol{B}-\frac{3}{2}\left(\frac{2{\boldsymbol{B}}\cdot{\boldsymbol{r}}_{0}+v_{0}^{2}}{r_{0}^{5}}-\frac{5({\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0})^{2}}{r_{0}^{7}}\right){\boldsymbol{v}}_{0}\\ &-\left(\frac{3({\boldsymbol{A}}\cdot{\boldsymbol{r}}_{0}+{\boldsymbol{B}}\cdot{\boldsymbol{v}}_{0})}{r_{0}^{5}}-\frac{15(2{\boldsymbol{B}}\cdot{\boldsymbol{r}}_{0}+v_{0}^{2})({\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0})}{2r_{0}^{7}}+\frac{35({\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0})^{3}}{2r_{0}^{9}}\right){\boldsymbol{r}}_{0}\end{aligned}\right\}h^{2}+\mathcal{O}(h^{3})\end{aligned}\right]$
	$\displaystyle=({\boldsymbol{C}}_{1}-6{\boldsymbol{A}})+2{\boldsymbol{C}}_{2}h+3{\boldsymbol{C}}_{3}h^{2}+\mathcal{O}(h^{3}),$	(3.2.6)

where we have “adequately” expanded ${\boldsymbol{f}}_{h}$ and $\dot{\boldsymbol{f}}_{h}$ to keep all the terms without non-linear coefficients in ${\boldsymbol{A}}$ and ${\boldsymbol{B}}$ . Plugging these into Eq. (3.2.3), we obtain the simple relations ${\boldsymbol{A}}_{\boldsymbol{f}}\approx{\boldsymbol{C}}_{3}$ and ${\boldsymbol{B}}_{\boldsymbol{f}}\approx{\boldsymbol{C}}_{2}$ .

With the function ${\boldsymbol{f}}(t)$ , the cost function is defined as (here the prime “^′” denotes linearization)

$\displaystyle\epsilon_{\rm CE2}^{\prime}({\boldsymbol{A}},{\boldsymbol{B}};h)$	$\displaystyle=\int_{0}^{h}[\ddot{\boldsymbol{r}}+{\boldsymbol{f}}({\boldsymbol{r}})]^{2}\,{\rm d}t=\int_{0}^{h}[(2{\boldsymbol{B}}+6{\boldsymbol{A}}t)+({\boldsymbol{f}}_{0}+\dot{\boldsymbol{f}}_{0}t+{\boldsymbol{B}}_{\boldsymbol{f}}t^{2}+{\boldsymbol{A}}_{\boldsymbol{f}}t^{3})]^{2}\,{\rm d}t$
	$\displaystyle=\int_{0}^{h}[(2{\boldsymbol{B}}+{\boldsymbol{f}}_{0})+(6{\boldsymbol{A}}+\dot{\boldsymbol{f}}_{0})t+{\boldsymbol{B}}_{\boldsymbol{f}}t^{2}+{\boldsymbol{A}}_{\boldsymbol{f}}t^{3}]^{2}\,{\rm d}t$
	$\displaystyle\approx\int_{0}^{h}\left[{\boldsymbol{C}}_{0}+{\boldsymbol{C}}_{1}t+{\boldsymbol{C}}_{2}t^{2}+{\boldsymbol{C}}_{3}t^{3}\right]^{2}\,{\rm d}t=\epsilon_{\rm CE2}({\boldsymbol{A}},{\boldsymbol{B}};h).$	(3.2.7)

Therefore, linearization is equivalent to “adequate” expansion (see Section 3.1); nevertheless, this approach should be more suitable when the function ${\boldsymbol{f}}(t)$ does not have a simple expression, e.g., when it has to be numerically computed by interpolating in lookup tables.
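To make the linearization concrete, the cubic approximation Eq. (3.2.2) with the coefficients of Eq. (3.2.3) can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, units with $\ddot{\boldsymbol{r}}=-{\boldsymbol{r}}/r^{3}$ are assumed, and the end-point values are taken from the exact circular orbit of Test case 1 rather than from a ContEvol step.

```python
import numpy as np

def kepler_f(r):
    """f(r) = r / |r|^3, Eq. (3.2.1)."""
    r = np.asarray(r, dtype=float)
    return r / np.linalg.norm(r) ** 3

def kepler_fdot(r, v):
    """Time derivative of f along the track, Eq. (3.2.4)."""
    r = np.asarray(r, dtype=float)
    v = np.asarray(v, dtype=float)
    d = np.linalg.norm(r)
    return v / d**3 - 3.0 * np.dot(r, v) * r / d**5

def hermite_coeffs(f0, fdot0, fh, fdoth, h):
    """Coefficients (A_f, B_f) of Eq. (3.2.3)."""
    A_f = 2.0 * (f0 - fh) / h**3 + (fdot0 + fdoth) / h**2
    B_f = 3.0 * (fh - f0) / h**2 - (2.0 * fdot0 + fdoth) / h
    return A_f, B_f

def f_ce2(t, f0, fdot0, A_f, B_f):
    """Cubic approximation of f on [0, h], Eq. (3.2.2)."""
    return f0 + fdot0 * t + B_f * t**2 + A_f * t**3

if __name__ == "__main__":
    h = 0.1
    r0, v0 = [1.0, 0.0], [0.0, 1.0]
    rh, vh = [np.cos(h), np.sin(h)], [-np.sin(h), np.cos(h)]
    f0, fd0 = kepler_f(r0), kepler_fdot(r0, v0)
    A_f, B_f = hermite_coeffs(f0, fd0, kepler_f(rh), kepler_fdot(rh, vh), h)
    print(f_ce2(h / 2, f0, fd0, A_f, B_f), kepler_f([np.cos(h / 2), np.sin(h / 2)]))
```

By construction the cubic reproduces ${\boldsymbol{f}}$ and $\dot{\boldsymbol{f}}$ at both ends of the step; only these four quantities are needed, which is why the approach carries over to a tabulated ${\boldsymbol{f}}$ .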

Equivalence with Taylor expansion.

By successively differentiating the equation of motion Eq. (3.0.2), one can obtain the third derivative (jerk)

\displaystyle{\boldsymbol{r}}^{(3)}=\frac{{\rm d}}{{\rm d}t}\left(-\frac{r_{j}{\boldsymbol{e}}_{j}}{(r_{k}r_{k})^{3/2}}\right)=-\frac{\dot{r}_{j}{\boldsymbol{e}}_{j}}{r^{3}}+\frac{3}{2}\frac{2r_{k^{\prime}}\dot{r}_{k^{\prime}}}{(r_{k}r_{k})^{5/2}}{\boldsymbol{r}}=-\frac{\dot{\boldsymbol{r}}}{r^{3}}+\frac{3{\boldsymbol{r}}\cdot\dot{\boldsymbol{r}}}{r^{5}}{\boldsymbol{r}},

(3.2.8)

the fourth derivative (snap)

$\displaystyle{\boldsymbol{r}}^{(4)}$	$\displaystyle=\frac{{\rm d}}{{\rm d}t}\left(-\frac{\dot{r}_{j}{\boldsymbol{e}}_{j}}{(r_{k}r_{k})^{3/2}}+\frac{3r_{l}\dot{r}_{l}}{(r_{k}r_{k})^{5/2}}r_{j}{\boldsymbol{e}}_{j}\right)$
	$\displaystyle=-\frac{\ddot{r}_{j}{\boldsymbol{e}}_{j}}{r^{3}}+\frac{3}{2}\frac{2r_{k^{\prime}}\dot{r}_{k^{\prime}}}{(r_{k}r_{k})^{5/2}}\dot{\boldsymbol{r}}+3\frac{\dot{r}_{l}\dot{r}_{l}+r_{l}\ddot{r}_{l}}{r^{5}}{\boldsymbol{r}}+\frac{3{\boldsymbol{r}}\cdot\dot{\boldsymbol{r}}}{r^{5}}\dot{r}_{j}{\boldsymbol{e}}_{j}-\frac{5}{2}\frac{2r_{k^{\prime}}\dot{r}_{k^{\prime}}}{(r_{k}r_{k})^{7/2}}3({\boldsymbol{r}}\cdot\dot{\boldsymbol{r}}){\boldsymbol{r}}$
	$\displaystyle=-\frac{\ddot{\boldsymbol{r}}}{r^{3}}+3\frac{2({\boldsymbol{r}}\cdot\dot{\boldsymbol{r}})\dot{\boldsymbol{r}}+(\dot{\boldsymbol{r}}\cdot\dot{\boldsymbol{r}}+{\boldsymbol{r}}\cdot\ddot{\boldsymbol{r}}){\boldsymbol{r}}}{r^{5}}-15\frac{({\boldsymbol{r}}\cdot\dot{\boldsymbol{r}})^{2}}{r^{7}}{\boldsymbol{r}},$	(3.2.9)

and the fifth derivative (crackle)

$\displaystyle{\boldsymbol{r}}^{(5)}$	$\displaystyle=\frac{{\rm d}}{{\rm d}t}\left(-\frac{\ddot{r}_{j}{\boldsymbol{e}}_{j}}{(r_{k}r_{k})^{3/2}}+3\frac{2(r_{l}\dot{r}_{l})\dot{r}_{j}{\boldsymbol{e}}_{j}+(\dot{r}_{l}\dot{r}_{l}+r_{l}\ddot{r}_{l})r_{j}{\boldsymbol{e}}_{j}}{(r_{k}r_{k})^{5/2}}-15\frac{(r_{l}\dot{r}_{l})^{2}}{(r_{k}r_{k})^{7/2}}r_{j}{\boldsymbol{e}}_{j}\right)$
	$\displaystyle=\left[\begin{aligned} &-\frac{r_{j}^{(3)}{\boldsymbol{e}}_{j}}{r^{3}}+\frac{3}{2}\frac{2r_{k^{\prime}}\dot{r}_{k^{\prime}}}{(r_{k}r_{k})^{5/2}}\ddot{\boldsymbol{r}}+6\frac{(\dot{r}_{l}\dot{r}_{l}+r_{l}\ddot{r}_{l})\dot{\boldsymbol{r}}+({\boldsymbol{r}}\cdot\dot{\boldsymbol{r}})\ddot{r}_{j}{\boldsymbol{e}}_{j}}{r^{5}}\\ &+3\frac{(3\dot{r}_{l}\ddot{r}_{l}+r_{l}r_{l}^{(3)}){\boldsymbol{r}}+(\dot{\boldsymbol{r}}\cdot\dot{\boldsymbol{r}}+{\boldsymbol{r}}\cdot\ddot{\boldsymbol{r}})\dot{r}_{j}{\boldsymbol{e}}_{j}}{r^{5}}-\frac{5}{2}\frac{2r_{k^{\prime}}\dot{r}_{k^{\prime}}}{(r_{k}r_{k})^{7/2}}3[2({\boldsymbol{r}}\cdot\dot{\boldsymbol{r}})\dot{\boldsymbol{r}}+(\dot{\boldsymbol{r}}\cdot\dot{\boldsymbol{r}}+{\boldsymbol{r}}\cdot\ddot{\boldsymbol{r}}){\boldsymbol{r}}]\\ &-15\frac{2(r_{l}\dot{r}_{l})(\dot{r}_{l^{\prime}}\dot{r}_{l^{\prime}}+r_{l^{\prime}}\ddot{r}_{l^{\prime}}){\boldsymbol{r}}+({\boldsymbol{r}}\cdot\dot{\boldsymbol{r}})^{2}\dot{r}_{j}{\boldsymbol{e}}_{j}}{r^{7}}+\frac{7}{2}\frac{2r_{k^{\prime}}\dot{r}_{k^{\prime}}}{(r_{k}r_{k})^{9/2}}15({\boldsymbol{r}}\cdot\dot{\boldsymbol{r}})^{2}{\boldsymbol{r}}\end{aligned}\right]$
	$\displaystyle=\left[\begin{aligned} &-\frac{{\boldsymbol{r}}^{(3)}}{r^{3}}+3\frac{3({\boldsymbol{r}}\cdot\dot{\boldsymbol{r}})\ddot{\boldsymbol{r}}+3(\dot{\boldsymbol{r}}\cdot\dot{\boldsymbol{r}}+{\boldsymbol{r}}\cdot\ddot{\boldsymbol{r}})\dot{\boldsymbol{r}}+(3\dot{\boldsymbol{r}}\cdot\ddot{\boldsymbol{r}}+{\boldsymbol{r}}\cdot{\boldsymbol{r}}^{(3)}){\boldsymbol{r}}}{r^{5}}\\ &-45({\boldsymbol{r}}\cdot\dot{\boldsymbol{r}})\frac{({\boldsymbol{r}}\cdot\dot{\boldsymbol{r}})\dot{\boldsymbol{r}}+(\dot{\boldsymbol{r}}\cdot\dot{\boldsymbol{r}}+{\boldsymbol{r}}\cdot\ddot{\boldsymbol{r}}){\boldsymbol{r}}}{r^{7}}+105\frac{({\boldsymbol{r}}\cdot\dot{\boldsymbol{r}})^{3}}{r^{9}}{\boldsymbol{r}}\end{aligned}\right]$	(3.2.10)

of the position vector ${\boldsymbol{r}}$ ; using these derivatives, the Taylor expansion of the EOM is

\displaystyle\left\{\begin{aligned} {\boldsymbol{r}}(t)&={\boldsymbol{r}}_{0}+\dot{\boldsymbol{r}}_{0}t+\frac{1}{2}\ddot{\boldsymbol{r}}_{0}t^{2}+\frac{1}{6}{\boldsymbol{r}}^{(3)}_{0}t^{3}+\frac{1}{24}{\boldsymbol{r}}^{(4)}_{0}t^{4}+\frac{1}{120}{\boldsymbol{r}}^{(5)}_{0}t^{5}+\mathcal{O}(t^{6})\\ {\boldsymbol{v}}(t)&=\dot{\boldsymbol{r}}_{0}+\ddot{\boldsymbol{r}}_{0}t+\frac{1}{2}{\boldsymbol{r}}^{(3)}_{0}t^{2}+\frac{1}{6}{\boldsymbol{r}}^{(4)}_{0}t^{3}+\frac{1}{24}{\boldsymbol{r}}^{(5)}_{0}t^{4}+\mathcal{O}(t^{5})\end{aligned}\right..

(3.2.11)

It is verified that the first-order ContEvol solution Eq. (3.1.11) is identical to

\displaystyle\left\{\begin{aligned} {\boldsymbol{r}}_{{\rm CE1},h}&={\boldsymbol{r}}_{0}+\dot{\boldsymbol{r}}_{0}h+\frac{1}{2}\ddot{\boldsymbol{r}}_{0}h^{2}+\frac{1}{6}{\boldsymbol{r}}^{(3)}_{0}h^{3}+\frac{1}{24}{\boldsymbol{r}}^{(4)}_{0}h^{4}+\frac{1}{120}{\boldsymbol{r}}^{(5)}_{0}h^{5}+\mathcal{O}(h^{7})\\ {\boldsymbol{v}}_{{\rm CE1},h}&=\dot{\boldsymbol{r}}_{0}+\ddot{\boldsymbol{r}}_{0}h+\frac{1}{2}{\boldsymbol{r}}^{(3)}_{0}h^{2}+\frac{1}{6}{\boldsymbol{r}}^{(4)}_{0}h^{3}+\frac{1}{24}{\boldsymbol{r}}^{(5)}_{0}h^{4}+\mathcal{O}(h^{5})\end{aligned}\right.;

(3.2.12)

note that the $\mathcal{O}(h^{6})$ term of ${\boldsymbol{r}}_{{\rm CE1},h}$ is missing. Therefore, at least for the two-body problem, first-order ContEvol is equivalent to fifth-order Taylor expansion of the EOM in terms of position, and fourth-order in terms of velocity.

For relatively simple equation(s), successive derivatives are feasible; however, when the system is complicated, ContEvol could provide a “shortcut” to obtain fifth/fourth-order Taylor expansion of the evolution numerically. Specifically, one can compute counterparts of the ${\boldsymbol{C}}_{i}$ coefficients Eq. (3.1.5) numerically, use them to construct a linear system like Eq. (3.1.7), and then solve it to obtain counterparts of ${\boldsymbol{A}}$ and ${\boldsymbol{B}}$ . In Section 3.5, we will outline how this is supposed to be done for the three-body problem.

The procedure described above is not the only way to implement a ContEvol method. For relatively simple problems like the two-body problem, one can choose to directly use expressions for results at $t=h$ , e.g., ${\boldsymbol{r}}_{h}$ and ${\boldsymbol{v}}_{h}$ Eq. (3.1.11). We refer to the two strategies as implementation by optimization process and implementation by optimization results, respectively. In Section 3.4, while implementing first-order ContEvol for an eccentric orbit, we will adopt the second strategy, i.e., directly utilize Eq. (3.2.12), truncating it at $\mathcal{O}(h^{7})$ for ${\boldsymbol{r}}_{h}$ and $\mathcal{O}(h^{5})$ for ${\boldsymbol{v}}_{h}$ .
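The second strategy (implementation by optimization results) can be sketched in a few lines. The sketch below assumes units in which the EOM reads $\ddot{\boldsymbol{r}}=-{\boldsymbol{r}}/r^{3}$ , evaluates the derivatives of Eqs. (3.2.8) through (3.2.10) at the start of the step, and advances the state with the truncated series of Eq. (3.2.12). The function names `kepler_derivatives` and `contevol_step` are hypothetical, and this is not necessarily how the experiments of Section 3.4 are coded.

```python
import numpy as np

def kepler_derivatives(r, v):
    """Acceleration, jerk, snap and crackle, Eqs. (3.2.8)-(3.2.10)."""
    d = np.linalg.norm(r)
    rv = np.dot(r, v)
    a = -r / d**3
    vv_ra = np.dot(v, v) + np.dot(r, a)
    jerk = -v / d**3 + 3.0 * rv * r / d**5
    snap = (-a / d**3 + 3.0 * (2.0 * rv * v + vv_ra * r) / d**5
            - 15.0 * rv**2 * r / d**7)
    crackle = (-jerk / d**3
               + 3.0 * (3.0 * rv * a + 3.0 * vv_ra * v
                        + (3.0 * np.dot(v, a) + np.dot(r, jerk)) * r) / d**5
               - 45.0 * rv * (rv * v + vv_ra * r) / d**7
               + 105.0 * rv**3 * r / d**9)
    return a, jerk, snap, crackle

def contevol_step(r, v, h):
    """One step of Eq. (3.2.12): r through h^5, v through h^4."""
    r = np.asarray(r, dtype=float)
    v = np.asarray(v, dtype=float)
    a, j, s, c = kepler_derivatives(r, v)
    r_h = r + v * h + a * h**2 / 2 + j * h**3 / 6 + s * h**4 / 24 + c * h**5 / 120
    v_h = v + a * h + j * h**2 / 2 + s * h**3 / 6 + c * h**4 / 24
    return r_h, v_h

if __name__ == "__main__":
    # Test case 1 of Section 3.1: uniform circular motion.
    print(contevol_step([1.0, 0.0], [0.0, 1.0], 0.1))
```

For the circular initial conditions of Test case 1, this reproduces the terms of Eq. (3.1.19) up to $h^{5}$ in position and $h^{4}$ in velocity.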

3.3 Two-body, conservation of mechanical energy and angular momentum

As mentioned in the second test case of Section 3.1, two quantities should be conserved in the two-body problem: mechanical energy and angular momentum. In terms of ${\boldsymbol{r}}$ and ${\boldsymbol{v}}$ , these are

\displaystyle\left\{\begin{aligned} E({\boldsymbol{r}},{\boldsymbol{v}})&=-\frac{1}{r}+\frac{v^{2}}{2}\\ L_{z}({\boldsymbol{r}},{\boldsymbol{v}})&=r_{x}v_{y}-r_{y}v_{x}\end{aligned}\right.,

(3.3.1)

respectively; note that ${\boldsymbol{L}}={\boldsymbol{r}}\times{\boldsymbol{v}}=L_{z}\hat{\boldsymbol{z}}$ in the case of a two-body problem, hence we only need to track its $z$ component. The proofs are straightforward:

\displaystyle\left\{\begin{aligned} \dot{E}&=-\dot{r}_{i}\frac{\partial}{\partial r_{i}}\frac{1}{(r_{k}r_{k})^{1/2}}+\ddot{r}_{i}\frac{\partial}{\partial\dot{r}_{i}}\frac{\dot{r}_{k}\dot{r}_{k}}{2}=\dot{r}_{i}\frac{2r_{i}}{2(r_{k}r_{k})^{3/2}}+\ddot{r}_{i}\dot{r}_{i}=\dot{\boldsymbol{r}}\cdot\left(\frac{{\boldsymbol{r}}}{r^{3}}+\ddot{\boldsymbol{r}}\right)=0\\ \dot{\boldsymbol{L}}&=\dot{\boldsymbol{r}}\times\dot{\boldsymbol{r}}+{\boldsymbol{r}}\times\ddot{\boldsymbol{r}}={\boldsymbol{r}}\times\left(-\frac{{\boldsymbol{r}}}{r^{3}}\right)={\boldsymbol{0}}\end{aligned}\right.,

(3.3.2)

where we have used the equation of motion Eq. (3.0.2) in both.

Using these two conservation laws, we can express ${\boldsymbol{v}}$ in terms of ${\boldsymbol{r}}$ as

\displaystyle\left\{\begin{aligned} v_{x}&=\frac{-r_{y}L_{z}\pm|r_{x}|\sqrt{\Delta_{\boldsymbol{r}}}}{r^{2}}\\ v_{y}&=\frac{r_{x}L_{z}\pm\operatorname{sgn}(r_{x})r_{y}\sqrt{\Delta_{\boldsymbol{r}}}}{r^{2}}\end{aligned}\right.\quad r_{x}\neq 0,

(3.3.3)

where $\operatorname{sgn}(\cdot)$ is the sign function, or

\displaystyle\left\{\begin{aligned} v_{x}&=\frac{-r_{y}L_{z}\pm\operatorname{sgn}(r_{y})r_{x}\sqrt{\Delta_{\boldsymbol{r}}}}{r^{2}}\\ v_{y}&=\frac{r_{x}L_{z}\pm|r_{y}|\sqrt{\Delta_{\boldsymbol{r}}}}{r^{2}}\end{aligned}\right.\quad r_{y}\neq 0

(3.3.4)

with

\displaystyle\Delta_{\boldsymbol{r}}=2(r+Er^{2})-L_{z}^{2}=(rv)^{2}-(rv\sin\langle{\boldsymbol{r}},{\boldsymbol{v}}\rangle)^{2}=(rv\cos\langle{\boldsymbol{r}},{\boldsymbol{v}}\rangle)^{2}\geq 0.

(3.3.5)

One should not use $\sqrt{\Delta_{\boldsymbol{r}}}=|rv\cos\langle{\boldsymbol{r}},{\boldsymbol{v}}\rangle|=|{\boldsymbol{r}}\cdot{\boldsymbol{v}}|$ to simplify Eqs. (3.3.3) or (3.3.4), as ${\boldsymbol{v}}$ is what we are trying to derive.

To resolve the ambiguity of the $\pm$ symbols, we write ${\boldsymbol{r}}_{h}\equiv{\boldsymbol{r}}(h)$ and ${\boldsymbol{v}}_{h}\equiv{\boldsymbol{v}}(h)$ as

\displaystyle\left\{\begin{aligned} {\boldsymbol{r}}_{h}&={\boldsymbol{r}}_{0}+\bar{\boldsymbol{v}}h\\ {\boldsymbol{v}}_{h}&={\boldsymbol{v}}_{0}+\bar{\boldsymbol{a}}h\end{aligned}\right.

(3.3.6)

and derive $E$ and $L_{z}$ from initial conditions ${\boldsymbol{r}}_{0}\equiv{\boldsymbol{r}}(0)$ and ${\boldsymbol{v}}_{0}\equiv{\boldsymbol{v}}(0)$

\displaystyle\left\{\begin{aligned} E&=-\frac{1}{r_{0}}+\frac{v_{0}^{2}}{2}\\ L_{z}&=r_{0x}v_{0y}-r_{0y}v_{0x}\end{aligned}\right..

(3.3.7)

Note that for this purpose, we are treating each pair of ${\boldsymbol{r}}$ and ${\boldsymbol{v}}$ as ${\boldsymbol{r}}_{0}$ and ${\boldsymbol{v}}_{0}$ ; in other words, we imagine an infinitesimal next step $h\to 0$ for any given position and velocity.

Plugging these into and expanding Eq. (3.3.3), we obtain

\displaystyle v_{xh}=v_{0x}+\left[\begin{aligned} &\bar{v}_{x}\left(\frac{2r_{0x}r_{0y}L_{z}}{r_{0}^{4}}\pm\frac{r_{0x}[2r_{0x}^{2}r_{0y}^{2}v_{0x}^{2}-2r_{0x}r_{0y}(r_{0x}^{2}-r_{0y}^{2})v_{0x}v_{0y}+(r_{0y}^{4}+r_{0x}^{4})v_{0y}^{2}-r_{0}r_{0x}^{2}]}{r_{0}^{4}|r_{0x}{\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0}|}\right)\\ &-\bar{v}_{y}\left(\frac{(r_{0x}^{2}-r_{0y}^{2})L_{z}}{r_{0}^{4}}\pm\frac{r_{0x}^{2}r_{0y}[(r_{0x}^{2}-r_{0y}^{2})(v_{0x}^{2}-v_{0y}^{2})+4r_{0x}r_{0y}v_{0x}v_{0y}+r_{0}]}{r_{0}^{4}|r_{0x}{\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0}|}\right)\end{aligned}\right]h+\mathcal{O}(h^{2}),

(3.3.8)

where we have used $L_{z}$ to simplify notation, and a similar expression for $v_{yh}$ ; in the limit $h\to 0$ , we should have $\bar{\boldsymbol{v}}\to{\boldsymbol{v}}_{0}$ , and thus

\displaystyle v_{xh}\to v_{0x}+\left[\begin{aligned} &a_{0x}+(\pm\operatorname{sgn}(r_{0x}{\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0})-1)\\ &\cdot\frac{2r_{0x}r_{0y}^{2}v_{0x}^{2}-r_{0y}(3r_{0x}^{2}-r_{0y}^{2})v_{0x}v_{0y}+r_{0x}(r_{0x}^{2}-r_{0y}^{2})v_{0y}^{2}-r_{0}r_{0x}}{r_{0}^{4}}\end{aligned}\right]h+\mathcal{O}(h^{2}).

(3.3.9)

Since $\bar{\boldsymbol{a}}\to{\boldsymbol{a}}_{0}=-{\boldsymbol{r}}_{0}/r_{0}^{3}$ , the $\pm$ symbols in Eq. (3.3.3) should take the same sign as $r_{0x}{\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0}$ .

The $\pm$ symbols in Eq. (3.3.4) can be determined in the same way, and the final expressions are the same. In conclusion, based on the conserved quantities $E$ and $L_{z}$ , the unambiguous expression for ${\boldsymbol{v}}$ in terms of ${\boldsymbol{r}}$ is

\displaystyle\left\{\begin{aligned} v_{x}&=\frac{-r_{y}L_{z}+\operatorname{sgn}({\boldsymbol{r}}\cdot{\boldsymbol{v}}^{\prime})r_{x}\sqrt{\Delta_{\boldsymbol{r}}}}{r^{2}}\\ v_{y}&=\frac{r_{x}L_{z}+\operatorname{sgn}({\boldsymbol{r}}\cdot{\boldsymbol{v}}^{\prime})r_{y}\sqrt{\Delta_{\boldsymbol{r}}}}{r^{2}}\end{aligned}\right.\quad r_{x}r_{y}\neq 0,

(3.3.10)

where the prime “^′” for ${\boldsymbol{v}}^{\prime}$ will be explained later in this section. Note that the condition $r_{x}r_{y}\neq 0$ is always satisfied unless $L_{z}=0$ , which reduces the two-body problem to its one-dimensional case and is usually not of interest, except for calculating the “free-fall” timescale.
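A minimal sketch of Eq. (3.3.10) is given below. The helper names `energy_and_lz` and `velocity_from_position` are hypothetical; the argument `v_guess` plays the role of ${\boldsymbol{v}}^{\prime}$ and is used only through the sign of ${\boldsymbol{r}}\cdot{\boldsymbol{v}}^{\prime}$ . Clipping $\Delta_{\boldsymbol{r}}$ at zero is an added safeguard against round-off and is not part of the derivation.

```python
import math

def energy_and_lz(r, v):
    """Conserved quantities of Eq. (3.3.1)."""
    rx, ry = r
    vx, vy = v
    E = -1.0 / math.hypot(rx, ry) + 0.5 * (vx * vx + vy * vy)
    Lz = rx * vy - ry * vx
    return E, Lz

def velocity_from_position(r, E, Lz, v_guess):
    """Velocity implied by r, E and Lz, Eq. (3.3.10)."""
    rx, ry = r
    d = math.hypot(rx, ry)
    delta_r = 2.0 * (d + E * d * d) - Lz * Lz   # Eq. (3.3.5)
    root = math.sqrt(max(delta_r, 0.0))         # assumption: clip round-off
    s = math.copysign(1.0, rx * v_guess[0] + ry * v_guess[1])
    vx = (-ry * Lz + s * rx * root) / (d * d)
    vy = (rx * Lz + s * ry * root) / (d * d)
    return vx, vy

if __name__ == "__main__":
    r, v = (1.2, 0.5), (-0.3, 0.8)
    E, Lz = energy_and_lz(r, v)
    # A slightly perturbed guess is enough to select the correct branch.
    print(velocity_from_position(r, E, Lz, v_guess=(-0.29, 0.81)))
```

Given exact $E$ and $L_{z}$ , the function returns the original velocity whenever the guess has the correct sign of ${\boldsymbol{r}}\cdot{\boldsymbol{v}}$ .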

Similarly, we can express ${\boldsymbol{r}}$ in terms of ${\boldsymbol{v}}$ as

\displaystyle\left\{\begin{aligned} r_{x}&=\frac{v_{y}L_{z}\pm|v_{x}|\sqrt{\Delta_{\boldsymbol{v}}}}{v^{2}}\\ r_{y}&=\frac{-v_{x}L_{z}\pm\operatorname{sgn}(v_{x})v_{y}\sqrt{\Delta_{\boldsymbol{v}}}}{v^{2}}\end{aligned}\right.\quad v_{x}\neq 0

(3.3.11)

\displaystyle\left\{\begin{aligned} r_{x}&=\frac{v_{y}L_{z}\pm\operatorname{sgn}(v_{y})v_{x}\sqrt{\Delta_{\boldsymbol{v}}}}{v^{2}}\\ r_{y}&=\frac{-v_{x}L_{z}\pm|v_{y}|\sqrt{\Delta_{\boldsymbol{v}}}}{v^{2}}\end{aligned}\right.\quad v_{y}\neq 0

(3.3.12)

with

\displaystyle\Delta_{\boldsymbol{v}}=\frac{v^{2}}{(-E+v^{2}/2)^{2}}-L_{z}^{2}=(rv)^{2}-(rv\sin\langle{\boldsymbol{r}},{\boldsymbol{v}}\rangle)^{2}=(rv\cos\langle{\boldsymbol{r}},{\boldsymbol{v}}\rangle)^{2}\geq 0.

(3.3.13)

Plugging Eqs. (3.3.6) and (3.3.7) into and expanding Eq. (3.3.11), we obtain

\displaystyle r_{hx}=r_{0x}+\left[\begin{aligned} &\bar{a}_{x}v_{0x}\left(-2\frac{v_{0y}L_{z}\pm|v_{0x}{\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0}|}{v_{0}^{4}}\pm 2\frac{r_{0x}}{v_{0}^{2}}\operatorname{sgn}(v_{0x}{\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0})\pm\frac{r_{0y}^{2}-r_{0}^{3}v_{0x}^{2}}{|v_{0x}{\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0}|}\right)\\ &-\bar{a}_{y}\left(2v_{0y}\frac{v_{0y}L_{z}\pm|v_{0x}{\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0}|}{v_{0}^{4}}-\frac{L_{z}}{v_{0}^{2}}\pm\frac{r_{0}^{2}v_{0x}^{2}v_{0y}(r_{0}v_{0}^{2}-1)}{|v_{0x}{\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0}|}\right)\end{aligned}\right]h+\mathcal{O}(h^{2}),

(3.3.14)

where again we have used $L_{z}$ to simplify notation, and a similar expression for $r_{hy}$ ; in the limit $h\to 0$ , we should have $\bar{\boldsymbol{a}}\to{\boldsymbol{a}}_{0}=-{\boldsymbol{r}}_{0}/r_{0}^{3}$ , and thus

\displaystyle r_{hx}\to r_{0x}+\left[\begin{aligned} &v_{0x}+(\pm\operatorname{sgn}(v_{0x}{\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0})-1)\\ &\cdot\left(v_{0x}-\frac{2r_{0x}^{2}v_{0x}v_{0y}^{2}+r_{0x}r_{0y}v_{0y}(v_{0y}^{2}-3v_{0x}^{2})+r_{0y}^{2}v_{0x}(v_{0x}^{2}-v_{0y}^{2})}{r_{0}^{3}v_{0}^{4}}\right)\end{aligned}\right]h+\mathcal{O}(h^{2}).

(3.3.15)

Since $\bar{\boldsymbol{v}}\to{\boldsymbol{v}}_{0}$ , $\pm$ symbols in Eq. (3.3.11) should take the same sign as $v_{0x}{\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0}$ .

The $\pm$ symbols in Eq. (3.3.12) can be determined in the same way, and the final expressions are the same. In conclusion, based on the conserved quantities $E$ and $L_{z}$ , the unambiguous expression for ${\boldsymbol{r}}$ in terms of ${\boldsymbol{v}}$ is

\displaystyle\left\{\begin{aligned} r_{x}&=\frac{v_{y}L_{z}+\operatorname{sgn}({\boldsymbol{r}}^{\prime}\cdot{\boldsymbol{v}})v_{x}\sqrt{\Delta_{\boldsymbol{v}}}}{v^{2}}\\ r_{y}&=\frac{-v_{x}L_{z}+\operatorname{sgn}({\boldsymbol{r}}^{\prime}\cdot{\boldsymbol{v}})v_{y}\sqrt{\Delta_{\boldsymbol{v}}}}{v^{2}}\end{aligned}\right.\quad v_{x}v_{y}\neq 0.

(3.3.16)

The condition $v_{x}v_{y}\neq 0$ is also always satisfied unless $L_{z}=0$ .

Admittedly, ${\boldsymbol{v}}$ should not appear when we express ${\boldsymbol{v}}$ in terms of ${\boldsymbol{r}}$ , nor vice versa. Fortunately, numerical methods (including Runge–Kutta, ContEvol, etc.) usually predict both ${\boldsymbol{r}}$ and ${\boldsymbol{v}}$ after each time step, hence when we use ${\boldsymbol{r}}$ (or ${\boldsymbol{v}}$ ) to derive ${\boldsymbol{v}}$ (or ${\boldsymbol{r}}$ ), ${\boldsymbol{v}}^{\prime}$ (or ${\boldsymbol{r}}^{\prime}$ ) provided by the original numerical methods can be treated as a reasonable initial guess; these are denoted with a prime “^′” in Eqs. (3.3.10) and (3.3.16).

The behavior of the sign function near zero deserves further discussion. When ${\boldsymbol{r}}\cdot{\boldsymbol{v}}^{\prime}\approx 0$ , i.e., when ${\boldsymbol{r}}$ and ${\boldsymbol{v}}^{\prime}$ are perpendicular to each other, $\Delta_{\boldsymbol{r}}\approx 0$ , so the value of $\operatorname{sgn}({\boldsymbol{r}}\cdot{\boldsymbol{v}}^{\prime})$ does not matter. Similarly, when ${\boldsymbol{r}}^{\prime}\cdot{\boldsymbol{v}}\approx 0$ , $\Delta_{\boldsymbol{v}}\approx 0$ , so the value of $\operatorname{sgn}({\boldsymbol{r}}^{\prime}\cdot{\boldsymbol{v}})$ does not matter either. In practice, neither ${\boldsymbol{r}}\cdot{\boldsymbol{v}}^{\prime}$ nor ${\boldsymbol{r}}^{\prime}\cdot{\boldsymbol{v}}$ can be exactly zero, except for initial conditions or very rare coincidences, yet we need to consider the cases where they are close to zero, as wrong signs can change the direction of the history, which is undesirable. To resolve this issue, one can specify a threshold $\delta$ , and set the value of the sign function to $0$ when $|{\boldsymbol{r}}\cdot{\boldsymbol{v}}^{\prime}|<\delta$ or $|{\boldsymbol{r}}^{\prime}\cdot{\boldsymbol{v}}|<\delta$ , or make a smoother transition using, e.g., a rescaled logistic function.
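The two regularizations of the sign function mentioned above can be sketched as follows. The names `sign_threshold` and `sign_logistic` are hypothetical, and using the same $\delta$ as the width of the logistic transition is an assumption made for illustration; the text does not prescribe a particular rescaling.

```python
import math

def sign_threshold(x, delta):
    """Sign function that returns 0 inside the dead zone |x| < delta."""
    if abs(x) < delta:
        return 0.0
    return math.copysign(1.0, x)

def sign_logistic(x, delta):
    """Rescaled logistic 2 / (1 + exp(-x / delta)) - 1.

    Written as tanh(x / (2 delta)), which is the same function but
    does not overflow for large |x| / delta.
    """
    return math.tanh(x / (2.0 * delta))

if __name__ == "__main__":
    for x in (-0.5, -1e-9, 0.0, 1e-9, 0.5):
        print(x, sign_threshold(x, 1e-6), sign_logistic(x, 1e-6))
```

Either function can replace $\operatorname{sgn}(\cdot)$ in Eqs. (3.3.10) and (3.3.16); both approach $\pm 1$ far from zero.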

In the context of ContEvol, there are two approaches to make use of these conservation laws.

Approach 1: Use ${\boldsymbol{r}}_{h}$ to correct ${\boldsymbol{v}}_{h}$ .

As shown in Section 3.1, errors in ${\boldsymbol{r}}_{h}$ and ${\boldsymbol{v}}_{h}$ of first-order ContEvol are $\mathcal{O}(h^{6})$ and $\mathcal{O}(h^{5})$ , respectively. Because of this difference, after each step, using ${\boldsymbol{r}}_{h}$ to correct ${\boldsymbol{v}}_{h}$ according to Eq. (3.3.10) could be beneficial.

To test the usefulness of this approach, we plug ${\boldsymbol{r}}_{h}$ given by Eq. (3.1.11) into Eq. (3.3.10) to obtain a corrected version of ${\boldsymbol{v}}_{h}$ , denoted as ${\boldsymbol{v}}_{h,{\rm CC}}$ ; the discrepancy between uncorrected and corrected expressions is at the fifth order, hence we omit the latter here. Assuming ${\boldsymbol{r}}\cdot{\boldsymbol{v}}<0$ , the corrected Jacobian matrix is (subscript “CC” stands for conservation correction)

\displaystyle J_{\rm CC}=\begin{pmatrix}\partial r_{hx}/\partial r_{0x}&\partial r_{hx}/\partial r_{0y}&\partial r_{hx}/\partial v_{0x}&\partial r_{hx}/\partial v_{0y}\\ \partial r_{hy}/\partial r_{0x}&\partial r_{hy}/\partial r_{0y}&\partial r_{hy}/\partial v_{0x}&\partial r_{hy}/\partial v_{0y}\\ \partial v_{hx,{\rm CC}}/\partial r_{0x}&\partial v_{hx,{\rm CC}}/\partial r_{0y}&\partial v_{hx,{\rm CC}}/\partial v_{0x}&\partial v_{hx,{\rm CC}}/\partial v_{0y}\\ \partial v_{hy,{\rm CC}}/\partial r_{0x}&\partial v_{hy,{\rm CC}}/\partial r_{0y}&\partial v_{hy,{\rm CC}}/\partial v_{0x}&\partial v_{hy,{\rm CC}}/\partial v_{0y}\end{pmatrix}\equiv\begin{pmatrix}J_{11}&J_{12}&J_{13}&J_{14}\\ J_{21}&J_{22}&J_{23}&J_{24}\\ J_{31,{\rm CC}}&J_{32,{\rm CC}}&J_{33,{\rm CC}}&J_{34,{\rm CC}}\\ J_{41,{\rm CC}}&J_{42,{\rm CC}}&J_{43,{\rm CC}}&J_{44,{\rm CC}}\end{pmatrix},

(3.3.17)

where the matrix elements are the same as those given by Eqs. (3.1.13) through (3.1.15) for the first two rows, since we are using the same expressions for ${\boldsymbol{r}}_{h}$ ; for the last two rows, they are different from those in Eqs. (3.1.14) through (3.1.16), but again, the leading orders are not affected, so we refrain from showing them here. Most importantly, the determinant of the corrected Jacobian is

\displaystyle\det(J_{\rm CC})=1+\frac{\left[\begin{aligned} &22r_{0}^{3}-4r_{0}^{2}[r_{0x}^{2}(95v_{0x}^{2}+22v_{0y}^{2})+r_{0y}^{2}(22v_{0x}^{2}+95v_{0y}^{2})+146r_{0x}r_{0y}v_{0x}v_{0y}]\\ &-3r_{0}\left\{\begin{aligned} &r_{0x}^{4}(596v_{0x}^{4}-386v_{0x}^{2}v_{0y}^{2}-37v_{0y}^{4})+r_{0y}^{4}(-37v_{0x}^{4}-386v_{0x}^{2}v_{0y}^{2}+596v_{0y}^{4})\\ &-2r_{0x}^{2}r_{0y}^{2}(193v_{0x}^{4}-2449v_{0x}^{2}v_{0y}^{2}+193v_{0y}^{4})\\ &+12r_{0x}r_{0y}v_{0x}v_{0y}[r_{0y}^{2}(-52v_{0x}^{2}+263v_{0y}^{2})+r_{0x}^{2}(263v_{0x}^{2}-52v_{0y}^{2})]\end{aligned}\right\}\\ &-45\left\{\begin{aligned} &r_{0x}^{6}(16v_{0x}^{6}-72v_{0x}^{4}v_{0y}^{2}+18v_{0x}^{2}v_{0y}^{4}+v_{0y}^{6})+r_{0y}^{6}(v_{0x}^{6}+18v_{0x}^{4}v_{0y}^{2}-72v_{0x}^{2}v_{0y}^{4}+16v_{0y}^{6})\\ &+30r_{0x}r_{0y}v_{0x}v_{0y}\left[\begin{aligned} &r_{0x}^{4}(8v_{0x}^{4}-12v_{0x}^{2}v_{0y}^{2}+v_{0y}^{4})+r_{0y}^{4}(v_{0x}^{4}-12v_{0x}^{2}v_{0y}^{2}+8v_{0y}^{4})\\ &-2r_{0x}^{2}r_{0y}^{2}(6v_{0x}^{4}-23v_{0x}^{2}v_{0y}^{2}+6v_{0y}^{4})\end{aligned}\right]\\ &-3r_{0x}^{2}r_{0y}^{2}\left[\begin{aligned} &r_{0x}^{2}(24v_{0x}^{6}-308v_{0x}^{4}v_{0y}^{2}+187v_{0x}^{2}v_{0y}^{4}-6v_{0y}^{6})\\ &+r_{0y}^{2}(-6v_{0x}^{6}+187v_{0x}^{4}v_{0y}^{2}-308v_{0x}^{2}v_{0y}^{4}+24v_{0y}^{6})\end{aligned}\right]\end{aligned}\right\}\end{aligned}\right]}{720r_{0}^{11}({\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0})^{2}}h^{6}+\cdots,

(3.3.18)

i.e., the non-symplecticity has been reduced from $\mathcal{O}(h^{5})$ (see Eq. (3.1.17)) to $\mathcal{O}(h^{6})$ . However, the $\mathcal{O}(h^{6})$ coefficient blows up when ${\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0}=0$ , in other words, when ${\boldsymbol{r}}_{0}$ and ${\boldsymbol{v}}_{0}$ are perpendicular to each other.

Then we take another look at test case 1 (of Section 3.1): uniform circular motion. Plugging ${\boldsymbol{r}}_{h}$ given by Eq. (3.1.20) into Eq. (3.3.10), we obtain

\displaystyle{\boldsymbol{v}}_{h,{\rm CC}}

\displaystyle=\begin{pmatrix}-\left[h-\dfrac{h^{3}}{6}+\dfrac{h^{5}}{120}-{\color[rgb]{1,0,0}\dfrac{1119}{14}}\cdot\dfrac{h^{7}}{5040}+\mathcal{O}(h^{9})\right]\\ 1-\dfrac{h^{2}}{2}+\dfrac{h^{4}}{24}-{\color[rgb]{1,0,0}2}\cdot\dfrac{h^{6}}{720}+\mathcal{O}(h^{8})\end{pmatrix}.

(3.3.19)

The fractional error of fifth-order coefficient has been eliminated, as expected; those of sixth and seventh (highlighted in red) have been reduced as well. Eq. (3.3.1) tells us that the discrepancies in conserved quantities are ameliorated as well

\displaystyle\left\{\begin{aligned} E_{h}&=-\frac{1}{2}-\frac{13}{240}h^{6}+\frac{2857}{100800}h^{8}+\mathcal{O}(h^{9})\\ L_{zh}&=1-\frac{13}{240}h^{6}+\frac{2857}{100800}h^{8}+\mathcal{O}(h^{9})\end{aligned}\right.\quad\Rightarrow\quad\left\{\begin{aligned} E_{h,{\rm CC}}&=-\frac{1}{2}-\frac{2669}{100800}h^{8}+\mathcal{O}(h^{9})\\ L_{zh,{\rm CC}}&=1-\frac{2669}{100800}h^{8}+\mathcal{O}(h^{9})\end{aligned}\right.,

(3.3.20)

i.e., deviations from conservation laws have been reduced by two orders. Note that the $\mathcal{O}(h^{8})$ errors may arise from truncation, since we only kept up to $\mathcal{O}(h^{7})$ terms in Section 3.1. Interestingly, errors in these two quantities are the same in both cases (before and after correction). We emphasize that, since $E$ and $L_{z}$ are derived from initial conditions, errors of the “CC” version do not accumulate.

In test case 2: parabolic motion, the corrected velocity vector is

\displaystyle{\boldsymbol{v}}_{h,{\rm CC}}

\displaystyle=\begin{pmatrix}-\dfrac{1}{\sqrt{2}}-\dfrac{h}{4}-\dfrac{h^{2}}{8\sqrt{2}}-\dfrac{5h^{3}}{192}-\dfrac{5h^{4}}{768\sqrt{2}}+\dfrac{7h^{5}}{6144}+{\color[rgb]{1,0,0}\dfrac{13}{10}}\cdot\dfrac{91h^{6}}{36864\sqrt{2}}+{\color[rgb]{1,0,0}\dfrac{1306}{1705}}\cdot\dfrac{341h^{7}}{294912}+\mathcal{O}(h^{8})\\ \dfrac{1}{\sqrt{2}}-\dfrac{h^{2}}{16\sqrt{2}}-\dfrac{h^{3}}{32}-\dfrac{35h^{4}}{1536\sqrt{2}}-\dfrac{7h^{5}}{1024}-{\color[rgb]{1,0,0}\dfrac{8}{7}}\cdot\dfrac{245h^{6}}{73728\sqrt{2}}-{\color[rgb]{1,0,0}\dfrac{11059}{5670}}\cdot\dfrac{9h^{7}}{16384}+\mathcal{O}(h^{8})\end{pmatrix};

(3.3.21)

the mechanic energy and the angular momentum before and after conservation correction are

\displaystyle\left\{\begin{aligned} E_{h}&=\frac{767h^{5}}{92160\sqrt{2}}+\frac{1891h^{6}}{737280}+\frac{7759h^{7}}{6193152\sqrt{2}}+\frac{25337h^{8}}{55050240}+\mathcal{O}(h^{9})\\ L_{zh}&=\sqrt{2}+\frac{107h^{5}}{7680}+\frac{113h^{6}}{61440\sqrt{2}}-\frac{223h^{7}}{368640}-\frac{1961h^{8}}{8257536\sqrt{2}}+\mathcal{O}(h^{9})\end{aligned}\right.\quad\Rightarrow\quad\left\{\begin{aligned} E_{h,{\rm CC}}&=\frac{99077h^{8}}{165150720}+\mathcal{O}(h^{9})\\ L_{zh,{\rm CC}}&=\sqrt{2}+\frac{8045h^{8}}{8257536\sqrt{2}}+\mathcal{O}(h^{9})\end{aligned}\right..

(3.3.22)

respectively. The situation is basically the same as test case 1, except that the reduction in $E$ and $L_{z}$ errors is three orders in this case.

As indicated by Section 2.2, for Runge–Kutta methods, errors in ${\boldsymbol{r}}_{h}$ and ${\boldsymbol{v}}_{h}$ have the same order, hence it is probably not well-motivated to use one to correct another; nevertheless, the correction described in this section should still be able to produce better conservation.

Approach 2: Enforce conservation laws in the formalism.

Alternatively, we can try to enforce conservation of mechanic energy and angular momentum in the ContEvol formalism.

Plugging our polynomial approximation Eq. (3.1.1) into and expanding Eq. (3.3.10), we obtain

\displaystyle{\boldsymbol{v}}(t)={\boldsymbol{v}}_{0}-\frac{{\boldsymbol{r}}_{0}}{r_{0}^{3}}t+\mathcal{O}(t^{2})={\boldsymbol{v}}_{0}+2{\boldsymbol{B}}t+\mathcal{O}(t^{2})\quad\Rightarrow\quad{\boldsymbol{B}}=-\frac{{\boldsymbol{r}}_{0}}{2r_{0}^{3}};

(3.3.23)

further expansion (based on ${\boldsymbol{B}}$ found above) yields

	$\displaystyle{\boldsymbol{v}}(t)$	$\displaystyle={\boldsymbol{v}}_{0}-\frac{{\boldsymbol{r}}_{0}}{r_{0}^{3}}t+\frac{1}{2r_{0}^{5}}\begin{pmatrix}(2r_{0x}^{2}-r_{0y}^{2})v_{0x}+3r_{0x}r_{0y}v_{0y}\\ (2r_{0y}^{2}-r_{0x}^{2})v_{0y}+3r_{0x}r_{0y}v_{0x}\end{pmatrix}t^{2}+\mathcal{O}(t^{3})$
		$\displaystyle={\boldsymbol{v}}_{0}+2{\boldsymbol{B}}t+3{\boldsymbol{A}}t^{2}+\mathcal{O}(t^{3})\quad\Rightarrow\quad{\boldsymbol{A}}=\frac{1}{6r_{0}^{5}}\begin{pmatrix}(2r_{0x}^{2}-r_{0y}^{2})v_{0x}+3r_{0x}r_{0y}v_{0y}\\ (2r_{0y}^{2}-r_{0x}^{2})v_{0y}+3r_{0x}r_{0y}v_{0x}\end{pmatrix}.$		(3.3.24)

In short, the ${\boldsymbol{A}}$ and ${\boldsymbol{B}}$ coefficients determined in this way are simply zeroth-order terms of Eq. (3.1.10), which are not very useful. Therefore, in the context of ContEvol, conservation laws are better used for correction purposes.
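As a sanity check of Eqs. (3.3.23) and (3.3.24), the sketch below evaluates ${\boldsymbol{A}}$ and ${\boldsymbol{B}}$ numerically and compares them with the acceleration $-{\boldsymbol{r}}/r^{3}$ and its time derivative, the latter estimated by a central finite difference. The helper names and the sample initial conditions are arbitrary choices made for this illustration.

```python
import math


def accel(r):
    """Two-body acceleration -r / |r|^3 in units with GM = 1."""
    d3 = math.hypot(r[0], r[1]) ** 3
    return (-r[0] / d3, -r[1] / d3)


def taylor_coefficients(r0, v0):
    """Return (A, B) as written in Eqs. (3.3.24) and (3.3.23)."""
    x, y = r0
    vx, vy = v0
    d = math.hypot(x, y)
    B = (-x / (2 * d**3), -y / (2 * d**3))
    A = (((2 * x * x - y * y) * vx + 3 * x * y * vy) / (6 * d**5),
         ((2 * y * y - x * x) * vy + 3 * x * y * vx) / (6 * d**5))
    return A, B


# Arbitrary sample initial conditions (assumed for illustration only).
r0, v0, eps = (1.0, 0.5), (-0.3, 0.8), 1e-5
A, B = taylor_coefficients(r0, v0)

# 2B should equal the initial acceleration; 6A should equal its time
# derivative, estimated here along the straight line r0 + v0 * t, which
# shares the same first derivative at t = 0.
ap = accel((r0[0] + v0[0] * eps, r0[1] + v0[1] * eps))
am = accel((r0[0] - v0[0] * eps, r0[1] - v0[1] * eps))
jerk_fd = ((ap[0] - am[0]) / (2 * eps), (ap[1] - am[1]) / (2 * eps))
print(2 * B[0] - accel(r0)[0], 6 * A[0] - jerk_fd[0], 6 * A[1] - jerk_fd[1])
```

All three printed differences should be close to zero, which is consistent with ${\boldsymbol{A}}$ and ${\boldsymbol{B}}$ being plain Taylor coefficients of the trajectory.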

To conclude this section, we briefly comment on how conservation of mechanic energy and angular momentum can be used in more realistic cases.

•

In galactic dynamics, when the matter distribution is axisymmetric, e.g., in the cases of some disk or elliptical galaxies, the situation is very similar to the two-body problem we consider here. Although both position and velocity of the particle are three-dimensional vectors now, mechanic energy and $z$ component of angular momentum are still conserved. Hence we can use these two constraints to correct $v_{x}$ and $v_{y}$ using all components of ${\boldsymbol{r}}$ and $v_{z}$ — note that $r_{z}$ and $v_{z}$ do not appear in the expression of $L_{z}$ , and are usually significantly smaller (in terms of absolute values) than their counterparts in $x$ and $y$ directions.
•

In general relativity, mechanic energy and angular momentum are conserved at the $\mathcal{O}(c^{-4})$ level ( $c$ is the speed of light in vacuum), before gravitational waves enter the scene. Therefore, while studying orbital motion of a planet around a star (e.g., Mercury around the Sun) or a star around a supermassive black hole using ContEvol (or another method which leads to different orders of error in position and velocity), conservation correction may also be useful.
•

Back to Newtonian gravity. For a general three-body problem (see Section 3.5 for further discussion), there are twelve components in total (two particles, positions and velocities, three directions) and four conserved quantities (total mechanic energy and three components of total angular momentum). Therefore, especially in almost coplanar cases, we can use $\{{\boldsymbol{r}}_{i}\}$ and $\{v_{iz}\}$ to correct $\{v_{ix}\}$ and $\{v_{iy}\}$ , where $i=1,2$ is the index of particle; in a restricted three-body problem, where one of the particles is much less massive than the others, we can choose a different set of four velocity components.
•

For a general $n$ -body problem, there are $6(n-1)$ components in total, but the number of conserved quantities is still four. Consequently, conservation laws become less and less useful as the number of particles increases. However, they are probably useful in hierarchical systems where we can still identify “important” velocity components. Further discussion on this topic is beyond the scope of this work.

The above discussion is only about the conservation laws per se. Since the ContEvol formalism promises to “recover” full evolutionary histories, when mechanic energy and angular momentum are not conserved for individual objects, in principle it allows users to perform corrections using energy-work and angular impulse-momentum theorems. To go one step further, if global sums of $E$ or ${\boldsymbol{L}}$ components (all of which should be conserved) obtained via these theorems deviate from the initial values, it is reasonable to globally rescale such sums before correcting individual quantities. However, such corrections are computationally expensive, and are only recommended when conservation laws are crucial.

3.4 Two-body, numerical tests with an eccentric elliptical orbit

In this section, we conduct numerical experiments to compare first-order ContEvol with some other low-order methods for celestial mechanics. We choose a highly eccentric elliptical orbit for testing purposes.

Specifically, this elliptical orbit has eccentricity $e$ , semi-major axis $a$ , semi-minor axis $b=a\sqrt{1-e^{2}}$ , and focal distance $c=ae$ . We write the equation of this ellipse as

\displaystyle\frac{(x+c)^{2}}{a^{2}}+\frac{y^{2}}{b^{2}}=1,

(3.4.1)

so that location of the “central object,” i.e., origin of our coordinate system $(0,0)^{\rm T}$ , is at the right focus. The mechanic energy of this orbit is (subscript “M” stands for mechanic and is added to distinguish energy from eccentric anomaly)

\displaystyle E_{\rm M}=-\frac{1}{2a},

(3.4.2)

while the orbital period is given by Kepler’s third law

\displaystyle T=2\pi a^{3/2}.

(3.4.3)

We let the particle start at the pericenter $(a(1-e),0)^{\rm T}$ and move counter-clockwise. The vis-viva equation tells us the initial speed

\displaystyle v_{0}=\sqrt{\frac{2}{a(1-e)}-\frac{1}{a}}=\sqrt{\frac{1}{a}\frac{1+e}{1-e}},

(3.4.4)

so that the initial velocity is $(0,v_{0})^{\rm T}$ , and thus ( $z$ component of) the angular momentum is

\displaystyle L_{z}=r_{0}v_{0}=\sqrt{a(1-e^{2})}.

(3.4.5)

At time $t$ , the position of our particle is given by

\displaystyle{\boldsymbol{r}}(t)=\begin{pmatrix}a(\cos E-e)\\ b\sin E\end{pmatrix},

(3.4.6)

where the eccentric anomaly $E$ is related to the mean anomaly

\displaystyle M=\frac{2\pi}{T}t=\frac{t}{a^{3/2}}

(3.4.7)

by Kepler’s equation

\displaystyle M=E-e\sin E,

(3.4.8)

which is a transcendental equation and has to be solved numerically.
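One possible way to solve Kepler's equation numerically is a bracketed Newton iteration, sketched below together with the exact position of Eq. (3.4.6). This is an illustration under our own assumptions (function names, tolerance, and the bisection fallback), not necessarily the solver used for the figures in this work.

```python
import math


def solve_kepler(M: float, e: float, tol: float = 1e-12,
                 max_iter: int = 100) -> float:
    """Solve M = E - e sin E for the eccentric anomaly E (0 <= e < 1)."""
    # f(E) = E - e sin E - M is increasing, with f(M - e) <= 0 <= f(M + e).
    lo, hi = M - e, M + e
    E = M
    for _ in range(max_iter):
        f = E - e * math.sin(E) - M
        if abs(f) < tol:
            break
        if f > 0:
            hi = E
        else:
            lo = E
        E_new = E - f / (1.0 - e * math.cos(E))  # Newton step
        # Fall back to bisection if the Newton step leaves the bracket.
        E = E_new if lo < E_new < hi else 0.5 * (lo + hi)
    return E


def exact_position(t: float, a: float, e: float):
    """Position of Eq. (3.4.6) at time t, in units with GM = 1."""
    M = t / a**1.5                      # mean anomaly, Eq. (3.4.7)
    E = solve_kepler(M, e)
    b = a * math.sqrt(1.0 - e * e)
    return (a * (math.cos(E) - e), b * math.sin(E))
```

The bracket matters for highly eccentric orbits, where the derivative $1-e\cos E$ becomes small near the pericenter and an unguarded Newton step can overshoot.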

The velocity can be obtained via Eq. (3.3.10) or expressed as

\displaystyle{\boldsymbol{v}}=\dot{\boldsymbol{r}}=\begin{pmatrix}-a\sin E\\ b\cos E\end{pmatrix}\dot{E}.

(3.4.9)

Ergo we have

$\displaystyle{\boldsymbol{r}}\cdot{\boldsymbol{v}}$	$\displaystyle=\begin{pmatrix}a(\cos E-e)\\ b\sin E\end{pmatrix}^{\rm T}\begin{pmatrix}-a\sin E\\ b\cos E\end{pmatrix}\dot{E}$
	$\displaystyle=a^{2}[-(\cos E-e)\sin E+(1-e^{2})\cos E\sin E]\dot{E}$
	$\displaystyle=a^{2}e(1-e\cos E)\sin E\dot{E};$	(3.4.10)

since $1-e\cos E\geq 1-e>0$ and $\dot{E}>0$ , ${\boldsymbol{r}}\cdot{\boldsymbol{v}}$ always has the same sign as $\sin E$ , or equivalently $r_{y}$ . This relation will be used for conservation correction (see Section 3.3), since this section is dedicated to testing numerical methods, not the sign determination strategy.

For numerical tests in this work, we choose the following orbital parameters:

•

Eccentricity $e=63/64\approx 0.9844$ , semi-major axis $a=16$ , semi-minor axis $b=\sqrt{127}/4\approx 2.817$ , and focal distance $c=63/4=15.75$ .
•

Orbital period $T=128\pi\approx 402.1$ , mechanic energy $E_{\rm M}=-1/32=-0.03125$ , and angular momentum $L_{z}=\sqrt{127}/16\approx 0.7043$ .
•

Pericenter at ${\boldsymbol{r}}_{p}=(1/4,0)^{\rm T}$ , where the velocity is ${\boldsymbol{v}}_{p}=(0,\sqrt{127}/4)^{\rm T}\approx(0,2.817)^{\rm T}$ ; apocenter at ${\boldsymbol{r}}_{a}=(-127/4,0)^{\rm T}=(-31.75,0)^{\rm T}$ , where the velocity is ${\boldsymbol{v}}_{a}=(0,-1/(4\sqrt{127}))^{\rm T}\approx(0,-0.02218)^{\rm T}$ .
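The parameters listed above follow from $e$ and $a$ through Eqs. (3.4.2) to (3.4.5); the short script below reproduces them as a cross-check. Variable names are our own, and the speeds at the apsides are obtained from $L_{z}=rv$ , which holds there because ${\boldsymbol{r}}$ and ${\boldsymbol{v}}$ are perpendicular.

```python
import math

e, a = 63 / 64, 16.0

b = a * math.sqrt(1 - e**2)        # semi-minor axis
c = a * e                          # focal distance
T = 2 * math.pi * a**1.5           # orbital period, Eq. (3.4.3)
E_M = -1 / (2 * a)                 # mechanic energy, Eq. (3.4.2)
L_z = math.sqrt(a * (1 - e**2))    # angular momentum, Eq. (3.4.5)

r_peri, r_apo = a * (1 - e), a * (1 + e)
v_peri, v_apo = L_z / r_peri, L_z / r_apo   # speeds at the apsides

print(f"b = {b:.4f}, c = {c:.2f}, T = {T:.1f}")
print(f"E_M = {E_M:.5f}, L_z = {L_z:.4f}")
print(f"r_peri = {r_peri}, v_peri = {v_peri:.4f}")
print(f"r_apo = {r_apo}, v_apo = {v_apo:.5f}")
```

For counter-clockwise motion the velocity points in the $+y$ direction at the pericenter and in the $-y$ direction at the apocenter.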

Meanwhile, technical choices include:

•

Numerical methods: leapfrog integrator (which is simple but symplectic), fourth-order Runge–Kutta, and first-order ContEvol methods, without and with conservation correction. Note that all these methods have higher-order counterparts.
•

Total duration $t_{\max}=432$ ; four fixed time steps: $h=1/16=0.0625$ , $h=1/64\approx 0.0156$ , $h=1/256\approx 0.0039$ , and $h=1/1024\approx 0.0010$ . For the two $h<1/64$ cases, we only record position and velocity every $\Delta t=1/64$ .
•

Programming language: Python with just-in-time compilation (see data availability). Processor information: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz, 2803 Mhz, 4 Core(s), 8 Logical Processor(s). We do not use multiprocessing explicitly.

Figure 3.4.1: Exact solution to the eccentric orbit specified in Section 3.4. From top to bottom, plotted versus mean anomaly $M$ are eccentric anomaly $E$ , position ${\boldsymbol{r}}$ ( $x$ and $y$ components) and velocity ${\boldsymbol{v}}$ ( $x$ and $y$ components).

Fig. 3.4.1 displays the exact solution of this scenario. Since the particle reaches its maximum speed at the pericenter, both its position and velocity change rapidly near $M=0$ and $M=2\pi$ . In the case of velocity near $M=2\pi$ , the $x$ component reaches its maximum and quickly flips its sign, while the $y$ component reaches a larger maximum and quickly falls back. These rapid changes constitute a “stress test” for the numerical methods.

Integrator	$h=1/16$	$h=1/64$	$h=1/256$	$h=1/1024$
LF	$2.58\,{\rm ms}\pm 62.2\,{\rm\mu s}$	$11.3\,{\rm ms}\pm 999\,{\rm\mu s}$	$32.8\,{\rm ms}\pm 1.65\,{\rm ms}$	$133\,{\rm ms}\pm 1.10\,{\rm ms}$
LFCC	$3.16\,{\rm ms}\pm 53.6\,{\rm\mu s}$	$13.2\,{\rm ms}\pm 2.07\,{\rm ms}$	$47.3\,{\rm ms}\pm 688\,{\rm\mu s}$	$210\,{\rm ms}\pm 9.66\,{\rm ms}$
RK4	$7.99\,{\rm ms}\pm 397\,{\rm\mu s}$	$31.7\,{\rm ms}\pm 1.37\,{\rm ms}$	$131\,{\rm ms}\pm 5.65\,{\rm ms}$	$514\,{\rm ms}\pm 29.2\,{\rm ms}$
RK4CC	$8.97\,{\rm ms}\pm 467\,{\rm\mu s}$	$39.2\,{\rm ms}\pm 1.49\,{\rm ms}$	$146\,{\rm ms}\pm 9.60\,{\rm ms}$	$631\,{\rm ms}\pm 26.7\,{\rm ms}$
CE1	$11.6\,{\rm ms}\pm 443\,{\rm\mu s}$	$45.5\,{\rm ms}\pm 365\,{\rm\mu s}$	$193\,{\rm ms}\pm 8.94\,{\rm ms}$	$734\,{\rm ms}\pm 27.4\,{\rm ms}$
CE1CC	$11.3\,{\rm ms}\pm 636\,{\rm\mu s}$	$45.0\,{\rm ms}\pm 2.19\,{\rm ms}$	$181\,{\rm ms}\pm 9.56\,{\rm ms}$	$718\,{\rm ms}\pm 36.0\,{\rm ms}$

Table 1: Time consumption of leapfrog (“LF”), fourth-order Runge–Kutta (“RK4”), and first-order ContEvol (“CE1”) integrators, without and with conservation correction (“CC”), all for the configuration specified in Section 3.4. All timings are obtained using the timeit module of the Python standard library.

Table 1 presents the time consumption of each configuration (integrator, conservation correction, and time step) tested in this work. Since the time step is fixed in each case, the time consumption is roughly inversely proportional to the time step, as expected. As a second-order method, leapfrog integrator is $\sim 3$ times faster than fourth-order Runge–Kutta; for these two methods, conservation correction increases the time consumption by a significant fraction — despite the simplicity of Eq. (3.3.10), it still takes time to perform floating point operations. Without conservation correction, first-order ContEvol costs about one half more time than fourth-order Runge–Kutta; with correction, it becomes slightly faster, since calculating ${\boldsymbol{v}}_{h}$ from Eq. (3.3.10) is simpler than from Eq. (3.2.12). In principle, this trick can be applied to Runge–Kutta as well, but we have not explored this possibility in this work, since it would encounter more overhead and an acceleration is not guaranteed.

Fig. 3.4.2 shows orbits predicted by configurations tested in this work. Those close to the exact solution Eq. (3.4.6), e.g., $h=1/1024$ ellipses, will be further investigated in the next few paragraphs; here we comment on the significantly deviant ones. Without conservation correction, the leapfrog integrator produces a hyperbolic trajectory with $h=1/16$ , and a significantly larger and incomplete ellipse with $h=1/64$ , which only finishes slightly over half a cycle at our terminal time, $t_{\max}=432$ . With conservation correction, the $h=1/16$ leapfrog orbit involves more artifacts, featuring two teardrop-shaped laps of different sizes, and then a segment of what is probably a third one; apparently, the correction permanently alters the history by suddenly changing the sign of $v_{x}$ . However, the $h=1/64$ orbit did become more reasonable. Because of their higher-order precision, fourth-order Runge–Kutta and first-order ContEvol integrators only show substantial deviations when $h=1/16$ . Without conservation correction, the Runge–Kutta orbit “loses” energy and shrinks, while its ContEvol counterpart “gains” energy and leaves the “central object.” With conservation correction, both orbits slightly flatten in the second lap, possibly due to artifacts induced by the correction, although these artifacts are less noticeable than in the case of leapfrog.

Method 1: Leapfrog integrator.

Figs. 3.4.3 and 3.4.4 display deviations from exact solution of predictions by leapfrog (“LF”) integrator without and with conservation correction (“CC”), respectively. Thanks to its symplectic nature, leapfrog (without conservation correction) conserves angular momentum remarkably well (better than both “higher-order” methods tested in this work), regardless of the time step. The mechanic energy is also well-conserved, except at the beginning $M=0$ , where the particle gets an “initial kick,” of which the magnitude seems proportional to the time step; nevertheless, near $M=2\pi$ , none of the leapfrog orbits gets a “second kick,” making leapfrog eligible for studies of long-term (or secular) behaviors of the particle, if the energy discrepancy is acceptable. Without or with conservation correction, shrinking the time step by a factor of $4$ reduces errors in position and velocity by about an order of magnitude. However, since the correction breaks symplecticity and causes artifacts when $r_{y}$ reaches $0$ (most noticeable in the $v_{x}$ panel of Fig. 3.4.4), it only improves leapfrog in the first half of the first lap.
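For readers unfamiliar with the method, a kick-drift-kick leapfrog step for the two-body problem is sketched below, started at the pericenter of the test orbit. The kick-drift-kick ordering, the function names, and the short integration time are assumptions made for this illustration; the variant used for the figures may differ.

```python
import math


def accel(x, y):
    """Two-body acceleration -r / |r|^3 in units with GM = 1."""
    d3 = math.hypot(x, y) ** 3
    return -x / d3, -y / d3


def leapfrog_step(x, y, vx, vy, h):
    """One kick-drift-kick leapfrog step of length h."""
    ax, ay = accel(x, y)
    vx += 0.5 * h * ax          # half kick
    vy += 0.5 * h * ay
    x += h * vx                 # full drift
    y += h * vy
    ax, ay = accel(x, y)
    vx += 0.5 * h * ax          # half kick
    vy += 0.5 * h * ay
    return x, y, vx, vy


def energy(x, y, vx, vy):
    return 0.5 * (vx * vx + vy * vy) - 1.0 / math.hypot(x, y)


def ang_mom(x, y, vx, vy):
    return x * vy - y * vx


a, e, h = 16.0, 63 / 64, 1 / 1024
state = (a * (1 - e), 0.0, 0.0, math.sqrt((1 + e) / (a * (1 - e))))
E0, L0 = energy(*state), ang_mom(*state)
for _ in range(2048):           # integrate to t = 2
    state = leapfrog_step(*state, h)
dE, dL = energy(*state) - E0, ang_mom(*state) - L0
print(dE, dL)
```

In this form each kick changes ${\boldsymbol{v}}$ along ${\boldsymbol{r}}$ and each drift changes ${\boldsymbol{r}}$ along ${\boldsymbol{v}}$ , so $L_{z}$ is preserved up to roundoff, whereas the energy error depends on the time step.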

Method 2: Fourth-order Runge–Kutta.

Figs. 3.4.5 and 3.4.6 display deviations from exact solution of predictions by fourth-order Runge–Kutta (“RK4”) integrator without and with conservation correction (“CC”), respectively. As a higher-order method, Runge–Kutta (without conservation correction) significantly reduces the “initial kick” (in terms of mechanic energy and angular momentum) the particle gets at $M=0$ ; however, the particle does get a “second kick” near $M=2\pi$ , of which the amplitude shrinks with time step for mechanic energy, but is constantly about half an order of magnitude for angular momentum regardless of the time step. Therefore, quality of Runge–Kutta predictions possibly deteriorates after several laps; yet for the first lap, shrinking the time step by $4$ reduces errors by almost three (two and a half) orders of magnitude without (with) conservation correction, which is much better than leapfrog. In the first half of the first lap, with $h=1/1024$ , conservation correction improves Runge–Kutta by nearly three orders of magnitude in terms of $x$ components, and almost an order of magnitude in terms of $y$ components. Because of different scaling relations described above, these improvements are slightly more significant for larger time steps; due to roundoff errors, time steps smaller than $1/1024$ probably do not make much sense. However, a closer look at the $v_{x}$ panel of Fig. 3.4.6 would reveal a slight jump near $M=\pi$ , which is an artifact of the correction.
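For reference, the classic fourth-order Runge–Kutta step applied to the same equations of motion can be sketched as follows. We test it on uniform circular motion of unit radius, whose exact solution is $(\cos t,\sin t)^{\rm T}$ ; the state layout `(x, y, vx, vy)` and the function names are choices made for this illustration.

```python
import math


def deriv(s):
    """Right-hand side of the two-body equations of motion (GM = 1)."""
    x, y, vx, vy = s
    d3 = math.hypot(x, y) ** 3
    return (vx, vy, -x / d3, -y / d3)


def rk4_step(s, h):
    """One classic fourth-order Runge-Kutta step of length h."""
    def shifted(k, f):
        return tuple(si + f * ki for si, ki in zip(s, k))

    k1 = deriv(s)
    k2 = deriv(shifted(k1, 0.5 * h))
    k3 = deriv(shifted(k2, 0.5 * h))
    k4 = deriv(shifted(k3, h))
    return tuple(si + h / 6.0 * (p + 2 * q + 2 * r + w)
                 for si, p, q, r, w in zip(s, k1, k2, k3, k4))


# Uniform circular motion of unit radius, integrated to t = 1.
s, h, n = (1.0, 0.0, 0.0, 1.0), 0.01, 100
for _ in range(n):
    s = rk4_step(s, h)
t = n * h
err = math.hypot(s[0] - math.cos(t), s[1] - math.sin(t))
print(err)
```

Unlike the leapfrog scheme, this update has no structure that protects $E_{\rm M}$ or $L_{z}$ , so errors in both quantities can drift from step to step.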

Method 3: First-order ContEvol.

Figs. 3.4.7 and 3.4.8 display deviations from exact solution of predictions by first-order ContEvol (“CE1”) integrator without and with conservation correction (“CC”), respectively. Without conservation correction, ContEvol does not perform as well as Runge–Kutta for the first lap: the “initial kick” is almost an order of magnitude larger in terms of mechanic energy, and up to three orders of magnitude larger in terms of angular momentum; errors in position and velocity are also about an order of magnitude larger. This is not unexpected, because although ContEvol (as implemented for these tests, see Section 3.2) accurately traces ${\boldsymbol{r}}_{h}$ to $\mathcal{O}(h^{5})$ , it only traces ${\boldsymbol{v}}_{h}$ to $\mathcal{O}(h^{4})$ , and the higher-order terms are just zero; meanwhile, Runge–Kutta accurately traces both ${\boldsymbol{r}}_{h}$ and ${\boldsymbol{v}}_{h}$ to $\mathcal{O}(h^{4})$ , but the $\mathcal{O}(h^{5})$ terms could be partially right, hence it performs better when errors accumulate. Nonetheless, a comparison between $E_{\rm M}$ and $L_{z}$ panels of Figs. 3.4.5 and 3.4.7 tells us that, thanks to its closeness to symplecticity, ContEvol errors in these two quantities are not amplified at all near $M=2\pi$ , thus it could win out after several laps. Such possibility is not explored in this work, but we note that the $\mathcal{O}(h^{5})$ term of the determinant of the first-order ContEvol Jacobian Eq. (3.1.17) vanishes when ${\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0}=0$ , which might not have been affected by our truncation (see Section 3.2). With conservation correction, ContEvol accurately traces ${\boldsymbol{v}}_{h}$ to $\mathcal{O}(h^{5})$ as well, therefore it becomes more accurate than its Runge–Kutta counterpart by up to an order of magnitude, especially with smaller time steps.

To summarize, with different pros and cons, first-order ContEvol is a viable alternative to classic Runge–Kutta or the symplectic leapfrog integrator, especially for some specific situations or after some further developments.

3.5 Three-body, first-order ContEvol (description)

To simplify notation, we follow Eq. (3.1.5) to generalize Eq. (3.2.1) as a series of functionals

\displaystyle\left\{\begin{aligned} {\boldsymbol{f}}_{0}[{\boldsymbol{r}}(t)]&=\frac{{\boldsymbol{r}}_{0}}{r_{0}^{3}}\\ {\boldsymbol{f}}_{1}[{\boldsymbol{r}}(t)]&=\frac{{\boldsymbol{v}}_{0}}{r_{0}^{3}}-\frac{3{\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0}}{r_{0}^{5}}{\boldsymbol{r}}_{0}\\ {\boldsymbol{f}}_{2}[{\boldsymbol{r}}(t)]&=\frac{\boldsymbol{B}}{r_{0}^{3}}-\frac{3{\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0}}{r_{0}^{5}}{\boldsymbol{v}}_{0}-\frac{3}{2}\left(\frac{2{\boldsymbol{B}}\cdot{\boldsymbol{r}}_{0}+v_{0}^{2}}{r_{0}^{5}}-\frac{5({\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0})^{2}}{r_{0}^{7}}\right){\boldsymbol{r}}_{0}\\ {\boldsymbol{f}}_{3}[{\boldsymbol{r}}(t)]&=\left[\begin{aligned} &\frac{\boldsymbol{A}}{r_{0}^{3}}-\frac{3{\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0}}{r_{0}^{5}}\boldsymbol{B}-\frac{3}{2}\left(\frac{2{\boldsymbol{B}}\cdot{\boldsymbol{r}}_{0}+v_{0}^{2}}{r_{0}^{5}}-\frac{5({\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0})^{2}}{r_{0}^{7}}\right){\boldsymbol{v}}_{0}\\ &-\left(\frac{3({\boldsymbol{A}}\cdot{\boldsymbol{r}}_{0}+{\boldsymbol{B}}\cdot{\boldsymbol{v}}_{0})}{r_{0}^{5}}-\frac{15(2{\boldsymbol{B}}\cdot{\boldsymbol{r}}_{0}+v_{0}^{2})({\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0})}{2r_{0}^{7}}+\frac{35({\boldsymbol{r}}_{0}\cdot{\boldsymbol{v}}_{0})^{3}}{2r_{0}^{9}}\right){\boldsymbol{r}}_{0}\end{aligned}\right]\end{aligned}\right.

(3.5.1)

for any ${\boldsymbol{r}}(t)$ given or approximated by Eqs. (3.1.1) and (3.1.2), so that the (reduced) equations of motion for the three-body problem Eq. (3.0.4) can be written as

\displaystyle\left\{\begin{aligned} \ddot{\boldsymbol{r}}_{1}&=-(1-\mu_{2})\left(\sum_{i=0}^{3}{\boldsymbol{f}}_{i}[{\boldsymbol{r}}_{1}]t^{i}\right)-\mu_{2}\left[\left(\sum_{i=0}^{3}{\boldsymbol{f}}_{i}[{\boldsymbol{r}}_{2}]t^{i}\right)+\left(\sum_{i=0}^{3}{\boldsymbol{f}}_{i}[{\boldsymbol{r}}_{1}-{\boldsymbol{r}}_{2}]t^{i}\right)\right]\\ \ddot{\boldsymbol{r}}_{2}&=-(1-\mu_{1})\left(\sum_{i=0}^{3}{\boldsymbol{f}}_{i}[{\boldsymbol{r}}_{2}]t^{i}\right)-\mu_{1}\left[\left(\sum_{i=0}^{3}{\boldsymbol{f}}_{i}[{\boldsymbol{r}}_{1}]t^{i}\right)+\left(\sum_{i=0}^{3}{\boldsymbol{f}}_{i}[{\boldsymbol{r}}_{2}-{\boldsymbol{r}}_{1}]t^{i}\right)\right]\end{aligned}\right.,

(3.5.2)

and the cost function can be defined as (subscript “CE3” stands for ContEvol and three-body problem)

\displaystyle\epsilon_{\rm CE3}(\{{\boldsymbol{A}}_{i}\},\{{\boldsymbol{B}}_{i}\};h)=\int_{0}^{h}\left[\begin{aligned} &\left\|(2{\boldsymbol{B}}_{1}+6{\boldsymbol{A}}_{1}t)+(1-\mu_{2})\left(\sum_{i=0}^{3}{\boldsymbol{f}}_{i}[{\boldsymbol{r}}_{1}]t^{i}\right)\right.\\ &+\left.\mu_{2}\left[\left(\sum_{i=0}^{3}{\boldsymbol{f}}_{i}[{\boldsymbol{r}}_{2}]t^{i}\right)+\left(\sum_{i=0}^{3}{\boldsymbol{f}}_{i}[{\boldsymbol{r}}_{1}-{\boldsymbol{r}}_{2}]t^{i}\right)\right]\right\|^{2}\\ &+\left\|(2{\boldsymbol{B}}_{2}+6{\boldsymbol{A}}_{2}t)+(1-\mu_{1})\left(\sum_{i=0}^{3}{\boldsymbol{f}}_{i}[{\boldsymbol{r}}_{2}]t^{i}\right)\right.\\ &+\left.\mu_{1}\left[\left(\sum_{i=0}^{3}{\boldsymbol{f}}_{i}[{\boldsymbol{r}}_{1}]t^{i}\right)+\left(\sum_{i=0}^{3}{\boldsymbol{f}}_{i}[{\boldsymbol{r}}_{2}-{\boldsymbol{r}}_{1}]t^{i}\right)\right]\right\|^{2}\end{aligned}\right]\,{\rm d}t.

(3.5.3)

We refrain from proceeding with a symbolic analysis of the above cost function in this work, as orders of the discrepancy between determinant of Jacobian and $1$ (which mirrors non-symplecticity), the minimized cost function, and the errors in results at $t=h$ are not expected to be different from those in Sections 3.1 and 3.3.

From a perspective of numerical implementation, we can “flatten” the combination of $1$ and all the coefficients to be determined as

\displaystyle{\boldsymbol{x}}=(x_{0},x_{1},x_{2},\ldots,x_{12})^{\rm T}\equiv(1,A_{1x},A_{1y},A_{1z},B_{1x},B_{1y},B_{1z},A_{2x},A_{2y},A_{2z},B_{2x},B_{2y},B_{2z})^{\rm T},

(3.5.4)

so that the cost function can be succinctly expressed as

\displaystyle\epsilon_{\rm CE3}({\boldsymbol{x}};h)=\int_{0}^{h}w_{\alpha}\|{\mathcal{E}}_{\alpha ijk}t^{i}{\boldsymbol{e}_{j}}x_{k}\|^{2}\,{\rm d}t

(3.5.5)

with weights $w_{\alpha}$ (see below for discussion) and the fourth-order tensor ${\mathcal{E}}_{\alpha ijk}$ , wherein $\alpha=1,2$ is the index of equation, $i=0,1,2,3$ is the index of order in $t$ (cf. the sums over ${\boldsymbol{f}}_{i}t^{i}$ in Eq. (3.5.3)), $j=x,y,z$ is the index of direction, and $k=0,1,2,\ldots,12$ is the index of location in the ${\boldsymbol{x}}$ vector; note that Einstein summation is assumed for all four indices, including $i$ in $t^{i}$ . All its $2\times 4\times 3\times 13=312$ elements can be numerically evaluated using initial conditions and information about the dynamic system, e.g., Eq. (3.5.1); many intermediate quantities can be shared between elements.

Then to minimize the cost function, we have

\displaystyle\frac{\partial\epsilon_{\rm CE3}}{\partial x_{k}}=\frac{\partial^{2}\epsilon_{\rm CE3}}{\partial x_{k^{\prime}}\partial x_{k}}x_{k^{\prime}}\equiv{\boldsymbol{M}}_{k}\cdot{\boldsymbol{x}}=0,\quad k=1,2,\ldots,12,

(3.5.6)

where the vectors ${\boldsymbol{M}}_{k}$ can be derived from the tensor ${\mathcal{E}}_{\alpha ijk}$ , and all their elements are guaranteed to be constants; put in a matrix form, this system of equations is

\displaystyle\begin{pmatrix}M_{11}&M_{12}&\cdots&M_{1,12}\\ M_{21}&M_{22}&\cdots&M_{2,12}\\ \vdots&\vdots&\ddots&\vdots\\ M_{12,1}&M_{12,2}&\cdots&M_{12,12}\end{pmatrix}\begin{pmatrix}x_{1}\\ x_{2}\\ \vdots\\ x_{12}\end{pmatrix}=\begin{pmatrix}b_{1}\\ b_{2}\\ \vdots\\ b_{12}\end{pmatrix},

(3.5.7)

where $b_{k}=-M_{k0}$ . Intuitively, the Hessian matrix $M$ should be positive semidefinite, since the cost function $\epsilon_{\rm CE3}$ is by definition non-negative; yet because of the difference between affine and linear transformations, such intuition requires further justification. If it is indeed positive semidefinite, then efficient linear algebra solvers, e.g., Cholesky decomposition, can be used to solve the above linear system; if it is not, more general solvers must be used. Either way, this produces optimal coefficients $\{{\boldsymbol{A}}_{i}\}$ and $\{{\boldsymbol{B}}_{i}\}$ , which tell us the position and velocity of each particle at $t=h$ . As advertised in Section 1, ContEvol methods are implicit but only need to solve linear equations.
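To make the structure of Eqs. (3.5.6) and (3.5.7) concrete, the sketch below works through a deliberately small toy problem: a scalar residual $(c_{0}+c_{1}t+c_{2}t^{2})+x_{1}+x_{2}t$ whose squared integral over $[0,h]$ is minimized with respect to $x_{1}$ and $x_{2}$ . The toy residual, the helper names, and the pure-Python Cholesky routine are all assumptions made for illustration; a real implementation would build the full tensor from Eq. (3.5.1) and would likely call an optimized linear algebra library.

```python
import math


def hessian(E, h):
    """M[k][k'] = 2 sum_{i,i'} E[i][k] E[i'][k'] h^(i+i'+1) / (i+i'+1).

    E[i][k] is the coefficient of t^i x_k in the residual, with x_0 = 1.
    """
    n_ord, n_x = len(E), len(E[0])
    return [[2.0 * sum(E[i][k] * E[j][kp] * h ** (i + j + 1) / (i + j + 1)
                       for i in range(n_ord) for j in range(n_ord))
             for kp in range(n_x)] for k in range(n_x)]


def cholesky_solve(M, b):
    """Solve M x = b for a symmetric positive definite M via M = L L^T."""
    n = len(b)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = M[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][i] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    y = [0.0] * n
    for i in range(n):                    # forward substitution, L y = b
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):          # back substitution, L^T x = y
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


# Toy residual: column 0 holds the known part (multiplied by x_0 = 1),
# columns 1 and 2 hold the unknowns x_1 (order t^0) and x_2 (order t^1).
c, h = (1.0, 2.0, 3.0), 0.5
E = [[c[0], 1.0, 0.0],
     [c[1], 0.0, 1.0],
     [c[2], 0.0, 0.0]]
M = hessian(E, h)
sub = [row[1:] for row in M[1:]]               # M_{kk'} with k, k' >= 1
b = [-M[k][0] for k in range(1, len(M))]       # b_k = -M_{k0}
x1, x2 = cholesky_solve(sub, b)
print(x1, x2)
```

For this toy case the minimizer can be found by hand, $x_{1}=-c_{0}+c_{2}h^{2}/6$ and $x_{2}=-c_{1}-c_{2}h$ , which the script should reproduce. Note that the routine above requires strict positive definiteness; a merely semidefinite (singular) matrix raises an error and would need a more general solver.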

Here we conclude Section 3 with several remarks.

•

First, the framework described above can be naturally extended to more particles and more interactions. Eq. (3.5.1) is general for many-body problem in celestial mechanics, and should facilitate programming for both symbolic derivation and numerical implementation. The functionals ${\boldsymbol{f}}_{i}$ , $i=0,1,2,3$ are also applicable to some electromagnetic problems, since Coulomb’s law has the same form as Newton’s law of universal gravitation.
•

Second, whenever we have multiple equations (e.g., $2$ in the case of three-body problem), it is possible and sometimes natural to assign different weights to them while defining the cost function. Eq. (3.5.3) does not do so because the two EOMs are symmetric, and thanks to $\mu_{1}$ and $\mu_{2}$ , more weight is automatically assigned to more massive objects. When different equations describe different quantities, one is advised to rescale the equations and use the dimensionless version to define the cost function, and assign $\mathcal{O}(1)$ weights to them if necessary.
•

Third, in principle, one can combine Sections 2.3 and 3.5 to study celestial mechanics with second- (or even higher-) order ContEvol method. Since the cost function, which describes the discrepancy between approximated and “true” histories of the dynamic system, gets much better with higher order, results like Poincaré sections based on post hoc analysis (instead of combining tiny time steps and backwards evolution with traditional methods) should be more accurate than those based on lower-order ContEvol.

4 Quantum mechanics: stationary Schrödinger equation

Now we switch topic from initial value problems (IVPs) to boundary value problems (BVPs). Again as physicists, we choose the two simplest cases from quantum mechanics, the infinite potential well and the (quantum) harmonic oscillator, and then a more realistic case, the Coulomb potential.

In one dimension, the stationary Schrödinger equation is

\displaystyle H^{\prime}\psi=-\frac{\hbar^{2}}{2m}\ddot{\psi}+V^{\prime}\psi=E^{\prime}\psi,

(4.0.1)

where $H^{\prime}$ is the Hamiltonian (an operator), $\hbar$ is the reduced Planck constant, $m$ is the mass of the particle, $V^{\prime}$ is the potential energy (a function), and $E^{\prime}$ is the energy of the particle (a scalar); setting $\hbar^{2}/2m$ to $1$ , this becomes

\displaystyle H\psi=-\ddot{\psi}+V\psi=E\psi.

(4.0.2)

In this work, we require the wavefunction $\psi$ to be a real function.

To solve this eigenvalue problem, the general strategy of ContEvol is:

1.

Represent the wavefunction $\psi$ as two series, $\{\psi_{i}\equiv\psi(x_{i})\}$ and $\{\dot{\psi}_{i}\equiv\dot{\psi}(x_{i})\}$ , where $\{x_{i}\}$ is a finite sampling of the real axis.
2.

Find the optimal approximation $\phi\equiv H\psi$ , represented as $\{\phi_{i}\equiv\phi(x_{i})\}$ and $\{\dot{\phi}_{i}\equiv\dot{\phi}(x_{i})\}$ , by minimizing a cost function. We treat the wavefunction $\psi$ as “known” for this purpose.
3.

Formulate the Hamiltonian $H$ as a linear transformation, and solve for the eigenvalues and eigenvectors of the matrix.
4.

Normalize, orthogonalize (not implemented in this work), and “render” the eigenvectors as continuous wavefunctions.

To set a benchmark, we start by solving the infinite potential well using simple discretization in Section 4.1, before addressing the same problem with first-order ContEvol method in Section 4.2. Then in Section 4.3, we describe how ContEvol is supposed to be applied to a slightly trickier problem, quantum harmonic oscillator. In Section 4.4, we try to solve a more realistic problem, one-dimensional Coulomb potential.

4.1 Infinite potential well, simple discretization

In this section and the next, we study the infinite potential well

\displaystyle V(x)=\left\{\begin{aligned} &0&&0\leq x\leq 1\\ &+\infty&&{\rm otherwise}\end{aligned}\right.,

(4.1.1)

for which the exact solution is

\displaystyle\psi^{(n)}(x)=\left\{\begin{aligned} &\sqrt{2}\sin(n\pi x)&&0\leq x\leq 1\\ &0&&{\rm otherwise}\end{aligned}\right.\quad{\rm and}\quad E_{n}=(n\pi)^{2},\quad n\in{\mathbb{N}}^{+}.

(4.1.2)

We divide the interval $[0,1]$ into $N+1$ equal parts with $N+2$ nodes

\displaystyle x_{i}=\frac{i}{N+1},\quad i=0,1,\ldots,N+1.

(4.1.3)

With $\{\psi_{i}\}$ and linear spline interpolation, the wavefunction is sampled as

\displaystyle\psi(x)=\left\{\begin{aligned} &\psi_{i}+\frac{\psi_{i+1}-\psi_{i}}{h}(x-x_{i})&&x_{i}\leq x\leq x_{i+1}\\ &0&&x<0\ {\rm or}\ x>1\end{aligned}\right.,

(4.1.4)

where $h\equiv 1/(N+1)$ is now the length of each sub-interval. Boundary conditions at $x_{0}=0$ and $x_{N+1}=1$ indicate that $\psi_{0}=\psi_{N+1}=0$ .

At each sampling node, the second-order derivative $\ddot{\psi}$ is approximated as

\displaystyle\ddot{\psi}_{i}\approx\frac{\dot{\psi}_{i+1/2}-\dot{\psi}_{i-1/2}}{h}\approx\frac{1}{h}\left(\frac{\psi_{i+1}-\psi_{i}}{h}-\frac{\psi_{i}-\psi_{i-1}}{h}\right)=\frac{\psi_{i+1}-2\psi_{i}+\psi_{i-1}}{h^{2}},

(4.1.5)

and thus the $N\times N$ (for $i=1,2,\ldots,N$ ) Hamiltonian $H$ is simply

\displaystyle H=h^{-2}\begin{pmatrix}2&-1&0&\cdots&0&0\\ -1&2&-1&\ddots&0&0\\ 0&-1&2&\ddots&0&0\\ \vdots&\ddots&\ddots&\ddots&\ddots&\vdots\\ 0&0&0&\ddots&2&-1\\ 0&0&0&\cdots&-1&2\end{pmatrix},

(4.1.6)

where the minus sign comes from Eq. (4.0.2). This Hamiltonian matrix is Hermitian, as it should be.
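For concreteness, the matrix of Eq. (4.1.6) can be built and diagonalized in a few lines. The following is a minimal sketch that assumes NumPy; the function name `simple_discretization` is ours and purely illustrative.

```python
import numpy as np

def simple_discretization(N):
    """Eigenpairs of the N x N matrix of Eq. (4.1.6) on N interior nodes."""
    h = 1.0 / (N + 1)
    H = (2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)) / h**2
    E, vecs = np.linalg.eigh(H)   # H is symmetric; eigenvalues come back ascending
    return E, vecs

E, vecs = simple_discretization(64)
n = np.arange(1, 65)
ratio = E / (n * np.pi) ** 2      # numerical over exact eigenvalue, Eq. (4.1.2)
print(ratio[:4])
```

The columns of `vecs` are normalized as ordinary vectors only; rendering them as wavefunctions still requires the interpolation of Eq. (4.1.4).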

Before moving on to examples, we note that the eigenvectors need to be “renormalized” (even if they have already been normalized as usual vectors) as

	$\displaystyle 1$	$\displaystyle=\int_{0}^{1}[{\mathcal{N}}\psi(x)]^{2}\,{\rm d}x={\mathcal{N}}^{2}\sum_{i=0}^{N}\int_{x_{i}}^{x_{i+1}}\left[\psi_{i}+\frac{\psi_{i+1}-\psi_{i}}{h}(x-x_{i})\right]^{2}\,{\rm d}x$
		$\displaystyle={\mathcal{N}}^{2}\sum_{i=0}^{N}\int_{0}^{h}\left(\psi_{i}+\frac{\psi_{i+1}-\psi_{i}}{h}x\right)^{2}\,{\rm d}x={\mathcal{N}}^{2}\sum_{i=0}^{N}\frac{h}{3}(\psi_{i}^{2}+\psi_{i}\psi_{i+1}+\psi_{i+1}^{2}),$		(4.1.7)

where ${\mathcal{N}}$ is the normalization factor; similarly, in principle, they may need to be “reorthogonalized” according to the “inner product” defined as follows

$\displaystyle\langle\psi^{(k)}\|\psi^{(l)}\rangle$	$\displaystyle=\langle\{\psi_{i}^{(k)}\}\|\{\psi_{i}^{(l)}\}\rangle=\sum_{i=0}^{N}\int_{x_{i}}^{x_{i+1}}\left[\left(\psi_{i}^{(k)}+\frac{\psi_{i+1}^{(k)}-\psi_{i}^{(k)}}{h}(x-x_{i})\right)\cdot\left(\psi_{i}^{(l)}+\frac{\psi_{i+1}^{(l)}-\psi_{i}^{(l)}}{h}(x-x_{i})\right)\right]\,{\rm d}x$
	$\displaystyle=\sum_{i=0}^{N}\int_{0}^{h}\left[\left(\psi_{i}^{(k)}+\frac{\psi_{i+1}^{(k)}-\psi_{i}^{(k)}}{h}x\right)\cdot\left(\psi_{i}^{(l)}+\frac{\psi_{i+1}^{(l)}-\psi_{i}^{(l)}}{h}x\right)\right]\,{\rm d}x$
	$\displaystyle=\sum_{i=0}^{N}\frac{h}{6}[\psi_{i}^{(k)}(2\psi_{i}^{(l)}+\psi_{i+1}^{(l)})+\psi_{i+1}^{(k)}(\psi_{i}^{(l)}+2\psi_{i+1}^{(l)})],$	(4.1.8)

where we have not written the complex conjugate symbol “ $*$ ” as our wavefunctions are real. Yet intuitively, the eigenvectors should be orthogonal to each other, as they correspond to different eigenvalues of a Hermitian operator. Since this work is principally for illustration purposes, we simply present the normalized wavefunctions, and leave investigation of orthogonality for future work.
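Eqs. (4.1.7) and (4.1.8) translate directly into code. The sketch below assumes NumPy and node-value arrays that include both end points ( $\psi_{0}$ and $\psi_{N+1}$ ); the function names are hypothetical.

```python
import numpy as np

def linear_spline_inner(u, v, h):
    """Inner product of Eq. (4.1.8) for node values u, v (end points included)."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return h / 6.0 * np.sum(u[:-1] * (2.0 * v[:-1] + v[1:])
                            + u[1:] * (v[:-1] + 2.0 * v[1:]))

def renormalize(u, h):
    """Rescale u so that the rendered wavefunction has unit norm, Eq. (4.1.7)."""
    u = np.asarray(u, dtype=float)
    return u / np.sqrt(linear_spline_inner(u, u, h))

# N = 2 example: interior eigenvectors (1, 1) and (1, -1), padded with the
# boundary values psi_0 = psi_3 = 0.
h = 1.0 / 3.0
psi_a = renormalize([0.0, 1.0, 1.0, 0.0], h)
psi_b = renormalize([0.0, 1.0, -1.0, 0.0], h)
print(linear_spline_inner(psi_a, psi_b, h))
```

Setting `v = u` in `linear_spline_inner` reproduces the sum in Eq. (4.1.7), so a single routine serves both purposes.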

Fig. 4.1.1 compares simple discretization with $N=2$ and exact solution Eq. (4.1.2) for $n=1$ and $n=2$ . Note that these two wavefunctions are automatically orthogonal to each other. Fig. 4.1.2 shows two $H$ matrices ( $N=8$ and $N=16$ ) and normalized but not necessarily orthogonal eigenvectors produced by $N=8$ , $N=16$ , $N=32$ , and $N=64$ versions of simple discretization; the other two $H$ matrices ( $N=32$ and $N=64$ ) are omitted as the tridiagonal structure is the same. With increasing $n$ (note that $\psi^{(n)}$ has $n-1$ zero points between the two end points), the eigenvectors become less and less smooth.

Figs. 4.1.3 and 4.1.4 display errors in eigenvalues and rendered eigenvectors of $N=8$ , $N=16$ , $N=32$ , and $N=64$ Hamiltonians, respectively. Although an $N\times N$ Hermitian matrix has $N$ eigenpairs, $E_{n}$ and $\psi^{(n)}$ with $n\geq 17$ are not shown in these figures. At small $n$ , the approximated wavefunctions are reasonably smooth; however, as $n$ approaches $N/2$ , the broken features become much more noticeable. It should be noted that all the eigenvalues produced by simple discretization are smaller than their exact counterparts, unlike those yielded by the first-order ContEvol method, as we will show in the next section.

4.2 Infinite potential well, first-order ContEvol

Now we present the ContEvol treatment of the same problem. We divide the interval $[0,1]$ into $N$ equal parts with $N+1$ nodes

\displaystyle x_{i}=\frac{i}{N},\quad i=0,1,\ldots,N;

(4.2.1)

investigating if an unequal partition leads to better results is left for future work. With $\{\psi_{i}\}$ and $\{\dot{\psi}_{i}\}$ , the wavefunction is sampled as

\displaystyle\psi(x)=\left\{\begin{aligned} &\psi_{i}+\dot{\psi}_{i}(x-x_{i})+B_{\psi i}(x-x_{i})^{2}+A_{\psi i}(x-x_{i})^{3}&&x_{i}\leq x\leq x_{i+1}\\ &0&&x<0\ {\rm or}\ x>1\end{aligned}\right.

(4.2.2)

with

\displaystyle\left\{\begin{aligned} A_{\psi i}&=2(\psi_{i}-\psi_{i+1})h^{-3}+(\dot{\psi}_{i}+\dot{\psi}_{i+1})h^{-2}\\ B_{\psi i}&=3(\psi_{i+1}-\psi_{i})h^{-2}-(2\dot{\psi}_{i}+\dot{\psi}_{i+1})h^{-1}\end{aligned}\right.,

(4.2.3)

where $h\equiv 1/N$ is the length of each sub-interval. Boundary conditions at $x_{0}=0$ and $x_{N}=1$ indicate that $\psi_{0}=\psi_{N}=0$ . The desired approximation $\phi\equiv H\psi$ is represented in the same way.
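As a quick check of Eq. (4.2.3), the following sketch (plain Python, hypothetical function names) renders one cubic piece of Eq. (4.2.2) and its first derivative; evaluating at the offset $x-x_{i}=h$ should return $\psi_{i+1}$ and $\dot{\psi}_{i+1}$ .

```python
def hermite_coeffs(psi0, dpsi0, psi1, dpsi1, h):
    """Cubic and quadratic coefficients A, B of Eq. (4.2.3)."""
    A = 2.0 * (psi0 - psi1) / h**3 + (dpsi0 + dpsi1) / h**2
    B = 3.0 * (psi1 - psi0) / h**2 - (2.0 * dpsi0 + dpsi1) / h
    return A, B

def render_piece(psi0, dpsi0, psi1, dpsi1, h, s):
    """Value and first derivative at offset s = x - x_i, with 0 <= s <= h."""
    A, B = hermite_coeffs(psi0, dpsi0, psi1, dpsi1, h)
    value = psi0 + dpsi0 * s + B * s**2 + A * s**3
    slope = dpsi0 + 2.0 * B * s + 3.0 * A * s**2
    return value, slope

# Made-up node data: the right end point reproduces (psi1, dpsi1).
print(render_piece(0.3, -0.7, 1.1, 0.4, 0.25, 0.25))
```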

We are supposed to have $\phi\approx-\ddot{\psi}$ . Note that $\psi(x)$ and $\phi(x)$ are both piecewise cubic functions with continuous first derivatives, while $\ddot{\psi}$ is a piecewise linear function which is not necessarily continuous at sampling nodes. The cost function is defined as (subscript “IPW” stands for infinite potential well)

\displaystyle\epsilon_{\rm IPW}(\{\psi_{i}\},\{\dot{\psi}_{i}\};\{\phi_{i}\},\{\dot{\phi}_{i}\};\{x_{i}\})

\displaystyle=\sum_{i=0}^{N-1}\epsilon_{{\rm IPW},i}(\psi_{i},\dot{\psi}_{i},\psi_{i+1},\dot{\psi}_{i+1};\phi_{i},\dot{\phi}_{i},\phi_{i+1},\dot{\phi}_{i+1};x_{i},x_{i+1});

(4.2.4)

for simplicity, in the following text we omit parameters of $\epsilon_{{\rm IPW},i}$ , which is

$\displaystyle\epsilon_{{\rm IPW},i}$	$\displaystyle=\int_{x_{i}}^{x_{i+1}}(\ddot{\psi}+\phi)^{2}\,{\rm d}x=\int_{x_{i}}^{x_{i+1}}[(2B_{\psi i}+\phi_{i})+(6A_{\psi i}+\dot{\phi}_{i})(x-x_{i})+B_{\phi i}(x-x_{i})^{2}+A_{\phi i}(x-x_{i})^{3}]^{2}\,{\rm d}x$
	$\displaystyle=\int_{0}^{h}[(2B_{\psi i}+\phi_{i})+(6A_{\psi i}+\dot{\phi}_{i})x+B_{\phi i}x^{2}+A_{\phi i}x^{3}]^{2}\,{\rm d}x$
	$\displaystyle=\int_{0}^{h}\left[\begin{aligned} &(4B_{\psi i}^{2}+4B_{\psi i}\phi_{i}+\phi_{i}^{2})+(24A_{\psi i}B_{\psi i}+12A_{\psi i}\phi_{i}+4B_{\psi i}\dot{\phi}_{i}+2\phi_{i}\dot{\phi}_{i})x\\ &+(36A_{\psi i}^{2}+12A_{\psi i}\dot{\phi}_{i}+4B_{\phi i}B_{\psi i}+2B_{\phi i}\phi_{i}+\dot{\phi}_{i}^{2})x^{2}\\ &+(12A_{\psi i}B_{\phi i}+4A_{\phi i}B_{\psi i}+2A_{\phi i}\phi_{i}+2B_{\phi i}\dot{\phi}_{i})x^{3}\\ &+(12A_{\phi i}A_{\psi i}+2A_{\phi i}\dot{\phi}_{i}+B_{\phi i}^{2})x^{4}+2A_{\phi i}B_{\phi i}x^{5}+A_{\phi i}^{2}x^{6}\end{aligned}\right]\,{\rm d}x$
	$\displaystyle=\left[\begin{aligned} &(4B_{\psi i}^{2}+4B_{\psi i}\phi_{i}+\phi_{i}^{2})h+(12A_{\psi i}B_{\psi i}+6A_{\psi i}\phi_{i}+2B_{\psi i}\dot{\phi}_{i}+\phi_{i}\dot{\phi}_{i})h^{2}\\ &+\frac{1}{3}(36A_{\psi i}^{2}+12A_{\psi i}\dot{\phi}_{i}+4B_{\phi i}B_{\psi i}+2B_{\phi i}\phi_{i}+\dot{\phi}_{i}^{2})h^{3}\\ &+\frac{1}{2}(6A_{\psi i}B_{\phi i}+2A_{\phi i}B_{\psi i}+A_{\phi i}\phi_{i}+B_{\phi i}\dot{\phi}_{i})h^{4}\\ &+\frac{1}{5}(12A_{\phi i}A_{\psi i}+2A_{\phi i}\dot{\phi}_{i}+B_{\phi i}^{2})h^{5}+\frac{1}{3}A_{\phi i}B_{\phi i}h^{6}+\frac{1}{7}A_{\phi i}^{2}h^{7}\end{aligned}\right],$	(4.2.5)

for $i=0,1,\ldots,N-1$ ; plugging in expressions of $A_{\psi i}$ , $B_{\psi i}$ , $A_{\phi i}$ , and $B_{\phi i}$ , this becomes

\displaystyle\epsilon_{{\rm IPW},i}=\left[\begin{aligned} &12(\psi_{i}-\psi_{i+1})^{2}h^{-3}+12(\dot{\psi}_{i}+\dot{\psi}_{i+1})(\psi_{i}-\psi_{i+1})h^{-2}\\ &+\left\{4(\dot{\psi}_{i}^{2}+\dot{\psi}_{i}\dot{\psi}_{i+1}+\dot{\psi}_{i+1}^{2})-\frac{12}{5}(\psi_{i}-\psi_{i+1})(\phi_{i}-\phi_{i+1})\right\}h^{-1}\\ &-\left\{\frac{12}{5}(\dot{\psi}_{i}\phi_{i}-\dot{\psi}_{i+1}\phi_{i+1})-\frac{1}{5}(\dot{\psi}_{i}-\dot{\psi}_{i+1})(\phi_{i}+\phi_{i+1})+\frac{1}{5}(\psi_{i}-\psi_{i+1})(\dot{\phi}_{i}+\dot{\phi}_{i+1})\right\}\\ &+\left\{\frac{1}{105}(39\phi_{i}^{2}+27\phi_{i}\phi_{i+1}+39\phi_{i+1}^{2})-\frac{1}{15}(4\dot{\psi}_{i}\dot{\phi}_{i}-\dot{\psi}_{i+1}\dot{\phi}_{i}-\dot{\psi}_{i}\dot{\phi}_{i+1}+4\dot{\psi}_{i+1}\dot{\phi}_{i+1})\right\}h\\ &+\frac{1}{210}(22\phi_{i}\dot{\phi}_{i}-13\phi_{i}\dot{\phi}_{i+1}+13\dot{\phi}_{i}\phi_{i+1}-22\dot{\phi}_{i+1}\phi_{i+1})h^{2}+\frac{1}{210}(2\dot{\phi}_{i}^{2}-3\dot{\phi}_{i}\dot{\phi}_{i+1}+2\dot{\phi}_{i+1}^{2})h^{3}\end{aligned}\right];

(4.2.6)

for convenience, we define $\epsilon_{{\rm IPW},-1}=\epsilon_{{\rm IPW},N}=0$ .

Partial derivatives of $\epsilon_{{\rm IPW},i}$ with respect to $\phi_{i}$ , $\phi_{i+1}$ , $\dot{\phi}_{i}$ , and $\dot{\phi}_{i+1}$ are

\displaystyle\left\{\begin{aligned} \frac{\partial\epsilon_{{\rm IPW},i}}{\partial\phi_{i}}&=-\frac{12(\psi_{i}-\psi_{i+1})}{5}h^{-1}-\frac{11\dot{\psi}_{i}+\dot{\psi}_{i+1}}{5}+\frac{26\phi_{i}+9\phi_{i+1}}{35}h+\frac{22\dot{\phi}_{i}-13\dot{\phi}_{i+1}}{210}h^{2}\\ \frac{\partial\epsilon_{{\rm IPW},i}}{\partial\phi_{i+1}}&=\frac{12(\psi_{i}-\psi_{i+1})}{5}h^{-1}+\frac{\dot{\psi}_{i}+11\dot{\psi}_{i+1}}{5}+\frac{9\phi_{i}+26\phi_{i+1}}{35}h+\frac{13\dot{\phi}_{i}-22\dot{\phi}_{i+1}}{210}h^{2}\\ \frac{\partial\epsilon_{{\rm IPW},i}}{\partial\dot{\phi}_{i}}&=-\frac{\psi_{i}-\psi_{i+1}}{5}-\frac{4\dot{\psi}_{i}-\dot{\psi}_{i+1}}{15}h+\frac{22\phi_{i}+13\phi_{i+1}}{210}h^{2}+\frac{4\dot{\phi}_{i}-3\dot{\phi}_{i+1}}{210}h^{3}\\ \frac{\partial\epsilon_{{\rm IPW},i}}{\partial\dot{\phi}_{i+1}}&=-\frac{\psi_{i}-\psi_{i+1}}{5}+\frac{\dot{\psi}_{i}-4\dot{\psi}_{i+1}}{15}h-\frac{13\phi_{i}+22\phi_{i+1}}{210}h^{2}-\frac{3\dot{\phi}_{i}-4\dot{\phi}_{i+1}}{210}h^{3}\end{aligned}\right.,

(4.2.7)

respectively; note that one should not set these to zero, as a node is coupled with two adjacent intervals, unless it is $x_{0}$ or $x_{N}$ . Put in matrix form, these are

	$\displaystyle\begin{pmatrix}\partial/\partial\phi_{i}\\ \partial/\partial\phi_{i+1}\\ \partial/\partial\dot{\phi}_{i}\\ \partial/\partial\dot{\phi}_{i+1}\end{pmatrix}\epsilon_{{\rm IPW},i}$	$\displaystyle=\left[\begin{aligned} &\begin{pmatrix}26h/35&9h/35&11h^{2}/105&-13h^{2}/210\\ 9h/35&26h/35&13h^{2}/210&-11h^{2}/105\\ 11h^{2}/105&13h^{2}/210&2h^{3}/105&-h^{3}/70\\ -13h^{2}/210&-11h^{2}/105&-h^{3}/70&2h^{3}/105\end{pmatrix}\begin{pmatrix}\phi_{i}\\ \phi_{i+1}\\ \dot{\phi}_{i}\\ \dot{\phi}_{i+1}\end{pmatrix}\\ &+\begin{pmatrix}-12h^{-1}/5&12h^{-1}/5&-11/5&-1/5\\ 12h^{-1}/5&-12h^{-1}/5&1/5&11/5\\ -1/5&1/5&-4h/15&h/15\\ -1/5&1/5&h/15&-4h/15\end{pmatrix}\begin{pmatrix}\psi_{i}\\ \psi_{i+1}\\ \dot{\psi}_{i}\\ \dot{\psi}_{i+1}\end{pmatrix}\end{aligned}\right]$
		$\displaystyle\equiv P^{(i)}\begin{pmatrix}\phi_{i}\\ \phi_{i+1}\\ \dot{\phi}_{i}\\ \dot{\phi}_{i+1}\end{pmatrix}+Q^{(i)}\begin{pmatrix}\psi_{i}\\ \psi_{i+1}\\ \dot{\psi}_{i}\\ \dot{\psi}_{i+1}\end{pmatrix};$		(4.2.8)

again for convenience, we define $P^{(-1)}=Q^{(-1)}=P^{(N)}=Q^{(N)}=\boldsymbol{0}$ , the $4\times 4$ zero matrix, in line with $\epsilon_{{\rm IPW},-1}=\epsilon_{{\rm IPW},N}=0$ .

To minimize the cost function Eq. (4.2.4), we have

\displaystyle\begin{pmatrix}\partial/\partial\phi_{0}\\ \vdots\\ \partial/\partial\phi_{N}\\ \partial/\partial\dot{\phi}_{0}\\ \vdots\\ \partial/\partial\dot{\phi}_{N}\end{pmatrix}\epsilon_{\rm IPW}=\left[\begin{aligned} &\begin{pmatrix}P_{00}&\cdots&P_{0N}&P_{0,N+1}&\cdots&P_{0,2N+1}\\ \vdots&\ddots&\vdots&\vdots&\ddots&\vdots\\ P_{N0}&\cdots&P_{NN}&P_{N,N+1}&\cdots&P_{N,2N+1}\\ P_{N+1,0}&\cdots&P_{N+1,N}&P_{N+1,N+1}&\cdots&P_{N+1,2N+1}\\ \vdots&\ddots&\vdots&\vdots&\ddots&\vdots\\ P_{2N+1,0}&\cdots&P_{2N+1,N}&P_{2N+1,N+1}&\cdots&P_{2N+1,2N+1}\end{pmatrix}\begin{pmatrix}\phi_{0}\\ \vdots\\ \phi_{N}\\ \dot{\phi}_{0}\\ \vdots\\ \dot{\phi}_{N}\end{pmatrix}\\ &+\begin{pmatrix}Q_{00}&\cdots&Q_{0N}&Q_{0,N+1}&\cdots&Q_{0,2N+1}\\ \vdots&\ddots&\vdots&\vdots&\ddots&\vdots\\ Q_{N0}&\cdots&Q_{NN}&Q_{N,N+1}&\cdots&Q_{N,2N+1}\\ Q_{N+1,0}&\cdots&Q_{N+1,N}&Q_{N+1,N+1}&\cdots&Q_{N+1,2N+1}\\ \vdots&\ddots&\vdots&\vdots&\ddots&\vdots\\ Q_{2N+1,0}&\cdots&Q_{2N+1,N}&Q_{2N+1,N+1}&\cdots&Q_{2N+1,2N+1}\end{pmatrix}\begin{pmatrix}\psi_{0}\\ \vdots\\ \psi_{N}\\ \dot{\psi}_{0}\\ \vdots\\ \dot{\psi}_{N}\end{pmatrix}\end{aligned}\right]=\begin{pmatrix}0\\ \vdots\\ 0\\ 0\\ \vdots\\ 0\end{pmatrix};

(4.2.9)

since

\displaystyle\left\{\begin{aligned} \frac{\partial\epsilon_{\rm IPW}}{\partial\phi_{i}}=\frac{\partial\epsilon_{{\rm IPW},i-1}}{\partial\phi_{i}}+\frac{\partial\epsilon_{{\rm IPW},i}}{\partial\phi_{i}}\\ \frac{\partial\epsilon_{\rm IPW}}{\partial\dot{\phi}_{i}}=\frac{\partial\epsilon_{{\rm IPW},i-1}}{\partial\dot{\phi}_{i}}+\frac{\partial\epsilon_{{\rm IPW},i}}{\partial\dot{\phi}_{i}}\end{aligned}\right.,

(4.2.10)

the $(2N+2)\times(2N+2)$ $P$ and $Q$ matrices can be constructed from scratch (zero matrix) by doing

\displaystyle\left\{\begin{aligned} \begin{pmatrix}P_{i,i}&P_{i,i+1}&P_{i,(N+1)+i}&P_{i,(N+1)+i+1}\\ P_{i+1,i}&P_{i+1,i+1}&P_{i+1,(N+1)+i}&P_{i+1,(N+1)+i+1}\\ P_{(N+1)+i,i}&P_{(N+1)+i,i+1}&P_{(N+1)+i,(N+1)+i}&P_{(N+1)+i,(N+1)+i+1}\\ P_{(N+1)+i+1,i}&P_{(N+1)+i+1,i+1}&P_{(N+1)+i+1,(N+1)+i}&P_{(N+1)+i+1,(N+1)+i+1}\end{pmatrix}+\!\!=P^{(i)}\\ \begin{pmatrix}Q_{i,i}&Q_{i,i+1}&Q_{i,(N+1)+i}&Q_{i,(N+1)+i+1}\\ Q_{i+1,i}&Q_{i+1,i+1}&Q_{i+1,(N+1)+i}&Q_{i+1,(N+1)+i+1}\\ Q_{(N+1)+i,i}&Q_{(N+1)+i,i+1}&Q_{(N+1)+i,(N+1)+i}&Q_{(N+1)+i,(N+1)+i+1}\\ Q_{(N+1)+i+1,i}&Q_{(N+1)+i+1,i+1}&Q_{(N+1)+i+1,(N+1)+i}&Q_{(N+1)+i+1,(N+1)+i+1}\end{pmatrix}+\!\!=Q^{(i)}\end{aligned}\right.,

(4.2.11)

where $+\!\!=$ denotes the addition assignment operator in common programming languages like C or Python, for $i=0,1,\ldots,N-1$ , i.e., once per sub-interval in Eq. (4.2.4). To enforce the $\psi_{0}=\psi_{N}=0$ constraints, one simply needs to remove the corresponding rows and columns.
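The assembly procedure can be sketched as below, assuming NumPy and a uniform partition. The function names are hypothetical, and the loop runs over the $N$ sub-intervals $i=0,1,\ldots,N-1$ that enter Eq. (4.2.4), so that all indices stay within the $(2N+2)\times(2N+2)$ matrices.

```python
import numpy as np

def element_matrices(h):
    """4 x 4 blocks P^(i) and Q^(i) of Eq. (4.2.8) for a sub-interval of length h."""
    P = np.array([
        [26*h/35,      9*h/35,       11*h**2/105, -13*h**2/210],
        [9*h/35,       26*h/35,      13*h**2/210, -11*h**2/105],
        [11*h**2/105,  13*h**2/210,  2*h**3/105,  -h**3/70],
        [-13*h**2/210, -11*h**2/105, -h**3/70,    2*h**3/105],
    ])
    Q = np.array([
        [-12/(5*h),  12/(5*h), -11/5,   -1/5],
        [ 12/(5*h), -12/(5*h),  1/5,    11/5],
        [-1/5,       1/5,      -4*h/15,  h/15],
        [-1/5,       1/5,       h/15,   -4*h/15],
    ])
    return P, Q

def assemble(N, enforce_bc=True):
    """Global P and Q; unknowns ordered as (f_0..f_N, fdot_0..fdot_N)."""
    h = 1.0 / N
    Pe, Qe = element_matrices(h)
    P = np.zeros((2*N + 2, 2*N + 2))
    Q = np.zeros_like(P)
    for i in range(N):                      # one pass per sub-interval
        idx = np.array([i, i + 1, N + 1 + i, N + 2 + i])
        P[np.ix_(idx, idx)] += Pe           # the "+=" of Eq. (4.2.11)
        Q[np.ix_(idx, idx)] += Qe
    if enforce_bc:                          # drop rows/columns of psi_0 and psi_N
        keep = np.delete(np.arange(2*N + 2), [0, N])
        P, Q = P[np.ix_(keep, keep)], Q[np.ix_(keep, keep)]
    return P, Q

P, Q = assemble(4)
print(P.shape, Q.shape)   # (8, 8) (8, 8)
```

Dense arrays are used only for readability; since every block is tridiagonal, a sparse format would be the natural choice for large $N$ .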

Our desired Hamiltonian is thus simply $H=-P^{-1}Q$ . Eigendecomposition of $H$ should yield $2N+2$ (or $2N$ ) eigenpairs, $\{\psi_{i}^{(k)},\dot{\psi}_{i}^{(k)}\}$ and $E^{(k)}$ , without (with) those two constraints. With or without the $\psi_{0}=\psi_{N}=0$ enforcement, $P$ and $Q$ matrices are always symmetric; however, this does not guarantee that the resulting $H$ matrix is also symmetric, and thus Hermitian.

Like in Section 4.1, the eigenvectors need to be “renormalized” as

$\displaystyle 1$	$\displaystyle=\int_{0}^{1}[{\mathcal{N}}\psi(x)]^{2}\,{\rm d}x={\mathcal{N}}^{2}\sum_{i=0}^{N-1}\int_{x_{i}}^{x_{i+1}}[\psi_{i}+\dot{\psi}_{i}(x-x_{i})+B_{\psi i}(x-x_{i})^{2}+A_{\psi i}(x-x_{i})^{3}]^{2}\,{\rm d}x$
	$\displaystyle={\mathcal{N}}^{2}\sum_{i=0}^{N-1}\int_{0}^{h}[\psi_{i}+\dot{\psi}_{i}x+B_{\psi i}x^{2}+A_{\psi i}x^{3}]^{2}\,{\rm d}x$
	$\displaystyle={\mathcal{N}}^{2}\sum_{i=0}^{N-1}\int_{0}^{h}\left[\begin{aligned} &\psi_{i}^{2}+2\psi_{i}\dot{\psi}_{i}x+(2B_{\psi i}\psi_{i}+\dot{\psi}_{i}^{2})x^{2}+(2A_{\psi i}\psi_{i}+2B_{\psi i}\dot{\psi}_{i})x^{3}\\ &+(B_{\psi i}^{2}+2A_{\psi i}\dot{\psi}_{i})x^{4}+2A_{\psi i}B_{\psi i}x^{5}+A_{\psi i}^{2}x^{6}\end{aligned}\right]\,{\rm d}x$
	$\displaystyle={\mathcal{N}}^{2}\sum_{i=0}^{N-1}\left[\begin{aligned} &\psi_{i}^{2}h+\psi_{i}\dot{\psi}_{i}h^{2}+\frac{2B_{\psi i}\psi_{i}+\dot{\psi}_{i}^{2}}{3}h^{3}+\frac{A_{\psi i}\psi_{i}+B_{\psi i}\dot{\psi}_{i}}{2}h^{4}\\ &+\frac{B_{\psi i}^{2}+2A_{\psi i}\dot{\psi}_{i}}{5}h^{5}+\frac{A_{\psi i}B_{\psi i}}{3}h^{6}+\frac{A_{\psi i}^{2}}{7}h^{7}\end{aligned}\right]$
	$\displaystyle={\mathcal{N}}^{2}\sum_{i=0}^{N-1}\left[\begin{aligned} &\frac{1}{35}(13\psi_{i}^{2}+9\psi_{i}\psi_{i+1}+13\psi_{i+1}^{2})h+\frac{1}{210}(22\psi_{i}\dot{\psi}_{i}-13\psi_{i}\dot{\psi}_{i+1}+13\dot{\psi}_{i}\psi_{i+1}-22\dot{\psi}_{i+1}\psi_{i+1})h^{2}\\ &+\frac{1}{210}(2\dot{\psi}_{i}^{2}-3\dot{\psi}_{i}\dot{\psi}_{i+1}+2\dot{\psi}_{i+1}^{2})h^{3}\end{aligned}\right];$	(4.2.12)

they may need to be “reorthogonalized” according to the “inner product” defined as follows

$\displaystyle\langle\psi^{(k)}\|\psi^{(l)}\rangle$	$\displaystyle=\langle\{\psi_{i}^{(k)},\dot{\psi}_{i}^{(k)}\}\|\{\psi_{i}^{(l)},\dot{\psi}_{i}^{(l)}\}\rangle=\sum_{i=0}^{N-1}\int_{x_{i}}^{x_{i+1}}\left[\begin{aligned} &\{\psi_{i}^{(k)}+\dot{\psi}_{i}^{(k)}(x-x_{i})+B_{\psi i}^{(k)}(x-x_{i})^{2}+A_{\psi i}^{(k)}(x-x_{i})^{3}\}\\ &\cdot\{\psi_{i}^{(l)}+\dot{\psi}_{i}^{(l)}(x-x_{i})+B_{\psi i}^{(l)}(x-x_{i})^{2}+A_{\psi i}^{(l)}(x-x_{i})^{3}\}\end{aligned}\right]\,{\rm d}x$
	$\displaystyle=\sum_{i=0}^{N-1}\int_{0}^{h}[\{\psi_{i}^{(k)}+\dot{\psi}_{i}^{(k)}x+B_{\psi i}^{(k)}x^{2}+A_{\psi i}^{(k)}x^{3}\}\cdot\{\psi_{i}^{(l)}+\dot{\psi}_{i}^{(l)}x+B_{\psi i}^{(l)}x^{2}+A_{\psi i}^{(l)}x^{3}\}]\,{\rm d}x$
	$\displaystyle=\sum_{i=0}^{N-1}\int_{0}^{h}\left[\begin{aligned} &\psi_{i}^{(k)}\psi_{i}^{(l)}+(\dot{\psi}_{i}^{(k)}\psi_{i}^{(l)}+\psi_{i}^{(k)}\dot{\psi}_{i}^{(l)})x+(B_{\psi i}^{(k)}\psi_{i}^{(l)}+\dot{\psi}_{i}^{(k)}\dot{\psi}_{i}^{(l)}+\psi_{i}^{(k)}B_{\psi i}^{(l)})x^{2}\\ &+(A_{\psi i}^{(k)}\psi_{i}^{(l)}+B_{\psi i}^{(k)}\dot{\psi}_{i}^{(l)}+\dot{\psi}_{i}^{(k)}B_{\psi i}^{(l)}+\psi_{i}^{(k)}A_{\psi i}^{(l)})x^{3}\\ &+(A_{\psi i}^{(k)}\dot{\psi}_{i}^{(l)}+B_{\psi i}^{(k)}B_{\psi i}^{(l)}+\dot{\psi}_{i}^{(k)}A_{\psi i}^{(l)})x^{4}+(A_{\psi i}^{(k)}B_{\psi i}^{(l)}+B_{\psi i}^{(k)}A_{\psi i}^{(l)})x^{5}+A_{\psi i}^{(k)}A_{\psi i}^{(l)}x^{6}\end{aligned}\right]\,{\rm d}x$
	$\displaystyle=\sum_{i=0}^{N-1}\left[\begin{aligned} &\psi_{i}^{(k)}\psi_{i}^{(l)}h+\frac{1}{2}(\dot{\psi}_{i}^{(k)}\psi_{i}^{(l)}+\psi_{i}^{(k)}\dot{\psi}_{i}^{(l)})h^{2}+\frac{1}{3}(B_{\psi i}^{(k)}\psi_{i}^{(l)}+\dot{\psi}_{i}^{(k)}\dot{\psi}_{i}^{(l)}+\psi_{i}^{(k)}B_{\psi i}^{(l)})h^{3}\\ &+\frac{1}{4}(A_{\psi i}^{(k)}\psi_{i}^{(l)}+B_{\psi i}^{(k)}\dot{\psi}_{i}^{(l)}+\dot{\psi}_{i}^{(k)}B_{\psi i}^{(l)}+\psi_{i}^{(k)}A_{\psi i}^{(l)})h^{4}\\ &+\frac{1}{5}(A_{\psi i}^{(k)}\dot{\psi}_{i}^{(l)}+B_{\psi i}^{(k)}B_{\psi i}^{(l)}+\dot{\psi}_{i}^{(k)}A_{\psi i}^{(l)})h^{5}+\frac{1}{6}(A_{\psi i}^{(k)}B_{\psi i}^{(l)}+B_{\psi i}^{(k)}A_{\psi i}^{(l)})h^{6}+\frac{1}{7}A_{\psi i}^{(k)}A_{\psi i}^{(l)}h^{7}\end{aligned}\right]$
	$\displaystyle=\sum_{i=0}^{N-1}\left[\begin{aligned} &\frac{1}{70}(26\psi_{i}^{(k)}\psi_{i}^{(l)}+9\psi_{i}^{(k)}\psi_{i+1}^{(l)}+9\psi_{i+1}^{(k)}\psi_{i}^{(l)}+26\psi_{i+1}^{(k)}\psi_{i+1}^{(l)})h\\ &+\frac{11}{210}(\dot{\psi}_{i}^{(k)}\psi_{i}^{(l)}-\dot{\psi}_{i+1}^{(k)}\psi_{i+1}^{(l)}+\psi_{i}^{(k)}\dot{\psi}_{i}^{(l)}-\psi_{i+1}^{(k)}\dot{\psi}_{i+1}^{(l)})h^{2}\\ &+\frac{13}{420}(\dot{\psi}_{i}^{(k)}\psi_{i+1}^{(l)}-\psi_{i}^{(k)}\dot{\psi}_{i+1}^{(l)}+\psi_{i+1}^{(k)}\dot{\psi}_{i}^{(l)}-\dot{\psi}_{i+1}^{(k)}\psi_{i}^{(l)})h^{2}\\ &+\frac{1}{420}(4\dot{\psi}_{i}^{(k)}\dot{\psi}_{i}^{(l)}-3\dot{\psi}_{i}^{(k)}\dot{\psi}_{i+1}^{(l)}-3\dot{\psi}_{i+1}^{(k)}\dot{\psi}_{i}^{(l)}+4\dot{\psi}_{i+1}^{(k)}\dot{\psi}_{i+1}^{(l)})h^{3}\end{aligned}\right].$	(4.2.13)
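The last line of Eq. (4.2.13) is a bilinear form in $(\psi_{i},\psi_{i+1},\dot{\psi}_{i},\dot{\psi}_{i+1})$ , so it can be evaluated with one $4\times 4$ matrix per sub-interval. The sketch below assumes NumPy; `gram_element` and `hermite_inner` are hypothetical names. By inspection of the coefficients, this matrix is half of the $P^{(i)}$ block in Eq. (4.2.8).

```python
import numpy as np

def gram_element(h):
    """Coefficients of Eq. (4.2.13) for (f_i, f_{i+1}, fdot_i, fdot_{i+1})."""
    return h / 420.0 * np.array([
        [156.0,    54.0,    22.0*h,   -13.0*h],
        [54.0,     156.0,   13.0*h,   -22.0*h],
        [22.0*h,   13.0*h,  4.0*h**2, -3.0*h**2],
        [-13.0*h, -22.0*h, -3.0*h**2,  4.0*h**2],
    ])

def hermite_inner(u, du, v, dv, h):
    """Inner product of two wavefunctions given node values and derivatives."""
    G = gram_element(h)
    total = 0.0
    for i in range(len(u) - 1):             # sum over sub-intervals
        a = np.array([u[i], u[i + 1], du[i], du[i + 1]])
        b = np.array([v[i], v[i + 1], dv[i], dv[i + 1]])
        total += a @ G @ b
    return total

# N = 1 example with psi_0 = psi_1 = 0 and derivative vectors (1, -1), (1, 1).
zeros = [0.0, 0.0]
print(hermite_inner(zeros, [1.0, -1.0], zeros, [1.0, 1.0], 1.0))
```

Passing the same function twice gives the sum in Eq. (4.2.12), from which the normalization factor ${\mathcal{N}}$ follows.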

Toy version: $N=1$ .

When $N=1$ and $h=1/N=1$ , with $\psi_{0}=\psi_{N}=0$ enforced, the $P$ and $Q$ matrices are simply

\displaystyle P=\begin{pmatrix}2/105&-1/70\\ -1/70&2/105\end{pmatrix}\quad{\rm and}\quad Q=\begin{pmatrix}-4/15&1/15\\ 1/15&-4/15\end{pmatrix},

(4.2.14)

and the Hamiltonian is

\displaystyle H=-P^{-1}Q=\begin{pmatrix}26&16\\ 16&26\end{pmatrix}.

(4.2.15)

Eigendecomposition and normalization yield

\displaystyle\left\{\begin{aligned} &\psi_{1}(x)=\sqrt{30}(x-x^{2})\quad 0\leq x\leq 1\quad&&E_{1}=10\approx 1.0132\pi^{2}\\ &\psi_{2}(x)=\sqrt{210}(x-3x^{2}+2x^{3})\quad 0\leq x\leq 1\quad&&E_{2}=42\approx 1.0639(2\pi)^{2}\end{aligned}\right.;

(4.2.16)

see Fig. 4.2.1 for comparisons between these results and exact solution Eq. (4.1.2) for $n=1$ and $n=2$ . Like in Section 4.1, these two wavefunctions are automatically orthogonal to each other.
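The toy version is small enough to be verified in exact rational arithmetic. The sketch below uses only the Python standard library; the helper name `toy_hamiltonian` is hypothetical, and the matrices are copied from Eq. (4.2.14).

```python
from fractions import Fraction as F
import math

P = [[F(2, 105), F(-1, 70)], [F(-1, 70), F(2, 105)]]
Q = [[F(-4, 15), F(1, 15)], [F(1, 15), F(-4, 15)]]

def toy_hamiltonian(P, Q):
    """H = -P^{-1} Q for 2 x 2 matrices, using the explicit 2 x 2 inverse."""
    det = P[0][0] * P[1][1] - P[0][1] * P[1][0]
    Pinv = [[P[1][1] / det, -P[0][1] / det],
            [-P[1][0] / det, P[0][0] / det]]
    return [[-sum(Pinv[r][k] * Q[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

H = toy_hamiltonian(P, Q)
# A matrix of the form [[a, b], [b, a]] has eigenvalues a - b and a + b,
# with eigenvectors (1, -1) and (1, 1) in the (psidot_0, psidot_1) basis.
E1, E2 = H[0][0] - H[0][1], H[0][0] + H[0][1]
print(H, E1, E2)
print(float(E1) / math.pi**2, float(E2) / (2.0 * math.pi)**2)
```

The eigenvector $(1,-1)$ gives $A_{\psi 0}=0$ and $B_{\psi 0}=-1$ through Eq. (4.2.3), i.e., $\psi\propto x-x^{2}$ , while $(1,1)$ gives $A_{\psi 0}=2$ and $B_{\psi 0}=-3$ , i.e., $\psi\propto x-3x^{2}+2x^{3}$ , in agreement with Eq. (4.2.16).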

Realistic versions: $N=2$ , $N=4$ , and $N=8$ .

Although the toy version results seem promising, one needs to use a larger $N$ for more accurate results and larger quantum numbers.

Fig. 4.2.2 shows $P$ , $Q$ , and $H$ matrices, as well as normalized but not necessarily orthogonal eigenvectors produced by $N=1$ , $N=2$ , $N=4$ , and $N=8$ versions of first-order ContEvol. $P$ , $Q$ , and $H$ are all $2N\times 2N$ matrices. In each of them, the upper left $(N-1)\times(N-1)$ block (absent in the $N=1$ case) describes coupling between $\psi_{i}$ and $\psi_{i+1}$ , the lower right $(N+1)\times(N+1)$ block describes coupling between $\dot{\psi}_{i}$ and $\dot{\psi}_{i+1}$ , and the other two blocks (both absent in the $N=1$ case) describe coupling between values and derivatives. All these blocks are tridiagonal; because of the special form of $P^{(i)}$ and $Q^{(i)}$ submatrices Eq. (4.2), the central diagonals of the cross blocks are uniformly zero. From the third column, it is clear that the Hamiltonians are not symmetric; nevertheless, the upper left $(N-1)\times(N-1)$ blocks (absent in the $N=1$ case) and the lower right $(N+1)\times(N+1)$ blocks are symmetric. Intuitively, the Hamiltonians should still be Hermitian if we consider them as operators on function representations $\{\psi_{i},\dot{\psi}_{i}\}$ . Shown in the last column are the eigenvectors: the first $N-1$ components of each row (absent in the $N=1$ case) are $\psi_{i}$ for $i=1,2,\ldots,N-1$ , while the last $N+1$ components are $\dot{\psi}_{i}$ for $i=0,1,\ldots,N$ . Similar patterns can be seen from eigenvectors with different $N$ values. For example, both $\psi^{(2N)}$ and $\psi^{(N)}$ are zero or almost zero at nodes (not shown in the $N=1$ case), but the former has the same first derivatives, while the latter has alternating first derivatives.

Figs. 4.2.3 and 4.2.4 display errors in eigenvalues and rendered eigenvectors of $N=1$ , $N=2$ , $N=4$ , and $N=8$ Hamiltonians, respectively. With only a quarter of the number of parameters used in simple discretization (see Section 4.1), ContEvol results are arguably better, especially for the ground state energy $E_{1}$ . Since a $2N\times 2N$ matrix only has (at most) $2N$ eigenpairs, $E_{n}$ and $\psi^{(n)}$ with large $n$ are only available with large $N$ . The quality of the results significantly deteriorates as $n$ approaches $2N$ ; it reaches the worst case at $2N-1$ , and becomes reasonably good at $2N$ , when our sampling nodes coincide with zero points of the wavefunctions. Based on these two figures, a rule of thumb would be to only trust $n\leq N$ results, so that errors in eigenvalues are below or at the $\sim 1\%$ level.

4.3 Harmonic oscillator, first-order ContEvol (description)

In this section, we consider (quantum) harmonic oscillator with potential

\displaystyle V(x)=x^{2},\quad x\in\mathbb{R},

(4.3.1)

where we have set the constant $k/2$ to $1$ ; note that this only affects the scaling of $x$ . The exact wavefunctions can be expressed using Hermite polynomials; we do not include them here as no comparisons will be made.

As for application of the ContEvol method, there are three major differences between harmonic oscillator and infinite potential well, which we describe one by one.

Difference 1: Position-dependent potential.

In the case of infinite potential well, the potential $V(x)$ is uniformly zero in the interval of interest; the case of harmonic oscillator is different. Consequently, each piece of the cost function needs to be written as (subscript “QHO” stands for quantum harmonic oscillator)

$\displaystyle\epsilon_{{\rm QHO},i}$	$\displaystyle=\int_{x_{i}}^{x_{i+1}}(\ddot{\psi}-V\psi+\phi)^{2}\,{\rm d}x=\int_{x_{i}}^{x_{i+1}}\left[\begin{aligned} &\{2B_{\psi i}+6A_{\psi i}(x-x_{i})\}-x^{2}\\ &\cdot\{\psi_{i}+\dot{\psi}_{i}(x-x_{i})+B_{\psi i}(x-x_{i})^{2}+A_{\psi i}(x-x_{i})^{3}\}\\ &+\{\phi_{i}+\dot{\phi}_{i}(x-x_{i})+B_{\phi i}(x-x_{i})^{2}+A_{\phi i}(x-x_{i})^{3}\}\end{aligned}\right]^{2}\,{\rm d}x$
	$\displaystyle=\int_{x_{i}}^{x_{i+1}}\left[\begin{aligned} &\{2B_{\psi i}+6A_{\psi i}(x-x_{i})\}-\{x_{i}^{2}+2x_{i}(x-x_{i})+(x-x_{i})^{2}\}\\ &\cdot\{\psi_{i}+\dot{\psi}_{i}(x-x_{i})+B_{\psi i}(x-x_{i})^{2}+A_{\psi i}(x-x_{i})^{3}\}\\ &+\{\phi_{i}+\dot{\phi}_{i}(x-x_{i})+B_{\phi i}(x-x_{i})^{2}+A_{\phi i}(x-x_{i})^{3}\}\end{aligned}\right]^{2}\,{\rm d}x$
	$\displaystyle=\int_{0}^{h}\left[\begin{aligned} &(2B_{\psi i}+6A_{\psi i}x)-(x_{i}^{2}+2x_{i}x+x^{2})\\ &\cdot(\psi_{i}+\dot{\psi}_{i}x+B_{\psi i}x^{2}+A_{\psi i}x^{3})+(\phi_{i}+\dot{\phi}_{i}x+B_{\phi i}x^{2}+A_{\phi i}x^{3})\end{aligned}\right]^{2}\,{\rm d}x=\cdots,$	(4.3.2)

where we have omitted results of the expansion, squaring, integral, and substitution steps (“ $\cdots$ ”). It should be noted that, fortunately, ContEvol is robust against complications induced by the position-dependent potential function, because $\epsilon_{{\rm QHO},i}$ is still a finite polynomial of $h$ , of which all coefficients are linear combinations of $\{\psi_{i},\dot{\psi}_{i},\psi_{i+1},\dot{\psi}_{i+1}\}$ , $\{\phi_{i},\dot{\phi}_{i},\phi_{i+1},\dot{\phi}_{i+1}\}$ , and $\{x_{i},x_{i+1}\}$ .
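Although the closed form is omitted, each piece of Eq. (4.3.2) is easy to evaluate numerically, because the integrand is a polynomial in the offset $x-x_{i}$ . The following sketch assumes NumPy and its `numpy.polynomial.Polynomial` class; the function names are hypothetical, and the node values in the usage example are made up for illustration.

```python
import numpy as np
from numpy.polynomial import Polynomial

def hermite_poly(f0, df0, f1, df1, h):
    """Cubic piece of Eq. (4.2.2) as a polynomial in s = x - x_i."""
    A = 2.0 * (f0 - f1) / h**3 + (df0 + df1) / h**2
    B = 3.0 * (f1 - f0) / h**2 - (2.0 * df0 + df1) / h
    return Polynomial([f0, df0, B, A])

def eps_qho_piece(psi, phi, x_i, h):
    """Eq. (4.3.2); psi and phi are tuples (f_i, fdot_i, f_{i+1}, fdot_{i+1})."""
    p = hermite_poly(*psi, h)
    q = hermite_poly(*phi, h)
    V = Polynomial([x_i**2, 2.0 * x_i, 1.0])      # V = (x_i + s)^2
    integrand = (p.deriv(2) - V * p + q) ** 2     # degree-10 polynomial in s
    antiderivative = integrand.integ()            # exact, term by term
    return antiderivative(h) - antiderivative(0.0)

print(eps_qho_piece((0.3, -0.7, 1.1, 0.4), (0.2, 0.5, -0.9, 1.3), 0.5, 0.25))
```

Replacing `V` by another polynomial, e.g., a cubic Hermite representation of a general potential, requires no other change to this routine.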

In general, a potential function $V(x)$ can be represented as $\{V_{i}\equiv V(x_{i})\}$ and $\{\dot{V}_{i}\equiv\dot{V}(x_{i})\}$ , even if it is a hard-to-integrate transcendental function or does not have an analytic form. In the regime of first-order ContEvol, each piece of $V(x)$ possesses up to the third order in $x$ , ergo the resulting expression of each piece of the cost function has up to the thirteenth order in $h$ ; when $h$ is small, it is reasonable to truncate the expansion of the square root of the integrand at the third order in $x$ , so that the final expression has up to the seventh order in $h$ , like in Sections 2.1 or 3.1. Note that when $h$ denotes the length of each sub-interval, it is not necessarily small, specifically not necessarily smaller than $1$ , hence higher-order terms may be more important than lower-order ones.

Difference 2: Lack of sharp edges.

Unlike Eq. (4.1.1), Eq. (4.3.1) does not require wavefunctions to vanish at specific, finitely distant positions; figuratively speaking, wavefunctions are allowed to (and actually should) have tails. Therefore, we need to define $\epsilon_{{\rm QHO},-1}$ and $\epsilon_{{\rm QHO},N}$ — not just for convenience, but also for accuracy.

Wavefunctions are supposed to vanish at infinity, i.e., satisfy $\psi(-\infty)=\psi(+\infty)=0$ and $\dot{\psi}(-\infty)=\dot{\psi}(+\infty)=0$ . Given $\psi_{0}$ and $\dot{\psi}_{0}$ or $\psi_{N}$ and $\dot{\psi}_{N}$ , it is impossible to find a cubic representation of $\psi(x)$ in the interval $(-\infty,x_{0}]$ or $[x_{N},+\infty)$ ; however, assuming that $\psi_{0}$ and $\dot{\psi}_{0}$ have the same sign while $\psi_{N}$ and $\dot{\psi}_{N}$ have opposite signs, there is always a pair of exponential tails

\displaystyle\psi(x)=\left\{\begin{aligned} &\psi_{0}\exp\left[\frac{\dot{\psi}_{0}}{\psi_{0}}(x-x_{0})\right]\quad&&x\leq x_{0}\\ &\psi_{N}\exp\left[\frac{\dot{\psi}_{N}}{\psi_{N}}(x-x_{N})\right]\quad&&x\geq x_{N}\end{aligned}\right.

(4.3.3)

satisfying all these boundary conditions. Expressing tails of $\phi(x)$ in the same way, tails of the cost function could be defined as

\displaystyle\left\{\begin{aligned} \epsilon_{{\rm QHO},-1}&=\int_{-\infty}^{x_{0}}(\ddot{\psi}-V\psi+\phi)^{2}\,{\rm d}x\\ &=\int_{-\infty}^{x_{0}}\left[\frac{\dot{\psi}_{0}^{2}}{\psi_{0}}\exp\left(\frac{\dot{\psi}_{0}}{\psi_{0}}(x-x_{0})\right)-x^{2}\psi_{0}\exp\left(\frac{\dot{\psi}_{0}}{\psi_{0}}(x-x_{0})\right)+\phi_{0}\exp\left(\frac{\dot{\phi}_{0}}{\phi_{0}}(x-x_{0})\right)\right]^{2}\,{\rm d}x\\ \epsilon_{{\rm QHO},N}&=\int_{x_{N}}^{+\infty}(\ddot{\psi}-V\psi+\phi)^{2}\,{\rm d}x\\ &=\int_{x_{N}}^{+\infty}\left[\frac{\dot{\psi}_{N}^{2}}{\psi_{N}}\exp\left(\frac{\dot{\psi}_{N}}{\psi_{N}}(x-x_{N})\right)-x^{2}\psi_{N}\exp\left(\frac{\dot{\psi}_{N}}{\psi_{N}}(x-x_{N})\right)+\phi_{N}\exp\left(\frac{\dot{\phi}_{N}}{\phi_{N}}(x-x_{N})\right)\right]^{2}\,{\rm d}x\end{aligned}\right.,

(4.3.4)

where integrals of exponential tails multiplied by $x^{2}$ (actually polynomial potential functions in general) can be expressed using the gamma function. Yet unfortunately, with $\psi_{0}$ and $\psi_{N}$ , $\phi_{0}$ and $\phi_{N}$ as denominators, such tails break the linearity of our ContEvol formalism. A natural solution would be to treat $\dot{\psi}_{0}/\psi_{0}$ and $\dot{\psi}_{N}/\psi_{N}$ , $\dot{\phi}_{0}/\phi_{0}$ and $\dot{\phi}_{N}/\phi_{N}$ as fixed values in the tails; as a price, one would need to fine-tune $x_{0}$ and $x_{N}$ , so that these ratios are indeed close to the corresponding fixed values. A related example will be presented in the next section.
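To illustrate the tails of Eq. (4.3.3) and the sign condition stated before it, here is a minimal sketch assuming NumPy; the function names are hypothetical. The helper `tail_norm_sq` uses the elementary integral $\int_{0}^{\infty}e^{-2|\kappa|s}\,{\rm d}s=1/(2|\kappa|)$ with $\kappa\equiv\dot{\psi}_{0}/\psi_{0}$ (or $\dot{\psi}_{N}/\psi_{N}$ ), which we add here for illustration; it also shows explicitly how the node value enters a denominator.

```python
import numpy as np

def exp_tail(x, x_edge, psi_edge, dpsi_edge):
    """Tail psi_edge * exp[(dpsi_edge / psi_edge) (x - x_edge)] of Eq. (4.3.3)."""
    kappa = dpsi_edge / psi_edge
    return psi_edge * np.exp(kappa * (np.asarray(x, dtype=float) - x_edge))

def tail_decays(psi_edge, dpsi_edge, side):
    """Left tail needs equal signs (kappa > 0), right tail opposite signs."""
    kappa = dpsi_edge / psi_edge
    return kappa > 0.0 if side == "left" else kappa < 0.0

def tail_norm_sq(psi_edge, dpsi_edge):
    """Integral of psi^2 over a decaying tail: psi_edge^2 / (2 |kappa|)."""
    kappa = dpsi_edge / psi_edge
    return psi_edge**2 / (2.0 * abs(kappa))

# Made-up left-edge data: x_0 = -3, psi_0 = 0.5, psidot_0 = 1.
print(exp_tail([-5.0, -4.0, -3.0], -3.0, 0.5, 1.0), tail_norm_sq(0.5, 1.0))
```

By construction, both the value and the first derivative of the tail match the node data at the edge.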

As for second-order ContEvol, the tails could be similarly written as

\displaystyle\psi(x)=\left\{\begin{aligned} &\psi_{0}\exp\left[\frac{\dot{\psi}_{0}}{\psi_{0}}(x-x_{0})+\frac{\ddot{\psi}_{0}\psi_{0}-\dot{\psi}_{0}^{2}}{2\psi_{0}^{2}}(x-x_{0})^{2}\right]\quad&&x\leq x_{0}\\ &\psi_{N}\exp\left[\frac{\dot{\psi}_{N}}{\psi_{N}}(x-x_{N})+\frac{\ddot{\psi}_{N}\psi_{N}-\dot{\psi}_{N}^{2}}{2\psi_{N}^{2}}(x-x_{N})^{2}\right]\quad&&x\geq x_{N}\end{aligned}\right.;

(4.3.5)

however, even if one is willing to deal with non-linearity, since the error function does not have an analytic form, one may need to build numerical lookup tables for $\epsilon_{{\rm QHO},-1}$ and $\epsilon_{{\rm QHO},N}$ . In the linear regime, we can set ratios like $\ddot{\psi}_{0}/\psi_{0}$ and $\ddot{\psi}_{N}/\psi_{N}$ to zero as well, but it is not common for first and second derivatives to simultaneously satisfy constraints, hence we can only aim for having sensible $\dot{\psi}_{0}$ and $\dot{\psi}_{N}$ values. Better and possibly intricate circumvention is beyond the scope of this work.

Difference 3: Increasing “sizes” of wavefunctions.

For scenarios like the (quantum) harmonic oscillator, the “sizes” of wavefunctions (which can be strictly quantified using percentiles of the probability distribution) increase with larger quantum numbers. Meanwhile, with $N$ nodes, first-order ContEvol is supposed to yield $2N$ eigenvectors. Therefore, for similar problems, the spread of nodes probably needs to be adjusted according to test results. Since the fine-tuning may require several iterations, objective evaluation criteria can be designed to automate this process; such efforts are left for future work, and probably for specific situations.

To summarize, the harmonic oscillator exhibits some of the difficulties encountered in real-world problems, but ContEvol methods should be able to handle them reasonably well.

4.4 Coulomb potential, first-order ContEvol

In this final section on quantum mechanics, we look at a more realistic case, the one-dimensional Coulomb potential. Following Section 2.1 of Pradhan and Nahar (2011), the radial part of the stationary Schrödinger equation for a hydrogen atom can be written as

\displaystyle\left[\frac{{\rm d}^{2}}{{\rm d}r^{2}}-V(r)-\frac{l(l+1)}{r^{2}}+E\right]P(r)=0,\quad r\geq 0,

(4.4.1)

where we have used atomic units, the potential $V(r)=-2/r$ , $l$ is the angular quantum number, and $P(r)\equiv r\cdot R(r)$ is a modified version of the radial wavefunction $R(r)$ . This work focuses on the ground state $n=1$ , hence we set $l=0$ ; in our notation, the equation becomes

\displaystyle-\ddot{\psi}-\frac{2}{r}\psi=E\psi,\quad r\geq 0,

(4.4.2)

and the exact solution is

\displaystyle\psi^{(1)}(r)=2re^{-r},\quad r\geq 0\quad{\rm and}\quad E_{1}=-1.

(4.4.3)

For simplicity, we sample the non-negative half of the real axis with $N+1$ nodes

\displaystyle r_{i}=i\cdot h,\quad i=0,1,\ldots,N,

(4.4.4)

where $h$ is the width of each interval (caution: in this section, $i$ is always a non-negative integer and never the imaginary unit); non-uniform sampling is left for future work. To handle the $1/r$ factor in the equation, we require each piece of the wavefunction $\psi(r)$ to be proportional to $r$ ; note that this strategy can be applied to the Yukawa potential as well. Therefore the wavefunction is written as

\displaystyle\psi(r)=\left\{\begin{aligned} &r(D_{\psi i}+C_{\psi i}r+B_{\psi i}r^{2}+A_{\psi i}r^{3})&&r_{i}\leq r\leq r_{i+1}\\ &\psi_{N}\frac{r}{r_{N}}\exp\left(1-\frac{r}{r_{N}}\right)&&r\geq r_{N}\end{aligned}\right.,

(4.4.5)

where we exclude $\dot{\psi}_{N}$ from the tail to maintain linearity of our framework.

The coefficients $D_{\psi i}$ through $A_{\psi i}$ are yielded by terminal conditions at $r=r_{i}$ and $r_{i+1}$

\displaystyle\left\{\begin{aligned} \psi(r_{i})&=r_{i}(D_{\psi i}+C_{\psi i}r_{i}+B_{\psi i}r_{i}^{2}+A_{\psi i}r_{i}^{3})=\psi_{i}\\ \dot{\psi}(r_{i})&=D_{\psi i}+2C_{\psi i}r_{i}+3B_{\psi i}r_{i}^{2}+4A_{\psi i}r_{i}^{3}=\dot{\psi}_{i}\\ \psi(r_{i+1})&=r_{i+1}(D_{\psi i}+C_{\psi i}r_{i+1}+B_{\psi i}r_{i+1}^{2}+A_{\psi i}r_{i+1}^{3})=\psi_{i+1}\\ \dot{\psi}(r_{i+1})&=D_{\psi i}+2C_{\psi i}r_{i+1}+3B_{\psi i}r_{i+1}^{2}+4A_{\psi i}r_{i+1}^{3}=\dot{\psi}_{i+1}\end{aligned}\right.;

(4.4.6)

since $r_{i}=i\cdot h$ and $r_{i+1}=(i+1)\cdot h$ , for $i>0$ we have

		$\displaystyle\begin{pmatrix}ih&(ih)^{2}&(ih)^{3}&(ih)^{4}\\ 1&2ih&3(ih)^{2}&4(ih)^{3}\\ (i+1)h&[(i+1)h]^{2}&[(i+1)h]^{3}&[(i+1)h]^{4}\\ 1&2(i+1)h&3[(i+1)h]^{2}&4[(i+1)h]^{3}\end{pmatrix}\begin{pmatrix}D_{\psi i}\\ C_{\psi i}\\ B_{\psi i}\\ A_{\psi i}\end{pmatrix}=\begin{pmatrix}\psi_{i}\\ \dot{\psi}_{i}\\ \psi_{i+1}\\ \dot{\psi}_{i+1}\end{pmatrix}$		(4.4.7)
	$\displaystyle\Rightarrow\quad$	$\displaystyle\left\{\begin{aligned} A_{\psi i}&=\left[\frac{(2i-1)\psi_{i}}{i^{2}}-\frac{(2i+3)\psi_{i+1}}{(i+1)^{2}}\right]h^{-4}+\left[\frac{\dot{\psi}_{i}}{i}+\frac{\dot{\psi}_{i+1}}{i+1}\right]h^{-3}\\ B_{\psi i}&=-2\left[\frac{(3i^{2}-1)\psi_{i}}{i^{2}}-\frac{(3i^{2}+6i+2)\psi_{i+1}}{(i+1)^{2}}\right]h^{-3}-\left[\frac{(3i+2)\dot{\psi}_{i}}{i}+\frac{(3i+1)\dot{\psi}_{i+1}}{i+1}\right]h^{-2}\\ C_{\psi i}&=\left[\frac{(i+1)(6i^{2}-3i-1)\psi_{i}}{i^{2}}-\frac{i(6i^{2}+15i+8)\psi_{i+1}}{(i+1)^{2}}\right]h^{-2}+\left[\frac{(i+1)(3i+1)\dot{\psi}_{i}}{i}+\frac{i(3i+2)\dot{\psi}_{i+1}}{i+1}\right]h^{-1}\\ D_{\psi i}&=-2\left[\frac{(i+1)^{2}(i-1)\psi_{i}}{i}-\frac{i^{2}(i+2)\psi_{i+1}}{i+1}\right]h^{-1}-[(i+1)^{2}\dot{\psi}_{i}+i^{2}\dot{\psi}_{i+1}]\end{aligned}\right..$		(4.4.8)

Like in Section 4.2, the desired approximation $\phi\equiv H\psi$ is represented in the same way with $\{\phi_{i}\}$ and $\{\dot{\phi}_{i}\}$ . For convenience, we put this linear transformation in matrix form

\displaystyle\bar{\boldsymbol{\psi}}^{(i)}\equiv\begin{pmatrix}A_{\psi i}\\ B_{\psi i}\\ C_{\psi i}\\ D_{\psi i}\end{pmatrix}=T^{(i)}\begin{pmatrix}\psi_{i}\\ \psi_{i+1}\\ \dot{\psi}_{i}\\ \dot{\psi}_{i+1}\end{pmatrix}\equiv T^{(i)}{\boldsymbol{\psi}}^{(i)}

(4.4.9)

with the transformation matrix

\displaystyle T^{(i)}=\begin{pmatrix}\dfrac{2i-1}{i^{2}}h^{-4}&-\dfrac{2i+3}{(i+1)^{2}}h^{-4}&\dfrac{1}{i}h^{-3}&\dfrac{1}{i+1}h^{-3}\\ -2\dfrac{3i^{2}-1}{i^{2}}h^{-3}&2\dfrac{3i^{2}+6i+2}{(i+1)^{2}}h^{-3}&-\dfrac{3i+2}{i}h^{-2}&-\dfrac{3i+1}{i+1}h^{-2}\\ \dfrac{(i+1)(6i^{2}-3i-1)}{i^{2}}h^{-2}&-\dfrac{i(6i^{2}+15i+8)}{(i+1)^{2}}h^{-2}&\dfrac{(i+1)(3i+1)}{i}h^{-1}&\dfrac{i(3i+2)}{i+1}h^{-1}\\ -2\dfrac{(i+1)^{2}(i-1)}{i}h^{-1}&2\dfrac{i^{2}(i+2)}{i+1}h^{-1}&-(i+1)^{2}&-i^{2}\end{pmatrix},

(4.4.10)

which is the same for $\psi(r)$ and $\phi(r)$ . Boundary condition at $r_{0}=0$ indicates that $\psi_{0}=0$ . In the special case of $i=0$ , we set $A_{\psi 0}=0$ to get

\displaystyle\left\{\begin{aligned} B_{\psi 0}&=-2\psi_{1}h^{-3}+(\dot{\psi}_{0}+\dot{\psi}_{1})h^{-2}\\ C_{\psi 0}&=3\psi_{1}h^{-2}-(2\dot{\psi}_{0}+\dot{\psi}_{1})h^{-1}\\ D_{\psi 0}&=\dot{\psi}_{0}\end{aligned}\right.

(4.4.11)

\displaystyle T^{(0)}=\begin{pmatrix}-2h^{-3}&h^{-2}&h^{-2}\\ 3h^{-2}&-2h^{-1}&-h^{-1}\\ 0&1&0\end{pmatrix},

(4.4.12)

so that $(B_{\psi 0},C_{\psi 0},D_{\psi 0})^{\rm T}=T^{(0)}(\psi_{1},\dot{\psi}_{0},\dot{\psi}_{1})^{\rm T}$ .
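The closed forms in Eqs. (4.4.10) and (4.4.12) can be checked with exact rational arithmetic. The following is a minimal sketch, not part of the original implementation: the function names (`T_matrix`, `T0_matrix`, `node_values`, `roundtrip`, `roundtrip0`) are hypothetical, and the check simply samples a piece $r(D+Cr+Br^{2}+Ar^{3})$ at its two end nodes and confirms that the transformation matrix returns the original coefficients.

```python
from fractions import Fraction as F

def T_matrix(i, h):
    """Closed-form T^(i) of Eq. (4.4.10) for i > 0.
    Rows: (A, B, C, D); columns: (psi_i, psi_{i+1}, dpsi_i, dpsi_{i+1})."""
    i, h = F(i), F(h)
    j = i + 1
    return [
        [(2*i - 1)/i**2/h**4, -(2*i + 3)/j**2/h**4, 1/(i*h**3), 1/(j*h**3)],
        [-2*(3*i**2 - 1)/i**2/h**3, 2*(3*i**2 + 6*i + 2)/j**2/h**3,
         -(3*i + 2)/(i*h**2), -(3*i + 1)/(j*h**2)],
        [j*(6*i**2 - 3*i - 1)/i**2/h**2, -i*(6*i**2 + 15*i + 8)/j**2/h**2,
         j*(3*i + 1)/(i*h), i*(3*i + 2)/(j*h)],
        [-2*j**2*(i - 1)/(i*h), 2*i**2*(i + 2)/(j*h), -j**2, -i**2],
    ]

def T0_matrix(h):
    """Closed-form T^(0) of Eq. (4.4.12).
    Rows: (B, C, D); columns: (psi_1, dpsi_0, dpsi_1)."""
    h = F(h)
    return [[-2/h**3, 1/h**2, 1/h**2], [3/h**2, -2/h, -1/h], [F(0), F(1), F(0)]]

def node_values(coeffs, r):
    """psi(r) and dpsi/dr for psi = r (D + C r + B r^2 + A r^3)."""
    A, B, C, D = coeffs
    psi = r*(D + C*r + B*r**2 + A*r**3)
    dpsi = D + 2*C*r + 3*B*r**2 + 4*A*r**3
    return psi, dpsi

def roundtrip(coeffs, i, h):
    """Sample a piece at r_i and r_{i+1}, then recover (A, B, C, D) via T^(i)."""
    i, h = F(i), F(h)
    p0, d0 = node_values(coeffs, i*h)
    p1, d1 = node_values(coeffs, (i + 1)*h)
    v = [p0, p1, d0, d1]
    return [sum(t*x for t, x in zip(row, v)) for row in T_matrix(i, h)]

def roundtrip0(coeffs, h):
    """Same check for the first piece, where A = 0 and psi_0 = 0."""
    B, C, D = coeffs
    h = F(h)
    p1, d1 = node_values([F(0), B, C, D], h)
    v = [p1, D, d1]  # (psi_1, dpsi_0, dpsi_1); dpsi_0 equals D
    return [sum(t*x for t, x in zip(row, v)) for row in T0_matrix(h)]
```

Because `Fraction` arithmetic is exact, any mismatch would indicate a transcription error rather than round-off.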

The cost function is defined as (subscript “H” stands for hydrogen atom)

\displaystyle\epsilon_{\rm H}(\{\psi_{i}\},\{\dot{\psi}_{i}\};\{\phi_{i}\},\{\dot{\phi}_{i}\};h)

\displaystyle=\sum_{i=0}^{N-1}\epsilon_{{\rm H},i}(\psi_{i},\dot{\psi}_{i},\psi_{i+1},\dot{\psi}_{i+1};\phi_{i},\dot{\phi}_{i},\phi_{i+1},\dot{\phi}_{i+1};r_{i},r_{i+1})+\epsilon_{{\rm H},N}(\psi_{N};\phi_{N};r_{N});

(4.4.13)

for simplicity, in the following text we omit parameters of $\epsilon_{{\rm H},i}$ , which is

$\displaystyle\epsilon_{{\rm H},i}$	$\displaystyle=\int_{r_{i}}^{r_{i+1}}(\ddot{\psi}+\frac{2}{r}\psi+\phi)^{2}\,{\rm d}r=\int_{r_{i}}^{r_{i+1}}\left[\begin{aligned} &(2C_{\psi i}+2D_{\psi i})+(6B_{\psi i}+2C_{\psi i}+D_{\phi i})r\\ &+(12A_{\psi i}+2B_{\psi i}+C_{\phi i})r^{2}+(2A_{\psi i}+B_{\phi i})r^{3}+A_{\phi i}r^{4}\end{aligned}\right]^{2}\,{\rm d}r$
	$\displaystyle=\int_{ih}^{(i+1)h}\left[\begin{aligned} &4(C_{\psi i}+D_{\psi i})^{2}+4(6B_{\psi i}+2C_{\psi i}+D_{\phi i})(C_{\psi i}+D_{\psi i})r\\ &+[(6B_{\psi i}+2C_{\psi i}+D_{\phi i})^{2}+4(12A_{\psi i}+2B_{\psi i}+C_{\phi i})(C_{\psi i}+D_{\psi i})]r^{2}\\ &+[2(12A_{\psi i}+2B_{\psi i}+C_{\phi i})(6B_{\psi i}+2C_{\psi i}+D_{\phi i})+4(2A_{\psi i}+B_{\phi i})(C_{\psi i}+D_{\psi i})]r^{3}\\ &+[(12A_{\psi i}+2B_{\psi i}+C_{\phi i})^{2}+2(2A_{\psi i}+B_{\phi i})(6B_{\psi i}+2C_{\psi i}+D_{\phi i})+4A_{\phi i}(C_{\psi i}+D_{\psi i})]r^{4}\\ &+[2(2A_{\psi i}+B_{\phi i})(12A_{\psi i}+2B_{\psi i}+C_{\phi i})+2A_{\phi i}(6B_{\psi i}+2C_{\psi i}+D_{\phi i})]r^{5}\\ &+[(2A_{\psi i}+B_{\phi i})^{2}+2A_{\phi i}(12A_{\psi i}+2B_{\psi i}+C_{\phi i})]r^{6}+2A_{\phi i}(2A_{\psi i}+B_{\phi i})r^{7}+A_{\phi i}^{2}r^{8}\end{aligned}\right]\,{\rm d}r$
	$\displaystyle=\left[\begin{aligned} &4(C_{\psi i}+D_{\psi i})^{2}h+2(6B_{\psi i}+2C_{\psi i}+D_{\phi i})(C_{\psi i}+D_{\psi i})d(i,2)h^{2}\\ &+\frac{1}{3}[(6B_{\psi i}+2C_{\psi i}+D_{\phi i})^{2}+4(12A_{\psi i}+2B_{\psi i}+C_{\phi i})(C_{\psi i}+D_{\psi i})]d(i,3)h^{3}\\ &+\frac{1}{4}[2(12A_{\psi i}+2B_{\psi i}+C_{\phi i})(6B_{\psi i}+2C_{\psi i}+D_{\phi i})+4(2A_{\psi i}+B_{\phi i})(C_{\psi i}+D_{\psi i})]d(i,4)h^{4}\\ &+\frac{1}{5}[(12A_{\psi i}+2B_{\psi i}+C_{\phi i})^{2}+2(2A_{\psi i}+B_{\phi i})(6B_{\psi i}+2C_{\psi i}+D_{\phi i})+4A_{\phi i}(C_{\psi i}+D_{\psi i})]d(i,5)h^{5}\\ &+\frac{1}{6}[2(2A_{\psi i}+B_{\phi i})(12A_{\psi i}+2B_{\psi i}+C_{\phi i})+2A_{\phi i}(6B_{\psi i}+2C_{\psi i}+D_{\phi i})]d(i,6)h^{6}\\ &+\frac{1}{7}[(2A_{\psi i}+B_{\phi i})^{2}+2A_{\phi i}(12A_{\psi i}+2B_{\psi i}+C_{\phi i})]d(i,7)h^{7}+\frac{1}{4}(2A_{\phi i}A_{\psi i}+A_{\phi i}B_{\phi i})d(i,8)h^{8}+\frac{1}{9}A_{\phi i}^{2}d(i,9)h^{9}\end{aligned}\right],$	(4.4.14)

where $d(i,n)\equiv(i+1)^{n}-i^{n}$ , for $i=0,1,\ldots,N-1$ ; $\epsilon_{{\rm H},N}$ will be addressed later. Again for convenience, we define $\epsilon_{{\rm H},-1}\equiv 0$ .

Partial derivatives of $\epsilon_{{\rm H},i}$ with respect to $A_{\phi i}$ , $B_{\phi i}$ , $C_{\phi i}$ , and $D_{\phi i}$ are

\displaystyle\left\{\begin{aligned} \frac{\partial\epsilon_{{\rm H},i}}{\partial A_{\phi i}}&=\left[\begin{aligned} &\frac{4}{5}(C_{\psi i}+D_{\psi i})d(i,5)h^{5}+\frac{1}{3}(6B_{\psi i}+2C_{\psi i}+D_{\phi i})d(i,6)h^{6}\\ &+\frac{2}{7}(12A_{\psi i}+2B_{\psi i}+C_{\phi i})d(i,7)h^{7}+\frac{1}{2}A_{\psi i}d(i,8)h^{8}+\frac{1}{4}B_{\phi i}d(i,8)h^{8}+\frac{2}{9}A_{\phi i}d(i,9)h^{9}\end{aligned}\right]\\ \frac{\partial\epsilon_{{\rm H},i}}{\partial B_{\phi i}}&=\left[\begin{aligned} &(C_{\psi i}+D_{\psi i})d(i,4)h^{4}+\frac{2}{5}(6B_{\psi i}+2C_{\psi i}+D_{\phi i})d(i,5)h^{5}\\ &+\frac{1}{3}(12A_{\psi i}+2B_{\psi i}+C_{\phi i})d(i,6)h^{6}+\frac{2}{7}(2A_{\psi i}+B_{\phi i})d(i,7)h^{7}+\frac{1}{4}A_{\phi i}d(i,8)h^{8}\end{aligned}\right]\\ \frac{\partial\epsilon_{{\rm H},i}}{\partial C_{\phi i}}&=\left[\begin{aligned} &\frac{4}{3}(C_{\psi i}+D_{\psi i})d(i,3)h^{3}+\frac{1}{2}(6B_{\psi i}+2C_{\psi i}+D_{\phi i})d(i,4)h^{4}\\ &+\frac{2}{5}(12A_{\psi i}+2B_{\psi i}+C_{\phi i})d(i,5)h^{5}+\frac{1}{3}(2A_{\psi i}+B_{\phi i})d(i,6)h^{6}+\frac{2}{7}A_{\phi i}d(i,7)h^{7}\end{aligned}\right]\\ \frac{\partial\epsilon_{{\rm H},i}}{\partial D_{\phi i}}&=\left[\begin{aligned} &2(C_{\psi i}+D_{\psi i})d(i,2)h^{2}+\frac{2}{3}(6B_{\psi i}+2C_{\psi i}+D_{\phi i})d(i,3)h^{3}\\ &+\frac{1}{2}(12A_{\psi i}+2B_{\psi i}+C_{\phi i})d(i,4)h^{4}+\frac{2}{5}(2A_{\psi i}+B_{\phi i})d(i,5)h^{5}+\frac{1}{3}A_{\phi i}d(i,6)h^{6}\end{aligned}\right]\end{aligned}\right.,

(4.4.15)

respectively; put in matrix form, these are

\displaystyle\begin{pmatrix}\partial/\partial A_{\phi i}\\ \partial/\partial B_{\phi i}\\ \partial/\partial C_{\phi i}\\ \partial/\partial D_{\phi i}\end{pmatrix}\epsilon_{{\rm H},i}\equiv\bar{P}^{(i)}\bar{\boldsymbol{\phi}}^{(i)}+\bar{Q}^{(i)}\bar{\boldsymbol{\psi}}^{(i)}=\bar{P}^{(i)}\begin{pmatrix}A_{\phi i}\\ B_{\phi i}\\ C_{\phi i}\\ D_{\phi i}\end{pmatrix}+\bar{Q}^{(i)}\begin{pmatrix}A_{\psi i}\\ B_{\psi i}\\ C_{\psi i}\\ D_{\psi i}\end{pmatrix}

(4.4.16)

with

\displaystyle\left\{\begin{aligned} \bar{P}^{(i)}&=\begin{pmatrix}2d(i,9)h^{9}/9&d(i,8)h^{8}/4&2d(i,7)h^{7}/7&d(i,6)h^{6}/3\\ d(i,8)h^{8}/4&2d(i,7)h^{7}/7&d(i,6)h^{6}/3&2d(i,5)h^{5}/5\\ 2d(i,7)h^{7}/7&d(i,6)h^{6}/3&2d(i,5)h^{5}/5&d(i,4)h^{4}/2\\ d(i,6)h^{6}/3&2d(i,5)h^{5}/5&d(i,4)h^{4}/2&2d(i,3)h^{3}/3\end{pmatrix}\\ \bar{Q}^{(i)}&=\begin{pmatrix}\dfrac{24}{7}d(i,7)h^{7}+\dfrac{1}{2}d(i,8)h^{8}&2d(i,6)h^{6}+\dfrac{4}{7}d(i,7)h^{7}&\dfrac{4}{5}d(i,5)h^{5}+\dfrac{2}{3}d(i,6)h^{6}&\dfrac{4}{5}d(i,5)h^{5}\\ 4d(i,6)h^{6}+\dfrac{4}{7}d(i,7)h^{7}&\dfrac{12}{5}d(i,5)h^{5}+\dfrac{2}{3}d(i,6)h^{6}&d(i,4)h^{4}+\dfrac{4}{5}d(i,5)h^{5}&d(i,4)h^{4}\\ \dfrac{24}{5}d(i,5)h^{5}+\dfrac{2}{3}d(i,6)h^{6}&3d(i,4)h^{4}+\dfrac{4}{5}d(i,5)h^{5}&\dfrac{4}{3}d(i,3)h^{3}+d(i,4)h^{4}&\dfrac{4}{3}d(i,3)h^{3}\\ 6d(i,4)h^{4}+\dfrac{4}{5}d(i,5)h^{5}&4d(i,3)h^{3}+d(i,4)h^{4}&2d(i,2)h^{2}+\dfrac{4}{3}d(i,3)h^{3}&2d(i,2)h^{2}\end{pmatrix}\end{aligned}\right..

(4.4.17)
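One compact way to reproduce $\bar{P}^{(i)}$ and $\bar{Q}^{(i)}$ is to note that the residual $\ddot{\psi}+2\psi/r+\phi$ is a quartic in $r$ whose coefficients depend linearly on $\bar{\boldsymbol{\psi}}^{(i)}$ and $\bar{\boldsymbol{\phi}}^{(i)}$ . The sketch below is an illustration under that observation, not the code used for this work; the names `M_PSI`, `M_PHI`, `gram`, and `pbar_qbar` are hypothetical. With $G_{jk}=\int r^{j+k}\,{\rm d}r=d(i,j+k+1)h^{j+k+1}/(j+k+1)$ , the gradient with respect to $\bar{\boldsymbol{\phi}}^{(i)}$ gives $\bar{P}=2M_{\phi}^{\rm T}GM_{\phi}$ and $\bar{Q}=2M_{\phi}^{\rm T}GM_{\psi}$ .

```python
from fractions import Fraction as F

# Residual = psi'' + (2/r) psi + phi, a quartic in r on each piece.
# Rows: powers r^0 .. r^4; columns: coefficients (A, B, C, D).
M_PSI = [[0, 0, 2, 2], [0, 6, 2, 0], [12, 2, 0, 0], [2, 0, 0, 0], [0, 0, 0, 0]]
M_PHI = [[0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]]

def d(i, n):
    """d(i, n) = (i + 1)^n - i^n, as defined after Eq. (4.4.14)."""
    return (i + 1)**n - i**n

def gram(i, h):
    """G[j][k] = integral of r^(j+k) over [i h, (i+1) h]."""
    return [[F(d(i, j + k + 1)) * F(h)**(j + k + 1) / (j + k + 1)
             for k in range(5)] for j in range(5)]

def matmul(X, Y):
    """Plain list-of-lists matrix product."""
    return [[sum(x*y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def pbar_qbar(i, h):
    """Return (Pbar, Qbar) of Eq. (4.4.17) as exact 4x4 rational matrices."""
    G2 = [[2*g for g in row] for row in gram(i, h)]
    left = matmul(transpose(M_PHI), G2)
    return matmul(left, M_PHI), matmul(left, M_PSI)
```

For $i=0$ the same construction applies after deleting the $A$ column of both coefficient maps, which is equivalent to dropping the first rows and columns of the resulting matrices.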

For the special case of $i=0$ , we simply need to drop the first rows and first columns of $\bar{P}^{(0)}$ and $\bar{Q}^{(0)}$ . Then partial derivatives of $\epsilon_{{\rm H},i}$ with respect to $\phi_{i}$ , $\phi_{i+1}$ , $\dot{\phi}_{i}$ , and $\dot{\phi}_{i+1}$ can be succinctly expressed as

	$\displaystyle\begin{pmatrix}\partial/\partial\phi_{i}\\ \partial/\partial\phi_{i+1}\\ \partial/\partial\dot{\phi}_{i}\\ \partial/\partial\dot{\phi}_{i+1}\end{pmatrix}\epsilon_{{\rm H},i}$	$\displaystyle=[T^{(i)}]^{\rm T}\begin{pmatrix}\partial/\partial A_{\phi i}\\ \partial/\partial B_{\phi i}\\ \partial/\partial C_{\phi i}\\ \partial/\partial D_{\phi i}\end{pmatrix}\epsilon_{{\rm H},i}=[T^{(i)}]^{\rm T}(\bar{P}^{(i)}\bar{\boldsymbol{\phi}}^{(i)}+\bar{Q}^{(i)}\bar{\boldsymbol{\psi}}^{(i)})$
		$\displaystyle=([T^{(i)}]^{\rm T}\bar{P}^{(i)}T^{(i)}){\boldsymbol{\phi}}^{(i)}+([T^{(i)}]^{\rm T}\bar{Q}^{(i)}T^{(i)}){\boldsymbol{\psi}}^{(i)}\equiv P^{(i)}{\boldsymbol{\phi}}^{(i)}+Q^{(i)}{\boldsymbol{\psi}}^{(i)}.$		(4.4.18)

As promised, we now address $\epsilon_{{\rm H},N}$ , which corresponds to the tail. Given our assumed functional form Eq. (4.4.5), this should be

$\displaystyle\epsilon_{{\rm H},N}(\psi_{N};\phi_{N};r_{N})$	$\displaystyle=\int_{r_{N}}^{\infty}(\ddot{\psi}+\frac{2}{r}\psi+\phi)^{2}\,{\rm d}r=\int_{r_{N}}^{\infty}\left[\left(\psi_{N}\frac{r-2r_{N}}{r_{N}^{3}}+\psi_{N}\frac{2}{r_{N}}+\phi_{N}\frac{r}{r_{N}}\right)\exp\left(1-\frac{r}{r_{N}}\right)\right]^{2}\,{\rm d}r$
	$\displaystyle=\int_{r_{N}}^{\infty}\left[\left\{2\frac{r_{N}-1}{r_{N}^{2}}\psi_{N}+\left(\frac{\phi_{N}}{r_{N}}+\frac{\psi_{N}}{r_{N}^{3}}\right)r\right\}\exp\left(1-\frac{r}{r_{N}}\right)\right]^{2}\,{\rm d}r$
	$\displaystyle=\frac{r_{N}}{4}\left[2\left(2\frac{r_{N}-1}{r_{N}^{2}}\psi_{N}\right)^{2}+6\left(2\frac{r_{N}-1}{r_{N}^{2}}\psi_{N}\right)\left(\frac{\phi_{N}}{r_{N}}+\frac{\psi_{N}}{r_{N}^{3}}\right)r_{N}+5\left(\frac{\phi_{N}}{r_{N}}+\frac{\psi_{N}}{r_{N}^{3}}\right)^{2}r_{N}^{2}\right]$
	$\displaystyle=\frac{5r_{N}}{4}\phi_{N}^{2}+\left(3-\frac{1}{2r_{N}}\right)\phi_{N}\psi_{N}+\left(\frac{2}{r_{N}}-\frac{1}{r_{N}^{2}}+\frac{1}{4r_{N}^{3}}\right)\psi_{N}^{2},$	(4.4.19)

and its partial derivative with respect to $\phi_{N}$ is

\displaystyle\frac{\partial\epsilon_{{\rm H},N}}{\partial\phi_{N}}=\frac{5r_{N}}{2}\phi_{N}+\left(3-\frac{1}{2r_{N}}\right)\psi_{N}\equiv P^{(N)}\phi_{N}+Q^{(N)}\psi_{N},

(4.4.20)

where $P^{(N)}$ and $Q^{(N)}$ are both $1\times 1$ matrices.
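The closed form in Eq. (4.4.19) can be sanity-checked against direct quadrature of the tail residual. The following is a minimal sketch with hypothetical function names; it assumes a composite Simpson rule on a truncated interval, which is adequate here because the integrand decays like $\exp(-2r/r_{N})$ .

```python
import math

def tail_cost_closed(psi_N, phi_N, r_N):
    """Closed form of the tail contribution, Eq. (4.4.19)."""
    return (1.25*r_N*phi_N**2
            + (3 - 0.5/r_N)*phi_N*psi_N
            + (2/r_N - 1/r_N**2 + 0.25/r_N**3)*psi_N**2)

def tail_cost_numeric(psi_N, phi_N, r_N, span=40.0, n=40000):
    """Composite Simpson rule on [r_N, (1 + span) r_N]; n must be even."""
    def integrand(r):
        env = math.exp(1 - r/r_N)
        d2psi = psi_N*(r - 2*r_N)/r_N**3*env   # second derivative of the tail
        coulomb = 2*psi_N/r_N*env              # (2/r) psi
        phi = phi_N*r/r_N*env
        return (d2psi + coulomb + phi)**2
    a, b = r_N, (1 + span)*r_N
    step = (b - a)/n
    total = integrand(a) + integrand(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2)*integrand(a + k*step)
    return total*step/3

def tail_optimal_phi(psi_N, r_N):
    """phi_N that zeroes Eq. (4.4.20): phi_N = -Q^(N) psi_N / P^(N)."""
    return -(3 - 0.5/r_N)*psi_N/(2.5*r_N)
```

The last helper only illustrates the stationarity condition of the tail term in isolation; in the full method $\phi_{N}$ is also coupled to the last polynomial piece.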

To minimize the cost function Eq. (4.4.13), we have

\displaystyle\begin{pmatrix}\partial/\partial\phi_{1}\\ \vdots\\ \partial/\partial\phi_{N}\\ \partial/\partial\dot{\phi}_{0}\\ \vdots\\ \partial/\partial\dot{\phi}_{N}\end{pmatrix}\epsilon_{\rm H}=\left[\begin{aligned} &\begin{pmatrix}P_{11}&\cdots&P_{1N}&P_{1,N+1}&\cdots&P_{1,2N+1}\\ \vdots&\ddots&\vdots&\vdots&\ddots&\vdots\\ P_{N1}&\cdots&P_{NN}&P_{N,N+1}&\cdots&P_{N,2N+1}\\ P_{N+1,1}&\cdots&P_{N+1,N}&P_{N+1,N+1}&\cdots&P_{N+1,2N+1}\\ \vdots&\ddots&\vdots&\vdots&\ddots&\vdots\\ P_{2N+1,1}&\cdots&P_{2N+1,N}&P_{2N+1,N+1}&\cdots&P_{2N+1,2N+1}\end{pmatrix}\begin{pmatrix}\phi_{1}\\ \vdots\\ \phi_{N}\\ \dot{\phi}_{0}\\ \vdots\\ \dot{\phi}_{N}\end{pmatrix}\\ &+\begin{pmatrix}Q_{11}&\cdots&Q_{1N}&Q_{1,N+1}&\cdots&Q_{1,2N+1}\\ \vdots&\ddots&\vdots&\vdots&\ddots&\vdots\\ Q_{N1}&\cdots&Q_{NN}&Q_{N,N+1}&\cdots&Q_{N,2N+1}\\ Q_{N+1,1}&\cdots&Q_{N+1,N}&Q_{N+1,N+1}&\cdots&Q_{N+1,2N+1}\\ \vdots&\ddots&\vdots&\vdots&\ddots&\vdots\\ Q_{2N+1,1}&\cdots&Q_{2N+1,N}&Q_{2N+1,N+1}&\cdots&Q_{2N+1,2N+1}\end{pmatrix}\begin{pmatrix}\psi_{1}\\ \vdots\\ \psi_{N}\\ \dot{\psi}_{0}\\ \vdots\\ \dot{\psi}_{N}\end{pmatrix}\end{aligned}\right]=\begin{pmatrix}0\\ \vdots\\ 0\\ 0\\ \vdots\\ 0\end{pmatrix};

(4.4.21)

since

\displaystyle\left\{\begin{aligned} \frac{\partial\epsilon_{\rm H}}{\partial\phi_{i}}=\frac{\partial\epsilon_{{\rm H},i-1}}{\partial\phi_{i}}+\frac{\partial\epsilon_{{\rm H},i}}{\partial\phi_{i}}\\ \frac{\partial\epsilon_{\rm H}}{\partial\dot{\phi}_{i}}=\frac{\partial\epsilon_{{\rm H},i-1}}{\partial\dot{\phi}_{i}}+\frac{\partial\epsilon_{{\rm H},i}}{\partial\dot{\phi}_{i}}\end{aligned}\right.,

(4.4.22)

the $(2N+1)\times(2N+1)$ $P$ and $Q$ matrices can be constructed from scratch (zero matrix) by doing

\displaystyle\left\{\begin{aligned} \begin{pmatrix}P_{1,1}&P_{1,N+1}&P_{1,N+2}\\ P_{N+1,1}&P_{N+1,N+1}&P_{N+1,N+2}\\ P_{N+2,1}&P_{N+2,N+1}&P_{N+2,N+2}\end{pmatrix}+\!\!=P^{(0)}\\ \begin{pmatrix}Q_{1,1}&Q_{1,N+1}&Q_{1,N+2}\\ Q_{N+1,1}&Q_{N+1,N+1}&Q_{N+1,N+2}\\ Q_{N+2,1}&Q_{N+2,N+1}&Q_{N+2,N+2}\end{pmatrix}+\!\!=Q^{(0)}\end{aligned}\right.

(4.4.23)

for $i=0$ , and then

\displaystyle\left\{\begin{aligned} \begin{pmatrix}P_{i,i}&P_{i,i+1}&P_{i,(N+1)+i}&P_{i,(N+1)+i+1}\\ P_{i+1,i}&P_{i+1,i+1}&P_{i+1,(N+1)+i}&P_{i+1,(N+1)+i+1}\\ P_{(N+1)+i,i}&P_{(N+1)+i,i+1}&P_{(N+1)+i,(N+1)+i}&P_{(N+1)+i,(N+1)+i+1}\\ P_{(N+1)+i+1,i}&P_{(N+1)+i+1,i+1}&P_{(N+1)+i+1,(N+1)+i}&P_{(N+1)+i+1,(N+1)+i+1}\end{pmatrix}+\!\!=P^{(i)}\\ \begin{pmatrix}Q_{i,i}&Q_{i,i+1}&Q_{i,(N+1)+i}&Q_{i,(N+1)+i+1}\\ Q_{i+1,i}&Q_{i+1,i+1}&Q_{i+1,(N+1)+i}&Q_{i+1,(N+1)+i+1}\\ Q_{(N+1)+i,i}&Q_{(N+1)+i,i+1}&Q_{(N+1)+i,(N+1)+i}&Q_{(N+1)+i,(N+1)+i+1}\\ Q_{(N+1)+i+1,i}&Q_{(N+1)+i+1,i+1}&Q_{(N+1)+i+1,(N+1)+i}&Q_{(N+1)+i+1,(N+1)+i+1}\end{pmatrix}+\!\!=Q^{(i)}\end{aligned}\right.,

(4.4.24)

for $i=1,2,\ldots,N-1$ , and finally

\displaystyle\left\{\begin{aligned} \begin{pmatrix}P_{N,N}\end{pmatrix}+\!\!=P^{(N)}\\ \begin{pmatrix}Q_{N,N}\end{pmatrix}+\!\!=Q^{(N)}\end{aligned}\right.

(4.4.25)

for $i=N$ .
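The three assembly steps of Eqs. (4.4.23) to (4.4.25) amount to a scatter-add of local blocks into a global matrix. A minimal sketch of that bookkeeping follows, assuming the unknowns are ordered as $(\phi_{1},\ldots,\phi_{N},\dot{\phi}_{0},\ldots,\dot{\phi}_{N})$ with the 1-based positions used in the text; the function names are hypothetical and $N\geq 1$ is assumed.

```python
def zeros(n):
    """n x n matrix of zeros as a list of lists."""
    return [[0.0]*n for _ in range(n)]

def global_indices(i, N):
    """1-based positions touched by piece i in the (2N+1)-vector of unknowns."""
    if i == 0:
        return [1, N + 1, N + 2]        # phi_1, dphi_0, dphi_1
    if i == N:
        return [N]                      # tail: phi_N only
    return [i, i + 1, (N + 1) + i, (N + 1) + i + 1]

def scatter_add(M, idx, block):
    """M[idx[a]][idx[b]] += block[a][b] for all a, b (0-based idx)."""
    for a, ia in enumerate(idx):
        for b, ib in enumerate(idx):
            M[ia][ib] += block[a][b]

def assemble(N, blocks):
    """blocks[i] is the local matrix of piece i (i = 0..N, the last being the tail).
    Returns the (2N+1) x (2N+1) global matrix."""
    M = zeros(2*N + 1)
    for i, block in enumerate(blocks):
        idx = [k - 1 for k in global_indices(i, N)]   # convert to 0-based
        scatter_add(M, idx, block)
    return M
```

The same routine would be called once with the $P^{(i)}$ blocks and once with the $Q^{(i)}$ blocks.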

Because of our definition of the tail, we need to enforce the $\dot{\psi}_{N}=0$ constraint if we want to maintain the continuity of first derivative at $r_{N}$ . In this case, simply removing the corresponding rows and columns from $P$ and $Q$ matrices constructed above would lead to erroneous results, as when four coefficients ( $A_{\psi,N-1}$ , $B_{\psi,N-1}$ , $C_{\psi,N-1}$ , and $D_{\psi,N-1}$ ; similar for $\phi$ ) are fully specified by three parameters ( $\psi_{N-1}$ , $\psi_{N}$ , and $\dot{\psi}_{N-1}$ ; similar for $\phi$ ), the inverse transformation may not be well defined — the situation is basically the same as in Section 2.3.

Therefore, when $\dot{\psi}_{i+1}=0$ and $\dot{\phi}_{i+1}=0$ , we have to plug the two sets of three parameters into $\epsilon_{{\rm H},i}$ Eq. (4.4) to obtain (here the prime “′” marks quantities subject to the constraints mentioned above)

\displaystyle\left\{\begin{aligned} P^{\prime}{}^{(i)}&=\begin{pmatrix}P^{\prime}{}^{(i)}_{11}&P^{\prime}{}^{(i)}_{12}&P^{\prime}{}^{(i)}_{13}\\ P^{\prime}{}^{(i)}_{21}&P^{\prime}{}^{(i)}_{22}&P^{\prime}{}^{(i)}_{23}\\ P^{\prime}{}^{(i)}_{31}&P^{\prime}{}^{(i)}_{32}&P^{\prime}{}^{(i)}_{33}\end{pmatrix}\\ Q^{\prime}{}^{(i)}&=\begin{pmatrix}Q^{\prime}{}^{(i)}_{11}&Q^{\prime}{}^{(i)}_{12}&Q^{\prime}{}^{(i)}_{13}\\ Q^{\prime}{}^{(i)}_{21}&Q^{\prime}{}^{(i)}_{22}&Q^{\prime}{}^{(i)}_{23}\\ Q^{\prime}{}^{(i)}_{31}&Q^{\prime}{}^{(i)}_{32}&Q^{\prime}{}^{(i)}_{33}\end{pmatrix}\end{aligned}\right.

(4.4.26)

with

\displaystyle\left\{\begin{aligned} P^{\prime}{}^{(i)}_{11}&=\frac{234i^{4}+42i^{3}-17i^{2}-4i+1}{315i^{4}}h\\ P^{\prime}{}^{(i)}_{22}&=\frac{468i^{7}+1788i^{6}+2522i^{5}+1560i^{4}+360i^{3}}{630i^{3}(i+1)^{4}}h\\ P^{\prime}{}^{(i)}_{12}=P^{\prime}{}^{(i)}_{21}&=\frac{162i^{5}+324i^{4}+154i^{3}-8i^{2}-15i}{630i^{3}(i+1)^{2}}h\\ P^{\prime}{}^{(i)}_{13}=P^{\prime}{}^{(i)}_{31}&=\frac{132i^{3}+60i^{2}-i-4}{1260i^{3}}h^{2}\\ P^{\prime}{}^{(i)}_{23}=P^{\prime}{}^{(i)}_{32}&=\frac{78i^{4}+180i^{3}+127i^{2}+30i}{1260i^{2}(i+1)^{2}}h^{2}\\ P^{\prime}{}^{(i)}_{33}&=\frac{12i^{2}+9i+2}{630i^{2}}h^{3}\end{aligned}\right.

(4.4.27)

and

\displaystyle\left\{\begin{aligned} Q^{\prime}{}^{(i)}_{11}&=\frac{-8(63i^{4}+4i^{2}-4i+1)h^{-1}+(312i^{3}-16i^{2}-20i+3)}{210i^{4}}\\ Q^{\prime}{}^{(i)}_{22}&=\frac{-8(63i^{4}+252i^{3}+382i^{2}+264i+72)h^{-1}+(312i^{3}+952i^{2}+948i+305)}{210(i+1)^{4}}\\ Q^{\prime}{}^{(i)}_{12}=Q^{\prime}{}^{(i)}_{21}&=\frac{8(63i^{4}+126i^{3}+67i^{2}+4i-3)h^{-1}+(108i^{3}+162i^{2}+20i-17)}{210i^{2}(i+1)^{2}}\\ Q^{\prime}{}^{(i)}_{13}&=\frac{-2(231i^{3}+14i^{2}+i-4)+(44i^{2}+6i-3)h}{210i^{3}}\\ Q^{\prime}{}^{(i)}_{31}&=\frac{-2(21i^{3}+14i^{2}+i-4)+(44i^{2}+6i-3)h}{210i^{3}}\\ Q^{\prime}{}^{(i)}_{23}=Q^{\prime}{}^{(i)}_{32}&=\frac{2(21i^{3}+56i^{2}+50i+12)+(26i^{2}+46i+17)h}{210i(i+1)^{2}}\\ Q^{\prime}{}^{(i)}_{33}&=\frac{-4(14i^{2}+7i+2)h+(8i+3)h^{2}}{210i^{2}}\end{aligned}\right..

(4.4.28)

Note that although only the $i=N-1$ version of the above expressions is used in this work, we have written the general version for $i\neq 0$ .
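Individual entries of Eqs. (4.4.27) and (4.4.28) can be cross-checked numerically. The sketch below is one possible check, not the derivation used for this work: it assumes that plugging the three remaining parameters into $\epsilon_{{\rm H},i}$ corresponds to using only the first three columns of $T^{(i)}$ (those multiplying $\psi_{i}$ , $\psi_{i+1}$ , and $\dot{\psi}_{i}$ ). All function names are hypothetical, and $T^{(i)}$ is obtained by numerically inverting the end-point conditions rather than from Eq. (4.4.10).

```python
import numpy as np

def T_full(i, h):
    """Numerical T^(i) for i > 0: maps (psi_i, psi_{i+1}, dpsi_i, dpsi_{i+1})
    to the coefficients (A, B, C, D) of psi = A r^4 + B r^3 + C r^2 + D r."""
    def value_row(r):
        return [r**4, r**3, r**2, r]
    def slope_row(r):
        return [4*r**3, 3*r**2, 2*r, 1.0]
    r0, r1 = i*h, (i + 1)*h
    V = np.array([value_row(r0), value_row(r1), slope_row(r0), slope_row(r1)])
    return np.linalg.inv(V)

def pbar_qbar(i, h):
    """Numerical counterparts of Eq. (4.4.17), from a monomial Gram matrix."""
    # Rows: residual powers r^0..r^4; columns: (A, B, C, D).
    M_psi = np.array([[0, 0, 2, 2], [0, 6, 2, 0], [12, 2, 0, 0],
                      [2, 0, 0, 0], [0, 0, 0, 0]], float)
    M_phi = np.array([[0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0],
                      [0, 1, 0, 0], [1, 0, 0, 0]], float)
    p = np.arange(5)
    s = p[:, None] + p[None, :] + 1
    G = (((i + 1)*h)**s - (i*h)**s) / s     # integrals of r^(j+k) over the piece
    return 2*M_phi.T @ G @ M_phi, 2*M_phi.T @ G @ M_psi

def constrained_blocks(i, h):
    """3x3 blocks with dpsi_{i+1} = dphi_{i+1} = 0 imposed on piece i."""
    Tc = T_full(i, h)[:, :3]                # keep psi_i, psi_{i+1}, dpsi_i
    Pbar, Qbar = pbar_qbar(i, h)
    return Tc.T @ Pbar @ Tc, Tc.T @ Qbar @ Tc
```

For $i=1$ and $h=1$ , Eqs. (4.4.27) and (4.4.28) give $P^{\prime}{}^{(1)}_{11}=256/315$ , $P^{\prime}{}^{(1)}_{33}=23/630$ , $Q^{\prime}{}^{(1)}_{13}=-437/210$ , and $Q^{\prime}{}^{(1)}_{31}=-17/210$ , which this construction should reproduce to round-off.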

To construct the $2N\times 2N$ $P$ and $Q$ matrices from scratch, the procedure is the same as when we do not enforce $\dot{\psi}_{i+1}=0$ and $\dot{\phi}_{i+1}=0$ , except for the $(N-1)$ st step, which needs to be substituted by

\displaystyle\left\{\begin{aligned} \begin{pmatrix}P_{N-1,N-1}&P_{N-1,N}&P_{N-1,2N}\\ P_{N,N-1}&P_{N,N}&P_{N,2N}\\ P_{2N,N-1}&P_{2N,N}&P_{2N,2N}\end{pmatrix}+\!\!=P^{\prime}{}^{(N-1)}\\ \begin{pmatrix}Q_{N-1,N-1}&Q_{N-1,N}&Q_{N-1,2N}\\ Q_{N,N-1}&Q_{N,N}&Q_{N,2N}\\ Q_{2N,N-1}&Q_{2N,N}&Q_{2N,2N}\end{pmatrix}+\!\!=Q^{\prime}{}^{(N-1)}\end{aligned}\right..

(4.4.29)

Our desired Hamiltonian is thus simply $H=-P^{-1}Q$ . Eigendecomposition of $H$ should yield $2N+1$ (or $2N$ ) eigenpairs, $\{\psi_{i}^{(k)},\dot{\psi}_{i}^{(k)}\}$ and $E^{(k)}$ , without (with) the constraint. With or without the $\dot{\psi}_{N}=0$ enforcement, $\bar{P}^{(i)}$ ( $\bar{Q}^{(i)}$ ) matrices are always (never) symmetric; consequently, $P$ ( $Q$ ) matrices are also always (never) symmetric.

Like in Sections 4.1 and 4.2, the eigenvectors need to be “renormalized” as

$\displaystyle 1$	$\displaystyle=\int_{0}^{\infty}[{\mathcal{N}}\psi(r)]^{2}\,{\rm d}r={\mathcal{N}}^{2}\left\{\sum_{i=0}^{N-1}\int_{r_{i}}^{r_{i+1}}[r(D_{\psi i}+C_{\psi i}r+B_{\psi i}r^{2}+A_{\psi i}r^{3})]^{2}\,{\rm d}r+\int_{r_{N}}^{\infty}\left[\psi_{N}\frac{r}{r_{N}}\exp\left(1-\frac{r}{r_{N}}\right)\right]^{2}\,{\rm d}r\right\}$
	$\displaystyle={\mathcal{N}}^{2}\left\{\sum_{i=0}^{N-1}\int_{ih}^{(i+1)h}\left[\begin{aligned} &D_{\psi i}^{2}r^{2}+2C_{\psi i}D_{\psi i}r^{3}+(C_{\psi i}^{2}+2B_{\psi i}D_{\psi i})r^{4}+(2B_{\psi i}C_{\psi i}+2A_{\psi i}D_{\psi i})r^{5}\\ &+(B_{\psi i}^{2}+2A_{\psi i}C_{\psi i})r^{6}+2A_{\psi i}B_{\psi i}r^{7}+A_{\psi i}^{2}r^{8}\end{aligned}\right]\,{\rm d}r+\frac{5}{4}r_{N}\psi_{N}^{2}\right\}$
	$\displaystyle={\mathcal{N}}^{2}\Bigg{\{}\sum_{i=0}^{N-1}\left[\begin{aligned} &\frac{1}{3}D_{\psi i}^{2}d(i,3)h^{3}+\frac{1}{2}C_{\psi i}D_{\psi i}d(i,4)h^{4}+\frac{1}{5}(C_{\psi i}^{2}+2B_{\psi i}D_{\psi i})d(i,5)h^{5}\\ &+\frac{1}{3}(B_{\psi i}C_{\psi i}+A_{\psi i}D_{\psi i})d(i,6)h^{6}+\frac{1}{7}(B_{\psi i}^{2}+2A_{\psi i}C_{\psi i})d(i,7)h^{7}\\ &+\frac{1}{4}A_{\psi i}B_{\psi i}d(i,8)h^{8}+\frac{1}{9}A_{\psi i}^{2}d(i,9)h^{9}\end{aligned}\right]+\frac{5}{4}r_{N}\psi_{N}^{2}\Bigg{\}};$	(4.4.30)

we omit the “inner product” definition here as this section focuses on the ground state.

Special version: $N=0$ .

Because of the tail, it is possible to study the $N=0$ case, for which our wavefunction is simply

\displaystyle\psi(r)=\dot{\psi}_{0}re^{-r},\quad r\geq 0,

(4.4.31)

which, after normalization, coincides with the exact solution Eq. (4.4.3). Nevertheless, we still need to study the energy predicted by ContEvol.

In this special case, the cost function is

	$\displaystyle\epsilon_{{\rm H},N=0}$	$\displaystyle=\int_{0}^{\infty}(\ddot{\psi}+\frac{2}{r}\psi+\phi)^{2}\,{\rm d}r=\int_{0}^{\infty}[\dot{\psi}_{0}(r-2)e^{-r}+2\dot{\psi}_{0}e^{-r}+\dot{\phi}_{0}re^{-r}]^{2}\,{\rm d}r$
		$\displaystyle=\int_{0}^{\infty}[(\dot{\psi}_{0}+\dot{\phi}_{0})re^{-r}]^{2}\,{\rm d}r=(\dot{\psi}_{0}+\dot{\phi}_{0})^{2}\int_{0}^{\infty}(re^{-r})^{2}\,{\rm d}r.$		(4.4.32)

Evidently, minimizing this would yield $\dot{\phi}_{0}=-\dot{\psi}_{0}$ , i.e., the Hamiltonian $H=\begin{pmatrix}-1\end{pmatrix}$ , and the ground state energy also coincides with the exact solution. Of course, such coincidence should not be relied upon, hence we move on to more realistic $N$ values.
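As a short worked check of the $N=0$ case, using only Eqs. (4.4.31) and (4.4.32), the remaining integral is elementary:

\displaystyle\int_{0}^{\infty}(re^{-r})^{2}\,{\rm d}r=\int_{0}^{\infty}r^{2}e^{-2r}\,{\rm d}r=\frac{2!}{2^{3}}=\frac{1}{4}\quad\Rightarrow\quad\epsilon_{{\rm H},N=0}=\frac{1}{4}(\dot{\psi}_{0}+\dot{\phi}_{0})^{2},

so the cost vanishes exactly at $\dot{\phi}_{0}=-\dot{\psi}_{0}$ . The same integral fixes the normalization: ${\mathcal{N}}^{2}\dot{\psi}_{0}^{2}/4=1$ gives ${\mathcal{N}}\dot{\psi}_{0}=2$ (up to sign), which reproduces the prefactor of the exact solution in Eq. (4.4.3).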

Toy version: $N=1$ .

Then we explore the $N=1$ case, which only has one single interval $[0,h]$ in addition to the tail. Fig. 4.4.1 presents five sets of six $3\times 3$ matrices based on different values of $h$ . All non-zero elements of $T^{(0)}$ matrices are shown in gradually varying colors, illustrating how $T^{(0)}$ changes with $h$ ; note that Eq. (4.4.12) tells us that the matrix element $T^{(0)}_{32}$ is always $1$ regardless of $h$ . The symmetric $\bar{P}^{(0)}$ matrices (with first rows and first columns dropped) manifest similar gradual variation, with largest element “migrating” from lower-right corner to upper-left corner; however, combining variations of $T^{(0)}$ and $\bar{P}^{(0)}$ , as well as $P^{\prime}{}^{(0)}$ added for the tail, the $P$ matrices seem very similar to each other, although the color scales (not shown in Fig. 4.4.1) are different. The $\bar{Q}^{(0)}$ matrices (also with first rows and first columns dropped) are intrinsically asymmetric, and the largest element “migrates” from lower-center to lower-left; the resulting $Q$ matrices seem quite different with different values of $h$ , yet gradual variation can still be revealed if we examine the elements one at a time. Finally, the $H$ matrices also look similar to each other, although slight variation can still be noticed; their eigenvectors are not shown as a matrix, since this section focuses on the ground state. Here we comment that the other two eigenvalues are positive, and the corresponding wavefunctions are quasi-sinusoidal in the interval $[0,h]$ and almost zero in the tail; to study the actual excited states, one needs to repeat the fine-tuning exercise described below.

Rendered ground state wavefunctions based on the $H$ matrices in Fig. 4.4.1 are shown in the left panel of Fig. 4.4.2. The $h=1$ version agrees with the exact solution Eq. (4.4.3) remarkably well, while other values of $h$ are limited by not-so-good predefined functional forms. The right panel of Fig. 4.4.2 plots the ground state energy as a function of $h$ . The ContEvol solution coincides with the exact value at $h\approx 1.0469$ . However, how shall we determine the optimal value of $h$ when we have no idea about the exact solution? Similar to an argument in Section 4.3, we can fine-tune $h$ so that $\dot{\psi}_{N}$ , in this case $\dot{\psi}_{1}$ , is close to zero. Fig. 4.4.3 plots $\dot{\psi}_{1}$ as a function of $h$ . It vanishes at $h\approx 1.0493$ , which is close but not identical to the value quoted above. In practice, we can adjust values of $h$ and $N$ in turn: for example, we explore a small interval around $h\approx 1.0493$ with $N=2$ , get a better estimate of $h$ , and explore a smaller interval around the updated $h$ with a larger $N$ , etc., until the errors are below some threshold. Such an iterative process is not implemented for this work. In the following, we simply adopt $r_{N}=Nh=1$ , and enforce the $\dot{\psi}_{N}=0$ constraint; investigating how $h$ affects the accuracy of $N>1$ results is left for future work.
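The $N=1$ toy version is small enough to be written out in a few lines. The following is a minimal sketch rather than the code behind Figs. 4.4.1 to 4.4.3: it assumes the unconstrained $3\times 3$ problem in the basis $(\psi_{1},\dot{\psi}_{0},\dot{\psi}_{1})$ , builds the piece on $[0,h]$ from $T^{(0)}$ of Eq. (4.4.12) and a monomial Gram matrix, adds the tail terms of Eq. (4.4.20) with $r_{N}=h$ , and uses NumPy for the eigendecomposition. The names `toy_hamiltonian` and `ground_state` are hypothetical.

```python
import numpy as np

def toy_hamiltonian(h):
    """N = 1 Hamiltonian H = -P^{-1} Q in the basis (psi_1, dpsi_0, dpsi_1)."""
    # Piece [0, h]: psi = r (D + C r + B r^2); coefficients ordered (B, C, D).
    T0 = np.array([[-2/h**3, 1/h**2, 1/h**2],
                   [3/h**2, -2/h, -1/h],
                   [0.0, 1.0, 0.0]])
    # Residual psi'' + (2/r) psi + phi as powers r^0..r^3 of (B, C, D).
    M_psi = np.array([[0, 2, 2], [6, 2, 0], [2, 0, 0], [0, 0, 0]], float)
    M_phi = np.array([[0, 0, 0], [0, 0, 1], [0, 1, 0], [1, 0, 0]], float)
    p = np.arange(4)
    s = p[:, None] + p[None, :] + 1
    G = h**s / s                              # integrals of r^(j+k) over [0, h]
    P = T0.T @ (2*M_phi.T @ G @ M_phi) @ T0
    Q = T0.T @ (2*M_phi.T @ G @ M_psi) @ T0
    # Tail contribution, Eq. (4.4.20), acting on psi_1 (index 0), r_N = h.
    P[0, 0] += 2.5*h
    Q[0, 0] += 3 - 0.5/h
    return -np.linalg.solve(P, Q)

def ground_state(h):
    """Lowest eigenvalue and its eigenvector, scaled so that dpsi_0 = 1."""
    E, V = np.linalg.eig(toy_hamiltonian(h))
    k = np.argmin(E.real)
    v = V[:, k].real
    return E[k].real, v / v[1]
```

With this scaling, the last component of the returned eigenvector is $\dot{\psi}_{1}/\dot{\psi}_{0}$ , so a scalar root finder applied to it as a function of $h$ would carry out the fine-tuning step described above; that loop is deliberately left out of the sketch.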

Realistic versions: $N=2$ to $N=8$ .

Fig. 4.4.4 shows $P$ , $Q$ , and $H$ matrices produced by $N=2$ , $N=4$ , and $N=8$ versions of first-order ContEvol. Like in the case of infinite potential well (see Section 4.2, especially Fig. 4.2.2), each $P$ or $Q$ matrix has $2\times 2$ tridiagonal blocks; because of the position-dependence of the Coulomb potential, elements on the same diagonal do not necessarily have the same value. The most noticeable matrix elements are $P_{N,N}$ and $Q_{N,N}$ , which are affected by the tail; the former are “more positive” in $P$ matrices, while the latter are “less negative” in $Q$ matrices. Consequently, the $N$ th rows and $N$ th columns of $H$ matrices do not follow the same pattern as other regions.

In Fig. 4.4.5, the left panel displays errors in rendered ground state wavefunctions of $N=2$ , $N=4$ , $N=6$ (not shown in Fig. 4.4.4), and $N=8$ Hamiltonians, while the right panel plots errors in ground state energy predicted by first-order ContEvol with $N=2,3,\ldots,8$ . Like in Section 4.2, the eigenpair is already remarkably accurate with $N=8$ , which is arguably small.

5 Discussion: directions for future work

The ContEvol formalism has many potential applications inside and outside physics. For example, the desire for a “smoother” stellar evolution code supplied the original motivation for this work. As long as people want to represent continuous functions (of time, space, or both) with a finite sampling, ContEvol may help. However, much work remains to be done to reveal its full potential. In this final section, we discuss some of the major directions for future development of ContEvol.

Mathematical foundation.

Although ContEvol appears to be successful, it lacks a solid mathematical foundation. Desirable justifications and auxiliary tools include but are not limited to:

•

Control over errors and non-symplecticity. With specific cases, this work seems to indicate that first-order ContEvol results have $\mathcal{O}(h^{6})$ errors in values, $\mathcal{O}(h^{5})$ errors in first derivatives, and $\mathcal{O}(h^{5})$ error in deviation from equation(s) of motion; more specifically, the $\mathcal{O}(h^{6})$ terms in values are usually just missing, see Eq. (2.1.12) for an example. Second-order ContEvol does not improve the order of errors in results, but does reduce deviation from EOM(s) to $\mathcal{O}(h^{9})$ ; non-symplecticity (discrepancy between determinant of Jacobian and $1$ ) does not display a uniform pattern. Under what conditions do these statements hold? How do these error orders scale with the order of ContEvol? Such questions need to be answered to solidify ContEvol results.
•

Foundation for customized linear algebra. As hypothesized in Section 4.2, intuitively the Hamiltonian $H=-P^{-1}Q$ based on Eq. (4.2.9) should be a Hermitian operator, and the inner product defined in Eq. (4.2) is reasonable. Yet unless these statements are well justified, ContEvol does not guarantee an expected number of valid eigenpairs.
•

Moments and transforms. This work has not included expressions for moments and transforms (e.g., Fourier and Laplace transforms) based on values and derivatives at nodes, yet such things are likely to be important for the analysis of ContEvol results. Do they reveal additional properties or limitations of ContEvol methods? The answer will inform choices for specific applications.

Higher dimensions.

This work has been focused on one-dimensional scenarios, either time or space; nevertheless, the combination of function representation with linear coefficients and cost function minimization can be generalized to high-dimensional cases. In other words, the ContEvol formalism should be able to solve partial differential equations (PDEs) as well as ordinary differential equations (ODEs). Here we outline major directions of such extensions for first-order ContEvol.

•

Evolving one-dimensional functions. In this case, the full evolutionary history of the function $\psi(x,t)$ , sampled at $N_{t}$ timestamps and $N_{x}$ nodes, can be fully characterized by $N_{t}\times N_{x}$ quadruples, $\{\psi,\psi_{;x},\psi_{;t},\psi_{;x;t}\}$ , where semicolons “;” in subscripts denote partial derivatives. Thus at each space-time location, the function can be rendered as the product of a cubic polynomial in $x$ and a cubic polynomial in $t$ ; such a representation has $16$ coefficients, corresponding to four quadruples at four corners of a space-time cell.
•

Representing high-dimensional functions. Although there are no restrictions for use of curvilinear coordinates, the discussion here focuses on Cartesian coordinates. To fully characterize a spatial distribution, in principle one could use $\{\psi,\psi_{;x},\psi_{;y},\psi_{;x;y}\}$ in two dimensions and $\{\psi,\psi_{;x},\psi_{;y},\psi_{;x;y},\psi_{;z},\psi_{;x;z},\psi_{;y;z},\psi_{;x;y;z}\}$ in three dimensions. However, in $d$ dimensions, multiplying the $N^{d}$ growth of number of nodes and $2^{d}$ growth of number of features can easily make things computationally unaffordable. A less expensive version of the high-dimensional function representation would only use values and first derivatives, i.e., $\{\psi,\psi_{;x},\psi_{;y}\}$ in 2D and $\{\psi,\psi_{;x},\psi_{;y},\psi_{;z}\}$ in 3D, so that the number of features only grows as $1+d$ . A difficulty is that in 2D (3D), there are only $10$ (or $20$ ) zeroth- to third-order terms, but there are $2^{2}\times(1+2)=12$ (or $2^{3}\times(1+3)=32$ ) features to fit for each cell; to bypass inconsistency, it is recommended to add some higher-order terms (e.g., $x^{2}y^{2}$ ), but those involving fourth or higher order in a single variable should probably be avoided (e.g., $x^{4}$ or $y^{4}$ ).
•

Evolving high-dimensional functions. Space and time coordinates could be viewed as equivalent from the perspective of special relativity, yet for most computational physics problems, time plays a different role than spatial coordinates. Therefore, to better represent the “history” of a dynamic system, $\{\psi,\psi_{;x},\psi_{;y},\psi_{;t},\psi_{;x;t},\psi_{;y;t}\}$ in 2D and $\{\psi,\psi_{;x},\psi_{;y},\psi_{;z},\psi_{;t},\psi_{;x;t},\psi_{;y;t},\psi_{;z;t}\}$ in 3D might be a more sensible choice.
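As a rough illustration of the first item in the list above, the following Python sketch renders one space-time cell from its four corner quadruples $\{\psi,\psi_{;x},\psi_{;t},\psi_{;x;t}\}$ using a tensor product of cubic Hermite basis functions. The function names (`hermite_basis`, `bicubic_cell`, `quadruple`), the placement of the cell at the origin, and the test function are assumptions made for this example only; they are not taken from the ContEvol code.

```python
import numpy as np

def hermite_basis(s):
    """Cubic Hermite basis on the unit interval, s in [0, 1].

    Order: value at 0, value at 1, slope at 0, slope at 1.
    """
    return np.array([
        2 * s**3 - 3 * s**2 + 1,
        -2 * s**3 + 3 * s**2,
        s**3 - 2 * s**2 + s,
        s**3 - s**2,
    ], dtype=float)

def bicubic_cell(corners, hx, ht, x, t):
    """Evaluate psi(x, t) inside the cell [0, hx] x [0, ht].

    corners[i][j] is (psi, psi_x, psi_t, psi_xt) at x = i*hx, t = j*ht.
    """
    bx = hermite_basis(x / hx)
    bt = hermite_basis(t / ht)
    bx[2:] *= hx  # slope weights carry one power of the cell size
    bt[2:] *= ht
    total = 0.0
    for i in (0, 1):
        for j in (0, 1):
            psi, psi_x, psi_t, psi_xt = corners[i][j]
            total += (psi * bx[i] * bt[j]
                      + psi_x * bx[2 + i] * bt[j]
                      + psi_t * bx[i] * bt[2 + j]
                      + psi_xt * bx[2 + i] * bt[2 + j])
    return total

def quadruple(x, t):
    """(psi, psi_x, psi_t, psi_xt) for psi = x^3 t^2 + 2 x t^3 - x + 1."""
    return (x**3 * t**2 + 2 * x * t**3 - x + 1,
            3 * x**2 * t**2 + 2 * t**3 - 1,
            2 * x**3 * t + 6 * x * t**2,
            6 * x**2 * t + 6 * t**2)

hx, ht = 0.5, 0.25
corners = [[quadruple(i * hx, j * ht) for j in (0, 1)] for i in (0, 1)]
approx = bicubic_cell(corners, hx, ht, 0.2, 0.1)
exact = quadruple(0.2, 0.1)[0]
print(abs(approx - exact))  # rounding-level: psi is cubic in x and in t
```

The test function is at most cubic in each variable, so the 16-coefficient representation should reproduce it up to rounding; for a general $\psi$ the cell is only an approximation.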

In addition to higher dimensions, we note that extension to multiple functions is also natural; vector and tensor functions can be decomposed into independent components, as we did in Section 3.

Technical improvements.

The last group of directions addresses some technical issues involved in the ContEvol formalism per se, which may lead to improvements in accuracy, precision, or performance.

•

Multistep version. This work has focused on single-step ContEvol methods, regardless of the order, yet it is possible to extend ContEvol to multiple steps or intervals. For boundary value problems, if we want to study the function $f(x)$ for some interval $x_{i}\leq x\leq x_{i+1}$ , while the combination of $\{f_{i},\dot{f}_{i},f_{i+1},\dot{f}_{i+1}\}$ can give us a cubic approximation, the combination of $\{f_{i-1},\dot{f}_{i-1},f_{i},\dot{f}_{i},f_{i+1},\dot{f}_{i+1},f_{i+2},\dot{f}_{i+2}\}$ (assuming sampling nodes $x_{i-1}$ and $x_{i+2}$ both exist or can be reasonably defined for convenience) can give us a septic approximation. For initial value problems, there are two basic strategies: backward, which for example approximates the evolution during the next interval as a quintic polynomial based on $\{f_{-h},\dot{f}_{-h},f_{0},\dot{f}_{0},f_{h},\dot{f}_{h}\}$ ; and forward, which for example approximates the evolution during the next two intervals as a pair of cubic polynomials or a unified quintic polynomial based on $\{f_{0},\dot{f}_{0},f_{h},\dot{f}_{h},f_{2h},\dot{f}_{2h}\}$ . Of course one can include more steps or devise hybrid versions. Like higher orders (e.g., Section 2.3), inclusion of multiple steps complicates derivation and computation, but potentially improves accuracy or precision.
•

Better sampling and evolving nodes. As mentioned in Section 4.2, the distribution of sampling nodes is by no means necessarily uniform; for some realistic applications, it should not be fixed, as in Section 4.3, where the potential function necessitates flexible sampling. In short, the sampling is something ContEvol users are encouraged to fine-tune. In addition, when a field is evolved (see above for discussion on higher dimensions), drifting nodes (i.e., nodes with varying positions) and splitting or merging cells (i.e., adding or removing nodes) may be desirable. Because of the uniqueness of the Hermite spline, splitting $[x_{\rm left},x_{\rm right}]$ into $[x_{\rm left},x_{\rm middle}]$ and $[x_{\rm middle},x_{\rm right}]$ by inserting $f(x_{\rm middle})$ and $\dot{f}(x_{\rm middle})$ at an arbitrary location $x_{\rm middle}$ between $x_{\rm left}$ and $x_{\rm right}$ does not distort the “current” function representation at all; this fact should be applicable to higher dimensions as well. However, we note that such variations are preferably predefined (e.g., according to some strategy), not determined on-the-fly, as optimizing the locations of nodes often requires solving non-linear equations.
•

Computational efficiency. Let us consider arguably the most costly case of real-world physics problems, time evolution of a set of three-dimensional fields, e.g., cosmological simulations; we use single-step ContEvol with $N$ nodes in each dimension, and keep track of $N_{q}$ quantities, each with $N_{f}$ features (i.e., values or partial derivatives). Then the dimension of the matrix is $(N^{3}N_{q}N_{f})\times(N^{3}N_{q}N_{f})$ , which can be overwhelmingly expensive. However, indexing each of the $(N_{q}N_{f})\times(N_{q}N_{f})$ blocks as $B_{\alpha\beta\gamma\alpha^{\prime}\beta^{\prime}\gamma^{\prime}}$ , where $\alpha(^{\prime}),\beta(^{\prime}),\gamma(^{\prime})=0,1,\ldots,N-1$ , the necessary condition for a block to contain non-zero elements is $\max\{|\alpha-\alpha^{\prime}|,|\beta-\beta^{\prime}|,|\gamma-\gamma^{\prime}|\}\leq 1$ . In other words, among the $(N^{3})^{2}=N^{6}$ blocks of this matrix, fewer than $3^{3}N^{3}=27N^{3}$ can possibly be non-zero, i.e., such matrices are highly sparse when $N$ is large; a closer look would reveal many “tridiagonal” structures. Specialized data structures and algorithms could be designed to handle such matrices. Furthermore, when we compute the evolution of large-scale structures under gravitational interactions, information about specific chemical composition may not be particularly pertinent. In such cases, a multi-tier strategy could be useful: at each step, we first evolve the “dominating” quantities, and then combine the coarse-grained “future” and the fine-grained “present” to evolve the “dependent” quantities.
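Two statements in the list above lend themselves to a quick numerical check: that inserting a node into a cubic Hermite cell leaves the represented function unchanged, and that the number of potentially non-zero blocks stays below $27N^{3}$. The Python sketch below is only an illustration under assumed inputs; the helper names (`hermite_cubic`, `count_coupled_blocks`) and all sample values are hypothetical and do not come from the ContEvol code.

```python
import itertools
import numpy as np

def hermite_cubic(x, x0, x1, f0, d0, f1, d1):
    """Value and slope at x of the cubic Hermite interpolant on [x0, x1]."""
    h = x1 - x0
    s = (x - x0) / h
    value = ((2*s**3 - 3*s**2 + 1) * f0 + (s**3 - 2*s**2 + s) * h * d0
             + (-2*s**3 + 3*s**2) * f1 + (s**3 - s**2) * h * d1)
    slope = ((6*s**2 - 6*s) * f0 / h + (3*s**2 - 4*s + 1) * d0
             + (-6*s**2 + 6*s) * f1 / h + (3*s**2 - 2*s) * d1)
    return value, slope

# Split the cell [x0, x1] at xm by inserting the value and slope found there.
x0, x1 = 0.0, 2.0
f0, d0, f1, d1 = 1.0, -0.5, 0.3, 2.0  # arbitrary sample data
xm = 0.7
fm, dm = hermite_cubic(xm, x0, x1, f0, d0, f1, d1)

max_diff = 0.0
for x in np.linspace(x0, x1, 21):
    original, _ = hermite_cubic(x, x0, x1, f0, d0, f1, d1)
    if x <= xm:
        split, _ = hermite_cubic(x, x0, xm, f0, d0, fm, dm)
    else:
        split, _ = hermite_cubic(x, xm, x1, fm, dm, f1, d1)
    max_diff = max(max_diff, abs(original - split))
print(max_diff)  # rounding-level difference between the two representations

def count_coupled_blocks(n):
    """Count node pairs on an n^3 grid whose indices all differ by at most 1."""
    count = 0
    for a, b, c in itertools.product(range(n), repeat=3):
        for da, db, dc in itertools.product((-1, 0, 1), repeat=3):
            if 0 <= a + da < n and 0 <= b + db < n and 0 <= c + dc < n:
                count += 1
    return count

n = 4
print(count_coupled_blocks(n), (3 * n - 2)**3, 27 * n**3)
```

The brute-force count agrees with the closed form $(3N-2)^{3}$, which follows from the $3N-2$ index pairs with $|\alpha-\alpha^{\prime}|\leq 1$ in each dimension and is indeed smaller than $27N^{3}$.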

Miscellany.

In addition to the above directions, some miscellaneous topics are worth mentioning.

•

Root-finding. While this work has been focused on differential equations, the backbone function representation of ContEvol (Hermite spline) can be applied to algebraic equations as well: knowing both values and first derivatives at two sampling points, we can always find a cubic approximation of the function to help root-finding. For instance, Fig. 5.0.1 displays Kepler’s equation Eq. (3.4.8) with $e=63/64$ ; using Newton’s method, one would have to carefully choose an initial guess to avoid divergence, while the cubic approximation is more robust. Admittedly, solution to a cubic equation is more complicated than that to a linear equation, yet cubic may work better in some cases; besides, one can use cubic for the first few steps, and then switch to linear for fine-tuning purposes.
•

Numerical integration. Likewise, piece-wise cubic (or higher-order) polynomials may help numerical integration. As demonstrated in Section 4, using fewer sampling points, a “compound” sampling with both values and derivatives can outperform a “simple” sampling with only values. Although fitting polynomials with multiple values (e.g., Simpson’s rule) could effectively mitigate discreteness, usage of derivatives should rely less on a fine sampling. When the derivatives have to be evaluated numerically, in the first-order case, this technique is equivalent to a sampling like $\{\ldots,x_{i}-\Delta/2,x_{i}+\Delta/2,x_{i+1}-\Delta/2,x_{i+1}+\Delta/2,\ldots\}$ , where $\Delta\ll|x_{i+1}-x_{i}|$ .
•

Data structure of lookup tables. Due to the semi-analytic nature of the ContEvol formalism, its performance might be limited by lookup tables stored as hypercubes of values; fortunately, development of numerical methods may advance the data structure of lookup tables as well. This section has already addressed how high-dimensional functions are supposed to be digitized by combining values and derivatives; the three-dimensional plan can be naturally extended to higher dimensions. Even without ContEvol, “continuous” lookup tables have their own benefits, e.g., higher accuracy or less storage usage.
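The root-finding and numerical integration items above can be sketched in a few lines of Python. The example assumes Kepler’s equation in its standard form $M=E-e\sin E$ with $0<M<\pi$, brackets the root in $[0,\pi]$, and repeatedly replaces one end of the bracket by the root of the cubic Hermite approximation; locating that root by bisection on the cubic is a convenience chosen here, not necessarily how one would solve the cubic in practice. The integration part uses the integral of the cubic Hermite interpolant over each cell. All function names and parameter values are hypothetical.

```python
import math

def cubic_root_in_bracket(a, b, fa, da, fb, db):
    """Root in [a, b] of the cubic Hermite interpolant (fa, fb of opposite sign)."""
    h = b - a
    def p(s):
        return ((2*s**3 - 3*s**2 + 1) * fa + (s**3 - 2*s**2 + s) * h * da
                + (-2*s**3 + 3*s**2) * fb + (s**3 - s**2) * h * db)
    lo, hi = 0.0, 1.0
    for _ in range(60):  # bisection on the cheap cubic, not on the function
        mid = 0.5 * (lo + hi)
        if (p(mid) > 0) == (fa > 0):
            lo = mid
        else:
            hi = mid
    return a + 0.5 * (lo + hi) * h

def solve_kepler(mean_anomaly, ecc, tol=1e-12, max_iter=50):
    """Solve M = E - e sin E for E, assuming 0 < M < pi and 0 <= e < 1."""
    def g(E):
        return E - ecc * math.sin(E) - mean_anomaly, 1.0 - ecc * math.cos(E)
    a, b = 0.0, math.pi
    (fa, da), (fb, db) = g(a), g(b)
    E = 0.5 * (a + b)
    for _ in range(max_iter):
        E = cubic_root_in_bracket(a, b, fa, da, fb, db)
        fE, dE = g(E)
        if abs(fE) < tol:
            break
        if (fE > 0) == (fa > 0):  # keep the root bracketed
            a, fa, da = E, fE, dE
        else:
            b, fb, db = E, fE, dE
    return E

def hermite_quadrature(nodes, values, slopes):
    """Integrate the piece-wise cubic Hermite interpolant cell by cell."""
    total = 0.0
    for i in range(len(nodes) - 1):
        h = nodes[i + 1] - nodes[i]
        total += 0.5 * h * (values[i] + values[i + 1])
        total += h**2 / 12.0 * (slopes[i] - slopes[i + 1])
    return total

ecc, M = 63 / 64, 0.1
E = solve_kepler(M, ecc)
print(E, E - ecc * math.sin(E) - M)

nodes = [k * math.pi / 4 for k in range(5)]
values = [math.sin(x) for x in nodes]
slopes = [math.cos(x) for x in nodes]
with_slopes = hermite_quadrature(nodes, values, slopes)
values_only = hermite_quadrature(nodes, values, [0.0] * len(nodes))  # trapezoid
print(abs(with_slopes - 2.0), abs(values_only - 2.0))  # exact integral is 2
```

With the same five nodes, including the derivatives reduces the integration error of $\int_{0}^{\pi}\sin x\,{\rm d}x$ relative to the trapezoidal rule; the Kepler solver keeps the root bracketed at every step, so a poor starting interval cannot make it diverge.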

In conclusion, it is our hope that, with further developments, the ContEvol (continuous evolution) formalism can benefit some applications of computational physics.

Acknowledgements and data availability

KC thanks his advisors, Christopher M. Hirata and Marc H. Pinsonneault, for inspirations through research projects in cosmological image processing and stellar evolution, respectively, as well as insights and encouragement during the preparation of this work. KC appreciates insightful feedback from (in chronological order) Anil K. Pradhan, Annika H.G. Peter, R.J. Furnstahl, and Todd A. Thompson. KC also thanks Li-Yong Zhou (周礼勇; Nanjing University, China) and R.J. Furnstahl for introducing him to numerical methods in celestial mechanics and quantum mechanics, respectively.

During the preparation of this article, KC was supported by an internal funding source at The Ohio State University. The following software is used on KC’s personal computer (HP All-in-One 24-dp1xxx, Microsoft Windows 11 Home). Most symbolic operations throughout this work are performed, and the figures in Section 4 are made, with Wolfram Mathematica 11.0 (Wolfram Research, Inc., 2016). Numerical tests in Section 3 are conducted with Python 3.11 (van Rossum and Team, 2023) codes developed using NumPy (Harris et al., 2020) and Numba (Lam et al., 2015); the corresponding exact solution is derived with SciPy (Virtanen et al., 2020), while the figures therein and that in Section 5 are made with Matplotlib (Hunter, 2007). Mathematica and Jupyter notebooks for this work will be available in the GitHub repository ContEvol_formalism (https://github.com/kailicao/ContEvol_formalism.git) after it is posted on arXiv. This article is prepared with Overleaf, Online LaTeX Editor (https://www.overleaf.com/) and Online LaTeX Equation Editor (https://latex.codecogs.com/eqneditor/editor.php).

References

Springel [2010] Volker Springel. E pur si muove: Galilean-invariant cosmological hydrodynamical simulations on a moving mesh. MNRAS, 401(2):791–851, January 2010. 10.1111/j.1365-2966.2009.15715.x.
Jiang et al. [2014] Yan-Fei Jiang, James M. Stone, and Shane W. Davis. An Algorithm for Radiation Magnetohydrodynamics Based on Solving the Time-dependent Transfer Equation. ApJS, 213(1):7, July 2014. 10.1088/0067-0049/213/1/7.
Bovy [2015] Jo Bovy. galpy: A python Library for Galactic Dynamics. ApJS, 216(2):29, February 2015. 10.1088/0067-0049/216/2/29.
Demarque et al. [2008] P. Demarque, D. B. Guenther, L. H. Li, A. Mazumdar, and C. W. Straka. YREC: the Yale rotating stellar evolution code. Non-rotating version, seismology applications. Ap&SS, 316(1-4):31–41, August 2008. 10.1007/s10509-007-9698-y.
Paxton et al. [2011] Bill Paxton, Lars Bildsten, Aaron Dotter, Falk Herwig, Pierre Lesaffre, and Frank Timmes. Modules for Experiments in Stellar Astrophysics (MESA). ApJS, 192(1):3, January 2011. 10.1088/0067-0049/192/1/3.
Chambers [1999] J. E. Chambers. A hybrid symplectic integrator that permits close encounters between massive bodies. MNRAS, 304(4):793–799, April 1999. 10.1046/j.1365-8711.1999.02379.x.
Rein and Liu [2012] H. Rein and S. F. Liu. REBOUND: an open-source multi-purpose N-body code for collisional dynamics. A&A, 537:A128, January 2012. 10.1051/0004-6361/201118085.
Grandclément and Novak [2009] Philippe Grandclément and Jérôme Novak. Spectral Methods for Numerical Relativity. Living Reviews in Relativity, 12(1):1, January 2009. 10.12942/lrr-2009-1.
Press et al. [2007] W.H. Press, S.A. Teukolsky, W.T. Vetterling, and B.P. Flannery. Numerical Recipes: The Art of Scientific Computing. Cambridge University Press, 3rd edition, 2007. ISBN 9780521880688. URL http://nr.com/.
Pradhan and Nahar [2011] Anil K. Pradhan and Sultana N. Nahar. Atomic Astrophysics and Spectroscopy. Cambridge University Press, 2011.
Wolfram Research, Inc. [2016] Wolfram Research, Inc. Mathematica, Version 11.0. Champaign, IL, 2016.
van Rossum and Team [2023] G. van Rossum and Python Development Team. Python Language Reference Release 3.11.3. Lulu Press, Incorporated, 2023. ISBN 9781312573949. URL https://books.google.com/books?id=QiTwzwEACAAJ.
Harris et al. [2020] Charles R. Harris, K. Jarrod Millman, Stéfan J. van der Walt, Ralf Gommers, Pauli Virtanen, David Cournapeau, Eric Wieser, Julian Taylor, Sebastian Berg, Nathaniel J. Smith, Robert Kern, Matti Picus, Stephan Hoyer, Marten H. van Kerkwijk, Matthew Brett, Allan Haldane, Jaime Fernández del Río, Mark Wiebe, Pearu Peterson, Pierre Gérard-Marchant, Kevin Sheppard, Tyler Reddy, Warren Weckesser, Hameer Abbasi, Christoph Gohlke, and Travis E. Oliphant. Array programming with NumPy. Nature, 585(7825):357–362, September 2020. 10.1038/s41586-020-2649-2.
Lam et al. [2015] Siu Kwan Lam, Antoine Pitrou, and Stanley Seibert. Numba: A LLVM-based Python JIT Compiler. In Proc. Second Workshop on the LLVM Compiler Infrastructure in HPC, pages 1–6, November 2015. 10.1145/2833157.2833162.
Virtanen et al. [2020] Pauli Virtanen, Ralf Gommers, Travis E. Oliphant, Matt Haberland, Tyler Reddy, David Cournapeau, Evgeni Burovski, Pearu Peterson, Warren Weckesser, Jonathan Bright, Stéfan J. van der Walt, Matthew Brett, Joshua Wilson, K. Jarrod Millman, Nikolay Mayorov, Andrew R. J. Nelson, Eric Jones, Robert Kern, Eric Larson, C. J. Carey, İlhan Polat, Yu Feng, Eric W. Moore, Jake VanderPlas, Denis Laxalde, Josef Perktold, Robert Cimrman, Ian Henriksen, E. A. Quintero, Charles R. Harris, Anne M. Archibald, Antônio H. Ribeiro, Fabian Pedregosa, Paul van Mulbregt, and SciPy 1.0 Contributors. SciPy 1.0: fundamental algorithms for scientific computing in Python. Nature Methods, 17:261–272, February 2020. 10.1038/s41592-019-0686-2.
Hunter [2007] John D. Hunter. Matplotlib: A 2D Graphics Environment. Computing in Science and Engineering, 9(3):90–95, May 2007. 10.1109/MCSE.2007.55.
