

High-order symplectic Lie group methods on $SO(n)$ using the polar decomposition

Xuefeng Shen, Khoa Tran, and Melvin Leok
Abstract.

A variational integrator of arbitrarily high order on the special orthogonal group $SO(n)$ is constructed using the polar decomposition and the constrained Galerkin method. It has the advantage of avoiding the second-order derivative of the exponential map that arises in traditional Lie group variational methods. In addition, a reduced Lie–Poisson integrator is constructed, and the resulting algorithms can naturally be implemented by fixed-point iteration. The proposed methods are validated by numerical simulations on $SO(3)$, which demonstrate that they are comparable to variational Runge–Kutta–Munthe-Kaas methods in terms of computational efficiency. However, the methods we have proposed preserve the Lie group structure much more accurately and exhibit better near energy preservation.

1. Introduction

1.1. Overview

Given a configuration manifold $Q$, variational integrators provide a useful method of deriving symplectic integrators for Lagrangian mechanics on the tangent bundle $TQ$ in terms of the Lagrangian $L$, or for Hamiltonian mechanics on the cotangent bundle $T^{*}Q$ in terms of the Hamiltonian $H$. They involve discretizing Hamilton's principle or Hamilton's phase space principle rather than the Euler–Lagrange or Hamilton's equations. Discrete Lagrangian variational mechanics is described in terms of a discrete Lagrangian $L_{d}(q_{0},q_{1})$, which is an approximation of the exact discrete Lagrangian,

L_{d}^{\text{exact}}(q_{0},q_{1}) = \operatorname{ext}_{\substack{q\in C^{2}([0,h],Q)\\ q(0)=q_{0},\ q(h)=q_{1}}} \int_{0}^{h} L(q(t),\dot{q}(t))\,dt. (1)

The discrete Hamilton’s principle states that the discrete action sum is stationary for variations that vanish at the endpoints. This yields the discrete Euler–Lagrange equations,

D_{2}L_{d}(q_{k-1},q_{k}) + D_{1}L_{d}(q_{k},q_{k+1}) = 0,

where $D_{i}$ denotes the partial derivative with respect to the $i$-th argument. This defines an update map on $Q\times Q$, where $(q_{k-1},q_{k})\mapsto(q_{k},q_{k+1})$. The update map can equivalently be described in terms of the discrete Legendre transforms,

p_{k} = -D_{1}L_{d}(q_{k},q_{k+1}), \qquad p_{k+1} = D_{2}L_{d}(q_{k},q_{k+1}), (2)

which defines an update map $(q_{k},p_{k})\mapsto(q_{k+1},p_{k+1})$ on $T^{*}Q$ that automatically preserves the canonical symplectic structure on $T^{*}Q$.
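On a linear space, this update can be made concrete in a few lines of code. The sketch below is an illustration only, not a method from this paper: it uses the midpoint discrete Lagrangian $L_d(q_0,q_1)=hL\bigl(\tfrac{q_0+q_1}{2},\tfrac{q_1-q_0}{h}\bigr)$ for the harmonic oscillator $L(q,\dot q)=\dot q^{2}/2-q^{2}/2$, a choice for which the discrete Legendre transforms (2) can be solved in closed form.

```python
import math

def step(q0, p0, h):
    """One step of the variational integrator generated by the midpoint
    discrete Lagrangian L_d(q0,q1) = h*L((q0+q1)/2, (q1-q0)/h) for
    L(q, qdot) = qdot**2/2 - q**2/2.  The discrete Legendre transforms
    p0 = -D1 L_d and p1 = D2 L_d are solved for (q1, p1) in closed form."""
    q1 = (p0 + q0 * (1.0 / h - h / 4.0)) / (1.0 / h + h / 4.0)
    p1 = (q1 - q0) / h - h * (q0 + q1) / 4.0
    return q1, p1

q, p, h = 1.0, 0.0, 0.01
E0 = 0.5 * (p * p + q * q)          # energy at t = 0
for _ in range(10000):
    q, p = step(q, p, h)
E1 = 0.5 * (p * p + q * q)          # energy at t = 100
q_exact = math.cos(h * 10000)       # exact solution q(t) = cos(t)
```

Because the Hamiltonian here is quadratic, this particular symplectic map conserves the energy to machine precision; for general Lagrangians one only obtains near energy preservation.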

The order of the variational integrator depends on how accurately $L_{d}(q_{0},q_{1})$ approximates $L_{d}^{\text{exact}}(q_{0},q_{1})$. To derive a high-order discrete Lagrangian, a typical approach is the Galerkin method [14]. This involves considering the definition of the exact discrete Lagrangian, replacing $C^{2}([0,h],Q)$ with a finite-dimensional function space, and approximating the integral with a numerical quadrature formula. When the configuration manifold $Q$ is a linear space and the polynomials of degree less than or equal to $s$ are chosen, the classical symplectic partitioned Runge–Kutta methods are recovered. Subsequently, Leok and Shingel [12] introduced the shooting-based discrete Lagrangian, which allows one to construct a symplectic integrator from an arbitrary one-step method.

When the configuration manifold $Q$ is a Lie group $G$, the construction of the discrete Lagrangian is more complicated than in the linear space case. Leok [11] proposed parametrizing curves on the Lie group using the exponential map, namely representing a curve $g(t)$ connecting $g_{0}$ and $g_{1}$ by

g(t) = g_{0}\cdot\exp(\epsilon(t)),

where $\epsilon(t)\in\mathfrak{g}$ is a curve in the Lie algebra of $G$ with fixed endpoints $\epsilon(0)=0$ and $\epsilon(h)=\log(g_{0}^{-1}g_{1})$. This allows one to replace variations in $g(t)$ by variations in $\epsilon(t)$ on the Lie algebra $\mathfrak{g}$, which is a linear space. This yields the following expression for the exact discrete Lagrangian,

L_{d}^{\text{exact}}(g_{0},g_{1}) = \operatorname{ext}_{\substack{\epsilon\in C^{2}([0,h],\mathfrak{g})\\ \epsilon(0)=0,\ \epsilon(h)=\log(g_{0}^{-1}g_{1})}} \int_{0}^{h} L\bigl(g_{0}\cdot\exp(\epsilon(t)),\, g_{0}\cdot\operatorname{d\exp}_{\epsilon(t)}(\dot{\epsilon}(t))\bigr)\,dt, (3)

where $\operatorname{d\exp}_{\epsilon}(\dot{\epsilon}) = \exp(\epsilon)\cdot\frac{1-e^{-\operatorname{ad}_{\epsilon}}}{\operatorname{ad}_{\epsilon}}(\dot{\epsilon})$ is the tangent lift of the exponential map. If $\epsilon(t)$ is restricted to a finite-dimensional function space and the integral is replaced with a quadrature rule, we obtain the Galerkin Lie group variational integrators. The error analysis and implementation details for such methods can be found in [8, 3]. The above construction can be naturally extended to any retraction [1] on $G$, which is a diffeomorphism from a neighborhood of $0\in\mathfrak{g}$ to a neighborhood of $e\in G$ that satisfies a rigidity condition. The main disadvantage of Galerkin Lie group variational integrators is the term $\operatorname{d\exp}$ in (3). This implies that the resulting discrete Euler–Lagrange equations involve $\operatorname{d^{2}\exp}$, which cannot be calculated exactly in general and requires the truncation of a series expansion.

1.2. Contributions

In this paper, we focus on the Lie group $SO(n)$ as our configuration space. By using the fact that every invertible square matrix can be uniquely decomposed into the product of an orthogonal matrix and a symmetric positive-definite matrix via the polar decomposition, we will circumvent the disadvantage discussed previously: instead of parametrizing curves on $SO(n)$ by the exponential map or a retraction, $SO(n)$ is embedded naturally in the space $GL_{+}(n)=\{A\in\mathbb{R}^{n\times n}\mid\det(A)>0\}$, an open subset of $\mathbb{R}^{n\times n}$. Given fixed endpoints $g_{0}$ and $g_{1}$, we will construct interpolating polynomials in $GL_{+}(n)$ while ensuring that the internal points remain on $SO(n)$ by using the polar decomposition. Furthermore, we do not extend the Lagrangian $L$ to $GL_{+}(n)$, but instead project the trajectory onto $SO(n)$ in the same way. The variational integrator in Lagrangian form is derived following the usual variational approach for the constrained Galerkin method, and the Hamiltonian form is derived using the discrete Legendre transforms.

For a system with rotational symmetry, we obtain a simpler integrator using Lie–Poisson reduction on the Hamiltonian side. Namely, if $L$ is $SO(n)$-invariant, the constructed discrete Lagrangian is also $SO(n)$-invariant, and we can construct a reduced symplectic Lie–Poisson integrator. Lastly, we consider the dipole on a stick problem from [3], conduct numerical experiments using our method, and compare these to the variational Runge–Kutta–Munthe-Kaas (VRKMK) methods from the same reference.

2. Background

2.1. Notation

Recall that the Lie algebra of $SO(n)$ is the set $\operatorname{Skew}(n)=\{\Omega\in\mathbb{R}^{n\times n}\mid\Omega^{T}=-\Omega\}$, with the matrix commutator as the Lie bracket. Here, the inner products on $\mathbb{R}^{n\times n}$ and $\operatorname{Skew}(n)$ are introduced, and we identify these spaces with their dual spaces using the Riesz representations. For any $A,B\in\mathbb{R}^{n\times n}$, the inner product is given by

\operatorname{tr}\left(AB^{T}\right) = \sum_{i,j=1}^{n} a_{ij}b_{ij}.

For any $\Omega,\tilde{\Omega}\in\operatorname{Skew}(n)$, the inner product is defined by

\langle\Omega,\tilde{\Omega}\rangle = \sum_{i<j}\Omega_{ij}\tilde{\Omega}_{ij} = \frac{1}{2}\operatorname{tr}(\Omega\tilde{\Omega}^{T}).

In addition, consider the operator $\operatorname{Asym}\colon\mathbb{R}^{n\times n}\to\operatorname{Skew}(n)$, defined by $\operatorname{Asym}(A)=A-A^{T}$. The following properties can be easily verified:

  (a) For any $A,B\in\mathbb{R}^{n\times n}$, $\operatorname{tr}(AB^{T})=\operatorname{tr}(A^{T}B)$.

  (b) For any $A,B,P,Q\in\mathbb{R}^{n\times n}$, $\operatorname{tr}(A(PBQ)^{T})=\operatorname{tr}((P^{T}AQ^{T})B^{T})$.

  (c) For any $\Omega\in\operatorname{Skew}(n)$ and $A\in\mathbb{R}^{n\times n}$, $\langle\Omega,\operatorname{Asym}(A)\rangle=\sum_{i<j}\Omega_{ij}(A_{ij}-A_{ji})=\operatorname{tr}(\Omega A^{T})$.

In particular, we note that (c) gives a relationship between the two inner products. Lastly, given the choice of inner products, the Riesz representation allows us to identify $(\mathbb{R}^{n\times n})^{*}$ with $\mathbb{R}^{n\times n}$ and $\operatorname{Skew}(n)^{*}$ with $\operatorname{Skew}(n)$.
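These identities are straightforward to verify numerically. The following sketch is illustrative only: it checks property (c), which relates the two inner products, on random matrices.

```python
import numpy as np

rng = np.random.default_rng(0)

def asym(A):
    """Asym(A) = A - A^T, mapping R^{n x n} onto Skew(n)."""
    return A - A.T

def skew_inner(X, Y):
    """<X, Y> = (1/2) tr(X Y^T), the inner product on Skew(n)."""
    return 0.5 * np.trace(X @ Y.T)

n = 4
Omega = asym(rng.standard_normal((n, n)))   # a random skew matrix
A = rng.standard_normal((n, n))

lhs = skew_inner(Omega, asym(A))   # <Omega, Asym(A)>
rhs = np.trace(Omega @ A.T)        # tr(Omega A^T)
```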

2.2. Polar Decomposition

We introduce the polar decomposition and the construction of the retraction on $SO(n)$ described in [4]. Given $A\in GL(n)$, it decomposes uniquely as

A = UP, \qquad U\in O(n),\ P\in\operatorname{Sym}^{+}(n),

where $\operatorname{Sym}^{+}(n)$ is the set of $n\times n$ symmetric positive-definite matrices. This is the polar decomposition of $A$, and we denote the associated mapping by $\operatorname{pol}\colon GL(n)\to O(n)\times\operatorname{Sym}^{+}(n)$. The map of interest is the projection $\mathbb{P}=\pi_{1}\circ\operatorname{pol}\colon GL(n)\to O(n)$ defined by

\mathbb{P}(A) = U,

where $\pi_{1}$ is the projection onto the first coordinate. In particular, when $A\in GL_{+}(n)$, we have $U\in SO(n)$. A fast and efficient algorithm for calculating the projection of the polar decomposition is the Newton iteration,

U_{k+1} = \frac{1}{2}\left(U_{k}+U_{k}^{-T}\right), \qquad U_{0} = A. (4)
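A minimal implementation of iteration (4) might look as follows; the function name, tolerance, and iteration cap are our own choices.

```python
import numpy as np

def polar_projection(A, tol=1e-12, max_iter=100):
    """Newton iteration (4): U_{k+1} = (U_k + U_k^{-T}) / 2, U_0 = A.
    Converges quadratically to the orthogonal polar factor of A."""
    U = A.copy()
    for _ in range(max_iter):
        U_next = 0.5 * (U + np.linalg.inv(U).T)
        done = np.linalg.norm(U_next - U, 'fro') < tol
        U = U_next
        if done:
            break
    return U

rng = np.random.default_rng(1)
A = np.eye(3) + 0.1 * rng.standard_normal((3, 3))  # det(A) > 0
U = polar_projection(A)
P = U.T @ A   # the symmetric positive-definite factor, so A = U P
```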

Now, this projection can be used to construct a retraction on $SO(n)$ from its Lie algebra $\operatorname{Skew}(n)$ relative to the identity element $I$,

\mathbb{P}(I+\Omega) = U,

where $\Omega\in\operatorname{Skew}(n)$. This provides a diffeomorphism between a neighborhood of $0\in\operatorname{Skew}(n)$ and a neighborhood of $I\in SO(n)$. To calculate its inverse, suppose that $I+\Omega=UP$ and take the transpose on both sides to obtain $I-\Omega=PU^{T}$, which implies that $U^{T}(I+\Omega)=(I-\Omega)U$. Thus, we have that

U^{T}\Omega + \Omega U + U^{T} - U = 0. (5)

This is a Lyapunov equation, and it is well known that matrix equations of the form $AX+XB+C=0$ have a unique solution if and only if $\lambda+\mu\neq 0$ for any $\lambda\in\sigma(A)$ and $\mu\in\sigma(B)$. For $U$ in a neighborhood of the identity, its eigenvalues lie in the open right half-plane, which ensures that a unique solution to (5) exists. In principle, this Lyapunov equation can be solved using classical algorithms [2, 7]. For convenience, we denote the solution to the Lyapunov equation as

X = \operatorname{Lyap}(A,B,C).
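In practice, one need not implement such a solver from scratch. Assuming SciPy is available, its Sylvester solver (which solves $AX+XB=Q$) yields $\operatorname{Lyap}(A,B,C)$ by taking $Q=-C$; the wrapper below is our own sketch.

```python
import numpy as np
from scipy.linalg import solve_sylvester

def lyap(A, B, C):
    """Solve A X + X B + C = 0.  solve_sylvester solves A X + X B = Q,
    so we pass Q = -C."""
    return solve_sylvester(A, B, -C)

rng = np.random.default_rng(2)
# Spectra near 1, i.e. in the open right half-plane, so lambda + mu != 0
A = np.eye(3) + 0.1 * rng.standard_normal((3, 3))
B = np.eye(3) + 0.1 * rng.standard_normal((3, 3))
C = rng.standard_normal((3, 3))

X = lyap(A, B, C)
residual = np.linalg.norm(A @ X + X @ B + C)
```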

Next, we introduce the tangent map and its adjoint for $\mathbb{P}$, which are essential for the derivation of the variational integrator.

2.2.1. The Tangent Map

Consider the polar decomposition $A(t)=U(t)P(t)$ and differentiate both sides to yield $\dot{A}=\dot{U}P+U\dot{P}$. By left-trivialization on $SO(n)$, we can write $\dot{U}=U\Omega$, where $\Omega\in\operatorname{Skew}(n)$. Rearranging gives $\dot{P}=U^{T}\dot{A}-\Omega P$, and since $\dot{P}\in\operatorname{Sym}(n)$, we get that $U^{T}\dot{A}-\Omega P=\dot{A}^{T}U+P\Omega$. Consequently, we may write this in the form of a Lyapunov equation,

P\Omega + \Omega P + \dot{A}^{T}U - U^{T}\dot{A} = 0. (6)

We see that the tangent map of the polar projection, $d\mathbb{P}_{A}\colon\mathbb{R}^{n\times n}\to\operatorname{Skew}(n)$, is given by $d\mathbb{P}_{A}(\dot{A})=\Omega$, where $\Omega$ is obtained by solving the Lyapunov equation (6).
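A direct implementation of this tangent map, together with a finite-difference sanity check, is sketched below. It is illustrative only: the function name is ours, and it assumes SciPy's `polar` and `solve_sylvester` routines.

```python
import numpy as np
from scipy.linalg import polar, solve_sylvester

def dP(A, Adot):
    """Tangent map of the polar projection: solve the Lyapunov
    equation (6), P*Om + Om*P + Adot^T U - U^T Adot = 0, for Om."""
    U, P = polar(A)                     # A = U P
    C = Adot.T @ U - U.T @ Adot
    return solve_sylvester(P, P, -C)    # Om lies in Skew(n)

rng = np.random.default_rng(3)
A = np.eye(3) + 0.1 * rng.standard_normal((3, 3))
Adot = rng.standard_normal((3, 3))
Om = dP(A, Adot)

# Finite-difference check: with U(t) the orthogonal factor of A + t*Adot
# and U' = U*Om, we expect U^T U(t) ~ I + t*Om for small t.
t = 1e-7
U, _ = polar(A)
U_t, _ = polar(A + t * Adot)
fd = (U.T @ U_t - np.eye(3)) / t
```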

2.2.2. The Adjoint of the Tangent Map

The adjoint of $d\mathbb{P}_{A}$ can be defined by

\operatorname{tr}(d\mathbb{P}_{A}^{*}(\Omega)B^{T}) = \langle\Omega, d\mathbb{P}_{A}(B)\rangle, \qquad \forall\,\Omega\in\operatorname{Skew}(n),\ B\in\mathbb{R}^{n\times n}.

Recall that computing $d\mathbb{P}_{A}(B)$ involves solving the Lyapunov equation (6). We aim to compute $d\mathbb{P}_{A}^{*}$, so we define the following two maps,

\phi\colon \operatorname{Skew}(n)\longrightarrow\operatorname{Skew}(n), \qquad \Omega\longmapsto \Omega P + P\Omega,
\psi\colon \mathbb{R}^{n\times n}\longrightarrow\operatorname{Skew}(n), \qquad B\longmapsto U^{T}B - B^{T}U,

where $A=UP$ is fixed. Therefore, $d\mathbb{P}_{A}$ can be viewed as the composition of $\psi$ and $\phi^{-1}$,

d\mathbb{P}_{A} = \phi^{-1}\circ\psi \colon \mathbb{R}^{n\times n}\longrightarrow\operatorname{Skew}(n),

and $d\mathbb{P}_{A}^{*}(\Omega)=(\phi^{-1}\circ\psi)^{*}(\Omega)=\psi^{*}\circ(\phi^{*})^{-1}(\Omega)$. We shall derive the expressions for $\phi^{*}$ and $\psi^{*}$ by considering the Riesz representations for our domains and codomains. For the adjoint of $\phi$, let $\Omega,\tilde{\Omega}\in\operatorname{Skew}(n)$; then

\langle\phi^{*}(\Omega),\tilde{\Omega}\rangle = \langle\Omega,\phi(\tilde{\Omega})\rangle = \langle\Omega,\tilde{\Omega}P+P\tilde{\Omega}\rangle
 = \langle\Omega,\operatorname{Asym}(\tilde{\Omega}P)\rangle = \operatorname{tr}(\Omega(\tilde{\Omega}P)^{T})
 = \operatorname{tr}((\Omega P)\tilde{\Omega}^{T})
 = \langle\Omega P+P\Omega,\tilde{\Omega}\rangle.

Thus, $\phi^{*}=\phi$, and $\phi$ is self-adjoint. For $\psi^{*}$, let $\Omega\in\operatorname{Skew}(n)$ and $B\in\mathbb{R}^{n\times n}$; then

\operatorname{tr}(\psi^{*}(\Omega)B^{T}) = \langle\Omega,\psi(B)\rangle = \langle\Omega,U^{T}B-B^{T}U\rangle
 = \langle\Omega,\operatorname{Asym}(U^{T}B)\rangle = \operatorname{tr}(\Omega(U^{T}B)^{T})
 = \operatorname{tr}((U\Omega)B^{T}).

Therefore, $\psi^{*}(\Omega)=U\Omega$, and we obtain

d\mathbb{P}_{A}^{*}(\Omega) = (\psi^{*}\circ(\phi^{*})^{-1})(\Omega) = (\psi^{*}\circ\phi^{-1})(\Omega) = U\operatorname{Lyap}(P,\Omega^{T}),

where $\operatorname{Lyap}(P,\Omega^{T})=\operatorname{Lyap}(P,P,\Omega^{T})$. Finally, we state a lemma that will be used later:

Lemma 1.

$\mathbb{P}(I+S)=I$ if and only if $S\in\operatorname{Sym}(n)$ and $\lambda>-1$ for all $\lambda\in\sigma(S)$.
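Returning to the adjoint: the defining identity $\operatorname{tr}(d\mathbb{P}_A^*(\Omega)B^T)=\langle\Omega,d\mathbb{P}_A(B)\rangle$ can be checked numerically against the closed-form $d\mathbb{P}_A^*(\Omega)=U\operatorname{Lyap}(P,\Omega^T)$. The sketch below is illustrative only; it assumes SciPy, and the function names are ours.

```python
import numpy as np
from scipy.linalg import polar, solve_sylvester

def dP(A, B):
    """Tangent map dP_A(B): solve P*X + X*P = U^T B - B^T U (eq. (6))."""
    U, P = polar(A)
    return solve_sylvester(P, P, U.T @ B - B.T @ U)

def dP_star(A, Om):
    """Adjoint dP*_A(Om) = U * Lyap(P, Om^T), i.e. U X where
    P*X + X*P + Om^T = 0."""
    U, P = polar(A)
    return U @ solve_sylvester(P, P, -Om.T)

rng = np.random.default_rng(4)
A = np.eye(3) + 0.1 * rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))
W = rng.standard_normal((3, 3))
Om = W - W.T                                 # random element of Skew(3)

lhs = np.trace(dP_star(A, Om) @ B.T)         # tr(dP*_A(Om) B^T)
rhs = 0.5 * np.trace(Om @ dP(A, B).T)        # <Om, dP_A(B)>
```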

3. Lagrangian Variational Integrators on the Rotation Group $SO(n)$

Let the configuration manifold be the rotation group $G=SO(n)$, and let $L\colon G\times\mathfrak{g}\to\mathbb{R}$ be a left-trivialized Lagrangian. We shall construct a discrete Lagrangian following the approach of constrained Galerkin methods on $GL_{+}(n)$ (see Appendix A). Denote the internal points by $\{U_{i}\}_{i=1}^{s}\subset G$ and the left-trivialized internal tangent vectors by $\{\Omega_{i}\}_{i=1}^{s}\subset\mathfrak{g}$. Fixing the endpoints $g_{0}$ and $g_{1}$, we have

L_{d}(g_{0},g_{1}) = h\sum_{i=1}^{s} b_{i}L(U_{i},\Omega_{i}),

subject to the constraint

g_{1} = \mathbb{P}\left(g_{0} + h\sum_{i=1}^{s} b_{i}U_{i}\Omega_{i}\right), (7)

where the internal points UiU_{i} are represented by

U_{i} = \mathbb{P}\left(g_{0} + h\sum_{j=1}^{s} a_{ij}U_{j}\Omega_{j}\right). (8)

The expressions inside the parentheses in (7) and (8) correspond to a Runge–Kutta method in the embedding space. But, since these points may not lie on the Lie group $G$, they are projected onto $G$ using the polar decomposition.
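Holding the $\Omega_{j}$ fixed, the nonlinear system (8) can be solved by fixed-point iteration, re-projecting with the polar decomposition on every sweep. The sketch below is illustrative only: the 2-stage Gauss–Legendre tableau and the random data are our own choices, not prescribed here, and SciPy's `polar` is assumed.

```python
import numpy as np
from scipy.linalg import polar

# 2-stage Gauss-Legendre coefficients a_ij (our choice of tableau)
A_coef = np.array([[0.25, 0.25 - np.sqrt(3) / 6],
                   [0.25 + np.sqrt(3) / 6, 0.25]])

def internal_points(g0, Omegas, h, A_coef, sweeps=50):
    """Solve U_i = P(g0 + h * sum_j a_ij U_j Omega_j), eq. (8), by
    fixed-point iteration; polar(.)[0] is the orthogonal factor P(.)."""
    s = len(Omegas)
    U = [g0.copy() for _ in range(s)]
    for _ in range(sweeps):
        U = [polar(g0 + h * sum(A_coef[i, j] * U[j] @ Omegas[j]
                                for j in range(s)))[0]
             for i in range(s)]
    return U

rng = np.random.default_rng(5)
g0 = np.eye(3)
Omegas = []
for _ in range(2):
    W = rng.standard_normal((3, 3))
    Omegas.append(W - W.T)                  # random skew "velocities"
U = internal_points(g0, Omegas, h=0.05, A_coef=A_coef)

# residual of the fixed-point equations (8)
res = max(
    np.linalg.norm(U[i] - polar(g0 + 0.05 * sum(A_coef[i, j] * U[j] @ Omegas[j]
                                                for j in range(2)))[0])
    for i in range(2))
```

For small $h$ the map is a contraction, so plain iteration converges; the full integrator couples this with the updates for $\Lambda$ and $g_{1}$.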

Observe that (7) is equivalent to the condition that $\mathbb{P}\left(g_{1}^{T}(g_{0}+h\sum_{i=1}^{s}b_{i}U_{i}\Omega_{i})\right)=I$. Suppose that $h$ is small, and $g_{0}$ and $g_{1}$ are close enough to each other; then $g_{1}^{T}\left(g_{0}+h\sum_{i=1}^{s}b_{i}U_{i}\Omega_{i}\right)$ lies in a neighborhood of $I$. By Lemma 1, (7) holds if and only if $g_{1}^{T}\left(g_{0}+h\sum_{i=1}^{s}b_{i}U_{i}\Omega_{i}\right)\in\operatorname{Sym}(n)$, and so it is equivalent to

\operatorname{Asym}\left(g_{1}^{T}\left(g_{0}+h\sum_{i=1}^{s}b_{i}U_{i}\Omega_{i}\right)\right) = 0.

Now, we can construct a discrete Lagrangian with the constraint using a Lagrange multiplier $\Lambda\in\operatorname{Skew}(n)$,

\tilde{F}(g_{0},g_{1},\{\Omega_{i}\}_{i=1}^{s},\Lambda) = h\sum_{i=1}^{s}b_{i}L(U_{i},\Omega_{i}) + \left\langle\Lambda,\operatorname{Asym}\left(g_{1}^{T}\left(g_{0}+h\sum_{i=1}^{s}b_{i}U_{i}\Omega_{i}\right)\right)\right\rangle.

The corresponding variational integrator is given by

0 = \frac{\partial\tilde{F}}{\partial\Omega_{i}}, \qquad i = 1,2,\ldots,s, (9a)
0 = \frac{\partial\tilde{F}}{\partial\Lambda}, (9b)
U_{i} = \mathbb{P}\left(g_{0}+h\sum_{j=1}^{s}a_{ij}U_{j}\Omega_{j}\right), (9c)
-p_{0} = \frac{\partial\tilde{F}}{\partial g_{0}}, (9d)
p_{1} = \frac{\partial\tilde{F}}{\partial g_{1}}. (9e)

We shall compute the above equations more explicitly. It is easy to see that (9b) is equivalent to the constraint (7). Let us turn to (9a), where the main difficulty is the implicit dependence of $\{U_{i}\}_{i=1}^{s}$ on $\{\Omega_{i}\}_{i=1}^{s}$, which involves solving the nonlinear system (8). Suppose that $k\in\{1,2,\ldots,s\}$ is fixed, and we vary $\Omega_{k}$ such that $\Omega_{k}\to\Omega_{k}(t)$ with $\Omega_{k}(0)=\Omega_{k}$ and $\dot{\Omega}_{k}(0)=\delta\Omega_{k}$, while $\{\Omega_{i}\}_{i\neq k}$ remain fixed. Then,

U_{i}(t) = \mathbb{P}\left(g_{0}+h\sum_{j\neq k}a_{ij}U_{j}(t)\Omega_{j}+h\,a_{ik}U_{k}(t)\Omega_{k}(t)\right).

Differentiating both sides and letting $\dot{U}_{i}=U_{i}X_{ik}$, we have that

X_{ik} = d\mathbb{P}_{A_{i}}\left(h\sum_{j=1}^{s}a_{ij}U_{j}X_{jk}\Omega_{j}+h\,a_{ik}U_{k}\,\delta\Omega_{k}\right), (10)

where $A_{i}=g_{0}+h\sum_{j=1}^{s}a_{ij}U_{j}\Omega_{j}$. Then, (10) can be rewritten as

X_{ik} - d\mathbb{P}_{A_{i}}\left(h\sum_{j=1}^{s}a_{ij}U_{j}X_{jk}\Omega_{j}\right) = h\,d\mathbb{P}_{A_{i}}(a_{ik}U_{k}\,\delta\Omega_{k}). (11)

In order to represent $\{X_{ik}\}_{i=1}^{s}$ in terms of $\delta\Omega_{k}$, we define three maps,

\tilde{\psi}_{k}\colon \operatorname{Skew}(n)\longrightarrow\operatorname{Skew}(n)^{s}, \qquad \delta\Omega_{k}\longmapsto\{d\mathbb{P}_{A_{i}}(a_{ik}U_{k}\,\delta\Omega_{k})\}_{i=1}^{s},
\tilde{\phi}\colon \operatorname{Skew}(n)^{s}\longrightarrow\operatorname{Skew}(n)^{s}, \qquad \{X_{ik}\}_{i=1}^{s}\longmapsto\left\{X_{ik}-d\mathbb{P}_{A_{i}}\left(h\sum_{j=1}^{s}a_{ij}U_{j}X_{jk}\Omega_{j}\right)\right\}_{i=1}^{s},
\tilde{\pi}_{i}\colon \operatorname{Skew}(n)^{s}\longrightarrow\operatorname{Skew}(n), \qquad \{\Omega_{i}\}_{i=1}^{s}\longmapsto\Omega_{i}.

Then,

X_{ik} = (\tilde{\pi}_{i}\circ\tilde{\phi}^{-1}\circ h\tilde{\psi}_{k})(\delta\Omega_{k}) = h(\tilde{\pi}_{i}\circ\tilde{\phi}^{-1}\circ\tilde{\psi}_{k})(\delta\Omega_{k}). (12)

Now, we compute $\frac{\partial\tilde{F}}{\partial\Omega_{k}}$ by evaluating $\left.\frac{d}{dt}\right|_{t=0}\tilde{F}(g_{0},g_{1},\ldots,\Omega_{k}(t),\ldots,\Lambda)$ and expressing $\frac{\partial L}{\partial U}\colon G\times\mathfrak{g}\to\mathfrak{g}^{*}$ as a left-trivialized cotangent vector. Since this is a straightforward calculation of the variation, we present the result here for equation (9a) (see Section C.1 for details),

0 = h\sum_{i=1}^{s}b_{i}(\tilde{\pi}_{i}\circ\tilde{\phi}^{-1}\circ\tilde{\psi}_{k})^{*}\left(\frac{\partial L}{\partial U}(U_{i},\Omega_{i})+\operatorname{Asym}(U_{i}^{T}g_{1}\Lambda\Omega_{i}^{T})\right) + b_{k}\left(\frac{\partial L}{\partial\Omega}(U_{k},\Omega_{k})+\operatorname{Asym}(U_{k}^{T}g_{1}\Lambda)\right), (13)

for any $k=1,2,\ldots,s$.

Recall that $\tilde{\pi}_{i}\circ\tilde{\phi}^{-1}\circ\tilde{\psi}_{k}\colon\operatorname{Skew}(n)\to\operatorname{Skew}(n)$, and its dual is given by

(\tilde{\pi}_{i}\circ\tilde{\phi}^{-1}\circ\tilde{\psi}_{k})^{*} = \tilde{\psi}_{k}^{*}\circ(\tilde{\phi}^{*})^{-1}\circ\tilde{\pi}_{i}^{*}.

Therefore, let us derive the explicit expressions for the adjoints of our three proposed maps, so that we may write $(\tilde{\pi}_{i}\circ\tilde{\phi}^{-1}\circ\tilde{\psi}_{k})^{*}$ explicitly. The adjoint of $\tilde{\pi}_{i}$ is easy to compute: for any $S\in\operatorname{Skew}(n)$,

\tilde{\pi}_{i}^{*}(S) = (0,\ldots,S,\ldots,0), (14)

where $S$ is in the $i$-th position. For the adjoint of $\tilde{\phi}$, we consider the identity

\langle\tilde{\phi}^{*}(S_{1},\ldots,S_{s}),(\tilde{S}_{1},\ldots,\tilde{S}_{s})\rangle = \langle(S_{1},\ldots,S_{s}),\tilde{\phi}(\tilde{S}_{1},\ldots,\tilde{S}_{s})\rangle,

for any $(S_{1},\ldots,S_{s}),(\tilde{S}_{1},\ldots,\tilde{S}_{s})\in\operatorname{Skew}(n)^{s}$. Using the properties of the inner products again, we obtain the explicit expression (see Section C.2 for details),

\tilde{\phi}^{*}(S_{1},\ldots,S_{s}) = \left\{S_{j}-\operatorname{Asym}\left(hU_{j}^{T}\sum_{i=1}^{s}a_{ij}\,d\mathbb{P}_{A_{i}}^{*}(S_{i})\,\Omega_{j}^{T}\right)\right\}_{j=1}^{s}. (15)

Similarly, we consider the same identity and techniques to obtain the explicit expression for $\tilde{\psi}_{k}^{*}$ (see Section C.3 for details),

\tilde{\psi}_{k}^{*}(S_{1},\ldots,S_{s}) = \operatorname{Asym}\left(U_{k}^{T}\sum_{i=1}^{s}a_{ik}\,d\mathbb{P}_{A_{i}}^{*}(S_{i})\right). (16)

Combining (14), (15), and (16), $(\tilde{\pi}_{i}\circ\tilde{\phi}^{-1}\circ\tilde{\psi}_{k})^{*}(S)$ for $S\in\operatorname{Skew}(n)$ can be computed as follows,

S_{j} - \operatorname{Asym}\left(hU_{j}^{T}\sum_{l=1}^{s}a_{lj}\,d\mathbb{P}_{A_{l}}^{*}(S_{l})\,\Omega_{j}^{T}\right) = S\cdot\delta_{ij}, \qquad j = 1,2,\ldots,s, (17a)
(\tilde{\pi}_{i}\circ\tilde{\phi}^{-1}\circ\tilde{\psi}_{k})^{*}(S) = \operatorname{Asym}\left(U_{k}^{T}\sum_{l=1}^{s}a_{lk}\,d\mathbb{P}_{A_{l}}^{*}(S_{l})\right). (17b)

We can first calculate $\{S_{l}\}_{l=1}^{s}$ from (17a) by using fixed-point iteration, and then substitute the result into (17b) to obtain $(\tilde{\pi}_{i}\circ\tilde{\phi}^{-1}\circ\tilde{\psi}_{k})^{*}(S)$. This, combined with (13), gives an explicit formula for (9a).

We now derive an explicit formula for (9d). Notice that $U_{i}$ depends on $g_{0}$ through the nonlinear system (8), so we can use the method of variations again. If we vary $g_{0}\to g_{0}(t)$ such that $g_{0}(0)=g_{0}$ and $\dot{g}_{0}(0)=g_{0}\,\delta g_{0}$, we obtain

U_{i}(t) = \mathbb{P}\left(g_{0}(t)+h\sum_{j=1}^{s}a_{ij}U_{j}(t)\Omega_{j}\right).

Differentiating both sides and letting $\dot{U}_{i}=U_{i}Y_{i}$, where $Y_{i}\in\operatorname{Skew}(n)$ is a left-trivialized tangent vector, we obtain

Y_{i} = d\mathbb{P}_{A_{i}}\left(g_{0}\,\delta g_{0}+h\sum_{j=1}^{s}a_{ij}U_{j}Y_{j}\Omega_{j}\right),

which can be rewritten as

Y_{i} - d\mathbb{P}_{A_{i}}\left(h\sum_{j=1}^{s}a_{ij}U_{j}Y_{j}\Omega_{j}\right) = d\mathbb{P}_{A_{i}}(g_{0}\,\delta g_{0}).

Similar to the approach used for $X_{ik}$, we introduce a new map,

\tilde{\varphi}\colon \operatorname{Skew}(n)\longrightarrow\operatorname{Skew}(n)^{s}, \qquad \delta g_{0}\longmapsto\left\{d\mathbb{P}_{A_{i}}(g_{0}\,\delta g_{0})\right\}_{i=1}^{s},

so that $Y_{i}=(\tilde{\pi}_{i}\circ\tilde{\phi}^{-1}\circ\tilde{\varphi})(\delta g_{0})$. The explicit expression for $\tilde{\varphi}^{*}$ can be written as (see Section C.4 for details),

\tilde{\varphi}^{*}(S_{1},\ldots,S_{s}) = \operatorname{Asym}\left(g_{0}^{T}\sum_{i=1}^{s}d\mathbb{P}_{A_{i}}^{*}(S_{i})\right). (18)

As such, $(\tilde{\pi}_{i}\circ\tilde{\phi}^{-1}\circ\tilde{\varphi})^{*}(S)$ can be computed as follows,

S_{j} - \operatorname{Asym}\left(hU_{j}^{T}\sum_{l=1}^{s}a_{lj}\,d\mathbb{P}_{A_{l}}^{*}(S_{l})\,\Omega_{j}^{T}\right) = S\cdot\delta_{ij}, \qquad j = 1,2,\ldots,s, (19a)
(\tilde{\pi}_{i}\circ\tilde{\phi}^{-1}\circ\tilde{\varphi})^{*}(S) = \operatorname{Asym}\left(g_{0}^{T}\sum_{l=1}^{s}d\mathbb{P}_{A_{l}}^{*}(S_{l})\right). (19b)

Now, we compute $\frac{\partial\tilde{F}}{\partial g_{0}}$, which gives us (9d) explicitly (see Section C.5 for details),

-p_{0} = h\sum_{i=1}^{s}b_{i}(\tilde{\pi}_{i}\circ\tilde{\phi}^{-1}\circ\tilde{\varphi})^{*}\left(\frac{\partial L}{\partial U}(U_{i},\Omega_{i})+\operatorname{Asym}(U_{i}^{T}g_{1}\Lambda\Omega_{i}^{T})\right) + \operatorname{Asym}(g_{0}^{T}g_{1}\Lambda). (20)

For equation (9e), it is easy to show that

p_{1} = \operatorname{Asym}\left(g_{1}^{T}\left(g_{0}+h\sum_{i=1}^{s}b_{i}U_{i}\Omega_{i}\right)\Lambda^{T}\right). (21)

Combining (13), (7), (8), (20), and (21), we obtain a Lagrangian variational integrator on $SO(n)$,

0 = h\sum_{i=1}^{s}b_{i}(\tilde{\pi}_{i}\circ\tilde{\phi}^{-1}\circ\tilde{\psi}_{k})^{*}\left(\frac{\partial L}{\partial U}(U_{i},\Omega_{i})+\operatorname{Asym}(U_{i}^{T}g_{1}\Lambda\Omega_{i}^{T})\right) + b_{k}\left(\frac{\partial L}{\partial\Omega}(U_{k},\Omega_{k})+\operatorname{Asym}(U_{k}^{T}g_{1}\Lambda)\right), (22a)
g_{1} = \mathbb{P}\left(g_{0}+h\sum_{i=1}^{s}b_{i}U_{i}\Omega_{i}\right), (22b)
U_{i} = \mathbb{P}\left(g_{0}+h\sum_{j=1}^{s}a_{ij}U_{j}\Omega_{j}\right), (22c)
-p_{0} = h\sum_{i=1}^{s}b_{i}(\tilde{\pi}_{i}\circ\tilde{\phi}^{-1}\circ\tilde{\varphi})^{*}\left(\frac{\partial L}{\partial U}(U_{i},\Omega_{i})+\operatorname{Asym}(U_{i}^{T}g_{1}\Lambda\Omega_{i}^{T})\right) + \operatorname{Asym}(g_{0}^{T}g_{1}\Lambda), (22d)
p_{1} = \operatorname{Asym}\left(g_{1}^{T}\left(g_{0}+h\sum_{i=1}^{s}b_{i}U_{i}\Omega_{i}\right)\Lambda^{T}\right). (22e)

The integrator gives an update map $(g_{0},p_{0})\mapsto(g_{1},p_{1})$ on the cotangent bundle. In particular, one may solve for $(\{\Omega_{k}\}_{k=1}^{s},g_{1},\{U_{i}\}_{i=1}^{s},\Lambda)$ simultaneously using equations (22a)–(22d). Unfortunately, while (22b)–(22d) can be written in fixed-point form for the variables $g_{1}$, $\{U_{i}\}_{i=1}^{s}$, and $\Lambda$, (22a) is implicit in $\{\Omega_{k}\}_{k=1}^{s}$. However, we can arrive at a fixed-point form for (22a) on the Hamiltonian side if $L$ is hyperregular.

4. Hamiltonian Variational Integrators on the Rotation Group $SO(n)$

It is often desirable to transform a numerical method from the Lagrangian side to the Hamiltonian side, which is possible if $L$ is hyperregular. The same mechanical system can be represented by either a Lagrangian or a Hamiltonian, and the two are related by the Legendre transform. In Euclidean space, this gives

(TQ, L) \;\overset{\mathbb{F}L}{\underset{\mathbb{F}H}{\rightleftarrows}}\; (T^{*}Q, H),

and we have the following relationships,

\frac{\partial L}{\partial\dot{q}}(q,\dot{q}) = p, \qquad \frac{\partial H}{\partial p}(q,p) = \dot{q}, \qquad \frac{\partial L}{\partial q}(q,\dot{q}) = -\frac{\partial H}{\partial q}(q,p).

Given a Lie group $G$, a left-trivialized Lagrangian $L\colon G\times\mathfrak{g}\to\mathbb{R}$, and its corresponding Hamiltonian $H\colon G\times\mathfrak{g}^{*}\to\mathbb{R}$, it is easy to verify that similar relationships hold for these trivializations,

\frac{\partial L}{\partial\epsilon}(g,\epsilon) = \mu, \qquad \frac{\partial H}{\partial\mu}(g,\mu) = \epsilon, \qquad \frac{\partial L}{\partial g}(g,\epsilon) = -\frac{\partial H}{\partial g}(g,\mu). (23)

Using (23) and denoting the corresponding internal cotangent vectors by $\{\mu_{k}\}_{k=1}^{s}$, (22) can be transformed to the Hamiltonian form as

\mu_{k} = -\operatorname{Asym}(U_{k}^{T}g_{1}\Lambda) + h\sum_{i=1}^{s}\frac{b_{i}}{b_{k}}(\tilde{\pi}_{i}\circ\tilde{\phi}^{-1}\circ\tilde{\psi}_{k})^{*}\left(\frac{\partial H}{\partial U}(U_{i},\mu_{i})-\operatorname{Asym}(U_{i}^{T}g_{1}\Lambda\Omega_{i}^{T})\right), (24a)
g_{1} = \mathbb{P}\left(g_{0}+h\sum_{i=1}^{s}b_{i}U_{i}\Omega_{i}\right), (24b)
U_{i} = \mathbb{P}\left(g_{0}+h\sum_{j=1}^{s}a_{ij}U_{j}\Omega_{j}\right), (24c)
\operatorname{Asym}(g_{0}^{T}g_{1}\Lambda) = -p_{0} + h\sum_{i=1}^{s}b_{i}(\tilde{\pi}_{i}\circ\tilde{\phi}^{-1}\circ\tilde{\varphi})^{*}\left(\frac{\partial H}{\partial U}(U_{i},\mu_{i})-\operatorname{Asym}(U_{i}^{T}g_{1}\Lambda\Omega_{i}^{T})\right), (24d)
p_{1} = \operatorname{Asym}\left(g_{1}^{T}\left(g_{0}+h\sum_{i=1}^{s}b_{i}U_{i}\Omega_{i}\right)\Lambda^{T}\right), (24e)
\Omega_{i} = \frac{\partial H}{\partial\mu}(U_{i},\mu_{i}). (24f)

In the above algorithm, \Omega_{i} is given explicitly by (24f); it is introduced only to reduce redundancy in the computations, since the \Omega_{i} are used numerous times in the other expressions. Similarly, (\tilde{\pi}_{i}\circ\tilde{\phi}^{-1})^{*} appears in both (24a) and (24d), so one can save computational effort by reusing the shared solution to (17a) and (19a). The first four equations can then be solved simultaneously by fixed-point iteration, meaning the variables (\{\mu_{k}\}_{k=1}^{s},g_{1},\{U_{i}\}_{i=1}^{s},\Lambda) are updated concurrently in each iteration. Observe that the fixed-point form for \Lambda in (24d) requires solving a Lyapunov equation. Finally, p_{1} is computed explicitly from (24e) after solving for (\{\mu_{k}\}_{k=1}^{s},g_{1},\{U_{i}\}_{i=1}^{s},\Lambda). We shall call the integrators defined by (24) the variational polar decomposition method, or VPD for short.
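Two computational kernels recur throughout (24): the polar projection \mathbb{P} and Lyapunov solves of the form \operatorname*{Lyap}(P,S). The following Python sketch (our own illustration, not the full integrator) computes both with SciPy, assuming \operatorname*{Lyap}(P,S) denotes the solution X of PX+XP=S; the helper names pol_proj and lyap are ours.

```python
import numpy as np
from scipy.linalg import polar, solve_sylvester

def pol_proj(A):
    """Polar projection P(A): the orthogonal factor U of A = U P."""
    U, _ = polar(A)
    return U

def lyap(P, S):
    """Solve the Lyapunov equation P X + X P = S for X (we assume this
    is what Lyap(P, S) denotes in the text)."""
    return solve_sylvester(P, P, S)

rng = np.random.default_rng(0)
A = np.eye(3) + 0.1 * rng.standard_normal((3, 3))  # a matrix near I
U = pol_proj(A)
P = U.T @ A                       # symmetric positive-definite factor
M = rng.standard_normal((3, 3))
S = 0.5 * (M - M.T)               # a skew-symmetric right-hand side
X = lyap(P, S)
```

Each fixed-point sweep of (24) would call these kernels once per stage.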

4.1. Lie–Poisson Integrator by Reduction

We consider a GG-invariant Hamiltonian system given by HH on the cotangent bundle TGT^{*}G. In this case, Hamilton’s equations can be reduced to a Lie–Poisson system on 𝔤\mathfrak{g}^{*}. If the Hamiltonian is hyperregular, then both the Lagrangian and the corresponding constrained Galerkin discrete Lagrangian Ld(g0,g1)L_{d}(g_{0},g_{1}) will be GG-invariant. As such, (2) naturally reduces to yield a Lie–Poisson integrator (see Appendix B). We only consider the reduction on the Hamiltonian side due to the nature of the constrained Galerkin methods, which give an update map on the cotangent bundle.

The discrete Lagrangian we have constructed becomes

Ld(g0,g1)=ext{Ωi}i=1si=1sbi𝒍(Ωi),L_{d}(g_{0},g_{1})=\operatorname*{ext}_{\{\Omega_{i}\}_{i=1}^{s}}\sum\limits_{i=1}^{s}b_{i}\bm{l}(\Omega_{i}),

where

Ui\displaystyle U_{i} =(g0+hj=1saijUjΩj),\displaystyle=\mathbb{P}\left(g_{0}+h\sum\nolimits_{j=1}^{s}a_{ij}U_{j}\Omega_{j}\right),
g1\displaystyle g_{1} =(g0+hi=1sbiUiΩi),\displaystyle=\mathbb{P}\left(g_{0}+h\sum\nolimits_{i=1}^{s}b_{i}U_{i}\Omega_{i}\right),

and 𝒍:𝔤\bm{l}\colon\mathfrak{g}\to\mathbb{R} is the reduced Lagrangian. It is easy to verify that our system is GG-invariant, meaning

Ld(gg0,gg1)=Ld(g0,g1)=ext{Ωi}i=1si=1sbi𝒍(Ωi),L_{d}(g\cdot g_{0},g\cdot g_{1})=L_{d}(g_{0},g_{1})=\operatorname*{ext}_{\{\Omega_{i}\}_{i=1}^{s}}\sum\limits_{i=1}^{s}b_{i}\bm{l}(\Omega_{i}),

where

gUi\displaystyle g\cdot U_{i} =((gg0)+hj=1saij(gUjΩj)),\displaystyle=\mathbb{P}\left((g\cdot g_{0})+h\sum\nolimits_{j=1}^{s}a_{ij}(g\cdot U_{j}\Omega_{j})\right),
gg1\displaystyle g\cdot g_{1} =((gg0)+hi=1sbi(gUiΩi)).\displaystyle=\mathbb{P}\left((g\cdot g_{0})+h\sum\nolimits_{i=1}^{s}b_{i}(g\cdot U_{i}\Omega_{i})\right).

Therefore, the variational integrator (24) can theoretically be reduced to a Lie–Poisson integrator. By letting g0Tg1=f0g_{0}^{T}g_{1}=f_{0} and UiTg1=ΘiU_{i}^{T}g_{1}=\Theta_{i}, (24) simplifies to

\mu_{k} =-\operatorname*{Asym}(\Theta_{k}\Lambda)-h\sum\limits_{i=1}^{s}\frac{b_{i}}{b_{k}}(\tilde{\pi}_{i}\circ\tilde{\phi}^{-1}\circ\tilde{\psi}_{k})^{*}(\operatorname*{Asym}(\Theta_{i}\Lambda\Omega_{i}^{T})), (25a)
0\displaystyle 0 =Asym(f0T+hi=1sbiΘiTΩi),\displaystyle=\operatorname*{Asym}\left(f_{0}^{T}+h\sum\nolimits_{i=1}^{s}b_{i}\Theta_{i}^{T}\Omega_{i}\right), (25b)
ΘiT\displaystyle\Theta_{i}^{T} =(f0T+hj=1saijΘjTΩj),\displaystyle=\mathbb{P}\left(f_{0}^{T}+h\sum\nolimits_{j=1}^{s}a_{ij}\Theta_{j}^{T}\Omega_{j}\right), (25c)
Asym(f0Λ)\displaystyle\operatorname*{Asym}(f_{0}\Lambda) =p0hi=1sbi(π~iϕ~1φ~)(Asym(ΘiΛΩiT)),\displaystyle=-p_{0}-h\sum\limits_{i=1}^{s}b_{i}(\tilde{\pi}_{i}\circ\tilde{\phi}^{-1}\circ\tilde{\varphi})^{*}(\operatorname*{Asym}(\Theta_{i}\Lambda\Omega_{i}^{T})), (25d)
p1\displaystyle p_{1} =Asym((f0T+hi=1sbiΘiTΩi)ΛT),\displaystyle=\operatorname*{Asym}\left(\left(f_{0}^{T}+h\sum\nolimits_{i=1}^{s}b_{i}\Theta_{i}^{T}\Omega_{i}\right)\Lambda^{T}\right), (25e)
Ωi\displaystyle\Omega_{i} =𝒉μ(μi),\displaystyle=\frac{\partial\bm{h}}{\partial\mu}(\mu_{i}), (25f)

where 𝒉\bm{h} is the reduced Hamiltonian. Multiplying by g1Tg_{1}^{T} on both sides of g1=(g0+hi=1sbiUiΩi)g_{1}=\mathbb{P}\left(g_{0}+h\sum\nolimits_{i=1}^{s}b_{i}U_{i}\Omega_{i}\right) yields

I=(f0T+hi=1sbiΘiTΩi).I=\mathbb{P}\left(f_{0}^{T}+h\sum\nolimits_{i=1}^{s}b_{i}\Theta_{i}^{T}\Omega_{i}\right).

Suppose that h is small and that g_{0} and g_{1} are close; then \left(f_{0}^{T}+h\sum\nolimits_{i=1}^{s}b_{i}\Theta_{i}^{T}\Omega_{i}\right) lies in a neighborhood of I. By Lemma 1, this is equivalent to

Asym(f0T+hi=1sbiΘiTΩi)=0,\operatorname*{Asym}\left(f_{0}^{T}+h\sum\nolimits_{i=1}^{s}b_{i}\Theta_{i}^{T}\Omega_{i}\right)=0,

which can be regarded as the fixed-point equation for f0f_{0}. The first four equations can be solved using fixed-point iteration for the variables ({μk}k=1s,f0,{Θi}i=1s,Λ)(\{\mu_{k}\}_{k=1}^{s},f_{0},\{\Theta_{i}\}_{i=1}^{s},\Lambda) as in our previous discussions. Then, p1p_{1} can be calculated explicitly.
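As a quick numerical sanity check of the equivalence provided by Lemma 1 (near the identity, \mathbb{P}(F)=I exactly when \operatorname*{Asym}(F)=0), one can verify that the orthogonal polar factor of a symmetric positive-definite perturbation of I is I itself; the helper asym is our own.

```python
import numpy as np
from scipy.linalg import polar

def asym(M):
    """Antisymmetric part Asym(M) = (M - M^T)/2."""
    return 0.5 * (M - M.T)

rng = np.random.default_rng(1)
B = 0.05 * rng.standard_normal((3, 3))
F = np.eye(3) + B + B.T   # symmetric positive-definite, near I: Asym(F) = 0
U, _ = polar(F)           # orthogonal polar factor P(F)
```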

In the above algorithm, we also need to determine the reduced versions of (\tilde{\pi}_{i}\circ\tilde{\phi}^{-1}\circ\tilde{\psi}_{k})^{*} and (\tilde{\pi}_{i}\circ\tilde{\phi}^{-1}\circ\tilde{\varphi})^{*}. Note that (17a) and (17b) involve U_{j}^{T}d\mathbb{P}_{A_{i}}^{*}; in addition, g_{0},g_{1},U_{i}, and A_{i}=g_{0}+h\sum\nolimits_{j=1}^{s}a_{ij}U_{j}\Omega_{j}=U_{i}P_{i} are reduced, so we need a reduced version of U_{j}^{T}d\mathbb{P}_{A_{i}}^{*} as well. Multiplying A_{i} on the left by g_{1}^{T}, we obtain

g1TAi=f0T+hj=1saijΘjTΩj=(g1TUi)Pi.g_{1}^{T}A_{i}=f_{0}^{T}+h\sum\limits_{j=1}^{s}a_{ij}\Theta_{j}^{T}\Omega_{j}=(g_{1}^{T}U_{i})P_{i}.

Then, (g1TUi)Pi(g_{1}^{T}U_{i})P_{i} is the polar decomposition of f0T+hj=1saijΘjTΩjf_{0}^{T}+h\sum\nolimits_{j=1}^{s}a_{ij}\Theta_{j}^{T}\Omega_{j}, and for SSkew(n)S\in\operatorname*{Skew}(n),

UjTdAi(S)\displaystyle U_{j}^{T}d\mathbb{P}_{A_{i}}^{*}(S) =UjTUiLyap(Pi,ST)=Θjg1TUiLyap(Pi,ST),\displaystyle=U_{j}^{T}\cdot U_{i}\operatorname*{Lyap}(P_{i},S^{T})=\Theta_{j}g_{1}^{T}U_{i}\operatorname*{Lyap}(P_{i},S^{T}),
=Θj(f0T+hj=1saijΘjTΩj)Lyap(Pi,ST).\displaystyle=\Theta_{j}\mathbb{P}\left(f_{0}^{T}+h\sum\nolimits_{j=1}^{s}a_{ij}\Theta_{j}^{T}\Omega_{j}\right)\operatorname*{Lyap}(P_{i},S^{T}).

This is the reduced version of UjTdAi(S)U_{j}^{T}d\mathbb{P}_{A_{i}}^{*}(S), and so (π~iϕ~1ψ~k)(S)(\tilde{\pi}_{i}\circ\tilde{\phi}^{-1}\circ\tilde{\psi}_{k})^{*}(S) can be computed as follows,

S_{j}-\operatorname*{Asym}\left(h\Theta_{j}\sum\nolimits_{l=1}^{s}a_{lj}d\mathbb{P}_{A_{l}}^{*}(S_{l})\Omega_{j}^{T}\right)=S\cdot\delta_{ij},\qquad j=1,2,\ldots,s, (26a)
(π~iϕ~1ψ~k)(S)=Asym(Θkl=1salkdAl(Sl)),\displaystyle(\tilde{\pi}_{i}\circ\tilde{\phi}^{-1}\circ\tilde{\psi}_{k})^{*}(S)=\operatorname*{Asym}\left(\Theta_{k}\sum\nolimits_{l=1}^{s}a_{lk}d\mathbb{P}_{A_{l}}^{*}(S_{l})\right), (26b)

where {Ai}\{A_{i}\} are redefined to be Ai=f0T+hj=1saijΘjTΩjA_{i}=f_{0}^{T}+h\sum\nolimits_{j=1}^{s}a_{ij}\Theta_{j}^{T}\Omega_{j}. For (π~iϕ~1φ~)(\tilde{\pi}_{i}\circ\tilde{\phi}^{-1}\circ\tilde{\varphi})^{*}, we need to compute g0TdAi(S)g_{0}^{T}d\mathbb{P}_{A_{i}}^{*}(S) for some SSkew(n)S\in\operatorname*{Skew}(n), which is given by

g0TdAi(S)\displaystyle g_{0}^{T}d\mathbb{P}_{A_{i}}^{*}(S) =g0TUiLyap(Pi,ST)=f0g1TUiLyap(Pi,ST)\displaystyle=g_{0}^{T}\cdot U_{i}\operatorname*{Lyap}(P_{i},S^{T})=f_{0}g_{1}^{T}U_{i}\operatorname*{Lyap}(P_{i},S^{T})
=f0(f0T+hj=1saijΘjTΩj)Lyap(Pi,ST).\displaystyle=f_{0}\mathbb{P}\left(f_{0}^{T}+h\sum\nolimits_{j=1}^{s}a_{ij}\Theta_{j}^{T}\Omega_{j}\right)\operatorname*{Lyap}(P_{i},S^{T}).

Hence, we have

S_{j}-\operatorname*{Asym}\left(h\Theta_{j}\sum\nolimits_{l=1}^{s}a_{lj}d\mathbb{P}_{A_{l}}^{*}(S_{l})\Omega_{j}^{T}\right)=S\cdot\delta_{ij},\qquad j=1,2,\ldots,s, (27a)
(π~iϕ~1φ~)(S)=Asym(f0l=1salkdAl(Sl)).\displaystyle(\tilde{\pi}_{i}\circ\tilde{\phi}^{-1}\circ\tilde{\varphi})^{*}(S)=\operatorname*{Asym}\left(f_{0}\sum\nolimits_{l=1}^{s}a_{lk}d\mathbb{P}_{A_{l}}^{*}(S_{l})\right). (27b)

5. Numerical Experiments

We test our methods and compare them to the variational Runge–Kutta–Munthe-Kaas (VRKMK) methods from Bogfjellmo and Marthinsen [3] on the dipole on a stick problem that they considered. In particular, our configuration space is SO(3)SO(3), and its Lie algebra is identified with 3\mathbb{R}^{3}. We shall only recall the mathematical expressions here, so for a thorough description of the system one should refer to the reference above. The right-trivialized and left-trivialized Hamiltonians, HR,HL:SO(3)×3H^{R},H^{L}\colon SO(3)\times\mathbb{R}^{3}\to\mathbb{R}, can be written as

HR(g,p)\displaystyle H^{R}(g,p) =12pTgJ1gTp+U(g),\displaystyle=\frac{1}{2}p^{T}gJ^{-1}g^{T}p+U(g), (28)
HL(g,p~)\displaystyle H^{L}(g,\tilde{p}) =12p~TJ1p~+U(g),\displaystyle=\frac{1}{2}\tilde{p}^{T}J^{-1}\tilde{p}+U(g), (29)

where U(g)=me3Tge3+qβ(gy+0z21gy0z21)U(g)=me_{3}^{T}ge_{3}+q\beta\left(\lVert gy_{+}^{0}-z\rVert_{2}^{-1}-\lVert gy_{-}^{0}-z\rVert_{2}^{-1}\right). Note that J=mdiag(1+α2,1,α2)J=m\,\text{diag}(1+\alpha^{2},1,\alpha^{2}), with m=1m=1 and α=0.1\alpha=0.1. The constant vectors are y±0=(0,±α,1)Ty_{\pm}^{0}=(0,\pm\alpha,-1)^{T} and z=(0,0,3/2)Tz=(0,0,-3/2)^{T}. Lastly, q=β=1q=\beta=1, and 2\|\cdot\|_{2} is our usual Euclidean norm.
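For concreteness, the left- and right-trivialized Hamiltonians (28)–(29) and the potential U(g) can be coded directly from the problem data above; the function and variable names are our own, but all constants match the text.

```python
import numpy as np

# Problem data for the dipole on a stick (Section 5)
m, alpha, q_c, beta = 1.0, 0.1, 1.0, 1.0
J = m * np.diag([1 + alpha**2, 1.0, alpha**2])
Jinv = np.linalg.inv(J)
e2 = np.array([0.0, 1.0, 0.0])
e3 = np.array([0.0, 0.0, 1.0])
y_plus, y_minus = np.array([0.0, alpha, -1.0]), np.array([0.0, -alpha, -1.0])
z = np.array([0.0, 0.0, -1.5])

def potential(g):
    """U(g) = m e3^T g e3 + q beta (||g y+ - z||^-1 - ||g y- - z||^-1)."""
    return (m * e3 @ g @ e3
            + q_c * beta * (1.0 / np.linalg.norm(g @ y_plus - z)
                            - 1.0 / np.linalg.norm(g @ y_minus - z)))

def H_left(g, p_tilde):
    """Left-trivialized Hamiltonian (29)."""
    return 0.5 * p_tilde @ Jinv @ p_tilde + potential(g)

def H_right(g, p):
    """Right-trivialized Hamiltonian (28)."""
    return 0.5 * p @ g @ Jinv @ g.T @ p + potential(g)

# Initial data of Section 5 and the relation (30)
g0 = np.array([[1.0, 0.0, 0.0],
               [0.0, 0.0, -1.0],
               [0.0, 1.0, 0.0]])
p0 = g0 @ J @ g0.T @ e2   # p(0) = g(0) J g(0)^T e2
p0_tilde = g0.T @ p0      # p~(0) from (30)
```

Both trivializations give the same energy for corresponding momenta, as (30) requires.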

Both forms of the Hamiltonian are written here because while our method was developed for left-trivialized systems using Hamilton’s principle, their method was developed for right-trivialized systems using the Hamilton–Pontryagin principle. As a result, both discretizations yield symplectic variational integrators for the Hamiltonian with their corresponding choice of trivialization. In particular, we have

p~=gTp\tilde{p}=g^{T}p (30)

as a relationship between the dual elements of the corresponding trivialization: pp is the dual representation of ξ𝔤\xi\in\mathfrak{g} for the right-trivialization g˙=ξg\dot{g}=\xi g, and p~\tilde{p} is the dual representation of ξ~𝔤\tilde{\xi}\in\mathfrak{g} for the left-trivialization g˙=gξ~\dot{g}=g\tilde{\xi}. Consequently, we note that the left-trivialized cotangent vector Hg\frac{\partial H}{\partial g} in VPD (24) is computed as

Hg(g,p~)=Asym(g1gHL(g,p~)),\frac{\partial H}{\partial g}(g,\tilde{p})=\operatorname*{Asym}\left(g^{-1}\nabla_{g}H^{L}(g,\tilde{p})\right),

where gHL(g,p~)\nabla_{g}H^{L}(g,\tilde{p}) is the matrix derivative of HL(g,p~)H^{L}(g,\tilde{p}) with respect to gg. On the other hand, the right-trivialized cotangent vector is computed as

Hg(g,p)=Asym(gHR(g,p)g1),\frac{\partial H}{\partial g}(g,p)=\operatorname*{Asym}\left(\nabla_{g}H^{R}(g,p)g^{-1}\right),

when implementing the VRKMK method.
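As a sanity check of this Asym-based formula for the trivialized derivative, one can verify numerically that \langle\operatorname*{Asym}(g^{-1}\nabla_{g}H),S\rangle recovers the directional derivative of H along the left-trivialized direction S. The toy Hamiltonian H(g)=\operatorname*{tr}(Cg) below is our own illustration, not the dipole Hamiltonian; for orthogonal g we use g^{-1}=g^{T}.

```python
import numpy as np
from scipy.linalg import expm

def asym(M):
    """Antisymmetric part Asym(M) = (M - M^T)/2."""
    return 0.5 * (M - M.T)

rng = np.random.default_rng(2)
C = rng.standard_normal((3, 3))
H = lambda g: np.trace(C @ g)    # toy Hamiltonian in g (our illustration)
grad_H = C.T                     # matrix derivative of tr(C g) w.r.t. g

S = asym(rng.standard_normal((3, 3)))             # skew direction in so(3)
g, _ = np.linalg.qr(rng.standard_normal((3, 3)))  # an orthogonal matrix

t = 1e-6   # central finite difference along the curve g exp(tS)
fd = (H(g @ expm(t * S)) - H(g @ expm(-t * S))) / (2 * t)
pairing = np.trace(asym(g.T @ grad_H).T @ S)      # <Asym(g^{-1} grad H), S>
```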

For our tests, we also have the same initial data from [3],

g(0)\displaystyle g(0) =(100001010),\displaystyle=\begin{pmatrix}1&0&0\\ 0&0&-1\\ 0&1&0\end{pmatrix},
p(0)\displaystyle p(0) =g(0)Jg(0)Te2,\displaystyle=g(0)Jg(0)^{T}e_{2},

for the VRKMK methods, and so (30) gives \tilde{p}(0) to complete the initial data for the VPD methods. Since both methods involve fixed-point iteration, we terminate the process when the norm \|\cdot\|_{2} of the difference between the current and previous iterates is less than 10^{-15} for each variable. In particular, the norm for vectors is the Euclidean norm and for matrices it is the induced matrix norm. Lastly, we ran these implementations in Wolfram Mathematica 12 on a personal computer with the following specifications: Operating system: Windows 10; CPU: 3.7GHz AMD Ryzen 5 5600X; RAM: G.Skill Trident Z Neo Series 32 GB DDR4 3600 C16; Storage: 1TB Rocket NVMe PCIe 4.0 writing/reading up to 5000/4400 MB/s.

5.1. Order Tests

(a) 2nd order Gauss–Legendre method:
\[\begin{array}{c|c}
\frac{1}{2} & \frac{1}{2}\\ \hline
 & 1
\end{array}\]

(b) 3rd order Runge–Kutta method:
\[\begin{array}{c|ccc}
0 & 0 & 0 & 0\\
\frac{1}{2} & \frac{1}{2} & 0 & 0\\
1 & -1 & 2 & 0\\ \hline
 & \frac{1}{6} & \frac{2}{3} & \frac{1}{6}
\end{array}\]

(c) 4th order Gauss–Legendre method:
\[\begin{array}{c|cc}
\frac{1}{2}-\frac{\sqrt{3}}{6} & \frac{1}{4} & \frac{1}{4}-\frac{\sqrt{3}}{6}\\
\frac{1}{2}+\frac{\sqrt{3}}{6} & \frac{1}{4}+\frac{\sqrt{3}}{6} & \frac{1}{4}\\ \hline
 & \frac{1}{2} & \frac{1}{2}
\end{array}\]

(d) 6th order Gauss–Legendre method:
\[\begin{array}{c|ccc}
\frac{1}{2}-\frac{\sqrt{15}}{10} & \frac{5}{36} & \frac{2}{9}-\frac{\sqrt{15}}{15} & \frac{5}{36}-\frac{\sqrt{15}}{30}\\
\frac{1}{2} & \frac{5}{36}+\frac{\sqrt{15}}{24} & \frac{2}{9} & \frac{5}{36}-\frac{\sqrt{15}}{24}\\
\frac{1}{2}+\frac{\sqrt{15}}{10} & \frac{5}{36}+\frac{\sqrt{15}}{30} & \frac{2}{9}+\frac{\sqrt{15}}{15} & \frac{5}{36}\\ \hline
 & \frac{5}{18} & \frac{4}{9} & \frac{5}{18}
\end{array}\]

Table 1. Butcher tableaux for the comparison tests
Figure 1. Order comparison plots between VRKMK and VPD methods: The black-dashed lines are references for the corresponding orders.
Figure 2. Long-term energy error for second order VRKMK/VPD methods.
Figure 3. Long-term energy error for third order VRKMK/VPD methods.
Figure 4. Long-term energy error for fourth order VRKMK/VPD methods.
Figure 5. Long-term energy error for sixth order VRKMK/VPD methods.
Figure 6. Orthogonality error for sixth order VRKMK/VPD methods.
Figure 7. Run-time comparison for fourth and sixth order VRKMK/VPD methods.

We run both methods based on the one-, two-, and three-stage Gauss–Legendre methods and a third-order Runge–Kutta method for comparisons, and these methods are shown as Butcher tableaux in Table 1.

We compute the errors in (g(0.5),p(0.5)) from VRKMK and the errors in (g(0.5),\tilde{p}(0.5)) from VPD with respect to a reference solution; these errors are shown in Figure 1. The reference solution was calculated using the sixth-order VPD method with step size h=0.001. The error for VRKMK is computed as \|g(0.5)^{T}p(0.5)-\tilde{p}_{\text{ref}}\|_{2}+\|g(0.5)-g_{\text{ref}}\|_{2}, while the error for VPD is computed as \|\tilde{p}(0.5)-\tilde{p}_{\text{ref}}\|_{2}+\|g(0.5)-g_{\text{ref}}\|_{2}. The black-dashed lines are reference lines for the appropriate orders. In the sixth order comparison plot, the errors from the fixed-point iteration dominate for smaller step sizes, so the theoretical order is not observed in that regime. Otherwise, both methods achieve their theoretical orders and are quite comparable, with VPD exhibiting a noticeably smaller error constant in the second order method.

5.2. Long-Term Behaviors

The long-term energy behaviors for both methods are presented in Figures 2–5. For second, third, and fourth order, we fixed the step size h=0.01 and ran 10^{5} integration steps to observe the energy errors. For second order, the energy errors have orders of magnitude 10^{-4} for VRKMK and 10^{-5} for VPD; for third order, 10^{-7} for VRKMK and 10^{-6} for VPD; for fourth order, 10^{-9} for both methods.

For the sixth order, we consider step size h=1/26 to avoid the regime where the errors in the fixed-point iteration dominate. This step size corresponds to the third point from the right in the order comparison plots. We again run 10^{5} integration steps and observe the energy errors in Figure 5. The energy error for the VPD method is stable, with an order of magnitude of 10^{-10}. On the other hand, the energy error in VRKMK exhibits a slow increase over the integration time span. We investigated this phenomenon and attribute it to the framework of each method. In VPD, g_{1} in (24b) and the internal points U_{i} in (24c) are updated in each integration step via the polar decomposition. As a result, the Lie group structure is always preserved up to machine precision for both the configuration space elements and the internal points. However, in VRKMK, both g_{1} and the internal points U_{i} are updated by the left action of SO(n). This left action is a matrix multiplication performed to machine precision, but over a sufficient number of multiplications the round-off error still accumulates. Consequently, the Lie group structure is not as well preserved in comparison, as illustrated by the orthogonality errors \|gg^{T}-I_{3}\|_{2} in Figure 6.
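The effect of the polar projection on the orthogonality error can be isolated in a few lines: projecting a matrix that has drifted slightly off SO(3) back onto the group reduces \|gg^{T}-I\|_{2} to machine precision. A minimal sketch of this (our own, with an artificial perturbation standing in for accumulated round-off):

```python
import numpy as np
from scipy.linalg import polar, expm

def orth_err(g):
    """Orthogonality error ||g g^T - I||_2."""
    return np.linalg.norm(g @ g.T - np.eye(3), 2)

rng = np.random.default_rng(3)
w = rng.standard_normal(3)
W = np.array([[0.0, -w[2], w[1]],
              [w[2], 0.0, -w[0]],
              [-w[1], w[0], 0.0]])   # hat map of w
R = expm(W)                          # a rotation in SO(3)

g_bad = R + 1e-8 * rng.standard_normal((3, 3))  # drifted off the group
g_fix, _ = polar(g_bad)                         # project back onto SO(3)
```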

We also observed that projecting the sixth-order VRKMK method onto the rotation group using the polar decomposition at each timestep does not recover the near energy preservation typical of symplectic methods. Presumably, this is because the projection subtly compromises the symplecticity of the method. This is consistent with the observation made in [10] that both the Lie group structure and symplecticity need to be preserved for the methods to exhibit near energy preservation.

5.3. Runtime Comparison

We also have data for the runtime comparison of VRKMK and VPD for the two-stage and three-stage Gauss–Legendre methods in Figure 7. For each step size h, we recorded the runtime of each method in seconds and repeated the execution 256 times to compute the average runtime. The runtime for VPD methods is on average longer than for VRKMK methods. This may be due in part to the Lyapunov solves and the multiple layers of fixed-point iteration required for the polar decomposition in (4) and for (\tilde{\pi}_{i}\circ\tilde{\phi}^{-1})^{*} in (17a) and (19a).

6. Conclusion

By applying the polar decomposition in a constrained Galerkin method, we constructed high-order Lie group symplectic integrators on SO(n)SO(n), which we refer to as the variational polar decomposition method. Furthermore, the integrator can be reduced to a Lie–Poisson integrator when the underlying mechanical system is group-invariant. We performed extensive numerical experiments comparing our proposed method to the variational Runge–Kutta–Munthe-Kaas method from [3]. These comparisons provide insights into each method and highlight an advantage of our proposed method, which is the preservation of Lie group structure up to machine precision for both the configurations and internal points. This appears to be important for high-order methods to exhibit the near energy preservation that one expects for symplectic integrators when applied to dynamics on Lie groups.

For future work, it would be interesting to explore the generalization of the proposed method to symmetric spaces, by applying the generalized polar decomposition [15]. This may be of particular interest in the construction of accelerated optimization algorithms on symmetric spaces, following the use of time-adaptive variational integrators for symplectic optimization [6] based on the discretization of Bregman Lagrangian and Hamiltonian flows [16, 5]. Examples of symmetric spaces include the space of Lorentzian metrics, the space of symmetric positive-definite matrices, and the Grassmannian.

Acknowledgements

This research has been supported in part by NSF under grants DMS-1010687, CMMI-1029445, DMS-1065972, CMMI-1334759, DMS-1411792, DMS-1345013, DMS-1813635, CCF-2112665, by AFOSR under grant FA9550-18-1-0288, and by the DoD under grant HQ00342010023 (Newton Award for Transformative Ideas during the COVID-19 Pandemic). The authors would also like to thank Geir Bogfjellmo and Håkon Marthinsen for sharing the code for their VRKMK method from [3], which we used in the numerical comparisons with our method.

Appendix A Constrained Galerkin Methods

Our Galerkin variational integrator will involve a discrete Lagrangian that differs from the classical construction in [14]. Traditionally in the linear space setting, (1) is approximated with a quadrature rule

L_{d}(q_{0},q_{1})=h\sum\limits_{i=1}^{s}b_{i}L(q(c_{i}h),\dot{q}(c_{i}h))=h\sum\limits_{i=1}^{s}b_{i}L(Q_{i},\dot{Q}_{i}),

and q(t)q(t) is approximated by polynomials with degree less than or equal to ss with fixed endpoints q0q_{0} and q1q_{1}. By choosing interpolation nodes {d0ν}ν=0s\{d_{0}^{\nu}\}_{\nu=0}^{s} with d00=0,d0s=1d_{0}^{0}=0,d_{0}^{s}=1 and interpolation values {q0ν}ν=0s\{q_{0}^{\nu}\}_{\nu=0}^{s} with q00=q0q_{0}^{0}=q_{0} and q0s=q1q_{0}^{s}=q_{1}, q(t)q(t) can be expressed as q(t)=ν=0sq0νϕν(th)q(t)=\sum\nolimits_{\nu=0}^{s}q_{0}^{\nu}\phi_{\nu}\left(\frac{t}{h}\right) on [0,h][0,h], where ϕν(t)\phi_{\nu}(t) are Lagrange polynomials corresponding to the nodes {d0ν}ν=0s\{d_{0}^{\nu}\}_{\nu=0}^{s}. By taking variations with respect to the interpolation values {q0ν}ν=1s1\{q_{0}^{\nu}\}_{\nu=1}^{s-1}, q(t)q(t) is varied over the finite-dimensional function space,

Ms={q(t)q(t)Ps[0,h],q(0)=q0,q(h)=q1}.M^{s}=\{q(t)\mid q(t)\in P_{s}[0,h],q(0)=q_{0},q(h)=q_{1}\}.

Consider the quadrature approximation of the action integral, viewing it as a function of the endpoint and interpolation values,

F(q0,q1,{q0ν}ν=1s1)=hi=1sbiL(q(cih),q˙(cih)),F(q_{0},q_{1},\{q_{0}^{\nu}\}_{\nu=1}^{s-1})=h\sum\limits_{i=1}^{s}b_{i}L(q(c_{i}h),\dot{q}(c_{i}h)),

where q(t)=ν=0sq0νϕν(th)q(t)=\sum\nolimits_{\nu=0}^{s}q_{0}^{\nu}\phi_{\nu}(\frac{t}{h}). Then, a variational integrator (2) can be obtained as follows,

{0=Fq0ν,ν=1,2s1,p0=Fq0,p1=Fq1.\left\{\begin{aligned} 0&=\frac{\partial F}{\partial q_{0}^{\nu}},\qquad\nu=1,2\dots s-1,\\ -p_{0}&=\frac{\partial F}{\partial q_{0}},\\ p_{1}&=\frac{\partial F}{\partial q_{1}}.\end{aligned}\right. (31)

However, (31) is often impractical due to the complexity of evaluating q(c_{i}h) and \dot{q}(c_{i}h), which involve the Lagrange interpolating polynomials. In addition, computing the root of the system of nonlinear equations in (31) can be challenging, because formulating it as a fixed-point problem may be complicated and convergence issues can arise. In contrast, Runge–Kutta methods are already in fixed-point form and are convergent as long as consistency is satisfied.

Now, note that the finite-dimensional function space MsM^{s} does not depend on the choice of nodes {d0ν}ν=1s1\{d_{0}^{\nu}\}_{\nu=1}^{s-1}, and by a tricky elimination of ϕν(t)\phi_{\nu}(t), (31) can be simplified to yield

q1\displaystyle q_{1} =q0+hi=1sbiQ˙i,\displaystyle=q_{0}+h\sum\limits_{i=1}^{s}b_{i}\dot{Q}_{i}, p1\displaystyle\quad p_{1} =p0+hi=1sbiP˙i,\displaystyle=p_{0}+h\sum\limits_{i=1}^{s}b_{i}\dot{P}_{i}, (32a)
Qi\displaystyle Q_{i} =q0+hj=1saijQ˙j,\displaystyle=q_{0}+h\sum\limits_{j=1}^{s}a_{ij}\dot{Q}_{j}, Pi\displaystyle\quad P_{i} =p0+hj=1sa~ijP˙j,\displaystyle=p_{0}+h\sum\limits_{j=1}^{s}\tilde{a}_{ij}\dot{P}_{j}, (32b)
Pi\displaystyle P_{i} =Lq˙(Qi,Q˙i),\displaystyle=\frac{\partial L}{\partial\dot{q}}(Q_{i},\dot{Q}_{i}), P˙i\displaystyle\quad\dot{P}_{i} =Lq(Qi,Q˙i),\displaystyle=\frac{\partial L}{\partial q}(Q_{i},\dot{Q}_{i}), (32c)

where a~ij=bj(1ajibi)\tilde{a}_{ij}=b_{j}(1-\frac{a_{ji}}{b_{i}}). When transformed to the Hamiltonian side, (32) simply recovers the symplectic partitioned Runge–Kutta method.
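The pair (a_{ij},\tilde{a}_{ij}) with \tilde{a}_{ij}=b_{j}(1-\frac{a_{ji}}{b_{i}}) satisfies the standard symplecticity condition b_{i}\tilde{a}_{ij}+b_{j}a_{ji}=b_{i}b_{j} for partitioned Runge–Kutta methods, which is easy to check numerically for, say, the two-stage Gauss–Legendre tableau of Table 1. The check itself is our own addition.

```python
import numpy as np

# Two-stage (fourth order) Gauss-Legendre tableau from Table 1
s3 = np.sqrt(3.0)
A = np.array([[0.25, 0.25 - s3 / 6],
              [0.25 + s3 / 6, 0.25]])
b = np.array([0.5, 0.5])

# Conjugate coefficients a~_ij = b_j (1 - a_ji / b_i)
A_tilde = b[None, :] * (1.0 - A.T / b[:, None])

# Symplecticity condition: b_i a~_ij + b_j a_ji - b_i b_j = 0 entrywise
residual = b[:, None] * A_tilde + b[None, :] * A.T - np.outer(b, b)
```

For Gauss–Legendre methods the conjugate coefficients coincide with the original ones, so the partitioned pair collapses to the usual Gauss method.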

The same variational integrator can be derived in a much simpler way: instead of performing variations on the internal points \{q_{0}^{\nu}\}_{\nu=1}^{s-1}, we perform variations on the internal derivatives \{\dot{Q}_{i}\}_{i=1}^{s}, subject to the constraint q_{1}=q_{0}+h\sum\nolimits_{i=1}^{s}b_{i}\dot{Q}_{i}. Then, the internal points are reconstructed using the fundamental theorem of calculus and the degree of precision of the quadrature rule to obtain Q_{i}=q_{0}+h\sum\nolimits_{j=1}^{s}a_{ij}\dot{Q}_{j}. Now, consider the quadrature approximation of the action integral, viewed as a function of the endpoint values and the internal velocities,

F~(q0,q1,{Q˙i}i=1s,λ)=hi=1sbiL(Qi,Q˙i)+λT(q1q0hi=1sbiQ˙i),\tilde{F}(q_{0},q_{1},\{\dot{Q}_{i}\}_{i=1}^{s},\lambda)=h\sum\limits_{i=1}^{s}b_{i}L(Q_{i},\dot{Q}_{i})+\lambda^{T}\left(q_{1}-q_{0}-h\sum\nolimits_{i=1}^{s}b_{i}\dot{Q}_{i}\right),

where λ\lambda is a Lagrange multiplier that enforces the constraint. Then, a variational integrator (2) can be obtained as follows,

{0=F~Q˙i,i=1,2s,0=F~λ,Qi=q0+hj=1saijQ˙j,p0=F~q0,p1=F~q1.\left\{\begin{aligned} 0&=\frac{\partial\tilde{F}}{\partial\dot{Q}_{i}},\qquad i=1,2\ldots s,\\ 0&=\frac{\partial\tilde{F}}{\partial\lambda},\\ Q_{i}&=q_{0}+h\sum\limits_{j=1}^{s}a_{ij}\dot{Q}_{j},\\ -p_{0}&=\frac{\partial\tilde{F}}{\partial q_{0}},\\ p_{1}&=\frac{\partial\tilde{F}}{\partial q_{1}}.\end{aligned}\right. (33)

Explicitly expanding (33) and eliminating the Lagrange multiplier yields (32) in a much more straightforward manner. This same technique, known as the constrained Galerkin method, is adopted on the rotation group SO(n)SO(n) in order to directly obtain a variational integrator in fixed-point form.
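To make the constrained Galerkin route concrete, the following sketch implements the resulting scheme (32) for the harmonic oscillator L(q,\dot{q})=\dot{q}^{2}/2-q^{2}/2, using the two-stage Gauss–Legendre tableau of Table 1 and fixed-point iteration; since Gauss methods conserve quadratic invariants, the energy is preserved to roughly the fixed-point tolerance. This toy example is our own addition, not code from the paper.

```python
import numpy as np

# Two-stage Gauss-Legendre coefficients; for Gauss methods the conjugate
# coefficients a~_ij = b_j(1 - a_ji/b_i) coincide with a_ij.
s3 = np.sqrt(3.0)
A = np.array([[0.25, 0.25 - s3 / 6], [0.25 + s3 / 6, 0.25]])
b = np.array([0.5, 0.5])
A_tilde = b[None, :] * (1.0 - A.T / b[:, None])

def step(q0, p0, h):
    """One step of the symplectic PRK scheme (32) for L = qdot^2/2 - q^2/2,
    where P_i = Qdot_i and Pdot_i = -Q_i, solved by fixed-point iteration."""
    Qd = np.full(2, p0)    # internal velocities Qdot_i
    Pd = np.full(2, -q0)   # internal forces Pdot_i
    for _ in range(100):
        Qd_new = p0 + h * (A_tilde @ Pd)   # Qdot_i = P_i = p0 + h sum a~_ij Pdot_j
        Pd_new = -(q0 + h * (A @ Qd))      # Pdot_i = -Q_i
        done = max(np.max(np.abs(Qd_new - Qd)),
                   np.max(np.abs(Pd_new - Pd))) < 1e-15
        Qd, Pd = Qd_new, Pd_new
        if done:
            break
    return q0 + h * (b @ Qd), p0 + h * (b @ Pd)

q, p, h = 1.0, 0.0, 0.1
E0 = 0.5 * (p * p + q * q)
for _ in range(1000):
    q, p = step(q, p, h)
E1 = 0.5 * (p * p + q * q)
```

The fixed-point sweep mirrors the structure used for (24): all internal variables are updated concurrently until the iterates stop changing.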

Appendix B Euler–Poincaré & Lie–Poisson Reductions

When the Lagrangian LL or Hamiltonian HH is left-invariant, the mechanical system can be symmetry reduced to evolve on the Lie algebra 𝔤\mathfrak{g} or its dual 𝔤\mathfrak{g}^{*}, respectively, assuming some regularity. On the Lagrangian side, this corresponds to Euler–Poincaré reduction [13], which is described by the Euler–Poincaré equations,

ddt(𝒍ϵ)=adϵ(𝒍ϵ).\frac{d}{dt}\left(\frac{\partial\bm{l}}{\partial\epsilon}\right)=\operatorname*{ad}\nolimits_{\epsilon}^{*}\left(\frac{\partial\bm{l}}{\partial\epsilon}\right).

The above is expressed in terms of the reduced Lagrangian \bm{l}(\epsilon)=\bm{l}(g^{-1}\dot{g})=L(g,\dot{g}). As a result, this can be described in terms of a reduced variational principle \delta\int_{a}^{b}\bm{l}(\epsilon(t))\,dt=0 with respect to constrained variations of the form \delta\epsilon=\dot{\eta}+[\epsilon,\eta], where \eta(t) is an arbitrary path in the Lie algebra \mathfrak{g} that vanishes at the endpoints, namely \eta(a)=\eta(b)=0. The constraint on the variations \delta\epsilon is induced by the condition \epsilon=g^{-1}\dot{g} together with arbitrary variations \delta g that vanish at the endpoints.

On the Hamiltonian side, this corresponds to Lie–Poisson reduction [13]. Recall that the Lie–Poisson structure on 𝔤\mathfrak{g}^{*} is given by

{F,G}(μ)=μ,[Fμ,Gμ],\{F,G\}(\mu)=\left\langle\mu,\left[\frac{\partial F}{\partial\mu},\frac{\partial G}{\partial\mu}\right]\right\rangle,

and together with the reduced Hamiltonian \bm{h}(\mu), this gives the Lie–Poisson equations on \mathfrak{g}^{*},

dμdt=ad𝒉μ(μ).\frac{d\mu}{dt}=\text{ad}_{\frac{\partial\bm{h}}{\partial\mu}}^{*}(\mu).

If the discrete Lagrangian L_{d}(g_{0},g_{1}) is also G-invariant, meaning L_{d}(g\cdot g_{0},g\cdot g_{1})=L_{d}(g_{0},g_{1}) for all g\in G, then (2) can be reduced to a Lie–Poisson integrator [9],

{μ0=𝒍d(f0)f01,μ1=f01μ0f0,\left\{\begin{aligned} {\mu}_{0}&=\bm{l}_{d}^{{}^{\prime}}(f_{0})f_{0}^{-1},\\ {\mu}_{1}&=f_{0}^{-1}\cdot{\mu}_{0}\cdot f_{0},\end{aligned}\right. (34)

where 𝒍d(f0)=Ld(e,f0)\bm{l}_{d}(f_{0})=L_{d}(e,f_{0}). This algorithm preserves the coadjoint orbits and, hence, the Poisson structure on 𝔤\mathfrak{g}^{*}.
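For a matrix group, the second line of (34) is conjugation of \mu_{0} by f_{0}, which makes the preservation of coadjoint orbits visible: for SO(3), the Frobenius norm of \mu (a Casimir) is unchanged, and \mu stays in \mathfrak{so}(3)^{*}, identified with the skew matrices. A quick check with randomly generated f_{0} and \mu_{0} (our own illustration):

```python
import numpy as np
from scipy.linalg import expm

def asym(M):
    """Antisymmetric part Asym(M) = (M - M^T)/2."""
    return 0.5 * (M - M.T)

rng = np.random.default_rng(4)
mu0 = asym(rng.standard_normal((3, 3)))       # mu_0 in so(3)* (skew matrices)
f0 = expm(asym(rng.standard_normal((3, 3))))  # f_0 in SO(3)

mu1 = f0.T @ mu0 @ f0   # mu_1 = f_0^{-1} mu_0 f_0, using f_0^{-1} = f_0^T
```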

Appendix C Detailed Derivations for the Lagrangian Variational Integrators

C.1. Derivations of F~Ωk\frac{\partial\tilde{F}}{\partial\Omega_{k}}

Initially, we have

ddt|t=0F~(g0,g1,,Ωk(t),,Λ)\displaystyle\left.\frac{d}{dt}\right|_{t=0}\tilde{F}(g_{0},g_{1},\ldots,\Omega_{k}(t),\ldots,\Lambda) =hi=1sbiLU(Ui,Ωi),Xik+hbkLΩ(Uk,Ωk),δΩk\displaystyle=h\sum\limits_{i=1}^{s}b_{i}\left\langle\frac{\partial L}{\partial U}(U_{i},\Omega_{i}),X_{ik}\right\rangle+hb_{k}\left\langle\frac{\partial L}{\partial\Omega}(U_{k},\Omega_{k}),\delta\Omega_{k}\right\rangle
+Λ,Asym(hg1Ti=1sbiUiXikΩi+hg1TbkUkδΩk),\displaystyle\qquad+\left\langle\Lambda,\operatorname*{Asym}\left(hg_{1}^{T}\sum\nolimits_{i=1}^{s}b_{i}U_{i}X_{ik}\Omega_{i}+hg_{1}^{T}b_{k}U_{k}\delta\Omega_{k}\right)\right\rangle,

where we can use the properties of the inner products to express the last term as follows,

hi=1sbiAsym(UiTg1ΛΩiT),Xik+hbkAsym(UkTg1Λ),δΩk.h\sum\limits_{i=1}^{s}b_{i}\langle\operatorname*{Asym}(U_{i}^{T}g_{1}\Lambda\Omega_{i}^{T}),X_{ik}\rangle+hb_{k}\langle\operatorname*{Asym}(U_{k}^{T}g_{1}\Lambda),\delta\Omega_{k}\rangle.

Then, we continue using equation (12) to obtain,

ddt|t=0F~(g0,g1,,Ωk(t),,Λ)\displaystyle\left.\frac{d}{dt}\right|_{t=0}\tilde{F}(g_{0},g_{1},\ldots,\Omega_{k}(t),\ldots,\Lambda) =hi=1sbiLU(Ui,Ωi)+Asym(UiTg1ΛΩiT),Xik\displaystyle=h\sum\limits_{i=1}^{s}b_{i}\left\langle\frac{\partial L}{\partial U}(U_{i},\Omega_{i})+\operatorname*{Asym}(U_{i}^{T}g_{1}\Lambda\Omega_{i}^{T}),X_{ik}\right\rangle
+hbkLΩ(Uk,Ωk)+Asym(UkTg1Λ),δΩk\displaystyle\qquad+hb_{k}\left\langle\frac{\partial L}{\partial\Omega}(U_{k},\Omega_{k})+\operatorname*{Asym}(U_{k}^{T}g_{1}\Lambda),\delta\Omega_{k}\right\rangle
=hi=1sbiLU(Ui,Ωi)+Asym(UiTg1ΛΩiT),h(π~iϕ~1ψ~k)(δΩk)\displaystyle=h\sum\limits_{i=1}^{s}b_{i}\left\langle\frac{\partial L}{\partial U}(U_{i},\Omega_{i})+\operatorname*{Asym}(U_{i}^{T}g_{1}\Lambda\Omega_{i}^{T}),h(\tilde{\pi}_{i}\circ\tilde{\phi}^{-1}\circ\tilde{\psi}_{k})(\delta\Omega_{k})\right\rangle
+hbkLΩ(Uk,Ωk)+Asym(UkTg1Λ),δΩk\displaystyle\qquad+hb_{k}\left\langle\frac{\partial L}{\partial\Omega}(U_{k},\Omega_{k})+\operatorname*{Asym}(U_{k}^{T}g_{1}\Lambda),\delta\Omega_{k}\right\rangle
=h2i=1sbi(π~iϕ~1ψ~k)(LU(Ui,Ωi)+Asym(UiTg1ΛΩiT)),δΩk\displaystyle=h^{2}\sum\limits_{i=1}^{s}b_{i}\left\langle(\tilde{\pi}_{i}\circ\tilde{\phi}^{-1}\circ\tilde{\psi}_{k})^{*}\left(\frac{\partial L}{\partial U}(U_{i},\Omega_{i})+\operatorname*{Asym}(U_{i}^{T}g_{1}\Lambda\Omega_{i}^{T})\right),\delta\Omega_{k}\right\rangle
+hbkLΩ(Uk,Ωk)+Asym(UkTg1Λ),δΩk.\displaystyle\qquad+hb_{k}\left\langle\frac{\partial L}{\partial\Omega}(U_{k},\Omega_{k})+\operatorname*{Asym}(U_{k}^{T}g_{1}\Lambda),\delta\Omega_{k}\right\rangle.

We finally have

F~Ωk=h2i=1sbi(π~iϕ~1ψ~k)(LU(Ui,Ωi)+Asym(UiTg1ΛΩiT))+hbk(LΩ(Uk,Ωk)+Asym(UkTg1Λ)).\frac{\partial\tilde{F}}{\partial\Omega_{k}}=h^{2}\sum\limits_{i=1}^{s}b_{i}(\tilde{\pi}_{i}\circ\tilde{\phi}^{-1}\circ\tilde{\psi}_{k})^{*}\left(\frac{\partial L}{\partial U}(U_{i},\Omega_{i})+\operatorname*{Asym}(U_{i}^{T}g_{1}\Lambda\Omega_{i}^{T})\right)+hb_{k}\left(\frac{\partial L}{\partial\Omega}(U_{k},\Omega_{k})+\operatorname*{Asym}(U_{k}^{T}g_{1}\Lambda)\right).

C.2. Explicit Expression for ϕ~\tilde{\phi}^{*}: A Derivation

Let us consider (S1,,Ss),(S~1,,S~s)Skew(n)s(S_{1},\ldots,S_{s}),(\tilde{S}_{1},\ldots,\tilde{S}_{s})\in\operatorname*{Skew}(n)^{s}, and so

\displaystyle\langle\tilde{\phi}^{*}(S_{1},\ldots,S_{s}),(\tilde{S}_{1},\ldots,\tilde{S}_{s})\rangle \displaystyle=\langle(S_{1},\ldots,S_{s}),\tilde{\phi}(\tilde{S}_{1},\ldots,\tilde{S}_{s})\rangle
\displaystyle=\left\langle(S_{1},\ldots,S_{s}),\left\{\tilde{S}_{i}-d\mathbb{P}_{A_{i}}\left(h\sum\nolimits_{j=1}^{s}a_{ij}U_{j}\tilde{S}_{j}\Omega_{j}\right)\right\}_{i=1}^{s}\right\rangle
\displaystyle=\sum\limits_{i=1}^{s}\left\langle S_{i},\tilde{S}_{i}-d\mathbb{P}_{A_{i}}\left(h\sum\nolimits_{j=1}^{s}a_{ij}U_{j}\tilde{S}_{j}\Omega_{j}\right)\right\rangle
\displaystyle=\sum\limits_{i=1}^{s}\langle S_{i},\tilde{S}_{i}\rangle-\sum\limits_{i=1}^{s}\operatorname*{tr}\left(d\mathbb{P}_{A_{i}}^{*}(S_{i})\left(h\sum\nolimits_{j=1}^{s}a_{ij}U_{j}\tilde{S}_{j}\Omega_{j}\right)^{T}\right)
\displaystyle=\sum\limits_{i=1}^{s}\langle S_{i},\tilde{S}_{i}\rangle-\sum\limits_{i,j=1}^{s}\operatorname*{tr}\left(\left(ha_{ij}U_{j}^{T}d\mathbb{P}_{A_{i}}^{*}(S_{i})\Omega_{j}^{T}\right)\tilde{S}_{j}^{T}\right)
\displaystyle=\sum\limits_{i=1}^{s}\langle S_{i},\tilde{S}_{i}\rangle-\sum\limits_{j=1}^{s}\operatorname*{tr}\left(\left(hU_{j}^{T}\sum\nolimits_{i=1}^{s}a_{ij}d\mathbb{P}_{A_{i}}^{*}(S_{i})\Omega_{j}^{T}\right)\tilde{S}_{j}^{T}\right)
\displaystyle=\sum\limits_{j=1}^{s}\left\langle S_{j}-\operatorname*{Asym}\left(hU_{j}^{T}\sum\nolimits_{i=1}^{s}a_{ij}d\mathbb{P}_{A_{i}}^{*}(S_{i})\Omega_{j}^{T}\right),\tilde{S}_{j}\right\rangle.

This gives us equation (15).
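The passage from the d\mathbb{P}_{A_{i}}^{*} term to the coefficient of \tilde{S}_{j}^{T} above rests only on transposition and the cyclic invariance of the trace, i.e., tr(X(U\tilde{S}\Omega)^T) = tr((U^T X \Omega^T)\tilde{S}^T). A throwaway numerical check of this identity, with all matrix names illustrative stand-ins for the factors in the derivation:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
# random stand-ins for dP*_{A_i}(S_i), U_j, S~_j, Omega_j respectively
X, U, St, Om = (rng.standard_normal((n, n)) for _ in range(4))

# tr(X (U St Om)^T) = tr(X Om^T St^T U^T) = tr((U^T X Om^T) St^T)
lhs = np.trace(X @ (U @ St @ Om).T)
rhs = np.trace((U.T @ X @ Om.T) @ St.T)
assert np.isclose(lhs, rhs)
```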

C.3. Explicit Expression for \tilde{\psi}_{k}^{*}: A Derivation

Consider (S_{1},\ldots,S_{s})\in\operatorname*{Skew}(n)^{s} and \tilde{S}\in\operatorname*{Skew}(n), then

\displaystyle\langle\tilde{\psi}_{k}^{*}(S_{1},\ldots,S_{s}),\tilde{S}\rangle \displaystyle=\langle(S_{1},\ldots,S_{s}),\tilde{\psi}_{k}(\tilde{S})\rangle
\displaystyle=\left\langle(S_{1},\ldots,S_{s}),\left\{d\mathbb{P}_{A_{i}}(a_{ik}U_{k}\tilde{S})\right\}_{i=1}^{s}\right\rangle
\displaystyle=\sum\limits_{i=1}^{s}\operatorname*{tr}\left(d\mathbb{P}_{A_{i}}^{*}(S_{i})(a_{ik}U_{k}\tilde{S})^{T}\right)
\displaystyle=\sum\limits_{i=1}^{s}\operatorname*{tr}\left((a_{ik}U_{k}^{T}d\mathbb{P}_{A_{i}}^{*}(S_{i}))\tilde{S}^{T}\right)
\displaystyle=\left\langle\operatorname*{Asym}\left(U_{k}^{T}\sum\nolimits_{i=1}^{s}a_{ik}d\mathbb{P}_{A_{i}}^{*}(S_{i})\right),\tilde{S}\right\rangle.

This gives us equation (16).
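This computation mirrors a general fact: with respect to the Frobenius pairing, the adjoint of a linear map of the form X \mapsto \sum_{i} A_{i}XB_{i} is Y \mapsto \sum_{i} A_{i}^{T}YB_{i}^{T}, which is the structural pattern behind \tilde{\psi}_{k} and \tilde{\psi}_{k}^{*} (with the d\mathbb{P}_{A_{i}} maps folded into the factors). A numerical sketch of this pattern under illustrative names:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3

def inner(A, B):
    # Frobenius pairing <A, B> = tr(A B^T)
    return float(np.trace(A @ B.T))

# a random linear map L(X) = sum_i A_i X B_i and its claimed adjoint
A = [rng.standard_normal((n, n)) for _ in range(3)]
B = [rng.standard_normal((n, n)) for _ in range(3)]

def L(X):
    return sum(Ai @ X @ Bi for Ai, Bi in zip(A, B))

def L_star(Y):
    # adjoint: transpose each left/right factor
    return sum(Ai.T @ Y @ Bi.T for Ai, Bi in zip(A, B))

X = rng.standard_normal((n, n))
Y = rng.standard_normal((n, n))
# defining property of the adjoint: <Y, L(X)> = <L*(Y), X>
assert np.isclose(inner(Y, L(X)), inner(L_star(Y), X))
```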

C.4. Explicit Expression for \tilde{\varphi}^{*}: A Derivation

Let us derive \tilde{\varphi}^{*} by considering (S_{1},\ldots,S_{s})\in\operatorname*{Skew}(n)^{s} and \tilde{S}\in\operatorname*{Skew}(n), and so

\displaystyle\langle\tilde{\varphi}^{*}(S_{1},\ldots,S_{s}),\tilde{S}\rangle \displaystyle=\left\langle(S_{1},\ldots,S_{s}),\{d\mathbb{P}_{A_{i}}(g_{0}\tilde{S})\}_{i=1}^{s}\right\rangle
\displaystyle=\sum\limits_{i=1}^{s}\operatorname*{tr}\left(d\mathbb{P}_{A_{i}}^{*}(S_{i})(g_{0}\tilde{S})^{T}\right)
\displaystyle=\sum\limits_{i=1}^{s}\operatorname*{tr}\left((g_{0}^{T}d\mathbb{P}_{A_{i}}^{*}(S_{i}))\tilde{S}^{T}\right)
\displaystyle=\left\langle\operatorname*{Asym}\left(g_{0}^{T}\sum\nolimits_{i=1}^{s}d\mathbb{P}_{A_{i}}^{*}(S_{i})\right),\tilde{S}\right\rangle.

This gives us equation (18).

C.5. Derivation of \frac{\partial\tilde{F}}{\partial g_{0}}

We compute

\displaystyle\left.\frac{d}{dt}\right|_{t=0}\tilde{F}(g_{0}(t),g_{1},\{\Omega_{i}\}_{i=1}^{s},\Lambda) \displaystyle=h\sum\limits_{i=1}^{s}b_{i}\left\langle\frac{\partial L}{\partial U}(U_{i},\Omega_{i}),Y_{i}\right\rangle+\left\langle\Lambda,\operatorname*{Asym}\left(g_{1}^{T}\left(g_{0}\delta g_{0}+h\sum\nolimits_{i=1}^{s}b_{i}U_{i}Y_{i}\Omega_{i}\right)\right)\right\rangle
\displaystyle=h\sum\limits_{i=1}^{s}b_{i}\left\langle\frac{\partial L}{\partial U}(U_{i},\Omega_{i}),Y_{i}\right\rangle+\operatorname*{tr}\left(\Lambda(g_{1}^{T}g_{0}\delta g_{0})^{T}\right)+h\sum\limits_{i=1}^{s}\operatorname*{tr}\left(\Lambda(b_{i}g_{1}^{T}U_{i}Y_{i}\Omega_{i})^{T}\right)
\displaystyle=h\sum\limits_{i=1}^{s}b_{i}\left\langle\frac{\partial L}{\partial U}(U_{i},\Omega_{i}),Y_{i}\right\rangle+\operatorname*{tr}\left((g_{0}^{T}g_{1}\Lambda)\delta g_{0}^{T}\right)+h\sum\limits_{i=1}^{s}\operatorname*{tr}\left(b_{i}(U_{i}^{T}g_{1}\Lambda\Omega_{i}^{T})Y_{i}^{T}\right)
\displaystyle=h\sum\limits_{i=1}^{s}b_{i}\left\langle\frac{\partial L}{\partial U}(U_{i},\Omega_{i})+\operatorname*{Asym}(U_{i}^{T}g_{1}\Lambda\Omega_{i}^{T}),Y_{i}\right\rangle+\langle\operatorname*{Asym}(g_{0}^{T}g_{1}\Lambda),\delta g_{0}\rangle
\displaystyle=h\sum\limits_{i=1}^{s}b_{i}\left\langle\frac{\partial L}{\partial U}(U_{i},\Omega_{i})+\operatorname*{Asym}(U_{i}^{T}g_{1}\Lambda\Omega_{i}^{T}),(\tilde{\pi}_{i}\circ\tilde{\phi}^{-1}\circ\tilde{\varphi})(\delta g_{0})\right\rangle
\displaystyle\qquad+\langle\operatorname*{Asym}(g_{0}^{T}g_{1}\Lambda),\delta g_{0}\rangle.

Thus, we have

\frac{\partial\tilde{F}}{\partial g_{0}}=h\sum\limits_{i=1}^{s}b_{i}(\tilde{\pi}_{i}\circ\tilde{\phi}^{-1}\circ\tilde{\varphi})^{*}\left(\frac{\partial L}{\partial U}(U_{i},\Omega_{i})+\operatorname*{Asym}(U_{i}^{T}g_{1}\Lambda\Omega_{i}^{T})\right)+\operatorname*{Asym}(g_{0}^{T}g_{1}\Lambda).

References

  • Absil et al. [2008] P.-A. Absil, R. Mahony, and R. Sepulchre. Optimization Algorithms on Matrix Manifolds. Princeton University Press, 2008.
  • Bartels and Stewart [1972] R. H. Bartels and G. W. Stewart. Solution of the matrix equation AX + XB = C [F4]. Communications of the ACM, 15(9):820–826, 1972.
  • Bogfjellmo and Marthinsen [2016] G. Bogfjellmo and H. Marthinsen. High-order symplectic partitioned Lie group methods. Foundations of Computational Mathematics, 16:493–530, 2016.
  • Celledoni and Owren [2002] E. Celledoni and B. Owren. A class of intrinsic schemes for orthogonal integration. SIAM Journal on Numerical Analysis, 40(6):2069–2084, 2002.
  • Duruisseaux and Leok [2021] V. Duruisseaux and M. Leok. A variational formulation of accelerated optimization on Riemannian manifolds. 2021.
  • Duruisseaux et al. [2021] V. Duruisseaux, J. Schmitt, and M. Leok. Adaptive Hamiltonian variational integrators and applications to symplectic accelerated optimization. SIAM Journal on Scientific Computing, 43(4):A2949–A2980, 2021. doi: 10.1137/20M1383835.
  • Golub et al. [1979] G. Golub, S. Nash, and C. Van Loan. A Hessenberg–Schur method for the problem AX + XB = C. IEEE Transactions on Automatic Control, 24(6):909–913, 1979.
  • Hall and Leok [2017] J. Hall and M. Leok. Lie group spectral variational integrators. Foundations of Computational Mathematics, 17:199–257, 2017.
  • Marsden et al. [1999] J. E. Marsden, S. Pekarsky, and S. Shkoller. Discrete Euler–Poincaré and Lie–Poisson equations. Nonlinearity, 12:1647–1662, 1999.
  • Lee et al. [2007] T. Lee, M. Leok, and N. H. McClamroch. Lie group variational integrators for the full body problem in orbital mechanics. Celest. Mech. Dyn. Astr., 98(2):121–144, 2007.
  • Leok [2005] M. Leok. Generalized Galerkin variational integrators. arXiv:math/0508360, 2005.
  • Leok and Shingel [2012] M. Leok and T. Shingel. General techniques for constructing variational integrators. Frontiers of Mathematics in China, 7:273–303, 2012.
  • Marsden and Ratiu [1999] J. E. Marsden and T. S. Ratiu. Introduction to Mechanics and Symmetry, volume 17 of Texts in Applied Mathematics. Springer-Verlag, second edition, 1999.
  • Marsden and West [2001] J. E. Marsden and M. West. Discrete mechanics and variational integrators. Acta Numer., 10:357–514, 2001.
  • Munthe-Kaas et al. [2001] H. Z. Munthe-Kaas, G. R. W. Quispel, and A. Zanna. Generalized polar decompositions on Lie groups with involutive automorphisms. Foundations of Computational Mathematics, 1(3):297–324, 2001.
  • Wibisono et al. [2016] A. Wibisono, A. Wilson, and M. Jordan. A variational perspective on accelerated methods in optimization. Proceedings of the National Academy of Sciences, 113(47):E7351–E7358, 2016.