
Koopman-Based Learning of Infinitesimal Generators
without Operator Logarithm

Yiming Meng, Ruikun Zhou, Melkior Ornik, and Jun Liu. This research was supported in part by NASA under grant numbers 80NSSC21K1030 and 80NSSC22M0070, by the Air Force Office of Scientific Research under grant number FA9550-23-1-0131, and in part by the Natural Sciences and Engineering Research Council of Canada and the Canada Research Chairs program. Yiming Meng is with the Coordinated Science Laboratory, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA. ymmeng@illinois.edu. Ruikun Zhou is with the Department of Applied Mathematics, University of Waterloo, Waterloo, ON N2L 3G1, Canada. ruikun.zhou@uwaterloo.ca. Melkior Ornik is with the Department of Aerospace Engineering and the Coordinated Science Laboratory, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA. mornik@illinois.edu. Jun Liu is with the Department of Applied Mathematics, University of Waterloo, Waterloo, ON N2L 3G1, Canada. j.liu@uwaterloo.ca.
Abstract

To retrieve transient transition information of unknown systems from discrete-time observations, the Koopman operator framework has gained significant attention in recent years, particularly for its ability to avoid estimating time derivatives by using the Koopman operator logarithm. However, the effectiveness of such logarithm-based methods has only been demonstrated within a restrictive function space. In this paper, we propose a logarithm-free technique for learning the infinitesimal generator without disrupting the Koopman operator learning framework.

Index Terms:
Unknown nonlinear systems, Koopman operators, infinitesimal generator, system identification, verification.

I Introduction

Verification of dynamical system properties and achieving autonomy are two important directions for the future of industrial intelligence, with applications in numerous fields, including mathematical finance, automated vehicles, power systems, and other physical sciences.

Motivated by the success of data-driven problem solving, there has been a surge of interest in revealing the governing equations of continuous-time dynamical systems from time-series data to better understand the underlying physical laws [1, 2]. Additional interests in safety-critical industries include data-driven stability and safety analysis, prediction, and control. Techniques such as Lyapunov and barrier certificates have proven effective [3, 4, 5, 6, 7]. It is worth noting, however, that these practical concerns require information on the vector fields, the value functions that abstract system performance, and the corresponding Lie derivatives, all underpinned by an understanding of the infinitesimal generator [8, 9, 10, 11, 12]. In the presence of nonlinear effects, a key challenge is therefore the converse identification of infinitesimal system transitions from discrete-time observations, which capture only cumulative trajectory behaviors.

Direct methods, such as the sparse identification of nonlinear dynamics (SINDy) algorithm [13], have been developed to identify state dynamics by relying on nonlinear parameter estimation [14, 15] and static linear regression techniques. However, accurate approximation of the time derivatives of the state may be unattainable due to challenges such as low sampling rates, noisy measurements, and short observation periods. Furthermore, the data cannot be reused in the proposed structure for constructing other value functions (e.g., Lyapunov or barrier functions) for stability, reachability, and safety analysis. This limitation extends to verifying their Lie derivatives along the trajectories, which is crucial for demonstrating the evolving trends of the phase portraits.

Comparatively, the operator-logarithm-based Koopman generator learning structure [16, 17, 18, 19] does not require the estimation of time derivatives, enabling a data-driven estimation of Lie derivatives. This approach can potentially circumvent the need for high sampling rates and extended observation periods. Heuristically, researchers tend to represent the Koopman operator $\mathcal{K}_{t}$ by an exponential form of its infinitesimal generator $\mathcal{L}$ as $\mathcal{K}_{t}=e^{t\mathcal{L}}$, leading to the converse representation $\mathcal{L}=\frac{1}{t}\log(\mathcal{K}_{t})$ for any $t>0$. However, representing Koopman operators in exponential form requires the boundedness of the generator. Additionally, the operator logarithm is a single-valued mapping only within a specific sector of the spectrum. Recent studies [20, 21] have investigated the sufficient and necessary conditions under which the Koopman-logarithm-based generator can be uniquely identified. However, these conditions are difficult to verify for unknown systems.

In this paper, we introduce a logarithm-free generator learning scheme that does not require knowledge of the spectrum properties and present preliminary results. This scheme will be compatible with the current advances in [7] for Koopman-based construction of maximal Lyapunov functions. It is important to note that the method in [7] assumes full knowledge of the equilibrium point and acknowledges that verification of the constructed Lyapunov function depends on information about the actual system transitions, which may diminish its predictive value in stability analysis.

The rest of the paper is organized as follows. Preliminaries are covered in Section II. The formal finite-horizon approximation of the generator using Koopman operators is presented in Section III. The data-driven algorithm is discussed in Section IV. Finally, case studies are presented in Section V.

Due to page limitations and to enhance readability, this paper presents only the primary results of the proposed method. Detailed proofs are omitted and will be published elsewhere.

Notation: We denote by $\mathbb{R}^{n}$ the Euclidean space of dimension $n>1$, and by $\mathbb{R}$ the set of real numbers. For $x\in\mathbb{R}^{n}$ and $r\geq 0$, we denote the ball of radius $r$ centered at $x$ by $\mathcal{B}(x,r)=\{y\in\mathbb{R}^{n}:\,|y-x|\leq r\}$, where $|\cdot|$ is the Euclidean norm. For a closed set $A\subset\mathbb{R}^{n}$ and $x\in\mathbb{R}^{n}$, we denote the distance from $x$ to $A$ by $|x|_{A}=\inf_{y\in A}|x-y|$ and the $r$-neighborhood of $A$ by $\mathcal{B}(A,r)=\cup_{x\in A}\mathcal{B}(x,r)=\{x\in\mathbb{R}^{n}:\,|x|_{A}\leq r\}$. For a set $A\subseteq\mathbb{R}^{n}$, $\overline{A}$ denotes its closure, $\operatorname{int}(A)$ its interior, and $\partial A$ its boundary. For finite-dimensional matrices, we use the Frobenius norm $\|\cdot\|_{F}$ as the metric. Let $C(\Omega)$ be the set of continuous functions with domain $\Omega$. We denote the set of $i$ times continuously differentiable functions by $C^{i}(\Omega)$.

II Preliminaries

II-A Dynamical Systems

Given a pre-compact state space $\mathcal{X}\subseteq\mathbb{R}^{n}$, we consider a continuous-time nonlinear dynamical system of the form

\dot{\mathbf{x}}(t)=f(\mathbf{x}(t)),\;\;\mathbf{x}(0)=x\in\mathcal{X},\;t\in[0,\infty), (1)

where $x$ denotes the initial condition, and the vector field $f:\mathcal{X}\rightarrow\mathcal{X}$ is assumed to be locally Lipschitz continuous. We denote by $\phi:\mathcal{I}\times\mathcal{X}\rightarrow\mathcal{X}$ the forward flow, also known as the solution map.

The evolution of observable functions of system (1), restricted to $C(\mathcal{X})$, is governed by the family of Koopman operators, defined by

\mathcal{K}_{t}h=h\circ\phi(t,\cdot),\quad h\in C(\mathcal{X}) (2)

for each $t\geq 0$, where $\circ$ denotes function composition. By the properties of the flow map, it can be easily verified that $\{\mathcal{K}_{t}\}_{t\geq 0}$ forms a semigroup, with its (infinitesimal) generator defined by

\mathcal{L}h(x):=\lim_{t\rightarrow 0}\frac{\mathcal{K}_{t}h(x)-h(x)}{t}. (3)

Within the space of continuous observable functions, the limit in Eq. (3) exists on the subspace of all continuously differentiable functions, indicating that $\operatorname{dom}(\mathcal{L})=C^{1}(\mathcal{X})$. In this case, we can verify that the generator of the Koopman operators satisfies

\mathcal{L}h(x)=\nabla h(x)\cdot f(x).

Note that the finite-difference method [22, 23, 24] indirectly approximates the generator by fixing a small terminal time in place of the limit on the r.h.s. of (3). With this approximation scheme for the time derivative, the precision is expected to depend heavily on the size of that terminal time.

II-B Representation of Koopman Operators

If $\mathcal{L}$ is a bounded linear operator that generates $\{\mathcal{K}_{t}\}$, then $\mathcal{K}_{t}=e^{t\mathcal{L}}$ for each $t$ in the uniform (operator) topology. However, many of the differential operators that occur in mathematical physics are unbounded. We revisit some concepts to show how to interpret $\mathcal{K}_{t}$ in terms of the possibly unbounded generator on an exponential scale.

Definition II.1 (Resolvents)

The resolvent set of $\mathcal{L}$ is defined as $\rho(\mathcal{L}):=\{\lambda\in\mathbb{C}:\lambda\operatorname{I}-\mathcal{L}\;\text{is invertible}\}$. For any $\lambda\in\rho(\mathcal{L})$, the resolvent operator is defined as

R(\lambda;\mathcal{L}):=(\lambda\operatorname{I}-\mathcal{L})^{-1}, (4)

which is a bounded linear operator [25, Chap. I, Theorem 4.3]. \diamond

We further define the Yosida approximation of \mathcal{L} as

\mathcal{L}_{\lambda}:=\lambda\mathcal{L}R(\lambda;\mathcal{L})=\lambda^{2}R(\lambda;\mathcal{L})-\lambda\operatorname{I}. (5)

Note that $\{\mathcal{L}_{\lambda}\}_{\lambda\in\rho(\mathcal{L})}$ is a family of bounded linear operators, and $e^{t\mathcal{L}_{\lambda}}$ is well-defined for each $\lambda\in\rho(\mathcal{L})$.

Definition II.2 (Strong convergence)

Let $(\mathcal{F},\|\cdot\|_{\mathcal{F}})$ be a Banach space. Let $B:\mathcal{F}\rightarrow\mathcal{F}$ and $B_{n}:\mathcal{F}\rightarrow\mathcal{F}$, for each $n\in\mathbb{N}$, be linear operators. The sequence $\{B_{n}\}_{n\in\mathbb{N}}$ is said to converge to $B$ strongly, denoted by $B_{n}\rightharpoonup B$, if $\lim_{n\rightarrow\infty}\|B_{n}h-Bh\|_{\mathcal{F}}=0$ for each $h\in\mathcal{F}$. We also write $B=\mathop{\mathrm{s\text{-}\lim}}_{n\rightarrow\infty}B_{n}$. $\diamond$

Theorem II.3

[25, Chap. I, Theorem 5.5] $\mathcal{K}_{t}=\mathop{\mathrm{s\text{-}\lim}}_{\lambda\rightarrow\infty}e^{t\mathcal{L}_{\lambda}}$ for all $t\geq 0$, with the convergence uniform on bounded time intervals.

III The Formal Approximation of the Infinitesimal Generators

Section II presents preliminary results on representing Koopman operators using the asymptotic approximation of $e^{t\mathcal{L}_{\lambda}}$ as $\lambda$ approaches infinity. In this section, we aim to find the correct converse representation of $\mathcal{L}$ based on $\{\mathcal{K}_{t}\}_{t\geq 0}$. The ultimate goal is to leverage the Koopman learning structure to learn the generator.

Note that it is intuitive to take the operator logarithm, so that $\mathcal{L}=\frac{1}{t}\log\mathcal{K}_{t}$, when $\mathcal{L}$ is bounded; the spectrum's sector must be confined to make the logarithm a single-valued mapping [20, 21]. However, for an unbounded $\mathcal{L}$, there is no such direct connection. In this section, we establish how the unbounded $\mathcal{L}$ can be properly approximated based on $\{\mathcal{K}_{t}\}_{t\geq 0}$.

III-A Asymptotic Approximations of Generators

Let $C(\mathcal{X})$ be endowed with the uniform norm $|\cdot|_{\infty}:=\sup_{x\in\mathcal{X}}|\cdot|$. Suppose that $\{\mathcal{K}_{t}\}_{t\geq 0}$ is a contraction semigroup, i.e.,

\|\mathcal{K}_{t}\|:=\sup_{|h|_{\infty}=1}|\mathcal{K}_{t}h|_{\infty}\leq 1

for all $t\geq 0$. Then $\rho(\mathcal{L})\supseteq(0,\infty)$, and by [25, Lemma 1.3.3], the family $\{\mathcal{L}_{\lambda}\}_{\lambda>0}$ converges strongly to $\mathcal{L}$. Specifically, when working on the $C^{2}(\mathcal{X})$ subspace, the convergence rate is $\mathcal{O}(\lambda^{-1})$.
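For intuition, consider a simple scalar illustration (not taken from the paper): let $\mathcal{L}$ act as multiplication by a constant $a<0$, so that the semigroup it generates is a contraction. Then $R(\lambda;\mathcal{L})=(\lambda-a)^{-1}$ and

\mathcal{L}_{\lambda}=\lambda\mathcal{L}R(\lambda;\mathcal{L})=\frac{\lambda a}{\lambda-a},\qquad|\mathcal{L}_{\lambda}-\mathcal{L}|=\frac{a^{2}}{\lambda-a}=\mathcal{O}(\lambda^{-1})\;\text{ as }\lambda\rightarrow\infty,

which exhibits both the strong convergence and the stated rate.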

To address more general cases, we first present the following facts.

Proposition III.1

For system (1), there exist constants $\omega\geq 0$ and $M\geq 1$ such that $\|\mathcal{K}_{t}\|\leq Me^{\omega t}$ for all $t\geq 0$. In addition, for any $\lambda\in\mathbb{C}$, the family $\{e^{\lambda t}\mathcal{K}_{t}\}_{t\geq 0}$ is a $C_{0}$ semigroup with generator $\mathcal{L}+\lambda\operatorname{I}:\operatorname{dom}(\mathcal{L})\rightarrow C(\mathcal{X})$.

Intuitively, $M$ represents the uniform scaling of the magnitude of the Koopman operator, while $\omega$ indicates the dominant exponential growth or decay rate of the flow on $C(\mathcal{X})$. These properties provide a tool to shift this dominant rate by adding an extra term $\lambda\operatorname{I}$ to the original generator. Specifically, the transformation $e^{-\omega t}\mathcal{K}_{t}$ allows us to work in a topology (obtained by shifting the direction of the flows and uniformly compressing by $M$) that is equivalent to the uniform topology of continuous functions, in which the semigroup $\{e^{-\omega t}\mathcal{K}_{t}\}$ is a contraction. The strong convergence $\mathop{\mathrm{s\text{-}\lim}}_{\lambda\rightarrow\infty}\mathcal{L}_{\lambda}=\mathcal{L}$ for $\{\mathcal{L}_{\lambda}\}_{\lambda>\omega}$, along with the $\mathcal{O}(\lambda^{-1})$ convergence rate, can then be established in this new topology as before.

III-B Representation of Resolvent Operators

Motivated by the goal of representing $\mathcal{L}$ by $\{\mathcal{K}_{t}\}_{t\geq 0}$ and by the Yosida approximation of $\mathcal{L}$ on $\{\lambda>\omega\}$, we establish a connection between $R(\lambda;\mathcal{L})$ and $\{\mathcal{K}_{t}\}_{t\geq 0}$.

Proposition III.2

Let $R(\lambda)$ on $C(\mathcal{X})$ be defined by

R(\lambda)h:=\int_{0}^{\infty}e^{-\lambda t}(\mathcal{K}_{t}h)\,dt. (6)

Then, for all $\lambda>\omega$,

  1. $R(\lambda)(\lambda\operatorname{I}-\mathcal{L})h=h$ for all $h\in C^{1}(\mathcal{X})$;

  2. $(\lambda\operatorname{I}-\mathcal{L})R(\lambda)h=h$ for all $h\in C(\mathcal{X})$.

Restricted to the $C^{1}(\mathcal{X})$ subspace, we can therefore replace $R(\lambda;\mathcal{L})$ with $R(\lambda)$ in the approximation of Section III-A.

Corollary III.3

For each $\lambda>\omega$,

\mathcal{L}_{\lambda}=\lambda^{2}\int_{0}^{\infty}e^{-\lambda t}\mathcal{K}_{t}\,dt-\lambda\operatorname{I} (7)

and $\mathcal{L}_{\lambda}\rightharpoonup\mathcal{L}$ on $C^{1}(\mathcal{X})$.

III-C Finite Time-Horizon Approximation

The improper integral in (7) cannot be used directly for a data-driven approximation. We need to further derive an approximation based on finite-horizon observable data. Below, we present a direct truncation of (7).

Definition III.4

For any $h\in C(\mathcal{X})$ and $\tau\geq 0$, we define $\mathcal{T}_{\tau}:C(\mathcal{X})\rightarrow C(\mathcal{X})$ as

\mathcal{T}_{\tau}h(x):=\int_{0}^{\tau}e^{-\lambda s}\mathcal{K}_{s}h(x)\,ds. (8)

It can be shown that, for any $\lambda$, the truncation in Definition III.4 introduces an error term $e^{-\lambda\tau}R(\lambda)h(\phi(\tau,x))$ that is uniformly bounded for each observable function $h$, with the dominant factor being $e^{-\lambda\tau}$. This indicates that, for arbitrarily large $\lambda$, the truncation does not significantly affect the accuracy of the passage from $\mathcal{L}_{\lambda}$ to $\mathcal{L}$.
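To see where this error term comes from, a one-line computation using the semigroup property $\mathcal{K}_{s}h(x)=\mathcal{K}_{s-\tau}h(\phi(\tau,x))$ for $s\geq\tau$ and the substitution $u=s-\tau$ gives

R(\lambda)h(x)-\mathcal{T}_{\tau}h(x)=\int_{\tau}^{\infty}e^{-\lambda s}\mathcal{K}_{s}h(x)\,ds=e^{-\lambda\tau}\int_{0}^{\infty}e^{-\lambda u}\mathcal{K}_{u}h(\phi(\tau,x))\,du=e^{-\lambda\tau}R(\lambda)h(\phi(\tau,x)).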

For any fixed $\tau>0$, we can therefore use

\widetilde{\mathcal{L}}_{\lambda}:=\lambda^{2}\mathcal{T}_{\tau}-\lambda\operatorname{I} (9)

to approximate $\mathcal{L}$ within a small time horizon. We illustrate this approximation with the following simple example.

Example III.5

Consider the simple dynamical system $\dot{\mathbf{x}}(t)=\mathbf{x}(t)$ and the observable function $V(x)=x^{n}$ for any $n\geq 1$. Then, analytically, $\phi(\tau,x)=xe^{\tau}$ and $\mathcal{L}V(x)=nx^{n}$. We test the validity of using Eq. (9). Note that, for sufficiently large $\lambda$, we have

\lambda^{2}\int_{0}^{\tau}e^{-\lambda s}(\mathcal{K}_{s}V(x))\,ds=\lambda^{2}\int_{0}^{\tau}e^{-\lambda s}e^{ns}x^{n}\,ds=\lambda^{2}x^{n}\int_{0}^{\tau}e^{-(\lambda-n)s}\,ds=\frac{\lambda^{2}x^{n}}{\lambda-n}\left(1-e^{-(\lambda-n)\tau}\right)\approx\frac{\lambda^{2}x^{n}}{\lambda-n}

and

\lambda^{2}\mathcal{T}_{\tau}V(x)-\lambda V(x)\approx\frac{\lambda^{2}x^{n}}{\lambda-n}-\lambda x^{n}=\frac{\lambda n\,x^{n}}{\lambda-n}\approx nx^{n}=\mathcal{L}V(x).

With high-accuracy evaluation of the integral, we can achieve a reasonably good approximation. \diamond
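The computation in Example III.5 is easy to reproduce numerically. The sketch below (not from the paper's repository; the parameter values are illustrative assumptions) evaluates the truncated integral (8) by quadrature for $\dot{\mathbf{x}}=\mathbf{x}$, where $\mathcal{K}_{s}V(x)=(xe^{s})^{n}$ is known in closed form, and compares (9) against $\mathcal{L}V(x)=nx^{n}$:

```python
# Minimal numerical check of Example III.5 (a sketch, not the paper's code).
import numpy as np
from scipy.integrate import quad

lam, tau, n, x = 1e3, 1.0, 3, 0.7   # illustrative parameters (assumed)

# integrand e^{-lam*s} * K_s V(x); it decays within s ~ 1/lam
integrand = lambda s: np.exp(-lam * s) * (x * np.exp(s)) ** n
T_tau, _ = quad(integrand, 0.0, tau, limit=200)

approx = lam**2 * T_tau - lam * x**n   # Eq. (9) applied to V
exact = n * x**n                       # LV(x) = n x^n
print(approx, exact)                   # relative error is roughly n / lam
```

Increasing $\lambda$ improves the approximation, at the price of requiring a more accurate evaluation of the sharply decaying integrand.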

IV Data-Driven Algorithm

We continue by considering a data-driven algorithm to learn $\widetilde{\mathcal{L}}_{\lambda}$, which indirectly approximates $\mathcal{L}$ as demonstrated in Section III.

The idea is to modify the conventional Koopman operator learning scheme [26, 16, 7], which involves learning linear operators using a finite-dimensional dictionary of basis functions and representing the image function as a linear combination of the dictionary functions. To be more specific, obtaining a fully discretized version $\mathbf{L}$ of the bounded linear operator $\lambda^{2}\mathcal{T}_{\tau}^{N}-\lambda\operatorname{I}$ based on the training data typically relies on the selection of a discrete dictionary of continuously differentiable observable test functions, denoted by

\mathcal{Z}_{N}(x):=\left[\mathfrak{z}_{0}(x),\mathfrak{z}_{1}(x),\cdots,\mathfrak{z}_{N-1}(x)\right],\;N\in\mathbb{N}. (10)

Then, the following should hold:

1) Let $(\mu_{i},\xi_{i})_{i=0}^{N-1}$ be the eigenvalues and eigenvectors of $\mathbf{L}$, and let $(\rho_{i},\varphi_{i})_{i=0}^{N-1}$ be the eigenvalues and eigenfunctions of $\mathcal{L}$. Then, for each $i$,

\mu_{i}\approx\rho_{i},\;\;\varphi_{i}(x)\approx\mathcal{Z}_{N}(x)\xi_{i}. (11)

2) For any $h\in\operatorname{span}\{\mathfrak{z}_{0},\mathfrak{z}_{1},\cdots,\mathfrak{z}_{N-1}\}$ such that $h(x)=\mathcal{Z}_{N}(x)\mathbf{w}$ for some column vector $\mathbf{w}$, we have

\mathcal{L}h(\cdot)\approx\widetilde{\mathcal{L}}_{\lambda}h(\cdot)\approx\lambda^{2}\mathcal{T}_{\tau}^{N}h(\cdot)-\lambda h(\cdot)\approx\mathcal{Z}_{N}(\cdot)(\mathbf{L}\mathbf{w}). (12)

In this section, we modify the existing Koopman learning technique to obtain 𝐋\mathbf{L}.

IV-A Generating Training Data

For any fixed $\lambda>0$, given a dictionary $\mathcal{Z}_{N}$ of the form (10), for each $\mathfrak{z}_{i}\in\mathcal{Z}_{N}$ and each $x\in\mathcal{X}$, we treat $\mathfrak{z}_{i}(x)$ as the features and $\lambda^{2}\mathcal{T}_{\tau}\mathfrak{z}_{i}(x)-\lambda\mathfrak{z}_{i}(x)=\lambda^{2}\int_{0}^{\tau}e^{-\lambda s}\mathfrak{z}_{i}(\phi(s,x))\,ds-\lambda\mathfrak{z}_{i}(x)$ as the labels. To compute the integral, we employ numerical quadrature techniques. This inevitably requires discrete-time observations (snapshots) of the flow map $\phi(\cdot,x)$ within the interval $[0,\tau]$, the number of which is denoted by $\mathbf{N}$.

To streamline the evaluation process for numerical examples, drawing inspiration from [27], for any $\tau$ and $i$, we can assess both the trajectory and the integral, i.e., the pair $\left(\phi(\tau,x),\,\lambda^{2}\int_{0}^{\tau}e^{-\lambda s}\mathfrak{z}_{i}(\phi(s,x))\,ds\right)$, by numerically solving the following augmented ODE system

\dot{\mathbf{x}}(t)=f(\mathbf{x}(t)),\;\;\mathbf{x}(0)=x\in\mathbb{R}^{n},\qquad\dot{I}_{i}(t)=\lambda^{2}e^{-\lambda t}\mathfrak{z}_{i}(\mathbf{x}(t)),\;\;I_{i}(0)=0. (13)

We summarize the procedure for generating one period of training data in Algorithm 1.

Algorithm 1 Generating Training Data
Input: $f$, $n$, $N$, $\mathcal{X}$, $\mathcal{Z}_{N}$, $\tau$, $\lambda$, and uniformly sampled $\{x^{(m)}\}_{m=0}^{M-1}\subseteq\mathcal{X}$.
for $m$ from 0 to $M-1$ do
     for $i$ from 0 to $N-1$ do
         Compute $\mathfrak{z}_{i}(x^{(m)})$
         Compute $\lambda^{2}\mathcal{T}_{\tau}\mathfrak{z}_{i}(x^{(m)})$ using (13)
     end for
     Stack $\widetilde{\mathcal{L}}_{\lambda}\mathcal{Z}_{N}(x^{(m)})=[\lambda^{2}\mathcal{T}_{\tau}\mathfrak{z}_{0}(x^{(m)})-\lambda\mathfrak{z}_{0}(x^{(m)}),\cdots,\lambda^{2}\mathcal{T}_{\tau}\mathfrak{z}_{N-1}(x^{(m)})-\lambda\mathfrak{z}_{N-1}(x^{(m)})]$
end for
Stack $\mathfrak{X},\mathfrak{Y}\in\mathbb{C}^{M\times N}$ such that $\mathfrak{X}=[\mathcal{Z}_{N}(x^{(0)}),\mathcal{Z}_{N}(x^{(1)}),\cdots,\mathcal{Z}_{N}(x^{(M-1)})]^{T}$ and $\mathfrak{Y}=[\widetilde{\mathcal{L}}_{\lambda}\mathcal{Z}_{N}(x^{(0)}),\widetilde{\mathcal{L}}_{\lambda}\mathcal{Z}_{N}(x^{(1)}),\cdots,\widetilde{\mathcal{L}}_{\lambda}\mathcal{Z}_{N}(x^{(M-1)})]^{T}$
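The following is a sketch of Algorithm 1 in Python, under stated assumptions: all helper names are ours, the Van der Pol field stands in for the black-box trajectory source, random samples stand in for the uniform grid, and we use a smaller $\lambda$ than the paper's $10^{6}$, which would demand a higher-accuracy integrator for (13).

```python
# A sketch of Algorithm 1 (assumed helper names, not the paper's code).
import numpy as np
from scipy.integrate import solve_ivp

lam, tau = 1e3, 1.0   # the paper uses lam = 1e6, requiring finer integration

powers = [(i, j) for j in range(3) for i in range(4)]  # z_{i,j} = x1^i x2^j, N = 12
Z = lambda x: np.array([x[0] ** i * x[1] ** j for i, j in powers])

def f(x):  # stand-in for the black-box simulator generating trajectories
    return np.array([x[1], -x[0] + (1.0 - x[0] ** 2) * x[1]])

def label_row(x0):
    """One inner loop of Algorithm 1: integrate the augmented ODE (13)."""
    def rhs(t, y):
        x = y[:2]
        return np.concatenate([f(x), lam**2 * np.exp(-lam * t) * Z(x)])
    y0 = np.concatenate([x0, np.zeros(len(powers))])
    sol = solve_ivp(rhs, (0.0, tau), y0, rtol=1e-10, atol=1e-12)
    I = sol.y[2:, -1]               # lam^2 * T_tau z_i(x0), i = 0..N-1
    return I - lam * Z(x0)          # one row of Y: (lam^2 T_tau - lam I) Z_N

rng = np.random.default_rng(0)
xs = rng.uniform(-1.0, 1.0, size=(100, 2))   # random samples in X (M = 100 here)
X = np.stack([Z(x) for x in xs])             # features, M x N
Y = np.stack([label_row(x) for x in xs])     # labels,   M x N
```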

IV-B Extended Dynamic Mode Decomposition Algorithm

After obtaining the training data $(\mathfrak{X},\mathfrak{Y})$ using Algorithm 1, we can find $\mathbf{L}$ by $\mathbf{L}=\operatorname{argmin}_{A\in\mathbb{C}^{N\times N}}\|\mathfrak{Y}-\mathfrak{X}A\|_{F}$, which admits the closed-form solution $\mathbf{L}=(\mathfrak{X}^{T}\mathfrak{X})^{\dagger}\mathfrak{X}^{T}\mathfrak{Y}$, where $\dagger$ denotes the Moore-Penrose pseudoinverse. As in Extended Dynamic Mode Decomposition (EDMD) [26] for learning Koopman operators, the approximations (11) and (12) can be guaranteed. In addition, by the universal approximation theorem, all of the above function approximations should converge uniformly to the actual quantities.
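In code, the regression step is a one-liner; the sketch below reuses the arrays X and Y from the Algorithm 1 sketch above (variable names are our assumptions):

```python
# Least-squares recovery of L from (X, Y), as in Section IV-B.
L_mat = np.linalg.pinv(X.T @ X) @ (X.T @ Y)   # L = (X^T X)^dagger X^T Y

mu, Xi = np.linalg.eig(L_mat)   # eigenpairs approximating (11)
# For h = Z_N w, (12) gives the Lie-derivative estimate Lh(x) ~= Z(x) @ (L_mat @ w).
```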

It is worth noting that operator learning frameworks, such as autoencoders, which utilize neural networks (NN) as dictionary functions, can reduce human bias in the selection of these functions [5]. Incorporating the proposed learning scheme with NN falls outside the scope of this paper but will be pursued in future work.

Remark IV.1 (The Logarithm Method)

We compare the aforementioned learning approach for $\mathcal{L}$ with the benchmark approach described in [16]. Briefly speaking, at a given $s\in[0,\tau]$, that method first obtains a matrix $\mathbf{K}\in\mathbb{C}^{N\times N}$ such that $\mathbf{K}=(\mathfrak{X}^{T}\mathfrak{X})^{\dagger}\mathfrak{X}^{T}\widetilde{\mathfrak{Y}}$, where $\widetilde{\mathfrak{Y}}=[\mathcal{Z}_{N}(\phi(s,x^{(0)})),\mathcal{Z}_{N}(\phi(s,x^{(1)})),\cdots,\mathcal{Z}_{N}(\phi(s,x^{(M-1)}))]^{T}$. Let $(\mu_{i}^{K},\xi_{i}^{K})_{i=0}^{N-1}$ be the eigenvalues and eigenvectors of $\mathbf{K}$, and let $(\rho_{i}^{K},\varphi_{i}^{K})_{i=0}^{N-1}$ be the eigenvalues and eigenfunctions of $\mathcal{K}_{s}$. Similar to (11) and (12), for each $i$ we have $\varphi_{i}^{K}(x)\approx\mathcal{Z}_{N}(x)\xi_{i}^{K}$ and $\mathcal{K}_{s}h(\cdot)\approx\mathcal{Z}_{N}(\cdot)(\mathbf{K}\mathbf{w})$ for $h(x)=\mathcal{Z}_{N}(x)\mathbf{w}$.

However, even when $\mathcal{L}$ can be represented by $(\log\mathcal{K}_{s})/s$, we cannot guarantee that $\frac{\log\mathcal{K}_{s}}{s}h(\cdot)\approx\mathcal{Z}_{N}(\cdot)(\frac{\log(\mathbf{K})}{s}\mathbf{w})$, let alone the case where the logarithm representation does not hold at all. Denoting $\Phi(\cdot)=[\varphi_{0}^{K}(\cdot),\varphi_{1}^{K}(\cdot),\cdots,\varphi_{N-1}^{K}(\cdot)]$ and $\Xi=[\xi_{0}^{K},\xi_{1}^{K},\cdots,\xi_{N-1}^{K}]$, it is clear that $\Phi(\cdot)=\mathcal{Z}_{N}(\cdot)\Xi$. The (possibly complex-valued) rotation matrix $\Xi$ establishes the connection between finite-dimensional eigenfunctions and dictionary functions through data fitting, ensuring that any linear combination within $\mathcal{Z}_{N}$ can be equivalently represented using $\Phi$ with a cancellation of the imaginary parts.

This imaginary-part cancellation does not generally survive the matrix logarithm: if the imaginary parts are significantly large, the mutual representation of $\Phi$ and $\mathcal{Z}_{N}$ no longer matches on the logarithmic scale. Exceptions arise only when the chosen dictionary is inherently rotation-free with respect to the true eigenfunctions [20], or when there is direct access to data for $\log(\mathcal{K}_{s})$ allowing direct training of the matrix; such conditions, however, contravene our objective of leveraging Koopman data to conversely find the generator. In comparison, the approach in this paper presents an elegant method for approximating $\mathcal{L}$ regardless of its boundedness. This enables the direct learning of $\mathbf{L}$ without computing the logarithm, thereby avoiding the potential appearance of imaginary parts caused by basis rotation. $\diamond$
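For reference, a sketch of the benchmark logarithm-based estimate from this remark, reusing f, Z, xs, and X from the sketches above (phi is an assumed helper; this is our illustration, not the code of [16]):

```python
# Benchmark of Remark IV.1: fit K at time s, then take the matrix logarithm.
from scipy.linalg import logm

s = 0.5
def phi(x0, t):  # flow map obtained by numerical integration
    return solve_ivp(lambda _, x: f(x), (0.0, t), x0, rtol=1e-10).y[:, -1]

Y_snap = np.stack([Z(phi(x, s)) for x in xs])   # Z_N at phi(s, x^(m))
K = np.linalg.pinv(X.T @ X) @ (X.T @ Y_snap)    # Koopman matrix at time s
L_log = logm(K) / s                             # log-based generator estimate
print(np.abs(np.imag(L_log)).max())  # large imaginary residue signals basis rotation
```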

V Case Study

We provide a numerical example to demonstrate the effectiveness of the proposed approach. The research code can be found at https://github.com/Yiming-Meng/Log-Free-Learning-of-Koopman-Generators.

Consider the Van der Pol oscillator

\dot{\mathbf{x}}_{1}(t)=\mathbf{x}_{2}(t),\quad\dot{\mathbf{x}}_{2}(t)=-\mathbf{x}_{1}(t)+(1-\mathbf{x}_{1}^{2}(t))\mathbf{x}_{2}(t),

with $\mathbf{x}(0)=x:=[x_{1},x_{2}]$. We assume the system dynamics are unknown to us, and our information is limited to the system dimension, $n=2$, and observations of sampled trajectories. To generate training data using Algorithm 1, for simplicity of illustration, we select $\mathcal{X}=(-1,1)^{2}$ and obtain a total of $M=100^{2}$ uniformly spaced samples $\{x^{(m)}\}_{m=0}^{M-1}$ in $\mathcal{X}$. We choose the dictionary as

\mathcal{Z}_{N}=[\mathfrak{z}_{0,0},\mathfrak{z}_{1,0},\cdots,\mathfrak{z}_{3,0},\mathfrak{z}_{0,1},\mathfrak{z}_{1,1},\mathfrak{z}_{2,1},\cdots,\mathfrak{z}_{i,j},\cdots,\mathfrak{z}_{3,2}],\;N=12, (14)

where $\mathfrak{z}_{i,j}(x_{1},x_{2})=x_{1}^{i}x_{2}^{j}$ for each $i\in\{0,1,2,3\}$ and $j\in\{0,1,2\}$. We also set $\tau=1$ and $\lambda=10^{6}$. The discrete form $\mathbf{L}$ can be obtained according to Section IV-B. We apply the learned $\mathbf{L}$ to identify the system vector field and to construct a local Lyapunov function for the unknown system.

V-A System Identification

The actual vector field is

f(x):=[f_{1}(x),f_{2}(x)]=[x_{2},\,-x_{1}+(1-x_{1}^{2})x_{2}].

Since $\mathcal{L}\mathfrak{z}_{1,0}(x)=f_{1}(x)$ and $\mathcal{L}\mathfrak{z}_{0,1}(x)=f_{2}(x)$ hold analytically, we use the approximation $[\widetilde{\mathcal{L}}_{\lambda}\mathfrak{z}_{1,0},\widetilde{\mathcal{L}}_{\lambda}\mathfrak{z}_{0,1}]$ to conversely recover $f$.

Note that $\mathfrak{z}_{1,0}(x)=\mathcal{Z}_{N}(x)\mathbf{e}_{2}$ and $\mathfrak{z}_{0,1}(x)=\mathcal{Z}_{N}(x)\mathbf{e}_{5}$, where each $\mathbf{e}_{i}$, $i\in\{1,\cdots,N\}$, is a column unit vector with all components being 0 except for the $i$-th component, which is 1. Applying $\mathbf{L}$, we have $\widetilde{\mathcal{L}}_{\lambda}\mathfrak{z}_{1,0}(x)\approx\mathcal{Z}_{N}(x)(\mathbf{L}\mathbf{e}_{2})$ and $\widetilde{\mathcal{L}}_{\lambda}\mathfrak{z}_{0,1}(x)\approx\mathcal{Z}_{N}(x)(\mathbf{L}\mathbf{e}_{5})$.

In other words, we approximate $\widetilde{\mathcal{L}}_{\lambda}\mathfrak{z}_{1,0}$ and $\widetilde{\mathcal{L}}_{\lambda}\mathfrak{z}_{0,1}$ (and hence $f_{1}$ and $f_{2}$) by linear combinations of the functions in $\mathcal{Z}_{N}$, with weights given by $\mathbf{L}\mathbf{e}_{2}$ and $\mathbf{L}\mathbf{e}_{5}$, respectively. We report the corresponding results in Tables I and II.

$\mathfrak{z}_{i,j}$   $j=0$   $j=1$   $j=2$
$i=0$   $-1.6\times 10^{-10}$   $\mathbf{1+10^{-6}}$   $-4.8\times 10^{-10}$
$i=1$   $-1.3\times 10^{-5}$   $-1.9\times 10^{-9}$   $-1.4\times 10^{-7}$
$i=2$   $-3.6\times 10^{-10}$   $-1.0\times 10^{-6}$   $-6.1\times 10^{-10}$
$i=3$   $-1.5\times 10^{-8}$   $-2.3\times 10^{-9}$   $-2.9\times 10^{-7}$
TABLE I: The weights of $\mathcal{Z}_{N}$ obtained by $\mathbf{L}\mathbf{e}_{2}$.
$\mathfrak{z}_{i,j}$   $j=0$   $j=1$   $j=2$
$i=0$   $-3.8\times 10^{-10}$   $\mathbf{1-10^{-5}}$   $1.3\times 10^{-9}$
$i=1$   $\mathbf{-1-10^{-6}}$   $9.5\times 10^{-10}$   $-2.0\times 10^{-6}$
$i=2$   $8.1\times 10^{-10}$   $\mathbf{-1-10^{-6}}$   $-2.0\times 10^{-9}$
$i=3$   $1\times 10^{-8}$   $-6.9\times 10^{-10}$   $-3.0\times 10^{-9}$
TABLE II: The weights of $\mathcal{Z}_{N}$ obtained by $\mathbf{L}\mathbf{e}_{5}$.

We compare the above results with those obtained using the benchmark approach of [16] with the same $M$ and $\mathcal{Z}_{N}$. As described in Remark IV.1, $\mathbf{K}\in\mathbb{C}^{N\times N}$ can be obtained such that $\mathcal{K}_{s}\mathfrak{z}_{1,0}(x)\approx\mathcal{Z}_{N}(x)(\mathbf{K}\mathbf{e}_{2})$ and $\mathcal{K}_{s}\mathfrak{z}_{0,1}(x)\approx\mathcal{Z}_{N}(x)(\mathbf{K}\mathbf{e}_{5})$. We then have, as claimed in [16], the approximations $f_{1}(x)\approx\mathcal{Z}_{N}(x)(\log(\mathbf{K})\mathbf{e}_{2}/s)$ and $f_{2}(x)\approx\mathcal{Z}_{N}(x)(\log(\mathbf{K})\mathbf{e}_{5}/s)$ w.r.t. $|\cdot|_{\infty}$. Following [16, Section VI.A], we set $s=0.5$ to avoid the multi-valued matrix logarithm. The weights obtained by taking the real parts of $\log(\mathbf{K})\mathbf{e}_{2}/s$ and $\log(\mathbf{K})\mathbf{e}_{5}/s$ are reported in Tables III and IV, respectively. The proposed method achieves an accuracy improvement of multiple orders of magnitude.

It is worth noting that, unlike the experiment in [16] where $i,j\in\{0,1,2,3\}$, we deliberately choose different ranges for $i$ and $j$, leading to non-negligible imaginary parts after taking the matrix logarithm of $\mathbf{K}$. The imaginary parts of the weights learned via the Koopman-logarithm approach are reported in Tables V and VI. Their presence indicates that, with this approach, the result is sensitive to the selection of dictionary functions, an effect that is significant and cannot be overlooked when aiming to minimize human intervention in identifying unknown systems in practice. Furthermore, the original experiment in [16] used $M=300^{2}$ samples for data fitting; when $M$ is reduced to $100^{2}$, the quality of the operator learning deteriorates.

$\mathfrak{z}_{i,j}$   $j=0$   $j=1$   $j=2$
$i=0$   $-9.5\times 10^{-16}$   $\mathbf{1-3.0\times 10^{-4}}$   $-4.8\times 10^{-10}$
$i=1$   $\mathbf{-4.1\times 10^{-3}}$   $-3\times 10^{-15}$   $\mathbf{-1.8\times 10^{-2}}$
$i=2$   $-1.7\times 10^{-16}$   $\mathbf{-6.4\times 10^{-3}}$   $-7.8\times 10^{-17}$
$i=3$   $\mathbf{-6.0\times 10^{-3}}$   $-6.7\times 10^{-15}$   $\mathbf{-3.7\times 10^{-2}}$
TABLE III: The weights of $\mathcal{Z}_{N}$ obtained by $\log(\mathbf{K})\mathbf{e}_{2}/s$.
$\mathfrak{z}_{i,j}$   $j=0$   $j=1$   $j=2$
$i=0$   $-8.6\times 10^{-16}$   $\mathbf{1+6.5\times 10^{-3}}$   $4.1\times 10^{-16}$
$i=1$   $-\mathbf{1+3.0\times 10^{-2}}$   $-1.6\times 10^{-15}$   $\mathbf{-1.5\times 10^{-1}}$
$i=2$   $1.0\times 10^{-15}$   $-\mathbf{1-3.8\times 10^{-2}}$   $-2.0\times 10^{-15}$
$i=3$   $\mathbf{-5.8\times 10^{-2}}$   $1.8\times 10^{-15}$   $\mathbf{3.3\times 10^{-1}}$
TABLE IV: The weights of $\mathcal{Z}_{N}$ obtained by $\log(\mathbf{K})\mathbf{e}_{5}/s$.
$\mathfrak{z}_{i,j}$   $j=0$   $j=1$   $j=2$
$i=0$   $-1.8\times 10^{-17}$   $\mathbf{2.0\times 10^{-3}}$   $-2.2\times 10^{-18}$
$i=1$   $\mathbf{5.3\times 10^{-3}}$   $6.9\times 10^{-17}$   $\mathbf{-2.3\times 10^{-2}}$
$i=2$   $1.6\times 10^{-17}$   $\mathbf{-8.4\times 10^{-3}}$   $-3.0\times 10^{-17}$
$i=3$   $\mathbf{-8.4\times 10^{-3}}$   $-1.1\times 10^{-16}$   $\mathbf{5.0\times 10^{-2}}$
TABLE V: The imaginary parts obtained by $\log(\mathbf{K})\mathbf{e}_{2}/s$.
$\mathfrak{z}_{i,j}$   $j=0$   $j=1$   $j=2$
$i=0$   $-8.0\times 10^{-17}$   $\mathbf{8.8\times 10^{-3}}$   $-9.5\times 10^{-18}$
$i=1$   $\mathbf{2.3\times 10^{-2}}$   $3.0\times 10^{-16}$   $\mathbf{-1.0\times 10^{-1}}$
$i=2$   $6.9\times 10^{-17}$   $\mathbf{-3.7\times 10^{-2}}$   $-1.3\times 10^{-16}$
$i=3$   $\mathbf{3.7\times 10^{-2}}$   $-4.9\times 10^{-16}$   $\mathbf{2.2\times 10^{-1}}$
TABLE VI: The imaginary parts obtained by $\log(\mathbf{K})\mathbf{e}_{5}/s$.

V-B Stability Prediction for the Reversed Dynamics

Observing the identified system dynamics, we anticipate that the time-reversed system is (locally) asymptotically stable w.r.t. the origin. We use the learned $\mathbf{L}$ to construct polynomial Lyapunov functions based on the Lyapunov equation $\mathcal{L}V(x)=|x|^{2}$, where the sign on the r.h.s. is reversed because the dynamics are reversed. Using $\mathbf{L}$ and the library functions from $\mathcal{Z}_{N}$, we define the weights $\theta:=[\theta_{0},\theta_{1},\cdots,\theta_{N-1}]^{T}$ and seek a $\theta$ such that $V(x;\theta):=\mathcal{Z}_{N}(x)\theta$, with the objective of minimizing $|\mathcal{Z}_{N}(x)\mathbf{L}\theta-x_{1}^{2}-x_{2}^{2}|$. Ignoring terms of magnitude $10^{-9}$, the constructed Lyapunov function is given by

V(x;\theta)=1.39x_{1}^{2}-1.56x_{1}x_{2}+1.16x_{2}^{2}+0.74x_{1}^{2}x_{2}^{2}.

Additionally, $\mathcal{L}V(x;\theta)$ is approximated by $\mathcal{Z}_{N}(x)\mathbf{L}\theta$, whose maximal value, $\mathcal{Z}_{N}(0)\mathbf{L}\theta=-0.068$, is observed at the origin. This verifies the validity of the stability prediction inferred from the data-driven Lyapunov function. The visualization can be found in Fig. 1.
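The least-squares construction of $\theta$ can be sketched as follows, reusing Z, xs, and L_mat from the sketches above (our illustration of the minimization just described, not the paper's code):

```python
# Fit theta so that Z_N(x) L theta ~= |x|^2, as in Section V-B.
A = np.stack([Z(x) @ L_mat for x in xs])            # rows: Z_N(x) L
b = np.array([x[0] ** 2 + x[1] ** 2 for x in xs])   # target |x|^2
theta, *_ = np.linalg.lstsq(A, b, rcond=None)

V = lambda x: Z(x) @ theta              # candidate Lyapunov function V(x; theta)
LV = lambda x: Z(x) @ (L_mat @ theta)   # its estimated Lie derivative
```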

Remark V.1

The idea illustrated above can be extended to cover a larger region of interest. Specifically, a Zubov equation, instead of a Lyapunov equation, can be solved within the Koopman learning framework using the same dataset [7] as generated in Algorithm 1. The solution obtained can potentially serve as a Lyapunov function. Since this approach alone cannot guarantee the required properties of the learned function's derivatives, the data can be reused, as in Algorithm 1, to verify its Lie derivative and thereby confirm it as a true Lyapunov function. $\diamond$


Figure 1: Data-driven Lyapunov function $V$ and its Lie derivative $\mathcal{L}V$ w.r.t. the predicted system dynamics.

VI Conclusion

In this paper, we propose a logarithm-free, Koopman operator-based learning framework for the infinitesimal generator, demonstrating both theoretical and numerical improvements over the method proposed in [16]. In particular, for the more general case where the generator is unbounded and, consequently, the logarithm of the Koopman operator cannot be used for representation, we draw upon the rich semigroup literature to propose an approximation (Eq. (9)) based on the Yosida approximation. A convergence result, along with the convergence rate, is established to guide users in tuning the parameters. A numerical example with application to system identification is provided, in comparison with the experiment in [16], demonstrating the learning accuracy. Unlike the experiment in [16], where learning accuracy is sensitive to the choice of dictionary functions, the method presented in this paper shows significant improvement in this regard. In applications where automatic computational approaches surpass human computability, such as constructing Lyapunov-like functions via the Lie derivative, the proposed logarithm-free method holds particular promise. Future work will provide more analysis of the sampling rate, further numerical simulations, and real-world applications using real data.

References

  • [1] J. Sjöberg, Q. Zhang, L. Ljung, A. Benveniste, B. Delyon, P.-Y. Glorennec, H. Hjalmarsson, and A. Juditsky, “Nonlinear black-box modeling in system identification: a unified overview,” Automatica, vol. 31, no. 12, pp. 1691–1724, 1995.
  • [2] R. Haber and H. Unbehauen, “Structure identification of nonlinear dynamic systems—a survey on input/output approaches,” Automatica, vol. 26, no. 4, pp. 651–677, 1990.
  • [3] A. Mauroy and I. Mezić, “A spectral operator-theoretic framework for global stability,” in 52nd IEEE Conference on Decision and Control, pp. 5234–5239, IEEE, 2013.
  • [4] A. Mauroy and I. Mezić, “Global stability analysis using the eigenfunctions of the Koopman operator,” IEEE Transactions on Automatic Control, vol. 61, no. 11, pp. 3356–3369, 2016.
  • [5] S. A. Deka, A. M. Valle, and C. J. Tomlin, “Koopman-based neural Lyapunov functions for general attractors,” in 61st Conference on Decision and Control (CDC), pp. 5123–5128, IEEE, 2022.
  • [6] J. L. Proctor, S. L. Brunton, and J. N. Kutz, “Dynamic mode decomposition with control,” SIAM Journal on Applied Dynamical Systems, vol. 15, no. 1, pp. 142–161, 2016.
  • [7] Y. Meng, R. Zhou, and J. Liu, “Learning regions of attraction in unknown dynamical systems via Zubov-Koopman lifting: Regularities and convergence,” arXiv preprint arXiv:2311.15119, 2023.
  • [8] I. M. Mitchell, A. M. Bayen, and C. J. Tomlin, “A time-dependent Hamilton-Jacobi formulation of reachable sets for continuous dynamic games,” IEEE Transactions on Automatic Control, vol. 50, no. 7, pp. 947–957, 2005.
  • [9] Y. Lin, E. D. Sontag, and Y. Wang, “A smooth converse Lyapunov theorem for robust stability,” SIAM Journal on Control and Optimization, vol. 34, no. 1, pp. 124–160, 1996.
  • [10] J. Liu, Y. Meng, M. Fitzsimmons, and R. Zhou, “Physics-informed neural network Lyapunov functions: PDE characterization, learning, and verification,” arXiv preprint arXiv:2312.09131, 2023.
  • [11] Y. Meng, Y. Li, M. Fitzsimmons, and J. Liu, “Smooth converse Lyapunov-barrier theorems for asymptotic stability with safety constraints and reach-avoid-stay specifications,” Automatica, vol. 144, p. 110478, 2022.
  • [12] Y. Meng and J. Liu, “Lyapunov-barrier characterization of robust reach–avoid–stay specifications for hybrid systems,” Nonlinear Analysis: Hybrid Systems, vol. 49, p. 101340, 2023.
  • [13] S. L. Brunton, J. L. Proctor, and J. N. Kutz, “Discovering governing equations from data by sparse identification of nonlinear dynamical systems,” Proceedings of the National Academy of Sciences, vol. 113, no. 15, pp. 3932–3937, 2016.
  • [14] J. C. Nash and M. Walker-Smith, “Nonlinear parameter estimation,” An integrated system on BASIC. NY, Basel, vol. 493, 1987.
  • [15] J. M. Varah, “A spline least squares method for numerical parameter estimation in differential equations,” SIAM Journal on Scientific and Statistical Computing, vol. 3, no. 1, pp. 28–46, 1982.
  • [16] A. Mauroy and J. Goncalves, “Koopman-based lifting techniques for nonlinear systems identification,” IEEE Transactions on Automatic Control, vol. 65, no. 6, pp. 2550–2565, 2019.
  • [17] S. Klus, F. Nüske, S. Peitz, J.-H. Niemann, C. Clementi, and C. Schütte, “Data-driven approximation of the Koopman generator: Model reduction, system identification, and control,” Physica D: Nonlinear Phenomena, vol. 406, p. 132416, 2020.
  • [18] Z. Drmač, I. Mezić, and R. Mohr, “Identification of nonlinear systems using the infinitesimal generator of the Koopman semigroup—a numerical implementation of the Mauroy–Goncalves method,” Mathematics, vol. 9, no. 17, p. 2075, 2021.
  • [19] M. Black and D. Panagou, “Safe control design for unknown nonlinear systems with Koopman-based fixed-time identification,” IFAC-PapersOnLine, vol. 56, no. 2, pp. 11369–11376, 2023.
  • [20] Z. Zeng, Z. Yue, A. Mauroy, J. Gonçalves, and Y. Yuan, “A sampling theorem for exact identification of continuous-time nonlinear dynamical systems,” in 2022 IEEE 61st Conference on Decision and Control (CDC), pp. 6686–6692, IEEE, 2022.
  • [21] Z. Zeng and Y. Yuan, “A generalized Nyquist–Shannon sampling theorem using the Koopman operator,” arXiv preprint, 2023.
  • [22] J. J. Bramburger and G. Fantuzzi, “Auxiliary functions as Koopman observables: Data-driven analysis of dynamical systems via polynomial optimization,” Journal of Nonlinear Science, vol. 34, no. 1, p. 8, 2024.
  • [23] A. Nejati, A. Lavaei, S. Soudjani, and M. Zamani, “Data-driven estimation of infinitesimal generators of stochastic systems,” IFAC-PapersOnLine, vol. 54, no. 5, pp. 277–282, 2021.
  • [24] C. Wang, Y. Meng, S. L. Smith, and J. Liu, “Data-driven learning of safety-critical control with stochastic control barrier functions,” in 61st Conference on Decision and Control (CDC), pp. 5309–5315, IEEE, 2022.
  • [25] A. Pazy, Semigroups of Linear Operators and Applications to Partial Differential Equations, vol. 44. Springer Science & Business Media, 2012.
  • [26] M. O. Williams, I. G. Kevrekidis, and C. W. Rowley, “A data–driven approximation of the Koopman operator: Extending dynamic mode decomposition,” Journal of Nonlinear Science, vol. 25, pp. 1307–1346, 2015.
  • [27] W. Kang, K. Sun, and L. Xu, “Data-driven computational methods for the domain of attraction and Zubov’s equation,” IEEE Transactions on Automatic Control, 2023.