
Emails: cyt1012@jlu.edu.cn (Y. Chen), liyong@jlu.edu.cn (Y. Li)

Finding similarity of orbits between two discrete dynamical systems via optimal principle

Yuting Chen¹ and Yong Li¹,² (corresponding author)

¹ College of Mathematics, Jilin University, Changchun 130012, P.R. China.
² School of Mathematics and Statistics, and Center for Mathematics and Interdisciplinary Sciences, Northeast Normal University, Changchun 130024, P.R. China.
Abstract

Whether there is similarity between two physical processes, in the movement of objects and the complexity of behavior, is an essential problem in science. How to seek similarity through quantitative and qualitative research techniques still remains an urgent challenge. To this end, the concepts of similarity transformation matrix and similarity degree are introduced to describe similarity of orbits between two complicated discrete dynamical systems that seem to be irrelevant. Furthermore, we present a general optimal principle, giving a strict characterization from the perspective of dynamical systems combined with optimization theory. For well-known examples of chaotic dynamical systems, such as the Lorenz attractor, Chua's circuit, the Rössler attractor, the Chen attractor, the Lü attractor and hybrid systems, using the homotopy idea, numerical simulation results demonstrate that similarity can be found in the rich characteristics and complex behaviors of chaotic dynamics via the optimal principle we present.

keywords:
similarity, optimal principle, homotopy, discrete dynamical system, chaotic attractor.
AMS subject classifications: 37N40, 49K15, 65K10

1 Introduction

Discrete dynamical systems described by iteration of mappings appear everywhere, either expressing laws from physical science directly or resulting from simulations carried out to better understand differential equations numerically. Generally, it is much more difficult, but also more interesting, to investigate how complex behavior arises after some iterations in discrete dynamical systems than in continuous ones, since there are probably greater covered ranges and more ghost phenomena. Along with the development of computer technology, mathematical modeling by means of discrete dynamical systems has gained ground in different fields such as biology, economics, demography, engineering, and so on. It is universally acknowledged that, no matter how differently the various technologies develop and the research objects appear, certain underlying similarities exist.

Similarity, in addition to being frequently encountered, is viewed as a fundamental concept in scientific research. The idea of similarity has gained widespread popularity in the era of big data and machine learning by various means. For instance, scale similarity is found in many natural phenomena in the universe [1]. An embedding-based vehicle method with deep representation learning drastically accelerates trajectory similarity computation [2]. A novel brain electroencephalography (EEG) clustering algorithm not only handles the problem of unlabeled EEG, but also avoids the time-consuming task of manually marking the EEG [3]. Based on the cosine similarity, a transductive long short-term memory model has been developed for temperature forecasting [4]. Self-similar coordinates have been investigated in the Lattice Boltzmann equation, showing that the time-averaged statistics of velocity and vorticity exhibit self-similarity at low Reynolds numbers [5]. Many other applications include gene expression [6], image registration [7], web pages and scientific literature [8], fuzzy linguistic term sets [9], collaborative filtering [10], pattern analysis [11] and preferential attachment [12]. Indeed, ubiquitous similarity facilitates the prediction of indeterminate events by analyzing known data, an essential task in many natural systems and phenomena of real life.

A core part of similarity search is the so-called similarity measure, which assesses how similar two sequences are, in other words, the degree to which a given sequence resembles another. Many researchers have devoted great attention to devising proper similarity measures and have achieved several valuable results, which can be roughly divided into two sorts. One sort is based on traditional measures, such as Euclidean distance, dynamic time warping, cosine and cotangent similarity measures and the Pearson correlation coefficient [13]. The other sort consists of transform-based methods, such as singular value decomposition, principal component analysis, Fourier coefficients, and auto-regression and moving-average models [14, 15]. The careful selection of a similarity measure has long been a research hotspot, as it directly affects the accuracy of further data mining tasks such as classification, clustering and indexing [16, 17].

Up to now, whether there is similarity between two physical processes and how to seek similarity through a mathematical principle still remain significant challenges. Several theoretical approaches deal with this problem by treating asymptotic equivalence, synchronization and stability as kinds of similarity. Furthermore, almost all similarity measure criteria are designed according to application background and actual data, and can only be regarded as quantitative representations that estimate, under certain conditions, how closely a given series resembles another.

Determining similarity between orbits of chaotic systems, which are generally characterized by highly complex behavior, is particularly difficult when a general similarity measure is employed. With this in view, for any two chaotic dynamical systems, what their similarities are and how to find them are fundamental but challenging questions in science and engineering. In this paper, we attempt to address these problems.

To the authors' best knowledge, this is the first work to develop a mathematical framework studying the connection of orbits between two discrete dynamical systems from the perspective of dynamical systems combined with optimization theory. The novel contributions and results of this paper include: 1) proposing the concepts of similarity transformation matrix and similarity degree to describe to what extent the orbits of two discrete dynamical systems are similar; 2) presenting a general optimality principle, via a variational method, for the case when the orbits of two discrete dynamical systems are similar at some step; 3) constructing hybrid dynamical systems with richer and more complex dynamical behavior via the idea of homotopy, applied in numerical simulation.

The remainder of this paper begins with a review, in Section 2, of several typical chaotic attractors related to this study. In Section 3 we formalize similarity via optimization techniques, give mathematical definitions of the similarity transformation matrix and similarity degree to assess to what extent the orbits of two dynamical systems are similar, and establish the main results of this paper. Section 4 reports numerical simulation results for chaotic systems that support the theoretical findings. Conclusions are drawn in Section 5.

2 Chaotic systems

Among a broad variety of dynamical systems in the universe, we consider some typical chaotic systems such as the Lorenz attractor, Chua's circuit, the Rössler attractor, the Chen attractor, the Lü attractor and their hybrid systems.

The Lorenz attractor, the first chaotic dynamical system, was obtained by Lorenz in 1963 from a simplified mathematical model developed for atmospheric convection while modelling meteorological phenomena [18]. It is a typical nonlinear system with three differential equations known as the Lorenz equations

\dot{x}=-\sigma x+\sigma y,\quad\dot{y}=-xz+rx-y,\quad\dot{z}=xy-bz, (1)

where x, y, z represent the system states and the system parameters are selected as σ=10, b=8/3, r=28. With initial conditions x(0)=0.1, y(0)=0.1, z(0)=0.1, the behavior of the Lorenz attractor, resembling a butterfly or a figure eight, is illustrated in Fig. 1.
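As a quick illustration of how a continuous flow such as (1) becomes a discrete dynamical system of the kind studied later, here is a minimal Python sketch (ours, not from the paper) that iterates a forward-Euler discretization of the Lorenz equations; the step size h and the helper names are our own assumptions:

```python
# Forward-Euler discretization of the Lorenz equations (1):
# the continuous flow becomes the discrete map x_{k+1} = x_k + h * F(x_k).
def lorenz_step(state, h=0.01, sigma=10.0, b=8.0 / 3.0, r=28.0):
    x, y, z = state
    return (x + h * (-sigma * x + sigma * y),
            y + h * (-x * z + r * x - y),
            z + h * (x * y - b * z))

def orbit(step, state, n):
    """Iterate a discrete map n times, collecting the orbit."""
    points = [state]
    for _ in range(n):
        state = step(state)
        points.append(state)
    return points

# Orbit from the initial condition used in the text.
traj = orbit(lorenz_step, (0.1, 0.1, 0.1), 5000)
```

Plotting the three components of `traj` reproduces the familiar butterfly-shaped picture; the value h = 0.01 is an assumption for illustration, not a value taken from the paper.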

Figure 1: Lorenz attractor.
Figure 2: Chua’s circuit.

Chua’s circuit is the simplest electronic circuit known as a nonperiodic oscillator [19]. It has been confirmed by numerous experimental simulations and rigorous mathematical analysis that this circuit is able to produce an oscillating waveform exhibiting classic chaotic behavior and many well-known bifurcation phenomena. Three ordinary differential equations arise in the analysis of Chua’s circuit:

\dot{x}=\alpha[y-x-f(x)],\quad\dot{y}=x-y+z,\quad\dot{z}=-\beta y, (2)

where x, y denote the voltages across the capacitors, z represents the inductor current, and the parameters α, β are determined by the particular values of the circuit components. The function f(x) is a piecewise-linear function describing the electrical response of the nonlinear resistor:

f(x)=m_{1}x+\dfrac{1}{2}(m_{0}-m_{1})(|x+1|-|x-1|). (3)

Fig. 2 shows the double-scroll attractor of Chua’s circuit, with initial states x(0)=0.1, y(0)=0.1, z(0)=0.1 and parameters α=10, β=15, m_0=-1.2, m_1=-0.6.
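The piecewise-linear nonlinearity (3) is straightforward to implement directly; a small sketch (the function name `chua_f` is our own) that reproduces the inner slope m_0 and outer slope m_1:

```python
def chua_f(x, m0=-1.2, m1=-0.6):
    """Piecewise-linear resistor response (3): for |x| <= 1 the two absolute
    values collapse to 2x, giving slope m0; outside that region the slope
    is m1. The function is continuous at the breakpoints x = -1 and x = 1."""
    return m1 * x + 0.5 * (m0 - m1) * (abs(x + 1) - abs(x - 1))
```

With the parameter values used for Fig. 2, the inner region gives f(x) = m_0 x (e.g. f(0.5) = -0.6), while for |x| > 1 the response runs parallel to m_1 x.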

The Rössler attractor behaves similarly to the Lorenz attractor, and it is the simplest chaotic attractor from the topological point of view. This attractor has been applied to modelling equilibrium in chemical reactions and is a chaotic solution to the system of three differential equations

\dot{x}=-y-z,\quad\dot{y}=x+ay,\quad\dot{z}=b+z(x-c), (4)

where x, y, z denote the system states and the three parameters a, b, c are assumed to be positive [20]. We select the parameter values a=0.2, b=0.2, c=5.7; a typical orbit of the Rössler attractor, which exhibits chaotic behavior, is shown in Fig. 3.

Figure 3: Rössler attractor.
Figure 4: Chen attractor.

The Chen attractor was found in the pursuit of chaotification; it is similar but topologically not equivalent to the Lorenz attractor [21]. Although the Chen attractor has a simple structure and is the dual of the Lorenz system, it is considered to display even more sophisticated dynamical behavior [22]. The three-dimensional autonomous system of ordinary differential equations with quadratic nonlinearities describing the Chen system is

\dot{x}=a(y-x),\quad\dot{y}=(c-a)x-xz+cy,\quad\dot{z}=xy-bz, (5)

where x, y, z are the system states and a, b, c are real parameters. For the parameter values a=40, b=3, c=28, we obtain a Lorenz-like wing attractor, as shown in Fig. 4.

The Lü attractor is another example that captures the paradigms of chaotic systems; it connects the Lorenz attractor and the Chen attractor and represents the transition from one to the other [23, 24]. To reveal the topological structure of this chaotic attractor, consider its controlled system, obtained by adding a constant u to the second equation of the Lü system:

\dot{x}=a(y-x),\quad\dot{y}=-xz+cy+u,\quad\dot{z}=xy-bz, (6)

where x, y, z denote the system states and a, b, c are the system parameters. By varying the parameter u, regarded as a “controller” of the controlled system, one can observe different dynamical behaviors, contributing to a better understanding of all similar and closely related chaotic systems [25]. When a=36, b=3, c=20, the simulation results are summarized in Fig. 5.

(a) u=0.
(b) u=-12.
(c) u=12.
Figure 5: Lü attractor.

The high sensitivity of chaotic systems to small perturbations of the initial states, together with the complex dynamical behavior characterized by rapidly changing solutions, makes research on chaotic dynamical systems challenging. This paper focuses on finding similarity between the orbits of chaotic attractors via a general optimality principle, which is discussed in the next section.

3 Main results

We are now in a position to demonstrate that similarity of orbits generated by discrete dynamical systems can be found through a strict mathematical principle. This allows us to better understand motion trajectories and predict process trends when looking at the rich behavior of complex physical processes.

3.1 Simple Dynamical Systems

Consider the following two discrete dynamical systems

x_{k+1}=f(k,x_{k}), (7)
y_{k+1}=g(k,y_{k}), (8)

where the mappings f,g:\mathbb{N}_{+}\times\mathbb{R}^{n}\rightarrow\mathbb{R}^{n} are of class C^{1}. Starting from the initial states x_{0} and y_{0}, the solutions of (7) and (8) are denoted by

x_{k}=x(k,x_{0}), (9)
y_{k}=y(k,y_{0}), (10)

respectively. We introduce the new concept of a similarity transformation matrix to establish a relation of similarity between (7) and (8).

Definition 3.1.

Let [n] denote the set \{1,2,\ldots,n\}. Suppose there exists an n-order matrix A=(a_{ij})\in\Omega satisfying:

\bullet a_{ij} denotes the matrix element in the ith row and jth column of A, where i,j\in[n];

\bullet \Omega is a bounded closed convex subset of \mathbb{M}^{n\times n} whose interior \Omega^{\circ}\neq\emptyset;

\bullet y_{0}=Ax_{0}.

We say that A is a similarity transformation matrix if the solutions of the two discrete dynamical systems (7) and (8) satisfy y_{k}=Ax_{k} at the kth step. Otherwise, if no such A exists, we say that they are not similar.

From the definition given above, it follows that the way to estimate the similarity transformation matrix A is to minimize the cost functional

J(A)=\sum\limits_{k=0}^{N}\|Ax_{k}-y_{k}\|_{2}^{2}. (11)
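When the orbit {y_k} is given as data, the minimization of (11) is a linear least-squares problem in the entries of A; a minimal NumPy sketch (our own, assuming the two orbits are stored as arrays of shape (N+1, n)):

```python
import numpy as np

def estimate_A(xs, ys):
    """Least-squares estimate of the similarity transformation matrix:
    argmin_A sum_k ||A x_k - y_k||_2^2.
    Stacking the orbits row-wise as X and Y, this is min_B ||X B - Y||_F
    with B = A^T, solved here by np.linalg.lstsq."""
    X, Y = np.asarray(xs), np.asarray(ys)
    At, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return At.T

# Sanity check: if y_k = A0 x_k exactly, the estimate recovers A0.
rng = np.random.default_rng(0)
A0 = rng.standard_normal((3, 3))
X = rng.standard_normal((50, 3))
Y = X @ A0.T
A_hat = estimate_A(X, Y)
```

If the orbits are only approximately related, `A_hat` is the least-squares best fit rather than an exact similarity transformation.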

Motivated by the first-order optimality condition, which is the foundation of many optimization algorithms, we know that the optimal solution of (11) satisfies

\frac{\partial J(A)}{\partial a_{ij}}=\frac{\partial}{\partial a_{ij}}(x_{k}^{\rm T}A^{\rm T}Ax_{k})-\frac{\partial}{\partial a_{ij}}(2x_{k}^{\rm T}A^{\rm T}y_{k})+\frac{\partial}{\partial a_{ij}}(y_{k}^{\rm T}y_{k})=0. (12)

For the first term of the right-hand side in (12), we have

\begin{aligned}
\frac{\partial}{\partial a_{ij}}(x_{k}^{\rm T}A^{\rm T}Ax_{k})
&=\frac{\partial}{\partial a_{ij}}(x_{k}^{\rm T}A^{\rm T})Ax_{k}+x_{k}^{\rm T}A^{\rm T}\frac{\partial}{\partial a_{ij}}(Ax_{k})\\
&=x_{k}^{\rm T}\left(\frac{\partial A^{\rm T}}{\partial a_{ij}}\right)Ax_{k}+x_{k}^{\rm T}A^{\rm T}\left(\frac{\partial A}{\partial a_{ij}}\right)x_{k}\\
&=(0,\ldots,0,x_{kj},0,\ldots,0)Ax_{k}+x_{k}^{\rm T}A^{\rm T}(0,\ldots,0,x_{kj},0,\ldots,0)^{\rm T}\\
&=(a_{i1}x_{kj},\ldots,a_{in}x_{kj})x_{k}+x_{k}^{\rm T}(a_{i1}x_{kj},\ldots,a_{in}x_{kj})^{\rm T}\\
&=\sum\limits_{r=1}^{n}a_{ir}x_{kj}x_{kr}+x_{kj}\sum\limits_{r=1}^{n}a_{ir}x_{kr}
=2x_{kj}\sum\limits_{r=1}^{n}a_{ir}x_{kr}, (28)
\end{aligned}

where i,j\in[n], k=0,1,2,\ldots,N.
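The closed form above, 2 x_{kj} \sum_r a_{ir} x_{kr} = 2 x_{kj}(Ax_k)_i, can be sanity-checked against a central finite difference; a short sketch of our own with randomly chosen data:

```python
import numpy as np

def quad_form(A, x):
    """x^T A^T A x = ||A x||^2, the first term in (12)."""
    return float(x @ (A.T @ A) @ x)

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n))
x = rng.standard_normal(n)
i, j, eps = 2, 1, 1e-6

# Closed form: d/da_ij (x^T A^T A x) = 2 x_j * (A x)_i = 2 x_j * sum_r a_ir x_r
analytic = 2.0 * x[j] * (A @ x)[i]

# Central finite difference in the (i, j) entry of A
Ap, Am = A.copy(), A.copy()
Ap[i, j] += eps
Am[i, j] -= eps
numeric = (quad_form(Ap, x) - quad_form(Am, x)) / (2 * eps)
```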

In the following, we give the estimate of the variational equation for (10), which plays an important role in the analysis of the optimal principle. The result can be shown by induction. For the case k=0, from the fact that y_{1}=g(0,y_{0})=g(0,Ax_{0}), we have

\begin{aligned}
\frac{\partial y_{1}}{\partial a_{ij}}
&=\frac{\partial}{\partial a_{ij}}g(0,Ax_{0})=\frac{\partial y_{1}}{\partial y_{0}^{\rm T}}\frac{\partial y_{0}}{\partial a_{ij}}\\
&=\begin{pmatrix}\dfrac{\partial y_{11}}{\partial y_{01}}&\cdots&\dfrac{\partial y_{11}}{\partial y_{0n}}\\ \vdots&\ddots&\vdots\\ \dfrac{\partial y_{1n}}{\partial y_{01}}&\cdots&\dfrac{\partial y_{1n}}{\partial y_{0n}}\end{pmatrix}(0,\ldots,0,x_{0j},0,\ldots,0)^{\rm T}
=\begin{pmatrix}\dfrac{\partial y_{11}}{\partial y_{0i}}\\ \vdots\\ \dfrac{\partial y_{1n}}{\partial y_{0i}}\end{pmatrix}x_{0j}. (45)
\end{aligned}

When k=1, it follows from the iterative formula y_{2}=g(1,y_{1})=g(1,g(0,Ax_{0})) and (45) that

\begin{aligned}
\frac{\partial y_{2}}{\partial a_{ij}}
&=\frac{\partial}{\partial a_{ij}}g(1,g(0,Ax_{0}))=\frac{\partial y_{2}}{\partial y_{1}^{\rm T}}\frac{\partial y_{1}}{\partial a_{ij}}\\
&=\begin{pmatrix}\dfrac{\partial y_{21}}{\partial y_{11}}&\cdots&\dfrac{\partial y_{21}}{\partial y_{1n}}\\ \vdots&\ddots&\vdots\\ \dfrac{\partial y_{2n}}{\partial y_{11}}&\cdots&\dfrac{\partial y_{2n}}{\partial y_{1n}}\end{pmatrix}\begin{pmatrix}\dfrac{\partial y_{11}}{\partial y_{0i}}\\ \vdots\\ \dfrac{\partial y_{1n}}{\partial y_{0i}}\end{pmatrix}x_{0j}
=\begin{pmatrix}\sum\limits_{t=1}^{n}\dfrac{\partial y_{21}}{\partial y_{1t}}\dfrac{\partial y_{1t}}{\partial y_{0i}}\\ \vdots\\ \sum\limits_{t=1}^{n}\dfrac{\partial y_{2n}}{\partial y_{1t}}\dfrac{\partial y_{1t}}{\partial y_{0i}}\end{pmatrix}x_{0j}.
\end{aligned}

Let us make the induction hypothesis, assuming that the following expression is true for k1k-1, that is

\frac{\partial y_{k}}{\partial a_{ij}}=\frac{\partial}{\partial a_{ij}}g(k-1,y_{k-1})=\begin{pmatrix}\sum\limits_{t=1}^{n}\dfrac{\partial y_{k1}}{\partial y_{(k-1)t}}\dfrac{\partial y_{(k-1)t}}{\partial y_{(k-2)t}}\cdots\dfrac{\partial y_{1t}}{\partial y_{0i}}\\ \vdots\\ \sum\limits_{t=1}^{n}\dfrac{\partial y_{kn}}{\partial y_{(k-1)t}}\dfrac{\partial y_{(k-1)t}}{\partial y_{(k-2)t}}\cdots\dfrac{\partial y_{1t}}{\partial y_{0i}}\end{pmatrix}x_{0j}. (49)

We now show that it continues to hold for kk. By combining (10) with (49), we obtain

\begin{aligned}
\frac{\partial y_{k+1}}{\partial a_{ij}}
&=\frac{\partial}{\partial a_{ij}}g(k,y_{k})=\frac{\partial y_{k+1}}{\partial y_{k}^{\rm T}}\frac{\partial}{\partial a_{ij}}g(k-1,y_{k-1})\\
&=\begin{pmatrix}\dfrac{\partial y_{(k+1)1}}{\partial y_{k1}}&\cdots&\dfrac{\partial y_{(k+1)1}}{\partial y_{kn}}\\ \vdots&\ddots&\vdots\\ \dfrac{\partial y_{(k+1)n}}{\partial y_{k1}}&\cdots&\dfrac{\partial y_{(k+1)n}}{\partial y_{kn}}\end{pmatrix}\begin{pmatrix}\sum\limits_{t=1}^{n}\dfrac{\partial y_{k1}}{\partial y_{(k-1)t}}\dfrac{\partial y_{(k-1)t}}{\partial y_{(k-2)t}}\cdots\dfrac{\partial y_{1t}}{\partial y_{0i}}\\ \vdots\\ \sum\limits_{t=1}^{n}\dfrac{\partial y_{kn}}{\partial y_{(k-1)t}}\dfrac{\partial y_{(k-1)t}}{\partial y_{(k-2)t}}\cdots\dfrac{\partial y_{1t}}{\partial y_{0i}}\end{pmatrix}x_{0j}\\
&=\begin{pmatrix}\sum\limits_{t=1}^{n}\dfrac{\partial y_{(k+1)1}}{\partial y_{kt}}\dfrac{\partial y_{kt}}{\partial y_{(k-1)t}}\cdots\dfrac{\partial y_{1t}}{\partial y_{0i}}\\ \vdots\\ \sum\limits_{t=1}^{n}\dfrac{\partial y_{(k+1)n}}{\partial y_{kt}}\dfrac{\partial y_{kt}}{\partial y_{(k-1)t}}\cdots\dfrac{\partial y_{1t}}{\partial y_{0i}}\end{pmatrix}x_{0j}.
\end{aligned}

Then, if {yk}\{y_{k}\} is generated by (10), we deduce that

\frac{\partial y_{k+1}}{\partial a_{ij}}=\frac{\partial}{\partial a_{ij}}g(k,y_{k})=\begin{pmatrix}\sum\limits_{t=1}^{n}\dfrac{\partial y_{(k+1)1}}{\partial y_{kt}}\dfrac{\partial y_{kt}}{\partial y_{(k-1)t}}\cdots\dfrac{\partial y_{1t}}{\partial y_{0i}}\\ \vdots\\ \sum\limits_{t=1}^{n}\dfrac{\partial y_{(k+1)n}}{\partial y_{kt}}\dfrac{\partial y_{kt}}{\partial y_{(k-1)t}}\cdots\dfrac{\partial y_{1t}}{\partial y_{0i}}\end{pmatrix}x_{0j}. (50)
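The variational equation above is a chain rule: the derivative of y_{k+1} with respect to a_{ij} is the product of the step Jacobians applied to the initial perturbation e_i x_{0j}. A sketch of ours that checks this against finite differences for an illustrative smooth map g (the map and its Jacobian are our own assumptions, not from the paper):

```python
import numpy as np

def g(y):
    """An illustrative smooth map (our assumption, not from the paper)."""
    return np.tanh(y) + 0.1 * np.roll(y, 1)

def g_jac(y):
    """Jacobian of g: sech^2 on the diagonal plus the shifted coupling."""
    n = len(y)
    return np.diag(1.0 / np.cosh(y) ** 2) + 0.1 * np.roll(np.eye(n), 1, axis=0)

rng = np.random.default_rng(2)
n, K = 3, 5
A = rng.standard_normal((n, n))
x0 = rng.standard_normal(n)
i, j = 1, 2

def y_orbit(A):
    y = A @ x0                      # y_0 = A x_0, as in Definition 3.1
    ys = [y]
    for _ in range(K):
        y = g(y)
        ys.append(y)
    return ys

ys = y_orbit(A)

# Chain rule: d y_K / d a_ij = J(y_{K-1}) ... J(y_0) e_i * x0_j
prod = np.eye(n)
for y in ys[:-1]:
    prod = g_jac(y) @ prod
analytic = prod[:, i] * x0[j]

# Central finite difference in a_ij for comparison
eps = 1e-6
Ap, Am = A.copy(), A.copy()
Ap[i, j] += eps
Am[i, j] -= eps
numeric = (y_orbit(Ap)[-1] - y_orbit(Am)[-1]) / (2 * eps)
```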

For the second term of the right-hand side in (12), the chain rule gives

\frac{\partial}{\partial a_{ij}}(x_{k}^{\rm T}A^{\rm T}y_{k})=x_{k}^{\rm T}\left(\frac{\partial A^{\rm T}}{\partial a_{ij}}\right)y_{k}+x_{k}^{\rm T}A^{\rm T}\left(\frac{\partial y_{k}}{\partial a_{ij}}\right). (51)

It is obvious that

x_{k}^{\rm T}\left(\frac{\partial A^{\rm T}}{\partial a_{ij}}\right)y_{k}=(0,\ldots,0,x_{kj},0,\ldots,0)y_{k}=x_{kj}y_{ki}. (52)

Together with (50), this yields

\begin{aligned}
x_{k}^{\rm T}A^{\rm T}\left(\frac{\partial y_{k}}{\partial a_{ij}}\right)
&=x_{k}^{\rm T}A^{\rm T}\begin{pmatrix}\sum\limits_{t=1}^{n}\dfrac{\partial y_{k1}}{\partial y_{(k-1)t}}\dfrac{\partial y_{(k-1)t}}{\partial y_{(k-2)t}}\cdots\dfrac{\partial y_{1t}}{\partial y_{0i}}\\ \vdots\\ \sum\limits_{t=1}^{n}\dfrac{\partial y_{kn}}{\partial y_{(k-1)t}}\dfrac{\partial y_{(k-1)t}}{\partial y_{(k-2)t}}\cdots\dfrac{\partial y_{1t}}{\partial y_{0i}}\end{pmatrix}x_{0j} (56)\\
&=x_{k1}\left(a_{11}\sum\limits_{t=1}^{n}\dfrac{\partial y_{k1}}{\partial y_{(k-1)t}}\cdots\dfrac{\partial y_{1t}}{\partial y_{0i}}+\cdots+a_{n1}\sum\limits_{t=1}^{n}\dfrac{\partial y_{kn}}{\partial y_{(k-1)t}}\cdots\dfrac{\partial y_{1t}}{\partial y_{0i}}\right)x_{0j}\\
&\quad+\cdots+x_{kn}\left(a_{1n}\sum\limits_{t=1}^{n}\dfrac{\partial y_{k1}}{\partial y_{(k-1)t}}\cdots\dfrac{\partial y_{1t}}{\partial y_{0i}}+\cdots+a_{nn}\sum\limits_{t=1}^{n}\dfrac{\partial y_{kn}}{\partial y_{(k-1)t}}\cdots\dfrac{\partial y_{1t}}{\partial y_{0i}}\right)x_{0j}\\
&=x_{0j}\sum\limits_{r=1}^{n}\sum\limits_{s=1}^{n}\sum\limits_{t=1}^{n}x_{kr}a_{sr}\left(\dfrac{\partial y_{ks}}{\partial y_{(k-1)t}}\dfrac{\partial y_{(k-1)t}}{\partial y_{(k-2)t}}\cdots\dfrac{\partial y_{1t}}{\partial y_{0i}}\right). (61)
\end{aligned}

Using (52) and (56) in (51), we can easily get

\frac{\partial}{\partial a_{ij}}\left(x_{k}^{\rm T}A^{\rm T}y_{k}\right)=x_{kj}y_{ki}+x_{0j}\sum\limits_{r=1}^{n}\sum\limits_{s=1}^{n}\sum\limits_{t=1}^{n}x_{kr}a_{sr}\left(\dfrac{\partial y_{ks}}{\partial y_{(k-1)t}}\dfrac{\partial y_{(k-1)t}}{\partial y_{(k-2)t}}\cdots\dfrac{\partial y_{1t}}{\partial y_{0i}}\right), (62)

where i,j\in[n], k=0,1,2,\ldots,N.

For the last term of the right-hand side in (12), it is enough to compute that

\begin{aligned}
\frac{\partial}{\partial a_{ij}}(y_{k}^{\rm T}y_{k})
&=\left(\frac{\partial y_{k}^{\rm T}}{\partial a_{ij}}\right)y_{k}+y_{k}^{\rm T}\left(\frac{\partial y_{k}}{\partial a_{ij}}\right)\\
&=2x_{0j}\left(y_{k1}\sum\limits_{t=1}^{n}\dfrac{\partial y_{k1}}{\partial y_{(k-1)t}}\cdots\dfrac{\partial y_{1t}}{\partial y_{0i}}+\cdots+y_{kn}\sum\limits_{t=1}^{n}\dfrac{\partial y_{kn}}{\partial y_{(k-1)t}}\cdots\dfrac{\partial y_{1t}}{\partial y_{0i}}\right)\\
&=2x_{0j}\sum\limits_{s=1}^{n}\sum\limits_{t=1}^{n}y_{ks}\left(\dfrac{\partial y_{ks}}{\partial y_{(k-1)t}}\dfrac{\partial y_{(k-1)t}}{\partial y_{(k-2)t}}\cdots\dfrac{\partial y_{1t}}{\partial y_{0i}}\right), (67)
\end{aligned}

where i,j\in[n], k=0,1,2,\ldots,N.

In this way, one of the main results of this paper is built up by substituting (28), (62) and (67) into (12).

Proposition 3.2.

Let \{x_{k}\} and \{y_{k}\} be generated by (7) and (8), respectively. If the matrix A is an optimal solution of (11), then the following optimal principle holds via the first-order optimality condition:

\begin{aligned}
&x_{kj}\sum\limits_{r=1}^{n}a_{ir}x_{kr}-x_{kj}y_{ki}-x_{0j}\sum\limits_{r=1}^{n}\sum\limits_{s=1}^{n}\sum\limits_{t=1}^{n}x_{kr}a_{sr}\left(\dfrac{\partial y_{ks}}{\partial y_{(k-1)t}}\dfrac{\partial y_{(k-1)t}}{\partial y_{(k-2)t}}\cdots\dfrac{\partial y_{1t}}{\partial y_{0i}}\right)\\
&\quad+x_{0j}\sum\limits_{s=1}^{n}\sum\limits_{t=1}^{n}y_{ks}\left(\dfrac{\partial y_{ks}}{\partial y_{(k-1)t}}\dfrac{\partial y_{(k-1)t}}{\partial y_{(k-2)t}}\cdots\dfrac{\partial y_{1t}}{\partial y_{0i}}\right)=0, (68)
\end{aligned}

where x_{kj} denotes the jth component of the column vector x_{k} at step k, and the other similar notations have the same meanings.
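The content of this optimality principle, including the terms produced by the dependence of y_k on A through y_0 = A x_0, can be checked numerically in vectorized form: assemble dJ/da_ij from the variational equations and compare with a finite difference of the cost. A sketch of ours with illustrative maps f and g of our own choosing (not from the paper):

```python
import numpy as np

n, N = 3, 6
rng = np.random.default_rng(3)
A = rng.standard_normal((n, n))
x0 = rng.standard_normal(n)

def f(x):                        # illustrative map driving the x-orbit
    return np.cos(x) + 0.1

def g(y):                        # illustrative map driving the y-orbit
    return 0.9 * np.sin(y)

def g_jac(y):
    return 0.9 * np.diag(np.cos(y))

def orbits(A):
    xs, ys = [x0], [A @ x0]      # y_0 = A x_0
    for _ in range(N):
        xs.append(f(xs[-1]))
        ys.append(g(ys[-1]))
    return xs, ys

def J(A):
    """Cost (11); note that y_k depends on A through y_0 = A x_0."""
    xs, ys = orbits(A)
    return sum(float(np.sum((A @ x - y) ** 2)) for x, y in zip(xs, ys))

def grad_entry(A, i, j):
    """dJ/da_ij assembled from the variational equations, i.e. the
    optimality principle in vectorized form."""
    xs, ys = orbits(A)
    e_i = np.zeros(n)
    e_i[i] = 1.0
    D = e_i * x0[j]              # d y_0 / d a_ij
    total = 0.0
    for k, (x, y) in enumerate(zip(xs, ys)):
        if k > 0:
            D = g_jac(ys[k - 1]) @ D   # propagate d y_k / d a_ij
        r = A @ x - y
        total += 2.0 * float(r @ (e_i * x[j] - D))
    return total

# Central finite-difference check of the assembled gradient
eps = 1e-6
Ap, Am = A.copy(), A.copy()
Ap[1, 2] += eps
Am[1, 2] -= eps
numeric = (J(Ap) - J(Am)) / (2 * eps)
analytic = grad_entry(A, 1, 2)
```

At an optimal A, Proposition 3.2 says this assembled gradient vanishes for every (i, j).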

3.2 Homotopy Dynamical Systems

In light of the above analysis, a natural extension is a similarity study for more general and complex dynamical systems.

Consider hybrid systems formed from two chaotic attractors via the homotopy method, which show richer and more interesting dynamical behavior. The two simple systems (7) and (8) can be connected by constructing the homotopy

H(k,y_{k},\lambda)=(1-\lambda)f(k,y_{k})+\lambda g(k,y_{k}), (69)

where λ∈[0,1] is an embedding parameter. Note that the homotopy H is exactly a path connecting f and g such that H(k,y_k,0)=f(k,y_k) and H(k,y_k,1)=g(k,y_k). As the parameter λ increases from 0 to 1, the homotopy H varies continuously from one system to the other.
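The homotopy (69) can be made concrete by blending two vector fields; below is a minimal illustration of our own that mixes Euler steps of the Lorenz and Rössler fields, recovering each system at λ=0 and λ=1 (the step size and function names are assumptions):

```python
import numpy as np

def lorenz_rhs(v, sigma=10.0, b=8.0 / 3.0, r=28.0):
    x, y, z = v
    return np.array([-sigma * x + sigma * y, -x * z + r * x - y, x * y - b * z])

def rossler_rhs(v, a=0.2, b=0.2, c=5.7):
    x, y, z = v
    return np.array([-y - z, x + a * y, b + z * (x - c)])

def homotopy_step(v, lam, h=0.005):
    """One Euler step of the hybrid field
    H(v, lam) = (1 - lam) * lorenz + lam * rossler, in the spirit of (69)."""
    return v + h * ((1 - lam) * lorenz_rhs(v) + lam * rossler_rhs(v))
```

Intermediate values of λ produce the hybrid systems whose richer behavior is explored in the numerical simulations.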

Consider two dynamical systems of the following forms

x_{k+1}=H_{1}(k,x_{k},\lambda_{1}), (70)
y_{k+1}=H_{2}(k,y_{k},\lambda_{2}), (71)

where the homotopies (70) and (71) stand for general hybrid dynamical systems with the embedding parameters λ_1, λ_2 ∈ [0,1] changing per iteration. The solutions of (70) and (71) are denoted by

x_{k}=H_{1}(k,x_{0},\lambda_{1}), (72)
y_{k}=H_{2}(k,y_{0},\lambda_{2}), (73)

respectively. At this point, the similarity transformation matrix, becoming related to the parameter \lambda=(\lambda_{1},\lambda_{2})^{\rm T}, is re-expressed as A=A(\lambda). Let the cost functional be written in the form

J(A,\lambda)=\sum\limits_{k=0}^{N}\|Ax_{k}-y_{k}\|_{2}^{2}. (74)

According to the Karush-Kuhn-Tucker (KKT for short) optimality conditions, which are often checked to investigate whether a solution of a nonlinear programming problem is optimal, (A,\lambda) is a stationary point of (74) if and only if

\frac{\partial J(A,\lambda)}{\partial a_{ij}}=\frac{\partial}{\partial a_{ij}}(x_{k}^{\rm T}A^{\rm T}Ax_{k})-\frac{\partial}{\partial a_{ij}}(2x_{k}^{\rm T}A^{\rm T}y_{k})+\frac{\partial}{\partial a_{ij}}(y_{k}^{\rm T}y_{k})=0, (75)

and

\frac{\partial J(A,\lambda)}{\partial\lambda}=\frac{\partial}{\partial\lambda}(x_{k}^{\rm T}A^{\rm T}Ax_{k})-\frac{\partial}{\partial\lambda}(2x_{k}^{\rm T}A^{\rm T}y_{k})+\frac{\partial}{\partial\lambda}(y_{k}^{\rm T}y_{k})=0, (76)

simultaneously. The derivation of (75) is the same as that of (12), and hence the details are omitted here.

For the first term of the right-hand side in (76), by direct calculation, we get

\begin{aligned}
\frac{\partial}{\partial\lambda}(x_{k}^{\rm T}A^{\rm T}Ax_{k})
&=\left(\frac{\partial x_{k}^{\rm T}}{\partial\lambda}\right)A^{\rm T}Ax_{k}+x_{k}^{\rm T}A^{\rm T}A\left(\frac{\partial x_{k}}{\partial\lambda}\right)\\
&=\left(\dfrac{\partial x_{k1}}{\partial\lambda},\cdots,\dfrac{\partial x_{kn}}{\partial\lambda}\right)A^{\rm T}Ax_{k}+x_{k}^{\rm T}A^{\rm T}A\left(\dfrac{\partial x_{k1}}{\partial\lambda},\cdots,\dfrac{\partial x_{kn}}{\partial\lambda}\right)^{\rm T}\\
&=2\sum\limits_{s=1}^{n}a_{1s}x_{ks}\left(a_{11}\dfrac{\partial x_{k1}}{\partial\lambda}+\cdots+a_{1n}\dfrac{\partial x_{kn}}{\partial\lambda}\right)+\cdots
+2\sum\limits_{s=1}^{n}a_{ns}x_{ks}\left(a_{n1}\dfrac{\partial x_{k1}}{\partial\lambda}+\cdots+a_{nn}\dfrac{\partial x_{kn}}{\partial\lambda}\right). (89)
\end{aligned}

For the second term of the right-hand side in (76), it follows from the chain rule that

\dfrac{\partial}{\partial\lambda}(x_{k}^{\rm T}A^{\rm T}y_{k})=\left(\dfrac{\partial}{\partial\lambda}x_{k}^{\rm T}\right)A^{\rm T}y_{k}+x_{k}^{\rm T}A^{\rm T}\left(\dfrac{\partial}{\partial\lambda}y_{k}\right). \qquad (90)

On the one hand,

\begin{aligned}
\left(\dfrac{\partial}{\partial\lambda}x_{k}^{\rm T}\right)A^{\rm T}y_{k}=&\left(\dfrac{\partial x_{k1}}{\partial\lambda},\cdots,\dfrac{\partial x_{kn}}{\partial\lambda}\right)A^{\rm T}y_{k}\\
=&\left(a_{11}\dfrac{\partial x_{k1}}{\partial\lambda}+\cdots+a_{1n}\dfrac{\partial x_{kn}}{\partial\lambda}\right)y_{k1}+\cdots+\left(a_{n1}\dfrac{\partial x_{k1}}{\partial\lambda}+\cdots+a_{nn}\dfrac{\partial x_{kn}}{\partial\lambda}\right)y_{kn}\\
=&\sum\limits_{t=1}^{n}a_{t1}y_{kt}\dfrac{\partial x_{k1}}{\partial\lambda}+\cdots+\sum\limits_{t=1}^{n}a_{tn}y_{kt}\dfrac{\partial x_{kn}}{\partial\lambda}.
\end{aligned} \qquad (95)

On the other hand,

\begin{aligned}
x_{k}^{\rm T}A^{\rm T}\left(\dfrac{\partial}{\partial\lambda}y_{k}\right)=&\left(\sum\limits_{s=1}^{n}a_{1s}x_{ks},\cdots,\sum\limits_{s=1}^{n}a_{ns}x_{ks}\right)\left(\dfrac{\partial y_{k1}}{\partial\lambda},\cdots,\dfrac{\partial y_{kn}}{\partial\lambda}\right)^{\rm T}\\
=&\sum\limits_{s=1}^{n}a_{1s}x_{ks}\dfrac{\partial y_{k1}}{\partial\lambda}+\cdots+\sum\limits_{s=1}^{n}a_{ns}x_{ks}\dfrac{\partial y_{kn}}{\partial\lambda}.
\end{aligned} \qquad (99)

Combining (3.2) and (99) with (90), we obtain

\dfrac{\partial}{\partial\lambda}(x_{k}^{\rm T}A^{\rm T}y_{k})=\sum\limits_{t=1}^{n}a_{t1}y_{kt}\dfrac{\partial x_{k1}}{\partial\lambda}+\cdots+\sum\limits_{t=1}^{n}a_{tn}y_{kt}\dfrac{\partial x_{kn}}{\partial\lambda}+\sum\limits_{s=1}^{n}a_{1s}x_{ks}\dfrac{\partial y_{k1}}{\partial\lambda}+\cdots+\sum\limits_{s=1}^{n}a_{ns}x_{ks}\dfrac{\partial y_{kn}}{\partial\lambda}. \qquad (105)

For the last term of the right-hand side in (76), it is obvious that

\begin{aligned}
\dfrac{\partial}{\partial\lambda}(y_{k}^{\rm T}y_{k})=&\left(\dfrac{\partial}{\partial\lambda}y_{k}^{\rm T}\right)y_{k}+y_{k}^{\rm T}\left(\dfrac{\partial}{\partial\lambda}y_{k}\right)\\
=&\,2y_{k1}\dfrac{\partial y_{k1}}{\partial\lambda}+\cdots+2y_{kn}\dfrac{\partial y_{kn}}{\partial\lambda}.
\end{aligned} \qquad (113)

According to all the derivations above, we deduce the other main result of this paper, which characterizes how similar the orbits of two general dynamical systems are. Substituting (3.2), (105) and (3.2) into (76) gives the existence of the solution of (74).

Proposition 3.3.

Let {x_k} and {y_k} be generated by (70) and (71), respectively. If the matrix A=A(λ) and the parameter λ are the optimal solutions of (74), then the following general optimal principle holds:

\begin{aligned}
&x_{kj}\sum\limits_{r=1}^{n}\bar{a}_{ir}x_{kr}-x_{kj}y_{ki}-x_{0j}\sum\limits_{r=1}^{n}\sum\limits_{s=1}^{n}\sum\limits_{t=1}^{n}x_{kr}\bar{a}_{sr}\left(\dfrac{\partial y_{ks}}{\partial y_{(k-1)t}}\dfrac{\partial y_{(k-1)t}}{\partial y_{(k-2)t}}\cdots\dfrac{\partial y_{1t}}{\partial y_{0i}}\right)\\
&\quad+x_{0j}\sum\limits_{s=1}^{n}\sum\limits_{t=1}^{n}y_{ks}\left(\dfrac{\partial y_{ks}}{\partial y_{(k-1)t}}\dfrac{\partial y_{(k-1)t}}{\partial y_{(k-2)t}}\cdots\dfrac{\partial y_{1t}}{\partial y_{0i}}\right)=0,
\end{aligned} \qquad (114)

and

\begin{aligned}
&\left(\sum\limits_{r=1}^{n}\sum\limits_{s=1}^{n}\bar{a}_{r1}\bar{a}_{rs}x_{ks}-\sum\limits_{t=1}^{n}\bar{a}_{t1}y_{kt}\right)\dfrac{\partial x_{k1}}{\partial\lambda}+\cdots+\left(\sum\limits_{r=1}^{n}\sum\limits_{s=1}^{n}\bar{a}_{rn}\bar{a}_{rs}x_{ks}-\sum\limits_{t=1}^{n}\bar{a}_{tn}y_{kt}\right)\dfrac{\partial x_{kn}}{\partial\lambda}\\
&\quad-\left(\sum\limits_{s=1}^{n}\bar{a}_{1s}x_{ks}-y_{k1}\right)\dfrac{\partial y_{k1}}{\partial\lambda}-\cdots-\left(\sum\limits_{s=1}^{n}\bar{a}_{ns}x_{ks}-y_{kn}\right)\dfrac{\partial y_{kn}}{\partial\lambda}=0,
\end{aligned} \qquad (115)

where the similarity transformation matrix is related to the parameter λ=(λ_1,λ_2)^T, \bar{a}_{ij} stands for an element of A=A(λ), and the other components have the same meanings as before.

Remark 3.4.

Although (3.3) is formally consistent with (3.2), each component of the similarity transformation matrix A is in fact related to the parameter λ.

Remark 3.5.

Existing literature on the variation of a functional with respect to a matrix, rather than a vector or even a scalar, is rare; yet this is exactly what is considered in (11) and (74). Taking the variation with respect to a matrix can be reduced to the partial derivative with respect to each matrix element, which leads to tedious calculations, especially when the number of iterations is large.
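The elementwise reduction described in this remark can be checked numerically: the matrix gradient of a single residual term ‖Ax_k−y_k‖² is the outer product 2(Ax_k−y_k)x_k^T, and it should agree with entrywise finite differences. A minimal Python sketch (our own illustration, not the paper's code):

```python
import numpy as np

def grad_A(A, x, y):
    # analytic gradient of ||A x - y||^2 in the matrix A:
    # d/d a_ij = 2 (A x - y)_i x_j, i.e. the outer product 2 (Ax - y) x^T
    return 2.0 * np.outer(A @ x - y, x)

def grad_A_fd(A, x, y, h=1e-6):
    # the elementwise partial derivatives mentioned in the remark,
    # approximated by central finite differences
    f = lambda M: float(np.sum((M @ x - y) ** 2))
    G = np.zeros_like(A)
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            Ap, Am = A.copy(), A.copy()
            Ap[i, j] += h
            Am[i, j] -= h
            G[i, j] = (f(Ap) - f(Am)) / (2.0 * h)
    return G

rng = np.random.default_rng(1)
A = rng.normal(size=(3, 3))
x = rng.normal(size=3)
y = rng.normal(size=3)
```

The two gradients coincide up to discretization error, confirming the elementwise treatment of the matrix variation.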

In most cases, it is almost impossible to find a similarity transformation matrix that makes the solutions of two discrete dynamical systems exactly similar. In view of the properties of the function

h(\omega)=\dfrac{\log(1+\omega)}{\omega},\qquad\omega\in\mathbb{R},

we define a similarity function by setting \omega=\dfrac{1}{N}\sum\limits_{k=1}^{N}\|Ax_{k}-y_{k}\|_{2}^{2}. When no similarity transformation matrix makes the two orbits completely similar, we still ask to what extent they are similar.

Definition 3.6.

The similarity degree ρ(A)\rho(A) of solutions between two discrete dynamical systems is defined as

\rho(A)=\begin{cases}1,&\text{if }\omega=0,\\[2mm]\dfrac{\log(1+\omega)}{\omega},&\text{otherwise}.\end{cases} \qquad (116)

It is easy to see from the above definition that ρ(A)∈(0,1].

We close this section by noting that the solutions of the two systems are said to be completely similar if ρ(A)=1; otherwise they are partially similar.
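Definition 3.6 translates directly into code. The following Python sketch (our own naming, using `log1p` for numerical accuracy at small ω) computes ρ(A) for two sampled orbits:

```python
import numpy as np

def similarity_degree(A, xs, ys):
    """Similarity degree rho(A) of Definition 3.6.

    xs, ys : (N, n) arrays of orbit points x_k, y_k, k = 1..N.
    omega is the mean squared residual (1/N) sum_k ||A x_k - y_k||^2.
    """
    residual = np.asarray(xs) @ A.T - np.asarray(ys)
    omega = np.mean(np.sum(residual ** 2, axis=1))
    if omega == 0.0:
        return 1.0                      # completely similar orbits
    return np.log1p(omega) / omega      # log(1 + omega)/omega in (0, 1)

A = np.eye(2)
xs = np.array([[1.0, 2.0], [3.0, 4.0]])
```

For identical orbits the degree is exactly 1; any residual pushes it strictly into (0, 1).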

4 Experimental results

In this section, some examples are given to show the utility of the general optimal principle proposed in this paper. All codes are written in MATLAB R2021a and run on a PC with a 1.80 GHz CPU and 8.00 GB of RAM. Unless otherwise specified, the numerical results are accurate to four decimal places throughout this paper.

To cope with over-fitting, L2-norm regularization is introduced naturally; over-fitting arises frequently in reinforcement learning and neural networks when samples are limited and computation is expensive. The cost functional in (11) or (74) is augmented with an L2-norm penalty on the matrix A, taking the form

\tilde{J}(A)=\min\limits_{A}\sum\limits_{k=0}^{N}\|Ax_{k}-y_{k}\|_{2}^{2}+\tau\|A\|_{2}^{2}, \qquad (117)

where τ is a positive constant, called the regularization parameter, that balances the two objective terms.
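Reading the penalty ‖A‖² in (117) as a Frobenius norm (an assumption on our part; the paper writes it as a 2-norm), the regularized problem is a ridge regression with closed-form normal equations A(XX^T+τI)=YX^T. A hedged Python sketch with synthetic data:

```python
import numpy as np

def fit_similarity_matrix_ridge(X, Y, tau):
    """Regularized fit min_A sum_k ||A x_k - y_k||^2 + tau ||A||_F^2.

    Treating the L2 penalty in (117) as a Frobenius norm (our
    assumption), setting the gradient in A to zero yields the
    normal equations A (X X^T + tau I) = Y X^T.
    """
    n = X.shape[0]
    M = X @ X.T + tau * np.eye(n)        # symmetric positive definite
    return np.linalg.solve(M, (Y @ X.T).T).T

rng = np.random.default_rng(2)
B = np.array([[0.0, -1.0],
              [1.0,  0.0]])              # hypothetical ground truth
X = rng.normal(size=(2, 200))
Y = B @ X + 0.01 * rng.normal(size=(2, 200))
A = fit_similarity_matrix_ridge(X, Y, tau=1e-4)
```

Increasing τ shrinks the fitted matrix toward zero, which is the stabilizing effect exploited in the experiments below.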

In optimal control theory, the two most fundamental optimization methods are Pontryagin's maximum principle and Bellman's dynamic programming. Taking into account the proposed optimal principles, we conduct numerical simulations following Pontryagin's maximum principle and Bellman's dynamic programming, respectively, for similarity of orbits between various chaotic systems.

The chaotic systems chosen in this section are all solved numerically by the widely used fourth-order Runge-Kutta method with time step size equal to 0.01. The following numerical experiments comprise two parts.
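The fixed-step fourth-order Runge-Kutta scheme used throughout can be sketched as follows; the Lorenz parameter values σ=10, r=28, b=8/3 are the classical choices and are assumed here rather than quoted from the paper's equation (1):

```python
import numpy as np

def rk4_orbit(f, x0, h=0.01, steps=2000):
    """Classical fourth-order Runge-Kutta with fixed step h,
    returning the discrete orbit x_0, x_1, ..., x_steps."""
    orbit = [np.asarray(x0, dtype=float)]
    for _ in range(steps):
        x = orbit[-1]
        k1 = f(x)
        k2 = f(x + 0.5 * h * k1)
        k3 = f(x + 0.5 * h * k2)
        k4 = f(x + h * k3)
        orbit.append(x + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4))
    return np.array(orbit)

def lorenz(x, sigma=10.0, r=28.0, b=8.0 / 3.0):
    # classical Lorenz vector field; parameter values are the usual
    # textbook choices, assumed rather than quoted from the paper
    return np.array([sigma * (x[1] - x[0]),
                     x[0] * (r - x[2]) - x[1],
                     x[0] * x[1] - b * x[2]])

xs = rk4_orbit(lorenz, [0.1, 0.1, 0.1], h=0.01, steps=2000)
```

This produces the discrete orbits {x_k} used as input to the similarity fits in the examples below.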

4.1 Pontryagin’s maximum principle

As a necessary condition for solving optimal control problems, Pontryagin's maximum principle was proposed by Pontryagin and his group in the 1960s [26]. Its outstanding feature is that the optimal control signal transferring a dynamical system from one state to another can be found under the condition that the state or input is fixed. The following examples verify Pontryagin's maximum principle, formulated in terms of the proposed optimal principle (3.2), when studying the similarity of orbits between two chaotic attractors.

Example 4.1. Similarity of orbits between Lorenz attractor and Chua’s circuit.

Let {x_k} and {y_k} be the numerical solutions derived from the Lorenz and Chua systems for 2000 time steps (namely, N=2000) from the same initial states

x_{0}=y_{0}=(0.1,0.1,0.1)^{\rm T}.

We divide the sequences into multi-stage decisions of 10 steps each (denoted by N_1-N_200), with the final state of the previous stage serving as the initial condition of the current stage. The optimal similarity transformation matrix of each stage is found by the optimal principle (3.2), whose accuracy is assessed by the similarity degree (116).
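The staged procedure just described can be sketched as follows: each 10-step block receives its own least-squares similarity transformation matrix, scored by (116). The orbit data below are synthetic stand-ins for the Lorenz and Chua trajectories, used only to make the sketch self-contained:

```python
import numpy as np

def staged_fit(xs, ys, stage_len=10):
    """Fit one similarity transformation matrix per stage of
    stage_len consecutive points, scoring each stage with the
    similarity degree rho = log(1 + omega)/omega from (116)."""
    degrees = []
    for s in range(len(xs) // stage_len):
        X = xs[s * stage_len:(s + 1) * stage_len]   # (stage_len, n)
        Y = ys[s * stage_len:(s + 1) * stage_len]
        At, *_ = np.linalg.lstsq(X, Y, rcond=None)  # X @ At ~ Y
        omega = np.mean(np.sum((X @ At - Y) ** 2, axis=1))
        degrees.append(1.0 if omega == 0.0 else np.log1p(omega) / omega)
    return degrees

# synthetic stand-ins for the two orbits of this example
rng = np.random.default_rng(3)
xs = rng.normal(size=(2000, 3))
B = np.array([[1.0, 0.5, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, -1.0]])
ys = xs @ B.T
degs = staged_fit(xs, ys)
```

With 2000 points and 10 steps per stage this yields 200 stage-wise similarity degrees, matching the N_1-N_200 decomposition above.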

Figs. 6-7 show three dimensional stereograms and two dimensional plans of the Lorenz attractor, Chua's circuit, and the trajectory acted on by the optimal similarity transformation matrix for each stage. The trajectories marked by dotted boxes are enlarged so that the similarity of orbits between {Ax_k} and {y_k} can be seen more clearly. According to Figs. 6a and 7a, the Lorenz attractor and Chua's circuit, with different orbits, become largely similar after the optimal similarity transformation matrix is employed. As can be seen in Figs. 6b and 7b, if the regularization parameter τ is set to 10^{-4}, the orbit {x_k} under the action of the optimal similarity transformation matrix is surprisingly close to the orbit {y_k}, supporting the effectiveness of parameter tuning and the stability of L2-norm regularization.

Figure 6: Three dimensional stereograms of Example 4.1. (a) Without penalty. (b) Based on L2-norm penalty.
Figure 7: Two dimensional plans of Example 4.1. (a) Without penalty. (b) Based on L2-norm penalty.

Results for the similarity degree are shown in Fig. 8. From Fig. 8a we see that the similarity degree exceeds 0.9 in the majority of results. When the L2-norm penalty is introduced, only 4 results fall below 0.9. This implies that the stability of the solution is improved by introducing L2-norm regularization with a suitable regularization parameter.

Figure 8: Similarity degree of Example 4.1. (a) Without penalty. (b) Based on L2-norm penalty.

Example 4.2. Similarity of orbits between Lorenz attractor and Rössler attractor.

Let {x_k} and {y_k} be the numerical solutions of the Lorenz and Rössler systems, sharing the same initial states x_0=y_0=(0.1,0.1,0.1)^T. We perform the same experimental procedure as in Example 4.1.

Figs. 9 and 10 show the stereograms and plans of the Lorenz attractor, the Rössler attractor, and the trajectory acted on by the optimal similarity transformation matrix for each stage when N=2000. Even when the overlapped trajectories marked by dotted boxes are enlarged, we can still observe that the orbits of the Rössler system almost coincide with those of the Lorenz system acted on by the optimal similarity transformation matrix.

Figure 9: Three dimensional stereograms of Example 4.2. (a) Without penalty. (b) Based on L2-norm penalty.
Figure 10: Two dimensional plans of Example 4.2. (a) Without penalty. (b) Based on L2-norm penalty.

The Lorenz attractor, with its butterfly shape, and the Rössler attractor, with its spiral shape, appear geometrically different, yet they become strikingly similar within the proper precision under the action of Pontryagin's maximum principle using optimal principle (3.2), which is a remarkable finding.

Surprisingly, the numerical results in Fig. 11 show that only six results are less than 0.95 without the regularization term. It should be pointed out that when we search for the optimal similarity transformation matrix with the L2-norm penalty (117), all values of the similarity degree are greater than 0.98.

Figure 11: Similarity degree of Example 4.2. (a) Without penalty. (b) Based on L2-norm penalty.

To summarize, even without L2-norm regularization, we can still obtain very satisfactory results.

4.2 Bellman’s dynamic programming

Dynamic programming was developed by Bellman to deal with situations where decisions are made in stages [27]. Whatever the initial state and initial decision are, the remaining decisions must constitute an optimal policy for the subproblem that starts from the state resulting from the first decision. Applying Bellman's dynamic programming in terms of the proposed optimal principle, we analyze the similarity of orbits between two chaotic attractors in the following three examples.

Example 4.3. Similarity of orbits between Lorenz attractor and Chen attractor.

Let {x_k} and {y_k} be the numerical solutions obtained from the Lorenz and Chen systems for 2000 time steps. Instead of solving all 2000 steps at once, we consider a multi-stage decision-making process, breaking the complex problem into 200 simple subproblems, denoted N_1-N_200, with 10 steps each.

More precisely, for stage N_1, the initial conditions of the two systems are set to

x_{0}=(0.1,0.1,0.1)^{\rm T}\quad\mbox{and}\quad y_{0}=Ax_{0}

with A unknown. Making use of the Runge-Kutta method, we calculate the states of this stage, namely the values of x_1-x_10 and y_1-y_10, whose components are expressions containing the elements of A. By solving the system of nonlinear equations formulated by (3.2), we obtain an approximate solution of high precision.

For the following stages, we do the same and compare the current similarity transformation matrix \bar{A} with the one obtained at the previous stage, denoted \bar{\bar{A}}, with respect to the similarity degree defined in (116); we then let

A=\mathop{\arg\max}\limits_{M\in\{\bar{A},\,\bar{\bar{A}}\}}\rho(M)

be the optimal similarity transformation matrix of this stage. We obtain an approximate solution which makes the similarity degree reach 1.0000 for each stage.
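The stage decision rule above, keeping whichever of \bar{A} and \bar{\bar{A}} attains the larger similarity degree on the current stage's data, can be sketched in a few lines (our own illustration with hypothetical stage data):

```python
import numpy as np

def stage_degree(A, X, Y):
    # rho(A) over one stage; X, Y are (n, m) arrays of stage data
    omega = np.mean(np.sum((A @ X - Y) ** 2, axis=0))
    return 1.0 if omega == 0.0 else np.log1p(omega) / omega

def pick_stage_matrix(A_cur, A_prev, X, Y):
    """Keep whichever of the current fit A_cur and the previous
    stage's matrix A_prev attains the larger similarity degree
    on this stage's data, as in the rule above."""
    return max((A_cur, A_prev), key=lambda A: stage_degree(A, X, Y))

# illustration: the current fit matches this stage exactly,
# so it beats the carried-over identity matrix
rng = np.random.default_rng(4)
X = rng.normal(size=(3, 10))
B = np.diag([1.0, 2.0, 3.0])
Y = B @ X
chosen = pick_stage_matrix(B, np.eye(3), X, Y)
```

Because only the better of the two candidates is carried forward, the stage-wise similarity degree is non-decreasing, in the spirit of Bellman's principle of optimality.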

As shown in Figs. 12-14, the Lorenz and Chen systems with different orbits become distinctly similar through the adjustment of the similarity transformation matrix derived by the proposed optimal principle.

Figure 12: Two dimensional plans of Example 4.3, panels (a)-(c).

Now we elaborate the advantage of multi-stage dynamic programming with the help of numerical results. For N=2000, the optimal similarity transformation matrix A_0 can be found according to optimal principle (3.2), which makes the similarity degree reach 0.988837. When only the optimal similarity transformation matrix at stage N_1 is adopted and A_0 is still employed in the other stages, the similarity degree increases to 0.988839. If we adopt the corresponding optimal similarity transformation matrices in both N_1 and N_2, the similarity degree rises to 0.9888435, and so on. The final similarity degree of solutions between the Lorenz attractor and the Chen attractor in this example reaches 0.999995, and each stage is then optimal, which meets Bellman's principle of optimality. To get a better view of the change in similarity degree, the numerical results of each stage are depicted in Fig. 14. We observe that as the number of stages equipped with their own optimal similarity transformation matrix increases, the similarity degree increases progressively.

Figure 13: Three dimensional stereograms of Example 4.3.
Figure 14: Change in similarity degree of Example 4.3.

Example 4.4. Similarity of orbits between Lorenz attractor and Lü attractor.

Let {x_k} and {y_k} be the numerical solutions obtained from the Lorenz system and the Lü system with u=0 for 2000 time steps. As in the multi-stage decision of Example 4.3, we divide the steps into 200 stages. For each stage, only the initial state of the sequence {x_k} is known. The values of the similarity degree reach 1.0000 for all stages, implying the effectiveness of the optimal similarity transformation matrix.

We are struck by the effectiveness of the similarity transformation matrix formulated by the proposed optimal principle, as shown in Figs. 15-17. Even when the trajectories in the dotted boxes are enlarged to a coordinate diagram with small horizontal and vertical ranges, the orbits of the Lü attractor and of the Lorenz attractor acted on by the optimal similarity transformation matrix still coincide almost exactly. The similarity degree gradually increases from 0.999609 to 0.999994 as the number of optimal similarity transformation matrices grows, satisfying Bellman's principle of optimality; see Fig. 17.

Figure 15: Two dimensional plans of Example 4.4, panels (a)-(c).
Figure 16: Three dimensional stereograms of Example 4.4.
Figure 17: Change in similarity degree of Example 4.4.

Example 4.5. Similarity of orbits between Hybrid attractor and Lü attractor.

The last example concerns the hybrid Lorenz-Chua chaotic system formed from the Lorenz attractor and Chua's circuit using the homotopy approach (69), modelled below:

\begin{array}{l}
\dot{x}=\lambda(-\sigma x+\sigma y)+(1-\lambda)\,\alpha[y-x-f(x)],\\
\dot{y}=\lambda(-xz+rx-y)+(1-\lambda)(x-y+z),\\
\dot{z}=\lambda(xy-bz)+(1-\lambda)(-\beta y),
\end{array} \qquad (118)

where the piecewise-linear function f(x) is defined in (3), with the same parameters as in (1)-(3). The study of the hybrid attractor is more challenging due to its more complex topology and dynamics.
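The homotopy right-hand side (118) can be sketched as below; the Chua slopes m0, m1 and the parameter values σ, r, b, α, β are common textbook choices, assumed here rather than taken from the paper's (1)-(3):

```python
import numpy as np

def chua_f(x, m0=-1.143, m1=-0.714):
    # Chua piecewise-linear diode characteristic; the slopes m0, m1
    # are common illustrative values, not necessarily those of (3)
    return m1 * x + 0.5 * (m0 - m1) * (abs(x + 1.0) - abs(x - 1.0))

def hybrid_rhs(v, lam, sigma=10.0, r=28.0, b=8.0 / 3.0,
               alpha=9.0, beta=14.286):
    """Homotopy blend (118): lam = 1 recovers the Lorenz field,
    lam = 0 recovers Chua's circuit. All parameter values here
    are standard textbook choices, assumed rather than quoted."""
    x, y, z = v
    lorenz = np.array([sigma * (y - x),
                       -x * z + r * x - y,
                       x * y - b * z])
    chua = np.array([alpha * (y - x - chua_f(x)),
                     x - y + z,
                     -beta * y])
    return lam * lorenz + (1.0 - lam) * chua

v = np.array([0.1, 0.1, 0.1])
```

The blend is linear in λ, so intermediate values of λ interpolate between the two vector fields pointwise.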

Different dynamical behaviors of the Lü attractor's controlled system (6) can be generated by varying the parameter u. For the parameters u=-1, u=8, u=-12 and u=12, which produce the complete attractor, partial attractor, left-attractor and right-attractor, respectively, we simulate the similarity between orbits of the hybrid Lorenz-Chua system and the Lü attractor.

Figure 18: Two dimensional plans of Example 4.5. (a) u=-1. (b) u=8. (c) u=12. (d) u=-12.

Let {x_k} and {y_k} be the numerical solutions obtained from the hybrid system (118) and the Lü attractor for 1000 time steps. Breaking the problem into 200 simple subproblems with 5 steps each, the approximate solutions of the parameter λ and the optimal similarity transformation matrix are found by (114) and (115).

We show the numerical performance in Fig. 18 for the four different values of u. To demonstrate the simulated effect of the optimal principle more intuitively, Fig. 19 illustrates the orbit of the Lü system (actual) and that of the hybrid Lorenz-Chua attractor acted on by the optimal similarity transformation matrix (simulated). Although the Lü system exhibits various dynamical behaviors as u varies, the two sequences almost completely coincide, showing the applicability and universality of the proposed optimal principle.

For the sake of completeness, the change in similarity degree for the four cases of Example 4.5 is shown in Fig. 20, which also fulfills Bellman's principle of optimality.

Figure 19: Comparison of actual and simulated sequences of Example 4.5. (a) u=-1. (b) u=8. (c) u=12. (d) u=-12.
Figure 20: Change in similarity degree of Example 4.5.

Chaotic systems are generally characterized by complex behavior and rapidly changing solutions, yet their orbits become quite similar under the general optimal principle presented in this paper.

5 Conclusions

In scientific research, capturing underlying similarity between two complex physical processes is one of the most essential problems. The critical challenge in finding similarity of orbits between dynamical systems arises from the high sensitivity of chaotic systems to small perturbations of the initial states. The main contribution, in addition to proposing concepts that describe to what extent the orbits of two markedly different systems are similar, is the general optimal principle built up from the viewpoint of dynamical systems together with optimization theory. This optimal principle is applied to various well-known chaotic attractors and a hybrid chaotic system constructed on the basis of the homotopy idea, yielding surprisingly encouraging numerical simulation results.

Specifically, attention is paid to finding similarity of orbits between dynamical systems with complex behavior, mathematically. As necessary foundations for this paper, the definitions of similarity transformation matrix and similarity degree are introduced. We present a general optimal principle based on optimality conditions and the variational method, uncovering underlying similarity between the orbits of two dynamical systems. The numerical simulations concern both Pontryagin's maximum principle and Bellman's dynamic programming, formulated in terms of the optimal principle, for similarity of orbits between various chaotic systems. Orbits that differed markedly become remarkably similar under the action of the optimal similarity transformation matrix, and the values of the similarity degree support this, demonstrating the significance of the optimal principle proposed in this paper.

Acknowledgments

This work was supported by National Basic Research Program of China (2013CB834100), National Natural Science Foundation of China (11571065, 11171132, 12071175), Project of Science and Technology Development of Jilin Province (2017C028-1, 20190201302JC), and Natural Science Foundation of Jilin Province (20200201253JC).

References

  • [1] Z. J. Wang and Y. Li, A mathematical analysis of scale similarity, Commun. Comput. Phys., 21 (2017), 149-161.
  • [2] Y. Y. Chen, P. Yu, W. W. Chen, Z. W. Zheng and M. Y. Guo, Embedding-based similarity computation for massive vehicle trajectory data, IEEE Internet Things, 9 (2022), 4650-4660.
  • [3] C. L. Dai, J. Wu, D. C. Pi, S. I. Becker, L. Cui, Q. Zhang and B. Johnson, Brain EEG time-series clustering using maximum-weight clique, IEEE Trans. Cybern., 52 (2022), 357-371.
  • [4] Z. Karevan and J. A. K. Suykens, Transductive LSTM for time-series prediction: An application to weather forecasting, Neural Networks, 125 (2020), 1-9.
  • [5] A. Zarghami, M. J. Maghrebi, J. Ghasemi and S. Ubertini, Lattice Boltzmann finite volume formulation with improved stability, Commun. Comput. Phys., 12 (2012), 42-64.
  • [6] H. Arbel, S. Basu, W. W. Fisher, A. S. Hammonds, K. H. Wan and S. Park, Exploiting regulatory heterogeneity to systematically identify enhancers with high accuracy, Proc. Natl. Acad. Sci. USA, 116 (2018), 900-908.
  • [7] A. Mang and L. Ruthotto, A Lagrangian Gauss-Newton-Krylov solver for mass and intensity preserving diffeomorphic image registration, SIAM J. Sci. Comput., 39 (2017), B860-B885.
  • [8] B. Wang, J. J. Zhu, E. Pierson, D. Ramazzotti and S. Batzoglou, Visualization and analysis of single-cell RNA-seq data by kernel-based similarity learning, Nat. Methods, 14 (2017), 414-416.
  • [9] H. C. Liao, Z. S. Xu and X. J. Zeng, Distance and similarity measures for hesitant fuzzy linguistic term sets and their application in multi-criteria decision making, Inf. Sci., 271 (2014), 125-142.
  • [10] H. F. Liu, Z. Hu, A. Mian, H. Tian and X. Z. Zhu, A new user similarity model to improve the accuracy of collaborative filtering, Knowl. Based Syst., 56 (2014), 156-166.
  • [11] Z. Zhao, L. Wang, H. Liu and J. P. Ye, On similarity preserving feature selection, IEEE Trans. Knowl. Data Eng., 25 (2013) 619-632.
  • [12] F. Papadopoulos, M. Kitsak, M. A. Serrano, M. Boguna and D. Krioukov, Popularity versus similarity in growing networks, Nature, 489 (2012), 537-540.
  • [13] T. W. Liao, Clustering of time series data: a survey, Pattern Recognition, 38 (2005), 1857-1874.
  • [14] I. Bartolini, P. Ciaccia and M. Patella, WARP: accurate retrieval of shapes using phase of Fourier descriptors and time warping distance, IEEE Trans. Pattern Anal. Mach. Intell., 27 (2005), 142-147.
  • [15] T. C. Fu, A review on time series data mining, Eng. Appl. Artif. Intel., 24 (2011), 164-181.
  • [16] F. M. Shen, Y. Xu, L. Liu, Y. Yang, Z. Huang and H. T. Shen, Unsupervised deep hashing with similarity-adaptive and discrete optimization, IEEE Trans. Pattern Anal. Mach. Intell., 40 (2018), 3034-3044.
  • [17] A. Torrente, Band-based similarity indices for gene expression classification and clustering, Nature, 11 (2021), 21609.
  • [18] E. N. Lorenz, Deterministic nonperiodic flow, J. Atmospheric Sci., 20 (1963), 130-141.
  • [19] L. O. Chua, The genesis of Chua's circuit, Archiv für Elektronik und Übertragungstechnik, 46 (1992), 250-257.
  • [20] O. E. Rössler, An equation for continuous chaos, Phys. Lett. A, 57 (1976), 397-398.
  • [21] G. R. Chen and T. Ueta, Yet another chaotic attractor, Internat. J. Bifur. Chaos Appl. Sci. Engrg., 9 (1999), 1465-1466.
  • [22] T. Ueta and G. R. Chen, Bifurcation analysis of Chen’s attractor, Int. J. Bifurcat. Chaos, 10 (2000), 1917-1931.
  • [23] J. H. Lü and G. R. Chen, A new chaotic attractor coined, Int. J. Bifurcat. Chaos, 12 (2002), 659-661.
  • [24] J. H. Lü, G. R. Chen and S. C. Zhang, Dynamical analysis of a new chaotic attractor, Int. J. Bifurcat. Chaos, 12 (2002), 1001-1015.
  • [25] J. H. Lü, G. R. Chen and S. C. Zhang, The compound structure of a new chaotic attractor, Chaos Solitons Fractals, 14 (2002), 669-672.
  • [26] L. S. Pontryagin, V. G. Boltyanskii, R. V. Gamkrelidze and E. F. Mishchenko, The mathematical theory of optimal processes, Wiley, New York, USA, 1962.
  • [27] R. Bellman, Dynamic programming, Princeton, N. J., USA: Princeton University Press, 1957.