On the differential equation $\dot{\Theta} = (\Theta^\top - \Theta)\Theta$
with $\Theta \in SO(n)$
Gerd S. Schmidt
Christian Ebenbauer
Frank Allgöwer
Institute for Systems Theory and Automatic Control
Abstract
In this note we consider the global convergence properties of the differential
equation $\dot{\Theta} = (\Theta^\top - \Theta)\Theta$ with $\Theta \in SO(n)$,
which is a gradient flow of the function
$f : SO(n) \to \mathbb{R}$, $\Theta \mapsto 2n - 2\operatorname{tr}(\Theta)$. Many of the
presented results are not new, but scattered throughout the literature. The
motivation of this note is to summarize and extend the convergence results
known from the literature. Rather than giving an exhaustive list of references,
the results are presented in a self-contained fashion.
In this note, we discuss the properties of a function and a differential
equation on a smooth manifold.
If we speak about a manifold $M$ of dimension $m$, we always mean a
smooth manifold in the
sense of [1], i.e. a subset of some $\mathbb{R}^k$
with $k \geq m$ that is locally diffeomorphic to
$\mathbb{R}^m$.
In the context of this note, we need the notions of measure zero and dense.
A subset $A \subset M$ of a manifold $M$ is a set of
measure zero if there is a collection of smooth charts $(U_i, \varphi_i)$ whose
domains cover $A$ and such that the sets $\varphi_i(A \cap U_i)$ have
measure zero in $\mathbb{R}^m$, i.e. each $\varphi_i(A \cap U_i)$ can be
covered for any $\delta > 0$ by a countable collection of open balls whose
volumes sum up to less than $\delta$; for details see e.g. [2, Chapter
10].
To define dense, we need the topological closure $\overline{A}$ of a set
$A \subset M$, i.e. the intersection of all closed sets in $M$
that contain $A$.
A dense subset of a smooth manifold $M$ is a set $A \subset M$
such that the topological closure fulfills
$\overline{A} = M$, see e.g. [2, Appendix,
Topology].
$A$ is dense if and only if every nonempty open subset of $M$ has nonempty
intersection with $A$.
The complement $M \setminus A$ of a
set $A \subset M$ of measure zero is dense in
$M$: if there were a point $p \in M$ and
an open set $U$ with $p \in U$ and
$U \cap (M \setminus A) = \emptyset$, then $A$
would contain the open set $U$ and could not have measure zero, see also
[3, Chapter 2].
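Here, we consider a function and a differential equation on the set
of special orthogonal matrices
$SO(n) = \{ \Theta \in \mathbb{R}^{n \times n} \mid \Theta^\top \Theta = I,\ \det \Theta = 1 \}$.
$SO(n)$ is a smooth manifold of dimension $n(n-1)/2$ with the subspace
topology induced by $\mathbb{R}^{n \times n}$.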
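The tangent space at $\Theta \in SO(n)$ is
given by
\[
T_\Theta SO(n) = \{ \Omega\Theta \mid \Omega \in \mathbb{R}^{n \times n},\ \Omega^\top = -\Omega \}. \tag{1}
\]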
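The Riemannian metric
$\langle \cdot, \cdot \rangle : T_\Theta SO(n) \times T_\Theta SO(n) \to \mathbb{R}$
induced by the
standard Euclidean metric on $\mathbb{R}^{n \times n}$ is given by
\[
\langle \Omega_1\Theta, \Omega_2\Theta \rangle
= \operatorname{tr}\big( (\Omega_1\Theta)^\top \Omega_2\Theta \big)
= \operatorname{tr}( \Omega_1^\top \Omega_2 ). \tag{2}
\]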
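In the following, we define the differential and the Hessian of a function
$f : SO(n) \to \mathbb{R}$ at a point $\Theta \in SO(n)$.
Let $\gamma : (-\varepsilon, \varepsilon) \to SO(n)$ be a smooth curve with
$\gamma(0) = \Theta$ and
$\dot{\gamma}(0) = \Omega\Theta$ with
$\Theta \in SO(n)$ and $\Omega^\top = -\Omega$.
The differential of a
function $f$ at a point $\Theta$ evaluated at
$\Omega\Theta \in T_\Theta SO(n)$ is defined by
\[
\mathrm{d}f(\Theta)(\Omega\Theta) = \frac{\mathrm{d}}{\mathrm{d}t} f(\gamma(t)) \Big|_{t=0}. \tag{3}
\]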
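The critical points of $f$ are the points $\Theta$ where
$\mathrm{d}f(\Theta) : T_\Theta SO(n) \to \mathbb{R}$ is not surjective.
Because of $\dim \mathbb{R} = 1$, this means that
these are the points where $\mathrm{d}f(\Theta) = 0$.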
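The gradient of $f$ is defined as the unique vector field
$\operatorname{grad} f$ with
\[
\mathrm{d}f(\Theta)(\Omega\Theta) = \langle \operatorname{grad} f(\Theta), \Omega\Theta \rangle
\quad \text{for all } \Omega\Theta \in T_\Theta SO(n), \tag{4}
\]
see e.g. [2, Chapter 11].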
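The Hessian of $f$ at a critical point $\Theta$
evaluated at $(\Omega\Theta, \Omega\Theta)$ is defined by
\[
\operatorname{Hess} f(\Theta)(\Omega\Theta, \Omega\Theta)
= \frac{\mathrm{d}^2}{\mathrm{d}t^2} f(\gamma(t)) \Big|_{t=0}. \tag{5}
\]
Since the Hessian at a critical point is bilinear and symmetric, we have
for $\xi, \eta \in T_\Theta SO(n)$
the equality
\[
\operatorname{Hess} f(\Theta)(\xi, \eta)
= \tfrac{1}{2} \big( \operatorname{Hess} f(\Theta)(\xi + \eta, \xi + \eta)
- \operatorname{Hess} f(\Theta)(\xi, \xi)
- \operatorname{Hess} f(\Theta)(\eta, \eta) \big). \tag{6}
\]
As a consequence, the value $\operatorname{Hess} f(\Theta)(\xi, \eta)$
can be computed utilizing the values
$\operatorname{Hess} f(\Theta)(\xi + \eta, \xi + \eta)$,
$\operatorname{Hess} f(\Theta)(\xi, \xi)$,
$\operatorname{Hess} f(\Theta)(\eta, \eta)$
and (6).
For details on the Hessian at a critical point, see
[4, Appendix C.5].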
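Lemma 1
Consider the function
$f : SO(n) \to \mathbb{R}$, $\Theta \mapsto 2n - 2\operatorname{tr}(\Theta)$.
a) The differential of $f$ at $\Theta \in SO(n)$ is given
for any $\Omega\Theta \in T_\Theta SO(n)$ by
\[
\mathrm{d}f(\Theta)(\Omega\Theta) = -2\operatorname{tr}(\Omega\Theta),
\]
and the critical points of $f$ are given by
\[
C(f) = \{ \Theta \in SO(n) \mid \Theta = \Theta^\top \}. \tag{7}
\]
Furthermore, the gradient at $\Theta \in SO(n)$
is given by
\[
\operatorname{grad} f(\Theta) = (\Theta - \Theta^\top)\Theta. \tag{8}
\]
b) The Hessian at a critical point $\Theta$ is given by
\[
\operatorname{Hess} f(\Theta)(\Omega_1\Theta, \Omega_2\Theta) = -2\operatorname{tr}(\Omega_1\Omega_2\Theta).
\]
c) The set of critical points $C(f)$ has the following properties:
i) $C(f) = \bigcup_{k=0}^{\lfloor n/2 \rfloor} C_k$, where
\[
C_k = \{ \Theta \in SO(n) \mid \Theta = \Theta^\top,\ \operatorname{tr}(\Theta) = n - 4k \}. \tag{9}
\]
ii) Each $C_k$ is connected and isolated, i.e. there exists a
neighborhood $U_k$ of each $C_k$ such that
$U_k \cap C_j = \emptyset$ for all $j \neq k$.
iii) The $C_k$ are compact submanifolds of dimension $2k(n - 2k)$
and the tangent space at $\Theta \in C_k$ is
$T_\Theta C_k = \{ \Omega\Theta \mid \Omega^\top = -\Omega,\ \Omega\Theta + \Theta\Omega = 0 \}$.
iv) For every $k \in \{0, \dots, \lfloor n/2 \rfloor\}$
and every $\Theta \in C_k$ we have
\[
\ker \operatorname{Hess} f(\Theta) = T_\Theta C_k.
\]
d) $f$ has a unique minimum at $\Theta = I$. The other critical points are
saddle points or, on the component $C_{\lfloor n/2 \rfloor}$ where $f$
attains its maximal value, maxima.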
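The statements of Lemma 1 can be checked numerically on simple examples. The following Python sketch assumes $f(\Theta) = 2n - 2\operatorname{tr}(\Theta)$ and uses that $\Theta$ is critical exactly when $\Theta - \Theta^\top = 0$. The helper names `f`, `skew_part`, `sign_matrix` and `rotation_z` are illustrative and not part of the note. It evaluates two diagonal sign matrices in $SO(3)$, which are critical, and a rotation about the third axis, which is not.

```python
import math

def f(theta):
    """f(Theta) = 2n - 2 tr(Theta) for a square matrix given as nested lists."""
    n = len(theta)
    return 2 * n - 2 * sum(theta[i][i] for i in range(n))

def skew_part(theta):
    """Theta - Theta^T, which vanishes exactly at symmetric matrices."""
    n = len(theta)
    return [[theta[i][j] - theta[j][i] for j in range(n)] for i in range(n)]

def sign_matrix(n, k):
    """Diagonal matrix with 2k entries -1 and n - 2k entries +1 (det = +1)."""
    return [[(-1.0 if i < 2 * k else 1.0) if i == j else 0.0
             for j in range(n)] for i in range(n)]

def rotation_z(phi):
    """Rotation by the angle phi about the third axis in SO(3)."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def is_zero(m, tol=1e-12):
    return all(abs(x) <= tol for row in m for x in row)

identity = sign_matrix(3, 0)    # Theta = I
half_turn = sign_matrix(3, 1)   # diag(-1, -1, 1), a rotation by pi
generic = rotation_z(1.0)       # not symmetric, hence not critical

for name, theta in [("identity", identity),
                    ("half_turn", half_turn),
                    ("generic", generic)]:
    print(name, f(theta), is_zero(skew_part(theta)))
```

On the component with $2k$ eigenvalues equal to $-1$ the trace is $n - 4k$, so under the stated assumption $f$ takes the value $8k$ there.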
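Corollary 2
The differential equation
\[
\dot{\Theta} = (\Theta^\top - \Theta)\Theta \tag{10}
\]
is the gradient flow $\dot{\Theta} = -\operatorname{grad} f(\Theta)$ of
$f : SO(n) \to \mathbb{R}$, $\Theta \mapsto 2n - 2\operatorname{tr}(\Theta)$ with respect to
the Riemannian metric (2).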
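To illustrate Corollary 2, the following sketch integrates $\dot{\Theta} = (\Theta^\top - \Theta)\Theta$ on $SO(3)$ and records $f(\Theta) = 2n - 2\operatorname{tr}(\Theta)$ along the solution. The update $\Theta \leftarrow \exp(h(\Theta^\top - \Theta))\Theta$ is an assumption made here to keep the iterates orthogonal; it is not a scheme taken from the note. The step size, the initial angle and all function names are likewise illustrative.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def expm(a, terms=20):
    """Matrix exponential by a truncated Taylor series (adequate for small a)."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def f(theta):
    n = len(theta)
    return 2 * n - 2 * sum(theta[i][i] for i in range(n))

def step(theta, h):
    """One step Theta <- exp(h (Theta^T - Theta)) Theta."""
    n = len(theta)
    t = transpose(theta)
    omega = [[h * (t[i][j] - theta[i][j]) for j in range(n)] for i in range(n)]
    return matmul(expm(omega), theta)

phi = 2.0  # initial rotation angle about the third axis (arbitrary choice)
theta = [[math.cos(phi), -math.sin(phi), 0.0],
         [math.sin(phi), math.cos(phi), 0.0],
         [0.0, 0.0, 1.0]]

values = [f(theta)]
for _ in range(2000):  # step size 0.01, final time 20
    theta = step(theta, 0.01)
    values.append(f(theta))

print(values[0], values[-1])
```

For this initial condition the recorded values of $f$ decrease along the iteration and approach zero, as expected for a gradient flow converging to $\Theta = I$.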
In the following we prove Lemma 1.
Proof.
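a)
As stated above, $\gamma : (-\varepsilon, \varepsilon) \to SO(n)$ is a
differentiable curve with $\gamma(0) = \Theta$
and $\dot{\gamma}(0) = \Omega\Theta$ with
$\Theta \in SO(n)$ and $\Omega^\top = -\Omega$.
Then
\begin{align*}
\mathrm{d}f(\Theta)(\Omega\Theta)
&= \frac{\mathrm{d}}{\mathrm{d}t} f(\gamma(t)) \Big|_{t=0} \\
&= \frac{\mathrm{d}}{\mathrm{d}t} \big( 2n - 2\operatorname{tr}(\gamma(t)) \big) \Big|_{t=0} \\
&= -2\operatorname{tr}(\dot{\gamma}(0)) \\
&= -2\operatorname{tr}(\Omega\Theta). \tag{11}
\end{align*}
Since $\operatorname{tr}(\Omega\Theta) = 0$ holds for all skew-symmetric $\Omega$
if and only if $\Theta$ is symmetric, the critical points of $f$ are given by
\[
C(f) = \{ \Theta \in SO(n) \mid \Theta = \Theta^\top \}. \tag{12}
\]
With the definition of the Riemannian metric by (2) and
$-2\operatorname{tr}(\Omega\Theta) = \operatorname{tr}\big( (\Theta - \Theta^\top)^\top \Omega \big)$,
the gradient at $\Theta$ is given by
\[
\operatorname{grad} f(\Theta) = (\Theta - \Theta^\top)\Theta. \tag{13}
\]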
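b)
Let $\Theta$ denote a critical point of $f$.
As stated above, $\gamma : (-\varepsilon, \varepsilon) \to SO(n)$ is a
differentiable curve with $\gamma(0) = \Theta$
and $\dot{\gamma}(0) = \Omega\Theta$ with
$\Theta \in SO(n)$ and $\Omega^\top = -\Omega$.
Writing $\dot{\gamma}(t) = \Omega(t)\gamma(t)$ with skew-symmetric $\Omega(t)$ and $\Omega(0) = \Omega$, we get
\begin{align*}
\operatorname{Hess} f(\Theta)(\Omega\Theta, \Omega\Theta)
&= \frac{\mathrm{d}^2}{\mathrm{d}t^2} f(\gamma(t)) \Big|_{t=0}
= -2\operatorname{tr}(\ddot{\gamma}(0)) \\
&= -2\operatorname{tr}\big( \dot{\Omega}(0)\Theta + \Omega^2\Theta \big) \\
&= -2\operatorname{tr}(\Omega^2\Theta), \tag{14}
\end{align*}
where we utilized
that $\operatorname{tr}(\dot{\Omega}(0)\Theta) = 0$ since $\dot{\Omega}(t)$ is skew
symmetric for all $t$ and since $\Theta = \Theta^\top$ is a
critical point.
Utilizing (6) we get
\begin{align*}
\operatorname{Hess} f(\Theta)(\Omega_1\Theta, \Omega_2\Theta)
&= \tfrac{1}{2} \big( \operatorname{Hess} f(\Theta)((\Omega_1 + \Omega_2)\Theta, (\Omega_1 + \Omega_2)\Theta) \\
&\qquad - \operatorname{Hess} f(\Theta)(\Omega_1\Theta, \Omega_1\Theta)
- \operatorname{Hess} f(\Theta)(\Omega_2\Theta, \Omega_2\Theta) \big) \\
&= -\operatorname{tr}\big( (\Omega_1 + \Omega_2)^2\Theta \big)
+ \operatorname{tr}(\Omega_1^2\Theta) + \operatorname{tr}(\Omega_2^2\Theta) \\
&= -\operatorname{tr}(\Omega_1\Omega_2\Theta) - \operatorname{tr}(\Omega_2\Omega_1\Theta) \\
&= -2\operatorname{tr}(\Omega_1\Omega_2\Theta), \tag{15}
\end{align*}
where the last step uses $\operatorname{tr}(\Omega_2\Omega_1\Theta) = \operatorname{tr}(\Omega_1\Omega_2\Theta)$ for symmetric $\Theta$.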
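c)i)
Since $\Theta$ is symmetric, $\Theta$ is orthogonally diagonalizable,
i.e. $\Theta = UDU^\top$ for some diagonal $D$ and orthogonal $U$ where
the columns of $U$ are eigenvectors of $\Theta$.
Since $\Theta^\top\Theta = I$, we get $D^2 = I$ and consequently the
eigenvalues are $\pm 1$. Since $\det\Theta = \det D$ and $\det\Theta = 1$, we
always have an even number of negative eigenvalues.
A similarity transformation leaves the trace
invariant, hence a critical point fulfills
\[
\operatorname{tr}(\Theta) = \operatorname{tr}(D) = n - 4k, \tag{16}
\]
where $k \in \{0, \dots, \lfloor n/2 \rfloor\}$ is the number of
eigenvalue pairs which are $-1$.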
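c)ii)
We start by showing that each $C_k$ is path
connected and thus connected.
Let $k$ be arbitrary but fixed
and let $\Theta_1, \Theta_2 \in C_k$.
Then there are orthogonal $U_1, U_2$ such that
$\Theta_1 = U_1 D U_1^\top$ and $\Theta_2 = U_2 D U_2^\top$; changing the sign of one
column if necessary, $U_1$ and $U_2$ can be chosen with determinant one. Furthermore, there are real
skew-symmetric matrices $\Omega_1, \Omega_2$ such that
$U_1 = \exp(\Omega_1)$ and $U_2 = \exp(\Omega_2)$ with $\exp$ denoting the matrix exponential.
Then $\gamma : [0, 1] \to C_k$ defined by
\[
\gamma(t) = \exp\big( (1 - t)\Omega_1 + t\Omega_2 \big) \, D \, \exp\big( (1 - t)\Omega_1 + t\Omega_2 \big)^\top \tag{17}
\]
is a smooth curve in $C_k$ which connects $\Theta_1$ and $\Theta_2$.
Since $\Theta_1, \Theta_2$ were arbitrary, this implies the
path-connectedness of $C_k$.
To show that $C_k$ is isolated, we utilize that
$\operatorname{tr}(\Theta_k) \neq \operatorname{tr}(\Theta_j)$ for
$\Theta_k \in C_k$, $\Theta_j \in C_j$, $j \neq k$.
Then there is a $\delta > 0$ such that the open intervals
$I_k = (n - 4k - \delta, n - 4k + \delta)$ and $I_j = (n - 4j - \delta, n - 4j + \delta)$ fulfill
$I_k \cap I_j = \emptyset$ for $j \neq k$. As a consequence, the intersection of the
preimages of these sets under $\operatorname{tr}$ is empty.
Since $\operatorname{tr}$ is continuous and both
$I_k$ and $I_j$ are open, their preimages are
open and contain $C_k$ and $C_j$ respectively.
With $U_{kj} = \operatorname{tr}^{-1}(I_k)$
we thus have $U_{kj} \cap C_j = \emptyset$.
Since this is possible for every
$j \neq k$ and since a finite
intersection of open sets is an open set, we find an open neighborhood $U_k$ of
$C_k$ such that $U_k \cap C_j = \emptyset$ for all
$j \neq k$.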
c)iii)
The property that the $C_k$ are submanifolds is given in
[5].
The tangent space follows from (7).
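c)iv)
Let $k$ be arbitrary but fixed and
$\Theta \in C_k$.
Since
$T_\Theta C_k \subseteq \ker \operatorname{Hess} f(\Theta)$ is always true,
we have to check
$\ker \operatorname{Hess} f(\Theta) \subseteq T_\Theta C_k$. Since every
critical point is symmetric, there is an orthogonal $U$ such that
$\Theta = UDU^\top$ where $D$ is a diagonal matrix with non-zero
diagonal elements. Furthermore, we know that
\begin{align*}
\operatorname{Hess} f(\Theta)(\Omega_1\Theta, \Omega_2\Theta)
&= -2\operatorname{tr}(\Omega_1\Omega_2\Theta) \\
&= -\operatorname{tr}(\Omega_1\Omega_2\Theta) - \operatorname{tr}(\Omega_2\Omega_1\Theta) \\
&= -\operatorname{tr}(\Omega_1\Omega_2 UDU^\top) - \operatorname{tr}(\Omega_2\Omega_1 UDU^\top) \\
&= -\operatorname{tr}(\tilde{\Omega}_1\tilde{\Omega}_2 D) - \operatorname{tr}(\tilde{\Omega}_2\tilde{\Omega}_1 D) \\
&= -\operatorname{tr}\big( \tilde{\Omega}_2 ( D\tilde{\Omega}_1 + \tilde{\Omega}_1 D ) \big),
\end{align*}
where $\tilde{\Omega}_1 = U^\top\Omega_1 U$ and
$\tilde{\Omega}_2 = U^\top\Omega_2 U$.
As a consequence, $\Omega_1\Theta \in \ker \operatorname{Hess} f(\Theta)$ if and only if
\[
\operatorname{tr}\big( \tilde{\Omega}_2 ( D\tilde{\Omega}_1 + \tilde{\Omega}_1 D ) \big) = 0
\quad \text{for all skew-symmetric } \tilde{\Omega}_2.
\]
Observe that $\tilde{\Omega}_1$ is skew symmetric.
Since $D$ is diagonal, $D\tilde{\Omega}_1 + \tilde{\Omega}_1 D$ is skew symmetric as well.
Because the equation has to hold for all
skew-symmetric $\tilde{\Omega}_2$, we obtain $D\tilde{\Omega}_1 + \tilde{\Omega}_1 D = 0$.
With the non-singular $D$ and $D^2 = I$ this implies $\tilde{\Omega}_1 = -D\tilde{\Omega}_1 D$, which is equivalent to
$\Omega_1\Theta + \Theta\Omega_1 = 0$.
By c)iii), $T_\Theta C_k$ is
$\{ \Omega\Theta \mid \Omega^\top = -\Omega,\ \Omega\Theta + \Theta\Omega = 0 \}$, thus
the previous calculation shows
$\ker \operatorname{Hess} f(\Theta) \subseteq T_\Theta C_k$.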
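d)
Since $\Theta = \Theta^\top$, $\Theta$ is orthogonally diagonalizable,
i.e. $\Theta = UDU^\top$ for some diagonal $D$ and orthogonal $U$.
Therefore
\[
\operatorname{Hess} f(\Theta)(\Omega\Theta, \Omega\Theta)
= -2\operatorname{tr}(\Omega^2 UDU^\top)
= -2\operatorname{tr}(\tilde{\Omega}^2 D), \tag{18}
\]
where $\tilde{\Omega} = U^\top\Omega U$ is skew symmetric.
Consequently, $\operatorname{Hess} f(\Theta)(\Omega\Theta, \Omega\Theta)$ is definite for all skew
symmetric $\Omega$ at a critical point $\Theta$ if and only
if $-2\operatorname{tr}(\tilde{\Omega}^2 D)$ is definite for all skew
symmetric $\tilde{\Omega}$ where $D$ is diagonal. Thus,
we have to consider the Hessian only for diagonal $\Theta = D$, i.e.
\begin{align*}
-2\operatorname{tr}(\tilde{\Omega}^2 D)
&= -2\sum_{i=1}^{n} d_i (\tilde{\Omega}^2)_{ii} \\
&= 2\sum_{i=1}^{n} \sum_{j=1}^{n} d_i \tilde{\omega}_{ij}^2 \\
&= 2\sum_{i<j} (d_i + d_j) \tilde{\omega}_{ij}^2, \tag{19}
\end{align*}
where $d_i$ denote the diagonal entries of $D$ and $\tilde{\omega}_{ij}$ the entries of $\tilde{\Omega}$.
The critical points $\Theta$ are
symmetric, therefore all eigenvalues of $\Theta$ are real.
Observe now that $\Theta = UDU^\top$ with orthogonal $U$
implies $D^2 = I$.
Hence, the eigenvalues are $\pm 1$ and,
since $\det\Theta = 1$, the number of $-1$-eigenvalues is even.
Consequently we have to determine the definiteness
of the Hessian by considering (19) for all
diagonal matrices $D$ with $\pm 1$ on the diagonals where the number of
$-1$ entries is zero or even.

Suppose first that all $d_i$ are equal to $1$, i.e. $D = I$.
The associated $\Theta$ is $I$.
$\tilde{\Omega} \neq 0$ then implies
$4\sum_{i<j} \tilde{\omega}_{ij}^2 > 0$, i.e. $\operatorname{Hess} f(I)(\Omega, \Omega) > 0$ for all non-zero
skew-symmetric $\Omega$. Thus the Hessian is positive definite if $\Theta = I$.
Suppose now there is a non-zero even number of eigenvalues equal to $-1$.
Then, there are
indices $i$ and $j$ such that $d_i = -1$ and $d_j = -1$, and
therefore there are skew symmetric $\tilde{\Omega}$ such
that $-2\operatorname{tr}(\tilde{\Omega}^2 D) < 0$.
As a consequence, the Hessian is not positive semidefinite at a critical
point $\Theta$ with a non-zero even number of
negative eigenvalues $-1$.
Therefore, $\Theta = I$ is the only local (global) minimum
of $f$. All other critical points are saddle points or, on the component
$C_{\lfloor n/2 \rfloor}$ where no pair $d_i = d_j = 1$ exists, maxima.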
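The sign pattern of the Hessian at diagonal critical points can be checked directly. The sketch below assumes the quadratic form $-2\operatorname{tr}(\tilde{\Omega}^2 D)$ for $f(\Theta) = 2n - 2\operatorname{tr}(\Theta)$ and compares it with the sum $2\sum_{i<j}(d_i + d_j)\tilde{\omega}_{ij}^2$; the function names are illustrative. Elementary skew-symmetric matrices serve as test directions.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def elementary_skew(n, i, j):
    """Skew-symmetric matrix with +1 at (i, j) and -1 at (j, i)."""
    m = [[0] * n for _ in range(n)]
    m[i][j] = 1
    m[j][i] = -1
    return m

def hess_trace_form(d, omega):
    """-2 tr(Omega^2 D) for D = diag(d)."""
    om2 = matmul(omega, omega)
    return -2 * sum(om2[i][i] * d[i] for i in range(len(d)))

def hess_sum_form(d, omega):
    """2 * sum_{i<j} (d_i + d_j) * omega_ij^2."""
    n = len(d)
    return 2 * sum((d[i] + d[j]) * omega[i][j] ** 2
                   for i in range(n) for j in range(i + 1, n))

n = 3
results = {}
for d in ([1, 1, 1], [1, -1, -1]):
    for i in range(n):
        for j in range(i + 1, n):
            omega = elementary_skew(n, i, j)
            results[(tuple(d), i, j)] = (hess_trace_form(d, omega),
                                         hess_sum_form(d, omega))

for key, value in results.items():
    print(key, value)
```

At $D = I$ every direction gives a positive value, while at $D = \operatorname{diag}(1, -1, -1)$ the directions mixing a $+1$ and a $-1$ eigenvalue give zero and the direction within the $-1$ eigenspace gives a negative value.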
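Definition 3
[4, p. 21]
Let $M$ be a smooth Riemannian manifold and
$f : M \to \mathbb{R}$ be a smooth function. Denote the set of
critical points of $f$ by $C(f)$. $f$ is called a Morse-Bott function
provided the following conditions are satisfied:
a) $f$ has compact sublevel sets.
b) $C(f) = \bigcup_{j=1}^{k} N_j$ where the $N_j$ are disjoint,
closed and connected submanifolds of $M$ and $f$ is constant on
$N_j$ for $j = 1, \dots, k$.
c) $\ker \operatorname{Hess} f(x) = T_x N_j$ for all $x \in N_j$ and
all $j = 1, \dots, k$.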
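Lemma 4
$f : SO(n) \to \mathbb{R}$, $\Theta \mapsto 2n - 2\operatorname{tr}(\Theta)$ is a Morse-Bott function.
Proof.
We show only Definition 3a) since b) and
c) were shown in Lemma 1.
$SO(n)$ is compact, hence $f$ attains its minimal and its maximal value on
$SO(n)$. The minimal value of $f$ is zero, the maximal value is $4(n-1)$ for
$n$ odd and $4n$ for $n$ even.
If $n$ is odd we thus have
\[
\{ \Theta \in SO(n) \mid f(\Theta) \leq c \} =
\begin{cases}
\emptyset & \text{if } c < 0, \\
f^{-1}([0, c]) & \text{if } 0 \leq c < 4(n-1), \\
SO(n) & \text{if } c \geq 4(n-1).
\end{cases}
\]
If $n$ is even we have
\[
\{ \Theta \in SO(n) \mid f(\Theta) \leq c \} =
\begin{cases}
\emptyset & \text{if } c < 0, \\
f^{-1}([0, c]) & \text{if } 0 \leq c < 4n, \\
SO(n) & \text{if } c \geq 4n.
\end{cases}
\]
Since $f$ is continuous, the preimage of a closed set is a closed set and
since $SO(n)$ is bounded, its subsets are bounded as well. Since
$SO(n) \subset \mathbb{R}^{n \times n}$, the boundedness and closedness of the
sublevel sets imply their compactness.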
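The extreme values used in the proof can be reproduced from the critical values. The short sketch below assumes $f(\Theta) = 2n - 2\operatorname{tr}(\Theta)$ and $\operatorname{tr}(\Theta) = n - 4k$ on the critical component with $2k$ eigenvalues $-1$; since $f$ attains its extrema at critical points, the largest critical value is the maximum of $f$. The function names are illustrative.

```python
def critical_values(n):
    """Values of f = 2n - 2 tr(Theta) on the critical components of SO(n).

    On the component with 2k eigenvalues -1 the trace equals n - 4k.
    """
    return [2 * n - 2 * (n - 4 * k) for k in range(n // 2 + 1)]

def max_value(n):
    return max(critical_values(n))

def min_value(n):
    return min(critical_values(n))

for n in range(2, 8):
    print(n, critical_values(n), min_value(n), max_value(n))
```

The printed maxima agree with $4n$ for even $n$ and $4(n-1)$ for odd $n$.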
The convergence properties of the gradient flow associated with
$f : SO(n) \to \mathbb{R}$, $\Theta \mapsto 2n - 2\operatorname{tr}(\Theta)$ are thus determined by the following proposition.
Proposition 5
[4, Proposition 3.9]
Let $f : M \to \mathbb{R}$ be a Morse-Bott function on a
Riemannian manifold $M$. The $\omega$-limit set $\omega(x)$ of
$x \in M$ with respect to the gradient flow of $f$ is a single
critical point of $f$.
Every solution of the gradient flow converges to an equilibrium point.
To give a more detailed specification of the convergence behavior of the gradient
flow of $f$ we need the following result.
Lemma 6
Let $M$ be a smooth and compact Riemannian manifold of dimension
$m$, $f : M \to \mathbb{R}$ be a Morse-Bott function and denote
the set of critical points of $f$ by $C(f)$.
Let $N$ be a fixed connected component of $C(f)$
of dimension $\dim N$.
If at least one of the eigenvalues with nonzero real part of the
linearization of $-\operatorname{grad} f$ at some $p \in N$ has a real part greater than
zero, then the set $A_N$ of initial conditions
for which the solutions $x(t)$ of the gradient flow $\dot{x} = -\operatorname{grad} f(x)$
converge towards $N$, i.e.
\[
A_N = \{ x_0 \in M \mid x(0) = x_0,\ \lim_{t \to \infty} x(t) \in N \}, \tag{20}
\]
has measure zero. Furthermore $M \setminus A_N$ is dense in
$M$, i.e. $\overline{M \setminus A_N} = M$.
Proof.
The goal of the proof is to show that $A_N$ has measure zero and that
$M \setminus A_N$ is dense. We show this in the following way.
First, we consider the set of points which lie in a suitable neighborhood of
$N$ and which contains the orbits of the
solutions of the gradient flow $\dot{x} = -\operatorname{grad} f(x)$
which eventually converge towards $N$.
We utilize a result from
[6] to
conclude that this set has measure zero and that $M$ without this set is
dense. Then, we utilize this set to derive the same result for $A_N$
utilizing the properties of the flow of the gradient vector field on
$M$.
In the following, we apply [6, Proposition
4.1].
This proposition concerns the case of a three times continuously
differentiable vector field $X : \mathbb{R}^k \to \mathbb{R}^k$
together with a
submanifold $N$ of equilibria in $\mathbb{R}^k$
under the assumption that
$N$ is normally hyperbolic with respect to $X$.
Normal hyperbolicity of $N$ means that the linearization
of the vector field $X$ at every $p \in N$ has
$k - \dim N$ eigenvalues with real parts different from
zero.
Under these assumptions, there exists a neighborhood
$U$ of $N$ such that any solution $x(t)$
of $\dot{x} = X(x)$ with initial condition $x(0) \in U$ and with
a forward
orbit in $U$ lies on the stable
manifold $W^s(p)$ of a point $p \in N$.
$W^s(p)$ is defined by
\[
W^s(p) = \{ x_0 \mid x(0) = x_0,\ \lim_{t \to \infty} x(t) = p \}. \tag{21}
\]
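We can always embed $M$ into $\mathbb{R}^k$ for $k$ large enough,
see e.g. [2, Chapter 10], therefore we
can utilize
[6] also
for our case of a vector field on a manifold $M$.
Since $M$ is compact and $f$ is smooth, we have a global flow
$\varphi : \mathbb{R} \times M \to M$, which means that
$t \mapsto \varphi_t(x_0)$ is a solution of the gradient flow
defined for all $t \in \mathbb{R}$ and with
$\varphi_0(x_0) = x_0$, see e.g. [2, Chapter
17].
Furthermore $\varphi_t : M \to M$ is a
diffeomorphism for every $t \in \mathbb{R}$.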
Since $f$ is a Morse-Bott function, $N$ is normally hyperbolic,
i.e. the linearization of the gradient flow at any $p \in N$
has exactly $m - \dim N$ eigenvalues with real parts different from zero,
see [7, p. 183, Morse-Bott
functions].
According to
[6, Proposition
4.1], we
have a neighborhood $U$ of $N$ such that for every
solution $x(t)$ with $x(0) \in U$ and a forward orbit
in $U$, the solution has to lie in one
$W^s(p)$ with $p \in N$.
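We know from [8, Proposition
3.2]
that if we choose $U$ small
enough, then the local stable manifold
$W^s_{\mathrm{loc}}(N)$ of $N$ given by
\[
W^s_{\mathrm{loc}}(N) = \{ x_0 \in U \mid \varphi_t(x_0) \in U \text{ for all } t \geq 0,\ \lim_{t \to \infty} \varphi_t(x_0) \in N \} \tag{22}
\]
is a smooth submanifold of dimension $\dim N + s$ where $s$ is the number of
eigenvalues with real part smaller than zero.
Since the stable manifold $W^s_{\mathrm{loc}}(N)$ is a submanifold
of $M$ with smaller dimension than $M$, the stable
manifold has measure zero and
$M \setminus W^s_{\mathrm{loc}}(N)$ is dense in $M$, see [2, Theorem
10.5].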
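Let $A_N$ be defined by (20).
Define $W_0$ by
\[
W_0 = \{ x_0 \in A_N \mid \varphi_t(x_0) \in U \text{ for all } t \geq 0 \} \tag{23}
\]
and let $W_j$ for $j \in \mathbb{N}$ be defined by
\[
W_j = \varphi_{-j}(W_0). \tag{24}
\]
If $x_0 \in A_N$, then there is an integer $j \geq 0$ such that
$\varphi_j(x_0) \in W_0$. As a consequence
\[
A_N = \bigcup_{j \geq 0} W_j. \tag{25}
\]
Because of (24),
$\varphi_j(W_j) = W_0$ for every $j$.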
Moreover,
[6, Proposition
4.1]
implies that
\[
W_0 \subseteq W^s_{\mathrm{loc}}(N). \tag{26}
\]
As a subset of a set of measure zero, $W_0$ has measure zero,
see e.g. [2, Lemma A.60(b)]. Since
$\varphi_{-j}$ is a diffeomorphism, this
means that $W_j$ also has measure zero, see e.g. [2, Lemma
10.1]. According to
(25), $A_N$ is a countable union of the
$W_j$, i.e. $A_N$ is a countable union of sets of measure
zero. Therefore, $A_N$ has measure zero and as a consequence
$M \setminus A_N$ is dense.
To finally derive the global stability properties of the identity matrix $I$
for the gradient flow of $f : SO(n) \to \mathbb{R}$, $\Theta \mapsto 2n - 2\operatorname{tr}(\Theta)$
and thus also for the differential equation
(10), we linearize the gradient flow around the
equilibria.
Lemma 7
The convergence properties of the gradient flow of
$f : SO(n) \to \mathbb{R}$, $\Theta \mapsto 2n - 2\operatorname{tr}(\Theta)$ are the following:
a) The $\omega$-limit set of any solution is contained in the set of
equilibria given by (7), i.e.
\[
\omega(\Theta_0) \subset \{ \Theta \in SO(n) \mid \Theta = \Theta^\top \} \quad \text{for all } \Theta_0 \in SO(n).
\]
b) The equilibrium $\Theta = I$ is locally exponentially stable
and all other equilibria are unstable.
c) The set of initial conditions for which the solutions of the gradient flow
of $f$ converge towards $\Theta = I$ is dense in $SO(n)$
and the set of initial conditions for which the solutions of the gradient
flow converge to the other equilibria has measure zero.
Corollary 8
The identity matrix is an almost globally asymptotically stable
equilibrium for the differential equation
\[
\dot{\Theta} = (\Theta^\top - \Theta)\Theta. \tag{27}
\]
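For $n = 2$ the statement can be made concrete. Assuming the parametrization $\Theta = \begin{pmatrix} \cos\varphi & -\sin\varphi \\ \sin\varphi & \cos\varphi \end{pmatrix}$, which is not used in the note itself, a short calculation reduces $\dot{\Theta} = (\Theta^\top - \Theta)\Theta$ to the scalar equation $\dot{\varphi} = -2\sin\varphi$. Its only equilibria in $(-\pi, \pi]$ are $\varphi = 0$ (the identity) and $\varphi = \pi$ (the matrix $-I$). The following sketch integrates the scalar equation with the explicit Euler method; step size, horizon and initial angles are arbitrary choices.

```python
import math

def simulate(phi0, h=0.01, steps=3000):
    """Explicit Euler integration of phi' = -2 sin(phi) up to time h * steps."""
    phi = phi0
    for _ in range(steps):
        phi += h * (-2.0 * math.sin(phi))
    return phi

# Initial angles in (-pi, pi), two of them close to the unstable equilibrium pi.
initial_angles = [0.5, 2.0, 3.1, -3.1]
final_angles = [simulate(phi0) for phi0 in initial_angles]

for phi0, phi_end in zip(initial_angles, final_angles):
    print(phi0, phi_end)

# The right-hand side vanishes (up to rounding) at phi = pi, i.e. at Theta = -I.
residual_at_pi = -2.0 * math.sin(math.pi)
```

All listed initial angles are driven to $\varphi = 0$; only the single point $\varphi = \pi$ stays away from the identity, which is the measure-zero exception in this case.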
In the following we prove Lemma 7.
Proof.
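a) is a consequence of Lemma 4 and
Proposition 5.
b)
To prove the property b) we
linearize the gradient flow with the vector field defined by
$X(\Theta) = (\Theta^\top - \Theta)\Theta$ around the equilibria.
To do this directly we compute
$\frac{\mathrm{d}}{\mathrm{d}t} X(\gamma(t)) \big|_{t=0}$ where $\Theta$ is an equilibrium and
$\gamma : (-\varepsilon, \varepsilon) \to SO(n)$ is smooth with
$\gamma(0) = \Theta$ and $\dot{\gamma}(0) = \Omega\Theta$, $\Omega^\top = -\Omega$.
This yields
\begin{align*}
\frac{\mathrm{d}}{\mathrm{d}t} X(\gamma(t)) \Big|_{t=0}
&= \frac{\mathrm{d}}{\mathrm{d}t} \big( \gamma(t)^\top - \gamma(t) \big) \gamma(t) \Big|_{t=0} \\
&= \big( \dot{\gamma}(0)^\top - \dot{\gamma}(0) \big) \Theta + (\Theta^\top - \Theta) \dot{\gamma}(0) \\
&= (\Theta^\top\Omega^\top - \Omega\Theta)\Theta \\
&= -(\Theta\Omega + \Omega\Theta)\Theta, \tag{28}
\end{align*}
since $\Theta^\top = \Theta$ at an equilibrium and $\Omega^\top = -\Omega$.
Consequently the linearization of the gradient flow at an equilibrium
$\Theta$ is given by
\[
\dot{\Omega} = -(\Theta\Omega + \Omega\Theta), \tag{29}
\]
where $\Omega^\top = -\Omega$. Note that due to the simple nature of the
Riemannian metric (2) and the connection of
the linearization of a gradient flow to the Hessian, we could have obtained
(29) directly from
(14).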
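More precisely, utilize
\begin{align*}
\operatorname{Hess} f(\Theta)(\Omega_1\Theta, \Omega_2\Theta)
&= -2\operatorname{tr}(\Omega_1\Omega_2\Theta) \\
&= -\operatorname{tr}\big( (\Omega_1\Theta + \Theta\Omega_1)\Omega_2 \big) \\
&= \operatorname{tr}\big( (\Theta\Omega_1 + \Omega_1\Theta)^\top \Omega_2 \big) \\
&= \langle (\Theta\Omega_1 + \Omega_1\Theta)\Theta, \Omega_2\Theta \rangle. \tag{30}
\end{align*}
If $\Theta = I$, then the linearization is
\[
\dot{\Omega} = -2\Omega, \tag{31}
\]
which shows that the equilibrium $\Theta = I$ is locally exponentially stable.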
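Now consider the linearization at the other equilibrium points,
i.e. $\Theta = \Theta^\top$ and $\Theta \neq I$.
Since $\Theta$ is symmetric, $\Theta$ is orthogonally diagonalizable,
i.e. $\Theta = UDU^\top$ for some diagonal $D$ and orthogonal $U$ where
the columns of $U$ are eigenvectors of $\Theta$.
Since $\Theta^\top\Theta = I$, we get $D^2 = I$ and consequently the
eigenvalues are $\pm 1$. Since $\det\Theta = 1$ and $\Theta \neq I$, we
always have a non-zero even number of negative eigenvalues with associated
orthonormal eigenvectors $v_1, \dots, v_{2k}$.
Set
$\Omega = v_1 v_2^\top - v_2 v_1^\top$ and note that $\Theta v_1 = -v_1$, $\Theta v_2 = -v_2$. Therefore
\[
-(\Theta\Omega + \Omega\Theta) = 2\Omega. \tag{32}
\]
Therefore, $\Omega$ is an eigenvector of the operator defined
by the right hand side of (29).
Since the associated eigenvalue is positive, the linearization
(29) is unstable.
Consequently, the linearization of the gradient flow
at the equilibria with $\Theta = \Theta^\top$ and
$\Theta \neq I$ is
unstable, which proves b).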
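The spectrum of the linearization can be inspected at diagonal equilibria. The sketch below assumes the linearization $\Omega \mapsto -(\Theta\Omega + \Omega\Theta)$ derived under the reading $\dot{\Theta} = (\Theta^\top - \Theta)\Theta$; for $\Theta = \operatorname{diag}(d_1, \dots, d_n)$ the elementary skew-symmetric matrices are eigenvectors with eigenvalues $-(d_i + d_j)$. The function names are illustrative.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def diag(d):
    n = len(d)
    return [[d[i] if i == j else 0 for j in range(n)] for i in range(n)]

def elementary_skew(n, i, j):
    """Skew-symmetric matrix with +1 at (i, j) and -1 at (j, i)."""
    m = [[0] * n for _ in range(n)]
    m[i][j] = 1
    m[j][i] = -1
    return m

def linearization(theta, omega):
    """Right-hand side -(Theta Omega + Omega Theta) of the linearized flow."""
    n = len(theta)
    a = matmul(theta, omega)
    b = matmul(omega, theta)
    return [[-(a[i][j] + b[i][j]) for j in range(n)] for i in range(n)]

def eigenvalues(d):
    """Eigenvalue of the linearization on each elementary skew direction."""
    n = len(d)
    theta = diag(d)
    result = {}
    for i in range(n):
        for j in range(i + 1, n):
            e = elementary_skew(n, i, j)
            image = linearization(theta, e)
            lam = image[i][j]  # e[i][j] == 1, so this is the scaling factor
            # Confirm that e really is an eigenvector: image == lam * e.
            assert image == [[lam * x for x in row] for row in e]
            result[(i, j)] = lam
    return result

at_identity = eigenvalues([1, 1, 1])
at_half_turn = eigenvalues([1, -1, -1])
print(at_identity)
print(at_half_turn)
```

At the identity all eigenvalues are negative, whereas at $\operatorname{diag}(1, -1, -1)$ one eigenvalue is positive and the remaining ones are zero; the zero eigenvalues belong to directions tangent to the set of equilibria.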
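c)
Denote the flow of $-\operatorname{grad} f$ by
$\varphi_t$ and by $A_k$ the
set
\[
A_k = \{ \Theta_0 \in SO(n) \mid \lim_{t \to \infty} \varphi_t(\Theta_0) \in C_k \}, \tag{33}
\]
i.e. the set of initial conditions that converge to the connected component
$C_k$ of the set of critical points given in Lemma
1c).
Because of Proposition 5, we are certain that any solution
of the gradient flow converges to the critical set of $f$, and as a
consequence,
$SO(n) = \bigcup_{k=0}^{\lfloor n/2 \rfloor} A_k$.
Then $A_0$ is the set of
initial conditions for which the flow converges to $\Theta = I$ and
$\bigcup_{k \geq 1} A_k$ is the
set of initial conditions for which the flow converges to any of the other
critical points. By b), Lemma 6 applies to every $C_k$ with $k \geq 1$ and shows that
$A_k$ has measure zero and that $SO(n) \setminus A_k$ is dense in $SO(n)$.
Since $\bigcup_{k \geq 1} A_k$ is the union of a finite number of sets of measure zero,
it has measure zero, see e.g. [2, Lemma
10.1]. In particular
$A_0 = SO(n) \setminus \bigcup_{k \geq 1} A_k$ is dense.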
References
[1] V. Guillemin, Differential Topology, Prentice-Hall, Inc., 1974.
[2] J. M. Lee, Introduction to Smooth Manifolds, Vol. 218 of Graduate Texts in Mathematics, Springer, 2006.
[3] J. Milnor, Topology from the differentiable viewpoint, Princeton Landmarks in Mathematics, Princeton University Press, 1997, originally published by University Press of Virginia, 1965.
[4] U. Helmke, J. B. Moore, Optimization and Dynamical Systems, Springer, 1994.
[5] T. Frankel, Critical submanifolds of the classical groups and Stiefel manifolds, in: S. S. Cairns (Ed.), Differential and Combinatorial Topology – A Symposium in Honor of Marston Morse, Princeton University Press, 1965, pp. 37–54.
[6] B. Aulbach, Continuous and Discrete Dynamics near Manifolds of Equilibria, Vol. 1058 of Lecture Notes in Mathematics, Springer, 1984.
[7] D. McDuff, D. Salamon, Introduction to Symplectic Topology, Clarendon Press, 1998.
[8] D. M. Austin, P. J. Braam, Morse-Bott theory and equivariant cohomology, in: H. Hofer, C. H. Taubes, A. Weinstein, Z. Eduard (Eds.), The Floer Memorial Volume, Vol. 133 of Progress in Mathematics, Birkhäuser, 1995, pp. 123–184.