Probabilistic Reachability of Discrete-Time Nonlinear Stochastic Systems

Zishun Liu zliu910@gatech.edu Saber Jafarpour Saber.Jafarpour@colorado.edu Yongxin Chen yongchen@gatech.edu Georgia Institute of Technology, Atlanta, GA 30332, USA University of Colorado Boulder, Boulder, CO 80309, USA

Abstract

In this paper we study the reachability problem for discrete-time nonlinear stochastic systems. Our goal is to present a unified framework for calculating the probabilistic reachable set of discrete-time systems in the presence of both deterministic input and stochastic noise. By adopting a suitable separation strategy, the probabilistic reachable set is decoupled into a deterministic reachable set and the effect of the stochastic noise. To capture the effect of the stochastic noise, in particular sub-Gaussian noise, we provide a probabilistic bound on the distance between a stochastic trajectory and its deterministic counterpart. The key to our approach is a novel energy function called the Averaged Moment Generating Function, which we leverage to provide a high probability bound on this distance. We show that this probabilistic bound is tight for a large class of discrete-time nonlinear stochastic systems and is exact for linear stochastic dynamics. By combining this tight probabilistic bound with the existing methods for deterministic reachability analysis, we propose a flexible framework that can efficiently compute probabilistic reachable sets of stochastic systems. We also provide two case studies for applying our framework to Lipschitz bound reachability and interval-based reachability. Three numerical experiments are conducted to validate the theoretical results.

keywords:

Reachability. Discrete-time Stochastic Systems. Nonlinear Systems.

^†^†thanks: This paper was not presented at any IFAC meeting.

, , ,

1 Introduction

Reachability analysis is a classical problem that studies how a dynamical system evolves over time under uncertainties in its initial conditions and inputs. It naturally appears in diverse applications, including model checking, safety verification of autonomous systems, and controller synthesis. It is known that obtaining the exact reachable sets of general dynamical systems requires computing an infinite number of their trajectories, which is computationally intractable [1]. Consequently, numerous frameworks have been developed in the literature to efficiently approximate the reachable sets for both continuous-time and discrete-time dynamical systems. While continuous-time models are commonly used to represent physical processes, discrete-time models are better suited for systems involving digital implementation and real-time computation [2, 3, 4]. In this paper, we focus on the reachability analysis of discrete-time systems.

For deterministic systems, reachability studies typically handle the system in the presence of bounded input and provide worst-case approximation for the reachable set [5]. Approaches in this line can be roughly divided into three categories. Dynamical programming approaches use a game-theoretic perspective to compute reachable sets of discrete-time systems [6, 7]. However, the computational cost of these approaches makes them intractable for reachability analysis of large-scale systems. Set propagation methods rely on propagating a specific family of sets to over-approximate the reachable sets of the system (such as polytopes, ellipsoids, and zonotopes) [2, 8]. Despite their computational efficiency, these approaches are either applicable to a special class of systems or lead to overly conservative estimates of reachable sets. Recently Lipschitz bound [9, 10] and interval analysis [11, 12] have been used as flexible and computationally efficient set propagation methods for reachability analysis of large-scale deterministic systems. Another category is simulation-based techniques, by using which the reachable set is estimated through numerical simulations [13, 14, 15, 9, 16]. All these methods constitute a toolset for reachability analysis on deterministic systems with bounded input.

Many real-world applications exhibit uncertainties that are highly variable and unpredictable. To analyze the propagation of these uncertainties through the system, it is more effective to describe them using stochastic disturbances. When the stochastic disturbance is bounded, the deterministic reachability methods based on the worst-case analysis can be used to estimate reachable sets of the system. However, this approach can lead to trivial or overly-conservative estimates of reachable sets as it offer excessive robustness to the worst-case adversarial noise, which rarely happens in practice [4]. When the stochastic disturbance is unbounded, such as Gaussian noise, the set of all possible stochastic trajectories can be unbounded. To better capture the effects of stochastic disturbances, reachability analysis in stochastic systems focuses on the probabilistic reachable set, which refers to the set that any possible trajectory starting from an initial set can reach with high probability (e.g., 99.9%).

The probabilistic reachability of discrete-time systems has been studied using various techniques. A prevalent approach is to approximate the stochastic disturbance with a bounded input, for instance, by establishing a high-probability bound on the stochastic noise [17, 18]. This converts the probabilistic reachability analysis into a worst-case deterministic reachability problem. However, this strategy can lead to a conservative estimates of the probabilistic reachable sets for high probability levels over long time periods (see discussions in Section 4.5). Optimization-based approaches leverage methods such as dynamic programming to accurately compute the probabilistic reachable set [6, 7, 19]. Despite the high accuracy of these approaches, they can become computationally inefficient for complex and large-scale systems. Another relevant line of literature for reachability analysis is the use of Lyapunov functions [20, 21] or control barrier function [22, 23] for measuring the probability of a trajectory staying in a level set. However, these functional approaches are only efficient in restrictive scenarios. For general nonlinear systems, finding a probabilistic reachable set that can be verified through these approaches can be difficult.

In this paper, we develop a framework for probabilistic reachability analysis of discrete-time stochastic systems under both bounded input and stochastic disturbances. Inspired by [24], we establish a separation strategy that decouples the effects of deterministic inputs and stochastic uncertainties on the probabilistic reachable set of systems. The effect of deterministic input is captured using reachable sets of an associated deterministic system and the effect of stochastic uncertainty is represented using the distance between trajectories of the stochastic system and the associated deterministic system termed as stochastic deviation. By leveraging an energy function called the Averaged Moment Generating Function (AMGF), we prove a tight probabilistic bound (Theorem 1) on the stochastic deviation for general nonlinear discrete-time systems. This bound is sharper than bounds achieved by the worst-case analysis methods such as conformal prediction. Moreover, our bound coincides with the classical bound on stochastic deviation for linear stochastic systems under the same assumptions. Our bound is applicable to systems subject to sub-Gaussian stochastic noise, which includes a wide range of typical noise such as Gaussian, truncated Gaussian and uniform noise. We show that this bound cannot be improved further without additional assumptions.

Our framework offers tremendous flexibility in finding probabilistic reachable sets of stochastic systems, as it can incorporate any deterministic reachability methods for approximating the effect of deterministic inputs with high probability tight bounds on the stochastic deviation. Due to the separation strategy and the tightness of the stochastic deviation, the computational efficiency and tightness of the probabilistic reachable set only rely on the over-approximation of the deterministic reachable set. Consequently, analyzing the reachability of the associated deterministic system is all we need to obtain a good probabilistic reachable set, and computational heavy work on the stochastic system can be avoided. In particular, we combine our framework with two computationally efficient deterministic reachability approaches, namely Lipschitz bound reachability and interval-based reachability, and provide explicit expressions for the probabilistic reachable sets of discrete-time stochastic systems.

Finally, the contribution of this probabilistic bound goes beyond reachability analysis. Our tight probabilistic bound of stochastic deviation is the first non-conservative result that can quantitatively describe the behavior of a general nonlinear discrete-time system subject to sub-Gaussian noise. This tight bound is poised to have profound implications in the broader range of research and applications such as safety-critical control, finance, statistics, machine learning, etc.

The rest of the paper is organized as follows. We provide background knowledge about deterministic reachability, formulate the probabilistic reachability problem, and presents our overall strategy in Section 2. In Section 3 we provide the expectation bound on the stochastic deviation introduced in Section 2 and points out its limitations. To overcome the limitations, we introduce the main technical contribution termed Averaged Moment Generating Function in Section 4, and make use of it to prove the probabilistic bound on the stochastic deviation. We also show the tightness of our result and compare it with the performance of the worst-case analysis in this section. Based on the theoretical result of Section 4, Section 5 provides the result of the probabilistic reachable set and gives two related case studies. Numerical experiments are displayed in Section 6. Section 7 concludes the paper.

Notations. For a vector $x\in\mathbb{R}^{n}$ , $\|x\|$ denotes its Euclidean norm ( $l_{2}$ norm) and $\langle x,y\rangle$ denotes the corresponding inner product. For a matrix $A$ , $\|A\|$ denotes its induced norm. Given two sets $A,B$ , we define the Minkowski sum of the sets $A$ and $B$ by $A\oplus B=\{x+y:x\in A,~{}y\in B\}$ . Throughout the paper, we use $\mathbb{E}$ to denote expectation, $\mathbb{P}$ to denote probability, $\mathcal{B}^{n}(r,y)$ to denote the ball $\{x\in\mathbb{R}^{n}:\|x-y\|\leq r\}$ , and $\mathcal{S}^{n-1}$ to denote the unit sphere of $\mathbb{R}^{n}$ . For a random variable $X$ , $X\sim G$ means $X$ is independently drawn from the distribution $G$ and $X\sim\mathcal{S}^{n-1}$ means $X$ is drawn uniformly from $\mathcal{S}^{n-1}$ . $N(\mu,\Sigma)$ denotes the Gaussian distribution with mean $\mu$ and covariance matrix $\Sigma$ . Given a function $f:\mathbb{R}\to\mathbb{R}$ and a non-negative function $g:\mathbb{R}\to\mathbb{R}_{\geq 0}$ , we write $f(x)=\mathcal{O}(g(x))$ , if there exist constants $M>0$ and $x_{0}\in\mathbb{R}$ such that $|f(x)|\leq Mg(x)$ , for every $x\geq x_{0}$ .

2 Problem Statement and Preliminaries

Consider the discrete-time stochastic system

X_{t+1}=f(X_{t},u_{t},t)+w_{t}

(1)

where $X_{t}\in\mathbb{R}^{n}$ is the state, $u_{t}\in\mathbb{R}^{p}$ is the input, $w_{t}\in\mathbb{R}^{n}$ is the stochastic disturbance, and $f:\mathbb{R}^{n}\times\mathbb{R}^{p}\times\mathbb{R}_{+}\to\mathbb{R}^{n}$ is a parameterized vector field. The input $u_{t}$ can be either control input or deterministic disturbance depending on the application. In this paper, we impose the following standard Lipschitz condition on $f$ [25].

Assumption 1.

At every time $t\geq 0$ , there exists $L_{t}\geq 0$ such that, for every $x,y\in\mathbb{R}^{n}$ and every $u\in\mathbb{R}^{p}$ ,

\|f(x,u,t)-f(y,u,t)\|\leq L_{t}\|x-y\|.

This paper aims to establish an effective method of reachability analysis for the stochastic system (1). In this section, we formulate our probabilistic reachability problem on the stochastic system (1) following a brief review of deterministic reachability. We then present our overall strategy for addressing the problem.

2.1 Deterministic Reachable Set

Computing the reachable sets for deterministic systems is a fundamental problem in dynamical systems and control theory. Consider the deterministic system

x_{t+1}=f(x_{t},u_{t},t),

(2)

which can be viewed as a noiseless version of the stochastic system (1). The deterministic reachable set (DRS) at time $t$ is the set in the state space that the system (2) can reach, starting from an initial configuration, under all possible inputs within a specified time.

Definition 2.1 (DRS).

Consider the system (2) with initial set $\mathcal{X}_{0}\subseteq\mathbb{R}^{n}$ and input set $\mathcal{U}\subseteq\mathbb{R}^{p}$ . The deterministic reachable set of (2) at time $t$ starting from $\mathcal{X}_{0}$ with inputs in $\mathcal{U}$ is

\mathcal{R}_{t}=\left\{x_{t}\middle|\begin{aligned} &\tau\mapsto x_{\tau}\mbox{ is a trajectory of~{}\eqref{sys: d-t ds}}\\ &\mbox{ with }x_{0}\in\mathcal{X}_{0}\mbox{ and }u_{\tau}:\mathbb{Z}_{\geq 0}\to\mathcal{U}\end{aligned}\right\}

(3)

In general, computing the exact DRS of a dynamic system is challenging [26]. Therefore, most methods in reachability analysis focus on providing over-approximation of DRS [1]. A set $\overline{\mathcal{R}}_{t}\subseteq\mathbb{R}^{n}$ is an over-approximation of the DRS $\mathcal{R}_{t}$ if

\displaystyle\mathcal{R}_{t}\subseteq\overline{\mathcal{R}}_{t}.

The calculation of $\overline{\mathcal{R}}_{t}$ depends on the properties of system dynamics $f$ [27], and types of input $u_{t}$ [5]. In Section 5, we review two approaches to compute $\overline{\mathcal{R}}_{t}$ : Lipschitz bound reachability and interval-based reachability.

2.2 Probabilistic Reachable Set

Refer to caption — Figure 1: An illustration of $\delta$ -PRS at time $t$ . Here $\mathcal{R}_{\delta,t}$ is the $\delta$ -PRS of the stochastic system (1), whose trajectories are in color. $\mathcal{R}_{t}$ is the DRS of the associated deterministic system (2), whose trajectories are in black.

We are interested in characterizing the reachable set of (1). We focus on the setting where the disturbance $w_{t}$ is sub-Gaussian. The sub-Gaussian distribution is quite general and includes a wide range of distributions such as Gaussian, uniform, and any zero-mean distributions with bounded support.

Definition 2.2 (sub-Gaussian).

A random variable $X\in\mathbb{R}^{n}$ is said to be sub-Gaussian with variance proxy $\sigma^{2}$ , denoted as $X\sim subG(\sigma^{2})$ , if $\mathbb{E}(X)=0$ and $X$ satisfies

\mathbb{E}_{X}\left(e^{\lambda\langle\ell,X\rangle}\right)\leq e^{\frac{\lambda^{2}\sigma^{2}}{2}},~{}\forall\lambda\in\mathbb{R},~{}\forall\ell\in\mathcal{S}^{n-1}.

(4)

Assumption 2.

The disturbance $w_{t}\sim subG(\sigma_{t}^{2})$ with some finite $\sigma_{t}\geq 0$ for every $t\geq 0$ .

The theory for deterministic reachability analysis is not effective in handling stochastic disturbances. When $w_{t}$ is unbounded sub-Gaussian noise, the DRS in the sense of (3) is often trivial. To see this, consider the system $X_{t+1}=X_{t}+w_{t}$ where $w_{t}$ is a zero-mean Gaussian noise. The state $X_{t}$ is a Gaussian vector whose possible range is the entire $\mathbb{R}^{n}$ space and thus $\mathcal{R}_{t}=\mathbb{R}^{n}$ . When $w_{t}$ is bounded, the deterministic reachability analysis ignores the statistics of the disturbance and treats it in a worst-case manner, leading to conservative results. In reality, the worst-case realization of stochastic disturbance occurs with extremely low probability and is thus not representative. For these reasons, we turn to the concept of probabilistic reachable set, which refers to the set that any possible trajectory starting from the initial state set $x_{0}\in\mathcal{X}_{0}$ and with the input $u_{\tau}\in\mathcal{U}$ can reach with high probability.

Definition 2.3 ( $\delta$ -PRS).

Consider the stochastic system (1) with the initial set $\mathcal{X}_{0}\subseteq\mathbb{R}^{n}$ and the bounded input set $\mathcal{U}\subseteq\mathbb{R}^{p}$ . Given $\delta\in(0,1]$ and $t\in\mathbb{Z}_{\geq 0}$ , the set $\mathcal{R}_{\delta,t}\subseteq\mathbb{R}^{n}$ is said to be a $\delta$ -probabilistic reachable set ( $\delta$ -PRS) of the system (1) at time $t$ , if for any $x_{0}\in\mathcal{X}_{0}$ and $u_{t}:\mathbb{Z}_{\geq 0}\to\mathcal{U}$ , it holds that

\mathbb{P}\left(X_{t}\in\mathcal{R}_{\delta,t}\right)\geq 1-\delta.

(5)

Apparently, as can be seen from the definition, $\delta$ -PRS is not unique. Indeed, if $\mathcal{R}_{\delta,t}$ is a $\delta$ -PRS, then any $\mathcal{R}_{\delta,t}^{\prime}\supseteq\mathcal{R}_{\delta,t}$ is also a $\delta$ -PRS. We say $\mathcal{R}_{\delta,t}$ is a tighter $\delta$ -PRS than $\mathcal{R}_{\delta,t}^{\prime}$ if $\mathcal{R}_{\delta,t}\subseteq\mathcal{R}_{\delta,t}^{\prime}$ . An illustration of DRS and $\delta$ -PRS is in Figure 1. In many applications like safety-critical control, it is desirable to have a tight $\delta$ -PRS, and a loose $\delta$ -PRS can result in very conservative strategies. Therefore, the main goal in probabilistic reachability is to find a tight $\delta$ -PRS.

Problem 1.

Find an as tight as possible $\delta$ -PRS $\mathcal{R}_{\delta,t}$ of the stochastic system (1) under Assumptions 1 and 2.

2.3 Separation Strategy and Main Challenge

The trajectories of the stochastic system (1) are driven by both deterministic terms ( $f$ , $u_{t}$ ) and stochastic disturbances ( $w_{t}$ ). The impacts of these two parts on the trajectories are relatively independent and can be addressed separately. Building on this intuition, we proposed a strategy termed the separation strategy in our recent work [24] for probabilistic reachability analysis for continuous-time systems.

This separation strategy turns out to be extendable to the discrete-time setting. We say $X_{t}$ and $x_{t}$ are associated trajectories if they have the same initial state $x_{0}$ and the same input $u_{t}$ . The separation strategy states that the $\delta$ -PRS can be separated into a deterministic part encoded by the DRS of (2), and a stochastic part, which is characterized as the distance $\|X_{t}-x_{t}\|$ between the associated trajectories, as formalized below.

Proposition 1.

Consider the stochastic system (1) with its associated deterministic system (2). Given $\mathcal{X}_{0}\subseteq\mathbb{R}^{n}$ , $\mathcal{U}\subseteq\mathbb{R}^{p}$ , let $X_{t}$ and $x_{t}$ be the associated trajectories with inputs $u_{t}\in\mathcal{U}$ and the initial state $x_{0}\in\mathcal{X}_{0}$ . If for any $t\geq 0$ , $x_{0}\in\mathcal{X}_{0}$ , $u_{t}\in\mathcal{U}$ and $\delta>0$ , $\exists r_{\delta,t}>0$ such that

\mathbb{P}\left(\|X_{t}-x_{t}\|\leq r_{\delta,t}\right)\geq 1-\delta,

(6)

then for any over-approximation $\overline{\mathcal{R}}_{t}$ of the DRS $\mathcal{R}_{t}$ of (2),

\overline{\mathcal{R}}_{t}\oplus\mathcal{B}^{n}(r_{\delta,t},0)

(7)

is a $\delta$ -PRS of the system (1).

{pf}

Let $X_{t}$ be any trajectory of (1) associated with a trajectory $x_{t}$ of (2), then, by (6), $X_{t}\in\mathcal{B}^{n}(r_{\delta,t},x_{t})$ with probability at least $1-\delta$ . By the definition of $\overline{\mathcal{R}}_{t}$ , $x_{t}\in\overline{\mathcal{R}}_{t}$ . Therefore, by the definition of the Minkowski sum [28], with probability at least $1-\delta$ ,

X_{t}\in\overline{\mathcal{R}}_{t}\oplus\mathcal{B}^{n}(r_{\delta,t},0),

which completes the proof. We term the difference $\|X_{t}-x_{t}\|$ between associated trajectories stochastic deviation. This separation strategy decomposes the probabilistic reachability analysis problem into two parts: approximate the DRS of (2) and estimate the probabilistic bound $r_{\delta,t}$ of the stochastic deviation. Once a bound $r_{\delta,t}$ of the stochastic deviation is obtained, one can combine it with any existing deterministic reachability method to approximate the $\delta$ -PRS.

It is apparent that the size of the $\delta$ -PRS $\overline{\mathcal{R}}_{t}\oplus\mathcal{B}^{n}(r_{\delta,t},0)$ increases with $r_{\delta,t}$ . Therefore, to ensure that $\overline{\mathcal{R}}_{t}\oplus\mathcal{B}^{n}(r_{\delta,t},0)$ is not an overly-conservative $\delta$ -PRS of (1), it is crucial to establish a probabilistic bound $r_{\delta,t}$ for the stochastic deviation that is as tight as possible. This is the main challenge addressed in this paper.

Problem 2 (Stochastic Deviation).

Establish an as tight as possible probabilistic bound $r_{\delta,t}$ of the stochastic deviation $\|X_{t}-x_{t}\|$ associated with systems (1)-(2) under Assumptions 1 and 2.

3 Expectation Bound and Limitations

Our first attempt for addressing Problem 2 is to adapt the expectation bound in our recent work [24] to discrete-time systems. By applying the classical Markov inequality, the expectation bound translates into the probabilistic bound on stochastic deviation $\|X_{t}-x_{t}\|$ . In this section, we present this approach and highlight its limitations.

3.1 Expectation Bound on Stochastic Deviation

We employ a similar technique to that in [24, 29] to analyze the stochastic deviation $\|X_{t}-x_{t}\|$ and establish bounds on $\mathbb{E}(\|X_{t}-x_{t}\|^{2})$ . The result is presented in the following proposition.

Proposition 2.

Consider the stochastic system (1) and its associated deterministic system (2). Suppose that Assumptions 1 and 2 hold. Given $x_{0}\in\mathbb{R}^{n}$ and $u_{t}:\mathbb{Z}_{\geq 0}\to\mathbb{R}^{p}$ be an input sequence. Let $X_{t}$ (resp. $x_{t}$ ) be the trajectory of (1) (resp. (2)) with the input $u_{t}$ and the initial state $x_{0}$ . Then,

\mathbb{E}\left(\|X_{t}-x_{t}\|^{2}\right)\leq n\Psi_{t}

(8)

where

\Psi_{t}=\psi_{t-1}\sum_{k=0}^{t-1}\sigma_{k}^{2}\psi_{k}^{-1},\quad\psi_{t}=\prod_{k=0}^{t}L_{k}^{2}.

(9)

{pf}

Define $v_{t}=X_{t}-x_{t}$ , $\beta_{t}=f(X_{t},u_{t},t)-f(x_{t},u_{t},t)$ and $V(\cdot)=\|\cdot\|^{2}$ , then by (1) and (2) we have

v_{t+1}=\beta_{t}+w_{t}.

(10)

It follows that

V(v_{t+1})=V(\beta_{t})+V(w_{t})+2\langle\beta_{t},w_{t}\rangle.

(11)

Take the expectation of (11). By Assumption 1, $\mathbb{E}\left(V(\beta_{t})\right)\leq L_{t}^{2}\mathbb{E}\left(v_{t}\right)$ . Since $w_{t}$ is independent from $\beta_{t}$ , $\mathbb{E}\left(\langle\beta_{t},w_{t}\rangle\right)=0$ . Let $w_{t}=\begin{bmatrix}w_{t,1}&\cdots&w_{t,n}\end{bmatrix}^{\top}$ , then by Definition 2.2, each entry $w_{t,i}$ of $w_{t}$ is sub-Gaussian with variance proxy $\sigma_{t}^{2}$ . Therefore,

\begin{split}\mathbb{E}\left(V(v_{t+1})\right)&\leq L_{t}^{2}\mathbb{E}\left(V(v_{t})\right)+\sum_{i=1}^{n}\mathbb{E}\left(w_{t,i}^{2}\right)+0\\ &=L_{t}^{2}\mathbb{E}\left(V(v_{t})\right)+\sum_{i=1}^{n}{\text{Var}(w_{t,i})}\\ &\leq L_{t}^{2}\mathbb{E}\left(V(v_{t})\right)+n\sigma_{t}^{2},~{}V(v_{0})=0,\end{split}

(12)

where the last “ $\leq$ ” uses the classical result that $\text{Var}(X)\leq\sigma^{2}$ if $X\in\mathbb{R}$ and $X\sim subG(\sigma^{2})$ [30]. Solving the linear difference inequality (12) we obtain

\mathbb{E}\left(V(v_{t})\right)\leq n\Psi_{t},

(13)

where $\Psi_{t}$ is defined in (9). This completes the proof.

3.2 Limitation of The Expectation Bound

The probabilistic bound directly derived from the expectation bound (8) can be loose. By Markov’s Inequality, for any $r>0$ ,

\begin{split}\mathbb{P}\left(\|X_{t}-x_{t}\|\leq r\right)=&\mathbb{P}\left(\|X_{t}-x_{t}\|^{2}\leq r^{2}\right)\\ \geq&1-\frac{\mathbb{E}\left(\|X_{t}-x_{t}\|^{2}\right)}{r^{2}}.\end{split}

(14)

Let $r=\sqrt{\frac{n}{\delta}\Psi_{t}}$ and apply Proposition 2 to (14), then

\mathbb{P}\left(\|X_{t}-x_{t}\|\leq\sqrt{\frac{n\Psi_{t}}{\delta}}\right)\geq 1-\delta.

(15)

Given a probability level $1-\delta$ , (15) provides a probabilistic bound on stochastic deviation that has $\mathcal{O}(\sqrt{1/\delta})$ dependence on $\delta$ . This is much worse than that for linear systems. To see this, consider a linear dynamics $X_{t+1}=AX_{t}+Bu_{t}+w_{t}$ which corresponds to (1) with $f(x,u,t)=Ax+Bu$ . It satisfies Assumption 1 with $L_{t}\equiv\|A\|$ , for every $t\geq 0$ . For associated $X_{t}$ and $x_{t}$ , by linearity,

\begin{split}&X_{t+1}-x_{t+1}=A(X_{t}-X_{t})+w_{t},\quad X_{0}-x_{0}=0\\ &\Rightarrow\,X_{t}-x_{t}=\sum_{k=0}^{t-1}A^{k}w_{t-1-k}.\end{split}

(16)

By the sum-up property of independent sub-Gaussian variables [31], (16) implies that $X_{t}-x_{t}$ is sub-Gaussian with variance proxy

\sigma^{2}(X_{t})=\sum_{k=0}^{t-1}\|A\|^{2k}\sigma_{k}^{2}=\Psi_{t}.

(17)

Sub-Gaussian random variables enjoy a strong norm concentration property stated below. Note that the dependence $\mathcal{O}(\sqrt{n})$ and $\mathcal{O}(\sqrt{\log(1/\delta)})$ in Lemma 3.1 are tight for sub-Gaussian distributions but the coefficients can be improved. We refer to [31, Chapter 1.4][24] for more details.

Lemma 3.1 (Norm Concentration).

Let $X\in\mathbb{R}^{n}$ be a sub-Gaussian random variable with variance proxy $\sigma^{2}$ , then with any $\delta\in(0,1)$ and $\varepsilon\in(0,1)$ ,

\|X\|\leq\sqrt{\sigma^{2}(\varepsilon_{1}n+\varepsilon_{2}\log(1/\delta))}

(18)

holds with probability at least $1-\delta$ , where

\varepsilon_{1}=\frac{2\log(1+2/\varepsilon)}{(1-\varepsilon)^{2}},~{}\varepsilon_{2}=\frac{2}{(1-\varepsilon)^{2}}.

(19)

Plugging (17) into Lemma 3.1, we conclude that when the system (1) is linear, the probabilistic bound on the stochastic deviation becomes

r_{\delta,t}^{(lin)}=\sqrt{\Psi_{t}(\varepsilon_{1}n+\varepsilon_{2}\log(1/\delta))},

(20)

which has an $\mathcal{O}(\sqrt{\log(1/\delta))}$ dependence on $\delta$ . When $\delta$ is sufficiently small (e.g., $\delta=10^{-6}$ ), which is crucial for safety-critical systems, $\sqrt{\log(1/\delta))}$ is significantly smaller than the term $\sqrt{1/\delta}$ in (15) (e.g., $2.6$ v.s. $10^{3}$ ).

Thus, a significant gap between the result (15) for nonlinear dynamics and the probabilistic bound (20) for linear dynamics exists. This points to the question: is the gap fundamental or an artifact of the analysis?

4 Probabilistic Bound on Stochastic Deviation

In this section, we answer the aforementioned question by establishing a probabilistic bound for the stochastic deviation $\|X_{t}-x_{t}\|$ of order $\mathcal{O}(\sqrt{\log(1/\delta)})$ for general nonlinear stochastic systems (1) under Assumptions 1 and 2. We further show the tightness of our result and its advantages over the worst-case analysis for this problem.

4.1 Motivation: Moment Generating Function

The limitation of the expectation bound (8) primarily lies in the quadratic Lyapunov function $V_{t}=\|X_{t}-x_{t}\|^{2}$ . The analysis focuses only on the evolution of the second order moment $\mathbb{E}(\|X_{t}-x_{t}\|^{2})$ , so its upper bound at best guarantees a probabilistic bound for $\|X_{t}-x_{t}\|$ of order $\mathcal{O}(\sqrt{1/\delta})$ via Markov inequality. In contrast, in Definition 2.2, the function on the left-hand side of (4) captures arbitrarily high-order moments of a random variable, and its boundness leads to the norm concentration property of sub-Gaussian variables. This function is known as Moment Generating Function (MGF) [31, Chapter 1.1]

\mathbb{E}_{X}\left(M_{\lambda,\ell}(X)\right):=\mathbb{E}_{X}\left(e^{\lambda\langle\ell,X\rangle}\right),\quad\ell\in\mathcal{S}^{n-1}.

(21)

Thus, to establish a tight probabilistic bound on the stochastic deviation, a potential approach is to upper bound the MGF of $X_{t}-x_{t}$ so that $X_{t}-x_{t}$ is sub-Gaussian and Lemma 3.1 can be applied. Unfortunately, this is infeasible. For associated $X_{t}$ and $x_{t}$ , $\mathbb{E}(X_{t})\neq x_{t}$ when the systems are nonlinear, because

\begin{split}&\mathbb{E}(X_{t+1})=\mathbb{E}(f(X_{t},u_{t},t))+\mathbb{E}(w_{t})\\ =&\mathbb{E}(f(X_{t},u_{t},t))\neq f(\mathbb{E}(X_{t}),u_{t},t),\end{split}

where the last “ $\neq$ ” can be improved to “ $=$ ” only if $f$ is linear. Therefore, the upper bound on $\mathbb{E}\left(M_{\lambda,\ell}(X_{t}-x_{t})\right)$ is influenced by the value of $\mathbb{E}\left(\langle\ell,X_{t}-x_{t}\rangle\right)$ [32, Chapter 2.5], which can be large for some $\ell$ when $\mathbb{E}(\|X_{t}-x_{t}\|)$ is large.

4.2 Averaged Moment Generating Function

Motivated by the advantages and limitations of MGF, we resort to a modified version of MGF termed Averaged Moment Generating Function (AMGF) in our method.

Definition 4.1 (AMGF).

Given $\lambda\in\mathbb{R}$ , the Averaged Moment Generating Function (AMGF) is defined as

\displaystyle\mathbb{E}_{X}(\Phi_{n,\lambda}(X))=\mathbb{E}_{X}\mathbb{E}_{\ell\sim\mathcal{S}^{n-1}}\left(e^{\lambda\langle\ell,X\rangle}\right).

(22)

The AMGF was recently proposed in [33] in the field of Sampling Theory. It has a natural interpretation as the average of MGF over the unit sphere $\ell\in\mathcal{S}^{n-1}$ . AMGF can also be viewed as a special MGF while replacing the exponential energy function $e^{\lambda\langle\ell,x\rangle}$ by its sphere-averaged version $\Phi_{n,\lambda}(x)=\mathbb{E}_{\ell\sim\mathcal{S}^{n-1}}\left(e^{\lambda\langle\ell,x\rangle}\right)$ . This energy function $\Phi_{n,\lambda}(x)$ exhibits several intriguing properties.

Lemma 4.1 (Properties of AMGF).

For $\Phi_{n,\lambda}(x)$ defined in (22), we have

1.

Rotation Invariance: For any $x\in\mathbb{R}^{n}$ and any unitary matrix $Q\in\mathbb{R}^{n\times n}$ ,

$\Phi_{n,\lambda}(x)=\Phi_{n,\lambda}(Qx).$
2.

Monotonicity: For any $x,y\in\mathbb{R}^{n}$ such that $\|x\|\leq\|y\|$ ,

$1\leq\Phi_{n,\lambda}(x)\leq\Phi_{n,\lambda}(y).$
3.

Decoupling: Given $x\in\mathbb{R}^{n}$ and $w\sim subG(\sigma^{2})$ ,

$\mathbb{E}_{w}(\Phi_{n,\lambda}(x+w))\leq e^{\frac{\lambda^{2}\sigma^{2}}{2}}\Phi_{n,\lambda}(x).$ (23)

Moreover, AMGF induces the same norm concentration property as MGF.

Lemma 4.2.

For a random variable $X\in\mathbb{R}^{n}$ , if there exists $\sigma>0$ such that

\mathbb{E}_{x}\left(\Phi_{n,\lambda}(x)\right)\leq e^{\frac{\lambda^{2}\sigma^{2}}{2}},~{}\forall\lambda\in\mathbb{R},

(24)

then for any $\delta\in(0,1)$ , (18) holds with probability at least $1-\delta$ .

We emphasize that even though (24) is a weaker condition than sub-Gaussian, it still guarantees norm concentration property as MGF. To see why Lemma 4.2 holds, define an intermediate random variable $\tilde{X}=QX$ where $Q\sim\mathbb{U}^{n}$ is a random unitary matrix with $\mathbb{U}^{n}$ denoting the set of all the unitary matrices in $\mathbb{R}^{n\times n}$ . Then the AMGF over $X$ is equal to the MGF over $\tilde{X}$ , which means $\tilde{X}$ is sub-Gaussian with variance proxy $\sigma^{2}$ if (24) holds. Norm concentration of $X$ then follows by noticing that the transformation $\tilde{X}=QX$ does not affect the norm.

4.3 Theoretical Analysis

Equipped with the AMGF, we establish a probabilistic bound of order $\mathcal{O}(\sqrt{\log(1/\delta)})$ for the stochastic deviation $\|X_{t}-x_{t}\|$ by bounding the evolution of AMGF $\mathbb{E}(\Phi_{n,\lambda}(X_{t}-x_{t}))$ over time.

Theorem 1.

Consider the stochastic system (1) and its associated deterministic system (2) under Assumptions 1 and 2. Let $X_{\tau}$ be the trajectory of (1) and $x_{\tau}$ be the associated trajectory of (2) with the same initial state $x_{0}$ and input $u_{\tau}:\mathbb{Z}_{\geq 0}\to\mathcal{U}$ . Then, for any $t\geq 0$ , $\delta\in(0,1)$ and $\varepsilon\in(0,1)$ ,

\|X_{t}-x_{t}\|\leq\sqrt{\Psi_{t}(\varepsilon_{1}n+\varepsilon_{2}\log(1/\delta))}

(25)

holds with probability at least $1-\delta$ , where $\Psi_{t}$ is as in (9) and $\varepsilon_{1},\varepsilon_{2}$ are as in (19).

{pf}

We begin with the special case where the Lipschitz constant $L=1$ at any time $t\geq 0$ . Define $v_{t}=X_{t}-x_{t}$ and $\beta_{t}=f(X_{t},u_{t},t)-f(x_{t},u_{t},t)$ . By (1) and (2) we have

v_{t+1}=\beta_{t}+w_{t}.

(26)

By Assumption 1,

\|\beta_{t}\|\leq\|v_{t}\|.

(27)

By Assumption 2 and Lemma 4.1(3), the conditional expectation of $\Phi_{n,\lambda}(v_{t+1})$ can be bounded as

\begin{split}&\mathbb{E}\left(\Phi_{n,\lambda}(v_{t+1})|v_{t}\right)=\mathbb{E}_{w_{t}}\left(\Phi_{n,\lambda}(\beta_{t}+w_{t})\right)\\ &\leq e^{\frac{\lambda^{2}\sigma_{t}^{2}}{2}}\Phi_{n,\lambda}(\beta_{t})\end{split}

(28)

Combining (27) and (28), invoking Lemma 4.1(2), we obtain

\begin{split}&\mathbb{E}\left(\Phi_{n,\lambda}(v_{t+1})|v_{t}\right)\leq e^{\frac{\lambda^{2}\sigma_{t}^{2}}{2}}\Phi_{n,\lambda}(v_{t}).\end{split}

(29)

Taking the expectation over $v_{t}$ on both sides of (29) yields

\mathbb{E}\left(\Phi_{n,\lambda}(v_{t+1})\right)\leq e^{\frac{\lambda^{2}\sigma_{t}^{2}}{2}}\mathbb{E}\left(\Phi_{n,\lambda}(v_{t})\right),~{}\mathbb{E}\left(\Phi_{n,\lambda}(v_{0})\right)=0.

which is a linear difference inequality that points to

\mathbb{E}\left(\Phi_{n,\lambda}(v_{t})\right)\leq e^{\frac{\lambda^{2}\sum_{k=0}^{t-1}\sigma_{k}^{2}}{2}}.

(30)

By Lemma 4.2, it follows that for any $\delta\in(0,1)$

\|X_{t}-x_{t}\|\leq\sqrt{\sum_{k=0}^{t-1}\sigma_{k}^{2}(\varepsilon_{1}n+\varepsilon_{2}\log(1/\delta))}

(31)

holds with probability at least $1-\delta$ , where $\varepsilon_{1},\varepsilon_{2}$ are as in (19). This completes the proof of the special case.

Next, we consider the general cases under Assumption 1. Define $\tilde{X}_{t}=\prod_{k=0}^{t-1}L_{k}^{-1}X_{t}$ and $\tilde{x}_{t}=\prod_{k=0}^{t-1}L_{k}^{-1}x_{t}$ . Then $\tilde{X}_{t}$ satisfies

\begin{split}\tilde{X}_{t+1}&=\prod_{k=0}^{t}L_{k}^{-1}f(\prod_{k=0}^{t-1}L_{k}\tilde{X}_{t})+\prod_{k=0}^{t}L_{k}^{-1}w_{t}\\ &:=g(\tilde{X}_{t})+\tilde{w}_{t}\end{split}

(32)

and similarly, $\tilde{x}_{t}$ satisfies

\tilde{x}_{t+1}=\prod_{k=0}^{t}L_{k}^{-1}f(\prod_{k=0}^{t-1}L_{k}\tilde{x}_{t})=g(\tilde{x}_{t}),

(33)

which has the same drift term as (33). For any $\tilde{x}_{t},\tilde{y}_{t}\in\mathbb{R}^{n}$ , it holds that

\begin{split}&\|g(\tilde{x}_{t})-g(\tilde{y}_{t})\|=\prod_{k=0}^{t}L_{k}^{-1}\|f(\prod_{k=0}^{t-1}L_{k}\tilde{x}_{t})-f(\prod_{k=0}^{t-1}L_{k}\tilde{y}_{t})\|\\ &\leq\prod_{k=0}^{t}L_{k}^{-1}\cdot L_{t}\cdot\prod_{k=0}^{t-1}L_{k}\|\tilde{x}_{t}-\tilde{y}_{t}\|=\|\tilde{x}_{t}-\tilde{y}_{t}\|,\end{split}

meaning the scaled systems (32) and (33) satisfy Assumption 1 with $\tilde{L}=1$ . Also, $\tilde{w}_{t}$ satisfies Assumption 2 with variance proxy $\tilde{\sigma}^{2}_{t}=\prod_{k=0}^{t}L_{k}^{-2}\sigma_{t}^{2}$ due to the sum-up property of independent sub-Gaussian noise [31]. Thus the result of the special case can be applied.

By (31), with probability at least $1-\delta$ ,

\|\tilde{X}_{t}-\tilde{x}_{t}\|\leq\sqrt{\sum_{k=0}^{t-1}\sigma^{2}_{k}\psi_{k}^{-1}(\varepsilon_{1}n+\varepsilon_{2}\log(1/\delta))}.

Because of the relation $\|X_{t}-x_{t}\|=\prod_{k=0}^{t-1}L_{k}\|\tilde{X}_{t}-\tilde{x}_{t}\|$ , the probabilistic bound (25) follows. This completes the proof.

Remark 4.1.

When Assumptions 1 and 2 hold with time-invariant $L_{t}\equiv L$ and $\sigma_{t}\equiv\sigma$ , $\psi_{t}$ defined in (9) becomes $\psi_{t}=L^{2t}$ , and the result of Theorem 1 reduces to

\|X_{t}-x_{t}\|\leq\sqrt{\frac{\sigma^{2}(L^{2t}-1)}{L^{2}-1}(\varepsilon_{1}n+\varepsilon_{2}\log(1/\delta))}

(34)

hold with probability at least $1-\delta$ .

Remark 4.2.

In this work, we use Euclidean norm to measure the stochastic deviation $\|X_{t}-x_{t}\|$ for simplicity, but the results in Theorem 1 can be easily extended to the case with weighted Euclidean norm through a coordinate transformation [24, Section V-D].

Theorem 1 establishes $r_{\delta,t}=\sqrt{\Psi_{t}(\varepsilon_{1}n+\varepsilon_{2}\log(1/\delta))}$ as a probabilistic bound on stochastic deviation. The bound is comprised of a scaling term $\sqrt{\Psi_{t}}$ , a dimensional term $\sqrt{n}$ and a probabilistic term $\sqrt{\log(1/\delta)}$ . The scaling term is determined by the Lipschitz constants $L_{t}$ and the disturbance variance proxy $\sigma_{t}^{2}$ . It reveals the fluctuation of the system. Especially, when the system is linear, $\Psi_{t}$ is exactly the variance proxy of $X_{t}-x_{t}$ . The term $\sqrt{n}$ reveals the dimensional dependence of $r_{\delta,t}$ , which is the same as the dimensional dependence of sub-Gaussian vectors shown in Lemma 3.1. The term $\sqrt{\log(1/\delta)}$ essentially represents the fast-decay feature of the distribution of $X_{t}-x_{t}$ . Notice that this term means the probability level $\delta$ has an $\mathcal{O}(e^{-r_{\delta,t}^{2}})$ dependence on the probabilistic bound $r_{\delta,t}$ , so $\mathbb{P}\left(\|X_{t}-x_{t}\|>r_{\delta,t}\right)$ decreases with $r_{\delta,t}$ even faster than the exponential rate. This phenomenon leads to the following question: Is this the best rate one can obtain? In other words, is $r_{\delta,t}$ derived from Theorem 1 tight?

4.4 Tightness of The Bound

We next show that the probabilistic bound in Theorem 1 is tight under Assumptions 1 and 2 and it is impossible to achieve better probabilistic bounds than (25) without additional assumptions. In particular, we show that the tightness is precisely captured by linear systems satisfying Assumptions 1 and 2.

Consider the linear stochastic system

x_{t+1}=Ax_{t}+Bu_{t}+w_{t},~{}~{}w_{t}\sim subG(\sigma^{2}),

that satisfies Assumptions 1 and 2 with $L\equiv\|A\|$ and $\sigma_{t}\equiv\sigma$ . Applying the time-invariant $L$ and $\sigma$ into (9), we get that $\Psi_{t}=\frac{\sigma^{2}(L^{2t}-1)}{L^{2}-1}$ . Therefore, by Theorem 1, we get that with probability at least $1-\delta$ ,

\|X_{t}-x_{t}\|\leq\sqrt{\frac{\sigma^{2}(L^{2t}-1)}{L^{2}-1}(\varepsilon_{1}n+\varepsilon_{2}\log(1/\delta))}.

(35)

This is exactly the same as the $r_{\delta,t}^{(lin)}$ in (20). From [32, Chapter 4.6], we know that at least for linear systems satisfying Assumptions 1 and 2, the dependence on $\delta$ and $n$ can not be improved further without additional assumptions. Therefore, $r_{\delta,t}$ is tight in our problem setting.

4.5 Comparison with Worst-Case Analysis

The worst-case analysis is a popular method for understanding the reachability when the disturbance is bounded [4]. We point out that it can also be applied to stochastic systems to obtain a probabilistic bound on the stochastic deviation under unbounded stochastic disturbance. As explained below, this is achieved by viewing unbounded sub-Gaussian disturbance as a bounded disturbance with high probability, reminiscent of conformal prediction [34, 35]. However, the result is still significantly more conservative than our bound in Theorem 1.

By norm concentration (Lemma 3.1) of sub-Gaussian random variables, for any $\delta\in(0,1)$ ,

\mathbb{P}\left(\|w_{\tau}\|\leq\sqrt{\sigma_{\tau}^{2}(\varepsilon_{1}n+\varepsilon_{2}\log\frac{t}{\delta})}\right)\geq 1-\frac{\delta}{t}.

(36)

Applying union bound to (36) over $\tau=0,\dots,t-1$ , we obtain

\mathbb{P}\left(\|w_{\tau}\|\leq\sqrt{\sigma_{\tau}^{2}(\varepsilon_{1}n+\varepsilon_{2}\log\frac{t}{\delta})},~{}\forall\tau<t\right)\geq 1-\delta.

(37)

This means that the disturbance is bounded with

\|w_{\tau}\|\leq b_{\tau}:=\sqrt{\sigma_{\tau}^{2}(\varepsilon_{1}n+\varepsilon_{2}\log\frac{t}{\delta})}

(38)

for all $\tau<t$ with probability at least $1-\delta$ . A worst-case probabilistic bound on $\|X_{t}-x_{t}\|$ can be established by assuming this bound. More specifically, by the system dynamics (1)-(2) and the triangular inequality,

\begin{split}\|X_{\tau+1}-x_{\tau+1}\|\leq&\|f(X_{\tau},u_{\tau},\tau)-f(x_{\tau},u_{\tau},\tau)\|+\|w_{\tau}\|\\ \leq&L_{\tau}\|X_{\tau}-x_{\tau}\|+b_{\tau}\end{split}

(39)

for all $\tau<t$ with probability at least $1-\delta$ . It follows that

\|X_{t}-x_{t}\|\leq\sqrt{\psi_{t-1}}\sum_{k=0}^{t-1}b_{k}\sqrt{\psi_{k}^{-1}},

(40)

where $\psi_{t}$ is as in (9). Plugging (38) into (40), we conclude that, for any $\delta\in(0,1)$ ,

\|X_{t}-x_{t}\|\leq\sqrt{\psi_{t-1}}\sum_{k=0}^{t-1}\sigma_{k}\sqrt{\psi_{k}^{-1}(\varepsilon_{1}n+\varepsilon_{2}\log\frac{t}{\delta})}.

(41)

holds with probability at least $1-\delta$ .

This bound (41) derived using worst-case analysis is substantially more conservative than that in Theorem 1. First, since $\sqrt{\Psi_{t}}\leq\sqrt{\psi_{t-1}}\sum_{k=0}^{t-1}\sigma_{k}\sqrt{\psi_{k}^{-1}}$ , (41) is always worse than (25). To see more clearly the gap, consider the case when $L_{t}\equiv L$ and $\sigma_{t}\equiv\sigma$ . In this case, (41) reduces to $\|X_{t}-x_{t}\|\leq\frac{L^{t}-1}{L-1}\sqrt{\sigma^{2}(\varepsilon_{1}n+\varepsilon_{2}\log\frac{t}{\delta})}$ , which is much worse than (34) when $L$ is close to 1. In particular, when $L=1$ , $\sqrt{\psi_{t}}\sum_{k=1}^{t}\sigma_{k-1}\sqrt{\psi_{t}^{-1}}=\sigma t$ , significantly larger than $\sqrt{\Psi_{t}}=\sigma\sqrt{t}$ as $t$ increases.

5 Probabilistic Reachable Set

With the tight probabilistic bound (25) on stochastic deviation, we can integrate it with any existing methods for approximating the DRS for (2), by leveraging the separation strategy outlined in Proposition 1, to obtain a practical $\delta$ -PRS for (1).

Theorem 2.

Consider the stochastic system (1) with the initial set $\mathcal{X}_{0}\subseteq\mathbb{R}^{n}$ and the input set $\mathcal{U}\subseteq\mathbb{R}^{p}$ . Suppose that Assumptions 1 and 2 hold. Then given any over-approximation $\overline{\mathcal{R}}_{t}$ of the DRS of the deterministic system (2), $\mathcal{R}_{\delta,t}=\overline{\mathcal{R}}_{t}\oplus\mathcal{B}^{n}\left(r_{\delta,t},0\right)$ is a $\delta$ -PRS of the system (1), where $r_{\delta,t}=\sqrt{\Psi_{t}(\varepsilon_{1}n+\varepsilon_{2}\log(1/\delta))}$ with $\Psi_{t}$ as (9) and $\varepsilon_{1},\varepsilon_{2}$ as (19).

{pf}

Plug the bound stated in Theorem 1 into (6) in Proposition 1, then Theorem 2 follows.

Since $r_{\delta,t}=\sqrt{\Psi_{t}(\varepsilon_{1}n+\varepsilon_{2}\log(1/\delta))}$ is proved to have tight dependence on the probability level $1-\delta$ , $\mathcal{R}_{\delta,t}$ in Theorem 2 is also tight with respect to $\delta$ . In scenarios like safety-critical control, $\mathcal{R}_{\delta,t}$ only scales at the rate of $\mathcal{O}(\sqrt{\log(1/\delta)})$ , and is thus suitable for harsh probability-level requirements that need an extremely small $\delta$ . As the dependence of $r_{\delta,t}$ on $\delta$ and $n$ cannot be further improved, the tightness of $\mathcal{R}_{\delta,t}$ merely depends on the tightness of the over-approximation $\overline{\mathcal{R}}_{t}$ on the DRS. It becomes loose when $\overline{\mathcal{R}}_{t}$ is conservative.

Besides tightness, computational efficiency is also an important consideration. The computational cost of $\mathcal{R}_{\delta,t}$ in Theorem 2 arises from two main sources: computing $\overline{\mathcal{R}}_{t}$ and realizing the Minkowski sum $\oplus$ with a ball. Although computing the Minkowski sum in a parameterized form is challenging and efficient algorithms are only available for ellipsoids and polyhedra [36, 28], in practice we only need an efficient membership oracle to determine whether $X_{t}\in\overline{\mathcal{R}}_{t}\oplus\mathcal{B}^{n}\left(r_{\delta,t},0\right)$ . This can be achieved by checking $\min_{y\in\overline{\mathcal{R}}_{t}}\|y-X_{t}\|\leq r_{\delta,t}$ , which is simpler and is a convex optimization when $\overline{\mathcal{R}}_{t}$ is convex. Therefore, the computational efficiency primarily depends on how $\overline{\mathcal{R}}_{t}$ is generated.

Following the above discussion on the tightness and computational efficiency, we conclude that if the reachability of the associated deterministic system (2) is well studied, then $\delta$ -PRS can be easily obtained by Theorem 2. In the following subsections, we exemplify this conclusion with two popular types of reachability study for deterministic systems. These methods are scalable and result in convex $\overline{\mathcal{R}}_{t}$ , rendering efficient algorithms for probabilistic reachability analysis.

5.1 Lipschitz Bound Reachability

In this section, we review a computationally efficient method for reachability using Lipschitz bound of the dynamics [9]. We start with an assumption.

Assumption 3.

For system (2), there exist constants $L_{d},\rho\geq 0$ such that, for every $x,u,t\in\mathbb{R}^{n}\times\mathcal{U}\times\mathbb{Z}_{\geq 0}$ ,

1.

$\|D_{x}f(x,u,t)\|_{\mathbb{X}}\leq L_{d}$ , and
2.

$\|D_{u}f(x,u,t)\|_{\mathbb{X},\mathbb{U}}\leq\rho$ ,

where $\|\cdot\|_{\mathbb{X}}$ is a norm on $\mathbb{R}^{n}$ , $\|\cdot\|_{\mathbb{U}}$ is a norm on $\mathbb{R}^{p}$ , and $\|\cdot\|_{\mathbb{X},\mathbb{U}}$ is the induced norm on $\mathbb{R}^{n\times p}$ .

The norms $\|\cdot\|_{\mathbb{X}}$ and $\|\cdot\|_{\mathbb{U}}$ can be chosen different from Euclidean norm to ensure the tightest over-approximation of the reachable sets of the deterministic system (2). Suppose that Assumption 3 holds for the discrete-time system (2) and $t\mapsto x^{*}_{t}$ is a trajectory of (2) with an input $t\mapsto u^{*}_{t}$ . Define $\mathcal{B}_{\mathbb{X}}(r,y)=\{x\in\mathbb{R}^{n}:\|x-y\|_{\mathbb{X}}\leq r\}$ . Then, for every $x_{0}\in\mathcal{B}_{\mathbb{X}}(r_{1},x^{*}_{0})$ and every input $t\mapsto u_{t}$ with $u_{t}\in\mathcal{B}_{\mathbb{U}}(r_{2},u^{*}_{t})$ , we have

\displaystyle\overline{R}_{t}=\mathcal{B}_{\mathbb{X}}\left(L_{d}^{t}r_{1}+\rho(\tfrac{L_{d}^{t-1}-1}{L_{d}-1})r_{2},x^{*}_{t}\right).

(42)

We can use the Lipschitz bound over-approximation (42) to construct $\delta$ -PRS for the stochastic system (1).

Proposition 5.1 (Lipschitz bound Reachability).

Consider the stochastic system (1) with the associated deterministic system (2) satisfying Assumptions 1, 2, and 3. Let $t\mapsto x^{*}_{t}$ be a trajectory of (2) with an input $t\mapsto u^{*}_{t}$ and $t\mapsto X_{t}$ be a trajectory of (2) starting from $x_{0}\in\mathcal{B}_{\mathbb{X}}(r_{1},x^{*}_{0})$ with an input $t\mapsto u_{t}\in\mathcal{B}_{\mathbb{U}}(r_{2},u^{*}_{t})$ . Then, with probability at least $1-\delta$ ,

\displaystyle X_{t}\in\mathcal{B}_{\mathbb{X}}\left(L_{d}^{t}r_{1}+\rho(\tfrac{L_{d}^{t-1}-1}{L_{d}-1})r_{2},x^{*}_{t}\right)\oplus\mathcal{B}^{n}\left(r_{\delta,t},0\right)

where $r_{\delta,t}=\sqrt{\Psi_{t}(\varepsilon_{1}n+\varepsilon_{2}\log(1/\delta))}$ with $\Psi_{t}$ as (9) and $\varepsilon_{1},\varepsilon_{2}$ as (19).

{pf}

The proof is a straightforward combination of Lipschitz bound over-approximation (42) and Theorem 2.

5.2 Interval-Based Reachability

Interval analysis provides a framework for estimating the propagation of uncertainties in discrete-time systems [11]. The main idea of interval-based reachability is to embed the dynamical system into a higher dimensional space using a suitable inclusion function. The map $\left[\begin{smallmatrix}\underline{\mathsf{F}}\\ \overline{\mathsf{F}}\end{smallmatrix}\right]:\mathbb{R}^{2n}\times\mathbb{R}^{2p}\times\mathbb{Z}_{\geq 0}\to\mathbb{R}^{2n}$ is an inclusion function for $f$ , if, for every $z,w\in[\underline{x},\overline{x}]\times[\underline{u},\overline{u}]$ and every $t\in\mathbb{Z}_{\geq 0}$ ,

\displaystyle\underline{\mathsf{F}}(\underline{x},\overline{x},\underline{u},\overline{u},t)\leq f(z,w,t)\leq\overline{\mathsf{F}}(\underline{x},\overline{x},\underline{u},\overline{u},t).

Many automated approaches exist for finding an inclusion function for $f$ . We refer to [37, Section IV.B] for a detailed discussion on these approaches and to [38] for a toolbox for computing inclusion functions. Given an interval initial configuration $\mathcal{X}_{0}=[\underline{x}_{0},\overline{x}_{0}]$ and an interval input set $\mathcal{U}=[\underline{u},\overline{u}]$ , the embedding system of (2) associated with the inclusion function $\mathsf{F}$ is given by

\displaystyle\begin{bmatrix}\underline{x}_{t+1}\\ \overline{x}_{t+1}\end{bmatrix}=\begin{bmatrix}\underline{\mathsf{F}}(\underline{x}_{t},\overline{x}_{t},\underline{u},\overline{u},t)\\ \overline{\mathsf{F}}(\underline{x}_{t},\overline{x}_{t},\underline{u},\overline{u},t)\end{bmatrix}.

(43)

Let $\left[\begin{smallmatrix}\underline{x}_{t}\\ \overline{x}_{t}\end{smallmatrix}\right]$ be the trajectory of the embedding system (43) starting from $\left[\begin{smallmatrix}\underline{x}_{0}\\ \overline{x}_{0}\end{smallmatrix}\right]$ . Then, an over-approximation of the deterministic reachable set of (2) is

\displaystyle\overline{\mathcal{R}}_{t}=[\underline{x}_{t},\overline{x}_{t}].

(44)

This over-approximation of reachable sets can be used in Theorem 2 to construct $\delta$ -PRS of the system (1).

Proposition 5.2 (Interval-based Reachability).

Consider the stochastic system (1) with the associated deterministic system (2) satisfying Assumptions 1 and 2. Let $t\mapsto X_{t}$ be a trajectory of the stochastic system (1) starting from $x_{0}\in[\underline{x}_{0},\overline{x}_{0}]$ with an input $t\mapsto u_{t}\in[\underline{u},\overline{u}]$ and $\left[\begin{smallmatrix}\underline{\mathsf{F}}\\ \overline{\mathsf{F}}\end{smallmatrix}\right]$ be an inclusion function for $f$ , Then with probability at least $1-\delta$ ,

\displaystyle X_{t}\in[\underline{x}_{t},\overline{x}_{t}]\oplus\mathcal{B}^{n}\left(r_{\delta,t},0\right)

where $\left[\begin{smallmatrix}\underline{x}_{t}\\ \overline{x}_{t}\end{smallmatrix}\right]$ is the trajectory of the embedding system (43) starting from $\left[\begin{smallmatrix}\underline{x}_{0}\\ \overline{x}_{0}\end{smallmatrix}\right]$ and $r_{\delta,t}=\sqrt{\Psi_{t}(\varepsilon_{1}n+\varepsilon_{2}\log(1/\delta))}$ with $\Psi_{t}$ as (9) and $\varepsilon_{1},\varepsilon_{2}$ as (19).

{pf}

The proof is a straightforward combination of interval-case over-approximation (44) and Theorem 2.

6 Numerical experiments

6.1 Linear System

We first consider a linear example to validate the tightness of our bound (25) on the stochastic deviation. Consider a simple linear dynamics

\begin{split}X_{t+1}&=-0.93I_{n}X_{t}dt+w_{t}\\ &:=AX_{t}+w_{t},\end{split}

(45)

with $w_{t}\sim N(0,0.2I_{n})$ and initialized at $X_{0}=0$ . The system (45) satisfies Assumption 1 with $L_{t}\equiv L=\|A\|=0.93$ and Assumption 2 with $\sigma_{t}\equiv\sigma=\sqrt{0.2}$ . By linearity, $X_{t}$ follows a zero-mean Gaussian distribution whose covariance $\text{cov}(X_{t})$ can be computed using the sum-up property of zero-mean Gaussian variables. The trajectory of the deterministic dynamics associated with (45) starting from $x_{0}=0$ is $x_{t}\equiv 0$ .

To illustrate the bound (25), we simulate 5000 independent trajectories of (45) with $n=2$ over the time horizon $t\leq 15$ and compute the deviation associated with each trajectory, as depicted in Figure 3. These trajectories are compared with our probabilistic bound (25) with $\delta=10^{-3}$ and $\varepsilon=1/16$ . Figure 3 shows that all the trajectories satisfy the bound $r_{\delta,t}$ as expected. Moreover, the bound (41) derived by the worst-case analysis is put in the right sub-figure of Figure 3 to compare with our probabilistic bound 25. It is clear that worst-case analysis produces a much more conservative probabilistic bound on stochastic deviation.

By Theorem 1, the square of our bound (25), denoted as $r_{\delta,t}^{2}$ , grows linearly with $\log(1/\delta)$ and $n$ , as illustrated in Figure 4. To verify the tightness of these dependencies, we compare them with those obtained through Monte-Carlo simulation. In particular, for each choice of $\delta$ and $n$ , we simulate $10^{7}$ independent trajectories of (45) and compute the associated value of $\|X_{t}-x_{t}\|$ for each trajectory. We follow a standard approach [39] and estimate the high probability bound $\hat{r}_{\delta,t}$ of the stochastic deviation as the $\delta$ -th largest $\|X_{t}-x_{t}\|$ (e.g., top 1% if $\delta=10^{-2}$ ). The results, shown in Figure 4, imply that $\hat{r}_{\delta,t}^{2}$ also grows linearly with $\log(1/\delta)$ and $n$ , consistent with our theoretical bound (25).

6.2 Cobweb Supply-Demand Model

We next consider a Cobweb Supply-Demand model [40] with nonlinear dynamics. This model describes the price-quantity relationship of an item in a nearly-saturated single market such as electronics [41]. For a certain item, the model is described as

\begin{split}p_{t+1}&=a-b\log(1+q_{t})+w_{t,1}\\ q_{t+1}&=cp_{t+1}-d+w_{t,2}\end{split}

(46)

where $p_{t}$ is the price, $q_{t}$ is the quantity of the item, $a,~{}b,~{}c,~{}d$ are parameters determined by the market, and $w_{t,1},~{}w_{t,2}$ are disturbance brought by the market volatility. The model in the form of (1) is

\begin{split}X_{t+1}=&\begin{bmatrix}a-b\log(1+q_{t})\\ (ca-d)-cb\cdot\log(1+q_{t})\end{bmatrix}+\begin{bmatrix}w_{t,1}\\ cw_{t,1}+w_{t,2}\end{bmatrix}\\ :=&f(X_{t})+w_{t}\end{split}

(47)

where $X_{t}=[p_{t}~{}q_{t}]^{\top}$ . We set the parameters $a=10$ , $b=1.5$ , $c=0.5$ , $d=1$ and $w_{t,i}$ as truncated Gaussian noise: $w_{t,1}=\min\nolimits_{|\cdot|}\{N(0,10^{-5}),0.05p_{t}\}$ and $w_{t,2}=\min\nolimits_{|\cdot|}\{N(0,10^{-5}),0.05q_{t}\}$ where $\min_{|\cdot|}\{x,y\}$ outputs the one with smaller absolute value. The Lipschitz constant can be calculated as $L_{t}=\frac{b\sqrt{1+c^{2}}}{1+q_{t}}\in(0,1)$ , and $\sigma_{t}\approx 0.0032$ is approximated through Monte-Carlo simulation [32, Chapter 2.5].

We validate our $\delta$ -PRS based on the Lipschitz bound reachability with the initial set $\mathcal{X}_{0}=[9.195,9.205]\times[3.595,3.605]$ . In this case, $x_{t}^{*}=(p_{t}^{*},q_{t}^{*})$ is the trajectory of the associated deterministic system of (47) starting from $x_{0}^{*}=[9.2~{}3.6]^{\top}$ , and $r_{1}=5\sqrt{2}\times 10^{-3}$ . In each experiment, we sample 2000 independent trajectories starting from $X_{0}\in\mathcal{X}_{0}$ during $t\leq 5$ , and the $\delta$ -PRS based on Proposition 5.1, with the norm $\|\cdot\|_{\mathbb{X}}=\|\cdot\|$ , the probability level $\delta=10^{-3}$ and the parameter $\varepsilon=1/32$ . The evolution of the stochastic trajectories and the $\delta$ -PRS are visualized in Figure 5. It is clear that all the sampled points are within the derived bound, thus validating the derived $\delta$ -PRS.

6.3 Fixed-Wing UAV Path Following

In this section we consider the path following task of a discre-time fixed-wing unmanned aerial vehicle (UAV), whose kinematic model is formulated as

\begin{split}X_{t+1}&=X_{t}+\eta\begin{bmatrix}v\cos\theta_{t}\cos\gamma_{t}+u_{t,x}\\ v\sin\theta_{t}\cos\gamma_{t}+u_{t,y}\\ v\sin\gamma_{t}+u_{t,z}\\ \frac{g}{v}\tan\varphi_{t}\end{bmatrix}+w_{t}\\ &:=f(X_{t},u_{t})+w_{t},\end{split}

(48)

where $X_{t}=[p_{t,x}~{}p_{t,y}~{}p_{t,z}~{}\theta_{t}]^{\top}$ is the state variable at time $t$ , $(p_{t,x},p_{t,y})$ is the inertial north-east position of the UAV, $p_{t,z}$ is the altitude, $\theta_{t}$ is the heading angle measured from north, $\eta$ is the discretization stepsize, $[\gamma_{t}~{}\varphi_{t}]^{\top}$ is the controller input, $\gamma_{t}$ is the air mass referenced flight path angle, $\varphi_{t}$ is the roll angle, $v$ is the airspeed, $g$ is the gravity, $u_{t}=[u_{t,x}~{}u_{t,y}~{}u_{t,z}]^{\top}$ is the bounded noise brought by the wind, and $w_{t}$ is the stochastic disturbance in the environment. We set $v=13$ , $g=9.8$ , $\eta=0.1$ , $w_{t}\sim\sqrt{\eta}\cdot 0.1N(0,diag([1,1,1,0.1]))$ and $|u_{t,i}|\leq 0.5$ for $i=x,y,z$ . We use the feedback controller proposed in [42, Section 3] with the same feedback gains to control the UAV (48) starting from $X_{0}=[5~{}4.5~{}0~{}\frac{5\pi}{18}]^{\top}$ to follow the line $l^{*}=[0~{}0~{}3]^{\top}+\alpha[1~{}1~{}0]^{\top}$ , $\alpha\in[0,+\infty)$ in the $p_{x}-p_{y}-p_{z}$ space.

To verify our $\delta$ -PRS, we generate a deterministic trajectory $x_{t}$ starting from $X_{0}$ on the associated deterministic system of (48), and then sample 2000 independent stochastic trajectories from the same $X_{0}$ during $t\leq 200$ . Two of the trajectories are visualized in the first row of Figure 6.

To make the probabilistic bounds tighter, we estimate a sequence of localized $L_{t}$ with respect to the $P$ -weighted norm by simulated-based methods in [43] with $P=diag(1,~{}1,~{}100,~{}50)$ , and then compute the radius of the $\delta$ -PRS by Theorem 2 and the time-varying version of Proposition 5.1 with $\|\cdot\|_{\mathbb{X}},\|\cdot\|_{\mathbb{U}}=\|\cdot\|_{P}$ , $\varepsilon=1/16$ and $\delta=10^{-4}$ . The union of the projections of the obtained $\delta$ -PRS to the $p_{x}-p_{y}-p_{z}$ space for $t\leq 200$ is visualized as the green envelopes in the second row of Figure 6. It is clear that all the sampled states are within the green envelopes, thus verifying our theoretical results.

7 Conclusions

We present a unified and practical framework for computing the probabilistic reachable set (PRS) for discrete-time nonlinear stochastic systems, a generalization of our previous framework [24] to the discrete-time system. A central idea of our approach is the separation strategy, which decouples the effect of deterministic inputs and stochastic noise on the PRS. The latter is captured by the difference between stochastic trajectories and their deterministic counterparts, termed stochastic deviation. A key technical innovation is a novel energy function called the Averaged Moment Generating Function, with which we establish a tight probabilistic bound on stochastic deviation. This bound is tight for general discrete-time nonlinear systems and is exact for linear systems. Based on the separation strategy, we can combine this tight bound with any existing methods for deterministic reachability to offer a PRS that is both flexible and effective with a high probability level. This PRS framework makes it possible to develop effective algorithms and products for control systems that require high probability guarantee on the PRS, such as safety-critical systems, and the AMGF leveraged in our work will open new research directions in multiple areas like control theory, finance, statistics, and machine learning.

References

[1] Xin Chen and Sriram Sankaranarayanan. Reachability analysis for cyber-physical systems: Are we there yet? In NASA Formal Methods: 14th International Symposium, NFM 2022, Pasadena, CA, USA, May 24–27, 2022, Proceedings, pages 109–130. Springer, 2022.
[2] S. V. Rakovic, E. C. Kerrigan, D. Q. Mayne, and J. Lygeros. Reachability analysis of discrete-time systems with disturbances. IEEE Transactions on Automatic Control, 51(4):546–561, 2006.
[3] Yue Meng, Zengyi Qin, and Chuchu Fan. Reactive and safe road user simulations using neural barrier certificates. In 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 6299–6306. IEEE, 2021.
[4] Ryan K Cosner, Preston Culbertson, and Aaron D Ames. Bounding stochastic safety: Leveraging freedman’s inequality with discrete-time control barrier functions. arXiv e-prints, pages arXiv–2403, 2024.
[5] Stephen Prajna, Ali Jadbabaie, and George J Pappas. A framework for worst-case and stochastic safety verification using barrier certificates. IEEE Transactions on Automatic Control, 52(8):1415–1428, 2007.
[6] A. Abate, M. Prandini, J. Lygeros, and S. Sastry. Probabilistic reachability and safety for controlled discrete time stochastic hybrid systems. Automatica, 44(11):2724–2734, 2008.
[7] S. Summers and J. Lygeros. Verification of discrete time stochastic hybrid systems: A stochastic reach-avoid decision problem. Automatica, 46(12):1951–1961, 2010.
[8] M. Korda, D. Henrion, and C. N. Jones. Convex computation of the maximum controlled invariant set for polynomial control systems. SIAM Journal on Control and Optimization, 52(5):2944–2969, 2014.
[9] Z. Huang and S. Mitra. Computing bounded reach sets from sampled simulation traces. In Hybrid Systems: Computation and Control, page 291–294, 2012.
[10] Torsten Koller, Felix Berkenkamp, Matteo Turchetta, and Andreas Krause. Learning-based model predictive control for safe exploration. In 2018 IEEE conference on decision and control (CDC), pages 6059–6066. IEEE, 2018.
[11] L. Jaulin, M. Kieffer, O. Didrit, and É. Walter. Applied Interval Analysis. Springer London, 2001.
[12] S. Coogan and M. Arcak. Efficient finite abstraction of mixed monotone systems. In Hybrid Systems: Computation and Control, pages 58–67, April 2015.
[13] Alex Devonport and Murat Arcak. Data-driven reachable set computation using adaptive gaussian process classification and monte carlo methods. In 2020 American Control Conference (ACC), pages 2629–2634, 2020.
[14] Chuchu Fan, James Kapinski, Xiaoqing Jin, and Sayan Mitra. Simulation-driven reachability using matrix measures. ACM Trans. Embed. Comput. Syst., 17(1), dec 2017.
[15] J. Maidens and M. Arcak. Reachability analysis of nonlinear systems using matrix measures. IEEE Transactions on Automatic Control, 60(1):265–270, 2015.
[16] Chuchu Fan, Bolun Qi, Sayan Mitra, Mahesh Viswanathan, and Parasara Sridhar Duggirala. Automatic reachability analysis for nonlinear hybrid models with c2e2. In International Conference on Computer Aided Verification, pages 531–538. Springer, 2016.
[17] Amr Alanwar, Anne Koch, Frank Allgöwer, and Karl Henrik Johansson. Data-driven reachability analysis from noisy data. IEEE Transactions on Automatic Control, 68(5):3054–3069, 2023.
[18] Thomas Lew and Marco Pavone. Sampling-based reachability analysis: A random set theory approach with adversarial sampling. In Conference on robot learning, pages 2055–2070. PMLR, 2021.
[19] A. P. Vinod and M. K. Oishi. Stochastic reachability of a target tube: Theory and computation. Automatica, 125:109458, 2021.
[20] Lukas Hewing and Melanie N Zeilinger. Stochastic model predictive control for linear systems using probabilistic reachable sets. In 2018 IEEE Conference on Decision and Control (CDC), pages 5182–5188. IEEE, 2018.
[21] Mitchell Black, Georgios Fainekos, Bardh Hoxha, and Dimitra Panagou. Risk-aware fixed-time stabilization of stochastic systems under measurement uncertainty. arXiv preprint arXiv:2403.20258, 2024.
[22] Cesar Santoyo, Maxence Dutreix, and Samuel Coogan. A barrier function approach to finite-time stochastic system verification and control. Automatica, 125:109439, 2021.
[23] Albert Chern, Xiang Wang, Abhiram Iyer, and Yorie Nakahira. Safe control in the presence of stochastic uncertainties. In 2021 60th IEEE Conference on Decision and Control (CDC), pages 6640–6645. IEEE, 2021.
[24] S. Jafarpour^∗, Z. Liu^∗, and Y. Chen. Probabilistic reachability analysis of stochastic control systems. arXiv preprint, 2024.
[25] Carlo Novara, Lorenzo Fagiano, and Mario Milanese. Direct feedback control design for nonlinear systems. Automatica, 49(4):849–860, 2013.
[26] Cristopher Moore. Unpredictability and undecidability in dynamical systems. Physical Review Letters, 64:2354–2357, May 1990.
[27] F. Bullo. Contraction Theory for Dynamical Systems. Kindle Direct Publishing, 1.1 edition, 2023.
[28] Christophe Weibel. Minkowski sums of polytopes: combinatorics and computation. Technical report, EPFL, 2007.
[29] Q-C. Pham, N. Tabareau, and J-J. Slotine. A contraction theory approach to stochastic incremental stability. IEEE Transactions on Automatic Control, 54(4):816–820, 2009.
[30] Alessandro Rinaldo. Lecture 3, 2019. 36-710: Advanced Statistical Theory.
[31] Philippe Rigollet and Jan-Christian Hütter. High-dimensional statistics. arXiv preprint arXiv:2310.19244, 2023.
[32] R. Vershynin. High-Dimensional Probability: An Introduction with Applications in Data Science. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, 2018.
[33] Jason M Altschuler and Kunal Talwar. Concentration of the langevin algorithm’s stationary distribution. arXiv preprint arXiv:2212.12629, 2022.
[34] Navid Hashemi, Xin Qin, Lars Lindemann, and Jyotirmoy V Deshmukh. Data-driven reachability analysis of stochastic dynamical systems with conformal inference. In 2023 62nd IEEE Conference on Decision and Control (CDC), pages 3102–3109. IEEE, 2023.
[35] Eleftherios E Vlahakis, Lars Lindemann, Pantelis Sopasakis, and Dimos V Dimarogonas. Probabilistic tube-based control synthesis of stochastic multi-agent systems under signal temporal logic. arXiv preprint arXiv:2405.02827, 2024.
[36] Gokul Varadhan and Dinesh Manocha. Accurate minkowski sum approximation of polyhedral models. In 12th Pacific Conference on Computer Graphics and Applications, 2004. PG 2004. Proceedings., pages 392–401. IEEE, 2004.
[37] S. Jafarpour, A. Harapanahalli, and S. Coogan. Efficient interaction-aware interval analysis of neural network feedback loops. arXiv preprint, 2023.
[38] A. Harapanahalli, S. Jafarpour, and S. Coogan. A toolbox for fast interval arithmetic in numpy with an application to formal verification of neural network controlled systems. In 2nd ICML Workshop on Formal Verification of Machine Learning, 2023.
[39] Alexander Shapiro. Monte carlo sampling methods. Handbooks in operations research and management science, 10:353–425, 2003.
[40] Robert S. Pindyck and Daniel L. Rubinfeld. Microeconomics. Pearson, 2012.
[41] Cars H Hommes. Dynamics of the cobweb model with adaptive expectations and nonlinear supply and demand. Journal of Economic Behavior & Organization, 24(3):315–335, 1994.
[42] Randal W Beard, Jeff Ferrin, and Jeffrey Humpherys. Fixed wing uav path following in wind with input constraints. IEEE Transactions on Control Systems Technology, 22(6):2103–2117, 2014.
[43] Craig Knuth, Glen Chou, Necmiye Ozay, and Dmitry Berenson. Planning with learned dynamics: Probabilistic guarantees on safety and reachability via lipschitz constants. IEEE Robotics and Automation Letters, 6(3):5129–5136, 2021.

Appendix A Proof of Lemmas

A.1 Proof of Lemma 4.1

(1)-(2): See [24].

(3): By (22) we have

\begin{split}\mathbb{E}_{w}\Phi_{n,\lambda}(x+w)&=\mathbb{E}_{\ell\sim\mathcal{S}^{n-1}}\left(\mathbb{E}_{w}e^{\lambda\langle\ell,x+w\rangle}\right)\\ &=\mathbb{E}_{{\ell\sim\mathcal{S}^{n-1}}}\left(e^{\lambda\langle\ell,x\rangle}\cdot\mathbb{E}_{w}e^{\lambda\langle\ell,w\rangle}\right)\end{split}

(49)

By the definition of sub-Gaussian vector, we know that

\mathbb{E}_{w}\left(e^{\lambda\langle\ell,w\rangle}\right)\leq e^{\frac{\lambda^{2}\sigma^{2}}{2}}

Therefore, (49) becomes

\mathbb{E}_{w}\Phi_{n,\lambda}(x+w)\leq e^{\frac{\lambda^{2}\sigma^{2}}{2}}\Phi_{n,\lambda}(x)

which completes the proof.

A.2 Proof of Lemma 4.2

Define random vector $\tilde{X}=QX$ , where $Q\sim\mathbb{U}^{n}$ is a random unitary matrix and $\mathbb{U}^{n}$ denotes the set of all the unitary matrices in $\mathbb{R}^{n\times n}$ . By Lemma 4.1(i), we have that for any $\eta\in\mathcal{S}^{n-1}$ ,

\begin{split}\Phi_{n,\lambda}(X)&=\mathbb{E}_{\ell\sim\mathcal{S}^{n-1}}\left(e^{\lambda\|X\|\langle\ell,\frac{X}{\|X\|}\rangle}\right)=\mathbb{E}_{\ell\sim\mathcal{S}^{n-1}}\left(e^{\lambda\|X\|\langle\ell,\eta\rangle}\right)\\ &=\mathbb{E}_{\ell\sim\mathcal{S}^{n-1}}\left(e^{\lambda\langle\eta,\ell\|X\|\rangle}\right)=\mathbb{E}_{Q\sim\mathbb{U}^{n}}\left(e^{\lambda\langle\eta,QX\rangle}\right),\end{split}

where the last “ $=$ ” uses the fact that $Q\ell\in\mathcal{S}^{n-1}$ for any $\ell\in\mathcal{S}^{n-1}$ . Therefore, we obtain

\begin{split}&\mathbb{E}_{\tilde{X}}\left(e^{\lambda\langle\eta,\tilde{X}\rangle}\right)=\mathbb{E}_{X}\mathbb{E}_{Q}\left(e^{\lambda\langle\eta,QX\rangle}\right)\\ =&\mathbb{E}_{X}\left(\Phi_{n,\lambda}(X)\right)\leq e^{\frac{\lambda^{2}\sigma^{2}}{2}},\quad\forall\lambda\in\mathbb{R},~{}\forall\eta\in\mathcal{S}^{n-1}.\end{split}

(50)

Therefore, $\tilde{X}$ is sub-Gaussian with variance proxy $\sigma^{2}$ . By Lemma 3.1, $\tilde{X}$ satisfies (18).

Finally, since $\|X\|=\|QX\|=\|\tilde{X}\|$ for any $Q\in\mathbb{U}^{n}$ , we conclude that $X$ also satisfies (18). This completes the proof.

Probabilistic Reachability of Discrete-Time Nonlinear Stochastic Systems

Abstract

keywords:

1 Introduction

2 Problem Statement and Preliminaries

Assumption 1.

2.1 Deterministic Reachable Set

Definition 2.1 (DRS).

2.2 Probabilistic Reachable Set

Definition 2.2 (sub-Gaussian).

Assumption 2.

Definition 2.3 (δ\delta-PRS).

Problem 1.

2.3 Separation Strategy and Main Challenge

Proposition 1.

Problem 2 (Stochastic Deviation).

3 Expectation Bound and Limitations

3.1 Expectation Bound on Stochastic Deviation

Proposition 2.

3.2 Limitation of The Expectation Bound

Lemma 3.1 (Norm Concentration).

4 Probabilistic Bound on Stochastic Deviation

4.1 Motivation: Moment Generating Function

4.2 Averaged Moment Generating Function

Definition 4.1 (AMGF).

Lemma 4.1 (Properties of AMGF).

Lemma 4.2.

4.3 Theoretical Analysis

Theorem 1.

Remark 4.1.

Remark 4.2.

4.4 Tightness of The Bound

4.5 Comparison with Worst-Case Analysis

5 Probabilistic Reachable Set

Theorem 2.

5.1 Lipschitz Bound Reachability

Assumption 3.

Proposition 5.1 (Lipschitz bound Reachability).

5.2 Interval-Based Reachability

Proposition 5.2 (Interval-based Reachability).

6 Numerical experiments

6.1 Linear System

6.2 Cobweb Supply-Demand Model

6.3 Fixed-Wing UAV Path Following

7 Conclusions

References

Appendix A Proof of Lemmas

A.1 Proof of Lemma 4.1

A.2 Proof of Lemma 4.2

Definition 2.3 ( $\delta$ -PRS).