
\coltauthor{\Name{Albert Lin} \Email{albert.k.lin@usc.edu}\\
\Name{Somil Bansal} \Email{somilban@usc.edu}\\
\addr{Department of Electrical and Computer Engineering, University of Southern California, CA, USA}}

Verification of Neural Reachable Tubes via Scenario Optimization and Conformal Prediction

Abstract

Learning-based approaches for controlling safety-critical autonomous systems are rapidly growing in popularity; thus, it is important to provide rigorous and robust assurances on their performance and safety. Hamilton-Jacobi (HJ) reachability analysis is a popular formal verification tool for providing such guarantees, since it can handle general nonlinear system dynamics, bounded adversarial system disturbances, and state and input constraints. However, it involves solving a Partial Differential Equation (PDE), whose computational and memory complexity scales exponentially with respect to the state dimension, making its direct use on large-scale systems intractable. To overcome this challenge, neural approaches, such as DeepReach, have been used to synthesize reachable tubes and safety controllers for high-dimensional systems. However, verifying these neural reachable tubes remains challenging. In this work, we propose two different verification methods, based on robust scenario optimization and conformal prediction, to provide probabilistic safety guarantees for neural reachable tubes. Our methods allow a direct trade-off between resilience to outlier errors in the neural tube, which are inevitable in a learning-based approach, and the strength of the probabilistic safety guarantee. Furthermore, we show that split conformal prediction, a widely used method in the machine learning community for uncertainty quantification, reduces to a scenario-based approach, making the two methods equivalent not only for verification of neural reachable tubes but also more generally. To our knowledge, our proof is the first in the literature to show a strong relationship between the highly related but disparate fields of conformal prediction and scenario optimization. Finally, we propose an outlier-adjusted verification approach that harnesses information about the error distribution in neural reachable tubes to recover greater safe volumes. 
We demonstrate the efficacy of the proposed approaches for the high-dimensional problems of multi-vehicle collision avoidance and rocket landing with no-go zones.

keywords:
Probabilistic Safety Guarantees, Safety-Critical Learning, Neural Certificates, Hamilton-Jacobi Reachability Analysis, Scenario Optimization, Conformal Prediction

1 Introduction

It is important to design provably safe controllers for autonomous systems. Hamilton-Jacobi (HJ) reachability analysis provides a powerful framework to design such controllers for general nonlinear dynamical systems (Lygeros, 2004; Mitchell et al., 2005). In reachability analysis, safety is characterized by the system’s Backward Reachable Tube (BRT). This is the set of states from which trajectories will eventually reach a given target set despite the best control effort. Thus, if the target set represents undesirable states, the BRT represents unsafe states and should be avoided. Along with the BRT, reachability analysis provides a safety controller to keep the system outside the BRT.

Traditionally, the BRT computation in HJ reachability is formulated as an optimal control problem. The BRT can then be obtained as a sub-zero level solution of the corresponding value function. Obtaining the value function requires solving a partial differential equation (PDE) over a state-space grid, resulting in an exponentially scaling computation complexity with the number of states (Bansal et al., 2017). To overcome this challenge, a variety of solutions have been proposed that trade off between the class of dynamics they can handle, the approximation quality of the BRT, and the required computation. These include specialized methods for linear and affine dynamics (Greenstreet and Mitchell, 1998; Frehse et al., 2011; Kurzhanski and Varaiya, 2000, 2002; Maidens et al., 2013; Girard, 2005; Althoff et al., 2010; Bak et al., 2019; Nilsson and Ozay, 2016), polynomial dynamics (Majumdar et al., 2014; Majumdar and Tedrake, 2017; Dreossi et al., 2016; Henrion and Korda, 2014), monotonic dynamics (Coogan and Arcak, 2015), and convex dynamics (Chow et al., 2017) (see Bansal et al. (2017); Bansal and Tomlin (2021) for a survey).

Owing to the success of deep learning, there has also been a surge of interest in approximating high-dimensional BRTs (Rubies-Royo et al., 2019; Fisac et al., 2019; Djeridane and Lygeros, 2006; Niarchos and Lygeros, 2006; Darbon et al., 2020) and optimal controllers (Onken et al., 2022) through deep neural networks (DNNs). Building upon this line of work, Bansal and Tomlin (2021) have proposed DeepReach – a toolbox that leverages recent advances in neural implicit representations and neural PDE solvers to compute a value function and a safety controller for high-dimensional systems. Compared to the aforementioned methods, DeepReach can handle general nonlinear dynamics, the presence of exogenous disturbances, as well as state and input constraints during the BRT computation. Consequently, methods for verifying neural reachable tubes have been proposed. For example, Lin and Bansal (2023) propose an iterative scenario-based method (Campi et al., 2009) to recover probabilistically safe reachable tubes from DeepReach solutions up to a desired confidence level and bound on violation rate. Unfortunately, the method does not allow an after-the-fact risk-return trade-off, and as a result, it is highly sensitive to outlier errors in the learned solutions. This can lead to highly conservative reachable tubes and a severe loss of recovery in the case of stringent safety requirements, as we demonstrate in our case studies.

In this work, we propose two different verification methods, one based on robust scenario optimization and the other based on conformal prediction, to provide probabilistic safety guarantees for neural reachable tubes. Both methods are resilient to the outlier errors in neural reachable tubes and automatically trade off the strength of the probabilistic safety guarantees based on the outlier rate. The proposed methods can evaluate any candidate tube and are not restricted to a specific class of system dynamics or value functions. We further prove that these seemingly different verification methods naturally reduce to one another, providing a unifying viewpoint for uncertainty quantification (the typical use case of conformal prediction) and error optimization (the typical use case of scenario optimization) in neural reachable tubes. Based on these insights, we propose an outlier-adjusted verification approach that can recover a greater safe volume from a neural reachable tube by harnessing information about the distribution of error in the learned solution. To summarize, the key contributions of this paper are:

  • probabilistic safety verification methods for neural reachable tubes that enable a direct trade-off between resilience and the probabilistic strength of safety,

  • a proof that split conformal prediction reduces to a scenario-based approach in general, demonstrating a strong relationship between the two highly related but disparate fields,

  • an outlier-adjusted verification approach that recovers greater safe volumes from tubes, and

  • a demonstration of the proposed approaches for the high-dimensional problems of multi-vehicle collision avoidance and rocket landing with no-go zones.

2 Problem Setup

Consider a dynamical system with state $x\in X\subseteq\mathbb{R}^{n}$, control $u\in\mathcal{U}$, and dynamics $\dot{x}=f(x,u)$ governing how $x$ evolves over time until a final time $T$. Let $\xi_{x,t}^{u}(\tau)$ denote the state achieved at time $\tau\in[t,T]$ by starting at initial state $x$ and time $t$ and applying control $u(\cdot)$ over $[t,\tau]$. Let $\mathcal{L}$ represent a target set that the agent wants to either reach (e.g., goal states) or avoid (e.g., obstacles).

Running example: Multi-Vehicle Collision Avoidance. Consider a 9D multi-vehicle collision avoidance system with 3 independent Dubins3D cars: $Q_{1},Q_{2},Q_{3}$. $Q_{i}$ has position $(p_{xi},p_{yi})$, heading $\theta_{i}$, constant velocity $v$, and steering control $u_{i}\in[u_{\min},u_{\max}]$. The dynamics of $Q_{i}$ are: $\dot{p}_{xi}=v\cos{\theta_{i}},\ \dot{p}_{yi}=v\sin{\theta_{i}},\ \dot{\theta}_{i}=u_{i}$. $\mathcal{L}$ is the set of states where any of the vehicle pairs is in collision: $\mathcal{L}=\{x:\min\{d(Q_{1},Q_{2}),d(Q_{1},Q_{3}),d(Q_{2},Q_{3})\}\leq R\}$, where $d(Q_{i},Q_{j})$ is the distance between $Q_{i}$ and $Q_{j}$. We set $v=0.6$, $u_{\min}=-1.1$, $u_{\max}=1.1$, $R=0.25$.
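For concreteness, the per-vehicle dynamics and the pairwise-distance computation underlying $\mathcal{L}$ can be sketched as follows (a simple forward-Euler discretization; the helper names are ours, not part of any toolbox):

```python
import math

def dubins_step(state, u, v=0.6, dt=0.01):
    """One forward-Euler step of a single Dubins3D car (px, py, theta)."""
    px, py, th = state
    return (px + dt * v * math.cos(th),
            py + dt * v * math.sin(th),
            th + dt * u)

def min_pairwise_distance(cars):
    """Minimum distance among all vehicle pairs; the target set L is
    where this value drops to R = 0.25 or below."""
    dists = [math.hypot(a[0] - b[0], a[1] - b[1])
             for i, a in enumerate(cars) for b in cars[i + 1:]]
    return min(dists)
```

In practice the rollout step size and integrator would match whatever simulation is used to evaluate trajectories; Euler with a small `dt` is the simplest choice.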

In this setting, we are interested in computing the system’s initial-time Backward Reachable Tube, which we denote as BRT. We define BRT as the set of all initial states in $X$ from which the agent will eventually reach $\mathcal{L}$ within the time horizon $[0,T]$, despite best control efforts: $\text{BRT}=\{x:x\in X,\forall u(\cdot),\exists\tau\in[0,T],\xi_{x,0}^{u}(\tau)\in\mathcal{L}\}$. When $\mathcal{L}$ represents unsafe states for the system, as it does in our running example, staying outside of BRT is desirable. When $\mathcal{L}$ instead represents the states that the agent wants to reach, BRT is defined as the set of all initial states in $X$ from which the agent, acting optimally, can eventually reach $\mathcal{L}$ within $[0,T]$. Thus, staying within BRT is desirable.

The above 9D system is intractable for traditional grid-based methods, motivating the use of DeepReach to learn a neural BRT. Our goal in this work is to recover an approximation of the safe set with probabilistic guarantees. Specifically, we want to find $\mathcal{S}$ such that $\underset{x\in\mathcal{S}}{\mathbb{P}}(x\in\text{BRT})\leq\epsilon$ for some violation parameter $\epsilon\in(0,1)$. When $\mathcal{L}$ represents goal states, we want $\underset{x\in\mathcal{S}}{\mathbb{P}}(x\in\text{BRT}^{C})\leq\epsilon$.

3 Background: Hamilton-Jacobi Reachability, DeepReach, and Safety Verification

Here, we provide a quick overview of Hamilton-Jacobi reachability analysis; of DeepReach, a specific toolbox for computing high-dimensional neural reachable tubes; and of an iterative scenario-based method for recovering probabilistically safe tubes from learning-based methods like DeepReach.

3.1 Hamilton-Jacobi (HJ) Reachability

In HJ reachability, computing BRT is formulated as an optimal control problem. We will explain it in the context of \mathcal{L} being a set of undesirable states. In the end, we will comment on when \mathcal{L} is a set of desirable states and refer interested readers to Bansal et al. (2017) for other cases.

We first define a target function $l(x)$ such that the sub-zero level of $l(x)$ yields $\mathcal{L}$: $\mathcal{L}=\{x:l(x)\leq 0\}$. $l(x)$ is commonly a signed distance function to $\mathcal{L}$. For example, we can choose $l(x)=\min\{d(Q_{1},Q_{2}),d(Q_{1},Q_{3}),d(Q_{2},Q_{3})\}-R$ for our running example in \sectionref{sec:problem_setup}. Next, we define the cost function of a state corresponding to some policy $u(\cdot)$ to be the minimum of $l(x)$ over its trajectory: $J_{u(\cdot)}(x,t)=\min_{\tau\in[t,T]}l(\xi_{x,t}^{u}(\tau))$. Since the system wants to avoid $\mathcal{L}$, our goal is to maximize $J_{u(\cdot)}(x,t)$. Thus, the value function corresponding to this optimal control problem is:

V(x,t)=\sup_{u(\cdot)}J_{u(\cdot)}(x,t) (1)

By defining our optimal control problem in this way, we can recover BRT using the value function. In particular, the value function being sub-zero implies that the target function is sub-zero somewhere along the optimal trajectory, or in other words, that the system has reached $\mathcal{L}$. Thus, BRT is given as the sub-zero level set of the value function at the initial time: $\text{BRT}=\{x:x\in X,V(x,0)\leq 0\}$. The value function in Equation (1) can be computed using dynamic programming, resulting in the following final value Hamilton-Jacobi-Bellman Variational Inequality (HJB-VI): $\min\big\{D_{t}V(x,t)+H(x,t),\ l(x)-V(x,t)\big\}=0$, with the terminal value function $V(x,T)=l(x)$. $D_{t}$ and $\nabla$ represent the time and spatial gradients of $V$. $H$ is the Hamiltonian that encodes the role of dynamics and the optimal control: $H(x,t)=\max_{u}\langle\nabla V(x,t),f(x,u)\rangle$. The value function in Equation (1) induces the optimal safety controller: $u^{*}(x,t)=\underset{u}{\arg\max}\,\langle\nabla V(x,t),f(x,u)\rangle$. Intuitively, the safety controller aligns the system dynamics in the direction of the value function’s gradient, thus steering the system towards higher-value states, i.e., away from $\mathcal{L}$.
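For the Dubins3D cars of the running example, the Hamiltonian maximization has a closed form: the control enters the dynamics only through $\dot{\theta}_{i}=u_{i}$, so the avoid-case optimal control is bang-bang in the sign of $\partial V/\partial\theta_{i}$. A minimal sketch (our own helper, assuming the value-function gradient is available, e.g., via automatic differentiation):

```python
def dubins_safety_control(grad_V_i, u_min=-1.1, u_max=1.1):
    """Avoid-case optimal control for one Dubins3D car: maximize
    <grad V, f(x, u)>. Since f depends on u only through theta-dot = u,
    the maximizer is u_max when dV/dtheta > 0 and u_min otherwise."""
    dV_dtheta = grad_V_i[2]  # gradient ordered as (d/dpx, d/dpy, d/dtheta)
    return u_max if dV_dtheta > 0 else u_min
```

For the reach case, the argmax becomes an argmin and the sign test flips accordingly.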

We have just explained the case where $\mathcal{L}$ represents a set of undesirable states. When the system instead wants to reach $\mathcal{L}$, an infimum is used instead of a supremum in Equation (1). The control wants to reach $\mathcal{L}$, hence there is a minimum instead of a maximum in the Hamiltonian and optimal safety controller equations. See Bansal et al. (2017) for details on other reachability cases.

Traditionally, the value function is computed by solving the HJB-VI over a discretized grid in the state space. Unfortunately, doing so involves computation whose memory and time complexity scales exponentially with respect to the system dimension, making these methods practically intractable for high-dimensional systems, such as those beyond 5D. Fortunately, a deep learning approach, DeepReach, has been proposed to enable HJ reachability for high-dimensional systems.

3.2 DeepReach and an Iterative Scenario-Based Probabilistic Safety Verification Method

Instead of solving the HJB-VI over a grid, DeepReach (Bansal and Tomlin, 2021) learns a parameterized approximation of the value function using a sinusoidal deep neural network (DNN). Thus, memory and computation requirements for training scale with the value function complexity rather than the grid resolution, allowing it to obtain BRTs for high-dimensional systems. DeepReach trains the DNN via self-supervision on the HJB-VI itself. Ultimately, it takes as input a state $x$ and time $t$, and it outputs a learned value function $\tilde{V}(x,t)$. $\tilde{V}(x,t)$ also induces a corresponding safe policy $\tilde{\pi}(x,t)$, as well as a BRT (referred to as the neural reachable tube from hereon).

However, the neural reachable tube will only be as accurate as $\tilde{V}(x,t)$. To obtain a provably safe BRT, Lin and Bansal (2023) propose a uniform value correction bound which is defined, for the avoid case, as the maximum learned value of an unsafe state under the induced policy: $\delta_{\tilde{V},\tilde{\pi}}\coloneqq\max_{x\in X}\{\tilde{V}(x,0):J_{\tilde{\pi}}(x,0)\leq 0\}$. The authors show that the super-$\delta_{\tilde{V},\tilde{\pi}}$ level set of $\tilde{V}(x,0)$ is provably safe under the policy $\tilde{\pi}(x,t)$. They also propose an iterative scenario-based probabilistic verification method for computing an approximation of $\delta_{\tilde{V},\tilde{\pi}}$ from finite random samples that satisfies a desired confidence level and violation rate. However, the method is sensitive to outlier errors in the neural reachable tube and can result in very conservative safe sets. Specifically, it does not provide safety assurances for safe sets with nonzero empirical safety violations.

In this work, we propose probabilistic safety verification methods that allow nonzero empirical safety violations at the cost of the probabilistic strength of safety. This enables a direct trade-off between resilience to outlier errors and the strength of the safety guarantee.

Remark 3.1.

Although we work with DeepReach solutions in particular for our problem setup, our proposed approaches can verify any general $\tilde{V}(x,t)$ and $\tilde{\pi}(x,t)$, regardless of whether DeepReach, a numerical PDE solver, or some other tool is used to obtain them.

4 Robust Scenario-Based Probabilistic Safety Verification Method

Here, we propose a robust scenario-based probabilistic safety verification method for neural reachable tubes. The new method is a straightforward application of a scenario-based sampling-and-discarding approach to chance-constrained optimization problems, which quantifies the trade-off between feasibility and performance of the optimal solution based on finite samples (Campi and Garatti, 2011). First, we explain the method when \mathcal{L} represents undesirable states. In the end, we comment on when \mathcal{L} represents desirable states.

Procedures: Let $\mathcal{S}\subseteq X$ be a neural safe set that is, in the avoid case, the complement of the neural reachable tube being verified. In our case, $\mathcal{S}$ is typically a super-$\delta$ level set of the learned value function $\tilde{V}(x,0)$. Ideally, any super-$\delta$ level set of $\tilde{V}(x,0)$ for $\delta>0$ should be a valid safe set; however, due to learning errors, that might not be true in practice. To provide a probabilistic safety assurance for $\mathcal{S}$, we first sample $N$ independent and identically distributed (i.i.d.) states $x_{1:N}$ from $\mathcal{S}$ according to some probability distribution $\mathbb{P}$ over $\mathcal{S}$. Since $\mathcal{S}$ is defined implicitly by $\tilde{V}(x,0)$, we use rejection sampling. We next compute the costs $J_{\tilde{\pi}}(x_{i},0)$ for $i=1,2,\ldots,N$ by rolling out the system trajectory from $x_{i}$ under $\tilde{\pi}(x,t)$. Let $k$ refer to the number of “outliers”, i.e., samples that are empirically unsafe: $J_{\tilde{\pi}}(x_{i},0)\leq 0$. Then the following theorem provides a probabilistic guarantee on the safety of the neural reachable tube and its complement, the neural safe set $\mathcal{S}$:
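The procedure above can be sketched end-to-end. Below, `V_tilde` and `J_induced` are 1D toy stand-ins (our own, not DeepReach outputs) used only to illustrate rejection sampling from $\mathcal{S}$ and counting the outliers $k$:

```python
import random

# Toy 1D stand-ins (assumptions for illustration, not the paper's models):
def V_tilde(x):
    """Learned value at t = 0; S = {x : V_tilde(x) >= delta}."""
    return 1.0 - abs(x)

def J_induced(x):
    """Rollout cost under the induced policy, slightly more
    pessimistic than V_tilde, so a thin band of S is unsafe."""
    return 0.9 - abs(x)

def count_outliers(N, delta, lo=-2.0, hi=2.0, seed=0):
    """Rejection-sample N i.i.d. states from S = {V_tilde >= delta},
    roll each one out, and count empirical safety violations k."""
    rng = random.Random(seed)
    k, accepted = 0, 0
    while accepted < N:
        x = rng.uniform(lo, hi)
        if V_tilde(x) < delta:   # reject states outside S
            continue
        accepted += 1
        if J_induced(x) <= 0:    # outlier: in S but empirically unsafe
            k += 1
    return k
```

With `delta = 0.05`, the unsafe band $0.9\leq|x|\leq 0.95$ occupies roughly $5\%$ of $\mathcal{S}$, so `count_outliers` returns $k \approx 0.05\,N$.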

Theorem 4.1 (Robust Scenario-Based Probabilistic Safety Verification).

Select a safety violation parameter $\epsilon\in(0,1)$ and a confidence parameter $\beta\in(0,1)$ such that

\sum^{k}_{i=0}\binom{N}{i}\epsilon^{i}(1-\epsilon)^{N-i}\leq\beta (2)

where $k$ and $N$ are as defined above. Then, with probability at least $1-\beta$, the following holds:

\underset{x\in\mathcal{S}}{\mathbb{P}}\left(V(x,0)\leq 0\right)\leq\epsilon (3)

All proofs can be found in the Appendix of the extended version of this article.\footnote{See \url{https://sia-lab-git.github.io/Verification_of_Neural_Reachable_Tubes.pdf}} Disregarding the confidence parameter $\beta$ for a moment, \theoremref{theorem:scenario-based_method} states that the fraction of $\mathcal{S}$ that is unsafe is bounded above by the violation parameter $\epsilon$, where $\epsilon$ is computed empirically using Equation (2) based on the outlier rate $k$ encountered within $N$ samples. $\epsilon$ is thus a reflection of the safety quality of $\mathcal{S}$, which degrades with the increase in the number of outliers $k$, as expected. This can also be seen for the running example in \figureref{fig:fixed_N} (the red curve). Overall, \theoremref{theorem:scenario-based_method} allows us to compute probabilistic safety guarantees for any neural set $\mathcal{S}$ based on a finite number of samples. Subsequently, this result can be used to find some $\mathcal{S}$ for which $\epsilon$ is smaller than a desired threshold, as we discuss later in this section.
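To make the computation in Equation (2) concrete: given observed $(N,k)$ and a chosen $\beta$, the smallest certifiable $\epsilon$ can be found by bisection on the binomial tail, which is decreasing in $\epsilon$. A minimal numerical sketch (our own helper functions, evaluated in log-space so large $N$ does not overflow):

```python
import math

def binom_tail(N, k, eps):
    """Left tail P[Bin(N, eps) <= k], i.e., the sum in Equation (2),
    computed term-by-term in log-space for numerical stability."""
    total = 0.0
    for i in range(k + 1):
        log_term = (math.lgamma(N + 1) - math.lgamma(i + 1)
                    - math.lgamma(N - i + 1)
                    + i * math.log(eps) + (N - i) * math.log1p(-eps))
        total += math.exp(log_term)
    return total

def solve_eps(N, k, beta, tol=1e-12):
    """Smallest eps in (0, 1) with binom_tail(N, k, eps) <= beta,
    found by bisection (the tail is monotonically decreasing in eps)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if binom_tail(N, k, mid) <= beta:
            hi = mid
        else:
            lo = mid
    return hi
```

For $k=0$ the tail reduces to $(1-\epsilon)^{N}\leq\beta$, recovering the outlier-free scenario bound in closed form.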

To interpret $\beta$, note that $k$ is a random variable that depends on the randomly sampled $x_{1:N}$. It may be the case that we just happen to draw an unrepresentative sample, in which case the $\epsilon$ bound does not hold. $\beta$ controls the probability of this adverse event, which concerns the correctness of the probabilistic safety guarantee in Equation (3). Fortunately, $\beta$ goes to 0 exponentially with $N$, so $\beta$ can be chosen to be an extremely small value, such as $10^{-16}$, when we sample large $N$. $1-\beta$ will then be so close to $1$ that it has no practical importance.

We have just explained the robust scenario-based probabilistic safety verification method in the case where $\mathcal{L}$ represents undesirable states. When the system instead wants to reach $\mathcal{L}$, $\mathcal{S}$ will be a sublevel set instead of a superlevel set of the learned value function. The cost inequality should be flipped when computing $k$, and the value inequality should be flipped in Equation (3).

4.1 Comparison of Robust and Iterative Scenario-Based Probabilistic Safety Verification

The key difference between the proposed robust scenario-based method and the iterative scenario-based method discussed in \sectionref{sec:deepreach_iterative_verification} is that the former can handle nonzero empirical safety violations $k$. This enables several crucial advantages that we demonstrate in \figureref{fig:fixed_N,fig:diff_N} for a solution learned by DeepReach on the multi-vehicle collision avoidance running example in \sectionref{sec:problem_setup}. We have fixed the confidence parameter $\beta=10^{-16}$ to be so close to 0 that it has no practical significance ($\beta$ plays the same role in both methods).

Figure 1: (Top) For a fixed simulation budget $N$, the cyan curve shows the number of empirical safety violations $k$ for different learned volumes (different super-levels of $\tilde{V}(x,0)$). The red curve shows the trade-off in safety strength $\epsilon$ (in log scale) for each $k$ using the robust method. The grey point indicates the iterative method baseline. The robust method is able to provide safety assurances even for volumes that have non-zero outliers. (Dashed black line) By a small decrease in safety level (from $99.999\%$ to $99.974\%$) caused by outliers, we are able to significantly increase the assured safe volume from $0.56$ to $0.81$. (Bottom) Correspondingly, the safe set $\mathcal{S}$ grows substantially, from the complement of the grey region to the complement of the blue region.

Firstly, for a fixed simulation budget $N$, the robust method allows one to trade off the probabilistic strength of safety (increasing $\epsilon$) for resilience (increasing $k$). In other words, the method can verify any given neural safe set $\mathcal{S}$ in an outlier-robust fashion by automatically attenuating the level of safety assurance based on the number of empirical outliers (i.e., safety violations). The iterative method, in contrast, can only verify a region that is outlier-free. Consequently, the robust method enables one to engage in a trade-off if a large increase in safe set volume can be attained by a tolerable decrease in safety, as illustrated in \figureref{fig:fixed_N}.

Figure 2: (Left) Computing the safety strength $\epsilon$ across different volumes (different super-levels of $\tilde{V}(x,0)$) for different simulation budgets $N$ using the robust method. The grey points indicate the iterative method baselines. (Right) As we increase $N$, the largest volume achieving the desired $99.968\%$ safety using the robust method increases up to a limit.

Secondly, by allowing nonzero safety violations $k$, the robust method provides stronger safety assurances for a fixed volume as the simulation budget $N$ increases, as long as the outlier rate does not grow substantially with $N$. Thus, with more simulation effort, significantly larger volumes can be attained for a desired safety strength $\epsilon$, as shown in \figureref{fig:diff_N}. Incrementing $N$ in the iterative method, on the other hand, will only correspond to verifying smaller volumes at a stronger $\epsilon$. It cannot verify larger volumes for a fixed $\epsilon$, because empirical safety violations will be introduced. \figureref{fig:diff_N} shows how the robust method (curves) adds a new degree of freedom for computing safety assurances compared to the iterative method (grey points).

5 Conformal Probabilistic Safety Verification Method

We now propose a conformal probabilistic safety verification method for neural reachable tubes, which is intended to be the direct analogue of the robust scenario-based method in \sectionref{sec:scenario-based_method}. The method is a straightforward application of split conformal prediction, a widely used method in the machine learning community for uncertainty quantification (Angelopoulos and Bates, 2023).

Using the same procedures as described in \sectionref{sec:scenario-based_method}, split conformal prediction can be used instead of robust scenario optimization to provide a probabilistic guarantee on the safety of the neural reachable tube and its complement, the neural safe set $\mathcal{S}$:

Theorem 5.1 (Conformal Probabilistic Safety Verification).

Let the number of outliers $k$ and the number of samples $N$ be as defined in the procedures in \sectionref{sec:scenario-based_method}, then:

\underset{x\in\mathcal{S}}{\mathbb{P}}\left(J_{\tilde{\pi}}(x,0)>0\right)\sim\mathrm{Beta}(N-k,k+1) (4)
\theoremref{theorem:conformal_method} can be established via a straightforward application of conformal prediction with $-J_{\tilde{\pi}}(x,0)$ as the scoring function. The proof is in the Appendix of the extended version of this article.\footnotemark[1] The above theorem states that the fraction of $\mathcal{S}$ that is safe is distributed according to the Beta distribution with shape parameters $N-k$ and $k+1$. Intuitively, the mass in the distribution shifts towards 0 as $k$ increases for a fixed $N$, implying that it is more likely that a smaller fraction of $\mathcal{S}$ is safe, as expected. For a fixed ratio $N:k$, $N$ controls how concentrated the mass is around the mean; i.e., for larger sample sizes $N$, we can more confidently determine the fraction of $\mathcal{S}$ that is safe.

To better understand \theoremref{theorem:conformal_method}, we show in \figureref{fig:beta} the Beta distribution of $\underset{x\in\mathcal{S}}{\mathbb{P}}\left(J_{\tilde{\pi}}(x,0)>0\right)$ for a solution learned by DeepReach on the multi-vehicle collision avoidance running example in \sectionref{sec:problem_setup}, for which $k=731$ outliers are found from $N=3684118$ samples.

Figure 3: The Beta distribution of $\underset{x\in\mathcal{S}}{\mathbb{P}}\left(J_{\tilde{\pi}}(x,0)>0\right)$ when $k=731$ outliers are found from $N=3684118$ samples. (Dashed black line) For an example choice of confidence $1-\beta=0.9$ (shaded blue), we can lower-bound the fraction of $\mathcal{S}$ which is safe by $1-\epsilon=0.99979$ ($99.979\%$).
Remark 5.2.

The mean of the Beta distribution in Equation (4) is given as $\frac{N-k}{N+1}$, which is roughly the fraction of the empirically safe samples. One can immediately derive that the safety probability of $\mathcal{S}$, marginalized over the sampled “calibration” states, is given as: $\underset{\left(x_{1:N},\,x\right)\in\mathcal{S}}{\mathbb{P}}\left(J_{\tilde{\pi}}(x,0)>0\right)\geq\frac{N-k}{N+1}$, which precisely resembles the most commonly used coverage property of split conformal prediction.

Even though \theoremref{theorem:conformal_method} provides the distribution of the safety level, when we compute safety assurances in practice, it is often desirable to know a lower bound on the safety level that holds with at least some desired confidence. This corresponds to choosing a lower bound whose accumulated probability mass is smaller than some confidence parameter $\beta$ (shaded red in \figureref{fig:beta}). The following lemma formalizes this by using the CDF of the Beta distribution in \theoremref{theorem:conformal_method}.

Lemma 5.3 (Conformal Probabilistic Safety Verification).

Select a safety violation parameter $\epsilon\in(0,1)$ and a confidence parameter $\beta\in(0,1)$ such that

\sum^{k}_{i=0}\binom{N}{i}\epsilon^{i}(1-\epsilon)^{N-i}\leq\beta (5)

where $k$ and $N$ are as defined above. Then, with probability at least $1-\beta$, the following holds:

\underset{x\in\mathcal{S}}{\mathbb{P}}\left(V(x,0)\leq 0\right)\leq\epsilon (6)
\lemmaref{lemma:conformal_method} is, in fact, precisely the same result as obtained by \theoremref{theorem:scenario-based_method} using robust scenario optimization. This is no coincidence, as one can show that split conformal prediction more generally reduces to a robust scenario-optimization problem.

Remark 5.4.

In general, a split conformal prediction problem can be reduced to a robust scenario-optimization problem. This is proven in the Appendix of the extended version of this article.\footnotemark[1]

Due to the equivalence between the conformal and robust scenario-based methods, the analysis in \sectionref{sec:scenario-based_method} holds here as well. More generally, we hope that this insight will spur further investigation of the close relationship between the two methods.
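Numerically, the equivalence rests on the classical identity between the Beta CDF and the binomial tail, $\mathrm{P}\left[\mathrm{Beta}(N-k,\,k+1)\leq 1-\epsilon\right]=\sum_{i=0}^{k}\binom{N}{i}\epsilon^{i}(1-\epsilon)^{N-i}$, which is why Equation (5) and Equation (2) impose the same condition. A quick self-contained check (Monte-Carlo estimate of the Beta CDF against the exact sum; illustrative only, with small $N$ so the direct sum is stable):

```python
import math
import random

def binomial_sum(N, k, eps):
    """Exact binomial tail sum appearing in Equations (2) and (5)."""
    return sum(math.comb(N, i) * eps**i * (1 - eps)**(N - i)
               for i in range(k + 1))

def beta_cdf_mc(a, b, x, n_samples=200000, seed=0):
    """Monte-Carlo estimate of the Beta(a, b) CDF at x."""
    rng = random.Random(seed)
    hits = sum(rng.betavariate(a, b) <= x for _ in range(n_samples))
    return hits / n_samples

# Identity check: P[Beta(N-k, k+1) <= 1-eps] equals the binomial tail.
N, k, eps = 50, 3, 0.1
exact = binomial_sum(N, k, eps)
mc = beta_cdf_mc(N - k, k + 1, 1 - eps)
```

The two quantities agree up to Monte-Carlo error, mirroring how \lemmaref{lemma:conformal_method} and \theoremref{theorem:scenario-based_method} coincide analytically.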

6 Outlier-Adjusted Probabilistic Safety Verification Approach

The verification methods in \sectionref{sec:scenario-based_method,sec:conformal_method} are limited by the quality of the neural reachable tube. Although they can account for outliers, the computed safety level can be low if the outlier rate is high. This can lead to significant losses in the safe volume, as demonstrated in \sectionref{sec:rocketlanding,sec:reachavoidrocketlanding}.

To address this issue, we propose an outlier-adjusted approach that can recover a larger safe volume for any desired $\epsilon$. Note that in the verification methods, the key quantity which determines $\epsilon$ is the number of safety violations $k$. This corresponds to the number of samples $x_{i}$ which are marked safe by membership in $\mathcal{S}$, i.e., $\tilde{V}(x_{i},0)\geq\delta$, but are not guaranteed to be safe, i.e., $J_{\tilde{\pi}}(x_{i},0)\leq 0$. It is easy to see that the best we can do to simultaneously minimize $k$ and maximize volume is to compute $\mathcal{S}$ as the super-$\delta$ level set of the induced cost function $J_{\tilde{\pi}}(x,0)$. For example, the largest possible $\mathcal{S}$ that is guaranteed to be violation-free is precisely the super-zero level set of $J_{\tilde{\pi}}(x,0)$. Thus, our overall approach will be to refine $\tilde{V}(x,0)$ so that it more accurately reflects $J_{\tilde{\pi}}(x,0)$.

Modeling $J_{\tilde{\pi}}(x,0)$ can be formulated as a supervised learning problem, since we can sample a state $x_{i}$ and compute its cost $J_{\tilde{\pi}}(x_{i},0)$ in simulation. We learn an approximation $\tilde{J}_{\tilde{\pi}}(x,0)$ by retraining $\tilde{V}(x,0)$ on a training dataset $\mathcal{T}$ of $n$ samples, $\mathcal{T}=\{(x_{1},J_{\tilde{\pi}}(x_{1},0)),\ldots,(x_{n},J_{\tilde{\pi}}(x_{n},0))\}$. Specifically, we use the weighted MSE loss $\frac{1}{n}\sum_{i=1}^{n}w_{i}(\tilde{V}(x_{i},0)-J_{\tilde{\pi}}(x_{i},0))^{2}$, where $w_{i}=w$ if the error is conservative ($\tilde{V}(x_{i},0)<J_{\tilde{\pi}}(x_{i},0)$), and $w_{i}=1$ otherwise. We introduce $w$ as a hyperparameter to underweight conservative errors because, in the end, we are concerned with recovering larger safe volumes. Thus, selecting a small $w$ allows us to focus on reducing optimistic errors ($\tilde{V}(x_{i},0)>J_{\tilde{\pi}}(x_{i},0)$), which are more safety-critical and correspond to outlier safety violations.
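The weighting rule is straightforward; a plain-Python sketch (in practice this would be a batched tensor operation inside the retraining loop):

```python
def outlier_adjusted_mse(v_pred, j_true, w=1e-3):
    """Weighted MSE from the retraining objective: conservative errors
    (v_pred < j_true) are down-weighted by w, while optimistic,
    safety-critical errors (v_pred > j_true) keep full weight."""
    assert len(v_pred) == len(j_true)
    total = 0.0
    for v, j in zip(v_pred, j_true):
        w_i = w if v < j else 1.0
        total += w_i * (v - j) ** 2
    return total / len(v_pred)
```

Setting `w = 1.0` recovers the plain MSE; the paper's case studies use $w=10^{-3}$.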

To avoid overfitting, we select the training checkpoint that performs best on a validation dataset $\mathcal{V}$. The validation metric we use is the maximum learned cost of an empirically unsafe state, $\max_{x\in\mathcal{V}}\{\tilde{J}_{\tilde{\pi}}(x,0):J_{\tilde{\pi}}(x,0)\leq 0\}$, which one can think of as a proxy for the recoverable safe volume. We demonstrate the efficacy of the proposed outlier-adjusted approach for the high-dimensional systems of multi-vehicle collision avoidance and rocket landing with no-go zones. For all case studies, we set $w=10^{-3}$ during retraining, fix the confidence parameter $\beta=10^{-16}$, and find a safe volume that satisfies $\epsilon\leq 10^{-4}$ ($99.990\%$ safety) using the robust method in \sectionrefsec:scenario-based_method.
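The validation metric admits a one-line implementation; storing the validation set as paired arrays of learned and rollout costs is our assumption for illustration.

```python
import numpy as np

def validation_metric(j_learned, j_rollout):
    """Maximum learned cost J~_pi~(x, 0) over validation states that are
    empirically unsafe (J_pi~(x, 0) <= 0). Lower is better: it proxies how
    much volume the refined value function wrongly marks as safe.
    Returns -inf if no validation state is empirically unsafe."""
    unsafe = j_rollout <= 0
    if not np.any(unsafe):
        return -np.inf
    return float(np.max(j_learned[unsafe]))
```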

6.1 Multi-Vehicle Collision Avoidance

In \figurereffig:multivehicle, we compare our outlier-adjusted approach (blue) to the baseline (grey) for a DeepReach solution trained on the multi-vehicle collision avoidance running example in \sectionrefsec:problem_setup. A $2.3\%$ increase in the safe volume is attained, as shown by the tightened BRT. Note that the largest visual difference in the BRT is where the third vehicle is between the two others; intuitively, safety in this region is likely more difficult for the baseline approach to model.

Figure 4: Multi-Vehicle Collision Avoidance: outlier-adjusted (blue) and baseline (grey) results. (Left) Slice of the neural BRTs achieving $\epsilon=10^{-4}$ ($99.990\%$ safety). (Right) The outlier-adjusted approach increases the safe volume from $0.782$ to $0.8$ ($2.3\%$ increase).

6.2 Rocket Landing

We now apply our approach to a 6D rocket landing system with position $(p_{x},p_{y})$, heading $\theta$, velocity $(v_{x},v_{y})$, angular velocity $\omega$, and torque controls $\tau_{1},\tau_{2}\in[-250,250]$. The dynamics are: $\dot{p}_{x}=v_{x},~\dot{p}_{y}=v_{y},~\dot{\theta}=\omega,~\dot{\omega}=0.3\tau_{1},~\dot{v}_{x}=\tau_{1}\cos{\theta}-\tau_{2}\sin{\theta},~\dot{v}_{y}=\tau_{1}\sin{\theta}+\tau_{2}\cos{\theta}-g$, where $g=9.81$ is the acceleration due to gravity. The target set is the set of states where the rocket reaches a rectangular landing zone of side length 20m centered at the origin: $\mathcal{L}=\{x:|p_{x}|<20.0,\,p_{y}<20.0\}$. Note that we want to reach $\mathcal{L}$, so the BRT now represents the safe set. Results are shown in \figurereffig:rocketlanding. Interestingly, a large $9.58\%$ increase in the volume of the safe set is recovered using the proposed approach, particularly near the lower-left part of the state space. Further investigation reveals that the trajectories starting from these states exit the training region through its lower (southern) boundary. This highlights a general limitation of computing the value function over a constrained state space where information is propagated via dynamic programming, which affects both learning-based methods and traditional grid-based methods. Nevertheless, in this case, the relative order of the value function levels is still preserved, leading to a high-quality safe policy and recovery of a larger safe volume.
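The 6D dynamics above can be transcribed directly as an ODE right-hand side; the state ordering and function signature are our assumptions for illustration.

```python
import math

G = 9.81  # acceleration due to gravity

def rocket_dynamics(state, tau1, tau2):
    """Right-hand side of the 6D rocket landing dynamics.
    state = (px, py, theta, omega, vx, vy); torque controls tau1, tau2 in [-250, 250]."""
    px, py, theta, omega, vx, vy = state
    return (
        vx,                                                   # px_dot
        vy,                                                   # py_dot
        omega,                                                # theta_dot
        0.3 * tau1,                                           # omega_dot
        tau1 * math.cos(theta) - tau2 * math.sin(theta),      # vx_dot
        tau1 * math.sin(theta) + tau2 * math.cos(theta) - G,  # vy_dot
    )
```

At $\theta=0$ the thrust decouples, so $\dot{v}_x=\tau_1$ and $\dot{v}_y=\tau_2-g$, which is a quick sanity check on the sign conventions.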

Figure 5: Rocket Landing: outlier-adjusted (blue) and baseline (grey) results. (Left) Slice of the neural BRTs achieving $\epsilon=10^{-4}$ ($99.990\%$ safety). (Right) The outlier-adjusted approach increases the safe volume from $0.334$ to $0.366$ ($9.58\%$ increase).

6.3 Rocket Landing with No-Go Zones

We now consider the rocket landing problem in a constrained airspace with no-go zones of height 100m and width 10m to the left of the landing zone, as well as wherever the altitude is below the landing zone. Safety in this case takes the form of a reach-avoid set: the rocket needs to reach the landing zone while avoiding the no-go zones. An analogous HJI-VI to the one in \sectionrefsec:background_reachability can be derived for this case, and its solution can be computed using DeepReach. However, since reach-avoid problems are more complex than pure reach or avoid problems, the DeepReach solution results in a poor safety volume. In fact, no safe volume can be recovered with the desired safety level of $\epsilon\leq 10^{-4}$. In contrast, we can recover a sizable safe volume using the outlier-adjusted approach, as shown in \figurereffig:reachavoidrocketlanding. These examples highlight the utility of the proposed approach.

Figure 6: Rocket Landing with No-Go Zones: outlier-adjusted (blue) and baseline (grey) results. (Left) Slice of the neural BRTs achieving $\epsilon=10^{-4}$ ($99.990\%$ safety). (Right) The outlier-adjusted approach increases the safe volume from $0$ to $0.19$.

7 Discussion and Future Work

In this work, we propose two different verification methods, based on robust scenario optimization and conformal prediction, to provide probabilistic safety guarantees for neural reachable tubes. Our methods allow a direct trade-off between resilience to outlier errors in the neural tube, which are inevitable in a learning-based approach, and the strength of the probabilistic safety guarantee. Furthermore, we show that split conformal prediction, a widely used method in the machine learning community for uncertainty quantification, reduces to a scenario-based approach, making the two methods equivalent not only for verification of neural reachable tubes but also more generally. We hope that our proof will lead to future insights into the close relationship between the highly related but disparate fields of conformal prediction and scenario optimization. Finally, we propose an outlier-adjusted verification approach that harnesses information about the error distribution in neural reachable tubes to recover greater safe volumes. We demonstrate the efficacy of the proposed approaches for the high-dimensional problems of multi-vehicle collision avoidance and rocket landing with no-go zones. Altogether, these are important steps toward using learning-based reachability methods to compute safety assurances for high-dimensional systems in the real world.

In the future, we will explore how the key idea of the outlier-adjusted verification approach, using cost labels as a supervised learning signal, can be used to enhance the accuracy of learning-based reachability methods like DeepReach. Other directions include providing safety assurances in the presence of worst-case disturbances and in real-time for tubes that are generated online.

\acks

This work is supported in part by a NASA Space Technology Graduate Research Opportunity, the NVIDIA Academic Hardware Grant Program, the NSF CAREER Program under award 2240163, and the DARPA ANSR program.

References

  • Althoff et al. (2010) Matthias Althoff, Olaf Stursberg, and Martin Buss. Computing reachable sets of hybrid systems using a combination of zonotopes and polytopes. Nonlinear analysis: hybrid systems, 4(2):233–249, 2010.
  • Angelopoulos and Bates (2023) Anastasios N. Angelopoulos and Stephen Bates. Conformal prediction: A gentle introduction. Foundations and Trends® in Machine Learning, 16(4):494–591, 2023. ISSN 1935-8237. 10.1561/2200000101. URL \urlhttp://dx.doi.org/10.1561/2200000101.
  • Bak et al. (2019) Stanley Bak, Hoang-Dung Tran, and Taylor T Johnson. Numerical verification of affine systems with up to a billion dimensions. In International Conference on Hybrid Systems: Computation and Control, pages 23–32, 2019.
  • Bansal and Tomlin (2021) Somil Bansal and Claire J Tomlin. DeepReach: A deep learning approach to high-dimensional reachability. In IEEE International Conference on Robotics and Automation (ICRA), 2021.
  • Bansal et al. (2017) Somil Bansal, Mo Chen, Sylvia Herbert, and Claire J Tomlin. Hamilton-Jacobi Reachability: A brief overview and recent advances. In IEEE Conference on Decision and Control (CDC), 2017.
  • Campi and Garatti (2011) M. C. Campi and S. Garatti. A sampling-and-discarding approach to chance-constrained optimization: feasibility and optimality. Journal of Optimization Theory and Applications, 2011.
  • Campi et al. (2009) M. C. Campi, S. Garatti, and M. Prandini. The scenario approach for systems and control design. Annual Reviews in Control, 2009.
  • Chow et al. (2017) Yat Tin Chow, Jérôme Darbon, Stanley Osher, and Wotao Yin. Algorithm for overcoming the curse of dimensionality for time-dependent non-convex hamilton–jacobi equations arising from optimal control and differential games problems. Journal of Scientific Computing, 73(2-3):617–643, 2017.
  • Coogan and Arcak (2015) Samuel Coogan and Murat Arcak. Efficient finite abstraction of mixed monotone systems. In Proceedings of the 18th International Conference on Hybrid Systems: Computation and Control, pages 58–67, 2015.
  • Darbon et al. (2020) Jerome Darbon, Gabriel P Langlois, and Tingwei Meng. Overcoming the curse of dimensionality for some hamilton–jacobi partial differential equations via neural network architectures. Research in the Mathematical Sciences, 7(3):1–50, 2020.
  • Djeridane and Lygeros (2006) Badis Djeridane and John Lygeros. Neural approximation of pde solutions: An application to reachability computations. In Conference on Decision and Control, pages 3034–3039, 2006.
  • DLMF (2023) DLMF. NIST Digital Library of Mathematical Functions. Release 1.1.11 of 2023-09-15, 2023. URL \urlhttps://dlmf.nist.gov/. F. W. J. Olver, A. B. Olde Daalhuis, D. W. Lozier, B. I. Schneider, R. F. Boisvert, C. W. Clark, B. R. Miller, B. V. Saunders, H. S. Cohl, and M. A. McClain, eds.
  • Dreossi et al. (2016) Tommaso Dreossi, Thao Dang, and Carla Piazza. Parallelotope bundles for polynomial reachability. In International Conference on Hybrid Systems: Computation and Control, 2016.
  • Fisac et al. (2019) Jaime F. Fisac, Neil F. Lugovoy, Vicenç Rubies-Royo, Shromona Ghosh, and Claire J. Tomlin. Bridging Hamilton-Jacobi Safety Analysis and Reinforcement Learning. International Conference on Robotics and Automation, 2019.
  • Frehse et al. (2011) G. Frehse, C. Le Guernic, A. Donzé, S. Cotton, R. Ray, O. Lebeltel, R. Ripado, A. Girard, T. Dang, and O. Maler. SpaceEx: Scalable verification of hybrid systems. In International Conference Computer Aided Verification, 2011.
  • Girard (2005) Antoine Girard. Reachability of uncertain linear systems using zonotopes. In International Workshop on Hybrid Systems: Computation and Control, pages 291–305, 2005.
  • Greenstreet and Mitchell (1998) Mark R. Greenstreet and Ian Mitchell. Integrating projections. In Thomas A. Henzinger and Shankar Sastry, editors, Hybrid Systems: Computation and Control, pages 159–174, Berlin, Heidelberg, 1998. Springer Berlin Heidelberg. ISBN 978-3-540-69754-1.
  • Henrion and Korda (2014) D. Henrion and M. Korda. Convex computation of the region of attraction of polynomial control systems. IEEE Transactions on Automatic Control, 59(2):297–312, 2014.
  • Kurzhanski and Varaiya (2002) Alexander Kurzhanski and Pravin Varaiya. On ellipsoidal techniques for reachability analysis. part ii: Internal approximations box-valued constraints. Optimization Methods and Software, 17:207–237, 01 2002. 10.1080/1055678021000012435.
  • Kurzhanski and Varaiya (2000) Alexander B Kurzhanski and Pravin Varaiya. Ellipsoidal techniques for reachability analysis: internal approximation. Systems & Control Letters, 2000.
  • Lin and Bansal (2023) Albert Lin and Somil Bansal. Generating formal safety assurances for high-dimensional reachability. In 2023 IEEE International Conference on Robotics and Automation (ICRA), pages 10525–10531. IEEE, 2023.
  • Lygeros (2004) John Lygeros. On reachability and minimum cost optimal control. Automatica, 40(6):917–927, 2004.
  • Maidens et al. (2013) John N Maidens, Shahab Kaynama, Ian M Mitchell, Meeko MK Oishi, and Guy A Dumont. Lagrangian methods for approximating the viability kernel in high-dimensional systems. Automatica, 2013.
  • Majumdar and Tedrake (2017) A. Majumdar and R. Tedrake. Funnel libraries for real-time robust feedback motion planning. The International Journal of Robotics Research, 36(8):947–982, 2017.
  • Majumdar et al. (2014) Anirudha Majumdar, Ram Vasudevan, Mark M. Tobenkin, and Russ Tedrake. Convex optimization of nonlinear feedback controllers via occupation measures. The International Journal of Robotics Research, 33(9):1209–1230, 2014. 10.1177/0278364914528059. URL \urlhttps://doi.org/10.1177/0278364914528059.
  • Mitchell et al. (2005) Ian Mitchell, Alex Bayen, and Claire J. Tomlin. A time-dependent Hamilton-Jacobi formulation of reachable sets for continuous dynamic games. IEEE Transactions on Automatic Control (TAC), 50(7):947–957, 2005.
  • Niarchos and Lygeros (2006) KN Niarchos and John Lygeros. A neural approximation to continuous time reachability computations. In Conference on Decision and Control, pages 6313–6318, 2006.
  • Nilsson and Ozay (2016) Petter Nilsson and Necmiye Ozay. Synthesis of separable controlled invariant sets for modular local control design. In American Control Conference, pages 5656–5663, 2016.
  • Onken et al. (2022) Derek Onken, Levon Nurbekyan, Xingjian Li, Samy Wu Fung, Stanley Osher, and Lars Ruthotto. A neural network approach for high-dimensional optimal control applied to multiagent path finding. IEEE Transactions on Control Systems Technology, 2022.
  • Rubies-Royo et al. (2019) Vicenç Rubies-Royo, David Fridovich-Keil, Sylvia Herbert, and Claire J Tomlin. A classification-based approach for approximate reachability. In International Conference on Robotics and Automation, pages 7697–7704. IEEE, 2019.
  • Vovk (2012) Vladimir Vovk. Conditional validity of inductive conformal predictors. In Steven C. H. Hoi and Wray Buntine, editors, Proceedings of the Asian Conference on Machine Learning, volume 25 of Proceedings of Machine Learning Research, pages 475–490, Singapore Management University, Singapore, 04–06 Nov 2012. PMLR. URL \urlhttps://proceedings.mlr.press/v25/vovk12.html.

Appendix A Robust Scenario-Based Proofs

First, we introduce and prove the following lemma regarding a generic 1-dimensional chance-constrained optimization problem (CCP), which will be useful for subsequent proofs.

Lemma A.1 (Solution Feasibility for a 1-D CCP).

Consider the following 1-dimensional CCP:

\[
\begin{split}
\text{CCP}_{\epsilon}:\ &\min_{g\in\mathbb{R}}{g}\\
&\text{s.t. }\underset{h\in H}{\mathbb{P}}\left(f(h)\leq g\right)\geq 1-\epsilon
\end{split}
\quad (7)
\]

where $g$ is the 1-dimensional optimization variable, $h$ is the uncertain parameter that describes different instances of an uncertain optimization scenario, and $f$ is some function of $h$.

The corresponding sample counterpart (SP) of this CCP is:

\[
\begin{split}
\text{SP}^{A}_{N,k}:\ &\min_{g\in\mathbb{R}}{g}\\
&\text{s.t. }f(h_{i})\leq g,\quad i\in\{1,\dots,N\}\setminus A(\{h_{1},\dots,h_{N}\})
\end{split}
\quad (8)
\]

where $N$ constraints are sampled but $k$ constraints are discarded according to some constraint elimination algorithm $A$. Let $g^{*}_{N,k}$ denote the solution to the above SP. Select a violation parameter $\epsilon\in(0,1)$ and a confidence parameter $\beta\in(0,1)$ such that

\[
\sum^{k}_{i=0}\binom{N}{i}\epsilon^{i}(1-\epsilon)^{N-i}\leq\beta \quad (9)
\]

Then, with probability at least $1-\beta$, the following holds:

\[
\underset{h\in H}{\mathbb{P}}\left(f(h)>g^{*}_{N,k}\right)\leq\epsilon \quad (10)
\]
Proof A.2.

\lemmareflemma:CCP is a straightforward application of a scenario-based sampling-and-discarding approach to a CCP, as detailed in Campi and Garatti (2011). CCP (7) satisfies the assumptions of Campi and Garatti (2011), since both the domain of optimization $\mathbb{R}$ and the constraint sets parameterized by $h$, $\{g:f(h)\leq g\}$, are convex and closed in $g$. Thus, \lemmareflemma:CCP follows as a special case of Theorem 2.1 in Campi and Garatti (2011), where $d=1$, $c=1$, $x=g$, $X=G=\mathbb{R}$, $\delta=h$, $\Delta=H$, and $X_{\delta}=G_{h}=\{g:f(h)\leq g\}$.
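In practice, the binomial tail condition in Equation (9) is used to find the smallest violation level $\epsilon$ certified by $N$ samples with $k$ discarded constraints at confidence $1-\beta$. A bisection sketch (the function names and bisection routine are our illustration, not the paper's code):

```python
import math

def binomial_tail(N, k, eps):
    """Left tail sum_{i=0}^{k} C(N, i) eps^i (1 - eps)^(N - i) from Equation (9)."""
    return sum(math.comb(N, i) * eps**i * (1 - eps) ** (N - i) for i in range(k + 1))

def smallest_eps(N, k, beta, tol=1e-12):
    """Bisect for the smallest eps with binomial_tail(N, k, eps) <= beta.
    The tail is a binomial CDF at k, hence decreasing in eps, so bisection applies."""
    lo, hi = 0.0, 1.0  # hi always satisfies the condition (tail is 0 at eps = 1)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if binomial_tail(N, k, mid) <= beta:
            hi = mid
        else:
            lo = mid
    return hi
```

For $k=0$ the condition reduces to $(1-\epsilon)^{N}\leq\beta$, i.e., $\epsilon\geq 1-\beta^{1/N}$, which gives a quick closed-form check.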

A.1 Proof of \theoremreftheorem:scenario-based_method (Robust Scenario-Based Probabilistic Safety Verification Theorem)

Proof A.3.

Consider the chance-constrained optimization problem (CCP) (7) in \lemmareflemma:CCP directly above, where $h=x$, $H=\mathcal{S}$, and $f(h)=f(x)=-J_{\tilde{\pi}}(x,0)$. The proposed robust scenario-based probabilistic safety verification method deals with the corresponding sample counterpart (SP) (8) in \lemmareflemma:CCP, where the constraint elimination algorithm $A$ removes all $k$ constraints $f(h_{i})\leq g$ with $f(h_{i})=-J_{\tilde{\pi}}(x_{i},0)\geq 0$. Thus, the only constraints remaining are $f(h_{i})\leq g$ with $f(h_{i})<0$. Since we are minimizing $g$, the solution $g^{*}_{N,k}$ to the SP must satisfy $g^{*}_{N,k}<0$, i.e., $0<-g^{*}_{N,k}$. Therefore, $\underset{x\in\mathcal{S}}{\mathbb{P}}(J_{\tilde{\pi}}(x,0)\leq 0)\leq\underset{x\in\mathcal{S}}{\mathbb{P}}(J_{\tilde{\pi}}(x,0)<-g^{*}_{N,k})$. Equation (10) of \lemmareflemma:CCP then yields $\underset{x\in\mathcal{S}}{\mathbb{P}}(J_{\tilde{\pi}}(x,0)<-g^{*}_{N,k})\leq\epsilon$, which implies $\underset{x\in\mathcal{S}}{\mathbb{P}}(J_{\tilde{\pi}}(x,0)\leq 0)\leq\epsilon$. Since $J_{\tilde{\pi}}(x,t)\leq V(x,t)$ for all $(x,t)$ from Equation (1), Equation (3) of \theoremreftheorem:scenario-based_method directly follows.

Appendix B Conformal Proofs

B.1 Proof of \theoremreftheorem:conformal_method (Conformal Probabilistic Safety Verification Theorem)

Proof B.1.

\theoremreftheorem:conformal_method is a straightforward application of the split conformal prediction method detailed in Angelopoulos and Bates (2023), where we set the conformal "input" $x=x$, "output" $y=-J_{\tilde{\pi}}(x,0)$, "score function" $s(x,y)=y=-J_{\tilde{\pi}}(x,0)$, "size of the calibration set" $n=N$, and "user-chosen error rate" $\alpha=\frac{k+1}{N+1}$. The conformal $\hat{q}$ is then computed as the $\frac{\lceil(N+1)(1-\alpha)\rceil}{N}$ quantile of the calibration scores $-J_{\tilde{\pi}}(x_{1:N},0)$. The quantile is $\frac{\lceil(N+1)(1-\alpha)\rceil}{N}=\frac{\lceil(N+1)(1-\frac{k+1}{N+1})\rceil}{N}=\frac{\lceil(N+1)\cdot\frac{N-k}{N+1}\rceil}{N}=\frac{N-k}{N}$, where we have defined $k$ in the procedures in \sectionrefsec:scenario-based_method as the number of scores $-J_{\tilde{\pi}}(x_{i},0)\geq 0$. Thus, this quantile corresponds precisely to the largest negative score, so we know that $\hat{q}<0$. Theorem 1 in Angelopoulos and Bates (2023) then yields:

\[
\begin{aligned}
\underset{\left(x_{1:N},x\right)\in\mathcal{S}}{\mathbb{P}}\left(-J_{\tilde{\pi}}(x,0)\leq\hat{q}\right)&\geq 1-\alpha\\
\underset{\left(x_{1:N},x\right)\in\mathcal{S}}{\mathbb{P}}\left(J_{\tilde{\pi}}(x,0)\geq-\hat{q}\right)&\geq 1-\frac{k+1}{N+1}\\
\underset{\left(x_{1:N},x\right)\in\mathcal{S}}{\mathbb{P}}\left(J_{\tilde{\pi}}(x,0)>0\right)&\geq\frac{N-k}{N+1}
\end{aligned}
\quad (11)
\]

where Equation (11) follows from the line preceding it because, since $\hat{q}<0$, if $J_{\tilde{\pi}}(x,0)\geq-\hat{q}$, then certainly $J_{\tilde{\pi}}(x,0)>0$. This coverage property result is precisely the same as described in \remarkrefremark:CP_coverage. Furthermore, Section 3.2 in Angelopoulos and Bates (2023) yields:

\[
\begin{aligned}
\underset{x\in\mathcal{S}}{\mathbb{P}}\left(J_{\tilde{\pi}}(x,0)>0\right)&\sim\text{Beta}(N+1-l,\,l),\quad l=\lfloor(N+1)\alpha\rfloor\\
\underset{x\in\mathcal{S}}{\mathbb{P}}\left(J_{\tilde{\pi}}(x,0)>0\right)&\sim\text{Beta}(N-k,\,k+1)
\end{aligned}
\quad (12)
\]

which is precisely the result of \theoremreftheorem:conformal_method.

B.2 Proof that Split Conformal Prediction Reduces to Robust Scenario Optimization

Here, we show that split conformal prediction, in full generality, reduces to a robust scenario optimization problem. We hope that this insight will encourage future research on the close relationship between the highly related but disparate fields of conformal prediction and scenario optimization. In split conformal prediction, we first define a score function $s(x,y)\in\mathbb{R}$, which is meant to reflect the uncertainty for a model input $x$ and corresponding model output $y$. Then, we sample an i.i.d. calibration set $(X_{1},Y_{1}),\dots,(X_{n},Y_{n})$ and compute $\hat{q}$ as the $\frac{\lceil(n+1)(1-\alpha)\rceil}{n}$ quantile of the calibration scores $s(X_{1},Y_{1}),\dots,s(X_{n},Y_{n})$, where $\alpha\in[0,1]$ is a user-chosen error rate. For a new i.i.d. sample $X_{\text{test}}$, we construct a prediction set $C(X_{\text{test}})=\{y:s(X_{\text{test}},y)\leq\hat{q}\}$. Theorem 1 in Angelopoulos and Bates (2023) provides the following coverage property: $\mathbb{P}\left(Y_{\text{test}}\in C(X_{\text{test}})\right)\geq 1-\alpha$. This follows from the more powerful property, first introduced in Vovk (2012), which we prove below reduces to a robust scenario-based result:

\[
\mathbb{P}\left(Y_{\text{test}}\in C(X_{\text{test}})\,\middle|\,\{(X_{i},Y_{i})\}^{n}_{i=1}\right)\sim\text{Beta}(n+1-l,\,l),\quad l=\lfloor(n+1)\alpha\rfloor \quad (13)
\]
Proof B.2.

To show that split conformal prediction reduces to a scenario-based approach, consider the CCP (7) in \lemmareflemma:CCP in \appendixrefapd:SO, where $h=(x,y)$ and $f(h)=f\left((x,y)\right)=s(x,y)$. That is, we want to find a probabilistic upper bound on samples of the score function. Then, consider the corresponding SP (8) in \lemmareflemma:CCP, where the $n$ sampled calibration scores $s(X_{1},Y_{1}),\dots,s(X_{n},Y_{n})$ form our set of constraints. Remove the $k=\lfloor(n+1)\alpha-1\rfloor$ largest scores, where $\alpha$ is the user-chosen error rate in split conformal prediction. The largest remaining score will be the $\frac{n-k}{n}$ quantile. Note that $\frac{n-k}{n}=\frac{n-\lfloor(n+1)\alpha-1\rfloor}{n}=\frac{n+\lceil-((n+1)\alpha-1)\rceil}{n}=\frac{\lceil n-(n+1)\alpha+1\rceil}{n}=\frac{\lceil(n+1)(1-\alpha)\rceil}{n}$, which is precisely the same quantile as $\hat{q}$ in split conformal prediction. Thus, the solution to the SP is $g^{*}_{n,k}=\hat{q}$. \lemmareflemma:CCP tells us that for a violation parameter $\epsilon\in(0,1)$ and a confidence parameter $\beta\in(0,1)$ satisfying the relationship in Equation (9), with probability at least $1-\beta$, the following holds:

\[
\mathbb{P}\left(Y_{\text{test}}\in C(X_{\text{test}})\,\middle|\,\{(X_{i},Y_{i})\}^{n}_{i=1}\right)\geq 1-\epsilon \quad (14)
\]

This is equivalent to Equation (13). To see why, note that the cumulative distribution function of the Beta distribution in Equation (13) is given in terms of $k$ by the incomplete beta function ratio $I_{x}(n-k,k+1)=\sum^{n}_{j=n-k}\binom{n}{j}x^{j}(1-x)^{n-j}$ (DLMF, Eq. 8.17.5, \urlhttps://dlmf.nist.gov/8.17.E5). Changing the index via $i=n-j$ yields $I_{x}(n-k,k+1)=\sum^{k}_{i=0}\binom{n}{n-i}x^{n-i}(1-x)^{n-(n-i)}=\sum^{k}_{i=0}\binom{n}{i}x^{n-i}(1-x)^{i}$. Thus, Equation (13) is equivalent to the claim that for any violation parameter $\epsilon\in(0,1)$ and confidence parameter $\beta\in(0,1)$, $\mathbb{P}\left(Y_{\text{test}}\in C(X_{\text{test}})\,\middle|\,\{(X_{i},Y_{i})\}^{n}_{i=1}\right)\geq 1-\epsilon$ (Equation (14)) holds with probability at least $1-\beta$ as long as $\beta\geq I_{1-\epsilon}(n-k,k+1)=\sum^{k}_{i=0}\binom{n}{i}\epsilon^{i}(1-\epsilon)^{n-i}$ (Equation (9)).
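The quantile identity at the heart of this reduction can also be checked numerically: the conformal $\hat{q}$ and the scenario solution $g^{*}_{n,k}$ coincide on any calibration sample when $k=\lfloor(n+1)\alpha-1\rfloor$. A sketch with arbitrary illustrative scores (the function names are ours):

```python
import math

def conformal_qhat(scores, alpha):
    """q-hat as the ceil((n+1)(1-alpha))/n empirical quantile of the scores."""
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))  # 1-indexed order statistic
    return sorted(scores)[rank - 1]

def scenario_solution(scores, k):
    """g*_{n,k}: discard the k largest scores, return the largest remaining one,
    which is the minimal g satisfying all remaining constraints s_i <= g."""
    return sorted(scores)[len(scores) - k - 1]
```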

B.3 Proof of \lemmareflemma:conformal_method (Conformal Probabilistic Safety Verification Lemma)

Proof B.3.

The conformal probabilistic safety verification method in \sectionrefsec:conformal_method is nothing more than a specific formulation of the general split conformal prediction method in \appendixrefapd:reduction_proof, which we have proven provides a result equivalent to the robust scenario optimization result in \lemmareflemma:CCP. The result in \lemmareflemma:CCP, when formulated in the context of the conformal probabilistic safety verification method in \sectionrefsec:conformal_method and noting that $J_{\tilde{\pi}}(x,t)\leq V(x,t)$ for all $(x,t)$ from Equation (1), is precisely \lemmareflemma:conformal_method.