
Rationally Inattentive Path-Planning via RRT*

Jeb Stefan1, Ali Reza Pedram2, Riku Funada3 and Takashi Tanaka3 *This work is supported by Lockheed Martin Corporation.1Odyssey Space Research. jeb.stefan@odysseysr.com. 2Walker Department of Mechanical Engineering, University of Texas at Austin. apedram@utexas.edu. 3Department of Aerospace Engineering and Engineering Mechanics, University of Texas at Austin. riku.funada@austin.utexas.edu and ttanaka@utexas.edu.
Abstract

We consider a path-planning scenario for a mobile robot traveling through a configuration space with obstacles in the presence of stochastic disturbances. A novel path length metric is proposed on the uncertain configuration space and then integrated with the existing RRT* algorithm. The metric is a weighted sum of two terms that capture both the Euclidean distance traveled by the robot and the perception cost, i.e., the amount of information the robot must perceive about the environment to follow the path safely. The continuity of the path length function with respect to the topology of the total variation metric is shown, and the optimality of the Rationally Inattentive RRT* algorithm is discussed. Three numerical studies are presented that demonstrate the utility of the new algorithm.

I Introduction

As robots are designed to be more self-reliant in navigating complex and stochastic environments, it is sensible for the strategic execution of perception/cognition tasks to be included in the theory that governs their path-planning [1, 2, 3]. Even though the body of work surrounding motion planning techniques has greatly expanded recently, a technological gap remains in the integration of perception concerns into planning tasks [4]. Mitigating this gap is paramount for missions that require robots to autonomously complete tasks when sensing actions carry high costs (battery power, computing constraints, etc.).

Path-planning is typically followed by feedback control design, which is executed during the path-following phase. In current practice, path-planning and path-following are usually discussed separately (notable exceptions include [5, 6, 7]), and the cost of feedback control (perception cost in particular) is not incorporated into the path-planning phase. The first objective of this work is to fill this gap by introducing a novel path cost function that incorporates the expected perception cost accrued during path-following into the planning phase. This cost jointly penalizes the amount of sensing needed to follow a path and the distance traveled. Our approach is closely related to the concept of rationally inattentive (RI) control [8] (a topic from macroeconomics that has recently been applied in control theory [9, 10]). The aim of rationally inattentive control is to jointly design the control and sensing policies such that the least amount of information (measured in bits) is collected about the environment in order to achieve the desired control.

The second objective of this work is to integrate the proposed path length function with an existing sampling-based algorithm, such as Rapidly-Exploring Random Trees (RRT) [11]. The RRT algorithm is suited for this problem as it has been shown to find feasible paths in motion planning problems quickly. A modified version of this algorithm, RRT* [12], will be utilized as it has the additional property of being asymptotically optimal. We develop an RRT*-like algorithm incorporating the proposed path length function (called the RI-RRT* algorithm) and demonstrate its effectiveness.

While the practical utility of the proposed framework must be thoroughly studied in the future, its expected impact is displayed in Fig. 1. The figure shows an example of a robot moving through a two-dimensional, obstacle-filled environment. Path A (red) represents the path from the origin to the target location that minimizes the Euclidean distance. However, this path requires a large number of sensor actuations to keep the robot's spatial uncertainty small and avoid colliding with obstacles. Alternatively, Path B (blue) allows the covariance to safely grow more along the path. Although Path B travels a greater Euclidean distance to reach the target, it is cheaper in the information-theoretic sense as it requires fewer sensing actions. Therefore, if the perception cost is weighted more heavily than the travel cost, Path B is characterized as the shortest path in the proposed path-planning framework. We demonstrate this effect in a numerical simulation in Section V-C.

The proposed concept of rationally inattentive path-planning provides insight into the mathematical modeling of human experts' skills in path planning [13], especially in terms of an efficiency-simplicity trade-off. Several path-planning algorithms capable of enhancing path simplicity have been proposed in the literature; this list includes potential field approaches [14], multi-resolution perception and path-planning [15, 16], and safe path-planning [17, 18]. The information-theoretic distance function we introduce in this paper can be thought of as an alternative measure of path simplicity, which may provide a suitable model of the human intuition for simplicity in planning. By our standard, a path that requires less sensor information during the path-following phase is "simpler": Path B in Fig. 1 is simpler than Path A, and the simplest path is one that is traceable by an open-loop control policy.

The contributions of this paper are summarized as follows:

  • A novel path cost (RI cost) is formulated which jointly accounts for travel distance and perception cost.

  • The continuity of the path cost with respect to the topology of the total variation metric is shown in the single dimensional case, which is a step forward to guaranteeing the asymptotic optimality of sampling-based algorithms.

  • An RRT*-like algorithm is produced implementing the RI path-planning concept.

Refer to caption

Figure 1: Example of an autonomous robot navigating a two-dimensional configuration space with obstacles. The goal of the robot is to reach the target location. As it moves, the uncertainty of the robot’s exact location in the environment grows, represented by the varying sized covariance ellipses.

Notation: Throughout this work, lower-case symbols denote vectors and upper-case symbols denote matrices. We define $\mathbb{S}^{d}=\{P\in\mathbb{R}^{d\times d}:P\text{ is symmetric}\}$, $\mathbb{S}_{++}^{d}=\{P\in\mathbb{S}^{d}:P\succ 0\}$, and $\mathbb{S}_{\epsilon}^{d}=\{P\in\mathbb{S}^{d}:P\succeq\epsilon I\}$. Bold symbols such as $\bm{x}$ represent random variables. The vector 2-norm is $\|\cdot\|$ and $\|\cdot\|_{F}$ is the Frobenius norm. The maximum singular value of a matrix $M$ is denoted by $\bar{\sigma}(M)$.

II Preliminary Material

In this paper, we consider a path-planning problem for a mobile robot with dynamics given by model (1). Let $\bm{x}(t)$ be an $\mathbb{R}^{d}$-valued random process representing the robot's position at time $t$, given by the controlled Ito process:

$d\bm{x}(t)=\bm{v}(t)\,dt+W^{\frac{1}{2}}\,d\bm{b}(t),$    (1)

with $\bm{x}(0)\sim\mathcal{N}(x_{0},P_{0})$ and $t\in[0,T]$. Here, $\bm{v}(t)$ is the velocity input command, $\bm{b}(t)$ is the $d$-dimensional standard Brownian motion, and $W$ is a given positive definite matrix modeling the process noise intensity. We assume that the robot is commanded to travel at unit velocity (i.e., $\|\bm{v}(t)\|=1$). Let $\mathcal{P}=(0=t_{0}<t_{1}<\cdots<t_{N}=T)$ be a partition of $[0,T]$, not necessarily of equal spacing. Time discretization of (1) based on the Euler-Maruyama method [19] yields:

$\bm{x}(t_{k+1})=\bm{x}(t_{k})+\bm{v}(t_{k})\Delta t_{k}+\bm{n}(t_{k}),$    (2)

where $\Delta t_{k}=t_{k+1}-t_{k}$ and $\bm{n}(t_{k})\sim\mathcal{N}(0,\Delta t_{k}W)$. Introducing a new control input $\bm{u}(t_{k}):=\bm{v}(t_{k})\Delta t_{k}$ and applying the constraint $\|\bm{v}(t_{k})\|=1$, (2) can be written as:

$\bm{x}(t_{k+1})=\bm{x}(t_{k})+\bm{u}(t_{k})+\bm{n}(t_{k}),$    (3)

with $\bm{n}(t_{k})\sim\mathcal{N}(0,\|\bm{u}(t_{k})\|W)$. Due to the unit-velocity assumption above, the time intervals $\Delta t_{k}$, $k=0,1,2,\cdots$, are determined once the command sequence $\bm{u}(t_{0}),\bm{u}(t_{1}),\bm{u}(t_{2}),\cdots$ is specified. Since the physical times $t_{k}$ do not play a significant role in the theoretical development that follows, it is convenient to rewrite (3) as the main dynamics model of this work:

$\bm{x}_{k+1}=\bm{x}_{k}+\bm{u}_{k}+\bm{n}_{k},\quad\bm{n}_{k}\sim\mathcal{N}(0,\|\bm{u}_{k}\|W).$    (4)

Let the probability distribution of the robot position at a given time step $k$ be parametrized by a Gaussian model $\bm{x}_{k}\sim\mathcal{N}(x_{k},P_{k})$, where $x_{k}\in\mathbb{R}^{d}$ is the nominal position and $P_{k}\in\mathbb{S}_{++}^{d}$ is the associated covariance matrix (with $d$ being the dimension of the configuration space). In this paper, we consider a path-planning framework in which the sequence $\{(x_{k},P_{k})\}_{k\in\mathbb{N}}$ is scheduled. Following [20, 21], the product space $\mathbb{R}^{d}\times\mathbb{S}_{++}^{d}$ is called the uncertain configuration space. In what follows, we formulate the problem of finding the shortest path in the uncertain configuration space with respect to a novel information-theoretic path length function.
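To make the discretized model concrete, the following Python sketch propagates one step of (4) in the scalar case $d=1$. The numbers are illustrative assumptions, not values from the paper; note how the prior covariance grows in proportion to the commanded travel distance.

```python
import math
import random

def propagate(x, P, u, W):
    """One step of the scalar (d = 1) dynamics (4): returns the nominal
    next position, the grown prior covariance P_hat = P + |u|*W, and one
    sampled realization of the noisy next state."""
    P_hat = P + abs(u) * W            # covariance grows with distance traveled
    x_next = x + u                    # nominal (mean) position
    sample = x_next + random.gauss(0.0, math.sqrt(abs(u) * W))
    return x_next, P_hat, sample

x1, P_hat1, _ = propagate(x=0.0, P=0.5, u=1.0, W=0.75)
# nominal position 1.0; prior covariance 0.5 + 1.0*0.75 = 1.25
```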

First, an appropriate directed distance function from a point $(x_{k},P_{k})\in\mathbb{R}^{d}\times\mathbb{S}_{++}^{d}$ to another $(x_{k+1},P_{k+1})\in\mathbb{R}^{d}\times\mathbb{S}_{++}^{d}$ is introduced. This function is interpreted as the cost of steering the random state variable $\bm{x}_{k}\sim\mathcal{N}(x_{k},P_{k})$ to $\bm{x}_{k+1}\sim\mathcal{N}(x_{k+1},P_{k+1})$ in the next time step under the dynamics (4). In order to implement the rational inattention concept, we formulate this cost as a weighted sum of the control cost $\mathcal{D}_{\text{cont}}(k)$ and the information cost $\mathcal{D}_{\text{info}}(k)$ of achieving each state transition.

II-A Control Cost

The control cost is simply the commanded travel distance in the Euclidean metric:

$\mathcal{D}_{\text{cont}}(k):=\|x_{k+1}-x_{k}\|.$    (5)

II-B Information Cost

Jointly accounting for both the control efficiency and sensing simplicity in planning necessitates the formulation of a metric that captures the information acquisition cost required for path following. We utilize the information gain (entropy reduction) for this purpose.

Assume that the control input $\bm{u}_{k}=x_{k+1}-x_{k}$ is applied to (4). The prior covariance propagated during the movement of the robot over the time interval $[t_{k},t_{k+1})$ is $\hat{P}_{k}=P_{k}+\|x_{k+1}-x_{k}\|W$. At time $t_{k+1}$, the covariance is "reduced" to $P_{k+1}(\preceq\hat{P}_{k})$ by utilizing a sensor input. The minimum information gain (the minimum number of bits that must be contained in the sensor data) for this transition is:

$\mathcal{D}_{\text{info}}(k)=\frac{1}{2}\log_{2}\det\hat{P}_{k}-\frac{1}{2}\log_{2}\det P_{k+1}.$    (6)

The notion of an "optimal" sensing signal that reduces $\hat{P}_{k}$ to $P_{k+1}$ has been previously discussed in [22] in the context of optimal sensing in filtering theory. The information cost function $\mathcal{D}_{\text{info}}(k)$ in (6) is well-defined for pairs $(P_{k},P_{k+1})$ satisfying $P_{k+1}\preceq\hat{P}_{k}$. For those pairs that do not satisfy $P_{k+1}\preceq\hat{P}_{k}$, we generalize (6) as:

$\mathcal{D}_{\text{info}}(k)=\min_{Q_{k+1}\succeq 0}\;\frac{1}{2}\log_{2}\det\hat{P}_{k}-\frac{1}{2}\log_{2}\det Q_{k+1}$
    s.t. $Q_{k+1}\preceq P_{k+1},\;\;Q_{k+1}\preceq\hat{P}_{k}.$    (7)

Notice that (7) takes a non-negative value for any given transition from an origin $(x_{k},P_{k})$ to a destination $(x_{k+1},P_{k+1})$. However, (7) is an implicit function involving a convex optimization problem (more precisely, a max-det problem [23]). To see why (7) is an appropriate generalization of (6), consider a two-step procedure $\hat{P}_{k}\rightarrow Q_{k+1}\rightarrow P_{k+1}$ to update the prior covariance $\hat{P}_{k}$ to the posterior covariance $P_{k+1}$. In the first step, the uncertainty is "reduced" from $\hat{P}_{k}$ to a $Q_{k+1}$ satisfying both $Q_{k+1}\preceq\hat{P}_{k}$ and $Q_{k+1}\preceq P_{k+1}$. The associated information gain (the amount of telemetry data) is $\frac{1}{2}\log_{2}\det\hat{P}_{k}-\frac{1}{2}\log_{2}\det Q_{k+1}$. In the second step, the covariance $Q_{k+1}$ is "increased" to $P_{k+1}(\succeq Q_{k+1})$. This step incurs no information cost, since the location uncertainty can be increased simply by "deteriorating" the prior knowledge. The max-det problem (7) can then be interpreted as finding the optimal intermediate step $Q_{k+1}$ that minimizes the information gain in the first step.

II-C Total Cost

The cost to steer a random state variable $\bm{x}_{k}\sim\mathcal{N}(x_{k},P_{k})$ to $\bm{x}_{k+1}\sim\mathcal{N}(x_{k+1},P_{k+1})$ is a weighted sum of $\mathcal{D}_{\text{cont}}(k)$ and $\mathcal{D}_{\text{info}}(k)$. Introducing a weight $\alpha>0$, the total RI cost is:

$\mathcal{D}(x_{k},x_{k+1},P_{k},P_{k+1}):=\mathcal{D}_{\text{cont}}(k)+\alpha\,\mathcal{D}_{\text{info}}(k)$
    $=\min_{Q_{k+1}\succeq 0}\;\|x_{k+1}-x_{k}\|+\frac{\alpha}{2}\left[\log_{2}\det\hat{P}_{k}-\log_{2}\det Q_{k+1}\right]$
    s.t. $Q_{k+1}\preceq P_{k+1},\;\;Q_{k+1}\preceq\hat{P}_{k}.$    (8)

By increasing $\alpha$, more weight is placed on the amount of information that must be gained relative to the distance traversed. Note that the information cost $\mathcal{D}_{\text{info}}$ is asymmetric: transitioning $(x_{1},P_{1})\rightarrow(x_{2},P_{2})$ does not in general incur the same cost as $(x_{2},P_{2})\rightarrow(x_{1},P_{1})$.
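In the scalar case $d=1$, the max-det problem in (7)-(8) has the closed-form minimizer $Q_{k+1}=\min(P_{k+1},\hat{P}_{k})$, so the total RI cost can be sketched in a few lines of Python. All numerical values below are illustrative assumptions:

```python
import math

def ri_cost(x0, P0, x1, P1, W, alpha):
    """Scalar (d = 1) RI cost (8): travel distance plus weighted
    information gain. In one dimension the max-det minimizer of (7)
    is simply Q = min(P1, P_hat)."""
    dist = abs(x1 - x0)                    # control cost (5)
    P_hat = P0 + dist * W                  # propagated prior covariance
    Q = min(P1, P_hat)                     # optimal intermediate covariance
    info = 0.5 * (math.log2(P_hat) - math.log2(Q))   # information cost (7)
    return dist + alpha * info

# Moving one unit with W = 1 doubles the covariance (P_hat = 2);
# shrinking it back to 1 then costs half a bit of information.
cost = ri_cost(0.0, 1.0, 1.0, 1.0, W=1.0, alpha=1.0)   # 1.0 + 0.5 = 1.5
```

If the destination covariance is larger than the propagated prior (here $P_{1}=4\succeq\hat{P}=2$), the information term vanishes and only the travel distance remains, matching the non-negativity of (7).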

III Problem Formulation

Having introduced the RI cost function (8), it is now appropriate to introduce the notion of path length. Let $\gamma:[0,T]\rightarrow\mathbb{R}^{d}\times\mathbb{S}_{++}^{d}$, $\gamma(t)=(x(t),P(t))$, be a path. The RI length of a path $\gamma$ is defined as:

$c(\gamma):=\sup_{\mathcal{P}}\sum_{k=0}^{N-1}\mathcal{D}\left(x(t_{k}),x(t_{k+1}),P(t_{k}),P(t_{k+1})\right),$

where the supremum is taken over the space of partitions $\mathcal{P}$ of $[0,T]$. If $\gamma(t)$ is differentiable and $W\succeq\frac{d}{dt}P(t)$ for all $t\in[0,T]$, then it can be shown that:

$c(\gamma)=\int_{0}^{T}\left[\left\|\frac{d}{dt}x(t)\right\|+\frac{\alpha}{2}\,\mathrm{Tr}\!\left(\left(W-\frac{d}{dt}P(t)\right)P^{-1}(t)\right)\right]dt.$
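As a numerical sanity check on this integral (a sketch under assumed values, using the natural-log convention for the continuous-time information rate), consider a one-dimensional unit-speed path with linearly growing covariance $P(t)=P_{0}+\beta t$, $\beta\leq W$. A midpoint Riemann sum of the integrand should match the closed-form antiderivative:

```python
import math

# Riemann-sum check of the continuous path-length formula for a 1-D,
# unit-speed path with linearly growing covariance P(t) = P0 + beta*t,
# beta <= W. All numbers are illustrative assumptions.
T, W, P0, beta, alpha = 2.0, 1.0, 1.0, 0.5, 1.0
N = 100000
dt = T / N

numeric = 0.0
for k in range(N):
    t = (k + 0.5) * dt                    # midpoint rule
    P = P0 + beta * t                     # P(t), with dP/dt = beta
    numeric += (1.0 + 0.5 * alpha * (W - beta) / P) * dt

# Antiderivative of the integrand:
# c = T + (alpha/2) * ((W - beta)/beta) * ln((P0 + beta*T)/P0)
closed_form = T + 0.5 * alpha * (W - beta) / beta * math.log((P0 + beta * T) / P0)
```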

III-A Topology on the path space

In this subsection, we introduce a topology for the space of paths $\gamma:[0,T]\rightarrow\mathbb{R}^{d}\times\mathbb{S}_{++}^{d}$, which is necessary to discuss the continuity of $c(\gamma)$. The space of all such paths can be thought of as a subset (a convex cone) of the space of generalized paths $\gamma:[0,T]\rightarrow\mathbb{R}^{d}\times\mathbb{S}^{d}$. The space of generalized paths is a vector space on which addition and scalar multiplication are defined as $(\gamma_{1}+\gamma_{2})(t)=(x_{1}(t)+x_{2}(t),P_{1}(t)+P_{2}(t))$ and $(\alpha\gamma)(t)=(\alpha x(t),\alpha P(t))$ for $\alpha\in\mathbb{R}$, respectively. Given a partition $\mathcal{P}=(0=t_{0}<t_{1}<\cdots<t_{N}=T)$, the variation $V(\gamma;\mathcal{P})$ of a generalized path $\gamma$ with respect to the choice of $\mathcal{P}$ is given by:

$V(\gamma;\mathcal{P}):=\|x(0)\|+\bar{\sigma}(P(0))+\sum_{k=0}^{N-1}\left[\|\Delta x_{k}\|+\bar{\sigma}(\Delta P_{k})\right],$

where $\Delta x_{k}=x(t_{k+1})-x(t_{k})$ and $\Delta P_{k}=P(t_{k+1})-P(t_{k})$. Using the above definition of the variation of a path, the total variation of a generalized path $\gamma$ is the supremum of the variation over all partitions $\mathcal{P}$:

$|\gamma|_{\text{TV}}:=\sup_{\mathcal{P}}V(\gamma;\mathcal{P}).$

Notice that $|\cdot|_{\text{TV}}$ defines a norm on the space of generalized paths. The following relationship holds between $|\gamma|_{\text{TV}}$ and

$\|\gamma\|_{\infty}:=\sup_{t\in[0,T]}\left[\|x(t)\|+\bar{\sigma}(P(t))\right].$
Lemma 1

[24, Lemma 13.2] For a given path $\gamma$, the following inequality holds:

$\|\gamma\|_{\infty}\leq|\gamma|_{\text{TV}}.$
Proof:

See Appendix A. ∎

In what follows, we endow the space of generalized paths $\gamma:[0,T]\rightarrow\mathbb{R}^{d}\times\mathbb{S}^{d}$ with the topology of the total variation metric $|\gamma_{1}-\gamma_{2}|_{\text{TV}}$, which is then inherited by the space of paths $\gamma:[0,T]\rightarrow\mathbb{R}^{d}\times\mathbb{S}_{++}^{d}$. We denote by $\mathcal{BV}[0,T]$ the space of paths $\gamma:[0,T]\rightarrow\mathbb{R}^{d}\times\mathbb{S}_{++}^{d}$ such that $|\gamma|_{\text{TV}}<\infty$. In the next subsection, we discuss the continuity of the RI path cost $c(\cdot)$ on the space $\mathcal{BV}[0,T]$.

III-B Continuity of RI Cost Function

The continuity of the RI path cost function plays a critical role in determining the theoretical guarantees we can provide when sampling-based algorithms are used to find the shortest RI path. Specifically, the asymptotic optimality (convergence to the minimum-cost path as the number of nodes is increased) of the RRT* algorithm [12], the main numerical method used in this paper, requires continuity of the path cost function. Showing that the RI cost function (8) is continuous requires additional derivation, which is given in Theorem 1.

Theorem 1

When $d=1$, the path cost function $c(\cdot)$ is continuous in the following sense: for every $\gamma\in\mathcal{BV}[0,T]$ with $\gamma:[0,T]\rightarrow\mathbb{R}^{1}\times\mathbb{S}_{2\epsilon}^{1}$, and for every $\epsilon_{0}>0$, there exists $\delta>0$ such that

$|\gamma^{\prime}-\gamma|_{\text{TV}}<\delta\quad\Rightarrow\quad|c(\gamma^{\prime})-c(\gamma)|<\epsilon_{0}.$
Proof:

See Appendix B. ∎

Before discussing the modifications required to implement the RRT* algorithm with the RI cost in Section IV, we first characterize the shortest RI path in obstacle-free space, and then formally define the shortest RI path problem in obstacle-filled spaces in the following subsections.

III-C Shortest Path in Obstacle-Free Space

In obstacle-free space, it can be shown that the optimal path cost between $z_{1}=(x_{1},P_{1})$ and $z_{2}=(x_{2},P_{2})$ is equal to $\mathcal{D}(z_{1},z_{2})$. In other words, the triangle inequality $\mathcal{D}(z_{1},z_{2})\leq\mathcal{D}(z_{1},z_{\text{int}})+\mathcal{D}(z_{\text{int}},z_{2})$ holds for any intermediate state $z_{\text{int}}$. This means it is optimal for the robot to follow the direct path from $x_{1}$ to $x_{2}$ without sensing, and then make a measurement at $x_{2}$ to shrink the uncertainty from $\hat{P}_{1}$ to $P_{2}$. In what follows, we call such a motion plan the "move-and-sense" strategy. The optimality of the move-and-sense path for one-dimensional geometric space is shown in Appendix C. We confirm this optimality by simulation in Section V-A, where the move-and-sense path is the wedge-shaped path depicted in Fig. 3 (a).
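The triangle inequality can be checked numerically in the scalar case. The sketch below (with assumed, illustrative numbers) compares the direct move-and-sense cost $\mathcal{D}(z_{1},z_{2})$ against a detour that senses at an intermediate state $z_{\text{int}}$:

```python
import math

def ri_cost(z_from, z_to, W=1.0, alpha=1.0):
    """Scalar RI cost (8); in d = 1 the max-det minimizer of (7) is
    Q = min(P_to, P_hat)."""
    (x0, P0), (x1, P1) = z_from, z_to
    dist = abs(x1 - x0)
    P_hat = P0 + dist * W
    Q = min(P1, P_hat)
    return dist + alpha * 0.5 * (math.log2(P_hat) - math.log2(Q))

z1, z2 = (0.0, 1.0), (2.0, 1.0)
z_int = (1.0, 1.5)                  # detour state with a mid-way measurement
direct = ri_cost(z1, z2)            # move the whole way, sense once at the end
detour = ri_cost(z1, z_int) + ri_cost(z_int, z2)
# direct <= detour: an intermediate measurement can only add information cost
```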

III-D Shortest Path Formulation

The introduction of obstacles into the path space makes the path-planning problem non-trivial. Let $X_{\text{obs}}\subset\mathbb{R}^{d}$ be a closed set of spatial points representing obstacles. The initial configuration of the robot is defined as $z_{\text{init}}=(x_{0},P_{0})\in\mathbb{R}^{d}\times\mathbb{S}_{++}^{d}$, while $\mathcal{Z}_{\text{target}}\subset\mathbb{R}^{d}\times\mathbb{S}_{++}^{d}$ is a given closed set representing the target region the robot desires to reach. Given a confidence level parameter $\chi^{2}>0$, the shortest RI path problem can be formulated as:

$\min_{\gamma\in\mathcal{BV}[0,T]}\;c(\gamma)$
    s.t. $\gamma(0)=z_{\text{init}},\;\;\gamma(T)\in\mathcal{Z}_{\text{target}},$
    $(x(t)-x_{\text{obs}})^{\top}P^{-1}(t)(x(t)-x_{\text{obs}})\geq\chi^{2}\quad\forall t\in[0,T],\;\forall x_{\text{obs}}\in X_{\text{obs}}.$    (9)

The $\chi^{2}$ term in the constraints of (9) provides a confidence bound on the probability that a robot at position $x(t)$ is not in contact with an obstacle point $x_{\text{obs}}$.
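The obstacle constraint in (9) is straightforward to evaluate pointwise. A minimal Python sketch (with hypothetical numbers; in practice the check must be enforced over all of $X_{\text{obs}}$, e.g., over discretized obstacle boundaries):

```python
def is_safe(x, P_inv, x_obs, chi2):
    """Evaluate the chance constraint of (9) for one obstacle point:
    (x - x_obs)^T P^{-1} (x - x_obs) >= chi2. Vectors are plain lists
    and P_inv is the inverse covariance given as a list of rows."""
    d = [xi - oi for xi, oi in zip(x, x_obs)]
    Pd = [sum(row[j] * d[j] for j in range(len(d))) for row in P_inv]
    quad = sum(di * pdi for di, pdi in zip(d, Pd))
    return quad >= chi2

# With identity covariance the constraint is just squared distance >= chi2
# (chi2 = 4.61 corresponds to a 90% region in two dimensions).
safe = is_safe([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [3.0, 0.0], chi2=4.61)  # True
```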

IV RI-RRT* Algorithm

IV-A RRT*

The RRT algorithm [25] constructs a tree of nodes (state realizations) through random sampling of the feasible state-space and then connects these nodes with edges (tree branches). A user-defined cost is utilized to quantify the length of the edges, which are in turn summed to form path lengths. Each new node is connected via a permanent edge to the existing node that provides the shortest path between the new node and the initial node of the tree. Although the RRT algorithm is known to be probabilistically complete (the algorithm finds a feasible path if one exists), it does not achieve asymptotic optimality (the path cost does not converge to the optimal one as the number of nodes is increased) [12]. The RRT* algorithm [12] attains asymptotic optimality by including an additional "re-wiring" step that re-evaluates whether the path length of each node can be reduced via a connection to the newly created node. This paper utilizes RRT* as a numerical approach to the shortest path problem (9).

IV-B Algorithm

Provided below is the Rationally Inattentive RRT* (RI-RRT*) algorithm for finding a solution to (9). Like the original RRT* algorithm, the RI-RRT* algorithm constructs a graph $G=(Z,E)$ of state nodes and edges in spaces with or without obstacles.

1   $z_{1}\leftarrow z_{\text{init}}$; $E\leftarrow\emptyset$; $G^{\prime}\leftarrow(z_{1},E)$;
2   for $i=2:N$ do
3       $G\leftarrow G^{\prime}$;
4       $z_{i}=(x_{i},P_{i})\leftarrow\textsc{Generate}(i)$;
5       $(Z^{\prime},E^{\prime})\leftarrow(Z,E)$;
6       $z_{\text{near}}\leftarrow\textsc{Nearest}(Z^{\prime},z_{i})$;
7       $z_{\text{new}}\leftarrow\textsc{Scale}(z_{\text{near}},z_{i},ED_{\text{min}})$;
8       if $\textsc{ObsCheck}(z_{\text{near}},z_{\text{new}})=False$ then
9           $Z^{\prime}\leftarrow Z^{\prime}\cup z_{\text{new}}$;
10          $Z_{\text{nbors}}\leftarrow\textsc{Neighbor}(Z,z_{\text{new}},ED_{\text{nbors}})$;
11          $Path_{z_{\text{new}}}\leftarrow realmax$;
12          for $z_{j}\in Z_{\text{nbors}}$ do
13              if $\textsc{ObsCheck}(z_{j},z_{\text{new}})=False$ then
14                  $Path_{z_{\text{new}},j}\leftarrow Path_{z_{j}}+\mathcal{D}(z_{j},z_{\text{new}})$;
15                  if $Path_{z_{\text{new}},j}<Path_{z_{\text{new}}}$ then
16                      $Path_{z_{\text{new}}}\leftarrow Path_{z_{\text{new}},j}$;
17                      $z_{\text{nbor}}^{*}\leftarrow z_{j}$;
18          $E^{\prime}\leftarrow\left[z_{\text{nbor}}^{*},z_{\text{new}}\right]\cup E^{\prime}$;
19          for $z_{j}\in Z_{\text{nbors}}\backslash z_{\text{nbor}}^{*}$ do
20              if $\textsc{ObsCheck}(z_{\text{new}},z_{j})=False$ then
21                  $Path_{z_{j},\text{rewire}}\leftarrow Path_{z_{\text{new}}}+\mathcal{D}(z_{\text{new}},z_{j})$;
22                  if $Path_{z_{j},\text{rewire}}<Path_{z_{j}}$ then
23                      $E^{\prime}\leftarrow E^{\prime}\cup\left[z_{\text{new}},z_{j}\right]\backslash\left[z_{j,\text{parent}},z_{j}\right]$;
24                      $z_{j,\text{parent}}\leftarrow z_{\text{new}}$;
25                      $\textsc{UpdateDes}(G,z_{j})$;
26      $G^{\prime}\leftarrow(Z^{\prime},E^{\prime})$
Algorithm 1 RI-RRT* Algorithm

In Algorithm 1, the $\textsc{Generate}(i)$ function creates a new point by randomly sampling a spatial location $x\in\mathbb{R}^{d}$ and a covariance $P\in\mathbb{S}_{++}^{d}$. Notice that for a $d$-dimensional configuration space, the corresponding uncertain configuration space $\mathbb{R}^{d}\times\mathbb{S}^{d}_{++}$ has $d+\frac{1}{2}d(d+1)$ dimensions from which the samples are generated. The $\textsc{Nearest}(Z,z_{i})$ function finds the nearest existing node $z_{\text{near}}$ in $Z$ to the newly generated state $z_{i}=(x_{i},P_{i})$ in the metric $\hat{\mathcal{D}}(z,z^{\prime}):=\|x-x^{\prime}\|+\|P-P^{\prime}\|_{F}$. Using the metric $\hat{\mathcal{D}}(z,z^{\prime})$, the $\textsc{Scale}(z_{\text{near}},z_{i},ED_{\text{min}})$ function linearly shifts the generated point $z_{i}$ to a new location as:

$z_{\text{new}}=\begin{cases}z_{\text{near}}+\frac{ED_{\text{min}}}{\hat{\mathcal{D}}(z_{i},z_{\text{near}})}\left(z_{i}-z_{\text{near}}\right)&\text{if }\hat{\mathcal{D}}(z_{i},z_{\text{near}})>ED_{\text{min}},\\ z_{i}&\text{otherwise,}\end{cases}$

where EDminED_{\text{min}} is a user-defined constant. In addition to generating znewz_{\text{new}}, the Scale function also ensures that its χ2\chi^{2} covariance region does not interfere with any obstacles.
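A sketch of the hybrid metric $\hat{\mathcal{D}}$ and the Scale steering step in the scalar case, where a node $z=(x,P)$ is a pair of floats and $\|P-P^{\prime}\|_{F}$ reduces to $|P-P^{\prime}|$ (the obstacle check performed by the full function is omitted here):

```python
def d_hat(z, zp):
    """Hybrid metric D^(z, z') = ||x - x'|| + ||P - P'||_F, reduced to
    absolute differences in the scalar case."""
    return abs(z[0] - zp[0]) + abs(z[1] - zp[1])

def scale(z_near, z_i, ed_min):
    """Steer the sampled node z_i toward z_near so that the returned
    node is at most ed_min away from z_near in the D^ metric."""
    dist = d_hat(z_near, z_i)
    if dist > ed_min:
        r = ed_min / dist
        return (z_near[0] + r * (z_i[0] - z_near[0]),
                z_near[1] + r * (z_i[1] - z_near[1]))
    return z_i

z_new = scale(z_near=(0.0, 1.0), z_i=(3.0, 1.0), ed_min=1.0)   # (1.0, 1.0)
```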

Refer to caption
Figure 2: Transition from state $z_{\text{near}}$ to $z_{\text{new}}$. The blue ellipses represent the propagation of covariance; the $\textsc{ObsCheck}(z_{\text{near}},z_{\text{new}})$ function in Algorithm 1 checks collisions between these propagated covariances (including $z_{\text{near}}$ and $z_{\text{new}}$) and obstacles. The measurement at $z_{\text{new}}=(x_{\text{new}},P_{\text{new}})$ shrinks the covariance.

The $\textsc{ObsCheck}(z_{\text{near}},z_{\text{new}})$ function ensures that the transition from state $z_{\text{near}}$ to $z_{\text{new}}$ does not intersect the obstacles. More precisely, we assume the transition $z_{\text{near}}\rightarrow z_{\text{new}}$ follows the move-and-sense path introduced in Section III-C, and the ObsCheck function returns $False$ if all states along the move-and-sense path, shown by the blue ellipses in Fig. 2, have $\chi^{2}$ covariance regions that do not interfere with obstacles.

The $\textsc{Neighbor}(Z,z_{\text{new}},ED_{\text{nbors}})$ function returns the subset of nodes $Z_{\text{nbors}}=\{z_{i}=(x_{i},P_{i})\in Z:\hat{\mathcal{D}}(z_{i},z_{\text{new}})\leq ED_{\text{nbors}}\}$. This set is then evaluated for the presence of obstacles via the ObsCheck function of the previous paragraph. Note that in this instance, the function evaluates obstacle interference along the continuous path of state-covariance pairs from $z_{j}$ to $z_{\text{new}}$. Lines 14-17 of Algorithm 1 connect the new node to the existing graph in an identical manner to RRT*, where $Path_{z}$ denotes the cost of the path from $z_{\text{init}}$ to node $z$ through the edges of $G$. Line 18 creates a new edge between the new node and the existing node from the neighbor group $Z_{\text{nbors}}$ that results in the minimum $Path_{z_{\text{new}}}$. The calculation of the RI path cost in Line 14 utilizes (8).
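The parent-selection step (lines 12-18 of Algorithm 1) can be sketched as follows, with a placeholder edge-cost function and hypothetical node data standing in for the RI cost (8) and the directional ObsCheck:

```python
def choose_parent(z_new, nbors, path_cost, edge_cost, obs_free):
    """Lines 12-18 of Algorithm 1: among the collision-free neighbors,
    pick the parent that minimizes Path_zj + D(z_j, z_new)."""
    best_parent, best_cost = None, float("inf")   # Path_znew <- realmax
    for z_j in nbors:
        if not obs_free(z_j, z_new):              # directional ObsCheck
            continue
        c = path_cost[z_j] + edge_cost(z_j, z_new)
        if c < best_cost:
            best_cost, best_parent = c, z_j
    return best_parent, best_cost

# Hypothetical 1-D example: nodes keyed by position, Euclidean edge cost.
costs = {0.0: 0.0, 1.0: 0.8, 2.5: 2.5}
parent, cost = choose_parent(
    2.0, [0.0, 1.0, 2.5], costs,
    edge_cost=lambda a, b: abs(b - a),
    obs_free=lambda a, b: True)
# parent 1.0, cost 0.8 + 1.0 = 1.8
```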

Lines 19-24 are the tree re-wiring steps of Algorithm 1. In line 20, the ObsCheck function is called again because the move-and-sense path is direction-dependent: $\textsc{ObsCheck}(z_{j},z_{\text{new}})=False$ does not necessarily imply $\textsc{ObsCheck}(z_{\text{new}},z_{j})=False$. Finally, for each rewired node $z_{j}$, its cost $Path_{z_{j}}$ and the costs of its descendants are updated via the $\textsc{UpdateDes}(G,z_{j})$ function in line 25.

To increase the computational efficiency of the RI-RRT* algorithm, we deploy a branch-and-bound technique as detailed in [26]. For a given tree $G$, let $z_{\text{min}}$ be the lowest-cost node of $G$ within $\mathcal{Z}_{\text{target}}$. As discussed in Section III-C, $\mathcal{D}(z,z_{\text{goal}})$ is a lower bound on the cost of transitioning from $z$ to $z_{\text{goal}}$. The branch-and-bound procedure periodically deletes the nodes $Z^{\prime\prime}=\{z\in Z:Path_{z}+\mathcal{D}(z,z_{\text{goal}})\geq Path_{z_{\text{min}}}\}$. This elimination of non-optimal nodes speeds up the RI-RRT* algorithm.
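The pruning rule can be written directly from the set definition of $Z^{\prime\prime}$; the lower-bound function and numbers below are hypothetical placeholders:

```python
def prune(nodes, path_cost, lower_bound, best_goal_cost):
    """Branch-and-bound step: delete every node whose cost-to-come plus
    an admissible lower bound on its cost-to-go matches or exceeds the
    best known cost of reaching the target region."""
    return [z for z in nodes
            if path_cost[z] + lower_bound(z) < best_goal_cost]

# Hypothetical 1-D tree with the goal at x = 5; the obstacle-free RI
# cost D(z, z_goal) is stood in for by the Euclidean distance |5 - x|.
path_cost = {0.0: 0.0, 2.0: 2.0, 4.0: 6.0}
kept = prune([0.0, 2.0, 4.0], path_cost, lambda z: abs(5.0 - z), best_goal_cost=6.0)
# 4.0 is pruned: 6.0 + 1.0 >= 6.0
```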

IV-C Properties of RI-RRT*

The question regarding the asymptotic optimality of RI-RRT* naturally arises. Recall that the proof of the asymptotic optimality of the RRT* algorithm [27] is founded on four main assumptions:

  1. the cost function is additive,

  2. the cost function is monotonic,

  3. there exists a finite distance between all points on the optimal path and the obstacle space,

  4. the cost function is Lipschitz continuous, either in the topology of the total variation metric [27] or the supremum norm metric [12].

The first three assumptions are readily verified for the RI cost (8). However, Theorem 1 does not suffice to guarantee that the RI cost meets the fourth condition for $d\geq 2$. For this reason, the asymptotic optimality of the RI-RRT* algorithm cannot currently be guaranteed, although the numerical simulations of Section V show that the proposed algorithm has merit in rationally inattentive path-planning.

V Simulation Results

V-A One-Dimensional Simulation

Refer to caption
(a) A generated path
Refer to caption
(b) Path cost for 100 runs
Figure 3: Results of the RI-RRT* algorithm with $\alpha=1$ and $W=0.75$ applied to the one-dimensional joint movement-perception problem. (a): The blue line illustrates the path generated with 10,000 nodes, which almost converges to the known optimal path depicted as the red curve. (b): The total path cost for 100 runs of 10,000 nodes each. Gray lines plot the path cost for each run, while their average is shown as the red line. The path costs of all 100 runs approach the optimal path cost (dashed blue line).

The first study is the case of a robot which is allowed to travel at a constant velocity in a one-dimensional geometric space from a predetermined initial position and covariance z_{0}=(x_{0},P_{0}), specified by the red dot in Fig. 3 (a). The robot has the goal of reaching some final state within the blue box representing a goal region, which is a subset of the reachable space. Note that the goal region contains acceptable bounds on both location and uncertainty. Although, in the one-dimensional setting, the strategy which minimizes the control cost is obviously the one that moves directly toward the target region, we utilize the RI-RRT* algorithm to solve the non-trivial measurement scheduling problem.

In Fig. 3 (a), the blue curve represents the path generated by the RI-RRT* algorithm with 10,000 nodes, which is sufficiently close to the shortest path obtainable via the RI-distance, depicted as the red curve. These wedge-shaped optimal paths are created by the “move-and-sense” strategy integrated in the RI-cost, in which a section of covariance propagation is followed by an instantaneous reduction of covariance, as discussed in Section III-C. For example, if the robot were an autonomous ground vehicle with GPS capabilities, then this path signifies the robot driving the total distance without any GPS updates, followed by a reduction of its spatial uncertainty with a single update once the goal region is reached. The minimum path cost at the end of each iteration of the for-loop in Algorithm 1 is depicted in Fig. 3 (b), where the red curve represents the average of 100 independent simulations. The path cost of each simulation approaches the optimal cost.
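The scalar move-and-sense cost can be sketched as follows (a hedged reconstruction, not the paper's implementation: ri_edge_cost and ri_path_cost are illustrative names, with the edge cost taken as Euclidean travel plus the bits needed to shrink the propagated covariance; \alpha=1 and W=0.75 match the values used in Fig. 3):

```python
import math

def ri_edge_cost(x0, x1, P0, P1, alpha=1.0, W=0.75):
    # Euclidean travel plus the information (in bits) required to reduce
    # the covariance propagated during open-loop travel, P0 + W*|x1 - x0|,
    # down to the covariance P1 demanded at the next waypoint.
    dist = abs(x1 - x0)
    P_hat = P0 + W * dist
    info = max(0.0, 0.5 * (math.log2(P_hat) - math.log2(P1)))
    return dist + alpha * info

def ri_path_cost(waypoints, alpha=1.0, W=0.75):
    # waypoints: list of (x, P) pairs along the path
    return sum(ri_edge_cost(x0, x1, P0, P1, alpha, W)
               for (x0, P0), (x1, P1) in zip(waypoints, waypoints[1:]))

# One "move-and-sense" leg: drive from x=0 to x=2 letting the covariance
# grow from 1.0 to 1.0 + 0.75*2 = 2.5, then one update shrinks it to 0.5.
cost = ri_path_cost([(0.0, 1.0), (2.0, 0.5)])
```

Under this cost, splitting the leg into several sensing stops can only keep or increase the total, which is why the wedge-shaped single-update paths emerge.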

V-B Two-Dimensional Asymmetric Simulation

Figure 4: Results of the RI-RRT* algorithm with 10,000 nodes in the two-dimensional space containing roughly two paths, A and B, separated by a diagonal wall. The black line is the shortest path with the associated covariance ellipses. The blue ellipses illustrate the propagation of covariance between nodes. The simulation was completed with W=10^{-3}I and \chi^{2} covariance ellipses representing 90% certainty regions. The boundaries of the plots are considered as obstacles.

The asymmetric characteristic of the RI-cost is demonstrated via a simulation in a two-dimensional configuration space with a diagonal wall, as seen in Fig. 4. The initial position and covariance of the robot are depicted as a red dot, while the target region is illustrated as the black rectangle in the upper-right corner.

The path is generated by the RI-RRT* algorithm by sampling 10,000 nodes. The corresponding sampled covariance ellipses are shown in black, while the blue ellipses represent covariance propagation. As shown in Fig. 4, there are two options: path A requires the robot to move into a funnel-shaped corridor, while path B moves out of a funnel. In this setting, the RI-cost prefers path B even though both A and B have the same Euclidean distance. This asymmetric behavior results from the fact that, as the robot approaches the goal region, path B requires a less severe uncertainty reduction than path A. Similarly, by exchanging the start and goal positions, the RI-cost prefers path A over path B, thus displaying the directional dependency of our efficient sensing strategy.
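The directional dependency described above can be reproduced numerically with the scalar RI-distance (a one-dimensional stand-in for the planar cost; \alpha = W = 1 and all states are illustrative, not taken from Fig. 4):

```python
import math

def ri_dist(x0, P0, x1, P1, alpha=1.0, W=1.0):
    # Scalar RI-distance between states z0 = (x0, P0) and z1 = (x1, P1):
    # Euclidean travel plus the information needed to shrink the
    # propagated covariance P0 + W*|x1 - x0| down to P1.
    dist = abs(x1 - x0)
    info = max(0.0, 0.5 * (math.log2(P0 + W * dist) - math.log2(P1)))
    return dist + alpha * info

# Traveling toward a tight covariance requirement (the analogue of path A,
# which ends inside a narrow funnel) costs more than the reverse direction
# (the analogue of path B), even though the Euclidean length is identical.
forward = ri_dist(0.0, 1.0, 2.0, 0.5)   # must end with small uncertainty
reverse = ri_dist(2.0, 0.5, 0.0, 1.0)   # may end with large uncertainty
```

Since the Euclidean term is the same in both directions, the entire gap comes from the information term, mirroring the A-versus-B preference in Fig. 4.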

V-C Two-Dimensional Simulation with Multiple Obstacles

Figure 5: Results of simulation with \alpha=0, 0.1, and 0.3 ((a), (b), and (c), respectively) under the existence of multiple obstacles. The simulation was completed with W=10^{-3}I and \chi^{2} covariance ellipses representing 90% certainty regions. The boundaries of the plots are considered as obstacles.

In a final demonstration, the RI-RRT* algorithm with \alpha=0, 0.1, and 0.3 is implemented in a two-dimensional configuration space containing multiple obstacles in order to illustrate the effect of varying the information cost. All three paths in the panels of Fig. 5 are generated from 4,000 nodes.

As seen in Fig. 5 (a), when \alpha=0 the algorithm simply finds the path with the shortest Euclidean distance, even if that path requires frequent sensing actions. In contrast, the RI-RRT* algorithm with \alpha=0.3 avoids the constrained pathway and thus requires fewer sensor actuations to prevent obstacle collisions. As a result, the algorithm deviates from the shortest Euclidean path and allows the covariance to propagate safely, as seen in Fig. 5 (c). A moderate path, illustrated in Fig. 5 (b), can also be obtained by choosing \alpha=0.1.
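The role of \alpha can be illustrated with two hypothetical candidate paths, a short corridor demanding heavy sensing and a longer open route demanding little; the lengths and information amounts below are invented for illustration and are not measured from Fig. 5:

```python
# Hypothetical candidates: (Euclidean length, perception demand in bits).
short_len, short_info = 10.0, 20.0   # tight corridor, frequent sensing
open_len,  open_info  = 12.0,  2.0   # open route, covariance may grow

def preferred(alpha):
    # The RI cost weights perception by alpha; the cheaper path wins.
    cost_short = short_len + alpha * short_info
    cost_open  = open_len  + alpha * open_info
    return "short" if cost_short <= cost_open else "open"

choices = {a: preferred(a) for a in (0.0, 0.1, 0.3)}
# With these numbers the preference flips at alpha = 2/18, so alpha = 0
# picks the corridor while alpha = 0.3 picks the open route.
```

This reproduces, in miniature, the qualitative switch between panels (a) and (c) of Fig. 5: increasing \alpha trades extra distance for fewer sensing actions.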

VI Conclusion

In this work, a novel RI cost for use in path-planning algorithms was presented. The cost accounts for both the path distance traversed (efficiency) and the amount of information which must be perceived by the robot. Information gained from perception is important in that it allows the robot to navigate obstacle-filled environments with confidence that collisions will be avoided. By balancing path distance and perception costs, this method provides a simplicity-based path which can be tailored to mimic the results potentially generated by an expert human path-planner. Three numerical simulations were provided to demonstrate these results.

Currently, the preliminary version of the RI-RRT* algorithm is optimized for computational efficiency with the aid of a branch-and-bound technique. Utilizing other RRT* improvement methods, such as k-d trees, could further improve the computational speed of Algorithm 1 and should be considered in future work. In the same vein, it is important to quantify the impact that user-defined constants, such as the distance which determines which nodes are neighbors, have on the results of the RI-RRT* algorithm. The topic of path refinement should also be further explored, as RRT*-like algorithms converge only asymptotically.

It should be noted that once the RI-RRT* algorithm finds an initial feasible path, there exist iterative path-“smoothing” methods which do not require additional node sampling. In a theorized hybrid method, RI-RRT* is first utilized to find some initial path, which an iterative method then “smooths” toward the optimal path. Since the convergence of iterative methods often improves as the initial guess becomes more similar to the optimal path, a computational-efficiency trade-off determines the best time to switch between the two algorithms.

Appendix A Proof of Lemma 1

For every t\in[0,T], set the partition \mathcal{P}=(0,t,T). Then

\displaystyle\|x(t)\|+\bar{\sigma}(P(t))\leq\|x(0)\|+\bar{\sigma}(P(0))
\displaystyle\qquad+\|x(t)-x(0)\|+\bar{\sigma}(P(t)-P(0))
\displaystyle\leq\|x(0)\|+\bar{\sigma}(P(0))+\|x(t)-x(0)\|+\bar{\sigma}(P(t)-P(0))
\displaystyle\qquad+\|x(T)-x(t)\|+\bar{\sigma}(P(T)-P(t))
\displaystyle\leq V(\gamma,\mathcal{P})\leq|\gamma|_{\text{TV}}.

Appendix B Proof of Theorem 1

The proof is based on the following lemma:

Lemma 2

Assume d=1. For each pair (\epsilon,\delta) satisfying 0<\delta\leq\frac{\epsilon}{2}, there exists a constant L_{\epsilon} such that the inequality:

\begin{split}&|\mathcal{D}(x^{\prime}_{k},x^{\prime}_{k+1},P^{\prime}_{k},P^{\prime}_{k+1})-\mathcal{D}(x_{k},x_{k+1},P_{k},P_{k+1})|\\ &\qquad\leq L_{\epsilon}\Bigl{[}\left|(x^{\prime}_{k+1}-x_{k+1})-(x^{\prime}_{k}-x_{k})\right|\\ &\qquad\qquad\qquad+\left|(P^{\prime}_{k+1}-P_{k+1})-(P^{\prime}_{k}-P_{k})\right|\\ &\qquad\qquad\qquad+\delta\left|P_{k+1}-P_{k}\right|+\delta\left|x_{k+1}-x_{k}\right|\Bigr{]}\\ \end{split}

holds for all

\begin{split}&x_{k}^{\prime},x_{k+1}^{\prime},x_{k},x_{k+1}\in\mathbb{R}\;\;\text{and}\;\;P_{k}^{\prime},P_{k+1}^{\prime},P_{k},P_{k+1}\geq\epsilon\end{split}

such that

\begin{split}&\Delta x_{k}:=x_{k}^{\prime}-x_{k}\leq\delta,\;\;\Delta x_{k+1}:=x_{k+1}^{\prime}-x_{k+1}\leq\delta\\ &\Delta P_{k}:=P_{k}^{\prime}-P_{k}\leq\delta,\;\;\Delta P_{k+1}:=P_{k+1}^{\prime}-P_{k+1}\leq\delta.\end{split}
Proof:

For simplicity, we assume α=W=1\alpha=W=1, but the extension of the following proof to general cases is straightforward. In what follows, we write

\displaystyle\mathcal{D}_{\text{info}}(x_{k},x_{k+1},P_{k},P_{k+1})
\displaystyle:=\max\left\{0,\frac{1}{2}\log_{2}(P_{k}+\left|x_{k+1}-x_{k}\right|)-\frac{1}{2}\log_{2}(P_{k+1})\right\}.

We consider four different cases depending on whether \mathcal{D}_{\text{info}}(x_{k}^{\prime},x_{k+1}^{\prime},P_{k}^{\prime},P_{k+1}^{\prime}) and \mathcal{D}_{\text{info}}(x_{k},x_{k+1},P_{k},P_{k+1}) are positive or zero.

Case 1: First, we consider the case with

\displaystyle\mathcal{D}_{\text{info}}(x_{k}^{\prime},x_{k+1}^{\prime},P_{k}^{\prime},P_{k+1}^{\prime})>0\text{ and }
\displaystyle\mathcal{D}_{\text{info}}(x_{k},x_{k+1},P_{k},P_{k+1})>0.

In this case:

|𝒟(xk,xk+1,Pk,Pk+1)𝒟(xk,xk+1,Pk,Pk+1)|=||xk+1+Δxk+1xkΔxk||xk+1xk|+12log2(Pk+ΔPk+|xk+1+Δxk+1xkΔxk|)12log2(Pk+1+ΔPk+1)12log2(Pk+|xk+1xk|)+12log2(Pk+1)|||xk+1+Δxk+1xkΔxk||xk+1xk||+12|log2(1+ΔPkPk+|xk+1+Δxk+1xkΔxk|Pk)log2(1+|xk+1xk|Pk+Pk+|xk+1xk|PkPk+1ΔPk+1)|.\small\begin{split}&\left|\mathcal{D}(x^{\prime}_{k},x^{\prime}_{k+1},P^{\prime}_{k},P^{\prime}_{k+1})-\mathcal{D}(x_{k},x_{k+1},P_{k},P_{k+1})\right|\\ &=\bigg{|}\left|x_{k+1}+\Delta x_{k+1}-x_{k}-\Delta x_{k}\right|-\left|x_{k+1}-x_{k}\right|\\ &\quad+\left.\frac{1}{2}\log_{2}(P_{k}+\Delta P_{k}+\left|x_{k+1}+\Delta x_{k+1}-x_{k}-\Delta x_{k}\right|)\right.\\ &\quad\left.-\frac{1}{2}\log_{2}(P_{k+1}+\Delta P_{k+1})-\frac{1}{2}\log_{2}(P_{k}+\left|x_{k+1}-x_{k}\right|)\right.\\ &\quad+\frac{1}{2}\log_{2}(P_{k+1})\bigg{|}\\ &\leq\bigg{|}\left|x_{k+1}+\Delta x_{k+1}-x_{k}-\Delta x_{k}\right|-\left|x_{k+1}-x_{k}\right|\bigg{|}\\ &\quad+\frac{1}{2}\left|\log_{2}\left(1+\frac{\Delta P_{k}}{P_{k}}+\frac{\left|x_{k+1}+\Delta x_{k+1}-x_{k}-\Delta x_{k}\right|}{P_{k}}\right)\right.\\ &\left.\quad-\log_{2}\left(1+\frac{\left|x_{k+1}-x_{k}\right|}{P_{k}}+\frac{P_{k}+\left|x_{k+1}-x_{k}\right|}{P_{k}P_{k+1}}\Delta P_{k+1}\right)\right|.\end{split} (10)

Using the fact that \left||a+b|-|a|\right|\leq\left|b\right| for a,b\in\mathbb{R}, we have:

\begin{split}&\left|\left|x_{k+1}+\Delta x_{k+1}-x_{k}-\Delta x_{k}\right|-\left|x_{k+1}-x_{k}\right|\right|\\ &\qquad\leq\left|\Delta x_{k+1}-\Delta x_{k}\right|.\end{split}

Noticing that the arguments of the logarithmic terms in (10) are \geq\frac{1}{2}, and using the fact that \left|\log_{2}(a)-\log_{2}(b)\right|\leq 2\log_{2}(e)|a-b|,\;\forall a,b\geq\frac{1}{2}, we have:

12|log2(1+ΔPkPk+|xk+1+Δxk+1xkΔxk|Pk)log2(1+|xk+1xk|Pk+Pk+|xk+1xk|PkPk+1ΔPk+1)|log2(e)||xk+1+Δxk+1xkΔxk|Pk|xk+1xk|Pk+ΔPkPkPk+|xk+1xk|PkPk+1ΔPk+1|log2(e)Pk||xk+1+Δxk+1xkΔxk||xk+1xk||+log2(e)|ΔPkΔPk+1Pk+Pk+1Pk|xk+1xk|PkPk+1ΔPk+1|log2(e)Pk|Δxk+1Δxk|+log2(e)Pk|ΔPk+1ΔPk|+log2(e)|ΔPk+1|PkPk+1|Pk+1Pk|xk+1xk||log2(e)Pk|Δxk+1Δxk|+log2(e)Pk|ΔPk+1ΔPk|+log2(e)|ΔPk+1|PkPk+1|xk+1xk|+log2(e)|ΔPk+1|PkPk+1|Pk+1Pk|log2(e)ϵ|Δxk+1Δxk|+log2(e)ϵ|ΔPk+1ΔPk|+log2(e)δϵ2|xk+1xk|+log2(e)δϵ2|Pk+1Pk|\begin{split}&\frac{1}{2}\left|\log_{2}\left(1+\frac{\Delta P_{k}}{P_{k}}+\frac{\left|x_{k+1}+\Delta x_{k+1}-x_{k}-\Delta x_{k}\right|}{P_{k}}\right)\right.\\ &\left.\quad-\log_{2}\left(1+\frac{\left|x_{k+1}-x_{k}\right|}{P_{k}}+\frac{P_{k}+\left|x_{k+1}-x_{k}\right|}{P_{k}P_{k+1}}\Delta P_{k+1}\right)\right|\\ &\leq\log_{2}(e)\left|\frac{\left|x_{k+1}+\Delta x_{k+1}-x_{k}-\Delta x_{k}\right|}{P_{k}}-\frac{\left|x_{k+1}-x_{k}\right|}{P_{k}}\right.\\ &\left.\quad+\frac{\Delta P_{k}}{P_{k}}-\frac{P_{k}+\left|x_{k+1}-x_{k}\right|}{P_{k}P_{k+1}}\Delta P_{k+1}\right|\\ &\leq\frac{\log_{2}(e)}{P_{k}}\bigg{|}|x_{k+1}+\Delta x_{k+1}-x_{k}-\Delta x_{k}|-|x_{k+1}-x_{k}|\bigg{|}\\ &\quad+\!\log_{2}(e)\bigg{|}\frac{\Delta P_{k}\!-\!\Delta P_{k+1}}{P_{k}}\\ &\hskip 56.9055pt+\frac{P_{k+1}\!-\!P_{k}\!-\!|x_{k+1}\!-\!x_{k}|}{P_{k}P_{k+1}}\Delta P_{k+1}\bigg{|}\\ &\leq\frac{\log_{2}(e)}{P_{k}}|\Delta x_{k+1}-\Delta x_{k}|+\frac{\log_{2}(e)}{P_{k}}|\Delta P_{k+1}-\Delta P_{k}|\\ &\quad+\frac{\log_{2}(e)\left|\Delta P_{k+1}\right|}{P_{k}P_{k+1}}\big{|}P_{k+1}-P_{k}-|x_{k+1}-x_{k}|\big{|}\\ &\leq\frac{\log_{2}(e)}{P_{k}}|\Delta x_{k+1}-\Delta x_{k}|+\frac{\log_{2}(e)}{P_{k}}|\Delta P_{k+1}-\Delta P_{k}|\\ &\quad+\frac{\log_{2}(e)|\Delta P_{k+1}|}{P_{k}P_{k+1}}|x_{k+1}-x_{k}|\\ &\quad+\frac{\log_{2}(e)|\Delta P_{k+1}|}{P_{k}P_{k+1}}|P_{k+1}-P_{k}|\\ &\leq\frac{\log_{2}(e)}{\epsilon}|\Delta x_{k+1}-\Delta x_{k}|+\frac{\log_{2}(e)}{\epsilon}|\Delta P_{k+1}-\Delta P_{k}|\\ 
&\quad+\frac{\log_{2}(e)\delta}{\epsilon^{2}}|x_{k+1}-x_{k}|+\frac{\log_{2}(e)\delta}{\epsilon^{2}}|P_{k+1}-P_{k}|\end{split}

Therefore,

\begin{split}&|\mathcal{D}(x^{\prime}_{k},x^{\prime}_{k+1},P^{\prime}_{k},P^{\prime}_{k+1})-\mathcal{D}(x_{k},x_{k+1},P_{k},P_{k+1})|\\ &\leq\left(1+\frac{\log_{2}(e)}{\epsilon}\right)|\Delta x_{k+1}-\Delta x_{k}|\\ &\quad+\frac{\log_{2}(e)}{\epsilon}|\Delta P_{k+1}-\Delta P_{k}|\\ &\quad+\frac{\log_{2}(e)\delta}{\epsilon^{2}}|x_{k+1}-x_{k}|+\frac{\log_{2}(e)\delta}{\epsilon^{2}}|P_{k+1}-P_{k}|\end{split} (11)

Case 2: Next, we consider the case with

\displaystyle\mathcal{D}_{\text{info}}(x_{k}^{\prime},x_{k+1}^{\prime},P_{k}^{\prime},P_{k+1}^{\prime})>0\text{ and } (12a)
\displaystyle\mathcal{D}_{\text{info}}(x_{k},x_{k+1},P_{k},P_{k+1})=0. (12b)

Notice that (12b) implies P_{k}+|x_{k+1}-x_{k}|-P_{k+1}\leq 0. In this case:

|𝒟(xk,xk+1,Pk,Pk+1)𝒟(xk,xk+1,Pk,Pk+1)|\displaystyle|\mathcal{D}(x^{\prime}_{k},x^{\prime}_{k+1},P^{\prime}_{k},P^{\prime}_{k+1})-\mathcal{D}(x_{k},x_{k+1},P_{k},P_{k+1})|
=||xk+1+Δxk+1xkΔxk||xk+1xk|\displaystyle=\bigg{|}|x_{k+1}+\Delta x_{k+1}-x_{k}-\Delta x_{k}|-|x_{k+1}-x_{k}|
+12log2(Pk+ΔPk+|xk+1+Δxk+1xkΔxk|)\displaystyle\quad+\left.\frac{1}{2}\log_{2}\left(P_{k}+\Delta P_{k}+\left|x_{k+1}+\Delta x_{k+1}-x_{k}-\Delta x_{k}\right|\right)\right.
12log2(Pk+1+ΔPk+1)|\displaystyle\quad-\frac{1}{2}\log_{2}(P_{k+1}+\Delta P_{k+1})\bigg{|} (13a)
|Δxk+1Δxk|12log2(Pk+1+ΔPk+1)\displaystyle\leq\left|\Delta x_{k+1}-\Delta x_{k}\right|-\frac{1}{2}\log_{2}(P_{k+1}+\Delta P_{k+1})
+12log2(Pk+ΔPk+|xk+1+Δxk+1xkΔxk|)\displaystyle\quad+\frac{1}{2}\log_{2}(P_{k}+\Delta P_{k}+|x_{k+1}+\Delta x_{k+1}-x_{k}-\Delta x_{k}|) (13b)
|Δxk+1Δxk|\displaystyle\leq\left|\Delta x_{k+1}-\Delta x_{k}\right|
+log2(e)ϵ(Pk+ΔPk+|xk+1+Δxk+1xkΔxk|\displaystyle\quad+\frac{\log_{2}(e)}{\epsilon}(P_{k}+\Delta P_{k}+|x_{k+1}+\Delta x_{k+1}-x_{k}-\Delta x_{k}|
Pk+1ΔPk+1)\displaystyle\quad-P_{k+1}-\Delta P_{k+1}) (13c)
=|Δxk+1Δxk|\displaystyle=\left|\Delta x_{k+1}-\Delta x_{k}\right|
+log2(e)ϵ(Pk+|xk+1xk|Pk+1+ΔPkΔPk+1\displaystyle\quad+\frac{\log_{2}(e)}{\epsilon}\!\left(P_{k}\!+\!\left|x_{k+1}-x_{k}\right|-P_{k+1}+\Delta P_{k}-\Delta P_{k+1}\right.
+|xk+1+Δxk+1xkΔxk||xk+1xk|)\displaystyle\quad\left.+\left|x_{k+1}+\Delta x_{k+1}-x_{k}-\Delta x_{k}\right|-\left|x_{k+1}-x_{k}\right|\right) (13d)
|Δxk+1Δxk|\displaystyle\leq\left|\Delta x_{k+1}-\Delta x_{k}\right|
+log2(e)ϵ||xk+1+Δxk+1xkΔxk|\displaystyle\quad+\frac{\log_{2}(e)}{\epsilon}\bigg{|}|x_{k+1}+\Delta x_{k+1}-x_{k}-\Delta x_{k}| (13e)
|xk+1xk||\displaystyle\hskip 56.9055pt-|x_{k+1}-x_{k}|\bigg{|}
+log2(e)ϵ|ΔPk+1ΔPk|\displaystyle\quad+\frac{\log_{2}(e)}{\epsilon}\left|\Delta P_{k+1}-\Delta P_{k}\right| (13f)
(1+log2(e)ϵ)|Δxk+1Δxk|\displaystyle\leq\left(1+\frac{\log_{2}(e)}{\epsilon}\right)\left|\Delta x_{k+1}-\Delta x_{k}\right|
+log2(e)ϵ|ΔPk+1ΔPk|\displaystyle\quad+\frac{\log_{2}(e)}{\epsilon}\left|\Delta P_{k+1}-\Delta P_{k}\right| (13g)

In step (13b), we have used the fact that the difference between the two logarithmic terms is positive, because of the hypothesis (12a). In step (13c), we used the fact that \log_{2}a-\log_{2}b\leq\frac{2\log_{2}(e)}{\epsilon}(a-b) for a>b\geq\frac{\epsilon}{2}.

Case 3: Next, we consider the case with

\displaystyle\mathcal{D}_{\text{info}}(x_{k}^{\prime},x_{k+1}^{\prime},P_{k}^{\prime},P_{k+1}^{\prime})=0\text{ and } (14a)
\displaystyle\mathcal{D}_{\text{info}}(x_{k},x_{k+1},P_{k},P_{k+1})>0. (14b)

The first hypothesis (14a) implies:

\displaystyle P_{k}+\Delta P_{k}+|x_{k+1}+\Delta x_{k+1}-x_{k}-\Delta x_{k}|
\displaystyle\qquad-P_{k+1}-\Delta P_{k+1}\leq 0. (15)

Using

\displaystyle\left|x_{k+1}-x_{k}\right|-\left|\Delta x_{k+1}-\Delta x_{k}\right|
\displaystyle\qquad\leq\left|x_{k+1}+\Delta x_{k+1}-x_{k}-\Delta x_{k}\right|,

one can deduce from (15) that:

\begin{split}&P_{k}+|x_{k+1}-x_{k}|-P_{k+1}\\ &\leq\Delta P_{k+1}-\Delta P_{k}+|\Delta x_{k+1}-\Delta x_{k}|\\ &\leq|\Delta P_{k+1}-\Delta P_{k}|+|\Delta x_{k+1}-\Delta x_{k}|.\end{split} (16)

This results in:

\displaystyle|\mathcal{D}(x^{\prime}_{k},x^{\prime}_{k+1},P^{\prime}_{k},P^{\prime}_{k+1})-\mathcal{D}(x_{k},x_{k+1},P_{k},P_{k+1})|
\displaystyle=\bigg{|}\left|x_{k+1}+\Delta x_{k+1}-x_{k}-\Delta x_{k}\right|-\left|x_{k+1}-x_{k}\right|
\displaystyle\quad-\frac{1}{2}\log_{2}(P_{k}+|x_{k+1}-x_{k}|)+\frac{1}{2}\log_{2}(P_{k+1})\bigg{|} (17a)
\displaystyle\leq\left|\Delta x_{k+1}-\Delta x_{k}\right|
\displaystyle\quad+\frac{1}{2}\log_{2}(P_{k}+|x_{k+1}-x_{k}|)-\frac{1}{2}\log_{2}(P_{k+1}) (17b)
\displaystyle\leq\left|\Delta x_{k+1}-\Delta x_{k}\right|+\frac{\log_{2}(e)}{2\epsilon}(P_{k}+|x_{k+1}-x_{k}|-P_{k+1}) (17c)
\displaystyle\leq\left(1+\frac{\log_{2}(e)}{2\epsilon}\right)\left|\Delta x_{k+1}-\Delta x_{k}\right|
\displaystyle\quad+\frac{\log_{2}(e)}{2\epsilon}\left|\Delta P_{k+1}-\Delta P_{k}\right| (17d)

In (17b) we used the fact that the difference between the two logarithmic terms is positive, which follows from the hypothesis (14b). In (17c) we used the fact that \log_{2}a-\log_{2}b\leq\frac{\log_{2}(e)}{\epsilon}(a-b) for a>b\geq\epsilon. Finally, the inequality (16) was used in step (17d).

Case 4: Finally, we consider the case with

\displaystyle\mathcal{D}_{\text{info}}(x_{k}^{\prime},x_{k+1}^{\prime},P_{k}^{\prime},P_{k+1}^{\prime})=0\text{ and }
\displaystyle\mathcal{D}_{\text{info}}(x_{k},x_{k+1},P_{k},P_{k+1})=0.

In this case:

\begin{split}&|\mathcal{D}(x^{\prime}_{k},x^{\prime}_{k+1},P^{\prime}_{k},P^{\prime}_{k+1})-\mathcal{D}(x_{k},x_{k+1},P_{k},P_{k+1})|\\ &\quad=\big{|}\left|x_{k+1}+\Delta x_{k+1}-x_{k}-\Delta x_{k}\right|-|x_{k+1}-x_{k}|\big{|}\\ &\quad\leq|\Delta x_{k+1}-\Delta x_{k}|\end{split} (18)

To summarize, (11), (13g), (17d), and (18) show that choosing L_{\epsilon}=\max\{1+\frac{\log_{2}(e)}{\epsilon},\frac{\log_{2}(e)}{\epsilon^{2}}\} yields the desired result. ∎
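As a numerical sanity check (not part of the proof), the Lipschitz-type bound of Lemma 2 can be probed by random sampling for the scalar RI-distance with \alpha=W=1, using the illustrative choice \epsilon=1 and \delta=1/2:

```python
import math
import random

def ri_dist(xk, xk1, Pk, Pk1):
    # Scalar RI-distance with alpha = W = 1, as assumed in the proof.
    dx = abs(xk1 - xk)
    return dx + max(0.0, 0.5 * (math.log2(Pk + dx) - math.log2(Pk1)))

eps, delta = 1.0, 0.5                     # satisfies 0 < delta <= eps / 2
L_eps = max(1.0 + math.log2(math.e) / eps, math.log2(math.e) / eps ** 2)

rng = random.Random(0)
violations = 0
for _ in range(10_000):
    xk, xk1 = rng.uniform(-5, 5), rng.uniform(-5, 5)
    Pk, Pk1 = rng.uniform(2, 5), rng.uniform(2, 5)  # keeps P and P' >= eps
    dxk, dxk1 = rng.uniform(-delta, delta), rng.uniform(-delta, delta)
    dPk, dPk1 = rng.uniform(-delta, delta), rng.uniform(-delta, delta)
    lhs = abs(ri_dist(xk + dxk, xk1 + dxk1, Pk + dPk, Pk1 + dPk1)
              - ri_dist(xk, xk1, Pk, Pk1))
    rhs = L_eps * (abs(dxk1 - dxk) + abs(dPk1 - dPk)
                   + delta * abs(Pk1 - Pk) + delta * abs(xk1 - xk))
    if lhs > rhs + 1e-9:
        violations += 1
```

Such a check cannot replace the proof, but a nonzero violation count would immediately flag an error in the claimed constant L_{\epsilon}.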

Proof of Theorem 1
Suppose \gamma(t)=(x(t),P(t)) and \gamma^{\prime}(t)=(x^{\prime}(t),P^{\prime}(t)). In what follows, we consider the choice:

\delta=\min\left\{\frac{\epsilon_{0}}{2L_{\epsilon}\left(1+|\gamma|_{\text{TV}}\right)},\frac{\epsilon}{2}\right\} (19)

Since |\gamma^{\prime}-\gamma|_{\text{TV}}<\delta, we have \|\gamma^{\prime}-\gamma\|_{\infty}<\delta. In particular, for each t\in[0,T], we have |x^{\prime}(t)-x(t)|<\delta and |P^{\prime}(t)-P(t)|<\delta. Moreover:

\begin{split}|P^{\prime}(t)|&=|P(t)+P^{\prime}(t)-P(t)|\\ &\geq|P(t)|-|P^{\prime}(t)-P(t)|>2\epsilon-\delta>\epsilon.\end{split}

Therefore, \forall t\in[0,T], we have both P(t)>\epsilon and P^{\prime}(t)>\epsilon. Let \mathcal{P}=(0=t_{0}<t_{1}<\cdots<t_{N}=T) be a partition and define:

c(\gamma;\mathcal{P}):=\sum_{k=0}^{N-1}\mathcal{D}(x(t_{k}),x(t_{k+1}),P(t_{k}),P(t_{k+1})).

For any partition 𝒫\mathcal{P}, the following chain of inequalities holds:

|c(γ;𝒫)c(γ;𝒫)|\displaystyle|c(\gamma^{\prime};\mathcal{P})-c(\gamma;\mathcal{P})|
=|k=0N1𝒟(x(tk),x(tk+1),P(tk),P(tk+1))\displaystyle=\left|\sum_{k=0}^{N-1}\mathcal{D}(x^{\prime}(t_{k}),x^{\prime}(t_{k+1}),P^{\prime}(t_{k}),P^{\prime}(t_{k+1}))\right.
𝒟(x(tk),x(tk+1),P(tk),P(tk+1))|\displaystyle\qquad\qquad-\mathcal{D}(x(t_{k}),x(t_{k+1}),P(t_{k}),P(t_{k+1}))\Bigg{|} (20a)
k=0N1|𝒟(x(tk),x(tk+1),P(tk),P(tk+1))\displaystyle\leq\sum_{k=0}^{N-1}\left|\mathcal{D}(x^{\prime}(t_{k}),x^{\prime}(t_{k+1}),P^{\prime}(t_{k}),P^{\prime}(t_{k+1}))\right.
𝒟(x(tk),x(tk+1),P(tk),P(tk+1))|\displaystyle\left.\qquad\qquad-\mathcal{D}(x(t_{k}),x(t_{k+1}),P(t_{k}),P(t_{k+1}))\right| (20b)
Lϵk=0N1[|(x(tk+1)x(tk+1))(x(tk)x(tk))|.\displaystyle\leq L_{\epsilon}\sum_{k=0}^{N-1}\Bigl{[}|(x^{\prime}(t_{k+1})-x(t_{k+1}))-(x^{\prime}(t_{k})-x(t_{k}))|\Bigr{.}
+|(P(tk+1)P(tk+1))(P(tk)P(tk))|\displaystyle\qquad\left.+|(P^{\prime}(t_{k+1})-P(t_{k+1}))-(P^{\prime}(t_{k})-P(t_{k}))|\right.
+.δ|P(tk+1)P(tk)|+δ|x(tk+1)x(tk)|]\displaystyle\qquad+\Bigl{.}\delta|P(t_{k+1})-P(t_{k})|+\delta|x(t_{k+1})-x(t_{k})|\Bigr{]} (20c)
=Lϵ(V(γγ,𝒫)+δV(γ,𝒫))\displaystyle=L_{\epsilon}(V(\gamma^{\prime}-\gamma,\mathcal{P})+\delta V(\gamma,\mathcal{P})) (20d)
Lϵ(|γγ|TV+δ|γ|TV)\displaystyle\leq L_{\epsilon}\left(|\gamma^{\prime}-\gamma|_{\text{TV}}+\delta|\gamma|_{\text{TV}}\right) (20e)
<Lϵ(δ+δ|γ|TV)\displaystyle<L_{\epsilon}\left(\delta+\delta|\gamma|_{\text{TV}}\right) (20f)
Lϵ(1+|γ|TV)(ϵ02Lϵ(1+|γ|TV))\displaystyle\leq L_{\epsilon}\left(1+|\gamma|_{\text{TV}}\right)\left(\frac{\epsilon_{0}}{2L_{\epsilon}(1+|\gamma|_{\text{TV}})}\right) (20g)
=ϵ02\displaystyle=\frac{\epsilon_{0}}{2} (20h)

The inequality (20c) follows from Lemma 2. Let \{\mathcal{P}_{i}\}_{i\in\mathbb{N}} and \{\mathcal{P}^{\prime}_{i}\}_{i\in\mathbb{N}} be sequences of partitions such that:

\lim_{i\rightarrow\infty}c(\gamma;\mathcal{P}_{i})=c(\gamma),\;\;\lim_{i\rightarrow\infty}c(\gamma^{\prime};\mathcal{P}^{\prime}_{i})=c(\gamma^{\prime}), (21)

and let \{\mathcal{P}^{\prime\prime}_{i}\}_{i\in\mathbb{N}} be the sequence of partitions such that for each i\in\mathbb{N}, \mathcal{P}^{\prime\prime}_{i} is a common refinement of \mathcal{P}_{i} and \mathcal{P}^{\prime}_{i}. Since both

\displaystyle c(\gamma;\mathcal{P}_{i})\leq c(\gamma;\mathcal{P}^{\prime\prime}_{i})\leq c(\gamma)\;\text{and}\;c(\gamma^{\prime};\mathcal{P}^{\prime}_{i})\leq c(\gamma^{\prime};\mathcal{P}^{\prime\prime}_{i})\leq c(\gamma^{\prime})

hold for each i\in\mathbb{N}, (21) implies

\lim_{i\rightarrow\infty}c(\gamma;\mathcal{P}^{\prime\prime}_{i})=c(\gamma),\;\;\lim_{i\rightarrow\infty}c(\gamma^{\prime};\mathcal{P}^{\prime\prime}_{i})=c(\gamma^{\prime}). (22)

Now, since the chain of inequalities (20) holds for any partition,

|c(\gamma;\mathcal{P}^{\prime\prime}_{i})-c(\gamma^{\prime};\mathcal{P}^{\prime\prime}_{i})|<\frac{\epsilon_{0}}{2}

holds for all i\in\mathbb{N}. This results in:

\displaystyle|c(\gamma)-c(\gamma^{\prime})|=\lim_{i\rightarrow\infty}|c(\gamma;\mathcal{P}^{\prime\prime}_{i})-c(\gamma^{\prime};\mathcal{P}^{\prime\prime}_{i})|
\displaystyle\leq\frac{\epsilon_{0}}{2}<\epsilon_{0}. (23)

where the equality in (23) follows from (22).

Appendix C One-Dimensional Problem Optimal Path

Consider taking the single-perception optimal path \gamma_{1} in Fig. 3 (a):

(x_{0},P_{0})\rightarrow(x_{T},P_{T})

and dividing it into the combination of two sub-paths, denoted \gamma_{2}:

\begin{split}&(x_{0},P_{0})\rightarrow(x_{a},P_{a})\rightarrow(x_{T},P_{T})\\ &\text{such that}\;\;\hat{P}_{0}^{\prime}=P_{0}+\beta\|x_{T}-x_{0}\|W>P_{a},\\ &\qquad\qquad\hat{P}_{a}^{\prime}=P_{a}+(1-\beta)\|x_{T}-x_{0}\|W>P_{T}\end{split}

where \beta\in(0,1) is a constant which denotes where in \gamma_{2} the additional sensing action takes place. The combination of the divided sub-paths has the same initial (z_{0}) and final (z_{T}) states as the original path, but also achieves an intermediate state (z_{a}).

Path \gamma_{1} has a total RI cost:

\displaystyle\mathcal{D}(\gamma_{1})=\|x_{T}-x_{0}\|+\frac{\alpha}{2}\left[\log_{2}\hat{P}_{0}-\log_{2}P_{T}\right],

where \hat{P}_{0}=P_{0}+\|x_{T}-x_{0}\|W. Likewise, the path \gamma_{2} has a cost which is the summation of two information-gain terms while traversing the same distance as \gamma_{1}:

\displaystyle\mathcal{D}(\gamma_{2})=\|x_{T}-x_{0}\|+\frac{\alpha}{2}\left[\log_{2}\hat{P}_{0}^{\prime}-\log_{2}P_{a}\right]
\displaystyle\qquad\qquad+\frac{\alpha}{2}\left[\log_{2}\hat{P}_{a}^{\prime}-\log_{2}P_{T}\right]

Comparing the costs of \gamma_{1} and \gamma_{2} yields:

\displaystyle\mathcal{D}(\gamma_{1})-\mathcal{D}(\gamma_{2})=\frac{\alpha}{2}\left[\log_{2}\hat{P}_{0}-\log_{2}P_{T}\right]
\displaystyle\qquad-\frac{\alpha}{2}\left[\log_{2}\hat{P}_{0}^{\prime}-\log_{2}P_{a}\right]-\frac{\alpha}{2}\left[\log_{2}\hat{P}_{a}^{\prime}-\log_{2}P_{T}\right]
\displaystyle=\frac{\alpha}{2}\left[\log_{2}\hat{P}_{0}-\log_{2}\hat{P}_{0}^{\prime}\right]-\frac{\alpha}{2}\left[\log_{2}\hat{P}_{a}^{\prime}-\log_{2}P_{a}\right]
\displaystyle=f(\hat{P}_{0}^{\prime})-f(P_{a})<0,

where f(P)=\frac{\alpha}{2}\left[\log_{2}(P+(1-\beta)\|x_{T}-x_{0}\|W)-\log_{2}P\right]. The last inequality follows from the facts that \hat{P}_{0}^{\prime}>P_{a} and that f(P) is a decreasing function (\frac{df}{dP}<0). Hence, the single-perception path \gamma_{1} is cheaper.
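The comparison above can be verified numerically (illustrative constants \alpha=1, W=0.75, \beta=1/2, chosen so that \hat{P}_{0}^{\prime}>P_{a} and \hat{P}_{a}^{\prime}>P_{T} hold):

```python
import math

# Illustrative one-dimensional instance of the argument above.
alpha, W, beta = 1.0, 0.75, 0.5
x0, xT = 0.0, 2.0
P0, Pa, PT = 1.0, 0.6, 0.5
L = abs(xT - x0)

def info(P_from, P_to):
    # Information cost of instantaneously reducing covariance P_from to P_to.
    return 0.5 * alpha * (math.log2(P_from) - math.log2(P_to))

# gamma_1: single perception at the end of the path.
P0_hat = P0 + L * W
cost_single = L + info(P0_hat, PT)

# gamma_2: two perceptions, with an intermediate reduction to Pa.
P0_hat_prime = P0 + beta * L * W          # exceeds Pa, as required
Pa_hat_prime = Pa + (1 - beta) * L * W    # exceeds PT, as required
cost_double = L + info(P0_hat_prime, Pa) + info(Pa_hat_prime, PT)
```

With these constants cost_single is strictly smaller than cost_double, matching the sign of f(\hat{P}_{0}^{\prime})-f(P_{a}) derived above.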

References

  • [1] S. Pendleton, H. Andersen, X. Du, X. Shen, M. Meghjani, Y. Eng, D. Rus, and M. Ang, “Perception, planning, control, and coordination for autonomous vehicles,” Machines, vol. 5, no. 1, p. 6, 2017.
  • [2] M. Pfeiffer, M. Schaeuble, J. Nieto, R. Siegwart, and C. Cadena, “From perception to decision: A data-driven approach to end-to-end motion planning for autonomous ground robots,” in Proc. IEEE Int. Conf. Robot. Autom., 2017, pp. 1527–1533.
  • [3] L. Carlone and S. Karaman, “Attention and anticipation in fast visual-inertial navigation,” IEEE Trans. Robot., vol. 35, no. 1, pp. 1–20, 2019.
  • [4] R. Alterovitz, S. Koenig, and M. Likhachev, “Robot planning in the real world: research challenges and opportunities,” AI Magazine, vol. 37, no. 2, pp. 76–84, 2016.
  • [5] Y. Kuwata, J. Teo, S. Karaman, G. Fiore, E. Frazzoli, and J. How, “Motion planning in complex environments using closed-loop prediction,” AIAA Guid. Navi. Control Conf. Exhibit, 2008.
  • [6] J. Van Den Berg, P. Abbeel, and K. Goldberg, “LQG-MP: Optimized path planning for robots with motion uncertainty and imperfect state information,” Int. J. Robot. Res., vol. 30, no. 7, pp. 895–913, 2011.
  • [7] A.-A. Agha-Mohammadi, S. Chakravorty, and N. M. Amato, “FIRM: Sampling-based feedback motion-planning under motion uncertainty and imperfect measurements,” Int. J. Robot. Res., vol. 33, no. 2, pp. 268–304, 2014.
  • [8] C. A. Sims, “Implications of rational inattention,” J. Monetary Economics, vol. 50, no. 3, pp. 665–690, 2003.
  • [9] E. Shafieepoorfard, M. Raginsky, and S. P. Meyn, “Rationally inattentive control of markov processes,” SIAM J. Control Optimization, vol. 54, no. 2, pp. 987–1016, 2016.
  • [10] E. Shafieepoorfard and M. Raginsky, “Rational inattention in scalar lqg control,” in Proc. Conf. Decision Control, 2013, pp. 5733–5739.
  • [11] S. M. LaValle, Planning algorithms.   Cambridge University Press, 2006.
  • [12] S. Karaman and E. Frazzoli, “Incremental sampling-based algorithms for optimal motion planning,” in Proc. Robot.: Sci. Syst., 2010.
  • [13] J. J. Marquez and M. L. Cummings, “Design and evaluation of path planning decision support for planetary surface exploration,” J. Aerosp. Comput., Info., Comm., vol. 5, no. 3, pp. 57–71, 2008.
  • [14] Y. K. Hwang and N. Ahuja, “A potential field approach to path planning,” IEEE Trans. Robot. Autom., vol. 8, no. 1, pp. 23–32, 1992.
  • [15] S. Kambhampati and L. Davis, “Multiresolution path planning for mobile robots,” IEEE J. Robot. Autom., vol. 2, no. 3, pp. 135–145, 1986.
  • [16] F. Hauer, A. Kundu, J. M. Rehg, and P. Tsiotras, “Multi-scale perception and path planning on probabilistic obstacle maps,” in Proc. IEEE Int. Conf. Robot. Autom., 2015, pp. 4210–4215.
  • [17] A. Lambert and D. Gruyer, “Safe path planning in an uncertain-configuration space,” in Proc. IEEE Int. Conf. Robot. Autom., vol. 3, 2003, pp. 4185–4190.
  • [18] R. Pepy and A. Lambert, “Safe path planning in an uncertain-configuration space using RRT,” in Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst., 2006, pp. 5376–5381.
  • [19] P. Kloeden and E. Platen, Numerical Solution of Stochastic Differential Equations.   Berlin: Springer, 1992.
  • [20] A. Lambert and D. Gruyer, “Safe path planning in an uncertain-configuration space,” in Proc. IEEE Int. Conf. Robot. Autom., vol. 3, 2003, pp. 4185–4190.
  • [21] R. Pepy and A. Lambert, “Safe path planning in an uncertain-configuration space using RRT,” in Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst., 2006, pp. 5376–5381.
  • [22] T. Tanaka, K.-K. Kim, P. A. Parrilo, and S. K. Mitter, “Semidefinite programming approach to Gaussian sequential rate-distortion trade-offs,” IEEE Trans. Automat. Control, vol. 62, no. 4, pp. 1896–1910, 2016.
  • [23] L. Vandenberghe, S. Boyd, and S.-P. Wu, “Determinant maximization with linear matrix inequality constraints,” SIAM J. Matrix Analysis and Applications, vol. 19, no. 2, pp. 499–533, 1998.
  • [24] N. L. Carothers, Real analysis.   Cambridge University Press, 2000.
  • [25] S. M. LaValle and J. J. Kuffner, “Randomized kinodynamic planning,” Int. J. Robot. Res., vol. 20, no. 5, pp. 378–400, May 2001.
  • [26] S. Karaman, M. R. Walter, A. Perez, E. Frazzoli, and S. Teller, “Anytime motion planning using the RRT,” in Proc. IEEE Int. Conf. Robot. Autom., 2011, pp. 1478–1483.
  • [27] S. Karaman and E. Frazzoli, “Sampling-based algorithms for optimal motion planning,” Int. J. Robot. Res., vol. 30, no. 7, pp. 846–894, 2011.