This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Stochastic receding horizon control with output feedback and bounded control inputs

Peter Hokayem Eugenio Cinquemani Debasish Chatterjee Federico Ramponi  and  John Lygeros
Abstract.

We provide a solution to the problem of receding horizon control for stochastic discrete-time systems with bounded control inputs and imperfect state measurements. For a suitable choice of control policies, we show that the finite-horizon optimization problem to be solved on-line is convex and successively feasible. Due to the inherent nonlinearity of the feedback loop, a slight extension of the Kalman filter is exploited to estimate the state optimally in mean-square sense. We show that the receding horizon implementation of the resulting control policies renders the state of the overall system mean-square bounded under mild assumptions. Finally, we discuss how some of the quantities required by the finite-horizon optimization problem can be computed off-line, reducing the on-line computation, and present some numerical examples.

Key words and phrases:
Predictive control, output feedback, constrained control, state estimation, stochastic control
This research was partially supported by the Swiss National Science Foundation under grant 200021-122072 and by the European Commission under the project Feednetback FP7-ICT-223866 (www.feednetback.eu). This article was not presented at any IFAC meeting.
P. Hokayem, D. Chatterjee, F. Ramponi, and J. Lygeros are with the Automatic Control Laboratory, ETH Zürich, Physikstrasse 3, 8092 Zürich, Switzerland. E. Cinquemani is with the INRIA Grenoble-Rhône-Alpes. Corresponding author P. Hokayem Tel. +41 44 632 0403. Fax: +41 44 632 1211.
Email: {hokayem,chatterjee,ramponif,lygeros}@control.ee.ethz.ch,
Eugenio.Cinquemani@inria.fr

1. Introduction

A considerable amount of research effort has been devoted to deterministic receding horizon control, see, for example, [Mayne et al., 2000; Bemporad and Morari, 1999; Lazar et al., 2007; Maciejowski, 2001; Qin and Badgwell, 2003] and references therein. Currently we have readily available conclusive proofs of successive feasibility and stability of receding horizon control laws in the noise-free deterministic setting. These techniques can be readily extended to the robust case, i.e., whenever there is an additive noise of bounded nature entering the system. The counterpart for stochastic systems subject to process noise and imperfect state measurements and bounded control inputs, however, is still missing. The principal obstacle is posed by the fact that it may not be possible to determine an a priori bound on the support of the noise, e.g., whenever the noise is additive and Gaussian. This extra ingredient complicates both the stability and feasibility proofs manifold: the noise eventually drives the state outside any bounded set no matter how large the latter is taken to be, and employing any standard linear state feedback means that any hard bounds on the control inputs will eventually be violated.

In this article we propose a solution to the general receding horizon control problem for linear systems with noisy process dynamics, imperfect state information, and bounded control inputs. Both the process and measurement noise sequences are assumed to enter the system in an additive fashion, and we require that the designed control policies satisfy hard bounds. Periodically at times t=0,Nc,2Nc,t=0,N_{c},2N_{c},\cdots, where NcN_{c} is the control horizon, a certain finite-horizon optimal control problem is solved for a prediction horizon NN. The underlying cost is the standard expectation of the sum of cost-per-state functions that are quadratic in the state and control inputs [Bertsekas, 2005]. We also impose some variance-like bounds on the predicted future states and inputs—this is one possible way to impose soft state-constraints that are in spirit similar to integrated chance-constraints, e.g., in [Klein Haneveld, 1983; Klein Haneveld and van der Vlerk, 2006].

There are several key challenges that are inherent to our setup. First, since state-information is incomplete and corrupt, the need for a filter to estimate the state from the corrupt state measurements naturally presents itself. Second, it is clear that in the presence of unbounded (e.g., Gaussian) noise, it is not in general possible to ensure any bound on the control values with linear state feedback alone (assuming complete state information is available)—the additive nature of the noise ensures that the states exit from any fixed bounded set at some time almost surely. We see at once that nonlinear feedback is essential, and this issue is further complicated by the fact that only incomplete and imperfect state-information is available. Third, it is unclear whether applicatino of the control policies stabilizes the system in any reasonable sense.

In the backdrop of these challenges, let us mention the main contributions of this article. We

  1. (i)

    provide a tractable method for designing, in a receding horizon fashion, bounded causal nonlinear feedback policies, and

  2. (ii)

    guarantee that applying the designed control policies in a receding horizon fashion renders the state of the closed-loop system mean-square bounded for any initial state statistics.

We elaborate on the two points above:

  1. (i)

    Tractability: Given a subclass of causal policies, we demonstrate that the underlying optimal control problem translates to a convex optimization problem to be solved every NcN_{c} time steps, that this optimization problem is feasible for any statistics of the initial state, and equivalent to a quadratic program. The subclass of polices we employ is comprised of open-loop terms and nonlinear feedback from the innovations. Under the assumption that the process and measurement noise is Gaussian, (even though the bounded inputs requirement makes the problem inherently nonlinear and it may appear that standard Kalman filtering may not apply,) it turns out that Kalman filtering techniques can indeed be utilized. We provide a low-complexity algorithm (essentially similar to standard Kalman filtering) for updating this conditional density, and report tractable solutions for the off-line computation of the time-dependent optimization parameters.

  2. (ii)

    Stability: It is well known that for noise-free discrete-time linear systems it is not possible to globally asymptotically stabilize the system, if the usual matrix AA has unstable eigenvalues, see, for example, [Yang et al., 1997] and references therein. Moreover, in the presence of stochastic process noise the hope for achieving asymptotic stability is obviously not realistic. Therefore, we naturally relax the notion of stability into mean-square boundedness of the state and impose the extra conditions that the system matrix AA is Lyapunov (or neutrally) stable and that the pair (A,B)(A,B) is stabilizable. Under such standard assumptions, we show that it is possible to augment the optimization problem with an extra stability constraint, and, consequently, that the successive application of our resulting policies renders the state of the overall system (plant and estimator) mean-square bounded.

Related Works

The research on stochastic receding horizon control is broadly subdivided into two parallel lines: the first treats multiplicative noise that enters the state equations, and the second caters to additive noise. The case of multiplicative noise has been treated in some detail in [Primbs, 2007; Primbs and Sung, 2009; Cannon et al., 2009], where the authors consider also soft constraints on the state and control input. However, in this article we exclusively focus on the case of additive noise.

The approach proposed in this article stems from and generalizes the idea of affine parametrization of control policies for finite-horizon linear quadratic problems proposed in [Ben-Tal et al., 2004, 2006], utilized within the robust MPC framework in [Ben-Tal et al., 2006; Löfberg, 2003; Goulart et al., 2006] for full state feedback, and in [van Hessem and Bosgra, 2003] for output feedback with Gaussian state and measurement noise inputs. More recently, this affine approximation was utilized in [Skaf and Boyd, 2009] for both the robust deterministic and the stochastic setups in the absence of control bounds, and optimality of the affine policies in the scalar deterministic case was reported in [Bertsimas et al., 2009]. In [Bertsimas and Brown, 2007] the authors reformulate the stochastic programming problem as a deterministic one with bounded noise and solve a robust optimization problem over a finite horizon, followed by estimating the performance when the noise can take unbounded values, i.e., when the noise is unbounded, but takes high values with low probability. Similar approach was utilized in [Oldewurtel et al., 2008] as well. There are also other approaches, e.g., those employing randomized algorithms as in [Batina, 2004; Blackmore and Williams, 2007; Maciejowski et al., 2005]. Results on obtaining lower bounds on the value functions of the stochastic optimization problem have been recently reported in [Wang and Boyd, 2009].

Organization

The remainder of this article is organized as follows. In Section 2 we formulate the receding horizon control problem with all the underlying assumptions, the construction of the estimator, and the main optimization problems to be solved. In Section 3 we provide the main results pertaining to tractability of the optimization problems and mean-square boundedness of the closed-loop system. We comment on the obtained results and provide some extensions in Section 4. We provide all the needed preliminary results, derivations, and proofs in Section 5, and some numerical examples in Section 6. Finally, we conclude in Section 7.

Notation

Let (Ω,𝔉,)(\Omega,\mathfrak{F},\mathbb{P}) be a general probability space, we denote the conditional expectation given the sub-σ\sigma algebra 𝔉\mathfrak{F}^{\prime} of 𝔉\mathfrak{F} as 𝔼𝔉[.]\mathbb{E}_{\mathfrak{F}^{\prime}}[.]. For any random vector ss we let Σs𝔼[ss𝖳]\Sigma_{s}\coloneqq\mathbb{E}[ss^{\mathsf{T}}]. Hereafter we let +{1,2,}\mathbb{N}_{+}\coloneqq\{1,2,\ldots\} and +{0}\mathbb{N}\coloneqq\mathbb{N}_{+}\cup\{0\} be the sets of positive and nonnegative integers, respectively. We let 𝗍𝗋()\mathsf{tr}\!\left(\cdot\right) denote the trace of a square matrix, p\left\lVert{\cdot}\right\rVert_{p} denote the standard p\ell_{p} norm, and simply .\left\lVert{.}\right\rVert denote the 2\ell_{2}-norm. In a Euclidean space we denote by r{\mathcal{B}}_{r} the closed Euclidean ball or radius rr centered at the origin. For any two matrices AA and BB of compatible dimensions, we denote by k(A,B)\mathfrak{R}_{k}(A,B) the kk-th step reachability matrix k(A,B)[Ak1BABB]\mathfrak{R}_{k}(A,B)\coloneqq\left[\begin{matrix}A^{k-1}B&\cdots&AB&B\end{matrix}\right]. For any matrix MM, we let σmin\sigma_{\min} and σmax\sigma_{\max} be its minimal and maximal singular values, respectively. We let (M)i1:i2(M)_{i_{1}:i_{2}} denote the sub-matrix obtained by selecting the rows i1i_{1} through i2i_{2} of MM, and let (M)i(M)_{i} denote its ii-th row.

2. Problem Setup

We are given a certain plant with discrete-time dynamics, process noise wtw_{t}, and imperfect state measurements yty_{t} tainted through noise vtv_{t} (Figure 1). The objective is to design a receding horizon controller which renders the overall closed-loop system mean-square bounded. We discuss the structure and the main assumptions on the dynamics of the system, the performance index to be minimized, and the construction of the optimal mean-square estimator x^t{\hat{x}}_{t} in Subsection 2.1. Then, we formalize the optimization problem to be solved in a receding horizon fashion (to generate the input policies utu_{t}) in Subsection 2.2.

Refer to caption
Figure 1. Main setup

2.1. Dynamics, Performance Index and Estimator

Consider the following affine discrete-time stochastic dynamical model:

xt+1\displaystyle x_{t+1} =Axt+But+wt,\displaystyle=Ax_{t}+Bu_{t}+w_{t}, (2.1a)
yt\displaystyle y_{t} =Cxt+vt\displaystyle=Cx_{t}+v_{t} (2.1b)

where tt\in\mathbb{N}, xtnx_{t}\in\mathbb{R}^{n} is the state, utmu_{t}\in\mathbb{R}^{m} is the control input, ytpy_{t}\in\mathbb{R}^{p} is the output, wtnw_{t}\in\mathbb{R}^{n} is a random process noise, vtpv_{t}\in\mathbb{R}^{p} is a random measurement noise, and AA, BB, and CC are known matrices. We posit the following main assumption:

Assumption 1.

  1. (i)

    The system matrices in (2.1a) satisfy the following:

    1. (a)

      The pair (A,B)(A,B) is stabilizable.

    2. (b)

      AA is discrete-time Lyapunov stable [Bernstein, 2009, Chapter 12], i.e., the eigenvalues {λi(A)i=1,,d}\{\lambda_{i}(A)\mid i=1,\ldots,d\} lie in the closed unit disc, and those eigenvalues λj(A)\lambda_{j}(A) with |λj(A)|=1\bigl{|}\lambda_{j}(A)\bigr{|}=1 have equal algebraic and geometric multiplicities.

  2. (ii)

    The initial condition and the process and measurement noise vectors are normally distributed, i.e., x0𝒩(0,Σx0)x_{0}\sim\mathcal{N}(0,\Sigma_{x_{0}}), wt𝒩(0,Σw)w_{t}\sim\mathcal{N}(0,\Sigma_{w}), and vt𝒩(0,Σv)v_{t}\sim\mathcal{N}(0,\Sigma_{v}). Moreover, x0x_{0}, (wt)t(w_{t})_{t\in\mathbb{N}} and (vt)t(v_{t})_{t\in\mathbb{N}} are mutually independent.

  3. (iii)

    The control inputs take values in the control constraint set:

    𝕌{ξm|ξUmax};\mathbb{U}\coloneqq\bigl{\{}\xi\in\mathbb{R}^{m}\,\big{|}\,\left\lVert{\xi}\right\rVert_{\infty}\leqslant U_{\max}\bigr{\}}; (C1)

    i.e., ut𝕌u_{t}\in\mathbb{U} for each tt\in\mathbb{N}.\diamondsuit

Without any loss of generality, we also assume that AA is in real Jordan canonical form. Indeed, given a linear system described by system matrices (A~,B~)\bigl{(}\tilde{A},\tilde{B}\bigr{)}, there exists a coordinate transformation in the state-space that brings the pair (A~,B~)\bigl{(}\tilde{A},\tilde{B}\bigr{)} to the pair (A,B)(A,B), where AA is in real Jordan form [Horn and Johnson, 1990, p. 150]. In particular, choosing a suitable ordering of the Jordan blocks, we can ensure that the pair (A,B)(A,B) has the form ([A100A2],[B1B2])\left(\begin{bmatrix}A_{1}&0\\ 0&A_{2}\end{bmatrix},\begin{bmatrix}B_{1}\\ B_{2}\end{bmatrix}\right), where A1n1×n1A_{1}\in\mathbb{R}^{n_{1}\times n_{1}} is Schur stable, and A2n2×n2A_{2}\in\mathbb{R}^{n_{2}\times n_{2}} has its eigenvalues on the unit circle. By hypothesis (i)-(i)(b) of Assumption 1, A2A_{2} is therefore block-diagonal with elements on the diagonal being either ±1\pm 1 or 2×22\times 2 rotation matrices. As a consequence, A2A_{2} is orthogonal. Moreover, since (A,B)(A,B) is stabilizable, the pair (A2,B2)(A_{2},B_{2}) must be reachable in a number of steps κn2\kappa\leqslant n_{2} that depends on the dimension of A2A_{2} and the structure of (A2,B2)(A_{2},B_{2}), i.e., rank(κ(A2,B2))=n2\text{rank}(\mathfrak{R}_{\kappa}(A_{2},B_{2}))=n_{2}. Summing up, we can start by considering that the state equation (2.1a) has the form

[xt+1(1)xt+1(2)]=[A1xt(1)A2xt(2)]+[B1B2]ut+[wt(1)wt(2)],\begin{bmatrix}x^{(1)}_{t+1}\\ --\\ x^{(2)}_{t+1}\end{bmatrix}=\begin{bmatrix}A_{1}x^{(1)}_{t}\\ --\\ A_{2}x^{(2)}_{t}\end{bmatrix}+\begin{bmatrix}B_{1}\\ --\\ B_{2}\end{bmatrix}u_{t}+\begin{bmatrix}w^{(1)}_{t}\\ --\\ w^{(2)}_{t}\end{bmatrix}, (2.2)

where A1A_{1} is Schur stable, A2A_{2} is orthogonal, and there exists a nonnegative integer κn2\kappa\leqslant n_{2} such that the subsystem (A2,B2)(A_{2},B_{2}) is reachable in κ\kappa steps. This integer κ\kappa is fixed throughout the rest of the article.

Fix a prediction horizon N+N\in\mathbb{N}_{+}, and let us consider, at any time tt, the cost VtV(t,𝒴t,u)V_{t}\coloneqq V(t,\mathcal{Y}_{t},u) defined by:

Vt\displaystyle V_{t} =𝔼𝒴t[k=0N1(xt+k𝖳Qkxt+k+ut+k𝖳Rkut+k)+xt+N𝖳QNxt+N],\displaystyle=\mathbb{E}_{\mathcal{Y}_{t}}\Bigg{[}\sum\limits_{k=0}^{N-1}\!\bigl{(}x_{t+k}^{\mathsf{T}}Q_{k}x_{t+k}+u_{t+k}^{\mathsf{T}}R_{k}u_{t+k}\bigr{)}+x_{t+N}^{\mathsf{T}}Q_{N}x_{t+N}\Bigg{]}, (2.3)

where 𝒴t={y0,y1,,yt}\mathcal{Y}_{t}=\{y_{0},y_{1},\ldots,y_{t}\} is the set of outputs up to time tt, (or more precisely the σ\sigma-algebra generated by the set of outputs,) and Qk=Qk𝖳>0Q_{k}=Q^{\mathsf{T}}_{k}>0, QN=QN𝖳>0Q_{N}=Q_{N}^{\mathsf{T}}>0, and Rk=Rk𝖳>0R_{k}=R^{\mathsf{T}}_{k}>0 are some given symmetric matrices of appropriate dimension, k=0,,N1k=0,\ldots,N-1. At each time instant tt, we are interested in minimizing (2.3) over the class of causal output feedback strategies 𝒢\mathcal{G} defined as:

[utut+1ut+N1]=[gt(𝒴t)gt+1(𝒴t+1)gt+N1(𝒴t+N1)],\left[\begin{matrix}u_{t}\\ u_{t+1}\\ \vdots\\ u_{t+N-1}\end{matrix}\right]\!\!=\!\!\left[\begin{array}[]{l}g_{t}(\mathcal{Y}_{t})\\ g_{t+1}(\mathcal{Y}_{t+1})\\ \vdots\\ g_{t+N-1}(\mathcal{Y}_{t+N-1})\end{array}\right], (2.4)

for some measurable functions 𝐠t{gt,\mathbf{g}_{t}\coloneqq\{g_{t}, ,gt+N1}\cdots,g_{t+N-1}\} which satisfy the hard constraint (C1) on the inputs.

The cost VtV_{t} in (2.3) is a conditional expectation given the observations up through time tt, and as such requires the conditional density f(xt|𝒴t)f(x_{t}|\mathcal{Y}_{t}) of the state given the previous and current measurements. Our choice of causal control policies in (2.4) motivates us to rewrite the cost VtV_{t} in (2.3) as:

Vt\displaystyle V_{t} =𝔼𝒴t[𝔼{𝒴t,xt}[k=0N1(xt+k𝖳Qkxt+k+ut+k𝖳Rkut+k)+xt+N𝖳QNxt+N]].\displaystyle=\mathbb{E}_{\mathcal{Y}_{t}}\!\Bigg{[}\mathbb{E}_{\{\mathcal{Y}_{t},x_{t}\}}\!\Bigg{[}\sum\limits_{k=0}^{N-1}\big{(}x_{t+k}^{\mathsf{T}}Q_{k}x_{t+k}+u_{t+k}^{\mathsf{T}}R_{k}u_{t+k}\big{)}+x_{t+N}^{\mathsf{T}}Q_{N}x_{t+N}\Bigg{]}\Bigg{]}. (2.5)

The propagation of f(xt|𝒴t)f(x_{t}|\mathcal{Y}_{t}) in time is not a trivial task in general. In what follows we shall report an iterative method for the computation of f(xt|𝒴t)f(x_{t}|\mathcal{Y}_{t}), whenever the control input is a measurable deterministic feedback from the past and present outputs. For t,s0t,s\in\mathbb{N}_{0}, define x^t|s=𝔼𝒴s[xt]\hat{x}_{t|s}=\mathbb{E}_{\mathcal{Y}_{s}}[x_{t}] and Pt|s=𝔼𝒴s[(xtx^t|s)(xtx^t|s)𝖳]P_{t|s}=\mathbb{E}_{\mathcal{Y}_{s}}[(x_{t}-\hat{x}_{t|s})(x_{t}-\hat{x}_{t|s})^{\mathsf{T}}].

Assumption 2.

We require that Σw>0\Sigma_{w}>0 and Σv>0\Sigma_{v}>0.\diamondsuit

The following result is a slight extension of the usual Kalman Filter for which a proof can be found in Appendix A. An alternative proof can also be found in [Kumar and Varaiya, 1986, p.102].

Proposition 3.

Under Assumption 1-(ii) and Assumption 2, f(xt|𝒴t)f(x_{t}|\mathcal{Y}_{t}) and f(xt+1|𝒴t)f(x_{t+1}|\mathcal{Y}_{t}) are the probability densities of Gaussian distributions 𝒩(x^t|t,Pt|t)\mathcal{N}(\hat{x}_{t|t},P_{t|t}) and 𝒩(x^t+1|t,Pt+1|t)\mathcal{N}(\hat{x}_{t+1|t},P_{t+1|t}), respectively, with Pt|t>0P_{t|t}>0 and Pt+1|t>0P_{t+1|t}>0. For t=0,1,2,,t=0,1,2,\ldots, their conditional means and covariances can be computed iteratively as follows:

x^t+1|t+1\displaystyle\hat{x}_{t+1|t+1} =x^t+1|t\displaystyle=\hat{x}_{t+1|t}
+Pt+1|tC𝖳(CPt+1|tC𝖳+Σv)1(yt+1Cx^t+1|t),\displaystyle\quad+P_{t+1|t}C^{\mathsf{T}}(CP_{t+1|t}C^{\mathsf{T}}+\Sigma_{v})^{-1}(y_{t+1}-C\hat{x}_{t+1|t}), (2.6)
Pt+1|t+1\displaystyle P_{t+1|t+1} =Pt+1|tPt+1|tC𝖳(CPt+1|tC𝖳+Σv)1CPt+1|t,\displaystyle=P_{t+1|t}-P_{t+1|t}C^{\mathsf{T}}(CP_{t+1|t}C^{\mathsf{T}}+\Sigma_{v})^{-1}CP_{t+1|t}, (2.7)

where

x^t+1|t\displaystyle\hat{x}_{t+1|t} =Ax^t|t+Bgt(𝒴t),\displaystyle=A\hat{x}_{t|t}+Bg_{t}(\mathcal{Y}_{t}), (2.8)
Pt+1|t\displaystyle P_{t+1|t} =APt|tA𝖳+Σw.\displaystyle=AP_{t|t}A^{\mathsf{T}}+\Sigma_{w}. (2.9)

In particular, the matrix Pt|tP_{t|t} together with x^t|t\hat{x}_{t|t} characterize the conditional density f(xt|𝒴t)f(x_{t}|\mathcal{Y}_{t}), which is needed in the computation of the cost (2.5). Proposition 3 states that the conditional mean and covariances of xtx_{t} can be propagated by an iterative algorithm which resembles the Kalman filter. Since the system is generally nonlinear due to the function gg being nonlinear, we cannot assume that the probability distributions in the problem are Gaussian (in fact, the a priori distributions of xtx_{t} and of 𝒴t\mathcal{Y}_{t} are not) and the proof cannot be developed in the standard linear framework of the Kalman filter.

Hereafter, we shall denote for notational convenience by x^t\hat{x}_{t} the estimate x^t|t\hat{x}_{t|t}, and let

x^t=[x^t(1)x^t(2)],\hat{x}_{t}=\left[\begin{matrix}{\hat{x}}^{(1)}_{t}\\ --\\ {\hat{x}}^{(2)}_{t}\end{matrix}\right], (2.10)

which corresponds to the Jordan decomposition in (2.2).

2.2. Optimization Problem and Control Policies

Having discussed the iterative update of the laws of xtx_{t} given the measurements 𝒴t\mathcal{Y}_{t} in Section 2.1, we return to our optimization problem in (2.5). We can rewrite our optimization problem as:

min𝐠t{Vt| dynamics (2.1a)-(2.1b) and constraint (C1)},\displaystyle\min_{\mathbf{g}_{t}}\Bigl{\{}V_{t}\ \Big{|}\text{ dynamics \eqref{eq:system-a}-\eqref{eq:system-b} and constraint \eqref{eq:C1}}\Bigr{\}}, (OP1)

where the collection of functions 𝐠t\mathbf{g}_{t} were defined following (2.4).

The explicit solution via dynamic programming to Problem (OP1) over the class of causal feedback policies 𝒢\mathcal{G}, is difficult if not impossible to obtain in general [Bertsekas, 2000, 2007]. One way to circumvent this difficulty is to restrict our search for feedback policies to a subclass of 𝒢\mathcal{G}. This will result in a suboptimal solution to our problem, but may yield a tractable optimization problem. This is the track we pursue next.

Given a control horizon NcN_{c} and a prediction horizon N(Nc)N\;(\geqslant N_{c}), we would like to periodically solve Problem (OP1) at times t=0,Nc,2Nc,t=0,N_{c},2N_{c},\cdots over the following class of affine-like control policies

u=η+i=tθ,iφi(yiy^i)for all =t,,t+N1,u_{\ell}=\eta_{\ell}+\sum_{i=t}^{\ell}\theta_{\ell,i}\varphi_{i}(y_{i}-\hat{y}_{i})\quad\text{for all }\ell=t,\cdots,t+N-1, (POL)

where y^i=Cx^i\hat{y}_{i}=C\hat{x}_{i}, and for any vector zpz\in\mathbb{R}^{p}, φi(z)[φi,1(z1),,\varphi_{i}(z)\coloneqq\bigl{[}\varphi_{i,1}(z_{1}),\ldots, φi,p(zp)]𝖳\varphi_{i,p}(z_{p})\bigr{]}^{\mathsf{T}}, where zjz_{j} is the jj-th element of the vector zz and φi,j:\varphi_{i,j}:\mathbb{R}\to\mathbb{R} is any function with sups|φi,j(s)|φmax<\sup\limits_{s\in\mathbb{R}}|\varphi_{i,j}(s)|\leqslant\varphi_{\max}<\infty. The feedback gains θ,im×p\theta_{\ell,i}\in\mathbb{R}^{m\times p} and the affine terms ηm\eta_{\ell}\in\mathbb{R}^{m} are the decision variables. The value of uu_{\ell} in (POL) depends on the values of the measured outputs from the beginning of the prediction horizon at time tt up to time \ell only, which requires a finite amount of memory. Note that we have chosen to saturate the measurements we obtain from the vectors (yiy^i)(y_{i}-\hat{y}_{i}) before inserting them into the control policy. This way we do not assume that either the process noise or the measurement noise distributions are defined over a compact domain, which is a crucial assumption in robust MPC approaches [Mayne et al., 2000]. Moreover, the choice of element-wise saturation functions φi()\varphi_{i}(\cdot) is left open. As such, we can accommodate standard saturation, piecewise linear, and sigmoidal functions, to name a few.

Finally, we collect all the ingredients of the stochastic receding horizon control problem in Algorithm 1 below.

Algorithm 1 Basic Stochastic Receding Horizon Algorithm
0: density f(x0|𝒴1)=𝒩(0,Σx0)f(x_{0}|\mathcal{Y}_{-1})=\mathcal{N}(0,\Sigma_{x_{0}})
1: set t=0t=0, x^0|1=0\hat{x}_{0|-1}=0, and P0|1=Σx0P_{0|-1}=\Sigma_{x_{0}}
2:loop
3:  for i=0i=0 to Nc1N_{c}-1 do
4:   measure yt+iy_{t+i}
5:   update x^t+i(=x^t+i|t+i)\hat{x}_{t+i}(=\hat{x}_{t+i|t+i}) and Pt+i|t+iP_{t+i|t+i} using (2.6)-(2.7)
6:   if i=0i=0 then
7:    solve the optimization problem (OP1) with the control policies in (POL) to obtain {ut,,ut+N1}\{u_{t}^{*},\ \cdots,\ u_{t+N-1}^{*}\}
8:   end if
9:   apply ut+iu^{*}_{t+i}
10:   calculate x^t+i+1|t+i\hat{x}_{t+i+1|t+i} and Pt+i+1|t+iP_{t+i+1|t+i} using (2.8)-(2.9)
11:  end for
12:  set t=t+Nct=t+N_{c}
13:end loop

Compact Notation

In order to state the main results in Section 3, it is convenient to utilize a more compact notation that describes the evolution of all signals over the entire prediction horizon NN.

The evolution of the system dynamics (2.1a)-(2.1b) over a single prediction horizon NN, starting at tt, can be described in a lifted form as follows:

Xt\displaystyle X_{t} =𝒜xt+Ut+𝒟Wt\displaystyle=\mathcal{A}x_{t}+\mathcal{B}U_{t}+\mathcal{D}W_{t} (2.11)
Yt\displaystyle Y_{t} =𝒞Xt+Vt\displaystyle=\mathcal{C}X_{t}+V_{t}

where Xt=[xtxt+1xt+N]X_{t}=\left[\begin{matrix}\begin{smallmatrix}x_{t}\\ x_{t+1}\\ \vdots\\ x_{t+N}\end{smallmatrix}\end{matrix}\right], Ut=[utut+1ut+N1]U_{t}=\left[\begin{matrix}\begin{smallmatrix}u_{t}\\ u_{t+1}\\ \vdots\\ u_{t+N-1}\end{smallmatrix}\end{matrix}\right], Wt=[wtwt+1wt+N1]W_{t}=\left[\begin{matrix}\begin{smallmatrix}w_{t}\\ w_{t+1}\\ \vdots\\ w_{t+N-1}\end{smallmatrix}\end{matrix}\right], Yt=[ytyt+1yt+N]Y_{t}=\left[\begin{matrix}\begin{smallmatrix}y_{t}\\ y_{t+1}\\ \vdots\\ y_{t+N}\end{smallmatrix}\end{matrix}\right], Vt=[vtvt+1vt+N]V_{t}=\left[\begin{matrix}\begin{smallmatrix}v_{t}\\ v_{t+1}\\ \vdots\\ v_{t+N}\end{smallmatrix}\end{matrix}\right], 𝒜=[IAAN]\mathcal{A}=\left[\begin{matrix}\begin{smallmatrix}I\\ A\\ \vdots\\ A^{N}\end{smallmatrix}\end{matrix}\right],

=[00BABB0AN1BABB],𝒟=[00IAI0AN1AI],\mathcal{B}=\left[\begin{matrix}\begin{smallmatrix}0&\cdots&\cdots&0\\ B&\ddots&&\vdots\\ AB&B&\ddots&\vdots\\ \vdots&&\ddots&0\\ A^{N-1}B&\cdots&AB&B\end{smallmatrix}\end{matrix}\right],\mathcal{D}=\left[\begin{matrix}\begin{smallmatrix}0&\cdots&\cdots&0\\ I&\ddots&&\vdots\\ A&I&\ddots&\vdots\\ \vdots&&\ddots&0\\ A^{N-1}&\cdots&A&I\end{smallmatrix}\end{matrix}\right],

and 𝒞=diag{C,,C}\mathcal{C}=\mathrm{diag}\{C,\cdots,C\}. Let

Kt(APt|tA𝖳+Σw)C𝖳(C(APt|tA𝖳+Σw)C𝖳+Σv)1K_{t}\coloneqq(AP_{t|t}A^{\mathsf{T}}+\Sigma_{w})C^{\mathsf{T}}(C(AP_{t|t}A^{\mathsf{T}}+\Sigma_{w})C^{\mathsf{T}}+\Sigma_{v})^{-1}

be the usual Kalman gain and define ΓtIKtC\Gamma_{t}\coloneqq I-K_{t}C, and ΦtΓtA\Phi_{t}\coloneqq\Gamma_{t}A. Then, we can write the estimation error vector as

EtXtX^t=teet+twWttvVt,E_{t}\coloneqq X_{t}-\hat{X}_{t}=\mathcal{F}^{e}_{t}e_{t}+\mathcal{F}^{w}_{t}W_{t}-\mathcal{F}^{v}_{t}V_{t}, (2.12)

where et=xtx^te_{t}=x_{t}-\hat{x}_{t}, te=[IΦtΦt+1ΦtΦt+N1Φt+N2Φt]\mathcal{F}^{e}_{t}=\tiny\left[\begin{matrix}I\\ \Phi_{t}\\ \Phi_{t+1}\cdot\Phi_{t}\\ \vdots\\ \Phi_{t+N-1}\cdot\Phi_{t+N-2}\cdots\Phi_{t}\end{matrix}\right],

tw\displaystyle\mathcal{F}^{w}_{t} =[000Γt00Φt+1Γt00Φt+N2Φt+1ΓtΓt+N20Φt+N1Φt+1ΓtΦt+N1Γt+N2Γt+N1],\displaystyle=\tiny\left[\begin{matrix}\begin{array}[]{l|l|l|l}0&&0&0\\ \Gamma_{t}&&0&0\\ \Phi_{t+1}\Gamma_{t}&&0&0\\ \vdots&\cdots&\vdots&\vdots\\ \Phi_{t+N-2}\cdots\Phi_{t+1}\Gamma_{t}&&\Gamma_{t+N-2}&0\\ \Phi_{t+N-1}\cdots\Phi_{t+1}\Gamma_{t}&&\Phi_{t+N-1}\Gamma_{t+N-2}&\Gamma_{t+N-1}\end{array}\end{matrix}\right],
tv\displaystyle\mathcal{F}^{v}_{t}\! =[00000Kt000Φt+1Kt000Φt+N2Φt+1KtKt+N200Φt+N1Φt+1KtΦt+N1Kt+N2Kt+N1].\displaystyle=\!\!\tiny\left[\begin{matrix}\begin{array}[]{l|l|l|l|l}0&0&&0&0\\ 0&K_{t}&&0&0\\ 0&\Phi_{t+1}K_{t}&&0&0\\ \vdots&\vdots&\!\!\cdots\!\!&\vdots&\vdots\\ 0&\Phi_{t+N-2}\cdots\Phi_{t+1}K_{t}&&K_{t+N-2}&0\\ 0&\Phi_{t+N-1}\cdots\Phi_{t+1}K_{t}&&\Phi_{t+N-1}K_{t+N-2}&K_{t+N-1}\end{array}\end{matrix}\right]\!\!.

Finally, the innovations sequence can be written as

YtY^t=𝒞teet+𝒞twWt+(I𝒞tv)Vt,Y_{t}-\hat{Y}_{t}=\mathcal{C}\mathcal{F}^{e}_{t}e_{t}+\mathcal{C}\mathcal{F}^{w}_{t}W_{t}+(I-\mathcal{C}\mathcal{F}^{v}_{t})V_{t}, (2.13)

where Y^t𝒞X^t\hat{Y}_{t}\coloneqq\mathcal{C}\hat{X}_{t}. Consequently, the innovations sequence over the prediction horizon is independent of the inputs vector UtU_{t}.

The cost function (2.3) at time tt can be written as

Vt=𝔼𝒴t[Xt𝖳𝒬Xt+Ut𝖳Ut],V_{t}=\mathbb{E}_{\mathcal{Y}_{t}}\bigl{[}X_{t}^{\mathsf{T}}\mathcal{Q}X_{t}+U_{t}^{\mathsf{T}}\mathcal{R}U_{t}\bigr{]}, (2.14)

where 𝒬=diag{Q0,,Qn}\mathcal{Q}=\text{diag}\{Q_{0},\cdots,Q_{n}\} and =diag{R0,,RN1}\mathcal{R}=\text{diag}\{R_{0},\cdots,R_{N-1}\}. Also, the control policy (POL) at time tt is given by

Ut=𝜼t+𝚯tφ(YtY^t),U_{t}=\boldsymbol{\eta}_{t}+\boldsymbol{\Theta}_{t}\varphi(Y_{t}-\hat{Y}_{t}), (POL)

where

𝜼t:=[ηtηt+1ηt+N1],φ(YtY^t)[φ0(yty^t)φN1(yt+N1y^t+N1)],\displaystyle\boldsymbol{\eta}_{t}\!:=\!\!\left[\begin{matrix}\eta_{t}\\ \eta_{t+1}\\ \vdots\\ \eta_{t+N-1}\end{matrix}\right]\!,\varphi(Y_{t}\!-\!\hat{Y}_{t})\!\coloneqq\!\!\left[\begin{matrix}\varphi_{0}(y_{t}\!-\!\hat{y}_{t})\\ \vdots\\ \varphi_{N-1}(y_{t+N-1}\!-\!\hat{y}_{t+N-1})\end{matrix}\right]\!,

and 𝚯t\boldsymbol{\Theta}_{t} has the following lower block diagonal structure

𝚯t[θt,t00θt+1,tθt+1,t+10θt+N1,tθt+N1,t+1θt+N1,t+N1].\boldsymbol{\Theta}_{t}\coloneqq\left[\begin{matrix}\theta_{t,t}&0&\ldots&0\\ \theta_{t+1,t}&\theta_{t+1,t+1}&&\vdots\\ \vdots&\vdots&\ddots&0\\ \theta_{t+N-1,t}&\theta_{t+N-1,t+1}&\ldots&\theta_{t+N-1,t+N-1}\end{matrix}\right]. (2.15)

Since the innovation vector YtY^tY_{t}-\hat{Y}_{t} in (2.13) is not a function of 𝜼t\boldsymbol{\eta}_{t} and 𝚯t\boldsymbol{\Theta}_{t}, the control inputs UtU_{t} in (POL) remain affine in the decision variables. This fact is important to show convexity of the optimization problem, as will be seen in the next section. Finally, the constraint (C1) can be rewritten as:

Ut𝕌{ξNm|ξUmax}.\displaystyle U_{t}\in{\mathbb{U}}\coloneqq\bigl{\{}\xi\in\mathbb{R}^{Nm}\big{|}\left\lVert{\xi}\right\rVert_{\infty}\leqslant U_{\max}\bigr{\}}. (C1)

3. Main Results

The optimization problem (OP1) we have stated so far, even if it is successively feasible, does not in general guarantee stability of the resulting receding horizon controller in Algorithm 1. Unlike deterministic stability arguments utilized in MPC (see, for example, [Mayne et al., 2000]), we cannot assume the existence of a final region that is control positively invariant and which is crucial to establish stability. This is simply due to the fact that the process noise sequence is not assumed to have a compact domain of support. However, guided by our earlier results in [Ramponi et al., 2009], we shall enforce an extra stability constraint which, if feasible, can render the state of the closed-loop system mean-square bounded.

For tt\in\mathbb{N}, the state estimate at time t+κt+\kappa given the state estimate at time tt, the control inputs from time tt through t+κ1t+\kappa-1, and the corresponding process and measurement noise sequences can be written as

x^t+κ|t+κ=Aκx^t|t+κ(A,B)[utut+κ1]+Ξt,{\hat{x}}_{t+\kappa|t+\kappa}=A^{\kappa}{\hat{x}}_{t|t}+\mathfrak{R}_{\kappa}(A,B)\left[\begin{matrix}u_{t}\\ \vdots\\ u_{t+\kappa-1}\end{matrix}\right]+\Xi_{t},

where we define

Ξt\displaystyle\Xi_{t} [Aκ1KtCA,Aκ2Kt+1CA,,Kt+κ1CA][etet+κ1]\displaystyle\coloneqq\left[\begin{matrix}A^{\kappa-1}K_{t}CA,A^{\kappa-2}K_{t+1}CA,\cdots,K_{t+\kappa-1}CA\end{matrix}\right]\left[\begin{matrix}e_{t}\\ \vdots\\ e_{t+\kappa-1}\end{matrix}\right] (3.1)
+[Aκ1KtC,Aκ2Kt+1C,,Kt+κ1C][wtwt+κ1]\displaystyle\quad+\left[\begin{matrix}A^{\kappa-1}K_{t}C,A^{\kappa-2}K_{t+1}C,\cdots,K_{t+\kappa-1}C\end{matrix}\right]\left[\begin{matrix}w_{t}\\ \vdots\\ w_{t+\kappa-1}\end{matrix}\right]
+[Aκ1Kt,Aκ2Kt+1,,Kt+κ1][vt+1vt+κ].\displaystyle\quad+\left[\begin{matrix}A^{\kappa-1}K_{t},A^{\kappa-2}K_{t+1},\cdots,K_{t+\kappa-1}\end{matrix}\right]\left[\begin{matrix}v_{t+1}\\ \vdots\\ v_{t+\kappa}\end{matrix}\right].
Scholium 4.

There exists an integer TT and a positive constant ζ=ζ(Σv,Σw,A,B,C)\zeta=\zeta(\Sigma_{v},\Sigma_{w},A,B,C) such that

𝔼𝒴t[Ξt]ζfor all tT.\mathbb{E}_{\mathcal{Y}_{t}}\left[\left\lVert{\Xi_{t}}\right\rVert\right]\leqslant\zeta\qquad\text{for all }t\geqslant T. (3.2)

A proof of Scholium 4 appears in Section 5. Using the constant ζ\zeta, we now require the following “drift condition” to be satisfied: for any given ε>0\varepsilon>0 and for every t=0,Nc,2Nc,t=0,N_{c},2N_{c},\cdots, there exists a Ut𝕌U_{t}\in\mathbb{U}, such that the following condition is satisfied

A2κx^t(2)+κ(A2,B2)[utut+κ1]x^t(2)(ζ+ε2)whenever x^t(2)ζ+ε.\displaystyle\left\lVert{A_{2}^{\kappa}{\hat{x}}^{(2)}_{t}+\mathfrak{R}_{\kappa}(A_{2},B_{2})\left[\begin{matrix}u_{t}\\ \vdots\\ u_{t+\kappa-1}\end{matrix}\right]}\right\rVert\leqslant\bigl{\|}{\hat{x}}^{(2)}_{t}\bigr{\|}-(\zeta+\tfrac{\varepsilon}{2})\;\;\text{whenever }\bigl{\|}{\hat{x}}^{(2)}_{t}\bigr{\|}\geqslant\zeta+\varepsilon. (C2)

Note that [utut+κ1]=(𝜼t)1:κm+(𝚯t)1:κmφ(YY^)\left[\begin{matrix}u_{t}\\ \vdots\\ u_{t+\kappa-1}\end{matrix}\right]=(\boldsymbol{\eta}_{t})_{1:\kappa m}+(\boldsymbol{\Theta}_{t})_{1:\kappa m}\varphi(Y-\hat{Y}). (For notational convenience, we have retained φ(YY^)\varphi(Y-\hat{Y}) with the knowledge that the matrix (𝚯t)1:κm(\boldsymbol{\Theta}_{t})_{1:\kappa m} causally selects the outputs as they become available, see (2.15).) The constraint (C2) pertains only to the second subsystem of the estimator, as the first subsystem (x^(1)\hat{x}^{(1)}) is Schur stable (see (2.2) and (2.10)). We augment problem (OP1) with the stability constraint (C2) to obtain

min{Vt| dynamics (2.1a)-(2.1b), policies (POL), and constraints (C1) & (C2)}.\displaystyle\min\Bigl{\{}V_{t}\ \Big{|}\text{ dynamics \eqref{eq:system-a}-\eqref{eq:system-b}, policies \eqref{eq:policy}, and constraints \eqref{eq:C1} \& \eqref{eq:C4}}\Bigr{\}}. (OP2)
Assumption 5.

In addition to Assumption 1 and Assumption 2, we stipulate that:

  1. (i)

    The control authority UmaxUmaxU_{\max}\geqslant U_{\max}^{*}, where Umaxσmin(κ(A2,B2))1(ζ+ε2)U_{\max}^{*}\coloneqq\sigma_{\min}(\mathfrak{R}_{\kappa}(A_{2},B_{2}))^{-1}\big{(}\zeta+\tfrac{\varepsilon}{2}\big{)}.

  2. (ii)

    The control horizon NcκN_{c}\geqslant\kappa, where κ\kappa is the reachability index.

  3. (iii)

    (A,Σw1/2)\bigl{(}A,\Sigma_{w}^{1/2}\bigr{)} is stabilizable, and (A,C)(A,C) is observable.\diamondsuit

Theorem 6 (Main Result).

Consider the system (2.1a)-(2.1b), and suppose that Assumption 5 holds. Then:

  1. (i)

    For every time t=0,Nc,2Nc,t=0,N_{c},2N_{c},\cdots, the optimization problem (OP2) is convex, feasible, and can be approximated to the following quadratic optimization problem:

    minimize(𝜼t,𝚯t)\displaystyle\operatorname*{minimize}_{(\boldsymbol{\eta}_{t},\boldsymbol{\Theta}_{t})}\quad 2x^t𝖳𝒜𝖳𝒬𝜼t+2𝗍𝗋(𝒜𝖳𝒬𝚯tΛtφx)+2𝗍𝗋(𝚯t𝖳𝖳𝒬𝒟Λtwφ)\displaystyle 2\hat{x}_{t}^{\mathsf{T}}\mathcal{A}^{\mathsf{T}}\mathcal{Q}\mathcal{B}\boldsymbol{\eta}_{t}+2\mathsf{tr}\!\left(\mathcal{A}^{\mathsf{T}}\mathcal{Q}\mathcal{B}\boldsymbol{\Theta}_{t}\Lambda^{\varphi x}_{t}\right)+2\mathsf{tr}\!\left(\boldsymbol{\Theta}_{t}^{\mathsf{T}}\mathcal{B}^{\mathsf{T}}\mathcal{Q}\mathcal{D}\Lambda^{w\varphi}_{t}\right)
    +𝜼t𝖳1𝜼t+2𝜼t𝖳1𝚯tΛtφ+𝗍𝗋(𝚯t𝖳1𝚯tΛtφφ)\displaystyle\;\;+\boldsymbol{\eta}_{t}^{\mathsf{T}}\mathcal{M}_{1}\boldsymbol{\eta}_{t}+2\boldsymbol{\eta}_{t}^{\mathsf{T}}\mathcal{M}_{1}\boldsymbol{\Theta}_{t}\Lambda^{\varphi}_{t}+\mathsf{tr}\!\left(\boldsymbol{\Theta}_{t}^{\mathsf{T}}\mathcal{M}_{1}\boldsymbol{\Theta}_{t}\Lambda^{\varphi\varphi}_{t}\right) (3.3)
    subject  to the structure of𝚯tin(2.15),\displaystyle\text{the structure of}\ \boldsymbol{\Theta}_{t}\ \text{in}\ \eqref{eq:decision-constraints},
    |(𝜼t)i|+(𝚯t)i1φmaxUmaxi=1,,Nm,\displaystyle|(\boldsymbol{\eta}_{t})_{i}|+\left\lVert{(\boldsymbol{\Theta}_{t})_{i}}\right\rVert_{1}\varphi_{\max}\leqslant U_{\max}\quad\forall\,i=1,\cdots,Nm, (3.4)
    A2κx^t(2)+κ(A2,B2)(𝜼t)1:κm+n2κ(A2,B2)(𝚯t)1:κmφmax\displaystyle\left\lVert{A_{2}^{\kappa}{\hat{x}}^{(2)}_{t}+\mathfrak{R}_{\kappa}(A_{2},B_{2})(\boldsymbol{\eta}_{t})_{1:\kappa m}}\right\rVert+\sqrt{n_{2}}\left\lVert{\mathfrak{R}_{\kappa}(A_{2},B_{2})(\boldsymbol{\Theta}_{t})_{1:\kappa m}}\right\rVert_{\infty}\varphi_{\max}
    x^t(2)(ζ+ε2)whenever x^t(2)ζ+ε\displaystyle\quad\leqslant\bigl{\|}{\hat{x}}^{(2)}_{t}\bigr{\|}-(\zeta+\tfrac{\varepsilon}{2})\;\;\text{whenever }\bigl{\|}{\hat{x}}^{(2)}_{t}\bigr{\|}\geqslant\zeta+\varepsilon (3.5)

    where 1=+𝖳𝒬\mathcal{M}_{1}=\mathcal{R}+\mathcal{B}^{\mathsf{T}}\mathcal{Q}\mathcal{B},

    Λtφ\displaystyle\Lambda^{\varphi}_{t} =𝔼𝒴t[φ(YY^)],\displaystyle=\mathbb{E}_{\mathcal{Y}_{t}}[\varphi(Y-\hat{Y})], Λtφx\displaystyle\Lambda^{\varphi x}_{t} =𝔼𝒴t[φ(YY^)xt𝖳],\displaystyle=\mathbb{E}_{\mathcal{Y}_{t}}[\varphi(Y-\hat{Y})x_{t}^{\mathsf{T}}], (3.6)
    Λtwφ\displaystyle\Lambda^{w\varphi}_{t} =𝔼𝒴t[Wφ(YY^)𝖳],\displaystyle=\mathbb{E}_{\mathcal{Y}_{t}}[W\varphi(Y-\hat{Y})^{\mathsf{T}}], Λtφφ\displaystyle\Lambda^{\varphi\varphi}_{t} =𝔼𝒴t[φ(YY^)φ(YY^)𝖳].\displaystyle=\mathbb{E}_{\mathcal{Y}_{t}}[\varphi(Y-\hat{Y})\varphi(Y-\hat{Y})^{\mathsf{T}}].
  2. (ii)

    For every initial state and noise statistics ((x^0,Σx0),Σw,Σv)\bigl{(}(\hat{x}_{0},\Sigma_{x_{0}}),\Sigma_{w},\Sigma_{v}\bigr{)}, successive application of the control laws given by the optimization problem in (i) for NcN_{c} steps renders the closed-loop system mean-square bounded in the following sense: there exists a constant γ=γ(𝒴0,κ,x^0,Σx0,Σw,Σv,Umax)>0\gamma=\gamma(\mathcal{Y}_{0},\kappa,\hat{x}_{0},\Sigma_{x_{0}},\Sigma_{w},\Sigma_{v},U_{\max}^{*})>0 such that

    supt𝔼𝒴0[xt2]γ.\sup_{t\in\mathbb{N}}\;\mathbb{E}_{\mathcal{Y}_{0}}\bigl{[}\left\lVert{x_{t}}\right\rVert^{2}\bigr{]}\leqslant\gamma. (3.7)

In practice, it may be also of interest to impose further some soft constraints both on the state and the input vectors. For example, one may be interested in imposing quadratic or linear constraints on the state, both of which can be captured in the following

𝔼𝒴t[Xt𝖳𝒮Xt+𝖳Xt]αt,\mathbb{E}_{\mathcal{Y}_{t}}\left[X_{t}^{\mathsf{T}}\mathcal{S}X_{t}+\mathcal{L}^{\mathsf{T}}X_{t}\right]\leqslant\alpha_{t}, (C3)

where 𝒮=𝒮𝖳0\mathcal{S}=\mathcal{S}^{\mathsf{T}}\geqslant 0 and αt>0\alpha_{t}>0. Moreover, expected energy expenditure constraints can be posed as follows

𝔼𝒴t[Ut𝖳𝒮~Ut]βt,\mathbb{E}_{\mathcal{Y}_{t}}\left[U_{t}^{\mathsf{T}}\tilde{\mathcal{S}}U_{t}\right]\leqslant\beta_{t}, (C4)

where 𝒮~=𝒮~𝖳0\tilde{\mathcal{S}}=\tilde{\mathcal{S}}^{\mathsf{T}}\geqslant 0 and βt>0\beta_{t}>0. In the absence of hard input constraints, such expectation-type constraints are commonly used in the stochastic MPC [Primbs and Sung, 2009; Agarwal et al., 2009] and in stochastic optimization in the form of integrated chance constraints [Klein Haneveld, 1983; Klein Haneveld and van der Vlerk, 2006]. This is partly because it is not possible, without posing further restrictions on the boundedness of the process noise wtw_{t}, to ensure that hard constraints on the state are satisfied. For example, in the standard LQG setting nontrivial hard constraints on the system state would generally be violated with nonzero probability. Moreover, in contrast to chance constraints where a bound is imposed on the probability of constraint violation, expectation-type constraints tend to give rise to convex optimization problems under weak assumptions [Agarwal et al., 2009].

We can also augment problem (OP2) with the constraints (C3) and (C4) to obtain

min{Vt| dynamics (2.1a)-(2.1b), policies (POL), constraints (C1), (C2), (C3) & (C4)}.\displaystyle\min\Bigl{\{}V_{t}\ \Big{|}\text{ dynamics \eqref{eq:system-a}-\eqref{eq:system-b}, policies \eqref{eq:policy}, constraints \eqref{eq:C1}, \eqref{eq:C4}, \eqref{eq:statevar} \& \eqref{eq:inputvar}}\Bigr{\}}. (OP3)

Notice that the constraints (C3) and (C4) are not necessarily feasible at time tt for any choice of parameters αt\alpha_{t} and βt\beta_{t}. As such, problem (OP3) may become infeasible over time if we simply apply Algorithm 1. We therefore replace the optimization Step 7 in Algorithm 1 with the following subroutine for some given αt\alpha_{t}^{*} and βt\beta_{t}^{*} that make the constraints feasible, precision number δ>0\delta>0 and maximal number of iterations ν¯\bar{\nu}:

  

Subroutine 7

 
  1. 7a:

    Set α¯=αt\overline{\alpha}=\alpha_{t}^{*}, α¯=0\underline{\alpha}=0, β¯=βt\overline{\beta}=\beta_{t}^{*}, and β¯=0\underline{\beta}=0

  2. 7b:

    Solve the optimization problem (OP3) using αt=α¯\alpha_{t}=\overline{\alpha} and βt=β¯\beta_{t}=\overline{\beta} to obtain the sequence {ut,ut+1,,ut+N1}\{u_{t}^{*},u_{t+1}^{*},\cdots,u_{t+N-1}^{*}\}

  3. 7c:

    Set ν=1\nu=1

  4. 7d:

    repeat

  5. 7e:

      Set αt=(α¯+α¯)/2\alpha_{t}=(\overline{\alpha}+\underline{\alpha})/2 and βt=(β¯+β¯)/2\beta_{t}=(\overline{\beta}+\underline{\beta})/2

  6. 7f:

      Solve Solve the optimization problem (OP3) using the new αt\alpha_{t} and βt\beta_{t} to obtain
       the sequence {u~t,u~t+1,,u~t+N1}\{\tilde{u}_{t},\tilde{u}_{t+1},\cdots,\tilde{u}_{t+N-1}\}

  7. 7g:

    if step 5f is feasible then

  8. 7h:

       Set α¯=αt\overline{\alpha}=\alpha_{t} and β¯=βt\overline{\beta}=\beta_{t}

  9. 7i:

       Set {ut,ut+1,,ut+N1}={u~t,u~t+1,,u~t+N1}\{u_{t}^{*},u_{t+1}^{*},\cdots,u_{t+N-1}^{*}\}=\{\tilde{u}_{t},\tilde{u}_{t+1},\cdots,\tilde{u}_{t+N-1}\}

  10. 7j:

    else

  11. 7k:

       Set α¯=αt\underline{\alpha}=\alpha_{t} and β¯=βt\underline{\beta}=\beta_{t}

  12. 7l:

    end if

  13. 7m:

      set ν=ν+1\nu=\nu+1

  14. 7n:

    until (|α¯α¯|δ|\overline{\alpha}-\underline{\alpha}|\leqslant\delta and |β¯β¯|δ|\overline{\beta}-\underline{\beta}|\leqslant\delta) or ν>ν¯\nu>\bar{\nu}

 

It may be argued that replacing Step 7 in Algorithm 1 with Subroutine 7 above increases the computational burden; however, the parameter ν¯\bar{\nu} guarantees that this iterative process is halted after some prespecified number of steps in case the required precision δ\delta is not reached in the meantime. In some instances, we may be given some fixed parameters α^\hat{\alpha} and β^\hat{\beta} with which the constraints (C3) and (C4) have to be satisfied, respectively. This requirement can be easily incorporated into Subroutine 7 by setting α¯=α^\underline{\alpha}=\hat{\alpha} and β¯=β^\underline{\beta}=\hat{\beta} in step 7a.

Assumption 7.

At each time step t=0,Nc,2Nc,t=0,N_{c},2N_{c},\cdots, the constants αt\alpha^{*}_{t} and βt\beta_{t}^{*} in Subroutine 7 are chosen as

αt\displaystyle\alpha^{*}_{t} 3𝗍𝗋(𝒜𝖳𝒮𝒜𝔼𝒴t[xtxt𝖳]+𝒟𝖳𝒮𝒟Σw)+3Nmσmax(𝖳𝒮)Umax2+𝖳𝒜x^t+𝖳1Umax,\displaystyle\coloneqq 3\mathsf{tr}\!\left(\mathcal{A}^{\mathsf{T}}\mathcal{S}\mathcal{A}\mathbb{E}_{\mathcal{Y}_{t}}\left[x_{t}x_{t}^{\mathsf{T}}\right]+\mathcal{D}^{\mathsf{T}}\mathcal{S}\mathcal{D}\Sigma_{w}\right)+3Nm\sigma_{\max}(\mathcal{B}^{\mathsf{T}}\mathcal{S}\mathcal{B})U_{\max}^{2}+\mathcal{L}^{\mathsf{T}}\mathcal{A}\hat{x}_{t}+\left\lVert{\mathcal{L}^{\mathsf{T}}\mathcal{B}}\right\rVert_{1}U_{\max},
βt\displaystyle\beta^{*}_{t} Nmσmax(𝒮~)Umax2.\displaystyle\coloneqq Nm\sigma_{\max}(\tilde{\mathcal{S}})U_{\max}^{2}.
Corollary 8.

Consider the system (2.1a)-(2.1b), and suppose that Assumption 5, and 7 hold. Then:

  1. (i)

    For every time t=0,Nc,2Nc,t=0,N_{c},2N_{c},\cdots the optimization problem (OP3) solved in Subroutine 7 is convex, feasible, and equivalent to the following quadratically constrained quadratic optimization problem:

    minimize(𝜼t,𝚯t)\displaystyle\operatorname*{minimize}_{(\boldsymbol{\eta}_{t},\boldsymbol{\Theta}_{t})}\quad 2x^t𝖳𝒜𝖳𝒬𝜼t+2𝗍𝗋(𝒜𝖳𝒬𝚯tΛtφx)+2𝗍𝗋(𝚯t𝖳𝖳𝒬𝒟Λtwφ)\displaystyle 2\hat{x}_{t}^{\mathsf{T}}\mathcal{A}^{\mathsf{T}}\mathcal{Q}\mathcal{B}\boldsymbol{\eta}_{t}+2\mathsf{tr}\!\left(\mathcal{A}^{\mathsf{T}}\mathcal{Q}\mathcal{B}\boldsymbol{\Theta}_{t}\Lambda^{\varphi x}_{t}\right)+2\mathsf{tr}\!\left(\boldsymbol{\Theta}_{t}^{\mathsf{T}}\mathcal{B}^{\mathsf{T}}\mathcal{Q}\mathcal{D}\Lambda^{w\varphi}_{t}\right)
    +𝜼t𝖳1𝜼t+2𝜼t𝖳1𝚯tΛtφ+𝗍𝗋(𝚯t𝖳1𝚯tΛtφφ)\displaystyle+\boldsymbol{\eta}_{t}^{\mathsf{T}}\mathcal{M}_{1}\boldsymbol{\eta}_{t}+2\boldsymbol{\eta}_{t}^{\mathsf{T}}\mathcal{M}_{1}\boldsymbol{\Theta}_{t}\Lambda^{\varphi}_{t}+\mathsf{tr}\!\left(\boldsymbol{\Theta}_{t}^{\mathsf{T}}\mathcal{M}_{1}\boldsymbol{\Theta}_{t}\Lambda^{\varphi\varphi}_{t}\right)
    subjectto\displaystyle\mathrm{subject\ to}\quad the structure of𝚯tin(2.15),\displaystyle\text{the structure of}\ \boldsymbol{\Theta}_{t}\ \text{in}\ \eqref{eq:decision-constraints},
    |(𝜼t)i|+(𝚯t)i1φmaxUmaxi=1,,Nm,\displaystyle|(\boldsymbol{\eta}_{t})_{i}|+\left\lVert{(\boldsymbol{\Theta}_{t})_{i}}\right\rVert_{1}\varphi_{\max}\leqslant U_{\max}\quad\forall\,i=1,\cdots,Nm,
    A2κx^t(2)+κ(A2,B2)(𝜼t)1:κm+n2κ(A2,B2)(𝚯t)1:κmφmax\displaystyle\left\lVert{A_{2}^{\kappa}{\hat{x}}^{(2)}_{t}+\mathfrak{R}_{\kappa}(A_{2},B_{2})(\boldsymbol{\eta}_{t})_{1:\kappa m}}\right\rVert+\sqrt{n_{2}}\left\lVert{\mathfrak{R}_{\kappa}(A_{2},B_{2})(\boldsymbol{\Theta}_{t})_{1:\kappa m}}\right\rVert_{\infty}\varphi_{\max}
    x^t(2)(ζ+ε2)whenever x^t(2)ζ+ε,i=1,,n2\displaystyle\hskip 14.22636pt\leqslant\bigl{\|}{\hat{x}}^{(2)}_{t}\bigr{\|}-(\zeta+\tfrac{\varepsilon}{2})\;\;\text{whenever }\bigl{\|}{\hat{x}}^{(2)}_{t}\bigr{\|}\geqslant\zeta+\varepsilon,\ \forall i=1,\cdots,n_{2}
    𝜼t𝖳𝖳𝒮𝜼t+2𝜼t𝖳𝖳𝒮𝚯tΛtφ+𝗍𝗋(𝚯t𝖳𝖳𝒮𝚯tΛtφφ)\displaystyle\boldsymbol{\eta}_{t}^{\mathsf{T}}\mathcal{B}^{\mathsf{T}}\mathcal{S}\mathcal{B}\boldsymbol{\eta}_{t}+2\boldsymbol{\eta}_{t}^{\mathsf{T}}\mathcal{B}^{\mathsf{T}}\mathcal{S}\mathcal{B}\boldsymbol{\Theta}_{t}\Lambda^{\varphi}_{t}+\mathsf{tr}\!\left(\boldsymbol{\Theta}_{t}^{\mathsf{T}}\mathcal{B}^{\mathsf{T}}\mathcal{S}\mathcal{B}\boldsymbol{\Theta}_{t}\Lambda^{\varphi\varphi}_{t}\right)
    +2x^t𝖳𝒜𝖳𝒮𝜼t+2𝗍𝗋(𝒜𝖳𝒮𝚯tΛtφx+𝚯t𝖳𝖳𝒮𝒟Λtwφ)\displaystyle\quad+2\hat{x}_{t}^{\mathsf{T}}\mathcal{A}^{\mathsf{T}}\mathcal{S}\mathcal{B}\boldsymbol{\eta}_{t}+2\mathsf{tr}\!\left(\mathcal{A}^{\mathsf{T}}\mathcal{S}\mathcal{B}\boldsymbol{\Theta}_{t}\Lambda^{\varphi x}_{t}+\boldsymbol{\Theta}_{t}^{\mathsf{T}}\mathcal{B}^{\mathsf{T}}\mathcal{S}\mathcal{D}\Lambda^{w\varphi}_{t}\right)
    +𝖳(𝜼t+𝚯tΛtφ)+𝗍𝗋(𝒜𝖳𝒮𝒜𝔼𝒴t[xtxt𝖳])+𝗍𝗋(𝒟𝖳𝒮𝒟Σw)\displaystyle\quad+\mathcal{L}^{\mathsf{T}}\mathcal{B}(\boldsymbol{\eta}_{t}+\boldsymbol{\Theta}_{t}\Lambda^{\varphi}_{t})+\mathsf{tr}\!\left(\mathcal{A}^{\mathsf{T}}\mathcal{S}\mathcal{A}\mathbb{E}_{\mathcal{Y}_{t}}\left[x_{t}x_{t}^{\mathsf{T}}\right]\right)+\mathsf{tr}\!\left(\mathcal{D}^{\mathsf{T}}\mathcal{S}\mathcal{D}\Sigma_{w}\right)
    +𝖳𝒜x^tαt,\displaystyle\quad+\mathcal{L}^{\mathsf{T}}\mathcal{A}\hat{x}_{t}\leqslant\alpha_{t}, (3.8)
    𝜼t𝖳𝒮~𝜼t+2𝜼t𝖳𝒮~𝚯tΛtφ+𝗍𝗋(𝚯t𝖳𝒮~𝚯tΛtφφ)βt.\displaystyle\boldsymbol{\eta}_{t}^{\mathsf{T}}\tilde{\mathcal{S}}\boldsymbol{\eta}_{t}+2\boldsymbol{\eta}_{t}^{\mathsf{T}}\tilde{\mathcal{S}}\boldsymbol{\Theta}_{t}\Lambda^{\varphi}_{t}+\mathsf{tr}\!\left(\boldsymbol{\Theta}_{t}^{\mathsf{T}}\tilde{\mathcal{S}}\boldsymbol{\Theta}_{t}\Lambda^{\varphi\varphi}_{t}\right)\leqslant\beta_{t}. (3.9)
  2. (ii)

    For every initial state and noise statistics ((x^0,Σx0),Σw,Σv)\bigl{(}(\hat{x}_{0},\Sigma_{x_{0}}),\Sigma_{w},\Sigma_{v}\bigr{)}, successive application of the control laws given by the optimization problem in (i) for NcN_{c} steps renders the closed-loop system mean-square bounded in the following sense: there exists a constant γ=γ(𝒴0,κ,x^0,Σx0,Σw,Σv,Umax)>0\gamma=\gamma(\mathcal{Y}_{0},\kappa,\hat{x}_{0},\Sigma_{x_{0}},\Sigma_{w},\Sigma_{v},U_{\max}^{*})>0 such that

    supt𝔼𝒴0[xt2]γ.\sup_{t\in\mathbb{N}}\;\mathbb{E}_{\mathcal{Y}_{0}}\bigl{[}\left\lVert{x_{t}}\right\rVert^{2}\bigr{]}\leqslant\gamma.

4. Discussion

The optimization problem (OP2) being solved in Theorem 6 is a quadratic program (QP), whereas the optimization problem (OP3) being solved in Corollary (8) is a quadratically constrained quadratic program (QCQP) in the optimization parameters (η,Θ)(\eta,\Theta). As such, both can be easily solved via software packages such as cvx [Grant and Boyd, 2000] or yalmip [Löfberg, 2004].

4.1. Other Constraints and Policies

It is not difficult to see that constraints on the variation of the inputs of the form

PuΔUmax,\left\lVert{Pu}\right\rVert_{\infty}\leqslant\Delta U_{\max},

where P=[II000II000II]P=\tiny\left[\begin{matrix}I&-I&0&\cdots&0\\ 0&I&-I&\cdots&0\\ \vdots&&&&\vdots\\ 0&\cdots&0&I&-I\end{matrix}\right], can be incorporated into the optimization problems (OP2) and (OP3). Moreover, we can easily solve the problem using quadratic policies of the form

Ut=η+Θφ(YtY^t)+Θ~φ~(YtY^t),U_{t}=\eta+\Theta\varphi(Y_{t}-\hat{Y}_{t})+\tilde{\Theta}\tilde{\varphi}(Y_{t}-\hat{Y}_{t}), (POL)

instead of (POL), where

Θ~[θ~0,000θ~1,0θ~1,10θ~N1,0θ~N1,1θ~N1,N1],φ~(z)[φ~0(yty^t)𝖳φ~0(yty^t)φ~N1(yt+N1y^t+N1)𝖳φ~N1(yt+N1y^t+N1)],{\tiny\tilde{\Theta}\coloneqq\left[\begin{matrix}\tilde{\theta}_{0,0}&0&\cdots&0\\ \tilde{\theta}_{1,0}&\tilde{\theta}_{1,1}&&\vdots\\ \vdots&\vdots&\ddots&0\\ \tilde{\theta}_{N-1,0}&\tilde{\theta}_{N-1,1}&\cdots&\tilde{\theta}_{N-1,N-1}\end{matrix}\right]},\ \tilde{\varphi}(z)\coloneqq{\tiny\left[\begin{matrix}\tilde{\varphi}_{0}(y_{t}-\hat{y}_{t})^{\mathsf{T}}\tilde{\varphi}_{0}(y_{t}-\hat{y}_{t})\\ \vdots\\ \tilde{\varphi}_{N-1}(y_{t+N-1}-\hat{y}_{t+N-1})^{\mathsf{T}}\tilde{\varphi}_{N-1}(y_{t+N-1}-\hat{y}_{t+N-1})\end{matrix}\right]},

and θ~i,jm×1\tilde{\theta}_{i,j}\in\mathbb{R}^{m\times 1}. The underlying optimization problems (OP2) and (OP3) with the policy (POL) are still convex and both Theorem (6) and Corollary (8) still apply with minor changes.

4.2. Off-Line Computation of the Λ\Lambda Matrices

At any time t=0,Nc,2Nc,t=0,N_{c},2N_{c},\cdots, our ability to solve the optimization problems (OP2) and (OP3) in Theorem 6 and Corollary 8, respectively, hinges upon our ability to compute the following matrices

Λtφe\displaystyle\Lambda^{\varphi e}_{t} =𝔼𝒴t[φ(YtY^t)(xtx^t)𝖳],\displaystyle=\mathbb{E}_{\mathcal{Y}_{t}}[\varphi(Y_{t}-\hat{Y}_{t})(x_{t}-\hat{x}_{t})^{\mathsf{T}}], Λtwφ\displaystyle\Lambda^{w\varphi}_{t} =𝔼𝒴t[Wφ(YtY^t)𝖳],\displaystyle=\mathbb{E}_{\mathcal{Y}_{t}}[W\varphi(Y_{t}-\hat{Y}_{t})^{\mathsf{T}}], (4.1)
Λtφ\displaystyle\Lambda^{\varphi}_{t} =𝔼𝒴t[φ(YtY^t)],\displaystyle=\mathbb{E}_{\mathcal{Y}_{t}}[\varphi(Y_{t}-\hat{Y}_{t})], Λtφφ\displaystyle\Lambda^{\varphi\varphi}_{t} =𝔼𝒴t[φ(YtY^t)φ(YtY^t)𝖳],\displaystyle=\mathbb{E}_{\mathcal{Y}_{t}}[\varphi(Y_{t}-\hat{Y}_{t})\varphi(Y_{t}-\hat{Y}_{t})^{\mathsf{T}}],
Λtφx\displaystyle\Lambda^{\varphi x}_{t} =Λtφe+Λtφx^t𝖳.\displaystyle=\Lambda^{\varphi e}_{t}+\Lambda^{\varphi}_{t}\hat{x}_{t}^{\mathsf{T}}.

Recall that YtY^tY_{t}-\hat{Y}_{t} is the innovations sequence that was given in (2.13), and x^t\hat{x}_{t} is the optimal mean-square estimate of xtx_{t} given the history 𝒴t\mathcal{Y}_{t}. The matrices (4.1) may be computed by numerical integration with respect to the independent Gaussian measures of wt,wt+N1w_{t},\ldots w_{t+N-1}, of vt,v_{t},\ldots vt+Nv_{t+N}, and of xtx_{t} given 𝒴t\mathcal{Y}^{t}. Due to the large dimensionality of the integration space, this approach may be impractical for online computations. One alternative approach relies on the observation that Λtφe,Λtφ,Λtwφ\Lambda^{\varphi e}_{t},\Lambda^{\varphi}_{t},\Lambda^{w\varphi}_{t}, and Λtφφ\Lambda^{\varphi\varphi}_{t} depend on xtx_{t} via the difference xtx^tx_{t}-\hat{x}_{t}. Since xtx^tx_{t}-\hat{x}_{t} is conditionally zero-mean given 𝒴t\mathcal{Y}^{t}, we can write the dependency of (4.1) on the time-varying statistics of xtx_{t} given 𝒴t\mathcal{Y}^{t} as follows:

Λtφx(x^t,Pt|t)=Λtφe(Pt|t)+Λtφ(Pt|t)x^t𝖳,\displaystyle\Lambda^{\varphi x}_{t}(\hat{x}_{t},P_{t|t})=\Lambda^{\varphi e}_{t}(P_{t|t})+\Lambda^{\varphi}_{t}(P_{t|t})\hat{x}_{t}^{\mathsf{T}}, (4.2)
Λtwφ(Pt|t), and Λtφφ(Pt|t).\displaystyle\Lambda^{w\varphi}_{t}(P_{t|t}),\text{ and }\Lambda^{\varphi\varphi}_{t}(P_{t|t}).

In principle one may compute off-line and store the matrices Λtφe(Pt|t),Λtφ(Pt|t),Λtwφ(Pt|t)\Lambda^{\varphi e}_{t}(P_{t|t}),\Lambda^{\varphi}_{t}(P_{t|t}),\Lambda^{w\varphi}_{t}(P_{t|t}), and Λtφφ(Pt|t)\Lambda^{\varphi\varphi}_{t}(P_{t|t}), which depend on the covariance matrices Pt|tP_{t|t}, and just update online the value of Λtφx(x^t,Pt|t)\Lambda^{\varphi x}_{t}(\hat{x}_{t},P_{t|t}) as the estimate x^t\hat{x}_{t} becomes available. However, this poses serious requirements in terms of memory. A more appealing alternative is to exploit the convergence properties of Pt|tP_{t|t}. The following result can be inferred, for instance, from [Kamen and Su, 1999, Theorem 5.1].

Proposition 9.

Under Assumptions 2 and 5-(iii)

  • the (discrete-time) algebraic Riccati equation in Pn×nP\in\mathbb{R}^{n\times n}

    P=A[PPC𝖳(CPC𝖳+Σv)1CP]A𝖳+ΣwP=A[P-PC^{\mathsf{T}}(CPC^{\mathsf{T}}+\Sigma_{v})^{-1}CP]A^{\mathsf{T}}+\Sigma_{w} (4.3)

    has a unique solution P0P^{*}\geqslant 0, and

  • the sequence Pt+1|tP_{t+1|t} defined by (2.7) and (2.9) converges to PP^{*} as tt tends to \infty, for any initial condition P0|10P_{0|-1}\geqslant 0.

The assumption that Σv>0\Sigma_{v}>0 can be relaxed to Σv0\Sigma_{v}\geqslant 0 at the price of some additional technicality (more on this can be found in [Ferrante et al., 2002]). As a consequence of this result, under detectability and stabilizability assumptions, Pt|tP_{t|t} converges to

P=PPC𝖳(CPC𝖳+Σv)1CP,P^{\circ}=P^{*}-P^{*}C^{\mathsf{T}}(CP^{*}C^{\mathsf{T}}+\Sigma_{v})^{-1}CP^{*}, (4.4)

which is the asymptotic error covariance matrix of the estimator x^t\hat{x}_{t}. Thus, neglecting the initial transient, it makes sense to just compute off-line and store the matrices Λtφe(P),Λtφ(P),Λtwφ(P)\Lambda^{\varphi e}_{t}(P^{\circ}),\Lambda^{\varphi}_{t}(P^{\circ}),\Lambda^{w\varphi}_{t}(P^{\circ}), and Λtφφ(P)\Lambda^{\varphi\varphi}_{t}(P^{\circ}), and just update the matrix Λtφx\Lambda^{\varphi x}_{t} for new values of the estimate x^t\hat{x}_{t}.

5. Proofs

The proofs of the main results are presented as follows: We begin by showing the result in Scholium 4. Then, we state a fundamental result pertaining to mean-square boundedness for general random sequences in Proposition 11, which is utilized to show the mean-square boundedness conclusions of Theorem 6 and Corollary 8. We proceed to show the first assertion (i) of Theorem 6 in Lemma 12. The proof of the second assertion (ii) of Theorem 6 starts by showing Lemmas 13 and 14 to conclude mean-square boundedness of the orthogonal subsystem (A2,B2)(A_{2},B_{2}). We conclude the proof of Theorem 6 by showing mean-square boundedness of the Schur stable subsystem (A1,B1)(A_{1},B_{1}). We end this section by proving the extra conclusions of Corollary 8, beyond those present in Theorem 6.

Let us now look at the estimation equation x^t(=x^t|t)\hat{x}_{t}(=\hat{x}_{t|t}) in (2.6) and combine it with (2.8) and the output equation (2.1b) to obtain

x^t+1=Ax^t+But+Kt(CA(xtx^t)+Cwt+vt+1),{\hat{x}}_{t+1}=A{\hat{x}}_{t}+Bu_{t}+K_{t}\bigl{(}CA(x_{t}-{\hat{x}}_{t})+Cw_{t}+v_{t+1}\bigr{)}, (5.1)

where Kt=(APt|tA𝖳+Σw)C𝖳(C(APt|tA𝖳+Σw)C𝖳+Σv)1K_{t}=(AP_{t|t}A^{\mathsf{T}}+\Sigma_{w})C^{\mathsf{T}}\bigl{(}C(AP_{t|t}A^{\mathsf{T}}+\Sigma_{w})C^{\mathsf{T}}+\Sigma_{v}\bigr{)}^{-1} is the Kalman gain. Our next Fact pertains to the boundedness of the error covariance matrices Pt|tP_{t|t} in (2.7).

Fact 10.

There exists T{T}^{\prime}\in\mathbb{N} and ρ,ρm>0\rho,\rho_{m}>0 such that 𝗍𝗋(Pt|t)ρ\mathsf{tr}\!\left(P_{t|t}\right)\leqslant\rho for all tTt\geqslant{T}^{\prime}, and Ktρm\left\lVert{K_{t}}\right\rVert\leqslant\rho_{m} for all tTt\geqslant{T}^{\prime}.

Fact 10 follows, for example, immediately from Lemma 15 in Appendix B and since by assumption Σv>0\Sigma_{v}>0, one possible bound on (Kt)tT\bigl{(}\left\lVert{K_{t}}\right\rVert\bigr{)}_{t\geqslant{T}^{\prime}} is given by KtAPt|tA𝖳+ΣwC𝖳λmin(Σv)\left\lVert{K_{t}}\right\rVert\leqslant\frac{\left\lVert{AP_{t|t}A^{\mathsf{T}}+\Sigma_{w}}\right\rVert\left\lVert{C^{\mathsf{T}}}\right\rVert}{\lambda_{\min}(\Sigma_{v})} (Σw+A2Pt|t)C𝖳λmin(Σv)\leqslant\frac{\bigl{(}\left\lVert{\Sigma_{w}}\right\rVert+\left\lVert{A}\right\rVert^{2}\left\lVert{P_{t|t}}\right\rVert\bigr{)}\left\lVert{C^{\mathsf{T}}}\right\rVert}{\lambda_{\min}(\Sigma_{v})}. Note that there are many alternative bounds in the Literature, see, for example, [Anderson and Moore, 1979; Jazwinski, 1970]. Now, using the bounds in Fact 10, we can proceed to prove Scholium 4.

Proof of Scholium 4.

Recall the expression of Ξt\Xi_{t} in (3.1) and define the following quantities:

Fκte\displaystyle F^{e}_{\kappa t} [Aκ1KκtCAAKκ(t+1)2CAKκ(t+1)1CA],\displaystyle\coloneqq\begin{bmatrix}A^{\kappa-1}K_{\kappa t}CA&\cdots&AK_{\kappa(t+1)-2}CA&K_{\kappa(t+1)-1}CA\end{bmatrix}, (5.2)
Fκtw\displaystyle F^{w}_{\kappa t} [Aκ1KκtCAKκ(t+1)2CKκ(t+1)1C],\displaystyle\coloneqq\begin{bmatrix}A^{\kappa-1}K_{\kappa t}C&\cdots&AK_{\kappa(t+1)-2}C&K_{\kappa(t+1)-1}C\end{bmatrix},
Fκtv\displaystyle F^{v}_{\kappa t} [Aκ1KκtAKκ(t+1)2Kκ(t+1)1].\displaystyle\coloneqq\begin{bmatrix}A^{\kappa-1}K_{\kappa t}&\cdots&AK_{\kappa(t+1)-2}&K_{\kappa(t+1)-1}\end{bmatrix}.

Then, Ξκt\Xi_{\kappa t} can be rewritten as

Ξκt=Fκteeκt:κ(t+1)1+Fκtwwκt:κ(t+1)1+Fκtvvκt+1:κ(t+1).\Xi_{\kappa t}=F^{e}_{\kappa t}e_{\kappa t:\kappa(t+1)-1}+F^{w}_{\kappa t}w_{\kappa t:\kappa(t+1)-1}+F^{v}_{\kappa t}v_{\kappa t+1:\kappa(t+1)}. (5.3)

But for tT/κt\geqslant{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}, 111Recall that for any positive real number ss, s\lceil s\rceil denotes the smallest integer that upper-bounds ss. we have that

Fκte\displaystyle\left\lVert{F^{e}_{\kappa t}}\right\rVert κCAsupκT/κK,\displaystyle\leqslant\kappa\left\lVert{CA}\right\rVert\sup_{\ell\geqslant\kappa{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}}\left\lVert{K_{\ell}}\right\rVert,
Fκtw\displaystyle\left\lVert{F^{w}_{\kappa t}}\right\rVert κCsupκT/κK,\displaystyle\leqslant\kappa\left\lVert{C}\right\rVert\sup_{\ell\geqslant\kappa{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}}\left\lVert{K_{\ell}}\right\rVert,
Fκtv\displaystyle\left\lVert{F^{v}_{\kappa t}}\right\rVert κsupκT/κK.\displaystyle\leqslant\kappa\sup_{\ell\geqslant\kappa{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}}\left\lVert{K_{\ell}}\right\rVert.

Using Fact 10 we know that supκT/κKρh\sup_{\ell\geqslant\kappa{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}}\left\lVert{K_{\ell}}\right\rVert\leqslant\rho_{h}. Thus, it suffices to take

ζ=κ3/2ρh(CAρ+C𝗍𝗋(Σw)+𝗍𝗋(Σv))\zeta=\kappa^{3/2}\rho_{h}\Bigl{(}\left\lVert{CA}\right\rVert\sqrt{\rho}+\left\lVert{C}\right\rVert\sqrt{\mathsf{tr}\!\left(\Sigma_{w}\right)}+\sqrt{\mathsf{tr}\!\left(\Sigma_{v}\right)}\Bigr{)}

in order to upper-bound the expectation of Ξ(t)\Xi(t) in (5.3) after time TκT/κT\coloneqq\kappa{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}. ∎

The following result pertains to mean-square boundedness of a random sequence (ξt)t(\xi_{t})_{t\in\mathbb{N}}:

Proposition 11 ([Pemantle and Rosenthal, 1999, Theorem 1]).

Let (ξt)t(\xi_{t})_{t\in\mathbb{N}} be a sequence of nonnegative random variables on some probability space (Ω,𝔉,)(\Omega,\mathfrak{F},\mathbb{P}), and let (𝔉t)t(\mathfrak{F}_{t})_{t\in\mathbb{N}} be any filtration to which (ξt)t(\xi_{t})_{t\in\mathbb{N}} is adapted. Suppose that there exist constants b>0b>0, and J,M<J,M<\infty, such that

ξ0J,\xi_{0}\leqslant J,

and for all tt\in\mathbb{N}:

𝔼[ξt+1ξt|𝔉t]bon the event {ξt>J},and\displaystyle\mathbb{E}\bigl{[}\xi_{t+1}-\xi_{t}\big{|}\mathfrak{F}_{t}\bigr{]}\leqslant-b\quad\text{on the event }\{\xi_{t}>J\},\quad\text{and} (5.4)
𝔼[|ξt+1ξt|4|ξ0,,ξt]M.\displaystyle\mathbb{E}\bigl{[}\left\lvert{\xi_{t+1}-\xi_{t}}\right\rvert^{4}\big{|}\xi_{0},\ldots,\xi_{t}\bigr{]}\leqslant M. (5.5)

Then there exists a constant γ=γ(b,J,M)>0\gamma=\gamma(b,J,M)>0 such that supt𝔼[ξt2]γ\displaystyle{\sup_{t\in\mathbb{N}}\mathbb{E}\bigl{[}\xi_{t}^{2}\bigr{]}\leqslant\gamma}.

Lemma 12.

Consider the system (2.1a)-(2.1b), and suppose that Assumption 5 holds. Then the first assertion (i) of Theorem 6 holds.

Proof.

It is clear that Xt𝖳𝒬Xt+Ut𝖳UtX_{t}^{\mathsf{T}}\mathcal{Q}X_{t}+U_{t}^{\mathsf{T}}\mathcal{R}U_{t} is convex quadratic in XtX_{t} and UtU_{t}, and both XtX_{t} and UtU_{t} are affine functions of the design parameters (𝜼t,𝚯t)(\boldsymbol{\eta}_{t},\boldsymbol{\Theta}_{t}) for every realization of the noise (wt)t(w_{t})_{t\in\mathbb{N}}. Since taking expectation of a convex function retains convexity [Boyd and Vandenberghe, 2004], we conclude that 𝔼xt[Xt𝖳𝒬Xt+Ut𝖳Ut]\mathbb{E}_{x_{t}}\bigl{[}X_{t}^{\mathsf{T}}\mathcal{Q}X_{t}+U_{t}^{\mathsf{T}}\mathcal{R}U_{t}\bigr{]} is convex in (𝜼t,𝚯t)(\boldsymbol{\eta}_{t},\boldsymbol{\Theta}_{t}). Also, note that the constraint sets described by (3.4) and (3.5) are convex in (𝜼t,𝚯t)(\boldsymbol{\eta}_{t},\boldsymbol{\Theta}_{t}). This settles the claim concerning convexity of the optimization program in Theorem 6-(i).

Concerning the objective function (3.3), we have that

𝔼𝒴t[Xt𝖳𝒬Xt+Ut𝖳Ut]\displaystyle\mathbb{E}_{\mathcal{Y}_{t}}\bigl{[}X_{t}^{\mathsf{T}}\mathcal{Q}X_{t}+U_{t}^{\mathsf{T}}\mathcal{R}U_{t}\bigr{]}
=𝔼𝒴t[(𝒜xt+Ut+𝒟Wt)𝖳𝒬(𝒜xt+Ut+𝒟Wt)+Ut𝖳Ut]\displaystyle\;=\mathbb{E}_{\mathcal{Y}_{t}}\bigl{[}\bigl{(}\mathcal{A}x_{t}+\mathcal{B}U_{t}+\mathcal{D}W_{t}\bigr{)}^{\mathsf{T}}\mathcal{Q}\bigl{(}\mathcal{A}x_{t}+\mathcal{B}U_{t}+\mathcal{D}W_{t}\bigr{)}+U_{t}^{\mathsf{T}}\mathcal{R}U_{t}\bigr{]}
=𝔼𝒴t[𝒜xt+(𝜼t+𝚯tφ(YtY^t))+𝒟Wt𝒬2+𝜼t+𝚯tφ(YtY^t)2]\displaystyle\;=\mathbb{E}_{\mathcal{Y}_{t}}\Bigl{[}\bigl{\|}\mathcal{A}x_{t}+\mathcal{B}\bigl{(}\boldsymbol{\eta}_{t}+\boldsymbol{\Theta}_{t}\varphi(Y_{t}-\hat{Y}_{t})\bigr{)}+\mathcal{D}W_{t}\bigr{\|}_{\mathcal{Q}}^{2}+\bigl{\|}\boldsymbol{\eta}_{t}+\boldsymbol{\Theta}_{t}\varphi(Y_{t}-\hat{Y}_{t})\bigr{\|}_{\mathcal{R}}^{2}\Bigr{]}
=𝔼𝒴t[xt𝖳𝒜𝖳𝒬𝒜xt+2xt𝖳𝒜𝖳𝒬𝜼t+2xt𝖳𝒜𝖳𝒬𝚯tφ(YtY^t)+2xt𝖳𝒜𝖳𝒬𝒟Wt\displaystyle\;=\mathbb{E}_{\mathcal{Y}_{t}}\Bigl{[}x_{t}^{\mathsf{T}}\mathcal{A}^{\mathsf{T}}\mathcal{Q}\mathcal{A}x_{t}+2x_{t}^{\mathsf{T}}\mathcal{A}^{\mathsf{T}}\mathcal{Q}\mathcal{B}\boldsymbol{\eta}_{t}+2x_{t}^{\mathsf{T}}\mathcal{A}^{\mathsf{T}}\mathcal{Q}\mathcal{B}\boldsymbol{\Theta}_{t}\varphi(Y_{t}-\hat{Y}_{t})+2x_{t}^{\mathsf{T}}\mathcal{A}^{\mathsf{T}}\mathcal{Q}\mathcal{D}W_{t}
+𝜼t𝖳(𝖳𝒬+)𝜼t+2𝜼t𝖳(𝖳𝒬+)𝚯tφ(YtY^t)+2𝜼t𝖳𝖳𝒬𝒟Wt\displaystyle\qquad+\boldsymbol{\eta}_{t}^{\mathsf{T}}(\mathcal{B}^{\mathsf{T}}\mathcal{Q}\mathcal{B}+\mathcal{R})\boldsymbol{\eta}_{t}+2\boldsymbol{\eta}_{t}^{\mathsf{T}}(\mathcal{B}^{\mathsf{T}}\mathcal{Q}\mathcal{B}+\mathcal{R})\boldsymbol{\Theta}_{t}\varphi(Y_{t}-\hat{Y}_{t})+2\boldsymbol{\eta}_{t}^{\mathsf{T}}\mathcal{B}^{\mathsf{T}}\mathcal{Q}\mathcal{D}W_{t}
+φ(YtY^t)𝖳𝚯t𝖳(𝖳𝒬+)𝚯tφ(YtY^t)+2φ(YtY^t)𝖳𝚯t𝖳𝖳𝒬𝒟Wt\displaystyle\qquad+\varphi(Y_{t}-\hat{Y}_{t})^{\mathsf{T}}\boldsymbol{\Theta}_{t}^{\mathsf{T}}(\mathcal{B}^{\mathsf{T}}\mathcal{Q}\mathcal{B}+\mathcal{R})\boldsymbol{\Theta}_{t}\varphi(Y_{t}-\hat{Y}_{t})+2\varphi(Y_{t}-\hat{Y}_{t})^{\mathsf{T}}\boldsymbol{\Theta}_{t}^{\mathsf{T}}\mathcal{B}^{\mathsf{T}}\mathcal{Q}\mathcal{D}W_{t}
+Wt𝖳𝒟𝖳𝒬𝒟Wt]\displaystyle\qquad+W_{t}^{\mathsf{T}}\mathcal{D}^{\mathsf{T}}\mathcal{Q}\mathcal{D}W_{t}\Bigr{]}

Since 𝔼{𝒴t,xt}[Wt]=0\mathbb{E}_{\{\mathcal{Y}_{t},x_{t}\}}\bigl{[}W_{t}\bigr{]}=0 and in view of the definitions of the various matrices in (3.6) we have

Vt\displaystyle V_{t} =2x^t𝖳𝒜𝖳𝒬𝜼t+2𝗍𝗋(𝒜𝖳𝒬𝚯tΛtφx)+2𝗍𝗋(𝚯t𝖳𝖳𝒬𝒟Λtwφ)\displaystyle=2\hat{x}_{t}^{\mathsf{T}}\mathcal{A}^{\mathsf{T}}\mathcal{Q}\mathcal{B}\boldsymbol{\eta}_{t}+2\mathsf{tr}\!\left(\mathcal{A}^{\mathsf{T}}\mathcal{Q}\mathcal{B}\boldsymbol{\Theta}_{t}\Lambda^{\varphi x}_{t}\right)+2\mathsf{tr}\!\left(\boldsymbol{\Theta}_{t}^{\mathsf{T}}\mathcal{B}^{\mathsf{T}}\mathcal{Q}\mathcal{D}\Lambda^{w\varphi}_{t}\right)
+𝜼t𝖳1𝜼t+2𝜼t𝖳1𝚯tΛtφ+𝗍𝗋(𝚯t𝖳1𝚯tΛtφφ)\displaystyle\quad+\boldsymbol{\eta}_{t}^{\mathsf{T}}\mathcal{M}_{1}\boldsymbol{\eta}_{t}+2\boldsymbol{\eta}_{t}^{\mathsf{T}}\mathcal{M}_{1}\boldsymbol{\Theta}_{t}\Lambda^{\varphi}_{t}+\mathsf{tr}\!\left(\boldsymbol{\Theta}_{t}^{\mathsf{T}}\mathcal{M}_{1}\boldsymbol{\Theta}_{t}\Lambda^{\varphi\varphi}_{t}\right)
+terms that do not depend on (𝜼t,𝚯t).\displaystyle\quad+\text{terms that do not depend on }(\boldsymbol{\eta}_{t},\boldsymbol{\Theta}_{t}).

This tallies the objective function in (3.3) with the objective VtV_{t} in (2.14).

Concerning the constraints, we have shown in [Hokayem et al., 2009; Chatterjee et al., 2009] that combining the constraint utUmax\left\lVert{u_{t}}\right\rVert_{\infty}\leqslant U_{\max} and the class of policies (POL) is equivalent to the constraints |(𝜼t)i|+(𝚯t)i1φmaxUmax\left\lvert{(\boldsymbol{\eta}_{t})_{i}}\right\rvert+\left\lVert{(\boldsymbol{\Theta}_{t})_{i}}\right\rVert_{1}\varphi_{\max}\leqslant U_{\max} for all i=1,,Nmi=1,\ldots,Nm, which accounts for the constraint (3.4). Substituting (POL) into the stability constraint (C2), we obtain

A2κx^t(2)+κ(A2,B2)(𝜼t)1:κm+κ(A2,B2)(𝚯t)1:κmφ(W)\displaystyle\left\lVert{A_{2}^{\kappa}{\hat{x}}^{(2)}_{t}+\mathfrak{R}_{\kappa}(A_{2},B_{2})(\boldsymbol{\eta}_{t})_{1:\kappa m}+\mathfrak{R}_{\kappa}(A_{2},B_{2})(\boldsymbol{\Theta}_{t})_{1:\kappa m}\varphi(W)}\right\rVert
A2κx^t(2)+κ(A2,B2)(𝜼t)1:κm+κ(A2,B2)(𝚯t)1:κmφ(W)\displaystyle\qquad\leqslant\left\lVert{A_{2}^{\kappa}{\hat{x}}^{(2)}_{t}+\mathfrak{R}_{\kappa}(A_{2},B_{2})(\boldsymbol{\eta}_{t})_{1:\kappa m}}\right\rVert+\left\lVert{\mathfrak{R}_{\kappa}(A_{2},B_{2})(\boldsymbol{\Theta}_{t})_{1:\kappa m}\varphi(W)}\right\rVert
A2κx^t(2)+κ(A2,B2)(𝜼t)1:κm+n2κ(A2,B2)(𝚯t)1:κmφ(W)\displaystyle\qquad\leqslant\left\lVert{A_{2}^{\kappa}{\hat{x}}^{(2)}_{t}+\mathfrak{R}_{\kappa}(A_{2},B_{2})(\boldsymbol{\eta}_{t})_{1:\kappa m}}\right\rVert+\sqrt{n_{2}}\left\lVert{\mathfrak{R}_{\kappa}(A_{2},B_{2})(\boldsymbol{\Theta}_{t})_{1:\kappa m}\varphi(W)}\right\rVert_{\infty}
A2κx^t(2)+κ(A2,B2)(𝜼t)1:κm+n2κ(A2,B2)(𝚯t)1:κmφmax.\displaystyle\qquad\leqslant\left\lVert{A_{2}^{\kappa}{\hat{x}}^{(2)}_{t}+\mathfrak{R}_{\kappa}(A_{2},B_{2})(\boldsymbol{\eta}_{t})_{1:\kappa m}}\right\rVert+\sqrt{n_{2}}\left\lVert{\mathfrak{R}_{\kappa}(A_{2},B_{2})(\boldsymbol{\Theta}_{t})_{1:\kappa m}}\right\rVert_{\infty}\varphi_{\max}.

Accordingly, if condition constraint (3.5) is satisfied, then the stability constraint (C2) is satisfied as well.

It remains to show that the constraints are simultaneously feasible. Inspired by the work in [Ramponi et al., 2009], we consider the candidate controller

u~t:t+κ1[ηtηt+κ1]κ(A2,B2)satr(A2κx^t(2)),\tilde{u}_{t:t+\kappa-1}\coloneqq\left[\begin{matrix}\eta_{t}\\ \vdots\\ \eta_{t+\kappa-1}\end{matrix}\right]\coloneqq-\mathfrak{R}_{\kappa}(A_{2},B_{2})^{\dagger}\operatorname{sat}_{r}(A_{2}^{\kappa}\hat{x}_{t}^{(2)}),

i.e., with 𝚯t0\boldsymbol{\Theta}_{t}\equiv 0, where rζ+ε/2r\coloneqq\zeta+\varepsilon/2. First, we have that

u~t:t+κ1u~t:t+κ12σmin(κ(A2,B2))1(ζ+ε/2)=Umax,\left\lVert{\tilde{u}_{t:t+\kappa-1}}\right\rVert_{\infty}\leqslant\left\lVert{\tilde{u}_{t:t+\kappa-1}}\right\rVert_{2}\leqslant\sigma_{\min}(\mathfrak{R}_{\kappa}(A_{2},B_{2}))^{-1}(\zeta+\varepsilon/2)=U_{\max}^{*},

and the constraint (C1) is satisfied. Concerning constraint (C2), we have that

A2κx^t(2)+κ(A2,B2)u~t:t+κ1\displaystyle\left\lVert{A_{2}^{\kappa}\hat{x}_{t}^{(2)}+\mathfrak{R}_{\kappa}(A_{2},B_{2})\tilde{u}_{t:t+\kappa-1}}\right\rVert =x^t(2)r\displaystyle=\left\lVert{\hat{x}_{t}^{(2)}}\right\rVert-r
=x^t(2)(ζ+ε/2),whenever x^t(2)ζ+ε,\displaystyle=\left\lVert{\hat{x}_{t}^{(2)}}\right\rVert-(\zeta+\varepsilon/2),\ \ \text{whenever }\bigl{\|}{\hat{x}}^{(2)}_{t}\bigr{\|}\geqslant\zeta+\varepsilon, (5.6)

where the first equality follows from the orthogonality of A2A_{2} (see [Ramponi et al., 2009]), and the constraint (3.5) is also satisfied. The optimization program (3.3) subject to (3.4)-(3.5) is therefore a quadratic program that is equivalent to (OP2). ∎

Lemma 13.

Consider the system (2.1a)-(2.1b), and suppose that Assumption 5 holds. Then there exist constants b,J>0b,J>0 such that222For zz\in\mathbb{R} the notation z\left\lceil{z}\right\rceil stands for the smallest integer greater or equal to zz.

𝔼𝒴κt[x^κ(t+1)(2)x^κt(2)]bon the set {x^κt(2)>J}for all tT/κ.\mathbb{E}_{\mathcal{Y}_{\kappa t}}\Bigl{[}\bigl{\|}{\hat{x}}^{(2)}_{\kappa(t+1)}\bigr{\|}-\bigl{\|}{\hat{x}}^{(2)}_{\kappa t}\bigr{\|}\Bigr{]}\leqslant-b\quad\text{on the set }\bigl{\{}\bigl{\|}{\hat{x}}^{(2)}_{\kappa t}\bigr{\|}>J\bigr{\}}\qquad\text{for all }t\geqslant{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}.
Proof.

Let Π(2)\Pi^{(2)} be a projection operator that picks the last n2n_{2} components of any vector in n\mathbb{R}^{n}, and consider the subsampled process (x^tκ(2))t({\hat{x}}^{(2)}_{t\kappa})_{t\in\mathbb{N}} given by

x^κ(t+1)(2)=A2κx^κt(2)+κ(A2,B2)uκt:κ(t+1)1+Π(2)(Ξκt).{\hat{x}}^{(2)}_{\kappa(t+1)}=A_{2}^{\kappa}{\hat{x}}^{(2)}_{\kappa t}+\mathfrak{R}_{\kappa}(A_{2},B_{2})u_{\kappa t:\kappa(t+1)-1}+\Pi^{(2)}\bigl{(}\Xi_{\kappa t}\bigr{)}. (5.7)

By utilizing the triangle inequality we get

𝔼𝒴κt[x^κ(t+1)(2)x^κt(2)]\displaystyle\mathbb{E}_{\mathcal{Y}_{\kappa t}}\Bigl{[}\left\lVert{{\hat{x}}^{(2)}_{\kappa(t+1)}}\right\rVert-\left\lVert{{\hat{x}}^{(2)}_{\kappa t}}\right\rVert\Bigr{]}
𝔼𝒴κt[A2κx^κt(2)+κ(A2,B2)uκt:κ(t+1)1x^κt(2)]+𝔼𝒴κt[Π(2)(Ξκt)].\displaystyle\;\leqslant\mathbb{E}_{\mathcal{Y}_{\kappa t}}\Bigl{[}\Bigl{\|}A_{2}^{\kappa}{\hat{x}}^{(2)}_{\kappa t}+\mathfrak{R}_{\kappa}(A_{2},B_{2})u_{\kappa t:\kappa(t+1)-1}\Bigr{\|}-\left\lVert{{\hat{x}}^{(2)}_{\kappa t}}\right\rVert\Bigr{]}+\mathbb{E}_{\mathcal{Y}_{\kappa t}}\Bigl{[}\left\lVert{\Pi^{(2)}\bigl{(}\Xi_{\kappa t}\bigr{)}}\right\rVert\Bigr{]}.

We know from Scholium 4 that there exists a uniform (with respect to time tt) upper bound ζ\zeta for the last term on the right-hand side of the preceding inequality for tκT/κt\geqslant\kappa{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}. We rewrite the inequality as

𝔼𝒴κt[x^κ(t+1)(2)x^κt(2)]\displaystyle\mathbb{E}_{\mathcal{Y}_{\kappa t}}\Bigl{[}\left\lVert{{\hat{x}}^{(2)}_{\kappa(t+1)}}\right\rVert-\left\lVert{{\hat{x}}^{(2)}_{\kappa t}}\right\rVert\Bigr{]} 𝔼𝒴κt[A2κx^κt(2)+κ(A2,B2)uκt:κ(t+1)1x^κt(2)]+ζ\displaystyle\leqslant\mathbb{E}_{\mathcal{Y}_{\kappa t}}\Bigl{[}\left\lVert{A_{2}^{\kappa}{\hat{x}}^{(2)}_{\kappa t}+\mathfrak{R}_{\kappa}(A_{2},B_{2})u_{\kappa t:\kappa(t+1)-1}}\right\rVert-\left\lVert{{\hat{x}}^{(2)}_{\kappa t}}\right\rVert\Bigr{]}+\zeta
ε2,whenever x^t(2)ζ+ε,\displaystyle\leqslant-\frac{\varepsilon}{2},\qquad\text{whenever }\bigl{\|}{\hat{x}}^{(2)}_{t}\bigr{\|}\geqslant\zeta+\varepsilon,

where the last inequality follows from (5.6). Setting b=ε/2b=\varepsilon/2 and J=ζ+εJ=\zeta+\varepsilon completes the proof. ∎

Lemma 14.

Consider the system (2.1a)-(2.1b), and suppose that Assumption 5 holds. Then there exists a constant M>0M>0 such that

𝔼[|x^κ(t+1)(2)x^κt(2)|4|x^κi(2),i=T/κ,,t]Mfor all tT/κ.\mathbb{E}\Bigl{[}\left\lvert{\left\lVert{{\hat{x}}^{(2)}_{\kappa(t+1)}}\right\rVert-\left\lVert{{\hat{x}}^{(2)}_{\kappa t}}\right\rVert}\right\rvert^{4}\,\Big{|}\,\left\lVert{{\hat{x}}^{(2)}_{\kappa i}}\right\rVert,\;i={\left\lceil{{T}^{\prime}/\kappa}\right\rceil},\ldots,t\Bigr{]}\leqslant M\qquad\text{for all }t\geqslant{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}.
Proof.

Fix tT/κt\geqslant{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}, and consider again the subsampled process in (5.7). We have, by orthogonality of A2A_{2}, that x^κt(2)=A2κx^κt(2)\left\lVert{{\hat{x}}^{(2)}_{\kappa t}}\right\rVert=\left\lVert{A_{2}^{\kappa}{\hat{x}}^{(2)}_{\kappa t}}\right\rVert and, by the triangle inequality, that

𝔼[|x^κ(t+1)(2)x^κt(2)|4|x^κi(2),i=T/κ,,t]\displaystyle\mathbb{E}\Bigl{[}\left\lvert{\left\lVert{{\hat{x}}^{(2)}_{\kappa(t+1)}}\right\rVert-\left\lVert{{\hat{x}}^{(2)}_{\kappa t}}\right\rVert}\right\rvert^{4}\,\Big{|}\,\left\lVert{{\hat{x}}^{(2)}_{\kappa i}}\right\rVert,\;i={\left\lceil{{T}^{\prime}/\kappa}\right\rceil},\ldots,t\Bigr{]}
=𝔼[|x^κ(t+1)(2)A2κx^κt(2)|4|x^κi(2),i=T/κ,,t]\displaystyle=\mathbb{E}\Bigl{[}\left\lvert{\left\lVert{{\hat{x}}^{(2)}_{\kappa(t+1)}}\right\rVert-\left\lVert{A_{2}^{\kappa}{\hat{x}}^{(2)}_{\kappa t}}\right\rVert}\right\rvert^{4}\,\Big{|}\,\left\lVert{{\hat{x}}^{(2)}_{\kappa i}}\right\rVert,\;i={\left\lceil{{T}^{\prime}/\kappa}\right\rceil},\ldots,t\Bigr{]}
𝔼[A2κx^κt(2)+κ(A2,B2)uκt:κ(t+1)1+ΞκtA2κx^κt(2)4|x^κi(2),i=T/κ,,t]\displaystyle\leqslant\mathbb{E}\Bigl{[}\left\lVert{A_{2}^{\kappa}{\hat{x}}^{(2)}_{\kappa t}+\mathfrak{R}_{\kappa}(A_{2},B_{2})u_{\kappa t:\kappa(t+1)-1}+\Xi_{\kappa t}-A_{2}^{\kappa}{\hat{x}}^{(2)}_{\kappa t}}\right\rVert^{4}\,\Big{|}\,\left\lVert{{\hat{x}}^{(2)}_{\kappa i}}\right\rVert,\;i={\left\lceil{{T}^{\prime}/\kappa}\right\rceil},\ldots,t\Bigr{]}
=𝔼[(κ(A2,B2)uκt:κ(t+1)1+Ξκt)4|x^κi(2),i=T/κ,,t].\displaystyle=\mathbb{E}\Bigl{[}\Bigl{(}\left\lVert{\mathfrak{R}_{\kappa}(A_{2},B_{2})u_{\kappa t:\kappa(t+1)-1}}\right\rVert+\left\lVert{\Xi_{\kappa t}}\right\rVert\Bigr{)}^{4}\,\Big{|}\,\left\lVert{{\hat{x}}^{(2)}_{\kappa i}}\right\rVert,\;i={\left\lceil{{T}^{\prime}/\kappa}\right\rceil},\ldots,t\Bigr{]}.

Recall that for any two positive numbers aa and bb, (a+b)22a2+2b2(a+b)^{2}\leqslant 2a^{2}+2b^{2}. Using this latter fact, we arrive at the following upper bound

𝔼[|x^κ(t+1)(2)x^κt(2)|4|x^κi(2),i=T/κ,,t]\displaystyle\mathbb{E}\Bigl{[}\left\lvert{\left\lVert{{\hat{x}}^{(2)}_{\kappa(t+1)}}\right\rVert-\left\lVert{{\hat{x}}^{(2)}_{\kappa t}}\right\rVert}\right\rvert^{4}\,\Big{|}\,\left\lVert{{\hat{x}}^{(2)}_{\kappa i}}\right\rVert,\;i={\left\lceil{{T}^{\prime}/\kappa}\right\rceil},\ldots,t\Bigr{]}
8𝔼[κ(A2,B2)uκt:κ(t+1)14+Ξκt4|x^κi(2),i=T/κ,,t].\displaystyle\leqslant 8\mathbb{E}\Bigl{[}\left\lVert{\mathfrak{R}_{\kappa}(A_{2},B_{2})u_{\kappa t:\kappa(t+1)-1}}\right\rVert^{4}+\left\lVert{\Xi_{\kappa t}}\right\rVert^{4}\,\Big{|}\,\left\lVert{{\hat{x}}^{(2)}_{\kappa i}}\right\rVert,\;i={\left\lceil{{T}^{\prime}/\kappa}\right\rceil},\ldots,t\Bigr{]}.

By design uiUmax\left\lVert{u_{i}}\right\rVert_{\infty}\leqslant U_{\max} and Ξκt\Xi_{\kappa t} is Gaussian and independent of {x^κi(2),i=T/κ,,t}\Bigl{\{}\left\lVert{{\hat{x}}^{(2)}_{\kappa i}}\right\rVert,\;i={\left\lceil{{T}^{\prime}/\kappa}\right\rceil},\ldots,t\Bigr{\}} and has its fourth moment bounded. Therefore, we can easily infer that there exists an M>0M>0 such that

𝔼[(κ(A2,B2)uκt:κ(t+1)1+Ξκt)4|x^κi(2),i=T/κ,,t]M\mathbb{E}\Bigl{[}\Bigl{(}\left\lVert{\mathfrak{R}_{\kappa}(A_{2},B_{2})u_{\kappa t:\kappa(t+1)-1}}\right\rVert+\left\lVert{\Xi_{\kappa t}}\right\rVert\Bigr{)}^{4}\,\Big{|}\,\left\lVert{{\hat{x}}^{(2)}_{\kappa i}}\right\rVert,\;i={\left\lceil{{T}^{\prime}/\kappa}\right\rceil},\ldots,t\Bigr{]}\leqslant M

for all tT/κt\geqslant{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}. ∎

We are finally ready to prove Theorem 6.

Proof of Theorem 6.

Claim (i) of Theorem 6 was proved in Lemma 12. It remains to show claim (ii). We start by asserting the following inequality

𝔼𝒴κT/κ[xt2]2𝔼𝒴κT/κ[xtx^t2]+2𝔼𝒴κT/κ[x^t2].\mathbb{E}_{\mathcal{Y}_{\kappa{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}}}\bigl{[}\left\lVert{x_{t}}\right\rVert^{2}\bigr{]}\leqslant 2\mathbb{E}_{\mathcal{Y}_{\kappa{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}}}\bigl{[}\left\lVert{x_{t}-\hat{x}_{t}}\right\rVert^{2}\bigr{]}+2\mathbb{E}_{\mathcal{Y}_{\kappa{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}}}\bigl{[}\left\lVert{\hat{x}_{t}}\right\rVert^{2}\bigr{]}.

We know from Fact 10 that 𝔼𝒴κT/κ[xtx^t2]ρ\mathbb{E}_{\mathcal{Y}_{\kappa{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}}}\bigl{[}\left\lVert{x_{t}-\hat{x}_{t}}\right\rVert^{2}\bigr{]}\leqslant\rho for all tκT/κt\geqslant\kappa{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}. As such, if we are able to show that the state of the estimator is mean-square bounded, we can immediately infer a mean-square bound on the state of the plant.

We first start by splitting the squared norm of the estimator state as x^t2=x^t(1)2+x^t(2)2\left\lVert{\hat{x}_{t}}\right\rVert^{2}=\bigl{\|}\hat{x}_{t}^{(1)}\bigr{\|}^{2}+\bigl{\|}\hat{x}_{t}^{(2)}\bigr{\|}^{2}, where x^(1)\hat{x}^{(1)} and x^(2)\hat{x}^{(2)} are states corresponding to the Schur and orthogonal parts of the system, respectively. It then follows that

𝔼𝒴κT/κ[x^t2]=𝔼𝒴κT/κ[x^t(1)2]+𝔼𝒴κT/κ[x^t(2)2],t.\mathbb{E}_{\mathcal{Y}_{\kappa{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}}}\bigl{[}\left\lVert{\hat{x}_{t}}\right\rVert^{2}\bigr{]}=\mathbb{E}_{\mathcal{Y}_{\kappa{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}}}\bigl{[}\bigl{\|}\hat{x}_{t}^{(1)}\bigr{\|}^{2}\bigr{]}+\mathbb{E}_{\mathcal{Y}_{\kappa{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}}}\bigl{[}\bigl{\|}\hat{x}_{t}^{(2)}\bigr{\|}^{2}\bigr{]},\qquad t\in\mathbb{N}. (5.8)

Letting ξtx^t(2)\xi_{t}\coloneqq\bigl{\|}{\hat{x}}^{(2)}_{t}\bigr{\|} for tt\in\mathbb{N}, we see from Lemma 13 and Lemma 14 the conditions (5.4) and (5.5) of Proposition 11 are verified for the sequence (ξκt)tT/κ(\xi_{\kappa t})_{t\geqslant{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}}. Thus, by Proposition 11, there exists a constant γ(2)=γ(2)(b,J,M)>0\gamma^{(2)}=\gamma^{(2)}(b,J,M)>0 such that

𝔼𝒴κT/κ[x^κt(2)2]γ(2)for all tT/κ.\mathbb{E}_{\mathcal{Y}_{\kappa{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}}}\bigl{[}\bigl{\|}{\hat{x}}^{(2)}_{\kappa t}\bigr{\|}^{2}\bigr{]}\leqslant\gamma^{(2)}\qquad\text{for all }t\geqslant{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}.

Hence, the orthogonal subsystem is mean-square bounded.

Since the matrix A1A_{1} is Schur stable, we know [Bernstein, 2009, Proposition 11.10.5] that there exists a positive definite matrix Mn1×n1M\in\mathbb{R}^{n_{1}\times n_{1}} that satisfies A1𝖳MA1M=IA_{1}^{\mathsf{T}}MA_{1}-M=-I. It easily follows that there exists ρ]0,1[\rho\in\;]0,1[ such that A1𝖳MA1MρMA_{1}^{\mathsf{T}}MA_{1}-M\leqslant-\rho M; in fact, ρ\rho can be chosen from the interval ]0,min{1,λmin(I)/λmax(M)}[\bigl{]}0,\min\bigl{\{}1,\lambda_{\min}\bigl{(}I\bigr{)}/\lambda_{\max}\bigl{(}M\bigr{)}\bigr{\}}\bigr{[}. Therefore, we see that for any tTt\geqslant{T}^{\prime},

𝔼𝒴t\displaystyle\mathbb{E}_{\mathcal{Y}_{t}} [x^t+1(1)M2]x^t(1)M2ρx^t(1)M2+2𝔼𝒴t[(x^t(1))𝖳A1𝖳MB1ut(1)]+𝔼𝒴t[Π(1)(Ξt)],\displaystyle\left[\left\lVert{{\hat{x}}^{(1)}_{t+1}}\right\rVert^{2}_{M}\right]-\left\lVert{{\hat{x}}^{(1)}_{t}}\right\rVert^{2}_{M}\leqslant-\rho\left\lVert{{\hat{x}}^{(1)}_{t}}\right\rVert^{2}_{M}+2\mathbb{E}_{\mathcal{Y}_{t}}\bigl{[}({\hat{x}}^{(1)}_{t})^{\mathsf{T}}A_{1}^{\mathsf{T}}MB_{1}u^{(1)}_{t}\bigr{]}+\mathbb{E}_{\mathcal{Y}_{t}}\bigl{[}\Pi^{(1)}(\Xi_{t})\bigr{]},

where Π(1)\Pi^{(1)} is the projection onto the first n1n_{1} coordinates and Ξt\Xi_{t} was defined in (3.1). Young’s inequality shows that 2𝔼𝒴t[(x^t(1))𝖳A1𝖳MB1ut(1)]ϵ𝔼𝒴t[A1x^t(1)M2]+1ϵ𝔼𝒴t[B1ut(1)M2]2\mathbb{E}_{\mathcal{Y}_{t}}\bigl{[}({\hat{x}}^{(1)}_{t})^{\mathsf{T}}A_{1}^{\mathsf{T}}MB_{1}u^{(1)}_{t}\bigr{]}\leqslant\epsilon\mathbb{E}_{\mathcal{Y}_{t}}\bigl{[}\bigl{\|}A_{1}{\hat{x}}^{(1)}_{t}\bigr{\|}_{M}^{2}\bigr{]}+\frac{1}{\epsilon}\mathbb{E}_{\mathcal{Y}_{t}}\bigl{[}\bigl{\|}B_{1}u^{(1)}_{t}\bigr{\|}_{M}^{2}\bigr{]} for ε>0\varepsilon>0. Choosing ϵ=3ρ/(2(1ρ))\epsilon=3\rho/(2(1-\rho)) and utilizing Fact 10 and the upper bound on the inputs UmaxU_{\max}, it follows that for all tTt\geqslant{T}^{\prime},

𝔼𝒴t\displaystyle\mathbb{E}_{\mathcal{Y}_{t}} [x^t+1(1)M2](1ρ2)x^t(1)M22(1ρ)3ρλmax(M)B12Umax2n1+ζ=:c.\displaystyle\left[\left\lVert{{\hat{x}}^{(1)}_{t+1}}\right\rVert^{2}_{M}\right]-(1-\tfrac{\rho}{2})\left\lVert{{\hat{x}}^{(1)}_{t}}\right\rVert^{2}_{M}\leqslant\frac{2(1-\rho)}{3\rho}\lambda_{\max}\bigl{(}M\bigr{)}\left\lVert{B_{1}}\right\rVert^{2}U_{\max}^{2}n_{1}+\zeta=:c.

Therefore, for tTt\geqslant{T}^{\prime}, we have that

𝔼𝒴κT/κ[x^κT/κ+2(1)M2]\displaystyle\mathbb{E}_{\mathcal{Y}_{\kappa{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}}}\left[\left\lVert{{\hat{x}}^{(1)}_{\kappa{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}+2}}\right\rVert^{2}_{M}\right] =𝔼𝒴κT[𝔼𝒴κT/κ+1[x^κT/κ+2(1)M2]]\displaystyle=\mathbb{E}_{\mathcal{Y}_{\kappa{T}^{\prime}}}\left[\mathbb{E}_{\mathcal{Y}_{\kappa{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}+1}}\left[\left\lVert{{\hat{x}}^{(1)}_{\kappa{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}+2}}\right\rVert^{2}_{M}\right]\right]
𝔼𝒴κT/κ[𝔼𝒴κT/κ+1[(1ρ2)x^κT/κ+1(1)M2+c]]\displaystyle\leqslant\mathbb{E}_{\mathcal{Y}_{\kappa{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}}}\left[\mathbb{E}_{\mathcal{Y}_{\kappa{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}+1}}\left[(1-\tfrac{\rho}{2})\left\lVert{{\hat{x}}^{(1)}_{\kappa{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}+1}}\right\rVert^{2}_{M}+c\right]\right]
(1ρ2)2x^κT/κ(1)M2+(1+(1ρ2))c.\displaystyle\leqslant(1-\tfrac{\rho}{2})^{2}\left\lVert{{\hat{x}}^{(1)}_{\kappa{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}}}\right\rVert^{2}_{M}+\bigl{(}1+(1-\tfrac{\rho}{2})\bigr{)}c.

Iterating the last inequality, it follows that

𝔼𝒴κT/κ[x^t+κT/κ(1)M2](1ρ2)tx^κT/κ(1)M2+i=0t1(1ρ2)ic,tT,\mathbb{E}_{\mathcal{Y}_{\kappa{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}}}\left[\left\lVert{{\hat{x}}^{(1)}_{t+\kappa{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}}}\right\rVert^{2}_{M}\right]\leqslant(1-\tfrac{\rho}{2})^{t}\left\lVert{{\hat{x}}^{(1)}_{\kappa{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}}}\right\rVert^{2}_{M}+\sum_{i=0}^{t-1}(1-\tfrac{\rho}{2})^{i}c,\qquad t\geqslant{T}^{\prime},

or

𝔼𝒴κT/κ\displaystyle\mathbb{E}_{\mathcal{Y}_{\kappa{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}}} [x^t+κT/κ(1)M2]x^κT/κ(1)M2+(1ρ2)1c,tT.\displaystyle\left[\left\lVert{{\hat{x}}^{(1)}_{t+\kappa{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}}}\right\rVert^{2}_{M}\right]\leqslant\left\lVert{{\hat{x}}^{(1)}_{\kappa{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}}}\right\rVert^{2}_{M}+(1-\tfrac{\rho}{2})^{-1}c,\qquad t\geqslant{T}^{\prime}.

We can conclude that there exists γ(1)=γ(1)(x^0)>0\gamma^{(1)}=\gamma^{(1)}({\hat{x}}_{0})>0 such that

𝔼𝒴κT/κ[x^t(1)2]γ(1)for all tκT/κ.\mathbb{E}_{\mathcal{Y}_{\kappa{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}}}\bigl{[}\bigl{\|}{\hat{x}}^{(1)}_{t}\bigr{\|}^{2}\bigr{]}\leqslant\gamma^{(1)}\qquad\text{for all }t\geqslant\kappa{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}. (5.9)

We can therefore conclude that

𝔼𝒴κT/κ[x^t2]γ(1)+γ(2)for all tκT/κ.\mathbb{E}_{\mathcal{Y}_{\kappa{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}}}\bigl{[}\bigl{\|}{\hat{x}}_{t}\bigr{\|}^{2}\bigr{]}\leqslant\gamma^{(1)}+\gamma^{(2)}\qquad\text{for all }t\geqslant\kappa{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}.

Since the sequence (xt)t(x_{t})_{t\in\mathbb{N}} in (2.1a) is generated through the addition of independent mean-square bounded random variables and a bounded control input, and since κT/κ<\kappa{\left\lceil{{T}^{\prime}/\kappa}\right\rceil}<\infty, it follows that there exists γ>0\gamma>0 such that

𝔼𝒴0[xt2]γfor all t,\mathbb{E}_{\mathcal{Y}_{0}}\bigl{[}\left\lVert{x_{t}}\right\rVert^{2}\bigr{]}\leqslant\gamma\qquad\text{for all }t\in\mathbb{N},

establishing the second claim (ii) of Theorem 6. ∎

Proof of Corollary 8.

The proof of Corollary 8 follows exactly the same reasoning as in the proof of Theorem 6, up to the constraints in (3.8) and (3.9). Also, rewriting the constraints (C3) and (C4) as (3.8) and (3.9), respectively, can be done similarly to the way we rewrote the cost in Theorem (6). It remains to show the upper bounds α\alpha^{*} and β\beta^{*} in Assumption (7).

The constraint (C3) can be upper-bounded as follows:

𝔼𝒴t[Xt𝖳𝒮Xt+𝖳Xt]\displaystyle\mathbb{E}_{\mathcal{Y}_{t}}\left[X_{t}^{\mathsf{T}}\mathcal{S}X_{t}+\mathcal{L}^{\mathsf{T}}X_{t}\right]
=𝔼𝒴t[𝒜xt+Ut+𝒟Wt𝒮2+𝖳(𝒜xt+Ut+𝒟Wt)]\displaystyle\quad=\mathbb{E}_{\mathcal{Y}_{t}}\left[\left\lVert{\mathcal{A}x_{t}+\mathcal{B}U_{t}+\mathcal{D}W_{t}}\right\rVert_{\mathcal{S}}^{2}+\mathcal{L}^{\mathsf{T}}(\mathcal{A}x_{t}+\mathcal{B}U_{t}+\mathcal{D}W_{t})\right]
3𝔼𝒴t[𝒜xt𝒮2+Ut𝒮2+𝒟Wt𝒮2]+𝖳𝒜x^t+|𝖳Ut|\displaystyle\quad\leqslant 3\mathbb{E}_{\mathcal{Y}_{t}}\left[\left\lVert{\mathcal{A}x_{t}}\right\rVert^{2}_{\mathcal{S}}+\left\lVert{\mathcal{B}U_{t}}\right\rVert_{\mathcal{S}}^{2}+\left\lVert{\mathcal{D}W_{t}}\right\rVert_{\mathcal{S}}^{2}\right]+\mathcal{L}^{\mathsf{T}}\mathcal{A}\hat{x}_{t}+|\mathcal{L}^{\mathsf{T}}\mathcal{B}U_{t}|
3𝗍𝗋(𝒜𝖳𝒮𝒜𝔼𝒴t[xtxt𝖳]+𝒟𝖳𝒮𝒟Σw)+𝖳𝒜x^t\displaystyle\quad\leqslant 3\mathsf{tr}\!\left(\mathcal{A}^{\mathsf{T}}\mathcal{S}\mathcal{A}\mathbb{E}_{\mathcal{Y}_{t}}\left[x_{t}x_{t}^{\mathsf{T}}\right]+\mathcal{D}^{\mathsf{T}}\mathcal{S}\mathcal{D}\Sigma_{w}\right)+\mathcal{L}^{\mathsf{T}}\mathcal{A}\hat{x}_{t}
3Nmσmax(𝖳𝒮)Umax2++𝖳1Umax,\displaystyle\quad\quad 3Nm\sigma_{\max}(\mathcal{B}^{\mathsf{T}}\mathcal{S}\mathcal{B})U_{\max}^{2}++\left\lVert{\mathcal{L}^{\mathsf{T}}\mathcal{B}}\right\rVert_{1}U_{\max},

where the first inequality follows from the fact that (a+b+c)23(a2+b2+c2)(a+b+c)^{2}\leqslant 3(a^{2}+b^{2}+c^{2}), for any a,b,c>0a,b,c>0, and the noise being zero-mean, the second inequality follows from applying norm bounds between the 2 and \infty-norms and Hölder’s inequality. As for the constraint (C4), it can be upper-bounded as follows:

𝔼𝒴t[Ut𝖳𝒮~Ut]σmax(S)Ut2Nmσmax(S)Ut2Nmσmax(S)Umax2.\displaystyle\mathbb{E}_{\mathcal{Y}_{t}}\left[U_{t}^{\mathsf{T}}\tilde{\mathcal{S}}U_{t}\right]\leqslant\sigma_{\max}(S)\left\lVert{U_{t}}\right\rVert^{2}\leqslant Nm\sigma_{\max}(S)\left\lVert{U_{t}}\right\rVert_{\infty}^{2}\leqslant Nm\sigma_{\max}(S)U_{\max}^{2}.

This completes the proof. ∎

6. Examples

Consider the system (2.1a)-(2.1b) with the following matrices:

A=[0.500001010],B=[101], and C=I.A=\left[\begin{matrix}0.5&0&0\\ 0&0&-1\\ 0&1&0\end{matrix}\right],B=\left[\begin{matrix}1\\ 0\\ 1\end{matrix}\right],\text{ and }C=I.

The simulation data was chosen to be x0𝒩(0,I)x_{0}\sim\mathcal{N}(0,I), wt𝒩(0,10I)w_{t}\sim\mathcal{N}(0,10I), vt=𝒩(0,10I)v_{t}=\mathcal{N}(0,10I), Q=IQ=I, R=1R=1, N=5N=5, Nc=κ=2N_{c}=\kappa=2, and φ\varphi the usual piecewise linear saturation function with φmax=1\varphi_{\max}=1. For this example the theoretical bound on the input is Umax=168.6783U^{*}_{\max}=168.6783 for a choice of ϵ=10\epsilon=10.

We simulated the system for the discrete-time interval [0,200][0,200] using Algorithm 1, without the constraints (C3) and (C4). The coding was done using MATLAB and the optimization problem was solved using yalmip [Löfberg, 2004] and sdpt3 [Toh et al., 1999]. The computation of the matrices Λφe(P)\Lambda^{\varphi e}(P^{\circ}), Λφ(P)\Lambda^{\varphi}(P^{\circ}), Λwφ(P)\Lambda^{w\varphi}(P^{\circ}), and Λφφ(P)\Lambda^{\varphi\varphi}(P^{\circ}) was done off-line using the steady state error covariance matrix PP^{\circ}, as discussed in the previous section, via classical Monte Carlo integration [Robert and Casella, 2004, Section 3.2] using 10510^{5} samples.

The norms of the state trajectories for 200 different sample paths of the process and measurement noises are plotted in Figure 2 starting from x0=[97.38100.1999.78]Tx_{0}=\left[\begin{matrix}97.38&100.19&99.78\end{matrix}\right]^{T}. The average state norm as well as the standard deviation of the state norm are depicted in Figure 3, where it is clear that the proposed controller renders the system mean-square bounded. The average total cost normalized by time for this simulation is plotted in Figure 4.

Refer to caption
Figure 2. State norm for 200 realizations of the process and measurement noise sequences
Refer to caption
Figure 3. Average and standard deviation of the state norm for 200 realizations of the process and measurement noise sequences
Refer to caption
Figure 4. Total average cost

7. Conclusions

We presented a method for stochastic receding horizon control of discrete-time linear systems with process and measurement noise and bounded input policies. We showed that the optimization problem solved periodically is successively feasible and convex. Moreover, we illustrated how a certain stability condition can be utilized to show that the application of the receding horizon controller renders the state of the system mean-square bounded. We discussed how certain matrices in the cost function can be computed off-line and provided an example that illustrates our approach.

References

  • Agarwal et al. [2009] Agarwal, M., Cinquemani, E., Chatterjee, D., Lygeros, J., 2009. On convexity of stochastic optimization problems with constraints. In: European Control Conference. pp. 2827–2832.
  • Anderson and Moore [1979] Anderson, B., Moore, J., 1979. Optimal Filtering. Prentice-Hall.
  • Batina [2004] Batina, I., 2004. Model predictive control for stochastic systems by randomized algorithms. Ph.D. thesis, Technische Universiteit Eindhoven.
  • Bemporad and Morari [1999] Bemporad, A., Morari, M., 1999. Robust model predictive control: a survey. Robustness in Identification and Control 245, 207–226.
  • Ben-Tal et al. [2006] Ben-Tal, A., Boyd, S., Nemirovski, A., 2006. Extending scope of robust optimization: Comprehensive robust counterparts of uncertain problems. Journal of Mathematical Programming 107, 63–89.
  • Ben-Tal et al. [2004] Ben-Tal, A., Goryashko, A., Guslitzer, E., Nemirovski, A., 2004. Adjustable robust solutions of uncertain linear programs. Mathematical Programming 99 (2), 351–376.
  • Bernstein [2009] Bernstein, D. S., 2009. Matrix Mathematics, 2nd Edition. Princeton University Press.
  • Bertsekas [2000] Bertsekas, D. P., 2000. Dynamic Programming and Optimal Control, 2nd Edition. Vol. 1. Athena Scientific.
  • Bertsekas [2005] Bertsekas, D. P., 2005. Dynamic programming and suboptimal control: a survey from ADP to MPC. European Journal of Control 11 (4-5).
  • Bertsekas [2007] Bertsekas, D. P., 2007. Dynamic Programming and Optimal Control, 3rd Edition. Vol. 2. Athena Scientific.
  • Bertsimas and Brown [2007] Bertsimas, D., Brown, D. B., 2007. Constrained stochastic LQC: a tractable approach. IEEE Transactions on Automatic Control 52 (10), 1826–1841.
  • Bertsimas et al. [2009] Bertsimas, D., Iancu, D. A., Parrilo, P. A., 2009. Optimality of affine policies in multi-stage robust optimizationPreprint available at: http://arxiv.org/abs/0904.3986.
  • Blackmore and Williams [2007] Blackmore, L., Williams, B. C., July 2007. Optimal, robust predictive control of nonlinear systems under probabilistic uncertainty using particles. In: Proceedings of the American Control Conference. pp. 1759 – 1761.
  • Boyd and Vandenberghe [2004] Boyd, S., Vandenberghe, L., 2004. Convex Optimization. Cambridge University Press, Cambridge, sixth printing with corrections, 2008.
  • Cannon et al. [2009] Cannon, M., Kouvaritakis, B., Wu, X., July 2009. Probabilistic constrained MPC for systems with multiplicative and additive stochastic uncertainty. IEEE Transactions on Automatic Control 54 (7), 1626–1632.
  • Chatterjee et al. [2009] Chatterjee, D., Hokayem, P., Lygeros, J., 2009. Stochastic receding horizon control with bounded control inputs: a vector space approach. IEEE Transactions on Automatic ControlUnder review. http://arxiv.org/abs/0903.5444.
  • Ferrante et al. [2002] Ferrante, A., Picci, G., Pinzoni, S., 2002. Silverman algorithm and the structure of discrete-time stochastic systems. Linear Algebra and its Applications 351/352, 219–242.
  • Goulart et al. [2006] Goulart, P. J., Kerrigan, E. C., Maciejowski, J. M., 2006. Optimization over state feedback policies for robust control with constraints. Automatica 42 (4), 523–533.
  • Grant and Boyd [2000] Grant, M., Boyd, S., December 2000. CVX: Matlab software for disciplined convex programming (web page and software). http://stanford.edu/~boyd/cvx.
  • Hokayem et al. [2009] Hokayem, P., Chatterjee, D., Lygeros, J., 2009. On stochastic model predictive control with bounded control inputs. In: Proceedings of the combined 48th IEEE Conference on Decision & Control and 28th Chinese Control Conference. pp. 6359–6364, available at http://arxiv.org/abs/0902.3944.
  • Horn and Johnson [1990] Horn, R. A., Johnson, C. R., 1990. Matrix Analysis. Cambridge University Press, Cambridge.
  • Jazwinski [1970] Jazwinski, A. H., 1970. Stochastic Processes and Filtering Theory. Academic Press.
  • Kamen and Su [1999] Kamen, E. W., Su, J. K., 1999. Introduction to Optimal Estimation. Springer, London, UK.
  • Klein Haneveld [1983] Klein Haneveld, W. K., 1983. On integrated chance constraints. In: Stochastic programming (Gargnano). Vol. 76 of Lecture Notes in Control and Inform. Sci. Springer, Berlin, pp. 194–209.
  • Klein Haneveld and van der Vlerk [2006] Klein Haneveld, W. K., van der Vlerk, M. H., 2006. Integrated chance constraints: reduced forms and an algorithm. Computational Management Science 3 (4), 245–269.
  • Kumar and Varaiya [1986] Kumar, P. R., Varaiya, P., 1986. Stochastic Systems: Estimation, Identification, and Adaptive Control. Prentice Hall.
  • Lazar et al. [2007] Lazar, M., Heemels, W. P. M. H., Bemporad, A., Weiland, S., 2007. Discrete-time non-smooth nonlinear MPC: stability and robustness. In: Lecture Notes in Control and Information Sciences. Vol. 358. Springer-Verlag, pp. 93–103.
  • Löfberg [2003] Löfberg, J., 2003. Minimax Approaches to Robust Model Predictive Control. Ph.D. thesis, Linköpings Universitet.
  • Löfberg [2004] Löfberg, J., 2004. YALMIP : A Toolbox for Modeling and Optimization in MATLAB. In: Proceedings of the CACSD Conference. Taipei, Taiwan.
  • Maciejowski [2001] Maciejowski, J. M., 2001. Predictive Control with Constraints. Prentice Hall.
  • Maciejowski et al. [2005] Maciejowski, M., Lecchini, A., Lygeros, J., 2005. NMPC for complex stochastic systems using Markov Chain Monte Carlo. Vol. 358/2007 of Lecture Notes in Control and Information Sciences. Springer, Stuttgart, Germany, pp. 269–281.
  • Mayne et al. [2000] Mayne, D. Q., Rawlings, J. B., Rao, C. V., Scokaert, P. O. M., Jun 2000. Constrained model predictive control: stability and optimality. Automatica 36 (6), 789–814.
  • Oldewurtel et al. [2008] Oldewurtel, F., Jones, C., Morari, M., Dec 2008. A tractable approximation of chance constrained stochastic MPC based on affine disturbance feedback. In: Conference on Decision and Control, CDC. Cancun, Mexico.
  • Pemantle and Rosenthal [1999] Pemantle, R., Rosenthal, J. S., 1999. Moment conditions for a sequence with negative drift to be uniformly bounded in LrL^{r}. Stochastic Processes and their Applications 82 (1), 143–155.
  • Primbs [2007] Primbs, J., 2007. A soft constraint approach to stochastic receding horizon control. In: Proceedings of the 46th IEEE Conference on Decision and Control. pp. 4797 – 4802.
  • Primbs and Sung [2009] Primbs, J. A., Sung, C. H., Feb. 2009. Stochastic receding horizon control of constrained linear systems with state and control multiplicative noise. IEEE Transactions on Automatic Control 54 (2), 221–230.
  • Qin and Badgwell [2003] Qin, S. J., Badgwell, T., Jul. 2003. A survey of industrial model predictive control technology. Control Engineering Practice 11 (7), 733–764.
  • Ramponi et al. [2009] Ramponi, F., Chatterjee, D., Milias-Argeitis, A., Hokayem, P., Lygeros, J., 2009. Attaining mean square boundedness of a marginally stable noisy linear system with a bounded control input. http://arxiv.org/abs/0907.1436, submitted to IEEE Transactionson Automatic Control, Revised Jan 2010.
  • Robert and Casella [2004] Robert, C. P., Casella, G., 2004. Monte Carlo Statistical Methods, 2nd Edition. Springer.
  • Skaf and Boyd [2009] Skaf, J., Boyd, S., 2009. Design of affine controllers via convex optimization. http://www.stanford.edu/~boyd/papers/affine_contr.html.
  • Toh et al. [1999] Toh, K., Todd, M., Tatuncu, R., 1999. SDPT3– a Matlab software package for semidefinite programming. Optimization Methods and Software (11), 545–581, http://www.math.nus.edu.sg/~mattohkc/sdpt3.html.
  • van Hessem and Bosgra [2003] van Hessem, D. H., Bosgra, O. H., 2003. A full solution to the constrained stochastic closed-loop MPC problem via state and innovations feedback and its receding horizon implementation. In: Proceedings of the 42nd IEEE Conference on Decision and Control. Vol. 1. pp. 929–934.
  • Wang and Boyd [2009] Wang, Y., Boyd, S., 2009. Peformance bounds for linear stochastic control. Systems and Control Letters 58 (3), 178–182.
  • Yang et al. [1997] Yang, Y. D., Sontag, E. D., Sussmann, H. J., 1997. Global stabilization of linear discrete-time systems with bounded feedback. Systems and Control Letters 30 (5), 273–281.

Appendix A Proof of Proposition 3

For t=0t=0, x0𝒩(x^0|1,P0|1)x_{0}\sim\mathcal{N}({\hat{x}}_{0|-1},P_{0|-1}), with P0|1>0P_{0|-1}>0, by assumption. Assume now that, for a given t0t\geqslant 0, f(xt|𝒴t1)f(x_{t}|\mathcal{Y}_{t-1}) is normal with mean x^t|t1{\hat{x}}_{t|t-1} and covariance matrix Pt|t1>0P_{t|t-1}>0. By Assumption 1-(ii) and dynamics in (2.1b), f(yt|xt,𝒴t1)f(y_{t}|x_{t},\mathcal{Y}_{t-1}) is also normal with mean CxtCx_{t} and covariance matrix Σv\Sigma_{v}. Hence, applying Bayes rule we may write

f(xt|𝒴t)=f(yt|xt,𝒴t1)f(xt|𝒴t1)f(yt|𝒴t1),f(x_{t}|\mathcal{Y}_{t})=\frac{f(y_{t}|x_{t},\mathcal{Y}_{t-1})f(x_{t}|\mathcal{Y}_{t-1})}{f(y_{t}|\mathcal{Y}_{t-1})},

where we have f(yt|𝒴t1)=f(yt|xt,𝒴t)f(xt|𝒴t1)dxtf(y_{t}|\mathcal{Y}_{t-1})=\int f(y_{t}|x_{t},\mathcal{Y}_{t})f(x_{t}|\mathcal{Y}_{t-1})\text{d}x_{t} by the Chapman-Kolmogorov equation. It follows that

f(xt|𝒴t)\displaystyle f(x_{t}|\mathcal{Y}_{t}) f(yt|xt,𝒴t1)f(xt|𝒴t1)\displaystyle\propto f(y_{t}|x_{t},\mathcal{Y}_{t-1})f(x_{t}|\mathcal{Y}_{t-1})
exp(12[(ytCxt)𝖳Σv1(ytCxt)+(xtx^t|t1)𝖳Pt|t11(xtx^t|t1)]),\displaystyle\propto\exp\biggl{(}-\frac{1}{2}\Bigl{[}(y_{t}-Cx_{t})^{\mathsf{T}}\Sigma_{v}^{-1}(y_{t}-Cx_{t})+(x_{t}-\hat{x}_{t|t-1})^{\mathsf{T}}P_{t|t-1}^{-1}(x_{t}-\hat{x}_{t|t-1})\Bigr{]}\biggr{)},

where the proportionality constants do not depend on xtx_{t}. Let us now focus on the term within square brackets and write xx, yy, x^\hat{x} and PP in place of xtx_{t}, yty_{t}, x^t|t1\hat{x}_{t|t-1} and Pt|t1P_{t|t-1} for shortness. Expanding the products and collecting the linear and quadratic terms in xx one gets

x𝖳(C𝖳Σv1C+P1)x2x𝖳(C𝖳Σv1y+P1x^)+(y𝖳Σv1y+x^𝖳P1x^).\displaystyle x^{\mathsf{T}}(C^{\mathsf{T}}\Sigma_{v}^{-1}C+P^{-1})x-2x^{\mathsf{T}}(C^{\mathsf{T}}\Sigma_{v}^{-1}y+P^{-1}\hat{x})+(y^{\mathsf{T}}\Sigma_{v}^{-1}y+\hat{x}^{\mathsf{T}}P^{-1}\hat{x}).

Since P>0P>0 and Σv>0\Sigma_{v}>0 by assumption, it follows that C𝖳Σv1C+P1>0C^{\mathsf{T}}\Sigma_{v}^{-1}C+P^{-1}>0. If we let P=(C𝖳Σv1C+P1)1P_{*}=(C^{\mathsf{T}}\Sigma_{v}^{-1}C+P^{-1})^{-1}, the expression above can be rewritten as

(P1x)𝖳P(P1x)2(P1x)𝖳P(C𝖳Σv1y+P1x^)+(y𝖳Σv1y+x^𝖳P1x^).\displaystyle(P_{*}^{-1}x)^{\mathsf{T}}P_{*}(P_{*}^{-1}x)-2(P_{*}^{-1}x)^{\mathsf{T}}P_{*}(C^{\mathsf{T}}\Sigma_{v}^{-1}y+P^{-1}\hat{x})+(y^{\mathsf{T}}\Sigma_{v}^{-1}y+\hat{x}^{\mathsf{T}}P^{-1}\hat{x}).

Completing the square, the latter expression becomes

[P1x(C𝖳Σv1y+P1x^)]𝖳P[P1x(C𝖳Σv1y+P1x^)]+c,\displaystyle[P_{*}^{-1}x-(C^{\mathsf{T}}\Sigma_{v}^{-1}y+P^{-1}\hat{x})]^{\mathsf{T}}P_{*}[P_{*}^{-1}x-(C^{\mathsf{T}}\Sigma_{v}^{-1}y+P^{-1}\hat{x})]+c,

where cc depends on x^\hat{x} and yy but not on xx. Factoring P1P_{*}^{-1} out of the two terms P1x(C𝖳Σv1y+P1x^)P_{*}^{-1}x-(C^{\mathsf{T}}\Sigma_{v}^{-1}y+P^{-1}\hat{x}) and simplifying yields

[xP(C𝖳Σv1y+P1x^)]𝖳P1[xP(C𝖳Σv1y+P1x^)]+c.\displaystyle[x-P_{*}(C^{\mathsf{T}}\Sigma_{v}^{-1}y+P^{-1}\hat{x})]^{\mathsf{T}}P_{*}^{-1}[x-P_{*}(C^{\mathsf{T}}\Sigma_{v}^{-1}y+P^{-1}\hat{x})]+c.

Hence,

f(xt|𝒴t)exp(12(xtx^)𝖳P1(xtx^)),f(x_{t}|\mathcal{Y}_{t})\propto\exp\left(-\frac{1}{2}(x_{t}-\hat{x}_{*})^{\mathsf{T}}P_{*}^{-1}(x_{t}-\hat{x}_{*})\right),

with x^=P(C𝖳Σv1y+P1x^)\hat{x}_{*}=P_{*}(C^{\mathsf{T}}\Sigma_{v}^{-1}y+P^{-1}\hat{x}), where the missing normalization constant does not depend on xtx_{t}. Hence, f(xt|𝒴t)f(x_{t}|\mathcal{Y}_{t}) is normal with mean x^t|t=x^\hat{x}_{t|t}=\hat{x}_{*} and covariance matrix Pt|t=PP_{t|t}=P_{*}. Using the matrix inversion lemma, P=PPC𝖳(CPC𝖳+Σv)1CPP_{*}=P-PC^{\mathsf{T}}(CPC^{\mathsf{T}}+\Sigma_{v})^{-1}CP, which is (2.7). To obtain (2.6), replace yy by (yCx^)+Cx^(y-C\hat{x})+C\hat{x} in the expression of x^\hat{x}_{*} to get

x^\displaystyle\hat{x}_{*} =P[C𝖳Σv1(yCx^)+C𝖳Σv1Cx^+P1x^]\displaystyle=P_{*}\left[C^{\mathsf{T}}\Sigma_{v}^{-1}(y-C\hat{x})+C^{\mathsf{T}}\Sigma_{v}^{-1}C\hat{x}+P^{-1}\hat{x}\right]
=PC𝖳Σv1(yCx^)+P(C𝖳Σv1C+P1)x^\displaystyle=P_{*}C^{\mathsf{T}}\Sigma_{v}^{-1}(y-C\hat{x})+P_{*}(C^{\mathsf{T}}\Sigma_{v}^{-1}C+P^{-1})\hat{x}
=PC𝖳Σv1(yCx^)+x^.\displaystyle=P_{*}C^{\mathsf{T}}\Sigma_{v}^{-1}(y-C\hat{x})+\hat{x}.

Finally, using the identity I=(CPC𝖳+Σv)(CPC𝖳+Σv)1I=(CPC^{\mathsf{T}}+\Sigma_{v})(CPC^{\mathsf{T}}+\Sigma_{v})^{-1} (where the inverse exists because P0P\geqslant 0 and Σv>0\Sigma_{v}>0) and the definition of PP_{*},

PC𝖳Σv1\displaystyle P_{*}C^{\mathsf{T}}\Sigma_{v}^{-1} =PC𝖳Σv1(CPC𝖳+Σv)(CPC𝖳+Σv)1\displaystyle=P_{*}C^{\mathsf{T}}\Sigma_{v}^{-1}(CPC^{\mathsf{T}}+\Sigma_{v})(CPC^{\mathsf{T}}+\Sigma_{v})^{-1}
=P(C𝖳Σv1CPC𝖳+C𝖳)(CPC𝖳+Σv)1\displaystyle=P_{*}(C^{\mathsf{T}}\Sigma_{v}^{-1}CPC^{\mathsf{T}}+C^{\mathsf{T}})(CPC^{\mathsf{T}}+\Sigma_{v})^{-1}
=P(C𝖳Σv1C+P1)PC𝖳(CPC𝖳+Σv)1\displaystyle=P_{*}(C^{\mathsf{T}}\Sigma_{v}^{-1}C+P^{-1})PC^{\mathsf{T}}(CPC^{\mathsf{T}}+\Sigma_{v})^{-1}
=PC𝖳(CPC¯𝖳+Σv)1,\displaystyle=PC^{\mathsf{T}}(CP\bar{C}^{\mathsf{T}}+\Sigma_{v})^{-1},

which leads to (2.6).

To conclude the proof, we need to compute the density f(xt+1|𝒴t)f(x_{t+1}|\mathcal{Y}_{t}) and prove (2.8)–(2.9). Using the Chapman-Kolmogorov equation, we can compute the density as follows

f(xt+1|𝒴t)=f(xt+1|xt,𝒴t)f(xt|𝒴t)dxt.f(x_{t+1}|\mathcal{Y}_{t})=\int f(x_{t+1}|x_{t},\mathcal{Y}_{t})f(x_{t}|\mathcal{Y}_{t})\text{d}x_{t}. (A.1)

However, the explicit computation of the integral in (A.1) is a very tedious exercise. We shall instead rely on characteristic functions. Recall that the characteristic function of an nn-dimensional Gaussian random vector ξ\xi with mean μ\mu and covariance Σ0\Sigma\geqslant 0 is given by Ψ(z)=𝔼[exp(iz𝖳ξ)]=exp(iz𝖳μ12z𝖳Σz)\Psi(z)=\mathbb{E}[\exp{(iz^{\mathsf{T}}\xi)}]=\exp\left(iz^{\mathsf{T}}\mu-\frac{1}{2}z^{\mathsf{T}}\Sigma z\right), where i1i\coloneqq\sqrt{-1} and znz\in\mathbb{R}^{n}. The characteristic function of xt+1x_{t+1} given 𝒴t\mathcal{Y}_{t} is then

Ψ(z)\displaystyle\Psi(z) =𝔼[exp(iz𝖳xt+1)|𝒴t]\displaystyle=\mathbb{E}[\exp(iz^{\mathsf{T}}x_{t+1})|\mathcal{Y}_{t}]
=𝔼[𝔼[exp(iz𝖳(Axt+But+wt))|xt,𝒴t]|𝒴t]\displaystyle=\mathbb{E}\left[\mathbb{E}\big{[}\exp\big{(}iz^{\mathsf{T}}(Ax_{t}+Bu_{t}+w_{t})\big{)}\,\big{|}\,x_{t},\mathcal{Y}_{t}\big{]}\,\big{|}\,\mathcal{Y}_{t}\right]
=𝔼[exp(iz𝖳Axt+iz𝖳Bgt(𝒴t))×𝔼[exp(iz𝖳wt)|xt,𝒴t]|𝒴t],\displaystyle=\mathbb{E}\left[\exp\big{(}iz^{\mathsf{T}}Ax_{t}+iz^{\mathsf{T}}Bg_{t}(\mathcal{Y}_{t})\big{)}\right.\times\left.\mathbb{E}\big{[}\exp(iz^{\mathsf{T}}w_{t})\big{|}x_{t},\mathcal{Y}_{t}\big{]}\big{|}\mathcal{Y}_{t}\right],

where we have used system dynamics in (2.1a) and the general definition of the feedback policies (2.4). Since wtw_{t} is independent of xtx_{t} and 𝒴t\mathcal{Y}_{t} and it is Gaussian with mean 0 and covariance Σw\Sigma_{w}, 𝔼[exp(iz𝖳wt)|xt,𝒴t]=exp(12z𝖳Σwz)\mathbb{E}\big{[}\exp(iz^{\mathsf{T}}w_{t})\big{|}x_{t},\mathcal{Y}_{t}\big{]}=\exp\left(-\frac{1}{2}z^{\mathsf{T}}\Sigma_{w}z\right). Therefore the last expression in the chain becomes

𝔼[exp(iz𝖳Axt)|𝒴t]exp(iz𝖳Bgt(𝒴t))exp(12z𝖳Σwz).\mathbb{E}\left[\exp\left(iz^{\mathsf{T}}Ax_{t}\right)\ |\ \mathcal{Y}_{t}\right]\exp\left(iz^{\mathsf{T}}Bg_{t}(\mathcal{Y}_{t})\right)\exp\left(-\frac{1}{2}z^{\mathsf{T}}\Sigma_{w}z\right).

It was proved above that, conditionally on 𝒴t\mathcal{Y}_{t}, xtx_{t} is Gaussian with mean x^t|t{\hat{x}}_{t|t} and covariance matrix Pt|tP_{t|t}. Hence

𝔼[exp(iz𝖳Axt)|𝒴t]\displaystyle\mathbb{E}[\exp(iz^{\mathsf{T}}Ax_{t})|\mathcal{Y}_{t}] =𝔼[exp(i(A𝖳z)𝖳xt)|𝒴t]\displaystyle=\mathbb{E}\left[\exp\big{(}i(A^{\mathsf{T}}z)^{\mathsf{T}}x_{t}\big{)}\big{|}\mathcal{Y}_{t}\right]
=exp(i(A𝖳z)𝖳x^t|t12(A𝖳z)𝖳Pt|t(A𝖳z)),\displaystyle=\exp\left(i(A^{\mathsf{T}}z)^{\mathsf{T}}{\hat{x}}_{t|t}-\frac{1}{2}(A^{\mathsf{T}}z)^{\mathsf{T}}P_{t|t}(A^{\mathsf{T}}z)\right),

and consequently

Ψ(z)\displaystyle\Psi(z) =exp(iz𝖳Ax^t|t12z𝖳APt|tA𝖳z)×exp(iz𝖳Bgt(𝒴t))exp(12z𝖳Σwz)\displaystyle=\exp\left(iz^{\mathsf{T}}A\hat{x}_{t|t}-\frac{1}{2}z^{\mathsf{T}}AP_{t|t}A^{\mathsf{T}}z\right)\times\exp\left(iz^{\mathsf{T}}Bg_{t}(\mathcal{Y}_{t})\right)\exp\left(-\frac{1}{2}z^{\mathsf{T}}\Sigma_{w}z\right)
=exp(iz𝖳(Ax^t|t+Bgt(𝒴t))12z𝖳(APt|tA𝖳+Σw)z),\displaystyle=\exp\Bigl{(}iz^{\mathsf{T}}\big{(}A{\hat{x}}_{t|t}+Bg_{t}(\mathcal{Y}_{t})\big{)}-\frac{1}{2}z^{\mathsf{T}}(AP_{t|t}A^{\mathsf{T}}+\Sigma_{w})z\Bigr{)},

which is the characteristic function of a Gaussian random vector with mean x^t+1|t=Ax^t|t+Bgt(𝒴t)\hat{x}_{t+1|t}=A\hat{x}_{t|t}+Bg_{t}(\mathcal{Y}_{t}) and covariance matrix Pt+1|t=APt|tA𝖳+Σw>0P_{t+1|t}=AP_{t|t}A^{\mathsf{T}}+\Sigma_{w}>0.

Appendix B

Lemma 15.

Consider the system (2.1a)-(2.1b), and suppose that (A,Σw1/2)(A,\Sigma_{w}^{1/2}) be stabilizable and (A,C)(A,C) be observable. In addition, assume that P0|00P_{0|0}\geqslant 0. Then there exist constants ρ>0\rho>0 and an integer TT large enough such that

𝔼𝒴t[xtx^t2]=𝗍𝗋(Pt|t)ρfor all tT.\mathbb{E}_{\mathcal{Y}_{t}}\bigl{[}\left\lVert{x_{t}-\hat{x}_{t}}\right\rVert^{2}\bigr{]}=\mathsf{tr}\!\left(P_{t|t}\right)\leqslant\rho\qquad\text{for all }t\geqslant T. (B.1)
Proof.

First, observe that

i=0κ11AiΣw(Ai)𝖳=[Σw1/2AΣw1/2Aκ11Σw1/2][Σw1/2AΣw1/2Aκ11Σw1/2]𝖳,\sum_{i=0}^{\kappa_{1}-1}A^{i}\Sigma_{w}(A^{i})^{\mathsf{T}}=\Bigl{[}\Sigma_{w}^{1/2}\;A\Sigma_{w}^{1/2}\;\cdots\;A^{\kappa_{1}-1}\Sigma_{w}^{1/2}\Bigr{]}\Bigl{[}\Sigma_{w}^{1/2}\;A\Sigma_{w}^{1/2}\;\cdots\;A^{\kappa_{1}-1}\Sigma_{w}^{1/2}\Bigr{]}^{\mathsf{T}},

and since (A,Σw1/2)(A,\Sigma_{w}^{1/2}) is controllable by Assumption 5-(iii), we see that there exists κ1+\kappa_{1}\in\mathbb{N}_{+} such that for all kκ1k\geqslant\kappa_{1} the rank of [Σw1/2AΣw1/2Aκ11Σw1/2]=n\Bigl{[}\Sigma_{w}^{1/2}\;A\Sigma_{w}^{1/2}\;\cdots\;A^{\kappa_{1}-1}\Sigma_{w}^{1/2}\Bigr{]}=n; indeed, κ1\kappa_{1} is the reachability index of (A,Σw1/2)(A,\Sigma_{w}^{1/2}). Thus, i=0κ11AiΣw(Ai)𝖳\sum_{i=0}^{\kappa_{1}-1}A^{i}\Sigma_{w}(A^{i})^{\mathsf{T}} is positive definite, and therefore, there exists some δ1,δ2>0\delta_{1}^{\prime},\delta_{2}^{\prime}>0 such that

δ1Ii=0κ11AiΣw(Ai)𝖳δ2I.\delta_{1}^{\prime}I\leqslant\sum_{i=0}^{\kappa_{1}-1}A^{i}\Sigma_{w}(A^{i})^{\mathsf{T}}\leqslant\delta_{2}^{\prime}I. (B.2)

Second, observe that333Here \otimes denotes the standard Kronecker product.

i=0κ21(Ai)𝖳C𝖳Σv1CAi=[CCACAκ21]𝖳(Iκ2Σv1)[CCACAκ21].\sum_{i=0}^{\kappa_{2}-1}(A^{i})^{\mathsf{T}}C^{\mathsf{T}}\Sigma_{v}^{-1}CA^{i}=\begin{bmatrix}C\\ CA\\ \vdots\\ CA^{\kappa_{2}-1}\end{bmatrix}^{\mathsf{T}}\bigl{(}I_{\kappa_{2}}\otimes\Sigma_{v}^{-1}\bigr{)}\begin{bmatrix}C\\ CA\\ \vdots\\ CA^{\kappa_{2}-1}\end{bmatrix}.

Since (A,C)(A,C) is observable by assumption, there exists κ2+\kappa_{2}\in\mathbb{N}_{+} such that the rank of the matrix [C𝖳A𝖳C𝖳(Aκ21)𝖳C𝖳]𝖳\Bigl{[}C^{\mathsf{T}}\;A^{\mathsf{T}}C^{\mathsf{T}}\;\cdots\;(A^{\kappa_{2}-1})^{\mathsf{T}}C^{\mathsf{T}}\Bigr{]}^{\mathsf{T}} is nn. The matrix Iκ2Σv1I_{\kappa_{2}}\otimes\Sigma_{v}^{-1} is clearly positive definite by Assumption 5(iii), and therefore, we see that there exists δ1′′,δ2′′>0\delta_{1}^{\prime\prime},\delta_{2}^{\prime\prime}>0 such that

δ1′′Ii=0κ21(Ai)𝖳C𝖳Σv1CAiδ2′′I.\delta_{1}^{\prime\prime}I\leqslant\sum_{i=0}^{\kappa_{2}-1}(A^{i})^{\mathsf{T}}C^{\mathsf{T}}\Sigma_{v}^{-1}CA^{i}\leqslant\delta_{2}^{\prime\prime}I. (B.3)

Third, the conditions of Lemma 7.1 in [Jazwinski, 1970, pp. 234] are satisfied, as (B.2) and (B.3) hold with δ1=min{δ1,δ1′′}\delta_{1}=\min\{\delta_{1}^{\prime},\delta_{1}^{\prime\prime}\} and δ2=max{δ2,δ2′′}\delta_{2}=\max\{\delta_{2}^{\prime},\delta_{2}^{\prime\prime}\}, and the bound Pt|tρIP_{t|t}\leqslant\rho^{\prime}I for some ρ>0\rho^{\prime}>0 is established for all tTmax{κ1,κ2}t\geqslant T\coloneqq\max\{\kappa_{1},\ \kappa_{2}\}. The assertion now follows immediately from:

𝔼𝒴t[xtx^t2]=𝗍𝗋(Pt|t)nλmax(Pt|t)nρρ.\mathbb{E}_{\mathcal{Y}_{t}}\bigl{[}\left\lVert{x_{t}-\hat{x}_{t}}\right\rVert^{2}\bigr{]}=\mathsf{tr}\!\left(P_{t|t}\right)\leqslant n\lambda_{\max}(P_{t|t})\leqslant n\rho^{\prime}\eqqcolon\rho.\qed