
Distributionally Robust Model Predictive Control with Mixture of Gaussian Processes*

Jingyi Wu and Chao Ning *This work was partially supported by the National Natural Science Foundation of China under Grants 62473256 and 62103264 (Corresponding author: Chao Ning). J. Wu and C. Ning are with the Department of Automation, Shanghai Jiao Tong University.
Abstract

Despite the success of Gaussian process based Model Predictive Control (MPC) in robotic control, its applicability is greatly hindered by multimodal disturbances that are prevalent in real-world settings. Here we propose a novel Mixture of Gaussian Processes based Distributionally Robust MPC (MoGP-DR-MPC) framework for linear time-invariant systems subject to potentially multimodal state-dependent disturbances. This framework utilizes MoGP to automatically determine the number of modes from disturbance data. Using the mean and variance information provided by each mode-specific predictive distribution, it constructs a data-driven state-dependent ambiguity set, which allows for flexible and fine-grained disturbance modeling. Based on this ambiguity set, we impose Distributionally Robust Conditional Value-at-Risk (DR-CVaR) constraints to effectively achieve distributional robustness against errors in the predictive distributions. To address the computational challenge posed by these constraints in the resulting MPC problem, we equivalently reformulate the DR-CVaR constraints into tractable second-order cone constraints. Furthermore, we provide theoretical guarantees on the recursive feasibility and stability of the proposed framework. The enhanced control performance of MoGP-DR-MPC is validated through both numerical experiments and simulations on a quadrotor system, demonstrating notable reductions in closed-loop cost of 17% and 4%, respectively, compared with Gaussian process based MPC.

I Introduction

Multimodal state-dependent disturbances are pervasive in real-world applications, posing significant challenges to control synthesis because of their inherent complexity and variability. Gaussian Process based Model Predictive Control (GP-MPC) offers an effective approach for controlling systems subject to state-dependent disturbances, and its control performance largely depends on the accuracy of the GP model. However, due to the stationary kernels employed in GP, the performance of GP-MPC deteriorates markedly under multimodal disturbances. Such performance decline underscores the pressing research need for new MPC strategies that tackle the challenges introduced by heterogeneous modalities.

In the face of unknown disturbances, different strategies have been proposed for GP-MPC. Among them, a robust strategy is to enforce constraint satisfaction under the worst-case scenario to ensure complete safety [1]. For instance, References [2] and [3] used confidence interval bounds and worst-case error bounds, respectively, to tighten constraints. However, such robust methods tend to be overly conservative. An alternative is to integrate GP with stochastic MPC, which employs chance constraints to substantially mitigate conservatism [4]. To expedite the solution process, Reference [5] used probabilistic reachable sets to reformulate chance constraints, while [6] derived a deterministic solution method through online GP-based binary regression. When the underlying distribution is not perfectly known, stochastic GP-MPC methods fall short of guaranteeing probabilistic constraint satisfaction. To address this issue, Distributionally Robust MPC (DR-MPC) has been proposed and gained tremendous popularity, ensuring robustness over a so-called ambiguity set [7], [8]. DR-MPC can also integrate data-driven techniques to extract moment information or approximate distributions, facilitating the construction of the ambiguity set [9], [10], [11]. Notably, these DR-MPC approaches typically assume that disturbances are independent of system states, overlooking state-dependent scenarios. Drawing upon the capability of DR-MPC to tackle distributional uncertainties, Reference [12] introduced Distributionally Robust Conditional Value-at-Risk (DR-CVaR) constraints into GP-MPC, enhancing safety even when the true distribution deviates from the estimated one. However, a critical research gap remains: most existing GP-MPC research focuses on unimodal disturbances, greatly limiting its applicability in real-world settings where multimodal state-dependent disturbances are prevalent.

To fill this challenging gap, we propose a Mixture of Gaussian Processes based Distributionally Robust MPC (MoGP-DR-MPC) framework. To immunize against errors in the predictive distributions, an ambiguity set is constructed based on the structure of the modalities and the mode-specific predictive distributions. Within this framework, DR-CVaR constraints are imposed on the first predicted system state, while subsequent constraints are tightened via an invariant set [13]. We equivalently reformulate the DR-CVaR constraints as second-order cone constraints to facilitate the computation of control actions [14], and provide theoretical guarantees for the proposed framework. Its effectiveness and superiority are substantiated through two case studies. The major contributions are summarized as follows:

  • We develop a novel DR-MPC framework that delicately integrates MoGP with MPC to enhance control performance in the presence of state-dependent and potentially multimodal disturbances, while ensuring recursive feasibility and stability.

  • We propose an innovative data-driven state-dependent ambiguity set, which closely respects the inherent multimodalities and efficaciously enables the incorporation of fine-grained moment information.

II PROBLEM FORMULATION

We consider the following stochastic linear time-invariant discrete-time system.

x_{k+1}=Ax_{k}+Bu_{k}+w(x_{k}), (1)

where x_{k}\in\mathbb{R}^{n}, u_{k}\in\mathbb{R}^{m} and w(x_{k})\in\mathbb{R}^{n} denote the system state, control input and disturbance at time k, respectively. In particular, the unknown disturbance w(\cdot) is state-dependent, possibly multimodal, and bounded by the support \mathbb{W}, which is represented by a box constraint set [\tilde{w}_{1},\tilde{w}_{2}]\subseteq\mathbb{R}^{n}. Assume the system matrices (A,B) are known and (A,B) is stabilizable. The system is subject to the probabilistic state constraints and hard input constraints defined as follows.

\Pr_{x_{k}\sim\mathbb{P}}(x_{k}\in\mathbb{X})\geq 1-\varepsilon, (2a)
u_{k}\in\mathbb{U}, (2b)

where \mathbb{X}=[\tilde{x}_{1},\tilde{x}_{2}]\subseteq\mathbb{R}^{n} and \mathbb{U}=\{u\mid H_{u}u\leq h_{u}\}. These sets are compact and contain the origin within their interiors. The notation \Pr represents the probability and \varepsilon is the prescribed tolerance level that curbs the probability of constraint violation.

Since the underlying distribution \mathbb{P} is not perfectly known, we propose using a DR-CVaR version of Constraint (2a), which can be defined as

\sup_{\mathbb{Q}\in Q}\mathbb{Q}\text{-CVaR}_{\varepsilon}\left\{(-1)^{i}\left(x_{k}^{(j)}-\tilde{x}_{i}^{(j)}\right)\right\}\leq 0,\quad i\in\{1,2\},\; j\in\{1,\ldots,n\}, (3)

where the superscript (\cdot)^{(j)} represents the j-th element of the corresponding vector throughout this paper, and Q is an ambiguity set derived from historical data using the learning techniques presented in the following section. Although Constraint (3) may appear more complex, it admits an exact and convex reformulation that is more tractable in computation than the original chance constraint.
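For reference, the worst-case CVaR in (3) follows the standard Rockafellar–Uryasev variational form, which is also the form used later in the proof of Theorem 1:

\sup_{\mathbb{Q}\in Q}\mathbb{Q}\text{-CVaR}_{\varepsilon}(X)=\sup_{\mathbb{Q}\in Q}\inf_{\beta\in\mathbb{R}}\left\{\beta+\frac{1}{\varepsilon}\mathbb{E}_{\mathbb{Q}}\left[(X-\beta)^{+}\right]\right\},

so (3) requires the worst-case CVaR of each constraint violation term to remain nonpositive over every distribution in the ambiguity set Q.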

III INFINITE MIXTURE OF GPs

To accurately model the state-dependent, potentially multimodal disturbance w(\cdot) and derive the corresponding ambiguity set, we adopt an MoGP approach inspired by the Mixture of Experts (MoE) framework [15]. This method can naturally incorporate the multimodal structure into the model of w(\cdot).

In MoGP, data points are assigned to local experts via a gating network, with each expert then using the assigned data to train its local GP model. Based on this strategy, the observation likelihood for the MoGP model is given by

\Pr(\mathbf{y}|\mathbf{z},\theta)=\prod_{i}\sum_{j}\Pr(y_{i}|c_{i}=j,z_{i},\theta_{j})\Pr(c_{i}=j|z_{i},\phi),

where \mathbf{z} represents the training inputs, \mathbf{y} denotes the outputs, \theta_{j} denotes the parameters of expert j, \phi refers to the parameters of the gating network, and c_{i} represents the indicator variables specifying the corresponding experts. The number of experts is automatically determined from data by maximizing the likelihood.

III-A Gating Network Based on Dirichlet Process

To accommodate the state-dependent property, we tailor the Dirichlet process to serve as an input-dependent gating network. In the standard Dirichlet process with concentration parameter \alpha, the conditional probability of the indicator variable c_{i} is given by

\begin{cases}p(c_{i}=j|\mathbf{c}_{-i},\alpha)=\frac{n_{-i,j}}{n-1+\alpha}, & (n_{-i,j}>0)\\ p(c_{i}\neq c_{i^{\prime}}\text{ for all }i^{\prime}\neq i|\mathbf{c}_{-i},\alpha)=\frac{\alpha}{n-1+\alpha}, & (n_{-i,j}=0)\end{cases} (4)

where n_{-i,j}=\sum_{i^{\prime}\neq i}\delta(c_{i^{\prime}},j) represents the number of data points assigned to expert j excluding observation i, and n is the total number of data points. To incorporate input dependence into the gating network, we redefine n_{-i,j} as follows.

n_{-i,j}=(n-1)\frac{\sum_{i^{\prime}\neq i}K_{\phi}(z_{i},z_{i^{\prime}})\delta(c_{i^{\prime}},j)}{\sum_{i^{\prime}\neq i}K_{\phi}(z_{i},z_{i^{\prime}})}, (5)

where K_{\phi} is a kernel function parametrized by \phi and \delta is the Kronecker delta. Consequently, given a new data point, the conditional distribution of its indicator variable can be obtained by substituting (5) into (4).
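To make the gating computation concrete, the following is a minimal Python sketch of the kernel-weighted counts (5) plugged into the Dirichlet process conditional (4) for a new input; the squared-exponential kernel and the function interface are assumptions of this sketch, not part of the paper.

```python
import numpy as np

def gating_probs(z_new, Z, c, alpha, length_scale=1.0):
    """Assignment probabilities (4) with the kernel-weighted counts (5) at a new input.

    Z : (n, d) training inputs, c : (n,) expert labels, alpha : DP concentration.
    The RBF kernel and this interface are illustrative assumptions.
    """
    c = np.asarray(c)
    n = len(c)
    K = np.exp(-0.5 * np.sum((Z - z_new) ** 2, axis=1) / length_scale**2)
    experts = np.unique(c)
    # kernel-weighted occupation numbers n_{-i,j}, evaluated at the new input
    n_j = n * np.array([K[c == j].sum() for j in experts]) / K.sum()
    probs = np.append(n_j, alpha) / (n + alpha)   # last entry: probability of a new expert
    return experts, probs
```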

III-B Local Gaussian Process

For a local GP expert j, let \mathbf{z}=\{z_{1},\ldots,z_{N}\} represent the training inputs and \mathbf{y}=\{y_{1},\ldots,y_{N}\} denote the corresponding noisy outputs, where y_{i}=w(z_{i})+\epsilon_{i}. The observation noise \epsilon_{i} is assumed to be i.i.d., following a normal distribution with mean 0 and covariance \Sigma_{on}=\mathrm{diag}\{\sigma_{1}^{2},\ldots,\sigma_{n}^{2}\}. Each output dimension is modeled independently. For dimension j, we define a GP prior with kernel k(\cdot,\cdot) and mean function m(\cdot). Conditioning on the data \mathbf{z} and \mathbf{y}^{(j)}=\{y_{1}^{(j)},\ldots,y_{N}^{(j)}\}, the predictive posterior distribution of the output at a new data point z^{*} is Gaussian with the following mean and variance.

\mu_{j}(z^{*})=k(z^{*},\mathbf{z})K_{\sigma_{j}}^{-1}\mathbf{y}^{(j)}, (6)
\Sigma_{j}(z^{*})=k(z^{*},z^{*})-k(z^{*},\mathbf{z})K_{\sigma_{j}}^{-1}k(\mathbf{z},z^{*}),

where K_{\sigma_{j}}=k(\mathbf{z},\mathbf{z})+\sigma_{j}^{2}I.
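As an illustration, a minimal numpy sketch of the posterior mean and variance (6) for one output dimension is given below; the RBF kernel and the zero prior mean are assumptions of this sketch.

```python
import numpy as np

def rbf(ZA, ZB, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel matrix between two input sets (assumed kernel choice)."""
    d2 = np.sum(ZA**2, 1)[:, None] + np.sum(ZB**2, 1)[None, :] - 2 * ZA @ ZB.T
    return signal_var * np.exp(-0.5 * d2 / length_scale**2)

def gp_posterior(z_star, Z, y, noise_var):
    """Posterior mean and variance (6) at a test input z_star of shape (1, d)."""
    K = rbf(Z, Z) + noise_var * np.eye(len(Z))       # K_{sigma_j}
    k_star = rbf(z_star, Z)                          # k(z*, z)
    L = np.linalg.cholesky(K)                        # Cholesky solve for numerical stability
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    v = np.linalg.solve(L, k_star.T)
    mean = k_star @ alpha                            # k(z*, z) K^{-1} y
    var = rbf(z_star, z_star) - v.T @ v              # k(z*, z*) - k(z*, z) K^{-1} k(z, z*)
    return mean.item(), var.item()
```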

Based on the above analysis, the predictive distribution of w(z^{*}) in dimension a generated by MoGP can be expressed by combining the conditional distributions of the indicator variables from the gating network with the results of local GP regression, as shown in (7).

\Pr(y^{*}\mid\mathcal{D})=\sum_{j=1}^{M}\gamma_{a,j}(z^{*})\,\mathcal{N}\left(y^{*}\mid\mu_{a,j}(z^{*}),\Sigma_{a,j}(z^{*})\right), (7)

where M is the number of experts, \mathcal{D} represents the training dataset, and \gamma_{a,j} denotes the weight of the corresponding Gaussian component obtained from the Dirichlet process. This formulation effectively captures the potential multimodalities in the latent function. However, errors between the predictive distribution and the true underlying distribution still exist. To mitigate this uncertainty, we construct an ambiguity set in the following section.

IV CONTROLLER DESIGN

With the MoGP model for w(\cdot), we design a controller tailored to system (1) in this section. To decouple the impact of disturbances, we decompose the system state x_{k} into a nominal component s_{k} and an error component e_{k}. The feedback control law is designed as u_{k}=Ke_{k}+v_{k}, where v_{k} is the nominal input and K is a feedback gain matrix computed offline such that A+BK is stable. Employing this control law, the corresponding nominal and error dynamics are given below [13].

s_{k+1}=As_{k}+Bv_{k}, (8a)
e_{k+1}=(A+BK)e_{k}+w(x_{k}). (8b)
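The paper only requires that K be computed offline with A+BK stable and that the terminal weight P used later in (14) solve the Riccati equation; one common way to obtain both is the discrete-time LQR, sketched below with scipy. Using LQR, reusing the MPC weights Q, R, and instantiating the example matrices from (26) are assumptions made for illustration.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

def lqr_gain(A, B, Q, R):
    """Discrete-time LQR: P solves the Riccati equation, K renders A+BK Schur stable."""
    P = solve_discrete_are(A, B, Q, R)
    K = -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)   # u = K e (note the sign convention)
    return K, P

A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.5], [1.0]])
K, P = lqr_gain(A, B, np.eye(2), 0.1 * np.eye(1))
assert np.max(np.abs(np.linalg.eigvals(A + B @ K))) < 1.0   # closed-loop stability check
```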

IV-A Ambiguity Set Construction

To formulate the DR-CVaR constraints on system states, we construct a state-dependent ambiguity set that accounts for potential multimodalities by leveraging the MoGP model of w(\cdot) in each dimension. In particular, given a specific system state x^{*} at time t, suppose that the predictive distribution of w(x^{*}) in dimension \tau, which consists of M_{\tau} state-dependent Gaussian components, can be written as

\sum_{j=1}^{M_{\tau}}\gamma_{\tau,t,j}\,\mathcal{N}\left(w\mid\mu_{\tau,j}(x^{*}),\Sigma_{\tau,j}(x^{*})\right). (9)

Considering the structure of the modalities and the independence between modes, we establish local ambiguity sets using the state-dependent mean and variance information of the Gaussian components in (9). The overall ambiguity set \mathcal{W}_{\tau}(x^{*}) is then constructed as a weighted Minkowski sum of these local ambiguity sets. Explicitly, each local state-dependent ambiguity set \mathcal{W}_{\tau,j} is defined below.

\mathcal{W}_{\tau,j}\left(\mathbb{W},x^{*}\right)\triangleq\left\{\rho\in\mathcal{M}_{+}\,\middle|\,\begin{aligned}&\int_{\mathbb{W}}\rho(d\xi)=1\\ &\int_{\mathbb{W}}\xi\,\rho(d\xi)=\mu_{\tau,j}(x^{*})\\ &\int_{\mathbb{W}}\left(\xi-\mu_{\tau,j}(x^{*})\right)^{2}\rho(d\xi)\leq\Sigma_{\tau,j}(x^{*})\end{aligned}\right\} (10)

where \mathcal{M}_{+} represents the set of positive Borel measures on \mathbb{R}, and \rho is a positive measure. The proposed ambiguity set \mathcal{W}_{\tau}(x^{*}) is defined as

\sum_{j=1}^{M_{\tau}}\gamma_{\tau,t,j}\,\mathcal{W}_{\tau,j}(\mathbb{W},x^{*}). (11)

This construction closely aligns with the potentially multimodal structure, allowing for a more flexible and intricate representation of state-dependent disturbances.
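For instance, with M_{\tau}=2 modes the weighted Minkowski sum (11) reads

\mathcal{W}_{\tau}(x^{*})=\left\{\gamma_{\tau,t,1}\rho_{1}+\gamma_{\tau,t,2}\rho_{2}\ \middle|\ \rho_{j}\in\mathcal{W}_{\tau,j}(\mathbb{W},x^{*}),\ j=1,2\right\},

i.e., every candidate disturbance distribution mixes, with the gating weights, one distribution per mode whose mean equals \mu_{\tau,j}(x^{*}) and whose variance is bounded by \Sigma_{\tau,j}(x^{*}).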

IV-B Constraint Formulation

Define x_{k|t} as the k-step-ahead predicted system state based on the current state x_{t}, where x_{0|t}=x_{t} is the given initial state. To ensure recursive feasibility, we adopt a hybrid constraint tightening strategy that combines distributionally robust and worst-case tightening methods.

For x_{1|t}, the corresponding DR-CVaR constraint can be expressed as follows [16].

\sup_{\mathbb{Q}\in Q_{\tau}(x_{1|t})}\mathbb{Q}\text{-CVaR}_{\varepsilon}\left\{(-1)^{i}\left(x_{1|t}^{(\tau)}-\tilde{x}_{i}^{(\tau)}\right)\right\}\leq 0,\quad i\in\{1,2\},\;\tau\in\{1,\ldots,n\}. (12)

Noting that x_{1|t}=(A+BK)x_{0|t}-BKs_{0|t}+Bv_{0|t}+w(x_{0|t}), and since the state-dependent term w(x_{0|t}) is the only source of uncertainty, the ambiguity set Q_{\tau}(x_{1|t}) is essentially a shifted version of the ambiguity set for w(x_{0|t})^{(\tau)}, denoted \mathcal{W}_{\tau}(x_{0|t}). Hence, it can be defined in the same manner as (11). Based on (11), we prove in the subsequent section that Constraint (12) can be equivalently reformulated as second-order cone constraints.

Given the support set \mathbb{W}, we define \mathcal{Z} as the minimal Robust Positively Invariant (mRPI) set for the error dynamics (8b). The remaining system constraints are then guaranteed by the tightened state constraints s_{k|t}\in\mathbb{X}\ominus\mathcal{Z},\ k\in\{2,\ldots,N-1\}, and input constraints v_{k|t}\in\mathbb{U}\ominus K\mathcal{Z},\ k\in\{0,\ldots,N-1\} [17]. Additionally, the constraint x_{t}-s_{0|t}\in\mathcal{Z} is required to hold. Finally, we impose the terminal constraint s_{N|t}\in\mathcal{X}_{f}, where \mathcal{X}_{f} is defined in (13).

\mathcal{X}_{f}\triangleq\left\{s_{N|k}\in\mathbb{R}^{n}:\ s_{N+i|k}\in\mathbb{X}\ominus\mathcal{Z},\ Ks_{N+i|k}\in\mathbb{U}\ominus K\mathcal{Z},\ \forall i\in\mathbb{N}\right\} (13)
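Since \mathbb{W} is a box, a box (interval-hull) outer approximation of \mathcal{Z} suffices for implementing the tightening above, at the price of extra conservatism compared with the exact mRPI set. A minimal numpy sketch is given below; the truncation length and the tail-covering inflation factor are assumptions of this sketch.

```python
import numpy as np

def box_mrpi_outer(A_K, w_lo, w_hi, k_max=50, inflate=1.05):
    """Interval outer bound on the mRPI set of e+ = A_K e + w, with w in the box [w_lo, w_hi].

    Each term A_K^i W of the Minkowski sum is over-approximated by a box with center
    A_K^i c and half-width |A_K^i| h; the boxes are summed over i = 0, ..., k_max-1 and
    slightly inflated to cover the neglected tail (valid only for Schur-stable A_K).
    """
    c = 0.5 * (w_lo + w_hi)          # box center
    h = 0.5 * (w_hi - w_lo)          # box half-width
    center = np.zeros_like(c)
    half = np.zeros_like(h)
    M = np.eye(len(c))
    for _ in range(k_max):
        center += M @ c
        half += np.abs(M) @ h
        M = A_K @ M                  # next power of A_K
    half *= inflate                  # crude cover for the truncated tail
    return center - half, center + half   # lower/upper bounds of the box containing Z
```

The tightened sets \mathbb{X}\ominus\mathcal{Z} and \mathbb{U}\ominus K\mathcal{Z} then amount to shrinking the corresponding box bounds by these margins.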

Based on the above analysis, the formulation of MoGP-DR-MPC at time t, denoted as \mathscr{P}(x_{t}), is written as

\min_{s_{0|t},\mathbf{v}_{t}}\ \sum_{k=0}^{N-1}\left(\left\|s_{k|t}\right\|_{Q}^{2}+\left\|v_{k|t}\right\|_{R}^{2}\right)+\left\|s_{N|t}\right\|_{P}^{2}
\text{s.t.}\ x_{0|t}=x_{t},\ x_{0|t}-s_{0|t}\in\mathcal{Z}, (14a)
s_{k+1|t}=As_{k|t}+Bv_{k|t},\ k=0,\dots,N-1, (14b)
\sup_{\mathbb{Q}\in Q_{\tau}(x_{1|t})}\mathbb{Q}\text{-CVaR}_{\varepsilon}\left\{(-1)^{i}\left(x_{1|t}^{(\tau)}-\tilde{x}_{i}^{(\tau)}\right)\right\}\leq 0,\ i\in\{1,2\},\ \tau\in\{1,\ldots,n\}, (14e)
s_{k|t}\in\mathbb{X}\ominus\mathcal{Z},\ k=2,\dots,N-1, (14f)
v_{k|t}\in\mathbb{U}\ominus K\mathcal{Z},\ k=0,\dots,N-1, (14g)
s_{N|t}\in\mathcal{X}_{f}, (14h)

where \mathbf{v}_{t}=[v_{0|t}\ \ldots\ v_{N-1|t}]^{\top} and P,Q,R are positive definite weight matrices, with P determined through the Riccati equation. Problem \mathscr{P}(x_{t}) cannot be solved directly due to the semi-infinite DR-CVaR constraint (14e). We therefore derive an equivalent tractable form in the next section.

IV-C Tractable MPC Formulation

In this section, we demonstrate how to transform the critical and intractable DR-CVaR constraint (14e) into a set of second-order cone constraints using the following theorem.

Theorem 1

For any τ{1,,n}\tau\in\{1,\ldots,n\} and i{1,2}i\in\{1,2\}, the DR-CVaR constraint (14e) is satisfied if and only if

(-1)^{i}\left[s_{1|t}^{(\tau)}+\left((A+BK)(x_{t}-s_{0|t})\right)^{(\tau)}\right]\leq(-1)^{i}\tilde{x}_{i}^{(\tau)}-\eta_{i,t}^{(\tau)}, (15)

where \eta_{i,t}^{(\tau)} is obtained from the following second-order cone program.

\min\ \eta (16)
\text{s.t.}\ \varepsilon\beta+\sum_{j=1}^{M_{\tau}(x_{0|t})}\gamma_{\tau,t,j}\left\{t_{j}+\mu_{\tau,j}(x_{0|t})\omega_{j}+\left[\Sigma_{\tau,j}(x_{0|t})+\mu^{2}_{\tau,j}(x_{0|t})\right]\Omega_{j}\right\}\leq 0,
\left\|\begin{array}{c}\omega_{j}+\varphi_{1,j}-\varphi_{2,j}\\ \Omega_{j}-t_{j}+\tilde{w}_{2}^{(\tau)}\varphi_{1,j}-\tilde{w}_{1}^{(\tau)}\varphi_{2,j}\end{array}\right\|_{2}\leq\Gamma_{1,j},
\left\|\begin{array}{c}\omega_{j}-(-1)^{i}+\phi_{1,j}-\phi_{2,j}\\ \Omega_{j}-t_{j}-\beta-\eta+\tilde{w}_{2}^{(\tau)}\phi_{1,j}-\tilde{w}_{1}^{(\tau)}\phi_{2,j}\end{array}\right\|_{2}\leq\Gamma_{2,j},
\Omega_{j}\geq 0,\ \varphi_{1,j}\geq 0,\ \varphi_{2,j}\geq 0,\ \phi_{1,j}\geq 0,\ \phi_{2,j}\geq 0,
\forall j\in\{1,\ldots,M_{\tau}(x_{0|t})\},

where

\Gamma_{1,j}=\Omega_{j}+t_{j}-\tilde{w}_{2}^{(\tau)}\varphi_{1,j}+\tilde{w}_{1}^{(\tau)}\varphi_{2,j},
\Gamma_{2,j}=\Omega_{j}+t_{j}+\beta+\eta-\tilde{w}_{2}^{(\tau)}\phi_{1,j}+\tilde{w}_{1}^{(\tau)}\phi_{2,j}.

Stacking the 2n inequalities of the form (15), the DR-CVaR constraints (14e) are equivalent to s_{1|t}+(A+BK)(x_{t}-s_{0|t})\in\left[\tilde{x}_{1}+\eta_{1,t},\,\tilde{x}_{2}-\eta_{2,t}\right].

Proof:

According to the definition of CVaR, the left side of Constraint (14e) can be rewritten as follows.

\sup_{\mathbb{Q}\in Q_{\tau}(x_{1|t})}\mathbb{Q}\text{-CVaR}_{\varepsilon}\left\{(-1)^{i}\left(x_{1|t}^{(\tau)}-\tilde{x}_{i}^{(\tau)}\right)\right\}=\inf_{\beta\in\mathbb{R}}\left\{\beta+\frac{1}{\varepsilon}\sup_{\mathbb{P}\in\mathcal{W}_{\tau}(x_{0|t})}\mathbb{E}_{\mathbb{P}}\left((-1)^{i}\left(z_{t}^{(\tau)}-\tilde{x}_{i}^{(\tau)}\right)+(-1)^{i}w(x_{0|t})^{(\tau)}-\beta\right)^{+}\right\} (17)

where z_{t}=(A+BK)(x_{t}-s_{0|t})+s_{1|t}. To simplify the notation, we denote (-1)^{i}\left(z_{t}^{(\tau)}-\tilde{x}_{i}^{(\tau)}\right) as C(i,\tau). Given the initial state x_{0|t}=x_{t}, the ambiguity set \mathcal{W}_{\tau}(x_{0|t}) is defined in the same manner as (11). Then the worst-case expectation term \sup_{\mathbb{P}\in\mathcal{W}_{\tau}(x_{0|t})}\mathbb{E}_{\mathbb{P}}\left(C(i,\tau)+(-1)^{i}w(x_{0|t})^{(\tau)}-\beta\right)^{+} can be reformulated as follows [14], [18].

\sup_{\boldsymbol{\rho}}\ \sum_{j=1}^{M_{\tau}(x_{0|t})}\gamma_{\tau,t,j}\int_{\mathbb{W}}\left\{C(i,\tau)+(-1)^{i}\xi-\beta\right\}^{+}\rho_{j}(d\xi) (21)
\text{s.t.}\ \int_{\mathbb{W}}\rho_{j}(d\xi)=1,\quad \int_{\mathbb{W}}\xi\,\rho_{j}(d\xi)=\mu_{\tau,j}(x_{0|t}),\quad \int_{\mathbb{W}}\left(\xi-\mu_{\tau,j}(x_{0|t})\right)^{2}\rho_{j}(d\xi)\leq\Sigma_{\tau,j}(x_{0|t}),\quad \forall j\in\{1,\ldots,M_{\tau}(x_{0|t})\}.

Obviously, Problem (21) cannot be solved directly, so we take its dual, which is formulated below.

\min_{t_{j},\omega_{j},\Omega_{j}}\ \sum_{j=1}^{M_{\tau}(x_{0|t})}\gamma_{\tau,t,j}\left\{t_{j}+\mu_{\tau,j}(x_{0|t})\omega_{j}+\left[\Sigma_{\tau,j}(x_{0|t})+\mu_{\tau,j}^{2}(x_{0|t})\right]\Omega_{j}\right\} (22)
\text{s.t.}\ t_{j}+\xi\omega_{j}+\xi^{2}\Omega_{j}\geq 0,
-\left(C(i,\tau)+(-1)^{i}\xi-\beta\right)+t_{j}+\xi\omega_{j}+\xi^{2}\Omega_{j}\geq 0,
\Omega_{j}\geq 0,\quad \forall\xi\in[\tilde{w}_{1}^{(\tau)},\tilde{w}_{2}^{(\tau)}],
\forall j\in\{1,\ldots,M_{\tau}(x_{0|t})\}.

The semi-infinite constraints in (22) can be transformed into linear matrix inequalities by duality theory and Schur complements [19]. For brevity, the exact form is omitted. Since the resulting matrices are of size 2×2, they can be further rewritten as second-order cone constraints [20]. Thus, Constraint (14e) is equivalent to the following set of constraints.

\varepsilon\beta+\sum_{j=1}^{M_{\tau}(x_{0|t})}\gamma_{\tau,t,j}\left\{t_{j}+\mu_{\tau,j}(x_{0|t})\omega_{j}+\left[\Sigma_{\tau,j}(x_{0|t})+\mu^{2}_{\tau,j}(x_{0|t})\right]\Omega_{j}\right\}\leq 0, (23)
\left\|\begin{array}{c}\omega_{j}+\varphi_{1,j}-\varphi_{2,j}\\ \Omega_{j}-t_{j}+\tilde{w}_{2}^{(\tau)}\varphi_{1,j}-\tilde{w}_{1}^{(\tau)}\varphi_{2,j}\end{array}\right\|_{2}\leq\tilde{\Gamma}_{1,j},
\left\|\begin{array}{c}\omega_{j}-(-1)^{i}+\phi_{1,j}-\phi_{2,j}\\ \Omega_{j}-t_{j}-\beta+C(i,\tau)+\tilde{w}_{2}^{(\tau)}\phi_{1,j}-\tilde{w}_{1}^{(\tau)}\phi_{2,j}\end{array}\right\|_{2}\leq\tilde{\Gamma}_{2,j},
\Omega_{j}\geq 0,\ \varphi_{1,j}\geq 0,\ \varphi_{2,j}\geq 0,\ \phi_{1,j}\geq 0,\ \phi_{2,j}\geq 0,
\forall j\in\{1,\ldots,M_{\tau}(x_{0|t})\},

where

\tilde{\Gamma}_{1,j}=\Omega_{j}+t_{j}-\tilde{w}_{2}^{(\tau)}\varphi_{1,j}+\tilde{w}_{1}^{(\tau)}\varphi_{2,j},
\tilde{\Gamma}_{2,j}=\Omega_{j}+t_{j}+\beta-C(i,\tau)-\tilde{w}_{2}^{(\tau)}\phi_{1,j}+\tilde{w}_{1}^{(\tau)}\phi_{2,j}.

Substituting (23) into (17) gives (16), which completes the proof. ∎
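To illustrate the reformulation, the following is a minimal cvxpy sketch of the offset program (16) for a single dimension τ and bound index i. The function interface and the assumption that the per-mode MoGP moments are passed in as arrays are illustrative, not part of the paper; the resulting offsets η then tighten the box bounds as in (15).

```python
import cvxpy as cp
import numpy as np

def tightening_offset(eps, gamma, mu, var, w_lo, w_hi, sign):
    """Sketch of the offset SOCP (16) for one state dimension tau and bound index i.

    gamma, mu, var : per-mode weights, predictive means and variances from the MoGP
                     model at x_{0|t} (assumed given as 1-D numpy arrays).
    w_lo, w_hi     : scalar support bounds of the disturbance in this dimension.
    sign           : (-1)**i, i.e. -1 for the lower state bound, +1 for the upper one.
    """
    M = len(gamma)
    eta, beta = cp.Variable(), cp.Variable()
    t, om, Om = cp.Variable(M), cp.Variable(M), cp.Variable(M)
    ph1, ph2 = cp.Variable(M, nonneg=True), cp.Variable(M, nonneg=True)
    ps1, ps2 = cp.Variable(M, nonneg=True), cp.Variable(M, nonneg=True)

    cons = [eps * beta
            + cp.sum(cp.multiply(gamma, t + cp.multiply(mu, om)
                                 + cp.multiply(var + mu**2, Om))) <= 0,
            Om >= 0]
    for j in range(M):
        # first cone constraint of (16)
        cons.append(cp.norm(cp.hstack([om[j] + ph1[j] - ph2[j],
                                       Om[j] - t[j] + w_hi*ph1[j] - w_lo*ph2[j]]))
                    <= Om[j] + t[j] - w_hi*ph1[j] + w_lo*ph2[j])
        # second cone constraint of (16)
        cons.append(cp.norm(cp.hstack([om[j] - sign + ps1[j] - ps2[j],
                                       Om[j] - t[j] - beta - eta
                                       + w_hi*ps1[j] - w_lo*ps2[j]]))
                    <= Om[j] + t[j] + beta + eta - w_hi*ps1[j] + w_lo*ps2[j])
    cp.Problem(cp.Minimize(eta), cons).solve()
    return eta.value
```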

Theorem 1 provides a convex reformulation of Constraint (14e). It follows that the optimization problem \mathscr{P}(x_{t}) can be reformulated into the convex and tractable form below, which is denoted as \mathscr{P}_{MoDR}(x_{t}).

\min_{s_{0|t},\mathbf{v}_{t}}\ \sum_{k=0}^{N-1}\left(\left\|s_{k|t}\right\|_{Q}^{2}+\left\|v_{k|t}\right\|_{R}^{2}\right)+\left\|s_{N|t}\right\|_{P}^{2}
\text{s.t.}\ x_{0|t}-s_{0|t}\in\mathcal{Z}, (24a)
s_{k+1|t}=As_{k|t}+Bv_{k|t},\ k=0,\dots,N-1, (24b)
s_{1|t}+(A+BK)(x_{t}-s_{0|t})\in\left[\tilde{x}_{1}+\eta_{1,t},\,\tilde{x}_{2}-\eta_{2,t}\right], (24c)
s_{k|t}\in\mathbb{X}\ominus\mathcal{Z},\ k=2,\dots,N-1, (24d)
v_{k|t}\in\mathbb{U}\ominus K\mathcal{Z},\ k=0,\dots,N-1, (24e)
s_{N|t}\in\mathcal{X}_{f}. (24f)

If \mathscr{P}_{MoDR}(x_{t}) is feasible at time t, let s_{0|t}^{*} and \mathbf{v}_{t}^{*}=[v_{0|t}^{*},\ldots,v_{N-1|t}^{*}]^{\top} denote its optimal solution. At each time instant, the MoGP-DR-MPC framework solves \mathscr{P}_{MoDR}(x_{t}) to obtain the optimal input sequence, and only the first element u_{t}^{*}=K(x_{t}-s_{0|t}^{*})+v_{0|t}^{*} is executed. This process is repeated to perform receding horizon control.
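Putting the pieces together, the receding-horizon operation described above can be sketched as follows; solve_mogp_dr_mpc and sample_disturbance are hypothetical placeholders for a solver of \mathscr{P}_{MoDR}(x_{t}) (e.g., built from the second-order cone reformulation above) and for the unknown environment, respectively.

```python
import numpy as np

def run_closed_loop(A, B, K, x0, T, solve_mogp_dr_mpc, sample_disturbance):
    """Receding-horizon execution of MoGP-DR-MPC (sketch with hypothetical helpers)."""
    x = np.asarray(x0, dtype=float)
    traj = [x.copy()]
    for _ in range(T):
        s0_opt, v0_opt = solve_mogp_dr_mpc(x)      # optimal s_{0|t}^*, v_{0|t}^* of P_MoDR(x_t)
        u = K @ (x - s0_opt) + v0_opt              # only the first input is applied
        x = A @ x + B @ u + sample_disturbance(x)  # true system step with state-dependent noise
        traj.append(x.copy())
    return np.array(traj)
```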

V THEORETICAL PROPERTIES

In this section, we establish the recursive feasibility and stability of MoGP-DR-MPC through the following theorems.

Theorem 2 (Recursive feasibility)

If problem \mathscr{P}_{MoDR}(x_{t}) is feasible at state x_{t}, then \mathscr{P}_{MoDR}(x_{t+1}) remains feasible under the control input u_{t}=K(x_{t}-s_{0|t}^{*})+v_{0|t}^{*}.

Proof:

Given the optimal sequences \mathbf{v}_{t}^{*} and \mathbf{s}_{t}^{*}, Constraint (24a) implies e_{t}=x_{t}-s_{0|t}^{*}\in\mathcal{Z}. By the invariance of \mathcal{Z}, we have e_{t+1}=(A+BK)e_{t}+w(x_{t})\in\mathcal{Z} for all possible w(x_{t})\in\mathbb{W}. Define the solution of problem \mathscr{P}_{MoDR}(x_{t}) as \Lambda_{t}^{*}=[s_{0|t}^{*},v_{0|t}^{*},\ldots,v_{N-1|t}^{*}]. We then shift \Lambda_{t}^{*} to generate a candidate solution for problem \mathscr{P}_{MoDR}(x_{t+1}), namely \Lambda_{t+1}=[s_{1|t}^{*},v_{1|t}^{*},\ldots,v_{N-1|t}^{*},Ks_{N|t}^{*}].

Since e_{t+1}\in\mathcal{Z}, we have x_{t+1}-s_{1|t}^{*}\in\mathcal{Z}, i.e., Constraint (24a) is satisfied. Given the feasibility of \Lambda_{t}^{*}, Constraints (24b)-(24e) are satisfied by the shifted sequences [s_{1|t}^{*},\ldots,(A+BK)s_{N|t}^{*}] and [v_{1|t}^{*},\ldots,v_{N-1|t}^{*},Ks_{N|t}^{*}]. According to the definition of \mathcal{X}_{f}, (A+BK)s_{N|t}^{*}\in\mathcal{X}_{f} and Ks_{N|t}^{*}\in\mathbb{U}\ominus K\mathcal{Z}, so the terminal constraint (24f) is also satisfied. Therefore, \Lambda_{t+1} is feasible at time t+1. ∎

Theorem 3 (Closed-loop stability)

If the initial problem \mathscr{P}_{MoDR}(x_{0}) is feasible, the closed-loop system asymptotically converges to a neighborhood of the origin.

Proof:

Consider \Lambda_{t}^{*} at time t and denote the optimal objective value by J_{t}^{*}. Since the candidate solution \Lambda_{t+1} is feasible, the corresponding objective value is given by

J_{t+1}=\sum_{k=1}^{N-1}\left(\left\|s^{*}_{k|t}\right\|_{Q}^{2}+\left\|v^{*}_{k|t}\right\|_{R}^{2}\right)+\left\|s^{*}_{N|t}\right\|_{Q}^{2}+\left\|Ks_{N|t}^{*}\right\|_{R}^{2}+\left\|(A+BK)s_{N|t}^{*}\right\|_{P}^{2}.

Noting that P is the solution of the Riccati equation, we have \|s\|_{Q}^{2}+\|Ks\|_{R}^{2}+\|(A+BK)s\|_{P}^{2}\leq\|s\|_{P}^{2} for any s, which yields the following inequality.

J_{t+1}^{*}\leq J_{t+1}\leq\sum_{k=1}^{N-1}\left(\left\|s^{*}_{k|t}\right\|_{Q}^{2}+\left\|v^{*}_{k|t}\right\|_{R}^{2}\right)+\left\|s^{*}_{N|t}\right\|_{P}^{2}=J_{t}^{*}-\left\|s^{*}_{0|t}\right\|_{Q}^{2}-\left\|v^{*}_{0|t}\right\|_{R}^{2}. (25)

Therefore, J_{t+1}^{*}-J_{t}^{*}\leq-\left(\left\|s^{*}_{0|t}\right\|_{Q}^{2}+\left\|v^{*}_{0|t}\right\|_{R}^{2}\right). Summing this inequality from t=0 to infinity yields \sum_{t=0}^{\infty}\left(\left\|s^{*}_{0|t}\right\|_{Q}^{2}+\left\|v^{*}_{0|t}\right\|_{R}^{2}\right)\leq J_{0}^{*}-J_{\infty}^{*}\leq J_{0}^{*}. Hence \lim_{t\to\infty}\left\|s^{*}_{0|t}\right\|_{Q}^{2}=0 and \lim_{t\to\infty}\left\|v^{*}_{0|t}\right\|_{R}^{2}=0, which, together with x_{t}-s_{0|t}^{*}\in\mathcal{Z}, establishes the convergence of the system state to a neighborhood of the origin. ∎

VI CASE STUDIES

This section presents two case studies to validate the effectiveness of the proposed MoGP-DR-MPC method. Its performance is compared against two baseline methods, namely GP-based DR-MPC (GP-DR-MPC) and robust tube MPC.

VI-A Numerical Experiments

We first conduct numerical experiments on a stochastic system given in (26).

x^{+}=\begin{bmatrix}1&1\\ 0&1\end{bmatrix}x+\begin{bmatrix}0.5\\ 1\end{bmatrix}u+w(x) (26)

Suppose the constraint sets are \mathbb{X}=[-7,0]\times[-3,2] and \mathbb{U}=[-5,5], with a tolerance level of \varepsilon=0.2. The disturbance w(x) is supported on [-0.8,0.8]\times[-0.8,0.8] and is defined using the standard four-modal Franke function [21] in each dimension. The control objective is to steer the initial state [-5,-2]^{\top} to the origin under the disturbance w(x). Define the stage cost as x^{\top}Qx+u^{\top}Ru, where Q is the identity matrix I_{2} and R=0.1. The MPC horizon is set to 10, and we perform 50 closed-loop simulations with a simulation horizon of 30 to showcase the improvement in control performance.
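For reference, the standard Franke function [21] consists of four Gaussian bumps/dips; a minimal sketch is given below. How the state is rescaled into the unit square and how the output is mapped onto the support [-0.8, 0.8] are not specified in the paper, so the closing comment only indicates the intended use.

```python
import numpy as np

def franke(x, y):
    """Standard four-term Franke test function on the unit square [21]."""
    return (0.75 * np.exp(-((9*x - 2)**2 + (9*y - 2)**2) / 4)
            + 0.75 * np.exp(-((9*x + 1)**2 / 49) - (9*y + 1) / 10)
            + 0.5  * np.exp(-((9*x - 7)**2 + (9*y - 3)**2) / 4)
            - 0.2  * np.exp(-(9*x - 4)**2 - (9*y - 7)**2))

# Illustrative use: each disturbance dimension is obtained by evaluating the Franke
# function at the (suitably rescaled) state and mapping the result into [-0.8, 0.8];
# the exact rescaling is not given in the paper.
```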

The average closed-loop costs for MoGP-DR-MPC, GP-DR-MPC and robust tube MPC are 66.06, 79.74 and 89.22, respectively. The simulation results indicate that MoGP-DR-MPC reduces the average closed-loop cost by 17% compared with GP-DR-MPC and by 24% compared with robust tube MPC, demonstrating its notable advantage in handling multimodal state-dependent disturbances.

VI-B Simulations on a Quadrotor System

In the second case study, we simulate a trajectory planning and control problem for a planar quadrotor system [22]. The quadrotor starts at (10,10) and is expected to land at the origin. During its flight, it must remain within the safe region \mathcal{X} with probability 0.8 under a multimodal state-dependent wind disturbance, which is common due to varying airflow patterns. The linearized dynamics of the quadrotor system are expressed as follows.

s_{k+1|t}=As_{k|t}+Bu_{k|t}+w(s_{k|t}),\quad A=\begin{bmatrix}1&\Delta t&0&0\\ 0&1&0&0\\ 0&0&1&\Delta t\\ 0&0&0&1\end{bmatrix},\quad B=\frac{1}{m}\begin{bmatrix}\frac{\Delta t^{2}}{2}&0\\ \Delta t&0\\ 0&\frac{\Delta t^{2}}{2}\\ 0&\Delta t\end{bmatrix}, (27)

where \Delta t=1 is the sampling time and m represents the mass of the system. The state vector s_{t}=[p_{x},v_{x},p_{y},v_{y}]^{\top}\in\mathbb{R}^{4} includes the quadrotor's positions and velocities in the x and y directions, while the control input u_{t}=[u_{x},u_{y}]^{\top}\in[-7,7]^{2}\subseteq\mathbb{R}^{2} applies forces along these axes. The support set of the wind disturbance is \mathbb{W}=[-0.6,0.6]^{4}\subseteq\mathbb{R}^{4} and the safe region \mathcal{X} is defined by

\left\{s:\begin{bmatrix}-4&-4&-4&-4\end{bmatrix}^{\top}\leq s\leq\begin{bmatrix}18&4&18&4\end{bmatrix}^{\top}\right\}.

In the simulations, the MPC time horizon N is set to five and we evaluate the performance of MoGP-DR-MPC against two baseline approaches. Specifically, we analyze the closed-loop costs under 50 different sequences of disturbance realizations, each with 30 simulation steps. As illustrated in Fig. 1, MoGP-DR-MPC achieves an average reduction in closed-loop cost of 4% compared with GP-DR-MPC and 5% compared with robust tube MPC. This corroborates the superior ability of MoGP-DR-MPC to capture the multimodal characteristics of disturbance data, yielding less conservative control performance.

Additionally, because the optimal control problem is convex and the MoGP model is trained offline, the computation time of MoGP-DR-MPC is on par with that of conventional GP-MPC. Fig. 2 shows the position trajectories of the quadrotor generated by 30 simulations.

Figure 1: The closed-loop cost of MoGP-DR-MPC and two baseline methods under 50 realizations of disturbance sequences.
Figure 2: The closed-loop trajectories of the proposed MoGP-DR-MPC strategy and two baseline methods under different disturbance realizations.

VII CONCLUSIONS

In this paper, we presented a novel MoGP-DR-MPC framework that seamlessly integrates MoGP to effectively control systems with state-dependent multimodal disturbances. Based on the MoGP model, we developed a data-driven state-dependent ambiguity set that closely aligns with the multimodal structure. We reformulated the DR-CVaR constraints as scalable second-order cone constraints, ensuring computational tractability. The recursive feasibility and closed-loop stability of MoGP-DR-MPC were guaranteed through invariant sets. Both numerical experiments and simulations on a quadrotor system verified the effectiveness of the proposed method in coping with multimodal disturbances.

References

  • [1] F. Berkenkamp and A. P. Schoellig, “Safe and robust learning control with Gaussian processes,” in 2015 European Control Conference (ECC). IEEE, 2015, pp. 2496–2501.
  • [2] Y. Wang, C. Ocampo-Martinez, and V. Puig, “Robust model predictive control based on Gaussian processes: Application to drinking water networks,” in 2015 European Control Conference (ECC), 2015, pp. 3292–3297.
  • [3] A. Rose, M. Pfefferkorn, H. H. Nguyen, and R. Findeisen, “Learning a Gaussian process approximation of a model predictive controller with guarantees,” in 2023 62nd IEEE Conference on Decision and Control (CDC). IEEE, 2023, pp. 4094–4099.
  • [4] E. Bradford, L. Imsland, D. Zhang, and E. A. del Rio Chanona, “Stochastic data-driven model predictive control using Gaussian processes,” Computers & Chemical Engineering, vol. 139, p. 106844, 2020.
  • [5] L. Hewing, J. Kabzan, and M. N. Zeilinger, “Cautious model predictive control using Gaussian process regression,” IEEE Transactions on Control Systems Technology, vol. 28, no. 6, pp. 2736–2743, 2019.
  • [6] A. Capone, T. Brüdigam, and S. Hirche, “Online constraint tightening in stochastic model predictive control: A regression approach,” IEEE Transactions on Automatic Control, pp. 1–15, 2024, doi: 10.1109/TAC.2024.3433988.
  • [7] B. Li, T. Guan, L. Dai, and G.-R. Duan, “Distributionally robust model predictive control with output feedback,” IEEE Transactions on Automatic Control, vol. 69, no. 5, pp. 3270–3277, 2024.
  • [8] K. Kim and I. Yang, “Distributional robustness in minimax linear quadratic control with Wasserstein distance,” SIAM Journal on Control and Optimization, vol. 61, no. 2, pp. 458–483, 2023.
  • [9] C. Mark and S. Liu, “Recursively feasible data-driven distributionally robust model predictive control with additive disturbances,” IEEE Control Systems Letters, vol. 7, pp. 526–531, 2022.
  • [10] G. Pan and T. Faulwasser, “Distributionally robust uncertainty quantification via data-driven stochastic optimal control,” IEEE Control Systems Letters, vol. 7, pp. 3036–3041, 2023.
  • [11] F. Micheli, T. Summers, and J. Lygeros, “Data-driven distributionally robust MPC for systems with uncertain dynamics,” in 2022 IEEE 61st Conference on Decision and Control (CDC). IEEE, 2022, pp. 4788–4793.
  • [12] A. Hakobyan and I. Yang, “Learning-based distributionally robust motion control with Gaussian processes,” in 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2020, pp. 7667–7674.
  • [13] F. Li, H. Li, and Y. He, “Adaptive stochastic model predictive control of linear systems using Gaussian process regression,” IET Control Theory & Applications, vol. 15, no. 5, pp. 683–693, 2021.
  • [14] C. Ning and F. You, “Online learning based risk-averse stochastic MPC of constrained linear uncertain systems,” Automatica, vol. 125, p. 109402, 2021.
  • [15] C. Rasmussen and Z. Ghahramani, “Infinite mixtures of Gaussian process experts,” Advances in Neural Information Processing Systems, vol. 14, 2001.
  • [16] B. P. Van Parys, D. Kuhn, P. J. Goulart, and M. Morari, “Distributionally robust control of constrained stochastic systems,” IEEE Transactions on Automatic Control, vol. 61, no. 2, pp. 430–442, 2015.
  • [17] D. Q. Mayne, M. M. Seron, and S. V. Raković, “Robust model predictive control of constrained linear systems with bounded disturbances,” Automatica, vol. 41, no. 2, pp. 219–224, 2005.
  • [18] G. A. Hanasusanto, D. Kuhn, S. W. Wallace, and S. Zymler, “Distributionally robust multi-item newsvendor problems with multimodal demand distributions,” Mathematical Programming, vol. 152, no. 1, pp. 1–32, 2015.
  • [19] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge University Press, 2004.
  • [20] L. Li, C. Ning, H. Qiu, W. Du, and Z. Dong, “Online data-stream-driven distributionally robust optimal energy management for hydrogen-based multimicrogrids,” IEEE Transactions on Industrial Informatics, vol. 20, no. 3, pp. 4370–4384, 2024.
  • [21] R. Franke, A Critical Comparison of Some Methods for Interpolation of Scattered Data. Naval Postgraduate School, Monterey, CA, 1979.
  • [22] Y. K. Nakka, R. C. Foust, E. S. Lupu, D. B. Elliott, I. S. Crowell, S.-J. Chung, and F. Y. Hadaegh, “A six degree-of-freedom spacecraft dynamics simulator for formation control research,” in 2018 AAS/AIAA Astrodynamics Specialist Conference, 2018, pp. 1–20.