
Continuous-Time and Event-Triggered Online Optimization for Linear Multi-Agent Systems

Yang Yu, Xiuxian Li, Li Li, and Lihua Xie. Yang Yu, Xiuxian Li, and Li Li are with the Department of Control Science and Engineering, Shanghai Research Institute for Autonomous Intelligent Systems, and Shanghai Institute of Intelligent Science and Technology, Tongji University, Shanghai, China ({1910639, xli, lili}@tongji.edu.cn). Lihua Xie is with the School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore 639798 (elhxie@ntu.edu.sg).
Abstract

This paper studies the decentralized online convex optimization problem for heterogeneous linear multi-agent systems. Each agent has access to a time-varying local cost function of its own output, and the agents are coupled by time-varying inequality constraints. The goal of each agent is to minimize the global cost function by selecting appropriate local actions using only communication with its neighbors. We design a distributed controller based on the saddle-point method that achieves a constant regret bound and a sublinear fit bound. In addition, to reduce the communication overhead, we propose an event-triggered communication scheme and show that the constant regret bound and the sublinear fit bound are still achieved under discrete communication, with no Zeno behavior. A numerical example is provided to verify the proposed algorithms.

I INTRODUCTION

Convex optimization has been widely studied as a highly effective tool in fields involving optimization and decision-making, such as automatic control systems [1], communication networks [2], and machine learning [3]. Early work on convex optimization assumed fixed cost functions and static constraints. In practice, however, the costs and constraints of many problems may be time-varying and a priori unknown [4]. This motivated online convex optimization (OCO), which requires the decision maker to choose an action at each instant based only on previous information. A widely used performance criterion in OCO is regret, that is, the gap between the cumulative loss of the selected actions and that of the best ideal action chosen with full knowledge of the global information beforehand. If the regret is sublinear, the time-averaged loss of the selected actions is asymptotically no greater than that of the ideal action. Another performance indicator is fit, which measures the degree of violation of static/time-varying inequality constraints. For more details, we refer the reader to the recent survey [5].

The OCO framework was introduced in [6], where a projection-based online gradient descent algorithm was analyzed. Under static constraints, the algorithm was proved to achieve an \mathcal{O}(\sqrt{T}) static regret bound for time-varying convex cost functions with bounded subgradients. With the growth of data scale and problem complexity, distributed online convex optimization has also attracted extensive attention in recent years [7]. In the continuous-time setting, the saddle-point algorithm proposed in [8] under constant constraints achieves sublinear bounds on both the network disagreement and the regret. The authors of [9] generalized this result to time-varying constraints. In the discrete-time setting, [10, 11] used distributed primal-dual algorithms to solve online convex optimization with static independent and coupled inequality constraints. To handle time-varying coupling constraints, the authors of [12] proposed a novel distributed online primal-dual dynamic mirror descent algorithm that attains sublinear dynamic regret and constraint violation. A gradient-free distributed bandit online algorithm was proposed in [13], applicable to scenarios where the gradient information of the cost functions is difficult to obtain. For local cost functions containing aggregate variables, an online distributed gradient tracking algorithm based on true or stochastic gradients was developed in [14].

In actual physical systems, the implementation of optimization strategies must take into account the dynamics of each agent. Along this line, only a few works have investigated online convex optimization with physical systems in recent years. For continuous-time multi-agent systems with high-order integrator dynamics, the authors of [15] combined a PI control idea with gradient descent to solve the distributed OCO problem. The authors of [16] considered the online convex optimization problem for linear systems, but without any constraints. The online convex optimization problem for linear time-invariant (LTI) systems was studied in [17] based on behavioral system theory, where a data-driven, model-free algorithm achieves sublinear convergence. However, these two papers on linear systems only provide centralized algorithms; the distributed setting for online optimization with linear systems is yet to be studied.

The main contributions of this paper are as follows.

  • Compared with the centralized OCO algorithms for linear systems without constraints [16, 17], this paper studies the distributed online optimization of heterogeneous multi-agent systems with time-varying coupled inequality constraints for the first time. Agents rely only on their own and their neighbors' information to make decisions, and achieve a constant regret bound and an \mathcal{O}(\sqrt{T}) fit bound. In comparison, most existing algorithms for distributed online optimization with coupled inequality constraints [12, 18] only achieve sublinear regret bounds.

  • Compared with current continuous-time online optimization algorithms [15, 8, 9], this paper introduces an event-triggered mechanism to reduce the communication overhead. Under discrete communication, the constant regret bound and the \mathcal{O}(\sqrt{T}) fit bound are still achieved.

The rest of the paper is organized as follows. Preliminaries are given in Section II. In Section III, the heterogeneous multi-agent system under investigation is described mathematically, the online convex optimization problem is defined and some useful lemmas are given. Following that, the control laws with continuous and event-triggered communication are proposed, respectively, and the constant regret bound and sublinear fit bound are established in Section IV. Then, a simulation example is provided to verify the effectiveness of the algorithm in Section V. Finally, the conclusion is discussed in Section VI.

II PRELIMINARIES

II-A Notations

Let \mathbb{R}, \mathbb{R}^{n}, \mathbb{R}_{+}^{n}, and \mathbb{R}^{m\times n} be the sets of real numbers, real vectors of dimension n, non-negative real vectors of dimension n, and real matrices of dimension m\times n, respectively. The n\times n identity matrix is denoted by I_{n}. The n\times 1 all-one and all-zero column vectors are denoted by \boldsymbol{1}_{n} and \boldsymbol{0}_{n}, respectively. For a matrix A\in\mathbb{R}^{m\times n}, A^{\top} is its transpose, and diag(A_{1},\dots,A_{n}) denotes the block diagonal matrix with diagonal blocks A_{1},\dots,A_{n}. For a vector x, \|x\|_{1} is its 1-norm, \|x\| is its 2-norm, and col(x_{1},\dots,x_{n}) is the column vector obtained by stacking x_{1},\dots,x_{n}. A\otimes B represents the Kronecker product of matrices A and B. Let P_{\mathcal{S}}(x) be the Euclidean projection of a vector x\in\mathbb{R}^{n} onto the set \mathcal{S}\subseteq\mathbb{R}^{n}, i.e., P_{\mathcal{S}}(x)=\arg\min_{y\in\mathcal{S}}\|x-y\|^{2}. For simplicity, let [\cdot]_{+} denote P_{\mathbb{R}_{+}^{n}}(\cdot). Define the set-valued sign function \mathrm{sgn}(\cdot) as follows:

\mathrm{sgn}(x):=\partial\|x\|_{1}=\begin{cases}-1,&\text{if }x<0,\\ [-1,1],&\text{if }x=0,\\ 1,&\text{if }x>0.\end{cases}
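As a concrete illustration, the two elementwise operations used throughout the paper, the set-valued sign function and the projection [\cdot]_{+} onto the nonnegative orthant, can be sketched in Python as follows (the `tie` argument is our own device for selecting one element of [-1,1] at x = 0):

```python
def sgn(x, tie=0.0):
    # Subgradient of |x|: -1 for x < 0, +1 for x > 0; at x == 0 any
    # value in [-1, 1] is a valid subgradient, and `tie` selects one.
    if x < 0.0:
        return -1.0
    if x > 0.0:
        return 1.0
    return tie

def proj_nonneg(v):
    # Euclidean projection [v]_+ onto the nonnegative orthant R_+^n.
    return [max(0.0, vi) for vi in v]
```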

II-B Graph Theory

For a system with N agents, the communication network is modeled by an undirected graph \mathcal{G}=(\mathcal{V},\mathcal{E}), where \mathcal{V}=\{v_{1},\dots,v_{N}\} is the node set and \mathcal{E}\subseteq\mathcal{V}\times\mathcal{V} is the edge set. If information exchange can occur between v_{j} and v_{i}, then (v_{j},v_{i})\in\mathcal{E} with a_{ij}=1 denoting its weight; otherwise a_{ij}=0. \mathcal{A}=[a_{ij}]\in\mathbb{R}^{N\times N} is the adjacency matrix. If there exists a path between any pair of distinct nodes in \mathcal{V}, then \mathcal{G} is called connected.

III PROBLEM FORMULATION

Consider a multi-agent system consisting of N heterogeneous agents indexed by i=1,\dots,N, where the ith agent has the following linear dynamics:

\dot{x}_{i}=A_{i}x_{i}+B_{i}u_{i},\quad y_{i}=C_{i}x_{i}, (1)

where x_{i}\in\mathbb{R}^{n_{i}}, u_{i}\in\mathbb{R}^{m_{i}} and y_{i}\in\mathbb{R}^{p_{i}} are the state, input and output variables, respectively, and A_{i}\in\mathbb{R}^{n_{i}\times n_{i}}, B_{i}\in\mathbb{R}^{n_{i}\times m_{i}} and C_{i}\in\mathbb{R}^{p_{i}\times n_{i}} are the state, input and output matrices, respectively.

Each agent i has an output set \mathcal{Y}_{i}\subseteq\mathbb{R}^{p_{i}} such that the output variable y_{i}\in\mathcal{Y}_{i}. f_{i}(t,\cdot):\mathbb{R}^{p_{i}}\to\mathbb{R} and g_{i}(t,\cdot):\mathbb{R}^{p_{i}}\to\mathbb{R}^{q} are the private cost and constraint functions of agent i. Denote p:=\sum_{i=1}^{N}p_{i}, \mathcal{Y}:=\mathcal{Y}_{1}\times\dots\times\mathcal{Y}_{N}\subseteq\mathbb{R}^{p}, y:=col(y_{1},\dots,y_{N})\in\mathbb{R}^{p}, and f(t,y):=\sum_{i=1}^{N}f_{i}(t,y_{i}). The objective of this paper is to design a controller u_{i}(t) for each agent, using only local interaction and information, such that all agents cooperatively minimize the sum of the cost functions over a period of time [0,T] subject to time-varying coupled inequality constraints:

\min_{y\in\mathcal{Y}}\int_{0}^{T}f(t,y)\,dt,\quad \mathrm{s.t.}\ \sum_{i=1}^{N}g_{i}(t,y_{i})\leq\boldsymbol{0}. (2)

Let y^{*}=(y_{1}^{*},\dots,y_{N}^{*})\in\mathcal{Y} denote the optimal solution of problem (2) when the time-varying cost and constraint functions are known in advance.

In order to evaluate the cost performance of the output trajectories, we define two performance indicators: network regret and network fit. According to the previous definition, y^{*} is the optimal output when the agents know all the information of the network over the period [0,T]. In reality, however, agents can only make decisions based on their own and their neighbors' current and previous information. The regret is the gap between the cumulative cost \int_{0}^{T}f(t,y(t))\,dt incurred by y(t) and the cost incurred by the optimal output y^{*}, i.e.,

\mathcal{R}^{T}:=\int_{0}^{T}\big(f(t,y(t))-f(t,y^{*})\big)\,dt. (3)

In order to evaluate how well the output trajectories y(t) satisfy the constraints (in other words, the degree of violation of the constraints), we define the fit as the norm of the projection of the cumulative constraints onto the nonnegative orthant:

\mathcal{F}^{T}:=\left\|\left[\int_{0}^{T}\sum_{i=1}^{N}g_{i}(t,y_{i})\,dt\right]_{+}\right\|. (4)

This definition implicitly allows strictly feasible decisions to compensate for violations of the constraints at other times. This is reasonable when the constrained quantities can be stored or preserved, as with average power constraints [19]. Since g_{i}(t,\cdot):\mathbb{R}^{p_{i}}\to\mathbb{R}^{q}, one can define g_{i,j}(t,\cdot):\mathbb{R}^{p_{i}}\to\mathbb{R} as the jth component of g_{i}(t,\cdot), i.e., g_{i}(t,\cdot)=col(g_{i,1}(t,\cdot),\dots,g_{i,q}(t,\cdot)). Further, define F_{j}^{T}:=\int_{0}^{T}\sum_{i=1}^{N}g_{i,j}(t,y_{i})\,dt, j=1,\dots,q, as the jth component of the constraint integral. It can then be easily deduced that

\mathcal{F}^{T}=\sqrt{\sum_{j=1}^{q}\left[F_{j}^{T}\right]_{+}^{2}}.
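Given the q components F_j^T of the constraint integral, the fit can be computed directly from this identity; a minimal Python sketch:

```python
import math

def fit(F):
    # F = [F_1^T, ..., F_q^T]: cumulative constraint integrals.
    # Fit = Euclidean norm of their positive parts, as in (4).
    return math.sqrt(sum(max(0.0, Fj) ** 2 for Fj in F))
```

For instance, fit([-1.0, 3.0, 4.0]) returns 5.0: only the violated components 3 and 4 contribute, while the strictly feasible component is clipped to zero.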
Assumption 1

The communication network \mathcal{G} is undirected and connected.

Assumption 2

Each set \mathcal{Y}_{i} is convex and compact. For t\in[0,T], the functions f_{i}(t,y_{i}) and g_{i}(t,y_{i}) are convex, integrable and bounded on \mathcal{Y}_{i}, i.e., there exist constants K_{f}>0 and K_{g}>0 such that |f_{i}(t,y_{i})|\leq K_{f} and \|g_{i}(t,y_{i})\|\leq K_{g}.

Assumption 3

The set of feasible outputs \mathcal{Y}^{\dagger}:=\{y:y\in\mathcal{Y},\ \sum_{i=1}^{N}g_{i}(t,y_{i})\leq 0,\ \forall t\in[0,T]\} is non-empty.

Definition 1 ([20])

Let \mathcal{S}\subseteq\mathbb{R}^{n} be a closed convex set. Then, for any x\in\mathcal{S} and v\in\mathbb{R}^{n}, the projection of v onto the set \mathcal{S} at the point x is defined as

\Pi_{\mathcal{S}}[x,v]=\lim_{\xi\to 0^{+}}\frac{P_{\mathcal{S}}(x+\xi v)-x}{\xi}.
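For the nonnegative orthant, this limit can be checked numerically with a small \xi: the projected direction keeps v_j wherever x_j>0 or v_j\geq 0, and zeroes v_j on the boundary where x_j=0 and v_j<0. A finite-\xi sketch (our own illustration, not from the paper):

```python
def proj_direction_nonneg(x, v, xi=1e-6):
    # Finite-xi approximation of Pi_S[x, v] for S = R_+^n:
    # (P_S(x + xi*v) - x) / xi with a small xi > 0.
    return [(max(0.0, xj + xi * vj) - xj) / xi for xj, vj in zip(x, v)]
```

With x = [0, 1] and v = [-2, 3], the first component is zeroed (the flow cannot leave the orthant) while the second passes through unchanged.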
Lemma 1 ([19])

Let \mathcal{S}\subseteq\mathbb{R}^{n} be a convex set and let x,y\in\mathcal{S}. Then

(x-y)^{\top}\Pi_{\mathcal{S}}[x,v]\leq(x-y)^{\top}v,\quad\forall v\in\mathbb{R}^{n}. (5)
Assumption 4

(A_{i},B_{i}) is controllable, and

\mathrm{rank}(C_{i}B_{i})=p_{i},\quad i=1,\dots,N.
Lemma 2 ([21])

Under Assumption 4, the matrix equations

C_{i}B_{i}K_{\alpha_{i}}=C_{i}A_{i}, (6a)
C_{i}B_{i}K_{\beta_{i}}=I_{p_{i}}, (6b)

have solutions K_{\alpha_{i}} and K_{\beta_{i}}.
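To make Lemma 2 concrete, here is a toy single-input single-output instance in which C_iB_i is an invertible 1×1 scalar, so (6) is solved by direct division; in general a right inverse of C_iB_i would be used. The matrices below are our own illustrative choice, not from the paper:

```python
# Illustrative SISO agent with n_i = 2, m_i = p_i = 1.
A = [[0.0, 1.0],
     [0.0, 0.0]]
B = [[0.0],
     [1.0]]
C = [[1.0, 1.0]]

# C*B is the scalar 1, so rank(C*B) = p_i = 1 (Assumption 4 holds).
CB = sum(C[0][k] * B[k][0] for k in range(2))
CA = [sum(C[0][k] * A[k][j] for k in range(2)) for j in range(2)]

K_alpha = [ca / CB for ca in CA]  # solves C B K_alpha = C A  (6a)
K_beta = 1.0 / CB                 # solves C B K_beta = I_1   (6b)
```

Here CB = 1 and CA = [0, 1], so K_alpha = [0, 1] and K_beta = 1.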

Remark 1

Assumption 2 is reasonable since output variables in practice, such as voltages, often lie within a certain range. The cost and constraint functions are not required to be differentiable, since subgradients can be used instead. The controllability condition in Assumption 4 is quite standard when dealing with linear systems.

IV MAIN RESULTS

IV-A Continuous Communication

For agent i, to solve the online optimization problem (2), we construct the time-varying Lagrangian

\mathcal{H}_{i}(t,y_{i},\mu_{i})=f_{i}(t,y_{i})+\mu_{i}^{\top}g_{i}(t,y_{i})-K_{\mu}h_{i}, (7)

where \mu_{i}\in\mathbb{R}_{+}^{q} is the local Lagrange multiplier of agent i, K_{\mu}>0 is a preset parameter, and h_{i}:=\sum_{j=1}^{N}a_{ij}\|\mu_{i}-\mu_{j}\|_{1} is a metric of the disagreement of \mu_{i} from its neighbors [22].
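For example, with unit edge weights the disagreement metric h_i is just the sum of 1-norm differences from the neighbors' multipliers; a small Python sketch:

```python
def disagreement(mu_i, neighbor_mus):
    # h_i = sum_j a_ij * ||mu_i - mu_j||_1, assuming a_ij = 1 for
    # every listed neighbor (our simplification for illustration).
    return sum(sum(abs(a - b) for a, b in zip(mu_i, mu_j))
               for mu_j in neighbor_mus)
```

Note that h_i vanishes exactly when agent i agrees with all of its neighbors.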

Notice that f_{i}(t,\cdot) and g_{i}(t,\cdot) are convex and \mu_{i}\geq\boldsymbol{0}_{q}; hence the Lagrangian is convex with respect to y_{i}. Denote by \mathcal{H}_{i}^{y_{i}}(t,y_{i},\mu_{i}) a subgradient of \mathcal{H}_{i} with respect to y_{i}, i.e.,

\mathcal{H}_{i}^{y_{i}}(t,y_{i},\mu_{i})\in\partial f_{i}(t,y_{i})+\mu_{i}^{\top}\partial g_{i}(t,y_{i}). (8)

The Lagrangian is concave with respect to \mu_{i}, and its subgradient with respect to \mu_{i} is given by

\mathcal{H}_{i}^{\mu_{i}}(t,y_{i},\mu_{i})\in g_{i}(t,y_{i})-K_{\mu}\sum_{j=1}^{N}a_{ij}\mathrm{sgn}(\mu_{i}-\mu_{j}). (9)

For simplicity, define

\mathcal{H}(t,y,\mu):=f(t,y)+\mu^{\top}g(t,y)-K_{\mu}h(\mu), (10)

where y=col(y_{1},\dots,y_{N}), \mu=col(\mu_{1},\dots,\mu_{N}), f(t,y)=\sum_{i=1}^{N}f_{i}(t,y_{i}), g(t,y)=col(g_{1}(t,y_{1}),\dots,g_{N}(t,y_{N})), and h(\mu)=\sum_{i=1}^{N}\sum_{j=1}^{N}a_{ij}\|\mu_{i}-\mu_{j}\|_{1}. It can be easily verified that \mathcal{H}(t,y,\mu)=\sum_{i=1}^{N}\mathcal{H}_{i}(t,y_{i},\mu_{i}).

A controller for the ith agent, following a modified Arrow-Hurwicz algorithm, is proposed as

u_{i}=-K_{\alpha_{i}}x_{i}+K_{\beta_{i}}\Pi_{\mathcal{Y}_{i}}[y_{i},-\varepsilon\mathcal{H}_{i}^{y_{i}}(t,y_{i},\mu_{i})], (11a)
\dot{\mu}_{i}=\Pi_{\mathbb{R}_{+}^{q}}[\mu_{i},\varepsilon\mathcal{H}_{i}^{\mu_{i}}(t,y_{i},\mu_{i})], (11b)

where \varepsilon>0 is the step size, K_{\alpha_{i}},K_{\beta_{i}} are feedback matrices that solve (6), and the initial value is \mu_{i}(0)=\boldsymbol{0}.
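To see the controller in action, the following sketch forward-Euler-integrates the closed-loop dynamics for a single scalar agent with A=0, B=C=1 (so K_{\alpha}=0, K_{\beta}=1 solve (6)), cost f(t,y)=(y-1)^2, constraint g(t,y)=y-0.5\leq 0, and \mathcal{Y}=[-2,2]. All numbers are our own toy choices, not from the paper:

```python
# Projected forward-Euler integration of the primal-dual dynamics
# (11)-(12) for one scalar agent; the projected Euler step
# approximates the projected vector fields Pi_Y and Pi_{R_+}.
eps, dt, T = 1.0, 1e-3, 20.0
y, mu = -2.0, 0.0
for _ in range(int(T / dt)):
    grad_y = 2.0 * (y - 1.0) + mu   # subgradient H_i^{y_i}
    grad_mu = y - 0.5               # H_i^{mu_i} (single agent: h_i = 0)
    y = min(2.0, max(-2.0, y - dt * eps * grad_y))   # stay in Y = [-2, 2]
    mu = max(0.0, mu + dt * eps * grad_mu)           # stay in R_+
# y approaches the constrained optimum y* = 0.5 and mu the multiplier 1.
```

The saddle point of (y-1)^2 + \mu(y-0.5) is (y^*,\mu^*)=(0.5,1), and the trajectory settles there, illustrating how the primal-dual flow enforces the constraint.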

Substituting the controller (11) into the system (1), the closed-loop dynamics of the ith agent become

\dot{x}_{i}=(A_{i}-B_{i}K_{\alpha_{i}})x_{i}+B_{i}K_{\beta_{i}}\Pi_{\mathcal{Y}_{i}}[y_{i},-\varepsilon\mathcal{H}_{i}^{y_{i}}(t,y_{i},\mu_{i})], (12a)
\dot{\mu}_{i}=\Pi_{\mathbb{R}_{+}^{q}}[\mu_{i},\varepsilon\mathcal{H}_{i}^{\mu_{i}}(t,y_{i},\mu_{i})], (12b)
y_{i}=C_{i}x_{i}. (12c)

For the subsequent analysis, consider the following energy function for any \tilde{y}\in\mathcal{Y} and \tilde{\mu}\in\mathbb{R}_{+}^{Nq}:

V_{(\tilde{y},\tilde{\mu})}(y,\mu)=\frac{1}{2}\|y-\tilde{y}\|^{2}+\frac{1}{2}\|\mu-\tilde{\mu}\|^{2}. (13)

The following lemma establishes the relationship between the above energy function and the time-varying Lagrangian (7) along the dynamics (12).

Lemma 3

If Assumptions 1-4 hold and \tilde{\mu}:=\boldsymbol{1}_{N}\otimes\gamma for any \gamma\in\mathbb{R}_{+}^{q}, then for any T\geq 0 the trajectories of the linear multi-agent system (1) with control protocol (11) satisfy

\int_{0}^{T}\big(\mathcal{H}(t,y,\tilde{\mu})-\mathcal{H}(t,\tilde{y},\mu)\big)\,dt\leq\frac{V_{(\tilde{y},\tilde{\mu})}(y(0),\boldsymbol{0})}{\varepsilon}. (14)
Proof:

Calculating the time derivative of the energy function (13) together with (12) yields

\dot{V}_{(\tilde{y},\tilde{\mu})}=(\mu-\tilde{\mu})^{\top}\dot{\mu}+(y-\tilde{y})^{\top}\dot{y}
=\sum_{i=1}^{N}(\mu_{i}-\tilde{\mu}_{i})^{\top}\Pi_{\mathbb{R}_{+}^{q}}\big[\mu_{i},\varepsilon\mathcal{H}_{i}^{\mu_{i}}(t,y_{i},\mu_{i})\big]+\sum_{i=1}^{N}(y_{i}-\tilde{y}_{i})^{\top}\Pi_{\mathcal{Y}_{i}}\big[y_{i},-\varepsilon\mathcal{H}_{i}^{y_{i}}(t,y_{i},\mu_{i})\big]
\leq\sum_{i=1}^{N}(\mu_{i}-\tilde{\mu}_{i})^{\top}\big(\varepsilon\mathcal{H}_{i}^{\mu_{i}}(t,y_{i},\mu_{i})\big)+\sum_{i=1}^{N}(y_{i}-\tilde{y}_{i})^{\top}\big(-\varepsilon\mathcal{H}_{i}^{y_{i}}(t,y_{i},\mu_{i})\big)
\leq\varepsilon\sum_{i=1}^{N}\big(\mathcal{H}_{i}(t,y_{i},\mu_{i})-\mathcal{H}_{i}(t,y_{i},\tilde{\mu}_{i})\big)+\varepsilon\sum_{i=1}^{N}\big(\mathcal{H}_{i}(t,\tilde{y}_{i},\mu_{i})-\mathcal{H}_{i}(t,y_{i},\mu_{i})\big)
=\varepsilon\big(\mathcal{H}(t,\tilde{y},\mu)-\mathcal{H}(t,y,\tilde{\mu})\big), (15)

where the second equality holds in view of (6) and (12), the first inequality holds because of (5), and the last inequality holds since the Lagrangian (7) is convex with respect to y_{i} and concave with respect to \mu_{i}.

Integrating (15) from 0 to T on both sides leads to

\int_{0}^{T}\big(\mathcal{H}(t,y,\tilde{\mu})-\mathcal{H}(t,\tilde{y},\mu)\big)\,dt
\leq-\frac{1}{\varepsilon}\int_{0}^{T}\dot{V}_{(\tilde{y},\tilde{\mu})}(y(t),\mu(t))\,dt
=-\frac{1}{\varepsilon}\Big(V_{(\tilde{y},\tilde{\mu})}(y(T),\mu(T))-V_{(\tilde{y},\tilde{\mu})}(y(0),\mu(0))\Big). (16)

Because the energy function (13) is always nonnegative and μ(0)=𝟎\mu(0)=\boldsymbol{0}, the conclusion (14) can be obtained. ∎

We now state the main results on the regret and fit bounds of the continuous-communication controller (11).

Theorem 1

Suppose that Assumptions 1-4 hold. Then for any T\geq 0 and any \varepsilon>0 in control protocol (11), by choosing K_{\mu}\geq NK_{g}, the regret satisfies

\mathcal{R}^{T}\leq\frac{\|y(0)-y^{*}\|^{2}}{2\varepsilon}. (17)
Proof:

By choosing \tilde{y}=y^{*} and \tilde{\mu}=\boldsymbol{0}_{Nq} in Lemma 3, one has

\int_{0}^{T}\big(\mathcal{H}(t,y,\boldsymbol{0}_{Nq})-\mathcal{H}(t,y^{*},\mu)\big)\,dt\leq\frac{V_{(y^{*},\boldsymbol{0}_{Nq})}(y(0),\boldsymbol{0})}{\varepsilon}. (18)

According to (3) and (10), it can be obtained that

\mathcal{R}^{T}=\int_{0}^{T}\big(\mathcal{H}(t,y,\boldsymbol{0}_{Nq})-\mathcal{H}(t,y^{*},\mu)\big)\,dt+\int_{0}^{T}\mu^{\top}g(t,y^{*})\,dt-\int_{0}^{T}K_{\mu}h(\mu)\,dt. (19)

Consider the second term on the right-hand side of (19). Let \phi(\mu):=\mu^{\top}g(t,y^{*}) for simplicity. Then, introducing the average multiplier \bar{\mu}:=\frac{1}{N}\sum_{i=1}^{N}\mu_{i} and using the relationship \sum_{i=1}^{N}g_{i}(t,y_{i}^{*})\leq\boldsymbol{0}, one has that

\phi(\mu)\leq(\mu-\boldsymbol{1}_{N}\otimes\bar{\mu})^{\top}g(t,y^{*}). (20)

Further, one can obtain that

\phi(\mu)^{2}\leq\big((\mu-\boldsymbol{1}_{N}\otimes\bar{\mu})^{\top}g(t,y^{*})\big)^{2}
\leq NK_{g}^{2}\|\mu-\boldsymbol{1}_{N}\otimes\bar{\mu}\|^{2}
=NK_{g}^{2}\sum_{i=1}^{N}\Big\|\mu_{i}-\frac{1}{N}\sum_{j=1}^{N}\mu_{j}\Big\|^{2}
\leq K_{g}^{2}\sum_{i=1}^{N}\sum_{j=1}^{N}\|\mu_{i}-\mu_{j}\|^{2}
\leq K_{g}^{2}\sum_{i=1}^{N}\sum_{j=1}^{N}\|\mu_{i}-\mu_{j}\|_{1}^{2}. (21)

Since \mathcal{G} is connected, for any i_{0},j_{0}\in\mathcal{V} there exists a path connecting i_{0} and j_{0}, so applying the triangle inequality along this path yields

h(\mu)^{2}=\left(\sum_{i=1}^{N}\sum_{j=1}^{N}a_{ij}\|\mu_{i}-\mu_{j}\|_{1}\right)^{2}\geq\|\mu_{i_{0}}-\mu_{j_{0}}\|_{1}^{2}. (22)

Then, combining (21) and (22), for K_{\mu}\geq NK_{g} one has \phi(\mu)^{2}\leq N^{2}K_{g}^{2}h(\mu)^{2}\leq K_{\mu}^{2}h(\mu)^{2}, i.e.,

\phi(\mu)-K_{\mu}h(\mu)\leq 0. (23)

Substituting (18) and (23) into (19) completes the proof. ∎

Theorem 2

Suppose that Assumptions 1-4 hold. Then for any T\geq 0 and any \varepsilon>0 in control protocol (11), by choosing K_{\mu}\geq NK_{g}, the fit satisfies

\mathcal{F}^{T}\leq\frac{\sqrt{N}\|y(0)-y^{*}\|}{\varepsilon}+2N\sqrt{\frac{K_{f}}{\varepsilon}}\sqrt{T}. (24)
Proof:

By Lemma 3 with \tilde{y}=y^{*} and \tilde{\mu}=\boldsymbol{1}_{N}\otimes\gamma, where \gamma=col(\gamma_{1},\dots,\gamma_{q})\in\mathbb{R}_{+}^{q} is a parameter to be determined later, one has

\int_{0}^{T}\big(\mathcal{H}(t,y,\boldsymbol{1}_{N}\otimes\gamma)-\mathcal{H}(t,y^{*},\mu)\big)\,dt
=\int_{0}^{T}\Big(f(t,y)+\gamma^{\top}\sum_{i=1}^{N}g_{i}(t,y_{i})-f(t,y^{*})-\mu^{\top}g(t,y^{*})+K_{\mu}h(\mu)\Big)\,dt
\leq\frac{V_{(\tilde{y},\tilde{\mu})}(y(0),\boldsymbol{0})}{\varepsilon}. (25)

Invoking Assumption 2 yields

\int_{0}^{T}\big(f(t,y^{*})-f(t,y)\big)\,dt\leq 2NK_{f}T. (26)

Substituting (23) and (26) into (25), one has that

\int_{0}^{T}\gamma^{\top}\sum_{i=1}^{N}g_{i}(t,y_{i})\,dt\leq\frac{V_{(\tilde{y},\tilde{\mu})}(y(0),\boldsymbol{0})}{\varepsilon}+2NK_{f}T.

By choosing

\gamma_{j}=\begin{cases}0,&F_{j}^{T}\leq 0,\\ \frac{\varepsilon}{N}F_{j}^{T},&F_{j}^{T}>0,\end{cases}\quad j=1,\dots,q,

it can be concluded that

\frac{\varepsilon}{N}(\mathcal{F}^{T})^{2}\leq\frac{\|y(0)-y^{*}\|^{2}+\frac{\varepsilon^{2}}{N}(\mathcal{F}^{T})^{2}}{2\varepsilon}+2NK_{f}T.

Rearranging terms then yields

\mathcal{F}^{T}\leq\frac{\sqrt{N}\|y(0)-y^{*}\|}{\varepsilon}+2N\sqrt{\frac{K_{f}}{\varepsilon}}\sqrt{T}, (27)

which completes the proof. ∎

Remark 2

Theorems 1 and 2 show that \mathcal{R}^{T}=\mathcal{O}(1) and \mathcal{F}^{T}=\mathcal{O}(\sqrt{T}) under continuous communication. In comparison, explicit regret and fit bounds with sublinear growth are obtained in [12, 18] for single-integrator multi-agent systems, namely \mathcal{R}^{T}=\mathcal{O}(T^{\max\{\kappa,1-\kappa\}}) and \mathcal{F}^{T}=\mathcal{O}(T^{\max\{\kappa,1-\kappa\}}) in [12], and \mathcal{R}^{T}=\mathcal{O}(T^{\max\{\kappa,1-\kappa\}}) and \mathcal{F}^{T}=\mathcal{O}(T^{\max\{\frac{1}{2}+\frac{\kappa}{2},1-\frac{\kappa}{2}\}}) in [18], with \kappa\in(0,1). Hence Theorem 1 achieves a tighter regret bound than [12, 18] under more complex system dynamics.

IV-B Event-triggered Communication

The above continuous-time control law, which requires each agent to know its neighbors' Lagrange multipliers in real time, may cause excessive communication overhead. In this section, an event-triggered protocol is proposed to avoid continuous communication.

For agent i, let t_{i}^{l} denote its lth communication instant and \{t_{i}^{1},\dots,t_{i}^{l},\dots\} its sequence of communication instants. Define \hat{\mu}_{i}(t):=\mu_{i}(t_{i}^{l}),\ \forall t\in[t_{i}^{l},t_{i}^{l+1}), as the information available to its neighbors, and e_{i}:=\hat{\mu}_{i}(t)-\mu_{i}(t) as the measurement error. Note that e_{i}=0 at every communication instant t_{i}^{l}.

An event-triggered control law is proposed as

u_{i}=-K_{\alpha_{i}}x_{i}+K_{\beta_{i}}\Pi_{\mathcal{Y}_{i}}[y_{i},-\varepsilon\mathcal{H}_{i}^{y_{i}}(t,y_{i},\mu_{i})], (28a)
\dot{\mu}_{i}=\Pi_{\mathbb{R}_{+}^{q}}\Big[\mu_{i},\varepsilon g_{i}(t,y_{i})-2\varepsilon K_{\mu}\sum_{j=1}^{N}a_{ij}\mathrm{sgn}(\hat{\mu}_{i}-\hat{\mu}_{j})\Big], (28b)

where \varepsilon>0 is the step size, K_{\alpha_{i}},K_{\beta_{i}} are feedback matrices that solve (6), a_{ij} is the weight of edge (v_{j},v_{i}), and the initial value is \mu_{i}(0)=\boldsymbol{0}. Note that the sign function in (28b) is taken as 0 when its argument is zero.

Substituting the controller (28) into (1), the closed-loop dynamics of the ith agent become

\dot{x}_{i}=(A_{i}-B_{i}K_{\alpha_{i}})x_{i}+B_{i}K_{\beta_{i}}\Pi_{\mathcal{Y}_{i}}[y_{i},-\varepsilon\mathcal{H}_{i}^{y_{i}}(t,y_{i},\mu_{i})], (29a)
\dot{\mu}_{i}=\Pi_{\mathbb{R}_{+}^{q}}\Big[\mu_{i},\varepsilon g_{i}(t,y_{i})-2\varepsilon K_{\mu}\sum_{j=1}^{N}a_{ij}\mathrm{sgn}(\hat{\mu}_{i}-\hat{\mu}_{j})\Big], (29b)
y_{i}=C_{i}x_{i}. (29c)

The communication instants are chosen as

t_{i}^{l+1}:=\inf_{t>t_{i}^{l}}\Big\{t\ \Big|\ \|e_{i}(t)\|\geq\frac{1}{6N\sqrt{q}}\sum_{j=1}^{N}a_{ij}\|\hat{\mu}_{i}-\hat{\mu}_{j}\|_{1}+\frac{\sigma e^{-\iota t}}{3N^{2}K_{\mu}\sqrt{q}}\Big\}, (30)

where \sigma and \iota are prespecified positive real numbers.
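The rule (30) can be evaluated locally by each agent between communications; a hedged Python sketch (the function and argument names are ours):

```python
import math

def should_trigger(e_i, held_gaps, t, N, q, K_mu, sigma, iota):
    # e_i       : current measurement error vector mu_hat_i - mu_i
    # held_gaps : values a_ij * ||mu_hat_i - mu_hat_j||_1 over neighbors j
    # Returns True when the threshold in (30) is reached, i.e. when
    # agent i should broadcast its current multiplier.
    err = math.sqrt(sum(c * c for c in e_i))
    thr = (sum(held_gaps) / (6.0 * N * math.sqrt(q))
           + sigma * math.exp(-iota * t) / (3.0 * N ** 2 * K_mu * math.sqrt(q)))
    return err >= thr
```

The exponentially decaying term keeps the threshold strictly positive at every finite time, which is what rules out Zeno behavior.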

The following lemma is a modification of Lemma 3 under event-triggered communication.

Lemma 4

If Assumptions 1-4 hold and \tilde{\mu}=\boldsymbol{1}_{N}\otimes\gamma for any \gamma\in\mathbb{R}_{+}^{q}, then for any T\geq 0 the trajectories of the linear multi-agent system (1) with control protocol (29) satisfy

\int_{0}^{T}\big(\mathcal{H}(t,y,\tilde{\mu})-\mathcal{H}(t,\tilde{y},\mu)\big)\,dt\leq\frac{V_{(\tilde{y},\tilde{\mu})}(y(0),\boldsymbol{0})}{\varepsilon}+\frac{\sigma}{\iota}. (31)
Proof:

Similar to (15), one can obtain that

\dot{V}_{(\tilde{y},\tilde{\mu})}=(\mu-\tilde{\mu})^{\top}\dot{\mu}+(y-\tilde{y})^{\top}\dot{y}
=\sum_{i=1}^{N}(\mu_{i}-\tilde{\mu}_{i})^{\top}\Pi_{\mathbb{R}_{+}^{q}}\Big[\mu_{i},\varepsilon g_{i}(t,y_{i})-2\varepsilon K_{\mu}\sum_{j=1}^{N}a_{ij}\mathrm{sgn}(\hat{\mu}_{i}-\hat{\mu}_{j})\Big]+\sum_{i=1}^{N}(y_{i}-\tilde{y}_{i})^{\top}\Pi_{\mathcal{Y}_{i}}\big[y_{i},-\varepsilon\mathcal{H}_{i}^{y_{i}}(t,y_{i},\mu_{i})\big]
\leq\underbrace{\varepsilon\sum_{i=1}^{N}(\mu_{i}-\tilde{\mu}_{i})^{\top}\mathcal{H}_{i}^{\mu_{i}}(t,y_{i},\mu_{i})-\varepsilon\sum_{i=1}^{N}(y_{i}-\tilde{y}_{i})^{\top}\mathcal{H}_{i}^{y_{i}}(t,y_{i},\mu_{i})}_{S_{1}}\underbrace{-2\varepsilon K_{\mu}\sum_{i=1}^{N}(\mu_{i}-\tilde{\mu}_{i})^{\top}\sum_{j=1}^{N}a_{ij}\mathrm{sgn}(\hat{\mu}_{i}-\hat{\mu}_{j})}_{S_{2}}+\underbrace{\varepsilon K_{\mu}\sum_{i=1}^{N}(\mu_{i}-\tilde{\mu}_{i})^{\top}\sum_{j=1}^{N}a_{ij}\mathrm{sgn}(\mu_{i}-\mu_{j})}_{S_{3}}. (32)

Since the Lagrangian (7) is convex with respect to yiy_{i} and concave with respect to μi\mu_{i}, one has that

S_{1}\leq\varepsilon\sum_{i=1}^{N}\big(\mathcal{H}_{i}(t,\tilde{y}_{i},\mu_{i})-\mathcal{H}_{i}(t,y_{i},\mu_{i})\big)+\varepsilon\sum_{i=1}^{N}\big(\mathcal{H}_{i}(t,y_{i},\mu_{i})-\mathcal{H}_{i}(t,y_{i},\tilde{\mu}_{i})\big)
=\varepsilon\big(\mathcal{H}(t,\tilde{y},\mu)-\mathcal{H}(t,y,\tilde{\mu})\big). (33)

Since \tilde{\mu}=\boldsymbol{1}_{N}\otimes\gamma with \gamma\in\mathbb{R}_{+}^{q} and the graph \mathcal{G} is undirected, it follows that

S_{2}=-2\varepsilon K_{\mu}\sum_{i=1}^{N}\sum_{j=1}^{N}a_{ij}\mu_{i}^{\top}\mathrm{sgn}(\hat{\mu}_{i}-\hat{\mu}_{j})
=-\varepsilon K_{\mu}\sum_{i=1}^{N}\sum_{j=1}^{N}a_{ij}\mu_{i}^{\top}\mathrm{sgn}(\hat{\mu}_{i}-\hat{\mu}_{j})-\varepsilon K_{\mu}\sum_{j=1}^{N}\sum_{i=1}^{N}a_{ji}\mu_{j}^{\top}\mathrm{sgn}(\hat{\mu}_{j}-\hat{\mu}_{i})
=-\varepsilon K_{\mu}\sum_{i=1}^{N}\sum_{j=1}^{N}a_{ij}(\mu_{i}-\mu_{j})^{\top}\mathrm{sgn}(\hat{\mu}_{i}-\hat{\mu}_{j})
=-\varepsilon K_{\mu}\sum_{i=1}^{N}\sum_{j=1}^{N}a_{ij}(\hat{\mu}_{i}-\hat{\mu}_{j}-e_{i}+e_{j})^{\top}\mathrm{sgn}(\hat{\mu}_{i}-\hat{\mu}_{j})
\leq-\varepsilon K_{\mu}\sum_{i=1}^{N}\sum_{j=1}^{N}a_{ij}\|\hat{\mu}_{i}-\hat{\mu}_{j}\|_{1}+\varepsilon K_{\mu}\sqrt{q}\sum_{i=1}^{N}\sum_{j=1}^{N}\|e_{i}-e_{j}\|, (34)

where the last inequality holds due to the relation \|v\|_{1}\leq\sqrt{q}\|v\|,\forall v\in\mathbb{R}^{q}.
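The norm relation invoked here can be spot-checked numerically (a quick sanity check on random vectors, not part of the proof):

```python
import numpy as np

# Spot-check of ||v||_1 <= sqrt(q) * ||v|| on random vectors in R^q.
rng = np.random.default_rng(0)
q = 5
samples = rng.normal(size=(1000, q))
holds = all(np.linalg.norm(v, 1) <= np.sqrt(q) * np.linalg.norm(v) + 1e-12
            for v in samples)
```

Equality is attained only when all entries of v have equal magnitude, so the bound is tight but safe.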

Similarly, one has that

S3=\displaystyle S_{3}= εKμi=1Nμij=1Naijsgn(μiμj)\displaystyle\varepsilon K_{\mu}\sum_{i=1}^{N}\!\mu_{i}^{\top}\sum_{j=1}^{N}a_{ij}\mathrm{sgn}(\mu_{i}-\mu_{j})
=\displaystyle= εKμ2i=1Nj=1Naijμiμj1\displaystyle\frac{\varepsilon K_{\mu}}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}a_{ij}\|\mu_{i}-\mu_{j}\|_{1}
=\displaystyle= εKμ2i=1Nj=1Naijμ^iμ^jei+ej1\displaystyle\frac{\varepsilon K_{\mu}}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}a_{ij}\|\hat{\mu}_{i}-\hat{\mu}_{j}-e_{i}+e_{j}\|_{1}
\displaystyle\leq εKμ2i=1Nj=1Naijμ^iμ^j1+εKμq2i=1Nj=1Neiej.\displaystyle\frac{\varepsilon K_{\mu}}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}a_{ij}\|\hat{\mu}_{i}{-}\hat{\mu}_{j}\|_{1}+\frac{\varepsilon K_{\mu}\sqrt{q}}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}\|e_{i}{-}e_{j}\|. (35)

For the second term in (34) and (35), one can calculate that

i=1Nj=1Neiej\displaystyle\sum_{i=1}^{N}\sum_{j=1}^{N}\|e_{i}-e_{j}\|
\displaystyle\leq i=1Nj=1Nei+i=1Nj=1Nej\displaystyle\sum_{i=1}^{N}\sum_{j=1}^{N}\|e_{i}\|+\sum_{i=1}^{N}\sum_{j=1}^{N}\|e_{j}\|
\displaystyle\leq 2Ni=1Nei\displaystyle 2N\sum_{i=1}^{N}\|e_{i}\|
\displaystyle\leq 13qi=1Nj=1Naijμ^iμ^j1+2σeιt3Kμq,\displaystyle\frac{1}{3\sqrt{q}}\sum_{i=1}^{N}\sum_{j=1}^{N}a_{ij}\|\hat{\mu}_{i}-\hat{\mu}_{j}\|_{1}+\frac{2\sigma e^{-\iota t}}{3K_{\mu}\sqrt{q}}, (36)

where the last inequality holds due to the trigger condition (30).

Combining the above analysis, \dot{V}_{(\tilde{y},\tilde{\mu})} in (32) can be bounded as

V˙(y~,μ~)\displaystyle\dot{V}_{(\tilde{y},\tilde{\mu})}\leq S1+S2+S3\displaystyle S_{1}+S_{2}+S_{3}
\displaystyle\leq ε((t,y~,μ)(t,y,μ~))+εσeιt.\displaystyle\varepsilon\Big{(}\mathcal{H}(t,\tilde{y},\mu)-\mathcal{H}(t,y,\tilde{\mu})\Big{)}+\varepsilon\sigma e^{-\iota t}.

Integrating both sides from 0 to T and omitting the negative terms yields

0T((t,y,μ~)(t,y~,μ))𝑑tV(y~,μ~)(y(0),𝟎)ε+σι.\displaystyle\int_{0}^{T}\Big{(}\mathcal{H}(t,y,\tilde{\mu}){-}\mathcal{H}(t,\tilde{y},\mu)\Big{)}dt{\leq}\frac{V_{(\tilde{y},\tilde{\mu})}(y(0),\boldsymbol{0})}{\varepsilon}{+}\frac{\sigma}{\iota}. (37)
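The \sigma/\iota term in (37) arises from \int_{0}^{T}\sigma e^{-\iota t}dt=(\sigma/\iota)(1-e^{-\iota T})\leq\sigma/\iota, which can be confirmed numerically; the values of \sigma, \iota, and T below are arbitrary samples:

```python
import numpy as np

# Confirm int_0^T sigma * exp(-iota*t) dt = (sigma/iota)(1 - exp(-iota*T))
# <= sigma/iota via a trapezoidal sum (sample parameter values).
sigma, iota, T = 2.0, 0.5, 10.0
n = 100000
ts = np.linspace(0.0, T, n + 1)
vals = sigma * np.exp(-iota * ts)
integral = float(np.sum(0.5 * (vals[:-1] + vals[1:])) * (T / n))
closed_form = (sigma / iota) * (1.0 - np.exp(-iota * T))
```
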

We now state the main results on the regret and fit bounds of the event-triggered communication controller (29).

Theorem 3

Suppose that Assumptions 1-4 hold. Then for any T0T\geq 0 and ε>0\varepsilon>0 in control protocol (28), by choosing KμNKgK_{\mu}\geq NK_{g}, the following regret bound holds:

Ty(0)y22ε+σι.\displaystyle\mathcal{R}^{T}\leq\frac{\|y(0)\!-\!y^{*}\|^{2}}{2\varepsilon}+\frac{\sigma}{\iota}. (38)
Theorem 4

Suppose that Assumptions 1-4 hold. Then for any T0T\geq 0 and ε>0\varepsilon>0 in control protocol (28), by choosing KμNKgK_{\mu}\geq NK_{g}, the following fit bound holds:

TNy(0)yε+2Nσει+2NKfεT.\displaystyle\mathcal{F}^{T}\leq\frac{\sqrt{N}\|y(0)-y^{*}\|}{\varepsilon}+\sqrt{\frac{2N\sigma}{\varepsilon\iota}}+2N\sqrt{\frac{K_{f}}{\varepsilon}}\sqrt{T}. (39)

The proofs of Theorems 3 and 4 are similar to those of Theorems 1 and 2, except that Lemma 4 is used instead of Lemma 3, and are thus omitted here.

Remark 3

Theorems 3 and 4 show that \mathcal{R}^{T}=\mathcal{O}(1) and \mathcal{F}^{T}=\mathcal{O}(\sqrt{T}) still hold even under event-triggered communication. The regret and fit bounds depend on the communication parameters: decreasing \sigma and increasing \iota yields smaller bounds on regret and fit, but meanwhile increases the communication frequency, which results in a tradeoff between performance and communication cost.
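Condition (30) is not restated here, but its per-agent threshold can be read off from the right-hand side of (43): agent i transmits once its local error norm reaches a disagreement-plus-exponential threshold. A minimal sketch under this assumption (the function name `should_trigger` and the parameter values are illustrative, not from the paper):

```python
import numpy as np

def should_trigger(e_i, mu_hat_i, neighbors, t, N, q, K_mu, sigma, iota):
    """Hypothetical per-agent trigger check, with the threshold taken
    from the right-hand side of (43); `neighbors` is a list of
    (a_ij, mu_hat_j) pairs for agent i."""
    disagreement = sum(a_ij * np.linalg.norm(mu_hat_i - mu_hat_j, 1)
                       for a_ij, mu_hat_j in neighbors)
    threshold = (disagreement / (6 * N * np.sqrt(q))
                 + sigma * np.exp(-iota * t) / (3 * N**2 * K_mu * np.sqrt(q)))
    return float(np.linalg.norm(e_i)) >= threshold

# Right after a triggering instant e_i resets to zero, so the strictly
# positive threshold prevents an immediate re-trigger:
quiet = should_trigger(np.zeros(1), np.zeros(1), [(1.0, np.zeros(1))],
                       t=0.0, N=5, q=1, K_mu=10.0, sigma=2.0, iota=0.5)
```

Smaller \sigma or larger \iota shrinks the exponential term, so the threshold is crossed sooner and agents communicate more often, consistent with the tradeoff noted above.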

Theorem 5

Under the event-triggering condition (30), system (29) does not exhibit Zeno behavior.

Proof:

In the trigger interval [til,til+1)[t_{i}^{l},t_{i}^{l+1}) for agent ii, combining the definition of eie_{i} with (28b), one can write the upper right-hand Dini derivative as

D+ei(t)=Π+q[μi,εgi(t,yi)2εKμj=1Naijsgn(μ^iμ^j)].\displaystyle D^{+}e_{i}(t){=}-\!\Pi_{\mathbb{R}_{+}^{q}}[\mu_{i},\varepsilon g_{i}(t,y_{i}){-}2\varepsilon K_{\mu}\sum_{j=1}^{N}a_{ij}\mathrm{sgn}(\hat{\mu}_{i}{-}\hat{\mu}_{j})]. (40)

It is obvious that ei(til)=0e_{i}(t_{i}^{l})=0. Then, for t(til,til+1)t\in(t_{i}^{l},t_{i}^{l+1}), the solution of (40) is

\displaystyle e_{i}(t)=\int_{t_{i}^{l}}^{t}-\Pi_{\mathbb{R}_{+}^{q}}\big[\mu_{i},\varepsilon g_{i}(\tau,y_{i})-2\varepsilon K_{\mu}\sum_{j=1}^{N}a_{ij}\mathrm{sgn}(\hat{\mu}_{i}-\hat{\mu}_{j})\big]\,d\tau. (41)

From Assumption 2 and the inequality \|\Pi_{\mathcal{S}}[x,v]\|\leq\|v\| (cf. Remark 2.1 in [20]), the norm of the integrand in (41) is bounded; denote its upper bound by \delta. It then follows from (41) that

ei(t)δ(ttil).\displaystyle\|e_{i}(t)\|\leq\delta(t-t_{i}^{l}). (42)

Hence, condition (30) will definitely not be triggered before the following condition holds:

δ(ttil)=16Nqj=1Naijμ^iμ^j1+σeιt3N2Kμq.\displaystyle\delta(t-t_{i}^{l})=\frac{1}{6N\sqrt{q}}\sum_{j=1}^{N}a_{ij}\|\hat{\mu}_{i}-\hat{\mu}_{j}\|_{1}{+}\frac{\sigma e^{-\iota t}}{3N^{2}K_{\mu}\sqrt{q}}. (43)

The right-hand side of (43) is positive for any finite time t, which implies that t-t_{i}^{l}>0 and hence that t_{i}^{l+1}-t_{i}^{l} is strictly positive for any finite time. Therefore, no Zeno behavior is exhibited. ∎
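A rough numerical illustration of this argument, under assumed sample values for N, q, K_\mu, \sigma, \iota, the bound \delta, and a horizon T: even when the disagreement term in (43) vanishes, the inter-event time on [0,T] is lower-bounded by a strictly positive quantity.

```python
import numpy as np

# Lower bound on the inter-event time implied by (43): even if the
# disagreement term is zero, equality cannot occur before the
# exponential term is matched. All parameter values are illustrative.
N, q, K_mu, sigma, iota, delta, T = 5, 1, 10.0, 2.0, 0.5, 4.0, 20.0
tau_min = sigma * np.exp(-iota * T) / (3 * N**2 * K_mu * np.sqrt(q) * delta)
# tau_min > 0 on any finite horizon, ruling out Zeno behavior.
```
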

V SIMULATION

Consider a heterogeneous multi-agent system composed of 5 agents described by (1), where x_{i}\in\mathbb{R}^{2} and y_{i}=(y_{i,a},y_{i,b})\in\mathbb{R}^{2} for i=1,2,3, while x_{i}\in\mathbb{R}^{3} and y_{i}=(y_{i,a},y_{i,b},y_{i,c})\in\mathbb{R}^{3} for i=4,5. The system matrices are A_{1,2}=[1,0;0,2], A_{3}=[0,2;-1,1], A_{4,5}=[2,1,0;0,1,1;1,0,2], B_{1,2}=[0,1;1,3], B_{3}=[2,1;1,0], B_{4,5}=[1,0,0;0,1,0;0,0,1], C_{1,2}=[2,0;0,1], C_{3}=[2,1;-1,0], C_{4,5}=[3,0,0;0,1,0;0,1,2].

Refer to caption
Figure 1: Communication network among five agents.
Refer to caption
Figure 2: Evolution of T\mathcal{R}^{T} and T/T\mathcal{F}^{T}/T with continuous communication.

The local objective functions are time-varying quadratic functions as follows:

f1(t,y1)=\displaystyle f_{1}(t,y_{1})= (y1,acost1)2+(y1,bcos1.5t1.5)2;\displaystyle(y_{1,a}{-}\cos{t}{-}1)^{2}{+}(y_{1,b}{-}\cos{1.5t}{-}1.5)^{2};
f2(t,y2)=\displaystyle f_{2}(t,y_{2})= 2(y2,acost1)2+3(y2,bcos1.7t1)2;\displaystyle 2(y_{2,a}{-}\cos{t}{-}1)^{2}{+}3(y_{2,b}{-}\cos{1.7t}{-}1)^{2};
f3(t,y3)=\displaystyle f_{3}(t,y_{3})= 2(y3,acost1)2+(y3,bcos2t1)2;\displaystyle 2(y_{3,a}{-}\cos{t}{-}1)^{2}{+}(y_{3,b}{-}\cos{2t}{-}1)^{2};
f4(t,y4)=\displaystyle f_{4}(t,y_{4})= 0.5(y4,acost2)2+(y4,bcos1.2t1)2\displaystyle 0.5(y_{4,a}{-}\cos{t}{-}2)^{2}{+}(y_{4,b}{-}\cos{1.2t}{-}1)^{2}
+(y4,ccos1.5t2)2;\displaystyle{+}(y_{4,c}{-}\cos{1.5t}{-}2)^{2};
f5(t,y5)=\displaystyle f_{5}(t,y_{5})= 2(y5,acost1)2+3(y5,bcos1.5t1)2\displaystyle 2(y_{5,a}{-}\cos{t}{-}1)^{2}{+}3(y_{5,b}{-}\cos{1.5t}{-}1)^{2}
+(y5,ccos2t1.5)2.\displaystyle{+}(y_{5,c}{-}\cos{2t}{-}1.5)^{2}.
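For reference, the local cost of agent 1 can be coded directly from its definition above (agents 2-5 are analogous); at t=0 and y_{1}=(0,0) it evaluates to (0-2)^{2}+(0-2.5)^{2}=10.25:

```python
import numpy as np

# Agent 1's local cost from the simulation (agents 2-5 are analogous).
def f1(t, y1):
    y1a, y1b = y1
    return (y1a - np.cos(t) - 1.0)**2 + (y1b - np.cos(1.5 * t) - 1.5)**2

value = f1(0.0, (0.0, 0.0))  # (0-2)^2 + (0-2.5)^2 = 10.25
```
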

The feasible set of output variables is \mathcal{Y}=[-1,5]^{12}. The constraints are defined by the time-varying functions

g_{1}(t,y_{1})=(0.5\sin{10t}+1.5)y_{1,a}+(0.5\sin{15t}+1.5)y_{1,b}-1;
g_{2}(t,y_{2})=(0.3\sin{10t}+1.7)y_{2,a}+(0.1\sin{25t}+1.9)y_{2,b}-3;
g_{3}(t,y_{3})=(0.6\sin{20t}+1.4)y_{3,a}+(0.5\sin{15t}+1.5)y_{3,b}-4;
g_{4}(t,y_{4})=(0.5\sin{20t}+1.5)y_{4,a}+(0.4\sin{25t}+1.6)y_{4,b}+(0.4\sin{25t}+1.6)y_{4,c}-2;
g_{5}(t,y_{5})=(0.5\sin{10t}+1.5)y_{5,a}+(0.6\sin{15t}+1.4)y_{5,b}+(0.6\sin{15t}+1.4)y_{5,c}-5.

The above constraint selection ensures that \boldsymbol{0} is a strictly feasible solution for all t\in[0,T].
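This strict feasibility claim can be verified directly: at y_{i}=\boldsymbol{0} the time-varying coefficients multiply zero, so each g_{i}(t,\boldsymbol{0}) reduces to its negative constant offset (agent 1 shown below as an example):

```python
import numpy as np

# Agent 1's constraint function from the simulation.
def g1(t, y1):
    y1a, y1b = y1
    return ((0.5 * np.sin(10 * t) + 1.5) * y1a
            + (0.5 * np.sin(15 * t) + 1.5) * y1b - 1.0)

# At the origin the time-varying coefficients multiply zero, so g1
# reduces to its constant offset -1 for every t; the other agents'
# offsets (-3, -4, -2, -5) are likewise negative.
feasible = all(g1(t, (0.0, 0.0)) < 0 for t in np.linspace(0.0, 10.0, 101))
```
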

The clairvoyant optimal output can be computed by solving the problem

y:=argminy𝒴0Ti=1Nfi(t,yi)dt,s.t.i=1Ngi(t,yi)𝟎.\displaystyle\begin{split}y^{*}\!:=\mathop{\arg\min}_{y\in\mathcal{Y}}\int_{0}^{T}\sum_{i=1}^{N}f_{i}(t,y_{i})\,dt,\\ s.t.\sum_{i=1}^{N}g_{i}(t,y_{i})\leq\boldsymbol{0}.\end{split} (44)

The communication network among these agents is depicted in Fig. 1. It can be verified that Assumptions 1-4 hold.

For the numerical example, the selection of feedback matrices is based on (6), where Kα1,2K_{\alpha_{1,2}} == [3,2;1,0][-3,2;1,0], Kα3K_{\alpha_{3}} == [1,1;2,0][-1,1;2,0], Kα4,5K_{\alpha_{4,5}} == [2,1,0;0,1,1;1,0,2][2,1,0;0,1,1;1,0,2], Kβ1,2K_{\beta_{1,2}} == [1.5,1;0.5,0][-1.5,1;0.5,0], Kβ3K_{\beta_{3}} == [1,2;2,5][1,2;-2,-5], Kβ4,5K_{\beta_{4,5}} == [0.333,0,0;0,1,0;0,0.5,0.5][0.333,0,0;0,1,0;0,-0.5,0.5]. The initial values xi(0)x_{i}(0) are randomly selected in [5,5][-5,5] and μ(0)=𝟎\mu(0)=\boldsymbol{0}.

Refer to caption
Figure 3: Evolution of T\mathcal{R}^{T} and T/T\mathcal{F}^{T}/T with event-triggered communication.
Refer to caption
Figure 4: Triggering instants of five agents.

Fig. 2 illustrates that the continuous-time control law achieves a constant regret bound and a sublinear fit bound, in accordance with the results established in Theorems 1-2. Similar results can be observed in Fig. 3 under event-triggered communication. Fig. 4 shows the communication instants of the five agents with event-triggered control laws, from which one can observe that the communication among the five agents is discrete and exhibits no Zeno behavior.

VI CONCLUSION

In this paper, we studied distributed online convex optimization for heterogeneous linear multi-agent systems with time-varying cost functions and time-varying coupling inequality constraints. A distributed controller based on the saddle-point method was proposed and shown to achieve a constant regret bound and a sublinear fit bound. To avoid continuous communication and reduce the communication cost, an event-triggered communication scheme free of Zeno behavior was developed, which also achieves a constant regret bound and a sublinear fit bound.

References

  • [1] M. A. Dahleh and I. J. Diaz-Bobillo, Control of Uncertain Systems: A Linear Programming Approach. Prentice-Hall, Englewood Cliffs, NJ, 1995.
  • [2] Z. Luo, “Applications of convex optimization in signal processing and digital communication,” Math. Program., vol. 97, no. 1-2, pp. 177–207, 2003.
  • [3] J. Qiu, Q. Wu, G. Ding, Y. Xu, and S. Feng, “A survey of machine learning for big data processing,” EURASIP J. Adv. Signal Process., vol. 2016, p. 67, 2016.
  • [4] S. Shalev-Shwartz, “Online learning and online convex optimization,” Foundations and Trends in Machine Learning, vol. 4, no. 2, pp. 107–194, 2011.
  • [5] X. Li, L. Xie, and N. Li, “A survey of decentralized online learning,” arXiv preprint arXiv:2205.00473, 2022.
  • [6] M. Zinkevich, “Online convex programming and generalized infinitesimal gradient ascent,” in Proceedings, Twentieth International Conference on Machine Learning, vol. 2, pp. 928–935, 2003.
  • [7] X. Zhou, E. Dallanese, L. Chen, and A. Simonetto, “An incentive-based online optimization framework for distribution grids,” IEEE Transactions on Automatic Control, vol. 63, no. 7, pp. 2019–2031, 2018.
  • [8] S. Lee, A. Ribeiro, and M. Zavlanos, “Distributed continuous-time online optimization using saddle-point methods,” in IEEE 55th Conference on Decision and Control, pp. 4314–4319, 2016.
  • [9] S. Paternain, S. Lee, M. Zavlanos, and A. Ribeiro, “Distributed constrained online learning,” IEEE Transactions on Signal Processing, vol. 68, pp. 3486–3499, 2020.
  • [10] D. Yuan, D. Ho, and G.-P. Jiang, “An adaptive primal-dual subgradient algorithm for online distributed constrained optimization,” IEEE Transactions on Cybernetics, vol. 48, no. 11, pp. 3045–3055, 2018.
  • [11] X. Li, X. Yi, and L. Xie, “Distributed online optimization for multi-agent networks with coupled inequality constraints,” IEEE Transactions on Automatic Control, vol. 66, no. 8, pp. 3575–3591, 2021.
  • [12] X. Yi, X. Li, L. Xie, and K. Johansson, “Distributed online convex optimization with time-varying coupled inequality constraints,” IEEE Transactions on Signal Processing, vol. 68, pp. 731–746, 2020.
  • [13] X. Yi, X. Li, T. Yang, L. Xie, T. Chai, and K. H. Johansson, “Distributed bandit online convex optimization with time-varying coupled inequality constraints,” IEEE Transactions on Automatic Control, vol. 66, no. 10, pp. 4620–4635, 2021.
  • [14] X. Li, X. Yi, and L. Xie, “Distributed online convex optimization with an aggregative variable,” IEEE Transactions on Control of Network Systems, pp. 1–8, 2021.
  • [15] Z. Deng, Y. Zhang, and Y. Hong, “Distributed online optimization of high-order multi-agent systems,” in 2016 35th Chinese Control Conference (CCC), pp. 7672–7677, 2016.
  • [16] M. Nonhoff and M. Müller, “Online gradient descent for linear dynamical systems,” in IFAC-PapersOnLine, vol. 53, pp. 945–952, 2020.
  • [17] M. Nonhoff and M. A. Müller, “Data-driven online convex optimization for control of dynamical systems,” arXiv preprint arXiv:2103.09127, 2021.
  • [18] J. Li, C. Gu, Z. Wu, and T. Huang, “Online learning algorithm for distributed convex optimization with time-varying coupled constraints and bandit feedback,” IEEE Transactions on Cybernetics, vol. 52, no. 2, pp. 1009–1020, 2022.
  • [19] S. Paternain and A. Ribeiro, “Online learning of feasible strategies in unknown environments,” IEEE Transactions on Automatic Control, vol. 62, no. 6, pp. 2807–2822, 2017.
  • [20] D. Zhang and A. Nagurney, “On the stability of projected dynamical systems,” Journal of Optimization Theory and Applications, vol. 85, no. 1, pp. 97–124, 1995.
  • [21] L. Li, Y. Yu, X. Li, and L. Xie, “Exponential convergence of distributed optimization for heterogeneous linear multi-agent systems over unbalanced digraphs,” Automatica, vol. 141, p. 110259, 2022.
  • [22] S. Liang, X. Zeng, and Y. Hong, “Distributed nonsmooth optimization with coupled inequality constraints via modified Lagrangian function,” IEEE Transactions on Automatic Control, vol. 63, no. 6, pp. 1753–1759, 2018.