Continuous-Time and Event-Triggered Online Optimization for Linear Multi-Agent Systems
Abstract
This paper studies the decentralized online convex optimization problem for heterogeneous linear multi-agent systems. Each agent has access to a time-varying local cost function that depends on its own output, and the agents are also subject to time-varying coupling inequality constraints. The goal of each agent is to minimize the global cost function by selecting appropriate local actions using only communication with its neighbors. We design a distributed controller based on the saddle-point method that achieves a constant regret bound and a sublinear fit bound. In addition, to reduce the communication overhead, we propose an event-triggered communication scheme and show that the constant regret bound and sublinear fit bound are still achieved under discrete communications with no Zeno behavior. A numerical example is provided to verify the proposed algorithms.
I INTRODUCTION
Convex optimization has been widely studied as an effective tool in fields involving optimization and decision-making, such as automatic control systems [1], communication networks [2], and machine learning [3]. Early work on convex optimization was based on fixed cost functions and static constraints. In practice, however, the costs and constraints of many problems may be time-varying and a priori unknown [4]. This motivated online convex optimization (OCO), which requires the decision maker to choose an action at each instant based only on past information. A widely used performance criterion in OCO is regret, that is, the gap between the cumulative loss of the selected actions and that of the best action chosen in hindsight with full knowledge of the problem. If the regret is sublinear, the time-averaged loss of the selected actions is asymptotically no greater than that of the best hindsight action. Another performance indicator is fit, which measures the degree of violation of static or time-varying inequality constraints. For more details, we refer the reader to the recent survey [5].
The OCO framework was introduced in [6], where a projection-based online gradient descent algorithm was analyzed. Under static constraints, the algorithm was shown to achieve a sublinear static regret bound for time-varying convex cost functions with bounded subgradients. With the increase of data scale and problem complexity, distributed online convex optimization has also been widely studied in recent years [7]. In the continuous-time setting, the saddle-point algorithm proposed in [8] under constant constraints achieves sublinear bounds on both the network disagreement and the regret. The authors of [9] generalized this result to time-varying constraints. In the discrete-time setting, [10, 11] used distributed primal-dual algorithms to solve online convex optimization with static independent and coupled inequality constraints. To handle time-varying coupling constraints, the authors of [12] proposed a distributed online primal-dual dynamic mirror descent algorithm that achieves sublinear dynamic regret and constraint violation. A gradient-free distributed bandit online algorithm was proposed in [13], which is applicable to scenarios where the gradient information of the cost functions is difficult to obtain. For local cost functions containing aggregate variables, an online distributed gradient tracking algorithm was developed in [14] based on true or stochastic gradients.
In actual physical systems, the implementation of optimization strategies must take into account the dynamics of each agent. Along this line, only a few works have investigated online convex optimization with physical systems in recent years. For continuous-time multi-agent systems with high-order integrator dynamics, the authors of [15] combined a PI control idea with gradient descent to solve the distributed OCO problem. The authors of [16] considered the online convex optimization problem for linear systems, but did not consider any constraints. The online convex optimization problem for linear time-invariant (LTI) systems was studied in [17] based on behavioral systems theory, where a proposed data-driven algorithm that does not rely on a model achieves sublinear convergence. However, these two papers on linear systems only provide centralized algorithms; a distributed online optimization algorithm for linear systems has yet to be studied.
The main contributions of this paper are as follows.
•
Compared with the centralized OCO algorithms for linear systems without constraints [16, 17], this paper studies the distributed online optimization of heterogeneous multi-agent systems with time-varying coupled inequality constraints for the first time. Agents rely only on their own and their neighbors' information to make decisions, yet achieve a constant regret bound and a sublinear fit bound. In comparison, most existing algorithms for distributed online optimization with coupled inequality constraints [12, 18] only achieve inferior sublinear regret bounds.
•
To reduce the communication overhead, an event-triggered communication scheme is further proposed. It is shown that the constant regret bound and sublinear fit bound are preserved under the resulting discrete communications, and that no Zeno behavior occurs.
The rest of the paper is organized as follows. Preliminaries are given in Section II. In Section III, the heterogeneous multi-agent system under investigation is described mathematically, the online convex optimization problem is defined and some useful lemmas are given. Following that, the control laws with continuous and event-triggered communication are proposed, respectively, and the constant regret bound and sublinear fit bound are established in Section IV. Then, a simulation example is provided to verify the effectiveness of the algorithm in Section V. Finally, the conclusion is discussed in Section VI.
II PRELIMINARIES
II-A Notations
Let , , , be the sets of real numbers, real vectors of dimension , non-negative real vectors of dimension , and real matrices of dimension , respectively. The identity matrix is denoted by . The all-one and all-zero column vectors are denoted by and , respectively. For a matrix , is its transpose and denotes a block diagonal matrix with diagonal blocks of , , . For a vector , is its -norm, is its -norm, and is a column vector by stacking vectors . represents the Kronecker product of matrices and . Let be the Euclidean projection of a vector onto the set , i.e., . For simplicity, let denote . Define the set-valued sign function as follows:
II-B Graph Theory
For a system with agents, its communication network is modeled by an undirected graph , where is a node set and is an edge set. If information exchange can occur between and , then with denoting its weight. is the adjacency matrix. If there exists a path from any node to any other node in , then is called connected.
III PROBLEM FORMULATION
Consider a multi-agent system consisting of heterogeneous agents indexed by , where the th agent has the following linear dynamics:
(1) |
where , and are the state, input and output variables, respectively. , and are the state, input and output matrices, respectively.
Each agent has an output set such that the output variable . and are the private cost and constraint functions for agent . Denote , , , and . The objective of this paper is to design a controller for each agent by using only local interaction and information such that all agents cooperatively minimize the sum of the cost functions over a period of time with time-varying coupled inequality constraints:
(2) |
Let denote the optimal solution for problem (2) when the time-varying cost and constraint functions are known in advance.
In order to evaluate the cost performance of such output trajectories, we define two performance indicators: network regret and network fit. According to the previous definition, is the optimal output when the agents know all the information of the network over the period . In reality, however, agents can only make decisions based on their own and their neighbors' current and past information. Regret is defined as the gap between the cumulative cost incurred by and the cost incurred by the optimal output , i.e.,
(3) |
In order to evaluate the fitness of output trajectories to the constraints (or in other words, the degree of violation of the constraints), we define fit as the projection of the cumulative constraints onto the nonnegative orthant:
(4) |
This definition implicitly allows strictly feasible decisions to compensate for violations of constraints at certain times. This is reasonable when the variables can be stored or preserved, such as the average power constraints [19]. By , one can define as the th component of , i.e., . Further, define as the th component of the constraint integral. It can be easily deduced that
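To make these two indicators concrete, the following Python sketch evaluates discretized versions of regret (3) and fit (4) from uniformly sampled trajectories; the function name, the rectangle-rule discretization, and the generic cost/constraint callables are illustrative assumptions rather than part of the paper's setup.

```python
import numpy as np

def regret_and_fit(ts, outputs, opt_outputs, cost, constraint):
    """Discretized versions of regret (3) and fit (4) from uniformly sampled trajectories.

    ts          : (T,) uniformly spaced sample times
    outputs     : (T, N, p) outputs produced by the online algorithm
    opt_outputs : (T, N, p) clairvoyant optimal outputs
    cost        : cost(t, y, i) -> scalar local cost of agent i at time t
    constraint  : constraint(t, y, i) -> (m,) local constraint value of agent i at time t
    """
    dt = ts[1] - ts[0]
    T, N, _ = outputs.shape
    m = constraint(ts[0], outputs[0, 0], 0).size
    cost_gap = 0.0                 # running integral of the network-wide cost gap
    g_int = np.zeros(m)            # running integral of the summed constraints
    for k, t in enumerate(ts):
        for i in range(N):
            cost_gap += dt * (cost(t, outputs[k, i], i) - cost(t, opt_outputs[k, i], i))
            g_int += dt * constraint(t, outputs[k, i], i)
    regret = cost_gap                                  # Eq. (3): cumulative cost gap
    fit = np.linalg.norm(np.maximum(g_int, 0.0))       # Eq. (4): projection onto the nonnegative orthant
    return regret, fit
```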
Assumption 1
The communication network is undirected and connected.
Assumption 2
Each set is convex and compact. For , functions and are convex, integrable and bounded on , i.e., there exist constants and such that and .
Assumption 3
The set of feasible outputs is non-empty.
Definition 1 ([20])
Let be a closed convex set. Then, for any and , the projection of over set at the point can be defined as
Lemma 1 ([19])
Let be a convex set and let , then
(5) |
Assumption 4
is controllable, and
Remark 1
Assumption 2 is reasonable since output variables in practice, such as voltages, often lie in a certain range. The cost and constraint functions are not required to be differentiable, since subgradients can be used in place of gradients. The controllability condition in Assumption 4 is standard when dealing with linear systems.
IV MAIN RESULTS
IV-A Continuous Communication
For agent , to solve the online optimization problem (2), we can construct the time-varying Lagrangian
(7) |
where is the local Lagrange multiplier for agent , is the preset parameter and is a metric of ’s disagreement [22].
Notice that , are convex and , hence the Lagrangian is convex with respect to . Let us denote by a subgradient of with respect to , i.e.,
(8) |
The Lagrangian is concave with respect to and its subgradient is given by
(9) |
For simplicity, define
(10) |
where , , , , and . It can be easily verified that .
A controller following the modified Arrow-Hurwicz algorithm is proposed for the th agent as
(11a) | ||||
(11b) |
where is the step size, are feedback matrices that are the solutions of (6), and the initial value .
Substituting the controller (11) into the system (1), the system dynamics of the th agent is
(12a) | ||||
(12b) | ||||
(12c) |
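To illustrate the structure of the saddle-point update, the following Python sketch performs one forward-Euler step of a projected primal-dual (Arrow-Hurwicz) flow. It abstracts away the linear dynamics (1) and the feedback matrices from (6) by treating the output as a directly controlled decision variable, so the function and its arguments are illustrative assumptions and not the exact controller (11).

```python
import numpy as np

def saddle_point_step(y, mu, t, dt, alpha, subgrad_f, subgrad_g, g, neighbors_mu, proj_Y):
    """One forward-Euler step of a projected saddle-point (Arrow-Hurwicz) flow.

    y            : (p,) local decision / output of agent i
    mu           : (m,) local Lagrange multiplier of agent i
    alpha        : step size (gain)
    subgrad_f    : subgrad_f(t, y) -> (p,) subgradient of the local cost
    subgrad_g    : subgrad_g(t, y) -> (m, p) subgradient rows of the local constraint
    g            : g(t, y) -> (m,) local constraint value
    neighbors_mu : list of multipliers received from neighbors
    proj_Y       : projection onto the local output set
    """
    # primal descent on the local Lagrangian
    dy = -(subgrad_f(t, y) + subgrad_g(t, y).T @ mu)
    # dual ascent on the local Lagrangian, with a consensus-type disagreement term
    disagreement = sum(mu - mu_j for mu_j in neighbors_mu)
    dmu = g(t, y) - disagreement
    y_new = proj_Y(y + dt * alpha * dy)
    mu_new = np.maximum(mu + dt * alpha * dmu, 0.0)   # keep multipliers in the nonnegative orthant
    return y_new, mu_new
```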
For the subsequent analysis, consider the following energy function with any and :
(13) |
The following lemma establishes the relationship between the above energy function and the time-varying Lagrangian (7) along the dynamics (12).
Lemma 3
Proof:
We now state the main results on the regret and fit bounds of the controller (11) under continuous communication.
Theorem 1
Proof:
Consider the second term of the right-hand side of (19). Let for simplicity. Then, by introducing an intermediate variable and the relationship , one has that
(20) |
Further, one can obtain that
(21) |
Since is connected, there always exists a path connecting nodes and for any , i.e.,
(22) |
Then for , one has that , i.e.,
(23) |
∎
Theorem 2
Proof:
By Lemma 3 with and , where is a parameter to be determined later, one has that
(25) |
Invoking Assumption 2 yields
(26) |
By choosing
it can be concluded that
Rearranging terms further yields
(27) |
∎
Remark 2
Theorems 1 and 2 imply that and under continuous communication. In comparison, explicit sublinear bounds on both the regret and the fit are obtained in [12, 18] for single-integrator multi-agent systems, i.e., , in [12] and , in [18] for . Theorem 1 thus achieves a tighter regret bound than [12, 18] under more complex system dynamics.
IV-B Event-triggered Communication
The above continuous-time control law, which requires each agent to know the real-time Lagrange multipliers of its neighbors, may cause excessive communication overhead. In this section, an event-triggered protocol is proposed to avoid continuous communication.
For agent , suppose that is its th communication instant and is its communication instant sequence. Define as the available information from its neighbors and as the measurement error. Note that at any instant .
An event-triggered control law is proposed as
(28a) | ||||
(28b) |
where is the step size, are feedback matrices that are solutions of (6), is the weight corresponding to the edge , and the initial value . Note that is chosen for the sign function in (28b) when its argument is zero.
Substituting controller (28) into (1), the system dynamics of the th agent becomes
(29a) | ||||
(29b) | ||||
(29c) |
The communication instant is chosen as
(30) |
where and are prespecified positive real numbers.
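As an illustration only, the trigger test can be sketched in Python as follows, assuming an exponentially decaying threshold of the form sigma1·exp(−sigma2·t); the exact functional form of condition (30) and its constants are not reproduced here, so this is a hedged sketch rather than the paper's rule.

```python
import numpy as np

def should_trigger(mu, mu_last_broadcast, t, sigma1, sigma2):
    """Assumed trigger test: broadcast when the measurement error exceeds
    a decaying threshold sigma1 * exp(-sigma2 * t) (illustrative stand-in for (30))."""
    error = np.linalg.norm(mu - mu_last_broadcast)
    return error >= sigma1 * np.exp(-sigma2 * t)

# usage inside a simulation loop (sketch):
# if should_trigger(mu[i], mu_hat[i], t, sigma1, sigma2):
#     mu_hat[i] = mu[i].copy()        # neighbors refresh their stored copy of agent i's multiplier
#     trigger_times[i].append(t)      # record the communication instant
```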
The following lemma is a modification of Lemma 3 under event-triggered communication.
Lemma 4
Proof:
Similar to (15), one can obtain that
(32) |
Since the Lagrangian (7) is convex with respect to and concave with respect to , one has that
(33) |
Since and graph is undirected, it follows that
(34) |
where the last inequality holds due to the relationship .
Similarly, one has that
(35) |
(36) |
where the last inequality holds due to the trigger condition (30).
Summarizing the above analysis, in (32) is calculated as
Integrating both sides from to and omitting negative terms, it can be obtained that
(37) |
∎
We now state the main results on the regret and fit bounds of the controller (28) under event-triggered communication.
Theorem 3
Theorem 4
The proofs of Theorems 3 and 4 are similar to those of Theorems 1 and 2, except that Lemma 4 is used instead of Lemma 3, and are thus omitted here.
Remark 3
Theorems 3 and 4 show that and still hold even under event-triggered communication. The regret and fit bounds are affected by the communication frequency. Generally speaking, decreasing and increasing yield smaller bounds on regret and fit, but also increase the communication frequency, which results in a tradeoff.
Theorem 5
Proof:
In the trigger interval for agent , combining the definition of with (28b), one can write the upper right-hand Dini derivative as
(40) |
It is obvious that . Then, for , the solution of (40) is
(41) |
From Assumption 2 and the inequality (cf. Remark 2.1 in [20]), the norm of the integral term in (41) is bounded; let its upper bound be . It then follows from (41) that
(42) |
Hence, condition (30) cannot be triggered before the following condition holds:
(43) |
It is easy to see that the right-hand side of (43) is positive for any finite time , which further implies that . Hence, the value is strictly positive for finite , so no Zeno behavior is exhibited. ∎
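The structure of this argument can be summarized by the following schematic LaTeX derivation, where the envelope constants M and λ and the decaying threshold are placeholder symbols standing in for the bounds appearing in (41)-(43), not the paper's exact expressions.

```latex
% Schematic form of the no-Zeno argument (placeholder symbols, not the paper's constants).
% Between two events the measurement error starts from zero and grows at most
% exponentially, cf. (41)-(42):
\| e_i(t) \| \le M \left( e^{\lambda (t - t_k^i)} - 1 \right), \qquad t \in [t_k^i, t_{k+1}^i).
% The next event cannot occur before this envelope reaches the (assumed) decaying
% threshold, cf. (43):
M \left( e^{\lambda (t_{k+1}^i - t_k^i)} - 1 \right) \ge \sigma_{i1} e^{-\sigma_{i2} t_{k+1}^i}
\quad \Longrightarrow \quad
t_{k+1}^i - t_k^i \ge \frac{1}{\lambda} \ln\!\left( 1 + \frac{\sigma_{i1}}{M}\, e^{-\sigma_{i2} t_{k+1}^i} \right) > 0 .
% The lower bound is strictly positive for every finite time, so the inter-event
% times cannot accumulate, i.e., no Zeno behavior occurs.
```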
V SIMULATION
Consider a heterogeneous multi-agent system composed of agents described by (1), where , , , , , , , , , , .
The local objective functions are time-varying quadratic functions as follows:
The feasible set of output variables . The constraints are defined by a time-varying function
The above constraint selection ensures that must be a strictly feasible solution for all .
The clairvoyant optimal output can be computed by solving the problem
(44) |
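One way to obtain such a clairvoyant benchmark numerically is to solve the instantaneous constrained problem at each sampled time with an off-the-shelf solver, as in the following hedged Python sketch; the callables, box bounds, warm-starting strategy, and SLSQP choice are assumptions for illustration, since the exact data of (44) are not reproduced here.

```python
import numpy as np
from scipy.optimize import minimize

def clairvoyant_outputs(ts, N, p, cost, constraint, lb, ub):
    """At each sampled instant, solve min sum_i f_{i,t}(y_i) s.t. sum_i g_{i,t}(y_i) <= 0
    and y_i in [lb, ub] (an assumed numerical stand-in for the benchmark (44))."""
    y_star = np.zeros((len(ts), N, p))
    y0 = np.zeros(N * p)
    for k, t in enumerate(ts):
        obj = lambda z: sum(cost(t, z[i*p:(i+1)*p], i) for i in range(N))
        cons = {"type": "ineq",
                "fun": lambda z: -sum(constraint(t, z[i*p:(i+1)*p], i) for i in range(N))}
        bounds = [(lb, ub)] * (N * p)
        res = minimize(obj, y0, bounds=bounds, constraints=[cons], method="SLSQP")
        y_star[k] = res.x.reshape(N, p)
        y0 = res.x            # warm-start the next instant with the current solution
    return y_star
```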
The communication network among these agents is depicted in Fig. 1. It can be verified that Assumptions 1–4 hold.
For the numerical example, the selection of feedback matrices is based on (6), where , , , , , . The initial values are randomly selected in and .
Fig. 2 illustrates that the continuous-time control law achieves a constant regret bound and a sublinear fit bound, in accordance with Theorems 1-2. Similar results can be observed in Fig. 3 under event-triggered communication. Fig. 4 shows the communication instants of the five agents under the event-triggered control law, from which one can observe that the communication among the agents is discrete and exhibits no Zeno behavior.
VI CONCLUSION
In this paper, we studied distributed online convex optimization for heterogeneous linear multi-agent systems with time-varying cost functions and time-varying coupling inequality constraints. A distributed controller based on the saddle-point method was proposed and shown to achieve a constant regret bound and a sublinear fit bound. To avoid continuous communication and reduce the communication cost, an event-triggered communication scheme with no Zeno behavior was developed, which also achieves a constant regret bound and a sublinear fit bound.
References
- [1] M. A. Dahleh and I. J. Diaz-Bobillo, Control of Uncertain Systems: A Linear Programming Approach. Prentice-Hall, Englewood Cliffs, NJ, 1995.
- [2] Z. Luo, “Applications of convex optimization in signal processing and digital communication,” Math. Program., vol. 97, no. 1-2, pp. 177–207, 2003.
- [3] J. Qiu, Q. Wu, G. Ding, Y. Xu, and S. Feng, “A survey of machine learning for big data processing,” EURASIP J. Adv. Signal Process., vol. 2016, p. 67, 2016.
- [4] S. Shalev-Shwartz, “Online learning and online convex optimization,” Foundations and Trends in Machine Learning, vol. 4, no. 2, pp. 107–194, 2011.
- [5] X. Li, L. Xie, and N. Li, “A survey of decentralized online learning,” arXiv preprint arXiv:2205.00473, 2022.
- [6] M. Zinkevich, “Online convex programming and generalized infinitesimal gradient ascent,” in Proceedings, Twentieth International Conference on Machine Learning, vol. 2, pp. 928–935, 2003.
- [7] X. Zhou, E. Dallanese, L. Chen, and A. Simonetto, “An incentive-based online optimization framework for distribution grids,” IEEE Transactions on Automatic Control, vol. 63, no. 7, pp. 2019–2031, 2018.
- [8] S. Lee, A. Ribeiro, and M. Zavlanos, “Distributed continuous-time online optimization using saddle-point methods,” in IEEE 55th Conference on Decision and Control, pp. 4314–4319, 2016.
- [9] S. Paternain, S. Lee, M. Zavlanos, and A. Ribeiro, “Distributed constrained online learning,” IEEE Transactions on Signal Processing, vol. 68, pp. 3486–3499, 2020.
- [10] D. Yuan, D. Ho, and G.-P. Jiang, “An adaptive primal-dual subgradient algorithm for online distributed constrained optimization,” IEEE Transactions on Cybernetics, vol. 48, no. 11, pp. 3045–3055, 2018.
- [11] X. Li, X. Yi, and L. Xie, “Distributed online optimization for multi-agent networks with coupled inequality constraints,” IEEE Transactions on Automatic Control, vol. 66, no. 8, pp. 3575–3591, 2021.
- [12] X. Yi, X. Li, L. Xie, and K. Johansson, “Distributed online convex optimization with time-varying coupled inequality constraints,” IEEE Transactions on Signal Processing, vol. 68, pp. 731–746, 2020.
- [13] X. Yi, X. Li, T. Yang, L. Xie, T. Chai, and K. H. Johansson, “Distributed bandit online convex optimization with time-varying coupled inequality constraints,” IEEE Transactions on Automatic Control, vol. 66, no. 10, pp. 4620–4635, 2021.
- [14] X. Li, X. Yi, and L. Xie, “Distributed online convex optimization with an aggregative variable,” IEEE Transactions on Control of Network Systems, pp. 1–8, 2021.
- [15] Z. Deng, Y. Zhang, and Y. Hong, “Distributed online optimization of high-order multi-agent systems,” in 2016 35th Chinese Control Conference (CCC), pp. 7672–7677, 2016.
- [16] M. Nonhoff and M. Müller, “Online gradient descent for linear dynamical systems,” in IFAC-PapersOnLine, vol. 53, pp. 945–952, 2020.
- [17] M. Nonhoff and M. A. Müller, “Data-driven online convex optimization for control of dynamical systems,” arXiv preprint arXiv:2103.09127, 2021.
- [18] J. Li, C. Gu, Z. Wu, and T. Huang, “Online learning algorithm for distributed convex optimization with time-varying coupled constraints and bandit feedback,” IEEE Transactions on Cybernetics, vol. 52, no. 2, pp. 1009–1020, 2022.
- [19] S. Paternain and A. Ribeiro, “Online learning of feasible strategies in unknown environments,” IEEE Transactions on Automatic Control, vol. 62, no. 6, pp. 2807–2822, 2017.
- [20] D. Zhang and A. Nagurney, “On the stability of projected dynamical systems,” Journal of Optimization Theory and Applications, vol. 85, no. 1, pp. 97–124, 1995.
- [21] L. Li, Y. Yu, X. Li, and L. Xie, “Exponential convergence of distributed optimization for heterogeneous linear multi-agent systems over unbalanced digraphs,” Automatica, vol. 141, p. 110259, 2022.
- [22] S. Liang, X. Zeng, and Y. Hong, “Distributed nonsmooth optimization with coupled inequality constraints via modified Lagrangian function,” IEEE Transactions on Automatic Control, vol. 63, no. 6, pp. 1753–1759, 2018.