Distributed Control of Linear Quadratic Mean Field Social Systems with Heterogeneous Agents

Yong Liang, Bing-Chang Wang, Senior Member, IEEE, and Huanshui Zhang, Senior Member, IEEE

This work was supported by the National Natural Science Foundation of China under Grants 61922051, 62122043, 62192753, Major Basic Research of Natural Science Foundation of Shandong Province (ZR2021ZD14, ZR2020ZD24), Science and Technology Project of Qingdao West Coast New Area (2019-32, 2020-20, 2020-1-4), High-level Talent Team Project of Qingdao West Coast New Area (RCTD-JC-2019-05), and Key Research and Development Program of Shandong Province (2020CXGC01208). Y. Liang is with the School of Control Science and Engineering, Shandong University, Jinan, China (e-mail: yongliang@mail.sdu.edu.cn). B.-C. Wang is with the School of Control Science and Engineering, Shandong University, Jinan, China (e-mail: bcwang@sdu.edu.cn). H. Zhang is with the College of Electrical Engineering and Automation, Shandong University of Science and Technology, Qingdao, China (e-mail: hszhang@sdu.edu.cn).

Abstract

In this paper, we study social optimality for mean field linear-quadratic control systems by the direct approach, where subsystems are coupled through individual dynamics and costs according to a network topology. A graph is introduced to represent the network topology of the large-population system, where nodes represent subpopulations called clusters and edges represent communication relationships. Following the direct approach, we first seek the optimal controller under a centralized information structure, which is characterized by a set of forward-backward stochastic differential equations. The feedback controller is then obtained with the help of Riccati equations. Finally, we design a distributed controller with mean field approximations, which is asymptotically socially optimal.

Index Terms

mean field games, multi-agent systems, linear quadratic optimal control, asymptotically social optimality

1 Introduction

In recent years, mean field games and control have become an active research topic with wide applications in system control, applied mathematics, and economics [1, 2, 3], such as dynamic production adjustment [4], vaccination games [5], and resource allocation in the internet of things [6]. Mean field game theory originated from the parallel works of Lasry and Lions [7] and of Huang et al. [8]. A key feature of a mean field system is its mean field coupling terms, which often appear in multi-agent systems, such as the team decision problem with partially exchangeable agents [9].

Under a centralized information structure, the computational complexity grows exponentially with the number of agents. Owing to communication constraints, it is also unrealistic for each agent to obtain global information. It is therefore meaningful to design decentralized controllers that use only local information. The difficulty lies in using local information to approximate the mean field coupling term, which depends on global information. At present, there are two main approaches: the fixed point approach and the direct approach. The first goes from the infinite population to the finite population. The fixed point equation is first obtained in the infinite population system, and its solution is then substituted into the finite population system to analyze the $\varepsilon$-Nash equilibrium property. See [7, 10, 11, 12, 16] for details. The second goes from the finite population to the infinite population. One first solves for the Nash equilibrium directly under the finite population and then obtains its limit as the population size tends to infinity. Finally, the limit is proved to be an $\varepsilon$-Nash equilibrium of the original system. See [8, 14, 15, 16, 17] for details. As the complexity of the system increases, the fixed point equation becomes more complicated and difficult to solve, as in mean field games with major-minor agents [16]. The fixed point approach will even fail when the system contains common noise [17].

In this paper, we study the mean field social control problem by the direct approach. The agents in the large-population system are heterogeneous, which makes the centralized controller difficult to obtain. We consider the case where the large-population system contains $K$ types of agents. Agents of the same type are grouped into a cluster, which is represented as a node of the graph $\mathcal{G}_{K}$. The agents in the same cluster generate a local mean field term, so there are $K$ local mean field terms in total. Agents are coupled with each other in dynamics and costs through the mean field terms according to the adjacency matrix. Compared with mean field social control involving homogeneous agents, the heterogeneous case is difficult to analyze, and the centralized controller is difficult to obtain using traditional methods alone. The key to solving this problem is to construct the local mean field terms. The distributed controller is designed by optimally estimating the $K$ local mean field terms. The main contributions of the paper are as follows.

  • By variational analysis, a necessary and sufficient condition for the existence of centralized controllers is obtained, which is characterized by the adapted solution of a set of forward-backward stochastic differential equations.

  • The centralized optimal feedback controller is obtained by introducing Riccati equations.

  • A distributed controller is proposed by mean field approximations and is shown to be asymptotically socially optimal.

The rest of the paper is organized as follows. Section 2 formulates the mean field social control problem. Section 3 gives the centralized results. Section 4 designs distributed controllers based on the centralized results and shows their asymptotic social optimality. Section 5 concludes the paper.

Notations: For a matrix $M$, $M^{T}$ denotes the transpose of $M$. For a set of vectors $\{x_{1},x_{2},...,x_{N}\}$, $\mathrm{vec}(x_{1},x_{2},...,x_{N})$ denotes the vector $(x_{1}^{T},x_{2}^{T},...,x_{N}^{T})^{T}$. For a set of matrices $\{A_{1},A_{2},...,A_{N}\}$, $\mathrm{rows}(A_{1},A_{2},...,A_{N})$, $\mathrm{cols}(A_{1},A_{2},...,A_{N})$, and $\mathrm{diag}(A_{1},A_{2},...,A_{N})$ denote the matrices $(A_{1}^{T},A_{2}^{T},...,A_{N}^{T})^{T}$, $(A_{1},A_{2},...,A_{N})$, and the block diagonal matrix with $A_{1},A_{2},...,A_{N}$ on the main diagonal, respectively. Let $I_{n}$ be the $n$-dimensional identity matrix. Let $1_{n\times m}$ and $0_{n\times m}$ denote the $n\times m$ matrices with all elements equal to $1$ and $0$, respectively. We use $C_{0},C_{1},C_{2}$, etc. to denote generic constants, which may vary from place to place. For two matrices $A=[a_{ij}]$ and $B=[b_{ij}]$, let $\otimes$ and $\odot$ denote the Kronecker product and the element-wise (dot) product, respectively, i.e.,

\[
A\otimes B=\begin{pmatrix}a_{11}B&\cdots&a_{1n}B\\ \vdots&\ddots&\vdots\\ a_{m1}B&\cdots&a_{mn}B\end{pmatrix},\quad A\odot B=\begin{pmatrix}a_{11}b_{11}&\cdots&a_{1n}b_{1n}\\ \vdots&\ddots&\vdots\\ a_{m1}b_{m1}&\cdots&a_{mn}b_{mn}\end{pmatrix}.
\]
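As a quick illustration of this notation, the following sketch reproduces the stacking operators and the two products with NumPy. The helper names `vec`, `rows`, `cols`, and `block_diag`, as well as the matrices `A` and `B`, are illustrative choices and do not come from the paper.

```python
import numpy as np

def vec(*xs):
    """Stack vectors x_1, ..., x_N into one long vector."""
    return np.concatenate([np.ravel(x) for x in xs])

def rows(*As):
    """Stack matrices vertically: (A_1^T, ..., A_N^T)^T."""
    return np.vstack(As)

def cols(*As):
    """Stack matrices horizontally: (A_1, ..., A_N)."""
    return np.hstack(As)

def block_diag(*As):
    """Block diagonal matrix with A_1, ..., A_N on the main diagonal."""
    As = [np.atleast_2d(A) for A in As]
    out = np.zeros((sum(A.shape[0] for A in As), sum(A.shape[1] for A in As)))
    r = c = 0
    for A in As:
        out[r:r + A.shape[0], c:c + A.shape[1]] = A
        r += A.shape[0]
        c += A.shape[1]
    return out

# Hypothetical 2x2 matrices used only for illustration.
A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[0.0, 1.0], [1.0, 0.0]])

kron_AB = np.kron(A, B)   # Kronecker product: block (i, j) equals a_ij * B
hadamard_AB = A * B       # element-wise product: entry (i, j) equals a_ij * b_ij
```

Note that the two products have different shapes in general: the Kronecker product of an $m\times n$ and a $p\times r$ matrix is $mp\times nr$, whereas the element-wise product requires equal shapes.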

2 Problem Formulation

2.1 Network topology

Consider a large-population system with $K$ clusters $\mathcal{C}_{q}$, $q\in\mathcal{V}_{K}\triangleq\{1,2,...,K\}$, where $K$ is a positive integer and the cluster $\mathcal{C}_{q}$ contains $N_{q}$ homogeneous agents with the same dynamics, cost functions, and communication capabilities. In this scenario, the large-population system contains a total of $N\triangleq\sum_{q\in\mathcal{V}_{K}}N_{q}$ heterogeneous agents $\mathcal{A}_{i}$, $i\in\mathcal{I}\triangleq\{1,2,...,N\}$, and the agents are partitioned into the $K$ disjoint clusters $\mathcal{C}_{q}$, $q\in\mathcal{V}_{K}$. The set of agents belonging to the cluster $\mathcal{C}_{q}$ is denoted by $\mathcal{I}_{q}\subseteq\mathcal{I}$ with $|\mathcal{I}_{q}|=N_{q}$, $\cup_{q\in\mathcal{V}_{K}}\mathcal{I}_{q}=\mathcal{I}$, and $\mathcal{I}_{p}\cap\mathcal{I}_{q}=\emptyset$ for $p\neq q$. In this paper, we also list the agents as $\mathcal{A}_{i}$, $i\in\mathcal{I}_{q}$, $q\in\mathcal{V}_{K}$.

The agents in different clusters communicate with each other through a network whose topology is modeled as a directed graph $\mathcal{G}_{K}=\{\mathcal{V}_{K},\mathcal{E}_{K},\mathcal{M}_{K}\}$, where the clusters and the communication channels between clusters are represented by the node set $\mathcal{V}_{K}$ and the edge set $\mathcal{E}_{K}\subseteq\mathcal{V}_{K}\times\mathcal{V}_{K}$, respectively. An edge denoted by the pair $(j,i)$ represents a communication channel from cluster $\mathcal{C}_{j}$ to cluster $\mathcal{C}_{i}$. The neighbor set of cluster $\mathcal{C}_{q}$ is denoted by $\mathcal{N}_{q}\triangleq\{p\ |\ p\in\mathcal{V}_{K},\ (p,q)\in\mathcal{E}_{K}\}$. Let $E_{K}=[e_{ij}]\in\mathbb{R}^{K\times K}$ be the communication matrix, where $e_{ij}=1$ if $(j,i)\in\mathcal{E}_{K}$ and $e_{ij}=0$ if $(j,i)\notin\mathcal{E}_{K}$. The weighted adjacency matrix of $\mathcal{G}_{K}$ is denoted by $\mathcal{M}_{K}=[m_{ij}]\in\mathbb{R}^{K\times K}$. The agents are coupled with each other according to the adjacency matrix.
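The edge convention above is easy to get backwards, so a small sketch may help. It builds the communication matrix and the neighbor sets from an edge list; the three-cluster graph and the names `edges`, `E`, and `neighbors` are hypothetical and serve only as an example.

```python
import numpy as np

K = 3
# Hypothetical directed edges (j, i): a channel from cluster j to cluster i,
# using the paper's 1-based cluster labels.
edges = [(1, 2), (2, 3), (3, 1), (1, 3)]

# Communication matrix: e_ij = 1 iff (j, i) is an edge.
E = np.zeros((K, K), dtype=int)
for j, i in edges:
    E[i - 1, j - 1] = 1

def neighbors(q):
    """Neighbor set N_q = {p : (p, q) is an edge}, i.e. the nonzero columns of row q."""
    return {p for p in range(1, K + 1) if E[q - 1, p - 1] == 1}
```

With this convention, row $q$ of $E_{K}$ marks the clusters whose information reaches cluster $\mathcal{C}_{q}$.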

2.2 Dynamics and costs

Let $x_{i}\in\mathbb{R}^{n}$ and $u_{i}\in\mathbb{R}^{m}$ be the state and control of agent $\mathcal{A}_{i}$, $i\in\mathcal{I}_{q}$, $q\in\mathcal{V}_{K}$. The local cluster mean field term of cluster $\mathcal{C}_{q}$ is defined as the empirical mean of the states of all agents in the cluster, i.e.,

\[
x^{K}_{q}(t)\triangleq\frac{1}{N_{q}}\sum_{i\in\mathcal{I}_{q}}x_{i}(t),\quad q\in\mathcal{V}_{K}. \tag{1}
\]

Therefore the global cluster mean field term of the large-population system is stacked as $x^{K}\triangleq\mathrm{vec}(x^{K}_{1},x^{K}_{2},...,x^{K}_{K})$. Through the adjacency matrix $\mathcal{M}_{K}$, the influence of the global cluster mean field term $x^{K}$ on the agents in cluster $\mathcal{C}_{q}$ is denoted by $z^{K}_{q}$, where

\[
z^{K}_{q}(t)\triangleq\frac{1}{K}\sum_{p\in\mathcal{V}_{K}}m_{qp}x^{K}_{p}(t),\quad q\in\mathcal{V}_{K}. \tag{2}
\]
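Definitions (1) and (2) amount to a per-cluster average followed by a weighted average over clusters. The sketch below evaluates both at a single time instant; the cluster sizes, the states, and the adjacency weights `M` are made-up values chosen so the result can be checked by hand.

```python
import numpy as np

n = 2                      # state dimension
# Hypothetical states grouped by cluster: states[q] has shape (N_q, n).
states = [
    np.array([[1.0, 0.0], [2.0, 0.0], [3.0, 0.0]]),   # cluster 1, N_1 = 3
    np.array([[0.0, 4.0], [0.0, 6.0]]),               # cluster 2, N_2 = 2
]
K = len(states)
M = np.array([[0.0, 1.0],
              [0.5, 0.5]])  # hypothetical weighted adjacency matrix [m_qp]

# (1): row q of x_local is the local cluster mean field x_q^K.
x_local = np.stack([s.mean(axis=0) for s in states])

# (2): row q of z is z_q^K = (1/K) * sum_p m_qp * x_p^K.
z = (M @ x_local) / K
```

Here `x_local` equals `[[2, 0], [0, 5]]` and `z` equals `[[0, 2.5], [0.5, 1.25]]`.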

Coupled through the term $z^{K}_{q}$, the dynamics of agent $\mathcal{A}_{i}$, $i\in\mathcal{I}_{q}$, $q\in\mathcal{V}_{K}$, are given by the following stochastic differential equation

\[
dx_{i}(t)=\big[A_{q}x_{i}(t)+B_{q}u_{i}(t)+G_{q}z_{q}^{K}(t)\big]dt+\Sigma_{q}dw_{i}(t),\quad x_{i}(0)=x_{i0}, \tag{3}
\]

where $A_{q},B_{q},G_{q},\Sigma_{q}$ are constant matrices with compatible dimensions and $\{w_{i},i\in\mathcal{I}_{q}\}$ are $1$-dimensional independent standard Brownian motions defined on a complete filtered probability space $(\Omega,\mathcal{F},\mathbb{P})$.
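To make the coupling in (3) concrete, here is a minimal simulation sketch that assumes an Euler-Maruyama discretization, a standard scheme that is not part of the paper. The function name `simulate`, the `control` callback, and the default step count are hypothetical.

```python
import numpy as np

def simulate(A, B, G, Sigma, M, x0, control, T=1.0, steps=100, seed=0):
    """Euler-Maruyama sketch of the dynamics (3).

    A, B, G, Sigma: lists of per-cluster matrices (n x n, n x m, n x n, n x 1).
    M: K x K weighted adjacency matrix.
    x0: list of arrays, x0[q] has shape (N_q, n).
    control(q, t, x_i, z_q) -> control vector in R^m (user supplied).
    Returns the list of terminal states, one array per cluster.
    """
    rng = np.random.default_rng(seed)
    K = len(x0)
    dt = T / steps
    x = [np.array(xq, dtype=float) for xq in x0]
    for k in range(steps):
        t = k * dt
        x_local = np.stack([xq.mean(axis=0) for xq in x])   # x_q^K, eq. (1)
        z = (M @ x_local) / K                               # z_q^K, eq. (2)
        new_x = []
        for q in range(K):
            N_q = x[q].shape[0]
            u = np.stack([control(q, t, xi, z[q]) for xi in x[q]])   # (N_q, m)
            drift = x[q] @ A[q].T + u @ B[q].T + z[q] @ G[q].T       # (N_q, n)
            dw = rng.normal(0.0, np.sqrt(dt), size=N_q)              # scalar noise per agent
            new_x.append(x[q] + drift * dt + np.outer(dw, Sigma[q].ravel()))
        x = new_x
    return x
```

All clusters are updated from the same snapshot of the mean field terms within a step, which matches the explicit Euler treatment of the coupling term.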

Let $\mathbf{x}\triangleq\mathrm{vec}(x_{1},x_{2},...,x_{N})$ and $\mathbf{u}\triangleq\mathrm{vec}(u_{1},u_{2},...,u_{N})$ be the state and control of the large-population system, respectively. Then the cost function of agent $\mathcal{A}_{i}$, $i\in\mathcal{I}_{q}$, $q\in\mathcal{V}_{K}$, is given by

\[
J_{i}(\mathbf{u})=\mathbb{E}\int_{0}^{T}\big[|x_{i}(t)-\Gamma_{q}z_{q}^{K}(t)|^{2}_{Q_{q}}+|u_{i}(t)|^{2}_{R_{q}}\big]dt+\mathbb{E}|x_{i}(T)-\Gamma_{q}z_{q}^{K}(T)|^{2}_{H_{q}},
\]

where $\Gamma_{q}$, $Q_{q}\geq 0$, $R_{q}>0$, $H_{q}\geq 0$ are constant matrices with compatible dimensions. The social cost function is defined as

\[
J_{\rm soc}^{\rm(N)}(\mathbf{u})=\sum_{q\in\mathcal{V}_{K}}\sum_{i\in\mathcal{I}_{q}}J_{i}(\mathbf{u}). \tag{4}
\]

Let $M_{q}$, $q\in\mathcal{V}_{K}$, be the $q$-th row of $\mathcal{M}_{K}$ and

\[
\bar{G}_{q}\triangleq\frac{M_{q}}{K}\otimes G_{q}.
\]

Then the individual dynamics (3) can be rewritten as

\[
dx_{i}(t)=\big[A_{q}x_{i}(t)+B_{q}u_{i}(t)+\bar{G}_{q}x^{K}(t)\big]dt+\Sigma_{q}dw_{i}(t),\quad x_{i}(0)=x_{i0}. \tag{5}
\]
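The step from (3) to (5) rests on the identity $\bar{G}_{q}x^{K}=G_{q}z^{K}_{q}$, which follows from the block structure of the Kronecker product. A numerical sanity check with random data is sketched below; the dimensions and the random seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, K = 2, 3
M = rng.random((K, K))          # hypothetical weighted adjacency matrix
G_q = rng.random((n, n))        # hypothetical coupling matrix of one cluster
x_local = rng.random((K, n))    # row p is x_p^K

q = 1                                        # zero-based index of the cluster under test
G_bar_q = np.kron(M[q:q + 1, :] / K, G_q)    # (M_q / K) kron G_q, shape (n, n*K)
x_K = x_local.reshape(-1)                    # vec(x_1^K, ..., x_K^K)
z_q = (M[q] @ x_local) / K                   # eq. (2)

lhs = G_bar_q @ x_K      # coupling term as written in (5)
rhs = G_q @ z_q          # coupling term as written in (3)
```

The same argument with $\Gamma_{q}$ in place of $G_{q}$ gives $\bar{\Gamma}_{q}x^{K}=\Gamma_{q}z^{K}_{q}$.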

Similarly, let

\[
\bar{\Gamma}_{q}\triangleq\frac{M_{q}}{K}\otimes\Gamma_{q}.
\]

Thus the social cost (4) can be rewritten as

\[
J_{\rm soc}^{\rm(N)}(\mathbf{u})=\sum_{q\in\mathcal{V}_{K}}\sum_{i\in\mathcal{I}_{q}}\Big\{\mathbb{E}\int_{0}^{T}\big[|x_{i}(t)-\bar{\Gamma}_{q}x^{K}(t)|^{2}_{Q_{q}}+|u_{i}(t)|^{2}_{R_{q}}\big]dt+\mathbb{E}|x_{i}(T)-\bar{\Gamma}_{q}x^{K}(T)|^{2}_{H_{q}}\Big\}. \tag{6}
\]

2.3 Main problems

In this paper, we apply the direct approach to sequentially study the mean field social control problem under centralized and distributed information patterns. So we first give the definitions of filtration used in this paper. Denote

\[
\begin{aligned}
\mathcal{F}_{t}^{i}&\triangleq\sigma(x_{i}(s),w_{i}(s),s\leq t),\\
\mathcal{F}_{t}&\triangleq\sigma(\cup_{i\in\mathcal{I}_{q},q\in\mathcal{V}_{K}}\mathcal{F}_{t}^{i}),\\
\mathbb{F}&\triangleq\{\mathcal{F}_{t}\}_{0\leq t\leq T},\\
\mathcal{H}_{t}^{q,K}&\triangleq\sigma(\cup_{j\in\mathcal{I}_{p},p\in\mathcal{N}_{q}}\mathcal{F}^{j}_{t}),\\
\mathcal{H}_{t}^{i}&\triangleq\sigma(\mathcal{H}_{t}^{q,K},\mathcal{F}_{t}^{i}),\\
\mathbb{H}_{i}&\triangleq\{\mathcal{H}^{i}_{t}\}_{0\leq t\leq T}.
\end{aligned}
\]

Then the centralized and distributed admissible control sets of agent $\mathcal{A}_{i}$, $i\in\mathcal{I}_{q}$, $q\in\mathcal{V}_{K}$, are given by $\mathcal{U}_{c}[0,T]$ and $\mathcal{U}_{d}^{i}[0,T]$, respectively, where

\[
\begin{aligned}
\mathcal{U}_{c}[0,T]&=\big\{u(\cdot)\in L^{2}_{\mathbb{F}}(0,T;\mathbb{R}^{m})\big\},\\
\mathcal{U}_{d}^{i}[0,T]&=\big\{u(\cdot)\in L^{2}_{\mathbb{H}_{i}}(0,T;\mathbb{R}^{m})\big\}.
\end{aligned}
\]

As stated in the Introduction, to our knowledge, there is currently no general method for designing optimal distributed controllers. Thus we consider asymptotically optimal distributed controllers based on the mean field approximation methodology. To characterize the asymptotic optimality of distributed controllers, the following definition is introduced:

Definition 1: For the dynamics (3) with the social cost function (4), a set of controls $\check{u}_{i}\in\mathcal{U}_{d}^{i}[0,T]$, $i\in\mathcal{I}_{q}$, $q\in\mathcal{V}_{K}$, is called an asymptotically optimal distributed controller with respect to the number of agents in the large-population system if

\[
\Big|\frac{1}{N}J_{\rm soc}^{\rm(N)}(\check{\mathbf{u}})-\frac{1}{N}\inf_{u_{i}\in\mathcal{U}_{c}[0,T]}J_{\rm soc}^{\rm(N)}(\mathbf{u})\Big|=o(1). \tag{7}
\]

We propose the following two problems:

Problem 1 (P1)

For each agent $\mathcal{A}_{i}$, $i\in\mathcal{I}_{q}$, $q\in\mathcal{V}_{K}$, find an $\mathcal{F}_{t}$-adapted optimal centralized controller $\hat{u}_{i}\in\mathcal{U}_{c}[0,T]$ that minimizes the social cost function (4).

Problem 2 (P2)

For each agent $\mathcal{A}_{i}$, $i\in\mathcal{I}_{q}$, $q\in\mathcal{V}_{K}$, find an $\mathcal{H}_{t}^{i}$-adapted distributed controller $\check{u}_{i}\in\mathcal{U}_{d}^{i}[0,T]$ that is asymptotically optimal for the social cost function (4) in the sense of Definition 1.

Therefore, in this paper we first solve Problem (P1) to obtain optimal centralized controllers, and then further study Problem (P2) to design distributed asymptotically optimal controllers based on mean field approximations. The following assumption on the distribution of initial states is imposed.

(A1) The initial states $x_{i0}$, $i\in\mathcal{I}_{q}$, $q\in\mathcal{V}_{K}$, are mutually independent with $\mathbb{E}[x_{i}(0)]=\bar{m}_{q}$, $i\in\mathcal{I}_{q}$, $q\in\mathcal{V}_{K}$, and there exists a finite constant $C_{0}$ independent of $N$ such that $\max_{1\leq i\leq N}\mathbb{E}|x_{i}(0)|^{2}\leq C_{0}$. Let $\bar{m}^{K}\triangleq\mathrm{vec}(\bar{m}_{1},\bar{m}_{2},...,\bar{m}_{K})$.

3 Optimal Centralized Controllers

In this section we first solve (P1) by variational analysis to obtain the open-loop centralized optimal controller which is characterized by a set of FBSDEs, then obtain its feedback representation by virtue of two Riccati equations.

3.1 Open-loop controllers and MF-FBSDEs

We introduce some notation for convenience in deriving the main result. Denote $\pi_{q}\triangleq N_{q}/N$, $q\in\mathcal{V}_{K}$. Then $\pi=(\pi_{1},\pi_{2},...,\pi_{K})$ is a probability vector which gives the empirical distribution of the $K$ types of agents. Denote

\[
\begin{aligned}
\Pi^{K}&\triangleq\mathrm{diag}(I_{n}\otimes\pi_{1},I_{n}\otimes\pi_{2},...,I_{n}\otimes\pi_{K}),\\
N^{K}&\triangleq\mathrm{diag}(I_{n}\otimes N_{1},I_{n}\otimes N_{2},...,I_{n}\otimes N_{K}),\\
G^{K}&\triangleq\mathrm{rows}(\bar{G}_{1},\bar{G}_{2},...,\bar{G}_{K}),\\
Q^{K}&\triangleq\mathrm{diag}(Q_{1},Q_{2},...,Q_{K}),\\
\Gamma^{K}&\triangleq\mathrm{rows}(\bar{\Gamma}_{1},\bar{\Gamma}_{2},...,\bar{\Gamma}_{K}),\\
H^{K}&\triangleq\mathrm{diag}(H_{1},H_{2},...,H_{K}),\\
D^{K}&\triangleq\mathrm{cols}(D_{1},D_{2},...,D_{K})=N^{K}G^{K}(N^{K})^{-1},\\
\bar{Q}^{K}&\triangleq\mathrm{rows}(\bar{Q}_{1},\bar{Q}_{2},...,\bar{Q}_{K})\\
&=Q^{K}\Gamma^{K}+(N^{K})^{-1}(\Gamma^{K})^{T}Q^{K}N^{K}-(N^{K})^{-1}(\Gamma^{K})^{T}N^{K}Q^{K}\Gamma^{K},\\
\bar{H}^{K}&\triangleq\mathrm{rows}(\bar{H}_{1},\bar{H}_{2},...,\bar{H}_{K})\\
&=H^{K}\Gamma^{K}+(N^{K})^{-1}(\Gamma^{K})^{T}H^{K}N^{K}-(N^{K})^{-1}(\Gamma^{K})^{T}N^{K}H^{K}\Gamma^{K}.
\end{aligned}
\]

Note that $N^{K}=N\Pi^{K}$. By variational analysis, the following necessary and sufficient condition for the solvability of Problem (P1) is obtained.
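The aggregate matrices above are built from per-cluster data by Kronecker products and stacking. The following sketch assembles $\Pi^{K}$, $N^{K}$, $G^{K}$, and $D^{K}$ for a hypothetical two-cluster system and can be used to check the relation $N^{K}=N\Pi^{K}$; the cluster sizes and the matrices `M` and `G` are invented for the example.

```python
import numpy as np

n = 2
sizes = np.array([4, 6])                 # hypothetical N_1, N_2
K = len(sizes)
N = int(sizes.sum())
M = np.array([[0.0, 1.0],
              [1.0, 0.0]])               # hypothetical adjacency matrix
G = [np.eye(n), 2.0 * np.eye(n)]         # hypothetical G_1, G_2

pi = sizes / N                           # empirical distribution of types
Pi_K = np.kron(np.diag(pi), np.eye(n))                    # diag(pi_q * I_n)
N_K = np.kron(np.diag(sizes.astype(float)), np.eye(n))    # diag(N_q * I_n)

# G^K = rows(G_bar_1, ..., G_bar_K) with G_bar_q = (M_q / K) kron G_q.
G_K = np.vstack([np.kron(M[q:q + 1, :] / K, G[q]) for q in range(K)])

# D^K = N^K G^K (N^K)^{-1}; D_q is the q-th block column of D^K.
D_K = N_K @ G_K @ np.linalg.inv(N_K)
D_blocks = [D_K[:, q * n:(q + 1) * n] for q in range(K)]
```

Block $(q,p)$ of $D^{K}$ equals $(N_{q}/N_{p})(m_{qp}/K)G_{q}$, so $D^{K}$ coincides with $G^{K}$ when all clusters have the same size.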

Theorem 1

(P1) is solvable if and only if the following MF-FBSDEs

\[
\left\{
\begin{aligned}
&d\hat{x}_{i}(t)=\big[A_{q}\hat{x}_{i}(t)+B_{q}\hat{u}_{i}(t)+\bar{G}_{q}\hat{x}^{K}(t)\big]dt+\Sigma_{q}dw_{i}(t),\\
&d\lambda_{i}(t)=-\big[A_{q}^{T}\lambda_{i}(t)+D_{q}^{T}\lambda^{K}(t)+Q_{q}\hat{x}_{i}(t)-\bar{Q}_{q}\hat{x}^{K}(t)\big]dt+\sum_{p\in\mathcal{V}_{K}}\sum_{j\in\mathcal{I}_{p}}\beta_{i}^{j}(t)dw_{j}(t),\\
&\hat{x}_{i}(0)=x_{i0},\quad\lambda_{i}(T)=H_{q}\hat{x}_{i}(T)-\bar{H}_{q}\hat{x}^{K}(T),\quad i\in\mathcal{I}_{q},\ q\in\mathcal{V}_{K}
\end{aligned}
\right. \tag{8}
\]

admit an $\mathcal{F}_{t}$-adapted solution $(\hat{x}_{i}(t),\lambda_{i}(t),i\in\mathcal{I}_{q},q\in\mathcal{V}_{K})$ on $t\in[0,T]$, where $\lambda^{K}\triangleq\mathrm{vec}(\lambda^{K}_{1},\lambda^{K}_{2},...,\lambda^{K}_{K})$ with $\lambda^{K}_{q}\triangleq(1/N_{q})\sum_{i\in\mathcal{I}_{q}}\lambda_{i}$. The optimal centralized open-loop controller for agent $\mathcal{A}_{i}$, $i\in\mathcal{I}_{q}$, $q\in\mathcal{V}_{K}$, is given by

\[
\hat{u}_{i}(t)=-R_{q}^{-1}B_{q}^{T}\lambda_{i}(t). \tag{9}
\]

3.2 Feedback representation and Riccati equations

Based on the open-loop centralized optimal controller (9), we next obtain its feedback representation by introducing two Riccati equations. We also introduce the following notation for convenience.

\[
\begin{aligned}
A^{K}&\triangleq\mathrm{diag}(A_{1},A_{2},...,A_{K}),\\
B^{K}&\triangleq\mathrm{diag}(B_{1},B_{2},...,B_{K}),\\
R^{K}&\triangleq\mathrm{diag}(R_{1},R_{2},...,R_{K}),\\
\Sigma^{K}&\triangleq\mathrm{diag}(\Sigma_{1},\Sigma_{2},...,\Sigma_{K}).
\end{aligned}
\]

We propose the following Riccati equations

\[
0=\dot{P}^{K}(t)+P^{K}(t)A^{K}+(A^{K})^{T}P^{K}(t)+Q^{K}-P^{K}(t)B^{K}(R^{K})^{-1}(B^{K})^{T}P^{K}(t), \tag{10a}
\]
\[
\begin{aligned}
0=&\ \dot{K}^{K}(t)+[A^{K}+N^{K}G^{K}(N^{K})^{-1}]^{T}K^{K}(t)+K^{K}(t)[A^{K}+G^{K}]+P^{K}(t)G^{K}\\
&+(N^{K})^{-1}(G^{K})^{T}N^{K}P^{K}(t)-\bar{Q}^{K}-K^{K}(t)B^{K}(R^{K})^{-1}(B^{K})^{T}[P^{K}(t)+K^{K}(t)]\\
&-P^{K}(t)B^{K}(R^{K})^{-1}(B^{K})^{T}K^{K}(t),
\end{aligned} \tag{10b}
\]
\[
P^{K}(T)=H^{K},\quad K^{K}(T)=-\bar{H}^{K}.
\]

Let $P_{q}$ be the $q$-th diagonal block of $P^{K}$ and $\bar{K}_{q}$ be the $q$-th block row of $K^{K}$, $q\in\mathcal{V}_{K}$. Then the Riccati equations (10a)-(10b) can be rewritten as

\[
0=\dot{P}_{q}(t)+P_{q}(t)A_{q}+A_{q}^{T}P_{q}(t)+Q_{q}-P_{q}(t)B_{q}R_{q}^{-1}B_{q}^{T}P_{q}(t), \tag{11a}
\]
\[
\begin{aligned}
0=&\ \dot{\bar{K}}_{q}(t)+A_{q}^{T}\bar{K}_{q}(t)+D_{q}^{T}K^{K}(t)+\bar{K}_{q}(t)(A^{K}+G^{K})+P_{q}(t)\bar{G}_{q}+D_{q}^{T}P^{K}(t)\\
&-\bar{K}_{q}(t)B^{K}(R^{K})^{-1}(B^{K})^{T}[P^{K}(t)+K^{K}(t)]-\bar{Q}_{q}-P_{q}(t)B_{q}R_{q}^{-1}B_{q}^{T}\bar{K}_{q}(t),
\end{aligned} \tag{11b}
\]
\[
P_{q}(T)=H_{q},\quad\bar{K}_{q}(T)=-\bar{H}_{q},\quad q\in\mathcal{V}_{K}.
\]
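Equation (11a) is a standard matrix Riccati differential equation with a terminal condition, so it is integrated backward in time. The sketch below does this with a classical fourth-order Runge-Kutta scheme; the choice of integrator, the function names, and the step count are assumptions made for illustration and are not prescribed by the paper. Equation (11b) could be handled with the same backward-stepping pattern once $P_{q}$ is available.

```python
import numpy as np

def riccati_rhs(P, A, B, Q, R_inv):
    """Right-hand side of (11a) solved for dP/dt."""
    return -(P @ A + A.T @ P + Q - P @ B @ R_inv @ B.T @ P)

def solve_riccati_backward(A, B, Q, R, H, T=1.0, steps=1000):
    """Integrate (11a) from t = T down to t = 0 with RK4.

    Returns a list traj with traj[k] approximating P_q(k * T / steps).
    """
    R_inv = np.linalg.inv(R)
    h = -T / steps                      # negative step: marching backward in time
    P = np.array(H, dtype=float)        # terminal condition P_q(T) = H_q
    traj = [P]
    for _ in range(steps):
        k1 = riccati_rhs(P, A, B, Q, R_inv)
        k2 = riccati_rhs(P + 0.5 * h * k1, A, B, Q, R_inv)
        k3 = riccati_rhs(P + 0.5 * h * k2, A, B, Q, R_inv)
        k4 = riccati_rhs(P + h * k3, A, B, Q, R_inv)
        P = P + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        traj.append(P)
    return traj[::-1]                   # reorder so that index 0 corresponds to t = 0
```

In the scalar case $A_{q}=0$, $B_{q}=Q_{q}=R_{q}=1$, $H_{q}=0$, equation (11a) reduces to $\dot{P}_{q}=P_{q}^{2}-1$ with $P_{q}(T)=0$, whose solution is $P_{q}(t)=\tanh(T-t)$; this gives a convenient closed form to test the integrator against.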

Therefore the result of feedback representation is given as follows.

Theorem 2

(P1) admits optimal centralized feedback controllers if and only if the Riccati equations (10a)-(10b) admit solutions $P^{K}(t),K^{K}(t)$ on $t\in[0,T]$. In this case, the feedback controller of agent $\mathcal{A}_{i}$, $i\in\mathcal{I}_{q}$, $q\in\mathcal{V}_{K}$, is given by

\[
\hat{u}_{i}(t)=-R_{q}^{-1}B_{q}^{T}P_{q}(t)\hat{x}_{i}(t)-R_{q}^{-1}B_{q}^{T}\bar{K}_{q}(t)\hat{x}^{K}(t), \tag{12}
\]

where $P_{q}$ and $\bar{K}_{q}$ are given by (11a) and (11b). The value function admits the following form:

\[
{\rm V}(\mathbf{x}_{0})=\sum_{q\in\mathcal{V}_{K}}\sum_{i\in\mathcal{I}_{q}}\mathbb{E}\langle P_{q}(0)x_{i0},x_{i0}\rangle+\mathbb{E}\langle N^{K}K^{K}(0)x^{K}_{0},x^{K}_{0}\rangle. \tag{13}
\]

We introduce the following notations to simplify the expression of closed-loop systems.

\[
\tilde{A}_{q}\triangleq A_{q}-B_{q}R_{q}^{-1}B_{q}^{T}P_{q}, \tag{14}
\]
\[
\tilde{G}_{q}\triangleq\bar{G}_{q}-B_{q}R_{q}^{-1}B_{q}^{T}\bar{K}_{q}, \tag{15}
\]
\[
\tilde{A}^{K}\triangleq\mathrm{diag}(\tilde{A}_{1},\tilde{A}_{2},...,\tilde{A}_{K}), \tag{16}
\]
\[
\tilde{G}^{K}\triangleq\mathrm{rows}(\tilde{G}_{1},\tilde{G}_{2},...,\tilde{G}_{K}).
\]

Under the feedback controller (12), the closed-loop systems of $\hat{x}_{i}$, $\hat{x}^{K}_{q}$, and $\hat{x}^{K}$, $i\in\mathcal{I}_{q}$, $q\in\mathcal{V}_{K}$, are given by

\[
d\hat{x}_{i}(t)=[\tilde{A}_{q}(t)\hat{x}_{i}(t)+\tilde{G}_{q}(t)\hat{x}^{K}(t)]dt+\Sigma_{q}dw_{i}(t),\quad\hat{x}_{i}(0)=x_{i0}, \tag{17a}
\]
\[
d\hat{x}_{q}^{K}(t)=[\tilde{A}_{q}(t)\hat{x}_{q}^{K}(t)+\tilde{G}_{q}(t)\hat{x}^{K}(t)]dt+\Sigma_{q}dw_{q}^{K}(t),\quad\hat{x}_{q}^{K}(0)=x^{K}_{q0}, \tag{17b}
\]
\[
d\hat{x}^{K}(t)=[\tilde{A}^{K}(t)+\tilde{G}^{K}(t)]\hat{x}^{K}(t)dt+\Sigma^{K}dw^{K}(t),\quad\hat{x}^{K}(0)=x^{K}_{0}. \tag{17c}
\]
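The closed-loop matrices in (14)-(16) and the drift of (17c) can be assembled from the per-cluster data at a fixed time $t$. The sketch below does so under the assumption that $P_{q}(t)$ and $\bar{K}_{q}(t)$ have already been computed; the function names `closed_loop_blocks` and `aggregate_drift` are hypothetical.

```python
import numpy as np

def closed_loop_blocks(A_q, B_q, R_q, P_q, K_bar_q, G_bar_q):
    """Return (A~_q, G~_q) from (14) and (15) at a fixed time."""
    S_q = B_q @ np.linalg.inv(R_q) @ B_q.T      # B_q R_q^{-1} B_q^T, shape (n, n)
    A_tilde_q = A_q - S_q @ P_q                 # shape (n, n)
    G_tilde_q = G_bar_q - S_q @ K_bar_q         # shape (n, n*K)
    return A_tilde_q, G_tilde_q

def aggregate_drift(A_tildes, G_tildes):
    """Assemble diag(A~_1, ..., A~_K) + rows(G~_1, ..., G~_K), the drift matrix of (17c)."""
    n = A_tildes[0].shape[0]
    F = np.vstack(G_tildes).astype(float)       # rows(...) as a fresh (n*K, n*K) array
    for q, A_tilde_q in enumerate(A_tildes):
        F[q * n:(q + 1) * n, q * n:(q + 1) * n] += A_tilde_q
    return F
```

For example, with scalar data $A_{q}=1$, $B_{q}=1$, $R_{q}=2$, $P_{q}=4$, $\bar{K}_{q}=(2,0)$, and $\bar{G}_{q}=(0.5,0.25)$, the blocks are $\tilde{A}_{q}=-1$ and $\tilde{G}_{q}=(-0.5,0.25)$.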

4 Asymptotically Optimal Distributed Controllers

In this section we design the asymptotically optimal distributed controller based on the optimal centralized feedback controller (12) according to the network topology $\mathcal{G}_{K}$. We first design cluster mean field estimators to estimate the global cluster mean field term under the distributed information pattern. Distributed controllers are then proposed using the cluster mean field estimators. Finally, we prove the asymptotic optimality of the distributed controllers. We use $\check{x}_{i}$ to denote the state of agent $\mathcal{A}_{i}$, $i\in\mathcal{I}_{q}$, $q\in\mathcal{V}_{K}$, under the distributed controller $\check{u}_{i}$. Let $\check{x}^{K}_{q}$ and $\check{x}^{K}$ be the corresponding local and global cluster mean field terms, respectively.

4.1 Cluster mean field estimators

Note that the optimal centralized feedback controller (12) contains the individual state and the global cluster mean field term. Moreover, the ability of each agent to acquire information differs according to the communication matrix $E_{K}$. The main idea of the design is therefore that each cluster makes a local estimate of the global cluster mean field term according to the network topology.

Accordingly, we design cluster mean field estimators for the agents in each cluster to estimate the global cluster mean field term. In what follows, we denote the local estimate by agent $\mathcal{A}_{i}$ in cluster $\mathcal{C}_{q}$, $q\in\mathcal{V}_{K}$, of the global cluster mean field term $\check{x}^{K}$ by

\[
\bar{x}^{K,q}\triangleq\mathrm{vec}(\bar{x}^{K,q}_{1},\bar{x}^{K,q}_{2},...,\bar{x}^{K,q}_{K}),
\]

where $\bar{x}^{K,q}_{p}$ denotes the estimate of the local cluster mean field term $\check{x}^{K}_{p}$ by the agents in cluster $\mathcal{C}_{q}$.

For agents in cluster $\mathcal{C}_{q}$, if $p\in\mathcal{N}_{q}$, then the agents can obtain the local cluster mean field term $\check{x}^{K}_{p}$ by communication, i.e.,

\[
\bar{x}^{K,q}_{p}=\check{x}^{K}_{p}, \tag{18}
\]

otherwise, the agents make their own local estimate by mean field approximations according to (17b), i.e.,

\[
d\bar{x}^{K,q}_{p}(t)=[\tilde{A}_{p}(t)\bar{x}^{K,q}_{p}(t)+\tilde{G}_{p}(t)\bar{x}^{K,q}(t)]dt,\quad\bar{x}^{K,q}_{p}(0)=\bar{m}_{p}. \tag{19}
\]

To represent the dynamics of the cluster mean field estimator more compactly, the following notation is introduced for $q,p\in\mathcal{V}_{K}$:

\[
\begin{aligned}
E_{qp}&\triangleq 1_{n\times 1}\otimes e_{qp},\\
E^{K}_{q}&\triangleq\mathrm{rows}(E_{q1},E_{q2},...,E_{qK}),\\
\bar{E}_{qp}&\triangleq 1_{n\times 1}-E_{qp},\\
\bar{E}^{K}_{q}&\triangleq\mathrm{rows}(\bar{E}_{q1},\bar{E}_{q2},...,\bar{E}_{qK}),\\
Z^{K}&\triangleq\mathrm{diag}(B_{1}R_{1}^{-1}B_{1}^{T}\bar{K}_{1},...,B_{K}R_{K}^{-1}B_{K}^{T}\bar{K}_{K}).
\end{aligned}
\]

By (18) and (19), the dynamics of the designed cluster mean field estimator for agent $\mathcal{A}_{i}$, $i\in\mathcal{I}_{q}$, $q\in\mathcal{V}_{K}$, are given by

\[
d\bar{x}^{K,q}(t)=E^{K}_{q}\odot d\check{x}^{K}(t)+\bar{E}^{K}_{q}\odot\big([\tilde{A}^{K}(t)+\tilde{G}^{K}(t)]\bar{x}^{K,q}(t)\big)dt,\quad\bar{x}^{K,q}(0)=E^{K}_{q}\odot x^{K}_{0}+\bar{E}^{K}_{q}\odot\bar{m}^{K}. \tag{20}
\]
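The estimator (20) is a masked update: entries belonging to neighboring clusters copy the increment of the realized mean field, while the remaining entries follow the deterministic mean field model. A minimal sketch of one explicit Euler step is given below. The communication matrix `E`, the function names, and the Euler discretization are assumptions made for illustration, and `F` stands for $\tilde{A}^{K}(t)+\tilde{G}^{K}(t)$ evaluated at the current time.

```python
import numpy as np

n, K = 2, 3
# Hypothetical communication matrix: row q marks the clusters observed by cluster q.
E = np.array([[0, 0, 1],
              [1, 0, 0],
              [1, 1, 0]])

def masks(q):
    """Return (E_q^K, E_bar_q^K) as 0/1 vectors of length n*K (zero-based q)."""
    E_q = np.kron(E[q], np.ones(n))     # each e_qp repeated n times
    return E_q, 1.0 - E_q

def estimator_step(x_bar_q, dx_check, F, q, dt):
    """One Euler step of (20) for cluster q.

    x_bar_q:  current estimate of the global cluster mean field, length n*K.
    dx_check: increment of the realized global cluster mean field over the step.
    F:        drift matrix A~^K + G~^K at the current time, shape (n*K, n*K).
    """
    E_q, E_bar_q = masks(q)
    return x_bar_q + E_q * dx_check + E_bar_q * (F @ x_bar_q) * dt
```

Because the two masks sum to the all-ones vector, every entry of the estimate is updated by exactly one of the two rules.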

4.2 Distributed controllers

Based on the above discussion, we propose the following distributed controller for agent $\mathcal{A}_{i}$, $i\in\mathcal{I}_{q}$, $q\in\mathcal{V}_{K}$:

\[
\check{u}_{i}(t)=-R_{q}^{-1}B_{q}^{T}P_{q}(t)\check{x}_{i}(t)-R_{q}^{-1}B_{q}^{T}\bar{K}_{q}(t)\bar{x}^{K,q}(t), \tag{21}
\]

where $\bar{x}^{K,q}$ is the estimate of the global cluster mean field term by cluster $\mathcal{C}_{q}$ given by the cluster mean field estimator (20). Under this distributed controller, the average distributed control of cluster $\mathcal{C}_{q}$ is given by

\[
\check{u}_{q}^{K}(t)=-R_{q}^{-1}B_{q}^{T}P_{q}(t)\check{x}_{q}^{K}(t)-R_{q}^{-1}B_{q}^{T}\bar{K}_{q}(t)\bar{x}^{K,q}(t). \tag{22}
\]
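The controller (21) is a linear feedback on the agent's own state and its cluster's estimate, and (22) follows from (21) by averaging over the cluster because the estimate is shared by all agents in the cluster. The sketch below evaluates (21) at a fixed time; the function name `distributed_control` and the numerical values in the usage note are illustrative only.

```python
import numpy as np

def distributed_control(x_i, x_bar_q, P_q, K_bar_q, B_q, R_q):
    """Evaluate (21): u_i = -R_q^{-1} B_q^T (P_q x_i + K_bar_q x_bar^{K,q})."""
    gain = np.linalg.solve(R_q, B_q.T)          # R_q^{-1} B_q^T without forming the inverse
    return -gain @ (P_q @ x_i + K_bar_q @ x_bar_q)

def average_control(xs, x_bar_q, P_q, K_bar_q, B_q, R_q):
    """Average of (21) over the agents of one cluster; xs has shape (N_q, n)."""
    controls = [distributed_control(x, x_bar_q, P_q, K_bar_q, B_q, R_q) for x in xs]
    return np.mean(controls, axis=0)
```

With scalar data $R_{q}=2$, $B_{q}=1$, $P_{q}=4$, $\bar{K}_{q}=(2,0)$, estimate $(3,5)$, and agent states $1$ and $3$, the individual controls are $-5$ and $-9$, and their average $-7$ equals (21) evaluated at the mean state $2$, in agreement with (22).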

Let

\[
\bar{x}^{K}\triangleq\mathrm{vec}(\bar{x}^{K,1},\bar{x}^{K,2},...,\bar{x}^{K,K})
\]

be the estimated global cluster mean field term of all clusters in the large-population system. Therefore the closed-loop systems under the distributed controller are given by

\[
d\check{x}_{i}(t)=[\tilde{A}_{q}(t)\check{x}_{i}(t)+\bar{G}_{q}\check{x}^{K}(t)-B_{q}R_{q}^{-1}B_{q}^{T}\bar{K}_{q}(t)\bar{x}^{K,q}(t)]dt+\Sigma_{q}dw_{i}(t),\quad\check{x}_{i}(0)=x_{i0}, \tag{23a}
\]
\[
d\check{x}_{q}^{K}(t)=[\tilde{A}_{q}(t)\check{x}_{q}^{K}(t)+\bar{G}_{q}\check{x}^{K}(t)-B_{q}R_{q}^{-1}B_{q}^{T}\bar{K}_{q}(t)\bar{x}^{K,q}(t)]dt+\Sigma_{q}dw_{q}^{K}(t),\quad\check{x}_{q}^{K}(0)=x^{K}_{q0}, \tag{23b}
\]
\[
d\check{x}^{K}(t)=\big\{[\tilde{A}^{K}(t)+G^{K}]\check{x}^{K}(t)-Z^{K}(t)\bar{x}^{K}(t)\big\}dt+\Sigma^{K}dw^{K}(t),\quad\check{x}^{K}(0)=x^{K}_{0}. \tag{23c}
\]

4.3 Asymptotic optimality

Next we show the asymptotic optimality of the designed distributed controller. We first define two kinds of estimation errors of the designed estimator (20) with respect to the realized cluster mean field terms under the centralized controller (12) and the distributed controller (21), respectively. For $q,p\in\mathcal{V}_{K}$, denote

\[
\begin{aligned}
\check{\xi}^{K,q}_{p}&\triangleq\check{x}^{K}_{p}-\bar{x}^{K,q}_{p},\\
\hat{\xi}^{K,q}_{p}&\triangleq\hat{x}^{K}_{p}-\bar{x}^{K,q}_{p},\\
\check{\xi}^{K,q}&\triangleq\mathrm{vec}(\check{\xi}^{K,q}_{1},\check{\xi}^{K,q}_{2},...,\check{\xi}^{K,q}_{K}),\\
\hat{\xi}^{K,q}&\triangleq\mathrm{vec}(\hat{\xi}^{K,q}_{1},\hat{\xi}^{K,q}_{2},...,\hat{\xi}^{K,q}_{K}),\\
\check{\xi}^{K}&\triangleq\mathrm{vec}(\check{\xi}^{K,1},\check{\xi}^{K,2},...,\check{\xi}^{K,K}),\\
\hat{\xi}^{K}&\triangleq\mathrm{vec}(\hat{\xi}^{K,1},\hat{\xi}^{K,2},...,\hat{\xi}^{K,K}).
\end{aligned}
\]

To prove the asymptotic optimality of the distributed controller (21), we first obtain the following lemma, which shows that the estimator is asymptotically unbiased with respect to the number of agents.

Lemma 1

Assume (A1) holds. Let $C_{1}=\min_{q\in\mathcal{V}_{K}}N_{q}$. The cluster mean field estimator (19) is asymptotically unbiased, i.e.,

\begin{equation}
\mathbb{E}|\check{\xi}^{K}_{q}(t)|^{2}=O\Big(\frac{1}{C_{1}}\Big),\quad\mathbb{E}|\hat{\xi}^{K}_{q}(t)|^{2}=O\Big(\frac{1}{C_{1}}\Big),\quad q\in\mathcal{V}_{K}, \tag{24}
\end{equation}

and $\bar{x}^{K}$ and $\check{x}^{K}$ are bounded in the mean-square sense, i.e.,

\begin{equation}
\mathbb{E}|\check{x}^{K}(t)|^{2}=O(1),\quad\mathbb{E}|\bar{x}^{K}(t)|^{2}=O(1),\quad q\in\mathcal{V}_{K}. \tag{25}
\end{equation}
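
The $O(1/C_{1})$ rate in (24) can be illustrated by a small Monte Carlo experiment. The sketch below is not the paper's proof. It uses a toy scalar, single-cluster model in which the estimation error $e$ is assumed to satisfy $de=(a+g)\,e\,dt+(\sigma/N)\sum_{i}dw_{i}$ with $e(0)=0$, so that only the averaged noise drives the error. The names `a_plus_g` and `sigma` are hypothetical parameters of this toy model.

```python
import math
import random


def mean_square_error(n_agents, a_plus_g=-0.7, sigma=0.2, horizon=1.0,
                      steps=20, trials=200, seed=1):
    """Monte Carlo estimate of E|e(T)|^2 for the toy error dynamics.

    ASSUMPTION: the error between the realized cluster mean and its
    estimate obeys de = a_plus_g * e dt + (sigma / N) * sum_i dw_i.
    """
    rng = random.Random(seed)
    dt = horizon / steps
    scale = sigma * math.sqrt(dt) / n_agents
    total = 0.0
    for _ in range(trials):
        err = 0.0
        for _ in range(steps):
            # Sum of N independent standard normal increments.
            noise = sum(rng.gauss(0.0, 1.0) for _ in range(n_agents))
            err += a_plus_g * err * dt + scale * noise
        total += err * err
    return total / trials


if __name__ == "__main__":
    for n in (10, 40, 160):
        # If the error decays like 1/N, the product n * MSE stays roughly flat.
        print(n, n * mean_square_error(n))
```

If the mean-square error scales like $1/N$, the printed products should be of similar magnitude across the three population sizes, up to Monte Carlo noise.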

The following two lemmas are used to decompose the social cost function.

Lemma 2

Let $\zeta_{i}=x_{i}-x^{K}_{q}$, $v_{i}=u_{i}-u^{K}_{q}$, and $\mathbf{v}=\mathrm{vec}(v_{1},v_{2},...,v_{N})$. Then the social cost (4) can be rewritten as

\begin{equation}
J_{\rm soc}^{\rm(N)}(\mathbf{u})=J^{1}(u^{K})+J^{2}(\mathbf{v}), \tag{26}
\end{equation}

where

\begin{align}
J^{1}(u^{K})\triangleq{}&\sum_{q\in\mathcal{V}_{K}}\sum_{i\in\mathcal{I}_{q}}\mathbb{E}\Big\{\int_{0}^{T}\big[|x^{K}_{q}(t)-\bar{\Gamma}_{q}x^{K}(t)|^{2}_{Q_{q}}+|u^{K}_{q}(t)|^{2}_{R_{q}}\big]dt+|x^{K}_{q}(T)-\bar{\Gamma}_{q}x^{K}(T)|^{2}_{H_{q}}\Big\}, \tag{27a}\\
J^{2}(\mathbf{v})\triangleq{}&\sum_{q\in\mathcal{V}_{K}}\sum_{i\in\mathcal{I}_{q}}\mathbb{E}\int_{0}^{T}\big[|\zeta_{i}(t)|^{2}_{Q_{q}}+|v_{i}(t)|^{2}_{R_{q}}\big]dt+\sum_{q\in\mathcal{V}_{K}}\sum_{i\in\mathcal{I}_{q}}\mathbb{E}|\zeta_{i}(T)|^{2}_{H_{q}}. \tag{27b}
\end{align}
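
The algebra behind a decomposition of this type can be checked numerically on a scalar example. The sketch below assumes that each agent's state cost has the form $|x_{i}-c|^{2}_{Q}$, with $c$ standing in for $\bar{\Gamma}_{q}x^{K}$. This form is inferred from (27a) and is not restated here from (4). The cross term vanishes because the deviations from the cluster average sum to zero. The function name `split_cost` and its arguments are hypothetical.

```python
def split_cost(xs, target, q_weight):
    """Split sum_i q*(x_i - target)^2 into a mean part and a deviation part.

    Returns (total, mean_part, dev_part), where
      mean_part = n * q * (mean - target)^2   (analogue of the J^1 state term)
      dev_part  = sum_i q * (x_i - mean)^2    (analogue of the J^2 state term)
    """
    n = len(xs)
    mean = sum(xs) / n
    total = sum(q_weight * (x - target) ** 2 for x in xs)
    mean_part = n * q_weight * (mean - target) ** 2
    dev_part = sum(q_weight * (x - mean) ** 2 for x in xs)
    return total, mean_part, dev_part


if __name__ == "__main__":
    total, mean_part, dev_part = split_cost([1.0, 2.0, 4.0], 0.5, 2.0)
    # The two printed numbers agree up to floating-point rounding.
    print(total, mean_part + dev_part)
```

The same cancellation applies to the control term with $v_{i}=u_{i}-u^{K}_{q}$, since the $v_{i}$ also sum to zero within a cluster.
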
Lemma 3

Let $\hat{\zeta}_{i}=\hat{x}_{i}-\hat{x}^{K}_{q}$, $\hat{v}_{i}=\hat{u}_{i}-\hat{u}^{K}_{q}$, $\check{\zeta}_{i}=\check{x}_{i}-\check{x}^{K}_{q}$ and $\check{v}_{i}=\check{u}_{i}-\check{u}^{K}_{q}$. The following equations hold for $i\in\mathcal{I}_{q},q\in\mathcal{V}_{K}$:

\begin{equation}
\hat{\zeta}_{i}(t)=\check{\zeta}_{i}(t),\quad\hat{v}_{i}(t)=\check{v}_{i}(t), \tag{28}
\end{equation}

and

\begin{equation}
J^{2}(\hat{\mathbf{v}})=J^{2}(\check{\mathbf{v}}). \tag{29}
\end{equation}
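
A short sketch of why the deviations do not depend on the estimator: subtracting (23b) from (23a) for $i\in\mathcal{I}_{q}$, and assuming that $\bar{G}_{q}$ denotes the same coefficient in both equations, the terms involving $\check{x}^{K}$ and $\bar{x}^{K,q}$ cancel.

```latex
% Sketch only: difference of (23a) and (23b) for an agent i in cluster q.
\begin{align*}
d\check{\zeta}_{i}(t) &= d\check{x}_{i}(t)-d\check{x}^{K}_{q}(t)\\
&= \tilde{A}_{q}(t)\,[\check{x}_{i}(t)-\check{x}^{K}_{q}(t)]\,dt+\Sigma_{q}\,[dw_{i}(t)-dw^{K}_{q}(t)]\\
&= \tilde{A}_{q}(t)\,\check{\zeta}_{i}(t)\,dt+\Sigma_{q}\,[dw_{i}(t)-dw^{K}_{q}(t)],
\qquad \check{\zeta}_{i}(0)=x_{i0}-x^{K}_{q0}.
\end{align*}
```

If the centralized closed-loop system has the same $\tilde{A}_{q}$, noise and initial conditions (an assumption here, since those equations are not restated in this section), then $\hat{\zeta}_{i}$ solves the same linear equation, which is consistent with (28).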

By Lemmas 1, 2 and 3, the asymptotic optimality property of the designed distributed controller is given as follows.

Theorem 3

Assume (A1) holds. The distributed controller (21) is asymptotically socially optimal, i.e.,

\begin{equation}
\frac{J_{\rm soc}^{\rm(N)}(\mathbf{\check{u}})}{N}-\frac{J_{\rm soc}^{\rm(N)}(\mathbf{\hat{u}})}{N}=O\Big(\frac{1}{\sqrt{C_{1}}}\Big). \tag{30}
\end{equation}
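
As a sketch of how the lemmas combine, the cost gap reduces to a difference of the cluster-level costs, because the deviation costs cancel by (29):

```latex
% Sketch only: reduction of the cost gap using (26) and (29).
\begin{align*}
J_{\rm soc}^{\rm(N)}(\check{\mathbf{u}})-J_{\rm soc}^{\rm(N)}(\hat{\mathbf{u}})
&= [J^{1}(\check{u}^{K})+J^{2}(\check{\mathbf{v}})]-[J^{1}(\hat{u}^{K})+J^{2}(\hat{\mathbf{v}})] && \text{by (26)}\\
&= J^{1}(\check{u}^{K})-J^{1}(\hat{u}^{K}) && \text{by (29)}.
\end{align*}
```

The remaining difference involves only cluster mean field quantities, which is where the estimates of Lemma 1 would enter. The rate in (30) itself is not derived in this sketch.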

5 Conclusion

This paper studies mean field social control with heterogeneous agents following the direct approach. A set of asymptotically optimal distributed controllers is designed by constructing a cluster mean field estimator for each agent. We consider the finite-cluster case with additive noise. An interesting direction for future work is to extend the results, using graphon theory, to heterogeneous mean field systems with multiplicative noise and infinitely many clusters.

References

  • [1] A. Bensoussan, J. Frehse, and P. Yam, Mean Field Games and Mean Field Type Control Theory, Springer, New York, 2013.
  • [2] P. E. Caines, “Mean field games,” in Encyclopedia of Systems and Control, T. Samad and J. Baillieul, Eds. Berlin: Springer-Verlag, 2014.
  • [3] D. A. Gomes and J. Saude, “Mean field games models-a brief survey,” Dynamic Games and Applications, vol. 4, no. 2, pp. 110-154, 2014.
  • [4] B.-C. Wang and M. Huang, “Mean field production output control with sticky prices: Nash and social solutions,” Automatica, vol. 100, pp. 90-98, Feb. 2019.
  • [5] M. R. Arefin, T. Masaki, K. M. A. Kabir, and J. Tanimoto, “Interplay between cost and effectiveness in influenza vaccine uptake: a vaccination game approach,” Proc. R. Soc. A, vol. 475, no. 2232, Dec. 2019.
  • [6] M. Larranaga, J. Denis, M. Assaad, and K. D. Turck, “Energy-efficient distributed transmission scheme for MTC in dense wireless networks: a mean-field approach,” IEEE IoT-J, vol. 7, no. 1, pp. 477-490, Jan. 2020.
  • [7] J. M. Lasry and P. L. Lions, “Mean field games,” Japan J. Math., vol. 2, pp. 229-260, 2007.
  • [8] M. Huang, P. E. Caines and R. P. Malhame, “Large-population cost-coupled LQG problems with nonuniform agents: individual-mass behavior and decentralized ε-Nash equilibria,” IEEE Trans. Autom. Control, vol. 52, pp. 1560-1571, 2007.
  • [9] J. Arabneydi and A. G. Aghdam, “Deep teams: Decentralized decision making with finite and infinite number of agents,” IEEE Trans. Autom. Control, vol. 65, no. 10, pp. 4230–4245, 2020.
  • [10] A. Bensoussan, K. C. J. Sung, S. C. P. Yam and S. P. Yung, “Linear-quadratic mean field games,” Journal of Optimization Theory and Applications, vol. 169, no. 2, pp. 1556-1563, 2016.
  • [11] V. N. Kolokoltsov, M. Troeva, and W. Yang, “On the rate of convergence for the mean-field approximation of controlled diffusions with large number of players,” Dynamic Games and Applications, vol. 4, pp. 208–230, 2014.
  • [12] M. Huang, P. Caines, and R. Malhame, “Social optima in mean field LQG control: Centralized and decentralized strategies,” IEEE Trans. Autom. Control, vol. 57, no. 7, pp. 1736-1751, 2012.
  • [13] M. Huang, “Large-population LQG games involving a major player: The Nash certainty equivalence principle,” SIAM J. Control Optim., vol. 48, no. 5, pp. 3318–3353, 2010.
  • [14] M. Huang and M. Zhou, “Linear quadratic mean field games: Asymptotic solvability and relation to the fixed point approach,” IEEE Trans. Autom. Control, vol. 65, no. 4, pp. 1397-1412, 2020.
  • [15] B.-C. Wang, H. Zhang, J.-F. Zhang, “Mean field linear-quadratic control: Uniform stabilization and social optimality,” Automatica, vol. 121, 2020.
  • [16] B.-C. Wang and H. Zhang, “Indefinite linear quadratic mean field social control problems with multiplicative noise,” IEEE Trans. Autom. Control, doi: 10.1109/TAC.2020.3036246.
  • [17] B.-C. Wang, “Linear quadratic mean field social control with random coefficients and common noise,” in Proc. Chinese Control Conference, pp. 1491-1498, 2019.
  • [18] S. Gao, P. E. Caines, M. Huang, “LQG graphon mean field games,” arXiv:2004.00679, 2020.
  • [19] P. E. Caines, M. Huang, “Graphon mean field games and the GMFG equations,” arXiv:2008.10216, 2020.