Distributed Control of Linear Quadratic Mean Field Social Systems with Heterogeneous Agents
Abstract
In this paper, we study social optimality for mean field linear-quadratic control systems by the direct approach, where subsystems are coupled through individual dynamics and costs according to a network topology. A graph is introduced to represent the network topology of the large-population system, where nodes represent subpopulations called clusters and edges represent communication relationships. Following the direct approach, we first seek the optimal controller under the centralized information structure, which is characterized by a set of forward-backward stochastic differential equations. The feedback controller is then obtained with the help of Riccati equations. Finally, we design a distributed controller based on mean field approximations and show that it is asymptotically socially optimal.
mean field games, multi-agent systems, linear quadratic optimal control, asymptotically social optimality
1 Introduction
In recent years, mean field games and control have become an active research topic with wide applications in system control, applied mathematics, and economics [1, 2, 3], such as dynamic production adjustment [4], vaccination games [5], and resource allocation in the Internet of Things [6]. Mean field game theory originated from the parallel works of Lasry and Lions [7] and of Huang et al. [8]. A key feature of a mean field system is its mean field coupling terms, which often appear in multi-agent systems, such as the team decision problem with partially exchangeable agents [9].
Under the centralized information structure, the computational complexity grows exponentially with the number of agents. Due to communication constraints, it is also unrealistic for each agent to obtain global information. Therefore, it is meaningful to design decentralized controllers that use only local information. The difficulty lies in using local information to approximate the mean field coupling term, which depends on global information. At present, there are mainly two approaches: the fixed point approach and the direct approach. The first proceeds from the infinite population to the finite population. A fixed point equation is first obtained in the infinite population system, and its solution is then substituted into the finite population system to analyze the ε-Nash equilibrium property. See [7, 10, 11, 12, 16] for details. The second proceeds from the finite population to the infinite population. One first solves for the Nash equilibrium directly in the finite population system and then obtains the limit of the Nash equilibrium as the number of agents tends to infinity. Finally, it is proved that the limit is an ε-Nash equilibrium of the original system. See [8, 14, 15, 16, 17] for details. As the complexity of the system increases, the fixed point equation becomes more complicated and difficult to solve, as in mean field games with major-minor agents [16]. The fixed point approach may even fail when the system contains common noise [17].
In this paper, we study the mean field social control problem by the direct approach. The agents in the large-population system are heterogeneous, which makes it difficult to derive the centralized controller. The case we consider in this paper is that the large-population system contains -type agents. Agents of the same type are grouped into a cluster, which is represented as a node of the graph . The agents in the same cluster generate a local mean field term. Therefore there are a total of local mean field terms in this large-population system. Agents are coupled with each other in dynamics and costs through the mean field terms according to the adjacency matrix. Compared with mean field social control involving homogeneous agents, the heterogeneous case is harder to analyze, and the centralized controller is difficult to derive using traditional methods alone. The key to solving this problem is to construct the local mean field terms. The distributed controller is designed by optimally estimating the local mean field terms. The main contributions of the paper are listed as follows.
- By variational analysis, a necessary and sufficient condition for the existence of centralized controllers is obtained, which is characterized by the adapted solution of a set of forward-backward stochastic differential equations.
- The centralized optimal feedback controller is obtained by introducing Riccati equations.
- A distributed controller based on mean field approximations is proposed and shown to be asymptotically socially optimal.
The rest of the paper is organized as follows. Section 2 formulates the mean field social control problem. Section 3 gives the centralized results of this paper. Section 4 designs distributed controllers based on the centralized results and shows their asymptotic social optimality. Section 5 concludes the paper.
Notations: For a matrix , means the transpose of . For a set of vectors , denotes the vector . For a set of matrices , , , and denote the matrices , and the diagonal matrix with the elements of on the main diagonal. Let be the -dimensional identity matrix. Let and denote the matrix with all the elements and , respectively. We use etc. to denote generic constants, which may vary from place to place. For two matrices and , let and denote the Kronecker product and the dot product, respectively, i.e.,
2 Problem Formulation
2.1 Network topology
Consider a large-population system with clusters , , where the cluster contains homogeneous agents with the same dynamics, cost functions and communication capabilities. In this scenario, the large-population system contains a total of heterogeneous agents , , and the agents are partitioned into the disjoint clusters , . The set of agents belonging to the cluster is denoted by with , , and . In this paper, we also list the agents as , , .
The agents in different clusters communicate with each other through a network whose topology is modeled as a directed graph , where the clusters and the communication channels between clusters are represented by the node set and the edge set , respectively. An edge denoted by the pair represents a communication channel from cluster to cluster . The neighbor set of cluster is denoted by . Let be the communication matrix, where if , if . The weighted adjacency matrix of is denoted by . The agents are coupled with each other according to the adjacency matrix.
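The relation between the edge set, the neighbor sets, and the communication matrix can be illustrated with a minimal sketch. The function name `communication_matrix`, the zero-based cluster indices, and the convention that an edge `(j, k)` makes cluster `j` a neighbor of cluster `k` (so that entry `C[k][j]` equals 1) are assumptions made for illustration; the symbols of the original definition were not available here.

```python
def communication_matrix(num_clusters, edges):
    """Build neighbor sets and a 0/1 communication matrix for a directed graph.

    Assumed convention (not confirmed by the source): an edge (j, k) is a
    channel from cluster j to cluster k, so j is a neighbor of k and
    C[k][j] = 1. All other entries are 0.
    """
    neighbors = {k: set() for k in range(num_clusters)}
    for j, k in edges:
        neighbors[k].add(j)
    C = [[1 if j in neighbors[k] else 0 for j in range(num_clusters)]
         for k in range(num_clusters)]
    return neighbors, C


# Hypothetical three-cluster network with four directed channels.
neighbors, C = communication_matrix(3, [(0, 1), (1, 2), (2, 0), (0, 2)])
for row in C:
    print(row)
```

Under this convention, row `k` of `C` lists the clusters whose local mean field term cluster `k` can receive by communication.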
2.2 Dynamics and costs
Let and be the state and control of agent . The local cluster mean field term of cluster is defined as the empirical mean of the states of all agents in the cluster, i.e.,
(1) |
Therefore the global cluster mean field term of the large-population system is stacked as . By the adjacency matrix , the influence of the global cluster mean field term on the agents in cluster is denoted as , where
(2) |
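As a rough numerical illustration of the two quantities just defined, the sketch below computes the empirical mean of each cluster and then a weighted combination of these means for every cluster. It assumes scalar states and assumes that the influence term on cluster `k` is the sum of the cluster means weighted by row `k` of the adjacency matrix; the helper names and the sample numbers are hypothetical.

```python
def cluster_means(states_by_cluster):
    """Empirical mean of the (scalar) states in each cluster."""
    return [sum(xs) / len(xs) for xs in states_by_cluster]


def weighted_influence(adjacency, means):
    """Assumed form of the coupling term: row k of the weighted adjacency
    matrix applied to the stacked vector of cluster means."""
    return [sum(w * m for w, m in zip(row, means)) for row in adjacency]


# Hypothetical data: three clusters with 2, 3 and 1 agents.
states = [[1.0, 3.0], [2.0, 4.0, 6.0], [10.0]]
W = [[0.0, 0.5, 0.5],
     [1.0, 0.0, 0.0],
     [0.25, 0.75, 0.0]]

means = cluster_means(states)             # [2.0, 4.0, 10.0]
influence = weighted_influence(W, means)  # [7.0, 2.0, 3.5]
print(means, influence)
```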
Coupled by the term , the dynamics of agent , is given by the following stochastic differential equation
(3) |
where are constant matrices with compatible dimensions and are -dimensional independent standard Brownian motions defined on a complete filtered probability space .
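To make the coupled dynamics concrete, the following sketch simulates them with the Euler-Maruyama scheme. It is only an illustration under stated assumptions: states are scalar, and the drift of an agent in cluster `k` is assumed to have the linear form `a[k]*x + b[k]*u + g[k]*z_k` with additive noise intensity `d[k]`, where `z_k` is the adjacency-weighted combination of cluster means. The coefficient names, the `feedback` callback, and the sample values are hypothetical and are not taken from equation (3).

```python
import math
import random


def simulate(x0_by_cluster, a, b, g, d, W, feedback, T=1.0, steps=100, seed=0):
    """Euler-Maruyama simulation of mean-field-coupled scalar agents.

    Assumed dynamics for an agent in cluster k (illustrative only):
        dx = (a[k]*x + b[k]*u + g[k]*z_k) dt + d[k] dW,
    where z_k = sum_j W[k][j] * (empirical mean of cluster j).
    """
    rng = random.Random(seed)
    dt = T / steps
    x = [list(xs) for xs in x0_by_cluster]
    for _ in range(steps):
        means = [sum(xs) / len(xs) for xs in x]
        z = [sum(w * m for w, m in zip(row, means)) for row in W]
        new_x = []
        for k, xs in enumerate(x):
            row = []
            for xi in xs:
                u = feedback(k, xi, z[k])              # control of this agent
                dw = rng.gauss(0.0, math.sqrt(dt))     # independent Brownian increment
                row.append(xi + (a[k] * xi + b[k] * u + g[k] * z[k]) * dt + d[k] * dw)
            new_x.append(row)
        x = new_x
    return x


# Hypothetical two-cluster example with a simple stabilizing feedback.
final = simulate(
    x0_by_cluster=[[1.0, 2.0], [0.5, -0.5, 1.5]],
    a=[0.1, -0.2], b=[1.0, 1.0], g=[0.3, 0.3], d=[0.2, 0.2],
    W=[[0.0, 1.0], [1.0, 0.0]],
    feedback=lambda k, x, z: -x,
)
print(final)
```

Each agent draws its own Brownian increment, which mirrors the independence of the driving noises stated above.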
Let and be the state and control of the large-population system, respectively. Then the cost function of agent , is given by
where are constant matrices with compatible dimensions. The social cost function is defined as
(4) |
Let , , be the -th row of and
Then the individual dynamics (3) can be rewritten as
(5) |
Similarly, let
Thus the social cost (4) can be rewritten as
(6) |
2.3 Main problems
In this paper, we apply the direct approach to study the mean field social control problem first under the centralized and then under the distributed information pattern. We therefore first define the filtrations used in this paper. Denote
Then the centralized and distributed admissible control sets of agent , are given by and , respectively, where
As stated in the Introduction, to our knowledge, there is currently no general method for designing optimal distributed controllers. Thus we consider asymptotically optimal distributed controllers based on the mean field approximation methodology. To characterize the asymptotic optimality of distributed controllers, the following definition is introduced:
Definition 1: For the dynamics (3) with the social cost function (4), a set of controls , is called an asymptotically optimal distributed controller with respect to the number of agents in the large-population system if
(7) |
We propose the following two problems:
Problem 1 (P1)
For each agent , , find a -adapted optimal centralized controller to minimize the social cost function (4).
Problem 2 (P2)
For each agent , , find a -adapted asymptotically optimal distributed controller to minimize the social cost function (4).
Therefore, in this paper we first solve Problem (P1) to obtain optimal centralized controllers, and then further study Problem (P2) to design distributed asymptotically optimal controllers based on mean field approximations. The following assumption on the distribution of initial states is imposed.
(A1) The initial states , , , are mutually independent with , , , and there exists a finite constant independent of such that . Let .
3 Optimal Centralized Controllers
In this section we first solve (P1) by variational analysis to obtain the open-loop centralized optimal controller which is characterized by a set of FBSDEs, then obtain its feedback representation by virtue of two Riccati equations.
3.1 Open-loop controllers and MF-FBSDEs
We introduce some notation to simplify the derivation of the main result. Denote , . Then is a probability vector which gives the empirical distribution of the -type agents. Denote
Note that we have . By variational analysis, the necessary and sufficient condition is obtained for the solvability of Problem (P1) as follows.
Theorem 1
(P1) is solvable if and only if the following MF-FBSDEs
(8) |
admit a -adapted solution on , where with . The optimal centralized open-loop controller for agent , is given by
(9) |
3.2 Feedback representation and Riccati equations
Based on the open-loop centralized optimal controller (9), we next obtain its feedback representation by introducing two Riccati equations. For convenience, we also introduce the following notation.
We propose the following Riccati equations
(10a) | ||||
(10b) | ||||
Let be the -th diagonal element of and be the -th row of , . Then the Riccati equations (10a)-(10b) can be rewritten as
(11a) | ||||
(11b) | ||||
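For readers who want to experiment numerically, the sketch below integrates a Riccati differential equation backward from its terminal condition with a classical fourth-order Runge-Kutta scheme. It is not the paper's system (10a)-(10b): it assumes a finite horizon and the standard scalar LQ form −dP/dt = 2aP − (b²/r)P² + q with P(T) = h, and all names and coefficients are hypothetical.

```python
import math


def solve_riccati(a, b, q, r, h, T, steps=1000):
    """Integrate the scalar Riccati ODE  -dP/dt = 2aP - (b^2/r)P^2 + q,
    P(T) = h, backward in time with RK4.

    Returns a list `path` with path[i] approximating P(i * T / steps).
    """
    def rhs(p):
        # Right-hand side in the time-to-go variable s = T - t.
        return 2.0 * a * p - (b * b / r) * p * p + q

    ds = T / steps
    p = h
    path = [p]
    for _ in range(steps):
        k1 = rhs(p)
        k2 = rhs(p + 0.5 * ds * k1)
        k3 = rhs(p + 0.5 * ds * k2)
        k4 = rhs(p + ds * k3)
        p += ds * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        path.append(p)
    path.reverse()  # reorder from time-to-go to forward time
    return path


# Sanity check against a case with a closed form: a=0, b=q=r=1, h=0
# gives P(t) = tanh(T - t).
P = solve_riccati(a=0.0, b=1.0, q=1.0, r=1.0, h=0.0, T=1.0)
print(P[0], math.tanh(1.0))
```

In this standard scalar setting the feedback gain at time t would be −(b/r)P(t); a matrix-valued version follows the same pattern with the scalar products replaced by matrix products.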
The feedback representation is given as follows.
Theorem 2
We introduce the following notations to simplify the expression of closed-loop systems.
(14) | ||||
(15) | ||||
(16) | ||||
Under the feedback controller (12), the closed-loop systems of and , are given by
(17a) | ||||
(17b) | ||||
(17c) |
4 Asymptotically Optimal Distributed Controllers
In this section we design the asymptotically optimal distributed controller based on the optimal centralized feedback controller (12) according to the network topology . We first design cluster mean field estimators to estimate the global cluster mean field term under the distributed information pattern. Then distributed controllers are proposed by using the cluster mean field estimators. Finally, we prove the asymptotic optimality of the distributed controllers. We use to denote the state of agent , , , under the distributed controller . Let and be the corresponding local and global cluster mean field terms, respectively.
4.1 Cluster mean field estimators
Note that the optimal centralized feedback controller (12) contains the individual state and the global cluster mean field term. Moreover, the ability of each agent to acquire information is different according to the communication matrix . Therefore, the main idea of the design of distributed controllers is that each cluster makes a local estimation of the global cluster mean field term according to the network topology.
We thus design cluster mean field estimators for the agents in each cluster to estimate the global cluster mean field term. In what follows, we denote the local estimate by agent in cluster , of the global cluster mean field term by
where denotes the estimation of the local cluster mean field term by the agents in cluster .
For agents in cluster , if , then the agents can obtain the local cluster mean field term by communication, i.e.,
(18) |
otherwise, the agents must form their own local estimate by mean field approximation according to (17b), i.e.,
(19) |
To represent the dynamics of the cluster mean field estimator more compactly, the following notation is introduced for :
By (18) and (19), the dynamics of the designed cluster mean field estimator for agent , is given by
(20) |
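The switching logic behind (18) and (19) can be sketched in a few lines: for each cluster mean, an agent either uses the communicated value or propagates its own model-based prediction. The sketch assumes scalar cluster means, an Euler discretization, and a user-supplied `drift` function standing in for the mean field dynamics; these names and the sample numbers are hypothetical and do not reproduce the estimator (20) itself.

```python
def estimator_step(estimate, observed_means, comm_row, drift, dt):
    """One Euler step of a cluster's estimate of all cluster means.

    estimate       : current estimates held by the agents of one cluster
    observed_means : realized cluster means (used only where communicated)
    comm_row       : row of the 0/1 communication matrix for this cluster
    drift          : assumed model drift(j, estimate) of cluster j's mean
    """
    predicted = [m + drift(j, estimate) * dt for j, m in enumerate(estimate)]
    return [observed_means[j] if comm_row[j] == 1 else predicted[j]
            for j in range(len(estimate))]


# Hypothetical case: the mean of cluster 0 is communicated,
# while the mean of cluster 1 must be predicted from the model.
new_estimate = estimator_step(
    estimate=[1.0, 2.0],
    observed_means=[5.0, 7.0],
    comm_row=[1, 0],
    drift=lambda j, m: -m[j],
    dt=0.5,
)
print(new_estimate)  # [5.0, 1.0]
```

The prediction for an uncommunicated cluster depends on the whole estimate vector, which is why each cluster carries an estimate of every cluster mean rather than only its own.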
4.2 Distributed controllers
Based on the above discussion, we propose the following distributed controller for agent :
(21) |
where is the estimated global cluster mean field term of cluster given by the cluster mean field estimator (20). Under this distributed controller, the average distributed control of cluster is given by
(22) |
Let
be the estimated global cluster mean field term of all clusters in the large-population system. Therefore the closed-loop systems under the distributed controller are given by
(23a) | ||||
(23b) | ||||
(23c) |
4.3 Asymptotic optimality
Next we will show the asymptotic optimality property of the designed distributed controller. We first define two kinds of estimation errors of the designed estimator (20) with respect to the realized cluster mean field terms under the centralized controller (12) and distributed controller (21), respectively. For , denote
To prove the asymptotic optimality of the distributed controller (21), we first obtain the following lemma, which shows that the estimator is asymptotically unbiased with respect to the number of agents.
Lemma 1
Assume (A1) holds. Let . The cluster mean field estimator (19) is asymptotically unbiased, i.e.,
(24) |
and and is bounded in the mean-square sense, i.e.,
(25) |
The following two lemmas are used to decompose the social cost function.
Lemma 2
Lemma 3
Let , , and . The following equations hold for :
(28) |
and
(29) |
By Lemmas 1, 2, and 3, the asymptotic optimality of the designed distributed controller is established as follows.
Theorem 3
Assume (A1) holds. The distributed controller (21) is asymptotically socially optimal, i.e.,
(30) |
5 Conclusion
This paper studies mean field social control with heterogeneous agents by the direct approach. A set of asymptotically optimal distributed controllers is designed by constructing a cluster mean field estimator for each agent. We consider the case of finitely many clusters with additive noise. An interesting direction for future work is to extend the results to heterogeneous mean field systems with multiplicative noise and infinitely many clusters by means of graphon theory.
References
- [1] A. Bensoussan, J. Frehse, and P. Yam, Mean Field Games and Mean Field Type Control Theory, Springer, New York, 2013.
- [2] P. E. Caines, “Mean field games,” in Encyclopedia of Systems and Control, T. Samad and J. Baillieul, Eds., Berlin: Springer-Verlag, 2014.
- [3] D. A. Gomes and J. Saude, “Mean field games models-a brief survey,” Dynamic Games and Applications, vol. 4, no. 2, pp. 110-154, 2014.
- [4] B.-C. Wang and M. Huang, “Mean field production output control with sticky prices: Nash and social solutions,” Automatica, vol. 100, pp. 90-98, Feb. 2019.
- [5] M. R. Arefin, T. Masaki, K. M. A. Kabir, and J. Tanimoto, “Interplay between cost and effectiveness in influenza vaccine uptake: a vaccination game approach,” in Proc. R. Soc. A, vol. 475, no. 2232, Dec. 2019.
- [6] M. Larranaga, J. Denis, M. Assaad, and K. D. Turck, “Energy-efficient distributed transmission scheme for MTC in dense wireless networks: a mean-field approach,” IEEE IoT-J, vol. 7, no. 1, pp. 477-490, Jan. 2020.
- [7] J. M. Lasry and P. L. Lions, “Mean field games,” Japan J. Math., vol. 2, pp. 229-260, 2007.
- [8] M. Huang, P. E. Caines and R. P. Malhame, “Large-population cost-coupled LQG problems with nonuniform agents: individual-mass behavior and decentralized ε-Nash equilibria,” IEEE Trans. Autom. Control, vol. 52, pp. 1560-1571, 2007.
- [9] J. Arabneydi and A. G. Aghdam, “Deep teams: Decentralized decision making with finite and infinite number of agents,” IEEE Trans. Autom. Control, vol. 65, no. 10, pp. 4230–4245, 2020.
- [10] A. Bensoussan, K. C. J. Sung, S. C. P. Yam and S. P. Yung, “Linear-quadratic mean field games,” Journal of Optimization Theory and Applications, vol. 169, no. 2, pp. 1556-1563, 2016.
- [11] V. N. Kolokoltsov, M. Troeva, and W. Yang, “On the rate of convergence for the mean-field approximation of controlled diffusions with large number of players,” Dynamic Games and Applications, vol. 4, pp. 208–230, 2014.
- [12] M. Huang, P. Caines, and R. Malhame, “Social optima in mean field LQG control: Centralized and decentralized strategies,” IEEE Trans. Autom. Control, vol. 57, no. 7, pp. 1736-1751, 2012.
- [13] M. Huang, “Large-population LQG games involving a major player: The Nash certainty equivalence principle,” SIAM J. Control Optim., vol. 48, no. 5, pp. 3318–3353, 2010.
- [14] M. Huang and M. Zhou, “Linear quadratic mean field games: Asymptotic solvability and relation to the fixed point approach,” IEEE Trans. Autom. Control, vol. 65, no. 4, pp. 1397-1412, 2020.
- [15] B.-C. Wang, H. Zhang, and J.-F. Zhang, “Mean field linear-quadratic control: Uniform stabilization and social optimality,” Automatica, vol. 121, 2020.
- [16] B.-C. Wang and H. Zhang, “Indefinite linear quadratic mean field social control problems with multiplicative noise,” IEEE Trans. Autom. Control, doi: 10.1109/TAC.2020.3036246.
- [17] B.-C. Wang, “Linear quadratic mean field social control with random coefficients and common noise,” in Proc. Chinese Control Conference, pp. 1491-1498, 2019.
- [18] S. Gao, P. E. Caines, M. Huang, “LQG graphon mean field games,” arXiv:2004.00679, 2020.
- [19] P. E. Caines, M. Huang, “Graphon mean field games and the GMFG equations,” arXiv:2008.10216, 2020.