Mean Field Games of Major-Minor Agents with Recursive Functionals

Jianhui Huang. Department of Applied Mathematics, The Hong Kong Polytechnic University, Hong Kong, China. Email: james.huang@polyu.edu.hk. This author’s research is partially supported by RGC Grant PolyU 15301119, 15307621, N PolyU504/19, NSFC 12171407 and KKZT.

Wenqiang Li (corresponding author). School of Mathematics and Statistics, Shandong University, Weihai 264209, P.R. China. Email: wenqiangli@sdu.edu.cn. This author’s research is supported by the NSF of P.R. China (No. 12101537, 12271304).

Harry Zheng. Department of Mathematics, Imperial College, London SW72BZ, UK. Email: h.zheng@imperial.ac.uk. This author is partially supported by EPSRC (UK) grant (EP/V008331/1).

Abstract. This paper investigates a novel class of mean field games involving a major agent and numerous minor agents, where the agents’ functionals are recursive with nonlinear backward stochastic differential equation (BSDE) representations. We term these games “recursive major-minor” (RMM) problems. Our RMM modeling is quite general, as it employs empirical (state, control) averages to define the weak couplings in both the functionals and dynamics of all agents, regardless of their status as major or minor. We construct an auxiliary limiting problem of the RMM by a novel unified structural scheme combining a bilateral perturbation with a mixed hierarchical recomposition. This scheme has its own merits as it can be applied to analyze more complex coupling structures than those in the current RMM. Subsequently, we derive the corresponding consistency condition and explore asymptotic RMM equilibria. Additionally, we examine the RMM problem in specific linear-quadratic settings for illustrative purposes.

Keywords. Backward stochastic differential equation, Controlled large population system, Exchangeable decomposition, Major and minor agents, Mean field game, Recursive functional.

MSC2020 subject classifications. 93E20, 60H10, 60K35.

1 Introduction

Mean field game (MFG) theory was independently introduced by [27] and [24] from different perspectives, serving as an effective methodology for analyzing controlled large population (LP) systems. Typically, an LP system comprises a large number of agents that interact through their empirical distribution or averages. These interactions are weak in the sense that the degree of coupling among agents diminishes rapidly as the number of agents tends to infinity. A core element of MFG theory is the construction of a limiting auxiliary problem, under which all agents can be largely disentangled, allowing a decentralized approximate equilibrium to be characterized through a consistency matching. As such, MFG analysis can significantly reduce the dimension of the controlled LP system that each agent needs to analyze, greatly simplifying the related numerical analysis. A substantial body of research has been dedicated to MFG theory; a partial list of literature relevant to the current work includes [5, 6, 8, 11, 22, 23, 32, 34]. This paper focuses on a new class of MFG problems with the following weakly-coupled LP system, which includes a major agent $\mathcal{A}_{0}$ and multiple minor agents $\{\mathcal{A}_{i}\}_{i=1}^{N}$, whose states $X^{0}$ and $\{X^{i}\}_{i=1}^{N}$ satisfy the following forward stochastic differential equations (SDEs)

\[
\left\{\begin{aligned}
dX_{t}^{0}=&\ b^{0}(t,X_{t}^{0},u_{t}^{0};X_{t}^{(N)},u_{t}^{(N)})dt+\sigma^{0}(t,X_{t}^{0},u_{t}^{0};X_{t}^{(N)},u_{t}^{(N)})dW_{t}^{0},\\
dX_{t}^{i}=&\ b(t,X_{t}^{i},u_{t}^{i};X_{t}^{0},u_{t}^{0};X_{t}^{(N)},u_{t}^{(N)})dt+\sigma(t,X_{t}^{i},u_{t}^{i};X_{t}^{0},u_{t}^{0};X_{t}^{(N)},u_{t}^{(N)})dW_{t}^{i},
\end{aligned}\right.
\tag{1.1}
\]

with initial conditions $X_{0}^{0}=x_{0}^{0}\in\mathbb{R}^{n}$, $X_{0}^{i}=x_{0}\in\mathbb{R}^{n}$. Here, $X_{\cdot}^{(N)}=\frac{1}{N}\sum_{i=1}^{N}X_{\cdot}^{i}$ and $u_{\cdot}^{(N)}=\frac{1}{N}\sum_{i=1}^{N}u_{\cdot}^{i}$ stand for the (forward) state-average and control-average of all minors, respectively, and $W=(W^{0},W^{1},\cdots,W^{N})^{\top}$ is an $(N+1)$-dimensional standard Brownian motion on a complete probability space $(\Omega,\mathcal{F},\mathbb{P})$, where $W^{i}$ is the idiosyncratic noise for $\mathcal{A}_{i}$ while $W^{0}$ is the common noise. The agents $\{\mathcal{A}_{k}\}_{k=0}^{N}$ aim to maximize recursive-type functionals $\{J_{k}\}_{k=0}^{N}$ given by

\[
\begin{aligned}
J_{0}(u^{0};u^{1},\cdots,u^{N})=&\ \Gamma^{0}(Y_{0}^{0})+\mathbb{E}\int_{0}^{T}g^{0}(t,\Theta_{t}^{0},u_{t}^{0};\Theta_{t}^{(N)},u_{t}^{(N)})dt,\\
J_{i}(u^{i};u^{-i},u^{0})=&\ \Gamma(Y_{0}^{i})+\mathbb{E}\int_{0}^{T}g(t,\Theta_{t}^{i},u_{t}^{i};\Theta_{t}^{0},u_{t}^{0};\Theta_{t}^{(N)},u_{t}^{(N)})dt,\quad i=1,\cdots,N,
\end{aligned}
\tag{1.2}
\]

where $u^{-i}=\{\cdots,u^{i-1},u^{i+1},\cdots\}$ is the control profile except that of $\mathcal{A}_{i}$; $\Theta_{\cdot}^{(N)}=(X_{\cdot}^{(N)},Y_{\cdot}^{(N)},Z_{\cdot}^{(N)})$ is the state-average triple, with $Y^{(N)}_{\cdot}=\frac{1}{N}\sum_{i=1}^{N}Y_{\cdot}^{i}$ the recursive state average and $Z^{(N)}_{\cdot}=\frac{1}{N}\sum_{i=1}^{N}Z_{\cdot}^{i}$ the intensity state average (see [12]); the state-triples $\{\Theta_{\cdot}^{k}=(X_{\cdot}^{k},Y_{\cdot}^{k},Z_{\cdot}^{k})\}_{k=0}^{N}$ satisfy (1.1) and the following backward SDE (BSDE), motivated by recursive utilities in economics [15, 28, 29]:

\[
\left\{\begin{aligned}
-dY_{t}^{0}=&\ f^{0}(t,\Theta_{t}^{0},u_{t}^{0};\Theta_{t}^{(N)},u_{t}^{(N)})dt-Z_{t}^{0,0}dW_{t}^{0}-\sum_{j=1}^{N}Z_{t}^{0,j}dW_{t}^{j},\\
-dY_{t}^{i}=&\ f(t,\Theta_{t}^{i},u_{t}^{i};\Theta_{t}^{0},u_{t}^{0};\Theta_{t}^{(N)},u_{t}^{(N)})dt-Z_{t}^{i,0}dW_{t}^{0}-Z_{t}^{i,i}dW_{t}^{i}-\sum_{j=1,j\neq i}^{N}Z_{t}^{i,j}dW_{t}^{j},\\
Y_{T}^{0}=&\ \Phi^{0}(X_{T}^{0},X_{T}^{(N)})+\xi^{0},\quad Y_{T}^{i}=\Phi(X_{T}^{i},X_{T}^{0},X_{T}^{(N)})+\xi^{i},
\end{aligned}\right.
\tag{1.3}
\]

where, for $i=1,\cdots,N$, $Z^{i}=(Z^{i,0},Z^{i,i},\{Z^{i,j}\}_{j\neq i,j=1}^{N})$, with $(Z^{i,0},Z^{i,i})$ the principal intensity component and $\{Z^{i,j}\}_{j\neq i}$ the marginal components. Since the remainder terms $\sum_{j=1}^{N}Z_{t}^{0,j}dW_{t}^{j}$ and $\sum_{j=1,j\neq i}^{N}Z_{t}^{i,j}dW_{t}^{j}$ vanish as $N\rightarrow\infty$ (see Remark 3.1), we focus on the principal terms and set $Z^{(N)}_{\cdot}=\frac{1}{N}\sum_{i=1}^{N}(Z^{i,0}_{\cdot},Z^{i,i}_{\cdot})$ as the average of the principal intensities.

We refer to (1.1)-(1.3) as the recursive major-minor (RMM) problem. We defer its detailed assumptions to Section 2 and first highlight its modeling features.

Modeling features. (i) The RMM model delves into the interaction between a major agent $\mathcal{A}_{0}$ and a large number of minor agents $\{\mathcal{A}_{i}\}_{i=1}^{N}$. Traditional MFG studies assume that agents are all “minor” or “negligible”, meaning that an individual agent’s action cannot significantly impact the behavior of the population at a macro scale. The associated MFGs are thus referred to as “symmetric” because it suffices to examine a representative agent, provided agents are homogeneous and hence statistically exchangeable. In contrast, our RMM explores asymmetric interactions where agents have varying decisional capacities. A major agent may significantly influence the population’s behavior through her own decisions, whereas numerous minor agents can only affect the population through collective actions. This model is more realistic than the homogeneous minor setting, as it captures a range of diversified interaction mechanisms; see [5, 11, 23, 32, 34].

(ii) The RMM model further posits that objectives of all agents are represented recursively through nonlinear BSDEs, such as (1.3), with non-additive drivers $f^{0}$ or $f$. The inclusion of recursive functionals in MFG studies is motivated by their advantageous decision-theoretic properties, especially in the current LP context featuring complex decision couplings. Indeed, recursive functionals are well-suited for decision theory due to their capability to explain various observed non-standard decision behaviors, such as the separation of inter-temporal substitution and risk aversion. Consistently, recursive functionals extend classical expected functionals (see [4, 7, 16]), which are relevant to a special class of BSDEs with additive drivers.

(iii) The RMM problem restricts its weak coupling to empirical averages, refraining from discussing more general empirical measures or distributions. Despite this limitation, the weak coupling of RMM remains quite general, as it is integrated into both the dynamics and payoff functionals of the major $\mathcal{A}_{0}$ and all minors $\{\mathcal{A}_{i}\}_{i=1}^{N}$, encompassing elements from both the state and the control. Moreover, due to recursive functionals, the state average is enriched by including not only $X^{(N)}$ on the objective (forward) states, but also $(Y^{(N)},Z^{(N)})$ on the (recursive and intensity) states reflecting the subjective averaged-out beliefs. Specifically, the intensity coupling $Z^{(N)}$ characterizes some average of risk (ambiguity) aversion across all agents.
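To make the remark in feature (ii) about additive drivers concrete, the following short computation is a sketch under two assumptions not stated in this form in the text: the driver $f$ in (1.3) does not depend on $(Y,Z)$ or on the backward averages, and the intensities $Z^{i,j}$ are square-integrable. Integrating the second equation in (1.3) over $[0,T]$ and taking expectations, the stochastic integrals have zero mean and $Y_{0}^{i}$ is deterministic, so

\[
Y_{0}^{i}=\mathbb{E}\Big[\Phi(X_{T}^{i},X_{T}^{0},X_{T}^{(N)})+\xi^{i}+\int_{0}^{T}f(t,X_{t}^{i},u_{t}^{i};X_{t}^{0},u_{t}^{0};X_{t}^{(N)},u_{t}^{(N)})dt\Big].
\]

If, in addition, $\Gamma$ is linear, then $J_{i}$ in (1.2) reduces to a classical expected functional with terminal payoff $\Gamma(\Phi+\xi^{i})$ and running payoff $\Gamma(f)+g$. A driver that depends on $(Y,Z)$ does not admit such a reduction in general.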

Literature comparison. [10, 23] and [32] introduced the major-minor MFG within a linear-quadratic-Gaussian (LQG) framework on finite and infinite horizons, respectively. They employed augmented Riccati equations to characterize consistency conditions. [34] extended these major-minor studies to a nonlinear setting using the stochastic Hamilton-Jacobi-Bellman (HJB) approach, where the weak coupling is restrictive; for instance, the major’s state cannot enter the dynamics of the minors. In addition, [5] investigated a class of major-minor MFG problems, also through the stochastic HJB approach. [8] studied major-minor MFGs via master equations, where the agents use closed-loop controls. Recently, [6] investigated a class of MFG problems with asymmetric information between major and minor agents. [11] explored nonlinear major-minor MFGs with general weak couplings, allowing the major’s state to enter the dynamics of the minor agents. Additionally, in [11] the limiting control problem of the major agent incorporates an endogenous mean-field term, based on an approximation through a two-agent non-zero-sum game, and a forward type of maximum principle is used.

Our paper distinguishes itself from the aforementioned works by its focus on a nonlinear major-minor interaction with recursive functionals, and by the associated methodology of a backward-forward type of stochastic maximum principle. Our RMM modeling is particularly noteworthy for its introduction and detailed analysis of the weak couplings of the backward (recursive and intensity) state-average $(Y^{(N)},Z^{(N)})$ originating from the recursive functional. As previously mentioned, these couplings have significant decision-making impacts and, to the best of our knowledge, have not been systematically addressed in the MFG literature. Consequently, the maximum principle we adopt and the consistency condition we derive take forms that differ from those in [11]. Additionally, we apply our general nonlinear results to specific RMM problems in LQG settings. Our LQG-RMM studies not only recap and extend existing results on forward MFG studies, but also provide new insights into their backward counterpart.

Another relevant work is [7], which also explored the major-minor interaction and recursive functionals. However, it is framed in a weak formulation, substantially different from our strong formulation. Specifically, the approach of [7] originates from a variant of the Girsanov transformation and optimization of a Hamiltonian function, whereas our analysis is rooted in a refined backward-forward stochastic maximum principle. More significantly, all minors in [7] are cooperative, and the associated MFG thus encompasses a two-layer mixed structure: all (cooperative) minor agents form a mean-field team (rather than a game) in an inner layer, while in an outer layer, the interacting major agent and a representative minor agent induce a two-person, non-mean-field game. Although also termed an MFG, the model in [7] is essentially a hybrid of a mean-field team and a non-mean-field-type two-person game, which contrasts with the decision structure we investigate. Consequently, the consistency matching, a central step in MFG analysis, is not applicable in [7]. In contrast, our work is distinguished by a novel scheme for auxiliary control construction and consistency matching, as detailed below. Furthermore, the limiting equilibria in [7] are characterized as saddle points, which differ markedly from our non-zero-sum setup. Additionally, the admissible controls in [7] take feedback forms and are compact, unlike our open-loop and unbounded convex admissibility.

A unified structural scheme. Last but not least, as the core element in MFG, the auxiliary problem of the RMM is formulated through a novel structural scheme, which incorporates a bilateral perturbation and a hierarchical recomposition. This scheme facilitates a more incisive auxiliary construction through a sequential network, enabling a clear-cut realization of a complex mixture involving two minor agents, one exogenous and the other endogenous, alongside the endogenous major agent. This scheme is new in the MFG literature and distinguishes our work from previous studies, where auxiliary constructions are based on heuristic arguments. More importantly, such a scheme offers a unified methodology to tackle more complex LP interactions for which heuristic arguments are no longer tractable, for instance, when the LP system consists of heterogeneous agents with varying beliefs on model uncertainties.

Contributions. (i) We introduce a new class of RMM problems featuring major-minor asymmetric interactions and recursive objectives. (ii) We present a novel unified structural scheme to construct its pivotal auxiliary problem. (iii) We derive a new class of mean-field type of forward-backward SDEs (FBSDEs) to characterize the consistency condition of the RMM problem. (iv) We examine LQG studies in the RMM context in detail to gain deeper insights.

The remainder of this paper is organized as follows. Section 2 introduces basic assumptions of the RMM problem. Section 3 presents a unified structural scheme for the RMM problem, including a bilateral perturbation and a mixed triple-agent two-layer analysis. Section 4 studies the auxiliary control and the associated consistency condition (CC) of the RMM problem, and verifies its approximate Nash equilibrium. Section 5 is devoted to some LQG-RMM problems. Section 6 concludes, and some technical proofs and heavy notation are given in the Appendix.

2 Preliminary

For $i=0,1,\cdots,N$, let $\mathbb{F}^{i}=\{\mathcal{F}_{t}^{i}\}$ be the complete filtration generated by the Brownian motion $W^{i}$; namely, $\mathcal{F}_{t}^{i}=\sigma(W^{i}_{s},0\leq s\leq t)\vee\mathcal{N}_{\mathbb{P}}$, with $\mathcal{N}_{\mathbb{P}}$ the set of all $\mathbb{P}$-null sets in $\mathcal{F}$. Then $\mathbb{F}=\{\mathcal{F}_{t}\}_{0\leq t\leq T}=\{\sigma(\bigcup_{i=0}^{N}\mathcal{F}_{t}^{i})\}_{0\leq t\leq T}$ denotes the centralized information generated by the Brownian motion $W$, and $\mathbb{F}^{i,0}=\{\mathcal{F}^{i,0}_{t}\}_{0\leq t\leq T}=\{\sigma(\mathcal{F}_{t}^{i}\bigcup\mathcal{F}_{t}^{0})\}_{0\leq t\leq T}$ denotes the decentralized information for a generic minor agent $\mathcal{A}_{i}$, $i=1,\cdots,N$. Let $U_{0}\subset\mathbb{R}^{k_{0}}$ and $U\subset\mathbb{R}^{k}$ be two convex sets.

Definition 2.1.

$u^{0}$ is a centralized (resp. decentralized) admissible control for $\mathcal{A}_{0}$, if it is an $\mathbb{F}$-adapted (resp. $\mathbb{F}^{0}$-adapted) $U_{0}$-valued process with $\|u\|_{L^{2}}:=\mathbb{E}\int_{0}^{T}|u_{t}|^{2}dt<\infty$. Similarly, for $i=1,2,\cdots,N$, a $U$-valued process $u^{i}$ is called a centralized (resp. decentralized) admissible control for $\mathcal{A}_{i}$, if it is $\mathbb{F}$-adapted (resp. $\mathbb{F}^{i,0}$-adapted) with $\|u^{i}\|_{L^{2}}<\infty$. Let $\mathcal{U}_{0}^{c}$ and $\mathcal{U}^{c}$ (resp. $\mathcal{U}_{0}^{d}$ and $\mathcal{U}_{i}^{d}$) be the set of all centralized (resp. decentralized) controls for $\mathcal{A}_{0}$ and $\mathcal{A}_{i}$, respectively.

Definition 2.2.

For any $\varepsilon>0$, we say that an $(N+1)$-tuple of admissible controls $(u^{0,*},u^{1,*},\cdots,u^{N,*})\in\mathcal{U}^{c}_{0}\times\mathcal{U}^{c}\times\cdots\times\mathcal{U}^{c}$ (resp. $\in\mathcal{U}^{d}_{0}\times\mathcal{U}^{d}_{1}\times\cdots\times\mathcal{U}^{d}_{N}$) depending on $\varepsilon$ is a centralized (resp. decentralized) approximate $\varepsilon$-Nash equilibrium, if for all $(u^{0},u^{1},\cdots,u^{N})\in\mathcal{U}^{c}_{0}\times\mathcal{U}^{c}\times\cdots\times\mathcal{U}^{c}$ (resp. $\in\mathcal{U}^{d}_{0}\times\mathcal{U}^{d}_{1}\times\cdots\times\mathcal{U}^{d}_{N}$), we have

\[
\begin{aligned}
&J_{0}(u^{0,*},u^{1,*},\cdots,u^{N,*})\geq J_{0}(u^{0},u^{1,*},\cdots,u^{N,*})-\varepsilon,\\
&J_{i}(u^{0,*},u^{1,*},\cdots,u^{N,*})\geq J_{i}(u^{0,*},\cdots,u^{i-1,*},u^{i},u^{i+1,*},\cdots,u^{N,*})-\varepsilon,\quad i=1,\cdots,N.
\end{aligned}
\]
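The two inequalities above can be checked mechanically in a toy static game with finitely many actions. The sketch below is only an illustration of the $\varepsilon$-Nash test, not of the RMM problem itself: the function name `is_eps_nash`, the action grid and the quadratic payoffs are all hypothetical choices made for this example.

```python
def is_eps_nash(payoffs, profile, action_sets, eps):
    """Return True if no agent can gain more than eps by deviating alone.

    payoffs[k](profile) is the payoff of agent k (to be maximized);
    agent 0 plays the role of the major, agents 1..N the minors.
    """
    for k, actions in enumerate(action_sets):
        base = payoffs[k](profile)
        for a in actions:
            # Unilateral deviation of agent k, all others keep their action.
            deviated = profile[:k] + (a,) + profile[k + 1:]
            if payoffs[k](deviated) > base + eps:
                return False
    return True


# Hypothetical toy game: the major wants to be at 1, each minor tracks the major.
grid = (0.0, 0.5, 1.0)
action_sets = [grid, grid, grid]
payoffs = [lambda p: -(p[0] - 1.0) ** 2]
payoffs += [lambda p, i=i: -(p[i] - p[0]) ** 2 for i in (1, 2)]

print(is_eps_nash(payoffs, (1.0, 1.0, 1.0), action_sets, 0.0))
print(is_eps_nash(payoffs, (1.0, 0.5, 1.0), action_sets, 0.1))
```

In the second profile, minor 1 can gain 0.25 by switching to action 1.0, so the profile fails the test for any tolerance below 0.25.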

The exact Nash equilibrium corresponds to the case $\varepsilon=0$. We now impose some assumptions on the following coefficients of the RMM problem (1.1)-(1.3):

\[
\begin{aligned}
&(b^{0},\sigma^{0}):\ [0,T]\times\mathbb{R}^{n}\times U_{0}\times\mathbb{R}^{n}\times U\rightarrow\mathbb{R}^{n},\quad(b,\sigma):\ [0,T]\times\mathbb{R}^{n}\times U\times\mathbb{R}^{n}\times U_{0}\times\mathbb{R}^{n}\times U\rightarrow\mathbb{R}^{n},\\
&(f^{0},g^{0}):\ [0,T]\times\mathbb{R}^{n+m+m}\times U_{0}\times\mathbb{R}^{n+m+2m}\times U\rightarrow\mathbb{R}^{m}\times\mathbb{R},\\
&(f,g):\ [0,T]\times\mathbb{R}^{n+m+2m}\times U\times\mathbb{R}^{n+m+m}\times U_{0}\times\mathbb{R}^{n+m+2m}\times U\rightarrow\mathbb{R}^{m}\times\mathbb{R},\\
&\Phi^{0}:\ \mathbb{R}^{n+n}\rightarrow\mathbb{R}^{m},\quad\Phi:\ \mathbb{R}^{n+n+n}\rightarrow\mathbb{R}^{m},\quad(\Gamma^{0},\Gamma):\ \mathbb{R}^{m}\rightarrow\mathbb{R}\times\mathbb{R}.
\end{aligned}
\]

Assumption (A1) (i) $(b^{0},\sigma^{0})$ and $(b,\sigma)$ are continuously differentiable in $(x^{0},u^{0};x^{(N)},u^{(N)})$ and $(x,u;x^{0},u^{0};x^{(N)},u^{(N)})$, respectively. All the derivatives of $(b^{0},\sigma^{0},b,\sigma)$ are bounded.
(ii) $(f^{0},g^{0})$ and $(f,g)$ are continuously differentiable in $(\theta^{0},\theta^{(N)})$ and $(\theta,\theta^{0},\theta^{(N)})$, respectively, where $\theta^{0}=(x^{0},y^{0},z^{0},u^{0})$, $\theta^{(N)}=(x^{(N)},y^{(N)},z^{(N)},u^{(N)})$, $\theta=(x,y,z,u)$. $\Phi^{0}$, $\Phi$ and $(\Gamma^{0},\Gamma)$ are continuously differentiable in $(x^{0},x^{(N)})$, $(x,x^{0},x^{(N)})$ and $y$, respectively.
(iii) All derivatives of $(f^{0},f,\Phi^{0},\Phi)$ are bounded. The derivatives of $g^{0}$, $g$ and $(\Gamma^{0},\Gamma)$ are bounded by $C(1+|\theta^{0}|+|\theta^{(N)}|)$, $C(1+|\theta|+|\theta^{0}|+|\theta^{(N)}|)$ and $C(1+|y|)$, respectively, for some $C>0$.
(iv) $\xi^{0}$ and $\xi^{i}$, $1\leq i\leq N$, are square-integrable, and $\mathcal{F}_{T}^{0}$- and $\mathcal{F}_{T}^{i,0}$-measurable, respectively. Moreover, $\xi^{1},\cdots,\xi^{N}$ are independent and identically distributed conditionally on $\mathcal{F}_{T}^{0}$.
(v) $(b^{0},\sigma^{0},b,\sigma;f^{0},f,g^{0},g)$ are uniformly continuous in $t$.

For each admissible $(N+1)$-tuple $\{u^{j}\}_{j=0}^{N}$, the coupled SDE system (1.1) (resp. BSDE (1.3)) admits a unique solution tuple $\{X^{j}\}_{j=0}^{N}$ (resp. $\{Y^{j},Z^{j}\}_{j=0}^{N}$) under (A1), and the recursive-type payoffs $\{J_{i}\}_{i=0}^{N}$ are well-defined.
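For a given control tuple, the forward system (1.1) can be approximated path by path. The following is a minimal Euler-Maruyama sketch for scalar states ($n=1$), assuming the controls are supplied as open-loop arrays on a uniform time grid. The function name `simulate_forward_system` and its argument layout are illustrative choices, and the backward system (1.3) is not treated here.

```python
import math
import random


def simulate_forward_system(b0, sigma0, b, sigma, u0, u, x00, x0, T, rng):
    """Euler-Maruyama paths of the forward system (1.1), scalar states.

    u0[k] is the major's control and u[k][i] the control of minor i at
    grid time t_k. Returns (X0, X) with X0[k] the major's state and
    X[k][i] the state of minor i at t_k.
    """
    n_steps, N = len(u0), len(u[0])
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    X0 = [x00]
    X = [[x0] * N]
    for k in range(n_steps):
        t = k * dt
        x_prev, x0_prev = X[k], X0[k]
        x_bar = sum(x_prev) / N   # state average X^{(N)}_t
        u_bar = sum(u[k]) / N     # control average u^{(N)}_t
        dW0 = rng.gauss(0.0, sqrt_dt)
        X0.append(x0_prev
                  + b0(t, x0_prev, u0[k], x_bar, u_bar) * dt
                  + sigma0(t, x0_prev, u0[k], x_bar, u_bar) * dW0)
        X.append([
            xi
            + b(t, xi, ui, x0_prev, u0[k], x_bar, u_bar) * dt
            + sigma(t, xi, ui, x0_prev, u0[k], x_bar, u_bar)
            * rng.gauss(0.0, sqrt_dt)
            for xi, ui in zip(x_prev, u[k])
        ])
    return X0, X
```

The coefficients are passed as plain callables with the same argument order as in (1.1), so a linear-quadratic instance only needs four short lambdas.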

3 A unified structural scheme of the RMM problem

For a fixed $N$, if each agent has access to centralized information about all agents, including their realized instantaneous states and adopted controls, the RMM becomes a classical but high-dimensional $(N+1)$-agent game. Existence or uniqueness of its exact Nash equilibria can be ensured under certain mild but high-dimensional conditions, including semi-continuity, coercivity and concavity of $\{J_{i}\}_{i=0}^{N}$, or compactness of the admissible control sets. The open-loop (exact) equilibrium, denoted by $\{u^{i,*}\}_{i=0}^{N}\in\mathcal{U}_{0}^{c}\times\prod_{i=1}^{N}\mathcal{U}_{i}^{c}$, can be further characterized through a system of $N+1$ stationary conditions. However, this route to exact equilibria is feasible only in theory and becomes impractical due to the curse of dimensionality when $N$ is large.

MFG theory offers one resolution by constructing near-optimal decentralized strategies as an approximation to the exact Nash equilibria. A key challenge in MFG is the construction of an auxiliary problem for dimension reduction. Previous MFG studies have constructed auxiliary problems intuitively based on heuristic arguments, which are effective only when the underlying coupling structure is not overly complex, for instance, when all agents are symmetric as in [14, 22, 24, 27], or when an asymmetric dominant major agent is included as in [5, 6, 8, 10, 11, 23, 32, 34]. However, heuristic analysis becomes infeasible for LP systems with more intricate couplings. One reason is that it fails to effectively configure a complex logic network in which various representative agents are mutually connected by an “exogenous-endogenous” relation. As an alternative, we propose a structural scheme that not only accounts for the considerable generality of the weak coupling in RMM problems, but also lays a unified foundation for analyzing more general and complex LP couplings. In the current RMM context, this scheme yields a bilateral perturbation and a triple-agent two-layer game, as discussed below.

3.1 A bilateral perturbation: the major agent

Letting $\varepsilon=0$ in Definition 2.2, $\mathcal{A}_{0}$ faces the optimization problem $\sup_{u^{0}\in\mathcal{U}_{0}^{c}}J_{0}(u^{0},u^{1,*},\cdots,u^{N,*})$, assuming that all minor agents $\{\mathcal{A}_{i}\}_{i=1}^{N}$ implement the exact Nash equilibrium strategies $\{u^{i,*}\}_{i=1}^{N}$. When $\mathcal{A}_{0}$ adopts a perturbed centralized control $u^{0}\in\mathcal{U}_{0}^{c}$ instead of the exact $u^{0,*}$, the state system (1.1) becomes

\[
\left\{\begin{aligned}
dX_{t}^{0,\dagger}=&\ b^{0}(t,X_{t}^{0,\dagger},u_{t}^{0};X_{t}^{(N),\dagger},u_{t}^{(N),\ast})dt+\sigma^{0}(t,X_{t}^{0,\dagger},u_{t}^{0};X_{t}^{(N),\dagger},u_{t}^{(N),\ast})dW_{t}^{0},\\
dX_{t}^{i,\dagger}=&\ b(t,X_{t}^{i,\dagger},u_{t}^{i,*};X_{t}^{0,\dagger},u_{t}^{0};X_{t}^{(N),\dagger},u_{t}^{(N),*})dt+\sigma(t,X_{t}^{i,\dagger},u_{t}^{i,*};X_{t}^{0,\dagger},u_{t}^{0};X_{t}^{(N),\dagger},u_{t}^{(N),*})dW_{t}^{i},
\end{aligned}\right.
\tag{3.1}
\]

with $X_{0}^{0,\dagger}=x_{0}^{0}$, $X_{0}^{i,\dagger}=x_{0}$, $i=1,\cdots,N$. Here, $X_{\cdot}^{(N),\dagger}:=\frac{1}{N}\sum_{i=1}^{N}X_{\cdot}^{i,\dagger}$ is a quasi-realized state average, with the superscript “$\dagger$” emphasizing its dependence on the major’s perturbed $u^{0}$; whereas $u_{\cdot}^{(N),\ast}:=\frac{1}{N}\sum_{i=1}^{N}u_{\cdot}^{i,*}$ is the exact-realized control average, which depends only on the exact strategies, so the superscript “$\ast$” is retained. This is essentially an open-loop feature.
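The open-loop feature noted above can be seen in a small simulation. The sketch below uses a hypothetical toy model, $dX^{0}=u^{0}dt+dW^{0}$ and $dX^{i}=(u^{i}+X^{0})dt+dW^{i}$, which is not taken from the paper. It compares minor controls stored as a fixed table in $(t,\omega)$ (open-loop) with a state feedback $u^{i}=-X^{i}$ (closed-loop) when the major switches from one control path to another on the same noise sample.

```python
import math
import random


def run(u0_path, minor_control, dW0, dW, x00=0.0, x0=0.0, T=1.0):
    """Euler scheme for the toy model described in the lead-in.

    minor_control(k, i, x_i) returns the control of minor i at step k.
    Returns the final state average and the list of control averages.
    """
    n_steps, N = len(u0_path), len(dW[0])
    dt = T / n_steps
    X0, X = x00, [x0] * N
    control_avgs = []
    for k in range(n_steps):
        u = [minor_control(k, i, X[i]) for i in range(N)]
        control_avgs.append(sum(u) / N)
        X = [X[i] + (u[i] + X0) * dt + dW[k][i] for i in range(N)]
        X0 = X0 + u0_path[k] * dt + dW0[k]
    return sum(X) / N, control_avgs


rng = random.Random(1)
n_steps, N = 100, 20
sq = math.sqrt(1.0 / n_steps)
dW0 = [rng.gauss(0.0, sq) for _ in range(n_steps)]
dW = [[rng.gauss(0.0, sq) for _ in range(N)] for _ in range(n_steps)]

# Open-loop controls: a table indexed by (time step, agent), built only
# from each minor's own past noise, hence a function of (t, omega).
W = [[0.0] * N]
for k in range(n_steps - 1):
    W.append([W[k][i] + dW[k][i] for i in range(N)])
u_star = [[-0.5 * w for w in row] for row in W]


def open_loop(k, i, x):
    return u_star[k][i]


def closed_loop(k, i, x):
    return -x


u0_exact = [0.0] * n_steps       # stands in for the exact major control
u0_perturbed = [1.0] * n_steps   # a perturbed major control

x_ol, u_ol = run(u0_exact, open_loop, dW0, dW)
x_ol_p, u_ol_p = run(u0_perturbed, open_loop, dW0, dW)
x_cl, u_cl = run(u0_exact, closed_loop, dW0, dW)
x_cl_p, u_cl_p = run(u0_perturbed, closed_loop, dW0, dW)

print("open-loop control averages unchanged:", u_ol == u_ol_p)
print("open-loop state average shift:", x_ol_p - x_ol)
print("closed-loop control average shift:", u_cl_p[-1] - u_cl[-1])
```

Under the open-loop table the control averages coincide exactly for both major controls while the state average moves, which mirrors the roles of the superscripts in (3.1); under feedback the control average moves as well.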

By “quasi-realized,” we mean that the states are not exactly the ones realized when all agents apply their exact strategies; instead, they are “quasi-exact” in that only $\mathcal{A}_{0}$ deviates from the exact strategy by adopting a perturbed control. Indeed, an exact Nash equilibrium $\{u_{t}^{i,*}\}_{i=1}^{N}$, in its open-loop sense, is defined directly on the basic inputs $(t,\omega)$ rather than on the “intermediate” states. Thus, a perturbed $u^{0}$ adopted by $\mathcal{A}_{0}$ does not change the controls of the minors. This is very different from the closed-loop case, in which a perturbed $u^{0}$ changes the major’s state and hence alters the implementation of $u^{i,*}$ constructed on these realized states. Using the notation

\[
\begin{aligned}
&Y_{t}^{(N),\dagger}=\frac{1}{N}\sum_{i=1}^{N}Y_{t}^{i,\dagger},\quad Z_{t}^{(N),\dagger}=\frac{1}{N}\sum_{i=1}^{N}Z_{t}^{i,\dagger}=\frac{1}{N}\sum_{i=1}^{N}(Z_{t}^{i,\dagger,0},Z_{t}^{i,\dagger,i},\{Z_{t}^{i,\dagger,j}\}_{j\neq i})^{\top},\\
&\Theta_{t}^{j,\dagger}=(X_{t}^{j,\dagger},Y_{t}^{j,\dagger},Z_{t}^{j,\dagger}),\ j=0,1,\cdots,N,\quad\Theta_{t}^{(N),\dagger}=(X_{t}^{(N),\dagger},Y_{t}^{(N),\dagger},Z_{t}^{(N),\dagger}),
\end{aligned}
\]

and similar to (3.1), we can get the following quasi-realized coupled BSDEs

\[
\left\{\begin{aligned}
-dY_{t}^{0,\dagger}=&\ f^{0}\big(t,\Theta_{t}^{0,\dagger},u_{t}^{0};\Theta_{t}^{(N),\dagger},u_{t}^{(N),\ast}\big)dt-Z_{t}^{0,\dagger,0}dW_{t}^{0}-\sum_{j=1}^{N}Z_{t}^{0,\dagger,j}dW_{t}^{j},\\
-dY_{t}^{i,\dagger}=&\ f\big(t,\Theta_{t}^{i,\dagger},u_{t}^{i,*};\Theta_{t}^{0,\dagger},u_{t}^{0};\Theta_{t}^{(N),\dagger},u_{t}^{(N),*}\big)dt-Z_{t}^{i,\dagger,0}dW_{t}^{0}-Z_{t}^{i,\dagger,i}dW_{t}^{i}-\sum_{j=1,j\neq i}^{N}Z_{t}^{i,\dagger,j}dW_{t}^{j},\\
Y_{T}^{0,\dagger}=&\ \Phi^{0}\big(X_{T}^{0,\dagger},X_{T}^{(N),\dagger}\big)+\xi^{0},\quad Y_{T}^{i,\dagger}=\Phi\big(X_{T}^{i,\dagger};X_{T}^{0,\dagger},X_{T}^{(N),\dagger}\big)+\xi^{i},\quad i=1,\cdots,N.
\end{aligned}\right.
\tag{3.2}
\]

We aim to analyze the asymptotic limit as $N\rightarrow\infty$ with $u_{t}^{(N),*}\rightarrow\overline{u}_{t}^{*}\in\mathcal{F}_{t}^{0}$. Then we take the limit of (3.1), (3.2) and the related recursive payoff for the major agent. By the continuity of the coefficients $b^{0}$ and $b$, $\lim_{N\rightarrow\infty}b^{0}(t,X_{t}^{0,\dagger},u_{t}^{0};X_{t}^{(N),\dagger},u_{t}^{(N),\ast})=b^{0}(t,\mathbb{X}_{t}^{0,\dagger},u_{t}^{0};\overline{\alpha}_{t}^{\dagger},\overline{u}_{t}^{*})$, and

\[
\begin{aligned}
&\lim_{N\rightarrow\infty}\frac{1}{N}\sum_{i=1}^{N}b(t,X_{t}^{i,\dagger},u_{t}^{i,*};X_{t}^{0,\dagger},u_{t}^{0};X_{t}^{(N),\dagger},u_{t}^{(N),*})\\
=&\lim_{N\rightarrow\infty}\frac{1}{N}\sum_{i=1}^{N}\left[\lim_{N\rightarrow\infty}\frac{1}{N}\sum_{i=1}^{N}b\left(t,a;X_{t}^{0,\dagger},u_{t}^{0};X_{t}^{(N),\dagger},u_{t}^{(N),*}\right)\right]_{a=(X_{t}^{i,\dagger},u_{t}^{i,*})}\\
=&\lim_{N\rightarrow\infty}\frac{1}{N}\sum_{i=1}^{N}b(t,X_{t}^{i,\dagger},u_{t}^{i,*};\mathbb{X}_{t}^{0,\dagger},u_{t}^{0};\overline{\alpha}_{t}^{\dagger},\overline{u}_{t}^{*})=\mathbb{E}_{t}\Big[b(t,\mathbb{X}_{t}^{1,\dagger},u_{t}^{1,*};\mathbb{X}_{t}^{0,\dagger},u_{t}^{0};\overline{\alpha}_{t}^{\dagger},\overline{u}_{t}^{*})\Big],
\end{aligned}
\]

where $\mathbb{X}_{t}^{0,\dagger}=\lim_{N\rightarrow\infty}X_{t}^{0,\dagger}$, $\mathbb{X}_{t}^{1,\dagger}=\lim_{N\rightarrow\infty}X_{t}^{1,\dagger}$ and $\overline{\alpha}_{t}^{\dagger}=\lim_{N\rightarrow\infty}X_{t}^{(N),\dagger}$ are the associated limiting quantities, and $\mathbb{E}_{t}[\cdot]$ denotes the conditional expectation given the common information $\mathbb{F}^{0}$. Similarly, taking the limit in (3.1), we get the asymptotic limit of the major agent:

\[
\left\{\begin{aligned}
d\mathbb{X}_{t}^{0,\dagger}=&\ b^{0}(t,\mathbb{X}_{t}^{0,\dagger},u_{t}^{0};\overline{\alpha}_{t}^{\dagger},\overline{u}_{t}^{*})dt+\sigma^{0}(t,\mathbb{X}_{t}^{0,\dagger},u_{t}^{0};\overline{\alpha}_{t}^{\dagger},\overline{u}_{t}^{*})dW_{t}^{0},\quad\mathbb{X}_{0}^{0,\dagger}=x_{0}^{0},\\
d\overline{\alpha}_{t}^{\dagger}=&\ \mathbb{E}_{t}\Big[b(t,\mathbb{X}_{t}^{1,\dagger},u_{t}^{1,*};\mathbb{X}_{t}^{0,\dagger},u_{t}^{0};\overline{\alpha}_{t}^{\dagger},\overline{u}_{t}^{*})\Big]dt,\quad\overline{\alpha}_{0}^{\dagger}=x_{0},
\end{aligned}\right.
\tag{3.3}
\]

where $\mathbb{X}^{1,\dagger}$ denotes the state of the representative minor agent, say, $\mathcal{A}_{1}$:

\[
d\mathbb{X}_{t}^{1,\dagger}=b(t,\mathbb{X}_{t}^{1,\dagger},u_{t}^{1,*};\mathbb{X}_{t}^{0,\dagger},u_{t}^{0};\overline{\alpha}_{t}^{\dagger},\overline{u}_{t}^{*})dt+\sigma(t,\mathbb{X}_{t}^{1,\dagger},u_{t}^{1,*};\mathbb{X}_{t}^{0,\dagger},u_{t}^{0};\overline{\alpha}_{t}^{\dagger},\overline{u}_{t}^{*})dW_{t}^{1},\quad\mathbb{X}_{0}^{1,\dagger}=x_{0}.
\tag{3.4}
\]

Comparing (3.3) and (3.4), we have $\overline{\alpha}_{t}^{\dagger}=\mathbb{E}_{t}[\mathbb{X}^{1,\dagger}_{t}]$ by noting $\overline{u}_{t}^{*}=\mathbb{E}_{t}[u_{t}^{1,*}]$ and the uniqueness of the solution to the second equation in (3.3). Then the limiting state $\mathbb{X}^{0,\dagger}$ of the major satisfies

\[
\left\{\begin{aligned}
d\mathbb{X}_{t}^{0,\dagger}=&\ b^{0}\big(t,\mathbb{X}_{t}^{0,\dagger},u_{t}^{0};\mathbb{E}_{t}[\mathbb{X}^{1,\dagger}_{t}],\mathbb{E}_{t}[u_{t}^{1,*}]\big)dt+\sigma^{0}\big(t,\mathbb{X}_{t}^{0,\dagger},u_{t}^{0};\mathbb{E}_{t}[\mathbb{X}^{1,\dagger}_{t}],\mathbb{E}_{t}[u_{t}^{1,*}]\big)dW_{t}^{0},\\
d\mathbb{X}_{t}^{1,\dagger}=&\ b\big(t,\mathbb{X}_{t}^{1,\dagger},u_{t}^{1,*};\mathbb{X}_{t}^{0,\dagger},u_{t}^{0};\mathbb{E}_{t}[\mathbb{X}^{1,\dagger}_{t}],\mathbb{E}_{t}[u_{t}^{1,*}]\big)dt+\sigma\big(t,\mathbb{X}_{t}^{1,\dagger},u_{t}^{1,*};\mathbb{X}_{t}^{0,\dagger},u_{t}^{0};\mathbb{E}_{t}[\mathbb{X}^{1,\dagger}_{t}],\mathbb{E}_{t}[u_{t}^{1,*}]\big)dW_{t}^{1},
\end{aligned}\right.
\tag{3.5}
\]

with initial conditions $\mathbb{X}_{0}^{0,\dagger}=x_{0}^{0}$, $\mathbb{X}_{0}^{1,\dagger}=x_{0}$. Taking the limit in BSDE (3.2), similarly to (3.5), we obtain

$$
\left\{\begin{aligned}
-d\mathbb{Y}_{t}^{0,\dagger}=&\ f^{0}\Big(t,\Pi_{t}^{0,\dagger},u_{t}^{0};\mathbb{E}_{t}[\Pi_{t}^{1,\dagger}],\overline{u}_{t}^{*}\Big)dt-\mathbb{Z}_{t}^{0,\dagger}dW_{t}^{0},\\
-d\mathbb{Y}_{t}^{1,\dagger}=&\ f\Big(t,\Pi_{t}^{1,\dagger},u_{t}^{1,*};\Pi_{t}^{0,\dagger},u_{t}^{0};\mathbb{E}_{t}[\Pi_{t}^{1,\dagger}],\mathbb{E}_{t}[u_{t}^{1,*}]\Big)dt-\mathbb{Z}_{t}^{1,\dagger,0}dW_{t}^{0}-\mathbb{Z}_{t}^{1,\dagger,1}dW_{t}^{1},\\
\mathbb{Y}_{T}^{0,\dagger}=&\ \Phi^{0}\big(\mathbb{X}_{T}^{0,\dagger},\mathbb{E}_{T}[\mathbb{X}^{1,\dagger}_{T}]\big)+\xi^{0},\quad \mathbb{Y}_{T}^{1,\dagger}=\Phi\big(\mathbb{X}_{T}^{1,\dagger};\mathbb{X}_{T}^{0,\dagger},\mathbb{E}_{T}[\mathbb{X}^{1,\dagger}_{T}]\big)+\xi^{1},
\end{aligned}\right. \tag{3.6}
$$

where $\Pi_{t}^{k,\dagger}=\lim_{N\rightarrow\infty}\Theta_{t}^{k,\dagger}=(\mathbb{X}_{t}^{k,\dagger},\mathbb{Y}_{t}^{k,\dagger},\mathbb{Z}_{t}^{k,\dagger})$, $k=0,1.$

Remark 3.1.

The remainder terms $\sum_{j=1}^{N}Z_{t}^{0,\dagger,j}dW_{t}^{j}$ and $\sum_{j=1,j\neq i}^{N}Z_{t}^{i,\dagger,j}dW_{t}^{j}$ in BSDE (3.2) vanish as $N\longrightarrow+\infty$. For the sake of presentation, we omit these remainder terms hereafter.
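The identification of the empirical state average with a conditional mean, used in passing from (3.3) to (3.5), can be illustrated numerically. The following is a minimal sketch under strong simplifying assumptions: a scalar linear model whose coefficients (`a`, `c`, `k`, `s0`, `s`) and initial values are hypothetical and not taken from the paper, discretized by an Euler scheme. It simulates an $N$-agent system and a limiting system driven by the same common noise, and compares the empirical mean of the minor states with the limiting mean process.

```python
import math
import random


def simulate(n_minor=1000, n_steps=100, horizon=1.0, seed=0):
    """Euler scheme for a toy linear major-minor system (hypothetical coefficients).

    Returns (empirical mean of the minor states at T, limiting mean at T).
    """
    rng = random.Random(seed)
    dt = horizon / n_steps
    sq = math.sqrt(dt)
    a, c, k, s0, s = 1.0, 0.5, 0.5, 0.3, 0.5  # hypothetical model coefficients
    x0_major, x0_minor = 1.0, 0.0             # hypothetical initial states

    # N-agent system: coupled through the empirical mean of the minors.
    major_n = x0_major
    minors = [x0_minor] * n_minor
    # Limiting system: coupled through the mean process alpha (no dW^i term).
    major_lim = x0_major
    alpha = x0_minor

    for _ in range(n_steps):
        dw0 = rng.gauss(0.0, sq)  # common noise increment, shared by both systems
        mean_n = sum(minors) / n_minor

        # N-agent update: each minor has its own independent noise increment.
        new_major_n = major_n + (-major_n + mean_n) * dt + s0 * dw0
        minors = [
            x + (-a * x + c * major_n + k * mean_n) * dt + s * rng.gauss(0.0, sq)
            for x in minors
        ]
        major_n = new_major_n

        # Limiting update: the idiosyncratic noise has averaged out of alpha.
        new_major_lim = major_lim + (-major_lim + alpha) * dt + s0 * dw0
        alpha = alpha + (-a * alpha + c * major_lim + k * alpha) * dt
        major_lim = new_major_lim

    return sum(minors) / n_minor, alpha


if __name__ == "__main__":
    mean_n, alpha = simulate()
    print(abs(mean_n - alpha))  # gap of order 1/sqrt(N) in this toy model
```

In this toy model the gap between the two quantities is driven only by the averaged idiosyncratic noise, so it shrinks as `n_minor` grows. The sketch illustrates the law-of-large-numbers heuristic only and is not a substitute for the limiting argument in the text.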

Finally, for the given $\mathbb{F}$-adapted $u^{1,*}$, we construct the following auxiliary problem for $\mathcal{A}_{0}$ associated with (3.6) and (3.5):

$$
\sup_{u^{0}\in\mathcal{U}_{0}^{c}}\Big\{\Gamma^{0}(\mathbb{Y}_{0}^{0,\dagger})+\mathbb{E}\Big[\int_{0}^{T}g^{0}(t,\Pi_{t}^{0,\dagger},u_{t}^{0};\mathbb{E}_{t}[\Pi_{t}^{1,\dagger}],\overline{u}_{t}^{*})dt\Big]\Big\}. \tag{3.7}
$$

An optimal control of (3.7), if it exists, depends on the given controls $\big(u^{1,*},\bar{u}_{t}^{*}\big)$.

3.2 A bilateral perturbation: a representative minor agent

We turn to a representative minor agent $\mathcal{A}_{1}$. By Definition 2.2, $\mathcal{A}_{1}$ confronts an optimization problem under the assumption that $\mathcal{A}_{0}$ implements the exact Nash strategy $u^{0,*}\in\mathcal{U}_{0}^{c}$ and that $\mathcal{A}_{j}$ ($j\geq 2$) implement $u^{j,*}\in\mathcal{U}^{c}$. If $\mathcal{A}_{1}$ applies a perturbed $u^{1}\in\mathcal{U}^{c}$, the states and functionals of $\mathcal{A}_{0},\mathcal{A}_{j}$ ($j\geq 2$) become

$$
\left\{\begin{aligned}
dX_{t}^{j,*}=&\ b(t,X_{t}^{j,*},u_{t}^{j,*};X_{t}^{0,*},u_{t}^{0,*};X_{t}^{(N),*},u_{t}^{(N),*})dt+\sigma(t,X_{t}^{j,*},u_{t}^{j,*};X_{t}^{0,*},u_{t}^{0,*};X_{t}^{(N),*},u_{t}^{(N),*})dW_{t}^{j},\\
-dY_{t}^{j,*}=&\ f(t,\Theta_{t}^{j,*},u_{t}^{j,*};\Theta_{t}^{0,*},u_{t}^{0,*};\Theta_{t}^{(N),*},u_{t}^{(N),*})dt-Z_{t}^{j,*,0}dW_{t}^{0}-Z_{t}^{j,*,j}dW_{t}^{j},\\
dX_{t}^{0,*}=&\ b^{0}(t,X_{t}^{0,*},u_{t}^{0,*};X_{t}^{(N),*},u_{t}^{(N),*})dt+\sigma^{0}(t,X_{t}^{0,*},u_{t}^{0,*};X_{t}^{(N),*},u_{t}^{(N),*})dW_{t}^{0},\\
-dY_{t}^{0,*}=&\ f^{0}(t,\Theta_{t}^{0,*},u_{t}^{0,*};\Theta_{t}^{(N),*},u_{t}^{(N),*})dt-Z_{t}^{0,*}dW_{t}^{0},
\end{aligned}\right. \tag{3.8}
$$

with $X_{0}^{0,*}=x_{0}^{0},\ Y_{T}^{0,*}=\Phi^{0}(X_{T}^{0,*},X_{T}^{(N),*})+\xi^{0},\ X_{0}^{j,*}=x_{0},\ Y_{T}^{j,*}=\Phi(X_{T}^{j,*};X_{T}^{0,*},X_{T}^{(N),*})+\xi^{j}.$ We abuse notation (the limits being the same) and write $\Upsilon_{\cdot}^{(N),*}:=\frac{1}{N-1}\sum_{j=2}^{N}\Upsilon_{\cdot}^{j,*},\ \Upsilon=X,Y,Z,u,$ together with $\{\Theta_{t}^{k,*}=(X_{t}^{k,*},Y_{t}^{k,*},Z_{t}^{k,*})\}_{k=0}^{N}$ and $\Theta_{t}^{(N),*}=(X_{t}^{(N),*},Y_{t}^{(N),*},Z_{t}^{(N),*}).$ Since $(\Theta^{0,*},\Theta^{j,*},\Theta^{(N),*})$ are independent of $\mathcal{A}_{1}$, they are exogenous for $\mathcal{A}_{1}$. The state of $\mathcal{A}_{1}$ under $u^{1}$ satisfies

$$
dX_{t}^{1,\ddagger}=b(t,X_{t}^{1,\ddagger},u_{t}^{1};X_{t}^{0,*},u_{t}^{0,*};X_{t}^{(N),*},u_{t}^{(N),*})dt+\sigma(t,X_{t}^{1,\ddagger},u_{t}^{1};X_{t}^{0,*},u_{t}^{0,*};X_{t}^{(N),*},u_{t}^{(N),*})dW_{t}^{1}.
$$

The quasi-realized state $X_{\cdot}^{1,\ddagger}$ of $\mathcal{A}_{1}$ carries the superscript “$\ddagger$” to indicate its dependence on the perturbed $u^{1}$, whereas the exact-realized state-control average $(X_{\cdot}^{(N),\ast},u_{\cdot}^{(N),\ast})$ depends only on the exact strategies and therefore still carries “$\ast$”. The value $Y^{1,\ddagger}_{0}$ of $\mathcal{A}_{1}$, which is affected by $u^{1}$, satisfies

$$
\left\{\begin{aligned}
-dY_{t}^{1,\ddagger}=&\ f(t,\Theta_{t}^{1,\ddagger},u_{t}^{1};\Theta_{t}^{0,*},u_{t}^{0,*};\Theta_{t}^{(N),*},u_{t}^{(N),*})dt-Z_{t}^{1,\ddagger,0}dW_{t}^{0}-Z_{t}^{1,\ddagger,1}dW_{t}^{1},\\
Y_{T}^{1,\ddagger}=&\ \Phi(X_{T}^{1,\ddagger};X_{T}^{0,*},X_{T}^{(N),*})+\xi^{1},
\end{aligned}\right. \tag{3.9}
$$

with $\Theta_{t}^{1,\ddagger}=(X_{t}^{1,\ddagger},Y_{t}^{1,\ddagger},Z_{t}^{1,\ddagger})$. Next, similarly to the analysis in Subsection 3.1, we obtain from (3.8) the following coupled mean-field FBSDE (noting $\mathbb{E}_{t}[u_{t}^{j,*}]=\overline{u}_{t}^{*}$), $j\geq 2$,

$$
\left\{\begin{aligned}
d\mathbb{X}_{t}^{0,*}=&\ b^{0}\big(t,\mathbb{X}_{t}^{0,*},u_{t}^{0,*};\mathbb{E}_{t}[\mathbb{X}^{j,*}_{t}],\mathbb{E}_{t}[u_{t}^{j,*}]\big)dt+\sigma^{0}\big(t,\mathbb{X}_{t}^{0,*},u_{t}^{0,*};\mathbb{E}_{t}[\mathbb{X}^{j,*}_{t}],\mathbb{E}_{t}[u_{t}^{j,*}]\big)dW_{t}^{0},\\
d\mathbb{X}_{t}^{j,*}=&\ b\big(t,\mathbb{X}_{t}^{j,*},u_{t}^{j,*};\mathbb{X}_{t}^{0,*},u_{t}^{0,*};\mathbb{E}_{t}[\mathbb{X}^{j,*}_{t}],\mathbb{E}_{t}[u_{t}^{j,*}]\big)dt+\sigma\big(t,\mathbb{X}_{t}^{j,*},u_{t}^{j,*};\mathbb{X}_{t}^{0,*},u_{t}^{0,*};\mathbb{E}_{t}[\mathbb{X}^{j,*}_{t}],\mathbb{E}_{t}[u_{t}^{j,*}]\big)dW_{t}^{j},\\
-d\mathbb{Y}_{t}^{0,*}=&\ f^{0}\Big(t,\Pi_{t}^{0,*},u_{t}^{0,*};\mathbb{E}_{t}[\Pi^{j,*}_{t}],\mathbb{E}_{t}[u^{j,*}_{t}]\Big)dt-\mathbb{Z}_{t}^{0,*}dW_{t}^{0},\\
-d\mathbb{Y}_{t}^{j,*}=&\ f\Big(t,\Pi_{t}^{j,*},u_{t}^{j,*};\Pi_{t}^{0,*},u_{t}^{0,*};\mathbb{E}_{t}[\Pi^{j,*}_{t}],\mathbb{E}_{t}[u^{j,*}_{t}]\Big)dt-\mathbb{Z}_{t}^{j,*,0}dW_{t}^{0}-\mathbb{Z}_{t}^{j,*,j}dW_{t}^{j},\\
\mathbb{X}_{0}^{0,*}=&\ x_{0}^{0},\ \mathbb{X}_{0}^{j,*}=x_{0},\ \mathbb{Y}_{T}^{0,*}=\Phi^{0}\Big(\mathbb{X}_{T}^{0,*},\mathbb{E}_{T}[\mathbb{X}^{j,*}_{T}]\Big)+\xi^{0},\ \mathbb{Y}_{T}^{j,*}=\Phi\Big(\mathbb{X}_{T}^{j,*};\mathbb{X}_{T}^{0,*},\mathbb{E}_{T}[\mathbb{X}^{j,*}_{T}]\Big)+\xi^{j},
\end{aligned}\right. \tag{3.10}
$$

by setting $\Pi_{\cdot}^{i,*}=(\mathbb{X}^{i,*}_{\cdot},\mathbb{Y}^{i,*}_{\cdot},\mathbb{Z}^{i,*}_{\cdot})=\lim_{N\rightarrow\infty}(X^{i,*}_{\cdot},Y^{i,*}_{\cdot},Z^{i,*}_{\cdot}),\ i\neq 1.$ Then by (3.2) and (3.9), when $\mathcal{A}_{1}$ applies $u^{1}$, the limits $\mathbb{X}^{1,\ddagger}(:=\lim_{N\rightarrow\infty}X^{1,\ddagger})$ and $\mathbb{Y}^{1,\ddagger}_{t}(:=\lim_{N\rightarrow\infty}Y_{t}^{1,\ddagger})$ satisfy

$$
\left\{\begin{aligned}
d\mathbb{X}_{t}^{1,\ddagger}=&\ b\Big(t,\mathbb{X}_{t}^{1,\ddagger},u_{t}^{1};\mathbb{X}_{t}^{0,*},u_{t}^{0,*};\mathbb{E}_{t}[\mathbb{X}^{j,*}_{t}],\mathbb{E}_{t}[u_{t}^{j,*}]\Big)dt+\sigma\Big(t,\mathbb{X}_{t}^{1,\ddagger},u_{t}^{1};\mathbb{X}_{t}^{0,*},u_{t}^{0,*};\mathbb{E}_{t}[\mathbb{X}^{j,*}_{t}],\mathbb{E}_{t}[u_{t}^{j,*}]\Big)dW_{t}^{1},\\
-d\mathbb{Y}^{1,\ddagger}_{t}=&\ f\Big(t,\Pi^{1,\ddagger}_{t},u_{t}^{1};\Pi_{t}^{0,*},u_{t}^{0,*};\mathbb{E}_{t}[\Pi^{j,*}_{t}],\mathbb{E}_{t}[u^{j,*}_{t}]\Big)dt-\mathbb{Z}^{1,\ddagger,0}_{t}dW_{t}^{0}-\mathbb{Z}^{1,\ddagger,1}_{t}dW_{t}^{1},\\
\mathbb{X}_{0}^{1,\ddagger}=&\ x_{0},\quad \mathbb{Y}^{1,\ddagger}_{T}=\Phi\Big(\mathbb{X}^{1,\ddagger}_{T};\mathbb{X}_{T}^{0,*},\mathbb{E}_{T}[\mathbb{X}^{j,*}_{T}]\Big)+\xi^{1},
\end{aligned}\right. \tag{3.11}
$$

where $\Pi_{t}^{1,\ddagger}=(\mathbb{X}_{t}^{1,\ddagger},\mathbb{Y}_{t}^{1,\ddagger},\mathbb{Z}_{t}^{1,\ddagger})$. Along with (3.11), the objective of $\mathcal{A}_{1}$ is to maximize

$$
\Gamma(\mathbb{Y}_{0}^{1,\ddagger})+\mathbb{E}\Big[\int_{0}^{T}g(t,\Pi_{t}^{1,\ddagger},u_{t}^{1};\Pi_{t}^{0,*},u_{t}^{0,*};\mathbb{E}_{t}[\Pi^{j,*}_{t}],\mathbb{E}_{t}[u^{j,*}_{t}])dt\Big]. \tag{3.12}
$$

Note that an optimal control of $\mathcal{A}_{1}$, if it exists, depends on $u^{0,*}$ and $\bar{u}^{*}$.

3.3 A hierarchical recomposition

We now apply a hierarchical recomposition to construct the desired auxiliary problem, which takes the form of a mixed two-layer game in the RMM context. To this end, we combine the perturbed system (3.5)-(3.7) on the side of $\mathcal{A}_{0}$, indexed by “$\dagger$”, with (3.11)-(3.12) on the side of the representative minor $\mathcal{A}_{1}$, indexed by “$\ddagger$”. This yields an (extended) state $(X^{0,\dagger},X^{1,\dagger};X^{1,\ddagger},X^{0,\ddagger},X^{j,\ddagger})$ labeled by $j\geq 2$:

$$
\left\{\begin{aligned}
dX_{t}^{0,\dagger}=&\ b^{0}\big(t,X_{t}^{0,\dagger},u_{t}^{0};\mathbb{E}_{t}[X^{1,\dagger}_{t}],\mathbb{E}_{t}[u_{t}^{1}]\big)dt+\sigma^{0}\big(t,X_{t}^{0,\dagger},u_{t}^{0};\mathbb{E}_{t}[X^{1,\dagger}_{t}],\mathbb{E}_{t}[u_{t}^{1}]\big)dW^{0}_{t},\\
dX_{t}^{1,\dagger}=&\ b\big(t,X_{t}^{1,\dagger},u_{t}^{1};X_{t}^{0,\dagger},u_{t}^{0};\mathbb{E}_{t}[X^{1,\dagger}_{t}],\mathbb{E}_{t}[u_{t}^{1}]\big)dt+\sigma\big(t,X_{t}^{1,\dagger},u_{t}^{1};X_{t}^{0,\dagger},u_{t}^{0};\mathbb{E}_{t}[X^{1,\dagger}_{t}],\mathbb{E}_{t}[u_{t}^{1}]\big)dW_{t}^{1},\\
dX_{t}^{1,\ddagger}=&\ b\big(t,X_{t}^{1,\ddagger},u_{t}^{1};X_{t}^{0,\ddagger},u_{t}^{0};\mathbb{E}_{t}[X^{j,\ddagger}_{t}],\mathbb{E}_{t}[u_{t}^{j}]\big)dt+\sigma\big(t,X_{t}^{1,\ddagger},u_{t}^{1};X_{t}^{0,\ddagger},u_{t}^{0};\mathbb{E}_{t}[X^{j,\ddagger}_{t}],\mathbb{E}_{t}[u_{t}^{j}]\big)dW_{t}^{1},\\
dX_{t}^{0,\ddagger}=&\ b^{0}\big(t,X_{t}^{0,\ddagger},u_{t}^{0};\mathbb{E}_{t}[X^{j,\ddagger}_{t}],\mathbb{E}_{t}[u_{t}^{j}]\big)dt+\sigma^{0}\big(t,X_{t}^{0,\ddagger},u_{t}^{0};\mathbb{E}_{t}[X^{j,\ddagger}_{t}],\mathbb{E}_{t}[u_{t}^{j}]\big)dW_{t}^{0},\\
dX_{t}^{j,\ddagger}=&\ b\big(t,X_{t}^{j,\ddagger},u_{t}^{j};X_{t}^{0,\ddagger},u_{t}^{0};\mathbb{E}_{t}[X^{j,\ddagger}_{t}],\mathbb{E}_{t}[u_{t}^{j}]\big)dt+\sigma\big(t,X_{t}^{j,\ddagger},u_{t}^{j};X_{t}^{0,\ddagger},u_{t}^{0};\mathbb{E}_{t}[X^{j,\ddagger}_{t}],\mathbb{E}_{t}[u_{t}^{j}]\big)dW_{t}^{j},\\
X_{0}^{0,\dagger}=&\ X_{0}^{0,\ddagger}=x^{0}_{0},\quad X_{0}^{1,\dagger}=X_{0}^{1,\ddagger}=X_{0}^{j,\ddagger}=x_{0}.
\end{aligned}\right. \tag{3.13}
$$

Indeed, the first and second states $(X^{0,\dagger},X^{1,\dagger})$ come from (3.5), the third $X^{1,\ddagger}$ from (3.11), and the last two $(X^{0,\ddagger},X^{j,\ddagger})$ from (3.10), by replacing the exact centralized controls $(u^{0,*},u^{1,*},u^{j,*})$ with decentralized ones $(u^{0},u^{1},u^{j}).$ With the extended state (3.13), three agents arise: a follower $\mathcal{A}_{0}$ using the decentralized control $u^{0}$, a follower $\mathcal{A}_{1}$ using $u^{1}$, and a leader $\mathcal{A}_{j}$ using $u^{j}$, for some $j\geq 2.$ Recall that $u^{0}$ is $\mathbb{F}^{0}$-adapted and $u^{1}$ (resp. $u^{j}$) is $\mathbb{F}^{1,0}$- (resp. $\mathbb{F}^{j,0}$-)adapted. Therefore, the follower $\mathcal{A}_{0}$ directly affects all components of the 5-tuple, the follower $\mathcal{A}_{1}$ directly affects the first three $(X^{0,\dagger},X^{1,\dagger},X^{1,\ddagger}),$ while the leader $\mathcal{A}_{j}$ directly affects the last three $(X^{1,\ddagger},X^{0,\ddagger},X^{j,\ddagger}).$ On the other hand, all components of the 5-tuple are coupled through their dependence on the control triple $(u^{0},u^{1},u^{j}).$ In this sense, $\mathcal{A}_{1}$ and $\mathcal{A}_{j}$ also influence the whole 5-tuple (indirectly); all components are endogenous and none is redundant.

We can now formulate the limiting recursive functionals. Specifically, the follower $\mathcal{A}_{0}$ aims to attain

$$
\sup_{u^{0}\in\mathcal{U}^{d}_{0}}\Big\{\Gamma^{0}(Y_{0}^{0,\dagger})+\mathbb{E}\Big[\int_{0}^{T}g^{0}(t,\Theta_{t}^{0,\dagger},u_{t}^{0};\mathbb{E}_{t}[\Theta_{t}^{1,\dagger}],\mathbb{E}_{t}[u_{t}^{1}])dt\Big]\Big\}, \tag{3.14}
$$

where $(Y^{0,\dagger},Z^{0,\dagger};Y^{1,\dagger},Z^{1,\dagger})$ is the solution of the coupled mean-field BSDE

$$
\left\{\begin{aligned}
-dY_{t}^{0,\dagger}=&\ f^{0}\Big(t,\Theta_{t}^{0,\dagger},u_{t}^{0};\mathbb{E}_{t}[\Theta_{t}^{1,\dagger}],\mathbb{E}_{t}[u_{t}^{1}]\Big)dt-Z_{t}^{0,\dagger}dW_{t}^{0},\\
-dY_{t}^{1,\dagger}=&\ f\Big(t,\Theta_{t}^{1,\dagger},u_{t}^{1};\Theta_{t}^{0,\dagger},u_{t}^{0};\mathbb{E}_{t}[\Theta_{t}^{1,\dagger}],\mathbb{E}_{t}[u_{t}^{1}]\Big)dt-Z_{t}^{1,\dagger,0}dW_{t}^{0}-Z_{t}^{1,\dagger,1}dW_{t}^{1},\\
Y_{T}^{0,\dagger}=&\ \Phi^{0}\big(X_{T}^{0,\dagger},\mathbb{E}_{T}[X^{1,\dagger}_{T}]\big)+\xi^{0},\quad Y_{T}^{1,\dagger}=\Phi\big(X_{T}^{1,\dagger};X_{T}^{0,\dagger},\mathbb{E}_{T}[X^{1,\dagger}_{T}]\big)+\xi^{1}.
\end{aligned}\right. \tag{3.15}
$$

The aim of the follower $\mathcal{A}_{1}$ is

$$
\sup_{u^{1}\in\mathcal{U}_{1}^{d}}\Big\{\Gamma(Y_{0}^{1,\ddagger})+\mathbb{E}\Big[\int_{0}^{T}g(t,\Theta_{t}^{1,\ddagger},u_{t}^{1};\Theta_{t}^{0,\ddagger},u_{t}^{0};\mathbb{E}_{t}[\Theta^{j,\ddagger}_{t}],\mathbb{E}_{t}[u^{j}_{t}])dt\Big]\Big\}, \tag{3.16}
$$

where $(Y^{1,\ddagger},Z^{1,\ddagger})$ is the solution of the BSDE

$$
\left\{\begin{aligned}
-dY^{1,\ddagger}_{t}=&\ f\Big(t,\Theta^{1,\ddagger}_{t},u_{t}^{1};\Theta_{t}^{0,\ddagger},u_{t}^{0};\mathbb{E}_{t}[\Theta^{j,\ddagger}_{t}],\mathbb{E}_{t}[u^{j}_{t}]\Big)dt-Z^{1,\ddagger,0}_{t}dW_{t}^{0}-Z^{1,\ddagger,1}_{t}dW_{t}^{1},\\
Y^{1,\ddagger}_{T}=&\ \Phi\Big(X^{1,\ddagger}_{T};X_{T}^{0,\ddagger},\mathbb{E}_{T}[X^{j,\ddagger}_{T}]\Big)+\xi^{1},
\end{aligned}\right. \tag{3.17}
$$

with the exogenous processes $(Y^{0,\ddagger},Z^{0,\ddagger};Y^{j,\ddagger},Z^{j,\ddagger})$ given by

$$
\left\{\begin{aligned}
-dY_{t}^{0,\ddagger}=&\ f^{0}\Big(t,\Theta_{t}^{0,\ddagger},u_{t}^{0};\mathbb{E}_{t}[\Theta^{j,\ddagger}_{t}],\mathbb{E}_{t}[u^{j}_{t}]\Big)dt-Z_{t}^{0,\ddagger}dW_{t}^{0},\\
-dY_{t}^{j,\ddagger}=&\ f\Big(t,\Theta_{t}^{j,\ddagger},u_{t}^{j};\Theta_{t}^{0,\ddagger},u_{t}^{0};\mathbb{E}_{t}[\Theta^{j,\ddagger}_{t}],\mathbb{E}_{t}[u^{j}_{t}]\Big)dt-Z_{t}^{j,\ddagger,0}dW_{t}^{0}-Z_{t}^{j,\ddagger,j}dW_{t}^{j},\\
Y_{T}^{0,\ddagger}=&\ \Phi^{0}\Big(X_{T}^{0,\ddagger},\mathbb{E}_{T}[X^{j,\ddagger}_{T}]\Big)+\xi^{0},\quad Y_{T}^{j,\ddagger}=\Phi\Big(X_{T}^{j,\ddagger};X_{T}^{0,\ddagger},\mathbb{E}_{T}[X^{j,\ddagger}_{T}]\Big)+\xi^{j}.
\end{aligned}\right. \tag{3.18}
$$

Finally, the leader $\mathcal{A}_{j}$ aims to minimize the following quadratic deviation functional

$$
\inf_{u^{j}\in\mathcal{U}^{d}_{j}}\|u^{j}-u^{1}\|_{L^{2}}^{W^{0}}:=\mathbb{E}\int_{0}^{T}\big|\mathbb{E}_{t}[u^{j}_{t}]-\mathbb{E}_{t}[u^{1}_{t}]\big|^{2}dt. \tag{3.19}
$$

Last, we identify the mixed leader-follower-Nash interactions among $\mathcal{A}_{0},\mathcal{A}_{1},\mathcal{A}_{j}$ as follows:

  • As the unique leader, at the beginning $t=0,$ $\mathcal{A}_{j}$ (for some $j\geq 2$) announces an (open-loop) control $u^{j}$ on the whole horizon $[0,T]$ to both followers $\mathcal{A}_{0},\mathcal{A}_{1}$.

  • Given the announced $u^{j}|_{[0,T]},$ $\mathcal{A}_{0},\mathcal{A}_{1}$ simultaneously find a Nash equilibrium $(u^{0},u^{1})$. In particular, $\mathcal{A}_{0}$ and $\mathcal{A}_{1}$ optimize the functionals (3.14) and (3.16), respectively.

  • Anticipating the best responses $u^{0}$ and $u^{1}$, parameterized by the a priori $u^{j}$, the leader takes this leader-follower interaction into account to minimize the quadratic deviation cost. Such a minimum can be reached via a fixed-point argument or consistency condition.

Given the above interactions, the mixed triple-agent game should be solved by the following steps.
(1) First, formulate and solve an auxiliary control problem for $\mathcal{A}_{0}$ in terms of the control $u^{0}$ and the state $(X^{0,\dagger},X^{1,\dagger},Y^{0,\dagger},Z^{0,\dagger},Y^{1,\dagger},Z^{1,\dagger})$, with a generic admissible control $u^{1}$ of $\mathcal{A}_{1}$ held fixed. We denote the optimal control by $\tilde{u}^{0}=\tilde{u}^{0}[u^{1}]$ to show its dependence on $u^{1}$.
(2) Second, for given $u^{0}$ and the pre-announced $u^{j}$, formulate and solve an auxiliary control problem for $\mathcal{A}_{1}$ in terms of the control $u^{1}$ and the state $(X^{1,\ddagger},X^{0,\ddagger},X^{j,\ddagger},Y^{1,\ddagger},Z^{1,\ddagger},Y^{0,\ddagger},Z^{0,\ddagger},Y^{j,\ddagger},Z^{j,\ddagger}).$ We denote the optimal control by $\tilde{u}^{1}=\tilde{u}^{1}[u^{j},u^{0}]$, which depends on $u^{j}$ and $u^{0}.$
(3) Last, obtain the Nash equilibrium of $\mathcal{A}_{0}$ and $\mathcal{A}_{1}$, say $(\hat{u}^{0}[u^{j}],\hat{u}^{1}[u^{j}])$, depending on the pre-announced $u^{j}$, and solve the optimization of the leader by matching $\mathbb{E}_{t}[u^{j}_{t}]=\mathbb{E}_{t}\big[\hat{u}_{t}^{1}[u^{j}_{t}]\big].$
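The logic of steps (1)-(3) can be sketched on a toy static analogue. The example below is only an illustration under assumptions that are not in the paper: controls are scalars, both followers have hypothetical linear best responses (coefficients `a`, `b`, `c`, `d`, `e`), and the leader's consistency matching is solved by a plain fixed-point iteration.

```python
def follower_nash(u_j, a=0.5, b=1.0, c=0.4, d=0.3, e=0.2):
    """Nash equilibrium of the two followers for an announced leader control u_j.

    Hypothetical best responses (not from the paper):
        follower 0:  u0 = a * u1 + b
        follower 1:  u1 = c * u0 + d * u_j + e
    Solving the two equations simultaneously gives the pair below.
    """
    u1 = (c * b + d * u_j + e) / (1.0 - a * c)
    u0 = a * u1 + b
    return u0, u1


def consistency_fixed_point(u_j=0.0, tol=1e-12, max_iter=1000):
    """Leader's matching u_j = u1[u_j], solved by fixed-point iteration."""
    for _ in range(max_iter):
        u0, u1 = follower_nash(u_j)
        if abs(u1 - u_j) < tol:   # the deviation cost is (numerically) zero
            return u0, u1, u_j
        u_j = u1                  # re-announce the followers' response
    raise RuntimeError("fixed-point iteration did not converge")


if __name__ == "__main__":
    print(consistency_fixed_point())
```

With these hypothetical coefficients the map from the announced control to the follower response is a contraction (slope 0.375), so the iteration converges; the matched triple is approximately $(u^{0},u^{1},u^{j})=(1.6,1.2,1.2)$. In the dynamic setting of the text the same matching is imposed on the conditional expectations $\mathbb{E}_{t}[u^{j}_{t}]$ and $\mathbb{E}_{t}[\hat{u}^{1}_{t}]$ rather than on scalars.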

Remark 3.2.

(i) When the weak-coupling of the RMM does not include the control-average of all minors, the term $\mathbb{E}_{t}[X^{j}_{t}]$ in the third and fifth equations of (3.13) can be replaced by some off-line process. The last equation in (3.13) thus becomes redundant. In fact, the inclusion of the control-average of all minors necessitates the introduction of additional dynamics to account for the averages of the physical, recursive and intensity states.

(ii) Furthermore, if we exclude recursive functionals (namely, $\Gamma^{0}=\Gamma\equiv 0$), the mixed triple-agent game simplifies to a two-agent nonzero-sum game, as in [11].

(iii) The introduction of the additional $\mathcal{A}_{j}$ facilitates a clearer specification of the exogenous processes, as discussed in Subsection 3.2 from the standpoint of $\mathcal{A}_{1}$. Otherwise, two exogenous processes would need to be introduced to replace $(\mathbb{E}_{t}[\mathbb{X}_{t}^{j,*}],\mathbb{E}_{t}[u_{t}^{j,*}])$ simultaneously, accompanied by an associated fixed-point analysis. The overall analysis would thus become complicated, given that the consistency matching already involves another fixed-point analysis. In contrast, our formulation of the mixed triple-agent game, particularly the introduction of the virtual $\mathcal{A}_{j}$, helps elucidate the intricate exogenous-endogenous relationships.

3.4 A unified structural scheme

Our bilateral perturbations and mixed hierarchical recomposition in Sections 3.2 and 3.3 indeed suggest a unified scheme to analyze general LP systems with more complex couplings.

We start with a generic LP system characterized by a key 5-tuple $(\mathcal{I},\mathcal{A},J,\mathcal{D},\mathcal{C}_{0})$, where (i) $\mathcal{I}$ is the index set of all agents, enumerated as $\mathcal{A}:=\{\mathcal{A}_{i}\}_{i\in\mathcal{I}}$; (ii) $J:=\{J_{i}\}_{i\in\mathcal{I}}$ is the set of weak-coupled functionals to be optimized by $\mathcal{A}_{i},i\in\mathcal{I},$ respectively; (iii) $\mathcal{D}:=\{u_{i}\}_{i\in\mathcal{I}}$ denotes the associated decision profile of $\mathcal{A}$; (iv) $\mathcal{C}_{0}:=\{\mathcal{I}_{k}\}_{k\in\Theta}$ is the maximal coalition structure on $\mathcal{I}$, which is structurally determined by $J,\mathcal{D}$ and will be elaborated later. Typically, $\mathcal{I}=\mathcal{I}_{1}:=\{1,\cdots,N\}$, $\mathcal{I}_{2}:=\{1,\cdots,k\}\times\mathcal{I}_{1}$, or $\mathcal{I}_{3}:=\{1,\cdots,k\}\cup\mathcal{I}_{1}$ represent, respectively, the classical LP systems with homogenous agents, $k$ classes of heterogenous agents, or $k$ major agents with homogenous minors. For second-order countable or continuum infinity, we can let $\mathcal{I}_{4}:=\mathcal{I}_{1}\times\mathcal{I}_{1}$ or $\mathcal{I}_{5}=[0,1].$ Given $(\mathcal{I},J,\mathcal{D},\mathcal{C}_{0})$, a structural scheme includes the following three steps.

Step 1: Exchangeable decomposition. Essentially, the MFG is an effective dimension-reduction analysis relying on exchangeabilities across agents, and shares a similar spirit with the well-known symmetric games in the deterministic context. Exchangeabilities of a controlled $(\mathcal{I},\mathcal{A},J,\mathcal{D},\mathcal{C}_{0})$ can be characterized asymptotically, as $|\mathcal{I}|\longrightarrow+\infty$, by the so-called coalition structure. This refers to a partition of the index set $\mathcal{I}$ (namely, $\sum_{k\in\Theta}\mathcal{I}_{k}=\mathcal{I}$, where $\sum$ denotes the union of disjoint sets) such that, for every $k\in\Theta$, all agents $\{\mathcal{A}_{j}|j\in\mathcal{I}_{k}\}$ form an exchangeable sub-class, in the sense that

$$
\begin{aligned}
(\mathcal{I},J,\mathcal{D},\mathcal{C}_{0})&=\Big(\sum_{k\in\Theta}\mathcal{I}_{k},\ \ \sum_{j\in\mathcal{I}_{k}}\sum_{k\in\Theta}J_{j,k},\ \ \sum_{j\in\mathcal{I}_{k}}\sum_{k\in\Theta}u_{j,k},\ \ \mathcal{C}_{0}\Big)\\
&=\Big(\mathcal{I}_{k}\cup\sum_{k^{\prime}\neq k\in\Theta}\mathcal{I}_{k^{\prime}},\ \ \{J_{j,k}\}_{j\in\mathcal{I}_{k}}\cup\sum_{j\in\mathcal{I}_{k^{\prime}}}\sum_{k^{\prime}\neq k\in\Theta}J_{j,k^{\prime}},\ \ \{u_{j,k}\}_{j\in\mathcal{I}_{k}}\cup\sum_{j\in\mathcal{I}_{k^{\prime}}}\sum_{k^{\prime}\neq k\in\Theta}u_{j,k^{\prime}},\ \ \mathcal{C}_{0}\Big)\\
&\Longleftrightarrow_{\mathcal{E}}\Big(\widetilde{\mathcal{I}_{k}}\cup\sum_{k^{\prime}\neq k\in\Theta}\mathcal{I}_{k^{\prime}},\ \ \widetilde{\{J_{j,k}\}}_{j\in\mathcal{I}_{k}}\cup\sum_{j\in\mathcal{I}_{k^{\prime}}}\sum_{k^{\prime}\neq k\in\Theta}J_{j,k^{\prime}},\ \ \widetilde{\{u_{j,k}\}}_{j\in\mathcal{I}_{k}}\cup\sum_{j\in\mathcal{I}_{k^{\prime}}}\sum_{k^{\prime}\neq k\in\Theta}u_{j,k^{\prime}},\ \ \mathcal{C}_{0}\Big)\\
&:=(\widetilde{\mathcal{I}},\widetilde{J},\widetilde{\mathcal{D}},\mathcal{C}_{0})
\end{aligned}
$$

where $\left(\widetilde{\mathcal{I}_{k}},\widetilde{\{J_{j,k}\}}_{j\in\mathcal{I}_{k}},\widetilde{\{u_{j,k}\}}_{j\in\mathcal{I}_{k}}\right)$ is a simultaneous permutation on the sub-index set $\mathcal{I}_{k}$ and the associated intersections on $J$ and $\mathcal{D}$, and “$\Longleftrightarrow_{\mathcal{E}}$” denotes the equivalence relation in terms of game equilibrium. That is, the set of Nash equilibria of $(\mathcal{I},J,\mathcal{D},\mathcal{C}_{0})$ coincides with that of $(\widetilde{\mathcal{I}},\widetilde{J},\widetilde{\mathcal{D}},\mathcal{C}_{0})$ under any finite permutation on $\mathcal{I}_{k}$. Roughly, this means that the simultaneous optimizations faced by $\{\mathcal{A}_{j}\}_{j\in\mathcal{I}_{k}}$ are endowed with identical probabilistic structures. Moreover, under the large symmetric assumption as in [9], the equivalence $\mathcal{E}$ should be element-wise and thus transitive. Therefore, there exists a maximal coalition, denoted by $\mathcal{C}_{0}$, provided the set of coalition structures is non-empty. By “maximal” we mean that it is the coarsest partition among all coalition structures, which implies that the largest exchangeable decomposition, and hence the largest dimension reduction, can be achieved. In fact, this maximal coalition can be constructed through the saturated sets of the equivalence relation, as the union of all $\mathcal{E}$-equivalence classes ([18]).

A trivial coalition is $\{\{e\}|e\in\mathcal{I}\},$ the set of all singletons generated by the individual elements. In this case, $|\Theta|=|\mathcal{I}|$ and there is no dimension reduction. The MFG is applicable for non-trivial coalitions in that $|\Theta|=O(1)$ as $|\mathcal{I}|\longrightarrow+\infty$; or conversely, for at least one $k\in\Theta,$ $|\mathcal{I}_{k}|=O(|\mathcal{I}|).$ For example, $|\mathcal{I}_{k}|=O(|\mathcal{I}|^{\alpha})$ for some $\alpha>0$. For LP systems with $\mathcal{I}_{1,2,3},$ $\mathcal{C}_{0}=\mathcal{I}_{1},(\{1\}\times\mathcal{I}_{1},\cdots,\{k\}\times\mathcal{I}_{1}),(\{1\},\cdots,\{k\},\mathcal{I}_{1}),$ with associated $|\Theta|=1,k,k+1$, respectively. For these cases, heuristic arguments are still feasible to construct the auxiliary control problem. However, they break down when coupling structures assume more complex forms, such as non-standard bridge configurations; in this case, the identification of $\mathcal{C}_{0}$ with its associated decomposition becomes necessary to yield a unified and systematic analysis.
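The counting $|\Theta|=k+1$ for $\mathcal{I}_{3}$ can be sketched as a grouping of agents into equivalence classes. The snippet below is a hedged illustration: the function `role_signature` is a hypothetical stand-in for the equivalence $\mathcal{E}$ (which in the text is defined through invariance of the equilibrium set under permutations, not through labels), and the agent encoding as `(role, id)` pairs is likewise an assumption.

```python
from collections import defaultdict


def maximal_coalition(agents, signature):
    """Partition agents into classes that share the same signature.

    Agents with equal signatures are treated as exchangeable, so the
    returned blocks form the coarsest partition compatible with `signature`.
    """
    classes = defaultdict(list)
    for agent in agents:
        classes[signature(agent)].append(agent)
    return list(classes.values())


def role_signature(agent):
    """Hypothetical signature: each major is its own class, minors share one."""
    role, _ = agent
    return agent if role == "major" else ("minor",)


if __name__ == "__main__":
    k, n = 2, 5  # k majors and n homogeneous minors, as in the index set I_3
    agents = [("major", i) for i in range(1, k + 1)]
    agents += [("minor", i) for i in range(1, n + 1)]
    blocks = maximal_coalition(agents, role_signature)
    print(len(blocks))  # number of blocks, here k + 1 = 3
```

Replacing `role_signature` by a constant function merges every agent into a single block, while using the identity function gives the trivial coalition of singletons with no dimension reduction.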

Step 2: Representative multilateral perturbation. Given the maximal coalition $\mathcal{C}_{0}$, one can select a representative agent, denoted by $\mathcal{A}^{\text{rep}}_{k},k\in\Theta$, from each exchangeable sub-class $\{\mathcal{A}_{j}|j\in\mathcal{I}_{k}\}.$ This selection yields a representative collection $\mathcal{R}:=\{\mathcal{A}^{\text{rep}}_{k}\}_{k\in\Theta}$ that is dimension-reduced, since $|\Theta|=O(1)$. In fact, we may abuse notation and write $\mathcal{R}=\mathcal{A}/\Theta$ for the quotient space whose elements are the equivalence classes of $\mathcal{E}$. Then, a multilateral perturbation can be introduced on the side of each $\mathcal{A}^{\text{rep}}_{k}$ separately, by assuming all other representatives still keep their equilibrium strategies. Depending on the weak-coupling mechanism structured by $\{J_{j}\}_{j\in\mathcal{I}}$, each perturbation $u^{\text{rep}}_{k}$ is transmitted throughout the whole LP system $\{\mathcal{A}_{j}\}_{j\in\mathcal{I}}=\sum_{k\in\Theta}\{\mathcal{A}_{j}\}_{j\in\mathcal{I}_{k}}$ across all exchangeable sub-classes. A typical transmission, in an open-loop and dynamic setting, can be sketched by the following channel via the weak-coupling of the state-average:

$$
u^{\text{rep}}_{k}\Longrightarrow x^{\text{rep}}_{k}(u^{\text{rep}}_{k})\Longrightarrow\left(x^{(N)}_{k}(x^{\text{rep}}_{k}(u^{\text{rep}}_{k})),\ \ x^{(N)}_{k^{\prime}\neq k}(x^{\text{rep}}_{k}(u^{\text{rep}}_{k}))\right)\Longrightarrow J_{j^{\prime}\neq j}(x^{(N)}_{k},x^{(N)}_{k^{\prime}\neq k})\Longrightarrow\cdots
$$

Along with such transmission, the influence of a representative $\mathcal{A}^{\text{rep}}_{k}$ on the controlled LP system can be completely quantified, which is essentially equivalent to the Fréchet differential of $J_{k}^{\text{rep}}$ with respect to $u^{\text{rep}}_{k}$.

Step 3: Hierarchical recomposition. The multilateral perturbation involves approximations to completely quantify all LP weak-couplings asymptotically, thereby leading to a variety of limiting quantities. Due to the exchangeability, these quantities typically take the form of conditional expectations on the tail sigma-algebra, as per de Finetti's theorem. However, unlike those in classical McKean-Vlasov control problems, these expectations exhibit distinct modes in realized degrees of controllability, contingent upon their hierarchical positions within the entire weak-coupled structure. For instance, in the RMM, three modes emerge: exact-realized (exact limit), quasi-realized (semi-exact), and null-realized, as indicated by (3.5) and (3.8). In the current RMM or simpler coupled setups, these modes can be ordered using an exogenous-endogenous relation, where the “exogenous” exact-realized mode dominates the “endogenous” null-realized one. However, in more complex forms of LP couplings, these modes cannot be fully encapsulated by a binary ordering alone, and instead form a more intricate directed graph network, which is challenging to study by heuristic arguments alone.

Our resolution is a hierarchical recomposition, similar to the well-studied structural function through the so-called path sets in reliability theory ([1]). In fact, the reliability of any system is equivalent to that of a serial (sequential) arrangement of parallel subsystems. Likewise, for a generic LP system, all modes involved in Step 2 can be stratified into sequential layers with leader-follower-type hierarchies, where each layer consists of parallel Nash-type nodes with simultaneous decisions. This stratification, similar to the construction of the structural function, is indeed applicable to any LP system, such as those with non-classical intermediate coupling, akin to the bridge systems in reliability analysis. Such stratification enables a unified and more clear-cut construction of the desired auxiliary problem by recomposing all modes across hierarchical layers with a combination of corresponding variants of fixed-point matching. For example, the aforementioned three modes in the RMM yield two layers, thereby enabling the construction of the auxiliary problem via a triple-agent leader-follower-Nash game. Incidentally, the multilateral perturbation plays a role similar to the synthesis of all minimal path sets in reliability analysis, as both aim to quantify all transmission channels within a given system.
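The stratification into sequential layers of parallel nodes can be pictured with a small graph routine. This is only an analogy under assumptions not made in the paper: the leader-follower relations are encoded as a hypothetical list of directed edges `(leader, follower)` forming an acyclic graph, and layers are extracted by a standard layer-by-layer topological ordering.

```python
def stratify(nodes, edges):
    """Split a leader -> follower acyclic graph into sequential layers.

    Nodes in the same layer have all their leaders in earlier layers and
    therefore decide simultaneously (Nash-type); successive layers are
    ordered in a leader-follower fashion.
    """
    leaders_of = {n: set() for n in nodes}
    for leader, follower in edges:
        leaders_of[follower].add(leader)

    layers, placed = [], set()
    while len(placed) < len(nodes):
        # Collect every unplaced node whose leaders have all been placed.
        layer = sorted(
            n for n in nodes if n not in placed and leaders_of[n] <= placed
        )
        if not layer:
            raise ValueError("cycle detected: no hierarchical ordering exists")
        layers.append(layer)
        placed.update(layer)
    return layers


if __name__ == "__main__":
    # RMM-style example: the virtual leader A_j announces first, then the
    # followers A_0 and A_1 play a Nash game.
    print(stratify(["A0", "A1", "Aj"], [("Aj", "A0"), ("Aj", "A1")]))
    # prints [['Aj'], ['A0', 'A1']]
```

For the three-agent game of Section 3.3 this returns two layers, matching the leader-follower-Nash structure described above. A cyclic dependency raises an error in this sketch, which reflects the limitation of the analogy rather than a statement about the scheme itself.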

Attainability. For classical LP systems, leader-type agents at higher layers in the hierarchical recomposition often engage in a fixed-point analysis, as demonstrated by (3.19) for the RMM, where $\mathcal{A}_{j}$ aligns the announced $u^{j}$ with the resultant $u^{1}(u^{j})$ through their conditional expectations $\mathbb{E}_{t}u^{j},\mathbb{E}_{t}u^{1}$. In contrast to follower-type agents at lower layers, who confront more regular control problems (see (3.16)), the fixed-point analysis for leaders can indeed degenerate, as the minimal deviation is trivially attained at zero, provided $\mathbb{E}_{t}u^{j}=\mathbb{E}_{t}u^{1}$. Consequently, the choice of the norm $\phi(\cdot)$ on the deviation $\mathbb{E}_{t}u^{j}-\mathbb{E}_{t}u^{1}(u^{j})$ becomes immaterial. Specifically, the introduction of a quadratic deviation in the $L^{2}$-norm, $\phi(\cdot)=\|\cdot\|_{L^{2}}^{W^{0}}$ as in (3.19), is merely formal. In fact, the top layer, characterized by a high degree of realized controllability, presents a tradeoff in that its remaining control capability is more prone to degeneration. However, in LP systems with more complex couplings, nonclassical hierarchical layers may emerge and the relevant analysis, particularly of top layers, may no longer be degenerate. This is particularly true for LP systems with nested or asymmetric information ([3, 25]), those with heterogenous robustness beliefs, and those with bridging-intermediate-type couplings. In such instances, the optimal deviation norms cannot be trivially attained at zero, necessitating the replacement of the fixed-point analysis with non-trivial optimization problems.

Summary. Steps 1-3 constitute a unified structural scheme to analyze more general LP systems, particularly those with non-classical coupling structures. In fact, for classical LP systems with only minor (homogeneous or heterogeneous) agents, auxiliary problems can be constructed by straightforward heuristic arguments. The main reason is that their coalition structures are relatively simple, so there is no need to invoke Steps 2-3. A more complex but still classical LP system is one involving a single major agent, for which the auxiliary problem can still be constructed heuristically but requires an exogenous-endogenous relation to handle the additional couplings induced by the major agent. Our RMM has fairly general weak couplings, especially those of the recursive state pairs, which motivate us to introduce the structural scheme, especially Steps 2-3, to fully quantify the complexity of all resultant perturbation transmissions and to construct the auxiliary problem hierarchically. For non-classical LP systems such as those with nested or asymmetric information, or those featuring bridge-type couplings, heuristic arguments are no longer feasible for constructing auxiliary problems. For example, even for LP systems with $k$ majors and $K$ classes of heterogeneous minors, a heuristic construction necessitates an exogenous-endogenous structure represented by a directed graph with $k\times K$ edges. By contrast, the structural scheme simplifies this to a sequential flow with only $k+K$ binary relations.

4 Approximated Nash equilibrium of the RMM problem

4.1 Auxiliary control problem for the follower $\mathcal{A}_0$

Now we fix the follower $\mathcal{A}_1$'s control $u^1$ and solve the auxiliary problem of $\mathcal{A}_0$. The auxiliary problem for $\mathcal{A}_0$ is formulated by (3.14), associated with the mean-field BSDE (3.15) and the McKean-Vlasov SDE (3.13) (its first two components). The related Hamiltonian is defined as

\[
H_0=\langle p^0,b^0\rangle+\langle p,b\rangle+\langle q^{00},\sigma^0\rangle+\langle q^{11},\sigma\rangle-l^0\cdot f^0-l\cdot f+g^0,
\]

where $(b^0,\sigma^0)=(b^0,\sigma^0)(t,x_0,u_0;\bar{x},\bar{u})$, $(b,\sigma)=(b,\sigma)(t,x,u;x_0,u_0;\bar{x},\bar{u})$, $(f^0,g^0)=(f^0,g^0)(t,x_0,y_0,z_0,u_0;\bar{x},\bar{y},\bar{z},\bar{u})$, and $f=f(t,x,y,z,u;x_0,y_0,z_0,u_0;\bar{x},\bar{y},\bar{z},\bar{u})$. Using the stochastic maximum principle ([2]), we get the following Hamiltonian system

\[
\left\{\begin{aligned}
dX_t^{0,*}=&\partial_{p_0}H_0(t)dt+\partial_{q_{00}}H_0(t)dW_t^0,\quad dX_t^{1,*}=\partial_{p}H_0(t)dt+\partial_{q_{11}}H_0(t)dW_t^1,\\
-dY_t^{0,*}=&-\partial_{l^0}H_0(t)dt-Z_t^{0,*}dW_t^0,\quad -dY_t^{1,*}=-\partial_{l}H_0(t)dt-Z_t^{1,*,0}dW_t^0-Z_t^{1,*,1}dW_t^1,\\
dL_t^0=&-\partial_{y_0}H_0(t)dt-\partial_{z_0}H_0(t)dW_t^0,\\
dL_t=&-\big[\partial_{y}H_0(t)+\mathbb{E}_t[\partial_{\bar{y}}H_0(t)]\big]dt-\big[\partial_{z}H_0(t)+\mathbb{E}_t[\partial_{\bar{z}}H_0(t)]\big]d(W^0,W^1)_t,\\
-dP_t^0=&\partial_{x_0}H_0(t)dt-Q_t^{00}dW_t^0-Q_t^{01}dW_t^1,\\
-dP_t=&\big[\partial_{x}H_0(t)+\mathbb{E}_t[\partial_{\bar{x}}H_0(t)]\big]dt-Q_t^{10}dW_t^0-Q_t^{11}dW_t^1,
\end{aligned}\right. \tag{4.1}
\]

where $X_0^{0,*}=x_0^0$, $X_0^{1,*}=x_0$, $Y_T^{0,*}=\Phi^0\big(X_T^{0,*},\mathbb{E}_T[X_T^{1,*}]\big)+\xi^0$, $Y_T^{1,*}=\Phi\big(X_T^{1,*};X_T^{0,*},\mathbb{E}_T[X_T^{1,*}]\big)+\xi^1$, $L_0^0=-\Gamma^0_y(Y_0^{0,*})$, $L_0=0$, $P_T^0=-L_T^0\cdot\partial_{x_0}\Phi^0-L_T\cdot\partial_{x_0}\Phi$, $P_T=-L_T\cdot\partial_{x}\Phi-\mathbb{E}_T[L_T^0\cdot\partial_{\bar{x}}\Phi^0+L_T\cdot\partial_{\bar{x}}\Phi]$; $(L^0,L;P^0,P,Q^{00},Q^{01},Q^{10},Q^{11})$ are the associated adjoint processes and

\[
H_0(t):=H_0\big(t,\Theta_t^{0,*},u_t^{0,*};\Theta_t^{1,*},u_t^1;\mathbb{E}_t[\Theta_t^{1,*}],\mathbb{E}_t[u_t^1];L_t^0,L_t;P_t^0,P_t,Q_t^{00},Q_t^{11}\big), \tag{4.2}
\]

assuming that $u^{0,*}$ is an optimal control and $(\Theta^{0,*},\Theta^{1,*})=(X^{0,*},Y^{0,*},Z^{0,*};X^{1,*},Y^{1,*},Z^{1,*})$ the associated optimal state trajectory. Then we have the following result.

Proposition 4.1.

Let (A1) be in force. Moreover, we assume that
(1) There exists a unique maximizer of the Hamiltonian $H_0$ as a function of $u_0$ (denoted by $\widehat{u}^0$);
(2) The function $H_0$ is convex in $(x_0,x,y_0,y,z_0,z,u_0)$.
If $(\Theta^{0,*},\Theta^{1,*};L^0,L;P^0,Q^{00},Q^{01};P,Q^{10},Q^{11})$ solves system (4.1), the optimal control of $\mathcal{A}_0$ is

\[
\widetilde{u}_t^0=\mathbb{E}_t\big[\widehat{u}^0\big(t,\Theta_t^{0,*},\Theta_t^{1,*},\mathbb{E}_t[\Theta_t^{1,*}];L_t^0,L_t,P_t^0,Q_t^{00},P_t,Q_t^{11};u_t^1,\mathbb{E}_t[u_t^1]\big)\big].
\]

4.2 Auxiliary control problem for the follower $\mathcal{A}_1$

In this subsection, we fix $u^j,u^0$, the controls adopted by the leader $\mathcal{A}_j$ and the follower $\mathcal{A}_0$ respectively. Consider the control problem of the follower $\mathcal{A}_1$ associated with BSDE (3.17), SDE (3.13) and the functional (3.16). Given $(u^0,u^j)$, (3.18) and the last two equations in (3.13) become exogenous processes. The Hamiltonian functional is $H(t,x_0,y_0,z_0,u_0;x,y,z,u;\bar{x},\bar{y},\bar{z},\bar{u};l,p,q_{11})=\langle p,b\rangle+\langle q_{11},\sigma\rangle-l\cdot f+g$, and the related Hamiltonian system takes the following form

\[
\left\{\begin{aligned}
dX_t^{1,\ddagger}=&\partial_{p}H(t)dt+\partial_{q_{11}}H(t)dW_t^1,\quad X_0^{1,\ddagger}=x_0,\\
-dY_t^{1,\ddagger}=&-\partial_{l}H(t)dt-Z_t^{1,\ddagger,0}dW_t^0-Z_t^{1,\ddagger,1}dW_t^1,\quad Y_T^{1,\ddagger}=\Phi\big(X_T^{1,\ddagger};X_T^{0,\ddagger},\mathbb{E}_T[X_T^{1,\ddagger}]\big)+\xi^1,\\
dL_t^{\ddagger}=&-\partial_{y}H(t)dt-\partial_{z}H(t)d(W^0,W^1)_t,\quad L_0^{\ddagger}=-\Gamma_y(Y_0^{1,\ddagger}),\\
-dP_t^{\ddagger}=&\partial_{x}H(t)dt-Q_t^{10,\ddagger}dW_t^0-Q_t^{11,\ddagger}dW_t^1,\quad P_T^{\ddagger}=-L_T^{\ddagger}\cdot\partial_{x}\Phi\big(X_T^{1,\ddagger};X_T^{0,\ddagger},\mathbb{E}_T[X_T^{j,\ddagger}]\big),
\end{aligned}\right. \tag{4.3}
\]

where the last two equations are the associated adjoint equations and

\[
H(t)=H\big(t,\Theta_t^0,u_t^0;\Theta_t^1,u_t^{1,*};\mathbb{E}_t[\Theta_t^1],\mathbb{E}_t[u_t^{1,*}];L_t^{\ddagger},P_t^{\ddagger},Q_t^{11,\ddagger}\big). \tag{4.4}
\]

Then, from the stochastic maximum principle for FBSDE systems (e.g., [21]), we have the following result.

Proposition 4.2.

Under (A1), assume the FBSDE (4.3) admits a unique solution and
(1) There exists a unique maximizer of the Hamiltonian $H$ as a function of $u$ (denoted by $\widehat{u}$);
(2) The function $H$ is convex in $(x_0,x,y_0,y,z_0,z,u)$.
Then the optimal control of $\mathcal{A}_1$ is given by
\[
\widetilde{u}_t^1=\widehat{u}\big(t,\Theta_t^{0,\ddagger},\Theta_t^{1,\ddagger},\mathbb{E}_t[\Theta_t^{j,\ddagger}];L_t^{\ddagger},P_t^{\ddagger},Q_t^{11,\ddagger};\mathbb{E}_t[u_t^j],u_t^0\big).
\]

4.3 Consistency condition system

For the sake of presentation, hereafter we may write a bar on top of a random variable (or process) to denote its conditional expectation with respect to $\mathbb{F}^0$; for example, $\overline{u}_t^j=\mathbb{E}_t[u_t^j]$. We impose the following consistency conditions on the followers $\mathcal{A}_0,\mathcal{A}_1$ and the leader $\mathcal{A}_j$:

\[
\widehat{u}\big[\overline{u}_t^j;\mathbb{E}_t[\widehat{u}^0[u_t^1,\overline{u}_t^1]]\big]=u_t^1,\quad \overline{u}_t^j=\overline{u}_t^1,\quad t\in[0,T]. \tag{4.5}
\]

By (4.5) and the solution uniqueness of SDE (3.13) and BSDEs (3.15), (3.17)-(3.18), we have

\[
\Theta_t^{0,\dagger}=\Theta_t^{0,\ddagger},\quad \Theta_t^{1,\dagger}=\Theta_t^{1,\ddagger},\quad \overline{\Theta}_t^{1,\dagger}=\overline{\Theta}_t^{j,\ddagger},\quad t\in[0,T]. \tag{4.6}
\]

Noting (4.6) and by Propositions 4.1 and 4.2, we may introduce the following assumption:
Assumption (A2) Suppose that there exists a pair of deterministic continuous functions $(\varphi^0,\varphi):[0,T]\times\mathbb{R}^{3n+6m}\times\mathbb{R}^{3m}\times\mathbb{R}^{6n}\times U\rightarrow U^0\times U$ satisfying the following conditions

\[
\left\{\begin{aligned}
\varphi^0=&\widehat{u}^0\big(t,\theta^0,\theta,\bar{\theta},l^0,l,p^0,q^{00},p,q^{11},\bar{u};\varphi\big),\\
\varphi=&\widehat{u}\big(t,\theta^0,\theta,\bar{\theta};l^{\ddagger},p^{\ddagger},q^{11,\ddagger},\bar{u};\varphi^0\big),\quad t\in[0,T],
\end{aligned}\right.
\]

where $\widehat{u}^0$ and $\widehat{u}$ are the mappings given in Propositions 4.1 and 4.2.
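To make the mutual fixed-point structure of (A2) concrete, the following toy sketch replaces the two maximizers by affine best responses, `phi0 = alpha0 + beta0 * phi` and `phi = alpha + beta * phi0`. The coefficients `alpha0, beta0, alpha, beta` are hypothetical placeholders and are not taken from the RMM model; the sketch only illustrates that, in such an affine setting, a pair $(\varphi^0,\varphi)$ exists and is unique when `beta0 * beta != 1`, and that alternating best responses converge to it when `|beta0 * beta| < 1`.

```python
def solve_affine_best_responses(alpha0, beta0, alpha, beta):
    """Solve phi0 = alpha0 + beta0*phi, phi = alpha + beta*phi0 in closed form.

    All coefficients are hypothetical; they stand in for the (model-dependent)
    maximizers of the two Hamiltonians.
    """
    det = 1.0 - beta0 * beta
    if abs(det) < 1e-12:
        raise ValueError("beta0 * beta is (numerically) 1: no unique fixed point")
    phi0 = (alpha0 + beta0 * alpha) / det
    phi = alpha + beta * phi0
    return phi0, phi


def iterate_best_responses(alpha0, beta0, alpha, beta, n_iter=200):
    """Alternate the two best responses; converges when |beta0 * beta| < 1."""
    phi0, phi = 0.0, 0.0
    for _ in range(n_iter):
        phi0 = alpha0 + beta0 * phi
        phi = alpha + beta * phi0
    return phi0, phi


if __name__ == "__main__":
    exact = solve_affine_best_responses(1.0, 0.5, 2.0, -0.4)
    approx = iterate_best_responses(1.0, 0.5, 2.0, -0.4)
    print(exact, approx)
```

In the general nonlinear setting no such closed form is available, which is why (A2) is imposed as an assumption rather than derived.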

Under (A2), by the measurable selection theorem, there exists a measurable $\psi$ with

\[
\psi_t=\mathbb{E}_t\big[\varphi\big(t,\Theta_t^{0,\dagger},\Theta_t^{1,\dagger},\overline{\Theta}_t^{1,\dagger};L_t^0,L_t,L_t^{\ddagger},P_t^0,Q_t^{00},P_t,Q_t^{11},P_t^{\ddagger},Q_t^{11,\ddagger};\psi_t\big)\big]. \tag{4.7}
\]

Combining (4.7) and (A2), we can denote

\[
\begin{aligned}
&\Psi^0\big(t,\Theta_t^{0,\dagger},\Theta_t^{1,\dagger};L_t^0,L_t,L_t^{\ddagger},P_t^0,Q_t^{00},P_t,Q_t^{11},P_t^{\ddagger},Q_t^{11,\ddagger}\big)\\
=&\ \mathbb{E}_t\big[\varphi^0\big(t,\Theta_t^{0,\dagger},\Theta_t^{1,\dagger},\overline{\Theta}_t^{1,\dagger};L_t^0,L_t,L_t^{\ddagger},P_t^0,Q_t^{00},P_t,Q_t^{11},P_t^{\ddagger},Q_t^{11,\ddagger};\psi_t\big)\big].
\end{aligned} \tag{4.8}
\]

Similarly, $\Psi(t,\cdots)=\varphi\big(t,\Theta_t^{0,\dagger},\Theta_t^{1,\dagger},\overline{\Theta}_t^{1,\dagger};\cdots;\psi_t\big)$. Plugging the mappings $(\Psi^0,\Psi)$ into (4.1) and (4.3), we obtain

\[
\left\{\begin{aligned}
dX_t^0=&b^0\big(t,X_t^0,u_t^{0,*};\mathbb{E}_t[X_t^1],\mathbb{E}_t[u_t^{1,*}]\big)dt+\sigma^0\big(t,X_t^0,u_t^{0,*};\mathbb{E}_t[X_t^1],\mathbb{E}_t[u_t^{1,*}]\big)dW_t^0,\\
dX_t^1=&b\big(t,X_t^1,u_t^{1,*};X_t^0,u_t^{0,*};\mathbb{E}_t[X_t^1],\mathbb{E}_t[u_t^{1,*}]\big)dt+\sigma\big(t,X_t^1,u_t^{1,*};X_t^0,u_t^{0,*};\mathbb{E}_t[X_t^1],\mathbb{E}_t[u_t^{1,*}]\big)dW_t^1,\\
dL_t^0=&-\partial_{y_0}H_0(t)dt-\partial_{z_0}H_0(t)dW_t^0,\qquad dL_t^{\ddagger}=-\partial_{y}H(t)dt-\partial_{z}H(t)d(W^0,W^1)_t^{\top},\\
dL_t=&-\big[\partial_{y}H_0(t)+\mathbb{E}_t[\partial_{\bar{y}}H_0(t)]\big]dt-\big[\partial_{z}H_0(t)+\mathbb{E}_t[\partial_{\bar{z}}H_0(t)]\big]d(W^0,W^1)_t^{\top},\\
-dY_t^0=&f^0\big(t,\Theta_t^0,u_t^{0,*};\mathbb{E}_t[\Theta_t^1],\mathbb{E}_t[u_t^{1,*}]\big)dt-Z_t^0dW_t^0,\\
-dY_t^1=&f\big(t,\Theta_t^1,u_t^{1,*};\Theta_t^0,u_t^{0,*};\mathbb{E}_t[\Theta_t^1],\mathbb{E}_t[u_t^{1,*}]\big)dt-Z_t^{1,0}dW_t^0-Z_t^{1,1}dW_t^1,\\
-dP_t^0=&\partial_{x_0}H_0(t)dt-Q_t^{00}dW_t^0-Q_t^{01}dW_t^1,\\
-dP_t=&\big[\partial_{x}H_0(t)+\mathbb{E}_t[\partial_{\bar{x}}H_0(t)]\big]dt-Q_t^{10}dW_t^0-Q_t^{11}dW_t^1,\\
-dP_t^{\ddagger}=&\partial_{x}H(t)dt-Q_t^{10,\ddagger}dW_t^0-Q_t^{11,\ddagger}dW_t^1,
\end{aligned}\right. \tag{4.9}
\]

with the mixed initial-terminal conditions

\[
\begin{aligned}
X_0^0=&x_0^0,\quad X_0^1=x_0,\quad L_0^0=-\Gamma_y^0(Y_0^0),\quad L_0^{\ddagger}=-\Gamma_y(Y_0^1),\quad L_0=0,\\
Y_T^0=&\Phi^0\big(X_T^0,\mathbb{E}_T[X_T^1]\big)+\xi^0,\quad Y_T^1=\Phi\big(X_T^1;X_T^0,\mathbb{E}_T[X_T^1]\big)+\xi^1,\quad P_T^{\ddagger}=-L_T^{\ddagger}\cdot\partial_{x}\Phi,\\
P_T^0=&-L_T^0\cdot\partial_{x_0}\Phi^0-L_T\cdot\partial_{x_0}\Phi,\quad P_T=-L_T\cdot\partial_{x}\Phi-\mathbb{E}_T[L_T^0\cdot\partial_{\bar{x}}\Phi^0+L_T\cdot\partial_{\bar{x}}\Phi].
\end{aligned}
\]

Here $H_0(t)$ (resp. $H(t)$) is given by (4.2) (resp. (4.4)) with $u^1$ (resp. $u^0$) replaced by $u^{1,*}$ (resp. $u^{0,*}$), and

\[
\begin{aligned}
u_t^{0,*}=&\Psi^0\big(t,\Theta_t^{0,\dagger},\Theta_t^{1,\dagger};L_t^0,L_t,L_t^{\ddagger},P_t^0,Q_t^{00},P_t,Q_t^{11},P_t^{\ddagger},Q_t^{11,\ddagger}\big),\\
u_t^{1,*}=&\Psi\big(t,\Theta_t^{0,\dagger},\Theta_t^{1,\dagger};L_t^0,L_t,L_t^{\ddagger},P_t^0,Q_t^{00},P_t,Q_t^{11},P_t^{\ddagger},Q_t^{11,\ddagger}\big).
\end{aligned}
\]

The following result is a direct consequence of the preceding analysis.

Corollary 4.1.

Suppose the assumptions of Propositions 4.1 and 4.2 hold. If FBSDE (4.9) admits a solution, then $(u^{0,*},u^{1,*},u^{j,*})$ is an equilibrium of the triple-agent mixed leader-follower-Nash game, where $u^{j,*}\in\mathcal{U}_j^d$ satisfies $\overline{u}_t^{1,*}=\overline{u}_t^{j,*}$ for all $t\in[0,T]$.

4.4 $\varepsilon$-Nash equilibrium of the RMM problem

Proposition 4.3 yields an approximate Nash equilibrium for the RMM problem. To verify it, we first introduce a technical assumption that is common in the control literature [11]:
Assumption (A3) (i) The diffusion coefficients $\sigma^0,\sigma$ are independent of $(u^0,u)$, if applicable.
(ii) The maximizers $\widehat{u}^0$ and $\widehat{u}$ in Propositions 4.1 and 4.2 are independent of $(Z^0,Z^1)$.
(iii) The system (4.9) has a unique solution $(\Theta_t^0,\Theta_t^1;L_t^0,L_t,L_t^{\ddagger};P_t^0,P_t,P_t^{\ddagger},Q_t^{00},Q_t^{01},Q_t^{10},Q_t^{11},Q_t^{10,\ddagger},Q_t^{11,\ddagger})$, where $L^0,L$ and $L^{\ddagger}$ are $\mathbb{F}^0$-adapted.
(iv) There exists a random decoupling field $\eta:[0,T]\times\Omega\times\mathbb{R}^{2n+3m}\rightarrow\mathbb{R}^{2m+3n}$ such that

\[
(Y_t^0,Y_t^1,P_t^0,P_t,P_t^{\ddagger})=\eta(t,X_t^0,X_t^1,L_t^0,L_t,L_t^{\ddagger}),\quad \text{a.s.},
\]

where $\eta(\cdot,x^0,x^1,l^0,l,l^{\ddagger})$ is $\mathbb{F}^0$-adapted for each $(x^0,x^1,l^0,l,l^{\ddagger})\in\mathbb{R}^{2n+3m}$, and $\eta(t,\cdots)$ is Lipschitz continuous in all its arguments, uniformly in $t$.

The decoupling field is key to studying the well-posedness of FBSDEs; see [13, 30, 31]. Proposition 5.1 presents a sufficient condition ensuring the existence of a decoupling field for LQG-RMM. By (A3-i), $\widehat{u}^0$ and $\widehat{u}$ in Propositions 4.1 and 4.2 are independent of $(Q^{00},Q^{11},Q^{11,\ddagger})$, and the same holds for $\varphi^0,\varphi$ in (A2), $\psi$ in (4.7), and $\Psi^0$ and $\Psi$ in (4.8). With a slight abuse of notation, we write

\[
\chi(t,X_t^0,X_t^1)=\chi\big(t,X_t^0,X_t^1;L_t^0,L_t,L_t^{\ddagger},\eta(t,X_t^0,X_t^1,L_t^0,L_t,L_t^{\ddagger})\big),\quad \chi=\Psi^0,\Psi.
\]

Then it follows from (4.8) and (A3-iii) that $\Psi^0(t,X_t^0,X_t^1)$ and $\Psi(t,\cdot,\cdot)$ are $\mathbb{F}^0$-adapted.

Assumption (A4) Both $\Psi^0$ and $\Psi$ are Lipschitz in the state variable $(x^0,x)$ uniformly in $t$, and $\|\Psi^0(t,0,0)\|_{L^2}+\|\Psi(t,0,0)\|_{L^2}<\infty$.

(A3) and (A4) are commonly adopted in the MFG literature (e.g., (A7) and (A8) in [11]). Indeed, (A3) and (A4) are satisfied for the LQG-RMM problem in Section 5. Applying the feedback control pair $(\Psi^0,\Psi)$ to (1.1) and (1.3), we get the following forward-backward system

\[
\left\{\begin{aligned}
dX_t^{0,N}=&b^0\big(t,X_t^{0,N},u_t^{0,N};X_t^{(N)},u_t^{(N)}\big)dt+\sigma^0\big(t,X_t^{0,N};X_t^{(N)},u_t^{(N)}\big)dW_t^0,\quad X_0^{0,N}=x_0^0,\\
dX_t^{i,N}=&b\big(t,X_t^{i,N},u_t^{i,N};X_t^{0,N},u_t^{0,N};X_t^{(N)},u_t^{(N)}\big)dt\\
&+\sigma\big(t,X_t^{i,N},X_t^{0,N};X_t^{(N)},u_t^{(N)}\big)dW_t^i,\quad X_0^{i,N}=x_0,\\
-dY_t^{0,N}=&f^0\big(t,\Theta_t^{0,N},u_t^{0,N};\Theta_t^{(N)},u_t^{(N)}\big)dt-Z_t^{0,N}dW_t^0,\quad Y_T^{0,N}=\Phi^0(X_T^{0,N},X_T^{(N)})+\xi^0,\\
-dY_t^{i,N}=&f\big(t,\Theta_t^{i,N},u_t^{i,N};\Theta_t^{0,N},u_t^{0,N};\Theta_t^{(N)},u_t^{(N)}\big)dt-Z_t^{i,N}d(W^0,W^i)_t,\\
Y_T^{i,N}=&\Phi(X_T^{i,N},X_T^{0,N},X_T^{(N)})+\xi^i,
\end{aligned}\right. \tag{4.10}
\]

where $\Xi_{\cdot}^{(N)}=\frac{1}{N}\sum_{i=1}^{N}\Xi_{\cdot}^{i,N}$ for $\Xi=X,Y,Z$, $u_{\cdot}^{(N)}=\frac{1}{N}\sum_{i=1}^{N}u_{\cdot}^{i,N}$, $u_t^{0,N}=\Psi^0(t,X_t^{0,N},X_t^{1,N})$, and $u_t^{i,N}=\Psi(t,X_t^{0,N};X_t^{i,N})$. The payoff functionals (1.2) now become

\[
\begin{aligned}
J_0^N=&\Gamma^0(Y_0^{0,N})+\mathbb{E}\Big[\int_0^T g^0\big(t,\Theta_t^{0,N},u_t^{0,N};\Theta_t^{(N)},u_t^{(N)}\big)dt\Big],\\
J_i^N=&\Gamma(Y_0^{i,N})+\mathbb{E}\Big[\int_0^T g\big(t,\Theta_t^{i,N},u_t^{i,N};\Theta_t^{0,N},u_t^{0,N};\Theta_t^{(N)},u_t^{(N)}\big)dt\Big].
\end{aligned} \tag{4.11}
\]

To ease the notation, hereafter we may write $(J_0^N,J_i^N)$ instead of $(J_0^N,J_i^N)(u^0,u^1,\cdots,u^N)$.

Theorem 4.1.

Assume that (A1)-(A4) hold. The feedback control $\big(u_t^{0,N},u_t^{1,N},\cdots,u_t^{N,N}\big)$ is an $\varepsilon_N$-Nash equilibrium for the RMM problem, where $\varepsilon_N\leq\frac{C}{\sqrt{N}}$.

The proof of Theorem 4.1 essentially relies on a tailor-made propagation-of-chaos analysis in the same spirit as [11], but invokes additional estimates on BSDEs. We defer it to Appendix A.1.
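The $1/\sqrt{N}$ order in Theorem 4.1 is the typical fluctuation size of an empirical average around its limit. The toy Monte Carlo sketch below illustrates only this scaling, using i.i.d. standard Gaussian samples as a stand-in for the agents' states; it does not simulate the RMM system, and the function names, sample sizes and seeds are illustrative choices.

```python
import math
import random


def rms_error_of_empirical_mean(n_agents, n_trials, seed=0):
    """Root-mean-square distance between the empirical mean of n_agents
    i.i.d. N(0, 1) samples and the true mean 0, estimated over n_trials."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        mean = sum(rng.gauss(0.0, 1.0) for _ in range(n_agents)) / n_agents
        total += mean * mean
    return math.sqrt(total / n_trials)


def scaled_errors(sizes=(25, 100, 400), n_trials=400):
    """Return sqrt(N) * RMS error for each N; values near 1 indicate 1/sqrt(N) decay."""
    return {
        n: math.sqrt(n) * rms_error_of_empirical_mean(n, n_trials, seed=n)
        for n in sizes
    }


if __name__ == "__main__":
    for n, value in scaled_errors().items():
        print(n, round(value, 3))
```

The rescaled errors staying of order one across different $N$ is the elementary counterpart of the bound $\varepsilon_N\leq C/\sqrt{N}$; the actual proof has to control, in addition, the coupled forward and backward components.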

5 Linear-quadratic-Gaussian cases

This section studies the RMM problem in some linear-quadratic-Gaussian (LQG) cases (LQG-RMM), where the agents' states evolve by the following linear SDEs: for $1\leq i\leq N$,

\[
\left\{\begin{aligned}
dX_t^0=&\big(b_1^0X_t^0+b_2^0u_t^0+b_3^0X_t^{(N)}+b_4^0u_t^{(N)}\big)dt+\sigma_0dW_t^0,\quad X_0^0=x_0^0,\\
dX_t^i=&\big(b_1X_t^i+b_2u_t^i+b_3X_t^0+b_4u_t^0+b_5X_t^{(N)}+b_6u_t^{(N)}\big)dt+\sigma dW_t^i,\quad X_0^i=x_0,
\end{aligned}\right. \tag{5.1}
\]

and the recursive functionals assume the following quadratic forms

\[
\begin{aligned}
J_0^N=&-\gamma_0|Y_0^0|^2-\mathbb{E}\Big[\int_0^T Q_0|X_t^0-\mu_1^0\cdot X_t^{(N)}|^2+R_0|u_t^0-\mu_2^0\cdot u_t^{(N)}|^2dt\Big],\\
J_i^N=&-\gamma|Y_0^i|^2-\mathbb{E}\Big[\int_0^T Q|X_t^i-\mu_1\cdot X_t^{(N)}-\mu_2\cdot X_t^0|^2+R|u_t^i-\mu_3\cdot u_t^{(N)}-\mu_4\cdot u_t^0|^2dt\Big],
\end{aligned} \tag{5.2}
\]

where $(Y^0,Z^0;Y^i,Z^i)$ satisfy the following coupled linear BSDE system:

\[
\left\{\begin{aligned}
-dY_t^0=&\big(f_1^0X_t^0+f_2^0Y_t^0+f_3^0Z_t^0+f_4^0u_t^0+f_5^0X_t^{(N)}+f_6^0Y_t^{(N)}+f_7^0Z_t^{(N),0}+f_8^0u_t^{(N)}\big)dt-Z_t^0dW_t^0,\\
-dY_t^i=&\big(f_1X_t^i+f_2Y_t^i+f_3Z_t^{i,0}+f_4u_t^i+f_5X_t^0+f_6Y_t^0+f_7Z_t^0+f_8u_t^0+f_9X_t^{(N)}\\
&+f_{10}Y_t^{(N)}+f_{11}Z_t^{(N),0}+f_{12}u_t^{(N)}\big)dt-Z_t^{i,0}dW_t^0-Z_t^{i,i}dW_t^i,\\
Y_T^0=&\Phi_1^0X_T^0+\Phi_2^0X_T^{(N)}+\xi^0,\quad Y_T^i=\Phi_1X_T^i+\Phi_2X_T^0+\Phi_3X_T^{(N)}+\xi^i,
\end{aligned}\right. \tag{5.3}
\]

with $Y_t^{(N)}=\frac{1}{N}\sum_{i=1}^{N}Y_t^i$ and $Z_t^{(N),0}=\frac{1}{N}\sum_{i=1}^{N}Z_t^{i,0}$. To simplify the analysis, we assume that $n=m=k^0=k=1$, and that all coefficients are constants with nonnegative $(\gamma_0,\gamma,Q_0,Q)$, positive $(R_0,R)$, and $\mu_3+\mu_2^0\mu_4\neq 1$. The forward-backward LQG setting of (5.1)-(5.3) is strongly motivated by various practical applications; see [19, 26, 33] for more details. We introduce the following notation:
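A short heuristic computation indicates where the condition $\mu_3+\mu_2^0\mu_4\neq 1$ enters. This is only a sketch under the scalar setting above: it uses the first-order conditions of the Hamiltonians of Section 4 with the quadratic costs in (5.2), and $c_t^0$, $c_t$ are shorthand introduced here (not in the original) for the terms collecting the adjoint contributions, with $c_t^0$ taken $\mathbb{F}^0$-adapted.

\[
\begin{aligned}
u_t^0=&\ \mu_2^0\,\overline{u}_t+c_t^0,\qquad u_t^1=\mu_3\,\overline{u}_t+\mu_4\,u_t^0+c_t,\\
\overline{u}_t=&\ \mathbb{E}_t[u_t^1]\ \Longrightarrow\ \big(1-\mu_3-\mu_2^0\mu_4\big)\,\overline{u}_t=\mu_4\,c_t^0+\mathbb{E}_t[c_t].
\end{aligned}
\]

Under these assumptions the averaged control $\overline{u}_t$ is uniquely determined by the adjoint terms exactly when $\mu_3+\mu_2^0\mu_4\neq 1$, which is consistent with the requirement imposed above.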

\[
\begin{aligned}
&\mathbf{X}_t=(X_t^0,X_t^1)^{\top},\quad \mathbf{L}_t=(L_t^0,L_t,L_t^{\ddagger})^{\top},\quad \mathbf{Y}_t=(Y_t^0,Y_t^1)^{\top},\quad \mathbf{P}_t=(P_t^0,P_t,P_t^{\ddagger})^{\top},\\
&\mathbf{Z}_t=(Z_t^0,Z_t^{1,0})^{\top},\quad \mathbf{Q}_t=(Q_t^{00},Q_t^{10},Q_t^{10,\ddagger})^{\top},\quad \mathbf{Z}_t^1=(0,Z_t^{1,1})^{\top},\quad \mathbf{Q}_t^1=(Q_t^{01},Q_t^{11},Q_t^{11,\ddagger})^{\top}.
\end{aligned} \tag{5.4}
\]

By Propositions 4.1 and 4.2 and (4.5), the equilibrium strategy $(u^{0,*},u^{1,*},u^{j,*})$ now reads

\[
\begin{aligned}
u_t^{0,*}=&{a_1^0}^{\top}\mathbb{E}_t[\mathbf{P}_t]+{a_3^0}^{\top}\mathbb{E}_t[\mathbf{L}_t],\quad u_t^{1,*}=a_1^{\top}\mathbb{E}_t[\mathbf{P}_t]+a_3^{\top}\mathbb{E}_t[\mathbf{L}_t]+a_7^{\top}\mathbf{P}_t+a_8^{\top}\mathbf{L}_t,\\
\mathbb{E}_t[u_t^{j,*}]=&(a_1+a_7)^{\top}\mathbb{E}_t[\mathbf{P}_t]+(a_3+a_8)^{\top}\mathbb{E}_t[\mathbf{L}_t].
\end{aligned} \tag{5.5}
\]

Combining with (4.9), we get the following FBSDE

\[
\left\{\begin{aligned}
d\mathbf{X}_t=&\Big(\mathbb{A}_1\mathbf{X}_t+\mathbb{A}_2\mathbb{E}_t[\mathbf{X}_t]+\mathbb{B}_1\mathbf{P}_t+\mathbb{B}_2\mathbb{E}_t[\mathbf{P}_t]+\mathbb{C}_1\mathbf{L}_t+\mathbb{C}_2\mathbb{E}_t[\mathbf{L}_t]\Big)dt\\
&+(\sigma_0,0)^{\top}dW_t^0+(0,\sigma)^{\top}dW_t^1,\quad \mathbf{X}_0=(x_0^0,x_0)^{\top},\\
d\mathbf{L}_t=&\Big(\mathbb{C}_3\mathbf{L}_t+\mathbb{C}_4\mathbb{E}_t[\mathbf{L}_t]\Big)dt+\Big(\mathbb{C}_5\mathbf{L}_t+\mathbb{C}_6\mathbb{E}_t[\mathbf{L}_t]\Big)dW_t^0,\quad \mathbf{L}_0=\rho\mathbf{Y}_0,\\
-d\mathbf{Y}_t=&\Big(\mathbb{A}_3\mathbf{X}_t+\mathbb{A}_4\mathbb{E}_t[\mathbf{X}_t]+\mathbb{D}_1\mathbf{Y}_t+\mathbb{D}_2\mathbb{E}_t[\mathbf{Y}_t]+\mathbb{F}_1\mathbf{Z}_t+\mathbb{F}_2\mathbb{E}_t[\mathbf{Z}_t]\\
&+\mathbb{B}_3\mathbf{P}_t+\mathbb{B}_4\mathbb{E}_t[\mathbf{P}_t]+\mathbb{C}_7\mathbf{L}_t+\mathbb{C}_8\mathbb{E}_t[\mathbf{L}_t]\Big)dt-\mathbf{Z}_tdW_t^0-\mathbf{Z}_t^1dW_t^1,\\
-d\mathbf{P}_t=&\Big(\mathbb{A}_5\mathbf{X}_t+\mathbb{A}_6\mathbb{E}_t[\mathbf{X}_t]+\mathbb{B}_5\mathbf{P}_t+\mathbb{B}_6\mathbb{E}_t[\mathbf{P}_t]+\mathbb{C}_9\mathbf{L}_t+\mathbb{C}_{10}\mathbb{E}_t[\mathbf{L}_t]\Big)dt-\mathbf{Q}_tdW_t^0-\mathbf{Q}_t^1dW_t^1,\\
\mathbf{Y}_T=&\mathbb{A}_7\mathbf{X}_T+\mathbb{A}_8\mathbb{E}_T[\mathbf{X}_T]+(\xi^0,\xi)^{\top},\quad \mathbf{P}_T=\mathbb{C}_{11}\mathbf{L}_T+\mathbb{C}_{12}\mathbb{E}_T[\mathbf{L}_T].
\end{aligned}\right. \tag{5.6}
\]

For the sake of presentation, we defer the definitions of $\{a_i^0\}_{i=1,3}$, $\{a_i\}_{i=1,3,7,8}$ and the matrices (vectors) in (5.5)-(5.6) to Appendix A.2. System (5.6) is a fully-coupled FBSDE with mixed initial-terminal conditions, and its well-posedness can be discussed through the following steps. Setting $\overline{M}_t=\mathbb{E}_t[M_t]$ for $M=\mathbf{X},\mathbf{Y},\mathbf{Z},\mathbf{L},\mathbf{P},\mathbf{Q}$ and taking conditional expectations in (5.6), we obtain

\[
\left\{\begin{aligned}
d\overline{\mathbf{X}}_t=&[(\mathbb{A}_1+\mathbb{A}_2)\overline{\mathbf{X}}_t+(\mathbb{B}_1+\mathbb{B}_2)\overline{\mathbf{P}}_t+(\mathbb{C}_1+\mathbb{C}_2)\overline{\mathbf{L}}_t]dt+(\sigma_0,0)^{\top}dW_t^0,\quad \overline{\mathbf{X}}_0=(x_0^0,x_0)^{\top},\\
d\overline{\mathbf{L}}_t=&(\mathbb{C}_3+\mathbb{C}_4)\overline{\mathbf{L}}_tdt+(\mathbb{C}_5+\mathbb{C}_6)\overline{\mathbf{L}}_tdW_t^0,\quad \overline{\mathbf{L}}_0=\rho\overline{\mathbf{Y}}_0,\\
-d\overline{\mathbf{Y}}_t=&[(\mathbb{A}_3+\mathbb{A}_4)\overline{\mathbf{X}}_t+(\mathbb{D}_1+\mathbb{D}_2)\overline{\mathbf{Y}}_t+(\mathbb{F}_1+\mathbb{F}_2)\overline{\mathbf{Z}}_t+(\mathbb{B}_3+\mathbb{B}_4)\overline{\mathbf{P}}_t+(\mathbb{C}_7+\mathbb{C}_8)\overline{\mathbf{L}}_t]dt-\overline{\mathbf{Z}}_tdW_t^0,\\
-d\overline{\mathbf{P}}_t=&[(\mathbb{A}_5+\mathbb{A}_6)\overline{\mathbf{X}}_t+(\mathbb{B}_5+\mathbb{B}_6)\overline{\mathbf{P}}_t+(\mathbb{C}_9+\mathbb{C}_{10})\overline{\mathbf{L}}_t]dt-\overline{\mathbf{Q}}_tdW_t^0,\\
\overline{\mathbf{Y}}_T=&(\mathbb{A}_7+\mathbb{A}_8)\overline{\mathbf{X}}_T+(\xi^0,\mathbb{E}_T[\xi])^{\top},\quad \overline{\mathbf{P}}_T=(\mathbb{C}_{11}+\mathbb{C}_{12})\overline{\mathbf{L}}_T.
\end{aligned}\right. \tag{5.7}
\]

Next, we consider the following ODE and BSDE

\[
\left\{\begin{aligned}
&\dot{S}_t+S_t\mathbb{H}_1+\mathbb{H}_2S_t+S_t\mathbb{H}_3S_t+(I+S_t\widetilde{\rho})\mathbb{H}_4(I+S_t\widetilde{\rho})^{-1}\big(S_t\mathbb{H}_5+S_t\mathbb{H}_6S_t\big)+\mathbb{H}_7=0,\\
&S_T=(I-\mathbb{G}_1)^{-1}\mathbb{G}_2,
\end{aligned}\right. \tag{5.8}
\]
\[
\left\{\begin{aligned}
-d\Upsilon_t=&\Big[\Big(S_t\mathbb{H}_3+\mathbb{H}_2+(I+S_t\widetilde{\rho})\mathbb{H}_4(I+S_t\widetilde{\rho})^{-1}S_t\mathbb{H}_6\Big)\Upsilon_t+(I+S_t\widetilde{\rho})\mathbb{H}_4(I+S_t\widetilde{\rho})^{-1}\upsilon_t\\
&+(S_t+I)\mathbb{H}_4(I+S_t\widetilde{\rho})^{-1}S_t(\sigma_0,0,0,0,0)^{\top}\Big]dt-\upsilon_tdW_t^0,\\
\Upsilon_T=&(I-\mathbb{G}_1)^{-1}(\xi^0,\mathbb{E}_T[\xi],0,0,0)^{\top}.
\end{aligned}\right. \tag{5.9}
\]

Again, we defer the definitions of $\{\mathbb{H}_k\}_{k=1}^{7}$ and $\mathbb{G}_1,\mathbb{G}_2$ to Appendix A.2. The well-posedness of (5.8) may be obtained as in Theorem 4.6 of [19]. We refrain from presenting these conditions in detail, as they are rather technical and would cause an unnecessary digression from our presentation. Instead, we directly assume that

(A5) Equation (5.8) admits a unique $\mathbb{R}^{5\times 5}$-valued solution $S$ with $[I+S_t\widetilde{\rho}]^{-1}$ bounded for a.e. $t$.

Under (A5), BSDE (5.9) admits a unique $\mathbb{F}^0$-adapted solution $(\Upsilon,\upsilon)$ by the standard BSDE solvability arguments. Moreover, we have the following result:

Lemma 5.1.

Let (A5) hold. Then the linear FBSDE (5.7) has a unique $\mathbb{F}^0$-adapted solution $(\overline{\mathbf{X}},\overline{\mathbf{L}},\overline{\mathbf{Y}},\overline{\mathbf{Z}},\overline{\mathbf{P}},\overline{\mathbf{Q}})$ with the following relations:

\[
\begin{aligned}
(\overline{\mathbf{Y}}_t,\overline{\mathbf{P}}_t)^{\top}=&(I+S_t\widetilde{\rho})^{-1}S_t(\overline{\mathbf{X}}_t,\overline{\mathbf{L}}_t)^{\top}+(I+S_t\widetilde{\rho})^{-1}\Upsilon_t,\\
(\overline{\mathbf{Z}}_t,\overline{\mathbf{Q}}_t)^{\top}=&(I+S_t\widetilde{\rho})^{-1}(S_t\mathbb{H}_5+S_t\mathbb{H}_6S_t)\big(I-\widetilde{\rho}(I+S_t\widetilde{\rho})^{-1}S_t\big)(\overline{\mathbf{X}}_t,\overline{\mathbf{L}}_t)^{\top}\\
&-(I+S_t\widetilde{\rho})^{-1}(S_t\mathbb{H}_5+S_t\mathbb{H}_6S_t)\widetilde{\rho}(I+S_t\widetilde{\rho})^{-1}\Upsilon_t\\
&+(I+S_t\widetilde{\rho})^{-1}S_t\mathbb{H}_6\Upsilon_t+(I+S_t\widetilde{\rho})^{-1}\big(S_t(\sigma_0,0,0,0,0)^{\top}+\upsilon_t\big).
\end{aligned} \tag{5.10}
\]

Its proof is based on a standard linear transformation decoupling method (e.g., [17]) and we omit its details here. We now assume

(A6) $\det\{(0,I)e^{\mathcal{A}t}(0,I)^{\top}\}>0,\ \forall\ t\in[0,T]$, where $\mathcal{A}=\left(\begin{array}{cc}b_{1}&\frac{1}{2}R^{-1}|b_{2}|^{2}\\ 2Q&-b_{1}\end{array}\right)$.
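To give a feel for (A6), the following worked computation treats the scalar case. It is only a sketch under assumptions not stated in the text: $b_{1},b_{2},Q,R$ are real constants with $R>0$ and $Q\geq 0$, and the symbol $\lambda$ is introduced here purely for illustration (with $\lambda>0$).

```latex
\begin{aligned}
&\mathcal{A}^{2}=\big(b_{1}^{2}+R^{-1}|b_{2}|^{2}Q\big)I=:\lambda^{2}I
\ \Longrightarrow\
e^{\mathcal{A}t}=\cosh(\lambda t)\,I+\frac{\sinh(\lambda t)}{\lambda}\,\mathcal{A},\\
&(0,1)\,e^{\mathcal{A}t}\,(0,1)^{\top}
=\cosh(\lambda t)-\frac{b_{1}}{\lambda}\sinh(\lambda t)
\ \geq\ \cosh(\lambda t)-\sinh(\lambda t)=e^{-\lambda t}>0,\qquad t\geq 0,
\end{aligned}
```

where the inequality uses $|b_{1}|\leq\lambda$ and $\sinh(\lambda t)\geq 0$. Under these assumptions (A6) would therefore hold on every horizon $[0,T]$; for other sign configurations the condition has to be checked directly.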

Proposition 5.1.

Let (A5)-(A6) hold. Then the Hamiltonian system (5.6) has a unique solution $(\mathbf{X},\mathbf{L},\mathbf{Y},\mathbf{Z},\mathbf{Z}^{1},\mathbf{P},\mathbf{Q},\mathbf{Q}^{1})$ (see (5.4) for notation), where $(X^{0},\mathbf{L},Y^{0},Z^{0},P^{0},Q^{00},P,Q^{10})$ are $\mathbb{F}^{0}$-adapted, $(X^{1},Y^{1},Z^{10},Z^{11},P^{\ddagger},Q^{10,\ddagger},Q^{11,\ddagger})$ are $\mathbb{F}^{0,1}$-adapted, and $Q^{01}=Q^{11}\equiv 0$. Moreover, the relation $P_{t}^{\ddagger}=\Sigma_{t}X_{t}^{1}+p_{t}$, $t\in[0,T]$, holds, where $\Sigma$ and $p$ are given by

\begin{align}
\Sigma_{t}=&-\Big[(0,I)e^{\mathcal{A}(T-t)}(0,I)^{\top}\Big]^{-1}(0,I)e^{\mathcal{A}(T-t)}(I,0)^{\top}, \tag{5.11}\\
p_{t}=&\ \mathbb{E}_{t}\Big[-L_{T}^{\ddagger}\Phi_{1}\Pi_{T}+\int_{t}^{T}\big[(\Sigma_{s}\mathbf{\Lambda}_{1}+\Sigma_{s}\mathbf{\Lambda}_{2}(I+S_{s}\widetilde{\rho})^{-1}S_{s}+\mathbf{\Lambda}_{4})(\overline{\mathbf{X}}_{s},\mathbf{L}_{s})^{\top}+\Sigma_{s}\mathbf{\Lambda}_{2}(I+S_{s}\widetilde{\rho})^{-1}\Upsilon_{s}\big]\Pi_{s}ds\Big], \tag{5.12}
\end{align}

with $\Pi_{s}=e^{\int_{t}^{s}(\frac{1}{2}R^{-1}|b_{2}|^{2}\Sigma_{r}+b_{1})dr}$, $s\in[t,T]$, where $\mathbf{\Lambda}_{1}$, $\mathbf{\Lambda}_{2}$ and $\mathbf{\Lambda}_{4}$ are given in Appendix A.2.

We defer the proof of Proposition 5.1 to the Appendix. Combining (5.5), (5.10) and (5.12), the equilibrium strategies for $\mathcal{A}_{0}$ and $\mathcal{A}_{1}$ are, respectively,

$$
\begin{aligned}
u_{t}^{0,*}=&\ (0,{a_{1}^{0}}^{\top})(\overline{\mathbf{Y}}_{t},\overline{\mathbf{P}}_{t})^{\top}+{a_{3}^{0}}^{\top}{\mathbf{L}}_{t}=A_{t}^{0,1}(X_{t}^{0},\mathbb{E}_{t}[X_{t}^{1}])^{\top}+M_{t}^{0},\\
u_{t}^{1,*}=&\ (0,a_{1}^{\top})(\overline{\mathbf{Y}}_{t},\overline{\mathbf{P}}_{t})^{\top}+(a_{3}+a_{8})^{\top}\mathbf{L}_{t}+\frac{1}{2}R^{-1}b_{2}P_{t}^{\ddagger}=A_{t}^{1}(X_{t}^{0},\mathbb{E}_{t}[X_{t}^{1}])^{\top}\\
&+\frac{1}{2}R^{-1}b_{2}\Sigma_{t}X_{t}^{1}+\frac{1}{2}R^{-1}b_{2}\mathbb{E}_{t}\Big[\int_{t}^{T}C_{s}^{1}(X_{s}^{0},\mathbb{E}_{s}[X_{s}^{1}])^{\top}\Pi_{s}ds\Big]+M_{t},
\end{aligned}
$$

where

$$
\begin{aligned}
&(A_{t}^{0,1},A_{t}^{0,2}):=(0,{a_{1}^{0}}^{\top})(I+S_{t}\widetilde{\rho})^{-1}S_{t},\qquad M_{t}^{0}:=A_{t}^{0,2}\mathbf{L}_{t}+(0,{a_{1}^{0}}^{\top})(I+S_{t}\widetilde{\rho})^{-1}\Upsilon_{t}+{a_{3}^{0}}^{\top}\overline{\mathbf{L}}_{t},\\
&(A_{t}^{1},A_{t}^{2}):=(0,a_{1}^{\top})(I+S_{t}\widetilde{\rho})^{-1}S_{t},\qquad (C_{t}^{1},C_{t}^{2}):=\Sigma_{t}\mathbf{\Lambda}_{1}+\Sigma_{t}\mathbf{\Lambda}_{2}(I+S_{t}\widetilde{\rho})^{-1}S_{t}+\mathbf{\Lambda}_{4},\\
&M_{t}:=A_{t}^{2}\mathbf{L}_{t}+(0,a_{1}^{\top})(I+S_{t}\widetilde{\rho})^{-1}\Upsilon_{t}+(a_{3}+a_{8})^{\top}\mathbf{L}_{t}\\
&\qquad+\frac{1}{2}R^{-1}b_{2}\mathbb{E}_{t}\Big[-L_{T}^{\ddagger}\Phi_{1}\Pi_{T}+\int_{t}^{T}[C_{s}^{2}\mathbf{L}_{s}+\Sigma_{s}\mathbf{\Lambda}_{2}(I+S_{s}\widetilde{\rho})^{-1}\Upsilon_{s}]\Pi_{s}ds\Big].
\end{aligned}
$$

Note that both $M_{t}^{0}$ and $M_{t}$ are $\mathbb{F}^{0}$-adapted. By Theorem 4.1, we have the following result.

Theorem 5.1.

Let (A5)-(A6) hold. Then

\begin{align}
\Big(&A_{t}^{0,1}(X_{t}^{0,N},\mathbb{E}_{t}[X_{t}^{1,N}])^{\top}+M_{t}^{0},\ \ \ A_{t}^{1}(X_{t}^{0,N},\mathbb{E}_{t}[X_{t}^{1,N}])^{\top} \tag{5.13}\\
&+\frac{1}{2}R^{-1}b_{2}\Sigma_{t}X_{t}^{i,N}+\frac{1}{2}R^{-1}b_{2}\mathbb{E}_{t}\Big[\int_{t}^{T}C_{s}^{1}(X_{s}^{0,N},\mathbb{E}_{s}[X_{s}^{1,N}])^{\top}\Pi_{s}ds\Big]+M_{t}\Big),\ 1\leq i\leq N, \notag
\end{align}

is an $\varepsilon_{N}$-Nash equilibrium strategy for LQG-RMM with $\varepsilon_{N}=O(\frac{1}{\sqrt{N}})$, where $(X^{0,N},X^{1,N},\cdots,X^{N,N})$ is the solution of the McKean-Vlasov SDE (5.1) under the strategy (5.13).

5.1 Forward LQG-RMM

This subsection studies a special case with $\gamma_{0}=\gamma=0$ in (5.1), (5.2). In this case, all functionals are still quadratic but involve only the forward state, and the RMM reduces to the classical forward major-minor game in [11]. Although such a forward setting is not new, our LQG-RMM remains novel in the generality of its weak couplings. Now (5.8) and (5.9) become

$$
\left\{\begin{aligned}
&\dot{S}_{t}+{S}_{t}(\mathbb{A}_{1}+\mathbb{A}_{2})+(\mathbb{B}_{5}+\mathbb{B}_{6})S_{t}+S_{t}\mathbb{B}_{2}S_{t}+(\mathbb{A}_{5}+\mathbb{A}_{6})=0,\\
&-d\Upsilon_{t}=(S_{t}\mathbb{B}_{2}+\mathbb{B}_{5}+\mathbb{B}_{6})\Upsilon_{t}dt,\ \ \ \ S_{T}=\Upsilon_{T}=0.
\end{aligned}\right.
$$

It is easy to check that if

$$
\det\{(0,I)e^{\widehat{\mathcal{A}}t}(0,I)^{\top}\}>0,\ \forall\ t\in[0,T],\ \ \ \text{with}\ \widehat{\mathcal{A}}=\left(\begin{array}{cc}\mathbb{A}_{1}+\mathbb{A}_{2}&\mathbb{B}_{2}\\ -\mathbb{A}_{5}-\mathbb{A}_{6}&-\mathbb{B}_{5}-\mathbb{B}_{6}\end{array}\right), \tag{5.14}
$$

then (A5) is ensured and $S$ admits the following representation

$$
S_{t}=-\Big[(0,I)e^{\widehat{\mathcal{A}}(T-t)}(0,I)^{\top}\Big]^{-1}(0,I)e^{\widehat{\mathcal{A}}(T-t)}(I,0)^{\top}=:\left(\begin{array}{cc}S_{t}^{11}&S_{t}^{12}\\ S_{t}^{21}&S_{t}^{22}\\ S_{t}^{31}&S_{t}^{32}\end{array}\right). \tag{5.15}
$$
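The matrix-exponential representation can be sanity-checked numerically in a scalar toy case. The sketch below is not part of the original analysis: the scalars `a`, `b`, `c`, `d` are hypothetical stand-ins for $\mathbb{A}_{1}+\mathbb{A}_{2}$, $\mathbb{B}_{2}$, $\mathbb{A}_{5}+\mathbb{A}_{6}$ and $\mathbb{B}_{5}+\mathbb{B}_{6}$, and the helper names are illustrative. It evaluates $S_{t}$ from the exponential formula and checks by finite differences that the Riccati equation $\dot{S}+Sa+dS+bS^{2}+c=0$, $S_{T}=0$, is satisfied.

```python
def mat_mul(x, y):
    """Product of two small dense matrices stored as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def mat_exp(m, squarings=10, terms=20):
    """Matrix exponential via scaling-and-squaring with a truncated Taylor series."""
    n = len(m)
    scaled = [[v / 2.0 ** squarings for v in row] for row in m]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in mat_mul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def riccati_solution(t, T, a, b, c, d):
    """Scalar analogue of the representation: S_t = -Phi_22^{-1} Phi_21."""
    tau = T - t
    phi = mat_exp([[a * tau, b * tau], [-c * tau, -d * tau]])
    return -phi[1][0] / phi[1][1]


def riccati_residual(t, T, a, b, c, d, h=1e-4):
    """Residual of S' + S a + d S + b S^2 + c = 0, using a central difference for S'."""
    s = riccati_solution(t, T, a, b, c, d)
    s_dot = (riccati_solution(t + h, T, a, b, c, d)
             - riccati_solution(t - h, T, a, b, c, d)) / (2 * h)
    return s_dot + s * a + d * s + b * s * s + c


if __name__ == "__main__":
    for t in (0.1, 0.5, 0.9):
        print(t, riccati_residual(t, 1.0, 0.3, 0.5, -1.0, 0.2))
```

The residuals should vanish up to discretisation error, provided the $(2,2)$ entry of the exponential stays positive on $[0,T]$, which is the scalar form of (5.14).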

By the uniqueness of the solution, $\Upsilon_{t}\equiv 0$, so $M_{t}^{0}=M_{t}\equiv 0$, $t\in[0,T]$ (noting $L^{0}=L=L^{\ddagger}\equiv 0$). Then

Corollary 5.1.

Let (A6) and (5.14) hold. An $\varepsilon_{N}$-Nash equilibrium strategy for the forward LQG-RMM problem takes the following form

$$
\begin{aligned}
\Big(&A_{t}^{0,1}(X_{t}^{0,N},\mathbb{E}_{t}[X_{t}^{1,N}])^{\top},\ A_{t}^{1}(X_{t}^{0,N},\mathbb{E}_{t}[X_{t}^{1,N}])^{\top}\\
&+\frac{1}{2}R^{-1}b_{2}\Sigma_{t}X_{t}^{i,N}+\frac{1}{2}R^{-1}b_{2}\mathbb{E}_{t}\Big[\int_{t}^{T}C_{s}^{1}(X_{s}^{0,N},\mathbb{E}_{s}[X_{s}^{1,N}])^{\top}\Pi_{s}ds\Big]\Big),\quad 1\leq i\leq N,
\end{aligned}
$$

where $A_{t}^{0,1}={a_{1}^{0}}^{\top}(S_{t}^{11},S_{t}^{21},S_{t}^{31})^{\top}$, $A_{t}^{1}=a_{1}^{\top}(S_{t}^{11},S_{t}^{21},S_{t}^{31})^{\top}$, $C_{t}^{1}=\Sigma_{t}(b_{3},b_{5})+\Sigma_{t}\mathbf{\Lambda}_{3}S_{t}+(2Q\mu_{2},2Q\mu_{1})$, $\mathbf{\Lambda}_{3}$ is given in Appendix A.2; $S_{t}$ and $\Sigma_{t}$ are given by (5.15) and (5.11), respectively.

Example 5.1.

Consider a forward LQG-RMM with $\mu^{0}_{2}=\mu_{3}=\mu_{4}=0$ and $b_{4}^{0}=b_{4}=b_{6}=0$ (no weak coupling through the control-average). The $\varepsilon_{N}$-Nash equilibrium given by Corollary 5.1 becomes

$$
\Big(\frac{1}{2}R_{0}^{-1}b_{2}^{0}S_{t}^{11}X_{t}^{0,N}+\frac{1}{2}R_{0}^{-1}b_{2}^{0}S_{t}^{21}\mathbb{E}_{t}[X_{t}^{1,N}],\ \ \ \frac{1}{2}R^{-1}b_{2}\Sigma_{t}X_{t}^{i,N}+k_{t}\Big),\ 1\leq i\leq N,
$$

where $k_{t}=\frac{1}{2}R^{-1}b_{2}\mathbb{E}_{t}\Big[\int_{t}^{T}C_{s}^{1}(X_{s}^{0,N},\mathbb{E}_{s}[X_{s}^{1,N}])^{\top}\Pi_{s}ds\Big]$ and $C_{t}^{1}=\Sigma_{t}(b_{3},b_{5})+(2Q\mu_{2},2Q\mu_{1})$. This result recovers Theorem 5.1 in [11] with a subtle difference: $\varepsilon_{N}=O(N^{-\frac{1}{d+4}})$ in [11], while our $\varepsilon_{N}=O(\frac{1}{\sqrt{N}})$. This is mainly due to the modeling differences in the weak couplings: [11] considers the empirical distribution while we focus on the more special empirical average. As a tradeoff, we can obtain an explicit expression for $k_{t}$ while [11] only shows its existence.
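To make the gap between the two rates concrete, the short sketch below evaluates both orders for a given population size. It is purely illustrative: all multiplicative constants are ignored, the function names are hypothetical, and the parameter `d` is taken as a positive integer whose precise meaning follows [11].

```python
import math


def rate_empirical_distribution(n: int, d: int) -> float:
    """Order N^{-1/(d+4)} quoted from [11], with constants dropped."""
    return n ** (-1.0 / (d + 4))


def rate_empirical_average(n: int) -> float:
    """Order N^{-1/2} obtained with empirical-average couplings, constants dropped."""
    return 1.0 / math.sqrt(n)


if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        print(n, rate_empirical_distribution(n, d=1), rate_empirical_average(n))
```

Already for `d = 1` the distribution-based order decays like $N^{-1/5}$, so the average-based order is markedly smaller for large $N$.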

5.2 Backward LQG-RMM

In this subsection, $Q_{0}=Q=0$, $f_{1}^{0}=f_{5}^{0}=f_{1,5,9}=0$, $\Phi_{1}^{0}=\Phi_{2}^{0}=\Phi_{1,2,3}=0$ in (5.2) and (5.3). The LQG-RMM is now solely “backward”, without the forward state (5.1). The backward LQG setting has found broad applications in areas such as optimal investment, recursive utility and hedging (e.g., [20]). Related MFG studies in this setting have also been well addressed (see [14, 22]). However, the backward LQG-RMM still seems novel in the literature. It can capture the insights of the large investor (see [16]) and relative performance ([14]), both of which are well motivated in financial studies. In this case, (5.8) reads as (noting that $S$ is $\mathbb{R}^{2\times 3}$-valued)

$$
\begin{aligned}
&\dot{S}_{t}+{S}_{t}(\mathbb{C}_{3}+\mathbb{C}_{4}+\rho\mathbb{C}_{7}+\rho\mathbb{C}_{8})+\widetilde{\mathbb{C}}S_{t}+S_{t}\widetilde{\mathbb{B}}S_{t}+(I+S_{t}\rho)(\mathbb{F}_{1}+\mathbb{F}_{2})(I+S_{t}\rho)^{-1}\cdot\\
&\big(S_{t}(\mathbb{C}_{5}+\mathbb{C}_{6})+S_{t}(\mathbb{C}_{5}+\mathbb{C}_{6})\rho S_{t}\big)+\mathbb{C}_{7}+\mathbb{C}_{8}=0,\ {S}_{T}=0,
\end{aligned} \tag{5.16}
$$

and the linear BSDE (5.9) now becomes

$$
\left\{\begin{aligned}
-d\Upsilon_{t}=&\Big[\Big(S_{t}\widetilde{\mathbb{B}}+\widetilde{\mathbb{C}}+(I+S_{t}\rho)(\mathbb{F}_{1}+\mathbb{F}_{2})(I+S_{t}\rho)^{-1}S_{t}(\mathbb{C}_{5}+\mathbb{C}_{6})\rho\Big)\Upsilon_{t}\\
&+(I+S_{t}\rho)(\mathbb{F}_{1}+\mathbb{F}_{2})(I+S_{t}\rho)^{-1}\upsilon_{t}\Big]dt-\upsilon_{t}dW_{t}^{0},\ \Upsilon_{T}=(\xi^{0},\mathbb{E}_{T}[\xi^{1}])^{\top},
\end{aligned}\right. \tag{5.17}
$$

where $\widetilde{\mathbb{C}}=\mathbb{C}_{7}\rho+\mathbb{C}_{8}\rho+\mathbb{D}_{1}+\mathbb{D}_{2}$, $\widetilde{\mathbb{B}}=(\mathbb{C}_{3}+\mathbb{C}_{4})\rho+\rho\widetilde{\mathbb{C}}$. Moreover, it follows from (5.11)-(5.12) that $\Sigma_{t}=p_{t}\equiv 0$. Besides, by (5.7) and Lemma 5.1, $\mathbf{L}_{t}=(L_{t}^{0},L_{t},L_{t}^{\ddagger})^{\top}$ satisfies

$$
d\left(\begin{array}{c}{L}_{t}^{0}\\ L_{t}\end{array}\right)=(\mathbb{D}_{1}+\mathbb{D}_{2})^{\top}\left(\begin{array}{c}{L}_{t}^{0}\\ L_{t}\end{array}\right)dt+(\mathbb{F}_{1}+\mathbb{F}_{2})^{\top}\left(\begin{array}{c}{L}_{t}^{0}\\ L_{t}\end{array}\right)dW_{t}^{0},\ \ dL_{t}^{\ddagger}=f_{2}L_{t}^{\ddagger}dt+f_{3}L_{t}^{\ddagger}dW_{t}^{0}, \tag{5.18}
$$

with the initial condition $\mathbf{L}_{0}=\rho(I+S_{0}\rho)^{-1}S_{0}\mathbf{L}_{0}+\rho(I+S_{0}\rho)^{-1}\Upsilon_{0}$. In particular, if $S_{0}=0$, then $\mathbf{L}_{0}=\rho\Upsilon_{0}$; and if $\det(S_{0}^{\top}S_{0})\neq 0$, then $\mathbf{L}_{0}=(S_{0}^{\top}S_{0})^{-1}S_{0}^{\top}(I+S_{0}\rho)S_{0}\rho(I+S_{0}\rho)^{-1}\Upsilon_{0}$. Since (A6) can be readily verified, the following result follows directly from Theorem 5.1.

Corollary 5.2.

Under (A5), assume $(S,\Upsilon,\upsilon)$ and $\mathbf{L}$ solve (5.16)-(5.17) and (5.18), respectively. Then an $\varepsilon_{N}$-Nash equilibrium $\big(M_{t}^{0},M_{t},\cdots,M_{t}\big)$ for the backward LQG-RMM is given by

\begin{align}
M_{t}^{0}=&\ {a_{3}^{0}}^{\top}\mathbf{L}_{t}={a_{3}^{0}}^{\top}\Big(e^{\mathbb{H}t+\mathbb{K}W_{t}^{0}}(L_{0}^{0},L_{0}),e^{(f_{2}-\frac{1}{2}|f_{3}|^{2})t+f_{3}W_{t}^{0}}L_{0}^{\ddagger}\Big)^{\top}, \tag{5.19}\\
M_{t}=&\ (a_{3}+a_{8})^{\top}\mathbf{L}_{t}=(a_{3}+a_{8})^{\top}\Big(e^{\mathbb{H}t+\mathbb{K}W_{t}^{0}}(L_{0}^{0},L_{0}),e^{(f_{2}-\frac{1}{2}|f_{3}|^{2})t+f_{3}W_{t}^{0}}L_{0}^{\ddagger}\Big)^{\top}, \notag
\end{align}

with $\mathbb{H}=(\mathbb{D}_{1}+\mathbb{D}_{2})^{\top}-\frac{1}{2}(\mathbb{F}_{1}+\mathbb{F}_{2})^{\top}(\mathbb{F}_{1}+\mathbb{F}_{2})$, $\mathbb{K}=(\mathbb{F}_{1}+\mathbb{F}_{2})^{\top}$.
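The last component of (5.19) is the familiar closed-form solution of the scalar linear SDE for $L^{\ddagger}$ in (5.18). As an illustration only, the sketch below simulates that scalar equation with an Euler-Maruyama scheme and compares the terminal value with the exponential formula along the same Brownian path. The numerical values of `f2`, `f3`, the initial value `l0`, the step count and the function name are all hypothetical choices.

```python
import math
import random


def simulate_l_ddagger(f2, f3, l0, T=1.0, steps=2000, seed=0):
    """Return (Euler-Maruyama value, closed-form value) of L at time T on one path."""
    rng = random.Random(seed)
    dt = T / steps
    w = 0.0          # running value of the common noise W^0
    l_euler = l0
    for _ in range(steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        # dL = f2 * L dt + f3 * L dW^0
        l_euler += f2 * l_euler * dt + f3 * l_euler * dw
        w += dw
    # Closed form: L_T = exp((f2 - f3^2 / 2) T + f3 W_T) * L_0
    l_exact = math.exp((f2 - 0.5 * f3 ** 2) * T + f3 * w) * l0
    return l_euler, l_exact


if __name__ == "__main__":
    print(simulate_l_ddagger(0.1, 0.3, 1.0))
```

The two values agree up to the discretisation error of the scheme, which shrinks as `steps` grows.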

We have the following observations on (5.19):
(1) Both $M^{0}$ and $M$ are $\mathbb{F}^{0}$-adapted, and depend on the common noise $W^{0}$ only through $e^{\mathbb{K}W_{t}^{0}}$ and $e^{f_{3}W_{t}^{0}}$. If the driver of the BSDE (5.3) is independent of the intensity state $z$ (namely, $f_{3}^{0}=f_{7}^{0}=f_{3,7,11}=0$, so $\mathbb{K}=0$), the equilibrium for each agent becomes deterministic.
(2) Unlike the forward LQG-RMM (Corollary 5.1), whose equilibrium is $\mathbb{F}^{i,0}$-adapted, the equilibrium $(M_{t})_{t\in[0,T]}$ in the backward case is $\mathbb{F}^{0}$-adapted; hence the idiosyncratic information driven by the individual noises $\{W^{i}\}_{i=1}^{N}$ plays no role in the equilibrium. This is mainly because the driver of BSDE (5.3) is now linear and independent of the principal intensity state $Z^{i,i}$.
(3) If the major agent is absent, the equilibrium for each (minor) agent becomes $M_{t}=(a_{3}+a_{8})^{\top}(0,0,L_{t}^{\ddagger})^{\top}$. Comparing with $M_{t}$ in (5.19), one can see that the term $(a_{3}+a_{8})^{\top}(L_{t}^{0},L_{t},0)^{\top}$ captures the influence of $\mathcal{A}_{0}$.

Remark 5.1.

(i) When $f_{3}^{0}=f_{3}+f_{11}$, $f_{7}^{0}=f_{7}=0$ ($\mathbb{K}^{\top}=f_{3}^{0}I$), (5.16) can be solved as

$$
S_{t}=-\Big[(0,I_{2})e^{\mathcal{B}(T-t)}(0,I)^{\top}\Big]^{-1}(0,I)e^{\mathcal{B}(T-t)}(I,0)^{\top},\ t\in[0,T], \tag{5.20}
$$

where $\mathcal{B}=\left(\begin{array}{cc}\widehat{\mathbb{C}}&\widetilde{\mathbb{B}}+f_{3}^{0}(\mathbb{C}_{5}+\mathbb{C}_{6})\rho\\ -(\mathbb{C}_{7}+\mathbb{C}_{8})&-\widetilde{\mathbb{C}}\end{array}\right)$, $\widehat{\mathbb{C}}=\mathbb{C}_{3}+\mathbb{C}_{4}+\rho(\mathbb{C}_{7}+\mathbb{C}_{8})+f_{3}^{0}(\mathbb{C}_{5}+\mathbb{C}_{6})$, and (A5) is satisfied when $\det\{(0,I)e^{\mathcal{B}t}(0,I)^{\top}\}>0$, $\forall\ t\in[0,T]$.

(ii) If $f_{3}^{0}=f_{7}^{0}=f_{3,7,11}=0$, $\mu^{0}_{2}=\mu_{3}=\mu_{4}=0$ and $f_{8}^{0}=f_{8}=f_{12}=0$, there is no weak coupling through the control-average, and the $\varepsilon_{N}$-Nash equilibrium is given by $M_{t}^{0}=-\frac{1}{2}\big(R_{0}^{-1}f_{4}^{0},R_{0}^{-1}f_{8}\big)e^{\mathbb{H}t}(l_{0}^{0},l_{0})^{\top}$, $M_{t}=-\frac{1}{2}R^{-1}f_{4}e^{f_{2}t}l_{0}^{\ddagger}$. Moreover, when the major agent is absent, the above result recovers that of [22] (Theorem 3.1).

We present two concrete examples with more explicit representations for $\big(M_{t}^{0},M_{t},\cdots,M_{t}\big)$.

Example 5.2.

Consider the following backward LQG-RMM problem: for $1\leq i\leq N$,

$$
J_{0}^{N}=-\frac{1}{2}|Y_{0}^{0}|^{2}-\frac{1}{2}\mathbb{E}\Big[\int_{0}^{T}|u_{t}^{0}-\mu^{0}_{2}\cdot u_{t}^{(N)}|^{2}dt\Big],\ \ J_{i}^{N}=-\frac{1}{2}|Y_{0}^{i}|^{2}-\frac{1}{2}\mathbb{E}\Big[\int_{0}^{T}|u_{t}^{i}-\mu_{3}\cdot u_{t}^{(N)}-\mu_{4}\cdot u_{t}^{0}|^{2}dt\Big],
$$
$$
\left\{\begin{aligned}
-dY_{t}^{0}=&\Big[Z_{t}^{0}+\big(u_{t}^{0}-\mu^{0}_{2}u_{t}^{(N)}\big)\Big]dt-Z_{t}^{0}dW_{t}^{0},\quad Y_{T}^{0}=\xi^{0},\\
-dY_{t}^{i}=&\Big[f_{2}Y_{t}^{i}+(1-f_{2})Z_{t}^{i,0}-f_{2}Y_{t}^{(N)}+f_{2}Z_{t}^{(N,0)}+\big(u_{t}^{i}-\mu_{4}u_{t}^{0}-\mu_{3}u_{t}^{(N)}\big)\Big]dt\\
&-Z_{t}^{i,0}dW_{t}^{0}-Z_{t}^{i,i}dW_{t}^{i},\quad Y_{T}^{i}=\xi^{i}.
\end{aligned}\right.
$$

Therefore, the major and each minor agent are weakly coupled through their control-average, along with an identical relative performance parameter. $\mathcal{B}$ in Remark 5.1 now reads as

$$
\mathcal{B}=\left(\begin{array}{ccccc}0&\mu_{4}&-\mu^{0}_{2}&0&-\mu^{0}_{2}\\ 0&1&0&0&0\\ 0&0&0&0&0\\ 1&-\mu_{4}&\mu^{0}_{2}&1&\mu^{0}_{2}\\ 0&0&1&0&1\end{array}\right).
$$

Since $\mathcal{B}^{n}=\mathcal{B}$ for $n\geq 1$, we have $\det\{(0,I)e^{\mathcal{B}t}(0,I)^{\top}\}=\det\left(\begin{array}{cc}e^{t}&\mu^{0}_{2}(e^{t}-1)\\ 0&e^{t}\end{array}\right)=e^{2t}>0$, $\forall\ t\in[0,T]$; hence (A5) holds. From (5.20),

$$
S_{t}=-\left(\begin{array}{ccc}1-e^{-(T-t)}&-\mu_{4}(1-e^{-(T-t)})&\mu^{0}_{2}e^{-(T-t)}(1-e^{-(T-t)})\\ 0&0&1-e^{-(T-t)}\end{array}\right),\ t\in[0,T].
$$
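The closed form of $S_{t}$ can be cross-checked numerically. The following sketch is illustrative only: the parameter values passed for $\mu_{4}$ and $\mu_{2}^{0}$ and the helper names are hypothetical. It confirms that $\mathcal{B}$ is idempotent, uses the resulting identity $e^{\mathcal{B}\tau}=I+(e^{\tau}-1)\mathcal{B}$ to evaluate (5.20), and compares the outcome with the displayed matrix.

```python
import math


def example_b(mu4, mu2):
    """The 5x5 matrix B of Example 5.2 (mu2 stands for mu_2^0)."""
    return [
        [0.0, mu4, -mu2, 0.0, -mu2],
        [0.0, 1.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0, 0.0],
        [1.0, -mu4, mu2, 1.0, mu2],
        [0.0, 0.0, 1.0, 0.0, 1.0],
    ]


def mat_mul(x, y):
    """Product of two small dense matrices stored as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def s_from_exponential(t, T, mu4, mu2):
    """Evaluate (5.20) with e^{B tau} = I + (e^tau - 1) B, valid since B^2 = B."""
    b = example_b(mu4, mu2)
    g = math.exp(T - t) - 1.0
    phi = [[(1.0 if i == j else 0.0) + g * b[i][j] for j in range(5)]
           for i in range(5)]
    # Lower-right 2x2 block and its inverse.
    p, q, r, s = phi[3][3], phi[3][4], phi[4][3], phi[4][4]
    det = p * s - q * r
    inverse = [[s / det, -q / det], [-r / det, p / det]]
    # Lower-left 2x3 block.
    lower_left = [phi[3][:3], phi[4][:3]]
    return [[-v for v in row] for row in mat_mul(inverse, lower_left)]


def s_closed_form(t, T, mu4, mu2):
    """The matrix S_t displayed in Example 5.2."""
    e = math.exp(-(T - t))
    return [[-(1 - e), mu4 * (1 - e), -mu2 * e * (1 - e)],
            [0.0, 0.0, -(1 - e)]]


if __name__ == "__main__":
    print(s_from_exponential(0.25, 1.0, 0.3, 0.7))
    print(s_closed_form(0.25, 1.0, 0.3, 0.7))
```

Both printed matrices coincide up to rounding, which supports the displayed expression for $S_{t}$.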

We calculate from (5.19) that, for $t\in[0,T]$,

$$
\begin{aligned}
M_{t}^{0}=&-(1-\mu_{3}-\mu^{0}_{2}\mu_{4})^{-1}e^{-\frac{1}{2}t+W_{t}^{0}}\mathbb{E}\Big[e^{-\frac{3}{2}T+W_{T}^{0}}\Big((1-\mu_{3})\xi^{0}+\mu^{0}_{2}(e^{f_{2}t}-1+e^{-T}+\mu_{3}-\mu_{3}e^{-T})\xi^{1}\Big)\Big],\\
M_{t}=&-(1-\mu_{3}-\mu^{0}_{2}\mu_{4})^{-1}e^{-\frac{1}{2}t+W_{t}^{0}}\mathbb{E}\Big[e^{-\frac{3}{2}T+W_{T}^{0}}\Big(\mu_{4}\xi^{0}+(e^{f_{2}t}-\mu_{4}\mu^{0}_{2}+\mu_{4}\mu^{0}_{2}e^{-T})\xi^{1}\Big)\Big].
\end{aligned}
$$
Remark 5.2.

(1) The equilibrium $(M^{0},M)$ depends linearly on the terminal conditions $(\xi^{0},\xi^{1})$, with coefficients determined by the parameters $(\mu_{3},\mu^{0}_{2},f_{2},T,\mu_{4})$. (2) When $\mu^{0}_{2}=0$, the major agent's payoff is independent of the control-average of all minors, so her Nash strategy $M^{0}$ no longer depends on $\xi^{1}$, but may still be influenced by the minor agents provided $\mu_{3}\neq 0$.

Example 5.3.

Consider the backward LQG-RMM problem with:

$$
J_{0}^{N}=-\gamma_{0}|Y_{0}^{0}|^{2}-\mathbb{E}\Big[\int_{0}^{T}R_{0}|u_{t}^{0}|^{2}dt\Big],\ \ J_{i}^{N}=-\gamma|Y_{0}^{i}|^{2}-\mathbb{E}\Big[\int_{0}^{T}R|u_{t}^{i}|^{2}dt\Big],\ 1\leq i\leq N,
$$
$$
\left\{\begin{aligned}
-dY_{t}^{0}=&\Big(f_{6}^{0}Y_{t}^{(N)}+f_{8}^{0}u_{t}^{(N)}\Big)dt-Z_{t}^{0}dW_{t}^{0},\quad Y_{T}^{0}=\xi^{0},\\
-dY_{t}^{i}=&\Big(f_{2}Y_{t}^{i}+f_{8}u_{t}^{0}-f_{2}Y_{t}^{(N)}+f_{12}u_{t}^{(N)}\Big)dt-Z_{t}^{i,0}dW_{t}^{0}-Z_{t}^{i,i}dW_{t}^{i},\ Y_{T}^{i}=\xi^{i},\ 1\leq i\leq N.
\end{aligned}\right.
$$

The major and each minor agent interact through the coupling $(Y^{(N)},u^{(N)})$. Moreover, each minor agent is also directly influenced by $u^{0}$, the major's control. $\mathcal{B}$ in Remark 5.1 becomes

$$
\mathcal{B}=\left(\begin{array}{ccccc}0&0&0&0&2\gamma_{0}f_{6}^{0}\\ f_{6}^{0}&0&0&2\gamma_{0}f_{6}^{0}&0\\ 0&0&0&0&0\\ 0&0&0&0&-f_{6}^{0}\\ 0&0&0&0&0\end{array}\right).
$$

Now, $\mathcal{B}^{2}=0$, and $\det\{(0,I_{2})e^{\mathcal{B}t}(0,I_{2})^{\top}\}=\det\left(\begin{array}{cc}1&-f_{6}^{0}t\\ 0&1\end{array}\right)=1>0$, $\forall\ t$, so (A5) holds. Moreover, $S_{t}\equiv 0$ by (5.20), and $M_{t}^{0}=-R_{0}^{-1}f_{8}f_{6}^{0}\gamma_{0}\Big(\mathbb{E}[\xi^{0}]+f_{6}^{0}T\mathbb{E}[\xi^{1}]\Big)t$, $M_{t}\equiv 0$ by (5.19).
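The step from $\mathcal{B}^{2}=0$ to $S_{t}\equiv 0$ can be spelled out as follows. This is a sketch that only uses the nilpotency stated above; the block notation $\mathcal{B}_{21}$ (lower-left $2\times 3$ block) and $\mathcal{B}_{22}$ (lower-right $2\times 2$ block) is introduced here for illustration.

```latex
\begin{aligned}
&\mathcal{B}^{2}=0\ \Longrightarrow\ e^{\mathcal{B}\tau}=I+\tau\mathcal{B},\\
&(0,I_{2})\,e^{\mathcal{B}\tau}\,(0,I_{2})^{\top}=I_{2}+\tau\mathcal{B}_{22}
=\left(\begin{array}{cc}1&-f_{6}^{0}\tau\\ 0&1\end{array}\right),\qquad
(0,I_{2})\,e^{\mathcal{B}\tau}\,(I_{3},0)^{\top}=\tau\mathcal{B}_{21}=0,\\
&S_{t}=-\big[I_{2}+(T-t)\mathcal{B}_{22}\big]^{-1}(T-t)\,\mathcal{B}_{21}=0,\qquad t\in[0,T].
\end{aligned}
```

The lower-left block vanishes because the last two rows of $\mathcal{B}$ have zero entries in their first three columns.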

6 Conclusions

This paper studies a new class of recursive major-minor (RMM) games featuring: (1) recursive functionals with nonlinear BSDE representations; (2) comprehensive and general weak couplings. We propose a novel structural scheme to construct its auxiliary problem, a key step towards the desired $\varepsilon$-Nash equilibrium. In the RMM context, this scheme consists of a bilateral perturbation and a mixed triple-agent leader-follower-Nash analysis. In contrast to the heuristic arguments in most of the MFG literature, this scheme lays down a unified game-theoretic foundation for analyzing more complex LP coupling structures, such as those with heterogeneous robust beliefs or with coalition interactions from nested information. We plan to address them in future work.

Appendix A Appendix

A.1 Proof of Theorem 4.1

For a fixed $N$, it suffices to verify the $\varepsilon_{N}$-Nash equilibrium property of $(u^{0,N},u^{1,N},\cdots,u^{N,N})$ on the side of $\mathcal{A}_{0}$. The verification on the side of $\{\mathcal{A}_{i}\}_{i=1}^{N}$ is analogous, so we omit the details. For this purpose, we consider the following limiting processes of (4.10) and (4.11):

{dXt0=b0(t,Xt0,ut0,;X¯ti,u¯ti,)dt+σ0(t,Xt0;X¯ti,u¯ti,)dWt0,X00=x00,dXti=b(t,Xti,uti,;Xt0,ut0,;X¯ti,u¯ti,)dt+σ(t,Xti,Xt0;X¯ti,u¯ti,)dWti,X0i=x0,dYt0=f0(t,Θt0,ut0,;Θ¯ti,u¯ti,)dtZt0dWt0,YT0=Φ0(XT0,X¯Ti)+ξ0,dYti=f(t,Θti,uti,;Θt0,ut0,;Θ¯ti,u¯ti,)dtZtid(W0,Wi)t,YTi=Φ(XTi,XT0,X¯Ti)+ξi,\left\{\begin{aligned} dX_{t}^{0}=&b^{0}(t,X_{t}^{0},u_{t}^{0,*};\overline{X}_{t}^{i},\overline{u}_{t}^{i,*})dt+{\sigma^{0}(t,{X}_{t}^{0};\overline{X}_{t}^{i},\overline{u}_{t}^{i,*})}dW^{0}_{t},\quad X_{0}^{0}=x_{0}^{0},\\ dX_{t}^{i}=&b(t,X_{t}^{i},{u}_{t}^{i,*};X_{t}^{0},u_{t}^{0,*};\overline{X}_{t}^{i},\overline{u}_{t}^{i,*})dt+\sigma(t,X_{t}^{i},X_{t}^{0};\overline{X}_{t}^{i},\overline{u}_{t}^{i,*})dW_{t}^{i},\quad X_{0}^{i}=x_{0},\\ -dY_{t}^{0}=&f^{0}(t,\Theta_{t}^{0},u_{t}^{0,*};\overline{\Theta}_{t}^{i},\overline{u}_{t}^{i,*})dt-Z_{t}^{0}dW_{t}^{0},\quad Y_{T}^{0}=\Phi^{0}(X_{T}^{0},\overline{X}_{T}^{i})+\xi^{0},\\ -dY_{t}^{i}=&f(t,\Theta_{t}^{i},{u}_{t}^{i,*};\Theta_{t}^{0},u_{t}^{0,*};\overline{\Theta}_{t}^{i},\overline{u}_{t}^{i,*})dt-Z_{t}^{i}d(W^{0},W^{i})_{t},\quad Y_{T}^{i}=\Phi(X_{T}^{i},X_{T}^{0},\overline{X}_{T}^{i})+\xi^{i},\\ \end{aligned}\right.

and $J_{0}=\Gamma^{0}(Y_{0}^{0})+\mathbb{E}[\int_{0}^{T}g^{0}(t,\Theta_{t}^{0},{u}_{t}^{0,*};\overline{\Theta}_{t}^{i},\overline{u}_{t}^{i,*})dt]$, where $u_{t}^{0,*}=\Psi^{0}(t,X_{t}^{0},X_{t}^{1})$, $u_{t}^{i,*}=\Psi(t,X_{t}^{0};X_{t}^{i})$. By the Burkholder-Davis-Gundy inequality and standard convergence estimates of SDEs (e.g., Theorem 10.1.7 in [35]), for $0\leq s\leq T$, $j=0,1,\cdots,N$, we have

$$
\begin{aligned}
&\mathbb{E}[\sup_{0\leq t\leq s}|X_{t}^{j,N}-X_{t}^{j}|^{2}]\leq C\int_{0}^{s}\sum_{k=0,1,i}\mathbb{E}[\sup_{0\leq t\leq r}|X_{t}^{k,N}-X_{t}^{k}|^{2}]dr\\
&+C\int_{0}^{s}\mathbb{E}\Big[\Big|\frac{1}{N}\sum_{i=1}^{N}X_{r}^{i,N}-\mathbb{E}_{r}[X_{r}^{i}]\Big|^{2}\Big]dr+C\int_{0}^{s}\mathbb{E}\Big[\Big|\frac{1}{N}\sum_{i=1}^{N}\Psi(r,X_{r}^{0,N},X_{r}^{i,N})-\mathbb{E}_{r}[\Psi(r,X_{r}^{0},X_{r}^{i})]\Big|^{2}\Big]dr\\
&\leq C\int_{0}^{s}\mathbb{E}[\sup_{0\leq t\leq r}|X_{t}^{0,N}-X_{t}^{0}|^{2}]dr+C\int_{0}^{s}\sup_{1\leq i\leq N}\mathbb{E}[\sup_{0\leq t\leq r}|X_{t}^{i,N}-X_{t}^{i}|^{2}]dr+\frac{C}{N}.
\end{aligned}
$$

It follows from Gronwall's inequality that

$$
\sup_{0\leq i\leq N}\mathbb{E}[\sup_{0\leq t\leq T}|X_{t}^{i,N}-X_{t}^{i}|^{2}]\leq\frac{C}{N}. \tag{A.1}
$$
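The $\frac{C}{N}$ order in (A.1) reflects the mean-square error of an empirical average of (conditionally) independent quantities. The following Monte Carlo sketch illustrates that scaling in a deliberately simplified setting, which is an assumption of this illustration and not the setting of the proof: the "states" are i.i.d. uniform samples on $[0,1]$ rather than SDE solutions, and the function name and trial count are hypothetical.

```python
import random


def mean_square_error(n_agents, n_trials=2000, seed=1):
    """Monte Carlo estimate of E|(1/N) sum_i X_i - E[X]|^2 for X_i ~ Uniform(0, 1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        average = sum(rng.random() for _ in range(n_agents)) / n_agents
        total += (average - 0.5) ** 2
    return total / n_trials


if __name__ == "__main__":
    mse_small = mean_square_error(100, seed=1)
    mse_large = mean_square_error(400, seed=2)
    # Quadrupling N should divide the error by roughly four (order 1/N).
    print(mse_small, mse_large, mse_small / mse_large)
```

Taking square roots gives the $\frac{1}{\sqrt{N}}$ order that appears in the estimates of the functionals.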

Applying Itô’s formula to $|Y_{t}^{0,N}-Y_{t}^{0}|^{2}$ and $|Y_{t}^{i,N}-Y_{t}^{i}|^{2}$, we have

𝔼[|Ytj,NYtj|2+tT|Zsj,NZsj|2𝑑s]CtT𝔼[|Ysj,NYsj|2]𝑑s+C𝔼[|XTj,NXTj|2]+Cj𝔼[|XT0,NXT0|2]\displaystyle\mathbb{E}\big{[}|Y_{t}^{j,N}-Y_{t}^{j}|^{2}+\int_{t}^{T}|Z_{s}^{j,N}-Z_{s}^{j}|^{2}ds\big{]}\leq C\int_{t}^{T}\mathbb{E}[|Y_{s}^{j,N}-Y_{s}^{j}|^{2}]ds+C\mathbb{E}[|X_{T}^{j,N}-X_{T}^{j}|^{2}]+C_{j}\mathbb{E}[|X_{T}^{0,N}-X_{T}^{0}|^{2}]
+C𝔼[|1Ni=1NXTi,N𝔼T[XTi]|2]+CtTk=0,1,i𝔼[|Xsk,NXsk|2]ds+Cj𝔼[tT(|Ys0,NYs0|2+|Zs0,NZs0|2)𝑑s]\displaystyle+C\mathbb{E}\big{[}\big{|}\frac{1}{N}\sum_{i=1}^{N}X_{T}^{i,N}-\mathbb{E}_{T}[X_{T}^{i}]\big{|}^{2}\big{]}+C\int_{t}^{T}\sum_{k=0,1,i}\mathbb{E}[|X_{s}^{k,N}-X_{s}^{k}|^{2}]ds+C_{j}\mathbb{E}\big{[}\int_{t}^{T}(|Y_{s}^{0,N}-Y_{s}^{0}|^{2}+|Z_{s}^{0,N}-Z_{s}^{0}|^{2})ds\big{]}
+CtT𝔼[|1Ni=1NXsi,N𝔼s[Xsi]|2]𝑑s+CtT𝔼[|1Ni=1NΨ(r,Xr0,N,Xri,N)𝔼r[Ψ(r,Xr0,Xri)]|2]𝑑r\displaystyle+C\int_{t}^{T}\mathbb{E}\Big{[}\Big{|}\frac{1}{N}\sum_{i=1}^{N}X_{s}^{i,N}-\mathbb{E}_{s}[X_{s}^{i}]\Big{|}^{2}\Big{]}ds+C\int_{t}^{T}\mathbb{E}\Big{[}\Big{|}\frac{1}{N}\sum_{i=1}^{N}\Psi(r,X_{r}^{0,N},X_{r}^{i,N})-\mathbb{E}_{r}[\Psi(r,X_{r}^{0},X_{r}^{i})]\Big{|}^{2}\Big{]}dr (A.2)
+CtT𝔼[|1Ni=1NYsi,N𝔼s[Ysi]|2]𝑑s+CtT𝔼[|1Ni=1NZsi,N𝔼s[Zsi]|2]𝑑s\displaystyle+C\int_{t}^{T}\mathbb{E}\Big{[}\Big{|}\frac{1}{N}\sum_{i=1}^{N}Y_{s}^{i,N}-\mathbb{E}_{s}[Y_{s}^{i}]\Big{|}^{2}\Big{]}ds+C\int_{t}^{T}\mathbb{E}\Big{[}\Big{|}\frac{1}{N}\sum_{i=1}^{N}Z_{s}^{i,N}-\mathbb{E}_{s}[Z_{s}^{i}]\Big{|}^{2}\Big{]}ds
CtT𝔼[|Ysi,NYsi|2]𝑑s+C𝔼[|XTj,NXTj|2]+Cj𝔼[|XT0,NXT0|2]+CtTk=0,1,i𝔼[|Xsk,NXsk|2]ds\displaystyle\leq C\int_{t}^{T}\mathbb{E}[|Y_{s}^{i,N}-Y_{s}^{i}|^{2}]ds+C\mathbb{E}[|X_{T}^{j,N}-X_{T}^{j}|^{2}]+C_{j}\mathbb{E}[|X_{T}^{0,N}-X_{T}^{0}|^{2}]+C\int_{t}^{T}\sum_{k=0,1,i}\mathbb{E}[|X_{s}^{k,N}-X_{s}^{k}|^{2}]ds
+Cj𝔼[tT(|Ys0,NYs0|2+|Zs0,NZs0|2)𝑑s]+CN,\displaystyle+C_{j}\mathbb{E}\big{[}\int_{t}^{T}(|Y_{s}^{0,N}-Y_{s}^{0}|^{2}+|Z_{s}^{0,N}-Z_{s}^{0}|^{2})ds\big{]}+\frac{C}{N},

where $C_{0}=0$ and $\{C_{j}\}_{j=1}^{N}$ are positive constants. Combining (A.1) and (A.2),

$$
\sup_{0\leq i\leq N}\Big\{\sup_{0\leq t\leq T}\mathbb{E}[|Y_{t}^{i,N}-Y_{t}^{i}|^{2}]+\mathbb{E}[\int_{0}^{T}|Z_{t}^{i,N}-Z_{t}^{i}|^{2}dt]\Big\}\leq\frac{C}{N}.
$$

As for the functionals,

$$
\begin{aligned}
&|J^{N}_{0}-J_{0}|\leq|\Gamma^{0}(Y_{0}^{0,N})-\Gamma^{0}(Y_{0}^{0})|+\mathbb{E}\Big[\int_{0}^{T}\Big|g^{0}\Big(t,\Theta_{t}^{0,N},\Psi^{0}(t,X_{t}^{0,N},X_{t}^{1,N});\Theta_{t}^{(N)},u_{t}^{(N)}\Big)\\
&\quad-g^{0}\Big(t,\Theta_{t}^{0},\Psi^{0}(t,X_{t}^{0},X_{t}^{1});\overline{\Theta}_{t}^{i},\mathbb{E}_{t}\big[\Psi(t,X_{t}^{0},X_{t}^{i})\big]\Big)\Big|dt\Big]\\
&\leq C(1+|Y_{0}^{0,N}|+|Y_{0}^{0}|)\cdot|Y_{0}^{0,N}-Y_{0}^{0}|\\
&\quad+C\mathbb{E}\Big[\int_{0}^{T}\big(1+|\Theta_{t}^{0,N}|+|\Theta_{t}^{0}|+|\Theta_{t}^{(N)}|+|\overline{\Theta}_{t}^{i}|+|u_{t}^{(N)}|+|\mathbb{E}_{t}[\Psi(t,X_{t}^{0},X_{t}^{1})]|\big)\\
&\quad\cdot\big(|\Theta_{t}^{0,N}-\Theta_{t}^{0}|+|\Theta_{t}^{(N)}-\overline{\Theta}_{t}^{i}|+|u_{t}^{(N)}-\mathbb{E}_{t}[\Psi(t,X_{t}^{0},X_{t}^{1})]|\big)dt\Big]\\
&\leq C(1+|Y_{0}^{0,N}|^{2}+|Y_{0}^{0}|^{2})\cdot|Y_{0}^{0,N}-Y_{0}^{0}|^{2}\\
&\quad+C\Big(\mathbb{E}\Big[\int_{0}^{T}\big(1+|\Theta_{t}^{0,N}|^{2}+|\Theta_{t}^{0}|^{2}+|\Theta_{t}^{(N)}|^{2}+|\overline{\Theta}_{t}^{i}|^{2}+|u_{t}^{(N)}|^{2}+|\Psi(t,X_{t}^{0},X_{t}^{1})|^{2}\big)dt\Big]\Big)^{\frac{1}{2}}\\
&\quad\cdot\Big(\mathbb{E}\Big[\int_{0}^{T}\big(|\Theta_{t}^{0,N}-\Theta_{t}^{0}|^{2}+|\Theta_{t}^{(N)}-\overline{\Theta}_{t}^{i}|^{2}+|u_{t}^{(N)}-\mathbb{E}_{t}[\Psi(t,X_{t}^{0},X_{t}^{1})]|^{2}\big)dt\Big]\Big)^{\frac{1}{2}}\leq\frac{C}{\sqrt{N}}.
\end{aligned}
$$

We study the unilateral deviation of $\mathcal{A}_{0}$ from the strategy $u_{t}^{0,N}=\Psi^{0}(t,X_{t}^{0,N},X_{t}^{1,N})$. Assume now that $\mathcal{A}_{0}$ adopts a different control $u^{0}\in\mathcal{U}_{d}^{0}$ while $\{\mathcal{A}_{i}\}_{i=1}^{N}$ continue to apply $\{u_{t}^{i,N}\}_{i=1}^{N}$. The resulting perturbed states, denoted by $(\widehat{X}_{t}^{i,N})_{i\geq 0}$, satisfy

{dX^t0,N=b0(t,X^t0,N,ut0;X^t(N),ut(N))dt+σ0(t,X^t0,N;X^t(N),ut(N))dWt0,X^00,N=x00,dX^ti,N=b(t,X^ti,N,uti,,N;X^t0,N,ut0;X^t(N),ut(N))dt+σ(t,X^ti,N,X^t0,N;X^t(N),ut(N))dWti,X^0i,N=x0,\left\{\begin{aligned} d\widehat{X}_{t}^{0,N}=&b^{0}\big{(}t,\widehat{X}_{t}^{0,N},u^{0}_{t};\widehat{X}_{t}^{(N)},{u_{t}^{(N)}}\big{)}dt+{\sigma^{0}\big{(}t,\widehat{X}_{t}^{0,N};\widehat{X}_{t}^{(N)},{u_{t}^{(N)}}\big{)}}dW^{0}_{t},\quad\widehat{X}_{0}^{0,N}=x_{0}^{0},\\ d\widehat{X}_{t}^{i,N}=&b\big{(}t,\widehat{X}_{t}^{i,N},u_{t}^{i,*,N};\widehat{X}_{t}^{0,N},u_{t}^{0};\widehat{X}_{t}^{(N)},u_{t}^{(N)}\big{)}dt+\sigma\big{(}t,\widehat{X}_{t}^{i,N},\widehat{X}_{t}^{0,N};\widehat{X}_{t}^{(N)},u_{t}^{(N)}\big{)}dW_{t}^{i},\quad\widehat{X}_{0}^{i,N}=x_{0},\\ \end{aligned}\right.

and the related limiting processes are given by

{dX^t0=b0(t,X^t0,ut0;𝔼t[X^ti],u¯ti,N)dt+σ0(t,X^t0;𝔼t[X^ti],u¯ti,N)dWt0,X^00=x00,dX^ti=b(t,X^ti,uti,N;X^t0,ut0;𝔼t[X^ti],u¯ti,N)dt+σ(t,X^ti,X^t0;𝔼t[X^ti],u¯ti,N)dWti,X^0i=x0.\left\{\begin{aligned} d\widehat{X}_{t}^{0}=&b^{0}\big{(}t,\widehat{X}_{t}^{0},u^{0}_{t};\mathbb{E}_{t}[\widehat{X}_{t}^{i}],\overline{u}_{t}^{i,N}\big{)}dt+{\sigma^{0}\big{(}t,\widehat{X}_{t}^{0};\mathbb{E}_{t}[\widehat{X}_{t}^{i}],\overline{u}_{t}^{i,N}\big{)}}dW^{0}_{t},\quad\widehat{X}_{0}^{0}=x_{0}^{0},\\ d\widehat{X}_{t}^{i}=&b\big{(}t,\widehat{X}_{t}^{i},{u}_{t}^{i,N};\widehat{X}_{t}^{0},u_{t}^{0};\mathbb{E}_{t}[\widehat{X}_{t}^{i}],\overline{u}_{t}^{i,N}\big{)}dt+\sigma\big{(}t,\widehat{X}_{t}^{i},\widehat{X}_{t}^{0};\mathbb{E}_{t}[\widehat{X}_{t}^{i}],\overline{u}_{t}^{i,N}\big{)}dW_{t}^{i},\quad\widehat{X}_{0}^{i}=x_{0}.\end{aligned}\right.

Similar to (A.1), $\sup_{0\leq i\leq N}\mathbb{E}[\sup_{0\leq t\leq T}|\widehat{X}_{t}^{i,N}-\widehat{X}_{t}^{i}|^{2}]\leq\frac{C}{N}$. By the same estimates as for $|J^{N}_{0}-J_{0}|$ above,

$$
|\widehat{J}^{N}_{0}-\widehat{J}_{0}|\leq\frac{C}{\sqrt{N}}. \tag{A.3}
$$

Since $\big(u^{0,*}_{t},u^{1,*}_{t},u_{t}^{j,*}\big)$ is an equilibrium strategy of the limiting triple-agent game problem, it is clear that $\widehat{J}_{0}\leq{J}_{0}$. Combining the above estimate of $|J^{N}_{0}-J_{0}|$ with (A.3), we get the desired result for $\mathcal{A}_{0}$.

A.2 Some notations in Section 5

1. Constants in (5.5): $a=\frac{1}{2}(1-\mu_{3}-\mu^{0}_{2}\mu_{4})^{-1}$ and

$$
\begin{array}{ll}
{a_{1}^{0}}^{\top}=a\big((1-\mu_{3})R_{0}^{-1}b_{2}^{0},(1-\mu_{3})R_{0}^{-1}b_{4},\mu^{0}_{2}R^{-1}b_{2}\big),&{a_{3}^{0}}^{\top}=-a\big((1-\mu_{3})R_{0}^{-1}f_{4}^{0},(1-\mu_{3})R_{0}^{-1}f_{8},\mu^{0}_{2}R^{-1}f_{4}\big),\\
a_{1}^{\top}=a\big(\mu_{4}R_{0}^{-1}b_{2}^{0},\mu_{4}R_{0}^{-1}b_{4},(\mu_{3}+\mu_{4}\mu^{0}_{2})R^{-1}b_{2}\big),&a_{3}^{\top}=-\big(a\mu_{4}R_{0}^{-1}f_{4}^{0},a\mu_{4}R_{0}^{-1}f_{8},(a-\frac{1}{2})R^{-1}f_{4}\big),\\
a_{7}^{\top}=\frac{1}{2}R^{-1}b_{2}(0,0,1),&a_{8}^{\top}=-\frac{1}{2}R^{-1}f_{4}(0,0,1).
\end{array}
$$

2. Matrices in equation (5.6):

\[
\begin{gathered}
\mathbb{A}_{1}=\begin{pmatrix} b_{1}^{0}&0\\ b_{3}&b_{1}\end{pmatrix},\quad
\mathbb{A}_{2}=\begin{pmatrix} 0&b_{3}^{0}\\ 0&b_{5}\end{pmatrix},\quad
\mathbb{A}_{3}=\begin{pmatrix} f_{1}^{0}&0\\ f_{5}&f_{1}\end{pmatrix},\quad
\mathbb{A}_{4}=\begin{pmatrix} 0&f_{5}^{0}\\ 0&f_{9}\end{pmatrix},\quad
\mathbb{A}_{5}=\begin{pmatrix} -2Q_{0}&0\\ 2\mu^{0}_{1}Q_{0}&0\\ 2\mu_{2}Q&-2Q\end{pmatrix},\\
\mathbb{A}_{6}=\begin{pmatrix} 0&2\mu^{0}_{1}Q_{0}\\ 0&-2|\mu^{0}_{1}|^{2}Q_{0}\\ 0&2\mu_{1}Q\end{pmatrix},\quad
\mathbb{A}_{7}=\begin{pmatrix} \Phi_{1}^{0}&0\\ \Phi_{2}&\Phi_{1}\end{pmatrix},\quad
\mathbb{A}_{8}=\begin{pmatrix} 0&\Phi_{2}^{0}\\ 0&\Phi_{3}\end{pmatrix},\quad
\mathbb{B}_{1}=\begin{pmatrix} 0\\ b_{2}a_{7}^{\top}\end{pmatrix},\\
\mathbb{B}_{2}=\begin{pmatrix} b_{2}^{0}{a_{1}^{0}}^{\top}+b_{4}^{0}a_{1}^{\top}+b_{4}^{0}a_{7}^{\top}\\ (b_{2}+b_{6})a_{1}^{\top}+b_{4}{a_{1}^{0}}^{\top}+b_{6}a_{7}^{\top}\end{pmatrix},\quad
\mathbb{B}_{3}=\begin{pmatrix} 0\\ f_{4}a_{7}^{\top}\end{pmatrix},\quad
\mathbb{B}_{4}=\begin{pmatrix} f_{4}^{0}{a_{1}^{0}}^{\top}+f_{8}^{0}(a_{1}+a_{7})^{\top}\\ f_{4}a_{1}^{\top}+f_{8}{a_{1}^{0}}^{\top}+f_{12}(a_{1}+a_{7})^{\top}\end{pmatrix},\\
\mathbb{B}_{5}=\begin{pmatrix} b_{1}^{0}&b_{3}&0\\ 0&b_{1}&0\\ 0&0&b_{1}\end{pmatrix},\quad
\mathbb{B}_{6}=\begin{pmatrix} 0&0&0\\ b_{3}^{0}&b_{5}&0\\ 0&0&0\end{pmatrix},\quad
\mathbb{C}_{1}=\begin{pmatrix} 0\\ b_{2}a_{8}^{\top}\end{pmatrix},\quad
\mathbb{C}_{2}=\begin{pmatrix} b_{2}^{0}{a_{3}^{0}}^{\top}+b_{4}^{0}a_{3}^{\top}+b_{4}^{0}a_{8}^{\top}\\ (b_{2}+b_{6})a_{3}^{\top}+b_{4}{a_{3}^{0}}^{\top}+b_{6}a_{8}^{\top}\end{pmatrix},\\
\mathbb{C}_{3}=\begin{pmatrix} f_{2}^{0}&f_{6}&0\\ 0&f_{2}&0\\ 0&0&f_{2}\end{pmatrix},\quad
\mathbb{C}_{4}=\begin{pmatrix} 0&0&0\\ f_{6}^{0}&f_{10}&0\\ 0&0&0\end{pmatrix},\quad
\mathbb{C}_{5}=\begin{pmatrix} f_{3}^{0}&f_{7}&0\\ 0&f_{3}&0\\ 0&0&f_{3}\end{pmatrix},\quad
\mathbb{C}_{6}=\begin{pmatrix} 0&0&0\\ f_{7}^{0}&f_{11}&0\\ 0&0&0\end{pmatrix},\\
\mathbb{C}_{7}=\begin{pmatrix} 0\\ f_{4}a_{8}^{\top}\end{pmatrix},\quad
\mathbb{C}_{8}=\begin{pmatrix} f_{4}^{0}{a_{3}^{0}}^{\top}+f_{8}^{0}(a_{3}+a_{8})^{\top}\\ f_{4}a_{3}^{\top}+f_{8}{a_{3}^{0}}^{\top}+f_{12}(a_{3}+a_{8})^{\top}\end{pmatrix},\quad
\mathbb{C}_{9}=\begin{pmatrix} -f_{1}^{0}&-f_{5}&0\\ 0&-f_{1}&0\\ 0&0&-f_{1}\end{pmatrix},\\
\mathbb{C}_{10}=\begin{pmatrix} 0&0&0\\ -f_{5}^{0}&-f_{9}&0\\ 0&0&0\end{pmatrix},\quad
\mathbb{C}_{11}=\begin{pmatrix} -\Phi_{1}^{0}&-\Phi_{2}&0\\ 0&-\Phi_{1}&0\\ 0&0&-\Phi_{1}\end{pmatrix},\quad
\mathbb{C}_{12}=\begin{pmatrix} 0&0&0\\ -\Phi_{2}^{0}&-\Phi_{3}&0\\ 0&0&0\end{pmatrix},\\
\mathbb{D}_{1}=\begin{pmatrix} f_{2}^{0}&0\\ f_{6}&f_{2}\end{pmatrix},\quad
\mathbb{D}_{2}=\begin{pmatrix} 0&f_{6}^{0}\\ 0&f_{10}\end{pmatrix},\quad
\mathbb{F}_{1}=\begin{pmatrix} f_{3}^{0}&0\\ f_{7}&f_{3}\end{pmatrix},\quad
\mathbb{F}_{2}=\begin{pmatrix} 0&f_{7}^{0}\\ 0&f_{11}\end{pmatrix},\quad
\rho=\begin{pmatrix} 2\gamma_{0}&0\\ 0&0\\ 0&2\gamma\end{pmatrix}.
\end{gathered}
\]

3. The $5\times 5$ block matrices in equations (5.8) and (5.9):

\[
\begin{gathered}
\mathbb{H}_{1}=\begin{pmatrix} \mathbb{A}_{1}+\mathbb{A}_{2}&\mathbb{C}_{1}+\mathbb{C}_{2}\\ \rho(\mathbb{A}_{3}+\mathbb{A}_{4})&\rho(\mathbb{C}_{7}+\mathbb{C}_{8})\end{pmatrix},\quad
\mathbb{H}_{2}=\begin{pmatrix} (\mathbb{C}_{7}+\mathbb{C}_{8})\rho+\mathbb{D}_{1}+\mathbb{D}_{2}&\mathbb{B}_{3}+\mathbb{B}_{4}\\ (\mathbb{C}_{9}+\mathbb{C}_{10})\rho&\mathbb{B}_{5}+\mathbb{B}_{6}\end{pmatrix},\\
\mathbb{H}_{3}=\begin{pmatrix} (\mathbb{C}_{1}+\mathbb{C}_{2})\rho&\mathbb{B}_{1}+\mathbb{B}_{2}\\ (\mathbb{C}_{3}+\mathbb{C}_{4})\rho+\rho(\mathbb{C}_{7}+\mathbb{C}_{8})\rho+\rho(\mathbb{D}_{1}+\mathbb{D}_{2})&\rho(\mathbb{B}_{3}+\mathbb{B}_{4})\end{pmatrix},\quad
\mathbb{H}_{4}=\begin{pmatrix} \mathbb{F}_{1}+\mathbb{F}_{2}&0\\ 0&0\end{pmatrix},\\
\mathbb{H}_{5}=\begin{pmatrix} 0&0\\ 0&\mathbb{C}_{5}+\mathbb{C}_{6}\end{pmatrix},\quad
\mathbb{H}_{6}=\begin{pmatrix} 0&0\\ (\mathbb{C}_{5}+\mathbb{C}_{6})\rho&0\end{pmatrix},\quad
\mathbb{H}_{7}=\begin{pmatrix} \mathbb{A}_{3}+\mathbb{A}_{4}&\mathbb{C}_{7}+\mathbb{C}_{8}\\ \mathbb{A}_{5}+\mathbb{A}_{6}&\mathbb{C}_{9}+\mathbb{C}_{10}\end{pmatrix},\\
\mathbb{G}_{1}=\begin{pmatrix} 0&0\\ (\mathbb{C}_{11}+\mathbb{C}_{12})\rho&0\end{pmatrix},\quad
\mathbb{G}_{2}=\begin{pmatrix} \mathbb{A}_{7}+\mathbb{A}_{8}&0\\ 0&\mathbb{C}_{11}+\mathbb{C}_{12}\end{pmatrix},\quad
\widetilde{\rho}=\begin{pmatrix} 0&0\\ \rho&0\end{pmatrix}.
\end{gathered}
\]

4. Constant vectors in equation (5.12):

\[
\begin{aligned}
\mathbf{\Lambda}_{1}=&\ \big(b_{3},\ b_{5},\ -[(b_{2}+b_{6})\mu_{4}+b_{4}(1-\mu_{3})]aR_{0}^{-1}f_{4}^{0},\ -[(b_{2}+b_{6})\mu_{4}+b_{4}(1-\mu_{3})]aR_{0}^{-1}f_{8},\ -[b_{2}+b_{4}\mu^{0}_{2}+b_{6}]aR^{-1}f_{4}\big),\\
\mathbf{\Lambda}_{2}=&\ \big(0,\ 0,\ \mathbf{\Lambda}_{3}\big),\qquad \mathbf{\Lambda}_{4}=(2Q\mu_{2},\ 2Q\mu_{1},\ 0,\ 0,\ -f_{1}),\\
\mathbf{\Lambda}_{3}=&\ \big([b_{2}\mu_{4}+b_{4}(1-\mu_{3})+b_{6}\mu_{4}]aR_{0}^{-1}b_{2}^{0},\ [b_{2}\mu_{4}+b_{4}(1-\mu_{3})+b_{6}\mu_{4}]aR_{0}^{-1}b_{4},\ [b_{2}(\mu_{3}+\mu_{4}\mu^{0}_{2})+b_{4}\mu^{0}_{2}+b_{6}]aR^{-1}b_{2}\big).
\end{aligned}
\]

A.3 Proof of Proposition 5.1

The uniqueness part follows by standard arguments, so we focus only on the existence part. By Lemma 6.1, we can substitute the solution $(\overline{\mathbf{X}}_{t},\overline{\mathbf{L}}_{t})^{\top}=(\overline{X}^{0}_{t},\overline{X}^{1}_{t},\overline{L}_{t}^{0},\overline{L}_{t},\overline{L}^{\ddagger}_{t})^{\top}$, $(\overline{\mathbf{Y}}_{t},\overline{\mathbf{P}}_{t})^{\top}=(\overline{Y}_{t}^{0},\overline{Y}_{t}^{1},\overline{P}^{0}_{t},\overline{P}_{t},\overline{P}^{\ddagger}_{t})^{\top}$, $(\overline{\mathbf{Z}}_{t},\overline{\mathbf{Q}}_{t})^{\top}=(\overline{Z}_{t}^{0},\overline{Z}_{t}^{1},\overline{Q}^{00}_{t},\overline{Q}^{10}_{t},\overline{Q}^{10,\ddagger}_{t})^{\top}$ of (5.7) into (5.6). A solution of (5.6) can then be constructed in the following three steps.

Step 1. The construction of $(X^{0},Y^{0},Z^{0})$ and $(L^{0},L,L^{\ddagger};P^{0},Q^{00},Q^{01};P,Q^{10},Q^{11})$.

The triple $(X^{0},Y^{0},Z^{0})$, if it exists, should be $\mathbb{F}^{0}$-adapted, so we define $X^{0}_{t}=\overline{X}^{0}_{t}$, $Y^{0}_{t}=\overline{Y}^{0}_{t}$, $Z^{0}_{t}=\overline{Z}^{0}_{t}$. From (5.6), the adjoint equations (except those of $(P^{\ddagger},Q^{10,\ddagger},Q^{11,\ddagger})$) satisfy

\[
\begin{gathered}
\left\{\begin{aligned}
dL_{t}^{0}=&\ [f_{2}^{0}L_{t}^{0}+f_{6}L_{t}]dt+[f_{3}^{0}L_{t}^{0}+f_{7}L_{t}]dW^{0}_{t},\quad L_{0}^{0}=2\gamma_{0}Y_{0}^{0},\\
dL_{t}=&\ \big[f_{2}L_{t}+f_{10}\overline{L}_{t}+f_{6}^{0}\overline{L}_{t}^{0}\big]dt+\big[f_{3}L_{t}+f_{11}\overline{L}_{t}+f_{7}^{0}\overline{L}_{t}^{0}\big]dW_{t}^{0},\quad L_{0}=0,\\
dL_{t}^{\ddagger}=&\ f_{2}L_{t}^{\ddagger}dt+f_{3}L_{t}^{\ddagger}dW_{t}^{0},\quad L_{0}^{\ddagger}=2\gamma\overline{Y}_{0}^{1},
\end{aligned}\right.\\
\left\{\begin{aligned}
-dP_{t}^{0}=&\ \Big[b_{1}^{0}P_{t}^{0}+b_{3}P_{t}-f_{1}^{0}L_{t}^{0}-f_{5}L_{t}-2Q_{0}\big(X_{t}^{0}-\mu^{0}_{1}\overline{X}_{t}^{1}\big)\Big]dt-Q_{t}^{00}dW^{0}_{t}-Q_{t}^{01}dW_{t}^{1},\\
-dP_{t}=&\ \Big[b_{1}P_{t}-f_{1}L_{t}+b_{5}\overline{P}_{t}+b_{3}^{0}\overline{P}_{t}^{0}-f_{5}^{0}\overline{L}_{t}^{0}-f_{9}\overline{L}_{t}+2\mu^{0}_{1}Q_{0}\big(X_{t}^{0}-\mu^{0}_{1}\overline{X}_{t}^{1}\big)\Big]dt\\
&-Q_{t}^{10}dW^{0}_{t}-Q_{t}^{11}dW_{t}^{1},\\
P_{T}^{0}=&\ -L_{T}^{0}\Phi_{1}^{0}-L_{T}\Phi_{2},\quad P_{T}=-L_{T}\Phi_{1}-\overline{L}_{T}^{0}\Phi_{2}^{0}-\overline{L}_{T}\Phi_{3}.
\end{aligned}\right.
\end{gathered}
\tag{A.4}
\]

System (A.4) is a linear decoupled FBSDE with $L^{2}$ nonhomogeneous terms, so it admits a unique solution $(L^{0},L,L^{\ddagger};P^{0},Q^{00},Q^{01};P,Q^{10},Q^{11})$. Moreover, comparing (A.4) with (5.7), we have

\[
L_{t}^{0}=\overline{L}_{t}^{0},\ L_{t}=\overline{L}_{t},\ L_{t}^{\ddagger}=\overline{L}^{\ddagger}_{t},\ P_{t}^{0}=\overline{P}_{t}^{0},\ Q^{00}_{t}=\overline{Q}_{t}^{00},\ Q^{01}_{t}=0,\ P_{t}=\overline{P}_{t},\ Q^{10}_{t}=\overline{Q}_{t}^{10},\ Q^{11}_{t}=0,\quad t\in[0,T].
\]
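To illustrate why the forward part of (A.4) is harmless, consider its third equation under the simplifying assumption, made here only for illustration, that $f_{2}$ and $f_{3}$ are scalar constants and $L_{0}^{\ddagger}$ is deterministic. The equation is then a geometric Brownian motion with the explicit solution

\[
L_{t}^{\ddagger}=L_{0}^{\ddagger}\exp\Big(\big(f_{2}-\tfrac{1}{2}f_{3}^{2}\big)t+f_{3}W_{t}^{0}\Big),\qquad
\mathbb{E}\big[|L_{t}^{\ddagger}|^{2}\big]=|L_{0}^{\ddagger}|^{2}\exp\big((2f_{2}+f_{3}^{2})t\big),
\]

so $L^{\ddagger}$ is $\mathbb{F}^{0}$-adapted and square integrable uniformly on $[0,T]$. The second identity follows from $\mathbb{E}[\exp(2f_{3}W_{t}^{0})]=\exp(2f_{3}^{2}t)$.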

Step 2. The construction of $(X^{1},P^{\ddagger},Q^{10,\ddagger},Q^{11,\ddagger})$.

It follows from the system (5.6) that the 4-tuple $(X^{1},P^{\ddagger},Q^{10,\ddagger},Q^{11,\ddagger})$ satisfies

\[
\left\{\begin{aligned}
dX_{t}^{1}=&\ \Big[b_{1}X_{t}^{1}+b_{2}a_{7}P_{t}^{\ddagger}+\tilde{b}_{t}\Big]dt+\sigma dW_{t}^{1},\quad X_{0}^{1}=x_{0},\\
-dP_{t}^{\ddagger}=&\ \Big[b_{1}P_{t}^{\ddagger}-2QX_{t}^{1}+\tilde{f}_{t}\Big]dt-Q_{t}^{10,\ddagger}dW^{0}_{t}-Q_{t}^{11,\ddagger}dW_{t}^{1},\quad P_{T}^{\ddagger}=-L_{T}^{\ddagger}\Phi_{1},
\end{aligned}\right.
\tag{A.5}
\]

where $\tilde{b}_{t}=\mathbf{\Lambda}_{1}\overline{\mathbf{X}}_{t}+\mathbf{\Lambda}_{2}\overline{\mathbf{Y}}_{t}=[\mathbf{\Lambda}_{1}+\mathbf{\Lambda}_{2}(I+S_{t}\rho)^{-1}S_{t}]\overline{\mathbf{X}}_{t}+\mathbf{\Lambda}_{2}(I+S_{t}\rho)^{-1}\Upsilon_{t}$ and $\tilde{f}_{t}=\mathbf{\Lambda}_{4}\overline{\mathbf{X}}_{t}$ are $\mathbb{F}^{0}$-adapted. Next we introduce the following Riccati equation and BSDE:

\[
\left\{\begin{aligned}
&\dot{\Sigma}_{t}+2b_{1}\Sigma_{t}+\frac{1}{2}R^{-1}|b_{2}|^{2}|\Sigma_{t}|^{2}-2Q=0,\quad \Sigma_{T}=0,\\
&-dp_{t}=\Big[\Big(\frac{1}{2}R^{-1}|b_{2}|^{2}\Sigma_{t}+b_{1}\Big)p_{t}+\Sigma_{t}\tilde{b}_{t}+\tilde{f}_{t}\Big]dt-q_{t}^{0}dW_{t}^{0}-q_{t}^{1}dW_{t}^{1},\quad p_{T}=-L_{T}^{\ddagger}\Phi_{1}.
\end{aligned}\right.
\]
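As a formal consistency check, which is not part of the original argument, this Riccati equation and BSDE are exactly what the decoupling ansatz $P_{t}^{\ddagger}=\Sigma_{t}X_{t}^{1}+p_{t}$ produces when it is inserted into (A.5). The sketch assumes scalar coefficients and reads $b_{2}a_{7}P_{t}^{\ddagger}$ as $\kappa P_{t}^{\ddagger}$ with $\kappa:=\frac{1}{2}R^{-1}|b_{2}|^{2}$; the symbols $\kappa$ and $g_{t}$ are introduced here only for illustration. Writing $-dp_{t}=g_{t}dt-q_{t}^{0}dW_{t}^{0}-q_{t}^{1}dW_{t}^{1}$ and differentiating the ansatz gives

\[
\begin{aligned}
dP_{t}^{\ddagger}=&\ \dot{\Sigma}_{t}X_{t}^{1}dt+\Sigma_{t}dX_{t}^{1}+dp_{t}\\
=&\ \Big[\big(\dot{\Sigma}_{t}+b_{1}\Sigma_{t}+\kappa\Sigma_{t}^{2}\big)X_{t}^{1}+\kappa\Sigma_{t}p_{t}+\Sigma_{t}\tilde{b}_{t}-g_{t}\Big]dt+q_{t}^{0}dW_{t}^{0}+\big(\Sigma_{t}\sigma+q_{t}^{1}\big)dW_{t}^{1},
\end{aligned}
\]

while the backward equation in (A.5), with the same ansatz substituted, reads

\[
dP_{t}^{\ddagger}=-\Big[\big(b_{1}\Sigma_{t}-2Q\big)X_{t}^{1}+b_{1}p_{t}+\tilde{f}_{t}\Big]dt+Q_{t}^{10,\ddagger}dW_{t}^{0}+Q_{t}^{11,\ddagger}dW_{t}^{1}.
\]

Matching the coefficients of $X_{t}^{1}$ yields $\dot{\Sigma}_{t}+2b_{1}\Sigma_{t}+\kappa\Sigma_{t}^{2}-2Q=0$, and matching the remaining drift terms yields $g_{t}=(\kappa\Sigma_{t}+b_{1})p_{t}+\Sigma_{t}\tilde{b}_{t}+\tilde{f}_{t}$. The terminal condition $\Sigma_{T}=0$ gives $p_{T}=P_{T}^{\ddagger}=-L_{T}^{\ddagger}\Phi_{1}$, and the diffusion terms give $Q_{t}^{10,\ddagger}=q_{t}^{0}$ and $Q_{t}^{11,\ddagger}=\Sigma_{t}\sigma+q_{t}^{1}$.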

Under (A6), it follows from Theorem 4.3 in [31] (see p. 48) that $\Sigma$ takes the form (5.11). Since $\Sigma$ is bounded, the above linear BSDE has a unique $\mathbb{F}^{0}$-adapted solution $(p,q^{0},0)$ with (5.12). By the relation $P_{t}^{\ddagger}=\Sigma_{t}X_{t}^{1}+p_{t}$, the forward equation in (A.5) becomes a linear SDE for $X^{1}$ with bounded coefficients, from which one can show the well-posedness of (A.5).
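The boundedness of $\Sigma$ can also be checked numerically. The following is a minimal sketch, assuming scalar constant coefficients with hypothetical values, that integrates the Riccati equation backward from $\Sigma_{T}=0$ with a classical fourth-order Runge-Kutta scheme. The function name `solve_riccati` and its parameters are illustrative and do not come from the paper. In the special case $b_{1}=0$ the terminal-value problem has the elementary solution $\Sigma_{t}=-\sqrt{2Q/c}\,\tanh\big(\sqrt{2Qc}\,(T-t)\big)$ with $c=\frac{1}{2}R^{-1}|b_{2}|^{2}$, which is used below as a sanity check.

```python
import math


def solve_riccati(b1, b2, R, Q, T, n_steps=1000):
    """Solve  dSigma/dt = 2Q - 2*b1*Sigma - c*Sigma**2,  Sigma(T) = 0,
    backward in time with RK4, where c = b2**2 / (2R).

    Returns the list [Sigma(t_0), ..., Sigma(t_n)] on the uniform grid
    t_k = k*T/n_steps (scalar, constant coefficients are assumed).
    """
    c = 0.5 * b2 ** 2 / R

    def rhs(s):
        # Right-hand side of the Riccati ODE written as dSigma/dt = rhs(Sigma).
        return 2.0 * Q - 2.0 * b1 * s - c * s * s

    h = T / n_steps
    sigma = [0.0] * (n_steps + 1)      # sigma[n_steps] = Sigma(T) = 0
    for k in range(n_steps, 0, -1):    # step from t_k down to t_{k-1}
        s = sigma[k]
        k1 = rhs(s)
        k2 = rhs(s - 0.5 * h * k1)
        k3 = rhs(s - 0.5 * h * k2)
        k4 = rhs(s - h * k3)
        sigma[k - 1] = s - h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return sigma


if __name__ == "__main__":
    # Hypothetical values; with b1 = 0 the exact value is Sigma(0) = -tanh(2).
    sig = solve_riccati(b1=0.0, b2=2.0, R=1.0, Q=1.0, T=1.0)
    print("numerical Sigma(0):", sig[0])
    print("closed form       :", -math.tanh(2.0))
```

With $Q>0$ and $R>0$ the computed $\Sigma$ stays nonpositive and bounded on $[0,T]$, which is the property the argument above relies on.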

Step 3. The construction of $(Y^{1},Z^{1,0},Z^{1,1})$.

Denote

\[
\begin{aligned}
f_{t}=&\ f_{1}X_{t}^{1}+f_{5}X_{t}^{0}+f_{6}Y_{t}^{0}+f_{7}Z_{t}^{0}+f_{9}\overline{X}_{t}^{1}+f_{10}\overline{Y}_{t}^{1}+f_{11}\overline{Z}_{t}^{1,0}+f_{4}a_{7}P_{t}^{\ddagger}\\
&+(f_{4}a_{1}+f_{8}a_{1}^{0}+f_{12}a_{1}+f_{12}a_{7})^{\top}\overline{\mathbf{P}}_{t}+[(f_{4}+f_{12})(a_{3}+a_{8})+f_{8}a_{3}^{0}]^{\top}\mathbf{L}_{t}+f_{4}a_{7}^{\top}\mathbf{P}_{t}.
\end{aligned}
\]

Then $(Y^{1},Z^{1,0},Z^{1,1})$ is the unique $\mathbb{F}^{0,1}$-adapted solution of the following BSDE:

\[
-dY_{t}^{1}=\Big[f_{2}Y_{t}^{1}+f_{3}Z_{t}^{1,0}+f_{t}\Big]dt-Z_{t}^{1,0}dW_{t}^{0}-Z_{t}^{1,1}dW_{t}^{1},\quad Y_{T}^{1}=\Phi_{1}X_{T}^{1}+\Phi_{2}X_{T}^{0}+\Phi_{3}\overline{X}_{T}^{1}+\xi^{1}.
\]

Finally, combining the above three steps we construct a solution of the system (5.6).

References

  • [1] A. Agrawal and R. E. Barlow, A survey of network reliability and domination theory, Operations Research, 32 (1984), pp. 478–492.
  • [2] D. Andersson and B. Djehiche, A maximum principle for SDEs of mean-field type, Applied Mathematics & Optimization, 63 (2011), pp. 341–356.
  • [3] R. J. Aumann, M. Maschler, and R. E. Stearns, Repeated games with incomplete information, MIT Press, 1995.
  • [4] J. Aurand and Y.-J. Huang, Mortality and healthcare: A stochastic control analysis under Epstein–Zin preferences, SIAM Journal on Control and Optimization, 59 (2021), pp. 4051–4080.
  • [5] A. Bensoussan, M. H. Chau, and S. C. Yam, Mean field games with a dominating player, Applied Mathematics & Optimization, 74 (2016), pp. 91–128.
  • [6] P. Bergault, P. Cardaliaguet, and C. Rainer, Mean field games in a Stackelberg problem with an informed major player, SIAM Journal on Control and Optimization, 62 (2024), pp. 1737–1765.
  • [7] R. Buckdahn, J. Li, and S. Peng, Nonlinear stochastic differential games involving a major player and a large number of collectively acting minor agents, SIAM Journal on Control and Optimization, 52 (2014), pp. 451–492.
  • [8] P. Cardaliaguet, M. Cirant, and A. Porretta, Remarks on Nash equilibria in mean field game models with a major player, Proceedings of the American Mathematical Society, 148 (2020), pp. 4241–4255.
  • [9] R. Carmona and F. Delarue, Extensions for volume I, Probabilistic Theory of Mean Field Games with Applications I: Mean Field FBSDEs, Control, and Games, (2018), pp. 619–680.
  • [10] R. Carmona and P. Wang, An alternative approach to mean field game with major and minor players, and applications to herders impacts, Applied Mathematics & Optimization, 76 (2017), pp. 5–27.
  • [11] R. A. Carmona and X. Zhu, A probabilistic approach to mean field games with major and minor players, Annals of Applied Probability, 26 (2016), pp. 1535–1580.
  • [12] Z. Chen and L. Epstein, Ambiguity, risk, and asset returns in continuous time, Econometrica, 70 (2002), pp. 1403–1443.
  • [13] F. Delarue, On the existence and uniqueness of solutions to FBSDEs in a non-degenerate case, Stochastic Processes and Their Applications, 99 (2002), pp. 209–286.
  • [14] K. Du, J. Huang, and Z. Wu, Linear quadratic mean-field-game of backward stochastic differential systems, Mathematical Control and Related Fields, 8 (2018), pp. 653–678.
  • [15] D. Duffie and L. G. Epstein, Stochastic differential utility, Econometrica, 60 (1992), pp. 353–394.
  • [16] N. El Karoui, S. Peng, and M. C. Quenez, A dynamic maximum principle for the optimization of recursive utilities under constraints, Annals of Applied Probability, 11 (2001), pp. 664–693.
  • [17] X. Feng, Y. Hu, and J. Huang, Backward Stackelberg differential game with constraints: a mixed terminal-perturbation and linear-quadratic approach, SIAM Journal on Control and Optimization, 60 (2022), pp. 1488–1518.
  • [18] Z. Hellman and Y. J. Levy, Measurable selection for purely atomic games, Econometrica, 87 (2019), pp. 593–629.
  • [19] M. Hu, S. Ji, and X. Xue, Optimization under rational expectations: A framework of fully coupled forward-backward stochastic linear quadratic systems, Mathematics of Operations Research, 48 (2023), pp. 1767–1790.
  • [20] Y. Hu, J. Huang, and W. Li, Backward stochastic differential equations with conditional reflection and related recursive optimal control problems, SIAM Journal on Control and Optimization, 62 (2024), pp. 2557–2589.
  • [21] J. Huang, W. Li, and H. Zhao, A class of optimal control problems of forward–backward systems with input constraint, Journal of Optimization Theory and Applications, 199 (2023), pp. 1050–1084.
  • [22] J. Huang, S. Wang, and Z. Wu, Backward mean-field linear-quadratic-Gaussian (LQG) games: full and partial information, IEEE Transactions on Automatic Control, 61 (2016), pp. 3784–3796.
  • [23] M. Huang, Large-population LQG games involving a major player: the Nash certainty equivalence principle, SIAM Journal on Control and Optimization, 48 (2010), pp. 3318–3353.
  • [24] M. Huang, R. P. Malhamé, and P. E. Caines, Large population stochastic dynamic games: closed-loop McKean-Vlasov systems and the Nash certainty equivalence principle, Communications in Information and Systems, 6 (2006), pp. 221–252.
  • [25] E. Kamenica, Bayesian persuasion and information design, Annual Review of Economics, 11 (2019), pp. 249–272.
  • [26] X.-I. Kartala, N. Englezos, and A. N. Yannacopoulos, Future expectations modeling, random coefficient forward–backward stochastic differential equations, and stochastic viscosity solutions, Mathematics of Operations Research, 45 (2020), pp. 403–433.
  • [27] J.-M. Lasry and P.-L. Lions, Mean field games, Japanese Journal of Mathematics, 2 (2007), pp. 229–260.
  • [28] A. Lazrak, Generalized stochastic differential utility and preference for information, The Annals of Applied Probability, 14 (2004), pp. 2149–2175.
  • [29] A. Lazrak and M. C. Quenez, A generalized stochastic differential utility, Mathematics of Operations Research, 28 (2003), pp. 154–180.
  • [30] J. Ma, Z. Wu, D. Zhang, and J. Zhang, On well-posedness of forward-backward SDEs-a unified approach, Annals of Applied Probability, 25 (2015), pp. 2168–2214.
  • [31] J. Ma and J. Yong, Forward-Backward Stochastic Differential Equations and Their Applications, no. 1702, Springer Science & Business Media, 1999.
  • [32] Y. Ma and M. Huang, Linear quadratic mean field games with a major player: The multi-scale approach, Automatica, 113 (2020), p. 108774.
  • [33] M. Miller and P. Weller, Stochastic saddlepoint systems stabilization policy and the stock market, Journal of Economic Dynamics and Control, 19 (1995), pp. 279–302.
  • [34] M. Nourian and P. E. Caines, $\varepsilon$-Nash mean field game theory for nonlinear stochastic dynamical systems with major and minor agents, SIAM Journal on Control and Optimization, 51 (2013), pp. 3302–3331.
  • [35] S. T. Rachev and L. Rüschendorf, Mass Transportation Problems: Applications, Springer Science & Business Media, 2006.