Mean Field Games of Major-Minor Agents with Recursive Functionals
Abstract. This paper investigates a novel class of mean field games involving a major agent and numerous minor agents, where the agents’ functionals are recursive with nonlinear backward stochastic differential equation (BSDE) representations. We term these games “recursive major-minor” (RMM) problems. Our RMM modeling is quite general, as it employs empirical (state, control) averages to define the weak couplings in both the functionals and dynamics of all agents, regardless of their status as major or minor. We construct an auxiliary limiting problem of the RMM by a novel unified structural scheme combining a bilateral perturbation with a mixed hierarchical recomposition. This scheme has its own merits as it can be applied to analyze more complex coupling structures than those in the current RMM. Subsequently, we derive the corresponding consistency condition and explore asymptotic RMM equilibria. Additionally, we examine the RMM problem in specific linear-quadratic settings for illustrative purposes.
Keywords. Backward stochastic differential equation, Controlled large population system, Exchangeable decomposition, Major and minor agents, Mean field game, Recursive functional.
MSC2020 subject classifications. 93E20, 60H10, 60K35.
1 Introduction
Mean field game (MFG) theory was independently introduced by [27] and [24] from different perspectives, serving as an effective methodology for analyzing controlled large population (LP) systems. Typically, an LP system comprises a large number of agents interacting through their empirical distribution or averages. These interactions are weak in the sense that the degree of coupling among agents diminishes rapidly as the number of agents tends to infinity. A core element of the MFG theory is the construction of a limiting auxiliary problem, under which all agents can be largely disentangled, allowing for the characterization of a decentralized approximate equilibrium through a consistency matching. As such, MFG analysis can significantly reduce the dimension of the controlled LP system that each agent needs to analyze, greatly simplifying related numerical analysis. A substantial body of research has been dedicated to the MFG theory, yielding fruitful outcomes. A partial list of literature relevant to the current work includes [5, 6, 8, 11, 22, 23, 32, 34]. This paper focuses on a new class of MFG problems with the following weakly-coupled LP system including a major and multiple minors , whose states and satisfy the following forward stochastic differential equations (SDEs)
(1.1) |
with initial conditions Here, stand for the (forward) state-average and control-average of all minors, respectively, and is a -dimensional standard Brownian motion on a complete probability space , where is the idiosyncratic noise for while is the common noise. aim to maximize recursive-type functionals given by
(1.2) | ||||
where is the control profile except that of ; is state-average triple with as the recursive state average and the intensity state average (see [12]); the state-triple satisfy (1.1) and the backward SDE (BSDE) motivated by recursive utilities in economics [15, 28, 29]:
(1.3) |
where for , and is the principal intensity component, and the marginal components. Since remainder terms and vanish as (see Remark 3.1), we focus on the principal terms and set as averages on principal intensities.
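For orientation, the link between a recursive functional and its BSDE representation can be written in a generic single-agent form. The symbols $Y$, $Z$, $g$, $\xi$ and $W$ below are illustrative assumptions and are not the notation of (1.3); the sketch only records the standard fact that an additive driver recovers a classical expected functional, assuming square-integrable data and a trivial initial sigma-algebra.

```latex
\begin{aligned}
&Y_t = \xi + \int_t^T g(s, Y_s, Z_s)\,ds - \int_t^T Z_s\,dW_s, \quad t \in [0,T], \qquad J := Y_0,\\
&\text{if } g(s,y,z) = g(s): \qquad J = Y_0 = \mathbb{E}\Big[\xi + \int_0^T g(s)\,ds\Big].
\end{aligned}
```

In the second line the stochastic integral has zero mean, so the recursive value collapses to an expected terminal payoff plus an expected running payoff; a driver that depends on $(y,z)$ breaks this additivity.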
We refer to (1.1)-(1.3) as the recursive major-minor (RMM) problem. We defer its detailed assumptions to Section 2 and first highlight its modeling features.
Modeling features. (i) The RMM model delves into the interaction between a major and a large number of minors . Traditional MFG studies assume that agents are all “minor” or “negligible”, meaning that an individual agent’s action cannot significantly impact the behavior of the population at a macro scale. Associated MFGs are thus referred to as “symmetric” because it suffices to examine a representative agent, provided agents are homogeneous and hence statistically exchangeable. In contrast, our RMM explores asymmetric interactions where agents have varying decision capacities. A major agent may significantly influence the population’s behavior through her own decisions, whereas numerous minor agents can only affect the population through collective actions. This model is more realistic than the homogeneous minor setting, as it captures a range of diversified interaction mechanisms; see [5, 11, 23, 32, 34].
(ii) The RMM model further posits that objectives of all agents are represented recursively through nonlinear BSDEs, such as (1.3), with non-additive drivers or . The inclusion of recursive functionals in MFG studies is motivated by their advantageous decision-theoretic properties, especially in the current LP context featuring complex decision couplings. Indeed, recursive functionals are well-suited for decision theory due to their capability to explain various observed non-standard decision behaviors, such as the separation of inter-temporal substitution and risk aversion. Consistently, recursive functionals extend classical expected functionals (see [4, 7, 16]), which are relevant to a special class of BSDEs with additive drivers.
(iii) The RMM problem restricts its weak coupling to empirical averages and does not treat more general empirical measures or distributions. Despite this limitation, the weak coupling of RMM remains quite general, as it enters both the dynamics and payoff functionals of the major and all minors , encompassing elements from both the state and the control. Moreover, due to the recursive functionals, the state average is enriched to include averages not only of the objective (forward) states but also of the (recursive and intensity) states reflecting the subjective averaged-out beliefs. Specifically, the intensity coupling characterizes an average of risk (ambiguity) aversion across all agents.
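The weak coupling through empirical averages described in these features can be illustrated numerically. The following is a minimal sketch, with hypothetical function names and a toy population of scalar states that is not the paper's model, showing that a unilateral deviation by one minor shifts the empirical average by an amount of order 1/N.

```python
def empirical_average(values):
    """Arithmetic mean of the minors' scalar states (or controls)."""
    return sum(values) / len(values)


def average_shift(n_minors, deviation):
    """Change in the empirical average when a single minor shifts by `deviation`."""
    baseline = [0.0] * n_minors
    perturbed = baseline.copy()
    perturbed[0] += deviation  # only one minor deviates unilaterally
    return empirical_average(perturbed) - empirical_average(baseline)


# The impact of one minor on the coupling term shrinks like deviation / N.
for n in (10, 100, 1000):
    print(n, average_shift(n, 1.0))
```

A major agent, by contrast, enters the coefficients directly rather than through such an average, so her deviation is not damped by the population size.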
Literature comparison. [10, 23] and [32] introduced the major-minor MFG within a linear-quadratic-Gaussian (LQG) framework on finite and infinite horizon, respectively. They employed augmented Riccati equations to characterize consistency conditions. [34] extended these major-minor studies to a nonlinear setting using the stochastic Hamilton-Jacobi-Bellman (HJB) approach, where the weak coupling is restrictive; for instance, the major’s state cannot enter the dynamics of the minors. Besides, [5] investigated a class of major-minor MFG problems also through the stochastic HJB approach. [8] studied major-minor MFG by master equations where the agents take closed loop control. Recently, [6] investigated a type of MFG problems with asymmetric information between major and minor agents. [11] explored nonlinear major-minor MFG with general weak couplings, allowing the major’s state to enter the dynamics of the minor agents. Additionally, the limiting control problem of the major agent incorporates an endogenous mean-field term, based on an approximation through a two-agent non-zero-sum game. A forward type of maximum principle was utilized in this context.
Our paper distinguishes itself from the aforementioned works by its focus on a nonlinear major-minor interaction with recursive functionals, and by the associated methodology of a backward-forward type of stochastic maximum principle. Our RMM modeling is particularly noteworthy for its introduction and detailed analysis of the weak couplings of the backward (recursive and intensity) state-average originating from the recursive functional. As previously mentioned, these couplings have a significant impact on decision-making and, to the best of our knowledge, have not been systematically addressed in the MFG literature. Consequently, the maximum principle we adopt and the consistency condition we derive take forms that differ from those in [11]. Additionally, we apply our general nonlinear results to specific RMM problems in LQG settings. Our LQG-RMM studies not only recap and extend existing results on forward MFG studies, but also provide new insights into their backward counterpart.
Another relevant work is [7], which also explored the major-minor interaction and recursive functionals. However, it is framed in a weak formulation, substantially different from our strong formulation. Specifically, the approach of [7] originates from a variant of the Girsanov transformation and optimization of a Hamiltonian function, whereas our analysis is rooted in a refined backward-forward stochastic maximum principle. More significantly, all minors in [7] are cooperative, and the associated MFG thus encompasses a two-layer mixed structure: all (cooperative) minor agents form a mean-field team (rather than a game) in an inner layer, while in an outer layer, the interacting major agent and a representative minor agent induce a two-person, non-mean-field game. Although also termed an MFG, the model in [7] is essentially a hybrid of a mean-field team and a non-mean-field-type two-person game, which contrasts with the decision structure we investigate. Consequently, the consistency matching, a central step in MFG analysis, is not applicable in [7] at all. In contrast, our work is distinguished by a novel scheme for auxiliary control construction and consistency matching, as detailed below. Furthermore, the limiting equilibria in [7] are characterized as saddle points, which are remarkably different from our non-zero-sum setup. Additionally, the admissible controls in [7] take feedback forms and are compact, unlike our open-loop and unbounded convex admissibility.
A unified structural scheme. Last but not least, as the core element in MFG, the auxiliary problem of the RMM is formulated through a novel structural scheme, which incorporates a bilateral perturbation and a hierarchical recomposition. This scheme facilitates a more incisive auxiliary construction through a sequential network, enabling a clear-cut realization of a complex mixture involving two minor agents: one is exogenous and the other is endogenous, alongside the endogenous major agent. This scheme is new in the MFG literature and distinguishes our work from previous studies, where auxiliary constructions are based on heuristic arguments. More importantly, such a scheme offers a unified methodology to tackle more complex LP interactions for which heuristic arguments are no longer tractable, for instance, when the LP system consists of heterogeneous agents with varying beliefs on model uncertainties.
Contributions. (i) We introduce a new class of RMM problems featuring major-minor asymmetric interactions and recursive objectives. (ii) We present a novel unified structural scheme to construct its pivotal auxiliary problem. (iii) We derive a new class of mean-field type of forward-backward SDEs (FBSDEs) to characterize the consistency condition of the RMM problem. (iv) We examine LQG studies in the RMM context in detail to gain deeper insights.
The remainder of this paper is organized as follows. Section 2 introduces basic assumptions of the RMM problem. Section 3 presents a unified structural scheme for the RMM problem, including a bilateral perturbation and a mixed triple-agent two-layer analysis. Section 4 studies the auxiliary control and associated consistency condition (CC) of the RMM problem, and verifies its approximate Nash equilibrium. Section 5 is devoted to some LQG-RMM problems. Section 6 concludes; some technical proofs and heavy notation are given in the Appendix.
2 Preliminary
For , let be the complete filtration generated by the Brownian motion ; namely, with the set of all -null sets in . Then, denotes the centralized information generated by the Brownian motion , and by the decentralized information for a generic minor agent , . Let and be two convex sets.
Definition 2.1.
is a centralized (resp. decentralized) admissible control for , if it is an -adapted (resp. -adapted) -valued process with . Similarly, for , a -valued process is called a centralized (resp. decentralized) admissible control for , if it is -adapted (resp. -adapted) with . Let and (resp. and ) be the set of all centralized (resp. decentralized) controls for and , respectively.
Definition 2.2.
For any , we say a -tuple admissible controls (resp. ) depending on is a centralized (resp. decentralized) approximate -Nash equilibrium, if for all (resp. ), we have
The exact Nash equilibrium corresponds to the case when . Now we impose some assumptions on the following coefficients of (1.1)-(1.3) of the RMM problem:
Assumption (A1)
(i) and are continuously differentiable in and , respectively. All the derivatives of are bounded.
(ii) and are continuously differentiable in and , respectively, where
, and are continuously differentiable in , and , respectively.
(iii) All derivatives of are bounded.
The derivatives of
and are bounded by , and , respectively, for some .
(iv) and , , are square-integrable, - and -measurable, respectively. Moreover,
are independent and identically distributed conditionally on .
(v) are uniformly continuous in .
3 A unified structural scheme of the RMM problem
For a fixed , if each agent has access to centralized information about all agents, including their realized instantaneous states and adopted controls, the RMM becomes a classical but high-dimensional -agent game. Existence or uniqueness of its exact Nash equilibria can be ensured under certain mild but high-dimensional conditions, including semi-continuity, coercivity and concavity on , or compactness of the admissible control sets. The open-loop (exact) equilibrium, denoted as , can be further characterized through a system of stationary conditions. However, this route to exact equilibria is feasible only in theory and becomes impractical due to the curse of dimensionality when is large.
MFG theory offers one resolution by constructing near-optimal decentralized strategies as an approximation to the exact Nash equilibria. A key challenge in MFG is the construction of an auxiliary problem for dimension reduction. Previous MFG studies have constructed auxiliary problems intuitively based on heuristic arguments, which are effective only when the underlying coupling structure is not overly complex, for instance, when all agents are symmetric as in [14, 22, 24, 27], or even when an asymmetric dominant major agent is included as in [5, 6, 8, 10, 11, 23, 32, 34]. However, heuristic analysis becomes infeasible for LP systems with more intricate couplings. One reason is that it fails to effectively configure a complex logic network in which various representative agents are mutually connected by an “exogenous-endogenous” relation. Alternatively, we propose a structural scheme that can not only account well for the extreme generality of the weak coupling in RMM problems, but also lay a unified foundation for analyzing more general and complex LP couplings. In the current RMM context, this scheme yields a bilateral perturbation and a triple-agent two-layer game, as discussed below.
3.1 A bilateral perturbation: the major agent
Letting in Definition 2.2, faces an optimization problem: by assuming that all minors implement the exact Nash equilibrium . When adopts a perturbed centralized control instead of the exact , her state (1.1) becomes
(3.1) |
with Here, is a quasi-realized state average with the superscript “†” to emphasize its dependence on the major’s perturbed ; whereas is the exact-realized control average only depending on the exact strategies so the superscript “∗” is still applied. This is essentially an open-loop feature.
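The open-loop feature can be illustrated on a toy deterministic system. The following is a minimal sketch, with a hypothetical scalar major state, hypothetical control rules and an Euler step that are not taken from (3.1): under an open-loop rule the minor's applied controls are unchanged when the major perturbs her control, whereas under a feedback rule on the major's realized state they change.

```python
def simulate(major_controls, minor_rule, dt=0.1):
    """Euler steps of a toy major state; returns the minor controls actually applied."""
    x0 = 0.0
    applied = []
    for step, u0 in enumerate(major_controls):
        applied.append(minor_rule(step, x0))
        x0 += dt * u0  # the major's state responds to her own control
    return applied


def open_loop_rule(step, major_state):
    """Control fixed in advance as a function of time only."""
    return 1.0


def closed_loop_rule(step, major_state):
    """Feedback on the major's realized state (hypothetical gain)."""
    return -2.0 * major_state


exact = [1.0] * 5
perturbed = [1.5] * 5

open_unchanged = simulate(exact, open_loop_rule) == simulate(perturbed, open_loop_rule)
closed_unchanged = simulate(exact, closed_loop_rule) == simulate(perturbed, closed_loop_rule)
print(open_unchanged, closed_unchanged)
```

In the open-loop case only the states react to the major's perturbation, which is why the control average keeps its exact value while the state average becomes quasi-realized.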
By “quasi-realized,” we mean that the states are not “exactly” the ones to be realized when all agents apply their exact strategies; instead, they are “quasi-exact” as only deviates from the exact one by adopting a perturbed control. Actually, an exact Nash equilibrium , in its open-loop sense, is defined directly on the basic inputs rather than on the “intermediate” states. Thus, a perturbed by will not change the controls of the minors. This is very different from the closed-loop case, in which a perturbed will change the major’s state and hence alter the implementation of constructed on these realized states. Using the notation
and similar to (3.1), we can get the following quasi-realized coupled BSDEs
(3.2) |
We aim to analyze the asymptotic limit as with Then we take the limit of (3.1), (3.2) and the related recursive payoff for the major agent. By the continuity of coefficients and , and
where are the associated limiting quantities, and denotes the conditional expectation on the common information . Similarly, taking the limit on (3.1), we get the asymptotic limit of the major agent:
(3.3) |
where denotes the state of the representative minor agent, say,
(3.4) |
Comparing (3.3) and (3.4), we have by noting and the solution uniqueness of the second equation in (3.3). Then the limiting state of the major satisfies
(3.5) |
with initial conditions Taking limit on BSDE (3.2) and similar to (3.5),
(3.6) |
where ,
Remark 3.1.
The remainder terms and in BSDE (3.2) vanish as . For ease of presentation, we omit these remainder terms hereafter.
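The limiting step of this subsection replaces empirical averages by conditional expectations given the common noise. A minimal sketch of this mechanism follows, assuming a static toy model in which each minor state is a hypothetical linear combination of one common Gaussian shock and an independent idiosyncratic Gaussian shock; the coefficients and names are illustrative and not those of the paper.

```python
import random


def state_average(n_minors, common_shock, rng, a=0.7, b=1.3):
    """Average of toy minor states x_i = a * common_shock + b * idiosyncratic_i."""
    total = 0.0
    for _ in range(n_minors):
        total += a * common_shock + b * rng.gauss(0.0, 1.0)
    return total / n_minors


rng = random.Random(0)
common_shock = rng.gauss(0.0, 1.0)
conditional_mean = 0.7 * common_shock  # E[x_i | common shock] in this toy model

# Gap between the empirical average and the conditional expectation.
gaps = {
    n: abs(state_average(n, common_shock, rng) - conditional_mean)
    for n in (10, 1000, 100000)
}
print(gaps)
```

The idiosyncratic parts average out at rate of order one over the square root of N, while the common shock survives in the limit, which is why the limiting quantities are conditional rather than unconditional expectations.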
3.2 A bilateral perturbation: a representative minor agent
We turn to a representative minor . By Definition 2.2, confronts an optimization problem when assuming implements the exact Nash ; and implement . If applies a perturbed , the state and functional of () become
(3.8) |
with We abuse notations (as with the same limits) to denote and Noting are independent of , so they are exogenous for . The state of under satisfies
The quasi-realized state of is with the superscript “” to indicate its dependence on perturbed ; whereas the exact-realized state-control average only depend on the exact strategies so are still with “”. The value affected by of satisfies
(3.9) |
with . Next, similar to our analysis in Subsection 3.1, we can obtain the following coupled mean-field FBSDE from (3.8) (noting ), ,
(3.10) |
by setting Then by (3.2), (3.9), when applying , the limiting and satisfy
(3.11) |
where . Along with (3.11), the objective of is to maximize
(3.12) |
Note that an optimal control of , if it exists, should depend on and .
3.3 A hierarchical recomposition
We now apply a hierarchical recomposition to construct the desired auxiliary problem, which takes the form of a mixed two-layer game in the RMM context. To this end, we integrate the perturbed (3.5)-(3.7) on the side of indexed by , and (3.11)-(3.12) on the side of the representative minor by together. This yields an (extended) state labeled by :
(3.13) |
Indeed, the first and second states come from (3.5), the third from (3.11), and the last two from (3.10) by replacing exact centralized controls with decentralized ones With the extended state (3.13), there arise three agents, respectively: a follower using decentralized control , a follower using , and a leader using for some Recall that is -adapted and (resp. ) is - (resp. )-adapted. Therefore, the follower can directly affect all 5-tuple components, the follower directly affects the first three while the leader directly affects the last three On the other hand, all 5-tuple components are coupled through their dependence on the control triple In this sense, and also influence the 5-tuple (indirectly); all components are endogenous and not redundant.
We can now formulate limiting recursive functionals. Specifically, the follower aims to
(3.14) |
where is the solution of the coupled mean-field BSDE
(3.15) |
The aim of the follower is
(3.16) |
where is the solution of the BSDE
(3.17) |
with the exogenous processes given by
(3.18) |
And the leader aims to minimize the following quadratic deviation functional
(3.19) |
Last, we identify mixed leader-follower-Nash interactions among as follows:
• As the unique leader, at the beginning (for some ) announces an (open-loop) control on the whole horizon to both followers.
- •
• Anticipating the best responses and parameterized by the a priori , the leader takes this leader-follower interaction into account to minimize the quadratic deviation cost. This minimum can be reached via a fixed-point argument or consistency condition.
Given the above interactions, a mixed triple-agent game should be solved by the following steps.
(1) First, to formulate and solve an auxiliary control problem for in terms of control and state by fixing the generic admissible control of . We may denote the optimal one as to show its dependence on .
(2) Second, for given and pre-announced , to formulate and solve an auxiliary control problem for in terms of control and state We may denote the optimal one as depending on and
(3) Last, get the Nash equilibrium of and , say depending on pre-announced ; and solve the optimization of the leader by matching
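The three steps above can be mimicked on a scalar toy problem. The following is a minimal sketch under strong assumptions: the followers' best responses are collapsed into a hypothetical affine map of the announced quantity, and the leader's matching is carried out by a plain Picard iteration; none of the names or coefficients come from the paper.

```python
def best_response_average(z, alpha=0.5, beta=1.0):
    """Hypothetical average of the followers' best responses to the announced z."""
    return alpha * z + beta


def solve_consistency(z0=0.0, tol=1e-10, max_iter=1000):
    """Picard iteration z <- best_response_average(z) until successive iterates agree."""
    z = z0
    for _ in range(max_iter):
        z_next = best_response_average(z)
        if abs(z_next - z) < tol:
            return z_next
        z = z_next
    raise RuntimeError("fixed-point iteration did not converge")


z_star = solve_consistency()
# Quadratic deviation between the announced and the resulting quantity.
deviation = (z_star - best_response_average(z_star)) ** 2
print(z_star, deviation)
```

Because the toy response map is a contraction (slope 0.5), the iteration converges to the unique fixed point beta / (1 - alpha) = 2, where the leader's deviation cost is essentially zero. In the actual problem the response map is defined through the followers' Hamiltonian systems, and contraction is not guaranteed.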
Remark 3.2.
(i) When the weak-coupling of RMM does not include the control-average of all minors, the term in the third and fifth equation of (3.13) can be replaced by some off-line process. The last equation in (3.13) thus becomes redundant. In fact, the inclusion of the control-average of all minors necessitates the introduction of additional dynamics to account for the averages on the physical, recursive and intensity state.
(ii) Furthermore, if we exclude the consideration of recursive functionals (namely, ), the mixed triple-agent game will simplify to a two-agent nonzero-sum game, as in [11].
(iii) The introduction of the additional facilitates a clearer specification of the exogenous processes, as discussed in Subsection 3.2 from the standpoint of . Otherwise, two exogenous processes s would need to be introduced to replace simultaneously, accompanied by an associated fixed point analysis. The overall analysis will thus become complicated, given that the consistency matching already involves another fixed-point analysis. In contrast, our formulation of the mixed triple-agent game, particularly the introduction of the virtual , may clearly elucidate the intricate exogenous-endogenous relationships.
3.4 A unified structural scheme
Our bilateral perturbations and mixed hierarchical recomposition in Sections 3.1-3.3 indeed suggest a unified scheme to analyze general LP systems with more complex couplings.
We start with a generic LP system characterized by a key 5-tuple where (i) is the index set of all agents, as enumerated by ; (ii) the set of weak-coupled functionals to be optimized by respectively; (iii) denotes the associated decision profile of ; (iv) is the maximal coalition structure on , which is structurally determined by and will be elaborated later. Typically, , , or respectively, represent the classical LP systems with homogeneous agents, classes of heterogeneous agents, or major agents with homogeneous minors. For second order countable or continuum infinity, we can let or Given , a structural scheme includes the following three steps.
Step 1: Exchangeable decomposition. Essentially, the MFG is an effective dimension-reduction analysis relying on exchangeability across agents, and it shares a similar spirit with the well-known symmetric games in the deterministic context. Exchangeability of a controlled can be characterized asymptotically, as , by the so-called coalition structure. This refers to a partition of the index set (namely, where denotes the union of disjoint sets); and for , all agents form an exchangeable sub-class, in the sense that
where is a simultaneous permutation on the sub-index set and associated intersections on and ; denotes the equivalence relation in terms of game equilibrium. That is, the set of Nash equilibria of is invariant to that of under finite permutation on . Roughly, this means that the simultaneous optimizations faced by , are endowed with identical probabilistic structures. Moreover, under the large symmetric assumption as in [9], the equivalence should be element-wise and thus transitive. Therefore, there exists a maximal coalition, denoted as , provided the set of coalition structures is non-empty. By “maximal” we mean that it is coarser than any other coalition structure, which yields the largest exchangeable decomposition and hence the largest dimension reduction. In fact, this maximal coalition can be constructed through the saturated sets using the equivalence relation, as the union of all -equivalence classes ([18]).
A trivial coalition is the set of all singleton sets generated by each element. In this case, without dimension reduction. The MFG is applicable for non-trivial coalitions in that as ; or conversely, for at least one For example, for some . For LP systems with with associated , respectively. For these cases, heuristic arguments are still feasible to construct auxiliary control. However, they break down when coupling structures assume more complex forms, such as non-standard bridge configurations; in this case, the identification of with the associated decomposition becomes necessary to yield a unified and systematic analysis.
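Step 1 can be sketched computationally as grouping agents into equivalence classes. The following is a minimal sketch, assuming that exchangeability can be read off a hypothetical per-agent "signature" (for example, a label for the agent's coefficients); the paper defines the equivalence through invariance of Nash equilibria under permutations, which this toy grouping only stands in for.

```python
from collections import defaultdict


def coarsest_exchangeable_partition(signatures):
    """Group agent indices whose (hypothetical) model signatures coincide."""
    classes = defaultdict(list)
    for agent, signature in enumerate(signatures):
        classes[signature].append(agent)
    return list(classes.values())


# One major agent (index 0) and five statistically identical minors.
signatures = ["major"] + ["minor"] * 5
partition = coarsest_exchangeable_partition(signatures)
print(partition)  # [[0], [1, 2, 3, 4, 5]]
```

Here six agents reduce to two exchangeable sub-classes, so a representative analysis needs only two agents; with all-distinct signatures the partition would consist of singletons and no dimension reduction would occur.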
Step 2: Representative multilateral perturbation. Given the maximal coalition , one can select a representative agent, denoted as , from each exchangeable sub-class This selection yields a representative collection that is dimension-reduced by noting . In fact, we may abuse notations to denote as the quotient space with elements in the equivalence classes by . Then, a multilateral perturbation can be introduced on the side of each separately, by assuming all other representatives still keep their equilibrium strategies. Depending on the weak-coupling mechanism structured by , each perturbation will be transmitted throughout the whole LP system across all exchangeable sub-classes. A typical transmission, in an open-loop and dynamic setting, can be sketched by the following channel via the weak-coupling of state-average:
Along such a transmission, the influence of a representative on the controlled LP system can be completely quantified, which is essentially equivalent to the Fréchet differential of on .
Step 3: Hierarchical recomposition. The multilateral perturbation involves approximations to completely quantify all LP weak-couplings asymptotically, thereby leading to a variety of limiting quantities. Due to the exchangeability, these quantities typically take the form of conditional expectations on the tail sigma-algebra, as per de Finetti's theorem. However, unlike those in classical McKean-Vlasov control problems, these expectations exhibit distinct modes in realized degrees of controllability, contingent upon their hierarchical positions within the entire weak-coupled structure. For instance, in the RMM, three modes emerge: exact-realized (exact limit), quasi-realized (semi-exact), and null-realized, as indicated by (3.5) and (3.8). In the current RMM or simpler coupled setups, these modes can be ordered using an exogenous-endogenous relation, where the “exogenous” exact-realized mode dominates the “endogenous” null-realized one. However, in more complex forms of LP couplings, these modes cannot be fully encapsulated by a binary ordering alone, instead forming a more intricate directed graph network, which is challenging to study by heuristic arguments alone.
Our resolution is a hierarchical recomposition, similar to the well-studied structural function through the so-called path sets in reliability theory ([1]). In fact, reliability of any system is equivalent to that of a serial (sequential) arrangement of parallel subsystems. Likewise, for a generic LP system, all modes arising in Step 2 can be stratified into sequential layers with leader-follower-type hierarchies, and each layer consists of parallel Nash-type nodes with simultaneous decisions. This stratification, similar to the construction of the structural function, is indeed applicable to any LP system, such as those with non-classical intermediate coupling, akin to the bridge systems in reliability analysis. Such stratification enables a unified and more clear-cut construction of the desired auxiliary problem by recomposing all modes across hierarchical layers with a combination of corresponding variants of fixed-point matching. For example, the aforementioned three modes in RMM yield two layers, thereby enabling the construction of the auxiliary problem via a triple-agent leader-follower-Nash game. By coincidence, the multilateral perturbation assumes a similar role to the synthesis of all minimal path sets in reliability analysis, as both aim to quantify all transmission channels within a given system.
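The stratification into sequential layers of parallel nodes can be sketched as a layering of a directed acyclic graph. The following is a minimal sketch in which the node labels and the "announces to" edges are hypothetical stand-ins for the exogenous-endogenous relations; it is a generic graph routine and not the paper's construction.

```python
def stratify(nodes, edges):
    """Place each node in the first layer after all of its upstream nodes."""
    predecessors = {node: set() for node in nodes}
    for upstream, downstream in edges:
        predecessors[downstream].add(upstream)
    layers, placed = [], set()
    while len(placed) < len(nodes):
        # Nodes in one layer decide simultaneously (Nash-type).
        layer = sorted(n for n in nodes if n not in placed and predecessors[n] <= placed)
        if not layer:
            raise ValueError("cyclic dependence: no leader-follower ordering exists")
        layers.append(layer)
        placed.update(layer)
    return layers


# Toy RMM-like structure: one leader announcing to two followers.
nodes = ["leader", "follower_major", "follower_minor"]
edges = [("leader", "follower_major"), ("leader", "follower_minor")]
layers = stratify(nodes, edges)
print(layers)  # [['leader'], ['follower_major', 'follower_minor']]
```

The toy structure yields two layers, a leader layer followed by a layer of two simultaneous followers, matching the triple-agent leader-follower-Nash pattern; richer coupling graphs would simply produce more layers.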
Attainability. For classical LP systems, leader-type agents at higher layers in the hierarchical recomposition often engage in a fixed-point analysis, as demonstrated by (3.19) for the RMM when aligns the announced with the resultant on their conditional expectations . In contrast to follower-type agents at lower layers, who confront more regular control problems (see (3.16)), the fixed-point analysis for leaders can indeed degenerate, as the minimal deviations become trivially attainable at zero, provided . Consequently, the choice of norm on deviation becomes indifferent. Specifically, the introduction of a quadratic deviation on -norm: as in (3.19), is merely formal. In fact, the top layer, characterized by a high degree of realized controllability, presents a tradeoff where its remaining control capability is more prone to degeneration. However, in LP systems with more complex couplings, there may emerge nonclassical hierarchical layers and the relevant analysis, particularly of top layers, may no longer be degenerate. This is particularly true for LP systems with nested or asymmetric information ([3, 25]), or those with heterogeneous robustness beliefs, as well as those with bridging-intermediate-type couplings. In such instances, the optimal deviation norms cannot be trivially attained at zero, necessitating the replacement of fixed-point analysis with some non-trivial optimization problems.
Summary. Steps 1-3 constitute a unified structural scheme to analyze more general LP systems, particularly those with non-classical coupling structures. In fact, for classical LP systems with only minor (homogeneous or heterogeneous) agents, auxiliary problems can be constructed by straightforward heuristic arguments. The main reason is that their coalition structures are relatively simple, so there is no need to invoke Steps 2-3. A more complex but classical LP system is the one involving a single major agent, for which the auxiliary problem can still be constructed heuristically but requires an exogenous-endogenous relation to handle the additional couplings introduced by the major agent. Our RMM has fairly general weak couplings, especially those of the recursive state pairs, which motivate us to introduce the structural scheme, especially Steps 2-3, to fully quantify the complexities of all resultant perturbation transmissions and construct the auxiliary problem hierarchically. For non-classical LP systems such as those with nested or asymmetric information, or those featuring bridge-type couplings, heuristic arguments are no longer feasible for constructing auxiliary problems. For example, even for LP systems with majors and classes of heterogeneous minors, heuristic construction necessitates an exogenous-endogenous structure represented by a directed graph with edges. By contrast, the structural scheme simplifies this to a sequential flow with only binary relations.
4 Approximated Nash equilibrium of the RMM problem
4.1 Auxiliary control problem for the follower
Now we fix the follower ’s control and solve the auxiliary problem of . The auxiliary problem for is formulated by (3.14) associated with mean-field BSDE (3.15) and McKean-Vlasov SDE (3.13) (its first two components). The related Hamiltonian is defined as
where , , . Using the stochastic maximum principle ([2]), we get the following Hamiltonian system
(4.1) |
where , , , ; are the associated adjoint processes and
(4.2) |
assuming that is an optimal control and the associated optimal state trajectory. Then we have the following result.
Proposition 4.1.
Let (A1) be in force. Moreover, we assume that
(1) There exists a unique maximizer of the Hamiltonian as a function of
(denoted by );
(2) The function is convex in .
If solves system (4.1), the optimal control of is
4.2 Auxiliary control problem for the follower
In this subsection, we fix , the controls adopted by the leader and the follower respectively. Consider the control problem of the follower associated with BSDE (3.17), SDE (3.13) and the functional (3.16). Given (3.18) and the last two equations in (3.13) become exogenous processes. The Hamiltonian functional and the related Hamiltonian system take the following forms
(4.3) |
where the last two equations are the associated adjoint equations and
(4.4) |
Then from the stochastic maximum principle for FBSDE system (e.g. [21]), we have
Proposition 4.2.
Under (A1), assume the FBSDE (4.3) admits a unique solution and
(1) There exists a unique maximizer of the Hamiltonian as a function of
(denoted by );
(2) The function is convex in .
Then the optimal control of is given by
4.3 Consistency condition system
For the sake of presentation, hereafter we may write a bar on top of a random variable (or process) to denote its conditional expectation with respect to , for example, . We impose the following consistency conditions on the followers and the leader :
(4.5) |
By (4.5) and the solution uniqueness of SDE (3.13) and BSDEs (3.15), (3.17)-(3.18), we have
(4.6) |
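To illustrate the fixed-point nature of a consistency condition, we include a minimal sketch under simplifying assumptions that are not part of the RMM model: there is no common noise (so the conditional expectation is a deterministic path), the state is scalar, and each agent follows the hypothetical dynamics dx = (-x + theta * m) dt + sigma dW for a frozen mean path m. The function names and the parameter theta are ours. Consistency then requires m to coincide with the expected state it generates, and a Picard iteration finds this path.

```python
import numpy as np


def expected_state_path(mean_path, x0, theta, dt):
    """Euler scheme for d/dt E[x] = -E[x] + theta * m(t) with the mean path m frozen."""
    xbar = np.empty_like(mean_path)
    xbar[0] = x0
    for k in range(len(mean_path) - 1):
        xbar[k + 1] = xbar[k] + dt * (-xbar[k] + theta * mean_path[k])
    return xbar


def solve_consistency(x0=1.0, theta=0.5, horizon=1.0, n_steps=200,
                      tol=1e-12, max_iter=1000):
    """Picard iteration on the path m until m equals the expected state it induces."""
    dt = horizon / n_steps
    mean_path = np.full(n_steps + 1, x0)  # initial guess: constant path
    for iteration in range(1, max_iter + 1):
        new_path = expected_state_path(mean_path, x0, theta, dt)
        gap = np.max(np.abs(new_path - mean_path))
        mean_path = new_path
        if gap < tol:
            return mean_path, iteration
    raise RuntimeError("fixed-point iteration did not converge")


if __name__ == "__main__":
    path, n_iter = solve_consistency()
    # In this toy model the consistent mean solves m' = (theta - 1) m.
    print(n_iter, path[-1], np.exp(-0.5))
```

In this toy setting the consistent path is known in closed form, m(t) = x0 exp((theta - 1) t), which gives a simple check of the iteration. The consistency system of the paper is considerably richer, since it couples forward and backward components and conditions on the common noise.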
Noting (4.6) and by Propositions 4.1 and 4.2, we may introduce the following assumption:
Assumption (A2)
Suppose that there exists a pair of deterministic continuous functions
satisfying the following conditions
where and are the mappings given in Propositions 4.1 and 4.2.
Under (A2), by the measurable selection theorem, there exists a measurable with
(4.7) |
Combining (4.7) and (A2), we can denote
(4.8) | ||||
Similarly, plugging the mappings into (4.1) and (4.3), we obtain
(4.9) |
with the mixed initial-terminal condition
(resp. ) is given by (4.2) ((4.4)) by replacing () with () and
The following result is a direct consequence based on our previous analysis.
4.4 ε-Nash equilibrium of the RMM problem
Proposition 4.3 yields an approximate Nash equilibrium for the RMM problem. To verify it, we first introduce a technical assumption that is common in the control literature [11]:
Assumption (A3)
(i) The diffusion coefficients are independent of , if applicable.
(ii) The maximizers and in Propositions 4.1 and 4.2 are independent of .
(iii) The system (4.9) has
a unique solution
, where and are -adapted.
(iv) There exists a random decoupling field
such that
with is -adapted for each , and is Lipschitz continuous on all its arguments uniformly on .
The decoupling field is a key tool for studying the well-posedness of FBSDEs; see [13, 30, 31]. Proposition 5.1 presents a sufficient condition ensuring the existence of a decoupling field for the LQG-RMM. By (A3-i), and in Propositions 4.1 and 4.2 are independent of , and this is also the case for in (A2), in (4.7), and in (4.8). With a slight abuse of notation, we write
Then it follows from (4.8) and (A3-iii) that and are -adapted.
Assumption (A4) Both and are Lipschitz in the state variable uniformly on , and
(A3) and (A4) are commonly adopted in the MFG literature (e.g., (A7) and (A8) in [11]). Indeed, (A3) and (A4) are satisfied for the LQG-RMM problem in Section 5. Applying the feedback control pair to (1.1) and (1.3), we get the following forward-backward system
(4.10) |
where , The payoff functionals (1.2) now become
(4.11) | ||||
To ease the notation, hereafter we may write instead of .
Theorem 4.1.
Assume that (A1)-(A4) hold. The feedback control is an ε-Nash equilibrium for the RMM problem, where .
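The approximation error in results of this type stems from replacing the empirical average of finitely many agents by its mean-field limit. The following sketch is only a numerical illustration of that mechanism under assumptions of ours: the agents' states are independent standard Gaussian perturbations around a common mean (a toy model, not the dynamics (4.10)), and the function name is hypothetical. It estimates the root-mean-square gap between the empirical average and its limit for two population sizes; it makes no claim about the precise order of ε in Theorem 4.1.

```python
import numpy as np


def rms_error_of_empirical_average(n_agents, n_trials=300, seed=0):
    """RMS distance between the empirical average of n_agents toy states and its limit."""
    rng = np.random.default_rng(seed)
    limit_mean = 1.0  # hypothetical mean-field limit of the state average
    # Toy states: independent unit-variance fluctuations around the limit.
    states = limit_mean + rng.standard_normal((n_trials, n_agents))
    empirical_avg = states.mean(axis=1)
    return float(np.sqrt(np.mean((empirical_avg - limit_mean) ** 2)))


if __name__ == "__main__":
    for n in (100, 10_000):
        print(n, rms_error_of_empirical_average(n))
```

For independent fluctuations the gap shrinks roughly like the inverse square root of the population size, so increasing the population a hundredfold reduces the error by about a factor of ten in this toy model.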
5 Linear-quadratic-Gaussian cases
This section studies the RMM problem in some linear-quadratic-Gaussian (LQG) cases (LQG-RMM), where the agents’ states evolve by the following linear SDEs: for ,
(5.1) |
and the recursive functionals assume the following quadratic forms
(5.2) | ||||
where satisfy the following coupled linear BSDE system:
(5.3) |
with . To simplify the analysis, we assume that ; all coefficients are constants with nonnegative , positive , and . The forward-backward LQG setting of (5.1)-(5.3) is strongly motivated by various practical applications; see [19, 26, 33] for more details. We introduce the following notation:
(5.4) | ||||
By Propositions 4.1 and 4.2 and (4.5), the equilibrium strategies now read as
(5.5) | ||||
Combining with (4.9), we get the following FBSDE
(5.6) |
For the sake of presentation, we defer the definitions of , and matrices (vectors) of (5.5)-(5.6) to Appendix A.2. (5.6) is a fully-coupled FBSDE with mixed initial-terminal conditions, and its well-posedness can be discussed through the following steps. Setting for and taking conditional expectations in (5.6), we have
(5.7) |
Next, we consider the following ODE and BSDE
(5.8) |
(5.9) |
Again, we defer the definitions of and to Appendix A.2. The well-posedness of (5.8) may be obtained as in Theorem 4.6 of [19]. We refrain from presenting these conditions in detail, as they are rather technical and would be an unnecessary digression from our presentation. Instead, we directly assume that
(A5) (5.8) admits a unique -valued solution with bounded
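An assumption such as (A5) can be checked numerically once the coefficients are fixed. Since the coefficients of (5.8) are deferred to Appendix A.2, the sketch below does not use them; instead it assumes a hypothetical scalar Riccati equation -P'(t) = 2aP - (b^2/r)P^2 + q with terminal value P(T), integrates it backward with a classical Runge-Kutta scheme, and reports the sup-norm of the solution. All names and parameter values are ours.

```python
import numpy as np


def riccati_rhs(p, a, b, q, r):
    """dP/ds = 2 a P - (b^2 / r) P^2 + q in reversed time s = T - t."""
    return 2.0 * a * p - (b * b / r) * p * p + q


def solve_riccati_backward(a, b, q, r, terminal, horizon, n_steps=1000):
    """Integrate the scalar Riccati ODE from t = T down to t = 0 with RK4."""
    ds = horizon / n_steps
    p = np.empty(n_steps + 1)
    p[0] = terminal  # s = 0 corresponds to t = T
    for k in range(n_steps):
        k1 = riccati_rhs(p[k], a, b, q, r)
        k2 = riccati_rhs(p[k] + 0.5 * ds * k1, a, b, q, r)
        k3 = riccati_rhs(p[k] + 0.5 * ds * k2, a, b, q, r)
        k4 = riccati_rhs(p[k] + ds * k3, a, b, q, r)
        p[k + 1] = p[k] + (ds / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return p[::-1]  # reorder so that index 0 corresponds to t = 0


if __name__ == "__main__":
    # With a = 0, b = q = r = 1 and P(T) = 0 the solution is P(t) = tanh(T - t).
    path = solve_riccati_backward(0.0, 1.0, 1.0, 1.0, 0.0, horizon=2.0)
    print(path[0], np.tanh(2.0), np.max(np.abs(path)))
```

The chosen parameters admit the closed form P(t) = tanh(T - t), which is bounded by one, so the numerical path can be compared against it. For the matrix-valued system (5.8), the same backward integration applies componentwise, but boundedness has to be verified for the actual coefficients.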
Under (A5), BSDE (5.9) admits a unique -adapted solution by the standard BSDE solvability arguments. Moreover, we have the following result:
Lemma 5.1.
Let (A5) hold. Then the linear FBSDE (5.7) has a unique -adapted solution with the following relations:
(5.10) | ||||
Its proof is based on a standard linear transformation decoupling method (e.g., [17]) and we omit its details here. We now assume
(A6) where .
Proposition 5.1.
We defer the proof of Proposition 5.1 to Appendix A.3. Combining (5.5), (5.10) and (5.12), the equilibrium strategies for and are, respectively,
where
Note that both and are -adapted. By Theorem 4.1, we have the following result.
Theorem 5.1.
5.1 Forward LQG-RMM
This subsection studies a special case with in (5.1), (5.2). In this case, all functionals are still quadratic but involve only the forward state, and the RMM reduces to the classical forward major-minor game in [11]. Although such a forward setting is not novel, our LQG-RMM is still novel in the generality of its weak-couplings. Now (5.8) and (5.9) become
It is easy to check that if
(5.14) |
then (A5) can be ensured and admits the following representation
(5.15) |
By the solution uniqueness, so (noting ). Then
Corollary 5.1.
Example 5.1.
Consider a forward LQG-RMM with and (no weak-coupling of the control-average). The ε-Nash equilibrium given by Corollary 5.1 becomes
where and This result recovers Theorem 5.1 in [11] with a subtle difference: note that in [11] while our . This is mainly due to the modeling differences in weak-couplings: [11] considers the empirical distribution while we focus on the more special empirical average. As a tradeoff, we can obtain an explicit expression for while [11] only shows its existence.
5.2 Backward LQG-RMM
In this subsection, , , in (5.2) and (5.3). The LQG-RMM is now solely “backward” without the forward state (5.1). The backward LQG setting has found broad applications in areas such as optimal investment, recursive utility and hedging (e.g., [20]). Related MFG studies on this setting have also been well addressed (see [14, 22]). However, the backward LQG-RMM still seems novel in the literature. It can capture the insights of the large investor (see [16]) and relative performance ([14]), both of which are well motivated in financial studies. In this case, (5.8) reads as (noting that is -valued)
(5.16) | ||||
and the linear BSDE (5.9) now becomes
(5.17) |
where . Moreover, it follows from (5.11)-(5.12) that . Besides, by (5.7) and Lemma 5.1, satisfies
(5.18) |
with the initial condition In particular, if , then ; and if , then Since (A6) can be readily verified, the following result follows directly from Theorem 5.1.
Corollary 5.2.
We have the following observations on (5.19):
(1) Both and are -adapted, and depend on the common noise only through and . If the BSDE’s driver of (5.3) is independent of the intensity state (namely, , so ),
the equilibrium for each agent becomes deterministic.
(2) Unlike the forward LQG-RMM (Corollary 5.1) with -adapted equilibrium, the equilibrium in the backward case is -adapted, hence the idiosyncratic information driven by individual noise plays no role in the equilibrium. This is mainly because the driver of BSDE (5.3) is now linear and independent of the principal intensity state .
(3) If the major agent is absent, the equilibrium for each (minor) agent becomes . Comparing with in (5.19), one can see that the term captures the influence of .
Remark 5.1.
(ii) If , and , there is no weak-coupling through the control-average, and the ε-Nash equilibrium is given by Besides, when the major agent is absent, the above result recovers that of [22] (Theorem 3.1).
We present two concrete examples with more explicit representations for .
Example 5.2.
Consider the backward LQG-RMM problem with: for
Therefore, the major and each minor agent are weakly-coupled through their control-average along with an identical relative performance parameter. in Remark 5.1 now reads as
Since , , hence (A5) holds. From (5.20),
We calculate from (5.19) that, for
Remark 5.2.
(1) The equilibrium strategies depend linearly on the terminal conditions via the parameters . (2) When , the major’s payoff is independent of the control-average of all minors, so her Nash strategy no longer depends on , but may still be influenced by each minor agent provided .
6 Conclusions
This paper studies a new class of recursive major-minor (RMM) games featuring: (1) recursive functionals with nonlinear BSDE representations; (2) comprehensive and general weak-couplings. We propose a novel structural scheme to construct its auxiliary problem, a key step towards the desired ε-Nash equilibrium. In the RMM context, this scheme consists of a bilateral perturbation and a mixed triple-agent leader-follower-Nash analysis. In contrast to the heuristic arguments in most of the MFG literature, such a scheme lays down a unified game-theoretic foundation for analyzing more complex LP coupling structures, such as those with heterogeneous robust beliefs or with coalition interactions from nested information. We plan to address them in future work.
Appendix A
A.1 Proof of Theorem 4.1
For a fixed , it suffices to verify the ε-Nash equilibrium property of , on the side of . The verification on the side of is analogous, so we omit the details here. For this purpose, we consider the following limiting processes of (4.10) and (4.11):
and where By the Burkholder-Davis-Gundy inequality and standard convergence estimates of SDEs (e.g., Theorem 10.1.7 in [35]), for we have
It follows from Gronwall’s inequality that
(A.1) |
Applying Itô’s formula to and , we have
(A.2) | ||||
where and stands for some positive constant. Combining (A.1) and (A.2),
As to the functionals,
We study the unilateral deviation of from the strategy . Assume now that adopts a different control and keep applying . The resulting perturbed states, denoted by , should satisfy
and the related limiting processes are given by
Similar to (A.1), By the same estimates as in the preceding bound for the functionals,
(A.3) |
Since is an equilibrium strategy of the limiting triple-agent game problem, it is clear that , and combining the preceding bound for the functionals with (A.3), we get the desired result for .
A.2 Some notations in Section 5
A.3 Proof of Proposition 5.1
The uniqueness part follows by standard arguments. Thus, we focus only on the existence part. By Lemma 5.1, we can substitute the solution of (5.7) into (5.6). Then, its solution can be constructed by the following three steps.
Step 1. The construction of and .
The triple (if it exists) should be -adapted, and define From (5.6), the adjoint equations (except those of ) satisfy
(A.4) |
(A.4) is a linear decoupled FBSDE with nonhomogeneous terms, thus it admits a unique solution . Moreover, comparing (A.4) with (5.7), we have
Step 2. The construction of .
It follows from the system (5.6) that the 4-tuple satisfies
(A.5) |
where and are -adapted. Next we introduce the following Riccati equation and BSDE
Under (A6), it follows from Theorem 4.3 in [31] (see p.48) that takes the form (5.11). Since is bounded, the above linear BSDE has a unique -adapted solution with (5.12). By the relation one can show the well-posedness of (A.5).
Step 3. The construction of .
Denote Then is the unique -adapted solution of the following BSDE
Finally, combining the above three steps we construct a solution of the system (5.6).
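The decoupling idea used in Step 2 can be illustrated on a deterministic toy analogue, which we sketch under assumptions of ours (scalar states, no noise, hypothetical coefficients a, b, c, d, g): the two-point boundary value problem x' = ax + by with x(0) = x0 and y' = -(cx + dy) with y(T) = g x(T). The ansatz y = P(t)x leads to the Riccati equation -P' = (a + d)P + bP^2 + c with P(T) = g, after which the forward equation can be solved from the initial value y(0) = P(0)x0. The code checks that the resulting pair indeed meets the terminal condition.

```python
import numpy as np


def rk4_step(f, z, h):
    """One classical Runge-Kutta step for the autonomous system z' = f(z)."""
    k1 = f(z)
    k2 = f(z + 0.5 * h * k1)
    k3 = f(z + 0.5 * h * k2)
    k4 = f(z + h * k3)
    return z + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def decouple_linear_bvp(a, b, c, d, g, x0, horizon, n_steps=1000):
    """Solve the toy forward-backward system through the decoupling ansatz y = P x."""
    h = horizon / n_steps

    # Riccati equation in reversed time s = T - t: dP/ds = (a + d) P + b P^2 + c.
    def riccati(p):
        return (a + d) * p + b * p * p + c

    p_rev = np.empty(n_steps + 1)
    p_rev[0] = g  # terminal condition P(T) = g
    for k in range(n_steps):
        p_rev[k + 1] = rk4_step(riccati, p_rev[k], h)
    p = p_rev[::-1]  # index 0 now corresponds to t = 0

    # Forward integration of (x, y) started from the decoupled initial value.
    matrix = np.array([[a, b], [-c, -d]])
    z = np.empty((n_steps + 1, 2))
    z[0] = (x0, p[0] * x0)
    for k in range(n_steps):
        z[k + 1] = rk4_step(lambda v: matrix @ v, z[k], h)
    return p, z[:, 0], z[:, 1]


if __name__ == "__main__":
    p, x, y = decouple_linear_bvp(-0.5, -1.0, 1.0, -0.5, 1.0, x0=1.0, horizon=1.0)
    print("terminal mismatch:", abs(y[-1] - 1.0 * x[-1]))
    print("max |y - P x|:", np.max(np.abs(y - p * x)))
```

In the stochastic setting of (A.5) the decoupling relation additionally carries an affine term solved by a linear BSDE, which this deterministic sketch omits.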
References
- [1] A. Agrawal and R. E. Barlow, A survey of network reliability and domination theory, Operations Research, 32 (1984), pp. 478–492.
- [2] D. Andersson and B. Djehiche, A maximum principle for SDEs of mean-field type, Applied Mathematics & Optimization, 63 (2011), pp. 341–356.
- [3] R. J. Aumann, M. Maschler, and R. E. Stearns, Repeated games with incomplete information, MIT press, 1995.
- [4] J. Aurand and Y.-J. Huang, Mortality and healthcare: A stochastic control analysis under Epstein–Zin preferences, SIAM Journal on Control and Optimization, 59 (2021), pp. 4051–4080.
- [5] A. Bensoussan, M. H. Chau, and S. C. Yam, Mean field games with a dominating player, Applied Mathematics & Optimization, 74 (2016), pp. 91–128.
- [6] P. Bergault, P. Cardaliaguet, and C. Rainer, Mean field games in a Stackelberg problem with an informed major player, SIAM Journal on Control and Optimization, 62 (2024), pp. 1737–1765.
- [7] R. Buckdahn, J. Li, and S. Peng, Nonlinear stochastic differential games involving a major player and a large number of collectively acting minor agents, SIAM Journal on Control and Optimization, 52 (2014), pp. 451–492.
- [8] P. Cardaliaguet, M. Cirant, and A. Porretta, Remarks on Nash equilibria in mean field game models with a major player, Proceedings of the American Mathematical Society, 148 (2020), pp. 4241–4255.
- [9] R. Carmona and F. Delarue, Extensions for volume I, Probabilistic Theory of Mean Field Games with Applications I: Mean Field FBSDEs, Control, and Games, (2018), pp. 619–680.
- [10] R. Carmona and P. Wang, An alternative approach to mean field game with major and minor players, and applications to herders impacts, Applied Mathematics & Optimization, 76 (2017), pp. 5–27.
- [11] R. A. Carmona and X. Zhu, A probabilistic approach to mean field games with major and minor players, Annals of Applied Probability, 26 (2016), pp. 1535–1580.
- [12] Z. Chen and L. Epstein, Ambiguity, risk, and asset returns in continuous time, Econometrica, 70 (2002), pp. 1403–1443.
- [13] F. Delarue, On the existence and uniqueness of solutions to FBSDEs in a non-degenerate case, Stochastic Processes and Their Applications, 99 (2002), pp. 209–286.
- [14] K. Du, J. Huang, and Z. Wu, Linear quadratic mean-field-game of backward stochastic differential systems, Mathematical Control and Related Fields, 8 (2018), pp. 653–678.
- [15] D. Duffie and L. G. Epstein, Stochastic differential utility, Econometrica, 60 (1992), pp. 353–394.
- [16] N. El Karoui, S. Peng, and M. C. Quenez, A dynamic maximum principle for the optimization of recursive utilities under constraints, Annals of Applied Probability, 11 (2001), pp. 664–693.
- [17] X. Feng, Y. Hu, and J. Huang, Backward Stackelberg differential game with constraints: a mixed terminal-perturbation and linear-quadratic approach, SIAM Journal on Control and Optimization, 60 (2022), pp. 1488–1518.
- [18] Z. Hellman and Y. J. Levy, Measurable selection for purely atomic games, Econometrica, 87 (2019), pp. 593–629.
- [19] M. Hu, S. Ji, and X. Xue, Optimization under rational expectations: A framework of fully coupled forward-backward stochastic linear quadratic systems, Mathematics of Operations Research, 48 (2023), pp. 1767–1790.
- [20] Y. Hu, J. Huang, and W. Li, Backward stochastic differential equations with conditional reflection and related recursive optimal control problems, SIAM Journal on Control and Optimization, 62 (2024), pp. 2557–2589.
- [21] J. Huang, W. Li, and H. Zhao, A class of optimal control problems of forward–backward systems with input constraint, Journal of Optimization Theory and Applications, 199 (2023), pp. 1050–1084.
- [22] J. Huang, S. Wang, and Z. Wu, Backward mean-field linear-quadratic-Gaussian (LQG) games: full and partial information, IEEE Transactions on Automatic Control, 61 (2016), pp. 3784–3796.
- [23] M. Huang, Large-population LQG games involving a major player: the Nash certainty equivalence principle, SIAM Journal on Control and Optimization, 48 (2010), pp. 3318–3353.
- [24] M. Huang, R. P. Malhamé, and P. E. Caines, Large population stochastic dynamic games: closed-loop McKean-Vlasov systems and the Nash certainty equivalence principle, Communications in Information and Systems, 6 (2006), pp. 221–252.
- [25] E. Kamenica, Bayesian persuasion and information design, Annual Review of Economics, 11 (2019), pp. 249–272.
- [26] X.-I. Kartala, N. Englezos, and A. N. Yannacopoulos, Future expectations modeling, random coefficient forward–backward stochastic differential equations, and stochastic viscosity solutions, Mathematics of Operations Research, 45 (2020), pp. 403–433.
- [27] J.-M. Lasry and P.-L. Lions, Mean field games, Japanese Journal of Mathematics, 2 (2007), pp. 229–260.
- [28] A. Lazrak, Generalized stochastic differential utility and preference for information, The Annals of Applied Probability, 14 (2004), pp. 2149–2175.
- [29] A. Lazrak and M. C. Quenez, A generalized stochastic differential utility, Mathematics of Operations Research, 28 (2003), pp. 154–180.
- [30] J. Ma, Z. Wu, D. Zhang, and J. Zhang, On well-posedness of forward-backward SDEs-a unified approach, Annals of Applied Probability, 25 (2015), pp. 2168–2214.
- [31] J. Ma and J. Yong, Forward-Backward Stochastic Differential Equations and Their Applications, no. 1702, Springer Science & Business Media, 1999.
- [32] Y. Ma and M. Huang, Linear quadratic mean field games with a major player: The multi-scale approach, Automatica, 113 (2020), p. 108774.
- [33] M. Miller and P. Weller, Stochastic saddlepoint systems stabilization policy and the stock market, Journal of Economic Dynamics and Control, 19 (1995), pp. 279–302.
- [34] M. Nourian and P. E. Caines, ε-Nash mean field game theory for nonlinear stochastic dynamical systems with major and minor agents, SIAM Journal on Control and Optimization, 51 (2013), pp. 3302–3331.
- [35] S. T. Rachev and L. Rüschendorf, Mass Transportation Problems: Applications, Springer Science & Business Media, 2006.