On Numerical approximations of fractional and nonlocal Mean Field Games

Indranil Chowdhury (I. Chowdhury) Department of Mathematics, Faculty of Science, University of Zagreb, Croatia indranil.chowdhury@ntnu.no, indranill2011@gmail.com , Olav Ersland (O. Ersland) Norwegian University of Science and Technology, Norway olav.ersland@ntnu.no and Espen R. Jakobsen (E. R. Jakobsen) Norwegian University of Science and Technology, Norway espen.jakobsen@ntnu.no

Abstract.

We construct numerical approximations for Mean Field Games with fractional or nonlocal diffusions. The schemes are based on semi-Lagrangian approximations of the underlying control problems/games along with dual approximations of the distributions of agents. The methods are monotone, stable, and consistent, and we prove convergence along subsequences for (i) degenerate equations in one space dimension and (ii) nondegenerate equations in arbitrary dimensions. We also give results on full convergence and convergence to classical solutions. Numerical tests are implemented for a range of different nonlocal diffusions and support our analytical findings.

Key words and phrases:

Mean Field Games, jump diffusion, anomalous diffusion, nonlocal operators, fractional PDEs, nonlocal PDEs, degenerate PDEs, semi-Lagrangian scheme, convergence, compactness, Fokker-Planck equations, Hamilton-Jacobi-Bellman equations, duality methods

2020 Mathematics Subject Classification:

35Q89, 47G20, 35Q84, 49L12, 45K05, 35K61, 65M12, 91A16, 65M22, 35R11, 35R06

1. Introduction

In this article we study numerical approximations of Mean Field Games (MFGs) with fractional and general non-local diffusions. We consider the mean field game system

(1)

\displaystyle\begin{cases}-u_{t}-\mathcal{L}u+H(x,Du)=F(x,m(t)),\quad&\text{ in }(0,T)\times\mathbb{R}^{d},\\ m_{t}-\mathcal{L}^{*}m-\text{div}(mD_{p}H(x,Du))=0\quad&\text{ in }(0,T)\times\mathbb{R}^{d},\\ u(T,x)=G(x,m(T)),\ m(0)=m_{0}\quad&\text{ in }\mathbb{R}^{d},\end{cases}

where

(2)

\displaystyle\mathcal{L}\phi(x)=\int_{|z|>0}\big{[}\phi(x+z)-\phi(x)-\mathbbm{1}_{\{|z|<1\}}D\phi(x)\cdot z\big{]}d\nu(z),

is a nonlocal diffusion operator (possibly degenerate), $\nu$ is a Lévy measure (see assumption ( $\nu$ 0): ), and the adjoint $\mathcal{L}^{*}$ is defined as $(\mathcal{L}^{*}\phi,\psi)_{L^{2}}=(\phi,\mathcal{L}\psi)_{L^{2}}$ for $\phi,\psi\in C_{c}^{\infty}(\mathbb{R}^{d})$ .

The first equation in (1) is a backward in time Hamilton-Jacobi-Bellman (HJB) equation with terminal data $G$ , and the second equation is a forward in time Fokker-Planck-Kolmogorov (FPK) equation with initial data $m_{0}$ . Here $H$ is the Hamiltonian, and the system is coupled through the cost functions $F$ and $G$ . There are two different types of couplings: (i) Local couplings where $F$ and $G$ depend on point values of $m$ , and (ii) non-local or smoothing couplings where they depend on distributional properties induced from $m$ through integration or convolution. Here we work with nonlocal couplings.

A mathematical theory of MFGs was introduced by Lasry–Lions [49] and Caines–Huang–Malhame [44], and describes the limiting behavior of $N$ -player stochastic differential games when the number of players $N$ tends to $\infty$ [18]. In recent years there has been significant progress on MFG systems with local (or no) diffusion, including e.g. modeling, wellposedness, numerical approximations, long time behavior, convergence of Nash equilibria, and various control and game theoretic questions, see e.g. [5, 27, 18, 13, 39, 43] and references therein. The study of MFGs with ‘non-local diffusion’ is quite recent, and few results exist so far. Stationary problems with fractional Laplacians were studied in [30], and parabolic problems, including (1), in [33] and [37]. We refer to [48] and references therein for some developments using probabilistic methods.

The difference between problem (1) and standard MFG formulations lies in the type of noise driving the underlying controlled stochastic differential equations (SDEs). Usually Gaussian noise is considered [49, 51, 20, 26, 5], or there is no noise (the first order case) [17, 19]. Here the underlying SDEs are driven by pure jump Lévy processes, which leads to nonlocal operators (2) in the MFG system. In many real-world applications, jump processes model the observed noise better than Gaussian processes [9, 50, 34, 54]. Prototypical examples are symmetric $\sigma$ -stable processes and their generators, the fractional Laplace operators $(-\triangle)^{\frac{\sigma}{2}}$ . In economics and finance the observed noise is not symmetric and $\sigma$ -stable, but rather non-symmetric and tempered. A typical example is the one-dimensional CGMY process [34] where $\frac{d\nu}{dz}(z)=\frac{C}{|z|^{1+Y}}e^{-Gz^{+}-Mz^{-}}$ for $C,G,M>0$ and $Y\in(0,2)$ . Such models are covered by the results of this article. Our assumptions on the nonlocal operators (cf. ( $\nu$ 1): ) are quite general, allowing for degenerate operators and no restrictions on the tail of the Lévy measure $\nu$ .
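As a concrete illustration of the CGMY example, the following minimal Python sketch evaluates the density above and approximates the integral $\int 1\wedge|z|^{2}\,d\nu(z)$ , which must be finite for a Lévy measure. The function names, the truncation parameters `eps` and `z_max`, and the log-spaced midpoint rule are choices made for this illustration only; they are not part of the schemes constructed in this article.

```python
import math


def cgmy_density(z, C, G, M, Y):
    """CGMY Levy density C |z|^{-(1+Y)} exp(-G z^+ - M z^-) for z != 0."""
    if z == 0.0:
        raise ValueError("the Levy density is singular at z = 0")
    z_plus = max(z, 0.0)    # positive part z^+
    z_minus = max(-z, 0.0)  # negative part z^-
    return C * abs(z) ** (-(1.0 + Y)) * math.exp(-G * z_plus - M * z_minus)


def levy_integral(density, eps=1e-6, z_max=50.0, n=20000):
    """Approximate int min(1, z^2) dnu(z) over eps < |z| < z_max.

    Uses the substitution z = exp(s) and a midpoint rule in s
    (an illustrative quadrature, chosen to resolve the singularity at 0).
    """
    lo, hi = math.log(eps), math.log(z_max)
    ds = (hi - lo) / n
    total = 0.0
    for i in range(n):
        z = math.exp(lo + (i + 0.5) * ds)
        weight = min(1.0, z * z) * z * ds  # dz = z ds
        total += weight * (density(z) + density(-z))
    return total
```

For instance, `levy_integral(lambda z: cgmy_density(z, 1.0, 5.0, 5.0, 0.5))` returns a finite positive number, while taking $G\neq M$ makes the density non-symmetric.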

There has been some development on numerical approximations for MFG systems with local operators. Finite difference schemes for nondegenerate second order equations have been designed and analyzed e.g. by Achdou et al. [1, 2, 3, 4, 7, 8, 6] and Gueant [40, 42, 41]. Semi-Lagrangian (SL) schemes for MFG systems have been developed by Carlini–Silva both for first order equations [23] and possibly degenerate second order equations [24]. Other numerical schemes for MFGs include recent machine learning methods [28, 29, 52] for high dimensional problems. We refer to the survey article [6] for recent developments on numerical methods for MFGs. We know of no prior schemes or numerical analysis for MFGs with fractional or nonlocal diffusions.

In this paper we will focus on SL schemes. They are monotone, stable, connected to the underlying control problem, easily handle degenerate and arbitrarily directed diffusions, and allow large time steps. Although SL schemes for HJB equations have been studied for some time (see e.g. [38, 16, 14, 35]), there are few results for FPK equations (but see [25]) and the coupled MFG system. For nonlocal problems we only know of the results in [15] for HJB equations.

Our contributions

A. Derivation. We construct fully discrete monotone numerical schemes for the MFG system (1). These dual SL schemes are closely related to the underlying control formulation of the MFG. In our case it is based on the following controlled SDE:

dX_{t}=-\alpha_{t}\,dt+dL_{t},

where $\alpha_{t}$ is the control and $L_{t}$ a pure jump Lévy process (cf. (6)). Note that $L_{t}$ can be decomposed into small and large jumps, where the small jumps may have infinite intensity. We derive our approximation in several steps:

1. (Approximate small jumps) The small jumps are approximated by Brownian motion (see (7)) following e.g. [10, 15, 36]. This is done to avoid infinitely many jumps per time-interval and singular integrals, and gives a better approximation compared to simply neglecting these terms.

2. (SL scheme for HJB) We discretise the resulting SDE from step 1 in time and approximate the noise by random walks and approximate compound Poisson processes in the spirit of [15] (Section 3.1). From the corresponding discrete time optimal control problem, dynamic programming, and interpolation we construct an SL scheme for the HJB equation (Section 3.2).

3. (Approximate control) We define an approximate optimal feedback control for the SL scheme in step 2 from the continuous optimal feedback control as in [23, 24]: $\alpha^{*}_{\textup{approx}}=D_{p}H(\cdot,Du_{d}^{\epsilon})$ , where $u_{d}^{\epsilon}$ is a regularization of the (interpolated) solution from step 2 (Section 3.3).

4. (Dual SL scheme for FPK) The control of step 3 and the scheme in step 2 define a controlled approximate SDE with a corresponding discrete FPK equation for the densities of the solutions. We explicitly derive this FPK equation in weak form, and obtain the final dual SL scheme taking test functions to be linear interpolation basis functions (Section 3.4).

See (18) and (24) in Section 3 for the specific form of our discretizations. These seem to be the first numerical approximations of MFG systems with nonlocal or fractional diffusion and the first SL approximations of nonlocal FPK equations. Our dual SL schemes are extensions to the nonlocal case of the schemes in [23, 24, 25], but a clear derivation of this type of scheme seems to be new. The schemes come in the form of nonlinear coupled systems (27) that need to be resolved numerically. We prove existence of solutions using fixed point arguments, see Proposition 3.4.

B. Analysis. We establish a range of properties for the scheme including monotonicity, consistency, stability, (discrete) regularity, convergence of individual equations, and convergence to the full MFG system.

1. (HJB approximation) For the approximation of the HJB equation we prove pointwise consistency and uniform discrete $L^{\infty}$ , Lipschitz, and semiconcavity bounds. Convergence to a viscosity solution is obtained via the half relaxed limit method [12].

2. (FPK approximation) We prove consistency in the sense of distributions, preservation of mass and positivity, $L^{1}$ -stability, tightness, and equi-continuity in time. In dimension $d=1$ , we also prove uniform $L^{p}$ -estimates for all $p\in(1,\infty]$ . Convergence is obtained from compactness and stability arguments.

3. (The full MFG approximation) We prove convergence along subsequences to viscosity-very weak solutions of the MFG system in two cases: (i) Degenerate equations in dimension $d=1$ , and (ii) nondegenerate equations in $\mathbb{R}^{d}$ under the assumption that solutions of the HJB equation are $C^{1}$ in space. Full convergence follows for MFGs with unique solutions, and convergence to classical solutions follows under certain regularity and weak uniqueness conditions. Applying the results to the setting of [37], we obtain full convergence to classical solutions in this case.

Because of the nonlocal or smoothing couplings, the HJB approximation can be analysed almost independently of the FPK approximation. The analysis of the FPK scheme, on the other hand, depends strongly on boundedness and regularity properties of solutions of the HJB scheme. Compactness in measure is enough in the nondegenerate case when the HJB equation has $C^{1}$ solutions, while stronger weak ( $*$ ) compactness in $L^{p}$ for some $p\in(1,\infty]$ is needed in the degenerate case. As in [23], we are only able to prove this latter compactness in dimension $d=1$ . A priori estimates and convergence for $p\in(1,\infty)$ seem to be new also for local MFGs.

In this paper we study general Lévy jump processes and nonlocal operators. This means that the underlying stochastic processes may not have first moments whatever initial distribution we take (like e.g. $\sigma$ -stable processes with $\sigma<1$ ), and then we can no longer work in the commonly used Wasserstein-1 space $(P_{1},d_{1})$ for the FPK equations. Instead we work in the space $(P,d_{0})$ of probability measures under weak convergence metrized by the Rubinstein-Kantorovich metric $d_{0}$ (see Section 2). Surprisingly, a result from [31] (Proposition 6.1) allows us to prove tightness and compactness in this space without any moment assumptions! We refer to Section 4.3 for a more detailed discussion along with convergence results in the traditional $(P_{1},d_{1})$ topology when first moments are available.

This $(P,d_{0})$ setting can be adapted to local problems, to give results also there without moment assumptions. Finally, we note that our results for degenerate problems cover the first order equations and improve [23] in the sense that more general initial distributions $m_{0}$ are allowed: $P\cap L^{p}$ for some $p\in(1,\infty]$ instead of $P_{1+\delta}\cap L^{\infty}$ for some $\delta>0$ .

C. Testing. We provide several numerical simulations. In Examples 1 and 2 we use a setup similar to that of [24], comparing the effects of a range of different diffusion operators: Fractional Laplacians of different powers, CGMY-diffusions, a degenerate diffusion, a spectrally one-sided diffusion, as well as classical local diffusion and the case of no diffusion. In Example 3 we solve the MFG system on a long time horizon and observe the turnpike property in a nonlocal setting. Finally, in Example 4 we study the convergence of the scheme.

Outline of the paper

In Section 2 we list our assumptions and state mostly known results for the MFG system (1) and its individual HJB and FPK equations. In Section 3 we construct the discrete schemes for the HJB, FPK, and full MFG equations from the underlying stochastic control problem/game. The convergence results are given in Section 4, along with extensions and a discussion section. In Sections 5 and 6 we analyze the discretisations of the HJB and FPK equations respectively, including establishing a priori estimates, stability, and some consistency results. Using these results, we prove the convergence results of Section 4 in Section 7. In Section 8 we provide and discuss numerical simulations of various nonlocal MFG systems. Finally, there are three appendices with proofs of technical results.

2. Assumptions and Preliminaries

We start with some notation. By $C,K$ we mean various constants which may change from line to line. The Euclidean norm on any $\mathbb{R}^{d}$ -type space is denoted by $|\cdot|$ . For any subset $Q\subseteq\mathbb{R}^{d}$ or $Q\subseteq[0,T]\times\mathbb{R}^{d}$ , and for any bounded, possibly vector valued function on $Q$ , we will consider $L^{p}$ -spaces $L^{p}(Q)$ and spaces $C_{b}(Q)$ of bounded continuous functions. Often we use the notation $\|\cdot\|_{0}$ as an alternative notation for the norms in $C_{b}$ or $L^{\infty}$ . The space $C^{m}_{b}(Q)$ is the subset of $C_{b}(Q)$ with $m$ bounded and continuous derivatives, and for $Q\subseteq[0,T]\times\mathbb{R}^{d}$ , $C^{l,k}_{b}(Q)$ is the subset of $C_{b}(Q)$ with $l$ bounded and continuous derivatives in time and $k$ in space. By $P(\mathbb{R}^{d})$ we denote the set of probability measures on $\mathbb{R}^{d}$ . The Kantorovich-Rubinstein distance $d_{0}(\mu_{1},\mu_{2})$ on the space $P(\mathbb{R}^{d})$ is defined as

d_{0}(\mu_{1},\mu_{2}):=\sup_{f\in\mbox{Lip}_{1,1}(\mathbb{R}^{d})}\Big{\{}\int_{\mathbb{R}^{d}}f(x)d(\mu_{1}-\mu_{2})(x)\Big{\}},

where $\mbox{Lip}_{1,1}(\mathbb{R}^{d})=\Big{\{}f:f\,\mbox{is Lipschitz continuous and}\,\|f\|_{0},\|Df\|_{0}\leq 1\Big{\}}$ . We define the Legendre transform $L$ of $H$ as:

\displaystyle L(x,q):=\sup_{p\in\mathbb{R}^{d}}\big{\{}p\cdot q-H(x,p)\big{\}}.
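To make the transform concrete, here is a minimal Python sketch that approximates $L(q)=\sup_{p}\{p\,q-H(p)\}$ in one dimension by maximizing over a finite grid of $p$ -values. The grid, the function name `legendre_transform`, and the two sample Hamiltonians are assumptions made for illustration only. For $H(p)=\frac{1}{2}p^{2}$ the exact transform is $L(q)=\frac{1}{2}q^{2}$ , and for $H(p)=\frac{1}{4}p^{4}$ it is $L(q)=\frac{3}{4}|q|^{4/3}$ , in line with the conjugate growth exponents $r$ and $\frac{r}{r-1}$ .

```python
def legendre_transform(H, q, p_grid):
    """Approximate L(q) = sup_p { p*q - H(p) } by a maximum over a finite grid.

    The result is a lower bound for the true supremum; it is accurate only
    if the maximizing p lies inside the range covered by p_grid.
    """
    return max(p * q - H(p) for p in p_grid)


# Illustrative uniform grid on [-10, 10] with spacing 1e-3 (an arbitrary choice).
P_GRID = [-10.0 + 1.0e-3 * i for i in range(20001)]


def quadratic_hamiltonian(p):
    return 0.5 * p * p


def quartic_hamiltonian(p):
    return 0.25 * p ** 4
```

A call such as `legendre_transform(quadratic_hamiltonian, 1.5, P_GRID)` can then be compared with the exact value $\frac{1}{2}(1.5)^{2}$ .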

We use the following assumptions for equation (1):

( $\nu$ 0):

(Lévy condition) $\nu$ is a positive Radon measure that satisfies

\displaystyle\int_{\mathbb{R}^{d}}1\wedge|z|^{2}d\nu(z)<\infty.

( $\nu$ 1):

(Growth near singularity) There exist constants $\sigma\in(0,2)$ and $C>0$ such that the density of $\nu$ for $|z|<1$ satisfies

\displaystyle 0\leq\frac{d\nu}{dz}\leq\frac{C}{|z|^{d+\sigma}},\text{ for }|z|<1.

(L0):

(Continuity and local boundedness) The function $L:\mathbb{R}^{d}\times\mathbb{R}^{d}\to\mathbb{R}$ is continuous in $x,q$ , and for any $K>0$ , there exists $C_{L}(K)>0$ such that

\displaystyle\sup_{|q|\leq K}|L(x,q)|\leq C_{L}(K),\qquad x\in\mathbb{R}^{d}.

(L1):

(Convexity and growth) The function $L(x,q)$ is convex in $q$ and satisfies

\displaystyle\lim_{|q|\to+\infty}\frac{L(x,q)}{|q|}=+\infty,\qquad x\in\mathbb{R}^{d}.

(L2):

(Lipschitz regularity) There exists a constant $L_{L}>0$ independent of $q$ , such that

\displaystyle|L(x,q)-L(y,q)|\leq L_{L}|x-y|.

(L3):

(Semi-concavity) There exists a constant $c_{L}>0$ independent of $q$ , such that

\displaystyle L(x+y,q)-2L(x,q)+L(x-y,q)\leq c_{L}|y|^{2}.

(F1):

(Uniform bounds) There exist constants $C_{F},C_{G}>0$ such that

\displaystyle|F(x,\mu)|\leq C_{F},|G(x,\mu)|\leq C_{G},\qquad\forall x\in\mathbb{R}^{d},\mu\in P(\mathbb{R}^{d}).

(F2):

(Lipschitz assumption) There exist constants $L_{F},L_{G}>0$ such that

\displaystyle\begin{split}&\|F(x,\mu_{1})-F(y,\mu_{2})\|\leq L_{F}\big{[}\|x-y\|+d_{0}(\mu_{1},\mu_{2})\big{]},\\ &\|G(x,\mu_{1})-G(y,\mu_{2})\|\leq L_{G}\big{[}\|x-y\|+d_{0}(\mu_{1},\mu_{2})\big{]}.\end{split}

(F3):

(Semi-concavity) There exist constants $c_{F},c_{G}>0$ such that

\displaystyle\begin{split}&F(x+y,\mu)-2F(x,\mu)+F(x-y,\mu)\leq c_{F}|y|^{2},\\ &G(x+y,\mu)-2G(x,\mu)+G(x-y,\mu)\leq c_{G}|y|^{2}.\end{split}

(M):

(Initial condition) We assume $m_{0}\in P(\mathbb{R}^{d})$ .

(M’):

The dimension $d=1$ , and $m_{0}\in P(\mathbb{R})\cap L^{p}(\mathbb{R})$ for some $p\in(1,\infty]$ .

By (L1): , the Legendre transform $H=L^{*}$ is well defined and the optimal $q$ is $q^{*}=D_{p}H(x,p)$ . To study the convergence of the numerical schemes we further assume local uniform bounds on the derivatives of the Hamiltonian:

(H1):

The function $D_{p}H\in C(\mathbb{R}^{d}\times\mathbb{R}^{d})$ , and for every $R>0$ , there is a constant $C_{R}>0$ such that for every $x\in\mathbb{R}^{d}$ and $p\in B_{R}$ we have $|D_{p}H(x,p)|\leq C_{R}$ .

(H2):

The function $D_{p}H\in C^{1}(\mathbb{R}^{d}\times\mathbb{R}^{d})$ . For every $R>0$ there exists a constant $C_{R}>0$ such that for every $x\in\mathbb{R}^{d}$ and $p\in B_{R}$ we have

\displaystyle|D_{pp}H(x,p)|+|D_{px}H(x,p)|\leq C_{R}.

Remark 2.1.

We impose most of the conditions on $L$ rather than on $H$ because $L$ appears in the optimal control problem on which our semi-Lagrangian approximation is based. Assumptions (L1): and (L2): (but not (L3): !) carry over immediately to the corresponding Hamiltonian $H$ by the definition of the Legendre transform. In contrast, we need to assume (H1): –(H2): on $H$ separately, since these do not follow from the conditions on $L$ in general. However, when the Lagrangian $L$ behaves like $|\cdot|^{r}$ in the $q$ variable for large $q$ and some $r>1$ , the corresponding Hamiltonian $H$ grows like $|\cdot|^{\frac{r}{r-1}}$ in the $p$ variable for large $p$ (cf. [32, Proposition 2.1]). The growth of the derivatives of $H$ for large $p$ can be computed similarly and leads to conditions similar to (H1): –(H2): .

In most of this paper, solutions of the HJB equation in (1) are interpreted in the viscosity sense; we refer to [46] and references therein for the general definition and wellposedness results. Solutions of the FPK equation in (1) are considered in the very weak sense, defined as follows:

Definition 2.2.

(a) If $u\in C^{0,1}_{b}((0,T)\times\mathbb{R}^{d})$ , a measure $m\in C([0,T],P(\mathbb{R}^{d}))$ is a very weak solution of the FPK equation in (1), if for every $\phi\in C_{c}^{\infty}(\mathbb{R}^{d})$ and $t\in[0,T]$

(3)

\displaystyle\begin{split}&\int_{\mathbb{R}^{d}}\phi(x)\,dm(t)(x)-\int_{\mathbb{R}^{d}}\phi(x)\,dm_{0}(x)\\ &=\int_{0}^{t}\int_{\mathbb{R}^{d}}\Big{(}\mathcal{L}[\phi](x)-D_{p}H(x,Du)\cdot D\phi(x)\Big{)}dm(s)(x)ds.\end{split}

(b) If $u\in L^{\infty}(0,T;W^{1,\infty}(\mathbb{R}^{d}))$ and $p\in[1,\infty]$ , a function $m\in C([0,T],P(\mathbb{R}^{d}))\cap L^{p}([0,T]\times\mathbb{R}^{d})$ , is a very weak solution of the FPK equation in (1), if (3) holds for every $\phi\in C_{c}^{\infty}(\mathbb{R}^{d})$ and $t\in[0,T]$ .

Remark 2.3.

Equation (3) holding for every $\phi\in C_{c}^{\infty}(\mathbb{R}^{d})$ and $t\in[0,T]$ is equivalent to

\displaystyle\begin{split}&\int_{\mathbb{R}^{d}}\phi(T,x)\,d(m(T))(x)-\int_{\mathbb{R}^{d}}\phi(0,x)\,dm_{0}(x)\\ &=\int_{0}^{T}\int_{\mathbb{R}^{d}}\Big{(}\phi_{t}(s,x)+\mathcal{L}[\phi](s,x)-D_{p}H(x,Du)\cdot D\phi(s,x)\Big{)}dm(s)(x)ds,\end{split}

holding for every $\phi\in C^{1,2}_{b}([0,T]\times\mathbb{R}^{d})$ (cf. [31, Lemma 6.1]).

Definition 2.4.

A pair $(u,m)$ is a viscosity-very weak solution of the MFG system (1), if $u$ is a viscosity solution of the HJB equation, and $m$ is a very weak solution of the FPK equation (see Definition 2.2).

Proposition 2.5.

Fix $\mu\in C([0,T],P(\mathbb{R}^{d}))$ . Let ( $\nu$ 0): , (L2): and (F1): hold.

(a) (Comparison principle) If $u$ is a viscosity subsolution and $v$ is a viscosity supersolution of the HJB equation in (1) with $u(T,\cdot)\leq v(T,\cdot)$ , then $u\leq v$ .

(b) There exists a unique bounded viscosity solution $u\in C_{b}([0,T]\times\mathbb{R}^{d})$ of the HJB equation in (1), and for any $t\in[0,T]$ we have $\|u(t)\|_{0}\leq C_{F}T+C_{G}$ .

(c) If (L2): and (F2): hold, then the viscosity solution $u$ is Lipschitz continuous in space variable and for every $t\in[0,T]$ and $x,y\in\mathbb{R}^{d}$ we have

\displaystyle|u(t,x)-u(t,x+y)|\leq\big{(}T(L_{L}+L_{F})+L_{G}\big{)}\,|y|.

In addition, if (L3): and (F3): hold, then $u$ is semiconcave in space variable and for every $t\in[0,T]$ and $x,y\in\mathbb{R}^{d}$ we have

\displaystyle u(t,x+y)+u(t,x-y)-2u(t,x)\leq\big{(}T(c_{L}+c_{F})+c_{G}\big{)}\,|y|^{2}.

Proof.

These results are by now standard: (a) follows by a similar argument as for [46, Theorem 3.1], (b) follows by e.g. Perron’s method, and (c) by adapting the comparison arguments of [46] in a standard way. We omit the details. Under some extra assumptions, (b) and (c) also follow from Theorem 5.4 and Lemma 5.3 below. ∎

Proposition 2.6.

Assume ( $\nu$ 0): , ( $\nu$ 1): , (H1): , and (M): .

(a) If $u\in C([0,T];C^{1}_{b}(\mathbb{R}^{d}))$ , then there exists a very weak solution $m\in C([0,T];P(\mathbb{R}^{d}))$ of the FPK equation in (1).

(b) If $d=1$ , $u\in C([0,T];W^{1,\infty}(\mathbb{R}))$ , $u$ semi-concave, and (M’): holds, then there exists a very weak solution $m\in C([0,T];P(\mathbb{R}))\cap L^{p}([0,T]\times\mathbb{R})$ of the FPK equation in (1). Moreover, $\|m(t)\|_{L^{p}(\mathbb{R})}\leq e^{CT}\|m_{0}\|_{L^{p}(\mathbb{R})}$ for some constant $C>0$ and $t\in[0,T]$ .

Proof.

The results follow from the convergence of the discrete scheme in this article. The proof of (a) follows the proof of Theorem 4.3, setting $Du_{\rho,h}=Du$ . The proof of (b) follows the proof of Theorem 4.1 and Theorem 6.7, setting $Du_{\rho,h}=Du$ . Note that semi-concavity of $u$ is crucial for the $L^{p}$ -bound of Theorem 6.7. ∎

Existence and uniqueness results are given in [37] for classical solutions of MFGs with nonlocal diffusions under additional assumptions:

( $\nu$ 2):

(Growth near singularity) There exist constants $\sigma\in(1,2)$ and $c>0$ such that the density of $\nu$ for $|z|<1$ satisfies

\displaystyle\frac{c}{|z|^{d+\sigma}}\leq\frac{d\nu}{dz},\text{ for }|z|<1.

(F4):

There exist constants $C_{F},C_{G}>0$ such that $\|F(\cdot,m)\|_{C_{b}^{2}}\leq C_{F}$ and $\|G(\cdot,\tilde{m})\|_{C_{b}^{3}}\leq C_{G}$ for all $m,\tilde{m}\in P(\mathbb{R}^{d})$ .

(F5):

$F$ and $G$ satisfy monotonicity conditions:

	$\displaystyle\int_{\mathbb{R}^{d}}\left(F\left(x,m_{1}\right)-F\left(x,m_{2}\right)\right)d\left(m_{1}-m_{2}\right)\left(x\right)$	$\displaystyle\geq 0\qquad\forall m_{1},m_{2}\in P(\mathbb{R}^{d}),$
	$\displaystyle\int_{\mathbb{R}^{d}}\left(G\left(x,m_{1}\right)-G\left(x,m_{2}\right)\right)d\left(m_{1}-m_{2}\right)\left(x\right)$	$\displaystyle\geq 0\qquad\forall m_{1},m_{2}\in P(\mathbb{R}^{d}).$

(H3):

The Hamiltonian $H\in C^{3}(\mathbb{R}^{d}\times\mathbb{R}^{d})$ , and for every $R>0$ there is $C_{R}>0$ such that $|D^{\alpha}H(x,p)|\leq C_{R}$ for all $x\in\mathbb{R}^{d}$ , $p\in B_{R}$ , and multi-indices $\alpha\in\mathbb{N}_{0}^{N}$ with $|\alpha|\leq 3$ .

(H4):

For every $R>0$ there is $C_{R}>0$ such that for $x,y\in\mathbb{R}^{d},u\in\left[-R,R\right],p\in\mathbb{R}^{d}$ : $|H\left(x,u,p\right)-H\left(y,u,p\right)|\leq C_{R}\left(|p|+1\right)|x-y|$ .

(H5):

(Uniform convexity) There exists a constant $C>0$ such that $\frac{1}{C}I_{d}\leq D_{pp}^{2}H\left(x,p\right)\leq CI_{d}$ .

(M”):

The probability measure $m_{0}$ has a density (also denoted by $m_{0}$ ) $m_{0}\in C_{b}^{2}$ .

Theorem 2.7.

Assume ( $\nu$ 0): , ( $\nu$ 1): , ( $\nu$ 2): , (F2): , (F4): , (H3): ,(H4): , and (M”): .

(a) There exists a classical solution $(u,m)$ of (1) such that $u\in C^{1,3}_{b}((0,T)\times\mathbb{R}^{d})$ and $m\in C^{1,2}_{b}((0,T)\times\mathbb{R}^{d})\cap C(0,T;P(\mathbb{R}^{d}))$ .

(b) If in addition (F5): and (H5): hold, then the classical solution is unique.

This is a consequence of [37, Theorem 2.5 and Theorem 2.6]. We refer to [37] for more general results, where in particular assumptions ( $\nu$ 1): and ( $\nu$ 2): can be relaxed to allow for a much larger class of nonlocal operators $\mathcal{L}$ . In the nondegenerate case, for the individual equations in (1) we also have uniqueness of viscosity-very weak solutions and existence of classical solutions. Uniqueness for HJB equations and existence for HJB and FPK equations follow from Theorem 5.3, Theorem 5.5, and Proposition 6.8 in [37]. We prove uniqueness for very weak solutions of FPK equations here.

Proposition 2.8 (Uniqueness for the FPK equation).

Assume ( $\nu$ 0): , ( $\nu$ 1): , ( $\nu$ 2): , and $D_{p}H(x,Du(t))\in C_{b}^{0,2}((0,T)\times\mathbb{R}^{d})$ . Then there is at most one very weak solution of the FPK equation in (1).

Proof.

Let $m_{1},m_{2}$ be two very weak solutions, define $\tilde{m}:=m_{1}-m_{2}$ and take any $\psi\in C_{c}^{\infty}\left(\mathbb{R}^{d}\right)$ . For any $\tau\in(0,T)$ , the terminal value problem

\displaystyle\partial_{t}\phi+\mathcal{L}\phi-D\phi\cdot D_{p}H(x,Du)=0\quad\text{in}\quad\mathbb{R}^{d}\times(0,\tau)\quad\text{and}\quad\phi(x,\tau)=\psi(x)\quad\text{in}\quad\mathbb{R}^{d},

has a unique classical solution $\phi\in C_{b}^{1,2}((0,\tau)\times\mathbb{R}^{d})$ essentially by [37, Theorem 5.5] (the result follows from Proposition 5.8 with $k=2$ and the observation that the proof of Theorem 5.5 also holds for $k=2$ ). Using the definition of very weak solution (see Remark 2.3) we get

\displaystyle\int_{\mathbb{R}^{d}}\psi(x)\,d\tilde{m}(\tau)(x)=\int_{0}^{\tau}\int_{\mathbb{R}^{d}}\big{(}\partial_{t}\phi+\mathcal{L}\phi-D\phi\cdot D_{p}H(x,Du)\big{)}\,d\tilde{m}(t)(x)\,dt=0,

for any $\tau\in[0,T]$ . Since $\psi$ was arbitrary, it follows that $\tilde{m}(\tau)=0$ in $P(\mathbb{R}^{d})$ for every $\tau\in[0,T]$ , and uniqueness follows. ∎

3. Discretisation of the MFG system

To discretise the MFG system (1), we first follow [15] and derive a semi-Lagrangian approximation of the HJB equation in (1). Using this approximation and the optimal control of the original problem, we derive an approximation of the FPK equation in (1) which is in (approximate) duality with the approximation of the HJB equation.

This derivation is based on the following control interpretation of the HJB equation. For a fixed given density $m=\mu$ , the solution $u$ of the HJB equation in (1) is the value function of the optimal stochastic control problem:

(4)

\displaystyle u(t,x)=\inf_{\alpha}J\big{(}x,t,\alpha\big{)},

where $\alpha_{t}$ is an admissible control, $J$ is the total cost to be minimized,

(5)

\displaystyle J\big{(}x,t,\alpha\big{)}=\mathbb{E}\bigg{[}\int_{t}^{T}\Big{(}L(\tilde{X}_{s},\alpha_{s})+F(\tilde{X}_{s},\mu_{s})\Big{)}ds+G(\tilde{X}_{T},\mu_{T})\bigg{]},

and $\tilde{X}_{s}=\tilde{X}_{s}^{x,t}$ solves the controlled stochastic differential equation (SDE)

(6)

\displaystyle\begin{cases}&d\tilde{X}_{s}=-\alpha_{s}\,ds+\int_{|z|<1}z\tilde{N}(dz,ds)+\int_{|z|\geq 1}zN(dz,ds),\quad s>t,\\ &\tilde{X}_{t}=x,\end{cases}

where $N$ is a Poisson random measure with intensity/Lévy measure $\nu(dz)ds$ , and $\tilde{N}=N(dz,ds)-\nu(dz)ds$ is the compensated Poisson measure.¹

¹ The $N$ -integral is just a (difficult way of writing a) compound Poisson jump process, while the $\tilde{N}$ -integral is a centered jump process with an infinite number of (small) jumps per time interval a.s. [9].

3.1. Approximation of the underlying controlled SDE

A. Approximate small jumps by Brownian motion.

First, to avoid singular integrals and an infinite number of jumps per time interval, we approximate the small jumps in (6) by (vanishing) Brownian motion (cf. [10]): For $r\in(0,1)$ , let $X_{s}=X_{s}^{x,t}$ solve

(7)

\displaystyle\begin{cases}dX_{s}=\bar{b}(\alpha_{s})ds+\sigma_{r}\,dW_{s}+\int_{|z|\geq r}zN(dz,ds),\quad s>t\\ X_{t}=x,\end{cases}

where $W_{s}$ is a standard Brownian motion, $\bar{b}(\alpha_{s})=-\,\alpha_{s}-b_{r}^{\sigma}$ , and

(8)

\displaystyle b_{r}^{\sigma}:=\int_{r<|z|<1}z\,\nu(dz),

(9)

\displaystyle\sigma_{r}:=\bigg{(}\frac{1}{2}\int_{|z|<r}zz^{T}\nu(dz)\bigg{)}^{1/2}.
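For a given Lévy density, the coefficients in (8) and (9) can be evaluated by quadrature. The sketch below does this in dimension $d=1$ ; the function names, the cutoff `eps` near the singularity, and the log-spaced midpoint rule are assumptions of this illustration and not part of the article. For the symmetric density $c|z|^{-1-s}$ with $s\in(0,2)$ one has $b_{r}^{\sigma}=0$ and $\sigma_{r}^{2}=\frac{c\,r^{2-s}}{2-s}$ , which can be used as a check.

```python
import math


def midpoint_log(f, a, b, n=20000):
    """Midpoint rule for int_a^b f(z) dz (0 < a < b) after substituting z = exp(s)."""
    lo, hi = math.log(a), math.log(b)
    ds = (hi - lo) / n
    total = 0.0
    for i in range(n):
        z = math.exp(lo + (i + 0.5) * ds)
        total += f(z) * z * ds  # dz = z ds
    return total


def small_jump_coefficients(density, r, eps=1e-12):
    """Return (b_r, sigma_r) from (8) and (9) for a Levy density on the real line.

    eps is an illustrative lower cutoff replacing the singular endpoint z = 0.
    """
    # (8): integral of z over r < |z| < 1, folded onto z > 0.
    b_r = midpoint_log(lambda z: z * (density(z) - density(-z)), r, 1.0)
    # (9): half the second moment of the jumps with |z| < r, then a square root.
    second_moment = midpoint_log(lambda z: z * z * (density(z) + density(-z)), eps, r)
    return b_r, math.sqrt(0.5 * second_moment)


def symmetric_stable_density(c, s):
    """Density c |z|^{-1-s}; the exponent is called s to avoid a clash with sigma_r."""
    return lambda z: c * abs(z) ** (-1.0 - s)


def one_sided_stable_density(c, s):
    """Spectrally one-sided variant: c z^{-1-s} for z > 0 and zero for z < 0."""
    return lambda z: c * z ** (-1.0 - s) if z > 0 else 0.0
```

For the one-sided density with $s\neq 1$ the drift correction no longer vanishes: $b_{r}^{\sigma}=c\,\frac{1-r^{1-s}}{1-s}$ .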

The last integral in (7) is a compound Poisson process (cf. e.g. [9]): For any $t\geq 0$ ,

(10)

\displaystyle\int_{0}^{t}\int_{|z|\geq r}zN(dz,ds)=\sum_{j=1}^{\hat{N}_{t}}J_{j},

where the number of jumps up to time $t$ is $\hat{N}_{t}\sim\textup{Poisson}(t\lambda_{r})$ , the jumps $\{J_{j}\}_{j}$ are iid rv’s in $\mathbb{R}^{d}$ with distribution $\nu_{r}$ and $J_{0}=0$ , and for $r\in(0,1]$ ,

(11)

\displaystyle\nu_{r}:=\nu\mathbbm{1}_{|z|>r}\qquad\text{and}\qquad\lambda_{r}:=\int_{\mathbb{R}^{d}}\nu_{r}(dz).
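The representation (10) and (11) can be sampled directly. The following Python sketch does so for the one-dimensional symmetric density $c|z|^{-1-s}$ , for which $\lambda_{r}=\frac{2c\,r^{-s}}{s}$ and the jump sizes have the Pareto-type tail $P(|J|>x)=(x/r)^{-s}$ for $x\geq r$ . The choice of density, the function names, and the use of inverse transform sampling are assumptions made for this illustration.

```python
import random


def jump_intensity(c, s, r):
    """lambda_r from (11) for the density c |z|^{-1-s} restricted to |z| > r."""
    return 2.0 * c * r ** (-s) / s


def sample_jump(s, r, rng):
    """Draw one jump from nu_r / lambda_r by inverse transform sampling."""
    u = 1.0 - rng.random()            # uniform on (0, 1]
    size = r * u ** (-1.0 / s)        # P(size > x) = (x / r)^{-s} for x >= r
    return size if rng.random() < 0.5 else -size  # symmetric sign


def sample_compound_poisson(t, c, s, r, rng):
    """Return (sum of jumps up to time t, number of jumps), cf. (10).

    Jump times are generated from exponential waiting times with rate
    lambda_r, so the number of jumps is Poisson(t * lambda_r) distributed.
    """
    lam = jump_intensity(c, s, r)
    total, n_jumps = 0.0, 0
    clock = rng.expovariate(lam)
    while clock <= t:
        total += sample_jump(s, r, rng)
        n_jumps += 1
        clock += rng.expovariate(lam)
    return total, n_jumps
```

Note that $\lambda_{r}\to\infty$ as $r\to 0$ , which is why the small jumps are treated separately in (7).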

The infinitesimal generators $\mathcal{L}^{\alpha}$ and $\hat{\mathcal{L}}^{\alpha}$ of the SDEs (6) and (7) are (cf. [9])

	$\displaystyle\mathcal{L}^{\alpha}\phi(x)=-\alpha_{t}\cdot\nabla\phi+\mathcal{L}_{1}\phi(x)+\mathcal{L}^{1}\phi(x),$
	$\displaystyle\hat{\mathcal{L}}^{\alpha}\phi(x)=\,\bar{b}(\alpha_{t})\cdot\nabla\phi(x)+tr\big{(}\sigma_{r}^{T}\cdot D^{2}\phi(x)\cdot\sigma_{r}\big{)}+\mathcal{L}^{r}[\phi](x)$

for $\phi\in C^{2}_{b}(\mathbb{R}^{d})$ , where

(12)

\displaystyle\begin{split}&\mathcal{L}\phi(x)=\mathcal{L}_{r}\phi(x)+\mathcal{L}^{r}\phi(x)\\ &:=\bigg{(}\int_{|z|<r}+\int_{|z|>r}\bigg{)}\Big{(}\phi(x+z)-\phi(x)-\mathbbm{1}_{\{|z|<1\}}D\phi(x)\cdot z\Big{)}d\nu(z).\end{split}

The operator $\hat{\mathcal{L}}^{\alpha}$ is an approximation of $\mathcal{L}^{\alpha}$ .

Lemma 3.1 ([47]).

If ( $\nu$ 1): holds and $\phi\in C_{b}^{3}(\mathbb{R}^{d})$ , then for $\mathcal{L}_{r}$ and $\sigma_{r}$ defined in (12) and (9) respectively, we have

\displaystyle|\mathcal{L}_{r}\phi(x)-tr\big{(}\sigma_{r}^{T}\cdot D^{2}\phi(x)\cdot\sigma_{r}\big{)}|\leq Cr^{3-\sigma}\|D^{3}\phi\|_{0}.

If in addition, $\phi\in C_{b}^{4}(\mathbb{R}^{d})$ and the Lévy measure $\nu$ is symmetric, then

\displaystyle|\mathcal{L}_{r}\phi(x)-tr\big{(}\sigma_{r}^{T}\cdot D^{2}\phi(x)\cdot\sigma_{r}\big{)}|\leq Cr^{4-\sigma}\|D^{4}\phi\|_{0}.
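The orders in Lemma 3.1 can be traced back to a Taylor expansion. The following is only a sketch of this standard argument and not the proof given in [47]; the remainder $R$ , the dimensional constant $C_{d}$ , the surface measure $|S^{d-1}|$ of the unit sphere, and the assumption that $\sigma_{r}$ is the symmetric square root in (9) are notation introduced here.

```latex
% Second-order Taylor expansion with remainder R, for |z| < r < 1:
\phi(x+z)-\phi(x)-D\phi(x)\cdot z
  = \tfrac{1}{2}\, z^{T} D^{2}\phi(x)\, z + R(x,z),
\qquad |R(x,z)| \le C_{d}\,\|D^{3}\phi\|_{0}\,|z|^{3}.

% Integrating the quadratic term against \nu over |z| < r and using (9):
\tfrac{1}{2}\int_{|z|<r} z^{T} D^{2}\phi(x)\, z \, d\nu(z)
  = tr\Big( D^{2}\phi(x)\, \tfrac{1}{2}\int_{|z|<r} z z^{T}\, d\nu(z) \Big)
  = tr\big( \sigma_{r}^{T}\cdot D^{2}\phi(x)\cdot \sigma_{r} \big).

% The remainder is controlled by (\nu 1) and polar coordinates:
\int_{|z|<r} |z|^{3}\, d\nu(z)
  \le C \int_{|z|<r} |z|^{3-d-\sigma}\, dz
  = C\,|S^{d-1}| \int_{0}^{r} \rho^{2-\sigma}\, d\rho
  = \frac{C\,|S^{d-1}|}{3-\sigma}\, r^{3-\sigma}.
```

If $\nu$ is symmetric, the cubic term of the expansion is odd in $z$ and integrates to zero over $|z|<r$ , so the remainder is of fourth order and the same computation gives the rate $r^{4-\sigma}$ .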

B. Time discretization of the approximate SDE

Fix a time step $h=\frac{T}{N}\in(0,1)$ for some $N\in\mathbb{N}$ and discrete times $t_{k}=kh$ for $k\in\{0,1,\dots,N\}$ . Following [15], we propose the following Euler-Maruyama discretization of the SDE (7): Let $X_{n}^{t_{l},x}\approx X^{t_{l},x}_{t_{n}}$ , where $X_{n}=X^{t_{l},x}_{n}$ solves

(13)

\displaystyle\begin{cases}X_{l}=x\\ X_{n}=X_{n-1}+h\bar{b}(\alpha_{n-1})+\sqrt{h}\displaystyle\sum_{m=1}^{d}\sigma_{r}^{m}\xi_{n-1}^{m},\ \ n=l+N_{i}+1,\dots,l+N_{i+1}-1,\\ X_{l+N_{i+1}}=X_{l+N_{i+1}-1}+J_{i}.\end{cases}

Here the control $\alpha_{n}$ is constant on each time interval, $\sigma_{r}^{m}$ is the $m$ th-column of $\sigma_{r}$ , and $\xi_{n}=(\xi_{n}^{1},\ldots,\xi_{n}^{d})$ is a random walk in $\mathbb{R}^{d}$ with

\displaystyle\mathbf{P}\big{(}\xi^{i}_{n}=\pm 1\big{)}=\frac{1}{2d}.

The processes $J_{k}$ and $N_{k}$ define an approximation of the compound Poisson part of (7) through equation (10) where $\hat{N}_{t}$ is replaced by an approximation

\tilde{N}_{t}=\max\{k:\Delta T_{1}+\Delta T_{2}+\dots+\Delta T_{k}\leq t\},

where exponentially distributed waiting times (time between jumps) are replaced by approximations $\{\Delta T_{k}\}_{k\in\mathbb{N}}$ (in the new model, $\tilde{N}_{t}$ still gives the number of jumps up to time $t$ ): $\Delta T_{k}=h\Delta N_{k}=h(N_{k}-N_{k-1})$ where $N_{k}:\Omega\to\mathbb{N}\cup\{0\}$ , $N_{0}=0$ , and $\Delta N_{k}$ are iid with approximate $h\lambda_{r}$ -exponential distribution given by

\displaystyle\mathbf{P}[\Delta N_{k}>j]=e^{-h\lambda_{r}j}\quad\mbox{for}\quad j=0,1,2,\dots.

Then for $p_{j}:=P[\Delta N_{k}=j]$ , $p_{0}=0$ and $p_{j}=P[\Delta N_{k}>j-1]-P[\Delta N_{k}>j]=e^{-jh\lambda_{r}}(e^{h\lambda_{r}}-1)$ for $j>0$ . We find that $\sum_{j=0}^{\infty}p_{j}=1$ and $E(\Delta N_{k})=\sum_{j=0}^{\infty}e^{-jh\lambda_{r}}=\frac{e^{h\lambda_{r}}}{e^{h\lambda_{r}}-1}$ . Note that in each time interval, approximation (13) either diffuses (the second equation) or jumps (the third equation), and that we have ignored the unlikely event of more than one jump per time interval. For the scheme to converge, we will see that we need to send both $h\to 0$ and $h\lambda_{r}\to 0$ . In this case $E(\Delta N_{k})\to\infty$ and the jumps become less and less frequent and the random walk dominates the evolution of $X_{k}$ (which is to be expected).
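The identities for $p_{j}$ and $E(\Delta N_{k})$ can be checked numerically. The following is a minimal sketch, not part of the scheme itself; the function names and the values of $h$ and $\lambda_{r}$ are illustrative assumptions, and the infinite series are truncated at a large index.

```python
import math

def jump_step_pmf(j, h, lam_r):
    """p_j = P[Delta N_k = j] = e^{-j h lam_r} (e^{h lam_r} - 1) for j >= 1, p_0 = 0."""
    if j < 1:
        return 0.0
    return math.exp(-j * h * lam_r) * math.expm1(h * lam_r)

def expected_steps(h, lam_r):
    """Closed form E(Delta N_k) = e^{h lam_r} / (e^{h lam_r} - 1)."""
    return math.exp(h * lam_r) / math.expm1(h * lam_r)

# Illustrative values (assumptions, not taken from the paper)
h, lam_r = 0.01, 5.0
J = 20000  # truncation index for the infinite series

total_mass = sum(jump_step_pmf(j, h, lam_r) for j in range(J))
mean_steps = sum(j * jump_step_pmf(j, h, lam_r) for j in range(J))

print(total_mass, mean_steps, expected_steps(h, lam_r))
```

As $h\lambda_{r}$ decreases, `expected_steps` grows like $1/(h\lambda_{r})$ , in line with the remark that jumps become less frequent.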

3.2. Semi-Lagrangian approximation of the HJB equation

A. Control approximation of the HJB equation

We approximate the control problem (4) – (6) by a discrete time control problem: Define the value function

(14)

\displaystyle\tilde{u}_{h}(t_{l},x)=\inf_{\{\alpha_{n}\}}J_{h}\big{(}x,t_{l},\{\alpha_{n}\}\big{)},

where the controls $\{\alpha_{n}\}$ are piecewise constant in time, the cost function $J_{h}$ is given by

(15)

\displaystyle J_{h}\big{(}x,t_{l},\{\alpha_{n}\}\big{)}=\mathbb{E}\bigg{[}\sum_{n=l}^{N-1}\Big{(}L(X_{n},\alpha_{n})+F(X_{n},\mu(t_{n}))\Big{)}h+G(X_{N},\mu(t_{N}))\bigg{]},

and the controlled discrete time process $X_{n}=X_{n}^{t_{l},x}$ is the solution of (13). By the (discrete time) Dynamic Programming principle it follows that

\displaystyle\tilde{u}_{h}(t_{l},x)=\inf_{\alpha_{n}}\mathbb{E}\bigg{[}\sum_{n=l}^{l+p}\Big{(}L(X_{n}^{t_{l},x},\alpha_{n})+F(X_{n}^{t_{l},x},\mu(t_{n}))\Big{)}h+\tilde{u}_{h}(t_{l+p+1},X_{l+p+1}^{t_{l},x})\bigg{]},

for $l+p+1\leq N$ . Taking $p=0$ and computing the expectation using conditional probabilities (the probability to jump in a time interval is $p_{1}=1-e^{-h\lambda_{r}}$ ), we find a (discrete time) HJB equation

(16)

\displaystyle\begin{split}\tilde{u}_{h}(t_{l},x)=\inf_{\alpha}\bigg{\{}&hF(x,\mu(t_{l}))+hL(x,\alpha)\\ &+\frac{e^{-h\lambda_{r}}}{2d}\sum_{m=1}^{d}\Big{(}\tilde{u}_{h}(t_{l}+h,x+h\bar{b}(\alpha)+\sqrt{hd}\sigma_{r}^{m})+\tilde{u}_{h}(t_{l}+h,x+h\bar{b}(\alpha)-\sqrt{hd}\sigma_{r}^{m})\Big{)}\\ &+\frac{1-e^{-h\lambda_{r}}}{\lambda_{r}}\int_{\|z\|\geq r}\tilde{u}_{h}(t_{l}+h,x+z)\nu(dz)\bigg{\}}.\end{split}

B. Interpolation and the fully discrete scheme

For $\rho>0$ we fix a grid $\mathcal{G}_{\rho}=\{i\rho:i\in\mathbb{Z}^{d}\}$ and a linear/multilinear $\mathcal{G}_{\rho}$ -interpolation $I$ . For functions $f:\mathcal{G}_{\rho}\rightarrow\mathbb{R}$ ,

(17)

\displaystyle I[f](x):=\sum_{i\in\mathbb{Z}^{d}}f(x_{i})\beta_{i}(x),\qquad x\in\mathbb{R}^{d},

where the $\beta_{j}$ ’s are piecewise linear/multilinear basis functions satisfying

\displaystyle\beta_{j}\geq 0,\quad\beta_{j}(x_{i})=\delta_{j,i},\quad\sum_{j}\beta_{j}(x)=1,\quad\text{and}\quad\|I[\phi]-\phi\|_{0}\leq K\|D^{2}\phi\|_{0}\rho^{2}

for any $\phi\in C^{2}_{b}(\mathbb{R}^{d})$ and a constant $K>0$ independent of $\rho$ and $\phi$ . A fully discrete scheme is then obtained from (16) as follows:

(18)

\displaystyle\tilde{u}_{i,k}[\mu]=S_{\rho,h,r}[\mu](\tilde{u}_{\cdot,k+1},i,k),\ k<N,\quad\text{and}\quad\tilde{u}_{i,N}[\mu]=G(x_{i},\mu(t_{N})),

where

(19)

\displaystyle\begin{split}S_{\rho,h,r}[\mu](v,i,k)=\inf_{\alpha}\Bigg{\{}&hF(x_{i},\mu(t_{k}))+hL(x_{i},\alpha)+\frac{1-e^{-h\lambda_{r}}}{\lambda_{r}}\int_{\|z\|\geq r}I[v](x_{i}+z)\nu(dz)\\ &+\frac{e^{-h\lambda_{r}}}{2d}\sum_{m=1}^{d}\Big{(}I[v](x_{i}+h\bar{b}(\alpha)+\sqrt{hd}\sigma_{r}^{m})+I[v](x_{i}+h\bar{b}(\alpha)-\sqrt{hd}\sigma_{r}^{m})\Big{)}\Bigg{\}}.\end{split}

Finally, we extend the solution of the discrete scheme $\tilde{u}_{i,k}[\mu]$ to all of $[0,T]\times\mathbb{R}^{d}$ by linear interpolation in $x$ and piecewise constant interpolation in $t$ :

(20)

\displaystyle\tilde{u}_{\rho,h}[\mu](t,x)=I\big{(}\tilde{u}_{\cdot,[\frac{t}{h}]}[\mu]\big{)}(x)=\sum_{i\in\mathbb{Z}^{d}}\beta_{i}(x)\,\tilde{u}_{i,[\frac{t}{h}]}[\mu]\quad\mbox{for any}\quad(t,x)\in[0,T)\times\mathbb{R}^{d}.
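To illustrate how one backward step of the fully discrete scheme (18)–(19) can be organised, we give a minimal sketch in dimension $d=1$ . Everything specific in it is an assumption made for illustration: the quadratic running cost $L(x,\alpha)=\alpha^{2}/2$ , the drift $\bar{b}(\alpha)=-\alpha-B_{r}$ with a given scalar `B_r`, a finite set of candidate controls replacing the infimum, a quadrature rule (`z_nodes`, `z_weights`) replacing the integral against $\nu$ on $\{|z|\geq r\}$ , and a truncated grid on which the interpolant is extended by its boundary values.

```python
import numpy as np

def sl_hjb_step(v_next, x, h, sigma_r, B_r, z_nodes, z_weights, F_vals, controls):
    """One backward step of (19) in d = 1 with L(x, a) = a**2 / 2 (assumed)."""
    lam_r = z_weights.sum()            # lambda_r under the quadrature rule
    p_jump = 1.0 - np.exp(-h * lam_r)  # probability of a jump in one time step

    def interp(y):
        # Linear interpolation I[v]; np.interp extends by the boundary values
        return np.interp(y.ravel(), x, v_next).reshape(y.shape)

    # Jump term: (1 - e^{-h lam_r}) / lam_r * int I[v](x_i + z) nu(dz)
    jump = (p_jump / lam_r) * (interp(x[:, None] + z_nodes[None, :]) @ z_weights)

    best = np.full(x.shape, np.inf)
    for a in controls:                 # brute-force minimisation over the control set
        y = x + h * (-a - B_r)         # x_i + h * bar b(a)
        diffusion = 0.5 * (interp(y + np.sqrt(h) * sigma_r)
                           + interp(y - np.sqrt(h) * sigma_r))
        best = np.minimum(best, 0.5 * h * a**2 + (1.0 - p_jump) * diffusion)
    return h * F_vals + jump + best

# Illustrative data (all values are assumptions)
x = np.linspace(-2.0, 2.0, 81)
params = dict(x=x, h=0.01, sigma_r=0.3, B_r=0.0,
              z_nodes=np.array([-1.0, -0.5, 0.5, 1.0]),
              z_weights=np.array([0.5, 1.0, 1.0, 0.5]),
              F_vals=np.zeros_like(x),
              controls=np.linspace(-2.0, 2.0, 41))
v = np.cos(x)                          # plays the role of u_{., k+1}
u = sl_hjb_step(v, **params)           # plays the role of u_{., k}
```

Since the weights of the diffusion and jump parts are nonnegative and sum to one, this sketch inherits monotonicity and commutation with constants, which can be checked directly by comparing the outputs for `v`, `v + c`, and any `w >= v`.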

3.3. Approximate optimal feedback control

For the HJB equation in (1), satisfied by the value function (4), it easily follows that the optimal feedback control is

\alpha(t,x)=D_{p}H(x,Du[\mu](t,x)).

Based on this feedback law, we define an approximate feedback control for the discrete time optimal control problem (13)–(15) in the following way: For $h,\rho,\epsilon>0$ and $(t,x)\in[0,T]\times\mathbb{R}^{d}$ ,

(21)

\displaystyle\alpha_{\text{num}}(t,x):=D_{p}H(x,D\tilde{u}^{\epsilon}_{\rho,h}[\mu](t,x)),

where $\tilde{u}_{\rho,h}[\mu]$ is given by (20),

(22)

\displaystyle\tilde{u}_{\rho,h}^{\epsilon}[\mu](t,x)=\tilde{u}_{\rho,h}[\mu](t,\cdot)*\rho_{\epsilon}(x),

and the mollifier $\rho_{\epsilon}(x)=\frac{1}{\epsilon^{d}}\rho\big{(}\frac{x}{\epsilon}\big{)}$ for $0\leq\rho\in C_{c}^{\infty}(\mathbb{R}^{d})$ with $\int_{\mathbb{R}^{d}}\rho(x)dx=1$ . We state a standard result on mollification.

Lemma 3.2.

Let $u\in W^{1,\infty}(\mathbb{R}^{d})$ , $\epsilon>0$ , and $u^{\epsilon}=u*\rho_{\epsilon}$ . Then $u^{\epsilon}\in C_{b}^{\infty}(\mathbb{R}^{d})$ , and there exists a constant $c_{\rho}>0$ such that for all $\epsilon>0$ ,

\displaystyle\|u^{\epsilon}-u\|_{0}\leq\|Du\|_{0}\,\epsilon\qquad\mbox{and}\qquad\|D^{p}u^{\epsilon}\|_{0}\leq c_{\rho}\|Du\|_{0}\,\epsilon^{1-p}\ \ \text{for any}\ \ p\in\mathbb{N}.

By construction, we expect $\alpha_{\text{num}}$ to be an approximation of the optimal feedback control for the approximate control problem with value function (14) when $h,\rho,\epsilon$ are small and $\tilde{u}^{\epsilon}_{\rho,h}$ is close to $u$ .
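The construction (21)–(22) can be sketched numerically in $d=1$ . The snippet below is only an illustration under stated assumptions: the Hamiltonian is taken to be $H(x,p)=p^{2}/2$ so that $D_{p}H(x,p)=p$ , the convolution is replaced by a normalised discrete sum over nodes inside the support of a standard bump function, and the gradient of the mollified function is approximated by a central difference with a hypothetical step `delta`. All function names are ours.

```python
import numpy as np

def interp_on_grid(grid, values):
    """Piecewise linear interpolant of grid values, extended by boundary values."""
    def u(y):
        y = np.asarray(y, dtype=float)
        return np.interp(y.ravel(), grid, values).reshape(y.shape)
    return u

def mollified(u, eps, n_nodes=201):
    """Discrete stand-in for u * rho_eps with a bump kernel supported in (-eps, eps)."""
    s = np.linspace(-1.0, 1.0, n_nodes)[1:-1]   # interior nodes, |s| < 1
    w = np.exp(-1.0 / (1.0 - s**2))
    w /= w.sum()                                # nonnegative weights summing to one
    y = eps * s
    def u_eps(x):
        x = np.asarray(x, dtype=float)
        return (u(x[..., None] - y) * w).sum(axis=-1)
    return u_eps

def alpha_num(u_eps, x, delta=1e-4):
    """Feedback D_pH(x, D u_eps) for the assumed H(x, p) = p**2 / 2."""
    return (u_eps(x + delta) - u_eps(x - delta)) / (2.0 * delta)

# Illustrative data: a Lipschitz function with a kink at the origin
grid = np.linspace(-1.0, 1.0, 41)
u = interp_on_grid(grid, np.abs(grid))
eps = 0.1
u_eps = mollified(u, eps)
x = np.linspace(-0.8, 0.8, 33)
a = alpha_num(u_eps, x)
```

In this example $\|Du\|_{0}=1$ , so the first bound of Lemma 3.2 predicts a sup-norm distance of at most $\epsilon$ between `u_eps` and `u`, while the feedback `a` stays bounded by one and is smooth across the kink.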

3.4. Dual SL discretization of the FPK equation

A. Dual approximation of the FPK equation

First note that if $\tilde{X}_{s}=\tilde{X}^{0,Z_{0}}_{s}$ solves (6) with $t=0$ and $X_{0}=Z_{0}$ , a random variable with distribution $m_{0}$ , then the FPK equation for $\tilde{m}:=Law(\tilde{X}_{s})$ is

\displaystyle\begin{cases}\tilde{m}_{t}-\mathcal{L}^{*}\tilde{m}-\text{div}(\tilde{m}\alpha)=0,\\ \tilde{m}(0)=m_{0}.\end{cases}

Setting $\alpha=\alpha_{\text{num}}$ , this equation becomes an approximation of the FPK equation in (1). With this choice of $\alpha$ , we further approximate $\tilde{m}$ by the density $\tilde{m}_{k}:=Law(X_{k})$ , of the approximate process $X_{k}=X_{k}^{0,Z_{0}}$ solving (13) with $l=0$ and $X_{0}=Z_{0}$ .

We now derive a FPK equation for $\tilde{m}_{k}$ which in discretised form will serve as our approximation of the FPK equation in (1). To simplify we consider dimension $d=1$ . By definition of $\tilde{m}_{k}$ ,

\displaystyle\mathbb{E}[\phi(X_{k+1})]=\int_{\mathbb{R}}\phi(x)\,d\tilde{m}_{k+1}(x),

for $\phi\in C_{b}(\mathbb{R}^{d})$ and $k\in\mathbb{N}\cup\{0\}$ . Let $A_{k}$ be the event of at least one jump in $[t_{k},t_{k+1})$ , i.e. $A_{k}=\{\omega:N_{k+1}(\omega)-N_{k}(\omega)\geq 1\}$ where $N_{k}$ is the random jump time defined in Section 3.1 B. Then by the definition of $X_{k}$ in (13), the fact that $N_{k}$ , $J_{k}$ , and $\xi_{k}$ are i.i.d. and hence independent of $X_{k}$ , and conditional expectations, we find that

\displaystyle\begin{aligned}\int_{\mathbb{R}}\phi(x)\,d\tilde{m}_{k+1}(x)&=\mathbb{E}[\phi(X_{k+1})]\\ &=\mathbb{E}[\phi(X_{k+1})\mid A_{k}^{c}]\,P(A_{k}^{c})+\mathbb{E}[\phi(X_{k+1})\mid A_{k}]\,P(A_{k})\\ &=e^{-h\lambda_{r}}\mathbb{E}\big{(}\phi(X_{k}+h\bar{b}(\alpha_{\text{num}})+\sqrt{h}\sigma_{r}\xi_{k})\big{)}+(1-e^{-h\lambda_{r}})\mathbb{E}\big{(}\phi(X_{k}+J_{i})\big{)}\\ &=\frac{e^{-h\lambda_{r}}}{2}\int_{\mathbb{R}}\big{(}\phi(x+h\bar{b}(\alpha_{\text{num}})+\sqrt{h}\sigma_{r})+\phi(x+h\bar{b}(\alpha_{\text{num}})-\sqrt{h}\sigma_{r})\big{)}\tilde{m}_{k}(dx)\\ &\qquad+(1-e^{-h\lambda_{r}})\int_{\mathbb{R}}\int_{\|z\|>r}\phi(x+z)\frac{\nu(dz)}{\lambda_{r}}\tilde{m}_{k}(dx).\end{aligned}

Let $E_{i}:=\big{(}x_{i}-\frac{\rho}{2},x_{i}+\frac{\rho}{2}\big{)}$ and $\tilde{m}_{j,k}=\int_{E_{j}}\tilde{m}_{k}(dx)$ . We approximate the above expression by a midpoint (quadrature) rule, i.e. $\int_{E_{j}}f(x)\tilde{m}_{k}(dx)\approx f(x_{j})\tilde{m}_{j,k}$ . Choosing $\phi(x)=\beta_{j}(x)$ (linear interpolant) for $j\in\mathbb{Z}$ and using $\beta_{j}(x_{i})=\delta_{j,i}$ , we then get the fully discrete approximation

\displaystyle\begin{split}\tilde{m}_{j,k+1}\approx\sum_{i\in\mathbb{Z}}\tilde{m}_{i,k}\Big{[}&\frac{e^{-h\lambda_{r}}}{2}\Big{(}\beta_{j}(x_{i}+h\bar{b}(\alpha_{\text{num}})+\sqrt{h}\sigma_{r})+\beta_{j}(x_{i}+h\bar{b}(\alpha_{\text{num}})-\sqrt{h}\sigma_{r})\Big{)}\\ &+\frac{1-e^{-h\lambda_{r}}}{\lambda_{r}}\int_{\|z\|>r}\beta_{j}(x_{i}+z)\nu(dz)\Big{]}.\end{split}

In arbitrary dimension $d$ , we denote

(23)

\displaystyle\Phi^{\epsilon,\pm}_{j,k,p}:=x_{j}-h\,\big{(}H_{p}(x_{j},D\tilde{u}_{\rho,h}^{\epsilon}[\mu](t_{k},x_{j}))+B_{r}^{\sigma}\big{)}\pm\sqrt{hd}\sigma_{r}^{p}.

for $j\in\mathbb{Z}^{d}$ , $k=0,\ldots,N$ , $p=1,\ldots,d$ . Redefining $E_{i}:=x_{i}+\frac{\rho}{2}(-1,1)^{d}$ and reasoning as for $d=1$ above, we get the following discrete FPK equation

(24)

\displaystyle\begin{cases}\tilde{m}_{i,k+1}[\mu]&:=\displaystyle\sum_{j\in\mathbb{Z}^{d}}\tilde{m}_{j,k}[\mu]\,\mathbf{B}_{\rho,h,r}[H_{p}(\cdot,D\tilde{u}_{\rho,h}^{\epsilon}[\mu])](i,j,k),\\ \tilde{m}_{i,0}&=\displaystyle\int_{E_{i}}dm_{0}(x),\end{cases}

where

(25)

\displaystyle\begin{split}\mathbf{B}_{\rho,h,r}[H_{p}(\cdot,D\tilde{u}_{\rho,h}^{\epsilon}[\mu])](i,j,k)&:=\bigg{[}\frac{e^{-\lambda_{r}h}}{2d}\sum_{p=1}^{d}\Big{(}\beta_{i}\big{(}\Phi^{\epsilon,+}_{j,k,p}\big{)}+\beta_{i}\big{(}\Phi^{\epsilon,-}_{j,k,p}\big{)}\Big{)}\\ &\hskip 56.9055pt+\frac{1-e^{-\lambda_{r}h}}{\lambda_{r}}\int_{|z|>r}\beta_{i}(x_{j}+z)\nu(dz)\bigg{]}.\end{split}

The solution is a probability distribution on $\mathcal{G}_{\rho}\times h\mathcal{N}_{h}$ , where $\mathcal{N}_{h}:=\{0,\dots,N\}$ :

Lemma 3.3.

Let $(\tilde{m}_{i,k})$ be the solution of (24). If $m_{0}\in P(\mathbb{R}^{d})$ , then $(\tilde{m}_{i,k})_{i}\in P(\mathbb{Z}^{d})$ , i.e. $\tilde{m}_{i,k}\geq 0$ , $i\in\mathbb{Z}^{d}$ , and $\sum_{j\in\mathbb{Z}^{d}}\tilde{m}_{j,k}=1$ for all $k\in\mathcal{N}_{h}$ .

Proof.

First note that $\tilde{m}_{i,k}\geq 0$ follows directly from the definition of the scheme and $\tilde{m}_{i,0}\geq 0$ . Changing the order of summation and using that $\sum_{i}\mathbf{B}_{\rho,h,r}[H_{p}(\cdot,D\tilde{u}_{\rho,h}^{\epsilon}[\mu])](i,j,k)=1$ , we find that

\displaystyle\sum_{i}\tilde{m}_{i,k+1}=\sum_{i}\sum_{j}\tilde{m}_{j,k}\mathbf{B}_{\rho,h,r}[H_{p}(\cdot,D\tilde{u}_{\rho,h}^{\epsilon}[\mu])](i,j,k)=\sum_{j}\tilde{m}_{j,k}.

The result follows by iteration since $\sum_{j}\tilde{m}_{j,0}=1$ . ∎
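The mass conservation in Lemma 3.3 is easy to observe numerically. Below is a minimal sketch of one forward step of (24)–(25) in $d=1$ , under assumptions made only for illustration: the grid is truncated and arrival points are clipped to it (so that no mass leaves the computational domain), the integral against $\nu$ is replaced by a quadrature rule (`z_nodes`, `z_weights`), and the feedback `alpha` standing in for $H_{p}(x_{j},D\tilde{u}^{\epsilon}_{\rho,h})$ is a hypothetical given vector.

```python
import numpy as np

def hat_weights(y, x):
    """Row q holds beta_i(y_q) for all nodes i of the uniform grid x."""
    rho = x[1] - x[0]
    y = np.clip(y, x[0], x[-1])      # keep arrival points on the truncated grid
    return np.maximum(0.0, 1.0 - np.abs(y[:, None] - x[None, :]) / rho)

def fpk_step(m, x, alpha, h, sigma_r, B_r, z_nodes, z_weights):
    """One forward step of (24) in d = 1; B[j, i] plays the role of B(i, j, k)."""
    lam_r = z_weights.sum()
    stay = np.exp(-h * lam_r)        # probability of no jump in one time step
    y = x - h * (alpha + B_r)        # drifted departure points, cf. (23)
    B = 0.5 * stay * (hat_weights(y + np.sqrt(h) * sigma_r, x)
                      + hat_weights(y - np.sqrt(h) * sigma_r, x))
    for z, w in zip(z_nodes, z_weights):
        B += (1.0 - stay) / lam_r * w * hat_weights(x + z, x)
    return m @ B                     # m_{i,k+1} = sum_j m_{j,k} B[j, i]

# Illustrative data (all values are assumptions)
x = np.linspace(-2.0, 2.0, 81)
m0 = np.exp(-4.0 * x**2)
m0 /= m0.sum()                       # discrete probability vector
alpha = x.copy()                     # hypothetical feedback pushing mass to the origin
m1 = fpk_step(m0, x, alpha, h=0.01, sigma_r=0.3, B_r=0.0,
              z_nodes=np.array([-1.0, -0.5, 0.5, 1.0]),
              z_weights=np.array([0.5, 1.0, 1.0, 0.5]))
print(m1.min(), m1.sum())
```

Each row of `B` is a convex combination of interpolation weights, which is the discrete counterpart of the identity $\sum_{i}\mathbf{B}_{\rho,h,r}(i,j,k)=1$ used in the proof.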

We extend $(\tilde{m}_{i,k}[\mu])$ to $\mathbb{R}^{d}$ by piecewise constant interpolation in $x$ and then to $[0,T]$ by linear interpolation in $t$ : For $t\in[t_{k},t_{k+1}]$ and $k\in\mathcal{N}_{h}$ ,

(26)

\displaystyle\tilde{m}_{\rho,h}^{\epsilon}[\mu](t,x):=\frac{t-t_{k}}{h}\tilde{m}_{\rho,h}^{\epsilon}[\mu](t_{k+1},x)+\frac{t_{k+1}-t}{h}\tilde{m}_{\rho,h}^{\epsilon}[\mu](t_{k},x),

where $\tilde{m}_{\rho,h}^{\epsilon}[\mu](t_{k},x):=\frac{1}{\rho^{d}}\sum_{i\in\mathbb{Z}^{d}}\tilde{m}_{i,k}[\mu]\,\mathbbm{1}_{E_{i}}(x)$ . Note that $\tilde{m}_{\rho,h}^{\epsilon}[\mu]\in C([0,T],P(\mathbb{R}^{d}))$ , and that this interpolation (piecewise constant in $x$ , linear in $t$ ) is dual to the linear in $x$ /piecewise constant in $t$ interpolation used for $\tilde{u}_{\rho,h}$ in (20).
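The sense in which the scheme (24) is dual to the HJB scheme can be made explicit by a short computation, sketched here for a bounded grid function $v=(v_{i})_{i}$ . It uses only (24), (25), and $I[v](x)=\sum_{i}v_{i}\beta_{i}(x)$ :

```latex
\begin{aligned}
\sum_{i\in\mathbb{Z}^{d}}\tilde{m}_{i,k+1}\,v_{i}
&=\sum_{j\in\mathbb{Z}^{d}}\tilde{m}_{j,k}\sum_{i\in\mathbb{Z}^{d}}
  \mathbf{B}_{\rho,h,r}[H_{p}(\cdot,D\tilde{u}_{\rho,h}^{\epsilon}[\mu])](i,j,k)\,v_{i}\\
&=\sum_{j\in\mathbb{Z}^{d}}\tilde{m}_{j,k}\bigg[\frac{e^{-\lambda_{r}h}}{2d}\sum_{p=1}^{d}
  \Big(I[v]\big(\Phi^{\epsilon,+}_{j,k,p}\big)+I[v]\big(\Phi^{\epsilon,-}_{j,k,p}\big)\Big)
  +\frac{1-e^{-\lambda_{r}h}}{\lambda_{r}}\int_{|z|>r}I[v](x_{j}+z)\,\nu(dz)\bigg].
\end{aligned}
```

Assuming that $B_{r}^{\sigma}$ in (23) is the drift correction contained in $\bar{b}$ , so that $\Phi^{\epsilon,\pm}_{j,k,p}=x_{j}+h\bar{b}(\alpha_{\text{num}})\pm\sqrt{hd}\sigma_{r}^{p}$ , the bracket is exactly the linear part of $S_{\rho,h,r}$ in (19) evaluated at the node $x_{j}$ with the control frozen at $\alpha=\alpha_{\text{num}}(t_{k},x_{j})$ . In other words, the matrix that transports $\tilde{m}$ forward is the transpose of the matrix that propagates $v$ backward.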

3.5. Discretisation of the coupled MFG system

The discretisation of the MFG system is obtained by coupling the two discretisations above by setting $\mu=\tilde{m}^{\epsilon}_{\rho,h}[\mu]$ . With this choice and $u=\tilde{u}[\mu]$ and $m=\tilde{m}[\mu]$ we get the following discretisation of (1):

(27)

\displaystyle\begin{cases}u_{i,k}=S_{\rho,h,r}[m^{\epsilon}_{\rho,h}](u_{\cdot,k+1},i,k),\\[5.69046pt] u_{i,N}=G(x_{i},m^{\epsilon}_{\rho,h}(t_{N})),\\[5.69046pt] m_{i,k+1}=\sum_{j\in\mathbb{Z}^{d}}m_{j,k}\,\mathbf{B}_{\rho,h,r}[H_{p}(\cdot,Du_{\rho,h}^{\epsilon})](i,j,k),\\[5.69046pt] m_{i,0}=\int_{E_{i}}dm_{0}(x),\end{cases}

where $S_{\rho,h,r},\mathbf{B}_{\rho,h,r},u_{\rho,h}^{\epsilon},m^{\epsilon}_{\rho,h}$ are defined above.

The individual discretisations are explicit, but due to the forward-backward nature of the coupling, the total discretisation is not explicit. It yields a nonlinear system that must be solved by, for example, a fixed point iteration or a Newton type method.
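As an illustration of the fixed point approach, the following sketch alternates a backward HJB sweep and a forward FPK sweep with damping. It is a generic skeleton under our own assumptions, not the solver used for the numerical tests: `solve_hjb` and `solve_fpk` are hypothetical callables standing in for the sweeps of (18)–(19) and (24)–(25), and convergence of the iteration is not guaranteed in general.

```python
import numpy as np

def damped_fixed_point(solve_hjb, solve_fpk, m_init, damping=0.5, tol=1e-8, max_iter=200):
    """Iterate m -> solve_fpk(solve_hjb(m)) with damping until the update is below tol.

    solve_hjb(m) returns the discrete value function for a frozen distribution m,
    solve_fpk(u) returns the discrete distribution driven by the feedback from u.
    """
    m = np.asarray(m_init, dtype=float)
    for it in range(1, max_iter + 1):
        u = solve_hjb(m)
        m_new = solve_fpk(u)
        err = np.max(np.abs(m_new - m))
        if err < tol:
            return u, m_new, it
        # Convex combination keeps nonnegativity and total mass
        m = (1.0 - damping) * m + damping * m_new
    raise RuntimeError("fixed point iteration did not converge")

# Toy example (purely illustrative): an affine contraction with known fixed point
target = np.array([0.2, 0.5, 0.3])
u_toy, m_toy, n_iter = damped_fixed_point(
    solve_hjb=lambda m: m,
    solve_fpk=lambda u: 0.5 * u + 0.5 * target,
    m_init=np.array([1.0, 0.0, 0.0]),
)
print(m_toy, n_iter)
```

The damping parameter trades speed for robustness; since each iterate is a convex combination of probability vectors, the iterates remain probability vectors whenever `solve_fpk` returns one.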

The approximation scheme (27) has at least one solution:

Proposition 3.4.

(Existence for the discrete MFG system) Assume ( $\nu$ 0): , ( $\nu$ 1): , (L1): –(L2): , (F1): –(F2): , (H1): , and (M): . Then there exists a pair $(u_{\rho,h},\ m_{\rho,h}^{\epsilon})$ solving (27).

The proof of this result is non-constructive and given in Appendix A.

4. Convergence to the MFG system

In this section we give the main theoretical results of this paper, various convergence results as $h,\rho,\epsilon,r\to 0$ under CFL-conditions. The proofs will be given in Section 7 and require results for the individual schemes given in Sections 5 and 6.

4.1. Convergence to viscosity-very weak solutions

We consider degenerate and non-degenerate cases separately. For the degenerate case, the convergence holds only in dimension $d=1$ .

Theorem 4.1 (Degenerate case, $d=1$ ).

Assume ( $\nu$ 0): , ( $\nu$ 1): , (L1): –(L3): , (F1): –(F3): , (H1): –(H2): , (M’): , and let $\{(u_{\rho,h},m^{\epsilon}_{\rho,h})\}_{\rho,h,\epsilon>0}$ be solutions of the discrete MFG system (27). If $\rho_{n},h_{n},\epsilon_{n},r_{n}\to 0$ under the CFL conditions $\frac{\rho_{n}^{2}}{h_{n}},\frac{h_{n}}{r_{n}^{\sigma}},\frac{\sqrt{h_{n}}}{\epsilon_{n}}=o(1)$ , then:

(i)

$\{u_{\rho_{n},h_{n}}\}_{n}$ is precompact in $C_{b}([0,T]\times K)$ for every compact set $K\subset\mathbb{R}$ .
(ii)

$\{m^{\epsilon_{n}}_{\rho_{n},h_{n}}\}_{n}$ is sequentially precompact in $C([0,T],P(\mathbb{R}))$ , and (a) in $L^{1}$ weak if $p\in(1,\infty)$ in (M’): , or (b) in $L^{\infty}$ weak $*$ if $p=\infty$ in (M’): .
(iii)

If $(u,m)$ is a limit point of $\{(u_{\rho_{n},h_{n}},m^{\epsilon_{n}}_{\rho_{n},h_{n}})\}_{n}$ , then $(u,m)$ is a viscosity-very weak solution of the MFG system (1).

Note that $\{m^{\epsilon}_{\rho,h}\}$ is precompact in $C([0,T],P(\mathbb{R}^{d}))$ , just by assuming (M): for the initial distribution. But in the degenerate case this is not enough for convergence of the MFG system, due to lower regularity of the solutions of the HJB equation (no longer $C^{1}$ ). Therefore we need assumption (M’): and the stronger compactness given by Theorem 4.1(ii) part (a) or (b). This latter result we are only able to show in $d=1$ .

In arbitrary dimensions we assume more regularity on solutions of the HJB equation in (1):

(U): Let $u[m]$ be a viscosity solution of the HJB equation in (1). For any $m\in C([0,T],P(\mathbb{R}^{d}))$ and $t\in(0,T)$ , $u[m](t)\in C^{1}(\mathbb{R}^{d})$ .

Remark 4.2.

Assumption (U): holds in non-degenerate cases, e.g. under assumption ( $\nu$ 2): , see Theorem 2.7 and the discussion below.

We have the following convergence result in arbitrary dimensions.

Theorem 4.3 (Non-degenerate case).

Assume ( $\nu$ 0): , ( $\nu$ 1): , (L1): –(L3): , (F1): –(F3): , (H1): –(H2): , (U): , (M): , and let $\{(u_{\rho,h},m^{\epsilon}_{\rho,h})\}_{\rho,h,\epsilon>0}$ be solutions of the discrete MFG system (27). If $\rho_{n},h_{n},\epsilon_{n},r_{n}\to 0$ under the CFL conditions $\frac{\rho_{n}^{2}}{h_{n}},\frac{h_{n}}{r_{n}^{\sigma}},\frac{\sqrt{h_{n}}}{\epsilon_{n}}=o(1)$ , then:

(i)

$\{u_{\rho_{n},h_{n}}\}_{n}$ is precompact in $C_{b}([0,T]\times K)$ for every compact set $K\subset\mathbb{R}^{d}$ .
(ii)

$\{m^{\epsilon_{n}}_{\rho_{n},h_{n}}\}_{n}$ is precompact in $C([0,T],P(\mathbb{R}^{d}))$ .
(iii)

If $(u,m)$ is a limit point of $\{(u_{\rho_{n},h_{n}},m^{\epsilon_{n}}_{\rho_{n},h_{n}})\}_{n}$ , then $(u,m)$ is a viscosity-very weak solution of the MFG system (1).

These results give compactness of the approximations and convergence along subsequences. To be precise, by part (i) and (ii) there are convergent subsequences, and by part (iii) the corresponding limits are solutions of the MFG system (1).
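One simple way to satisfy the CFL conditions of Theorems 4.1 and 4.3 is to tie all parameters to the time step through powers of $h$ . The sketch below is an illustrative assumption rather than a recommendation from the analysis: it takes $\rho=h^{a}$ , $r=h^{b/\sigma}$ , and $\epsilon=h^{c}$ with $a>\frac{1}{2}$ , $0<b<1$ , and $0<c<\frac{1}{2}$ , so that the three ratios equal $h^{2a-1}$ , $h^{1-b}$ , and $h^{1/2-c}$ .

```python
def cfl_parameters(h, sigma, a=0.75, b=0.5, c=0.25):
    """Return (rho, r, eps) = (h**a, h**(b/sigma), h**c) for a time step h in (0, 1)."""
    if not (a > 0.5 and 0.0 < b < 1.0 and 0.0 < c < 0.5):
        raise ValueError("exponents must satisfy a > 1/2, 0 < b < 1, 0 < c < 1/2")
    return h**a, h**(b / sigma), h**c

def cfl_ratios(h, sigma, **exponents):
    """Return (rho^2 / h, h / r^sigma, sqrt(h) / eps), which should all tend to zero."""
    rho, r, eps = cfl_parameters(h, sigma, **exponents)
    return rho**2 / h, h / r**sigma, h**0.5 / eps

# Illustrative values of sigma and h (assumptions)
for h in (1e-2, 1e-3, 1e-4):
    print(h, cfl_ratios(h, sigma=1.5))
```

With the default exponents the ratios decay like $h^{1/2}$ , $h^{1/2}$ , and $h^{1/4}$ ; other admissible exponents trade accuracy in one parameter against cost in another.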

We immediately have existence for (1).

Corollary 4.4 (Existence of solutions of (1)).

Under the assumptions of either Theorem 4.1 or 4.3, there exists a viscosity-very weak solution $(u,m)$ of the MFG system (1).

If in addition we have uniqueness for the MFG system (1), then we have full convergence of the sequence of approximations.

Corollary 4.5.

Under the assumption of either Theorem 4.1 or Theorem 4.3, if the MFG system (1) has at most one viscosity-very weak solution, then the whole sequence $\{(u_{\rho_{n},h_{n}},m^{\epsilon_{n}}_{\rho_{n},h_{n}})\}_{n}$ converges to a limit $(u,m)$ which is the (unique) viscosity-very weak solution of the MFG system (1).

4.2. Convergence to classical solutions

In the case the individual equations are regularising, we can get convergence to classical solutions of the MFG system. To be precise we need:

1.

(“Weak” uniqueness of individual PDEs) The HJB equation has unique viscosity solutions, and the FPK equation has unique very weak solutions.
2.

(Smoothness of individual PDEs) Both equations have classical solutions.

This means that viscosity-very weak solutions of the MFG system automatically (by uniqueness for individual equations) are classical solutions. If in addition

3.

(Classical uniqueness for MFG) classical solutions of the MFG system are unique,

we get full convergence of the approximate solutions to the solution of the MFG system.

We now give a precise result in the setting of [37], see Theorem 2.7 in Section 2 for existence and uniqueness of classical solutions of (1).

Corollary 4.6.

Assume ( $\nu$ 0): –( $\nu$ 2): , (L1): –(L3): , (F1): –(F4): , (H3): –(H4): , and (M”): . Let $(u_{\rho,h},m^{\epsilon}_{\rho,h})$ be solutions of the discrete MFG system (27). If $\rho_{n},h_{n},\epsilon_{n},r_{n}\to 0$ under the CFL conditions $\frac{\rho_{n}^{2}}{h_{n}},\frac{h_{n}}{r_{n}^{\sigma}},\frac{\sqrt{h_{n}}}{\epsilon_{n}}=o(1)$ , then:

(a) $\{(u_{\rho_{n},h_{n}},m^{\epsilon_{n}}_{\rho_{n},h_{n}})\}_{n}$ has a convergent subsequence in the space $C_{b,\textup{loc}}([0,T]\times\mathbb{R}^{d})\times C([0,T],P(\mathbb{R}^{d}))$ , and any limit point is a classical-classical solution of (1).

(b) If in addition (F5): and (H5): hold, then the whole sequence in (a) converges to the unique classical-classical solution $(u,m)$ of (1).

Proof.

1. Assumption (U): holds by Theorem 2.7, and then by Theorem 4.3, there is a convergent subsequence $\{(u_{\rho_{n},h_{n}},m_{\rho_{n},h_{n}}^{\epsilon_{n}})\}_{n}$ such that $(u_{\rho_{n},h_{n}},m_{\rho_{n},h_{n}}^{\epsilon_{n}})\to(u,m)$ and $(u,m)$ is a viscosity-very weak solution of (1).

2. Since $m\in C([0,T],P(\mathbb{R}^{d}))$ , the viscosity solution $u$ is unique by Proposition 2.5 (b) (see also [37, Theorem $5.3$ ]). Hence it coincides with the classical $C_{b}^{1,3}((0,T)\times\mathbb{R}^{d})$ solution given by [37, Theorem $5.5$ ].

3. Now $D_{p}H(x,Du(t))\in C_{b}^{2}(\mathbb{R}^{d})$ by part 2 and (H3): , and then by Proposition 2.8 there is at most one very weak solution of the FPK equation. Hence it coincides with the classical $C_{b}^{1,2}((0,T)\times\mathbb{R}^{d})$ solution given by [37, Proposition $6.8$ ].

4. If in addition (F5): and (H5): hold, there is at most one classical solution $(u,m)$ by Theorem 2.7 (b).

5. This shows (compactness, smoothness, and uniqueness) that all convergent subsequences of $\{(u_{\rho_{n},h_{n}},m_{\rho_{n},h_{n}}^{\epsilon_{n}})\}_{n}$ have the same limit, and thus the whole sequence converges to $(u,m)$ , the unique classical solution of (1). ∎

4.3. Extension and discussion

Extension to more general Lévy operators

The results of Theorem 4.1 and 4.3 hold under much more general assumptions on the Lévy operator $\mathcal{L}$ . In [37] they use ( $\nu$ 0): together with the assumptions,

( $\nu$ 1^′):

$\displaystyle r^{-2+\sigma}\int_{|z|<r}|z|^{2}d\nu+r^{-1+\sigma}\int_{r<|z|<1}|z|d\nu+r^{\sigma}\int_{r<|z|<1}d\nu\leq c,\ r\in(0,1)$ .

( $\nu$ 2^′):

There are $\sigma\in(1,2)$ and $\mathcal{K}>0$ such that the heat kernels $K_{\sigma}$ and $K_{\sigma}^{*}$ of $\mathcal{L}$ and $\mathcal{L}^{*}$ satisfy for $K=K_{\sigma},K_{\sigma}^{*}$ : $K\geq 0$ , $\|K(t,\cdot)\|_{L^{1}(\mathbb{R}^{d})}=1$ , and

\displaystyle\|D^{\beta}K(t,\cdot)\|_{L^{p}(\mathbb{R}^{d})}\leq\mathcal{K}t^{-\frac{1}{\sigma}\big{(}|\beta|+(1-\frac{1}{p})d\big{)}}\quad\text{for $t\in(0,T)$}

and any $p\in[1,\infty)$ and multi-index $\beta\in\mathbb{N}^{d}\cup\{0\}$ .

Here the heat kernel of the operator $\mathcal{L}$ is defined as the fundamental solution of the heat equation $\partial_{t}u-\mathcal{L}u=0$ . These assumptions cover many new cases compared to ( $\nu$ 0): , ( $\nu$ 1): , and ( $\nu$ 2): . New cases include (i) sums of operators satisfying ( $\nu$ 1): on subspaces spanning $\mathbb{R}^{d}$ , having possibly different orders, (ii) more general non-absolutely continuous Lévy measures, and (iii) Lévy measures supported on positive cones. An example of (i) (cf. [37]) is

\mathcal{L}=-\Big{(}\!-\frac{\partial^{2}}{\partial x_{1}^{2}}\Big{)}^{\sigma_{1}/2}-\dots-\Big{(}\!-\frac{\partial^{2}}{\partial x_{d}^{2}}\Big{)}^{\sigma_{d}/2},\qquad\sigma_{1},\dots,\sigma_{d}\in(1,2),

which satisfies ( $\nu$ 1^′): with $\sigma=\min_{i}\sigma_{i}$ and $d\nu(z)=\sum_{i=1}^{d}\frac{dz_{i}}{|z_{i}|^{1+\sigma_{i}}}\Pi_{j\neq i}\delta_{0}(dz_{j})$ . This is a sum of one-dimensional fractional Laplacians of different orders. An example of (iii) is given by the spectrally positive “fractional Laplacian” in one space dimension: $\mathcal{L}u=c_{\sigma}\int_{0}^{\infty}(u(x+z)-u(x)-Du(x)\cdot z\mathbbm{1}_{\{z<1\}})z^{-1-\sigma}dz$ .
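For a single one-dimensional component the three integrals in the first of the primed assumptions can be computed by hand. As a sanity check, take $\nu(dz)=\frac{dz}{|z|^{1+\sigma}}$ on $\mathbb{R}$ with $\sigma\in(1,2)$ (the normalising constant is omitted, which is an assumption made for readability). Then, for $r\in(0,1)$ ,

```latex
\begin{aligned}
r^{-2+\sigma}\int_{|z|<r}|z|^{2}\,d\nu
  &= r^{-2+\sigma}\cdot 2\int_{0}^{r}z^{1-\sigma}\,dz=\frac{2}{2-\sigma},\\
r^{-1+\sigma}\int_{r<|z|<1}|z|\,d\nu
  &= r^{-1+\sigma}\cdot 2\int_{r}^{1}z^{-\sigma}\,dz=\frac{2\,(1-r^{\sigma-1})}{\sigma-1}\le\frac{2}{\sigma-1},\\
r^{\sigma}\int_{r<|z|<1}d\nu
  &= r^{\sigma}\cdot 2\int_{r}^{1}z^{-1-\sigma}\,dz=\frac{2\,(1-r^{\sigma})}{\sigma}\le\frac{2}{\sigma}.
\end{aligned}
```

All three terms are bounded uniformly in $r\in(0,1)$ , so the condition holds with $c=\frac{2}{2-\sigma}+\frac{2}{\sigma-1}+\frac{2}{\sigma}$ . The same computation gives $\int_{|z|>r}d\nu=\frac{2}{\sigma}r^{-\sigma}$ for this measure, which is the growth rate of the jump intensity as $r\to 0$ .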

We have the following generalization of the wellposedness result for classical solutions given in Theorem 2.7.

Theorem 4.7 ([37]).

Theorem 2.7 holds when you replace ( $\nu$ 1): – ( $\nu$ 2): by ( $\nu$ 1^′): – ( $\nu$ 2^′): .

It follows that (U): holds whenever Theorem 4.7 holds. Since ( $\nu$ 1): implies ( $\nu$ 1^′): and the integrals in ( $\nu$ 1^′): are what appear in the different proofs, it is easy to check that all estimates in this paper are true for Lévy measures satisfying ( $\nu$ 1^′): instead of ( $\nu$ 1): . This means that under assumption ( $\nu$ 1^′): and ( $\nu$ 2^′): we have the following extensions of Theorems 4.1 and 4.3 and Corollary 4.6.

Theorem 4.8.

Theorem 4.1 holds when you replace ( $\nu$ 1): with ( $\nu$ 1^′): .

Theorem 4.9.

Theorem 4.3 holds when you replace ( $\nu$ 1): – ( $\nu$ 2): by ( $\nu$ 1^′): – ( $\nu$ 2^′): .

Corollary 4.10.

Corollary 4.6 holds when you replace ( $\nu$ 1): – ( $\nu$ 2): by ( $\nu$ 1^′): – ( $\nu$ 2^′): .

The Wasserstein metric $d_{1}$ versus our metric $d_{0}$

The typical setting for the FPK equations in the MFG literature seems to be the metric space $(P_{1}(\mathbb{R}^{d}),d_{1})$ , that is the $1$ -Wasserstein space $\mathcal{W}_{1}$ of probability measures with finite first moment. This is also the case in [25] where convergence results are given for SL schemes for local nondegenerate MFGs in $\mathbb{R}^{d}$ . In this paper we cannot assume finite first moments if we want to cover general non-local operators. An example is the fractional Laplacian $-(-\Delta)^{\frac{\sigma}{2}}$ for $\sigma<1$ , where the underlying $\sigma$ -stable process only has finite moments of order less than $\sigma$ . Instead we consider the weaker metric space $(P(\mathbb{R}^{d}),d_{0})$ , which is just a metrization of the weak (weak-* in $C_{b}$ ) convergence of probability measures. In this topology we can consider processes, probability measures and solutions of the FPK equations that do not have any finite moments or any restrictions on the tail behaviour of the corresponding Lévy measures.

Of course, under additional assumptions convergence in $d_{0}$ implies convergence in $d_{1}$ .

Lemma 4.11.

If $m_{n}$ converges to $m$ in $(P(\mathbb{R}^{d}),d_{0})$ and $m_{n}$ and $m$ have uniformly bounded $(1+\delta)$ -moments for some $\delta>0$ , then $m_{n}\to m$ in $(P_{1}(\mathbb{R}^{d}),d_{1})$ .

Convergence in $P_{1}(\mathbb{R}^{d})$ [53, Definition 6.8] is by definition equivalent to weak convergence plus convergence of first moments, and the result follows from e.g. Proposition 1.1 and Lemma 1.5 in [5].
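The role of the $(1+\delta)$ -moments in Lemma 4.11 can be seen from a one-line tail estimate, sketched here with $C:=\sup_{n}\int_{\mathbb{R}^{d}}|x|^{1+\delta}\,dm_{n}(x)<\infty$ :

```latex
\sup_{n}\int_{|x|>R}|x|\,dm_{n}(x)
\;\le\;\sup_{n}\,R^{-\delta}\int_{|x|>R}|x|^{1+\delta}\,dm_{n}(x)
\;\le\;C\,R^{-\delta}\xrightarrow[R\to\infty]{}0 .
```

Hence the first moments are uniformly integrable, and together with weak convergence this yields $\int|x|\,dm_{n}\to\int|x|\,dm$ , which is the additional ingredient needed for convergence in $d_{1}$ .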

We then have the following version of Theorem 4.1 and Theorem 4.3.

Corollary 4.12.

Assume $m_{0}\in P_{1+\delta}(\mathbb{R})$ , $\int_{\mathbb{R}^{d}\setminus B_{1}}|z|^{1+\delta}d\nu(z)<\infty$ for some $\delta>0$ , and the assumptions of Theorem 4.1 and Theorem 4.3. Then the statements of Theorem 4.1 and Theorem 4.3 hold if we replace $(P,d_{0})$ by $(P_{1},d_{1})$ in part (ii).

Note that the number of moments of $m$ is determined by the number of moments of $1_{|z|>1}\nu$ (and $m_{0}$ ), see e.g. the discussion in section 2.3 in [37]. Moreover, if $1_{|z|>1}\nu$ has at most $\alpha$ finite moments, then $\mathcal{L}u$ is well-defined only if $u$ has at most order $\alpha$ growth at infinity. Hence in the nonlocal case there is a “duality” between the moments of $m$ and the growth of $u$ . Note that $um$ will always be integrable which is natural since then e.g. $Eu(X_{t},t)=\int u(x,t)m(dx,t)$ is finite.

In our case we assume no moments and have to work with bounded solutions $u$ .

On moments and weak compactness in $L^{p}$ in the degenerate case

Previous results for Semi-Lagrangian schemes in the first order and the degenerate second order case [23, 24] cover the case $m_{0}\in P_{1}(\mathbb{R})\cap L^{\infty}(\mathbb{R})$ , which means that $m_{0}$ has a finite first moment. Our results assume $m_{0}\in P(\mathbb{R})\cap L^{p}(\mathbb{R})$ , for $p\in(1,\infty]$ , and hence no moment bounds and possibly unbounded $m_{0}$ . When $p<\infty$ we have weak compactness in $L^{1}$ instead of weak-* compactness in $L^{\infty}$ .

Since our results in the degenerate case allow for $\mathcal{L}=0$ , they immediately give an extension to this $P\cap L^{p}$ setting of the convergence results for first order problems of [23]. Moreover, the same conditions, arguments, and results also hold in the local diffusive case considered in [24].

5. On the SL scheme for the HJB equation

We prove results for the numerical approximation of the HJB equation, including monotonicity, consistency, and different uniform a priori stability and regularity estimates. Using the “half-relaxed” limit method [12], we then show convergence in the form of $v_{\rho_{n},h_{n}}[\mu_{n}](t_{n},x_{n})\rightarrow v[\mu](t,x)$ , where $v[\mu]$ is the (viscosity) solution of the continuous HJB equation. Let $B(\mathcal{G}_{\rho})$ be the set of all bounded functions defined on $\mathcal{G}_{\rho}$ .

Theorem 5.1.

Assume ( $\nu$ 0): , (L1): , $\rho,h,r>0$ , $\mu\in C([0,T],P(\mathbb{R}^{d}))$ , and let $S_{\rho,h,r}[\mu]$ denote the scheme defined in (19).

(i)

(Bounded control) If $\phi\in\text{Lip}(\mathbb{R}^{d})$ , $S_{\rho,h,r}[\mu](\phi,i,k)$ has a minimal control and $|\alpha|\leq K$ where $K$ only depends on $\|D_{x}\phi\|_{0}$ and the growth of $L$ as $|x|\rightarrow\infty$ .

(ii)

(Monotonicity) For all $v,w\in B(\mathcal{G}_{\rho})$ with $v\leq w$ we have,

S_{\rho,h,r}[\mu](v,i,k)\leq S_{\rho,h,r}[\mu](w,i,k)\text{ for all }i\in\mathcal{G}_{\rho},\ k=0,\ldots,N-1.

(iii)

(Commutation by constant) For every $c\in\mathbb{R}$ and $w\in B(\mathcal{G}_{\rho})$ ,

S_{\rho,h,r}[\mu](w+c,i,k)=S_{\rho,h,r}[\mu](w,i,k)+c\text{ for all }i\in\mathcal{G}_{\rho},\ k=0,\ldots,N-1.

Assume also ( $\nu$ 1): and (F2): .

(iv)

(Consistency) Let $\rho_{n},h_{n},r_{n}\xrightarrow{n\to\infty}0$ under CFL conditions $\frac{\rho_{n}^{2}}{h_{n}},\frac{h_{n}}{r_{n}^{\sigma}}=o(1)$ , grid points $(t_{k_{n}},x_{i_{n}})\to(t,x)$ , and $\mu_{n},\mu\in C([0,T];P(\mathbb{R}^{d}))$ such that $\mu_{n}\to\mu$ . Then, for every $\phi\in C_{c}^{\infty}(\mathbb{R}^{d}\times[0,T))$ ,

\displaystyle\begin{split}&\lim_{n\to\infty}\frac{1}{h_{n}}\big{[}\phi(t_{k_{n}},x_{i_{n}})-S_{\rho_{n},h_{n},r_{n}}[\mu_{n}](\phi_{\cdot,k_{n}+1},i_{n},k_{n})\big{]}\\ &\qquad=-\partial_{t}\phi(t,x)-\inf_{\alpha\in\mathbb{R}^{d}}\big{[}L(x,\alpha)-D\phi\cdot\alpha\big{]}-\mathcal{L}\phi(x)-F(x,\mu(t)).\end{split}

Proof.

(i) Since

\displaystyle g(\alpha):=\frac{e^{-h\lambda_{r}}}{2d}\sum_{m=1}^{d}\Big{(}I[\phi](x_{i}+h\bar{b}(\alpha)+\sqrt{hd}\sigma_{r}^{m})+I[\phi](x_{i}+h\bar{b}(\alpha)-\sqrt{hd}\sigma_{r}^{m})\Big{)}

is Lipschitz in $\alpha$ (at most linear growth at infinity), while $L(x,\alpha)$ is coercive (more than linear growth at infinity) by (L1): , there exists a ball $B_{R}$ , where $R$ depends on the Lipschitz constant of $I[\phi]$ and the growth of $L$ , such that the minimizing control $\bar{\alpha}$ of $S_{\rho,h,r}[\mu](\phi,i,k)$ belongs to $B_{R}$ .

(ii) and (iii) follow directly from the definition of the scheme.

(iv) For ease of notation, we write $\rho,h,r,\mu$ instead of $\rho_{n},h_{n},r_{n},\mu_{n}$ . A $4$ th order Taylor expansion of $\phi$ gives

\displaystyle\begin{split}\phi(x+h\bar{b}(\alpha)\pm\sqrt{hd}\sigma_{r}^{m})&=\phi(x)+D\phi(x)\cdot(h\bar{b}(\alpha)\pm\sqrt{hd}\sigma_{r}^{m})\\ &\quad+\frac{hd}{2}(\sigma_{r}^{m})^{T}D^{2}\phi(x)\sigma_{r}^{m}\pm\sqrt{d}\,h^{\frac{3}{2}}\bar{b}(\alpha)^{T}D^{2}\phi(x)\sigma_{r}^{m}+\frac{h^{2}}{2}\bar{b}(\alpha)^{T}D^{2}\phi(x)\bar{b}(\alpha)\\ &\quad+\sum_{|\beta|=3}\frac{D^{\beta}\phi(x)}{\beta!}(h\bar{b}(\alpha)\pm\sqrt{hd}\,\sigma_{r}^{m})^{\beta}+\sum_{|\beta|=4}\frac{D^{\beta}\phi(\xi_{\pm})}{\beta!}(h\bar{b}(\alpha)\pm\sqrt{hd}\,\sigma_{r}^{m})^{\beta},\end{split}

for some $\xi_{\pm}\in\mathbb{R}^{d}$ . Using that $\bar{b}(\alpha)=-\alpha-\int_{r\leq|z|\leq 1}z\nu(dz)$ , and by ( $\nu$ 1): $\int_{r\leq|z|\leq 1}z\nu(dz)=O(r^{1-\sigma})$ , we get that

(28)

\displaystyle\begin{split}&\phi(x+h\bar{b}(\alpha)+\sqrt{hd}\sigma_{r}^{m})+\phi(x+h\bar{b}(\alpha)-\sqrt{hd}\sigma_{r}^{m})-2\phi(x)\\ &=-2hD\phi(x)\cdot\alpha-2h\int_{r<\|z\|<1}D\phi(x)\cdot z\nu(dz)+hd(\sigma_{r}^{m})^{T}\cdot D^{2}\phi(x)\cdot\sigma_{r}^{m}+\mathcal{O}\big{(}h^{2}r^{2-2\sigma}\big{)}.\end{split}

We used that $\frac{h^{2}}{2}\bar{b}(\alpha)^{T}D^{2}\phi(x)\bar{b}(\alpha)$ is of order $\mathcal{O}(h^{2}r^{2-2\sigma})$ , the $3$ rd order terms are of order $\mathcal{O}(h^{3}r^{3-3\sigma}+h^{2}r^{1-\sigma})$ , and the $4$ th order terms are of order $(\sqrt{hd}\sigma_{r})^{4}=\mathcal{O}(h^{2}r^{4-2\sigma})$ . Then the error of the Taylor expansion is $O(h^{2}r^{2-2\sigma})$ . Using Lemma 3.1,

(29)

\displaystyle\begin{split}&\phi(x_{i})-S_{\rho,h,r}[\mu](\phi,i,k)\\ &=\ \phi(x_{i})-\inf_{\alpha}\bigg{[}hF(x_{i},\mu(t_{k+1}))+hL(x_{i},\alpha)+\frac{e^{-h\lambda_{r}}}{2d}\sum_{m=1}^{d}\Big{(}2\phi(x_{i})-2hD\phi(x_{i})\cdot\alpha\\ &\hskip 71.13188pt+hd(\sigma_{r}^{m})^{T}D^{2}\phi(x_{i})\sigma_{r}^{m}-2h\int_{r<\|z\|<1}D\phi(x_{i})\cdot z\nu(dz)\Big{)}\\ &\hskip 71.13188pt+\frac{1-e^{-h\lambda_{r}}}{\lambda_{r}}\int_{\|z\|>r}\phi(x_{i}+z)\nu(dz)+\mathcal{O}(\rho^{2})+\mathcal{O}(h^{2}r^{2-2\sigma})\bigg{]}\\ &=\ -hF(x_{i},\mu(t_{k+1}))-\inf_{\alpha}\bigg{[}hL(x_{i},\alpha)-he^{-h\lambda_{r}}D\phi(x_{i})\cdot\alpha\bigg{]}+(1-e^{-h\lambda_{r}})\phi(x_{i})\\ &\quad-he^{-h\lambda_{r}}\Big{(}\mathcal{L}_{r}\phi(x_{i})+\mathcal{O}(r^{3-\sigma})\Big{)}+he^{-h\lambda_{r}}\int_{r<\|z\|<1}D\phi(x_{i})\cdot z\nu(dz)\\ &\quad-\frac{1-e^{-h\lambda_{r}}}{\lambda_{r}}\int_{\|z\|>r}\phi(x_{i}+z)\nu(dz)+\mathcal{O}(\rho^{2}+h^{2}r^{2-2\sigma}).\end{split}

Using that $\int_{|z|<r}|z|^{2}\nu(dz)\leq Kr^{2-\sigma}$ (by ( $\nu$ 1): ), for the small jump operator $\mathcal{L}_{r}$ (defined in (12)) we have

(30)

\displaystyle|\mathcal{L}_{r}\phi(x_{i})-e^{-h\lambda_{r}}\mathcal{L}_{r}\phi(x_{i})|\leq h\lambda_{r}\,r^{2-\sigma}\|D^{2}\phi\|_{0}.

Again, as $\int_{r<|z|<1}|z|\nu(dz)\leq Kr^{1-\sigma}$ and $\int_{|z|>1}\nu(dz)\leq K$ , for the long jump operator $\mathcal{L}^{r}$ (defined in (12)) we have that

	$\displaystyle\Big{|}\mathcal{L}^{r}\phi(x_{i})$	$\displaystyle+e^{-h\lambda_{r}}\int_{r<|z|<1}D\phi(x_{i})\cdot z\nu(dz)-\frac{1-e^{-h\lambda_{r}}}{h\lambda_{r}}\int_{|z|>r}(\phi(x_{i}+z)-\phi(x_{i}))\nu(dz)\Big{|}$
		$\displaystyle\leq K(1-e^{-h\lambda_{r}})r^{1-\sigma}\|D\phi\|_{0}+K\Big{(}1-\frac{1-e^{-h\lambda_{r}}}{h\lambda_{r}}\Big{)}\Big{(}r^{1-\sigma}\|D\phi\|_{0}+\|\phi\|_{0}\Big{)}$
(31)			$\displaystyle\leq K\Big{(}h\lambda_{r}r^{1-\sigma}\|D\phi\|_{0}+h\lambda_{r}\|\phi\|_{0}\Big{)}.$

Recalling that $\mathcal{L}\phi(x_{i})=\mathcal{L}_{r}\phi(x_{i})+\mathcal{L}^{r}\phi(x_{i})$, combining (29) with (30) and (31), we find

	$\displaystyle\phi(x_{i})-S_{\rho,h,r}[\mu](\phi,i,k)$	$\displaystyle=hF(x_{i},\mu(t_{k+1}))-h\inf_{\alpha}\bigg{[}L(x_{i},\alpha)-D\phi(x_{i})\cdot\alpha\bigg{]}-h\mathcal{L}\phi(x_{i})$
		$\displaystyle\hskip 28.45274pt+\mathcal{O}\big{(}h^{2}\lambda_{r}+hr^{3-\sigma}+h^{2}\lambda_{r}r^{1-\sigma}+\rho^{2}+h^{2}r^{2-2\sigma}\big{)}.$

As $|\lambda_{r}|\leq Cr^{-\sigma}$ , we have

	$\displaystyle\frac{\phi(t_{k},x_{i})-\phi(t_{k+1},x_{i})}{h}+\frac{1}{h}\Big{(}\phi(t_{k+1},x_{i})-S_{\rho,h}[\mu](\phi_{\cdot,k+1},i,k)\Big{)}$
	$\displaystyle=-\partial_{t}\phi(t_{k},x_{i})-\mathcal{L}\phi(t_{k+1},x_{i})+F(x_{i},\mu(t_{k+1}))-\inf_{\alpha}\bigg{[}L(x_{i},\alpha)-D\phi(t_{k+1},x_{i})\cdot\alpha\bigg{]}$
	$\displaystyle\hskip 113.81102pt+\mathcal{O}\big{(}h+hr^{-\sigma}+r^{3-\sigma}+hr^{1-2\sigma}+\frac{\rho^{2}}{h}+hr^{2-2\sigma}\big{)}.$

Hence the result follows by taking the limit $n\to\infty$ with $\frac{\rho_{n}^{2}}{h_{n}},\frac{h_{n}}{r_{n}^{\sigma}}=o(1)$ . ∎
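
To make the last step concrete, the following minimal Python sketch evaluates the six terms of the $\mathcal{O}(\cdot)$ bound in the last display along one refinement sequence. The function names and the refinement rule $h_{n}=\rho_{n}=r_{n}=2^{-n}$ with $\sigma=\frac{1}{2}$ are illustrative assumptions and are not taken from the paper; for this particular choice every term tends to zero.

```python
def consistency_error_terms(h, rho, r, sigma):
    """Return the six terms of the O(...) consistency bound (constants dropped)."""
    return {
        "h": h,
        "h r^-sigma": h * r ** (-sigma),
        "r^(3-sigma)": r ** (3.0 - sigma),
        "h r^(1-2sigma)": h * r ** (1.0 - 2.0 * sigma),
        "rho^2/h": rho ** 2 / h,
        "h r^(2-2sigma)": h * r ** (2.0 - 2.0 * sigma),
    }


def total_error(n, sigma=0.5):
    """Sum of the terms for the hypothetical rule h_n = rho_n = r_n = 2^-n."""
    h = rho = r = 2.0 ** (-n)
    return sum(consistency_error_terms(h, rho, r, sigma).values())


if __name__ == "__main__":
    for n in (4, 8, 12, 16):
        print(n, total_error(n))
```

Other values of $\sigma$ or other refinement rules can be tested by changing the arguments; the sketch only evaluates the terms and does not replace the analysis above.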

Theorem 5.2.

(Comparison) Assume $\mu_{1},\mu_{2}\in C([0,T],P(\mathbb{R}^{d}))$ , ( $\nu$ 0): , and (L1): . Let $u_{\rho,h}[\mu_{1}]$ and ${u}_{\rho,h}[\mu_{2}]$ be defined by the scheme (20) for $\mu=\mu_{1},\mu_{2}$ , respectively. Then,

\displaystyle\|u_{\rho,h}[\mu_{1}]-u_{\rho,h}[\mu_{2}]\|_{0}\leq T\|F(\cdot,\mu_{1})-F(\cdot,\mu_{2})\|_{0}+\|G(\cdot,\mu_{1}(T))-G(\cdot,\mu_{2}(T))\|_{0}.

Proof.

Let $c^{\pm}_{m}(\alpha):=h\bar{b}(\alpha)\pm\sqrt{hd}\sigma_{r}^{m}$ , and note that

(32)

\displaystyle I[u_{\cdot,k+1}[\mu_{1}]](x)-I[u_{\cdot,k+1}[\mu_{2}]](x)=

\displaystyle\sum_{p\in\mathbb{Z}^{d}}\beta_{p}(x)(u_{p,k+1}[\mu_{1}]-u_{p,k+1}[\mu_{2}]).

By (18) and the definition of $\inf$ , for any $\epsilon>0$ , there is $\alpha_{\epsilon}\in\mathbb{R}^{d}$ such that

	$\displaystyle u_{i,k}$	$\displaystyle[\mu_{2}]\geq\,hF(x_{i},\mu_{2}(t_{k}))+hL(x_{i},\alpha_{\epsilon})+\frac{e^{-h\lambda_{r}}}{2d}\sum_{m=1}^{d}\Big{[}I[u_{\cdot,k+1}[\mu_{2}]](x_{i}+c^{+}_{m}(\alpha_{\epsilon}))$
(33)			$\displaystyle+I[u_{\cdot,k+1}[\mu_{2}]](x_{i}+c^{-}_{m}(\alpha_{\epsilon}))\Big{]}+\frac{1-e^{-h\lambda_{r}}}{\lambda_{r}}\int_{\|z\|\geq r}I[u_{\cdot,k+1}[\mu_{2}]](x_{i}+z)\nu(dz)-\epsilon.$

We then find, using (18), (32), (33),

	$\displaystyle u_{i,k}[\mu_{1}]$	$\displaystyle-u_{i,k}[\mu_{2}]\leq h\big{(}F(x_{i},\mu_{1}(t_{k}))-F(x_{i},\mu_{2}(t_{k}))\big{)}+h(L(x_{i},\alpha_{\epsilon})-L(x_{i},\alpha_{\epsilon}))$
		$\displaystyle\quad+\sum_{p\in\mathbb{Z}^{d}}\bigg{[}\frac{e^{-h\lambda_{r}}}{2d}\sum_{m=1}^{d}\Big{(}\beta_{p}(c^{+}_{m}(\alpha_{\epsilon}))+\beta_{p}(c^{-}_{m}(\alpha_{\epsilon}))\Big{)}\big{(}u_{p+i,k+1}[\mu_{1}]-u_{p+i,k+1}[\mu_{2}]\big{)}$
		$\displaystyle\quad+\frac{1-e^{-h\lambda_{r}}}{\lambda_{r}}\int_{|z|\geq r}\beta_{p}(z)\big{(}u_{p+i,k+1}[\mu_{1}]-u_{p+i,k+1}[\mu_{2}]\big{)}\nu(dz)\bigg{]}+\epsilon$
		$\displaystyle\leq h\|F(\cdot,\mu_{1})-F(\cdot,\mu_{2})\|_{0}+c\sup_{i}|u_{i,k+1}[\mu_{1}]-u_{i,k+1}[\mu_{2}]|+\epsilon,$

where since $\sum_{p}\beta_{p}\equiv 1$ ,

\displaystyle c=\frac{e^{-h\lambda_{r}}}{2d}\sum_{m=1}^{d}\sum_{p\in\mathbb{Z}^{d}}\Big{(}\beta_{p}(c^{+}_{m}(\alpha_{\epsilon}))+\beta_{p}(c^{-}_{m}(\alpha_{\epsilon}))\Big{)}+\frac{1-e^{-h\lambda_{r}}}{\lambda_{r}}\int_{|z|\geq r}\sum_{p\in\mathbb{Z}^{d}}\beta_{p}(z)\nu(dz)=1.

Since $|u_{i,N}[\mu_{1}]-u_{i,N}[\mu_{2}]|\leq\|G(\cdot,\mu_{1}(t_{N}))-G(\cdot,\mu_{2}(t_{N}))\|_{0}$ , a symmetry and iteration argument shows that

\displaystyle\big{|}u_{i,k}[\mu_{1}]-u_{i,k}[\mu_{2}]\big{|}\leq(N-k)h\|F(\cdot,\mu_{1})-F(\cdot,\mu_{2})\|_{0}+\|G(\cdot,\mu_{1}(t_{N}))-G(\cdot,\mu_{2}(t_{N}))\|_{0}.

The result then follows from interpolation and $T=Nh$ . ∎
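
The mechanism of the proof (a monotone scheme whose weights sum to one is non-expansive in the sup norm) can be observed on a toy problem. The sketch below is not the scheme (18): it is a simplified one-dimensional semi-Lagrangian iteration with local diffusion only, the assumed running cost $L(x,\alpha)=\alpha^{2}/2$, a finite control set, and constant extension at the boundary via `numpy.interp`. The names `solve_toy_sl` and `stability_gap`, the functions `F1, F2, G1, G2`, and all parameter values are hypothetical.

```python
import numpy as np


def solve_toy_sl(F, G, x, h, n_steps, alphas, s=0.5):
    """Toy backward SL iteration (assumption: local diffusion, L = a^2/2).

    u_k(x_i) = min_a [ h F(x_i) + h a^2/2
                       + 0.5 * (I[u_{k+1}](x_i - h a + sqrt(h) s)
                              + I[u_{k+1}](x_i - h a - sqrt(h) s)) ]
    where I is piecewise linear interpolation (clamped at the boundary).
    """
    u = G(x)
    for _ in range(n_steps):
        candidates = []
        for a in alphas:
            foot = x - h * a
            up = np.interp(foot + np.sqrt(h) * s, x, u)
            um = np.interp(foot - np.sqrt(h) * s, x, u)
            candidates.append(h * F(x) + 0.5 * h * a**2 + 0.5 * (up + um))
        u = np.min(np.stack(candidates), axis=0)
    return u


def stability_gap(T=1.0, n_steps=50):
    """Return (sup|u1 - u2|, T sup|F1 - F2| + sup|G1 - G2|) on the grid."""
    x = np.linspace(-3.0, 3.0, 121)
    alphas = np.linspace(-2.0, 2.0, 21)
    h = T / n_steps
    F1 = lambda y: np.cos(y)
    F2 = lambda y: np.cos(y) + 0.1 * np.sin(3.0 * y)
    G1 = lambda y: np.tanh(y)
    G2 = lambda y: np.tanh(y) + 0.05 * np.cos(y)
    u1 = solve_toy_sl(F1, G1, x, h, n_steps, alphas)
    u2 = solve_toy_sl(F2, G2, x, h, n_steps, alphas)
    lhs = np.max(np.abs(u1 - u2))
    rhs = T * np.max(np.abs(F1(x) - F2(x))) + np.max(np.abs(G1(x) - G2(x)))
    return lhs, rhs
```

Since each step is a minimum over convex combinations of the values at the next time level, the first returned number should not exceed the second, in line with the estimate of Theorem 5.2.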

The SL scheme is very stable in the sense that its solutions $u_{i,k}$ are bounded, Lipschitz continuous, and semi-concave, uniformly in $h,\rho,\mu$.

Lemma 5.3.

Let $\mu\in C([0,T],P(\mathbb{R}^{d}))$ and $u_{i,k}[\mu]$ be defined by the scheme (18).

(a)

(Lipschitz continuity) Assume ( $\nu$ 0): , (L2): and (F2): . Then

\displaystyle\frac{|u_{i,k}-u_{j,k}|}{|x_{i}-x_{j}|}\leq(L_{F}+L_{L})(T-t_{k})+L_{G},\quad i,j\in\mathbb{Z}^{d},\ k\in\{0,1,\ldots N\}.

(b)

(Semi-concavity) Assume ( $\nu$ 0): , (L3): and (F3): . Then

\displaystyle\frac{u_{i+j,k}-2u_{i,k}+u_{i-j,k}}{|x_{j}|^{2}}\leq(c_{F}+c_{L})(T-t_{k})+c_{G},\quad i,j\in\mathbb{Z}^{d},\ k\in\{0,1,\ldots N\}.

(c)

(Uniformly bounded) Assume ( $\nu$ 0): , (L0): –(L2): , (F1): , and (F2): . Then

$\displaystyle|u_{i,k}|\leq(C_{F}+C_{L}(K))(T-t_{k})+C_{G},\quad i,j\in\mathbb{Z}^{d},\ k\in\{0,1,\ldots N\},$

where $K$ is defined in Theorem 5.1 (i).

Proof.

(a) Note that since $\beta_{m}(x_{j}+x)=\beta_{m-j}(x)$ ,

(34)

\displaystyle I[u_{\cdot,k+1}](x_{j}+x)-I[u_{\cdot,k+1}](x_{i}+x)=

\displaystyle\sum_{p\in\mathbb{Z}^{d}}\beta_{p}(x)(u_{p+j,k+1}-u_{p+i,k+1}).

Then, by (L2): , (F2): , and similar computations as in Theorem 5.2, we find that

\displaystyle u_{j,k}-u_{i,k}\leq h(L_{F}+L_{L})|x_{i}-x_{j}|+\sup_{p}|u_{p+j,k+1}-u_{p+i,k+1}|+\epsilon.

Since $|u_{i,N}-u_{j,N}|=|G(x_{i},\mu(t_{N}))-G(x_{j},\mu(t_{N}))|\leq L_{G}|x_{i}-x_{j}|$ by (F2): , the result follows by iteration and sending $\epsilon\to 0$.
(b) Similar to (34) we see

		$\displaystyle I[u_{\cdot,k+1}](x_{i+j}+x)-2I[u_{\cdot,k+1}](x_{i}+x)+I[u_{\cdot,k+1}](x_{i-j}+x)$
	$\displaystyle=$	$\displaystyle\sum_{p\in\mathbb{Z}^{d}}\beta_{p}(x_{i}+x)(u_{p+j,k+1}-2u_{p,k+1}+u_{p-j,k+1}).$

Then, by (L3): , (F3): , and similar computations as in Theorem 5.2, we find that

\displaystyle u_{i+j,k}-2u_{i,k}+u_{i-j,k}\leq(c_{L}+c_{F})h|x_{j}|^{2}+\sup_{i}(u_{i+j,k+1}-2u_{i,k+1}+u_{i-j,k+1}).

Since $u_{i+j,N}-2u_{i,N}+u_{i-j,N}\leq c_{G}|x_{j}|^{2}$ by (F3): , the result follows by iteration.

(c) From the scheme (18) we see that

\displaystyle-\sup_{|\alpha|\leq K}\Big{(}h(|F|+|L|)+\sup_{j}|u_{j,k+1}|\Big{)}\leq u_{i,k}\leq\sup_{|\alpha|\leq K}\Big{(}h(|F|+|L|)+\sup_{j}|u_{j,k+1}|\Big{)}.

The result follows from (L1): and (F1): . ∎

Theorem 5.4.

(Convergence of the HJB scheme) Assume ( $\nu$ 0): , ( $\nu$ 1): , (F1): , (F2): , (L2): , $\rho_{n},h_{n},r_{n}\xrightarrow{n\to\infty}0$ under CFL conditions $\frac{\rho_{n}^{2}}{h_{n}},\frac{h_{n}}{r_{n}^{\sigma}}=o(1)$ , $\mu_{n}\rightarrow\mu$ in $C([0,T],P(\mathbb{R}^{d}))$ , and $u_{\rho_{n},h_{n}}[\mu_{n}]$ is the solution of the scheme (18) defined by (20). Then there is a continuous bounded function $u[\mu]$ such that $u_{\rho_{n},h_{n}}[\mu_{n}]\rightarrow u[\mu]$ locally uniformly in $\mathbb{R}^{d}\times[0,T]$ , and $u[\mu]$ is the viscosity solution of the HJB equation in (1) for $m=\mu$ .

Proof.

The result follows from the Barles-Perthame-Souganidis relaxed limit method [12], using the monotonicity, consistency, and $L^{\infty}$ -stability properties of the scheme (cf. Theorem 5.1 (ii), (iii), and Lemma 5.3 (c)), and the strong comparison principle for the HJB equation in Proposition 2.5 (a). We refer to the proof of [23, Theorem 3.3] for a standard but more detailed proof in a similar case. ∎

We recall that the continuous extensions $u_{\rho,h}[\mu](t,x)$ and $u_{\rho,h}^{\epsilon}[\mu](t,x)$ are defined in (20) and (22), respectively. The results of Lemma 5.3 transfer to $u_{\rho,h}^{\epsilon}[\mu](t,x)$.

Lemma 5.5.

Let $\mu\in C([0,T],P(\mathbb{R}^{d}))$ and $u_{\rho,h}^{\epsilon}[\mu]$ be given by (22).

(a)

(Lipschitz continuity) Assume ( $\nu$ 0): , (L2): and (F2): . Then

\displaystyle\big{|}u_{\rho,h}^{\epsilon}[\mu](t,x)-u_{\rho,h}^{\epsilon}[\mu](t,y)\big{|}\leq((L_{L}+L_{F})T+L_{G})|x-y|.

(b)

(Approximate semiconcavity) Assume ( $\nu$ 0): , (L2): , (L3): , (F2): , and (F3): . Then there exists a constant $c_{1}>0$, independent of $\rho,h,\epsilon$ and $\mu$, such that

		$\displaystyle u_{\rho,h}^{\epsilon}[\mu](t,x+y)-2u_{\rho,h}^{\epsilon}[\mu](t,x)+u_{\rho,h}^{\epsilon}[\mu](t,x-y)\leq c_{1}(\|y\|^{2}+\rho^{2}+\frac{\rho^{2}}{\epsilon}),\ \mbox{and}$
		$\displaystyle\langle Du_{\rho,h}^{\epsilon}[\mu](t,y)-Du_{\rho,h}^{\epsilon}[\mu](t,x),y-x\rangle\leq c_{1}\Big{(}\|x-y\|^{2}+\frac{\rho^{2}}{\epsilon^{2}}\Big{)}.$

(c)

Assume $d=1$ , ( $\nu$ 0): , (L3): , and (F3): . Then there exists a constant $c_{2}>0$ , independent of $\rho,h,\epsilon$ and $\mu$ , such that for each $i,j\in\mathbb{Z}^{d}$ and $k\in\mathcal{N}_{h}$

\displaystyle\langle Du_{\rho,h}^{\epsilon}[\mu](t_{k},x_{j})-Du_{\rho,h}^{\epsilon}[\mu](t_{k},x_{i}),x_{j}-x_{i}\rangle\leq c_{2}|x_{j}-x_{i}|^{2}.

Proof.

(a) Since $u_{i,k}$ satisfies the discrete Lipschitz bound of Lemma 5.3 (a), $u_{\rho,h}[\mu]$ is Lipschitz with the same Lipschitz constant as $u_{i,k}$ by properties of linear interpolation, and $u_{\rho,h}^{\epsilon}[\mu]$ is Lipschitz with the same constant as $u_{\rho,h}[\mu]$ by properties of mollifiers (Lemma 3.2).

(b) For $i,j\in\mathbb{Z}^{d}$ we have by Lemma 5.3 (b), $u_{i+j}+u_{i-j}-2u_{i}\leq c|x_{j}|^{2}.$ Multiplying both sides by $\beta_{i}(x)$ , and summing over $\mathbb{Z}^{d}$ , we get

\displaystyle u_{\rho,h}(x+x_{j})+u_{\rho,h}(x-x_{j})-2u_{\rho,h}(x)\leq c|x_{j}|^{2}.

Letting $x\to x-z$ , multiplying by a positive mollifier $\rho_{\epsilon}(z)$ and integrating, we get

\displaystyle u_{\rho,h}^{\epsilon}(x+x_{j})+u_{\rho,h}^{\epsilon}(x-x_{j})-2u_{\rho,h}^{\epsilon}(x)\leq c|x_{j}|^{2}.

We multiply both sides with $\beta_{j}(y)$ , and sum over $\mathbb{Z}^{d}$ ,

\displaystyle I[u_{\rho,h}^{\epsilon}](x+y)+I[u_{\rho,h}^{\epsilon}](x-y)-2I[u_{\rho,h}^{\epsilon}](x)\leq cI[|\cdot|^{2}](y)\leq c(|y|^{2}+\rho^{2}).

By Lemma 3.2 and part (a), we have that $|I[u_{\rho,h}^{\epsilon}](\xi)-u_{\rho,h}^{\epsilon}(\xi)|\leq K\|D^{2}u_{\rho,h}^{\epsilon}\|_{0}\rho^{2}\leq K\frac{\rho^{2}}{\epsilon}$ , where the Lipschitz bound $K$ depends on the constants in (L2): and (F2): . Thus,

\displaystyle u_{\rho,h}^{\epsilon}(x+y)+u_{\rho,h}^{\epsilon}(x-y)-2u_{\rho,h}^{\epsilon}(x)\leq c(|y|^{2}+\rho^{2}+\frac{\rho^{2}}{\epsilon}).

The second part of (b) then follows as in [3, Remark 6].
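
The bound $I[|\cdot|^{2}](y)\leq|y|^{2}+\rho^{2}$ used above can be checked numerically in one space dimension, where piecewise linear interpolation of $y\mapsto y^{2}$ on a grid of spacing $\rho$ overshoots by $(y-x_{i})(x_{i+1}-y)\leq\rho^{2}/4$. The following minimal sketch assumes $d=1$ and a uniform grid $x_{i}=i\rho$; the function name `interp_square_excess` is hypothetical.

```python
import numpy as np


def interp_square_excess(rho, y, n=50):
    """Return I[|.|^2](y) - y^2 for linear interpolation on the grid i*rho, |i| <= n."""
    grid = rho * np.arange(-n, n + 1)
    return np.interp(y, grid, grid**2) - y**2


if __name__ == "__main__":
    rho = 0.1
    y = np.linspace(-4.0, 4.0, 2001)
    print(interp_square_excess(rho, y).max(), rho**2 / 4)
```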

Under our assumptions, the continuous HJB equation has a (viscosity) solution $u(t)\in W^{1,\infty}(\mathbb{R}^{d})$ , that is, the derivative exists almost everywhere [37, Theorem 4.3]. We have the following result for $Du_{\rho,h}^{\epsilon}[\mu]$ .

Theorem 5.6.

Assume ( $\nu$ 0): , ( $\nu$ 1): , (L1): –(L2): , (F1): –(F2): , $\rho_{n},h_{n},r_{n},\epsilon_{n}\xrightarrow{n\to\infty}0$ under CFL conditions $\frac{\rho_{n}^{2}}{h_{n}},\frac{h_{n}}{r_{n}^{\sigma}}=o(1)$ , and $\mu_{n}\rightarrow\mu$ in $C([0,T],P(\mathbb{R}^{d}))$ . Let $u_{\rho_{n},h_{n}}^{\epsilon_{n}}[\mu_{n}]$ be defined by (22) and $u[\mu]$ the viscosity solution of the HJB equation in (1) for $m=\mu$ . Then

(i)

$u_{\rho_{n},h_{n}}^{\epsilon_{n}}[\mu_{n}]\rightarrow u[\mu]$ locally uniformly,
(ii)

Assume also (L3): , (F3): and $\frac{\rho_{n}}{\epsilon_{n}}=o(1)$ . Then $Du_{\rho_{n},h_{n}}^{\epsilon_{n}}[\mu_{n}](t,x)\rightarrow Du[\mu](t,x)$ whenever $Du(t,x)$ exists, that is, the convergence is almost everywhere.
(iii)

Assume also (L3): , (F3): , $\frac{\rho_{n}}{\epsilon_{n}}=o(1)$ , and (U): . Then $Du_{\rho_{n},h_{n}}^{\epsilon_{n}}[\mu_{n}]\rightarrow Du[\mu]$ locally uniformly.

Proof.

(i) This follows from the convergence result Theorem 5.4 and Lemma 3.2.

(ii) and (iii). We refer to [23, Theorem 3.5] and [25, Proposition 5.1]. Estimates from Lemma 5.5 are needed. For completeness we give the proof in Appendix B. ∎

6. On the dual SL scheme for the FPK equation

In this section we establish more properties of the discrete FPK equation (24), including tightness, equicontinuity in time, $L^{1}$ -stability of solutions with respect to $\mu$ , and $L^{p}$ -bounds in dimension $d=1$ . To prove tightness we will use a result from [31].

Proposition 6.1.

Assume ( $\nu$ 0): and (M): . Then there exists a function $0\leq\Psi\in C^{2}(\mathbb{R}^{d})$ with $\|D\Psi\|_{0},$ $\|D^{2}\Psi\|_{0}<\infty$ , and $\displaystyle\lim_{|x|\rightarrow\infty}\Psi(x)=\infty$ , such that

(35)

\displaystyle\sup_{x\in\mathbb{R}^{d}}\Big{|}\int_{|z|>1}\big{(}\Psi(x+z)-\Psi(x)\big{)}\nu(dz)\Big{|}<\infty\quad\mbox{and}\quad\int_{\mathbb{R}^{d}}\Psi(x)\,m_{0}(dx)<\infty.

Proof.

We use [31, Lemma 4.9] on the family of measures $\{\nu_{1},m_{0}\}$ , where $\nu_{1}$ is defined in (11), to get a function $\Psi(x)=V_{0}(\sqrt{1+|x|^{2}})$ such that $V_{0}:[0,\infty)\rightarrow[0,\infty)$ is a non-decreasing sub-additive function, $\|V_{0}^{\prime}\|_{0},\|V_{0}^{\prime\prime}\|_{0}<\infty$ , $\displaystyle\lim_{x\rightarrow\infty}V_{0}(x)=\infty$ , and

\displaystyle\int_{\mathbb{R}^{d}}\Psi(x)\,\mu(dx)<\infty\qquad\mbox{for}\qquad\mu\in\{\nu_{1},m_{0}\}.

We immediately get the result except for the first part of (35). But this estimate follows from sub-additivity and $\nu_{1}$ -integrability of $V_{0}$ , see [31, Lemma 4.13 (ii)]. ∎

Remark 6.2.

(a) If $\frac{d\nu}{dz}\leq\frac{C}{|z|^{d+\sigma_{1}}}$ for $|z|>1$ and $\int_{\mathbb{R}^{d}}|x|^{\sigma_{2}}\,m_{0}(dx)<\infty$ for $\sigma_{1},\sigma_{2}>0$ , then $\Psi(z)=\log(\sqrt{1+|z|^{2}})$ is a possible explicit choice for the function in Proposition 6.1.

(b) Since $\Psi\in C^{2}(\mathbb{R}^{d})$ , the first part of (35) is equivalent to $\|\mathcal{L}\Psi\|_{0}<\infty$ (see [31, Lemma 4.13 (ii)]).
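
For the explicit choice in part (a), the regularity requirements of Proposition 6.1 are easy to inspect in one dimension, where $\Psi(x)=\frac{1}{2}\log(1+x^{2})$, $\Psi'(x)=\frac{x}{1+x^{2}}$ and $\Psi''(x)=\frac{1-x^{2}}{(1+x^{2})^{2}}$. The sketch below, restricted to $d=1$ as an illustrative assumption and with hypothetical function names, evaluates these closed forms so that the bounds $|\Psi'|\leq\frac{1}{2}$, $|\Psi''|\leq 1$ and the slow growth of $\Psi$ can be observed.

```python
import numpy as np


def psi(x):
    """Psi(x) = log(sqrt(1 + x^2)) in one dimension."""
    return 0.5 * np.log1p(x**2)


def dpsi(x):
    """First derivative of psi."""
    return x / (1.0 + x**2)


def d2psi(x):
    """Second derivative of psi."""
    return (1.0 - x**2) / (1.0 + x**2) ** 2


if __name__ == "__main__":
    x = np.linspace(-1e3, 1e3, 200001)
    print(np.abs(dpsi(x)).max(), np.abs(d2psi(x)).max(), psi(1e6))
```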

Lemma 6.3.

Assume $\{\mu_{\alpha}\}_{\alpha\in A}\subset P(\mathbb{R}^{d})$ and there exists a function $0\leq\psi\in C(\mathbb{R}^{d})$ such that $\lim_{|x|\rightarrow\infty}\psi(x)=\infty$ and $\sup_{\alpha}\int_{\mathbb{R}^{d}}\psi(x)\mu_{\alpha}(dx)\leq C$ . Then $\{\mu_{\alpha}\}_{\alpha}$ is tight.

This result is classical and can be proved in a similar way as the Chebychev inequality.
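
For the reader's convenience, the Chebyshev-type argument can be sketched as follows, where the sublevel set $K_{R}$ is notation introduced here for illustration only.

```latex
% K_R := \{x\in\mathbb{R}^d : \psi(x)\le R\} is closed (\psi is continuous)
% and bounded (\psi(x)\to\infty as |x|\to\infty), hence compact.
% Since \psi\ge 0, for every \alpha\in A and R>0:
\mu_{\alpha}\big(\mathbb{R}^{d}\setminus K_{R}\big)
  =\mu_{\alpha}\big(\{\psi>R\}\big)
  \leq\frac{1}{R}\int_{\mathbb{R}^{d}}\psi(x)\,\mu_{\alpha}(dx)
  \leq\frac{C}{R}.
```

Given $\eta>0$, taking $R=C/\eta$ yields $\mu_{\alpha}(\mathbb{R}^{d}\setminus K_{R})\leq\eta$ for all $\alpha\in A$, which is tightness.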

Theorem 6.4 (Tightness).

Assume ( $\nu$ 0): , ( $\nu$ 1): , (L1): –(L2): , (F2): , (H1): , (M): , the CFL condition $\frac{\rho^{2}}{h},hr^{1-2\sigma}=\mathcal{O}(1)$ , $\mu\in C([0,T],P(\mathbb{R}^{d}))$ , and $m^{\epsilon}_{\rho,h}[\mu]$ is defined by (26). Take $\Psi$ as in Proposition 6.1. Then there exists $C>0$ , independent of $\rho,h,\epsilon$ and $\mu$ , such that

\displaystyle\int_{\mathbb{R}^{d}}\Psi(x)\,dm^{\epsilon}_{\rho,h}[\mu](t)\leq C\qquad\mbox{for any }\qquad t\in[0,T].

Proof.

Essentially we start by multiplying the scheme (24) by $\Psi$ and integrating in space. By the definition of $m^{\epsilon}_{\rho,h}=m^{\epsilon}_{\rho,h}[\mu]$ in (26) and (24), we find that

	$\displaystyle\int_{\mathbb{R}^{d}}\Psi(x)dm^{\epsilon}_{\rho,h}(t_{k+1})$	$\displaystyle=\frac{1}{\rho^{d}}\sum_{i\in\mathbb{Z}^{d}}m_{i,k+1}\int_{E_{i}}\Psi(x)dx$
		$\displaystyle=\sum_{i\in\mathbb{Z}^{d}}\frac{1}{\rho^{d}}\int_{E_{i}}\Psi(x)dx\sum_{j}m_{j,k}\,\mathbf{B}_{\rho,h,r}[H_{p}(\cdot,Du_{\rho,h}^{\epsilon})](i,j,k).$

By the definition of $\mathbf{B}_{\rho,h,r}$ in (25) and interchanging the order of summation and integration, we have

	$\displaystyle\int_{\mathbb{R}^{d}}\Psi(x)d\,m^{\epsilon}_{\rho,h}(t_{k+1})=$	$\displaystyle\sum_{j\in\mathbb{Z}^{d}}\frac{m_{j,k}}{\rho^{d}}\bigg{[}\frac{e^{-\lambda_{r}h}}{2d}\sum_{p=1}^{d}\sum_{i\in\mathbb{Z}^{d}}\int_{E_{i}}\Psi(x)\big{(}\beta_{i}(\Phi^{\epsilon,+}_{j,k,p})+\beta_{i}(\Phi^{\epsilon,-}_{j,k,p})\big{)}dx$
		$\displaystyle\hskip 48.36958pt+\frac{1-e^{-\lambda_{r}h}}{\lambda_{r}}\int_{\|z\|>r}\sum_{i\in\mathbb{Z}^{d}}\int_{E_{i}}\Psi(x)\beta_{i}(x_{j}+z)dx\,\nu(dz)\bigg{]}.$

Since $\Psi\in C^{2}(\mathbb{R}^{d})$ , by properties of midpoint approximation and linear/multilinear interpolation we have $\big{|}\frac{1}{\rho^{d}}\int_{E_{i}}\Psi(x)dx-\Psi(x_{i})\big{|}+\big{|}\Psi(x)-\sum_{i\in\mathbb{Z}^{d}}\beta_{i}(x)\Psi(x_{i})\big{|}\leq\mathcal{O}(\rho^{2})$ . Therefore

(36)		$\displaystyle\int_{\mathbb{R}^{d}}\Psi(x)d\,m^{\epsilon}_{\rho,h}(t_{k+1})\leq$	$\displaystyle\sum_{j\in\mathbb{Z}^{d}}m_{j,k}\bigg{[}\frac{e^{-\lambda_{r}h}}{2d}\sum_{p=1}^{d}\big{(}\Psi\big{(}\Phi^{\epsilon,+}_{j,k,p}\big{)}+\Psi\big{(}\Phi^{\epsilon,-}_{j,k,p}\big{)}\big{)}$
		$\displaystyle\hskip 28.45274pt+\frac{1-e^{-\lambda_{r}h}}{\lambda_{r}}\int_{\|z\|>r}\Psi(x_{j}+z)\,\nu(dz)\bigg{]}+\mathcal{O}(\rho^{2}).$

We estimate the terms on the right-hand side. Let $\Phi^{\epsilon,\pm}_{j,k,p}=x_{j}-a^{\pm}_{h,j}$, where

(37)

\displaystyle a^{\pm}_{h,j}=h\,\Big{(}D_{p}H\big{(}x_{j},Du^{\epsilon}_{\rho,h}(t_{k},x_{j})\big{)}+B_{r}^{\sigma}\Big{)}\pm\sqrt{h}\sigma_{r}^{p}.

By the fundamental theorem of calculus,

(38)

\displaystyle\Psi(x_{j}-a^{+}_{h,j})+\Psi(x_{j}-a^{-}_{h,j})=2\Psi(x_{j})-(a^{+}_{h,j}+a^{-}_{h,j})\cdot D\Psi(x_{j})+E_{1}

where $a^{+}_{h,j}+a^{-}_{h,j}=2h\,\big{(}D_{p}H\big{(}x_{j},Du^{\epsilon}_{\rho,h}(t_{k},x_{j})\big{)}+B_{r}^{\sigma}\big{)}$ and

\displaystyle E_{1}=-\int_{0}^{1}\Big{[}a^{+}_{h,j}\cdot\big{(}D\Psi(x_{j}-ta^{+}_{h,j})-D\Psi(x_{j})\big{)}+a^{-}_{h,j}\cdot\big{(}D\Psi(x_{j}-ta^{-}_{h,j})-D\Psi(x_{j})\big{)}\Big{]}dt.

By Lemma 5.5 (a) and (H1): , we find that $\|D_{p}H(\cdot,Du_{\rho,h}^{\epsilon})\|_{0}\leq C_{R}$ with $R=(L_{L}+L_{F})T+L_{G}+1$ , and then that

\displaystyle|E_{1}|

\displaystyle\leq\|D^{2}\Psi\|_{0}(|a^{+}_{h,j}|^{2}+|a^{-}_{h,j}|^{2})\leq 4\|D^{2}\Psi\|_{0}\big{(}h^{2}(C_{R}^{2}+|B_{r}^{\sigma}|^{2})+h|\sigma_{r}^{p}|^{2}\big{)}.

To estimate the nonlocal term, we write

	$\displaystyle\int_{|z|>r}\Psi(x_{j}+z)\,\nu(dz)=\int_{|z|>1}\Psi(x_{j}+z)\nu(dz)$
	$\displaystyle\quad+\int_{r<|z|<1}\Big{\{}\Psi(x_{j})+z\cdot D\Psi(x_{j})+\int_{0}^{1}z\cdot\Big{[}D\Psi(x_{j}+tz)-D\Psi(x_{j})\Big{]}dt\Big{\}}\,\nu(dz)$
	$\displaystyle\leq\Big{|}\int_{|z|>1}\big{(}\Psi(x_{j}+z)-\Psi(x_{j})\big{)}\nu(dz)\Big{|}+\lambda_{r}\Psi(x_{j})+B_{r}^{\sigma}\cdot D\Psi(x_{j})$
	$\displaystyle\hskip 184.9429pt+\|D^{2}\Psi\|_{0}\int_{r<|z|<1}|z|^{2}\nu(dz)$
	$\displaystyle\leq\,\lambda_{r}\Psi(x_{j})+B_{r}^{\sigma}\cdot D\Psi(x_{j})+E_{2},$

where $E_{2}$ is finite and independent of $\rho,h,\epsilon$ by Proposition 6.1 and $\int_{|z|<1}|z|^{2}\nu(dz)<\infty$ . Going back to (36) and using the above estimates then leads to

	$\displaystyle\int_{\mathbb{R}^{d}}\Psi(x)d\,m^{\epsilon}_{\rho,h}(t_{k+1})$
	$\displaystyle\leq\sum_{j\in\mathbb{Z}^{d}}m_{j,k}\bigg{[}\frac{e^{-\lambda_{r}h}}{2d}\sum_{p=1}^{d}\Big{(}2\Psi(x_{j})-2h\,\big{[}D_{p}H\big{(}x_{j},Du^{\epsilon}_{\rho,h}(t_{k},x_{j})\big{)}+B_{r}^{\sigma}\big{]}\cdot D\Psi(x_{j})+\|E_{1}\|\Big{)}$
	$\displaystyle\hskip 56.9055pt+\frac{1-e^{-\lambda_{r}h}}{\lambda_{r}}\Big{(}\lambda_{r}\Psi(x_{j})+B_{r}^{\sigma}\cdot D\Psi(x_{j})+E_{2}\Big{)}\bigg{]}+C\rho^{2}$
	$\displaystyle\leq\sum_{j\in\mathbb{Z}^{d}}m_{j,k}\,\Psi(x_{j})+C\Big{(}h^{2}\lambda_{r}\|B_{r}^{\sigma}\|+h^{2}\|B_{r}^{\sigma}\|^{2}+h+\rho^{2}\Big{)},$

where we used $|-he^{-\lambda_{r}h}+\frac{1-e^{-\lambda_{r}h}}{\lambda_{r}}|\leq\frac{3}{2}\lambda_{r}h^{2}$ and $\frac{1-e^{-\lambda_{r}h}}{\lambda_{r}}\leq h$ to get the last inequality.
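
The two elementary inequalities for the exponential weights can be sanity-checked numerically. The sketch below uses `numpy.expm1` to evaluate $\frac{1-e^{-\lambda h}}{\lambda}$ without cancellation for small $\lambda h$; the function name `exp_weight_gap` and the parameter ranges are illustrative assumptions.

```python
import numpy as np


def exp_weight_gap(lam, h):
    """Return w = (1 - e^{-lam h})/lam and the gap |w - h e^{-lam h}|."""
    w = -np.expm1(-lam * h) / lam
    gap = np.abs(w - h * np.exp(-lam * h))
    return w, gap


if __name__ == "__main__":
    lam, h = np.meshgrid(np.logspace(-3, 4, 50), np.logspace(-6, 0, 50))
    w, gap = exp_weight_gap(lam, h)
    print(np.max(w / h), np.max(gap / (1.5 * lam * h**2)))
```

Both printed ratios should not exceed one (up to rounding) on the sampled range.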

With $A_{k+1}=\int_{\mathbb{R}^{d}}\Psi(x)d\,m^{\epsilon}_{\rho,h}(t_{k+1})$ , the above estimate becomes $A_{k+1}\leq A_{k}+E$ where $E=C(\lambda_{r}h^{2}|B_{r}^{\sigma}|+h^{2}|B_{r}^{\sigma}|^{2}+h+\rho^{2})$ . By iteration, $|B^{\sigma}_{r}|^{2}\leq\lambda_{r}|B_{r}^{\sigma}|\leq Cr^{1-2\sigma}$ (by ( $\nu$ 0): , ( $\nu$ 1): ), and $k\leq N\leq\frac{C}{h}$ , we find that

(39)

\displaystyle A_{k+1}

\displaystyle\leq\,A_{0}+(k+1)E\leq A_{0}+C\Big{(}hr^{1-2\sigma}+1+\frac{\rho^{2}}{h}\Big{)}.

By assumption $\frac{\rho^{2}}{h},hr^{1-2\sigma}=\mathcal{O}(1)$ , and by Proposition 6.1, $A_{0}=\int_{\mathbb{R}^{d}}\Psi(x)d\,m_{0}<\infty$ . Therefore

\displaystyle\int_{\mathbb{R}^{d}}\Psi(x)d\,m^{\epsilon}_{\rho,h}(t_{k})\leq C\qquad\mbox{for}\qquad k=0,1,\dots,N,

for some constant $C>0$ independent of $\rho,h,\epsilon,\mu$ , and hence by (26) the result follows for $t\in[0,T]$ . ∎

Theorem 6.5 (Equicontinuity in time).

Assume ( $\nu$ 0): , ( $\nu$ 1): , (L1): –(L2): , (F2): , (H1): , (M): , $\mu\in C([0,T],P(\mathbb{R}^{d}))$ , and $m^{\epsilon}_{\rho,h}[\mu]$ is defined by (26). Let $\frac{\rho^{2}}{h},\frac{h}{r^{\sigma}}=\mathcal{O}(1)$ if $\sigma\in(0,1)$ , or $\frac{\rho^{2}}{h},hr^{1-2\sigma}=\mathcal{O}(1)$ if $\sigma\in(1,2)$ . Then there exists a constant $C_{0}>0$ , independent of $\rho,h,\epsilon$ and $\mu$ , such that for any $t_{1},t_{2}\in[0,T]$ ,

\displaystyle d_{0}(m^{\epsilon}_{\rho,h}[\mu](t_{1}),m^{\epsilon}_{\rho,h}[\mu](t_{2}))\leq C_{0}\sqrt{|t_{1}-t_{2}|}.

Proof.

We start with the case $\sigma>1$. For $\delta>0$, let $\phi_{\delta}:=\phi*\rho_{\delta}$ for $\rho_{\delta}$ defined just before Lemma 3.2. With $m^{\epsilon}_{\rho,h}=m^{\epsilon}_{\rho,h}[\mu]$ we first note that

	$\displaystyle d_{0}(m^{\epsilon}_{\rho,h}(t_{1}),m^{\epsilon}_{\rho,h}(t_{2}))=\sup_{\phi\in\mbox{Lip}_{1,1}}\int_{\mathbb{R}^{d}}\phi(x)(m^{\epsilon}_{\rho,h}(t_{1})-m^{\epsilon}_{\rho,h}(t_{2}))dx$
	$\displaystyle=\sup_{\phi\in\mbox{Lip}_{1,1}}\Big{\{}\int_{\mathbb{R}^{d}}(\phi-\phi_{\delta})(m^{\epsilon}_{\rho,h}(t_{1})-m^{\epsilon}_{\rho,h}(t_{2}))dx+\int_{\mathbb{R}^{d}}\phi_{\delta}\,(m^{\epsilon}_{\rho,h}(t_{1})-m^{\epsilon}_{\rho,h}(t_{2}))dx\Big{\}}$
(40)		$\displaystyle\leq\,2\delta\|D\phi\|_{0}+\sup_{\phi\in\mbox{Lip}_{1,1}}\int_{\mathbb{R}^{d}}\phi_{\delta}\,(m^{\epsilon}_{\rho,h}(t_{1})-m^{\epsilon}_{\rho,h}(t_{2}))dx,$

where Lemma 3.2 was used to estimate the $\phi-\phi_{\delta}$ term and $\int m^{\epsilon}_{\rho,h}dx=1$ . Since $m^{\epsilon}_{\rho,h}$ and $\int_{\mathbb{R}^{d}}\phi_{\delta}(x)m^{\epsilon}_{\rho,h}(t,x)dx$ are affine on each interval $[t_{k},t_{k+1}]$ , $\int_{\mathbb{R}^{d}}\phi_{\delta}(x)\,m^{\epsilon}_{\rho,h}(\cdot,x)dx\in W^{1,\infty}[0,T]$ and

\displaystyle\Big{\|}\frac{d}{dt}\int_{\mathbb{R}^{d}}\phi_{\delta}(x)\,m^{\epsilon}_{\rho,h}(\cdot,x)dx\Big{\|}_{0}\leq\sup_{k}|I_{k}|,

where $I_{k}=\int_{\mathbb{R}^{d}}\phi_{\delta}(x)\,\frac{m^{\epsilon}_{\rho,h}(t_{k+1},x)-m^{\epsilon}_{\rho,h}(t_{k},x)}{h}dx$ . It follows that

(41)

\displaystyle\int_{\mathbb{R}^{d}}\phi_{\delta}\,(m^{\epsilon}_{\rho,h}(t_{1},x)-m^{\epsilon}_{\rho,h}(t_{2},x))dx\leq|t_{1}-t_{2}|\sup_{k}|I_{k}|.

Let us estimate $I_{k}$ . By (26), (24), (25), the midpoint quadrature approximation error bound, and the linear/multi-linear interpolation error bound, we have

	$\displaystyle I_{k}=$	$\displaystyle\frac{1}{h}\sum_{i}\frac{1}{\rho^{d}}\int_{E_{i}}\phi_{\delta}(x)\,dx[m_{i,k+1}-m_{i,k}]$
	$\displaystyle=$	$\displaystyle\frac{1}{h\rho^{d}}\sum_{j,i}\Big{(}\int_{E_{i}}\phi_{\delta}(x)dx\Big{)}\Big{[}m_{j,k}\,\mathbf{B}_{\rho,h,r}[H_{p}(\cdot,Du_{\rho,h}^{\epsilon})](i,j,k)-m_{i,k}\,\delta_{i,j}\Big{]}$
	$\displaystyle=$	$\displaystyle\frac{1}{h}\sum_{j}m_{j,k}\Big{[}\sum_{i}\phi_{\delta}(x_{i})\mathbf{B}_{\rho,h,r}[H_{p}(\cdot,Du_{\rho,h}^{\epsilon})](i,j,k)-\phi_{\delta}(x_{j})+C\|D^{2}\phi_{\delta}\|_{0}\rho^{2}\Big{]}$
	$\displaystyle=$	$\displaystyle\frac{1}{h}\sum_{j}m_{j,k}\Big{[}\frac{e^{-\lambda_{r}h}}{2d}\sum^{d}_{p=1}\Big{(}\phi_{\delta}(\Phi^{\epsilon,+}_{j,k,p})+\phi_{\delta}(\Phi^{\epsilon,-}_{j,k,p})-2\phi_{\delta}(x_{j})\Big{)}$
		$\displaystyle\hskip 56.9055pt+\frac{1-e^{-\lambda_{r}h}}{\lambda_{r}}\int_{|z|>r}\big{(}\phi_{\delta}(x_{j}+z)-\phi_{\delta}(x_{j})\big{)}\nu(dz)+C\|D^{2}\phi_{\delta}\|_{0}\rho^{2}\Big{]}.$

Since $\Phi^{\epsilon,\pm}_{j,k,p}=x_{j}-a^{\pm}_{h,j}$ by (37), a second-order Taylor expansion gives us

	$\displaystyle|I_{k}|\leq$	$\displaystyle\frac{1}{h}\sum_{j}m_{j,k}\bigg{[}e^{-\lambda_{r}h}\Big{(}\big{(}-hD_{p}H\big{(}x_{j},Du^{\epsilon}_{\rho,h}[\mu](t_{k},x_{j})\big{)}-hB_{r}^{\sigma}\big{)}\cdot D\phi_{\delta}(x_{j})$
		$\displaystyle\hskip 28.45274pt+\frac{\|D^{2}\phi_{\delta}\|_{0}}{2d}\sum^{d}_{p=1}\big{(}|a^{+}_{h,j}|^{2}+|a^{-}_{h,j}|^{2}\big{)}\Big{)}+\frac{1-e^{-\lambda_{r}h}}{\lambda_{r}}\Big{(}B^{\sigma}_{r}\cdot D\phi_{\delta}(x_{j})$
		$\displaystyle\hskip 28.45274pt+\|D^{2}\phi_{\delta}\|_{0}\int_{r<|z|<1}|z|^{2}\nu(dz)+2\|\phi_{\delta}\|_{0}\int_{|z|>1}\nu(dz)\Big{)}+C\|D^{2}\phi_{\delta}\|_{0}\rho^{2}\bigg{]}$
	$\displaystyle\leq$	$\displaystyle\,\frac{1}{h}\bigg{[}\Big{(}h\|D_{p}H(\cdot,Du_{\rho,h}^{\epsilon})\|_{0}+h^{2}\lambda_{r}|B^{\sigma}_{r}|\Big{)}\|D\phi_{\delta}\|_{0}+c_{3}h\|\phi_{\delta}\|_{0}$
		$\displaystyle\hskip 14.22636pt+c_{1}\Big{(}h^{2}\|D_{p}H(\cdot,Du_{\rho,h}^{\epsilon})\|_{0}^{2}+h^{2}|B^{\sigma}_{r}|^{2}+h|\sigma_{r}|^{2}+h+\rho^{2}\Big{)}\|D^{2}\phi_{\delta}\|_{0}\bigg{]}\sum_{j}m_{j,k}.$

The above inequality follows since $(\frac{1-e^{-\lambda_{r}h}}{\lambda_{r}}-he^{-h\lambda_{r}})\leq h^{2}\lambda_{r}$ (used for the $B_{r}^{\sigma}\cdot D\phi_{\delta}$ -terms), and $\int_{r<|z|<1}{|z|}^{2}\nu(dz)+\int_{|z|>1}\nu(dz)\leq C$ independently of $r$ by ( $\nu$ 0): and ( $\nu$ 1): . By Lemma 5.5 (a) and (H1): , $\|D_{p}H(\cdot,Du_{\rho,h}^{\epsilon})\|_{0}\leq C_{R}$ with $R=(L_{L}+L_{F})T+L_{G}+1$ . Since $\sum m_{j,k}=1$ , $\phi\in\textup{Lip}_{1,1}$ , $\|D^{2}\phi_{\delta}\|_{0}\leq\frac{\|D\phi\|_{0}}{\delta}$ , and $|B^{\sigma}_{r}|^{2}\leq\lambda_{r}|B_{r}^{\sigma}|\leq Kr^{1-2\sigma}$ (by ( $\nu$ 0): , ( $\nu$ 1): ), we get that

\displaystyle|I_{k}|\leq C(1+hr^{1-2\sigma})+C\big{(}1+h+hr^{1-2\sigma}+\frac{\rho^{2}}{h}\big{)}\frac{1}{\delta}.

To conclude the proof in the case $\sigma>1$ , we go back to (40) and (41). In view of the above estimate on $I_{k}$ and the assumption that $\frac{\rho^{2}}{h},hr^{1-2\sigma}=\mathcal{O}(1)$ , we find that

\displaystyle d_{0}(m^{\epsilon}_{\rho,h}(t_{1}),m^{\epsilon}_{\rho,h}(t_{2}))

\displaystyle\leq 2\delta+C|t_{1}-t_{2}|\Big{(}1+\frac{1}{\delta}\Big{)}.

Finally taking $\delta=\sqrt{|t_{1}-t_{2}|}$ we get $d_{0}(m^{\epsilon}_{\rho,h}(t_{1}),m^{\epsilon}_{\rho,h}(t_{2}))\leq C\sqrt{|t_{1}-t_{2}|}.$
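
Spelled out, with $\tau:=|t_{1}-t_{2}|\leq T$ (notation introduced here only for this computation), the choice $\delta=\sqrt{\tau}$ gives:

```latex
2\delta+C\tau\Big(1+\frac{1}{\delta}\Big)
  =2\sqrt{\tau}+C\tau+C\sqrt{\tau}
  \leq\big(2+C+C\sqrt{T}\big)\sqrt{\tau},
% using \tau=\sqrt{\tau}\,\sqrt{\tau}\leq\sqrt{T}\,\sqrt{\tau}.
```

Hence one admissible value of the constant in the theorem is $C_{0}=2+C+C\sqrt{T}$.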

When $\sigma<1$ , we find that $|B_{r}^{\sigma}|\leq C$ and hence that

\displaystyle|I_{k}|\leq C(1+hr^{-\sigma})+C\big{(}1+h+hr^{-\sigma}+\frac{\rho^{2}}{h}\big{)}\frac{1}{\delta}.

By assumption $hr^{-\sigma}+\frac{\rho^{2}}{h}=\mathcal{O}(1)$ , so again we find that

\displaystyle d_{0}(m^{\epsilon}_{\rho,h}(t_{1}),m^{\epsilon}_{\rho,h}(t_{2}))\leq 2\delta+C|t_{1}-t_{2}|\Big{(}1+\frac{1}{\delta}\Big{)},

and can conclude as before. ∎

We also need an $L^{1}$-stability result for $m^{\epsilon}_{\rho,h}[\mu]$ with respect to variations in $\mu$.

Lemma 6.6 ( $L^{1}$ -stability).

Assume ( $\nu$ 0): , (H1): , and $m^{\epsilon}_{\rho,h}[\mu]$ is defined by (26). Then for $\mu_{1},\mu_{2}\in C([0,T],P(\mathbb{R}^{d}))$ ,

	$\displaystyle\sup_{t\in[0,T]}\|m^{\epsilon}_{\rho,h}[\mu_{1}](t,\cdot)-m^{\epsilon}_{\rho,h}[\mu_{2}](t,\cdot)\|_{L^{1}(\mathbb{R}^{d})}$
	$\displaystyle\leq\frac{cKT}{\rho}e^{-h\lambda_{r}}\big{\|}D_{p}H(\cdot,Du^{\epsilon}_{\rho,h}[\mu_{1}])-D_{p}H(\cdot,Du^{\epsilon}_{\rho,h}[\mu_{2}])\big{\|}_{0}.$

Proof.

Let $\alpha=D_{p}H(\cdot,Du^{\epsilon}_{\rho,h}[\mu_{1}])$ , $\tilde{\alpha}=D_{p}H(\cdot,Du^{\epsilon}_{\rho,h}[\mu_{2}])$ , $m_{j,k}=m_{j,k}[\mu_{1}]$ , and $\tilde{m}_{j,k}=m_{j,k}[\mu_{2}]$ . By (25) and Lemma 3.3, $\mathbf{B}_{\rho,h,r}[\alpha](i,j,k)\geq 0$ and $m_{j,k}\geq 0$ , so that

	$\displaystyle\sum_{i}\big{|}m_{i,k+1}-\tilde{m}_{i,k+1}\big{|}=\sum_{i}\big{|}\sum_{j}(m_{j,k}\,\mathbf{B}_{\rho,h,r}[\alpha](i,j,k)-\tilde{m}_{j,k}\,\mathbf{B}_{\rho,h,r}[\tilde{\alpha}](i,j,k))\big{|}$
	$\displaystyle\leq\sum_{i}\sum_{j}\Big{(}m_{j,k}\big{|}\mathbf{B}_{\rho,h,r}[\alpha](i,j,k)-\mathbf{B}_{\rho,h,r}[\tilde{\alpha}](i,j,k)\big{|}+\big{|}m_{j,k}-\tilde{m}_{j,k}\big{|}\mathbf{B}_{\rho,h,r}[\tilde{\alpha}](i,j,k)\Big{)}.$

Since $\sum_{i}\mathbf{B}_{\rho,h,r}[\tilde{\alpha}](i,j,k)=1$ (follows from $\sum_{i}\beta_{i}=1$ and (25)),

\displaystyle\sum_{i}\sum_{j}\big{|}m_{j,k}-\tilde{m}_{j,k}\big{|}\mathbf{B}_{\rho,h,r}[\tilde{\alpha}](i,j,k)=\sum_{j}\big{|}m_{j,k}-\tilde{m}_{j,k}\big{|}.

Moreover, since only a finite number $K_{d}$ of $\beta_{i}$ ’s are non-zero at any given point, $\beta_{i}$ is Lipschitz with constant $\frac{c}{\rho}$ , and $\sum_{j}m_{j,k}=1$ by Lemma 3.3, by the definitions of $\mathbf{B}_{\rho,h,r}$ (25) and $\Phi_{j,k,p}^{\pm}$ (23),

	$\displaystyle\sum_{i}\sum_{j}m_{j,k}\big{|}\mathbf{B}_{\rho,h,r}[\alpha](i,j,k)-\mathbf{B}_{\rho,h,r}[\tilde{\alpha}](i,j,k)\big{|}$
	$\displaystyle\leq\sum_{j}m_{j,k}\frac{e^{-h\lambda_{r}}}{2d}\sum_{p=1}^{d}\sum_{i}\,\big{|}\beta_{i}(\Phi_{j,k,p}^{+}[\mu_{1}])-\beta_{i}(\Phi_{j,k,p}^{+}[\mu_{2}])$
	$\displaystyle\hskip 56.9055pt+\beta_{i}(\Phi_{j,k,p}^{-}[\mu_{1}])-\beta_{i}(\Phi_{j,k,p}^{-}[\mu_{2}])\big{|}\leq K_{d}\frac{che^{-h\lambda_{r}}}{\rho}\|\alpha-\tilde{\alpha}\|_{0}.$

An iteration then shows that

\displaystyle\sum_{i}\big{|}m_{i,k+1}-\tilde{m}_{i,k+1}\big{|}\leq\sum_{i}\big{|}m_{i,0}-\tilde{m}_{i,0}\big{|}+\frac{cK_{d}T}{\rho}e^{-h\lambda_{r}}\|\alpha-\tilde{\alpha}\|_{0}.

Since $m_{i,0}=\tilde{m}_{i,0}=\int_{E_{i}}m_{0}\,dx$ , the result follows by interpolation. ∎
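
The factor $\frac{1}{\rho}$ in this estimate comes from the Lipschitz constant of the basis functions. In one dimension with hat functions on a uniform grid (an illustrative assumption; the names `hat_basis` and `basis_l1_distance` are hypothetical), the sketch below evaluates $\sum_{i}|\beta_{i}(x)-\beta_{i}(y)|$, which is bounded by $\frac{2}{\rho}|x-y|$ because $\sum_{i}|\beta_{i}'|=\frac{2}{\rho}$ almost everywhere.

```python
import numpy as np


def hat_basis(x, rho, n=200):
    """Values beta_i(x), i = -n..n, of 1D hat functions on the grid x_i = i*rho."""
    nodes = rho * np.arange(-n, n + 1)
    return np.maximum(0.0, 1.0 - np.abs(x - nodes) / rho)


def basis_l1_distance(x, y, rho):
    """Return sum_i |beta_i(x) - beta_i(y)|."""
    return np.sum(np.abs(hat_basis(x, rho) - hat_basis(y, rho)))


if __name__ == "__main__":
    rho = 0.1
    print(basis_l1_distance(0.03, 0.05, rho), 2 * 0.02 / rho)
```

The loss of a factor $\rho^{-1}$ is the reason why the $L^{1}$-stability bound degenerates as the grid is refined.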

We end this section with a uniform $L^{p}$-bound on $m^{\epsilon}_{\rho,h}$ in dimension $d=1$.

Theorem 6.7 ( $L^{p}$ bounds).

Assume $d=1$, ( $\nu$ 0): , ( $\nu$ 1): , (L1): , (L3): , (F3): , (H2): , (M’): , $\mu\in C([0,T],P(\mathbb{R}^{d}))$, and $m^{\epsilon}_{\rho,h}[\mu]$ is defined by (26). Then $m_{\rho,h}^{\epsilon}[\mu]\in L^{p}(\mathbb{R})$ and there exists a constant $K>0$ independent of $\epsilon,h,\rho$ and $\mu$ such that

\displaystyle\|m^{\epsilon}_{\rho,h}[\mu](\cdot,t)\|_{L^{p}(\mathbb{R})}\leq e^{KT}\|m_{0}\|_{L^{p}(\mathbb{R})}.

To prove the theorem we need a few technical lemmas.

Lemma 6.8.

Assume $d=1$ , ( $\nu$ 0): , ( $\nu$ 1): , (L1): , (L3): , (F3): , and (H2): . There exists a constant $c_{0}>0$ independent of $\rho,h,\epsilon,\mu$ such that

\Big{(}D_{p}H\big{(}x_{j},Du^{\epsilon}_{\rho,h}(t_{k},x_{j})\big{)}-D_{p}H\big{(}x_{i},Du^{\epsilon}_{\rho,h}(t_{k},x_{i})\big{)}\Big{)}(x_{j}-x_{i})\leq c_{0}|x_{j}-x_{i}|^{2}.

Proof.

By (L1): and (H2): for $R=((L_{F}+L_{L})T+L_{G})+1$ we have

	$\displaystyle\Big{(}D_{p}H\big{(}x_{j},Du^{\epsilon}_{\rho,h}(t_{k},x_{j})\big{)}-D_{p}H\big{(}x_{i},Du^{\epsilon}_{\rho,h}(t_{k},x_{i})\big{)}\Big{)}(x_{j}-x_{i})$
	$\displaystyle=(x_{j}-x_{i})\int_{0}^{1}\frac{d}{dt}\Big{(}D_{p}H\big{(}x_{j},t\,Du^{\epsilon}_{\rho,h}(t_{k},x_{j})+(1-t)Du^{\epsilon}_{\rho,h}(t_{k},x_{i})\big{)}\Big{)}dt$
	$\displaystyle\quad\ +(x_{j}-x_{i})\Big{(}D_{p}H\big{(}x_{j},Du^{\epsilon}_{\rho,h}(t_{k},x_{i})\big{)}-D_{p}H\big{(}x_{i},Du^{\epsilon}_{\rho,h}(t_{k},x_{i})\big{)}\Big{)}$
	$\displaystyle=(x_{j}-x_{i})\int_{0}^{1}D_{pp}H\Big{(}x_{j},t\,Du^{\epsilon}_{\rho,h}(t_{k},x_{j})$
	$\displaystyle\hskip 113.81102pt+(1-t)Du^{\epsilon}_{\rho,h}(t_{k},x_{i})\Big{)}\big{(}Du^{\epsilon}_{\rho,h}(t_{k},x_{j})-Du^{\epsilon}_{\rho,h}(t_{k},x_{i})\big{)}dt$
	$\displaystyle\quad\ +(x_{j}-x_{i})\Big{(}D_{p}H\big{(}x_{j},Du^{\epsilon}_{\rho,h}(t_{k},x_{i})\big{)}-D_{p}H\big{(}x_{i},Du^{\epsilon}_{\rho,h}(t_{k},x_{i})\big{)}\Big{)}$
	$\displaystyle\leq C_{R}\,c_{2}\|x_{j}-x_{i}\|^{2}+C_{R}\|x_{j}-x_{i}\|^{2},$

where the last inequality follows from convexity of $H$ (since $L$ is convex by (L1): ), semiconcavity of $u^{\epsilon}_{\rho,h}$ in Lemma 5.5 (c), and regularity of $H$ in (H2): . ∎

Lemma 6.9.

Assume $d=1$, ( $\nu$ 0): , ( $\nu$ 1): , (L1): , (L3): , (F3): , (H2): , $\mu\in C([0,T],P(\mathbb{R}^{d}))$, and let $\Phi^{\epsilon,\pm}_{j,k}[\mu]$ be defined in (23). There exists a constant $K_{0}>0$ independent of $\epsilon,\rho,h,\mu$, such that for all $i\in\mathbb{Z}$ and $k\in\mathcal{N}_{h}$,

\displaystyle\max\Big{\{}\sum_{j\in\mathbb{Z}}\beta_{i}(\Phi^{\epsilon,+}_{j,k}[\mu]),\sum_{j\in\mathbb{Z}}\beta_{i}(\Phi^{\epsilon,-}_{j,k}[\mu])\Big{\}}\leq 1+K_{0}h.

The proof of this result is similar to the proof of [23, Lemma 3.8] – a slightly expanded proof is given in Appendix C. A similar result holds for the integral-term:

Lemma 6.10.

Assume $d=1$ . Then we have

\displaystyle\frac{1}{\lambda_{r}}\sum_{j\in\mathbb{Z}}\int_{|z|>r}\beta_{i}(x_{j}+z)\nu(dz)=1.

Proof.

By (11) and properties of the basis functions $\beta_{j}$ we have

\displaystyle\frac{1}{\lambda_{r}}\sum_{j\in\mathbb{Z}}\int_{|z|>r}\beta_{i}(x_{j}+z)\nu(dz)=\frac{1}{\lambda_{r}}\int_{|z|>r}\sum_{j\in\mathbb{Z}}\beta_{i-j}(z)\nu(dz)=\frac{1}{\lambda_{r}}\int_{|z|>r}\nu(dz)=1.\qquad\quad\qed
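The partition-of-unity property behind the middle equality can be checked numerically. The following is a minimal sketch, assuming a uniform grid $x_{j}=j\rho$ and the usual hat functions $\beta_{i}(x)=\max(0,1-|x-x_{i}|/\rho)$; the function names are illustrative and not taken from the paper.

```python
def hat(x, center, rho):
    """Hat (piecewise linear) basis function centered at `center`, grid spacing rho."""
    return max(0.0, 1.0 - abs(x - center) / rho)


def shifted_column_sum(i, z, rho):
    """Return sum_j beta_i(x_j + z) with x_j = j * rho.

    Only grid points with |x_j + z - x_i| < rho contribute, so it suffices
    to scan a small window of indices around i - z / rho.
    """
    j0 = round(i - z / rho)
    return sum(hat(j * rho + z, i * rho, rho) for j in range(j0 - 2, j0 + 3))


# The sum equals 1 (up to rounding) for every shift z, which is why integrating
# against nu(dz) over |z| > r and dividing by lambda_r gives 1.
for z in (0.0, 0.37, -1.234, 5.05):
    print(z, shifted_column_sum(3, z, 0.1))
```

Since the sum does not depend on $z$, no property of $\nu$ beyond $\lambda_{r}=\int_{|z|>r}\nu(dz)<\infty$ is needed.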

Proof of Theorem 6.7.

By definition of $m^{\epsilon}_{\rho,h}$ in (26) and the scheme (24),

	$\displaystyle\int_{\mathbb{R}}(m^{\epsilon}_{\rho,h}(x,t_{k+1}))^{p}dx=\int_{\mathbb{R}}\Big{(}\frac{1}{\rho}\sum_{i}m_{i,k+1}\mathbbm{1}_{E_{i}}(x)\Big{)}^{p}dx$
	$\displaystyle=\frac{1}{\rho^{p-1}}\sum_{i\in\mathbb{Z}}(m_{i,k+1})^{p}=\frac{1}{\rho^{p-1}}\sum_{i}\Big{(}\sum_{j}m_{j,k}\,\mathbf{B}_{\rho,h,r}(i,j,k)\Big{)}^{p},$

where $\mathbf{B}_{\rho,h,r}=\mathbf{B}_{\rho,h,r}[H_{p}(\cdot,Du_{\rho,h}^{\epsilon}[\mu])]$ is defined in (25). By Jensen’s inequality we have

\displaystyle\sum_{i\in\mathbb{Z}}\Big{(}\sum_{j}m_{j,k}\,\mathbf{B}_{\rho,h,r}(i,j,k)\Big{)}^{p}\leq\sum_{i\in\mathbb{Z}}\Big{(}\sum_{l\in\mathbb{Z}}\mathbf{B}_{\rho,h,r}(i,l,k)\Big{)}^{p-1}\Big{(}\sum_{j}\big{(}m_{j,k}\big{)}^{p}\,\mathbf{B}_{\rho,h,r}(i,j,k)\Big{)},

and by Lemmas 6.9 and 6.10,

\sum_{l\in\mathbb{Z}}\mathbf{B}_{\rho,h,r}(i,l,k)\leq 1+K_{0}h,

where $K_{0}$ is independent of $i,\rho,h,\epsilon$ and $\mu$ . Since $\sum_{i}\mathbf{B}_{\rho,h,r}(i,l,k)=1$ for every $l$ (this follows from $\sum_{i}\beta_{i}=1$ ), we find that

	$\displaystyle\sum_{i\in\mathbb{Z}}(m_{i,k+1})^{p}$	$\displaystyle\leq(1+K_{0}h)^{p-1}\sum_{j}\big{(}m_{j,k}\big{)}^{p}\sum_{i}\mathbf{B}_{\rho,h,r}(i,j,k)$
		$\displaystyle\leq\rho^{p-1}\|m^{\epsilon}_{\rho,h}(t_{k},\cdot)\|^{p}_{L^{p}(\mathbb{R})}(1+K_{0}h)^{p-1}.$

By iteration and $\|m^{\epsilon}_{\rho,h}(\cdot,t_{0})\|_{L^{p}}=\|m_{0}\|_{L^{p}}$ , $\|m^{\epsilon}_{\rho,h}(t_{k+1},\cdot)\|_{L^{p}}\leq e^{K_{0}T(1-\frac{1}{p})}\|m_{0}\|_{L^{p}}$ , and the result follows for $p\in[1,\infty)$ .
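For completeness, the iteration can be spelled out as follows. This is a routine expansion of the estimate above: divide by $\rho^{p-1}$, take $p$-th roots, and use only $1+x\leq e^{x}$ and $(k+1)h\leq T$.

```latex
\begin{align*}
\|m^{\epsilon}_{\rho,h}(t_{k+1},\cdot)\|_{L^{p}(\mathbb{R})}
  &\leq (1+K_{0}h)^{1-\frac{1}{p}}\,\|m^{\epsilon}_{\rho,h}(t_{k},\cdot)\|_{L^{p}(\mathbb{R})}
   \leq \dots
   \leq (1+K_{0}h)^{(k+1)(1-\frac{1}{p})}\,\|m^{\epsilon}_{\rho,h}(t_{0},\cdot)\|_{L^{p}(\mathbb{R})} \\
  &\leq e^{K_{0}h(k+1)(1-\frac{1}{p})}\,\|m_{0}\|_{L^{p}(\mathbb{R})}
   \leq e^{K_{0}T(1-\frac{1}{p})}\,\|m_{0}\|_{L^{p}(\mathbb{R})}.
\end{align*}
```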

The proof for $p=\infty$ is simpler: in view of Lemmas 6.9 and 6.10, it follows as in [24] for the second order case. ∎

7. Proof of convergence – Theorem 4.1 and 4.3

The main structure of the two proofs is similar, so we present them together. We proceed in several steps.

Step 1. (Compactness of $m^{\epsilon_{n}}_{\rho_{n},h_{n}}$ ) In view of Theorems 6.4 and 6.5, $m^{\epsilon}_{\rho,h}$ is precompact in $C([0,T],P(\mathbb{R}^{d}))$ by the Prokhorov and Arzelà-Ascoli theorems. Hence there exist a subsequence $\{m^{\epsilon_{n}}_{\rho_{n},h_{n}}\}$ and $m$ in $C([0,T],P(\mathbb{R}^{d}))$ such that

\displaystyle m^{\epsilon_{n}}_{\rho_{n},h_{n}}\rightarrow m\quad\mbox{in}\quad C([0,T],P(\mathbb{R}^{d})).

This proves Theorem 4.3 (a) (ii) and the first part of Theorem 4.1 (a) (ii).

If (M’): holds with $p=\infty$ , then Theorem 6.7 and Helly’s weak $*$ compactness theorem imply that $\{m^{\epsilon}_{\rho,h}\}$ is weak $*$ precompact in $L^{\infty}([0,T]\times\mathbb{R})$ and there is a subsequence $\{m^{\epsilon_{n}}_{\rho_{n},h_{n}}\}$ and function $m$ such that $m^{\epsilon_{n}}_{\rho_{n},h_{n}}\overset{\ast}{\rightharpoonup}m$ in $L^{\infty}([0,T]\times\mathbb{R})$ . If (M’): holds with $p\in(1,\infty)$ , then $\{m^{\epsilon}_{\rho,h}\}$ is equiintegrable in $[0,T]\times\mathbb{R}$ by Theorem 6.4 and 6.7 and de la Vallée Poussin’s theorem. By Dunford-Pettis’ theorem, it is then weakly precompact in $L^{1}([0,T]\times\mathbb{R})$ and there exists a subsequence $\{m^{\epsilon_{n}}_{\rho_{n},h_{n}}\}$ and function $m$ such that $m^{\epsilon_{n}}_{\rho_{n},h_{n}}\rightharpoonup m$ in $L^{1}([0,T]\times\mathbb{R})$ . The second part of Theorem 4.1 (a) (ii) follows.

Step 2. (Compactness and limit points for $u_{\rho_{n},h_{n}}$ ) Part (i) and limit points $u$ as viscosity solutions in part (iii) of both Theorem 4.1 and 4.3 follow from step 1 and Theorem 5.6 (i).

Step 3. (Consistency for $m^{\epsilon_{n}}_{\rho_{n},h_{n}}$ ) Let $(u,m)$ be a limit point of $\{(u^{\epsilon_{n}}_{\rho_{n},h_{n}},m^{\epsilon_{n}}_{\rho_{n},h_{n}})\}_{n}$ . Then by step 2, $u$ is a viscosity solution of the HJB equation in (1). We now show that $m$ is a very weak solution of the FPK equation in (1) with $u$ as the input data, i.e. $m$ satisfies (3) for $t\in[0,T]$ and $\phi\in C_{c}^{\infty}(\mathbb{R}^{d})$ . In the rest of the proof we use $\rho,h,r,\epsilon$ instead of $\rho_{n},h_{n},r_{n},\epsilon_{n}$ to simplify. We also let $\widehat{\widehat{m}}=m^{\epsilon_{n}}_{\rho_{n},h_{n}}$ , $w=u_{\rho_{n},h_{n}}^{\epsilon_{n}}[\widehat{\widehat{m}}]$ , and take $t_{n}=\big{[}\frac{t}{h_{n}}\big{]}h_{n}$ . Then we note that

\displaystyle\int_{\mathbb{R}^{d}}\phi(x)d\widehat{\widehat{m}}(t_{n})(x)=\int_{\mathbb{R}^{d}}\phi(x)dm_{0}(x)+\sum_{k=0}^{n-1}\int_{\mathbb{R}^{d}}\phi(x)d[\widehat{\widehat{m}}(t_{k+1})-\widehat{\widehat{m}}(t_{k})],

so to prove (3), we must estimate the sum on the right.

By the midpoint approximation and (26), the scheme (24), and (25) combined with linear/multilinear interpolation, and finally midpoint approximation again, we find that

	$\displaystyle\int_{\mathbb{R}^{d}}\phi(x)d\widehat{\widehat{m}}(t_{k+1})=\frac{1}{\rho^{d}}\sum_{i\in\mathbb{Z}^{d}}m_{i,k+1}\int_{E_{i}}\phi(x)dx=\sum_{i}m_{i,k+1}\phi(x_{i})+\mathcal{O}(\rho^{2})$
	$\displaystyle=\sum_{i}\phi(x_{i})\sum_{j}m_{j,k}\,\mathbf{B}_{\rho,h,r}[H_{p}(\cdot,Dw)](i,j,k)+\mathcal{O}(\rho^{2})$
	$\displaystyle=\sum_{j}m_{j,k}\Big{(}\frac{e^{-\lambda_{r}h}}{2d}\sum_{p=1}^{d}[\phi(\Phi_{j,k,p}^{\epsilon,+})+\phi(\Phi_{j,k,p}^{\epsilon,-})]+\frac{1-e^{-\lambda_{r}h}}{\lambda_{r}}\int_{\|z\|>r}\phi(x_{j}+z)\nu(dz)\Big{)}+\mathcal{O}(\rho^{2})$
	$\displaystyle=\sum_{j}\frac{m_{j,k}}{\rho^{d}}\int_{E_{j}}\Big{(}\frac{e^{-\lambda_{r}h}}{2d}\sum_{p=1}^{d}[\phi(\Phi_{k,p}^{\epsilon,+})(x)+\phi(\Phi_{k,p}^{\epsilon,-})(x)]$
	$\displaystyle\hskip 76.82234pt+\frac{1-e^{-\lambda_{r}h}}{\lambda_{r}}\int_{\|z\|>r}\phi(x+z)\nu(dz)\Big{)}dx+\mathcal{O}(\rho^{2})+E_{\Phi}+E_{\nu},$

where $\Phi_{j,k,p}^{\epsilon,\pm}$ is defined in (23), $\Phi^{\epsilon,\pm}_{k,p}(x)=x-h\,\big{(}H_{p}(x,Dw(t_{k},x))+B_{r}^{\sigma}\big{)}\pm\sqrt{hd}\sigma_{r}^{p}$ , and $E_{\Phi}+E_{\nu}$ is the error of the last midpoint approximation. Since $\phi$ is smooth, $u_{\rho,h}$ uniformly Lipschitz (Lemma 5.5 (a)), $\|D^{2}w\|_{0}\leq\frac{C\|Du_{\rho,h}\|_{0}}{\epsilon}$ , and by assumption (H2): ,

	$\displaystyle\Big{\|}\phi(\Phi^{\epsilon,\pm}_{j,k,p})-\frac{1}{\rho^{d}}\int_{E_{j}}\phi(\Phi^{\epsilon,\pm}_{k,p})(x)dx\Big{\|}$
	$\displaystyle\leq\frac{\|D\phi\|_{0}}{\rho^{d}}\int_{E_{j}}\|x-x_{j}\|dx+\frac{h\|D\phi\|_{0}}{\rho^{d}}\int_{E_{j}}\big{\|}D_{p}H(x_{j},Dw(t_{k},x_{j}))-D_{p}H(x,Dw(t_{k},x))\big{\|}dx$
	$\displaystyle\leq K\rho\big{(}1+h(\|H_{pp}\|_{0}\|D^{2}w\|_{0}+\|H_{px}\|_{0})\big{)}\leq K\rho\big{(}1+\frac{h}{\epsilon}\|Du_{\rho,h}\|_{0}\big{)},$

and hence $E_{\Phi}=\mathcal{O}(\frac{h\rho}{\epsilon})$ . Similarly, $E_{\nu}=\mathcal{O}(h\rho^{2}\lambda_{r})=\mathcal{O}(\frac{h\rho^{2}}{r^{\sigma}})$ .

From the above estimates, we find that

	$\displaystyle\int_{\mathbb{R}^{d}}\phi(x)d\big{(}\widehat{\widehat{m}}(t_{k+1})-\widehat{\widehat{m}}(t_{k})\big{)}(x)=\int_{\mathbb{R}^{d}}\Big{(}\frac{e^{-\lambda_{r}h}}{2d}\sum_{p=1}^{d}[\phi(\Phi_{k,p}^{\epsilon,+})(x)+\phi(\Phi_{k,p}^{\epsilon,-})(x)-2\phi(x)]$
	$\displaystyle\qquad\quad+\frac{1-e^{-\lambda_{r}h}}{\lambda_{r}}\int_{\|z\|>r}\big{(}\phi(x+z)-\phi(x)\big{)}\nu(dz)\Big{)}d\widehat{\widehat{m}}(t_{k})(x)+\mathcal{O}\big{(}\rho^{2}+\frac{h\rho}{\epsilon}+\frac{h\rho^{2}}{r^{\sigma}}\big{)}.$

By a similar argument as in (28) and using Lemma 3.1,

	$\displaystyle\phi(\Phi_{k,p}^{\epsilon,+})(x)+\phi(\Phi_{k,p}^{\epsilon,-})(x)-2\phi(x)=$	$\displaystyle-2h\Big{(}D\phi(x)\cdot D_{p}H(x,Dw(t_{k},x))+B_{r}^{\sigma}\cdot D\phi(x)\Big{)}$
		$\displaystyle\,+2h\mathcal{L}_{r}[\phi](x)+\mathcal{O}(h^{2}r^{2-2\sigma}+hr^{3-\sigma}).$

Hence using (30) and (5) we have

	$\displaystyle\int_{\mathbb{R}^{d}}\phi(x)d(\widehat{\widehat{m}}(t_{k+1})-\widehat{\widehat{m}}(t_{k}))(x)$
	$\displaystyle=h\int_{\mathbb{R}^{d}}\big{[}-D\phi(x)\cdot D_{p}H(x,Dw(t_{k},x))+\mathcal{L}_{r}[\phi](x)+\mathcal{L}^{r}[\phi](x)\big{]}d\widehat{\widehat{m}}(t_{k})(x)$
	$\displaystyle\quad+\mathcal{O}(h^{2}r^{-\sigma}+h^{2}r^{1-2\sigma}+h^{2}r^{2-2\sigma})+\mathcal{O}(\rho^{2}+\frac{h\rho}{\epsilon}+\frac{h\rho^{2}}{r^{\sigma}}+h^{2}r^{2-2\sigma}+hr^{3-\sigma}).$

Summing from $k=0$ to $k=n-1$ and approximating sums by integrals, we obtain

(42)

\displaystyle\begin{split}&\int_{\mathbb{R}^{d}}\phi(x)d\widehat{\widehat{m}}(t_{n})(x)-\int_{\mathbb{R}^{d}}\phi(x)d\widehat{\widehat{m}}(t_{0})\\ &=h\sum_{k=0}^{n-1}\int_{\mathbb{R}^{d}}\big{[}-D\phi(x)\cdot D_{p}H(x,Dw(t_{k},x))+\mathcal{L}[\phi](x)\big{]}d\widehat{\widehat{m}}(t_{k})(x)\\ &\qquad\qquad+n\,\mathcal{O}(\rho^{2}+\frac{h\rho}{\epsilon}+\frac{h\rho^{2}}{r^{\sigma}}+h^{2}r^{-\sigma}+hr^{3-\sigma})\\ &=\int_{\mathbb{R}^{d}}\int_{0}^{t_{n}}\big{[}-D\phi(x)\cdot D_{p}H(x,Dw(s,x))+\mathcal{L}[\phi](x)\big{]}d\widehat{\widehat{m}}(s)(x)\,ds\\ &\qquad\qquad+\mathcal{O}\Big{(}\frac{\rho^{2}}{h}+\frac{\rho}{\epsilon}+\frac{\rho^{2}+h}{r^{\sigma}}+r^{3-\sigma}\Big{)}+E,\end{split}

where $E$ is the Riemann sum approximation error. Let $I_{k}(x):=-D\phi(x)\cdot D_{p}H(x,Dw(t_{k},x))+\mathcal{L}[\phi](x)$ and use the time-continuity of $\widehat{\widehat{m}}$ in the $d_{0}$ -metric (Theorem 6.5), the fact that $w(\cdot,x)$ is constant on $[t_{k},t_{k+1})$ , (H1): , (H2): , and $\|D^{2}w\|_{0}\leq\frac{C\|Du_{\rho,h}\|_{0}}{\epsilon}$ , to conclude that for $s\in[t_{k},t_{k+1})$

	$\displaystyle\int_{t_{k}}^{t_{k+1}}\int_{\mathbb{R}^{d}}I_{k}(x)d\big{(}\widehat{\widehat{m}}(t_{k})-\widehat{\widehat{m}}(s)\big{)}(x)ds\leq h\big{(}\|I_{k}\|_{0}+\|DI_{k}\|_{0}\big{)}C_{0}\sup_{s\in[t_{k},t_{k+1})}\sqrt{s-t_{k}}$
	$\displaystyle\leq Kh\Big{(}1+\|Dw\|_{0}+\|D^{2}w\|_{0}\Big{)}\sqrt{h}\leq Kh\Big{(}1+\frac{1}{\epsilon}\Big{)}\sqrt{h}.$

Summing over $k$ , we have $E=\big{|}\sum_{k=0}^{n-1}\int_{t_{k}}^{t_{k+1}}\int_{\mathbb{R}^{d}}I_{k}(x)d\big{(}\widehat{\widehat{m}}(t_{k})-\widehat{\widehat{m}}(s)\big{)}(x)ds\big{|}=\mathcal{O}(\frac{\sqrt{h}}{\epsilon})$ .

Since $\widehat{\widehat{m}}$ converges to $m$ in $C([0,T],P(\mathbb{R}^{d}))$ and $\phi\in C_{c}^{\infty}(\mathbb{R}^{d})$ implies $\mathcal{L}[\phi]\in C_{b}(\mathbb{R}^{d})$ , we have

(43)

\displaystyle\int_{\mathbb{R}^{d}}\int_{0}^{t_{n}}\mathcal{L}[\phi](x)d\widehat{\widehat{m}}(s)(x)\xrightarrow{n\rightarrow\infty}\int_{\mathbb{R}^{d}}\int_{0}^{t}\mathcal{L}[\phi](x)d{m}(s)(x).

It now remains to show convergence of the $D_{p}H$ -term and pass to the limit in (42) to get that $m$ is a very weak solution satisfying (3).

Step 4 (Proof of Theorem 4.1 (a) (iii)). Now $d=1$ and part (ii) of Theorem 4.1 (a) implies that $\widehat{\widehat{m}}\overset{\ast}{\rightharpoonup}m$ in $L^{\infty}([0,t]\times\mathbb{R})$ if $m_{0}\in L^{\infty}(\mathbb{R})$ , or $\widehat{\widehat{m}}\rightharpoonup m$ in $L^{1}([0,t]\times\mathbb{R})$ if $m_{0}\in L^{p}(\mathbb{R})$ for $p\in(1,\infty)$ . We also have $Dw(t,x)=Du_{\rho,h}^{\epsilon}(t,x)\rightarrow Du(t,x)$ almost everywhere in $[0,T]\times\mathbb{R}$ by Theorem 5.6 (ii). Since $D\phi\in C_{c}^{\infty}(\mathbb{R})$ and $D_{p}H(\cdot,Dw)$ is uniformly bounded, by the triangle inequality and the dominated convergence theorem we find that

	$\displaystyle\int_{\mathbb{R}}\int_{0}^{t_{n}}D\phi(x)\cdot$	$\displaystyle D_{p}H(x,Dw(s,x))\,d\widehat{\widehat{m}}(s)(x)$
		$\displaystyle\longrightarrow\int_{\mathbb{R}}\int_{0}^{t}D\phi(x)\cdot D_{p}H(x,Du(s,x))\,d{m}(s)(x).$

Then by passing to the limit in (42) using the above limit, (43), and the CFL conditions $\frac{\rho^{2}}{h},\frac{h}{r^{\sigma}},\frac{\sqrt{h}}{\epsilon}=\mathit{o}(1)$ (note that $\rho^{2}\leq h$ for large $n$ ), we see that (3) holds and ${m}$ is a very weak solution of the FPK equation. This completes the proof of Theorem 4.1 (a) (iii).

Step 5 (Proof of Theorem 4.3 (iii)). Now (U): holds and $Dw=Du_{\rho,h}^{\epsilon}\rightarrow Du$ locally uniformly by Theorem 5.6 (iii). Since $D\phi\in C^{\infty}_{c}(\mathbb{R}^{d})$ and $\int_{\mathbb{R}^{d}}d\widehat{\widehat{m}}(s)(x)=1$ , by continuity and uniform boundedness of $D_{p}H(\cdot,Dw)$ , it follows that

(44)

\displaystyle\begin{split}&\Big{|}\int_{\mathbb{R}^{d}}\int_{0}^{t_{n}}D\phi(x)\cdot D_{p}H(x,Dw(s,x))\,d\widehat{\widehat{m}}(s)(x)\\ &\hskip 85.35826pt-\int_{\mathbb{R}^{d}}\int_{0}^{t_{n}}D\phi(x)\cdot D_{p}H(x,Du(s,x))\,d{\widehat{\widehat{m}}}(s)(x)\Big{|}\\ &\leq T\|D\phi\|_{0}\|D_{pp}H\|_{0}\|Dw-Du\|_{L^{\infty}(\textup{supp}(\phi))}\int_{\mathbb{R}^{d}}d\widehat{\widehat{m}}(s)(x)\longrightarrow 0.\end{split}

Since $\widehat{\widehat{m}}\rightarrow m$ in $C([0,T],P(\mathbb{R}^{d}))$ and $D\phi\cdot D_{p}H(\cdot,Du)(t)\in C_{b}(\mathbb{R}^{d})$ by (U): , we get

	$\displaystyle\int_{\mathbb{R}^{d}}\int_{0}^{t_{n}}D\phi(x)\cdot$	$\displaystyle D_{p}H(x,Du(s,x))\,d\widehat{\widehat{m}}(s)(x)$
		$\displaystyle\longrightarrow\int_{\mathbb{R}^{d}}\int_{0}^{t}D\phi(x)\cdot D_{p}H(x,Du(s,x))\,d{m}(s)(x).$

Then by passing to the limit in (42) using the above limit, (44), (43), and the CFL conditions $\frac{\rho^{2}}{h},\frac{h}{r^{\sigma}},\frac{\sqrt{h}}{\epsilon}=\mathit{o}(1)$ , we see that (3) holds and ${m}$ is a very weak solution of the FPK equation. This completes the proof of Theorem 4.3(iii).

8. Numerical examples

For numerical experiments we look at

(45)

\displaystyle\begin{cases}-u_{t}-\sigma^{2}\mathcal{L}u+\frac{1}{2}|u_{x}|^{2}=f(t,x)+K\ \phi_{\delta}\ast m(t,x)\quad&\text{ in }(0,T)\times[a,b],\\ m_{t}-\sigma^{2}\mathcal{L}^{*}m-\text{div}(mu_{x})=0\quad&\text{ in }(0,T)\times[a,b],\\ u(T,x)=G(x,m(T)),\qquad m(x,0)=m_{0}(x)\quad&\text{ in }[a,b],\end{cases}

where $a<b$ are real numbers, $\mathcal{L}$ is a diffusion operator, $\phi_{\delta}(x)=\frac{1}{\delta\sqrt{2\pi}}e^{-\frac{x^{2}}{2\delta^{2}}}$ , $K$ is some real number, and $f$ is some bounded smooth function. We will specify these quantities in the examples below.

Artificial boundary conditions

Our schemes (18) and (24) for approximating (45) are posed in all of $\mathbb{R}$ . To work in a bounded domain we impose (artificial) exterior conditions:

(U1)

$u\equiv\|u_{0}\|_{0}+T\cdot\|f\|_{L^{\infty}((0,T)\times(a,b))}$ in $(\mathbb{R}\setminus[a,b])\times[0,T]$ ,
(M1)

$m\equiv 0$ in $(\mathbb{R}\setminus[a,b])\times[0,T]$ , and $m_{0}$ is compactly supported in $[a,b]$ .

Condition (U1) penalizes being in $[a,b]^{c}$ , ensuring that optimal controls $\alpha$ in (18) are such that $x_{i}-h\alpha\pm\sqrt{h}\sigma_{r}\in[a,b]$ . Moreover, the contributions to non-local operators of $u$ from $[a,b]^{c}$ will be small away from the boundary. Condition (M1) ensures that the mass of $m$ is essentially contained in $[a,b]$ up to some finite time (but some mass will leak out due to nonlocal effects), and there is no contribution from $[a,b]^{c}$ when we compute non-local operators of $m$ . We will present numerical results from a region of interest that is far away from the boundary of $[a,b]$ , and where the influence of the (artificial) exterior data is expected to be negligible.

Evaluating the integrals

To implement the scheme, we need to evaluate the integral

\displaystyle\int_{|z|\geq r}I[f](x_{i}+z)\nu(dz)=\sum_{j\in\mathbb{Z}}f[x_{j}]\omega_{j-i,\nu},

where

\displaystyle\omega_{j-i,\nu}=\int_{|z|\geq r}\beta_{j-i}(z)\nu(dz),

see (17). In addition, we need to compute the values of $\sigma_{r},b_{r}$ , and $\lambda_{r}$ (see (9), (8), and (11)). To compute the weights $\omega_{j-i,\nu}$ we use two different methods. For the fractional Laplacians, we use the explicit weights of [45], while for CGMY diffusions we calculate the weights numerically using the inbuilt integral function in MATLAB. When tested on the fractional Laplacian, the MATLAB integrator produced an error of less than $10^{-15}$ . Below the quantities $\sigma_{r},b_{r},\lambda_{r}$ are computed explicitly, except in the CGMY case where we use numerical integration.
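As an illustration of the weight computation, here is a minimal sketch that approximates $\omega_{k,\nu}=\int_{|z|\geq r}\beta_{k}(z)\nu(dz)$ with a composite midpoint rule. It is not the explicit formula of [45] nor the MATLAB integrator used for the experiments; the density $|z|^{-1-s}$ without normalizing constant, the grid $x_{k}=k\rho$, the parameter values, and all function names are assumptions made for the example.

```python
def levy_density(z, s):
    """Density |z|^(-1-s) of a fractional-Laplacian-type Levy measure.

    Assumption: the normalizing constant C_{d,s} is omitted for simplicity.
    """
    return abs(z) ** (-1.0 - s)


def hat(z, k, rho):
    """Hat basis function beta_k centered at k * rho."""
    return max(0.0, 1.0 - abs(z - k * rho) / rho)


def midpoint(g, a, b, n):
    """Composite midpoint rule for the integral of g over [a, b]; 0 if b <= a."""
    if b <= a:
        return 0.0
    width = (b - a) / n
    return width * sum(g(a + (q + 0.5) * width) for q in range(n))


def weight(k, rho, r, s, n=400):
    """Approximate omega_k = int_{|z| >= r} beta_k(z) nu(dz)."""
    g = lambda z: hat(z, k, rho) * levy_density(z, s)
    total = 0.0
    # Split the support of beta_k at its kink so each piece is smooth.
    for a, b in (((k - 1) * rho, k * rho), (k * rho, (k + 1) * rho)):
        total += midpoint(g, max(a, r), b, n)    # part of [a, b] with z >= r
        total += midpoint(g, a, min(b, -r), n)   # part of [a, b] with z <= -r
    return total


rho, r, s, K = 0.1, 0.2, 1.5, 30
weights = {k: weight(k, rho, r, s) for k in range(-K, K + 1)}
print(sum(weights.values()))
```

By the partition of unity, the sum of the weights over $|k|\leq K$ lies between the mass of $\nu$ on $r\leq|z|\leq K\rho$ and on $r\leq|z|\leq(K+1)\rho$, and tends to $\lambda_{r}$ as $K\to\infty$. This gives a simple sanity check for any quadrature used here.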

Solving the coupled system

We use a fixed point iteration scheme: (i) Let $\mu=m_{0}$ , and solve for $u_{\rho,h}$ in (18)–(20). (ii) With approximate optimal control $Du_{\rho,h}^{\epsilon}$ as in (21), we solve for $m_{\rho,h}^{\epsilon}$ in (24). (iii) Let $\mu_{\text{new}}=(m_{\rho,h}^{\epsilon}+\mu)/2$ , and repeat the process with $\mu=\mu_{\text{new}}$ . We continue until we have converged to a fixed point to within machine accuracy.

Remark 8.1.

Instead of $\mu_{\text{new}}=m_{\rho,h}^{\epsilon}$ , we take $\mu_{\text{new}}=(m_{\rho,h}^{\epsilon}+\mu)/2$ , i.e. we use a fixed point iteration with some memory. This gives much faster convergence in our examples.
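The structure of the averaged iteration can be sketched in a few lines. In the sketch below the map `toy_best_response` is a hypothetical stand-in for steps (i) and (ii) (it does not solve (18)–(20) or (24)), and the cost vector, coupling strength, and tolerance are illustrative assumptions. Only the update $\mu_{\text{new}}=(S(\mu)+\mu)/2$ mirrors the text.

```python
import math


def toy_best_response(mu, cost, coupling):
    """Hypothetical stand-in for steps (i)-(ii): map a distribution to a new one.

    Agents favor cells with low running cost and low crowding; this is NOT
    the HJB/FPK solve of the paper, only a small map with the same signature.
    """
    scores = [math.exp(-c - coupling * m) for c, m in zip(cost, mu)]
    total = sum(scores)
    return [v / total for v in scores]


def damped_fixed_point(mu0, step, tol=1e-13, max_iter=1000):
    """Iterate mu <- (step(mu) + mu) / 2 until the update is below tol."""
    mu = list(mu0)
    for iteration in range(1, max_iter + 1):
        new = step(mu)
        mu_next = [(a + b) / 2.0 for a, b in zip(new, mu)]
        change = max(abs(a - b) for a, b in zip(mu_next, mu))
        mu = mu_next
        if change < tol:
            return mu, iteration
    raise RuntimeError("fixed point iteration did not converge")


cells = 21
cost = [5.0 * (i / (cells - 1) - 0.5) ** 2 for i in range(cells)]
uniform = [1.0 / cells] * cells
mu, iterations = damped_fixed_point(uniform, lambda m: toy_best_response(m, cost, 2.0))
print(iterations, max(mu))
```

Averaging preserves total mass, since a convex combination of two probability vectors is again a probability vector, so the iterates remain admissible inputs for the next step.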

Example 1.

Problem (45) with $[0,T]\times[a,b]=[0,2]\times[0,1]$ , $G=0$ , $f(t,x)=5(x-0.5(1-\sin(2\pi t)))^{2}$ , $m_{0}(x)=Ce^{-\frac{(x-0.5)^{2}}{0.1^{2}}}$ , where $C$ is such that $\int_{a}^{b}m_{0}=1$ . Furthermore, in accordance with the CFL-conditions of Theorem 4.1, we let $h=\rho=0.005$ , $r=h^{\frac{1}{2s}}$ , $\epsilon=\sqrt{h}\approx 0.0707$ , $\sigma=0.09$ , $\delta=0.4$ , $K=1$ .

For the diffusions, we consider $\mathcal{L}=(-\Delta)^{\frac{s}{2}}$ for $s=0.5,1.5,1.9$ , $\mathcal{L}=\Delta$ , and $\mathcal{L}\equiv 0$ . In Figure 1 we plot the different solutions at time $t=0.5$ and $t=1.5$ .
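The discretization parameters of Example 1 are easy to tabulate. The short sketch below evaluates $\epsilon=\sqrt{h}$ and $r=h^{\frac{1}{2s}}$ for the three fractional orders and prints the ratios $h/r^{s}$ and $\rho^{2}/h$. Reading $s$ as the order of the operator in the ratio $h/r^{s}$ is our assumption, and the variable names are illustrative.

```python
import math

h = 0.005            # time step from Example 1
rho = h              # spatial mesh size
eps = math.sqrt(h)   # regularization parameter

for s in (0.5, 1.5, 1.9):
    r = h ** (1.0 / (2.0 * s))   # truncation radius r = h^(1/(2s))
    ratio = h / r ** s           # equals sqrt(h) for this choice of r
    print(f"s={s}: r={r:.4f}, h/r^s={ratio:.4f}, rho^2/h={rho**2 / h:.4f}")
```

With this choice of $r$ the ratio $h/r^{s}$ equals $\sqrt{h}$ for every $s$, so it vanishes as $h\to 0$ at a rate that does not depend on the order of the operator.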

In Figure 2 we plot the solution with $s=1.5$ on the time interval $[0,2]$ .

Example 2.

Problem (45) with the same cost functions as in Example 1, but different diffusions with parameter $s=1.5$ :

(i)

$\mathcal{L}=\sigma^{2}(-\Delta)^{\frac{s}{2}},$
(ii)

$\mathcal{L}=\sigma^{2}C_{d,s}\int_{\mathbb{R}}[u(x+y)-u(x)-Du(x)\cdot y\mathbbm{1}_{|y|<1}]\,\mathbbm{1}_{[0,+\infty)}\,\frac{dy}{|y|^{1+s}}$ ,
(iii)

$\mathcal{L}=\sigma^{2}C_{d,s}\int_{\mathbb{R}}[u(x+y)-u(x)-Du(x)\cdot y\mathbbm{1}_{|y|<1}]\,\mathbbm{1}_{[-0.5,0.5]^{c}}\,\frac{dy}{|y|^{1+s}}$ ,
(iv)

$\mathcal{L}=\sigma^{2}C_{d,s}\int_{\mathbb{R}}[u(x+y)-u(x)-Du(x)\cdot y\mathbbm{1}_{|y|<1}]\,e^{-10y^{-}-y^{+}}\,\frac{dy}{|y|^{1+s}}$ ,

where $C_{d,s}$ is the normalizing constant for the fractional Laplacian (see [45]). Case (i) is the reference solution, a symmetric and uniformly elliptic operator. Case (ii) is non-symmetric and non-degenerate, case (iii) is symmetric and degenerate, and case (iv) is a CGMY-diffusion (see e.g. [34]). We have plotted $m$ at $t=0.5$ and $t=1.5$ in Figure 3.

Example 3.

(Long time behaviour). Under certain conditions (see e.g. [22, 21]), the solution of time dependent MFG systems will quickly converge to the solution of the corresponding stationary ergodic MFG system, as the time horizon $T$ increases. We check numerically that this is also the case for nonlocal diffusions. In (45), we take $\mathcal{L}=(-\Delta)^{\frac{s}{2}}$ , with $s=1.5$ , $[0,T]\times[a,b]=[0,10]\times[-1,2]$ , $G(x)=(x-2)^{2}$ , $f(t,x)=x^{2}$ , and $m_{0}(x)=\mathbbm{1}_{[1,2]}(x)$ . We expect (from the cost functions $f$ and $G$ ) that the solution $m$ will approach the line $x=0$ quite fast, and then travel along this line, until it goes towards the point $x=2$ at the very end. Our numerical simulations show that this is the case also for nonlocal diffusions. Here we have considered the cases $K=0$ (no coupling in the $u$ equation) and $K=0.4$ (some coupling). The parameters used in the simulations are $h=\rho=0.01$ , $\epsilon=\sqrt{h}$ , $r=h^{1/(2s)}$ , and the results are shown in Figure 4.

The players want to avoid each other in the case of $K=0.4$ , so the solution is more spread out in the space direction than in the case of $K=0$ .

Example 4.

We compute the convergence rate when $f$ , $G$ , $m_{0}$ are as in Example 1, $s=1.5$ , $\nu=0.2$ , $\delta=0.4$ , and the domain $[0,T]\times[a,b]=[0,0.5]\times[0,1]$ . We take $\rho=h$ , $r=h^{\frac{1}{2s}}$ , and for simplicity $\epsilon=0.25$ .

We calculate solutions for different values of $h$ , and compare with a reference solution computed at $h=2^{-10}$ . We calculate $L^{\infty}$ and $L^{1}$ relative errors restricted to the $x$ -interval $[\frac{1}{3},\frac{2}{3}]$ (to avoid boundary effects), and $t=0$ for $u$ and $t=T$ for $m$ :

\displaystyle\textup{ERR}_{u}:=\frac{\|u_{\rho,h}(0,\cdot)-u_{\text{ref}}(0,\cdot)\|_{L^{\infty}(\frac{1}{3},\frac{2}{3})}}{\|u_{\text{ref}}(0,\cdot)\|_{L^{\infty}(\frac{1}{3},\frac{2}{3})}},\quad\textup{ERR}_{m}:=\frac{\|m_{\rho,h}^{\epsilon}(T,\cdot)-m_{\text{ref}}(T,\cdot)\|_{L^{1}(\frac{1}{3},\frac{2}{3})}}{\|m_{\text{ref}}(T,\cdot)\|_{L^{1}(\frac{1}{3},\frac{2}{3})}}.

The results are given in the table below.

h	$2^{-2}$	$2^{-3}$	$2^{-4}$	$2^{-5}$	$2^{-6}$	$2^{-7}$	$2^{-8}$	$2^{-9}$
ERR_u	0.3155	0.1951	0.0920	0.0446	0.0218	0.0097	0.0035	0.0013
ERR_m	0.8055	0.4583	0.2886	0.1869	0.1023	0.0596	0.0300	0.0186

We see that when we halve $h$ , the error is roughly halved, i.e. we observe an error of order approximately $O(h)$ .
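The observed orders can be read off the table directly. The following sketch computes $\log_{2}$ of the ratio of consecutive errors; the data are copied from the table above and the helper name is illustrative.

```python
import math

# Errors copied from the table, for h = 2^-2, ..., 2^-9.
err_u = [0.3155, 0.1951, 0.0920, 0.0446, 0.0218, 0.0097, 0.0035, 0.0013]
err_m = [0.8055, 0.4583, 0.2886, 0.1869, 0.1023, 0.0596, 0.0300, 0.0186]


def observed_orders(errors):
    """Order estimates log2(e(h) / e(h/2)) between consecutive refinements."""
    return [math.log2(coarse / fine) for coarse, fine in zip(errors, errors[1:])]


orders_u = observed_orders(err_u)
orders_m = observed_orders(err_m)
print("u:", [round(p, 2) for p in orders_u])
print("m:", [round(p, 2) for p in orders_m])
```

For the tabulated values the averaged order is a little above 1 for $u$ and closer to 0.8 for $m$. The finest levels should be read with some care, since the reference solution is itself computed at $h=2^{-10}$.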

Appendix A Proof of Proposition 3.4

The proof is an adaptation of the Schauder fixed point argument used to prove existence for MFGs. We will use a direct consequence of Theorem 6.4 and 6.5:

Corollary A.1.

Assume ( $\nu$ 0): ,( $\nu$ 1): , (L1): –(L2): , (H1): , (F2): , (M): , $\Psi$ is given by Proposition 6.1, and $m^{\epsilon}_{\rho,h}[\mu]$ is defined by (26). Then there is $C_{\rho,h,\epsilon}>0$ , such that for any $\mu\in C([0,T],P(\mathbb{R}^{d}))$ and $t,s\in[0,T]$ ,

\displaystyle\int_{\mathbb{R}^{d}}\Psi(x)\,dm^{\epsilon}_{\rho,h}[\mu](t)+\frac{d_{0}(m^{\epsilon}_{\rho,h}[\mu](t),m^{\epsilon}_{\rho,h}[\mu](s))}{\sqrt{|t-s|}}\leq C_{\rho,h,\epsilon}.

The point is that $\rho,h,\epsilon$ are fixed in this result. Let

	$\displaystyle\mathcal{C}:=\Big{\{}$	$\displaystyle\mu\in C(0,T;P(\mathbb{R}^{d})):\mu(0)=m_{0},$
		$\displaystyle\quad\sup_{t,s\in[0,T]}\Big{[}\int_{\mathbb{R}^{d}}\Psi(x)d\mu(t,x)+\frac{d_{0}(\mu(t),\mu(s))}{\sqrt{|t-s|}}\Big{]}\leq C_{\rho,h,\epsilon}\Big{\}},$

where $C_{\rho,h,\epsilon}$ is defined in Corollary A.1. For $\mu\in\mathcal{C}$ , let $u_{\rho,h}[\mu]$ be the solution of (18) and let $u_{\rho,h}^{\epsilon}[\mu]$ be defined by (22). Then $m_{\rho,h}^{\epsilon}=S(\mu)$ is defined to be the corresponding solution of (24). Note that a fixed point of $S$ will give a solution $(u,m)$ of the scheme (27). We now conclude the proof by applying Schauder’s fixed point theorem since:

1. ( $\mathcal{C}$ is a convex, closed, compact set). It is convex and closed by standard arguments and compact by the Prokhorov and Arzelà-Ascoli theorems.

2. ( $S$ is a self-map on $\mathcal{C}$ ). The map $S$ maps $\mathcal{C}$ into itself by Corollary A.1 (tightness and equicontinuity), and Lemma 3.3 (positivity and mass preservation).

3. ( $S$ is continuous). Let $\mu_{n}\to\mu$ in $\mathcal{C}$ . By Theorem 5.2 (comparison) and (F2): ,

	$\displaystyle\|u_{\rho,h}[\mu_{n}]-u_{\rho,h}[\mu]\|_{0}$
	$\displaystyle\leq T\sup_{t,x}\|F(x,\mu_{n}(t))-F(x,\mu(t))\|+\sup_{x}\|G(x,\mu_{n}(T))-G(x,\mu(T))\|$
	$\displaystyle\leq TL_{F}\,\sup_{t}d_{0}(\mu_{n}(t),\mu(t))+L_{G}\,d_{0}(\mu_{n}(T),\mu(T))\to 0.$

Then $\sup_{i}\big{|}\frac{u_{i,k}[\mu_{n}]-u_{i-j,k}[\mu_{n}]}{\rho}-\frac{u_{i,k}[\mu]-u_{i-j,k}[\mu]}{\rho}\big{|}\to 0$ uniformly for $|i-j|=1$ , $\|Du_{\rho,h}^{\epsilon}[\mu_{n}]-Du_{\rho,h}^{\epsilon}[\mu]\|_{0}\to 0$ , and finally by Lemma 6.6,

\displaystyle\sup_{t\in[0,T]}\|m_{\rho,h}^{\epsilon}[\mu_{n}](t,\cdot)-m_{\rho,h}^{\epsilon}[\mu](t,\cdot)\|_{L^{1}(\mathbb{R}^{d})}\leq\frac{cKT}{\rho}e^{-h\lambda_{r}}\|Du_{\rho,h}^{\epsilon}[\mu_{n}]-Du_{\rho,h}^{\epsilon}[\mu]\|_{0}\to 0.

Hence $S$ is continuous.

Appendix B Proof of Theorem 5.6 (ii) and (iii)

Fix $(t,x)\in[0,T]\times\mathbb{R}^{d}$ and consider a sequence $(t_{k},x_{k})\to(t,x)$ . For any $y\in\mathbb{R}^{d}$ , a Taylor expansion shows that

(46)

\displaystyle\begin{split}&u_{\rho_{n},h_{n}}^{\epsilon_{n}}[\mu_{n}](t_{k},x_{k}+y)-u_{\rho_{n},h_{n}}^{\epsilon_{n}}[\mu_{n}](t_{k},x_{k})-Du_{\rho_{n},h_{n}}^{\epsilon_{n}}[\mu_{n}](t_{k},x_{k})\cdot y\\ &=\int_{0}^{1}\big{(}Du_{\rho_{n},h_{n}}^{\epsilon_{n}}[\mu_{n}](t_{k},x_{k}+sy)-Du_{\rho_{n},h_{n}}^{\epsilon_{n}}[\mu_{n}](t_{k},x_{k})\big{)}\cdot y\,ds:=\int_{0}^{1}I(s)\cdot y\,ds.\end{split}

Using first Lemma 5.5 (a) and then part two of Lemma 5.5 (b), we find that

	$\displaystyle\int_{0}^{\frac{\rho_{n}}{\epsilon_{n}\|y\|}}I(s)\cdot y\,ds$	$\displaystyle\leq 2\|Du_{\rho_{n},h_{n}}^{\epsilon_{n}}[\mu_{n}]\|_{0}\frac{\rho_{n}}{\epsilon_{n}}\leq 2((L_{L}+L_{F})T+L_{G})\frac{\rho_{n}}{\epsilon_{n}},$
	$\displaystyle\int^{1}_{\frac{\rho_{n}}{\epsilon_{n}\|y\|}}I(s)\cdot y\,ds$	$\displaystyle\leq c_{1}\int^{1}_{\frac{\rho_{n}}{\epsilon_{n}\|y\|}}\frac{1}{s}\Big{(}\|sy\|^{2}+\frac{\rho_{n}^{2}}{\epsilon_{n}^{2}}\Big{)}ds=c_{1}\|y\|^{2}\int^{1}_{\frac{\rho_{n}}{\epsilon_{n}\|y\|}}\Big{(}s\,+\frac{1}{s}\frac{\rho_{n}^{2}}{\|y\|^{2}\epsilon_{n}^{2}}\Big{)}\,ds$
		$\displaystyle\leq c_{1}\|y\|^{2}\int^{1}_{\frac{\rho_{n}}{\epsilon_{n}\|y\|}}\Big{(}s\,+\frac{1}{s}s^{2}\Big{)}\,ds\leq c_{1}\|y\|^{2}.$

By Lemma 5.5 (a), the sequence $Du_{\rho_{n},h_{n}}^{\epsilon_{n}}[\mu_{n}](t_{k},x_{k})$ is precompact. Now take any convergent subsequence as $n,k\to\infty$ and $\frac{\rho_{n}}{\epsilon_{n}}=o(1)$ . If $p$ is the limit, then by passing to the limit in (46) along this subsequence we have

\displaystyle u[\mu](x+y)-u[\mu](x)-p\cdot y\leq c_{1}|y|^{2}\quad\text{for every}\quad y\in\mathbb{R}^{d},

and $p\in D^{+}u[\mu](t,x)$ , the superdifferential of $u[\mu](t,x)$ . At points $(x,t)$ where $u[\mu]$ is differentiable, $D^{+}u[\mu](t,x)=\{Du[\mu](t,x)\}$ and $p=Du[\mu](t,x)$ , and then since the subsequence was arbitrary in the above argument and all limit points $p$ coincide,

(47)

\displaystyle\begin{split}&\limsup_{(t_{k},x_{k})\to(t,x),n\to\infty}Du_{\rho_{n},h_{n}}^{\epsilon_{n}}[\mu_{n}](t_{k},x_{k})\\ &\qquad=\liminf_{(t_{k},x_{k})\to(t,x),n\to\infty}Du_{\rho_{n},h_{n}}^{\epsilon_{n}}[\mu_{n}](t_{k},x_{k})\\ &\qquad=Du(t,x).\end{split}

We conclude that $Du_{\rho_{n},h_{n}}^{\epsilon_{n}}[\mu_{n}]\rightarrow Du[\mu]$ at $(t,x)$ . Part (ii) now follows since $u[\mu]$ is Lipschitz in space by Proposition 2.5 (c) and then $x$ -differentiable for a.e. $x$ and every $t$ .

To prove part (iii), we note that $u$ is $C^{1}$ by (U): , so now (47) holds for every $(t,x)$ . Then in view of the uniform Lipschitz estimate from Lemma 5.5 (a), local uniform convergence follows from [11, Chapter V, Lemma 1.9]. The proof is complete.

Appendix C Proof of Lemma 6.9

We first show strong separation between any two characteristics $\Phi^{\epsilon,\pm}$ : By Lemma 6.8,

	$\displaystyle\big{\|}\Phi^{\epsilon,\pm}_{j,k}-\Phi^{\epsilon,\pm}_{i,k}\big{\|}^{2}=\,\Big{\|}x_{j}-x_{i}\pm\sqrt{h}\sigma_{r}\mp\sqrt{h}\sigma_{r}-h\Big{(}D_{p}H(x_{j},Du^{\epsilon}_{\rho,h}(t_{k},x_{j}))+B^{\sigma}_{r}$
	$\displaystyle\hskip 142.26378pt-D_{p}H(x_{i},Du^{\epsilon}_{\rho,h}(t_{k},x_{i}))-B^{\sigma}_{r}\Big{)}\Big{\|}^{2}$
	$\displaystyle\geq\,\|x_{j}-x_{i}\|^{2}-2h\Big{(}D_{p}H\big{(}x_{j},Du^{\epsilon}_{\rho,h}(t_{k},x_{j})\big{)}-D_{p}H\big{(}x_{i},Du^{\epsilon}_{\rho,h}(t_{k},x_{i})\big{)}\Big{)}(x_{j}-x_{i})$
	$\displaystyle\geq\,(1-c_{0}h)\|x_{j}-x_{i}\|^{2}.$

Hence, we have

(48)

\displaystyle\min\Big{\{}\big{|}\Phi^{\epsilon,+}_{j,k}-\Phi^{\epsilon,+}_{i,k}\big{|},\big{|}\Phi^{\epsilon,-}_{j,k}-\Phi^{\epsilon,-}_{i,k}\big{|}\Big{\}}\geq\sqrt{1-c_{0}h}|j-i|\rho>\rho\sqrt{1-c_{0}h}.

The result now holds following the proof of [23, Lemma 3.8]. We give the proof for completeness.

Since the diameter of the support of a (hat) basis function $\beta_{i}$ is $2\rho$ , by (48) there can be at most 3 characteristics inside $\textup{supp}(\beta_{i})$ for small enough $h$ . The result is trivial if there is only one characteristic in $\textup{supp}(\beta_{i})$ . When $\textup{supp}(\beta_{i})$ contains 2 characteristics, say $\Phi^{\epsilon,+}_{j_{1},k}$ and $\Phi^{\epsilon,+}_{j_{2},k}$ , we see by (48) (check the different orderings of $x_{i}$ , $\Phi^{\epsilon,+}_{j_{1},k}$ , $\Phi^{\epsilon,+}_{j_{2},k}$ ) that

	$\displaystyle\beta_{i}(\Phi^{\epsilon,+}_{j_{1},k})+\beta_{i}(\Phi^{\epsilon,+}_{j_{2},k})=$	$\displaystyle\,1-\frac{\Big{\|}x_{i}-\Phi^{\epsilon,+}_{j_{1},k}\Big{\|}}{\rho}+1-\frac{\Big{\|}x_{i}-\Phi^{\epsilon,+}_{j_{2},k}\Big{\|}}{\rho}$
	$\displaystyle\leq$	$\displaystyle\,2-\frac{\Big{\|}\Phi^{\epsilon,+}_{j_{1},k}-\Phi^{\epsilon,+}_{j_{2},k}\Big{\|}}{\rho}\leq 2-\sqrt{1-c_{0}h}\leq\,1+K_{0}h.$

Finally, assume $\textup{supp}(\beta_{i})$ contains 3 characteristics $\Phi^{\epsilon,+}_{j_{1},k},\Phi^{\epsilon,+}_{j_{2},k}$ and $\Phi^{\epsilon,+}_{j_{3},k}$ . By (48), all three characteristics cannot be on one side (left or right) of $x_{i}$ . Without loss of generality we assume $\Phi^{\epsilon,+}_{j_{1},k}<x_{i}<\Phi^{\epsilon,+}_{j_{2},k}<\Phi^{\epsilon,+}_{j_{3},k}$ , and find

	$\displaystyle\beta_{i}(\Phi^{\epsilon,+}_{j_{1},k})+\beta_{i}(\Phi^{\epsilon,+}_{j_{2},k})+\beta_{i}(\Phi^{\epsilon,+}_{j_{3},k})=1-\frac{x_{i}-\Phi^{\epsilon,+}_{j_{1},k}}{\rho}+1-\frac{\Phi^{\epsilon,+}_{j_{2},k}-x_{i}}{\rho}+1-\frac{\Phi^{\epsilon,+}_{j_{3},k}-x_{i}}{\rho}$
	$\displaystyle\leq\,3-\frac{\Phi^{\epsilon,+}_{j_{2},k}-\Phi^{\epsilon,+}_{j_{1},k}}{\rho}-\frac{\Phi^{\epsilon,+}_{j_{3},k}-\Phi^{\epsilon,+}_{j_{2},k}}{\rho}$
	$\displaystyle\leq\,3-2\sqrt{1-c_{0}h}\leq 1+2(1-\sqrt{1-c_{0}h})\leq 1+K_{0}h.$

Combining all three cases we get

\sum_{j\in\mathbb{Z}}\beta_{i}(\Phi^{\epsilon,+}_{j,k})\leq 1+K_{0}h\quad\mbox{for any}\,i\in\mathbb{Z}.

The estimate of $\sum_{j\in\mathbb{Z}}\beta_{i}(\Phi^{\epsilon,-}_{j,k})$ is similar. This completes the proof.
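The last inequality in each case only uses an elementary bound on the square root. One admissible choice of constant, not stated explicitly above, is $K_{0}=2c_{0}$ , valid whenever $c_{0}h\leq 1$ :

```latex
\begin{align*}
1-\sqrt{1-x} &= \frac{x}{1+\sqrt{1-x}} \leq x \qquad\text{for } x\in[0,1],\\
2-\sqrt{1-c_{0}h} &= 1+\big(1-\sqrt{1-c_{0}h}\big) \leq 1+c_{0}h \leq 1+2c_{0}h,\\
3-2\sqrt{1-c_{0}h} &= 1+2\big(1-\sqrt{1-c_{0}h}\big) \leq 1+2c_{0}h.
\end{align*}
```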

Acknowledgements

The authors are supported by the Toppforsk (research excellence) project Waves and Nonlinear Phenomena (WaNP), grant no. 250070 from the Research Council of Norway. IC is partially supported by the Croatian Science Foundation under the project 4197. The authors would like to thank Elisabetta Carlini for sharing the code of the numerical methods introduced in [23].

References

[1] Y. Achdou, F. Camilli, and I. Capuzzo-Dolcetta. Mean Field Games: Numerical methods for the planning problem. SIAM J. Control Optim., 50(1):77–109, 2012.
[2] Y. Achdou, F. Camilli, and I. Capuzzo-Dolcetta. Mean Field Games: Convergence of a finite difference method. SIAM J. Numer. Anal., 51(5):2585–2612, 2013.
[3] Y. Achdou, F. Camilli, and L. Corrias. On numerical approximation of the Hamilton-Jacobi-transport system arising in high frequency approximations. Discrete Contin. Dyn. Syst. Ser. B, 19(3):629–650, 2014.
[4] Y. Achdou and I. Capuzzo-Dolcetta. Mean Field Games: Numerical methods. SIAM J. Numer. Anal., 48(3):1136–1162, 2010.
[5] Y. Achdou, P. Cardaliaguet, F. Delarue, A. Porretta, and F. Santambrogio. Mean field games. Lecture Notes in Mathematics, CIME vol. 2281, Springer, 2020.
[6] Y. Achdou and M. Laurière. Mean Field Games and applications: Numerical aspects. arXiv preprint arXiv:2003.04444, 2020.
[7] Y. Achdou and V. Perez. Iterative strategies for solving linearized discrete Mean Field Games systems. Netw. Heterog. Media, 7(2):197, 2012.
[8] Y. Achdou and A. Porretta. Convergence of a finite difference scheme to weak solutions of the system of partial differential equations arising in mean field games. SIAM J. Numer. Anal., 54(1):161–186, 2016.
[9] D. Applebaum. Lévy processes and stochastic calculus. Cambridge university press, 2009.
[10] S. Asmussen and J. Rosiński. Approximations of small jumps of Lévy processes with a view towards simulation. J. Appl. Probab., 38(2):482–493, 2001.
[11] M. Bardi and I. Capuzzo-Dolcetta. Optimal Control and Viscosity Solutions of Hamilton-Jacobi-Bellman Equations. Birkhäuser, 1997.
[12] G. Barles and P. E. Souganidis. Convergence of approximation schemes for fully nonlinear second order equations. Asymptotic Anal., 4(3):271–283, 1991.
[13] A. Bensoussan, J. Frehse, and P. Yam. Mean field games and mean field type control theory. SpringerBriefs in Mathematics, Springer, 2013.
[14] F. Camilli and M. Falcone. An approximation scheme for the optimal control of diffusion processes. RAIRO Modél. Math. Anal. Numér., 29(1):97–122, 1995.
[15] F. Camilli and E. R. Jakobsen. A finite element like scheme for integro-partial differential Hamilton–Jacobi–Bellman equations. SIAM J. Numer. Anal., 47(4):2407–2431, 2009.
[16] I. Capuzzo Dolcetta. On a discrete approximation of the Hamilton-Jacobi equation of dynamic programming. Appl. Math. Optim. 10(4): 367–377, 1983.
[17] P. Cardaliaguet. Weak solutions for first order Mean Field Games with local coupling. In Analysis and geometry in control theory and its applications, volume 11 of Springer INdAM Ser., pages 111–158. Springer, Cham, 2015.
[18] P. Cardaliaguet, F. Delarue, J.-M. Lasry, and P.-L. Lions. The Master Equation and the Convergence Problem in Mean Field Games. Annals of Mathematics Studies, vol. 201, Princeton University Press, 2019.
[19] P. Cardaliaguet and P. J. Graber. Mean Field Games systems of first order. ESAIM Control Optim. Calc. Var., 21(3):690–722, 2015.
[20] P. Cardaliaguet, P. J. Graber, A. Porretta, and D. Tonon. Second order Mean Field Games with degenerate diffusion and local coupling. NoDEA Nonlinear Differential Equations Appl., 22(5):1287–1317, 2015.
[21] P. Cardaliaguet, J.-M. Lasry, P.-L. Lions, and A. Porretta. Long time average of Mean Field Games with a nonlocal coupling. SIAM J. Control Optim., 51(5):3558–3591, 2013.
[22] P. Cardaliaguet, J.-M. Lasry, P.-L. Lions, and A. Porretta. Long time average of mean field games. Netw. Heterog. Media, 7(2):279, 2012.
[23] E. Carlini and F. J. Silva. A fully discrete semi-Lagrangian scheme for a first order Mean Field Game problem. SIAM J. Numer. Anal., 52(1):45–67, 2014.
[24] E. Carlini and F. J. Silva. A semi-Lagrangian scheme for a degenerate second order Mean Field Game system. Discrete Contin. Dyn. Syst., 35(9):4269, 2015.
[25] E. Carlini and F. J. Silva. On the discretization of some nonlinear Fokker-Planck-Kolmogorov equations and applications. SIAM J. Numer. Anal., 56(4):2148–2177, 2018.
[26] R. Carmona and F. Delarue. Probabilistic analysis of Mean-Field Games. SIAM J. Control Optim., 51(4):2705–2734, 2013.
[27] R. Carmona and F. Delarue. Probabilistic theory of mean field games with applications. I-II, Probability Theory and Stochastic Modelling 84, Springer 2018.
[28] R. Carmona and M. Laurière. Convergence analysis of machine learning algorithms for the numerical solution of Mean Field Control and Games: II–the finite horizon case. arXiv preprint arXiv:1908.01613, 2019.
[29] R. Carmona, M. Laurière, and Z. Tan. Linear-quadratic Mean-Field reinforcement learning: Convergence of policy gradient methods. arXiv preprint arXiv:1910.04295, 2019.
[30] A. Cesaroni, M. Cirant, S. Dipierro, M. Novaga, and E. Valdinoci. On stationary fractional Mean Field Games. J. Math. Pures Appl., 122(9):1–22, 2017.
[31] I. Chowdhury, E. R. Jakobsen, and M. Krupski. On fully nonlinear parabolic mean field games with examples of nonlocal and local diffusions. arXiv preprint arXiv:2104.06985, 2021.
[32] M. Cirant. On the solvability of some ergodic control problems in $\mathbb{R}^{d}$. SIAM J. Control Optim., 52(6):4001–4026, 2014.
[33] M. Cirant and A. Goffi. On the existence and uniqueness of solutions to time-dependent fractional MFG. SIAM J. Math. Anal., 51(2):913–954, 2019.
[34] R. Cont and P. Tankov. Financial modelling with jump processes. CRC Press, 2003.
[35] K. Debrabant and E. R. Jakobsen. Semi-Lagrangian schemes for linear and fully non-linear diffusion equations. Math. Comp., 82(283):1433–1462, 2013.
[36] S. Elghanjaoui and K. H. Karlsen. A Markov chain approximation scheme for a singular investment-consumption problem with Lévy driven stock prices. Available online: https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.15.2182&rep=rep1&type=pdf, 2002.
[37] O. Ersland and E. R. Jakobsen. On fractional and nonlocal parabolic Mean Field Games in the whole space. arXiv preprint arXiv:2003.12302, 2020.
[38] M. Falcone and R. Ferretti. Semi-Lagrangian approximation schemes for linear and Hamilton-Jacobi equations. Society for Industrial and Applied Mathematics (SIAM), 2014.
[39] D. A. Gomes, E. A. Pimentel, and V. Voskanyan. Regularity theory for mean-field game systems. SpringerBriefs in Mathematics, Springer, 2016.
[40] O. Guéant. Mean Field Games equations with quadratic Hamiltonian: a specific approach. Math. Models Methods Appl. Sci., 22(09), 2012.
[41] O. Guéant. New numerical methods for Mean Field Games with quadratic costs. Netw. Heterog. Media, 7(2):315–336, 2012.
[42] O. Guéant. Mean Field Games with a quadratic Hamiltonian: a constructive scheme. Advances in dynamic games, pages 229–241. Springer, 2013.
[43] O. Guéant, J.-M. Lasry, and P.-L. Lions. Mean field games and applications, in Paris–Princeton Lectures on Mathematical Finance 2010. Lecture Notes in Mathematics 2003, 205–266, Springer, 2011.
[44] M. Huang, R. P. Malhamé, and P. E. Caines. Large population stochastic dynamic games: closed-loop McKean-Vlasov systems and the Nash certainty equivalence principle. Commun. Inf. Syst., 6(3):221–251, 2006.
[45] Y. Huang and A. Oberman. Numerical methods for the fractional Laplacian: A finite difference-quadrature approach. SIAM J. Numer. Anal., 52(6):3056–3084, 2014.
[46] E. R. Jakobsen and K. H. Karlsen. Continuous dependence estimates for viscosity solutions of integro-PDEs. J. Differential Equations, 212(2):278–318, 2005.
[47] E. R. Jakobsen, K. H. Karlsen, and C. La Chioma. Error estimates for approximate solutions to Bellman equations associated with controlled jump-diffusions. Numer. Math., 110(2):221–255, 2008.
[48] V. N. Kolokoltsov, M. S. Troeva, and W. Yang. Mean Field Games based on stable-like processes. Autom. Remote Control, 77(11):2044–2064, 2016.
[49] J.-M. Lasry and P.-L. Lions. Mean Field Games. Jpn. J. Math., 2(1):229–260, 2007.
[50] R. Metzler and J. Klafter. The random walk’s guide to anomalous diffusion: a fractional dynamics approach. Phys. Rep., 339(1):1–77, 2000.
[51] A. Porretta. Weak solutions to Fokker-Planck equations and Mean Field Games. Arch. Ration. Mech. Anal., 216(1):1–62, 2015.
[52] L. Ruthotto, S. J. Osher, W. Li, L. Nurbekyan, and S. W. Fung. A machine learning framework for solving high-dimensional Mean Field Game and Mean Field Control problems. Proc. Natl. Acad. Sci. U.S.A., 117(17):9183–9193, 2020.
[53] C. Villani. Optimal transport: Old and new. Springer Science & Business Media, 2008.
[54] W. A. Woyczyński. Lévy processes in the physical sciences. Lévy processes, pages 241–266. Springer, 2001.

$$
\begin{aligned}
\Big|\mathcal{L}^{r}\phi(x_{i}) &+e^{-h\lambda_{r}}\int_{r<|z|<1}D\phi(x_{i})\cdot z\,\nu(dz)-\frac{1-e^{-h\lambda_{r}}}{h\lambda_{r}}\int_{|z|>r}\big(\phi(x_{i}+z)-\phi(x_{i})\big)\,\nu(dz)\Big| \\
&\leq K(1-e^{-h\lambda_{r}})r^{1-\sigma}\|D\phi\|_{0}+K\Big(1-\frac{1-e^{-h\lambda_{r}}}{h\lambda_{r}}\Big)\Big(r^{1-\sigma}\|D\phi\|_{0}+\|\phi\|_{0}\Big) \\
&\leq K\Big(h\lambda_{r}r^{1-\sigma}\|D\phi\|_{0}+h\lambda_{r}\|\phi\|_{0}\Big).
\end{aligned}
\tag{31}
$$

$$
\begin{aligned}
\int_{|z|>r}\Psi(x_{j}+z)\,\nu(dz) &= \int_{|z|>1}\Psi(x_{j}+z)\,\nu(dz) \\
&\quad+\int_{r<|z|<1}\Big\{\Psi(x_{j})+z\cdot D\Psi(x_{j})+\int_{0}^{1}z\cdot\Big[D\Psi(x_{j}+tz)-D\Psi(x_{j})\Big]dt\Big\}\,\nu(dz) \\
&\leq \Big|\int_{|z|>1}\big(\Psi(x_{j}+z)-\Psi(x_{j})\big)\,\nu(dz)\Big|+\lambda_{r}\Psi(x_{j})+B_{r}^{\sigma}\cdot D\Psi(x_{j}) \\
&\quad+\|D^{2}\Psi\|_{0}\int_{r<|z|<1}|z|^{2}\,\nu(dz) \\
&\leq \lambda_{r}\Psi(x_{j})+B_{r}^{\sigma}\cdot D\Psi(x_{j})+E_{2},
\end{aligned}
$$

$$
\begin{aligned}
\big|I_{k}\big| \leq{}& \frac{1}{h}\sum_{j}m_{j,k}\bigg[e^{-\lambda_{r}h}\Big(\big(-hD_{p}H\big(x_{j},Du^{\epsilon}_{\rho,h}[\mu](t_{k},x_{j})\big)-hB_{r}^{\sigma}\big)\cdot D\phi_{\delta}(x_{j}) \\
&\qquad+\frac{\|D^{2}\phi_{\delta}\|_{0}}{2d}\sum^{d}_{p=1}\big(|a^{+}_{h,j}|^{2}+|a^{-}_{h,j}|^{2}\big)\Big)+\frac{1-e^{-\lambda_{r}h}}{\lambda_{r}}\Big(B^{\sigma}_{r}\cdot D\phi_{\delta}(x_{j}) \\
&\qquad+\|D^{2}\phi_{\delta}\|_{0}\int_{r<|z|<1}|z|^{2}\,\nu(dz)+2\|\phi_{\delta}\|_{0}\int_{|z|>1}\nu(dz)+C\|D^{2}\phi_{\delta}\|_{0}\rho^{2}\Big)\bigg] \\
\leq{}& \frac{1}{h}\bigg[\Big(h\|D_{p}H(\cdot,Du_{\rho,h}^{\epsilon})\|_{0}+h^{2}\lambda_{r}|B^{\sigma}_{r}|\Big)\|D\phi_{\delta}\|_{0}+c_{3}h\|\phi_{\delta}\|_{0} \\
&\qquad+c_{1}\Big(h^{2}\|D_{p}H(\cdot,Du_{\rho,h}^{\epsilon})\|^{2}+h^{2}|B^{\sigma}_{r}|^{2}+h|\sigma_{r}|^{2}+h+\rho^{2}\Big)\|D^{2}\phi_{\delta}\|_{0}\bigg]\sum_{j}m_{j,k}.
\end{aligned}
$$

$$
\begin{aligned}
\int_{0}^{\frac{\rho_{n}}{\epsilon_{n}|y|}}I(s)\cdot y\,ds &\leq 2\|Du_{\rho_{n},h_{n}}^{\epsilon_{n}}[\mu_{n}]\|_{0}\frac{\rho_{n}}{\epsilon_{n}}\leq 2\big((L_{L}+L_{F})T+L_{G}\big)\frac{\rho_{n}}{\epsilon_{n}}, \\
\int^{1}_{\frac{\rho_{n}}{\epsilon_{n}|y|}}I(s)\cdot y\,ds &\leq c_{1}\int^{1}_{\frac{\rho_{n}}{\epsilon_{n}|y|}}\frac{1}{s}\Big(|sy|^{2}+\frac{\rho_{n}^{2}}{\epsilon_{n}^{2}}\Big)ds=c_{1}|y|^{2}\int^{1}_{\frac{\rho_{n}}{\epsilon_{n}|y|}}\Big(s+\frac{1}{s}\frac{\rho_{n}^{2}}{|y|^{2}\epsilon_{n}^{2}}\Big)\,ds \\
&\leq c_{1}|y|^{2}\int^{1}_{\frac{\rho_{n}}{\epsilon_{n}|y|}}\Big(s+\frac{1}{s}s^{2}\Big)\,ds\leq c_{1}|y|^{2}.
\end{aligned}
$$

$$
\begin{aligned}
\big|\Phi^{\epsilon,\pm}_{j,k}-\Phi^{\epsilon,\pm}_{i,k}\big|^{2} &= \Big|x_{j}-x_{i}\pm\sqrt{h}\sigma_{r}\mp\sqrt{h}\sigma_{r}-h\Big(D_{p}H\big(x_{j},Du^{\epsilon}_{\rho,h}(t_{k},x_{j})\big)+B^{\sigma}_{r} \\
&\qquad\qquad-D_{p}H\big(x_{i},Du^{\epsilon}_{\rho,h}(t_{k},x_{i})\big)-B^{\sigma}_{r}\Big)\Big|^{2} \\
&\geq |x_{j}-x_{i}|^{2}-2h\Big(D_{p}H\big(x_{j},Du^{\epsilon}_{\rho,h}(t_{k},x_{j})\big)-D_{p}H\big(x_{i},Du^{\epsilon}_{\rho,h}(t_{k},x_{i})\big)\Big)(x_{j}-x_{i}) \\
&\geq (1-c_{0}h)|x_{j}-x_{i}|^{2}.
\end{aligned}
$$
