
Quasi-Akaike information criterion of structural equation modeling with latent variables for diffusion processes

Shogo Kusano^1 and Masayuki Uchida^{1,2}
^1 Graduate School of Engineering Science, Osaka University
^2 Center for Mathematical Modeling and Data Science (MMDS), Osaka University and JST CREST
Abstract.

We consider a model selection problem for structural equation modeling (SEM) with latent variables for diffusion processes based on high-frequency data. First, we propose the quasi-Akaike information criterion of the SEM and study the asymptotic properties. Next, we consider the situation where the set of competing models includes some misspecified parametric models. It is shown that the probability of choosing the misspecified models converges to zero. Furthermore, examples and simulation results are given.

Key words and phrases:
Structural equation modeling; Quasi-Akaike information criterion; Quasi-likelihood analysis; High-frequency data; Stochastic differential equation.

1. Introduction

We consider a model selection problem for structural equation modeling (SEM) with latent variables for diffusion processes. First, we define the true model of the SEM. The stochastic processes $\mathbb{X}_{1,0,t}$ and $\mathbb{X}_{2,0,t}$ are defined by the following factor models:

\begin{align}
\mathbb{X}_{1,0,t} &= \mathbf{\Lambda}_{x_1,0}\,\xi_{0,t} + \delta_{0,t}, \tag{1.1} \\
\mathbb{X}_{2,0,t} &= \mathbf{\Lambda}_{x_2,0}\,\eta_{0,t} + \varepsilon_{0,t}, \tag{1.2}
\end{align}

where $\{\mathbb{X}_{1,0,t}\}_{t\geq 0}$ and $\{\mathbb{X}_{2,0,t}\}_{t\geq 0}$ are $p_1$- and $p_2$-dimensional observable vector processes, $\{\xi_{0,t}\}_{t\geq 0}$ and $\{\eta_{0,t}\}_{t\geq 0}$ are $k_1$- and $k_2$-dimensional latent common factor vector processes, and $\{\delta_{0,t}\}_{t\geq 0}$ and $\{\varepsilon_{0,t}\}_{t\geq 0}$ are $p_1$- and $p_2$-dimensional latent unique factor vector processes, respectively. $\mathbf{\Lambda}_{x_1,0}\in\mathbb{R}^{p_1\times k_1}$ and $\mathbf{\Lambda}_{x_2,0}\in\mathbb{R}^{p_2\times k_2}$ are constant loading matrices. Neither $p_1$ nor $p_2$ is zero; $p_1$, $p_2$, $k_1$ and $k_2$ are fixed, with $k_1\leq p_1$ and $k_2\leq p_2$. Let $p=p_1+p_2$. Suppose that $\{\xi_{0,t}\}_{t\geq 0}$, $\{\delta_{0,t}\}_{t\geq 0}$ and $\{\varepsilon_{0,t}\}_{t\geq 0}$ satisfy the following stochastic differential equations:

\begin{align}
\mathrm{d}\xi_{0,t} &= B_1(\xi_{0,t})\,\mathrm{d}t + \mathbf{S}_{1,0}\,\mathrm{d}W_{1,t}, \quad \xi_{0,0}=c_1, \tag{1.3} \\
\mathrm{d}\delta_{0,t} &= B_2(\delta_{0,t})\,\mathrm{d}t + \mathbf{S}_{2,0}\,\mathrm{d}W_{2,t}, \quad \delta_{0,0}=c_2, \tag{1.4} \\
\mathrm{d}\varepsilon_{0,t} &= B_3(\varepsilon_{0,t})\,\mathrm{d}t + \mathbf{S}_{3,0}\,\mathrm{d}W_{3,t}, \quad \varepsilon_{0,0}=c_3, \tag{1.5}
\end{align}

where $B_1:\mathbb{R}^{k_1}\rightarrow\mathbb{R}^{k_1}$, $\mathbf{S}_{1,0}\in\mathbb{R}^{k_1\times r_1}$, $c_1\in\mathbb{R}^{k_1}$, $B_2:\mathbb{R}^{p_1}\rightarrow\mathbb{R}^{p_1}$, $\mathbf{S}_{2,0}\in\mathbb{R}^{p_1\times r_2}$, $c_2\in\mathbb{R}^{p_1}$, $B_3:\mathbb{R}^{p_2}\rightarrow\mathbb{R}^{p_2}$, $\mathbf{S}_{3,0}\in\mathbb{R}^{p_2\times r_3}$, $c_3\in\mathbb{R}^{p_2}$, and $W_{1,t}$, $W_{2,t}$ and $W_{3,t}$ are $r_1$-, $r_2$- and $r_3$-dimensional standard Wiener processes, respectively. Moreover, we express the relationship between $\eta_{0,t}$ and $\xi_{0,t}$ as follows:

\begin{align}
\eta_{0,t} = \mathbf{B}_0\,\eta_{0,t} + \mathbf{\Gamma}_0\,\xi_{0,t} + \zeta_{0,t}, \tag{1.6}
\end{align}

where $\mathbf{B}_0\in\mathbb{R}^{k_2\times k_2}$ is a constant loading matrix whose diagonal elements are zero, and $\mathbf{\Gamma}_0\in\mathbb{R}^{k_2\times k_1}$ is a constant loading matrix. Define $\mathbf{\Psi}_0=\mathbb{I}_{k_2}-\mathbf{B}_0$, where $\mathbb{I}_{k_2}$ denotes the identity matrix of size $k_2$. We assume that $\mathbf{\Lambda}_{x_1,0}$ has full column rank and that $\mathbf{\Psi}_0$ is non-singular. $\{\zeta_{0,t}\}_{t\geq 0}$ is a $k_2$-dimensional latent unique factor vector process defined by the following stochastic differential equation:

\begin{align}
\mathrm{d}\zeta_{0,t} = B_4(\zeta_{0,t})\,\mathrm{d}t + \mathbf{S}_{4,0}\,\mathrm{d}W_{4,t}, \quad \zeta_{0,0}=c_4, \tag{1.7}
\end{align}

where $B_4:\mathbb{R}^{k_2}\rightarrow\mathbb{R}^{k_2}$, $\mathbf{S}_{4,0}\in\mathbb{R}^{k_2\times r_4}$, $c_4\in\mathbb{R}^{k_2}$, and $W_{4,t}$ is an $r_4$-dimensional standard Wiener process. Set $\mathbf{\Sigma}_{\xi\xi,0}=\mathbf{S}_{1,0}\mathbf{S}_{1,0}^{\top}$, $\mathbf{\Sigma}_{\delta\delta,0}=\mathbf{S}_{2,0}\mathbf{S}_{2,0}^{\top}$, $\mathbf{\Sigma}_{\varepsilon\varepsilon,0}=\mathbf{S}_{3,0}\mathbf{S}_{3,0}^{\top}$ and $\mathbf{\Sigma}_{\zeta\zeta,0}=\mathbf{S}_{4,0}\mathbf{S}_{4,0}^{\top}$, where $\top$ denotes the transpose. It is supposed that $\mathbf{\Sigma}_{\delta\delta,0}$ and $\mathbf{\Sigma}_{\varepsilon\varepsilon,0}$ are positive definite matrices, and that $W_{1,t}$, $W_{2,t}$, $W_{3,t}$ and $W_{4,t}$ are independent standard Wiener processes on a stochastic basis $(\Omega,\mathscr{F},\{\mathscr{F}_t\},\mathbf{P})$ satisfying the usual conditions. Let $\mathbb{X}_{0,t}=(\mathbb{X}_{1,0,t}^{\top},\mathbb{X}_{2,0,t}^{\top})^{\top}$, and set $\mathbf{\Sigma}_0$ as the variance of $\mathbb{X}_{0,t}$. When there is no risk of confusion, we simply write $\mathbb{X}_{0,t}$ as $\mathbb{X}_t$. The discrete observations are $\mathbb{X}_n=(\mathbb{X}_{t_i^n})_{0\leq i\leq n}=(\mathbb{X}_{0,t_i^n})_{0\leq i\leq n}$, where $t_i^n=ih_n$, $h_n=T/n$, $T$ is fixed, and $p_1$, $p_2$, $k_1$ and $k_2$ are independent of $n$. We consider the situation where $h_n\longrightarrow 0$ as $n\longrightarrow\infty$. We cannot estimate all the elements of $\mathbf{\Lambda}_{x_1,0}$, $\mathbf{\Lambda}_{x_2,0}$, $\mathbf{\Gamma}_0$, $\mathbf{\Psi}_0$, $\mathbf{\Sigma}_{\xi\xi,0}$, $\mathbf{\Sigma}_{\delta\delta,0}$, $\mathbf{\Sigma}_{\varepsilon\varepsilon,0}$ and $\mathbf{\Sigma}_{\zeta\zeta,0}$. Thus, some elements may be assumed to be zero to satisfy an identifiability condition; see, e.g., Everitt [6].
Note that these constraints and the numbers of factors $k_1$ and $k_2$ are determined from the theoretical viewpoint of each research field.

A model selection problem among the following $M$ parametric models is considered. We define the parametric model of Model $m\in\{1,\ldots,M\}$ as follows. Set $\theta_m\in\Theta_m\subset\mathbb{R}^{q_m}$ as the parameter of Model $m$, where $\Theta_m$ is a convex compact space. It is assumed that $\Theta_m$ has a locally Lipschitz boundary; see, e.g., Adams and Fournier [1]. The stochastic processes $\mathbb{X}^{\theta}_{1,m,t}$ and $\mathbb{X}^{\theta}_{2,m,t}$ are defined by the following factor models:

\begin{align}
\mathbb{X}^{\theta}_{1,m,t} &= \mathbf{\Lambda}^{\theta}_{x_1,m}\,\xi^{\theta}_{m,t} + \delta^{\theta}_{m,t}, \tag{1.8} \\
\mathbb{X}^{\theta}_{2,m,t} &= \mathbf{\Lambda}^{\theta}_{x_2,m}\,\eta^{\theta}_{m,t} + \varepsilon^{\theta}_{m,t}, \tag{1.9}
\end{align}

where $\{\mathbb{X}^{\theta}_{1,m,t}\}_{t\geq 0}$ and $\{\mathbb{X}^{\theta}_{2,m,t}\}_{t\geq 0}$ are $p_1$- and $p_2$-dimensional observable vector processes, $\{\xi^{\theta}_{m,t}\}_{t\geq 0}$ and $\{\eta^{\theta}_{m,t}\}_{t\geq 0}$ are $k_1$- and $k_2$-dimensional latent common factor vector processes, and $\{\delta^{\theta}_{m,t}\}_{t\geq 0}$ and $\{\varepsilon^{\theta}_{m,t}\}_{t\geq 0}$ are $p_1$- and $p_2$-dimensional latent unique factor vector processes, respectively. $\mathbf{\Lambda}^{\theta}_{x_1,m}\in\mathbb{R}^{p_1\times k_1}$ and $\mathbf{\Lambda}^{\theta}_{x_2,m}\in\mathbb{R}^{p_2\times k_2}$ are constant loading matrices. Assume that $\{\xi^{\theta}_{m,t}\}_{t\geq 0}$, $\{\delta^{\theta}_{m,t}\}_{t\geq 0}$ and $\{\varepsilon^{\theta}_{m,t}\}_{t\geq 0}$ satisfy the following stochastic differential equations:

\begin{align}
\mathrm{d}\xi^{\theta}_{m,t} &= B_1(\xi^{\theta}_{m,t})\,\mathrm{d}t + \mathbf{S}^{\theta}_{1,m}\,\mathrm{d}W_{1,t}, \quad \xi^{\theta}_{m,0}=c_1, \tag{1.10} \\
\mathrm{d}\delta^{\theta}_{m,t} &= B_2(\delta^{\theta}_{m,t})\,\mathrm{d}t + \mathbf{S}^{\theta}_{2,m}\,\mathrm{d}W_{2,t}, \quad \delta^{\theta}_{m,0}=c_2, \tag{1.11} \\
\mathrm{d}\varepsilon^{\theta}_{m,t} &= B_3(\varepsilon^{\theta}_{m,t})\,\mathrm{d}t + \mathbf{S}^{\theta}_{3,m}\,\mathrm{d}W_{3,t}, \quad \varepsilon^{\theta}_{m,0}=c_3, \tag{1.12}
\end{align}

where $\mathbf{S}^{\theta}_{1,m}\in\mathbb{R}^{k_1\times r_1}$, $\mathbf{S}^{\theta}_{2,m}\in\mathbb{R}^{p_1\times r_2}$ and $\mathbf{S}^{\theta}_{3,m}\in\mathbb{R}^{p_2\times r_3}$. Furthermore, the relationship between $\eta^{\theta}_{m,t}$ and $\xi^{\theta}_{m,t}$ is expressed as follows:

\begin{align}
\eta^{\theta}_{m,t} = \mathbf{B}^{\theta}_m\,\eta^{\theta}_{m,t} + \mathbf{\Gamma}^{\theta}_m\,\xi^{\theta}_{m,t} + \zeta^{\theta}_{m,t}, \tag{1.13}
\end{align}

where $\mathbf{B}^{\theta}_m\in\mathbb{R}^{k_2\times k_2}$ is a constant loading matrix whose diagonal elements are zero, and $\mathbf{\Gamma}^{\theta}_m\in\mathbb{R}^{k_2\times k_1}$ is a constant loading matrix. Set $\mathbf{\Psi}^{\theta}_m=\mathbb{I}_{k_2}-\mathbf{B}^{\theta}_m$. It is supposed that $\mathbf{\Lambda}^{\theta}_{x_1,m}$ has full column rank and that $\mathbf{\Psi}^{\theta}_m$ is non-singular. $\{\zeta^{\theta}_{m,t}\}_{t\geq 0}$ is a $k_2$-dimensional latent unique factor vector process defined by the following stochastic differential equation:

\begin{align}
\mathrm{d}\zeta^{\theta}_{m,t} = B_4(\zeta^{\theta}_{m,t})\,\mathrm{d}t + \mathbf{S}^{\theta}_{4,m}\,\mathrm{d}W_{4,t}, \quad \zeta^{\theta}_{m,0}=c_4, \tag{1.14}
\end{align}

where $\mathbf{S}^{\theta}_{4,m}\in\mathbb{R}^{k_2\times r_4}$. Let $\mathbf{\Sigma}^{\theta}_{\xi\xi,m}=\mathbf{S}^{\theta}_{1,m}\mathbf{S}^{\theta\top}_{1,m}$, $\mathbf{\Sigma}^{\theta}_{\delta\delta,m}=\mathbf{S}^{\theta}_{2,m}\mathbf{S}^{\theta\top}_{2,m}$, $\mathbf{\Sigma}^{\theta}_{\varepsilon\varepsilon,m}=\mathbf{S}^{\theta}_{3,m}\mathbf{S}^{\theta\top}_{3,m}$ and $\mathbf{\Sigma}^{\theta}_{\zeta\zeta,m}=\mathbf{S}^{\theta}_{4,m}\mathbf{S}^{\theta\top}_{4,m}$. It is assumed that $\mathbf{\Sigma}^{\theta}_{\delta\delta,m}$ and $\mathbf{\Sigma}^{\theta}_{\varepsilon\varepsilon,m}$ are positive definite matrices. Define $\mathbb{X}^{\theta}_{m,t}=(\mathbb{X}^{\theta\top}_{1,m,t},\mathbb{X}^{\theta\top}_{2,m,t})^{\top}$. Set

\begin{align*}
\mathbf{\Sigma}_m(\theta_m) = \begin{pmatrix} \mathbf{\Sigma}^{11}_m(\theta_m) & \mathbf{\Sigma}^{12}_m(\theta_m) \\ \mathbf{\Sigma}^{12\top}_m(\theta_m) & \mathbf{\Sigma}^{22}_m(\theta_m) \end{pmatrix}
\end{align*}

as the variance of $\mathbb{X}^{\theta}_{m,t}$, where

\begin{align*}
\mathbf{\Sigma}^{11}_m(\theta_m) &= \mathbf{\Lambda}^{\theta}_{x_1,m}\mathbf{\Sigma}^{\theta}_{\xi\xi,m}\mathbf{\Lambda}^{\theta\top}_{x_1,m} + \mathbf{\Sigma}^{\theta}_{\delta\delta,m}, \\
\mathbf{\Sigma}^{12}_m(\theta_m) &= \mathbf{\Lambda}^{\theta}_{x_1,m}\mathbf{\Sigma}^{\theta}_{\xi\xi,m}\mathbf{\Gamma}^{\theta\top}_m\mathbf{\Psi}^{\theta-1\top}_m\mathbf{\Lambda}^{\theta\top}_{x_2,m}, \\
\mathbf{\Sigma}^{22}_m(\theta_m) &= \mathbf{\Lambda}^{\theta}_{x_2,m}\mathbf{\Psi}^{\theta-1}_m\bigl(\mathbf{\Gamma}^{\theta}_m\mathbf{\Sigma}^{\theta}_{\xi\xi,m}\mathbf{\Gamma}^{\theta\top}_m + \mathbf{\Sigma}^{\theta}_{\zeta\zeta,m}\bigr)\mathbf{\Psi}^{\theta-1\top}_m\mathbf{\Lambda}^{\theta\top}_{x_2,m} + \mathbf{\Sigma}^{\theta}_{\varepsilon\varepsilon,m}.
\end{align*}

It is supposed that there exists $\theta_{m,0}\in\mathrm{Int}\,\Theta_m$ such that $\mathbf{\Sigma}_0=\mathbf{\Sigma}_m(\theta_{m,0})$, and that Model $m$ satisfies an identifiability condition.
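For concreteness, the block structure of $\mathbf{\Sigma}_m(\theta_m)$ above can be assembled numerically from the model matrices. A minimal NumPy sketch; the function name and argument conventions are ours, not the paper's:

```python
import numpy as np

def implied_covariance(Lx1, Lx2, Sxx, Sdd, See, Szz, Gamma, B):
    """Assemble the implied covariance of (X_1, X_2) from the SEM matrices.

    Lx1 (p1 x k1), Lx2 (p2 x k2): loading matrices.
    Sxx, Sdd, See, Szz: covariances of xi, delta, epsilon, zeta.
    Gamma (k2 x k1), B (k2 x k2): structural coefficients;
    Psi = I - B must be non-singular.
    """
    psi_inv = np.linalg.inv(np.eye(B.shape[0]) - B)
    s11 = Lx1 @ Sxx @ Lx1.T + Sdd
    s12 = Lx1 @ Sxx @ Gamma.T @ psi_inv.T @ Lx2.T
    s22 = Lx2 @ psi_inv @ (Gamma @ Sxx @ Gamma.T + Szz) @ psi_inv.T @ Lx2.T + See
    return np.block([[s11, s12], [s12.T, s22]])
```

The returned matrix is symmetric and, under the positive definiteness assumptions on $\mathbf{\Sigma}^{\theta}_{\delta\delta,m}$ and $\mathbf{\Sigma}^{\theta}_{\varepsilon\varepsilon,m}$ above, positive definite.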

Structural equation modeling (SEM) with latent variables is a method for analyzing the relationships between latent variables that cannot be observed; see, e.g., Jöreskog [10], Everitt [6], Mueller [15] and references therein. A researcher often has several candidate models in SEM; note that the candidate models are usually specified to express different hypotheses. The goodness-of-fit test based on the likelihood ratio is widely used for model evaluation in SEM. Akaike [4] proposed the use of the Akaike information criterion (AIC) in a factor model. Using the AIC in a factor model, we can choose the optimal number of factors in terms of prediction. The AIC is also widely used in SEM to choose the optimal model, as it is in factor models; see, e.g., Huang [9].

Thanks to the development of measuring devices, high-frequency data such as stock prices can easily be obtained these days, so that many researchers have studied parametric estimation of diffusion processes based on high-frequency data; see, e.g., Yoshida [18], Genon-Catalot and Jacod [7], Kessler [11], Uchida and Yoshida [17] and references therein. Recently, in the field of financial econometrics, factor models based on high-frequency data have been extensively studied. Aït-Sahalia and Xiu [3] proposed a continuous-time latent factor model for a high-dimensional setting using principal component analysis. Kusano and Uchida [12] suggested classical factor analysis for diffusion processes. This method enables us to analyze the relationships between low-dimensional observed variables sampled with high frequency and latent variables. For instance, based on high-frequency stock price data, we can analyze latent variables such as a world market factor and factors related to a certain industry (Figure 1). On the other hand, few researchers have examined the relationships between these latent variables based on high-frequency data. Kusano and Uchida [13] proposed SEM with latent variables for diffusion processes. Using this method, one can examine the relationships between latent variables based on high-frequency data. For example, if we want to study the relationship between the world market factor and the Japanese financial factor, this method enables us to analyze that relationship (Figure 2). SEM with latent variables may be viewed as regression analysis between latent variables: while both explanatory and objective variables are observable in regression analysis, both of them are latent in SEM with latent variables. For regression analysis and market models based on high-frequency data, see, e.g., Aït-Sahalia et al. [2].

The model selection problem for diffusion processes based on discrete observations has been actively studied. Uchida [16] proposed a contrast-based information criterion for ergodic diffusion processes and obtained an asymptotic result for the difference between contrast-based information criteria. Eguchi and Masuda [5] studied the model comparison problem for semiparametric Lévy driven SDEs and suggested the Gaussian quasi-AIC. Since information criteria are important in SEM as mentioned above, we propose the quasi-AIC (QAIC) of SEM with latent variables for diffusion processes and study its asymptotic properties. In this paper, we consider the non-ergodic case. For the ergodic case, see Appendix 6.3.

The paper is organized as follows. In Section 2, we introduce the notation and assumptions. In Section 3, the QAIC of SEM with latent variables for diffusion processes is considered. Moreover, the situation where the set of competing models includes some (not all) misspecified parametric models is studied. It is shown that the probability of choosing the misspecified models converges to zero. In Section 4, we give examples and simulation results. In Section 5, the results described in Section 3 are proved.

Figure 1. The path diagram for the example of factor analysis.
Figure 2. The path diagram for the example of SEM.

2. Notation and assumptions

First, we prepare the following notations and definitions. For any vector $v$, $|v|=\sqrt{\mathrm{tr}\,vv^{\top}}$, $v^{(i)}$ is the $i$-th element of $v$, and $\mathrm{Diag}\,v$ is the diagonal matrix whose $i$-th diagonal element is $v^{(i)}$. For any matrix $A$, $|A|=\sqrt{\mathrm{tr}\,AA^{\top}}$, and $A_{ij}$ is the $(i,j)$-th element of $A$. For matrices $A$ and $B$ of the same size, $A[B]=\mathrm{tr}(AB^{\top})$. For any matrix $A\in\mathbb{R}^{p\times p}$ and vectors $x,y\in\mathbb{R}^{p}$, we define $A[x,y]=x^{\top}Ay$. For a positive definite matrix $A$, we write $A>0$. For any symmetric matrix $A\in\mathbb{R}^{p\times p}$, $\mathrm{vec}\,A$, $\mathrm{vech}\,A$ and $\mathbb{D}_p$ are the vectorization of $A$, the half-vectorization of $A$ and the $p^2\times\bar{p}$ duplication matrix, respectively, where $\bar{p}=p(p+1)/2$. Note that $\mathrm{vec}\,A=\mathbb{D}_p\,\mathrm{vech}\,A$; see, e.g., Harville [8]. For any matrix $A$, $A^{+}$ stands for the Moore-Penrose inverse of $A$. Set $\mathcal{M}_p^{++}$ as the set of all $p\times p$ real-valued positive definite matrices. For any positive sequence $u_n$, $\mathrm{R}:[0,\infty)\times\mathbb{R}^d\rightarrow\mathbb{R}$ denotes the short notation for functions which satisfy $|\mathrm{R}(u_n,x)|\leq u_n C(1+|x|)^C$ for some $C>0$. Let $C^k_{\uparrow}(\mathbb{R}^d)$ be the space of all functions $f$ satisfying the following conditions:

  (i) $f$ is continuously differentiable with respect to $x\in\mathbb{R}^d$ up to order $k$.

  (ii) $f$ and all its derivatives are of polynomial growth in $x\in\mathbb{R}^d$, i.e., $g$ is of polynomial growth in $x\in\mathbb{R}^d$ if $g(x)=\mathrm{R}(1,x)$.

The symbols $\stackrel{p}{\longrightarrow}$ and $\stackrel{d}{\longrightarrow}$ denote convergence in probability and convergence in distribution, respectively. For any process $Y_t$, $\Delta_i Y = Y_{t_i^n} - Y_{t_{i-1}^n}$. Set

\begin{align*}
\mathbb{Q}_{\mathbb{XX}} = \frac{1}{T}\sum_{i=1}^{n}\bigl(\mathbb{X}_{t_i^n}-\mathbb{X}_{t_{i-1}^n}\bigr)\bigl(\mathbb{X}_{t_i^n}-\mathbb{X}_{t_{i-1}^n}\bigr)^{\top}.
\end{align*}
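This realized covariance is computed directly from the increments of the discrete observations; a minimal NumPy sketch (the function name is ours):

```python
import numpy as np

def realized_covariance(X, T):
    """Q_XX = (1/T) sum_i (X_{t_i} - X_{t_{i-1}})(X_{t_i} - X_{t_{i-1}})^T,
    where X has shape (n + 1, p): rows are observations at t_i = i * T / n."""
    dX = np.diff(X, axis=0)   # increments Delta_i X, shape (n, p)
    return dX.T @ dX / T
```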

$\mathbf{E}$ denotes the expectation under $\mathbf{P}$. Next, we make the following assumptions.

[A]
  (a) (i) There exists a constant $C>0$ such that $|B_1(x)-B_1(y)|\leq C|x-y|$ for any $x,y\in\mathbb{R}^{k_1}$.
      (ii) For all $\ell>0$, $\sup_t\mathbf{E}\bigl[|\xi_{0,t}|^{\ell}\bigr]<\infty$.
      (iii) $B_1\in C^4_{\uparrow}(\mathbb{R}^{k_1})$.

  (b) (i) There exists a constant $C>0$ such that $|B_2(x)-B_2(y)|\leq C|x-y|$ for any $x,y\in\mathbb{R}^{p_1}$.
      (ii) For all $\ell\geq 0$, $\sup_t\mathbf{E}\bigl[|\delta_{0,t}|^{\ell}\bigr]<\infty$.
      (iii) $B_2\in C^4_{\uparrow}(\mathbb{R}^{p_1})$.

  (c) (i) There exists a constant $C>0$ such that $|B_3(x)-B_3(y)|\leq C|x-y|$ for any $x,y\in\mathbb{R}^{p_2}$.
      (ii) For all $\ell\geq 0$, $\sup_t\mathbf{E}\bigl[|\varepsilon_{0,t}|^{\ell}\bigr]<\infty$.
      (iii) $B_3\in C^4_{\uparrow}(\mathbb{R}^{p_2})$.

  (d) (i) There exists a constant $C>0$ such that $|B_4(x)-B_4(y)|\leq C|x-y|$ for any $x,y\in\mathbb{R}^{k_2}$.
      (ii) For all $\ell\geq 0$, $\sup_t\mathbf{E}\bigl[|\zeta_{0,t}|^{\ell}\bigr]<\infty$.
      (iii) $B_4\in C^4_{\uparrow}(\mathbb{R}^{k_2})$.

Remark 1

For diffusion processes, $[\mathbf{A}]$ is a standard assumption; see, e.g., Kessler [11].

3. QAIC of SEM for diffusion processes

Using a locally Gaussian approximation, we obtain the following quasi-likelihood of Model $m$ from (1.8)-(1.14):

\begin{align*}
\prod_{i=1}^{n}\frac{1}{(2\pi h_n)^{\frac{p}{2}}\bigl(\det\mathbf{\Sigma}_m(\theta_m)\bigr)^{\frac{1}{2}}}\exp\biggl(-\frac{1}{2h_n}\mathbf{\Sigma}_m(\theta_m)^{-1}\Bigl[\bigl(\Delta_i\mathbb{X}^{\theta}_m\bigr)^{\otimes 2}\Bigr]\biggr).
\end{align*}

See Appendix 8.1 in Kusano and Uchida [14] for details of the quasi-likelihood. Define the quasi-likelihood $\mathrm{L}_{m,n}$ based on the discrete observations $\mathbb{X}_n$ as follows:

\begin{align*}
\mathrm{L}_{m,n}\bigl(\mathbb{X}_n,\theta_m\bigr) = \prod_{i=1}^{n}\frac{1}{(2\pi h_n)^{\frac{p}{2}}\bigl(\det\mathbf{\Sigma}_m(\theta_m)\bigr)^{\frac{1}{2}}}\exp\biggl(-\frac{1}{2h_n}\mathbf{\Sigma}_m(\theta_m)^{-1}\Bigl[\bigl(\Delta_i\mathbb{X}\bigr)^{\otimes 2}\Bigr]\biggr).
\end{align*}

The quasi-maximum likelihood estimator $\hat{\theta}_{m,n}$ is defined by

\begin{align*}
\mathrm{L}_{m,n}\bigl(\mathbb{X}_n,\hat{\theta}_{m,n}(\mathbb{X}_n)\bigr) = \sup_{\theta_m\in\Theta_m}\mathrm{L}_{m,n}\bigl(\mathbb{X}_n,\theta_m\bigr).
\end{align*}
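In practice $\hat{\theta}_{m,n}$ is obtained by numerically maximizing the quasi-log-likelihood. The sketch below evaluates $\log\mathrm{L}_{m,n}$ for a given implied covariance and maximizes it over a one-dimensional grid of candidate parameters; the helper names and the grid-search strategy are our own simplification, since the paper does not prescribe a particular optimizer:

```python
import numpy as np

def quasi_loglik(X, T, sigma):
    """log L_{m,n}(X_n, theta_m): Gaussian quasi-log-likelihood of the
    increments of X under N(0, h_n * Sigma_m(theta_m))."""
    n, p = X.shape[0] - 1, X.shape[1]
    h = T / n
    dX = np.diff(X, axis=0)
    _, logdet = np.linalg.slogdet(sigma)
    # sum_i Sigma^{-1}[(Delta_i X)^{otimes 2}] = sum_i dX_i^T Sigma^{-1} dX_i
    quad = np.einsum('ij,jk,ik->', dX, np.linalg.inv(sigma), dX)
    return -0.5 * n * p * np.log(2 * np.pi * h) - 0.5 * n * logdet - quad / (2 * h)

def qmle_grid(X, T, sigma_of_theta, grid):
    """Crude grid-search QMLE: return the theta in `grid` maximizing log L."""
    values = [quasi_loglik(X, T, sigma_of_theta(th)) for th in grid]
    return grid[int(np.argmax(values))]
```

Here `sigma_of_theta` is a user-supplied map $\theta_m\mapsto\mathbf{\Sigma}_m(\theta_m)$, e.g. built from the block formulas of Section 1.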

Set $\mathbb{Z}_n$ as an i.i.d. copy of $\mathbb{X}_n$. Let us consider the following Kullback-Leibler divergence between the transition density $q_n(\mathbb{Z}_n)$ of the true model (1.1)-(1.7) and the quasi-likelihood $\mathrm{L}_{m,n}$:

\begin{align*}
\mathrm{K_L}(\mathbb{X}_n,m) &= \mathbf{E}_{\mathbb{Z}_n}\Biggl[\log\frac{q_n(\mathbb{Z}_n)}{\mathrm{L}_{m,n}\bigl(\mathbb{Z}_n,\hat{\theta}_{m,n}(\mathbb{X}_n)\bigr)}\Biggr] \\
&= \mathbf{E}_{\mathbb{Z}_n}\bigl[\log q_n(\mathbb{Z}_n)\bigr] - \mathbf{E}_{\mathbb{Z}_n}\Bigl[\log\mathrm{L}_{m,n}\bigl(\mathbb{Z}_n,\hat{\theta}_{m,n}(\mathbb{X}_n)\bigr)\Bigr],
\end{align*}

where $\mathbf{E}_{\mathbb{Z}_n}$ is the expectation under the law of $\mathbb{Z}_n$. Our aim is to find the model that minimizes $\mathrm{K_L}(\mathbb{X}_n,m)$. Since $\mathbf{E}_{\mathbb{Z}_n}\bigl[\log q_n(\mathbb{Z}_n)\bigr]$ does not depend on the model, it is sufficient to consider the model which maximizes

\begin{align}
\mathbf{E}_{\mathbb{Z}_n}\Bigl[\log\mathrm{L}_{m,n}\bigl(\mathbb{Z}_n,\hat{\theta}_{m,n}(\mathbb{X}_n)\bigr)\Bigr], \tag{3.1}
\end{align}

so that we need to estimate (3.1). Set

\begin{align*}
\Delta_{m,0} = \left.\frac{\partial}{\partial\theta_m^{\top}}\mathrm{vech}\,\mathbf{\Sigma}_m(\theta_m)\right|_{\theta_m=\theta_{m,0}}
\end{align*}

and

\begin{align*}
\mathrm{Y}_m(\theta_m) = -\frac{1}{2}\Bigl(\mathbf{\Sigma}_m(\theta_m)^{-1}-\mathbf{\Sigma}_m(\theta_{m,0})^{-1}\Bigr)\Bigl[\mathbf{\Sigma}_m(\theta_{m,0})\Bigr] - \frac{1}{2}\log\frac{\det\mathbf{\Sigma}_m(\theta_m)}{\det\mathbf{\Sigma}_m(\theta_{m,0})}.
\end{align*}

Moreover, the following assumptions are made.

[B1]
  (a) There exists a constant $\chi>0$ such that $\mathrm{Y}_m(\theta_m)\leq -\chi\bigl|\theta_m-\theta_{m,0}\bigr|^2$ for all $\theta_m\in\Theta_m$.

  (b) $\mathrm{rank}\,\Delta_{m,0}=q_m$.

Remark 2

$[\mathbf{B1}]$ (a) is the identifiability condition. $[\mathbf{B1}]$ (b) implies that the asymptotic variance of $\hat{\theta}_{m,n}$ is non-singular; see Lemma 35 in Kusano and Uchida [14] and Lemma 2.

The following theorem yields an asymptotically unbiased estimator of (3.1).

Theorem 1

Let $m\in\{1,\ldots,M\}$. Under [A] and [B1], as $n\longrightarrow\infty$,

\begin{align*}
\mathbf{E}_{\mathbb{X}_n}\biggl[\log\mathrm{L}_{m,n}\bigl(\mathbb{X}_n,\hat{\theta}_{m,n}(\mathbb{X}_n)\bigr) - \mathbf{E}_{\mathbb{Z}_n}\Bigl[\log\mathrm{L}_{m,n}\bigl(\mathbb{Z}_n,\hat{\theta}_{m,n}(\mathbb{X}_n)\bigr)\Bigr]\biggr] = q_m + o_p(1).
\end{align*}

We define the quasi-Akaike information criterion as

\begin{align}
\mathrm{QAIC}(\mathbb{X}_n,m) = -2\log\mathrm{L}_{m,n}\bigl(\mathbb{X}_n,\hat{\theta}_{m,n}(\mathbb{X}_n)\bigr) + 2q_m. \tag{3.2}
\end{align}

Since Theorem 1 shows that $\mathrm{QAIC}(\mathbb{X}_n,m)$ is an asymptotically unbiased estimator of

\begin{align*}
-2\,\mathbf{E}_{\mathbb{Z}_n}\Bigl[\log\mathrm{L}_{m,n}\bigl(\mathbb{Z}_n,\hat{\theta}_{m,n}(\mathbb{X}_n)\bigr)\Bigr],
\end{align*}

we select the optimal model $\hat{m}_n$ among the competing models by

\begin{align}
\mathrm{QAIC}(\mathbb{X}_n,\hat{m}_n) = \min_{m\in\{1,\ldots,M\}}\mathrm{QAIC}(\mathbb{X}_n,m). \tag{3.3}
\end{align}
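Given the maximized quasi-log-likelihood and the parameter dimension $q_m$ of each competing model, the selection rule (3.3) reduces to an argmin; a minimal sketch (function names ours):

```python
def qaic(max_loglik, q_m):
    """QAIC(X_n, m) = -2 log L_{m,n}(X_n, hat theta_{m,n}) + 2 q_m."""
    return -2.0 * max_loglik + 2.0 * q_m

def select_model(max_logliks, dims):
    """Return the (0-based) index of the model minimizing QAIC."""
    scores = [qaic(l, q) for l, q in zip(max_logliks, dims)]
    return scores.index(min(scores))
```

For instance, `select_model([-100.0, -98.0], [3, 10])` returns 0: the QAIC values are 206 and 216, so the smaller model wins despite its slightly lower quasi-log-likelihood.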
Remark 3

Since $\mathrm{L}_{m,n}$ is not the exact likelihood but a quasi-likelihood, all the competing models are misspecified. Note that we consider a model selection problem among the quasi-likelihood models; see, e.g., Eguchi and Masuda [5].

Remark 4

In SEM, instead of (3.2),

\begin{align}
n\mathrm{F}_{m,n}\bigl(\mathbb{X}_n,\hat{\theta}_{m,n}(\mathbb{X}_n)\bigr) + 2q_m \tag{3.4}
\end{align}

is often used for model selection when $\mathbb{Q}_{\mathbb{XX}}>0$, where

\begin{align*}
\mathrm{F}_{m,n}\bigl(\mathbb{X}_n,\theta_m\bigr) = \log\det\mathbf{\Sigma}_m(\theta_m) - \log\det\mathbb{Q}_{\mathbb{XX}} + \mathrm{tr}\Bigl(\mathbf{\Sigma}_m(\theta_m)^{-1}\mathbb{Q}_{\mathbb{XX}}\Bigr) - p.
\end{align*}

For details of (3.4), see, e.g., Huang [9]. Since

\begin{align*}
-2\log\mathrm{L}_{m,n}\bigl(\mathbb{X}_n,\hat{\theta}_{m,n}(\mathbb{X}_n)\bigr) &= np\log(2\pi h_n) + n\log\det\mathbf{\Sigma}_m(\hat{\theta}_{m,n}) + n\,\mathrm{tr}\Bigl(\mathbf{\Sigma}_m(\hat{\theta}_{m,n})^{-1}\mathbb{Q}_{\mathbb{XX}}\Bigr) \\
&= n\mathrm{F}_{m,n}\bigl(\mathbb{X}_n,\hat{\theta}_{m,n}(\mathbb{X}_n)\bigr) + n\Bigl\{p\log(2\pi h_n) + \log\det\mathbb{Q}_{\mathbb{XX}} + p\Bigr\}
\end{align*}

when $\mathbb{Q}_{\mathbb{XX}}>0$, it is shown that

\begin{align*}
n\mathrm{F}_{m,n}\bigl(\mathbb{X}_n,\hat{\theta}_{m,n}(\mathbb{X}_n)\bigr) + 2q_m = \mathrm{QAIC}(\mathbb{X}_n,m) - n\Bigl\{p\log(2\pi h_n) + \log\det\mathbb{Q}_{\mathbb{XX}} + p\Bigr\}
\end{align*}

when $\mathbb{Q}_{\mathbb{XX}}>0$. Note that

\begin{align*}
n\Bigl\{p\log(2\pi h_n) + \log\det\mathbb{Q}_{\mathbb{XX}} + p\Bigr\}
\end{align*}

does not depend on the model. Hence, even if we use (3.4) instead of (3.2), the model selection results do not differ.
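The equality in Remark 4 can also be checked numerically. A sketch in which $\mathbf{\Sigma}_m(\hat{\theta}_{m,n})$, $\mathbb{Q}_{\mathbb{XX}}$, $n$, $h_n$ and $q_m$ are arbitrary placeholder values of ours:

```python
import numpy as np

def neg2_quasi_loglik(sigma, qxx, n, h):
    """-2 log L_{m,n} = np log(2 pi h_n) + n log det Sigma + n tr(Sigma^{-1} Q_XX)."""
    p = sigma.shape[0]
    return n * (p * np.log(2 * np.pi * h)
                + np.linalg.slogdet(sigma)[1]
                + np.trace(np.linalg.inv(sigma) @ qxx))

def discrepancy_F(sigma, qxx):
    """F_{m,n} = log det Sigma - log det Q_XX + tr(Sigma^{-1} Q_XX) - p."""
    p = sigma.shape[0]
    return (np.linalg.slogdet(sigma)[1] - np.linalg.slogdet(qxx)[1]
            + np.trace(np.linalg.inv(sigma) @ qxx) - p)

# Check: n F + 2 q_m  ==  QAIC - n { p log(2 pi h_n) + log det Q_XX + p }.
n, h, q_m = 100, 0.01, 4
sigma = np.array([[2.0, 0.3], [0.3, 1.0]])
qxx = np.array([[1.5, 0.2], [0.2, 0.8]])
p = sigma.shape[0]
lhs = n * discrepancy_F(sigma, qxx) + 2 * q_m
qaic = neg2_quasi_loglik(sigma, qxx, n, h) + 2 * q_m
rhs = qaic - n * (p * np.log(2 * np.pi * h) + np.linalg.slogdet(qxx)[1] + p)
```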

Next, we consider the situation where the set of competing models includes some (not all) misspecified parametric models; that is, there exists $m\in\{1,\ldots,M\}$ such that

\begin{align*}
\mathbf{\Sigma}_0 \neq \mathbf{\Sigma}_m(\theta_m)
\end{align*}

for all $\theta_m\in\Theta_m$. Set

\begin{align*}
\mathcal{M} = \Bigl\{m\in\{1,\ldots,M\} \,\Big|\, \mbox{there exists } \theta_{m,0}\in\Theta_m \mbox{ such that } \mathbf{\Sigma}_0 = \mathbf{\Sigma}_m(\theta_{m,0})\Bigr\}
\end{align*}

and $\mathcal{M}^c=\{1,\ldots,M\}\backslash\mathcal{M}$. The optimal parameter $\bar{\theta}_m$ is defined by

\begin{align*}
\mathrm{H}_m(\bar{\theta}_m) = \sup_{\theta_m\in\Theta_m}\mathrm{H}_m(\theta_m),
\end{align*}

where

\begin{align*}
\mathrm{H}_m(\theta_m) = -\frac{1}{2}\mathrm{tr}\Bigl(\mathbf{\Sigma}_m(\theta_m)^{-1}\mathbf{\Sigma}_0\Bigr) - \frac{1}{2}\log\det\mathbf{\Sigma}_m(\theta_m).
\end{align*}

Note that $\bar{\theta}_m=\theta_{m,0}$ for $m\in\mathcal{M}$. Furthermore, we make the following assumption.

[B2]
  $\mathrm{H}_m(\theta_m)=\mathrm{H}_m(\bar{\theta}_m) \Longrightarrow \theta_m=\bar{\theta}_m$.

$[\mathbf{B2}]$ implies that $\hat{\theta}_{m,n}\stackrel{p}{\longrightarrow}\bar{\theta}_m$; see Lemma 36 in Kusano and Uchida [14]. The following asymptotic result for $\hat{m}_n$ defined in (3.3) holds.

Theorem 2

Under [A] and [B2], as $n\longrightarrow\infty$,

\begin{align*}
\mathbf{P}\Bigl(\hat{m}_n\in\mathcal{M}^c\Bigr) \longrightarrow 0.
\end{align*}

Theorem 2 shows that the probability of choosing the misspecified models converges to zero as $n\longrightarrow\infty$.

4. Simulation results

4.1. True model

The stochastic process $\mathbb{X}_{1,0,t}$ is defined by the following factor model:

\begin{align*}
\mathbb{X}_{1,0,t} = \begin{pmatrix} 1 & 5 & 2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 4 & 7 \end{pmatrix}^{\top}\xi_{0,t} + \delta_{0,t},
\end{align*}

where $\{\mathbb{X}_{1,0,t}\}_{t\geq 0}$ is a six-dimensional observable vector process, $\{\xi_{0,t}\}_{t\geq 0}$ is a two-dimensional latent common factor vector process, and $\{\delta_{0,t}\}_{t\geq 0}$ is a six-dimensional latent unique factor vector process. The stochastic process $\mathbb{X}_{2,0,t}$ is defined by the following factor model:

\begin{align*}
\mathbb{X}_{2,0,t} = \begin{pmatrix} 1 \\ 2 \end{pmatrix}\eta_{0,t} + \varepsilon_{0,t},
\end{align*}

where $\{\mathbb{X}_{2,0,t}\}_{t\geq 0}$ is a two-dimensional observable vector process, $\{\eta_{0,t}\}_{t\geq 0}$ is a one-dimensional latent common factor vector process, and $\{\varepsilon_{0,t}\}_{t\geq 0}$ is a two-dimensional latent unique factor vector process. Furthermore, the relationship between $\eta_{0,t}$ and $\xi_{0,t}$ is expressed as follows:

\begin{align*}
\eta_{0,t} = \begin{pmatrix} 3 & 2 \end{pmatrix}\xi_{0,t} + \zeta_{0,t},
\end{align*}

where $\{\zeta_{0,t}\}_{t\geq 0}$ is a one-dimensional latent unique factor vector process. It is supposed that $\{\xi_{0,t}\}_{t\geq 0}$ is the following two-dimensional Ornstein-Uhlenbeck (OU) process:

\begin{align*}
\mathrm{d}\xi_{0,t} = -\left\{\begin{pmatrix} 1 & 0.7 \\ 0.7 & 0.5 \end{pmatrix}\xi_{0,t} - \begin{pmatrix} 1 \\ 2 \end{pmatrix}\right\}\mathrm{d}t + \begin{pmatrix} 1 & 0.3 \\ 0.4 & 1 \end{pmatrix}\mathrm{d}W_{1,t} \quad (t\in[0,T]), \quad \xi_{0,0} = \begin{pmatrix} 2 \\ 1 \end{pmatrix},
\end{align*}

where W1,tW_{1,t} is a two-dimensional standard Wiener process. {δ0,t}t0\{\delta_{0,t}\}_{t\geq 0} is defined by the six-dimensional OU process as follows:

dδ0,t=(B0δ0,tμ0)dt+𝐒2,0dW2,t(t[0,T]),δ0,0=c0,\displaystyle\quad\mathrm{d}\delta_{0,t}=-\bigl{(}B_{0}\delta_{0,t}-\mu_{0}\bigr{)}\mathrm{d}t+{\bf{S}}_{2,0}\mathrm{d}W_{2,t}\ \ (t\in[0,T]),\ \ \delta_{0,0}=c_{0},

where B0=Diag(3,2,4,1,2,1)B_{0}={\rm{Diag}}(3,2,4,1,2,1), μ0=(3,2,1,2,6,4)\mu_{0}=(3,2,1,2,6,4)^{\top}, 𝐒2,0=Diag(3,2,1,2,1,3){\bf{S}}_{2,0}={\rm{Diag}}(3,2,1,2,1,3), c0=(1,3,2,1,4,3)c_{0}=(1,3,2,1,4,3)^{\top} and W2,tW_{2,t} is a six-dimensional standard Wiener process. {ε0,t}t0\{\varepsilon_{0,t}\}_{t\geq 0} satisfies the following two-dimensional OU process:

dε0,t={(1003)ε0,t(23)}dt+(1002)dW3,t(t[0,T]),ε0,0=(15),\displaystyle\quad\mathrm{d}\varepsilon_{0,t}=-\left\{\begin{pmatrix}1&0\\ 0&3\end{pmatrix}\varepsilon_{0,t}-\begin{pmatrix}2\\ 3\end{pmatrix}\right\}\mathrm{d}t+\begin{pmatrix}1&0\\ 0&2\end{pmatrix}\mathrm{d}W_{3,t}\ \ (t\in[0,T]),\ \ \varepsilon_{0,0}=\begin{pmatrix}1\\ 5\end{pmatrix},

where W3,tW_{3,t} is a two-dimensional standard Wiener process. {ζ0,t}t0\{\zeta_{0,t}\}_{t\geq 0} is defined by the following one-dimensional OU process:

dζ0,t=ζ0,tdt+2dW4,t(t[0,T]),ζ0,0=0,\displaystyle\quad\mathrm{d}\zeta_{0,t}=-\zeta_{0,t}\mathrm{d}t+2\mathrm{d}W_{4,t}\ \ (t\in[0,T]),\ \ \zeta_{0,0}=0,

where W4,tW_{4,t} is a one-dimensional standard Wiener process. Figure 3 shows the path diagram of the true model at time tt.
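The true model above can be simulated directly. The following sketch (our own code, using NumPy and a plain Euler-Maruyama discretization rather than the exact OU transition) generates a sample path of the observed process (𝕏1,t, 𝕏2,t) on [0,T]; all variable names are ours.

```python
import numpy as np

# Loading matrices and structural coefficient of the true model (Section 4.1).
Lx1 = np.array([[1, 5, 2, 0, 0, 0],
                [0, 0, 0, 1, 4, 7]], dtype=float).T   # 6 x 2
Lx2 = np.array([[1.0], [2.0]])                        # 2 x 1
Gam = np.array([[3.0, 2.0]])                          # 1 x 2

def simulate_true_model(n, T=1.0, seed=0):
    """Euler-Maruyama sketch of the latent OU processes and the observations."""
    rng = np.random.default_rng(seed)
    h = T / n
    sh = np.sqrt(h)
    # Latent paths with the initial values of Section 4.1.
    xi  = np.zeros((n + 1, 2)); xi[0]  = [2.0, 1.0]
    dlt = np.zeros((n + 1, 6)); dlt[0] = [1, 3, 2, 1, 4, 3]
    eps = np.zeros((n + 1, 2)); eps[0] = [1.0, 5.0]
    zta = np.zeros(n + 1)
    # Drift and diffusion coefficients of the four OU equations.
    A1 = np.array([[1.0, 0.7], [0.7, 0.5]]); m1 = np.array([1.0, 2.0])
    S1 = np.array([[1.0, 0.3], [0.4, 1.0]])
    B0 = np.diag([3.0, 2.0, 4.0, 1.0, 2.0, 1.0]); mu0 = np.array([3, 2, 1, 2, 6, 4.0])
    S2 = np.diag([3.0, 2.0, 1.0, 2.0, 1.0, 3.0])
    A3 = np.diag([1.0, 3.0]); m3 = np.array([2.0, 3.0]); S3 = np.diag([1.0, 2.0])
    for i in range(n):
        xi[i + 1]  = xi[i]  - (A1 @ xi[i] - m1) * h + S1 @ rng.standard_normal(2) * sh
        dlt[i + 1] = dlt[i] - (B0 @ dlt[i] - mu0) * h + S2 @ rng.standard_normal(6) * sh
        eps[i + 1] = eps[i] - (A3 @ eps[i] - m3) * h + S3 @ rng.standard_normal(2) * sh
        zta[i + 1] = zta[i] - zta[i] * h + 2.0 * sh * rng.standard_normal()
    eta = xi @ Gam.T + zta[:, None]   # eta_t = Gamma_0 xi_t + zeta_t
    X1 = xi @ Lx1.T + dlt             # X_{1,t} = Lambda_{x1,0} xi_t + delta_t
    X2 = eta @ Lx2.T + eps            # X_{2,t} = Lambda_{x2,0} eta_t + eps_t
    return np.hstack([X1, X2])        # (n+1) x 8 observed path

X = simulate_true_model(n=1000)
```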

Refer to caption
Figure 3. The path diagram of the true model at time tt.

4.2. Competing models

4.2.1. Model 1

Set the parameter as θ1Θ119\theta_{1}\in\Theta_{1}\subset\mathbb{R}^{19}. Let p1=6p_{1}=6, p2=2p_{2}=2, k1=2k_{1}=2 and k2=1k_{2}=1. Assume

𝚲1,x1θ\displaystyle{\bf{\Lambda}}^{\theta}_{1,x_{1}} =(1θ1(1)θ1(2)0000001θ1(3)θ1(4))\displaystyle=\begin{pmatrix}1&\theta^{(1)}_{1}&\theta^{(2)}_{1}&0&0&0\\ 0&0&0&1&\theta^{(3)}_{1}&\theta^{(4)}_{1}\end{pmatrix}^{\top}

and

𝚲1,x2θ=(1θ1(5)),𝚪1θ=(θ1(6)θ1(7)),\displaystyle{\bf{\Lambda}}^{\theta}_{1,x_{2}}=\begin{pmatrix}1&\theta_{1}^{(5)}\end{pmatrix}^{\top},\quad{\bf{\Gamma}}^{\theta}_{1}=\begin{pmatrix}\theta_{1}^{(6)}&\theta_{1}^{(7)}\end{pmatrix},

where θ1(1)\theta_{1}^{(1)}, θ1(2)\theta_{1}^{(2)}, θ1(3)\theta_{1}^{(3)}, θ1(4)\theta_{1}^{(4)}, θ1(5)\theta_{1}^{(5)}, θ1(6)\theta_{1}^{(6)} and θ1(7)\theta_{1}^{(7)} are not zero. It is supposed that 𝐒1,1θ{\bf{S}}^{\theta}_{1,1} and 𝐒2,1θ{\bf{S}}^{\theta}_{2,1} satisfy

𝚺1,ξξθ=𝐒1,1θ𝐒1,1θ=(θ1(8)θ1(9)θ1(9)θ1(10))2++\displaystyle\qquad\qquad\qquad{\bf{\Sigma}}^{\theta}_{1,\xi\xi}={\bf{S}}^{\theta}_{1,1}{\bf{S}}^{\theta\top}_{1,1}=\begin{pmatrix}\theta_{1}^{(8)}&\theta_{1}^{(9)}\\ \theta_{1}^{(9)}&\theta_{1}^{(10)}\end{pmatrix}\in\mathcal{M}_{2}^{++}

and

𝚺1,δδθ=𝐒2,1θ𝐒2,1θ=Diag(θ1(11),θ1(12),θ1(13),θ1(14),θ1(15),θ1(16))6++,\displaystyle{\bf{\Sigma}}^{\theta}_{1,\delta\delta}={\bf{S}}^{\theta}_{2,1}{\bf{S}}^{\theta\top}_{2,1}={\rm{Diag}}\Bigl{(}\theta_{1}^{(11)},\theta_{1}^{(12)},\theta_{1}^{(13)},\theta_{1}^{(14)},\theta_{1}^{(15)},\theta_{1}^{(16)}\Bigr{)}\in\mathcal{M}_{6}^{++},

where θ1(9)\theta_{1}^{(9)} is not zero. Moreover, 𝐒3,1θ{\bf{S}}^{\theta}_{3,1} and 𝐒4,1θ{\bf{S}}^{\theta}_{4,1} are assumed to satisfy

𝚺1,εεθ=𝐒3,1θ𝐒3,1θ=(θ1(17)00θ1(18))2++\displaystyle\qquad\qquad\qquad{\bf{\Sigma}}^{\theta}_{1,\varepsilon\varepsilon}={\bf{S}}^{\theta}_{3,1}{\bf{S}}^{\theta\top}_{3,1}=\begin{pmatrix}\theta_{1}^{(17)}&0\\ 0&\theta_{1}^{(18)}\end{pmatrix}\in\mathcal{M}_{2}^{++}

and 𝚺1,ζζθ=(𝐒4,1θ)2=θ1(19)>0{\bf{\Sigma}}^{\theta}_{1,\zeta\zeta}=({\bf{S}}^{\theta}_{4,1})^{2}=\theta_{1}^{(19)}>0. Set

θ1,0=(5,2,4,7,2,3,2,1.09,0.70,1.16,9,4,1,4,1,9,1,4,4).\displaystyle\theta_{1,0}=\Bigl{(}5,2,4,7,2,3,2,1.09,0.70,1.16,9,4,1,4,1,9,1,4,4\Bigr{)}.

It holds that 𝚺0=𝚺1(θ1,0){\bf{\Sigma}}_{0}={\bf{\Sigma}}_{1}(\theta_{1,0}), so that Model 11 is a correctly specified model. There exists a constant χ>0\chi>0 such that
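The identity Σ0 = Σ1(θ1,0) can be checked block by block: the variance parameters in θ1,0 reproduce the diffusion covariances of the true model. A minimal NumPy check (our notation) is:

```python
import numpy as np

# Diffusion coefficients of the true model (Section 4.1).
S1 = np.array([[1.0, 0.3], [0.4, 1.0]])           # for xi
S2 = np.diag([3.0, 2.0, 1.0, 2.0, 1.0, 3.0])      # for delta
S3 = np.diag([1.0, 2.0])                          # for eps
s4 = 2.0                                          # for zeta

# Corresponding blocks implied by the quasi-true value theta_{1,0} of Model 1.
Sigma_xixi = np.array([[1.09, 0.70], [0.70, 1.16]])
Sigma_dd = np.diag([9.0, 4.0, 1.0, 4.0, 1.0, 9.0])
Sigma_ee = np.diag([1.0, 4.0])
Sigma_zz = 4.0

assert np.allclose(S1 @ S1.T, Sigma_xixi)  # Sigma_{xi xi}
assert np.allclose(S2 @ S2.T, Sigma_dd)    # Sigma_{delta delta}
assert np.allclose(S3 @ S3.T, Sigma_ee)    # Sigma_{eps eps}
assert np.isclose(s4 ** 2, Sigma_zz)       # Sigma_{zeta zeta}
```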

Y1(θ1)χ|θ1θ1,0|2\displaystyle{\rm{Y}}_{1}(\theta_{1})\leq-\chi|\theta_{1}-\theta_{1,0}|^{2} (4.1)

for all θ1Θ1\theta_{1}\in\Theta_{1}. For the proof of (4.1), see Appendix 6.2. Figure 4 shows the path diagram of Model 11 at time tt.

Refer to caption
Figure 4. The path diagram of Model 1.

4.2.2. Model 2

The parameter is defined as θ2Θ220\theta_{2}\in\Theta_{2}\subset\mathbb{R}^{20}. Set p1=6p_{1}=6, p2=2p_{2}=2, k1=2k_{1}=2 and k2=1k_{2}=1. Suppose

𝚲2,x1θ\displaystyle{\bf{\Lambda}}^{\theta}_{2,x_{1}} =(1θ2(1)θ2(2)00000θ2(3)1θ2(4)θ2(5))\displaystyle=\begin{pmatrix}1&\theta^{(1)}_{2}&\theta^{(2)}_{2}&0&0&0\\ 0&0&\theta^{(3)}_{2}&1&\theta^{(4)}_{2}&\theta^{(5)}_{2}\end{pmatrix}^{\top}

and

𝚲2,x2θ=(1θ2(6)),𝚪2θ=(θ2(7)θ2(8)),\displaystyle{\bf{\Lambda}}^{\theta}_{2,x_{2}}=\begin{pmatrix}1&\theta_{2}^{(6)}\end{pmatrix}^{\top},\quad{\bf{\Gamma}}^{\theta}_{2}=\begin{pmatrix}\theta_{2}^{(7)}&\theta_{2}^{(8)}\end{pmatrix},

where θ2(1)\theta_{2}^{(1)}, θ2(2)\theta_{2}^{(2)}, θ2(3)\theta_{2}^{(3)}, θ2(4)\theta_{2}^{(4)}, θ2(5)\theta_{2}^{(5)}, θ2(6)\theta_{2}^{(6)}, θ2(7)\theta_{2}^{(7)} and θ2(8)\theta_{2}^{(8)} are not zero. 𝐒1,2θ{\bf{S}}^{\theta}_{1,2} and 𝐒2,2θ{\bf{S}}^{\theta}_{2,2} are assumed to satisfy

𝚺2,ξξθ\displaystyle{\bf{\Sigma}}^{\theta}_{2,\xi\xi} =𝐒1,2θ𝐒1,2θ=(θ2(9)θ2(10)θ2(10)θ2(11))2++\displaystyle={\bf{S}}^{\theta}_{1,2}{\bf{S}}^{\theta\top}_{1,2}=\begin{pmatrix}\theta_{2}^{(9)}&\theta_{2}^{(10)}\\ \theta_{2}^{(10)}&\theta_{2}^{(11)}\end{pmatrix}\in\mathcal{M}_{2}^{++}

and

𝚺2,δδθ\displaystyle{\bf{\Sigma}}^{\theta}_{2,\delta\delta} =𝐒2,2θ𝐒2,2θ=Diag(θ2(12),θ2(13),θ2(14),θ2(15),θ2(16),θ2(17))6++,\displaystyle={\bf{S}}^{\theta}_{2,2}{\bf{S}}^{\theta\top}_{2,2}={\rm{Diag}}\Bigl{(}\theta_{2}^{(12)},\theta_{2}^{(13)},\theta_{2}^{(14)},\theta_{2}^{(15)},\theta_{2}^{(16)},\theta_{2}^{(17)}\Bigr{)}\in\mathcal{M}_{6}^{++},

where θ2(10)\theta_{2}^{(10)} is not zero. Furthermore, we suppose that 𝐒3,2θ{\bf{S}}^{\theta}_{3,2} and 𝐒4,2θ{\bf{S}}^{\theta}_{4,2} satisfy

𝚺2,εεθ\displaystyle{\bf{\Sigma}}^{\theta}_{2,\varepsilon\varepsilon} =𝐒3,2θ𝐒3,2θ=(θ2(18)00θ2(19))2++\displaystyle={\bf{S}}^{\theta}_{3,2}{\bf{S}}^{\theta\top}_{3,2}=\begin{pmatrix}\theta_{2}^{(18)}&0\\ 0&\theta_{2}^{(19)}\end{pmatrix}\in\mathcal{M}_{2}^{++}

and 𝚺2,ζζθ=(𝐒4,2θ)2=θ2(20)>0{\bf{\Sigma}}^{\theta}_{2,\zeta\zeta}=({\bf{S}}^{\theta}_{4,2})^{2}=\theta_{2}^{(20)}>0. Let

θ2,0=(5,2,0,4,7,2,3,2,1.09,0.70,1.16,9,4,1,4,1,9,1,4,4).\displaystyle\theta_{2,0}=\Bigl{(}5,2,0,4,7,2,3,2,1.09,0.70,1.16,9,4,1,4,1,9,1,4,4\Bigr{)}.

Since 𝚺0=𝚺2(θ2,0){\bf{\Sigma}}_{0}={\bf{\Sigma}}_{2}(\theta_{2,0}), Model 22 is a correctly specified model. In a similar way to the proof of (4.1), we can prove that there exists a constant χ>0\chi>0 such that

Y2(θ2)χ|θ2θ2,0|2\displaystyle{\rm{Y}}_{2}(\theta_{2})\leq-\chi|\theta_{2}-\theta_{2,0}|^{2}

for all θ2Θ2\theta_{2}\in\Theta_{2}. Figure 5 shows the path diagram of Model 22 at time tt.

Refer to caption
Figure 5. The path diagram of Model 2.

4.2.3. Model 3

Set the parameter as θ3Θ317\theta_{3}\in\Theta_{3}\subset\mathbb{R}^{17}. Let p1=6p_{1}=6, p2=2p_{2}=2, k1=1k_{1}=1 and k2=1k_{2}=1. Assume

𝚲3,x1θ\displaystyle{\bf{\Lambda}}^{\theta}_{3,x_{1}} =(1θ3(1)θ3(2)θ3(3)θ3(4)θ3(5))\displaystyle=\begin{pmatrix}1&\theta^{(1)}_{3}&\theta^{(2)}_{3}&\theta^{(3)}_{3}&\theta^{(4)}_{3}&\theta^{(5)}_{3}\end{pmatrix}^{\top}

and

𝚲3,x2θ=(1θ3(6)),𝚪3θ=θ3(7),\displaystyle{\bf{\Lambda}}^{\theta}_{3,x_{2}}=\begin{pmatrix}1&\theta_{3}^{(6)}\end{pmatrix}^{\top},\quad{\bf{\Gamma}}^{\theta}_{3}=\theta_{3}^{(7)},

where θ3(1)\theta^{(1)}_{3}, θ3(2)\theta^{(2)}_{3}, θ3(3)\theta^{(3)}_{3}, θ3(4)\theta^{(4)}_{3}, θ3(5)\theta^{(5)}_{3}, θ3(6)\theta^{(6)}_{3} and θ3(7)\theta^{(7)}_{3} are not zero. We assume that 𝐒1,3θ{\bf{S}}^{\theta}_{1,3} and 𝐒2,3θ{\bf{S}}^{\theta}_{2,3} satisfy 𝚺3,ξξθ=(𝐒1,3θ)2=θ3(8)>0{\bf{\Sigma}}^{\theta}_{3,\xi\xi}=({\bf{S}}^{\theta}_{1,3})^{2}=\theta_{3}^{(8)}>0 and

𝚺3,δδθ\displaystyle{\bf{\Sigma}}^{\theta}_{3,\delta\delta} =𝐒2,3θ𝐒2,3θ=Diag(θ3(9),θ3(10),θ3(11),θ3(12),θ3(13),θ3(14))6++.\displaystyle={\bf{S}}^{\theta}_{2,3}{\bf{S}}^{\theta\top}_{2,3}={\rm{Diag}}\Bigl{(}\theta_{3}^{(9)},\theta_{3}^{(10)},\theta_{3}^{(11)},\theta_{3}^{(12)},\theta_{3}^{(13)},\theta_{3}^{(14)}\Bigr{)}\in\mathcal{M}_{6}^{++}.

Moreover, it is supposed that 𝐒3,3θ{\bf{S}}^{\theta}_{3,3} and 𝐒4,3θ{\bf{S}}^{\theta}_{4,3} satisfy

𝚺3,εεθ\displaystyle{\bf{\Sigma}}^{\theta}_{3,\varepsilon\varepsilon} =𝐒3,3θ𝐒3,3θ=(θ3(15)00θ3(16))2++\displaystyle={\bf{S}}^{\theta}_{3,3}{\bf{S}}^{\theta\top}_{3,3}=\begin{pmatrix}\theta_{3}^{(15)}&0\\ 0&\theta_{3}^{(16)}\end{pmatrix}\in\mathcal{M}_{2}^{++}

and 𝚺3,ζζθ=(𝐒4,3θ)2=θ3(17)>0{\bf{\Sigma}}^{\theta}_{3,\zeta\zeta}=({\bf{S}}^{\theta}_{4,3})^{2}=\theta_{3}^{(17)}>0. For any θ3Θ3\theta_{3}\in\Theta_{3}, one has 𝚺0𝚺3(θ3){\bf{\Sigma}}_{0}\neq{\bf{\Sigma}}_{3}(\theta_{3}), so that Model 33 is a misspecified model. Figure 6 shows the path diagram of Model 33 at time tt.

Refer to caption
Figure 6. The path diagram of Model 3.

4.3. Simulation results

In the simulation, we use optim() with the BFGS method in the R language. The initial value of the optimization is set to the true value θ0\theta_{0}. The number of Monte Carlo iterations is 10,000. Set T=1T=1 and consider the cases n=102,103,104,105n=10^{2},10^{3},10^{4},10^{5}. Table 1 reports the number of times each model is selected by QAIC. Model 3 is never selected, which agrees with Theorem 2 in this example. Furthermore, we see that QAIC is not selection consistent: the over-fitted model (Model 2) is selected with non-negligible probability. This behavior is natural since QAIC chooses the best model in terms of prediction rather than identifying the true model.







n=102n=10^{2} n=103n=10^{3} n=104n=10^{4} n=105n=10^{5}
Model 1 8394 8417 8461 8410
Model 2 1606 1583 1539 1590
Model 3 0 0 0 0
Table 1. The number of models selected by QAIC.
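Schematically, the selection rule behind Table 1 compares the QAIC values of the three fitted models. Assuming the standard quasi-AIC form QAIC_m = -2 H_{m,n}(θ̂_{m,n}) + 2 dim(Θ_m) (the precise definition is given earlier in the paper), a sketch of the selection step reads as follows; the maximized quasi-log-likelihood values below are hypothetical.

```python
import numpy as np

def select_by_qaic(max_quasi_loglik, dims):
    """QAIC_m = -2 H_{m,n}(hat theta_{m,n}) + 2 dim(Theta_m);
    the model attaining the smallest QAIC is chosen."""
    qaic = [-2.0 * h + 2.0 * q for h, q in zip(max_quasi_loglik, dims)]
    return int(np.argmin(qaic)) + 1, qaic  # 1-based model index

# Hypothetical maximized quasi-log-likelihoods for Models 1-3
# (parameter dimensions 19, 20 and 17 as in Section 4.2).
m_hat, qaic = select_by_qaic([-1000.0, -999.5, -1300.0], [19, 20, 17])
```

Note that an over-fitted model (here Model 2) can beat the most parsimonious correctly specified model whenever its quasi-log-likelihood gain exceeds the penalty for the extra parameter, which is why QAIC is not selection consistent.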

5. Proofs

In this section, we may omit the model index “mm”, and we use θ^n\hat{\theta}_{n} instead of θ^n(𝕏n)\hat{\theta}_{n}(\mathbb{X}_{n}). Moreover, we simply write 𝕏1,0,t\mathbb{X}_{1,0,t}, 𝕏2,0,t\mathbb{X}_{2,0,t}, ξ0,t\xi_{0,t}, δ0,t\delta_{0,t}, ε0,t\varepsilon_{0,t} and ζ0,t\zeta_{0,t} as 𝕏1,t\mathbb{X}_{1,t}, 𝕏2,t\mathbb{X}_{2,t}, ξt\xi_{t}, δt\delta_{t}, εt\varepsilon_{t} and ζt\zeta_{t}, respectively. For any process YtY_{t} and 0\ell\geq 0, we set Ri(hn,Y)=R(hn,Ytin){\rm{R}}_{i}(h_{n}^{\ell},Y)={\rm{R}}(h_{n}^{\ell},Y_{t_{i}^{n}}). Without loss of generality, we suppose that T=1T=1. Set

in=σ(W1,s,W2,s,W3,s,W4,s,stin)\displaystyle\mathscr{F}^{n}_{i}=\sigma\bigl{(}W_{1,s},W_{2,s},W_{3,s},W_{4,s},s\leq t_{i}^{n}\bigr{)}

for i=0,,ni=0,\cdots,n. Let

Hn(𝕏n,θ)\displaystyle{\rm{H}}_{n}(\mathbb{X}_{n},\theta) =log(2πhn)np2Ln(𝕏n,θ).\displaystyle=\log(2\pi h_{n})^{\frac{np}{2}}{\rm{L}}_{n}(\mathbb{X}_{n},\theta).
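Up to the additive constant in the display above, H_n(𝕏_n, θ) is the Gaussian quasi-log-likelihood based on the increments, H_n(𝕏_n, θ) = Σ_i { -(1/2) log det Σ(θ) - (Δ_i𝕏)^⊤ Σ(θ)^{-1} (Δ_i𝕏)/(2h_n) }. A direct NumPy transcription (our code; Σ(θ) is passed as a matrix) is:

```python
import numpy as np

def quasi_loglik_H(X, Sigma, T=1.0):
    """Quasi-log-likelihood H_n based on increments Delta_i X = X_{t_i} - X_{t_{i-1}}
    with h_n = T / n (constant terms in 2*pi*h_n dropped, as in the definition of H_n)."""
    n = X.shape[0] - 1
    h = T / n
    dX = np.diff(X, axis=0)                       # n x p matrix of increments
    Sinv = np.linalg.inv(Sigma)
    quad = np.einsum('ij,jk,ik->', dX, Sinv, dX)  # sum_i dX_i' Sigma^{-1} dX_i
    sign, logdet = np.linalg.slogdet(Sigma)
    return -0.5 * n * logdet - quad / (2.0 * h)

# Sanity check: constant data (all increments zero) with Sigma = I_p gives H_n = 0.
Xc = np.ones((11, 3))
H = quasi_loglik_H(Xc, np.eye(3))
```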

Set 𝐈(θ0)=Δ0𝐖(θ0)1Δ0{\bf{I}}(\theta_{0})=\Delta^{\top}_{0}{\bf{W}}(\theta_{0})^{-1}\Delta_{0}, where

𝐖(θ0)=2𝔻p+(𝚺(θ0)𝚺(θ0))𝔻p+.\displaystyle{\bf{W}}(\theta_{0})=2\mathbb{D}^{+}_{p}\bigl{(}{\bf{\Sigma}}(\theta_{0})\otimes{\bf{\Sigma}}(\theta_{0})\bigr{)}\mathbb{D}^{+\top}_{p}.

Define the random field Yn:{\rm{Y}}_{n}:

Yn(𝕏n,θ;θ0)=1n{Hn(𝕏n,θ)Hn(𝕏n,θ0)}.\displaystyle{\rm{Y}}_{n}(\mathbb{X}_{n},\theta;\theta_{0})=\frac{1}{n}\biggl{\{}{\rm{H}}_{n}(\mathbb{X}_{n},\theta)-{\rm{H}}_{n}(\mathbb{X}_{n},\theta_{0})\biggr{\}}.

Let Zn{\rm{Z}}_{n} be the random field as follows:

Zn(𝕏n,u;θ0)=exp{Hn(𝕏n,θ0+1nu)Hn(𝕏n,θ0)}\displaystyle{\rm{Z}}_{n}(\mathbb{X}_{n},u;\theta_{0})=\exp\Biggl{\{}{\rm{H}}_{n}\biggl{(}\mathbb{X}_{n},\theta_{0}+\frac{1}{\sqrt{n}}u\biggr{)}-{\rm{H}}_{n}(\mathbb{X}_{n},\theta_{0})\Biggr{\}}

for u𝕌nu\in\mathbb{U}_{n}, where

𝕌n={uq:θ0+1nuΘ}.\displaystyle{\mathbb{U}}_{n}=\left\{u\in\mathbb{R}^{q}:\ \theta_{0}+\frac{1}{\sqrt{n}}u\in\Theta\right\}.

Set Vn(r)={u𝕌n:r|u|}{\rm{V}}_{n}(r)=\bigl{\{}u\in{\mathbb{U}}_{n}:r\leq|u|\bigr{\}} and u^n=n(θ^nθ0)\hat{u}_{n}=\sqrt{n}(\hat{\theta}_{n}-\theta_{0}). 𝐕𝕏n{\bf{V}}_{\mathbb{X}_{n}} denotes the variance under the law of 𝕏n\mathbb{X}_{n}. Write θ=/θ\partial_{\theta}=\partial/\partial\theta and θ2=θθ\partial^{2}_{\theta}=\partial_{\theta}\partial^{\top}_{\theta}. Define ζ\zeta as a qq-dimensional standard normal random variable.

Lemma 1

Under [A], as nn\longrightarrow\infty,

1nθHn(𝕏n,θ0)\displaystyle\frac{1}{\sqrt{n}}\partial_{\theta}{\rm{H}}_{n}(\mathbb{X}_{n},\theta_{0}) d𝐈(θ0)12ζ\displaystyle\stackrel{{\scriptstyle d}}{{\longrightarrow}}{\bf{I}}(\theta_{0})^{\frac{1}{2}}\zeta

and

1nθ2Hn(𝕏n,θ0)\displaystyle\quad\frac{1}{n}\partial^{2}_{\theta}{\rm{H}}_{n}(\mathbb{X}_{n},\theta_{0}) p𝐈(θ0).\displaystyle\stackrel{{\scriptstyle p}}{{\longrightarrow}}-{\bf{I}}(\theta_{0}).
Lemma 2

Under [A] and [B1], as nn\longrightarrow\infty,

θ^npθ0\displaystyle\hat{\theta}_{n}\stackrel{{\scriptstyle p}}{{\longrightarrow}}\theta_{0}

and

n(θ^nθ0)\displaystyle\sqrt{n}(\hat{\theta}_{n}-\theta_{0}) d𝐈(θ0)12ζ.\displaystyle\stackrel{{\scriptstyle d}}{{\longrightarrow}}{\bf{I}}(\theta_{0})^{-\frac{1}{2}}\zeta.
Proofs of Lemmas 1-2.

In the same way as the proof of Theorem 2 in Kusano and Uchida [13], we can prove the results. See also Appendix 6.1. ∎

In the proofs of Lemmas 3-7, we simply write 𝐄𝕏n{\bf{E}}_{\mathbb{X}_{n}}, 𝐕𝕏n{\bf{V}}_{\mathbb{X}_{n}}, Hn(𝕏n,θ){\rm{H}}_{n}(\mathbb{X}_{n},\theta), Yn(𝕏n,θ;θ0){\rm{Y}}_{n}(\mathbb{X}_{n},\theta;\theta_{0}) and Zn(𝕏n,u;θ0){\rm{Z}}_{n}(\mathbb{X}_{n},u;\theta_{0}) as 𝐄{\bf{E}}, 𝐕{\bf{V}}, Hn(θ){\rm{H}}_{n}(\theta), Yn(θ;θ0){\rm{Y}}_{n}(\theta;\theta_{0}) and Zn(u;θ0){\rm{Z}}_{n}(u;\theta_{0}), respectively.

Lemma 3

Under [A], for all L>1L>1,

𝐄𝕏n[|𝕏tin𝕏ti1n|L]CLhnL2,\displaystyle\qquad\qquad{\bf{E}}_{\mathbb{X}_{n}}\Biggl{[}\Bigl{|}\mathbb{X}_{t_{i}^{n}}-\mathbb{X}_{t_{i-1}^{n}}\Bigr{|}^{L}\Biggr{]}\leq C_{L}h_{n}^{\frac{L}{2}}, (5.1)
𝐄𝕏n[|𝕏tin𝐄𝕏n[𝕏tin|i1n]|L]CLhnL2,\displaystyle{\bf{E}}_{\mathbb{X}_{n}}\Biggl{[}\Bigl{|}\mathbb{X}_{t_{i}^{n}}-{\bf{E}}_{\mathbb{X}_{n}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\Bigr{|}^{L}\Biggr{]}\leq C_{L}h_{n}^{\frac{L}{2}}, (5.2)
𝐄𝕏n[|𝐄𝕏n[𝕏tin|i1n]𝕏ti1n|L]CLhnL\displaystyle{\bf{E}}_{\mathbb{X}_{n}}\Biggl{[}\Bigl{|}{\bf{E}}_{\mathbb{X}_{n}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}-\mathbb{X}_{t_{i-1}^{n}}\Bigr{|}^{L}\Biggr{]}\leq C_{L}h_{n}^{L} (5.3)

and

𝐄𝕏n[|𝐕𝕏n[𝕏tin|i1n]|L]CLhnL.\displaystyle{\bf{E}}_{\mathbb{X}_{n}}\Biggl{[}\Bigl{|}{\bf{V}}_{\mathbb{X}_{n}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\Bigr{|}^{L}\Biggr{]}\leq C_{L}h_{n}^{L}. (5.4)
Proof.

First, we prove (5.1). Lemmas 14-15 in Kusano and Uchida [14] imply

𝐄[Δi𝕏1(j)|i1n]=𝐄[Ai,n(j)|i1n]+𝐄[Bi,n(j)|i1n]=Ri1(hn,ξ)+Ri1(hn,δ)\displaystyle\begin{split}{\bf{E}}\biggl{[}\Delta_{i}\mathbb{X}^{(j)}_{1}\big{|}\mathscr{F}^{n}_{i-1}\biggr{]}&={\bf{E}}\biggl{[}A^{(j)}_{i,n}\big{|}\mathscr{F}^{n}_{i-1}\biggr{]}+{\bf{E}}\biggl{[}B^{(j)}_{i,n}\big{|}\mathscr{F}^{n}_{i-1}\biggr{]}\\ &={\rm{R}}_{i-1}(h_{n},\xi)+{\rm{R}}_{i-1}(h_{n},\delta)\end{split} (5.5)

for j=1,,p1j=1,\cdots,p_{1}, where

Ai,n=𝚲x1,0Δiξ,Bi,n=Δiδ.\displaystyle A_{i,n}={\bf{\Lambda}}_{x_{1},0}\Delta_{i}\xi,\quad B_{i,n}=\Delta_{i}\delta.

Since

𝕏2,t\displaystyle\mathbb{X}_{2,t} =𝚲x2,0𝚿01𝚪0ξt+𝚲x2,0𝚿01ζt+εt,\displaystyle={\bf{\Lambda}}_{x_{2},0}{\bf{\Psi}}_{0}^{-1}{\bf{\Gamma}}_{0}\xi_{t}+{\bf{\Lambda}}_{x_{2},0}{\bf{\Psi}}_{0}^{-1}\zeta_{t}+\varepsilon_{t},

it follows from Lemmas 16-18 in Kusano and Uchida [14] that

𝐄[Δi𝕏2(k)|i1n]=𝐄[Ci,n(k)|i1n]+𝐄[Di,n(k)|i1n]+𝐄[Ei,n(k)|i1n]=Ri1(hn,ξ)+Ri1(hn,ε)+Ri1(hn,ζ)\displaystyle\begin{split}{\bf{E}}\biggl{[}\Delta_{i}\mathbb{X}^{(k)}_{2}\big{|}\mathscr{F}^{n}_{i-1}\biggr{]}&={\bf{E}}\biggl{[}C^{(k)}_{i,n}\big{|}\mathscr{F}^{n}_{i-1}\biggr{]}+{\bf{E}}\biggl{[}D^{(k)}_{i,n}\big{|}\mathscr{F}^{n}_{i-1}\biggr{]}+{\bf{E}}\biggl{[}E^{(k)}_{i,n}\big{|}\mathscr{F}^{n}_{i-1}\biggr{]}\\ &={\rm{R}}_{i-1}(h_{n},\xi)+{\rm{R}}_{i-1}(h_{n},\varepsilon)+{\rm{R}}_{i-1}(h_{n},\zeta)\end{split} (5.6)

for k=1,,p2k=1,\cdots,p_{2}, where

Ci,n=𝚲x2,0𝚿01𝚪0Δiξ,Di,n=𝚲x2,0𝚿01Δiζ,Ei,n=Δiε.\displaystyle C_{i,n}={\bf{\Lambda}}_{x_{2},0}{\bf{\Psi}}_{0}^{-1}{\bf{\Gamma}}_{0}\Delta_{i}\xi,\quad D_{i,n}={\bf{\Lambda}}_{x_{2},0}{\bf{\Psi}}_{0}^{-1}\Delta_{i}\zeta,\quad E_{i,n}=\Delta_{i}\varepsilon.

Lemma 20 in Kusano and Uchida [14] shows

𝐄[|Δi𝕏1(j)|L|i1n]\displaystyle{\bf{E}}\biggl{[}\bigl{|}\Delta_{i}\mathbb{X}^{(j)}_{1}\bigr{|}^{L}\big{|}\mathscr{F}^{n}_{i-1}\biggr{]} CL𝐄[|Ai,n(j)|L|i1n]+CL𝐄[|Bi,n(j)|L|i1n]\displaystyle\leq C_{L}{\bf{E}}\biggl{[}\bigl{|}A^{(j)}_{i,n}\bigr{|}^{L}\big{|}\mathscr{F}^{n}_{i-1}\biggr{]}+C_{L}{\bf{E}}\biggl{[}\bigl{|}B^{(j)}_{i,n}\bigr{|}^{L}\big{|}\mathscr{F}^{n}_{i-1}\biggr{]}
CLRi1(hnL2,ξ)+CLRi1(hnL2,δ)\displaystyle\leq C_{L}{\rm{R}}_{i-1}(h_{n}^{\frac{L}{2}},\xi)+C_{L}{\rm{R}}_{i-1}(h_{n}^{\frac{L}{2}},\delta)

for all L>1L>1, so that

𝐄[|Δi𝕏1(j)|L]=𝐄[𝐄[|Δi𝕏1(j)|L|i1n]]CL𝐄[Ri1(hnL2,ξ)]+CL𝐄[Ri1(hnL2,δ)]CLhnL2.\displaystyle\begin{split}{\bf{E}}\biggl{[}\bigl{|}\Delta_{i}\mathbb{X}^{(j)}_{1}\bigr{|}^{L}\biggr{]}&={\bf{E}}\Biggl{[}{\bf{E}}\biggl{[}\bigl{|}\Delta_{i}\mathbb{X}^{(j)}_{1}\bigr{|}^{L}\big{|}\mathscr{F}^{n}_{i-1}\biggr{]}\Biggr{]}\\ &\leq C_{L}{\bf{E}}\biggl{[}{\rm{R}}_{i-1}(h_{n}^{\frac{L}{2}},\xi)\biggr{]}+C_{L}{\bf{E}}\biggl{[}{\rm{R}}_{i-1}(h_{n}^{\frac{L}{2}},\delta)\biggr{]}\leq C_{L}h_{n}^{\frac{L}{2}}.\end{split} (5.7)

Similarly, we see from Lemma 20 in Kusano and Uchida [14] that

𝐄[|Δi𝕏2(k)|L]CLhnL2\displaystyle\begin{split}{\bf{E}}\biggl{[}\bigl{|}\Delta_{i}\mathbb{X}^{(k)}_{2}\bigr{|}^{L}\biggr{]}\leq C_{L}h_{n}^{\frac{L}{2}}\end{split} (5.8)

for any L>1L>1. Thus, it holds from (5.7) and (5.8) that for all L>1L>1,

𝐄[|Δi𝕏|L]\displaystyle{\bf{E}}\biggl{[}\bigl{|}\Delta_{i}\mathbb{X}\bigr{|}^{L}\biggr{]} CL=1p𝐄[|Δi𝕏()|L]\displaystyle\leq C_{L}\sum_{\ell=1}^{p}{\bf{E}}\biggl{[}\bigl{|}\Delta_{i}\mathbb{X}^{(\ell)}\bigr{|}^{L}\biggr{]}
=CLj=1p1𝐄[|Δi𝕏1(j)|L]+CLk=1p2𝐄[|Δi𝕏2(k)|L]CLhnL2,\displaystyle=C_{L}\sum_{j=1}^{p_{1}}{\bf{E}}\biggl{[}\bigl{|}\Delta_{i}\mathbb{X}^{(j)}_{1}\bigr{|}^{L}\biggr{]}+C_{L}\sum_{k=1}^{p_{2}}{\bf{E}}\biggl{[}\bigl{|}\Delta_{i}\mathbb{X}^{(k)}_{2}\bigr{|}^{L}\biggr{]}\leq C_{L}h_{n}^{\frac{L}{2}},

which yields (5.1). Using (5.5) and (5.7), one gets

𝐄[|𝕏1,tin(j)𝐄[𝕏1,tin(j)|i1n]|L]=𝐄[|Δi𝕏1(j)Ri1(hn,ξ)Ri1(hn,δ)|L]CL𝐄[|Δi𝕏1(j)|L]+CL𝐄[|Ri1(hn,ξ)|L]+CL𝐄[|Ri1(hn,δ)|L]CL(hnL2+hnL+hnL)CLhnL2\displaystyle\begin{split}&\quad\ {\bf{E}}\Biggl{[}\Bigl{|}\mathbb{X}^{(j)}_{1,t_{i}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}^{(j)}_{1,t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\Bigr{|}^{L}\Biggr{]}\\ &={\bf{E}}\Biggl{[}\Bigl{|}\Delta_{i}\mathbb{X}^{(j)}_{1}-{\rm{R}}_{i-1}(h_{n},\xi)-{\rm{R}}_{i-1}(h_{n},\delta)\Bigr{|}^{L}\Biggr{]}\\ &\leq C_{L}{\bf{E}}\biggl{[}\bigl{|}\Delta_{i}\mathbb{X}^{(j)}_{1}\bigr{|}^{L}\biggr{]}+C_{L}{\bf{E}}\biggl{[}\bigl{|}{\rm{R}}_{i-1}(h_{n},\xi)\bigr{|}^{L}\biggr{]}+C_{L}{\bf{E}}\biggl{[}\bigl{|}{\rm{R}}_{i-1}(h_{n},\delta)\bigr{|}^{L}\biggr{]}\\ &\leq C_{L}\Bigl{(}h_{n}^{\frac{L}{2}}+h_{n}^{L}+h_{n}^{L}\Bigr{)}\\ &\leq C_{L}h_{n}^{\frac{L}{2}}\end{split} (5.9)

for all L>1L>1. In an analogous manner, (5.6) and (5.8) yield

𝐄[|𝕏2,tin(k)𝐄[𝕏2,tin(k)|i1n]|L]CLhnL2\displaystyle\begin{split}{\bf{E}}\Biggl{[}\Bigl{|}\mathbb{X}^{(k)}_{2,t_{i}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}^{(k)}_{2,t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\Bigr{|}^{L}\Biggr{]}\leq C_{L}h_{n}^{\frac{L}{2}}\end{split} (5.10)

for any L>1L>1. Consequently, we see from (5.9) and (5.10) that

𝐄[|𝕏tin𝐄[𝕏tin|i1n]|L]\displaystyle{\bf{E}}\Biggl{[}\Bigl{|}\mathbb{X}_{t_{i}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\Bigr{|}^{L}\Biggr{]} CL=1p𝐄[|𝕏tin()𝐄[𝕏tin()|i1n]|L]\displaystyle\leq C_{L}\sum_{\ell=1}^{p}{\bf{E}}\Biggl{[}\Bigl{|}\mathbb{X}^{(\ell)}_{t_{i}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}^{(\ell)}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\Bigr{|}^{L}\Biggr{]}
=CLj=1p1𝐄[|𝕏1,tin(j)𝐄[𝕏1,tin(j)|i1n]|L]\displaystyle=C_{L}\sum_{j=1}^{p_{1}}{\bf{E}}\Biggl{[}\Bigl{|}\mathbb{X}^{(j)}_{1,t_{i}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}^{(j)}_{1,t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\Bigr{|}^{L}\Biggr{]}
+CLk=1p2𝐄[|𝕏2,tin(k)𝐄[𝕏2,tin(k)|i1n]|L]\displaystyle\quad+C_{L}\sum_{k=1}^{p_{2}}{\bf{E}}\Biggl{[}\Bigl{|}\mathbb{X}^{(k)}_{2,t_{i}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}^{(k)}_{2,t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\Bigr{|}^{L}\Biggr{]}
CLhnL2\displaystyle\leq C_{L}h_{n}^{\frac{L}{2}}

for all L>1L>1, which yields (5.2). It follows from (5.5) that

𝐄[|𝐄[𝕏1,tin(j)|i1n]𝕏1,ti1n(j)|L]𝐄[|Ri1(hn,ξ)+Ri1(hn,δ)|L]CL𝐄[|Ri1(hn,ξ)|L]+CL𝐄[|Ri1(hn,δ)|L]CLhnL\displaystyle\begin{split}{\bf{E}}\Biggl{[}\Bigl{|}{\bf{E}}\Bigl{[}\mathbb{X}^{(j)}_{1,t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}-\mathbb{X}^{(j)}_{1,t_{i-1}^{n}}\Bigr{|}^{L}\Biggr{]}&\leq{\bf{E}}\Biggl{[}\Bigl{|}{\rm{R}}_{i-1}(h_{n},\xi)+{\rm{R}}_{i-1}(h_{n},\delta)\Bigr{|}^{L}\Biggr{]}\\ &\leq C_{L}{\bf{E}}\biggl{[}\bigl{|}{\rm{R}}_{i-1}(h_{n},\xi)\bigr{|}^{L}\biggr{]}+C_{L}{\bf{E}}\biggl{[}\bigl{|}{\rm{R}}_{i-1}(h_{n},\delta)\bigr{|}^{L}\biggr{]}\\ &\leq C_{L}h_{n}^{L}\end{split} (5.11)

for any L>1L>1. In a similar way, (5.6) implies

𝐄[|𝐄[𝕏2,tin(k)|i1n]𝕏2,ti1n(k)|L]\displaystyle{\bf{E}}\Biggl{[}\Bigl{|}{\bf{E}}\Bigl{[}\mathbb{X}^{(k)}_{2,t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}-\mathbb{X}^{(k)}_{2,t_{i-1}^{n}}\Bigr{|}^{L}\Biggr{]} CLhnL\displaystyle\leq C_{L}h_{n}^{L} (5.12)

for all L>1L>1. Hence, it holds from (5.11) and (5.12) that

𝐄[|𝐄[𝕏tin|i1n]𝕏ti1n|L]\displaystyle{\bf{E}}\Biggl{[}\Bigl{|}{\bf{E}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}-\mathbb{X}_{t_{i-1}^{n}}\Bigr{|}^{L}\Biggr{]} CL=1p𝐄[|𝐄[𝕏tin()|i1n]𝕏ti1n()|L]\displaystyle\leq C_{L}\sum_{\ell=1}^{p}{\bf{E}}\Biggl{[}\Bigl{|}{\bf{E}}\Bigl{[}\mathbb{X}^{(\ell)}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}-\mathbb{X}^{(\ell)}_{t_{i-1}^{n}}\Bigr{|}^{L}\Biggr{]}
=CLj=1p1𝐄[|𝐄[𝕏1,tin(j)|i1n]𝕏1,ti1n(j)|L]\displaystyle=C_{L}\sum_{j=1}^{p_{1}}{\bf{E}}\Biggl{[}\Bigl{|}{\bf{E}}\Bigl{[}\mathbb{X}^{(j)}_{1,t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}-\mathbb{X}^{(j)}_{1,t_{i-1}^{n}}\Bigr{|}^{L}\Biggr{]}
+CLk=1p2𝐄[|𝐄[𝕏2,tin(k)|i1n]𝕏2,ti1n(k)|L]\displaystyle\quad+C_{L}\sum_{k=1}^{p_{2}}{\bf{E}}\Biggl{[}\Bigl{|}{\bf{E}}\Bigl{[}\mathbb{X}^{(k)}_{2,t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}-\mathbb{X}^{(k)}_{2,t_{i-1}^{n}}\Bigr{|}^{L}\Biggr{]}
CLhnL\displaystyle\leq C_{L}h_{n}^{L}

for all L>1L>1, so that (5.3) holds. Next, we consider (5.4). Since it follows from (5.5) and Lemma 21 in Kusano and Uchida [14] that

𝐄[(𝕏1,tin(j1)𝐄[𝕏1,tin(j1)|i1n])(𝕏1,tin(j2)𝐄[𝕏1,tin(j2)|i1n])|i1n]\displaystyle\quad\ {\bf{E}}\Biggl{[}\biggl{(}\mathbb{X}^{(j_{1})}_{1,t_{i}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}^{(j_{1})}_{1,t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\biggr{)}\biggl{(}\mathbb{X}^{(j_{2})}_{1,t_{i}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}^{(j_{2})}_{1,t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\biggr{)}\Big{|}\mathscr{F}^{n}_{i-1}\Biggr{]}
=𝐄[(𝕏1,tin(j1)𝕏1,ti1n(j1)Ri1(hn,ξ)Ri1(hn,δ))\displaystyle={\bf{E}}\Biggl{[}\biggl{(}\mathbb{X}^{(j_{1})}_{1,t_{i}^{n}}-\mathbb{X}^{(j_{1})}_{1,t_{i-1}^{n}}-{\rm{R}}_{i-1}(h_{n},\xi)-{\rm{R}}_{i-1}(h_{n},\delta)\biggr{)}
×(𝕏1,tin(j2)𝕏1,ti1n(j2)Ri1(hn,ξ)Ri1(hn,δ))|i1n]\displaystyle\qquad\qquad\qquad\qquad\times\biggl{(}\mathbb{X}^{(j_{2})}_{1,t_{i}^{n}}-\mathbb{X}^{(j_{2})}_{1,t_{i-1}^{n}}-{\rm{R}}_{i-1}(h_{n},\xi)-{\rm{R}}_{i-1}(h_{n},\delta)\biggr{)}\Big{|}\mathscr{F}^{n}_{i-1}\Biggr{]}
=𝐄[(Δi𝕏1(j1))(Δi𝕏1(j2))|i1n]\displaystyle={\bf{E}}\Biggl{[}\Bigl{(}\Delta_{i}\mathbb{X}^{(j_{1})}_{1}\Bigr{)}\Bigl{(}\Delta_{i}\mathbb{X}^{(j_{2})}_{1}\Bigr{)}\Big{|}\mathscr{F}^{n}_{i-1}\Biggr{]}
+Ri1(hn,ξ)𝐄[Δi𝕏1(j1)|i1n]+Ri1(hn,δ)𝐄[Δi𝕏1(j1)|i1n]\displaystyle\quad+{\rm{R}}_{i-1}(h_{n},\xi){\bf{E}}\biggl{[}\Delta_{i}\mathbb{X}^{(j_{1})}_{1}\big{|}\mathscr{F}^{n}_{i-1}\biggr{]}+{\rm{R}}_{i-1}(h_{n},\delta){\bf{E}}\biggl{[}\Delta_{i}\mathbb{X}^{(j_{1})}_{1}\big{|}\mathscr{F}^{n}_{i-1}\biggr{]}
+Ri1(hn,ξ)𝐄[Δi𝕏1(j2)|i1n]+Ri1(hn,δ)𝐄[Δi𝕏1(j2)|i1n]\displaystyle\quad+{\rm{R}}_{i-1}(h_{n},\xi){\bf{E}}\biggl{[}\Delta_{i}\mathbb{X}^{(j_{2})}_{1}\big{|}\mathscr{F}^{n}_{i-1}\biggr{]}+{\rm{R}}_{i-1}(h_{n},\delta){\bf{E}}\biggl{[}\Delta_{i}\mathbb{X}^{(j_{2})}_{1}\big{|}\mathscr{F}^{n}_{i-1}\biggr{]}
+Ri1(hn2,ξ)+Ri1(hn2,δ)+Ri1(hn,ξ)Ri1(hn,δ)\displaystyle\quad+{\rm{R}}_{i-1}(h_{n}^{2},\xi)+{\rm{R}}_{i-1}(h_{n}^{2},\delta)+{\rm{R}}_{i-1}(h_{n},\xi){\rm{R}}_{i-1}(h_{n},\delta)
=hn𝚺11(θ0)j1j2+Ri1(hn2,ξ)+Ri1(hn2,δ)+Ri1(hn,ξ)Ri1(hn,δ)\displaystyle=h_{n}{\bf{\Sigma}}^{11}(\theta_{0})_{j_{1}j_{2}}+{\rm{R}}_{i-1}(h_{n}^{2},\xi)+{\rm{R}}_{i-1}(h_{n}^{2},\delta)+{\rm{R}}_{i-1}(h_{n},\xi){\rm{R}}_{i-1}(h_{n},\delta)

for j1,j2=1,,p1j_{1},j_{2}=1,\cdots,p_{1}, we see

𝐄[|𝐄[(𝕏1,tin(j1)𝐄[𝕏1,tin(j1)|i1n])(𝕏1,tin(j2)𝐄[𝕏1,tin(j2)|i1n])|i1n]|L]\displaystyle\quad\ {\bf{E}}\left[\left|{\bf{E}}\Biggl{[}\biggl{(}\mathbb{X}^{(j_{1})}_{1,t_{i}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}^{(j_{1})}_{1,t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\biggr{)}\biggl{(}\mathbb{X}^{(j_{2})}_{1,t_{i}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}^{(j_{2})}_{1,t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\biggr{)}\Big{|}\mathscr{F}^{n}_{i-1}\Biggr{]}\right|^{L}\right]
CLhnL|𝚺11(θ0)j1j2|L+CL𝐄[|Ri1(hn2,ξ)|L]+CL𝐄[|Ri1(hn2,δ)|L]\displaystyle\leq C_{L}h_{n}^{L}\Bigl{|}{\bf{\Sigma}}^{11}(\theta_{0})_{j_{1}j_{2}}\Bigr{|}^{L}+C_{L}{\bf{E}}\biggl{[}\bigl{|}{\rm{R}}_{i-1}(h_{n}^{2},\xi)\bigr{|}^{L}\biggr{]}+C_{L}{\bf{E}}\biggl{[}\bigl{|}{\rm{R}}_{i-1}(h_{n}^{2},\delta)\bigr{|}^{L}\biggr{]}
+CL𝐄[|Ri1(hn,ξ)Ri1(hn,δ)|L]CLhnL\displaystyle\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad+C_{L}{\bf{E}}\biggl{[}\bigl{|}{\rm{R}}_{i-1}(h_{n},\xi){\rm{R}}_{i-1}(h_{n},\delta)\bigr{|}^{L}\biggr{]}\leq C_{L}h_{n}^{L}

for any L>1L>1. In an analogous manner, one has

𝐄[(𝕏1,tin(j)𝐄[𝕏1,tin(j)|i1n])(𝕏2,tin(k)𝐄[𝕏2,tin(k)|i1n])|i1n]\displaystyle\quad\ {\bf{E}}\Biggl{[}\biggl{(}\mathbb{X}^{(j)}_{1,t_{i}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}^{(j)}_{1,t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\biggr{)}\biggl{(}\mathbb{X}^{(k)}_{2,t_{i}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}^{(k)}_{2,t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\biggr{)}\Big{|}\mathscr{F}^{n}_{i-1}\Biggr{]}
=hn𝚺12(θ0)jk+Ri1(hn2,ξ)+Ri1(hn,ξ)Ri1(hn,δ)+Ri1(hn,ξ)Ri1(hn,ε)\displaystyle=h_{n}{\bf{\Sigma}}^{12}(\theta_{0})_{jk}+{\rm{R}}_{i-1}(h_{n}^{2},\xi)+{\rm{R}}_{i-1}(h_{n},\xi){\rm{R}}_{i-1}(h_{n},\delta)+{\rm{R}}_{i-1}(h_{n},\xi){\rm{R}}_{i-1}(h_{n},\varepsilon)
+Ri1(hn,ξ)Ri1(hn,ζ)+Ri1(hn,δ)Ri1(hn,ε)+Ri1(hn,δ)Ri1(hn,ζ)\displaystyle\quad+{\rm{R}}_{i-1}(h_{n},\xi){\rm{R}}_{i-1}(h_{n},\zeta)+{\rm{R}}_{i-1}(h_{n},\delta){\rm{R}}_{i-1}(h_{n},\varepsilon)+{\rm{R}}_{i-1}(h_{n},\delta){\rm{R}}_{i-1}(h_{n},\zeta)

for j=1,,p1j=1,\cdots,p_{1} and k=1,,p2k=1,\cdots,p_{2}, and

𝐄[(𝕏2,tin(k1)𝐄[𝕏2,tin(k1)|i1n])(𝕏2,tin(k2)𝐄[𝕏2,tin(k2)|i1n])|i1n]\displaystyle\quad\ {\bf{E}}\Biggl{[}\biggl{(}\mathbb{X}^{(k_{1})}_{2,t_{i}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}^{(k_{1})}_{2,t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\biggr{)}\biggl{(}\mathbb{X}^{(k_{2})}_{2,t_{i}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}^{(k_{2})}_{2,t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\biggr{)}\Big{|}\mathscr{F}^{n}_{i-1}\Biggr{]}
=hn𝚺22(θ0)k1k2+Ri1(hn2,ξ)+Ri1(hn2,ε)+Ri1(hn2,ζ)\displaystyle=h_{n}{\bf{\Sigma}}^{22}(\theta_{0})_{k_{1}k_{2}}+{\rm{R}}_{i-1}(h_{n}^{2},\xi)+{\rm{R}}_{i-1}(h_{n}^{2},\varepsilon)+{\rm{R}}_{i-1}(h_{n}^{2},\zeta)
+Ri1(hn,ξ)Ri1(hn,ε)+Ri1(hn,ξ)Ri1(hn,ζ)+Ri1(hn,ε)Ri1(hn,ζ)\displaystyle\quad+{\rm{R}}_{i-1}(h_{n},\xi){\rm{R}}_{i-1}(h_{n},\varepsilon)+{\rm{R}}_{i-1}(h_{n},\xi){\rm{R}}_{i-1}(h_{n},\zeta)+{\rm{R}}_{i-1}(h_{n},\varepsilon){\rm{R}}_{i-1}(h_{n},\zeta)

for k1,k2=1,,p2k_{1},k_{2}=1,\cdots,p_{2}, so that we get

𝐄[|𝐄[(𝕏1,tin(j)𝐄[𝕏1,tin(j)|i1n])(𝕏2,tin(k)𝐄[𝕏2,tin(k)|i1n])|i1n]|L]CLhnL\displaystyle{\bf{E}}\left[\left|{\bf{E}}\Biggl{[}\biggl{(}\mathbb{X}^{(j)}_{1,t_{i}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}^{(j)}_{1,t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\biggr{)}\biggl{(}\mathbb{X}^{(k)}_{2,t_{i}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}^{(k)}_{2,t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\biggr{)}\Big{|}\mathscr{F}^{n}_{i-1}\Biggr{]}\right|^{L}\right]\leq C_{L}h_{n}^{L}

and

𝐄[|𝐄[(𝕏2,tin(k1)𝐄[𝕏2,tin(k1)|i1n])(𝕏2,tin(k2)𝐄[𝕏2,tin(k2)|i1n])|i1n]|L]CLhnL\displaystyle{\bf{E}}\left[\left|{\bf{E}}\Biggl{[}\biggl{(}\mathbb{X}^{(k_{1})}_{2,t_{i}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}^{(k_{1})}_{2,t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\biggr{)}\biggl{(}\mathbb{X}^{(k_{2})}_{2,t_{i}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}^{(k_{2})}_{2,t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\biggr{)}\Big{|}\mathscr{F}^{n}_{i-1}\Biggr{]}\right|^{L}\right]\leq C_{L}h_{n}^{L}

for all L>1L>1. Therefore, it is shown that

𝐄[|𝐕[𝕏tin|i1n]|L]\displaystyle\quad\ {\bf{E}}\Biggl{[}\Bigl{|}{\bf{V}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\Bigr{|}^{L}\Biggr{]}
CLj1=1p1j2=1p1𝐄[|𝐄[(𝕏1,tin(j1)𝐄[𝕏1,tin(j1)|i1n])(𝕏1,tin(j2)𝐄[𝕏1,tin(j2)|i1n])|i1n]|L]\displaystyle\leq C_{L}\sum_{j_{1}=1}^{p_{1}}\sum_{j_{2}=1}^{p_{1}}{\bf{E}}\left[\left|{\bf{E}}\Biggl{[}\biggl{(}\mathbb{X}^{(j_{1})}_{1,t_{i}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}^{(j_{1})}_{1,t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\biggr{)}\biggl{(}\mathbb{X}^{(j_{2})}_{1,t_{i}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}^{(j_{2})}_{1,t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\biggr{)}\Big{|}\mathscr{F}^{n}_{i-1}\Biggr{]}\right|^{L}\right]
+CLj=1p1k=1p2𝐄[|𝐄[(𝕏(j)1,tin𝐄[𝕏(j)1,tin|ni1])(𝕏(k)2,tin𝐄[𝕏(k)2,tin|ni1])|ni1]|L]\displaystyle\quad+C_{L}\sum_{j=1}^{p_{1}}\sum_{k=1}^{p_{2}}{\bf{E}}\left[\left|{\bf{E}}\Biggl{[}\biggl{(}\mathbb{X}^{(j)}_{1,t_{i}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}^{(j)}_{1,t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\biggr{)}\biggl{(}\mathbb{X}^{(k)}_{2,t_{i}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}^{(k)}_{2,t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\biggr{)}\Big{|}\mathscr{F}^{n}_{i-1}\Biggr{]}\right|^{L}\right]
+CLk1=1p2k2=1p2𝐄[|𝐄[(𝕏(k1)2,tin𝐄[𝕏(k1)2,tin|ni1])(𝕏(k2)2,tin𝐄[𝕏(k2)2,tin|ni1])|ni1]|L]\displaystyle\quad+C_{L}\sum_{k_{1}=1}^{p_{2}}\sum_{k_{2}=1}^{p_{2}}{\bf{E}}\left[\left|{\bf{E}}\Biggl{[}\biggl{(}\mathbb{X}^{(k_{1})}_{2,t_{i}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}^{(k_{1})}_{2,t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\biggr{)}\biggl{(}\mathbb{X}^{(k_{2})}_{2,t_{i}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}^{(k_{2})}_{2,t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\biggr{)}\Big{|}\mathscr{F}^{n}_{i-1}\Biggr{]}\right|^{L}\right]
CLhnL\displaystyle\leq C_{L}h_{n}^{L}

for any L>1L>1, which yields (5.4). ∎

Lemma 4

Under [A], for all L>0L>0,

supn𝐄𝕏n[|1nθ(j)Hn(𝕏n,θ0)|L]<\displaystyle\sup_{n\in\mathbb{N}}{\bf{E}}_{\mathbb{X}_{n}}\Biggl{[}\biggl{|}\frac{1}{\sqrt{n}}\partial_{\theta^{(j)}}{\rm{H}}_{n}(\mathbb{X}_{n},\theta_{0})\biggr{|}^{L}\Biggr{]}<\infty

for j=1,,qj=1,\cdots,q.

Proof of Lemma 4.

Note that

θ(j)Hn(θ)\displaystyle\partial_{\theta^{(j)}}{\rm{H}}_{n}(\theta) =12hni=1n(θ(j)𝚺(θ)1)[(Δi𝕏)2]n2θ(j)logdet𝚺(θ)\displaystyle=-\frac{1}{2h_{n}}\sum_{i=1}^{n}\Bigl{(}\partial_{\theta^{(j)}}{\bf{\Sigma}}(\theta)^{-1}\Bigr{)}\Bigl{[}\bigl{(}\Delta_{i}\mathbb{X}\bigr{)}^{\otimes 2}\Bigr{]}-\frac{n}{2}\partial_{\theta^{(j)}}\log\mathop{\rm det}\nolimits{\bf{\Sigma}}(\theta)
=12hni=1n(𝚺(θ)1)(θ(j)𝚺(θ))(𝚺(θ)1)[(Δi𝕏)2]n2(𝚺(θ)1)[θ(j)𝚺(θ)]\displaystyle=\frac{1}{2h_{n}}\sum_{i=1}^{n}\Bigl{(}{\bf{\Sigma}}(\theta)^{-1}\Bigr{)}\Bigl{(}\partial_{\theta^{(j)}}{\bf{\Sigma}}(\theta)\Bigr{)}\Bigl{(}{\bf{\Sigma}}(\theta)^{-1}\Bigr{)}\Bigl{[}\bigl{(}\Delta_{i}\mathbb{X}\bigr{)}^{\otimes 2}\Bigr{]}-\frac{n}{2}\Bigl{(}{\bf{\Sigma}}(\theta)^{-1}\Bigr{)}\Bigl{[}\partial_{\theta^{(j)}}{\bf{\Sigma}}(\theta)\Bigr{]}

for j=1,,qj=1,\cdots,q. Since

(Δi𝕏)2\displaystyle\Bigl{(}\Delta_{i}\mathbb{X}\Bigr{)}^{\otimes 2} =(𝕏tin𝐄[𝕏tin|ni1])2+(𝐄[𝕏tin|ni1]𝕏ti1n)2\displaystyle=\biggl{(}\mathbb{X}_{t_{i}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\biggr{)}^{\otimes 2}+\biggl{(}{\bf{E}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}-\mathbb{X}_{t_{i-1}^{n}}\biggr{)}^{\otimes 2}
+(𝕏tin𝐄[𝕏tin|ni1])(𝐄[𝕏tin|ni1]𝕏ti1n)\displaystyle\qquad\qquad+\biggl{(}\mathbb{X}_{t_{i}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\biggr{)}\biggl{(}{\bf{E}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}-\mathbb{X}_{t_{i-1}^{n}}\biggr{)}^{\top}
+(𝐄[𝕏tin|ni1]𝕏ti1n)(𝕏tin𝐄[𝕏tin|ni1]),\displaystyle\qquad\qquad+\biggl{(}{\bf{E}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}-\mathbb{X}_{t_{i-1}^{n}}\biggr{)}\biggl{(}\mathbb{X}_{t_{i}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\biggr{)}^{\top},

we have

1nθ(j)Hn(θ0)\displaystyle\quad\ \frac{1}{\sqrt{n}}\partial_{\theta^{(j)}}{\rm{H}}_{n}(\theta_{0})
=12n12hni=1n{(𝚺(θ0)1)(θ(j)𝚺(θ0))(𝚺(θ0)1)[(Δi𝕏)2]hn(𝚺(θ0)1)[θ(j)𝚺(θ0)]}\displaystyle=\frac{1}{2n^{\frac{1}{2}}h_{n}}\sum_{i=1}^{n}\Biggl{\{}\Bigl{(}{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Bigl{(}\partial_{\theta^{(j)}}{\bf{\Sigma}}(\theta_{0})\Bigr{)}\Bigl{(}{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Bigl{[}\bigl{(}\Delta_{i}\mathbb{X}\bigr{)}^{\otimes 2}\Bigr{]}-h_{n}\Bigl{(}{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Bigl{[}\partial_{\theta^{(j)}}{\bf{\Sigma}}(\theta_{0})\Bigr{]}\Biggr{\}}
=12n12hni=1n(𝚺(θ0)1)(θ(j)𝚺(θ0))(𝚺(θ0)1)[(Δi𝕏)2hn𝚺(θ0)]\displaystyle=\frac{1}{2n^{\frac{1}{2}}h_{n}}\sum_{i=1}^{n}\Bigl{(}{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Bigl{(}\partial_{\theta^{(j)}}{\bf{\Sigma}}(\theta_{0})\Bigr{)}\Bigl{(}{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Bigl{[}\bigl{(}\Delta_{i}\mathbb{X}\bigr{)}^{\otimes 2}-h_{n}{\bf{\Sigma}}(\theta_{0})\Bigr{]}
=𝐌n(j)+𝐑n(j)\displaystyle={\bf{M}}_{n}^{(j)}+{\bf{R}}_{n}^{(j)}

for j=1,,qj=1,\cdots,q, where

𝐌n(j)\displaystyle{\bf{M}}_{n}^{(j)} =12n12hni=1n(𝚺(θ0)1)(θ(j)𝚺(θ0))(𝚺(θ0)1)\displaystyle=\frac{1}{2n^{\frac{1}{2}}h_{n}}\sum_{i=1}^{n}\Bigl{(}{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Bigl{(}\partial_{\theta^{(j)}}{\bf{\Sigma}}(\theta_{0})\Bigr{)}\Bigl{(}{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}
[(𝕏tin𝐄[𝕏tin|ni1])2𝐕[𝕏tin|ni1]]\displaystyle\qquad\qquad\qquad\qquad\qquad\qquad\Biggl{[}\biggl{(}\mathbb{X}_{t_{i}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\biggr{)}^{\otimes 2}-{\bf{V}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\Biggr{]}
+1n12hni=1n(𝚺(θ0)1)(θ(j)𝚺(θ0))(𝚺(θ0)1)\displaystyle\quad+\frac{1}{n^{\frac{1}{2}}h_{n}}\sum_{i=1}^{n}\Bigl{(}{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Bigl{(}\partial_{\theta^{(j)}}{\bf{\Sigma}}(\theta_{0})\Bigr{)}\Bigl{(}{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}
[𝕏tin𝐄[𝕏tin|ni1],𝐄[𝕏tin|ni1]𝕏ti1n]\displaystyle\qquad\qquad\qquad\qquad\qquad\qquad\Biggl{[}\mathbb{X}_{t_{i}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]},{\bf{E}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}-\mathbb{X}_{t_{i-1}^{n}}\Biggr{]}

and

𝐑n(j)\displaystyle{\bf{R}}_{n}^{(j)} =12n12hni=1n(𝚺(θ0)1)(θ(j)𝚺(θ0))(𝚺(θ0)1)[(𝐄[𝕏tin|ni1]𝕏ti1n)2]\displaystyle=\frac{1}{2n^{\frac{1}{2}}h_{n}}\sum_{i=1}^{n}\Bigl{(}{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Bigl{(}\partial_{\theta^{(j)}}{\bf{\Sigma}}(\theta_{0})\Bigr{)}\Bigl{(}{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Biggl{[}\biggl{(}{\bf{E}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}-\mathbb{X}_{t_{i-1}^{n}}\biggr{)}^{\otimes 2}\Biggr{]}
+12n12hni=1n(𝚺(θ0)1)(θ(j)𝚺(θ0))(𝚺(θ0)1)[𝐕[𝕏tin|ni1]hn𝚺(θ0)].\displaystyle\qquad\quad+\frac{1}{2n^{\frac{1}{2}}h_{n}}\sum_{i=1}^{n}\Bigl{(}{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Bigl{(}\partial_{\theta^{(j)}}{\bf{\Sigma}}(\theta_{0})\Bigr{)}\Bigl{(}{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Biggl{[}{\bf{V}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}-h_{n}{\bf{\Sigma}}(\theta_{0})\Biggr{]}.

First, we will prove

supn𝐄[|𝐌(j)n|L]<\displaystyle\sup_{n\in\mathbb{N}}{\bf{E}}\biggl{[}\bigl{|}{\bf{M}}^{(j)}_{n}\bigr{|}^{L}\biggr{]}<\infty (5.13)

for any L>1L>1. Set

𝐍(j)k=12hn=1k𝐋(j)\displaystyle{\bf{N}}^{(j)}_{k}=\frac{1}{2h_{n}}\sum_{\ell=1}^{k}{\bf{L}}^{(j)}_{\ell}

for k=0,,nk=0,\cdots,n, where

\displaystyle{\bf{L}}^{(j)}_{\ell} =\Bigl{(}{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Bigl{(}\partial_{\theta^{(j)}}{\bf{\Sigma}}(\theta_{0})\Bigr{)}\Bigl{(}{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Biggl{[}\biggl{(}\mathbb{X}_{t_{\ell}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}_{t_{\ell}^{n}}\big{|}\mathscr{F}^{n}_{\ell-1}\Bigr{]}\biggr{)}^{\otimes 2}-{\bf{V}}\Bigl{[}\mathbb{X}_{t_{\ell}^{n}}\big{|}\mathscr{F}^{n}_{\ell-1}\Bigr{]}\Biggr{]}
\displaystyle\quad+2\Bigl{(}{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Bigl{(}\partial_{\theta^{(j)}}{\bf{\Sigma}}(\theta_{0})\Bigr{)}\Bigl{(}{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Biggl{[}\mathbb{X}_{t_{\ell}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}_{t_{\ell}^{n}}\big{|}\mathscr{F}^{n}_{\ell-1}\Bigr{]},{\bf{E}}\Bigl{[}\mathbb{X}_{t_{\ell}^{n}}\big{|}\mathscr{F}^{n}_{\ell-1}\Bigr{]}-\mathbb{X}_{t_{\ell-1}^{n}}\Biggr{]}

for =1,,k\ell=1,\cdots,k. Since

𝐄[𝐋(j)|n1]=0,\displaystyle{\bf{E}}\biggl{[}{\bf{L}}^{(j)}_{\ell}\big{|}\mathscr{F}^{n}_{\ell-1}\biggr{]}=0,

one has

𝐄[𝐍(j)k|nk1]\displaystyle{\bf{E}}\biggl{[}{\bf{N}}^{(j)}_{k}\big{|}\mathscr{F}^{n}_{k-1}\biggr{]} =12hn=1k1𝐋(j)+12hn𝐄[𝐋(j)k|nk1]=𝐍(j)k1,\displaystyle=\frac{1}{2h_{n}}\sum_{\ell=1}^{k-1}{\bf{L}}^{(j)}_{\ell}+\frac{1}{2h_{n}}{\bf{E}}\biggl{[}{\bf{L}}^{(j)}_{k}\big{|}\mathscr{F}^{n}_{k-1}\biggr{]}={\bf{N}}^{(j)}_{k-1},

so that {𝐍k(j)}k=0n\{{\bf{N}}_{k}^{(j)}\}_{k=0}^{n} is a discrete-time martingale with respect to {ni}i=0n\{\mathscr{F}^{n}_{i}\}_{i=0}^{n}. Note that n𝐌(j)n\sqrt{n}{\bf{M}}^{(j)}_{n} is the terminal value of {𝐍k(j)}k=0n\{{\bf{N}}_{k}^{(j)}\}_{k=0}^{n}:

n𝐌n(j)=𝐍(j)n.\displaystyle\sqrt{n}{\bf{M}}_{n}^{(j)}={\bf{N}}^{(j)}_{n}.

Using the Burkholder inequality and

𝐍(j)n=k=1n(𝐍k(j)𝐍k1(j))2=14h2nk=1n𝐋k(j)2,\displaystyle\bigl{\langle}{\bf{N}}^{(j)}\bigr{\rangle}_{n}=\sum_{k=1}^{n}\bigl{(}{\bf{N}}_{k}^{(j)}-{\bf{N}}_{k-1}^{(j)}\bigr{)}^{2}=\frac{1}{4h^{2}_{n}}\sum_{k=1}^{n}{\bf{L}}_{k}^{(j)2},

we have

𝐄[|𝐍(j)n|L]\displaystyle{\bf{E}}\biggl{[}\bigl{|}{\bf{N}}^{(j)}_{n}\bigr{|}^{L}\biggr{]} CL𝐄[𝐍(j)nL2]\displaystyle\leq C_{L}{\bf{E}}\biggl{[}\bigl{\langle}{\bf{N}}^{(j)}\bigr{\rangle}_{n}^{\frac{L}{2}}\biggr{]}
CLhnL𝐄[(k=1n𝐋(j)2k)L2]CLhnL×nL21k=1n𝐄[|𝐋(j)k|L]\displaystyle\leq\frac{C_{L}}{h_{n}^{L}}{\bf{E}}\left[\Biggl{(}\sum_{k=1}^{n}{\bf{L}}^{(j)2}_{k}\Biggr{)}^{\frac{L}{2}}\right]\leq\frac{C_{L}}{h_{n}^{L}}\times n^{\frac{L}{2}-1}\sum_{k=1}^{n}{\bf{E}}\biggl{[}\bigl{|}{\bf{L}}^{(j)}_{k}\bigr{|}^{L}\biggr{]}

for all L>1L>1, which yields

𝐄[|𝐌(j)n|L]=1nL2𝐄[|𝐍(j)n|L]CLnhnLk=1n𝐄[|𝐋(j)k|L].\displaystyle{\bf{E}}\biggl{[}\bigl{|}{\bf{M}}^{(j)}_{n}\bigr{|}^{L}\biggr{]}=\frac{1}{n^{\frac{L}{2}}}{\bf{E}}\biggl{[}\bigl{|}{\bf{N}}^{(j)}_{n}\bigr{|}^{L}\biggr{]}\leq\frac{C_{L}}{nh_{n}^{L}}\sum_{k=1}^{n}{\bf{E}}\biggl{[}\big{|}{\bf{L}}^{(j)}_{k}\big{|}^{L}\biggr{]}. (5.14)

Moreover, it follows from Lemma 3 and the Cauchy-Schwarz inequality that

𝐄[|𝐋(j)k|L]\displaystyle{\bf{E}}\biggl{[}\bigl{|}{\bf{L}}^{(j)}_{k}\bigr{|}^{L}\biggr{]} CL𝐄[|𝕏tin𝐄[𝕏tin|ni1]|2L]+CL𝐄[|𝐕[𝕏tin|ni1]|L]\displaystyle\leq C_{L}{\bf{E}}\Biggl{[}\Bigl{|}\mathbb{X}_{t_{i}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\Bigr{|}^{2L}\Biggr{]}+C_{L}{\bf{E}}\Biggl{[}\Bigl{|}{\bf{V}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\Bigr{|}^{L}\Biggr{]}
+CL𝐄[|𝕏tin𝐄[𝕏tin|ni1]|L|𝐄[𝕏tin|ni1]𝕏ti1n|L]\displaystyle\quad+C_{L}{\bf{E}}\Biggl{[}\Bigl{|}\mathbb{X}_{t_{i}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\Bigr{|}^{L}\Bigl{|}{\bf{E}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}-\mathbb{X}_{t_{i-1}^{n}}\Bigr{|}^{L}\Biggr{]}
CL𝐄[|𝕏tin𝐄[𝕏tin|ni1]|2L]+CL𝐄[|𝐕[𝕏tin|ni1]|L]\displaystyle\leq C_{L}{\bf{E}}\Biggl{[}\Bigl{|}\mathbb{X}_{t_{i}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\Bigr{|}^{2L}\Biggr{]}+C_{L}{\bf{E}}\Biggl{[}\Bigl{|}{\bf{V}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\Bigr{|}^{L}\Biggr{]}
+CL𝐄[|𝕏tin𝐄[𝕏tin|ni1]|2L]12𝐄[|𝐄[𝕏tin|ni1]𝕏ti1n|2L]12\displaystyle\quad+C_{L}{\bf{E}}\Biggl{[}\Bigl{|}\mathbb{X}_{t_{i}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\Bigr{|}^{2L}\Biggr{]}^{\frac{1}{2}}{\bf{E}}\Biggl{[}\Bigl{|}{\bf{E}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}-\mathbb{X}_{t_{i-1}^{n}}\Bigr{|}^{2L}\Biggr{]}^{\frac{1}{2}}
CL(hnL+hnL+hn32L)\displaystyle\leq C_{L}\Bigl{(}h_{n}^{L}+h_{n}^{L}+h_{n}^{\frac{3}{2}L}\Bigr{)}
CLhnL\displaystyle\leq C_{L}h_{n}^{L}

for any L>1L>1, so that it holds from (5.14) that

𝐄[|𝐌(j)n|L]CL,\displaystyle{\bf{E}}\biggl{[}\bigl{|}{\bf{M}}^{(j)}_{n}\bigr{|}^{L}\biggr{]}\leq C_{L},

which implies (5.13). Next, we will prove

supn𝐄[|𝐑(j)n|L]<\displaystyle\sup_{n\in\mathbb{N}}{\bf{E}}\biggl{[}\bigl{|}{\bf{R}}^{(j)}_{n}\bigr{|}^{L}\biggr{]}<\infty (5.15)

for all L>1L>1. In an analogous manner to the proof of Lemma 3, one has

𝐄[|𝐕[𝕏tin|ni1]hn𝚺(θ0)|L]CLhn2L\displaystyle{\bf{E}}\Biggl{[}\Bigl{|}{\bf{V}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}-h_{n}{\bf{\Sigma}}(\theta_{0})\Bigr{|}^{L}\Biggr{]}\leq C_{L}h_{n}^{2L} (5.16)

for all L>1L>1. Lemma 3 and (5.16) show

𝐄[|𝐑(j)n|L]\displaystyle{\bf{E}}\biggl{[}\bigl{|}{\bf{R}}^{(j)}_{n}\bigr{|}^{L}\biggr{]} CLnL2hnL×nL1i=1n𝐄[|𝐄[𝕏tin|ni1]𝕏ti1n|2L]\displaystyle\leq\frac{C_{L}}{n^{\frac{L}{2}}h_{n}^{L}}\times n^{L-1}\sum_{i=1}^{n}{\bf{E}}\Biggl{[}\Bigl{|}{\bf{E}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}-\mathbb{X}_{t_{i-1}^{n}}\Bigr{|}^{2L}\Biggr{]}
+CLnL2hnL×nL1i=1n𝐄[|𝐕[𝕏tin|ni1]hn𝚺(θ0)|L]\displaystyle\quad+\frac{C_{L}}{n^{\frac{L}{2}}h_{n}^{L}}\times n^{L-1}\sum_{i=1}^{n}{\bf{E}}\Biggl{[}\Bigl{|}{\bf{V}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}-h_{n}{\bf{\Sigma}}(\theta_{0})\Bigr{|}^{L}\Biggr{]}
CLnL2hnL(hn2L+hn2L)\displaystyle\leq\frac{C_{L}n^{\frac{L}{2}}}{h_{n}^{L}}\bigl{(}h_{n}^{2L}+h_{n}^{2L}\bigr{)}
CL(nhn2)L2\displaystyle\leq C_{L}(nh_{n}^{2})^{\frac{L}{2}}

for any L>1L>1. Since nhn2=n10nh_{n}^{2}=n^{-1}\longrightarrow 0 as nn\longrightarrow\infty, we obtain (5.15). Consequently, for all L>1L>1, it holds from (5.13) and (5.15) that

supn𝐄[|1nθ(j)Hn(θ0)|L]\displaystyle\sup_{n\in\mathbb{N}}{\bf{E}}\Biggl{[}\biggl{|}\frac{1}{\sqrt{n}}\partial_{\theta^{(j)}}{\rm{H}}_{n}(\theta_{0})\biggr{|}^{L}\Biggr{]} CLsupn𝐄[|𝐌(j)n|L]+CLsupn𝐄[|𝐑(j)n|L]\displaystyle\leq C_{L}\sup_{n\in\mathbb{N}}{\bf{E}}\biggl{[}\bigl{|}{\bf{M}}^{(j)}_{n}\bigr{|}^{L}\biggr{]}+C_{L}\sup_{n\in\mathbb{N}}{\bf{E}}\biggl{[}\bigl{|}{\bf{R}}^{(j)}_{n}\bigr{|}^{L}\biggr{]}
<.\displaystyle<\infty.

Therefore, it is shown that for all L>0L>0,

supn𝐄[|1nθ(j)Hn(θ0)|L]<\displaystyle\sup_{n\in\mathbb{N}}{\bf{E}}\Biggl{[}\biggl{|}\frac{1}{\sqrt{n}}\partial_{\theta^{(j)}}{\rm{H}}_{n}(\theta_{0})\biggr{|}^{L}\Biggr{]}<\infty

for j=1,,qj=1,\cdots,q. ∎
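As a numerical sanity check of Lemma 4, the following sketch estimates the normalized score in an illustrative toy setting not taken from the paper: a scalar model with Σ(θ)=θ, data generated as √θ₀ times a Brownian motion, and h_n = 1/n. In this case n^{-1/2}∂_θH_n(θ₀) = (2n^{1/2}h_nθ₀²)^{-1}Σ_i{(Δ_i𝕏)² − h_nθ₀}, whose second moment should stay bounded, close to I(θ₀) = 1/(2θ₀²).

```python
import math
import random

def normalized_score(n, theta0, rng):
    """n^{-1/2} * dH_n/dtheta at theta0 for the scalar toy model
    Sigma(theta) = theta, where X is sqrt(theta0) times a Brownian
    motion sampled at t_i = i * h_n with h_n = 1/n (an illustrative
    choice, not the paper's general setting)."""
    h = 1.0 / n
    acc = 0.0
    for _ in range(n):
        dx = math.sqrt(h * theta0) * rng.gauss(0.0, 1.0)  # increment of X
        acc += dx * dx - h * theta0  # (Delta_i X)^2 - h_n * Sigma(theta0)
    return acc / (2.0 * math.sqrt(n) * h * theta0 ** 2)

rng = random.Random(0)
theta0 = 2.0
scores = [normalized_score(1000, theta0, rng) for _ in range(2000)]
mean = sum(scores) / len(scores)
second_moment = sum(s * s for s in scores) / len(scores)
# The score is centered, and its second moment should be near
# I(theta0) = 1/(2 * theta0**2) = 0.125, echoing the uniform
# moment bound of Lemma 4.
```

Across replications the empirical second moment settles near I(θ₀), which is the boundedness that Lemma 4 guarantees uniformly in n.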

Lemma 5

Under [A], for all ε(0,12)\varepsilon\in(0,\frac{1}{2}) and L>0L>0,

supn𝐄𝕏n[(nε|1nθ(j1)θ(j2)Hn(𝕏n,θ0)+𝐈(θ0)j1j2|)L]<\displaystyle\sup_{n\in\mathbb{N}}{\bf{E}}_{\mathbb{X}_{n}}\Biggl{[}\biggl{(}n^{\varepsilon}\biggl{|}\frac{1}{n}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}{\rm{H}}_{n}(\mathbb{X}_{n},\theta_{0})+{\bf{I}}(\theta_{0})_{j_{1}j_{2}}\biggr{|}\biggr{)}^{L}\Biggr{]}<\infty

for j1,j2=1,,qj_{1},j_{2}=1,\cdots,q.

Proof of Lemma 5.

Note that

(θ(j1)θ(j2)𝚺(θ)1)[𝚺(θ)]\displaystyle\Bigl{(}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}{\bf{\Sigma}}(\theta)^{-1}\Bigr{)}\Bigl{[}{\bf{\Sigma}}(\theta)\Bigr{]} =tr{(𝚺(θ)1)(θ(j1)𝚺(θ))(𝚺(θ)1)(θ(j2)𝚺(θ))}\displaystyle=\mathop{\rm tr}\nolimits\biggl{\{}\Bigl{(}{\bf{\Sigma}}(\theta)^{-1}\Bigr{)}\Bigl{(}\partial_{\theta^{(j_{1})}}{\bf{\Sigma}}(\theta)\Bigr{)}\Bigl{(}{\bf{\Sigma}}(\theta)^{-1}\Bigr{)}\Bigl{(}\partial_{\theta^{(j_{2})}}{\bf{\Sigma}}(\theta)\Bigr{)}\biggr{\}}
tr{(𝚺(θ)1)(θ(j1)θ(j2)𝚺(θ))}\displaystyle\qquad-\mathop{\rm tr}\nolimits\biggl{\{}\Bigl{(}{\bf{\Sigma}}(\theta)^{-1}\Bigr{)}\Bigl{(}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}{\bf{\Sigma}}(\theta)\Bigr{)}\biggr{\}}
+tr{(𝚺(θ)1)(θ(j2)𝚺(θ))(𝚺(θ)1)(θ(j1)𝚺(θ))}\displaystyle\qquad+\mathop{\rm tr}\nolimits\biggl{\{}\Bigl{(}{\bf{\Sigma}}(\theta)^{-1}\Bigr{)}\Bigl{(}\partial_{\theta^{(j_{2})}}{\bf{\Sigma}}(\theta)\Bigr{)}\Bigl{(}{\bf{\Sigma}}(\theta)^{-1}\Bigr{)}\Bigl{(}\partial_{\theta^{(j_{1})}}{\bf{\Sigma}}(\theta)\Bigr{)}\biggr{\}}

and

θ(j1)θ(j2)logdet𝚺(θ)\displaystyle\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}\log\mathop{\rm det}\nolimits{\bf{\Sigma}}(\theta) =tr{(𝚺(θ)1)(θ(j1)𝚺(θ))(𝚺(θ)1)(θ(j2)𝚺(θ))}\displaystyle=-\mathop{\rm tr}\nolimits\biggl{\{}\Bigl{(}{\bf{\Sigma}}(\theta)^{-1}\Bigr{)}\Bigl{(}\partial_{\theta^{(j_{1})}}{\bf{\Sigma}}(\theta)\Bigr{)}\Bigl{(}{\bf{\Sigma}}(\theta)^{-1}\Bigr{)}\Bigl{(}\partial_{\theta^{(j_{2})}}{\bf{\Sigma}}(\theta)\Bigr{)}\biggr{\}}
+tr{(𝚺(θ)1)(θ(j1)θ(j2)𝚺(θ))}\displaystyle\qquad+\mathop{\rm tr}\nolimits\biggl{\{}\Bigl{(}{\bf{\Sigma}}(\theta)^{-1}\Bigr{)}\Bigl{(}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}{\bf{\Sigma}}(\theta)\Bigr{)}\biggr{\}}

for j1,j2=1,,qj_{1},j_{2}=1,\cdots,q. Since

12(θ(j1)θ(j2)𝚺(θ0)1)[𝚺(θ0)]+12θ(j1)θ(j2)logdet𝚺(θ0)\displaystyle\quad\ \frac{1}{2}\Bigl{(}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Bigl{[}{\bf{\Sigma}}(\theta_{0})\Bigr{]}+\frac{1}{2}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}\log\mathop{\rm det}\nolimits{\bf{\Sigma}}(\theta_{0})
\displaystyle=\frac{1}{2}\mathop{\rm tr}\nolimits\Biggl{\{}\Bigl{(}{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Bigl{(}\partial_{\theta^{(j_{2})}}{\bf{\Sigma}}(\theta_{0})\Bigr{)}\Bigl{(}{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Bigl{(}\partial_{\theta^{(j_{1})}}{\bf{\Sigma}}(\theta_{0})\Bigr{)}\Biggr{\}}
=12(vecθ(j1)𝚺(θ0))(𝚺(θ0)1𝚺(θ0)1)(vecθ(j2)𝚺(θ0))\displaystyle=\frac{1}{2}\Bigl{(}\mathop{\rm vec}\nolimits{\partial_{\theta^{(j_{1})}}{\bf{\Sigma}}(\theta_{0})}\Bigr{)}^{\top}\Bigl{(}{\bf{\Sigma}}(\theta_{0})^{-1}\otimes{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Bigl{(}\mathop{\rm vec}\nolimits{\partial_{\theta^{(j_{2})}}{\bf{\Sigma}}(\theta_{0})}\Bigr{)}
=12(vechθ(j1)𝚺(θ0))𝔻p(𝚺(θ0)1𝚺(θ0)1)𝔻p(vechθ(j2)𝚺(θ0))\displaystyle=\frac{1}{2}\Bigl{(}\mathop{\rm vech}\nolimits{\partial_{\theta^{(j_{1})}}{\bf{\Sigma}}(\theta_{0})}\Bigr{)}^{\top}\mathbb{D}_{p}^{\top}\Bigl{(}{\bf{\Sigma}}(\theta_{0})^{-1}\otimes{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\mathbb{D}_{p}\Bigl{(}\mathop{\rm vech}\nolimits{\partial_{\theta^{(j_{2})}}{\bf{\Sigma}}(\theta_{0})}\Bigr{)}
=(θ(j1)vech𝚺(θ0))𝐖(θ0)1(θ(j2)vech𝚺(θ0))\displaystyle=\Bigl{(}\partial_{\theta^{(j_{1})}}\mathop{\rm vech}\nolimits{{\bf{\Sigma}}(\theta_{0})}\Bigr{)}^{\top}{\bf{W}}(\theta_{0})^{-1}\Bigl{(}\partial_{\theta^{(j_{2})}}\mathop{\rm vech}\nolimits{{\bf{\Sigma}}(\theta_{0})}\Bigr{)}
=𝐈(θ0)j1j2,\displaystyle={\bf{I}}(\theta_{0})_{j_{1}j_{2}},

we have

1nθ(j1)θ(j2)Hn(θ0)\displaystyle\frac{1}{n}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}{\rm{H}}_{n}(\theta_{0}) =12nhni=1n(θ(j1)θ(j2)𝚺(θ0)1)[(Δi𝕏)2]12θ(j1)θ(j2)logdet𝚺(θ0)\displaystyle=-\frac{1}{2nh_{n}}\sum_{i=1}^{n}\Bigl{(}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Bigl{[}\bigl{(}\Delta_{i}\mathbb{X}\bigr{)}^{\otimes 2}\Bigr{]}-\frac{1}{2}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}\log\mathop{\rm det}\nolimits{\bf{\Sigma}}(\theta_{0})
=12nhni=1n(θ(j1)θ(j2)𝚺(θ0)1)[(Δi𝕏)2hn𝚺(θ0)]\displaystyle=-\frac{1}{2nh_{n}}\sum_{i=1}^{n}\Bigl{(}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Bigl{[}\bigl{(}\Delta_{i}\mathbb{X}\bigr{)}^{\otimes 2}-h_{n}{\bf{\Sigma}}(\theta_{0})\Bigr{]}
12(θ(j1)θ(j2)𝚺(θ0)1)[𝚺(θ0)]12θ(j1)θ(j2)logdet𝚺(θ0)\displaystyle\qquad\qquad\quad-\frac{1}{2}\Bigl{(}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Bigl{[}{\bf{\Sigma}}(\theta_{0})\Bigr{]}-\frac{1}{2}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}\log\mathop{\rm det}\nolimits{\bf{\Sigma}}(\theta_{0})
=12nhni=1n(θ(j1)θ(j2)𝚺(θ0)1)[(Δi𝕏)2hn𝚺(θ0)]𝐈(θ0)j1j2,\displaystyle=-\frac{1}{2nh_{n}}\sum_{i=1}^{n}\Bigl{(}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Bigl{[}\bigl{(}\Delta_{i}\mathbb{X}\bigr{)}^{\otimes 2}-h_{n}{\bf{\Sigma}}(\theta_{0})\Bigr{]}-{\bf{I}}(\theta_{0})_{j_{1}j_{2}},

so that a decomposition is given by

1nθ(j1)θ(j2)Hn(θ0)+𝐈(θ0)j1j2=𝐌n,j1j2+𝐑n,j1j2,\displaystyle\frac{1}{n}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}{\rm{H}}_{n}(\theta_{0})+{\bf{I}}(\theta_{0})_{j_{1}j_{2}}={\bf{M}}^{\dagger}_{n,j_{1}j_{2}}+{\bf{R}}^{\dagger}_{n,j_{1}j_{2}},

where

𝐌n,j1j2\displaystyle{\bf{M}}^{\dagger}_{n,j_{1}j_{2}} =12nhni=1n(θ(j1)θ(j2)𝚺(θ0)1)[(𝕏tin𝐄[𝕏tin|ni1])2𝐕[𝕏tin|ni1]]\displaystyle=-\frac{1}{2nh_{n}}\sum_{i=1}^{n}\Bigl{(}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Biggl{[}\biggl{(}\mathbb{X}_{t_{i}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\biggr{)}^{\otimes 2}-{\bf{V}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\Biggr{]}
1nhni=1n(θ(j1)θ(j2)𝚺(θ0)1)[𝕏tin𝐄[𝕏tin|ni1],𝐄[𝕏tin|ni1]𝕏ti1n]\displaystyle\qquad-\frac{1}{nh_{n}}\sum_{i=1}^{n}\Bigl{(}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Biggl{[}\mathbb{X}_{t_{i}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]},{\bf{E}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}-\mathbb{X}_{t_{i-1}^{n}}\Biggr{]}

and

𝐑n,j1j2\displaystyle{\bf{R}}^{\dagger}_{n,j_{1}j_{2}} =12nhni=1n(θ(j1)θ(j2)𝚺(θ0)1)[(𝐄[𝕏tin|ni1]𝕏ti1n)2]\displaystyle=-\frac{1}{2nh_{n}}\sum_{i=1}^{n}\Bigl{(}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Biggl{[}\biggl{(}{\bf{E}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}-\mathbb{X}_{t_{i-1}^{n}}\biggr{)}^{\otimes 2}\Biggr{]}
12nhni=1n(θ(j1)θ(j2)𝚺(θ0)1)[𝐕[𝕏tin|ni1]hn𝚺(θ0)].\displaystyle\qquad\qquad\qquad-\frac{1}{2nh_{n}}\sum_{i=1}^{n}\Bigl{(}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Biggl{[}{\bf{V}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}-h_{n}{\bf{\Sigma}}(\theta_{0})\Biggr{]}.
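The trace/vec/vech identity used above to express I(θ₀)_{j₁j₂} through the duplication matrix 𝔻_p can be verified numerically. A minimal sketch (random symmetric matrices standing in for Σ(θ₀) and the derivatives ∂_{θ^{(j)}}Σ(θ₀); these stand-ins are illustrative, not from the paper):

```python
import numpy as np

def duplication_matrix(p):
    """Matrix D with vec(S) = D @ vech(S) for symmetric p x p S
    (column-major vec; vech stacks the lower triangle by columns)."""
    D = np.zeros((p * p, p * (p + 1) // 2))
    col = 0
    for j in range(p):
        for i in range(j, p):
            D[j * p + i, col] = 1.0
            if i != j:
                D[i * p + j, col] = 1.0
            col += 1
    return D

def vech(S):
    p = S.shape[0]
    return np.concatenate([S[j:, j] for j in range(p)])

rng = np.random.default_rng(0)
p = 3
G = rng.standard_normal((p, p))
Sigma = G @ G.T + p * np.eye(p)               # stands in for Sigma(theta0), PD
A = rng.standard_normal((p, p)); A = A + A.T  # stands in for d_{j1} Sigma
B = rng.standard_normal((p, p)); B = B + B.T  # stands in for d_{j2} Sigma
Si = np.linalg.inv(Sigma)

lhs = np.trace(Si @ A @ Si @ B)
K = np.kron(Si, Si)
vec_form = A.reshape(-1, order="F") @ K @ B.reshape(-1, order="F")
D = duplication_matrix(p)
vech_form = vech(A) @ D.T @ K @ D @ vech(B)
# tr{Si A Si B} = (vec A)' (Si x Si) (vec B)
#               = (vech A)' D' (Si x Si) D (vech B),
# so 2 * I(theta0)_{j1j2} is computable from vech(Sigma) alone.
```

This is exactly the reduction that lets I(θ₀) be written in terms of ∂_θ vech Σ(θ₀) and 𝐖(θ₀)^{-1} in the chain of equalities above.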

Let

𝐍k,j1j2=12hn=1k𝐋,j1j2\displaystyle{\bf{N}}^{\dagger}_{k,j_{1}j_{2}}=\frac{1}{2h_{n}}\sum_{\ell=1}^{k}{\bf{L}}^{\dagger}_{\ell,j_{1}j_{2}}

for k=0,,nk=0,\cdots,n, where

\displaystyle{\bf{L}}^{\dagger}_{\ell,j_{1}j_{2}} =-\Bigl{(}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Biggl{[}\biggl{(}\mathbb{X}_{t_{\ell}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}_{t_{\ell}^{n}}\big{|}\mathscr{F}^{n}_{\ell-1}\Bigr{]}\biggr{)}^{\otimes 2}-{\bf{V}}\Bigl{[}\mathbb{X}_{t_{\ell}^{n}}\big{|}\mathscr{F}^{n}_{\ell-1}\Bigr{]}\Biggr{]}
\displaystyle\qquad\qquad-2\Bigl{(}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Biggl{[}\mathbb{X}_{t_{\ell}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}_{t_{\ell}^{n}}\big{|}\mathscr{F}^{n}_{\ell-1}\Bigr{]},{\bf{E}}\Bigl{[}\mathbb{X}_{t_{\ell}^{n}}\big{|}\mathscr{F}^{n}_{\ell-1}\Bigr{]}-\mathbb{X}_{t_{\ell-1}^{n}}\Biggr{]}

for =1,,k\ell=1,\cdots,k. In a similar way to the proof of Lemma 4, {𝐍k,j1j2}k=0n\{{\bf{N}}_{k,j_{1}j_{2}}^{\dagger}\}_{k=0}^{n} is a discrete-time martingale with respect to {ni}i=0n\{\mathscr{F}^{n}_{i}\}_{i=0}^{n}, and n𝐌n,j1j2n{\bf{M}}^{\dagger}_{n,j_{1}j_{2}} is the terminal value of {𝐍k,j1j2}k=0n\{{\bf{N}}_{k,j_{1}j_{2}}^{\dagger}\}_{k=0}^{n}:

n𝐌n,j1j2=𝐍n,j1j2.\displaystyle n{\bf{M}}^{\dagger}_{n,j_{1}j_{2}}={\bf{N}}^{\dagger}_{n,j_{1}j_{2}}.

Moreover, it follows from the Burkholder inequality that

𝐄[|𝐍n,j1j2|L]CLhnL×nL21k=1n𝐄[|𝐋k,j1j2|L]\displaystyle{\bf{E}}\biggl{[}\bigl{|}{\bf{N}}^{\dagger}_{n,j_{1}j_{2}}\bigr{|}^{L}\biggr{]}\leq\frac{C_{L}}{h_{n}^{L}}\times n^{\frac{L}{2}-1}\sum_{k=1}^{n}{\bf{E}}\biggl{[}\bigl{|}{\bf{L}}^{\dagger}_{k,j_{1}j_{2}}\bigr{|}^{L}\biggr{]}

for all L>1L>1, which yields

𝐄[|nε𝐌n,j1j2|L]=nL(ε1)𝐄[|n𝐌n,j1j2|L]CLhnL×nL(ε12)1k=1n𝐄[|𝐋k,j1j2|L].\displaystyle{\bf{E}}\biggl{[}\bigl{|}{n^{\varepsilon}\bf{M}}^{\dagger}_{n,j_{1}j_{2}}\bigr{|}^{L}\biggr{]}=n^{L(\varepsilon-1)}{\bf{E}}\biggl{[}\bigl{|}n{\bf{M}}^{\dagger}_{n,j_{1}j_{2}}\bigr{|}^{L}\biggr{]}\leq\frac{C_{L}}{h_{n}^{L}}\times n^{L(\varepsilon-\frac{1}{2})-1}\sum_{k=1}^{n}{\bf{E}}\biggl{[}\bigl{|}{\bf{L}}^{\dagger}_{k,j_{1}j_{2}}\bigr{|}^{L}\biggr{]}.

For any L>1L>1, it is shown that

𝐄[|𝐋k,j1j2|L]CLhnL\displaystyle{\bf{E}}\biggl{[}\bigl{|}{\bf{L}}^{\dagger}_{k,j_{1}j_{2}}\bigr{|}^{L}\biggr{]}\leq C_{L}h_{n}^{L}

in an analogous manner to the proof of Lemma 4, which deduces

𝐄[|nε𝐌n,j1j2|L]CLnL(ε12).\displaystyle{\bf{E}}\biggl{[}\bigl{|}{n^{\varepsilon}\bf{M}}^{\dagger}_{n,j_{1}j_{2}}\bigr{|}^{L}\biggr{]}\leq C_{L}n^{L(\varepsilon-\frac{1}{2})}.

Since ε12<0\varepsilon-\frac{1}{2}<0, we have nL(ε12)0n^{L(\varepsilon-\frac{1}{2})}\longrightarrow 0 as nn\longrightarrow\infty, so that

supn𝐄[|nε𝐌n,j1j2|L]<\displaystyle\sup_{n\in\mathbb{N}}{\bf{E}}\biggl{[}\bigl{|}{n^{\varepsilon}\bf{M}}^{\dagger}_{n,j_{1}j_{2}}\bigr{|}^{L}\biggr{]}<\infty (5.17)

for all L>1L>1. Furthermore, we see from Lemma 3 and (5.16) that for all L>1L>1,

𝐄[|nε𝐑n,j1j2|L]\displaystyle{\bf{E}}\biggl{[}\bigl{|}n^{\varepsilon}{\bf{R}}^{\dagger}_{n,j_{1}j_{2}}\bigr{|}^{L}\biggr{]} CLnL(ε1)hnL×nL1i=1n𝐄[|𝐄[𝕏tin|ni1]𝕏ti1n|2L]\displaystyle\leq\frac{C_{L}n^{L(\varepsilon-1)}}{h_{n}^{L}}\times n^{L-1}\sum_{i=1}^{n}{\bf{E}}\Biggl{[}\Bigl{|}{\bf{E}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}-\mathbb{X}_{t_{i-1}^{n}}\Bigr{|}^{2L}\Biggr{]}
+CLnL(ε1)hnL×nL1i=1n𝐄[|𝐕[𝕏tin|ni1]hn𝚺(θ0)|L]\displaystyle\quad+\frac{C_{L}n^{L(\varepsilon-1)}}{h_{n}^{L}}\times n^{L-1}\sum_{i=1}^{n}{\bf{E}}\Biggl{[}\Bigl{|}{\bf{V}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}-h_{n}{\bf{\Sigma}}(\theta_{0})\Bigr{|}^{L}\Biggr{]}
CLnLεhnL(hn2L+hn2L)\displaystyle\leq\frac{C_{L}n^{L\varepsilon}}{h_{n}^{L}}\bigl{(}h_{n}^{2L}+h_{n}^{2L}\bigr{)}
CL(nhn2)L2nL(ε12)\displaystyle\leq C_{L}(nh_{n}^{2})^{\frac{L}{2}}n^{L(\varepsilon-\frac{1}{2})}

and (nhn2)L2nL(ε12)0(nh_{n}^{2})^{\frac{L}{2}}n^{L(\varepsilon-\frac{1}{2})}\longrightarrow 0 as nn\longrightarrow\infty, which implies

supn𝐄[|nε𝐑n,j1j2|L]<.\displaystyle\sup_{n\in\mathbb{N}}{\bf{E}}\biggl{[}\bigl{|}{n^{\varepsilon}\bf{R}}^{\dagger}_{n,j_{1}j_{2}}\bigr{|}^{L}\biggr{]}<\infty. (5.18)

Hence, it holds from (5.17) and (5.18) that

supn𝐄[(nε|1nθ(j1)θ(j2)Hn(θ0)+𝐈(θ0)j1j2|)L]\displaystyle\sup_{n\in\mathbb{N}}{\bf{E}}\Biggl{[}\biggl{(}n^{\varepsilon}\biggl{|}\frac{1}{n}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}{\rm{H}}_{n}(\theta_{0})+{\bf{I}}(\theta_{0})_{j_{1}j_{2}}\biggr{|}\biggr{)}^{L}\Biggr{]} supn𝐄[|nε𝐌n,j1j2|L]+supn𝐄[|nε𝐑n,j1j2|L]\displaystyle\leq\sup_{n\in\mathbb{N}}{\bf{E}}\biggl{[}\bigl{|}{n^{\varepsilon}\bf{M}}^{\dagger}_{n,j_{1}j_{2}}\bigr{|}^{L}\biggr{]}+\sup_{n\in\mathbb{N}}{\bf{E}}\biggl{[}\bigl{|}{n^{\varepsilon}\bf{R}}^{\dagger}_{n,j_{1}j_{2}}\bigr{|}^{L}\biggr{]}
<\displaystyle<\infty

for all L>1L>1. Therefore, for all L>0L>0, we obtain

supn𝐄[(nε|1nθ(j1)θ(j2)Hn(θ0)+𝐈(θ0)j1j2|)L]<\displaystyle\sup_{n\in\mathbb{N}}{\bf{E}}\Biggl{[}\biggl{(}n^{\varepsilon}\biggl{|}\frac{1}{n}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}{\rm{H}}_{n}(\theta_{0})+{\bf{I}}(\theta_{0})_{j_{1}j_{2}}\biggr{|}\biggr{)}^{L}\Biggr{]}<\infty

for j1,j2=1,,qj_{1},j_{2}=1,\cdots,q. ∎
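Lemma 5 states that n^{-1}∂²_θH_n(θ₀) concentrates at −𝐈(θ₀). The following sketch checks this in the same illustrative scalar toy model used before (Σ(θ)=θ, data √θ₀ times a Brownian motion, h_n = 1/n; these choices are assumptions for illustration, not the paper's setting), where I(θ₀) = 1/(2θ₀²).

```python
import math
import random

# Scalar toy model: H_n(theta) = -sum_i (Delta_i X)^2 / (2 h theta)
#                               - (n/2) log(theta),
# so d^2 H_n / dtheta^2 = -sum_i (Delta_i X)^2 / (h theta^3)
#                         + n / (2 theta^2).
rng = random.Random(1)
theta0 = 2.0
n = 200_000
h = 1.0 / n
# Sum of squared increments of X_t = sqrt(theta0) * W_t.
sum_sq = sum((math.sqrt(h * theta0) * rng.gauss(0.0, 1.0)) ** 2
             for _ in range(n))
hessian = -sum_sq / (n * h * theta0 ** 3) + 1.0 / (2.0 * theta0 ** 2)
fisher = 1.0 / (2.0 * theta0 ** 2)  # I(theta0) in this toy model
# hessian should be close to -fisher, i.e. (1/n) d^2 H_n ~ -I(theta0)
```

Since (nh_n)^{-1}Σ_i(Δ_i𝕏)² concentrates at θ₀, the normalized Hessian approaches −1/(2θ₀²) = −I(θ₀), matching the lemma.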

Lemma 6

Under [A], for all ε(0,12)\varepsilon\in(0,\frac{1}{2}) and L>0L>0,

supn𝐄𝕏n[(supθΘnε|Yn(𝕏n,θ;θ0)Y(θ)|)L]<.\displaystyle\sup_{n\in\mathbb{N}}{\bf{E}}_{\mathbb{X}_{n}}\left[\left(\sup_{\theta\in\Theta}n^{\varepsilon}\Bigl{|}{\rm{Y}}_{n}(\mathbb{X}_{n},\theta;\theta_{0})-{\rm{Y}}(\theta)\Bigr{|}\right)^{L}\right]<\infty.
Proof of Lemma 6.

Since

Yn(θ;θ0)\displaystyle{\rm{Y}}_{n}(\theta;\theta_{0}) =1n{Hn(θ)Hn(θ0)}\displaystyle=\frac{1}{n}\Bigl{\{}{\rm{H}}_{n}(\theta)-{\rm{H}}_{n}(\theta_{0})\Bigr{\}}
=12nhni=1n(𝚺(θ)1𝚺(θ0)1)[(Δi𝕏)2]12logdet𝚺(θ)det𝚺(θ0)\displaystyle=-\frac{1}{2nh_{n}}\sum_{i=1}^{n}\Bigl{(}{\bf{\Sigma}}(\theta)^{-1}-{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Bigl{[}(\Delta_{i}\mathbb{X})^{\otimes 2}\Bigr{]}-\frac{1}{2}\log\frac{\mathop{\rm det}\nolimits{\bf{\Sigma}}(\theta)}{\mathop{\rm det}\nolimits{\bf{\Sigma}}(\theta_{0})}

and

Y(θ)\displaystyle{\rm{Y}}(\theta) =12(𝚺(θ)1𝚺(θ0)1)[𝚺(θ0)]12logdet𝚺(θ)det𝚺(θ0),\displaystyle=-\frac{1}{2}\Bigl{(}{\bf{\Sigma}}(\theta)^{-1}-{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Bigl{[}{\bf{\Sigma}}(\theta_{0})\Bigr{]}-\frac{1}{2}\log\frac{\mathop{\rm det}\nolimits{\bf{\Sigma}}(\theta)}{\mathop{\rm det}\nolimits{\bf{\Sigma}}(\theta_{0})},

one has a decomposition

Yn(θ;θ0)Y(θ)\displaystyle{\rm{Y}}_{n}(\theta;\theta_{0})-{\rm{Y}}(\theta) =12nhni=1n(𝚺(θ)1𝚺(θ0)1)[(Δi𝕏)2hn𝚺(θ0)]\displaystyle=-\frac{1}{2nh_{n}}\sum_{i=1}^{n}\Bigl{(}{\bf{\Sigma}}(\theta)^{-1}-{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Bigl{[}(\Delta_{i}\mathbb{X})^{\otimes 2}-h_{n}{\bf{\Sigma}}(\theta_{0})\Bigr{]}
=𝐌n+𝐑n,\displaystyle={\bf{M}}^{\dagger\dagger}_{n}+{\bf{R}}^{\dagger\dagger}_{n},

where

𝐌n\displaystyle{\bf{M}}^{\dagger\dagger}_{n} =12nhni=1n(𝚺(θ)1𝚺(θ0)1)[(𝕏tin𝐄[𝕏tin|ni1])2𝐕[𝕏tin|ni1]]\displaystyle=-\frac{1}{2nh_{n}}\sum_{i=1}^{n}\Bigl{(}{\bf{\Sigma}}(\theta)^{-1}-{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Biggl{[}\biggl{(}\mathbb{X}_{t_{i}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\biggr{)}^{\otimes 2}-{\bf{V}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}\Biggr{]}
1nhni=1n(𝚺(θ)1𝚺(θ0)1)[𝕏tin𝐄[𝕏tin|ni1],𝐄[𝕏tin|ni1]𝕏ti1n]\displaystyle\quad-\frac{1}{nh_{n}}\sum_{i=1}^{n}\Bigl{(}{\bf{\Sigma}}(\theta)^{-1}-{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Biggl{[}\mathbb{X}_{t_{i}^{n}}-{\bf{E}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]},{\bf{E}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}-\mathbb{X}_{t_{i-1}^{n}}\Biggr{]}

and

𝐑n\displaystyle{\bf{R}}^{\dagger\dagger}_{n} =12nhni=1n(𝚺(θ)1𝚺(θ0)1)[(𝐄[𝕏tin|ni1]𝕏ti1n)2]\displaystyle=-\frac{1}{2nh_{n}}\sum_{i=1}^{n}\Bigl{(}{\bf{\Sigma}}(\theta)^{-1}-{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Biggl{[}\biggl{(}{\bf{E}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}-\mathbb{X}_{t_{i-1}^{n}}\biggr{)}^{\otimes 2}\Biggr{]}
12nhni=1n(𝚺(θ)1𝚺(θ0)1)[𝐕[𝕏tin|ni1]hn𝚺(θ0)].\displaystyle\qquad\qquad\qquad\qquad\qquad-\frac{1}{2nh_{n}}\sum_{i=1}^{n}\Bigl{(}{\bf{\Sigma}}(\theta)^{-1}-{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Biggl{[}{\bf{V}}\Bigl{[}\mathbb{X}_{t_{i}^{n}}\big{|}\mathscr{F}^{n}_{i-1}\Bigr{]}-h_{n}{\bf{\Sigma}}(\theta_{0})\Biggr{]}.

In an analogous manner to Lemma 5, one has

supnsupθΘ𝐄[|nε𝐌n|L]<\displaystyle\sup_{n\in\mathbb{N}}\sup_{\theta\in\Theta}{\bf{E}}\biggl{[}\bigl{|}{n^{\varepsilon}\bf{M}}^{\dagger\dagger}_{n}\bigr{|}^{L}\biggr{]}<\infty

and

supnsupθΘ𝐄[|nεθ𝐌n|L]<\displaystyle\sup_{n\in\mathbb{N}}\sup_{\theta\in\Theta}{\bf{E}}\biggl{[}\bigl{|}{n^{\varepsilon}\partial_{\theta}\bf{M}}^{\dagger\dagger}_{n}\bigr{|}^{L}\biggr{]}<\infty

for all L>1L>1. Consequently, it holds from the Sobolev inequality that

𝐄[supθΘ|nε𝐌n|L]\displaystyle{\bf{E}}\Biggl{[}\sup_{\theta\in\Theta}\bigl{|}{n^{\varepsilon}\bf{M}}^{\dagger\dagger}_{n}\bigr{|}^{L}\Biggr{]} 𝐄[Θ|nε𝐌n|L+|nεθ𝐌n|Ldθ]\displaystyle\leq{\bf{E}}\Biggl{[}\int_{\Theta}\bigl{|}{n^{\varepsilon}\bf{M}}^{\dagger\dagger}_{n}\bigr{|}^{L}+\bigl{|}{n^{\varepsilon}\partial_{\theta}\bf{M}}^{\dagger\dagger}_{n}\bigr{|}^{L}d\theta\Biggr{]}
=Θ𝐄[|nε𝐌n|L]dθ+Θ𝐄[|nεθ𝐌n|L]dθ\displaystyle=\int_{\Theta}{\bf{E}}\biggl{[}\bigl{|}{n^{\varepsilon}\bf{M}}^{\dagger\dagger}_{n}\bigr{|}^{L}\biggr{]}d\theta+\int_{\Theta}{\bf{E}}\biggl{[}\bigl{|}{n^{\varepsilon}\partial_{\theta}\bf{M}}^{\dagger\dagger}_{n}\bigr{|}^{L}\biggr{]}d\theta
ΘsupθΘ𝐄[|nε𝐌n|L]dθ+ΘsupθΘ𝐄[|nεθ𝐌n|L]dθ\displaystyle\leq\int_{\Theta}\sup_{\theta\in\Theta}{\bf{E}}\biggl{[}\bigl{|}{n^{\varepsilon}\bf{M}}^{\dagger\dagger}_{n}\bigr{|}^{L}\biggr{]}d\theta+\int_{\Theta}\sup_{\theta\in\Theta}{\bf{E}}\biggl{[}\bigl{|}{n^{\varepsilon}\partial_{\theta}\bf{M}}^{\dagger\dagger}_{n}\bigr{|}^{L}\biggr{]}d\theta
CΘsupθΘ𝐄[|nε𝐌n|L]+CΘsupθΘ𝐄[|nεθ𝐌n|L]\displaystyle\leq C_{\Theta}\sup_{\theta\in\Theta}{\bf{E}}\biggl{[}\bigl{|}{n^{\varepsilon}\bf{M}}^{\dagger\dagger}_{n}\bigr{|}^{L}\biggr{]}+C_{\Theta}\sup_{\theta\in\Theta}{\bf{E}}\biggl{[}\bigl{|}{n^{\varepsilon}\partial_{\theta}\bf{M}}^{\dagger\dagger}_{n}\bigr{|}^{L}\biggr{]}

for any L>qL>q, which yields

supn𝐄[supθΘ|nε𝐌n|L]<.\displaystyle\sup_{n\in\mathbb{N}}{\bf{E}}\Biggl{[}\sup_{\theta\in\Theta}\bigl{|}{n^{\varepsilon}\bf{M}}^{\dagger\dagger}_{n}\bigr{|}^{L}\Biggr{]}<\infty. (5.19)

In a similar way, it is shown that

supn𝐄[supθΘ|nε𝐑n|L]<\displaystyle\sup_{n\in\mathbb{N}}{\bf{E}}\Biggl{[}\sup_{\theta\in\Theta}\bigl{|}{n^{\varepsilon}\bf{R}}^{\dagger\dagger}_{n}\bigr{|}^{L}\Biggr{]}<\infty (5.20)

for all L>qL>q. Thus, we see from (5.19) and (5.20) that

supn𝐄[(nεsupθΘ|Yn(θ;θ0)Y(θ)|)L]\displaystyle\quad\ \sup_{n\in\mathbb{N}}{\bf{E}}\Biggl{[}\biggl{(}n^{\varepsilon}\sup_{\theta\in\Theta}\Bigl{|}{\rm{Y}}_{n}(\theta;\theta_{0})-{\rm{Y}}(\theta)\Bigr{|}\biggr{)}^{L}\Biggr{]}
CLsupn𝐄[supθΘ|nε𝐌n|L]+CLsupn𝐄[supθΘ|nε𝐑n|L]<\displaystyle\leq C_{L}\sup_{n\in\mathbb{N}}{\bf{E}}\Biggl{[}\sup_{\theta\in\Theta}\bigl{|}{n^{\varepsilon}\bf{M}}^{\dagger\dagger}_{n}\bigr{|}^{L}\Biggr{]}+C_{L}\sup_{n\in\mathbb{N}}{\bf{E}}\Biggl{[}\sup_{\theta\in\Theta}\bigl{|}{n^{\varepsilon}\bf{R}}^{\dagger\dagger}_{n}\bigr{|}^{L}\Biggr{]}<\infty

for any L>qL>q. Therefore, one gets

supn𝐄[(nεsupθΘ|Yn(θ;θ0)Y(θ)|)L]<\displaystyle\quad\ \sup_{n\in\mathbb{N}}{\bf{E}}\Biggl{[}\biggl{(}n^{\varepsilon}\sup_{\theta\in\Theta}\Bigl{|}{\rm{Y}}_{n}(\theta;\theta_{0})-{\rm{Y}}(\theta)\Bigr{|}\biggr{)}^{L}\Biggr{]}<\infty

for all ε(0,12)\varepsilon\in(0,\frac{1}{2}) and L>0L>0. ∎
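The limit Y(θ) in Lemma 6 equals minus the Kullback-Leibler divergence between centered Gaussian laws with covariances Σ(θ₀) and Σ(θ), so Y(θ₀)=0 and Y(θ)≤0. A minimal numerical check (random positive definite matrices standing in for Σ(θ) and Σ(θ₀); an illustration, not the paper's parametrization):

```python
import numpy as np

rng = np.random.default_rng(0)
p = 3
G = rng.standard_normal((p, p))
Sigma0 = G @ G.T + p * np.eye(p)  # stands in for Sigma(theta0)

def Y(Sigma):
    """Y = -1/2 * tr((Sigma^{-1} - Sigma0^{-1}) Sigma0)
           - 1/2 * log(det Sigma / det Sigma0),
    which coincides with -KL(N(0, Sigma0) || N(0, Sigma))."""
    Si = np.linalg.inv(Sigma)
    Si0 = np.linalg.inv(Sigma0)
    _, logdet = np.linalg.slogdet(Sigma)
    _, logdet0 = np.linalg.slogdet(Sigma0)
    return -0.5 * np.trace((Si - Si0) @ Sigma0) - 0.5 * (logdet - logdet0)

assert abs(Y(Sigma0)) < 1e-10      # Y vanishes at the true covariance
for _ in range(100):
    H = rng.standard_normal((p, p))
    S = H @ H.T + 0.1 * np.eye(p)  # random PD candidate for Sigma(theta)
    assert Y(S) <= 1e-10           # Y is nonpositive everywhere
```

This nonpositivity, combined with the uniform convergence of Y_n to Y established in Lemma 6, is what drives the consistency of the quasi-maximum likelihood estimator.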

Lemma 7

Under [A], for all L>0L>0,

supn𝐄𝕏n[(1nsupθΘ|θ(j1)θ(j2)θ(j3)Hn(𝕏n,θ)|)L]<\displaystyle\sup_{n\in\mathbb{N}}{\bf{E}}_{\mathbb{X}_{n}}\left[\left(\frac{1}{n}\sup_{\theta\in\Theta}\Bigl{|}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}\partial_{\theta^{(j_{3})}}{\rm{H}}_{n}(\mathbb{X}_{n},\theta)\Bigr{|}\right)^{L}\right]<\infty

for j1,j2,j3=1,,qj_{1},j_{2},j_{3}=1,\cdots,q.

Proof of Lemma 7.

Since

1nθ(j1)θ(j2)θ(j3)Hn(θ)\displaystyle\frac{1}{n}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}\partial_{\theta^{(j_{3})}}{\rm{H}}_{n}(\theta) =12nhni=1n(θ(j1)θ(j2)θ(j3)𝚺(θ)1)[(Δi𝕏)2]\displaystyle=-\frac{1}{2nh_{n}}\sum_{i=1}^{n}\Bigl{(}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}\partial_{\theta^{(j_{3})}}{\bf{\Sigma}}(\theta)^{-1}\Bigr{)}\Bigl{[}\bigl{(}\Delta_{i}\mathbb{X}\bigr{)}^{\otimes 2}\Bigr{]}
12θ(j1)θ(j2)θ(j3)logdet𝚺(θ)\displaystyle\qquad\qquad\qquad\qquad\quad-\frac{1}{2}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}\partial_{\theta^{(j_{3})}}\log\mathop{\rm det}\nolimits{\bf{\Sigma}}(\theta)

for j1,j2,j3=1,,qj_{1},j_{2},j_{3}=1,\cdots,q, it holds from Lemma 3 that

𝐄[(1n|θ(j1)θ(j2)θ(j3)Hn(θ)|)L]\displaystyle\quad\ {\bf{E}}\Biggl{[}\biggl{(}\frac{1}{n}\Bigl{|}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}\partial_{\theta^{(j_{3})}}{\rm{H}}_{n}(\theta)\Bigr{|}\biggr{)}^{L}\Biggr{]}
CLnLhnL×nL1i=1n𝐄[|(θ(j1)θ(j2)θ(j3)𝚺(θ)1)[(Δi𝕏)2]|L]\displaystyle\leq\frac{C_{L}}{n^{L}h_{n}^{L}}\times n^{L-1}\sum_{i=1}^{n}{\bf{E}}\Biggl{[}\biggl{|}\Bigl{(}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}\partial_{\theta^{(j_{3})}}{\bf{\Sigma}}(\theta)^{-1}\Bigr{)}\Bigl{[}\bigl{(}\Delta_{i}\mathbb{X}\bigr{)}^{\otimes 2}\Bigr{]}\biggr{|}^{L}\Biggr{]}
+CL|θ(j1)θ(j2)θ(j3)logdet𝚺(θ)|L\displaystyle\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad+C_{L}\biggl{|}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}\partial_{\theta^{(j_{3})}}\log\mathop{\rm det}\nolimits{\bf{\Sigma}}(\theta)\biggr{|}^{L}
CLnhnL|θ(j1)θ(j2)θ(j3)𝚺(θ)1|Li=1n𝐄[|Δi𝕏|2L]\displaystyle\leq\frac{C_{L}}{nh_{n}^{L}}\biggl{|}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}\partial_{\theta^{(j_{3})}}{\bf{\Sigma}}(\theta)^{-1}\biggr{|}^{L}\sum_{i=1}^{n}{\bf{E}}\biggl{[}\bigl{|}\Delta_{i}\mathbb{X}\bigr{|}^{2L}\biggr{]}
+CL|θ(j1)θ(j2)θ(j3)logdet𝚺(θ)|L\displaystyle\qquad\qquad\qquad\qquad\qquad\qquad\qquad+C_{L}\biggl{|}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}\partial_{\theta^{(j_{3})}}\log\mathop{\rm det}\nolimits{\bf{\Sigma}}(\theta)\biggr{|}^{L}
CLsupθΘ|θ(j1)θ(j2)θ(j3)𝚺(θ)1|L+CLsupθΘ|θ(j1)θ(j2)θ(j3)logdet𝚺(θ)|L\displaystyle\leq C_{L}\sup_{\theta\in\Theta}\biggl{|}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}\partial_{\theta^{(j_{3})}}{\bf{\Sigma}}(\theta)^{-1}\biggr{|}^{L}+C_{L}\sup_{\theta\in\Theta}\biggl{|}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}\partial_{\theta^{(j_{3})}}\log\mathop{\rm det}\nolimits{\bf{\Sigma}}(\theta)\biggr{|}^{L}
CL\displaystyle\leq C_{L}

for all L>1L>1, which yields

supnsupθΘ𝐄[(1n|θ(j1)θ(j2)θ(j3)Hn(θ)|)L]<.\displaystyle\sup_{n\in\mathbb{N}}\sup_{\theta\in\Theta}{\bf{E}}\Biggl{[}\biggl{(}\frac{1}{n}\Bigl{|}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}\partial_{\theta^{(j_{3})}}{\rm{H}}_{n}(\theta)\Bigr{|}\biggr{)}^{L}\Biggr{]}<\infty. (5.21)

Similarly, one has

supnsupθΘ𝐄[(1n|θθ(j1)θ(j2)θ(j3)Hn(θ)|)L]<\displaystyle\sup_{n\in\mathbb{N}}\sup_{\theta\in\Theta}{\bf{E}}\Biggl{[}\biggl{(}\frac{1}{n}\Bigl{|}\partial_{\theta}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}\partial_{\theta^{(j_{3})}}{\rm{H}}_{n}(\theta)\Bigr{|}\biggr{)}^{L}\Biggr{]}<\infty (5.22)

for any L>1L>1. By using the Sobolev inequality, it is shown that

𝐄[(1nsupθΘ|θ(j1)θ(j2)θ(j3)Hn(θ)|)L]\displaystyle\quad\ {\bf{E}}\Biggl{[}\biggl{(}\frac{1}{n}\sup_{\theta\in\Theta}\Bigl{|}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}\partial_{\theta^{(j_{3})}}{\rm{H}}_{n}(\theta)\Bigr{|}\biggr{)}^{L}\Biggr{]}
𝐄[Θ|1nθ(j1)θ(j2)θ(j3)Hn(θ)|L+|1nθθ(j1)θ(j2)θ(j3)Hn(θ)|Ldθ]\displaystyle\leq{\bf{E}}\Biggl{[}\int_{\Theta}\biggl{|}\frac{1}{n}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}\partial_{\theta^{(j_{3})}}{\rm{H}}_{n}(\theta)\biggr{|}^{L}+\biggl{|}\frac{1}{n}\partial_{\theta}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}\partial_{\theta^{(j_{3})}}{\rm{H}}_{n}(\theta)\biggr{|}^{L}d\theta\Biggr{]}
=Θ𝐄[|1nθ(j1)θ(j2)θ(j3)Hn(θ)|L]dθ+Θ𝐄[|1nθθ(j1)θ(j2)θ(j3)Hn(θ)|L]dθ\displaystyle=\int_{\Theta}{\bf{E}}\Biggl{[}\biggl{|}\frac{1}{n}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}\partial_{\theta^{(j_{3})}}{\rm{H}}_{n}(\theta)\biggr{|}^{L}\Biggr{]}d\theta+\int_{\Theta}{\bf{E}}\Biggl{[}\biggl{|}\frac{1}{n}\partial_{\theta}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}\partial_{\theta^{(j_{3})}}{\rm{H}}_{n}(\theta)\biggr{|}^{L}\Biggr{]}d\theta
ΘsupθΘ𝐄[|1nθ(j1)θ(j2)θ(j3)Hn(θ)|L]dθ\displaystyle\leq\int_{\Theta}\sup_{\theta\in\Theta}{\bf{E}}\Biggl{[}\biggl{|}\frac{1}{n}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}\partial_{\theta^{(j_{3})}}{\rm{H}}_{n}(\theta)\biggr{|}^{L}\Biggr{]}d\theta
+ΘsupθΘ𝐄[|1nθθ(j1)θ(j2)θ(j3)Hn(θ)|L]dθ\displaystyle\qquad\qquad\qquad\qquad\qquad+\int_{\Theta}\sup_{\theta\in\Theta}{\bf{E}}\Biggl{[}\biggl{|}\frac{1}{n}\partial_{\theta}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}\partial_{\theta^{(j_{3})}}{\rm{H}}_{n}(\theta)\biggr{|}^{L}\Biggr{]}d\theta
CΘsupθΘ𝐄[|1nθ(j1)θ(j2)θ(j3)Hn(θ)|L]\displaystyle\leq C_{\Theta}\sup_{\theta\in\Theta}{\bf{E}}\Biggl{[}\biggl{|}\frac{1}{n}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}\partial_{\theta^{(j_{3})}}{\rm{H}}_{n}(\theta)\biggr{|}^{L}\Biggr{]}
+CΘsupθΘ𝐄[|1nθθ(j1)θ(j2)θ(j3)Hn(θ)|L]\displaystyle\qquad\qquad\qquad\qquad\qquad+C_{\Theta}\sup_{\theta\in\Theta}{\bf{E}}\Biggl{[}\biggl{|}\frac{1}{n}\partial_{\theta}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}\partial_{\theta^{(j_{3})}}{\rm{H}}_{n}(\theta)\biggr{|}^{L}\Biggr{]}

for all L>qL>q, so that we obtain from (5.21) and (5.22) that for all L>0L>0,

supn𝐄[(1nsupθΘ|θ(j1)θ(j2)θ(j3)Hn(θ)|)L]<\displaystyle\sup_{n\in\mathbb{N}}{\bf{E}}\Biggl{[}\biggl{(}\frac{1}{n}\sup_{\theta\in\Theta}\Bigl{|}\partial_{\theta^{(j_{1})}}\partial_{\theta^{(j_{2})}}\partial_{\theta^{(j_{3})}}{\rm{H}}_{n}(\theta)\Bigr{|}\biggr{)}^{L}\Biggr{]}<\infty

for j1,j2,j3=1,,qj_{1},j_{2},j_{3}=1,\cdots,q. ∎

Lemma 8

Under [A] and [B1], for all L>0L>0, there exists CL>0C_{L}>0 such that

𝐏(supuVn(r)Zn(𝕏n,u;θ0)er)CLrL\displaystyle{\bf{P}}\left(\sup_{u\in{\rm{V}}_{n}(r)}{\rm{Z}}_{n}(\mathbb{X}_{n},u;\theta_{0})\geq e^{-r}\right)\leq\frac{C_{L}}{r^{L}}

for all r>0r>0 and nn\in\mathbb{N}.

Proof.

It is enough to check the regularity conditions [A1′′], [A4], [A6], [B1] and [B2] of Theorem 3 (c) in Yoshida [19]. It is supposed that α\alpha, ρ1\rho_{1}, ρ2\rho_{2}, β1\beta_{1} and β2\beta_{2} satisfy [A4]:

0<β1<12, 0<ρ1<min{1,β,2β11α}, 2α<ρ2,β20, 12β2ρ2>0,\displaystyle 0<\beta_{1}<\frac{1}{2},\ \ 0<\rho_{1}<\min\Bigl{\{}1,\beta,\frac{2\beta_{1}}{1-\alpha}\Bigr{\}},\ \ 2\alpha<\rho_{2},\ \ \beta_{2}\geq 0,\ \ 1-2\beta_{2}-\rho_{2}>0,

where β=α(1α)1\beta=\alpha(1-\alpha)^{-1}. For example, we can take α=110\alpha=\frac{1}{10}, ρ1=110\rho_{1}=\frac{1}{10}, ρ2=14\rho_{2}=\frac{1}{4}, β1=14\beta_{1}=\frac{1}{4} and β2=13\beta_{2}=\frac{1}{3}. For any L>0L>0, it follows from Lemmas 4 and 6 that

supn𝐄[|1nθHn(θ0)|M1]<\displaystyle\sup_{n\in\mathbb{N}}{\bf{E}}\Biggl{[}\biggl{|}\frac{1}{\sqrt{n}}\partial_{\theta}{\rm{H}}_{n}(\theta_{0})\biggr{|}^{M_{1}}\Biggr{]}<\infty

and

supn𝐄[(supθΘn12β1|Yn(θ;θ0)Y(θ)|)M2]<,\displaystyle\sup_{n\in\mathbb{N}}{\bf{E}}\left[\left(\sup_{\theta\in\Theta}n^{\frac{1}{2}-\beta_{1}}\Bigl{|}{\rm{Y}}_{n}(\theta;\theta_{0})-{\rm{Y}}(\theta)\Bigr{|}\right)^{M_{2}}\right]<\infty,

where M1=L(1ρ1)1>0M_{1}=L(1-\rho_{1})^{-1}>0 and M2=L(12β2ρ2)1>0M_{2}=L(1-2\beta_{2}-\rho_{2})^{-1}>0, so that [A6] is satisfied. Furthermore, we see from Lemmas 5 and 7 that

supn𝐄[(supθΘ1n|3θHn(θ)|)M3]<\displaystyle\sup_{n\in\mathbb{N}}{\bf{E}}\left[\left(\sup_{\theta\in\Theta}\frac{1}{n}\Bigl{|}\partial^{3}_{\theta}{\rm{H}}_{n}(\theta)\Bigr{|}\right)^{M_{3}}\right]<\infty

and

supn𝐄[|nβ1{1n2θHn(θ0)+𝐈(θ0)}|M4]<\displaystyle\sup_{n\in\mathbb{N}}{\bf{E}}\Biggl{[}\biggl{|}n^{\beta_{1}}\left\{\frac{1}{n}\partial^{2}_{\theta}{\rm{H}}_{n}(\theta_{0})+{\bf{I}}(\theta_{0})\right\}\biggr{|}^{M_{4}}\Biggr{]}<\infty

for all L>0L>0, where M3=L(βρ1)1>0M_{3}=L(\beta-\rho_{1})^{-1}>0 and M4=L(2β11αρ1)1>0M_{4}=L\bigl{(}\frac{2\beta_{1}}{1-\alpha}-\rho_{1}\bigr{)}^{-1}>0. Hence, [A1′′] is satisfied. It follows from Lemma 35 in Kusano and Uchida [14] and [B1] (b) that 𝐈(θ0){\bf{I}}(\theta_{0}) is a positive definite matrix, so that [B1] is satisfied. Moreover, [B1] (a) yields [B2]. ∎
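As a sanity check, the example values α=1/10, ρ1=1/10, ρ2=1/4, β1=1/4 and β2=1/3 quoted in the proof can be verified against [A4] with exact rational arithmetic, together with the positivity of the exponents M3 and M4; a minimal sketch (the condition labels are informal shorthand):

```python
from fractions import Fraction as F

# Example tuning constants from the proof of Lemma 8
alpha, rho1, rho2, beta1, beta2 = F(1, 10), F(1, 10), F(1, 4), F(1, 4), F(1, 3)
beta = alpha / (1 - alpha)  # beta = alpha(1-alpha)^{-1} = 1/9

# The conditions of [A4]
checks = {
    "0 < beta1 < 1/2": 0 < beta1 < F(1, 2),
    "0 < rho1 < min{1, beta, 2*beta1/(1-alpha)}":
        0 < rho1 < min(F(1), beta, 2 * beta1 / (1 - alpha)),
    "2*alpha < rho2": 2 * alpha < rho2,
    "beta2 >= 0": beta2 >= 0,
    "1 - 2*beta2 - rho2 > 0": 1 - 2 * beta2 - rho2 > 0,
}
assert all(checks.values())

# Positivity of the exponents M3 and M4 in the proof:
# beta - rho1 = 1/9 - 1/10 = 1/90 and 2*beta1/(1-alpha) - rho1 = 5/9 - 1/10 = 41/90
assert beta - rho1 == F(1, 90)
assert 2 * beta1 / (1 - alpha) - rho1 == F(41, 90)
```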

Let 𝔼\mathbb{E} denote the expectation with respect to the probability measure on the space on which ζ\zeta is realized.

Lemma 9

Under [A] and [B1], for all L>0L>0,

supn𝐄𝕏n[|n(θ^nθ0)|L]<\displaystyle\sup_{n\in\mathbb{N}}{\bf{E}}_{\mathbb{X}_{n}}\Biggl{[}\Bigl{|}\sqrt{n}(\hat{\theta}_{n}-\theta_{0})\Bigr{|}^{L}\Biggr{]}<\infty

and for fC(q)f\in C_{\uparrow}(\mathbb{R}^{q}),

𝐄𝕏n[f(n(θ^nθ0))]𝔼[f(𝐈(θ0)12ζ)]\displaystyle{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}f\Bigl{(}\sqrt{n}(\hat{\theta}_{n}-\theta_{0})\Bigr{)}\biggr{]}\longrightarrow\mathbb{E}\biggl{[}f\Bigl{(}{\bf{I}}(\theta_{0})^{-\frac{1}{2}}\zeta\Bigr{)}\biggr{]}

as nn\longrightarrow\infty.

Proof.

Note that u^n𝕌n\hat{u}_{n}\in\mathbb{U}_{n} since θ0+1nu^n=θ^nΘ\theta_{0}+\frac{1}{\sqrt{n}}\hat{u}_{n}=\hat{\theta}_{n}\in\Theta. For all r>0r>0, we have

0\displaystyle 0 Hn(θ^n)Hn(θ0)\displaystyle\leq{\rm{H}}_{n}(\hat{\theta}_{n})-{\rm{H}}_{n}(\theta_{0})
=Hn(θ0+1nu^n)Hn(θ0)\displaystyle={\rm{H}}_{n}\Bigl{(}\theta_{0}+\frac{1}{\sqrt{n}}\hat{u}_{n}\Bigr{)}-{\rm{H}}_{n}(\theta_{0})
=logZn(u^n;θ0)logsupuVn(r)Zn(u;θ0)\displaystyle=\log{\rm{Z}}_{n}(\hat{u}_{n};\theta_{0})\leq\log\sup_{u\in{\rm{V}}_{n}(r)}{\rm{Z}}_{n}(u;\theta_{0})

on {|u^n|r}\{|\hat{u}_{n}|\geq r\}, which yields

1supuVn(r)Zn(u;θ0)\displaystyle 1\leq\sup_{u\in{\rm{V}}_{n}(r)}{\rm{Z}}_{n}(u;\theta_{0})

on {|u^n|r}\{|\hat{u}_{n}|\geq r\}. For any L>0L>0, it holds from Lemma 8 that there exists CL>0C_{L}>0 such that

𝐏(|n(θ^nθ0)|r)\displaystyle{\bf{P}}\Bigl{(}\bigl{|}\sqrt{n}(\hat{\theta}_{n}-\theta_{0})\bigr{|}\geq r\Bigr{)} 𝐏(supuVn(r)Zn(u;θ0)1)\displaystyle\leq{\bf{P}}\left(\sup_{u\in{\rm{V}}_{n}(r)}{\rm{Z}}_{n}(u;\theta_{0})\geq 1\right)
𝐏(supuVn(r)Zn(u;θ0)er)CLrL\displaystyle\leq{\bf{P}}\left(\sup_{u\in{\rm{V}}_{n}(r)}{\rm{Z}}_{n}(u;\theta_{0})\geq e^{-r}\right)\leq\frac{C_{L}}{r^{L}}

for all r>0r>0 and nn\in\mathbb{N}, which implies

supn𝐄[|n(θ^nθ0)|L]<.\displaystyle\sup_{n\in\mathbb{N}}{\bf{E}}\Biggl{[}\Bigl{|}\sqrt{n}(\hat{\theta}_{n}-\theta_{0})\Bigr{|}^{L}\Biggr{]}<\infty. (5.23)

Furthermore, we see from (5.23) and Lemma 2 that for all fC(q)f\in C_{\uparrow}(\mathbb{R}^{q}),

𝐄[f(n(θ^nθ0))]𝔼[f(𝐈(θ0)12ζ)]\displaystyle{\bf{E}}\Biggl{[}f\Bigl{(}\sqrt{n}(\hat{\theta}_{n}-\theta_{0})\Bigr{)}\Biggr{]}\longrightarrow\mathbb{E}\Biggl{[}f\Bigl{(}{\bf{I}}(\theta_{0})^{-\frac{1}{2}}\zeta\Bigr{)}\Biggr{]}

as nn\longrightarrow\infty. ∎
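The limit in Lemma 9 is the device used repeatedly in the proof of Theorem 1 below: the limiting expectations are moments of 𝐈(θ0)^{-1/2}ζ for a q-dimensional standard normal vector ζ, so that 𝔼[ζ⊤ζ]=q and the covariance of 𝐈(θ0)^{-1/2}ζ is 𝐈(θ0)^{-1}. A minimal Monte Carlo sketch of these two moments (the positive definite matrix standing in for 𝐈(θ0) is an arbitrary illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(0)
q = 3
# An arbitrary positive definite matrix standing in for I(theta_0)
A = rng.standard_normal((q, q))
I0 = A @ A.T + q * np.eye(q)

# Form x = I(theta_0)^{-1/2} zeta with zeta ~ N(0, I_q)
w, V = np.linalg.eigh(I0)
I0_inv_sqrt = V @ np.diag(w ** -0.5) @ V.T
zeta = rng.standard_normal((200_000, q))
x = zeta @ I0_inv_sqrt  # each row is a sample of I(theta_0)^{-1/2} zeta

# E[x^T I0 x] = E[zeta^T zeta] = q
f1_mean = np.mean(np.einsum("ni,ij,nj->n", x, I0, x))
assert abs(f1_mean - q) < 0.05

# E[x x^T] = I0^{-1}
f2_mean = x.T @ x / x.shape[0]
assert np.max(np.abs(f2_mean - np.linalg.inv(I0))) < 0.05
```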

Proof of Theorem 1.

Let us consider the following decomposition:

𝐄𝕏n[logLn(𝕏n,θ^n)𝐄n[logLn(n,θ^n)]]\displaystyle{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\log{\rm{L}}_{n}(\mathbb{X}_{n},\hat{\theta}_{n})-{\bf{E}}_{\mathbb{Z}_{n}}\Bigl{[}\log{\rm{L}}_{n}(\mathbb{Z}_{n},\hat{\theta}_{n})\Bigr{]}\biggr{]} =𝐄𝕏n[Hn(𝕏n,θ^n)𝐄n[Hn(n,θ^n)]]\displaystyle={\bf{E}}_{\mathbb{X}_{n}}\biggl{[}{\rm{H}}_{n}(\mathbb{X}_{n},\hat{\theta}_{n})-{\bf{E}}_{\mathbb{Z}_{n}}\Bigl{[}{\rm{H}}_{n}(\mathbb{Z}_{n},\hat{\theta}_{n})\Bigr{]}\biggr{]}
=𝐄𝕏n[Hn(𝕏n,θ^n)Hn(𝕏n,θ0)]\displaystyle={\bf{E}}_{\mathbb{X}_{n}}\biggl{[}{\rm{H}}_{n}(\mathbb{X}_{n},\hat{\theta}_{n})-{\rm{H}}_{n}(\mathbb{X}_{n},\theta_{0})\biggr{]}
+𝐄𝕏n[Hn(𝕏n,θ0)]𝐄n[Hn(n,θ0)]\displaystyle\quad+{\bf{E}}_{\mathbb{X}_{n}}\biggr{[}{\rm{H}}_{n}(\mathbb{X}_{n},\theta_{0})\biggr{]}-{\bf{E}}_{\mathbb{Z}_{n}}\biggl{[}{\rm{H}}_{n}(\mathbb{Z}_{n},\theta_{0})\biggr{]}
+𝐄n[Hn(n,θ0)]𝐄𝕏n[𝐄n[Hn(n,θ^n)]]\displaystyle\quad+{\bf{E}}_{\mathbb{Z}_{n}}\biggl{[}{\rm{H}}_{n}(\mathbb{Z}_{n},\theta_{0})\biggr{]}-{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}{\bf{E}}_{\mathbb{Z}_{n}}\Bigl{[}{\rm{H}}_{n}(\mathbb{Z}_{n},\hat{\theta}_{n})\Bigr{]}\biggr{]}
=D1,n+D2,n+D3,n,\displaystyle={\rm{D}}_{1,n}+{\rm{D}}_{2,n}+{\rm{D}}_{3,n},

where

D1,n\displaystyle{\rm{D}}_{1,n} =𝐄𝕏n[Hn(𝕏n,θ^n)Hn(𝕏n,θ0)],\displaystyle={\bf{E}}_{\mathbb{X}_{n}}\biggl{[}{\rm{H}}_{n}(\mathbb{X}_{n},\hat{\theta}_{n})-{\rm{H}}_{n}(\mathbb{X}_{n},\theta_{0})\biggr{]},
D2,n\displaystyle{\rm{D}}_{2,n} =𝐄𝕏n[Hn(𝕏n,θ0)]𝐄n[Hn(n,θ0)],\displaystyle={\bf{E}}_{\mathbb{X}_{n}}\biggr{[}{\rm{H}}_{n}(\mathbb{X}_{n},\theta_{0})\biggr{]}-{\bf{E}}_{\mathbb{Z}_{n}}\biggl{[}{\rm{H}}_{n}(\mathbb{Z}_{n},\theta_{0})\biggr{]},
D3,n\displaystyle{\rm{D}}_{3,n} =𝐄n[Hn(n,θ0)]𝐄𝕏n[𝐄n[Hn(n,θ^n)]].\displaystyle={\bf{E}}_{\mathbb{Z}_{n}}\biggl{[}{\rm{H}}_{n}(\mathbb{Z}_{n},\theta_{0})\biggr{]}-{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}{\bf{E}}_{\mathbb{Z}_{n}}\Bigl{[}{\rm{H}}_{n}(\mathbb{Z}_{n},\hat{\theta}_{n})\Bigr{]}\biggr{]}.

First of all, we will prove

D1,n\displaystyle{\rm{D}}_{1,n} q2\displaystyle\longrightarrow\frac{q}{2} (5.24)

as nn\longrightarrow\infty. Using the Taylor expansion, one has

Hn(𝕏n,θ^n)Hn(𝕏n,θ0)\displaystyle{\rm{H}}_{n}(\mathbb{X}_{n},\hat{\theta}_{n})-{\rm{H}}_{n}(\mathbb{X}_{n},\theta_{0}) =i=1qθ(i)Hn(𝕏n,θ0)(θ^(i)nθ(i)0)\displaystyle=\sum_{i=1}^{q}\partial_{\theta^{(i)}}{\rm{H}}_{n}(\mathbb{X}_{n},\theta_{0})(\hat{\theta}^{(i)}_{n}-\theta^{(i)}_{0})
+12i=1qj=1qθ(i)θ(j)Hn(𝕏n,θ0)(θ^(i)nθ(i)0)(θ^(j)nθ(j)0)\displaystyle\quad+\frac{1}{2}\sum_{i=1}^{q}\sum_{j=1}^{q}\partial_{\theta^{(i)}}\partial_{\theta^{(j)}}{\rm{H}}_{n}(\mathbb{X}_{n},\theta_{0})(\hat{\theta}^{(i)}_{n}-\theta^{(i)}_{0})(\hat{\theta}^{(j)}_{n}-\theta^{(j)}_{0})
+12i=1qj=1qk=1q(01(1λ)2θ(i)θ(j)θ(k)Hn(𝕏n,θ~n,λ)dλ)\displaystyle\quad+\frac{1}{2}\sum_{i=1}^{q}\sum_{j=1}^{q}\sum_{k=1}^{q}\left(\int_{0}^{1}(1-\lambda)^{2}\partial_{\theta^{(i)}}\partial_{\theta^{(j)}}\partial_{\theta^{(k)}}{\rm{H}}_{n}(\mathbb{X}_{n},\tilde{\theta}_{n,\lambda})d\lambda\right)
×(θ^(i)nθ(i)0)(θ^(j)nθ(j)0)(θ^(k)nθ(k)0)\displaystyle\qquad\qquad\qquad\qquad\qquad\qquad\times(\hat{\theta}^{(i)}_{n}-\theta^{(i)}_{0})(\hat{\theta}^{(j)}_{n}-\theta^{(j)}_{0})(\hat{\theta}^{(k)}_{n}-\theta^{(k)}_{0})
=E1,n+E2,n+E3,n,\displaystyle={\rm{E}}_{1,n}+{\rm{E}}_{2,n}+{\rm{E}}_{3,n},

where θ~n,λ=θ0+λ(θ^nθ0)\tilde{\theta}_{n,\lambda}=\theta_{0}+\lambda(\hat{\theta}_{n}-\theta_{0}) and

E1,n\displaystyle{\rm{E}}_{1,n} =i=1qθ(i)Hn(𝕏n,θ0)(θ^(i)nθ(i)0),\displaystyle=\sum_{i=1}^{q}\partial_{\theta^{(i)}}{\rm{H}}_{n}(\mathbb{X}_{n},\theta_{0})(\hat{\theta}^{(i)}_{n}-\theta^{(i)}_{0}),
E2,n\displaystyle{\rm{E}}_{2,n} =12i=1qj=1qθ(i)θ(j)Hn(𝕏n,θ0)(θ^(i)nθ(i)0)(θ^(j)nθ(j)0),\displaystyle=\frac{1}{2}\sum_{i=1}^{q}\sum_{j=1}^{q}\partial_{\theta^{(i)}}\partial_{\theta^{(j)}}{\rm{H}}_{n}(\mathbb{X}_{n},\theta_{0})(\hat{\theta}^{(i)}_{n}-\theta^{(i)}_{0})(\hat{\theta}^{(j)}_{n}-\theta^{(j)}_{0}),
E3,n\displaystyle{\rm{E}}_{3,n} =12i=1qj=1qk=1q(01(1λ)2θ(i)θ(j)θ(k)Hn(𝕏n,θ~n,λ)dλ)\displaystyle=\frac{1}{2}\sum_{i=1}^{q}\sum_{j=1}^{q}\sum_{k=1}^{q}\left(\int_{0}^{1}(1-\lambda)^{2}\partial_{\theta^{(i)}}\partial_{\theta^{(j)}}\partial_{\theta^{(k)}}{\rm{H}}_{n}(\mathbb{X}_{n},\tilde{\theta}_{n,\lambda})d\lambda\right)
×(θ^(i)nθ(i)0)(θ^(j)nθ(j)0)(θ^(k)nθ(k)0).\displaystyle\qquad\qquad\qquad\qquad\qquad\qquad\qquad\quad\times(\hat{\theta}^{(i)}_{n}-\theta^{(i)}_{0})(\hat{\theta}^{(j)}_{n}-\theta^{(j)}_{0})(\hat{\theta}^{(k)}_{n}-\theta^{(k)}_{0}).
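The decomposition above rests on the third-order Taylor formula with integral remainder, which in the scalar case reads f(a+h) − f(a) = f′(a)h + (1/2)f″(a)h² + (1/2)h³∫₀¹(1−λ)²f‴(a+λh)dλ. A minimal numerical sketch (f = sin is an illustrative choice, not from the paper):

```python
import math
import numpy as np

# Scalar instance of the expansion used for H_n:
# f(a+h) - f(a) = f'(a) h + (1/2) f''(a) h^2
#                 + (1/2) h^3 \int_0^1 (1-lam)^2 f'''(a + lam h) dlam
f = math.sin
f1 = math.cos
f2 = lambda x: -math.sin(x)
f3 = lambda x: -math.cos(x)

a, h = 0.3, 0.2
lhs = f(a + h) - f(a) - f1(a) * h - 0.5 * f2(a) * h ** 2

# Composite midpoint rule for the remainder integral
N = 100_000
lam = (np.arange(N) + 0.5) / N
integral = np.mean((1 - lam) ** 2 * (-np.cos(a + lam * h)))
rhs = 0.5 * h ** 3 * integral

assert abs(lhs - rhs) < 1e-9
```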

First, we consider the expectation of E1,n\rm{E}_{1,n}. Set

An={θ^nintΘ}.\displaystyle A_{n}=\Bigl{\{}\hat{\theta}_{n}\in{\rm{int}}\Theta\Bigr{\}}.

Note that 𝐏(An)1{\bf{P}}(A_{n})\longrightarrow 1 as nn\longrightarrow\infty. By using the Taylor expansion, one gets

0\displaystyle 0 =1nθ(i)Hn(𝕏n,θ^n)\displaystyle=\frac{1}{\sqrt{n}}\partial_{\theta^{(i)}}{\rm{H}}_{n}(\mathbb{X}_{n},\hat{\theta}_{n})
=1nθ(i)Hn(𝕏n,θ0)+1nj=1qθ(i)θ(j)Hn(𝕏n,θ0)u^n(j)\displaystyle=\frac{1}{\sqrt{n}}\partial_{\theta^{(i)}}{\rm{H}}_{n}(\mathbb{X}_{n},\theta_{0})+\frac{1}{n}\sum_{j=1}^{q}\partial_{\theta^{(i)}}\partial_{\theta^{(j)}}{\rm{H}}_{n}(\mathbb{X}_{n},\theta_{0})\hat{u}_{n}^{(j)}
+1nnj=1qk=1q(01(1λ)θ(i)θ(j)θ(k)Hn(𝕏n,θ~n,λ)dλ)u^n(j)u^n(k)\displaystyle\quad+\frac{1}{n\sqrt{n}}\sum_{j=1}^{q}\sum_{k=1}^{q}\left(\int_{0}^{1}(1-\lambda)\partial_{\theta^{(i)}}\partial_{\theta^{(j)}}\partial_{\theta^{(k)}}{\rm{H}}_{n}(\mathbb{X}_{n},\tilde{\theta}_{n,\lambda})d\lambda\right)\hat{u}_{n}^{(j)}\hat{u}_{n}^{(k)}

for i=1,,qi=1,\cdots,q on AnA_{n}, so that we have

E1,n=1ni=1qθ(i)Hn(𝕏n,θ0)u^(i)n=i=1qj=1q𝐈(θ0)iju^(i)nu^(j)ni=1qR1,n(i)u^(i)n,\displaystyle\begin{split}{\rm{E}}_{1,n}&=\frac{1}{\sqrt{n}}\sum_{i=1}^{q}\partial_{\theta^{(i)}}{\rm{H}}_{n}(\mathbb{X}_{n},\theta_{0})\hat{u}^{(i)}_{n}\\ &=\sum_{i=1}^{q}\sum_{j=1}^{q}{\bf{I}}(\theta_{0})_{ij}\hat{u}^{(i)}_{n}\hat{u}^{(j)}_{n}-\sum_{i=1}^{q}{\rm{R}}_{1,n}^{(i)}\hat{u}^{(i)}_{n},\end{split} (5.25)

where

R1,n(i)={j=1q{1nθ(i)θ(j)Hn(𝕏n,θ0)+𝐈(θ0)ij}u^(j)n+1nnj=1qk=1q(01(1λ)θ(i)θ(j)θ(k)Hn(𝕏n,θ~n,λ)dλ)u^n(j)u^n(k),(onAn)j=1q𝐈(θ0)iju^(j)n1nθ(i)Hn(𝕏n,θ0).(onAnc)\displaystyle{\rm{R}}_{1,n}^{(i)}=\left\{\begin{aligned} \hfil\displaystyle\begin{split}&\sum_{j=1}^{q}\left\{\frac{1}{n}\partial_{\theta^{(i)}}\partial_{\theta^{(j)}}{\rm{H}}_{n}(\mathbb{X}_{n},\theta_{0})+{\bf{I}}(\theta_{0})_{ij}\right\}\hat{u}^{(j)}_{n}\\ &\quad+\frac{1}{n\sqrt{n}}\sum_{j=1}^{q}\sum_{k=1}^{q}\left(\int_{0}^{1}(1-\lambda)\partial_{\theta^{(i)}}\partial_{\theta^{(j)}}\partial_{\theta^{(k)}}{\rm{H}}_{n}(\mathbb{X}_{n},\tilde{\theta}_{n,\lambda})d\lambda\right)\hat{u}_{n}^{(j)}\hat{u}_{n}^{(k)},\end{split}&\quad\bigl{(}\mbox{on}\ A_{n}\bigr{)}\\ &\sum_{j=1}^{q}{\bf{I}}(\theta_{0})_{ij}\hat{u}^{(j)}_{n}-\frac{1}{\sqrt{n}}\partial_{\theta^{(i)}}{\rm{H}}_{n}(\mathbb{X}_{n},\theta_{0}).&\quad\bigl{(}\mbox{on}\ A_{n}^{c}\bigr{)}\end{aligned}\right.

Let

R¯1,n(i)\displaystyle\bar{{\rm{R}}}_{1,n}^{(i)} =j=1q{1nθ(i)θ(j)Hn(𝕏n,θ0)+𝐈(θ0)ij}u^(j)n\displaystyle=\sum_{j=1}^{q}\left\{\frac{1}{n}\partial_{\theta^{(i)}}\partial_{\theta^{(j)}}{\rm{H}}_{n}(\mathbb{X}_{n},\theta_{0})+{\bf{I}}(\theta_{0})_{ij}\right\}\hat{u}^{(j)}_{n}
+1nnj=1qk=1q(01(1λ)θ(i)θ(j)θ(k)Hn(𝕏n,θ~n,λ)dλ)u^n(j)u^n(k).\displaystyle\qquad+\frac{1}{n\sqrt{n}}\sum_{j=1}^{q}\sum_{k=1}^{q}\left(\int_{0}^{1}(1-\lambda)\partial_{\theta^{(i)}}\partial_{\theta^{(j)}}\partial_{\theta^{(k)}}{\rm{H}}_{n}(\mathbb{X}_{n},\tilde{\theta}_{n,\lambda})d\lambda\right)\hat{u}_{n}^{(j)}\hat{u}_{n}^{(k)}.

Since

𝐄𝕏n[|n14R¯1,n(i)|2]\displaystyle\quad\ {\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\bigl{|}n^{\frac{1}{4}}\bar{{\rm{R}}}_{1,n}^{(i)}\bigr{|}^{2}\biggr{]}
Cj=1q𝐄𝕏n[|n14{1nθ(i)θ(j)Hn(𝕏n,θ0)+𝐈(θ0)ij}u^(j)n|2]\displaystyle\leq C\sum_{j=1}^{q}{\bf{E}}_{\mathbb{X}_{n}}\left[\Biggl{|}n^{\frac{1}{4}}\left\{\frac{1}{n}\partial_{\theta^{(i)}}\partial_{\theta^{(j)}}{\rm{H}}_{n}(\mathbb{X}_{n},\theta_{0})+{\bf{I}}(\theta_{0})_{ij}\right\}\hat{u}^{(j)}_{n}\Biggr{|}^{2}\right]
+Cnj=1qk=1q𝐄𝕏n[|1n(01(1λ)θ(i)θ(j)θ(k)Hn(𝕏n,θ~n,λ)dλ)u^n(j)u^n(k)|2]\displaystyle\qquad+\frac{C}{\sqrt{n}}\sum_{j=1}^{q}\sum_{k=1}^{q}{\bf{E}}_{\mathbb{X}_{n}}\left[\Biggl{|}\frac{1}{n}\left(\int_{0}^{1}(1-\lambda)\partial_{\theta^{(i)}}\partial_{\theta^{(j)}}\partial_{\theta^{(k)}}{\rm{H}}_{n}(\mathbb{X}_{n},\tilde{\theta}_{n,\lambda})d\lambda\right)\hat{u}_{n}^{(j)}\hat{u}_{n}^{(k)}\Biggr{|}^{2}\right]
Cj=1q𝐄𝕏n[(n14|1nθ(i)θ(j)Hn(𝕏n,θ0)+𝐈(θ0)ij|)4]12𝐄𝕏n[|u^(j)n|4]12\displaystyle\leq C\sum_{j=1}^{q}{\bf{E}}_{\mathbb{X}_{n}}\left[\biggl{(}n^{\frac{1}{4}}\left|\frac{1}{n}\partial_{\theta^{(i)}}\partial_{\theta^{(j)}}{\rm{H}}_{n}(\mathbb{X}_{n},\theta_{0})+{\bf{I}}(\theta_{0})_{ij}\right|\biggr{)}^{4}\right]^{\frac{1}{2}}{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\bigl{|}\hat{u}^{(j)}_{n}\bigr{|}^{4}\biggr{]}^{\frac{1}{2}}
+Cnj=1qk=1q𝐄𝕏n[(1nsupθΘ|θ(i)θ(j)θ(k)Hn(𝕏n,θ)|)4]12\displaystyle\qquad+\frac{C}{\sqrt{n}}\sum_{j=1}^{q}\sum_{k=1}^{q}{\bf{E}}_{\mathbb{X}_{n}}\left[\biggl{(}\frac{1}{n}\sup_{\theta\in\Theta}\Bigl{|}\partial_{\theta^{(i)}}\partial_{\theta^{(j)}}\partial_{\theta^{(k)}}{\rm{H}}_{n}(\mathbb{X}_{n},\theta)\Bigr{|}\biggr{)}^{4}\right]^{\frac{1}{2}}
×𝐄𝕏n[|u^(j)n|8]14𝐄𝕏n[|u^(k)n|8]14,\displaystyle\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad\times{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\bigl{|}\hat{u}^{(j)}_{n}\bigr{|}^{8}\biggr{]}^{\frac{1}{4}}{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\bigl{|}\hat{u}^{(k)}_{n}\bigr{|}^{8}\biggr{]}^{\frac{1}{4}},

it holds from Lemmas 5, 7 and 9 that

supn𝐄𝕏n[|n14R¯1,n(i)|2]<.\displaystyle\sup_{n\in\mathbb{N}}{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\bigl{|}n^{\frac{1}{4}}\bar{{\rm{R}}}_{1,n}^{(i)}\bigr{|}^{2}\biggr{]}<\infty. (5.26)

Consequently, we see from (5.26) and Lemma 9 that

|𝐄𝕏n[i=1qR1,n(i)u^(i)n1An]|\displaystyle\left|{\bf{E}}_{\mathbb{X}_{n}}\Biggl{[}\sum_{i=1}^{q}{\rm{R}}_{1,n}^{(i)}\hat{u}^{(i)}_{n}1_{A_{n}}\Biggr{]}\right| =|𝐄𝕏n[i=1qR¯1,n(i)u^(i)n1An]|\displaystyle=\left|{\bf{E}}_{\mathbb{X}_{n}}\Biggl{[}\sum_{i=1}^{q}\bar{{\rm{R}}}_{1,n}^{(i)}\hat{u}^{(i)}_{n}1_{A_{n}}\Biggr{]}\right|
i=1q𝐄𝕏n[|R¯1,n(i)||u^(i)n|]\displaystyle\leq\sum_{i=1}^{q}{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\bigl{|}\bar{{\rm{R}}}_{1,n}^{(i)}\bigr{|}\bigl{|}\hat{u}^{(i)}_{n}\bigr{|}\biggr{]}
i=1q𝐄𝕏n[|R¯1,n(i)|2]12𝐄𝕏n[|u^n(i)|2]12\displaystyle\leq\sum_{i=1}^{q}{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\bigl{|}\bar{{\rm{R}}}_{1,n}^{(i)}\bigr{|}^{2}\biggr{]}^{\frac{1}{2}}{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\bigl{|}\hat{u}_{n}^{(i)}\bigr{|}^{2}\biggr{]}^{\frac{1}{2}}
1n14i=1qsupn𝐄𝕏n[|n14R¯1,n(i)|2]12supn𝐄𝕏n[|u^n(i)|2]120\displaystyle\leq\frac{1}{n^{\frac{1}{4}}}\sum_{i=1}^{q}\sup_{n\in\mathbb{N}}{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\bigl{|}n^{\frac{1}{4}}\bar{{\rm{R}}}_{1,n}^{(i)}\bigr{|}^{2}\biggr{]}^{\frac{1}{2}}\sup_{n\in\mathbb{N}}{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\bigl{|}\hat{u}_{n}^{(i)}\bigr{|}^{2}\biggr{]}^{\frac{1}{2}}\longrightarrow 0

as nn\longrightarrow\infty, which yields

𝐄𝕏n[i=1qR1,n(i)u^(i)n1An]0\displaystyle{\bf{E}}_{\mathbb{X}_{n}}\left[\sum_{i=1}^{q}{\rm{R}}_{1,n}^{(i)}\hat{u}^{(i)}_{n}1_{A_{n}}\right]\longrightarrow 0 (5.27)

as nn\longrightarrow\infty. Set

R¯1,n(i)=j=1q𝐈(θ0)iju^(j)n1nθ(i)Hn(𝕏n,θ0).\displaystyle\underline{{\rm{R}}}_{1,n}^{(i)}=\sum_{j=1}^{q}{\bf{I}}(\theta_{0})_{ij}\hat{u}^{(j)}_{n}-\frac{1}{\sqrt{n}}\partial_{\theta^{(i)}}{\rm{H}}_{n}(\mathbb{X}_{n},\theta_{0}).

Using Lemmas 4 and 9, we obtain

𝐄𝕏n[|R¯1,n(i)|2]\displaystyle{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\bigl{|}\underline{{\rm{R}}}_{1,n}^{(i)}\bigr{|}^{2}\biggr{]} Cj=1q𝐄𝕏n[|𝐈(θ0)iju^(j)n|2]+C𝐄𝕏n[|1nθ(i)Hn(𝕏n,θ0)|2]\displaystyle\leq C\sum_{j=1}^{q}{\bf{E}}_{\mathbb{X}_{n}}\Biggl{[}\Bigl{|}{\bf{I}}(\theta_{0})_{ij}\hat{u}^{(j)}_{n}\Bigr{|}^{2}\Biggr{]}+C{\bf{E}}_{\mathbb{X}_{n}}\Biggl{[}\biggl{|}\frac{1}{\sqrt{n}}\partial_{\theta^{(i)}}{\rm{H}}_{n}(\mathbb{X}_{n},\theta_{0})\biggr{|}^{2}\Biggr{]}
Cj=1qsupn𝐄𝕏n[|u^(j)n|2]+Csupn𝐄𝕏n[|1nθ(i)Hn(𝕏n,θ0)|2]<,\displaystyle\leq C\sum_{j=1}^{q}\sup_{n\in\mathbb{N}}{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\bigl{|}\hat{u}^{(j)}_{n}\bigr{|}^{2}\biggr{]}+C\sup_{n\in\mathbb{N}}{\bf{E}}_{\mathbb{X}_{n}}\Biggl{[}\biggl{|}\frac{1}{\sqrt{n}}\partial_{\theta^{(i)}}{\rm{H}}_{n}(\mathbb{X}_{n},\theta_{0})\biggr{|}^{2}\Biggr{]}<\infty,

which implies

supn𝐄𝕏n[|R¯1,n(i)|2]<.\displaystyle\sup_{n\in\mathbb{N}}{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\bigl{|}\underline{{\rm{R}}}_{1,n}^{(i)}\bigr{|}^{2}\biggr{]}<\infty. (5.28)

It follows from (5.28) and Lemma 9 that

|𝐄𝕏n[i=1qR1,n(i)u^(i)n1Anc]|\displaystyle\left|{\bf{E}}_{\mathbb{X}_{n}}\left[\sum_{i=1}^{q}{\rm{R}}_{1,n}^{(i)}\hat{u}^{(i)}_{n}1_{A_{n}^{c}}\right]\right| =|𝐄𝕏n[i=1qR¯1,n(i)u^(i)n1Anc]|\displaystyle=\left|{\bf{E}}_{\mathbb{X}_{n}}\left[\sum_{i=1}^{q}{\underline{\rm{R}}}_{1,n}^{(i)}\hat{u}^{(i)}_{n}1_{A_{n}^{c}}\right]\right|
i=1q𝐄𝕏n[|R¯1,n(i)||u^(i)n|1Anc]\displaystyle\leq\sum_{i=1}^{q}{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\bigl{|}\underline{{\rm{R}}}_{1,n}^{(i)}\bigr{|}\bigl{|}\hat{u}^{(i)}_{n}\bigr{|}1_{A_{n}^{c}}\biggr{]}
i=1q𝐄𝕏n[|R¯1,n(i)|2]12𝐄𝕏n[|u^(i)n|4]14𝐄𝕏n[1Anc]14\displaystyle\leq\sum_{i=1}^{q}{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\bigl{|}\underline{{\rm{R}}}_{1,n}^{(i)}\bigr{|}^{2}\biggr{]}^{\frac{1}{2}}{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\bigl{|}\hat{u}^{(i)}_{n}\bigr{|}^{4}\biggr{]}^{\frac{1}{4}}{\bf{E}}_{\mathbb{X}_{n}}\Bigl{[}1_{A_{n}^{c}}\Bigr{]}^{\frac{1}{4}}
𝐏(Anc)14i=1qsupn𝐄𝕏n[|R¯1,n(i)|2]12supn𝐄𝕏n[|u^(i)n|4]14\displaystyle\leq{\bf{P}}\bigl{(}A_{n}^{c}\bigr{)}^{\frac{1}{4}}\sum_{i=1}^{q}\sup_{n\in\mathbb{N}}{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\bigl{|}\underline{{\rm{R}}}_{1,n}^{(i)}\bigr{|}^{2}\biggr{]}^{\frac{1}{2}}\sup_{n\in\mathbb{N}}{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\bigl{|}\hat{u}^{(i)}_{n}\bigr{|}^{4}\biggr{]}^{\frac{1}{4}}
0\displaystyle\longrightarrow 0

as nn\longrightarrow\infty, so that one gets

𝐄𝕏n[i=1qR1,n(i)u^(i)n1Anc]0\displaystyle{\bf{E}}_{\mathbb{X}_{n}}\left[\sum_{i=1}^{q}{\rm{R}}_{1,n}^{(i)}\hat{u}^{(i)}_{n}1_{A_{n}^{c}}\right]\longrightarrow 0 (5.29)

as nn\longrightarrow\infty. Hence, it holds from (5.27) and (5.29) that

𝐄𝕏n[i=1qR1,n(i)u^(i)n]=𝐄𝕏n[i=1qR1,n(i)u^(i)n1An]+𝐄𝕏n[i=1qR1,n(i)u^(i)n1Anc]0\displaystyle\begin{split}{\bf{E}}_{\mathbb{X}_{n}}\left[\sum_{i=1}^{q}{\rm{R}}_{1,n}^{(i)}\hat{u}^{(i)}_{n}\right]&={\bf{E}}_{\mathbb{X}_{n}}\left[\sum_{i=1}^{q}{\rm{R}}_{1,n}^{(i)}\hat{u}^{(i)}_{n}1_{A_{n}}\right]+{\bf{E}}_{\mathbb{X}_{n}}\left[\sum_{i=1}^{q}{\rm{R}}_{1,n}^{(i)}\hat{u}^{(i)}_{n}1_{A_{n}^{c}}\right]\longrightarrow 0\end{split} (5.30)

as nn\longrightarrow\infty. Let

f1(x)=x𝐈(θ0)x\displaystyle f_{1}(x)=x^{\top}{\bf{I}}(\theta_{0})x

for xqx\in\mathbb{R}^{q}. Since f1C(q)f_{1}\in C_{\uparrow}(\mathbb{R}^{q}), we see from Lemma 9 that

𝐄𝕏n[u^n𝐈(θ0)u^n]=𝐄𝕏n[f1(u^n)]𝔼[f1(𝐈(θ0)12ζ)]=𝔼[ζζ]=q\displaystyle\begin{split}{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\hat{u}_{n}^{\top}{\bf{I}}(\theta_{0})\hat{u}_{n}\biggr{]}&={\bf{E}}_{\mathbb{X}_{n}}\Bigl{[}f_{1}\bigl{(}\hat{u}_{n}\bigr{)}\Bigr{]}\\ &\longrightarrow\mathbb{E}\biggl{[}f_{1}\Bigl{(}{\bf{I}}(\theta_{0})^{-\frac{1}{2}}\zeta\Bigr{)}\biggr{]}=\mathbb{E}\Bigl{[}\zeta^{\top}\zeta\Bigr{]}=q\end{split} (5.31)

as nn\longrightarrow\infty. Therefore, (5.25), (5.30) and (5.31) show

𝐄𝕏n[E1,n]=𝐄𝕏n[i=1qj=1q𝐈(θ0)iju^(i)nu^(j)n]𝐄𝕏n[i=1qR1,n(i)u^(i)n]q\displaystyle\begin{split}{\bf{E}}_{\mathbb{X}_{n}}\Bigl{[}{\rm{E}}_{1,n}\Bigr{]}&={\bf{E}}_{\mathbb{X}_{n}}\left[\sum_{i=1}^{q}\sum_{j=1}^{q}{\bf{I}}(\theta_{0})_{ij}\hat{u}^{(i)}_{n}\hat{u}^{(j)}_{n}\right]-{\bf{E}}_{\mathbb{X}_{n}}\left[\sum_{i=1}^{q}{\rm{R}}_{1,n}^{(i)}\hat{u}^{(i)}_{n}\right]\longrightarrow q\end{split} (5.32)

as nn\longrightarrow\infty. Next, we consider the expectation of E2,n\rm{E}_{2,n}. Note that

E2,n\displaystyle{\rm{E}}_{2,n} =12i=1qj=1q𝐈(θ0)iju^(i)nu^(j)n+12i=1qj=1qR2,n,iju^(i)nu^(j)n,\displaystyle=-\frac{1}{2}\sum_{i=1}^{q}\sum_{j=1}^{q}{\bf{I}}(\theta_{0})_{ij}\hat{u}^{(i)}_{n}\hat{u}^{(j)}_{n}+\frac{1}{2}\sum_{i=1}^{q}\sum_{j=1}^{q}{\rm{R}}_{2,n,ij}\hat{u}^{(i)}_{n}\hat{u}^{(j)}_{n},

where

R2,n,ij=1nθ(i)θ(j)Hn(𝕏n,θ0)+𝐈(θ0)ij\displaystyle{\rm{R}}_{2,n,ij}=\frac{1}{n}\partial_{\theta^{(i)}}\partial_{\theta^{(j)}}{\rm{H}}_{n}(\mathbb{X}_{n},\theta_{0})+{\bf{I}}(\theta_{0})_{ij}

for i,j=1,,qi,j=1,\cdots,q. By using Lemmas 5 and 9, it is shown that

|𝐄𝕏n[i=1qj=1qR2,n,iju^(i)nu^(j)n]|\displaystyle\quad\ \left|{\bf{E}}_{\mathbb{X}_{n}}\Biggl{[}\sum_{i=1}^{q}\sum_{j=1}^{q}{\rm{R}}_{2,n,ij}\hat{u}^{(i)}_{n}\hat{u}^{(j)}_{n}\Biggr{]}\right|
i=1qj=1q𝐄𝕏n[|R2,n,ij||u^(i)n||u^(j)n|]\displaystyle\leq\sum_{i=1}^{q}\sum_{j=1}^{q}{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\bigl{|}{\rm{R}}_{2,n,ij}\bigr{|}\bigl{|}\hat{u}^{(i)}_{n}\bigr{|}\bigl{|}\hat{u}^{(j)}_{n}\bigr{|}\biggr{]}
i=1qj=1q𝐄𝕏n[|R2,n,ij|2]12𝐄𝕏n[|u^(i)n|4]14𝐄𝕏n[|u^(j)n|4]14\displaystyle\leq\sum_{i=1}^{q}\sum_{j=1}^{q}{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\bigl{|}{\rm{R}}_{2,n,ij}\bigr{|}^{2}\biggr{]}^{\frac{1}{2}}{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\bigl{|}\hat{u}^{(i)}_{n}\bigr{|}^{4}\biggr{]}^{\frac{1}{4}}{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\bigl{|}\hat{u}^{(j)}_{n}\bigr{|}^{4}\biggr{]}^{\frac{1}{4}}
1n14i=1qj=1qsupn𝐄𝕏n[(n14|1nθ(i)θ(j)Hn(𝕏n,θ0)+𝐈(θ0)ij|)2]12\displaystyle\leq\frac{1}{n^{\frac{1}{4}}}\sum_{i=1}^{q}\sum_{j=1}^{q}\sup_{n\in\mathbb{N}}{\bf{E}}_{\mathbb{X}_{n}}\left[\biggl{(}n^{\frac{1}{4}}\left|\frac{1}{n}\partial_{\theta^{(i)}}\partial_{\theta^{(j)}}{\rm{H}}_{n}(\mathbb{X}_{n},\theta_{0})+{\bf{I}}(\theta_{0})_{ij}\right|\biggr{)}^{2}\right]^{\frac{1}{2}}
×supn𝐄𝕏n[|u^n(i)|4]14supn𝐄𝕏n[|u^n(j)|4]14\displaystyle\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad\times\sup_{n\in\mathbb{N}}{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\bigl{|}\hat{u}_{n}^{(i)}\bigr{|}^{4}\biggr{]}^{\frac{1}{4}}\sup_{n\in\mathbb{N}}{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\bigl{|}\hat{u}_{n}^{(j)}\bigr{|}^{4}\biggr{]}^{\frac{1}{4}}
0\displaystyle\longrightarrow 0

as nn\longrightarrow\infty. Thus, it follows from (5.31) that

𝐄𝕏n[E2,n]=12𝐄𝕏n[i=1qj=1q𝐈(θ0)iju^(i)nu^(j)n]+12𝐄𝕏n[i=1qj=1qR2,n,iju^(i)nu^(j)n]q2\displaystyle\begin{split}{\bf{E}}_{\mathbb{X}_{n}}\Bigl{[}{\rm{E}}_{2,n}\Bigr{]}&=-\frac{1}{2}{\bf{E}}_{\mathbb{X}_{n}}\left[\sum_{i=1}^{q}\sum_{j=1}^{q}{\bf{I}}(\theta_{0})_{ij}\hat{u}^{(i)}_{n}\hat{u}^{(j)}_{n}\right]+\frac{1}{2}{\bf{E}}_{\mathbb{X}_{n}}\left[\sum_{i=1}^{q}\sum_{j=1}^{q}{\rm{R}}_{2,n,ij}\hat{u}^{(i)}_{n}\hat{u}^{(j)}_{n}\right]\\ &\longrightarrow-\frac{q}{2}\end{split} (5.33)

as nn\longrightarrow\infty. Note that

E3,n\displaystyle{\rm{E}}_{3,n} =i=1qj=1qk=1qR3,n,ijku^n(i)u^n(j)u^n(k),\displaystyle=\sum_{i=1}^{q}\sum_{j=1}^{q}\sum_{k=1}^{q}{\rm{R}}_{3,n,ijk}\hat{u}_{n}^{(i)}\hat{u}_{n}^{(j)}\hat{u}_{n}^{(k)},

where

R3,n,ijk=12nn01(1λ)2θ(i)θ(j)θ(k)Hn(𝕏n,θ~n,λ)dλ\displaystyle{\rm{R}}_{3,n,ijk}=\frac{1}{2n\sqrt{n}}\int_{0}^{1}(1-\lambda)^{2}\partial_{\theta^{(i)}}\partial_{\theta^{(j)}}\partial_{\theta^{(k)}}{\rm{H}}_{n}(\mathbb{X}_{n},\tilde{\theta}_{n,\lambda})d\lambda

for i,j,k=1,,qi,j,k=1,\cdots,q. It holds from Lemmas 7 and 9 that

|𝐄𝕏n[E3,n]|\displaystyle\biggl{|}{\bf{E}}_{\mathbb{X}_{n}}\Bigl{[}{\rm{E}}_{3,n}\Bigr{]}\biggr{|} 1ni=1qj=1qk=1q𝐄𝕏n[|1n01(1λ)2θ(i)θ(j)θ(k)Hn(𝕏n,θ~n,λ)dλ|2]12\displaystyle\leq\frac{1}{\sqrt{n}}\sum_{i=1}^{q}\sum_{j=1}^{q}\sum_{k=1}^{q}{\bf{E}}_{\mathbb{X}_{n}}\left[\left|\frac{1}{n}\int_{0}^{1}(1-\lambda)^{2}\partial_{\theta^{(i)}}\partial_{\theta^{(j)}}\partial_{\theta^{(k)}}{\rm{H}}_{n}(\mathbb{X}_{n},\tilde{\theta}_{n,\lambda})d\lambda\right|^{2}\right]^{\frac{1}{2}}
×𝐄𝕏n[|u^n(i)|4]14𝐄𝕏n[|u^n(j)|8]18𝐄𝕏n[|u^n(k)|8]18\displaystyle\qquad\qquad\qquad\qquad\qquad\qquad\quad\times{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\bigl{|}\hat{u}_{n}^{(i)}\bigr{|}^{4}\biggr{]}^{\frac{1}{4}}{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\bigl{|}\hat{u}_{n}^{(j)}\bigr{|}^{8}\biggr{]}^{\frac{1}{8}}{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\bigl{|}\hat{u}_{n}^{(k)}\bigr{|}^{8}\biggr{]}^{\frac{1}{8}}
1ni=1qj=1qk=1qsupn𝐄𝕏n[(1nsupθΘ|θ(i)θ(j)θ(k)Hn(𝕏n,θ)|)2]12\displaystyle\leq\frac{1}{\sqrt{n}}\sum_{i=1}^{q}\sum_{j=1}^{q}\sum_{k=1}^{q}\sup_{n\in\mathbb{N}}{\bf{E}}_{\mathbb{X}_{n}}\left[\biggl{(}\frac{1}{n}\sup_{\theta\in\Theta}\Bigr{|}\partial_{\theta^{(i)}}\partial_{\theta^{(j)}}\partial_{\theta^{(k)}}{\rm{H}}_{n}(\mathbb{X}_{n},\theta)\Bigl{|}\biggr{)}^{2}\right]^{\frac{1}{2}}
×supn𝐄𝕏n[|u^n(i)|4]14supn𝐄𝕏n[|u^n(j)|8]18supn𝐄𝕏n[|u^n(k)|8]18\displaystyle\qquad\qquad\qquad\quad\ \ \times\sup_{n\in\mathbb{N}}{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\bigl{|}\hat{u}_{n}^{(i)}\bigr{|}^{4}\biggr{]}^{\frac{1}{4}}\sup_{n\in\mathbb{N}}{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\bigl{|}\hat{u}_{n}^{(j)}\bigr{|}^{8}\biggr{]}^{\frac{1}{8}}\sup_{n\in\mathbb{N}}{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\bigl{|}\hat{u}_{n}^{(k)}\bigr{|}^{8}\biggr{]}^{\frac{1}{8}}
0\displaystyle\longrightarrow 0

as nn\longrightarrow\infty, which yields

𝐄𝕏n[E3,n]0\displaystyle{\bf{E}}_{\mathbb{X}_{n}}\Bigl{[}{\rm{E}}_{3,n}\Bigr{]}\longrightarrow 0 (5.34)

as nn\longrightarrow\infty. Hence, (5.32), (5.33) and (5.34) show (5.24). Next, we will prove

D3,n\displaystyle{\rm{D}}_{3,n} q2\displaystyle\longrightarrow\frac{q}{2} (5.35)

as nn\longrightarrow\infty. By using the Taylor expansion, one gets

Hn(n,θ^n)Hn(n,θ0)\displaystyle{\rm{H}}_{n}(\mathbb{Z}_{n},\hat{\theta}_{n})-{\rm{H}}_{n}(\mathbb{Z}_{n},\theta_{0}) =i=1qθ(i)Hn(n,θ0)(θ^(i)nθ(i)0)\displaystyle=\sum_{i=1}^{q}\partial_{\theta^{(i)}}{\rm{H}}_{n}(\mathbb{Z}_{n},\theta_{0})(\hat{\theta}^{(i)}_{n}-\theta^{(i)}_{0})
+12i=1qj=1qθ(i)θ(j)Hn(n,θ0)(θ^(i)nθ(i)0)(θ^(j)nθ(j)0)\displaystyle\quad+\frac{1}{2}\sum_{i=1}^{q}\sum_{j=1}^{q}\partial_{\theta^{(i)}}\partial_{\theta^{(j)}}{\rm{H}}_{n}(\mathbb{Z}_{n},\theta_{0})(\hat{\theta}^{(i)}_{n}-\theta^{(i)}_{0})(\hat{\theta}^{(j)}_{n}-\theta^{(j)}_{0})
+12i=1qj=1qk=1q(01(1λ)2θ(i)θ(j)θ(k)Hn(n,θ~n,λ)dλ)\displaystyle\quad+\frac{1}{2}\sum_{i=1}^{q}\sum_{j=1}^{q}\sum_{k=1}^{q}\left(\int_{0}^{1}(1-\lambda)^{2}\partial_{\theta^{(i)}}\partial_{\theta^{(j)}}\partial_{\theta^{(k)}}{\rm{H}}_{n}(\mathbb{Z}_{n},\tilde{\theta}_{n,\lambda})d\lambda\right)
×(θ^(i)nθ(i)0)(θ^(j)nθ(j)0)(θ^(k)nθ(k)0)\displaystyle\qquad\qquad\qquad\qquad\qquad\quad\times(\hat{\theta}^{(i)}_{n}-\theta^{(i)}_{0})(\hat{\theta}^{(j)}_{n}-\theta^{(j)}_{0})(\hat{\theta}^{(k)}_{n}-\theta^{(k)}_{0})
=F1,n+F2,n+F3,n,\displaystyle={\rm{F}}_{1,n}+{\rm{F}}_{2,n}+{\rm{F}}_{3,n},

where

F1,n\displaystyle{\rm{F}}_{1,n} =i=1qθ(i)Hn(n,θ0)(θ^(i)nθ(i)0),\displaystyle=\sum_{i=1}^{q}\partial_{\theta^{(i)}}{\rm{H}}_{n}(\mathbb{Z}_{n},\theta_{0})(\hat{\theta}^{(i)}_{n}-\theta^{(i)}_{0}),
F2,n\displaystyle{\rm{F}}_{2,n} =12i=1qj=1qθ(i)θ(j)Hn(n,θ0)(θ^(i)nθ(i)0)(θ^(j)nθ(j)0),\displaystyle=\frac{1}{2}\sum_{i=1}^{q}\sum_{j=1}^{q}\partial_{\theta^{(i)}}\partial_{\theta^{(j)}}{\rm{H}}_{n}(\mathbb{Z}_{n},\theta_{0})(\hat{\theta}^{(i)}_{n}-\theta^{(i)}_{0})(\hat{\theta}^{(j)}_{n}-\theta^{(j)}_{0}),
F3,n\displaystyle{\rm{F}}_{3,n} =12i=1qj=1qk=1q(01(1λ)2θ(i)θ(j)θ(k)Hn(n,θ~n,λ)dλ)\displaystyle=\frac{1}{2}\sum_{i=1}^{q}\sum_{j=1}^{q}\sum_{k=1}^{q}\left(\int_{0}^{1}(1-\lambda)^{2}\partial_{\theta^{(i)}}\partial_{\theta^{(j)}}\partial_{\theta^{(k)}}{\rm{H}}_{n}(\mathbb{Z}_{n},\tilde{\theta}_{n,\lambda})d\lambda\right)
×(θ^(i)nθ(i)0)(θ^(j)nθ(j)0)(θ^(k)nθ(k)0).\displaystyle\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad\times(\hat{\theta}^{(i)}_{n}-\theta^{(i)}_{0})(\hat{\theta}^{(j)}_{n}-\theta^{(j)}_{0})(\hat{\theta}^{(k)}_{n}-\theta^{(k)}_{0}).

Since it holds from Lemmas 1, 4 and 9 that

𝐄n[1nθHn(n,θ0)]𝔼[𝐈(θ0)12ζ]=0\displaystyle{\bf{E}}_{\mathbb{Z}_{n}}\left[\frac{1}{\sqrt{n}}\partial_{\theta}{\rm{H}}_{n}(\mathbb{Z}_{n},\theta_{0})\right]\longrightarrow\mathbb{E}\Bigl{[}{\bf{I}}(\theta_{0})^{\frac{1}{2}}\zeta\Bigr{]}=0

and

𝐄𝕏n[n(θ^nθ0)]\displaystyle{\bf{E}}_{\mathbb{X}_{n}}\Bigl{[}\sqrt{n}(\hat{\theta}_{n}-\theta_{0})\Bigr{]} 𝔼[𝐈(θ0)12ζ]=0\displaystyle\longrightarrow\mathbb{E}\Bigl{[}{\bf{I}}(\theta_{0})^{-\frac{1}{2}}\zeta\Bigr{]}=0

as nn\longrightarrow\infty, we have

𝐄𝕏n[𝐄n[F1,n]]\displaystyle{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}{\bf{E}}_{\mathbb{Z}_{n}}\Bigl{[}{\rm{F}}_{1,n}\Bigr{]}\biggr{]} =i=1q𝐄n[1nθ(i)Hn(n,θ0)]𝐄𝕏n[n(θ^(i)nθ(i)0)]0\displaystyle=\sum_{i=1}^{q}{\bf{E}}_{\mathbb{Z}_{n}}\Biggl{[}\frac{1}{\sqrt{n}}\partial_{\theta^{(i)}}{\rm{H}}_{n}(\mathbb{Z}_{n},\theta_{0})\Biggr{]}{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\sqrt{n}(\hat{\theta}^{(i)}_{n}-\theta^{(i)}_{0})\biggr{]}\longrightarrow 0 (5.36)

as nn\longrightarrow\infty. Note that

F2,n\displaystyle{\rm{F}}_{2,n} =12u^n(1n2θHn(n,θ0))u^n=12tr{(1n2θHn(n,θ0))u^nu^n}.\displaystyle=\frac{1}{2}\hat{u}_{n}^{\top}\biggl{(}\frac{1}{n}\partial^{2}_{\theta}{\rm{H}}_{n}(\mathbb{Z}_{n},\theta_{0})\biggr{)}\hat{u}_{n}=\frac{1}{2}\mathop{\rm tr}\nolimits\Biggr{\{}\biggl{(}\frac{1}{n}\partial^{2}_{\theta}{\rm{H}}_{n}(\mathbb{Z}_{n},\theta_{0})\biggr{)}\hat{u}_{n}\hat{u}_{n}^{\top}\Biggr{\}}.

Let

f2(x)=xx\displaystyle f_{2}(x)=xx^{\top}

for xqx\in\mathbb{R}^{q}. It follows from Lemmas 1, 4 and 9 that

𝐄n[1n2θHn(n,θ0)]𝐈(θ0)\displaystyle{\bf{E}}_{\mathbb{Z}_{n}}\Biggl{[}\frac{1}{n}\partial^{2}_{\theta}{\rm{H}}_{n}(\mathbb{Z}_{n},\theta_{0})\Biggr{]}\longrightarrow-{\bf{I}}(\theta_{0})

and

𝐄𝕏n[u^nu^n]\displaystyle{\bf{E}}_{\mathbb{X}_{n}}\Bigl{[}\hat{u}_{n}\hat{u}_{n}^{\top}\Bigr{]} =𝐄𝕏n[f2(u^n)]𝔼[f2(𝐈(θ0)12ζ)]=𝐈(θ0)1\displaystyle={\bf{E}}_{\mathbb{X}_{n}}\Bigl{[}f_{2}\bigl{(}\hat{u}_{n}\bigr{)}\Bigr{]}\longrightarrow\mathbb{E}\biggl{[}f_{2}\Bigl{(}{\bf{I}}(\theta_{0})^{-\frac{1}{2}}\zeta\Bigr{)}\biggr{]}={\bf{I}}(\theta_{0})^{-1}

as nn\longrightarrow\infty, which implies

𝐄𝕏n[𝐄n[F2,n]]\displaystyle{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}{\bf{E}}_{\mathbb{Z}_{n}}\Bigl{[}{\rm{F}}_{2,n}\Bigr{]}\biggr{]} =12tr{𝐄n[1n2θHn(n,θ0)]𝐄𝕏n[u^nu^n]}q2\displaystyle=\frac{1}{2}\mathop{\rm tr}\nolimits\left\{{\bf{E}}_{\mathbb{Z}_{n}}\Biggl{[}\frac{1}{n}\partial^{2}_{\theta}{\rm{H}}_{n}(\mathbb{Z}_{n},\theta_{0})\Biggr{]}{\bf{E}}_{\mathbb{X}_{n}}\Bigl{[}\hat{u}_{n}\hat{u}_{n}^{\top}\Bigr{]}\right\}\longrightarrow-\frac{q}{2} (5.37)

as nn\longrightarrow\infty. Moreover, we note that

F3,n=i=1qj=1qk=1qR~3,n,ijku^n(i)u^n(j)u^n(k),\displaystyle{\rm{F}}_{3,n}=\sum_{i=1}^{q}\sum_{j=1}^{q}\sum_{k=1}^{q}\tilde{{\rm{R}}}_{3,n,ijk}\hat{u}_{n}^{(i)}\hat{u}_{n}^{(j)}\hat{u}_{n}^{(k)},

where

R~3,n,ijk=12nn01(1λ)2θ(i)θ(j)θ(k)Hn(n,θ~n,λ)dλ\displaystyle\tilde{{\rm{R}}}_{3,n,ijk}=\frac{1}{2n\sqrt{n}}\int_{0}^{1}(1-\lambda)^{2}\partial_{\theta^{(i)}}\partial_{\theta^{(j)}}\partial_{\theta^{(k)}}{\rm{H}}_{n}(\mathbb{Z}_{n},\tilde{\theta}_{n,\lambda})d\lambda

for i,j,k=1,,qi,j,k=1,\cdots,q. Since

𝐄𝕏n[|𝐄n[R~3,n,ijk]|2]\displaystyle{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\Bigl{|}{\bf{E}}_{\mathbb{Z}_{n}}\Bigl{[}\tilde{{\rm{R}}}_{3,n,ijk}\Bigr{]}\Bigr{|}^{2}\biggr{]} 𝐄𝕏n[𝐄n[|R~3,n,ijk|2]]\displaystyle\leq{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}{\bf{E}}_{\mathbb{Z}_{n}}\Bigl{[}\bigl{|}\tilde{{\rm{R}}}_{3,n,ijk}\bigr{|}^{2}\Bigr{]}\biggr{]}
1n𝐄𝕏n[𝐄n[|1n01(1λ)2θ(i)θ(j)θ(k)Hn(n,θ~n,λ)dλ|2]]\displaystyle\leq\frac{1}{n}{\bf{E}}_{\mathbb{X}_{n}}\left[{\bf{E}}_{\mathbb{Z}_{n}}\left[\left|\frac{1}{n}\int_{0}^{1}(1-\lambda)^{2}\partial_{\theta^{(i)}}\partial_{\theta^{(j)}}\partial_{\theta^{(k)}}{\rm{H}}_{n}(\mathbb{Z}_{n},\tilde{\theta}_{n,\lambda})d\lambda\right|^{2}\right]\right]
1nsupn𝐄n[(1nsupθΘ|θ(i)θ(j)θ(k)Hn(n,θ)|)2],\displaystyle\leq\frac{1}{n}\sup_{n\in\mathbb{N}}{\bf{E}}_{\mathbb{Z}_{n}}\left[\biggl{(}\frac{1}{n}\sup_{\theta\in\Theta}\Bigl{|}\partial_{\theta^{(i)}}\partial_{\theta^{(j)}}\partial_{\theta^{(k)}}{\rm{H}}_{n}(\mathbb{Z}_{n},\theta)\Bigr{|}\biggr{)}^{2}\right],

it follows from Lemmas 7 and 9 that

|𝐄𝕏n[𝐄n[F3,n]]|\displaystyle\Biggl{|}{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}{\bf{E}}_{\mathbb{Z}_{n}}\Bigl{[}{\rm{F}}_{3,n}\Bigr{]}\biggr{]}\Biggr{|} i=1qj=1qk=1q𝐄𝕏n[|𝐄n[R~3,n,ijk]||u^n(i)||u^n(j)||u^n(k)|]\displaystyle\leq\sum_{i=1}^{q}\sum_{j=1}^{q}\sum_{k=1}^{q}{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\Bigl{|}{\bf{E}}_{\mathbb{Z}_{n}}\Bigl{[}\tilde{{\rm{R}}}_{3,n,ijk}\Bigr{]}\Bigr{|}\bigl{|}\hat{u}_{n}^{(i)}\bigr{|}\bigl{|}\hat{u}_{n}^{(j)}\bigr{|}\bigl{|}\hat{u}_{n}^{(k)}\bigr{|}\biggr{]}
i=1qj=1qk=1q𝐄𝕏n[|𝐄n[R~3,n,ijk]|2]12𝐄𝕏n[|u^n(i)|4]14\displaystyle\leq\sum_{i=1}^{q}\sum_{j=1}^{q}\sum_{k=1}^{q}{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\Bigr{|}{\bf{E}}_{\mathbb{Z}_{n}}\Bigl{[}\tilde{{\rm{R}}}_{3,n,ijk}\Bigr{]}\Bigr{|}^{2}\biggr{]}^{\frac{1}{2}}{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\bigl{|}\hat{u}_{n}^{(i)}\bigr{|}^{4}\biggr{]}^{\frac{1}{4}}
×𝐄𝕏n[|u^n(j)|8]18𝐄𝕏n[|u^n(k)|8]18\displaystyle\qquad\qquad\qquad\qquad\qquad\qquad\qquad\quad\times{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\bigl{|}\hat{u}_{n}^{(j)}\bigr{|}^{8}\biggr{]}^{\frac{1}{8}}{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\bigl{|}\hat{u}_{n}^{(k)}\bigr{|}^{8}\biggr{]}^{\frac{1}{8}}
1ni=1qj=1qk=1qsupn𝐄n[(1nsupθΘ|θ(i)θ(j)θ(k)Hn(n,θ)|)2]12\displaystyle\leq\frac{1}{\sqrt{n}}\sum_{i=1}^{q}\sum_{j=1}^{q}\sum_{k=1}^{q}\sup_{n\in\mathbb{N}}{\bf{E}}_{\mathbb{Z}_{n}}\left[\biggl{(}\frac{1}{n}\sup_{\theta\in\Theta}\Bigl{|}\partial_{\theta^{(i)}}\partial_{\theta^{(j)}}\partial_{\theta^{(k)}}{\rm{H}}_{n}(\mathbb{Z}_{n},\theta)\Bigr{|}\biggr{)}^{2}\right]^{\frac{1}{2}}
×supn𝐄𝕏n[|u^n(i)|4]14supn𝐄𝕏n[|u^n(j)|8]18supn𝐄𝕏n[|u^n(k)|8]18\displaystyle\qquad\quad\times\sup_{n\in\mathbb{N}}{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\bigl{|}\hat{u}_{n}^{(i)}\bigr{|}^{4}\biggr{]}^{\frac{1}{4}}\sup_{n\in\mathbb{N}}{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\bigl{|}\hat{u}_{n}^{(j)}\bigr{|}^{8}\biggr{]}^{\frac{1}{8}}\sup_{n\in\mathbb{N}}{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\bigl{|}\hat{u}_{n}^{(k)}\bigr{|}^{8}\biggr{]}^{\frac{1}{8}}
0\displaystyle\longrightarrow 0

as nn\longrightarrow\infty, so that one has

𝐄𝕏n[𝐄n[F3,n]]0\displaystyle{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}{\bf{E}}_{\mathbb{Z}_{n}}\Bigl{[}{\rm{F}}_{3,n}\Bigr{]}\biggr{]}\longrightarrow 0 (5.38)

as nn\longrightarrow\infty. Consequently, we see from (5.36), (5.37) and (5.38) that

D3,n=𝐄𝕏n[𝐄n[F1,n+F2,n+F3,n]]q2\displaystyle{\rm{D}}_{3,n}=-{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}{\bf{E}}_{\mathbb{Z}_{n}}\Bigl{[}{\rm{F}}_{1,n}+{\rm{F}}_{2,n}+{\rm{F}}_{3,n}\Bigr{]}\biggr{]}\longrightarrow\frac{q}{2}

as nn\longrightarrow\infty, which yields (5.35). Furthermore, we have

D2,n=0\displaystyle\rm{D}_{2,n}=0 (5.39)

since 𝕏n\mathbb{X}_{n} and n\mathbb{Z}_{n} have the same distribution. Therefore, it holds from (5.24), (5.35) and (5.39) that

𝐄𝕏n[logLn(𝕏n,θ^n)𝐄n[logLn(n,θ^n)]]=q+o(1)\displaystyle{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\log{\rm{L}}_{n}(\mathbb{X}_{n},\hat{\theta}_{n})-{\bf{E}}_{\mathbb{Z}_{n}}\Bigl{[}\log{\rm{L}}_{n}(\mathbb{Z}_{n},\hat{\theta}_{n})\Bigr{]}\biggr{]}=q+o(1)

as nn\longrightarrow\infty. ∎
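The bias-correction term q can be illustrated numerically outside the diffusion setting. The following minimal sketch is our own toy illustration, not part of the paper: an i.i.d. N(mu, s2) model with q = 2 parameters and true N(0, 1) data, for which E_Z[log L(Z, th_hat)] is available in closed form. The Monte Carlo average of log L(X, th_hat) - E_Z[log L(Z, th_hat)] is close to q = 2, matching the limit above. The function name aic_bias_mc is illustrative.

```python
import numpy as np

def aic_bias_mc(n=200, reps=4000, seed=0):
    """Monte Carlo estimate of E[log L(X; th_hat) - E_Z[log L(Z; th_hat)]]
    for the i.i.d. N(mu, s2) model with true N(0, 1) data (q = 2)."""
    rng = np.random.default_rng(seed)
    diffs = np.empty(reps)
    for r in range(reps):
        x = rng.standard_normal(n)
        mu, s2 = x.mean(), x.var()  # MLEs (ddof=0)
        # log-likelihood at the MLE on the data used for fitting
        loglik_in = -0.5 * n * np.log(2 * np.pi * s2) - 0.5 * n
        # E_Z[log L(Z; th_hat)] in closed form when Z_i ~ N(0, 1)
        loglik_out = -0.5 * n * np.log(2 * np.pi * s2) - 0.5 * n * (1 + mu**2) / s2
        diffs[r] = loglik_in - loglik_out
    return diffs.mean()

bias = aic_bias_mc()
```

For this toy model the exact bias is 2n/(n-3), so with n = 200 the average lands near 2.03, consistent with the q appearing in the theorem.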

Lemma 10

Under [A] and [B2], as nn\longrightarrow\infty,

1nHm,n(𝕏n,θ^m,n)pHm(θ¯m).\displaystyle\frac{1}{n}{\rm{H}}_{m,n}(\mathbb{X}_{n},\hat{\theta}_{m,n})\stackrel{{\scriptstyle p}}{{\longrightarrow}}{\rm{H}}_{m}(\bar{\theta}_{m}).
Proof of Lemma 10.

In an analogous manner to the proof of Theorem 4 in Kusano and Uchida [13], we can obtain the result. See also Appendix 6.1. ∎

Proof of Theorem 2.

Fix mm^{*}\in\mathcal{M}. From the definition of m^n\hat{m}_{n}, one has

𝐏(m^nc)𝐏(minm1cQAIC(𝕏n,m1)<minm2QAIC(𝕏n,m2))=𝐏(m1c{QAIC(𝕏n,m1)<minm2QAIC(𝕏n,m2)})m1c𝐏(QAIC(𝕏n,m1)<minm2QAIC(𝕏n,m2))=m1c𝐏(m2{QAIC(𝕏n,m1)<QAIC(𝕏n,m2)})m1c𝐏(QAIC(𝕏n,m1)<QAIC(𝕏n,m)).\displaystyle\begin{split}{\bf{P}}\Bigl{(}\hat{m}_{n}\in\mathcal{M}^{c}\Bigr{)}&\leq{\bf{P}}\left(\min_{m_{1}\in\mathcal{M}^{c}}{\rm{QAIC}}(\mathbb{X}_{n},m_{1})<\min_{m_{2}\in\mathcal{M}}{\rm{QAIC}}(\mathbb{X}_{n},m_{2})\right)\\ &={\bf{P}}\left(\bigcup_{m_{1}\in\mathcal{M}^{c}}\Bigl{\{}{\rm{QAIC}}(\mathbb{X}_{n},m_{1})<\min_{m_{2}\in\mathcal{M}}{\rm{QAIC}}(\mathbb{X}_{n},m_{2})\Bigr{\}}\right)\\ &\leq\sum_{m_{1}\in\mathcal{M}^{c}}{\bf{P}}\left({\rm{QAIC}}(\mathbb{X}_{n},m_{1})<\min_{m_{2}\in\mathcal{M}}{\rm{QAIC}}(\mathbb{X}_{n},m_{2})\right)\\ &=\sum_{m_{1}\in\mathcal{M}^{c}}{\bf{P}}\left(\bigcap_{m_{2}\in\mathcal{M}}\Bigl{\{}{\rm{QAIC}}(\mathbb{X}_{n},m_{1})<{\rm{QAIC}}(\mathbb{X}_{n},m_{2})\Bigr{\}}\right)\\ &\leq\sum_{m_{1}\in\mathcal{M}^{c}}{\bf{P}}\Bigl{(}{\rm{QAIC}}(\mathbb{X}_{n},m_{1})<{\rm{QAIC}}(\mathbb{X}_{n},m^{*})\Bigr{)}.\end{split} (5.40)

For all m1cm_{1}\in\mathcal{M}^{c}, it follows from Lemma 10 that

1nQAIC(𝕏n,m1)1nQAIC(𝕏n,m)=2nlogLm1,n(𝕏n,θ^m1,n)+2nqm1+2nlogLm,n(𝕏n,θ^m,n)2nqm=2nHm1,n(𝕏n,θ^m1,n)+2nHm,n(𝕏n,θ^m,n)+2nqm12nqmpcm1,m\displaystyle\begin{split}&\quad\ \frac{1}{n}{\rm{QAIC}}(\mathbb{X}_{n},m_{1})-\frac{1}{n}{\rm{QAIC}}(\mathbb{X}_{n},m^{*})\\ &=-\frac{2}{n}\log{\rm{L}}_{m_{1},n}(\mathbb{X}_{n},\hat{\theta}_{m_{1},n})+\frac{2}{n}q_{m_{1}}+\frac{2}{n}\log{\rm{L}}_{m^{*},n}(\mathbb{X}_{n},\hat{\theta}_{m^{*},n})-\frac{2}{n}q_{m^{*}}\\ &=-\frac{2}{n}{\rm{H}}_{m_{1},n}(\mathbb{X}_{n},\hat{\theta}_{m_{1},n})+\frac{2}{n}{\rm{H}}_{m^{*},n}(\mathbb{X}_{n},\hat{\theta}_{m^{*},n})+\frac{2}{n}q_{m_{1}}-\frac{2}{n}q_{m^{*}}\\ &\stackrel{{\scriptstyle p}}{{\longrightarrow}}c_{m_{1},m^{*}}\end{split} (5.41)

as nn\longrightarrow\infty, where

cm1,m=2Hm1(θ¯m1)+2Hm(θm,0).\displaystyle c_{m_{1},m^{*}}=-2{\rm{H}}_{m_{1}}(\bar{\theta}_{m_{1}})+2{\rm{H}}_{m^{*}}(\theta_{m^{*},0}).

Define the function G:p++{\rm{G}}:\mathcal{M}_{p}^{++}\longrightarrow\mathbb{R}:

G(𝚺)=12tr(𝚺1𝚺0)12logdet𝚺.\displaystyle{\rm{G}}({\bf{\Sigma}})=-\frac{1}{2}\mathop{\rm tr}\nolimits\bigl{(}{\bf{\Sigma}}^{-1}{\bf{\Sigma}}_{0}\bigr{)}-\frac{1}{2}\log\mathop{\rm det}\nolimits{\bf{\Sigma}}.
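For completeness, the maximizer of G can be identified by elementary matrix calculus; the following is a standard argument, sketched here under the reparametrization Ω = Σ⁻¹:

```latex
{\rm{G}}({\bf{\Sigma}})
  = -\frac{1}{2}\mathop{\rm tr}\nolimits({\bf{\Omega}}{\bf{\Sigma}}_{0})
    + \frac{1}{2}\log\mathop{\rm det}\nolimits{\bf{\Omega}},
  \qquad {\bf{\Omega}} = {\bf{\Sigma}}^{-1},
\qquad
\partial_{{\bf{\Omega}}}{\rm{G}}
  = -\frac{1}{2}{\bf{\Sigma}}_{0} + \frac{1}{2}{\bf{\Omega}}^{-1} = 0
  \iff {\bf{\Sigma}} = {\bf{\Sigma}}_{0}.
```

Since log det is strictly concave on the positive definite matrices, G is strictly concave in Ω, so this stationary point is the unique maximizer.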

Note that G(𝚺){\rm{G}}({\bf{\Sigma}}) has the unique maximum point at 𝚺=𝚺0{\bf{\Sigma}}={\bf{\Sigma}}_{0}. Since

𝚺0=𝚺m(θm,0)𝚺m1(θ¯m1),\displaystyle{\bf{\Sigma}}_{0}={\bf{\Sigma}}_{m^{*}}(\theta_{m^{*},0})\neq{\bf{\Sigma}}_{m_{1}}(\bar{\theta}_{m_{1}}),

we obtain

Hm1(θ¯m1)=G(𝚺m1(θ¯m1))<G(𝚺m(θm,0))=Hm(θm,0)\displaystyle{\rm{H}}_{m_{1}}(\bar{\theta}_{m_{1}})={\rm{G}}\bigl{(}{\bf{\Sigma}}_{m_{1}}(\bar{\theta}_{m_{1}})\bigr{)}<{\rm{G}}\bigl{(}{\bf{\Sigma}}_{m^{*}}(\theta_{m^{*},0})\bigr{)}={\rm{H}}_{m^{*}}(\theta_{m^{*},0})

for any m1cm_{1}\in\mathcal{M}^{c}, which yields cm1,m>0c_{m_{1},m^{*}}>0. Consequently, it holds from (5.41) that for all m1cm_{1}\in\mathcal{M}^{c},

0\displaystyle 0 𝐏(QAIC(𝕏n,m1)<QAIC(𝕏n,m))\displaystyle\leq{\bf{P}}\Bigl{(}{\rm{QAIC}}(\mathbb{X}_{n},m_{1})<{\rm{QAIC}}(\mathbb{X}_{n},m^{*})\Bigr{)}
=𝐏(1nQAIC(𝕏n,m1)1nQAIC(𝕏n,m)cm1,m<cm1,m)\displaystyle={\bf{P}}\left(\frac{1}{n}{\rm{QAIC}}(\mathbb{X}_{n},m_{1})-\frac{1}{n}{\rm{QAIC}}(\mathbb{X}_{n},m^{*})-c_{m_{1},m^{*}}<-c_{m_{1},m^{*}}\right)
=𝐏(cm1,m1nQAIC(𝕏n,m1)+1nQAIC(𝕏n,m)>cm1,m)\displaystyle={\bf{P}}\left(c_{m_{1},m^{*}}-\frac{1}{n}{\rm{QAIC}}(\mathbb{X}_{n},m_{1})+\frac{1}{n}{\rm{QAIC}}(\mathbb{X}_{n},m^{*})>c_{m_{1},m^{*}}\right)
𝐏(|cm1,m1nQAIC(𝕏n,m1)+1nQAIC(𝕏n,m)|>cm1,m)\displaystyle\leq{\bf{P}}\left(\Bigl{|}c_{m_{1},m^{*}}-\frac{1}{n}{\rm{QAIC}}(\mathbb{X}_{n},m_{1})+\frac{1}{n}{\rm{QAIC}}(\mathbb{X}_{n},m^{*})\Bigr{|}>c_{m_{1},m^{*}}\right)
=𝐏(|1nQAIC(𝕏n,m1)1nQAIC(𝕏n,m)cm1,m|>cm1,m)0\displaystyle={\bf{P}}\left(\Bigl{|}\frac{1}{n}{\rm{QAIC}}(\mathbb{X}_{n},m_{1})-\frac{1}{n}{\rm{QAIC}}(\mathbb{X}_{n},m^{*})-c_{m_{1},m^{*}}\Bigr{|}>c_{m_{1},m^{*}}\right)\longrightarrow 0

as nn\longrightarrow\infty, which implies

m1c𝐏(QAIC(𝕏n,m1)<QAIC(𝕏n,m))0\displaystyle\sum_{m_{1}\in\mathcal{M}^{c}}{\bf{P}}\Bigl{(}{\rm{QAIC}}(\mathbb{X}_{n},m_{1})<{\rm{QAIC}}(\mathbb{X}_{n},m^{*})\Bigr{)}\longrightarrow 0 (5.42)

as nn\longrightarrow\infty. Therefore, we see from (5.40) and (5.42) that

𝐏(m^nc)0\displaystyle{\bf{P}}\Bigl{(}\hat{m}_{n}\in\mathcal{M}^{c}\Bigr{)}\longrightarrow 0

as nn\longrightarrow\infty. ∎
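The selection rule analyzed in Theorem 2 is simply the argmin of QAIC(X_n, m) = -2 log L_{m,n}(X_n, th_hat_{m,n}) + 2 q_m over the candidate models. A minimal sketch follows; the numeric values and the dict layout are hypothetical, for illustration only.

```python
def qaic(loglik, q):
    """QAIC(X_n, m) = -2 * log L_{m,n}(X_n, th_hat_{m,n}) + 2 * q_m."""
    return -2.0 * loglik + 2.0 * q

def select_model(fits):
    """fits maps a model label to (maximized quasi-log-likelihood, q_m);
    return the label minimizing QAIC."""
    return min(fits, key=lambda m: qaic(*fits[m]))

# hypothetical fitted values: model "B" is smaller and fits almost as well
fits = {"A": (-1032.5, 12), "B": (-1034.0, 9)}
best = select_model(fits)
```

A misspecified model m1 has a quasi-log-likelihood deficit of order n, which eventually dominates the O(1) penalty difference; this is exactly the mechanism behind (5.41).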

References

  • [1] Adams, R. A. and Fournier, J. J. (2003). Sobolev spaces. Elsevier.
  • [2] Aït-Sahalia, Y., Kalnina, I. and Xiu, D. (2020). High-frequency factor models and regressions. Journal of Econometrics, 216(1), 86-105.
  • [3] Aït-Sahalia, Y. and Xiu, D. (2017). Using principal component analysis to estimate a high dimensional factor model with high-frequency data. Journal of Econometrics, 201(2), 384-399.
  • [4] Akaike, H. (1987). Factor analysis and AIC. Psychometrika, 52, 317-332.
  • [5] Eguchi, S. and Masuda, H. (2023). Gaussian quasi-information criteria for ergodic Lévy driven SDE. Annals of the Institute of Statistical Mathematics, 1-47.
  • [6] Everitt, B. (1984). An introduction to latent variable models. Springer Science & Business Media.
  • [7] Genon-Catalot, V. and Jacod, J. (1993). On the estimation of the diffusion coefficient for multidimensional diffusion processes. Annales de l’Institut Henri Poincaré (B) Probabilités et Statistiques, 29, 119-151.
  • [8] Harville, D. A. (1998). Matrix algebra from a statistician’s perspective. Taylor & Francis.
  • [9] Huang, P. H. (2017). Asymptotics of AIC, BIC, and RMSEA for model selection in structural equation modeling. Psychometrika, 82(2), 407-426.
  • [10] Jöreskog, K. G. (1970). A general method for analysis of covariance structures. Biometrika, 57(2), 239-251.
  • [11] Kessler, M. (1997). Estimation of an ergodic diffusion from discrete observations. Scandinavian Journal of Statistics, 24(2), 211-229.
  • [12] Kusano, S. and Uchida, M. (2023). Statistical inference in factor analysis for diffusion processes from discrete observations. Journal of Statistical Planning and Inference, (Version of Record). DOI: https://doi.org/10.1016/j.jspi.2023.07.009
  • [13] Kusano, S. and Uchida, M. (2023). Sparse inference of structural equation modeling with latent variables for diffusion processes. Japanese Journal of Statistics and Data Science, (Version of Record). DOI: https://doi.org/10.1007/s42081-023-00230-1
  • [14] Kusano, S. and Uchida, M. (2023). Structural equation modeling with latent variables for diffusion processes and its application to sparse estimation. arXiv preprint arXiv:2305.02655v2.
  • [15] Mueller, R. O. (1999). Basic principles of structural equation modeling: An introduction to LISREL and EQS. Springer Science & Business Media.
  • [16] Uchida, M. (2010). Contrast-based information criterion for ergodic diffusion processes from discrete observations. Annals of the Institute of Statistical Mathematics, 62, 161-187.
  • [17] Uchida, M. and Yoshida, N. (2012). Adaptive estimation of an ergodic diffusion process based on sampled data. Stochastic Processes and their Applications, 122(8), 2885-2924.
  • [18] Yoshida, N. (1992). Estimation for diffusion processes from discrete observation. Journal of Multivariate Analysis, 41, 220–242.
  • [19] Yoshida, N. (2011). Polynomial type large deviation inequalities and quasi-likelihood analysis for stochastic differential equations. Annals of the Institute of Statistical Mathematics, 63(3), 431-479.

6. Appendix

6.1. Proofs of Lemmas

Proof of Lemma 1.

Since

θ(i)Hn(θ)\displaystyle\partial_{\theta^{(i)}}{\rm{H}}_{n}(\theta) =n2θ(i)logdet𝚺(θ)n2θ(i)tr(𝚺(θ)1𝕏𝕏)\displaystyle=-\frac{n}{2}\partial_{\theta^{(i)}}\log\mathop{\rm det}\nolimits{\bf{\Sigma}}(\theta)-\frac{n}{2}\partial_{\theta^{(i)}}\mathop{\rm tr}\nolimits\Bigl{(}{\bf{\Sigma}}(\theta)^{-1}\mathbb{Q}_{\mathbb{XX}}\Bigr{)}
=n2tr{(𝚺(θ)1)(θ(i)𝚺(θ))}\displaystyle=-\frac{n}{2}\mathop{\rm tr}\nolimits\biggl{\{}\Bigl{(}{\bf{\Sigma}}(\theta)^{-1}\Bigr{)}\Bigl{(}\partial_{\theta^{(i)}}{\bf{\Sigma}}(\theta)\Bigr{)}\biggr{\}}
+n2tr{(𝚺(θ)1)(θ(i)𝚺(θ))(𝚺(θ)1)𝕏𝕏}\displaystyle\qquad\qquad\qquad+\frac{n}{2}\mathop{\rm tr}\nolimits\biggl{\{}\Bigl{(}{\bf{\Sigma}}(\theta)^{-1}\Bigr{)}\Bigl{(}\partial_{\theta^{(i)}}{\bf{\Sigma}}(\theta)\Bigr{)}\Bigl{(}{\bf{\Sigma}}(\theta)^{-1}\Bigr{)}\mathbb{Q}_{\mathbb{XX}}\biggr{\}}

for i=1,,qi=1,\cdots,q, it is shown that

1nθ(i)Hn(θ0)\displaystyle\quad\ \frac{1}{\sqrt{n}}\partial_{\theta^{(i)}}{\rm{H}}_{n}(\theta_{0})
=n2tr{(𝚺(θ0)1)(θ(i)𝚺(θ0))(𝚺(θ0)1)𝕏𝕏}\displaystyle=\frac{\sqrt{n}}{2}\mathop{\rm tr}\nolimits\biggl{\{}\Bigl{(}{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Bigl{(}\partial_{\theta^{(i)}}{\bf{\Sigma}}(\theta_{0})\Bigr{)}\Bigl{(}{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\mathbb{Q}_{\mathbb{XX}}\biggr{\}}
n2tr{(𝚺(θ0)1)(θ(i)𝚺(θ0))}\displaystyle\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad\quad-\frac{\sqrt{n}}{2}\mathop{\rm tr}\nolimits\biggl{\{}\Bigl{(}{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Bigl{(}\partial_{\theta^{(i)}}{\bf{\Sigma}}(\theta_{0})\Bigr{)}\biggr{\}}
=12tr{(𝚺(θ0)1)(θ(i)𝚺(θ0))(𝚺(θ0)1)n(𝕏𝕏𝚺(θ0))}\displaystyle=\frac{1}{2}\mathop{\rm tr}\nolimits\biggl{\{}\Bigl{(}{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Bigl{(}\partial_{\theta^{(i)}}{\bf{\Sigma}}(\theta_{0})\Bigr{)}\Bigl{(}{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\sqrt{n}\Bigl{(}\mathbb{Q}_{\mathbb{XX}}-{\bf{\Sigma}}(\theta_{0})\Bigr{)}\biggr{\}}
=12(vecθ(i)𝚺(θ0))(𝚺(θ0)1𝚺(θ0)1)n(vec𝕏𝕏vec𝚺(θ0))\displaystyle=\frac{1}{2}\Bigl{(}\mathop{\rm vec}\nolimits{\partial_{\theta^{(i)}}{\bf{\Sigma}}(\theta_{0})}\Bigr{)}^{\top}\Bigl{(}{\bf{\Sigma}}(\theta_{0})^{-1}\otimes{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\sqrt{n}\Bigl{(}\mathop{\rm vec}\nolimits\mathbb{Q}_{\mathbb{XX}}-\mathop{\rm vec}\nolimits{\bf{\Sigma}}(\theta_{0})\Bigr{)}
=12(vechθ(i)𝚺(θ0))𝔻p(𝚺(θ0)1𝚺(θ0)1)𝔻pn(vech𝕏𝕏vech𝚺(θ0))\displaystyle=\frac{1}{2}\Bigl{(}\mathop{\rm vech}\nolimits{\partial_{\theta^{(i)}}{\bf{\Sigma}}(\theta_{0})}\Bigr{)}^{\top}\mathbb{D}_{p}^{\top}\Bigl{(}{\bf{\Sigma}}(\theta_{0})^{-1}\otimes{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\mathbb{D}_{p}\sqrt{n}\Bigl{(}\mathop{\rm vech}\nolimits\mathbb{Q}_{\mathbb{XX}}-\mathop{\rm vech}\nolimits{\bf{\Sigma}}(\theta_{0})\Bigr{)}
=(θ(i)vech𝚺(θ0))𝐖(θ0)1n(vech𝕏𝕏vech𝚺(θ0)).\displaystyle=\Bigl{(}\partial_{\theta^{(i)}}\mathop{\rm vech}\nolimits{{\bf{\Sigma}}(\theta_{0})}\Bigr{)}^{\top}{\bf{W}}(\theta_{0})^{-1}\sqrt{n}\Bigl{(}\mathop{\rm vech}\nolimits\mathbb{Q}_{\mathbb{XX}}-\mathop{\rm vech}\nolimits{\bf{\Sigma}}(\theta_{0})\Bigr{)}.

Thus, we see from Theorem 1 in Kusano and Uchida [14] that

1nθHn(θ0)\displaystyle\frac{1}{\sqrt{n}}\partial_{\theta}{\rm{H}}_{n}(\theta_{0}) =Δ0𝐖(θ0)1n(vech𝕏𝕏vech𝚺(θ0))\displaystyle=\Delta_{0}^{\top}{\bf{W}}(\theta_{0})^{-1}\sqrt{n}\Bigl{(}\mathop{\rm vech}\nolimits\mathbb{Q}_{\mathbb{XX}}-\mathop{\rm vech}\nolimits{\bf{\Sigma}}(\theta_{0})\Bigr{)}
dNq(0,Δ0𝐖(θ0)1Δ0)𝐈(θ0)12ζ.\displaystyle\qquad\qquad\qquad\stackrel{{\scriptstyle d}}{{\longrightarrow}}N_{q}\Bigl{(}0,\Delta_{0}^{\top}{\bf{W}}(\theta_{0})^{-1}\Delta_{0}\Bigr{)}\sim{\bf{I}}(\theta_{0})^{\frac{1}{2}}\zeta.

Note that

1nθ(i)θ(j)Hn(θ)\displaystyle\frac{1}{n}\partial_{\theta^{(i)}}\partial_{\theta^{(j)}}{\rm{H}}_{n}(\theta) =12tr{(𝚺(θ)1)(θ(i)𝚺(θ))(𝚺(θ)1)(θ(j)𝚺(θ))}\displaystyle=\frac{1}{2}\mathop{\rm tr}\nolimits\biggl{\{}\Bigl{(}{\bf{\Sigma}}(\theta)^{-1}\Bigr{)}\Bigl{(}\partial_{\theta^{(i)}}{\bf{\Sigma}}(\theta)\Bigr{)}\Bigl{(}{\bf{\Sigma}}(\theta)^{-1}\Bigr{)}\Bigl{(}\partial_{\theta^{(j)}}{\bf{\Sigma}}(\theta)\Bigr{)}\biggr{\}}
12tr{(𝚺(θ)1)(θ(i)θ(j)𝚺(θ))}\displaystyle\quad-\frac{1}{2}\mathop{\rm tr}\nolimits\biggl{\{}\Bigl{(}{\bf{\Sigma}}(\theta)^{-1}\Bigr{)}\Bigl{(}\partial_{\theta^{(i)}}\partial_{\theta^{(j)}}{\bf{\Sigma}}(\theta)\Bigr{)}\biggr{\}}
12tr{(𝚺(θ)1)(θ(i)𝚺(θ))(𝚺(θ)1)(θ(j)𝚺(θ))(𝚺(θ)1)𝕏𝕏}\displaystyle\quad-\frac{1}{2}\mathop{\rm tr}\nolimits\biggl{\{}\Bigl{(}{\bf{\Sigma}}(\theta)^{-1}\Bigr{)}\Bigl{(}\partial_{\theta^{(i)}}{\bf{\Sigma}}(\theta)\Bigr{)}\Bigl{(}{\bf{\Sigma}}(\theta)^{-1}\Bigr{)}\Bigl{(}\partial_{\theta^{(j)}}{\bf{\Sigma}}(\theta)\Bigr{)}\Bigl{(}{\bf{\Sigma}}(\theta)^{-1}\Bigr{)}\mathbb{Q}_{\mathbb{XX}}\biggr{\}}
+12tr{(𝚺(θ)1)(θ(i)θ(j)𝚺(θ))(𝚺(θ)1)𝕏𝕏}\displaystyle\quad+\frac{1}{2}\mathop{\rm tr}\nolimits\biggl{\{}\Bigl{(}{\bf{\Sigma}}(\theta)^{-1}\Bigr{)}\Bigl{(}\partial_{\theta^{(i)}}\partial_{\theta^{(j)}}{\bf{\Sigma}}(\theta)\Bigr{)}\Bigl{(}{\bf{\Sigma}}(\theta)^{-1}\Bigr{)}\mathbb{Q}_{\mathbb{XX}}\biggr{\}}
12tr{(𝚺(θ)1)(θ(j)𝚺(θ))(𝚺(θ)1)(θ(i)𝚺(θ))(𝚺(θ)1)𝕏𝕏}\displaystyle\quad-\frac{1}{2}\mathop{\rm tr}\nolimits\biggl{\{}\Bigl{(}{\bf{\Sigma}}(\theta)^{-1}\Bigr{)}\Bigl{(}\partial_{\theta^{(j)}}{\bf{\Sigma}}(\theta)\Bigr{)}\Bigl{(}{\bf{\Sigma}}(\theta)^{-1}\Bigr{)}\Bigl{(}\partial_{\theta^{(i)}}{\bf{\Sigma}}(\theta)\Bigr{)}\Bigl{(}{\bf{\Sigma}}(\theta)^{-1}\Bigr{)}\mathbb{Q}_{\mathbb{XX}}\biggr{\}}

for i,j=1,,qi,j=1,\cdots,q. It follows from Theorem 1 in Kusano and Uchida [14] and Slutsky's theorem that

1nθ(i)θ(j)Hn(θ0)\displaystyle\frac{1}{n}\partial_{\theta^{(i)}}\partial_{\theta^{(j)}}{\rm{H}}_{n}(\theta_{0}) p12tr{(𝚺(θ0)1)(θ(i)𝚺(θ0))(𝚺(θ0)1)(θ(j)𝚺(θ0))}\displaystyle\stackrel{{\scriptstyle p}}{{\longrightarrow}}-\frac{1}{2}\mathop{\rm tr}\nolimits\biggl{\{}\Bigl{(}{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Bigl{(}\partial_{\theta^{(i)}}{\bf{\Sigma}}(\theta_{0})\Bigr{)}\Bigl{(}{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Bigl{(}\partial_{\theta^{(j)}}{\bf{\Sigma}}(\theta_{0})\Bigr{)}\biggr{\}}
=12(vecθ(i)𝚺(θ0))(𝚺(θ0)1𝚺(θ0)1)(vecθ(j)𝚺(θ0))\displaystyle\quad=-\frac{1}{2}\Bigl{(}\mathop{\rm vec}\nolimits\partial_{\theta^{(i)}}{\bf{\Sigma}}(\theta_{0})\Bigr{)}^{\top}\Bigl{(}{\bf{\Sigma}}(\theta_{0})^{-1}\otimes{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\Bigl{(}\mathop{\rm vec}\nolimits\partial_{\theta^{(j)}}{\bf{\Sigma}}(\theta_{0})\Bigr{)}
=12(vechθ(i)𝚺(θ0))𝔻p(𝚺(θ0)1𝚺(θ0)1)𝔻p(vechθ(j)𝚺(θ0))\displaystyle\quad=-\frac{1}{2}\Bigl{(}\mathop{\rm vech}\nolimits\partial_{\theta^{(i)}}{\bf{\Sigma}}(\theta_{0})\Bigr{)}^{\top}\mathbb{D}_{p}^{\top}\Bigl{(}{\bf{\Sigma}}(\theta_{0})^{-1}\otimes{\bf{\Sigma}}(\theta_{0})^{-1}\Bigr{)}\mathbb{D}_{p}\Big{(}\mathop{\rm vech}\nolimits\partial_{\theta^{(j)}}{\bf{\Sigma}}(\theta_{0})\Bigr{)}
=(θ(i)vech𝚺(θ0))𝐖(θ0)1(θ(j)vech𝚺(θ0))\displaystyle\quad=-\Bigl{(}\partial_{\theta^{(i)}}\mathop{\rm vech}\nolimits{\bf{\Sigma}}(\theta_{0})\Bigr{)}^{\top}{\bf{W}}(\theta_{0})^{-1}\Bigl{(}\partial_{\theta^{(j)}}\mathop{\rm vech}\nolimits{\bf{\Sigma}}(\theta_{0})\Bigr{)}
=(Δ0𝐖(θ0)1Δ0)ij\displaystyle\quad=-\bigl{(}\Delta_{0}^{\top}{\bf{W}}(\theta_{0})^{-1}\Delta_{0}\bigr{)}_{ij}

as nn\longrightarrow\infty, so that we obtain

1n2θHn(θ0)p𝐈(θ0)\displaystyle\frac{1}{n}\partial^{2}_{\theta}{\rm{H}}_{n}(\theta_{0})\stackrel{{\scriptstyle p}}{{\longrightarrow}}-{\bf{I}}(\theta_{0})

as nn\longrightarrow\infty. ∎
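The chain of identities above rests on tr{Σ⁻¹ A Σ⁻¹ B} = (vec A)ᵀ(Σ⁻¹⊗Σ⁻¹)(vec B) and on vec A = 𝔻_p vech A for symmetric A. A quick numerical sanity check follows; the construction of the duplication matrix (column-major vec, lower triangle stacked column by column) is our own illustration.

```python
import numpy as np

def duplication_matrix(p):
    """D_p with vec(A) = D_p @ vech(A) for symmetric p x p A
    (column-major vec; vech stacks the lower triangle column by column)."""
    pairs = [(i, j) for j in range(p) for i in range(j, p)]  # vech order
    D = np.zeros((p * p, len(pairs)))
    for col, (i, j) in enumerate(pairs):
        D[j * p + i, col] = 1.0       # entry (i, j) in column-major vec
        if i != j:
            D[i * p + j, col] = 1.0   # symmetric partner (j, i)
    return D

def vech(A):
    p = A.shape[0]
    return np.array([A[i, j] for j in range(p) for i in range(j, p)])

rng = np.random.default_rng(1)
p = 3
D = duplication_matrix(p)
M = rng.standard_normal((p, p))
S = M @ M.T + p * np.eye(p)                    # a positive definite Sigma
A = rng.standard_normal((p, p)); A = A + A.T   # symmetric directions
B = rng.standard_normal((p, p)); B = B + B.T
Si = np.linalg.inv(S)
lhs = np.trace(Si @ A @ Si @ B)
rhs = vech(A) @ D.T @ np.kron(Si, Si) @ D @ vech(B)
```

Here p = 3 and the matrices are randomly generated; the two sides agree to machine precision, which is the identity used to pass from the trace form to the vech form of the score.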

Proof of Lemma 2.

Assumption [B1] implies

Y(θ)=0θ=θ0.\displaystyle{\rm{Y}}(\theta)=0\Longrightarrow\theta=\theta_{0}.

Since Θ\Theta is compact and Y(θ){\rm{Y}}(\theta) is continuous, for all ε>0\varepsilon>0, there exists δ>0\delta>0 such that

|θ^nθ0|>εY(θ0)Y(θ^n)>δ.\displaystyle|\hat{\theta}_{n}-\theta_{0}|>\varepsilon\Longrightarrow{\rm{Y}}(\theta_{0})-{\rm{Y}}(\hat{\theta}_{n})>\delta.

In an analogous manner to Lemma 33 in Kusano and Uchida [14], we obtain

supθΘ|Yn(θ;θ0)Y(θ)|p0.\displaystyle\sup_{\theta\in\Theta}\Bigl{|}{\rm{Y}}_{n}(\theta;\theta_{0})-{\rm{Y}}(\theta)\Bigr{|}\stackrel{{\scriptstyle p}}{{\longrightarrow}}0.

Since it holds from the definition of θ^n\hat{\theta}_{n} that

Yn(θ0;θ0)Yn(θ^n;θ0),\displaystyle{\rm{Y}}_{n}(\theta_{0};\theta_{0})\leq{\rm{Y}}_{n}(\hat{\theta}_{n};\theta_{0}),

we see

𝐏(|θ^nθ0|>ε)\displaystyle{\bf{P}}\Bigl{(}|\hat{\theta}_{n}-\theta_{0}|>\varepsilon\Bigr{)} 𝐏(Y(θ0)Y(θ^n)>δ)\displaystyle\leq{\bf{P}}\biggl{(}{\rm{Y}}(\theta_{0})-{\rm{Y}}(\hat{\theta}_{n})>\delta\biggr{)}
𝐏(Y(θ0)Yn(θ0;θ0)>δ3)\displaystyle\leq{\bf{P}}\biggl{(}{\rm{Y}}(\theta_{0})-{\rm{Y}}_{n}(\theta_{0};\theta_{0})>\frac{\delta}{3}\biggr{)}
+𝐏(Yn(θ0;θ0)Yn(θ^n;θ0)>δ3)\displaystyle\quad+{\bf{P}}\biggl{(}{\rm{Y}}_{n}(\theta_{0};\theta_{0})-{\rm{Y}}_{n}(\hat{\theta}_{n};\theta_{0})>\frac{\delta}{3}\biggr{)}
+𝐏(Yn(θ^n;θ0)Y(θ^n)>δ3)\displaystyle\quad+{\bf{P}}\biggl{(}{\rm{Y}}_{n}(\hat{\theta}_{n};\theta_{0})-{\rm{Y}}(\hat{\theta}_{n})>\frac{\delta}{3}\biggr{)}
2𝐏(supθΘ|Yn(θ;θ0)Y(θ)|>δ3)0\displaystyle\leq 2{\bf{P}}\Biggl{(}\sup_{\theta\in\Theta}\Bigl{|}{\rm{Y}}_{n}(\theta;\theta_{0})-{\rm{Y}}(\theta)\Bigr{|}>\frac{\delta}{3}\Biggr{)}\longrightarrow 0

as nn\longrightarrow\infty, which yields

θ^npθ0\displaystyle\hat{\theta}_{n}\stackrel{{\scriptstyle p}}{{\longrightarrow}}\theta_{0} (6.1)

as nn\longrightarrow\infty. Using the Taylor expansion, we have

1nθHn(θ^n)\displaystyle\frac{1}{\sqrt{n}}\partial_{\theta}{\rm{H}}_{n}(\hat{\theta}_{n}) =1nθHn(θ0)+(1n012θHn(θ~n,λ)dλ)n(θ^nθ0),\displaystyle=\frac{1}{\sqrt{n}}\partial_{\theta}{\rm{H}}_{n}(\theta_{0})+\biggl{(}\frac{1}{n}\int_{0}^{1}\partial^{2}_{\theta}{\rm{H}}_{n}(\tilde{\theta}_{n,\lambda})d\lambda\biggr{)}\sqrt{n}(\hat{\theta}_{n}-\theta_{0}),

where θ~n,λ=θ0+λ(θ^nθ0)\tilde{\theta}_{n,\lambda}=\theta_{0}+\lambda(\hat{\theta}_{n}-\theta_{0}). Note that

1nθHn(θ^n)=0\displaystyle\frac{1}{\sqrt{n}}\partial_{\theta}{\rm{H}}_{n}(\hat{\theta}_{n})=0

on AnA_{n} and 𝐏(An)1{\bf{P}}(A_{n})\longrightarrow 1 as nn\longrightarrow\infty, where

An={θ^nIntΘ}.\displaystyle A_{n}=\Bigl{\{}\hat{\theta}_{n}\in{\rm{Int}}\Theta\Bigr{\}}.

In a similar manner to Theorem 2 in Kusano and Uchida [13], it holds from Lemma 1 and (6.1) that

1n012θHn(θ~n,λ)dλp𝐈(θ0)\displaystyle\frac{1}{n}\int_{0}^{1}\partial^{2}_{\theta}{\rm{H}}_{n}(\tilde{\theta}_{n,\lambda})d\lambda\stackrel{{\scriptstyle p}}{{\longrightarrow}}-{\bf{I}}(\theta_{0})

as nn\longrightarrow\infty. Therefore, we see from Lemma 1 that

n(θ^nθ0)d𝐈(θ0)12ζ\displaystyle\sqrt{n}(\hat{\theta}_{n}-\theta_{0})\stackrel{{\scriptstyle d}}{{\longrightarrow}}{\bf{I}}(\theta_{0})^{-\frac{1}{2}}\zeta

as nn\longrightarrow\infty. ∎

Proof of Lemma 10.

In a similar way to Lemma 33 in Kusano and Uchida [14], it is shown that

supθmΘm|1nHm,n(θm)Hm(θm)|p0.\displaystyle\sup_{\theta_{m}\in\Theta_{m}}\biggl{|}\frac{1}{n}{\rm{H}}_{m,n}(\theta_{m})-{\rm{H}}_{m}(\theta_{m})\biggr{|}\stackrel{{\scriptstyle p}}{{\longrightarrow}}0.

Since Hm(θm){\rm{H}}_{m}(\theta_{m}) is continuous in θmΘm\theta_{m}\in\Theta_{m}, it holds from the continuous mapping theorem and Lemma 36 in Kusano and Uchida [14] that

Hm(θ^m,n)pHm(θ¯m)\displaystyle{\rm{H}}_{m}(\hat{\theta}_{m,n})\stackrel{{\scriptstyle p}}{{\longrightarrow}}{\rm{H}}_{m}(\bar{\theta}_{m})

as nn\longrightarrow\infty. Therefore, we see

|1nHm,n(θ^m,n)Hm(θ¯m)|\displaystyle\biggl{|}\frac{1}{n}{\rm{H}}_{m,n}(\hat{\theta}_{m,n})-{\rm{H}}_{m}(\bar{\theta}_{m})\biggr{|} |1nHm,n(θ^m,n)Hm(θ^m,n)|+|Hm(θ^m,n)Hm(θ¯m)|\displaystyle\leq\biggl{|}\frac{1}{n}{\rm{H}}_{m,n}(\hat{\theta}_{m,n})-{\rm{H}}_{m}(\hat{\theta}_{m,n})\biggr{|}+\biggl{|}{\rm{H}}_{m}(\hat{\theta}_{m,n})-{\rm{H}}_{m}(\bar{\theta}_{m})\biggr{|}
supθΘ|1nHm,n(θm)Hm(θm)|+|Hm(θ^m,n)Hm(θ¯m)|p0\displaystyle\leq\sup_{\theta\in\Theta}\biggl{|}\frac{1}{n}{\rm{H}}_{m,n}(\theta_{m})-{\rm{H}}_{m}(\theta_{m})\biggr{|}+\biggl{|}{\rm{H}}_{m}(\hat{\theta}_{m,n})-{\rm{H}}_{m}(\bar{\theta}_{m})\biggr{|}\stackrel{{\scriptstyle p}}{{\longrightarrow}}0

as nn\longrightarrow\infty, which yields

1nHm,n(θ^m,n)pHm(θ¯m)\displaystyle\frac{1}{n}{\rm{H}}_{m,n}(\hat{\theta}_{m,n})\stackrel{{\scriptstyle p}}{{\longrightarrow}}{\rm{H}}_{m}(\bar{\theta}_{m})

as nn\longrightarrow\infty. ∎

6.2. Proof of (4.1)

Proof.

In an analogous manner to Appendix 6.2 in Kusano and Uchida [14], it is shown that

𝚺(θ)=𝚺(θ0)θ=θ0.\displaystyle{\bf{\Sigma}}(\theta)={\bf{\Sigma}}(\theta_{0})\Longrightarrow\theta=\theta_{0}. (6.2)

For 𝚺p++{\bf{\Sigma}}\in\mathcal{M}_{p}^{++}, we define

G2(𝚺)=logdet𝚺logdet𝚺(θ0)+tr(𝚺1𝚺(θ0))p.\displaystyle{\rm{G}}_{2}({\bf{\Sigma}})=\log\mathop{\rm det}\nolimits{\bf{\Sigma}}-\log\mathop{\rm det}\nolimits{\bf{\Sigma}}(\theta_{0})+\mathop{\rm tr}\nolimits\Bigl{(}{\bf{\Sigma}}^{-1}{\bf{\Sigma}}(\theta_{0})\Bigr{)}-p.

Note that G2(𝚺){\rm{G}}_{2}({\bf{\Sigma}}) has the unique minimum point at 𝚺=𝚺(θ0){\bf{\Sigma}}={\bf{\Sigma}}(\theta_{0}). Since

Y(θ)\displaystyle{\rm{Y}}(\theta) =12{logdet𝚺(θ)logdet𝚺(θ0)+tr(𝚺(θ)1𝚺(θ0))p}\displaystyle=-\frac{1}{2}\Bigl{\{}\log\mathop{\rm det}\nolimits{\bf{\Sigma}}(\theta)-\log\mathop{\rm det}\nolimits{\bf{\Sigma}}(\theta_{0})+\mathop{\rm tr}\nolimits\Bigl{(}{\bf{\Sigma}}(\theta)^{-1}{\bf{\Sigma}}(\theta_{0})\Bigr{)}-p\Bigr{\}}
=12G2(𝚺(θ)),\displaystyle=-\frac{1}{2}{\rm{G}}_{2}\bigl{(}{\bf{\Sigma}}(\theta)\bigr{)},

it holds from (6.2) that Y(θ){\rm{Y}}(\theta) has the unique maximum point at θ=θ0\theta=\theta_{0}, which yields

sup|θθ0|>vY(θ)<Y(θ0)=0\displaystyle\sup_{|\theta-\theta_{0}|>v}{\rm{Y}}(\theta)<{\rm{Y}}(\theta_{0})=0 (6.3)

for all v>0v>0. The Taylor expansion of Y(θ){\rm{Y}}(\theta) around θ=θ0\theta=\theta_{0} is given by

Y(θ)=Y(θ0)+θY(θ0)(θθ0)+(θθ0)(01(1λ)2θY(θλ)dλ)(θθ0)=12(θθ0)2θY(θ0)(θθ0)+(θθ0)(01(1λ)2θY(θλ)dλ122θY(θ0))(θθ0),\displaystyle\begin{split}{\rm{Y}}(\theta)&={\rm{Y}}(\theta_{0})+\partial_{\theta}{\rm{Y}}(\theta_{0})^{\top}(\theta-\theta_{0})\\ &\qquad\qquad\qquad+(\theta-\theta_{0})^{\top}\biggl{(}\int_{0}^{1}(1-\lambda)\partial^{2}_{\theta}{\rm{Y}}(\theta_{\lambda})d\lambda\biggr{)}(\theta-\theta_{0})\\ &=\frac{1}{2}(\theta-\theta_{0})^{\top}\partial^{2}_{\theta}{\rm{Y}}(\theta_{0})(\theta-\theta_{0})\\ &\qquad\qquad\qquad+(\theta-\theta_{0})^{\top}\biggl{(}\int_{0}^{1}(1-\lambda)\partial^{2}_{\theta}{\rm{Y}}(\theta_{\lambda})d\lambda-\frac{1}{2}\partial^{2}_{\theta}{\rm{Y}}(\theta_{0})\biggr{)}(\theta-\theta_{0}),\end{split} (6.4)

where θλ=θ0+λ(θθ0)\theta_{\lambda}=\theta_{0}+\lambda(\theta-\theta_{0}). In a similar way to Theorem 2 in Kusano and Uchida [13], it is shown that

01(1λ)2θY(θλ)dλ122θY(θ0)\displaystyle\int_{0}^{1}(1-\lambda)\partial^{2}_{\theta}{\rm{Y}}(\theta_{\lambda})d\lambda\longrightarrow\frac{1}{2}\partial^{2}_{\theta}{\rm{Y}}(\theta_{0})

as θθ0\theta\longrightarrow\theta_{0}, which implies

(θθ0)(01(1λ)2θY(θλ)dλ122θY(θ0))(θθ0)0\displaystyle(\theta-\theta_{0})^{\top}\biggl{(}\int_{0}^{1}(1-\lambda)\partial^{2}_{\theta}{\rm{Y}}(\theta_{\lambda})d\lambda-\frac{1}{2}\partial^{2}_{\theta}{\rm{Y}}(\theta_{0})\biggr{)}(\theta-\theta_{0})\longrightarrow 0

as θθ0\theta\longrightarrow\theta_{0}. Hence, for all ε>0\varepsilon>0, there exists δ>0\delta>0 such that

|θθ0|δ|(θθ0)(01(1λ)2θY(θλ)dλ122θY(θ0))(θθ0)|<ε,\displaystyle|\theta-\theta_{0}|\leq\delta\Longrightarrow\biggl{|}(\theta-\theta_{0})^{\top}\biggl{(}\int_{0}^{1}(1-\lambda)\partial^{2}_{\theta}{\rm{Y}}(\theta_{\lambda})d\lambda-\frac{1}{2}\partial^{2}_{\theta}{\rm{Y}}(\theta_{0})\biggr{)}(\theta-\theta_{0})\biggr{|}<\varepsilon,

so that we see from (6.4) that

Y(θ)<12(θθ0)2θY(θ0)(θθ0)+ε\displaystyle{\rm{Y}}(\theta)<\frac{1}{2}(\theta-\theta_{0})^{\top}\partial^{2}_{\theta}{\rm{Y}}(\theta_{0})(\theta-\theta_{0})+\varepsilon

for θBδ(θ0)\theta\in B_{\delta}(\theta_{0}), where

Bδ(θ0)={θΘ:|θθ0|δ}.\displaystyle B_{\delta}(\theta_{0})=\Bigl{\{}\theta\in\Theta:|\theta-\theta_{0}|\leq\delta\Bigr{\}}.

Note that it holds from the proof of Lemma 1 that 𝐈(θ0)=2θY(θ0){\bf{I}}(\theta_{0})=-\partial^{2}_{\theta}{\rm{Y}}(\theta_{0}). Since ε>0\varepsilon>0 is arbitrary, one has

Y(θ)12(θθ0)𝐈(θ0)(θθ0)\displaystyle{\rm{Y}}(\theta)\leq-\frac{1}{2}(\theta-\theta_{0})^{\top}{\bf{I}}(\theta_{0})(\theta-\theta_{0})

as ε0\varepsilon\downarrow 0 for θBδ(θ0)\theta\in B_{\delta}(\theta_{0}). Recalling that 𝐈(θ0){\bf{I}}(\theta_{0}) is a positive definite matrix, we have

λmin|θθ0|2(θθ0)𝐈(θ0)(θθ0)λmax|θθ0|2,\displaystyle\lambda_{min}|\theta-\theta_{0}|^{2}\leq(\theta-\theta_{0})^{\top}{\bf{I}}(\theta_{0})(\theta-\theta_{0})\leq\lambda_{max}|\theta-\theta_{0}|^{2},

where λmin>0\lambda_{min}>0 and λmax>0\lambda_{max}>0 are the minimum and maximum eigenvalues of 𝐈(θ0){\bf{I}}(\theta_{0}). Hence, there exists C1>0C_{1}>0 such that

Y(θ)C1|θθ0|2\displaystyle{\rm{Y}}(\theta)\leq-C_{1}|\theta-\theta_{0}|^{2} (6.5)

for θBδ(θ0)\theta\in B_{\delta}(\theta_{0}). Let

DiamΘ=supθ1,θ2Θ|θ1θ2|>0.\displaystyle{\rm{Diam}}\Theta=\sup_{\theta_{1},\theta_{2}\in\Theta}|\theta_{1}-\theta_{2}|>0.

Since

1DiamΘ|θθ0|1,\displaystyle\frac{1}{{\rm{Diam}}\Theta}|\theta-\theta_{0}|\leq 1,

we see from (6.3) that

Y(θ)sup|θθ0|>δY(θ)(sup|θθ0|>δY(θ))1(DiamΘ)2|θθ0|2\displaystyle\begin{split}{\rm{Y}}(\theta)&\leq\sup_{|\theta-\theta_{0}|>\delta}{\rm{Y}}(\theta)\\ &\leq-\Biggl{(}-\sup_{|\theta-\theta_{0}|>\delta}{\rm{Y}}(\theta)\Biggr{)}\frac{1}{({\rm{Diam}}\Theta)^{2}}|\theta-\theta_{0}|^{2}\end{split}

for θBδ(θ0)c\theta\in B_{\delta}(\theta_{0})^{c}, so that there exists C2>0C_{2}>0 such that

Y(θ)C2|θθ0|2\displaystyle{\rm{Y}}(\theta)\leq-C_{2}|\theta-\theta_{0}|^{2} (6.6)

for θ ∈ Bδ(θ0)ᶜ. Therefore, it follows from (6.5) and (6.6) that

\displaystyle{\rm{Y}}(\theta)\leq-C|\theta-\theta_{0}|^{2}

for θ ∈ Θ, where C = min(C1, C2). ∎
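As a quick numerical sanity check of the eigenvalue bound used in the proof above, the following sketch verifies λmin|x|² ≤ xᵀ𝐈x ≤ λmax|x|² on random vectors, with an arbitrary positive definite matrix standing in for 𝐈(θ0) (the matrix and its dimension are illustrative, not objects from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
I_mat = A @ A.T + 4.0 * np.eye(4)   # positive definite stand-in for I(theta_0)

lam = np.linalg.eigvalsh(I_mat)     # real eigenvalues in ascending order
lam_min, lam_max = lam[0], lam[-1]

# check lambda_min |x|^2 <= x^T I x <= lambda_max |x|^2 on random vectors
for _ in range(1000):
    x = rng.standard_normal(4)
    quad = x @ I_mat @ x
    sq = x @ x
    assert lam_min * sq - 1e-9 <= quad <= lam_max * sq + 1e-9
print("lambda_min =", lam_min)      # strictly positive by construction
```

The positivity of λmin is exactly what allows the quadratic upper bound (6.5) to be negative away from θ0.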

6.3. Ergodic case

In this section, we consider the ergodic case. The following assumptions are made.

  [C]
    (a) The diffusion process ξt is ergodic with its invariant measure πξ,0. For any πξ,0-integrable function f1, it holds that

      \displaystyle\frac{1}{T}\int_{0}^{T}{f_{1}(\xi_{t})dt}\overset{p}{\longrightarrow}\int f_{1}(x)\pi_{\xi,0}(dx)

      as T → ∞.

    (b) The diffusion process δt is ergodic with its invariant measure πδ,0. For any πδ,0-integrable function f2, it holds that

      \displaystyle\frac{1}{T}\int_{0}^{T}{f_{2}(\delta_{t})dt}\overset{p}{\longrightarrow}\int f_{2}(x)\pi_{\delta,0}(dx)

      as T → ∞.

    (c) The diffusion process εt is ergodic with its invariant measure πε,0. For any πε,0-integrable function f3, it holds that

      \displaystyle\frac{1}{T}\int_{0}^{T}{f_{3}(\varepsilon_{t})dt}\overset{p}{\longrightarrow}\int f_{3}(x)\pi_{\varepsilon,0}(dx)

      as T → ∞.

    (d) The diffusion process ζt is ergodic with its invariant measure πζ,0. For any πζ,0-integrable function f4, it holds that

      \displaystyle\frac{1}{T}\int_{0}^{T}{f_{4}(\zeta_{t})dt}\overset{p}{\longrightarrow}\int f_{4}(x)\pi_{\zeta,0}(dx)

      as T → ∞.
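Condition [C](a) can be illustrated numerically. The sketch below simulates a one-dimensional Ornstein–Uhlenbeck process by the Euler–Maruyama scheme as a stand-in for ξt (the coefficients a, b, s are illustrative constants, not the model of the paper) and checks that the time average (1/T)∫₀ᵀ ξt dt approaches the mean of the invariant measure N(b, s²/(2a)):

```python
import numpy as np

# Stand-in ergodic diffusion: 1-d Ornstein-Uhlenbeck process
#   d xi_t = -a (xi_t - b) dt + s dW_t,
# whose invariant measure is N(b, s^2 / (2a)); a, b, s are illustrative.
rng = np.random.default_rng(1)
a, b, s = 2.0, 1.0, 0.5
T, n = 500.0, 500_000
h = T / n

xi = np.empty(n + 1)
xi[0] = 0.0
dW = rng.standard_normal(n) * np.sqrt(h)
for i in range(n):                      # Euler-Maruyama scheme
    xi[i + 1] = xi[i] - a * (xi[i] - b) * h + s * dW[i]

time_avg = h * xi[:-1].sum() / T        # Riemann sum for (1/T) int_0^T xi_t dt
print(time_avg)                         # should be close to the invariant mean b
```

With f1 the identity, the time average settles near b = 1.0, the integral of f1 against the invariant measure, matching the convergence stated in [C](a).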

In the ergodic case, we obtain the following results, which are analogous to those in the non-ergodic case.

Theorem 3

Let m ∈ {1, …, M}. Under [A], [B1] and [C], as hn → 0, nhn → ∞ and nhn² → 0,

\displaystyle{\bf{E}}_{\mathbb{X}_{n}}\biggl{[}\log{\rm{L}}_{m,n}\bigl{(}\mathbb{X}_{n},\hat{\theta}_{m,n}({\mathbb{X}_{n}})\bigr{)}-{\bf{E}}_{\mathbb{Z}_{n}}\Bigl{[}\log{\rm{L}}_{m,n}\bigl{(}\mathbb{Z}_{n},\hat{\theta}_{m,n}({\mathbb{X}_{n}})\bigr{)}\Bigr{]}\biggr{]}=q_{m}+o_{p}(1).
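The expectation in Theorem 3 is the classical AIC-type bias, and its limit qm is the dimension of the parameter θm. As a hedged illustration in a toy i.i.d. Gaussian model (not the diffusion model of the paper), the following Monte Carlo sketch checks that this bias equals qm = 1 when a single mean parameter is estimated by maximum likelihood:

```python
import numpy as np

# Toy check of the bias identity: X_i ~ N(mu0, 1), mu estimated by the
# sample mean, so q_m = dim(mu) = 1 and
#   E_X[ log L(X, mu_hat(X)) - E_Z[ log L(Z, mu_hat(X)) ] ]  ->  1.
rng = np.random.default_rng(2)
mu0, n, reps = 0.0, 20, 100_000

biases = np.empty(reps)
for r in range(reps):
    x = rng.standard_normal(n) + mu0
    mu_hat = x.mean()
    # log-likelihood of the observed sample (the -n/2*log(2*pi) constant
    # cancels in the difference and is dropped)
    ll_x = -0.5 * np.sum((x - mu_hat) ** 2)
    # E_Z[log L(Z, mu_hat)] in closed form for independent Z_i ~ N(mu0, 1):
    # E[sum (Z_i - mu_hat)^2] = n * (1 + (mu_hat - mu0)^2)
    ll_z = -0.5 * n * (1.0 + (mu_hat - mu0) ** 2)
    biases[r] = ll_x - ll_z

print(biases.mean())                    # close to q_m = 1
```

The Monte Carlo mean concentrates at 1, the number of estimated parameters, which is the same penalty structure that Theorem 3 establishes for the quasi-likelihood of the SEM.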
Theorem 4

Under [A], [B2] and [C], as hn → 0, nhn → ∞ and nhn² → 0,

\displaystyle{\bf{P}}\Bigl{(}\hat{m}_{n}\in\mathcal{M}^{c}\Bigr{)}\longrightarrow 0.
Proofs of Theorems 3-4.

Since hn → 0 and nhn² → 0, we can prove the results in the same way as in the proofs of Theorems 1 and 2. ∎