Quasi-Akaike information criterion of structural equation modeling with latent variables for diffusion processes
Abstract.
We consider a model selection problem for structural equation modeling (SEM) with latent variables for diffusion processes based on high-frequency data. First, we propose the quasi-Akaike information criterion (QAIC) of the SEM and study its asymptotic properties. Next, we consider the situation where the set of competing models includes some misspecified parametric models, and show that the probability of choosing the misspecified models converges to zero. Furthermore, examples and simulation results are given.
Key words and phrases:
Structural equation modeling; Quasi-Akaike information criterion; Quasi-likelihood analysis; High-frequency data; Stochastic differential equation.
1. Introduction
We consider a model selection problem for structural equation modeling (SEM) with latent variables for diffusion processes. First, we define the true model of the SEM. The stochastic processes and are defined by the factor models as follows:
(1.1)
(1.2)
where and are and -dimensional observable vector processes, and are and -dimensional latent common factor vector processes, and are and -dimensional latent unique factor vector processes, respectively. and are constant loading matrices. Both and are not zero, , , and are fixed, and . Let . Suppose that , and satisfy the following stochastic differential equations:
(1.3)
(1.4)
(1.5)
where , , , , , , , , and , and are , and -dimensional standard Wiener processes, respectively. Moreover, we express the relationship between and as follows:
(1.6)
where is a constant loading matrix, whose diagonal elements are zero, and is a constant loading matrix. Define , where denotes the identity matrix of size . We assume that is a full column rank matrix and is non-singular. is a -dimensional latent unique factor vector process defined by the following stochastic differential equation:
(1.7)
where , , and is an -dimensional standard Wiener process. Set , , and , where denotes the transpose. It is supposed that and are positive definite matrices, and , , and are independent standard Wiener processes on a stochastic basis with usual conditions . Let . Set as the variance of . If there is no misunderstanding, we simply write as . are discrete observations, where , , is fixed, and , , and are independent of . We consider the situation where as . We cannot estimate all the elements of , , , , , , and . Thus, some elements may be assumed to be zero to satisfy an identifiability condition; see, e.g., Everitt [6]. Note that these constraints and the number of factors and are determined from the theoretical viewpoint of each research field.
A model selection problem among the following parametric models is considered. We define the parametric model of Model as follows. Set as the parameter of Model , where is a convex compact space. It is assumed that has locally Lipschitz boundary; see, e.g., Adams and Fournier [1]. The stochastic processes and are defined as the following factor models:
(1.8)
(1.9)
where and are and -dimensional observable vector processes, and are and -dimensional latent common factor vector processes, and are and -dimensional latent unique factor vector processes, respectively. and are constant loading matrices. Assume that , and satisfy the following stochastic differential equations:
(1.10)
(1.11)
(1.12)
where , and . Furthermore, the relationship between and is expressed as follows:
(1.13)
where is a constant loading matrix, whose diagonal elements are zero, and is a constant loading matrix. Set . It is supposed that is a full column rank matrix and is non-singular. is a -dimensional latent unique factor vector process defined by the following stochastic differential equation:
(1.14)
where . Let , , and . It is assumed that and are positive definite matrices. Define . Set
as the variance of , where
It is supposed that there exists such that , and Model satisfies an identifiability condition.
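To fix ideas, the implied variance in SEM with latent variables has a well-known closed form. In standard LISREL-type notation, with measurement equations $X=\Lambda_x\xi+\delta$, $Y=\Lambda_y\eta+\varepsilon$ and structural equation $\eta=B\eta+\Gamma\xi+\zeta$, $\Psi=I-B$ (these symbols are illustrative placeholders, not necessarily those of the model above), one has

```latex
\Sigma =
\begin{pmatrix}
\Lambda_x \Phi \Lambda_x^{\top} + \Theta_{\delta}
  & \Lambda_x \Phi \Gamma^{\top} \Psi^{-\top} \Lambda_y^{\top} \\
\Lambda_y \Psi^{-1} \Gamma \Phi \Lambda_x^{\top}
  & \Lambda_y \Psi^{-1} \bigl( \Gamma \Phi \Gamma^{\top} + \Theta_{\zeta} \bigr) \Psi^{-\top} \Lambda_y^{\top} + \Theta_{\varepsilon}
\end{pmatrix},
```

where $\Phi=\mathrm{Var}(\xi)$ and $\Theta_{\delta}$, $\Theta_{\varepsilon}$, $\Theta_{\zeta}$ are the variances of the unique factors.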
Structural equation modeling (SEM) with latent variables is a method for analyzing the relationships between latent variables that cannot be observed; see, e.g., Jöreskog [10], Everitt [6], Mueller [15] and references therein. In SEM, a researcher often has several candidate models, which are usually specified to express different hypotheses. The goodness-of-fit test based on the likelihood ratio is widely used for model evaluation in SEM. Akaike [4] proposed the use of the Akaike information criterion (AIC) in a factor model; using AIC, we can choose the optimal number of factors in terms of prediction. AIC is also widely used in SEM, as in factor analysis, to choose the optimal model; see, e.g., Huang [9].
Thanks to the development of measuring devices, high-frequency data such as stock prices can now be obtained easily, so that many researchers have studied parametric estimation of diffusion processes based on high-frequency data; see, e.g., Yoshida [18], Genon-Catalot and Jacod [7], Kessler [11], Uchida and Yoshida [17] and references therein. Recently, in the field of financial econometrics, factor models based on high-frequency data have been extensively studied. Aït-Sahalia and Xiu [3] proposed a continuous-time latent factor model for high-dimensional data using principal component analysis. Kusano and Uchida [12] proposed classical factor analysis for diffusion processes. This method enables us to analyze the relationships between low-dimensional observed variables sampled at high frequency and latent variables. For instance, based on high-frequency stock price data, we can analyze latent variables such as a world market factor and factors related to a certain industry (Figure 1). On the other hand, few researchers have examined the relationships between these latent variables based on high-frequency data. Kusano and Uchida [13] proposed SEM with latent variables for diffusion processes, with which one can examine the relationships between latent variables based on high-frequency data. For example, this method enables us to study the relationship between the world market factor and the Japanese financial factor (Figure 2). SEM with latent variables may be regarded as regression analysis between latent variables: while both explanatory and response variables are observable in regression analysis, both are latent in SEM with latent variables. For regression analysis and market models based on high-frequency data, see, e.g., Aït-Sahalia et al. [2].
The model selection problem for diffusion processes based on discrete observations has been actively studied. Uchida [16] proposed the contrast-based information criterion for ergodic diffusion processes and obtained an asymptotic result for the difference between contrast-based information criteria. Eguchi and Masuda [5] studied the model comparison problem for semiparametric Lévy driven SDEs and proposed the Gaussian quasi-AIC. Since information criteria are important in SEM, as mentioned above, we propose the quasi-AIC (QAIC) of SEM with latent variables for diffusion processes and study its asymptotic properties. In this paper, we consider the non-ergodic case; for the ergodic case, see Appendix 6.3.
The paper is organized as follows. In Section 2, we introduce the notation and assumptions. In Section 3, the QAIC of SEM with latent variables for diffusion processes is considered. Moreover, the situation where the set of competing models includes some (not all) misspecified parametric models is studied. It is shown that the probability of choosing the misspecified models converges to zero. In Section 4, we give examples and simulation results. In Section 5, the results described in Section 3 are proved.
2. Notation and assumptions
First, we prepare the following notations and definitions. For any vector , , is the -th element of , and is the diagonal matrix, whose -th diagonal element is . For any matrix , , and is the -th element of . For matrices and of the same size, . For any matrix and vectors , we define . For a positive definite matrix , we write . For any symmetric matrix , , and are the vectorization of , the half-vectorization of and the duplication matrix respectively, where . Note that ; see, e.g., Harville [8]. For any matrix , stands for the Moore-Penrose inverse of . Set as the sets of all real-valued positive definite matrices. For any positive sequence , denotes the short notation for functions which satisfy for some . Let be the space of all functions satisfying the following conditions:
- (i) is continuously differentiable with respect to up to order .
- (ii) and all its derivatives are of polynomial growth in , i.e., is of polynomial growth in if .
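As an illustration of the vectorization notation above: for a symmetric $2\times 2$ matrix,

```latex
A = \begin{pmatrix} a_{11} & a_{21} \\ a_{21} & a_{22} \end{pmatrix}, \qquad
\operatorname{vech}(A) = \begin{pmatrix} a_{11} \\ a_{21} \\ a_{22} \end{pmatrix}, \qquad
\operatorname{vec}(A) = D_{2} \operatorname{vech}(A), \qquad
D_{2} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix},
```

so the duplication matrix simply restores the duplicated off-diagonal entry.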
The symbols and denote convergence in probability and convergence in distribution, respectively. For any process , . Set
denotes the expectation under . Next, we make the following assumptions.
- [A]
  - (a)
    - (i) There exists a constant such that
    for any .
    - (ii) For all , .
    - (iii) .
  - (b)
    - (i) There exists a constant such that
    for any .
    - (ii) For all , .
    - (iii) .
  - (c)
    - (i) There exists a constant such that
    for any .
    - (ii) For all , .
    - (iii) .
  - (d)
    - (i) There exists a constant such that
    for any .
    - (ii) For all , .
    - (iii) .
Remark 1
Assumption [A] is standard for diffusion processes; see, e.g., Kessler [11].
3. QAIC of SEM for diffusion processes
Using a locally Gaussian approximation, we obtain the following quasi-likelihood of Model from (1.8)-(1.14):
See Appendix 8.1 in Kusano and Uchida [14] for details of the quasi-likelihood. Define the quasi-likelihood based on the discrete observations as follows:
The quasi-maximum likelihood estimator is defined by
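To sketch the construction in generic notation (ours, for illustration only): writing $\Delta_i Z = Z_{t_i} - Z_{t_{i-1}}$ for the observed increments, $h_n$ for the discretization step and $\Sigma(\theta)$ for the model-implied variance, a locally Gaussian quasi-log-likelihood and the associated estimator take the form

```latex
\mathbb{H}_n(\theta)
= -\frac{1}{2} \sum_{i=1}^{n} \Bigl\{ \log\det\bigl( 2\pi h_n \Sigma(\theta) \bigr)
+ \frac{1}{h_n} (\Delta_i Z)^{\top} \Sigma(\theta)^{-1} (\Delta_i Z) \Bigr\},
\qquad
\hat{\theta}_n \in \operatorname*{argmax}_{\theta \in \Theta} \mathbb{H}_n(\theta).
```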
Set as an i.i.d. copy of . Let us consider the following Kullback-Leibler divergence between the transition density of the true model (1.1)-(1.7) and the quasi-likelihood :
where is the expectation under the law of . Our purpose is to find the model which minimizes . Since does not depend on the model, it is sufficient to consider the model which maximizes
(3.1)
so that we need to estimate (3.1). Set
and
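To fix ideas, with $p_n$ denoting the transition density of the true model and $q_n(\cdot\,;\theta)$ the quasi-likelihood of a candidate model (generic symbols, for illustration), the divergence decomposes as

```latex
\mathrm{KL}\bigl( p_n \,\big\|\, q_n(\cdot\,;\theta) \bigr)
= \mathbf{E}\bigl[ \log p_n(\mathbf{Z}_n) \bigr]
- \mathbf{E}\bigl[ \log q_n(\mathbf{Z}_n;\theta) \bigr],
```

and the first term is common to all candidate models, so minimizing the divergence over models amounts to maximizing the expected quasi-log-likelihood (3.1).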
Moreover, the following assumptions are made.
- [B1]
  - (a) There exists a constant such that
  for all .
  - (b) .
Remark 2
By the following theorem, we obtain an asymptotically unbiased estimator of (3.1).
Theorem 1
Let . Under [A] and [B1], as ,
We define the quasi-Akaike information criterion as
(3.2)
Since it holds from Theorem 1 that is an asymptotically unbiased estimator of
we select the optimal model among competing models by
(3.3)
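Schematically, writing $\mathbb{H}_n^{(m)}$ for the quasi-log-likelihood of Model $m$ and assuming an AIC-type penalty equal to twice the parameter dimension (an illustration; the precise bias correction is the content of Theorem 1), (3.2) and (3.3) read

```latex
\mathrm{QAIC}^{(m)} = -2\, \mathbb{H}_n^{(m)}\bigl( \hat{\theta}_n^{(m)} \bigr) + 2 \dim(\Theta_m),
\qquad
\hat{m} = \operatorname*{argmin}_{m = 1, \dots, M} \mathrm{QAIC}^{(m)}.
```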
Remark 3
Since is not the exact likelihood but the quasi-likelihood, all the competing models are misspecified. Note that we consider a model selection problem among the quasi-likelihood models; see, e.g., Eguchi and Masuda [5].
Remark 4
Next, we consider the situation where the set of competing models includes some (not all) misspecified parametric models; that is, there exists such that
for all . Set
and . The optimal parameter is defined as
where
Note that for . Furthermore, we make the following assumption.
- [B2] .
implies that ; see Lemma 36 in Kusano and Uchida [14]. The following asymptotic result of defined in (3.3) holds.
Theorem 2
Under [A] and [B2], as ,
Theorem 2 shows that the probability of choosing the misspecified models converges to zero as .
4. Simulation results
4.1. True model
The stochastic process is defined by the following factor model :
where is a six-dimensional observable vector process, is a two-dimensional latent common factor vector process, and is a six-dimensional latent unique factor vector process. The stochastic process is defined by the factor model as follows:
where is a two-dimensional observable vector process, is a one-dimensional latent common factor vector process, and is a two-dimensional latent unique factor vector process. Furthermore, the relationship between and is expressed as follows:
where is a one-dimensional latent unique factor vector process. It is supposed that is the two-dimensional OU process as follows:
where is a two-dimensional standard Wiener process. is defined by the six-dimensional OU process as follows:
where , , , and is a six-dimensional standard Wiener process. satisfies the following two-dimensional OU process:
where is a two-dimensional standard Wiener process. is defined by the following one-dimensional OU process:
where is a one-dimensional standard Wiener process. Figure 3 shows the path diagram of the true model at time .
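For readers who wish to generate such data, the following minimal R sketch implements an Euler–Maruyama scheme for a $d$-dimensional OU process $dX_t = -A(X_t - \mu)\,dt + S\,dW_t$; the function name and all inputs are illustrative, and the parameter values of the true model above are not assumed.

```r
## Euler-Maruyama sketch for a d-dimensional OU process
## dX_t = -A (X_t - mu) dt + S dW_t; all inputs are illustrative.
simulate_ou <- function(n, h, x0, A, mu, S) {
  d <- length(x0)
  X <- matrix(0, nrow = n + 1, ncol = d)
  X[1, ] <- x0
  for (i in 1:n) {
    dW <- rnorm(d, mean = 0, sd = sqrt(h))  # Wiener increment over a step of size h
    X[i + 1, ] <- X[i, ] - drop(A %*% (X[i, ] - mu)) * h + drop(S %*% dW)
  }
  X
}

## Example: a two-dimensional path with n = 1000 steps of size h = 1/1000.
## set.seed(1)
## path <- simulate_ou(1000, 1 / 1000, c(0, 0), diag(2), c(0, 0), diag(2))
```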
4.2. Competing models
4.2.1. Model 1
Set the parameter as . Let , , and . Assume
and
where , , , , , and are not zero. It is supposed that and satisfy
and
where is not zero. Moreover, and are assumed to satisfy
and . Set
It holds that , so that Model is a correctly specified model. There exists a constant such that
(4.1)
for all . For the proof of (4.1), see Appendix 6.2. Figure 4 shows the path diagram of Model at time .
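As a side note on how such zero constraints can be coded in practice, the following hypothetical R sketch maps a free-parameter vector into a $6 \times 2$ loading matrix with a fixed zero/one pattern; the pattern shown is purely illustrative and is not the one used in this simulation.

```r
## Hypothetical sketch: embed free parameters into a 6 x 2 loading matrix
## whose fixed zeros and ones encode an identifiability condition.
make_lambda <- function(theta) {
  L <- matrix(0, nrow = 6, ncol = 2)
  L[1, 1] <- 1             # loading fixed to one to set the scale of factor 1
  L[2:3, 1] <- theta[1:2]  # free loadings on factor 1
  L[4, 2] <- 1             # loading fixed to one to set the scale of factor 2
  L[5:6, 2] <- theta[3:4]  # free loadings on factor 2
  L
}
```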
4.2.2. Model 2
The parameter is defined as . Set , , and . Suppose
and
where , , , , , , and are not zero. and are assumed to satisfy
and
where is not zero. Furthermore, we suppose that and satisfy
and . Let
Since , Model is a correctly specified model. In a similar way to the proof of (4.1), we can prove that there exists a constant such that
for all . Figure 5 shows the path diagram of Model at time .
4.2.3. Model 3
Set the parameter as . Let , , and . Assume
and
where , , , , , and are not zero. We assume that and satisfy and
Moreover, it is supposed that and satisfy
and . For any , one has , so that Model is a misspecified model. Figure 6 shows the path diagram of Model at time .
4.3. Simulation results
In the simulation, we use optim() with the BFGS method in the R language. The initial parameter is chosen as . The number of iterations is 10000. Set and consider the case where . Table 1 shows the number of models selected by QAIC. Since Model 3 is never selected, the simulation result is consistent with Theorem 2 in this example. Furthermore, we see from this result that QAIC is not consistent; in other words, the over-fitted model (Model 2) is selected with non-negligible probability. This behavior is natural since QAIC chooses the best model in terms of prediction.
Table 1. The number of models selected by QAIC.
| | | | | |
|---|---|---|---|---|
| Model 1 | 8394 | 8417 | 8461 | 8410 |
| Model 2 | 1606 | 1583 | 1539 | 1590 |
| Model 3 | 0 | 0 | 0 | 0 |
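The selection step itself is a small wrapper around optim(); the sketch below assumes hypothetical objects qll[[m]] (the quasi-log-likelihood of Model m) and init[[m]] (its initial value), and uses the AIC-type penalty of Section 3.

```r
## Minimal sketch of the selection rule (3.3); qll and init are hypothetical.
qaic <- function(qll_m, init_m) {
  fit <- optim(init_m, fn = function(theta) -qll_m(theta), method = "BFGS")
  2 * fit$value + 2 * length(init_m)  # -2 * maximized quasi-log-lik + 2 * dim
}

## scores <- sapply(seq_along(qll), function(m) qaic(qll[[m]], init[[m]]))
## which.min(scores)  # model selected by QAIC
```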
5. Proof
In this section, we may omit the model index, and we use instead of . Moreover, we simply write , , , , and as , , , , and , respectively. For any process and , we set . Without loss of generality, we suppose that . Set
for . Let
Set , where
Define the random field
Let be the random field as follows:
for , where
Set and . denotes the variance under the law of . Write and . Define as a -dimensional standard normal random variable.
Lemma 1
Under [A], as ,
and
Lemma 2
Under [A] and [B1], as ,
and
Lemma 3
Under [A], for all ,
(5.1)
(5.2)
(5.3)
and
(5.4)
Proof.
First, we will prove (5.1). Lemmas 14-15 in Kusano and Uchida [14] imply
(5.5)
for , where
Since
it follows from Lemmas 16-18 in Kusano and Uchida [14] that
(5.6)
for , where
Lemma 20 in Kusano and Uchida [14] shows
for all , so that
(5.7)
Similarly, we see from Lemma 20 in Kusano and Uchida [14] that
(5.8)
for any . Thus, it holds from (5.7) and (5.8) that for all ,
which yields (5.1). Using (5.5) and (5.7), one gets
(5.9)
for all . In an analogous manner, (5.6) and (5.8) deduce
(5.10)
for any . Consequently, we see from (5.9) and (5.10) that
for all , which yields (5.2). It follows from (5.5) that
(5.11)
for any . In a similar way, (5.6) implies
(5.12)
for all . Hence, it holds from (5.11) and (5.12) that
for all , so that (5.3) holds. Next, we consider (5.4). Since it follows from (5.5) and Lemma 21 in Kusano and Uchida [14] that
for , we see
for any . In an analogous manner, one has
for and , and
for , so that we get
and
for all . Therefore, it is shown that
for any , which yields (5.4). ∎
Lemma 4
Under [A], for all ,
for .
Proof of Lemma 4.
Note that
for . Since
we have
for , where
and
First, we will prove
(5.13)
for any . Set
for , where
for . Since
one has
so that is a discrete-time martingale with respect to . Note that is the terminal value of :
Using the Burkholder inequality and
we have
for all , which yields
(5.14)
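For reference, the Burkholder inequality invoked here states, for a discrete-time martingale $(M_k)_{k \le n}$ with $M_0 = 0$ and $p \ge 2$, that

```latex
\mathbf{E}\bigl[ |M_n|^{p} \bigr]
\le C_p\, \mathbf{E}\Bigl[ \Bigl( \sum_{k=1}^{n} |M_k - M_{k-1}|^{2} \Bigr)^{p/2} \Bigr]
```

for a constant $C_p$ depending only on $p$.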
Moreover, it follows from Lemma 3 and the Cauchy-Schwarz inequality that
for any , so that it holds from (5.14) that
which implies (5.13). Next, we will prove
(5.15)
for all . In an analogous manner to the proof of Lemma 3, one has
(5.16)
for all . Lemma 3 and (5.16) show
for any . Since as , we obtain (5.15). Consequently, for all , it holds from (5.13) and (5.15) that
Therefore, it is shown that for all ,
for . ∎
Lemma 5
Under [A], for all and ,
for .
Proof of Lemma 5.
Note that
and
for . Since
we have
so that a decomposition is given by
where
and
Let
for , where
for . In a similar way to the proof of Lemma 4, is a discrete-time martingale with respect to , and is the terminal value of :
In a similar way to the proof of Lemma 4, it follows from the Burkholder inequality that
for all , which yields
For any , it is shown that
in an analogous manner to the proof of Lemma 4, which yields
Since , we have as , so that
(5.17)
for all . Furthermore, we see from Lemma 3 and (5.16) that for all ,
and as , which implies
(5.18)
Hence, it holds from (5.17) and (5.18) that
for all . Therefore, for all , we obtain
for . ∎
Lemma 6
Under [A], for all and ,
Proof of Lemma 6.
Since
and
one has a decomposition
where
and
In an analogous manner to Lemma 5, one has
and
for all . Consequently, it holds from the Sobolev inequality that
for any , which yields
(5.19)
In a similar way, it is shown that
(5.20)
for all . Thus, we see from (5.19) and (5.20) that
for any . Therefore, one gets
for all and . ∎
Lemma 7
Under [A], for all ,
for .
Proof of Lemma 7.
Lemma 8
Under [A] and [B1], for all , there exists such that
for all and .
Proof.
It is enough to check the regularity conditions [A1′′], [A4′], [A6], [B1] and [B2] of Theorem 3 (c) in Yoshida [19]. It is supposed that , , , and satisfy [A4′]:
where . For example, we can take , , , and . For any , it follows from Lemmas 4 and 6 that
and
where and , which satisfies [A6]. Furthermore, we see from Lemmas 5 and 7 that
and
for all , where and . Hence, [A1′′] is satisfied. It follows from Lemma 35 in Kusano and Uchida [14] and [B1] (b) that is a positive definite matrix, which satisfies [B1]. Moreover, [B1] (a) yields [B2]. ∎
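For context, the conclusion supplied by Theorem 3 (c) in Yoshida [19] is a polynomial type large deviation inequality of the generic form

```latex
P\Bigl[ \sup_{u} \mathbb{Z}_n(u) \ge e^{-r} \Bigr] \le \frac{C_L}{r^{L}}, \qquad r > 0,
```

where the supremum ranges over the relevant region of the parameter space and $C_L$ does not depend on $n$ (a schematic statement in our notation).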
denotes the expectation under the probability measure on the probability space on which is realized.
Lemma 9
Under [A] and [B1], for all ,
and for ,
as .
Proof.
Proof of Theorem 1.
Let us consider the following decomposition:
where
First of all, we will prove
(5.24)
as . Using the Taylor expansion, one has
where and
First, we consider the expectation of . Set
Note that as . By using the Taylor expansion, one gets
for on , so that we have
(5.25)
where
Let
Since
it holds from Lemmas 5, 7 and 9 that
(5.26)
Consequently, we see from (5.26) and Lemma 9 that
as , which yields
(5.27)
as . Set
Using Lemmas 4 and 9, we obtain
which implies
(5.28)
It follows from (5.28) and Lemma 9 that
as , so that one gets
(5.29)
as . Hence, it holds from (5.27) and (5.29) that
(5.30)
as . Let
for . Since , we see from Lemma 9 that
(5.31)
as . Therefore, (5.25), (5.30) and (5.31) show
(5.32)
as . Next, the expectation of is considered. Note that
where
for . By using Lemmas 5 and 9, it is shown that
as . Thus, it follows from (5.31) that
(5.33)
as . Note that
where
for . It holds from Lemmas 7 and 9 that
as , which yields
(5.34)
as . Hence, (5.32), (5.33) and (5.34) show (5.24). Next, we will prove
(5.35)
as . By using the Taylor expansion, one gets
where
Since it holds from Lemmas 1, 4 and 9 that
and
as , we have
(5.36)
as . Note that
Let
for . Lemmas 1, 4 and 9 deduce
and
as , which implies
(5.37)
as . Moreover, we note that
where
for . Since
it follows from Lemmas 7 and 9 that
as , so that one has
(5.38)
as . Consequently, we see from (5.36), (5.37) and (5.38) that
as , which yields (5.35). Furthermore, we have
(5.39)
since and have the same distribution. Therefore, it holds from (5.24), (5.35) and (5.39) that
as . ∎
Lemma 10
Under [A] and [B2], as ,
Proof of Lemma 10.
Proof of Theorem 2.
Fix . From the definition of , one has
(5.40)
For all , it follows from Lemma 10 that
(5.41)
as , where
Define the function :
Note that has the unique maximum point at . Since
we obtain
for any , which yields . Consequently, it holds from (5.41) that for all ,
as , which implies
(5.42)
as . Therefore, we see from (5.40) and (5.42) that
as . ∎
References
- [1] Adams, R. A. and Fournier, J. J. (2003). Sobolev spaces. Elsevier.
- [2] Aït-Sahalia, Y., Kalnina, I. and Xiu, D. (2020). High-frequency factor models and regressions. Journal of Econometrics, 216(1), 86-105.
- [3] Aït-Sahalia, Y. and Xiu, D. (2017). Using principal component analysis to estimate a high dimensional factor model with high-frequency data. Journal of Econometrics, 201(2), 384-399.
- [4] Akaike, H. (1987). Factor analysis and AIC. Psychometrika, 52, 317-332.
- [5] Eguchi, S. and Masuda, H. (2023). Gaussian quasi-information criteria for ergodic Lévy driven SDE. Annals of the Institute of Statistical Mathematics, 1-47.
- [6] Everitt, B. (1984). An introduction to latent variable models. Springer Science & Business Media.
- [7] Genon-Catalot, V. and Jacod, J. (1993). On the estimation of the diffusion coefficient for multidimensional diffusion processes. Annales de l’Institut Henri Poincaré (B) Probabilités et Statistiques, 29, 119-151.
- [8] Harville, D. A. (1998). Matrix algebra from a statistician’s perspective. Taylor & Francis.
- [9] Huang, P. H. (2017). Asymptotics of AIC, BIC, and RMSEA for model selection in structural equation modeling. Psychometrika, 82(2), 407-426.
- [10] Jöreskog, K. G. (1970). A general method for analysis of covariance structures. Biometrika, 57(2), 239-251.
- [11] Kessler, M. (1997). Estimation of an ergodic diffusion from discrete observations. Scandinavian Journal of Statistics, 24(2), 211-229.
- [12] Kusano, S. and Uchida, M. (2023). Statistical inference in factor analysis for diffusion processes from discrete observations. Journal of Statistical Planning and Inference, (Version of Record). DOI: https://doi.org/10.1016/j.jspi.2023.07.009
- [13] Kusano, S. and Uchida, M. (2023). Sparse inference of structural equation modeling with latent variables for diffusion processes. Japanese Journal of Statistics and Data Science, (Version of Record). DOI: https://doi.org/10.1007/s42081-023-00230-1
- [14] Kusano, S. and Uchida, M. (2023). Structural equation modeling with latent variables for diffusion processes and its application to sparse estimation. arXiv preprint arXiv:2305.02655v2.
- [15] Mueller, R. O. (1999). Basic principles of structural equation modeling: An introduction to LISREL and EQS. Springer Science & Business Media.
- [16] Uchida, M. (2010). Contrast-based information criterion for ergodic diffusion processes from discrete observations. Annals of the Institute of Statistical Mathematics, 62, 161-187.
- [17] Uchida, M. and Yoshida, N. (2012). Adaptive estimation of an ergodic diffusion process based on sampled data. Stochastic Processes and their Applications, 122(8), 2885-2924.
- [18] Yoshida, N. (1992). Estimation for diffusion processes from discrete observation. Journal of Multivariate Analysis, 41, 220–242.
- [19] Yoshida, N. (2011). Polynomial type large deviation inequalities and quasi-likelihood analysis for stochastic differential equations. Annals of the Institute of Statistical Mathematics, 63(3), 431-479.
6. Appendix
6.1. Proofs of Lemmas
Proof of Lemma 1.
Proof of Lemma 2.
[B1] deduces
For all , there exists such that
In an analogous manner to Lemma 33 in Kusano and Uchida [14], we obtain
Since it holds from the definition of that
we see
as , which yields
(6.1)
as . Using the Taylor expansion, we have
where . Note that
on and as , where
In a similar manner to Theorem 2 in Kusano and Uchida [13], it holds from Lemma 2 and (6.1) that
as . Therefore, we see from Lemma 2 that
as . ∎
6.2. Proof of (4.1)
Proof.
In an analogous manner to Appendix 6.2 in Kusano and Uchida [14], it is shown that
(6.2)
For , we define
Note that has the unique minimum point at . Since
it holds from (6.2) that has the unique maximum point at , which yields
(6.3)
for all . The Taylor expansion of around is given by
(6.4)
where . In a similar way to Theorem 2 in Kusano and Uchida [13], it is shown that
as , which yields
as . Hence, for all , there exists such that
so that we see from (6.4) that
for , where
Note that it holds from the proof of Lemma 1 that . Since is arbitrary, one has
as for . Recalling that is a positive definite matrix, we have
where and are the minimum and maximum eigenvalues of . There exists such that
(6.5)
for . Let
Since
we see from (6.3) that
for , so that there exists such that
(6.6)
for . Therefore, it follows from (6.5) and (6.6) that
for , where . ∎
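For reference, the eigenvalue bound used in the proof above is the standard Rayleigh quotient inequality: for a positive definite matrix $V$ with minimum and maximum eigenvalues $\lambda_{\min}(V)$ and $\lambda_{\max}(V)$,

```latex
\lambda_{\min}(V)\, |x|^{2} \;\le\; x^{\top} V x \;\le\; \lambda_{\max}(V)\, |x|^{2}, \qquad x \in \mathbb{R}^{d}.
```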
6.3. Ergodic case
In this section, we consider the ergodic case. The following assumptions are made.
- [C]
  - (a) The diffusion process is ergodic with its invariant measure . For any -integrable function , it holds that
  as .
  - (b) The diffusion process is ergodic with its invariant measure . For any -integrable function , it holds that
  as .
  - (c) The diffusion process is ergodic with its invariant measure . For any -integrable function , it holds that
  as .
  - (d) The diffusion process is ergodic with its invariant measure . For any -integrable function , it holds that
  as .
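In generic notation, (a)-(d) state that time averages converge to space averages: for an ergodic diffusion $X$ with invariant measure $\pi$ and a $\pi$-integrable function $f$,

```latex
\frac{1}{T} \int_{0}^{T} f(X_t)\, dt \;\xrightarrow{P}\; \int f(x)\, \pi(dx) \qquad (T \to \infty).
```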
In the ergodic case, we have the following results similar to the non-ergodic case.
Theorem 3
Let . Under [A], [B1] and [C], as , and ,
Theorem 4
Under [A], [B2] and [C], as , and ,
Proofs of Theorems 3-4.
Since and , we can prove the results in the same way as the proofs of Theorems 1-2. ∎